Fantastic progress, lightfoot. I just bought a cheap SMD rework station partly because of this thread. Hoping to fix a Titan cube and a few Neptune cubes with this info.
Thanks! Go for it, this is how we all learn and get better and stuff. I should do another talk at Defcon or something about this, need a good talk title (how to figure shit out when it's on fire? Hm...)
When you say you pulled the caps, I assume it goes without saying that you replaced them with new ones. Did you use caps with the same values as stock? I'm wondering what I need to order before attempting any of this. Sorry if this is a dumb question, I've never done this type of work before. If you ever feel the urge to post pictures showing what you replaced, don't hold yourself back.
Well, sort of. On the hashing boards I will replace the filtering caps, both because they stabilize the power input which is being whacked around by the DC-DC's, and because they can help in making the supply more efficient (power factor stuff, really interesting reads out there on that).
For the controller board, the caps are important, but a bit less so. You put them on the inputs for a similar reason, but the FPGA is only pulling about 0.5 A at 3.2 volts on the input (roughly 1.6 watts) and only 0.001 A on the 1.2 volt lines. So the exact values are a bit less critical, and if you leave them off for testing purposes the world will not come to an end. You can play fast and loose on these caps in the short term without destroying too much stuff. I left a few on (the most important one is the one next to the TPS65217 chip because that's where the DC-DC conversion and the chokes are) and for the rest I'll put them back on "later".
Finding out the values when the manufacturer doesn't give out schematics (boo!) is a bit complicated, but a $49 or so good Radio Schlock meter with the capacitance testing function is pretty good for getting close. Now, if we were talking about caps in an RC circuit (for timing, checking waveform ripple across an inductor, or as part of a current sensing detector for an overloaded FET) then the values are more critical. But on the low power stuff on a Neptune board (aside from the fact that there is probably something like that around the TPS chip's regulator points) this is once again not too much of a problem.
Sure, I'll post pics of this reflow repair, and the times it took with heat to flow the components. Will grab the cam....
|
|
|
Given that KNC seems to be a bit... repetitive... in how they build things (not a bad thing actually), it might be possible to repair a Titan with downed engines. I'm watching this thing and I can see that some power supplies may be in the penumbra of the airflow stream. Couple that with people's desire to run these things like hell-beasts and you could blow out some of the FETs.
Maybe I should re-title this KNC Neptune and Titan and Jupiter miners....
|
|
|
Putzing around with board #1 here, this one blew two of the trimmer caps off the board and shorted the third. Right now I need to figure out all of these settings, but it's purring along at 400gh @200 watts, so that's not too bad. All eight supplies are under 60c, which is about as high as I would take them. Really will have to find better ways to cool those DC-DC's, I'm guessing that will take down part of a hashing board.
Will let it purr for a bit, then slow it down for the night.
|
|
|
Pulled the bad caps on the other board, and we're up and hashing. So I now know how to fix a blown Neptune board, and probably by extension Titan boards and later Jupiter boards.
Mining companies really should hire someone to troubleshoot and fix this stuff. Now on to the next part, fixing the hashing boards...
|
|
|
Ok, so anyway I see that my other board has shorted caps somewhere as well, now that I have a board that at least powers stuff I can see where I am:
On the board without the FPGA but with the fixed caps, the voltages are:
3.3 volts is on the left side of the chip along that row of caps.
1.2 volts is on the right side of the chip, between the first four caps (center rail, edge ground).
2.5 volts is on the right side of the chip, outside the last two caps (edge rail, center ground).
These are correct to 3 significant figures, so I think they are highly regulated. My other board has... different values.
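A rough way to log rail readings like these and flag the odd ones, as a minimal sketch. The three nominal values come from the post; the 5% tolerance, the function name, and the sample readings are assumptions for illustration, not anything from KNC.

```python
# Nominal rails as reported in the post. The 5% tolerance is an assumption.
NOMINAL_RAILS = {"3.3V": 3.3, "1.2V": 1.2, "2.5V": 2.5}


def check_rails(measured, tolerance=0.05):
    """Return {rail: (reading, ok)} for every nominal rail.

    A rail with no reading, or a reading outside the tolerance band
    around its nominal value, is reported as not ok.
    """
    report = {}
    for name, nominal in NOMINAL_RAILS.items():
        reading = measured.get(name)
        ok = reading is not None and abs(reading - nominal) <= tolerance * nominal
        report[name] = (reading, ok)
    return report


if __name__ == "__main__":
    # Made-up meter readings: one healthy board, one with a rail pulled low.
    healthy = {"3.3V": 3.30, "1.2V": 1.20, "2.5V": 2.50}
    suspect = {"3.3V": 3.29, "1.2V": 0.02, "2.5V": 2.51}
    for label, readings in (("healthy", healthy), ("suspect", suspect)):
        for rail, (value, ok) in check_rails(readings).items():
            print(label, rail, value, "OK" if ok else "CHECK THIS RAIL")
```

A rail sitting near zero, like the made-up 1.2 volt reading here, is the pattern to expect when something on that line is shorted to ground.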
I'm going to work on this some more tomorrow, maybe try to put on the FPGA chip after cleaning the pads. Which is always fun.
|
|
|
Well, I'm confused... all that command states is the board powered up fine. LOL! So I'm still not sure how that indicates failure.
Oh ok. Quick summary (I have all of this in my other thread, I really don't want to screw this one up with endless tech prattle).
On a "bad" board:
root@Neptune:/etc/init.d# io-pwr init
TPS65217 OK. Modification A, revision 1.2
Wrong SEQ4 value 0x40
Wrong ENABLE value 0x09
Wrong SEQ4 value 0x40
Wrong SEQ4 value 0x40
Wrong SEQ4 value 0x40
DC/DC converter configuration failed!
On a board where I pulled the shorted cap:
root@Neptune:/etc/init.d# io-pwr init
TPS65217 OK. Modification A, revision 1.1
root@Neptune:/etc/init.d#
io-pwr is the command to fire up the DC-DC converter. It puts out three voltages. If a cap is shorted it detects the crowbar and fails.
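For anyone checking several boards, the pass/fail call on that output could be scripted. This is only a sketch: it assumes each message lands on its own line, it only knows the strings quoted in this post, and the function name is made up. It does not run io-pwr itself, it just reads captured text.

```python
def io_pwr_ok(output: str) -> bool:
    """Classify captured `io-pwr init` output as clean (True) or failed (False).

    Only matches the messages quoted in the thread; any other wording the
    tool may print is not handled.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    chip_found = any(line.startswith("TPS65217 OK") for line in lines)
    failed = any(
        line.startswith("Wrong ") or "configuration failed" in line
        for line in lines
    )
    return chip_found and not failed


# Captured text from the two boards, with line breaks assumed.
BAD_BOARD = """TPS65217 OK. Modification A, revision 1.2
Wrong SEQ4 value 0x40
Wrong ENABLE value 0x09
DC/DC converter configuration failed!"""

GOOD_BOARD = "TPS65217 OK. Modification A, revision 1.1"

if __name__ == "__main__":
    print("bad board clean?", io_pwr_ok(BAD_BOARD))
    print("good board clean?", io_pwr_ok(GOOD_BOARD))
```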
|
|
|
What does that say about failure?
Well, before the chip would never fully power up, it always gave those stupid errors. Now it comes up clean and oddly enough the red LED is lit on the board (it never did that before). Given that pulling the cap cleared the short, I can say as a flat fact that the cap shorted. Likewise the similar short on the other board makes me think this is not uncommon.
As to what it says, it might say that either:
a) The FPGA runs super hot and fries the caps (I doubt this a lot)
b) The caps are lousy quality and suck Goat Cock (given the # of blown caps I have seen on the hashing boards I wonder).
Will have to work on it a bit more to verify, but more busted boards would help.
Edit: After pulling the trim caps on the other board, the chip seems to be up: Advanced tab now works. I think I have a cross-verify. We'll see.
|
|
|
Yep, I figured out why the Neptune (and possibly other) boards have been failing.
root@Neptune:/etc/init.d# io-pwr init
TPS65217 OK. Modification A, revision 1.1
root@Neptune:/etc/init.d#
Yaay! Happy me.
|
|
|
Powered up the board. Ran the IO powerup:
root@Neptune:/etc/init.d# io-pwr init
TPS65217 OK. Modification A, revision 1.1
root@Neptune:/etc/init.d#
In technical parlance: That's it.
Now to see if I can *gently* figure out which cap is blown on the other board here.
Progress!
|
|
|
This is *very* interesting.
So with all chips off I decided to check the pads to ground, because I noticed one of the caps seemed "shorted" on pin 1 of the TPS65217 chip. Still a short. Odd.
Put the new TPS chip on and checked again. Pin 1 shorted to ground. According to the docs, pin 1 is vout2. Odd. So I pulled the filter cap between pin 1 and ground. Pad still shorted at the chip.
Odd.
Checked the rest of the board. The only other caps on that line are the two under the FPGA pads (FPGA is off). Removed both. Open circuit to ground. Checked both. One was shorted, put the other back.
Odd.
What I am beginning to think is this: The trimmer caps on the Neptune board can short. When one of them does it drives the associated voltage line to zero and the FPGA is fucked. Checked my other board (the control, did nothing to it) and sure enough, one of the voltage lines is shorted to ground through one of those caps on the left side.
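The hunt described above (lift caps on the shorted rail until the short clears, then test the lifted ones off the board) can be written down as a small simulation. This is a toy model under stated assumptions: the rail reads shorted whenever any installed cap is shorted, the meter check is replaced by a set lookup, and the cap names are invented.

```python
def locate_short(caps_on_rail, shorted):
    """Simulate lifting caps one at a time until the rail stops reading shorted.

    caps_on_rail: cap names in the order you would lift them.
    shorted: set of caps that are actually bad (stands in for the meter).
    Returns (bad, refit): caps to replace, and lifted caps that tested fine.
    """
    installed = list(caps_on_rail)
    lifted = []
    # "Rail still reads short to ground" == some installed cap is shorted.
    while installed and any(cap in shorted for cap in installed):
        lifted.append(installed.pop(0))
    # Off the board, each lifted cap can be tested on its own.
    bad = [cap for cap in lifted if cap in shorted]
    refit = [cap for cap in lifted if cap not in shorted]
    return bad, refit


if __name__ == "__main__":
    # Hypothetical names: the filter cap at pin 1, then the two under the FPGA.
    rail = ["C_pin1_filter", "C_fpga_a", "C_fpga_b"]
    bad, refit = locate_short(rail, shorted={"C_fpga_b"})
    print("replace:", bad)
    print("put back:", refit)
```

The order of the list matters in practice: start with the caps that are easiest to lift and leave the ones under big packages for last.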
Interesting. Anyone else with a dead-ish Neptune board bored this weekend and want to play with a VOM?
|
|
|
Ok, got the power chip off the board, since the DigiKey order *finally* arrived. Christmas sucks sometimes. :-)
Now, just a quick tip: For these boards, pre-heating to 300 F then using the air tools at 325 C is not enough to make the chip come loose. Fine. 375 C is more than enough. So I don't need to use the "dragon" mode of 450 C. That's nice to know.
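Since the preheat figure is in Fahrenheit and the air temperatures are in Celsius, a quick conversion helper keeps the two straight when setting up a station that only shows one unit. The function names are just for illustration.

```python
def f_to_c(deg_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


def c_to_f(deg_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return deg_c * 9 / 5 + 32


if __name__ == "__main__":
    # Preheat from the post, shown in Celsius.
    print(f"preheat 300 F = {f_to_c(300):.0f} C")
    # Hot-air settings from the post, shown in Fahrenheit.
    for air_c in (325, 375, 450):
        print(f"air {air_c} C = {c_to_f(air_c):.0f} F")
```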
Need to wait for an hour for cool down, then check all the caps for shorts, then try putting the new chip on and see if the BB brings up all of the power rails. If so we have progress.
|
|
|
Yeah, it's loaded into the RAM on the controller board and executed. That's why it has to be loaded up each power up.
Ok. Some FPGAs have built-in EEPROM that people load the code into, which pretty much precludes modding it in the field, as you need a JTAG tool like an Atmel Dragon or a Xilinx tool to whack at it. Got the power chips off the board, waiting for cooldown so I can flux and retry. Also checking the caps to see if any shorted.
|
|
|
Huh....? I thought the FPGA (software) on the 6 port board was different and KNC would not release it? Want you to be right though... clarity... GLEN, we need CLARITY ON THIS! (Yelling does not help, but it makes me feel better as I type this and shout at bitcointalk... man, I need a life.)
The problem is people have to test this stuff. And I doubt anyone with a running Titan is willing to take their system down, buy a working Neptune board, plug it all in, see if it works, then put their old system back together. If someone wants to mail me a Titan cube and an RPi adapter+board, feel free to drop me a line.
|
|
|
Yes the FPGA code is different for titans vs the SHA knc products for the controller board.
Sure, but isn't the FPGA microcode loaded as a part of the boot up process?
|
|
|
It starts from the port on the ASIC side, then usually runs the whole cable. I have RMA'd at least 2 EVGA PSUs and two Antec 1300 Plat.
*nod* Yup, burning cables. The magic thing about it is the hotter the copper gets, the higher the resistance and voltage drop. Positive feedback cycle till all hell breaks loose. For a 600 watt draw over three wires that would be 200 watts per wire. At 12 volts that is 16 to 17 amps, which should have 12 gauge wire min with 14 gauge really pushing it. 16-18 gauge wire (the kind they use in cheaper supplies) can only handle wait forget it. A 10 foot 18 gauge wire hauling 16 A will drop about 1 volt each way at room temperature, so roughly 2 volts out and back, or around 30 watts cooked off in the cable, and it only gets worse as the copper heats up. Toaster wire.
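The wire arithmetic above can be sanity-checked with a short sketch. The resistance figures are the usual handbook values for copper wire at about 20 C (ohms per 1000 feet); the function name and the round-trip default are assumptions for illustration, and heating is ignored, so real losses on a warm cable will be higher.

```python
# Approximate copper resistance at ~20 C, in ohms per 1000 ft, by AWG size.
OHMS_PER_1000FT = {12: 1.588, 14: 2.525, 16: 4.016, 18: 6.385}


def wire_loss(awg: int, length_ft: float, amps: float, round_trip: bool = True):
    """Return (voltage drop in volts, heat in watts) for one conductor run.

    With round_trip=True the current is assumed to return through an
    identical wire, so the resistive length is doubled.
    """
    run_ft = length_ft * (2 if round_trip else 1)
    resistance = OHMS_PER_1000FT[awg] / 1000 * run_ft
    return amps * resistance, amps ** 2 * resistance


if __name__ == "__main__":
    amps = 200 / 12  # 200 watts per wire at 12 volts
    print(f"current per wire: {amps:.1f} A")
    for awg in (12, 14, 16, 18):
        drop, heat = wire_loss(awg, length_ft=10, amps=16)
        print(f"{awg} AWG, 10 ft, 16 A: {drop:.2f} V drop, {heat:.1f} W of heat")
```

Comparing the 12 gauge and 18 gauge rows shows why the thin wire in cheaper supplies is the part that cooks first.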
|
|
|
lightfoot pm me your address, any chance you can pay for shipping?
Will do, absolutely. Do you happen to take some sort of electronic currency, maybe... bitcoin?
|
|
|
No problem, what's the worst that can happen? I do remember when one of the miners I was testing and doing some serious hacking to exploded in a plasma fireball, but that's why you have a fire extinguisher.
|
|
|
KNC?
They have instructions for the neptune boards on downloading a zip, extracting it to a small sd type card, then sticking it in the ass of the beaglebone. Reboot, come back in 10 mins, it's factory default.
I did this to fix one of my BB's. Now if I could just get my Digikey shipment I can try swapping the power and FPGA parts.
|
|
|
Nice job too, where did you get them made? Looks like a pro job, and if they put the parts on then that has really come a ways from the last time I had PCB's done.
|
|
|
|