It seems to work with the SD card in but I need to take it out or it will not boot. Any way to leave the SD card inside and have it start?
Weird, huh. How about powering it up with the SD card inserted, then immediately popping it out? Or put it in just far enough that it almost engages; the whine should stop. Once it's going, it seems to keep going. Once again, not quite sure why.
|
|
|
Snowing like hell this evening, so I unpacked my next job: 5 Neptunes, 3 controllers, three beaglebones.
Of the five Neptunes one has a shorted supply, and the rest don't have shorts on the 10 pin SPI connector. Plugged the first one in, power supplies appear but refuse to power anything. Possibly a reflow, will work on tomorrow.
Second one is *very* interesting. Comes up and runs, but is a super hot runner on the power supplies. No real reason why, but even at 350mhz it's running high 90's. So after letting it run a bit I picked up the cube from the floor.
The bottom of the cube was *HOT*. Odd, since the reference neptune (450mhz, all supplies around 80c) is not really warm on the bottom. Weird. I wonder if the heat sink has lost contact, and as a result the chip is getting super-hot and polluting the surrounding supplies with heat. One way to find out....
|
|
|
All you women need to stop getting your panties in a bunch being JEALOUS of people with free power. YES, some of us HAVE FREE POWER. Ever heard of SOLAR, WIND or Water Turbine generators? Now stop being little kids and concentrate on having your buggy Antminer S7's operating properly and not worry about others' income..
Having 3kw of solar here, it's not free. Wind's a bitch, and if you have a stream with good head then water turbine is the damn closest thing to free there is. But there's not a lot of that around.
|
|
|
Well, here's my first less than perfect success: Blown up Titan, fireballed power supplies, replaced two. One of the new supplies refuses to come up regardless of reflow and doesn't appear to have any shorts. My guess is that the Titan board took some damage and knocked out the SPI connection to that supply.
Result is a Titan that hashes at about 70mh, keeping the last die at a speed of 150mhz and the others at 300. For the moment I'll classify this as "a lot better than dead" and make an offer on the repairs. Still, not bad overall for a dead unit.
On to the next problem.
C
|
|
|
Hawk is already getting two bridge boards from me, so the last things that could fail software-wise would be RPi foolery. The actual KnC board hasn't blown or failed for any of my Titan sets (I have over 6) yet; it's always been the Pi or the bridge that needs replacing. Wish the Titan boards used the Beaglebone version.
I think they can, once I get this next shipment I'll spend some quality time seeing if I can hack something together.
|
|
|
Guys, I have one Titan cube that shuts down the PSU immediately after being connected to it. Earlier it shut down the PSU randomly, but now it does so instantly when connected. It has one die that had to be off (according to its earlier owner), and it is currently off.
BTW: It shuts down the psu even without controller connected.
Do you think there is a short circuit?
Yes. You have a bad power supply inside of the Titan itself. If there weren't any flames shooting out of it, my guess is that some of the high side FETs are blown. That's a fixable thing if you have the right tools, drop me a PM and let me know if you would like me to look at it.
|
|
|
Kinda. I tried putting the Rpi code on an old first generation Rpi to trace it out. Damn thing kept rebooting every two minutes. Finally found it has a daemon running that looks for the FPGA to power up; if it doesn't see that within 2 minutes it assumes a problem and reboots. Clearing out the /etc/init.d directory stopped it from doing this, enough that I was able to move the files to a Beaglebone for further testing and review.
But it could be a blown interconnect board (still don't know why that would fail), would need one in here to review further.
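The two-minute reboot behaviour described above can be sketched roughly like this. It's only a guess at the shape of the logic, not the actual KnC daemon: the function name, the `fpga_is_up` probe and the `reboot` hook are all hypothetical, and only the 2 minute timeout comes from the post.

```python
import time
from typing import Callable


def fpga_watchdog(
    fpga_is_up: Callable[[], bool],   # hypothetical probe: True once the FPGA is seen
    reboot: Callable[[], None],       # hypothetical hook: what to do on timeout
    timeout_s: float = 120.0,         # the "2 minutes" mentioned in the post
    poll_s: float = 1.0,              # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll for the FPGA until the deadline.

    Returns True if the FPGA showed up in time. Otherwise calls reboot()
    once and returns False.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if fpga_is_up():
            return True
        sleep(poll_s)
    reboot()
    return False
```

With no FPGA attached (as on the bare first generation Rpi), the probe never succeeds, so a loop like this would fire the reboot every time the timeout expires, which matches the symptom. The clock and sleep are passed in only so the logic can be exercised without waiting two real minutes.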
|
|
|
So... When are you going to pull the breakers for the heat pump and put that power into more miners?
|
|
|
So latest from last night's work:
1) Still can't find the destination for those 10 pins. Thought it was the LM75's but nope. Drat, still working. Will get batteries for fuzzer.
2) One of the two bad power supplies on this Titan is fixed, waiting for the other to come in from China.
DC/DC  Voltage (V)  Current (A)  Power (W)  Temperature (°C)
0      0.7848       43.2500      33.943     68.000
1      0.7811       43.8750      34.271     79.000
2      0.7881       44.1250      34.775     79.200
3      0.7894       43.8125      34.586     75.800
4      0.7885       41.4375      32.673     56.500
5      0.7914       41.0000      32.447     50.400
7      0.7909       41.8125      33.070     52.200
Note that PS 4 is back and powered (300mhz). Power supply 6 is still offline but supply 7 is hashing the die at 150mhz (thus the 41a, if I go any higher it will blow). Interesting thing is it's stable as a rock, and hashing at 60-70mh. Not bad for a wreck.
Now if I can just find out what that SPI short is... Getting there bit by bit...
|
|
|
Ada: That's something which is on the Beaglebone and *not* the controller board. I think the problem is the BB SD card pins are touching the frame from being pressed on. Unit can't turn on its DC-DC and doesn't boot; power cycling sometimes gets it up.
What does work for me is to put a micro SD card in the slot and turn on the unit, then pop it out once the power is on. That seems to allow it to boot every time, then I just leave it running. Weird, but a $40 BB problem and not a board problem.
|
|
|
I have got to start issuing press releases:
Lightfoot, a drunken First Officer from the USS Shameless.....
|
|
|
None of the components actually blow on the bridge. Just the copper lines burn out. The ones KnC provide can't handle jolts. And the RPi is horrible at power management anyway. I haven't zeroed in on the exact reason it burns out, but it's 100% a power thing. I'll check to see if I have any burnt bridges left.
Weird. C
|
|
|
^just ninja'd me, hawkfish just bought the last two.
*Update* - The last of the KnC Titan Bridges have been sold. I no longer have any available. If you would like some (you'd need to want more than 3 to justify buying them from me), PM me and if I get enough requests I'll get to work on making more bridges.
If it is an emergency and you really need a bridge, I'd be willing to part with one of the backup bridges I reserved for myself if you PM me your situation.
Ok, so I have to ask: What blows up on a Rpi bridge? Smoke? Flames? Got a broken board I can buy/borrow/steal?
|
|
|
The Neptune runs perfectly with the other 4 cubes, but if I try to plug in that one specific cube it stops hashing on all cubes and the controller goes wacky.
Yup. Question 0 is "what does it do with nothing else plugged into it" (even the display). What should happen is the unit should power up, pick up an IP address, then the light on the side of the board should light up super bright for a second, then turn on a little green LED there. If the bright LED flashes very briefly 3-4 times in a row, you have a short and the DC-DC can't turn on the FPGA or the SPI busses. If one unit causes the whole string to crash, then it has the odd short between pins 4, 6, and ground. I'm still trying to figure that one out.
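The "Question 0" checks above boil down to a small lookup, sketched here as a checklist in code form. The enum names, the function and the order in which the checks are applied are my own assumptions; only the symptoms and the conclusions attached to them come from the post.

```python
from enum import Enum, auto


class PowerUpLed(Enum):
    """What the bright LED does with nothing else plugged in (names are illustrative)."""
    BRIGHT_THEN_GREEN = auto()  # bright for about a second, then the small green LED
    BRIEF_FLASHES = auto()      # 3-4 very brief flashes in a row
    OTHER = auto()              # anything the post does not describe


def triage(led: PowerUpLed, crashes_whole_string: bool = False) -> str:
    """Map the observed symptoms to the diagnosis given in the post.

    The ordering of the checks is an assumption, not something the post states.
    """
    if led is PowerUpLed.BRIEF_FLASHES:
        return "short: DC-DC cannot turn on the FPGA or the SPI busses"
    if crashes_whole_string:
        return "short between pins 4, 6 and ground"
    if led is PowerUpLed.BRIGHT_THEN_GREEN:
        return "normal power-up"
    return "unknown: not covered by the checks in the post"
```

For the cube in the question, `triage(PowerUpLed.BRIGHT_THEN_GREEN, crashes_whole_string=True)` lands on the pins 4/6 short, which is the case still being investigated.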
|
|
|
lightfoot,
What did I mess up, the RPi or the controller? Please see the picture below. Only 1 LED is on instead of three and the red LED on the RPi is off. Screen lights up but without any text or bright light at start up. I was using 2 PSUs but 1 to power both controllers, probably turned on one before the other or something to create a short. I can access the web GUI, but it won't show any cubes or hash.
Hm. When you power up the controller without any cubes attached, does the super-bright light on the side come on bright and then go out, or just do 3-4 little flashes? The little flashes are usually the sign that the TPS chip or the FPGA has shorted; both are fixable.
|
|
|
Which leaves me three dead Titan boards to fix. These are really odd: The main symptom is that the boards have 0 ohm shorts on pins 4 and 6 of the 10 pin ribbon cable. This forces down the FPGA and crashes the whole stack. The user said there was some sort of power surge that took three of them out but I don't see how: The FPGA is isolated from 12v power by the TPS chip (blown) and the signal voltages on those lines are small. The miners likewise don't put out that much current there, and so far I have not been able to fuzz out where those lines go.
Hm.
|
|
|
Well technically all ASIC mining hardware has variable power draw (load) from their power source. Whenever there's a new block, work restart, network connection issue, pool lag. Hell, I wouldn't be surprised if it was even more sporadic than just those, IE: If the firmware that queues work up for the ASICs is written horribly ... it will cause all sorts of non-full queues (cores doing nothing at any given moment).
It's highly possible that the ASIC you mentioned above is just a horrible chip off the block and requires insane voltage to run stable.
True, however SHA256 is pretty basic in terms of what the chip does (rotate, rotate, rinse, repeat). That's part of the reason for the insane speed: you can just build butt-tons of cores, and each core pulls about the same amount of power as it does the shifting away. Scrypt does all sorts of things, which might be why people are having such power problems (I can think of two companies that built scrypt chips then died on the vine because they couldn't power them). It's possible that this ASIC is bad, but it hums along perfectly now at a very specific voltage and clock rate. Anyway, seems happy now. Just a note for people thinking their Titan is dead: try some faster hashing speed settings. Might find a sweet spot.
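To put something concrete behind "rotate, rotate, rinse, repeat": the SHA-256 round function is built from fixed 32-bit rotations, ANDs and XORs. The snippet below is a small illustration using the rotation amounts from the SHA-256 standard (FIPS 180-4). It is not a full hash, and the Python helper names are just for this sketch.

```python
MASK32 = 0xFFFFFFFF  # SHA-256 works on 32-bit words


def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


def big_sigma0(x: int) -> int:
    """Sigma0 from the SHA-256 round: three rotations XORed together."""
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def big_sigma1(x: int) -> int:
    """Sigma1 from the SHA-256 round."""
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)


def ch(e: int, f: int, g: int) -> int:
    """Choose: each bit of e selects the matching bit from f or g."""
    return ((e & f) ^ (~e & g)) & MASK32


def maj(a: int, b: int, c: int) -> int:
    """Majority: each output bit is the majority vote of the three inputs."""
    return (a & b) ^ (a & c) ^ (b & c)
```

Every round does the same handful of fixed operations regardless of the data, which fits the point that each core's workload (and so its power draw) stays about the same from one hash to the next.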
|
|
|
That Titan is now stable as a rock. I set the frequencies to 300mhz all the way around and no more problems. Once again I think KnC has some sweet spots with the chips: use too low a frequency for the voltage, or a significantly different frequency from the other dies, and it resets the power supplies.
Titan 1 is hashing at a full speed of 80mh. Titan 2 (the filth one) is hashing on two cores at 40mh, waiting for the two new power supplies to come in. Once those are here I will secure them to the board and go from there.
First two repaired controller boards are on the way out as well. Getting there....
|
|
|
Quick update: Switching the Titan to 300mhz/-.293 volts has resulted in 12+ hours, no problems or drops. I'm wondering if Titans (and Neptunes to a lesser extent) have a variable power draw off the DC-DC's which causes them to trip out at certain frequency/voltage combinations. Maybe it's harmonics or something.
But I do think that is what is screwing a lot of Scrypt ASIC makers, you can build a chip, but fueling it when the scrypt programming takes it across the board (now I need CPU registers, now I need to fuck with memory, now I need to play with the barrel rollers, now I need....) will cause your power levels to be wacky. Ergo why not a lot of chip makers seem to be succeeding in this realm.
|
|
|
This Titan did it again. Weird:
DC/DC  Voltage (V)  Current (A)  Power (W)  Temperature (°C)
0      0.7900       41.5625      32.834     59.900
1      0.7897       41.2500      32.575     73.400
2      0.7867       41.0625      32.304     57.400
3      0.7886       41.6250      32.825     52.700
5      0.0035       0            0.000      17.300
6      0.7883       40.3750      31.828     44.900
7      0.7928       39.8125      31.563     45.100
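For anyone eyeballing these readouts regularly, here is a rough sketch of how the dump above could be checked automatically. The assumptions are mine: that the board has DC/DCs numbered 0 through 7, that anything under 0.1 V counts as off, and that the values arrive as one whitespace-separated string with five fields per supply. None of the names below come from the KnC software.

```python
from typing import Dict, List, NamedTuple


class Rail(NamedTuple):
    index: int
    volts: float
    amps: float
    watts: float
    temp_c: float


# The readout from the post, without its header line.
SAMPLE = (
    "0 0.7900 41.5625 32.834 59.900 1 0.7897 41.2500 32.575 73.400 "
    "2 0.7867 41.0625 32.304 57.400 3 0.7886 41.6250 32.825 52.700 "
    "5 0.0035 0 0.000 17.300 6 0.7883 40.3750 31.828 44.900 "
    "7 0.7928 39.8125 31.563 45.100"
)


def parse_rails(flat: str) -> List[Rail]:
    """Split the flat dump into one Rail per group of five fields."""
    tokens = flat.split()
    if len(tokens) % 5 != 0:
        raise ValueError("expected five fields per DC/DC")
    rails = []
    for i in range(0, len(tokens), 5):
        idx, v, a, w, t = tokens[i:i + 5]
        rails.append(Rail(int(idx), float(v), float(a), float(w), float(t)))
    return rails


def check_rails(rails: List[Rail], expected=range(8),
                min_volts: float = 0.1) -> Dict[str, List[int]]:
    """Flag supplies that are absent, effectively off, or where P != V * I."""
    seen = {r.index for r in rails}
    return {
        "missing": [i for i in expected if i not in seen],
        "offline": [r.index for r in rails if r.volts < min_volts],
        "power_mismatch": [r.index for r in rails
                           if abs(r.volts * r.amps - r.watts) > 0.01],
    }
```

Running `check_rails(parse_rails(SAMPLE))` on this readout flags supply 4 as missing from the list and supply 5 as offline, and the power column agrees with volts times amps on every row.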
|
|
|
|