Bitcoin Forum
Topic: Cheap and simple repair of S7 hash board
NotFuzzyWarm
January 24, 2017, 12:17:29 AM
 #101

Glad to see it worked :)
re:
Quote
this isnt an easy fix by any means soldering to r17 is a royal pain , c76 isnt to bad but id imagine u could just solder to gnd on the pcie plug instead of the side of c76. that would make it a lil easier
Yessss it might/probably would work but -- when it comes to voltage regulators it is always best to keep the various returns as close to the regulator chip as possible. Really, that applies to all signal paths in any circuit... Minimizes phantom parasitic components.

-Joshua Zipkin aka Joshua Alexander leaked AMT A1 miner skype chats http://bit.ly/1Qjt6lj
-For bitcoin to succeed the community must police itself.
-Support Sidehacks miner development. Donations to:   1BURGERAXHH6Yi6LRybRJK7ybEm5m5HwTr
jstew
January 24, 2017, 12:30:46 AM
 #102

Thanks. If I do any more, maybe I'll just keep ground at C76 then; it's actually easier to solder than R17.
bilabonic
February 04, 2017, 08:10:21 PM
 #103

Can ANYONE confirm what this 'fixes'?

I have an S7 with all '0's, BUT chain 3 only shows 30 ASICs; average speed is 3000 GH/s!



Cheers guys.

BTC - 1Ayax24aAU8c1xwAakK94DVDkm4kbfZ8Ch
jamesb777
February 05, 2017, 07:05:48 AM
 #104

The PIC (programmable interface controller) is a chip that's used to communicate and control voltage with the ASIC chips.  Basically it's there to make sure the ASIC chips are operating at the proper voltages and for temperature monitoring.  I'm guessing this mod is used to bypass the PIC's controlling of voltage.

In the case of your missing ASICs, Bitmain's FAQ calls it 'chip scission'.  I believe there is an issue with the hardware around one of your ASIC chips causing the chain to be broken, so I don't think this mod would be applicable in your case.

bilabonic
February 05, 2017, 12:34:01 PM
 #105

Thanks for the reply, James. I did read through the thread and understood that it 'bypassed' the voltage control of the ASIC chips in parallel.

I think an option for me would be to 'clean' the boards and see what results I get.

Cheers

LBNG
February 10, 2017, 03:33:22 AM
 #106

In all the pictures shown in this thread the U2 chip remains in place. Is it just that they removed it after taking the picture? Just want to confirm that I need to remove it for this fix to work.
icside
March 15, 2017, 05:53:39 PM
 #107

just did this mod to fix like 3/4 of my S7's. it worked on probably 5/6 of my dead boards. nobody at bitmain or any other mining company was willing to provide this info, and it's strikingly, frustratingly simple. many, many thanks to everybody here for saving me some time and money.

if i were to provide any input, i'd suggest not using a pot but a fixed value resistor as that connection is a current-sense circuit which is probably somewhat noise sensitive. at least use the most compact potentiometer with the shortest leads that you can.
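
To make the fixed-resistor suggestion concrete, here is a minimal Python sketch of picking the closest standard value once the pot has been dialed in and measured out of circuit. It assumes 1% parts from the E96 series, generated from the usual geometric formula; the function names and the example reading are made up for illustration and are not taken from the thread.

```python
import math

def e96_values(decade):
    """Return the 96 nominal E96 values for one decade (decade=1000 gives 1.00k to 9.76k)."""
    return [round(10 ** (i / 96), 2) * decade for i in range(96)]

def nearest_e96(measured_ohms):
    """Pick the E96 value closest to a measured resistance (must be > 0)."""
    decade = 10 ** math.floor(math.log10(measured_ohms))
    # Add the first value of the next decade so readings near the top
    # of a decade (e.g. 9.9k) are allowed to round up.
    candidates = e96_values(decade) + [10.0 * decade]
    return min(candidates, key=lambda r: abs(r - measured_ohms))

# Hypothetical reading taken from the pot after tuning.
print(nearest_e96(1480))
```

If the nearest value lands noticeably off the tuned setting, it may be worth checking the neighbouring values on the bench rather than trusting the closest one blindly.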

anybody know any of the other common fixes they have in their list? curious if I can try fixing the last few dead boards I have, or if the next most common fix is resoldering ASICs...

Dibblah
March 15, 2017, 10:30:55 PM
 #108

As far as I can see, the ONLY fix they do really is resoldering ASICs. On the S7, it's not much fun, either.

I've been fiddling about a bit with the VRM and stuff on the board. The LM27402 is a quite nice chip, goes all the way down to .6v output - and the WebBench tools are absolutely awesome. So, theoretically, if the top side of the board has too many faulty ASICs, you could jumper the heatsinks to "cut them out" of the chain.

The only problems I am having are ringing (obviously, a proper poured power plane looks nothing like a relatively thin bridge electrically) and that damn clock snaking over the board. I am unsure of how possible it is to disconnect the clock chain and add back in the oscillator modules that haven't been stuffed.

Also looking with envious eyes at the NEC Tokin caps as used in the PS3 - insanely low ESR.

There's a couple of lessons I learned playing around, though. Firstly, cheap pots are cheap and very occasionally will run over a grain of dirt. When they do this, they go from ~1.5k in a fraction of a second to infinite resistance. Since these SMPSUs are designed to be insanely good at levelling out droops caused by high current draws, the power supply then sees that it's outputting essentially 0v. And ramps the output up to 12v. That leads onto the second thing I learned. It's amazing how much smoke a tiny little chip can emit when you put 12v @ 120A (instantaneous) or so through it.
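
The runaway described above can be sketched with the usual resistor-divider feedback relation for a buck regulator, V_out = V_ref * (1 + R_top / R_bottom). This is only an illustration under assumptions: it treats the pot as the upper leg of a plain two-resistor divider, uses the 0.6 V reference implied by the LM27402's minimum output, and the 100 ohm lower resistor and the function name are hypothetical, not measured from an S7 board.

```python
def regulated_vout(v_ref, r_top, r_bottom, v_in):
    """Voltage an ideal buck regulator settles at, clamped to its input rail."""
    demanded = v_ref * (1.0 + r_top / r_bottom)
    return min(demanded, v_in)

V_REF = 0.6        # volts, regulator reference
V_IN = 12.0        # volts, supply feeding the regulator
R_BOTTOM = 100.0   # ohms, hypothetical lower divider leg

healthy = regulated_vout(V_REF, 1500.0, R_BOTTOM, V_IN)
open_wiper = regulated_vout(V_REF, float("inf"), R_BOTTOM, V_IN)

print(f"pot at 1.5k: {healthy:.2f} V")
print(f"pot open:    {open_wiper:.2f} V")
print(f"12 V at 120 A: {12.0 * 120.0:.0f} W")
```

With the wiper open, the feedback pin never sees the output rise, so the demanded voltage is unbounded and only the input rail limits it. That is one more argument for a fixed resistor once the value is known.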

icside
March 15, 2017, 10:46:11 PM
 #109


HAH! Yeah, one of the more fun stories I've been told is about a slow but unstoppable fire created by a low-voltage short on a PCB connected to a ~500A 5V power supply. the board caught on fire, and the power supply gave zero fcks. just kept sourcing more and more current, fire got bigger, etc.

Re: re-soldering the ASICs: I've hand-re-flowed 1mm pitch BGA FPGAs and know the pains (especially with the temperatures required for lead-free solder and the temperatures at which most low-quality FR4 delaminates / melts / catches on fire). I think that Bitmain has always been smart enough to use DFNs with a ground/thermal slug IIRC? I've found DFNs to be difficult, but much easier to rework in comparison. The problem is I have no idea how to figure out which are dead and which are alive....

Dibblah
March 15, 2017, 11:08:17 PM
 #110

Unfortunately, the bm1385 is 0.4mm pitch QFNish - with an exposed pad used as ground. So it's a bit of a nightmare to solder. Also if you have a newer board, it'll come with wonderful black adhesive holding the heatsinks onto the board. Not onto the chips, the board - they've used so much adhesive that it has squished out and forms a sort of poor-man's infill around the chip.

icside
March 15, 2017, 11:50:08 PM
 #111

ah. crap. any idea on how to identify bad ones? I guess I could use a thermal camera, but i'd guess that the whole chain never comes up when one is shorted... I imagine maybe they look at the (presumably I2C) bus that all the chips sit on to see which doesn't say 'hi'?

Dibblah
March 16, 2017, 07:27:35 AM
 #112

The bus is serial, with a couple of status pins. Easiest way to locate the bad one is to initially look at the board status - as far as I've seen, splits only happen "up" the voltage ladder (you can't get a failed chip near ground then have anything work after it). This assumes that the entire chip has gone bad - sometimes it's just the hashing elements. In this case, you'll get an X in the status display. Once this happens, the voltage level of the entire chain becomes suspect, so...

Take a multimeter and measure the voltage on each heatsink set of 3 (either reference ground or the set of chips prior). You'll usually get one or two sets of "different" voltages. The challenge from there is figuring out what's wrong. If you have a scope, have a look on the power rail to each chip group (VCCIO is hardly ever the issue - it's done from what I recall with a simple resistive dropper). CLKOut from each chip is also a good troubleshooting step - when the chip is working, it passes a regenerated clock out of that port. When it's not hashing due to power issues, you'll see what looks like a sawtooth on that pin (never gets to a good clock, just oscillates at the switching frequency or so)
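
The multimeter pass described above amounts to looking for voltage steps that stand out from the rest of the ladder. A minimal sketch, assuming the readings are taken from each heatsink set to ground and that a healthy board has roughly equal steps; the function names, the 0.05 V tolerance, and the sample readings are all made up for illustration.

```python
from statistics import median

def steps_from_ground_readings(readings):
    """Convert ground-referenced readings into the voltage across each set."""
    previous = 0.0
    steps = []
    for v in readings:
        steps.append(v - previous)
        previous = v
    return steps

def suspect_sets(steps, tolerance=0.05):
    """Return indices of sets whose voltage is off the median by more than tolerance."""
    typical = median(steps)
    return [i for i, v in enumerate(steps) if abs(v - typical) > tolerance]

# Hypothetical readings for the first five sets, lowest set first.
readings = [0.66, 1.32, 1.98, 2.90, 3.30]
steps = steps_from_ground_readings(readings)
print(suspect_sets(steps))
```

If the readings were taken relative to the previous set instead, they are already per-set steps and can be passed to `suspect_sets` directly. The flagged sets are only a starting point for the scope work on the power rail and CLKOut.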

It's all quite difficult to diagnose as a unit, because generally once one or two chips die in interesting ways, the power performance of the chain in its entirety is suspect. Even when a chip is just not hashing ("scission of chips" caused by inter-chip comms failures), it can bring the entire chain down or make it unreliable.

You can also follow the reset / busy lines down the board as well as the CI/CO data.

Short version: It's difficult. :)
icside
March 17, 2017, 02:02:14 AM
 #113

wow, this is much more extensive than I was expecting. thanks! I do have a scope and will probably be checking this out at some point.

Sort of too bad that... I imagine you couldn't put a big diode or something with a similar IV curve in its place and route the CLK and D around the dead chip?

Dibblah
March 17, 2017, 08:58:31 AM
 #114


As long as your diode can cope with 45+A through it continuously, possibly (assuming you're jumpering out one voltage level). However, I am unsure if you can match the on-load requirements closely enough.
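
A rough feel for why the diode idea is hard: whatever stands in for a dead voltage level has to drop about the same voltage as the level it replaces while carrying the full string current, so it dissipates P = V * I. A back-of-envelope sketch using the 45 A figure above; the 0.65 V drop and the function name are placeholder assumptions, not measured S7 values.

```python
def bypass_dissipation_watts(drop_volts, string_current_amps):
    """Power burned in a series element that drops drop_volts at the string current."""
    return drop_volts * string_current_amps

# 45 A comes from the post above; 0.65 V is a hypothetical per-level drop.
print(f"{bypass_dissipation_watts(0.65, 45.0):.2f} W")
```

Tens of watts in a single part means it would need heatsinking of its own, on top of the question of matching the on-load behaviour.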

I am unsure if you can jumper using the exposed diagnostic points, my feeling is probably no but would be interesting to try. Removing the chips and soldering jumper wires does not appeal at that scale! Also, I don't have enough hardware to test all of this properly, so some of this is just guesswork. It'd be great if someone who had played with this in real life would confirm some of this stuff. Or if I could get hold of a few more burnt / dead boards!

Interestingly (for me), the LEDs are connected at the end of the chain and seem to be just the busy lines from the last chip.
Dragonizer
March 20, 2017, 10:06:58 PM
 #115

Quote
This fix does not apply to S7 Batch 1 boards, since it deals with the board's voltage regulator and there is none on the 54-chip boards, just the 45-chip boards from later batches.

Is there a list of boards/batches available that someone can post? I have 4 600 MHz boards that showed no temp or hash speed: all 45 ASICs, no x or -. After a F/W upgrade they all now show temp but no hash ???

The ONLY chart i can find is this -

https://c1.staticflickr.com/4/3740/32998195660_c443a58f09_b.jpg (batch by Leighton Kappen, on Flickr)

Cheers
Dragonizer
March 20, 2017, 10:13:09 PM
 #116

Quote
This fix does not apply to S7 Batch 1 boards, since it deals with the board's voltage regulator and there is none on the 54-chip boards, just the 45-chip boards from later batches.

Hi

Where is this faulty IC chip located, and where exactly do I measure the voltage?

I have 4 faulty S7 600 MHz boards, batch 5. All show fine: no faults, all ASICs OK, temps low, just NO GH/s ???

I want to mark components, especially the 'faulty' PIC voltage controller, with string UV paste (I can get it from work) and send it off for RMA so we ALL can see what they are actually replacing.

Cheers guys, great thread !!!!
Dragonizer
March 21, 2017, 01:45:02 AM
 #117

Quote
Unfortunately, the bm1385 is 0.4mm pitch QFNish - with an exposed pad used as ground. So it's a bit of a nightmare to solder. Also if you have a newer board, it'll come with wonderful black adhesive holding the heatsinks onto the board. Not onto the chips, the board - they've used so much adhesive that it has squished out and forms a sort of poor-man's infill around the chip.

What! There is no paste between the heatsinks and ASICs? Why? Even a bit of '1mm TIM' pad and then high-temp glue to hold the heatsink in place; surely that would be a better upgrade?
NotFuzzyWarm
March 21, 2017, 02:00:27 AM
 #118

Read it again.
There *is* thermal contact between the chip and heatsink. The thermal epoxy. To minimize chance of the heatsinks coming off they use a LOT of it. So much that yes there is excess actually bonding the board to the sinks. Assuming that the heat sinks were properly pressed down onto the chips to give a very thin even layer, the added contact between the board and sink actually helps move a little more heat away from the package.

Dragonizer
March 21, 2017, 02:18:20 AM
 #119

Sorry, missed that. Thanks
NotFuzzyWarm
March 21, 2017, 02:33:59 AM
 #120

np, easy to mis-read.
Of course from a repair standpoint, that makes pulling chips a royal PITA. Thass why Sidehack has no interest in doing a stick using the s7/s9 chips.
