Bitcoin Forum
Topic: Cheap and simple repair of S7 hash board
icside
Member
**
Offline Offline

Activity: 117
Merit: 10


View Profile WWW
March 15, 2017, 11:50:08 PM
Last edit: July 31, 2018, 11:33:54 PM by frodocooper
 #101

Unfortunately, the bm1385 is 0.4mm pitch QFNish - with an exposed pad used as ground. So it's a bit of a nightmare to solder. Also if you have a newer board, it'll come with wonderful black adhesive holding the heatsinks onto the board. Not onto the chips, the board - they've used so much adhesive that it has squished out and forms a sort of poor-man's infill around the chip.

Ah, crap. Any idea how to identify the bad ones? I guess I could use a thermal camera, but I'd guess the whole chain never comes up when one chip is shorted... I imagine they look at the (presumably I2C) bus that all the chips sit on to see which one doesn't say 'hi'?

Dibblah
Newbie
*
Offline Offline

Activity: 56
Merit: 0


View Profile
March 16, 2017, 07:27:35 AM
 #102

The bus is serial, with a couple of status pins. Easiest way to locate the bad one is to initially look at the board status - as far as I've seen, splits only happen "up" the voltage ladder (you can't get a failed chip near ground then have anything work after it). This assumes that the entire chip has gone bad - sometimes it's just the hashing elements. In this case, you'll get an X in the status display. Once this happens, the voltage level of the entire chain becomes suspect, so...

Take a multimeter and measure the voltage on each heatsink set of 3 (referenced either to ground or to the previous set of chips). You'll usually get one or two sets of "different" voltages. The challenge from there is figuring out what's wrong. If you have a scope, have a look at the power rail to each chip group (VCCIO is hardly ever the issue - from what I recall it's done with a simple resistive dropper). CLKOut from each chip is also a good thing to check - when the chip is working, it passes a regenerated clock out of that port. When it's not hashing due to power issues, you'll see what looks like a sawtooth on that pin (it never gets to a good clock, just oscillates at the switching frequency or so).

It's all quite difficult to diagnose as a unit, because generally once one or two chips die in interesting ways, the power performance of the chain in its entirety is suspect. Even when a chip is just not hashing ("scission of chips" caused by inter-chip comms failures), it can bring the entire chain down or make it unreliable.

You can also follow the reset / busy lines down the board as well as the CI/CO data.

Short version: It's difficult. :)
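
For anyone logging the multimeter readings described above, here is a minimal Python sketch of the bookkeeping. It assumes the readings were taken relative to ground, so each one is the running total up the ladder; the function names, the sample readings and the 0.05 V tolerance are all made up for illustration and do not come from any Bitmain documentation.

```python
from statistics import median

def per_level_drops(cumulative_volts):
    """Turn ground-referenced readings into the drop across each set of chips."""
    drops = []
    previous = 0.0  # ground reference
    for reading in cumulative_volts:
        drops.append(reading - previous)
        previous = reading
    return drops

def suspect_levels(drops, tolerance=0.05):
    """Return (index, drop) pairs that differ from the median drop by more than tolerance volts."""
    typical = median(drops)
    return [(i, round(d, 3)) for i, d in enumerate(drops) if abs(d - typical) > tolerance]

# Hypothetical readings for the first six sets of a chain, in volts above ground.
readings = [0.66, 1.32, 1.98, 2.90, 3.54, 4.18]
print(suspect_levels(per_level_drops(readings)))
```

Comparing against the median rather than a fixed nominal value is a deliberate choice here: it still works when the absolute supply voltage is not known exactly.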
icside
Member
**
Offline Offline

Activity: 117
Merit: 10


View Profile WWW
March 17, 2017, 02:02:14 AM
 #103

wow, this is much more extensive than I was expecting. thanks! I do have a scope and will probably be checking this out at some point.

Sort of too bad that... I imagine you couldn't put a big diode or something with a similar IV curve in its place and route the CLK and D around the dead chip?

Dibblah
Newbie
*
Offline Offline

Activity: 56
Merit: 0


View Profile
March 17, 2017, 08:58:31 AM
 #104


wow, this is much more extensive than I was expecting. thanks! I do have a scope and will probably be checking this out at some point.

Sort of too bad that... I imagine you couldn't put a big diode or something with a similar IV curve in its place and route the CLK and D around the dead chip?

As long as your diode can cope with 45+A through it continuously, possibly (assuming you're jumpering out one voltage level). However, I am unsure if you can match the on-load requirements closely enough.
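
To put a rough number on that, a bypass element that drops one voltage level while carrying the full chain current has to burn P = V * I as heat. The sketch below takes about 0.67 V per level (a figure quoted elsewhere in this thread for 54-chip boards) and the 45 A lower bound from above; both are assumptions, and the function name is hypothetical.

```python
def bypass_dissipation_watts(level_drop_volts, chain_current_amps):
    """Heat a series bypass element must dissipate: P = V * I."""
    return level_drop_volts * chain_current_amps

# Assumed values: ~0.67 V across one level, 45 A through the chain.
watts = bypass_dissipation_watts(0.67, 45.0)
print(f"{watts:.0f} W to get rid of")
```

That is on the order of 30 W in a single part, so whatever replaces the chips would need serious heatsinking of its own.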

I am unsure if you can jumper using the exposed diagnostic points, my feeling is probably no but would be interesting to try. Removing the chips and soldering jumper wires does not appeal at that scale! Also, I don't have enough hardware to test all of this properly, so some of this is just guesswork. It'd be great if someone who had played with this in real life would confirm some of this stuff. Or if I could get hold of a few more burnt / dead boards!

Interestingly (for me), the LEDs are connected at the end of the chain and seem to be just the busy lines from the last chip.
Dragonizer
Full Member
***
Offline Offline

Activity: 279
Merit: 107


View Profile
March 20, 2017, 10:06:58 PM
Last edit: July 31, 2018, 11:35:37 PM by frodocooper
 #105

This fix does not apply to S7 Batch 1 boards, since it deals with the board's voltage regulator and there is none on the 54-chip boards, just the 45-chip boards from later batches.

Is there a list of boards/batches available that someone can post? I have four 600 MHz boards that showed no temp and no hash speed, with all 45 ASICs listed and no x or -. After a F/W upgrade they all now show temp but still no hash ???

The ONLY chart i can find is this -



Cheers
Dragonizer
Full Member
***
Offline Offline

Activity: 279
Merit: 107


View Profile
March 20, 2017, 10:13:09 PM
 #106

This fix does not apply to S7 Batch 1 boards, since it deals with the board's voltage regulator and there is none on the 54-chip boards, just the 45-chip boards from later batches.

Hi

Where is this faulty IC chip located and where exactly do i measure the voltage ??

I have 4 faulty S7 600 MHz boards, batch 5... all show fine with no faults, all ASICs OK, temps low, just NO GH ???

I want to mark components especially the 'faulty' pic voltage controller with string UV paste ( i can get from work) and send it off for RMA so we ALL can see what they are actually replacing.

Cheers guys, great thread !!!!
Dragonizer
Full Member
***
Offline Offline

Activity: 279
Merit: 107


View Profile
March 21, 2017, 01:45:02 AM
Last edit: July 31, 2018, 11:35:59 PM by frodocooper
 #107

Unfortunately, the bm1385 is 0.4mm pitch QFNish - with an exposed pad used as ground. So it's a bit of a nightmare to solder. Also if you have a newer board, it'll come with wonderful black adhesive holding the heatsinks onto the board. Not onto the chips, the board - they've used so much adhesive that it has squished out and forms a sort of poor-man's infill around the chip.

What!!! There is no paste between the heatsinks and the ASICs? Why? Even a bit of 1 mm TIM pad and then high-temp glue to hold the heatsink in place would surely be a better upgrade?
NotFuzzyWarm
Legendary
*
Offline Offline

Activity: 3626
Merit: 2531


Evil beware: We have waffles!


View Profile
March 21, 2017, 02:00:27 AM
Last edit: July 31, 2018, 11:36:20 PM by frodocooper
 #108

What!!! There is no paste between the heatsinks and the ASICs? Why? Even a bit of 1 mm TIM pad and then high-temp glue to hold the heatsink in place would surely be a better upgrade?

Read it again.

There *is* thermal contact between the chip and heatsink. The thermal epoxy. To minimize chance of the heatsinks coming off they use a LOT of it. So much that yes there is excess actually bonding the board to the sinks. Assuming that the heat sinks were properly pressed down onto the chips to give a very thin even layer, the added contact between the board and sink actually helps move a little more heat away from the package.

- For bitcoin to succeed the community must police itself -    My info useful? Donations welcome! 1FuzzyWc2J8TMqeUQZ8yjE43Rwr7K3cxs9
 -Sole remaining active developer of cgminer, Kano's repo is here
-Support Sidehacks miner development. Donations to:   1BURGERAXHH6Yi6LRybRJK7ybEm5m5HwTr
Dragonizer
Full Member
***
Offline Offline

Activity: 279
Merit: 107


View Profile
March 21, 2017, 02:18:20 AM
Last edit: July 31, 2018, 11:36:38 PM by frodocooper
 #109

Read it again.

There *is* thermal contact between the chip and heatsink. The thermal epoxy. To minimize chance of the heatsinks coming off they use a LOT of it. So much that yes there is excess actually bonding the board to the sinks. Assuming that the heat sinks were properly pressed down onto the chips to give a very thin even layer, the added contact between the board and sink actually helps move a little more heat away from the package.

Sorry, missed that. Thanks
NotFuzzyWarm
Legendary
*
Offline Offline

Activity: 3626
Merit: 2531


Evil beware: We have waffles!


View Profile
March 21, 2017, 02:33:59 AM
 #110

np, easy to mis-read.
Of course from a repair standpoint, that makes pulling chips a royal PITA. Thass why Sidehack has no interest in doing a stick using the s7/s9 chips.

Dragonizer
Full Member
***
Offline Offline

Activity: 279
Merit: 107


View Profile
March 21, 2017, 02:35:26 AM
Last edit: July 31, 2018, 11:37:04 PM by frodocooper
 #111

second board is fixed , i have it running at 500 freq/1.2 th , dont have my volt meter to set the voltage to run it at 700, so ill take care of that later tonight.

this isnt an easy fix by any means soldering to r17 is a royal pain , c76 isnt to bad but id imagine u could just solder to gnd on the pcie plug instead of the side of c76. that would make it a lil easier

that being said im sending the remainder of my boards to sidehack for pic replacement , instead of soldering the rest myself

[...]

Who is 'sidehack', mate?

Are they better than an RMA to Bitmain?

Thanks
andresrp
Member
**
Offline Offline

Activity: 135
Merit: 11


View Profile
March 21, 2017, 04:04:50 AM
 #112

Hi, are there any repair guides for S7 boards that show 48 ASICs? If I'm not wrong, we have only the 54-chip version of the board. Thank you
Dibblah
Newbie
*
Offline Offline

Activity: 56
Merit: 0


View Profile
March 21, 2017, 07:57:59 AM
 #113

np, easy to mis-read.
Of course from a repair standpoint, that makes pulling chips a royal PITA. Thass why Sidehack has no interest in doing a stick using the s7/s9 chips.

It's not _that_ bad. As long as it's up to solder melting temperature, the black stuff is soft enough. Problem is, at that temp I think the package has started to weaken a little too. And you need some way to remove the black stuff from each chip which is... Less than fun. That combined with the pin-pitch makes this definitely not a hobby job.

Cheers,

Allan.
Dibblah
Newbie
*
Offline Offline

Activity: 56
Merit: 0


View Profile
March 21, 2017, 08:16:43 AM
 #114

Hi, are there any repair guides for S7 boards that show 48 ASICs? If I'm not wrong, we have only the 54-chip version of the board. Thank you

There is essentially not a lot you can do without QFN rework facilities and understanding the board. If it's a chip fallen off situation (desoldered), you may be able to get away with a DIY fix, but even then it's not easy. Likely faults in high-temp environments on the 54 chip boards from what I've heard are:
 Cap failure - which is not impossible to fix, the heatsinks on the back come off fairly easily.
 Failure of the tiny boost converter which does the I/O voltages for the last (couple?) of sets of chips. Hard to fix unless you know what you're doing.
 Failure of an ASIC (solder joint issue). Possible to flux and reflow, but cleaning without softening the adhesive holding all the rest of the heatsinks on is not fun.
 Failure of an actual ASIC chip. Find the chip and replace. Not going to happen if you don't already do fine-pitch SMD work.

There is no PIC or voltage control on a 54 chip board.

I haven't seen a 54 chip board with black adhesive, but I don't have a large sample size.

You can get some diagnostic information by reading the voltages between each "set of 3" heatsinks - there should be around .666 volts per chip (a small amount of variance is normal). However, even then, there's not a lot you can do without SMD rework.

Check the kernel log - I would expect to see timeouts because it can't see any of the chain - hence why it's defaulting to 48 or 30 chips.
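
As a worked version of the "set of 3" check above: 54 chips in sets of 3 gives 18 levels in series, and if the chain sits directly across a 12 V supply (an assumption, though it matches the .666 V figure), each level should see 12 / 18 V. A small Python sketch, with hypothetical function names and a guessed 0.05 V tolerance:

```python
def nominal_level_voltage(supply_volts, chip_count, chips_per_level=3):
    """Expected voltage across one set of chips in an unregulated series chain."""
    levels = chip_count // chips_per_level
    return supply_volts / levels

def out_of_range(measured_volts, nominal_volts, tolerance=0.05):
    """Indices of levels whose measured voltage is more than tolerance volts off nominal."""
    return [i for i, v in enumerate(measured_volts) if abs(v - nominal_volts) > tolerance]

# Assumed: 12 V supply, 54-chip board, so 18 levels.
nominal = nominal_level_voltage(12.0, 54)
print(f"nominal per level: {nominal:.3f} V")

# Hypothetical readings across four adjacent sets of heatsinks.
print(out_of_range([0.66, 0.67, 0.45, 0.90], nominal))
```

A level reading low next to one reading high is the kind of "different" pair worth looking at first, but the numbers only narrow down where to look; they do not say what failed.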

Dragonizer
Full Member
***
Offline Offline

Activity: 279
Merit: 107


View Profile
March 21, 2017, 12:33:05 PM
 #115

Hi guys,

How do I check that the PIC is faulty, and at which two points do you test?

Is it a resistance test or do you power board and test the voltage ??

ANY help much appreciated

Thanks
NotFuzzyWarm
Legendary
*
Offline Offline

Activity: 3626
Merit: 2531


Evil beware: We have waffles!


View Profile
March 21, 2017, 12:49:53 PM
 #116

Hi guys,

How do I check that the PIC is faulty, and at which two points do you test?
Is it a resistance test or do you power board and test the voltage ??
ANY help much appreciated
Thanks
You may want to look at Sidehack's Modding s7 thread https://bitcointalk.org/index.php?topic=1504228.msg15360994#msg15360994
Lots of good info in there including how to change Vcore with a simple firmware mod.

Dragonizer
Full Member
***
Offline Offline

Activity: 279
Merit: 107


View Profile
March 22, 2017, 07:27:16 PM
Last edit: July 31, 2018, 11:38:24 PM by frodocooper
 #117

You may want to look at Sidehack's Modding s7 thread https://bitcointalk.org/index.php?topic=1504228.msg15360994#msg15360994
Lots of good info in there including how to change Vcore with a simple firmware mod.

I thought that was the inductor (copper coil with magnet in middle) and the PIC is lower down to the left ?

NotFuzzyWarm
Legendary
*
Offline Offline

Activity: 3626
Merit: 2531


Evil beware: We have waffles!


View Profile
March 22, 2017, 11:40:58 PM
Last edit: July 31, 2018, 11:38:48 PM by frodocooper
 #118

I thought that was the inductor (copper coil with magnet in middle) and the PIC is lower down to the left ?

Ref your pic, is not a magnet, is a ferrite core for the inductor. The regulators run around 50-100kHz and iron laminations are useless at those freq.

I believe the PIC is on the other side of the board off to the right of 3 other larger chips. Follow the traces from the 6-pin programming connection. I have a s7 board pulled and will take it into work tomorrow to look at the actual chip numbers with a microscope. DAMN that writing is tiny .

EDIT: Just looked at the chips under a video scope and you are right. That little chip U3 on the inductor side of the board is the PIC.

tubexc
Hero Member
*****
Offline Offline

Activity: 496
Merit: 500


View Profile
July 02, 2017, 04:20:11 PM
 #119

My S7 and also my friend's S7 stopped hashing many times, even with boards that had come back from RMA, so I decided to stop the expensive shipping of boards to Bitmain and the long waits without mining. This mod is only suitable for the 135-chip version.

After the repair I am able to tune the S7 to be very efficient: my B8 runs at 600 MHz (4.05 TH/s) at 0.22 W/GH/s DC, which is around 0.235 W/GH/s at the wall.
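
Those efficiency figures can be turned back into watts with simple arithmetic. The Python sketch below only restates the numbers quoted above; the function name is made up, and the PSU efficiency is merely what the two figures imply, not a measured value.

```python
def power_watts(hashrate_ths, watts_per_ghs):
    """Power draw from hashrate in TH/s and efficiency in W per GH/s."""
    return hashrate_ths * 1000.0 * watts_per_ghs

dc_watts = power_watts(4.05, 0.22)     # DC side, about 891 W
wall_watts = power_watts(4.05, 0.235)  # at the wall, about 952 W
implied_psu_efficiency = dc_watts / wall_watts

print(f"DC: {dc_watts:.0f} W, wall: {wall_watts:.0f} W, "
      f"implied PSU efficiency: {implied_psu_efficiency:.1%}")
```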

The repair is very simple. Every board I've seen had a malfunctioning or dead PIC microcontroller adjusting the voltage for the chips, so I decided to override it with a 50k potentiometer. Now I can adjust the voltage for each board manually.

This is the original board without the fix. You need to connect the potentiometer between the U2--R17 connection and GND, which can be found on C76.

http://pantin.cz/20160209_155344.jpg

First, use silicone or any other suitable glue to fix the potentiometer to the board. After it dries, you can solder its pins to R17 and C76.

http://www.pantin.cz/20160401_095409.jpg

Once you are done, use a small screwdriver to turn the potentiometer clockwise to set the lowest voltage, about 9.3 V, which should be enough to start the miner at 500 MHz.
You can adjust the voltage for each board even while the miner is running and instantly check the number of HW errors during operation.


I hope it will help you. My opinion is that PIC malfunction is intentional from Bitmain to lower diff after RMA period.

Hello RadekG
Can you please update the links to your pics they are not working Sad
This fix works on Antminer s5 hashing boards too ?
sidehack
Legendary
*
Offline Offline

Activity: 3318
Merit: 1848

Curmudgeonly hardware guy


View Profile
July 02, 2017, 04:24:10 PM
 #120

S5 are unregulated; since this is a repair for the regulator it can't be applied to S5.

Cool, quiet and up to 1TH pod miner, on sale now!
Currently in development - 200+GH USB stick; 6TH volt-adjustable S1/3/5 upgrade kit
Server PSU interface boards and cables. USB and small-scale miners. Hardware hosting, advice and odd-jobs. Supporting the home miner community since 2013 - http://www.gekkoscience.com