Bitcoin Forum
Author Topic: Hacking GPU cards back into operation because I need something to do....  (Read 3880 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
lightfoot (OP)
Legendary
*
Offline Offline

Activity: 3108
Merit: 2239


I fix broken miners. And make holes in teeth :-)


View Profile
January 11, 2017, 02:38:36 AM
 #1

So I've been fixing Titans, Neptunes, Monarchs, Singles, Avalons, and a whole bunch of mining technologies over the years, but for some reason never really fiddled around much with GPU cards. They blow up too, and I see them for sale on eBay all the time. I need a challenge, so I thought I would start a thread on my observations in fixing them if possible, developing techniques that can work, and figuring out how to tell one that can be fixed from a brick.

As normal, I will post my thoughts below and see what I can come up with. First up I need to find some dead cards to practice on....

Background: Years of doing SMD repair on electric car power controllers (400 V / 500 A) as well as miners (0.6 V, 1000 A) and other small things. I prefer to use hot air rework tools, and I like to use pre-heat to keep from roasting components. I don't use the toaster to repair boards. :-)

Let's see where this goes.
lightfoot (OP)
January 11, 2017, 02:39:05 AM
 #2

Reserved for tips and tricks
lightfoot (OP)
January 11, 2017, 02:39:19 AM
 #3

Reserved for status. Let's roll....
Emoclaw
Sr. Member
****
Offline Offline

Activity: 420
Merit: 251


View Profile
January 11, 2017, 02:53:47 AM
 #4

Nice. I'd be interested to see how many of the cards you attempt to repair can actually be repaired. 
I have a friend in the component-level repair industry and he says that most GPUs die because their VRMs are either terribly designed or badly cooled. The graphics chip itself rarely dies. He doesn't actually repair graphics cards, though, due to the lack of schematics, which he says makes the process more time consuming.
Good luck, I'll be following this thread.
reb0rn21
Legendary
*
Offline Offline

Activity: 1898
Merit: 1024


View Profile
January 11, 2017, 02:57:59 AM
 #5

I presume that if a VRM gets shorted, the PCB will be damaged, at least on mid/high-end cards.

In the past, like 6+ years ago, most problems were due to bad solder between the GPU and the PCB, so a reflow helped. Now I think it's mostly the VRM or the GPU memory going bad.

bathrobehero
Legendary
*
Offline Offline

Activity: 2002
Merit: 1051


ICO? Not even once.


View Profile
January 11, 2017, 03:04:13 AM
 #6

Cool.

Out of dozens of GPUs over the years, I only ever had one particular model (GV-N75TOC-2GI) die because it had weak VRMs. I think 5 out of 6 died within months.
After the RMA repair process the same cards still work flawlessly.

Not your keys, not your coins!
SweaterJacket
Newbie
*
Offline Offline

Activity: 27
Merit: 0


View Profile
January 11, 2017, 03:17:15 AM
 #7

Reserved
lightfoot (OP)
January 11, 2017, 03:25:25 AM
 #8

Interesting. Power subsystems are one of my specialties; it's surprisingly hard to build a good one and easy to screw it up.

My first thought was that overheating the GPU chip could cause the solder balls to go high resistance, thus causing it to fail. However, most GPUs are a very high density BGA die mounted on a carrier board that fans out to a pitch that will mate to a rational PCB. The high density BGA isn't the issue in itself; it's that they glue the die to the carrier, and if you overheat the chip too much the solder balls "blow out" and short under the die. At that point it's sunk.

I'll take a look into the VRMs.

C
lightfoot (OP)
January 11, 2017, 03:29:10 AM
 #9

Quote from: reb0rn21 on January 11, 2017, 02:57:59 AM
I presume that if a VRM gets shorted, the PCB will be damaged, at least on mid/high-end cards.

In the past, like 6+ years ago, most problems were due to bad solder between the GPU and the PCB, so a reflow helped. Now I think it's mostly the VRM or the GPU memory going bad.

Typically the high-side FETs on reasonable VRMs will have an RC circuit or an op-amp comparator across them to measure current flow and shut down the VRM if the current goes too high (i.e. a burned FET) before there is a cut-through short to ground. Low-side FETs rarely fail because their on time is much higher than the high side's, so they don't have as much switching loss.

If the GPU shorts internally you're sunk of course but that can be tested by pulling the high side FETs and looking for shorts. Hm.
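
To put rough numbers on the on-time point: in an ideal synchronous buck stage the high-side FET conducts for a fraction D = Vout/Vin of each cycle and the low-side FET for the remaining 1 - D. The sketch below assumes a 12 V input and a 1.2 V core rail (illustrative values, not measured from any particular card) and ignores dead time and losses. The overcurrent check is likewise only a simplified model of sensing the voltage across a FET while it is on; the 5 milliohm on-resistance and the 40 A limit are made-up numbers, and the function names are hypothetical.

```python
def buck_on_fractions(v_in: float, v_out: float) -> dict:
    """Fraction of each switching period that each FET conducts in an
    ideal (lossless, no dead time) synchronous buck converter."""
    if v_in <= 0 or v_out <= 0 or v_out > v_in:
        raise ValueError("need 0 < v_out <= v_in")
    duty = v_out / v_in  # high-side on-time fraction, D = Vout / Vin
    return {"high_side": duty, "low_side": 1.0 - duty}


def overcurrent_tripped(v_sense: float, r_on_ohms: float, limit_amps: float) -> bool:
    """Simplified comparator model: estimate the current from the voltage
    across the conducting FET and compare it against a fixed limit."""
    current = v_sense / r_on_ohms  # Ohm's law, I = V / R
    return current > limit_amps


# Assumed example values: 12 V input, 1.2 V core rail.
fractions = buck_on_fractions(12.0, 1.2)
print(f"high side on {fractions['high_side']:.0%} of the cycle, "
      f"low side on {fractions['low_side']:.0%}")

# Assumed 5 milliohm on-resistance and a 40 A per-phase limit.
print(overcurrent_tripped(0.15, 0.005, 40.0))  # about 30 A, under the limit
print(overcurrent_tripped(0.30, 0.005, 40.0))  # about 60 A, over the limit
```

With those assumed voltages the low-side FET is on for roughly nine tenths of every cycle, which is the "much higher on time" referred to above.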
l8nit3
Legendary
*
Offline Offline

Activity: 1007
Merit: 1000


View Profile
January 11, 2017, 03:29:23 AM
 #10

I'm highly intrigued by this idea and have thought of the same myself, but I don't have the low-level hardware background to make it a possibility. Personally, I have a 280X that's driving me nuts. Hopefully you end up working with a card with a similar issue.

Just to put it out there: the card mines just fine, but no matter what drivers or GPU-reading software I use (GPU-Z, AB, TriXX), I cannot ever get this thing to show a temperature! In fact, I've paid the shipping and had it sent back to Gigabyte under warranty, and after they claimed to fix it, it still shows no temp!

All that said, I love the idea of this thread and will be following very closely. Good luck, and thank you in advance for any tips and tricks you find. :)
lightfoot (OP)
January 11, 2017, 03:32:27 AM
 #11

Quote from: Emoclaw on January 11, 2017, 02:53:47 AM
Nice. I'd be interested to see how many of the cards you attempt to repair can actually be repaired.
I have a friend in the component-level repair industry and he says that most GPUs die because their VRMs are either terribly designed or badly cooled. The graphics chip itself rarely dies. He doesn't actually repair graphics cards, though, due to the lack of schematics, which he says makes the process more time consuming.
Good luck, I'll be following this thread.

Indeed. Lack of cooling on VRMs will cause the FETs to go; my guess is that overclocking can do it (current will avalanche as temps go up). As for schematics, there never seem to be any anymore, especially for Bitcoin miners; no one wants to take the liability, I suppose. However, these things are pretty simple at heart: get power into them, get work into the chip and out, and put the heat somewhere.

Now I need some dead boards to start working on. Anyone got a box of old dead boards?
mirny
Legendary
*
Offline Offline

Activity: 1108
Merit: 1005



View Profile
January 11, 2017, 03:47:57 AM
 #12

I have 5 or 6 dead boards: 7950, 7970, 280X, 6990s.

This is my signature...
bathrobehero
January 11, 2017, 04:19:46 AM
 #13

Getting back to my previous comment about certain models having the same issue: I also used to have a bunch of Asus GTX 780 Ti cards that were designed in a way that their VRMs would go well above 100°C, as they had absolutely no heat dissipation (they just hid under the heatsink with no contact). I bought a few thermal pads and put them on the VRMs so that the pads connected them to the heatsink, and the temps dropped drastically.

Also, when I used to mine Ethereum I noticed the memory modules would go slightly above 100°C (GTX 970) even without overclocking, while the GPU itself was at about 60°C. I expect a lot of those cards coming from miners who mined Eth for a long time, or who might even still be mining it, will end up dying.

So my point is that each exact model of card probably has an expected way of dying. And eBay is probably full of faulty cards that were already checked by someone experienced like OP and deemed FUBAR.

adaseb
Legendary
*
Offline Offline

Activity: 3752
Merit: 1710



View Profile
January 11, 2017, 11:18:17 AM
 #14

The most common failure with GPUs is the fans. Depending on which type of fan the card uses, it can be repaired in different ways.

The Sapphire Dual-X R9 280X and Gigabyte Windforce fans all have fan blades that you can easily pop off using some string, and then you can relube the bearing with grease. Works every time, pretty much.

With the more durable fans, like on the ASUS 7970 / ASUS 280X / MSI 280X, you need to drill a hole in the back, slightly off-centre, and pour in the thinnest oil you can get inside. This sometimes works great... and sometimes works but rattles. The reason is that grease would be best, but it's impossible to get at the bearing to grease it.

For the newer RX 470 / 480, the fans will probably start failing sooner or later; however, most of those have a 2-3 year warranty and you can just RMA them.

 

FFI2013
Hero Member
*****
Offline Offline

Activity: 906
Merit: 507


View Profile
January 11, 2017, 08:08:31 PM
 #15

I have a Gigabyte R9 270 you can check out. I lost the receipt needed to RMA it, but if you're in the US I can ship it to you. I also have a Gridseed Blade that needs to be looked at.
lightfoot (OP)
January 11, 2017, 10:12:32 PM
 #16

Yes, I am in the US, feel free to PM me as needed.
lightfoot (OP)
January 11, 2017, 10:14:20 PM
 #17

Quote
I had a GPU die in quite a silly way: I had the 1x end of the PCI extender I was using (16x to 1x) plugged in the wrong way, and apparently this killed the card through the extender. If this is something you think you can fix, I will gladly send it to you for shipping cost.

Sure. I'll PM you my address.

C
hhdllhflower
Newbie
*
Offline Offline

Activity: 18
Merit: 0


View Profile
January 11, 2017, 10:21:52 PM
 #18

Reserved
nice job Cool
helipotte
Hero Member
*****
Offline Offline

Activity: 650
Merit: 500


Pick and place? I need more coffee.


View Profile
January 12, 2017, 01:49:18 AM
 #19

Nice to see you working on GPUs. I have a few stencils coming for Tahiti/Pitcairn/Hawaii and I will take a crack at re-balling some of the units I have. They look like they use 0.5 mm balls. I can send you some of my "trouble" units if you want to try to fix them. I have a few units that keep popping MOSFETs. I have been trying to find a way to narrow down bad memory chips on cards. I don't even know if this is possible without changing them one at a time.

Cheers!
lightfoot (OP)
January 12, 2017, 02:13:20 AM
 #20

You would think one could run diagnostics on the things to find the bad memory chips; those are easy to swap out but yes, a pain.

Have you tried checking the resistance with the inductors off? Back in the BFL days that was the #1 best way to identify which FET was shorted and also a way to identify a shorted die (0 ohms means infinite current no matter how you slice it).

C
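
The inductors-off check can be written down as a small bookkeeping sketch. With the inductors lifted, each phase's switch node is isolated from the core rail, so a near-zero reading from a switch node to the input rail points at that phase's high-side FET, a near-zero reading to ground points at its low-side FET, and a near-zero reading from the core rail to ground points at the die. Everything here is an assumption for illustration: the 0.5 ohm threshold, the names, and the sample readings are made up. A healthy core rail can also read quite low, so comparing against a known-good card of the same model is safer than trusting a fixed threshold.

```python
from dataclasses import dataclass

SHORT_OHMS = 0.5  # assumed threshold; tune against a known-good card


@dataclass
class PhaseReading:
    name: str
    switch_to_input_ohms: float   # measured across the high-side FET
    switch_to_ground_ohms: float  # measured across the low-side FET


def find_shorts(phases, core_rail_to_ground_ohms, short_ohms=SHORT_OHMS):
    """Return a list of suspected shorts from inductors-off resistance readings."""
    findings = []
    for phase in phases:
        if phase.switch_to_input_ohms < short_ohms:
            findings.append(f"{phase.name}: high-side FET looks shorted")
        if phase.switch_to_ground_ohms < short_ohms:
            findings.append(f"{phase.name}: low-side FET looks shorted")
    if core_rail_to_ground_ohms < short_ohms:
        findings.append("core rail: possible shorted die")
    return findings


# Made-up readings for a hypothetical two-phase board.
readings = [
    PhaseReading("phase 1", switch_to_input_ohms=4700.0, switch_to_ground_ohms=3900.0),
    PhaseReading("phase 2", switch_to_input_ohms=0.1, switch_to_ground_ohms=4100.0),
]
report = find_shorts(readings, core_rail_to_ground_ohms=2.3)
for line in report:
    print(line)
```

In this made-up case only phase 2's high-side FET is flagged, so that is the part to pull first.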