Bitcoin Forum
Topic: Can a bad pcie riser fry a gpu?
wedge (OP)
August 15, 2013, 02:56:58 PM
Last edit: August 15, 2013, 07:38:15 PM by wedge
 #1

I've had some unfortunate luck lately.

I've blown up three 7950s in the past 2 weeks.  They all suffered the exact same failure.  The whole rig would shut off for no apparent reason.  And the instant I turned it back on, one of the VRM resistors would go out with a huge spark.


The cards are all the same: Gigabyte GV-R795WF3-3GD rev 2.0.

I originally set up two identical rigs, with all the exact same hardware, same boards, same everything.  They had 4 GPUs each.  One of them is now down to a single GPU; the other is still working perfectly on all 4.  All 4 cards in each rig are on powered risers, but I've only been plugging a power cable into 2 out of the 4.

Temperatures are well within reason, I keep the temp in the low 70's.  Overclock is not very high, voltage is running at 1.09v.


At first I was thinking it was just a bad card, or maybe temps suddenly jumped, or power spike caused it... I wasn't sure, and didn't have much to go on to figure out the cause.
In statistics they say that 1 point on a graph is a dot, 2 points is a straight line.  It takes 3 points to see a trend.  It sucks I had to lose 3 brand new cards to figure this out, but I may have just figured it out.  The riser!  Each of the blown cards had that in common...


I've tried swapping motherboards and power supplies.  It obviously happens to a different GPU each time.  But this morning, when the latest card blew, I had a sudden realization.  It's been happening to the card installed on the SAME RISER each time!
Now, I'm not 100% sure, because I haven't really been keeping track of that.  But I haven't been moving the risers around, and what I realized is that the card that blows is in the same physical location each time.  So I'm 90% sure they were all using the same riser at the time they blew.

I've now replaced that riser with a new spare, and I've put a different gpu in the same spot.  So we'll see what happens there in time.

I removed the blue tape from that riser to take a look at the soldering, and I do indeed see some poor quality work.  It's good enough to "work", but is far from ideal.  I'll post some pics later.



I haven't found out yet whether warranty is going to cover these cards.  They are only about 30 days old.  The store they were bought from only has a 15-day in-store replacement, so they have to get sent back to Gigabyte.  The dude at the store says that since there is a burn mark on the card, they might not be covered.  Does anyone have any experience with this?  What's Gigabyte's warranty service like?  Do they generally replace these?  What sort of questions will they ask me, and how should I best answer?
I would hate to have to eat the cost of three expensive new cards.

wedge (OP)
August 15, 2013, 04:10:43 PM
 #2

Update:
Just checked the other end of the card, and did find one solder joint broken off.

I checked a pci-e pinout diagram and found that it is just a ground wire.  Given the large number of ground connections in the connector, whether a single missing ground wire poses a problem would depend on the design of the gpu and the motherboard.  Best case: it would have no impact at all.  Worst case: one or more components might heat up, or act strangely due to the missing or insufficient grounding.

Even if that broken ground is not a problem in itself, it is still an indication of the poor soldering quality I noticed elsewhere.  When I push that wire to the side, I can see the size of the contact patch between the wire and the board was about the size of the tip of a pin.  I can see many bad or weak connections that look like they all have the same problem, even though they haven't broken off yet.

Here are the pics.  It's the best picture quality I could manage with my phone.  They don't show the problems clearly, but you can mostly see what I'm talking about.  That's as close up as I could get where the camera could still focus.

Here you can see the broken ground:


Here you can see several of the wires are not making direct contact with the card and are just connected by a solder bridge; you can see it on pin 11.
Some others don't have enough solder.  Look at pins 1 and 3 at the top.  They have a poor connection, and could easily break loose too.


Take a look at pin 1 here.  It's making ZERO contact with the board.  It's not even aligned properly.



blasthash
August 22, 2013, 05:38:14 AM
 #3

It's hard to tell, because aside from the third-world solder job on the card connects, the cards appear intact. However, with that poor of a job, it is entirely possible the GPU started pulling more current than that joint could handle and you arced over. That's a complex electrical scenario and hard to analyze, but an uncontrolled 10-11A current flashover that close to your GPU isn't welcome. I doubt that occurred, for the sole reason that the riser itself looks, again, intact.

The photo of the slot side, notch down - the connections here look especially suspect, but without a photo of the card and fry site in question it is hard to tell.
pr0d1gy
August 22, 2013, 10:06:47 AM
 #4

I'd almost blame the GPU... I've had mine fail in a fireball...

I also got reports from one of my friends who burned a Gigabyte, as well as from others on the forum.

Mine was after 2 months of use, and that was without powered risers. If you lost 3 in two weeks, with powered ones... wow.
But that's just me...




Set Escrow
¯\_(ツ)_/¯
blasthash
August 22, 2013, 10:26:29 AM
 #5

Quote from: pr0d1gy on August 22, 2013, 10:06:47 AM
I'd almost blame the GPU... I've had mine fail in a fireball...

I also got reports from one of my friends who burned a Gigabyte, as well as from others on the forum.

Mine was after 2 months of use, and that was without powered risers. If you lost 3 in two weeks, with powered ones... wow.
But that's just me...



That fits. Shoddy regulators and bad voltage connections on the part of the manufacturer.
Could also be lesser-grade components to lower costs. Usually VR trimming or setting resistors are designed to handle a good bit of transient overcurrent, or at least, they should be.

But three in a row in the exact same area and manner suggests something different.

@OP - I'd consider pitching that to the company and trying to leverage it for some new GPUs. Even if it isn't covered by warranty per se, a failure like that is shitty performance for a manufacturer. That's what we in first-world countries like to call a defect. I'd probably also either a) get a new riser, or b) do some rework on that one. Fusing the power connection with a fuse rated for the draw of the card might not hurt either.
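
On sizing that fuse, here is a rough sketch of the arithmetic, not a vetted design. It assumes the riser's power cable feeds only the slot's 12 V pins, takes the PCIe slot limit of 5.5 A on 12 V (about 66 W) as the load, and applies a 1.25x headroom factor, which is only a common rule of thumb. The function names and the list of blade-fuse ratings are illustrative.

```python
# Common blade-fuse ratings in amps (illustrative list, not exhaustive).
STANDARD_FUSES_A = (1, 2, 3, 4, 5, 7.5, 10, 15, 20, 25, 30)


def rail_current_amps(power_watts: float, rail_volts: float = 12.0) -> float:
    """Current drawn from a supply rail, from I = P / V."""
    if rail_volts <= 0:
        raise ValueError("rail voltage must be positive")
    return power_watts / rail_volts


def pick_fuse(load_amps: float, headroom: float = 1.25) -> float:
    """Smallest listed fuse rating at or above load_amps * headroom.

    The 1.25 default is an assumed rule-of-thumb margin, not a spec value.
    """
    target = load_amps * headroom
    for rating in STANDARD_FUSES_A:
        if rating >= target:
            return rating
    raise ValueError("load exceeds the largest fuse in the list")


# Assumed load: the full 12 V slot budget, 66 W.
slot_amps = rail_current_amps(66.0)
print(slot_amps, pick_fuse(slot_amps))
```

If the riser cable also carries anything beyond the slot pins, the load figure would need to be measured rather than assumed.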
wedge (OP)
August 31, 2013, 07:11:34 PM
 #6

Quote from: blasthash on August 22, 2013, 05:38:14 AM
It's hard to tell, because aside from the third-world solder job on the card connects, the cards appear intact. However, with that poor of a job, it is entirely possible the GPU started pulling more current than that joint could handle and you arced over. That's a complex electrical scenario and hard to analyze, but an uncontrolled 10-11A current flashover that close to your GPU isn't welcome. I doubt that occurred, for the sole reason that the riser itself looks, again, intact.

The photo of the slot side, notch down - the connections here look especially suspect, but without a photo of the card and fry site in question it is hard to tell.

Yeah, other than the soldering itself, the risers themselves didn't fail.  Only the GPU.  What I want to know is whether the bad soldering and faulty/intermittent connections can cause the GPU to fry.
If so, then I will certainly be more selective about the risers I use in the future.  And will personally inspect each of them in detail prior to use.

All 3 gpu's failed in the exact same way.  There is no visible damage on the front side of the board, only on the back.  The actual component that blew up is one of the resistors connected to the vrm, not the vrm itself.  
Two have already been shipped for warranty, here's a pic of the one I still have:




To reiterate, the failure occurred like this:
1. I would find the entire pc unexpectedly powered off. 
2. I switch the pc off/on at the power supply. 
3. I turn on the pc at the motherboard. 
4. Resistor on back of gpu instantly blows with a big spark.

It seems to me like something internal failed on the VRM, which caused an uncontrolled or abnormally high current draw through one of the VRMs.  The resistor just happened to be the weakest link in the chain, so it blew first.

pr0d1gy
August 31, 2013, 09:08:54 PM
 #7

Bad Gigabyte, bad!

I just went to the store, bought a second one, then came back a few hours later with the burned one, saying "This is what you're selling?"

Got my money back and a new GPU. I wouldn't buy Giga anymore.

wedge (OP)
September 01, 2013, 12:56:15 AM
 #8

Quote from: pr0d1gy on August 31, 2013, 09:08:54 PM
Bad Gigabyte, bad!

I just went to the store, bought a second one, then came back a few hours later with the burned one, saying "This is what you're selling?"

Got my money back and a new GPU. I wouldn't buy Giga anymore.

With the exception of the weak VRM circuit, I think this is a fantastic card for mining.  I get an excellent hashrate out of these cards, with just a moderate overclock, and undervolted to reduce power consumption.  The cooling unit is the best I've seen for open-air usage.  It is the only GPU I've seen that expels the hot exhaust air along the top edge of the card instead of out the front and back.  This drastically reduces the airflow restriction, so the fans don't have to work as hard to maintain a good temperature.  All of which makes this the quietest 7900-series card I've seen so far.
Edit: having 3 fans instead of 2 probably also helps in this regard.

I've tried all the non-reference designs from MSI, Sapphire, XFX, and Asus.  With all the same settings, same overclock, fans set to keep the card 70 - 72 degrees.  The Gigabyte is BY FAR the quietest.  Next best is the triple-slot Asus card, but that obviously takes up more space, and is a more expensive card.  The MSI was ridiculously loud.  Sapphire and XFX are somewhere in between.


That's all off topic though.  I'm just saying that in spite of the issues, I still like this card very much and would recommend it over any other non-reference card for open-air use.

pr0d1gy
September 01, 2013, 01:26:21 AM
 #9

That's the weird thing. I can have my MSIs right next to each other and the Giga far from them, and even with 3 fans it runs almost 10 degrees hotter, while the MSIs are cooler. I'm no expert at messing with the voltage and clocks, so I guess I just had bad luck with them. I have an open-air rig as well, inside a milk crate, with all the plastic cut out for ventilation. As for the quiet part, something always made the middle fan sound like it was hitting something, and I could never find out what caused it. This was only after 2 months of use before it blew up. Just a bad experience overall. The mining was always good regardless, lol.

therlman
May 08, 2014, 05:19:32 PM
 #10

Hey Wedge

I've just had the same thing happen to me with 2 cards in SLI which had been working perfectly for almost a year. During a (light weight) game, not even 10 minutes into it, my PC went off completely.
From the way it turned off I thought there had been a power surge, but there hadn't, as the lights were still on. When I tried to reboot, nothing happened: the fans tried to spin up but the PC stayed off. I removed the first card and took a look at it, but couldn't see anything physically wrong with it, nor did I smell anything. Then I put this card back in and took the 2nd card out. I tried to boot again, and this time the fans started spinning and everything seemed normal, until I smelled burning, at which point I turned the PC off straight away. I checked the card and still couldn't see anything.

It was late, so I decided to check the card in the morning and put in the 2nd one to attempt to play just 1 more game... and then it happened again to this card, at almost the same point in the game as well (about 10 minutes in).
In the morning I finally saw where the damage was: on the top of the card, in a similar location to your pictures, were scorched black marks.
My friend let me use his PSU voltage test tool, and it showed that the PSU was giving the correct readings. I also briefly ran just the CPU and MOBO (for about 5 minutes) before turning it off again.

Now I am using a second PSU from another friend while I try to confirm what the fault is. Currently just my CPU and MOBO have been running, in the same way I regularly use them: almost the whole day while I work, and sometimes in the evening I play the same (light weight) game that killed both my GPUs. Everything works fine. I have tested a really crappy low-power GPU in all the PCI-e slots and confirmed they all work, but I have not played any games on it (I'm using integrated graphics).

I have applied for RMA, but I'm uncertain whether I will get any compensation for the GPUs, as the physical damage is obvious when you look carefully at the cards.

Did you ever find out what caused this problem? Was it your motherboard, PSU or bad (and really unlucky) Graphics cards?

Regards

Tiz