Bitcoin Forum
Author Topic: I think one of my GPUs has the dreaded electromigration.  (Read 3291 times)
cuz0882
Sr. Member
****
Offline Offline

Activity: 392
Merit: 250


View Profile
March 07, 2012, 06:34:50 AM
 #21

I have 2 like that now, they really cause problems if you plug the monitor in them. One of mine crashes if I try running cgminer off it when the card is not hot. If I run guiminer for 10 mins then switch over to cgminer it works fine.
John (John K.)
Global Troll-buster and
Legendary
*
Offline Offline

Activity: 1288
Merit: 1225


Away on an extended break


View Profile
March 07, 2012, 07:30:16 AM
 #22

From what I hear, 5970s are being replaced by 6990s when there is a warranty claim.

Oh god. :(

At this point I am not sure I can even warranty it. It runs perfectly stable at stock clock and I doubt any warranty guarantees stability above stock clock.
Oh no. I just RMA'd a 5870 and a 5850 this week - would they give me a 6xxx card instead? :'(
cuz0882
Sr. Member
****
Offline Offline

Activity: 392
Merit: 250


View Profile
March 07, 2012, 08:19:01 AM
 #23

From what I hear, 5970s are being replaced by 6990s when there is a warranty claim.

Oh god. :(

At this point I am not sure I can even warranty it. It runs perfectly stable at stock clock and I doubt any warranty guarantees stability above stock clock.
Oh no. I just RMA'd a 5870 and a 5850 this week - would they give me a 6xxx card instead? :'(
Let's wait until they only have 7990s.
John (John K.)
Global Troll-buster and
Legendary
*
Offline Offline

Activity: 1288
Merit: 1225


Away on an extended break


View Profile
March 07, 2012, 08:23:06 AM
 #24

*notes to self*
PatrickHarnett
Hero Member
*****
Offline Offline

Activity: 518
Merit: 500



View Profile
March 07, 2012, 08:32:57 AM
 #25

One of my 5970s has a similar problem. It won't mine at 725 MHz/150 MHz mem/1.05 V without crashing, but it runs fine at 690 MHz/150 MHz mem/0.95 V. I always assumed it was a fried VRM because the GPU temps are fine.

Hmm, it could be a bad VRM. I can't check VRM temps because it is in Linux. I may drop the card in my Windows workstation to see what the VRMs are reading. If one of the 3 VRMs is missing in GPU-Z, that is a good (well, bad but clear) sign.

A VRM problem is more likely. I have had several types of failures on 5970s. If the card starts more easily when cool (as opposed to when hot), runs hot with no apparent load, or freezes when load is applied (especially this one), those are possible symptoms of VRM failure. I'm assuming you haven't put it in a primary slot to look for RAM failure (artefacts).

My cards were running 24/7 long before mining bitcoin, but not usually with over-clocks, and that helped keep temperatures sensible.  The only RMA I've managed with a 5970 got me a 6970 in return - better than nothing.
BlackPrapor
Hero Member
*****
Offline Offline

Activity: 626
Merit: 504



View Profile WWW
March 07, 2012, 12:57:19 PM
 #26

I had the same issues with an old AMD Duron CPU. I simply repasted the thermal interface and it was running well again. Could be the same issue, but that's just a guess. Is it possible that electromigration can affect FPGA/ASIC chips as well? Is there a temperature that would guarantee EM-free 24/7 work?

There is no place like 127.0.0.1
In blockchain we trust
DeathAndTaxes (OP)
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
March 07, 2012, 01:30:28 PM
 #27

My cards were running 24/7 long before mining bitcoin, but not usually with over-clocks, and that helped keep temperatures sensible.  The only RMA I've managed with a 5970 got me a 6970 in return - better than nothing.

I hope you mean 6990.
DeathAndTaxes (OP)
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
March 07, 2012, 01:32:03 PM
 #28

I had the same issues with an old AMD Duron CPU. I simply repasted the thermal interface and it was running well again. Could be the same issue, but that's just a guess. Is it possible that electromigration can affect FPGA/ASIC chips as well? Is there a temperature that would guarantee EM-free 24/7 work?

Electromigration affects all silicon chips. Every chip ever produced will eventually be destroyed by electromigration. Lower current and lower temps could potentially push that failure decades beyond the chip's economic lifespan, but electromigration is the wear and tear of a chip's metal interconnects and is both unavoidable and irreversible.
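
For reference, the dependence on current and temperature described above is usually summarized with Black's equation, an empirical model for electromigration lifetime. It is a textbook relation and not something posted in this thread, and the constants are process-specific, so it gives no universal "safe" temperature:

```latex
\mathrm{MTTF} = \frac{A}{J^{n}} \exp\!\left(\frac{E_a}{k\,T}\right)
```

Here MTTF is the mean time to failure, A is a constant that depends on the interconnect, J is the current density, n is a model exponent (commonly quoted as between 1 and 2), E_a is the activation energy, k is Boltzmann's constant, and T is the absolute temperature. Lower J or lower T raises the MTTF, but it stays finite, which fits the point that electromigration can be slowed but not avoided.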
PatrickHarnett
Hero Member
*****
Offline Offline

Activity: 518
Merit: 500



View Profile
March 07, 2012, 05:43:49 PM
 #29

My cards were running 24/7 long before mining bitcoin, but not usually with over-clocks, and that helped keep temperatures sensible.  The only RMA I've managed with a 5970 got me a 6970 in return - better than nothing.

I hope you mean 6990.

No.  It's just a single GPU card.

As for the thermal paste suggestions, I've never had a problem on that front.  I also have a friend in Germany who knows some info on the electro-migration issue, and for the volume of material that needs shifting, even for micro circuitry, his view was it is unlikely (he is another long time 24/7 gpu user).
rjk
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250


1ngldh


View Profile
March 07, 2012, 05:45:15 PM
 #30

My cards were running 24/7 long before mining bitcoin, but not usually with over-clocks, and that helped keep temperatures sensible.  The only RMA I've managed with a 5970 got me a 6970 in return - better than nothing.

I hope you mean 6990.

No.  It's just a single GPU card.

As for the thermal paste suggestions, I've never had a problem on that front.  I also have a friend in Germany who knows some info on the electro-migration issue, and for the volume of material that needs shifting, even for micro circuitry, his view was it is unlikely (he is another long time 24/7 gpu user).
The 5970 is dual GPU, and so is the 6990. If you got a single GPU card in return for a dual GPU card, then you got gyped.

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
PatrickHarnett
Hero Member
*****
Offline Offline

Activity: 518
Merit: 500



View Profile
March 07, 2012, 06:28:03 PM
 #31


The 5970 is dual GPU, and so is the 6990. If you got a single GPU card in return for a dual GPU card, then you got gyped.

Yes. But neither the 5970s nor the 6990s had much availability, and I don't think that particular supplier could get any (or didn't want to spend over $1000 to find a replacement).

The 5970 was second hand, and not a bad price and having the purchase documentation was a bonus (allowing the RMA in the first place).  I don't particularly like the 6970 (noisy and only single gpu). 
johnyj
Legendary
*
Offline Offline

Activity: 1988
Merit: 1012


Beyond Imagination


View Profile
March 07, 2012, 09:11:20 PM
 #32

I RMA'd a 5970 a few weeks ago, and since it is out of stock now, I only got my money back, so I have to hunt for another 5970.

One of my 5970s was never able to run stable above 760 MHz, but the cause turned out to be a bad PCIe extender cable. I re-soldered one of the wires and now it runs stable at 800+ MHz.

Another card, a 5870, died 30 seconds into mining because I had installed too thick a heat pad on the memory, which kept the GPU from contacting the cooler surface evenly. Replacing the heat pad solved the problem.

DeathAndTaxes (OP)
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
March 07, 2012, 09:15:14 PM
 #33

The 5970 was second hand, and not a bad price and having the purchase documentation was a bonus (allowing the RMA in the first place).  I don't particularly like the 6970 (noisy and only single gpu).  

Regardless of the price you paid, they actually "replaced" a 5970 with a 6970?

I mean by any metric you look at they robbed you:

5970
4.6 TFLOPS
$600 launch price
3200 shaders @ 725 MHz
eBay value today: ~$400

6970
2.7 TFLOPS (41% less)
$370 launch price (38% less)
1536 shaders @ 880 MHz (42% less shader throughput, i.e. shaders x clock)
eBay value today: ~$200 (50% less)
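
The percentages above can be checked with a few lines of Python. This is only a sketch: the numbers are copied from the post, while the dictionary layout and the `percent_less` helper are made up for illustration. The 42% shader figure appears to compare shaders multiplied by clock; counting shaders alone gives 52% less.

```python
# Figures copied from the post above; the structure and names are illustrative only.
cards = {
    "5970": {"compute": 4.6, "launch_usd": 600, "shaders": 3200, "clock_mhz": 725, "ebay_usd": 400},
    "6970": {"compute": 2.7, "launch_usd": 370, "shaders": 1536, "clock_mhz": 880, "ebay_usd": 200},
}


def percent_less(old, new):
    """Return how much smaller `new` is than `old`, as a whole-number percentage."""
    return round((old - new) / old * 100)


old, new = cards["5970"], cards["6970"]

comparison = {
    "compute": percent_less(old["compute"], new["compute"]),
    "launch_price": percent_less(old["launch_usd"], new["launch_usd"]),
    # Shader throughput approximated as shader count times core clock.
    "shader_throughput": percent_less(
        old["shaders"] * old["clock_mhz"], new["shaders"] * new["clock_mhz"]
    ),
    # Raw shader count, ignoring the 6970's higher clock.
    "shader_count": percent_less(old["shaders"], new["shaders"]),
    "ebay_value": percent_less(old["ebay_usd"], new["ebay_usd"]),
}

for metric, pct in comparison.items():
    print(f"{metric}: {pct}% less")
```
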
DeathAndTaxes (OP)
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
March 07, 2012, 09:16:19 PM
 #34

An update. Running it at 750/240, it has been stable for almost 72 hours now. Still not sure if it is EM or just maybe 1 of 3 VRMs is blown. Also not sure if it can do more than 750. I want to run it like this until either it crashes again or it lasts a week.
PatrickHarnett
Hero Member
*****
Offline Offline

Activity: 518
Merit: 500



View Profile
March 08, 2012, 12:54:06 AM
 #35

An update. Running it at 750/240, it has been stable for almost 72 hours now. Still not sure if it is EM or just maybe 1 of 3 VRMs is blown. Also not sure if it can do more than 750. I want to run it like this until either it crashes again or it lasts a week.

Don't turn it off. The first 5970 I had fail was an XFX Black Edition (fancy name, still just a bit of kit). It took four months to die (i.e. not turn on when cool), but even when very stuffed, it would drive a screen. I recycled the fan and heatsinks. :)

As for the RMA - I paid about 80% of new price, 2nd hand rates were still 80% of new up until a few months ago (now about 50%), and they could have denied the return because I wasn't the registered purchaser. Something is better than nothing.