Bitcoin Forum

Alternate cryptocurrencies => Mining (Altcoins) => Topic started by: monstrs on November 27, 2013, 06:45:11 AM



Title: What may cause this ?
Post by: monstrs on November 27, 2013, 06:45:11 AM
I have one remotely controlled rig with 2x reference 7970 cards + a 7950 reference design card.
The thing is now acting very strange. It had several restarts in the last day, and the last time I saw 511 degrees reported by cgminer. Is that normal, and what can cause that kind of behavior?
I'm using one unpowered PCI-Express 16x-16x riser and 2x powered PCI-Express 1x-16x risers.


Title: Re: What may cause this ?
Post by: MRKLYE on November 27, 2013, 06:46:42 AM
If that temp gauge is remotely accurate and the card is still somehow working.. the card is obviously a wizard.

KLYE


Title: Re: What may cause this ?
Post by: Kluge on November 27, 2013, 06:51:11 AM
They worked fine for a long while before?

Can probably ignore the 511 degree reading. I'd guess it was an electrical blip which caused it, which is what I'd guess is causing your other issue, but I'm going to assume the cards ran fine for a good while (I'd guess it's the rail supplying 6-pin molex connectors which is insufficient). Would probably be worth reseating them all.

Are temps normally <90 degrees Celsius on both of them? Assuming electromigration isn't a factor, reseating doesn't help, and the power-supply isn't at fault, I also agree your card is a wizard. Have you tried counteracting its flame breath spell? Think carefully, because you don't want to end up with a bunch of water in your components.


Title: Re: What may cause this ?
Post by: monstrs on November 27, 2013, 07:12:55 AM
They ran fine for a month. Two days ago I had to go there and physically restart it, and it ran for a while. Today it restarted again, and when I launched MSI Afterburner it showed that no card was available, although there should be 3. They ran at 72-76 degrees before, pretty damn good for reference; VRM was at 76-86 degrees. I used this setup to warm that apartment, since there's no other source of heat there. So far, so good, but now this. The 511 degrees was reported on all 3 cards; fans were spinning at 100%.


Title: Re: What may cause this ?
Post by: sveetsnelda on November 27, 2013, 07:51:22 AM
The ADL SDK will report 511 degrees or -127 (IIRC) when the cards are somehow reset while the PC is running.  When this happens to me, it's usually a loose/bad PCI-E riser.  In your case (since all of the cards showed this), I'm sure it was just a significant power fluctuation (probably a quick voltage drop from a brown-out).
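
The rule of thumb in the post above (one card showing the bogus reading points at a riser, all cards at once points at shared power) can be sketched in a few lines. This is only an illustration: the function names are made up, the readings are assumed to be a plain list of integers already collected from the miner, and the -127 value was given from memory in the post, so it is not confirmed ADL behavior.

```python
# Sentinel readings taken from the post above. -127 was quoted "IIRC",
# so both values are forum-reported, not documented API behavior.
SENTINEL_TEMPS_C = {511, -127}


def find_reset_cards(temps_c):
    """Return the indices of cards whose reported temperature is a sentinel value."""
    return [i for i, t in enumerate(temps_c) if t in SENTINEL_TEMPS_C]


def likely_cause(temps_c):
    """Apply the post's heuristic: one card -> riser, every card -> shared power."""
    bad = find_reset_cards(temps_c)
    if not bad:
        return "no sentinel readings"
    # Only call it a shared-power problem when several cards all dropped together.
    if len(temps_c) > 1 and len(bad) == len(temps_c):
        return "all cards reset: suspect shared power (PSU, brown-out)"
    return f"cards {bad} reset: suspect a loose or bad riser"


print(likely_cause([72, 511, 76]))
print(likely_cause([511, 511, 511]))
```

This is a first guess for where to start looking, not a diagnosis; a loose PCI-E power cable would look the same as a bad riser here.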


Title: Re: What may cause this ?
Post by: monstrs on November 27, 2013, 09:46:03 AM
Thanks, that may be it.


Title: Re: What may cause this ?
Post by: monstrs on November 27, 2013, 04:50:39 PM
Still got this problem.
Lowered core clocks on all cards; after a while the rig is not connected :(


Title: Re: What may cause this ?
Post by: monstrs on November 29, 2013, 05:02:33 PM
Still got the problem.
Now when the PC starts with one card, all is good; with 3 cards, it is not seen on the network (remotely turned on). When 2 cards are connected, only one of them is seen. I changed the riser, and either one card is seen or nothing at all. Could it really be a riser problem? Do they fail so fast (2 weeks)?


Title: Re: What may cause this ?
Post by: sveetsnelda on November 29, 2013, 06:48:36 PM
It really sounds like a failing power supply.  While I mentioned that risers can cause the issue (bad riser, loose riser, etc), a loose PCI-e power cable can cause the same problem as well (and I'm not saying that it *is* a loose PCI-e power cable).

Since you mentioned that all 3 cards died at the same time once, you're pretty much down to a bad power supply, or a bad 12v on your 24-pin ATX connector (make sure it's not hot/melting).  Somehow/somewhere, you're randomly losing 12v to your video cards.


Title: Re: What may cause this ?
Post by: LouReed on December 03, 2013, 04:30:54 AM
My guess is you have a bad card. Try running each one individually and see if you have trouble with one of them.


Title: Re: What may cause this ?
Post by: dima1236 on January 04, 2014, 12:47:43 AM
Hi,
I got the same issue with the new risers I got. The system runs fine with 3x 280X cards, but when I connect them via risers it works for a few minutes, then freezes, and 1 or 2 out of 3 cards show a temp of 511.

My config is :

PSU : Zalman 1250W platinum
GPU : 3x 280x dual-x sapphire
Mobo : GA-Z87X-UD5H

I tried switching in the mobo and PSU from another rig and the problem continues; I took 3 other new risers and the problem continues...

Did you find a solution to this weird problem?


Title: Re: What may cause this ?
Post by: dannyst225 on January 23, 2014, 03:14:12 PM
I also got this issue.
I'm running 2x 1500 watt PSUs on 4x HD 7990 ATI cards.
The first time I booted the system it ran stable for two to three weeks and everything worked fine.
Then one day I came back because my hashrate had dropped to zero and checked the rig: cgminer reported a temperature of 511 degrees on GPU 1, and the RPM for GPUs 3 & 4 was zero. So I checked GPUs 3 & 4 and the fans were running like they always do.

Now every time I boot my system it runs stable for 10-30 mins and then the same thing happens: GPU 1 reads 511 degrees and the fan speed of GPUs 3 & 4 is zero. Then Windows freezes and my rig shows a completely black screen and doesn't do anything anymore, so I need to reboot the system. Then everything works fine until I start cgminer, and after 10-30 mins the same thing happens.

Does anybody know what this is?


Title: Re: What may cause this ?
Post by: sgrimmett on February 21, 2014, 01:55:18 PM
Hi - did you solve your problem?  I am experiencing this problem for the first time on a new rig build...  seems to be always GPU 4 that flips to 511 and crashes the whole system...  going to check risers, card, molex, PSU, again, but so far, they all appear to be in fine working order.  Strange.   Will try removing GPU 4 and see if anything else funny pops up..., or maybe I'll try it without a riser, right on the mb..

Any ideas welcome.


Title: Re: What may cause this ?
Post by: sgrimmett on February 21, 2014, 04:07:17 PM
Simple solution it seems for me...  fresh usb stick, fresh BAMT install - now running stable for a few hours...  going to keep watching this thing, though to see if this is just random, or a fix.


Title: Re: What may cause this ?
Post by: Equate on February 21, 2014, 07:24:23 PM
It's a riser problem; replace the riser and try again.


Title: Re: What may cause this ?
Post by: -ck on February 21, 2014, 10:12:34 PM
511 is an overflow: the driver is returning a response of -1, saying it doesn't know, and it comes out as 511 (512-1).
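
To make the "512-1" arithmetic above concrete, here is a small sketch. It assumes the temperature ends up in a 9-bit unsigned field (2**9 = 512), which is what the numbers suggest; the actual driver and cgminer internals are not shown in this thread, and the function name is only illustrative.

```python
# Assumption: the reading is stored in a 9-bit unsigned field, since 2**9 = 512.
FIELD_BITS = 9


def as_unsigned_field(value, bits=FIELD_BITS):
    """Reinterpret a signed integer as an unsigned field of the given bit width."""
    return value & ((1 << bits) - 1)


print(as_unsigned_field(-1))  # an "unknown" response of -1 wraps around to 511
print(as_unsigned_field(72))  # a normal reading passes through unchanged as 72
```

So under this assumption a 511 reading is the driver's "don't know" value showing up in disguise, not a real temperature.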


Title: Re: What may cause this ?
Post by: mattbigblue on February 28, 2014, 10:40:54 AM
I get the same problem on both of my rigs randomly every few hours/days. I found out it's electricity. I also found out that once it happens, it will keep happening... >:(    The first time, it happened on both rigs at the same time when my girlfriend switched on an iron on the same breaker (16 amp). Now any bigger power fluctuation in my house creates this effect. Tomorrow I'm moving one rig to a designated area where I have a 20 amp line ready for it, and I will leave the other rig in my house, connected to the washing machine's electric line (of course I won't use the washing machine on this line anymore, but I hope it will provide stable power for 5x R9 290).

cheers,
Matt


Title: Re: What may cause this ?
Post by: polanskiman on July 09, 2014, 11:23:40 AM
511 is an overflow, the driver is returning a response of -1 saying it doesn't know and it comes out as 511 (512-1)

And what would be the solution to this? I am having the same problem on one of my cards since yesterday. It crashes the whole system and I need to do a hard reset.

511 Degrees - 0RPM


Title: Re: What may cause this ?
Post by: -ck on July 09, 2014, 11:55:15 AM
It's a crash, so don't do things that make your graphic card crash... buy better hardware, don't overdrive it, don't overheat it, get your graphics card manufacturer to make better drivers...


Title: Re: What may cause this ?
Post by: polanskiman on July 09, 2014, 12:41:18 PM
Obviously the reasons can be multiple. By hardware I guess you meant risers and accessories? The main hardware is pretty much standard (280xs, mobo gigabyte etc etc). As for overdriving, well all is stock. Heat, well cards are running at 70-71C. Perhaps the drivers would be my last resort by reinstalling, but I can only rely on those who produce it...AMD...

Could the crash also simply be a defective GPU?


Title: Re: What may cause this ?
Post by: vm1990 on July 10, 2014, 09:30:12 AM
I'd test each card on its own, one at a time, and mine with it for at least 30 minutes. This will show whether there's a weak or defective card. If it finds nothing and all the cards check out, it's more than likely a bad PSU or failing PCI-E adapters on the motherboard.


Title: Re: What may cause this ?
Post by: polanskiman on July 11, 2014, 02:28:15 PM
Thanks. It really seems that it's the GPU that has a problem. The rig kept crashing quite often. I took that GPU out and all seems to be working fine now. I tried plugging it back several times and problems kept happening again and again, and often even at boot time. Tried changing the riser too, to no avail.


Title: Re: What may cause this ?
Post by: wunkbone on July 11, 2014, 02:33:18 PM
Try different settings for the GPU that hangs; sometimes a lower clock helps.


Title: Re: What may cause this ?
Post by: polanskiman on July 12, 2014, 01:54:05 AM
I had tried that before. I even brought the GPU to its default values. Those hangs seemed more of a hardware problem than anything else as the whole rig froze each time and the only way to recover was by doing a hard reset.

The rig has now been running continuously without that particular GPU for the past 12 hours without hangs.


Title: Re: What may cause this ?
Post by: cryptom00n5 on January 17, 2018, 11:36:43 AM
Anyone find a solution to this problem?