Bitcoin Forum
December 17, 2017, 10:35:50 PM *
Topic: What may cause this?
monstrs
Hero Member
*****
Offline Offline

Activity: 493



View Profile
November 27, 2013, 06:45:11 AM
 #1

I have one remotely controlled rig with 2x reference 7970 cards and one reference-design 7950.
It has started acting very strange: it restarted several times in the last day, and the last time I saw 511 degrees reported by cgminer. Is that normal, and what can cause this kind of behavior?
I'm using one unpowered PCI-Express x16-to-x16 riser and 2x powered PCI-Express x1-to-x16 risers.

MRKLYE
Legendary
*
Offline Offline

Activity: 1260


Designer - Developer


View Profile WWW
November 27, 2013, 06:46:42 AM
 #2

If that temp gauge is remotely accurate and the card is still somehow working.. the card is obviously a wizard.

KLYE
Kluge
Donator
Legendary
*
Offline Offline

Activity: 1218


Michael, send me some coins before I hitman you


View Profile
November 27, 2013, 06:51:11 AM
 #3

They worked fine for a long while before?

Can probably ignore the 511 degree reading. I'd guess it was an electrical blip which caused it, which is what I'd guess is causing your other issue, but I'm going to assume the cards ran fine for a good while (I'd guess it's the rail supplying 6-pin molex connectors which is insufficient). Would probably be worth reseating them all.

Are temps normally <90 degrees Celsius on both of them? Assuming electromigration isn't a factor, reseating doesn't help, and the power-supply isn't at fault, I also agree your card is a wizard. Have you tried counteracting its flame breath spell? Think carefully, because you don't want to end up with a bunch of water in your components.

Don't mix your coins someone said isn't legal
monstrs
Hero Member
*****
Offline Offline

Activity: 493



View Profile
November 27, 2013, 07:12:55 AM
 #4

They ran fine for a month. Two days ago I had to go there and physically restart the rig, and it ran for a while. Today it restarted again, and when I launched MSI Afterburner it showed that no card was available, although there should be 3. The cards ran at 72-76 degrees before, pretty damn good for reference cards; the VRMs were at 76-86 degrees. I used this setup to warm that apartment, since there is no other source of heat there. So far, so good, but now this. The 511 degrees was reported on all 3 cards, and the fans were spinning at 100%.

sveetsnelda
Hero Member
*****
Offline Offline

Activity: 620


View Profile
November 27, 2013, 07:51:22 AM
 #5

The ADL SDK will report 511 degrees or -127 (IIRC) when the cards are somehow reset while the PC is running.  When this happens to me, it's usually a loose/bad PCI-E riser.  In your case (since all of the cards showed this), I'm sure it was just a significant power fluctuation (probably a quick voltage drop from a brown-out).
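
A monitoring script can treat these readings as "sensor lost" rather than as real temperatures. The following is a minimal sketch in Python, not part of cgminer or the ADL SDK: the function name `classify_temp` and the 120-degree plausibility ceiling are assumptions made for illustration, and the two sentinel values are the ones recalled above (the -127 was given from memory).

```python
# Sentinel readings mentioned in this thread; -127 is quoted "IIRC",
# so treat the exact set as an assumption.
SENTINEL_TEMPS = {511, -127}


def classify_temp(reading_c, max_plausible_c=120):
    """Classify a reported GPU temperature in degrees Celsius.

    Returns "sensor-lost" for the known sentinel values, "implausible"
    for anything outside 0..max_plausible_c (an assumed ceiling), and
    "ok" otherwise.
    """
    if reading_c in SENTINEL_TEMPS:
        return "sensor-lost"
    if reading_c < 0 or reading_c > max_plausible_c:
        return "implausible"
    return "ok"


if __name__ == "__main__":
    for reading in (74, 511, -127, 300):
        print(reading, classify_temp(reading))
```

A "sensor-lost" result on every card at once points at something shared, such as power, while a single card flipping points at that card's riser or cabling.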

14u2rp4AqFtN5jkwK944nn741FnfF714m7
monstrs
Hero Member
*****
Offline Offline

Activity: 493



View Profile
November 27, 2013, 09:46:03 AM
 #6

Quote from: sveetsnelda on November 27, 2013, 07:51:22 AM
The ADL SDK will report 511 degrees or -127 (IIRC) when the cards are somehow reset while the PC is running. When this happens to me, it's usually a loose/bad PCI-E riser. In your case (since all of the cards showed this), I'm sure it was just a significant power fluctuation (probably a quick voltage drop from a brown-out).

Thanks, that may be it.

monstrs
Hero Member
*****
Offline Offline

Activity: 493



View Profile
November 27, 2013, 04:50:39 PM
 #7

Still got this problem.
Lowered the core clocks on all cards; after a while the rig is no longer connected. :(

monstrs
Hero Member
*****
Offline Offline

Activity: 493



View Profile
November 29, 2013, 05:02:33 PM
 #8

Still got the problem.
Now when the PC starts with one card, all is good; with 3 cards, it is not seen on the network (remotely turned on). When 2 cards are connected, only one of them is seen. I changed the riser, and either one card is seen or nothing at all. Could it really be a riser problem? Do they fail that fast (2 weeks)?

sveetsnelda
Hero Member
*****
Offline Offline

Activity: 620


View Profile
November 29, 2013, 06:48:36 PM
 #9

Quote from: monstrs on November 29, 2013, 05:02:33 PM
Still got the problem.
Now when the PC starts with one card, all is good; with 3 cards, it is not seen on the network (remotely turned on). When 2 cards are connected, only one of them is seen. I changed the riser, and either one card is seen or nothing at all. Could it really be a riser problem? Do they fail that fast (2 weeks)?

It really sounds like a failing power supply.  While I mentioned that risers can cause the issue (bad riser, loose riser, etc), a loose PCI-e power cable can cause the same problem as well (and I'm not saying that it *is* a loose PCI-e power cable).

Since you mentioned that all 3 cards died at the same time once, you're pretty much down to a bad power supply, or a bad 12v on your 24-pin ATX connector (make sure it's not hot/melting).  Somehow/somewhere, you're randomly losing 12v to your video cards.

14u2rp4AqFtN5jkwK944nn741FnfF714m7
LouReed
Hero Member
*****
Online Online

Activity: 731


Nosce te Ipsum


View Profile
December 03, 2013, 04:30:54 AM
 #10

My guess is you have a bad card. Try running each one individually and see if you have trouble with one of them.
dima1236
Newbie
*
Offline Offline

Activity: 26


View Profile
January 04, 2014, 12:47:43 AM
 #11

Hi,
I got the same issue with the new risers I bought. The system runs fine with 3x 280X cards plugged in directly, but when I connect them via risers it works for a few minutes, then freezes, and 1 or 2 of the 3 cards show a temperature of 511.

My config is :

PSU : Zalman 1250W platinum
GPU : 3x 280x dual-x sapphire
Mobo : GA-Z87X-UD5H

I tried swapping in the mobo and PSU from another rig and the problem continues. I tried 3 other new risers and the problem continues...

Did you find a solution to this weird problem?
dannyst225
Newbie
*
Offline Offline

Activity: 6


View Profile
January 23, 2014, 03:14:12 PM
 #12

I also got this issue.
I'm running 2x 1500 watt PSUs on 4x ATI HD 7990 cards.
The first time I booted the system it ran stable for two to three weeks and everything worked fine.
Then one day I came back because my hashrate had dropped to zero and checked the rig: cgminer reported a temperature of 511 degrees on GPU 1, and the RPM of GPUs 3 & 4 was zero. So I checked GPUs 3 & 4, and the fans were running like they always do.

Now every time I boot the system it runs stable for 10 - 30 mins and then the same thing happens: GPU 1 shows 511 degrees and the fan speed of GPUs 3 & 4 is zero. Then Windows freezes, the rig shows a completely black screen and doesn't do anything anymore, so I need to reboot. After that everything works fine until I start cgminer, and then after 10 - 30 mins the same thing happens.

Does anybody know what this is?
sgrimmett
Jr. Member
*
Offline Offline

Activity: 34


View Profile
February 21, 2014, 01:55:18 PM
 #13

Quote from: dannyst225 on January 23, 2014, 03:14:12 PM
I also got this issue.
I'm running 2x 1500 watt PSUs on 4x ATI HD 7990 cards.
The first time I booted the system it ran stable for two to three weeks and everything worked fine.
Then one day I came back because my hashrate had dropped to zero and checked the rig: cgminer reported a temperature of 511 degrees on GPU 1, and the RPM of GPUs 3 & 4 was zero. So I checked GPUs 3 & 4, and the fans were running like they always do.

Now every time I boot the system it runs stable for 10 - 30 mins and then the same thing happens: GPU 1 shows 511 degrees and the fan speed of GPUs 3 & 4 is zero. Then Windows freezes, the rig shows a completely black screen and doesn't do anything anymore, so I need to reboot. After that everything works fine until I start cgminer, and then after 10 - 30 mins the same thing happens.


Hi - did you solve your problem?  I am experiencing this problem for the first time on a new rig build...  seems to be always GPU 4 that flips to 511 and crashes the whole system...  going to check risers, card, molex, PSU, again, but so far, they all appear to be in fine working order.  Strange.   Will try removing GPU 4 and see if anything else funny pops up..., or maybe I'll try it without a riser, right on the mb..

Any ideas welcome.

50MH/s - scrypt
212GH/s - sha-256
sgrimmett
Jr. Member
*
Offline Offline

Activity: 34


View Profile
February 21, 2014, 04:07:17 PM
 #14

Simple solution it seems for me...  fresh usb stick, fresh BAMT install - now running stable for a few hours...  going to keep watching this thing, though to see if this is just random, or a fix.

50MH/s - scrypt
212GH/s - sha-256
Equate
Hero Member
*****
Offline Offline

Activity: 700


View Profile
February 21, 2014, 07:24:23 PM
 #15

It's a riser problem. Replace the riser and try again.
-ck
Staff
Legendary
*
Offline Offline

Activity: 2366


Ruu \o/


View Profile WWW
February 21, 2014, 10:12:34 PM
 #16

511 is an overflow, the driver is returning a response of -1 saying it doesn't know and it comes out as 511 (512-1)
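
The arithmetic can be checked in a few lines of Python. This is only a sketch of the wrap-around, assuming the reading passes through a 9-bit unsigned field; the 9-bit width is inferred from the "512-1" above and is not confirmed by any driver documentation, and the helper name `reinterpret_unsigned` is made up for illustration.

```python
def reinterpret_unsigned(value: int, bits: int) -> int:
    """Return `value` as it reads back from an unsigned field `bits` wide.

    Masking keeps only the low `bits` bits, which is how a negative
    two's-complement number turns into a large positive one.
    """
    return value & ((1 << bits) - 1)


if __name__ == "__main__":
    # An error code of -1 squeezed into 9 bits has all nine bits set.
    print(reinterpret_unsigned(-1, 9))
    # An ordinary temperature fits in the field and passes through unchanged.
    print(reinterpret_unsigned(72, 9))
```

Under that assumption, a displayed 511 carries no temperature information at all; it only says the driver could not answer.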

Primary developer/maintainer for cgminer and ckpool/ckproxy.
ZERO FEE Pooled mining at ckpool.org 1% Fee Solo mining at solo.ckpool.org
-ck
mattbigblue
Full Member
***
Offline Offline

Activity: 182


View Profile
February 28, 2014, 10:40:54 AM
 #17

I get the same problem on both of my rigs randomly every few hours/days. I found out it's the electricity. I also found out that once it happens, it keeps happening... >:( The first time, it happened on both rigs at the same moment, when my girlfriend switched on an iron on the same breaker (16 amp). Now any bigger power fluctuation in my house produces this effect. Tomorrow I'm moving one rig to a designated area where I have a 20 amp line ready for it, and I will leave the other rig in my house, connected to the washing machine's electrical line (of course I won't use the washing machine on this line anymore, but I hope it will provide stable power for 5x R9 290).

cheers,
Matt

polanskiman
Full Member
***
Offline Offline

Activity: 238


View Profile
July 09, 2014, 11:23:40 AM
 #18

Quote from: -ck on February 21, 2014, 10:12:34 PM
511 is an overflow, the driver is returning a response of -1 saying it doesn't know and it comes out as 511 (512-1)

And what would be the solution to this? I am having the same problem on one of my cards since yesterday. It crashes the whole system and I need to do a hard reset.

511 Degrees - 0RPM

BTC: 1JnH2HVoWnDubGmrXintsNWEaByGRjY8wL
DMD: dZuojpnwkmzqUegQaFk7ynDY8p7zAmG25H
XCN: CKnXnjXsVJzXgqEEqjsMRRc7ohkk6jHsb6
-ck
Staff
Legendary
*
Offline Offline

Activity: 2366


Ruu \o/


View Profile WWW
July 09, 2014, 11:55:15 AM
 #19

It's a crash, so don't do things that make your graphic card crash... buy better hardware, don't overdrive it, don't overheat it, get your graphics card manufacturer to make better drivers...

Primary developer/maintainer for cgminer and ckpool/ckproxy.
ZERO FEE Pooled mining at ckpool.org 1% Fee Solo mining at solo.ckpool.org
-ck
polanskiman
Full Member
***
Offline Offline

Activity: 238


View Profile
July 09, 2014, 12:41:18 PM
 #20

Quote from: -ck on July 09, 2014, 11:55:15 AM
It's a crash, so don't do things that make your graphic card crash... buy better hardware, don't overdrive it, don't overheat it, get your graphics card manufacturer to make better drivers...

Obviously there can be multiple reasons. By hardware, I guess you meant risers and accessories? The main hardware is pretty much standard (280Xs, Gigabyte mobo, etc.). As for overdriving, everything is at stock. As for heat, the cards are running at 70-71C. Reinstalling the drivers would be my last resort, but there I can only rely on those who produce them... AMD...

Could the crash also simply be a defective GPU?

BTC: 1JnH2HVoWnDubGmrXintsNWEaByGRjY8wL
DMD: dZuojpnwkmzqUegQaFk7ynDY8p7zAmG25H
XCN: CKnXnjXsVJzXgqEEqjsMRRc7ohkk6jHsb6