Bitcoin Forum
Author Topic: Mining rig randomly shuts off?  (Read 13128 times)
ghost
April 30, 2011, 02:42:03 AM
 #21

Single rail or multi rail power supply? If multi rail, make sure you spread the load across all the available PCIe rails.

This post helped me tremendously. I have dual 6990s and an HX1000 power supply. The 6990s have an overclock switch that raises the voltage and adds an extra 15% to my hash rates. It was causing the computer to freeze after a couple of minutes, and I figured it might be overloading the power supply. Sure enough, I didn't have the two cards spread evenly between the rails (in the case of the HX1000, it doesn't have rails but is kind of like two 500-watt power supplies bundled together).

I have the same power supply, and even with a single 5970 it is still recommended that one spreads the load across the two 12V rails. I'm amazed the PSU handled two 6990s on the same rail. However, when I think about it I am not surprised, as I've read somewhere that the HX1000W is conservatively rated (i.e., it can deliver more than 1000W if pushed) and is really a 1200W PSU sold as 1000W (designed by an OEM called CWT).



One of the cards was split between the rails, so one rail had 3 of the 4 PCIe cables plugged into it, and the other rail had 1. Even still, the power supply was pulling over 800 (I believe close to 900) watts from the socket.

Nice, you got yourself one serious space heater there :) However, I'd be concerned about running the PSU at 80 to 90% of its rated capacity. I think that will shorten its lifespan.

Did you also try to underclock the memory on those cards? You can save 20-40 watts per card doing so.
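To put rough numbers on the load and the possible savings, here is a minimal sketch. The 800-900 W figure was measured at the socket, so the load on the PSU's output side is somewhat lower. The 85% efficiency used below is an assumption for illustration, not a measured value for the HX1000, and the function names are made up for this example.

```python
# Rough PSU load estimate using the figures quoted in this thread.
# ASSUMPTION: 0.85 efficiency is a placeholder, not a spec for any real PSU.


def dc_load_watts(wall_watts: float, efficiency: float = 0.85) -> float:
    """Estimate the DC power delivered to components from the wall draw."""
    return wall_watts * efficiency


def load_fraction(dc_watts: float, rated_watts: float) -> float:
    """Fraction of the PSU's rated output that is in use."""
    return dc_watts / rated_watts


def memory_underclock_savings(cards: int, watts_per_card: float) -> float:
    """Total watts saved if every card saves `watts_per_card`."""
    return cards * watts_per_card


if __name__ == "__main__":
    rated = 1000.0  # HX1000 rating in watts
    for wall in (800.0, 900.0):
        dc = dc_load_watts(wall)
        share = load_fraction(dc, rated)
        print(f"{wall:.0f} W at the wall ~ {dc:.0f} W DC ({share:.0%} of rating)")

    low = memory_underclock_savings(2, 20.0)
    high = memory_underclock_savings(2, 40.0)
    print(f"Underclocking memory on 2 cards: {low:.0f} to {high:.0f} W saved")
```

Swap in the efficiency from your own PSU's datasheet if you have it; the point is only that wall draw and rated output are not the same number.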


Hey, btw, check out this picture to see how the rails are distributed on the HX1000W:

http://forum.corsair.com/forums/showthread.php?t=70317



No, I haven't had much luck figuring out how to lower mem speeds on Linux. I think that's about the only thing left to get this computer running optimally.

I found that rail diagram when your initial post got me on the right track.
allinvain
April 30, 2011, 03:35:14 AM
 #22

Quote

No, I haven't had much luck figuring out how to lower mem speeds on Linux. I think that's about the only thing left to get this computer running optimally.

I found that rail diagram when your initial post got me on the right track.

Hmm, there is a thread (or perhaps two) in the mining section that mentions a specific program you can use to lower the memory clock even under Linux. I am pretty sure it's possible. It's really worth giving it your best shot because it can save you some $ by lowering the power consumption of your card(s).

Yeah that diagram was very useful to me too a while back.

gigabytecoin
April 30, 2011, 04:14:22 AM
 #23

Probably cheap parts. I have the exact same problem :(
allinvain
April 30, 2011, 04:42:40 AM
 #24

Probably cheap parts. I have the exact same problem :(

Well, I dunno. I don't think the OP mentioned whether the power supply or the cards are new.

Therein lies the problem: in our drive to lower costs, we miners often run into problems that totally wipe out the savings we're chasing.

clonedone (OP)
April 30, 2011, 12:54:45 PM
 #25

Okay, well guys, it's been running for over 10 hours now. I'm pretty confident it WAS indeed LogMeIn.
I hadn't used LogMeIn the whole time, and when I finally did, I saw the hash rate shoot down from 333 Mhash/s to 29 Mhash/s and then shoot back up to 333 Mhash/s.

I really hope this means my PSU is not messed up.

Thanks for your help, guys. I'll try TeamViewer; maybe it won't crash the rig. In any case, LogMeIn may be a suspect. If you're going to use it, use it for a couple of seconds just to check on the rig. I'm not going to be able to test whether LogMeIn really causes it because I've got a lot of stuff to do this weekend.

thanks again guys
grndzero
April 30, 2011, 01:09:43 PM
 #26

Okay, well guys, it's been running for over 10 hours now. I'm pretty confident it WAS indeed LogMeIn.
I hadn't used LogMeIn the whole time, and when I finally did, I saw the hash rate shoot down from 333 Mhash/s to 29 Mhash/s and then shoot back up to 333 Mhash/s.

I really hope this means my PSU is not messed up.

Thanks for your help, guys. I'll try TeamViewer; maybe it won't crash the rig. In any case, LogMeIn may be a suspect. If you're going to use it, use it for a couple of seconds just to check on the rig. I'm not going to be able to test whether LogMeIn really causes it because I've got a lot of stuff to do this weekend.

thanks again guys

From your second statement about the Mh/s crashing when using it, it sounds like logmein is accessing OpenCL and probably causing a conflict in the device driver. Why it would just poweroff instead of throwing up a BSOD is kind of odd.

Ubuntu Desktop x64 -  HD5850 Reference - 400Mh/s w/ cgminer  @ 975C/325M/1.175V - 11.6/2.1 SDK
Donate if you find this helpful: 1NimouHg2acbXNfMt5waJ7ohKs2TtYHePy
clonedone (OP)
April 30, 2011, 01:20:41 PM
 #27

Hah! I can confirm it now! I used LogMeIn about 40 minutes ago when I woke up to see if the rig was still on. It was, and then it shut off within the last half hour.

So, people: avoid LogMeIn for rigs.

From your second statement about the Mh/s crashing when using it, it sounds like logmein is accessing OpenCL and probably causing a conflict in the device driver. Why it would just poweroff instead of throwing up a BSOD is kind of odd.

You are probably right.
The thing is, it doesn't entirely shut off. It kind of just stops working: the fans are still running, even though the operating system is not.
allinvain
April 30, 2011, 02:41:56 PM
 #28

That is highly odd. I googled your problem with logmein and no useful results came up. So this is something that you may wish to report to the logmein developers.

Good thing that your PSU wasn't at fault :)

grndzero
April 30, 2011, 05:43:15 PM
 #29

From your second statement about the Mh/s crashing when using it, it sounds like logmein is accessing OpenCL and probably causing a conflict in the device driver. Why it would just poweroff instead of throwing up a BSOD is kind of odd.

You are probably right.
The thing is, it doesn't entirely shut off. It kind of just stops working: the fans are still running, even though the operating system is not.

If you have another computer that you can test with if it happens again I would suggest the following:

ping to see if the system and networking are still up
try to connect to a file share to see if higher level networking is still online
try to do a remote desktop connection and see whether it's just the video card that is cutting out and killing the signal to the monitor as well

At least then you have some idea at what level it's actually happening.
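The three checks above could be scripted from the second computer roughly like this. It is only a sketch: the host address is a placeholder, the `diagnose` wording is my own reading of the checks, and the ports are the standard ones for Windows file sharing (445) and Remote Desktop (3389). A TCP connect only shows that the service is listening, not that a full login would succeed.

```python
import platform
import socket
import subprocess


def ping(host: str, timeout_s: int = 2) -> bool:
    """Send one echo request with the system ping command."""
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        # Linux-style flags; -W is the reply timeout in seconds there.
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # The exit code is only a rough signal; firewalls can also block ping.
    return result.returncode == 0


def tcp_port_open(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def diagnose(ping_ok: bool, share_ok: bool, rdp_ok: bool) -> str:
    """Turn the three results into a rough guess at where the failure is."""
    if share_ok or rdp_ok:
        return "OS and services respond: suspect only the video output"
    if ping_ok:
        return "network stack responds but services do not: OS is likely hung"
    return "nothing responds: the whole machine is down or off the network"


if __name__ == "__main__":
    rig = "192.168.1.50"  # hypothetical address of the mining rig
    print(diagnose(ping(rig), tcp_port_open(rig, 445), tcp_port_open(rig, 3389)))
```

Run it right after the rig appears to die, before power cycling, so the result reflects the hung state.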

clonedone (OP)
May 01, 2011, 12:20:28 AM
 #30

Hmmm, now I'm confused. Sorry about the confirmation on LogMeIn earlier. It happened again twice today. After one of the shutdowns I turned the rig on again and Windows was completing updates, so that MIGHT be it, but I'm not going to confirm it. Sigh, I'll have to test the PSU.
JJG
May 01, 2011, 12:29:07 AM
 #31

Hmmm, now I'm confused. Sorry about the confirmation on LogMeIn earlier. It happened again twice today. After one of the shutdowns I turned the rig on again and Windows was completing updates, so that MIGHT be it, but I'm not going to confirm it. Sigh, I'll have to test the PSU.

Best of luck. Keep us posted.
TurboK
May 01, 2011, 12:57:20 AM
 #32

It's not your PSU. It's a driver bug. Accessing the video overlay while OpenCL apps are running will lock up a 69xx card. The 58xx series suffered from this too, but it was fixed in Catalyst 10.4. The 69xx is affected even with Catalyst 11.4.

http://bitcointalk.org/index.php?topic=4669
http://forums.amd.com/devforum/messageview.cfm?catid=390&threadid=128404&enterthread=y

12zJNWtM2HknS2EPLkT9QPSuSq1576aKx7

Tradehill viral bullshit code: TH-R114411
Steve
May 01, 2011, 01:01:34 AM
 #33

What motherboard are you using that will let you run with 3 cards?  Are they spaced far enough apart to allow airflow between?

(gasteve on IRC) Does your website accept cash? https://bitpay.com
TurboK
May 01, 2011, 01:07:03 AM
 #34

What motherboard are you using that will let you run with 3 cards?  Are they spaced far enough apart to allow airflow between?

The card scales back its own clock speed if temps get too high. You'd have to reach a meltdown temperature instantly for it to hang outright.

I'm telling you, it's the OpenCL + video bug. It's easy enough to reproduce: have your miner up and running, then go to Catalyst Control Center and try opening the Video Settings menu. Or just launch GPU-Z AFTER you have started the mining app.

clonedone (OP)
May 01, 2011, 02:34:17 AM
 #35

What motherboard are you using that will let you run with 3 cards?  Are they spaced far enough apart to allow airflow between?

The card scales back its own clock speed if temps get too high. You'd have to reach a meltdown temperature instantly for it to hang outright.

I'm telling you, it's the OpenCL + video bug. It's easy enough to reproduce: have your miner up and running, then go to Catalyst Control Center and try opening the Video Settings menu. Or just launch GPU-Z AFTER you have started the mining app.

ahh so what do you recommend then? should i downgrade to 10.4?
TurboK
May 01, 2011, 03:05:08 AM
 #36

10.4 is where they fixed it for the 58xx series (after the card had been out for half a year). The earliest Catalyst to support the 6950 is 10.11a, I think, and 6990 support was added only recently. So you can't roll back drivers.
68xx cards are not affected, only the 69xx ones that use the new VLIW4 architecture. Chances are the upcoming Radeon 7xxx will have this bug too, unless they fix it by then.

The only thing you can do is watch out and not open any apps that trigger the video overlay while mining. That means most video players, Flash with "hardware acceleration" ticked, the Video Settings in CCC, GPU-Z at startup (it works fine if you start your miner after GPU-Z is already running, though), pcsx2, and Google Chrome with hardware acceleration enabled. These are the ones I've stumbled across so far.
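One way to keep an eye on this on a Windows rig is to scan the process list before (or while) the miner runs. The sketch below parses the CSV output of the built-in `tasklist` command. The executable names in `SUSPECT_PROCESSES` are illustrative guesses at how the apps listed above might show up, not a verified list, so adjust them to what is actually installed.

```python
import csv
import io
import subprocess

# ASSUMPTION: illustrative executable names only; check your own system.
SUSPECT_PROCESSES = {
    "gpu-z.exe",
    "chrome.exe",
    "vlc.exe",
    "wmplayer.exe",
    "pcsx2.exe",
}


def parse_tasklist_csv(output: str) -> set[str]:
    """Extract lower-cased image names from `tasklist /FO CSV /NH` output."""
    names = set()
    for row in csv.reader(io.StringIO(output)):
        if row:  # skip blank lines
            names.add(row[0].lower())
    return names


def running_suspects(process_names: set[str],
                     suspects: set[str] = SUSPECT_PROCESSES) -> list[str]:
    """Return the suspect processes that are currently running, sorted."""
    return sorted(process_names & suspects)


def list_processes() -> set[str]:
    """Windows only: ask tasklist for the running processes."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_tasklist_csv(result.stdout)


if __name__ == "__main__":
    found = running_suspects(list_processes())
    if found:
        print("Close these before mining:", ", ".join(found))
    else:
        print("No known overlay-triggering apps found.")
```

This only catches processes by name; it cannot tell whether a given app is actually using the video overlay at that moment.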

And perhaps keep bugging AMD about it. Maybe they'll fix it sooner if more people complain. I wouldn't get my hopes up: the 58xx series got fixed because it affected an entire generation of cards, whereas this only affects 3 cards.

allinvain
May 01, 2011, 03:53:17 AM
 #37

Wow, way to go, AMD. Way to keep up your reputation for having shitty drivers.

clonedone (OP)
May 01, 2011, 04:00:12 AM
 #38

I see... I shouldn't have gotten these cards then... damn.
Thanks for your help. I'm not even sure what to do now.
I'm thinking of setting up a program to restart my computer every hour and then start GUIMiner.
That way I can restart my computer before the driver issue shuts me down.
What do you think of this plan? I suppose it might slow my Bitcoin mining down a lot.
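For a rough idea of how much an hourly restart would cost, the arithmetic can be sketched as below. The 3 minutes of downtime per restart is an assumed figure (boot time plus miner startup will vary by machine), and 333 Mhash/s is the rate reported earlier in this thread.

```python
def uptime_fraction(cycle_min: float, downtime_min: float) -> float:
    """Share of each restart cycle that is actually spent mining."""
    if not 0 <= downtime_min <= cycle_min:
        raise ValueError("downtime must be between 0 and the cycle length")
    return (cycle_min - downtime_min) / cycle_min


def effective_hashrate(rate_mhash: float, cycle_min: float,
                       downtime_min: float) -> float:
    """Average hash rate once restart downtime is taken into account."""
    return rate_mhash * uptime_fraction(cycle_min, downtime_min)


if __name__ == "__main__":
    rate = 333.0    # Mhash/s, as reported in the thread
    cycle = 60.0    # restart once per hour
    downtime = 3.0  # ASSUMPTION: minutes lost per restart
    lost = 1 - uptime_fraction(cycle, downtime)
    avg = effective_hashrate(rate, cycle, downtime)
    print(f"Lost to restarts: {lost:.1%}, average rate: {avg:.2f} Mhash/s")
```

Under that assumption the loss is a few percent, but a scheduled restart does nothing for hashing lost when the rig hangs partway through a cycle.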
clonedone (OP)
May 01, 2011, 10:48:57 AM
 #39


What he is saying is that any app that uses the video overlay will crash the card. If this is a dedicated mining rig, no such app should be opening, and it won't crash. So, are you using the computer when this is happening, or is it just sitting somewhere mining?

I was initially using LogMeIn; now I'm not, but it's still installed. I'll uninstall it today, but I doubt it will make a difference. I don't use this machine for anything else.
TurboK
May 01, 2011, 02:25:04 PM
 #40

Wow, way to go, AMD. Way to keep up your reputation for having shitty drivers.


The problem affects a very small percentage of users. If you are running the card as a dedicated OpenCL accelerator, chances are you aren't the type of guy to hang out on YouTube all day. On a dedicated mining rig, no other app that triggers the bug should be running.
Besides, nVidia drivers aren't any better. Remember the nForce chipset?

clonedone: go through the Windows services and see if there's anything suspicious there (such as Windows Media Center). Also check whether there are any scheduled tasks. I run a 6950, and any time it hangs, it's from an application I know I opened, never from something that opened in the background.
