Bitcoin Forum
May 06, 2024, 03:42:00 PM *
Author Topic: Managing unstable overclock behavior, GPU errors under Linux / XP / Win7  (Read 3908 times)
Transisto (OP)
August 10, 2011, 03:28:49 AM
Last edit: August 10, 2011, 03:46:29 AM by Transisto
 #1

I would like to find a way to get a high, stable overclock without risking a hung GPU that can go unnoticed for days.

An unstable card can misbehave in so many different ways that I now run my cards as conservatively as possible.

I only have experience with Win7 but would gladly move to Linux if it handles GPU errors better.

Here are some of the errors I encounter:

  • Windows locks up
  • Windows reboots
  • Windows BSODs
  • Windows stops poclbm.exe
  • The driver downclocks the GPU to a ridiculously slow speed

The worst is when guiminer.exe goes to 100% CPU and all the other miners stall at ~100 Khash (this happens on single-core CPUs).

I found out the hard way that any hashes I gain by fine-tuning an overclock by 10-20 MHz are always a net loss compared to having a rig hung for days in one of the above states.
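To put rough numbers on that trade-off, here is a minimal sketch. It assumes hash rate scales linearly with core clock, which is a simplification and not something measured in this thread, and the function name `break_even_downtime_hours` is made up for illustration.

```python
def break_even_downtime_hours(base_mhz, oc_gain_mhz, period_hours):
    """Hours of downtime per period that cancel out an overclock gain.

    Assumption: hash rate is proportional to core clock.
    Work at the base clock for the whole period equals work at the
    boosted clock for (period - downtime) hours, so
    downtime = period * gain / (base + gain).
    """
    boosted_mhz = base_mhz + oc_gain_mhz
    return period_hours * oc_gain_mhz / boosted_mhz


if __name__ == "__main__":
    week_hours = 7 * 24
    for gain in (10, 20):
        hours = break_even_downtime_hours(830, gain, week_hours)
        print(f"+{gain} MHz on 830 MHz: break-even at {hours:.1f} h down per week")
```

Under that assumption, an extra 10 MHz on top of 830 MHz is cancelled by about two hours of hung time per week, so a rig that sits hung for days loses far more than the tuning could ever gain.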

I keep hearing about people overclocking their cards much higher than I can run mine stably.
Personally I run my 5850s no faster than 830 MHz stock, and some of them aren't even stable at that.

Do you know of a reliable automated tool for finding speed vs. voltage sweet spots?

What are your experiences with other OSes / software / miners / drivers?
teukon
August 10, 2011, 09:58:49 AM
 #2

Is your miner dedicated to mining or are you using it for other tasks?

My experience is with Linux, and my cards go to 975 MHz and 1020 MHz stably at stock voltage (Sapphire HD5850 Xtremes).  I thought my slow card could handle 980 MHz but it crashed after a few days.

The most important factors for me in achieving stability were:
  • Not running a GUI,
  • Keeping the temperatures low.

On the first point, people find their cards are more stable after disabling Flash hardware acceleration for example.  Removing desktop effects can help too.  I've removed everything: no mouse, no console mode, not even a black screen (when a monitor is connected to a card it reports no signal).  I don't know how to do this in Windows but thought it worth mentioning as it made a big difference to me.

I don't know anything about "speed vs. voltage sweet spots".  I just fix a voltage and manually search for the max stable clock (which can take days).  Currently my cards are both undervolted to 0.9875V and are clocked to 847/308 and 899/327 but with only 48 hours continuous mining so far I can't be too sure of the stability of this.

I don't think that Linux is any better at handling crashes.  When one of my cards crashes I've not found a way of restarting it without rebooting the system!
error
August 10, 2011, 06:13:15 PM
 #3

If you want to minimize your risk, DO NOT OVERCLOCK.

There's no automated way to find a "sweet spot." Overclocking, at least to find the maximum "stable" performance, is all 100% manual tuning and a lot of crashing.

3KzNGwzRZ6SimWuFAgh4TnXzHpruHMZmV8
cirz8
August 11, 2011, 06:30:27 PM
 #4

If you're using Linux, write a script that checks for GPU hangs. If a hang is detected, it sets the default overclock on that card to 5 MHz lower than the current clock (you might want to set some thresholds here so you don't end up with a GPU running at something like 200 MHz), then does a cold reboot.
The same script can also lower the clocks if the temperature goes over predefined limits, and raise them again as the card cools. Use other temperature sensors (such as the motherboard's) to trigger presets.
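
The decision logic of such a script could look roughly like the sketch below. It covers only the arithmetic. Detecting a hang, reading temperatures, applying clocks and triggering the reboot all depend on the driver and tools in use, so they are left out. Every threshold value and function name here is a hypothetical placeholder, not something taken from a real tool.

```python
# All thresholds below are hypothetical placeholders; tune them per card.
STEP_MHZ = 5      # back-off step suggested in the post
FLOOR_MHZ = 700   # never drop below this core clock (assumed value)
HOT_C = 85        # lower the clock at or above this temperature (assumed)
COOL_C = 75       # raise the clock again at or below this temperature (assumed)


def clock_after_hang(current_mhz, step=STEP_MHZ, floor=FLOOR_MHZ):
    """Return the new default clock to store before the cold reboot."""
    return max(floor, current_mhz - step)


def clock_for_temperature(current_mhz, temp_c, default_mhz,
                          step=STEP_MHZ, floor=FLOOR_MHZ,
                          hot_c=HOT_C, cool_c=COOL_C):
    """Throttle when hot, recover toward the stored default when cool."""
    if temp_c >= hot_c:
        return max(floor, current_mhz - step)
    if temp_c <= cool_c:
        return min(default_mhz, current_mhz + step)
    return current_mhz  # between the two limits: leave the clock alone


if __name__ == "__main__":
    print(clock_after_hang(900))                # 895
    print(clock_for_temperature(900, 90, 950))  # 895
    print(clock_for_temperature(900, 70, 902))  # 902
```

Keeping a gap between the hot and cool limits stops the clock from flapping up and down around a single temperature.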
ssateneth
August 13, 2011, 10:13:14 AM
 #5

I've found that trial and error is the only reliable process for overclocking. If you're underclocking memory, I've found that all 58xx cards have stability problems with memory at 350-400+. Set memory speed to at most 330 to get rid of "Hardware problem?" errors; the gain of 0.1 Mhash isn't worth it. If you're nervous about voltages, a small bump from stock (1.163 -> 1.175) can go a long way. I managed to push my 5830 from 985 core to 1016 with that small bump, with no hardware errors! Driver hangs and restarts are another matter and seem to result purely from core problems (memory doesn't seem to affect driver hang+reset; it just causes "Hardware problem?" errors and bad shares).

With core voltage and memory settled, that just leaves core speed. I generally started around 975 and went up from there, increasing by 5 until the driver hung or crashed. Then, whenever I got around to noticing a crash, I'd go back by 2 MHz, and I kept doing that until I forgot the card was even running because it had stopped crashing.
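
The search described above (climb in steps of 5 until the first crash, then back off in steps of 2) can be sketched as follows. The `is_stable` callback is hypothetical. In practice each check means running the card for hours or days and watching for a hang, so this only shows the order in which clocks get tried.

```python
def find_stable_clock(is_stable, start_mhz=975, up_step=5, down_step=2,
                      max_mhz=1200, min_mhz=0):
    """Coarse climb until the first failure, then fine back-off.

    is_stable(mhz) is a caller-supplied check (hypothetical here).
    Returns the clock settled on, or None if nothing down to min_mhz passes.
    """
    clock = start_mhz
    # Phase 1: raise the clock in coarse steps while the card stays stable.
    while is_stable(clock):
        if clock + up_step > max_mhz:
            return clock  # hit the safety cap without ever crashing
        clock += up_step
    # Phase 2: 'clock' just failed; back off in fine steps until one passes.
    while clock - down_step >= min_mhz:
        clock -= down_step
        if is_stable(clock):
            return clock
    return None


if __name__ == "__main__":
    # Pretend the card is stable up to and including 1016 MHz (made-up limit).
    print(find_stable_clock(lambda mhz: mhz <= 1016))  # 1016
```

With a made-up limit of 990 MHz the same search settles on 989, since the 2 MHz back-off from 995 skips over 990. This procedure accepts the first passing clock and does not look for the exact maximum.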

Tx2000
August 13, 2011, 01:50:01 PM
 #6

Unfortunately, or perhaps fortunately, every piece of hardware is unique, even within the same brand, model and batch.  I dare say they are almost human in that regard.  Bottom line is that you have to manually inch your way up (or down, if you will, regarding memory and/or efficiency).  Test for long periods, and test a wide array of drivers, SDKs and, for some people, even OSes.  On top of that, test BIOS settings, which may or may not affect your PCIe lanes.

There is already a wealth of information out there, and there are too many variables to list here.  I know that you don't want to hear it but perhaps you should try Google.
Transisto (OP)
August 15, 2011, 02:39:43 PM
 #7

Quote from: Tx2000 on August 13, 2011, 01:50:01 PM
I know that you don't want to hear it but perhaps you should try Google.

I was expecting that people here with large-scale GPU operations would have figured out ways around these problems.  Automatic overclocking tools exist, but they're not very good: there's ATI Catalyst and ATITool, which scan for artifacts.

I had thought of setting up a mechanical timer on each rig that would hard reset the computer every day.
Yanz
August 22, 2011, 09:22:24 PM
 #8

Quote from: teukon on August 10, 2011, 09:58:49 AM
Is your miner dedicated to mining or are you using it for other tasks?

My experience is with Linux and my cards go to 975 MHz and 1020 MHz stably at stock voltage (Sapphire HD5850 Xtremes)  I thought my slow card could handle 980 MHz but it crashed after a few days.

The most important factors for me in achieving stability were:
  • Not running a GUI,
  • Keeping the temperatures low.

On the first point, people find their cards are more stable after disabling Flash hardware acceleration for example.  Removing desktop effects can help too.  I've removed everything: no mouse, no console mode, not even a black screen (when a monitor is connected to a card it reports no signal).  I don't know how to do this in Windows but thought it worth mentioning as it made a big difference to me.

I don't know anything about "speed vs. voltage sweet spots".  I just fix a voltage and manually search for the max stable clock (which can take days).  Currently my cards are both undervolted to 0.9875V and are clocked to 847/308 and 899/327 but with only 48 hours continuous mining so far I can't be too sure of the stability of this.

I don't think that Linux is any better at handling crashes.  When one of my cards crashes I've not found a way of restarting it without rebooting the system!

How did you do this? Can you explain more, I'd like to do this too.

With great video cards comes great power consumption.
bal3wolf
August 22, 2011, 11:44:09 PM
 #9

I've found I can run my cards at insane clocks, 1100 MHz on a 5870 and 950 on a 5970, but if I'm using my PC much they will freeze, whereas if I don't use the PC they are fine.  I have to run much lower clocks to avoid crashes while I'm doing stuff on the PC.
Transisto (OP)
August 23, 2011, 01:08:24 AM
 #10

Quote from: bal3wolf on August 22, 2011, 11:44:09 PM
Iv found i can run my cards insane clocks at 1100mhz on a 5870 and 950 on a 5970 but if im on my pc much they will freeze but if i dont use the pc they are fine.  I have to run my clocks much lower to not crash while on my pc doing stuff.

1100, that is insane indeed. What brand/voltage?

I wonder what the best test is for finding stability issues.
bal3wolf
August 23, 2011, 01:28:48 AM
 #11

Asus 5870 reference, 1.288 V