Bitcoin Forum
November 16, 2024, 12:41:45 PM *
Author Topic: Windows 10 - 8 RX580 - Claymore 14.7 - Rig randomly restarting around 24h mark  (Read 237 times)
dipepmining (OP)
Jr. Member
*
Offline Offline

Activity: 43
Merit: 15


View Profile
July 25, 2019, 11:41:33 PM
Merited by nc50lc (1)
 #1

After running my rig for about 24 hours straight, it is crashing and automatically restarting. As soon as I get to the desktop I see the message that Wattman settings were restored to defaults (I am aware this usually means something went wrong with one of the cards). I checked the Claymore logs and the entire bottom of the txt file is filled with <0x00> so there is no useful information from when things crashed. If anyone can give me some ideas on how to pinpoint what is going on the next time this happens that would be great.
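
A log whose tail is filled with <0x00> often still has readable text just before the padding. The following is a minimal sketch, not a Claymore feature: it strips the trailing NUL bytes and returns the last readable lines. The function name and the log path in the usage note are hypothetical.

```python
from pathlib import Path


def last_readable_lines(log_path, count=20, encoding="utf-8"):
    """Return the last `count` non-empty lines that precede trailing NUL padding."""
    raw = Path(log_path).read_bytes()
    # Drop the run of 0x00 bytes at the end of the file; real log text stays.
    trimmed = raw.rstrip(b"\x00")
    # Replace undecodable bytes instead of failing on a partially written line.
    text = trimmed.decode(encoding, errors="replace")
    lines = [line for line in text.splitlines() if line.strip()]
    return lines[-count:]
```

Calling `last_readable_lines("claymore_log.txt")` (a placeholder file name) would show what the miner wrote last before the crash, assuming anything was flushed to disk at all.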

Every time I check, all GPUs are at 70 degrees or less during mining and everything appears to be running smoothly at 246 MH/s and 1037 watts.

I basically followed the entire https://mining.help/ guide to set this rig up.

Rig specs:
OS: Windows 10 (running the commands to clean it up for mining)
GPUs: 8 - MSI Armor OC RX580's
Motherboard: Asus B250 Mining Expert (made sure I configured the BIOS properly, setting Gen2 for PCIe)
1200 watt power supply plugged into EATX Power A (to power the main system, and 4 gpus)
1200 watt power supply plugged into EATX Power B (to power the remaining 4 gpus)
I have risers plugged into ports: A01, A02, A03, A04, B07, B08, B09, B010

All cards are using a modded vBios that I created using Polaris BIOS editor (pimp my straps option).

All cards are using the following overclock settings (maybe this could potentially be the issue, but they run so smooth for such a long time so I feel like these OC settings are safe):

Micron Memory:
gpu core: 1150 MHz / 850 mV
memory: 2050 MHz / 850 mV

I am running Claymore 14.7 to mine Ethereum using Ethermine as the pool with the following settings:

I made sure to set the following environment variables:
setx GPU_FORCE_64BIT_PTR 0
setx GPU_MAX_HEAP_SIZE 100
setx GPU_USE_SYNC_OBJECTS 1
setx GPU_MAX_ALLOC_PERCENT 100
setx GPU_SINGLE_ALLOC_PERCENT 100

Here are the additional settings in my start.bat (excluding my wallet info)
-mode 1 -mport 0 -rxboost 1 -tt 70 -ttli 75 -tstop 85
Lunga Chung
Member
**
Offline Offline

Activity: 277
Merit: 23


View Profile
July 25, 2019, 11:58:33 PM
Merited by nc50lc (1)
 #2

Have similar issues on my farm with 14.7

Add more mV on all cards and let it run to see if it crashes. If it passes the 24h mark, lower each card individually for another 24h.

I know this is a pain to do, but I see no other way to determine the problematic card.

If this doesn't help, check your wall power. I used to have a problem on my electrical grid: when a nearby car service turned on their lifts, the power spikes rebooted my rigs.
dipepmining (OP)
July 26, 2019, 01:02:49 AM
 #3

Thank you for the advice. On a side note, the rig actually went about 1.5-2 days with no problem before the restart happened. Is it common for a rig with all of the exact same video cards to have a card that needs settings of its own? I am going to try exactly what you said: increase the voltage to 900mV on all cards and slowly lower it as time goes on. Can I adjust it while Claymore is actively running?

I am convinced the issue has to do with the overclock/undervolt settings, because when Wattman gets restored it usually means there was a memory error of some sort. I have played around with overclock/undervolt in the past and all of the combinations I tried caused crashes, but I was able to witness them within seconds or minutes.

One of the two power supplies also gets hot once in a while, which doesn't make much sense since I am only using about half the load on each of them. I am hoping it has nothing to do with a faulty PSU or, as you were saying, a power issue (which I highly doubt because Wattman settings were restored to defaults, which usually means there was a GPU memory error of some sort).
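
The "about half the load" estimate can be checked with simple arithmetic. This is a rough sketch that assumes the 1037 W total is split evenly across the two 1200 W supplies. That is only an approximation here, since PSU A also powers the main system, and the thread does not say whether 1037 W is measured at the wall or reported by software. The function name is hypothetical.

```python
def psu_load_fraction(total_watts, psu_count, psu_capacity_watts):
    """Fraction of rated capacity each PSU carries, assuming an even split."""
    per_psu_watts = total_watts / psu_count
    return per_psu_watts / psu_capacity_watts


# Figures from the first post: 1037 W total, two 1200 W supplies.
load = psu_load_fraction(1037, 2, 1200)
print(f"Estimated load per PSU: {load:.1%}")
```

Under the even-split assumption each supply sits a little below half of its rating, which matches the estimate in the post.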
Lunga Chung
July 26, 2019, 03:12:51 AM
Merited by nc50lc (1)
 #4

silicon lottery, every card is different...

Don't go that high. Start from 875mV; if it's crashing in 1-2 days, just a little bump in voltage should be enough to make them stable.

Also switch to OverdriveNtool like in the guide you mentioned previously
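
The isolation procedure suggested in this thread (raise every card, confirm stability, then lower one card at a time for a 24-hour run each) can be written out as a test plan. This is only a sketch of the bookkeeping: it does not talk to OverdriveNTool or to the cards, and the function name is hypothetical. The 850 mV and 875 mV values are the ones mentioned in the thread.

```python
def isolation_schedule(num_cards, base_mv, raised_mv):
    """Yield (label, per-card voltages in mV) for each 24-hour test run."""
    # Run 0: every card at the raised voltage, to confirm the rig can be stable at all.
    yield "all raised", [raised_mv] * num_cards
    # Runs 1..N: drop exactly one card back to the original voltage per run.
    for card in range(num_cards):
        voltages = [raised_mv] * num_cards
        voltages[card] = base_mv
        yield f"card {card} lowered", voltages


for label, voltages in isolation_schedule(8, base_mv=850, raised_mv=875):
    print(label, voltages)
```

A run that crashes while only one card is at the lower voltage points at that card. With 8 cards this is 9 runs, so it takes over a week at 24 hours per run.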
dipepmining (OP)
July 26, 2019, 03:50:13 AM
 #5

So the silicon lottery thing is real, got it. I have been using OverdriveNTool for overclocking. Thank you for the help!!!
Piskeante
Member
**
Offline Offline

Activity: 924
Merit: 15


View Profile
July 26, 2019, 05:58:51 PM
 #6

The usual issue: everybody undervolts memory.

DO NOT DO IT!!!! It makes crashes more frequent!!!

adaseb
Legendary
*
Offline Offline

Activity: 3878
Merit: 1733


View Profile
July 27, 2019, 05:17:35 AM
Merited by Raja_MBZ (1)
 #7

If it is almost always restarting around the 24-hour mark, then it's most likely a software update or a hibernate cycle that you have set.

Go to the power settings and make sure everything is set to Always On, so the rig does not suspend or hibernate after 1 day.

Another possibility is a virus scanner that starts every day after 24 hours of inactivity, or a Windows Update that always fails to install and tries again 24 hours later with the same result.

Don't think it's a GPU issue here.
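
One way to tell a scheduled cause from a random hardware crash is to check how regular the restarts are. The sketch below assumes you note the restart times by hand (or copy them from wherever you track them). The timestamps in the example are made up for illustration, and the 1-hour tolerance is an arbitrary choice.

```python
from datetime import datetime
from statistics import pstdev


def restart_intervals_hours(timestamps, fmt="%Y-%m-%d %H:%M"):
    """Hours between consecutive restarts, from timestamp strings."""
    times = sorted(datetime.strptime(t, fmt) for t in timestamps)
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]


def looks_scheduled(intervals, tolerance_hours=1.0):
    """True if the intervals are nearly identical, which hints at a timer or task."""
    if len(intervals) < 2:
        return False  # not enough data to judge
    return pstdev(intervals) <= tolerance_hours


# Hypothetical restart times, for illustration only.
restarts = ["2019-07-20 10:00", "2019-07-21 10:05", "2019-07-22 09:58", "2019-07-23 10:02"]
print(looks_scheduled(restart_intervals_hours(restarts)))
```

Restarts that land within minutes of the same interval suggest a scheduled task or update, while intervals that vary by many hours fit an unstable overclock better.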
swogerino
Legendary
*
Offline Offline

Activity: 3346
Merit: 1248




View Profile
July 27, 2019, 10:34:51 AM
 #8

Have you tried editing the start.bat of your Claymore miner and setting the watchdog to 0? That way you can easily spot whether there is a graphics card issue. With the watchdog enabled, Claymore restarts when there is a problem. With the watchdog set to 0 it does not restart; it just keeps notifying you that the miner needs to be restarted, but it continues to mine no matter what.

dipepmining (OP)
July 27, 2019, 03:58:52 PM
 #9

Swogerino, thank you, this is exactly the type of setting I was looking for. So the command line will never close and I will be able to scroll up and see which card was having problems?
Lunga Chung
July 27, 2019, 09:04:32 PM
 #10

The same can be found in the log file.