1  Alternate cryptocurrencies / Mining (Altcoins) / Re: How to reboot a rig if it stops mining? on: January 22, 2018, 08:55:08 PM
Thanks, bro... God bless you.
Greetings from Turkey.

I have a DIY solution for this that you can use.  Basically, I wrote a PowerShell script that checks the GPU usage percentage and reboots the computer based on the results, although you can modify it to restart the mining program instead.  I also have something I use to restart the mining program rather than rebooting the computer, since sometimes that is enough.

The difficulty with restarting the mining app is shutting it down.  I use something called closeprog.exe to shut down the mining application.  I set up a scheduled task that runs a script when the video driver crashes.  That script uses closeprog.exe to kill the mining app and then reruns it.  I can't remember where I got closeprog.exe; I have had it for years and used it for various scripted things.  You can probably use taskkill to do the same thing.
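
For anyone without closeprog.exe, here is a minimal sketch of the same kill-and-rerun step using PowerShell's built-in Stop-Process and Start-Process.  The function name Restart-MinerProcess, the process name and the miner path are hypothetical placeholders, not what the original poster used.

```powershell
# Sketch only: stop a miner by process name, then start it again.
# Restart-MinerProcess and the values in the usage note are placeholders.
function Restart-MinerProcess {
    param(
        [Parameter(Mandatory)] [string] $ProcessName,  # process name without ".exe"
        [Parameter(Mandatory)] [string] $MinerPath,    # executable or batch file that starts the miner
        [int] $WaitSeconds = 5                         # short pause before starting the miner again
    )

    # Refuse to kill anything if we would not be able to start it again.
    if (-not (Test-Path -LiteralPath $MinerPath)) {
        Write-Warning "Miner not found at $MinerPath, nothing restarted"
        return $false
    }

    # Kill every running instance; a miner that already exited is not an error.
    Get-Process -Name $ProcessName -ErrorAction SilentlyContinue | Stop-Process -Force

    Start-Sleep -Seconds $WaitSeconds
    Start-Process -FilePath $MinerPath
    return $true
}
```

It would be called with something like Restart-MinerProcess -ProcessName 'miner' -MinerPath 'C:\Mining\start_miner.bat', with both values swapped for whatever your miner actually uses.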

To monitor GPU usage and reboot when a specific GPU drops out, I use Open Hardware Monitor.  You can get it here: http://openhardwaremonitor.org/
It exposes the GPU sensor stats to Windows Management Instrumentation (WMI), which PowerShell can work with.  The PowerShell script runs from a scheduled task every 10 minutes.  It cycles through all of the GPUs looking for low usage results.  If it gets 9 low results in a row, it reboots the whole rig.  The reason I use 9 results is that if a driver crash causes the other scheduled task to restart the mining application while this script is testing GPU usage, I can wind up with some low readings while things are being restarted.  GPU usage also dips normally during certain work restarts from the pool.  So I want to be sure I am really seeing a consistently low result before I reboot.  The PS script is this:

Code:
$Log = "LogFile.log"
$Date = Get-Date
$TestValue = 0

#Test if Open Hardware Monitor is running and if not, start it
$ProcessName = "openhardwaremonitor"

if((Get-Process $ProcessName -ErrorAction SilentlyContinue) -eq $Null)
    { Start-Process -FilePath ".\OpenHardwareMonitor\OpenHardwareMonitor.exe" -WindowStyle Minimized }
else
    { echo "Process is already running" }

#if the computer just started it will get zeros while the miner is still getting the dag file ready so we wait
Start-Sleep 120

#Check GPU load nine times (the same test the original script spelled out nine times)
ForEach($Pass In 1..9)
    {
    #if we have bad timing on a driver crash (and recovery) or work restarts we may get low results so we wait between tests
    if($Pass -gt 1)
        { Start-Sleep 20 }

    $GPULoads = Get-WmiObject -namespace root\openhardwaremonitor -class sensor | Where-Object {$_.SensorType -Match "load" -and $_.Identifier -like "*gpu*"}

    ForEach($GPU In $GPULoads)
        {
        $GPULoadValue = $GPU.value
        if($GPULoadValue -lt 10)
            {
            Write-Host $GPULoadValue "seems low"
            #"$Date - Low result obtained $GPULoadValue" >> $Log
            $TestValue = $TestValue + 1
            }
        else
            { Write-Host $GPULoadValue "seems fine" }
        }
    }

#more than eight low readings in total trigger a restart
if($TestValue -gt 8)
    {
    "$Date - Obtained $TestValue low results - Seems dead - restarting" >> $Log
    Restart-Computer -force
    }
else
    { "$Date - Obtained $TestValue low results - Seems ok" >> $Log }
        

Each of the nine GPU tests checks all GPUs, and if any of them gives a result under 10% it increments the TestValue variable.  The idea is that a TestValue of nine at the end means each of the nine tests found at least one GPU reporting usage under 10% (but see the EDIT below).  Since there are also pauses between tests, that works pretty well for me in terms of only rebooting when a GPU has well and truly crashed.  The Open Hardware Monitor app has to be in a subdirectory called OpenHardwareMonitor underneath the directory the PowerShell script runs from (or you need to change the path).  The script first checks whether that app is running and starts it if it isn't, so you never have to bother making sure Open Hardware Monitor is running yourself.  The script also has a long pause at the beginning, so that if it runs right after a reboot it waits two minutes (Start-Sleep 120) for the miner apps to get started (because you have those start automatically at boot, right?).
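
To see what the Where-Object filter in the script actually matches on a given rig, a small sketch like the following can help.  Select-GpuLoadSensor is a hypothetical helper name; the Name, Identifier, SensorType and Value properties are the ones Open Hardware Monitor publishes on its WMI Sensor class, and the threshold of 10 is the one used in the script.

```powershell
# Sketch only: apply the script's GPU load filter to sensor objects from any
# source and flag readings below the threshold. Select-GpuLoadSensor is a made-up name.
function Select-GpuLoadSensor {
    param(
        [Parameter(Mandatory, ValueFromPipeline)] $Sensor,
        [double] $Threshold = 10   # same 10% cut-off as the script
    )
    process {
        # Same test as the script: a load sensor whose identifier mentions a GPU.
        if ($Sensor.SensorType -match 'load' -and $Sensor.Identifier -like '*gpu*') {
            [pscustomobject]@{
                Name       = $Sensor.Name
                Identifier = $Sensor.Identifier
                Value      = $Sensor.Value
                IsLow      = ($Sensor.Value -lt $Threshold)
            }
        }
    }
}

# On the rig (Open Hardware Monitor must be running):
# Get-WmiObject -Namespace root\openhardwaremonitor -Class Sensor | Select-GpuLoadSensor | Format-Table
```

If the table lists more than one load sensor per card, keep in mind that the script counts each matching sensor separately toward TestValue.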

To run a PowerShell script from a scheduled task, I use a batch file to run the PS script (convoluted, I know, but it works).  The batch file is in the same directory as the PS script and looks like this:
powershell.exe .\GPU_Monitor.ps1
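
The scheduled task itself can also be created from PowerShell rather than through the Task Scheduler GUI.  This is a rough sketch, assuming a Windows version that ships the ScheduledTasks cmdlets (Windows 8 / Server 2012 or later); the task name, the C:\Mining folder and both function names are invented for illustration.  It calls powershell.exe directly with -File, so the batch file becomes optional, and it passes -ExecutionPolicy Bypass because a restrictive execution policy may otherwise block the script.

```powershell
# Sketch only: register a task that runs the monitor script every 10 minutes.
# Function names, task name and paths are placeholders.
function Get-MonitorTaskArgument {
    param([Parameter(Mandatory)] [string] $ScriptPath)
    # -File runs the script; the quotes keep paths with spaces intact.
    '-NoProfile -ExecutionPolicy Bypass -File "{0}"' -f $ScriptPath
}

function Register-GpuMonitorTask {
    param(
        [string] $ScriptDir    = 'C:\Mining',
        [string] $TaskName     = 'GPU Monitor',
        [int]    $EveryMinutes = 10
    )
    $scriptPath = Join-Path $ScriptDir 'GPU_Monitor.ps1'

    # The working directory matters: the script uses relative paths for
    # LogFile.log and .\OpenHardwareMonitor\OpenHardwareMonitor.exe.
    $actionArgs = @{
        Execute          = 'powershell.exe'
        Argument         = Get-MonitorTaskArgument -ScriptPath $scriptPath
        WorkingDirectory = $ScriptDir
    }
    $action = New-ScheduledTaskAction @actionArgs

    # Start now and repeat; older Windows versions may also need -RepetitionDuration.
    $triggerArgs = @{
        Once               = $true
        At                 = Get-Date
        RepetitionInterval = New-TimeSpan -Minutes $EveryMinutes
    }
    $trigger = New-ScheduledTaskTrigger @triggerArgs

    Register-ScheduledTask -TaskName $TaskName -Action $action -Trigger $trigger
}
```

Running Register-GpuMonitorTask once from an elevated PowerShell prompt should create the task; adjust the folder first so it points at wherever GPU_Monitor.ps1 lives.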

EDIT: I was just looking over my log file for this and I see that the script actually increments the TestValue variable for every single low reading, so if you have 6 GPUs (I don't) it could hit 9 low results quite easily if the test ran at an inopportune moment (like during a video driver crash).  You can change the value in the line towards the end to set how many low results are needed to trigger a reboot.  The line that determines that is
Code:
if($TestValue -gt 8)
Just change 8 to whatever you like.  You can also comment out the restart by putting a # in front of the Restart-Computer line.  I ran this for a couple of days with the restart command commented out to be sure it was really only going to restart when I wanted it to.  When I was satisfied that it wasn't going to cause a lot of unnecessary reboots, I removed the comment on that line so it could restart the rig.  But looking at my log for the past few days, it looks like the script could use some refinement.
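
One possible refinement, sketched here as a guess at what the post was aiming for rather than as a tested fix: count a test pass as bad only once, no matter how many GPUs read low in it, and reboot only when enough passes were bad.  Test-RigLooksDead and its parameters are hypothetical names; each element of -Passes is the list of GPU load values collected in one pass.

```powershell
# Sketch only: decide on a reboot from per-pass results instead of per-reading counts.
# Test-RigLooksDead and its parameter names are made up for this example.
function Test-RigLooksDead {
    param(
        [Parameter(Mandatory)] [object[]] $Passes,  # one array of GPU load values per pass
        [double] $Threshold = 10,                   # same 10% cut-off as the script
        [int] $RequiredBadPasses = 9                # how many bad passes justify a reboot
    )
    $badPasses = 0
    foreach ($pass in $Passes) {
        # A pass is bad if at least one GPU is below the threshold.
        # A pass with no readings at all is not counted as bad here.
        $low = @($pass | Where-Object { $_ -lt $Threshold })
        if ($low.Count -gt 0) { $badPasses++ }
    }
    return ($badPasses -ge $RequiredBadPasses)
}
```

With this approach a single driver crash that zeroes six GPUs for one or two passes adds only one or two bad passes, so it no longer reaches nine by itself.  In the script, each pass would collect its values (for example with $GPULoads | ForEach-Object { $_.Value }) into a list of passes, and Restart-Computer would run only if Test-RigLooksDead returns true.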