RobertN
Newbie
Offline
Activity: 21
Merit: 0
|
|
November 08, 2017, 06:49:14 PM |
|
Your ZEN_ADDRESS (note the extra D) for suprnova should correspond to your login name, not an actual wallet address. The pool and the port are correct. The next thing to consider is that 3main will use "x" for your miner password. I am not sure whether suprnova is a pool that allows anything for the password, but I just wanted to point it out. You can also look at the logging to figure out what is happening. This will show the output of the miner (and other things if REMOTE) as it tries to start and connect. Hope this helps.
Thank you! That has to be the problem. I'll replace that wallet key as soon as I solve my new problem: it won't boot anymore, and now I end up at BusyBox at every startup. I need to reinstall; hope this will work. Thanks for your help, guys!
|
|
|
|
leenoox
|
|
November 08, 2017, 06:53:00 PM |
|
@damNmad: I found a small bug in your telegram configuration. The command used to get the GPU count gives a wrong result when the rig has more than 9 cards installed.
Your command:
    GPU_COUNT=$(nvidia-smi -L | tail -n 1| cut -c 5 |awk '{ SUM += $1+1} ; { print SUM }')
and the result:
    m1@rig-bafomet:~$ nvidia-smi -L | tail -n 1| cut -c 5 |awk '{ SUM += $1+1} ; { print SUM }'
    2
There is no reason to use awk here; the easier way is to use wc only. So, the fix:
    GPU_COUNT=$(nvidia-smi -L | wc -l)
and the result:
    m1@rig-bafomet:~$ nvidia-smi -L | wc -l
    11
I used to use the same command and fixed it with this one for 1.4:
    nvidia-smi --query-gpu=count --format=csv,noheader,nounits | tail -1
Here is even more optimized code, fewer cycles, no need to use a pipe:
    nvidia-smi -i 0 --query-gpu=count --format=csv,noheader,nounits
Everyone has GPU0 plugged in, so we can query only one GPU to get the total number of GPUs instead of querying all GPUs, which each return the same number, and then piping it through tail.
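For anyone wondering why the old command prints 2 on an 11-card rig, here is a minimal sketch that reproduces it without needing a GPU. It assumes `nvidia-smi -L` prints one line per card starting with "GPU <index>:". The function names and the sample card names are made up for illustration; they are not part of nvOC.

```shell
#!/usr/bin/env bash
# Count GPUs from `nvidia-smi -L` style output read on stdin.
# Assumption: each card is listed on one line starting with "GPU <index>:".
count_gpus_from_listing() {
    grep -c '^GPU [0-9][0-9]*:'
}

# The old approach: take character 5 of the last line and add 1.
# For "GPU 10: ..." character 5 is "1", so it reports 2 instead of 11.
old_count_from_listing() {
    tail -n 1 | cut -c 5 | awk '{ print $1 + 1 }'
}

# Hypothetical listing for an 11-card rig (indexes 0 to 10).
sample_listing() {
    local i
    for i in $(seq 0 10); do
        echo "GPU $i: Example Card (UUID: GPU-xxxxxxxx)"
    done
}

echo "old method:  $(sample_listing | old_count_from_listing)"
echo "line count:  $(sample_listing | count_gpus_from_listing)"
```

On a real rig the same function could be fed from the driver tool, for example `GPU_COUNT=$(nvidia-smi -L | count_gpus_from_listing)`. Matching on the line prefix rather than counting all lines is only a small safeguard against stray output; plain `wc -l` as posted above does the same job when the listing is clean.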
|
|
|
|
papampi
Full Member
Offline
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
|
|
November 08, 2017, 07:19:28 PM |
|
Totally correct and thanks for the help 👍👍👍
|
|
|
|
papampi
Full Member
Offline
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
|
|
November 08, 2017, 07:27:05 PM |
|
BIOS question!
Is it better to change the PEG0, PEG1, ... max link speed to GEN1, GEN2, or GEN3 for nvOC and Linux, or do those settings only matter for Windows?
|
|
|
|
kk003
Member
Offline
Activity: 117
Merit: 10
|
|
November 08, 2017, 08:25:15 PM |
|
I've never understood why this query prints the total number of GPUs once per GPU instead of printing just the number of GPUs!
|
|
|
|
damNmad
Full Member
Offline
Activity: 378
Merit: 104
nvOC forever
|
|
November 08, 2017, 08:49:17 PM |
|
Guys, I just found this new (??) ethash miner. I am not sure it dual mines, but I would like someone to test it and see whether it earns a place in nvOC: https://github.com/ethash/eminer-release/releases/
It comes with some gadgets to see the hashrate and stuff; please have a look and give your opinion. I would suggest running it on a test rig (I don't have one, sorry). You can also join the discussion here on Discord: https://discord.gg/trw4c3c
It seems this miner is using OpenCL, not CUDA. Should be OK for AMD cards but not so much for Nvidia.
How on earth did I miss that point? I only saw NVIDIA in this line: "Fully support AMD and NVIDIA OpenCL devices". Thanks @leenoox
|
|
|
|
Stubo
Member
Offline
Activity: 224
Merit: 13
|
|
November 08, 2017, 08:55:28 PM |
|
Watchdog Improvement?
I have been reading over the watchdog script and have seen what I think is an opportunity for improvement. The current logic loops every 10 seconds and looks for GPU utilization below a threshold (90%). If it finds this, it begins incrementing one counter (JEEP) and decrementing another (COUNT, initially set to 6*#GPU) for each occurrence [per GPU]. My issue is that the current logic has to wait 6*#GPU*10 seconds, or a minute per GPU, before it will attempt a miner restart.
My thinking is: why not just check for the existence of a miner process? If none is found, we can skip the rest of the countdown and get right to attempting to restart the miner. This can save many minutes of lost mining time. The downside is that it is so much faster than the existing logic that, if you do have a configuration issue, it will be launching miners and rebooting at a much faster rate. So maybe this is more of an expert mod.
Anyway, I have been running a modified version of IAmNotAJeep_and_Maxximus007_WATCHDOG that has this additional logic, and it works great for ME. In the existing 19-1.4 version of the script, starting at line 98, I added this:
    # Begin Stubo Mod
    # Look for no miner screen and get right to miner restart
    if [[ `screen -ls |grep miner |wc -l` -eq 0 ]]
    then
        COUNT=0
        echo "Found no miner, jumping to 3main restart" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
    fi
    # End Stubo Mod
By setting COUNT=0 for the "no miner found" condition, I am effectively removing the delay in decrementing COUNT all of the way down before the script takes action. Thoughts?
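The short-circuit described above can be sketched as a pair of small functions that are easy to try without a running miner. This is only an illustration, not the watchdog's actual code: the function names are made up, and it assumes `screen -ls` lists sessions as "<pid>.<name>" (for example "12345.miner"), which is a slightly stricter match than the plain `grep miner` in the mod.

```shell
#!/usr/bin/env bash
# Succeeds when the `screen -ls` output on stdin lists a session named "miner".
# Assumption: sessions appear as "<pid>.<name>", e.g. "12345.miner".
has_miner_screen() {
    grep -Eq '[0-9]+\.miner([[:space:]]|$)'
}

# Print the countdown value the watchdog should continue with:
# 0 when no miner session exists (restart now), otherwise the current value.
# $1 = current COUNT, stdin = output of `screen -ls`
effective_count() {
    local count=$1
    if has_miner_screen; then
        echo "$count"
    else
        echo 0
    fi
}

# Demo with canned listings instead of a live `screen -ls`.
printf '\t12345.miner\t(Detached)\n' | effective_count 48
printf 'No Sockets found.\n' | effective_count 48
```

Inside a watchdog loop this would be used as `COUNT=$(screen -ls | effective_count "$COUNT")`, leaving the existing `if [ $COUNT -le 0 ]` branch to do the restart.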
|
|
|
|
papampi
Full Member
Offline
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
|
|
November 08, 2017, 09:44:29 PM |
|
Amazing idea. What I was testing is to restart just the miner instead of restarting 3main, which takes 2-3 minutes, by using a separate miner start file that I use for the WTM switcher. You can check it in ~/z_papampi_versions/wtm-miner. It has the 3main miner startup lines (without salfter, zpool, mph), and I use that instead of restarting 3main.
Instead of:
    echo "WARNING: $(date) - Utilization is too low: restart 3main" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
    # If miner runs in screen 'miner' kill the screen to be sure it's gone
    pkill -e miner
    bash '/home/m1/telegram'
    # Best to restart oneBash - settings might be adjusted already
    target=$(ps -ef | awk '$NF~"3main" {print $2}')
    kill $target #| tee -a ${LOG_FILE}
    echo "" #| tee -a ${LOG_FILE}
    RESTART=$(($RESTART + 1))
    REBOOTRESET=0
    COUNT=$GPU_COUNT
I use:
    echo "WARNING: $(date) - Utilization is too low: restart 3main" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
    # If miner runs in screen 'miner' kill the screen to be sure it's gone
    pkill -e miner
    sleep 1
    bash /home/m1/wtm-miner
    bash '/home/m1/telegram'
    RESTART=$(($RESTART + 1))
    REBOOTRESET=0
    COUNT=$GPU_COUNT
Can you please tell me where you added your edit? I have made so many edits that line 98 doesn't look like the correct place.
Update: I think the wtm-miner included in 1.4 is missing a done or exit at the end.
|
|
|
|
Stubo
Member
Offline
Activity: 224
Merit: 13
|
|
November 08, 2017, 09:58:31 PM |
|
papampi: I don't have a wtm-miner. That is another reason that I beautify any scripts my miners run with beautify_bash.py. Mismatches like that are trivial to find with the proper indentation. As for the mod I did, here is more of it:
    <cut>
    #IAmNotAJeep MOD from V002
    if [ $JEEP -gt 0 ]
    then
        echo "Debug: JEEP=$JEEP, COUNT=$COUNT, RESTART=$RESTART"
        # Begin Stubo Mod
        # Look for no miner screen and get right to miner restart
        if [[ `screen -ls |grep miner |wc -l` -eq 0 ]]
        then
            COUNT=0
            echo "Found no miner, jumping to 3main restart" | tee -a ${LOG_FILE} ${ALERT_LOG_FILE}
        fi
        # End Stubo Mod
        if [ $COUNT -le 0 ]
        then
            INTERNET_IS_GO=0
    <cut>
With this in place, an 8 GPU rig miner restart by the watchdog is reduced from 8 minutes to ~ 10 seconds.
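The 8-minute figure follows from the numbers given earlier in the thread (countdown of 6 per GPU, one loop every 10 seconds). A tiny sketch of that arithmetic, with a made-up function name, assuming the countdown is decremented on every loop:

```shell
#!/usr/bin/env bash
# Seconds the stock countdown takes before a restart is attempted.
# Figures from the thread: COUNT starts at 6 per GPU, loop interval is 10 seconds.
watchdog_delay_seconds() {
    local gpus=$1
    local checks_per_gpu=6   # initial COUNT is 6 * number of GPUs
    local loop_seconds=10    # watchdog loop interval
    echo $(( gpus * checks_per_gpu * loop_seconds ))
}

echo "8 GPUs: $(watchdog_delay_seconds 8) seconds"
```

For 8 GPUs that is 480 seconds, i.e. the 8 minutes mentioned above, versus roughly one 10-second loop when the missing miner screen is detected directly.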
|
|
|
|
papampi
Full Member
Offline
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
|
|
November 08, 2017, 10:06:59 PM |
|
Thanks, mate. Can you please have a look at ~/z_papampi_versions/wtm-miner and see why it won't exit when done? I tried with done, exit, exit 0, ... and it stays running in the background.
|
|
|
|
papampi
Full Member
Offline
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
|
|
November 08, 2017, 10:11:12 PM Last edit: November 08, 2017, 10:22:12 PM by papampi |
|
Can I ask what this line is for?
    echo "Debug: JEEP=$JEEP, COUNT=$COUNT, RESTART=$RESTART"
Update: Got it after adding your edits. Thanks a lot for the edit.
|
|
|
|
walliskmine
Newbie
Offline
Activity: 4
Merit: 0
|
|
November 08, 2017, 10:13:56 PM |
|
Sorry, if you want a laugh, read on; many dumb questions will follow. How do I get the nicehash profit switching to work? I think I am using an earlier version of nvOC. I'm not sure which one, but it won't update, and for sure it's not v0019-1.4. This is the top part of 1bash:
***************
# XMR SIGT ZPOOL_SKUNK UBQ ONION
# DMD GRS ZPOOL_LYRA2V2 ZPOOL_BLAKE2S
# ZEC ZCOIN HUSH ZEN ZCL
# NICE_ETHASH ETH MUSIC ETC EXP DCR PASC
# MONA VTC DGB SIA FTC LBC
# DUAL_ETC_DCR DUAL_ETC_PASC DUAL_ETC_LBC DUAL_ETC_SC
# DUAL_EXP_DCR DUAL_EXP_PASC DUAL_EXP_LBC DUAL_EXP_SC
# DUAL_ETH_DCR DUAL_ETH_PASC DUAL_ETH_LBC DUAL_ETH_SC
# DUAL_MUSIC_DCR DUAL_MUSIC_PASC DUAL_MUSIC_LBC DUAL_MUSIC_SC
# SALFTER_NICEHASH_PROFIT_SWITCHING
# SALFTER_MPH_PROFIT_SWITCHING
COIN="FTC"
Maxximus007_AUTO_TEMPERATURE_CONTROL="YES"
IAmNotAJeep_and_Maxximus007_WATCHDOG="YES"
************************
I've downloaded v0019-1.4, but how do I go about installing it? Is it just a case of unpacking it in the home directory?
|
|
|
|
damNmad
Full Member
Offline
Activity: 378
Merit: 104
nvOC forever
|
|
November 08, 2017, 10:14:58 PM |
|
Yep, good call, mate. I agree there are still many other corner cases missing. As @leenoox says, we need to go through most parts of our code and rewrite it. It isn't the best, but it does the job more or less, and it has reached that point with the help of our early contributors. Everyone is occupied with so many things. Even though we want to spend time on this, we only have 24 hours a day and we have a life too, so it is not easy to find the time. But we can do it bit by bit, like 20-30 minutes once in a while (even weekly works), and improve it together, joining those pieces together.
Commit for nothing, deliver something, as something is always better than nothing.
I think it would be nice to gather all these points and improve it step by step. @leenoox, you also suggested some change in the code around the "bitcoin = the ground" part; that has been lost somewhere in chat. For any such things, please PM me here or on Discord. I will put everything on Discord in a locked 'to_do' channel (not leaving it open because of the volume of messages!!) or any other place; suggestions are welcome.
Thanks everyone.
|
|
|
|
papampi
Full Member
Offline
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
|
|
November 08, 2017, 10:21:35 PM Last edit: November 08, 2017, 10:38:08 PM by papampi |
|
My other suggestion for a faster miner restart is to separate the miner start lines from 3main, so the watchdog only starts the miner and not 3main.
|
|
|
|
damNmad
Full Member
Offline
Activity: 378
Merit: 104
nvOC forever
|
|
November 08, 2017, 10:21:54 PM |
|
Set your coin to this:
    COIN="SALFTER_NICEHASH_PROFIT_SWITCHING"
Make sure you have added your BTC_ADDRESS in the necessary places. If you are not sure, search for 'BTC_ADDRESS' in 1bash; somewhere in the middle you will see this:
    BTC_ADDRESS="replace_with_your_BTC_address"
Add your BTC address there; that should do the trick.
Regarding the 1.4 version, you can't update to 1.4 using 4update or other commands; you need to do a fresh install (new flash).
EDIT: About the 1.4 version: extract the zip, take another memory stick/SSD, write that image using HDDRAW (or whatever you used to write the image before), then plug and play.
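A quick way to double-check those two settings before rebooting is a small grep-based check. This is a sketch only: the function name is made up, and it assumes the `COIN=` and `BTC_ADDRESS=` lines start at the beginning of a line in the settings file, as in the excerpt posted above.

```shell
#!/usr/bin/env bash
# Print a warning for each setting that still needs attention;
# print nothing when both look fine.
# $1 = path to a 1bash-style settings file
check_nicehash_settings() {
    local file=$1
    if ! grep -q '^COIN="SALFTER_NICEHASH_PROFIT_SWITCHING"' "$file"; then
        echo "COIN is not set to SALFTER_NICEHASH_PROFIT_SWITCHING"
    fi
    if grep -q '^BTC_ADDRESS="replace_with_your_BTC_address"' "$file"; then
        echo "BTC_ADDRESS still holds the placeholder"
    fi
}
```

It would be called with the path to your own 1bash, for example `check_nicehash_settings /home/m1/1bash` (that location is an assumption; use wherever your copy lives).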
|
|
|
|
Stubo
Member
Offline
Activity: 224
Merit: 13
|
|
November 08, 2017, 11:01:13 PM |
|
Can you please have a look at the ~/z_papampi_versions/wtm-miner and see why it wont exit when done? I tried with done, exit, exit 0, .... and it stays running in the background
I ran it through beautify and read it. It is basically a stripped-down version of 3main. I don't know what your expectations are as far as it exiting, but this code block (same as in 3main) is meant to just run forever:
    BITCOIN="theGROUND"
    while [ $BITCOIN == "theGROUND" ]
    do
        sleep 60
    done
So scripts like this and 3main are meant to not "fall out". I am not overly familiar with the screen command but it would appear that the parent process of them is not 3main or as in your case wtm-miner, but rather "/sbin/upstart --user". As such, I am not sure if this infinite loop is required any longer. It may be legacy code before screen was implemented. Give it a go without it in your script and see if the miner continues to run. Hope this helps.
|
|
|
|
Stubo
Member
Offline
Activity: 224
Merit: 13
|
|
November 08, 2017, 11:23:37 PM |
|
I hate quoting myself, but here goes. I did a simple test to see if the infinite loop is necessary given the screen command. Try this for yourself: launch top in a screen session named top, then exit the ssh session and log in again. You can then resume the top session with the usual screen -r top. So, I don't think that infinite loop in 3main is needed any longer. I have not exercised all parts of nvOC, but I can tell you for certain that it is not needed after those miners that are launched with screen. Hope this helps.
|
|
|
|
papampi
Full Member
Offline
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
|
|
November 08, 2017, 11:38:24 PM |
|
Thanks a lot, mate. It was always a question for me what those loops are for. I tested the miner start script without the loops and all is good. I think, as you said, it's some legacy code from old nvOC and can be removed. And as I said before, wtm-miner is just a copy of the 3main miner start lines, so the WTM auto switch doesn't restart 3main, which takes so long, and just switches the miner.
|
|
|
|
leenoox
|
|
November 09, 2017, 03:09:32 AM |
|
I mentioned this problem a while ago. I incorporated a quick and dirty patch on my rigs to fix it and wanted to rewrite the watchdog; it is still on my TODO list. Your solution is also not the best one, but it helps. That code is there to detect a semi-freeze, when some card is acting up. However, as you noticed, it is not well written, and in some cases it takes hours before it reacts. On a few occasions I had one card freezing and pulling the whole rig to a crawl; on a 13 GPU rig it took about 3-4 hours for the watchdog to realize that it was time to restart... sigh. A quick patch was to lower the counter as well. I'll jump to it once I finish rewriting the temp control.
|
|
|
|
leenoox
|
|
November 09, 2017, 03:22:47 AM |
|
Yup, I wish I had a little bit more time to dedicate to this too. By the way, regarding the bitcoin=theground: I posted a solution for how to rewrite it on Discord right after you asked, in the same subchannel (I believe it was the oc channel), about two days ago. If you can't find it, just replace all instances of this:
    BITCOIN="theGROUND"
    while [ $BITCOIN == "theGROUND" ]
    do
with this: or, depending on your style, with one-liner
|
|
|
|
|