jyakulis
Sr. Member
Offline
Activity: 469
Merit: 250
J
|
|
September 05, 2017, 01:23:10 PM |
|
What could cause my OS/miner to freeze? I got it running, but it won't stay up for long. I'm sure my OC settings are fine because I started very low.
It looks like there are lots of possible reasons for a freeze: 1) overclock, 2) bad riser, 3) adapter cable to the riser, 4) power cable to the riser, 5) bad GPU, 6) PCIe slots on the motherboard are touching. I am not sure how to diagnose which one it is. Could it be a slow processor, or the fact that I'm running the OS on a slow USB stick? Anyway, I'm going to try putting it on an SSD and will report back if that works.
|
|
|
|
papampi
Full Member
Offline
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
|
|
September 05, 2017, 01:35:15 PM |
|
how do you scroll up in the guake terminal?
I get to the point where I'm accessing the miner however I can't navigate that window, is there no option for that
Also how can I access logs to help diagnose a problem?
I just wrote how to enable the screen log 4 posts back, here: https://bitcointalk.org/index.php?topic=1854250.msg21529172#msg21529172
|
|
|
|
Bibi187
Full Member
Offline
Activity: 420
Merit: 106
https://steemit.com/@bibi187
|
|
September 05, 2017, 01:41:56 PM Last edit: September 05, 2017, 01:52:46 PM by Bibi187 |
|
What is your problem exactly? In my opinion the miner log doesn't give much useful information and is very verbose. If you suspect a hardware failure, just run "tail -f /var/log/kern.log". If you want a log file that keeps the info, run "tail -f /var/log/kern.log >> kernlog". That gives you a clean log file: launch the command after boot and you will capture every GPU failure and its details. You can use aliases so you don't have to type it every time: edit .bashrc, go to the end and add the following (these are mine):
#Home Alias
alias Nwatch='watch -n 1 nvidia-smi'
alias Kwatch='tail -f /var/log/kern.log'
alias LOG="tail -f /var/log/kern.log >> kernlog"
alias GPUstate='export DISPLAY=:0 && nvidia-settings -q /GPUCurrentClockFreqsString'
alias MReset='pkill -e miner && bash 3main &'
alias HardReset='sudo reboot'
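A rough sketch of how the kern.log approach could be narrowed to GPU faults only. It assumes the NVIDIA driver reports faults as kernel log lines containing "NVRM: Xid"; the function names gpu_faults and count_gpu_faults are made up for this example and are not part of nvOC.

```shell
# gpu_faults [LOG_FILE]: print only the NVIDIA driver fault lines.
# Assumption: GPU faults show up as lines containing "NVRM: Xid".
# Function names and the default path are illustrative only.
gpu_faults() {
    log_file="${1:-/var/log/kern.log}"
    grep -F 'NVRM: Xid' "$log_file"
}

# count_gpu_faults [LOG_FILE]: print how many fault lines the log holds.
count_gpu_faults() {
    log_file="${1:-/var/log/kern.log}"
    grep -c -F 'NVRM: Xid' "$log_file"
}
```

Running count_gpu_faults after a freeze and a reboot gives a quick yes/no on whether the driver logged anything before the rig went down; gpu_faults then shows the lines themselves, including the PCI address of the card involved.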
|
|
|
|
jyakulis
Sr. Member
Offline
Activity: 469
Merit: 250
J
|
|
September 05, 2017, 03:00:50 PM |
|
"the root password for nvOC is:
miner1"
Ok, so does this mean all my worker passwords need to be this? Maybe this is why miningpoolhub works which doesn't check passwords for me but others don't.
|
|
|
|
damNmad
Full Member
Offline
Activity: 378
Merit: 104
nvOC forever
|
|
September 05, 2017, 03:05:14 PM |
|
Hi
Is it possible to run 12 Cards on this OS? and what is the Max what you could run without issues on Linux and Windows?
Cheers
I believe you can run more than 12 cards; I read somewhere that this OS (Linux) possibly supports up to 15, but at the moment it supports 13 for sure. This is an Ubuntu (16.x) based OS, not Windows!
|
|
|
|
damNmad
Full Member
Offline
Activity: 378
Merit: 104
nvOC forever
|
|
September 05, 2017, 03:10:03 PM |
|
"the root password for nvOC is:
miner1"
Ok, so does this mean all my worker passwords need to be this? Maybe this is why miningpoolhub works which doesn't check passwords for me but others don't.
NO. Your worker password on miningpoolhub doesn't matter, but I would suggest you use 'x'. Your miningpoolhub login name is the ADDRESS name, e.g. ETH_ADDRESS = 'your login name'. Your worker on miningpoolhub is the WORKER name, e.g. ETH_WORKER = 'your worker name'. 99% of the passwords default to 'x' in 3main, but miningpoolhub (and many other pools) doesn't care about the password as long as the login name and worker name (login.worker) are good.
|
|
|
|
damNmad
Full Member
Offline
Activity: 378
Merit: 104
nvOC forever
|
|
September 05, 2017, 03:12:00 PM |
|
Hi, I'm trying to use a SALFTER coin like nicehash and MPH but always end up at this screen
Have you checked your remote/local option? Make sure you have LOCAL mode set up. Let me know if you still see the issue.
|
|
|
|
jyakulis
Sr. Member
Offline
Activity: 469
Merit: 250
J
|
|
September 05, 2017, 03:39:24 PM |
|
"the root password for nvOC is:
miner1"
Ok, so does this mean all my worker passwords need to be this? Maybe this is why miningpoolhub works which doesn't check passwords for me but others don't.
NO. Your worker password on miningpoolhub doesn't matter, but I would suggest you use 'x'. Your miningpoolhub login name is the ADDRESS name, e.g. ETH_ADDRESS = 'your login name'. Your worker on miningpoolhub is the WORKER name, e.g. ETH_WORKER = 'your worker name'. 99% of the passwords default to 'x' in 3main, but miningpoolhub (and many other pools) doesn't care about the password as long as the login name and worker name (login.worker) are good.
What about suprnova? That's the pool I'm having trouble with, not miningpoolhub.
|
|
|
|
damNmad
Full Member
Offline
Activity: 378
Merit: 104
nvOC forever
|
|
September 05, 2017, 03:50:35 PM |
|
"the root password for nvOC is:
miner1"
Ok, so does this mean all my worker passwords need to be this? Maybe this is why miningpoolhub works which doesn't check passwords for me but others don't.
NO. Your worker password on miningpoolhub doesn't matter, but I would suggest you use 'x'. Your miningpoolhub login name is the ADDRESS name, e.g. ETH_ADDRESS = 'your login name'. Your worker on miningpoolhub is the WORKER name, e.g. ETH_WORKER = 'your worker name'. 99% of the passwords default to 'x' in 3main, but miningpoolhub (and many other pools) doesn't care about the password as long as the login name and worker name (login.worker) are good.
What about suprnova? That's the pool I'm having trouble with, not miningpoolhub.
Sorry, that's what I was trying to say: the same thing applies to suprnova too. I set my worker's password to 'x' on the portal, ADDRESS to the login name and WORKER to the worker name. Where it might be failing is the point where the address and worker name are joined, which forms 'loginName.workerName'; make sure you have '.' and not '/'. There is an option for this:
DOT_POOL_FORMAT_or_FORWARD_SLASH_POOL_FORMAT="DOT" # DOT or SLASH
Make sure it is set to 'DOT'. Try the above; if it doesn't work, tell me the coin name you are trying to mine and I can help you with that.
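To make the DOT versus SLASH difference concrete, here is a minimal sketch of how the two strings differ. The function name pool_login and its argument order are invented for illustration; this is not how 3main actually builds the string, only what the resulting login should look like.

```shell
# pool_login LOGIN WORKER [FORMAT]: join a pool login name and worker name.
# FORMAT takes the same DOT / SLASH values as the 1bash option above.
# The function itself is hypothetical and not taken from nvOC.
pool_login() {
    login="$1"
    worker="$2"
    format="${3:-DOT}"
    case "$format" in
        DOT)   printf '%s.%s\n' "$login" "$worker" ;;
        SLASH) printf '%s/%s\n' "$login" "$worker" ;;
        *)     echo "unknown format: $format" >&2; return 1 ;;
    esac
}
```

With made-up names, pool_login myLogin rig1 gives the 'loginName.workerName' form, while pool_login myLogin rig1 SLASH gives the slash form.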
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 04:49:57 PM |
|
@fullzero
please help me, howto log ethminer console on a log file
Find the ethminer line in 3main and add this to the end of it:
2>&1 | tee your_log-file_name.log
like this:
screen -dmS miner $HCD -S $UBQ_POOL -O $UBQADDR:x -U 2>&1 | tee your_log-file_name.log
I tried but the log file was empty
My guess is you will most likely need to redirect the screen to a log and not ethminer; but I haven't tried it.
What do I need to do? I will try
Try adding an "L" to the screen args like this:
screen -dmSL miner $HCD -S $UBQ_POOL -O $UBQADDR:x -U 2>&1 | tee your_log-file_name.log
That's what I do. Then a file called "screenlog.0" will be created. Keep in mind that this file grows constantly, so eventually you will have to do something about it. To watch the output in real time:
tail -f /path to screenlog.0
To clear logs you don't want to keep, enable the clear log file at reboot option in 1bash, then add your file's path to the "Clear_Logs" file.
This is probably the easiest way. I made the clear_logs bash to inhibit the runaway growth, which is very quick in some cases where a soft crash is constantly occurring. Unless I am troubleshooting a rig, I don't need logs, so I use: with my rigs.
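For anyone who wants to keep the log between reboots but still cap its size, a small sketch along these lines might work. It is not the clear_logs script from nvOC; the function name trim_log and the size limit are assumptions made for this example.

```shell
# trim_log FILE MAX_BYTES: empty FILE once it has grown past MAX_BYTES.
# Hypothetical helper, not part of nvOC.
trim_log() {
    file="$1"
    max_bytes="$2"
    [ -f "$file" ] || return 0
    # Arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -gt "$max_bytes" ]; then
        # Truncate in place instead of deleting the file.
        : > "$file"
    fi
}
```

Called periodically, for example as trim_log /path/to/screenlog.0 10485760 from a cron job or a loop, it would keep the file under roughly 10 MB at the cost of losing the older output.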
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 04:56:44 PM |
|
Hi, I just upgraded from 0018 to 0019 and my 1060 3g with the same settings (cc100, mc800, pl75, Dual ETH_SC, dcri 40) shows 2-3 Mh less in 0019 than in 0018. What is wrong? Maybe some additional settings?
Sometimes a new claymore version will do worse with specific GPUs. My guess is that's what is happening. Try changing: to
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 05:03:51 PM |
|
This is the scheme of my setup. It is a box 44 cm wide x 32 cm high, and the GPUs (1060 3GB) are set up on two levels, one on top of the other. There is a Raspberry Pi which allows me to reset the rig, power it off/on and monitor the inside temps using 5 DS18B20 sensors. 12 fans push fresh air inside the box; they are fed by an external PSU. Temps with the power limit at 92W are 57-68ºC. By the way, I had a big CPU load (5-14) when I had the power limit at 72W. Once set to 92W it went down to 1.5-2.2 (yes, I know, not good yet, but much better). I use a Celeron G3900 and am looking forward to upgrading to an i3. I post this hoping the schematic helps others like me who have their rigs in remote locations. Here are the code bits for temps, power on/off and reset.
#!/bin/bash
# Script name: poweron-poweroff.sh
# Powers the connected machine on or off
# using relay channel 1
GPIO_POWER=24
gpio -g mode $GPIO_POWER out
sleep 2
# Turns the machine on or off depending on its previous state
gpio -g write $GPIO_POWER 1
sleep 1 # Keep it activated for 1 second
gpio -g write $GPIO_POWER 0
exit
#!/bin/bash
# Script name: reset_remoto.sh
# Resets the connected machine
# using relay channel 2
GPIO_RESET=23
gpio -g mode $GPIO_RESET out
sleep 2
# Reset the machine
gpio -g write $GPIO_RESET 1
sleep 1 # Keep it pressed for 1 second
gpio -g write $GPIO_RESET 0
exit
#!/bin/bash
# Script name: show_temp_sensores.sh
# Reads the temperature from the sensors
NOMBRES_DIR=/root/nombres
# DS18B20 device directories start with the family code 28
/bin/ls /sys/bus/w1/devices/ | grep '^28' > $NOMBRES_DIR
N=$(/bin/cat $NOMBRES_DIR | wc -l)
echo "Number of sensors : " $N
for ((LINEAS=1; LINEAS <= $N ; LINEAS=LINEAS+1))
do
  NOMBRE_FILE=`/bin/sed -n -e "${LINEAS}p" $NOMBRES_DIR`
  TEMP=$(/bin/cat /sys/bus/w1/devices/$NOMBRE_FILE/w1_slave | grep t= | awk -F= '{ printf "%.2f C\n", $2/1000}')
  echo -n "Sensor$LINEAS :" $NOMBRE_FILE
  echo " - Temp " $TEMP
done
Nice automation.
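One possible refinement of the temperature script, offered only as a sketch: the w1_slave file has a first line ending in YES or NO for the CRC check and a second line carrying t= in thousandths of a degree, so a reading can be rejected when the CRC fails. The function name parse_w1_slave is made up for this example.

```shell
# parse_w1_slave: read the contents of a w1_slave file on stdin and
# print the temperature in degrees C. Returns non-zero when the CRC
# line does not end in YES or no t= value is present.
# Hypothetical helper, not part of the scripts quoted above.
parse_w1_slave() {
    awk -F't=' '
        NR == 1 && $0 !~ /YES$/ { bad = 1 }
        NR == 2 && NF == 2      { temp = $2; found = 1 }
        END {
            if (bad || !found) exit 1
            printf "%.2f C\n", temp / 1000
        }'
}
```

It would be used as parse_w1_slave < /sys/bus/w1/devices/SENSOR_ID/w1_slave, where SENSOR_ID is a placeholder for one of the 28-prefixed directory names.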
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 05:32:34 PM |
|
Forgive me if this questions has been answered already !
How can i have different worker names per system? i use 1 pastebin for all rigs.
see this section of 1bash:
# GLOBAL_WORKERNAME will use a single worker name for all coins
GLOBAL_WORKERNAME="YES" # YES NO
# HOST will use the rigs host address
# MAC will use the rigs NIC's MAC address
# CUSTOM will use your own
AUTO_WORKERNAME="CUSTOM" # HOST or MAC or CUSTOM
# if AUTO_WORKERNAME="CUSTOM"
CUSTOM_WORKERNAME="nvOC_v0019"
And choose either: or
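As an illustration of what the three modes amount to, here is a sketch of one way a per-rig name could be derived. It is not the code 1bash uses. In particular, treating HOST as the machine's hostname, reading the MAC from /sys/class/net, and the default interface eth0 are all assumptions made for this example, as is the function name worker_name.

```shell
# worker_name MODE [CUSTOM_NAME] [IFACE]: print a worker name.
# Hypothetical helper; HOST = hostname and IFACE = eth0 are assumptions.
worker_name() {
    mode="$1"
    custom="${2:-}"
    iface="${3:-eth0}"
    case "$mode" in
        HOST)   uname -n ;;
        # MAC address with the colons removed, e.g. 001122aabbcc
        MAC)    tr -d ':' < "/sys/class/net/$iface/address" ;;
        CUSTOM) printf '%s\n' "$custom" ;;
        *)      echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

The point for a shared pastebin is that HOST and MAC give a different name on every rig without editing the file per rig, while CUSTOM gives every rig the same name.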
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 05:36:01 PM |
|
Thank you so much for sharing your work.
I literally struggled all weekend (10+ hours of hair pulling frustration) trying to get 4 gtx 1070's going on Windows 10 Pro. I was considering getting another MB. Honestly, I didn't know what to think. Guys on the windows forums told me my PSU wasn't high enough (corsair 850 watt gold) or the GPU's weren't compatible in the pcie x1 slots.
I put this on a flash drive to try (not a fast one). I struggled a little bit at start up but then reread your instructions and read through this thread. I rewrote the flash drive and put my settings in and sure enough I got it going no problem.
I can do ZEC/ZCL on miningpoolhub no prob. I am having trouble doing ZEN on Suprnova. I may try another pool but for now I'll just leave it on zclassic for the night and see how it goes. I still need to fiddle with my settings and read and reread through the bash file as well to optimize a bit.
I think maybe next step would be to write nvOC to my SSD I was using for Windows 10.
Also, my MB was not listed in your OP but worked fine: Asus M5A97 r2.0. I did not have to change anything in the Bios. But I did flash the Bios with the most recent update from Asus when I was trying to get Windows to work. It only uses an old Sempron processor. This was just a board I converted from a 270x build from when I mined years back. So, glad it still worked since I tried to do this as low budget as possible. I ordered one of those gpu splitters to tinker with and will report back if I am able to use it with that board. If so, I may expand my build a bit.
A Sempron might have difficulty with additional GPUs and some ALGOs; I previously used Semprons in all my BTC / LTC GPU rigs: they were solid for BTC, but had some issues with LTC with more than 4x GPUs. I haven't used one in a long time, so I have no idea how they fare with current algos.
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 05:39:05 PM |
|
Is there any OC list for 1060 and 1070 GPUs? It seems to me that the recommendations were deleted
I removed them from the OP, as different OC settings are better for different ALGOs and I thought it was misleading to only have Equihash-optimized OC settings listed. Many members can give you good recommendations; what COIN are you mining / what model 1060s / 1070s?
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 05:41:59 PM |
|
how do you scroll up in the guake terminal?
press the up key on your keyboard
I get to the point where I'm accessing the miner however I can't navigate that window, is there no option for that
Not sure what you're asking here.
Also how can I access logs to help diagnose a problem?
change: to
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 05:45:22 PM |
|
I edited this on my previous reply while you were replying, not sure if you noticed:
Oh, one more thing... On the Asrock 13 GPU board, the PCIe slots are so close together that the small riser boards that plug into the motherboard might touch each other and create short circuits. What I did: I cut the excess length off the soldered pins of the USB connector on the back side of that small board, then used heat-shrink tube (1.5 inch wide) to wrap the small boards where they could touch each other, making sure the pins that plug into the PCIe slots are not covered. I would suggest that everyone using the Asrock 13 GPU board do this to prevent problems... the heat-shrink tube is about 5-6 bucks and it's quite long; I don't remember exactly, but it was at least 4 feet long. I got it from Sayal in Toronto, but I am sure you can find it in electronics shops or even Home Depot.
I had put heat resistant electrical tape on the back side of each of the small riser parts that plug into those slots (not initially, but as I have gone through this process). Would that have a similar impact to what you describe?
yes, as long as they are insulated it is fine. The heat-shrink is a more permanent and cleaner solution
I had this issue prevent a rig from booting. The risers I was using had unusually long solder points; I bent these flat against the connectors and it resolved the short. I like your solution leenoox.
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 05:47:58 PM |
|
Ok - found an issue.
I've tried setting "manual fan speed" and played around with the Maxximus auto fan control - but some of my cards just don't want to run the fan speed I set in 1bash.
When I was running off 1 PSU with 8 cards it ran perfectly.
I just added a second PSU to my ASROCK BTC PRO board to run 13 cards, and 4 of the 13 cards won't adjust their fan speed correctly.
EDIT: With every restart it's getting less and less stable - fan speeds are NOT changing on the majority of cards now.
I recommend reimaging; I am not sure why yet; but I have had an unusually high number of bad images with v0019.
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 05:56:08 PM |
|
I just placed an order for some g4600's. I will see how that works - the rig that had been crashing every couple hours crashed after 24 hours after I turned off teamviewer - hopefully this puts me over the top.
As for power, both are built the same way. Each is running on 2 750w EVGA G3 power supplies. Each SATA/molex power connector is running no more than 2 GPU risers. The molex connectors (and single SATA) hooked to the board each share with the riser for a single GPU (this means 13 GPU's, 2 motherboard molex, and 1 motherboard SATA to connect. The PSU's each have 3 SATA and 1 Perif (molex) connector, hence my using two connections per cable). I did have a 550w PSU in the mix (as a third) to see if that would change things, but it did not.
The only other discernible difference is that not all GPUs are the same brand/make (but all are 1060 6GB), and the risers are not all identical (though I have switched them out in troubleshooting).
Good, the G4600 is a 2 core, 4 thread CPU. That will definitely help; it is much better than the Celerons you have. As for the mixed GPUs, try to put as many of the same brand/model in the same rig. Then you will have to manually set the overclock for each GPU. Don't use the global OC for all. Start with the lowest stable value for all (I believe it was about 600 memory for you), then increase memory by +50 on one brand/model and see if it's stable, then try +50 on a different model and see if it's stable, then repeat until you get the max for all models. It will take a while to fine-tune; that's the downside of mixed cards. Good luck and keep us posted.
Yesterday I swapped both Celeron G3930's for G4600's. Overnight, the stable rig remained stable, the unstable rig crashed again. Short of replacing all my risers again (and I am still open to specifics on what version to get) or RMA'ing the motherboard, is there anything else I might be missing?
JudoFlash, have you tried my previous suggestion of manually overclocking each card instead of using the global settings, since you have a mix of different cards? If that doesn't work, disconnect one card completely, then run the rig and see if it's stable. If it still crashes, connect that card back and disconnect another card completely... repeat until you pinpoint the card that misbehaves. Once you have pinpointed the troublemaker, troubleshoot further: replace the riser; replace the adapter cable to the riser (depending on your risers, a PCIe to SATA or Molex to SATA cable); plug the riser power into another connector of the SATA cable coming from the power supply; lastly, significantly lower the overclock values for that card. If the previous steps didn't help, you have a bad GPU; replace it. I've had problems with a bad riser, a bad power adapter cable, a bad SATA connector and a bad GPU before. It's not easy to troubleshoot when the problem is intermittent, but at least with the above suggestion (disconnecting one card at a time completely) you can narrow down the problematic section and then troubleshoot further. BTW, I am using version 006-c risers with PCIe to SATA power adapter cables.
Thank you. Yes I have tried the manual tuning. Right now I have everything running at stock as sort of a "last ditch" effort on making this configuration work (or at least narrowing it down). As you said, the challenge is that it does not crash right away (sometimes it takes 24 hours or so), so with so many variables, it takes a long time to pinpoint. I ordered some new risers (I would need them anyway as I plan to set up another rig or two), so I will do some further testing with those once I have them. I appreciate everyone's input. Yes, I know I could just stand up another board and not have so much riding on this one, but I feel longer term, finding the issue will help my overall success. Plus, folks on this forum seem to be making things work with this board and similar setups. Also, I'm stubborn.
I have been making / transitioning a lot of rigs lately. On some of them I used risers a friend had bought from a seller on amazon; out of 39 risers 4 were bad. This is a higher failure rate than I am used to with the v006+ 6pin power risers. Near the white latch lock on the back of the riser there is usually a word, which I believe denotes the factory where the riser was made. Some of these are better than others; some risers also have no word at all. It would be helpful if resellers provided this information when making listings / selling risers.
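The "+50 at a time, one model at a time" procedure could be scripted roughly as below. This is only a sketch that prints the nvidia-settings commands rather than running them, so each step can be applied by hand and watched for stability. The function name mem_offset_cmds is invented, and the performance level index [3] is an assumption that may differ between cards.

```shell
# mem_offset_cmds GPU START STEP MAX: print one nvidia-settings command
# per memory offset step for a single GPU (dry run, nothing is applied).
# Hypothetical helper; the perf level index [3] is an assumption.
mem_offset_cmds() {
    gpu="$1"
    offset="$2"
    step="$3"
    max="$4"
    # Refuse a non-positive step so the loop always terminates.
    [ "$step" -gt 0 ] || return 1
    while [ "$offset" -le "$max" ]; do
        printf 'nvidia-settings -a "[gpu:%s]/GPUMemoryTransferRateOffset[3]=%s"\n' "$gpu" "$offset"
        offset=$((offset + step))
    done
}
```

For example, mem_offset_cmds 2 600 50 700 lists the commands for offsets 600, 650 and 700 on GPU 2; applying one, letting the rig run, and only then moving to the next keeps a crash attributable to a single change.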
|
|
|
|
fullzero (OP)
Legendary
Offline
Activity: 1260
Merit: 1009
|
|
September 05, 2017, 05:57:54 PM |
|
Setting target temp does not affect Maxximus
6_autotemplog :
GPU 0, Target temp: 70, Current: 69, Diff: 1, Fan: 100, Power: 131.56
GPU 1, Target temp: 70, Current: 69, Diff: 1, Fan: 75, Power: 128.21
GPU 2, Target temp: 70, Current: 67, Diff: 3, Fan: 65, Power: 128.31
GPU 3, Target temp: 70, Current: 70, Diff: 0, Fan: 75, Power: 129.51
GPU 4, Target temp: 70, Current: 69, Diff: 1, Fan: 85, Power: 130.39
GPU 5, Target temp: 70, Current: 70, Diff: 0, Fan: 65, Power: 130.88
1bash :
TARGET_TEMP=75
__FAN_ADJUST=5 # Adjustment size in percent
POWER_ADJUST=5 # Adjustment size in watts
# Difference in actual temperature allowed before action: Works only if current is BELOW target temp
ALLOWED_TEMP_DIFF=3
# Restore original power limit if fan speed is lower than this percentage
RESTORE_POWER_LIMIT=90
# lowest fan speed that will be used
MINIMAL_FAN_SPEED=65
Edit 1 : It only reads from the individual target temps, even if it is set to NO
Changing the individual temps changes it :
GPU 0, Target temp: 75, Current: 74, Diff: 1, Fan: 80, Power: 133.05
GPU 1, Target temp: 75, Current: 73, Diff: 2, Fan: 65, Power: 129.05
GPU 2, Target temp: 75, Current: 68, Diff: 7, Fan: 65, Power: 129.48
GPU 3, Target temp: 75, Current: 73, Diff: 2, Fan: 65, Power: 131.14
GPU 4, Target temp: 75, Current: 75, Diff: 0, Fan: 70, Power: 132.09
GPU 5, Target temp: 75, Current: 71, Diff: 4, Fan: 65, Power: 130.52
1bash :
INDIVIDUAL_TARGET_TEMPS="NO" # YES NO
# Set individual target temps here if INDIVIDUAL_TARGET_TEMPS="YES"
TARGET_TEMP_0=75
TARGET_TEMP_1=75
TARGET_TEMP_2=75
TARGET_TEMP_3=75
TARGET_TEMP_4=75
TARGET_TEMP_5=75
TARGET_TEMP_6=75
Edit 2 : Acting strange, it reads all targets from TARGET_TEMP_0=76
More tests :
1bash :
TARGET_TEMP=74
__FAN_ADJUST=5 # Adjustment size in percent
POWER_ADJUST=5 # Adjustment size in watts
# Difference in actual temperature allowed before action: Works only if current is BELOW target temp
ALLOWED_TEMP_DIFF=3
# Restore original power limit if fan speed is lower than this percentage
RESTORE_POWER_LIMIT=90
# lowest fan speed that will be used
MINIMAL_FAN_SPEED=65
INDIVIDUAL_TARGET_TEMPS="NO" # YES NO
# Set individual target temps here if INDIVIDUAL_TARGET_TEMPS="YES"
TARGET_TEMP_0=76
TARGET_TEMP_1=70
TARGET_TEMP_2=70
TARGET_TEMP_3=70
TARGET_TEMP_4=75
TARGET_TEMP_5=70
==> 6_autotemplog <==
GPU 0, Target temp: 76, Current: 72, Diff: 4, Fan: 65, Power: 130.84
GPU 1, Target temp: 76, Current: 67, Diff: 9, Fan: 65, Power: 126.78
GPU 2, Target temp: 76, Current: 63, Diff: 13, Fan: 65, Power: 133.73
GPU 3, Target temp: 76, Current: 67, Diff: 9, Fan: 65, Power: 118.40
GPU 4, Target temp: 76, Current: 70, Diff: 6, Fan: 65, Power: 130.45
GPU 5, Target temp: 76, Current: 65, Diff: 11, Fan: 65, Power: 129.71
Edit 3 : Individual target temp works as it should
1bash :
INDIVIDUAL_TARGET_TEMPS="YES" # YES NO
# Set individual target temps here if INDIVIDUAL_TARGET_TEMPS="YES"
TARGET_TEMP_0=76
TARGET_TEMP_1=70
TARGET_TEMP_2=70
TARGET_TEMP_3=70
TARGET_TEMP_4=75
TARGET_TEMP_5=70
==> 6_autotemplog <==
GPU 0, Target temp: 76, Current: 66, Diff: 10, Fan: 70, Power: 111.09
GPU 1, Target temp: 70, Current: 65, Diff: 5, Fan: 65, Power: 121.22
GPU 2, Target temp: 70, Current: 60, Diff: 10, Fan: 65, Power: 119.53
GPU 3, Target temp: 70, Current: 66, Diff: 4, Fan: 65, Power: 118.83
GPU 4, Target temp: 75, Current: 67, Diff: 8, Fan: 65, Power: 118.53
GPU 5, Target temp: 70, Current: 62, Diff: 8, Fan: 65, Power: 118.99
Can anyone else test and clarify this please?
Just loaded up the new v0019 onto my rig and works AWESOME!
Only bug I could find was that the Maxximus script didn't seem to work unless I selected "YES" on setting individual card temps. But no biggie! Love it! Thanks fullzero
I will look into this; it is probably an error I made when modifying Maxximus007_AUTO_TEMPERATURE_CONTROL to work with lost_post's method for v0019.
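For reference while this gets tested, the behaviour the 1bash comments describe could be expressed as the sketch below: the global TARGET_TEMP applies unless INDIVIDUAL_TARGET_TEMPS is "YES", in which case TARGET_TEMP_n is used for GPU n. This is not the actual Maxximus007_AUTO_TEMPERATURE_CONTROL code, and the function name target_temp_for is made up.

```shell
# target_temp_for GPU_INDEX: print the target temp that should apply.
# Uses the 1bash variable names quoted above; the function itself is a
# hypothetical illustration, not the nvOC implementation.
target_temp_for() {
    gpu="$1"
    # Accept only a plain non-negative integer index.
    case "$gpu" in
        ''|*[!0-9]*) return 1 ;;
    esac
    if [ "${INDIVIDUAL_TARGET_TEMPS:-NO}" = "YES" ]; then
        # Per-GPU value, falling back to the global one when unset.
        eval "temp=\${TARGET_TEMP_$gpu:-$TARGET_TEMP}"
    else
        temp="$TARGET_TEMP"
    fi
    printf '%s\n' "$temp"
}
```

With the "More tests" settings from the report (TARGET_TEMP=74, INDIVIDUAL_TARGET_TEMPS="NO"), this logic would give 74 for every GPU, which is what the log would be expected to show instead of 76.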
|
|
|
|
|