Bitcoin Forum
Author Topic: [OS] nvOC easy-to-use Linux Nvidia Mining  (Read 417927 times)
gyoztes
Newbie
*
Offline Offline

Activity: 16
Merit: 0


View Profile
July 19, 2017, 04:04:42 PM
 #2001

To continue the above: if I set it by hand, it is OK, BUT only for 1-2 minutes, and then something changes it back to 125!
Maxximus007
Full Member
***
Offline Offline

Activity: 153
Merit: 100


View Profile
July 19, 2017, 04:14:10 PM
 #2002

Are you using the new V0018? Did you make a new img, or just use a new oneBash? What version did you have before? If you used V0017 or before, perhaps you did not change the autotemp file? The first version had its own power limits set at 125W.

Can you please try the following:
SSH into your rig or open a guake terminal, and enter:
Code:
sudo nvidia-smi -pl 60
nvidia-smi will give you output, probably telling you that the power limits are now set to 60 W. If that's the case, at least you know that setting power limits is possible, and the cause is perhaps the above.
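
To see whether a limit set this way actually sticks, the enforced limit can be read back per GPU. The following is a minimal sketch, not part of nvOC: check_power_limits is a hypothetical helper name, and it assumes the CSV rows come from nvidia-smi's query mode in the form "index, watts".

```shell
#!/usr/bin/env bash
# check_power_limits EXPECTED_WATTS   (hypothetical helper, not part of nvOC)
# Reads "index, power.limit" CSV rows on stdin, as produced by
#   nvidia-smi --query-gpu=index,power.limit --format=csv,noheader,nounits
# and prints every GPU whose enforced limit differs from EXPECTED_WATTS.
check_power_limits() {
    local expected="$1"
    awk -F', *' -v want="$expected" '
        # $1 = GPU index, $2 = enforced power limit in watts (e.g. 125.00)
        ($2 + 0) != (want + 0) {
            printf "GPU %s: %s W (expected %s W)\n", $1, $2 + 0, want
        }
    '
}

# Example, on a rig with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=index,power.limit --format=csv,noheader,nounits | check_power_limits 100
```

Repeating the query every few seconds while the miner starts would show the moment at which a limit is changed back.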

Thank you for the fast reply! Yes, I use the newest v18 with the new 1bash. If I set what you mentioned before, it works well. (I set it to 100 because I use 1070 cards.)

My experience: if I set it in the 1bash and monitor it in another terminal, it works UNTIL the DAG loading ends! After that, it changes to 125.

In the 1bash it is on line 522, and (as far as I can see) there is no setting of this parameter after that line.

I set all of the individual card settings to 100 too, but set individual_powerlimit to NO on line 133, and set all the power limits starting from line 243 to 100 too (for nicehash, but I do not use it yet).

I tried switching the temperature control on and off on line 51, but it does not change the problem.

What is your next idea?

Thank you!
Okay, so you are using the autotemp, but did you set INDIVIDUAL_POWERLIMIT="NO"?
This can cause a problem: for autotemp to work correctly, it needs this set to YES.
OverEasy
Sr. Member
****
Offline Offline

Activity: 301
Merit: 251


View Profile
July 19, 2017, 04:25:51 PM
 #2003

For SSH users of the new 1bash: you can no longer start things up by using bash 2unix, as all the pastebin stuff is commented out.
You must now start up by SSHing into the rig and running 1bash directly. Fullzero has placed the pastebin stuff in 1bash, so make sure you first edit 1bash to set it to remote and put your pastebin info in it.

For me personally, the new v18.1 does not work. Many, many errors. I can't afford to have the rig down to diagnose it right now, so I am back to V.17, which works flawlessly for my purposes.

I would like to suggest making version 17 the "base" build, as it works so well, and incorporating all the features as "modules" (separate programs) instead of adding all the code to the 1bash file.

The file is getting so big it is hard to diagnose stuff. Just one noob's opinion lol...
Yes, the 2unix has the pastebin out, and it's indeed quite handy to have a fitting 1bash directly in there. As fullzero already said, it will be modularized and optimized pretty soon, once it's on github. Currently there is just too much in one file, which makes it error-prone. I use the new V0018 as a template and remove the parts I don't use.

Perhaps the first step should be separating the variables from the code with something like: source myvariables

It's still not a simple task for fullzero: there are many wishes, and for instance the overclock is different per miner/coin, etc.

TBH, I don't care too much about having all the possible (obscure) coins in it; if you want one, edit it in yourself (for now). Once it's split up into modules, it will be way easier to add additional coins.
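
The "source myvariables" idea could look roughly like the sketch below. It is only an illustration under assumptions: load_settings and the variable names in the usage comment are hypothetical, and nothing here reflects how 1bash is actually organised.

```shell
#!/usr/bin/env bash
# load_settings FILE VAR...   (hypothetical helper, not part of nvOC)
# Sources FILE, then checks that every named variable ended up non-empty,
# so a typo in the settings file is caught before any mining code runs.
load_settings() {
    local file="$1"; shift
    if [ ! -r "$file" ]; then
        echo "settings file not readable: $file" >&2
        return 1
    fi
    # shellcheck source=/dev/null
    source "$file"
    local name
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable named by $name
        if [ -z "${!name:-}" ]; then
            echo "missing setting: $name" >&2
            return 1
        fi
    done
}

# Example (file name and variable names are made up for illustration):
#   load_settings "$HOME/myvariables" COIN POWERLIMIT_WATTS || exit 1
```

With the settings in their own file, the code file can be replaced by an update without touching anyone's wallet addresses or power limits.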

I agree completely. Poor Fullzero is working overtime on this and giving it for free. I sure hope everyone is giving him some hash every once in a while. Hey Fullzero you should add your ETH or whatever you want to the code right at the top and comment it out. Most folks just open 1bash and delete yours and paste their address in. You are acknowledging everyone but yourself!
gyoztes
Newbie
*
Offline Offline

Activity: 16
Merit: 0


View Profile
July 19, 2017, 04:29:03 PM
 #2004

Okay, so you are using the autotemp, but did set INDIVIDUAL_POWERLIMIT="NO" ?
This can give a problem, to set autotemp correctly it needs this set to YES.

I enabled the temp control on line 51, set the individual setting to yes on line 145, and set all the lines below it to 100. I set all the target temps to 60 (this is the old v17 measured value) and set restore to 20 on line 237.

Question: what does original power limit mean? The card's default 125? Or powerlimit_watts on line 74?

Thank you!
Maxximus007
Full Member
***
Offline Offline

Activity: 153
Merit: 100


View Profile
July 19, 2017, 05:07:16 PM
 #2005

Okay, so you are using the autotemp, but did set INDIVIDUAL_POWERLIMIT="NO" ?
This can give a problem, to set autotemp correctly it needs this set to YES.

I set the temp control to set in line 51 and set the individual to yes in line 145 I set all the lines below to 100. I set all the target temp to 60 (this is the old v17 measured value) nand set restore to 20 in line 237.

Question: what does orginal power limit means? The card's default 125? Or powerlimit_watts in line 74?

Thank you!
The original power limit is the one read from the variables set in 1bash. You can see in the autotemp file: echo "INDIVIDUAL_POWERLIMIT_0:  ${POWER_LIMIT[0]}". That value is coming from the 1bash file, for instance INDIVIDUAL_POWERLIMIT_0=100.

Please run the autotemp script in a terminal and read the output.
Code:
/home/m1/Maxximus007_AUTO_TEMPERATURE_CONTROL
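
As an illustration of how numbered settings such as INDIVIDUAL_POWERLIMIT_0 can end up in an array like POWER_LIMIT, here is a minimal sketch. The helper collect_power_limits and the example wattages are assumptions; the real autotemp script may build its array differently.

```shell
#!/usr/bin/env bash
# Values as they might appear in 1bash (example numbers only)
INDIVIDUAL_POWERLIMIT_0=100
INDIVIDUAL_POWERLIMIT_1=100
INDIVIDUAL_POWERLIMIT_2=140

# collect_power_limits GPU_COUNT   (hypothetical helper)
# Fills the global POWER_LIMIT array from INDIVIDUAL_POWERLIMIT_<n>.
collect_power_limits() {
    local count="$1" i name
    POWER_LIMIT=()
    for ((i = 0; i < count; i++)); do
        name="INDIVIDUAL_POWERLIMIT_$i"
        POWER_LIMIT[i]="${!name}"   # indirect expansion: value of the variable named in $name
    done
}

collect_power_limits 3
echo "INDIVIDUAL_POWERLIMIT_0:  ${POWER_LIMIT[0]}"
```

The last line prints the same kind of message quoted above, which makes it easy to compare what the script read against what was typed into 1bash.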
Maxximus007
Full Member
***
Offline Offline

Activity: 153
Merit: 100


View Profile
July 19, 2017, 05:12:07 PM
 #2006

SSH users and new 1bash. You can no longer start things up by using bash 2unix as all the pastebin stuff is commented out.
You must now start up by SSH in to rig and run 1bash directly. Fullzero has placed the pastebin stuff in 1bash so make sure you edit 1bash first to remote and put your pastebin info in it.

For me personally new v18.1 does not work. Many many errors. Can't afford to have rig down to diagnosis right now so I am back to V.17 that works flawlessly for my purposes.

I would like to suggest making version 17 the "base" build as it works so well and incorporate all the features in as "modules" (separate programs), instead of adding all the code to the 1bash file.

File is getting so big it is hard to diagnosis stuff. Just one noobs opinion lol...
Yes, the 2unix has the pastebin out, and it's indeed quite handy in there to have a fitting 1bash directly. As fullzero already said, it will be modularized and optimized pretty soon, once it's on github. Currently there is just too much in one file, and makes it error prone. I use the new V0018 as a template, and removing parts I don't use.

Perhaps the first step should be separating the variables from the code with something like: source myvariables

It's still not a simple task for fullzero: There are many wishes, and for instance overclock is different per miner/coin etc..

TBH, I don't care too much to have all the possible (obscure) coins in it, if you want it, edit yourself (for now). Once it's split up in modules it's way easier to add additional coins.

I agree completely. Poor Fullzero is working overtime on this and giving it for free. I sure hope everyone is giving him some hash every once in a while. Hey Fullzero you should add your ETH or whatever you want to the code right at the top and comment it out. Most folks just open 1bash and delete yours and paste their address in. You are acknowledging everyone but yourself!
After reading the code: the new upPASTE will automatically update the default 1bash. So just change these lines in 1bash BEFORE booting (in the windows partition):
Code:
_Parallax_MODE="NO"             # YES NO

pasteBASH="np9FSHew"

upPASTE_TIMEOUT_IN_MINUTES=30
And your rig will update 1bash within 30 minutes. So you will send a few hashes to fullzero, not a big problem I believe..
lance04
Full Member
***
Offline Offline

Activity: 462
Merit: 112



View Profile
July 19, 2017, 05:36:23 PM
 #2007

Which image for MSI Gaming 5 motherboard to support 7 GPU?  Not sure if it's an issue with the image or something else but it wouldn't boot up even for the BIOS screen with the GPUs and NVOS flash drive connected when using the TB85 motherboard image.

 

For an MSI Z170-A GAMING M5 use this image:


Currently each image is unique so I can only ensure they will work for the mobo listed to support.

Can I use another mobo with 6 PCIe slots?
Any brand?
fullzero (OP)
Legendary
*
Offline Offline

Activity: 1260
Merit: 1009



View Profile
July 19, 2017, 11:14:30 PM
 #2008

My configuration:
v0018
Biostar TB250-BTC PRO + 12 Zotac P106-100 cards (without output).
When I run it with LOCAL (GT 730 for monitor + 7 P106-100 cards) I see it works.
But when I remove GT 730 adapter and monitor and attach all 12 P106-100 cards and use REMOTE and connect by SSH it doesn't seem to be working.
I tried to run it manually but the OS was rebooted with Xorg error.
Any ideas how to fix it?

P.S. I tried new 1bash - still the same issue.

Code:
m1@m1-desktop:~$ pkill -e miner
m1@m1-desktop:~$ export DISPLAY=:0
m1@m1-desktop:~$ screen -r miner
There is no screen to be resumed matching miner.
m1@m1-desktop:~$ bash /home/m1/1bash


workername: nv045

Xorg PROBLEM DETECTED

Restoring Xorg

Rebooting in 5

First: ensure you have made the 2x bios changes as indicated in the OP for this mobo, and saved / restarted as directed.  If you have made additional bios changes, then you should restore the default settings and perform the procedure in the OP.

Second, while troubleshooting, I recommend attaching the GPU with output to the primary 16x slot and using 11 of the mining GPUs in the other slots.  Run in local mode.

If you have significantly changed the GPU configuration, especially in regard to the primary GPU, it is likely the system will need to restore the xorg and reboot.  If it does this once, that is expected; if it does this in a loop (i.e. multiple times in a row), there is a problem.

Let me know how this goes.

PS: I highly recommend using the ASRock 13x mobo to get an easy, out-of-the-box setup.  If I was having a lot of trouble with this mobo, I would get one of the ASRock boards and then return the Biostar once I had the rig running with 13x.


fullzero (OP)
Legendary
*
Offline Offline

Activity: 1260
Merit: 1009



View Profile
July 19, 2017, 11:15:57 PM
 #2009

I wrote some code that checks for any incoming messages you send to the Telegram bot and answers you.
Save it as a separate file and run it at startup.
Quote
source ~/wallets      # import wallets
source ~/settings.sh  # import settings (TELEGRAM_API and TELEGRAM_CHAT must be defined by the sourced files)

cd /tmp || exit 1
while true
do
  # Fetch the latest updates for the bot (saved as ./getUpdates)
  rm -f getUpdates
  wget -q -O getUpdates "https://api.telegram.org/bot$TELEGRAM_API/getUpdates"

  # Last message text and its timestamp for our chat, cut out of the raw JSON
  INCOME=$(grep "$TELEGRAM_CHAT" ./getUpdates | tail -1 | awk -F ":" '{print $13}' | cut -d \" -f 2)
  INCOME_TIME=$(grep "$TELEGRAM_CHAT" ./getUpdates | tail -1 | awk -F ":" '{print $12}' | cut -c -10)

  LAST_INCOME_TIME=$(cat /home/m1/last_inc_time)

  if [ "$INCOME_TIME" != "$LAST_INCOME_TIME" ]
  then
    if [[ $INCOME == "State" || $INCOME == "state" || $INCOME == "STATE" ]]
    then
      echo state of rig
      ~/mail.sh 9
    else
      echo invalid msg!
    fi
    # The first time, you must create this file yourself; put any number inside.
    echo "$INCOME_TIME" > /home/m1/last_inc_time
  else
    echo no new messages!
  fi
  sleep 5
done

To change or add a new message: $INCOME is the text of the message; ~/mail.sh 9 is the command to run.
Quote
if [[ $INCOME == "State" || $INCOME == "state" || $INCOME == "STATE" ]]
then
  echo state of rig
  ~/mail.sh 9
else
  echo invalid msg!
fi
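
When more commands are added, the three-way comparison for each spelling gets long. One possible alternative, sketched here under assumptions (handle_message is a hypothetical name and this is not the poster's code), is to lowercase the text once and dispatch with case:

```shell
#!/usr/bin/env bash
# handle_message TEXT   (hypothetical helper)
# Prints the action chosen for an incoming message; matching ignores case.
handle_message() {
    local text="${1,,}"          # lowercase the message (needs bash 4 or newer)
    case "$text" in
        state)
            echo "state of rig"  # the original script also runs ~/mail.sh 9 at this point
            ;;
        # further commands would get their own branch here
        *)
            echo "invalid msg!"
            ;;
    esac
}
```

Called as handle_message "$INCOME", this treats "State", "state" and "STATE" the same way, and a new command only needs one extra branch.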

Please add this to nvOC.

I will add this to my update stack.
fullzero (OP)
Legendary
*
Offline Offline

Activity: 1260
Merit: 1009



View Profile
July 19, 2017, 11:19:58 PM
 #2010

Thank you very much for making such a wonderful OS.

Does nvOC v0018 write a log file when the miner restarts after a GPU soft crash?

In the v0018 1bash it does; in this updated version it only logs restarts.  This is because logging slightly decreases stability when using USB keys.  I will make watchdog logs a YES/NO option for the next 1bash.  For now you can open the watchdog file:

Code:
IAmNotAJeep_and_Maxximus007_WATCHDOG

go to line 86:

Code:
kill $target #| tee -a ${LOG_FILE}

and remove the # so it reads:

Code:
kill $target | tee -a ${LOG_FILE}

and it will log soft crashes.
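
Since kill itself normally prints nothing on success, a line written just before the kill can make the log easier to read. This is a minimal sketch, assuming a hypothetical helper log_event; it is not part of the watchdog file.

```shell
#!/usr/bin/env bash
# log_event LOG_FILE MESSAGE...   (hypothetical helper, not part of the watchdog)
# Appends one timestamped line to LOG_FILE and also shows it on the terminal.
log_event() {
    local log_file="$1"; shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" | tee -a "$log_file"
}

# Example use next to the kill line (LOG_FILE and target as in the watchdog):
#   log_event "$LOG_FILE" "soft crash detected, killing miner PID $target"
#   kill $target
```

Each call adds a single line, so the extra writes to a USB key stay small.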
fullzero (OP)
Legendary
*
Offline Offline

Activity: 1260
Merit: 1009



View Profile
July 19, 2017, 11:21:05 PM
 #2011

Okay, so you are using the autotemp, but did set INDIVIDUAL_POWERLIMIT="NO" ?
This can give a problem, to set autotemp correctly it needs this set to YES.

I set the temp control to set in line 51 and set the individual to yes in line 145 I set all the lines below to 100. I set all the target temp to 60 (this is the old v17 measured value) nand set restore to 20 in line 237.

Question: what does orginal power limit means? The card's default 125? Or powerlimit_watts in line 74?

Thank you!
The original power limit is the one read from the variables set in 1bash. You can see in the autotemp file: echo "INDIVIDUAL_POWERLIMIT_0:  ${POWER_LIMIT[0]}". That value is coming from the 1bash file, for instance INDIVIDUAL_POWERLIMIT_0=100.

Please run the autotemp script in a terminal and read the output.
Code:
/home/m1/Maxximus007_AUTO_TEMPERATURE_CONTROL

Maxximus007, thanks for helping  Smiley
fullzero (OP)
Legendary
*
Offline Offline

Activity: 1260
Merit: 1009



View Profile
July 19, 2017, 11:27:50 PM
 #2012

SSH users and new 1bash. You can no longer start things up by using bash 2unix as all the pastebin stuff is commented out.
You must now start up by SSH in to rig and run 1bash directly. Fullzero has placed the pastebin stuff in 1bash so make sure you edit 1bash first to remote and put your pastebin info in it.

For me personally new v18.1 does not work. Many many errors. Can't afford to have rig down to diagnosis right now so I am back to V.17 that works flawlessly for my purposes.

I would like to suggest making version 17 the "base" build as it works so well and incorporate all the features in as "modules" (separate programs), instead of adding all the code to the 1bash file.

File is getting so big it is hard to diagnosis stuff. Just one noobs opinion lol...
Yes, the 2unix has the pastebin out, and it's indeed quite handy in there to have a fitting 1bash directly. As fullzero already said, it will be modularized and optimized pretty soon, once it's on github. Currently there is just too much in one file, and makes it error prone. I use the new V0018 as a template, and removing parts I don't use.

Perhaps the first step should be separating the variables from the code with something like: source myvariables

It's still not a simple task for fullzero: There are many wishes, and for instance overclock is different per miner/coin etc..

TBH, I don't care too much to have all the possible (obscure) coins in it, if you want it, edit yourself (for now). Once it's split up in modules it's way easier to add additional coins.

I agree completely. Poor Fullzero is working overtime on this and giving it for free. I sure hope everyone is giving him some hash every once in a while. Hey Fullzero you should add your ETH or whatever you want to the code right at the top and comment it out. Most folks just open 1bash and delete yours and paste their address in. You are acknowledging everyone but yourself!
After reading the code: the new upPASTE will automatically update the default 1bash. So just change these lines in 1bash BEFORE booting (in the windows partition):
Code:
_Parallax_MODE="NO"             # YES NO

pasteBASH="np9FSHew"

upPASTE_TIMEOUT_IN_MINUTES=30
And your rig will update 1bash within 30 minutes. So you will send a few hashes to fullzero, not a big problem I believe..

My bad; I should have put the timeout at the bottom of the while loop so it executes at launch.  This morning when I was testing this I was using a 1 minute timeout so I wasn't thinking about this.

To make this change (running the update at launch) open the upPASTE file and cut line 18:

Code:
sleep $TIMEOUT

You must cut it (remove the line so that line 18 is blank).

Then go to the blank line 62, just before:

Code:
done
fi

and paste:

Code:
sleep $TIMEOUT

so the bottom of the file reads:
Code:
sleep $TIMEOUT
done
fi

save
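
The effect of moving the sleep can be seen in a small stand-alone sketch. The helper run_then_wait is hypothetical and uses a fixed number of rounds for demonstration, whereas upPASTE loops indefinitely.

```shell
#!/usr/bin/env bash
# run_then_wait ROUNDS SECONDS COMMAND...   (hypothetical helper)
# Runs COMMAND first and sleeps afterwards, so the first run happens
# immediately at launch instead of after the first timeout.
run_then_wait() {
    local rounds="$1" timeout="$2" i
    shift 2
    for ((i = 0; i < rounds; i++)); do
        "$@"                 # the update step (placeholder command)
        sleep "$timeout"     # wait at the bottom of the loop
    done
}

# Example: the first "checking for update" appears at once, the second after 2 seconds.
#   run_then_wait 2 2 echo "checking for update"
```

With the sleep at the top of the loop instead, the same example would print nothing for the first 2 seconds, which is the delay described above.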

fullzero (OP)
Legendary
*
Offline Offline

Activity: 1260
Merit: 1009



View Profile
July 19, 2017, 11:33:37 PM
 #2013

Which image for MSI Gaming 5 motherboard to support 7 GPU?  Not sure if it's an issue with the image or something else but it wouldn't boot up even for the BIOS screen with the GPUs and NVOS flash drive connected when using the TB85 motherboard image.

 

For an MSI Z170-A GAMING M5 use this image:


Currently each image is unique so I can only ensure they will work for the mobo listed to support.

can use another mobo with 6 pcie ?
any brand ?

Most Intel mobos should work; however, many require bios setting changes.  In general, Z170 chipset boards are the hardest to get working with all pcie slots.

I have tested many motherboards and listed the bios settings that need to be changed, with pictures, on the OP.

I highly recommend only getting a motherboard that has already been tested.  Specifically, I would get a motherboard that works out of the box with no bios changes if possible.

Last time I checked, the old but good ASRock H81 PRO BTC (6x gpu) mobo was in stock at newegg.  I have a ton of these; even though they use an old chipset, they are rock solid and work out of the box.
TenaciousJ
Full Member
***
Offline Offline

Activity: 122
Merit: 100


View Profile
July 20, 2017, 04:57:46 AM
 #2014

Beats me Fullzero. My version of 1bash v. 18 has no way of executing the bash file Watchdog until I added it.
Just saying "yes" to the switch won't start it.

The other stuff does not work as I stated which is why I wrote my own part and edited out some stuff.

Maybe just me and my rig..shrug dunno.

I don't wanna mess up anyone with my crazy changes so I'll just keep em to myself for now unless I see others with similar issues.

This is getting big and complex. Ever consider client side program running in background and controlling stuff via a webpage?

Yes this is planned: monitor / push / update / dashboard app; keep getting sidetracked adding contributions / new coins.

The new 1bash should solve problems / start watchdog and autotemp in a screen when in remote.

Love the v0018 release and all the functionality! 

However, POWERLIMIT NIGHTMARES! 

I have one major issue: I cannot lower the POWERLIMIT.  I run 8 rigs of 1050Ti and 125W is just way too high.  I have tried adjusting the baseline and the individual POWERLIMIT settings, and I am still seeing maximum power being utilized in NVIDIA-SMI and TEMP CONTROL.  I thought maybe the TEMP CONTROL was trumping the setting, but I don't think that is the case (at least based on what my 46 year old brain and eyeballs looking at the 1bash code understand).  I thought maybe it was the correction in line 527, but that didn't change anything.

I tried "NO" for both WATCHDOG and TEMP CONTROL with POWERLIMIT set below MAX for the 1050Ti and I still see max power output.

I did notice during startup, of the three terminal screens that pop-up during startup that the second terminal session has the POWERLIMIT set correctly at 60.   However, something happens after the third terminal screen initiates (miner starting) that pushes the POWER back to MAX.

I added another rig of 1050Tis tonight and I saw more unusual behavior from POWER settings again where GPU0 goes to 125W as the max power limit and the rest of the GPUs all complied with my setting of 65Watts.   I have no idea what is causing this inconsistency in power limit settings.

I also noticed in the Guake terminal that the TEMP CONTROL module is displaying continuous notifications that 125W is not a valid power limit (even after changing the settings in the module to 60-65).

I normally run all my rigs at 60W, which keeps the current draw low enough to run 3 rigs of 8 GPUs on each 15 AMP circuit.  Also, extremely efficient.

I am still hunting for what is causing the forced 125W power setting.

Try the new 1bash and additional files posted on the OP.  Let me know if it doesn't solve this for you.


I tried updating to the newest posted 1bash files as you suggested to resolve a problem where I have set the powerlimits for my cards individually, but the script changes one of my 1080ti cards (250w) to the power limit set for the 1070s (140w) so the card is only pulling 550sol instead of the 750 it should be.

I've triple checked my individual power limit settings vs. the GPU ID from Nvidia xserver against the powerlimit ID in the script, and they match.  But it's not processing properly. 

So after updating to the new 1bash fileset, it gets to the point where the fan settings are modified, the script loads the EWBF miner (for HUSH), and promptly crashes with the 'screen is terminated' message.  I also have disabled autotemp and watchdog, but the problem persists.

Here's a screen shot of the problem that set this chain in motion showing the power draw of the cards vs. the powerlimit settings in 1bash and the IDs that were used in Xserver to match the powerlimits, and of the current problem. 

http://imgur.com/a/zwf2s


I can't figure out what might be causing the miner to crash immediately on load like it is.. I've tried zeroing overclocks in case it was related to that, but that was no help. I ruled out power mizer by disabling that as well, but it crashes still.

I'm also set to LOCAL mode. 
TenaciousJ
Full Member
***
Offline Offline

Activity: 122
Merit: 100


View Profile
July 20, 2017, 05:01:17 AM
 #2015

Hi fullzero,

thank you for keeping this project alive and the constant updates.
I've been running 017 version on z270-hd3p gigabyte motherboard + 3 x 1080TI and a 1070 for almost 2 weeks now with no issues.

meanwhile does anyone have the issue with 018 version not working at all? ewbf does not even start. Most settings have been the same as from the onebash file in 017. Turned off most of the new additional features like watchdog and auto temp.
I've tried booting from an ssd as well as a 32gb sandisk ultra flair thumbdrive; I keep getting the error [Screen is terminating] at the end.

I understand the issue is most likely a configuration somewhere gone wrong, therefore it terminated before even trying to load ewbf miner, but was there such a drastic change from 017 to 018 that I missed out?

Would really like to find out if anyone faced a similar issue, so I iron it out and run ver 018.
Thanks!

I ran into the same problem after using the most current files.  No idea what's causing it.  I've disabled autotemp and watchdog, set it to LOCAL, tried mizer on and off, etc., but nothing fixed it.  I see EWBF load for half a second, then that 'screen is terminating' message pops up.  I think it might be related to watchdog, even though it's disabled in the 1bash file, but I can't figure out how exactly.
kw1k
Newbie
*
Offline Offline

Activity: 14
Merit: 0


View Profile
July 20, 2017, 05:15:36 AM
 #2016

fullzero -- did you manage to get ccminer_alexei78 into this new v18 build?

I still need to add some more ccminer versions; v0018 doesn't have the version I believe you are looking for.



I have compiled the correct alexis78 version for nvoc with arch flags for 10x series cards.

https://mega.nz/#!p64lHS4Q!BpaOMyEx5pL8GhkEXx6WTfgILxMa5FjvreN7jwLxuVE
BaliMiner
Newbie
*
Offline Offline

Activity: 4
Merit: 0


View Profile
July 20, 2017, 05:22:16 AM
 #2017


BaliMiner please provide a BTC address for the next version.


Hi Fullzero, this is my BTC address: 1HbzxQ6AVeWYvFm322KtxZcJJLAqfJHpN8
Avarets
Newbie
*
Offline Offline

Activity: 12
Merit: 0


View Profile
July 20, 2017, 08:56:43 AM
 #2018

My configuration:
v0018
Biostar TB250-BTC PRO + 12 Zotac P106-100 cards (without output).
When I run it with LOCAL (GT 730 for monitor + 7 P106-100 cards) I see it works.
But when I remove GT 730 adapter and monitor and attach all 12 P106-100 cards and use REMOTE and connect by SSH it doesn't seem to be working.
I tried to run it manually but the OS was rebooted with Xorg error.
Any ideas how to fix it?

P.S. I tried new 1bash - still the same issue.

Code:
m1@m1-desktop:~$ pkill -e miner
m1@m1-desktop:~$ export DISPLAY=:0
m1@m1-desktop:~$ screen -r miner
There is no screen to be resumed matching miner.
m1@m1-desktop:~$ bash /home/m1/1bash


workername: nv045

Xorg PROBLEM DETECTED

Restoring Xorg

Rebooting in 5

FIrst: ensure you have made the 2x bios changes as indicated in the OP for this mobo; and saved / restarted as directed.  If you have made additional bios changes then you should restore the default settings and perform the procedure in the OP. 

Second while troubleshooting I recommend attaching the GPU with output to the primary 16x slot and using 11 of the mining GPUs in the other slots.  Run in local mode.

If you have significantly changed the GPU configuration; especially in regard to the the primary GPU it is likely the system will need to restore the xorg and reboot.  If it does this once it is expected; if it does this in a loop (ie multiple times in a row there is a problem).

Let me know how this goes.

PS: I highly recommend using the ASRock 13x mobo to get out the box; easy setup.  If I was having a lot of trouble with this mobo, I would get one of the ASRock and then return the Biostar when I had the rig running with 13x.




I figured out this was because of a wrong xorg.conf.
I used this command:
Code:
sudo nvidia-xconfig -a --cool-bits=28 --allow-empty-initial-configuration
Also commented out this part and forced XORG to be OK:
Code:
XORG="OK"

#if grep -q "28800" /etc/X11/xorg.conf;
#then
#XORG="OK"
#fi
Now the script starts fine.
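
Instead of forcing XORG="OK" unconditionally, the check could look for the setting that the nvidia-xconfig command above is meant to write. This is only a sketch under assumptions: xorg_has_coolbits is a hypothetical helper, and it assumes xorg.conf then contains an Option "Coolbits" line; why the original check greps for "28800" is not explained in this thread.

```shell
#!/usr/bin/env bash
# xorg_has_coolbits FILE   (hypothetical helper, not part of 1bash)
# Succeeds if FILE contains an uncommented Option "Coolbits" line.
xorg_has_coolbits() {
    grep -Eiq '^[[:space:]]*Option[[:space:]]+"Coolbits"' "$1"
}

# Example:
#   if xorg_has_coolbits /etc/X11/xorg.conf; then XORG="OK"; fi
```

A commented-out Option line does not count, so the check still fails if the setting was disabled by hand.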

One more thing: the script doesn't support P106-100 overclocking because of this part:
Code:
___1050_or_1050ti="NO"

NORMAL="NO"

nvidia-smi -L > /tmp/tempa

if grep -q "1050" /tmp/tempa;
then
___1050_or_1050ti="YES"
fi

if grep -q "1060" /tmp/tempa;
then
NORMAL="YES"
fi

"nvidia-smi -L > /tmp/tempa" in case of P106-100 is like this:
Code:
m1@m1-desktop:~$ cat /tmp/tempa
GPU 0: P106-100 (UUID: GPU-afea0b93-e083-bde7-f6dd-fb5b9f55ae98)
GPU 1: P106-100 (UUID: GPU-191d50dc-d599-de1d-fa4b-54493a9035c6)
GPU 2: P106-100 (UUID: GPU-2ae0b358-33bb-8438-f47b-2a2ce8088f88)
GPU 3: P106-100 (UUID: GPU-66bce3b8-51aa-9f9d-f3c5-fce4e667f994)
GPU 4: P106-100 (UUID: GPU-bae124b9-96ad-5086-20f4-32bdb6d2663f)
GPU 5: P106-100 (UUID: GPU-a9664776-7549-499a-6cfa-3b74a6c6c843)
GPU 6: P106-100 (UUID: GPU-4b57123b-20b9-20c6-ffb9-0203a51cf009)
GPU 7: P106-100 (UUID: GPU-f851be56-15e7-adf2-5a65-7508a25e6e66)
GPU 8: P106-100 (UUID: GPU-1249a132-7df6-a1d3-4794-947cd1e1887a)
GPU 9: P106-100 (UUID: GPU-f31fca46-13ad-4eee-5024-177de21d36f9)
GPU 10: P106-100 (UUID: GPU-4161850e-1f6a-7c6b-fda6-03d58826f758)
GPU 11: P106-100 (UUID: GPU-af9286f8-e0c5-2139-87f1-7019b8a1ccca)


So I manually set TI="2" and now the overclocking values are applied.

Code:
TI="2"

if [ $___1050_or_1050ti == "YES" ]
then
    TI="2"
    if [ $NORMAL == "YES" ]
    then
        TI="2 3"
    fi
fi
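
A detection step that also recognises the P106-100 might look like the sketch below. The helper needs_ti_2 is hypothetical, and treating the P106-100 like a 1050-class card is based only on the result reported above. Stripping the UUID first avoids a false match when the digits 1050 happen to appear inside a UUID.

```shell
#!/usr/bin/env bash
# needs_ti_2 < gpu-list   (hypothetical helper, not part of 1bash)
# Reads `nvidia-smi -L` style lines on stdin; succeeds if any GPU name
# contains 1050 or P106-100. The UUID part of each line is ignored.
needs_ti_2() {
    sed 's/ (UUID.*//' | grep -Eq '1050|P106-100'
}

# Example:
#   if nvidia-smi -L | needs_ti_2; then TI="2"; fi
```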
fullzero (OP)
Legendary
*
Offline Offline

Activity: 1260
Merit: 1009



View Profile
July 20, 2017, 01:02:26 PM
 #2019

Beats me Fullzero. My version of 1bash v. 18 has no way of executing the bash file Watchdog until I added it.
Just saying "yes" to the switch won't start it.

The other stuff does not work as I stated which is why I wrote my own part and edited out some stuff.

Maybe just me and my rig..shrug dunno.

I don't wanna mess up anyone with my crazy changes so I'll just keep em to myself for now unless I see others with similar issues.

This is getting big and complex. Ever consider client side program running in background and controlling stuff via a webpage?

Yes this is planned: monitor / push / update / dashboard app; keep getting sidetracked adding contributions / new coins.

The new 1bash should solve problems / start watchdog and autotemp in a screen when in remote.

Love the v0018 release and all the functionality! 

However, POWERLIMIT NIGHTMARES! 

I have one major issue, I cannot lower the POWERLIMIT.  I run 8 rigs of 1050Ti and 125W is just way to high.  I have tried adjusting the base line and the individual POWERLIMIT settings and I am still seeing maximum power being utilized in NVIDIA-SMI and TEMP CONTROL.  I thought maybe the TEMP CONTROL was trumping the setting, but I don't think that is the case (at least based on what my 46 year old brain and eye balls looking at the 1bash code understands).  I thought maybe it was the correction in line 527, but that didn't change anything.

I tried "NO" for both WATCHDOG and TEMP CONTROL with POWERLIMIT set below MAX for the 1050Ti and I still see max power output.

I did notice during startup, of the three terminal screens that pop-up during startup that the second terminal session has the POWERLIMIT set correctly at 60.   However, something happens after the third terminal screen initiates (miner starting) that pushes the POWER back to MAX.

I added another rig of 1050Tis tonight and I saw more unusual behavior from POWER settings again where GPU0 goes to 125W as the max power limit and the rest of the GPUs all complied with my setting of 65Watts.   I have no idea what is causing this inconsistency in power limit settings.

I also noticed in the Guake terminal that the TEMP CONTROL module is displaying continuous notifications that 125W is not a valid power limit (even after changing the settings in the module to 60-65).

I normally run all my rigs at 60W, which keeps the current draw low enough to run 3 rigs of 8 GPUs on each 15 AMP circuit.  Also, extremely efficient.

I am still hunting for what is causing the forced 125W power setting.

Try the new 1bash and additional files posted on the OP.  Let me know if it doesn't solve this for you.


I tried updating to the newest posted 1bash files as you suggested to resolve a problem where I have set the powerlimits for my cards individually, but the script changes one of my 1080ti cards (250w) to the power limit set for the 1070s (140w) so the card is only pulling 550sol instead of the 750 it should be.

I've triple checked my individual power limit settings vs. the GPU ID from Nvidia xserver against the powerlimit ID in the script, and they match.  But it's not processing properly. 

So after updating to the new 1bash fileset, it gets to the point where the fan settings are modified, the script loads the EWBF miner (for HUSH), and promptly crashes with the 'screen is terminated' message.  I also have disabled autotemp and watchdog, but the problem persists.

Here's a screen shot of the problem that set this chain in motion showing the power draw of the cards vs. the powerlimit settings in 1bash and the IDs that were used in Xserver to match the powerlimits, and of the current problem. 

http://imgur.com/a/zwf2s


I can't figure out what might be causing the miner to crash immediately on load like it is.. I've tried zeroing overclocks in case it was related to that, but that was no help. I ruled out power mizer by disabling that as well, but it crashes still.

I'm also set to LOCAL mode. 

I made an updated 1bash which should resolve these powerlimit / remote issues.  With the powerlimits the autotemp was not reinitializing unless explicitly killed or the rig was logged out or rebooted.  This had the effect of not allowing changes to the individual powerlimits until such killing or logout / reboot.

In your picture I can see another problem which is most likely what has been killing your 1bash prematurely.

By removing the unused individual powerlimit variables you have created a situation where later in 1bash those variables are undefined. 

For now; don't delete unused variables. 

I can add logic to check or otherwise avoid this type of problem in the future; but it is simple enough to leave the extra variables for now.
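
One way such a check could work is to fall back to a default whenever an individual variable is missing. This is a minimal sketch under assumptions: power_limit_for is a hypothetical helper and the wattages are example values, not the logic that will be added to 1bash.

```shell
#!/usr/bin/env bash
# power_limit_for GPU_INDEX DEFAULT_WATTS   (hypothetical helper)
# Prints INDIVIDUAL_POWERLIMIT_<index> if it is set and non-empty,
# otherwise DEFAULT_WATTS, so a deleted variable never yields an empty value.
power_limit_for() {
    local name="INDIVIDUAL_POWERLIMIT_$1"
    echo "${!name:-$2}"      # indirect expansion with a fallback value
}

# Example: GPU 0 has its own setting, GPU 1 was deleted from the file.
#   INDIVIDUAL_POWERLIMIT_0=250
#   power_limit_for 0 140    # prints 250
#   power_limit_for 1 140    # prints 140
```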

VoskCoin
Sr. Member
****
Offline Offline

Activity: 1414
Merit: 487


YouTube.com/VoskCoin


View Profile WWW
July 20, 2017, 01:06:41 PM
 #2020

I have a major issue: all of my miners just completely turned off 30 minutes ago.

The room was around 80 degrees, they never rebooted, and the breaker wasn't tripped. I have one ASIC miner in there, and it was mining away when I walked in while every other machine was sitting there off.

Any idea on what happened? How can I figure out more and how can I prevent this from happening in the future?

Check out my Crypto YouTube channel
https://www.youtube.com/VoskCoin
If you enjoy my content click Subscribe