Topic: [OS] nvOC easy-to-use Linux Nvidia Mining (Read 417953 times)
Bibi187
August 08, 2017, 01:55:56 PM
 #2421

Hello,

I have two scenarios for which I don't have a solution. Also, I can't find a way to search within this thread only; is that possible?

Maxximus007_AUTO_TEMPERATURE_CONTROL is disabled
Config is set to REMOTE

1) Sometimes the miner gets stuck: I can still open the session (screen -r miner), but I cannot kill the miner (pkill -e miner).
In that situation, how do I kill the miner? Sending pkill -9 -e miner doesn't help either.

2) After the miner gets killed (via pkill -e miner), there is no screen session opened again (screen -ls returns nothing). How can I manually start the 1bash script again, so that it starts the corresponding and correct screen session?

Thanks very much in advance.
Kind regards


When you are attached to the miner's screen session, press CTRL+C to interrupt the miner.
To launch 1bash again, just run "bash 1bash".
Let it do its work; when you see the final line, press CTRL+C.
"screen -ls" will then show the new miner session.
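The restart check above can be sketched as a pair of small helpers. This is only a sketch: the function names are hypothetical, and it assumes the session is called "miner" as in the question. A process that survives pkill -9 is usually in uninterruptible sleep (state D in ps), often waiting on the GPU driver, and in that case a reboot is normally the only way to clear it.

```shell
# Hypothetical helper: succeeds if `screen -ls` output (read from stdin)
# lists a session with the given name.
has_screen_session() {
    local name="$1"
    grep -Eq "^[[:space:]]+[0-9]+\.${name}[[:space:]]"
}

# Hypothetical helper: succeeds if a ps STAT value (for example "D" or "Dl")
# marks uninterruptible sleep, which SIGKILL cannot interrupt.
is_uninterruptible() {
    case "$1" in
        D*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Example usage on the rig (commented out so the sketch has no side effects):
#   screen -ls | has_screen_session miner || bash 1bash
#   is_uninterruptible "$(ps -o stat= -p <miner PID>)" && echo "reboot needed"
```

If the first check fails, no miner session exists and 1bash can be started again as described above.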

JayneL
August 08, 2017, 02:28:31 PM
 #2422

Thanks guys, I finally managed to get it running for a couple of hours now. Here's the new problem: I plugged all 6 of my GPUs into my Biostar TB250-BTC PRO, but only 5 are working.
The BIOS is already set to mining (above 6 GPUs), PCIe is set to GEN1, and my monitor is plugged into the x16 PCIe slot. All fans are working, and I'm lost :(
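One way to narrow this down is to compare how many NVIDIA devices the PCI bus reports with how many the driver lists. The helpers below are a hypothetical sketch that only counts lines in the output of `lspci` and `nvidia-smi -L`; the function names are made up for illustration.

```shell
# Count NVIDIA display devices in `lspci` output read from stdin.
# Gaming cards usually show up as "VGA compatible controller",
# mining cards without video outputs as "3D controller".
count_nvidia_pci() {
    grep -Eci '(VGA compatible controller|3D controller): NVIDIA'
}

# Count GPUs in `nvidia-smi -L` output read from stdin.
count_driver_gpus() {
    grep -Ec '^GPU [0-9]+:'
}

# Example usage on the rig:
#   lspci | count_nvidia_pci
#   nvidia-smi -L | count_driver_gpus
```

If the PCI count is already 5, the missing card is probably not being detected at the slot, riser, or BIOS level; if the PCI count is 6 but the driver count is 5, the problem is more likely on the driver side.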
uat88
August 08, 2017, 02:30:06 PM
 #2423


I've managed to make it work with my RIG, it seems working fine (issue was with path as mentioned in one of the replies).

Getting around 125 MH with my 8x 1060 6GB ASUS (oc:0,mc:0,pl:80); please share your OC settings if you think I'm getting a low hashrate.

Thanks.

I don't know about the 1060, but on the skunk algo the memory clock (mc) doesn't matter, so set it to a low value; the lowest you can go on a 1070 is -2000.
Raise your power limit to the default power limit, then raise the core clock step by step: start at 100, wait 24 hours, and increase by 10 after every 24 hours until a crash occurs.
When a crash occurs, go back to the previous step and wait another 24 hours; if it is stable, lower your power limit to get a better watt/hashrate ratio.

So for my ASUS 1070 GTX 8OG
MemClock : -2000
CoreClock : 130 ( the 24hours step, i go push to 130 just now )
PowerLimit : 150W per gpu
TempLimit : 70c
Fan range : 95-75 ( autotemp ON )  

SIGT is running well for 2500 blocks, after which the reward gets halved, so I suggest you find a stable setting: the watchdog is disabled, so if your rig crashes it won't reboot itself ;)

Thanks mate for the suggestions and for sharing your OC settings; I will try to increase as per your suggestions and see how it goes.

Will share my OC settings if it improves.

Is getting around 125 MH with my 8x 1060 6GB considered low hashrate?
What should be the ideal for this GTX 1060 or 1070?
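The step-up procedure quoted in this post (start at 100, add 10 after each stable 24 hours, fall back one step after a crash) can be sketched as a tiny helper. The function name and the "yes"/"no" crash flag are hypothetical; applying the offset and detecting the crash are left to the operator.

```shell
# Print the next clock offset to try.
#   $1 = current offset, $2 = step size, $3 = "yes" if the rig crashed at $1
next_offset() {
    local current="$1" step="$2" crashed="$3"
    if [ "$crashed" = "yes" ]; then
        # Crash: go back to the previous (last stable) step.
        echo $((current - step))
    else
        # Stable for the whole interval: try one step higher.
        echo $((current + step))
    fi
}

# Example: stable at 100 for 24 hours, so try 110 next.
#   next_offset 100 10 no
```

Once the fallback value has also held for 24 hours, the advice above is to stop raising the clock and start lowering the power limit instead.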
damNmad
August 08, 2017, 03:59:28 PM
 #2424


Well, from my experiments, I came to the conclusion that the following settings are good (maybe not!), I think:

cc: 150
mc: -1500
pl: 80

These gave me around 138 MH.

But I just found this

https://bitcointalk.org/index.php?topic=2070862

Testing it now, will keep you posted.

Please let me know if the above OC settings have made any improvement.
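For anyone who wants to try cc/mc/pl values like these by hand, outside 1bash, the following dry-run sketch prints the commands instead of running them. The helper name is hypothetical. It assumes overclocking is enabled in X (Coolbits) and that performance level index [3] is the right one for these cards, which varies by GPU model and driver.

```shell
# Hypothetical dry-run helper: prints, without executing, the commands that
# would apply one core offset (cc), memory offset (mc) and power limit (pl)
# to GPUs 0..N-1.
# Assumption: performance level index [3] fits these cards; it varies by GPU.
print_oc_commands() {
    local gpus="$1" cc="$2" mc="$3" pl="$4" i=0
    while [ "$i" -lt "$gpus" ]; do
        echo "nvidia-smi -i $i -pl $pl"
        echo "nvidia-settings -a '[gpu:$i]/GPUGraphicsClockOffset[3]=$cc' -a '[gpu:$i]/GPUMemoryTransferRateOffset[3]=$mc'"
        i=$((i + 1))
    done
}

# Review the printed commands before running any of them:
#   print_oc_commands 8 150 -1500 80
```

Printing first makes it easy to check the values per GPU before anything is changed on the rig.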

 


tomlev5
August 08, 2017, 09:13:16 PM
 #2425

Hi!
I'm about to build a rig with a Pentium G4400.
But after reading about the Skylake issues with hyper-threading, I don't know
whether I should use it, or whether it's better to return it and exchange it for an i3?
I have a rig that runs 8 cards with a G3900, no problem.

I have a rig with G3900 and 10 cards, but it seems that the G3900 is bottlenecking the system. First core uses 100%, the other around 20%.

Is there a way to split the workload on both cores (If I could run EWBF miner in two terminals or is there some other way)?

Can somebody please help me? I'm a noob in Linux.


nvOC is a great system. Thanks for all the work.

I tried opening two miners in two screens with some change in 1bash:

screen -dmS minerX1 $HCD --eexit 3 --fee $EWBF_PERCENT --cuda_devices 0 1 2 3 4 --pec --server $ZEC_POOL --user $ZECADDR --pass z --port $ZEC_PORT;
screen -dmS minerX2 $HCD --eexit 3 --fee $EWBF_PERCENT --cuda_devices 5 6 7 8 9 --pec --server $ZEC_POOL --user $ZECADDR --pass z --port $ZEC_PORT;

After the change there are two processes in System monitor, but CPU usage is still the same (100% one core, 20% second core).

Is it possible that System monitor doesn't show the correct CPU utilization?
Will it help if I change the CPU from G3900 to i3 or perhaps i5?

Fullzero, I see you have 13 GPUs on an Asrock H110 PRO BTC (I am using the same motherboard). What kind of CPU do you have so that everything is working smoothly?
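The two-screen split above hard-codes the device lists. A small hypothetical helper can generate them, which makes it easier to try other splits. Whether splitting, or pinning each miner to its own core with taskset, actually relieves the bottleneck is untested here; `top -H` shows per-thread usage and may help tell which thread is saturating the first core.

```shell
# Print the device indices FIRST..LAST separated by spaces,
# in the form used with --cuda_devices in the lines above.
device_range() {
    local first="$1" last="$2" out="" i
    i="$first"
    while [ "$i" -le "$last" ]; do
        out="${out:+$out }$i"
        i=$((i + 1))
    done
    echo "$out"
}

# Example (untested idea): one miner per core, five GPUs each.
#   screen -dmS minerX1 taskset -c 0 $HCD ... --cuda_devices $(device_range 0 4) ...
#   screen -dmS minerX2 taskset -c 1 $HCD ... --cuda_devices $(device_range 5 9) ...
```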

salfter
August 08, 2017, 09:29:13 PM
 #2426

@salfter :

I'd like to suggest an idea to your switcher: when the most profitable algorithm is Ethash, give the option to dual-mine automatically with the second most profitable algorithm if it's able to do so.

Of course I'm oversimplifying, as the switching itself will have to consider a higher power limit, more complex profit calculation and perform a "switch inside a switch" (Ethash fixed + switching second algo). But I think we can get a few more bucks this way, at least from my short mining experience it's usually profitable to dual-mine when Ethash is the most profitable.

My somewhat limited experience is that it only squeezes out a few more cents, not dollars.  I'd also need to redo miner benchmarks, as I've been using Genoil's ethminer.

Tipjars: BTC 1TipsGocnz2N5qgAm9f7JLrsMqkb3oXe2 LTC LTipsVC7XaFy9M6Zaf1aGGe8w8xVUeWFvR | My Bitcoin Note Generator | Pool Auto-Switchers: zpool MiningPoolHub NiceHash
Bitgem Resources: Pool Explorer Paper Wallet
dbolivar
August 08, 2017, 09:40:28 PM
 #2427

I don't have experience with your particular hardware, but from the specs of your CPU, it supports up to 16 PCIe lanes, so at least I/O shouldn't be a problem (https://ark.intel.com/products/90741/Intel-Celeron-Processor-G3900-2M-Cache-2_80-GHz), as long as you use all your cards with risers, so they all run at PCIe 1x.

You can check the following in your Linux installation:

1) In "top" or "system monitor", what's the breakdown of the CPU usage for USER, SYSTEM, NICE and WAIT? This will help identify where the bottleneck could be.

2) Which are the top 3 processes using the most CPU?

3) Check if you have the process "irqbalance" running with the correct parameters: "ps aux | grep irqbalance".
salfter
August 08, 2017, 09:40:51 PM
 #2428

Sorry, but that's not really related to my question. It doesn't matter how loud the graphics cards are; the fans should just run at 100%. The cooler the card, the longer the lifespan. And at the moment, in this room, I really need them to run at 100%. The cards reach 70 degrees at 50%, which is too much.

I thought I read somewhere that running the fans much past 85% won't do much in the way of additional cooling, but it will run up additional wear on the motors.  

tomlev5
August 08, 2017, 10:02:55 PM
Last edit: August 08, 2017, 10:21:47 PM by tomlev5
 #2429

When I start the miner in single process:

1) breakdown of the CPU usage after about 20 minutes of uptime:
      1x process miner used 3:51
      14x process kworker (they used from 0:40 to 1:20)
      9x irq nvidia (they used from 0:40 to 1:03)

2) in the top twenty processes are miner, kworker, irq nvidia (nothing else)

3) If I type "ps aux | grep irqbalance" in the Guake terminal, I get two processes: one is owned by root, the other by m1.
What parameters can I check?

Thanks for helping, dbolivar. I am stuck because I have no experience with Linux.
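A note on the two irqbalance lines: the one owned by m1 is most likely the grep command itself, since "ps aux | grep irqbalance" also matches its own command line. The hypothetical helper below counts only real irqbalance processes by looking at the command column of `ps aux` output; it assumes the standard 11-column `ps aux` layout.

```shell
# Count processes whose command (11th column of `ps aux` output, read from
# stdin) is irqbalance itself, ignoring the grep that searched for it.
count_irqbalance() {
    awk '$11 ~ /(^|\/)irqbalance$/ { n++ } END { print n + 0 }'
}

# Example usage on the rig:
#   ps aux | count_irqbalance
```

A result of 1 would mean a single irqbalance daemon is running, which is what the check was meant to confirm.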
fogcity
August 08, 2017, 10:54:46 PM
 #2430

Hey there --

I'm running v0018 on a single rig with 6x 1070 GPUs (1x EVGA & 5x ASUS).  I have all the GPUs overclocked to the point where my total ETH hashrate is ~180 MH/s, and I'm quite content to just let it run.

That being said, my rig crashed today.  The error on the screen was:

CUDA error in function 'search' at line 346 : the launch timed out and was terminated

This error was repeated for all 6x cudaminers (cudaminer0 thru cudaminer5) at the same system clock time, and then the rig halted.

Can anyone possibly tell me what caused this issue and what I can do to avoid it happening again?

Thanks in advance!

Fogcity




dbolivar
August 09, 2017, 12:23:19 AM
 #2431

It's OK to have a high CPU usage time* for kworker and irq/nvidia: these account for internal kernel worker threads and interrupts for GPU I/O, which is expected on a multi-GPU mining rig. What I'm really looking for is how much CPU usage goes to user processes, system processes, and I/O wait; that's why I asked for these values. Try this: run the "top" utility, which updates constantly, and type "1" to expand the third line (CPU usage summary) so that it shows each CPU. Then paste the values here, like this:

%Cpu0  :  0.3 us,  1.0 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.3 us,  1.3 sy,  0.0 ni, 98.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
...
%CpuN  : ....

Regarding irqbalance, running the command I suggested, you should get one line like this:

root       874  0.0  0.0  19536  2232 ?        Ss   Aug06   0:12 /usr/sbin/irqbalance --pid=/var/run/irqbalance.pid

* EDIT: high CPU usage TIME (which accumulates until the next reboot), not a constantly high CPU usage percentage.
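To pull just the user, system, and I/O wait figures out of lines like the ones shown above, a small awk filter is enough. This is a sketch with a hypothetical function name; it assumes the "%CpuN : ... us, ... sy, ... wa," layout from this post, which can differ between versions of top.

```shell
# Read top's "%Cpu..." summary lines from stdin and print, for each CPU,
# the user (us), system (sy) and I/O wait (wa) percentages.
cpu_breakdown() {
    awk '/^%Cpu/ {
        us = sy = wa = "?"
        for (i = 2; i <= NF; i++) {
            if ($i == "us,") us = $(i - 1)
            if ($i == "sy,") sy = $(i - 1)
            if ($i == "wa,") wa = $(i - 1)
        }
        print $1, "us=" us, "sy=" sy, "wa=" wa
    }'
}

# Example usage, assuming the batch output contains the per-CPU lines:
#   top -b -n 1 | cpu_breakdown
```

A high "sy" or "wa" on one core would point at kernel or I/O work, while a high "us" would point at the miner process itself.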
BigSmurf
August 09, 2017, 01:50:30 AM
 #2432

Hi,
I have an ASRock H110 Pro BTC+ and 13x Mining/P106-100 (1060 6GB, per GPU-Z) cards. Everything works fine and stable with other OSes, but with your OS it keeps restarting over and over because it is not able to see/find the xorg file.

We managed to start the OS with a GTX 1060 card and then added the Mining P106-100 (1060 6GB) cards, but it's a no-go with only the Mining P106-100 (1060 6GB) cards. Can you look into this? I would like to use your OS, but I don't want to have a normal gaming card; I don't need it when using all mining cards.

Can you also write an explanation of overclocking? We tried via the file but failed. We only managed to overclock the cards individually via the Nvidia X Server.

thank you.
BigSmurf
fullzero (OP)
August 09, 2017, 03:42:30 AM
 #2433

Hi,

I got a Gainward 1060 today, and out of 8 cards (Gigabyte, ASUS) it is the first one that ignores the power limit, whether I set it via bash or nvidia-smi; i.e. it draws 100 W with an 80 W limit. Is the card damaged?

still using v0017

BTW: is there any planned date for v0019? I much appreciate this OS!

best regards



I am not familiar with gainward.  The GPU may not be properly recognized by X; in turn causing OC and PL to not work on it.  I usually test GPUs that have a problem in a rig; by moving them to a mobo 16x slot direct (only GPU on the mobo) and testing if the same problems manifest.

There are a lot of changes for v0019; I want to test them before releasing. 
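To see which card is ignoring its limit, the draw and the limit can be compared per GPU. The sketch below parses CSV output of the form "index, power.draw, power.limit"; the function name and the 5 W tolerance are assumptions for illustration.

```shell
# Read "index, power.draw, power.limit" CSV lines from stdin and print the
# index of every GPU drawing more than its limit plus a margin (default 5 W).
gpus_over_limit() {
    awk -F', *' -v margin="${1:-5}" '($2 + 0) > ($3 + margin) { print $1 }'
}

# Example usage on the rig:
#   nvidia-smi --query-gpu=index,power.draw,power.limit \
#       --format=csv,noheader,nounits | gpus_over_limit
```

Short spikes above the limit can be normal, so it is worth sampling a few times before concluding that a card ignores the setting.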
fullzero (OP)
August 09, 2017, 03:43:05 AM
 #2434

fullzero thanks for such a great and handy os for mining, real timesaver.

Currently mining SIGT with 1070 and having around 19-19.75 MH/s per card (100 COC, 1300 MOC, 125 PW, intensity 20). Tried different settings, tweaked to 20MH/s, but it wasn't stable, so this is where I stopped.

One question: sometimes the miner still crashes and I don't see the reason; the screen just gets terminated and I only see that the miner restarts. Is there any log file where I could see the reason for the last termination? Thanks.

BTW yesterday I contacted sp too about signatum ccminer mod, I was ready to pay him 0.05 BTC for mod, but unfortunately he said he makes only mod for windows. Maybe someone can convince him there is quite big audience for him if he makes linux version.

Click Ubuntu button on top left and type:

s

Click on system log

Look through the logs for error messages.
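The same search can be done from a terminal. The filter below is a hypothetical sketch: the keyword list is a guess at what GPU-related failures tend to look like in the kernel log (for example NVIDIA "Xid" messages), and /var/log/syslog is the usual Ubuntu location. The miner's own output goes to its screen session, so it will only appear in a file if it is logged separately.

```shell
# Keep only log lines (read from stdin) that look like GPU or crash errors.
# The keyword list is an assumption, not a complete catalogue.
filter_gpu_errors() {
    grep -Ei 'NVRM|Xid|cuda error|segfault|out of memory'
}

# Example usage on the rig:
#   filter_gpu_errors < /var/log/syslog | tail -n 20
```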
fullzero (OP)
August 09, 2017, 03:45:18 AM
 #2435

After reading the ccminer readme.txt I just noticed that DMD and ZCOIN had the wrong algo commands in onebash:
DMD = dmd-gr
ZCOIN = lyra2z

Not that anyone is mining these anymore :)

I will change these for the next 1bash.  Thanks jlbaseball11
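Until the next 1bash is out, the corrected names can be kept in one place. This is a minimal sketch with a hypothetical function name; only the two mappings reported above are included.

```shell
# Map a COIN value to the ccminer algorithm name reported in this post.
ccminer_algo() {
    case "$1" in
        DMD)   echo "dmd-gr" ;;
        ZCOIN) echo "lyra2z" ;;
        *)     echo "unknown coin: $1" >&2; return 1 ;;
    esac
}

# Example: ccminer_algo ZCOIN
```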
fullzero (OP)
August 09, 2017, 03:50:11 AM
 #2436

Question for fullzero: is there a reason why you do not fail over between mining pool addresses in 1bash? I was watching my rig (for a completely separate issue, which I might need help with later) when it lost connection to the mining pool I use (nanopool west). The mining process then tried to restart twice, and then the whole rig just shut down. I tested my internet connection while the rig was restarting and trying to connect to the pool, and my connection was OK. Wouldn't it be better to connect to a failover pool address and, if that connects, try to re-establish the connection with the original pool an hour later? Or something along those lines, perhaps incorporated into the watchdog? It's cool that it shut itself down rather than waste power, but I would prefer that the rig try to connect to another pool if it can.

I haven't tested it, but for Claymore, can't you just put all the failover addresses into /home/m1/eth/9_7/epools.txt like you do on Windows?

You should be able to use the Claymore failover.

I have a general client implementation of failover planned (it will work with any mining client).  This may or may not be in v0019, but will be included in time.  And yes this will be added in via the watchdog.
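As an illustration of the idea only, and not the planned nvOC implementation, a client-independent failover can be as simple as probing a list of pools in order and using the first one that answers. The function names are hypothetical, the pools are expected as plain host:port, and the default probe relies on bash's /dev/tcp and the coreutils timeout command.

```shell
# Default probe: succeeds if a TCP connection to host:port opens within
# 3 seconds. Uses bash's /dev/tcp, so bash must be installed.
probe_pool() {
    local host="${1%:*}" port="${1##*:}"
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Print the first pool (host:port) from the arguments that passes the probe;
# return non-zero if none of them is reachable.
first_reachable_pool() {
    local pool
    for pool in "$@"; do
        if probe_pool "$pool"; then
            echo "$pool"
            return 0
        fi
    done
    return 1
}

# Example with placeholder addresses:
#   POOL=$(first_reachable_pool main.example.com:4444 backup.example.com:4444)
```

A watchdog could call this before each miner restart, so that a rig whose primary pool is down moves to the backup instead of shutting down.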
fullzero (OP)
August 09, 2017, 03:53:15 AM
 #2437

Hello guys,

Any of you mining Pascal Lite? If so can you please share your PASL coin code part from oneBash (would prefer DUAL though), would like to create an account with PASL.

Tried with accounts.pascallite.com, but don't have any PASL on Cryptopia to buy (it just costs 0.05 PASL though), so want to try pasl.fairpool.xyz as they said they will give an account if I mine with my public wallet key for 12.25 PASL.

I've tried to edit my oneBash and add it, but it does nothing except crash/hang my rig.

Thanks in advance.

&&

Would also like to share my stable OC settings with ASUS DUAL 1060 6GB.

cc : -100; mc:  1100; pl  :  90W

giving me 170MH for ETHASH and 200MH for LBC (dcri-40)

Hope it helps some new users trying for a stable OC; I'm also happy to get some advice on whether I can increase the yield.

Thanks and good luck with our mining.

Hail fullZero for creating and expanding this OS with amazing support.

Long live CRYPTO & fullZero...

I have tried adding the following code to my one bash as per the instructions on replies, but still my rig is crashing when I tried to mine PASL. Though I want DUAL_ETC_PASL, I've tried just PASL using this code.

Code:
COIN="PASL"

Code:
PASL_WORKER="$IP_AS_WORKER"
PASL_ADDRESS="xxxxxxxxxxxxxx"
PASL_POOL="stratum+tcp://mine.pasl.fairpool.xyz:4009"

Code:
if [ "$COIN" = "PASL" ]
then
    HCD='/home/m1/pasc/sgminer'
    ADDR="$PASL_ADDRESS..$PASL_WORKER"

    # Start sgminer detached in a screen session named "miner"
    screen -dmS miner "$HCD" -k pascal -o "$PASL_POOL" -u "$ADDR" -p x -I 21 -w 64 -g2

    if [ "$LOCALorREMOTE" = "LOCAL" ]
    then
        screen -r miner
    fi

    BITCOIN="theGROUND"

    # Keep 1bash alive while the miner runs in its screen session
    while [ "$BITCOIN" = "theGROUND" ]
    do
        sleep 60
    done
fi

sgminer starts but the rig hangs; can someone please help me with this...

I have figured it out, don't need to add additional code, just use the current PASC code as is, change the details of pool and account of PASC to PASL details.

Thanks; knowing this will make adding PASL faster.
philipma1957
August 09, 2017, 03:53:31 AM
 #2438

Skunk on zpool, please. Thanks, Phil

fullzero (OP)
August 09, 2017, 03:56:43 AM
 #2439

Hi Fullzero

I want to share with you a GPU failure that the watchdog is not able to detect.

wdog screen:

GPU UTILIZATION:  Unable to determine the device handle for GPU 0000:09:00.0: GPU is lost. Reboot the system to recover this GPU

/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: Unable: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: to: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: determine: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: the: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: device: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: handle: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: for: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: GPU: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: 0000:09:00.0:: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: GPU: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: is: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: lost.: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: Reboot: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: the: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: system: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: to: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: recover: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: this: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: GPU: integer expression expected
Tue Jul 25 16:57:01 CEST 2017 - All good! Will check again in 60 seconds


GPU UTILIZATION:  Unable to determine the device handle for GPU 0000:09:00.0: GPU is lost. Reboot the system to recover this GPU

/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: Unable: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: to: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: determine: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: the: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: device: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: handle: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: for: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: GPU: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: 0000:09:00.0:: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: GPU: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: is: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: lost.: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: Reboot: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: the: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: system: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: to: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: recover: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: this: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: GPU: integer expression expected
Tue Jul 25 16:58:01 CEST 2017 - All good! Will check again in 60 seconds


The miner shows/detects only 6 of the 7 GPUs.

nvidia-smi doesn't work
$ nvidia-smi
Unable to determine the device handle for GPU 0000:09:00.0: GPU is lost.  Reboot the system to recover this GPU

temp screen:
Provided power limit 75.00 W is not a valid power limit which should be between 115.00 W and 291.00 W for GPU 00000000:0A:00.0
Terminating early due to previous errors.
Tue Jul 25 17:01:07 CEST 2017 - All good, will check again soon

GPU 0, Target temp: 61, Current: 60, Diff: 1, Fan: 75, Power: 123.46

GPU 1, Target temp: 61, Current: 60, Diff: 1, Fan: 63, Power: 124.62

GPU 2, Target temp: 61, Current: 59, Diff: 2, Fan: 77, Power: 119.23

GPU 3, Target temp: 61, Current: 60, Diff: 1, Fan: 68, Power: 120.72

GPU 4, Target temp: 61, Current: 59, Diff: 2, Fan: 57, Power: 124.26

GPU 5, Target temp: 61, Current: Unable, Diff: 61, Fan: to, Power: determine

/home/m1/Maxximus007_AUTO_TEMPERATURE_CONTROL: line 125: [: Unable: integer expression expected
/home/m1/Maxximus007_AUTO_TEMPERATURE_CONTROL: line 158: [: the: integer expression expected
/home/m1/Maxximus007_AUTO_TEMPERATURE_CONTROL: line 171: [: to: integer expression expected
GPU 6, Target temp: 61, Current: 55, Diff: 6, Fan: 50, Power: 126.76

Tue Jul 25 17:01:37 CEST 2017 - Restoring Power limit for gpu:6. Old limit: 125 New limit: 75 Fan speed: 50

Provided power limit 75.00 W is not a valid power limit which should be between 115.00 W and 291.00 W for GPU 00000000:0A:00.0
Terminating early due to previous errors.
Tue Jul 25 17:01:37 CEST 2017 - All good, will check again soon


I believe this is the exact problem that Maxximus007 recently made a new code block to resolve.

Fullzero,

I'm getting this error as well, and it looks like the watchdog is not rebooting the system.
I believe I have the latest bash files.
Are Maxximus007's changes to resolve this issue in the current bash files?
Thank you.



GPU UTILIZATION:  Unable to determine the device handle for GPU 0000:01:00.0: GPU is lost. Reboot the system to recover this GPU

/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: Unable: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: to: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: determine: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: the: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: device: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: handle: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: for: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: GPU: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: 0000:01:00.0:: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: GPU: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: is: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: lost.: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: Reboot: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: the: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: system: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: to: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: recover: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: this: integer expression expected
/home/m1/IAmNotAJeep_and_Maxximus007_WATCHDOG: line 44: [: GPU: integer expression expected
Sat Jul 29 21:07:09 PDT 2017 - All good! Will check again in 60 seconds



The newest watchdog download link is at the top of the OP in purple.  It resolves this problem, and is more effective; it should not have a false positive reboot at all.
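For readers wondering why the old watchdog reported "All good!" here: the "integer expression expected" lines suggest that each word of the nvidia-smi error message was fed into a numeric test, which fails without triggering a reboot. The sketch below shows one way to guard against that. It is not Maxximus007's actual fix, the function names are hypothetical, and line 44 of the watchdog is not shown in this thread, so its exact shape is a guess.

```shell
# Succeed only if the argument is a non-empty string of digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

# Read one GPU utilization value per line from stdin and print:
#   "lost" if any line is not a plain integer (e.g. a "GPU is lost" message),
#   "low"  if every line is numeric but at least one is below the threshold,
#   "ok"   otherwise.
check_utilization() {
    local threshold="$1" result="ok" value
    while IFS= read -r value; do
        [ -z "$value" ] && continue
        if ! is_uint "$value"; then
            echo "lost"
            return 0
        fi
        if [ "$value" -lt "$threshold" ]; then
            result="low"
        fi
    done
    echo "$result"
}

# Example usage on the rig:
#   nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits \
#       | check_utilization 90
```

Treating any non-numeric reading as "lost" means the error text itself becomes the reboot trigger instead of being silently skipped.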
fullzero (OP)
August 09, 2017, 03:58:10 AM
 #2440


I will test this driver / switch to it if it is better for v0019