Topic: [OS] nvOC easy-to-use Linux Nvidia Mining
fullzero (OP)
Legendary
Activity: 1260
Merit: 1009
July 07, 2017, 04:49:41 PM
#1601

Is anybody else experiencing nvOC hanging / locking up to the point of needing a hard power-down when Genoil crashes?  I can log in, but when I try to close the miner and shut down, the OS locks up. I am wondering if it is hardware related.  I am only using 4GB of DDR4; is that enough?  I believe I will be going back to Claymore.  I can't seem to get Genoil stable even dialed 300mc back from Claymore.  I will reimage a USB stick and go back to Claymore to see if stability comes back.

Having more RAM would probably help; I use 8GB on most of my rigs, and I have achieved multi-day stability with the ones that are using Genoil by lowering the clocks / adjusting the power limits whenever a soft crash occurred.

On a 4 x 1060 rig I use 1.1GB of RAM.  On a 6 x 1060 rig I use 1.3GB of RAM.  Unless there are spikes of memory usage or a memory leak somewhere, I don't see why 4GB wouldn't be more than enough.  It certainly shouldn't have any effect on Genoil stability.  Genoil seems to give comparable or better hash rates than Claymore, even with lower clocks and power limits.  I've found that dropping the clocks has definitely increased stability without reducing hash rate.

With Ethash, more RAM (up to 16GB) will increase stability.  It is not that you need this much, but that it will decrease your failure rate.  You can empirically verify this, but if you want to use 4GB of RAM go ahead.  Using stable clocks and power limits will have a much larger impact on stability; more RAM will probably only get you marginally higher stability.
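
One way to check this empirically is to log how much RAM is actually in use while the miner runs. A minimal sketch, assuming a Linux /proc/meminfo that has a MemAvailable field; the function name used_mb is made up for illustration:

```shell
#!/usr/bin/env bash
# used_mb [FILE]: print used memory in MB, computed as MemTotal - MemAvailable.
# FILE defaults to /proc/meminfo; the values in that file are in kB.
used_mb() {
    awk '/^MemTotal:/ {total=$2} /^MemAvailable:/ {avail=$2}
         END {printf "%d\n", (total - avail) / 1024}' "${1:-/proc/meminfo}"
}

# Example: print a timestamped sample once a minute while mining.
# while true; do echo "$(date +%T) $(used_mb) MB used"; sleep 60; done
```

If the logged number stays well below the installed RAM over several days, memory is unlikely to be the limiting factor on that rig.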

fullzero (OP)
Legendary
Activity: 1260
Merit: 1009
July 07, 2017, 04:53:04 PM
#1602

This line means the infoROM (part of the card's on-board firmware storage) is corrupted on one of the GPUs:
Code:
WARNING: infoROM is corrupted at gpu 0000:07:00.0

I would return this GPU or RMA it.

You could try reflashing its ROM with NVFlash, but if this doesn't work it will most likely void your warranty; so if the GPUs are new I would go the other route.
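
To see which physical card the warning refers to, the bus ID in the message can be pulled out and matched against lspci. A small sketch; the function name corrupt_inforom_ids is made up for illustration:

```shell
#!/usr/bin/env bash
# corrupt_inforom_ids: read text on stdin and print the PCI bus ID from each
# "infoROM is corrupted" warning line, one ID per line.
corrupt_inforom_ids() {
    sed -n 's/.*infoROM is corrupted at gpu \([0-9A-Fa-f:.]*\).*/\1/p'
}

# Example (needs the NVIDIA driver installed):
# nvidia-smi 2>&1 | corrupt_inforom_ids
```

For the warning quoted above this prints 0000:07:00.0, and `lspci -s 07:00.0` should then show the card sitting at that address.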

For fan speed, try setting:

Code:
SLOW_USB_KEY_MODE="YES" 

Let me know if that works.

Also what kind of USB / SSD are you using?


Heya, thanks for the reply.

About to return the GPU; it's brand new, bought a couple of days ago. Not going to reflash it or anything, so as not to void the warranty. Thanks for the tip.

About the fan speed.

Code:
m1@rig1:~$ export DISPLAY=
m1@rig1:~$ echo $DISPLAY
m1@rig1:~$ nvidia-settings -a [fan:0]/GPUTargetFanSpeed=75
Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused

ERROR: The control display is undefined; please run `nvidia-settings --help` for usage information.
m1@rig1:~$ cat Desktop/oneBash | grep 'SLOW_USB_KEY_MODE='
SLOW_USB_KEY_MODE="YES"         # YES NO
m1@rig1:~$ export DISPLAY=:0.0
m1@rig1:~$ xrandr
xrandr: Failed to get size of gamma for output default
Screen 0: minimum 1024 x 768, current 1024 x 768, maximum 1024 x 768
default connected 1024x768+0+0 0mm x 0mm
   1024x768       0.00*
m1@rig1:~$ echo $DISPLAY
:0.0
m1@rig1:~$ nvidia-settings -a [fan:0]/GPUTargetFanSpeed=75

** (nvidia-settings:5815): WARNING **: Couldn't register with accessibility bus: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'fan:0' (No targets match target specification), specified in assignment '[fan:0]/GPUTargetFanSpeed=75'.

xorg.conf
Code:
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 378.13  (buildmeister@swio-display-x86-rhel47-05)  Tue Feb  7 19:37:00 PST 2017


Section "ServerLayout"
    Identifier     "layout"
    Screen      0  "nvidia" 0 0
    Inactive       "intel"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "keyboard"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "intel"
    Driver         "modesetting"
    Option         "AccelMethod" "None"
    BusID          "PCI:0@0:2:0"
EndSection

Section "Device"
    Identifier     "nvidia"
    Driver         "nvidia"
    BusID          "PCI:1@0:0:0"
EndSection

Section "Device"
    Identifier     "nvidia"
    Driver         "nvidia"
    Option         "ConstrainCursor" "off"
    BusID          "PCI:4@0:0:0"
EndSection

Section "Device"
    Identifier     "nvidia"
    Driver         "nvidia"
    Option         "ConstrainCursor" "off"
    BusID          "PCI:7@0:0:0"
EndSection

Section "Device"
    Identifier     "nvidia"
    Driver         "nvidia"
    Option         "ConstrainCursor" "off"
    BusID          "PCI:8@0:0:0"
EndSection

Section "Device"
    Identifier     "nvidia"
    Driver         "nvidia"
    Option         "ConstrainCursor" "off"
    BusID          "PCI:10@0:0:0"
EndSection

Section "Screen"
    Identifier     "intel"
    Device         "intel"
    Monitor        "Monitor0"
EndSection

Section "Screen"
    Identifier     "nvidia"
    Device         "nvidia"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "on"
    Option         "IgnoreDisplayDevices" "CRT"
    Option         "ConstrainCursor" "off"
    Option         "Coolbits" "24"
    SubSection     "Display"
        Depth       24
        Modes      "nvidia-auto-select"
    EndSubSection
EndSection

Section "Screen"
    Identifier     "nvidia"
    Device         "nvidia"
    Monitor        "Monitor0"
    Option         "AllowEmptyInitialConfiguration" "on"
    Option         "IgnoreDisplayDevices" "CRT"
EndSection

Section "Screen"
    Identifier     "nvidia"
    Device         "nvidia"
    Monitor        "Monitor0"
    Option         "AllowEmptyInitialConfiguration" "on"
    Option         "IgnoreDisplayDevices" "CRT"
EndSection

Section "Screen"
    Identifier     "nvidia"
    Device         "nvidia"
    Monitor        "Monitor0"
    Option         "AllowEmptyInitialConfiguration" "on"
    Option         "IgnoreDisplayDevices" "CRT"
EndSection

Section "Screen"
    Identifier     "nvidia"
    Device         "nvidia"
    Monitor        "Monitor0"
    Option         "AllowEmptyInitialConfiguration" "on"
    Option         "IgnoreDisplayDevices" "CRT"
EndSection

Sandisk SSD 120GB, used dd to write the img to disk. Access to rigs only possible via SSH, no TV, no RDP (maybe VGA/HDMI if required).

When you have a few rigs, it's easier to identify them like this than by IP (at least in my case).
Code:
# hostname rig1
# echo "rig1" > /etc/hostname
# sed -i 's/m1-desktop/rig1/g' /etc/hosts
 then in oneBash
XXX_WORKER="$HOSTNAME"

Thanks for the help, I'll keep trying to fix the fan speed thing.

Your xorg.conf is most likely the problem.

I would try reimaging a USB, although depending on your version you might not need to: what version are you using?

Also, I don't recommend using dd to image USBs or SSDs; use Etcher for Linux instead.
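
One thing that stands out in the xorg.conf quoted above is that every NVIDIA Device section and every NVIDIA Screen section uses the same Identifier ("nvidia"), while the xorg.conf man page describes Identifier as a unique name. A rough sketch for spotting repeats in any config; dup_identifiers is a hypothetical helper, and duplicates may not be the only issue on that rig:

```shell
#!/usr/bin/env bash
# dup_identifiers [FILE]: print "<SectionType> <Identifier>" for every
# Identifier that appears more than once within the same section type.
# FILE defaults to /etc/X11/xorg.conf.
dup_identifiers() {
    awk -F'"' '
        $1 ~ /^[ \t]*Section[ \t]*$/    { type = $2 }
        $1 ~ /^[ \t]*Identifier[ \t]*$/ { if (++seen[type " " $2] == 2) print type, $2 }
    ' "${1:-/etc/X11/xorg.conf}"
}

# Example:
# dup_identifiers /etc/X11/xorg.conf
```

Each duplicate is reported once, no matter how many times it repeats.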


fullzero (OP)
Legendary
Activity: 1260
Merit: 1009
July 07, 2017, 04:57:54 PM
#1603

Hi. I cannot get Genoil to work on NiceHash.
Been trying for about an hour and finally gave up

After Genoil starts I immediately get an error about invalid argument and then it shows my BTC address.

I made a million changes to the oneBash line concerning NICE_ETHASH. Here is my latest (NOT WORKING)
My assumption is that this version of Genoil is not compatible with NH (I read somewhere NH made their own fork)
If I am wrong (hope so) please assist
Thanks

# NICE autoconverts to BTC: ensure you update BTC_ADDRESS if you use NICE
NICE_ETHASH_WORKER="nv$IP_AS_WORKER"
NICE_POOL="stratum+tcp://daggerhashimoto.usa.nicehash.com:3353"
GENOIL_NICE_POOL="daggerhashimoto.usa.nicehash.com:3353"
NICE_EXTENTION_ARGUMENTS="" 

if [ $COIN == "NICE_ETHASH" ]
then
 
if [ $GENOILorCLAYMORE == "GENOIL" ]
then
HCD='/home/m1/eth/Genoil-U/ethminer'
 
NICEADDR="$BTC_ADDRESS"
WORKER="$NICE_ETHASH_WORKER"
until $HCD -SP 2 -U -S $GENOIL_NICE_ETHASH_POOL -O $NICEADDR.$WORKER:x
 
   do
   echo "FAILURE; reinit in 5" >&2
   sleep 5
done
fi

The oneBash linked in the OP should work without modifications.

I have used NiceHash Ethash on some rigs since v0016 without issue.
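
For what it's worth, the modified snippet quoted above defines GENOIL_NICE_POOL but the launch line expands $GENOIL_NICE_ETHASH_POOL, which would leave -S without a pool address. A small sketch of a guard that catches an unset or empty variable before launching; require_vars is a hypothetical helper, not part of oneBash:

```shell
#!/usr/bin/env bash
# require_vars NAME...: return non-zero and name the first variable that is
# unset or empty, so a typo in a variable name fails loudly.
require_vars() {
    local name value
    for name in "$@"; do
        # eval-based indirection: reads the variable whose name is in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing or empty: $name" >&2
            return 1
        fi
    done
}

GENOIL_NICE_POOL="daggerhashimoto.usa.nicehash.com:3353"
require_vars GENOIL_NICE_POOL                  # passes silently
require_vars GENOIL_NICE_ETHASH_POOL || echo "check the pool variable name"
```

Calling the guard right before the `until $HCD ...` line would turn a silent empty argument into an explicit message.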

fullzero (OP)
Legendary
Activity: 1260
Merit: 1009
July 07, 2017, 04:58:48 PM
#1604

Question for the Linux gurus. My problems were coming from bad risers; once I changed them out, the PCI bus error messages stopped. The messages were saying that error correction had occurred. I found that there is a way to suppress these messages in grub by using the "pci=nomsi" option. Once I enabled this option, the operating system works (using Simplemining, about to try this on nvOC next) and the cards seem to work. So what is the danger/consequence of enabling this option and leaving it on, at least until I get a new batch of risers from China?

thanks

see:

https://ubuntuforums.org/showthread.php?t=1327209
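
For reference, pci=nomsi is a kernel boot parameter that disables MSI interrupts system-wide. A quick sketch for checking whether it is active on the running kernel; has_kernel_param is a made-up helper name:

```shell
#!/usr/bin/env bash
# has_kernel_param PARAM [FILE]: succeed if PARAM appears as a whole word in
# the kernel command line. FILE defaults to /proc/cmdline.
has_kernel_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
# if has_kernel_param pci=nomsi; then echo "MSI is disabled on this boot"; fi
```

On Ubuntu-based images the parameter normally lives in GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, and removing it there followed by `sudo update-grub` and a reboot restores the default behaviour.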

fullzero (OP)
Legendary
Activity: 1260
Merit: 1009
July 07, 2017, 04:59:57 PM
#1605

Really looking forward to all the new features slated for nvOC 0018. I tried implementing the nicehash profit switching from @salfter but it was a little beyond my Linux skills.

Any idea when we might see it released?

see OP, added SALFTER_NICEHASH_PROFIT_SWITCHING with new oneBash + modded switch file

fullzero (OP)
Legendary
Activity: 1260
Merit: 1009
July 07, 2017, 05:02:40 PM
#1606


i'm actually dreading the 0018 release... i've made so many changes to the 0017 oneBash and nvOC in general that moving them to the new version is gonna take forever.. currently working on automatic rig reboot in the event of a system freeze from the miner crashing.

@ tempgoga - I'm curious on your approach, this is next on my plate (without a remote power switch) the easiest  - that seems applicable to my setup - is to filter the Genoil watchdog script for a reduced hash rate threshold instead of just 'error' since that script filters stdout from Genoil so technically the info is there to capture already.
I haven't gotten around to this yet but I'm hoping this weekend.

Another idea that I'm thinking of is some sort of port-knocking from a remote machine - it could be enough since the rigs usually are responsive to ssh or local scripts after they "soft crash" with the video cards - but this won't help in case of a complete freeze.
Then you need a remote power cycle ability which is a whole different level of infra.

Cheers!


IAmNotAJeep, let me know if you're going to work on this more:

https://bitcointalk.org/index.php?topic=1854250.msg19943144#msg19943144

fullzero (OP)
Legendary
Activity: 1260
Merit: 1009
July 07, 2017, 05:10:32 PM
#1607

So new build, similar problem to the first rig I built myself. Getting the never ending "bootloop" when I fire up my mobo with everything plugged in. I know everyone says to unplug everything and try it one part at a time, which i will indeed do, but I was wondering if anyone else had this issue and found a more uniform way to fix it? Last time it was because my RAM was loose. This time my connections are all secure.

Background: building a trio rig with 2 1080ti and a 1080 mini using a 270 mobo with 850w psu. First mobo I used for this build didn't work at all. This one fires up but then goes into the endless loop. I have tried a different psu and a different RAM as well as different GPU. Leaning towards it being a faulty cpu but curious to see if anyone else has any other suggestions before I dismantle everything.

CPUs are almost never bad; sometimes you can have a bent mobo pin that causes CPU related problems.  However, I don't think that is the case here.

Maybe this will work:

Ensure the monitor is connected to the primary GPU (the one in the 16x slot closest to the CPU).

Disconnect the USB or SSD/HDD from the rig.

Fully power off everything, including the PSU.

Press the power button several times to clear any remaining power in the mobo.

Turn the PSU power switch back to | "on".

Power on (without the USB attached).

See if the BIOS posts; if you get nothing in 20 seconds, press ctrl + alt + del repeatedly until the system reboots.

Wait and see if the BIOS posts.

If the BIOS posts, attach the USB key and press ctrl + alt + del.

Let me know if this works.


Thanks a ton, will give it a shot. And if this doesn't work? Just break everything?

If this doesn't work I would try reimaging the USB key. (first ensure your downloaded zip produces the correct hash)

What kind of USB key are you using btw?

I've tried using the one that is on my working rig (same exact setup) and it didn't make a difference. Both USBs are Lexar JumpDrive S75 32GB

I would try swapping out the risers first; if you are using risers.  If you have an identical setup that works with the same USB; then it is almost for sure a hardware problem.

Are the bios settings and version the same on the working and trouble rig?
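
On checking that the downloaded zip produces the correct hash before reimaging: a minimal sketch, assuming the published checksum is a SHA-256 digest (which algorithm the OP actually lists is not stated here). verify_sha256 and the file name in the example are made up for illustration:

```shell
#!/usr/bin/env bash
# verify_sha256 FILE EXPECTED: succeed if FILE's SHA-256 digest equals
# EXPECTED (hex string, compared case-insensitively).
verify_sha256() {
    local actual expected
    actual=$(sha256sum "$1" | awk '{ print $1 }')
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ "$actual" = "$expected" ]
}

# Example (hypothetical file name; paste the hash from the OP):
# verify_sha256 nvOC.zip "<hash from the OP>" && echo "hash OK" || echo "hash MISMATCH"
```

A mismatch means the download itself is damaged, so reimaging from it would not help.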

TheCoinMine
Newbie
Activity: 39
Merit: 0
July 07, 2017, 05:24:52 PM
#1608

So new build, similar problem to the first rig I built myself. Getting the never ending "bootloop" when I fire up my mobo with everything plugged in. I know everyone says to unplug everything and try it one part at a time, which i will indeed do, but I was wondering if anyone else had this issue and found a more uniform way to fix it? Last time it was because my RAM was loose. This time my connections are all secure.

Background: building a trio rig with 2 1080ti and a 1080 mini using a 270 mobo with 850w psu. First mobo I used for this build didn't work at all. This one fires up but then goes into the endless loop. I have tried a different psu and a different RAM as well as different GPU. Leaning towards it being a faulty cpu but curious to see if anyone else has any other suggestions before I dismantle everything.

CPUs are almost never bad; sometimes you can have a bent mobo pin that causes CPU related problems.  However, I don't think that is the case here.

Maybe this will work:

Ensure the monitor is connected to the primary GPU (the one in the 16x slot closest to the CPU).

Disconnect the USB or SSD/HDD from the rig.

Fully power off everything, including the PSU.

Press the power button several times to clear any remaining power in the mobo.

Turn the PSU power switch back to | "on".

Power on (without the USB attached).

See if the BIOS posts; if you get nothing in 20 seconds, press ctrl + alt + del repeatedly until the system reboots.

Wait and see if the BIOS posts.

If the BIOS posts, attach the USB key and press ctrl + alt + del.

Let me know if this works.


Thanks a ton, will give it a shot. And if this doesn't work? Just break everything?

If this doesn't work I would try reimaging the USB key. (first ensure your downloaded zip produces the correct hash)

What kind of USB key are you using btw?

I've tried using the one that is on my working rig (same exact setup) and it didn't make a difference. Both USBs are Lexar JumpDrive S75 32GB

I would try swapping out the risers first; if you are using risers.  If you have an identical setup that works with the same USB; then it is almost for sure a hardware problem.

Are the bios settings and version the same on the working and trouble rig?


I can't even get the bios to post. I swapped all the parts minus the mobo from bad rig to good rig, and they worked fine. so i'm guessing I have my answer

TenaciousJ
Full Member
Activity: 122
Merit: 100
July 07, 2017, 06:09:18 PM
#1609


I can't even get the bios to post. I swapped all the parts minus the mobo from bad rig to good rig, and they worked fine. so i'm guessing I have my answer

This may have been suggested before, but I'm lazy and don't like to read so I'll just toss this out there and take the heat later.

Have you tried adding power to an ATX4P connection on the motherboard (or molex 4 pin connector if it has one) to supply additional power to the PCIe connections?  Even if the risers are separately powered, sometimes the board can't produce enough juice to the PCIe slots to support GPUs in all 6 positions. 

I have a Gigabyte Z170x-UD5 TH mobo that had that bootloop issue that drove me nuts for a few days straight. It wouldn't run more than 5 cards when there are 6 slots (that was with all gtx 1070s) so I finally discovered the purpose of the ATX4P connector and added a SATA power cable to the board and now it boots fine with 5 1070s on risers and a 1080ti in the last onboard slot.

Sorry if this is redundant info.

fullzero (OP)
Legendary
Activity: 1260
Merit: 1009
July 07, 2017, 07:01:54 PM
#1610


I can't even get the bios to post. I swapped all the parts minus the mobo from bad rig to good rig, and they worked fine. so i'm guessing I have my answer

This may have been suggested before, but I'm lazy and don't like to read so I'll just toss this out there and take the heat later.

Have you tried adding power to an ATX4P connection on the motherboard (or molex 4 pin connector if it has one) to supply additional power to the PCIe connections?  Even if the risers are separately powered, sometimes the board can't produce enough juice to the PCIe slots to support GPUs in all 6 positions. 

I have a Gigabyte Z170x-UD5 TH mobo that had that bootloop issue that drove me nuts for a few days straight. It wouldn't run more than 5 cards when there are 6 slots (that was with all gtx 1070s) so I finally discovered the purpose of the ATX4P connector and added a SATA power cable to the board and now it boots fine with 5 1070s on risers and a 1080ti in the last onboard slot.

Sorry if this is redundant info.

Sometimes hardware is bad.  Having a redundant setup is very helpful in identifying trouble components as you have.

TheCoinMine
Newbie
Activity: 39
Merit: 0
July 07, 2017, 07:15:48 PM
#1611


I can't even get the bios to post. I swapped all the parts minus the mobo from bad rig to good rig, and they worked fine. so i'm guessing I have my answer

This may have been suggested before, but I'm lazy and don't like to read so I'll just toss this out there and take the heat later.

Have you tried adding power to an ATX4P connection on the motherboard (or molex 4 pin connector if it has one) to supply additional power to the PCIe connections?  Even if the risers are separately powered, sometimes the board can't produce enough juice to the PCIe slots to support GPUs in all 6 positions. 

I have a Gigabyte Z170x-UD5 TH mobo that had that bootloop issue that drove me nuts for a few days straight. It wouldn't run more than 5 cards when there are 6 slots (that was with all gtx 1070s) so I finally discovered the purpose of the ATX4P connector and added a SATA power cable to the board and now it boots fine with 5 1070s on risers and a 1080ti in the last onboard slot.

Sorry if this is redundant info.



I'm only running 3 cards so idk if that is super applicable but if nothing else works I'll see if that helps. Thanks for the suggestion

TenaciousJ
Full Member
Activity: 122
Merit: 100
July 07, 2017, 07:30:33 PM
#1612



Powerstates are weird in Linux; usually don't change if you issue the command to change them.  I suspect this particular driver disallows nvidia-settings control over them.

Also you will need to use a higher OC offset to match the results from windows; as the OC curve is different in linux.

It seems like the power state will not change while the miner is running, since the value is locked while in use by the miner. This is probably because the xserver p-state setting is initialized when Linux loads, but the miner changes the p-state settings after that, which overrides the OS default settings. Once that's done, you have to stop the miner, reset the p-state in the nvidia config panel, and then run the miner again... but the problem is that oneBash changes the setting back when it loads.  I've tested setting the cards individually to 'prefer max performance' in nvidia's control panel while the miner is running.  The p-state doesn't change.  As soon as the miner is closed, the p-state goes up to max settings as you'd expect.

One way I think you could avoid this is by not setting power states in oneBash, but rather set them manually in the nvidia xserver settings so they are in place before oneBash runs.  I'm sure there's a way to automate the P-state through nvidia xserver config file, but I'd have to dig around to find out how precisely to do it.  That also means you'd have to remove the p-state config from the oneBash options to avoid it being changed back by the miner script.

fullzero (OP)
Legendary
Activity: 1260
Merit: 1009
July 07, 2017, 07:39:36 PM
#1613



Powerstates are weird in Linux; usually don't change if you issue the command to change them.  I suspect this particular driver disallows nvidia-settings control over them.

Also you will need to use a higher OC offset to match the results from windows; as the OC curve is different in linux.

It seems like the power state will not change while the miner is running, since the value is locked while in use by the miner. This is probably because the xserver p-state setting is initialized when Linux loads, but the miner changes the p-state settings after that, which overrides the OS default settings. Once that's done, you have to stop the miner, reset the p-state in the nvidia config panel, and then run the miner again... but the problem is that oneBash changes the setting back when it loads.  I've tested setting the cards individually to 'prefer max performance' in nvidia's control panel while the miner is running.  The p-state doesn't change.  As soon as the miner is closed, the p-state goes up to max settings as you'd expect.

One way I think you could avoid this is by not setting power states in oneBash, but rather set them manually in the nvidia xserver settings so they are in place before oneBash runs.  I'm sure there's a way to automate the P-state through nvidia xserver config file, but I'd have to dig around to find out how precisely to do it.  That also means you'd have to remove the p-state config from the oneBash options to avoid it being changed back by the miner script.

In my tests different GPUs had different levels of responsiveness to the power state cmds.  Maybe the new driver will work better overall.

There is no explicit power state implementation in oneBash; if it is changed, it is due to internal workings of the nvidia API when either OC or power limit is applied.  It is possible that adding power state cmds after the power limit and OC, but before launching the mining client, will work.


min3333r
Newbie
Activity: 25
Merit: 0
July 07, 2017, 07:48:30 PM
#1614

Your xorg.conf is most likely the problem.

I would try reimaging a USB, although depending on your version you might not need to: what version are you using?

Also, I don't recommend using dd to image USBs or SSDs; use Etcher for Linux instead.

Tried using hdd raw copy on win machine and etcher - same issue.

Tried burning the image on SSD and USB - same.

Using v0017 and latest oneBash

Could anyone provide me their xorg.conf which works with fan speed changing?

Thanks
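
Not a full xorg.conf, but one thing that can be checked on any config: per NVIDIA's driver README, manual fan control is gated by bit 2 (value 4) of the Coolbits option. The config posted earlier in the thread sets Coolbits to "24" (16 + 8), which does not include that bit. A minimal sketch of the bit test; the function name is made up for illustration, and this may well not be the only issue on that rig:

```shell
#!/usr/bin/env bash
# coolbits_allows_fan VALUE: succeed if bit 2 (value 4) is set in VALUE.
coolbits_allows_fan() {
    [ $(( $1 & 4 )) -ne 0 ]
}

for value in 24 28; do
    if coolbits_allows_fan "$value"; then
        echo "Coolbits $value: fan control bit set"
    else
        echo "Coolbits $value: fan control bit NOT set"
    fi
done
```

With the bit set and X actually running on the NVIDIA driver, the fan command usually also needs manual control switched on first, along the lines of `nvidia-settings -a '[gpu:0]/GPUFanControlState=1' -a '[fan:0]/GPUTargetFanSpeed=75'` (the GPUFanControlState part is an addition here, not something from the posts above).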

TenaciousJ
Full Member
Activity: 122
Merit: 100
July 07, 2017, 08:46:56 PM
Last edit: July 07, 2017, 09:16:01 PM by TenaciousJ
#1615



Powerstates are weird in Linux; usually don't change if you issue the command to change them.  I suspect this particular driver disallows nvidia-settings control over them.

Also you will need to use a higher OC offset to match the results from windows; as the OC curve is different in linux.

It seems like the power state will not change while the miner is running, since the value is locked while in use by the miner. This is probably because the xserver p-state setting is initialized when Linux loads, but the miner changes the p-state settings after that, which overrides the OS default settings. Once that's done, you have to stop the miner, reset the p-state in the nvidia config panel, and then run the miner again... but the problem is that oneBash changes the setting back when it loads.  I've tested setting the cards individually to 'prefer max performance' in nvidia's control panel while the miner is running.  The p-state doesn't change.  As soon as the miner is closed, the p-state goes up to max settings as you'd expect.

One way I think you could avoid this is by not setting power states in oneBash, but rather set them manually in the nvidia xserver settings so they are in place before oneBash runs.  I'm sure there's a way to automate the P-state through nvidia xserver config file, but I'd have to dig around to find out how precisely to do it.  That also means you'd have to remove the p-state config from the oneBash options to avoid it being changed back by the miner script.

In my tests different GPUs had different levels of responsiveness to the power state cmds.  Maybe the new driver will work better overall.

There is no explicit power state implementation in oneBash; if it is changed, it is due to internal workings of the nvidia API when either OC or power limit is applied.  It is possible that adding power state cmds after the power limit and OC, but before launching the mining client, will work.




How tricky would it be to use nvidia-settings from within oneBash to run the following command in a loop for each card that's detected when it runs, rather than setting a specific power limit or clock offset?

Enable PowerMizer (Prefer Maximum Performance)

nvidia-settings -a '[gpu:0]/GPUPowerMizerMode=1'


essentially this, but with more efficient code because I never was good at foreach loops and such...

if [ "$POWERLIMIT" = "NO" ]
then
sudo nvidia-settings -a '[gpu:0]/GPUPowerMizerMode=1'
sudo nvidia-settings -a '[gpu:1]/GPUPowerMizerMode=1'
sudo nvidia-settings -a '[gpu:2]/GPUPowerMizerMode=1'
sudo nvidia-settings -a '[gpu:3]/GPUPowerMizerMode=1'
sudo nvidia-settings -a '[gpu:4]/GPUPowerMizerMode=1'
sudo nvidia-settings -a '[gpu:5]/GPUPowerMizerMode=1'
fi
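
The same thing as a loop, for what it's worth. This is only a sketch: it assumes `nvidia-smi -L` prints one line per GPU, it prints the commands instead of running them, and powermizer_cmds is a made-up name, not something from oneBash:

```shell
#!/usr/bin/env bash
# powermizer_cmds COUNT: print one nvidia-settings command per GPU index
# 0..COUNT-1. Printing instead of executing keeps this a dry run.
powermizer_cmds() {
    local i=0
    while [ "$i" -lt "$1" ]; do
        echo "nvidia-settings -a '[gpu:$i]/GPUPowerMizerMode=1'"
        i=$((i + 1))
    done
}

# Count GPUs from `nvidia-smi -L`; yields 0 if nvidia-smi is not available.
GPU_COUNT=$(nvidia-smi -L 2>/dev/null | grep -c '^GPU ' || true)
powermizer_cmds "${GPU_COUNT:-0}"

# To apply for real (needs a working X display):
# powermizer_cmds "$GPU_COUNT" | sh
```

Because the count comes from the driver, the same snippet works unchanged on 3-card and 6-card rigs.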


On a side note, I updated to the most recent nvidia open source drivers (v. 381) and now my 1080 ti is recognized, so that's a plus.  Before it just listed as 'graphical device'

lbrasi
Newbie
Activity: 26
Merit: 0
July 08, 2017, 04:13:34 AM
#1616

I have had a lot of requests for this; so here is a new oneBash and modded switch file which implement full integration of SALFTER_NICEHASH_PROFIT_SWITCHING

see the OP for links:

Replace your current oneBash with the new one.

extract switch and move it to the:
Code:
 /home/m1

directory

(the one which opens when you click the Files icon on the left)

configure the following in oneBash

Code:
SALFTER_NICEHASH_PROFIT_SWITCHING="YES"

# LOCAL will attach the mining process to the guake terminal
# REMOTE will leave it unattached / ready for SSH
LOCALorREMOTE="LOCAL"       # LOCAL  or  REMOTE

CURRENCY=USD
POWER_COST=0.10
MINIMUM_PROFIT=0.0
# this is salfters BTC address:
PAYMENT_ADDRESS=1TipsGocnz2N5qgAm9f7JLrsMqkb3oXe2
WORKER_NAME=nv$IP_AS_WORKER

daggerhashimoto_POWERLIMIT_WATTS=125
__daggerhashimoto_CORE_OVERCLOCK=100
daggerhashimoto_MEMORY_OVERCLOCK=100
_______daggerhashimoto_FAN_SPEED=75

equihash_POWERLIMIT_WATTS=125
__equihash_CORE_OVERCLOCK=100
equihash_MEMORY_OVERCLOCK=100
_______equihash_FAN_SPEED=75

neoscrypt_POWERLIMIT_WATTS=125
__neoscrypt_CORE_OVERCLOCK=100
neoscrypt_MEMORY_OVERCLOCK=100
_______neoscrypt_FAN_SPEED=75

lyra2rev2_POWERLIMIT_WATTS=125
__lyra2rev2_CORE_OVERCLOCK=100
lyra2rev2_MEMORY_OVERCLOCK=100
_______lyra2rev2_FAN_SPEED=75

lbry_POWERLIMIT_WATTS=125
__lbry_CORE_OVERCLOCK=100
lbry_MEMORY_OVERCLOCK=100
_______lbry_FAN_SPEED=75

pascal_POWERLIMIT_WATTS=125
__pascal_CORE_OVERCLOCK=100
pascal_MEMORY_OVERCLOCK=100
_______pascal_FAN_SPEED=75

Remember to thank salfter if you use this  :)
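
The per-algorithm names above follow an `<algo>_<SETTING>` pattern, so a script could look up the right value from the algorithm name alone. How the modded switch file actually reads them is not shown in this thread; the sketch below is only one possible way, and algo_setting is a hypothetical helper:

```shell
#!/usr/bin/env bash
# algo_setting ALGO SUFFIX: print the value of the variable named
# "<ALGO>_<SUFFIX>", e.g. equihash_POWERLIMIT_WATTS (empty if unset).
algo_setting() {
    # eval-based indirection: builds the variable name, then expands it
    eval "printf '%s\n' \"\${${1}_${2}:-}\""
}

equihash_POWERLIMIT_WATTS=125
equihash_MEMORY_OVERCLOCK=100
algo_setting equihash POWERLIMIT_WATTS
```

Note that the underscore-padded names in the config (for example __equihash_CORE_OVERCLOCK) include the padding as part of the variable name, so a lookup for those would have to include it too.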



Thanks for implementing this, but for some odd reason I keep getting two instances of the miner screen running which causes the system to crash, I will do some more testing to try and figure out what is going on.

EDIT: Actually the kill code does not seem to work causing multiple miner screens, this is how the system is crashing.

TenaciousJ
Full Member
Activity: 122
Merit: 100
July 08, 2017, 06:02:37 AM
#1617

I finally got my system stable with 6x 1070's after swapping mobos to an asrock z270 (the ga-z170x-ud5 th mobo i had the cards on seemed stable, but froze up and then went back to the old bootloop scenario again) - I just have one glitch that keeps creeping up. 1 gpu is only being utilized at about 66% consistently, where the rest are at 99% - overall it drops my hashrate 100 sol/s.  Any clue what might cause that to happen?   I'm running v. 0017.
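
To pin down which card is the slow one, per-GPU utilization can be read from nvidia-smi's CSV query output. A small sketch; low_util_gpus is a made-up helper and the 90% cut-off is just an example value:

```shell
#!/usr/bin/env bash
# low_util_gpus MIN: read "index, utilization" CSV lines on stdin and print
# the index of every GPU whose utilization is below MIN percent.
low_util_gpus() {
    awk -F', *' -v min="$1" '$2 + 0 < min + 0 { print $1 }'
}

# Example (needs the NVIDIA driver installed):
# nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader,nounits | low_util_gpus 90
```

Once the index is known, swapping that card's riser or slot with a good one shows whether the problem follows the card or stays with the slot.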



tempgoga
Newbie
Activity: 29
Merit: 0
July 08, 2017, 08:12:14 AM
#1618


i'm actually dreading the 0018 release... i've made so many changes to the 0017 oneBash and nvOC in general that moving them to the new version is gonna take forever.. currently working on automatic rig reboot in the event of a system freeze from the miner crashing.

@ tempgoga - I'm curious on your approach, this is next on my plate (without a remote power switch) the easiest  - that seems applicable to my setup - is to filter the Genoil watchdog script for a reduced hash rate threshold instead of just 'error' since that script filters stdout from Genoil so technically the info is there to capture already.
I haven't gotten around to this yet but I'm hoping this weekend.

Another idea that I'm thinking of is some sort of port-knocking from a remote machine - it could be enough since the rigs usually are responsive to ssh or local scripts after they "soft crash" with the video cards - but this won't help in case of a complete freeze.
Then you need a remote power cycle ability which is a whole different level of infra.

Cheers!


Sorry for the late response. Right now I'm trying to initiate a system reboot in the event that the Xorg service takes up 98% or more CPU for 10 seconds or longer. That happens every time any miner crashes: Xorg always shoots up to 99-100% CPU and stays there. I'm trying to use monit for this; will update if it works.

i like your port-knocking idea.
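
The "98% or more for 10 seconds" condition can be sketched without monit as a check over the most recent samples. This is only the decision logic, with the sampling left out; sustained_high is a hypothetical name and the samples are assumed to be one Xorg CPU percentage per second, oldest first:

```shell
#!/usr/bin/env bash
# sustained_high THRESHOLD NEEDED SAMPLE...: succeed if at least NEEDED
# samples are given and the last NEEDED of them are all >= THRESHOLD.
sustained_high() {
    local threshold=$1 needed=$2
    shift 2
    [ "$#" -ge "$needed" ] || return 1
    shift $(( $# - needed ))
    printf '%s\n' "$@" |
        awk -v t="$threshold" '$1 + 0 < t + 0 { bad = 1 } END { exit bad }'
}

# Example with 3 samples needed instead of 10:
# sustained_high 98 3 12 99 100 99 && echo "would reboot now"
```

Requiring consecutive samples avoids rebooting on a single spike; the actual reboot command and how the samples are collected are deliberately not part of the sketch.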

IAmNotAJeep
Newbie
Activity: 44
Merit: 0
July 08, 2017, 12:34:21 PM
#1619


i'm actually dreading the 0018 release... i've made so many changes to the 0017 oneBash and nvOC in general that moving them to the new version is gonna take forever.. currently working on automatic rig reboot in the event of a system freeze from the miner crashing.

@ tempgoga - I'm curious on your approach, [... snip]

Sorry for the late response. Right now I'm trying to initiate a system reboot in the event that the Xorg service takes up 98% or more CPU for 10 seconds or longer. That happens every time any miner crashes: Xorg always shoots up to 99-100% CPU and stays there. I'm trying to use monit for this; will update if it works.

i like your port-knocking idea.

Hey, monit looks nice. I'm about halfway to taking the output of Genoil and turning it into a heartbeat, then using it to remotely cycle the server. The hard part is defining the conditions for when to trigger that event.
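
One possible trigger condition is "the last reported hash rate is missing or below a floor". A rough sketch, assuming the miner log contains figures written like 148.97MH/s; the exact Genoil output format is an assumption here, and both function names are made up:

```shell
#!/usr/bin/env bash
# last_hashrate: read miner log text on stdin and print the last number that
# is directly followed by "MH/s" (an optional space in between is allowed).
last_hashrate() {
    grep -oE '[0-9]+(\.[0-9]+)? ?MH/s' | tail -n 1 | sed 's/ *MH\/s$//'
}

# below_threshold RATE MIN: succeed if RATE is empty or lower than MIN.
below_threshold() {
    [ -z "$1" ] || awk -v r="$1" -v m="$2" 'BEGIN { exit !(r + 0 < m + 0) }'
}

# Example (hypothetical log path and floor):
# rate=$(tail -n 50 /tmp/miner.log | last_hashrate)
# below_threshold "$rate" 140 && echo "heartbeat missed"
```

Treating an empty reading as a failure covers the case where the miner has stopped printing altogether.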

Nexillus
Full Member
Activity: 169
Merit: 100
July 08, 2017, 01:27:27 PM
#1620

Got my second rig up and running, currently tuning the 1060s with ethminer (Genoil fork) on nvOC v0017.

So far getting around 149MH/s total on 6 cards with these settings. Short-term stability is there; long-term is currently unknown.

Getting about 24.83MH/s on each.

5 cards:
PL: 125
Core: -100
Mem: 1700

1 card:
PL: 125
Core: -100
Mem: 1700

Anybody else have settings for their 1060s? I am curious what others have gotten so far.

Besides stability for the OC, going to be stepping down the power to optimize it.