Proofer
Member
Offline
Activity: 266
Merit: 36
|
|
January 02, 2012, 11:36:10 PM |
|
Feature request...
On startup output all settings to stderr.
Edit: Or not. ... I realize I can in effect do this myself by adding a command to my cgminer startup script to copy cgminer.conf to my logs dir, adding a timestamp to the filename. And for extra credit, copy only if it's been changed since the last copy.
Here's my current startup script:
    xhost + &> /dev/null
    now="`date +%Y.%m.%d.%H.%M.%S`"
    cd ~/miners/cgminer # or wherever your cgminer directory is
    # logs/ is assumed to be in the cgminer directory
    diff cgminer.conf $(ls -1 logs/*.conf | tail -n 1) &> /dev/null || cp cgminer.conf logs/$now.conf
    DISPLAY=:0 cgminer -c cgminer.conf 2> logs/$now.log
This results in files in the logs directory like this:
    2012.01.01.21.12.05.log
    2012.01.02.07.24.45.log
    2012.01.02.08.39.02.conf
    2012.01.02.08.39.02.log
    2012.01.02.12.38.31.log
...where the .conf file is copied to logs (and renamed with a timestamp) only if it's changed since the last time it was copied.
Edit: my first copy/paste omitted the first line of the script.
|
|
|
|
kano
Legendary
Offline
Activity: 4620
Merit: 1851
Linux since 1997 RedHat 4
|
|
January 02, 2012, 11:41:28 PM |
|
Quote from: simonk83
Read my second reply, it was running for at least an hour the first time I noticed.
Besides, it's completely repeatable. Start 2.0.8 and I immediately get 420mH or thereabout. Start 2.0.7 or 2.1.1 and it's immediately capped at 380mH and sticks there.

Exactly the same .bin files?
|
|
|
|
simonk83
|
|
January 02, 2012, 11:45:37 PM |
|
Quote from: simonk83
Read my second reply, it was running for at least an hour the first time I noticed.
Besides, it's completely repeatable. Start 2.0.8 and I immediately get 420mH or thereabout. Start 2.0.7 or 2.1.1 and it's immediately capped at 380mH and sticks there.
Quote from: kano
Exactly the same .bin files?

Basically all I did was download the 2.0.7 zip, extract the folder to the desktop and add my bat file. Same goes for 2.1.1. I was previously using 2.0.7 full time before 2.0.8 (obviously) and it was all fine, so I can't really explain it.
|
|
|
|
kano
Legendary
Offline
Activity: 4620
Merit: 1851
Linux since 1997 RedHat 4
|
|
January 02, 2012, 11:59:02 PM |
|
Quote from: simonk83
Read my second reply, it was running for at least an hour the first time I noticed.
Besides, it's completely repeatable. Start 2.0.8 and I immediately get 420mH or thereabout. Start 2.0.7 or 2.1.1 and it's immediately capped at 380mH and sticks there.
Quote from: kano
Exactly the same .bin files?
Quote from: simonk83
Basically all I did was download the 2.0.7 zip, extract the folder to the desktop and add my bat file. Same goes for 2.1.1. I was previously using 2.0.7 full time before 2.0.8 (obviously) and it was all fine, so I can't really explain it.

No, I mean the actual compiled CL .bin file that does the GPU mining.
e.g. for my 6950 it is called: phatk110817Caymanbitalignv2w128long8.bin (I use mostly default options)
See if copying the 2.0.8 one into the 2.1.1 directory makes it run the same ...
Those .bin files are usually created once, the first time, and used forever after.
My 2.1.1 is still running one dated 13-Nov from when I first used a 2.0.8 version of the software in my miner directory, but of course my 2.0.8 versions were many and varied.
|
|
|
|
simonk83
|
|
January 03, 2012, 12:33:25 AM |
|
Right, gotcha. I'm at work at the mo so I'll mess with it when I get home
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
January 03, 2012, 02:32:58 AM |
|
Quote from: Proofer
Feature request...
    [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
    GPU 0: 69.5C 4535RPM | 357.0/363.8Mh/s | A:285 R:0 HW:0 U:5.02/m I: 9
    GPU 1: 74.0C | 366.4/363.9Mh/s | A:299 R:0 HW:0 U:5.26/m I: 9
    GPU 2: 67.5C 4108RPM | 372.9/363.8Mh/s | A:289 R:0 HW:0 U:5.09/m I: 9
    GPU 3: 62.5C | 366.4/363.7Mh/s | A:262 R:0 HW:0 U:4.61/m I: 9
    GPU 4: 68.0C 3564RPM | 370.8/363.6Mh/s | A:294 R:0 HW:0 U:5.18/m I: 9
    GPU 5: 71.0C | 340.5/363.6Mh/s | A:318 R:1 HW:0 U:5.60/m I: 9
These are three 5970s. auto-fan is on with a target of 70C for all, 3C hysteresis. At this snapshot GPUs 1 and 5 ran 3C-4.5C hotter than their card-mates, and GPU 3 ran 5C cooler than its mate. I believe that because GPUs 1, 3, and 5 don't return fan values, cgminer is ignoring their temps w/r auto-fan. Assuming that cgminer can't tell via ADL or otherwise that two GPUs share a fan, I would like to be able to tell that to cgminer and thus have my temp targets applied to (in my case) odd-numbered GPUs as well as to even-numbered ones.

I cannot code for a 5970 or 6990 without poking and prodding them with code, and since I don't own one, it's unlikely to happen in a safe manner. If I just guess, I'll likely do something which could be bad...
|
Developer/maintainer for cgminer, ckpool/ckproxy, and the -ck kernel 2% Fee Solo mining at solo.ckpool.org -ck
|
|
|
Proofer
Member
Offline
Activity: 266
Merit: 36
|
|
January 03, 2012, 03:50:46 AM |
|
Quote from: Proofer
Feature request... ... I believe that because GPUs 1, 3, and 5 don't return fan values, cgminer is ignoring their temps w/r auto-fan. Assuming that cgminer can't tell via ADL or otherwise that two GPUs share a fan, I would like to be able to tell that to cgminer and thus have my temp targets applied to (in my case) odd-numbered GPUs as well as to even-numbered ones.
Quote from: -ck
I cannot code for a 5970 or 6990 without poking and prodding them with code, and since I don't own one, it's unlikely to happen in a safe manner. If I just guess, I'll likely do something which could be bad...

I might've been unclear. I was suggesting that the user have the option to specify to the software, presumably via .conf or command line, that certain GPUs comprise a "fan group," i.e., share a fan, and also which of the group has the fan output and control. I don't know, something like, in my case,
    "fan-group" : "0,1/0, 2,3/2, 4,5/5"
...meaning GPUs 0 and 1 share a fan, the speed of which is readable and controllable via GPU 0; etc.
What I'm thinking of would not require any additional hardware coding, but it would require additional fan-control logic within cgminer.
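A minimal sketch of how a value in the suggested "fan-group" format could be split into groups. The option name and its syntax are only the proposal above, not an existing cgminer setting, and parse_fan_groups is a hypothetical helper name used for illustration.

```sh
#!/bin/sh
# Hypothetical helper: "fan-group" is the syntax proposed in the post,
# not a real cgminer option.
# Prints one line per group: "controller=<gpu> members=<gpu> <gpu> ..."
parse_fan_groups() {
    # Drop spaces so "0,1/0, 2,3/2" parses the same as "0,1/0,2,3/2".
    rest=$(printf '%s' "$1" | tr -d ' ')
    while [ -n "$rest" ]; do
        members=${rest%%/*}       # text before the first "/", e.g. "0,1"
        tail=${rest#*/}           # text after the first "/"
        controller=${tail%%,*}    # GPU that exposes the fan, e.g. "0"
        printf 'controller=%s members=%s\n' "$controller" \
            "$(printf '%s' "$members" | tr ',' ' ')"
        case $tail in
            *,*) rest=${tail#*,} ;;   # more groups follow the controller
            *)   rest= ;;             # that was the last group
        esac
    done
}

# The example string exactly as written in the post.
parse_fan_groups "0,1/0, 2,3/2, 4,5/5"
```

The example string is copied as posted. In the status output quoted earlier it is GPU 4, not GPU 5, that reports a fan speed, so the last group may have been meant as "4,5/4".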
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
January 03, 2012, 03:53:04 AM |
|
Quote from: Proofer
Feature request... ... I believe that because GPUs 1, 3, and 5 don't return fan values, cgminer is ignoring their temps w/r auto-fan. Assuming that cgminer can't tell via ADL or otherwise that two GPUs share a fan, I would like to be able to tell that to cgminer and thus have my temp targets applied to (in my case) odd-numbered GPUs as well as to even-numbered ones.
Quote from: -ck
I cannot code for a 5970 or 6990 without poking and prodding them with code, and since I don't own one, it's unlikely to happen in a safe manner. If I just guess, I'll likely do something which could be bad...
Quote from: Proofer
I might've been unclear. I was suggesting that the user have the option to specify to the software, presumably via .conf or command line, that certain GPUs comprise a "fan group," i.e., share a fan, and also which of the group has the fan output and control. I don't know, something like, in my case,
    "fan-group" : "0,1/0, 2,3/2, 4,5/5"
...meaning GPUs 0 and 1 share a fan, the speed of which is readable and controllable via GPU 0; etc.
What I'm thinking of would not require any additional hardware coding, but it would require additional fan-control logic within cgminer.

No, that's actually unnecessary because the ADL does have information about shared thermal devices... interpreting the results would need prodding though.
|
|
|
|
Proofer
Member
Offline
Activity: 266
Merit: 36
|
|
January 03, 2012, 04:00:15 AM |
|
Quote from: -ck
No, that's actually unnecessary because the ADL does have information about shared thermal devices... interpreting the results would need prodding though.

Sorry, I don't understand. Interpreting what results? If you mean additional ADL results, then forgo that and just let the user tell you as I suggested. Then you already have the temps and the fan speed and can control the latter. I am suggesting that you use both relevant core temps when calculating a new auto-fan speed for a card instead of just one temp.
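One way to read "use both relevant core temps" is to let the hotter of the two cores drive the shared fan. The sketch below only illustrates that idea with the GPU 0 and GPU 1 temperatures from the status output quoted earlier. hottest_temp is a hypothetical name, hysteresis is ignored, and nothing here reflects how cgminer's auto-fan is actually implemented.

```sh
#!/bin/sh
# Hypothetical helper, not cgminer code: print the highest of the given
# temperatures (degrees C, may be fractional) with one decimal place.
hottest_temp() {
    printf '%s\n' "$@" |
        awk 'NR == 1 || $1 + 0 > max { max = $1 + 0 } END { printf "%.1f\n", max }'
}

target=70                        # auto-fan target mentioned in the thread
hot=$(hottest_temp 69.5 74.0)    # GPU 0 and GPU 1 share one fan

# awk does the floating-point comparison; it exits 0 when hot > target.
if awk -v t="$hot" -v target="$target" 'BEGIN { exit !(t > target) }'; then
    echo "hottest core is ${hot}C, above the ${target}C target: raise fan speed"
else
    echo "hottest core is ${hot}C, at or below the ${target}C target"
fi
```

With only GPU 0's 69.5C considered, the fan would look on target. Taking the maximum makes GPU 1's 74.0C count as well.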
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
January 03, 2012, 04:02:25 AM |
|
I'm sick of adding special case command line parameters...
|
|
|
|
Proofer
Member
Offline
Activity: 266
Merit: 36
|
|
January 03, 2012, 04:08:47 AM |
|
Over in mining hardware I just whined that I had an instance of a "SICK" GPU even after falling back to pretty vanilla settings of gpu-engine 725 (stock) and gpu-memclock 300 for my 5970s. Is there any chance that SICK like the following is not a GPU hardware issue?
    [2012-01-02 17:56:39] Thread 2 idle for more than 60 seconds, GPU 2 declared SICK!
    [2012-01-02 17:56:39] Attempting to restart GPU
    [2012-01-02 17:56:39] Thread 2 still exists, killing it off
    [2012-01-02 17:56:39] Thread 8 still exists, killing it off
    [2012-01-02 17:56:39] Thread 2 restarted
    [2012-01-02 17:56:40] Thread 8 restarted
    [2012-01-02 17:56:40] Accepted 00000000.30702585.cb8fdf73 GPU 5 thread 11 pool 0
    [2012-01-02 17:56:41] Accepted 00000000.676a69c6.4b59b7db GPU 5 thread 5 pool 0
    [2012-01-02 17:56:43] Accepted 00000000.1e5767ae.f669070b GPU 2 thread 2 pool 0 # note how healthy it is now!
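To check whether the same GPU keeps getting declared SICK, the saved stderr logs can be tallied. This is a minimal sketch that assumes log lines in the format shown above. count_sick is a hypothetical helper name, not something cgminer provides.

```sh
#!/bin/sh
# Hypothetical helper, not part of cgminer.
# Reads cgminer log text on stdin and prints "GPU <n>: <count>" for every
# GPU that appears in a "declared SICK" line.
count_sick() {
    sed -n 's/.*GPU \([0-9][0-9]*\) declared SICK.*/\1/p' |
        sort -n | uniq -c |
        awk '{ printf "GPU %s: %s\n", $2, $1 }'
}

# Example: the SICK line from the log above plus a line that must be ignored.
count_sick <<'EOF'
[2012-01-02 17:56:39] Thread 2 idle for more than 60 seconds, GPU 2 declared SICK!
[2012-01-02 17:56:39] Attempting to restart GPU
EOF
```

Against a directory of timestamped logs this could be run as `cat logs/*.log | count_sick`.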
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
January 03, 2012, 04:17:43 AM |
|
Quote from: Proofer
Over in mining hardware I just whined that I had an instance of a "SICK" GPU even after falling back to pretty vanilla settings of gpu-engine 725 (stock) and gpu-memclock 300 for my 5970s. Is there any chance that SICK like the following is not a GPU hardware issue?
    [2012-01-02 17:56:39] Thread 2 idle for more than 60 seconds, GPU 2 declared SICK!
    [2012-01-02 17:56:39] Attempting to restart GPU
    [2012-01-02 17:56:39] Thread 2 still exists, killing it off
    [2012-01-02 17:56:39] Thread 8 still exists, killing it off
    [2012-01-02 17:56:39] Thread 2 restarted
    [2012-01-02 17:56:40] Thread 8 restarted
    [2012-01-02 17:56:40] Accepted 00000000.30702585.cb8fdf73 GPU 5 thread 11 pool 0
    [2012-01-02 17:56:41] Accepted 00000000.676a69c6.4b59b7db GPU 5 thread 5 pool 0
    [2012-01-02 17:56:43] Accepted 00000000.1e5767ae.f669070b GPU 2 thread 2 pool 0 # note how healthy it is now!

Anything's possible, but note that the restart code was tested extensively on literally dozens of GPUs to get this sick restart code working -when possible- and the person who helped me test it had 72 GPUs that would often have boxes going down with any other miner. The idea was to make it recover to a fine state after enough rest if possible. So yes it's possible. Maybe even likely, who knows, but this particular scenario was not unusual even at normal clocks when some GPUs were run flat out, regardless of which miner it was. Interestingly it became FAR more common with the phatk2 kernel (which is what is used in cgminer) since that seemed to run GPUs that little bit more than anything else.
|
|
|
|
LightRider
Legendary
Offline
Activity: 1500
Merit: 1022
I advocate the Zeitgeist Movement & Venus Project.
|
|
January 03, 2012, 04:23:17 AM |
|
Still causing the video driver to fail. I'll try a clean reinstall and see if that helps.
|
|
|
|
tnkflx
|
|
January 03, 2012, 09:15:42 AM |
|
Quote from: Proofer
Feature request...
    [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
    GPU 0: 69.5C 4535RPM | 357.0/363.8Mh/s | A:285 R:0 HW:0 U:5.02/m I: 9
    GPU 1: 74.0C | 366.4/363.9Mh/s | A:299 R:0 HW:0 U:5.26/m I: 9
    GPU 2: 67.5C 4108RPM | 372.9/363.8Mh/s | A:289 R:0 HW:0 U:5.09/m I: 9
    GPU 3: 62.5C | 366.4/363.7Mh/s | A:262 R:0 HW:0 U:4.61/m I: 9
    GPU 4: 68.0C 3564RPM | 370.8/363.6Mh/s | A:294 R:0 HW:0 U:5.18/m I: 9
    GPU 5: 71.0C | 340.5/363.6Mh/s | A:318 R:1 HW:0 U:5.60/m I: 9
These are three 5970s. auto-fan is on with a target of 70C for all, 3C hysteresis. At this snapshot GPUs 1 and 5 ran 3C-4.5C hotter than their card-mates, and GPU 3 ran 5C cooler than its mate. I believe that because GPUs 1, 3, and 5 don't return fan values, cgminer is ignoring their temps w/r auto-fan. Assuming that cgminer can't tell via ADL or otherwise that two GPUs share a fan, I would like to be able to tell that to cgminer and thus have my temp targets applied to (in my case) odd-numbered GPUs as well as to even-numbered ones.
Quote from: -ck
I cannot code for a 5970 or 6990 without poking and prodding them with code, and since I don't own one, it's unlikely to happen in a safe manner. If I just guess, I'll likely do something which could be bad...

Would it be beneficial if we get you a 6990?
|
| Operating electrum.be & us.electrum.be |
|
|
|
cuz0882
|
|
January 03, 2012, 11:01:21 AM |
|
I have a 6990 and 2x 6970s all set at 955 clock speed, but the 6970s each run about 20-30 hashes behind the 6990. I've tried reinstalling the video drivers. They all run within 1 hash of each other with guiminer so I'm a little lost on what could cause this. Any ideas?
|
|
|
|
cablepair
|
|
January 03, 2012, 12:10:01 PM |
|
ckolivas: I know everyone is coming at you from a million directions, but I have a very strange problem I would love your opinion on, or anyone else's for that matter.
I have four rigs, and three of them are working fine with cgminer.
The last rig is very problematic. At first I thought something was wrong with a single card, then with a single type of card, but now I realize it's not the cards.
All of my rigs use 890FXA-GD70 motherboards.
If I have five cards in this rig and start it mining with cgminer, then within 30-60 mins
one of the cards will show its fan at 0RPM and its temp at 127.5C (it's ALWAYS 127.5C for some reason), then the system will freeze up and Windows will crash.
If I put my hand on the card it does not feel hot at all, and I can visibly see the fan spinning at a normal speed.
If I move the cards around, or swap the cards out for ones I know work, it does not matter.
Now if I take it down to four cards on the motherboard instead of five,
the system does not crash and we do not see the 127.5C, but eventually one of the cards will display an incorrect fan speed. Right now it's a 5970: I see it in GPU2, displaying a 1RPM fan speed but hashing along at a normal speed, with the fan spinning at a normal rate, and it does not feel overly hot.
What could be causing this? I am dumbfounded here. Any help would be greatly appreciated and will result in a 1 btc tip for the person that gives me the right answer. Thanks!
|
|
|
|
cuz0882
|
|
January 03, 2012, 12:23:02 PM |
|
Quote from: cablepair
ckolivas: I know everyone is coming at you from a million directions, but I have a very strange problem I would love your opinion on, or anyone else's for that matter.
I have four rigs, and three of them are working fine with cgminer.
The last rig is very problematic. At first I thought something was wrong with a single card, then with a single type of card, but now I realize it's not the cards.
All of my rigs use 890FXA-GD70 motherboards.
If I have five cards in this rig and start it mining with cgminer, then within 30-60 mins
one of the cards will show its fan at 0RPM and its temp at 127.5C (it's ALWAYS 127.5C for some reason), then the system will freeze up and Windows will crash.
If I put my hand on the card it does not feel hot at all, and I can visibly see the fan spinning at a normal speed.
If I move the cards around, or swap the cards out for ones I know work, it does not matter.
Now if I take it down to four cards on the motherboard instead of five,
the system does not crash and we do not see the 127.5C, but eventually one of the cards will display an incorrect fan speed. Right now it's a 5970: I see it in GPU2, displaying a 1RPM fan speed but hashing along at a normal speed, with the fan spinning at a normal rate, and it does not feel overly hot.
What could be causing this? I am dumbfounded here. Any help would be greatly appreciated and will result in a 1 btc tip for the person that gives me the right answer. Thanks!

I would try managing the fans with MSI Afterburner and see if it still happens. Has the PC ever had Catalyst 11.12 installed on it? That does not really sound like the problem, but my 6 GPU system was not working until I removed some garbage files that 11.12 left behind before reinstalling 11.11. https://bitcointalk.org/index.php?topic=54972.msg655079#msg655079
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
January 03, 2012, 12:31:14 PM |
|
Quote from: -ck
I cannot code for a 5970 or 6990 without poking and prodding them with code, and since I don't own one, it's unlikely to happen in a safe manner. If I just guess, I'll likely do something which could be bad...
Quote from: tnkflx
Would it be beneficial if we get you a 6990?

That would most definitely come under the definition of rhetorical questions. Given 6990s cost more than any other card on the market, I think I know what the likelihood of that happening is, though. But just to be clear since I haven't answered: of course it would...
|
|
|
|
P4man
|
|
January 03, 2012, 12:34:58 PM |
|
Got another "network bug". This time with 2.1.1 (on Linux), while 2.0.8 (on Windows) did not get it. I was expecting the opposite really. Usually both machines got it simultaneously and I assumed 2.1.1 fixed it. Apparently not. Here is the debug output:
    [2012-01-03 12:31:48] json_rpc_call failed on get work, retry after 155 seconds
    [2012-01-03 12:31:48] HTTP request failed: The requested URL returned error: 503
    [2012-01-03 12:31:48] Failed json_rpc_call in get_upstream_work
    [2012-01-03 12:31:48] json_rpc_call failed on get work, retry after 155 seconds
    [2012-01-03 12:31:48] HTTP request failed: The requested URL returned error: 503
    [2012-01-03 12:31:48] Failed json_rpc_call in get_upstream_work
    [2012-01-03 12:31:48] json_rpc_call failed on get work, retry after 155 seconds
    [2012-01-03 12:31:49] Queueing getwork request to work thread
    [2012-01-03 12:31:49] Popping work from get queue to get work
    [2012-01-03 12:31:49] Popping work to work thread

    [2012-01-03 12:31:50] 19.5 C F: 40%(-1RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:31:50] 28.0 C F: 40%(1490RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:31:53] (5s):0.0 (avg):799.0 Mh/s | Q:14417 A:11676 R:2 HW:0 E:81% U:10.85/m
    [2012-01-03 12:31:53] 19.5 C F: 40%(-1RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:31:53] 27.5 C F: 40%(1493RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:31:54] HTTP request failed: The requested URL returned error: 503
    [2012-01-03 12:31:54] Failed json_rpc_call in get_upstream_work
    [2012-01-03 12:31:54] json_rpc_call failed on get work, retry after 155 seconds
    [2012-01-03 12:31:54] HTTP request failed: The requested URL returned error: 503
    [2012-01-03 12:31:54] Failed json_rpc_call in get_upstream_work
    [2012-01-03 12:31:54] json_rpc_call failed on get work, retry after 155 seconds
    [2012-01-03 12:31:55] HTTP request failed: The requested URL returned error: 503
    [2012-01-03 12:31:56] 19.5 C F: 40%(-1RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:31:56] 27.0 C F: 40%(1492RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:31:56] HTTP request failed: The requested URL returned error: 503
    [2012-01-03 12:31:56] Failed json_rpc_call in get_upstream_work
    [2012-01-03 12:31:56] json_rpc_call failed on get work, retry after 155 seconds
    [2012-01-03 12:31:59] (5s):0.0 (avg):798.9 Mh/s | Q:14417 A:11676 R:2 HW:0 E:81% U:10.85/m
    [2012-01-03 12:31:59] 19.5 C F: 40%(-1RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:31:59] 27.0 C F: 40%(1496RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%

    [2012-01-03 12:32:02] 19.5 C F: 40%(-1RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:32:02] 27.0 C F: 40%(1496RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:32:03] HTTP request failed: The requested URL returned error: 503

    [2012-01-03 12:32:04] HTTP request failed: The requested URL returned error: 503
    [2012-01-03 12:32:05] (5s):0.0 (avg):798.8 Mh/s | Q:14417 A:11676 R:2 HW:0 E:81% U:10.85/m
    [2012-01-03 12:32:05] 19.5 C F: 40%(-1RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
    [2012-01-03 12:32:05] 27.0 C F: 40%(1497RPM) E: 157MHz M: 300Mhz V: 0.950V A: 0% P: 0%
Restarting cgminer fixed it. Both primary and backup pools were working properly AFAICT.
|
|
|
|
Turbor
Legendary
Offline
Activity: 1022
Merit: 1000
BitMinter
|
|
January 03, 2012, 01:23:52 PM |
|
Am I the only one without problems? Win7 32, 2.1.1, 2 rigs, zero problems.
|
|
|
|
|