Bitcoin Forum
December 08, 2016, 08:35:46 PM *
Author Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.9.2  (Read 4823502 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
The00Dustin
Hero Member
*****
Offline Offline

Activity: 806


View Profile
March 23, 2012, 12:59:08 PM
 #4681

I noticed two small issues this morning, one which would be easy to reproduce but doesn't necessarily need fixed, the other which would be difficult to reproduce but should theoretically be fixed for best performance (assuming my observation was correct).

Issue one is related to the use of system time, my mining machine was about 6 minutes slow, over the course of a couple weeks with ntp not running.  I started ntp and when the time changed, cgminer immediately reported "pool not providing work enough" and subsequently switched pools.  This probably wouldn't have happened if ntp was running all along, which is why I think it is a minor issue, but it makes me wonder if DST causes dropped work and pool failovers when they shouldn't exist.  Since that would only be a problem once or twice a year, I still don't know if this would be worth fixing, because I'm guessing it would be difficult to fix (would require a unique time source or some sort of interrupt to deal with known time changes on the existing time source [assuming either one of those is even possible]).

Issue two is related to submission of stale shares.  Note that I did not have --submit-stale on the command line in this case, but I have previously seen cgminer submit detected stales to the pool because it was instructed to (pool is eligius).  This morning, I just happened to look at my miner while this was still on the screen:
Quote
[2012-03-23 08:22:53] LONGPOLL requested work restart, waiting on fresh work
[2012-03-23 08:22:57] Accepted 00000000.949b59d0.7ecd59ab GPU 0 thread 1 pool 0
[2012-03-23 08:23:03] Accepted 00000000.58e019ff.7a3bc657 GPU 0 thread 1 pool 0
[2012-03-23 08:23:05] Pool 0 communication failure, caching submissions
[2012-03-23 08:23:05] Stale share detected, discarding
[2012-03-23 08:23:07] Pool 0 communication resumed, submitting work
[2012-03-23 08:23:07] Accepted 00000000.ade4193a.07083649 GPU 0 thread 0 pool 0
I had recently restarted cgminer, so I can't be 100% certain cgminer was again instructed to submit stales, but assuming it was, I believe the discarded share above would indicate a bug in stale share handling in this scenario.  I also believe it would have been accepted based on the length of time between the last longpoll and this event and the fact that other shares were accepted before and after this one with no other new block event shown.

I can't do much about either of these, but I thought I would bring them up in case anyone who could might want to.
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218


Gerald Davis


View Profile
March 23, 2012, 01:02:57 PM
 #4682

If you want to avoid the second one there is a submit stale = true option available in the config file.

If the share is genuinely stale though that will simply increase the stale count on the pool server.  If the share is valid it will prevent cgminer from proactively killing it.
Hawkix
Hero Member
*****
Offline Offline

Activity: 517



View Profile WWW
March 23, 2012, 01:20:06 PM
 #4683

Why?  Compiling the kernel takes time and resources.  It makes the first start much slower.

That is the entire point of saving a bin file.

1) cgminer looks to see if a bin file exists (binary version of the OpenCL kernel)
2a) if no bin file exists, it compiles the OpenCL kernel using the CURRENTLY INSTALLED SDK and saves a copy as a bin file
2b) if a bin file exists, cgminer saves time and resources and loads that bin file
3) compiled binary (bin file) is loaded onto the GPU
4) execution begins

Steps 2a & 2b mean that if you upgrade the SDK but DON'T delete the bin files, cgminer will keep running the kernel compiled with the SDK that was installed at the time of first start.  Only if you later install a new copy of cgminer (which has no precompiled bin files) will the new SDK be used.

ckolivas could make cgminer compile on the fly on each startup and it likely would reduce confusion because as soon as you upgraded SDK and restarted cgminer you would see performance drop and people would stop blaming later version of cgminer for the drop. Still that change would only help the uninformed be less clueless and would slow start times for everyone else.

Isn't there a way in OpenCL to check what version of the SDK the .bin was compiled in and re-compile only if the SDK did not match currently installed? Or, encode the version into .bin filename. Sorry for nitpicking :).
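
To make the cache-or-compile flow and the filename idea concrete, here is a minimal sketch in Python. It is not cgminer's code (cgminer is written in C). The names `kernel_bin_path` and `load_or_build`, the filename scheme, and the `compile_kernel` callback are all hypothetical, and it assumes the caller can get an SDK/driver version string from somewhere.

```python
import os

def kernel_bin_path(cache_dir, kernel, device, sdk_version):
    """Build a cache filename that embeds the SDK version (hypothetical scheme)."""
    # Replace characters that are awkward in filenames.
    safe = "".join(c if c.isalnum() else "_" for c in sdk_version)
    return os.path.join(cache_dir, f"{kernel}-{device}-sdk{safe}.bin")

def load_or_build(cache_dir, kernel, device, sdk_version, compile_kernel):
    """Return (binary, was_compiled).

    compile_kernel(kernel, device) is a caller-supplied stand-in for the
    real OpenCL build step; it must return the compiled binary as bytes.
    """
    path = kernel_bin_path(cache_dir, kernel, device, sdk_version)
    if os.path.exists(path):
        # Step 2b: a bin for this exact SDK version exists, so reuse it.
        with open(path, "rb") as f:
            return f.read(), False
    # Step 2a: no matching bin, so compile and save a copy.
    binary = compile_kernel(kernel, device)
    with open(path, "wb") as f:
        f.write(binary)
    return binary, True
```

With this scheme an SDK upgrade changes the filename, so the old bin is simply not found and a fresh compile happens once. The trade-off is that nobody could keep running an old bin against a newer SDK on purpose.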

Donations: 1Hawkix7GHym6SM98ii5vSHHShA3FUgpV6
http://btcportal.net/ - All about Bitcoin - coming soon!
The00Dustin
Hero Member
*****
Offline Offline

Activity: 806


View Profile
March 23, 2012, 01:27:44 PM
 #4684

If you want to avoid the second one there is a submit stale = true option available in the config file.

If the share is genuinely stale though that will simply increase the stale count on the pool server.  If the share is valid it will prevent cgminer from proactively killing it.
Note that I did not have --submit-stale on the command line in this case, but I have previously seen cgminer submit detected stales to the pool because it was instructed to (pool is eligius).
Expiry is irrelevant when you have submit stale enabled or the pool asks for submitold (as p2pool does).
;)

ETA emphasis

ETA2  If your point was that there is something I can do (in spite of the fact that my post should have made it clear that I was aware of that, I did say "I can't do much about either of these"), then I should clarify that I meant fixing the issues, as opposed to working around them.  I can disable ntp and DST if need be for issue one and use --submit-stale for issue two, but that doesn't change the fact that either issue may be considered a design flaw worth fixing, which is why I posted them.
kano
Legendary
*
Offline Offline

Activity: 1932


Linux since 1997 RedHat 4


View Profile
March 23, 2012, 02:04:23 PM
 #4685

...
Isn't there a way in OpenCL to check what version of the SDK the .bin was compiled in and re-compile only if the SDK did not match currently installed?
No.

Pool: https://kano.is BTC: 1KanoiBupPiZfkwqB7rfLXAzPnoTshAVmb
CKPool and CGMiner developer, IRC FreeNode #ckpool and #cgminer kanoi
Help keep Bitcoin secure by mining on pools with Stratum, the best protocol to mine Bitcoins with ASIC hardware
kano
Legendary
*
Offline Offline

Activity: 1932


Linux since 1997 RedHat 4


View Profile
March 23, 2012, 02:25:49 PM
 #4686

I noticed two small issues this morning, one which would be easy to reproduce but doesn't necessarily need fixed, the other which would be difficult to reproduce but should theoretically be fixed for best performance (assuming my observation was correct).

Issue one is related to the use of system time, my mining machine was about 6 minutes slow, over the course of a couple weeks with ntp not running.  I started ntp and when the time changed, cgminer immediately reported "pool not providing work enough" and subsequently switched pools.  This probably wouldn't have happened if ntp was running all along, which is why I think it is a minor issue, but it makes me wonder if DST causes dropped work and pool failovers when they shouldn't exist.  Since that would only be a problem once or twice a year, I still don't know if this would be worth fixing, because I'm guessing it would be difficult to fix (would require a unique time source or some sort of interrupt to deal with known time changes on the existing time source [assuming either one of those is even possible]).
No - time always travels forwards - oddly enough :)
Daylight savings does not affect the internal storage of time, it simply displays it differently.

However, if you change the computer clock backwards or forwards (with ntp) - you get what you deserve.
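
To illustrate why a clock step can look like a stalled pool: a timeout measured against wall-clock time jumps when the clock is stepped, while one measured against a monotonic clock does not. A small Python sketch follows. It is not how cgminer measures time; `is_overdue` and the 60-second threshold are made up for the example.

```python
import time

POOL_TIMEOUT = 60.0  # hypothetical threshold in seconds

def is_overdue(last_work, now, timeout=POOL_TIMEOUT):
    """True when more than `timeout` seconds separate the two readings."""
    return (now - last_work) > timeout

# Simulated wall-clock readings: work arrived at t=1000 s, checked 5 s later.
wall_last, wall_now = 1000.0, 1005.0
print(is_overdue(wall_last, wall_now))          # False

# ntp then steps the wall clock forward by 6 minutes (360 s).
wall_now_stepped = wall_now + 360.0
print(is_overdue(wall_last, wall_now_stepped))  # True, although only 5 s really passed

# time.monotonic() is not affected by system clock updates, so intervals
# measured with it survive an ntp step.
start = time.monotonic()
elapsed = time.monotonic() - start
print(elapsed >= 0.0)                           # True
```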

Issue two is related to submission of stale shares.  Note that I did not have --submit-stale on the command line in this case, but I have previously seen cgminer submit detected stales to the pool because it was instructed to (pool is eligius).  This morning, I just happened to look at my miner while this was still on the screen:
Quote
[2012-03-23 08:22:53] LONGPOLL requested work restart, waiting on fresh work
[2012-03-23 08:22:57] Accepted 00000000.949b59d0.7ecd59ab GPU 0 thread 1 pool 0
[2012-03-23 08:23:03] Accepted 00000000.58e019ff.7a3bc657 GPU 0 thread 1 pool 0
[2012-03-23 08:23:05] Pool 0 communication failure, caching submissions
[2012-03-23 08:23:05] Stale share detected, discarding
[2012-03-23 08:23:07] Pool 0 communication resumed, submitting work
[2012-03-23 08:23:07] Accepted 00000000.ade4193a.07083649 GPU 0 thread 0 pool 0
I had recently restarted cgminer, so I can't be 100% certain cgminer was again instructed to submit stales, but assuming it was, I believe the discarded share above would indicate a bug in stale share handling in this scenario.  I also believe it would have been accepted based on the length of time between the last longpoll and this event and the fact that other shares were accepted before and after this one with no other new block event shown.

I can't do much about either of these, but I thought I would bring them up in case anyone who could might want to.
If you tell cgminer to submit stales it tells you explicitly that it is doing that:
"Stale share detected, submitting as user requested" before it shows the share Accepted/Rejected/whatever

If you don't tell cgminer to submit stales, then it will discard what it considers stale.
The thing to know about stales is that a work request is stale based on when it was received, not when you see its share(s) shown in cgminer.
Cgminer threw away that share coz it wasn't told to submit stales and the getwork time implied the share was stale.
You can't assume the getwork order matches the share display order.

Edit: oops yes the pool can also tell cgminer to submit stales - forgot to mention that - as mentioned above.
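
A rough sketch of the decision described above, in Python. This is not cgminer's actual logic, only an illustration under simple assumptions: work counts as stale if it was received before the latest block change, and a share on stale work is still sent when either the user or the pool asked for that. `Work`, `share_action` and the field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Work:
    received_at: float        # when the getwork was received (seconds)
    submit_old: bool = False  # pool asked for old shares on this piece of work

def share_action(work, last_block_change, user_submit_stale=False):
    """Decide what to do with a share found on `work` (hypothetical helper)."""
    # Staleness depends on when the work was received, not on when the
    # share shows up on screen.
    stale = work.received_at < last_block_change
    if not stale:
        return "submit"
    if user_submit_stale or work.submit_old:
        return "submit as stale"
    return "discard"

# Work fetched 10 s before a block change.
old_work = Work(received_at=100.0)
print(share_action(old_work, last_block_change=110.0))                          # discard
print(share_action(old_work, last_block_change=110.0, user_submit_stale=True))  # submit as stale
print(share_action(Work(100.0, submit_old=True), last_block_change=110.0))      # submit as stale
print(share_action(Work(received_at=120.0), last_block_change=110.0))           # submit
```

Because each share is judged by its own work's receive time, a discarded share can appear on screen between two accepted ones.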

DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218


Gerald Davis


View Profile
March 23, 2012, 02:46:35 PM
 #4687

ETA2  If your point was that there is something I can do (in spite of the fact that my post should have made it clear that I was aware of that, I did say "I can't do much about either of these")

Yes, that was the point.  You said "I can't do much" and I was showing there is something (a workaround) that you CAN do.

I didn't say "conman, ignore this guy, cgminer is flawless and never needs any future patches/fixes".  I was just pointing out that if it is a bug, it is somewhat rare and likely a hard one to track down.  It is also possible your pool screwed up.  Are you saving raw copies of getworks?  If the pool has a bug where it sometimes fails to indicate that stales should be submitted, then cgminer won't submit them.

In the meantime using submit stale = true will force cgminer to always submit a share.  The code is very simple and likely immune to any potential bugs.  cgminer simply always submits completed work, no matter what.  I have one rig running 90+ days with almost 4 million shares and SS = 0 because submit stale = true.

If you don't want to workaround until a fix is found (if ever) then feel free to ignore the suggestion.
The00Dustin
Hero Member
*****
Offline Offline

Activity: 806


View Profile
March 23, 2012, 03:18:36 PM
 #4688

Daylight savings does not affect the internal storage of time, it simply displays it differently.
Right, I forgot, too much time in Windows-land where this isn't the case, so I agree, this one doesn't really need fixing.
Edit: oops yes the pool can also tell cgminer to submit stales - forgot to mention that - as mentioned above.
And this pool does tell it to.  Like I said, it would be difficult to replicate, but I believe the stale handling via the SUBMITOLD function (pool telling cgminer to submit them) was ignored when the communication failure occurred (whereas --submit-stale would have presumably held onto it indefinitely based on posts to another user who was experiencing a memory leak issue that turned out to be due to GPU idle restarts with the 5970 bug).  It is possible that the pool didn't send the SUBMITOLD command this time, but the same pool did in my previous run that was only a few days, and I haven't seen any posts indicating a change to the pool, so I wanted to bring up the possibility of that function not being as persistent as --submit-stale switch, as I don't know whether it should be or not.  My discussion regarding whether or not the share would have been accepted shouldn't have been included, that was extraneous information.
P_Shep
Legendary
*
Offline Offline

Activity: 924


View Profile WWW
March 23, 2012, 03:33:26 PM
 #4689

Kano, just tried to compile your latest, getting this:

Code:
bitforce.c: In function ‘bitforce_scanhash’:
bitforce.c:310: error: ‘REASON_THERMAL_CUTOFF’ undeclared (first use in this function)
bitforce.c:310: error: (Each undeclared identifier is reported only once
bitforce.c:310: error: for each function it appears in.)

Haven't looked into it myself yet, but thought I'd post it straight away...
boozer
Sr. Member
****
Offline Offline

Activity: 309


View Profile
March 23, 2012, 04:26:59 PM
 #4690

I've put a few commits in my git:
 https://github.com/kanoi/cgminer/
that add a simple device history that is accessible via the new API command 'notify'

Compiling my git reports itself as 2.3.1k

You can see it with
Code:
echo -n notify | nc 127.0.0.1 4028 ; echo
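
For anyone who would rather script this than use nc, here is a minimal Python equivalent of the one-liner above. It is only a sketch: `query_api` is a made-up helper name, and it assumes the API listens on 127.0.0.1:4028 as in the command above, closes the connection after replying, and may end the reply with a NUL byte.

```python
import socket

def query_api(command, host="127.0.0.1", port=4028, timeout=5.0):
    """Send one API command and return the reply as text (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection, reply is complete
                break
            chunks.append(data)
    # Drop a trailing NUL terminator if the server sent one.
    return b"".join(chunks).rstrip(b"\x00").decode("ascii", errors="replace")
```

Usage would be `print(query_api("notify"))` on the mining machine, with the API enabled.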

The base code change adds a few extra fields and counters to the device structure (that are all reported by the API)
Including: per device: last well time, last not well time, last not well reason, and counters for each of the reasons meaning how many times they have happened (e.g. Device Over Heat count, Device Thermal Cutoff count among others)

I ran for 30 minutes at stock gpu clocks and several gpu threads restarted... again on the GPU Management screen, it only showed that gpu 5 had been re-initialized according to the timestamps.  Seems to be random cards.... first time I looked it was 2, 3 and 5.  I restarted and this time it's 1, 3, 4 and 5... after 30 minutes at stock gpu clock.

Here's the output of your command:
Code:
STATUS=S,When=1332519326,Code=60,Msg=Notify,Description=cgminer 2.3.1k|NOTIFY=0,Name=GPU,ID=0,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=1,Name=GPU,ID=1,Last Well=1332519326,Last Not Well=1332518925,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=2,Name=GPU,ID=2,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=3,Name=GPU,ID=3,Last Well=1332519325,Last Not Well=1332518862,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=4,Name=GPU,ID=4,Last Well=1332519326,Last Not Well=1332517934,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=2,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=5,Name=GPU,ID=5,Last Well=1332519326,Last Not Well=1332518716,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|

I'll try running on a different pool other than gpumax again and see what happens.
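
The reply above is easier to read once split up. Judging only from the output shown, sections are separated by `|` and fields by `,` with `key=value` pairs, so a small Python sketch can pull out the devices that have been "not well". `parse_api_reply` is a hypothetical helper, and the sample is the first three sections of the output above; a value containing a comma would break this naive split.

```python
def parse_api_reply(reply):
    """Split a reply into a list of dicts, one per '|'-separated section."""
    sections = []
    for part in reply.strip("|\x00").split("|"):
        fields = {}
        for item in part.split(","):
            key, _, value = item.partition("=")
            fields[key] = value
        sections.append(fields)
    return sections

# STATUS, NOTIFY=0 and NOTIFY=1 sections copied from the output above.
SAMPLE = (
    "STATUS=S,When=1332519326,Code=60,Msg=Notify,Description=cgminer 2.3.1k|"
    "NOTIFY=0,Name=GPU,ID=0,Last Well=1332519326,Last Not Well=0,"
    "Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,"
    "Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,"
    "Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|"
    "NOTIFY=1,Name=GPU,ID=1,Last Well=1332519326,Last Not Well=1332518925,"
    "Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,"
    "Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,"
    "Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|"
)

for section in parse_api_reply(SAMPLE):
    # Only device sections carry a NOTIFY key; "0" means never unwell.
    if "NOTIFY" in section and section["Last Not Well"] != "0":
        print(f'GPU {section["ID"]}: {section["Reason Not Well"]} '
              f'(sick count {section["Dev Sick Idle 60s"]})')
# prints: GPU 1: Device idle for 60s (sick count 1)
```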
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218


Gerald Davis


View Profile
March 23, 2012, 04:42:10 PM
 #4691

So was testing over heat protection on my water cooled farm ...

Code:

 cgminer version 2.3.1 - Started: [2012-03-23 15:45:20]
--------------------------------------------------------------------------------
 (10s):2507.1 (avg):2573.0 Mh/s | Q:3286  A:1932  R:26  HW:0  E:59%  U:35.30/m
 TQ: 8  ST: 9  SS: 21  DW: 1506  NB: 7  LW: 3806  GF: 0  RF: 0
 Connected to http://192.168.0.189:9332 with LP as user user/1000+1
 Block: 000006e1c8f6fcf1aa1e1f358d344831...  Started: [16:36:55]
--------------------------------------------------------------------------------
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  58.0C  960RPM | 324.2/335.3Mh/s | A:263 R:2 HW:0 U:   4.81/m I: 8
 GPU 1:  59.0C  960RPM | REST  /333.1Mh/s | A:261 R:6 HW:0 U:   4.77/m I: 8
 GPU 2:  60.0C  960RPM | REST  /225.7Mh/s | A:188 R:4 HW:0 U:   3.44/m I: 8
 GPU 3:  60.5C  960RPM | REST  /328.6Mh/s | A:246 R:5 HW:0 U:   4.49/m I: 8
 GPU 4:  56.0C  960RPM | 324.5/358.6Mh/s | A:217 R:3 HW:0 U:   3.96/m I: 8
 GPU 5:  59.5C  960RPM | REST  /330.7Mh/s | A:239 R:0 HW:0 U:   4.37/m I: 8
 GPU 6:  59.5C  960RPM | REST  /330.4Mh/s | A:261 R:3 HW:0 U:   4.77/m I: 8
 GPU 7:  58.5C  960RPM | REST  /333.3Mh/s | A:262 R:3 HW:0 U:   4.79/m I: 8

Notice the 10s avg hashrate is inaccurate.  It looks like when a card goes idle due to overheat, its last hashrate is still added to the global average.
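
As a quick check of the numbers on that screen, here is the arithmetic in Python. The values are copied from the status display above, with `None` standing in for a card showing REST. This only illustrates the expected total and is not cgminer's averaging code.

```python
# (10s Mh/s or None when the card shows REST, average Mh/s) for GPU 0..7
gpus = [
    (324.2, 335.3),
    (None, 333.1),
    (None, 225.7),
    (None, 328.6),
    (324.5, 358.6),
    (None, 330.7),
    (None, 330.4),
    (None, 333.3),
]

# Only cards that are actually hashing should count towards the 10s total.
active_10s = sum(rate for rate, _ in gpus if rate is not None)
print(f"{active_10s:.1f}")  # 648.7
```

So with six cards resting the 10s figure should be around 648.7 Mh/s, not the 2507.1 shown in the header line.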
ancow
Sr. Member
****
Offline Offline

Activity: 373


View Profile WWW
March 23, 2012, 08:32:14 PM
 #4692

Code:
[2012-03-22 22:46:43] Started cgminer 2.3.1

[2012-03-22 22:46:43] Started cgminer 2.3.1
[2012-03-22 22:46:43] Probing for an alive pool
[2012-03-22 22:46:44] Long-polling activated for http://mining.eligius.st:8337/LP
Segmentation fault
root@ds-r:~/cgminer-2.3.1-2#

Running on debian squeeze, sdk 2.4, newer/ish fglrx

I get the same error from both a self built and the pre-built ubuntu binary.

I'm getting a similar segfault with 12.x fglrx if I use an SDK older than 2.6 (debian testing, though). The best solution I found was to remove the APP SDK completely, install the packages "opencl-headers", "amd-libopencl1" and "amd-opencl-icd" and run cgminer with "GPU_USE_SYNC_OBJECTS=1" set after re-compiling it. Failing that, using fglrx 11.x was the only alternative I found.
But then, I compile cgminer myself and have never gotten the opencl version mismatch.

BTC: 1GAHTMdBN4Yw3PU66sAmUBKSXy2qaq2SF4
-ck
Moderator
Legendary
*
Offline Offline

Activity: 2002


Ruu \o/


View Profile WWW
March 23, 2012, 10:16:52 PM
 #4693

been getting this error on 2 of my miners lately. they run about 6 hours every time too. been crashing like clockwork for a few days now

[regular log stuff, then suddenly this]
[2012-03-20 14:04:09] Failed to create submit_work_thread
[lots of statistics]
[2012-03-20 14:04:09] API failed (Socket Error: (10004) Interrupted system call) - API will not be available
[2012-03-20 14:04:09] longpoll failed for http://api.bitcoin.cz:8408, sleeping for 30s
and thats the end, they are then sitting at "press any key to continue . . ."

same exact error on both

they are both running 2.3.1, no special flags aside from autoclock and autofan settings.

my 6870 is win7 64 bit, 11.11 driver, 2.5 sdk, and the 6770 is vista 32, 12.1, 2.3 sdk. clean installs, I never reuse bins, and delete bins when upgrading drivers and/or sdks. cgminer is always installed fresh, never over itself.

my 5830 on 11.4, 2.1 and XP with cgminer 2.2.7 runs perfect. same pool settings as the 6870 and 6770, so if its the longpoll fail thats killing the 2.3.1 miners the 2.2.7 version is OK with whatever it is.. is longpoll handled differently between 2.2.7 and 2.3.1?

any ideas?
Failing to create  submit_work_thread is the key here. It has nothing to do with how it fails after that. Inability to create the thread suggests a system resource problem, like running out of memory or too many threads starting up for some reason.

Primary developer/maintainer for cgminer and ckpool/ckproxy.
Pooled mine at kano.is, solo mine at solo.ckpool.org
-ck
-ck
Moderator
Legendary
*
Offline Offline

Activity: 2002


Ruu \o/


View Profile WWW
March 23, 2012, 10:19:09 PM
 #4694

Is it efficient to use balance or rotation mode? I would like to get text alerts from a pool but I don't like putting on my hashing power there. Balance seems like a good option if LP still works correctly. Does it matter how many pools are used?
It is a little less efficient trying to use multiple pools at the same time because of the problem with pools disagreeing about when a block changes by a few seconds between them and having to run multiple long polls that discard more work. That said it's only 2 or 3 seconds' work every 10 minutes so doesn't amount to much but will be visible if you watch stats on cgminer at the time.
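
To put a number on "doesn't amount to much", here is the arithmetic as a tiny Python sketch. It assumes the 2 to 3 seconds and the 10 minute block interval quoted above; `lost_fraction` is just an illustrative name.

```python
def lost_fraction(seconds_lost, block_interval=600.0):
    """Fraction of mining time lost per block interval (default 10 minutes)."""
    return seconds_lost / block_interval

for s in (2, 3):
    print(f"{s} s per block: {lost_fraction(s):.2%}")
# 2 s per block: 0.33%
# 3 s per block: 0.50%
```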

-ck
Moderator
Legendary
*
Offline Offline

Activity: 2002


Ruu \o/


View Profile WWW
March 23, 2012, 10:21:18 PM
 #4695

Hi, I'm sure I am doing something really stupid but I can't get my 3x 7970s over 200 MHash/S each.

Running 12.2 driver and the newest version of cgminer.

Is there anything extra I need to do for 7970? I used cgminer for all my other cards. Thanks.
Presumably that's the dodgy sdk that diablo kernel doesn't work with. Try -k poclbm if you are passing a kernel choice to it.

-ck
Moderator
Legendary
*
Offline Offline

Activity: 2002


Ruu \o/


View Profile WWW
March 23, 2012, 10:22:12 PM
 #4696


I would assume this is a false positive, but I guess it doesn't hurt to ask official advice. MS Security Essentials didn't pick up on anything but AVG did.


Read the increasingly unread FAQ in the readme included in the zip file about cgminer being a virus.

-ck
Moderator
Legendary
*
Offline Offline

Activity: 2002


Ruu \o/


View Profile WWW
March 23, 2012, 10:24:07 PM
 #4697

Code:
[2012-03-22 22:46:43] Started cgminer 2.3.1

[2012-03-22 22:46:43] Started cgminer 2.3.1
[2012-03-22 22:46:43] Probing for an alive pool
[2012-03-22 22:46:44] Long-polling activated for http://mining.eligius.st:8337/LP
Segmentation fault
root@ds-r:~/cgminer-2.3.1-2#

Running on debian squeeze, sdk 2.4, newer/ish fglrx

I get the same error from both a self built and the pre-built ubuntu binary.
Partial install of 2 different SDKs.

-ck
Moderator
Legendary
*
Offline Offline

Activity: 2002


Ruu \o/


View Profile WWW
March 23, 2012, 10:27:26 PM
 #4698

I noticed two small issues this morning, one which would be easy to reproduce but doesn't necessarily need fixed, the other which would be difficult to reproduce but should theoretically be fixed for best performance (assuming my observation was correct).

Issue one is related to the use of system time, my mining machine was about 6 minutes slow, over the course of a couple weeks with ntp not running.  I started ntp and when the time changed, cgminer immediately reported "pool not providing work enough" and subsequently switched pools.  This probably wouldn't have happened if ntp was running all along, which is why I think it is a minor issue, but it makes me wonder if DST causes dropped work and pool failovers when they shouldn't exist.  Since that would only be a problem once or twice a year, I still don't know if this would be worth fixing, because I'm guessing it would be difficult to fix (would require a unique time source or some sort of interrupt to deal with known time changes on the existing time source [assuming either one of those is even possible]).

Issue two is related to submission of stale shares.  Note that I did not have --submit-stale on the command line in this case, but I have previously seen cgminer submit detected stales to the pool because it was instructed to (pool is eligius).  This morning, I just happened to look at my miner while this was still on the screen:
Quote
[2012-03-23 08:22:53] LONGPOLL requested work restart, waiting on fresh work
[2012-03-23 08:22:57] Accepted 00000000.949b59d0.7ecd59ab GPU 0 thread 1 pool 0
[2012-03-23 08:23:03] Accepted 00000000.58e019ff.7a3bc657 GPU 0 thread 1 pool 0
[2012-03-23 08:23:05] Pool 0 communication failure, caching submissions
[2012-03-23 08:23:05] Stale share detected, discarding
[2012-03-23 08:23:07] Pool 0 communication resumed, submitting work
[2012-03-23 08:23:07] Accepted 00000000.ade4193a.07083649 GPU 0 thread 0 pool 0
I had recently restarted cgminer, so I can't be 100% certain cgminer was again instructed to submit stales, but assuming it was, I believe the discarded share above would indicate a bug in stale share handling in this scenario.  I also believe it would have been accepted based on the length of time between the last longpoll and this event and the fact that other shares were accepted before and after this one with no other new block event shown.

I can't do much about either of these, but I thought I would bring them up in case anyone who could might want to.
1: It will just switch back a minute later so it's not a big problem.
2: Pools selectively send the submitold flag depending on their work. It is not always on for all work from the pool. The reason for that has to do with merged mining. If there is a real block change for both namecoin and bitcoin (and anywhere else that eligius is sending your shares without your consent), then you really don't want to submit the old share. So this is not a bug.

-ck
Moderator
Legendary
*
Offline Offline

Activity: 2002


Ruu \o/


View Profile WWW
March 23, 2012, 10:29:39 PM
 #4699

Isn't there a way in OpenCL to check what version of the SDK the .bin was compiled in and re-compile only if the SDK did not match currently installed? Or, encode the version into .bin filename. Sorry for nitpicking :).
Doing this would circumvent the cheating people can do by using a bin from an older sdk like 2.1 with a newer installed sdk indefinitely to get the performance from the older sdk while installing newer drivers without losing their hashrate.

-ck
Moderator
Legendary
*
Offline Offline

Activity: 2002


Ruu \o/


View Profile WWW
March 23, 2012, 10:31:26 PM
 #4700

I've put a few commits in my git:
 https://github.com/kanoi/cgminer/
that add a simple device history that is accessible via the new API command 'notify'

Compiling my git reports itself as 2.3.1k

You can see it with
Code:
echo -n notify | nc 127.0.0.1 4028 ; echo

The base code change adds a few extra fields and counters to the device structure (that are all reported by the API)
Including: per device: last well time, last not well time, last not well reason, and counters for each of the reasons meaning how many times they have happened (e.g. Device Over Heat count, Device Thermal Cutoff count among others)

I ran for 30 minutes at stock gpu clocks and several gpu threads restarted... again on the GPU Management screen, it only showed that gpu 5 had been re-initialized according to the timestamps.  Seems to be random cards.... first time I looked it was 2, 3 and 5.  I restarted and this time it's 1, 3, 4 and 5... after 30 minutes at stock gpu clock.

Here's the output of your command:
Code:
STATUS=S,When=1332519326,Code=60,Msg=Notify,Description=cgminer 2.3.1k|NOTIFY=0,Name=GPU,ID=0,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=1,Name=GPU,ID=1,Last Well=1332519326,Last Not Well=1332518925,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=2,Name=GPU,ID=2,Last Well=1332519326,Last Not Well=0,Reason Not Well=None,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=0,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=3,Name=GPU,ID=3,Last Well=1332519325,Last Not Well=1332518862,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=4,Name=GPU,ID=4,Last Well=1332519326,Last Not Well=1332517934,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=2,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|NOTIFY=5,Name=GPU,ID=5,Last Well=1332519326,Last Not Well=1332518716,Reason Not Well=Device idle for 60s,Thread Fail Init=0,Thread Zero Hash=0,Thread Fail Queue=0,Dev Sick Idle 60s=1,Dev Dead Idle 600s=0,Dev Nostart=0,Dev Over Heat=0,Dev Thermal Cutoff=0|

I'll try running on a different pool other than gpumax again and see what happens.
Did you read my advice about fan control and 5970s?
