techman05
|
 |
October 30, 2013, 01:25:28 AM |
|
If it's flashing then coming back on, did you notice if it said "new block detected, Pool requests restart"? At least when I was using eclipseMC, that would completely stop mining and then start back up for each block. Another pool does that but it doesn't restart. Eclipse found so many blocks at one point that I had to switch to another pool or just unplug my Erupters.
That's my easy excuse for the calming fans and lights.
|
|
|
|
aigeezer
Legendary
Offline
Activity: 1450
Merit: 1013
Cryptanalyst castrated by his government, 1952
|
 |
October 30, 2013, 01:49:04 AM |
|
If it's flashing then coming back on, did you notice if it said "new block detected, Pool requests restart"? At least when I was using eclipseMC, that would completely stop mining and then start back up for each block. Another pool does that but it doesn't restart. Eclipse found so many blocks at one point that I had to switch to another pool or just unplug my Erupters.
That's my easy excuse for the calming fans and lights.
As far as I can tell there has been only one message since startup (in the console window) for this run, and that was when Icarus hotplugged the last AMU. Each of these tests seems to have different patterns at startup. Sometimes they are all quick, sometimes there's a pause after a few, and sometimes it thinks it has to hotplug the last one or two. Anyway - no pool-requested restarts on this run that I know of. I have seen that kind of message in the past though. I've been using Slush for most of these tests, and stats there for my miners look pretty normal, within the usual error factors, I think. Anyway, thanks for the tip and I'll keep my eyes open for pool restart messages. Another tester has been logging his tests (see upthread), so that might help pin things down too.
|
|
|
|
aigeezer
Legendary
Offline
Activity: 1450
Merit: 1013
Cryptanalyst castrated by his government, 1952
|
 |
October 30, 2013, 01:57:19 AM Last edit: October 30, 2013, 02:16:36 AM by aigeezer |
|
Edit: the chop1 test is interesting. No failures yet, one hour in. From time to time random LEDs put on a festive but somewhat scary display, for many seconds, often four or five at a time. Also, the hash rate reported soars randomly from time to time, no apparent correlation with the LED lights. An AMU will show well over 400, sometimes well over 500 Mh/s, then slowly drift back to a normal 333 or so. I don't recall seeing this behavior before, but perhaps I've just missed it. The test continues.
Thanks. If you didn't have enough to test, here's 2 more.
http://ck.kolivas.org/apps/cgminer/temp/cgminer-zlp.exe
http://ck.kolivas.org/apps/cgminer/temp/cgminer-zlpcps10.exe
The chop1 test is 3+ hours old and no failures yet. It just did something dramatic though - all 13 AMU LEDs came on at about the same time and the BAL fan throttled down and its hash rate dropped somewhat. A few seconds later everything was back to normal. No smoke, no flames, no melted solder - I'll stay with it a while unless it fails.
Heh odd. The ZLP is likely a very real fix for an issue though, and hopefully it's a related issue to what's biting you and indirectly fixed with the chop1 executable. It still doesn't sound like normal behaviour even if it hasn't outright failed, though bear in mind a couple of pools are under ddos yet again so that might be related.
OK. I'll set chop1 aside and try zlp next. Chop1 has gone almost 4 hours without failure, but it's lively to watch. I'll probably leave zlp unattended for the next 9 hours or so if it gets off to a good start.
Edit: Yikes - zlp is off to a very bad start. Tons of error messages, but they scroll quickly. Let's see - AMU0 Timeout sendwork took (about 2 seconds) but was (about 1 second) sendwork USB write error 7 LIBUSB_ERR_TIMEOUT - that's about all I can grab while it scrolls. It's all intermingled with "accepted" messages from the BAL unit. I've left it running for now in the hope that it might stabilize.
Edit: Did a quick test of zlpcps10 and got similar errors to zlp. I was in too much of a hurry though and didn't disconnect the hubs between tests, so I should test it again. I'm out of time right now though, so I have set it back to running chop1. I'll check the thread in about 10 hours.
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4494
Merit: 1665
Ruu \o/
|
 |
October 30, 2013, 02:36:39 AM |
|
Yeah zlp were done blind. Seems they're broken so no one else should bother testing them.
|
Developer/maintainer for cgminer, ckpool/ckproxy, and the -ck kernel 2% Fee Solo mining at solo.ckpool.org -ck
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4494
Merit: 1665
Ruu \o/
|
 |
October 30, 2013, 05:04:18 AM |
|
|
|
|
|
jesse11
Sr. Member
  
Offline
Activity: 333
Merit: 250
Ants Rock
|
 |
October 30, 2013, 07:32:41 AM |
|
You still have: WTF RDLOCK ERROR ON LOCK! in the binary.
|
Mining with: BE's,BE Cubes, K16's, AntMiners U1's and AntMiners S1's
|
|
|
|
aigeezer
Legendary
Offline
Activity: 1450
Merit: 1013
Cryptanalyst castrated by his government, 1952
|
 |
October 30, 2013, 12:06:16 PM |
|
Yeah zlp were done blind. Seems they're broken so no one else should bother testing them.
Whew - I was afraid I'd done something goofy - I was in a big rush at the time. No problem, where were we?... I'll stop the chop1 test now. It ran for about 10 hours and has (only) one zombie. I'll test the "rs" version shortly.
|
|
|
|
jmc1517
Newbie
Offline
Activity: 56
Merit: 0
|
 |
October 30, 2013, 12:16:06 PM |
|
OK, so a bit more testing done this morning on the following 3 builds, using 34 Erupters on 3 Anker 10-port hubs plus a 4-port Silvercrest oldie, all hubs powered off/on before each test. Results as follows:
cgminer-cwa16: The first solid LED on an AMU came on after about 30 minutes. No zombie reported - just the hash count slowly declining to zero. Loads of timeout errors scrolling up the console.
cgminer-chop1: First zombie after about 22 minutes and the usual timeout errors from others.
cgminer-rs: Three zombies within 10-14 minutes and other AMUs with zero hash count. Loads of timeouts.
Logfiles (without --debug) here if required:
https://dl.dropboxusercontent.com/u/44240170/logfile-cwa16.txt
https://dl.dropboxusercontent.com/u/44240170/logfile-chop1.txt
https://dl.dropboxusercontent.com/u/44240170/logfile-rs.txt
So I'm back to mining on 3.3.1 after completing the above tests and everything is running perfectly. I would stick with 3.3.1 except that it doesn't always show all the devices on startup, and doesn't handle hot-plugging as well as the later builds. Perhaps we can't have everything?!
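For anyone comparing logfiles between builds, a rough way to see which AMUs are producing the timeouts is to tally them per device. This is only a sketch: the exact cgminer log line format is not reproduced here, so the sample lines, the `AMU<n>` pattern and the function name `timeouts_per_device` are all assumptions modelled on the message fragments quoted in this thread.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: devices show up in log lines as "AMU0", "AMU 3", etc.
DEVICE = re.compile(r"\bAMU ?(\d+)\b")


def timeouts_per_device(lines: Iterable[str]) -> Counter:
    """Count lines that mention a timeout, grouped by AMU number."""
    counts: Counter = Counter()
    for line in lines:
        # Case-insensitive match so "Timeout" and "..._TIMEOUT" both count.
        if "timeout" not in line.lower():
            continue
        match = DEVICE.search(line)
        if match:
            counts[f"AMU{match.group(1)}"] += 1
    return counts


# Hypothetical sample lines, not copied from a real logfile.
SAMPLE = [
    "AMU0 Timeout sendwork took 2.0s",
    "AMU3: sendwork USB write error 7 LIBUSB_ERR_TIMEOUT",
    "Accepted 1a2b3c BAL 0",
    "AMU0 timeout",
]

SAMPLE_COUNTS = timeouts_per_device(SAMPLE)
```

To run it on a real file, pass an open file handle instead of `SAMPLE`, e.g. `timeouts_per_device(open("logfile-chop1.txt"))`, and adjust the pattern if the real lines look different.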
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4494
Merit: 1665
Ruu \o/
|
 |
October 30, 2013, 12:17:40 PM |
|
So I'm back to mining on 3.3.1 after completing the above tests and everything is running perfectly. I would stick with 3.3.1 except that it doesn't always show all the devices on startup, and doesn't handle hot-plugging as well as the later builds. Perhaps we can't have everything?!
Why the fuck not goddamnit  And why 3.3.1? Did it break precisely at 3.3.2?
|
|
|
|
aigeezer
Legendary
Offline
Activity: 1450
Merit: 1013
Cryptanalyst castrated by his government, 1952
|
 |
October 30, 2013, 12:23:58 PM |
|
Puzzling that chop1 was almost error-free for me (if you don't count the festive LED displays) but fairly poor for jmc1517. Messy stuff.
"rs" is looking good for me, 10 minutes in.
Edit: Oops, AMU8 went zombie, 15 minutes in.
|
|
|
|
techman05
|
 |
October 30, 2013, 12:25:07 PM |
|
Maybe that's the last one to support windows xp. There's at least one poster posting problems with xp. Feel free to change the words for what you feel. 
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4494
Merit: 1665
Ruu \o/
|
 |
October 30, 2013, 12:29:24 PM |
|
|
|
|
|
jmc1517
Newbie
Offline
Activity: 56
Merit: 0
|
 |
October 30, 2013, 12:34:55 PM Last edit: October 30, 2013, 12:55:17 PM by jmc1517 |
|
So I'm back to mining on 3.3.1 after completing the above tests and everything is running perfectly. I would stick with 3.3.1 except that it doesn't always show all the devices on startup, and doesn't handle hot-plugging as well as the later builds. Perhaps we can't have everything?!
Why the fuck not goddamnit  And why 3.3.1? Did it break precisely at 3.3.2?
Good question! The answer is - I don't know as I went directly from 3.3.1 to 3.6.4 which was the latest at that time. I had taken my eye off software updates as I had 3.3.1 working perfectly (except for the "sometimes not showing all the devices when starting up" problem, and the "unable to recover an AMU by unplugging/replugging" problem). But since I get hardly any problems with AMUs going down under 3.3.1, the hotplug problem isn't really a concern, just a "nice-to-have."
I *could* try some intermediate versions between 3.3.1 and 3.6.4 if they are still available, and if it would help?
Edit: Currently running 3.3.4. I'll do a 2 hour test on selected builds until I hit the one where it breaks.
|
|
|
|
aigeezer
Legendary
Offline
Activity: 1450
Merit: 1013
Cryptanalyst castrated by his government, 1952
|
 |
October 30, 2013, 12:41:05 PM Last edit: October 30, 2013, 01:42:45 PM by aigeezer |
|
Running rs8 now.
Edit: zombie about 15 minutes into the run. Trying rs10 shortly.
Edit: zombie from rs10 about 10 minutes into the run. I'll leave rs10 running.
Edit: it couldn't recover so I restarted and got 4 zombies within 6 minutes. "rs10" isn't my fave among the recent candidates.
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4494
Merit: 1665
Ruu \o/
|
 |
October 30, 2013, 12:55:40 PM |
|
I *could* try some intermediate versions between 3.3.1 and 3.6.4 if they are still available, and if it would help?
If there's a specific version update that causes it, it narrows down the possible causes of post 3.3.1 breakage (though I've rewritten everything I could think of in the meantime so could have some other issue now).
|
|
|
|
jmc1517
Newbie
Offline
Activity: 56
Merit: 0
|
 |
October 30, 2013, 12:58:11 PM |
|
I *could* try some intermediate versions between 3.3.1 and 3.6.4 if they are still available, and if it would help?
If there's a specific version update that causes it, it narrows down the possible causes of post 3.3.1 breakage (though I've rewritten everything I could think of in the meantime so could have some other issue now).
I'll try to find it. Seems like as good a plan as any. Starting with 3.3.4...
|
|
|
|
aigeezer
Legendary
Offline
Activity: 1450
Merit: 1013
Cryptanalyst castrated by his government, 1952
|
 |
October 30, 2013, 01:10:31 PM |
|
I *could* try some intermediate versions between 3.3.1 and 3.6.4 if they are still available, and if it would help?
If there's a specific version update that causes it, it narrows down the possible causes of post 3.3.1 breakage (though I've rewritten everything I could think of in the meantime so could have some other issue now).
I'll try to find it. Seems like as good a plan as any. Starting with 3.3.4...
I'm fairly sure the bug was present in 3.4.3, fwiw.
|
|
|
|
jmc1517
Newbie
Offline
Activity: 56
Merit: 0
|
 |
October 30, 2013, 05:26:41 PM Last edit: October 30, 2013, 07:22:01 PM by jmc1517 |
|
I *could* try some intermediate versions between 3.3.1 and 3.6.4 if they are still available, and if it would help?
If there's a specific version update that causes it, it narrows down the possible causes of post 3.3.1 breakage (though I've rewritten everything I could think of in the meantime so could have some other issue now).
I'll try to find it. Seems like as good a plan as any. Starting with 3.3.4...
I'm fairly sure the bug was present in 3.4.3, fwiw.
Hmmm, interesting. I'm not seeing that. It seems to have occurred much later for me. I'm doing binary chop testing to localise the onset of the bug among the 11 releases between 3.3.1 and 3.6.4. Initially just running a 2-hour (maximum) test on each version, but stopping as soon as I get a zombie. I will go back and verify with a longer run as soon as I find the last "perfect" version!
Results so far:
3.3.1 - ok
3.3.4 - ok
3.4.0 -
3.4.1 -
3.4.2 -
3.4.3 - ok
3.5.0 -
3.5.1 - ok
3.6.0 - does not run. All AMU LEDs full on. No AMUs detected?
3.6.1 - ZOMBIE
3.6.2 -
3.6.3 -
3.6.4 - ZOMBIE
3.6.6 - ZOMBIE
Edit: 3.5.1 ran for 2 hours, no problems. 3.6.0 will not start for me - it does not seem to "see" the AMUs at all. 3.6.1 runs but gives zombies, so it looks as if the bug was introduced (probably) at the 3.6.x build. I am currently running 3.5.1 as a long term test to ensure it is as stable as 3.3.1 is (for me). 2013-10-30 @ 19:20
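The binary chop over releases can be sketched in a few lines of Python. This is an illustration only, not a tool anyone in the thread used: `bisect_versions` and `outcome` are hypothetical names, the search assumes that once the bug appears it stays in every later release, and the stand-in `outcome` function fills in versions left blank in the results (3.5.0, for example) by assumption. A version that cannot be run at all (like 3.6.0 here) returns `None` and is skipped in favour of its nearest testable neighbour.

```python
from typing import Callable, Optional, Sequence, Tuple

VERSIONS = ["3.3.1", "3.3.4", "3.4.0", "3.4.1", "3.4.2", "3.4.3", "3.5.0",
            "3.5.1", "3.6.0", "3.6.1", "3.6.2", "3.6.3", "3.6.4"]


def bisect_versions(
    versions: Sequence[str],
    test: Callable[[str], Optional[bool]],
) -> Tuple[str, str]:
    """Return (last known good, first known bad).

    test(version) returns True for a clean run, False for a zombie,
    and None if the build cannot be tested at all.
    Assumes versions[0] is good and versions[-1] is bad.
    """
    lo, hi = 0, len(versions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Try the midpoint first, then its neighbours, skipping untestable builds.
        candidates = sorted(range(lo + 1, hi), key=lambda i: abs(i - mid))
        for i in candidates:
            result = test(versions[i])
            if result is not None:
                break
        else:
            break  # nothing between lo and hi can be tested; stop here
        if result:
            lo = i
        else:
            hi = i
    return versions[lo], versions[hi]


def outcome(version: str) -> Optional[bool]:
    """Stand-in for a 2-hour test run, consistent with the results listed above.

    Assumption: untested versions up to 3.5.1 are treated as ok and
    untested versions from 3.6.1 onward as bad.
    """
    if version == "3.6.0":
        return None  # does not run, so it cannot be classified
    return VERSIONS.index(version) <= VERSIONS.index("3.5.1")


LAST_GOOD, FIRST_BAD = bisect_versions(VERSIONS, outcome)
print(LAST_GOOD, FIRST_BAD)  # 3.5.1 3.6.1
```

With these stand-in outcomes the search needs four runs (3.5.0, 3.6.1, 3.5.1, 3.6.0) instead of eleven, and it ends with 3.6.0 sitting unclassified between the last good and first bad build, which matches the "probably at the 3.6.x build" conclusion.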
|
|
|
|
chadtn
|
 |
October 30, 2013, 05:30:58 PM |
|
I'd like to think some other variable is at play here. I've run every single version of cgminer from 3.3.1 to 3.5.1 and have never had a single problem. I ran it on a Windows 7 machine 24/7 with two 7970 video cards and six Block Erupters. The only downtime I've had is from halting for software updates and maybe an average of one restart a month for system updates. After 3.5.1 I moved the Block Erupters to a Raspberry Pi and it has almost literally 100% uptime. The only times it has stopped mining are when I halted the process to start the latest version of cgminer. It's run every version from 3.5.1 all the way up to 3.6.6 without issue.
There almost has to be some key variable...os, wiring, electrical, hardware defect, environment...
Chad
|
|
|
|
|