loshia
Legendary
Offline
Activity: 1610
Merit: 1000
|
|
November 07, 2013, 08:52:00 AM |
|
Con, I discovered the following nasty bug, which is present on MIPS TP-Link only:

    /* This is the central place all work that is about to be retired should be
     * cleaned to remove any dynamically allocated arrays within the struct */
    void clean_work(struct work *work)
    {
        if (work->job_id)
            free(work->job_id);

For some reason work->job_id is not always set (do not ask why) and the TP-Link breaks badly. Debugged and fixed with gdb server/client.

loshia, thanks, that's very interesting. However, calling free on NULL is a valid thing to do, so I don't understand how this helps? If work->job_id == NULL then free(work->job_id) equates to free(NULL);

I know... but life sucks. Can you comment on the diff issue also?
PS: I can revert my change and post the gdb output if you want?
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4256
Merit: 1644
Ruu \o/
|
|
November 07, 2013, 08:56:38 AM |
|
PS:
[2013-11-07 10:38:34] Accepted fbc145d6 Diff 260/128 HEXa 3 pool 0
[2013-11-07 10:38:54] Accepted f6a0cfde Diff 266/256 HEXa 0 pool 0
[2013-11-07 10:39:30] Pool 0 difficulty changed to 128
[2013-11-07 10:39:51] Rejected 017a21de Diff 173/128 HEXa 3 pool 0 (Below difficulty)
I'll investigate.... no wait a minute, HEXa is not a driver that I maintain or support in cgminer so I can point the finger at that.
|
Developer/maintainer for cgminer, ckpool/ckproxy, and the -ck kernel 2% Fee Solo mining at solo.ckpool.org -ck
|
|
|
jmc1517
Newbie
Offline
Activity: 56
Merit: 0
|
|
November 07, 2013, 09:58:20 AM |
|
Update: The test of cgminer-ymmv is still going strong. Almost 24 hours now and no zombies. Whatever is in this build seems to have done the trick.
Ongoing logfile here (without --debug), although nothing new in it except accepted shares:
https://dl.dropboxusercontent.com/u/44240170/logfile-ymmv-ongoing.txt

Edit: OK, the only problem I can see is with AMU6, which has a much lower accepted share rate than all the rest (others are all up in the 6000s). Suppose this might be something to worry about (?):

AMU 2: | 335.3M/333.5Mh/s | A:6322 R:12 HW: 67 WU: 4.6/m
AMU 3: | 335.8M/333.6Mh/s | A:6276 R: 8 HW: 57 WU: 4.6/m
AMU 4: | 335.5M/333.4Mh/s | A:6169 R: 3 HW:207 WU: 4.5/m
AMU 5: | 335.3M/333.3Mh/s | A:6300 R: 8 HW:191 WU: 4.6/m
AMU 6: | 335.8M/333.4Mh/s | A:2245 R:19 HW: 83 WU: 1.6/m <---- low ??
AMU 7: | 335.5M/333.5Mh/s | A:6149 R:16 HW: 50 WU: 4.6/m
AMU 8: | 335.6M/333.6Mh/s | A:5795 R: 8 HW: 67 WU: 4.6/m
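For anyone who wants to spot a device like AMU 6 without eyeballing the screen, here is a minimal sketch. It assumes the second hashrate on each status line is the running average, and that WU counts difficulty-1 shares per minute, so a healthy device should show roughly hashrate x 60 / 2^32. The function names and the 50% threshold are made up for illustration and are not part of cgminer.

```python
import re

# Matches status lines such as:
#   AMU 6: | 335.8M/333.4Mh/s | A:2245 R:19 HW: 83 WU: 1.6/m
STATUS_RE = re.compile(
    r"AMU\s*(\d+):\s*\|\s*([\d.]+)M/([\d.]+)Mh/s\s*\|"
    r"\s*A:\s*(\d+)\s+R:\s*(\d+)\s+HW:\s*(\d+)\s+WU:\s*([\d.]+)/m"
)


def expected_wu(avg_mhs):
    """Expected diff-1 shares per minute for an average hashrate in Mh/s."""
    return avg_mhs * 1e6 * 60 / 2**32


def find_underperformers(status_text, threshold=0.5):
    """Return device numbers whose WU is below threshold * expected WU.

    Assumption: the second hashrate figure on the line is the average.
    """
    flagged = []
    for m in STATUS_RE.finditer(status_text):
        device = int(m.group(1))
        avg_mhs = float(m.group(3))
        wu = float(m.group(7))
        if wu < threshold * expected_wu(avg_mhs):
            flagged.append(device)
    return flagged


STATUS = """
AMU 2: | 335.3M/333.5Mh/s | A:6322 R:12 HW: 67 WU: 4.6/m
AMU 3: | 335.8M/333.6Mh/s | A:6276 R: 8 HW: 57 WU: 4.6/m
AMU 4: | 335.5M/333.4Mh/s | A:6169 R: 3 HW:207 WU: 4.5/m
AMU 5: | 335.3M/333.3Mh/s | A:6300 R: 8 HW:191 WU: 4.6/m
AMU 6: | 335.8M/333.4Mh/s | A:2245 R:19 HW: 83 WU: 1.6/m <---- low ??
AMU 7: | 335.5M/333.5Mh/s | A:6149 R:16 HW: 50 WU: 4.6/m
AMU 8: | 335.6M/333.6Mh/s | A:5795 R: 8 HW: 67 WU: 4.6/m
"""

if __name__ == "__main__":
    print(find_underperformers(STATUS))
```

On the figures above only AMU 6 falls under the threshold: about 4.66/m is expected at 333.4 Mh/s, and it reports 1.6/m while still showing a normal hashrate.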
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4256
Merit: 1644
Ruu \o/
|
|
November 07, 2013, 10:03:01 AM |
|
Sync transfers. Except for regular gross timeout overflows.

[2013-11-06 15:41:15] AMU11: TIMEOUT GetResults took 822ms but was 100ms
[2013-11-06 15:41:15] AMU3: TIMEOUT GetResults took 813ms but was 100ms
[2013-11-06 15:41:15] AMU6: TIMEOUT GetResults took 803ms but was 100ms
[2013-11-06 15:41:15] AMU27: TIMEOUT GetResults took 803ms but was 100ms
[2013-11-06 15:41:15] AMU19: TIMEOUT GetResults took 797ms but was 100ms
[2013-11-06 15:41:15] AMU7: TIMEOUT GetResults took 795ms but was 100ms
[2013-11-06 15:41:15] AMU12: TIMEOUT GetResults took 785ms but was 100ms
[2013-11-06 15:41:15] AMU20: TIMEOUT GetResults took 785ms but was 100ms
[2013-11-06 15:41:15] AMU5: TIMEOUT GetResults took 784ms but was 100ms
[2013-11-06 15:41:15] AMU1: TIMEOUT GetResults took 784ms but was 100ms
[2013-11-06 15:41:15] AMU26: TIMEOUT GetResults took 785ms but was 100ms
[2013-11-06 15:41:15] AMU4: TIMEOUT GetResults took 783ms but was 100ms
[2013-11-06 15:41:15] AMU25: TIMEOUT GetResults took 770ms but was 100ms
[2013-11-06 15:41:15] AMU21: TIMEOUT GetResults took 770ms but was 100ms
[2013-11-06 15:41:15] AMU29: TIMEOUT GetResults took 769ms but was 100ms
[2013-11-06 15:41:15] AMU32: TIMEOUT GetResults took 760ms but was 100ms
[2013-11-06 15:41:15] AMU31: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU0: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU18: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU33: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU30: TIMEOUT GetResults took 736ms but was 100ms
[2013-11-06 15:41:15] AMU9: TIMEOUT GetResults took 736ms but was 100ms
[2013-11-06 15:41:15] AMU22: TIMEOUT GetResults took 736ms but was 100ms
[2013-11-06 15:41:15] AMU23: TIMEOUT GetResults took 736ms but was 100ms
[2013-11-06 15:41:15] AMU24: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU16: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU8: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU17: TIMEOUT GetResults took 735ms but was 100ms
Note they all happen at the same time. So whatever is causing communication issues is happening across the board, which is why it looks hardware related to me. Still not sure what to make of all this and whether it's even worth pursuing any further. Perhaps offering a sync option or a binary for troublesome setups or something...
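To check the "all at the same time" observation on a full logfile rather than by eye, a rough sketch along these lines could group the TIMEOUT entries by timestamp. The regular expression is fitted to the lines quoted above; the function names and the min_devices cut-off are hypothetical and not anything cgminer provides.

```python
import re
from collections import defaultdict

# Fitted to lines such as:
#   [2013-11-06 15:41:15] AMU11: TIMEOUT GetResults took 822ms but was 100ms
TIMEOUT_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+(\w+):\s+TIMEOUT\s+(\w+)"
    r"\s+took\s+(\d+)ms\s+but\s+was\s+(\d+)ms"
)


def timeouts_by_second(log_text):
    """Map each timestamp (one-second resolution) to the devices that timed out."""
    groups = defaultdict(set)
    for m in TIMEOUT_RE.finditer(log_text):
        groups[m.group(1)].add(m.group(2))
    return dict(groups)


def simultaneous_bursts(log_text, min_devices=3):
    """Keep only the seconds in which at least min_devices devices timed out."""
    return {
        stamp: sorted(devices)
        for stamp, devices in timeouts_by_second(log_text).items()
        if len(devices) >= min_devices
    }


# First four entries of the excerpt quoted above.
SAMPLE = """
[2013-11-06 15:41:15] AMU11: TIMEOUT GetResults took 822ms but was 100ms
[2013-11-06 15:41:15] AMU3: TIMEOUT GetResults took 813ms but was 100ms
[2013-11-06 15:41:15] AMU6: TIMEOUT GetResults took 803ms but was 100ms
[2013-11-06 15:41:15] AMU27: TIMEOUT GetResults took 803ms but was 100ms
"""

if __name__ == "__main__":
    for stamp, devices in simultaneous_bursts(SAMPLE).items():
        print(stamp, len(devices), "devices:", ", ".join(devices))
```

A burst that covers most devices in the same second points at something shared (hub, host controller, scheduling stall) rather than at one stick; isolated single-device timeouts would point the other way.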
|
|
|
|
jmc1517
Newbie
Offline
Activity: 56
Merit: 0
|
|
November 07, 2013, 10:09:14 AM |
|
Sync transfers. Except for regular gross timeout overflows.

[2013-11-06 15:41:15] AMU11: TIMEOUT GetResults took 822ms but was 100ms
[2013-11-06 15:41:15] AMU3: TIMEOUT GetResults took 813ms but was 100ms
[2013-11-06 15:41:15] AMU6: TIMEOUT GetResults took 803ms but was 100ms
[2013-11-06 15:41:15] AMU27: TIMEOUT GetResults took 803ms but was 100ms
[2013-11-06 15:41:15] AMU19: TIMEOUT GetResults took 797ms but was 100ms
[2013-11-06 15:41:15] AMU7: TIMEOUT GetResults took 795ms but was 100ms
[2013-11-06 15:41:15] AMU12: TIMEOUT GetResults took 785ms but was 100ms
[2013-11-06 15:41:15] AMU20: TIMEOUT GetResults took 785ms but was 100ms
[2013-11-06 15:41:15] AMU5: TIMEOUT GetResults took 784ms but was 100ms
[2013-11-06 15:41:15] AMU1: TIMEOUT GetResults took 784ms but was 100ms
[2013-11-06 15:41:15] AMU26: TIMEOUT GetResults took 785ms but was 100ms
[2013-11-06 15:41:15] AMU4: TIMEOUT GetResults took 783ms but was 100ms
[2013-11-06 15:41:15] AMU25: TIMEOUT GetResults took 770ms but was 100ms
[2013-11-06 15:41:15] AMU21: TIMEOUT GetResults took 770ms but was 100ms
[2013-11-06 15:41:15] AMU29: TIMEOUT GetResults took 769ms but was 100ms
[2013-11-06 15:41:15] AMU32: TIMEOUT GetResults took 760ms but was 100ms
[2013-11-06 15:41:15] AMU31: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU0: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU18: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU33: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU30: TIMEOUT GetResults took 736ms but was 100ms
[2013-11-06 15:41:15] AMU9: TIMEOUT GetResults took 736ms but was 100ms
[2013-11-06 15:41:15] AMU22: TIMEOUT GetResults took 736ms but was 100ms
[2013-11-06 15:41:15] AMU23: TIMEOUT GetResults took 736ms but was 100ms
[2013-11-06 15:41:15] AMU24: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU16: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU8: TIMEOUT GetResults took 735ms but was 100ms
[2013-11-06 15:41:15] AMU17: TIMEOUT GetResults took 735ms but was 100ms
Note they all happen at the same time. So whatever is causing communication issues is happening across the board, which is why it looks hardware related to me. Still not sure what to make of all this and whether it's even worth pursuing any further. Perhaps offering a sync option or a binary for troublesome setups or something...

True, but...
a) It hasn't happened since yesterday.
b) These look like the same "errors" which 3.5.1 reports.
c) It doesn't result in the setting of a zombie, so mining continues (?)

I would happily run this build as it doesn't cause me - the end-user - any problems!!

Edit: Please see edited previous post - one AMU has a low accepted share rate. Ah, it hasn't done any work since 19:00 last night. Perhaps this one "should" have been a zombie after all then?
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4256
Merit: 1644
Ruu \o/
|
|
November 07, 2013, 10:11:33 AM |
|
OK, the only problem I can see is with AMU6 which has a much lower accepted share rate than all the rest (others are all up in 6000's). Suppose this might be something to worry about (?):
AMU 2: | 335.3M/333.5Mh/s | A:6322 R:12 HW: 67 WU: 4.6/m
AMU 3: | 335.8M/333.6Mh/s | A:6276 R: 8 HW: 57 WU: 4.6/m
AMU 4: | 335.5M/333.4Mh/s | A:6169 R: 3 HW:207 WU: 4.5/m
AMU 5: | 335.3M/333.3Mh/s | A:6300 R: 8 HW:191 WU: 4.6/m
AMU 6: | 335.8M/333.4Mh/s | A:2245 R:19 HW: 83 WU: 1.6/m <---- low ??
AMU 7: | 335.5M/333.5Mh/s | A:6149 R:16 HW: 50 WU: 4.6/m
AMU 8: | 335.6M/333.6Mh/s | A:5795 R: 8 HW: 67 WU: 4.6/m
Makes me wonder then if it isn't this one device causing issues with everything else cause that's clearly dodgy.
|
|
|
|
loshia
Legendary
Offline
Activity: 1610
Merit: 1000
|
|
November 07, 2013, 10:18:00 AM Last edit: November 07, 2013, 10:28:28 AM by loshia |
|
-----------
Never Mind excuse me it was me who was corrupting memory
thank YOU!!!!
Something else just popped up on startup:

(gdb) c
Continuing.
[New Thread 727]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 727]
0x004384e0 in json_decref (json=0xc10ecb9) at jansson.h:108
108         if(json && json->refcount != (size_t)-1 && --json->refcount == 0)
(gdb) bt full
#0  0x004384e0 in json_decref (json=0xc10ecb9) at jansson.h:108
No locals.
#1  0x004385d0 in hashtable_do_clear (hashtable=0x4c1090) at hashtable.c:157
        list = 0x410050 <gbt_decode+348>
        next = 0xafa20138
        pair = 0x41004c <gbt_decode+344>
#2  0x004386f8 in hashtable_close (hashtable=0x4c1090) at hashtable.c:220
No locals.
#3  0x0043b68c in json_delete_object (object=0x4c1088) at value.c:60
No locals.
#4  json_delete (json=0x4c1088) at value.c:844
No locals.
#5  0x0043b6b8 in json_delete_array (array=<optimized out>) at value.c:349
        i = 2
#6  json_delete (json=0x4c0be0) at value.c:847
No locals.
#7  0x004385d0 in hashtable_do_clear (hashtable=0x4b9ca0) at hashtable.c:157
        list = 0x543044
        next = 0x53f5ec
        pair = 0x543040
#8  0x004386f8 in hashtable_close (hashtable=0x4b9ca0) at hashtable.c:220
No locals.
#9  0x0043b68c in json_delete_object (object=0x4b9c98) at value.c:60
No locals.
#10 json_delete (json=0x4b9c98) at value.c:844
No locals.
#11 0x004385d0 in hashtable_do_clear (hashtable=0x4c0230) at hashtable.c:157
        list = 0x53f7cc
        next = 0x53f80c
        pair = 0x53f7c8
#12 0x004386f8 in hashtable_close (hashtable=0x4c0230) at hashtable.c:220
No locals.
#13 0x0043b68c in json_delete_object (object=0x4c0228) at value.c:60
No locals.
#14 json_delete (json=0x4c0228) at value.c:844
No locals.
#15 0x00413264 in pool_active (pool=0x469478, pinging=<optimized out>) at cgminer.c:5891
        append = true
        submit = true
        i = 4
        mutsize = 4
        res_val = <optimized out>
        mutables = 0x543168
        tv_getwork = {tv_sec = 0, tv_usec = 0}
        tv_getwork_reply = {tv_sec = 0, tv_usec = 0}
        ret = false
        val = 0x4c0228
        curl = 0x4a7248
        rolltime = <optimized out>
#16 0x00413960 in test_pool_thread (arg=<optimized out>) at cgminer.c:7581
        pool = 0x469478
#17 0x77eefc94 in start_thread (arg=0x775f8530) at libpthread/nptl/pthread_create.c:297
        pd = 0x775f8530
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {{__pc = 0x77eefbb8 <start_thread+184>, __sp = 0x775f8020, __regs = {2002748720, 2012246048, 2012174776, 2002747648, 0, 0, 4096, 2097152}, __fp = 0x775f8020, __gp = 0x77ec33b0, __fpc_csr = 0, __fpregs = {0, 0, 0, 0, 0, 0}}}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
        robust = <optimized out>
        pagesize_m1 = <optimized out>
---Type <return> to continue, or q <return> to quit---
        sp = 0x775f8020 ""
        freesize = <optimized out>
#18 0x77ee80e0 in __thread_start () at ./libc/sysdeps/linux/mips/clone.S:146
No locals.
Backtrace stopped: frame did not save the PC

    siz = strlen(pool->rpc_url) + strlen(copy_start) + 2;
    pool->lp_url = malloc(siz);
cgminer.c:5891 - in my case
    if (!pool->lp_url) {
        applog(LOG_ERR, "Malloc failure in pool_active");
        return false;
    }
|
|
|
|
aigeezer
Legendary
Offline
Activity: 1450
Merit: 1013
Cryptanalyst castrated by his government, 1952
|
|
November 07, 2013, 12:50:31 PM Last edit: November 07, 2013, 01:10:33 PM by aigeezer |
|
OK, the only problem I can see is with AMU6 which has a much lower accepted share rate than all the rest (others are all up in 6000's). Suppose this might be something to worry about (?):
AMU 2: | 335.3M/333.5Mh/s | A:6322 R:12 HW: 67 WU: 4.6/m
AMU 3: | 335.8M/333.6Mh/s | A:6276 R: 8 HW: 57 WU: 4.6/m
AMU 4: | 335.5M/333.4Mh/s | A:6169 R: 3 HW:207 WU: 4.5/m
AMU 5: | 335.3M/333.3Mh/s | A:6300 R: 8 HW:191 WU: 4.6/m
AMU 6: | 335.8M/333.4Mh/s | A:2245 R:19 HW: 83 WU: 1.6/m <---- low ??
AMU 7: | 335.5M/333.5Mh/s | A:6149 R:16 HW: 50 WU: 4.6/m
AMU 8: | 335.6M/333.6Mh/s | A:5795 R: 8 HW: 67 WU: 4.6/m
Makes me wonder then if it isn't this one device causing issues with everything else cause that's clearly dodgy.

New wrinkle? My ymmv run is 24 hours old, but some time during it AMU7 went zombie. I'll swap it now if possible and keep the run going.

Edit: Weirdness - the hot swap had no effect. A "Q" and restart seemed to work, all LEDs went off on restart and flicker normally but the display shows one AMU missing. The physical AMUs look normal, including the former zombie, but the display shows 0-11 rather than 0-12 - total physical AMUs are 13 for this machine. I'll disconnect them all and/or try another Q and restart next.

Edit: The Q and restart changed nothing. Even after restart the display showed AMUs 0-11 rather than 0-12. I then disconnected the main USB line from the (daisy-chained) hubs. All (12, not 13) AMUs reported zombie status, as expected. When I reconnected the hubs the AMUs were reallocated as 12-24, thirteen of them now as there should be. Bug gone, but cause unknown.
|
|
|
|
vayvanne
|
|
November 07, 2013, 02:49:35 PM Last edit: November 07, 2013, 05:28:41 PM by vayvanne |
|
8 BEs and a Jala in a Tecknet USB 3.0 hub connected to a laptop. No zombies in 3.7.2, but I often see this:

[2013-11-07 14:15:44] AMU0: GetResults (amt=0 err=-7 ern=34)
[2013-11-07 14:15:44] Icarus Read: No data for 12633 ms

The BEs often show a solid light, and by my estimate HW errors have increased. The Jala is OK. 3.5.1 is still the best for my setup so far.
|
|
|
|
Aurum
|
|
November 07, 2013, 04:57:57 PM |
|
...
Unless I misunderstand how you are wiring it up, I would say maybe the power supply doesn't like working at >100%. I came to that number based on 60 Erupters at .5 amp being 30A, and I would have to guess the hub would waste some amount of power. I also have 0 idea how long your cables connecting to the power supply are, but at 30A you should have a really short run and large cable. But I can't figure out why one on the laptop port wouldn't work. I would have taken a shot at it earlier but I didn't see anything obvious unless you only use 1 power supply. Even then your laptop may not provide the full .5 amp without too much voltage sag. Laptops are usually not as powerful on the USB ports. But I know by spec it should work......

EDIT: By "I misunderstand how you are wiring it up" I mean that you say you have 2 hubs and one power supply. Possibly you wanted to convey a power supply per hub. I made the assumption that you had one power supply per hub the first time I read it, as you would be woefully underpowered otherwise.

Sorry, maybe it was my writing. I have 2 hubs and 2 PSUs (both 30A @ 5V). When I have 30 BEs in hub 1 and 29 BEs in hub 2, it works fine (still, both hubs on their own PSU), but as soon as I add a 60th BE (be it in hub 1 or hub 2), it doesn't work anymore. Anyway, at least I know it's not a limitation by the miner, so I'll have to find a solution somewhere else. Thanks to those who replied.

Do you have "usb" : ":60" in your conf file? That's been the best thing for keeping all my BEs running.

I have a miner or two where I made the mistake of letting Windows find the driver and installing something like the ftdixxx driver for the BEs. I switched it with Zadig, but later when I'd add another it might get the old ftdi driver and not the WinUSB driver. On miners where I never made that mistake it always installs the WinUSB driver when I add another BE. I wish I knew how to remove that incorrect driver from my PC.
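The power budget in the quoted estimate above (60 Erupters at 0.5 A each against a 30 A supply) is simple arithmetic, and a small sketch makes it easy to rerun for other setups. The helper name is made up, the 0.5 A per device is the figure used in the quote, and hub losses are ignored, so treat the result as a lower bound on real draw.

```python
def psu_load(n_devices, amps_per_device=0.5, psu_amps=30.0):
    """Return (total draw in amps, fraction of the PSU rating used).

    Assumes every device draws amps_per_device and ignores hub overhead.
    """
    draw = n_devices * amps_per_device
    return draw, draw / psu_amps


if __name__ == "__main__":
    # All 60 devices on one 30 A supply: right at the rating before hub losses.
    print(psu_load(60))
    # 30 devices per hub, each hub on its own 30 A supply: half the rating.
    print(psu_load(30))
```

With one 30 A supply per hub, 30 devices come to about 15 A per supply, so on these assumptions the 60th device failing is unlikely to be a raw power-budget problem.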
|
ghghghfgh
|
|
|
Aurum
|
|
November 07, 2013, 05:03:37 PM |
|
Trying to mine litecoins with cgminer. W2KSP3 GPU 0 HD4850
Should I be using an older version of cgminer (2.6.1)?
I don't know squat about alt coin mining. But, doesn't scrypt require newer Catalyst versions than are supported by that old GPU and OS? Pretty sure you need Catalyst in the 13.x version range. All my gpu miners run on 12.8 because I mix 5970s and 7970s and 5970s need 12.8 and cgminer 3.7.2
|
|
|
|
jmc1517
Newbie
Offline
Activity: 56
Merit: 0
|
|
November 07, 2013, 05:04:34 PM |
|
New wrinkle? My ymmv run is 24 hours old, but some time during it AMU7 went zombie. I'll swap it now if possible and keep the run going.
Edit: Weirdness - the hot swap had no effect. A "Q" and restart seemed to work, all LEDs went off on restart and flicker normally but the display shows one AMU missing. The physical AMUs look normal, including the former zombie, but the display shows 0-11 rather than 0-12 - total physical AMUs are 13 for this machine. I'll disconnect them all and/or try another Q and restart next.
Edit: The Q and restart changed nothing. Even after restart the display showed AMUs 0-11 rather than 0-12. I then disconnected the main USB line from the (daisy-chained) hubs. All (12, not 13) AMUs reported zombie status, as expected. When I reconnected the hubs the AMUs were reallocated as 12-24, thirteen of them now as there should be. Bug gone, but cause unknown.
Re: New wrinkle: I've experienced the same, but would assume that it's normal for a zombie not to be recoverable simply by stopping/starting cgminer(?). Whenever I've had zombies I got into the habit of powering off/on all the hubs and therefore cold-starting all the AMUs. I think it was past experience with older versions of cgminer that taught me that I needed to do that otherwise I would have AMUs missing. Of course it may have changed with newer versions, but having got into the habit, I always stop cgminer, then power off/on, then restart cgminer to fix the problem. YMMV !
|
|
|
|
os2sam
Legendary
Offline
Activity: 3586
Merit: 1098
Think for yourself
|
|
November 07, 2013, 05:06:54 PM |
|
Do you have "usb" : ":60" in your conf file? That's been the best thing for keeping all my BEs running.
So are you over 60 BE's? I had trouble with the 60th too. Mining with 59 at the moment.
|
A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail?
|
|
|
Aurum
|
|
November 07, 2013, 05:17:44 PM |
|
Trying to mine litecoins with cgminer. W2KSP3 GPU 0 HD4850

LTC.bat:
set term=
setx GPU_MAX_ALLOC_PERCENT 100
setx GPU_USE_SYNC_OBJECTS 1
cgminer.exe -d 0 --scrypt -o http://ltc.give-me-coins.com:3333 -u user.worker -p pass --failover-only --shaders=800 -I 10 -Q 2

Log:
[2013-11-06 23:18:41] Started cgminer 3.7.2

Just seems to hang. I can CTRL-C to cmd prompt. I've looked at just about every site's config for cgminer. They are all the same.
Question: What's the most likely reason I can't connect to their server? Should I be using an older version of cgminer (2.6.1)?
Thanks!

Using the conf file is better than the command line, but try this LTC.bat. If you create a shortcut and put it in your startup folder it'll automatically launch on booting, and the "timeout" gives it 10 seconds to discover a network. If this works you can likely increase to -I 18:

timeout /t 10
cgminer.exe -o http://ltc.give-me-coins.com:3333 -u user.worker -p pass --scrypt -I 13 --gpu-engine 625 --gpu-memclock 993 --thread-concurrency 3200

or

cgminer.exe -o http://ltc.give-me-coins.com:3333 -u user.worker -p pass --scrypt -I 13 --gpu-engine 625 --gpu-memclock 993 --thread-concurrency 4032

BTW, you most likely failed because --shaders=800 should be --shaders 800. If you'd prefer to use "--shaders 800" then replace "--gpu-engine 625 --gpu-memclock 993 --thread-concurrency 4032" with "--shaders 800".
|
|
|
|
aigeezer
Legendary
Offline
Activity: 1450
Merit: 1013
Cryptanalyst castrated by his government, 1952
|
|
November 07, 2013, 05:21:38 PM Last edit: November 09, 2013, 12:49:34 PM by aigeezer |
|
New wrinkle? My ymmv run is 24 hours old, but some time during it AMU7 went zombie. I'll swap it now if possible and keep the run going.
Edit: Weirdness - the hot swap had no effect. A "Q" and restart seemed to work, all LEDs went off on restart and flicker normally but the display shows one AMU missing. The physical AMUs look normal, including the former zombie, but the display shows 0-11 rather than 0-12 - total physical AMUs are 13 for this machine. I'll disconnect them all and/or try another Q and restart next.
Edit: The Q and restart changed nothing. Even after restart the display showed AMUs 0-11 rather than 0-12. I then disconnected the main USB line from the (daisy-chained) hubs. All (12, not 13) AMUs reported zombie status, as expected. When I reconnected the hubs the AMUs were reallocated as 12-24, thirteen of them now as there should be. Bug gone, but cause unknown.
Re: New wrinkle: I've experienced the same, but would assume that it's normal for a zombie not to be recoverable simply by stopping/starting cgminer(?). Whenever I've had zombies I got into the habit of powering off/on all the hubs and therefore cold-starting all the AMUs. I think it was past experience with older versions of cgminer that taught me that I needed to do that otherwise I would have AMUs missing. Of course it may have changed with newer versions, but having got into the habit, I always stop cgminer, then power off/on, then restart cgminer to fix the problem. YMMV !

I started out very conservatively several versions back - unplugging the hubs, restarting cgminer, even rebooting in the early days. Over time I found I could usually get by with fewer and fewer interventions, including just a quick hot-swap of a zombie - so my memory says - it's all starting to blur a bit after so many different versions, usually out of synch on my two machines, and staying conservative is probably the wiser course.

Anyway, latest minor event - this run of ymmv is now 4.25 hours old and somewhere along the way it dropped AMU 17 and reallocated it to AMU25. No intervention on my part. A feature, I hope.

Edit: Ten 24 hours into the ymmv run now. One zombie appeared and I was able to hot-swap it. A couple of other AMUs appear to have been reallocated to new numbers during the run. My other machine has been running 3.7.2 (not ymmv) for about four hours now, without incident. It also has apparently reallocated AMUs on occasion - no overt failures though.

Edit: Both my machines are running ymmv now, without incident. One has run for 48+ hours straight and the other for 10+ hours. For me, ymmv is the most stable candidate yet but of course ymmv.
|
|
|
|
RoadStress
Legendary
Offline
Activity: 1904
Merit: 1007
|
|
November 07, 2013, 05:22:31 PM |
|
Cross post: OK, this is new. I'm getting 2 errors I never saw before. When I start cgminer I get an error something like "reset error.this is not an avalon (0:0:0...)", and this is what I get when I stop cgminer:

WTF MUTEX ERROR ON LOCK! errno=0 in driver-avalon.c avalon_flush_work():1549

I reinstalled the Zadig drivers but it's the same. Fortunately they still hash. Are my boards dying already?

Edit: I think it has something to do with cgminer, because using 3.6.6 doesn't give me those errors.
|
|
|
|
Aurum
|
|
November 07, 2013, 05:24:05 PM |
|
Do you have "usb" : ":60" in your conf file? That's been the best thing for keeping all my BEs running.
So are you over 60 BE's? I had trouble with the 60th too. Mining with 59 at the moment.

No, I only have 49 BEs, but I have a dozen PC miners and I spread them over several so I can read the bottom of the cgminer screen. Leave no USB unplugged.
Just a thought: have you tried running 30 in one instance of cgminer and the other 30 in a second instance?
|
|
|
|
Aurum
|
|
November 07, 2013, 05:36:09 PM |
|
Been running 3.7.2 for 2 days and the only problem I have is that it gives me much false hope. When mining BTC it says I found a block about every minute.
|
|
|
|
Gixer1970
Member
Offline
Activity: 66
Merit: 10
|
|
November 07, 2013, 06:58:54 PM Last edit: November 07, 2013, 07:10:28 PM by Gixer1970 |
|
Hi one and all, just a couple of quick questions I would like cleared up so I understand things better, as I have been mining for some time now. I'm just a small-time miner with 8.3 GH/s in total at my disposal, which I have split between two pools.
Queue - Staged work. When writing your own batch file, the --queue (-Q) option has a maximum allowed value of 10, and defaults to 1 if it is not added to the batch file. The settings menu in the UI has the same option. Does the option in the UI have the same upper limit of 10? If not, what is the maximum? I have been playing around with this for months and finding the right balance is not easy. I assume that this relates to the amount of staged work ready (ST). Why would you leave it at the default value of 1?
Total WU from version 3.6.4 - 3.7.2. Since moving from 3.6.4 to version 3.7.2, why does the work utility show a much lower figure than before? As I mentioned earlier, I am multi-pooled with a quota split between two pools: pool 0 at 64% and pool 1 at 36%. Minimum share difficulty is 1 on pool 0 and 8 on pool 1. I do not have the fastest of internet connections, and pool 0 has no option to change the difficulty.
I have also found that changing the scantime to 90s and the expiry to 180s greatly reduced the number of rejected shares.
Any help most welcome.
Regards
|
|
|
|
Askit2
|
|
November 07, 2013, 10:44:30 PM |
|
...
Unless I misunderstand how you are wiring it up, I would say maybe the power supply doesn't like working at >100%. I came to that number based on 60 Erupters at .5 amp being 30A, and I would have to guess the hub would waste some amount of power. I also have 0 idea how long your cables connecting to the power supply are, but at 30A you should have a really short run and large cable. But I can't figure out why one on the laptop port wouldn't work. I would have taken a shot at it earlier but I didn't see anything obvious unless you only use 1 power supply. Even then your laptop may not provide the full .5 amp without too much voltage sag. Laptops are usually not as powerful on the USB ports. But I know by spec it should work......

EDIT: By "I misunderstand how you are wiring it up" I mean that you say you have 2 hubs and one power supply. Possibly you wanted to convey a power supply per hub. I made the assumption that you had one power supply per hub the first time I read it, as you would be woefully underpowered otherwise.

Sorry, maybe it was my writing. I have 2 hubs and 2 PSUs (both 30A @ 5V). When I have 30 BEs in hub 1 and 29 BEs in hub 2, it works fine (still, both hubs on their own PSU), but as soon as I add a 60th BE (be it in hub 1 or hub 2), it doesn't work anymore. Anyway, at least I know it's not a limitation by the miner, so I'll have to find a solution somewhere else. Thanks to those who replied.

Do you have "usb" : ":60" in your conf file? That's been the best thing for keeping all my BEs running.

I have a miner or two where I made the mistake of letting Windows find the driver and installing something like the ftdixxx driver for the BEs. I switched it with Zadig, but later when I'd add another it might get the old ftdi driver and not the WinUSB driver. On miners where I never made that mistake it always installs the WinUSB driver when I add another BE. I wish I knew how to remove that incorrect driver from my PC.
Windows has an option to check Windows Update every time a new device is detected. Go to Control Panel, System, Advanced System Settings, click the Hardware tab, and then click the Device Installation Settings button. Choosing "Never install drivers from Windows Update" should fix it. I am going to try "Install from Windows Update if the driver isn't found on my computer". It should work but may not. "Never" will work.
|
|
|
|
|