PS- Why did you disable Avalon lock lately? If everything is ok the lock shall be ok also.
Everything was not okay with the avalon. There are recursive locks called in a different order from the async flush code and the avalon code. Recursive locks in a different order cause deadlocks. If you are having deadlocks then perhaps you have a hardware driver with a similar problem. The one I have not been able to fully audit so far is the klondike driver, which may be prone to the same problem.
Thank you con!!!! You do not mind me spamming the thread now and then, do you? I am finding bugs from time to time ![Smiley](https://bitcointalk.org/Smileys/default/smiley.gif) I am joking.. Thank you very much
Absolutely do not mind at all. People auditing code is rare, and reporting meaningful bugs is the key to fixing them. What hardware do you have anyway, if you are having deadlock problems? By the way, if you can reliably reproduce what appears to be a deadlock, make sure you have the absolute latest git, edit miner.h to enable LOCK_TRACKING (change line 765), and start cgminer logging the output and with the API enabled, with for example the following extra options: --api-listen --api-allow "W:127.0.0.1" 2>log.txt
and when you see a deadlock, send the API command to get a summary of the lock status. This should spew extra information into the logging file you generated called log.txt, which will allow me to see how the deadlock was caused.
Thank You! I will do as suggested. I have HEX16A boards and I am playing with them for the moment. I will let you know if I find out what happens related to your code, as long as HEX16 is not pushed to your git ... Best
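The lock-ordering point above can be illustrated with a minimal sketch. It assumes two plain pthread mutexes (`lock_a` and `lock_b` are hypothetical stand-ins, not cgminer's actual locks) and shows one common remedy: always acquire a pair of locks in a single global order, here by address, so two code paths that name the locks in opposite orders cannot deadlock each other.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical locks standing in for a driver lock and a flush-path lock. */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Acquire both locks in one global order (lower address first), whatever
 * order the caller names them in, so two threads can never each hold one
 * lock while waiting for the other. */
static void lock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
	pthread_mutex_t *first = (uintptr_t)x < (uintptr_t)y ? x : y;
	pthread_mutex_t *second = (first == x) ? y : x;
	int rc;

	rc = pthread_mutex_lock(first);
	assert(rc == 0);
	rc = pthread_mutex_lock(second);
	assert(rc == 0);
	(void)rc;
}

static void unlock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
	pthread_mutex_unlock(x);
	pthread_mutex_unlock(y);
}

struct worker_args {
	pthread_mutex_t *outer; /* lock this caller thinks of as "first" */
	pthread_mutex_t *inner;
	long iterations;
};

static void *worker(void *arg)
{
	struct worker_args *wa = arg;
	long i;

	for (i = 0; i < wa->iterations; i++) {
		lock_pair(wa->outer, wa->inner);
		shared_counter++;
		unlock_pair(wa->outer, wa->inner);
	}
	return NULL;
}

/* Run two threads that name the locks in opposite orders and return the
 * final counter value once both have finished. */
static long run_ordered_demo(long iterations)
{
	struct worker_args ab = { &lock_a, &lock_b, iterations };
	struct worker_args ba = { &lock_b, &lock_a, iterations };
	pthread_t t1, t2;
	int rc;

	shared_counter = 0;
	rc = pthread_create(&t1, NULL, worker, &ab);
	assert(rc == 0);
	rc = pthread_create(&t2, NULL, worker, &ba);
	assert(rc == 0);
	(void)rc;
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return shared_counter;
}
```

Calling `run_ordered_demo(10000)` from a program built with `-pthread` should return once both threads finish. If `lock_pair` instead locked `x` then `y` as given, the two workers could each take one lock and wait forever for the other.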
|
|
|
PS- Why did you disable Avalon lock lately? If everything is ok the lock shall be ok also.
Everything was not okay with the avalon. There are recursive locks called in a different order from the async flush code and the avalon code. Recursive locks in a different order cause deadlocks. If you are having deadlocks then perhaps you have a hardware driver with a similar problem. The one I have not been able to fully audit so far is the klondike driver, which may be prone to the same problem.
Thank you con!!!! You do not mind me spamming the thread now and then, do you? I am finding bugs from time to time ![Smiley](https://bitcointalk.org/Smileys/default/smiley.gif) I am joking.. Thank you very much Best
|
|
|
Yes, I guess I do not understand the details, but with the latest git I am experiencing deadlocks at applog(LOG_WARNING, "Waiting for work to be available from pools."); and everything freezes. If I find out the reason I will let you know, but something is not right.
PS- Why did you disable the Avalon lock lately (commit c3f13369961a7be6b19fe838ac4b5a7bc8592b16)? If everything is OK, the lock should be OK also. Anyway, thank you.
Hello
I do think that this piece of code can create deadlocks: we are waiting to be awakened in pthread_cond_wait while holding the lock taken by mutex_lock(stgd_lock);
    mutex_lock(stgd_lock);
    ts = __total_staged();
    if (!pool_localgen(cp) && !ts && !opt_fail_only)
        lagging = true;
    /* Wait until hash_pop tells us we need to create more work */
    if (ts > max_staged) {
        pthread_cond_wait(&gws_cond, stgd_lock);
        ts = __total_staged();
    }
    mutex_unlock(stgd_lock);
From the other side, wake_gws needs the same lock, and so on and so on.... Con, please take a look at it when you can.
    static void wake_gws(void)
    {
        mutex_lock(stgd_lock);
        pthread_cond_signal(&gws_cond);
        mutex_unlock(stgd_lock);
    }
A quick fix might be
    mutex_lock(stgd_lock);
    ts = __total_staged();
    mutex_unlock(stgd_lock);
    if (!pool_localgen(cp) && !ts && !opt_fail_only)
        lagging = true;
    /* Wait until hash_pop tells us we need to create more work */
    if (ts > max_staged) {
        pthread_cond_wait(&gws_cond, stgd_lock);
        mutex_lock(stgd_lock);
        ts = __total_staged();
        mutex_unlock(stgd_lock);
    }
I appreciate you looking over the code, however I guess you don't understand that pthread_cond_wait DROPS the mutex lock associated with it. You are supposed to call a pthread conditional wait while holding a mutex lock, and it picks up the lock again when the condition is signalled or the timeout is over.
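The behaviour described in the reply can be checked with a small standalone sketch using plain POSIX calls. All names here (`demo_lock`, `demo_cond`, `ready_cond`, `signal_waiter_while_it_waits`) are hypothetical and not taken from cgminer. The waker only reaches its `pthread_cond_signal` after taking the same mutex the waiter locked before waiting, which is only possible because `pthread_cond_wait` released it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demo_cond = PTHREAD_COND_INITIALIZER;  /* "work is needed" */
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER; /* "waiter has the lock" */
static bool waiter_is_waiting;
static bool work_needed;
static bool waiter_woke;

static void *waiter(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&demo_lock);
	waiter_is_waiting = true;
	pthread_cond_signal(&ready_cond);
	/* pthread_cond_wait atomically unlocks demo_lock while blocked and
	 * re-locks it before returning; the loop handles spurious wakeups. */
	while (!work_needed)
		pthread_cond_wait(&demo_cond, &demo_lock);
	waiter_woke = true;
	pthread_mutex_unlock(&demo_lock);
	return NULL;
}

/* Returns true if the waker could take demo_lock while the waiter was
 * blocked in pthread_cond_wait and the waiter then woke up normally. */
static bool signal_waiter_while_it_waits(void)
{
	pthread_t tid;
	int rc;

	waiter_is_waiting = false;
	work_needed = false;
	waiter_woke = false;

	rc = pthread_create(&tid, NULL, waiter, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_mutex_lock(&demo_lock);
	while (!waiter_is_waiting)
		pthread_cond_wait(&ready_cond, &demo_lock);
	/* We now hold demo_lock although the waiter has not been released yet,
	 * so its pthread_cond_wait must have dropped the mutex. */
	work_needed = true;
	pthread_cond_signal(&demo_cond);
	pthread_mutex_unlock(&demo_lock);

	pthread_join(tid, NULL);
	return waiter_woke;
}
```

If `pthread_cond_wait` kept the mutex held, `signal_waiter_while_it_waits()` would hang instead of returning. Unlocking before the wait, as in the proposed quick fix, breaks this contract, because POSIX requires the mutex to be locked by the calling thread when `pthread_cond_wait` is called.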
|
|
|
-----------
Never mind, excuse me, it was me who was corrupting memory.
thank YOU!!!!
Something else just popped up on startup:
(gdb) c
Continuing.
[New Thread 727]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 727]
0x004384e0 in json_decref (json=0xc10ecb9) at jansson.h:108
108 if(json && json->refcount != (size_t)-1 && --json->refcount == 0)
(gdb) bt full
#0 0x004384e0 in json_decref (json=0xc10ecb9) at jansson.h:108
    No locals.
#1 0x004385d0 in hashtable_do_clear (hashtable=0x4c1090) at hashtable.c:157
    list = 0x410050 <gbt_decode+348>
    next = 0xafa20138
    pair = 0x41004c <gbt_decode+344>
#2 0x004386f8 in hashtable_close (hashtable=0x4c1090) at hashtable.c:220
    No locals.
#3 0x0043b68c in json_delete_object (object=0x4c1088) at value.c:60
    No locals.
#4 json_delete (json=0x4c1088) at value.c:844
    No locals.
#5 0x0043b6b8 in json_delete_array (array=<optimized out>) at value.c:349
    i = 2
#6 json_delete (json=0x4c0be0) at value.c:847
    No locals.
#7 0x004385d0 in hashtable_do_clear (hashtable=0x4b9ca0) at hashtable.c:157
    list = 0x543044
    next = 0x53f5ec
    pair = 0x543040
#8 0x004386f8 in hashtable_close (hashtable=0x4b9ca0) at hashtable.c:220
    No locals.
#9 0x0043b68c in json_delete_object (object=0x4b9c98) at value.c:60
    No locals.
#10 json_delete (json=0x4b9c98) at value.c:844
    No locals.
#11 0x004385d0 in hashtable_do_clear (hashtable=0x4c0230) at hashtable.c:157
    list = 0x53f7cc
    next = 0x53f80c
    pair = 0x53f7c8
#12 0x004386f8 in hashtable_close (hashtable=0x4c0230) at hashtable.c:220
    No locals.
#13 0x0043b68c in json_delete_object (object=0x4c0228) at value.c:60
    No locals.
#14 json_delete (json=0x4c0228) at value.c:844
    No locals.
#15 0x00413264 in pool_active (pool=0x469478, pinging=<optimized out>) at cgminer.c:5891
    append = true
    submit = true
    i = 4
    mutsize = 4
    res_val = <optimized out>
    mutables = 0x543168
    tv_getwork = {tv_sec = 0, tv_usec = 0}
    tv_getwork_reply = {tv_sec = 0, tv_usec = 0}
    ret = false
    val = 0x4c0228
    curl = 0x4a7248
    rolltime = <optimized out>
#16 0x00413960 in test_pool_thread (arg=<optimized out>) at cgminer.c:7581
    pool = 0x469478
#17 0x77eefc94 in start_thread (arg=0x775f8530) at libpthread/nptl/pthread_create.c:297
    pd = 0x775f8530
    unwind_buf = {cancel_jmp_buf = {{jmp_buf = {{__pc = 0x77eefbb8 <start_thread+184>, __sp = 0x775f8020, __regs = {2002748720, 2012246048, 2012174776, 2002747648, 0, 0, 4096, 2097152}, __fp = 0x775f8020, __gp = 0x77ec33b0, __fpc_csr = 0, __fpregs = {0, 0, 0, 0, 0, 0}}}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
    not_first_call = 0
    robust = <optimized out>
    pagesize_m1 = <optimized out>
    sp = 0x775f8020 ""
    freesize = <optimized out>
#18 0x77ee80e0 in __thread_start () at ./libc/sysdeps/linux/mips/clone.S:146
    No locals.
Backtrace stopped: frame did not save the PC
    siz = strlen(pool->rpc_url) + strlen(copy_start) + 2;
    pool->lp_url = malloc(siz);
    /* cgminer.c:5891 - in my case */
    if (!pool->lp_url) {
        applog(LOG_ERR, "Malloc failure in pool_active");
        return false;
    }
|
|
|
Con, I discovered the following nasty bug, which is present on MIPS TP-link only:
    /* This is the central place all work that is about to be retired should be
     * cleaned to remove any dynamically allocated arrays within the struct */
    void clean_work(struct work *work)
    {
        if (work->job_id)
            free(work->job_id);
For some reason work->job_id is not always set - do not ask why ![Wink](https://bitcointalk.org/Smileys/default/wink.gif) and the tp-link breaks badly - debugged and fixed with gdb server/client. loshia.
Thanks, that's very interesting, however calling free on NULL is a valid thing to do so I don't understand how this helps? If work->job_id == NULL then free(work->job_id) equates to free(NULL);
I know... but life sucks ![Smiley](https://bitcointalk.org/Smileys/default/smiley.gif) Can you comment on the diff issue also? PS: I can revert my change and post gdb output if you want?
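The point about free(NULL) can be shown with a minimal sketch. `struct demo_work` and the helper names are hypothetical, not cgminer's definitions. Standard C defines `free(NULL)` as a no-op, so an unconditional `free` is safe as long as the pointer is either NULL or a live allocation. If a NULL check appears to change behaviour, one possibility (an assumption, not something the thread confirms) is that the pointer holds a stale or uninitialised non-NULL value, which a NULL check cannot detect either.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a work item that owns one heap string. */
struct demo_work {
	char *job_id;
};

/* calloc zeroes the struct, so job_id reliably starts out as NULL. */
static struct demo_work *make_demo_work(void)
{
	return calloc(1, sizeof(struct demo_work));
}

/* No NULL check needed: free(NULL) does nothing. Resetting the pointer
 * afterwards makes a second clean harmless instead of a double free. */
static void clean_demo_work(struct demo_work *work)
{
	free(work->job_id);
	work->job_id = NULL;
}

/* Exercises cleaning with job_id unset, set, and already cleaned.
 * Returns 0 on success, -1 if an allocation failed. */
static int demo_clean_twice(void)
{
	struct demo_work *work = make_demo_work();

	if (!work)
		return -1;

	clean_demo_work(work); /* job_id is NULL here: free(NULL) */

	work->job_id = malloc(4);
	if (!work->job_id) {
		free(work);
		return -1;
	}
	memcpy(work->job_id, "abc", 4);

	clean_demo_work(work); /* frees the string */
	clean_demo_work(work); /* job_id was reset, so this is free(NULL) again */
	assert(work->job_id == NULL);

	free(work);
	return 0;
}
```

The design choice worth noting is the zeroing allocation plus the reset after `free`: together they keep the pointer in one of the two states where `free` is defined.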
|
|
|
Con, I discovered the following nasty bug, which is present on MIPS TP-link only:
    /* This is the central place all work that is about to be retired should be
     * cleaned to remove any dynamically allocated arrays within the struct */
    void clean_work(struct work *work)
    {
        if (work->job_id)
            free(work->job_id);
For some reason work->job_id is not always set - do not ask why ![Wink](https://bitcointalk.org/Smileys/default/wink.gif) and the tp-link breaks badly - debugged and fixed with gdb server/client. Best
PS:
[2013-11-07 10:38:34] Accepted fbc145d6 Diff 260/128 HEXa 3 pool 0
[2013-11-07 10:38:54] Accepted f6a0cfde Diff 266/256 HEXa 0 pool 0
[2013-11-07 10:39:30] Pool 0 difficulty changed to 128
[2013-11-07 10:39:51] Rejected 017a21de Diff 173/128 HEXa 3 pool 0 (Below difficulty)
On the tp-link MIPS, from time to time some good shares seem to be rejected. Probably there is some issue in checking whether the work matches the pool difficulty? Please comment. 10X
|
|
|
    if (cgpu && cgpu->deven == DEV_ENABLED)
        cgpu->drv->flush_work(cgpu);
flush work is not initialized in this case
Updated git. The mining thread shouldn't appear with read lock held so see if that code suffices.
10X. But on my tplinks it still segfaults from time to time because of it - drv->flush-queue. It is probably a temporary fix. But unfortunately there is no gdb on my tplink. I hope you will see it on your Avalon unit... Thank you. Best
|
|
|
I have noticed that the 3.7 problems during start are caused by the "Make calls to flush queue and flush work asynchronous wrt to the main…" commit. What happens is that flush_queue uses qlock, which might not yet be initialized by hotplug when more USB miners are connected. A simple if (cgpu) check solves that issue for me.
That's a very insightful observation, thanks, will investigate further.
10X Con, but after your latest changes (rev 6bc691adb26cad59f0598882cb85488f3f5edbe6, 1 parent 42b3cf1, ckolivas authored 18 minutes ago) it is still not working. What works for me is:
    for (i = 0; i < mining_threads; i++) {
        cgpu = mining_thr[i]->cgpu;
        mining_thr[i]->work_restart = true;
        if (cgpu && cgpu->deven == DEV_ENABLED) {
            flush_queue(cgpu);
            cgpu->drv->flush_work(cgpu);
        }
    }
Oh I see, thanks.
    if (cgpu && cgpu->deven == DEV_ENABLED)
        cgpu->drv->flush_work(cgpu);
flush work is not initialized in this case
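The guard discussed above can be sketched in isolation. Everything below is a hypothetical model (the `demo_` types are not cgminer's real structures): a restart loop that skips threads whose device pointer is not set up yet, and also skips drivers whose `flush_work` callback was never filled in, which is the second failure mode mentioned in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_dev_state { DEMO_DEV_DISABLED, DEMO_DEV_ENABLED };

struct demo_cgpu;

/* Hypothetical driver table; a real driver may leave callbacks unset. */
struct demo_drv {
	void (*flush_work)(struct demo_cgpu *cgpu);
};

struct demo_cgpu {
	enum demo_dev_state deven;
	const struct demo_drv *drv;
	int flushes;
};

struct demo_thr {
	struct demo_cgpu *cgpu; /* may still be NULL while hotplug is running */
	bool work_restart;
};

static void demo_flush(struct demo_cgpu *cgpu)
{
	cgpu->flushes++;
}

/* Flag every thread for restart, but only call into devices that are
 * present, enabled, and actually provide a flush_work callback.
 * Returns the number of callbacks invoked. */
static int restart_demo_threads(struct demo_thr *thr, int n)
{
	int i, called = 0;

	for (i = 0; i < n; i++) {
		struct demo_cgpu *cgpu = thr[i].cgpu;

		thr[i].work_restart = true;
		if (!cgpu || cgpu->deven != DEMO_DEV_ENABLED)
			continue;
		if (!cgpu->drv || !cgpu->drv->flush_work)
			continue;
		cgpu->drv->flush_work(cgpu);
		called++;
	}
	return called;
}

/* Builds four threads: no device, disabled device, enabled device without
 * a callback, and a fully set up device. Only the last should be flushed. */
static int demo_restart_count(void)
{
	static const struct demo_drv no_cb = { NULL };
	static const struct demo_drv with_cb = { demo_flush };
	struct demo_cgpu disabled = { DEMO_DEV_DISABLED, &with_cb, 0 };
	struct demo_cgpu missing_cb = { DEMO_DEV_ENABLED, &no_cb, 0 };
	struct demo_cgpu ready = { DEMO_DEV_ENABLED, &with_cb, 0 };
	struct demo_thr thr[4] = {
		{ NULL, false }, { &disabled, false },
		{ &missing_cb, false }, { &ready, false },
	};
	int called = restart_demo_threads(thr, 4);

	assert(ready.flushes == 1);
	return called;
}
```

A check like this only hides the symptom if the device is still being initialised concurrently; whether the real code also needs a lock around hotplug setup is not something this sketch can answer.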
|
|
|
I have noticed that the 3.7 problems during start are caused by the "Make calls to flush queue and flush work asynchronous wrt to the main…" commit. What happens is that flush_queue uses qlock, which might not yet be initialized by hotplug when more USB miners are connected. A simple if (cgpu) check solves that issue for me.
That's a very insightful observation, thanks, will investigate further.
10X Con, but after your latest changes (rev 6bc691adb26cad59f0598882cb85488f3f5edbe6, 1 parent 42b3cf1, ckolivas authored 18 minutes ago) it is still not working. What works for me is:
    for (i = 0; i < mining_threads; i++) {
        cgpu = mining_thr[i]->cgpu;
        mining_thr[i]->work_restart = true;
        if (cgpu && cgpu->deven == DEV_ENABLED) {
            flush_queue(cgpu);
            cgpu->drv->flush_work(cgpu);
        }
    }
    __func__ = "restart_threads"
#1 0x0000000000412ae6 in test_work_current (work=0x7fe9bc0771f0) at cgminer.c:4347
    pool = 0x18577d0
    bedata = "\000\000\000\000\000\000\000\003\263\f2c\020\373\024\065Rաߡ/\305\375\325\304\325\351%\224", <incomplete sequence \347>
    hexstr = '0' <repeats 15 times>, "3b30c326310fb143552d5a1dfa12fc5fdd5c4d5e925946be7\000g\363Z"
    ret = true
    __func__ = "test_work_current"
#2 0x000000000041bc02 in stratum_rthread (userdata=0x18577d0) at cgminer.c:5711
    work = 0x7fe9bc0771f0
    timeout = {tv_sec = 90, tv_usec = 0}
    sel_ret = <optimized out>
    rd = {fds_bits = {1024, 0 <repeats 15 times>}}
    s = 0x7fe9bc0771f0 ""
    pool = 0x18577d0
    threadname = "StratumR/0\000\000\000\000\000"
#3 0x00007fe9cba15e9a in start_thread (arg=0x7fe9b75f6700) at pthread_create.c:308
    __res = <optimized out>
    pd = 0x7fe9b75f6700
    now = <optimized out>
    unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, -183325595836687369, 140642120497280, 140641780591040, 0, 3, 191364403420650487, 191162282270186487}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
    not_first_call = 0
    pagesize_m1 = <optimized out>
    sp = <optimized out>
    freesize = <optimized out>
    __PRETTY_FUNCTION__ = "start_thread"
#4 0x00007fe9cb7423fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
    No locals.
#5 0x0000000000000000 in ?? ()
    No symbol table info available.
|
|
|
Are all the capacitors really necessary ?
Yes dude, they are. The uglier it looks, the better it works.
|
|
|
50 GH is not real, guys. What I get is 40-41 at most, calculated from pool-paid shares as suggested by kano. Keep that in mind and choose wisely.
All boards were tested before we shipped them. The boards are doing ~50GH/s, no doubt about that. I will take some screenshots.
OK, I got it: 1.1v ![Smiley](https://bitcointalk.org/Smileys/default/smiley.gif) and ambient about 10 C. What are you getting with .85v?
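For readers who want to repeat the "calculate from pool-paid shares" check, here is a rough sketch. It assumes the usual Bitcoin convention that one difficulty-1 share represents about 2^32 hashes on average, and that accepted shares are first converted to difficulty-1 equivalents (accepted count times the pool difficulty they were credited at). The function name `estimate_ghs` is hypothetical.

```c
#include <assert.h>

/* Average hashrate in GH/s implied by a number of difficulty-1 equivalent
 * shares accepted over a period. Assumes ~2^32 hashes per diff-1 share. */
static double estimate_ghs(double diff1_shares, double seconds)
{
	const double hashes_per_diff1_share = 4294967296.0; /* 2^32 */

	if (seconds <= 0.0)
		return 0.0;
	return diff1_shares * hashes_per_diff1_share / seconds / 1e9;
}
```

For example, 3600 difficulty-1 shares accepted in one hour correspond to roughly 4.29 GH/s. Short measurement windows are noisy because share finding is random, so a figure taken over many hours is more trustworthy than one taken over minutes.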
|
|
|
@CryptX
before selling a new batch you should care about your existing customers!
my board is only getting 19GH/sec and you are not replying to that issue also I got no hashrate protection refund!
50 GH is not real, guys. What I get is 40-41 at most, calculated from pool-paid shares as suggested by kano. Keep that in mind and choose wisely.
|
|
|
Hey,
No chaining is available for the moment as far as i know.
Thanks for the info loshia. But I guess I will never see my 2 x HEX16.
I feel bad for you dude..... ![Cry](https://bitcointalk.org/Smileys/default/cry.gif) Because you are a knowledgeable person and you know what to do, which means that you would enjoy them working ....
|
|
|
Hi, chaotz. I remember marto said that HEX 16 can be chained to run off one USB host.
Hey, No chaining is available for the moment as far as I know. If you have more than one board I recommend this USB hub: http://www.amazon.de/dp/B00602C91U/ref=pe_386171_38075861_TE_item
Bus 001 Device 002: ID 1a40:0201 Terminus Technology Inc. FE 2.1 7-port Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 003: ID 1a40:0101 Terminus Technology Inc. 4-Port HUB
It works with up to 6 Hexes (stable, and more) (Terminus Technology Inc. FE 2.1 7-port Hub - THAT IS A REAL CHIP!) + TP-link. Make sure that the hub is powered from the PSU 5V; do not use the original hub EXTERNAL power supply/adapter. As a rule of thumb, use a common ground from the PSU for your TP-LINK, HEX16 and USB hub, if any ![Wink](https://bitcointalk.org/Smileys/default/wink.gif) Which means that you have to throw away the TP-Link power adapter also. Power up the tplink from HUB 7,8,9,10 port OK? What I do is: a 750-800W PSU to power HUB+HEX16 (6-7 boards depending on overclocking) and the 12V PSU rail + TP-Link, and it works great.
|
|
|
Don't worry Zich, it took a week to get my HEX16A via speedy.bg, and I am located in the same country as Marto. Don't rush, take a coffee and chill a little if you can. That is my advice, but you do what you want. Regards.
What is the point of your comment, nitrox? Just trolling? Or do you want to make Zich happy? I am wondering if you paid for shipping at all? Oh, I forgot, there is a free lunch and it is supposed to be free? I am damn sure that speedy takes one business day for domestic deliveries, but most probably you live in a very secret place. A village in the mountains? Please share something valuable here, for instance: "I followed the 2GOOD how-to and it works great", or "I donated 0 BTC to 2GOOD because he spent his valuable time on jerks like me". Or at least "I am a poor guy because I live in Bulgaria, but thank you 2GOOD for explaining to me how to use my HEX16!!!!!!!!"
|
|
|
Everything is OK now ![Smiley](https://bitcointalk.org/Smileys/default/smiley.gif) but I somehow do not find the wifi config in there.
You will never find it. WiFi drivers are missing (not compiled), and this is done on purpose. No matter how hard you try, all hex images are wired only.
|
|
|
The Avalon mini has improved system heat dissipation (an added aluminum platen and high-speed fan) and an improved DC-DC module (2 * HI-MOSFET). The chips operate stably under 375 MHz and still have some upside potential.
The miner count is a small bug here. If you reduce the miner count to 16, the overall speed will drop to about half (30-40 GH/s); this is an FPGA controller bug. So just keeping the miner count at 24 is OK.
We did a 36-hour burn-in test before shipment, which guarantees the minis are OK when we ship them. But vibrations during transportation can still cause hardware problems by the time you receive them. If you have issues, please check that all cables are fastened.
Hey, Nice to see you are back! Best!
|
|
|
![Wink](https://bitcointalk.org/Smileys/default/wink.gif) You are a real hacker, dude. By the way, did you get your hex already? I would like to know if all is good with the tp-link. 10X
No, my hex is still at the speedy office ![Cry](https://bitcointalk.org/Smileys/default/cry.gif) because of a document problem. But many said it works perfectly & is super stable ![Wink](https://bitcointalk.org/Smileys/default/wink.gif)
Really? I am glad to hear it ![Cheesy](https://bitcointalk.org/Smileys/default/cheesy.gif)
|
|
|
|