-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 05, 2011, 09:50:49 AM |
|
Thanks very much. Updated the tree. Please try again.
|
Developer/maintainer for cgminer, ckpool/ckproxy, and the -ck kernel 2% Fee Solo mining at solo.ckpool.org -ck
|
|
|
figvam
Newbie
Offline
Activity: 42
Merit: 0
|
|
July 05, 2011, 11:14:22 AM |
|
It crashes after some run time with a floating point exception:
Core was generated by `./cgminer --algo sse2_64 --debug --url http://mineco.in:3000/ --userpass xxxxxx'.
Program terminated with signal 8, Arithmetic exception.
#0  0x00000000004024b2 in miner_thread (userdata=<value optimized out>) at main.c:949
949         max64 = work.blk.nonce +
(gdb) where
#0  0x00000000004024b2 in miner_thread (userdata=<value optimized out>) at main.c:949
#1  0x000000393800673d in start_thread () from /lib64/libpthread.so.0
#2  0x00000039374d44bd in clone () from /lib64/libc.so.6
|
|
|
|
kripz
|
|
July 05, 2011, 12:10:21 PM |
|
Maybe I'm dreaming, but would it be possible to detect a dropped/slow GPU and restart that thread?
Not that this has happened with cgminer, but it has happened with the other single-instance, mine-on-all-GPUs hashers.
|
|
|
|
-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 05, 2011, 12:12:39 PM |
|
Quote:
    Maybe I'm dreaming, but would it be possible to detect a dropped/slow GPU and restart that thread?
    Not that this has happened with cgminer, but it has happened with the other single-instance, mine-on-all-GPUs hashers.
That shouldn't happen. This miner goes to great lengths to keep the mining threads busy at all times in the face of terrible network connectivity. Unless the pool is down for extended periods, it should do what you want already.
|
|
|
|
d3m0n1q_733rz
|
|
July 05, 2011, 06:11:00 PM |
|
Quote:
    Hey, have you added in the SSE2_x64_atom update yet?
Quote:
    No. Is it stable/working ok? Link?
Link: http://yyz.us/bitcoin/sha256_xmm_amd64_atom.asm
And yes, it's stable and working. It was determined that you need to strip "_atom" from both the name of the file and the name within the text of the file. Use it to replace your existing sha256_xmm_amd64.asm file in the x86_64 folder, then recompile. I've been using it since it came out. Alternatively, just copy-paste the code I already stripped from the cpuminer thread and name it sha256_xmm_amd64.asm. You'll be surprised at how many more hashes you'll find with it.
|
Funroll_Loops, the theoretically quicker breakfast cereal! Check out http://www.facebook.com/JupiterICT for all of your computing needs. If you need it, we can get it. We have solutions for your computing conundrums. BTC accepted! 12HWUSguWXRCQKfkPeJygVR1ex5wbg3hAq
|
|
|
-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 05, 2011, 11:01:09 PM |
|
Updated tree:
- Reworked the screen update to have a non-scrolling status line and make the stderr log optional.
- Updated assembly for 64-bit CPU mining with the newer assembly (d3m0n1q, thanks a lot). It seems to be worth a 5-10% improvement.
Getting close now to a release. Need to find a way to include the kernel files in distdir and install.
|
|
|
|
kripz
|
|
July 06, 2011, 12:24:37 AM |
|
[2011-07-06 09:26:27] [188.20 | 188.33 Mhash/s] [1484 Accepted] [21 Rejected] [0 HW errors]
[2011-07-06 09:26:27] [thread 1: 67108864 hashes, 41359356 khash/sec]
[2011-07-06 09:26:27] [thread 0: 50331648 hashes, 31027223 khash/sec]
[2011-07-06 09:26:28] LONGPOLL detected new block
[2011-07-06 09:26:30] getwork thread 1
[2011-07-06 09:26:30] work retrieval failed, exiting gpu mining thread 0
[2011-07-06 09:26:30] [thread 1: 50331648 hashes, 22013848 khash/sec]
[2011-07-06 09:26:30] Failed to tq_push work in workio_get_work
[2011-07-06 09:26:30] Received kill message
[2011-07-06 09:26:30] workio thread dead, exiting.
What happened here?
|
|
|
|
-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 06, 2011, 03:19:23 AM |
|
Quote:
    [2011-07-06 09:26:27] [188.20 | 188.33 Mhash/s] [1484 Accepted] [21 Rejected] [0 HW errors]
    [2011-07-06 09:26:27] [thread 1: 67108864 hashes, 41359356 khash/sec]
    [2011-07-06 09:26:27] [thread 0: 50331648 hashes, 31027223 khash/sec]
    [2011-07-06 09:26:28] LONGPOLL detected new block
    [2011-07-06 09:26:30] getwork thread 1
    [2011-07-06 09:26:30] work retrieval failed, exiting gpu mining thread 0
    [2011-07-06 09:26:30] [thread 1: 50331648 hashes, 22013848 khash/sec]
    [2011-07-06 09:26:30] Failed to tq_push work in workio_get_work
    [2011-07-06 09:26:30] Received kill message
    [2011-07-06 09:26:30] workio thread dead, exiting.
    What happened here?
For some reason it was unable to retrieve any work, and that's a fatal error, so it aborted cgminer entirely. I've updated the tree to retry getting work up to the configured number of times (infinite by default) now.
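A minimal sketch of that kind of retry behaviour, not cgminer's actual code. The names `fetch_work_fn`, `get_work_with_retries` and `flaky_fetch`, and the convention that a negative `max_retries` means "retry forever", are assumptions made only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true when a unit of work was retrieved. */
typedef bool (*fetch_work_fn)(void *ctx);

/*
 * Try to fetch work, retrying on failure.
 * max_retries < 0 means "retry forever" (assumed convention for this sketch).
 * pause, if non-NULL, is called between attempts (e.g. to sleep a few seconds).
 * Returns true on success, false once the retry budget is exhausted.
 */
static bool get_work_with_retries(fetch_work_fn fetch, void *ctx,
                                  int max_retries, void (*pause)(void))
{
    int failures = 0;

    assert(fetch != NULL);
    for (;;) {
        if (fetch(ctx))
            return true;
        if (max_retries >= 0 && ++failures > max_retries)
            return false;   /* caller decides what to do; nothing exits here */
        if (pause)
            pause();
    }
}

/* Demo fetcher: fails until the counter behind ctx reaches zero. */
static bool flaky_fetch(void *ctx)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```

With `max_retries` set to -1 the loop only returns once `fetch` succeeds, so a transient failure no longer has to take the whole miner down.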
|
|
|
|
d3m0n1q_733rz
|
|
July 06, 2011, 09:22:05 AM |
|
Question about an optimization: I was wondering if I were to take advantage of the horizontal math function of SSE3 in place of the multiple paddd functions, would it be quicker? And also, I was wondering if someone could give me a hand with si/di registers and explain why they weren't used effectively in the code?
|
|
|
|
mf
Newbie
Offline
Activity: 24
Merit: 0
|
|
July 06, 2011, 09:29:17 AM |
|
Quote:
    Getting close now to a release. Need to find a way to include the kernel files in distdir and install.
Con, would you consider adding or receiving a pull request to get the status line to be similar to phoenix's, and add a couple more info points:
- Total getworks received in the current run
- Efficiency %, calculated as percentage of getworks received vs accepted (can be >100%) in the current run
- Utility (for lack of a better name), calculated as being the number of accepted per minute in the current run (varies wildly at first, should stabilise after a bit).
All the above needs is:
- an unsigned to hold the total getworks received
- one more %d per log printf and/or _info for the total getworks
- two more %.2f per log printf and/or _info for the efficiency % and utility:
  - Efficiency: getwork_requested ? cgpu->accepted * 100 / getwork_requested : 0.0
  - Utility: accepted/total_secs*60
cgminer's MHash/s seem to wildly differ from other miners I have. The "efficiency" and "utility" are probably better measures of how the miner has performed in the long run than "just" the MHash. Especially the "Utility", as it (together with "accepted") is the only measure of how well the miner performs with regards to submitting the shares upstream.
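The two formulas above could be sketched roughly like this. The function names `efficiency_pct` and `utility_per_min` are made up for the example, and using `double` arithmetic (so that `%.2f` prints a fraction rather than a truncated integer) is an assumption, not something taken from the proposed patch.

```c
#include <assert.h>

/* Efficiency: accepted shares as a percentage of getworks requested.
 * Can exceed 100 when a single getwork yields more than one share.
 * Returns 0.0 before any getwork has been requested. */
static double efficiency_pct(unsigned accepted, unsigned getworks)
{
    return getworks ? accepted * 100.0 / getworks : 0.0;
}

/* Utility: accepted shares per minute over the current run.
 * Returns 0.0 while no time has elapsed, to avoid dividing by zero. */
static double utility_per_min(unsigned accepted, double total_secs)
{
    assert(total_secs >= 0.0);
    return total_secs > 0.0 ? accepted / total_secs * 60.0 : 0.0;
}
```

Both values can then be passed to the existing log printf with `%.2f`. Guarding the division in both functions keeps the first status line sane before any time or getworks have accumulated.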
|
|
|
|
-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 06, 2011, 09:31:51 AM |
|
Yes, I'm more than happy to take code from elsewhere. Just be aware the code is still in heavy flux so make sure you pull before making any changes.
|
|
|
|
-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 06, 2011, 09:57:41 AM |
|
|
|
|
|
-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 07, 2011, 05:06:33 AM |
|
Updated tree.
I've included mf's changes (thanks) which add the efficiency and utility columns to the output. Since the output got very busy I've abbreviated it to look like the following (new columns are efficiency and utility):
[(5s):186.5 (avg):204.5 Mh/s] [Q:84 A:83 R:9 HW:0 E:99% U:2.75/m]
I've also added code to put new work into a staging area first where the latest work is examined to see if it belongs to the same block or not. The utility of this is to cope at times when longpoll becomes unreliable, slow or is not supported/working. This minimises the chance of working on stale work under those circumstances and will produce a message like this:
[Accepted] [GPU 0] [192.0 Mh/s] [Q:60 A:50 R:6 HW:0 E:83% U:2.45/m]
[2011-07-07 14:52:01] New block detected, possible missed longpoll, flushing work queue
[Accepted] [GPU 0] [191.8 Mh/s] [Q:64 A:51 R:6 HW:0 E:80% U:2.35/m]
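One way to picture the staging check, as a sketch only: compare the previous-block hash carried by each newly fetched work item with the one seen last, and drop the queue when it changes. The struct `stage_state`, the function `stage_work` and the simple counter standing in for the work queue are all hypothetical and not taken from cgminer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PREV_HASH_LEN 32

/* Hypothetical staging state: the previous-block hash of the block we
 * believe we are currently working on, plus a stand-in for the queue. */
struct stage_state {
    unsigned char current_prev_hash[PREV_HASH_LEN];
    bool have_block;
    unsigned queued;    /* work items waiting for the mining threads */
    unsigned flushes;   /* how many times the queue was flushed */
};

/* Examine newly fetched work before queueing it.
 * Returns true if the work revealed a new block (queue was flushed). */
static bool stage_work(struct stage_state *st,
                       const unsigned char prev_hash[PREV_HASH_LEN])
{
    bool new_block = false;

    assert(st != NULL && prev_hash != NULL);
    if (st->have_block &&
        memcmp(st->current_prev_hash, prev_hash, PREV_HASH_LEN) != 0) {
        /* Different parent block: everything queued so far is stale. */
        st->queued = 0;
        st->flushes++;
        new_block = true;
    }
    memcpy(st->current_prev_hash, prev_hash, PREV_HASH_LEN);
    st->have_block = true;
    st->queued++;
    return new_block;
}
```

The point of the design is that the block change is noticed from ordinary work requests, so it still works when longpoll is slow or missing.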
|
|
|
|
kripz
|
|
July 07, 2011, 12:46:30 PM |
|
Windows build please.
Quote:
    [2011-07-06 09:26:27] [188.20 | 188.33 Mhash/s] [1484 Accepted] [21 Rejected] [0 HW errors]
    [2011-07-06 09:26:27] [thread 1: 67108864 hashes, 41359356 khash/sec]
    [2011-07-06 09:26:27] [thread 0: 50331648 hashes, 31027223 khash/sec]
    [2011-07-06 09:26:28] LONGPOLL detected new block
    [2011-07-06 09:26:30] getwork thread 1
    [2011-07-06 09:26:30] work retrieval failed, exiting gpu mining thread 0
    [2011-07-06 09:26:30] [thread 1: 50331648 hashes, 22013848 khash/sec]
    [2011-07-06 09:26:30] Failed to tq_push work in workio_get_work
    [2011-07-06 09:26:30] Received kill message
    [2011-07-06 09:26:30] workio thread dead, exiting.
    What happened here?
Quote:
    For some reason it was unable to retrieve any work and that's a fatal error, so it aborted cgminer entirely. I've updated the tree to retry getting work up to the configured amount of times (infinite by default) now.
Quote:
    I have dropped cgminer for now because of this. It runs for about half a day, then randomly exits, forcing me to re-start it or run a restart loop around it. Not worth the hassle. Once I've seen it run stable for a week or so in the presence of pool shenanigans (ddos, upgrade downtime, etc), I'll consider running it again.
He said it was fixed?
|
|
|
|
-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 07, 2011, 01:56:09 PM |
|
|
|
|
|
dikidera
|
|
July 07, 2011, 09:23:55 PM |
|
With the latest version, it doesn't seem to like pools that much:
[2011-07-08 00:22:18] 0 gpu miner threads started
[2011-07-08 00:22:18] Long-polling activated for http://192.168.233.128:8337/LP
[2011-07-08 00:22:18] 2 cpu miner threads started, using SHA256 '4way' algorithm.
[2011-07-08 00:22:18] JSON decode failed(-1): unable to decode byte 0xc9 at position 592
[2011-07-08 00:22:19] Failed json_rpc_call in get_upstream_work
[2011-07-08 00:22:19] json_rpc_call failed on get work, retry after 5 seconds
|
|
|
|
d3m0n1q_733rz
|
|
July 08, 2011, 12:09:16 PM Last edit: July 08, 2011, 12:21:12 PM by d3m0n1q_733rz |
|
Question: Which is more optimum?

    movdqa xmm1, [rdx]
    pshufd xmm2, xmm1, 0x55
    paddd xmm5, xmm2
    pshufd xmm6, xmm1, 0xAA
    paddd xmm4, xmm6
    pshufd xmm11, xmm1, 0xFF
    paddd xmm3, xmm11
    pshufd xmm1, xmm1, 0
    paddd xmm7, xmm1

    movdqa xmm1, [rdx+4*4]
    pshufd xmm2, xmm1, 0x55
    paddd xmm8, xmm2
    pshufd xmm6, xmm1, 0xAA
    paddd xmm9, xmm6
    pshufd xmm11, xmm1, 0xFF
    paddd xmm10, xmm11
    pshufd xmm1, xmm1, 0
    paddd xmm0, xmm1

    movdqa [hash+0*16], xmm7
    movdqa [hash+1*16], xmm5
    movdqa [hash+2*16], xmm4
    movdqa [hash+3*16], xmm3
    movdqa [hash+4*16], xmm0
    movdqa [hash+5*16], xmm8
    movdqa [hash+6*16], xmm9
    movdqa [hash+7*16], xmm10

or

    movdqa xmm1, [rdx]
    pshufd xmm2, xmm1, 0x55
    pshufd xmm6, xmm1, 0xAA
    pshufd xmm11, xmm1, 0xFF
    pshufd xmm1, xmm1, 0

    paddd xmm5, xmm2
    paddd xmm4, xmm6
    paddd xmm3, xmm11
    paddd xmm7, xmm1

    movdqa xmm1, [rdx+4*4]
    pshufd xmm2, xmm1, 0x55
    pshufd xmm6, xmm1, 0xAA
    pshufd xmm11, xmm1, 0xFF
    pshufd xmm1, xmm1, 0

    paddd xmm8, xmm2
    paddd xmm9, xmm6
    paddd xmm10, xmm11
    paddd xmm0, xmm1

    movdqa [hash+0*16], xmm7
    movdqa [hash+1*16], xmm5
    movdqa [hash+2*16], xmm4
    movdqa [hash+3*16], xmm3
    movdqa [hash+4*16], xmm0
    movdqa [hash+5*16], xmm8
    movdqa [hash+6*16], xmm9
    movdqa [hash+7*16], xmm10

Oddly enough, I'm seeing higher optimization using the first set of code, which is part of my modifications to the sse2_64_atom code.
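For anyone comparing the two orderings: a small portable C model, offered only as a sketch, suggests they compute the same values, so any speed difference would come from how the CPU schedules the instructions, which this model cannot show. The type `vec4` and all function names are invented for the example. `broadcast_lane` stands in for `pshufd` with immediates 0x00/0x55/0xAA/0xFF and `add4` for `paddd`, and the accumulators are indexed by lane rather than by the actual xmm register numbers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t lane[4]; } vec4;   /* models one XMM register */

/* Models pshufd with immediate 0x00/0x55/0xAA/0xFF: broadcast lane i. */
static vec4 broadcast_lane(vec4 v, int i)
{
    vec4 r = {{ v.lane[i], v.lane[i], v.lane[i], v.lane[i] }};
    return r;
}

/* Models paddd: lane-wise 32-bit add with wraparound. */
static vec4 add4(vec4 a, vec4 b)
{
    vec4 r;
    for (int k = 0; k < 4; k++)
        r.lane[k] = a.lane[k] + b.lane[k];
    return r;
}

/* First ordering: each shuffle is followed immediately by its add. */
static void add_state_interleaved(vec4 acc[4], vec4 state)
{
    for (int i = 0; i < 4; i++)
        acc[i] = add4(acc[i], broadcast_lane(state, i));
}

/* Second ordering: all four shuffles first, then all four adds. */
static void add_state_grouped(vec4 acc[4], vec4 state)
{
    vec4 tmp[4];

    for (int i = 0; i < 4; i++)
        tmp[i] = broadcast_lane(state, i);
    for (int i = 0; i < 4; i++)
        acc[i] = add4(acc[i], tmp[i]);
}

/* Byte-wise comparison of two sets of four accumulators. */
static int vec4_equal(const vec4 a[4], const vec4 b[4])
{
    return memcmp(a, b, 4 * sizeof(vec4)) == 0;
}
```

Since the results match, the only reliable way to pick between the two is to benchmark both on the target CPU.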
|
|
|
|
-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 09, 2011, 12:35:14 AM |
|
Updated tree:
Implemented never idle logic. During periods of network or server problems it takes existing work and generates more work from that till the server starts responding properly or fast enough. This means that hash rates should -never- drop now with cgminer.
This is it in action:
[Accepted] [GPU 2] [435.0 Mh/s] [Q:111 A:21 R:0 HW:0 E:19% U:22.20/m]
[2011-07-09 10:26:01] Server not providing work fast enough, generating work locally
[Accepted] [GPU 3] [429.4 Mh/s] [Q:135 A:21 R:1 HW:0 E:16% U:18.13/m]
[2011-07-09 10:26:53] Resumed retrieving work from server
If your network is down for extensive periods eventually this will generate more rejected blocks, but for transient blips this makes a massive difference.
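The post does not say how the extra work is derived from existing work. One approach, assumed here purely for illustration, is to advance the timestamp field of the 80-byte block header by one second and restart the nonce search. The function name `roll_header` and the use of a raw little-endian header (rather than whatever byte order the work data arrives in) are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HEADER_LEN   80
#define NTIME_OFFSET 68  /* after version (4), prev hash (32), merkle root (32) */
#define NONCE_OFFSET 76  /* after ntime (4) and bits (4) */

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static void write_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Derive a new header from an existing one: timestamp + 1, nonce reset.
 * Every other field (version, hashes, bits) is copied unchanged. */
static void roll_header(unsigned char dst[HEADER_LEN],
                        const unsigned char src[HEADER_LEN])
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, HEADER_LEN);
    write_le32(dst + NTIME_OFFSET, read_le32(src + NTIME_OFFSET) + 1);
    write_le32(dst + NONCE_OFFSET, 0);
}
```

Work derived this way still points at the same previous block, which fits the caveat above: it helps through short blips, but goes stale if the server stays unreachable.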
|
|
|
|
d3m0n1q_733rz
|
|
July 09, 2011, 01:50:53 AM Last edit: July 09, 2011, 04:54:30 AM by d3m0n1q_733rz |
|
Quote:
    Question: Which is more optimum?

        movdqa xmm1, [rdx]
        pshufd xmm2, xmm1, 0x55
        paddd xmm5, xmm2
        pshufd xmm6, xmm1, 0xAA
        paddd xmm4, xmm6
        pshufd xmm11, xmm1, 0xFF
        paddd xmm3, xmm11
        pshufd xmm1, xmm1, 0
        paddd xmm7, xmm1

        movdqa xmm1, [rdx+4*4]
        pshufd xmm2, xmm1, 0x55
        paddd xmm8, xmm2
        pshufd xmm6, xmm1, 0xAA
        paddd xmm9, xmm6
        pshufd xmm11, xmm1, 0xFF
        paddd xmm10, xmm11
        pshufd xmm1, xmm1, 0
        paddd xmm0, xmm1

        movdqa [hash+0*16], xmm7
        movdqa [hash+1*16], xmm5
        movdqa [hash+2*16], xmm4
        movdqa [hash+3*16], xmm3
        movdqa [hash+4*16], xmm0
        movdqa [hash+5*16], xmm8
        movdqa [hash+6*16], xmm9
        movdqa [hash+7*16], xmm10

    or

        movdqa xmm1, [rdx]
        pshufd xmm2, xmm1, 0x55
        pshufd xmm6, xmm1, 0xAA
        pshufd xmm11, xmm1, 0xFF
        pshufd xmm1, xmm1, 0

        paddd xmm5, xmm2
        paddd xmm4, xmm6
        paddd xmm3, xmm11
        paddd xmm7, xmm1

        movdqa xmm1, [rdx+4*4]
        pshufd xmm2, xmm1, 0x55
        pshufd xmm6, xmm1, 0xAA
        pshufd xmm11, xmm1, 0xFF
        pshufd xmm1, xmm1, 0

        paddd xmm8, xmm2
        paddd xmm9, xmm6
        paddd xmm10, xmm11
        paddd xmm0, xmm1

        movdqa [hash+0*16], xmm7
        movdqa [hash+1*16], xmm5
        movdqa [hash+2*16], xmm4
        movdqa [hash+3*16], xmm3
        movdqa [hash+4*16], xmm0
        movdqa [hash+5*16], xmm8
        movdqa [hash+6*16], xmm9
        movdqa [hash+7*16], xmm10

    Oddly enough, I'm seeing higher optimization using the first set of code, which is part of my modifications to the sse2_64_atom code.
I've managed to smooth out the beginning hashes by prefetching the initial values so that the beginning hashes don't start slow and make their way up to their max value. With some adjustments to function placement along with some vertical and horizontal math conversions via SSE3, we've managed to squeeze quite a few more hashes out of this code. Unfortunately, it's no longer just SSE2_64, but just about all 64-bit processors support SSE3 anyway. I've sent a copy of the newer optimized code to everyone that's helped me out as a thank you. Stay frosty!
Edit: New portion of the project includes the use of YMM registers. Anyone feel like playing around with this?
|
|
|
|
-ck
Legendary
Offline
Activity: 4242
Merit: 1644
Ruu \o/
|
|
July 09, 2011, 03:05:34 AM |
|
|
|
|
|
|