adv
February 16, 2011, 01:11:29 AM
Hello jgarzik. Thanks for the good miner.
I want to suggest some improvements (I may be wrong). As I understand your sources, each thread currently issues its own getwork and processes it alone. Why not split the processing of one getwork across several threads? Like this: thread_1(nonce = 0 to N), thread_2(nonce = N to M), ..., thread_X(nonce = Y to Z).
This should reduce the load on the network and the time lost in processing RPC requests, at least when the block solution time exceeds the maximum allowed scan time (scantime). But that is the most common case, no?
A further improvement (for the distant future): parallelize the miner across multiple hosts using a binary (that is, fast) protocol that supports pooling, specifies a nonce range for each client, and calibrates client performance in order to size that range. In practice this means building a new pool server together with its clients. Is this interesting to you?
P.S. This is somewhat similar to processing one getwork on SSE2 processors, which is discussed in another thread.
U may thank me here: 14Js1ng1SvYBPgUJnjNAEPYH4d6SHF79UF

jgarzik (OP)
February 16, 2011, 02:35:23 AM
Quote from adv:
    I want to suggest some improvements (I may be wrong). As I understand your sources, each thread currently issues its own getwork and processes it alone. Why not split the processing of one getwork across several threads? Like this: thread_1(nonce = 0 to N), thread_2(nonce = N to M), ..., thread_X(nonce = Y to Z).
    This should reduce the load on the network and the time lost in processing RPC requests, at least when the block solution time exceeds the maximum allowed scan time (scantime). But that is the most common case, no?
Yes, this is a good idea for the future. If someone wanted to contribute a clean implementation of this, I would accept it. As it stands now, simply starting completely independent threads was easier to code.
Breaking up the nonce work implies some level of coordination among threads, which must be done carefully to avoid hurting performance due to locks / asynchronous queues / similar threading details. One must also take care never to stall one thread while it waits on other threads. This is not a hard problem... but it must be done right to avoid these pitfalls.
Quote from adv:
    A further improvement (for the distant future): parallelize the miner across multiple hosts using a binary (that is, fast) protocol that supports pooling, specifies a nonce range for each client, and calibrates client performance in order to size that range. In practice this means building a new pool server together with its clients. Is this interesting to you?
Pooling already exists. See this thread, or see this thread for a binary pool protocol.
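The nonce split discussed in this exchange can be illustrated with a small Python sketch. It is not cpuminer code: the function name split_nonce_range and the choice of a fixed, equal-sized partition are assumptions made for illustration. A static partition is the simplest way to avoid the locking concerns raised above, because each thread learns its range once per getwork and never has to coordinate while hashing.

```python
from typing import List, Tuple

NONCE_SPACE = 2**32  # the block header nonce is a 32-bit unsigned integer


def split_nonce_range(n_threads: int, total: int = NONCE_SPACE) -> List[Tuple[int, int]]:
    """Return half-open [start, end) nonce ranges, one per thread (hypothetical helper)."""
    if n_threads < 1:
        raise ValueError("need at least one thread")
    base, extra = divmod(total, n_threads)
    ranges = []
    start = 0
    for i in range(n_threads):
        # The first `extra` threads take one additional nonce so nothing is left over.
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


if __name__ == "__main__":
    for tid, (lo, hi) in enumerate(split_nonce_range(3), start=1):
        print(f"thread_{tid}: nonce = {lo:#010x} to {hi - 1:#010x}")
```

The ranges are contiguous and cover the whole nonce space exactly once, so no nonce is scanned twice and none is skipped. A real implementation would still have to decide what a fast thread does when it finishes its range before the others, which is the stalling problem mentioned in the reply.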
Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own. Visit bloq.com / metronome.io Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj

jgarzik (OP)
February 16, 2011, 02:37:51 AM
Quote:
    Because:
    $ ./configure
    [...]
    checking for pkg-config... /opt/local/bin/pkg-config
    checking pkg-config is at least version 0.9.0... yes
    ./configure: line 4301: syntax error near unexpected token `,'
    ./configure: line 4301: `LIBCURL_CHECK_CONFIG(, 7.10.1, ,'
Quote:
    autogen could not find your libcurl autoconf macros. Sounds like you have a non-standard installation, outside the normal autoconf/automake paths.
Quote:
    I have the same thing. Trying to configure sources on FreeBSD 8.1-RELEASE-p2 i386. Maybe, there are some linker flags or workarounds or something?
The solution here is easy: don't build from git, build from tarball releases. The tarball already has the proper configure script generated, so you don't have to worry about missing macros or broken OS configurations.

hangover
February 16, 2011, 08:25:03 AM
Quote:
    I have the same thing. Trying to configure sources on FreeBSD 8.1-RELEASE-p2 i386. Maybe, there are some linker flags or workarounds or something?
Quote from jgarzik:
    The solution here is easy: don't build from git, build from tarball releases. The tarball already has the proper configure script generated, so you don't have to worry about missing macros or broken OS configurations.
Thank you very much!

myrkul
February 16, 2011, 11:27:53 AM
Quick question: is there a way to set up minerd to run as a Linux screensaver? An extra thread while I'm not using the computer at all would be a nice way to boost my hash/sec.

hangover
February 16, 2011, 03:15:06 PM
Patch for successful compilation of minerd under FreeBSD:
--- miner.h.orig 2011-02-10 11:51:55.000000000 +0600
+++ miner.h 2011-02-16 16:04:28.000000000 +0600
@@ -5,7 +5,7 @@
 #include <stdint.h>
 #include <sys/time.h>
 #include <jansson.h>
-#include <curl/curl.h>
+#include </usr/local/include/curl/curl.h>
 
 #ifdef __SSE2__
 #define WANT_SSE2_4WAY 1
@@ -18,7 +18,12 @@
 #if ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
 #define WANT_BUILTIN_BSWAP
 #else
-#include <byteswap.h>
+/* #include <byteswap.h> */ // <-- doesn't exist under FreeBSD
+# define bswap_64 __bswap64
+# define bswap_32 __bswap32
+# define bswap_16 __bswap16
+# define __BIG_ENDIAN__ (_BYTE_ORDER == _BIG_ENDIAN)
+#include <sys/endian.h>
 #endif
 
 #if defined(__i386__)

myrkul
February 17, 2011, 09:53:00 AM
One more thing to ask: Is there a way to log the output? I tried the standard > redirect, but nothing is dumped into the text file, and nothing displays in the terminal screen, either.
Interestingly, the redirect works with the --help switch, and does fill the text file with the help output, and once I try redirecting the live program output, the text is blanked, but even with the >> append, nothing is ever put into the file. What am I doing wrong?
OS: Linux 10.04 Processor: Crap GPU: Crap

hangover
February 17, 2011, 10:10:12 AM
Quote from myrkul:
    One more thing to ask: Is there a way to log the output? I tried the standard > redirect, but nothing is dumped into the text file, and nothing displays in the terminal screen, either.
    Interestingly, the redirect works with the --help switch, and does fill the text file with the help output, and once I try redirecting the live program output, the text is blanked, but even with the >> append, nothing is ever put into the file. What am I doing wrong?
    OS: Linux 10.04 Processor: Crap GPU: Crap
To duplicate the program output into a file, you can do something like this:
minerd --threads X --url http://mypool.local:8332 --userpass user:pass 2>&1 | tee /some/path/log.log

myrkul
February 17, 2011, 10:28:48 AM
Quote from hangover:
    To duplicate the program output into a file, you can do something like this:
    minerd --threads X --url http://mypool.local:8332 --userpass user:pass 2>&1 | tee /some/path/log.log
OK, that puts "1 miner threads started, using SHA256 'cryptopp_asm32' algorithm." into the logfile, but fails to record (or display) the hash count. Progress, but still not 100%.
Thanks for the fast response, btw.

hangover
February 17, 2011, 11:28:06 AM
Quote from hangover:
    To duplicate the program output into a file, you can do something like this:
    minerd --threads X --url http://mypool.local:8332 --userpass user:pass 2>&1 | tee /some/path/log.log
Quote from myrkul:
    OK, that puts "1 miner threads started, using SHA256 'cryptopp_asm32' algorithm." into the logfile, but fails to record (or display) the hash count. Progress, but still not 100%.
    Thanks for the fast response, btw.
Hmm. It seems to me that cpuminer piped through tee delivers its output in batches. I started minerd and got this:
hangover@hangover:~$ minerd --threads 2 --url http://mining.bitcoin.cz:8332 --userpass hangover.XX:YY 2>&1 | tee /home/hangover/loglog
2 miner threads started, using SHA256 'c' algorithm.
And after waiting for 5 minutes I got this:
HashMeter(1): 16777215 hashes, 628.20 khash/sec
HashMeter(0): 16777215 hashes, 620.19 khash/sec
HashMeter(0): 8388607 hashes, 643.54 khash/sec
HashMeter(1): 8388607 hashes, 637.17 khash/sec
HashMeter(1): 4194303 hashes, 502.27 khash/sec
HashMeter(0): 4194303 hashes, 430.20 khash/sec
HashMeter(1): 3194303 hashes, 582.57 khash/sec
HashMeter(0): 3194303 hashes, 602.13 khash/sec
HashMeter(1): 3194303 hashes, 582.68 khash/sec
HashMeter(0): 3194303 hashes, 586.11 khash/sec
HashMeter(0): 3194303 hashes, 632.09 khash/sec
HashMeter(1): 3194303 hashes, 598.81 khash/sec
HashMeter(0): 3194303 hashes, 643.12 khash/sec
HashMeter(1): 3194303 hashes, 628.86 khash/sec
HashMeter(0): 3294303 hashes, 638.16 khash/sec
HashMeter(1): 3194303 hashes, 643.76 khash/sec
HashMeter(0): 3294303 hashes, 661.30 khash/sec
HashMeter(1): 3294303 hashes, 641.22 khash/sec
HashMeter(0): 3394303 hashes, 645.43 khash/sec
HashMeter(1): 3294303 hashes, 645.00 khash/sec
HashMeter(0): 3394303 hashes, 647.02 khash/sec
...
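A plausible explanation for output arriving in batches, not confirmed anywhere in this thread, is block buffering: C standard I/O usually buffers stdout in large chunks when it is a pipe or a file rather than a terminal, while stderr is unbuffered. The Python sketch below only imitates that effect with an in-memory stand-in for the pipe. PipeSink and emit are made-up names, and none of this is minerd code.

```python
import io


class PipeSink(io.RawIOBase):
    """Stand-in for the read end of a pipe (such as tee): records every chunk it receives."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def writable(self):
        return True

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)


def emit(lines, flush_each_line, buffer_size=4096):
    """Write lines through a block-buffered stream.

    Returns (chunks seen before the final flush, chunks seen after it).
    """
    sink = PipeSink()
    out = io.BufferedWriter(sink, buffer_size=buffer_size)
    for line in lines:
        out.write(line.encode() + b"\n")
        if flush_each_line:
            out.flush()  # push the line to the sink immediately
    seen_before_final_flush = list(sink.chunks)
    out.flush()
    return seen_before_final_flush, list(sink.chunks)


if __name__ == "__main__":
    lines = ["HashMeter(0): 3194303 hashes, 602.13 khash/sec"] * 10
    buffered, _ = emit(lines, flush_each_line=False)
    flushed, _ = emit(lines, flush_each_line=True)
    print("chunks delivered without per-line flush:", len(buffered))
    print("chunks delivered with per-line flush:", len(flushed))
```

Without a flush per line, nothing reaches the sink until the buffer fills or the stream is flushed, which matches the long silence followed by a burst of HashMeter lines. In a C program the usual remedies are calling fflush(stdout) after each line or selecting line buffering with setvbuf().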

myrkul
February 17, 2011, 11:47:54 AM
Ahh. Just need to wait longer then?
Me and my impatient self. Thanks!

jwf
February 17, 2011, 06:58:11 PM (last edit: February 17, 2011, 07:32:46 PM by jwf)
YMMV, but I found that -Os (optimize for size, which better exploits caches) outperforms -O3 or -O4 by about 8%.
Update:
I only bothered testing the 4way algorithm, but I tested on 3 recent Intel processors: a Xeon W3530 @ 2.80GHz (8192 KB cache), a Xeon @ 3.00GHz (2048 KB cache), and a Xeon E5430 @ 2.66GHz (6144 KB cache), all with approximately the same results.
I used the version obtained from git with most recent commit 4a7f3f70b5628cb804ca4f46cf51651a1a42507f.
gcc version: Ubuntu 4.4.3-4ubuntu5, CFLAGS="-O(s|3) -ftree-vectorize -march=native".
jordan

Cerebrum
February 17, 2011, 07:12:25 PM
Quote from jwf:
    ymmv, but I found that -Os (optimize for size, better exploits caches) outperforms -O3 or -O4 by about 8%.
    jordan
This may also be why ufasoft's code runs faster than the 4way implementation used here. Since he doesn't unroll the loop used in the SHA calculation, the code is smaller and may fit into the cache more easily.
MEMORY HIERARCHIES: THAT $#%! IS SO CACHE.

Patches
February 18, 2011, 01:42:35 PM
Just to share, the miner seems to work quite well under wine on OS X on a Mac Pro. Just install wine as per these instructions: http://wiki.winehq.org/MacOSX/Installing (I used Mac Ports) and you are good to go.

BeeCee1
February 19, 2011, 04:34:30 PM
For those of you on Linux who are itching to try out ufasoft's optimizations (see this thread: http://bitcointalk.org/index.php?topic=3486.0): one of the optimizations he does is "skip last 3 rounds, because we need not them to calc last 32-bit word of hash [h word]."
I tried this by simply commenting out the final three rounds. It successfully generated shares in slush's mining pool and made things a tiny bit faster (1 or 2%). I didn't verify that it works in all cases; someone more familiar with the protocol might want to. Like this:
//w13 = add4(SIGMA1_256(w11), w6, SIGMA0_256(w14), w13);
//SHA256ROUND(d, e, f, g, h, a, b, c, 61, w13);
//w14 = add4(SIGMA1_256(w12), w7, SIGMA0_256(w15), w14);
//SHA256ROUND(c, d, e, f, g, h, a, b, 62, w14);
//w15 = add4(SIGMA1_256(w13), w8, SIGMA0_256(w0), w15);
//SHA256ROUND(b, c, d, e, f, g, h, a, 63, w15);
/* store resulsts directly in thash */
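Why skipping rounds 61 to 63 leaves the h word intact can be checked independently of the miner. In SHA-256 each round computes a new e and then the working variables shift by one position, so the e produced in round 60 becomes f, then g, then h after rounds 61, 62 and 63 without being modified. The Python sketch below is a plain, slow reference implementation written for this illustration (it is not the 4way code, and all function names are made up); it derives the constants from their FIPS 180-4 definition and compares the result with hashlib.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def first_primes(n):
    """Return the first n primes by trial division."""
    primes, cand = [], 2
    while len(primes) < n:
        if all(cand % p for p in primes):
            primes.append(cand)
        cand += 1
    return primes


def icbrt(n):
    """Exact integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(64)
# FIPS 180-4: first 32 fractional bits of the square/cube roots of the primes.
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]
K = [icbrt(p << 96) & MASK for p in PRIMES]


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block, rounds=64):
    """Run the first `rounds` SHA-256 rounds on one 64-byte block.

    Returns the working variables a..h, without the final feed-forward addition.
    """
    w = list(struct.unpack(">16I", block))
    for t in range(16, 64):
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for t in range(rounds):
        big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[t] + w[t]) & MASK
        big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [a, b, c, d, e, f, g, h]


def pad_single_block(msg):
    """Pad a message of at most 55 bytes into exactly one SHA-256 block."""
    assert len(msg) <= 55
    return msg + b"\x80" + b"\x00" * (55 - len(msg)) + struct.pack(">Q", 8 * len(msg))


def last_word_skipping_3_rounds(msg):
    """Last 32-bit digest word, computed from rounds 0..60 only."""
    e_after_round_60 = compress(H0, pad_single_block(msg), rounds=61)[4]
    return (H0[7] + e_after_round_60) & MASK


if __name__ == "__main__":
    expected = int.from_bytes(hashlib.sha256(b"abc").digest()[-4:], "big")
    print(last_word_skipping_3_rounds(b"abc") == expected)
```

The shortcut yields only the last word, so it suits a quick "is this word zero" pre-check, and the other seven words are wrong until the skipped rounds are run. That is consistent with the shares BeeCee1 reports being accepted, assuming the miner re-verifies or recomputes the full hash for candidates. In the commented-out lines above, rounds 61 to 63 never write to the variable h, which is the same fact expressed through the macro's rotating argument order.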

jgarzik (OP)
March 03, 2011, 03:51:21 AM
Version 0.7.1 is released. See top of thread for URLs.
Changes:
- Add support for JSON-format configuration file. See example file example-cfg.json. Any long argument on the command line may be stored in the config file.
- Timestamp each solution found
- Improve sha256_4way performance. NOTE: This optimization makes the 'hash' debug-print output for sha256_4way incorrect.
- Use __builtin_expect() intrinsic as compiler micro-optimization
- Build on Intel compiler
- HTTP library now follows HTTP redirects
SHA1: 5520112505b16f89b473ed897b89e1593aeb1371 cpuminer-installer-0.7.1.zip
MD5: 1b77192b76bf50938c005b2c26d3809f cpuminer-installer-0.7.1.zip
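The statement that any long command-line argument may be stored in the config file suggests a one-to-one mapping between JSON keys and long option names. The Python sketch below illustrates that idea only. The key names are taken from the long options that appear in this thread (--url, --userpass, --threads); the real contents of example-cfg.json and the way minerd parses it are not shown here, so treat the mapping and the helper name config_to_argv as assumptions.

```python
import json

# Hypothetical config, using the long option names seen elsewhere in this thread.
EXAMPLE_CONFIG = """
{
    "url": "http://mypool.local:8332",
    "userpass": "user:pass",
    "threads": 2
}
"""


def config_to_argv(config_text):
    """Translate a JSON object into the equivalent list of --long-option arguments."""
    config = json.loads(config_text)
    argv = []
    for key, value in config.items():
        if isinstance(value, bool):
            if value:
                argv.append(f"--{key}")  # a true boolean becomes a bare flag
        else:
            argv.extend([f"--{key}", str(value)])
    return argv


if __name__ == "__main__":
    print(" ".join(["minerd"] + config_to_argv(EXAMPLE_CONFIG)))
```

Under these assumptions the config above corresponds to the command line minerd --url http://mypool.local:8332 --userpass user:pass --threads 2.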

TurdHurdur
March 03, 2011, 04:27:28 AM
Fine update.
I have a question though, is there a reason STDIN and STDERR aren't accessible when exec'd from Perl? I've had luck with IO::Pty on Linux, but not on Cygwin or Strawberry Perl on Win32. I ask this assuming that exec-ing it from any non-term/console would have the same issue and this isn't a Perl-specific problem.

mrballcb
March 03, 2011, 08:57:37 PM
Quote from jgarzik:
    - Add support for JSON-format configuration file. See example file example-cfg.json. Any long argument on the command line may be stored in the config file.
Curious, how come a JSON format was chosen over something like YAML? JSON is extremely unforgiving when it comes to syntax, compared to YAML, which is much more forgiving. Personally I also find YAML to be more readable, but that's just a personal preference not based on capabilities or function.
Are there plans to centralize configs or something (since JSON is so web friendly)? Or was it just a convenient way to save/retrieve configuration?
Regards... Todd

chromicant
March 03, 2011, 08:59:30 PM
Quote from mrballcb:
    Curious, how come a json format was chosen over something like yaml?
Probably because there's a whole JSON parser that's part of the miner... so might as well reuse the code!

Raulo
March 03, 2011, 10:04:38 PM (last edit: March 03, 2011, 11:30:52 PM by Raulo)
If anybody is interested, this is the assembly code from the Intel Compiler which gives me the fastest 4way code on AMD K10: 1.13 MH/s per 1 GHz per physical core.
https://gist.github.com/853566
Compile cpuminer-0.7.1, download the above code, issue:
gcc -c sha256_4way.s
and run make again to link the object file into the executable. It's about 7% faster than the gcc version.
1HAoJag4C3XtAmQJAhE9FTAAJWFcrvpdLM