Bitcoin Forum
Author Topic: An (even more) optimized version of cpuminer (pooler's cpuminer, CPU-only)  (Read 1958259 times)
BeeCee1
December 20, 2011, 01:36:48 AM
 #41

That's an impressive improvement. It doubled my hash power with a small decrease in power consumption. My spidey sense is tingling: there will be a big jump in difficulty soon.

On the old miner I found that I'd get a small increase in hash power by running more threads than cores; with this one there isn't any advantage to extra threads.
HolodeckJizzmopper
December 20, 2011, 01:47:57 AM
Last edit: December 20, 2011, 02:06:41 AM by HolodeckJizzmopper
 #42

Running minerd.exe on Windows with no parameters causes a GPF (general protection fault).

Edit: ... and confirming performance gains of 1.5-2x on Intel with this latest code, across a variety of processors.
Remember remember the 5th of November
December 20, 2011, 01:58:49 AM
 #43

Quote from: HolodeckJizzmopper on December 20, 2011, 01:47:57 AM
Running minerd.exe on Windows with no parameters causes a GPF.

Known bug.

BTC:1AiCRMxgf1ptVQwx6hDuKMu4f7F27QmJC2
ThiagoCMC
December 20, 2011, 02:17:08 AM
 #44

My total hashrate jumped from 110 khash/s to 160 khash/s!!!!!

You are The Man Pooler!!! YOU ROCK!!

Donation sent to your Litecoin Address: LTCPooLqTK1SANSNeTR63GbGwabTKEkuS7

68fdc851264fe0efa91292725b7ae5a30bb0780a5575bf7220b9d0ad06d392b0

Cheers!
Thiago
BitcoinPorn
December 20, 2011, 04:46:19 AM
 #45

Saying nothing new here, but amazing.

exahash
December 20, 2011, 04:52:28 AM
 #46

I have a handful of boxes with Pentium D 2.8 GHz CPUs (820s, I think), and this version of cpuminer is actually slower than artforz's.

I'm not sure what's going on with these boxes, but artforz's runs at 1.48 khash/s/thread (two threads) while pooler's is only doing 1.40. I know it's a tiny difference, but it still seems strange to me. I've tried recompiling each with varying CFLAGS and seen no change.

This is under Ubuntu 10.04 LTS, completely updated.

Has anyone else seen this, or does anyone have ideas?

mrx
December 20, 2011, 05:07:36 AM
Last edit: December 20, 2011, 05:28:40 AM by mrx
 #47

Test results:

Linux x86-64, Intel Xeon, before (artforz, modified speed output):
Code:
[2011-12-20 13:02:41] thread 6: 15074 hashes, 3.01210 khash/sec
[2011-12-20 13:02:42] thread 7: 14079 hashes, 2.96454 khash/sec
[2011-12-20 13:02:42] thread 0: 14959 hashes, 2.99386 khash/sec
[2011-12-20 13:02:44] thread 1: 14920 hashes, 2.97748 khash/sec
[2011-12-20 13:02:44] thread 3: 14619 hashes, 2.87325 khash/sec
[2011-12-20 13:02:45] thread 2: 14765 hashes, 2.93248 khash/sec
[2011-12-20 13:02:45] thread 4: 15090 hashes, 3.01685 khash/sec
[2011-12-20 13:02:45] thread 5: 15079 hashes, 2.89249 khash/sec
[2011-12-20 13:02:46] thread 6: 15061 hashes, 3.01753 khash/sec


after (modified speed output):
Code:
[2011-12-20 13:06:44] thread 5: 20551 hashes, 4.28743 khash/s
[2011-12-20 13:06:45] thread 0: 21568 hashes, 4.31442 khash/s
[2011-12-20 13:06:45] thread 1: 20840 hashes, 4.18909 khash/s
[2011-12-20 13:06:46] thread 6: 21690 hashes, 4.33446 khash/s
[2011-12-20 13:06:48] thread 2: 21572 hashes, 4.30622 khash/s
[2011-12-20 13:06:49] thread 3: 21128 hashes, 4.27796 khash/s
[2011-12-20 13:06:49] thread 7: 21588 hashes, 4.25990 khash/s
[2011-12-20 13:06:49] thread 5: 21438 hashes, 4.32439 khash/s
[2011-12-20 13:06:50] thread 4: 21709 hashes, 3.77865 khash/s

Windows 32-bit, Intel Core 2 Duo, before(amdfam10-sse4a):
Code:
[2011-12-20 13:15:10] thread 1: 6553 hashes, 1.40 khash/sec
[2011-12-20 13:15:10] thread 0: 6553 hashes, 1.38 khash/sec

after:
Code:
[2011-12-20 13:17:05] thread 0: 16422 hashes, 3.49 khash/s
[2011-12-20 13:17:06] thread 1: 16346 hashes, 3.46 khash/s

Windows 32-bit, AMD Phenom II X4, before(amdfam10-sse4a):
Code:
[2011-12-20 13:22:01] thread 1: 9101 hashes, 1.70 khash/sec
[2011-12-20 13:22:04] thread 0: 6965 hashes, 1.76 khash/sec
[2011-12-20 13:22:04] thread 3: 9362 hashes, 1.87 khash/sec
[2011-12-20 13:22:05] thread 2: 8364 hashes, 1.62 khash/sec

after:
Code:
[2011-12-20 13:28:24] thread 1: 12141 hashes, 2.39 khash/s
[2011-12-20 13:28:24] thread 0: 11528 hashes, 2.31 khash/s
[2011-12-20 13:28:24] thread 2: 12009 hashes, 2.45 khash/s
[2011-12-20 13:28:24] thread 3: 11708 hashes, 2.35 khash/s


Splendid!
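
For anyone who wants the gain as a single number, here is a minimal C sketch that averages the per-thread rates from the Core 2 Duo and Phenom II X4 logs above and divides "after" by "before". The function and array names are made up for illustration; none of this is part of cpuminer.

```c
#include <stddef.h>

/* Mean of n per-thread hash rates (khash/s). Returns 0.0 for an empty set. */
double mean_rate(const double *rates, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += rates[i];
    return n ? sum / (double)n : 0.0;
}

/* Speedup factor: mean rate after divided by mean rate before.
 * Returns 0.0 if the "before" mean is not positive. */
double speedup(const double *before, size_t nb,
               const double *after, size_t na)
{
    double b = mean_rate(before, nb);
    return b > 0.0 ? mean_rate(after, na) / b : 0.0;
}

/* Per-thread rates copied from the logs in this post (khash/s). */
const double c2d_before[] = { 1.40, 1.38 };             /* Core 2 Duo, before */
const double c2d_after[]  = { 3.49, 3.46 };             /* Core 2 Duo, after  */
const double ph2_before[] = { 1.70, 1.76, 1.87, 1.62 }; /* Phenom II, before  */
const double ph2_after[]  = { 2.39, 2.31, 2.45, 2.35 }; /* Phenom II, after   */
```

Calling `speedup(c2d_before, 2, c2d_after, 2)` gives about 2.5x for the Core 2 Duo, and `speedup(ph2_before, 4, ph2_after, 4)` about 1.37x for the Phenom II X4. These are single snapshots of a few log lines, so treat them as rough figures.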
meti
December 20, 2011, 07:01:42 AM
 #48

Wow thank you!

My MacBook Pro (i5 M, 2.53 GHz) went from 1.18 kh/s x 4 = 4.72 kh/s to 2.9 kh/s x 4 = 11.6 kh/s!

Amazing!

Edit: When I think about it, maybe I built something wrong with the first minerd. I'm new to such things. This time I used the prebuilt binary.
P4man
December 20, 2011, 07:12:12 AM
 #49

I'm seeing very impressive speedups on Core 2 Duos and Quads, and less impressive ones on my AMD machines and an old P4. Unfortunately, this speed bump of course does little to increase the profitability of Litecoin mining, which is currently pretty much nonexistent, since everyone will upgrade, heh. But great job tweaking the code!

Come-from-Beyond
December 20, 2011, 09:55:35 AM
 #50

It seems to me that the speed boost is higher for old machines than for new ones. If I'm right, then it's a very good feature: it helps those who own obsolete computers compete with the others.
pooler (OP)
December 20, 2011, 11:12:35 AM
 #51

Quote from: exahash on December 20, 2011, 04:52:28 AM
I have a handful of boxes with Pentium D 2.8 GHz CPUs (820s, I think), and this version of cpuminer is actually slower than artforz's.

I'm not sure what's going on with these boxes, but artforz's runs at 1.48 khash/s/thread (two threads) while pooler's is only doing 1.40. I know it's a tiny difference, but it still seems strange to me. I've tried recompiling each with varying CFLAGS and seen no change.

This is under Ubuntu 10.04 LTS, completely updated.

Has anyone else seen this, or does anyone have ideas?

Uhm, bizarre. I have never worked with Pentium Ds, so... let me have a look at Wikipedia... OK, these basically seem to be 64-bit-enabled dual-core Pentium 4s (i.e. NetBurst arch).
Judging from the results, I guess you are running a 32-bit environment, but I still don't understand how the new version could be slower.
Can anyone else with a Pentium D confirm this issue?

BTC: 15MRTcUweNVJbhTyH5rq9aeSdyigFrskqE · LTC: LTCPooLqTK1SANSNeTR63GbGwabTKEkuS7
Matoking
December 20, 2011, 02:57:46 PM
Last edit: December 20, 2011, 03:16:19 PM by Matoking
 #52

Holy crap.

My hashrate jumped from 1.7 khash/sec to 3.2 khash/sec per thread.

And with full firepower it's now 13 khash/sec in total. Amazing!

BTC : 1CcpmVDLvR7DgA5deFGScoNhiEtiJnh6H4 - LTC : LYTnoXAHNsemMB2jhCSi1znQqnfupdRkSy
Bitcoin-otc
BitBin - earn bitcoins with your pastes!
xurious
December 20, 2011, 03:25:28 PM
 #53

Quote from: exahash on December 20, 2011, 04:52:28 AM
I have a handful of boxes with Pentium D 2.8 GHz CPUs (820s, I think), and this version of cpuminer is actually slower than artforz's.

I'm not sure what's going on with these boxes, but artforz's runs at 1.48 khash/s/thread (two threads) while pooler's is only doing 1.40. I know it's a tiny difference, but it still seems strange to me. I've tried recompiling each with varying CFLAGS and seen no change.

This is under Ubuntu 10.04 LTS, completely updated.

Has anyone else seen this, or does anyone have ideas?

Quote from: pooler on December 20, 2011, 11:12:35 AM
Uhm, bizarre. I have never worked with Pentium Ds, so... let me have a look at Wikipedia... OK, these basically seem to be 64-bit-enabled dual-core Pentium 4s (i.e. NetBurst arch).
Judging from the results, I guess you are running a 32-bit environment, but I still don't understand how the new version could be slower.
Can anyone else with a Pentium D confirm this issue?

I'll check on a work machine in a bit, but I'm not sure which artforz miner he was using. I'm only going to see what the new kh/s is with a 32-bit OS and miner.

SiaMining.com -- First PPS SiaMining Pool! 3%, VarDiff, Stratum Support
ovidiusoft
December 20, 2011, 04:17:31 PM
 #54

Here are my results:

Intel(R) Core(TM) i7 CPU       Q 740  @ 1.73GHz

8 threads
0.8 khash/s/thread => 2 khash/s/thread

Intel(R) Xeon(R) CPU           X3430  @ 2.40GHz

4 threads
2.2 khash/s/thread => 3.6 khash/s/thread

and just for the fun of it:

AMD Sempron(tm) Processor 3200+

2 threads
0.8 khash/s/thread => 0.9 khash/s/thread

Optimal CFLAGS determined with this method: http://blog.mybox.ro/2011/11/02/how-to-recompile-software-with-hardware-optimizations/
exahash
December 20, 2011, 04:19:53 PM
 #55

Quote from: exahash on December 20, 2011, 04:52:28 AM
I have a handful of boxes with Pentium D 2.8 GHz CPUs (820s, I think), and this version of cpuminer is actually slower than artforz's.

I'm not sure what's going on with these boxes, but artforz's runs at 1.48 khash/s/thread (two threads) while pooler's is only doing 1.40. I know it's a tiny difference, but it still seems strange to me. I've tried recompiling each with varying CFLAGS and seen no change.

This is under Ubuntu 10.04 LTS, completely updated.

Has anyone else seen this, or does anyone have ideas?

Quote from: pooler on December 20, 2011, 11:12:35 AM
Uhm, bizarre. I have never worked with Pentium Ds, so... let me have a look at Wikipedia... OK, these basically seem to be 64-bit-enabled dual-core Pentium 4s (i.e. NetBurst arch).
Judging from the results, I guess you are running a 32-bit environment, but I still don't understand how the new version could be slower.
Can anyone else with a Pentium D confirm this issue?

Quote from: xurious on December 20, 2011, 03:25:28 PM
I'll check on a work machine in a bit, but I'm not sure which artforz miner he was using. I'm only going to see what the new kh/s is with a 32-bit OS and miner.

The Pentium Ds are running fresh installs of Ubuntu 10.04 64-bit, brought up to date with "aptitude dist-upgrade" and rebooted.

The artforz miner was pulled from git yesterday with "git pull https://github.com/ArtForz/cpuminer", as was pooler's with "git pull https://github.com/pooler/cpuminer". Both were compiled with CFLAGS="-march=native -O3 -Wall -msse2".

I even tried copying over the binaries that were compiled on the Sempron and Xeon boxes but got the same results.

I'm thinking there's either something about the P4Ds that makes them bad at scrypt, or I've got a BIOS or OS setting messed up somewhere.

It seems odd that they are 2.8 GHz dual-core chips with each thread doing exactly half its GHz in khash/s. Maybe there is some instruction that takes two clock cycles which only takes one in newer chips?

If I can get some time, I might go try Windows and/or Ubuntu 11.10 on one of them to see if it makes a difference.
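
As a rough sanity check on the "half its GHz" observation, the sketch below turns a clock speed and a per-thread hash rate into approximate CPU cycles per hash. It assumes one mining thread per core and that the core really runs at its nominal 2.8 GHz; `cycles_per_hash` is a hypothetical helper, not something from either miner.

```c
/* Approximate CPU cycles spent per hash on one core:
 * clock frequency in Hz divided by hashes per second.
 * Returns 0.0 if the hash rate is not positive. */
double cycles_per_hash(double clock_ghz, double khash_per_sec)
{
    if (khash_per_sec <= 0.0)
        return 0.0;
    return (clock_ghz * 1e9) / (khash_per_sec * 1e3);
}
```

With 2.8 GHz and 1.40 khash/s this works out to about 2.0 million cycles per hash; at 1.48 khash/s, about 1.89 million. That is only a back-of-the-envelope figure for comparing chips and does not by itself point at any particular instruction.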





pooler (OP)
December 20, 2011, 07:11:26 PM
 #56

Quote from: exahash on December 20, 2011, 04:19:53 PM
The Pentium Ds are running fresh installs of Ubuntu 10.04 64-bit, brought up to date with "aptitude dist-upgrade" and rebooted.

The artforz miner was pulled from git yesterday with "git pull https://github.com/ArtForz/cpuminer", as was pooler's with "git pull https://github.com/pooler/cpuminer". Both were compiled with CFLAGS="-march=native -O3 -Wall -msse2".

I even tried copying over the binaries that were compiled on the Sempron and Xeon boxes but got the same results.

I'm thinking there's either something about the P4Ds that makes them bad at scrypt, or I've got a BIOS or OS setting messed up somewhere.

It seems odd that they are 2.8 GHz dual-core chips with each thread doing exactly half its GHz in khash/s. Maybe there is some instruction that takes two clock cycles which only takes one in newer chips?

If I can get some time, I might go try Windows and/or Ubuntu 11.10 on one of them to see if it makes a difference.

Yes, I would like to see the performance in 32-bit mode. The Pentium Ds were very early 64-bit CPUs and are not as good at SSE as later Core-based models, but I expected them to get some improvement from the new code.

BTC: 15MRTcUweNVJbhTyH5rq9aeSdyigFrskqE · LTC: LTCPooLqTK1SANSNeTR63GbGwabTKEkuS7
Come-from-Beyond
December 20, 2011, 07:20:31 PM
 #57

A 64-bit miner should work 50% faster if you calculate 2 hashes at once. To get this bonus, you should double your code in the following manner:

1st SSE instruction that calc hash #1 (using xmm0-xmm7)
1st SSE instruction that calc hash #2 (using xmm8-xmm15)
2nd SSE instruction that calc hash #1 (using xmm0-xmm7)
2nd SSE instruction that calc hash #2 (using xmm8-xmm15)
...
...
Nth SSE instruction that calc hash #1 (using xmm0-xmm7)
Nth SSE instruction that calc hash #2 (using xmm8-xmm15)
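
The ordering above can be sketched in plain C. This is only an illustration of the interleaving idea, not pooler's code and not real SSE assembly: it uses the scalar Salsa20 quarter-round (scrypt's mixing function is built on Salsa20/8) as a stand-in, and the names `quarter_round` and `quarter_round_x2` are hypothetical. The two states are independent, so each step for hash #2 can be issued right after the same step for hash #1 without waiting on its result.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t x, int n)
{
    return (x << n) | (x >> (32 - n));
}

/* One Salsa20 quarter-round on a single 4-word state. */
void quarter_round(uint32_t s[4])
{
    s[1] ^= rotl32(s[0] + s[3], 7);
    s[2] ^= rotl32(s[1] + s[0], 9);
    s[3] ^= rotl32(s[2] + s[1], 13);
    s[0] ^= rotl32(s[3] + s[2], 18);
}

/* The same quarter-round on two independent states, interleaved step by
 * step: step k for hash #1, then step k for hash #2.
 * a and b must not overlap. */
void quarter_round_x2(uint32_t a[4], uint32_t b[4])
{
    a[1] ^= rotl32(a[0] + a[3], 7);   /* 1st step, hash #1 */
    b[1] ^= rotl32(b[0] + b[3], 7);   /* 1st step, hash #2 */
    a[2] ^= rotl32(a[1] + a[0], 9);   /* 2nd step, hash #1 */
    b[2] ^= rotl32(b[1] + b[0], 9);   /* 2nd step, hash #2 */
    a[3] ^= rotl32(a[2] + a[1], 13);  /* 3rd step, hash #1 */
    b[3] ^= rotl32(b[2] + b[1], 13);  /* 3rd step, hash #2 */
    a[0] ^= rotl32(a[3] + a[2], 18);  /* 4th step, hash #1 */
    b[0] ^= rotl32(b[3] + b[2], 18);  /* 4th step, hash #2 */
}
```

The interleaved version produces exactly the same results as calling `quarter_round` on each state in turn; only the instruction order differs. Whether that order buys anything like 50% depends on the CPU and on how the compiler schedules the code, which is why hand-written assembly with the xmm0-xmm7 / xmm8-xmm15 split is what was suggested.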
pooler (OP)
December 20, 2011, 08:35:46 PM
 #58

Quote from: Come-from-Beyond on December 20, 2011, 07:20:31 PM
A 64-bit miner should work 50% faster if you calculate 2 hashes at once. To get this bonus, you should double your code in the following manner:

1st SSE instruction that calc hash #1 (using xmm0-xmm7)
1st SSE instruction that calc hash #2 (using xmm8-xmm15)
2nd SSE instruction that calc hash #1 (using xmm0-xmm7)
2nd SSE instruction that calc hash #2 (using xmm8-xmm15)
...
...
Nth SSE instruction that calc hash #1 (using xmm0-xmm7)
Nth SSE instruction that calc hash #2 (using xmm8-xmm15)

Thank you for the suggestion; I'll try to implement something like that as soon as I find some time. I'm currently trying to fix a couple of bugs that were already present in the old minerd.

BTC: 15MRTcUweNVJbhTyH5rq9aeSdyigFrskqE · LTC: LTCPooLqTK1SANSNeTR63GbGwabTKEkuS7
ThiagoCMC
December 20, 2011, 10:45:09 PM
 #59

Pooler,

 Don't you think that NVIDIA CUDA could run some new assembly code for scrypt faster than a CPU?
 I'm asking because I see that some guys are working on an open-source miner for SolidCoin (a.k.a. Shitcoin) that runs on CUDA...

Best!
Thiago
Red Emerald
December 20, 2011, 11:04:53 PM
 #60

Went from 3.5 khash/sec to 4.7 on my 3.0GHz AMD Athlon II X2 250 Processor
