Bitcoin Forum
Author Topic: [LOCKED] cpuminer-opt v3.12.3, open source optimized multi-algo CPU miner  (Read 443981 times)
lncm
Member
**
Offline Offline

Activity: 388
Merit: 13


View Profile
December 16, 2017, 10:33:05 PM
 #3021

Yes it's normal and dependent on the algo. It means cpuminer-opt has no optimizations for scrypt algo.

Oh, OK, it's just it previously stated SSE2.

On another subject, I tried the 3.7.5 Windows binaries on my desktop (Ryzen 1700) and all executables fail to start. The error is:
"thread xx (random): Scrypt buffer allocation failed Fail: thread xx failed to initiate."

I noted the change in feature reporting in the release announcement.

You're out of memory.  You only have enough memory for xx -1 threads.

Thanks, fiddling around with virtual memory settings allowed it to run.

Performance is still very bad with the Ryzen CPU using Scrypt: it is at the same level as a 6-core Xeon Westmere-EP @ 2.4 GHz. Is this really the CPU's fault, or could cpuminer-opt be better optimized for the Zen architecture?

Thanks and keep up the good work!
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 16, 2017, 10:49:02 PM
 #3022

Yes it's normal and dependent on the algo. It means cpuminer-opt has no optimizations for scrypt algo.

Oh, OK, it's just it previously stated SSE2.

On another subject, I tried 3.7.5 windows binary in my desktop (Ryzen 1700) and all executables fail to start - it states:
"thread xx (random): Scrypt buffer allocation failed Fail: thread xx failed to initiate.

I noted the change in feature reporting in the release announcement.

You're out of memory.  You only have enough memory for xx -1 threads.

Thanks, fiddling around with virtual memory settings allowed it to run.

Performance is still very bad with Ryzen CPU using Scrypt. At same level as a Xeon Westmere-EP 6 cores @ 2.4 GHz. Is this really the CPU fault, or could cpuminer-opt be more optimized for Zen architecture? 

Thanks and keep up the good work!

Virtual memory is slow, you need the real thing.

AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
lncm
Member
**
Offline Offline

Activity: 388
Merit: 13


View Profile
December 16, 2017, 11:14:54 PM
 #3023

Yes it's normal and dependent on the algo. It means cpuminer-opt has no optimizations for scrypt algo.

Oh, OK, it's just it previously stated SSE2.

On another subject, I tried 3.7.5 windows binary in my desktop (Ryzen 1700) and all executables fail to start - it states:
"thread xx (random): Scrypt buffer allocation failed Fail: thread xx failed to initiate.

I noted the change in feature reporting in the release announcement.

You're out of memory.  You only have enough memory for xx -1 threads.

Thanks, fiddling around with virtual memory settings allowed it to run.

Performance is still very bad with Ryzen CPU using Scrypt. At same level as a Xeon Westmere-EP 6 cores @ 2.4 GHz. Is this really the CPU fault, or could cpuminer-opt be more optimized for Zen architecture? 

Thanks and keep up the good work!

Virtual memory is slow, you need the real thing.

I have 16 GB of RAM, so it shouldn't be a problem.
I had a fixed page file size; I set it to auto and it worked. Maybe a bug?
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 17, 2017, 12:53:36 AM
 #3024

Yes it's normal and dependent on the algo. It means cpuminer-opt has no optimizations for scrypt algo.

Oh, OK, it's just it previously stated SSE2.

On another subject, I tried 3.7.5 windows binary in my desktop (Ryzen 1700) and all executables fail to start - it states:
"thread xx (random): Scrypt buffer allocation failed Fail: thread xx failed to initiate.

I noted the change in feature reporting in the release announcement.

You're out of memory.  You only have enough memory for xx -1 threads.

Thanks, fiddling around with virtual memory settings allowed it to run.

Performance is still very bad with Ryzen CPU using Scrypt. At same level as a Xeon Westmere-EP 6 cores @ 2.4 GHz. Is this really the CPU fault, or could cpuminer-opt be more optimized for Zen architecture? 

Thanks and keep up the good work!

Virtual memory is slow, you need the real thing.

I have 16 Gb of Ram, it shouldn't be a problem.
I had a fixed page file size, I set it to auto, and it worked. Maybe a bug?

You don't have enough RAM to run that many threads without using VM. Using VM is slow.
Stop arguing and do the math: N*threads.
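
The "N*threads" arithmetic can be sketched as follows. This is a rough estimate, assuming the standard scrypt scratchpad size of 128 * r * N bytes per hash and one scratchpad per thread. The function name, the `lanes` multiplier, and the example N are illustrative only, and cpuminer-opt's real allocation may be larger.

```python
def scrypt_scratchpad_bytes(n: int, r: int = 1, threads: int = 1, lanes: int = 1) -> int:
    """Estimate scrypt scratchpad memory: 128 * r * N bytes per hash.

    `lanes` is a hypothetical multiplier for hashes computed in parallel
    per thread; it is an assumption, not a documented cpuminer-opt value.
    """
    return 128 * r * n * lanes * threads


if __name__ == "__main__":
    n = 2 ** 20  # illustrative N only; use the N of the coin being mined
    for threads in (16, 8, 4):
        gib = scrypt_scratchpad_bytes(n, threads=threads) / 2 ** 30
        print(f"N={n}, threads={threads}: {gib:.2f} GiB")
```

If the estimate for the chosen thread count exceeds free physical RAM, the allocation either fails or spills into the page file, which is the slow case described above.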

AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
lncm
Member
**
Offline Offline

Activity: 388
Merit: 13


View Profile
December 17, 2017, 11:28:35 AM
 #3025

Yes it's normal and dependent on the algo. It means cpuminer-opt has no optimizations for scrypt algo.

Oh, OK, it's just it previously stated SSE2.

On another subject, I tried 3.7.5 windows binary in my desktop (Ryzen 1700) and all executables fail to start - it states:
"thread xx (random): Scrypt buffer allocation failed Fail: thread xx failed to initiate.

I noted the change in feature reporting in the release announcement.

You're out of memory.  You only have enough memory for xx -1 threads.

Thanks, fiddling around with virtual memory settings allowed it to run.

Performance is still very bad with Ryzen CPU using Scrypt. At same level as a Xeon Westmere-EP 6 cores @ 2.4 GHz. Is this really the CPU fault, or could cpuminer-opt be more optimized for Zen architecture?  

Thanks and keep up the good work!

Virtual memory is slow, you need the real thing.

I have 16 Gb of Ram, it shouldn't be a problem.
I had a fixed page file size, I set it to auto, and it worked. Maybe a bug?

You don't have enough RAM to run that many threads without using VM. Using VM is slow.
Stop arguing and do the math: N*threads.

How much RAM per thread? So if I run fewer threads, could it actually be faster?

Sorry to annoy you with so many questions.

PS: in Task Manager, cpuminer has 11.5 GB of RAM allocated.
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 17, 2017, 03:45:41 PM
 #3026

Sorry to annoy you with so many questions.

You ask snap questions without thinking then you challenge my answers based on your misconceptions.

Running out of memory is a simple problem that you should be able to solve yourself.

You don't need to apologize, just try harder before asking questions. And if you do need to ask a
question about a problem you should show how you tried to solve it. You learn more that way.

AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 17, 2017, 05:12:42 PM
 #3027

New release cpuminer-opt-3.7.7

Fixed regression caused by 64 CPU support.
Fixed lyra2h.

https://github.com/JayDDee/cpuminer-opt/releases/tag/v3.7.7

AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
Larvitar
Jr. Member
*
Offline Offline

Activity: 196
Merit: 1


View Profile
December 17, 2017, 05:16:09 PM
 #3028

cpuminer-opt-3.7.6 is released.

Added lyra2h algo for Hppcoin.
Added support for more than 64 CPUs.
Optimized shavite with AES, improves x11 etc.

Get it on git:  https://github.com/JayDDee/cpuminer-opt/releases

More detailed release notes:

Lyra2h has not been tested. It is virtually a clone of lyra2z so it should work.
Please report any problems.

Support for over 64 CPUs is limited in that specifying --cpu-affinity has no effect.
The arg will be ignored and the default affinity will be used. This has not been
tested either so if anyone has the ability to test it please do so and report.

There are no new 4way algos this release but optimizing shavite came as a surprise
and helps all CPUs with AES.

The past two releases have also seen some reworking of some existing SIMD code as
I learn new techniques. It should be more efficient but not likely to produce a significant
speed up.

There are currently 2 4way blockers. BMW is blocking full optimization of x11 and blake256
is blocking m7m. I'd like to get those resolved but I'm stuck at the moment. Since m7m is
CPU only I'd like to prioritize that algo.

A few algos have 4way enabled but are either untested or have known problems that affect
performance.

Tested working: skein, keccak, keccakc, nist5, tribus.

Enabled untested: skein2, jha, whirlpool, pentablake.

Enabled with known problems: blake256 lane corruption: lyra2z, decred, blake.
These algos operate in 2way mode due to invalid hash in 2 lanes.

Kudos to you! Awesome miner.
On to the feedback:
I have a Ryzen 7 1700 at 3.7 GHz. Mining nist5, 4way is around 15% slower than AES-AVX/AVX2: around 240 kH/s per core (8 threads) with 4way versus 270 kH/s per core with AES-AVX2. It's working stably, but with lower performance. I can get 2.1~2.2 MH/s on NIST5.

I would like to see SHA enabled and working on Windows, but I saw how difficult it is. If it would help, I can let you connect to my machine to try something. I don't know how to code, but I want to help compile a SHA miner.
My9bot
Full Member
***
Offline Offline

Activity: 239
Merit: 100


View Profile
December 17, 2017, 05:20:53 PM
 #3029

cpuminer-opt-3.7.6 is released.

Added lyra2h algo for Hppcoin.
Added support for more than 64 CPUs.
Optimized shavite with AES, improves x11 etc.

Get it on git:  https://github.com/JayDDee/cpuminer-opt/releases

More detailed release notes:

Lyra2h has not been tested. It is virtually a clone of lyra2z so it should work.
Please report any problems.

Support for over 64 CPUs is limited in that specifying --cpu-affinity has no effect.
The arg will be ignored and the default affinity will be used. This has not been
tested either so if anyone has the ability to test it please do so and report.

There are no new 4way algos this release but optimizing shavite came as a surprise
and helps all CPUs with AES.

The past two releases have also seen some reworking of some existing SIMD code as
I learn new techniques. It should be more efficient but not likely to produce a significant
speed up.

There are currently 2 4way blockers. BMW is blocking full optimization of x11 and blake256
is blocking m7m. I'd like to get those resolved but I'm stuck at the moment. Since m7m is
CPU only I'd like to prioritize that algo.

A few algos have 4way enabled but are either untested or have known problems that affect
performance.

Tested working: skein, keccak, keccakc, nist5, tribus.

Enabled untested: skein2, jha, whirlpool, pentablake.

Enabled with known problems: blake256 lane corruption: lyra2z, decred, blake.
These algos operate in 2way mode due to invalid hash in 2 lanes.

Kudos for you! Awesome miner Smiley
Lets to the feedback:
I have a Ryzen 7 1700 at 3.7GHz. The 4way is around 15% slower than AES-AVX/AVX2 mining nist5. Around 240KH/s per core (8 threads) to 4way and 270KH/s per core to AES-AVX2. Its working stable, but with less performance. I can get 2.1~2.2MH/s NIST5.

I would like to see SHA enabled and working in Windows, but I saw how difficult are. But, if I could help, I can allow you to connect to my machine to try something. I dont have knowledge about coding, but want help to compile a SHA miner.

cpuminer-opt-3.7.7-sha win

https://ufile.io/mkuq4

I'm better with code than with words-SatoshiNakamoto
Espers [ESP]SiteOnBlockchain
Larvitar
Jr. Member
*
Offline Offline

Activity: 196
Merit: 1


View Profile
December 17, 2017, 05:34:51 PM
Last edit: December 17, 2017, 05:52:25 PM by Larvitar
 #3030


cpuminer-opt-3.7.7-sha win

https://ufile.io/mkuq4


Thank you!

EDIT:
When starting the miner, it asks for libcrypto-1_1-x64.dll. Do I need it, or can I just rename libcrypto1.0.0.dll?

EDIT2:
Solved by installing OpenSSL 1.1 x64.
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 17, 2017, 06:09:52 PM
Last edit: December 17, 2017, 06:47:32 PM by joblo
 #3031


I have a Ryzen 7 1700 at 3.7GHz. The 4way is around 15% slower than AES-AVX/AVX2 mining nist5. Around 240KH/s per core (8 threads) to 4way and 270KH/s per core to AES-AVX2. Its working stable, but with less performance. I can get 2.1~2.2MH/s NIST5.

This is very interesting feedback.  I get 340 kH/s per thread 4way vs 255 kH/s AVX2 1way on my i7-6700K @4GHz.

Something isn't right, need lots of details to eliminate simple stuff. Can you post the startup for both?
None of the following should cause that much of a difference, but it helps to quantify.

AMD AVX2 performance is known to be slower than AVX. Try running a test with just AVX2 and again
with AVX to compare. Another, better way to compare AVX2 vs AVX performance is lyra2rev2. It has the most
AVX2 code.

4way uses 4 times the memory of plain AVX2. This will expose any cache performance issues. Try running fewer
threads to see if performance (total, not just per thread) improves.

Try tribus algo, it's pure 4way parallel while nist5 has a serial component which reduces gain and adds some overhead.

AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 17, 2017, 06:40:55 PM
 #3032


cpuminer-opt-3.7.7-sha win

https://ufile.io/mkuq4


Thanks for that. Do you have a how-to guide? I need to file it for when I finally upgrade my build environment.

With your permission I will add your link to the OP.

AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
Larvitar
Jr. Member
*
Offline Offline

Activity: 196
Merit: 1


View Profile
December 17, 2017, 06:53:04 PM
Last edit: December 17, 2017, 08:06:59 PM by Larvitar
 #3033


I have a Ryzen 7 1700 at 3.7GHz. The 4way is around 15% slower than AES-AVX/AVX2 mining nist5. Around 240KH/s per core (8 threads) to 4way and 270KH/s per core to AES-AVX2. Its working stable, but with less performance. I can get 2.1~2.2MH/s NIST5.

This is very interesting feedback.  I get 340 kH/s per thread 4way vs 255 kH/s AVX2 1way on my i7-6700K @4GHz.

Something isn't right, need lots of details to eliminate simple stuff. Can you post the startup for both?
None of the following should cause that much of a difference, but it helps to quantify.

AMD AVX2 performance is known to be slower than AVX. Try running a test with just AVX2 and again
with AVX to compare.

4way uses 4 times the memory of plain AVX2. This will expose any cache performance issues. Try running fewer
threads to see if performance (total, not just per thread) improves.

Try tribus algo, it's pure 4way parallel while nist5 has a serial component which reduces gain and adds some overhead.

Thanks for the reply.

About Tribus (3.7.7 version):

Tribus AVX 16 threads:
Code:
[2017-12-17 15:45:48] tribus block 449382, diff 297.717
[2017-12-17 15:45:48] CPU #3: 73.32 kH, 226.66 kH/s
[2017-12-17 15:45:48] CPU #2: 60.95 kH, 225.42 kH/s
[2017-12-17 15:45:48] CPU #1: 68.89 kH, 228.54 kH/s
[2017-12-17 15:45:48] CPU #0: 59.57 kH, 220.31 kH/s
[2017-12-17 15:45:48] CPU #7: 71.66 kH, 226.42 kH/s
[2017-12-17 15:45:48] CPU #4: 47.67 kH, 206.94 kH/s
[2017-12-17 15:45:48] CPU #14: 69.70 kH, 228.19 kH/s
[2017-12-17 15:45:48] CPU #6: 66.07 kH, 226.71 kH/s
[2017-12-17 15:45:48] CPU #12: 36.67 kH, 223.24 kH/s
[2017-12-17 15:45:48] CPU #15: 69.95 kH, 228.24 kH/s
[2017-12-17 15:45:48] CPU #11: 66.53 kH, 225.95 kH/s
[2017-12-17 15:45:48] CPU #5: 70.96 kH, 227.81 kH/s
[2017-12-17 15:45:48] CPU #10: 312.06 kH, 275.75 kH/s
[2017-12-17 15:45:48] CPU #8: 43.73 kH, 172.57 kH/s
[2017-12-17 15:45:48] CPU #9: 68.83 kH, 238.64 kH/s
[2017-12-17 15:45:48] CPU #13: 72.51 kH, 228.39 kH/s

Tribus AVX2 16 threads:
Code:
[2017-12-17 15:45:48][2017-12-17 15:49:10] tribus block 449390, diff 254.451
[2017-12-17 15:49:10] CPU #4: 97.38 kH, 211.38 kH/s
[2017-12-17 15:49:10] CPU #6: 110.08 kH, 237.92 kH/s
[2017-12-17 15:49:10] CPU #7: 110.38 kH, 238.04 kH/s
[2017-12-17 15:49:10] CPU #0: 103.07 kH, 221.32 kH/s
[2017-12-17 15:49:10] CPU #1: 109.05 kH, 234.17 kH/s
[2017-12-17 15:49:10] CPU #9: 109.41 kH, 238.00 kH/s
[2017-12-17 15:49:10] CPU #8: 108.26 kH, 234.98 kH/s
[2017-12-17 15:49:10] CPU #13: 109.99 kH, 238.22 kH/s
[2017-12-17 15:49:10] CPU #5: 112.40 kH, 241.36 kH/s
[2017-12-17 15:49:10] CPU #11: 111.49 kH, 239.40 kH/s
[2017-12-17 15:49:10] CPU #3: 111.29 kH, 238.97 kH/s
[2017-12-17 15:49:10] CPU #15: 110.46 kH, 238.21 kH/s
[2017-12-17 15:49:10] CPU #2: 110.69 kH, 237.67 kH/s
[2017-12-17 15:49:10] CPU #10: 111.39 kH, 239.19 kH/s
[2017-12-17 15:49:10] CPU #14: 110.70 kH, 237.20 kH/s
[2017-12-17 15:49:10] CPU #12: 94.46 kH, 199.39 kH/s
[2017-12-17 15:49:15] CPU #12: 836.08 kH, 196.43 kH/s
[2017-12-17 15:49:15] Accepted 1/1 (100%), 2472.11 kH, 3722.47 kH/s


Tribus 4way 16 threads:
Code:
[2017-12-17 15:45:48][2017-12-17 15:49:10] [2017-12-17 15:50:38] tribus block 449392, diff 221.049
[2017-12-17 15:50:38] CPU #0: 2552.29 kH, 340.11 kH/s
[2017-12-17 15:50:38] CPU #1: 3076.95 kH, 410.02 kH/s
[2017-12-17 15:50:38] CPU #12: 2199.45 kH, 293.25 kH/s
[2017-12-17 15:50:38] CPU #8: 2508.86 kH, 334.41 kH/s
[2017-12-17 15:50:38] CPU #14: 2807.39 kH, 374.11 kH/s
[2017-12-17 15:50:38] CPU #9: 3002.02 kH, 400.25 kH/s
[2017-12-17 15:50:38] CPU #2: 2978.50 kH, 396.85 kH/s
[2017-12-17 15:50:38] CPU #3: 2993.07 kH, 398.79 kH/s
[2017-12-17 15:50:38] CPU #5: 2997.27 kH, 399.67 kH/s
[2017-12-17 15:50:38] CPU #4: 2927.24 kH, 390.44 kH/s
[2017-12-17 15:50:38] CPU #6: 2954.16 kH, 393.72 kH/s
[2017-12-17 15:50:38] CPU #7: 2983.57 kH, 397.69 kH/s
[2017-12-17 15:50:38] CPU #11: 3005.27 kH, 400.79 kH/s
[2017-12-17 15:50:38] CPU #15: 2946.88 kH, 393.06 kH/s
[2017-12-17 15:50:38] CPU #10: 2947.45 kH, 392.77 kH/s
[2017-12-17 15:50:38] CPU #13: 2742.90 kH, 365.66 kH/s


Tribus 4way 8 threads:
Code:
[2017-12-17 15:45:48][2017-12-17 15:49:10] [2017-12-17 17:05:32] tribus block 449483, diff 735.578
[2017-12-17 17:05:32] CPU #7: 461.65 kH, 398.07 kH/s
[2017-12-17 17:05:32] CPU #6: 460.63 kH, 398.21 kH/s
[2017-12-17 17:05:32] CPU #5: 460.43 kH, 397.70 kH/s
[2017-12-17 17:05:32] CPU #2: 460.88 kH, 397.74 kH/s
[2017-12-17 17:05:32] CPU #4: 460.51 kH, 397.76 kH/s
[2017-12-17 17:05:32] CPU #3: 460.82 kH, 398.03 kH/s
[2017-12-17 17:05:32] CPU #0: 454.80 kH, 393.86 kH/s
[2017-12-17 17:05:32] CPU #1: 463.35 kH, 399.53 kH/s

Apparently Tribus 4way likes SMT/HT here.
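
To compare runs on total rather than per-thread hash rate, the per-CPU lines can be summed. This is a minimal sketch, assuming log lines in the format shown in the Code blocks above with rates reported in kH/s. The function name is made up for illustration, and the sample is the first three lines of the 8-thread 4way run.

```python
import re

# Matches lines such as "[2017-12-17 17:05:32] CPU #7: 461.65 kH, 398.07 kH/s"
CPU_LINE = re.compile(r"CPU #(\d+): [\d.]+ kH, ([\d.]+) kH/s")


def total_khs(log_text: str) -> float:
    """Sum the latest reported kH/s of every CPU thread found in the log."""
    latest = {}
    for cpu, rate in CPU_LINE.findall(log_text):
        latest[int(cpu)] = float(rate)  # a later line for the same CPU overwrites
    return sum(latest.values())


SAMPLE = """\
[2017-12-17 17:05:32] CPU #7: 461.65 kH, 398.07 kH/s
[2017-12-17 17:05:32] CPU #6: 460.63 kH, 398.21 kH/s
[2017-12-17 17:05:32] CPU #5: 460.43 kH, 397.70 kH/s
"""

if __name__ == "__main__":
    print(f"total: {total_khs(SAMPLE):.2f} kH/s")
```

Keeping only the latest line per CPU avoids double counting when a thread reports twice in the same paste, as CPU #12 does in the AVX2 run.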
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 17, 2017, 08:26:01 PM
 #3034

Tribus 4way 8 threads:
Code:
[2017-12-17 15:45:48][2017-12-17 15:49:10] [2017-12-17 17:05:32] tribus block 449483, diff 735.578
[2017-12-17 17:05:32] CPU #7: 461.65 kH, 398.07 kH/s
[2017-12-17 17:05:32] CPU #6: 460.63 kH, 398.21 kH/s
[2017-12-17 17:05:32] CPU #5: 460.43 kH, 397.70 kH/s
[2017-12-17 17:05:32] CPU #2: 460.88 kH, 397.74 kH/s
[2017-12-17 17:05:32] CPU #4: 460.51 kH, 397.76 kH/s
[2017-12-17 17:05:32] CPU #3: 460.82 kH, 398.03 kH/s
[2017-12-17 17:05:32] CPU #0: 454.80 kH, 393.86 kH/s
[2017-12-17 17:05:32] CPU #1: 463.35 kH, 399.53 kH/s

Apparently Tribus 4way likes SMT/HT here.

It's interesting that the thread rate didn't increase with fewer threads. Were the threads spread over
all 8 cores? You can try "-t 8 --cpu-affinity 0x5555" to select alternate vcores.
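
The mask 0x5555 can be decoded bit by bit. This is a small sketch, assuming that bit i of the mask enables logical CPU i and that the two SMT threads of a core are numbered adjacently (0/1, 2/3, ...). The helper names are illustrative and not part of cpuminer-opt.

```python
def cpus_in_mask(mask: int) -> list[int]:
    """Return the logical CPU indices whose bits are set in an affinity mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def one_thread_per_core_mask(cores: int, smt: int = 2) -> int:
    """Build a mask selecting the first logical CPU of each physical core.

    Assumes SMT siblings are adjacent: core k owns CPUs k*smt .. k*smt+smt-1.
    """
    mask = 0
    for core in range(cores):
        mask |= 1 << (core * smt)
    return mask


if __name__ == "__main__":
    print(cpus_in_mask(0x5555))
    print(hex(one_thread_per_core_mask(8)))
```

Under those assumptions, 0x5555 places each of the 8 threads on every second logical CPU, so no two threads share a physical core.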

AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
Larvitar
Jr. Member
*
Offline Offline

Activity: 196
Merit: 1


View Profile
December 17, 2017, 08:38:13 PM
 #3035

Tribus 4way 8 threads:
Code:
[2017-12-17 15:45:48][2017-12-17 15:49:10] [2017-12-17 17:05:32] tribus block 449483, diff 735.578
[2017-12-17 17:05:32] CPU #7: 461.65 kH, 398.07 kH/s
[2017-12-17 17:05:32] CPU #6: 460.63 kH, 398.21 kH/s
[2017-12-17 17:05:32] CPU #5: 460.43 kH, 397.70 kH/s
[2017-12-17 17:05:32] CPU #2: 460.88 kH, 397.74 kH/s
[2017-12-17 17:05:32] CPU #4: 460.51 kH, 397.76 kH/s
[2017-12-17 17:05:32] CPU #3: 460.82 kH, 398.03 kH/s
[2017-12-17 17:05:32] CPU #0: 454.80 kH, 393.86 kH/s
[2017-12-17 17:05:32] CPU #1: 463.35 kH, 399.53 kH/s

Apparently Tribus 4way likes SMT/HT here.

It's interesting that the thread rate didn't increase with fewer threads. Were the threads spread over
all 8 cores? You can try "-t 8 --cpu-affinity 0x5555" to select alternate vcores.

Code:
[2017-12-17 17:34:59] [2017-12-17 17:36:25] tribus block 449526, diff 130.915
[2017-12-17 17:36:25] CPU #6: 5670.24 kH, 753.19 kH/s
[2017-12-17 17:36:25] CPU #5: 5840.23 kH, 775.66 kH/s
[2017-12-17 17:36:25] CPU #0: 69.55 kH, 763.09 kH/s
[2017-12-17 17:36:25] CPU #7: 5672.16 kH, 753.14 kH/s
[2017-12-17 17:36:25] CPU #4: 5766.59 kH, 765.78 kH/s
[2017-12-17 17:36:25] CPU #2: 5597.96 kH, 743.19 kH/s
[2017-12-17 17:36:25] CPU #3: 5665.52 kH, 752.36 kH/s
[2017-12-17 17:36:25] CPU #1: 5690.77 kH, 755.51 kH/s
[2017-12-17 17:36:26] Accepted 2/2 (100%), 39.97 MH, 6061.92 kH/s

Ya, the default affinity was choosing virtual threads instead of physical ones. Damn! 6 MH/s!
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 17, 2017, 08:58:34 PM
 #3036

Ya, the default affinity was choosing virtual threads instead physical ones. Damn! 6MH/s!

All Ryzen users should take note. Intel chooses one thread per core before using HT.

AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
Larvitar
Jr. Member
*
Offline Offline

Activity: 196
Merit: 1


View Profile
December 17, 2017, 11:18:59 PM
 #3037

Ya, the default affinity was choosing virtual threads instead physical ones. Damn! 6MH/s!

All Ryzen users should take note. Intel chooses one thread per core before using HT.

Indeed. Joblo, is there an updated list of algos that get a boost from SHA hardware acceleration? I found a short list some pages back:

Quote
sha256t, lbry, skein, myr-groestl, m7m.

Are there more algos?
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 17, 2017, 11:29:10 PM
 #3038

Ya, the default affinity was choosing virtual threads instead physical ones. Damn! 6MH/s!

All Ryzen users should take note. Intel chooses one thread per core before using HT.

In fact. Joblo, is there an updated algo list that receive boost from SHA hardware acceleration? I found a little list some pages before:

Quote
sha256t, lbry, skein, myr-groestl, m7m.

Are there more algos?

I converted all of them at the time and I don't recall any new algos that can use it.

What about nist5? Can you try that again? I'd like to understand what's going on there.
I get good performance on my Intel.

AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
My9bot
Full Member
***
Offline Offline

Activity: 239
Merit: 100


View Profile
December 17, 2017, 11:58:21 PM
 #3039

Ya, the default affinity was choosing virtual threads instead physical ones. Damn! 6MH/s!

All Ryzen users should take note. Intel chooses one thread per core before using HT.

In fact. Joblo, is there an updated algo list that receive boost from SHA hardware acceleration? I found a little list some pages before:

Quote
sha256t, lbry, skein, myr-groestl, m7m.

Are there more algos?

I converted all of them at the time and I don't recall any new algos that can use it.

What about nist5? Can you try that again? I'd like to understand what's going on there.
I get good performance on my Intel.

what do you need?

I'm better with code than with words-SatoshiNakamoto
Espers [ESP]SiteOnBlockchain
Larvitar
Jr. Member
*
Offline Offline

Activity: 196
Merit: 1


View Profile
December 18, 2017, 12:01:25 AM
 #3040

Ya, the default affinity was choosing virtual threads instead physical ones. Damn! 6MH/s!

All Ryzen users should take note. Intel chooses one thread per core before using HT.

In fact. Joblo, is there an updated algo list that receive boost from SHA hardware acceleration? I found a little list some pages before:

Quote
sha256t, lbry, skein, myr-groestl, m7m.

Are there more algos?

I converted all of them at the time and I don't recall any new algos that can use it.

What about nist5? Can you try that again? I'd like to understand what's going on there.
I get good performance on my Intel.

I reduced the overclock (to keep everything cool). New results with NIST5:

NIST5 4way 8 threads with --cpu-affinity 0x5555
Code:
[2017-12-17 20:33:17] nist5 block 14635, diff 14699.054
[2017-12-17 20:33:27] CPU #6: 2097.15 kH, 209.42 kH/s
[2017-12-17 20:33:27] CPU #2: 2097.15 kH, 207.48 kH/s
[2017-12-17 20:33:27] CPU #5: 2097.15 kH, 205.79 kH/s
[2017-12-17 20:33:27] CPU #7: 2097.15 kH, 205.61 kH/s
[2017-12-17 20:33:27] CPU #4: 2097.15 kH, 204.96 kH/s
[2017-12-17 20:33:27] CPU #1: 2097.15 kH, 204.46 kH/s
[2017-12-17 20:33:27] CPU #0: 2097.15 kH, 204.01 kH/s
[2017-12-17 20:33:27] CPU #3: 2097.15 kH, 199.72 kH/s

NIST5 16 threads
Code:
[2017-12-17 20:47:55] nist5 block 14649, diff 22837.326
[2017-12-17 20:47:55] CPU #2: 667.71 kH, 121.83 kH/s
[2017-12-17 20:47:55] CPU #3: 672.92 kH, 122.76 kH/s
[2017-12-17 20:47:55] CPU #0: 454.52 kH, 83.08 kH/s
[2017-12-17 20:47:55] CPU #1: 653.82 kH, 119.54 kH/s
[2017-12-17 20:47:55] CPU #14: 647.14 kH, 118.14 kH/s
[2017-12-17 20:47:55] CPU #7: 657.04 kH, 119.95 kH/s
[2017-12-17 20:47:55] CPU #6: 635.59 kH, 116.06 kH/s
[2017-12-17 20:47:55] CPU #11: 681.85 kH, 124.55 kH/s
[2017-12-17 20:47:55] CPU #5: 682.78 kH, 124.85 kH/s
[2017-12-17 20:47:55] CPU #4: 570.23 kH, 104.24 kH/s
[2017-12-17 20:47:55] CPU #12: 565.09 kH, 103.26 kH/s
[2017-12-17 20:47:55] CPU #10: 681.59 kH, 124.53 kH/s
[2017-12-17 20:47:55] CPU #8: 617.07 kH, 112.53 kH/s
[2017-12-17 20:47:55] CPU #9: 684.27 kH, 124.89 kH/s
[2017-12-17 20:47:55] CPU #15: 669.73 kH, 122.23 kH/s
[2017-12-17 20:47:55] CPU #13: 642.05 kH, 117.12 kH/s

NIST5 AES-AVX2 8 threads with --cpu-affinity 0x5555
Code:
[2017-12-17 20:59:30] nist5 block 14655, diff 22762.164
[2017-12-17 20:59:36] CPU #4: 2097.15 kH, 369.94 kH/s
[2017-12-17 20:59:36] CPU #5: 2097.15 kH, 365.38 kH/s
[2017-12-17 20:59:36] CPU #7: 2097.15 kH, 365.42 kH/s
[2017-12-17 20:59:36] CPU #6: 2097.15 kH, 365.26 kH/s
[2017-12-17 20:59:36] CPU #3: 2097.15 kH, 365.13 kH/s
[2017-12-17 20:59:36] CPU #0: 2097.15 kH, 359.60 kH/s
[2017-12-17 20:59:36] CPU #1: 2097.15 kH, 359.14 kH/s
[2017-12-17 20:59:36] CPU #2: 2097.15 kH, 356.60 kH/s

EDIT1:
NIST5 AES-AVX 8 threads with --cpu-affinity 0x5555
Code:
[2017-12-17 21:02:19] nist5 block 14657, diff 22797.808
[2017-12-17 21:02:26] CPU #6: 2097.15 kH, 374.14 kH/s
[2017-12-17 21:02:26] CPU #5: 2097.15 kH, 373.53 kH/s
[2017-12-17 21:02:26] CPU #2: 2097.15 kH, 370.56 kH/s
[2017-12-17 21:02:26] CPU #7: 2097.15 kH, 369.28 kH/s
[2017-12-17 21:02:26] CPU #4: 2097.15 kH, 367.82 kH/s
[2017-12-17 21:02:26] CPU #0: 2097.15 kH, 367.53 kH/s
[2017-12-17 21:02:26] CPU #1: 2097.15 kH, 365.98 kH/s
[2017-12-17 21:02:26] CPU #3: 2097.15 kH, 365.38 kH/s

EDIT2:
Quote from: My9bot
what do you need?

Which algos are affected by SHA acceleration?