Bitcoin Forum
Author Topic: [LOCKED] cpuminer-opt v3.12.3, open source optimized multi-algo CPU miner  (Read 444060 times)
joblo (OP)
Legendary
*
Offline Offline

Activity: 1470
Merit: 1114


View Profile
December 03, 2019, 04:25:34 PM
Last edit: December 10, 2019, 10:05:27 PM by joblo
 #3901

Some notes about peculiarities of GCC 9 that affect cpuminer-opt
and may be of interest to developers.

1. It produces more warnings about array bounds. These exposed some violations
in cpuminer-opt that will be fixed in the next release.

2. It no longer includes AES in "-march=core-avx2", need to add aes
manually: "-march=core-avx2 -maes".  

3. It doesn't rebuild Makefile.in after removing a source file from Makefile.am.
The compiler still looked for the deleted file. It was necessary to edit Makefile.in
manually to remove all references to the deleted file. Will follow up.

Edit: I was missing automake. I didn't need it until I changed Makefile.am.
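
A quick way to check point 2 on a given compiler is to dump the macros it predefines for an -march value and look for __AES__. This is only a sketch: defines_macro is a helper name made up for this example, and the usage lines assume gcc is on the PATH.

```shell
# Hypothetical helper: reads a macro dump (as printed by "gcc -dM -E")
# on stdin and succeeds if the named macro is defined.
defines_macro() {
    grep -q "^#define $1 "
}

# Usage, assuming gcc is installed:
#   gcc -march=core-avx2 -dM -E - < /dev/null | defines_macro __AES__ \
#       && echo "AES enabled" || echo "AES missing, add -maes"
```

The same check works for any other feature macro, so it can be repeated after a compiler upgrade before trusting an old set of flags.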


For the time being I will continue to use GCC 7 for development and production of
the Windows binaries.


AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
joblo (OP)
December 03, 2019, 05:48:16 PM
 #3902

AVX-512 was supposed to be generally available 3 years ago with Intel Cannon Lake
but has seen many delays. cpuminer-opt now supports AVX-512.

AVX-512 is currently available on Intel Skylake-X and the newly released Cascadelake-X
CPUs from Intel. It is also available on Icelake but only for mobile CPUs.

It looks like AVX512 will finally be released for mainstream desktops in 2020.
I'm not aware of plans to add AVX512 to AMD Ryzen CPUs.

Algos will be optimized gradually over the next few releases. First up are argon2d, blake2s,
keccak, keccakc, skein and skein2.

https://github.com/JayDDee/cpuminer-opt/releases/tag/v3.10.0

joblo (OP)
December 04, 2019, 06:36:00 PM
 #3903

I previously asked if someone would be kind enough to do a test on a Ryzen 3xxx
to compare AVX2 vs AVX performance. With Ryzen 1xxx AVX2 was often slower than AVX.

The results will help me decide how to deliver Windows binaries for Ryzen and whether
AVX2 should override SHA.

Currently only Ryzen has SHA so it's simple, use it if it's there because AVX2 is slow.
It gets more complicated when Intel releases Icelake with SHA for the desktop. AVX2 is
faster than SHA on Intel CPUs.

Which is faster on Ryzen 3xxx and does the new znver2 compile arch make a difference?

Requirements:

Any Ryzen or TR CPU from the 3xxx series.
A recent Linux distro.

Goal:

Compare AVX2 vs AVX performance on Ryzen 3000 series CPUs using blake2s algo.
Compare AVX2 vs SHA performance on Ryzen 3000 series CPUs using sha256t algo.
Determine if the new znver2 compile arch has an effect on the results.
Determine if Intel and Ryzen need to prioritize features differently.

Procedure:

1. Compile separate builds for znver1, znver2, avx2 and avx

Code:
./autogen.sh
CFLAGS="-O3 -march=znver1 -Wall" ./configure --with-curl
make -j 4
mv cpuminer cpuminer-znver1
make clean
CFLAGS="-O3 -march=znver2 -Wall" ./configure --with-curl
make -j 4
mv cpuminer cpuminer-znver2
make clean
CFLAGS="-O3 -march=core-avx2 -maes -Wall" ./configure --with-curl
make -j 4
mv cpuminer cpuminer-avx2
make clean
CFLAGS="-O3 -march=corei7-avx -maes -Wall" ./configure --with-curl
make -j 4
mv cpuminer cpuminer-avx

2. Do a blake2s benchmark on each build. 5 minutes each should be enough
to produce a stable hash rate.

Code:
./cpuminer-znver1 -a blake2s --benchmark --hash-meter
./cpuminer-znver2 -a blake2s --benchmark --hash-meter
./cpuminer-avx2 -a blake2s --benchmark --hash-meter
./cpuminer-avx -a blake2s --benchmark --hash-meter

3. Repeat the tests with sha256t.

4. Post your results including CPU model, GCC version and the stable total hash rate
for each test.
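
Steps 2 and 3 can be scripted so that every build runs for the same length of time. The following is a rough sketch, not part of cpuminer-opt: bench_all, DURATION and DRY_RUN are names invented here, and it assumes the GNU timeout utility and the four binaries from step 1 in the current directory.

```shell
# Sketch only: bench_all is a name invented for this example.
# DURATION is in seconds (300 = the 5 minutes suggested in step 2).
# Set DRY_RUN=1 to print the commands instead of running them.
DURATION="${DURATION:-300}"
DRY_RUN="${DRY_RUN:-0}"

bench_all() {
    algo="$1"
    for build in znver1 znver2 avx2 avx; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "timeout $DURATION ./cpuminer-$build -a $algo --benchmark --hash-meter"
        else
            # timeout stops the miner after DURATION seconds and then exits
            # non-zero, which is the expected case here, hence "|| true".
            # The log keeps the miner output so the stable rate can be read later.
            timeout "$DURATION" "./cpuminer-$build" -a "$algo" --benchmark --hash-meter \
                > "bench-$build-$algo.log" 2>&1 || true
        fi
    done
}

# Usage:
#   bench_all blake2s
#   bench_all sha256t
```

The stable hash rate still has to be read from each log by hand.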

Thanks in advance, the results will help ensure optimum performance on Ryzen CPUs.

A-Bolt
Legendary
*
Offline Offline

Activity: 2334
Merit: 2374


View Profile
December 05, 2019, 12:35:04 PM
Last edit: December 05, 2019, 01:37:16 PM by A-Bolt
 #3904

I previously asked if someone would be kind enough to do a test on a Ryzen 3xxx

Ryzen 5 3600 @ 4.2GHz (CPU Core Ratio - 42x, PBO is disabled) GCC 9.2.1:
Code:
blake2s:
        znver1  231.46 MH/s
        znver2  238.08 MH/s
          avx2  236.11 MH/s
           avx  236.09 MH/s

sha256t:
        znver1   61.44 MH/s
        znver2   61.69 MH/s
          avx2   46.25 MH/s
           avx   46.26 MH/s


joblo (OP)
December 05, 2019, 02:18:08 PM
 #3905

Many thanks. It's not quite the results I expected. I was hoping AVX2 would be better.
SHA is clearly the winner over AVX2. That was expected given the AVX2 results.

I see no need for separate znver1 and znver2 packages, there is only a slight improvement
for AVX and AVX2.
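
To put numbers on that, the relative gains can be worked out from the posted hash rates. A small sketch using awk; pct_gain is just a helper name made up for this example, and the inputs are the Ryzen 5 3600 figures posted above.

```shell
# Percentage gain of a new hash rate over a baseline, one decimal place.
pct_gain() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

pct_gain 231.46 238.08   # blake2s, znver1 -> znver2: prints 2.9
pct_gain 46.25 61.69     # sha256t, avx2 -> znver2 (SHA): prints 33.4
```

So znver2 over znver1 is worth about 3% on blake2s, while SHA over AVX2 is worth about a third on sha256t.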

I also see no need to override SHA until Intel CPUs with SHA become mainstream
with Icelake.

joblo (OP)
December 06, 2019, 12:21:35 AM
Last edit: December 10, 2019, 12:07:22 AM by joblo
 #3906

cpuminer-opt-3.10.1 was just released. It fixes some bugs that can cause generally poor performance
without reporting any errors. All users should upgrade.

https://github.com/JayDDee/cpuminer-opt/releases

AVX512 for blake2b, nist5, quark, tribus.

More broken lane fixes, fixed buffer overflow in skein AVX512, fixed
quark invalid shares AVX2.

Only the highest ranking feature in a class is listed at startup, lower ranking
features are available but no longer listed.

Edit: v3.10.3 is out with more AVX512

joblo (OP)
December 15, 2019, 09:57:10 PM
Last edit: December 22, 2019, 06:45:09 PM by joblo
 #3907

Good evening
Can MSR be implemented in your cpuminer-opt?
https://xmrig.com/docs/miner/randomx-optimization-guide/msr
Or is it just about RandomX.
Thank you

It looks interesting but I have lots of questions about it. I'm deep into AVX512 right now
so I'll follow up later.

It might be specific to RandomX (and probably cryptonight) because they were both designed
with specific cache usage in mind.

I assume the technique is to disable next line prefetching which assumes sequential access.
RandomX won't need the next line due to its randomness, so it's a waste to prefetch it.

Edit:

It appears this optimization is specific to certain algorithms and could negatively impact others.
To implement it would require using it only on selected algos. The algos currently benefitting
are not supported by cpuminer-opt. It would be a lot of work to analyze which supported algos
might be helped.

I'm also concerned about the system impact. This kind of optimization may be appropriate for a
dedicated mining system but not for a multi purpose desktop. Changing the prefetch configuration
has a system wide effect and will affect other applications positively or negatively, even when not mining.

There is no graceful way to undo the changes. Miners don't usually exit gracefully, Ctrl C is
the standard exit, or sometimes a crash. This would leave the system prefetch configuration
modified and would require manually restoring it.

I think I'll pass.


joblo (OP)
December 22, 2019, 07:14:41 PM
Last edit: December 23, 2019, 03:04:25 AM by joblo
 #3908

The previous optimization request got me thinking. It raised concerns similar to
another request I resisted and raises an interesting question.

How far should a miner go for optimizing performance?

Should it modify system configuration?

Should the miner be required to run with root/admin privileges?

Two cases illustrate the issue. The first is the one immediately above: the miner makes a system
configuration change that will affect all applications, and it can't restore the original config itself.

The other case is huge pages. Huge pages require system configuration changes as well
but only to enable the feature. It does not affect applications that don't explicitly use it.
But it requires the miner to be run by administrator on Windows.

My opinion is these features may be appropriate on a dedicated mining system but maybe not
for a typical desktop PC.

The ideal would be to handle both environments transparently but that takes a lot of work.

Automated config changes that affect everything and aren't automatically reversed are
completely unacceptable, IMO. If manual intervention is required to "undo" it should also
be required to "do".

My only concern is with the automation of the change and lack of automated reversal.
That has a simple solution. Don't do it in the miner.

HW prefetch changes should be done manually by the user before starting to mine, and then
undone when no longer mining.  It's completely up to the user which algos to use it with
and requires no complex logic in the miner.
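
For anyone who wants the "do" and "undo" tied together outside the miner, a wrapper script is one option. This is a sketch under assumptions: apply_prefetch_tweak and restore_prefetch_tweak are placeholders invented here (the real commands depend on the CPU and are not given in this thread), and run_with_tweak is likewise a made-up name.

```shell
# Hypothetical placeholders: replace the bodies with whatever commands
# change and restore the prefetch settings on your system.
apply_prefetch_tweak()   { echo "tweak applied"; }
restore_prefetch_tweak() { echo "tweak restored"; }

# Run a command with the tweak applied. The body runs in a subshell so the
# EXIT trap fires when the command ends, whether it finished normally,
# crashed, or was interrupted with Ctrl C.
run_with_tweak() (
    apply_prefetch_tweak
    trap restore_prefetch_tweak EXIT
    trap 'exit 130' INT
    trap 'exit 143' TERM
    "$@"
    exit $?
)

# Usage (miner arguments are up to the user):
#   run_with_tweak ./cpuminer -a <algo> ...
```

This covers a normal exit, Ctrl C and a crash of the miner. It cannot cover a hard kill of the wrapper itself or a power loss, so a manual restore may still be needed.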

Huge pages is not so risky but does have the issue of requiring the miner to be run by admin.
My other concern is the lack of transparency.

Huge pages should be completely transparent. The system should be smart enough to allocate
huge pages for large datasets. I don't see why any application changes should be required,
it should all happen behind the scenes in malloc. And it shouldn't require root/admin.
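
On Linux the kernel has a feature along these lines, transparent huge pages (THP), and its current mode can be read from sysfs. The sketch below only reports the mode. thp_mode is a helper name made up here, and whether THP helps any particular algo has not been tested in this thread.

```shell
# Print the active mode from a transparent_hugepage "enabled" line,
# e.g. "always [madvise] never" -> "madvise".
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Usage on Linux:
#   thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
#   grep -i '^HugePages_' /proc/meminfo   # explicit (non-transparent) huge page pool
```

With the mode set to "always" the kernel may back large anonymous allocations with huge pages on its own. With "madvise" the application has to ask for it.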

My stubbornness on this point may be part of the issue.

Both of these optimizations could help some algos and hurt others, they have to be set for
each algo individually. With nearly 100 algos that's a huge task.

So aside from the technical concerns I don't know if it's worth the work.

Comments are welcome.






alucard20724
Sr. Member
****
Offline Offline

Activity: 703
Merit: 272


View Profile
December 23, 2019, 03:31:23 AM
 #3909

cpuminer-opt-3.10.1 was just released. It fixes some bugs that can cause generally poor performance
without reporting any errors. All users should upgrade.

https://github.com/JayDDee/cpuminer-opt/releases

AVX512


Here are my results so far with a 7820X... i've only benched for the pools shown.

ps.. i'm mining ethash on two VII also at the same time




and here are the programs currently benched

joblo (OP)
December 23, 2019, 05:29:23 AM
 #3910


Here are my results so far with a 7820X... i've only benched for the pools shown.


Thanks for posting. It would be nice to compare with AVX2.

I'm generally seeing around a 30% increase in most X algos as they are a mix of optimized
and unoptimized hash functions. Algos like lyra2v3, which are 100% optimized are getting
nearly double.

It's too bad CPUs don't have a chance with those algos anymore.

alucard20724
December 23, 2019, 06:33:17 AM
 #3911


Here are my results so far with a 7820X... i've only benched for the pools shown.


Thanks for posting. It would be nice to compare with AVX2.


which version of AVX2 would you like to see?.. i think i have twenty of your previous versions benched up to version 3.10.2 for avx2 on this cpu
joblo (OP)
December 23, 2019, 06:41:28 AM
 #3912

which version of AVX2 would you like to see?.. i think i have twenty of your previous versions benched up to version 3.10.2 for avx2 on this cpu

Just use the latest release compiled for avx2. That will provide the most direct comparison. If you have Windows
it's already compiled for you. With Linux just compile with "-march=skylake" instead of "-march=native".

You can confirm that the SW features only list AVX2 but the CPU still lists AVX512.
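
On Linux the CPU side can also be checked independently of the miner by listing the AVX512 flags the kernel reports. A minimal sketch; avx512_flags is a helper name invented for this example.

```shell
# List the distinct AVX-512 feature flags found in /proc/cpuinfo-style
# text read from stdin. Empty output means none were found.
avx512_flags() {
    grep '^flags' | tr ' ' '\n' | grep '^avx512' | sort -u
}

# Usage on Linux:
#   avx512_flags < /proc/cpuinfo
# AVX2-only build for the comparison, following the build steps posted earlier:
#   CFLAGS="-O3 -march=skylake -Wall" ./configure --with-curl && make -j 4
```

If the flags show up here but the miner's SW features list only AVX2, the build is the AVX2 one intended for the comparison.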

joblo (OP)
December 23, 2019, 05:27:43 PM
 #3913

Scam warning

A user is posting fake links to cpuminer-opt. Don't download.

The only real cpuminer-opt is here and only here:

https://github.com/JayDDee/cpuminer-opt

thefix
Legendary
*
Offline Offline

Activity: 1049
Merit: 1001



View Profile
December 24, 2019, 01:12:37 AM
 #3914

Thanks for the heads up, I am sure the link they are posting has a download filled with all kinds of holiday goodies intended to make his/her holidays more festive. It's a good reminder to always double check things before you click them, because even the best of us get caught slipping sometimes.
joblo (OP)
December 24, 2019, 04:10:30 AM
 #3915

The POS tried to copy my ANN but couldn't even do that right. A real winner.

I reported it to Mod and it seems to have been deleted.

alucard20724
December 24, 2019, 07:44:38 PM
 #3916

which version of AVX2 would you like to see?.. i think i have twenty of your previous versions benched up to version 3.10.2 for avx2 on this cpu

Just use the latest release compiled for avx2. That will provide the most direct comparison. If you have Windows
it's already compiled for you. With Linux just compile with "-march=skylake" instead of "-march=native".

You can confirm that the SW features only list AVX2 but the CPU still lists AVX512.

@joblo
Here's my results for AVX512 vs AVX2 on version 3.10.5    i'm running windows pro 10 x64  8gigs ram

joblo (OP)
December 28, 2019, 09:13:51 PM
 #3917

@joblo
Here's my results for AVX512 vs AVX2 on version 3.10.5    i'm running windows pro 10 x64  8gigs ram

Thanks, those results are in line with mine. The 100% AVX512 algos are pretty close to double the
hash rate so that indicates no significant scaling issues with AVX512 unless memory accesses are
bottlenecked.

The long X chains are showing the effects of diminishing returns. Further optimization of previously
optimized code has less effect as it represents a diminishing proportion of the complete algo.
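
The diminishing returns follow the usual Amdahl's law arithmetic: if a fraction p of the run time gets s times faster, the whole algo speeds up by 1 / ((1 - p) + p / s). A sketch with made-up fractions, not measured ones; amdahl is a helper name invented here, and s = 2 reflects the rough doubling mentioned above.

```shell
# Overall speedup when a fraction p of the run time gets s times faster.
amdahl() {
    awk -v p="$1" -v s="$2" 'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / s) }'
}

amdahl 1.0 2   # fully vectorized algo: prints 2.00
amdahl 0.5 2   # half of the chain vectorized: prints 1.33
```

A chain where only about half the time is spent in AVX512 functions would land near the 30% range seen on the X algos, but the 0.5 is an illustration and not a profile of any real algo.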

joblo (OP)
January 03, 2020, 05:21:28 AM
 #3918

cpuminer-opt-3.11.0 introduces full support for Intel's Icelake CPUs.

Icelake architecture includes AVX512, SHA, and VAES. AVX512 and SHA are already supported on
Intel Skylake-X and AMD Ryzen, respectively. VAES is new with Icelake and is an extension of
AES_NI and AVX512 that provides 4 way parallel AES encryption and decryption in a 512 bit vector.

Icelake is only available for mobile at this time, desktop availability is unknown.

VAES support is only available as source code and requires GCC 8.

See the OP for more details about v3.11.0

This release marks the end of the rapid development of the past several weeks. Things
will slow down considerably with mostly bug fixes and minor tweaks.

I am also planning a cleanup to remove some troublesome and useless code, namely the macros
for blake, bmw, etc used by algos like x11, as well as scrypt-jane algo. The macros don't provide
any noticeable performance difference from the reference code and scrypt-jane hasn't been used
for several years. There are other dead algos but they don't cause problems so there is no need to
remove them. This will also reduce the bloat. If anyone has concerns with this plan, please speak up.

alucard20724
January 03, 2020, 08:03:18 AM
 #3919

is m7m supported with AVX512 now?
i didn't see it and i haven't noticed any speed increase based on the prior versions.... haven't tested v3.11.0 yet ... working on it.
joblo (OP)
January 03, 2020, 01:57:38 PM
 #3920

is m7m supported with AVX512 now?
i didn't see it and i haven't noticed any speed increase based on the prior versions.... haven't tested v3.11.0 yet ... working on it.

Unfortunately AVX512 only improves algos that have already been taken over by GPUs and ASICs
and they are improving faster than CPUs can. That's because GPUs are real vector processors
while CPU SIMD just emulates vector processing with strict restrictions on data organization.
A GPU can run thousands of threads while the biggest CPUs with AVX512 can barely crack 100.

The secret is in the algorithm, those that can be vectorized can be vectorized better on a GPU.
The only way to speed up M7M is more CPU cores and faster clocks.

VAES has some potential as a few CPU algos can use it. But VAES will only help with linear
vectorizing (loop unrolling) rather than enabling parallel operation.
