Bitcoin Forum
Author Topic: [ANN] cpuminer-opt v26.1, Optimized multi-algo CPU miner for x86_64 and AArch64  (Read 11319 times)
JayDDee (OP)
Full Member
***
Offline Offline

Activity: 1454
Merit: 241


View Profile
December 26, 2024, 05:18:02 AM
Last edit: December 29, 2024, 06:56:43 AM by JayDDee
 #401

cpuminer-opt-24.8

A little end-of-the-year treat for Apple owners. This is for MacOS only, no pads or phones.

Changelog:
ARM: Apple MacOS on M series CPUs is now supported when compiled from source code.
ARM: Fix incorrect compiler version display when using clang.
build.sh can now be used to compile all targets, arm_build.sh & build_msys2.sh have been removed.
Windows: MSys2 build now enables CPU groups by default, prebuilt binaries continue to be compiled with CPU groups disabled.

The Wiki has now been updated with the MacOS procedure and simplified instructions for Linux & Windows
https://github.com/JayDDee/cpuminer-opt/wiki/Compiling-from-source

Please report your experiences, good or bad.

Finally, the obligatory disclaimer since many Macs are laptops: I don't recommend mining with a laptop, for reasons that have been stated many times.

JayDDee (OP)
December 31, 2024, 02:40:03 AM
Last edit: December 31, 2024, 06:37:24 AM by JayDDee
 #402

cpuminer-opt-25.1

Another release focussed on MacOS. MacOS should now be stable on ARM64 & x86_64.
Windows on ARM64 is still a work in progress, see Wiki for details.

MacOS ARM64: m7m algo is now working.
MacOS x86_64: now working when compiled with GCC.
Fixed some minor bugs & removed some obsolete code.

JayDDee (OP)
January 13, 2025, 12:28:41 AM
 #403

cpuminer-opt-25.2

ARM: Fixed regression from v25.1 that could cause the build to fail.
BSD: FreeBSD is now supported. Other BSDs may also work.
MacOS: build with installed jansson library instead of compiling the included source code.
Windows: remove "_WIN32_WINNT=0x0601" which is a downgrade on Win11.
Changed build.sh shell from bash to sh.

This should be the last new release for a while unless something comes up. Support has been expanded to
MacOS and FreeBSD operating systems on both x86_64 & ARM64 CPU architectures. Other BSDs may also work but
have not been tested. Unfortunately Windows on ARM64 is still a work in progress.

The build process has also been streamlined and tweaked over the past few releases. Once all the packages are installed
for the specific environment a single command will build cpuminer-opt. The Wiki has also been updated to reflect these changes.

Orestes
Sr. Member
****
Offline Offline

Activity: 452
Merit: 251



View Profile
January 14, 2025, 01:46:40 AM
 #404

Hello JayDDee,

I wanted to inform you that Duality Blockchain Solutions with its Dynamic (DYN) has been abandoned. Dynamic was using the argon2d_dyn algo (m_cost 500).
Instead, I am relaunching the project as "Zero Dynamics" with the Cash (0DYNC) coin, which is traded on the Exbitron exchange. Adding 0-DYN | Cash could be pretty straightforward; it uses the same argon2d algo but with m_cost = 1000.

More information about the project can be found here:
https://bitcointalk.org/index.php?topic=1902896.0
JayDDee (OP)
January 14, 2025, 03:12:40 AM
 #405

Thanks for the heads up. I'll remove all references to DYN in cpuminer-opt.

As far as supporting argon2d-1000, I'm undecided. If it's only a matter of changing the mcost it should
be simple. But there isn't much happening with argon2d anywhere at this time.
If I see it pop up in a pool I'll take a look.

Orestes
January 14, 2025, 10:04:34 AM
 #406

Quote from: JayDDee on January 14, 2025, 03:12:40 AM
Thanks for the heads up. I'll remove all references to DYN in cpuminer-opt.

As far as supporting argon2d-1000, I'm undecided. If it's only a matter of changing the mcost it should
be simple. But there isn't much happening with argon2d anywhere at this time.
If I see it pop up in a pool I'll take a look.

It should be just a change from 500 to 1000. The coin obviously uses different ports, changing from 33300/33350 to 44400/44450.
Currently the coin is traded on: https://app.exbitron.com/exchange/?market=0DYNC-USDT
We are working on a pool with Exbitron.
JayDDee (OP)
January 16, 2025, 05:39:05 PM
Last edit: January 16, 2025, 08:58:39 PM by JayDDee
 #407

cpuminer-opt-25.3

#442, #443: Fixed a regression in Makefile.am.
Removed algo features log display.
Some code cleanup.

Displaying the algo features was always somewhat of a hack. The features were set manually with no clear criteria and prone to error.
The lack of any technical basis for the data makes it unreliable and not worth the effort to maintain.


Orestes
January 16, 2025, 09:49:50 PM
Last edit: January 16, 2025, 10:44:06 PM by Orestes
 #408

Hello JayDDee,

0-Dyn | Cash has been added to the pool: LetsHash.it.
Instead of the non-conventional term Dynodes for Masternodes, I have reverted it back to Masternodes with Cash. cpuminer-opt fork for your consideration.

Regards
tatral
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
January 17, 2025, 03:07:23 PM
 #409

I've been using the miner for some time on different Intel CPUs. Why does the Alder Lake series perform considerably worse than other, supposedly weaker CPUs? For example, I'm running yespower/power2b on two 12700s that do about 750-800 kh/s, while a 6700 does almost the same (~700) and an 8700 is actually better (~1200). The 12900k is faster at ~2000.
They also don't register the correct number of active CPUs, but rather half of the total threads: the 12700 has 12 cores (8P+4E) and 20 total threads but reads 10 active. The 12900 has 16 cores (8P+8E) and 24 total threads, but reads 12 active.
Alder Lake is running the miner variant avx2-sha-vaes while the older ones are running only avx2.
JayDDee (OP)
January 17, 2025, 04:45:11 PM
Last edit: January 17, 2025, 05:01:08 PM by JayDDee
 #410

Quote from: tatral on January 17, 2025, 03:07:23 PM
I've been using the miner for some time on different Intel CPUs. Why does the Alder Lake series perform considerably worse than other, supposedly weaker CPUs? For example, I'm running yespower/power2b on two 12700s that do about 750-800 kh/s, while a 6700 does almost the same (~700) and an 8700 is actually better (~1200). The 12900k is faster at ~2000.
They also don't register the correct number of active CPUs, but rather half of the total threads: the 12700 has 12 cores (8P+4E) and 20 total threads but reads 10 active. The 12900 has 16 cores (8P+8E) and 24 total threads, but reads 12 active.
Alder Lake is running the miner variant avx2-sha-vaes while the older ones are running only avx2.


I don't understand what you are saying. Unless you are using the --threads option it should use all available cores.

Hybrid CPUs are a pain for mining, especially Intel where P-cores have hyperthreading and E-cores don't.
Setting the affinity correctly is a nightmare.

Start with using --hash-meter to help identify which CPUs are P-core and which are E-core.
You can also disable hyperthreading in the BIOS to help identify which cores have it and which don't.

Once you have your CPU all mapped out you can play with different thread counts & affinity strategies to see what works best.
Typically you would use half the P-core threads, one per physical P-core, to avoid hyperthreading while using all the E-cores.

Ryzen is simpler because all the cores on a hybrid Ryzen have the same features including SMT (hyperthreading).

Edit: if you think something weird is happening post a console log with --debug.
Also I noticed you are using the pre-built binaries; they are built to the Windows 7 standard, which knows nothing about hybrid architecture.
I suggest compiling from source using MSys2 & MinGW which will use up to date Windows libraries and should better handle hybrids.
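
The "one thread per P-core plus all E-cores" strategy above can be sketched as a bitmask calculation. This is only an illustration: the function name is hypothetical, and it assumes the OS lists the P-core threads first as sibling pairs (0,1), (2,3), ... followed by the E-cores, which has to be verified on the actual machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: affinity bitmask for a hybrid CPU.
 * Assumed layout (not guaranteed): logical CPUs 0..2*p_cores-1 are the
 * P-core hyperthreads in sibling pairs, followed by e_cores E-cores. */
uint64_t hybrid_affinity_mask(unsigned p_cores, unsigned e_cores)
{
    unsigned p_threads = 2 * p_cores;
    uint64_t mask = 0;

    /* The result must fit in a single 64-bit mask. */
    assert(p_threads + e_cores <= 64);

    /* Select the first sibling of every P-core to avoid hyperthreading. */
    for (unsigned i = 0; i < p_threads; i += 2)
        mask |= (uint64_t)1 << i;

    /* Select every E-core. */
    for (unsigned i = 0; i < e_cores; i++)
        mask |= (uint64_t)1 << (p_threads + i);

    return mask;
}
```

For an 8P+4E part under that assumed layout, hybrid_affinity_mask(8, 4) gives 0xF5555. How such a mask is passed to the miner should be checked against its help output.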

tatral
January 17, 2025, 08:08:51 PM
 #411

I tested each core separately on the 12700s and concluded that the first 16 logical cores are P-type (with alternating threads) and the last 4 are E-type. However, when using more than 4 cores the performance only suffers. As of now the best setting is 4 P threads, using only even-numbered ones.

Indeed I use the prebuilt binaries. I'll try compiling myself and see what I get.
JayDDee (OP)
January 17, 2025, 09:08:11 PM
 #412

Quote from: tatral on January 17, 2025, 08:08:51 PM
I tested each core separately on the 12700s and concluded that the first 16 logical cores are P-type (with alternating threads) and the last 4 are E-type. However, when using more than 4 cores the performance only suffers. As of now the best setting is 4 P threads, using only even-numbered ones.

Indeed I use the prebuilt binaries. I'll try compiling myself and see what I get.

That's interesting. I have no personal experience with a hybrid so I'm learning too.
Yespower is I/O bound, AKA memory hard, so compute performance is less important than cache size
and memory performance. E-cores have smaller cache and lower clock but I would have expected them
to help some without interfering with the P-cores. It's basically trial and error, and it's probably different for each
CPU based on its combination of P vs E cores.

Alternating threads is pretty standard for yespower on non-hybrid CPUs, it avoids hyperthreading and distributes the cache load evenly.

If you're seeing all the CPUs the binaries should be ok, your initial post implied something weird was going on and the miner wasn't seeing all the cores.

tatral
January 17, 2025, 09:28:08 PM
 #413

It was always detecting the correct number of total threads. Never mind on the rest.

Just to put some numbers behind my rant: P cores do ~380 h/s independently, and E cores ~170. 4 P cores give me ~1250 and anything more gets slower. My initial run of 12 alternating cores (out of habit, of course) got me the ~750-800 I mentioned above. I tried different combos of P and E, only P, even only E. The 4 core test is always the fastest... confirmed on 2 separate 12700s.
I still need to do the same test for the 12900k. My current affinity config is probably wrong anyway.
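
One way to compare runs like these is to express the measured rate as a fraction of ideal linear scaling. A minimal sketch, with a hypothetical function name and the figures above used only as example input:

```c
#include <assert.h>

/* Fraction of ideal linear scaling achieved by a multi-thread run:
 * measured_rate / (single_rate * threads). 1.0 means perfect scaling. */
double scaling_efficiency(double single_rate, unsigned threads,
                          double measured_rate)
{
    assert(single_rate > 0.0 && threads > 0);
    return measured_rate / (single_rate * (double)threads);
}
```

With ~380 per P thread and ~1250 for 4 threads, this gives roughly 0.82, so about 18% is already lost at 4 threads.
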
tatral
January 18, 2025, 05:40:13 AM
 #414

Didn't get a chance to test the 12900k yet, but did test a 12400 which has no e cores, only 6 p cores (12 threads). I get the same outcome: each core runs at ~350 h/s, 4 cores at ~1000 and more than that only gets slower.
valhs
Newbie
*
Offline Offline

Activity: 2
Merit: 0


View Profile
January 19, 2025, 08:41:48 AM
 #415

Quote from: JayDDee
Now supporting ARM64 & MacOS

Supporting over 90 algorithms with many optimized for CPUs with the latest technologies:

...
ARM Orange Pi-5, Apple M2: NEON, AES, SHA2

Thank you for the miner.

I have two questions.

On zergpool, cpuminer-opt-24.4 is also listed as a miner for the yespowerTIDE algorithm.
Is that a different cpuminer-opt than the one in this thread, or which parameter enables that algorithm?


The code is prepared for RISC-V but not yet activated.
Reasonably priced CPUs already exist, e.g. the SpacemiT in the Banana BPI-F3, which support the RVV SIMD instructions.
Will cpuminer-opt support RISC-V soon, and maybe use the 'SIMDe - SIMD-everywhere' solution or sse2rvv?
GreenMelon011925
Newbie
*
Offline Offline

Activity: 2
Merit: 0


View Profile
January 19, 2025, 09:10:33 AM
 #416

Quote from: JayDDee on January 17, 2025, 09:08:11 PM
Quote from: tatral on January 17, 2025, 08:08:51 PM
I tested each core separately on the 12700s and concluded that the first 16 logical cores are P-type (with alternating threads) and the last 4 are E-type. However, when using more than 4 cores the performance only suffers. As of now the best setting is 4 P threads, using only even-numbered ones.

Indeed I use the prebuilt binaries. I'll try compiling myself and see what I get.

That's interesting. I have no personal experience with a hybrid so I'm learning too.
Yespower is I/O bound, AKA memory hard, so compute performance is less important than cache size
and memory performance. E-cores have smaller cache and lower clock but I would have expected them
to help some without interfering with the P-cores. It's basically trial and error, and it's probably different for each
CPU based on its combination of P vs E cores.

Alternating threads is pretty standard for yespower on non-hybrid CPUs, it avoids hyperthreading and distributes the cache load evenly.

If you're seeing all the CPUs the binaries should be ok, your initial post implied something weird was going on and the miner wasn't seeing all the cores.

So if yespower is memory bound rather than CPU bound, how can I configure the memory in Linux to work best for cpuminer-opt v25.1? Does the hugepages setting have any effect on cpuminer-opt? Or does it just need fast, recent RAM, or larger L1/L2/L3 processor caches?

I am using cpuminer-opt v25.1 with yespowerr16 on Debian 12 x86_64 linux.
JayDDee (OP)
January 19, 2025, 04:30:43 PM
Last edit: May 07, 2025, 03:32:10 PM by mprep
 #417

Quote from: valhs on January 19, 2025, 08:41:48 AM
Thank you for the miner.

I have two questions.

On zergpool, cpuminer-opt-24.4 is also listed as a miner for the yespowerTIDE algorithm.
Is that a different cpuminer-opt than the one in this thread, or which parameter enables that algorithm?

The code is prepared for RISC-V but not yet activated.
Reasonably priced CPUs already exist, e.g. the SpacemiT in the Banana BPI-F3, which support the RVV SIMD instructions.
Will cpuminer-opt support RISC-V soon, and maybe use the 'SIMDe - SIMD-everywhere' solution or sse2rvv?

You're welcome.

cpuminer-opt can mine any yescrypt or yespower coin by specifying the parameters. See https://github.com/JayDDee/cpuminer-opt/wiki/Supported--Algorithms.

The parameters are part of the coin's specification and should be published by the coin's developers. Tide chose not to publish the parameters
in their mining guide, however, I found the parameters buried in the code:

Code:
int scanhash_tidecoin_yespower(int thr_id, uint32_t *pdata,
                               const uint32_t *ptarget,
                               uint32_t max_nonce, unsigned long *hashes_done)
{
    static const yespower_params_t params = {
        .version = YESPOWER_1_0,
        .N = 2048,
        .r = 8,
        .pers = NULL,
        .perslen = 0
    };

Simply add "-R 8" to the command line and it will mine Tide ("-N 2048 -R 8" also works). Pers (-K) is left at the default of NULL.
Edit: N is not required; Tide uses the default 2048. Tide has been added to the list on the Wiki.
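
Since cache size comes up repeatedly in this thread, here is a rough sketch of how N and r translate into per-thread memory. It assumes the scrypt-style estimate of 128 * N * r bytes for the main table, which is an approximation and ignores smaller internal buffers; the struct and function names are hypothetical and only mirror the two fields from the excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the N and r fields shown in the excerpt. */
typedef struct {
    uint32_t N;
    uint32_t r;
} yp_cost_t;

/* Approximate per-thread working set in bytes, assuming the
 * scrypt-style 128 * N * r estimate (an assumption, not a spec quote). */
uint64_t yp_working_set_bytes(yp_cost_t c)
{
    assert(c.N > 0 && c.r > 0);
    return (uint64_t)128 * c.N * c.r;
}
```

Under that assumption, N = 2048 with r = 8 comes to 2 MiB per thread, and the figure grows linearly with r.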

RISC-V is another story. Pi is not a viable mining platform, whether ARM or RISC-V. ARM now has Apple and Snapdragon-X
that produce more powerful CPUs but they only perform as a light desktop. They can't compare with Intel Core or AMD Ryzen.

On the HW side the most interesting thing for RISC-V is the HiFive P550: https://www.sifive.com/boards/hifive-premier-p550
but the CPU is still very weak.

On the SW side all the vector code would have to be rewritten for RISC-V. Translation layers like sse2neon (for ARM) don't cut it.
They only provide compatibility by emulating SSE instructions using NEON. Performance is terrible.

I'm still trying to understand ARM SVE, which uses vector length agnostic programming, meaning the code doesn't know the size of
the vector registers and can't be optimized for the HW. Tuning SVE for a particular vector length is critical for cpuminer-opt but it
will add run time overhead which will affect its performance. It will likely also require a complete rewrite of all vector code in cpuminer-opt.
This is a lot more complex than implementing NEON. RISC-V is also vector length agnostic but I don't yet know how it's implemented.
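
To illustrate what "vector length agnostic" means in practice, here is a plain C emulation of such a loop. It is not SVE code and uses no intrinsics; the `vl` parameter is a stand-in for a hardware vector length that is only known at run time, and the function name is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar emulation of a vector-length-agnostic add. `vl` stands in for the
 * number of 32-bit lanes per vector, discovered at run time. Real SVE code
 * would use intrinsics and predicate registers instead of the inner loop. */
void vla_add_u32(uint32_t *dst, const uint32_t *a, const uint32_t *b,
                 size_t n, size_t vl)
{
    assert(vl > 0);
    for (size_t base = 0; base < n; base += vl) {
        /* Emulated predicate: lanes past n are inactive in the last step. */
        size_t active = (n - base < vl) ? (n - base) : vl;
        for (size_t lane = 0; lane < active; lane++)
            dst[base + lane] = a[base + lane] + b[base + lane];
    }
}
```

The same code gives identical results for any `vl`, which is the point of the model, but nothing in it can be unrolled or scheduled for one specific register width.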

The short answer is no RISC-V anytime soon if ever.

On a tangent, Bitmain released an XMR miner that actually uses a cluster of RISC-V CPUs. Some people mistakenly call it an ASIC but it's just CPUs.
AFAIK it hasn't sold well.



Quote from: GreenMelon011925 on January 19, 2025, 09:10:33 AM
So if yespower is memory bound rather than CPU bound, how can I configure the memory in Linux to work best for cpuminer-opt v25.1? Does the hugepages setting have any effect on cpuminer-opt? Or does it just need fast, recent RAM, or larger L1/L2/L3 processor caches?

There's no real short answer for this, pretty much all of the above. You can enable transparent huge pages on Linux and give it a try.
If hugetlbfs is enabled verthash and scryptn2 will try to use them.

Edit: a little clarification: I/O bound doesn't mean it uses a large amount of memory, it means it accesses memory a lot. Huge pages won't help with that.
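
For anyone who wants to experiment, this is a minimal Linux-only sketch of how a program can ask for transparent huge pages on one allocation. It is not taken from cpuminer-opt; the function names are hypothetical, and the madvise hint is advisory, so the kernel may ignore it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map `len` bytes of anonymous memory and hint that the kernel may back it
 * with transparent huge pages. The mapping stays usable if the hint is
 * refused. Returns NULL if the mapping itself fails. */
void *alloc_thp_hint(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
#ifdef MADV_HUGEPAGE
    (void)madvise(p, len, MADV_HUGEPAGE); /* best effort, failure ignored */
#endif
    return p;
}

/* Release a mapping obtained from alloc_thp_hint(). */
void free_thp_hint(void *p, size_t len)
{
    if (p != NULL)
        munmap(p, len);
}
```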

[moderator's note: consecutive posts merged]

valhs
January 20, 2025, 06:48:35 AM
 #418

Quote from: JayDDee on January 19, 2025, 04:30:43 PM
Simply add "-R 8" to the command line and it will mine Tide ("-N 2048 -R 8" also works). Pers (-K) is left at the default of NULL.
Edit: N is not required; Tide uses the default 2048. Tide has been added to the list on the Wiki.
Thank you for the Wiki entry, it works.

Quote from: JayDDee on January 19, 2025, 04:30:43 PM
The short answer is no RISC-V anytime soon if ever.
Thank you for the chance.

Maybe the neon2rvv header, e.g. github.com/howjmay/neon2rvv , makes the work easier.

GreenMelon011925
January 20, 2025, 09:52:53 AM
Last edit: January 20, 2025, 10:03:35 AM by GreenMelon011925
 #419

Quote from: JayDDee on January 19, 2025, 04:30:43 PM
Quote from: GreenMelon011925 on January 19, 2025, 09:10:33 AM
So if yespower is memory bound rather than CPU bound, how can I configure the memory in Linux to work best for cpuminer-opt v25.1? Does the hugepages setting have any effect on cpuminer-opt? Or does it just need fast, recent RAM, or larger L1/L2/L3 processor caches?

There's no real short answer for this, pretty much all of the above. You can enable transparent huge pages on Linux and give it a try.
If hugetlbfs is enabled verthash and scryptn2 will try to use them.

Edit: a little clarification: I/O bound doesn't mean it uses a large amount of memory, it means it accesses memory a lot. Huge pages won't help with that.

Yes, I just confirmed that. I configured and activated hugepages but the usage stays at zero; nothing is using them. I will just wait 2 days or so before I deactivate it and restart. It seems that people have to experiment with processor cache and RAM speed. Is that true? That's a really expensive thing to do.
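
A check like this can be sketched in a few lines, assuming the usual /proc/meminfo layout with HugePages_Total and HugePages_Free fields. The helper names are hypothetical, and the caller has to read the file contents into a string first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the number following "key:" in /proc/meminfo-style text,
 * or -1 if no line starts with that key. */
long meminfo_value(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return strtol(line + klen + 1, NULL, 10);
        line = strchr(line, '\n');
        if (line != NULL)
            line++; /* step past the newline to the next line */
    }
    return -1;
}

/* Reserved huge pages currently in use: total minus free, or -1 if
 * either field is missing. */
long hugepages_in_use(const char *text)
{
    long total = meminfo_value(text, "HugePages_Total");
    long unused = meminfo_value(text, "HugePages_Free");
    return (total < 0 || unused < 0) ? -1 : total - unused;
}
```

If the result stays at 0 while the miner runs, the reserved pages are not being touched by it.
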
tatral
January 26, 2025, 01:38:45 PM
 #420

Quote from: tatral on January 18, 2025, 05:40:13 AM
Didn't get a chance to test the 12900k yet, but did test a 12400 which has no e cores, only 6 p cores (12 threads). I get the same outcome: each core runs at ~350 h/s, 4 cores at ~1000 and more than that only gets slower.

I did test the 12900k at last and the results are different. This time the best performance was when using 12 cores, 8p+4e (out of 8p+8e total available), however running 8p+2e is just about the same. At least this time it's using most cores...

Then I tested some other algos and got more confused:
Yespower with R=8 (tide coin) does best when using all physical cores, the more the better, both p and e. It's not "restricted" to 4 p cores on the 12400 and 12700s like generic yespower/power2b.

MinotaurX performs best on the 12400 when using all 12 threads! Not just the physical. That was a surprise for me. On 12700s best is only physical cores, 8p+4e.