The other thread was too haphazard and confusing, so I started a new one. I modified CPUMiner-Multi to give 2x the speed or more when mining CryptoNote coins. It is slower on Windows, but still far faster than Lucas' miner.
I ran a test, LucasJones' repo against mine. Both were built with the exact same CFLAGS (-Ofast -flto -fuse-linker-plugin -funroll-loops -fsplit-ivs-in-unroller -fvariable-expansion-in-unroller -falign-loops=16 -falign-functions=16 -falign-jumps=16 -falign-labels=16). Each was run with 21 threads for a little over 20 minutes. I can't be more accurate than that, because I didn't sit and time it. They were run at separate times on the same machine - a 32-core Amazon EC2 instance with 58GB of RAM. The results? Lucas' cpuminer reported 627.58H/s - but only pulled 600H/s at the pool. My miner reported 1021.73H/s and pulled an impressive 1.25KH/s at the pool. Now, even with vardiff causing high share difficulties and luck contributing to inaccuracy, the pool-side numbers show a clear 100% increase. I have screenshots to prove it, but before I post them, I have to warn - I was too lazy to crop out my wallpapers, so they are NSFW.
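For anyone who wants to redo the arithmetic on the figures above, here is a minimal sketch in C. The function names are made up for illustration and are not part of either miner. With the numbers quoted in the post, the miner-reported rates work out to roughly a 63% increase and the pool-side rates to roughly 108%, so the "100%" figure rests on the pool numbers.

```c
#include <stdio.h>

/* Percentage increase of `mine` over `baseline` (hypothetical helper). */
double percent_increase(double baseline, double mine)
{
    return (mine / baseline - 1.0) * 100.0;
}

/* Print both comparisons using the hashrates quoted in the post. */
void print_comparison(void)
{
    /* Rates as printed by each miner, in H/s. */
    printf("miner-reported: +%.1f%%\n", percent_increase(627.58, 1021.73));
    /* Rates as seen by the pool, in H/s (1.25KH/s = 1250H/s). */
    printf("pool-side:      +%.1f%%\n", percent_increase(600.0, 1250.0));
}
```

Calling print_comparison() from a main() prints both percentages. Vardiff and luck still apply to the pool-side number, as noted above.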
Lucas' miner (NSFW): https://ottrbutt.com/tmp/lucasminer-proof.png
My miner (NSFW): https://ottrbutt.com/tmp/wolfminer-proof.png
So... get the source here: https://github.com/wolf9466/cpuminer-multi
Win64 binaries exist but require AES-NI.
How do you know if you have it? If the binary crashes, you don't. They're here:
https://ottrbutt.com/cpuminer-multi/cpuminer-multi-wolf-05-29-2014.zip
https://ottrbutt.com/cpuminer-multi/cpuminer-multi-wolf-05-29-2014.zip.sig
I made some new ones that should provide an improvement. They're based on some new code I pushed to my GitHub just minutes ago:
https://ottrbutt.com/cpuminer-multi/cpuminer-multi-wolf-05-30-2014.zip
https://ottrbutt.com/cpuminer-multi/cpuminer-multi-wolf-05-30-2014.zip.sig
For the GPG sigs, my key ID and fingerprint is in my signature.
For best performance - play with the number of threads. Often, less is more - there's a reason I used only 21 threads on the 32 core AWS instance in the official test - it resulted in the highest hashrate.
If you see this hosted anywhere else - be wary of it. Check the GPG sig. If you don't know how to, then just know that ottrbutt.com is the ONLY official place to find them, and copies elsewhere may be laced with malware.
Since there has been trouble reproducing my first test, I did another one, this time on a Digital Ocean 16-core. To reduce inaccuracy caused by luck, I used a port the owner of moneropool.com opened for me - it has a share diff of 1000 and no vardiff. The results are roughly the same. These screenshots are NSFW, as well. Why? Well, since hardly anyone ever donates, I may as well get to have some fun for my work:
https://ottrbutt.com/tmp/lucasminer2.png
https://ottrbutt.com/tmp/wolfminer2.png
NEW -- 06/08/2014
Up to a 25% speedup if you're lucky - but only for Linux. First, do a "sysctl vm.nr_hugepages" - it'll probably say 0. Set it with "sudo sysctl -w vm.nr_hugepages=num" where "num" is the number of threads you're running times three. You may want to play with the threads again to get the best performance. Once set, use "sysctl vm.nr_hugepages" again to make sure it took. If it didn't, your memory is too fragmented and the kernel cannot allocate enough contiguous memory for the hugepages - reboot and try again. Once done, compile the miner from git as usual and try it out.
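The "check it again" step can also be done from code. The following is a rough sketch, assuming a Linux system with /proc mounted. The function names are hypothetical and this is not taken from the miner's source - it only applies the threads-times-three rule of thumb above and reads back what the kernel actually reserved.

```c
#include <stdio.h>

/* Rule of thumb from the post: three hugepages per mining thread. */
long hugepages_for_threads(long threads)
{
    return threads * 3;
}

/* Read the kernel's current hugepage count.
 * Returns -1 if it cannot be read (non-Linux system, /proc not mounted). */
long read_nr_hugepages(void)
{
    long n = -1;
    FILE *f = fopen("/proc/sys/vm/nr_hugepages", "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &n) != 1)
        n = -1;
    fclose(f);
    return n;
}

/* Returns 1 if enough hugepages are reserved for `threads`, else 0. */
int hugepages_ready(long threads)
{
    return read_nr_hugepages() >= hugepages_for_threads(threads);
}
```

For example, hugepages_for_threads(8) gives 24, which is the value to pass to "sudo sysctl -w vm.nr_hugepages=24" when running 8 threads. If hugepages_ready(8) still returns 0 afterwards, the kernel could not reserve them all.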
Before this, I had maybe 200H/s on my i7-4770K - clocked to 4.5GHz and under water. It peaked at around 220H/s.
With this new miner: https://ottrbutt.com/tmp/newminer-06082014.png
This will do almost nothing in virtualized environments like AWS.
Even faster. Got it consistently above 230H/s on my machine, and it tends to stay above 250H/s: https://ottrbutt.com/tmp/newminer-06082014-2.png
Oh, right. It runs faster as root, since I can use setpriority() and mlock(). However, it doesn't require it.
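To illustrate the "faster as root, but not required" behaviour, here is a minimal sketch of how setpriority() and mlock() can be attempted and simply skipped when the process lacks permission. This is an assumption about the general pattern, not the miner's actual code. The helper names and the SCRATCHPAD_BYTES constant are made up for the example (2 MiB is the size of the CryptoNight scratchpad).

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>

#define SCRATCHPAD_BYTES (2u * 1024u * 1024u) /* illustrative: 2 MiB */

/* Try to raise this process's priority. A negative nice value usually
 * needs root. Returns 0 on success, -1 if refused (the caller carries on). */
int try_raise_priority(int nice_value)
{
    if (setpriority(PRIO_PROCESS, 0, nice_value) != 0) {
        fprintf(stderr, "setpriority: %s (continuing)\n", strerror(errno));
        return -1;
    }
    return 0;
}

/* Allocate a scratchpad and try to pin it in RAM so it cannot be swapped.
 * *locked is set to 1 if mlock() succeeded, 0 otherwise. */
void *alloc_scratchpad(int *locked)
{
    void *buf = malloc(SCRATCHPAD_BYTES);

    if (buf == NULL)
        return NULL;
    *locked = (mlock(buf, SCRATCHPAD_BYTES) == 0);
    if (!*locked)
        fprintf(stderr, "mlock: %s (continuing unlocked)\n", strerror(errno));
    return buf;
}

/* Unlock (if it was locked) and free the scratchpad. */
void free_scratchpad(void *buf, int locked)
{
    if (locked)
        munlock(buf, SCRATCHPAD_BYTES);
    free(buf);
}
```

Both calls are treated as optional: a failure prints a note and mining would continue at normal priority with unlocked memory.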
Tested on my laptop just now - about 140H/s instead of 125H/s from the previous commits: https://ottrbutt.com/tmp/moreoptimizations.png
Third update for today:
More optimized CFLAGS - even faster. Laptop test, for comparison with previous: https://ottrbutt.com/tmp/evenmoreoptimizations.png
You'll need to re-run autogen.sh after pulling this one.
I got my desktop's i7-4770K above 300H/s. Will test on my laptop in a bit, but: https://ottrbutt.com/tmp/andmoreoptimizations.png
No more per-thread output - it was in the thread function, and printing to stdout is slow. Also, I moved a divide out of said function, as divides are ouch slow.
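The general idea, sketched below under my own assumptions (this is not the actual thread function from the repo, and the names are hypothetical): the hot loop only counts work, and the division for the hashrate happens once, outside it, with no printing inside the loop.

```c
#include <stdint.h>

/* Hot loop: only counts work. No printf and no division in here. */
uint64_t hash_batch(uint64_t iterations)
{
    uint64_t done = 0;
    uint64_t i;

    for (i = 0; i < iterations; i++) {
        /* ... one hash would be computed here (omitted in this sketch) ... */
        done++;
    }
    return done;
}

/* The single divide, done once per reporting interval outside the loop. */
double hashrate(uint64_t hashes, double seconds)
{
    return seconds > 0.0 ? (double)hashes / seconds : 0.0;
}
```

A caller would time a batch, then call hashrate() once and print the result from the main thread rather than from each worker.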
NEW -- 06/09/2014
Windows x64 binaries! AES-NI only!
https://ottrbutt.com/cpuminer-multi/cpuminer-multi-wolf-06-09-2014.zip
https://ottrbutt.com/cpuminer-multi/cpuminer-multi-wolf-06-09-2014.zip.sig
I did another comparison between my binaries and Lucas'. His seemed to benefit from the flags I use as well - I used the same ones for both. Mine is a lot faster; however, it's less consistent, which is something I'll have to work on.
https://ottrbutt.com/tmp/lucasminer.png
https://ottrbutt.com/tmp/wolfminer.png
Some Win binaries for those without AES-NI.
https://ottrbutt.com/cpuminer-multi/cpuminer-multi-wolf-06-12-2014.zip
https://ottrbutt.com/cpuminer-multi/cpuminer-multi-wolf-06-12-2014.zip.sig
Pushed another commit - I removed all algorithms but CryptoNight. The files are still there for now, but unused. Compile time is WAY down, code size and binary size, as well. Don't use the -a or --algo switches anymore.
I rolled back the earlier commit - broke stuff. I tested and pushed a similar one just now, though. The -a switch must still be used for now, but there is no code for other algos, making it a lot slimmer. I also tested it before pushing this time.
Removed the -a and --algo switches from the help text. It will still accept them, but ignore them for compatibility. No need to use them anymore. I also set the default algorithm to CryptoNight.
Added a check for AES-NI that is compiled in only if the miner is being built for AES-NI. AES-NI binaries built from this source or newer should output a message telling the user that the CPU does not have AES-NI, rather than crashing.
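A check like this can be sketched as follows. This is a minimal illustration, not the code from the repo. It assumes GCC or Clang on x86, where <cpuid.h> provides __get_cpuid() and the compiler defines __AES__ when building with -maes. The function names are hypothetical.

```c
#include <stdio.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Returns 1 if CPUID reports AES-NI support, 0 otherwise. */
int cpu_has_aesni(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ecx & (1u << 25)) != 0; /* CPUID leaf 1, ECX bit 25 = AES */
#else
    return 0; /* not an x86 CPU */
#endif
}

/* Returns 0 if it is safe to start mining, 1 after printing a message.
 * The runtime check is only compiled in for AES-NI builds. */
int check_aesni_or_warn(void)
{
#ifdef __AES__ /* defined by GCC/Clang when building with -maes */
    if (!cpu_has_aesni()) {
        fprintf(stderr, "This build requires AES-NI, but this CPU lacks it.\n");
        return 1;
    }
#endif
    return 0;
}
```

Calling check_aesni_or_warn() at the top of main() and exiting on a non-zero return gives a readable message instead of an illegal-instruction crash.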
Non-AES-NI implementation fixed just now. No binaries yet - but you can compile from github.
Did a LOT of code cleaning - also remembered to add an optimization flag I'd been meaning to put in for ages. Probably won't do too much, though.
Had a thought a little later - removed that lingering, now-pointless OpenSSL requirement.
Okay, done. Have a Windows x64 binary for non-AES-NI. Does it submit shares? Yes. Does it work on a non-AES-NI system? Beats me.
https://ottrbutt.com/cpuminer-multi/minerd-wolf-07-09-14.exe
https://ottrbutt.com/cpuminer-multi/minerd-wolf-07-09-14.exe.sig
No DLLs anymore!
Just wanted to say that to my surprise, a pool, MONERO.RS, has donated to me for my work on the CPU miner!
If you like it - donate! My addresses are on my GitHub, also below (XMR address needs code tags or BCT screws it up):