Hi Joblo.
An update on cpuminer 3.0.2 on a Core 2 Duo.
I managed to get it going, at least for mining X11.
Checking CPU capatibility...
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
AES_NI: No.
SSE2: No, start mining without optimizations...
[2016-01-23 17:25:40] Starting Stratum on stratum+tcp://hashpower.co:3533
[2016-01-23 17:25:40] 2 miner threads started, using 'x11' algorithm.
[2016-01-23 17:25:41] Stratum difficulty set to 0.016
[2016-01-23 17:25:41] hashpower.co:3533 x11 block 652575
[2016-01-23 17:25:47] CPU #1: 43.13 kH/s
[2016-01-23 17:25:47] CPU #0: 43.13 kH/s
[2016-01-23 17:25:54] hashpower.co:3533 x11 block 652576
[2016-01-23 17:25:54] CPU #0: 43.07 kH/s
[2016-01-23 17:25:54] CPU #1: 43.07 kH/s
As you can see, the hashrate is not great, and it is even worse (actually half) if I don't override the return value of the has_sse2() function, which doesn't work in this case.
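For reference, here is a minimal sketch of how a CPUID-based feature check can be written with GCC or clang. It is not the has_sse2() from cpuminer-opt, whose implementation isn't shown here; the function names are made up for illustration. The bit positions are the ones documented for CPUID leaf 1 (SSE2 in EDX bit 26, AES-NI in ECX bit 25).

```c
#include <stdbool.h>
#include <stdint.h>

/* <cpuid.h> ships with GCC and clang on x86 targets only. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_CPUID_H 1
#endif

/* CPUID leaf 1 feature bits: SSE2 = EDX bit 26, AES-NI = ECX bit 25. */
#define CPUID1_EDX_SSE2  (1u << 26)
#define CPUID1_ECX_AESNI (1u << 25)

/* Pure bit tests, kept separate so they can be checked on any machine. */
bool edx_has_sse2(uint32_t edx)  { return (edx & CPUID1_EDX_SSE2) != 0; }
bool ecx_has_aesni(uint32_t ecx) { return (ecx & CPUID1_ECX_AESNI) != 0; }

/* Hypothetical helper: reads ECX/EDX of leaf 1, false if unavailable. */
static bool read_cpuid_leaf1(uint32_t *ecx, uint32_t *edx)
{
#ifdef HAVE_CPUID_H
    unsigned int a, b, c, d;
    if (!__get_cpuid(1, &a, &b, &c, &d))
        return false;
    *ecx = c;
    *edx = d;
    return true;
#else
    (void)ecx;
    (void)edx;
    return false;   /* not an x86 build, nothing to query */
#endif
}

/* Runtime probes built on the helper above. */
bool probe_sse2(void)
{
    uint32_t ecx = 0, edx = 0;
    return read_cpuid_leaf1(&ecx, &edx) && edx_has_sse2(edx);
}

bool probe_aesni(void)
{
    uint32_t ecx = 0, edx = 0;
    return read_cpuid_leaf1(&ecx, &edx) && ecx_has_aesni(ecx);
}
```

On an E8400 a check along these lines would be expected to report SSE2 as present and AES-NI as absent, which is why the "SSE2: No" in the log above looks like a detection problem rather than a hardware limit.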
Here is a list of the changes I made to get it working:
../cpuminer-opt-3.0.2/algo/aes_ni/groestl/groestl-intr-aes.h <- Added #ifdef HAVE_AESNI to exclude the AES_NI code from being compiled.
../cpuminer-opt-3.0.2/algo/aes_ni/echo512/hash.c <- Removed the inline keyword. Also added #undef AES_NI
..//home/arve/cpuminer/joblo3.0.2/cpuminer-opt-3.0.2/algo/ <- Removed the inline keyword in
qubit-aes.c
quark-sse2.c
quark-aes.c
../cpuminer-opt-3.0.2/cpu-miner.c line 1947 <- Forced cpu_sse2 = true
Would it be possible to add a configure switch, something like disable-aesni, to turn the AES_NI code off?
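A rough sketch of what such a compile-time switch could look like on the C side. DISABLE_AESNI is a hypothetical macro (for example passed as -DDISABLE_AESNI by a configure option), not something cpuminer-opt defines; __AES__ is the macro GCC and clang set when AES-NI code generation is enabled (e.g. with -maes).

```c
#include <stdint.h>

/* DISABLE_AESNI is hypothetical: a configure switch could pass it with -D.
 * __AES__ is set by GCC/clang when AES-NI instructions may be emitted. */
#if defined(__AES__) && !defined(DISABLE_AESNI)
#include <emmintrin.h>   /* SSE2 loads and stores */
#include <wmmintrin.h>   /* AES-NI intrinsics */
#define USE_AESNI_KERNEL 1
#else
#define USE_AESNI_KERNEL 0
#endif

/* Reports which path was compiled in: 1 = AES-NI, 0 = fallback. */
int aesni_compiled_in(void)
{
    return USE_AESNI_KERNEL;
}

#if USE_AESNI_KERNEL
/* One AES encryption round on a 16-byte state, using the hardware
 * instruction. This function only exists in AES-NI builds. */
void aes_round(uint8_t state[16], const uint8_t key[16])
{
    __m128i s = _mm_loadu_si128((const __m128i *)state);
    __m128i k = _mm_loadu_si128((const __m128i *)key);
    _mm_storeu_si128((__m128i *)state, _mm_aesenc_si128(s, k));
}
#endif
```

With a guard like this, a build for a CPU without AES-NI never sees the intrinsics at all, so there is nothing to patch out by hand.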
That's great news after another difficult day trying to get Windows working.
The generic kernel is pretty slow, as shown in the performance charts. Support is
more about compatibility than performance. The SSE2 kernels are looking
pretty good, relatively speaking.
It seems the SSE2 check fails on your Core 2, but when you force it the SSE2 kernel
runs fine. Is that correct?
I'll restore all the compiler directives. I had removed them, LOL, when I was hacking and slashing
the code and focused only on the top tier.
I'll follow up on your findings, thanks a lot.
Edit: Another dramatic turn of events, this time a positive one!
Re-enabling the AES_NI defines (and the companion OPTIMIZE_SSE2) has unlocked a ton
more performance that had been hidden by my chainsaw approach.
X11 is up to 865 from 720 on my i7-4790K, but things aren't perfect; I see some rejects.
Some of the exposed code may not be perfect and may require a scalpel to cut out the
cancer.
I've got a lot of work to do but I'm back on track.
BTW thanks for the excellent report: very clear, complete, and precise. I followed it like a script.
---------------------------------------------------------
Edit:
It looks like the dramatic speed increase was due to an error on my part. It seems the #define AES_NI
I put in miner.h isn't being seen. I had to #define it in every file that uses it. Clumsy, but it works.
I'm building a debug tarball so you can do some better testing. It will require you to make some small
code changes. I will send a PM with a link to the file.
Here's how it works:
Default is to enable and respect all AES_NI checks.
Edit: changed the default configuration for your convenience. It will disable AES_NI because we already know your CPU can't handle it.
To force disable AES_NI you have to do two things before compiling:
- in cpu-miner.c:1945 hard code cpu_aesni to false
- remove #define AES_NI from all algos (grep -r AES_NI to find them all, sorry)
To force disable SSE2:
- in addition to the steps above, hard code cpu_sse2 to false
These changes will affect kernel selection only; the startup capability check is still performed but not enforced.
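To make the override idea concrete, here is a small sketch of kernel selection with forced flags. It is not the code from cpu-miner.c; the enum, the function names, and the -1/0/1 override convention are all invented for illustration. It assumes the AES_NI kernel also needs SSE2, which matches the instruction above that disabling SSE2 requires disabling AES_NI first.

```c
#include <stdbool.h>

/* Hypothetical kernel tiers, best first. */
enum kernel { KERNEL_GENERIC, KERNEL_SSE2, KERNEL_AES_NI };

/* override: -1 = trust detection, 0 = force off, 1 = force on. */
static bool resolve(bool detected, int override)
{
    return override < 0 ? detected : (override != 0);
}

/* Detection still runs (det_* come from the startup check), but a
 * forced value wins when one is given. */
enum kernel select_kernel(bool det_aesni, bool det_sse2,
                          int force_aesni, int force_sse2)
{
    bool cpu_aesni = resolve(det_aesni, force_aesni);
    bool cpu_sse2  = resolve(det_sse2, force_sse2);

    /* Assumption: the AES_NI kernel is layered on top of SSE2. */
    if (cpu_aesni && cpu_sse2)
        return KERNEL_AES_NI;
    if (cpu_sse2)
        return KERNEL_SSE2;
    return KERNEL_GENERIC;
}
```

The Core 2 case from this thread would be select_kernel(false, false, -1, 1): detection reports no SSE2, the forced flag turns it on, and the SSE2 kernel is chosen.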
This should solve the compile problem you had and allow you to do more testing.
The ugly workarounds are for the debug load only. I will investigate a more user-friendly implementation
before release. Any suggestions welcome, including why the algos don't pick up the #define in miner.h.
The goals of the further testing:
1. confirm the capability of your core2 CPU
2. determine if cpuminer-opt can correctly identify your CPU's capability level and select the appropriate kernel
3. compare sse2 vs x86_64 performance.
The first two are pretty obvious. The third will allow me to extrapolate an estimate of the hash deficit of SSE2 vs AES_NI
and split it into the HW component and the software component: how much of the loss is due purely to the lack of
AES_NI and how much is due to other CPU optimizations in the latest generation.
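One way the split could be framed, as a sketch under assumptions rather than the actual method: take per-thread hashrates in the same units for (a) the newer CPU running the AES_NI kernel, (b) the same newer CPU running the SSE2 kernel, and (c) the older CPU running the SSE2 kernel. The struct and function names are invented for illustration, and no measured values are implied.

```c
/* All rates must be per thread and in the same unit (e.g. kH/s). */
struct deficit_split {
    double total;       /* newer CPU with AES_NI vs older CPU with SSE2 */
    double aesni_part;  /* attributable to AES_NI on the same CPU */
    double other_part;  /* attributable to everything else in the newer CPU */
};

struct deficit_split split_deficit(double new_cpu_aesni,
                                   double new_cpu_sse2,
                                   double old_cpu_sse2)
{
    struct deficit_split d;
    d.total      = new_cpu_aesni - old_cpu_sse2;
    d.aesni_part = new_cpu_aesni - new_cpu_sse2;  /* same CPU, kernel differs */
    d.other_part = new_cpu_sse2 - old_cpu_sse2;   /* same kernel, CPU differs */
    return d;
}
```

By construction the two parts add up to the total, so the only judgment call is which measurements to feed in (clock speed and thread count differences would need normalizing first).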
Thanks for the great work.