^ It may also be worthwhile checking whether multiple ccminer instances help.
So you could launch instance 1 with -d 0,1,2, and instance 2 with -d 3,4,5.
In addition, ensure that CPU affinity reserves specific CPU cores for instance 1, and other CPU cores for instance 2.
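A rough sketch of that two-instance setup, for anyone who wants to try it. It assumes Linux with the taskset utility (util-linux) available, ccminer on the PATH, and 8 logical cores numbered 0-7; the helper names plan_instances and launch are made up for illustration. Only the -d option comes from the post above, so the algo and pool arguments still have to be supplied through extra_args.

```python
import subprocess


def plan_instances(gpu_groups, cpu_cores):
    """Build one command line per ccminer instance.

    Each instance gets its own GPU list (-d) and an equal, non-overlapping
    slice of cpu_cores, enforced by prefixing the command with `taskset -c`.
    Cores left over after an even split are simply not assigned.
    """
    per_instance = len(cpu_cores) // len(gpu_groups)
    if per_instance == 0:
        raise ValueError("fewer CPU cores than instances")
    commands = []
    for i, gpus in enumerate(gpu_groups):
        cores = cpu_cores[i * per_instance:(i + 1) * per_instance]
        core_list = ",".join(str(c) for c in cores)
        gpu_list = ",".join(str(g) for g in gpus)
        commands.append(["taskset", "-c", core_list, "ccminer", "-d", gpu_list])
    return commands


def launch(commands, extra_args=()):
    """Start every planned instance; extra_args holds the algo/pool options."""
    return [subprocess.Popen(cmd + list(extra_args)) for cmd in commands]


if __name__ == "__main__":
    # Two instances, three GPUs each, on an assumed 8 logical cores.
    for cmd in plan_instances([[0, 1, 2], [3, 4, 5]], list(range(8))):
        print(" ".join(cmd))
```

Running the script only prints the two command lines; calling launch() on the plan would actually start the miners. taskset is used rather than setting affinity after startup so that every thread the miner creates inherits the restriction from the beginning.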
I had to think about this a bit. ccminer should already create multiple CPU threads spread over all cores.
This can be confirmed by using the -D option to enable debug output. Even so, with three threads per core
it could introduce scheduling latency. That combined with the small cache could easily cause a 10% degradation
in performance.
The algo probably factors into it as well. Does the degradation occur with other algos?
I have seen some odd performance differences while testing cryptonight on cpuminer that I still don't understand.
At first I thought it was due to some affinity tricks but I haven't found anything in the code to explain it.
In short, cryptonight performs radically differently on different CPUs/OSs. On a 6700K running Linux I get best
performance CPU mining with 4 threads. More threads cause the total hashrate to drop to as low as half the 4 thread
rate. Most other algos perform much better with more threads. The CPUs also run pretty cool on cryptonight, suggesting
they are often stalled waiting for data (i.e. memory bound).
There shouldn't be any scheduling delays because the number of running threads is less than the available virtual
cores. Any thread contention would occur during execution and be mitigated by hyperthreading.
That leaves cache performance as the most likely cause for both issues. If the total memory requirements of all
threads exceed the available cache it will significantly affect cache performance. It's a step function as each cache
level overflows.
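To put rough numbers on the step function, here is a back-of-the-envelope sketch. It assumes cryptonight's 2 MiB per-thread scratchpad and the 6700K's published cache sizes (32 KiB L1d and 256 KiB L2 per core, 8 MiB shared L3). Summing the per-core caches across all 4 cores and ignoring everything else that competes for cache are simplifications, so treat the result as an estimate, not a measurement. The function name cache_level is made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB

# Assumed i7-6700K capacities: per-core L1d/L2 summed over 4 cores, shared L3.
I7_6700K = [
    ("L1d", 4 * 32 * KIB),
    ("L2", 4 * 256 * KIB),
    ("L3", 8 * MIB),
]

# Assumed cryptonight working set: one 2 MiB scratchpad per mining thread.
SCRATCHPAD = 2 * MIB


def cache_level(threads, per_thread_bytes, levels):
    """Return the smallest cache level that holds the combined working set."""
    total = threads * per_thread_bytes
    for name, capacity in levels:
        if total <= capacity:
            return name
    return "DRAM"


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, "threads ->", cache_level(n, SCRATCHPAD, I7_6700K))
```

Under these assumptions the working set fits in L3 up to 4 threads and spills to DRAM from 5 threads on, which lines up with 4 threads being the sweet spot on the 6700K.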
Seems like going too cheap with a CPU for a mining rig isn't a good idea.