Bitcoin Forum
Author Topic: [LOCKED] cpuminer-opt v3.12.3, open source optimized multi-algo CPU miner  (Read 443953 times)
4ward
Member
May 22, 2019, 05:36:27 PM
 #3801

Can you add Rainforest2?

https://github.com/MicroBitcoinOrg/Cpuminer

From my experience, the reference miner reports a significantly higher speed than the pool sees (it looks like a factor of 256).

joblo (OP)
Legendary
May 22, 2019, 06:55:30 PM
 #3802

TPruvot has it, does it work better? I've already looked at the code.

At first glance it's a completely new algo that can't benefit from any of the canned
optimizations. Optimizing it requires a detailed analysis of the code to look for opportunities to
vectorize serially, in parallel, or not at all. I expect the scalar code to be near optimum already.
It's a huge task to do the whole algo at once. Not really interested at this time.

The hashrate displayed by the miner, both per thread and per share, is artificially
calculated from the number of iterations over time. The pool calculates it from the number and
difficulty of submitted valid shares. Perhaps there's a math error in the miner's calculations.
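
A minimal sketch of the two calculations may help show where they can diverge. The function names are hypothetical, and the 2**32 hashes-per-difficulty-1 constant is an assumption borrowed from Bitcoin-style difficulty; the real constant depends on the algo and the pool, so a mismatch there is one way a fixed factor could show up.

```python
from typing import Sequence


def miner_hashrate(hashes_done: int, elapsed_s: float) -> float:
    """Miner-side estimate: hashes actually computed per second."""
    return hashes_done / elapsed_s


def pool_hashrate(share_diffs: Sequence[float], elapsed_s: float,
                  hashes_per_diff1: float = 2.0 ** 32) -> float:
    """Pool-side estimate from the difficulty of accepted shares.

    hashes_per_diff1 is an ASSUMPTION (Bitcoin-style difficulty);
    other algos and pools may scale difficulty differently.
    """
    return sum(share_diffs) * hashes_per_diff1 / elapsed_s


# Hypothetical numbers: 60 s of work at 1 MH/s, with shares whose
# difficulties add up to exactly the expected amount for that work.
local = miner_hashrate(60_000_000, 60.0)
pool = pool_hashrate([60_000_000 / 2.0 ** 32], 60.0)
```

With matching constants the two estimates agree on average; over short periods the pool figure fluctuates because shares are found at random.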


AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
4ward
Member
May 22, 2019, 07:02:40 PM
 #3803

Tpruvot has the first version of the algo, but they released a tweaked one (RFv2).
There is also a pull request with RFv2, but it has the same issue.

Anyway, I get your point about not being interested ))

joblo (OP)
Legendary
May 23, 2019, 12:11:06 AM
 #3804

It was a new algo and it's changed already, yet another reason why I don't like it.

This seems to be a trend: vertcoin, zcoin, cryptonight, ...

It appears to be an anti-ASIC strategy, with SW miners able to adapt quickly without
requiring new HW.

It's not that big of a deal for a single coin but daunting for a multi-algo miner to keep up.
That's the race I've withdrawn from.

marte1982
Member
May 27, 2019, 03:57:58 PM
 #3805

Hi Dev... is it possible to add Lyra2CZ, the new algo for mining BitcoinCZ?

At the moment there is no miner, it can only be mined with the wallet... it is listed on the sistemkoin exchange.

https://bitcointalk.org/index.php?topic=5140548.0

https://github.com/BitcoinCZ

Thanks for your good work and support
joblo (OP)
Legendary
May 28, 2019, 05:05:00 PM
 #3806

It looks like lyra2Z. Have you tried it?

joblo (OP)
Legendary
May 29, 2019, 04:32:40 AM
 #3807

Here's a tease. It's the only visible part, much more is going on behind the scenes.
I'm trying to streamline the process, reduce overhead (especially interleaving for
4-way) and try new, innovative (imo) ideas for increasing performance. For now it's still
in the napkin stage but it's starting to take shape. It means increasing the parallelization
beyond the size of the largest vector. I have no idea if it will increase performance or
by how much. It may actually be a flop but I think the idea has merit. It's a bit of a twist
on another idea pioneered by a long-time miner developer with an explosive name.
That's all for now. I have a bug fix someone is waiting for, and I'm almost ready to think
about a new release, still a few days away.

Code:
[2019-05-29 00:17:17] Share 8 submitted by thread 12, lane 1.
[2019-05-29 00:17:17] Accepted 8/8 (100%), diff 0.0113, 2659.60 kH/s, 70C
[2019-05-29 00:17:29] Share 9 submitted by thread 2, lane 1.
[2019-05-29 00:17:29] Accepted 9/9 (100%), diff 0.0187, 2659.60 kH/s, 70C
[2019-05-29 00:17:35] Share 10 submitted by thread 11, lane 2.
[2019-05-29 00:17:35] Accepted 10/10 (100%), diff 0.00811, 2659.60 kH/s, 70C
[2019-05-29 00:17:52] Share 11 submitted by thread 8, lane 1.
[2019-05-29 00:17:52] Accepted 11/11 (100%), diff 0.0127, 2659.02 kH/s, 71C

joblo (OP)
Legendary
May 30, 2019, 04:27:52 PM
Last edit: May 30, 2019, 06:16:42 PM by joblo
 #3808

Attention Ryzen users.

It is well known that Ryzen has a HW implementation of SHA and also well known that
Ryzen also added AVX2 capabilities. Unfortunately Ryzen's AVX2 performance is poor.

The combination of these 2 points makes for some unusual effects on some algorithms
depending on how much sha256 they use and how much AVX2 they use.

An extreme example is the sha256t algo, which is pure sha256 and also supports 8-way AVX2
and 4-way SSE2.

The HW SHA implementation can't run in parallel, so the 8-way and 4-way code uses SW SHA.

On Intel CPUs the performance is very predictable: 8-way AVX2 is fastest, 4-way SSE2 is next
and 1-way is slowest.

On Ryzen it's the reverse. The single stream using HW SHA is fastest. A 16-thread Ryzen 1700
using HW SHA outperforms an 8-thread i7-6700K using 8-way AVX2 by 50%. The 4-way SSE2 code is
just as fast as, and maybe a little faster than, 8-way AVX2 on Ryzen. And the AVX2 performance
is downright pitiful in most cases. The only case where AVX2 may perform better is in
4-way AVX2 where there is no SSE2 equivalent.

As previously mentioned the impact depends on the mix of SHA and AVX2 in the algo
as well as whether SSE2 parallel hashing is available.

I will investigate further and provide recommendations for Ryzen users.

The solution may extend beyond compiling and may require some code changes to ensure
Ryzen prefers SHA over n-way when the algo contains a significant amount of sha256.

It likely won't be in the upcoming release.

Edit: Here's a list of algos that use sha256

sha256t: as described above.
lbry: significantly affected but less than sha256t
skein: similar to lbry.
m7m: no 4-way, not a problem.
yescrypt and yespower: no 4-way, not a problem.
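
The preference described above can be sketched as a small decision function. This is only an illustration of the idea, not cpuminer-opt code: the function name, its parameters, and the `prefer_hw_sha` switch are all hypothetical.

```python
def pick_sha256_path(has_sha: bool, has_avx2: bool, has_sse2: bool,
                     prefer_hw_sha: bool) -> str:
    """Choose a hashing path for a sha256-heavy algo (hypothetical sketch)."""
    # On a CPU where the single HW SHA stream beats the n-way SW code,
    # skip the vectorized paths entirely.
    if has_sha and prefer_hw_sha:
        return "1-way HW SHA"
    # Otherwise fall back to the usual ordering: widest vector first.
    if has_avx2:
        return "8-way AVX2"
    if has_sse2:
        return "4-way SSE2"
    if has_sha:
        return "1-way HW SHA"
    return "1-way SW SHA"


# A Ryzen-like CPU with the preference enabled, and an Intel-like CPU
# without SHA extensions.
ryzen = pick_sha256_path(True, True, True, prefer_hw_sha=True)
intel = pick_sha256_path(False, True, True, prefer_hw_sha=False)
```

How strongly to prefer HW SHA would depend on how much of the algo is sha256, which this sketch does not model.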


joblo (OP)
Legendary
May 30, 2019, 09:12:43 PM
 #3809

cpuminer-opt-3.9.1 is released

https://github.com/JayDDee/cpuminer-opt/releases

Fixed AVX2 version of anime algo.

Added sonoa algo.

Added "-DRYZEN_" compile option for Ryzen to override 4-way hashing when the algo
contains sha256 and use SHA instead. This is due to the combination of
HW SHA support and the poor performance of AVX2 on Ryzen. The Windows binaries
package replaces cpuminer-avx2-sha with cpuminer-zen compiled with the override.
Refer to the build instructions for more information.

Ongoing restructuring to streamline the process, reduce latency,
reduce memory usage and unnecessary copying of data. Most of these
will not result in a noticeably higher reported hashrate as the
change simply reduces the time wasted that wasn't factored into the
hash rate reported by the miner. In short, less dead time resulting in
a higher net hashrate.

One of these measures to reduce latency also results in an enhanced
share submission message including the share number*, the CPU thread,
and the vector lane that found the solution. The time difference between
the share submission and acceptance (or rejection) response indicates
network latency. One other effect of this change is a reduction in hash
meter messages because the scan function no longer exits when a share is
found. Scan cycles will go longer and submit multiple shares per cycle.
*The share number is anticipated and includes both accepted and rejected
shares. Because the share number is anticipated and not synchronized it may be
incorrect in times of very rapid share submission. Under most conditions
it should be easy to match the submission with the corresponding response.
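
As a rough sketch of matching a submission with its response, the snippet below pairs the two kinds of log line by share number and takes the timestamp difference. The sample lines use the format shown earlier in this thread. Pairing on the second number of "Accepted a/b" is an assumption, based on the share number counting both accepted and rejected shares; rejected responses are not handled because their exact format is not shown here.

```python
import re
from datetime import datetime

# Sample lines in the format posted earlier in the thread.
LOG = """\
[2019-05-29 00:17:17] Share 8 submitted by thread 12, lane 1.
[2019-05-29 00:17:17] Accepted 8/8 (100%), diff 0.0113, 2659.60 kH/s, 70C
[2019-05-29 00:17:29] Share 9 submitted by thread 2, lane 1.
[2019-05-29 00:17:29] Accepted 9/9 (100%), diff 0.0187, 2659.60 kH/s, 70C
"""

STAMP = r"\[(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d)\] "
SUBMIT = re.compile(STAMP + r"Share (\d+) submitted by thread (\d+), lane (\d+)\.")
ACCEPT = re.compile(STAMP + r"Accepted (\d+)/(\d+) ")
FMT = "%Y-%m-%d %H:%M:%S"


def latencies(log: str) -> dict:
    """Map share number -> seconds between submission and acceptance."""
    submitted = {}
    result = {}
    for line in log.splitlines():
        m = SUBMIT.match(line)
        if m:
            submitted[int(m.group(2))] = datetime.strptime(m.group(1), FMT)
            continue
        m = ACCEPT.match(line)
        if m:
            total = int(m.group(3))  # ASSUMED to equal the share number
            if total in submitted:
                reply = datetime.strptime(m.group(1), FMT)
                result[total] = (reply - submitted[total]).total_seconds()
    return result


share_latency = latencies(LOG)
```

The log timestamps have one-second resolution, so this only reveals latency when it is large enough to cross a second boundary.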

Removed "-DUSE_SPH_SHA" option, all users should have a recent version of
openssl installed: v1.0.2 (Ubuntu 16.04) or better. Ryzen SHA requires
v1.1.0 or better. Ryzen SHA is not used when hashing multi-way parallel.
Ryzen SHA is available in the Windows binaries release package.

Improved compile instructions, now in separate files: INSTALL_LINUX and
INSTALL_WINDOWS. The Windows instructions are used to build the binaries
release package. It's built on a Linux system either running as a virtual
machine or a separate computer. At this time there is no known way to
build natively on a Windows system.

joblo (OP)
Legendary
May 31, 2019, 05:29:12 PM
 #3810

cpuminer-opt-3.9.1.1 is released

Fixed lyra2 regression affecting non-AVX2.

Compiling on Windows using Cygwin now works.

Simply use "./build.sh" from a Cygwin shell.

I have no list of likely packages that need installing on top of the base Cygwin
installation. You'll have to wing it for now.

It isn't portable therefore the Windows binaries package continues to use
the existing procedure.

As always please report any problems.

malafaya
Sr. Member
June 03, 2019, 12:52:31 PM
Last edit: June 03, 2019, 03:53:39 PM by malafaya
 #3811

Hi, joblo!
Nice to see you back!

I noticed a few things in this v3.9.1.1 release:

* --cpu-affinity truncates to a 32-bit value, which means one can't use CPUs at or above 32 unless I don't specify affinity at all (which for most algos is worse). I think this has been addressed in the past (regression?)
* The miner now reports failure to affine to CPU x, y, z on startup if the startup process is not affined to them: I usually affine the command prompt process before launching the miner (it used to be a fix for the previous problem and it's also more flexible for me). I hope this does not affect anything performance related.
* I'm consistently getting about 5% less hashrate for yescrypt than with older v3.8.8 for the exact same configuration. I didn't take more metrics but I think some other algos have slightly less performance as well.
* For yespowerR16 (originally posted as yescryptR16), I get 2200 H/s, quite a bit below the 3000 H/s I get with bellflower2015's variant. I suppose this is because you just introduced this algo and haven't had the chance to tweak it yet.

Cheers!
4ward
Member
June 03, 2019, 01:02:46 PM
 #3812


* I'm consistently getting about 5% less hashrate for yescrypt than with older v3.8.8 for the exact same configuration. I didn't take more metrics but I think some other algos have slightly less performance as well.

If they are compiled with MinGW, the performance will be lower. Cross-compile with GCC does a better job optimizing
If it's not the case, it might be something in the recent changes

malafaya
Sr. Member
June 03, 2019, 01:04:40 PM
 #3813


In both cases, I'm using the official Windows binaries.
joblo (OP)
Legendary
June 03, 2019, 03:28:03 PM
Last edit: June 03, 2019, 04:18:40 PM by joblo
 #3814


A few points and questions:

What's your CPU and OS?

Summary:
Changes were made to support Windows CPU groups.
There have been no changes to yescrypt code.
There have been no recent changes to the Windows build process.

Edit: please use -D to display affinity debug info.
Edit: please use -D to display affinity debug info.

Long version:

CPU limit and affinity:
A change was made initially in 3.9.0, and later tweaked, to add CPU groups support to Windows.
This may be responsible for that issue. I can have a look at the code in light of the specific symptoms you saw.
Do you use CPU groups? Which version of Windows? You said you affine the process separately and that
causes problems. That could be related to CPU groups if the process is in a different group from the miner
threads.

General performance degradation:
The binaries are still made the same way using mingw, specifically using the winbuild-cross.sh script.
The compiler was upgraded (evident in the startup messages showing the compiler version) prior
to 3.9.
However, I have been making some architectural changes that may have a small impact on performance,
though 5% seems a bit much. I'm making them because of issues that came up in preparation for AVX512, where
up to 16 lanes can run in parallel in a single CPU thread. The overhead for interleaving and deinterleaving the data,
the increase in memory usage, etc., don't scale well.
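
For readers unfamiliar with the term, here is a conceptual sketch of what interleaving means for n-way hashing: word i of every lane is stored side by side so that one vector instruction can process the same word of all lanes at once. This is plain Python for illustration only, with hypothetical function names; the real code works on SIMD registers.

```python
def interleave(lanes):
    """Store word i of every lane side by side: a0 b0 c0 d0 a1 b1 c1 d1 ..."""
    return [word for group in zip(*lanes) for word in group]


def deinterleave(buf, n_lanes):
    """Recover each lane's words from the interleaved buffer."""
    return [buf[i::n_lanes] for i in range(n_lanes)]


# Four lanes (4-way) of three words each; the values are arbitrary.
lanes = [[0, 1, 2], [10, 11, 12], [20, 21, 22], [30, 31, 32]]
packed = interleave(lanes)
unpacked = deinterleave(packed, 4)
```

Both steps copy every word once, so the cost grows with the number of lanes, which is the scaling concern when going from 4 or 8 lanes to 16.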

Some of those changes affect the locally displayed hashrate, both in the volume of thread hash reports and
their values. I have reduced the latency between detecting a solution and submitting it to the pool. As
a side effect there are fewer hash meter reports and the reported hashrate is actually from the previous
block. Another side effect is that the reduced latency is not reflected in the hash rate reported by the miner.
I considered it an acceptable compromise as it's just optics. The accumulated share difficulty over time is
what the pool uses. In both the miner and the pool the hashrate is an artificial metric.
The changes result in less deinterleaving of the final hash (checking for a solution before deinterleaving instead of
after), and submitting a share immediately when found while continuing the scan instead of aborting the scan
to submit the share and start a new scan. On their face they are obviously more efficient but I measured no
discernible difference in reported hash rate.
These changes are being migrated slowly and can be confirmed by a more detailed share submitted message
indicating which thread and lane found the solution.
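
The difference between the two scan styles can be sketched as follows. This is a simplified illustration with hypothetical names and a toy hash function, not the miner's actual scan code.

```python
def scan_and_exit(hash_fn, target, start, end):
    """Older style: stop at the first solution and return it."""
    for nonce in range(start, end):
        if hash_fn(nonce) <= target:
            return nonce
    return None


def scan_and_continue(hash_fn, target, start, end, submit):
    """Newer style: submit each solution at once and keep scanning."""
    found = 0
    for nonce in range(start, end):
        if hash_fn(nonce) <= target:
            submit(nonce)
            found += 1
    return found


def toy_hash(nonce):
    # Toy stand-in for a real hash: "solves" on every multiple of 1000.
    return nonce % 1000


shares = []
first = scan_and_exit(toy_hash, 0, 1, 5001)
count = scan_and_continue(toy_hash, 0, 1, 5001, shares.append)
```

In the first style the caller has to restart the scan after every share; in the second, one scan cycle covers the whole range and can submit several shares, which is why fewer hash meter reports appear.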

Sorry for the ramble but there's a lot going on at the same time. I appreciate the testing and reports of any deviations
from previous versions, especially the unintended ones.


malafaya
Sr. Member
June 03, 2019, 04:04:56 PM
Last edit: June 03, 2019, 04:44:07 PM by malafaya
 #3815


A few points and questions:

What's your CPU and OS?

[...]

Do you use CPU groups? Which version of Windows? You said you affine the process separately and that
causes problems. That could be related to CPU groups if the process is in a different group from the miner
threads.


Tested on Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz CPUs, on Windows Server 2016.
I do not use CPU groups (not sure what those are: will look into that) and I tested with 20 CPUs. [EDIT: I checked and CPU groups are applicable to machines with more than 64 CPUs so I only have one group here]
I open a command prompt and, prior to launching cpuminer-opt, I set the command prompt's affinity to the one desired (set to 20 CPUs). The miner then runs with the desired processors already affined. v3.9.1.1 now complains that it can't affine to all CPUs (obviously, because I removed some from the parent process). I'm supposing that's just a benign warning and nothing will really change. You probably made the miner explicitly affine to all CPUs by default on startup, hence the warnings.


Yescrypt performance:
There were no changes to the yescrypt code. I added yespower and tinkered with using that code for
yescrypt without success so I left yescrypt as is. If you are aware of a better performing miner
please point me to it and I'll have a look.

Argh, I'm sorry. I meant yespowerR16 instead of yescryptR16 in my last item! And I was referring to bellflower2015's fork of your miner (you can easily find it on github).

Thanks!
joblo (OP)
Legendary
June 03, 2019, 04:40:57 PM
 #3816


I suggest you not set the process affinity explicitly, it confuses cpuminer. Also please add the -D option
and post the output.

About yespower, I see the correction you made to your initial post. If I understand, you observed a drop
in yescrypt performance over v3.8.8, but the big difference was with yespower vs bellflower fork.
I can't explain the yescrypt difference, as I said I made no changes but I'll take another look.
I'll also check out Bellflower.

malafaya
Sr. Member
June 03, 2019, 04:55:39 PM
 #3817


I'll have to set it explicitly for now because --cpu-affinity truncates to 32 bits, thus not allowing the use of CPUs above 31.
Sorry, I'm not sure for which situation you want me to post the debug output.

Is this enough?
Code:
         **********  cpuminer-opt 3.9.1.1  ***********
     A CPU miner with multi algo support and optimized for CPUs
     with AES_NI and AVX2 and SHA extensions.
     BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT

CPU: Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz.
SW built on May 31 2019 with GCC 7.3.0.
CPU features: SSE2 AES SSE4.2 AVX AVX2.
SW features: SSE2 AES SSE4.2 AVX AVX2.
Algo features: SSE2.
Start mining with SSE2.

[2019-06-03 17:52:38] 56 CPU cores available, 30 miner threads selected.
[2019-06-03 17:52:38] Starting Stratum on stratum+tcp://*****
[2019-06-03 17:52:38] Binding thread 0 to cpu 0 (mask 1)
[2019-06-03 17:52:38] Binding thread 1 to cpu 1 (mask 2)
[2019-06-03 17:52:38] Binding thread 4 to cpu 4 (mask 10)
[2019-06-03 17:52:38] Binding thread 26 to cpu 26 (mask 4000000)
[2019-06-03 17:52:38] affine_to_cpu_mask for 1 returned 57
[2019-06-03 17:52:38] Binding thread 2 to cpu 2 (mask 4)
[2019-06-03 17:52:38] Binding thread 5 to cpu 5 (mask 20)
[2019-06-03 17:52:38] affine_to_cpu_mask for 5 returned 57
[2019-06-03 17:52:38] Binding thread 7 to cpu 7 (mask 80)
[2019-06-03 17:52:38] affine_to_cpu_mask for 7 returned 57
[2019-06-03 17:52:38] Binding thread 3 to cpu 3 (mask 8)
[2019-06-03 17:52:38] affine_to_cpu_mask for 3 returned 57
[2019-06-03 17:52:38] Binding thread 10 to cpu 10 (mask 400)
[2019-06-03 17:52:38] Binding thread 11 to cpu 11 (mask 800)
[2019-06-03 17:52:38] affine_to_cpu_mask for 11 returned 57
[2019-06-03 17:52:38] Binding thread 13 to cpu 13 (mask 2000)
[2019-06-03 17:52:38] affine_to_cpu_mask for 13 returned 57
[2019-06-03 17:52:38] Binding thread 15 to cpu 15 (mask 8000)
[2019-06-03 17:52:38] affine_to_cpu_mask for 15 returned 57
[2019-06-03 17:52:38] Binding thread 17 to cpu 17 (mask 20000)
[2019-06-03 17:52:38] affine_to_cpu_mask for 17 returned 57
[2019-06-03 17:52:38] Binding thread 6 to cpu 6 (mask 40)
[2019-06-03 17:52:38] Binding thread 19 to cpu 19 (mask 80000)
[2019-06-03 17:52:38] affine_to_cpu_mask for 19 returned 57
[2019-06-03 17:52:38] Binding thread 21 to cpu 21 (mask 200000)
[2019-06-03 17:52:38] affine_to_cpu_mask for 21 returned 57
[2019-06-03 17:52:38] Binding thread 23 to cpu 23 (mask 800000)
[2019-06-03 17:52:38] affine_to_cpu_mask for 23 returned 57
[2019-06-03 17:52:38] 30 miner threads started, using 'yespowerr16' algorithm.
[2019-06-03 17:52:38] Binding thread 25 to cpu 25 (mask 2000000)
[2019-06-03 17:52:38] affine_to_cpu_mask for 25 returned 57
[2019-06-03 17:52:38] Binding thread 27 to cpu 27 (mask 8000000)
[2019-06-03 17:52:38] Binding thread 28 to cpu 28 (mask 10000000)
[2019-06-03 17:52:38] Binding thread 29 to cpu 29 (mask 20000000)
[2019-06-03 17:52:38] affine_to_cpu_mask for 29 returned 57
[2019-06-03 17:52:38] Binding thread 12 to cpu 12 (mask 1000)
[2019-06-03 17:52:38] Binding thread 14 to cpu 14 (mask 4000)
[2019-06-03 17:52:38] Binding thread 16 to cpu 16 (mask 10000)
[2019-06-03 17:52:38] Binding thread 18 to cpu 18 (mask 40000)
[2019-06-03 17:52:38] Binding thread 20 to cpu 20 (mask 100000)
[2019-06-03 17:52:38] Binding thread 22 to cpu 22 (mask 400000)
[2019-06-03 17:52:38] Binding thread 24 to cpu 24 (mask 1000000)
[2019-06-03 17:52:38] Binding thread 8 to cpu 8 (mask 100)
[2019-06-03 17:52:38] Binding thread 9 to cpu 9 (mask 200)
[2019-06-03 17:52:38] affine_to_cpu_mask for 9 returned 57
[2019-06-03 17:52:38] Stratum session id: 2ee7a1bb44758f49710830c357335031
[2019-06-03 17:52:39] Stratum difficulty set to 1
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=00000000 ntime=5cf55048
[2019-06-03 17:52:39] yespowerr16 block 403605, network diff 0.013
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=01000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=02000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=03000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=04000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=05000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=06000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=07000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=08000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=09000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=0a000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=0b000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=0c000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=0d000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=0e000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=0f000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=10000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=11000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=12000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=13000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=14000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=15000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=16000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=17000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=18000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=19000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=1a000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=1b000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=1c000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=1d000000 ntime=5cf55048
[2019-06-03 17:52:39] DEBUG: job_id='690a' extranonce2=1e000000 ntime=5cf55048
[2019-06-03 17:52:46] DEBUG: job_id='690c' extranonce2=00000000 ntime=5cf5505e
[2019-06-03 17:52:54] DEBUG: hash <= target
Hash:   0000758736c194d8a115384dffac8218e7af7a552d235f254016c1c164269128
Target: 0000ffff00000000000000000000000000000000000000000000000000000000
[2019-06-03 17:52:54] Share submitted.
[2019-06-03 17:52:55] Accepted 1/1 (100%), diff 3.32e-005, 2179.22 H/s
[2019-06-03 17:53:07] DEBUG: job_id='690e' extranonce2=00000000 ntime=5cf55073
[2019-06-03 17:53:28] DEBUG: job_id='690f' extranonce2=00000000 ntime=5cf55088
[2019-06-03 17:53:41] DEBUG: hash <= target
Hash:   00000f473fd9fdd57695a4730ccb13bbb31645b08ec9a5fdd6cc0f7c870dd7ef
Target: 0000ffff00000000000000000000000000000000000000000000000000000000
[2019-06-03 17:53:41] Share submitted.
[2019-06-03 17:53:41] Accepted 2/2 (100%), diff 0.000256, 2179.37 H/s
joblo (OP)
Legendary
June 03, 2019, 05:37:29 PM
Last edit: June 03, 2019, 06:27:27 PM by joblo
 #3818


What is confusing me is all your changes from the norm. I would like to see how it works
with defaults to get a reference. I also don't know what you mean by truncating to 32,
affinity is 64 bits, and you don't have more than 32 CPUs anyway.

I don't know which case you posted, but there were errors

Code:
affine_to_cpu_mask for 1 returned 57

repeated for many CPUs, seems to be all the odd numbered ones.

EDIT:

I can't find what error 57 means.

Some useful tests. You don't have to post the session, just whether it worked as expected.
Running fewer than N threads should be by factors of 2. Anything else is YMMV.
And forcing the process affinity disqualifies everything.

1. All defaults

2. 14 threads, default affinity. Note whether CPU loads are balanced, i.e. affinity was properly
distributed.

3. If unbalanced, try setting affinity 0x5555555 or 0xaaaaaaa

If everything works as expected I don't see a problem. Windows issues like CPU groups and
NUMA shouldn't be an issue until you get over 64 CPUs.
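
For reference, the two masks in test 3 select alternating CPUs. A small sketch that decodes a mask into the CPU numbers it covers (the helper name is hypothetical):

```python
def cpus_in_mask(mask: int) -> list:
    """Return the CPU numbers whose bit is set in an affinity mask."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


even = cpus_in_mask(0x5555555)  # bits 0, 2, 4, ...
odd = cpus_in_mask(0xaaaaaaa)   # bits 1, 3, 5, ...
```

Each mask covers 14 of the first 28 CPUs, matching the 14 threads in test 2.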

joblo (OP)
Legendary
June 03, 2019, 07:12:40 PM
 #3819


If they are compiled with MinGW, the performance will be lower. Cross-compile with GCC does a better job optimizing

This statement caught my attention. The binaries are cross-compiled with GCC in a mingw environment
running on Linux. What are you referring to that is better?

malafaya
Sr. Member
June 03, 2019, 07:40:37 PM
Last edit: June 03, 2019, 08:09:39 PM by malafaya
 #3820



There are more than 32 CPUs (check the debug above: 56 CPU cores available, 30 miner threads selected.).
If I select all 28 even CPUs for instance, that means an affinity of 0x55555555555555 which is over 32 bits. If I do set --cpu-affinity=0x55555555555555 , I can check in Task Manager that affinities for CPUs above CPU 31 are not set, which led me to think that affinity is truncating to lower 32 bits.
I do think the warning is just that: a warning, but I wanted to be sure as v3.8.8 did not issue such a warning before.
And yes, there is no warning issued if I use --cpu-affinity (or don't use it at all) as long as I don't set affinity externally beforehand, so that is not an issue.

EDIT: I made a few tests and verified the following:
* With v3.8.8, a cpu affinity works correctly up to 0xffffff; above that, it triggers that exactly the first 32 CPUs are used, no matter the value.
* With v3.9.1.1, a cpu affinity works correctly up to 0xf; above that, it triggers that exactly the first 32 CPUs are used, no matter the value.
I verified this by using the CPU affinity option in Task Manager for the miner process.
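
To illustrate what a plain 32-bit truncation of the mask would do, here is a small sketch. It only models the suspected truncation (the helper name is hypothetical); the test results above suggest the actual behaviour is more complicated than this.

```python
def cpus_in_mask(mask: int) -> list:
    """Return the CPU numbers whose bit is set in an affinity mask."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


full = 0x55555555555555          # all 28 even CPUs of a 56-CPU machine
truncated = full & 0xFFFFFFFF    # what survives in a 32-bit value

kept = cpus_in_mask(truncated)
lost = [cpu for cpu in cpus_in_mask(full) if cpu not in kept]
```

Under that assumption only the 16 even CPUs below 32 would remain selected, and the 12 even CPUs from 32 to 54 would be dropped.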

So the reason I set affinity externally is because I could never rely on the built-in cpu affinity for most miners. It seems to break in some configurations.