Bitcoin Forum
Pages: « 1 ... 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 [146] 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 »
Author Topic: [LOCKED] cpuminer-opt v3.12.3, open source optimized multi-algo CPU miner  (Read 443972 times)
joblo (OP)
November 27, 2017, 03:54:47 PM
 #2901

Quote
Do I go with Ryzen and just SHA or wait for Cannonlake with SHA and AVX-512?

Well, Ryzens have 2x128-bit wide AVX units instead of 256, don't forget about it ;) I think this is not a good implementation to target.

As for AVX-512, the only adequate choice for now is the i7-7800X. It's not so expensive but has a 140W TDP :P (and thermal paste inside instead of solder).

Yes, Ryzen's implementation of AVX2 is inferior. But AVX2 and AVX512 don't improve a CPU's competitive disadvantage
to GPUs. SHA does and is available now with Ryzen. If Cannonlake came out in summer I could wait for it, but as the
release gets delayed it makes a Ryzen purchase more likely.


AKA JayDDee, cpuminer-opt developer. https://github.com/JayDDee/cpuminer-opt
https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
joblo (OP)
November 28, 2017, 05:41:45 PM
 #2902

I've encountered my first major roadblock with 4-way with whirlpool.

The core of whirlpool is a table lookup, but the table index is a variable, meaning each lane in the vector
uses a different index, i.e. each lane reads a different address. This operation is not efficient with SIMD as
it needs to load one 64-bit element from each of 4 different addresses. Although there is a SIMD instruction to do this,
it is very expensive, with an optimum throughput of 4 cycles to read 4 items. That's no faster than performing
the operation with scalar instructions. When the 4-way overhead is added it hashes significantly slower
than the old way.
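
To make the lookup problem concrete, here is a minimal C sketch of one 4-way table lookup step written as plain scalar code. The table contents and the names `demo_table`, `demo_table_init` and `lookup_4way` are hypothetical stand-ins, not whirlpool's real tables or cpuminer-opt code. It only illustrates that each lane's index is data dependent, so the four 64-bit loads go to four unrelated addresses, which is the same work a gather instruction has to do.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4
#define TABLE_SIZE 256

/* Hypothetical stand-in for one 256-entry table of 64-bit words. */
static uint64_t demo_table[TABLE_SIZE];

/* Fill the table with arbitrary but distinct values. */
static void demo_table_init(void)
{
    for (int i = 0; i < TABLE_SIZE; i++)
        demo_table[i] = (uint64_t)i * 0x0101010101010101ULL;
}

/* One 4-way lookup step: every lane brings its own index, so the four
   loads cannot be merged into a single contiguous vector load. */
static void lookup_4way(uint64_t out[LANES], const unsigned idx[LANES])
{
    for (int lane = 0; lane < LANES; lane++) {
        assert(idx[lane] < TABLE_SIZE);
        out[lane] = demo_table[idx[lane]];
    }
}
```

After `demo_table_init()`, calling `lookup_4way(out, idx)` with `idx = {3, 200, 17, 96}` reads four scattered table entries. A contiguous load would instead fetch `demo_table[i..i+3]` in one access, which is what SIMD is good at.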

I suspect GPUs don't have this problem because each lane has its own dedicated core with its own local memory.
All memory accesses can run in parallel with different addresses. On a CPU, 4 lanes run on the same core
accessing data from 4 addresses from the same memory system, serially.

This looks like an architectural issue that can't be overcome.

This will affect algos like x15, xevan & m7m which will gain less than previously anticipated.

joblo (OP)
November 28, 2017, 10:01:30 PM
 #2903

cpuminer-opt-3.7.4 is released.

Added 4 way support for tribus and nist5.

Removed some unnecessary compile options.

A 4-way Windows binary is now available.

I'm waiting for someone to get the bonus. The bonus is if one thread can find more than one nonce in parallel.
It's very rare and I haven't seen it yet, but the code checks for it to make sure second, third or even fourth
nonces are submitted. It's almost like a lotto but you don't win anything. The multiple nonces are all part
of the odds.
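
As a rough illustration of the multiple-nonce check, here is a minimal C sketch. It assumes a simplified test where a lane wins when the most significant 32-bit word of its hash is at or below a target word, and that lane N hashed nonce `first_nonce + N`. The function name and parameters are hypothetical and not taken from cpuminer-opt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4

/* Collect every winning nonce from one 4-way hash pass.
   hash_hi[lane] is the top 32-bit word of that lane's hash (simplified test).
   Returns how many nonces were written to found[]. */
static int collect_winning_nonces(const uint32_t hash_hi[LANES],
                                  uint32_t target_hi,
                                  uint32_t first_nonce,
                                  uint32_t found[LANES])
{
    assert(hash_hi != NULL && found != NULL);
    int count = 0;
    for (int lane = 0; lane < LANES; lane++) {
        /* Do not stop at the first hit: later lanes may also qualify. */
        if (hash_hi[lane] <= target_hi)
            found[count++] = first_nonce + (uint32_t)lane;
    }
    return count;
}
```

The caller would then submit each of the `count` entries in `found` as a separate share.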

git: https://github.com/JayDDee/cpuminer-opt

tarball: https://drive.google.com/file/d/1AwdqMWFufxZmuKWKHkWCjlfRm0SPqID8/view?usp=sharing

Windows binaries: https://drive.google.com/file/d/1opN5Wb5tL9_wes8RsZ6QSftOOo2Uhb6p/view?usp=sharing

spider703
November 28, 2017, 10:58:30 PM
 #2904

cpuminer-4way not working on my i7-3770

BTC 1Hof999zuqUKpifmzrSABv7tNr4nRaoJKM LTC Lf2L6DTBr2gXT38d7cVRqDQiHMndtXQyNW or write me in https://t.me/spider703
joblo (OP)
November 29, 2017, 02:22:05 AM
 #2905

Quote from: spider703
cpuminer-4way not working on my i7-3770

From README.txt:

Quote
4way requires a CPU with AES and AVX2.

Your CPU is Ivy Bridge, no AVX2.
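
For anyone who wants to check this before picking a binary, here is a minimal C sketch of a runtime feature test. It assumes GCC or Clang on x86 and uses their `__builtin_cpu_supports` built-in. The function name `cpu_can_run_4way` is hypothetical, and this is not how cpuminer-opt itself detects features.

```c
#include <assert.h>

/* Returns 1 when the running CPU reports both AES and AVX2, else 0.
   Assumes GCC or Clang on x86; any other target conservatively returns 0. */
static int cpu_can_run_4way(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("aes") && __builtin_cpu_supports("avx2");
#else
    return 0;
#endif
}
```

On an Ivy Bridge CPU such as the i7-3770 this would return 0, because that generation has AES but not AVX2.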

nizzuu
November 29, 2017, 06:54:39 AM
 #2906

It seems one of the questions was lost in the thread :(

Sample usage:

cpuminer-aes-avx2 -a lyra2z330 -t 2 --benchmark

First hashrate output is shown after ~7-8 minutes on an i5-7600 (860+ h/s), and ~15 minutes on a slower (450+ h/s) Pentium, but the appropriate CPU utilization starts immediately.

Tried the new 4way nist5 and tribus: speed is shown immediately, as well as on lyra2z. Why is the first output so slow? It's a real pain to benchmark...
warcries
November 29, 2017, 07:03:41 AM
 #2907

@joblo

The program works fine on my Windows 10 machine, but when I run it on Windows Server 2012 the error is "program not responding".
joblo (OP)
November 29, 2017, 01:31:27 PM
 #2908

Quote from: nizzuu
It seems one of the questions was lost in the thread :(

Sample usage:

cpuminer-aes-avx2 -a lyra2z330 -t 2 --benchmark

First hashrate output is shown after ~7-8 minutes on an i5-7600 (860+ h/s), and ~15 minutes on a slower (450+ h/s) Pentium, but the appropriate CPU utilization starts immediately.

Tried the new 4way nist5 and tribus: speed is shown immediately, as well as on lyra2z. Why is the first output so slow? It's a real pain to benchmark...

Interesting observation. There is nothing unique about how cpuminer handles lyra2z330 vs other algos. Lyra2z330 is, however,
unique as the slowest hashing algo. It also has to do a little more work on startup (malloc) that other algos don't do. But that
doesn't take minutes.

It might be worthwhile to pay attention to the hash count. Is it proportional to the time? I'm not sure what else to suggest.

BTW lyra2z330 will not benefit from 4way. It is pure lyra2, which is already using AVX2 horizontally. Vertical (4way) AVX2 would
not likely affect compute performance. Furthermore, lyra2z is I/O bound (memory hard), so improving compute performance just
means the CPU would spend more time stalled waiting on data from memory.

joblo (OP)
November 29, 2017, 01:32:25 PM
 #2909

Quote from: warcries
@joblo

The program works fine on my Windows 10 machine, but when I run it on Windows Server 2012 the error is "program not responding".

More info please. It's probably your CPU.

fynxgloire
November 29, 2017, 03:52:02 PM
Last edit: November 29, 2017, 06:45:43 PM by fynxgloire
 #2910

Hi,
What is the best bang for the buck Xeon processor to go with the H110 Pro BTC+ motherboard?
or
Can an Intel Core i7-8700K CPU work with this motherboard?

regards
joblo (OP)
November 29, 2017, 08:09:15 PM
 #2911

Quote from: fynxgloire
Hi,
What is the best bang for the buck Xeon processor to go with the H110 Pro BTC+ motherboard?
or
Can an Intel Core i7-8700K CPU work with this motherboard?

regards

System building recommendations deserve their own thread; I'd rather keep this one about
cpuminer software.

That being said, if you're trying to build a combo GPU/CPU rig it's entirely feasible. CPU choice
depends on the features of the various CPU architectures. There are several threads already discussing
the benefits of different architectures and features for CPU mining.

joblo (OP)
November 29, 2017, 11:42:35 PM
Last edit: November 30, 2017, 01:31:59 AM by joblo
 #2912

Here's a puzzle for coding experts.

I was testing with both sph and 4way running side by side and comparing the hashes.
Everything was fine. Then I started cleaning up the code and the hash broke. What remains
is the last bit of code I can't remove without breaking the hash. I left a couple of commented-out
lines for context (no pun intended). The code as presented works. If I remove the line indicated,
the hash breaks and it only submits invalid shares that are rejected. It should be noted that
ctx_blake was never initialized, nor was sph_blake256 run before sph_blake256_close, so close is
running with random data. Both variables are local and are not referenced anywhere else.

I would suspect local stack corruption but in reverse. Instead of code corrupting the stack,
removing code does.

The input data is four 80-byte streams interleaved for blake_4way.
vhash is four 32-byte hash streams returned from blake_4way, interleaved.
hash0..3 is vhash deinterleaved so lyra2 can be run serially.
hash and ctx_blake are not in any way involved in the proper functioning of the code.
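
For readers unfamiliar with the layout, here is a minimal C sketch of 4x32 interleaving and deinterleaving. It assumes a word-by-word layout (a0 b0 c0 d0 a1 b1 c1 d1 ...) and takes a count of 32-bit words per lane. The names `interleave_4x32` and `deinterleave_4x32` and their signatures are hypothetical scalar stand-ins, not the m128_deinterleave_4x32 used in the miner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4

/* Interleave four equal-length streams of 32-bit words so that word i of
   every lane sits side by side: dst = a0 b0 c0 d0 a1 b1 c1 d1 ... */
static void interleave_4x32(uint32_t *dst,
                            const uint32_t *s0, const uint32_t *s1,
                            const uint32_t *s2, const uint32_t *s3,
                            size_t words)
{
    assert(dst && s0 && s1 && s2 && s3);
    for (size_t i = 0; i < words; i++) {
        dst[LANES * i + 0] = s0[i];
        dst[LANES * i + 1] = s1[i];
        dst[LANES * i + 2] = s2[i];
        dst[LANES * i + 3] = s3[i];
    }
}

/* Inverse operation: split an interleaved buffer back into four streams. */
static void deinterleave_4x32(uint32_t *d0, uint32_t *d1,
                              uint32_t *d2, uint32_t *d3,
                              const uint32_t *src, size_t words)
{
    assert(src && d0 && d1 && d2 && d3);
    for (size_t i = 0; i < words; i++) {
        d0[i] = src[LANES * i + 0];
        d1[i] = src[LANES * i + 1];
        d2[i] = src[LANES * i + 2];
        d3[i] = src[LANES * i + 3];
    }
}
```

With this layout one 128-bit vector load picks up the same word position from all four lanes, which is what lets a 4-way hash process four nonces per instruction.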

I'm stumped. Anyone have any insight?

Edit:
I tried nulling sph_blake256_close but it failed. It seems to be dependent on actually running the code
in the function.

I moved the funky code to the end of the function and everything still works. But still, if I remove it
the returned hash is invalid. SPH is stable code and not likely to be accessing data it shouldn't.
Even if it did, it would break something, not fix it. It's not even being used properly. There should be
no interactions between the sph code and the 4way code; they have their own data structures and supporting
functions and don't share anything.

I'm even more stumped.

Code:
void lyra2z_hash_4way( void *state, const void *input )
{
     uint32_t hash0[8] __attribute__ ((aligned (32)));
     uint32_t hash1[8] __attribute__ ((aligned (32)));
     uint32_t hash2[8] __attribute__ ((aligned (32)));
     uint32_t hash3[8] __attribute__ ((aligned (32)));
     uint32_t vhash[8*4] __attribute__ ((aligned (64)));
     blake256_4way_context ctx __attribute__ ((aligned (64)));

     // Leftover sph test code: hash and ctx_blake are not used by the 4way path.
     uint32_t _ALIGN(64) hash[8];
     sph_blake256_context ctx_blake __attribute__ ((aligned (64)));
     //memcpy( &ctx_blake, &lyra2z_blake_mid, sizeof lyra2z_blake_mid );
     //sph_blake256( &ctx_blake, input + 64, 16 );
     // removing the following line breaks the hash
     sph_blake256_close( &ctx_blake, hash );

     // Copy ctx_mid, then hash the last 16 bytes of each interleaved 80-byte lane.
     memcpy( &ctx, &ctx_mid, sizeof ctx_mid );
     blake256_4way( &ctx, input + (64<<2), 16 );
     blake256_4way_close( &ctx, vhash );

     // Split vhash into one 32-byte hash per lane.
     m128_deinterleave_4x32( hash0, hash1, hash2, hash3, vhash, 256 );

     // Run lyra2z serially on each lane.
     LYRA2Z( lyra2z_wholeMatrix, hash0, 32, hash0, 32, hash0, 32, 8, 8, 8);
     LYRA2Z( lyra2z_wholeMatrix, hash1, 32, hash1, 32, hash1, 32, 8, 8, 8);
     LYRA2Z( lyra2z_wholeMatrix, hash2, 32, hash2, 32, hash2, 32, 8, 8, 8);
     LYRA2Z( lyra2z_wholeMatrix, hash3, 32, hash3, 32, hash3, 32, 8, 8, 8);

     // Return the 4 hashes back to back.
     memcpy( state   , hash0, 32 );
     memcpy( state+32, hash1, 32 );
     memcpy( state+64, hash2, 32 );
     memcpy( state+96, hash3, 32 );
}

joblo (OP)
November 30, 2017, 04:50:28 AM
 #2913

I'm giving up on lyra2z 4way. It wasn't about lyra2z anyway; the gain turned out to be only 2% with
rejects.

The real point was to test blake256 as a step toward other algos. It's also used by cryptonight
and lyra2rev2.

With the whirlpool problem that's 2 failures in 2 days. It's a good thing I don't have a boss or
customers to answer to.

The problem with lyra2z is one of the weirdest I've ever encountered. I will probably revisit this
in the future when I am able to test the other algos that use blake256. For now I'll move forward
with other algos that build on the work done for tribus and nist5.

warcries
November 30, 2017, 06:14:38 AM
 #2914

Quote from: joblo
@joblo

The program works fine on my Windows 10 machine, but when I run it on Windows Server 2012 the error is "program not responding".

More info please. It's probably your CPU.

I'm using an Intel Xeon X3430 (Lynnfield) on Windows Server 2012. The error is "program not responding".

Thank you.
joblo (OP)
November 30, 2017, 06:48:37 AM
 #2915

Quote from: warcries
@joblo

The program works fine on my Windows 10 machine, but when I run it on Windows Server 2012 the error is "program not responding".

More info please. It's probably your CPU.

I'm using an Intel Xeon X3430 (Lynnfield) on Windows Server 2012. The error is "program not responding".

Thank you.

Did you read README.txt? There is a very interesting line:

Quote
Choose the exe that best matches you CPU's features or use trial and
error to find the fastest one that doesn't crash.

fynxgloire
November 30, 2017, 08:14:26 AM
 #2916

I am thinking of upgrading my Celeron to an 8700k or 7700k on my mining rig.
Currently on my gaming rig I am getting 2000kH/s on the Intel Core i7-6700 CPU
( using the 4 way executable )
Does anyone have any benchmarks for the 7700 or 8700 chips so I can
compare hash rates and decide what to buy?
nizzuu
November 30, 2017, 10:35:18 AM
 #2917

Quote from: fynxgloire
to an 8700k or 7700k

This is off topic, but...

I've just tested the 4-way binary (the one provided by joblo) on an i5-7600 (non-K) under a Deepcool Redhat (it handles up to 250W TDP, haha) with tribus and nist5.

Settings:
cpuminer-4way -t 4 -a ... --benchmark

The CPU heats up to 95°C (it's 19°C in my server room) on all cores and starts throttling, with about a 30-40% hashrate drop. (The same settings for cryptonight without AVX/AVX2 max out at 61°C on 3 of 4 cores with the -t 3 setting.) This means that a high-frequency CPU must have a Z-series chipset to set an AVX downclock offset in the BIOS. The other solution is to go for a lower-frequency desktop CPU (e.g. buy an i5-7400 instead of an i5-7600, and so on) or a CPU with solder under its cap. I'm happy that this CPU maxes out with -t 2 on lyra2z330 at 72°C, not -t 4, so I'm not that angry :)

I would never ever buy this crap for a rig again (yep, I know this crap is good for gaming, compiling, and so on), nor the 7700/7700k/8700/8700k. It seems that 3.7...3.8GHz is the maximum that Skylake/Kaby Lake/Coffee Lake can handle under AVX load on all cores without overheating.

If I were you, I'd go for an i7-6800k/6850k/6900k or old Xeon E5 v3s (they have AVX2 and are rather cheap for now).
ol92
November 30, 2017, 10:58:23 AM
 #2918

Quote from: fynxgloire
I am thinking of upgrading my Celeron to an 8700k or 7700k on my mining rig.
Currently on my gaming rig I am getting 2000kH/s on the Intel Core i7-6700 CPU
( using the 4 way executable )
Does anyone have any benchmarks for the 7700 or 8700 chips so I can
compare hash rates and decide what to buy?

Which algo?
fynxgloire
November 30, 2017, 11:36:39 AM
 #2919

Quote from: ol92
I am thinking of upgrading my Celeron to an 8700k or 7700k on my mining rig.
Currently on my gaming rig I am getting 2000kH/s on the Intel Core i7-6700 CPU
( using the 4 way executable )
Does anyone have any benchmarks for the 7700 or 8700 chips so I can
compare hash rates and decide what to buy?

Which algo?

I am using this script to get 2000 kH/s on current Intel i7-6700

Bin\cpuminer-opt\cpuminer-4way.exe -a lyra2z -o stratum+tcp://xzc.suprnova.cc:1596 -u fynxgloire.Desktop1080CPU -p Intel
4ward
November 30, 2017, 01:00:29 PM
 #2920

Quote from: nizzuu
to an 8700k or 7700k

This is off topic, but...

I've just tested the 4-way binary (the one provided by joblo) on an i5-7600 (non-K) under a Deepcool Redhat (it handles up to 250W TDP, haha) with tribus and nist5.

Settings:
cpuminer-4way -t 4 -a ... --benchmark

The CPU heats up to 95°C (it's 19°C in my server room) on all cores and starts throttling, with about a 30-40% hashrate drop. (The same settings for cryptonight without AVX/AVX2 max out at 61°C on 3 of 4 cores with the -t 3 setting.) This means that a high-frequency CPU must have a Z-series chipset to set an AVX downclock offset in the BIOS. The other solution is to go for a lower-frequency desktop CPU (e.g. buy an i5-7400 instead of an i5-7600, and so on) or a CPU with solder under its cap. I'm happy that this CPU maxes out with -t 2 on lyra2z330 at 72°C, not -t 4, so I'm not that angry :)

I would never ever buy this crap for a rig again (yep, I know this crap is good for gaming, compiling, and so on), nor the 7700/7700k/8700/8700k. It seems that 3.7...3.8GHz is the maximum that Skylake/Kaby Lake/Coffee Lake can handle under AVX load on all cores without overheating.

If I were you, I'd go for an i7-6800k/6850k/6900k or old Xeon E5 v3s (they have AVX2 and are rather cheap for now).

An i5-7600k overclocked @ 4.5GHz with watercooling reaches 75°C on tribus (70°C with nist5).

If you get 95°C at stock speed, you need a better cooler. Intel's stock cooler is not enough for high loads over a long time.
