joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
August 24, 2016, 06:26:43 PM |
|
cpuminer-opt-3.4.3 is available for download. It includes faster m7m on most CPUs and Windows binaries.

Source code: https://drive.google.com/file/d/0B0lVSGQYLJIZM0RJZVZSUnpCR0k/view?usp=sharing
Windows binaries: https://drive.google.com/file/d/0B0lVSGQYLJIZRlVsc3FEVWhYU0U/view?usp=sharing

All supported architectures have separate binaries; see README.txt for details. Compiling was done on an i7-4790K (Haswell). AMD amdfam10 failed to compile due to AVX inconsistencies. AMD btver1 appears to have been compiled without AES and AVX.

As this is the first release with pre-built portable Windows binaries there may be some problems. There are also some specific questions I have that users may be able to answer. When reporting problems please provide all relevant information such as CPU architecture, commands used, compile environment, error messages and any other information that may be useful.

Specific questions:

I was not able to compile for Broadwell/Skylake on my Haswell. Does a native compile on these CPUs perform better than a core-avx2 compile?

AMD performance is expected to be poor with the pre-built binaries. I suspect compiling for AMD on an Intel CPU may not produce the optimum code. AMD users who can compile their own can confirm whether this is the case.

The major code optimisations involve AES, AVX and AVX2. The architectures that introduced these individual features should see the biggest incremental improvement. I would like to know how much of a performance penalty exists if users are forced to use a lesser compile. For example, how much slower is Ivybridge using the corei7-avx build vs a native compile or the core-avx-i build?
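As a rough illustration of the feature tiers mentioned in the announcement, the C sketch below maps the three instruction-set features it names (AES, AVX, AVX2) to a GCC -march value. The function name `pick_march` and the tier ordering are assumptions made for illustration and are not taken from cpuminer-opt; README.txt remains the reference for which binary to use. The mapping cannot tell core-avx-i apart from corei7-avx, because those two differ in features other than AES, AVX and AVX2.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: choose a GCC -march value from three CPU feature flags.
 * The tiers follow where Intel introduced each feature (AES-NI with Westmere,
 * AVX with Sandy Bridge, AVX2 with Haswell). Illustration only. */
static const char *pick_march(bool has_aes, bool has_avx, bool has_avx2)
{
    if (has_aes && has_avx && has_avx2)
        return "core-avx2";   /* Haswell class */
    if (has_aes && has_avx)
        return "corei7-avx";  /* Sandy Bridge class; Ivy Bridge may also use core-avx-i */
    if (has_aes)
        return "westmere";    /* AES-NI without AVX */
    return "corei7";          /* conservative fallback without AES */
}
```

A CPU reporting AES and AVX but not AVX2 would get "corei7-avx" from this sketch, which is the "lesser compile" case the last question asks Ivybridge users to measure.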
|
|
|
|
t2yax
Sr. Member
Offline
Activity: 462
Merit: 250
Arianee:Smart-link Connecting Owners,Assets,Brands
|
|
August 24, 2016, 06:56:43 PM |
|
neoscrypt is slow compared to the neoscrypt cpuminer from ghostlander: https://github.com/ghostlander/cpuminer-neoscrypt
Can you take that miner as the base for the neoscrypt algo? Also, can you implement AES-NI and AVX features for cryptolight?
|
|
|
|
vingaard
Legendary
Offline
Activity: 1246
Merit: 1011
|
|
August 24, 2016, 07:10:18 PM |
|
Thank you very very much
|
|
|
|
joblo (OP)
|
|
August 24, 2016, 07:10:57 PM |
|
I'll look into the ghostlander fork of neoscrypt. I don't know that cryptolight even works as I am unaware of any coin that uses it. I would need a way to test. I would also have to examine the code to see if the cryptonight optimisations can be ported to cryptolight. It's not at the top of my priority list.
|
|
|
|
t2yax
|
|
August 24, 2016, 07:16:54 PM |
|
Aeon coin uses it:
https://bitcointalk.org/index.php?topic=641696.0
https://coinmarketcap.com/currencies/aeon/
|
|
|
|
Epsylon3
Legendary
Offline
Activity: 1484
Merit: 1082
ccminer/cpuminer developer
|
|
August 24, 2016, 07:39:47 PM |
|
|
|
|
|
joblo (OP)
|
|
August 24, 2016, 07:41:05 PM |
|
WTF, ghostlander's neoscrypt is half the speed of cpuminer-opt. Do your homework before making silly requests.
|
|
|
|
joblo (OP)
|
|
August 24, 2016, 07:45:11 PM |
|
Thanks. I'm hoping to get going with git soon. I haven't found any more AVX2 quick kills, so if the Windows binaries release doesn't have too many problems I'll have some time to explore git in more detail.
|
|
|
|
t2yax
|
|
August 24, 2016, 07:46:51 PM |
|
http://prntscr.com/c9yfon is a screenshot of ghostlander's neoscrypt cpuminer.
http://prntscr.com/c9ygeh is your miner.
The CPU is an i5 3337U. I used your cpuminer-core-avx-i build from version 3.4.3, so I did it right.
|
|
|
|
joblo (OP)
|
|
August 24, 2016, 08:01:44 PM |
|
I can't see your images, too many scripts want to run in my browser. Please post the numbers. This is not a code problem because I compiled both myself on the same CPU. You used a precompiled binary from a release that was clearly identified as a test release and that requested information from users. Had you provided that information, I wouldn't have wasted my time chasing down a slower fork. If you go back a few posts, read the release announcement and then provide some useful data, I'll look at it. You could start by comparing core-avx-i with corei7-avx. If you can compile your own native build, even better.
|
|
|
|
t2yax
|
|
August 24, 2016, 08:07:56 PM |
|
Ghostlander's miner makes 10.3 khash/sec where yours makes 8.3 khash/sec. Unfortunately I don't have any compiling or programming skills.
|
|
|
|
NDBob
Newbie
Offline
Activity: 14
Merit: 0
|
|
August 24, 2016, 08:14:20 PM |
|
Joblo ...
OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least. The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....
1) Comment out or remove the macro definitions for min and max in miner.h
2) Add a local definition of the min macro to decred.c
After that I was able to get it to compile on one of my Haswell systems. Still having trouble compiling on an older Westmere-based system due to some AES256CBC complaints.
Bob
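The two steps above can be sketched in C roughly as follows. This is a minimal illustration, not the actual contents of decred.c: the function `clamp_len` is a hypothetical stand-in for whatever code there uses min. The point is only that the macro lives in the single .c file that needs it, so C++ sources that include miner.h no longer see a function-like macro named min or max.

```c
#include <stddef.h>

/* Local definition of min, placed in the one C file that uses it instead of
 * in a shared header. C++ translation units that include the header are then
 * free to use their own min/max without the preprocessor rewriting them. */
#define min(a, b) (((a) < (b)) ? (a) : (b))

/* Hypothetical caller standing in for the code that needs min. */
static size_t clamp_len(size_t wanted, size_t available)
{
    return min(wanted, available);
}
```

Each argument is parenthesised in the macro body so that expressions such as `min(x + 1, y)` expand as intended.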
|
|
|
|
joblo (OP)
|
|
August 24, 2016, 08:21:27 PM |
|
I tried the core-avx-i build on my Haswell and got the same performance as a native build. Your CPU is definitely underperforming with cpuminer-opt but I have no clue why. Unless someone else can reproduce your poor results and provide more data, there's nothing more I can do. I suggest you use whatever works best for you.
|
|
|
|
joblo (OP)
|
|
August 24, 2016, 08:29:17 PM |
|
Good work. I'll make the change proactively. The AES256CBC problem may be related to AVX code in hodl-wolf. IIRC either Nehalem or Westmere has AES but not AVX. I may have to tighten up the checking to force it to use the unoptimized version. Did you do a native compile? Have you tried corei7? "gcc -Q -march=native --help=target" will tell you which arch is the default for native.
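Alongside the gcc query above, a small C sketch can show which of the three features the compiler actually enabled for a given -march. It relies on the macros `__AES__`, `__AVX__` and `__AVX2__`, which GCC predefines when the corresponding instruction sets are turned on; the function names are made up for this example and are not part of cpuminer-opt.

```c
#include <stdio.h>

/* Each helper returns 1 if the compiler enabled the feature for this build,
 * based on GCC's predefined macros, and 0 otherwise. */
static int compiled_with_aes(void)
{
#ifdef __AES__
    return 1;
#else
    return 0;
#endif
}

static int compiled_with_avx(void)
{
#ifdef __AVX__
    return 1;
#else
    return 0;
#endif
}

static int compiled_with_avx2(void)
{
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}

/* Print a one-line summary of the compile-time feature set. */
static void print_compiled_features(FILE *out)
{
    fprintf(out, "AES=%d AVX=%d AVX2=%d\n",
            compiled_with_aes(), compiled_with_avx(), compiled_with_avx2());
}
```

Calling `print_compiled_features(stdout)` from a build made with -march=westmere would be expected to report AES without AVX, which is the combination under discussion here.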
|
|
|
|
NDBob
|
|
August 24, 2016, 09:03:51 PM |
|
Westmere supports AES-NI but not AVX. Nehalem doesn't support either. I've successfully compiled for all the AVX platforms on my laptop (Haswell Core i5), but I can't compile with march=westmere or with native on my dev virtual machine, which is running on some older servers (dual Westmere hex-core) that I wanted to test on. Setting march=haswell on the older VM works fine and compiles Haswell-optimized code (which obviously can't run locally). There appears to be some sort of conflict in the capabilities check on the HODL AES code.
|
|
|
|
joblo (OP)
|
|
August 24, 2016, 09:21:08 PM |
|
You're right, I only check for AES_NI, not AVX. This may affect some other algos that also have AVX code mixed in with AES. If I can identify which ones are pure AES I can make a distinction; otherwise I'll have to use non-AES code unless the CPU also supports AVX. I don't have the necessary HW to test, but if you don't mind doing a little more work it would help a lot. There are three groups of AES code: code used only by hodl, code used only by cryptonight, and code shared among many algos including x11. Those three should cover the entire spectrum of AES-optimized code. Those that work on your Westmere can have AES enabled without AVX. Those like hodl will require a CPU with AVX before AES can be enabled.
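The distinction described here could be sketched in C along these lines. All names are hypothetical and this is not the actual cpuminer-opt check. Only the hodl entry reflects what the thread reports; the cryptonight and shared groups are assumed not to need AVX purely as placeholders until they have been tested on a Westmere CPU.

```c
#include <stdbool.h>

/* The three groups of AES code named in the post. */
enum aes_group { AES_GROUP_HODL, AES_GROUP_CRYPTONIGHT, AES_GROUP_SHARED };

/* Whether a group's AES code also contains AVX instructions.
 * Assumption: only hodl is known to mix AVX with AES; the other two groups
 * return false here as placeholders pending test results. */
static bool group_needs_avx(enum aes_group group)
{
    return group == AES_GROUP_HODL;
}

/* Decide whether the AES-optimized path may be used for a group. */
static bool use_aes_path(enum aes_group group, bool cpu_has_aes, bool cpu_has_avx)
{
    if (!cpu_has_aes)
        return false;                 /* no AES-NI: always use the plain code */
    if (group_needs_avx(group) && !cpu_has_avx)
        return false;                 /* AES code mixed with AVX: AVX is required too */
    return true;
}
```

With these assumptions, a Westmere CPU (AES without AVX) would get the unoptimized hodl code while keeping AES enabled for the other two groups.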
|
|
|
|
joblo (OP)
|
|
August 25, 2016, 02:10:24 AM |
|
I have fixes for the two compile problems with hodl on Westmere CPUs. Westmere will now use the unoptimized hodl function. I have also fixed the min/max duplication by making local definitions where required, instead of a global definition.
I'd like to wait for more test results before building a new release in case more problems are reported, particularly with the Windows binaries.
|
|
|
|
|
|
ryen123
|
|
August 25, 2016, 08:48:48 AM |
|
@Joblo, would it be cool with you if I mined to your BTC donation address at NiceHash as a donation for your work? I'll mine one full day per week as a donation. Any other user can do the same.
|
|
|
|
|