joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
 |
April 15, 2016, 07:05:09 PM |
|
There is a problem in v3.1.14 with the new compile procedure for CPUs that do not have AES_NI. Please continue to use -DNO_AES_NI on the configure command line if your CPU does not have AES_NI.
Yeah, you added the NO_AES_NI define inside the VisualC-specific section of miner.h. DOH! That explains everything. Thanks for noticing that stupidity. I found the bugs in has_sse2. Yes, there were two of them: wrong reg and wrong field. I'll look over your other stuff and fully understand it before I implement it. I will learn more by hand coding it. I also need the practice. Eliminating dependencies is good. You should pass these along to Epsylon3 as he may be interested in porting them to his fork.
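The thread does not show the corrected has_sse2 code, so the following is only a minimal sketch of runtime feature detection, assuming GCC or Clang and their `<cpuid.h>` helper `__get_cpuid`. The names `cpu_has_sse2` and `cpu_has_aes_ni` are hypothetical and are not taken from cpuminer-opt. The register and bit positions (SSE2 in EDX bit 26, AES-NI in ECX bit 25 of CPUID leaf 1) are the documented ones, and reading the wrong register or the wrong bit is the kind of mistake described above.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>              /* GCC/Clang helper: __get_cpuid() */
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0        /* non-x86 build: report no x86 features */
#endif

/* CPUID leaf 1 returns the basic feature flags in ECX and EDX. */
#define CPUID_LEAF_FEATURES 1u
#define EDX_SSE2_BIT  (1u << 26)   /* SSE2 lives in EDX, bit 26 */
#define ECX_AESNI_BIT (1u << 25)   /* AES-NI lives in ECX, bit 25 */

/* Read ECX/EDX of the feature leaf; returns false if CPUID is unavailable. */
static bool read_feature_leaf(unsigned int *ecx, unsigned int *edx)
{
#if HAVE_X86_CPUID
    unsigned int eax = 0, ebx = 0;
    return __get_cpuid(CPUID_LEAF_FEATURES, &eax, &ebx, ecx, edx) != 0;
#else
    *ecx = 0;
    *edx = 0;
    return false;
#endif
}

/* Hypothetical helper: true if the running CPU reports SSE2. */
bool cpu_has_sse2(void)
{
    unsigned int ecx = 0, edx = 0;
    return read_feature_leaf(&ecx, &edx) && (edx & EDX_SSE2_BIT) != 0;
}

/* Hypothetical helper: true if the running CPU reports AES-NI. */
bool cpu_has_aes_ni(void)
{
    unsigned int ecx = 0, edx = 0;
    return read_feature_leaf(&ecx, &edx) && (ecx & ECX_AESNI_BIT) != 0;
}
```

Keeping the register and the bit mask side by side in named constants makes both halves of the check easy to audit against the CPU vendor's manual.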
|
|
|
|
joblo (OP)
April 15, 2016, 07:09:16 PM Last edit: April 15, 2016, 07:56:39 PM by joblo |
|
The problem is now understood and a fix has been coded and tested. The fix will be in the next release, but since the workaround is to use the old build procedure I will not rush the next release for this issue. v3.1.14 is still a better release than v3.1.13.

There is a problem in v3.1.14 with the new compile procedure for CPUs that do not have AES_NI. Please continue to use -DNO_AES_NI on the configure command line if your CPU does not have AES_NI.

cpuminer-opt v3.1.14 released.
https://drive.google.com/file/d/0B0lVSGQYLJIZaE5DYXA4SHl2WVk/view?usp=sharing

New in v3.1.14

Algos
- cryptonight algo is now supported on CPUs without AES_NI. All algos now support both CPU architectures.
- jane added as an alias for scryptjane with default N-factor 16

Build enhancements, see details in README.md (thanks to hmage)
- build.sh now works for CPUs with and without AES_NI
- it is no longer necessary to add -DNO_AES_NI CFLAG to configure command when building for CPUs without AES_NI.

Note: Compiling requires some additional libraries not included in the default installation of most Linux distributions: libboost-dev, libboost-system-dev, libboost-thread-dev.

UI enhancements
- enhanced checks for CPU architecture, SW build and algo for AES_NI and SSE2 capabilities.
- a warning is displayed if mining an untested algo.

Code cleanup
- removed a few more compiler warnings
- removed some dead code

Algo gate enhancements (for devs)
- replaced algo specific null gate functions with generic null functions
|
|
|
|
joblo (OP)
April 15, 2016, 07:49:33 PM |
|
AES fixed, has_sse2 fixed. I'm hesitant to change boost because the compiler flagged it as experimental. I'm not sure I want to go there yet. Adding #include "miner.h" does nothing in groestl-version.h, everything is hard coded. Everything now works as I intended in 3.1.14, I just need to do some thorough testing.
|
|
|
|
hmage
Member

Offline
Activity: 83
Merit: 10
|
 |
April 15, 2016, 08:35:40 PM |
|
I'm hesitant to change boost because the compiler flagged it as experimental. I'm not sure I want to go there yet.
What version of compiler? Did you put the -std=gnu++11 compiler flag into CXXFLAGS? That's important.
Adding #include "miner.h" does nothing in groestl-version.h, everything is hard coded.
It does when you try to compile for -march=core2, or on core2 with -march=native. Without the include, groestl won't have NO_AES_NI when needed and it won't compile on core2. To simplify testing I do a -march=core2 pass after every change, and then a -march=native pass.
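The point about NO_AES_NI having to be visible wherever AES code is guarded can be illustrated with a compile-time check. This is a sketch and not the project's actual code: it assumes GCC or Clang, which predefine `__AES__` when the selected -march (or -maes) enables the AES instructions. `BUILD_HAS_AES_NI` and `groestl_backend_name` are hypothetical names used only for illustration.

```c
/*
 * Derive NO_AES_NI from the compiler's own view of the target.
 * With -march=core2 the compiler does not define __AES__, so NO_AES_NI
 * is set; with an AES-capable -march it is left undefined.
 * Passing -DNO_AES_NI by hand still works and takes precedence.
 */
#if !defined(__AES__) && !defined(NO_AES_NI)
#define NO_AES_NI 1
#endif

/* Hypothetical convenience flag: 1 when AES code paths are compiled in. */
#ifdef NO_AES_NI
#define BUILD_HAS_AES_NI 0
#else
#define BUILD_HAS_AES_NI 1
#endif

/* Hypothetical function: names the code path this build would use. */
const char *groestl_backend_name(void)
{
#ifdef NO_AES_NI
    return "sse2";      /* AES intrinsics are blocked out of this build */
#else
    return "aes_ni";    /* AES intrinsics are compiled in */
#endif
}
```

For the guard to work, every source file that tests NO_AES_NI has to see this block, for example by including the shared header that contains it before any AES-specific code.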
|
|
|
|
joblo (OP)
April 15, 2016, 09:46:51 PM |
|
I'm hesitant to change boost because the compiler flagged it as experimental. I'm not sure I want to go there yet.
What version of compiler? Did you put the -std=gnu++11 compiler flag into CXXFLAGS? That's important.
Adding #include "miner.h" does nothing in groestl-version.h, everything is hard coded.
It does when you try to compile for -march=core2, or on core2 with -march=native. Without the include, groestl won't have NO_AES_NI when needed and it won't compile on core2. To simplify testing I do a -march=core2 pass after every change, and then a -march=native pass.
I saw the compiler flag but I'd rather stay with the default to ensure compatibility. If it produces a measurable performance gain I could be persuaded otherwise.
All refs to NO_AES_NI in groestl-version.h are commented out. Are you confusing it with hash-groestl.c, which indeed needs to block out AES code in order to compile on core2?
|
|
|
|
hmage
April 17, 2016, 03:55:52 AM Last edit: April 17, 2016, 04:20:33 AM by hmage |
|
I saw the compiler flag but I'd rather stay with the default to ensure compatibility. If it produces a measurable performance gain I could be persuaded otherwise.
The defaults vary between versions of compiler. Even on latest GCC 5.3 it's not default and it won't be default for some time (the next major release after 5.3 changes it to C++11), and there are still distributions that come with gcc 4.7. By explicitly stating what language version you're using you're actually ensuring compatibility of your code with the compiler (it won't try to treat 'auto' as a keyword, which is new in C++14, for example; also, inline semantics are incompatible between C89 and C99).
The message you were seeing about unordered_map was applicable to the language version you were using (C++98). It does not apply to other language versions, and it goes away if you explicitly specify the newer C++11 rather than the default C++98 (12 years of difference) in gcc 4.7 or newer.
It's up to you anyway, I'll just keep it in my fork, maybe people will find it easier to start using. Don't forget to add checks for boost in configure.ac so its absence is detected at configure time.
All refs to NO_AES_NI in groestl-version.h are commented out. Are you confusing it with hash-groestl.c, which indeed needs to block out AES code in order to compile on core2?
Try compiling with -march=core2 and you'll see it won't compile. Maybe the location I placed that include isn't the best.
|
|
|
|
joblo (OP)
April 17, 2016, 04:23:17 AM |
|
I saw the compiler flag but I'd rather stay with the default to ensure compatibility. If it produces a measurable performance gain I could be persuaded otherwise.
The defaults vary between versions of compiler. On GCC 5.x, it will compile fine because it's default there, by explicitly stating what language version you're using you're actually ensuring compatibility of your code with the compiler (it won't try to treat 'auto' as a keyword, which is new in C++14, for example, and default in 5.3.0, also inline semantics are incompatible between C89 and C99). The message you were seeing about unordered_map was applicable to that language version you were using, it does not apply to other language version and it goes away if you explicitly specify newer C++11 rather than default C++98 (12 years of difference) in gcc 4.9.
I don't understand your point. A new compiler feature was added experimentally and required a non-default option to enable it. This is to ensure any incompatibilities with legacy code can be avoided. When the app has migrated to use the new feature it would set that option to enable the compiler to use the new feature. After some transition period the new feature would be made default on the assumption that all apps have migrated. Based on that, using the default (experimental feature disabled) should be safer. Why not?
Edit: You mentioned the message is due to the language version I'm using. That supports my point. I'm running the latest LTS version of Mint and the feature is experimental. Older distros may be incompatible and I want to remain compatible with older distros using older compilers. I can't assume that CentOS 6, or CentOS 5 (yes, still supported), have updated the compiler to support the new feature.
All refs to NO_AES_NI in groestl-version.h are commented out. Are you confusing it with hash-groestl.c, which indeed needs to block out AES code in order to compile on core2?
Try compiling with -march=core2 and you'll see it won't compile. Maybe the location I placed that include isn't the best.
I did (march=core2, not real HW), and it did. One of us has a big blind spot. I've looked at groestl-version.h many times and there is no reference to NO_AES_NI. I had previously commented them out and hard coded VAES because that's all that was being used.
|
|
|
|
hmage
April 18, 2016, 02:15:58 AM |
|
I don't understand your point. A new compiler feature was added experimentally
Lemme stop you there, GCC supports C++11 since 4.8.1 as feature-complete and stable, not experimental. https://gcc.gnu.org/projects/cxx-status.html#cxx11
They do not change defaults for another reason: a lot of code will break if you upgrade to a newer version of the standard, not because it's experimental.
I did (march=core2, not real HW), and it did. One of us has a big blind spot. I've looked at groestl-version.h many times and there is no reference to NO_AES_NI. I had previously commented them out and hard coded VAES because that's all that was being used.
Yes, I was wrong, sorry. Probably a leftover change that isn't needed anymore. PS: On this forum there's no need for doing linebreaks yourself — it looks weird when the window size is smaller than your linebreaks, and different font rendering systems and different installed fonts will end up with words being different sizes than what you see on your computer (yay compatibility!). Here's how it looks for me — https://i.imgur.com/BRx5I0r.png — note the hanging 'to' and 'the'.
|
|
|
|
joblo (OP)
April 18, 2016, 04:35:39 AM |
|
I don't understand your point. A new compiler feature was added experimentally
Lemme stop you there, GCC supports C++11 since 4.8.1 as feature-complete and stable, not experimental. https://gcc.gnu.org/projects/cxx-status.html#cxx11
They do not change defaults for another reason: a lot of code will break if you upgrade to a newer version of the standard, not because it's experimental.
I'm still not sure it's the right thing to do. So c++11 is not the default until 6.1 over concerns it may break existing code. You have shown that is not the case with cpuminer-opt, so there is little risk. The fact the code is agnostic is good because it can still work on c++98 (not sure about this because I believe it requires code changes between versions). These are all reasons it's not a bad thing, but I'm having a hard time coming up with any reasons it's a good thing. I don't expect any performance improvements out of the box, and if there are any to be realized it would require code changes that would break compatibility. It's forward thinking, but the migration could be done at any time, such as after c++11 v6.1 is released. If it had been done with the first hodl release it would have eliminated the need to install packages, but now that hodl is in the wild with the libboost dependencies it's no longer an issue. Can you help me out here? Why is it a good thing?
|
|
|
|
hmage
April 18, 2016, 01:40:33 PM |
|
I don't expect any performance improvements out of the box.
My bad, we were talking about this from different angles. Performance-wise there is no good reason for this. This will _not_ improve performance at all; in fact, the code in gcc's implementation of unordered_map and in boost is very similar and will likely behave exactly the same.
The only difference is that users won't need to install boost, and on some platforms like Windows boost doesn't come pre-packaged. Requiring boost harms eventual plans to port this to Windows. Not requiring boost means people won't have to deal with 210 pages of this -- http://stackoverflow.com/search?q=boost+visual-studio
So, to recap:
* Performance wise -- there's no difference.
* Maintenance wise -- there is.
Sorry for misunderstanding.
|
|
|
|
joblo (OP)
April 18, 2016, 03:04:38 PM |
|
I don't expect any performance improvements out of the box.
My bad, we were talking about this from different angles. Performance-wise there is no good reason for this. This will _not_ improve performance at all; in fact, the code in gcc's implementation of unordered_map and in boost is very similar and will likely behave exactly the same.
The only difference is that users won't need to install boost, and on some platforms like Windows boost doesn't come pre-packaged. Requiring boost harms eventual plans to port this to Windows. Not requiring boost means people won't have to deal with 210 pages of this -- http://stackoverflow.com/search?q=boost+visual-studio
So, to recap:
* Performance wise -- there's no difference.
* Maintenance wise -- there is.
Sorry for misunderstanding.
Many thanks, I appreciate your patience with my stubbornness and picky questions. It's all clear now.
|
|
|
|
joblo (OP)
April 18, 2016, 04:30:52 PM |
|
cpuminer-opt now supports 31 algorithms on CPUs with at least SSE2 capabilities, including Intel Core2 and AMD equivalent. In addition, 13 algorithms have optimizations to take advantage of CPUs with AES_NI for even greater performance, including the Intel Core-i 2xxx and AMD equivalent. See the first post of this thread and the README.md file for details. It is currently available in source code format, compilable in Linux. Windows source and binary support is planned.

cpuminer-opt v3.1.15 is available for download.
https://drive.google.com/file/d/0B0lVSGQYLJIZdnI3SG9jNmZNRHM/view?usp=sharing
All users are encouraged to upgrade.

New in v3.1.15
- unified build procedure fixed
- build.sh now works for CPUs with and without AES_NI
- it is no longer necessary to add "-DNO_AES_NI" CFLAG to the configure command when building for CPUs without AES_NI.
- The system will automatically compile for the correct architecture
|
|
|
|
th3.r00t
|
 |
April 18, 2016, 08:23:25 PM |
|
New in v3.1.15
- unified build procedure fixed
- build.sh now works for CPUs with and without AES_NI
- it is no longer necessary to add "-DNO_AES_NI" CFLAG to the configure command when building for CPUs without AES_NI.
- The system will automatically compile for the correct architecture
Thanks! Strange thing - when I compile with
./autogen.sh && ./configure CFLAGS="-DNO_AES_NI -O3 -march=btver1" --with-curl --with-crypto && make
on AMD Sempron 145, all works like a charm. When I use ./build.sh I get an error in compile. Will try on AMD Phenom II X4 940 and see if I can reproduce it.
Edit: ./build.sh fails also on AMD Phenom II X4 940
make[2]: *** [algo/echo/aes_ni/cpuminer-hash.o] Error 1
make[2]: *** Waiting for unfinished jobs....
mv -f algo/groestl/sse2/.deps/cpuminer-grso-asm.Tpo algo/groestl/sse2/.deps/cpuminer-grso-asm.Po
mv -f algo/argon2/ar2/.deps/cpuminer-opt.Tpo algo/argon2/ar2/.deps/cpuminer-opt.Po
mv -f algo/argon2/ar2/.deps/cpuminer-ar2-scrypt-jane.Tpo algo/argon2/ar2/.deps/cpuminer-ar2-scrypt-jane.Po
make[2]: Leaving directory `/home/urban/cpuminer-opt-3.1.15'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/*****/cpuminer-opt-3.1.15'
make: *** [all] Error 2
|
|
|
|
joblo (OP)
April 18, 2016, 08:37:21 PM |
|
New in v3.1.15
- unified build procedure fixed
- build.sh now works for CPUs with and without AES_NI
- it is no longer necessary to add "-DNO_AES_NI" CFLAG to the configure command when building for CPUs without AES_NI.
- The system will automatically compile for the correct architecture
Thanks! Strange thing - when I compile with
./autogen.sh && ./configure CFLAGS="-DNO_AES_NI -O3 -march=btver1" --with-curl --with-crypto && make
on AMD Sempron 145, all works like a charm. When I use ./build.sh I get an error in compile. Will try on AMD Phenom II X4 940 and see if I can reproduce it.
Edit: ./build.sh fails also on AMD Phenom II X4 940
make[2]: *** [algo/echo/aes_ni/cpuminer-hash.o] Error 1
make[2]: *** Waiting for unfinished jobs....
mv -f algo/groestl/sse2/.deps/cpuminer-grso-asm.Tpo algo/groestl/sse2/.deps/cpuminer-grso-asm.Po
mv -f algo/argon2/ar2/.deps/cpuminer-opt.Tpo algo/argon2/ar2/.deps/cpuminer-opt.Po
mv -f algo/argon2/ar2/.deps/cpuminer-ar2-scrypt-jane.Tpo algo/argon2/ar2/.deps/cpuminer-ar2-scrypt-jane.Po
make[2]: Leaving directory `/home/urban/cpuminer-opt-3.1.15'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/*****/cpuminer-opt-3.1.15'
make: *** [all] Error 2
The only real difference I see is -march. build.sh always uses native. -DNO_AES_NI is now redundant and should not be necessary unless -march=native is choosing an incompatible architecture. If that's the case I don't know what I can do other than document the workaround. It would be nice to see the compile errors to give me an idea what it was trying to do. I don't have a non-aesni CPU to test on so it would be nice if you could confirm the capabilities reported when the miner is started is correct. Edit: the real errors are further back but didn't cause the compile to fail immediately. Do they compile if you specify a specific arch but without -DNO_AES_NI?
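One way to see what a given -march setting actually enabled is to have the build report the compiler's predefined feature macros. The sketch below assumes GCC or Clang, which define `__SSE2__`, `__AES__` and `__AVX__` when the corresponding instruction sets are enabled for the target. The names `build_feature_mask`, `print_build_features` and the `BUILD_*` constants are hypothetical and not part of cpuminer-opt.

```c
#include <stdio.h>

/* Hypothetical bit flags describing what the compiler was allowed to use. */
enum {
    BUILD_SSE2 = 1,
    BUILD_AES  = 2,
    BUILD_AVX  = 4
};

/* Collect the instruction sets enabled at compile time into a bit mask. */
unsigned build_feature_mask(void)
{
    unsigned mask = 0;
#ifdef __SSE2__
    mask |= BUILD_SSE2;
#endif
#ifdef __AES__
    mask |= BUILD_AES;
#endif
#ifdef __AVX__
    mask |= BUILD_AVX;
#endif
    return mask;
}

/* Print a one-line summary, e.g. at miner startup or from a small test. */
void print_build_features(FILE *out)
{
    unsigned mask = build_feature_mask();
    fprintf(out, "SW built with SSE2: %s, AES: %s, AVX: %s\n",
            (mask & BUILD_SSE2) ? "yes" : "no",
            (mask & BUILD_AES)  ? "yes" : "no",
            (mask & BUILD_AVX)  ? "yes" : "no");
}
```

Compiling the same file once with -march=native and once with an explicit architecture such as -march=core2 or -march=btver1, then comparing the printed lines, shows whether native is selecting AES on a CPU that lacks it.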
|
|
|
|
th3.r00t
|
 |
April 18, 2016, 08:42:05 PM |
|
Following the topic, I get the impression that the new changes detect CPU, OS and algo (for AES-NI support), and if the CPU, OS or algo doesn't support AES-NI it falls back to SSE2.
I will send you PM in a few minutes with the compile logs for both CPU's - Phenom II X4 940 and Sempron 145
|
|
|
|
joblo (OP)
April 18, 2016, 08:48:09 PM |
|
Following the topic, I get the impression that the new changes detect CPU, OS and algo (for AES-NI support), and if the CPU, OS or algo doesn't support AES-NI it falls back to SSE2.
I will send you PM in a few minutes with the compile logs for both CPU's - Phenom II X4 940 and Sempron 145
That is correct. You need all three to get the best performance. I haven't found specifically whether either of your CPUs has AES_NI but I suspect not. Try compiling with -march=core2. It seems to have worked for others.
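The "all three" rule can be written down as a tiny decision function. This is a sketch of the logic as described in this thread, not the miner's actual code. `choose_backend`, `backend_name` and the `backend_t` type are hypothetical names, and the three inputs stand for the CPU check, the software build check and the per-algo check mentioned above.

```c
#include <stdbool.h>

/* Hypothetical code paths: the SSE2 fallback and the AES-NI optimized one. */
typedef enum {
    BACKEND_SSE2,
    BACKEND_AES_NI
} backend_t;

/*
 * AES-NI is used only when the CPU reports it, the binary was built
 * with it, and the selected algo has an AES-NI implementation.
 * If any one of the three is missing, fall back to SSE2.
 */
backend_t choose_backend(bool cpu_has_aes, bool build_has_aes, bool algo_has_aes)
{
    if (cpu_has_aes && build_has_aes && algo_has_aes)
        return BACKEND_AES_NI;
    return BACKEND_SSE2;
}

/* Human-readable name for startup messages. */
const char *backend_name(backend_t backend)
{
    return backend == BACKEND_AES_NI ? "AES_NI" : "SSE2";
}
```

For the two AMD CPUs discussed here, the first argument would be false, so the function returns BACKEND_SSE2 regardless of how the binary was built.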
|
|
|
|
th3.r00t
|
 |
April 18, 2016, 08:55:28 PM |
|
Following the topic, I get the impression that the new changes detect CPU, OS and algo (for AES-NI support), and if the CPU, OS or algo doesn't support AES-NI it falls back to SSE2.
I will send you PM in a few minutes with the compile logs for both CPU's - Phenom II X4 940 and Sempron 145
That is correct. You need all three to get the best performance. I haven't found specifically whether either of your CPUs has AES_NI but I suspect not. Try compiling with -march=core2. It seems to have worked for others.
Both AMDs are without AES-NI support. -march=core2 works on both, but the compiled binary is about 15% slower than with -march=btver1. AFAIK it's related to the SSE implementations on AMD CPUs. So with -march=btver1, which is AMD specific, I got the best results. I tried several -march options, starting with core2, and measured the hashrate on the miner and the pool to choose the best one for me.
P.S. Logs are on the way
|
|
|
|
joblo (OP)
April 18, 2016, 09:19:46 PM |
|
Following the topic, I get the impression that the new changes detect CPU, OS and algo (for AES-NI support), and if the CPU, OS or algo doesn't support AES-NI it falls back to SSE2.
I will send you PM in a few minutes with the compile logs for both CPU's - Phenom II X4 940 and Sempron 145
That is correct. You need all three to get the best performance. I haven't found specifically whether either of your CPUs has AES_NI but I suspect not. Try compiling with -march=core2. It seems to have worked for others.
Both AMDs are without AES-NI support. -march=core2 works on both, but the compiled binary is about 15% slower than with -march=btver1. AFAIK it's related to the SSE implementations on AMD CPUs. So with -march=btver1, which is AMD specific, I got the best results. I tried several -march options, starting with core2, and measured the hashrate on the miner and the pool to choose the best one for me.
P.S. Logs are on the way
The logs you sent had no errors, but that's ok, I found the nugget in the first line of the previous compile session:
make[2]: *** [algo/echo/aes_ni/cpuminer-hash.o] Error 1
It's clear it was trying to compile AES code. Also, just to confirm, did you successfully compile without the -DNO_AES_NI flag?
I've drafted the following note to be added to the README file:
|
|
|
|
th3.r00t
|
 |
April 18, 2016, 09:31:48 PM |
|
Also, just to confirm, did you successfully compile without the -DNO_AES_NI flag?
I compiled successfully a few times on each CPU with this configure:
./configure CFLAGS="-DNO_AES_NI -O3 -march=btver1" --with-curl --with-crypto
The build.sh error at the end is this:
In file included from algo/echo/aes_ni/hash.c:19:0:
algo/echo/aes_ni/hash.c: At top level:
./miner.h:479:20: warning: ‘algo_names’ defined but not used [-Wunused-variable]
 static const char *algo_names[] = {
                    ^
In file included from algo/echo/aes_ni/hash.c:20:0:
algo/echo/aes_ni/hash_api.h:47:23: warning: ‘initial_echo512_ctx’ defined but not used [-Wunused-variable]
 static hashState_echo initial_echo512_ctx =
                       ^
make[2]: *** [algo/echo/aes_ni/cpuminer-hash.o] Error 1
make[2]: *** Waiting for unfinished jobs....
mv -f algo/groestl/.deps/cpuminer-groestl.Tpo algo/groestl/.deps/cpuminer-groestl.Po
In file included from algo/groestl/myr-groestl.c:1:0:
./miner.h:479:20: warning: ‘algo_names’ defined but not used [-Wunused-variable]
 static const char *algo_names[] = {
                    ^
mv -f algo/.deps/cpuminer-fresh.Tpo algo/.deps/cpuminer-fresh.Po
mv -f algo/groestl/.deps/cpuminer-myr-groestl.Tpo algo/groestl/.deps/cpuminer-myr-groestl.Po
make[2]: Leaving directory `/home/*****/cpuminer-opt-3.1.15-phenom'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/*****/cpuminer-opt-3.1.15-phenom'
make: *** [all] Error 2
strip: 'cpuminer': No such file
And this one:
algo/echo/aes_ni/hash.c:385:4: note: in expansion of macro ‘TRANSFORM’
    TRANSFORM(_state[i][j], _k_opt, t1, t2);
    ^
In file included from algo/echo/aes_ni/hash.c:19:0:
algo/echo/aes_ni/hash.c: At top level:
./miner.h:479:20: warning: ‘algo_names’ defined but not used [-Wunused-variable]
 static const char *algo_names[] = {
                    ^
In file included from algo/echo/aes_ni/hash.c:20:0:
algo/echo/aes_ni/hash_api.h:47:23: warning: ‘initial_echo512_ctx’ defined but not used [-Wunused-variable]
 static hashState_echo initial_echo512_ctx =
                       ^
mv -f algo/argon2/ar2/.deps/cpuminer-ar2-scrypt-jane.Tpo algo/argon2/ar2/.deps/cpuminer-ar2-scrypt-jane.Po
gcc -std=gnu99 -DHAVE_CONFIG_H -I. -Iyes/include -Iyes/include -fno-strict-aliasing -I. -Iyes/include -Iyes/include -Wno-pointer-sign -Wno-pointer-to-int-cast -O3 -march=native -Wall -Iyes/include -Iyes/include -MT algo/groestl/sse2/cpuminer-grso-asm.o -MD -MP -MF algo/groestl/sse2/.deps/cpuminer-grso-asm.Tpo -c -o algo/groestl/sse2/cpuminer-grso-asm.o `test -f 'algo/groestl/sse2/grso-asm.c' || echo './'`algo/groestl/sse2/grso-asm.c
make[2]: *** [algo/echo/aes_ni/cpuminer-hash.o] Error 1
make[2]: *** Waiting for unfinished jobs....
mv -f algo/groestl/.deps/cpuminer-myr-groestl.Tpo algo/groestl/.deps/cpuminer-myr-groestl.Po
mv -f algo/groestl/sse2/.deps/cpuminer-grso.Tpo algo/groestl/sse2/.deps/cpuminer-grso.Po
mv -f algo/groestl/sse2/.deps/cpuminer-grso-asm.Tpo algo/groestl/sse2/.deps/cpuminer-grso-asm.Po
make[2]: Leaving directory `/home/*****/cpuminer-opt-3.1.15-sempron'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/*****/cpuminer-opt-3.1.15-sempron'
make: *** [all] Error 2
strip: 'cpuminer': No such file
|
|
|
|
joblo (OP)
April 18, 2016, 09:38:19 PM |
|
Also, just to confirm, did you successfully compile without the -DNO_AES_NI flag?
I compiled successfully a few times on each CPU with this configure:
./configure CFLAGS="-DNO_AES_NI -O3 -march=btver1" --with-curl --with-crypto
Please try with
./configure CFLAGS="-O3 -march=btver1" --with-curl --with-crypto
|
|
|
|
|