th3.r00t
|
|
April 18, 2016, 09:45:02 PM |
|
Quote: Please try with ./configure CFLAGS="-O3 -march=btver1" --with-curl --with-crypto
Build successful on both AMD CPUs without -DNO_AES_NI.
|
|
|
|
joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
April 18, 2016, 09:46:06 PM |
|
Quote: Please try with ./configure CFLAGS="-O3 -march=btver1" --with-curl --with-crypto
Quote: Build successful on both AMD CPUs without -DNO_AES_NI.
Excellent, thanks.
|
|
|
|
hmage
Member
Offline
Activity: 83
Merit: 10
|
|
April 19, 2016, 08:33:21 PM |
|
Quote: Please try with ./configure CFLAGS="-O3 -march=btver1" --with-curl --with-crypto
Quote: Build successful on both AMD CPUs without -DNO_AES_NI.
So, what was the setup that was giving the errors, if it works both with and without NO_AES_NI? What was the version of gcc?
|
|
|
|
joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
April 20, 2016, 02:09:05 AM |
|
Quote: Please try with ./configure CFLAGS="-O3 -march=btver1" --with-curl --with-crypto
Quote: Build successful on both AMD CPUs without -DNO_AES_NI.
Quote: So, what was the setup that was giving the errors, if it works both with and without NO_AES_NI? What was the version of gcc?
I don't understand your question. If you mean my request above to test, it was just to confirm that it works on real non-AES hardware, which I don't have. If you are referring to the compile problems in 3.1.14, they were caused by a misunderstanding of the meaning of __AES__. If you're referring to your problems with -march=native, I have no idea. gcc v4.8.4. Did any of these answer your question?
|
|
|
|
th3.r00t
|
|
April 20, 2016, 08:26:09 AM |
|
Quote: Please try with ./configure CFLAGS="-O3 -march=btver1" --with-curl --with-crypto
Quote: Build successful on both AMD CPUs without -DNO_AES_NI.
Quote: So, what was the setup that was giving the errors, if it works both with and without NO_AES_NI? What was the version of gcc?
Default Ubuntu Server GCC
3.19.0-58-generic kernel
AMD Phenom II X4 940 @ 3.6GHz with 4GB RAM
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.8.4-2ubuntu1~14.04.1' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.8 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libmudflap --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.1)
|
|
|
|
hmage
Member
Offline
Activity: 83
Merit: 10
|
|
April 20, 2016, 06:08:20 PM |
|
Quote: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.1)
Can you run this command on your AMD processors and show me the output?
gcc -march=native -Q --help=target | fgrep march
Below are examples of what the output should look like.
Output on my core2:
hmage@wraith:~$ gcc -march=native -Q --help=target | fgrep march
-march= core2
On haswell:
hmage@dhmd:~$ gcc -march=native -Q --help=target | fgrep march
-march= haswell
On ivybridge:
hmage@education:~$ gcc -march=native -Q --help=target | fgrep march
-march= ivybridge
The list of architectures supported by GCC 4.8.4 is here: https://gcc.gnu.org/onlinedocs/gcc-4.8.4/gcc/i386-and-x86-64-Options.html
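For anyone who wants to collect this value from several machines, the command above can be wrapped in a small script. This is only a sketch: the function names parse_march and native_march are made up for illustration, and the parser assumes the value sits on a line that starts with "-march=", as in the sample outputs in this post.

```python
import subprocess


def parse_march(help_target_output):
    """Return the value on the `-march=` line of `gcc -Q --help=target` output.

    Returns None if no such line is found or the value is empty.
    """
    for line in help_target_output.splitlines():
        stripped = line.strip()
        # Only the option line itself starts with "-march="; other lines
        # that merely mention the option are skipped.
        if stripped.startswith("-march="):
            value = stripped[len("-march="):].strip()
            return value or None
    return None


def native_march(gcc="gcc"):
    """Run gcc and report what -march=native resolves to on this machine."""
    result = subprocess.run(
        [gcc, "-march=native", "-Q", "--help=target"],
        capture_output=True, text=True, check=True,
    )
    return parse_march(result.stdout)

# Usage (needs gcc on the PATH): print(native_march())
```

The parsing is kept separate from the gcc call so it can be checked against pasted output without having the compiler installed.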
|
|
|
|
th3.r00t
|
|
April 20, 2016, 07:17:53 PM |
|
Quote: Can you run this command on your AMD processors and show me the output? gcc -march=native -Q --help=target | fgrep march
Here you are:
root@beast:~$ gcc -march=native -Q --help=target | fgrep march
-march= amdfam10
|
|
|
|
sakuleo
Jr. Member
Offline
Activity: 324
Merit: 1
|
|
April 20, 2016, 07:22:27 PM |
|
Windows support?
|
|
|
|
joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
April 20, 2016, 07:30:05 PM |
|
Quote: Can you run this command on your AMD processors and show me the output? gcc -march=native -Q --help=target | fgrep march
Quote: Here you are: root@beast:~$ gcc -march=native -Q --help=target | fgrep march -march= amdfam10
This is curious. I presume that shows which arch is used by native. On my skylake I get core2-avx and on my haswell I get corei7-avx. configure fails with -march=skylake on my skylake.
|
|
|
|
joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
April 20, 2016, 07:30:30 PM |
|
Quote: Windows support?
No time soon.
|
|
|
|
th3.r00t
|
|
April 20, 2016, 07:38:19 PM |
|
Quote: Can you run this command on your AMD processors and show me the output? gcc -march=native -Q --help=target | fgrep march
Quote: Here you are: root@beast:~$ gcc -march=native -Q --help=target | fgrep march -march= amdfam10
Quote: This is curious. I presume that shows which arch is used by native. On my skylake I get core2-avx and on my haswell I get corei7-avx. configure fails with -march=skylake on my skylake.
Yeah! This got me curious enough to do some tests too.
Intel Core i7-4790K CPU @ 4.40GHz
root@storm:~$ gcc -march=native -Q --help=target | fgrep march
-march= core-avx2
AMD Sempron 145
root@wolverine:~$ gcc -march=native -Q --help=target | fgrep march
-march= amdfam10
|
|
|
|
joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
April 20, 2016, 07:55:38 PM |
|
Quote: Can you run this command on your AMD processors and show me the output? gcc -march=native -Q --help=target | fgrep march
Quote: Here you are: root@beast:~$ gcc -march=native -Q --help=target | fgrep march -march= amdfam10
Quote: This is curious. I presume that shows which arch is used by native. On my skylake I get core2-avx and on my sandy bridge (not haswell) I get corei7-avx. configure fails with -march=skylake on my skylake.
Quote: Yeah! This got me curious enough to do some tests too. Intel Core i7-4790K CPU @ 4.40GHz root@storm:~$ gcc -march=native -Q --help=target | fgrep march -march= core-avx2 AMD Sempron 145 root@wolverine:~$ gcc -march=native -Q --help=target | fgrep march -march= amdfam10
Correction, my sandy bridge shows corei7-avx.
|
|
|
|
joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
April 20, 2016, 08:27:53 PM |
|
|
|
|
|
hmage
Member
Offline
Activity: 83
Merit: 10
|
|
April 21, 2016, 01:50:44 PM Last edit: April 21, 2016, 02:01:00 PM by hmage |
|
Quote: Can you run this command on your AMD processors and show me the output? gcc -march=native -Q --help=target | fgrep march
Quote: Here you are: root@beast:~$ gcc -march=native -Q --help=target | fgrep march -march= amdfam10
Then your build should have worked out of the box with -march=native. amdfam10 doesn't have AES support, so gcc won't define the __AES__ macro. Can you try building it with -march=native again?
Quote: This is curious. I presume that shows which arch is used by native. On my skylake I get core2-avx and on my haswell I get corei7-avx. configure fails with -march=skylake on my skylake.
Yes, this shows which arch gcc uses for -march=native. In your case skylake is too new for your gcc: 4.8.4 doesn't know about it, so it should choose the closest match with the most features enabled. There's no 'core2-avx' in the GCC 4.8.4 manual; maybe you meant 'core-avx2'? core-avx2 defines __AES__ automatically. https://gcc.gnu.org/onlinedocs/gcc-4.8.4/gcc/i386-and-x86-64-Options.html
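Whether gcc defines __AES__ for a given -march value can also be checked directly, since `gcc -dM -E` prints the macros the compiler predefines. The following is a rough sketch rather than anything taken from cpuminer-opt: the helper names defined_macros and gcc_defines are invented here, and reading from /dev/null assumes a Unix-like system.

```python
import subprocess


def defined_macros(predefs_text):
    """Parse `#define NAME VALUE` lines (as printed by `gcc -dM -E`) into a dict."""
    macros = {}
    for line in predefs_text.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] == "#define":
            # A macro defined without a value is stored as an empty string.
            macros[parts[1]] = parts[2] if len(parts) == 3 else ""
    return macros


def gcc_defines(macro, march, gcc="gcc"):
    """Ask gcc which macros it predefines for -march=<march> and look for one."""
    result = subprocess.run(
        [gcc, "-march=" + march, "-dM", "-E", "-x", "c", "/dev/null"],
        capture_output=True, text=True, check=True,
    )
    return macro in defined_macros(result.stdout)

# Usage (needs gcc on the PATH):
#   gcc_defines("__AES__", "amdfam10")
#   gcc_defines("__AES__", "core-avx2")
```

Going by the explanation above, one would expect the first call to report no __AES__ and the second to report it, but that depends on the gcc version installed.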
|
|
|
|
joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
April 21, 2016, 03:24:24 PM |
|
Quote: Can you run this command on your AMD processors and show me the output? gcc -march=native -Q --help=target | fgrep march
Quote: Here you are: root@beast:~$ gcc -march=native -Q --help=target | fgrep march -march= amdfam10
Quote: Then your build should have worked out of the box with -march=native. amdfam10 doesn't have AES support, so gcc won't define the __AES__ macro. Can you try building it with -march=native again?
Quote: This is curious. I presume that shows which arch is used by native. On my skylake I get core-avx2 (not core2-avx) and on my sandy bridge (not haswell) I get corei7-avx. configure fails with -march=skylake on my skylake.
Quote: Yes, this shows which arch gcc uses for -march=native. In your case skylake is too new for your gcc: 4.8.4 doesn't know about it, so it should choose the closest match with the most features enabled. There's no 'core2-avx' in the GCC 4.8.4 manual; maybe you meant 'core-avx2'? core-avx2 defines __AES__ automatically. https://gcc.gnu.org/onlinedocs/gcc-4.8.4/gcc/i386-and-x86-64-Options.html
Hmage, you're a good teacher and you know your stuff. I'm learning a lot. Core-avx2 correction noted.
|
|
|
|
AlexGR
Legendary
Offline
Activity: 1708
Merit: 1049
|
|
April 21, 2016, 06:48:56 PM |
|
May I propose a different approach for much faster mining?
Currently most, if not all, CPU-mineable coins are cripple-mined. The reason is simple: under-utilization of the SIMD nature of the SSE and AVX instruction sets. SSE and AVX instructions are used in SISD fashion (single instruction, single data, instead of single instruction, multiple data), meaning they process one batch of data rather than two or more.
Right now hashing goes like this: the main mining routine sends one output to each hash, where it is subjected to a series of SERIAL transmutations/permutations, and in the end the hash returns that data to the miner (sometimes to send it on to the next hash). This serial process doesn't allow for much SIMD utilization.
What should be done instead is that the miner program should issue 2-4 hash candidates to the hashing routines. The hashing routines should accept 2-4 inputs (instead of 1) and return 2-4 outputs. In this way the process would be parallelized, and SIMD utilization (packed processing of similar instructions) would result in much faster processing.
Now this might require a lot of recoding. Alternatively, one could adjust the C code for use with a special compiler that runs multiple instances of serial data crunching in order to process them in "packs" with SIMD or "packed" instructions, and then let the compiler do all the packing. Performance benefits of such an approach are shown here: http://ispc.github.io/perf.html
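The change in interface described in this post can be illustrated with a toy example. Everything here is hypothetical: toy_round is an invented mixing step, not any real hash, and Python does not issue SSE or AVX instructions. The sketch only shows the shape of the idea, where each round is applied to all candidates ("lanes") before moving on to the next round, which is the step a packed instruction would perform in one go.

```python
MASK32 = 0xFFFFFFFF

# Arbitrary constants for the toy rounds; they carry no cryptographic meaning.
ROUND_KEYS = [0x9E3779B9, 0x85EBCA6B, 0xC2B2AE35, 0x27D4EB2F]


def rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def toy_round(state, key):
    """One invented mixing step on a 32-bit state (NOT a real hash round)."""
    return rotl32((state + key) & MASK32, 7) ^ key


def toy_hash_serial(nonce):
    """Serial style: one input goes through every round on its own."""
    state = nonce & MASK32
    for key in ROUND_KEYS:
        state = toy_round(state, key)
    return state


def toy_hash_batch(nonces):
    """Batched style: several inputs advance through each round together."""
    lanes = [n & MASK32 for n in nonces]
    for key in ROUND_KEYS:
        # The same operation hits every lane; with SIMD this list
        # comprehension would be a single packed instruction.
        lanes = [toy_round(state, key) for state in lanes]
    return lanes
```

Both styles return the same values; only the order of the work differs, which is what makes the batched layout a candidate for packed instructions or for a compiler such as the one linked above.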
|
|
|
|
joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
April 21, 2016, 07:23:33 PM |
|
Quote: May I propose a different approach for much faster mining?
That's a fascinating idea, but I don't think it will get the visibility here that it deserves. Pooler and TPruvot are the two main guys for CPU mining, although TPruvot is focused more on other projects at the moment. Both have active threads in this forum. I suggest you present your idea to them in case they, or their followers, want to take on the challenge. It's beyond my skill level.
|
|
|
|
th3.r00t
|
|
April 21, 2016, 07:32:34 PM |
|
Quote: Can you run this command on your AMD processors and show me the output? gcc -march=native -Q --help=target | fgrep march
Quote: Here you are: root@beast:~$ gcc -march=native -Q --help=target | fgrep march -march= amdfam10
Quote: Then your build should have worked out of the box with -march=native. amdfam10 doesn't have AES support, so gcc won't define the __AES__ macro. Can you try building it with -march=native again?
Quote: This is curious. I presume that shows which arch is used by native. On my skylake I get core2-avx and on my haswell I get corei7-avx. configure fails with -march=skylake on my skylake.
Quote: Yes, this shows which arch gcc uses for -march=native. In your case skylake is too new for your gcc: 4.8.4 doesn't know about it, so it should choose the closest match with the most features enabled. There's no 'core2-avx' in the GCC 4.8.4 manual; maybe you meant 'core-avx2'? core-avx2 defines __AES__ automatically. https://gcc.gnu.org/onlinedocs/gcc-4.8.4/gcc/i386-and-x86-64-Options.html
Nope... ./build.sh still fails on AMD. Maybe I should try something else?
|
|
|
|
pallas
Legendary
Offline
Activity: 2716
Merit: 1094
Black Belt Developer
|
|
April 21, 2016, 07:37:59 PM |
|
It's not a new idea. It was used back in the GPU bitcoin mining days to get better speed on AMD VLIW cards. It's easy to adapt the miner itself to process multiple nonces per thread; I'm not sure how much work is needed on the algos themselves. Maybe we could run a test with a simple algo like blake. But I'm not the man for it, because I'm not proficient in those CPU instruction extensions.
|
|
|
|
th3.r00t
|
|
April 21, 2016, 07:40:08 PM |
|
********** cpuminer-opt 3.1.16 ***********
A CPU miner with multi algo support and optimized for CPUs with AES_NI extension.
BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT
Forked from TPruvot's cpuminer-multi with credits to Lucas Jones, elmad, palmd, djm34, pooler, ig0tik3d, Wolf0 and Jeff Garzik.
Checking CPU capatibility...
AMD Phenom(tm) II X4 940 Processor
CPU arch supports AES_NI...NO.
CPU arch supports SSE2.....YES.
SW built for SSE2..........NO.
Incompatible SW build, rebuild with "-march=native"
Why?
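To cross-check the "CPU arch supports" lines independently of how the miner was built, the feature flags the kernel reports can be read from /proc/cpuinfo on Linux. This is a sketch under assumptions and not the check cpuminer-opt itself performs: the function names are invented, the flag list in the usage comment is an illustrative sample rather than real output from the Phenom II, and the "flags" key applies to x86 Linux.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first `flags` line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def feature_report(cpuinfo_text, wanted=("aes", "sse2")):
    """Map each wanted flag name to True/False depending on whether the CPU lists it."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (Linux only)."""
    with open(path) as handle:
        return handle.read()

# Usage on Linux: print(feature_report(read_cpuinfo()))
# With an illustrative sample such as "flags : fpu vme sse sse2 sse4a",
# feature_report gives {"aes": False, "sse2": True}.
```

This only shows what the hardware advertises; it says nothing about which instruction sets the binary was compiled for, which is the part the miner is complaining about.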
|
|
|
|
|