joblo (OP)
Legendary
Offline
Activity: 1470
Merit: 1114
|
 |
June 06, 2016, 01:25:11 PM |
|
I'm volunteering for testing. I can test on AMD (SSE2, no AVX) on both Windows and Linux, as well as on a Core i7-4790K (AVX) on Linux and an AMD FX-7600P (AVX) on Windows.
Thanks, check your PM.
|
|
|
|
|
|
joblo (OP)
June 06, 2016, 02:55:00 PM |
|
Cryptonight update.
As previously mentioned cryptonight is broken on Windows from v3.3 to present. Prior to that Windows was not supported. Linux is not affected.
I have localized the problem to a simple function call that goes bad and crashes the miner. The line before the call is executed but the first line of the called function is never reached.
This is a simple call to a constant function, nothing fancy, no algo-gate function pointers, just a basic call to the hash function. It is the exact same code that runs fine on Linux. There are no OS hooks anywhere to be found.
It was first discovered in the CMB prebuilt binaries but I can easily reproduce it with a different compiler version.
It's difficult to wrap my head around this kind of problem, especially when it's OS specific. The crash suggests the function address was invalid. It seems either the compiler messed up linking the function address or it got overwritten after compilation. C/C++ always scares me with the lack of built-in buffer overflow protection. I'm not used to working without a net. But even if there is a buffer overflow corrupting a function, why only on Windows?
I'll review all the data defined in that file for anything suspicious but after that I'll be pretty stuck.
|
|
|
|
Roolieman
Newbie
Offline
Activity: 53
Merit: 0
|
 |
June 06, 2016, 03:10:34 PM |
|
Great job Joblo,
v3.3.5 is working like a charm on Gainestown Xeons with the cpuminer-sse2.exe build.
Checking CPU capatibility...
Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
CPU features: SSE2 AVX AVX2
SW built on Jun 5 2016 with GCC 5.3.0
Build features: SSE2
Algo features: SSE2 AES
AES not available, starting mining with SSE2 optimizations...
|
|
|
|
joblo (OP)
June 06, 2016, 03:13:51 PM |
|
Thanks for testing. The erroneous display of AVX and AVX2 support for the CPU should be fixed in the next release. It's being tested now. Everything else looks good.
|
|
|
|
Dabs
Legendary
Offline
Activity: 3416
Merit: 1912
The Concierge of Crypto
|
 |
June 06, 2016, 03:36:06 PM |
|
E5640 here. Windows 10 64 bit in a VM. Still trying to figure out Debian, but hmage's simple instructions should help. If you have a compiled binary exe, I'll run it. If I need to compile something, I'll try.
|
|
|
|
th3.r00t
|
 |
June 06, 2016, 04:02:24 PM |
|
I'm volunteering for testing. I can test on AMD (SSE2, no AVX) on both Windows and Linux, as well as on a Core i7-4790K (AVX) on Linux and an AMD FX-7600P (AVX) on Windows.
Thanks, check your PM.
Reports from four machines sent via PM.
|
|
|
|
joblo (OP)
June 06, 2016, 04:30:34 PM |
|
I'm volunteering for testing. I can test on AMD (SSE2, no AVX) on both Windows and Linux, as well as on a Core i7-4790K (AVX) on Linux and an AMD FX-7600P (AVX) on Windows.
Thanks, check your PM.
Reports from four machines sent via PM.
Awesome report. It had everything I needed. Everything looks good from the CPU capabilities check; I saw no inconsistencies. The only issue I found was the mapping of -march=native on non-AES AMD CPUs. As a long-known issue, it is already documented in README.md. I'll review your data in more detail and may update the documented workaround based on the info you provided.
I think this gives me enough confidence in the CPU capabilities check to release it, but I'll wait a while to give myself a chance to look deeper into the cryptonight problem.
|
|
|
|
joblo (OP)
June 06, 2016, 04:40:21 PM |
|
Cryptonight update.
As previously mentioned cryptonight is broken on Windows from v3.3 to present. Prior to that Windows was not supported. Linux is not affected.
I have localized the problem to a simple function call that goes bad and crashes the miner. The line before the call is executed but the first line of the called function is never reached.
This is a simple call to a constant function, nothing fancy, no algo-gate function pointers, just a basic call to the hash function. It is the exact same code that runs fine on Linux. There are no OS hooks anywhere to be found.
It was first discovered in the CMB prebuilt binaries but I can easily reproduce it with a different compiler version.
It's difficult to wrap my head around this kind of problem, especially when it's OS specific. The crash suggests the function address was invalid. It seems either the compiler messed up linking the function address or it got overwritten after compilation. C/C++ always scares me with the lack of built-in buffer overflow protection. I'm not used to working without a net. But even if there is a buffer overflow corrupting a function, why only on Windows?
I'll review all the data defined in that file for anything suspicious but after that I'll be pretty stuck.
I LOVE that about C/C++. I've had religious debates with C/C++ proponents. My key argument is just to look at all the buffer overflow exploits that have existed over the years and continue to exist. They never would have happened with array bounds checking and no mixing of a[] and *a.
|
|
|
|
hmage
Member

Offline
Activity: 83
Merit: 10
|
 |
June 06, 2016, 04:55:28 PM |
|
You can enable gcc's stack protection.
-fstack-protector
|
|
|
|
joblo (OP)
June 06, 2016, 05:29:27 PM |
|
You can enable gcc's stack protection.
-fstack-protector
Interesting, compile failed.
cpuminer-cpu-miner.o:cpu-miner.c:(.text+0x695a): undefined reference to `__stack_chk_fail'
cpuminer-cpu-miner.o:cpu-miner.c:(.text+0x6a8c): undefined reference to `__stack_chk_guard'
cpuminer-cpu-miner.o:cpu-miner.c:(.text+0x6c78): undefined reference to `__stack_chk_guard'
cpuminer-cpu-miner.o:cpu-miner.c:(.text+0x6d61): undefined reference to `__stack_chk_fail'
c:/msys/opt/windows_64/bin/../lib64/gcc/x86_64-w64-mingw32/4.8.3/../../../../x86_64-w64-mingw32/bin/ld.exe: cpuminer-cpu-miner.o: bad reloc address 0x0 in section `.pdata'
collect2.exe: error: ld returned 1 exit status
make[2]: *** [cpuminer.exe] Error 1
I don't know what this means but it does mention pdata, an argument to the function that is failing. Cryptonight is coded the same as every other algo and every algo hash function references pdata the same way.
|
|
|
|
hmage
June 06, 2016, 05:45:19 PM |
|
I don't know what this means but it does mention pdata, an argument to the function that is failing. Cryptonight is coded the same as every other algo and every algo hash function references pdata the same way.
It means that your build of gcc doesn't have the stack protection support library (the function __stack_chk_fail() and the guard variable __stack_chk_guard). It builds fine on Debian Linux though.
|
|
|
|
joblo (OP)
June 06, 2016, 06:46:19 PM |
|
I don't know what this means but it does mention pdata, an argument to the function that is failing. Cryptonight is coded the same as every other algo and every algo hash function references pdata the same way.
It means that your build of gcc doesn't have stack protection support library (functions __stack_chk_fail() and __stack_chk_guard()). It builds fine on debian linux though.
But the problem is only on Windows, oh well.
|
|
|
|
hmage
June 06, 2016, 06:47:43 PM |
|
But the problem is only on Windows, oh well.
msys comes with gdb, it should be able to catch the segfault, then you can inspect the registers, stack and variables. Compiling with "-O0 -g3" instead of "-O3" should help gdb give you more info.
|
|
|
|
joblo (OP)
June 06, 2016, 06:53:08 PM |
|
But the problem is only on Windows, oh well.
msys comes with gdb, it should be able to catch the segfault, then you can inspect the registers, stack and variables. Compiling with "-O0 -g3" instead of "-O3" should help gdb give you more info.
Tried that. With -O0 the compile fails in another algo: asm has impossible constraints.
|
|
|
|
joblo (OP)
June 07, 2016, 03:52:25 PM Last edit: June 07, 2016, 05:45:08 PM by joblo |
|
But the problem is only on Windows, oh well.
msys comes with gdb, it should be able to catch the segfault, then you can inspect the registers, stack and variables. Compiling with "-O0 -g3" instead of "-O3" should help gdb give you more info.
Tried that. Compile fails in another algo with -O0 asm has impossible constraints.
Use -ggdb3 and step through it.
Progress. If ___chkstk_ms is checking for stack overflow then I know what the problem is. Looking for solutions, it looks like a Makefile.am edit.
Edit: doubling the stack size spec in Makefile.am didn't work.
if HAVE_WINDOWS
#cpuminer_CFLAGS += -Wl,--stack,10485760
cpuminer_CFLAGS += -Wl,--stack,20971520
endif
Edit2: I couldn't find any documentation for --stack so I don't know what the makefile is doing. I found -fno-stack-limit but it still crashes at the same place. I know it's crashing in ___chkstk_ms and I assume that means a stack overflow. A corrupt stack pointer should not occur in compiled code.
From my experience the stack limit is fixed when the process is created. I'm not aware of any way to increase it after process creation, so there must be some overriding limit beyond the scope of the compiler. Most of the info I found deals with limiting the stack, not growing it. It seems infinite recursion is a bigger issue than legitimate large stack use.
|
|
|
|
joblo (OP)
June 07, 2016, 06:52:16 PM Last edit: June 07, 2016, 07:15:23 PM by joblo |
|
More cryptonight progress.
If I can't make the stack bigger, I'll use less of it.
I made the definition of ctx global instead of local and it doesn't crash. Just testing on a live pool to confirm.
If all goes well I should have it fixed by the end of the day.
Edit: My optimism was premature. Although the AES version doesn't crash it produces only rejects. And a non-AES compile still crashes.
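One way to picture the "use less stack" idea from this post is to keep the large context off the stack by allocating it on the heap. The following is only a minimal sketch under stated assumptions, not the cpuminer-opt code: big_ctx, big_ctx_create, big_ctx_destroy and toy_hash are hypothetical names, the 2 MiB scratchpad size is an assumption chosen to match the roughly 2 MB frame discussed in this thread, and the loop is a placeholder rather than cryptonight.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed scratchpad size (2 MiB); the real context layout may differ. */
#define SCRATCHPAD_BYTES ((size_t)2 * 1024 * 1024)

/* Hypothetical stand-in for a hash context that is too big for the stack. */
typedef struct {
    uint8_t scratchpad[SCRATCHPAD_BYTES];
    uint8_t state[200];
} big_ctx;

/* Allocate the context on the heap so the caller's stack frame stays small.
   Returns NULL if the allocation fails. */
big_ctx *big_ctx_create(void)
{
    return calloc(1, sizeof(big_ctx));
}

void big_ctx_destroy(big_ctx *ctx)
{
    free(ctx);
}

/* Placeholder "hash" that only touches the scratchpad; it is NOT cryptonight.
   The point is that the function takes a pointer to the context instead of
   declaring a multi-megabyte local variable. */
uint32_t toy_hash(big_ctx *ctx, const void *input, size_t len)
{
    assert(ctx != NULL && input != NULL);
    const uint8_t *in = input;
    uint32_t acc = 0;
    for (size_t i = 0; i < len; i++) {
        ctx->scratchpad[i % SCRATCHPAD_BYTES] = in[i];
        acc = acc * 31u + in[i];
    }
    return acc;
}
```

In a multi-threaded miner, each thread would typically create its own context once, reuse it for every hash and destroy it at shutdown. A single global context, by contrast, is shared by every thread that calls the hash function, so it needs either one instance per thread or locking.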
|
|
|
|
hmage
June 07, 2016, 07:57:26 PM |
|
For the future, gcc has a flag, -fstack-usage. When compiling, it generates *.su files that record how many bytes of stack each function needs. After adding that, recompiling cpuminer-opt on core2 and then printing all *.su files, sorted by size:
$ find . -iname '*.su' -print0 | xargs -0 cat |sort -k2n|tail|column -t
x17.c:212:6:x17hash_alt                    3904     static
cpu-miner.c:2689:5:main                    4144     static
x17.c:87:13:x17hash                        4256     static
hodl-wolf.c:28:5:scanhash_hodl_wolf        4304     static
scrypt.c:696:12:scanhash_scrypt            7680     dynamic,bounded
hmq1725.c:143:13:hmq1725hash               7744     static
scrypt.c:648:13:scrypt_1024_1_1_256_24way  9088     dynamic,bounded
m7mhash.c:195:5:scanhash_m7m_hash          12464    dynamic,bounded
api.c:511:13:api                           17136    dynamic,bounded
cryptonight.c:172:6:cryptonight_hash_ctx   2097648  static
You might want to increase the max stack size in the makefile to 3 MB or more. Setting it to 2 MB isn't enough because you need 2097648 bytes just for that function.
Edit2: I couldn't find any documentation for --stack so I don't know what makefile is doing.
"-Wl," means "pass this to the linker", which means ld, so you need to check the documentation of ld: http://linux.die.net/man/1/ld
On Windows, the stack size limit is specified in the binary. On Linux, the limit is set by the system administrator. On current Debian, the default is 8 MB.
I found -fno-stack-limit but it still crashes at the same place.
This is a different feature from the above. -fno-stack-limit is used to negate -fstack-limit-register/-fstack-limit-symbol, and by default that feature is not set.
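The arithmetic behind the stack size advice can be checked with a few lines of C. This is a hedged sketch: frame_fits and frame_overshoot are hypothetical helpers, and the headroom parameter is an assumed allowance for the frames of the callers, not a measured value. Only the 2097648 byte figure comes from the -fstack-usage output above.

```c
#include <assert.h>
#include <stddef.h>

#define MIB ((size_t)1024 * 1024)

/* Frame size reported by -fstack-usage for cryptonight_hash_ctx. */
#define CRYPTONIGHT_FRAME_BYTES ((size_t)2097648)

/* Returns 1 if a single frame plus an assumed headroom for the rest of the
   call chain fits in the given stack reserve, 0 otherwise. */
int frame_fits(size_t frame_bytes, size_t headroom_bytes,
               size_t stack_reserve_bytes)
{
    assert(stack_reserve_bytes > 0);
    return frame_bytes + headroom_bytes <= stack_reserve_bytes;
}

/* Number of bytes by which a frame exceeds the reserve, or 0 if it fits. */
size_t frame_overshoot(size_t frame_bytes, size_t stack_reserve_bytes)
{
    return frame_bytes > stack_reserve_bytes
               ? frame_bytes - stack_reserve_bytes
               : 0;
}
```

With these helpers, frame_overshoot(CRYPTONIGHT_FRAME_BYTES, 2 * MIB) is 496, so the frame alone is 496 bytes larger than a 2 MiB stack, while a 3 MiB reserve leaves about 1 MiB for everything else on the call chain. For comparison, the values 10485760 and 20971520 that appear in the Makefile.am snippet earlier in the thread are 10 MiB and 20 MiB.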
|
|
|
|
joblo (OP)
June 07, 2016, 08:53:14 PM |
|
You might want to increase max stack size in makefile to 3mb or more, setting it to 2MB isn't enough because you need 2097648 bytes just for that function.
Thanks for the info. I increased the stack size in Makefile.am to 3 MB but it made no difference. AES still produces rejects and non-AES still crashes.
There is apparently another problem with the AES version besides the stack overflow (that is a huge stack compared with the other algos), because I solved the overflow by reducing the local variables. So the situation now for AES is that the same code, with no superficial Windows hooks, works on Linux but produces rejects on Windows. By superficial I mean checks for Windows in the cryptonight code. There may be some low-level hooks in common code also used by other algos.
The core2 build still crashes after moving ctx to global and increasing the stack size to 3 MB in Makefile.am. It's either the same crash or a different one. I haven't followed up because my focus is on AES first.
|
|
|
|
|