Bitcoin Forum
November 15, 2024, 04:09:53 PM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 3 [4] 5 »  All
  Print  
Author Topic: 4 hashes parallel on SSE2 CPUs for 0.3.6  (Read 22049 times)
impossible7
Newbie
*
Offline Offline

Activity: 18
Merit: 0


View Profile
August 06, 2010, 09:00:52 AM
 #61

Here's the output on stderr:
Code:
bitcoind: main.cpp:2741: void BitcoinMiner(): Assertion `("break caught by CRITICAL_BLOCK!", !fcriticalblockonce)' failed.
tcatm (OP)
Sr. Member
****
qt
Offline Offline

Activity: 337
Merit: 285


View Profile
August 06, 2010, 10:59:18 AM
 #62

Oh, that's a part in the code my patch doesn't touch. You could try to remove line 2741 (CRITICAL_BLOCK(cs_main)).
impossible7
Newbie
*
Offline Offline

Activity: 18
Merit: 0


View Profile
August 06, 2010, 11:37:20 AM
Last edit: August 06, 2010, 11:53:17 AM by impossible7
 #63

CRITICAL_BLOCK is a macro that contains a for loop. The assertion failure indicates that break has been called inside the body of the loop. The only break statement in this block is in line 2762. In the original source file, there is no break statement in this critical block. I think you must remove lines 2759-2762. The is nothing like that in the original main.cpp.
tcatm (OP)
Sr. Member
****
qt
Offline Offline

Activity: 337
Merit: 285


View Profile
August 06, 2010, 11:54:04 AM
 #64

Thanks! That got probably mixed up when I patched the file with an older diff. I fixed the git: http://github.com/tcatm/bitcoin-cruncher
satoshi
Founder
Sr. Member
*
qt
Offline Offline

Activity: 364
Merit: 7248


View Profile
August 07, 2010, 09:16:01 PM
 #65

CRITICAL_BLOCK is a macro that contains a for loop. The assertion failure indicates that break has been called inside the body of the loop. The only break statement in this block is in line 2762. In the original source file, there is no break statement in this critical block. I think you must remove lines 2759-2762. The is nothing like that in the original main.cpp.
Sorry about that.  CRITICAL_BLOCK isn't perfect.  You have to be careful not to break or continue out of it.  There's an assert that catches and warns about break.  I can be criticized for using it, but the syntax would be so much more bloated and error prone without it.

Is there a chance the SSE2 code is slow on Intel because of some quirk that could be worked around?  For instance, if something works but is slow if it's not aligned, or thrashing the cache, or one type of instruction that's really slow?  I'm not sure how available it is, but I think Intel used to have a profiler for profiling on a per instruction level.  I guess if tcatm doesn't have a system with the slow processor to test with, there's not much hope.  But it would be really nice if this was working on most CPUs.
impossible7
Newbie
*
Offline Offline

Activity: 18
Merit: 0


View Profile
August 07, 2010, 10:51:07 PM
 #66

I can confirm that the patch now works just fine. I just generated my first 50 BTC with it. And since this patch doubles the speed I think it's only fair if I donated half of that to tcatm.
tcatm (OP)
Sr. Member
****
qt
Offline Offline

Activity: 337
Merit: 285


View Profile
August 07, 2010, 10:59:55 PM
 #67

I can confirm that the patch now works just fine. I just generated my first 50 BTC with it. And since this patch doubles the speed I think it's only fair if I donated half of that to tcatm.
That's nice to hear. Thanks for the donation and thanks to everyone else who donated!
aceat64
Full Member
***
Offline Offline

Activity: 307
Merit: 102



View Profile
August 08, 2010, 06:37:13 AM
 #68

I just tried again with the latest version from your github. I'm still seeing a drop in performance compared to the vanilla source.

My system went from ~7100 to ~4200.

This particular system has dual Intel Xeon Quad-Core CPUs (E5335) @ 2.00GHz.
tcatm (OP)
Sr. Member
****
qt
Offline Offline

Activity: 337
Merit: 285


View Profile
August 08, 2010, 11:52:53 AM
 #69

It seems to be like this: everything before Core2 will be slower, everything starting with Core2 is faster. Can anyone test the code on an older AMD64? I know there was a change in the way SSE2 instructions are executed in recent architectures.
nimnul
Sr. Member
****
Offline Offline

Activity: 252
Merit: 250


View Profile WWW
August 12, 2010, 12:18:23 PM
 #70

Can we implement a speed test, so different hashing engines are tried and the fastest is chosen?

nelisky
Legendary
*
Offline Offline

Activity: 1540
Merit: 1002


View Profile
August 12, 2010, 12:57:58 PM
 #71

It seems to be like this: everything before Core2 will be slower, everything starting with Core2 is faster. Can anyone test the code on an older AMD64? I know there was a change in the way SSE2 instructions are executed in recent architectures.

My Core2Quad (Q6600) slowed down 50%, my i5 improved ~200%, thus I don't think what you state is accurate. Maybe starting at some specific Core2?
satoshi
Founder
Sr. Member
*
qt
Offline Offline

Activity: 364
Merit: 7248


View Profile
August 12, 2010, 10:07:23 PM
Last edit: August 13, 2010, 02:53:16 AM by satoshi
 #72

That big of a difference in speed, by a factor of 4 or 6, feels like it's likely to be some quirky weak spot or instruction that the old chip is slow with.  Unless it's a touted feature of the i5 that they made SSE2 six times faster.

A quick summary:
Xeon Quad        41% slower
Core 2 Duo        55% slower
Core 2 Duo        same (vess)
Core 2 Quad      50% slower
Core i5            200% faster (nelisky)
Core i5            100% faster (vess)
AMD Opteron    105% faster

aceat64:
My system went from ~7100 to ~4200.
This particular system has dual Intel Xeon Quad-Core CPUs (E5335) @ 2.00GHz.

impossible7:
on an Intel Core 2 Duo T7300 running x86_64 linux it was 55% slower compared to the stock version (r121)

nelisky:
My Core2Quad (Q6600) slowed down 50%,
my i5 improved ~200%,

impossible7:
on an AMD Opteron 2374 HE running x86_64 linux I got a 105% improvement (!)
vess
Full Member
***
Offline Offline

Activity: 141
Merit: 100



View Profile WWW
August 12, 2010, 10:09:50 PM
 #73

My core i5 doubled in speed. My Core 2 Duo is the same speed.

I'm the CEO of CoinLab (www.coinlab.com) and the Executive Director of the Bitcoin Foundation, I will identify if I'm speaking for myself or one of the organizations when I post from this account.
tcatm (OP)
Sr. Member
****
qt
Offline Offline

Activity: 337
Merit: 285


View Profile
August 13, 2010, 12:42:47 AM
 #74

Would be interesting to try it out on older AMD64. There's been a change that would explain it there:
http://developer.amd.com/documentation/articles/pages/682007171.aspx

Maybe Intel did something similiar without announcing it?
Cheater
Newbie
*
Offline Offline

Activity: 13
Merit: 0


View Profile
August 13, 2010, 06:27:23 AM
 #75

I'll just pitch in that Phenom and Phenom II processors doubled (roughly).
No difference between the two that I can tell.

Sorry I dont have anything older than Phenoms available right now.
Might be able to access a old X2 in a week.
sgtstein
Member
**
Offline Offline

Activity: 61
Merit: 10


View Profile
August 13, 2010, 09:10:33 PM
 #76

Just a question for whoever, trying to wrap up the information in this thread.

Does this:
  • 1. Work on 32-bit?
  • 2. Patch the SVN (r130 as of current) or Git?
  • 3. Compile on CentOS?

If anyone has any answers I would greatly appreciate them.
tcatm (OP)
Sr. Member
****
qt
Offline Offline

Activity: 337
Merit: 285


View Profile
August 13, 2010, 09:27:14 PM
 #77

1. Does not work on 32-bit (though that's not a problem with the algorithm).
2. Patch is against older SVN. There's a git repo at http://github.com/tcatm/bitcoin-cruncher
3. Compiles on every 64bit Linux.

It's not intended as a replacement for a standard client but for a dedicated bitcoinminer box. I'm planning a pluggable bitcoinminer someday. But at current difficulty it's easier to work for bitcoins than finding faster ways for mining.
NewLibertyStandard
Sr. Member
****
Offline Offline

Activity: 252
Merit: 268



View Profile WWW
August 13, 2010, 10:27:25 PM
 #78

I would really like to have this feature included in an official build  sometimes soon along with an internal speed test to determine which algorithm to use. You can always remove the speed test later once you figure out how to determine whether it will be faster or slower without running the speed test.

Treazant: A Fullever Rewarding Bitcoin - Backup Your Wallet TODAY to Double Your Money! - Dual Currency Donation Address: 1Dnvwj3hAGSwFPMnkJZvi3KnaqksRPa74p
sgtstein
Member
**
Offline Offline

Activity: 61
Merit: 10


View Profile
August 13, 2010, 11:17:51 PM
 #79

1. Does not work on 32-bit (though that's not a problem with the algorithm).
2. Patch is against older SVN. There's a git repo at http://github.com/tcatm/bitcoin-cruncher
3. Compiles on every 64bit Linux.

It's not intended as a replacement for a standard client but for a dedicated bitcoinminer box. I'm planning a pluggable bitcoinminer someday. But at current difficulty it's easier to work for bitcoins than finding faster ways for mining.

1. Do we know why it doesn't work on 32bit? Is is it because it's using 128bits and if so, would it help if we dropped it to 64?

2. Thank you, I will look into implementing it on my 64bit systems.
3. Excellent to hear. I'm looking forward to using it.

I was planning on using it on a PE2650 dual proc Xeon @3.2GHz w/HT. I would really like to get this figured out to utilize that system. I am planning one as well. At current difficulty I would agree, except when the system needs to be run anyway and latency isn't an issue.
satoshi
Founder
Sr. Member
*
qt
Offline Offline

Activity: 364
Merit: 7248


View Profile
August 14, 2010, 12:49:18 AM
 #80

MinGW on Windows has trouble compiling it:

g++ -c -mthreads -O2 -w -Wno-invalid-offsetof -Wformat -g -D__WXDEBUG__ -DWIN32 -D__WXMSW__ -D_WINDOWS -DNOPCH -I"/boost" -I"/db/build_unix" -I"/openssl/include" -I"/wxwidgets/lib/gcc_lib/mswud" -I"/wxwidgets/include" -msse2 -O3 -o obj/sha256.o sha256.cpp

sha256.cpp: In function `long long int __vector__ Ch(long long int __vector__, long long int __vector__, long long int __vector__)':
sha256.cpp:31: internal compiler error: in perform_integral_promotions, at cp/typeck.c:1454
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://www.mingw.org/bugs.shtml> for instructions.
make: *** [obj/sha256.o] Error 1
Pages: « 1 2 3 [4] 5 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!