Bitcoin Forum
Author Topic: Faster SHA-256, MSVC build  (Read 15606 times)
dkaparis (OP)
Newbie
*
Offline Offline

Activity: 53
Merit: 0


View Profile
July 18, 2010, 01:28:50 PM
Last edit: July 18, 2010, 09:35:52 PM by satoshi
 #1

I've managed to set up dependencies and build bitcoin with MS Visual C++ 2008 Express Edition. I'll give 2010 a try at some time.

There is a custom allocator class in serialize.h, secure_allocator, that fails to build when the non-debug runtime is selected. As I understand it, allocator classes require a template copy constructor. I've attached a small patch that solves the problem.
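To illustrate the template copy constructor mentioned above, here is a minimal sketch written against the modern (C++11 and later) allocator requirements. The name `zeroing_allocator` is hypothetical and is not the actual secure_allocator from serialize.h or the attached patch; the 2008-era code would also need the older `rebind` boilerplate.

```cpp
#include <cstddef>
#include <cstring>
#include <list>
#include <new>

// Hypothetical stand-in for secure_allocator: wipes memory before freeing it.
template <typename T>
struct zeroing_allocator {
    using value_type = T;

    zeroing_allocator() noexcept = default;

    // Converting ("template copy") constructor: lets a container build
    // zeroing_allocator<Node> from the zeroing_allocator<T> it was given.
    template <typename U>
    zeroing_allocator(const zeroing_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t n) noexcept {
        // A production version must stop the compiler from eliding this wipe.
        std::memset(static_cast<void*>(p), 0, n * sizeof(T));
        ::operator delete(p);
    }
};

// Stateless allocators always compare equal.
template <typename T, typename U>
bool operator==(const zeroing_allocator<T>&, const zeroing_allocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const zeroing_allocator<T>&, const zeroing_allocator<U>&) noexcept { return false; }

// std::list allocates nodes rather than ints, so it exercises the
// converting constructor when it rebinds the allocator.
inline int sum_with_rebound_allocator() {
    std::list<int, zeroing_allocator<int>> values{1, 2, 3};
    int total = 0;
    for (int v : values) total += v;
    return total;
}
```

Node-based containers such as std::list are the usual place where a missing converting constructor shows up, because they never allocate the element type directly.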

As Satoshi noted elsewhere, the MSVC build is indeed significantly slower khash/s-wise (more than twice as slow) than the prebuilt one (MinGW?), even though I enabled the highest optimization level options and also global optimization with link-time code generation. I find that result strange, since MSVC is not known to have a significantly worse optimizer than GCC. Most probably the problem can be traced to the sha module that is extracted from Crypto++. The Crypto++ SVN has revised versions of the module, including x86/x64 assembly for SHA-256. Using the newer versions would involve reintegrating their dependencies, though. On that note, why aren't we using OpenSSL's SHA-2 hashing functions instead? Since we already use OpenSSL, this would be a better solution than manually maintaining a SHA module from another library.
satoshi
Founder
Sr. Member
*
qt
Offline Offline

Activity: 364
Merit: 6723


View Profile
July 18, 2010, 09:24:09 PM
Last edit: July 18, 2010, 09:37:25 PM by satoshi
 #2

OpenSSL doesn't have any interface for doing just the low level raw block hash part of SHA256.  SHA256 begins by wrapping your data in a specially formatted buffer.  Setting up the buffer takes an order of magnitude longer than the actual hashing if you're only hashing one or two blocks like we do.  It's intended that the time is amortised if you were hashing many KB or MB of data.  In BitcoinMiner, we format the buffer once and keep reusing it.
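A rough sketch of the "format the buffer once and keep reusing it" idea follows. The padding layout is the standard SHA-256 one; the helper names `sha256_pad` and `set_nonce`, and the assumption of an 80-byte header with a 4-byte nonce at offset 76, are illustrative and not taken from BitcoinMiner.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the padded SHA-256 input for `len` bytes of data:
// message, a 0x80 byte, zero fill, then the bit length as a
// 64-bit big-endian integer, rounded up to whole 64-byte blocks.
std::vector<std::uint8_t> sha256_pad(const std::uint8_t* data, std::size_t len) {
    const std::size_t padded_len = ((len + 1 + 8 + 63) / 64) * 64;
    std::vector<std::uint8_t> buf(padded_len, 0);
    for (std::size_t i = 0; i < len; ++i) buf[i] = data[i];
    buf[len] = 0x80;
    const std::uint64_t bits = static_cast<std::uint64_t>(len) * 8;
    for (int i = 0; i < 8; ++i)
        buf[padded_len - 1 - i] = static_cast<std::uint8_t>(bits >> (8 * i));
    return buf;
}

// Only the nonce changes between attempts, so the padded buffer is built
// once and patched in place. Offset 76 and little-endian order assume an
// 80-byte header whose last four bytes are the nonce (an assumption here).
void set_nonce(std::vector<std::uint8_t>& padded, std::uint32_t nonce) {
    for (int i = 0; i < 4; ++i)
        padded[76 + i] = static_cast<std::uint8_t>(nonce >> (8 * i));
}
```

With an 80-byte input this yields two 64-byte blocks, and each new attempt touches four bytes rather than rebuilding the padding.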

If you can find SHA256 code that's faster (with MinGW/GCC) than what we've got, that would be really great!  (although, keep licensing in mind)  The one we have is the only one I tried, so there's significant chance for improvement.

When I wrote it more than 2 years ago, there were screaming hot SHA1 implementations but minimal attention to SHA256.  That's a lot of time for them to come up with better stuff.  At the time, SHA256 was a lot slower relative to the fastest SHA1 than I thought it should be.  Obviously SHA256 should be slower than SHA1 by a certain amount, but not by as much as I saw.

(hope you don't mind I renamed your thread, SHA-256 optimisation is something important that I keep forgetting about)
BlackEye
Newbie
*
Offline Offline

Activity: 17
Merit: 0


View Profile
July 20, 2010, 10:20:05 PM
 #3

You could try the SHA2 implementation from : http://polarssl.org/

With a simple test in Visual Studio 2008 of a busy loop executing the hash function it was able to hash at 1.5x the rate that bitcoin does.
Olipro
Member
**
Offline Offline

Activity: 70
Merit: 10


View Profile
July 20, 2010, 10:37:53 PM
 #4

Quote from: BlackEye
You could try the SHA2 implementation from : http://polarssl.org/

With a simple test in Visual Studio 2008 of a busy loop executing the hash function it was able to hash at 1.5x the rate that bitcoin does.

really? I'm struggling to see how that implementation could possibly be faster.
BlackEye
Newbie
*
Offline Offline

Activity: 17
Merit: 0


View Profile
July 20, 2010, 10:50:28 PM
Last edit: July 20, 2010, 11:36:22 PM by BlackEye
 #5

How many bytes does bitcoin typically hash each time?

edit
It appears, if I'm looking at the source correctly, that bitcoin does 2 hashes and a bunch of other stuff and only counts it as 1 hash.  Which means polarssl probably isn't faster then.
Olipro
Member
**
Offline Offline

Activity: 70
Merit: 10


View Profile
July 20, 2010, 11:30:35 PM
 #6

Quote from: BlackEye
How many bytes does bitcoin typically hash each time?

as far as I can see, it hashes 16 bytes at a time, the number of 16 byte blocks to process is variable, check the main.cpp
BlackEye
Newbie
*
Offline Offline

Activity: 17
Merit: 0


View Profile
July 21, 2010, 09:21:58 PM
Last edit: July 22, 2010, 12:00:42 PM by BlackEye
 #7

Further examination of the source makes it clear that the hashing is done in blocks of 64 bytes.  I was able to hack the SHA256 functions from PolarSSL and got it to do the block hashing just like the current code, with about a 10% increase in speed.  I verified the hashes produced are the same.

I took the plunge and got all the dependencies together and compiled Bitcoin myself to try to get the new hashing in place.  It's odd that you say that a MSVC build decreases the hashing performance, as I've found it increases it.  I'm using Visual Studio 2008 Standard on a 32bit dual core machine, so maybe that has something to do with it.  I went from ~1000 khash/sec with the build from the Bitcoin website, to ~1350 khash/sec by just compiling the source with Visual Studio 2008, to ~1500 khash/sec with Visual Studio 2008 using the PolarSSL hashing functions.

edit - latest binary 2010-07-22
You can get my build here : http://www.filedropper.com/bitcoin-032_3
You'll need the Visual Studio 2008 redistributable to run this, so if it crashes immediately or complains about an incorrect configuration, you need to install that first.  The download includes the modified sources.  I used the latest bdb, which seems to update the database format, so you can't go back to the old client because it can't open the newer database.  I suggest you save your database before you try this build if you want to revert later.
jgarzik
Legendary
*
qt
Offline Offline

Activity: 1596
Merit: 1091


View Profile
July 21, 2010, 10:15:01 PM
 #8

Ideally, bitcoin should be hashing in CPU cacheline-sized, cacheline-aligned chunks (usually 64 bytes).

Also, on modern CPUs, you can issue a "pre-fetch" like this

     while (have bytes)
          pre-fetch(index + 1)
          sha256(index)

which will potentially speed up the operation.

Both Linux and Windows compilers can generate prefetches.  gcc provides builtins for this.
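A minimal sketch of that loop is below. It uses gcc's `__builtin_prefetch` and falls back to a no-op elsewhere (MSVC offers `_mm_prefetch` in its intrinsics headers). The `Block` type, the 64-byte line size, and `mix_block` are assumptions for illustration; `mix_block` is a trivial stand-in, not SHA-256. Over-aligned storage in std::vector needs C++17.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Prefetch hint: gcc/clang builtin, no-op on other compilers.
inline void prefetch(const void* p) {
#if defined(__GNUC__)
    __builtin_prefetch(p);
#else
    (void)p;
#endif
}

// One cache-line-sized, cache-line-aligned chunk (64 bytes is an assumption).
struct alignas(64) Block {
    std::uint8_t bytes[64];
};

// Stand-in for the per-block hash so the sketch stays short; NOT SHA-256.
inline std::uint32_t mix_block(std::uint32_t state, const Block& b) {
    for (std::uint8_t byte : b.bytes) state = state * 31u + byte;
    return state;
}

std::uint32_t process_blocks(const std::vector<Block>& blocks) {
    std::uint32_t state = 0;
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        // Hint the next block into cache while this one is processed.
        // A prefetch is only a hint and never changes the result.
        if (i + 1 < blocks.size()) prefetch(&blocks[i + 1]);
        state = mix_block(state, blocks[i]);
    }
    return state;
}
```

Whether this helps depends on the access pattern: hardware prefetchers already handle simple sequential reads well, so it is worth measuring before and after.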

Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own.
Olipro
Member
**
Offline Offline

Activity: 70
Merit: 10


View Profile
July 22, 2010, 04:43:39 AM
 #9

Quote from: BlackEye
There was an issue with hashing multiple blocks in the binary above.  I've corrected the source for now, but I won't be able to compile a new binary until tomorrow.  Here's the corrected source for anyone who is interested.
http://www.filedropper.com/bitcoin-032_2

I gave it a try using the x64 Intel compiler with full optimization. Performance is practically identical to the stock algorithm; in fact, the new algorithm seems marginally worse.
BlackEye
Newbie
*
Offline Offline

Activity: 17
Merit: 0


View Profile
July 22, 2010, 12:15:01 PM
 #10

A 32 bit build running on a 32 bit system with the new algorithm is definitely faster than the base algorithm.  I've run it on 3 different systems and all showed improvement.
dkaparis (OP)
Newbie
*
Offline Offline

Activity: 53
Merit: 0


View Profile
July 22, 2010, 12:43:24 PM
 #11

Quote from: BlackEye
I took the plunge and got all the dependencies together and compiled Bitcoin myself to try to get the new hashing in place.  It's odd that you say that a MSVC build decreases the hashing performance, as I've found it increases it.  I'm using Visual Studio 2008 Standard on a 32bit dual core machine, so maybe that has something to do with it.  I went from ~1000 khash/sec with the build from the Bitcoin website, to ~1350 khash/sec by just compiling the source with Visual Studio 2008, to ~1500 khash/sec with Visual Studio 2008 using the PolarSSL hashing functions.

It's baffling. Do you mind posting the makefile/project you used to build the original source?
BlackEye
Newbie
*
Offline Offline

Activity: 17
Merit: 0


View Profile
July 22, 2010, 01:31:56 PM
Last edit: July 22, 2010, 04:01:09 PM by BlackEye
 #12

The project file is attached.  You'll need to remove the txt extension and change the paths to your libraries, and compile the Release build of course, not the Debug one.

If you are using the Express edition to compile, make sure to read this thread on msdn.
Olipro
Member
**
Offline Offline

Activity: 70
Merit: 10


View Profile
July 22, 2010, 05:22:15 PM
 #13

Quote from: BlackEye
The project file is attached.  You'll need to remove the txt extension and change the paths to your libraries, and compile the Release build of course, not the Debug one.

If you are using the Express edition to compile, make sure to read this thread on msdn.

bit of a shame you linked it against BerkeleyDB5, that will break everyone's database if they should wish to go back to the stock build.
BlackEye
Newbie
*
Offline Offline

Activity: 17
Merit: 0


View Profile
July 22, 2010, 06:37:30 PM
Last edit: July 22, 2010, 07:09:40 PM by BlackEye
 #14

Yes, my post clearly says I used the latest BDB, which will update the database format, and that you should back up your database beforehand.  I just used the latest production release from the Oracle website, and I didn't see any version requirement in the documentation about compiling Bitcoin.  If you look at the BDB release notes, there were plenty of bugs squashed since whatever 4.7.x release Bitcoin is currently using.  What was the rationale for using an outdated version for the official release?

edit
Here's a release statically linked against BDB 4.7.25, so there will be no issues with database versions.
http://www.filedropper.com/bitcoin-032_4
dkaparis (OP)
Newbie
*
Offline Offline

Activity: 53
Merit: 0


View Profile
July 25, 2010, 08:36:03 AM
 #15

Found the culprit. I had left in the /Ob0 option from the original makefile, which disables inline expansion and obviously led to the abysmal performance I was getting. With proper settings, the VC++ build really is faster. This tinkering with makefiles is a major hassle, so I'm going to suggest converting to CMake in another post.

Regarding the SHA-256 function, we can:

1) leave it as it is - requires no effort, but it seems other solutions provide significant performance benefits;

2) adopt the SHA-256 code from later releases of Crypto++; an asm version is currently available; we have the option to either extract the functionality from modules and integrate it into bitcoin source as currently done, which would not be trivial for all SHA module dependencies, or use the complete Crypto++ library as a dependency. No one has done either yet as far as I know, so it is not clear how much faster it will be. More interestingly, I've skimmed the change logs and there are various fixes in Crypto++'s sha module as well. I'm not certain if there are any serious problems with the code bitcoin is currently using though.

3) Integrate code from PolarSSL, like BlackEye did. He claims a 50% khash/s increase with that code.

If we choose either of the last two options, we need to take great care that the hashing functionality is preserved without change or breaking anything. Unit tests would greatly help here, but for that the sha-invoking code in bitcoin would need to be extracted to a separate unit which can be tested.

I could do that refactoring, create unit tests for the hashing and provide patches. Then we can more freely experiment with upgrading the sha implementation. Satoshi and/or anyone interested, post your thoughts.
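As a sketch of what such a unit test could look like, the block below pairs a plain C++ raw block transform with a known-answer check against the standard one-block "abc" test vector. The names `BlockFn`, `sha256_transform` and `hash_abc_hex` are hypothetical; the transform is a textbook reference written for this example, not the Crypto++, PolarSSL or current Bitcoin code. Any candidate implementation with the same signature could be passed to the same check.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Signature shared by every raw block transform under test (hypothetical).
using BlockFn = void (*)(std::uint32_t state[8], const std::uint8_t block[64]);

static const std::uint32_t kRoundConstants[64] = {
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2};

inline std::uint32_t rotr(std::uint32_t x, int n) { return (x >> n) | (x << (32 - n)); }

// Reference raw block transform: folds one 64-byte block into the state.
void sha256_transform(std::uint32_t state[8], const std::uint8_t block[64]) {
    std::uint32_t w[64];
    for (int i = 0; i < 16; ++i)  // message words are big-endian
        w[i] = (std::uint32_t(block[4 * i]) << 24) | (std::uint32_t(block[4 * i + 1]) << 16) |
               (std::uint32_t(block[4 * i + 2]) << 8) | std::uint32_t(block[4 * i + 3]);
    for (int i = 16; i < 64; ++i) {
        std::uint32_t s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3);
        std::uint32_t s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10);
        w[i] = w[i - 16] + s0 + w[i - 7] + s1;
    }
    std::uint32_t a = state[0], b = state[1], c = state[2], d = state[3];
    std::uint32_t e = state[4], f = state[5], g = state[6], h = state[7];
    for (int i = 0; i < 64; ++i) {
        std::uint32_t t1 = h + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)) +
                           ((e & f) ^ (~e & g)) + kRoundConstants[i] + w[i];
        std::uint32_t t2 = (rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)) +
                           ((a & b) ^ (a & c) ^ (b & c));
        h = g; g = f; f = e; e = d + t1;
        d = c; c = b; b = a; a = t1 + t2;
    }
    state[0] += a; state[1] += b; state[2] += c; state[3] += d;
    state[4] += e; state[5] += f; state[6] += g; state[7] += h;
}

// Known-answer check: runs `fn` over the pre-padded message "abc"
// and returns the digest as lowercase hex.
std::string hash_abc_hex(BlockFn fn) {
    std::uint32_t state[8] = {0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
                              0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19};
    std::uint8_t block[64] = {'a', 'b', 'c', 0x80};  // remaining bytes are zero
    block[63] = 24;                                  // message length in bits
    fn(state, block);
    char out[65];
    for (int i = 0; i < 8; ++i)
        std::snprintf(out + 8 * i, 9, "%08x", static_cast<unsigned>(state[i]));
    return std::string(out, 64);
}
```

A fuller suite would add multi-block inputs and compare the candidate against the reference on random blocks, since a single vector can miss bugs in carrying state between blocks.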
BlackEye
Newbie
*
Offline Offline

Activity: 17
Merit: 0


View Profile
July 25, 2010, 10:12:23 PM
Last edit: July 25, 2010, 10:26:37 PM by BlackEye
 #16

I was able to integrate the SHA256 functionality from Crypto++ 5.6.0 into Bitcoin.  This is the fastest SHA256 yet using the SSE2 assembly code.  Since Bitcoin was sending unaligned data to the block hash function, I had to change the MOVDQA instruction to MOVDQU.
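For readers unfamiliar with the two instructions, here is a small sketch of the difference using the SSE2 intrinsics rather than the Crypto++ assembly. The function name `load16_unaligned` is made up for this example, and the non-x86 branch is only a portability fallback.

```cpp
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Copies 16 bytes from a source address that may not be 16-byte aligned.
void load16_unaligned(const std::uint8_t* src, std::uint8_t* dst) {
#if SKETCH_HAVE_SSE2
    // _mm_loadu_si128 is the unaligned load (typically emitted as MOVDQU)
    // and accepts any address. _mm_load_si128 (MOVDQA) requires a 16-byte
    // aligned address and faults otherwise.
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst), v);
#else
    std::memcpy(dst, src, 16);  // fallback for targets without SSE2
#endif
}
```

The alternative to switching instructions is to align the buffers at the call site (for example with `alignas(16)`), which keeps the aligned loads usable.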

I think using the SHA256 functionality from Crypto++ 5.6.0 is the way forward right now.

http://www.filedropper.com/bitcoin-033
dkaparis (OP)
Newbie
*
Offline Offline

Activity: 53
Merit: 0


View Profile
July 25, 2010, 10:26:36 PM
 #17

Excellent work.

Can you provide patches against current SVN?
FreeMoney
Legendary
*
Offline Offline

Activity: 1246
Merit: 1014


Strength in numbers


View Profile WWW
July 25, 2010, 11:55:53 PM
 #18

Is it easy for a newb to try this stuff?

Olipro
Member
**
Offline Offline

Activity: 70
Merit: 10


View Profile
July 26, 2010, 03:48:33 AM
 #19

Quote from: BlackEye
I was able to integrate the SHA256 functionality from Crypto++ 5.6.0 into Bitcoin.  This is the fastest SHA256 yet using the SSE2 assembly code.  Since Bitcoin was sending unaligned data to the block hash function, I had to change the MOVDQA instruction to MOVDQU.

I think using the SHA256 functionality from Crypto++ 5.6.0 is the way forward right now.

http://www.filedropper.com/bitcoin-033

is this the x86 asm? I dumped out the x64 asm and integrated it and performance has proved to be nothing short of blistering.
falkenberg
Member
**
Offline Offline

Activity: 84
Merit: 10


View Profile
July 26, 2010, 10:45:53 AM
 #20

Hi guys,
what about the crypto engines available on some architectures, e.g. PadLock on a VIA processor (yeah, it's a very slow processor, but it has PadLock instructions)? As far as I know, SHA-256 is backed by the hardware there and must be fast.  Recent versions of OpenSSL can benefit from a hardware engine, so if it is used instead of your own SHA-256 implementation, you can accelerate the program without dealing with low-level details.

Regards,