Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: skeeterskeeter on October 08, 2014, 03:30:49 PM



Title: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: skeeterskeeter on October 08, 2014, 03:30:49 PM
I am taking a parallel computing course at my university and we are supposed to do a semester long project on anything related to the course material. I have decided that I wanted to do something with Bitcoin or cryptos.

My idea is to fork the current Bitcoin Core wallet. The fork will have a feature to verify the blockchain on a local GPU to offload the work of verifying blocks on the CPU. This is what miners do, though they are trying to find the hashes, I wish to verify.

This idea came about after watching my CPU sit at 100% for a few hours after doing a fresh install of the Core wallet on a new desktop and waiting for the blockchain to verify.

My questions,

So I want to know if this is a feasible idea, from everything I know it is, but I haven't dug into the code yet at all?

Would this be useful? I would assume as long as the chain is being downloaded fast enough to provide the GPU with work it would be. I think for testing I will just copy a blockchain data file to a new Core wallet install and see if my software or the original verifies faster to prove the idea.  

Any general ideas or concerns?

EDIT: I had another idea. I think it would be a lot easier to just write a standalone chain verifier... Give it a file path of the chain you want verified, set some GPU parameters, and let it rip through it.



Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: andytoshi on October 08, 2014, 03:49:15 PM
Hi Skeeter,


I've played around with parallelizing verification (though not as part of Bitcoin Core), and once past 100k blocks I can pin all eight cores of my CPU constantly. So there is definitely benefit for parallel verification using a fairly large number of cores. There are a few complications though. Here are my thoughts:

Transactions refer to earlier transactions' outputs, so it is difficult to verify out-of-order unless you can do fast lookups (so memory bandwidth is gonna wreck your performance). Further, within a block, transactions are allowed to refer to earlier transactions' outputs but not vice-versa. So to be consensus-correct you need to verify them in order. You can solve this by doing two passes: one to make sure that all transaction references are valid (this can be done in about 30 minutes on my CPU with my crappy Rust code), and the other to actually do the signature verifications. However, this is actually what Bitcoin Core is already doing for the majority of the chain (full signature verification could take a couple of days), so you wouldn't be gaining anything vs the Bitcoin Core behaviour.

One thing you could do (and what I do in my own code) is to break consensus-compatibility and verify all transactions in each block in parallel. For blocks with hundreds or thousands of transactions there is a big benefit here. The way to do it is:
1. Scan the block and blindly add all transaction outputs to your output set in a linear pass. This makes sure that backreferences will work, but will also allow forward references. (It occurred to me while typing this that it's easy to label the outputs with an index into the block, so you can still easily detect forward refs. Oops :) I will fix my code to do this.)
2. Split all transactions in the block into blocks of N_TRANSACTIONS/N_CORES transactions, and pass each one to a new thread for verification. Each thread will need to be able to lookup old outputs, so you will need to make sure your output set is immutable during this phase.
3. Scan the block and delete all spent outputs from your output set in a linear pass.

I expect that by using a parallel data structure for your output set you can merge some or all of these passes. There is room for experimentation and measurement here.

Andrew


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: skeeterskeeter on October 08, 2014, 04:04:08 PM
Andrew,

I think a much better question I now have. What are the steps involved in verifying the block chain?

1. Verify each transaction (highly parallelizable? )
2. Verify each block hash.. ~300k hashes, could be done in serial
3. ... ?

So a big issue you have encountered is that transactions many blocks away from each other that reference each other mean a breakdown in the parallelizability? (that is an awesome word) Because if you wish to verify a single block you might have to look far up or down the chain to find if that specific address really does have x btc to send to another address in the current block?


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: andytoshi on October 08, 2014, 05:07:03 PM
It's not so much that they are far away from each other (though they are), it's the fact that they can be that hurts parallelizability because you need access to a lot of data. The way to handle this is to maintain an "output set" of all unspent outputs; you add every output for each transaction and delete every output that is referenced in each transaction. Maintaining this data structure (which requires random insert/delete access in roughly equal and enormous proportions, which is a nasty data structure problem ... not sure you can do better than a hashmap) is why you have to go through the song and dance I described.

To see what's involved with verifying the blockchain, you should check out my high level overview (https://download.wpsoftware.net/bitcoin/bitcoin-faq.pdf) and also check out the Bitcoin Wiki pages on transactions (https://en.bitcoin.it/wiki/Transactions) and Script (https://en.bitcoin.it/wiki/Script). It's a bit involved.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: skeeterskeeter on October 08, 2014, 05:55:00 PM
It's not so much that they are far away from each other (though they are), it's the fact that they can be that hurts parallelizability because you need access to a lot of data. The way to handle this is to maintain an "output set" of all unspent outputs; you add every output for each transaction and delete every output that is referenced in each transaction. Maintaining this data structure (which requires random insert/delete access in roughly equal and enormous proportions, which is a nasty data structure problem ... not sure you can do better than a hashmap) is why you have to go through the song and dance I described.

To see what's involved with verifying the blockchain, you should check out my high level overview (https://download.wpsoftware.net/bitcoin/bitcoin-faq.pdf) and also check out the Bitcoin Wiki pages on transactions (https://en.bitcoin.it/wiki/Transactions) and Script (https://en.bitcoin.it/wiki/Script). It's a bit involved.


Ahh I see. So I might launch a kernel to verify a block. I would have to pass the block data, along with a pointer to an array of unspent coins in device memory, so that it can do this referencing "[sic]up and down" the chain. The issue is though that the RW bandwidth to the device memory would be massive. You would have to do a lookup on it once per unique address in the blockchain, because at some point all coins are unspent.? (is that correct logic).

To get around this you propose to verify all blocks in parallel, and write any unspent coins for each block into memory. Then do a second pass and verify for each unspent coin that it has some link in the blockchain? Then remove that unspent coin from the list, and if the list is empty after the algorithm runs then the chain is verified?

// I read a lot of the wiki in the past, I need to get back to it. Thanks for the reminder!



Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: r3c4ll on October 08, 2014, 11:36:47 PM
Very intresting conversation! I'll follow up.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: marcus_of_augustus on October 09, 2014, 12:58:11 AM
Interested in this also. Maybe begin looking at how parallelisable sipa's libsecp265k1? ...which is basically the crypto verifying engine.

https://github.com/bitcoin/secp256k1 (https://github.com/bitcoin/secp256k1)


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: cr1776 on October 09, 2014, 01:29:22 AM
As an academic exercise this is an interesting discussion.  From a coding perspective too. I did some code for a hyper-cube (and some other highly parallel machines) in grad school and love the area.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: andytoshi on October 09, 2014, 02:10:16 AM
Interested in this also. Maybe begin looking at how parallelisable sipa's libsecp265k1? ...which is basically the crypto verifying engine.

https://github.com/bitcoin/secp256k1 (https://github.com/bitcoin/secp256k1)

You should be using libsecp256k1 if you are doing verification. It is trivially parallelizable in the sense that you can verify many signatures at once...however, it does not do batch verification (it is not obvious that this is even possible for ECDSA since there is a sign ambiguity in the k value). There are some coming improvements but even today it is something like 6x faster than openssl.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: gmaxwell on October 09, 2014, 05:45:35 AM
Bitcoin core's verification is already parallel.  But there isn't as much parallelism to extract as you might guess, not without extensive architectural changes (e.g. pipelining multiple blocks and accepting partially verified ones). You should not, however, have been seeing it using only one cpu... in fact windows users have frequently complained that it must be broken because it was using all of their cores for minutes at a time. :)


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: skeeterskeeter on October 09, 2014, 07:36:16 AM
Bitcoin core's verification is already parallel.  But there isn't as much parallelism to extract as you might guess, not without extensive architectural changes (e.g. pipelining multiple blocks and accepting partially verified ones). You should not, however, have been seeing it using only one cpu... in fact windows users have frequently complained that it must be broken because it was using all of their cores for minutes at a time. :)


Could you rephrase that slightly? I am not sure if you mean it will work or,

There is no speed up over what is already implemented? or,
The Bitcoin Core already sneaks some code onto the GPU, and is already fully implemented? or,
The GPU would be of no advantage, the most parallel it can become can be ran on a 2+ Core CPU as fast as possible (High step complexity overall)?


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: gmaxwell on October 09, 2014, 03:56:55 PM
Could you rephrase that slightly? I am not sure if you mean it will work or,

There is no speed up over what is already implemented? or,
The Bitcoin Core already sneaks some code onto the GPU, and is already fully implemented? or,
The GPU would be of no advantage, the most parallel it can become can be ran on a 2+ Core CPU as fast as possible (High step complexity overall)?
It is already parallel, it does not use the GPU.  Using the GPU effectively will be difficult because there is not that much parallelism available unless you are verifying multiple blocks at once, and communicating with the GPU has huge latencies.  Competing with state of the art cpu code will take state of the art GPU code (my own vanitygen code on the cpu is roughly as fast as the published gpu vanitygen).

I think you should try it, a port of libsecp256k1 to the OpenCL would be pretty interesting and useful. If it were faster we'd probably be willing to do the architectural changes (making it possible to verify multiple blocks at once) to enable using it.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: hhanh00 on October 12, 2014, 04:38:21 AM
Do you have recommendations?

My current thinking is that the best bang for the buck is parallalizing the two group multiplications (by G and Q).
They use the field doubling and multiplication that were optimized in asm and they must be put on the device.
The rest that uses GMP isn't very much. It's a modular inverse and a few mult/add. It barely registers on my execution profile.

Thanks


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: hhanh00 on October 13, 2014, 06:43:45 AM
I made an openCL version of sipa's secp256k1 library. It's on github if someone is interested. It executes faster but only after you throw hundred of thousands of signatures to verify.
On a sample test run
# of sigs = 262144

Times in ms per phase.

prepare -> 2518
compute -> 15201
check   -> 35
total   -> 17755
           0.067733

Phase 1 & 3 are on CPU. Phase 2 is on GPU. 0.067733 ms per verification.
I have a GTX 560 Ti.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: marcus_of_augustus on October 14, 2014, 01:04:56 AM
Quote
I made an openCL version of sipa's secp256k1 library. It's on github if someone is interested.

Link?

AMD's probably be better then Nvidia if there a lot of integer steps.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: gmaxwell on October 14, 2014, 01:14:50 AM
Phase 1 & 3 are on CPU. Phase 2 is on GPU. 0.067733 ms per verification.
I have a GTX 560 Ti.
So thats 14763 per second?  Thats not favourable compared to running on the cpu but there may be many optimizations still available for you.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: hhanh00 on October 14, 2014, 08:54:55 AM
It's an old video card and there are a few improvements that I can see. It's a baseline.
L


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: Giulio Prisco on October 15, 2014, 09:57:20 AM
But isn't massive parallel computing precisely what the Bitcoin network as a whole does? Why parallelize locally?


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: hhanh00 on October 15, 2014, 10:22:28 AM
It's decentralized, not parallel. Every full node has to verify the whole blockchain.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: dexX7 on October 18, 2014, 04:15:51 PM
Link?

I believe this is it:

https://github.com/hhanh00/secp256k1-cl


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: deepceleron on October 19, 2014, 11:14:03 PM
But isn't massive parallel computing precisely what the Bitcoin network as a whole does? Why parallelize (SIC) locally?
Just signature and Merkle tree analysis of the whole blockchain (which is disabled by checkpoints) would take ~24+ hours of modern CPU processing power. These need to be performed by every user in a trustless environment. Optimizations are beneficial to every user, whether by utilization of available hardware resources, algorithmic shortcuts, or even hand-tuned x64/SSE assembly code routines.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: hhanh00 on October 23, 2014, 02:16:30 AM
Just signature and Merkle tree analysis of the whole blockchain (which is disabled by checkpoints) would take ~24+ hours of modern CPU processing power. These need to be performed by every user in a trustless environment. Optimizations are beneficial to every user, whether by utilization of available hardware resources, algorithmic shortcuts, or even hand-tuned x64/SSE assembly code routines.

It isn't so bad. On a i7 desktop ivy bridge 8-core machine, I could validate all the scripts up to block #295000 in just about 5 1/2 mn.

Code:
Succeeded #36249675
Failed #0
Elapsed 334s

I don't have a more recent bootstrap file, but by interpolating it shouldn't take more than 10 mn to do the complete chain.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: deepceleron on October 23, 2014, 05:22:49 AM
Just signature and Merkle tree analysis of the whole blockchain (which is disabled by checkpoints) would take ~24+ hours of modern CPU processing power. These need to be performed by every user in a trustless environment. Optimizations are beneficial to every user, whether by utilization of available hardware resources, algorithmic shortcuts, or even hand-tuned x64/SSE assembly code routines.

It isn't so bad. On a i7 desktop ivy bridge 8-core machine, I could validate all the scripts up to block #295000 in just about 5 1/2 mn.

Code:
Succeeded #36249675
Failed #0
Elapsed 334s

I don't have a more recent bootstrap file, but by interpolating it shouldn't take more than 10 mn to do the complete chain.


Verification is disabled up to the most recent checkpoint, block 295000. The hash and headers leading to it is known trustworthy, so signature checking is turned off. That's what I said that you just quoted. If you remove all but the first checkpoint from chainparams.cpp and recompile, you will find a quite different result.

(295000, uint256("0x00000000000000004d9b4ef50f0f9d686fd69db2e03af35a100370c64632a983"))

BTW, you have 4 cores with hyperthreading, CPU-Z your machine. Still, a $300+ CPU is not the lowest bar.


Title: Re: Looking into forking the core wallet to use parallel computing to verify blocks
Post by: hhanh00 on October 23, 2014, 07:10:21 AM
I'm not using bitcoin core and my code is performing cross block full validation. In fact, Bitcoin core is taking longer without doing validation.
I know my desktop is not underpowered but 24h+ is not the same as 5 mn. Just saying that validation is not such a big deal.