Bitcoin Forum
Topic: P2P block download - use bit torrent like method
TierNolan (OP)
June 30, 2011, 11:08:43 AM
 #1

At the moment, the block chain is downloaded linearly. This places a large load on the other node, and the downloader has no way to pay that node back.

A better method would be to download the blocks as in BitTorrent.

Download all the headers. 

This should be fast, since it is 80 bytes per block. 

80 bytes every 10 minutes is around 4MB per year.
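
As a rough check of that figure, here is a minimal sketch. It assumes exactly one block every 10 minutes and a 365-day year; the constant and function names are made up for illustration.

```python
HEADER_BYTES = 80        # size of one block header
BLOCKS_PER_HOUR = 6      # assumes one block every 10 minutes

def header_bytes_per_year(days=365):
    """Bytes of header data produced in `days` days at the assumed block rate."""
    return HEADER_BYTES * BLOCKS_PER_HOUR * 24 * days

total = header_bytes_per_year()
print(total)                   # 4204800
print(round(total / 1e6, 1))   # 4.2 (MB, using 1 MB = 1,000,000 bytes)
```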

Download missing blocks

Since you have all the headers, you can ask for any 500 blocks that you want.

You could download blocks 1000-1500 from a seed and another peer could download 500-1000 from a seed.  You could then share the blocks with each other.  This greatly reduces the load on the seeding computers.

Blocks near the end of the chain are more valuable, since they are more likely to have active transactions.  The random selection could be biased towards trying to get those first.
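
A minimal sketch of how the missing blocks could be cut into 500-block ranges and picked with a bias towards the end of the chain. The linear weighting and the function names are assumptions chosen for illustration, not part of any client.

```python
import random

def split_into_ranges(first_height, last_height, size=500):
    """Split [first_height, last_height] into consecutive ranges of at most `size` blocks."""
    ranges = []
    start = first_height
    while start <= last_height:
        end = min(start + size - 1, last_height)
        ranges.append((start, end))
        start = end + 1
    return ranges

def pick_range(missing_ranges, rng=random):
    """Pick one missing range at random, weighted towards the chain tip.

    Weights 1, 2, 3, ... are an arbitrary illustrative choice: the range
    closest to the tip gets the largest weight.
    """
    ordered = sorted(missing_ranges)
    weights = list(range(1, len(ordered) + 1))
    return rng.choices(ordered, weights=weights, k=1)[0]

missing = split_into_ranges(0, 1999)
print(missing)               # [(0, 499), (500, 999), (1000, 1499), (1500, 1999)]
print(pick_range(missing))   # random; the last range is the most likely pick
```

Two peers that pick different ranges from a seed can then exchange them with each other, which is the sharing step described above.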

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
XIU
June 30, 2011, 11:21:33 AM
 #2

Isn't the reason that the blocks have to be verified linearly anyway? As in, part of the hash is the result of the previous hash?
TierNolan (OP)
June 30, 2011, 11:25:43 AM
 #3

Quote
Isn't the reason that the blocks have to be verified linearly anyway? As in, part of the hash is the result of the previous hash?

Mostly, it is the block chain itself that needs to be verified.  Once you have that, you can distinguish fake blocks from real blocks.

Blocks that have inputs connected to blocks that you haven't downloaded could be tagged as pending until the corresponding input block is downloaded.

However, any block that is more than 100-200 blocks away from the last block in the chain is almost certain to be valid and doesn't really need to be checked.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
just_someguy
June 30, 2011, 12:49:55 PM
 #4

Quote
Mostly, it is the block chain itself that needs to be verified.  Once you have that, you can distinguish fake blocks from real blocks.

Unless you are doing simple payment verification you can't verify a block until you have all previous blocks.

You can't know if anything in the block you are examining is a double spend unless you possess the data where those double spends would exist.
If you are downloading the headers with the intention of downloading all the full blocks then you might as well just download the full blocks since you are going to do the same work twice.


TierNolan (OP)
June 30, 2011, 01:32:22 PM
 #5

Quote
If you are downloading the headers with the intention of downloading all the full blocks then you might as well just download the full blocks since you are going to do the same work twice.

That isn't entirely true. You could download the chain and then download the last 2 weeks' worth of blocks. If anyone sent you money in the last 2 weeks, you could verify the transaction directly.

By "verify the block", I just meant confirming that the block is in the chain.

Also, the main proposal was to improve download speed and reduce strain on the seed nodes/miners.

Like in bittorrent, if one client downloads blocks 1-1000 and the other downloads 1001-2000, they can then swap blocks with each other and not download from the seed.  This means that the seed only has to send the chain once.

In fact, it would be equivalent to just using bit-torrent to download the chain and then rescanning it.  I am not sure if -rescan checks all the transactions though.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
gmaxwell
June 30, 2011, 02:03:09 PM
 #6

Quote
Also, the main proposal was to improve download speed and reduce strain on the seed nodes/miners.

Since the overwhelming majority of all the nodes already have all the blocks, it doesn't matter which blocks you fetch... except: making it sparse would add the overhead of having to communicate which blocks you do and don't have.

The reasons that it's slow right now don't appear to have anything to do with the downloading external to bitcoin. One reason it's slow, for example, is that downloading a run of 500 blocks is frequently enough to trigger a flood protection disconnect now. So a new node spends a lot of time being disconnected over and over again from its neighbors, and when it reconnects it wastes a lot of time doing addr.dat exchange. It also may get itself flooded off all the competent neighbors...





bitcool
June 30, 2011, 02:11:59 PM
 #7

I too think the speed of block chain downloading is an issue.
Even if you want to verify all blocks in the chain, can this be a two-step process -- download all blocks concurrently first, and then verify the chain afterward? Just wondering.
TierNolan (OP)
June 30, 2011, 02:17:42 PM
 #8

Quote
I too think the speed of block chain downloading is an issue.
Even if you want to verify all blocks in the chain, can this be a two-step process -- download all blocks concurrently first, and then verify the chain afterward? Just wondering.

Exactly. 300MB takes a few hours, no matter how fast your connection is, because the download is p2p but not torrent-like.

However, if you download just the headers first (80 bytes * 130k blocks is about 10MB), you can at least verify that the blocks you get are real.

All nodes, including client nodes, would likely have the headers so you could p2p them too.

The main point is to download the chain itself from the last block and work backwards.  For most people, they will only need blocks from the last few weeks anyway.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
patvarilly
Guest

June 30, 2011, 03:41:48 PM
 #9

Quote
Isn't the reason that the blocks have to be verified linearly anyway? As in, part of the hash is the result of the previous hash?

The hash is calculated solely based on the 80 bytes of the block header (which contains the hash of the previous block), and not the contents of the block (https://en.bitcoin.it/wiki/Protocol_specification#Block_Headers). So you can verify the integrity of the block chain using only the headers. If you receive block headers out of order, so that you don't yet have the previous block's header, they'll be stored in memory as orphan chains. Eventually, you'll get a block that allows you to link the orphan chains to the main block chain. As is pointed out in the posts, it might be much faster to download lots of small parts of the chain from lots of peers separately, and then join them together on the client side, because the individual peer-to-peer speed might be quite limited.
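
A minimal sketch of that linking step, using synthetic headers. It relies on two facts about the header format: the block hash is the double SHA-256 of the 80-byte header, and bytes 4 to 36 hold the previous block's hash. Everything else is an assumption made for illustration: the helper names are made up, there are no competing forks, and the proof-of-work check that a real client also performs on each hash is left out.

```python
import hashlib
import struct

def block_hash(header: bytes) -> bytes:
    """Double SHA-256 of the 80-byte header (raw byte order)."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()

def prev_hash(header: bytes) -> bytes:
    """Bytes 4..36 of the header hold the hash of the previous block."""
    return header[4:36]

def link_headers(genesis: bytes, headers):
    """Order headers that arrived in any sequence into one chain from `genesis`.

    Headers that cannot be connected yet are left out of the result; they
    are the orphans still waiting for a missing link.
    """
    by_prev = {prev_hash(h): h for h in headers}   # assumes no forks
    chain = [genesis]
    while block_hash(chain[-1]) in by_prev:
        chain.append(by_prev[block_hash(chain[-1])])
    return chain

def make_header(prev: bytes, nonce: int) -> bytes:
    """Synthetic header for the demo: real field layout, dummy field values."""
    version, merkle_root, timestamp, bits = 1, bytes(32), 0, 0
    return struct.pack("<I32s32sIII", version, prev, merkle_root, timestamp, bits, nonce)

genesis = make_header(bytes(32), 0)
h1 = make_header(block_hash(genesis), 1)
h2 = make_header(block_hash(h1), 2)
h3 = make_header(block_hash(h2), 3)

# Headers received out of order with h2 still missing: only h1 connects.
print(len(link_headers(genesis, [h3, h1])))        # 2
# Once h2 arrives, the orphan h3 can be linked as well.
print(len(link_headers(genesis, [h3, h1, h2])))    # 4
```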
TierNolan (OP)
June 30, 2011, 03:46:59 PM
 #10

Quote
As is pointed out in the posts, it might be much faster to download lots of small parts of the chain from lots of peers separately, and then join them together on the client side, because the individual peer-to-peer speed might be quite limited.

Also, the current linear download means that there is no sharing, which is key to p2p (at least to BitTorrent).

The only people who want blocks from you are behind you in the chain.  You give them blocks, but they never have any blocks you need.

This means that it is best to just be a pure leecher.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
just_someguy
June 30, 2011, 05:15:22 PM
 #11

Quote
The hash is calculated solely based on the 80 bytes of the block header (which contains the hash of the previous block), and not the contents of the block

That's a bit misleading.
Part of the 80 bytes is the merkle root of the transactions which is essentially the contents of the block.
TierNolan (OP)
June 30, 2011, 05:49:42 PM
 #12

Quote
That's a bit misleading.
Part of the 80 bytes is the merkle root of the transactions which is essentially the contents of the block.

True, however, for the purposes of verifying that the headers form a valid chain, the Merkle root might as well be a random number.

It is used to confirm that the information contained in the blocks is valid.
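
A minimal sketch of that second use: once a full block arrives, its transaction hashes can be checked against the Merkle root already present in a header you trust. The sketch assumes the hashes are given in raw byte order as 32-byte values; the function names and the synthetic demo data are made up for illustration.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash used for Bitcoin blocks and transactions."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(txids):
    """Merkle root of a non-empty list of 32-byte transaction hashes."""
    if not txids:
        raise ValueError("a block has at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])   # odd count: the last hash is paired with itself
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def body_matches_header(header: bytes, txids) -> bool:
    """Bytes 36..68 of the 80-byte header hold the Merkle root."""
    return header[36:68] == merkle_root(txids)

# Synthetic demo data: three fake transaction hashes and a dummy header.
txids = [dsha256(b"tx-a"), dsha256(b"tx-b"), dsha256(b"tx-c")]
header = bytes(36) + merkle_root(txids) + bytes(12)

print(body_matches_header(header, txids))        # True
print(body_matches_header(header, txids[:2]))    # False: a transaction is missing
```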

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
kjj
June 30, 2011, 07:02:21 PM
 #13

I'm pretty sure that verification time is the dominant factor when fetching a large number of blocks.  For each block, the client verifies each transaction.  That is a lot of work.

17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8
I routinely ignore posters with paid advertising in their sigs.  You should too.
TierNolan (OP)
June 30, 2011, 09:43:15 PM
 #14

Quote
I'm pretty sure that verification time is the dominant factor when fetching a large number of blocks.  For each block, the client verifies each transaction.  That is a lot of work.

I could see that being true in the future, but it shouldn't be a massive deal at the moment.  The client should still try to download the blocks as fast as possible anyway.

I have noticed that the client uses a lot of swap space though.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
TierNolan (OP)
June 30, 2011, 09:53:31 PM
 #15

I am downloading the chain again, and the process is at 90% CPU, so I guess I was wrong about the bottleneck.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
just_someguy
June 30, 2011, 09:54:46 PM
 #16

Quote
True, however, for the purposes of verifying that the headers form a valid chain, the Merkle root might as well be a random number.

It is used to confirm that the information contained in the blocks is valid.

If you aren't verifying anything or looking for transactions that are relevant to you then you might as well just download the block chain all at once:
http://bitcoin.bluematt.me/bitcoin-nightly/blockchain-nightly/
TierNolan (OP)
June 30, 2011, 09:56:01 PM
 #17

Quote
If you aren't verifying anything or looking for transactions that are relevant to you then you might as well just download the block chain all at once:
http://bitcoin.bluematt.me/bitcoin-nightly/blockchain-nightly/

Does the -rescan parameter just scan for transactions, or does it check the chain for validity?

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
just_someguy
June 30, 2011, 10:11:36 PM
 #18

It will rescan for transactions.... but if the chain download lies to you and gives you an alternate chain you won't know.
It will become apparent though as soon as you get no new blocks.
TierNolan (OP)
June 30, 2011, 10:25:09 PM
 #19

Quote
It will rescan for transactions.... but if the chain download lies to you and gives you an alternate chain you won't know.
It will become apparent though as soon as you get no new blocks.

Ahh, ok.  I think there needs to be a -verifychain command.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
deepceleron
July 14, 2011, 09:55:50 PM
 #20

I'm going to bump this because I used search and knew I couldn't be the first to think of this.

Bitcoin does need aggressive torrent-like out-of-order block downloading, especially on a new client install or a restart after not being used for a while. If, instead of requesting block 1, then block 2, etc., I simultaneously request 500 blocks from 500 nodes, I am going to get them much quicker. Think of this as a torrent-like look-ahead. My client can start verifying its way up the block chain as it starts assembling the blocks and adding them to the block database, and if it doesn't get a few blocks after an expected time or finds invalid blocks, it can request them again from several clients.
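
A minimal sketch of such a look-ahead buffer: blocks may arrive in any order, are held until the gap below them is filled, and are then released in height order for verification. The class and method names are hypothetical, and block contents are plain strings here.

```python
class ReorderBuffer:
    """Hold blocks that arrive out of order and release them in height order."""

    def __init__(self, next_height=0):
        self.next_height = next_height   # lowest height not yet released
        self.waiting = {}                # height -> block received early

    def receive(self, height, block):
        """Store one block; return the blocks that can now be verified, in order."""
        if height >= self.next_height:
            self.waiting[height] = block
        ready = []
        while self.next_height in self.waiting:
            ready.append(self.waiting.pop(self.next_height))
            self.next_height += 1
        return ready

    def missing(self, up_to):
        """Heights below `up_to` that have not arrived and could be re-requested."""
        return [h for h in range(self.next_height, up_to) if h not in self.waiting]

buf = ReorderBuffer()
print(buf.receive(2, "b2"))   # []            block 2 must wait for 0 and 1
print(buf.receive(0, "b0"))   # ['b0']
print(buf.missing(4))         # [1, 3]        candidates for a repeat request
print(buf.receive(1, "b1"))   # ['b1', 'b2']  the gap is filled
```

The `missing` list is what a client could ask other peers for again once the expected time has passed.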

Alternatively, it could get bitcoin_blockchain_120000.zip or similar from a trusted source upon new installation.

BitTorrent can get me data as fast as my connection allows, so why should it take half a day to get 200MB on Bitcoin?