Bitcoin Forum
Author Topic: [BETA] Bitcoin blockchain torrent  (Read 57751 times)
jgarzik (OP)
Legendary
Activity: 1596
Merit: 1100
October 12, 2012, 03:17:49 AM
Last edit: February 20, 2013, 03:57:21 PM by jgarzik
 #1



UPDATE Feb 2013: This thread is obsolete.  See the non-beta Bitcoin blockchain data torrent thread for further torrents and updates.


This is a beta test of a blockchain torrent project.  Interested participants are invited to try it and comment.

Version 0.7.1, which just entered testing, includes a new feature:  If the file "bootstrap.dat" is found in the bitcoin data directory, it will validate and import all blockchain data found in that file.  The following torrent presents a bootstrap.dat file for that feature.

Here is the PGP-signed torrent information.  Details follow below the signature.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Torrent info hash: 0bb0521942f586ed96203c6f4d136324756f8a9a
Torrent magnet link:
magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a&dn=bootstrap.dat

Filename: bootstrap.dat
Byte size: 2491771562
SHA1: e70ca90775dfdb13fd0014425805a0bdf4a31677
SHA256: a3f258e7af030165360596e4cb0b9beb24b4ce97352c22e65349b89ad5fc5d3e

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iQIVAwUBUHeHSdodwg8tvwyoAQJ2DA/8CcINxuD7DzLv96fE+R6GVLvRASLcy0ig
8d2YyDbJ9r9dPVODDPGC7I/ooDVJAcQsqdrLYHST2DTCt6c4zZ/7iXzFEahRopsH
PmxYOgkHie7438nqpmH9uC+d5t0pPHUFS6dBSBgdSVPaLfS86fToXrV/bx30rHBi
60FJz5A99aXrrrUny0cGIjafqVv8XjqNoA1lzjsRjeiu3EgTm8Ibcr8ZI6DLp/80
Siv3potlOArTI6sxjc/vKUa6nZILnW8mKdwc/d8LUdRaBPoo71c6Q4YOSQh/OVht
B1rZ8NEX/2SlS3PbhhMELcY/2wvgPGovkIqOgiw6dDIkhsD8iJAD1DqCkZbsq9S1
kKobGmceuQBcyRUCavkafdHJpZzyCAKLnQLV3FvZ3O4QZQVmYGPYn1H8a5UFRDQq
LGKcQmwm4Cet7162woEiNAaR9p7HvTJ4LV2uEvY56m2GfZbToCk2aiycO+v6Fm8K
ZY8cX+cbEOW3AM5rYVa6Rks029LIrFFzIutlo5MJ7uc7oMqciWrcsPtEh59QY+yJ
SENk6cLyWCAHl4jsvUTBvdNGy3fHKSEyQOjG5cmAMXiJTX/iiB/DJf7koTZAj4ub
wez3/HwuenMYjHgjsVXJAXprcEpTjpEcicg4W0MIDw07dPjol0dnRRNpwfJ1HMRc
dlTqLwlXqgs=
=uwRB
-----END PGP SIGNATURE-----


What is bootstrap.dat?

It is a flat, binary file containing bitcoin blockchain data, from the genesis block through height 193,000.  Height 193,000 is the height of the current checkpoint baked into the reference client.

Version 0.7.1 (when released) will automatically validate and import a file in the data directory named "bootstrap.dat".  Version 0.7 or later will also import this file if you pass the command line argument "-loadblock=/path/to/bootstrap.dat" to bitcoin-qt or bitcoind.


Who wants bootstrap.dat?

Anyone bringing up a new node using the reference client.  This is one method of accelerating the initial blockchain download process, while helping the bitcoin P2P network by offloading data download traffic from public P2P nodes.

This download is not for those who are already running the bitcoin client.


NOTE: This torrent requires DHT torrent capability

This torrent is a so-called "trackerless torrent", to avoid making any of the open torrent trackers targets of any bitcoin antipathy.  Peers for this torrent are discovered via DHT, and early results seem to indicate that some bittorrent clients take a while to find their initial peers.  We have also discovered clients (rtorrent) that disable DHT by default; you will need to turn it on.
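
Since there is no tracker, the info hash in the magnet link is what a client uses to look up peers on the DHT. A minimal Python sketch of pulling that hash out of the magnet link from the signed message; the function name is illustrative, and it assumes a hex-encoded btih value (magnet links can also carry base32).

```python
from urllib.parse import parse_qs

# Magnet link and info hash as posted in the signed message.
MAGNET = ("magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a"
          "&dn=bootstrap.dat")
SIGNED_INFO_HASH = "0bb0521942f586ed96203c6f4d136324756f8a9a"


def magnet_info_hash(uri: str) -> str:
    """Return the hex BitTorrent info hash from a magnet URI."""
    scheme, _, query = uri.partition("?")
    if scheme != "magnet:":
        raise ValueError("not a magnet URI")
    # A magnet link may carry several xt parameters; take the btih one.
    for xt in parse_qs(query).get("xt", []):
        if xt.startswith("urn:btih:"):
            return xt[len("urn:btih:"):].lower()
    raise ValueError("no BitTorrent info hash in magnet URI")


if __name__ == "__main__":
    print(magnet_info_hash(MAGNET) == SIGNED_INFO_HASH)
```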

Please report results, good or bad, on using this trackerless torrent.


How often will this torrent be updated?

Assuming this project is deemed useful and worth continuing... the torrent will be updated once every few months, when the checkpoints are updated in the reference client source code.


Why not update the torrent more often?

A torrent works best when it is a large, static dataset that changes infrequently.  That maximizes the ability to seed the data, enabling even part-time seeders to contribute meaningfully.  Less frequent changes also minimize the risk that a malicious torrent will appear, with a long, malicious side chain.  The current policy only updates the torrent after blocks are buried many thousands deep in the chain.


Why should I trust you?

You don't have to:  This data is raw block chain data.  The client will verify this data during import.

Independent third parties may generate their own bootstrap.dat, up to height 193000, and verify that the sha256sum matches that posted above.  The file format is simple and publicly known:

     <4-byte pchMessageStart><32-bit length><CBlock, serialized in network wire format>
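
For anyone who wants to inspect the file, here is a minimal Python sketch that walks the records in that layout. It assumes the mainnet pchMessageStart bytes f9 be b4 d9 and a little-endian length field, as in the reference client; iter_blocks is a hypothetical helper name, and nothing here validates the blocks themselves.

```python
import struct
from typing import BinaryIO, Iterator

# Assumption: mainnet network magic (pchMessageStart).
MAINNET_MAGIC = bytes.fromhex("f9beb4d9")


def iter_blocks(stream: BinaryIO,
                magic: bytes = MAINNET_MAGIC) -> Iterator[bytes]:
    """Yield each serialized block from a bootstrap.dat-style stream."""
    while True:
        header = stream.read(8)
        if len(header) == 0:
            return  # clean end of file
        if len(header) < 8:
            raise ValueError("truncated record header")
        if header[:4] != magic:
            raise ValueError("unexpected magic bytes")
        # Assumption: the 32-bit length is stored little-endian.
        (length,) = struct.unpack("<I", header[4:])
        block = stream.read(length)
        if len(block) != length:
            raise ValueError("truncated block")
        yield block


if __name__ == "__main__":
    with open("bootstrap.dat", "rb") as f:
        print(sum(1 for _ in iter_blocks(f)), "blocks")
```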


Torrent file download?

If the magnet link does not work, download http://gtf.org/garzik/bitcoin/bootstrap.dat.torrent
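
Whichever way the file is fetched, the result can be checked against the byte size and SHA256 in the signed message above. A minimal Python sketch, with hypothetical function names, that hashes the file in chunks so the roughly 2.5 GB file is never held in memory:

```python
import hashlib
from typing import BinaryIO, Tuple

# Values copied from the PGP-signed message in this post.
EXPECTED_SIZE = 2491771562
EXPECTED_SHA256 = (
    "a3f258e7af030165360596e4cb0b9beb24b4ce97352c22e65349b89ad5fc5d3e"
)


def sha256_and_size(stream: BinaryIO,
                    chunk_size: int = 1 << 20) -> Tuple[str, int]:
    """Return (hex SHA256 digest, total byte count) of a binary stream."""
    digest = hashlib.sha256()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        total += len(chunk)
    return digest.hexdigest(), total


def matches_signed_values(stream: BinaryIO) -> bool:
    """True if the stream has the signed size and SHA256."""
    hex_digest, size = sha256_and_size(stream)
    return size == EXPECTED_SIZE and hex_digest == EXPECTED_SHA256


if __name__ == "__main__":
    with open("bootstrap.dat", "rb") as f:
        print("OK" if matches_signed_values(f) else "MISMATCH")
```

This only shows that the file matches what was posted; checking the PGP signature on the message itself is a separate step.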


Comments welcome

Post any comments or experiences in this thread.  I'll update the OP as needed.

Maybe trackerless will be a #fail, but let's see how it goes.


Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own.
Visit bloq.com / metronome.io
Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
Steve
Hero Member
Activity: 868
Merit: 1008
October 12, 2012, 03:46:46 AM
 #2

Cool.  It got me thinking and questioning a few things.  I've often thought that a good alternative to downloading & validating the full block chain when bootstrapping a new client is to find a node you trust and simply copy its block chain and transaction data.  I think of this kind of like cloning a trusted node.  You can bypass all the validation work.

This is similar to that in the sense that you are downloading a block chain that you trust to be validated.  It happens to match the 193,000 block baked into the client and signed by the core developers (by virtue of their signature on the hash of the software download).  Since this bootstrap.dat has block 193,000 whose hash matches the one baked into the client, you can be sure of the fact that you've got a block chain that matches that which was signed by the core developers up to block 193,000.  If all the merkle hashes and previous block pointers check out, you're good to go.

But, I always thought the bulk of the time to sync up the block chain was due to transaction validation and not actually due to the download time.  When it comes to the matters of trust, could you not achieve the equivalent result by starting up the client in a mode that doesn't do full validation for blocks prior to the most recent, baked in block (i.e. 193,000)?  On startup (if it's GUI), the user could be prompted to decide whether to perform a full validation, or to trust the baked in checkpoint.  Maybe downloading through Bittorrent still has advantages (i.e. not burdening the bitcoin network with download traffic), but I wonder how the performance of this would compare with downloading via Bittorrent.
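
To make the "previous block pointers check out" step concrete, here is a rough Python sketch. It assumes the blocks are supplied in chain order, uses the standard header layout (80 bytes, with the previous block hash at bytes 4 to 35), and covers only the linkage; it does not check proof of work, merkle roots, or transactions, so it is far less than what the client does on import. Function names are illustrative.

```python
import hashlib
from typing import Iterable

HEADER_SIZE = 80  # bytes in a serialized block header


def block_hash(block: bytes) -> bytes:
    """Double SHA-256 of the 80-byte header, in internal byte order."""
    header = block[:HEADER_SIZE]
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def prev_hash(block: bytes) -> bytes:
    """The hashPrevBlock field: 32 bytes after the 4-byte version."""
    return block[4:36]


def links_check_out(blocks: Iterable[bytes]) -> bool:
    """True if every block points at the hash of the block before it."""
    expected_prev = None
    for block in blocks:
        if len(block) < HEADER_SIZE:
            return False
        if expected_prev is not None and prev_hash(block) != expected_prev:
            return False
        expected_prev = block_hash(block)
    return True
```

Comparing the hash of the last block against the checkpoint hash baked into the client would be one more comparison on top of this.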

(gasteve on IRC) Does your website accept cash? https://bitpay.com
jgarzik (OP)
October 12, 2012, 03:56:24 AM
 #3

Quote
This is similar to that in the sense that you are downloading a block chain that you trust to be validated.  It happens to match the 193,000 block baked into the client and signed by the core developers (by virtue of their signature on the hash of the software download).  Since this bootstrap.dat has block 193,000 whose hash matches the one baked into the client, you can be sure of the fact that you've got a block chain that matches that which was signed by the core developers up to block 193,000.  If all the merkle hashes and previous block pointers check out, you're good to go.

Not quite...  this data is precisely the same data you see on the P2P network.

The client validates bootstrap.dat data to the same level it validates data downloaded from the P2P network.

If all checkpointing code is disabled, this data remains fully valid and useful.

Quote
But, I always thought the bulk of the time to sync up the block chain was due to transaction validation and not actually due to the download time.

A lot of the time is due to Berkeley DB slowness (fixed by ultraprune).  Another yet-unfixed cause of slowness is poor bitcoin P2P network peer selection, for block download.

Quote
When it comes to the matters of trust, could you not achieve the equivalent result by starting up the client in a mode that doesn't do full validation for blocks prior to the most recent, baked in block (i.e. 193,000)?  On startup (if it's GUI), the user could be prompted to decide whether to perform a full validation, or to trust the baked in checkpoint.  Maybe downloading through Bittorrent still has advantages (i.e. not burdening the bitcoin network with download traffic), but I wonder how the performance of this would compare with downloading via Bittorrent.

We want to do exactly the same amount of validation as the client does with network blocks...  break no additional link in the trust chain :)


Steve
October 12, 2012, 04:06:14 AM
 #4

Quote
We want to do exactly the same amount of validation as the client does with network blocks...  break no additional link in the trust chain :)

Oh, I see... I guess that's what you meant by "The client will verify this data during import."  ;)

Any idea how long it takes to perform the validation of bootstrap.dat once it's downloaded (on typical hardware)?

jgarzik (OP)
October 12, 2012, 04:45:36 AM
 #5

Quote
Any idea how long it takes to perform the validation of bootstrap.dat once it's downloaded (on typical hardware)?

Sadly it varies wildly depending on your hard drive configuration.  Import time here is under an hour.


ShadowOfHarbringer
Legendary
Activity: 1470
Merit: 1006
Bringing Legendary Har® to you since 1952
October 12, 2012, 08:09:34 AM
 #6

At last, finally somebody made bootstrapping Bitcoin easier.

It was a major pain in the ass for people running a full BTC client.

acoindr
Legendary
Activity: 1050
Merit: 1002
October 12, 2012, 03:08:31 PM
 #7

Very nice! This project seems to continue to have the best technology associated with it.
DeathAndTaxes
Donator
Legendary
Activity: 1218
Merit: 1079
Gerald Davis
October 12, 2012, 03:15:27 PM
 #8

Nice one.  I will dedicate some bandwidth to it.  I could imagine that with a couple hundred seeds new users could bootstrap very quickly.  Even if you only updated the torrent once a year it would still provide a significant portion of the blockchain at a high speed.
justusranvier
Legendary
Activity: 1400
Merit: 1013
October 12, 2012, 03:18:33 PM
 #9

Won't this torrent only benefit new users if people actually seed it?

The swarm doesn't appear to be particularly large at the moment...
jgarzik (OP)
October 12, 2012, 03:33:24 PM
 #10

Quote
Nice one.  I will dedicate some bandwidth to it.  I could imagine that with a couple hundred seeds new users could bootstrap very quickly.  Even if you only updated the torrent once a year it would still provide a significant portion of the blockchain at a high speed.

Yes, that's the hope.  Already had reports of 7 MB/s downloads...


jgarzik (OP)
October 12, 2012, 03:34:27 PM
 #11

Quote
Won't this torrent only benefit new users if people actually seed it?

The swarm doesn't appear to be particularly large at the moment...

Correct.  It is not much use, without participating seeders.  We definitely have several right now... but it does look like users occasionally get partitioned off into "Azureus island" or "rtorrent island" or "everybody else's island."


justusranvier
October 12, 2012, 03:56:38 PM
 #12

Quote
but it does look like users occasionally get partitioned off into "Azureus island" or "rtorrent island" or "everybody else's island."

I wonder why the Azureus island is so lonely right now. All I can see are two seeds and no peers.
DeathAndTaxes
October 12, 2012, 04:02:05 PM
 #13

Quote
Maybe a silly question but would it be possible to separate the torrent into something like daily or weekly segments so the individual parts never need to be updated?

Great work, looking forward to testing this client over the weekend.

That probably is a good idea but I don't think updates that often are necessary.  Think of the torrent as just a jumpstarter.  It doesn't need to get you within days of the current block; it just needs to rapidly build the "ancient history".  Even having a single base torrent to block 193,000 allows new users to bootstrap ~90% of the blockchain.

It could be as simple as making a new torrent for each checkpoint.  Say in 6 months we are at block 240,000 and there is a checkpoint at 220,000; a "torrent2" could be added which contains blocks 193,001 to 220,000.  New users could download both torrents and be within 20,000 blocks of "current".  I think quarterly updates at most are all that is necessary.

On edit: looks like it will take me ~30 min to download the torrent.  If a user can download 80% to 90% of the blockchain in half an hour and then index it in another hour that is a pretty nice jumpstart.
jgarzik (OP)
October 12, 2012, 04:13:01 PM
 #14

Quote
Maybe a silly question but would it be possible to separate the torrent into something like daily or weekly segments so the individual parts never need to be updated?

At present, that sort of setup would be challenging for users to use effectively.  It is far easier to have one file to import (automatically, in 0.7.1+) than a collection of files.

Longer term, the idea is to fix any issues with bitcoin P2P peer selection, so that downloading recent blocks from your peers is faster and not burdensome.

Quote
It could be as simple as making a new torrent for each checkpoint.  Say in 6 months we are at block 240,000 and there is a checkpoint at 220,000; a "torrent2" could be added which contains blocks 193,001 to 220,000.  New users could download both torrents and be within 20,000 blocks of "current".  I think quarterly updates at most are all that is necessary.

That's the plan, with a slight change:  each new torrent will contain all blocks from zero to X.

For bitcoin users, the largest user population served by this is those that have zero blocks, and are jumpstarting a fresh node installation.  Other bitcoin users will likely have blocks fresher than 3-6 months old; catching up via P2P network is fine for them.  There is only a tiny remaining segment of bitcoin users who would then be served by new-torrent-for-checkpoint, those that only turn on their bitcoin clients once every ~6 months.

Note:  For existing torrent seeders, they may simply swap out the .torrent file, perhaps kick their torrent client to manually re-verify a file, and bootstrap.dat in their Uploads directory will simply be extended.  Seeders will automatically already have 90% of each new torrent's bootstrap.dat.


DeathAndTaxes
October 12, 2012, 04:16:05 PM
 #15

Quote
Note:  For existing torrent seeders, they may simply swap out the .torrent file, perhaps kick their torrent client to manually re-verify a file, and bootstrap.dat in their Uploads directory will simply be extended.  Seeders will automatically already have 90% of each new torrent's bootstrap.dat.

Hmm.  I didn't realize that was possible.  I assumed that changing the torrent would result in no seeds (and all prior seeders needing to download again).  I was thinking many wouldn't and thus it would be a challenge to keep the number of seeders high.  I guess bittorrent is "smarter" than I realized.  
Richy_T
Legendary
Activity: 2604
Merit: 2296
1RichyTrEwPYjZSeAYxeiFBNnKC9UjC5k
October 12, 2012, 04:33:19 PM
 #16

Quote
At last, finally somebody made bootstrapping Bitcoin easier.

It was a major pain in the ass for people running a full BTC client.

It has been possible to download the blockchain. Torrent is in some ways an improvement in that it improves decentralization but in some ways a step back as it's not really suited to frequent updates. Say, for example, I download and seed v1, eventually v2 will come out. Although I already have all of the data in v2 in my running blockchain, there's no easy way to contribute to seeding v2 without downloading the whole thing again. A way to extract a version of the torrented blockchain from my live blockchain would be good. What might be better would be a custom client that could seed from the live blockchain itself.

1RichyTrEwPYjZSeAYxeiFBNnKC9UjC5k
Richy_T
October 12, 2012, 04:59:43 PM
 #17


Quote
...
Note:  For existing torrent seeders, they may simply swap out the .torrent file, perhaps kick their torrent client to manually re-verify a file, and bootstrap.dat in their Uploads directory will simply be extended.  Seeders will automatically already have 90% of each new torrent's bootstrap.dat.

D'oh. This is "stuff I know" and have used from time to time. My bad.

jgarzik (OP)
October 12, 2012, 05:01:57 PM
 #18

Quote
Note:  For existing torrent seeders, they may simply swap out the .torrent file, perhaps kick their torrent client to manually re-verify a file, and bootstrap.dat in their Uploads directory will simply be extended.  Seeders will automatically already have 90% of each new torrent's bootstrap.dat.

Hmm.  I didn't realize that was possible.  I assumed that changing the torrent would result in no seeds (and all prior seeders needing to download again).  I was thinking many wouldn't and thus it would be a challenge to keep the number of seeders high.  I guess bittorrent is "smarter" than I realized.

This is special to our use case:  bootstrap.dat is essentially an append-only file.  Blocks are simply concatenated onto the end.

Today's torrent at height 193000 is 2,491,771,562 bytes in size.

The next torrent, a few months from now, will have the same first 2,491,771,562 bytes.

Thus, to bittorrent, the next torrent will simply appear to be a truncated / not fully downloaded bootstrap.dat.  Bittorrent is built to fill in the missing pieces of a file, so that is what it does here :)
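
As a rough illustration of the arithmetic in Python: any piece of the new torrent that lies entirely within the first 2,491,771,562 bytes will already verify. The 1 MiB piece length below is an assumption for illustration; the actual piece length of the torrent is not stated here, and the function names are hypothetical.

```python
OLD_SIZE = 2491771562          # today's bootstrap.dat, in bytes
ASSUMED_PIECE_LENGTH = 1 << 20  # 1 MiB; illustrative assumption only


def reusable_pieces(old_size: int, piece_length: int) -> int:
    """Pieces of the new torrent fully covered by the old file."""
    return old_size // piece_length


def total_pieces(new_size: int, piece_length: int) -> int:
    """Number of pieces in a torrent of new_size bytes (ceiling division)."""
    return -(-new_size // piece_length)


def fraction_already_held(old_size: int, new_size: int,
                          piece_length: int) -> float:
    """Share of the new torrent's pieces a seeder of the old file holds."""
    return (reusable_pieces(old_size, piece_length)
            / total_pieces(new_size, piece_length))


if __name__ == "__main__":
    print(reusable_pieces(OLD_SIZE, ASSUMED_PIECE_LENGTH))
```

The piece that straddles the old end of file fails its hash check and is fetched again along with everything after it.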


Richy_T
October 12, 2012, 05:06:16 PM
 #19

Quote
Note:  For existing torrent seeders, they may simply swap out the .torrent file, perhaps kick their torrent client to manually re-verify a file, and bootstrap.dat in their Uploads directory will simply be extended.  Seeders will automatically already have 90% of each new torrent's bootstrap.dat.

Hmm.  I didn't realize that was possible.  I assumed that changing the torrent would result in no seeds (and all prior seeders needing to download again).  I was thinking many wouldn't and thus it would be a challenge to keep the number of seeders high.  I guess bittorrent is "smarter" than I realized.

It will result in seeders needing to re-download a torrent for each update though. They'll also have to download all the intervening data (though they will be seeding from what they already have right away). I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.

jgarzik (OP)
October 12, 2012, 05:12:31 PM
 #20

Quote
It will result in seeders needing to re-download a torrent for each update though.

It will result in seeders already having 90% of any new torrent, thus having to fill in the remaining 10%.

Quote
I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.

Yes, this is open data and an open file format.  Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py

Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately.  A nice, decentralized solution :)
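
For illustration only, the writing side of the format can be sketched in a few lines of Python. This is not the code from mkbootstrap.py; write_bootstrap is a hypothetical helper, it assumes mainnet magic bytes and a little-endian length, and a byte-for-byte identical file additionally depends on emitting the same blocks in the same order.

```python
import struct
from typing import BinaryIO, Iterable

# Assumption: mainnet network magic (pchMessageStart).
MAINNET_MAGIC = bytes.fromhex("f9beb4d9")


def write_bootstrap(out: BinaryIO, blocks: Iterable[bytes],
                    magic: bytes = MAINNET_MAGIC) -> int:
    """Append each serialized block as <magic><length><block>.

    Returns the number of bytes written.
    """
    written = 0
    for block in blocks:
        out.write(magic)
        # Assumption: 32-bit little-endian length prefix.
        out.write(struct.pack("<I", len(block)))
        out.write(block)
        written += len(magic) + 4 + len(block)
    return written
```

Where the serialized blocks come from (a local node, its block files, or the P2P network) is left open here.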

