Bitcoin Forum
Author Topic: [BETA] Bitcoin blockchain torrent  (Read 57708 times)
Richy_T
Legendary
*
Offline Offline

Activity: 2450
Merit: 2130


1RichyTrEwPYjZSeAYxeiFBNnKC9UjC5k


View Profile
October 12, 2012, 05:16:31 PM
 #21

It will result in seeders needing to re-download a torrent for each update though.

It will result in seeders already having 90% of any new torrent, thus having to fill in the remaining 10%.

Yes, it's just that it's an extra step, not automatic. This is a limitation of the torrent protocol and not a criticism of this project, just pointing it out.


Quote
I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.

Yes, this is open data and an open file format.  Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py

Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately.  A nice, decentralized solution :)


Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).

1RichyTrEwPYjZSeAYxeiFBNnKC9UjC5k
TangibleCryptography
Sr. Member
****
Offline Offline

Activity: 476
Merit: 250


Tangible Cryptography LLC


View Profile WWW
October 12, 2012, 05:21:42 PM
 #22

Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).

Why would a byte-for-byte copy be a "slightly different length"?
jgarzik (OP)
Legendary
*
Offline Offline

Activity: 1596
Merit: 1091


View Profile
October 12, 2012, 05:31:09 PM
 #23

Quote
Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately.  A nice, decentralized solution :)

Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).

Once the height of the next checkpoint is publicly known, everyone may run that script to independently generate bootstrap.dat with the exact same file size and SHA256 checksum.
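To compare an independently generated bootstrap.dat with the torrented one, checking the file size and SHA256 digest is enough. A minimal sketch using only the Python standard library; the helper name `file_size_and_sha256` is hypothetical and not part of mkbootstrap.py:

```python
import hashlib
import os


def file_size_and_sha256(path, chunk_size=1 << 20):
    """Return (size_in_bytes, hex_sha256) for the file at `path`.

    The file is read in chunks so a multi-gigabyte bootstrap.dat
    never has to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


# Hypothetical usage:
#   size, sha = file_size_and_sha256("/path/to/bootstrap.dat")
#   print(size, sha)
```

Two people who built the file from the same checkpoint height should see the same pair of values.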


Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own.
Visit bloq.com / metronome.io
Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
Richy_T
October 12, 2012, 05:35:44 PM
 #24

Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).

Why would a byte-for-byte copy be a "slightly different length"?

How does it know the length of the torrented file? (Note that it is "identical", not a copy.) Though from what jgarzik says in the post above, there is some kind of checkpointing that it either knows about or has fed into it?

DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
October 12, 2012, 05:42:25 PM
 #25

Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).

Why would a byte-for-byte copy be a "slightly different length"?

How does it know the length of the torrented file? (Note that it is "identical", not a copy.) Though from what jgarzik says in the post above, there is some kind of checkpointing that it either knows about or has fed into it?

It doesn't need to know the length of the file.  The script doesn't make a .dat file of your entire blockchain, only of the blocks up through the last checkpoint.  Currently the last checkpoint is block 193,000.  If you run that script on any node, it will produce the same file.  Exactly the same file.

Now currently the script has 193,000 hardcoded, but I could see a future version either getting the checkpoint from the client or making an API call to get the block number.  The script could be included with the client, or even better, it could be built into the client so you click an [export blockchain > torrent] button and it generates the proper file based on the current checkpoint.
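A rough sketch of why every node ends up with the same bytes: the file is just the best-chain blocks in height order, each wrapped in a fixed framing, cut off at the checkpoint. This is not the actual mkbootstrap.py code. It assumes the usual block-file framing (4-byte mainnet magic, then a 4-byte little-endian length, then the raw block), and `get_block_at_height` is a hypothetical stand-in for a block database lookup:

```python
import io
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # assumed mainnet message-start bytes


def write_bootstrap(out, get_block_at_height, checkpoint_height):
    """Write blocks 0..checkpoint_height to `out` in height order.

    `get_block_at_height` is a hypothetical callable returning the raw
    serialized best-chain block at a given height.
    Returns the number of bytes written.
    """
    written = 0
    for height in range(checkpoint_height + 1):
        raw = get_block_at_height(height)
        record = MAINNET_MAGIC + struct.pack("<I", len(raw)) + raw
        out.write(record)
        written += len(record)
    return written


# Toy demonstration with fake "blocks": the chain has more blocks than the
# checkpoint, yet two independent runs produce identical output.
fake_chain = [b"genesis", b"block-1", b"block-2", b"block-3"]
run_a, run_b = io.BytesIO(), io.BytesIO()
write_bootstrap(run_a, fake_chain.__getitem__, checkpoint_height=2)
write_bootstrap(run_b, fake_chain.__getitem__, checkpoint_height=2)
assert run_a.getvalue() == run_b.getvalue()
```

Because the cutoff is a height rather than "whatever I have right now", blocks past the checkpoint never affect the output.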

Lots of interesting options.
kjj
Legendary
*
Offline Offline

Activity: 1302
Merit: 1025



View Profile
October 12, 2012, 06:19:10 PM
 #26

Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).

Why would a byte-for-byte copy be a "slightly different length"?

How does it know the length of the torrented file? (Note that it is "identical", not a copy.) Though from what jgarzik says in the post above, there is some kind of checkpointing that it either knows about or has fed into it?

Torrent works by breaking files up into pieces and hashing each piece.  The parts that you already have will have the same hash as the hashes in the seed, with the exception of the final piece.  Modern torrent clients will fetch that partial piece using the missing byte range, and then verify it with the hash.  And naturally, they will also grab all of the new pieces.
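A toy illustration of that piece-by-piece comparison, assuming SHA-1 per piece as in classic BitTorrent. The 4-byte piece length and the sample data are made up for readability; real torrents use far larger pieces:

```python
import hashlib


def piece_hashes(data, piece_length):
    """Split `data` into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


PIECE_LENGTH = 4  # toy value; real torrents use much larger pieces

old_file = b"AAAABBBBCC"            # two full pieces plus a 2-byte final piece
new_file = old_file + b"CCDDDDEE"   # same bytes, with more appended

old = piece_hashes(old_file, PIECE_LENGTH)
new = piece_hashes(new_file, PIECE_LENGTH)

# Count how many leading pieces of the old file verify against the new torrent.
matching = 0
for a, b in zip(old, new):
    if a != b:
        break
    matching += 1

print(f"{matching} of {len(old)} old pieces match; new torrent has {len(new)} pieces")
```

Only the old file's short final piece fails to match, because in the longer file that piece now covers additional bytes.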

17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8
I routinely ignore posters with paid advertising in their sigs.  You should too.
Peter Todd
Legendary
*
Offline Offline

Activity: 1120
Merit: 1150


View Profile
October 12, 2012, 06:36:59 PM
 #27

This is special to our use case:  bootstrap.dat is essentially an append-only file.  Blocks are simply concatenated onto the end.

Today's torrent at height 193000 is 2,491,771,562 bytes in size.

The next torrent, a few months from now, will have the same first 2,491,771,562 bytes.

Thus, to bittorrent, the next torrent will simply appear to be a truncated / not fully downloaded bootstrap.dat.  Bittorrent is built to fill in the missing pieces of a file, so that is what it does here :)

Does anyone know if bittorrent can share streams between multiple versions of the same file?

I mean, let's suppose we publish the torrent for the first x bytes, add y bytes to the file, then publish another torrent for the new version. Will people downloading the new, longer torrent be able to request blocks from people running clients that have only downloaded the shorter torrent? There does exist a Bittorrent streaming protocol, TS Engine, but as far as I can tell it's purely block based and doesn't efficiently handle the case where every client needs the whole stream, right from the beginning. I know internally bittorrent can identify blocks that it already has using a merkle tree system, but the tree can only have one tip. (1)

It's not a very important optimization for bitcoin, just publishing up to the latest checkpoint is fine for us even if old seeds aren't useful anymore, but I have an application where torrenting a file that is continuously being extended would be useful.

(1) Ironically the data I want to distribute via bittorrent in this fashion is a forest of merkle trees, exactly the sort of data structure that you could use to implement a continuously-appended-to torrent...

kjj
October 12, 2012, 06:38:48 PM
 #28

Yes, this is open data and an open file format.  Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py

I also have a PHP (!) script that parses the block chain and makes a clean sequential bootstrap.dat file.  It is ugly and slow, but I wanted to have an independent verification.  Our two scripts came up with identical files.

And by ugly, I mean embarrassingly ugly, like I'd be ashamed to let anyone see it.  If I have time this weekend, I'll clean it up and post it.

Richy_T
October 12, 2012, 07:26:04 PM
 #29


Now currently the script has 193,000 hardcoded

Ah, this is what I was talking about.

Richy_T
October 12, 2012, 07:38:51 PM
 #30


Torrent works by breaking files up into pieces and hashing each piece.  The parts that you already have will have the same hash as the hashes in the seed, with the exception of the final piece.  Modern torrent clients will fetch that partial piece using the missing byte range, and then verify it with the hash.  And naturally, they will also grab all of the new pieces.

Yes. I was talking about generating the file without having to download it. No point having torrents download the missing pieces when you already have them sitting in the blockchain on your hard-drive.

jgarzik (OP)
October 12, 2012, 07:45:10 PM
 #31

Does anyone know if bittorrent can share streams between multiple versions of the same file?

It depends on your definition of "share"... locally or remotely?

A single torrent is simply a hash-of-hashes.  Each stream is a different torrent, with different hashes, even if torrent A is a strict subset of torrent B.

Your client may be modified to share streams which are multiple versions of the same file.  Would probably only need some small mods to existing clients, and would not break the network protocol.  So from that perspective: "yes"

Remote clients will see different hashes, and assume that each stream is separate and independent of each other.  So from that perspective: "no"
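The "different hashes" point can be shown with a small sketch. It assumes the classic single-file torrent layout, where the torrent's identifier is the SHA-1 of the bencoded info dictionary holding the concatenated piece hashes. The tiny `bencode` helper, the 4-byte piece length, and the sample data are illustrative only:

```python
import hashlib


def bencode(value):
    """Minimal bencoder covering only the types used below (int, bytes, str, dict)."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, dict):
        # Dictionary keys are emitted in sorted order.
        items = sorted((k.encode("utf-8"), v) for k, v in value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"unsupported type: {type(value)!r}")


def torrent_id(data, name="bootstrap.dat", piece_length=4):
    """Hash the piece hashes (plus metadata) into one torrent identifier."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {"length": len(data), "name": name,
            "piece length": piece_length, "pieces": pieces}
    return hashlib.sha1(bencode(info)).hexdigest()


torrent_a = torrent_id(b"AAAABBBB")       # older, shorter file
torrent_b = torrent_id(b"AAAABBBBCCCC")   # same file with data appended
print(torrent_a != torrent_b)  # identifiers differ even though A is a prefix of B
```

Peers find each other by that identifier, so an unmodified client has no way to notice that the two swarms share their leading pieces.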


kjj
October 12, 2012, 07:47:59 PM
 #32


Torrent works by breaking files up into pieces and hashing each piece.  The parts that you already have will have the same hash as the hashes in the seed, with the exception of the final piece.  Modern torrent clients will fetch that partial piece using the missing byte range, and then verify it with the hash.  And naturally, they will also grab all of the new pieces.

Yes. I was talking about generating the file without having to download it. No point having torrents download the missing pieces when you already have them sitting in the blockchain on your hard-drive.

Don't think that will work.  Most bittorrent clients check the file size, etc.  It would be super cool if some torrent client would be willing to serve matching chunks out of a file, even if the overall file is wrong.  But none do that I'm aware of.

These are specially cleaned block files.  These are sequential, have no orphans, and no inter-block garbage.  For virtually everyone on the planet, the first N bytes of their actual block files won't match these.  jgarzik has already published his script, and I hope to publish mine soon.  You can use them to recreate the file from your block database without having to download it.

Peter Todd
October 12, 2012, 07:52:00 PM
 #33

Does anyone know if bittorrent can share streams between multiple versions of the same file?

It depends on your definition of "share"... locally or remotely?

Remotely

A single torrent is simply a hash-of-hashes.  Each stream is a different torrent, with different hashes, even if torrent A is a strict subset of torrent B.

Your client may be modified to share streams which are multiple versions of the same file.  Would probably only need some small mods to existing clients, and would not break the network protocol.  So from that perspective: "yes"

Remote clients will see different hashes, and assume that each stream is separate and independent of each other.  So from that perspective: "no"

Hmm... that's pretty much what I expected. Anyway I thought about it some more, and I think I have a way for my application to even deal with divergent versions of the file, really divergent trees, which bittorrent *definitely* doesn't support. It'd be a very nice feature, so at that point I might as well just bite the bullet and hack bittorrent as required. (or invent Yet Another Peer-to-Peer Network)

foo
Sr. Member
****
Offline Offline

Activity: 409
Merit: 250



View Profile
October 12, 2012, 10:35:04 PM
 #34

Good idea, except the trackerless part, IMHO. Here's a magnet link to the same torrent, with 4 public trackers added:

magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a&dn=bootstrap.dat&tr=udp://tracker.openbittorrent.com:80&tr=udp://tracker.publicbt.com:80&tr=udp://tracker.ccc.de:80&tr=udp://tracker.istole.it:80
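For anyone who wants to check what the link carries before handing it to a client, the fields can be pulled apart with the Python standard library. A small sketch; the variable names are arbitrary:

```python
from urllib.parse import urlparse, parse_qs

MAGNET = (
    "magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a"
    "&dn=bootstrap.dat"
    "&tr=udp://tracker.openbittorrent.com:80"
    "&tr=udp://tracker.publicbt.com:80"
    "&tr=udp://tracker.ccc.de:80"
    "&tr=udp://tracker.istole.it:80"
)

params = parse_qs(urlparse(MAGNET).query)
btih = params["xt"][0].rsplit(":", 1)[-1]  # hex digest after "urn:btih:"
name = params["dn"][0]                     # suggested display name
trackers = params["tr"]                    # one entry per tr= parameter

print(btih, name, len(trackers))
```

The `xt` value identifies the torrent itself; the `tr` entries only add places to look for peers.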

I know this because Tyler knows this.
Richy_T
October 13, 2012, 01:33:50 AM
 #35

Good idea, except the trackerless part, IMHO. Here's a magnet link to the same torrent, with 4 public trackers added:

magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a&dn=bootstrap.dat&tr=udp://tracker.openbittorrent.com:80&tr=udp://tracker.publicbt.com:80&tr=udp://tracker.ccc.de:80&tr=udp://tracker.istole.it:80


Good deal. Unless a torrent is marked as private, the DHT will kick in if the trackers ever stop working (though I'm not sure if it's possible to nobble them?)

BitcoinBug
Full Member
***
Offline Offline

Activity: 196
Merit: 100


View Profile
October 13, 2012, 09:37:20 AM
 #36

Great idea, I am seeding with my 2Mb. Just a suggestion: I would love to be notified when a new torrent is created so I won't be seeding an obsolete one. Could someone create a mailing list for that purpose?
Boussac
Legendary
*
Offline Offline

Activity: 1220
Merit: 1015


e-ducat.fr


View Profile WWW
October 13, 2012, 10:23:24 AM
 #37

Great idea, I am seeding with my 2Mb. Just a suggestion: I would love to be notified when a new torrent is created so I won't be seeding an obsolete one. Could someone create a mailing list for that purpose?

+1, and many thanks to jgarzik for this useful development.

jgarzik (OP)
October 13, 2012, 03:35:04 PM
 #38

Great idea, I am seeding with my 2Mb. Just a suggestion: I would love to be notified when a new torrent is created so I won't be seeding an obsolete one. Could someone create a mailing list for that purpose?

The best place to watch is probably this thread, though I am open to other suggestions.


runeks
Legendary
*
Offline Offline

Activity: 980
Merit: 1008



View Profile WWW
October 13, 2012, 10:52:51 PM
 #39


Quote
I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.

Yes, this is open data and an open file format.  Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py

Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately.  A nice, decentralized solution :)


How do I use that script?

Should some of these variables reference my .bitcoin directory somehow?

Code:
NET_SETTINGS = {
    'mainnet' : {
        'log' : '/spare/tmp/mkbootstrap.log',
        'db' : '/spare/tmp/chaindb'
    },
    'testnet3' : {
        'log' : '/spare/tmp/mkbootstraptest.log',
        'db' : '/spare/tmp/chaintest'
    }
}

I get:

Code:
Traceback (most recent call last):
  File "mkbootstrap.py", line 36, in <module>
    log = Log.Log(SETTINGS['log'])
  File "/home/rune/Programming/pynode/Log.py", line 15, in __init__
    self.fh = open(filename, 'a+', 0)
IOError: [Errno 2] No such file or directory: '/spare/tmp/mkbootstrap.log'
jgarzik (OP)
October 13, 2012, 11:08:37 PM
Last edit: October 14, 2012, 12:21:34 AM by jgarzik
 #40


Quote
I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.

Yes, this is open data and an open file format.  Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py

Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately.  A nice, decentralized solution :)


How do I use that script?

Should some of these variables reference my .bitcoin directory somehow?

The .bitcoin directory is for a different app.

pynode is a full bitcoin client, separate from bitcoind.  The script mkbootstrap.py requires access to the pynode database, after you have downloaded all the blocks.

Sadly you do need to be a bit of a programmer to generate a bootstrap.dat file.
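As for the IOError in the traceback, it comes from opening a log file in a directory that does not exist on that machine. A sketch of building a NET_SETTINGS-style dictionary rooted at a directory you control and creating the parent directories first. The helper names and the base directory are hypothetical, and this only clears the path error: the chain database itself still has to be filled by pynode downloading the blocks.

```python
import os
import tempfile


def make_settings(base_dir):
    """Build a NET_SETTINGS-style dict rooted at `base_dir` (hypothetical layout)."""
    return {
        "mainnet": {
            "log": os.path.join(base_dir, "mkbootstrap.log"),
            "db": os.path.join(base_dir, "chaindb"),
        },
        "testnet3": {
            "log": os.path.join(base_dir, "mkbootstraptest.log"),
            "db": os.path.join(base_dir, "chaintest"),
        },
    }


def ensure_parent_dirs(settings):
    """Create the directory each configured path lives in, if it is missing.

    Only parent directories are created; what goes at the 'db' path
    itself is left to pynode.
    """
    for net in settings.values():
        for path in (net["log"], net["db"]):
            os.makedirs(os.path.dirname(path), exist_ok=True)


# Demonstration in a throwaway directory; substitute a real path in practice.
base = os.path.join(tempfile.mkdtemp(), "pynode-data")
NET_SETTINGS = make_settings(base)
ensure_parent_dirs(NET_SETTINGS)

# Opening the log for append now succeeds instead of raising IOError.
with open(NET_SETTINGS["mainnet"]["log"], "a") as fh:
    fh.write("log directory exists\n")
```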

