Richy_T
Legendary
Offline
Activity: 2604
Merit: 2296
1RichyTrEwPYjZSeAYxeiFBNnKC9UjC5k
|
|
October 12, 2012, 05:16:31 PM |
|
It will result in seeders needing to re-download a torrent for each update though.
It will result in seeders already having 90% of any new torrent, thus having to fill in the remaining 10%.
Yes, it's just that it's an extra step, not automatic. This is a limitation of the torrent protocol and not a criticism of this project, just pointing it out. I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.
Yes, this is open data and an open file format. Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py
Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately. A nice, decentralized solution.
Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).
|
|
|
|
TangibleCryptography
|
|
October 12, 2012, 05:21:42 PM |
|
Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).
Why would a byte for byte copy be a "slightly different length"?
|
|
|
|
jgarzik (OP)
Legendary
Offline
Activity: 1596
Merit: 1100
|
|
October 12, 2012, 05:31:09 PM |
|
Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately. A nice, decentralized solution.
Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).
Once the height of the next checkpoint is publicly known, everyone may run that script to independently generate bootstrap.dat with the exact same file size and SHA256 checksum.
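As a rough illustration of checking "same file size and SHA256 checksum", here is a minimal Python sketch. The helper names file_size_and_sha256 and same_file are made up for this example and are not part of mkbootstrap.py; the expected size and checksum would have to come from whoever publishes the torrent.

```python
import hashlib
import os


def file_size_and_sha256(path, chunk_size=1 << 20):
    """Return (size_in_bytes, hex_sha256) for the file at `path`.

    The file is read in chunks so a multi-gigabyte bootstrap.dat
    does not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def same_file(path, expected_size, expected_sha256):
    """True if the local file matches a published size and SHA-256 checksum."""
    size, sha = file_size_and_sha256(path)
    return size == expected_size and sha == expected_sha256.lower()
```

Two people who generate the file independently can simply compare the two values returned by file_size_and_sha256.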
|
Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own. Visit bloq.com / metronome.io Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
|
|
|
Richy_T
|
|
October 12, 2012, 05:35:44 PM |
|
Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).
Why would a byte for byte copy be a "slightly different length"?
How does it know the length of the torrented file? (Note that it is "identical", not a copy.) Though from what jgarzik says in the post above, there is some kind of checkpointing that it either knows or gets fed into it?
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
October 12, 2012, 05:42:25 PM |
|
Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).
Why would a byte for byte copy be a "slightly different length"?
How does it know the length of the torrented file? (Note that it is "identical", not a copy.) Though from what jgarzik says in the post above, there is some kind of checkpointing that it either knows or gets fed into it?
It doesn't need to know the length of the file. The script doesn't make a .dat file of your entire blockchain, just the part up through the last checkpoint. Currently the last checkpoint is block 193,000. If you run that script on any node it will produce the same file. Exactly the same file.
Now currently the script has 193,000 hardcoded, but I could see a future version either getting the checkpoint from the client or making an API call to get the block number. The script could be included with the client or, even better, built into the client, so you click an [export blockchain > torrent] button and it generates the proper file based on the current checkpoint. Lots of interesting options.
|
|
|
|
kjj
Legendary
Offline
Activity: 1302
Merit: 1026
|
|
October 12, 2012, 06:19:10 PM |
|
Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).
Why would a byte for byte copy be a "slightly different length"?
How does it know the length of the torrented file? (Note that it is "identical", not a copy.) Though from what jgarzik says in the post above, there is some kind of checkpointing that it either knows or gets fed into it?
Torrent works by breaking files up into pieces and hashing each piece. The parts that you already have will have the same hash as the hashes in the seed, with the exception of the final piece. Modern torrent clients will fetch that partial piece using the missing byte range, and then verify it with the hash. And naturally, they will also grab all of the new pieces.
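To make the piece-hash argument concrete, here is a small Python sketch, not taken from any torrent client. It hashes fixed-size pieces with SHA-1 (the piece hash used by the original BitTorrent protocol) and counts how many leading pieces of a longer file are already satisfied by a shorter file with the same prefix. The function names and the toy data are illustrative assumptions.

```python
import hashlib


def piece_hashes(data, piece_length):
    """Split `data` into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def reusable_pieces(old_data, new_data, piece_length):
    """Return (matching_leading_pieces, total_pieces_in_new_file)."""
    old = piece_hashes(old_data, piece_length)
    new = piece_hashes(new_data, piece_length)
    count = 0
    for old_hash, new_hash in zip(old, new):
        if old_hash != new_hash:
            break
        count += 1
    return count, len(new)


# Toy data: the "new" file is the "old" file plus appended bytes.
old_file = bytes(range(256)) * 10      # 2560 bytes, ends with a partial piece
new_file = old_file + b"\x01" * 1000   # 3560 bytes, same first 2560 bytes
print(reusable_pieces(old_file, new_file, piece_length=1024))  # prints (2, 4)
```

The two full pieces match. The old file's short final piece does not, because in the new file that piece has grown to full length, which is the "exception of the final piece" described above.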
|
17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8 I routinely ignore posters with paid advertising in their sigs. You should too.
|
|
|
Peter Todd
Legendary
Offline
Activity: 1120
Merit: 1160
|
|
October 12, 2012, 06:36:59 PM |
|
This is special to our use case: bootstrap.dat is essentially an append-only file. Blocks are simply concatenated onto the end. Today's torrent at height 193000 is 2,491,771,562 bytes in size. The next torrent, a few months from now, will have the same first 2,491,771,562 bytes. Thus, to bittorrent, the next torrent will simply appear to be a truncated / not fully downloaded bootstrap.dat. Bittorrent is built to fill in the missing pieces of a file, so that is what it does here.
Does anyone know if bittorrent can share streams between multiple versions of the same file? I mean, let's suppose we publish the torrent for the first x bytes, add y bytes to the file, then publish another torrent for the new version. Will people downloading the new, longer torrent be able to request blocks from people running clients that have only downloaded the shorter torrent?
There does exist a Bittorrent streaming protocol, TS Engine, but as far as I can tell it's purely block based and doesn't efficiently handle the case where every client needs the whole stream, right from the beginning. I know internally bittorrent can identify blocks that it already has using a merkle tree system, but the tree can only have one tip. (1)
It's not a very important optimization for bitcoin, just publishing up to the latest checkpoint is fine for us even if old seeds aren't useful anymore, but I have an application where torrenting a file that is continuously being extended would be useful.
(1) Ironically the data I want to distribute via bittorrent in this fashion is a forest of merkle trees, exactly the sort of data structure that you could use to implement a continuously-appended-to torrent...
|
|
|
|
kjj
|
|
October 12, 2012, 06:38:48 PM |
|
I also have a PHP (!) script that parses the block chain and makes a clean sequential bootstrap.dat file. It is ugly and slow, but I wanted to have an independent verification. Our two scripts came up with identical files. And by ugly, I mean embarrassingly ugly, like I'd be ashamed to let anyone see it. If I have time this weekend, I'll clean it up and post it.
|
|
|
|
Richy_T
|
|
October 12, 2012, 07:26:04 PM |
|
Now currently the script has 193,000 hardcoded
Ah, this is what I was talking about.
|
|
|
|
Richy_T
|
|
October 12, 2012, 07:38:51 PM |
|
Torrent works by breaking files up into pieces and hashing each piece. The parts that you already have will have the same hash as the hashes in the seed, with the exception of the final piece. Modern torrent clients will fetch that partial piece using the missing byte range, and then verify it with the hash. And naturally, they will also grab all of the new pieces.
Yes. I was talking about generating the file without having to download it. No point having torrents download the missing pieces when you already have them sitting in the blockchain on your hard-drive.
|
|
|
|
jgarzik (OP)
|
|
October 12, 2012, 07:45:10 PM |
|
Does anyone know if bittorrent can share streams between multiple versions of the same file?
It depends on your definition of "share"... locally or remotely?
A single torrent is simply a hash-of-hashes. Each stream is a different torrent, with different hashes, even if torrent A is a strict subset of torrent B.
Your client may be modified to share streams which are multiple versions of the same file. Would probably only need some small mods to existing clients, and would not break the network protocol. So from that perspective: "yes"
Remote clients will see different hashes, and assume that each stream is separate and independent of each other. So from that perspective: "no"
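A small Python sketch of the "hash-of-hashes" point, under simplifying assumptions: a single-file torrent, a hand-rolled minimal bencoder, and toy data. In the original BitTorrent protocol the torrent's identity (the info-hash) is the SHA-1 of the bencoded info dictionary, which includes the file length and the concatenated piece hashes. So a file that is a strict prefix of another still gets a different info-hash. The helper names bencode and info_hash are invented for this example.

```python
import hashlib


def bencode(value):
    """Minimal bencoder covering ints, bytes, str, lists and str-keyed dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # bencode requires dictionary keys in sorted (raw byte) order
        items = sorted((k.encode("utf-8"), v) for k, v in value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("unsupported type: %r" % type(value))


def info_hash(data, name, piece_length):
    """SHA-1 of the bencoded info dictionary for a single-file torrent."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {
        "length": len(data),
        "name": name,
        "piece length": piece_length,
        "pieces": pieces,
    }
    return hashlib.sha1(bencode(info)).hexdigest()


# Toy data: torrent A's content is a strict prefix of torrent B's content.
torrent_a = b"x" * 4096
torrent_b = torrent_a + b"y" * 4096
print(info_hash(torrent_a, "bootstrap.dat", 1024))
print(info_hash(torrent_b, "bootstrap.dat", 1024))
```

The two printed info-hashes differ even though the first four piece hashes are identical, which is why remote peers treat the two torrents as unrelated swarms.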
|
|
|
|
kjj
|
|
October 12, 2012, 07:47:59 PM |
|
Torrent works by breaking files up into pieces and hashing each piece. The parts that you already have will have the same hash as the hashes in the seed, with the exception of the final piece. Modern torrent clients will fetch that partial piece using the missing byte range, and then verify it with the hash. And naturally, they will also grab all of the new pieces.
Yes. I was talking about generating the file without having to download it. No point having torrents download the missing pieces when you already have them sitting in the blockchain on your hard-drive.
Don't think that will work. Most bittorrent clients check the file size, etc. It would be super cool if some torrent client would be willing to serve matching chunks out of a file, even if the overall file is wrong. But none do that I'm aware of.
These are specially cleaned block files. These are sequential, have no orphans, and no inter-block garbage. For virtually everyone on the planet, the first N bytes of their actual block files won't match these.
jgarzik has already published his script, and I hope to publish mine soon. You can use them to recreate the file from your block database without having to download it.
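For readers wondering what a "clean sequential" file looks like on disk, here is a hedged Python sketch. It assumes the usual block-file record layout: 4 bytes of network magic (f9 be b4 d9 on mainnet), a 4-byte little-endian block length, then the raw block, repeated with nothing in between. The helpers and the fake block payloads are invented for illustration; this is not the code from either script mentioned in this thread.

```python
import io
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # mainnet message-start bytes


def write_record(out, raw_block, magic=MAINNET_MAGIC):
    """Append one record: magic, little-endian length, raw block bytes."""
    out.write(magic)
    out.write(struct.pack("<I", len(raw_block)))
    out.write(raw_block)


def read_records(stream, magic=MAINNET_MAGIC):
    """Yield raw blocks from a stream of back-to-back records."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of file
        if len(header) < 8 or header[:4] != magic:
            raise ValueError("truncated or misaligned record header")
        (size,) = struct.unpack("<I", header[4:])
        block = stream.read(size)
        if len(block) != size:
            raise ValueError("truncated block")
        yield block


# Round-trip two placeholder payloads (not real blocks) through the format.
buf = io.BytesIO()
for fake_block in (b"block-0", b"block-1-longer"):
    write_record(buf, fake_block)
buf.seek(0)
print(list(read_records(buf)))
```

Because the records are written strictly in height order with no gaps, appending more blocks later leaves every earlier byte unchanged, which is what makes the file append-only from the torrent's point of view.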
|
|
|
|
Peter Todd
|
|
October 12, 2012, 07:52:00 PM |
|
Does anyone know if bittorrent can share streams between multiple versions of the same file?
It depends on your definition of "share"... locally or remotely?
Remotely.
A single torrent is simply a hash-of-hashes. Each stream is a different torrent, with different hashes, even if torrent A is a strict subset of torrent B.
Your client may be modified to share streams which are multiple versions of the same file. Would probably only need some small mods to existing clients, and would not break the network protocol. So from that perspective: "yes"
Remote clients will see different hashes, and assume that each stream is separate and independent of each other. So from that perspective: "no"
Hmm... that's pretty much what I expected. Anyway I thought about it some more, and I think I have a way for my application to even deal with divergent versions of the file, really divergent trees, which bittorrent *definitely* doesn't support. It'd be a very nice feature, so at that point I might as well just bite the bullet and hack bittorrent as required. (or invent Yet Another Peer-to-Peer Network)
|
|
|
|
foo
|
|
October 12, 2012, 10:35:04 PM |
|
Good idea, except the trackerless part, IMHO. Here's a magnet link to the same torrent, with 4 public trackers added:
magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a&dn=bootstrap.dat&tr=udp://tracker.openbittorrent.com:80&tr=udp://tracker.publicbt.com:80&tr=udp://tracker.ccc.de:80&tr=udp://tracker.istole.it:80
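For anyone who wants to check what the link above carries before pasting it into a client, here is a short Python sketch using only the standard library. parse_magnet is a made-up helper name; the URI is the one posted above.

```python
from urllib.parse import urlsplit, parse_qs

MAGNET = (
    "magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a"
    "&dn=bootstrap.dat"
    "&tr=udp://tracker.openbittorrent.com:80"
    "&tr=udp://tracker.publicbt.com:80"
    "&tr=udp://tracker.ccc.de:80"
    "&tr=udp://tracker.istole.it:80"
)


def parse_magnet(uri):
    """Extract the info-hash, display name and tracker list from a magnet URI."""
    parts = urlsplit(uri)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet URI")
    params = parse_qs(parts.query)
    xt = params.get("xt", [""])[0]
    prefix = "urn:btih:"
    if not xt.startswith(prefix):
        raise ValueError("no BitTorrent info-hash in magnet URI")
    return {
        "info_hash": xt[len(prefix):].lower(),
        "name": params.get("dn", [None])[0],
        "trackers": params.get("tr", []),
    }


info = parse_magnet(MAGNET)
print(info["info_hash"], info["name"], len(info["trackers"]))
```

The info-hash is what identifies the torrent. The tr entries are only hints about where to find peers, so adding or removing trackers does not change which torrent is being shared.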
|
I know this because Tyler knows this.
|
|
|
Richy_T
|
|
October 13, 2012, 01:33:50 AM |
|
Good idea, except the trackerless part, IMHO. Here's a magnet link to the same torrent, with 4 public trackers added:
magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a&dn=bootstrap.dat&tr=udp://tracker.openbittorrent.com:80&tr=udp://tracker.publicbt.com:80&tr=udp://tracker.ccc.de:80&tr=udp://tracker.istole.it:80
Good deal. Unless a torrent is marked as private, the dht will kick in if the trackers ever stop working (though I'm not sure if it's possible to nobble them?)
|
|
|
|
BitcoinBug
|
|
October 13, 2012, 09:37:20 AM |
|
Great idea, I am seeding with my 2Mb. Just a suggestion: I would love to be notified when a new torrent is created so I won't be seeding an obsolete one. Could someone create a mailing list for that purpose?
|
|
|
|
Boussac
Legendary
Offline
Activity: 1221
Merit: 1025
e-ducat.fr
|
|
October 13, 2012, 10:23:24 AM |
|
Great idea, I am seeding with my 2Mb. Just a suggestion: I would love to be notified when a new torrent is created so I won't be seeding an obsolete one. Could someone create a mailing list for that purpose?
+1 and many thanks to jgarzik for this useful development
|
|
|
|
jgarzik (OP)
|
|
October 13, 2012, 03:35:04 PM |
|
Great idea, I am seeding with my 2Mb. Just a suggestion: I would love to be notified when a new torrent is created so I won't be seeding an obsolete one. Could someone create a mailing list for that purpose?
The best place to watch is probably this thread, though I am open to other suggestions.
|
|
|
|
runeks
Legendary
Offline
Activity: 980
Merit: 1008
|
|
October 13, 2012, 10:52:51 PM |
|
I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.
Yes, this is open data and an open file format. Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py
Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately. A nice, decentralized solution.
How do I use that script? Should some of these variables reference my .bitcoin directory somehow?
NET_SETTINGS = {
    'mainnet' : {
        'log' : '/spare/tmp/mkbootstrap.log',
        'db' : '/spare/tmp/chaindb'
    },
    'testnet3' : {
        'log' : '/spare/tmp/mkbootstraptest.log',
        'db' : '/spare/tmp/chaintest'
    }
}
I get:
Traceback (most recent call last):
  File "mkbootstrap.py", line 36, in <module>
    log = Log.Log(SETTINGS['log'])
  File "/home/rune/Programming/pynode/Log.py", line 15, in __init__
    self.fh = open(filename, 'a+', 0)
IOError: [Errno 2] No such file or directory: '/spare/tmp/mkbootstrap.log'
|
|
|
|
jgarzik (OP)
|
|
October 13, 2012, 11:08:37 PM Last edit: October 14, 2012, 12:21:34 AM by jgarzik |
|
I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.
Yes, this is open data and an open file format. Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py
Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately. A nice, decentralized solution.
How do I use that script? Should some of these variables reference my .bitcoin directory somehow?
The .bitcoin directory is for a different app. pynode is a full bitcoin client, separate from bitcoind. The script mkbootstrap.py requires access to the pynode database, after you have downloaded all the blocks.
Sadly you do need to be a bit of a programmer to generate a bootstrap.dat file.
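The IOError in the traceback above comes from the hardcoded /spare/tmp paths not existing on the machine running the script. As a hedged sketch only, the settings could be built from a directory you choose, with the log directory created up front. make_net_settings and the example base directory are hypothetical and not part of pynode. The 'db' entries must still point at a database that pynode itself has already filled with the full chain, so this only addresses the log-file error.

```python
import os


def make_net_settings(base_dir):
    """Build a NET_SETTINGS-style dict rooted at `base_dir` (hypothetical helper).

    Only the directory that will hold the log files is created here.
    The 'db' paths are left for pynode to create and populate.
    """
    settings = {
        'mainnet': {
            'log': os.path.join(base_dir, 'mkbootstrap.log'),
            'db': os.path.join(base_dir, 'chaindb'),
        },
        'testnet3': {
            'log': os.path.join(base_dir, 'mkbootstraptest.log'),
            'db': os.path.join(base_dir, 'chaintest'),
        },
    }
    for net in settings.values():
        # Make sure open(log_path, 'a+') will not fail with "No such file or directory"
        os.makedirs(os.path.dirname(net['log']), exist_ok=True)
    return settings


# Example (the path is an assumption, pick any writable location):
# NET_SETTINGS = make_net_settings(os.path.expanduser('~/pynode-data'))
```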
|
|
|
|
|