Pieter Wuille (OP)
|
|
October 20, 2012, 10:37:51 PM |
|
(copy of mailing list post)
I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work). This is a very significant change, and all testing is certainly welcome. As a result of this, many pull requests probably don't apply cleanly anymore. If you need help rebasing them on the new structure, ask me.
The idea behind ultraprune is to use an ultra-pruned copy (only unspent transaction outputs in a custom compact format) of the block chain for validation (as opposed to a transaction index into the block chain). It still keeps all blocks around for serving them to other nodes, for rescanning, and for reorganisations. As such, it is still a full node. So, despite the name, it does not implement any actual pruning yet, though pruning would be trivial to implement now. This would have profound effects on the network though, so it may still need some discussion first.
A small summary of the changes:
- Instead of blk000?.dat, we have blocks/blk000??.dat files of max 128 MiB, pre-allocated per 16 MiB
- Instead of a Berkeley DB blkindex.dat, we have a LevelDB directory blktree/. This only contains a block index, no transaction index.
- A new LevelDB directory coins/, which contains data about the current unspent transaction output set.
- New files blocks/rev000??.dat contain undo data for blocks (necessary for reorganisation).
- More information is kept about blocks and block files, to facilitate pruning in the future, and to prepare for a headers-first mode.
- Two new RPC calls are added: gettxout and gettxoutsetinfo.
The most noticeable change should be performance: LevelDB deals much better with slow I/O than BDB does, and the working set size for validation is an order of magnitude smaller. In the longer run, I think it is an evolution towards separation between validation nodes and archive nodes, which is needed in my opinion.
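To make the summary above concrete, here is a minimal sketch of validating against an unspent-output set and keeping undo data for reorganisations. It is an illustration only: the `Tx` and `UtxoSet` types, the in-memory dict, and the method names are assumptions made for this example, and do not reflect the actual compact format, the LevelDB layout, or the rev000??.dat contents.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified types for illustration; not Bitcoin's real formats.
OutPoint = tuple[str, int]  # (txid, output index)


@dataclass
class Tx:
    txid: str
    inputs: list[OutPoint]  # outputs this transaction spends
    outputs: list[int]      # output values only; scripts are omitted


@dataclass
class UtxoSet:
    coins: dict[OutPoint, int] = field(default_factory=dict)

    def connect_block(self, txs: list[Tx]) -> list[tuple[OutPoint, int]]:
        """Apply a block and return undo data (the coins the block spent)."""
        coins = dict(self.coins)  # work on a copy so a bad block changes nothing
        undo = []
        for tx in txs:
            for op in tx.inputs:
                if op not in coins:
                    raise ValueError(f"missing or already spent output: {op}")
                undo.append((op, coins.pop(op)))
            for n, value in enumerate(tx.outputs):
                coins[(tx.txid, n)] = value
        self.coins = coins
        return undo

    def disconnect_block(self, txs: list[Tx], undo: list[tuple[OutPoint, int]]) -> None:
        """Reverse connect_block, as needed during a reorganisation."""
        # Restore spent coins first, so outputs created and spent inside
        # the same block are present again before they are deleted below.
        for op, value in undo:
            self.coins[op] = value
        for tx in txs:
            for n in range(len(tx.outputs)):
                del self.coins[(tx.txid, n)]
```

Only the set of unspent outputs is consulted while connecting a block, which is why the working set can be so much smaller than a full transaction index. The undo list is what lets a block be rolled back without re-reading older blocks.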
|
I do Bitcoin stuff.
|
|
|
Syke
Legendary
Offline
Activity: 3878
Merit: 1193
|
|
October 21, 2012, 02:25:44 AM |
|
I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work).
Does this require downloading and re-processing the blockchain from the beginning?
|
Buy & Hold
|
|
|
jgarzik
Legendary
Offline
Activity: 1596
Merit: 1100
|
|
October 21, 2012, 02:35:13 AM |
|
I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work).
Does this require downloading and re-processing the blockchain from the beginning?
Yes. However, to save downloading, you may provide
-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat
to import the old data files into the new bitcoin database backend (ultraprune/leveldb).
* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.
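For anyone with more than two old block files, the argument list can be generated rather than typed by hand. The helper below is a small sketch: the function name `loadblock_args` is made up for this example, and it assumes the old files follow the four-digit blk0001.dat naming shown above. It only builds the arguments and does not start the client.

```python
from pathlib import Path


def loadblock_args(old_datadir: str) -> list[str]:
    """Build one -loadblock=<file> argument per old blkNNNN.dat file.

    The four-digit pattern skips blkindex.dat and anything that is not
    an old-style block file.
    """
    files = sorted(Path(old_datadir).glob("blk[0-9][0-9][0-9][0-9].dat"))
    return [f"-loadblock={f}" for f in files]


if __name__ == "__main__":
    import sys

    # Usage: python loadblock_args.py /path/to/old/datadir
    print(" ".join(loadblock_args(sys.argv[1])))
```

The printed arguments can then be appended to the usual command line of the new client.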
|
Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own. Visit bloq.com / metronome.io Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
|
|
|
Syke
Legendary
Offline
Activity: 3878
Merit: 1193
|
|
October 21, 2012, 03:41:48 AM |
|
Yes. However, to save downloading, you may provide
-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat
to import the old data files into the new bitcoin database backend (ultraprune/leveldb).
* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.
Excellent work.
|
|
|
|
Stephen Gornick
Legendary
Offline
Activity: 2506
Merit: 1010
|
|
October 21, 2012, 04:58:12 AM |
|
- Instead of a Berkeley DB blkindex.dat, we have a LevelDB directory blktree/. This only contains a block index, no transaction index.
So is BDB still used at all? (e.g, for peers.dat and wallet.dat? ) Or will that likely be changing with the upcoming release that includes ultraprune?
|
|
|
|
Atlas
Jr. Member
Offline
Activity: 56
Merit: 1
|
|
October 21, 2012, 06:11:01 AM |
|
I recommend not releasing this until it has been thoroughly tested and analysed for at least 6 months.
|
|
|
|
jgarzik
Legendary
Offline
Activity: 1596
Merit: 1100
|
|
October 21, 2012, 06:47:03 AM |
|
So is BDB still used at all? (e.g, for peers.dat and wallet.dat? ) Or will that likely be changing with the upcoming release that includes ultraprune?
peers.dat is a flat file with a bitcoin-specific file format, unrelated to any database system. wallet.dat remains BDB, though there are proposals on changing that.
|
|
|
|
|
kokjo
Legendary
Offline
Activity: 1050
Merit: 1000
You are WRONG!
|
|
October 21, 2012, 09:53:11 AM |
|
testing it now!!
|
"The whole problem with the world is that fools and fanatics are always so certain of themselves and wiser people so full of doubts." -Bertrand Russell
|
|
|
paraipan
In memoriam
Legendary
Offline
Activity: 924
Merit: 1004
Firstbits: 1pirata
|
|
October 21, 2012, 11:18:32 AM |
|
Yes. However, to save downloading, you may provide
-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat
to import the old data files into the new bitcoin database backend (ultraprune/leveldb).
* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.
Excellent work.
+1 Will test
|
BTCitcoin: An Idea Worth Saving - Q&A with bitcoins on rugatu.com - Check my rep
|
|
|
LightRider
Legendary
Offline
Activity: 1500
Merit: 1022
I advocate the Zeitgeist Movement & Venus Project.
|
|
October 21, 2012, 11:19:16 AM |
|
Will there be win32 compiles of this any time soon?
|
|
|
|
niko
|
|
October 21, 2012, 02:53:30 PM |
|
Will there be win32 compiles of this any time soon?
Yes, please.
|
They're there, in their room. Your mining rig is on fire, yet you're very calm.
|
|
|
Pieter Wuille (OP)
|
|
October 21, 2012, 03:01:46 PM |
|
An answer to MysteryMiner, who asked in another thread:
One more question - will the new database discard spent addresses? Some places says it will, some says it will not. I am confused. What will happen to clients that rely on downloading the complete transaction history and verify all blocks and transactions in them on-the-way, like 0.3.xx does?
The current code does not prune anything - it uses a pruned copy (in addition to the blockchain itself) for validation. Since this copy is much smaller, far less data needs to be accessed during block and transaction validation (it's around 120 MB right now). This makes it faster to validate and to update the database.
Also, Bitcoin at the protocol level does not know anything about addresses or balances - those are client-side things provided by the wallet abstractions. What we're talking about is removing individual transaction outputs that have been spent.
At some later point in time we may add actual pruning, by removing blocks (but not their unspent outputs in the pruned copy) that are old enough. This will imply they cannot be served to other nodes, cannot be rescanned, and cannot be reorganised away. Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore. This is why I believe in a move towards validation nodes and archive nodes.
Also, Bitcoin is a zero-trust system (at least full nodes are). This means that no data ever received from the network is ever taken for granted, and needs validation. This implies you can't ever bootstrap a (zero trust) node without having it validate the entire block chain (although it is not necessary that everyone keeps that data around forever).
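The trade-off described above can be sketched in a few lines. Nothing here exists in the merged code, which keeps every block. The `PrunedNode` class and its `keep` policy (the number of most recent blocks kept on disk) are hypothetical, and only illustrate what a node gives up once old blocks are deleted.

```python
from dataclasses import dataclass


@dataclass
class PrunedNode:
    tip_height: int  # height of the current best block
    keep: int        # hypothetical policy: most recent blocks kept on disk

    def lowest_stored(self) -> int:
        """Height of the oldest block still on disk."""
        return max(0, self.tip_height - self.keep + 1)

    def can_serve(self, height: int) -> bool:
        """A block can only be sent to a peer (or rescanned) if it is stored."""
        return self.lowest_stored() <= height <= self.tip_height

    def can_reorg_from(self, fork_height: int) -> bool:
        """Every block above the fork point must still be stored to undo it."""
        return fork_height + 1 >= self.lowest_stored()

    def can_bootstrap_new_node(self) -> bool:
        """Only a node holding the chain from the genesis block can do this."""
        return self.lowest_stored() == 0
```

In this sketch a validation node would be one with a small `keep`, and an archive node one where `keep` exceeds the chain height, so `can_bootstrap_new_node()` stays true.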
|
|
|
|
casascius
Mike Caldwell
VIP
Legendary
Offline
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
|
|
October 21, 2012, 03:47:24 PM Last edit: October 21, 2012, 04:00:39 PM by casascius |
|
Here's how it ought to work in my mind: The user ought to have a simple way to decide what he wants to contribute to the network, with the default being something that ensures that the user remains a "full citizen node" but perhaps without automatically seeding large amounts of history without the user's consent. I imagine having four or five settings, but a real implementation will probably expound on the idea.
(I realize that this is a thread about "ultraprune" and my examples mention "metatree", but please see past that - I am only presenting a 30,000-foot-level view of how I imagine this working)
What the other settings might be:
MINIMAL:
* Recommended for low-bandwidth or high-cost network connections.
* No incoming connections from peers allowed.
* Downloaded data set consists only of the minimum necessary to determine the latest block.
* Information about balances queried from peers on an as-needed basis
* Lowest possible security. Add trusted peers to the preferred peer list whenever possible.
LOW:
* No incoming connections from peers allowed.
* A pruned dataset is downloaded and maintained.
MEDIUM: (this would be the default setting)
* Incoming connections from peers allowed
* A pruned dataset is downloaded and maintained.
* Peers may download the dataset up to the configured upload limit
MEDIUM-HIGH: see image...
HIGH:
* Incoming connections from peers allowed
* Accepts metatree queries from peers, and seeds historical versions of metatree to assist in recovery/rollback if needed
* Full transaction history is maintained (requires XX GB, which increases over time)
* Allows peers to download the data set up to the configured bandwidth limit.
* Full network citizen/historian which assists in allowing other nodes to recover the entire network history in case recovery is needed
* Recommended setting for mining nodes wherever feasible
Ideally, if all of these modes were implemented, a new installation could start running in the "MINIMAL" mode regardless of user choice so it is instantly usable without a day of downloading, and then slowly upgrade itself to the level of the user's choice as objects are downloaded and verified.
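The settings listed above could be captured as a small configuration table. This is only a sketch of the proposal, not anything in the client: the `NodeMode` fields, the `Dataset` names, and `upgrade_path` are invented for illustration. MEDIUM-HIGH is left out because the post only refers to an image for it, and `serves_peers` is assumed false wherever the post does not mention peers downloading data.

```python
from dataclasses import dataclass
from enum import Enum


class Dataset(Enum):
    TIP_ONLY = "minimum needed to determine the latest block"
    PRUNED = "pruned dataset"
    FULL = "full transaction history"


@dataclass(frozen=True)
class NodeMode:
    accepts_incoming: bool  # incoming connections from peers allowed
    dataset: Dataset        # what the node downloads and maintains
    serves_peers: bool      # peers may download data, up to a configured limit


# Hypothetical encoding of the modes described in the post.
MODES = {
    "MINIMAL": NodeMode(False, Dataset.TIP_ONLY, False),
    "LOW": NodeMode(False, Dataset.PRUNED, False),
    "MEDIUM": NodeMode(True, Dataset.PRUNED, True),
    "HIGH": NodeMode(True, Dataset.FULL, True),
}
DEFAULT_MODE = "MEDIUM"
UPGRADE_ORDER = ["MINIMAL", "LOW", "MEDIUM", "HIGH"]


def upgrade_path(target: str) -> list[str]:
    """Modes a fresh install would pass through, starting from MINIMAL."""
    return UPGRADE_ORDER[: UPGRADE_ORDER.index(target) + 1]
```

A new installation targeting the default would then be usable immediately in MINIMAL and step up through `upgrade_path(DEFAULT_MODE)` as data is downloaded and verified.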
|
Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable. I never believe them. If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins. I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion. Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice. Don't keep coins online. Use paper or hardware wallets instead.
|
|
|
Come-from-Beyond
Legendary
Offline
Activity: 2142
Merit: 1010
Newbie
|
|
October 21, 2012, 03:56:09 PM |
|
Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore.
What will prevent it? If the changes r good then everyone will use new approach.
|
|
|
|
dree12
Legendary
Offline
Activity: 1246
Merit: 1077
|
|
October 21, 2012, 04:00:19 PM |
|
Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore.
What will prevent it? If the changes r good then everyone will use new approach.
We could financially reward nodes that participate with the transaction fees. I think this is a good approach, as long as a secure way of doing it is found.
|
|
|
|
Pieter Wuille (OP)
|
|
October 21, 2012, 04:16:21 PM |
|
Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore.
What will prevent it? If the changes r good then everyone will use new approach.
Why do people today run a full node like Bitcoind or Bitcoin-Qt when they could run an SPV node like MultiBit or not a node at all, like webwallets or Electrum?
|
|
|
|
Come-from-Beyond
Legendary
Offline
Activity: 2142
Merit: 1010
Newbie
|
|
October 21, 2012, 04:19:48 PM |
|
Why do people today run a full node like Bitcoind or Bitcoin-Qt when they could run an SPV node like MultiBit or not a node at all, like webwallets or Electrum?
Ok
|
|
|
|
flipperfish
Sr. Member
Offline
Activity: 350
Merit: 251
Dolphie Selfie
|
|
October 21, 2012, 04:26:18 PM |
|
At some later point in time we may add actual pruning, by removing blocks (but not their unspent outputs in the pruned copy) that are old enough. This will imply they cannot be served to other nodes, cannot be rescanned, and cannot be reorganised away. Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore. This is why I believe in a move towards validation nodes and archive nodes.
Also, Bitcoin is a zero-trust system (at least full nodes are). This means that no data ever received from the network is ever taken for granted, and needs validation. This implies you can't ever bootstrap a (zero trust) node without having it validate the entire block chain (although it is not necessary that everyone keeps that data around forever).
Maybe it is a good idea to define some of the terms used here (maybe in the wiki?). It can be very confusing to read different terms for the same thing and the same words for different things, especially if you're not deeply involved in the ongoing development. Also I think "ultraprune" should really be renamed, as it does not prune, but rather lays the foundation for pruning. I would suggest calling it "historic data separation" or "blockchain validation data optimization" as this is what it does.
As far as I can tell, these are the terms used in the recent posts about this topic. Please correct me if I'm wrong:
- Pruning: To remove all transactions whose outputs have been spent.
- Full Node: A bitcoin-client which stores only the data needed to validate new transactions within the network. It has seen the complete blockchain history at some previous time and can thus be sure that its current validation data is correct.
- Archiving Node: A bitcoin-client, which stores all data from the beginning of the blockchain. Can serve the whole blockchain to other nodes. Needed for bootstrapping new nodes without trust to anything else.
- Light Node: A bitcoin-client, which does not store any data and has to trust another Full or Archiving Node.
- Zero Trust Node: A bitcoin-client which can validate new transactions within the network without having to trust anything besides the blockchain. Full Nodes and Archiving Nodes are Zero Trust Nodes.
|
|
|
|
casascius
Mike Caldwell
VIP
Legendary
Offline
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
|
|
October 21, 2012, 04:26:33 PM |
|
Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore.
What will prevent it? If the changes r good then everyone will use new approach.
Why do people today run a full node like Bitcoind or Bitcoin-Qt when they could run an SPV node like MultiBit or not a node at all, like webwallets or Electrum?
I would agree with this point, by comparison to the following: why do people seed torrents of pirated content, even in spite of an obvious legal risk of doing so? If rational self interest were pure, all-knowing, and all-selfish, The Pirate Bay should have collapsed on its own by now due to lack of seeders.
|
|
|
|
|