Bitcoin Forum
Author Topic: Ultraprune merged in mainline  (Read 25396 times)
Pieter Wuille (OP)
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1174


View Profile WWW
October 20, 2012, 10:37:51 PM
Merited by ABCbits (6)
 #1

(copy of mailinglist post)

I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work). This is a very significant change, and all testing is certainly welcome. As a result of this, many pull requests probably don't apply cleanly anymore. If you need help rebasing them on the new structure, ask me.

The idea behind ultraprune is to use an ultra-pruned copy (only unspent transaction outputs in a custom compact format) of the block chain for validation (as opposed to a transaction index into the block chain). It still keeps all blocks around for serving them to other nodes, for rescanning, and for reorganisations. As such, it is still a full node. So, despite the name, it does not implement any actual pruning yet, though pruning would be trivial to implement now. This would have profound effects on the network though, so may still need some discussion first.
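
To make the idea concrete, here is a minimal Python sketch of validating against a set of unspent outputs rather than a transaction index. The names (`OutPoint`, `UtxoSet`, `apply_transaction`) and the simplified data shapes are illustrative assumptions, not the structures used in the branch; scripts, signatures and coinbase transactions are left out entirely.

```python
from dataclasses import dataclass, field


# Hypothetical, simplified types for illustration only.
@dataclass(frozen=True)
class OutPoint:
    txid: str   # id of the transaction that created the output
    index: int  # position of the output within that transaction


@dataclass
class UtxoSet:
    # Maps each unspent outpoint to its value; spent outputs are simply absent.
    coins: dict = field(default_factory=dict)

    def apply_transaction(self, txid, inputs, output_values):
        """Spend `inputs` (a list of OutPoint) and add the new outputs.

        Raises ValueError if an input is missing (already spent or never
        created), listed twice, or if the outputs are worth more than
        the inputs.
        """
        missing = [op for op in inputs if op not in self.coins]
        if missing:
            raise ValueError(f"missing or already spent: {missing}")
        if len(set(inputs)) != len(inputs):
            raise ValueError("duplicate input")
        total_in = sum(self.coins[op] for op in inputs)
        if sum(output_values) > total_in:
            raise ValueError("outputs exceed inputs")
        # Only after all checks pass is the set modified.
        for op in inputs:
            del self.coins[op]
        for i, value in enumerate(output_values):
            self.coins[OutPoint(txid, i)] = value
```

The point of the sketch is that validation only ever looks up entries in `coins`; nothing about already spent outputs has to be kept in the working set.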

A small summary of the changes:
  • Instead of blk000?.dat, we have blocks/blk000??.dat files of at most 128 MiB each, pre-allocated in 16 MiB chunks.
  • Instead of a Berkeley DB blkindex.dat, we have a LevelDB directory blktree/. This only contains a block index, no transaction index.
  • A new LevelDB directory coins/, which contains data about the current unspent transaction output set.
  • New files blocks/rev000??.dat contain undo data for blocks (necessary for reorganisation).
  • More information is kept about blocks and block files, to facilitate pruning in the future, and to prepare for a headers-first mode.
  • Two new RPC calls are added: gettxout and gettxoutsetinfo.
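
The role of the undo data can be sketched as follows. This is a rough Python illustration under assumed data shapes (a dict of unspent outputs, blocks as lists of tuples); the function names are hypothetical and the real rev files use their own compact format.

```python
def connect_block(coins, block):
    """Apply `block` to the unspent-output dict `coins` and return undo data.

    `block` is a list of (txid, spent_outpoints, output_values) tuples, where
    an outpoint is a (txid, index) pair. The undo data holds, per transaction,
    every output that transaction spent, so the block can be reverted later.
    """
    undo = []
    for txid, spent, output_values in block:
        # pop() raises KeyError if an outpoint is not in the unspent set.
        tx_undo = [(outpoint, coins.pop(outpoint)) for outpoint in spent]
        for index, value in enumerate(output_values):
            coins[(txid, index)] = value
        undo.append(tx_undo)
    return undo


def disconnect_block(coins, block, undo):
    """Revert `block`, walking its transactions in reverse order."""
    for (txid, _spent, output_values), tx_undo in zip(reversed(block), reversed(undo)):
        # Remove the outputs this transaction created...
        for index in range(len(output_values)):
            del coins[(txid, index)]
        # ...and restore the outputs it spent.
        for outpoint, value in tx_undo:
            coins[outpoint] = value
```

Without the undo data, reverting a block would require looking up the spent outputs in older blocks, which is exactly the transaction index the new layout avoids.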

The most noticeable change should be performance: LevelDB deals much better with slow I/O than BDB does, and the working set size for validation is an order of magnitude smaller. In the longer run, I think it is an evolution towards separation between validation nodes and archive nodes, which is needed in my opinion.

I do Bitcoin stuff.
Syke
Legendary
*
Offline Offline

Activity: 3878
Merit: 1193


View Profile
October 21, 2012, 02:25:44 AM
 #2

I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work).

Does this require downloading and re-processing the blockchain from the beginning?

Buy & Hold
jgarzik
Legendary
*
qt
Offline Offline

Activity: 1596
Merit: 1091


View Profile
October 21, 2012, 02:35:13 AM
 #3

I've just merged my "ultraprune" branch into mainline (including Mike's LevelDB work).

Does this require downloading and re-processing the blockchain from the beginning?

Yes.  However, to save downloading, you may provide
Code:
-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat

to import the old data files into the new bitcoin database backend (ultraprune/leveldb).

* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.
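
For anyone with more than a couple of old block files, the arguments can be generated rather than typed. A small Python sketch, assuming the old files sit directly in the data directory and follow the blk0001.dat, blk0002.dat naming shown above; the helper name `loadblock_args` is made up for this example.

```python
from pathlib import Path


def loadblock_args(data_dir):
    """Return one -loadblock=PATH argument per old-style block file.

    Assumes the old files are named blk0001.dat, blk0002.dat, ... and live
    directly inside `data_dir`. Files are returned in name order.
    """
    files = sorted(Path(data_dir).glob("blk[0-9][0-9][0-9][0-9].dat"))
    return [f"-loadblock={path}" for path in files]
```

The resulting list can be appended to whatever command line you normally start the client with. The pattern deliberately does not match blkindex.dat or the new five-digit files under blocks/.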


Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own.
Visit bloq.com / metronome.io
Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
Syke
Legendary
*
Offline Offline

Activity: 3878
Merit: 1193


View Profile
October 21, 2012, 03:41:48 AM
 #4

Yes.  However, to save downloading, you may provide
Code:
-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat

to import the old data files into the new bitcoin database backend (ultraprune/leveldb).

* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.

Excellent work.

Buy & Hold
Stephen Gornick
Legendary
*
Offline Offline

Activity: 2506
Merit: 1010


View Profile
October 21, 2012, 04:58:12 AM
 #5

  • Instead of a Berkeley DB blkindex.dat, we have a LevelDB directory blktree/. This only contains a block index, no transaction index.

So is BDB still used at all (e.g., for peers.dat and wallet.dat)? Or will that likely be changing with the upcoming release that includes ultraprune?

Unichange.me



Atlas
Jr. Member
*
Offline Offline

Activity: 56
Merit: 1


View Profile
October 21, 2012, 06:11:01 AM
 #6

I recommend not releasing this until it has been thoroughly tested and analysed for at least 6 months.
jgarzik
Legendary
*
qt
Offline Offline

Activity: 1596
Merit: 1091


View Profile
October 21, 2012, 06:47:03 AM
 #7

So is BDB still used at all (e.g., for peers.dat and wallet.dat)? Or will that likely be changing with the upcoming release that includes ultraprune?

peers.dat is a flat file with a bitcoin-specific file format, unrelated to any database system.

wallet.dat remains BDB, though there are proposals on changing that.


Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own.
Visit bloq.com / metronome.io
Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
Atlas
Jr. Member
*
Offline Offline

Activity: 56
Merit: 1


View Profile
October 21, 2012, 08:09:00 AM
 #8

Hey jgarzik, I noticed you lost your Bitcoins under this upgrade but they do appear on the current release. Pretty serious, huh?

http://bitcoinstats.com/irc/bitcoin-dev/logs/2012/10/21
kokjo
Legendary
*
Offline Offline

Activity: 1050
Merit: 1000

You are WRONG!


View Profile
October 21, 2012, 09:53:11 AM
 #9

testing it now!!

"The whole problem with the world is that fools and fanatics are always so certain of themselves and wiser people so full of doubts." -Bertrand Russell
paraipan
In memoriam
Legendary
*
Offline Offline

Activity: 924
Merit: 1004


Firstbits: 1pirata


View Profile WWW
October 21, 2012, 11:18:32 AM
 #10

Yes.  However, to save downloading, you may provide
Code:
-loadblock=DATA_DIR/blk0001.dat -loadblock=DATA_DIR/blk0002.dat

to import the old data files into the new bitcoin database backend (ultraprune/leveldb).

* "DATA_DIR" should be replaced with the directory where your blockchain was stored in <= 0.7.1.

Excellent work.

+1 Will test

BTCitcoin: An Idea Worth Saving - Q&A with bitcoins on rugatu.com - Check my rep
LightRider
Legendary
*
Offline Offline

Activity: 1500
Merit: 1021


I advocate the Zeitgeist Movement & Venus Project.


View Profile WWW
October 21, 2012, 11:19:16 AM
 #11

Will there be win32 compiles of this any time soon?

Bitcoin combines money, the wrongest thing in the world, with software, the easiest thing in the world to get wrong.
Visit www.thevenusproject.com and www.theZeitgeistMovement.com.
niko
Hero Member
*****
Offline Offline

Activity: 756
Merit: 501


There is more to Bitcoin than bitcoins.


View Profile
October 21, 2012, 02:53:30 PM
 #12

Will there be win32 compiles of this any time soon?
Yes, please.

They're there, in their room.
Your mining rig is on fire, yet you're very calm.
Pieter Wuille (OP)
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1174


View Profile WWW
October 21, 2012, 03:01:46 PM
 #13

An answer to MysteryMiner, who asked in another thread:

One more question - will the new database discard spent addresses? Some places say it will, some say it will not. I am confused. What will happen to clients that rely on downloading the complete transaction history and verifying all blocks and transactions in them along the way, like 0.3.xx does?

The current code does not prune anything - it uses a pruned copy (in addition to the blockchain itself) for validation. Since this copy is much smaller (around 120 MB right now), far less data needs to be accessed during block and transaction validation. This makes it faster to validate and to update the database.

Also, Bitcoin at the protocol level does not know anything about addresses or balances - those are client-side things provided by the wallet abstractions. What we're talking about is removing individual transaction outputs that have been spent.
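
As a rough illustration of that split, a "balance" can be computed on the wallet side as a sum over unspent outputs the wallet knows how to spend. The Python below is only a sketch: the dict layout and the name `wallet_balance` are assumptions for this example, and scripts are represented as plain strings.

```python
def wallet_balance(coins, my_scripts):
    """Sum the values of unspent outputs whose script the wallet recognises.

    `coins` maps (txid, index) -> (value, script); `my_scripts` is the set of
    output scripts the wallet can spend. Both shapes are illustrative.
    """
    return sum(value for value, script in coins.values() if script in my_scripts)
```

Nothing in the unspent-output set itself is keyed by address; the grouping into balances happens entirely in this wallet-side filter.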

At some later point in time we may add actual pruning, by removing blocks (but not their unspent outputs in the pruned copy) that are old enough. This will imply they cannot be served to other nodes, cannot be rescanned, and cannot be reorganised away. Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore. This is why I believe in a move towards validation nodes and archive nodes.
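
What "old enough" could mean in practice might look like the Python sketch below. The `keep_depth` parameter and the function name are hypothetical; no threshold has been decided, and this is not code from the branch.

```python
def prunable_heights(tip_height, keep_depth):
    """Block heights whose raw block data could be deleted.

    Keeps the most recent `keep_depth` blocks (the tip included) so they can
    still be served, rescanned and reorganised away; everything below that
    is a candidate for removal.
    """
    if keep_depth < 1:
        raise ValueError("keep_depth must be at least 1")
    first_kept = max(0, tip_height - keep_depth + 1)
    return range(0, first_kept)
```

A node pruned this way cannot follow a reorganisation deeper than `keep_depth` blocks without fetching the old blocks from someone else, which is one reason the choice of depth needs discussion.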

Also, Bitcoin is a zero-trust system (at least full nodes are). This means that no data ever received from the network is ever taken for granted, and needs validation. This implies you can't ever bootstrap a (zero trust) node without having it validate the entire block chain (although it is not necessary that everyone keeps that data around forever).

I do Bitcoin stuff.
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
October 21, 2012, 03:47:24 PM
Last edit: October 21, 2012, 04:00:39 PM by casascius
 #14

Here's how it ought to work in my mind:



The user ought to have a simple way to decide what he wants to contribute to the network, with the default being something that ensures that the user remains a "full citizen node" but perhaps without automatically seeding large amounts of history without the user's consent.  I imagine having four or five settings, but a real implementation will probably expound on the idea.  (I realize that this is a thread about "ultraprune" and my examples mention "metatree", but please see past that - I am only presenting a 30,000-foot-level view of how I imagine this working)

What the other settings might be:

MINIMAL:
 * Recommended for low-bandwidth or high-cost network connections.
 * No incoming connections from peers allowed.
 * Downloaded data set consists only of the minimum necessary to determine the latest block.
 * Information about balances queried from peers on an as-needed basis
 * Lowest possible security.  Add trusted peers to the preferred peer list whenever possible.

LOW:
 * No incoming connections from peers allowed.
 * A pruned dataset is downloaded and maintained.

MEDIUM: (this would be the default setting)
 * Incoming connections from peers allowed
 * A pruned dataset is downloaded and maintained.
 * Peers may download the dataset up to the configured upload limit

MEDIUM-HIGH: see image...

HIGH:
 * Incoming connections from peers allowed
 * Accepts metatree queries from peers, and seeds historical
    versions of metatree to assist in recovery/rollback if needed
 * Full transaction history is maintained (requires XX GB,
    which increases over time)
 * Allows peers to download the data set up to the
    configured bandwidth limit.
 * Full network citizen/historian which assists in allowing other nodes
    to recover the entire network history in case recovery is needed
 * Recommended setting for mining nodes wherever feasible

Ideally, if all of these modes were implemented, a new installation could start running in the "MINIMAL" mode regardless of user choice so it is instantly usable without a day of downloading, and then slowly upgrade itself to the level of the user's choice as objects are downloaded and verified.
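
The settings described in words above could be captured in a small table. This Python sketch is only one possible encoding: the field names are invented for illustration, and MEDIUM-HIGH is left out because its details are only in the image.

```python
from dataclasses import dataclass
from enum import Enum


class Dataset(Enum):
    MINIMAL = "minimal"  # just enough to determine the latest block
    PRUNED = "pruned"    # pruned dataset downloaded and maintained
    FULL = "full"        # full transaction history


@dataclass(frozen=True)
class NodeMode:
    allow_incoming: bool  # accept incoming connections from peers
    dataset: Dataset      # how much data the node keeps
    serves_peers: bool    # let peers download the dataset (within limits)


# Hypothetical encoding of the four modes that are spelled out in the post.
MODES = {
    "MINIMAL": NodeMode(allow_incoming=False, dataset=Dataset.MINIMAL, serves_peers=False),
    "LOW": NodeMode(allow_incoming=False, dataset=Dataset.PRUNED, serves_peers=False),
    "MEDIUM": NodeMode(allow_incoming=True, dataset=Dataset.PRUNED, serves_peers=True),
    "HIGH": NodeMode(allow_incoming=True, dataset=Dataset.FULL, serves_peers=True),
}

DEFAULT_MODE = "MEDIUM"
```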

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
Come-from-Beyond
Legendary
*
Offline Offline

Activity: 2142
Merit: 1009

Newbie


View Profile
October 21, 2012, 03:56:09 PM
 #15

Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore.

What will prevent it? If the changes are good then everyone will use the new approach.
dree12
Legendary
*
Offline Offline

Activity: 1246
Merit: 1077



View Profile
October 21, 2012, 04:00:19 PM
 #16

Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore.

What will prevent it? If the changes are good then everyone will use the new approach.
We could financially reward nodes that participate with the transaction fees. I think this is a good approach, as long as a secure way of doing it is found.
Pieter Wuille (OP)
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1174


View Profile WWW
October 21, 2012, 04:16:21 PM
 #17

Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore.

What will prevent it? If the changes are good then everyone will use the new approach.

Why do people today run a full node like Bitcoind or Bitcoin-Qt when they could run an SPV node like MultiBit or not a node at all, like webwallets or Electrum?

I do Bitcoin stuff.
Come-from-Beyond
Legendary
*
Offline Offline

Activity: 2142
Merit: 1009

Newbie


View Profile
October 21, 2012, 04:19:48 PM
 #18

Why do people today run a full node like Bitcoind or Bitcoin-Qt when they could run an SPV node like MultiBit or not a node at all, like webwallets or Electrum?

Ok :)
flipperfish
Sr. Member
****
Offline Offline

Activity: 350
Merit: 251


Dolphie Selfie


View Profile
October 21, 2012, 04:26:18 PM
 #19

At some later point in time we may add actual pruning, by removing blocks (but not their unspent outputs in the pruned copy) that are old enough. This will imply they cannot be served to other nodes, cannot be rescanned, and cannot be reorganised away. Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore. This is why I believe in a move towards validation nodes and archive nodes.

Also, Bitcoin is a zero-trust system (at least full nodes are). This means that no data ever received from the network is ever taken for granted, and needs validation. This implies you can't ever bootstrap a (zero trust) node without having it validate the entire block chain (although it is not necessary that everyone keeps that data around forever).

Maybe it is a good idea to define some of the terms used here (maybe in the wiki?). It can be very confusing to read different terms for the same thing and the same words for different things, especially if you're not deeply involved in the ongoing development. Also, I think "ultraprune" should really be renamed, as it does not prune but rather lays the foundation for pruning. I would suggest calling it "historic data separation" or "blockchain validation data optimization", as this is what it does.

So far I have identified these terms from the recent posts about this topic. Please correct me if I'm wrong:

  • Pruning: To remove all transactions whose outputs have been spent.
  • Full Node: A bitcoin client which stores only the data needed to validate new transactions within the network. It has seen the complete blockchain history at some previous time and can thus be sure that its current validation data is correct.
  • Archiving Node: A bitcoin client which stores all data from the beginning of the blockchain. It can serve the whole blockchain to other nodes and is needed for bootstrapping new nodes without trusting anything else.
  • Light Node: A bitcoin client which does not store any data and has to trust another Full or Archiving Node.
  • Zero Trust Node: A bitcoin client which can validate new transactions within the network without having to trust anything besides the blockchain. Full Nodes and Archiving Nodes are Zero Trust Nodes.
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
October 21, 2012, 04:26:33 PM
 #20

Clearly not everyone in the network can do this, as that would mean new nodes cannot bootstrap anymore.

What will prevent it? If the changes are good then everyone will use the new approach.

Why do people today run a full node like Bitcoind or Bitcoin-Qt when they could run an SPV node like MultiBit or not a node at all, like webwallets or Electrum?


I would agree with this point, by comparison to the following: why do people seed torrents of pirated content, even in spite of an obvious legal risk of doing so?  If rational self interest were pure, all-knowing, and all-selfish, The Pirate Bay should have collapsed on its own by now due to lack of seeders.

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.