Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: PPA on March 25, 2017, 05:31:46 PM



Title: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: PPA on March 25, 2017, 05:31:46 PM
In Satoshi Nakamoto's paper we can read this:
Quote
Reclaiming Disk Space
Once the latest transaction in a coin is buried under enough blocks, the spent transactions before it can be discarded to save disk space. To facilitate this without breaking the block's hash, transactions are hashed in a Merkle Tree [7][2][5], with only the root included in the block's hash. Old blocks can then be compacted by stubbing off branches of the tree. The interior hashes do not need to be stored.

Is it really implemented by Bitcoin Core?

When I use a website such as https://blockchain.info or https://blockexplorer.com, it seems that we can look up any transaction in any block.
Does this mean that the sites above save an archive of the blockchain?

Does Bitcoin Core discard some old transactions to save space, and store a compacted blockchain?
Note that according to https://bitcoin.org/en/download, the blockchain size is over 100 GB.
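To make the quoted mechanism concrete, here is a minimal Python sketch of why stubbing off a branch does not break the block's hash. The four transactions are made-up placeholders, not real Bitcoin data, and real blocks serialize transactions differently; only the double SHA-256 and the pairwise folding follow Bitcoin's scheme.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transactions and Merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold a non-empty list of leaf hashes into a single root."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:  # odd count: pair the last hash with itself
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Placeholder transactions (assumption: any byte strings will do for the demo).
txs = [b"tx0", b"tx1", b"tx2", b"tx3"]
leaves = [dsha256(tx) for tx in txs]
root = merkle_root(leaves)

# "Stub off" the left branch: forget tx0 and tx1, keep only their parent hash.
hash01 = dsha256(leaves[0] + leaves[1])
hash23 = dsha256(leaves[2] + leaves[3])
pruned_root = dsha256(hash01 + hash23)

print(pruned_root == root)
```

The root recomputed from the single retained interior hash matches the original root, so the block header stays valid even though tx0 and tx1 are gone.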


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: PPA on March 25, 2017, 07:20:26 PM
I have found a similar question with an answer updated in July 2016.
http://bitcoin.stackexchange.com/questions/11170/why-is-pruning-not-considered-already-at-the-moment


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: achow101 on March 25, 2017, 07:28:14 PM
First of all, Satoshi proposed several ideas which are either infeasible with current technology (e.g. fraud proofs) or are just stupid to do right now with current technology (the original transaction replacement stuff).

That being said, while Satoshi's idea for reclaiming disk space is not implemented exactly as he describes, we do have blockchain pruning. Pruning still requires downloading all 110+ GB, but the data will never all exist on disk at the same time. Currently pruning deletes on the fly: once a block becomes old enough, it is deleted to make space for the next block. Pruning has been around for over a year now; it was first introduced in Bitcoin Core 0.11.


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: PPA on March 25, 2017, 07:38:13 PM
Thanks.
I am going to investigate further on the "pruning feature".


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: gmaxwell on March 25, 2017, 09:40:44 PM
That being said, while Satoshi's idea for reclaiming disk space is not implemented exactly as he describes, we do have blockchain pruning. Pruning still requires downloading all 110+ GB, but the data will never all exist on disk at the same time. Currently pruning deletes on the fly: once a block becomes old enough, it is deleted to make space for the next block. Pruning has been around for over a year now; it was first introduced in Bitcoin Core 0.11.
You make it sound like pruning is worse; in fact it's tens of times more efficient than what's described in the whitepaper. Both need the data to be transferred in the first place (note that it's 'reclaiming disk space', not 'avoiding bandwidth usage' :) ).

To the unanswered part of the OP's post:

Sites like blockchain.info aren't nodes. They are custom databases that take up many terabytes of space. They don't validate things (at least not completely) and often show invalid data.  What they do is unrelated to how nodes work.


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: Kprawn on March 28, 2017, 03:32:39 PM
Sorry to hijack your thread OP, but I want to add a question here. In my limited knowledge of pruning I suspect that if EVERYONE runs a pruned version of the software, then those old tx's are lost forever? I have a basic understanding of pruning, and I have not taken the time to brush up on the research... so I am asking this out of pure laziness to do the research myself.  :(


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: cr1776 on March 28, 2017, 04:31:44 PM
Sorry to hijack your thread OP, but I want to add a question here. In my limited knowledge of pruning I suspect that if EVERYONE runs a pruned version of the software, then those old tx's are lost forever? I have a basic understanding of pruning, and I have not taken the time to brush up on the research... so I am asking this out of pure laziness to do the research myself.  :(

The short answer is yes, the old transactions would be lost.

The longer answer (which I'm sure you figured out) is that all backups would also have to be lost for the transactions to be lost forever. Likewise, without software modifications, new nodes wouldn't be able to start up if that occurred, and in all likelihood Bitcoin would collapse. That is with the software as it is now; future changes could be made to mitigate some (maybe all) of these impacts.


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: xGravity on March 28, 2017, 11:35:19 PM
I think it is very interesting that Satoshi has some code in there that is not active.
So to sum this up: disk space is currently not checked in the Core GitHub source? Or did I understand it wrong?


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: achow101 on March 28, 2017, 11:39:52 PM
I think it is very interesting that Satoshi has some code in there that is not active.
No. That is not at all what this thread is about. Satoshi came up with an idea in the whitepaper, but his idea specifically was never implemented.

So to sum this up: disk space is currently not checked in the Core GitHub source? Or did I understand it wrong?
Again, you are completely missing the point of this thread. This thread has nothing to do with Core checking disk space but rather whether a specific feature is implemented in Core. Anyways, Core does tell you if you are running out of space to store the blockchain. However it cannot tell you whether you have enough disk space because that would imply it knows the actual size of the blockchain, and the only way to do that is by downloading the whole thing.


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: mezzomix on March 29, 2017, 11:01:29 AM
... This however does require still downloading all 110+ GB ...

With a UTXO commitment in the block header it would not be necessary to always download the complete blockchain to bootstrap a new node.


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: xGravity on March 29, 2017, 11:03:20 AM
... This however does require still downloading all 110+ GB ...

With a UTXO commitment in the block header it would not be necessary to always download the complete blockchain to bootstrap a new node.

Doesn't a node always need to be completely synced? Meaning downloading the whole blockchain?

I think it is very interesting that Satoshi has some code in there that is not active.
No. That is not at all what this thread is about. Satoshi came up with an idea in the whitepaper, but his idea specifically was never implemented.

So to sum this up: disk space is currently not checked in the Core GitHub source? Or did I understand it wrong?
Again, you are completely missing the point of this thread. This thread has nothing to do with Core checking disk space but rather whether a specific feature is implemented in Core. Anyways, Core does tell you if you are running out of space to store the blockchain. However it cannot tell you whether you have enough disk space because that would imply it knows the actual size of the blockchain, and the only way to do that is by downloading the whole thing.

Oh ok, thank you. Maybe Satoshi will return in a few years :)


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: mezzomix on March 29, 2017, 02:12:40 PM
... This however does require still downloading all 110+ GB ...
With a UTXO commitment in the block header it would not be necessary to always download the complete blockchain to bootstrap a new node.
Doesn't a node always need to be completely synced? Meaning downloading the whole blockchain?

It's enough to verify the blockchain once. If you need to bootstrap another node, you could start with the last verified block (hash) and the corresponding set of unspent outputs (hash).


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: DannyHamilton on March 29, 2017, 02:37:11 PM
It's enough to verify the blockchain once. If you need to bootstrap another node, you could start with the last verified block (hash) and the corresponding set of unspent outputs (hash).

Are you suggesting storing the complete UTXO list in every block?  Or are you suggesting that nodes share their UTXO list with peers, and that the block just store a hash of the UTXO list?

Either way seems to require a significant amount of trust in the list that you receive.


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: mezzomix on March 29, 2017, 03:54:57 PM
It's enough to verify the blockchain once. If you need to bootstrap another node, you could start with the last verified block (hash) and the corresponding set of unspent outputs (hash).
Are you suggesting storing the complete UTXO list in every block?  Or are you suggesting that nodes share their UTXO list with peers, and that the block just store a hash of the UTXO list?

The idea is to store a UTXO hash in the block header. If nodes or any other source provide a UTXO set for download, I can bootstrap from the last verified block (user-provided checkpoint). This should be safe as long as there is no easy way to create hash collisions.
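A rough Python sketch of that check, under loud assumptions: Bitcoin has no such commitment, so the serialization, the function names `utxo_set_hash` and `accept_snapshot`, and the sample entries are all hypothetical. It only illustrates that a downloaded snapshot can be compared against a hash the user already trusts.

```python
import hashlib


def utxo_set_hash(utxos: dict[tuple[str, int], int]) -> str:
    """Hash a UTXO set deterministically (hypothetical format, not a real protocol).

    Keys are (txid, output index); values are amounts in satoshis.
    Sorting makes the result independent of insertion order.
    """
    h = hashlib.sha256()
    for (txid, vout), amount in sorted(utxos.items()):
        h.update(f"{txid}:{vout}:{amount};".encode())
    return h.hexdigest()


def accept_snapshot(snapshot: dict[tuple[str, int], int], committed_hash: str) -> bool:
    """Accept a downloaded snapshot only if it matches the committed hash."""
    return utxo_set_hash(snapshot) == committed_hash


# Made-up snapshot that an honest source would serve.
honest = {("aa", 0): 2_500_000_000, ("bb", 1): 2_500_000_000}
committed = utxo_set_hash(honest)  # stands in for the hash stored in the header

# A forged snapshot with an extra output that never existed.
forged = dict(honest)
forged[("cc", 0)] = 5_000_000_000

print(accept_snapshot(honest, committed), accept_snapshot(forged, committed))
```

Any added, removed, or altered entry changes the hash, so a forged set is rejected unless the attacker can find a collision.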


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: DannyHamilton on March 29, 2017, 04:04:55 PM
The idea is to store a UTXO hash in the block header. If nodes or any other source provide a UTXO set for download, I can bootstrap from the last verified block (user-provided checkpoint). This should be safe as long as there is no easy way to create hash collisions.

So every node would need to keep track of two UTXO sets?  The current set, and the confirmed set?

What if you receive two different valid blocks with two different UTXO hashes?  How will your node know which is the correct hash?



Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: mezzomix on March 29, 2017, 05:16:42 PM
You need a matching UTXO set for every block you want to start with. Practically it might be enough to have a UTXO snapshot every month or every year.

The identification of the correct chain is easy. Only the correct block header chain will lead to the already validated block hash. A different chain will lead to a different block hash unless there is a collision.
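The chain-identification argument can be sketched in a few lines of Python. This is a toy model: the `Header` class, its `payload` field, and the all-zero genesis parent are assumptions for illustration, and proof of work is left out entirely. It shows only that a header chain either reaches the already validated block hash or it does not.

```python
import hashlib
from dataclasses import dataclass

GENESIS_PARENT = "00" * 32  # assumed placeholder for "no previous block"


@dataclass(frozen=True)
class Header:
    prev_hash: str
    payload: str  # stand-in for Merkle root, timestamp, nonce, etc.

    def hash(self) -> str:
        return hashlib.sha256((self.prev_hash + self.payload).encode()).hexdigest()


def build_chain(payloads: list[str]) -> list[Header]:
    """Link headers together so each one commits to its predecessor."""
    chain, prev = [], GENESIS_PARENT
    for payload in payloads:
        header = Header(prev, payload)
        chain.append(header)
        prev = header.hash()
    return chain


def contains_checkpoint(chain: list[Header], checkpoint_hash: str) -> bool:
    """True if the headers link correctly and one of them hashes to the checkpoint."""
    prev, found = GENESIS_PARENT, False
    for header in chain:
        if header.prev_hash != prev:
            return False  # broken link: not a valid header chain
        prev = header.hash()
        found = found or prev == checkpoint_hash
    return found


good = build_chain(["a", "b", "c"])
checkpoint = good[1].hash()  # the block hash the user validated earlier
fork = build_chain(["a", "x", "c"])  # diverges at the second block

print(contains_checkpoint(good, checkpoint), contains_checkpoint(fork, checkpoint))
```

Because every header commits to its parent, a fork that diverges before the checkpoint can never produce the checkpoint hash short of a collision.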


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: bomberb17 on October 21, 2017, 05:17:25 AM
Ok as far as I understand, you have to download the whole ~150GB blockchain nevertheless, then enable pruning afterwards to be left with a couple of gigabytes of blockchain data.
The other part that I understand is that not all nodes can have pruning enabled, some nodes must keep the whole blockchain anyway. All this makes pruning much less effective.
Reading section 7 of Satoshi's whitepaper, it seems that the current pruning functionality is not working as intended.
So why don't we just store coinbase transactions and UTXOs? Am I missing something?


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: achow101 on October 21, 2017, 05:23:55 AM
Ok as far as I understand, you have to download the whole ~150GB blockchain nevertheless, then enable pruning afterwards to be left with a couple of gigabytes of blockchain data.
No. You can enable pruning at any time and it will reduce the amount of space used on disk to a few GB at most. You do not, at any point in time, need to have the full blockchain on disk.

The other part that I understand is that not all nodes can have pruning enabled, some nodes must keep the whole blockchain anyway. All this makes pruning much less effective.
That is correct.

Reading section 7 of Satoshi's whitepaper, it seems that the current pruning functionality is not working as intended.
No. Pruning is working exactly as intended. Pruning and what Satoshi said in the whitepaper are two completely different things.

So why don't we just store coinbase transactions and UTXOs? Am I missing something?
Because without the full transaction history, that data can be forged. You can't know whether a UTXO is legitimate without knowing the transaction that created it and what that transaction spent. You need the full transaction history to verify the validity of a UTXO. With UTXO commitments (which do not yet exist) we could do that, but we will need a fork to enable such functionality.


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: bomberb17 on October 21, 2017, 03:31:44 PM

No. Pruning is working exactly as intended. Pruning and what Satoshi said in the whitepaper are two completely different things.
...
Because without the full transaction history, that data can be forged. You can't know whether a UTXO is legitimate without knowing the transaction that created it and what that transaction spent. You need the full transaction history to verify the validity of a UTXO. With UTXO commitments (which do not yet exist) we could do that, but we will need a fork to enable such functionality.

Ok let me express a simple example:
Suppose Alice got 50BTC from a coinbase transaction in block #n. Alice then transfers 25BTC to Bob in block #(n+1), which results in Bob having 25BTC and Alice a 25BTC UTXO.
Up to that point, we need all blocks and transactions for blockchain validation.
Then on block#(n+2) Bob sends Charlie all of his funds, 25BTC, leaving Bob with 0 BTC.
Now the transaction "Alice->Bob (25BTC)" is not needed to remain on block#(n+1) since Bob has 0 UTXO, and the transaction "Bob->Charlie (25BTC)" was verified on block#(n+2).
Also, this improves privacy since it makes it harder to link transactions and taint coins.

I believe that's what Satoshi means in his whitepaper #7 by pruning Tx0-2 from the block on the right.
If this implementation requires a hard-fork, that's another story..
Correct me if I'm wrong.


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: achow101 on October 21, 2017, 05:07:26 PM
Ok let me express a simple example:
Suppose Alice got 50BTC from a coinbase transaction in block #n. Alice then transfers 25BTC to Bob in block #(n+1), which results in Bob having 25BTC and Alice a 25BTC UTXO.
Up to that point, we need all blocks and transactions for blockchain validation.
Then on block#(n+2) Bob sends Charlie all of his funds, 25BTC, leaving Bob with 0 BTC.
Now the transaction "Alice->Bob (25BTC)" is not needed to remain on block#(n+1) since Bob has 0 UTXO, and the transaction "Bob->Charlie (25BTC)" was verified on block#(n+2).
Also, this improves privacy since it makes it harder to link transactions and taint coins.

I believe that's what Satoshi means in his whitepaper #7 by pruning Tx0-2 from the block on the right.
If this implementation requires a hard-fork, that's another story..
Correct me if I'm wrong.
You can do that locally once you have downloaded and verified the blockchain. You cannot do that to the blockchain as a whole because I don't know whether the transaction "Bob->Charlie (25BTC)" is actually legitimate when I am syncing a new node. For me to check that it is legit, I need to know where Bob got the output to spend. Just because a transaction is in a block with a valid proof of work does not automatically mean that all transactions in the block are valid; that's not how Bitcoin works.

You can certainly do this locally as that is basically what Satoshi suggests in the whitepaper. But as gmaxwell pointed out above, what we do now for pruning locally is way more efficient than what Satoshi suggests. Satoshi suggests that we throw away parts of blocks as UTXOs are spent. But what we do is that we maintain a separate database with our UTXOs and chainstate data so we don't actually need to have the blocks themselves. So we just throw away old blocks entirely because we have validated them and taken the things from them that we need and stored them elsewhere in a more compact form.
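As an illustration only, here is a toy Python model of that design, using the Alice/Bob/Charlie example from earlier in the thread. The `PrunedNode` class, its `keep_blocks` parameter, and the transaction format are hypothetical and far simpler than Bitcoin Core's actual chainstate database; signatures and amounts checks are omitted.

```python
from collections import deque


class PrunedNode:
    """Toy node: keeps a UTXO table plus only the most recent raw blocks."""

    def __init__(self, keep_blocks: int):
        self.keep_blocks = keep_blocks
        self.utxos = {}        # (txid, output index) -> amount
        self.blocks = deque()  # recent raw blocks only

    def connect_block(self, block: list[dict]) -> None:
        """Apply a block: a list of dicts with 'txid', 'inputs', 'outputs'."""
        for tx in block:
            for outpoint in tx["inputs"]:
                del self.utxos[outpoint]  # KeyError if missing or already spent
            for vout, amount in enumerate(tx["outputs"]):
                self.utxos[(tx["txid"], vout)] = amount
        self.blocks.append(block)
        while len(self.blocks) > self.keep_blocks:
            self.blocks.popleft()  # discard the oldest block entirely


# Amounts in whole BTC to keep the example readable.
block_n = [{"txid": "coinbase", "inputs": [], "outputs": [50]}]
block_n1 = [{"txid": "alice_to_bob", "inputs": [("coinbase", 0)], "outputs": [25, 25]}]
block_n2 = [{"txid": "bob_to_charlie", "inputs": [("alice_to_bob", 0)], "outputs": [25]}]

node = PrunedNode(keep_blocks=1)
for block in (block_n, block_n1, block_n2):
    node.connect_block(block)

print(node.utxos)
print(len(node.blocks))
```

After the three blocks, the node holds only Alice's change output and Charlie's output, and a single raw block. Everything needed to validate future spends lives in the UTXO table, which is why whole old blocks can be dropped.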


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: bomberb17 on October 22, 2017, 01:34:21 AM
You can do that locally once you have downloaded and verified the blockchain. You cannot do that to the blockchain as a whole because I don't know whether the transaction "Bob->Charlie (25BTC)" is actually legitimate when I am syncing a new node. For me to check that it is legit, I need to know where Bob got the output to spend. Just because a transaction is in a block with a valid proof of work does not automatically mean that all transactions in the block are valid; that's not how Bitcoin works.

You can certainly do this locally as that is basically what Satoshi suggests in the whitepaper. But as gmaxwell pointed out above, what we do now for pruning locally is way more efficient than what Satoshi suggests. Satoshi suggests that we throw away parts of blocks as UTXOs are spent. But what we do is that we maintain a separate database with our UTXOs and chainstate data so we don't actually need to have the blocks themselves. So we just throw away old blocks entirely because we have validated them and taken the things from them that we need and stored them elsewhere in a more compact form.


Why would you need to know where Bob got the output? If that transaction is included in a verified block, it means that it is valid. And if that block is like 1000 blocks behind the current block, it's impossible to change it.
(I am not talking about how the current bitcoin protocol works).
I was thinking of something like transaction cut-through (https://bitcointalk.org/index.php?topic=281848.0), which is already being implemented in Mimblewimble.


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: achow101 on October 22, 2017, 07:32:55 PM
Why would you need to know where Bob got the output? If that transaction is included in a verified block, it means that it is valid. And if that block is like 1000 blocks behind the current block, it's impossible to change it.
(I am not talking about how the current bitcoin protocol works).
I was thinking of something like transaction cut-through (https://bitcointalk.org/index.php?topic=281848.0), which is already being implemented in Mimblewimble.
You either need to know where Bob got the output to verify that the block is valid, or you need some other way to prove that a block is valid. You can't just assume a block is valid even if it is deep in the blockchain; that's not the security model of Bitcoin.
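A small Python sketch of the point, with a hypothetical `check_spend` helper and a simplified transaction format (no scripts or signatures, amounts in whole BTC). Without the earlier output, a syncing node cannot reach a verdict at all.

```python
def check_spend(tx: dict, known_outputs: dict) -> str:
    """Return 'valid', 'invalid', or 'unknown' for a simplified transaction.

    known_outputs maps (txid, output index) -> amount for outputs the node
    has itself seen and verified.
    """
    total_in = 0
    for outpoint in tx["inputs"]:
        if outpoint not in known_outputs:
            return "unknown"  # cannot tell where the coins came from
        total_in += known_outputs[outpoint]
    # A transaction may not create more value than it consumes.
    return "valid" if sum(tx["outputs"]) <= total_in else "invalid"


bob_to_charlie = {"inputs": [("alice_to_bob", 0)], "outputs": [25]}

# A new node that was never given the Alice->Bob transaction.
print(check_spend(bob_to_charlie, {}))

# A node that has verified Alice->Bob and knows its first output is 25.
print(check_spend(bob_to_charlie, {("alice_to_bob", 0): 25}))
```

The first call returns "unknown" and the second "valid": the same transaction can only be judged by a node that has the history behind its input.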


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: bomberb17 on October 24, 2017, 02:43:00 AM
You need to either know where Bob got the output to verify that the block is valid or you need to have some other way to prove that a block is valid. You can't just assume a block is valid even if it is deep in the blockchain, that's not the security model of Bitcoin.

Ok let me rephrase my question: Suppose we change the security model of Bitcoin and enforce transaction pruning at the blockchain level (not at the client level) in the fashion described above, for blocks whose height is < (H - N), where H is the current block height and N is a set constant. Would that model be insecure? If so, why?


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: achow101 on October 24, 2017, 03:50:25 AM
Ok let me rephrase my question: Suppose we change the security model of Bitcoin and enforce transaction pruning at the blockchain level (not at the client level) in the fashion described above, for blocks whose height is < (H - N), where H is the current block height and N is a set constant. Would that model be insecure? If so, why?
It is still insecure.

The case that you must consider is a new full node that is just coming online and is syncing the blockchain. It has to download it from its peers. So how does it know that a peer didn't just prune a bunch of transactions that have unspent outputs a couple of thousand blocks deep and relay that version of the blockchain to it?


Title: Re: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
Post by: bomberb17 on October 24, 2017, 05:39:16 AM
I get your point.
Using my first example, if e.g. a malicious node pruned the tx "Bob->Charlie (25BTC)", then the tx "coinbase -> Alice (50BTC)" would be partially orphaned, since it was originally linked to the tx "Alice->Bob (25BTC)" and then became linked to "Bob->Charlie (25BTC)". Alice would effectively get all the coins back from Charlie.
So the only solution would be to enforce CoinJoin as gmaxwell suggested; however, these CoinJoin transactions are interactive and optional.
Maybe an incentive for users to use such transactions (zero tx fees?) would help, which would in turn enable pruning the intermediate CoinJoin transactions safely.