Bitcoin Forum
Topic: Is the feature "reclaiming disk space" really implemented in Bitcoin Core?
PPA (OP)
March 25, 2017, 05:31:46 PM
Merited by ABCbits (1)
 #1

In Satoshi Nakamoto's paper we can read this:
Quote
Reclaiming Disk Space
Once the latest transaction in a coin is buried under enough blocks, the spent transactions before it can be discarded to save disk space. To facilitate this without breaking the block's hash, transactions are hashed in a Merkle Tree [7][2][5], with only the root included in the block's hash. Old blocks can then be compacted by stubbing off branches of the tree. The interior hashes do not need to be stored.

Is it really implemented by Bitcoin Core?

When I use a website such as https://blockchain.info or https://blockexplorer.com, it seems that I can look up any transaction in any block.
Does this mean that the sites above save an archive of the blockchain?

Does Bitcoin Core discard old transactions to save space and store a compacted blockchain?
Note that according to https://bitcoin.org/en/download, the blockchain is over 100 GB in size.
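
To make the quoted whitepaper passage concrete, here is a minimal Python sketch of "stubbing off branches". It assumes a four-transaction block with placeholder byte strings instead of real transactions, and the helper names (`dsha256`, `merkle_root`) are made up for illustration. It only shows that the Merkle root can still be recomputed after the spent transactions are replaced by their branch hashes.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transactions and blocks."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves):
    """Merkle root over a non-empty list of leaf hashes.

    On levels with an odd number of entries the last entry is duplicated.
    """
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Four toy "transactions" (placeholders, not real Bitcoin transactions).
txs = [b"tx0", b"tx1", b"tx2", b"tx3"]
leaf_hashes = [dsha256(tx) for tx in txs]
full_root = merkle_root(leaf_hashes)

# Suppose tx0..tx2 are spent and only tx3 has to stay. We keep two stubs:
# hash01 (covers tx0 and tx1) and hash2 (covers tx2), plus tx3 itself.
hash01 = dsha256(leaf_hashes[0] + leaf_hashes[1])
hash2 = leaf_hashes[2]
pruned_root = dsha256(hash01 + dsha256(hash2 + dsha256(txs[3])))

# The compacted block still commits to the same root.
print(pruned_root == full_root)
```

The block header only contains the root, so the header hash is unchanged by this kind of compaction.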
PPA (OP)
March 25, 2017, 07:20:26 PM
 #2

I have found a similar question with an answer updated in July 2016.
http://bitcoin.stackexchange.com/questions/11170/why-is-pruning-not-considered-already-at-the-moment
achow101
March 25, 2017, 07:28:14 PM
Merited by ABCbits (2)
 #3

First of all, Satoshi proposed several ideas which are either infeasible with current technology (e.g. fraud proofs) or are just stupid to do right now with current technology (the original transaction replacement stuff).

That being said, while Satoshi's idea of reclaiming disk space is not implemented exactly as he describes, we do have blockchain pruning. Pruning still requires downloading all 110+ GB, but the data will not all exist on disk at the same time. Currently pruning deletes on the fly: once a block becomes old enough, it is deleted to make space for the next block. Pruning has been around for over a year now; it was first introduced in Bitcoin Core 0.11.
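
The delete-on-the-fly behaviour described above can be sketched as a toy model in Python. This is not Bitcoin Core's actual logic (Core prunes whole block files and always keeps a window of recent blocks); the function name, the block sizes, and the target value are assumptions chosen for illustration.

```python
from collections import deque


def sync_with_pruning(block_sizes, prune_target):
    """Process blocks in order while keeping raw block data under prune_target.

    Returns (blocks_still_on_disk, peak_usage_after_pruning, total_downloaded).
    Sizes are in arbitrary units.
    """
    on_disk = deque()  # (height, size) of raw blocks currently stored
    used = peak = downloaded = 0
    for height, size in enumerate(block_sizes):
        downloaded += size  # every block is still downloaded and validated
        on_disk.append((height, size))
        used += size
        # Delete the oldest blocks until usage is back under the target,
        # always keeping at least the newest block.
        while used > prune_target and len(on_disk) > 1:
            _, old_size = on_disk.popleft()
            used -= old_size
        peak = max(peak, used)  # measured after pruning each step
    return list(on_disk), peak, downloaded


kept, peak, downloaded = sync_with_pruning([100] * 50, prune_target=550)
print(len(kept), peak, downloaded)
```

In this toy run all 50 blocks are downloaded, but only the most recent few are ever stored at once, which matches the point that pruning saves disk space rather than bandwidth.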

PPA (OP)
March 25, 2017, 07:38:13 PM
 #4

Thanks.
I am going to look further into the "pruning feature".
gmaxwell
March 25, 2017, 09:40:44 PM
Merited by ABCbits (2)
 #5

Quote from: achow101
That being said, while Satoshi's idea of reclaiming disk space is not implemented exactly as he describes, we do have blockchain pruning. Pruning still requires downloading all 110+ GB, but the data will not all exist on disk at the same time. Currently pruning deletes on the fly: once a block becomes old enough, it is deleted to make space for the next block. Pruning has been around for over a year now; it was first introduced in Bitcoin Core 0.11.

You make it sound like pruning is worse; in fact it's tens of times more efficient than what's described in the whitepaper. Both need to transfer the data in the first place (note that it's 'reclaiming disk space', not 'avoiding bandwidth usage' :) ).

To the unanswered part of the OP's post:

Sites like blockchain.info aren't nodes. They are custom databases that take up many terabytes of space. They don't validate things (at least not completely) and often show invalid data.  What they do is unrelated to how nodes work.
Kprawn
March 28, 2017, 03:32:39 PM
 #6

Sorry to hijack your thread OP, but I want to add a question here. In my limited knowledge of pruning, I suspect that if EVERYONE runs a pruned version of the software, then those old tx's are lost forever? I have a basic understanding of pruning, and I have not taken the time to brush up on the research... so I am asking this out of pure laziness to do the research myself. :(

cr1776
March 28, 2017, 04:31:44 PM
 #7

Quote from: Kprawn
Sorry to hijack your thread OP, but I want to add a question here. In my limited knowledge of pruning, I suspect that if EVERYONE runs a pruned version of the software, then those old tx's are lost forever? I have a basic understanding of pruning, and I have not taken the time to brush up on the research... so I am asking this out of pure laziness to do the research myself. :(

The short answer is yes, the old transactions would be lost.

The longer answer (which I'm sure you figured out) is that all backups would also have to be lost for the transactions to be lost forever. Likewise, without software modifications, new nodes wouldn't be able to start up if that occurred, and in all likelihood Bitcoin would collapse. That is with the software as it is now; future changes could be made to mitigate some (maybe all) of these impacts.
xGravity
March 28, 2017, 11:35:19 PM
 #8

I think it is very interesting that Satoshi has some code in there that is not active.
So to sum this up: disk space is currently not checked in the Core source on GitHub? Or did I understand it wrong?
achow101
March 28, 2017, 11:39:52 PM
 #9

Quote from: xGravity
I think it is very interesting that Satoshi has some code in there that is not active.
No. That is not at all what this thread is about. Satoshi came up with an idea in the whitepaper, but his idea specifically was never implemented.

Quote from: xGravity
So to sum this up: disk space is currently not checked in the Core source on GitHub? Or did I understand it wrong?
Again, you are completely missing the point of this thread. This thread has nothing to do with Core checking disk space but rather whether a specific feature is implemented in Core. Anyways, Core does tell you if you are running out of space to store the blockchain. However it cannot tell you whether you have enough disk space because that would imply it knows the actual size of the blockchain, and the only way to do that is by downloading the whole thing.
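
For illustration only, the kind of "running out of space" check mentioned above could look like the following Python sketch. It is not Core's code; the function name and the 50 MiB threshold are assumptions. It can only compare free space against a number you already have, which is why it cannot answer "is there room for the whole blockchain" without knowing the chain's size in advance.

```python
import shutil


def free_space_ok(path, required_bytes):
    """Return True if the filesystem holding `path` has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes


# Hypothetical safety margin: warn when less than 50 MiB is left.
MIN_FREE_BYTES = 50 * 1024 * 1024
if not free_space_ok(".", MIN_FREE_BYTES):
    print("Warning: disk space is low")
```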

mezzomix
March 29, 2017, 11:01:29 AM
 #10

Quote from: achow101
... This however does require still downloading all 110+ GB ...

With a UTXO commitment in the block header, it would not be necessary to always download the complete blockchain to bootstrap a new node.
xGravity
March 29, 2017, 11:03:20 AM
 #11

Quote from: achow101
... This however does require still downloading all 110+ GB ...
Quote from: mezzomix
With a UTXO commitment in the block header, it would not be necessary to always download the complete blockchain to bootstrap a new node.

Does a node not always need to be completely synced, meaning it has to download the whole blockchain?

Quote from: xGravity
I think it is very interesting that Satoshi has some code in there that is not active.
Quote from: achow101
No. That is not at all what this thread is about. Satoshi came up with an idea in the whitepaper, but his idea specifically was never implemented.
Quote from: xGravity
So to sum this up: disk space is currently not checked in the Core source on GitHub? Or did I understand it wrong?
Quote from: achow101
Again, you are completely missing the point of this thread. This thread has nothing to do with Core checking disk space but rather whether a specific feature is implemented in Core. Anyways, Core does tell you if you are running out of space to store the blockchain. However it cannot tell you whether you have enough disk space because that would imply it knows the actual size of the blockchain, and the only way to do that is by downloading the whole thing.

Oh OK, thank you. Maybe Satoshi will return in a few years :)
mezzomix
March 29, 2017, 02:12:40 PM
 #12

Quote from: achow101
... This however does require still downloading all 110+ GB ...
Quote from: mezzomix
With a UTXO commitment in the block header, it would not be necessary to always download the complete blockchain to bootstrap a new node.
Quote from: xGravity
Does a node not always need to be completely synced, meaning it has to download the whole blockchain?

It's enough to verify the blockchain once. If you need to bootstrap another node, you could start with the last verified block (hash) and the corresponding set of unspent outputs (hash).
DannyHamilton
March 29, 2017, 02:37:11 PM
Merited by ABCbits (1)
 #13

Quote from: mezzomix
It's enough to verify the blockchain once. If you need to bootstrap another node, you could start with the last verified block (hash) and the corresponding set of unspent outputs (hash).

Are you suggesting storing the complete UTXO list in every block?  Or are you suggesting that nodes share their UTXO list with peers, and that the block just store a hash of the UTXO list?

Either way seems to require a significant amount of trust in the list that you receive.
mezzomix
March 29, 2017, 03:54:57 PM
 #14

Quote from: mezzomix
It's enough to verify the blockchain once. If you need to bootstrap another node, you could start with the last verified block (hash) and the corresponding set of unspent outputs (hash).
Quote from: DannyHamilton
Are you suggesting storing the complete UTXO list in every block?  Or are you suggesting that nodes share their UTXO list with peers, and that the block just store a hash of the UTXO list?

The idea is to store a UTXO hash in the block header. If nodes or any other source provide a UTXO set for download, I can bootstrap from the last verified block (a user-provided checkpoint). This should be safe as long as there is no easy way to create hash collisions.
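
A rough Python sketch of this idea follows. It is purely hypothetical: Bitcoin block headers do not contain a UTXO commitment, and the serialization, the function names, and the sample data are all invented for illustration. It shows only the check itself: hash the downloaded UTXO set deterministically and compare it with the hash the already-verified block is assumed to commit to.

```python
import hashlib


def utxo_set_hash(utxos):
    """Hash a UTXO set deterministically by serializing entries in sorted order.

    `utxos` maps (txid, output_index) -> (owner, amount). The text
    serialization used here is a toy format, not a real proposal.
    """
    h = hashlib.sha256()
    for (txid, index), (owner, amount) in sorted(utxos.items()):
        h.update(f"{txid}:{index}:{owner}:{amount}\n".encode())
    return h.hexdigest()


def accept_snapshot(snapshot, committed_hash):
    """Accept a downloaded UTXO snapshot only if it matches the commitment."""
    return utxo_set_hash(snapshot) == committed_hash


# Commitment taken from a block the user has already verified (assumed).
honest = {("a", 1): ("Alice", 25), ("b", 0): ("Charlie", 25)}
commitment = utxo_set_hash(honest)

# A snapshot from an untrusted source with an extra, forged output.
forged = dict(honest)
forged[("x", 0)] = ("Mallory", 1000)

print(accept_snapshot(honest, commitment), accept_snapshot(forged, commitment))
```

The trust question raised in the replies is about where `commitment` comes from; the comparison itself is only as good as the block hash it is anchored to.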
DannyHamilton
March 29, 2017, 04:04:55 PM
Merited by ABCbits (2)
 #15

Quote from: mezzomix
The idea is to store a UTXO hash in the block header. If nodes or any other source provide a UTXO set for download, I can bootstrap from the last verified block (a user-provided checkpoint). This should be safe as long as there is no easy way to create hash collisions.

So every node would need to keep track of two UTXO sets?  The current set, and the confirmed set?

What if you receive two different valid blocks with two different UTXO hashes?  How will your node know which is the correct hash?

mezzomix
March 29, 2017, 05:16:42 PM
 #16

You need a matching UTXO set for every block you want to start with. In practice it might be enough to have a UTXO snapshot every month or every year.

The identification of the correct chain is easy. Only the correct block header chain will lead to the already validated block hash. A different chain will lead to a different block hash unless there is a collision.
bomberb17
October 21, 2017, 05:17:25 AM
 #17

OK, as far as I understand, you have to download the whole ~150 GB blockchain regardless, then enable pruning afterwards to be left with a couple of gigabytes of blockchain data.
The other part that I understand is that not all nodes can have pruning enabled; some nodes must keep the whole blockchain anyway. All this makes pruning much less effective.
Reading section 7 of Satoshi's whitepaper, it seems that the current pruning functionality is not working as intended.
So why don't we just store coinbase transactions and UTXOs? Am I missing something?
achow101
October 21, 2017, 05:23:55 AM
 #18

Quote from: bomberb17
OK, as far as I understand, you have to download the whole ~150 GB blockchain regardless, then enable pruning afterwards to be left with a couple of gigabytes of blockchain data.
No. You can enable pruning at any time and it will reduce the amount of space used on disk to a few GB at most. You do not, at any point in time, need to have the full blockchain on disk.

Quote from: bomberb17
The other part that I understand is that not all nodes can have pruning enabled; some nodes must keep the whole blockchain anyway. All this makes pruning much less effective.
That is correct.

Quote from: bomberb17
Reading section 7 of Satoshi's whitepaper, it seems that the current pruning functionality is not working as intended.
No. Pruning is working exactly as intended. Pruning and what Satoshi said in the whitepaper are two completely different things.

Quote from: bomberb17
So why don't we just store coinbase transactions and UTXOs? Am I missing something?
Because without the full transaction history, that data can be forged. You can't know whether a UTXO is legitimate without knowing the transaction that created it and what that transaction spent. You need the full transaction history to verify the validity of a UTXO. With UTXO commitments (which do not yet exist) we could do that, but we will need a fork to enable such functionality.

bomberb17
October 21, 2017, 03:31:44 PM
 #19


Quote from: achow101
No. Pruning is working exactly as intended. Pruning and what Satoshi said in the whitepaper are two completely different things.
...
Because without the full transaction history, that data can be forged. You can't know whether a UTXO is legitimate without knowing the transaction that created it and what that transaction spent. You need the full transaction history to verify the validity of a UTXO. With UTXO commitments (which do not yet exist) we could do that, but we will need a fork to enable such functionality.

OK, let me give a simple example:
Suppose Alice got 50 BTC from a coinbase transaction in block #n. Alice then transfers 25 BTC to Bob in block #(n+1), which results in Bob having 25 BTC and Alice a 25 BTC UTXO.
Up to that point, we need all blocks and transactions for blockchain validation.
Then in block #(n+2) Bob sends Charlie all of his funds, 25 BTC, leaving Bob with 0 BTC.
Now the transaction "Alice->Bob (25 BTC)" no longer needs to remain in block #(n+1), since Bob has no UTXO left and the transaction "Bob->Charlie (25 BTC)" was verified in block #(n+2).
This also improves privacy, since it makes it harder to link transactions and taint coins.

I believe that's what Satoshi means in section 7 of his whitepaper by pruning Tx0-2 from the block on the right.
If this implementation requires a hard fork, that's another story.
Correct me if I'm wrong.
achow101
October 21, 2017, 05:07:26 PM
 #20

Quote from: bomberb17
OK, let me give a simple example:
Suppose Alice got 50 BTC from a coinbase transaction in block #n. Alice then transfers 25 BTC to Bob in block #(n+1), which results in Bob having 25 BTC and Alice a 25 BTC UTXO.
Up to that point, we need all blocks and transactions for blockchain validation.
Then in block #(n+2) Bob sends Charlie all of his funds, 25 BTC, leaving Bob with 0 BTC.
Now the transaction "Alice->Bob (25 BTC)" no longer needs to remain in block #(n+1), since Bob has no UTXO left and the transaction "Bob->Charlie (25 BTC)" was verified in block #(n+2).
This also improves privacy, since it makes it harder to link transactions and taint coins.
I believe that's what Satoshi means in section 7 of his whitepaper by pruning Tx0-2 from the block on the right.
If this implementation requires a hard fork, that's another story.
Correct me if I'm wrong.

You can do that locally once you have downloaded and verified the blockchain. You cannot do that to the blockchain as a whole because I don't know whether the transaction "Bob->Charlie (25 BTC)" is actually legitimate when I am syncing a new node. For me to check that it is legit, I need to know where Bob got the output to spend. Just because a transaction is in a block with a valid proof of work does not automatically mean that all transactions in the block are valid; that's not how Bitcoin works.

You can certainly do this locally as that is basically what Satoshi suggests in the whitepaper. But as gmaxwell pointed out above, what we do now for pruning locally is way more efficient than what Satoshi suggests. Satoshi suggests that we throw away parts of blocks as UTXOs are spent. But what we do is that we maintain a separate database with our UTXOs and chainstate data so we don't actually need to have the blocks themselves. So we just throw away old blocks entirely because we have validated them and taken the things from them that we need and stored them elsewhere in a more compact form.
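
As a rough illustration of the two points above, here is a small Python sketch using the Alice/Bob/Charlie example. The transaction format, the `apply_block` helper, and the short txids are invented for this sketch; signatures, fees, and scripts are ignored, and this is not how Core's chainstate database is actually laid out. It shows (1) that after validating each block a node only needs the resulting UTXO set and can discard the block, and (2) that a fresh node handed only the last block cannot validate it.

```python
def apply_block(utxos, block):
    """Validate a block against a UTXO set and return the updated set.

    Each transaction is a dict with 'txid', 'inputs' (a list of
    (txid, index) outpoints) and 'outputs' (a list of (owner, amount)).
    A transaction with no inputs is treated as a coinbase.
    """
    new_utxos = dict(utxos)  # work on a copy so a bad block changes nothing
    for tx in block:
        spent = 0
        for outpoint in tx["inputs"]:
            if outpoint not in new_utxos:
                raise ValueError(f"{tx['txid']} spends unknown or spent output {outpoint}")
            spent += new_utxos.pop(outpoint)[1]
        created = sum(amount for _, amount in tx["outputs"])
        if tx["inputs"] and created > spent:
            raise ValueError(f"{tx['txid']} creates more than it spends")
        for index, output in enumerate(tx["outputs"]):
            new_utxos[(tx["txid"], index)] = output
    return new_utxos


chain = [
    # block #n: coinbase pays Alice 50
    [{"txid": "cb", "inputs": [], "outputs": [("Alice", 50)]}],
    # block #(n+1): Alice pays Bob 25 and keeps 25 as change
    [{"txid": "a", "inputs": [("cb", 0)], "outputs": [("Bob", 25), ("Alice", 25)]}],
    # block #(n+2): Bob pays Charlie 25
    [{"txid": "b", "inputs": [("a", 0)], "outputs": [("Charlie", 25)]}],
]

# A node that syncs from the start can drop each block once it is applied.
utxos = {}
for block in chain:
    utxos = apply_block(utxos, block)
print(utxos)

# A new node given only the last block has no way to check Bob's input.
try:
    apply_block({}, chain[2])
except ValueError as err:
    print("cannot validate:", err)
```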
