Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: zaq on January 24, 2018, 12:07:49 PM



Title: Why DB so big?
Post by: zaq on January 24, 2018, 12:07:49 PM
Why do we need to keep all history on all nodes? Would it be sufficient to keep unspent outputs only?


Title: Re: Why DB so big?
Post by: Carlton Banks on January 24, 2018, 12:36:51 PM
Would it be sufficient to keep unspent outputs only?

It's possible, but the software engineering to do it is extensive. Even then, I think mining nodes still need the complete tx history. The developers have a plan to design such a mode, but they can only do as much work as there is time to do it. I *think* Pieter Wuille had the most promising idea to do something like this, but I can't remember what the details are right now.


Title: Re: Why DB so big?
Post by: Xynerise on January 24, 2018, 02:55:19 PM
Decentralisation.
If all transactions are stored, then your node doesn't need to trust that a transaction is correct; it just verifies everything from the genesis block.

If you didn't store all transactions, then nodes would have to trust that their info is correct, which is not really trustless.
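
To make the "verify everything from genesis" point concrete, here is a toy sketch of replaying a chain to build the set of unspent outputs. It is only an illustration: the data layout and the names apply_block and replay are made up for this example and are not how Bitcoin Core represents things, and signatures, scripts and proof of work are ignored entirely.

```python
# Toy model only: each transaction is a dict with a txid, a list of
# inputs (references to earlier outputs) and a list of output values.
# All names and the data layout are hypothetical, not Bitcoin Core's.

def apply_block(utxos, block):
    """Check a block against the unspent-output set and update the set."""
    for tx in block:
        spent = 0
        for outpoint in tx["inputs"]:
            # An input must refer to an output that exists and is unspent.
            if outpoint not in utxos:
                raise ValueError(f"missing or already spent: {outpoint}")
            spent += utxos.pop(outpoint)
        created = sum(tx["outputs"])
        # Transactions without inputs stand in for coinbase transactions,
        # which are allowed to create new value in this toy model.
        if tx["inputs"] and created > spent:
            raise ValueError(f"{tx['txid']} creates value out of nothing")
        for index, value in enumerate(tx["outputs"]):
            utxos[(tx["txid"], index)] = value


def replay(chain):
    """Replay every block from the first one and return the final UTXO set."""
    utxos = {}
    for block in chain:
        apply_block(utxos, block)
    return utxos


chain = [
    [{"txid": "a", "inputs": [], "outputs": [50]}],
    [{"txid": "b", "inputs": [("a", 0)], "outputs": [30, 20]}],
]
print(replay(chain))
```

Once the replay is done, only the final set is needed to check new transactions, but you can only trust that set because you computed it yourself from the full history.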


Bitcoin has the concept of checkpoints though, which are hard-coded into the client and prevent the rewriting of early blocks.
https://github.com/bitcoin/bitcoin/blob/c091b99/src/chainparams.cpp#L150-L162
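
As a rough illustration of the checkpoint idea (not the actual code behind the link above), a client could reject any candidate chain whose block hash at a checkpointed height differs from the hard-coded value. The table, the placeholder hashes and the function name below are all assumptions made up for this sketch.

```python
# Hypothetical checkpoint table: block height -> expected block hash.
# The hashes are placeholders, not real Bitcoin block hashes.
CHECKPOINTS = {
    100: "hash-at-100",
    200: "hash-at-200",
}


def passes_checkpoints(chain_hashes):
    """Return True if a candidate chain agrees with every checkpoint it reaches.

    chain_hashes[h] is the block hash at height h of the candidate chain.
    Checkpoints above the chain's current height are simply not checked yet.
    """
    for height, expected in CHECKPOINTS.items():
        if height < len(chain_hashes) and chain_hashes[height] != expected:
            return False
    return True
```

A chain that rewrites a block at or before a checkpointed height fails this check no matter how much work it carries, which is the "prevents rewriting of early blocks" property described above.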


Title: Re: Why DB so big?
Post by: ranochigo on January 24, 2018, 03:27:30 PM
Individually, you don't need to store the whole Blockchain. If you have no need for transaction information for addresses not associated with your wallet file, you don't have to use it. You can easily just prune the Blockchain. The disadvantage is that you can't access any transactions related to any other address that is not associated with your wallet.

You need to download the whole Blockchain though. It is unsafe to download only parts of it and expect it to be valid. At all times, a portion of the nodes must have the full Blockchain to facilitate their peers in synchronizing.


Title: Re: Why DB so big?
Post by: Turkish88 on January 24, 2018, 03:55:28 PM
You can use a wallet that does not store the whole blockchain on your computer; look at Electrum.


Title: Re: Why DB so big?
Post by: cellard on January 24, 2018, 08:14:45 PM
Individually, you don't need to store the whole Blockchain. If you have no need for transaction information for addresses not associated with your wallet file, you don't have to use it. You can easily just prune the Blockchain. The disadvantage is that you can't access any transactions related to any other address that is not associated with your wallet.

You need to download the whole Blockchain though. It is unsafe to download only parts of it and expect it to be valid. At all times, a portion of the nodes must have the full Blockchain to facilitate their peers in synchronizing.

Pruned mode is paradoxical because you need to be able to download the entire thing at least once... which kind of defeats the purpose. At best you would be able to save space afterwards. If we could engineer something that is able to download only the last GB, that would be a reasonable amount, since trying to go "1GB back in time" would require a huge amount of money for such a hashrate attack... but I don't see how that's even possible. You would need some epic wizardry engineering to pull it off without opening potential exploits.

You can use a wallet that does not store the whole blockchain on your computer; look at Electrum.

SPV wallets aren't safe... and I don't think we will ever be able to get past having to download the entire blockchain and validate it at least once.


Title: Re: Why DB so big?
Post by: Anti-Cen on January 24, 2018, 10:33:46 PM
It has to be big because it stores lots of data. Compression is double-edged because it takes time to decompress, not that it would be a magical cure for Bitcoin. The solution is to break the chain up into smaller chunks.

It's possible, but the software engineering to do it is extensive. Even then, I think mining nodes still need the complete tx history. The developers have a plan to design such a mode, but they can only do as much work as there is time to do it. I *think* Pieter Wuille had the most promising idea to do something like this, but I can't remember what the details are right now.

I like having the full history on the blockchain, but maybe I could live with roll-ups from time to time, as long as all the archive data stays available.


Title: Re: Why DB so big?
Post by: CheatingCoins on January 24, 2018, 10:36:52 PM
Its whole idea is to have everything validated from the start in order to prove there are no faults. Therefore it all goes through validation. :)


Title: Re: Why DB so big?
Post by: ranochigo on January 24, 2018, 10:48:13 PM
Pruned mode is paradoxical because you need to be able to download the entire thing at least once... which kind of defeats the purpose. At best you would be able to save space afterwards. If we could engineer something that is able to download only the last GB, that would be a reasonable amount, since trying to go "1GB back in time" would require a huge amount of money for such a hashrate attack... but I don't see how that's even possible. You would need some epic wizardry engineering to pull it off without opening potential exploits.
Not really. Just to speak from my own context, everyone in my country has unlimited bandwidth, so bandwidth was never the issue. However, not everyone has enough disk space. Pruning solves this problem since it only stores X MB of blocks at any given time, so you could theoretically have 550 MB of space before syncing and still be able to do it successfully. Whilst such a massive block reorg is rather uncommon, I really doubt anyone would be happy to find themselves needing to resync everything from scratch if it ever happens.
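
A toy simulation of that behaviour, as a sketch only: every block is still downloaded and processed, but the oldest raw blocks are deleted as soon as the disk budget is exceeded. The function name, the per-block sizes and the pruning policy here are simplifications I made up for illustration and do not mirror how the real client manages its block files.

```python
from collections import deque


def sync_pruned(block_sizes_mb, budget_mb):
    """Simulate a sync that keeps at most budget_mb of raw blocks on disk.

    Returns (peak_mb, kept_heights), where peak_mb is the largest amount
    stored after any pruning step and kept_heights lists the blocks that
    are still on disk at the end.
    """
    kept = deque()  # (height, size) of blocks currently on disk
    on_disk = 0
    peak = 0
    for height, size in enumerate(block_sizes_mb):
        # Every block is downloaded and stored, regardless of the budget.
        kept.append((height, size))
        on_disk += size
        # Drop the oldest blocks until we fit again (the newest always stays).
        while on_disk > budget_mb and len(kept) > 1:
            _, old_size = kept.popleft()
            on_disk -= old_size
        peak = max(peak, on_disk)
    return peak, [h for h, _ in kept]


# 1000 blocks of 1 MB each, synced with a 550 MB budget.
peak, kept = sync_pruned([1] * 1000, 550)
print(peak, kept[0], kept[-1])
```

In this model the node sees all 1000 MB of blocks over the course of the sync, while the amount stored after each pruning step never goes above the 550 MB budget.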


Title: Re: Why DB so big?
Post by: achow101 on January 25, 2018, 12:44:22 AM
Pruned mode is paradoxical because you need to be able to download the entire thing at least once... which kind of defeats the purpose.
No, that is its express and only purpose. Pruned mode is for the sole purpose of reducing disk space. It has nothing to do with bandwidth and was not designed for reducing bandwidth or computation requirements. It was literally designed to only reduce disk space usage.


Title: Re: Why DB so big?
Post by: zaq on January 25, 2018, 12:01:49 PM
Thank you all for the exhaustive answers.


Title: Re: Why DB so big?
Post by: cellard on January 25, 2018, 03:13:01 PM
Pruned mode is paradoxical because you need to be able to download the entire thing at least once... which kind of defeats the purpose. At best you would be able to save space afterwards. If we could engineer something that is able to download only the last GB, that would be a reasonable amount, since trying to go "1GB back in time" would require a huge amount of money for such a hashrate attack... but I don't see how that's even possible. You would need some epic wizardry engineering to pull it off without opening potential exploits.
Not really. Just to speak from my own context, everyone in my country has unlimited bandwidth, so bandwidth was never the issue. However, not everyone has enough disk space. Pruning solves this problem since it only stores X MB of blocks at any given time, so you could theoretically have 550 MB of space before syncing and still be able to do it successfully. Whilst such a massive block reorg is rather uncommon, I really doubt anyone would be happy to find themselves needing to resync everything from scratch if it ever happens.

Aha, that's great. I thought that pruning required you to have the total space needed to sync at least once, but it seems that it never stores more than the selected pruning amount. Well then, yes, that's a solution.

But still, you will not be able to reuse your block files within a forked client to access forks faster, for example. I'm sure there are other use cases for owning the entire blockchain besides the security itself.


Title: Re: Why DB so big?
Post by: achow101 on January 25, 2018, 04:02:41 PM
But still, you will not be able to reuse your block files within a forked client to access forks faster for example.
You can always stop a node after syncing to a certain block height, back up the datadir, and then resume syncing. Now you will have a copy of the datadir from before any forks, and you can just copy that datadir into the datadirs of forked clients to sync their forked blockchains. Since the pruned datadir is small, there isn't much additional disk space taken up by doing this, and way less than if you didn't prune the blockchain.
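
The backup step could be scripted roughly like this. It is a minimal sketch: the function name and the paths in the usage note are hypothetical, and it assumes the node has already been shut down cleanly before the copy starts.

```python
import shutil
from pathlib import Path


def snapshot_datadir(datadir, backup_dir):
    """Copy a stopped node's data directory to backup_dir and return its path.

    The node must not be running while the copy is made, otherwise the
    snapshot may be inconsistent.
    """
    src = Path(datadir)
    dst = Path(backup_dir)
    if not src.is_dir():
        raise FileNotFoundError(src)
    # copytree refuses to overwrite an existing destination, so an older
    # snapshot cannot be clobbered by accident.
    shutil.copytree(src, dst)
    return dst
```

For example, snapshot_datadir("/path/to/datadir", "/path/to/datadir-before-fork") would produce the pre-fork copy, which can then be copied again into the datadir of each forked client (both paths are placeholders).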