Bitcoin Forum
November 13, 2024, 11:06:15 AM *
News: Check out the artwork 1Dq created to commemorate this forum's 15th anniversary
 
   Home   Help Search Login Register More  
Pages: « 1 2 [3] 4 »  All
  Print  
Author Topic: Blockchain Compression  (Read 8647 times)
Peter Todd
Legendary
*
expert
Offline Offline

Activity: 1120
Merit: 1160


View Profile
July 04, 2013, 04:26:48 AM
 #41

I don't follow.  How could an SPV node function with only SPV peers?  How would it get the tx data that it needs?

Payment protocols.

Anyway the whole idea of just assuming SPV nodes are going to be able to get the tx data they need from peers for free is really flawed. Running bloom filters against the blockchain data costs resources and at some point people aren't going to be willing to do that for free.

I've got writing "tit-for-tat" peering on my todo list actually: IE silently fail to relay transactions to peers if they don't relay enough new and valid transactions to you for being leeches on the network. Pure SPV clients, like Android clients, can pay for the resources they consume via micropayment channels or at least proof-of-work. It'd also prevent attackers from DoSing the network by making large numbers of connections to large numbers of nodes to fill the incoming peer slots of nodes on the network; you need very little in the way of resources to pull off that attack right now.

I addressed this as well in that thread: https://bitcointalk.org/index.php?topic=131493.msg1407971#msg1407971.  The Merkle tree of transactions is used to authenticate the maximum fee reward calculation in each block.  It didn't seem to require utxo set commitments.

That's a good point, a changed merkle tree can achieve that too. However maaku is quite correct that network operator attacks are trivial without UTXO set commitments, whereas with them the worst a network operator can do is prevent you from getting a fraud proof message; creating fake confirms is extremely expensive.

Peter Todd
Legendary
*
expert
Offline Offline

Activity: 1120
Merit: 1160


View Profile
July 04, 2013, 04:30:10 AM
 #42

Okay, so the network operator could mislead a node onto his invalid chain by handing it fake fraud challenges.  Can't he do this regardless, by simply refusing to relay valid block headers?

Among other things there is the problem that without the way to prove that a txout doesn't exist the network operator can prevent a SPV node from ever knowing that they have been paid and there is nothing the SPV node can do about it.

d'aniel
Sr. Member
****
Offline Offline

Activity: 461
Merit: 251


View Profile
July 04, 2013, 04:49:10 AM
 #43

I don't follow.  How could an SPV node function with only SPV peers?  How would it get the tx data that it needs?

Payment protocols.

Anyway the whole idea of just assuming SPV nodes are going to be able to get the tx data they need from peers for free is really flawed. Running bloom filters against the blockchain data costs resources and at some point people aren't going to be willing to do that for free.

I've got writing "tit-for-tat" peering on my todo list actually: IE silently fail to relay transactions to peers if they don't relay enough new and valid transactions to you for being leeches on the network. Pure SPV clients, like Android clients, can pay for the resources they consume via micropayment channels or at least proof-of-work. It'd also prevent attackers from DoSing the network by making large numbers of connections to large numbers of nodes to fill the incoming peer slots of nodes on the network; you need very little in the way of resources to pull off that attack right now.
Ways to reduce the load off of full nodes, and pay for their efforts seems like good ideas.  But do you expect some SPV nodes to not be able to connect to any full nodes at all at some point in the future?  If there's a market for the service, then surely they will always be able to find some willing to provide it, especially if their security depends on it.

Among other things there is the problem that without the way to prove that a txout doesn't exist the network operator can prevent a SPV node from ever knowing that they have been paid and there is nothing the SPV node can do about it.
That sounds a bit obnoxious, sure, but is it really that big a problem?
Peter Todd
Legendary
*
expert
Offline Offline

Activity: 1120
Merit: 1160


View Profile
July 04, 2013, 05:16:22 AM
 #44

I don't follow.  How could an SPV node function with only SPV peers?  How would it get the tx data that it needs?

Payment protocols.

Anyway the whole idea of just assuming SPV nodes are going to be able to get the tx data they need from peers for free is really flawed. Running bloom filters against the blockchain data costs resources and at some point people aren't going to be willing to do that for free.

I've got writing "tit-for-tat" peering on my todo list actually: IE silently fail to relay transactions to peers if they don't relay enough new and valid transactions to you for being leeches on the network. Pure SPV clients, like Android clients, can pay for the resources they consume via micropayment channels or at least proof-of-work. It'd also prevent attackers from DoSing the network by making large numbers of connections to large numbers of nodes to fill the incoming peer slots of nodes on the network; you need very little in the way of resources to pull off that attack right now.
Ways to reduce the load off of full nodes, and pay for their efforts seems like good ideas.  But do you expect some SPV nodes to not be able to connect to any full nodes at all at some point in the future?  If there's a market for the service, then surely they will always be able to find some willing to provide it, especially if their security depends on it.

Among other things there is the problem that without the way to prove that a txout doesn't exist the network operator can prevent a SPV node from ever knowing that they have been paid and there is nothing the SPV node can do about it.
That sounds a bit obnoxious, sure, but is it really that big a problem?

Yes: it allows someone to claim they have funds that they have since spent.

Also look at in the broader sense: if you do have UTXO proofs an SPV node can pay for a full-node connection and SPV-related services, either with real funds or a anti-DoS proof-of-work, and be sure that the node is being honest and they are getting accurate data with nothing more than a source of block header info. (relaying of block headers between SPV nodes is something I'm also planning on implementing)

d'aniel
Sr. Member
****
Offline Offline

Activity: 461
Merit: 251


View Profile
July 04, 2013, 06:20:07 AM
 #45

That sounds a bit obnoxious, sure, but is it really that big a problem?
Yes: it allows someone to claim they have funds that they have since spent.
So in the event that you are connecting to the same network that someone spending coins to you controls or is colluding with the controller of, then you can get screwed.  I agree, that is a sore spot.  But wouldn't it look suspicious if on some new network you weren't able to find any recognizable peers to connect to?

Quote
Also look at in the broader sense: if you do have UTXO proofs an SPV node can pay for a full-node connection and SPV-related services, either with real funds or a anti-DoS proof-of-work, and be sure that the node is being honest and they are getting accurate data with nothing more than a source of block header info. (relaying of block headers between SPV nodes is something I'm also planning on implementing)
To be fair, all the data is provably accurate, you just don't know if you're being told the whole story.

There are definitely some benefits to this, but there are also costs, and I'm just wondering if there are perhaps other cheaper ways to get practically the same benefits.  Thankfully maaku will be providing us with a sense of the costs pretty soon.
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
July 04, 2013, 09:49:02 AM
 #46

The second problem is that you don't need to make these changes to have immediate initial startup. SPV wallets can already do the initial sync in a few seconds. If you want to run a full node too, just run both MultiBit and bitcoind in parallel until the latter is ready. If you want it to be seamless, just bundle bitcoind with a MultiBit and make it connect to your new local node once it's finished syncing. SPV+local full node = full node security, except that it's a very simple implementation, with the advantage that you can seamlessly switch back and forth afterwards.

The reference client should operate that way anyway.

- download headers
- verify scan backwards [ * ]

The client could give the length of the history that has been confirmed.

It could say that it has verified all blocks back to the inputs to all your transactions and how many total blocks have been scanned.

Each transaction could go through different states
0) not synced (RED)
1) scanned back to earliest input and 1k additional blocks (ORANGE)
2) scanned back to a checkpoint (YELLOW)
3) synced back to genesis block (GREEN)

[ * ] The client would have to keep a hashmap of all transactions that have occured in later blocks, so it can detect a double spend.  Effectively, it would be an "unfunded input set" rather than an "unspent output set".

Even level 1 security would be pretty rock solid.  It would require a 1k re-org to be violated.

Quote
I'm not actually convinced it's a good idea though. The user experience of having a full node randomly start syncing in the background because something decided you had enough cpu/disk space is very poor.

You could add CPU limits, so that it never uses more than 10% of any CPU (do 5ms of work and then sleep for 50ms).  Ideally, with a distributed verification engine, 10% of everyone's CPUs would be more than enough to verify the chain.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
July 04, 2013, 10:44:11 AM
 #47

One catch is that unfortunately UTXO commitments are in themselves very dangerous in that they allow you to safely mine without actually validating the chain at all.

You still need the UTXO tree though.  Updating the tree requires that you process the entire block.

Quote
That said I'm pretty sure it's possible to allow those possession proofs to be done in a way where participants can maintain only part of the UTXO set, and thus part of the P2P relay bandwidth, allowing low-bandwidth true mining + larger blocksizes without decreasing the 51% attack threshold and helping solve the censorship problem and keeping thus Bitcoin decentralized. The devil is in the details, but at worst it can be done as a merge-mined alt-coin.

Ideally, there would be some way to merge multiple sub-blocks into a higher level block.

For example, someone publishes a merkle root containing only transactions which start with some prefix.  A combiner could take 16 of those any produce a higher level merkle root.  The prefix for the combined root would be 4 bits less.

Effectively, you would have 16 sub-chains, where each sub-chain deals with transactions that start with a particular prefix.  Each of those chains could have sub-chains too.

The only way to make it work with Bitcoin would be to have a defined number of transactions per block.  You can only combine two merkle trees into 1 merkle tree if both children have the same number of transactions.

There would also need to be some kind of trust system for the combiners.  If you have to verify, then it defeats the purpose.

A "trusted-identity" would certify the combination (and get fees).  If proof is given that the claim is false, then that id is eliminated.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
Peter Todd
Legendary
*
expert
Offline Offline

Activity: 1120
Merit: 1160


View Profile
July 04, 2013, 11:31:34 AM
 #48

One catch is that unfortunately UTXO commitments are in themselves very dangerous in that they allow you to safely mine without actually validating the chain at all.

You still need the UTXO tree though.  Updating the tree requires that you process the entire block.

That's not true unfortunately. You'll need to do some extra UTXO queries strategically to get the right information on the branches next to the transaction being changed, but with that information you have enough to calculate the next UTXO tree state without having any of the data. This is inherent to having a fast UTXO commitment system in the first place because part of making it cheap to provide those commitments is keeping the data changed for every new block at a minimum - exactly opposite the need to have miners prove they actually have the UTXO set at all.

Ideally, there would be some way to merge multiple sub-blocks into a higher level block.

For example, someone publishes a merkle root containing only transactions which start with some prefix.  A combiner could take 16 of those any produce a higher level merkle root.  The prefix for the combined root would be 4 bits less.

Effectively, you would have 16 sub-chains, where each sub-chain deals with transactions that start with a particular prefix.  Each of those chains could have sub-chains too.

The only way to make it work with Bitcoin would be to have a defined number of transactions per block.  You can only combine two merkle trees into 1 merkle tree if both children have the same number of transactions.

There would also need to be some kind of trust system for the combiners.  If you have to verify, then it defeats the purpose.

A "trusted-identity" would certify the combination (and get fees).  If proof is given that the claim is false, then that id is eliminated.

You're missing a key point: transactions can touch multiple parts of the UTXO set so once the UTXO set is split into subsets participants validate (at minimum) pairs of those subsets.

TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
July 04, 2013, 11:43:11 AM
 #49

That's not true unfortunately. You'll need to do some extra UTXO queries strategically to get the right information on the branches next to the transaction being changed, but with that information you have enough to calculate the next UTXO tree state without having any of the data.

Right. 

In fact, you could require all the extra data be attached to the transaction.

Quote
You're missing a key point: transactions can touch multiple parts of the UTXO set so once the UTXO set is split into subsets participants validate (at minimum) pairs of those subsets.

I was thinking that you share out the transactions based on the transaction's own hash.  However, you are right, to actually verify, you need all the info.

In theory, transactions could be passed between the 16 sub-chains, and you need a verification for each of the 16 sub-chains to be actually verified.

You could have 16 blocks per actual "full" block.  A sub-block would contain

- all transactions from the previous block, except
-- Any transactions added 16 blocks previously (they count as confirmed)
-- Any transactions that have inputs from transactions in this sub-chain, but are invalid
- Any new transactions for this sub-set

This is similar to my suggestion to have odd/even transaction rules.  Basically, you guarantee that the next block won't alter certain aspects of the UTXO set.

This allows the next block to be prepared ahead of time.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
d'aniel
Sr. Member
****
Offline Offline

Activity: 461
Merit: 251


View Profile
July 04, 2013, 09:51:08 PM
 #50

Partially verifying nodes is what I ended up being interested in utxo set commitments for, and they seem to be really useful if/when we get efficient verifiable computing.  Though there definitely isn't any near-term need for this.
Re: verifiable computing and utxo set commitments, I noticed that a commitment of the utxo set digest into the block header wouldn't actually be necessary since you'd already have a succinct proof of its validity which is much stronger than simple inclusion in a block header.  Though the authenticated utxo set is still necessary for being able to turn blockchain verification into verifying a chain of succinct verifiable computing proofs(?), and for knowing your copy of the utxo (sub)set is valid at any given time.
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
July 05, 2013, 08:38:31 AM
 #51

It's not quite that simple - to combine provable computations with blocks you could "just" provide the proofs to the clients that verify them, which indeed is a huge upgrade over SPV security assuming the verifications are sufficiently cheap. However as all such schemes require encoding of the entire input to the function and have complexity related to the size of the inputs, the right way to do it is to have the provable program actually work with a UTXO merkle tree and output deltas to that tree. At least, that's the scheme we came up with when I discussed this with Bryan Parno.

If someone wanted to start calculating UTXO merkle trees in order to experiment with provable verification, then I'd think that was cool but it's a long way from "we should actually make this a part of Bitcoin". See that as zerocoin style research. Cutting edge stuff that has great potential but it's not something for todays world.
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
July 05, 2013, 08:56:55 AM
Last edit: July 05, 2013, 07:34:35 PM by TierNolan
 #52

I think also there could be a difference between verification for mining and general verification.

The problem with mining is that you need verification to be extremely quick.  A distributed system (or at least a p2p distributed system) isn't likely to be able to keep up with a centralised system.

However, with bad block proofs, a distributed system may be able to detect a bad block within 5-10 blocks.  This wouldn't be sufficient to mine, but would be sufficient to keep miners honest.

Ofc, there is still the problem of proving that data is missing.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
piotr_n
Legendary
*
Offline Offline

Activity: 2055
Merit: 1359


aka tonikt


View Profile WWW
July 05, 2013, 07:09:14 PM
Last edit: July 05, 2013, 07:50:04 PM by piotr_n
 #53

You know I really don't care if that's true. Because what I do know is that it's wasting time on a problem that we might have over spending vital time we don't have on a problem that we have RIGHT NOW which frustrates me to no end.
I understand you very well, but all I can tell you is: get used to this - it's normal. They only fix/optimize things if people have reported (and proven) an issue.
And since by their definition an uncompressed blockchain data is not an issue, they just keep following the usual road map, which is always about adding over-complicated things that:
1) nobody really needs, nor they help him with anything
2) can be as well implemented outside the bitcoin node

But try to explain them that no sane bitcoin user seeks a technology that requires him to submit all his personal details (full name, address, etc.) to a certificate issuing authority, in order to receive a bitcoin... and good luck with it!
They already found it important and they are going to spend 12 months while nothing else matters implementing and testing it, repeating that this is very important and widely requested feature - there is nothing any of us can do about it, except maybe laughing on how the suckers waste their time by burying a complex code of unnecessary features inside a so crucial piece of software that a bitcoin core is. So yeah, at least it's funny Smiley

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB  9059 1BE7 B545 CDF3 FD0E
IIOII
Legendary
*
Offline Offline

Activity: 1153
Merit: 1012



View Profile
July 06, 2013, 06:32:37 PM
 #54

You know I really don't care if that's true. Because what I do know is that it's wasting time on a problem that we might have over spending vital time we don't have on a problem that we have RIGHT NOW which frustrates me to no end.
I understand you very well, but all I can tell you is: get used to this - it's normal. They only fix/optimize things if people have reported (and proven) an issue.
And since by their definition an uncompressed blockchain data is not an issue, they just keep following the usual road map, which is always about adding over-complicated things that:
1) nobody really needs, nor they help him with anything
2) can be as well implemented outside the bitcoin node

But try to explain them that no sane bitcoin user seeks a technology that requires him to submit all his personal details (full name, address, etc.) to a certificate issuing authority, in order to receive a bitcoin... and good luck with it!
They already found it important and they are going to spend 12 months while nothing else matters implementing and testing it, repeating that this is very important and widely requested feature - there is nothing any of us can do about it, except maybe laughing on how the suckers waste their time by burying a complex code of unnecessary features inside a so crucial piece of software that a bitcoin core is. So yeah, at least it's funny Smiley


You're so right.

Instead of improving core functionalities, Bitcoin gets bloated with features that are optional "nice-to-haves" which could be more efficiently solved by third party services.

It's false to claim that blockchain size isn't a problem. Every single bitcoin user I know complains about blockchain download times. Implementing pruning would be a great leap forward in usability.
mechs (OP)
Full Member
***
Offline Offline

Activity: 210
Merit: 100



View Profile
July 06, 2013, 11:52:49 PM
 #55

It seems to me their is one quick solution which would be to simply compress the blockchain with a known encryption algorithm such as rar (I as an exercise was able to shrink mine to 65% of its original size) and then the client can decrypt in the memory in the fly as needed.  Hardly a very cpu intensive task for most modern processors and it can be made a feature that can be disabled at the user's option.  It would result in 35% faster downloads and 35% less disk space used at the cost of using the cpu a bit more and some more memory.

This is not a replacment at all for pruning, however this strikes me as fairly quick to implement and should be a part of any solution.  Is there any flaw here I do not see?
piotr_n
Legendary
*
Offline Offline

Activity: 2055
Merit: 1359


aka tonikt


View Profile WWW
July 07, 2013, 01:56:46 AM
 #56

Implementing pruning would be a great leap forward in usability.

Problem with pruning is that if all the nodes remove spent txs from their storage, who is going to serve the block chain? Charities? Smiley

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB  9059 1BE7 B545 CDF3 FD0E
maaku
Legendary
*
expert
Offline Offline

Activity: 905
Merit: 1012


View Profile
July 07, 2013, 04:28:59 AM
 #57

Implementing pruning would be a great leap forward in usability.

The problem isn't (yet) the size of the block chain on disk. 10GB isn't a lot of space. It's network bandwidth - downloading 10GB of data takes a while on any connection. Pruning would do nothing to fix that (you still have to download the blocks before selectively throwing portions away), and it may even make things worse since fewer peers will have the full blockchain to seed from.

I'm an independent developer working on bitcoin-core, making my living off community donations.
If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
July 08, 2013, 12:25:26 PM
 #58

Yes, exactly. Pruning is not a bandwidth saver. Mostly when speed of chain download has been examined, it turned out that disk IO and CPU were the limits, not bandwidth. Mostly people could download the chain faster than Bitcoin can digest it. It might have changed a bit since 0.8.1+/ultraprune/leveldb/other optimisations, but you'd still rapidly get back to the point where disk and CPU are the limits rather than the network.

This is why it's so dangerous to throw around random assumptions about what changes should be made. You might end up not fixing the problem you wanted to fix.

If you ARE bandwidth limited, Jeff already runs a blockchain torrent. I don't remember if it's compressed but as it's just a file, it certainly could be. If you want to bootstrap a new node faster, downloading the torrent is one way to do it. You still have to build the indexes yourself of course, and that's going to take a long time no matter what.

For the vast majority of new users, the correct client to use is MultiBit, and since a few days ago that's now our recommendation on bitcoin.org. The Bitcoin-Qt client is now listed as an app you run if you want to help the network by donating resources. This is a scalable fix - MultiBit will always be fast no matter how popular Bitcoin gets, whereas anything you can do to Bitcoin-Qt either changes the security model to be more like MultiBits (in which case why bother) or will make a percentage improvement taking us from "slow" to "slow". Once you make an informed decision to donate your resources to the network over the long term, the exact amount of time it takes to catch up with the chain head becomes less relevant. You're going to run it for months we'd hope, so one or two days doesn't make a big difference.
maaku
Legendary
*
expert
Offline Offline

Activity: 905
Merit: 1012


View Profile
July 08, 2013, 05:06:19 PM
 #59

Mike, I will make one comment that if you are going to trust the checkpoint (which you do currently - validation in Bitcoin-Qt isn't performed until after the last checkpoint, right?) and if the checkpoint included an authenticated UTXO index hash under the soft-fork scenario, then you could just download that UTXO set and start verification from there, with *exactly* the same security as you would otherwise would have had. This let's us keep the "one to two days" to perform IBD to a small value, and not continue to increase over time.

I'm an independent developer working on bitcoin-core, making my living off community donations.
If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP
piotr_n
Legendary
*
Offline Offline

Activity: 2055
Merit: 1359


aka tonikt


View Profile WWW
July 08, 2013, 05:16:06 PM
 #60

I agree.
If there are these checkpoints anyway, you can as well distribute a new version of the client with the content of the last checkpoints' UTXO, instead of only a block's hash.
This would make bootstrapping so much faster, but also require no storage for all the previous blocks, since any node would be able to download the last checkpoint's db content from the peers - with a checksum it would be equally "safe" as the checpoint values now, but so much more useful.
It's a bold move, but in fact not bolder than the idea of checkpoints itself.
And it would at least optimize something.
Though, again, who am I talking to + the idea of checkpoints is stupid and goes totally against the main purpose of the solution Smiley

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB  9059 1BE7 B545 CDF3 FD0E
Pages: « 1 2 [3] 4 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!