Bitcoin Forum
Author Topic: Ultimate blockchain compression w/ trust-free lite nodes  (Read 87867 times)
maaku
Legendary
*
expert
Offline Offline

Activity: 905
Merit: 1011


View Profile
June 21, 2012, 05:44:05 PM
 #101

@etotheipi, what about non-standard transactions, including IP, P2SH, and future contract formats? Not all outputs can be reduced to an address. We've been speaking loosely about a tree of “addresses”, but it would really have to be a tree of output scripts, so it's not going to be possible to limit the search-string length for the prefix trie.

I'm an independent developer working on bitcoin-core, making my living off community donations.
If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
June 21, 2012, 05:53:47 PM
 #102

Quote from: maaku
@etotheipi, what about non-standard transactions, including IP, P2SH, and future contract formats? Not all outputs can be reduced to an address. We've been speaking loosely about a tree of “addresses”, but it would really have to be a tree of output scripts, so it's not going to be possible to limit the search-string length for the prefix trie.

I would think that all that matters is that there be a deterministic index that can be used to look it up.

P2SH has a hash.  IP, to the best of my knowledge, isn't a kind of transaction, but is just a way to produce a pubkey-based transaction (from which an address/hash can be derived).  Transaction formats yet to be invented could easily stipulate some way of being found, if a simple default of "first hash in the script, or hash of [first constant | all concatenated constants] in the script bigger than X bits, whichever comes first" didn't solve most or all cases with a single broad stroke.  (For example, if such a default didn't make sense for a future transaction type, that future transaction type could contain a field that says "My Search Key is X".)
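To make that default concrete, here is a rough sketch of one possible reading of the rule (entirely my own illustration; the min_bits threshold, the normalizing hash, and the fallback are assumptions, not anything agreed in this thread):

Code:
import hashlib

def search_key(script_constants, min_bits=160):
    # One possible reading of the proposed default: take the first pushed
    # constant in the script that is at least min_bits long (e.g. a hash160
    # or a public key) and hash it down to a fixed-length key...
    for c in script_constants:
        if len(c) * 8 >= min_bits:
            return hashlib.sha256(c).digest()
    # ...otherwise fall back to hashing all of the constants concatenated.
    return hashlib.sha256(b"".join(script_constants)).digest()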

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
vuce
Sr. Member
****
Offline Offline

Activity: 476
Merit: 250


View Profile
June 21, 2012, 05:56:35 PM
Last edit: June 21, 2012, 06:12:32 PM by vuce
 #103

Quote from: etotheipi
I'm voting for level-compressed trie structures (so, a variant of Patricia trees) which have no balance issues at all, are insert-order-independent, and O(1) query/insert/delete.
Quote
To insert a string, we search the trie until we can make no further progress. At this point we either add a new outgoing edge labeled with all remaining characters in the input string, or if there is already an outgoing edge sharing a prefix with the remaining input string, we split it into two edges (the first labeled with the common prefix) and proceed.
So new insertions go to the leaves of the tree; I think this would make it insert-order dependent, just like any other tree. I'd suggest AVL instead of red-black, since it has a lower worst-case height.
CoinLab
Sr. Member
****
Offline Offline

Activity: 270
Merit: 250


1CoinLabF5Avpp5kor41ngn7prTFMMHFVc


View Profile WWW
June 21, 2012, 08:34:05 PM
 #104

This is a very interesting idea.  Excited to see how it develops.
etotheipi (OP)
Legendary
*
expert
Offline Offline

Activity: 1428
Merit: 1093


Core Armory Developer


View Profile WWW
June 21, 2012, 08:42:30 PM
 #105

Quote from: maaku
@etotheipi, what about non-standard transactions, including IP, P2SH, and future contract formats? Not all outputs can be reduced to an address. We've been speaking loosely about a tree of “addresses”, but it would really have to be a tree of output scripts, so it's not going to be possible to limit the search-string length for the prefix trie.

I was expecting that the hash of the TxOut script would be used, so that all nodes are exactly 32 bytes. You could argue that's exactly what most TxOut scripts already are: hashes of longer data fields (such as using hash160s in place of public keys). But you have to make sure the search key is strictly bounded in size if you're using a trie of some sort.
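For illustration, that keying rule is essentially a one-liner; the choice of a single SHA-256 here is my assumption, since the thread doesn't pin down the exact hash function:

Code:
import hashlib

def utxo_key(txout_script: bytes) -> bytes:
    # Fixed-length 32-byte search key: hash the serialized TxOut script,
    # whatever its type (pay-to-pubkey, pay-to-address, P2SH, future formats).
    return hashlib.sha256(txout_script).digest()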

@vuce
The structure of a trie has no dependence on insert order.  Given a set of data, there is only one trie that can hold it.  The same goes for Patricia tries (which are level-compressed tries).  And given that its query, insert, and delete times are based strictly on key size (which will be bounded as above), there are no balance issues at all: it never takes more than 32 "hops" to get from the root to the leaf you want, regardless of whether you are querying, inserting, or deleting.  So given a fixed key length, all those operations are actually O(1).
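As a toy illustration of the insert-order-independence argument, here is a plain (uncompressed) 16-way trie over such a 32-byte key; a Patricia trie just collapses the single-child chains. The names and structure are mine, for illustration only:

Code:
class TrieNode:
    def __init__(self):
        self.children = {}   # nibble (0..15) -> TrieNode
        self.value = None

def nibbles(key32: bytes):
    # A 32-byte key always expands to exactly 64 nibbles, so every path from
    # the root to a leaf has the same fixed length; no rebalancing is needed.
    for b in key32:
        yield b >> 4
        yield b & 0x0F

def insert(root: TrieNode, key32: bytes, value):
    node = root
    for nib in nibbles(key32):
        node = node.children.setdefault(nib, TrieNode())
    node.value = value

def lookup(root: TrieNode, key32: bytes):
    node = root
    for nib in nibbles(key32):
        node = node.children.get(nib)
        if node is None:
            return None
    return node.value

Because each key's path is determined entirely by its own bytes, inserting the same set of keys in any order produces the same structure, and no path is ever longer than the 64 nibbles of the key.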

On the other hand, I was hoping for a structure that wasn't too complicated, and both RB trees and Patricia tries have complicated implementations (even though the concepts behind them are fairly simple).  But if we're going to have to go with something complicated anyway (to limit worst-case performance), then I'd have to vote for a Patricia trie or variant.  Not only is it O(1)... someone brought up the very good point that updates to the tree can mostly be parallelized.  That sounds like another good property for a tree that's going to have very high update rates...

I just gotta spend some time to figure out the space overhead for storing TxOuts.  If it's going to triple the overall disk space compared to other structures, it might be worth using one of the insert-order dependent trees.

What other data structures am I missing that could be considered?  I know B-trees would be a good choice if we are going with an insert-order-dependent structure:  they are easy to keep balanced with fairly simple rules for balancing.

Founder and CEO of Armory Technologies, Inc.
Armory Bitcoin Wallet: Bringing cold storage to the average user!
Only use Armory software signed by the Armory Offline Signing Key (0x98832223)

Please donate to the Armory project by clicking here!    (or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!)
vuce
Sr. Member
****
Offline Offline

Activity: 476
Merit: 250


View Profile
June 21, 2012, 08:47:07 PM
Last edit: June 21, 2012, 09:22:18 PM by vuce
 #106

Quote from: etotheipi
@vuce
The structure of a trie has no dependence on insert order.  Given a set of data, there is only one trie that can hold it.  The same goes for Patricia tries (which are level-compressed tries).  And given that its query, insert, and delete times are based strictly on key size (which will be bounded as above), there are no balance issues at all: it never takes more than 32 "hops" to get from the root to the leaf you want, regardless of whether you are querying, inserting, or deleting.  So given a fixed key length, all those operations are actually O(1).

I was citing the Patricia trie wiki, where it's pretty obvious that new inserts are inserted as leaves of the tree, therefore making it insert-order dependent. If you could direct me to a better explanation I would appreciate it.

Never mind, I misunderstood how it works  Embarrassed. FWIW, I agree; I think this might be the best choice.
Quote
What other data structures am I missing that could be considered?  I know B-trees would be a good choice if we are going with insert-order-dependent structure:  they are easy to keep balanced with fairly simple rules for balancing.

The AVL tree is the mother of balanced binary trees. It has the smallest worst-case height, so the fastest queries, but somewhat slower insert/delete than a red-black tree. It is also very easy to implement.
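For reference, the standard textbook bounds behind that claim (my addition, not from the thread): an AVL tree on n keys has height h < 1.4405 * log2(n + 2) - 0.328, while a red-black tree only guarantees h <= 2 * log2(n + 1), which is why AVL gives the better worst-case lookup.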

A 2-3-4 tree might also be worth considering. I don't know whether it's insert-order-independent or not, but at a quick look it might be. Or maybe a plain 2-3 tree; that one has data in the leaves only, so it looks kind of like a Merkle tree, but it does have quite a bit of overhead.
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
June 21, 2012, 08:55:14 PM
 #107

I would consider height to be a worthy thing to control, not so much for query speed, but because the nodes from leaf to ancestor might have to be transmitted across the network in response to a query.

Also, as we discuss these tree types, I want to make sure we are not straying far from the definition of a Merkle tree, to maintain the desirable property of being able to prove that key x is or is not in the tree by providing all of the hashes necessary to climb to the root. All of these nifty tree types that put data in the branches rather than the leaf nodes may not retain that important property.  I read about red-black trees on Wikipedia and noticed that the data does not go in the leaf nodes, and I cannot clearly see how I could clip part of that tree, hand it to someone, and have them be able to trust my clipping via a Merkle root.
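For reference, the property being asked for is the standard Merkle-branch check; a minimal sketch follows (the double-SHA256 and the helper names are my assumptions, not something this thread specifies):

Code:
import hashlib

def dhash(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def verify_branch(leaf_hash: bytes, branch, merkle_root: bytes) -> bool:
    # branch: list of (sibling_hash, sibling_is_left) pairs, ordered leaf -> root.
    h = leaf_hash
    for sibling, sibling_is_left in branch:
        h = dhash(sibling + h) if sibling_is_left else dhash(h + sibling)
    return h == merkle_root

A lite client holding only the root can check such a branch; proving that a key is absent additionally needs a structure whose shape is fixed by the keys themselves (as with the tries discussed above), so the server can exhibit the spot where the key would have had to be.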

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
2112
Legendary
*
Offline Offline

Activity: 2128
Merit: 1065



View Profile
June 21, 2012, 09:28:52 PM
 #108

Quote from: vuce
The AVL tree is the mother of balanced binary trees. It has the smallest worst-case height, so the fastest queries, but somewhat slower insert/delete than a red-black tree.
I just wanted to point out that query speed is pretty much immaterial. All that matters is the update complexity.

Integrating over the world population of Bitcoin clients the probability of any particular key being queried is almost 0, but the probability of any particular key being inserted/deleted is almost 1. This is pretty much the exact opposite of the assumptions made in all classic information storage and retrieval texts.

If you come up with a really good storage tree with low overhead for insert/delete but bad query performance you can easily fix it by maintaining a secondary index structure that facilitates fast query for keys that are locally interesting. That secondary structure may be different for each individual client and be dependent on the local querying behavior.
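A sketch of that split (the API is hypothetical, mine): the globally maintained structure is tuned for cheap insert/delete, and each client layers its own fast-lookup index over just the keys it cares about:

Code:
class LocalIndex:
    """Secondary, client-local index over 'locally interesting' keys only."""
    def __init__(self, interesting_keys):
        self.interesting = set(interesting_keys)   # e.g. scripts for my own addresses
        self.entries = {}
    def on_insert(self, key, txout):
        if key in self.interesting:
            self.entries[key] = txout
    def on_delete(self, key):
        self.entries.pop(key, None)                # spent output drops out
    def query(self, key):
        return self.entries.get(key)               # O(1) local lookup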

Please comment, critique, criticize or ridicule BIP 2112: https://bitcointalk.org/index.php?topic=54382.0
Long-term mining prognosis: https://bitcointalk.org/index.php?topic=91101.0
2112
Legendary
*
Offline Offline

Activity: 2128
Merit: 1065



View Profile
June 21, 2012, 09:44:50 PM
 #109

Quote
I am the one who originally suggested the 100 block interval... but I don't think I said that updating the meta tree only every 100 blocks is what should be done.
Also, I urge you to seriously consider batch-updating the primary storage structure, and keeping the recently-heard-of updates in a separate storage area. This should probably be somewhat similar to the generational garbage collection concept.

I would also urge avoiding 100 and instead choosing a divisor of 2016.

Please comment, critique, criticize or ridicule BIP 2112: https://bitcointalk.org/index.php?topic=54382.0
Long-term mining prognosis: https://bitcointalk.org/index.php?topic=91101.0
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
June 21, 2012, 09:47:25 PM
 #110

Quote from: 2112
Quote
I am the one who originally suggested the 100 block interval... but I don't think I said that updating the meta tree only every 100 blocks is what should be done.
Also, I urge you to seriously consider batch-updating the primary storage structure, and keeping the recently-heard-of updates in a separate storage area. This should probably be somewhat similar to the generational garbage collection concept.

I would also urge avoiding 100 and instead choosing a divisor of 2016.

If you mean a factor of 2016, how about 21? (21 × 96 = 2016.) That's also a clean factor of 21 million, as well as of the 210,000 blocks between reward changes.

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
2112
Legendary
*
Offline Offline

Activity: 2128
Merit: 1065



View Profile
June 21, 2012, 10:08:06 PM
 #111

Quote from: casascius
If you mean a factor of 2016, how about 21? (21 × 96 = 2016.) That's also a clean factor of 21 million, as well as of the 210,000 blocks between reward changes.
I think it is too early to make this decision. I just wanted to stress that the heaviest housekeeping updates should be phase-shifted with respect to the difficulty retarget. In other words the blocks just before and just after the retarget should involve only light housekeeping.

I haven't seen anyone doing any serious game-theoretic analysis of the possible splitting attacks on the global Bitcoin network during the retarget, but I just want to avoid creating additional headaches resulting from batch updates.

Please comment, critique, criticize or ridicule BIP 2112: https://bitcointalk.org/index.php?topic=54382.0
Long-term mining prognosis: https://bitcointalk.org/index.php?topic=91101.0
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
June 21, 2012, 10:20:30 PM
 #112

I just wanted to confirm that we understand that when the network "retargets", or the block reward halves, it isn't actually doing anything computationally intensive.  All that happens is that the client expects to see a different number in a field in future blocks as compared to past blocks.

Rebuilding a tree isn't too computationally intensive to happen once every 3.5 hours (which is what 21 blocks would suggest, and is also a relatively fixed interval).  All that matters is that we know it is too computationally intensive to happen per transaction (which is multiple times per minute and tends to infinity).  I can't see how picking this number now is premature: whether the magic number is 1, 3, 7, 21, or 420 blocks, all choices accomplish the goal of not pegging users' CPUs at 100% chugging through this tree.

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
maaku
Legendary
*
expert
Offline Offline

Activity: 905
Merit: 1011


View Profile
June 22, 2012, 01:14:11 AM
Merited by ABCbits (1)
 #113

Quote from: etotheipi
On the other hand, I was hoping for a structure that wasn't too complicated, and both RB trees and Patricia tries have complicated implementations (even though the concepts behind them are fairly simple).  But if we're going to have to go with something complicated, anyway (to limit worst-case speed and time performance), then I'd have to vote for Patricia trie or variant.  Not only is it O(1)... someone brought up the very good point that updates to the tree can mostly be parallelized.  That sounds like another good property of a tree that's going to have very high update rates...
The 2-3-4 tree (aka a B-tree of order 4) is really the only one I would consider in this circumstance. RB-trees are actually a special representation of 2-3-4 trees, but the implementation choices in balancing an RB-tree don't exist for 2-3-4 trees (for a given 2-3-4 tree there can exist more than one RB-tree that “encodes” that structure, but not vice versa). A higher-order B-tree would also work, but then you would be trading increased CPU time for decreased I/O, which doesn't fit this application.

That said, the ability to parallelize prefix/radix tries is a very good point. You might win me over to that side yet... but if self-balancing trees are to be used, the 2-3-4 tree has some clear benefits over the others (AVL, RB, etc.).


To the other posts... why would you ever need to rebuild the tree? I don't understand the purpose. If you are using a self-balancing structure then it stays balanced “for free”. And under what circumstance would you have all the transaction data, but not an existing tree structure or the block chain from which you can order the updates?

I'm an independent developer working on bitcoin-core, making my living off community donations.
If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
June 22, 2012, 02:22:26 AM
 #114

Quote from: maaku
If you are using a self-balancing structure then it stays balanced “for free”. And under what circumstance would you have all the transaction data, but not an existing tree structure or the block chain from which you can order the updates?

If you have a self-balancing structure, you may not have a Merkle tree, which is what you'd want as a node so you can serve lookups to lite clients that have no block chain, and so they can determine with 100% certainty, given just the Merkle root, whether you're telling the truth.  What you have instead with all these neat tree ideas is an index to speed up database ops - which would be great if we were building a database engine - but the title of the OP is "trust-free lite nodes".  I am not sure that, in enumerating all of these other wonderful tree structures, we are remembering that we need the cryptographic properties a Merkle tree offers to accomplish the stated goal.

Assuming you had a copy of such a tree... the circumstance in which one would have possession of it is that, as a node, one would have acquired it from a peer as a complete and total substitute for the block chain (other than the block headers and perhaps a few days of recent blocks).  The whole point of this exercise is to ditch the storage requirement of spent transactions in the block chain, for the purpose of scaling Bitcoin and dealing with the fact that the block chain will soon be too heavy to lift - not so much to shave milliseconds off index lookups.

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
etotheipi (OP)
Legendary
*
expert
Offline Offline

Activity: 1428
Merit: 1093


Core Armory Developer


View Profile WWW
June 22, 2012, 02:46:27 AM
 #115

Quote from: casascius
Quote from: maaku
If you are using a self-balancing structure then it stays balanced “for free”. And under what circumstance would you have all the transaction data, but not an existing tree structure or the block chain from which you can order the updates?
If you have a self-balancing structure, you may not have a Merkle tree, ...

Any of these can be made into an authenticated data structure.  Each node, including non-leaf nodes, may represent data, so you just append this node's data to the hashes of its non-null children and hash that to get the current node's value.  Then its parent nodes do the same to aggregate their children.

I haven't defined it rigorously, but it can be done.  One issue with a 256-way Patricia/radix tree is that verifying each level will need the values of the other 255 children at that level.  Granted, it only matters at the top levels where there's a lot of branching; beyond levels 4 and 5, basically all other child pointers will be null.  But it goes along with why I'm hesitant to endorse a Patricia tree: there might be a lot of data to transfer just to verify a node.
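A rough sketch of the node-hashing rule described above (my own formulation; the serialization and child ordering are assumptions): each node's authentication value commits to its own data plus the cached hashes of its non-null children, which is also why, at a branchy level, up to 255 sibling hashes have to accompany a proof.

Code:
import hashlib

def node_auth_hash(node_data: bytes, children) -> bytes:
    # children: a 256-slot list, one slot per possible next byte; None where empty.
    h = hashlib.sha256()
    h.update(node_data)                  # data carried at this node, if any
    for i, child in enumerate(children):
        if child is not None:
            h.update(bytes([i]))         # which branch this child hangs off
            h.update(child.auth_hash)    # child's cached authentication hash (assumed attribute)
    return h.digest()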



Founder and CEO of Armory Technologies, Inc.
Armory Bitcoin Wallet: Bringing cold storage to the average user!
Only use Armory software signed by the Armory Offline Signing Key (0x98832223)

Please donate to the Armory project by clicking here!    (or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!)
2112
Legendary
*
Offline Offline

Activity: 2128
Merit: 1065



View Profile
June 22, 2012, 03:00:20 AM
 #116

Quote from: maaku
why would you ever need to rebuild the tree?
Probably instead of "rebuilding the tree" the better phrase would be "rehashing the nodes on the update path in the tree". The rehashing of many short strings would completely dominate the cost of the update (insertion/deletion). This is one of those things that the O() notation doesn't convey.

The benefits of a short-and-fat data structure over a tall-and-lean one are four-fold:

1) less rehash overhead after an update
2) less latency sensitivity when queried level-wise over the p2p network (at the expense of wasted bandwidth)
3) lower write amplification if the structure is stored locally on a block-storage device
4) the ability to batch-update nodes, incurring a single rehash overhead for multiple inserts/deletions.

Ultimately, from the point of view of long-term code stability, it would be better to choose a more generic structure instead of specifically a 2-3-4 tree. If we choose an N-to-M tree, we could set the N and M values based on experimentation now, and possibly easily change them in the future if experience shows that our initial guess was not so good.
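A sketch of benefit 4 above, against a hypothetical tree API of my own: apply a whole batch of inserts and deletions first, remember which nodes were touched, and rehash each affected node once, bottom-up, instead of once per operation:

Code:
def apply_batch(tree, inserts, deletes):
    dirty = set()
    for key, value in inserts:
        dirty.update(tree.insert(key, value))   # assumed to return the nodes on the touched path
    for key in deletes:
        dirty.update(tree.delete(key))
    # Rehash deepest nodes first so every parent sees fresh child hashes.
    for node in sorted(dirty, key=lambda n: n.depth, reverse=True):
        node.rehash()
    return tree.root_hash()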

Please comment, critique, criticize or ridicule BIP 2112: https://bitcointalk.org/index.php?topic=54382.0
Long-term mining prognosis: https://bitcointalk.org/index.php?topic=91101.0
tevirk
Newbie
*
Offline Offline

Activity: 15
Merit: 0



View Profile
June 22, 2012, 05:05:48 AM
 #117

Quote
Quote
Sorry to be slow, but I don't see the gain here.  If a lightweight client is going to trust that a metablock that's been merged into the chain is truthful (because it's been built into a block), then it can just as reliably trust that a transaction that's in the chain a few blocks back is valid, because it's been built into a block.
That works if, as a lightweight node, you plan only on receiving funds that have a very small number of confirmations, which eliminates your view of the majority of bitcoins that exist.  In USD terms, this would be like limiting yourself to only being able to accept crisp dollar bills that have never been handled more than once or twice.  More likely than not, you're going to need to be able to receive funds from anybody, which will have been confirmed anywhere on the block chain between the genesis block and now.  You either need the whole block chain to know whether a given incoming transaction is valid, or at least the digested tree of all unspent txouts for the entire block chain.

I'm not talking about the inputs to the transaction that pays me, I'm talking about the transaction that pays me itself. Fred posts a transaction  which he says pays me 100 bitcoins. If I have the digested tree, or the whole block chain, I can check the transaction is valid. If I don't, I can't. But either way, it's still unverified. I'm going to wait for confirmations. Once I have confirmations, then I know it's valid too, because if I'm trusting the miners not to include fictional unspent txout digests I might just as well trust them not to include invalid transactions.

The problem this mechanism solves - validating an unverified transaction in a lightweight node - doesn't look to me like a very important one.
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
June 22, 2012, 05:30:01 AM
 #118

Quote from: tevirk
I'm not talking about the inputs to the transaction that pays me, I'm talking about the transaction that pays me itself. Fred posts a transaction which he says pays me 100 bitcoins. If I have the digested tree, or the whole block chain, I can check the transaction is valid. If I don't, I can't. But either way, it's still unverified. I'm going to wait for confirmations. Once I have confirmations, then I know it's valid too, because if I'm trusting the miners not to include fictional unspent txout digests I might just as well trust them not to include invalid transactions.

The problem this mechanism solves - validating an unverified transaction in a lightweight node - doesn't look to me like a very important one.


Sure, if you aren't mining, don't mind spam, and don't see any value in reliably knowing you have received funds from Fred in under an hour (since you can't tell the difference between a bogus transaction from Fred with fake inputs and a real one).  You might be willing and able to wait 6 confirmations before deciding others have paid you, but others won't.

Under this proposal, miners can reliably mine using these trees and not the block chain.  If you think all miners will want to lug around a block chain whose size tends closer to infinity with each person who starts running a gambling bot, then sure, this isn't important.

Being able to validate a transaction instantly is important for spam prevention.  Nodes only relay valid transactions.  If you can't validate transactions, you have no choice but to blindly spew to your peers anything any of them sends.  You'll be a sitting duck for DoS attacks (since for every 1 message coming in you'll nominally send 7 out), and a whole network made of nodes like this would be easy to spam into oblivion.

Finally, this tree proposal isn't meant to RUN on a lightweight node.  It is meant to let a normal node SERVE lightweight nodes while not itself having to keep the full, unabridged block chain.

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
tevirk
Newbie
*
Offline Offline

Activity: 15
Merit: 0



View Profile
June 22, 2012, 06:08:09 AM
 #119

Quote from: casascius
Quote from: tevirk
I'm not talking about the inputs to the transaction that pays me, I'm talking about the transaction that pays me itself. Fred posts a transaction which he says pays me 100 bitcoins. If I have the digested tree, or the whole block chain, I can check the transaction is valid. If I don't, I can't. But either way, it's still unverified. I'm going to wait for confirmations. Once I have confirmations, then I know it's valid too, because if I'm trusting the miners not to include fictional unspent txout digests I might just as well trust them not to include invalid transactions.

The problem this mechanism solves - validating an unverified transaction in a lightweight node - doesn't look to me like a very important one.
Sure, if you aren't mining, don't mind spam, and don't see any value in reliably knowing you have received funds from Fred in under an hour (since you can't tell the difference between a bogus transaction from Fred with fake inputs and a real one).  You might be willing and able to wait 6 confirmations before deciding others have paid you, but others won't.

Under this proposal, miners can reliably mine using these trees and not the block chain.  If you think all miners will want to lug around a block chain whose size tends closer to infinity with each person who starts running a gambling bot, then sure, this isn't important.

Being able to validate a transaction instantly is important for spam prevention.  Nodes only relay valid transactions.  If you can't validate transactions, you have no choice but to blindly spew to your peers anything any of them sends.  You'll be a sitting duck for DoS attacks (since for every 1 message coming in you'll nominally send 7 out), and a whole network made of nodes like this would be easy to spam into oblivion.

Finally, this tree proposal isn't meant to RUN on a lightweight node.  It is meant to let a normal node SERVE lightweight nodes while not itself having to keep the full, unabridged block chain.


So the scenario in which this helps is where

(a) transaction volume is so high that even miners running fancy purpose-built mining rigs can't store the transaction history on a standard-issue 1TB hard drive
(b) every Tom, Dick and Harry runs a lightweight node which relays every single transaction on a P2P network.

Those two conditions contradict each other.  If the transaction rate goes up that high (and I think it shouldn't, but that's an entirely different discussion), bandwidth becomes the limiting factor before storage space does. At that transaction rate, the Bitcoin network inevitably evolves into a backbone of heavy nodes exchanging everything, plus lightweight clients that consume only the data of interest to them. That's quite independent of how the history is handled.

As to unconfirmed transactions, are there really going to be that many people who will accept an unconfirmed transaction, but not be willing to trust anyone to validate it for them?
maaku
Legendary
*
expert
Offline Offline

Activity: 905
Merit: 1011


View Profile
June 22, 2012, 07:21:48 AM
 #120

Maybe this will help: the confusion is that when Satoshi released Bitcoin 0.1 and the whitepaper, he came up with the idea of a version of the client that keeps only block headers and scans for transactions it cares about. It has become known as SPV (Simplified Payment Verification), or a “lightweight client”, and it has a weaker trust model than a fully verifying client.

We are not discussing that in this thread. What's going on here is the creation of a full-featured “thick client” that doesn't require the entire block chain history and could conceivably run even on low-memory and embedded devices as Bitcoin scales up. You can have your cake and eat it too.

I'm an independent developer working on bitcoin-core, making my living off community donations.
If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP