Bitcoin Forum
Author Topic: Ultimate blockchain compression w/ trust-free lite nodes  (Read 87865 times)
tevirk (Newbie; Activity: 15, Merit: 0)
June 20, 2012, 05:23:04 AM  #81

Because most of the data in the block chain is hashes, it's not compressible at all.
maaku (Legendary, expert; Activity: 905, Merit: 1011)
June 20, 2012, 07:38:46 AM  #82

Quote:
Can compression be used as an intermediate step along the way? That is, is the blockchain stored on the disk in an efficient manner? Also, why do noobs have to download the whole block chain from peers? It's soooo slow. Couldn't each release come with a zipped copy of the blockchain up to the date it was released, along with a hard-coded hash of that chain and a built-in check? That way the user could just download the whole shebang in one swoop.

These of course are not permanent measures, but maybe interim fixes for now.
Most of the sync time is spent doing ECDSA verification and disk I/O. Compression would actually make the problem worse. Packaging the block chain up to the latest checkpoint isn't a bad idea, but won't improve the situation as much as you probably think. The client would still have to verify the block chain, which means hours on end of 100% CPU utilization.

The real solution is to develop a way to avoid verification of historical data without compromising the security that verification provides. That's what this thread is about.

I'm an independent developer working on bitcoin-core, making my living off community donations.
If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP
jojkaart (Member; Activity: 97, Merit: 10)
June 20, 2012, 05:49:03 PM  #83

Quote:
About rolling out new features and avoiding block chain splits, what we need is a good system to automatically and democratically add any feature. Just like the implementation schedule of p2sh, but more like my proposal: time-flexible, with an additional temporal sub-chain, and for any feature. It may be difficult and problematic to code it for only one feature, but IMHO it's worth it if it's a generic implementation-deprecation system for determining the validity of blocks.

Quote:
Maybe I'm misunderstanding the change you're talking about, but I think this is dangerous. I use Bitcoin because I trust the protocol behind it to never change. If the majority or the devs can push a change in the protocol, then I'm out. A way to compress the blockchain: fine. A fork, hard or soft... mmmmm, seems dangerous to me.

There is at least one guaranteed hard fork that is going to happen eventually. Blocks are currently limited to a maximum of 1 megabyte in size. This could start hampering further growth of Bitcoin usage sometime next year.

However, this doesn't mean Bitcoin devs can just push any change they want. Users and miners can always opt not to use whatever new version they put out.

In short, getting any change out needs massive majority support from Bitcoin users.
unclescrooge, aka Raphy (Hero Member; Activity: 868, Merit: 1000)
June 20, 2012, 06:06:14 PM  #84

Quote from: jojkaart on June 20, 2012, 05:49:03 PM:
There is at least one guaranteed hard fork that is going to happen eventually. Blocks are currently limited to a maximum of 1 megabyte in size. [...] In short, getting any change out needs massive majority support from Bitcoin users.

Do we need a fork for block size? (Sorry, I don't know a bit about this.)
jojkaart (Member; Activity: 97, Merit: 10)
June 20, 2012, 07:06:04 PM  #85

Quote from: unclescrooge on June 20, 2012, 06:06:14 PM:
Do we need a fork for block size? (Sorry, I don't know a bit about this.)

Yes, it will be a fork, because all current nodes on the network will ignore any block bigger than 1 megabyte. The best way to actually do this is to set a date (well, a block number in practice), say a year or two in the future, and release a version that keeps rejecting blocks bigger than one megabyte until that block number is reached. After that it will accept blocks bigger than one megabyte.

Then, when a block bigger than one megabyte is created, nodes running client versions released after the update will accept the block and all the older versions will reject it. In practice, there's unlikely to be a real fork unless a significant portion of users refuse to upgrade.
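A minimal Python sketch of that flag-day rule; the activation height and the post-fork limit below are made-up placeholders for illustration, not real consensus parameters:

Code:
# Illustrative sketch of a flag-day block-size rule, as described above.
# The activation height and the new limit are made-up numbers, not real
# consensus parameters.

OLD_MAX_BLOCK_SIZE = 1_000_000      # 1 MB, the current limit
NEW_MAX_BLOCK_SIZE = 10_000_000     # hypothetical post-fork limit
ACTIVATION_HEIGHT = 300_000         # hypothetical "flag day" block height

def max_block_size(height: int) -> int:
    """Return the block-size limit in effect at a given height."""
    if height < ACTIVATION_HEIGHT:
        return OLD_MAX_BLOCK_SIZE
    return NEW_MAX_BLOCK_SIZE

def block_size_ok(height: int, serialized_size: int) -> bool:
    """Upgraded nodes enforce this; old nodes always apply the 1 MB limit,
    which is why the first oversized block after activation splits them off."""
    return serialized_size <= max_block_size(height)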

- Joel
Mageant (Legendary; Activity: 1145, Merit: 1001)
June 20, 2012, 08:10:15 PM (last edit: June 20, 2012, 08:30:57 PM by Mageant)  #86

You wouldn't have to update the meta-chain with every individual new block that comes out of the regular blockchain.

The lightweight client that uses the meta-chain could still store a small portion of the regular blockchain, something like the latest 100 blocks (max. 100 MB); alternatively, this could be a configurable amount of storage space.

Only updating the meta-chain every 100 blocks/100 MB or so would IMHO reduce the load on the meta-chain and the lightweight clients using it.

This also avoids any synchronization problems with the latest blocks out of the blockchain and the possibility of small forks (orphaned blocks) in the blockchain, since the meta-chain would then only synchronize with blocks that already have a high number of confirmations.

cjgames.com
jojkaart (Member; Activity: 97, Merit: 10)
June 20, 2012, 08:33:24 PM  #87

Quote from: Mageant on June 20, 2012, 08:10:15 PM:
[...] Only updating the meta-chain every 100 blocks/100 MB or so would IMHO reduce the load on the meta-chain and the lightweight clients using it. [...]

I don't see how it would reduce the load any to only update the meta-chain every 100 blocks. It'd just concentrate the update load on certain blocks. The planned tree structure for this would allow O(M*log N) updates, where M is the number of updated transactions and N is the total number of transactions in the tree.
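A toy Python sketch of what that O(M*log N) batched update looks like: only the hashes on the paths from the M changed leaves to the root are recomputed. The padded-tree layout and leaf encoding here are made up for the demo, not the proposed meta-tree format.

Code:
# Toy illustration of the O(M log N) claim: updating M leaves of a Merkle
# tree only forces recomputation of the hashes on the paths from those
# leaves to the root. The padding rule and leaf encoding are made up for
# this demo, not any proposed consensus format.
import hashlib, math

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Return a list of levels, leaves first, root last (power-of-two padded)."""
    level = [h(x) for x in leaves]
    while len(level) & (len(level) - 1):
        level.append(level[-1])            # pad by duplicating the last hash
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def update_leaves(levels, updates):
    """Apply {leaf_index: new_data} and return how many node hashes were recomputed."""
    recomputed = 0
    dirty = set()
    for i, data in updates.items():
        levels[0][i] = h(data)
        recomputed += 1
        dirty.add(i)
    for depth in range(1, len(levels)):
        parents = {i // 2 for i in dirty}
        for p in parents:
            left, right = levels[depth - 1][2 * p], levels[depth - 1][2 * p + 1]
            levels[depth][p] = h(left + right)
            recomputed += 1
        dirty = parents
    return recomputed

N = 1 << 16
levels = build_tree([b"txout-%d" % i for i in range(N)])
M = 50
cost = update_leaves(levels, {i * 7: b"spent-%d" % i for i in range(M)})
print(cost, "<=", M * (math.ceil(math.log2(N)) + 1))   # at most M*(log2(N)+1) node updates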
Mageant (Legendary; Activity: 1145, Merit: 1001)
June 20, 2012, 08:48:37 PM  #88

Quote from: jojkaart on June 20, 2012, 08:33:24 PM:
I don't see how it would reduce the load any to only update the meta-chain every 100 blocks. It'd just concentrate the update load on certain blocks. The planned tree structure for this would allow O(M*log N) updates, where M is the number of updated transactions and N is the total number of transactions in the tree.

Because certain outputs could already have been spent again within the last 100 blocks, so you could cut those out.

Also, there is the issue of avoiding blockchain forks that get orphaned, so the lightweight client would store a small portion of the regular blockchain anyway, say the last X blocks, where X is chosen so that you can be relatively sure those blocks won't get orphaned. I don't know how important avoiding orphaned blocks really is, though; maybe it's not.

apetersson (Hero Member; Activity: 668, Merit: 501)
June 20, 2012, 09:43:48 PM  #89

Your nice graphs hosted on Dropbox are no longer working. Maybe upload them to imgur.com?
Realpra (Hero Member; Activity: 815, Merit: 1000)
June 20, 2012, 10:44:43 PM  #90

Quote from: Mageant on June 20, 2012, 08:10:15 PM:
The lightweight client that uses the meta-chain could still store a small portion of the regular blockchain, something like the latest 100 blocks (max. 100 MB); alternatively, this could be a configurable amount of storage space.

Only updating the meta-chain every 100 blocks/100 MB or so would IMHO reduce the load on the meta-chain and the lightweight clients using it.

Seems like another flaw in this design to me.

At Visa volumes I think each block would be one gigabyte. Even at half that or less, it would break the light nodes proposed here.

This solution is patchwork; a swarm client combined with the ledger system is the only way. It doesn't have to be my design, but we will benefit a lot from swarm principles at some point.

Cheap and sexy Bitcoin card/hardware wallet, buy here:
http://BlochsTech.com
casascius, Mike Caldwell (VIP, Legendary; Activity: 1386, Merit: 1136)
June 20, 2012, 11:11:56 PM  #91

I am the one who originally suggested the 100-block interval... but I don't think I said that updating the meta tree only every 100 blocks is what should be done.

Rather, the meta tree should be rebalanced every 100 blocks, and in between, nodes should be added and deleted using a methodology that avoids (procrastinates on) having to recalculate the hashes for most or all of the nodes in the tree any time there is a change. Otherwise, every incoming transaction will carry a huge CPU-time burden that's not sustainable. Rebalancing the tree is much like rebuilding a database index.

The reason for 100 blocks is that there needs to be an agreement that everybody will do it at a certain time simultaneously, so that as hashes of the tree are exchanged, they always refer to the same data set. An arbitrary number must be chosen that strikes a balance between the resource burden of rebalancing the tree (which favors rebalancing less frequently) and the burden required for someone to get themselves up to speed (which favors rebalancing more frequently).

And while the meta tree may in fact be large, no node ever needs to accumulate older copies of it, so it is not as though having a 1 GB meta tree is going to mean 1 GB for every 100 blocks. There is value in having a few old revisions of the meta tree (e.g. so that a node can opt to formulate its view of the network starting a certain number of blocks back and protect itself from orphan blocks), but there is no reason, for example, for anyone to accumulate this metadata for historical purposes, as it is completely reconstructable from the normal block chain.

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
tevirk (Newbie; Activity: 15, Merit: 0)
June 21, 2012, 06:10:25 AM  #92

Sorry to be slow, but I don't see the gain here.  If a lightweight client is going to trust that a metablock that's been merged into the chain is truthful (because it's been built into a block), then it can just as reliably trust that a transaction that's in the chain a few blocks back is valid, because it's been built into a block.  There's no need for it to keep anything.  The only real advantage here seems to be that it saves miners from having to have a hard disk, and it seems like a lot of engineering to do that.

Quite possibly I'm missing something, in which case it would probably help for someone to step back and explain the aims and benefits.
maaku (Legendary, expert; Activity: 905, Merit: 1011)
June 21, 2012, 08:00:53 AM  #93

Quote from: casascius on June 20, 2012, 11:11:56 PM:
[...] Rather, the meta tree should be rebalanced every 100 blocks, and in between, nodes should be added and deleted using a methodology that avoids (procrastinates on) having to recalculate the hashes for most or all of the nodes in the tree any time there is a change. Otherwise, every incoming transaction will carry a huge CPU-time burden that's not sustainable. Rebalancing the tree is much like rebuilding a database index.
I'm not sure I follow. Updating any tree (balanced or not) is a constant-time operation. Updating any Merkle-tree (balanced or not) is log(N), although you can save a little effort by marking updated nodes as dirty and only updating Merkle hashes at the end. Rebuilding a database is N*log(N), a different beast. Anyway that's beside the point. Updating a balanced Merkle tree shouldn't be any more complex, algorithmically, than leaving it unbalanced. Unless I'm missing something; am I?

Quote from: tevirk on June 21, 2012, 06:10:25 AM:
Sorry to be slow, but I don't see the gain here. [...] The only real advantage here seems to be that it saves miners from having to have a hard disk, and it seems like a lot of engineering to do that.

Quite possibly I'm missing something, in which case it would probably help for someone to step back and explain the aims and benefits.
Wrong problem. It saves the lightweight client from having to download, verify, and keep track of any of the block chain at all, except for those parts the user cares about (their own unspent outputs, for example).

casascius, Mike Caldwell (VIP, Legendary; Activity: 1386, Merit: 1136)
June 21, 2012, 04:01:12 PM  #94

Quote from: tevirk on June 21, 2012, 06:10:25 AM:
[...] If a lightweight client is going to trust that a metablock that's been merged into the chain is truthful (because it's been built into a block), then it can just as reliably trust that a transaction that's in the chain a few blocks back is valid, because it's been built into a block. There's no need for it to keep anything. [...]

That works if, as a lightweight node, you plan only on receiving funds that have a very small number of confirmations, which eliminates your view of the majority of bitcoins that exist.  In USD terms, this would be like limiting yourself to only being able to accept crisp dollar bills that have never been handled more than once or twice.  More likely than not, you're going to need to be able to receive funds from anybody, which will have been confirmed anywhere on the block chain between the genesis block and now.  You either need the whole block chain to know whether a given incoming transaction is valid, or at least the digested tree of all unspent txouts for the entire block chain.

casascius, Mike Caldwell (VIP, Legendary; Activity: 1386, Merit: 1136)
June 21, 2012, 04:07:48 PM  #95

Quote from: maaku on June 21, 2012, 08:00:53 AM:
I'm not sure I follow. Updating any tree (balanced or not) is a constant-time operation. Updating any Merkle-tree (balanced or not) is log(N), although you can save a little effort by marking updated nodes as dirty and only updating Merkle hashes at the end. [...] Updating a balanced Merkle tree shouldn't be any more complex, algorithmically, than leaving it unbalanced. Unless I'm missing something; am I?

My understanding is that removing leaf nodes from a sorted Merkle tree, while maintaining the constraint that the tree remain sorted and balanced, has the potential to cause the hash of every node to be recalculated. Imagine going from a tree that has 513 nodes to one that has 512 nodes: the tree will lose a whole rank, and 100% of its hashes will change. That's an extreme case, but not far from the typical case: if you remove a leaf out of the middle and don't replace it with a placeholder, all the leaf nodes to its right will shift left by one position to maintain the sort and balance constraints, and every parent of any node that has shifted will be recalculated.

The closer the removal is to the left side of the tree, the greater the proportion of the tree that must be recalc'd. A recalc of a tree in the hundreds of MB or in the GBs for every incoming Bitcoin transaction would be unsustainably and unscalably expensive. All transactions result in the spending, and therefore the deletion, of at least one leaf node, so this kind of update would be CPU-intensive for every user upon every roll of Satoshi's dice. So my idea, to keep the tree view consistent and keep updates to log(N), would be to rebalance the tree only on a predetermined interval and, at any point in between, use placeholders and allow leaf nodes to become branches (both of which, especially the latter, would make the tree no longer a proper Merkle tree) to conserve CPU resources during updates.
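A quick Python toy run of the problem being described: delete one leaf near the left edge of a sorted, packed tree and count how many node hashes change. The power-of-two-padded layout is only for the demo, not any proposed format.

Code:
# Toy demonstration of the "delete one leaf and most hashes change" problem
# for a tree that must stay sorted and perfectly packed.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_levels(leaves):
    level = [h(x) for x in leaves]
    while len(level) & (len(level) - 1):
        level.append(h(b""))               # pad to a power of two
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def all_hashes(levels):
    return {(d, i): v for d, lvl in enumerate(levels) for i, v in enumerate(lvl)}

leaves = [b"utxo-%05d" % i for i in range(1000)]       # sorted leaves
before = all_hashes(merkle_levels(leaves))

# Spend (delete) the leaf at index 3 and re-pack the sorted tree:
# everything to its right shifts left by one position.
after = all_hashes(merkle_levels(leaves[:3] + leaves[4:]))

changed = sum(1 for k, v in after.items() if before.get(k) != v)
print(f"{changed} of {len(after)} node hashes changed")  # the vast majority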


Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
jojkaart (Member; Activity: 97, Merit: 10)
June 21, 2012, 04:13:07 PM  #96

Quote from: casascius on June 21, 2012, 04:07:48 PM:
My understanding is that removing leaf nodes from a sorted Merkle tree, while maintaining the constraint that the tree remain sorted and balanced, has the potential to cause the hash of every node to be recalculated. [...]

Not quite true. For example, the red-black tree algorithm guarantees worst-case operation of O(log N). Most branches, although they will move, will keep their hashes as they were, with no need to recalculate.
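A rough Python sketch of why path-local updates leave most hashes intact. For brevity it uses a plain, unbalanced hash-pointer binary search tree rather than a real red-black tree (which would add O(1) rotations per level on top of this), so it only illustrates the path-copying half of Joel's point.

Code:
# Sketch: in a hash-pointer binary search tree, an insertion only creates new
# nodes (and hence new hashes) along the search path; every subtree off that
# path is shared unchanged. This toy tree is unbalanced for brevity.
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"|".join(parts)).digest()

class Node:
    __slots__ = ("key", "left", "right", "hash")
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.hash = h(key, left.hash if left else b"", right.hash if right else b"")

def insert(node, key):
    """Path-copying insert: returns a new root, reusing untouched subtrees."""
    if node is None:
        return Node(key)
    if key < node.key:
        return Node(node.key, insert(node.left, key), node.right)
    return Node(node.key, node.left, insert(node.right, key))

def collect_ids(node, seen):
    """Record the identities of all node objects reachable from this root."""
    if node is None or id(node) in seen:
        return
    seen.add(id(node))
    collect_ids(node.left, seen)
    collect_ids(node.right, seen)

root = None
for i in range(1000):
    root = insert(root, hashlib.sha256(str(i).encode()).digest())

before = set()
collect_ids(root, before)
new_root = insert(root, b"some brand new txout")
after = set()
collect_ids(new_root, after)
print("new node objects created:", len(after - before))  # only the search path, not all 1000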

- Joel
casascius, Mike Caldwell (VIP, Legendary; Activity: 1386, Merit: 1136)
June 21, 2012, 04:16:19 PM  #97


Quote from: jojkaart on June 21, 2012, 04:13:07 PM:
Not quite true. For example, the red-black tree algorithm guarantees worst-case operation of O(log N). Most branches, although they will move, will keep their hashes as they were, with no need to recalculate.

The red-black tree is a concept I don't yet understand.  But if choosing this type of structure brings the benefit of O(log N) updates without introducing any negatives, then I'm all for it, and of course the periodic rebalance would become unnecessary.

EnergyVampire (Full Member; Activity: 210, Merit: 100)
June 21, 2012, 04:56:21 PM (last edit: June 26, 2012, 11:30:55 PM by EnergyVampire)  #98

Subscribing

Added to Watchlist: https://bitcointalk.org/index.php?topic=90136.0

etotheipi (OP), Core Armory Developer (Legendary, expert; Activity: 1428, Merit: 1093)
June 21, 2012, 05:18:11 PM  #99


Quote from: casascius on June 21, 2012, 04:16:19 PM:
The red-black tree is a concept I don't yet understand. But if choosing this type of structure brings the benefit of O(log N) updates without introducing any negatives, then I'm all for it, and of course the periodic rebalance would become unnecessary.

This is a topic I've been debating with folks on IRC the past couple of days. It's clear that most tree data structures have most of the properties we want. Red-black trees work great, but I don't like that the specific underlying structure (and thus the root hash) depends on the specific order/history of insertions and deletions, and it assumes that every red-black implementation uses the same rebalancing algorithm. I have been given compelling reasons why this shouldn't be a problem, but I am personally not convinced yet, though I agree that it is probably an acceptable solution.

I'm voting for level-compressed trie structures (so, a variant of PATRICIA trees), which have no balance issues at all, are insert-order-independent, and offer O(1) query/insert/delete. The problem is they can have a lot of storage overhead per tree element. I haven't done the calculation to know for sure just how bad it is.

Once I get out my next version of Armory, I will be diving into this a bit more, hopefully creating a proposal with much more specifics about the tree structure and the CONOPS (concept of operations) of the meta-chain.
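A small Python toy of the insert-order independence being described here. It uses a plain (uncompressed) nibble trie with Merkle-style hashing rather than a level-compressed/PATRICIA trie, so it only demonstrates the order-independence property, not the storage-overhead trade-off.

Code:
# Toy illustration of insert-order independence: a plain nibble trie with
# Merkle-style hashing. The key/value encoding is made up for this demo.
import hashlib, random

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def insert(node, key_hex, value):
    """node is a dict: nibble -> child dict, with an optional 'value' at the leaf."""
    if not key_hex:
        node["value"] = value
        return
    insert(node.setdefault(key_hex[0], {}), key_hex[1:], value)

def root_hash(node):
    """Hash children in fixed nibble order, so the set of keys alone fixes the hash."""
    parts = []
    if "value" in node:
        parts.append(b"V" + node["value"])
    for nibble in "0123456789abcdef":
        if nibble in node:
            parts.append(nibble.encode() + root_hash(node[nibble]))
    return h(b"|".join(parts))

utxos = {h(b"txout-%d" % i).hex(): b"amount-%d" % i for i in range(500)}

trie_a, trie_b = {}, {}
for k in sorted(utxos):                      # one insertion order...
    insert(trie_a, k, utxos[k])
shuffled = list(utxos)
random.shuffle(shuffled)
for k in shuffled:                           # ...and a completely different one
    insert(trie_b, k, utxos[k])

print(root_hash(trie_a) == root_hash(trie_b))   # True: same set, same root hash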


Founder and CEO of Armory Technologies, Inc.
Armory Bitcoin Wallet: Bringing cold storage to the average user!
Only use Armory software signed by the Armory Offline Signing Key (0x98832223)

Please donate to the Armory project by clicking here!    (or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!)
casascius, Mike Caldwell (VIP, Legendary; Activity: 1386, Merit: 1136)
June 21, 2012, 05:30:18 PM  #100

Quote from: etotheipi on June 21, 2012, 05:18:11 PM:
[...] Red-black trees work great, but I don't like that the specific underlying structure (and thus the root hash) depends on the specific order/history of insertions and deletions, and it assumes that every red-black implementation uses the same rebalancing algorithm. [...]

Regardless of the tree structure chosen, why not rebuild it every 100 blocks, just for the sole purpose of having a periodic way of deterministically regenerating the tree, and to avoid mandating a continuous dependency on all prior versions of the meta tree in order to be certain that one has the "correct" permutation of the meta tree?
