jimbobway
Legendary
Offline
Activity: 1304
Merit: 1015
|
|
July 27, 2012, 09:48:06 PM |
|
Before I read this I just want to quickly post that I personally, no matter whether justifiably or unjustifiably, I personally feel like this is the most pressing issue when it comes to Bitcoin's successful future and I really hope the core team has an order of priorities planed accordingly.
I too believe this is a critical issue for Bitcoin, as a whole. I had floated the idea that handling blockchain size was critical, in the past, but other issues seemed more pressing for the devs at the time -- I didn't have a solid idea to promote, and the blockchain size wasn't so out of hand yet. One nice benefit of this solution is that because it's an alt-chain, technically no core devs have to be on-board. It can be done completely indepedently and operate completely non-disruptively, even with only the support of other devs who believe in it. I'd certainly like to get core devs interested in it, as they are very smart people who probably have a lot of good ideas to add. But one of the biggest upsides here is that it can be done completely indepdently. Read some more of your proposal today and I now have a better idea of how it works and what you are accomplishing. Good proposal BTW. I'm still very concerned that this development will make it so the full blockchain database will not be as accessible for download. Its important for Bitcoin that anyone can fully audit the blockchain and only need to trust in math and cryptography and not the miners (even though it would be an extremely low probability that enough miners would conspire to get coins in the compressed blockchain). I completely agree that there are many applications and situations that a compressed blockchain would be extremely useful. Please update us on your perspective of the balance between systems that should have the full block chain and ones that should use the compressed version. The way I read the OP proposal and probably because of my fears is that this compressed form is to replace the full block chain in all instances. The transaction cost just needs to be increased so it serves as an incentive for miners to support the main block chain...I think.
|
|
|
|
ripper234
Legendary
Offline
Activity: 1358
Merit: 1003
Ron Gross
|
|
July 28, 2012, 12:08:52 PM |
|
Finally had the time to parse and understand the first message in this thread.
+1 for the initiative, it's a good addition to Bitcoin.
Does this information appear in the wiki somewhere? Does anyone care to TL;DR the rest of the thread for me?
Is there a bounty jar for this? (We can use Booster.io to open one)
|
|
|
|
etotheipi (OP)
Legendary
Offline
Activity: 1428
Merit: 1093
Core Armory Developer
|
|
July 28, 2012, 02:28:24 PM |
|
FYI, I have been too swamped with work-work and keeping Armory from breaking under the increased blockchain load to spend too much time on this specific proposal. I have not lost interest by any means, I just need to catch my breath after some big deadlines. Then I'll be taking some time off to work explicitly on some Bitcoin stuff... including this proposal.
I'm going to spend some time in the near future looking at space efficiency of a couple variants of the trie data-structure. I'm not sure exactly how this theoretical datastructure can be merged with a disk-based DB engine (I imagine that what I have in mind is not used by an existing, acceptable DB engine) but maybe there's a way to make a hybrid. This is a problem that still needs to be resolved before we can move forward with an implementation: Once we agree on a datastructure, how do we use it but avoid re-inventing the wheel in terms of robust, scalable disk-based database engines?
The more I've been thinking about it, the more I have become convinced that a trie-like structure is necessary. Note only are query and insert times are constant, the determinism of tree structure for a given set of tree nodes means that queries and inserts can be parallelized. For instance, the tree could be implemented with the first layer (the first 256 nodes of a 256-way trie) split into a different files/DBs/processes/servers for each branch. Then every new piece can be distributed and queued for its particular destination. It could even be distributed amongst different, independent servers. This seems to be advantageous.
For reference, I learned of this particular tree in my data structures class in college, but I find no reference that anyone else has ever heard of it. In my class it was called a "De La Brandia Tree/Trie". A Patricia tree is a level-compressed trie. A "de la brandia tree" is a Patricia tree that uses a linked-list of pointers, instead of a constant-size array. i.e. -- in a 256-way trie or patricia tree, each branch node contains 256 pointers (8 bytes each), which could point to other nodes. However, the sparseness of lower-level nodes means that the nodes will frequently have 255 null pointers, with only one relevant pointer. The de-la-brandia tree will represent all child pointers with an ordered linked list. It has some extra overhead for nodes that have mostly-full child lists, but it I think such near-full nodes will be tiny compared to the amount of space saved on sparse nodes.
When I get some more time, I'll make some pictures, and update the original post with the updated proposal.
|
|
|
|
Serith
|
|
July 28, 2012, 03:39:46 PM Last edit: July 28, 2012, 04:06:57 PM by Serith |
|
For reference, I learned of this particular tree in my data structures class in college, but I find no reference that anyone else has ever heard of it. In my class it was called a "De La Brandia Tree/Trie".
Did you notice that over the last 2-3 years google search gradually became really smart, it feels like there is a full scale AI behind it. I found the data structure that you were looking for, it's called "De la Briandais" tree.
|
|
|
|
jimbobway
Legendary
Offline
Activity: 1304
Merit: 1015
|
|
July 28, 2012, 04:01:59 PM |
|
|
|
|
|
etotheipi (OP)
Legendary
Offline
Activity: 1428
Merit: 1093
Core Armory Developer
|
|
July 28, 2012, 07:01:02 PM |
|
For reference, I learned of this particular tree in my data structures class in college, but I find no reference that anyone else has ever heard of it. In my class it was called a "De La Brandia Tree/Trie".
Did you notice that over the last 2-3 years google search gradually became really smart, it feels like there is a full scale AI behind it. I found the data structure that you were looking for, it's called "De la Briandais" tree. Oh, if I had stopped typing so fast into google, I probably would've notice the autocompletion answer. Interesting that I always thought it "brandia" instead of "brandais". That's what I get for never going to class... (though I did stay up all night debugging the insert function for a Patricia tree). Speaking of that, when I do a search for the correct name, I get a lot of links to the exact class I took when I attended UIUC: http://www.cs.uiuc.edu/class/fa05/cs225/cs225/_notes/_section/cs225ta4/Documents/bsttries.pdfIt's not the most exhaustive introduction to the trie strcutures, but if you are already familiar with the concepts, you can get the gist of it. And in fact, I was really proposing the Patricia/Brandais hybrid tree. A pure "Brandais" tree uses linked lists but is not level-compressed. The linked list may also make it easier to combine children to produce the "hash" of a particular node: You only need to concatenate the non-null children hashes, which will frequently be very few elements (and the "skip string"). And in those cases, you just hash consecutively through the linked list. And the dense nodes near the top can be cached for when they need to be recalculated. This will also reduce the amount of data that needs to be transmitted to communicate a branch of the tree to a node (though, it's probably still more than I originally estimated). Unfortunately, my understanding of the correct path forward once these structures need to move from RAM to disk is beyond me.
|
|
|
|
allten
|
|
July 28, 2012, 08:03:06 PM |
|
Before I read this I just want to quickly post that I personally, no matter whether justifiably or unjustifiably, I personally feel like this is the most pressing issue when it comes to Bitcoin's successful future and I really hope the core team has an order of priorities planed accordingly.
I too believe this is a critical issue for Bitcoin, as a whole. I had floated the idea that handling blockchain size was critical, in the past, but other issues seemed more pressing for the devs at the time -- I didn't have a solid idea to promote, and the blockchain size wasn't so out of hand yet. One nice benefit of this solution is that because it's an alt-chain, technically no core devs have to be on-board. It can be done completely indepedently and operate completely non-disruptively, even with only the support of other devs who believe in it. I'd certainly like to get core devs interested in it, as they are very smart people who probably have a lot of good ideas to add. But one of the biggest upsides here is that it can be done completely indepdently. Read some more of your proposal today and I now have a better idea of how it works and what you are accomplishing. Good proposal BTW. I'm still very concerned that this development will make it so the full blockchain database will not be as accessible for download. Its important for Bitcoin that anyone can fully audit the blockchain and only need to trust in math and cryptography and not the miners (even though it would be an extremely low probability that enough miners would conspire to get coins in the compressed blockchain). I completely agree that there are many applications and situations that a compressed blockchain would be extremely useful. Please update us on your perspective of the balance between systems that should have the full block chain and ones that should use the compressed version. The way I read the OP proposal and probably because of my fears is that this compressed form is to replace the full block chain in all instances. The transaction cost just needs to be increased so it serves as an incentive for miners to support the main block chain...I think. yes, that is the correct answer. I'm just concerned that miners will be convinced that they are "supporting" the block chain with only the compressed/pruned version on their disk drive. That is why I'm asking etotheipi to clarify how this will be presented to the miner community. That is, its a tool for the miners and everyone else, but the miners should still support the full blockchain.
|
|
|
|
casascius
Mike Caldwell
VIP
Legendary
Offline
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
|
|
July 28, 2012, 09:54:36 PM |
|
yes, that is the correct answer. I'm just concerned that miners will be convinced that they are "supporting" the block chain with only the compressed/pruned version on their disk drive. That is why I'm asking etotheipi to clarify how this will be presented to the miner community. That is, its a tool for the miners and everyone else, but the miners should still support the full blockchain.
The way I understand it, the miners will be supporting the block chain, regardless of whether they have part of it or all of it. The only thing a miner is supporting at any given time is the hash of the latest block and the new transactions he is adding into his block. Nothing more. All of the history is covered simply by reference to the prior block hash. Whether or not he has a local copy of the first billion rolls of Satoshi Dice on his hard drive is irrelevant toward his ability to mine. I don't worry for one bit that the original unabridged block chain will ever go extinct. Enough people care about it, the cost to maintain it is low, all it takes is one historian to seed it for the rest of everyone else and everyone who wants it will have it. The way I see it, the only real critical reason one should demand full blocks all the way to point-in-time X is to maximize the probability that he is not being fed an attack fork without enough information to detect it. A reasonable hunch of what a good value of X might be for the average client might be a week, and for a miner, a few months. It could be argued someone investing in a serious mining operation (like a pool) arguably "needs" more assurance, but someone running a serious mining operation also likely has the skills to determine for himself whether he has the correct block chain and that assurance is arguably just as good as having more block history.
|
Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable. I never believe them. If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins. I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion. Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice. Don't keep coins online. Use paper or hardware wallets instead.
|
|
|
ripper234
Legendary
Offline
Activity: 1358
Merit: 1003
Ron Gross
|
|
July 29, 2012, 05:22:30 AM |
|
FYI, I opened a bounty jar for this project at booster.io. Please add a link to the OP. Ultimate Blockchain Pruning is a proposed alt-chain data structure that will enhance core Bitcoin scalability and allow for trust-free light clients. It does not compete with Bitcoin, but rather complements and strengthens it.
This bounty will be awarded to the first person or group who completes all these tasks: 1. Implement UBP 2. Get at least 15% of the hash power to merge-mine it 3. Patch at least one major Bitcoin client to support UBP mode 4. Benchmark the result and show an improvement of at least 10% in downloading the blockchain from scratch
This is quite an undertaking ... so you better donate if you want to encourage this idea.
|
|
|
|
maaku
Legendary
Offline
Activity: 905
Merit: 1012
|
|
July 29, 2012, 06:30:56 AM Last edit: July 29, 2012, 07:47:35 PM by maaku |
|
|
I'm an independent developer working on bitcoin-core, making my living off community donations. If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP
|
|
|
btharper
|
|
July 29, 2012, 10:40:03 PM |
|
Since the alt-chain can only update as often as the main chain, would using a different difficulty mechanism make sense? Of course adherer or not merged mining is used matters.
For example: Instead of saying that the first hash below difficulty wins, is there any way to say that the absolute lowest hash wins? The biggest issue I can see without an obvious answer is how to keep the chain from backpedaling if someone releases a better block N-1 while everyone else is working on block N , but it might just be a variant on the 51% attack.
TL;DR - would a different difficulty mechanism be warranted?
|
|
|
|
maaku
Legendary
Offline
Activity: 905
Merit: 1012
|
|
July 29, 2012, 11:19:45 PM |
|
No, it'll work as-is. The alt-chain mints blocks independently of the main-chain (excepting merged-mined blocks where they are both minted at the same time).
|
I'm an independent developer working on bitcoin-core, making my living off community donations. If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP
|
|
|
btharper
|
|
July 30, 2012, 01:04:28 AM Last edit: July 30, 2012, 03:34:33 AM by btharper |
|
No, it'll work as-is. The alt-chain mints blocks independently of the main-chain (excepting merged-mined blocks where they are both minted at the same time).
Everything would work as is, but this chain wouldn't work quite the same I don't think. Since this alt-chain essentially needs to have the same number of blocks as the primary bitcoin chain. If the alt-chain catches up there's no incentive, or value, in mining anything else on it, which otherwise "wastes" the workers that are participating on the chain. My main point is that linking the block count to the main chain one-to-one changes things somewhat. Or is this not as big of an issue as I'm thinking it is? Edit: Fixed typos made while using my phone.
|
|
|
|
gmaxwell
Moderator
Legendary
Offline
Activity: 4284
Merit: 8816
|
|
July 30, 2012, 01:54:13 AM |
|
The alt-chain mints blocks independently of the main-chain
An alt-"chain" is probably the wrong way to think of this. It's committed data. There doesn't need to be a chain. Each bitcoin block could have 0 or 1 tree commitments of a given type.
|
|
|
|
btharper
|
|
July 30, 2012, 03:44:31 AM |
|
The alt-chain mints blocks independently of the main-chain
An alt-"chain" is probably the wrong way to think of this. It's committed data. There doesn't need to be a chain. Each bitcoin block could have 0 or 1 tree commitments of a given type. The chain has established itself as a good proof-of-work system, which is the largest reason to stick with it that I can see right off hand. However it may run more like P2Pool in that storing old blocks (other than headers) may be slightly useless, or at least contrary to the goal of eliminating extra data. Setting up the alt-chain with either "none" or "some" updates may be the way to go to preserve simplicity however compared to normal chains. For one of the datastructure guys, would there be an advantage in marking updates as "a few" vs "many" in terms of packing the data? Maybe something else worth looking into for someone who knows how the current setup would work. As a much more random aside, any idea what the alt-chain coins could be used for? and if transfer would be worthwhile. If transfer doesn't matter just give 1 coin per block and let people sign with their keys how many they've accumulated helping to secure the chain for lite nodes.
|
|
|
|
DiThi
Full Member
Offline
Activity: 156
Merit: 100
Firstbits: 1dithi
|
|
July 30, 2012, 12:03:57 PM |
|
The "alt-chain" term confuses people. It's more like a "sub-chain" of the main blockchain. It doesn't generate any coins at all, and it's a temporal fix until the majority of miners support it.
|
1DiThiTXZpNmmoGF2dTfSku3EWGsWHCjwt
|
|
|
casascius
Mike Caldwell
VIP
Legendary
Offline
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
|
|
July 30, 2012, 04:07:18 PM |
|
The alt-chain mints blocks independently of the main-chain
An alt-"chain" is probably the wrong way to think of this. It's committed data. There doesn't need to be a chain. Each bitcoin block could have 0 or 1 tree commitments of a given type. +100 The term that makes the most sense for me is a "meta-tree". It would never be "mined" - it would simply be committed to in normal blocks - optionally - at least until it is proven to work in practice and a decision is made to make it a mandatory part of the protocol.
|
Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable. I never believe them. If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins. I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion. Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice. Don't keep coins online. Use paper or hardware wallets instead.
|
|
|
gmaxwell
Moderator
Legendary
Offline
Activity: 4284
Merit: 8816
|
|
July 31, 2012, 03:44:42 PM |
|
One recent revelation I've had as a result of Pieter's ultraprune implementation is that any tree commitment scheme should also commit to undo logs so that nodes don't necessarily have to all individually store all the data required to reorg forever.
|
|
|
|
etotheipi (OP)
Legendary
Offline
Activity: 1428
Merit: 1093
Core Armory Developer
|
|
July 31, 2012, 06:10:57 PM |
|
One recent revelation I've had as a result of Pieter's ultraprune implementation is that any tree commitment scheme should also commit to undo logs so that nodes don't necessarily have to all individually store all the data required to reorg forever.
Perhaps that's another reason to use insert-order-invariant tree structure. If you have to reverse some transactions, you don't have to worry how you undo them. Undoing a block/tx is just as easy as adding it in the first place.
|
|
|
|
gmaxwell
Moderator
Legendary
Offline
Activity: 4284
Merit: 8816
|
|
July 31, 2012, 08:28:12 PM |
|
Perhaps that's another reason to use insert-order-invariant tree structure. If you have to reverse some transactions, you don't have to worry how you undo them. Undoing a block/tx is just as easy as adding it in the first place.
You still have to have the complete data that you would remove. E.g. when I spend txn X, I don't specify all of X's data to spend it (thats burried elsewhere in the chain) only the hash. Order invariant wouldn't let me recover that. I need some kind of undo data, even if its just the location of the original txn so that I could fetch it. (though more efficient if you can serve me up a whole undo block)
|
|
|
|
|