TheBitcoinChemist
Member
Offline
Activity: 70
Merit: 10
|
|
August 07, 2012, 10:38:14 PM |
|
Consider this, you get a block, block Z. You do not have block (Z-1), so you check the skiphash and ask for block Y. You also don't have block (Y-1), so you check the skiphash again and ask for block X. Now block X is one that you do have, so you stop there. How much should you trust blocks Y and Z at this point?
...
Unless I'm wrong, of course. It could be that the skiphash chain rules add up to a much stronger context than is immediately apparent to me.
Branch and Bound! You've now established that block X is at least a common point in both chains (assuming you already have a chain, it's the genesis block if you're starting from scratch). Block Y is the largest hash in between X and Z. You can now repeat the process on either sub interval. Each time you go down a level, there are more blocks to check, and you get an increasingly precise estimate. If you keep going, eventually you will process every block. I imagine that there would be a lite client that says "Estimated work: 1 bajillion megahashes. Likelihood: 76%". The likelihood number increases, very rapidly at first, and slowing down at 99%, 99.99% and so on. If it hits 100, then you have validated the entire chain like a full node. This also allows you to quickly find the true fork point between two chains, even ones that have had difficulty changes along the way. What do you do with the subinterval exactly? You don't know any blocks between X and Y or Y and Z. You don't know how many there are, Actually, you would know how many there are, because the current block number is part of the header. I'm starting to see value in this idea. EDIT: but what happens if the difficulty drops or levels off ofr a long peroid of time, what prevents a uselessly large interval?
|
|
|
|
socrates1024 (OP)
Full Member
Offline
Activity: 126
Merit: 110
Andrew Miller
|
|
August 07, 2012, 10:53:29 PM Last edit: August 07, 2012, 11:05:46 PM by socrates1024 |
|
I imagine that in each block, you store a concise report of the sum of the difficulties so far. This way you are never trying to make an estimate from scratch, instead you are trying to evaluate whether the self-reported estimate is plausible. Suppose I tell you that my chain has bajillion hashes with sum-difficulty of 5. You say "Sweet! I bet you have some cool 9's and 10's in there. Show me those." If I say "Nope actually my best is just an 8", you are immediately suspicious. "Well show me all your 8s and 7s then." If I actually did a fraction of a bajillion hashes, then the jig will be up quite quickly. EDIT: but what happens if the difficulty drops or levels off ofr a long peroid of time, what prevents a uselessly large interval?
Good question! There's a difference between difficulty and hash-value. The difficulty is just the minimum hash-value needed for a block. The skip list indexes the work according to hash-value, which corresponds to the total number hashes drawn over time, regardless of what the minimum thresholds were. So if there's a period of time where the difficulty is only 1 (and perhaps there are tons of blocks like this), but most of the time the difficulty was 5, then you will be estimating mostly from the 7's 6's and 5's. You'd only get to the 1s if the total is plausible all the way down.
|
|
|
|
socrates1024 (OP)
Full Member
Offline
Activity: 126
Merit: 110
Andrew Miller
|
|
August 07, 2012, 11:01:20 PM Last edit: August 07, 2012, 11:14:25 PM by socrates1024 |
|
What do you do with the subinterval exactly?
You don't know any blocks between X and Y or Y and Z. You don't know how many there are,
I'm sorry, I wasn't very clear on this part. Remember that each skip list node allows you to travel either "backwards" or "up". If you traveled "up" several times from Z to get to to Y, then you can start at any of those nodes and travel "back" instead. This lets you subdivide/partition the entire history in a way that's roughly half-as-dense at each level. Since there are two links in each block, you can traverse to any point quickly. A full chain iteration would consist of "back back back back back back ... back back [genesis]". An efficient way to check the work is to go "up up up up up up up [genesis]". then do "up up up up back up up up up up [genesis]". Mostly ups. At the beginning, you'll spend most of your time among the peaks, where the air is clearer and the signal is stronger.
|
|
|
|
ben-abuya
|
|
August 08, 2012, 12:01:51 AM |
|
Hmm, how about this:
Let's hypothetically assume current difficulty for all blocks for the sake of this illustration. Your client chooses a random number (k) from 0 to current block height (n). That number is a binary number with log(n) digits (zero-padded). A zero means "look low", a 1 means "look high". You send me that number and I now have to give you all the blocks I get from traversing the highway according to those instructions. If I wanted to feed you fake blocks, I'd have to either dream up all log(n) blocks on the path for you on the fly, or have all possible paths ready to go, which is basically the same as hashing the entire blockchain from scratch.
In order to hash the blocks on the path in a matter of seconds, I'd need 60 * log(n) more hashing power than the entire network.
The problem with this is that I can try to say that the difficulty was historically ridiculously low. But since I'm feeding you all the highest value blocks, if you're not happy with the overall difficulty you just won't deal with me, because I'm probably a crook. Since you can be off by at least two orders of magnitude and still feel safe dealing with me, there probably won't be a lot of false negatives.
|
|
|
|
socrates1024 (OP)
Full Member
Offline
Activity: 126
Merit: 110
Andrew Miller
|
|
August 08, 2012, 12:21:28 AM |
|
A zero means "look low", a 1 means "look high". You send me that number and I now have to give you all the blocks I get from traversing the highway according to those instructions. If I wanted to feed you fake blocks, I'd have to either dream up all log(n) blocks on the path for you on the fly, or have all possible paths ready to go, which is basically the same as hashing the entire blockchain from scratch.
Yup, random traversals of this structure are very interesting. I suspect that at some later date, we will be able to make a non-gpu proof-of-work scheme based on traversals of the previous work-graph. It's basically a way of randomly sampling blocks from history, weighted according to difficulty.
|
|
|
|
kjj
Legendary
Offline
Activity: 1302
Merit: 1026
|
|
August 08, 2012, 01:39:47 AM |
|
Blocks don't currently have a concept of height. If you add that too, you have more information, but not that much more.
And backing up doesn't give a new walk, ever. Every block from X+1 to Y has a skiphash value of X, every block from Y+1 to Z has a skiphash value of Y. If you back up one block to Z-1 you don't get a new walk, you get the exact same one again.
|
17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8 I routinely ignore posters with paid advertising in their sigs. You should too.
|
|
|
ben-abuya
|
|
August 08, 2012, 02:06:22 AM |
|
Blocks don't currently have a concept of height. If you add that too, you have more information, but not that much more.
And backing up doesn't give a new walk, ever. Every block from X+1 to Y has a skiphash value of X, every block from Y+1 to Z has a skiphash value of Y. If you back up one block to Z-1 you don't get a new walk, you get the exact same one again.
kjj, you seem to be confusing two concepts here, the skip hash (most recent block with higher value than the previous) and the highest value block in the chain. In your example, X, Y, Z are skip hashes, but for the traversal stuff, you'd be working with highest value blocks. If I find the highest value block in the chain, clearly there are two intervals: blocks before the hvb in the chain and blocks after the hvb in the chain. If we look at each of these intervals, excluding the hvb, we will find a different hvb in each interval. We can choose to go either up or down. That's the second block in the path.
|
|
|
|
kjj
Legendary
Offline
Activity: 1302
Merit: 1026
|
|
August 08, 2012, 01:22:47 PM |
|
Ok, so I read this thread again and figured out the part that I had been missing. When creating a new block, the skiphash points to the most recent block with a value higher than the previous block, not the block with the previous highest value ever. I'm working through the implications of this, but at first glance, it does not appear to produce a chain of links that leads to a high value block. Suppose the back-link from the current head looks like 000xxx (value 3). We also include an 'up'-link to the parent of the most recent block that looks like 0000xxxx (value 4). If you skip to that parent block, then it will have an 'up'-link to the parent of a block that looks like 00000xxxxxx (value 5). And so on, until you get to the parent of the largest block for which there's no up-link.
Suppose you start at that largest block, which looks like 00000xxxxx (value 5). It contains an up-link, either to a previous block with value 00000xxx (value 5), or to one with lower value. You can skip your way to the highest-value hash in every sub-interval of history.
The problem is that when you follow that first "up" link, the block you find there does not have a link to a value higher than itself, it has a link to a value higher than its predecessor. If you want to find a block with a value higher than the block you are on, you have to go forward and then "up" from there, and forward links aren't possible.
|
17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8 I routinely ignore posters with paid advertising in their sigs. You should too.
|
|
|
TheBitcoinChemist
Member
Offline
Activity: 70
Merit: 10
|
|
August 08, 2012, 03:33:26 PM |
|
Ok, so I read this thread again and figured out the part that I had been missing. When creating a new block, the skiphash points to the most recent block with a value higher than the previous block, not the block with the previous highest value ever. I'm working through the implications of this, but at first glance, it does not appear to produce a chain of links that leads to a high value block. Suppose the back-link from the current head looks like 000xxx (value 3). We also include an 'up'-link to the parent of the most recent block that looks like 0000xxxx (value 4). If you skip to that parent block, then it will have an 'up'-link to the parent of a block that looks like 00000xxxxxx (value 5). And so on, until you get to the parent of the largest block for which there's no up-link.
Suppose you start at that largest block, which looks like 00000xxxxx (value 5). It contains an up-link, either to a previous block with value 00000xxx (value 5), or to one with lower value. You can skip your way to the highest-value hash in every sub-interval of history.
The problem is that when you follow that first "up" link, the block you find there does not have a link to a value higher than itself, it has a link to a value higher than its predecessor. If you want to find a block with a value higher than the block you are on, you have to go forward and then "up" from there, and forward links aren't possible. Of course they are, just +1 on your block count, then "up". Or just change the higher last hash to one that is higher than both the last hash & the current one. The idea of choosing to be higher than the last hash is to avoid duplicating the existing liner structure of the blockchain.
|
|
|
|
kjj
Legendary
Offline
Activity: 1302
Merit: 1026
|
|
August 08, 2012, 04:02:35 PM |
|
Ok, so I read this thread again and figured out the part that I had been missing. When creating a new block, the skiphash points to the most recent block with a value higher than the previous block, not the block with the previous highest value ever. I'm working through the implications of this, but at first glance, it does not appear to produce a chain of links that leads to a high value block. Suppose the back-link from the current head looks like 000xxx (value 3). We also include an 'up'-link to the parent of the most recent block that looks like 0000xxxx (value 4). If you skip to that parent block, then it will have an 'up'-link to the parent of a block that looks like 00000xxxxxx (value 5). And so on, until you get to the parent of the largest block for which there's no up-link.
Suppose you start at that largest block, which looks like 00000xxxxx (value 5). It contains an up-link, either to a previous block with value 00000xxx (value 5), or to one with lower value. You can skip your way to the highest-value hash in every sub-interval of history.
The problem is that when you follow that first "up" link, the block you find there does not have a link to a value higher than itself, it has a link to a value higher than its predecessor. If you want to find a block with a value higher than the block you are on, you have to go forward and then "up" from there, and forward links aren't possible. Of course they are, just +1 on your block count, then "up". Or just change the higher last hash to one that is higher than both the last hash & the current one. The idea of choosing to be higher than the last hash is to avoid duplicating the existing liner structure of the blockchain. You don't know the value of the current block's hash until after you hash it, and once you do that, you can't go back in and figure out which block you should have linked to, because that would change the hash and value. Adding a block index to the block gives you the ability to move forward, but it also gives you full random access. Why bother with a second link when you can instantly seek by index? With random access, the only job left for hashes is to preserve ordering, that is to prove that Z came after Y came after X. Someone creating a fake skipchain only has to make one trivial fake block per high value fake block, and make them in the right order, which is still billions and billions of times easier than creating a whole fake chain, at least over intervals long enough for this system to make any sense at all.
|
17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8 I routinely ignore posters with paid advertising in their sigs. You should too.
|
|
|
gmaxwell
Moderator
Legendary
Offline
Activity: 4242
Merit: 8684
|
|
August 08, 2012, 04:18:39 PM |
|
How does a lite client know which block chain is the largest? By iterating through the whole chain from front to back? Ick!
I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point. There are interesting challenges in Bitcoin, reading headers isn't one of them. Also, from this discussion it sounds like some people are confused wrt difficulty. The difficulty of a block is its target, not the hash value.
|
|
|
|
ben-abuya
|
|
August 08, 2012, 05:13:50 PM |
|
How does a lite client know which block chain is the largest? By iterating through the whole chain from front to back? Ick!
I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point. There are interesting challenges in Bitcoin, reading headers isn't one of them. Also, from this discussion it sounds like some people are confused wrt difficulty. The difficulty of a block is its target, not the hash value. If you stopped reading, how do you know? But seriously, I believe the idea here isn't to address practical problems with the current Bitcoin parameters, but to explore concepts that will be needed if Bitcoin morphs in interesting ways. For instance, if average block solving time is reduced 60-fold to ten seconds, can your smartphone still keep up with all those headers?
|
|
|
|
socrates1024 (OP)
Full Member
Offline
Activity: 126
Merit: 110
Andrew Miller
|
|
August 08, 2012, 05:43:39 PM |
|
I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point.
This proposal is primarily about ddos-resistance. A protocol's ddos-resistance is related to the largest plausible validation effort, not the expected (or actual) effort. So if I give you a 20GB set of headers and tell you it's the largest chain... you'd apparently reject it without even looking, since larger than 400MB is implausible. There are interesting challenges in Bitcoin, reading headers isn't one of them.
My mission is to eliminate every last hard-coded global parameter in Bitcoin, so that it grows into an indisputably scalable and universal protocol. On the chopping block are "10 minutes," and "difficulty adjustment every 2016 blocks." Two of the things I'm going to propose next (absorbing orphaned forks, and self-adjusted difficulty without timestamps) are going to potentially to create a much larger number of headers, so I wanted to explain my solution to that first, especially starting with efficient validation for lite-clients. If it's not interesting, then no need to read ahead - but save your place.
|
|
|
|
TheBitcoinChemist
Member
Offline
Activity: 70
Merit: 10
|
|
August 08, 2012, 05:45:07 PM |
|
How does a lite client know which block chain is the largest? By iterating through the whole chain from front to back? Ick!
I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point. There are interesting challenges in Bitcoin, reading headers isn't one of them. Also, from this discussion it sounds like some people are confused wrt difficulty. The difficulty of a block is its target, not the hash value. If you stopped reading, how do you know? But seriously, I believe the idea here isn't to address practical problems with the current Bitcoin parameters, but to explore concepts that will be needed if Bitcoin morphs in interesting ways. For instance, if average block solving time is reduced 60-fold to ten seconds, can your smartphone still keep up with all those headers? Average block solving time will never be reduced to anywhere near 10 seconds.
|
|
|
|
TheBitcoinChemist
Member
Offline
Activity: 70
Merit: 10
|
|
August 08, 2012, 05:49:46 PM |
|
I stopped reading at this point. Ten years of headers is about 40 megabytes. One hundred years is 400 megabytes. A boring computer _today_ could validate 100 years of headers in a second or two. And even this could be skipped up to a fixed reference point.
This proposal is primarily about ddos-resistance. A protocol's ddos-resistance is related to the largest plausible validation effort, not the expected (or actual) effort. So if I give you a 20GB set of headers and tell you it's the largest chain... you'd apparently reject it without even looking, since larger than 400MB is implausible. There are interesting challenges in Bitcoin, reading headers isn't one of them.
My mission is to eliminate every last hard-coded global parameter in Bitcoin, so that it grows into an indisputably scalable and universal protocol. On the chopping block are "10 minutes," and "difficulty adjustment every 2016 blocks." I assure you, that neither of those parameters are up for debate. Two of the things I'm going to propose next (absorbing orphaned forks, and self-adjusted difficulty without timestamps) are going to potentially to create a much larger number of headers, so I wanted to explain my solution to that first, especially starting with efficient validation for lite-clients. If it's not interesting, then no need to read ahead - but save your place.
Orphanced forks don't need to be 'absorbed' into the main chain, and self-adjusting difficulty without timestamps is too complicated to impliment across a massive, distributed p2p network with connections that vary by orders of magnitude. This will not fly, and your current proposal isn't likely to be incorporated into the main chain, either; without some overriding gain. Incremental gains need not apply here, it's about two years too late for such things.
|
|
|
|
TheBitcoinChemist
Member
Offline
Activity: 70
Merit: 10
|
|
August 08, 2012, 05:56:56 PM |
|
Adding a block index to the block gives you the ability to move forward, but it also gives you full random access. Why bother with a second link when you can instantly seek by index?
Are you talking about an index within a block? No, too big times too many full clients. A better idea would be what I mentioned earlier, a web accessible block index on a regular webserver. It's not like any particular client needs to trust the index, as it just references blocks that can be downloaded and verified over the p2p network. Also, the external index could do the 'last harder hash' index after the block is produced, and this wouldn't require any changes in the codebase or protocol. For that matter, every full client already indexes everything, and that is how blockexplorer.com works. The trick is just setting up a machine readable standard for light clients to query the indexing server, and then hunt those blocks down over the p2p network.
|
|
|
|
kjj
Legendary
Offline
Activity: 1302
Merit: 1026
|
|
August 08, 2012, 06:32:57 PM |
|
My mission is to eliminate every last hard-coded global parameter in Bitcoin, so that it grows into an indisputably scalable and universal protocol. On the chopping block are "10 minutes," and "difficulty adjustment every 2016 blocks."
In that case, good luck, but I don't think you'll get much traction with it. A few people are greatly concerned with the magic numbers, but most of us don't care. I've personally read dozens of threads on that subject, and so far no one has ever come up with a good argument why removing them would be beneficial. This isn't physics where we kept getting 1/137 but didn't know why. We already know why we have 600 and 2016, and we are ok with it.
|
17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8 I routinely ignore posters with paid advertising in their sigs. You should too.
|
|
|
ben-abuya
|
|
August 08, 2012, 06:43:40 PM |
|
Average block solving time will never be reduced to anywhere near 10 seconds.
Care to elaborate? It'll be easier to do if we know why it's impossible.
|
|
|
|
ben-abuya
|
|
August 08, 2012, 06:46:26 PM |
|
My mission is to eliminate every last hard-coded global parameter in Bitcoin, so that it grows into an indisputably scalable and universal protocol. On the chopping block are "10 minutes," and "difficulty adjustment every 2016 blocks."
Two of the things I'm going to propose next (absorbing orphaned forks, and self-adjusted difficulty without timestamps) are going to potentially to create a much larger number of headers, so I wanted to explain my solution to that first, especially starting with efficient validation for lite-clients. If it's not interesting, then no need to read ahead - but save your place.
I'm looking forward to reading this.
|
|
|
|
TheBitcoinChemist
Member
Offline
Activity: 70
Merit: 10
|
|
August 08, 2012, 07:49:14 PM |
|
Average block solving time will never be reduced to anywhere near 10 seconds.
Care to elaborate? It'll be easier to do if we know why it's impossible. It's not impossible, just astronomically improbable. Those are arbitrary parameters decided by Satoshi, and in all the time since there has not been any proposal for altering them that could offer more than a marginal improvement, if that. For example, the 10 minute block interval could have been 6 minutes, 12 minutes, 15 minutes or any other arbitrary number of seconds; but it could not have been a small number of seconds because the interval was chosen as a balance between transaction confirmation time and the expected future network propogation latency. 10 seconds, or even 120, is too fast and would result in a great deal many more network splits & orphaned blocks; potentially causing a congestion feedback. Also, the block interval time cannot be variable, due to the interlocking structure of the blockchain's design & purpose. If block intervals were variable; 1) the block reward would either have to be variable also, which would make the task of verifying that the block reward was a valid amount quite difficult for an individual node to perform or the distribution rate of newly issued bitcoins into circulation would no longer be predictable, and that would be bad; and 2) the relatively common blockchain splits would be significantly more difficult to resolve in any automatic fashion, which is the only way it could be done.
|
|
|
|
|