casascius (OP)
Mike Caldwell
VIP
Legendary
Offline
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
|
|
September 15, 2011, 02:29:12 PM Last edit: September 15, 2011, 02:47:09 PM by casascius |
|
I just read another post about Solidcoin getting a new genesis block.
I can't help but think that the same thing will end up needing to happen to Bitcoin so that the overhead of downloading an ever-growing block chain remains within the realm of possibility.
Block chain is already over 600 MB and growing exponentially. I am surprised there is no talk about the need to have some sort of "checkpointing" where the entire block chain is compressed into a digest so that only unspent transactions remain, and this compressed version be used in lieu of all the blocks it represents.
Of course, the whole thing would have to be digitally signed with keys recognized by the client and distributed, along with source for a utility that anyone can run to confirm that it in fact accurately represents the account balances. This file would in essence be very similar to the "Solidcoin genesis block" that I'm sure some of us are laughing about, except that it would artificially contain the hash of the last "real" block that went into it, so that it could be built upon by real blocks from the original chain.
There would be other minor logistical concerns as well. For example, I don't think the peer-to-peer protocol has any language for informing a peer that it only has a subset of the block chain.
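To make the digest idea a bit more concrete, here is a rough Python sketch of what "only unspent transactions remain" could mean. The transaction layout (dicts with `txid`, `inputs`, `outputs`) and the function name `build_unspent_digest` are made up for illustration; they are not the client's real data structures, and signing or distributing the result is not shown.

```python
def build_unspent_digest(blocks):
    """Replay a chain and keep only the outputs that were never spent.

    Hypothetical layout: each block is a list of transactions, each
    transaction is a dict with
      "txid":    str
      "inputs":  list of (txid, output_index) pairs being spent
      "outputs": list of (address, amount) pairs being created
    """
    unspent = {}
    for block in blocks:
        for tx in block:
            for outpoint in tx["inputs"]:
                if outpoint not in unspent:
                    # Spending something unknown or already spent.
                    raise ValueError(f"invalid spend of {outpoint}")
                del unspent[outpoint]
            for n, output in enumerate(tx["outputs"]):
                unspent[(tx["txid"], n)] = output
    return unspent


chain = [
    [{"txid": "a", "inputs": [], "outputs": [("alice", 50)]}],
    [{"txid": "b", "inputs": [("a", 0)],
      "outputs": [("bob", 20), ("alice", 30)]}],
]
digest = build_unspent_digest(chain)
```

In this toy chain the digest holds only the two outputs of transaction `b`; transaction `a` is fully spent and drops out, which is where the space saving would come from.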
|
Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable. I never believe them. If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins. I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion. Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice. Don't keep coins online. Use paper or hardware wallets instead.
|
|
|
Alex Zee
|
|
September 15, 2011, 02:35:00 PM |
|
I want to add that the amount of HDD "swapping" on new blocks is unreasonable. It sounds like the client performs defragmentation.
Can't this be optimized? Cached? It's very annoying and doesn't do any good for the disk.
|
|
|
|
kjj
Legendary
Offline
Activity: 1302
Merit: 1026
|
|
September 15, 2011, 02:52:20 PM |
|
There actually has been a lot of discussion on this. Search for pruning.
The trick is that exactly how to do it isn't clear yet, and there is no urgency. 600 MB is trivial for the types of devices that run full clients, and the ideas that people have for tiny clients will probably need something else entirely.
|
17Np17BSrpnHCZ2pgtiMNnhjnsWJ2TMqq8 I routinely ignore posters with paid advertising in their sigs. You should too.
|
|
|
casascius (OP)
Mike Caldwell
VIP
Legendary
Offline
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
|
|
September 15, 2011, 03:24:35 PM |
|
There actually has been a lot of discussion on this. Search for pruning.
The trick is that exactly how to do it isn't clear yet, and there is no urgency. 600 MB is trivial for the types of devices that run full clients, and the ideas that people have for tiny clients will probably need something else entirely.
I understand the pruning (with respect to the idea that the Merkle trees can have branches taken off them and still be able to evaluate the hash of the remaining items). That just plain isn't going to work, for one big reason: there is no way for a client to tell if any given missing branch is a spend that the client needs to know about. (That is, a spend that consumes some prior transaction... if this is "pruned", the prior transaction looks unspent). And there is no viable way to share pruned blocks in an untrusted peer-to-peer environment, as there is no way for any peer to trust that what's missing is actually what's supposed to be missing.
For a pruned block chain to work, one must be able to trust whoever did the pruning... which makes a signature infrastructure necessary... and it starts to look a lot like what I've proposed as a digest. In fact, the way I see it, pruning is totally useless. If you already have to trust the pruner and pruning can only remove some of the unneeded data, then you may as well accept a signed digest file which will be smaller and remove all the unneeded data.
Tiny devices in my view would typically be carried around with a digest of all-they-need-to-know produced by whoever is administering the application being used, or if they're online, would ask a trusted web service to query the blockchain on their behalf, likely using JSON or something similar.
600 MB might be trivial if it weren't growing exponentially. That, and new users practically have to wait all day for the client to download and verify it already, today, which is well beyond what a typical newbie expects. Not being sure what the solution might be at this point is a recipe for having our backs against a wall at some point in the future. Maybe we will get lucky and someone will spam attack us like they did with Solidcoin for our own good, which will escalate the urgency of this issue. =)
|
|
|
|
Steve
|
|
September 15, 2011, 03:31:22 PM |
|
I think this has been discussed many times before. I thought a "block headers only" solution was in the works. In this case, a client only needs to download the block headers instead of all transactions. As new transactions are received, it would pull input transactions lazily out of the network to do verification. You need to take steps to ensure that there are still clients that maintain the full transaction history however.
Creating a digest of current account balances is still a good idea though...in fact, it might be a good idea to include a full digest of all account balances with every block that is created (the block header would have a merkle hash that points to the digest and only addresses with a nonzero balance would be included). This way, no signatures or trust network is needed...the digest gets created as part of normal block creation. In fact, clients could choose not to verify the entire history, but simply start with the current block in the network and validate forward from there. They could even discard old transaction history this way (as long as you set a limit on how deep reorgs are allowed to happen). The validation might need to reach back into history prior to the earliest digest that the client is using, but it could pull those earlier transactions as needed from the network. I think you would just need to be verifying that input transactions did appear in an earlier block.
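As a rough illustration of what a per-block balance digest could commit to, here is a small Python sketch. It uses a flat SHA-256 over a sorted list rather than a real Merkle tree, and the serialization format and the name `balance_digest_hash` are assumptions made up for the example, not anything the client does today.

```python
import hashlib


def balance_digest_hash(balances):
    """Hash all nonzero balances in a deterministic order.

    `balances` maps an address string to an integer amount in the
    smallest unit. Addresses with a zero balance are left out, so
    they do not affect the hash. (Flat hash for brevity; a Merkle
    tree would replace this in a real design.)
    """
    h = hashlib.sha256()
    for address in sorted(balances):
        amount = balances[address]
        if amount == 0:
            continue
        h.update(f"{address}:{amount}\n".encode("ascii"))
    return h.hexdigest()


a = balance_digest_hash({"addr1": 5, "addr2": 7, "addr3": 0})
b = balance_digest_hash({"addr2": 7, "addr1": 5})
```

Here `a` and `b` come out identical, because the ordering is fixed by sorting and the empty address is skipped. Any two nodes with the same view of the balances would compute the same value, which is what lets the hash go into a block.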
|
|
|
|
casascius (OP)
Mike Caldwell
VIP
Legendary
Offline
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
|
|
September 15, 2011, 03:40:45 PM |
|
I think this has been discussed many times before. I thought a "block headers only" solution was in the works. In this case, a client only needs to download the block headers instead of all transactions. As new transactions are received, it would pull input transactions lazily out of the network to do verification. You need to take steps to ensure that there are still clients that maintain the full transaction history however.
So you receive an incoming transaction... How do you know if it's a double spend if all you have is block headers? If it has to ask the network about transactions, what happens if the network lies or the node being asked doesn't have a complete picture? The block headers allow you to prove that something is on the block chain, but in this case, you are needing proof that something invalidating it is not on the block chain to know it is any good. You can't do that without a complete copy, a trusted peer, or a trusted digest.
Creating a digest of current account balances is still a good idea though...in fact, it might be a good idea to include a full digest of all account balances with every block that is created (the block header would have a merkle hash that points to the digest and only addresses with a nonzero balance would be included). This way, no signatures or trust network is needed...the digest gets created as part of normal block creation. In fact, clients could choose not to verify the entire history, but simply start with the current block in the network and validate forward from there. They could even discard old transaction history this way (as long as you set a limit on how deep reorgs are allowed to happen). The validation might need to reach back into history prior to the earliest digest that the client is using, but it could pull those earlier transactions as needed from the network. I think you would just need to be verifying that input transactions did appear in an earlier block.
If these were done on an automated basis (e.g. every 1000 or 2016 blocks), then a client could feasibly only need the block headers from genesis to the last trusted digest. Miners would probably need to be forced not to "trust" the digest and would need to verify it in its entirety before building on it, until it's at least a certain age (e.g. 1000 blocks), and then clients would only treat a digest as trusted if it's more than 1000 blocks old. The digest is going to be huge (tens of MB or more) and of course can't reasonably be done every block, but the way I see it, it is still probably a viable solution.
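The "which digest can a client trust" rule is simple enough to sketch in Python. The function name and the defaults below (a digest every 2016 blocks, trusted once more than 1000 blocks old) are just the example numbers from this post, not an agreed protocol rule.

```python
def latest_trusted_digest_height(tip_height, interval=2016, maturity=1000):
    """Return the height of the newest digest a client would trust.

    Assumes digests sit at positive multiples of `interval` and are
    trusted only when strictly more than `maturity` blocks old.
    Returns None if no digest qualifies yet.
    """
    cutoff = tip_height - maturity - 1  # newest height that is old enough
    if cutoff < interval:
        return None
    return (cutoff // interval) * interval
```

With the chain tip at block 150000, this picks the digest at height 147168 (73 x 2016); the one at 149184 exists but is still too young. A client would then need headers up to 147168, that digest, and full blocks from there forward.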
|
|
|
|
cypherdoc
Legendary
Offline
Activity: 1764
Merit: 1002
|
|
September 15, 2011, 04:35:45 PM |
|
|
|
|
|
kjj
Legendary
Offline
Activity: 1302
Merit: 1026
|
|
September 15, 2011, 04:38:41 PM Last edit: September 15, 2011, 05:57:17 PM by kjj |
|
I think that your problem with pruning is that you have decided on one way to do it out of many, and your way won't work, so you think that no way will work.
I can assure you that there actually are ways to do pruning that you haven't thought of, and they really do work just fine.
And pruning isn't the only solution to the real and imaginary problems in this area. For example, tiny clients will probably use a system where they contact one or more special nodes to download transaction data on demand.
The key to understanding this is to consider that you need different subsets of the data to do different things, and not every device needs to do the same things, so not every device will need the same data.
edit: kjj 2011-09-15 12:55 -0500 - though -> thought
|
|
|
|
trentzb
|
|
September 15, 2011, 04:52:39 PM |
|
I keep coming back to the need for a super-node but I don't particularly like that idea much. I think it has been mentioned elsewhere that if a super-node system were used, there would have to be some incentive and a proof of work for a super-node to operate and relay/validate transactions. I cannot imagine a way to provide such incentive while ensuring valid answers are provided to other nodes.
|
|
|
|
cypherdoc
Legendary
Offline
Activity: 1764
Merit: 1002
|
|
September 15, 2011, 04:55:30 PM |
|
Whatever you do, avoid centralizing any of these nodes as much as possible.
|
|
|
|
ElectricMucus
Legendary
Offline
Activity: 1666
Merit: 1057
Marketing manager - GO MP
|
|
September 15, 2011, 05:07:08 PM |
|
I don't think this will ever be a real issue; I seriously doubt the size of the blockchain will ever outgrow the size of the average hard drive.
|
|
|
|
theymos
Administrator
Legendary
Offline
Activity: 5376
Merit: 13370
|
|
September 15, 2011, 05:54:12 PM |
|
Solo miners and pool servers, which need all unspent transactions, will probably eventually figure out some system of downloading only unspent transactions. A list of spent transactions will become "common knowledge", perhaps.
Clients are expected to mostly use headers-only mode, which is safe as long as an attacker doesn't have >50% of the network. Clients could also rely on the network only to verify spends of very old transactions, and verify new blocks themselves.
If it really becomes a serious problem, it's possible for >50% of the network to implement something like demurrage in a backward-compatible way, which would allow the network to forget really old transactions, even if they are unspent.
|
1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD
|
|
|
Steve
|
|
September 15, 2011, 06:01:02 PM |
|
I think this has been discussed many times before. I thought a "block headers only" solution was in the works. In this case, a client only needs to download the block headers instead of all transactions. As new transactions are received, it would pull input transactions lazily out of the network to do verification. You need to take steps to ensure that there are still clients that maintain the full transaction history however.
So you receive an incoming transaction... How do you know if it's a double spend if all you have is block headers? If it has to ask the network about transactions, what happens if the network lies or the node being asked doesn't have a complete picture? The block headers allow you to prove that something is on the block chain, but in this case, you are needing proof that something invalidating it is not on the block chain to know it is any good. You can't do that without a complete copy, a trusted peer, or a trusted digest.
You don't have to trust any peers...transactions are identified by their hash...when you ask a peer for the transaction associated with a txid (i.e. you need to retrieve a transaction input), you validate what it returns by hashing it and verifying that it matches the txid. If that transaction is part of an earlier block, then to the extent you trust the block headers that you downloaded, you can trust that the input transaction was valid (you could make a rule that you only trust the headers that pre-date one of the hard coded blocks in the client...probably hundreds of other possibilities as well).
Creating a digest of current account balances is still a good idea though...in fact, it might be a good idea to include a full digest of all account balances with every block that is created (the block header would have a merkle hash that points to the digest and only addresses with a nonzero balance would be included). This way, no signatures or trust network is needed...the digest gets created as part of normal block creation. In fact, clients could choose not to verify the entire history, but simply start with the current block in the network and validate forward from there. They could even discard old transaction history this way (as long as you set a limit on how deep reorgs are allowed to happen). The validation might need to reach back into history prior to the earliest digest that the client is using, but it could pull those earlier transactions as needed from the network. I think you would just need to be verifying that input transactions did appear in an earlier block.
If these were done on an automated basis (e.g. every 1000 or 2016 blocks), then a client could feasibly only need the block headers from genesis to the last trusted digest. Miners would probably need to be forced not to "trust" the digest and would need to verify it in its entirety before building on it, until it's at least a certain age (e.g. 1000 blocks), and then clients would only treat a digest as trusted if it's more than 1000 blocks old. The digest is going to be huge (tens of MB or more) and can't reasonably be done every block of course, and the way I see it, is still probably a viable solution.
The digest for any block could be produced on demand by any node that has the full transaction history. If you did something like this, keep it simple...define a standard format for the digest and have miners always include the hash of the digest in every block. When a new block is found, other nodes would update their running digest with that block's transactions, then compute the hash of the digest and verify that it matches what is in the block.
Still, I'm struggling to think of a real benefit of such digests (it is redundant information after all)...using digests would require placing some trust in the information provided by peers (full validation of the transaction history and block chain is required to avoid trusting peers). It's possible this could provide some benefit to lighter weight peers (that don't have the storage, CPU or bandwidth needed for full validation), but then I think that problem is better solved by a lightweight client that just communicates with a full node that it trusts. In fact, as I write this, I'm wondering whether a "headers only" client fills a real need either. Either you're trusting other nodes or you're not...if you're not, then you need the full transaction history...I don't think you can escape that. If you are trusting other nodes, then just pull only the information you need from one or more trusted nodes when you need it and don't even try to connect directly to the p2p network.
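The txid check mentioned above can be sketched in a few lines of Python. A Bitcoin transaction id is the double SHA-256 of the serialized transaction, conventionally displayed with the bytes reversed; the helper names `txid_of` and `matches_txid` are made up for the example, and fetching the bytes from a peer is left out.

```python
import hashlib


def txid_of(raw_tx: bytes) -> str:
    """Double SHA-256 of the raw transaction, shown byte-reversed as hex."""
    digest = hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()
    return digest[::-1].hex()


def matches_txid(raw_tx: bytes, requested_txid: str) -> bool:
    """True if the bytes a peer returned really hash to the txid we asked for."""
    return txid_of(raw_tx) == requested_txid.lower()


raw = b"example transaction bytes"  # placeholder, not a real transaction
wanted = txid_of(raw)
```

A peer that returns anything other than the requested transaction fails the check, so the peer cannot substitute different content for a given txid.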
|
|
|
|
flower1024
Legendary
Offline
Activity: 1428
Merit: 1000
|
|
September 15, 2011, 06:06:00 PM |
|
I am not really a crypto guy, but wouldn't it be possible to bootstrap the thin client from a trusted node (maybe from a home PC) so that it has the latest 10 block digests?
As far as I understand you, that would be enough to verify further transactions.
|
|
|
|
kjj
Legendary
Offline
Activity: 1302
Merit: 1026
|
|
September 15, 2011, 06:39:13 PM |
|
I am not really a crypto guy, but wouldn't it be possible to bootstrap the thin client from a trusted node (maybe from a home PC) so that it has the latest 10 block digests?
As far as I understand you, that would be enough to verify further transactions.
Sorta. If you assume a really trusted node, you don't even need 10 digests, or even one. But it really depends on how thin the thin client is, and what it is planning to do.
Consider a wallet that only holds keys and signs transactions. In theory, it doesn't need to know anything at all, because it can ask for the transactions it is about to spend when it is ready to spend. The worst a hostile node could do at that point is present bogus transactions for the wallet to use as inputs, but that doesn't hurt the wallet any, just the recipient. And the hostile node will almost always belong to either the spender or the receiver, and one of them would notice the problem quickly anyway.
As you ask the client to do more things, do more extensive verification, or trust other nodes less, it needs more data and needs to do more processing itself. There is a full spectrum between the full nodes that everyone has now, and the superthin wallets that I think people will eventually be carrying around.
|
|
|
|
2112
Legendary
Offline
Activity: 2128
Merit: 1073
|
|
September 15, 2011, 07:04:21 PM |
|
In fact, as I write this, I'm wondering whether a "headers only" client fills a real need either.
It fulfills psychological needs of some developers, and transitively some bitcoin stakeholders and promoters. Whether those needs are "real" or not is a different story. In my opinion they are very real, although sometimes subconscious.
|
|
|
|
casascius (OP)
Mike Caldwell
VIP
Legendary
Offline
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
|
|
September 15, 2011, 07:42:46 PM |
|
So you receive an incoming transaction... How do you know if it's a double spend if all you have is block headers? If it has to ask the network about transactions, what happens if the network lies or the node being asked doesn't have a complete picture? The block headers allow you to prove that something is on the block chain, but in this case, you are needing proof that something invalidating it is not on the block chain to know it is any good. You can't do that without a complete copy, a trusted peer, or a trusted digest.
You don't have to trust any peers...transactions are identified by their hash...when you ask a peer for the transaction associated with a txid (i.e. you need to retrieve a transaction input), you validate what it returns by hashing it and verifying that it matches the txid. If that transaction is part of an earlier block, then to the extent you trust the block headers that you downloaded, you can trust that the input transaction was valid (you could make a rule that you only trust the headers that pre-date one of the hard coded blocks in the client...probably hundreds of other possibilities as well).
I agree you can easily determine that the transaction exists, but what part of this validation process will allow you to confirm that the transaction has not been spent by a later transaction?
Still, I'm struggling to think of a real benefit of such digests (it is redundant information after all)...using digests would require placing some trust in the information provided by peers (full validation of the transaction history and block chain is required to avoid trusting peers). It's possible this could provide some benefit to lighter weight peers (that don't have the storage, CPU or bandwidth needed for full validation), but then I think that problem is better solved by a lightweight client that just communicates with a full node that it trusts. In fact, as I write this, I'm wondering whether a "headers only" client fills a real need either. Either you're trusting other nodes or you're not...if you're not, then you need the full transaction history...I don't think you can escape that. If you are trusting other nodes, then just pull only the information you need from one or more trusted nodes when you need it and don't even try to connect directly to the p2p network.
The real benefit is simple: much less for users to download. Imagine block 150000 was a superblock that contained a digest of all the unspent transactions before it, and let's say it weighed in at 50 MB. A client would only need to download the 80-byte block headers of blocks 1 thru 149999, which would be only 149999*80bytes, or 12 megabytes. Total download burden to start using Bitcoin at block 150000: 62 megabytes. Now let's say at block 750000, the block chain is 50 terabytes. But as a superblock, it lists all unspent transactions in 1 GB. Total download burden: 1 GB + 750000*80 bytes (60 MB) = 1.06 GB, versus 50 terabytes. I am sure you would agree that having less to download is universally desirable.
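For anyone who wants to plug in their own numbers, the arithmetic above fits in a couple of lines of Python. The function name is made up, sizes are in decimal bytes, and the digest sizes are the same guesses used in this post.

```python
HEADER_BYTES = 80  # size of one block header


def bootstrap_download_bytes(superblock_height, digest_bytes):
    """Headers for every block before the superblock, plus the digest itself."""
    return digest_bytes + (superblock_height - 1) * HEADER_BYTES


small = bootstrap_download_bytes(150_000, 50_000_000)      # 50 MB digest
large = bootstrap_download_bytes(750_000, 1_000_000_000)   # 1 GB digest
```

`small` works out to 61,999,920 bytes (about 62 MB) and `large` to 1,059,999,920 bytes (about 1.06 GB), matching the figures above. The header part grows only linearly with the block count, so nearly all of the cost is the digest.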
|
|
|
|
Steve
|
|
September 15, 2011, 07:58:43 PM |
|
Getting back to the topic of the OP...if you believe that bitcoin will flourish and be used for generations to come, then eventually this does become a problem. I'm also interested in ways to allow the network to forget old transactions just out of principle (although, you should never assume old transactions are ever forgotten). I think you can do this in a couple steps. The first step is to establish a block depth, beyond which, the network will never accept a re-org. This formalizes the current practice of hard coding recent blocks into the client. It should be set deep enough that it doesn't present any real risk of block chain splits. The second step is to add the hash of the unspent transactions to the block. Nodes would have to validate this hash before considering the block to be valid.
New nodes joining the network could request the current block along with its transactions. The node can then work backward, fetching each block and the transactions until it reaches a block that is old enough that it would never be subject to reorg (plus a few extra for good measure in case some of the rest of the network hasn't seen some of the more recent blocks). It fetches all the unspent transactions that go with that block. It can then validate going forward all of the transactions and blocks. This new node will be trusting that other nodes have validated history up to the point in the past where it is starting from. Care could be taken to ensure that trust is distributed among a large and diverse set of nodes by confirming everything with a large number of nodes (perhaps seeded by a group of nodes widely acknowledged to be trustworthy). Another approach to this node bootstrapping would be to view it as a kind of cloning of existing nodes. If Alice trusts Bob, John, and Bill, then Alice would be confident in bootstrapping a new node from Bob, John and Bill and not have to place trust in the broader network.
If I'm not missing anything, nodes would only need to keep block headers and unspent transactions that are older than the block at which reorgs are prohibited. Spent transactions that are older than the reorg boundary (plus some buffer) could be safely discarded. The older blocks themselves could possibly be discarded as well (remember, all the unspent transactions can be verified to be in the chain by virtue of this new hash in the header). I think for good measure you would want to keep a healthy amount of history beyond the reorg boundary, though I'm not sure it would be strictly necessary. I even wonder whether this would work and be safe even on very short time scales if you retained a relatively long tail of block headers (i.e. a few hours for the reorg boundary).
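A minimal Python sketch of the discard rule, assuming a node tracks for each output the height where it was created and the height where it was spent (if any). The record layout, the function name `prune_spent`, and the example depths are all hypothetical.

```python
def prune_spent(outputs, tip_height, max_reorg_depth, buffer=0):
    """Drop outputs whose spend is buried beyond the reorg boundary.

    `outputs` maps (txid, index) to a dict with
      "created": height of the block that created the output
      "spent":   height of the block that spent it, or None if unspent
    Unspent outputs and recently spent ones are kept.
    """
    boundary = tip_height - max_reorg_depth - buffer
    return {
        outpoint: record
        for outpoint, record in outputs.items()
        if record["spent"] is None or record["spent"] > boundary
    }


store = {
    ("a", 0): {"created": 100, "spent": 500},   # spent long ago
    ("b", 0): {"created": 850, "spent": 900},   # spent recently
    ("c", 0): {"created": 200, "spent": None},  # still unspent
}
kept = prune_spent(store, tip_height=1000, max_reorg_depth=100, buffer=20)
```

With these numbers the boundary is block 880, so only `("a", 0)` is discarded. The recently spent output stays because a reorg could still undo its spend, and the unspent one stays because future transactions may reference it.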
|
|
|
|
Steve
|
|
September 15, 2011, 08:15:10 PM |
|
So you receive an incoming transaction... How do you know if it's a double spend if all you have is block headers? If it has to ask the network about transactions, what happens if the network lies or the node being asked doesn't have a complete picture? The block headers allow you to prove that something is on the block chain, but in this case, you are needing proof that something invalidating it is not on the block chain to know it is any good. You can't do that without a complete copy, a trusted peer, or a trusted digest.
You don't have to trust any peers...transactions are identified by their hash...when you ask a peer for the transaction associated with a txid (i.e. you need to retrieve a transaction input), you validate what it returns by hashing it and verifying that it matches the txid. If that transaction is part of an earlier block, then to the extent you trust the block headers that you downloaded, you can trust that the input transaction was valid (you could make a rule that you only trust the headers that pre-date one of the hard coded blocks in the client...probably hundreds of other possibilities as well).
I agree you can easily determine that the transaction exists, but what part of this validation process will allow you to confirm that the transaction has not been spent by a later transaction?
If a transaction appears in a block, you know that the network has agreed that it is the authoritative transaction to the extent that the network is able to agree to such a thing (reorgs, >50% attacks, etc). If it doesn't appear in a block, well, you need to ask your peers if they think it's valid (if they've seen a conflicting transaction, they won't) or you wait for it to appear in a block (and both still carry degrees of risk when it comes to double spends). A headers only client doesn't mean that you never pull the body of the block from the network...if you need to check that a transaction appeared in a block, you can ask a peer that question...the peer can respond with the portion of the merkle tree that proves the transaction is in the block.
The real benefit is simple: much less for users to download.
My point is that with a lobotomized client, you still have to place trust in network peers...so you may as well go as thin as possible and not bother with the block chain at all. Just a very thin client connecting to one (or several if you want a little bit of distributed trust) full network nodes.
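For reference, the "portion of the merkle tree" check mentioned above can be sketched in Python like this. It follows the Bitcoin convention of double SHA-256 and duplicating the last hash on odd-sized levels; the function names are made up, the leaves are placeholder hashes rather than real transactions, and at least one leaf is assumed.

```python
import hashlib


def dsha(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    if len(level) % 2:                 # odd count: duplicate the last hash
        level = level + [level[-1]]
    return [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(leaves, index):
    """What a full node would send: the sibling hash at each level."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        branch.append(padded[index ^ 1])
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(leaf, index, branch, root):
    """What a headers-only client would run against the root in a header."""
    h = leaf
    for sibling in branch:
        h = dsha(sibling + h) if index % 2 else dsha(h + sibling)
        index //= 2
    return h == root


leaves = [dsha(bytes([i])) for i in range(5)]  # stand-ins for txids
root = merkle_root(leaves)
proof = merkle_branch(leaves, 3)
ok = verify_branch(leaves[3], 3, proof, root)
```

The proof for one transaction among five is only three hashes. It shows the transaction is in that block; it does not show anything about what later blocks contain.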
|
|
|
|
ctoon6
|
|
September 15, 2011, 08:15:16 PM |
|
The main problem is picking somebody trusted to do it. You would have to have hundreds of people run the same calculations and audit the code just to be 99.9999999% sure that nobody lost or gained any coins. The network could be in ruins for a week while this happened. While we are at it, we might as well recycle all the very small amounts that are just wasting disk space; for example, anything lower than .001 would get added back into the pile of coins yet to be claimed.
I think it should be done every gigabyte or so. Off the top of my head, right now that would be about every 1.5-2 years.
|
|
|
|
|