Bitcoin Forum
November 09, 2024, 12:40:51 AM *
Author Topic: A proposal for a scalable blockchain.  (Read 6209 times)
piuk (OP)
Hero Member
*****
expert
Offline Offline

Activity: 910
Merit: 1005



View Profile WWW
November 25, 2011, 05:42:17 PM
Last edit: November 26, 2011, 04:51:58 PM by piuk
Merited by ABCbits (1)
 #1

The problem:

The blockchain will not scale the way it is currently used. The wiki page at https://en.bitcoin.it/wiki/Scalability mentions pruning spent outputs, however this method still requires storage of all block headers, meaning there is still no cap on the blockchain size. Merkle tree pruning will not help to any significant extent: you may be able to tell whether a transaction is in a block, but you cannot validate that block without all of its transactions. If lightweight clients cannot validate blocks, then they cannot mine, relay blocks or relay transactions. There is then almost no point in them validating anything at all, and they might as well use a centralised blockchain.

a) A smaller blockchain helps lower the barrier of entry for the new users.
b) With less risk of blockchain bloat transaction fees could be lowered
c) The larger the blockchain the less users who will run the client and the more centralised the network becomes.

Proposed solution.

At certain points in time the client generates a snapshot of every unspent tx output in the chain. This snapshot (the ledger) encapsulates the state of the blockchain up to, but not including, that block. When a miner produces a block he generates a SHA256 hash of this ledger and includes the hash in the block's coinbase.
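
A minimal sketch of the ledger hash described above, in Python. The `UnspentOutput` type, the 2-byte script length prefix and the function names are assumptions made for illustration; the proposal only says that the ledger is hashed with SHA256 and that outputs are dumped in chain order.

```python
import hashlib
import struct
from typing import List, NamedTuple


class UnspentOutput(NamedTuple):
    value: int            # amount in satoshis
    script_pubkey: bytes  # locking script


def serialize_output(out: UnspentOutput) -> bytes:
    # 8-byte little-endian value followed by a length-prefixed script.
    # The fixed 2-byte length prefix is a simplification of the network
    # format, which uses a variable-length integer.
    return struct.pack("<qH", out.value, len(out.script_pubkey)) + out.script_pubkey


def ledger_hash(unspent: List[UnspentOutput]) -> bytes:
    # Hash the outputs in the order they appeared in the chain, so that
    # every node hashing the same snapshot gets the same digest.
    h = hashlib.sha256()
    for out in unspent:
        h.update(serialize_output(out))
    return h.digest()
```

Because the digest depends on the order of the outputs, every node has to agree on one ordering, which is why the file format section fixes it to the order of appearance in the blockchain.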

When a client begins the initial block chain download it starts from the chain head and works backwards. The client downloads a minimum of 2016 blocks before it will accept a ledger hash. If there is a fork in the chain the client will continue to download blocks until it finds a pair of blocks for which at least one ledger hash can be agreed upon. When an identical ledger is found the chain with the best proof of work wins. When the client accepts a hash it will ask the node to provide it with the full ledger corresponding to it, which the client can hash itself and verify. If the node doesn't have the ledger for that hash the client may ask other nodes; if no nodes have a copy then it should continue to download past blocks until it can find a hash and a full copy of the ledger.

To validate a transaction the client locates each txIn outpoint in the unspent ledger and checks the corresponding script for validity. The client checks the validity of a transaction by looking at the txOutputs in its latest ledger and in the transactions included in blocks after it. Therefore nodes do not need to generate a balance sheet for every transaction; instead they would keep a balance sheet for approximately two weeks (2016 blocks) before regenerating it. Two weeks has been chosen as a base value because it provides enough blocks to use for difficulty targeting, however nodes are free to keep more or fewer blocks depending on their storage capacity.
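
The lookup described above can be sketched roughly as follows. This is a toy model, not client code: `Tx`, `apply_blocks` and `inputs_available` are hypothetical names, transactions are reduced to their outpoints, and script validation is left out entirely.

```python
from typing import List, NamedTuple, Set, Tuple

Outpoint = Tuple[str, int]  # (transaction id, output index)


class Tx(NamedTuple):
    txid: str
    inputs: List[Outpoint]
    n_outputs: int


def apply_blocks(ledger: Set[Outpoint], blocks: List[List[Tx]]) -> Set[Outpoint]:
    """Replay the blocks mined after the ledger snapshot on top of it."""
    unspent = set(ledger)
    for block in blocks:
        for tx in block:
            for outpoint in tx.inputs:
                unspent.remove(outpoint)  # raises KeyError on an unknown or spent input
            for index in range(tx.n_outputs):
                unspent.add((tx.txid, index))
    return unspent


def inputs_available(tx: Tx, ledger: Set[Outpoint], blocks: List[List[Tx]]) -> bool:
    """True if every input of tx is unspent and no input is listed twice."""
    unspent = apply_blocks(ledger, blocks)
    no_duplicates = len(set(tx.inputs)) == len(tx.inputs)
    return no_duplicates and all(op in unspent for op in tx.inputs)
```

A real client would keep the replayed set up to date incrementally instead of rebuilding it for every transaction; the rebuild here only keeps the sketch short.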

When the client decides it is time to generate a new ledger it looks through the chain for a more recent block which has a ledger hash in its coinbase and is at least 2016 blocks behind the chain head. It then generates a new ledger for that block and checks that the hash matches. If it matches then the client is free to purge all transactions/blocks before that point. If the hash does not match then it is important to note that the client does not reject the chain, as long as the proof of work is valid. The order of transactions is already decided by the order in the blockchain, so the client would simply wait until a miner produces a hash it can agree with; it should not purge transactions until such a hash is found. Miners may want to keep blocks for a longer period of time to ensure they have the necessary proof of work should it be needed.
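
One way to picture the purge decision is the sketch below. The names `find_prune_height` and `compute_ledger_hash` are hypothetical, and falling back to an older matching hash when a newer one disagrees is a choice made for this example; the proposal itself only says the client waits and does not purge until it finds a hash it agrees with.

```python
from typing import Callable, List, Optional

KEEP_BLOCKS = 2016  # roughly two weeks of blocks, as in the proposal


def find_prune_height(
    committed: List[Optional[bytes]],
    compute_ledger_hash: Callable[[int], bytes],
) -> Optional[int]:
    """committed[h] is the ledger hash in block h's coinbase, or None.

    Returns the most recent height, at least KEEP_BLOCKS behind the head,
    whose committed hash matches the locally computed ledger hash.
    """
    head = len(committed) - 1
    for height in range(head - KEEP_BLOCKS, -1, -1):
        claimed = committed[height]
        if claimed is not None and claimed == compute_ledger_hash(height):
            return height  # blocks before this height may be purged
        # A mismatch does not invalidate the chain; keep looking.
    return None  # nothing to agree with yet, so purge nothing
```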

Would this fork the chain?

No. Miners are free to insert whatever data they like into their block's coinbase. Clients that wish to hold the entire blockchain history can simply ignore it.

How much data would clients need to hold?

Quote
Approximately 4.5 million txOuts and 3.3 million txIns - so ~1.2 million unspent outputs.

At the present blockchain size, the ledger would consume at most:

(256 + 160 + 16 + 64) * 1.2 million = 71MB

+ approximately two weeks' worth of blocks = 100 MB total

This is an initial estimate; with compression it may be possible to halve this value.
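
The arithmetic behind the 71MB figure can be checked with a few lines of Python. Reading the four field widths as bits is an assumption (the post does not state the unit), but it is the reading that reproduces the quoted number when MB is taken as 2^20 bytes.

```python
# Field widths from the estimate above, read as bits (assumption).
BITS_PER_ENTRY = 256 + 160 + 16 + 64

# Unspent outputs = all txOuts minus all txIns, using the quoted counts.
UNSPENT_OUTPUTS = 4_500_000 - 3_300_000

bytes_per_entry = BITS_PER_ENTRY // 8            # 62 bytes per unspent output
ledger_bytes = bytes_per_entry * UNSPENT_OUTPUTS
ledger_mib = ledger_bytes / 2**20                # size in units of 2^20 bytes

print(f"{bytes_per_entry} bytes/entry, {ledger_mib:.1f} MiB in total")
```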

Could you mine without the entire blockchain?
Yes. The network could operate fully without any node having the entire blockchain. It is possible that a chain fork could go so far into the past that no nodes have a copy of the chain long enough to resolve the split; however, this is extremely unlikely unless a malicious attacker holds 51% of the hashing power for a significant period of time.

How would this be adopted, would all miners need to switch immediately?
There needs to be at least one miner producing a ledger hash roughly every two weeks, so initially this could be implemented with only a small pool adopting the scheme. The more frequently miners produce a ledger hash, the more efficiently clients will be able to prune old transactions.

File format
The initial proposed file format would simply be a dump of all unspent txOutputs, in the order they appeared in the blockchain, in the same format as they are serialised over the network. This has the advantage that any bitcoin client that participates on a network level can decode the file with minimum effort.

The file will probably need to be indexed after it is downloaded as it will not be suitable for locating a txOutput efficiently. The file format for the ledger will be included in the coinbase along with the hash. In the file there will be many duplicate scripts and transaction hashes giving the possibility of much greater compression in future.

Coinbase
Magic Value - File format - Ledger size - Hash
uint32_t, uint16_t, uint64_t, uint256_t

** magic value is a flag indicating this coinbase holds a ledger hash
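
A possible encoding of the four coinbase fields, sketched with Python's `struct` module. The magic constant, the little-endian byte order and the function names are assumptions for illustration; the proposal only gives the field order and widths.

```python
import hashlib
import struct

# Hypothetical magic value; the proposal does not specify one.
LEDGER_MAGIC = 0x4C444752

# uint32 magic, uint16 file format, uint64 ledger size, 256-bit hash.
# "<" selects little-endian with no padding (an assumption here).
COMMITMENT = struct.Struct("<IHQ32s")


def pack_commitment(file_format: int, ledger: bytes) -> bytes:
    """Build the coinbase payload for a serialised ledger."""
    digest = hashlib.sha256(ledger).digest()
    return COMMITMENT.pack(LEDGER_MAGIC, file_format, len(ledger), digest)


def parse_commitment(payload: bytes):
    """Return (file_format, ledger_size, ledger_hash), or None if the flag is absent."""
    if len(payload) < COMMITMENT.size:
        return None
    magic, file_format, size, digest = COMMITMENT.unpack_from(payload)
    if magic != LEDGER_MAGIC:
        return None
    return file_format, size, digest
```

With these widths the payload is 46 bytes. A client that does not recognise the magic value simply gets `None` back and treats the coinbase as ordinary miner data.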


/Discuss. Feel free to point out any glaringly obvious flaws :)


btc_artist
Full Member
***
Offline Offline

Activity: 154
Merit: 102

Bitcoin!


View Profile WWW
November 25, 2011, 05:44:53 PM
 #2

Don't have any comments at the moment, but this looks like an interesting proposal.

EDIT: Upon second reading, I don't see any glaringly obvious flaws and it looks like it might work.  I am, however, very much a neophyte in terms of understanding the bitcoin protocol.

BTC: 1CDCLDBHbAzHyYUkk1wYHPYmrtDZNhk8zf
LTC: LMS7SqZJnqzxo76iDSEua33WCyYZdjaQoE
notme
Legendary
*
Offline Offline

Activity: 1904
Merit: 1002


View Profile
November 25, 2011, 05:54:40 PM
 #3

Block contents for blocks not containing addresses you own can be pruned.  That's the whole purpose of the merkle tree structure the blockchain uses.  Future clients will do this and only some nodes will hold the entire chain.

https://www.bitcoin.org/bitcoin.pdf
While no idea is perfect, some ideas are useful.
terrytibbs
Hero Member
*****
Offline Offline

Activity: 560
Merit: 501



View Profile
November 25, 2011, 06:06:58 PM
 #4

watching this
btc_artist
Full Member
***
Offline Offline

Activity: 154
Merit: 102

Bitcoin!


View Profile WWW
November 25, 2011, 06:39:48 PM
 #5

Quote
Block contents for blocks not containing address you own can be pruned.  That's the whole purpose of the merkle tree structure the blockchain uses.  Future clients will do this and only some nodes will hold the entire chain.

This would mean the system would depend on centralized node(s) to always have the full block chain, since no clients can be expected to have any given block at any given time, right?  I don't know if centralization like that would be a good idea.

BTC: 1CDCLDBHbAzHyYUkk1wYHPYmrtDZNhk8zf
LTC: LMS7SqZJnqzxo76iDSEua33WCyYZdjaQoE
notme
Legendary
*
Offline Offline

Activity: 1904
Merit: 1002


View Profile
November 25, 2011, 06:42:04 PM
 #6

Quote
This would mean the system would depend on centralized node(s) to always have the full block chain, since no clients can be expected to have any given block at any given time, right?  I don't know if centralization like that would be a good idea.

Anyone can still mirror the entire chain, most likely including myself.  This is not centralization.

https://www.bitcoin.org/bitcoin.pdf
While no idea is perfect, some ideas are useful.
btc_artist
Full Member
***
Offline Offline

Activity: 154
Merit: 102

Bitcoin!


View Profile WWW
November 25, 2011, 06:44:37 PM
 #7

Quote
Anyone can still mirror the entire chain, most likely including myself.  This is not centralization.

But would there be any built-in incentive to mirror the block chain?  If the default client wouldn't, then that might not be a good direction to go.

BTC: 1CDCLDBHbAzHyYUkk1wYHPYmrtDZNhk8zf
LTC: LMS7SqZJnqzxo76iDSEua33WCyYZdjaQoE
bc
Member
**
Offline Offline

Activity: 72
Merit: 10



View Profile
November 25, 2011, 06:47:47 PM
 #8

Fascinating idea - if I understand it. It sounds like this proposal is sort of like an "oral history" of a block chain - as long as enough recently-connected clients are around to testify as to the "recent" history. Is that an accurate description?

So I'm sure that I understand this:

a) when would it be "safe" to "forget" a genesis block from such a blockchain? As soon as all coins generated by it have been spent, and 51% of clients have learned of these spends (received the blocks they're contained in)?

b) IF 51% of the network were forced offline for 2+ weeks, could a malevolent actor with 51% hash power step in and present a complete two-week false history?

"Democracy is the original 51% attack." - Erik Voorhees
notme
Legendary
*
Offline Offline

Activity: 1904
Merit: 1002


View Profile
November 25, 2011, 06:55:54 PM
 #9

Quote
But would there be any built-in incentive to mirror the block chain?  If the default client wouldn't, then that might not be a good direction to go.

Whatever... waste your time on a solved problem.  I'm tired of beating my head against this particular wall.  I never said the reference client should have this as default, but there is room for many different clients.  Personally, I'm going to focus on solving the lack of usefulness for bitcoin before worrying about what might happen when I and others work our butts off to get us to the point where this discussion even matters.  And I'm pretty sure the Satoshi solution is the way to go.  Have you read the whitepaper?  I highly recommend it . http://www.bitcoin.org/bitcoin.pdf

https://www.bitcoin.org/bitcoin.pdf
While no idea is perfect, some ideas are useful.
btc_artist
Full Member
***
Offline Offline

Activity: 154
Merit: 102

Bitcoin!


View Profile WWW
November 25, 2011, 07:00:43 PM
 #10

Quote
Whatever... waste your time on a solved problem.  I'm tired of beating my head against this particular wall.  I never said the reference client should have this as default, but there is room for many different clients.  Personally, I'm going to focus on solving the lack of usefulness for bitcoin before worrying about what might happen when I and others work our butts off to get us to the point where this discussion even matters.  And I'm pretty sure the Satoshi solution is the way to go.  Have you read the whitepaper?  I highly recommend it . http://www.bitcoin.org/bitcoin.pdf

I'm not convinced it's solved.  But I agree with you that bitcoin needs to actually become useful and be adopted before we have to worry about these things.

BTC: 1CDCLDBHbAzHyYUkk1wYHPYmrtDZNhk8zf
LTC: LMS7SqZJnqzxo76iDSEua33WCyYZdjaQoE
piuk (OP)
Hero Member
*****
expert
Offline Offline

Activity: 910
Merit: 1005



View Profile WWW
November 25, 2011, 07:44:54 PM
Last edit: November 25, 2011, 08:52:27 PM by piuk
 #11

Quote
Fascinating idea - if I understand it. It sounds like this proposal is sort of like an "oral history" of a block chain - as long as enough recently-connected clients are around to testify as to the "recent" history. Is that an accurate description?

Sort of. Miners would be generating a hash that says "this is the result of all transactions at this block"; as long as enough hashing power agrees with this state, it is accepted.

Quote
a) when would it be "safe" to "forget" a genesis block from such a blockchain? As soon as all coins generated by it have been spent, and 51% of clients have learned of these spends (received the blocks they're contained in)?

There would be no 100% safe point. I guess the idea would be that most clients would hold blocks from the past few weeks (possibly less, a few days) but miners would hold blocks for a much longer period. If you're holding the blocks for the past year then an attacker would need to put in a year's worth of hashing power to beat it.

Quote
b) IF 51% of the network were forced offline for 2+ weeks, could a malevolent actor with 51% hash power step in and present a complete two-week false history?

Yes, assuming every node only had the past two weeks' blocks.

Quote
Whatever... waste your time on a solved problem.  I'm tired of beating my head against this particular wall.  I never said the reference client should have this as default, but there is room for many different clients.  Personally, I'm going to focus on solving the lack of usefulness for bitcoin before worrying about what might happen when I and others work our butts off to get us to the point where this discussion even matters.  And I'm pretty sure the Satoshi solution is the way to go.  Have you read the whitepaper?  I highly recommend it . http://www.bitcoin.org/bitcoin.pdf

Satoshi's paper is pretty vague on this. Where is it explained how you prove a transaction was in a block with the Merkle tree? With pruning you still have to hold all unspent outputs, including the transaction hash and scriptPubKey. This method clusters unspent txOutputs and no longer requires the transaction hash.

Edit: Even if you can prove a transaction is in a block with the Merkle tree, you cannot validate a block without all transactions. If you can't validate a block then how do you know you haven't been given false data?

If clients cannot validate a block with only the headers then what is to stop a malicious attacker generating a set of block headers with fake hashes and difficulty targets? There would be no way for nodes to determine which chain is valid without downloading the transactions.

Skybuck
Full Member
***
Offline Offline

Activity: 386
Merit: 111


View Profile
November 25, 2011, 07:59:17 PM
Last edit: November 28, 2011, 02:18:45 AM by Skybuck
 #12

I completely do not understand your proposal. It's enormously complex and relies on additional communication providing data which might or might not be available. But reading your ideas I have my own idea:
(Edit/Addition: OK, now I understand your proposal better, I didn't know what a ledger is lol ;) I think our ideas are similar, except my idea was to create a new protocol to exchange the ledger hashes. Apparently your idea is to embed those into the block chain, so no new protocol would be necessary. However, a drawback of your idea would be that only miners get to verify the ledger/balance sheet, an obvious flaw I think ;) It needs to be separated so everybody can have a vote on it! ;) Then again, blocks are the way the network protocol and verification works. This could mean miners are now in control and decide which transactions are valid and which are not, so it seems bitcoin is coming down crashing and burning: it's no longer p2p, it's no longer everybody in control, only the miners are now in control, probably a very dangerous situation ;) At least non-miners can still verify, but rejecting will be useless it seems, since they can never win.)

How about this instead:

1. Instead of storing all transactions which ever occurred, a point in historic time is chosen, where the software makes up the balances of all bitcoin addresses.

2. Bitcoin addresses which have turned into zero balances are thrown away.

3. Instead of storing the transactions, the balances are stored, which could cut back on data.

What kind of problems could this solution face and what could be the additional solutions:

1. Somebody could change his own balance in his own data, to give himself a million dollars.

This would then conflict with the datasets of others.

An idea could be to calculate a hash over all bitcoin address balances.

This hash is then broadcast throughout the system.

The number of confirmations is tracked.

If the majority agrees that the hash is indeed correct it is accepted into a "balance chain".

To make it a little bit more difficult to fake this balance chain, the hash could follow the principle of "difficulty", except since there is no rush, the difficulty could be set to 100x the current difficulty.
(Perhaps the difficulty setting for the balance chain should match the number of "transaction/blocks collapses" * "current difficulty". In other words, blocks are calculated every 10 minutes, the balance block is calculated every 1000 blocks. So the difficulty for the balance chain is 1000x the block difficulty, which should keep both in sync).

Seems a simple enough idea.

In principle there is probably no difference between "storing transactions" or "storing the sum of all those transactions" ?!?!?

Except that those transactions are "hashed into a block chain".

Well the same can be done with a "balance chain".

Perhaps it should also become a "racing balance chain" where the longest balance chain wins.

The difference is however: the balance chain is much harder to calculate.

Also the balance chain lags behind the transaction chain, by for example 1000 or 2000 blocks.

So the transaction chain gets a chance to stabilize, so the balance chain can work on stabilized transaction/block data.
Akemashite Omedetou
Member
**
Offline Offline

Activity: 84
Merit: 13



View Profile WWW
November 25, 2011, 09:53:01 PM
 #13

Quote
3. Instead of storing the transactions, the balances are stored, which could cut back on data.

This would require quite a rework on the entire bitcoin network.
Right now it is based on transactions not just for the sake of it, but for flexibility, which is achieved with scripts. Basically a transaction can have many different inputs and many different outputs, and many different conditions that will have to be met to claim the outputs. Just throwing this all away and simply storing balances of each address would not be compatible with this and would mean we would have to rework the whole concept.
Cool idea though.

Bitcoin Fog: Secure Bitcoin Anonymization

---
Creedy: Die! Die! Why won't you die?... Why won't you die?
V: Beneath this mask there is more than flesh. Beneath this mask there is an idea, Mr. Creedy, and ideas are bulletproof.
notme
Legendary
*
Offline Offline

Activity: 1904
Merit: 1002


View Profile
November 25, 2011, 10:06:26 PM
 #14

To be clear, merkle pruning only applies to clients that need to conserve space.  You still need the entire chain to mine.

https://www.bitcoin.org/bitcoin.pdf
While no idea is perfect, some ideas are useful.
piuk (OP)
Hero Member
*****
expert
Offline Offline

Activity: 910
Merit: 1005



View Profile WWW
November 26, 2011, 11:34:38 AM
 #15

I've modified the original proposal to simplify the process of generating a ledger.

I don't believe the Merkle tree pruning proposal by Satoshi is workable. If a client cannot validate blocks then it cannot be trusted to have any network participation.

ThomasV
Legendary
*
Offline Offline

Activity: 1896
Merit: 1353



View Profile WWW
November 26, 2011, 11:56:42 AM
 #16

interesting

Electrum: the convenience of a web wallet, without the risks
gmaxwell
Moderator
Legendary
*
expert
Offline Offline

Activity: 4270
Merit: 8805



View Profile WWW
November 26, 2011, 12:00:58 PM
 #17

Quote
however this method still requires storage of all blockheaders, meaning there is still a unlimited cap on the blockchain size.

One hundred years of headers is about 400 mbytes of data.
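
The figure is easy to reproduce. The sketch below assumes the standard 80-byte block header and one block every ten minutes on average; the variable names are only illustrative.

```python
HEADER_BYTES = 80                 # size of one serialised block header
BLOCKS_PER_YEAR = 6 * 24 * 365    # one block per ten minutes on average
YEARS = 100

total_headers = BLOCKS_PER_YEAR * YEARS
total_bytes = total_headers * HEADER_BYTES

print(f"{total_headers} headers, {total_bytes / 2**20:.0f} MiB")
```

Header storage grows linearly at roughly 4 MB per year, regardless of how many transactions the blocks carry.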

Why are you wasting our time with this drivel?

Quote
At certain points in time the client generates a snapshot of the of every unspent tx output in the chain.

Superior versions of what you're describing have been suggested before: https://bitcointalk.org/index.php?topic=21995.0

I say superior, because arranging the ledger summary into a hash tree allows nodes to participate without even knowing the complete ledger— other nodes can present ledger fragments to them with the branches which connect the ledger to the tree root— and the whole ledger is constantly current. Bytecoin went further and suggested that you can actually flip the direction of the chain and track the ledgers alone, leaving the coin holders to track the fragments connecting their own coins to the chain.
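
To illustrate the hash tree idea, here is a small Python sketch in which a node that knows only the tree root can check a single ledger entry from a branch supplied by a peer. It is not the scheme from the linked thread: the single SHA256, the padding rule for odd levels and all function names are assumptions for this example, and the leaf list is assumed to be non-empty.

```python
import hashlib
from typing import List, Tuple


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, leaf hashes first, root last."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # pad odd levels by repeating the last node
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def branch(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, each flagged True if it sits on the left."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(leaf: bytes, path: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one ledger entry and its branch."""
    node = h(leaf)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

The branch holds one hash per tree level, so a proof for a ledger of a million entries is about twenty hashes rather than the whole ledger.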

But none of this is required because of block header sizes.
gmaxwell
Moderator
Legendary
*
expert
Offline Offline

Activity: 4270
Merit: 8805



View Profile WWW
November 26, 2011, 12:09:07 PM
 #18

Quote
To be clear, merkle pruning only applies to clients that need to conserve space.  You still need the entire chain to mine.

You can prune all the spent outputs and mine away. What you can't do is run a lite headers-only node to mine— unless you just want to mine for the subsidy and not process any transactions. Though with a commitment to open transactions in the prior block, you could mine txn which provide the connecting fragments for the prior block for all of their inputs.
P4man
Hero Member
*****
Offline Offline

Activity: 518
Merit: 500



View Profile
November 26, 2011, 12:15:53 PM
Last edit: November 26, 2011, 01:00:59 PM by P4man
 #19

Disclaimer: I have no idea what I'm talking about.

With that out of the way, would it be possible (or even a good idea) to have all clients store the most recent blocks + a random subset, say 5% of the blockchain? The entire network would still hold countless copies of the blockchain, and as the network (and blockchain) grows you could reduce the subset each client has.
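
A rough way to reason about this is the sketch below. It assumes every node picks its subset independently and uniformly at random, which is a simplification, and the function names are made up for the example.

```python
def prob_block_missing(fraction: float, nodes: int) -> float:
    """Chance that a given old block is stored by none of the nodes."""
    return (1.0 - fraction) ** nodes


def expected_copies(fraction: float, nodes: int) -> float:
    """Average number of nodes holding a given old block."""
    return fraction * nodes


for n in (10, 100, 1000):
    print(n, expected_copies(0.05, n), prob_block_missing(0.05, n))
```

Under these assumptions the chance of a block being lost falls off exponentially with the number of nodes, but with only a handful of nodes online a 5% subset leaves a real chance that some block is held by nobody.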

cbeast
Donator
Legendary
*
Offline Offline

Activity: 1736
Merit: 1014

Let's talk governance, lipstick, and pigs.


View Profile
November 26, 2011, 12:23:25 PM
 #20

Quote
Personally, I'm going to focus on solving the lack of usefulness for bitcoin before worrying about what might happen when I and others work our butts off to get us to the point where this discussion even matters.

+∞

Explosive growth in Bitcoin use = explosive growth in Bitcoin value. Internet speed and storage space are growing exponentially. As Bitcoin grows in value, miners with full clients will be connected to fiber optic networks operating at multi GB speeds and everyone else can use lite clients. There is plenty of time to explore other options.

Any significantly advanced cryptocurrency is indistinguishable from Ponzi Tulips.