Bitcoin Forum
April 23, 2024, 09:34:29 PM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1] 2 »  All
  Print  
Author Topic: Proposal to solve bitcoin scaling problem  (Read 4446 times)
bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 17, 2011, 07:00:08 PM
Last edit: June 17, 2011, 10:05:15 PM by bji
 #1

EDIT: Hello, now that this was moved from the newbie forum, I have removed the section at the beginning in which I complained about some unrelated (and probably not interesting to anybody) aspects of bitcoin.  If anyone feels it is important to see these, please let me know and I will re-post them.END EDIT

Proposal: Bitcoin Elided Blocks

Introduction:

The bitcoin protocol in its most basic form does not scale for end users.  Downloading the entire block chain currently, as of June 16, 2011, requires several hundred megabytes of data transfer, and the rate of generation of new data is increasing.  Within a year it is likely that the block chain will be over 1 GB in size.  It is impractical to expect that end users will be able to download this much data to start using bitcoin, and additionally that they will have sufficient resources to download new blocks on an ongoing basis at this increasing rate.

Furthermore, there are some current clients (smartphones as an example) for which the current multi-hundred-megabyte block chain download is already impractical.  Therefore this problem is already relevant to the success of bitcoin (how can bitcoin succeed if new users are unable to practically join?) and will only become more relevant as time goes on.  Right Now is a criticial time for bitcoin: the barriers to entry must be as minimal as possible in order for the user base to grow to a self-sustaining level.

The Problem:

Currently the bitcoin client validates transactions locally by examining the entire block chain.  This is the reason that the client must download the entire block chain - to be able to validate transactions.  The client does not need to download blocks to initiate a transaction, but even then, it is likely that the user will want to maintain an up-to-date local block chain so that they can verify that any payments they send were received.

As of this writing (June 16 2011, 7:17 pm PDT), the total number of blocks is 131356, the average transaction size is 341 and the average block size is 2063, considering the entire block chain.  However, these average values are misleading because they incorporate a large number of blocks that were generated during the first year and a half or so of bitcoin's existence where the blocks and transactions were very small as they were exclusively coin generation blocks.  Using the last 1,000 blocks yields an average transaction size of 419 and an average block size of 22673.

Thus, currently the total size of all blocks is 270987428 bytes (258.433 MiB); if all blocks were as large as the average over the last 1,000 blocks, then the total size would be 2978234588 bytes, or 2.773 GiB.  Even at today's relatively low volume, less than two more years of blocks would occupy almost 3 GiB.  It is clearly impractical for all clients to download and store all of this data.

A Proposed Solution:

A proposed solution is for clients to store "elided blocks".  "Elided blocks" are blocks for which transactions not of interest have been stripped out, leaving only transactions interesting to the client.  For most blocks, this will be all transactions, reducing the block to 80 bytes.  Using the previous example of 131356 blocks, this reduces the storage to a lower bound of 10.021 MiB.  Of course, some blocks will have transactions of interest, but these should add an average only the sizes of the transactions themselves plus the Merkle hash tree entries for that transaction; 1000 bytes seems generous given a 419 byte average transaction size.  Thus a user with 10 active bitcoin addresses, each of which requires a chain of 100 transactions to verify their contents, might need only an extra 0.954 MiB of data.

So how do clients acquire "elided blocks"?  For each client, the transactions of interest are unique, so each client's set of elided blocks is different.  Therefore it is not possible to store elided blocks ahead of time in a shared pool; they must be fabricated on demand for each client.

One way to do this would be for each client to download the entire block chain and then elide transactions of interest.  However, this leaves some problems unsolved:

1. The client still must download the entire block chain as input to the eliding operation.

2. The client would need to re-download the entire block chain whenever a new bitcoin address became interesting to the client, in order to re-elide with the new set of interesting transactions.

This solution is clearly not appropriate as it does not solve one of the fundamental problems (downloading the entire block chain) and it adds an additional problem of requiring re-downloading of the entire block chain over and over again with each new bitcoin address of interest.

A better solution would be for a client to ask its peers for "elided blocks", which would allow the client to download elided blocks in a progressive manner.  The client would indicate bitcoin addresses of interest and its peer would provide the elided blocks.  The client could come back later and query for all new elided blocks since the last time that the client queried in order to acquire the most up-to-date transactional information for bitcoin addresses of interest.  The client could also come back later and query about an entirely new bitcoin address and receive only the elided blocks for that address.

Note that the client must always receive enough information in order to verify the transactions itself.  It is not asked to trust the peer's verification of a transaction chain; instead it must only trust that the peer is delivering accurately elided versions of the true block chain blocks, and it would do the verification itself.

What is to prevent a peer from fabricating elided blocks in response to client queries?  The client simply needs to validate each block header against the set of block headers that it acquired from the peer ahead of time (which themselves should be validated against other sources of the block headers), in order to ensure that the peer could not have fabricated any elided block on demand, as it would be impossible for the peer to fabricate a block and yet have the block header match the true bitcoin block header for that block.

Client Identification:

Because there are costs of storing the entire block chain and eliding blocks on demand, peers are given the option to charge for this service.  The means for accepting payment and verifying account balances are outside of the scope of the bitcoin protocol; but the protocol does include the key ingredient necessary to facilitate such a system: the requests may include an optional bitcoin address and be signed by that address, in order to associate the request with an account, that account being known to the peer by the bitcoin address that pays into it.  This is optional - peers can be completely free and never require signed requests, or can be completely for-hire and only accept signed requests, or accept both kinds of requests and give priority to paid accounts, or anything in between.

Postscript:

This last bit, where I talk about client signed requests, is part of the proposal.  I think that all client requests ought to be optionally signable so that bitcoin peers can decide who they want to talk to (i.e. who has paid them for the service and is carrying a positive balance) and optionally charge.  I think this should be some fundamental part of the protocol that can apply (at the client's discretion and the peer's acceptance) to any protocol message.

I was going to write up the protocol messages that would be added to the bitcoin protocol (to be added to the Bitcoin Protocol wiki page), but I'm not going to bother.  Like I said, maybe some developer will find what I have proposed useful, if not then ... please realize that the problem does need to be solved, even if what I have proposed is not the best way.  But please take my advice: you have to find a way now to solve the current bitcoin client data scalability problem.  In a matter of months, bitcoin clients as they currently stand will be impractical due to the ever-growing size of the block chain.
1713908069
Hero Member
*
Offline Offline

Posts: 1713908069

View Profile Personal Message (Offline)

Ignore
1713908069
Reply with quote  #2

1713908069
Report to moderator
1713908069
Hero Member
*
Offline Offline

Posts: 1713908069

View Profile Personal Message (Offline)

Ignore
1713908069
Reply with quote  #2

1713908069
Report to moderator
1713908069
Hero Member
*
Offline Offline

Posts: 1713908069

View Profile Personal Message (Offline)

Ignore
1713908069
Reply with quote  #2

1713908069
Report to moderator
The network tries to produce one block per 10 minutes. It does this by automatically adjusting how difficult it is to produce blocks.
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
1713908069
Hero Member
*
Offline Offline

Posts: 1713908069

View Profile Personal Message (Offline)

Ignore
1713908069
Reply with quote  #2

1713908069
Report to moderator
1713908069
Hero Member
*
Offline Offline

Posts: 1713908069

View Profile Personal Message (Offline)

Ignore
1713908069
Reply with quote  #2

1713908069
Report to moderator
dontListen2me
Newbie
*
Offline Offline

Activity: 14
Merit: 0


View Profile
June 17, 2011, 07:32:18 PM
 #2

not to be harsh, you sound like a idiot.
the "early" in "early-adopters" was two years ago
what the hell is an elid? don't be a troll Smiley

btw, I have a word in my brain called "Netflix"
bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 17, 2011, 09:08:26 PM
 #3

not to be harsh, you sound like a idiot.

Contradiction?

Quote
the "early" in "early-adopters" was two years ago

OK, if you say so.

Quote
what the hell is an elid? don't be a troll Smiley

Look up the word 'elide' in a dictionary.

Quote
btw, I have a word in my brain called "Netflix"

I don't understand what you mean.
willphase
Hero Member
*****
Offline Offline

Activity: 767
Merit: 500


View Profile
June 17, 2011, 09:15:16 PM
 #4

OP: go read the bitcoin paper. The block chain is intentionally stored as a merkle tree for the very reasons you describe. The paper also describes how a low resource client can trim the tree and yet atill maintain security.

It just happens that the official bitcoin client doesn't do this yet but I expect other clients e.g. bitcoinj that are specifically designed for for resource constrained platforms to implement these optimisations soon enough.

Will

bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 17, 2011, 09:19:11 PM
 #5

OP: go read the bitcoin paper. The block chain is intentionally stored as a merkle tree for the very reasons you describe. The paper also describes how a low resource client can trim the tree and yet atill maintain security.

It just happens that the official bitcoin client doesn't do this yet but I expect other clients e.g. bitcoinj that are specifically designed for for resource constrained platforms to implement these optimisations soon enough.

Will

Please read the paper yourself.  The paper doesn't say anything about the problem of distributing the Merkle tree to clients.  That is what my proposal is about.  With the current client and the current protocol, each client will have to download the entire block chain even if they only store the transactions, Merkle trees, and block headers they care about.  This is infeasable and is going to be a big problem within 6 months to 1 year.

Or perhaps there is something obvious I didn't see in the white paper?  Can you explain how clients are going to trim their tree without first downloading gigabytes of data once the block chain is that big?
willphase
Hero Member
*****
Offline Offline

Activity: 767
Merit: 500


View Profile
June 17, 2011, 09:31:04 PM
 #6

To verify a transaction the client doesn't need the whole block chain. It merely stats listening for blocks and waits until it sees a block with the transaction in it. Then it merely has to watch for all subsequent blocks and ensure that those blocks when attached to the block containing the transaction remain the longest chain. An attacker would have to create 'valid' blocks on the block containing the invalid transaction at a faster rate than the rest of the network is generating board blocks on the valid chain in order for the transaction to remain valid.

Will

Maged
Legendary
*
Offline Offline

Activity: 1204
Merit: 1015


View Profile
June 17, 2011, 09:38:26 PM
 #7

https://en.bitcoin.it/wiki/Scalability

bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 17, 2011, 09:41:36 PM
 #8

To verify a transaction the client doesn't need the whole block chain. It merely stats listening for blocks and waits until it sees a block with the transaction in it. Then it merely has to watch for all subsequent blocks and ensure that those blocks when attached to the block containing the transaction remain the longest chain. An attacker would have to create 'valid' blocks on the block containing the invalid transaction at a faster rate than the rest of the network is generating board blocks on the valid chain in order for the transaction to remain valid.

Will

There will come a time when just listening for blocks is too expensive.  Right now the average block size is something like 10K and that's with very few transactions per second.  Can end users really expect to put their mouth over the bitcoin firehose when the average block size is megabytes?  Can end users ever reasonably be expected to have to watch every transaction that goes by just to pick out their own?

It's working now because the transaction counts are so low.  It will not work at all once bitcoin starts being used 'for real'.

This is why I propose a standardized protocol to allow clients to query for just the transactions, Merkle trees, and block headers that they need, iteratively, and allow for a way for clients to pay peers for delivering these costly pieces of data to them.
willphase
Hero Member
*****
Offline Offline

Activity: 767
Merit: 500


View Profile
June 17, 2011, 09:50:30 PM
 #9

yoir your proposal had merit but I think such a fundamental change to the basic bitcoin message is unlikely to occur with the protocol already at the maturity that it's at. I think solutions to this problem are more likely to build on the existing ideas in the bitcoin paper e.g. tree pruning rather than introducing the new elided block construct.

It's a pity this thread is in the newbie forum where not many people will read it. Hopefully the discussion here will promote you out of noob purgatory and you can move this discussion elsewhere.

Will

bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 17, 2011, 09:50:59 PM
 #10


That document only addresses network requirements for future 'supernodes'.  It already implicitly acknowledges that most future peers will not be able to listen to the entire block stream due to bandwidth restrictions.  So there has to be some way for clients to get the transactions, Merkle tree nodes, and block headers that they need to validate transactions of interest to them.  The document you linked to does not in any way address this problem.

What I am suggesting is a solution to the problem that would be built into the bitcoin protocol and allow end users to get the data that they need to continue to be an independent bitcoin 'node' (and end-user node, not a 'supernode') without leaning on any service except one which provides more targeted data for their needs.

An alternative is to not have something like this and then end users will simply not be able to run bitcoin nodes as we know them - they'll have to lean on a 'bank' to act as a proxy for them, and have to trust the bank to verify transactions because the client itself will not have access to the blocks.

I don't know why anyone would prefer to relegate normal users to second-class bitcoin citizens, rather than building into the protocol mechanisms that allow every user to validate and verify their own transactions as seems to have been intended by the bitcoin designer(s).   What I propose allows the latter.

Of course, this could all not be standardized either and end-users would just have to use a different client other than the standard client in order to use a more intelligent service and not have to download gigabytes of data.  But I think it would be better to standardize it in the client and protocol now for the benefit of everyone.
bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 17, 2011, 09:52:50 PM
 #11

yoir your proposal had merit but I think such a fundamental change to the basic bitcoin message is unlikely to occur with the protocol already at the maturity that it's at. I think solutions to this problem are more likely to build on the existing ideas in the bitcoin paper e.g. tree pruning rather than introducing the new elided block construct.

It's a pity this thread is in the newbie forum where not many people will read it. Hopefully the discussion here will promote you out of noob purgatory and you can move this discussion elsewhere.

Will


I agree that the proper solution is tree pruning (which I am calling eliding, but we're talking about the same thing).  I'm just proposing that a standardized mechanism be built for allowing a peer with access to the entire blocks to do the pruning on behalf of another peer, instead of expecting each peer to download all of the blocks and do their own pruning, because the latter simply does not scale.

Yours are the first encouraging words I have gotten about my idea though, and thanks for that.  Maybe I will try to take this to a better forum when/if I lose my newbie status.
willphase
Hero Member
*****
Offline Offline

Activity: 767
Merit: 500


View Profile
June 17, 2011, 10:00:36 PM
 #12

The difference is that your 'elided blocks' ate fundamentally different from the existing bitcoin blocks and would be a major change. However just restructuring the existing tree structures in memory are a natural extension of the existing protocol and are even mentioned in the bitcoin paper. I think by the time this is an issue most devices even smartphones wil be able to afford the 1 block every ten minutes for 6 blocks to verify a transaction from zero knowledge, with the longest chain potentially kept in a rolling buffer of linked blocks.

Even with your 25k blocks... That's less than 50 bytes/s required.

Will

4n0n
Newbie
*
Offline Offline

Activity: 14
Merit: 0


View Profile
June 17, 2011, 10:10:53 PM
 #13

I don't think the pathetically low mobile data allowances will keep up with the popularity of Bitcoin - if it becomes popular, which it isn't at the moment.   I mean, let's face it, if PayPal is Visa's baby cousin, Bitcoin is an amoeba.

The question is, does Bitcoin scale to 1,000,000 times the transactions it has today?  Clearly not, as a retailer who has been waiting now 20 hours for his first purchase of Bitcoin currency to come through, the current system has to be improved let alone the future version. 

Whether or not the OP's suggestions require a "major change" to the protocol or not, the reality is that it improvements need to be made and that Moore's Law does not apply to bandwidth - especially on mobile devices.
bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 17, 2011, 10:13:45 PM
 #14

The difference is that your 'elided blocks' ate fundamentally different from the existing bitcoin blocks and would be a major change. However just restructuring the existing tree structures in memory are a natural extension of the existing protocol and are even mentioned in the bitcoin paper. I think by the time this is an issue most devices even smartphones wil be able to afford the 1 block every ten minutes for 6 blocks to verify a transaction from zero knowledge, with the longest chain potentially kept in a rolling buffer of linked blocks.

Even with your 25k blocks... That's less than 50 bytes/s required.

Will

The current blocks are 10K because the transaction limit is set to 7 per second (at least according to the wiki) and even then the number of transactions per second is far lower than this on average.  If there were even 100 transactions per second on average, then the block size would then be 22.397 MiB.  Is it really practical to have every client sucking down 23 MiB every 10 minutes?  That's 3.29 GiB per day.  What about periods of really heavy activity - what if, during the holiday season, the number of transactions averages 500 per second?   That's over 16 GiB every day - for every single client, including cell phones.  This just does not scale.  Even 10 transactions per second is 345 MB every day.  I don't think that every client should be expected to download that much.
bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 17, 2011, 10:22:45 PM
 #15

The difference is that your 'elided blocks' ate fundamentally different from the existing bitcoin blocks and would be a major change.

Exactly right.  The protocol does not scale for all clients (maybe for super nodes that can download and process dozes of GiB per day), and a major change is needed.  I wouldn't even propose a destructive change, just an addition:

Add a new 'service', number 2, called 'NODE_ELIDED' (or NODE_PRUNED if others prefer).  This service would enable the client to send queries for elided (pruned) blocks - specify a set of bitcoin address of interest, and a start and end block hash, and get back a new message called 'elided_block' that is like 'block' but has uninteresting transactions, and subtrees of the Merkle tree, elided out.

Also, add a mechanism whereby clients can specify a bitcoin address (or public key or whatever works) and sign the request so that peers can identify who to 'charge' for servicing the request should that be what the server wants to do.  This would be optional but it is absolutely inevitable that the distribution of blocks will eventually cost money as there is so much bandwidth (and CPU) involved in creating and distributing elided blocks.  Supporting that in the protocol would empower end users to easily switch between elided block providers instead of being tied into something proprietary.
willphase
Hero Member
*****
Offline Offline

Activity: 767
Merit: 500


View Profile
June 17, 2011, 10:47:40 PM
 #16

How would the client verify the hash on the elided block, given that the hash of a block is a function of all the whole transactions contained within?

Will

bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 17, 2011, 11:26:15 PM
 #17

How would the client verify the hash on the elided block, given that the hash of a block is a function of all the whole transactions contained within?

Will

Keep in mind that what I call an 'elided' block is exactly the same thing as a 'pruned' block as described in the white paper.  The only difference is that a peer 'prunes' it before delivering it to the interested client, instead of the client pruning it itself.

As the white paper describes, an 'elided' (pre-pruned) block would have the block header as normal, but would simply not have transactions not of interest in it.  The data structure would be very similar but would have a way to indicate that some entire branches of the Merkle tree are pruned.  The block would still have enough of the Merkle tree in it to allow the client to verify the hashing at every level of the Merkle tree, and finally, of the block as a whole.

The protocol request that I would expect end-user clients to make is 'please give me all transactions for this set of bitcoin addresses beginning at this block hash and ending at this block hash'.  This would imply that the end-user's client already knows the entire block chain headers but these are easy and cheap to acquire ahead of time, so that it could verify that the elided block thus delivered does indeed have the correct block hash.  The peer's response would be a set of blocks that it detected had transactions relevant to the bitcoin address (maybe only the most recent one would be necessary in most cases) and return them in the elided form I have described.

In this way, clients would on-demand use network bandwidth, and only a small amount at that, instead of using a continuous and ever-growing amount of network bandwidth to monitor blocks for transactions of interest.

The load put onto a peer to satisfy elided block requests (both looking up the blocks that contain transactions of interest, and composing the elided form of the block for delivery to the client) would not be insignificant, which is why I propose a built-in mechanism for identifying the client such that the peer could, outside of the scope of the bitcoin protocol, charge the client for this service.

I could start actually implementing this if the consensus is that it is a good idea.  Of course I would expect some bitcoin to come my way at 14PtQ2dHfjftuDuRkx5jnUgGrvgYCGaVh5 before I'd invest the effort Smiley  Not being an early adopter, i.e. loaded with bitcoins (I have all of 3, and I paid nearly $100 for them), I don't have alot of incentive to go much further beyond talking about this ...

defxor
Hero Member
*****
Offline Offline

Activity: 530
Merit: 500


View Profile
June 19, 2011, 01:48:56 PM
 #18

A better solution would be for a client to ask its peers for "elided blocks", which would allow the client to download elided blocks in a progressive manner.  The client would indicate bitcoin addresses of interest and its peer would provide the elided blocks.

I might misunderstand, but it sounds like there's a leak of pseudoanonymity at this stage. The peers would not only be able to associate IPs with specific addresses but also which addresses [likely] belong to the same wallet/user.

Bitcoin users shouldn't trust their peers Wink


Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1128


View Profile
June 19, 2011, 02:31:26 PM
 #19

Hi,

This topic was already discussed back in May. There's no need for payment schemes or other complexity.

   http://forum.bitcoin.org/index.php?topic=7972.0

I will add something to the scalability page about this.
bji (OP)
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 19, 2011, 04:58:59 PM
 #20

A better solution would be for a client to ask its peers for "elided blocks", which would allow the client to download elided blocks in a progressive manner.  The client would indicate bitcoin addresses of interest and its peer would provide the elided blocks.

I might misunderstand, but it sounds like there's a leak of pseudoanonymity at this stage. The peers would not only be able to associate IPs with specific addresses but also which addresses [likely] belong to the same wallet/user.

Bitcoin users shouldn't trust their peers Wink




It is inevitable.  Absolutely INEVITABLE.  There is no way that all peers are going to be able to suck down all of the blocks all of the time when the block chain gets large enough (I predict this will happen within a year).  Thus, there is going to have to be some kind of alternate block delivery mechanism, and there will be a definite cost to being a provider of the block chain, and those bearing that cost will have to charge for it.  Maybe people will just use bitcoin 'banks' that they keep their wallet in, in which case they're going to go far beyond the minimal loss of pseudoanonymity that you have pointed out.  Or maybe they will keep their wallet private but will pay a proxy to deliver blocks of interest to them, which is basically my proposal already, except using a proprietary and likely closed protocol, and I cannot see the advantage of closed protocols over open ones.
Pages: [1] 2 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!