Mike Hearn (OP)
Legendary
Offline
Activity: 1526
Merit: 1128
October 15, 2012, 10:14:27 PM
I rewrote the scalability wiki page with the results of the latest work: http://en.bitcoin.it/wiki/Scalability

I also simplified it by taking into account very simple optimizations that were already done or partially prototyped, and removed the discussion of sharded supernodes: with ultraprune and more up-to-date OpenSSL performance figures, it no longer seems necessary to shard a node over multiple machines.
Etlase2
October 16, 2012, 06:47:11 AM
Can you fix the "18 different people edited this" format?

"VISA handles on average around 2,000 transactions/sec, so call it a daily peak rate of 4,000/sec" ... "Let's take 4,000 tps as starting goal." ... "Let's assume an average rate of 2000tps, so just VISA."

"That means that you need to keep up with around 8 megabits/second of transaction data (2000tps * 512 bytes) / 1024 bytes in a kilobyte / 1024 kilobytes in a megabyte = 0.97 megabytes per second * 8 = 7.8 megabits/second. This sort of bandwidth is already common for even residential connections today, and is certainly at the low end of what colocation providers would expect to provide you with."

If the network were to fail miserably, this is accurate. Otherwise everyone needs more than just a single downstream connection.

"As of October 2012 (block 203258) there have been 7,979,231 transactions, however the size of the unspent output set is less than 100MiB"

You need to back this up, because from what I recall estimates were between 70-80% of the current block chain's size, which, even today, is definitely not 100MB.

"Only a small number of archival nodes need to store the full chain going back to the genesis block. These nodes can be used to bootstrap new fully validating nodes from scratch but are otherwise unnecessary."

Yeah, unnecessary if there's a monopoly on mining and validating.
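The wiki arithmetic quoted above can be sanity-checked in a few lines (a sketch using only the quoted assumptions of 2000 tps and 512 bytes per transaction):

```python
# Back-of-the-envelope check of the wiki's bandwidth figure.
# Inputs are the assumptions quoted from the wiki, not measurements.
tps = 2000
tx_bytes = 512

bytes_per_sec = tps * tx_bytes                  # 1,024,000 bytes/s
mbytes_per_sec = bytes_per_sec / (1024 * 1024)  # ~0.98 MiB/s
mbits_per_sec = mbytes_per_sec * 8              # ~7.8 Mbit/s

print(f"{mbytes_per_sec:.2f} MiB/s = {mbits_per_sec:.1f} Mbit/s")
```

This reproduces the wiki's ~7.8 Mbit/s figure for pure downstream transaction data; it says nothing about upload, which is the point under dispute.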
Mike Hearn (OP)
October 16, 2012, 10:02:36 AM
I made the transactions/sec -> tps notation more consistent. Of course you could have done that yourself, it being a wiki.

"If the network were to fail miserably, this is accurate. Otherwise everyone needs more than just a single downstream connection."

I don't understand what you're talking about here.

"You need to back this up, because from what I recall estimates were between 70-80% of the current block chain's size, which, even today, is definitely not 100MB."

It's backed up by reality. Check out Pieter's ultraprune branch and dump the stats from it yourself. Or you could just ask Pieter himself.
Etlase2
October 16, 2012, 10:30:00 AM
"I don't understand what you're talking about here."

Is it a peer-to-peer network, or is it bitcoin visa?

"It's backed up by reality. Check out Pieter's ultraprune branch and dump the stats from it yourself. Or you could just ask Pieter himself."

It is a wiki, why not put a link in?
davout
Legendary
Offline
Activity: 1372
Merit: 1007
1davout
October 16, 2012, 10:37:15 AM
It is a wiki, why not put a link in?
Go for it.
Etlase2
October 16, 2012, 10:38:12 AM
How about because I don't know the link?
Mike Hearn (OP)
October 16, 2012, 10:41:28 AM
There is no link. The stats come from measurements taken from the software.
fornit
October 17, 2012, 05:36:15 PM
With 0.5 GB blocks, the block chain will grow by 25 TB every year. That's 75 million 5 1/4" floppy disks, more than even the most modern C64s can handle. Therefore, bitcoin is doomed to fail.
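For what it's worth, the joke's numbers roughly check out (a sketch assuming 0.5 GB per block, one block every 10 minutes, and 360 KB per 5.25" floppy):

```python
# Rough check of the growth figures above. The per-block size and
# floppy capacity are the assumptions stated in the post, not facts
# about any real deployment.
gb_per_block = 0.5
blocks_per_year = 6 * 24 * 365                       # 52,560 blocks/year
tb_per_year = gb_per_block * blocks_per_year / 1000  # ~26 TB/year
floppies = tb_per_year * 1e12 / 360e3                # ~73 million disks

print(f"{tb_per_year:.1f} TB/year, about {floppies / 1e6:.0f} million floppies")
```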
bfever
Jr. Member
Offline
Activity: 39
Merit: 1
October 17, 2012, 08:37:37 PM
"As of October 2012 (block 203258) there have been 7,979,231 transactions, however the size of the unspent output set is less than 100MiB"

"You need to back this up, because from what I recall estimates were between 70-80% of the current block chain's size, which, even today, is definitely not 100MB."

You probably recall the correct percentages (70-80%), but that's the percentage of spent outputs, which could be forgotten. I can confirm the following around block 202287 (using my BiRD client, which only keeps track of unspent transactions):
- 2,443,854 unspent transaction outputs, which is about 30% of all transactions (see 7.9M txs at block 203258)
- the MySQL database containing all the necessary data to be able to spend those outputs, i.e. creating a valid (unsigned) tx, is about 316 MB in size when converted to an uncompressed CSV dump (simple text file)
- compressing this CSV file yields only 110 MB of data.
You can download the client and some CSVs here.
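The "about 30%" claim follows directly from the two counts given in the post (a sketch using only those figures):

```python
# Unspent fraction implied by bfever's numbers (taken from the post above).
unspent_outputs = 2_443_854    # around block 202287
total_txs = 7_979_231          # as of block 203258

fraction = unspent_outputs / total_txs
print(f"{fraction:.1%}")       # ~30.6%, matching the "about 30%" claim
```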
Pieter Wuille
October 17, 2012, 08:56:27 PM
Ultraprune's unspent transaction output database is around 120 MB in size now (including LevelDB indexes/overhead). Compressed, something around 80-85 MB.
I do Bitcoin stuff.
Etlase2
October 17, 2012, 09:00:15 PM
I meant to say a 70-80% reduction, or 20-30% of the current size. Flipperfish's link would be pretty much all that is necessary to avoid doubt in the future.
The bandwidth section is still misleading though.
Mike Hearn (OP)
October 19, 2012, 09:44:16 AM
You still haven't explained why it's misleading. Why don't you put a proposed rewrite of the bandwidth section here so we can read it?
davout
October 19, 2012, 10:03:06 AM
"Ultraprune's unspent transaction output database is around 120 MB in size now (including LevelDB indexes/overhead). Compressed, something around 80-85 MB."
You're a rockstar.
caveden
Legendary
Offline
Activity: 1106
Merit: 1004
October 19, 2012, 10:11:22 AM
You still haven't explained why it's misleading. Why don't you put a proposed rewrite of the bandwidth section here so we can read it?
I think he means that it's misleading because it only considers the download side. If you're a full node you're also expected to be a relay, so you might have to upload what you download too. It's hard to estimate how much you'll need to upload, since you don't know how many of your peers will receive the data before you're ready to send it to them.

Assuming only pool operators are full nodes and they're all interconnected, then you'll only have to upload transactions when the sender is directly connected to you and not to other full nodes. In this case you'd upload them as many times as you have full node peers. If neither the sender nor the receiver is connected to you (I'm assuming every thin client is using bloom filters), you may not need to relay the transaction at all.

But honestly, if nothing is done to create monetary incentives for relays, I believe those Microsoft researchers might be right and eventually full nodes will not relay transactions between themselves. They have no interest, actually a negative interest, in doing so. In such a scenario (which is not that bad, btw), thin clients would do better to attempt to connect to every full node they manage to find.

Assuming this is the actual scenario, then perhaps we can estimate the total upload a full node would have to handle as the number of transactions times the average rate of false positives thin clients request. Say, if everybody requests 99 false positives for each relevant transaction, then a full node would likely upload 100 times what it downloads. But we should also consider that thin clients have an interest in dispersing their bloom filters in ways that no full node has the entire set. That would reduce the tps rate accordingly.
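caveden's "100 times what it downloads" estimate can be sketched numerically. The false-positive ratio and the download rate below are illustrative assumptions from the thread, not measurements:

```python
# Sketch of the bloom-filter upload amplification estimate.
# Hypothetical inputs: the wiki's ~7.8 Mbit/s downstream figure and
# caveden's "99 false positives per relevant transaction" example.
download_mbit_s = 7.8
false_positives_per_hit = 99

upload_factor = 1 + false_positives_per_hit     # each real tx plus 99 decoys
upload_mbit_s = download_mbit_s * upload_factor

print(f"upload factor {upload_factor}x -> ~{upload_mbit_s:.0f} Mbit/s upstream")
```

Even as a rough bound, this shows why the wiki's download-only figure understates what a relaying full node would need.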
Pieter Wuille
October 20, 2012, 01:58:59 PM
"Ultraprune's unspent transaction output database is around 120 MB in size now (including LevelDB indexes/overhead). Compressed, something around 80-85 MB."

Around block 204149:

LevelDB database:
$ du -sh ~/.bitcoin/coins/
117M    /home/pw/.bitcoin/coins/

Raw UTXO data (not directly usable):
$ ls -1hs utxo*
101M utxo.dat
 72M utxo.dat.7z
 94M utxo.dat.bz2
 98M utxo.dat.gz
 70M utxo.dat.lzma
fornit
October 20, 2012, 02:18:16 PM
One tiny question regarding ultraprune: are clients using ultraprune able to share their dataset at all? Do they share with other clients using ultraprune? Or, for example, do they store recent blocks and transmit them to full clients as well?
Pieter Wuille
October 20, 2012, 02:32:10 PM
I probably should have changed the name ultraprune a long time ago, as it is somewhat confusing. It does not prune blocks or transactions, and implements a full node. What it does is use an ultra-pruned copy of the block chain (in addition to the normal blk000*.dat files) for almost all operations, making it significantly faster. It also removes the need for a transaction index (so no blkindex.dat anymore). For serving blocks to other nodes, for rescanning, and for reorganisations it still needs the normal blocks to be present.
That said, this model of working will allow block pruning to be implemented relatively easily. It's almost trivial to do - just delete the block files that you think you won't need anymore for serving, rescanning or reorganising - but having such nodes on the network may have a very significant effect on the system. Doing this will be implemented, but needs some discussion first.
Somewhat longer term, I think we'll see a split between "fully validating nodes" and "archive nodes", where only the latter would serve arbitrary blocks to the network. This may be a problem (because fewer nodes serve all blocks), or it may improve things (as the nodes who still do serve the blocks are those who choose to and have the bandwidth for it).
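The "almost trivial" pruning Pieter describes could look something like this sketch. The file layout and the number of files kept are assumptions for illustration, not bitcoind behaviour:

```python
# Hypothetical sketch of naive block-file pruning: delete old blk*.dat
# files, keeping only the newest few for serving, rescanning and
# reorganisations. Real pruning would need the discussion Pieter
# mentions (network effects, reorg depth, serving guarantees).
import glob
import os

def prune_block_files(datadir, keep=2):
    """Delete all but the newest `keep` blk*.dat files; return deleted paths."""
    files = sorted(glob.glob(os.path.join(datadir, "blk[0-9]*.dat")))
    victims = files[:-keep] if keep > 0 else files
    for path in victims:
        os.remove(path)
    return victims
```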