bytemaster (OP)
|
|
June 09, 2013, 05:14:22 AM Last edit: June 14, 2013, 02:34:11 AM by bytemaster |
|
I would like to see a graph (and data table) of the total number of 'unspent outputs' over time, perhaps sampled once per block.
It seems that every transaction with change turns at least 1 input into 2 outputs, but how often do transactions 're-combine' outputs by spending several of them as inputs? How much would the blockchain compress if we only had to store the unspent outputs?
I would guess that the number of unspent outputs will grow in proportion to the user base.
|
|
|
|
trout
|
|
June 09, 2013, 11:46:54 AM |
|
The reference client does a nice job of combining small outputs as inputs to new transactions. That is, as far as I understand, it makes some effort to minimize both the UTXO set and the fee the user has to pay.
|
|
|
|
bitspill
Legendary
Offline
Activity: 2087
Merit: 1015
|
|
June 13, 2013, 08:56:44 AM |
|
Are you looking for something along the lines of this?
[graph image]
Textbox intentionally placed over a portion of the graph until complete, so we don't have anyone trying to crop out the graph and claim it.
|
|
|
|
bytemaster (OP)
|
|
June 13, 2013, 11:53:30 AM |
|
That graph looks interesting. Could you tell me how you calculated the numbers (only so that I can be sure these are the numbers I was looking for)?
For now I will assume you have won the bounty, provided your methodology is accurate and you send me that XLS sheet. So don't worry about anyone else trying to copy it.
|
|
|
|
bitspill
|
|
June 13, 2013, 04:19:46 PM |
|
The data was obtained by looping over every block: for each transaction, its inputs are removed from a list and then the transaction is added to the list. After each block I save how many transactions are in the list.
More specifically, it is a modified version of mb300sd's code: https://github.com/mb300sd/Bitcoin-Tool/blob/master/Bitcoin%20Tool/Apps/ComputeUnspentTxOuts.cs
In the loop iterating over the blocks I save the current count to an array, and I print that array to a .csv file at the end.
Edit: Forgot to mention that I will send the Excel file when I get back to my computer later tonight.
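The loop described above can be illustrated with a minimal Python sketch. This is not the linked C# tool. It assumes a simplified, hypothetical transaction model (an id, a list of spent `(txid, output_index)` pairs, and an output count), and it tracks individual outputs rather than whole transactions. The function name and the toy chain are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical, simplified transaction model (not the real block format):
# (txid, list of (txid, output_index) pairs being spent, number of outputs).
Tx = Tuple[str, List[Tuple[str, int]], int]


def unspent_counts_per_block(blocks: List[List[Tx]]) -> List[int]:
    """Return the size of the unspent-output set after each block."""
    unspent = set()  # holds (txid, output_index) pairs
    counts = []
    for block in blocks:
        for txid, inputs, n_outputs in block:
            # Remove every output this transaction spends ...
            for spent in inputs:
                unspent.discard(spent)
            # ... then add the outputs it creates.
            for index in range(n_outputs):
                unspent.add((txid, index))
        # One data point per block, as in the .csv described in the post.
        counts.append(len(unspent))
    return counts


# Toy chain: three blocks, each with a coinbase (no inputs, one output).
toy_chain = [
    [("coinbase0", [], 1)],
    [("coinbase1", [], 1), ("a", [("coinbase0", 0)], 2)],
    [("coinbase2", [], 1), ("b", [("a", 0), ("a", 1)], 1)],
]
counts = unspent_counts_per_block(toy_chain)
print(counts)  # [1, 3, 3]
```

In the toy chain, transaction `a` splits one output into two (payment plus change), and transaction `b` re-combines both into one, so the count stays flat in the last block even though a new coinbase output was created.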
|
|
|
|
bitspill
|
|
June 13, 2013, 06:52:54 PM |
|
The spreadsheet is linked below. One thing to note: the data I have only goes up through block 237,270, as I do not currently have the entire blockchain downloaded (slow internet sucks). I have bitcoin-qt downloading more and plan to update when the entire chain is downloaded.
unspent.xlsx (5.2 MB) https://mega.co.nz/#!aMBACaba!N402Lnf1DyK-ZCxIAEAkUjcL6WFkJrRSqChpI0L7N0c
Edit: It seems the forum bbcode is hating on that link with the ! in it, so you will need to copy-paste it.
|
|
|
|
jackjack
Legendary
Offline
Activity: 1176
Merit: 1280
May Bitcoin be touched by his Noodly Appendage
|
|
June 13, 2013, 07:21:08 PM |
|
Quote from: bitspill on June 13, 2013, 08:56:44 AM
Textbox intentionally placed over portion of graph until complete so we don't have anyone trying to crop out the graph and claim it
You made the xls file public anyway, so I don't get the point.
|
Own address: 19QkqAza7BHFTuoz9N8UQkryP4E9jHo4N3 - Pywallet support: 1AQDfx22pKGgXnUZFL1e4UKos3QqvRzNh5 - Bitcointalk++ script support: 1Pxeccscj1ygseTdSV1qUqQCanp2B2NMM2 Pywallet: instructions. Encrypted wallet support, export/import keys/addresses, backup wallets, export/import CSV data from/into wallet, merge wallets, delete/import addresses and transactions, recover altcoins sent to bitcoin addresses, sign/verify messages and files with Bitcoin addresses, recover deleted wallets, etc.
|
|
|
bytemaster (OP)
|
|
June 13, 2013, 07:46:03 PM |
|
He could post it because I had already (conditionally) given him credit for the bounty, so no one could steal it.
bitspill: I will throw in an extra 0.05 BTC if you rerun your script and include the 'total outputs' in addition to the 'unspent' outputs.
|
|
|
|
jackjack
|
|
June 13, 2013, 07:59:12 PM |
|
Ah yeah I forgot about the bounty, sorry
|
|
|
|
bytemaster (OP)
|
|
June 14, 2013, 01:39:44 AM |
|
Could you send me your bitcoin address in a form I can copy and paste, rather than in your graphic?
|
|
|
|
bytemaster (OP)
|
|
June 14, 2013, 01:44:32 AM |
|
Also, could you export it as a .csv? I don't have Excel, and Numbers will not open your .xls file because it has too many rows.
|
|
|
|
bitspill
|
|
June 14, 2013, 02:02:55 AM |
|
Quote from: bytemaster on June 14, 2013, 01:39:44 AM
Could you send me your bitcoin address in a form I could copy and paste vs in your graphic?
Sure, it was in the spreadsheet: 1DtEUTEUBHrUSWTEDLz99Mrz2a2WjaVKeM
Quote from: bytemaster on June 14, 2013, 01:44:32 AM
Also, if you could export it as a .csv because I don't have .xls and Numbers will not open your .xls file because it has too many rows.
Sure can. The code exports a csv; I only opened it in Excel to make the graph. However, if the problem was the number of rows, importing from csv will still give 237k rows and will likely cause the same error.
unspent.csv (3.3 MB) https://mega.co.nz/#!ONAzgaqA!CDZ8Dlr2NIXeMTaq5Pg0REH0XYFs8Fwtinvb-aiHvS0
|
|
|
|
bitspill
|
|
June 14, 2013, 02:04:00 AM |
|
Quote from: bytemaster on June 13, 2013, 07:46:03 PM
bitspill: I will throw in an extra 0.05BTC to rerun your script and include the 'total outputs' in addition to 'unspent' outputs.
Something like http://blockchain.info/charts/n-transactions-total but by block rather than by date?
|
|
|
|
bytemaster (OP)
|
|
June 14, 2013, 02:26:50 AM |
|
What this graph tells me is that if we were to 'compress' the bitcoin blockchain with the simple approach of storing only the unspent outputs, at 50 bytes each, it would require a 350 MB output database plus an index, or about 500 MB in total. The set is also growing by about 1 million new unspent outputs per month, or about 50 MB / month, and accelerating. Thus, even if the only thing we did was optimize the storage of the blockchain, it would likely require over 10 GB in less than 10 years' time.
I suspect that the number of unspent outputs is greatly increased by mining pools, SatoshiDice, and the perverse incentive not to combine dust. Suppose we were to charge based upon the number of new unspent outputs instead of the total transaction size? Suppose that transactions with more inputs than outputs were 'free' and we somehow rewarded miners for including them? Perhaps the reduction in storage size would be all of the 'fee' required?
Now suppose that funds that are not spent for a few years start to be 'charged' storage fees? Why should the entire network incur an ongoing 'cost' to store these outputs forever, even when people have lost their private keys or generated 'dust'?
I suspect that with a few incentive changes we could reduce the number of unspent outputs by 50% or more.
The other thing we can conclude from this is that the total number of bitcoin users is WELL UNDER 1 million, if we assume that the average wallet contains a mere 10 unspent outputs.
What I also conclude is that if every user had only a handful of accounts (checking, savings, business, etc.), on the same order of magnitude as they currently manage in the banking system, then the network 'at maturity' would have 10 * 10,000,000,000 accounts * 50 bytes and require about 5 TB just to store the unspent outputs. If bitcoin doesn't do something, it will hit that number long before 'maturity', and thus long before technology will enable the average computer to store it and access it in reasonable time.
The primary argument for lots of addresses per account is the theory that it increases privacy. The reality is that most of that privacy is an illusion and that some other solution (Zerocoin, Open Transactions) should be used instead.
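The arithmetic behind these estimates can be checked with a short Python sketch. It only restates the figures from the post (50 bytes per output, a 350 MB database, 1 million new outputs per month, 10 outputs per wallet, 10 accounts for each of 10 billion users) and assumes decimal units (1 MB = 1,000,000 bytes). The variable names are made up for illustration.

```python
BYTES_PER_OUTPUT = 50  # per-output size assumed in the post


def storage_bytes(n_outputs: int, bytes_per_output: int = BYTES_PER_OUTPUT) -> int:
    """Raw storage for n_outputs unspent outputs, without any index."""
    return n_outputs * bytes_per_output


# A 350 MB database at 50 bytes per output implies about 7 million outputs.
current_bytes = 350_000_000
current_outputs = current_bytes // BYTES_PER_OUTPUT

# 1 million new unspent outputs per month is 50 MB per month.
monthly_growth_bytes = storage_bytes(1_000_000)

# Purely linear extrapolation over 10 years (120 months). The post says
# growth is accelerating, so this is a lower bound, not a forecast.
ten_year_bytes = current_bytes + 120 * monthly_growth_bytes

# At 10 unspent outputs per wallet, the current set implies this many users.
implied_users = current_outputs // 10

# 'At maturity': 10 accounts for each of 10,000,000,000 users.
maturity_bytes = storage_bytes(10 * 10_000_000_000)

print(current_outputs)         # 7000000
print(monthly_growth_bytes)    # 50000000
print(ten_year_bytes / 1e9)    # 6.35 (GB)
print(implied_users)           # 700000
print(maturity_bytes / 1e12)   # 5.0 (TB)
```

Note that the linear extrapolation gives about 6.35 GB after 10 years, so the "over 10 GB" figure depends on the acceleration mentioned in the post.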
|
|
|
|
bytemaster (OP)
|
|
June 14, 2013, 02:28:01 AM |
|
By block, and not transactions but total outputs; i.e., maintain a running total without subtracting the spent outputs.
|
|
|
|
bytemaster (OP)
|
|
June 14, 2013, 02:33:12 AM |
|
This also tells me that even going so far as to 'distribute' the outputs via a hash table and then 'prove' that the outputs are in the chain via a Merkle tree wouldn't actually save any space. A Merkle tree would require at least 28 bytes just for each hash, so the Merkle tree over the full set of outputs would be about the same size as the outputs themselves.
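A rough way to check the Merkle tree size claim is to count the hashes in a binary tree over the output set. The sketch below is an illustration under assumptions: it uses the 28-byte hash and 50-byte output sizes from the post, takes 7 million outputs (350 MB / 50 bytes), and pairs an odd node with itself at each level, as Bitcoin's block Merkle tree does. The function name is hypothetical.

```python
def merkle_hash_count(n_leaves: int) -> int:
    """Count all hashes (leaves plus internal nodes) in a binary Merkle tree."""
    if n_leaves <= 0:
        return 0
    total = n_leaves
    level = n_leaves
    while level > 1:
        # Each pair of nodes produces one parent; an odd node is paired
        # with itself, so the next level has ceil(level / 2) nodes.
        level = (level + 1) // 2
        total += level
    return total


HASH_BYTES = 28      # minimum hash size quoted in the post
OUTPUT_BYTES = 50    # per-output size assumed in the post
n_outputs = 7_000_000  # 350 MB / 50 bytes, from the post's figures

tree_bytes = merkle_hash_count(n_outputs) * HASH_BYTES
output_bytes = n_outputs * OUTPUT_BYTES

print(tree_bytes / 1e6, "MB of hashes")
print(output_bytes / 1e6, "MB of outputs")
```

A full tree over n leaves holds roughly 2n hashes, so at 28 bytes per hash it comes to about 56 bytes per output, which is on the same order as the 50 bytes needed for the output itself.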
|
|
|
|
bytemaster (OP)
|
|
June 14, 2013, 02:33:57 AM |
|
PAID. I was unable to open the .xls in keynote thus couldn't copy your address.
|
|
|
|
bitspill
|
|
June 14, 2013, 02:42:06 AM |
|
Quote from: bytemaster on June 14, 2013, 02:33:57 AM
PAID. I was unable to open the .xls in keynote thus couldn't copy your address.
Thanks. As I said, I plan to update again once I have downloaded more of the blockchain, and to add the total outputs portion as well.
|
|
|
|
bitspill
|
|
June 14, 2013, 09:01:27 AM |
|
New version of the csv file containing: Block #, Total unspent, Transactions in block, New outputs in block
unspent_v2.csv (4.5 MB) https://mega.co.nz/#!bZp01SQY!HRkpblOZmobu_UUNpnOtKS65zeIjRiiMoKMaGOEsHNE
Note: Still only goes up to block 237,270.
Note 2: If you try to verify on blockchain.info by viewing a block by its #, you must subtract 1. For example, http://blockchain.info/block-height/237262 is block # 237263 in the .csv (blockchain.info's first block is 0, the .csv's first block is 1).
Edit: Looking back, it seems you wanted total outputs, not "new outputs in block". However, you can simply add up all the previous data points to get the total.
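Both notes can be applied in a few lines of Python: summing the "new outputs in block" column gives the running total of outputs, and subtracting 1 from the block number gives the blockchain.info height. The sketch below uses three made-up sample rows in the column order listed above, and assumes the file has no header row (the real unspent_v2.csv may differ).

```python
import csv
import io
from itertools import accumulate

# Hypothetical sample rows, in the column order given in the post:
# block # (1-based), total unspent, transactions in block, new outputs in block.
sample_csv = """1,1,1,1
2,2,1,1
3,4,2,3
"""

rows = [[int(field) for field in row] for row in csv.reader(io.StringIO(sample_csv))]

block_numbers = [row[0] for row in rows]
new_outputs = [row[3] for row in rows]

# Running total of all outputs ever created, spent or not.
total_outputs = list(accumulate(new_outputs))

# blockchain.info heights start at 0, while the .csv starts at 1.
heights = [number - 1 for number in block_numbers]

for height, total in zip(heights, total_outputs):
    print(height, total)
# 0 1
# 1 2
# 2 5
```

To run this against the real file, the `io.StringIO(sample_csv)` argument would be replaced with an opened file handle, for example `open("unspent_v2.csv", newline="")`.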
|
|
|
|
bytemaster (OP)
|
|
June 16, 2013, 08:30:11 AM |
|
I just sent you the last .05 BTC for your latest update.
|
|
|
|
|