Bitcoin Forum

Bitcoin => Bitcoin Discussion => Topic started by: AssemblY on August 31, 2011, 09:52:45 PM



Title: Where will stop, the size of the database bitcoin. 1GB+
Post by: AssemblY on August 31, 2011, 09:52:45 PM
I have a big question; I searched the forum for it and found nothing.

Today the Bitcoin database passed the 1GB mark.
Imagine Bitcoin becomes popular on a large scale in a short time: the database could grow enormously and make it slower, no?

How far can this size go? Can it compromise the network?

 ???


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: Hawkix on August 31, 2011, 10:45:14 PM
My up-to-date blk0001.dat is "only" 528MB. The blkindex.dat is 216MB. Where did you get that 1GB+ ?


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: ElectricMucus on August 31, 2011, 10:50:34 PM
Very simple: Future Clients might use a binary data format and / or compression, there are large amounts of redundant information.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: evoorhees on August 31, 2011, 10:53:53 PM
Use the excellent Android app as an example. It's a fully-functional wallet, but only about 30MB.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: pilardi on August 31, 2011, 10:57:08 PM
The client does not need the entire database to work properly.  Future clients will allow downloading of only relevant blocks.  

Read this for more:
https://en.bitcoin.it/wiki/FAQ#If_every_transaction_is_broadcast_via_the_network.2C_does_Bitcoin_scale.3F



Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: AssemblY on August 31, 2011, 11:02:57 PM
My up-to-date blk0001.dat is "only" 528MB. The blkindex.dat is 216MB. Where did you get that 1GB+ ?


I installed Bitcoin on a new machine just now, and I finished downloading all the blocks. The "database" folder came to 1.10GB, and blkindex and blk0001 together added up to 752MB.

Very simple: Future Clients might use a binary data format and / or compression, there are large amounts of redundant information.

I don't understand: why did I have to download almost 2GB of data then?
I don't fully understand it, but I think everyone who installs Bitcoin from today on will download the same data, in an ever-increasing volume, won't they?
Where is the compression?


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: AssemblY on August 31, 2011, 11:06:57 PM
The client does not need the entire database to work properly.  Future clients will allow downloading of only relevant blocks.  

Read this for more:
https://en.bitcoin.it/wiki/FAQ#If_every_transaction_is_broadcast_via_the_network.2C_does_Bitcoin_scale.3F



Right ... when? The article does not say this specifically.  :-\


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: Vod on August 31, 2011, 11:07:00 PM
I have a big question; I searched the forum for it and found nothing.

Today the Bitcoin database passed the 1GB mark.
Imagine Bitcoin becomes popular on a large scale in a short time: the database could grow enormously and make it slower, no?

How far can this size go? Can it compromise the network?

 ???

Mining pools and exchanges can help control the size of the blockchain by increasing the minimum amount they work with.  If pools would transfer 1 bitcoin at a time instead of 0.1 or even 0.01, you'd reduce the size of the blockchain considerably.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: AssemblY on August 31, 2011, 11:10:42 PM
I have a big question; I searched the forum for it and found nothing.

Today the Bitcoin database passed the 1GB mark.
Imagine Bitcoin becomes popular on a large scale in a short time: the database could grow enormously and make it slower, no?

How far can this size go? Can it compromise the network?

 ???

Mining pools and exchanges can help control the size of the blockchain by increasing the minimum amount they work with.  If pools would transfer 1 bitcoin at a time instead of 0.1 or even 0.01, you'd reduce the size of the blockchain considerably.

I know this, but if the number of Bitcoin users increases very fast, worldwide, the demand would be incalculable.
The existing pools and exchanges alone will not be enough to balance this.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: MoonShadow on August 31, 2011, 11:15:26 PM
I have a big question; I searched the forum for it and found nothing.

Today the Bitcoin database passed the 1GB mark.
Imagine Bitcoin becomes popular on a large scale in a short time: the database could grow enormously and make it slower, no?


No.  Not slower.  Not the network, anyway.  Perhaps your particular client, but only if your machine & internet connection suck.

Quote

How far can this size go? Can it compromise the network?

 ???

It can't compromise the network, and it can grow without limit until a client with the capacity to 'prune' spent transactions from the old blocks is developed.  Pruning of the blockchain is part of the protocol, but isn't implemented yet, and likely won't be for some time.  It probably isn't going to become a priority until the blockchain is around the 10-20 GB range.  Any computer newer than two years old and with a decent broadband connection can handle this kind of database.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: AssemblY on August 31, 2011, 11:42:10 PM
It can't compromise the network, and it can grow without limit until a client with the capacity to 'prune' spent transactions from the old blocks is developed.  Pruning of the blockchain is part of the protocol, but isn't implemented yet, and likely won't be for some time.  It probably isn't going to become a priority until the blockchain is around the 10-20 GB range.

If the pruning of block chain is part of the protocol, where is it written?
I looked for several articles and found nothing about it.

If this information is official, it makes sense.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: MoonShadow on September 01, 2011, 12:02:28 AM
It can't compromise the network, and it can grow without limit until a client with the capacity to 'prune' spent transactions from the old blocks is developed.  Pruning of the blockchain is part of the protocol, but isn't implemented yet, and likely won't be for some time.  It probably isn't going to become a priority until the blockchain is around the 10-20 GB range.

If the pruning of block chain is part of the protocol, where is it written?
I looked for several articles and found nothing about it.

If this information is official, it makes sense.

It's in the white paper.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: iamzill on September 01, 2011, 01:09:39 AM
It probably isn't going to become a priority until the blockchain is around the 10-20 GB range.  Any computer newer than two years old and with a decent broadband connection can handle this kind of database.

The most common SSD drives are 64 to 80GB  :'(



Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: phillipsjk on September 01, 2011, 05:08:11 AM
I will just leave this here:
Topic: Don't forget to rotate your logs... (https://bitcointalk.org/?topic=292.0)


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: iamzill on September 01, 2011, 05:15:20 AM
It probably isn't going to become a priority until the blockchain is around the 10-20 GB range.  Any computer newer than two years old and with a decent broadband connection can handle this kind of database.

The most common SSD drives are 64 to 80GB  :'(



you don't have to store it in %appdata%

-datadir=[old 2TB spinning disk]:\



That's what I'm doing on my desktop, but unfortunately the technique is not applicable for my laptop.

It's not a big concern for me though. I'm sure the standard client will implement a "recent blockchain only" option or blockchain pruning soon. Either of these new features will solve the problem instantly, along with the slow client start-up problem.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: ElectricMucus on September 01, 2011, 01:29:10 PM
Very simple: Future Clients might use a binary data format and / or compression, there are large amounts of redundant information.

Mmmh.

Data in blkchain isn't very compressible: here's a bzip2 run on it:

Code:
  blk0001.dat:   1.268:1,  6.310 bits/byte, 21.13% saved, 554632957 in, 437434096 out.
  blkindex.dat:  1.598:1,  5.007 bits/byte, 37.41% saved, 236568576 in, 148074565 out.
oops was a shot in the dark then  :-[


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: Gabi on September 01, 2011, 04:06:27 PM
The Roaming/Bitcoin folder is 837MB here, with blk0001.dat at 530MB and blkindex.dat at 217MB



Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: NothinG on September 01, 2011, 04:11:46 PM
Why not this:

If the client is run for the first time, only grab the most recent block. And, from there on out...grab the latest blocks.
-There is no need for new clients to go back and sift through all the blocks to see if they have any transactions.

If the client has been run before, only grab the blocks from when their client first started.
-Again, no need to go ALL the way back.

If the user so happens to want all the blocks, they would easily just be able to do -rescan.
-Sometimes people want to have all the blocks.



^ How about that?


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: Piper67 on September 01, 2011, 04:13:15 PM
Why not this:

If the client is run for the first time, only grab the most recent block. And, from there on out...grab the latest blocks.

If the client has been run before, only grab the blocks from when their client first started.

If the user so happens to want all the blocks, they would easily just be able to do -rescan.



^ How about that?

Yes, can't the client "trust" the network with the majority of the block chain as a default, and only download the whole thing when specifically requested to do so?


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: NothinG on September 01, 2011, 04:15:21 PM
Yes, can't the client "trust" the network with the majority of the block chain as a default, and only download the whole thing when specifically requested to do so?
Personally, I couldn't care less how big the client is. However, following this method would allow users with little space to start a new client and send all their Bitcoins to it.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: Gabi on September 01, 2011, 04:21:53 PM
Remember that, due to the decentralized nature of Bitcoin, we need as many people as possible with the FULL blockchain, to relay all the blocks and transactions to others

1GB is almost nothing, you can buy 3TB hard disks for cheap... so as long as it doesn't become a real problem, it's better we keep the full blockchain


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: NothinG on September 01, 2011, 04:30:08 PM
Remember that, due to the decentralized nature of Bitcoin, we need as many people as possible with the FULL blockchain, to relay all the blocks and transactions to others

1GB is almost nothing, you can buy 3TB hard disks for cheap... so as long as it doesn't become a real problem, it's better we keep the full blockchain
I see your point, and I raise you space detection. :)

If their HDD (the one that has the Bitcoin client on it) has enough space to leave at LEAST 10% free afterwards, go ahead and download the full blockchain.
Another option would be to ask the user if they would be so kind as to help grow the Bitcoin economy by downloading the full blockchain.

How about a monthly payment for having the full blockchain with a decent upload speed? Nothing HUGE, but something to make them WANT to have all the blocks.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: MoonShadow on September 01, 2011, 06:35:22 PM
Why not this:

If the client is run for the first time, only grab the most recent block. And, from there on out...grab the latest blocks.
-There is no need for new clients to go back and sift through all the blocks to see if they have any transactions.

BitcoinJ clients do this, because once they are started for the first time, they create their own addresses and have no logical need to assume that such addresses have existed prior to themselves.  They download the most recent blocks, and then keep up with the chain, but then I think that they only keep the block headers and discard the data in all blocks that do not relate to coins sent to themselves.  It's certainly possible, there just isn't a real need for a regular client that can do this just yet.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: MoonShadow on September 01, 2011, 06:45:07 PM

How about a monthly payment for having the full blockchain with a decent upload speed? Nothing HUGE, but something to make them WANT to have all the blocks.

There isn't any real need for a large number of nodes that keep a full copy of the blockchain.  There probably isn't any real need for any node to keep a full copy of the blockchain unpruned at all.  Pruning of transaction data that 1) is older than a certain period of time, say three months or 15K blocks or so and 2) has been referenced (spent) and the referencing transaction has been referenced (i.e. the transaction is at least two transactions long spent) will eventually result in a fairly stable blockchain size that mostly varies by transaction volumes over those three months.  Some clients won't keep spent transactions unpruned at all, and will thus have a much smaller data footprint, growing only by the size of the block headers, which amounts to about 4 megs per year.

That said, some nodes will keep full copies of the blockchain, if only for archival reasons.  It's not necessary that these nodes have high bandwidth or super powerful machines either.  I have a VPS that can keep up with the blockchain just fine, that I direct my other client(s) to bootstrap and update from, since I don't keep clients running on either my home machine or my Android phone and I don't want either to announce to the network or to the bootstrapping IRC channel that they exist.
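
For anyone who wants to check the "about 4 megs per year" figure, here is a rough back-of-the-envelope sketch. It assumes an 80-byte block header and the 10-minute target block interval (6 blocks per hour); the function names are made up for illustration and are not part of any client.

```python
# Rough arithmetic for header-only growth, under the assumptions stated above.
HEADER_BYTES = 80       # assumed size of one serialized block header
BLOCKS_PER_HOUR = 6     # assumes the 10-minute target block interval


def blocks_in_days(days):
    """Expected number of blocks found in the given number of days."""
    return BLOCKS_PER_HOUR * 24 * days


def header_bytes_per_year(days=365):
    """Bytes of block headers accumulated in one year."""
    return HEADER_BYTES * blocks_in_days(days)


print(header_bytes_per_year())   # 4204800 bytes, i.e. roughly 4.2 MB
print(blocks_in_days(90))        # 12960 blocks in about three months
```

So three months comes to roughly 13K blocks at the target rate, which is in the same ballpark as the "15K blocks or so" above.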


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: wareen on September 01, 2011, 07:23:56 PM
Data in blkchain isn't very compressible: here's a bzip2 run on it:
Code:
  blk0001.dat:   1.268:1,  6.310 bits/byte, 21.13% saved, 554632957 in, 437434096 out.
  blkindex.dat:  1.598:1,  5.007 bits/byte, 37.41% saved, 236568576 in, 148074565 out.

For reference, here's the result of an lzma run:
Code:
  blk0001.dat:   32.17% saved
  blkindex.dat:  57.47% saved

And that's just a "dumb" entropy compressor, so I'd guess in combination with some low-level optimization of the data format we could easily get the size down to about 40-50%.
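
If someone wants to repeat this kind of measurement, a minimal sketch with Python's standard bz2 and lzma modules could look like the following. The helper names are hypothetical, the default compression settings are an assumption (the runs above may have used different ones), and the two printed figures simply recompute the "% saved" values from the byte counts in the bzip2 output quoted above.

```python
import bz2
import lzma


def percent_saved(original_size, compressed_size):
    """Percentage of bytes saved by compression."""
    return 100.0 * (1 - compressed_size / original_size)


def compare_compressors(data):
    """Compress the same bytes with bzip2 and LZMA and report % saved for each."""
    return {
        "bzip2": percent_saved(len(data), len(bz2.compress(data))),
        "lzma": percent_saved(len(data), len(lzma.compress(data))),
    }


# Recompute the savings from the byte counts reported in the bzip2 run.
print(round(percent_saved(554632957, 437434096), 2))  # blk0001.dat: 21.13
print(round(percent_saved(236568576, 148074565), 2))  # blkindex.dat: 37.41
```

To try it on a real file, read the file in binary mode and pass the bytes to compare_compressors; for files of several hundred MB you would probably want to stream in chunks instead of loading everything into memory.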


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: wareen on September 01, 2011, 08:37:39 PM
Yeah ...
Not even an order of magnitude ... probably not worth the trouble.
I agree, but then again - with the total blockchain size growing steadily, saving 60% might one day be worth the trouble.

Correct me if I'm wrong but I believe that pruning the blockchain of fully spent transactions is "only" expected to yield about 70% savings as well.

There is a reason that every node keeps a copy of the blockchain and while we probably can get away with a lot of light clients, I'd rather not see the full history only be kept by large centralized institutions.

Having more efficient network protocols and storage formats would be great but there are other more important issues right now IMHO.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: MoonShadow on September 01, 2011, 09:15:06 PM
Correct me if I'm wrong but I believe that pruning the blockchain of fully spent transactions is "only" expected to yield about 70% savings as well.


That would be true about now, but that is also one reason pruning isn't high on the to-do list.  In a future, wherein Bitcoin processes transactions on the scale of Paypal or Visa, the pruning of spent transactions will be a very useful thing.

Quote
There is a reason that every node keeps a copy of the blockchain and while we probably can get away with a lot of light clients, I'd rather not see the full history only be kept by large centralized institutions.


Yes, there is a reason; and because of that reason, and the sentiment that you have expressed above, not every node is going to prune.  Thus, the entire blockchain will continue to exist somewhere on the 'net forever.  However, this isn't particularly useful for a node processing transactions in a live production environment.  If you truly feel this way, then you can feel free to commit some of your own personal resources to make certain that a node with the complete and unabridged blockchain continues to exist.

Quote
Having more efficient network protocols and storage formats would be great but there are other more important issues right now IMHO.

Agreed.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: kjj on September 02, 2011, 03:45:27 PM
There isn't any real need for a large number of nodes that keep a full copy of the blockchain.  There probably isn't any real need for any node to keep a full copy of the blockchain unpruned at all.  Pruning of transaction data that 1) is older than a certain period of time, say three months or 15K blocks or so and 2) has been referenced (spent) and the referencing transaction has been referenced (i.e. the transaction is at least two transactions long spent) will eventually result in a fairly stable blockchain size that mostly varies by transaction volumes over those three months.  Some clients won't keep spent transactions unpruned at all, and will thus have a much smaller data footprint, growing only by the size of the block headers, which amounts to about 4 megs per year.

That said, some nodes will keep full copies of the blockchain, if only for archival reasons.  It's not necessary that these nodes have high bandwidth or super powerful machines either.  I have a VPS that can keep up with the blockchain just fine, that I direct my other client(s) to bootstrap and update from, since I don't keep clients running on either my home machine or my Android phone and I don't want either to announce to the network or to the bootstrapping IRC channel that they exist.

The hash of a pruned block will not match the hash in the header, and thus a pruned block cannot be verified.  Checkpoints will help this a bit, in that you'll have a verification chain (by a signature on the source code or binary) that says that someone "trustworthy" claims to have verified up to a certain point.  But you won't be able to prove it unless you have, or can get, the full chain, but you don't need to have all of it at once.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: MoonShadow on September 02, 2011, 05:38:27 PM
There isn't any real need for a large number of nodes that keep a full copy of the blockchain.  There probably isn't any real need for any node to keep a full copy of the blockchain unpruned at all.  Pruning of transaction data that 1) is older than a certain period of time, say three months or 15K blocks or so and 2) has been referenced (spent) and the referencing transaction has been referenced (i.e. the transaction is at least two transactions long spent) will eventually result in a fairly stable blockchain size that mostly varies by transaction volumes over those three months.  Some clients won't keep spent transactions unpruned at all, and will thus have a much smaller data footprint, growing only by the size of the block headers, which amounts to about 4 megs per year.

That said, some nodes will keep full copies of the blockchain, if only for archival reasons.  It's not necessary that these nodes have high bandwidth or super powerful machines either.  I have a VPS that can keep up with the blockchain just fine, that I direct my other client(s) to bootstrap and update from, since I don't keep clients running on either my home machine or my Android phone and I don't want either to announce to the network or to the bootstrapping IRC channel that they exist.

The hash of a pruned block will not match the hash in the header, and thus a pruned block cannot be verified. 

It doesn't need to be verified.  The client downloads the entire block, verifies it, and then proceeds to prune it to its own liking.  That is what the merkle tree block structure is for.
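
To make the merkle tree point a bit more concrete, here is a simplified sketch of how a merkle root is built from transaction hashes, and how one transaction can later be checked against that root using only its sibling hashes. It assumes double SHA-256 and duplicating the last hash on odd-sized levels; the leaves are placeholder hashes rather than real transaction ids, byte-order details of the real serialization are ignored, and all function names are made up for illustration.

```python
import hashlib


def double_sha256(data):
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def next_level(level):
    """Hash adjacent pairs; an odd-sized level repeats its last hash."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [double_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Fold a non-empty list of leaf hashes down to a single root hash."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        level = next_level(level)
    return level[0]


def merkle_branch(leaves, index):
    """Sibling hashes needed to recompute the root from leaves[index]."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        branch.append(padded[index ^ 1])  # sibling sits at the paired position
        level = next_level(level)
        index //= 2
    return branch


def verify_branch(leaf, index, branch, root):
    """Recompute the root from one leaf plus its branch and compare."""
    current = leaf
    for sibling in branch:
        if index & 1:
            current = double_sha256(sibling + current)
        else:
            current = double_sha256(current + sibling)
        index //= 2
    return current == root


# Placeholder "transaction hashes" for illustration only.
txids = [double_sha256(bytes([i])) for i in range(5)]
root = merkle_root(txids)
branch = merkle_branch(txids, 3)
print(verify_branch(txids[3], 3, branch, root))  # True
```

In this sketch a node that has already verified the whole block only needs to keep the root (in the header) plus the branches for the transactions it still cares about.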


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: kjj on September 02, 2011, 06:03:22 PM
The hash of a pruned block will not match the hash in the header, and thus a pruned block cannot be verified. 

It doesn't need to be verified.  The client downloads the entire block, verifies it, and then proceeds to prune it to its own liking.  That is what the merkle tree block structure is for.

Right, but that means that the full block must exist somewhere so the node can download it and verify it.  So while any node can prune, not every node can prune.


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: MoonShadow on September 02, 2011, 07:58:24 PM
The hash of a pruned block will not match the hash in the header, and thus a pruned block cannot be verified. 

It doesn't need to be verified.  The client downloads the entire block, verifies it, and then proceeds to prune it to its own liking.  That is what the merkle tree block structure is for.

Right, but that means that the full block must exist somewhere so the node can download it and verify it.  So while any node can prune, not every node can prune.

And that is why it is reasonable to assume that there will always be at least one person willing to run a full node that does not prune.  It's also why I said that most full clients won't prune anything recent.  As already noted, light clients don't need the full chain anyway, and can start their chain when they create their first address.  Most full clients won't really need a full chain either, and can prune quite extensively and/or start from the most recent trusted checkpoint hash.  All clients do full chains currently, in part, because the network is still very small compared to the expectations, and the many-copies-keep-data-safe method is employed.  The concept of placing a full "quiet" node in orbit, on Earth protected in a safe zone, and eventually on the Moon, as archival devices, has been discussed already.  For now, there is neither the need nor the resources to do any such things, but if Bitcoin is as successful as some of us predict, those nodes will be cheap insurance; largely inaccessible to most forms of destruction.  (one or another, but not all, depending on the kind of destruction being considered)


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: AssemblY on September 04, 2011, 12:59:47 PM
(...)For now, there is neither the need nor the resources to do any such things, but if Bitcoin is as successful as some of us predict, those nodes will be cheap insurance; largely inaccessible to most forms of destruction.  (one or another, but not all, depending on the kind of destruction being considered)

I understand now. I thank everyone for their cooperation and commitment to this topic.

It added a lot to my knowledge.

 :)


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: Meatpile on September 04, 2011, 03:12:37 PM
As far as I know there are hardcoded checkpoints; in the future they would just make sure you have data after said checkpoint.

However this leaves the system open to a possible conspiracy situation perhaps?


Title: Re: Where will stop, the size of the database bitcoin. 1GB+
Post by: phillipsjk on September 04, 2011, 08:47:42 PM
There may exist coins minted/created in (split/merging) transactions since before the checkpoint. You can only prune coin obliterated by a split/merge.