Title: Data compression
Post by: mda on November 09, 2017, 01:49:36 AM

Is anybody working on this https://bitcointalk.org/index.php?topic=1533714.0 ?
A compression ratio of 50% equals 3.5 years of traffic growth at 23% CAGR https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html .

Title: Re: Data compression
Post by: haltingprobability on November 09, 2017, 04:19:36 AM

Quote from: mda
Is anybody working on this https://bitcointalk.org/index.php?topic=1533714.0 ? A compression ratio of 50% equals 3.5 years of traffic growth at 23% CAGR https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html .

Most of the data in Bitcoin blocks and mempool transactions is incompressible, while some of it is trivially compressible (certain format fields, etc.). I am guessing that the Core devs would be interested in more aggressive compression using Merkle hashes. Since the entire blockchain is basically an immutable, read-only structure (except just when a new block arrives), the only time you need to transmit raw data is for new transactions and the latest block. I doubt that the bandwidth for these is a bottleneck, so that 25% is probably not worth the cost of optimization. For other kinds of synchronization between nodes, all you need to transmit are the hashes.

Title: Re: Data compression
Post by: elbandi on November 09, 2017, 04:27:46 PM

I "moved" the blk*.dat and rev*.dat files to a squashfs file and mounted it back for Bitcoin Core. Here are the stats:
Code:
# du -hs bitcoin-blocks?.squashfs

Both files contain 500 blk*.dat files each; the compression rate is 34%. So transaction data can be compressed.

Title: Re: Data compression
Post by: ZipReg on November 10, 2017, 02:49:11 AM

I backed up the blockchain data recently (late October). It was 160GB uncompressed; I used 7zip with normal compression and it is 112GB in 25 DVD-sized archive files.
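For a rough sense of scale, the sizes reported above can be put through the same kind of arithmetic as the opening post. This is only a sketch: it assumes traffic grows at a constant 23% per year, it treats "saving" as the fraction of bytes removed, and the function names are made up for illustration.

```python
import math


def saving(original_gb: float, compressed_gb: float) -> float:
    """Fraction of bytes removed by compression (0.30 means 30% smaller)."""
    return 1.0 - compressed_gb / original_gb


def years_of_growth_offset(fraction_saved: float, cagr: float = 0.23) -> float:
    """Years of compound growth at `cagr` that a given saving cancels out.

    Solves (1 + cagr) ** years == 1 / (1 - fraction_saved) for years.
    """
    return math.log(1.0 / (1.0 - fraction_saved)) / math.log(1.0 + cagr)


if __name__ == "__main__":
    backup = saving(160, 112)  # the 7zip backup sizes reported in this thread
    print(f"7zip backup saving: {backup:.0%}")
    # 25% and 50% are the figures mentioned earlier in the thread.
    for s in (0.25, backup, 0.50):
        years = years_of_growth_offset(s)
        print(f"{s:.0%} saving offsets about {years:.1f} years at 23% CAGR")
```

Under these assumptions a 50% saving works out to a little over three years of growth, which is in the same ballpark as the 3.5 years quoted in the first post.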
Title: Re: Data compression
Post by: sivagananathan on November 10, 2017, 05:25:35 AM

Quote from: ZipReg
I backed up the blockchain data recently (late October). It was 160GB uncompressed; I used 7zip with normal compression and it is 112GB in 25 DVD-sized archive files.

Data compression: When data compression is used in a data transmission application, the goal is speed. Speed of transmission depends upon the number of bits sent, the time required for the encoder to generate the coded message, and the time required for the decoder to recover the original ensemble. In a data storage application, although the degree of compression is the primary concern, it is nonetheless necessary that the algorithm be efficient in order for the scheme to be practical.

Title: Re: Data compression
Post by: sivagananathan on November 10, 2017, 05:50:38 AM

Quote from: ZipReg
I backed up the blockchain data recently (late October). It was 160GB uncompressed; I used 7zip with normal compression and it is 112GB in 25 DVD-sized archive files.

Data compression: When data compression is used in a data transmission application, the goal is speed. Speed of transmission depends upon the number of bits sent, the time required for the encoder to generate the coded message, and the time required for the decoder to recover the original ensemble. In a data storage application, although the degree of compression is the primary concern, it is nonetheless necessary that the algorithm be efficient in order for the scheme to be practical.

As discussed in the Introduction, data compression has wide application in terms of information storage, including representation of the abstract data type string and file compression. Huffman coding is used for compression in several file archival systems [ARC 1986; one of the adaptive schemes to be discussed in Section 5. An adaptive Huffman coding technique is the basis for the compact command of the UNIX operating system. One could expect to see even greater use of variable-length coding in the future.
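The point about transmission speed (bits sent plus encoder time plus decoder time) can be written down as a small model. This is a sketch under simplifying assumptions: a fixed link bandwidth, no latency or protocol overhead, and encode/decode times supplied by the caller. The function and parameter names are illustrative and not taken from any Bitcoin software.

```python
def transfer_seconds(size_bytes: int,
                     bandwidth_bps: float,
                     ratio: float = 1.0,
                     encode_s: float = 0.0,
                     decode_s: float = 0.0) -> float:
    """Total time = encode time + (compressed bits / bandwidth) + decode time.

    `ratio` is compressed size divided by original size (1.0 = no compression).
    """
    bits_sent = size_bytes * ratio * 8
    return encode_s + bits_sent / bandwidth_bps + decode_s


def compression_pays_off(size_bytes: int, bandwidth_bps: float,
                         ratio: float, encode_s: float, decode_s: float) -> bool:
    """True if sending compressed data is faster than sending it raw."""
    raw = transfer_seconds(size_bytes, bandwidth_bps)
    packed = transfer_seconds(size_bytes, bandwidth_bps, ratio, encode_s, decode_s)
    return packed < raw


if __name__ == "__main__":
    # Hypothetical numbers: a 1 MB payload over an 8 Mbit/s link.
    print(transfer_seconds(1_000_000, 8_000_000))
    print(transfer_seconds(1_000_000, 8_000_000, ratio=0.7,
                           encode_s=0.1, decode_s=0.05))
```

On a fast link the encode and decode terms dominate, so the same compression ratio can help on a slow connection and hurt on a fast one.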
Title: Re: Data compression
Post by: mda on November 10, 2017, 09:43:26 AM

Quote from: sivagananathan
As discussed in the Introduction, data compression has wide application in terms of information storage, including representation of the abstract data type string and file compression. Huffman coding is used for compression in several file archival systems [ARC 1986; one of the adaptive schemes to be discussed in Section 5. An adaptive Huffman coding technique is the basis for the compact command of the UNIX operating system. One could expect to see even greater use of variable-length coding in the future.

Really, isn't it amazing?

Title: Re: Data compression
Post by: LoyceV on November 10, 2017, 03:44:59 PM

Quote from: haltingprobability
Most of the data in Bitcoin blocks and mempool transactions is incompressible

This thread got me curious, so I've tested it for myself using bzip2 (options -z -9). Result (in kB):
Code:
130820 blk00400.dat

Quote from: ZipReg
I backed up the blockchain data recently (late October). It was 160GB uncompressed; I used 7zip with normal compression and it is 112GB in 25 DVD-sized archive files.

Why would you do this? Unless you have a very slow and very expensive internet connection, loading 25 DVDs is much more work than just downloading the blockchain again.

Title: Re: Data compression
Post by: ZipReg on November 10, 2017, 04:00:28 PM

Quote from: mda
    Quote from: sivagananathan
    As discussed in the Introduction, data compression has wide application in terms of information storage, including representation of the abstract data type string and file compression. Huffman coding is used for compression in several file archival systems [ARC 1986; one of the adaptive schemes to be discussed in Section 5. An adaptive Huffman coding technique is the basis for the compact command of the UNIX operating system. One could expect to see even greater use of variable-length coding in the future.
Really, isn't it amazing?

lol bot users? I'm sure bitcoin could benefit similarly to websites using gzip to deliver content -if- it can be applied.
40+GB is a pretty big difference, so just offering up some data.

Title: Re: Data compression
Post by: ZipReg on November 10, 2017, 04:11:57 PM

Quote from: LoyceV
    Quote from: haltingprobability
    Most of the data in Bitcoin blocks and mempool transactions is incompressible
This thread got me curious, so I've tested it for myself using bzip2 (options -z -9). Result (in kB):
Code:
130820 blk00400.dat
    Quote from: ZipReg
    I backed up the blockchain data recently (late October). It was 160GB uncompressed; I used 7zip with normal compression and it is 112GB in 25 DVD-sized archive files.
Why would you do this? Unless you have a very slow and very expensive internet connection, loading 25 DVDs is much more work than just downloading the blockchain again.

Data retention. In case of loss or unrecoverable error, you can use the backup instead of having to download the entire blockchain. A backup lets you be back in sync within hours. The purpose of having DVD-sized archives is to ensure file transfer ease and integrity, not to actually use DVD media as storage.

Cheers!
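For anyone who wants to repeat the bzip2 test from earlier in the thread without touching real block files, here is a minimal sketch using Python's standard bz2 module at level 9. The two inputs are stand-ins chosen for illustration, not actual blockchain data: random bytes play the role of hashes and signatures, and a repeated 4-byte pattern plays the role of a fixed format field.

```python
import bz2
import os


def bz2_ratio(data: bytes) -> float:
    """Return compressed size divided by original size for bzip2 at level 9."""
    return len(bz2.compress(data, compresslevel=9)) / len(data)


if __name__ == "__main__":
    # Stand-in for hashes, public keys and signatures: effectively random bytes.
    random_like = os.urandom(1_000_000)
    # Stand-in for a repeated format field (hypothetical 4-byte pattern).
    structured = b"\x01\x00\x00\x00" * 250_000

    print(f"random-like data: {bz2_ratio(random_like):.3f}")
    print(f"structured data:  {bz2_ratio(structured):.3f}")
```

A ratio near 1.0 means the data is effectively incompressible. To check a real file, read it in binary mode and pass the bytes to bz2_ratio; a real blk*.dat file mixes both kinds of data, so its ratio should fall somewhere between the two extremes.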