Bitcoin Forum
November 20, 2017, 01:04:19 AM *
Author Topic: Data compression  (Read 423 times)
mda
November 09, 2017, 01:49:36 AM
 #1

Is anybody working on this https://bitcointalk.org/index.php?topic=1533714.0 ?
A compression ratio of 50% is worth roughly 3.5 years of traffic growth at a 23% CAGR (https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html).
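
The growth equivalence can be checked with a short calculation. This is a sketch under two assumptions: traffic compounds annually at the quoted 23% rate, and "50%" means the compressed data is half the original size. The function name is illustrative only.

```python
import math


def years_of_growth_offset(compressed_fraction: float, cagr: float) -> float:
    """Years of compound growth needed to undo a one-time size reduction.

    compressed_fraction: compressed size / original size (0.5 for "50%").
    cagr: compound annual growth rate as a fraction (0.23 for 23%).
    """
    if not 0 < compressed_fraction <= 1:
        raise ValueError("compressed_fraction must be in (0, 1]")
    if cagr <= 0:
        raise ValueError("cagr must be positive")
    # Solve (1 + cagr) ** years == 1 / compressed_fraction for years.
    return math.log(1 / compressed_fraction) / math.log(1 + cagr)


years = years_of_growth_offset(0.5, 0.23)
print(f"{years:.2f} years")
```

Under these assumptions the result is a little over three years, the same order as the figure quoted in the post.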
haltingprobability
November 09, 2017, 04:19:36 AM
 #2


Most of the data in Bitcoin blocks and mempool transactions is incompressible, while some of it is trivially compressible (certain format fields, etc.). I am guessing that the Core devs would be interested in more aggressive compression using Merkle hashes. Since the entire blockchain is basically an immutable, read-only structure (except when a new block arrives), the only time you need to transmit raw data is for new transactions and the latest block. I doubt that the bandwidth for these is a bottleneck, so that 25% is probably not worth the cost of optimization. For other kinds of synchronization between nodes, all you need to transmit are the hashes.
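
To make the point about hashes concrete: a Bitcoin block header commits to all of its transactions through a single 32-byte Merkle root, so two nodes can compare that root instead of the raw transaction data. The sketch below is an illustration only, not Bitcoin Core's code. It follows the pairing rule Bitcoin uses (double SHA-256 over each pair, duplicating the last hash when a level has an odd count), and the sample leaves are made-up placeholders rather than real transaction IDs.

```python
import hashlib
from typing import List


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as Bitcoin does for transaction and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Reduce a list of 32-byte hashes to one root hash.

    Leaves are expected in internal byte order (the reverse of the hex
    strings that block explorers display).
    """
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: pair the last one with itself.
            level.append(level[-1])
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Placeholder leaves (hypothetical, not real transaction IDs).
sample_leaves = [double_sha256(bytes([i])) for i in range(5)]
root = merkle_root(sample_leaves)
print(len(sample_leaves) * 32, "bytes of hashes summarised by", len(root), "bytes")
```

Two peers holding the same set of transactions in the same order will compute the same root, which is why comparing hashes can stand in for retransmitting data both sides already have.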
elbandi
November 09, 2017, 04:27:46 PM
 #3

I moved the blk*.dat and rev*.dat files into squashfs images and mounted them back for Bitcoin Core. Here are the stats:

Code:
# du -hs bitcoin-blocks?.squashfs
53G     bitcoin-blocks0.squashfs
56G     bitcoin-blocks1.squashfs
# du -hs blocks-ro?
71G     blocks-ro0
71G     blocks-ro1

Each image contains 500 blk*.dat files; the compression rate is 34%, so transaction data can be compressed.
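
The du figures can be turned into percentages in two ways, which is worth spelling out because "compression rate" is ambiguous. This sketch uses the rounded sizes from the listing, so the results are approximate. It assumes blocks-ro0 and blocks-ro1 are the mounted contents of the two images, and the helper names are hypothetical.

```python
def space_saved(original: float, compressed: float) -> float:
    """Fraction of space saved: 1 - compressed / original."""
    if original <= 0 or compressed <= 0:
        raise ValueError("sizes must be positive")
    return 1 - compressed / original


def size_ratio(original: float, compressed: float) -> float:
    """How many times larger the original is than the compressed copy."""
    if original <= 0 or compressed <= 0:
        raise ValueError("sizes must be positive")
    return original / compressed


# (uncompressed GB, compressed GB), as rounded by `du -hs` in the post.
images = {
    "bitcoin-blocks0.squashfs": (71, 53),
    "bitcoin-blocks1.squashfs": (71, 56),
}

for name, (original, compressed) in images.items():
    saved = space_saved(original, compressed)
    ratio = size_ratio(original, compressed)
    print(f"{name}: {saved:.1%} saved, original is {ratio:.2f}x the compressed size")
```

With these rounded inputs the first image saves about 25% of the space, while the original is about 1.34 times the compressed size, which may be where the 34% figure comes from.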
ZipReg
November 10, 2017, 02:49:11 AM
 #4

I backed up the blockchain data recently (late October). It was 160 GB uncompressed; I used 7-Zip with normal compression, and it came to 112 GB in 25 DVD-sized archive files.

sivagananathan
November 10, 2017, 05:25:35 AM
 #5

Quote from: ZipReg on November 10, 2017, 02:49:11 AM
    I backed up the blockchain data recently (late October). It was 160 GB uncompressed; I used 7-Zip with normal compression, and it came to 112 GB in 25 DVD-sized archive files.

Data compression;

When data compression is used in a data transmission application, the goal is speed. Speed of transmission depends upon the number of bits sent, the time required for the encoder to generate the coded message, and the time required for the decoder to recover the original ensemble. In a data storage application, although the degree of compression is the primary concern, it is nonetheless necessary that the algorithm be efficient in order for the scheme to be practical.
sivagananathan
November 10, 2017, 05:50:38 AM
 #6

Quote from: ZipReg on November 10, 2017, 02:49:11 AM
    I backed up the blockchain data recently (late October). It was 160 GB uncompressed; I used 7-Zip with normal compression, and it came to 112 GB in 25 DVD-sized archive files.

Data compression;

When data compression is used in a data transmission application, the goal is speed. Speed of transmission depends upon the number of bits sent, the time required for the encoder to generate the coded message, and the time required for the decoder to recover the original ensemble. In a data storage application, although the degree of compression is the primary concern, it is nonetheless necessary that the algorithm be efficient in order for the scheme to be practical.

As discussed in the Introduction, data compression has wide application in terms of information storage, including representation of the abstract data type string and file compression. Huffman coding is used for compression in several file archival systems [ARC 1986; one of the adaptive schemes to be discussed in Section 5. An adaptive Huffman coding technique is the basis for the compact command of the UNIX operating system.
one could expect to see even greater use of variable-length coding in the future.
mda
November 10, 2017, 09:43:26 AM
 #7

Quote from: sivagananathan on November 10, 2017, 05:50:38 AM
    As discussed in the Introduction, data compression has wide application in terms of information storage, including representation of the abstract data type string and file compression. Huffman coding is used for compression in several file archival systems [ARC 1986; one of the adaptive schemes to be discussed in Section 5. An adaptive Huffman coding technique is the basis for the compact command of the UNIX operating system.
    one could expect to see even greater use of variable-length coding in the future.

Really, isn't it amazing?
LoyceV
November 10, 2017, 03:44:59 PM
 #8

Quote from: haltingprobability on November 09, 2017, 04:19:36 AM
    Most of the data in Bitcoin blocks and mempool transactions is incompressible

This thread got me curious, so I've tested it for myself using bzip2 (options -z -9). Result (in kB):
Code:
130820  blk00400.dat
106864  blk00400.dat.bz
The compressed file is 18.3% smaller. Considering the current cost of disk space, and the complications it would give to read back data (for a wallet rescan), I see no reason to implement this.
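
The bzip2 measurement can be reproduced without writing a compressed copy to disk. This is a minimal sketch using Python's standard `bz2` module at level 9, which corresponds to the `-9` option used in the post. The function name and the example path are assumptions, and results on other block files will differ.

```python
import bz2
import os


def bzip2_saving(path: str, level: int = 9, chunk_size: int = 1 << 20) -> float:
    """Return the fraction of space bzip2 would save on the file at `path`."""
    original = os.path.getsize(path)
    if original == 0:
        raise ValueError("file is empty")
    compressor = bz2.BZ2Compressor(level)
    compressed = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Count compressed bytes as they are produced; nothing is stored.
            compressed += len(compressor.compress(chunk))
    compressed += len(compressor.flush())
    return 1 - compressed / original


# Check the percentage from the sizes reported in the post (in kB).
reported_saving = 1 - 106864 / 130820
print(f"{reported_saving:.1%}")
```

Calling `bzip2_saving("blk00400.dat")` from inside the blocks directory (path assumed) would give the equivalent figure for a local copy of that file.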

Quote from: ZipReg on November 10, 2017, 02:49:11 AM
    I backed up the blockchain data recently (late October). It was 160 GB uncompressed; I used 7-Zip with normal compression, and it came to 112 GB in 25 DVD-sized archive files.

Why would you do this? Unless you have a very slow and very expensive internet connection, loading 25 DVDs is much more work than just downloading the blockchain again.

ZipReg
November 10, 2017, 04:00:28 PM
 #9

Quote from: mda on November 10, 2017, 09:43:26 AM
    Quote from: sivagananathan on November 10, 2017, 05:50:38 AM
        As discussed in the Introduction, data compression has wide application in terms of information storage, including representation of the abstract data type string and file compression. Huffman coding is used for compression in several file archival systems [ARC 1986; one of the adaptive schemes to be discussed in Section 5. An adaptive Huffman coding technique is the basis for the compact command of the UNIX operating system.
        one could expect to see even greater use of variable-length coding in the future.

    Really, isn't it amazing?

lol bot users?

I'm sure bitcoin could benefit similarly to websites using gzip to deliver content -if- it can be applied. 40+GB is a pretty big difference, so just offering up some data.
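
Whether gzip-style compression helps depends on what the bytes look like. The toy comparison below uses Python's `zlib` module (DEFLATE, the same algorithm gzip uses) on two synthetic inputs: random bytes as a stand-in for hashes and signatures, and a repeated 4-byte value as a stand-in for fixed format fields. Both inputs are assumptions made for illustration and are not real block data.

```python
import os
import zlib


def compressed_fraction(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (lower is better)."""
    if not data:
        raise ValueError("data must not be empty")
    return len(zlib.compress(data, level)) / len(data)


# Stand-in for hashes and signatures: bytes with no exploitable pattern.
high_entropy = os.urandom(32 * 1000)

# Stand-in for repeated format fields: the same 4-byte value over and over.
repeated_fields = b"\x01\x00\x00\x00" * 8000

print(f"random bytes:    {compressed_fraction(high_entropy):.3f}")
print(f"repeated fields: {compressed_fraction(repeated_fields):.3f}")
```

The random input stays at essentially its original size, while the repeated input shrinks to a tiny fraction. Real block files mix both kinds of data, which fits the moderate savings reported in this thread.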

ZipReg
November 10, 2017, 04:11:57 PM
 #10

Quote from: LoyceV on November 10, 2017, 03:44:59 PM
    Quote from: haltingprobability on November 09, 2017, 04:19:36 AM
        Most of the data in Bitcoin blocks and mempool transactions is incompressible

    This thread got me curious, so I've tested it for myself using bzip2 (options -z -9). Result (in kB):
    Code:
    130820  blk00400.dat
    106864  blk00400.dat.bz
    The compressed file is 18.3% smaller. Considering the current cost of disk space, and the complications it would give to read back data (for a wallet rescan), I see no reason to implement this.

    Quote from: ZipReg on November 10, 2017, 02:49:11 AM
        I backed up the blockchain data recently (late October). It was 160 GB uncompressed; I used 7-Zip with normal compression, and it came to 112 GB in 25 DVD-sized archive files.

    Why would you do this? Unless you have a very slow and very expensive internet connection, loading 25 DVDs is much more work than just downloading the blockchain again.

Data retention. In case of loss or an unrecoverable error, you can restore from the backup instead of downloading the entire blockchain again. A backup lets you be back in sync within hours. The purpose of DVD-sized archives is to make file transfer easier and integrity easier to check, not to actually use DVD media as storage. Cheers!
