But is it better than Pied Piper's compression algorithm?
Best thing I read all day lol, I bet Hooli did an under-the-table deal with Lempel–Ziv on this lossless algo we're seeing here.
Our compression algorithm is also lossless. We tested it on Bitcoin, compressing and decompressing about 200,000 blocks without any problem. The blockchain compression ratio may be the highest available, and it's not just compression.
|
|
|
Great job! Hopefully Polo will take note. Let's do it together, thanks
|
|
|
Hashes and encryption are extremely resistant to ASCII-based compression algorithms, so it is likely you are at best compressing the scripts, and small transactions may actually get bigger. Have you reached out to the UPX developers for a look at binary compression?
Our compression algorithm is binary compression. The core algorithm is LZMA, which compresses better than LZ4.
|
|
|
If you want to see this in Bitcoin Core, I suggest that you open a pull request with your changes at https://github.com/bitcoin/bitcoin/pulls. Then see how the discussion goes with the actual developers of Core. However, since 0.13 has reached its feature freeze, your change would not make it into a release until 0.14 at the earliest, which will be in roughly 6 months.
Yes, we hope the Bitcoin development team will use this compression algorithm. Thank you very much for your advice.
|
|
|
But is it better than Pied Piper's compression algorithm?
I haven't compared other people's algorithms; maybe you can make a comparison to see whose compression ratio is higher. Moreover, our algorithm can save the same share of network traffic by transmitting compressed blocks over the protocol.
|
|
|
Yesterday I ported the compression algorithm code to Bitcoin version 0.8.6, and the compression effect is obvious. In order to view the compression effect, I changed FindBlockPos() inside main.cpp so that each block file (blkxxxxx.dat) contains 10,000 blocks:

bool FindBlockPos(CValidationState &state, CDiskBlockPos &pos, unsigned int nAddSize, unsigned int nHeight, uint64 nTime, bool fKnown = false)
{
    ...
    /* while (infoLastBlockFile.nSize + nAddSize >= MAX_BLOCKFILE_SIZE) { */
    if( ((nHeight / 10000) > 0) && ((nHeight % 10000) == 0) )
    {
        printf("nHeight = [%d], Leaving block file %i: %s\n", nHeight, nLastBlockFile, infoLastBlockFile.ToString().c_str());
        FlushBlockFile(true);
        nLastBlockFile++;
        infoLastBlockFile.SetNull();
        pblocktree->ReadBlockFileInfo(nLastBlockFile, infoLastBlockFile); // check whether data for the new file somehow already exist; can fail just fine
        fUpdatedLast = true;
    }
    ...
}
Here is the test data:

blk00000.dat (blocks 0 ~ 9999): original size 2,318,345 bytes, after compression 2,116,328 bytes, ratio 8.7%
blk00001.dat (blocks 10000 ~ 19999): original size 2,303,141 bytes, after compression 2,103,239 bytes, ratio 8.6%
blk00002.dat (blocks 20000 ~ 29999): original size 2,440,262 bytes, after compression 2,224,608 bytes, ratio 8.8%
blk00003.dat (blocks 30000 ~ 39999): original size 2,500,372 bytes, after compression 2,278,627 bytes, ratio 8.86%
blk00004.dat (blocks 40000 ~ 49999): original size 2,775,946 bytes, after compression 2,527,266 bytes, ratio 8.95%
blk00005.dat (blocks 50000 ~ 59999): original size 4,611,316 bytes, after compression 3,927,464 bytes, ratio 14.8%
blk00006.dat (blocks 60000 ~ 69999): original size 6,788,315 bytes, after compression 5,763,507 bytes, ratio 15%
blk00007.dat (blocks 70000 ~ 79999): original size 8,111,206 bytes, after compression 6,493,703 bytes, ratio 19.9%
blk00008.dat (blocks 80000 ~ 89999): original size 7,963,189 bytes, after compression 7,048,131 bytes, ratio 11.49%
blk00009.dat (blocks 90000 ~ 99999): original size 20,742,813 bytes, after compression 13,708,206 bytes, ratio 33.9%
blk00010.dat (blocks 100000 ~ 109999): original size 23,122,509 bytes, after compression 19,481,570 bytes, ratio 15.7%
blk00011.dat (blocks 110000 ~ 119999): original size 50,681,392 bytes, after compression 40,918,962 bytes, ratio 19.2%
blk00012.dat (blocks 120000 ~ 129999): original size 107,469,564 bytes, after compression 88,319,322 bytes, ratio 17.8%
blk00013.dat (blocks 130000 ~ 139999): original size 231,631,119 bytes, after compression 188,562,481 bytes, ratio 18.59%
blk00014.dat (blocks 140000 ~ 149999): original size 215,720,950 bytes, after compression 174,676,348 bytes, ratio 19%
blk00015.dat (blocks 150000 ~ 159999): original size 173,452,632 bytes, after compression 139,074,101 bytes, ratio 19.8%
blk00016.dat (blocks 160000 ~ 169999): original size 212,377,235 bytes, after compression 164,287,461 bytes, ratio 22.6%
blk00017.dat (blocks 170000 ~ 179999): original size 263,652,393 bytes, after compression 205,578,322 bytes, ratio 22%
blk00018.dat (blocks 180000 ~ 189999): original size 887,112,287 bytes, after compression 612,296,114 bytes, ratio 30.9%
blk00019.dat (blocks 190000 ~ 199999): original size 925,036,513 bytes, after compression 638,670,092 bytes, ratio 30.9%
|
|
|
Original Bitcoin genesis block hex code:
Compressed Bitcoin genesis block hex code:
|
|
|
In the Windows environment, the impact on CPU can be ignored. It can save 20%~25% and even more disk space.
My hard disk is 1 TB and right now 328 GB is free space. I think that would be enough for 3-5 years for me. What is the reason to compress the blockchain and increase the work for the CPU? I see no reason to compress data on disk.
This algorithm not only saves disk space, it can also save the same share of network traffic. Double the benefit.
It's true that disk space is not a big problem for nodes, but traffic is! Saving 25%+ on traffic can be really interesting! I guess it sends compressed blocks to other clients using the same protocol.
Yes, you're right.
|
|
|
Compression features usually come with an increase in computational cost. Have you done any tests to see how much more CPU power someone would need to run this, and how much disk space would be saved?
I store some blockchain data in squashfs (with xz compression); it stores 75 GB of data in a 57 GB file, so the compression ratio is 24%. CPU usage is indiscernible.
This algorithm compresses and decompresses each block dynamically. The bigger the block, the higher the compression ratio.
|
|
|
Compression features usually come with an increase in computational cost. Have you done any tests to see how much more CPU power someone would need to run this, and how much disk space would be saved?
The compression algorithm has been used in Vpncoin with LZMA (7-Zip). In the Windows environment, the impact on CPU can be ignored, and it can save 20%~25% and even more disk space. As you know, the more content, the higher the compression ratio: the bigger the block, the higher the compression ratio. I think there would be an even higher compression ratio on Bitcoin, because Bitcoin's blocks are relatively large.
|
|
|
I don't quite understand: can this be used with Bitcoin Core?
Yes.
|
|
|
Some related functions:

#include "lz4/lz4.h"
#include "lzma/LzmaLib.h"

int StreamToBuffer(CDataStream &ds, string& sRzt, int iSaveBufSize)
{
    int bsz = ds.size();
    int iRsz = bsz;
    if( iSaveBufSize > 0 ){ iRsz = iRsz + 4; }
    sRzt.resize(iRsz);
    char* ppp = (char*)sRzt.c_str();
    if( iSaveBufSize > 0 ){ ppp = ppp + 4; }
    ds.read(ppp, bsz);
    if( iSaveBufSize > 0 ){ *(unsigned int *)(ppp - 4) = bsz; }
    return iRsz;
}

int CBlockToBuffer(CBlock *pb, string& sRzt)
{
    CDataStream ssBlock(SER_DISK, CLIENT_VERSION);
    ssBlock << (*pb);
    return StreamToBuffer(ssBlock, sRzt, 0);
}

int writeBufToFile(char* pBuf, int bufLen, string fName)
{
    int rzt = 0;
    std::ofstream oFs(fName.c_str(), std::ios::out | std::ofstream::binary);
    if( oFs.is_open() )
    {
        if( pBuf ) oFs.write(pBuf, bufLen);
        oFs.close();
        rzt++;
    }
    return rzt;
}

int lz4_pack_buf(char* pBuf, int bufLen, string& sRzt)
{
    int worstCase = 0;
    int lenComp = 0;
    try{
        worstCase = LZ4_compressBound( bufLen );
        sRzt.resize(worstCase + 4);
        char* pp = (char *)sRzt.c_str();
        lenComp = LZ4_compress(pBuf, pp + 4, bufLen);
        if( lenComp > 0 ){
            *(unsigned int *)pp = bufLen;   // store the uncompressed size in the first 4 bytes
            lenComp = lenComp + 4;
        }
    }
    catch (std::exception &e) {
        printf("lz4_pack_buf err [%s]:: buf len %d, worstCase[%d], lenComp[%d] \n", e.what(), bufLen, worstCase, lenComp);
    }
    return lenComp;
}

int lz4_unpack_buf(const char* pZipBuf, unsigned int zipLen, string& sRzt)
{
    int rzt = 0;
    unsigned int realSz = *(unsigned int *)pZipBuf;   // read back the uncompressed size
    if( fDebug ) printf("lz4_unpack_buf:: zipLen [%d], realSz [%d], \n", zipLen, realSz);
    sRzt.resize(realSz);
    char* pOutData = (char*)sRzt.c_str();
    // -- decompress
    rzt = LZ4_decompress_safe(pZipBuf + 4, pOutData, zipLen, realSz);
    if( rzt != (int)realSz )
    {
        if( fDebug ) printf("lz4_unpack_buf:: Could not decompress message data. [%d :: %d] \n", rzt, realSz);
        sRzt.resize(0);
    }
    return rzt;
}

int CBlockFromBuffer(CBlock* block, char* pBuf, int bufLen)
{
    CDataStream ssBlock(SER_DISK, CLIENT_VERSION);
    ssBlock.write(pBuf, bufLen);
    int i = ssBlock.size();
    ssBlock >> (*block);
    return i;
}

int lz4_pack_block(CBlock* block, string& sRzt)
{
    int rzt = 0;
    string sbf;
    int bsz = CBlockToBuffer(block, sbf);
    if( bsz > 12 )
    {
        char* pBuf = (char*)sbf.c_str();
        rzt = lz4_pack_buf(pBuf, bsz, sRzt);
    }
    sbf.resize(0);
    return rzt;
}
int lzma_depack_buf(unsigned char* pLzmaBuf, int bufLen, string& sRzt)
{
    int rzt = 0;
    unsigned int dstLen = *(unsigned int *)pLzmaBuf;   // uncompressed size, stored in the first 4 bytes
    sRzt.resize(dstLen);
    unsigned char* pOutBuf = (unsigned char*)sRzt.c_str();
    unsigned srcLen = bufLen - LZMA_PROPS_SIZE - 4;
    SRes res = LzmaUncompress(pOutBuf, &dstLen, &pLzmaBuf[LZMA_PROPS_SIZE + 4], &srcLen,
                              &pLzmaBuf[4], LZMA_PROPS_SIZE);
    if( res == SZ_OK )
        rzt = dstLen;
    else
        sRzt.resize(0);
    if( fDebug ) printf("lzma_depack_buf:: res [%d], dstLen[%d], rzt = [%d]\n", res, dstLen, rzt);
    return rzt;
}

int lzma_pack_buf(unsigned char* pBuf, int bufLen, string& sRzt, int iLevel, unsigned int iDictSize) // (1 << 17) = 131072 = 128K
{
    int res = 0;
    int rzt = 0;
    unsigned propsSize = LZMA_PROPS_SIZE;
    unsigned destLen = bufLen + (bufLen / 3) + 128;
    try{
        sRzt.resize(propsSize + destLen + 4);
        unsigned char* pOutBuf = (unsigned char*)sRzt.c_str();
        res = LzmaCompress(&pOutBuf[LZMA_PROPS_SIZE + 4], &destLen, pBuf, bufLen, &pOutBuf[4], &propsSize,
                           iLevel, iDictSize, -1, -1, -1, -1, -1); // 1 << 14 = 16K, 1 << 16 = 64K
        if( (res == SZ_OK) && (propsSize == LZMA_PROPS_SIZE) )
        {
            *(unsigned int *)pOutBuf = bufLen;   // store the uncompressed size in the first 4 bytes
            rzt = propsSize + destLen + 4;
        }
        else
            sRzt.resize(0);
    }
    catch (std::exception &e) {
        printf("lzma_pack_buf err [%s]:: buf len %d, rzt[%d] \n", e.what(), bufLen, rzt);
    }
    if( fDebug ) printf("lzma_pack_buf:: res [%d], propsSize[%d], destLen[%d], rzt = [%d]\n", res, propsSize, destLen, rzt);
    return rzt;
}

int lzma_pack_block(CBlock* block, string& sRzt, int iLevel, unsigned int iDictSize) // (1 << 17) = 131072 = 128K
{
    int rzt = 0;
    string sbf;
    int bsz = CBlockToBuffer(block, sbf);
    if( bsz > 12 )
    {
        unsigned char* pBuf = (unsigned char*)sbf.c_str();
        rzt = lzma_pack_buf(pBuf, bsz, sRzt, iLevel, iDictSize);
    }
    sbf.resize(0);
    return rzt;
}

int bitnet_pack_block(CBlock* block, string& sRzt)
{
    if( dw_zip_block == 1 )
        return lzma_pack_block(block, sRzt, 9, uint_256KB);
    else if( dw_zip_block == 2 )
        return lz4_pack_block(block, sRzt);
    return 0;   // no compression configured
}

bool getCBlockByFilePos(CAutoFile filein, unsigned int nBlockPos, CBlock* block)
{
    bool rzt = false;
    int ips = nBlockPos - 4;   // the 4 bytes before the block hold the compressed size
    if (fseek(filein, ips, SEEK_SET) != 0)
        return error("getCBlockByFilePos:: fseek failed");
    filein >> ips;   // read the compressed block size
    if( fDebug ) printf("getCBlockByFilePos:: ziped block size [%d] \n", ips);
    string s;
    s.resize(ips);
    char* pZipBuf = (char *)s.c_str();
    filein.read(pZipBuf, ips);
    string sUnpak;
    int iRealSz = 0;
    if( dw_zip_block == 1 )
        iRealSz = lzma_depack_buf((unsigned char*)pZipBuf, ips, sUnpak);
    else if( dw_zip_block == 2 )
        iRealSz = lz4_unpack_buf(pZipBuf, ips - 4, sUnpak);
    if( fDebug ) printf("getCBlockByFilePos:: ziped block size [%d], iRealSz [%d] \n", ips, iRealSz);
    if( iRealSz > 0 )
    {
        pZipBuf = (char *)sUnpak.c_str();
        rzt = CBlockFromBuffer(block, pZipBuf, iRealSz) > 12;
    }
    s.resize(0);
    sUnpak.resize(0);
    return rzt;
}

bool getCBlocksTxByFilePos(CAutoFile filein, unsigned int nBlockPos, unsigned int txId, CTransaction& tx)
{
    bool rzt = false;
    CBlock block;
    rzt = getCBlockByFilePos(filein, nBlockPos, &block);
    if( rzt )
    {
        if( block.vtx.size() > txId )
        {
            tx = block.vtx[txId];
            if( fDebug ){ printf("\n\n getCBlocksTxByFilePos:: tx info: \n"); tx.print(); }
        }
        else
            rzt = false;
    }
    return rzt;
}
|
|
|
To cut a long story short, I'll show the source code directly. Add this to init.cpp:

int dw_zip_block = 0;
int dw_zip_limit_size = 0;
int dw_zip_txdb = 0;

bool AppInit2()
{
    ...
    // ********************************************************* Step 2: parameter interactions
#ifdef WIN32
    dw_zip_block = GetArg("-zipblock", 1);
#else
    /* The 7-Zip source code needs improvement on Linux; it works, but sometimes it crashes. */
    dw_zip_block = GetArg("-zipblock", 0);
#endif
    dw_zip_limit_size = GetArg("-ziplimitsize", 64);
    dw_zip_txdb = GetArg("-ziptxdb", 0);
    if( dw_zip_block > 1 ){ dw_zip_block = 1; }
    else if( dw_zip_block == 0 ){ dw_zip_txdb = 0; }
    ...
}

Add this to main.h:

extern int bitnet_pack_block(CBlock* block, string& sRzt);
extern bool getCBlockByFilePos(CAutoFile filein, unsigned int nBlockPos, CBlock* block);
extern bool getCBlocksTxByFilePos(CAutoFile filein, unsigned int nBlockPos, unsigned int txId, CTransaction& tx);
extern int dw_zip_block;
class CTransaction
{
    ...
    bool ReadFromDisk(CDiskTxPos pos, FILE** pfileRet=NULL)
    {
        CAutoFile filein = CAutoFile(OpenBlockFile(pos.nFile, 0, pfileRet ? "rb+" : "rb"), SER_DISK, CLIENT_VERSION);
        if (!filein)
            return error("CTransaction::ReadFromDisk() : OpenBlockFile failed");

        if( dw_zip_block > 0 )
        {
            // The whole compressed block is read and decompressed to extract one transaction.
            getCBlocksTxByFilePos(filein, pos.nBlockPos, pos.nTxPos, *this);
        }
        else
        {
            // Read transaction
            if (fseek(filein, pos.nTxPos, SEEK_SET) != 0)
                return error("CTransaction::ReadFromDisk() : fseek failed");
            try {
                filein >> *this;
            }
            catch (std::exception &e) {
                return error("%s() : deserialize or I/O error", __PRETTY_FUNCTION__);
            }
        }

        // Return file pointer
        if (pfileRet)
        {
            if (fseek(filein, pos.nTxPos, SEEK_SET) != 0)
                return error("CTransaction::ReadFromDisk() : second fseek failed");
            *pfileRet = filein.release();
        }
        return true;
    }
    ...
}

class CBlock
{
    ...
    bool WriteToDisk(unsigned int& nFileRet, unsigned int& nBlockPosRet, bool bForceWrite = false)
    {
        // Open history file to append
        CAutoFile fileout = CAutoFile(AppendBlockFile(nFileRet), SER_DISK, CLIENT_VERSION);
        if (!fileout)
            return error("CBlock::WriteToDisk() : AppendBlockFile failed");

        // Write index header
        unsigned int nSize = fileout.GetSerializeSize(*this);

        int nSize2 = nSize;   // uncompressed size, kept for debugging
        string sRzt;
        if( dw_zip_block > 0 )
        {
            // compression block +++
            nSize = bitnet_pack_block(this, sRzt);   // nSize includes 4 bytes (the block's real size)
            // compression block +++
        }
        fileout << FLATDATA(pchMessageStart) << nSize;

        // Write block
        long fileOutPos = ftell(fileout);
        if (fileOutPos < 0)
            return error("CBlock::WriteToDisk() : ftell failed");
        nBlockPosRet = fileOutPos;
        if( dw_zip_block == 0 ){
            fileout << *this;
        }
        else{
            //if( fDebug ) printf("main.h Block.WriteToDisk:: nFileRet [%d], nBlockSize [%d], zipBlockSize [%d], nBlockPosRet = [%d] \n", nFileRet, nSize2, nSize, nBlockPosRet);
            // compression block +++
            if( nSize > 0 ){ fileout.write(sRzt.c_str(), nSize); }
            sRzt.resize(0);
            // compression block +++
        }

        // Flush stdio buffers and commit to disk before returning
        fflush(fileout);
        if( bForceWrite || (!IsInitialBlockDownload() || (nBestHeight+1) % 500 == 0) )
            FileCommit(fileout);

        return true;
    }

    bool ReadFromDisk(unsigned int nFile, unsigned int nBlockPos, bool fReadTransactions=true)
    {
        SetNull();
        unsigned int iPos = nBlockPos;
        if( dw_zip_block > 0 ){ iPos = 0; }

        // Open history file to read
        CAutoFile filein = CAutoFile(OpenBlockFile(nFile, iPos, "rb"), SER_DISK, CLIENT_VERSION);
        if (!filein)
            return error("CBlock::ReadFromDisk() : OpenBlockFile failed");
        if (!fReadTransactions)
            filein.nType |= SER_BLOCKHEADERONLY;

        // Read block
        try {
            if( dw_zip_block > 0 ){
                getCBlockByFilePos(filein, nBlockPos, this);
            }else{
                filein >> *this;
            }
        }
        catch (std::exception &e) {
            return error("%s() : deserialize or I/O error", __PRETTY_FUNCTION__);
        }

        // Check the header
        if (fReadTransactions && IsProofOfWork() && !CheckProofOfWork(GetPoWHash(), nBits))
            return error("CBlock::ReadFromDisk() : errors in block header");

        return true;
    }
    ...
}
|
|
|
To moderator gmaxwell: This source code is just for testing; it compiles and runs on Ubuntu and Windows, and it is compatible with Bitcoin; it does not fork Bitcoin. Please don't move it, thanks.
Hello, I am Vpncoin's dev, nice to meet you. We invented a blockchain compression algorithm. It can reduce disk space by about 25% and reduce network traffic. We are happy to share it, and it is free. The added source code is compatible and will not fork Bitcoin. The core compression algorithms are LZMA (7-Zip) and LZ4. Our compression code has been deployed in Vpncoin and runs stably. If someone wants to use this code, please credit the authors (Vpncoin development team, Bit Lee). If you are interested in this, please post here and I will publish the relevant source code. Thanks.
|
|
|
bitnet.cc is better than bitnet.pw. Thanks.
bitnet.wang does not resolve. What is the IP?
Same as bitnet.cc. Someone said it was DNS pollution.
|
|
|
Dev, the official website domain has expired; you should renew it immediately.
Which domain name? bitnet.wang expires 2017-09-11; bitnet.cc expires 2018-12-02. Thanks.
http://ww2.bitnet.pw/?folio=7POYGN0G2
bitnet.cc is better than bitnet.pw. Thanks.
|
|
|
|