Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: Jimmy2011 on September 02, 2011, 12:33:04 AM



Title: How to read block file?
Post by: Jimmy2011 on September 02, 2011, 12:33:04 AM
I just want to read the block header information from the block file. Is there a simple C/C++ code snippet that does this, which I could then adapt for my own purposes? Thanks!


Title: Re: How to read block file?
Post by: etotheipi on September 02, 2011, 07:45:37 AM
I don't have the code directly in front of me, and I'm very short on time.  But if it helps, I've done this before, and it is actually quite simple.  I just don't have time to go dig up my source code right now... perhaps tomorrow if you don't have it yet.

In blk0001.dat, the first four bytes of every block are the magic number (f9beb4d9), followed by 4 bytes giving the number of bytes in the block, N.  The next 80 bytes are the header.  The remaining N-80 bytes are the block data, which can be ignored.

Rinse, repeat.
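
As a rough standalone illustration of the layout described above (not the poster's code), here is a minimal sketch using only the C++ standard library. The names RawHeader, readUint32LE and readBlockHeaders are hypothetical, and the sketch assumes the length field is a little-endian 32-bit integer and that the file uses the main-network magic bytes; it stops at the first record that does not match.

Code:
```cpp
#include <array>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// One raw 80-byte block header, exactly as stored on disk.
using RawHeader = std::array<unsigned char, 80>;

// Read a little-endian uint32 from the stream; returns false on a short read.
static bool readUint32LE(std::istream& in, uint32_t& out)
{
    unsigned char b[4];
    if (!in.read(reinterpret_cast<char*>(b), 4)) return false;
    out = uint32_t(b[0]) | (uint32_t(b[1]) << 8) |
          (uint32_t(b[2]) << 16) | (uint32_t(b[3]) << 24);
    return true;
}

// Hypothetical helper: collect every 80-byte header found in a block file.
std::vector<RawHeader> readBlockHeaders(const std::string& path)
{
    std::vector<RawHeader> headers;
    std::ifstream in(path, std::ios::binary);
    uint32_t magic = 0, numBlockBytes = 0;

    while (readUint32LE(in, magic))
    {
        // The bytes f9 be b4 d9 read as 0xD9B4BEF9 when taken little-endian.
        if (magic != 0xD9B4BEF9u) break;

        // N must be large enough to hold at least the header.
        if (!readUint32LE(in, numBlockBytes) || numBlockBytes < 80) break;

        RawHeader hdr;
        if (!in.read(reinterpret_cast<char*>(hdr.data()), hdr.size())) break;
        headers.push_back(hdr);

        // Skip the N-80 bytes of block data that follow the header.
        in.seekg(numBlockBytes - 80, std::ios::cur);
    }
    return headers;
}
```

Calling readBlockHeaders with the path to blk0001.dat would return the headers in file order. A file that cannot be opened simply yields an empty vector.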


Title: Re: How to read block file?
Post by: etotheipi on September 04, 2011, 01:04:34 AM
Here's the relevant code from my project. I pulled out a lot of validity checking and it uses my own data structures, so it's not directly usable and is for informational purposes only, but you should be able to adapt it to your project.  The structure of blk0001.dat really is quite simple:

Code:
 4 | 4 | 80 | TxData | 4 | 4 | 80 | TxData | 4 | 4 | 80 | TxData | ...

First 4 bytes      - magic bytes (identifying which network you are on)
Second 4 bytes     - the number of bytes in the rest of the block (NumBlockBytes)
Next 80 bytes      - the block header itself
NumBlockBytes-80   - transaction data in this block [ numTx | Tx1 | Tx2 | Tx3 | ... ]


Code:
uint32_t importHeadersFromBlockFile(std::string filename)
{
      // NOTE: bsb (a buffered binary stream over "filename") is declared
      // in code that was omitted from this excerpt.

      BinaryData  thisHash(32);
      BinaryData  magicNum(4);
      BinaryData  thisHeaderSer(80);
      BlockHeader thisHeader;
      uint32_t    numBlockBytes = 0;
      uint32_t    numHeaders    = 0;

      // While there is still data left in the stream (file)...
      while(!bsb.isEof())
      {
            // Get the magic bytes
            magicNum = bsb.reader().get_BinaryData(4);

            // Get total number of bytes in this block (including header)
            numBlockBytes = bsb.reader().get_uint32_t();

            // In case I want to retrieve block data from file later
            uint64_t blkByteOffset = bsb.getFileByteLocation();

            // Pull the header from the block data
            thisHeaderSer = bsb.reader().get_BinaryData(80);

            // Interpret header data and compute hash
            thisHeader.unserialize(thisHeaderSer);
            thisHash = thisHeaderSer.getHash256Digest();

            // Finally, skip the rest of the block data, since we only want headers
            bsb.reader().advance(numBlockBytes-80);
            numHeaders++;
      }

      // The original excerpt had no return statement; returning the
      // number of headers read is an assumption.
      return numHeaders;
}
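
The "interpret header data" step above relies on the poster's own BlockHeader class, which is not shown. As a hedged sketch of what such a step might look like, the following splits the 80 raw bytes into the six fields of the standard Bitcoin block header (version, previous block hash, Merkle root, timestamp, bits, nonce), with integers stored little-endian. The field layout is not stated in this thread, and ParsedHeader, loadUint32LE and parseHeader are hypothetical names. The two hashes are copied in the byte order found on disk, which is the reverse of how block hashes are usually displayed.

Code:
```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical plain struct holding the six header fields.
struct ParsedHeader
{
    uint32_t version;
    std::array<unsigned char, 32> prevBlockHash; // on-disk byte order
    std::array<unsigned char, 32> merkleRoot;    // on-disk byte order
    uint32_t timestamp;                          // Unix time
    uint32_t bits;                               // compact difficulty target
    uint32_t nonce;
};

// Assemble a little-endian uint32 from four bytes.
static uint32_t loadUint32LE(const unsigned char* p)
{
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
           (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// raw must point at exactly 80 bytes of serialized header.
ParsedHeader parseHeader(const unsigned char* raw)
{
    ParsedHeader h;
    h.version = loadUint32LE(raw);                            // bytes  0..3
    std::copy(raw + 4,  raw + 36, h.prevBlockHash.begin());   // bytes  4..35
    std::copy(raw + 36, raw + 68, h.merkleRoot.begin());      // bytes 36..67
    h.timestamp = loadUint32LE(raw + 68);                     // bytes 68..71
    h.bits      = loadUint32LE(raw + 72);                     // bytes 72..75
    h.nonce     = loadUint32LE(raw + 76);                     // bytes 76..79
    return h;
}
```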


Title: Re: How to read block file?
Post by: Jimmy2011 on September 04, 2011, 06:43:31 AM
etotheipi, thank you for your detailed explanation.



Title: Re: How to read block file?
Post by: etotheipi on September 07, 2011, 04:57:10 AM
Btw, one minor detail I left out is that between the header and the TxData there is a var_int giving the number of transactions included in that block.
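
For anyone who does want to read that count, here is a minimal sketch of decoding a var_int, assuming the usual Bitcoin encoding: a first byte below 0xfd is the value itself, while 0xfd, 0xfe and 0xff mean the value follows as a 2-, 4- or 8-byte little-endian integer. The function name readVarInt and its signature are hypothetical.

Code:
```cpp
#include <cstddef>
#include <cstdint>

// Decode a var_int starting at data[0], with len bytes available.
// Returns the number of bytes consumed, or 0 if the buffer is too short.
std::size_t readVarInt(const unsigned char* data, std::size_t len, uint64_t& value)
{
    if (len < 1) return 0;

    const unsigned char first = data[0];
    if (first < 0xfd)
    {
        // Small values are stored directly in the single prefix byte.
        value = first;
        return 1;
    }

    // The prefix byte selects how many little-endian bytes follow.
    std::size_t extra = 8;                 // 0xff
    if (first == 0xfd)      extra = 2;
    else if (first == 0xfe) extra = 4;

    if (len < 1 + extra) return 0;

    value = 0;
    for (std::size_t i = 0; i < extra; ++i)
        value |= uint64_t(data[1 + i]) << (8 * i);
    return 1 + extra;
}
```

When only headers are wanted, none of this is needed, since the length field already tells you how far to skip.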

If you are interested in examining other files, you might consider using my mysteryHex tool that will help you extract unknown file formats:  https://bitcointalk.org/index.php?topic=38336.0

Check out the linked git repo via "git clone git://github.com/etotheipi/PyBtcEngine.git" and run something like:

Code:
python mysteryHex.py -b --byterange=0,1000 -f ~/.bitcoin/blk0001.dat

This will open the blk0001.dat file (-f)  as binary (-b), and read bytes 0-1000.  It will then find everything recognizable in that chunk of data and display the results visually.  It is quite useful for identifying random files/serialized fragments, or picking apart BTC data formats.