Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: kerzane on May 19, 2013, 02:40:00 PM



Title: Blockchain parsing
Post by: kerzane on May 19, 2013, 02:40:00 PM

I'm writing a parser for the blockchain for myself. I've been following the specs at

https://en.bitcoin.it/wiki/Genesis_block
https://en.bitcoin.it/wiki/Protocol_specification
and
https://en.bitcoin.it/wiki/Transactions

I've got to the point where I can follow the chain down to block #29664.

My problem is that I don't know how to infer the length of certain data entries, namely:

 - The number of transactions in a block
 - The number of inputs to a transaction
 - The number of outputs to a transaction
 - The length of the transaction input script
 - The length of the transaction output script

All of these entries are stated to be of length 1-9 bytes in the protocol specs.

But surely we need to have some way of predicting what length to read for these entries, in order to parse the chain successfully.

The (obviously flawed) assumption that they are always 1 byte long fails at block #29664.

The only solution I can think of is trial and error until the block is parsed successfully (i.e. the magic bytes are correctly found at the start of the next block), but this seems really unsatisfactory.

Anyone know the solution to this problem?


Title: Re: Blockchain parsing
Post by: kerzane on May 19, 2013, 02:59:20 PM
Bump, help a man out!


Title: Re: Blockchain parsing
Post by: Nikinger on May 19, 2013, 03:36:09 PM
Have you taken a look at the source code of at least one Bitcoin client?


Title: Re: Blockchain parsing
Post by: kerzane on May 19, 2013, 05:09:31 PM
Have you taken a look at the source code of at least one Bitcoin client?

Thanks for your reply!

I haven't, but I don't know what I would be looking for. I know what is in the data; I just don't know how long those fields are in any given block. I don't think client code would help me.

I could have a look at another parser, but the whole reason I decided to write my own was that I couldn't find a clear, simple one, and up to now it's been quite straightforward.

Any more help is welcome!


Title: Re: Blockchain parsing
Post by: penguinn on May 19, 2013, 06:37:50 PM
I want more!


Title: Re: Blockchain parsing
Post by: No 1 on May 19, 2013, 06:51:52 PM
Looks interesting. I'll be watching this.


Title: Re: Blockchain parsing
Post by: Zeilap on May 19, 2013, 11:08:56 PM
Anyone know the solution to this problem?

https://en.bitcoin.it/wiki/Protocol_specification#Variable_length_integer

Read the first byte as a uint8 and check its value. There are four cases:

< 0xFD:  use the value as it is
  0xFD:  read 2 more bytes as uint16
  0xFE:  read 4 more bytes as uint32
  0xFF:  read 8 more bytes as uint64
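
The four cases above could be sketched in Python roughly as follows. This is only an illustration, not code from any particular client: the function name read_varint and the choice of a file-like stream argument are assumptions made for the example. The multi-byte values are decoded as little-endian, as the linked wiki page specifies.

```python
import io
import struct

def read_varint(stream):
    """Read one variable-length integer from a binary stream.

    Hypothetical helper: 'stream' is any object with a read(n) method
    that returns bytes, e.g. an open block file or io.BytesIO.
    """
    prefix = stream.read(1)
    if len(prefix) != 1:
        raise EOFError("stream ended before the varint prefix")
    first = prefix[0]

    # Values below 0xFD are stored directly in the first byte.
    if first < 0xFD:
        return first

    # Otherwise the first byte is a marker that says how many more
    # bytes to read; "<" in the struct format means little-endian.
    size, fmt = {0xFD: (2, "<H"), 0xFE: (4, "<I"), 0xFF: (8, "<Q")}[first]
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("stream ended inside the varint payload")
    return struct.unpack(fmt, payload)[0]

# Example: marker 0xFD followed by the two bytes 03 02 (little-endian 0x0203).
example = io.BytesIO(bytes.fromhex("fd0302"))
print(read_varint(example))  # 515
```

Because the function consumes exactly as many bytes as the encoding uses, the stream is left at the start of the next field, so the same call works for the transaction count, the input and output counts, and both script lengths.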


Title: Re: Blockchain parsing
Post by: walf_man on May 20, 2013, 07:09:23 AM
nice I need more


Title: Re: Blockchain parsing
Post by: kerzane on May 20, 2013, 12:27:00 PM
Thanks Zeilap, I think I follow. I hadn't found that entry in the specification before; I'll try it out when I get a chance.


Title: Re: Blockchain parsing
Post by: kerzane on May 20, 2013, 01:41:37 PM
Works like a charm, cheers Zeilap. I haz blockchain!!