Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: NotATether on July 03, 2023, 03:15:08 AM



Title: Calculating the weight units of a transaction
Post by: NotATether on July 03, 2023, 03:15:08 AM
At this point, I am able to successfully decode a raw transaction, but then I read at https://en.bitcoin.it/wiki/Weight_units that the weight of a transaction is not simply its byte length * 4. Instead, the transaction must first be converted into what's called a "P2P protocol block message", which is probably the internal representation in Bitcoin Core (and segwit transactions have certain fields, like the flag and witness data, that consume fewer weight units than others).

The issue is, I can't seem to find any documentation for what a block message is supposed to look like. The only hints I have are the diagrams on the wiki page, and I am not sure if they are accurate.


Title: Re: Calculating the weight units of a transaction
Post by: achow101 on July 03, 2023, 03:47:37 AM
https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki#user-content-Transaction_size_calculations
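
For quick reference, the calculation in that BIP 141 section can be sketched in a few lines of Python. The function names here are made up for illustration. "Base size" is the transaction serialized without marker, flag and witness data, and "total size" is the full serialization including them.

```python
def weight_units(base_size: int, total_size: int) -> int:
    """BIP 141: weight = base size * 3 + total size."""
    return base_size * 3 + total_size


def virtual_size(weight: int) -> int:
    """Virtual size is weight / 4, rounded up to the next integer."""
    return (weight + 3) // 4


# Legacy transaction: no witness, so base size equals total size
# and the weight is exactly 4 * byte length.
legacy_weight = weight_units(226, 226)

# Segwit transaction: marker, flag and witness bytes are only counted
# once, every other byte is effectively counted four times.
segwit_weight = weight_units(113, 223)
```

For a legacy transaction the result is the byte length * 4, which is why that shortcut only fails once witness data is present.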


Title: Re: Calculating the weight units of a transaction
Post by: SeriouslyGiveaway on July 03, 2023, 03:47:50 AM
My answer may be wrong, but I am trying to learn too.

My answer is that a segwit transaction inside a block message is supposed to contain the flag byte 0x01.
Segwit wallet developers. Transaction serialization (https://bitcoincore.org/en/segwit_wallet_dev/)
Quote
A segwit-compatible wallet MUST also support the new serialization format, as nVersion|marker|flag|txins|txouts|witness|nLockTime
Format of nVersion, txins, txouts, and nLockTime are same as the original format
The marker MUST be 0x00
The flag MUST be 0x01

BIP 0141 (https://en.bitcoin.it/wiki/BIP_0141)
Quote
The flag MUST be a 1-byte non-zero value. Currently, 0x01 MUST be used.

Transaction size calculator (https://bitcoinops.org/en/tools/calc-size/)
Quote
Only in transactions spending one or more segwit UTXOs:

Segwit marker & segwit flag (0.5) A byte sequence used to clearly differentiate segwit transactions from legacy transactions
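
The marker and flag rules quoted above can be checked with a few lines of Python. This is only a sketch: the function name is made up, and it assumes the input is a complete, valid raw transaction. It works because a valid legacy transaction cannot have zero inputs, so a 0x00 byte right after nVersion can only be the segwit marker.

```python
# Marker and flag are counted like witness data: 1 weight unit per byte,
# so together they add 2 weight units (0.5 vbytes).
MARKER_FLAG_WEIGHT = 2


def has_witness_serialization(raw_tx: bytes) -> bool:
    """Return True if raw_tx uses nVersion|marker|flag|... serialization."""
    # Bytes 0..3 are nVersion, byte 4 is the marker, byte 5 is the flag.
    # BIP 141 requires a non-zero flag; currently it is always 0x01.
    return len(raw_tx) > 5 and raw_tx[4] == 0x00 and raw_tx[5] != 0x00
```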

P2P networking (https://developer.bitcoin.org/reference/p2p_networking.html)
Quote
Type identifier 2†: “MSG_WITNESS_BLOCK”

The hash is of a block header; identical to “MSG_BLOCK”. When used in a “getdata” message, this indicates the response should be a block message with transactions that have a witness using witness serialization. Only for use in “getdata” messages.

† These are the same as their respective type identifier but with their 30th bit set to indicate witness. For example MSG_WITNESS_TX = 0x01000040.
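
A small Python sketch of how the witness type identifiers are derived, as far as I understand it. The constant names follow the quoted documentation, while inv_type_bytes is a made-up helper. Note that the numeric value of MSG_WITNESS_TX is 0x40000001; the 01000040 in the quote matches the bytes as they appear on the wire, because the 4-byte type field is serialized in little endian.

```python
import struct

MSG_TX = 1
MSG_BLOCK = 2
MSG_WITNESS_FLAG = 1 << 30  # the witness bit

MSG_WITNESS_TX = MSG_TX | MSG_WITNESS_FLAG
MSG_WITNESS_BLOCK = MSG_BLOCK | MSG_WITNESS_FLAG


def inv_type_bytes(type_id: int) -> bytes:
    """Serialize an inventory type identifier as a 4-byte little-endian integer."""
    return struct.pack("<I", type_id)


print(hex(MSG_WITNESS_TX))                      # 0x40000001
print(inv_type_bytes(MSG_WITNESS_TX).hex())     # 01000040
print(inv_type_bytes(MSG_WITNESS_BLOCK).hex())  # 02000040
```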


Title: Re: Calculating the weight units of a transaction
Post by: NotATether on July 03, 2023, 10:19:21 AM
I read BIPs 141 and 144 but there is one confusion that still bothers me:

Are these diagrams as depicted on the Bitcoin Wiki entry equivalent to the raw (segwit or non-segwit) transaction serialized format?

https://en.bitcoin.it/w/images/en/2/2b/P2pkh-1in-2out_bytes.png
https://en.bitcoin.it/w/images/en/d/d8/P2wpkh-1in-2out_bytes.png


Title: Re: Calculating the weight units of a transaction
Post by: witcher_sense on July 03, 2023, 10:59:36 AM
Look at the serialized transactions here https://hongchao.me/anatomy-of-raw-bitcoin-transaction/ and compare them to the wiki diagrams. From my point of view, a colorized hexadecimal format is much easier to understand.


Title: Re: Calculating the weight units of a transaction
Post by: pooya87 on July 03, 2023, 02:02:43 PM
Those are two very hard to understand pictures, if you ask me.
Maybe I'm looking at the pictures out of context, but for example, why is there the word "witness" in parentheses in the first picture (P2PKH)? There is no witness in P2PKH scripts, and even with a witness, putting it in parentheses in front of "scriptsig" is just misleading in my opinion.

The "count" boxes are also wrong. In the first picture, the ones before the scriptsig and the 2 scriptpubkeys are not counts. They use the same integer encoding scheme (variable length integer), but they show the "length" of the script, not a count. The other two counts, before the outpoint and the first amount, are correct: they show the count of inputs and outputs and are of the same type (var. int).

The same applies to the second picture, with the addition of another "count" behind "sequence" which is wrong and misleading. It is not a count; it is another length, and it is zero (since the P2WPKH scriptsig is empty).
The same goes for the 3 blue counts. The first one is the actual count (2 witness items in this case); the next two "counts" are the lengths of the data items on the stack (e.g. a 72-byte signature and a 33-byte public key).

Should be something like this:
https://i.ibb.co/5jJFKBB/P2wpkh-1in-2out-bytes.jpg
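
To tie the counts and lengths described above back to the original question, here is a rough Python sketch that walks a raw transaction and computes its weight. The function names are made up, there is no real error handling, and the example transaction is dummy data (all-zero outpoint, scripts, signature and public key) that only has the right field sizes for a 1-input, 2-output P2WPKH spend.

```python
import io


def read_compact_size(s: io.BytesIO) -> int:
    """Read a variable length integer (used for both counts and lengths)."""
    first = s.read(1)[0]
    if first < 0xFD:
        return first
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    return int.from_bytes(s.read(width), "little")


def read_var_bytes(s: io.BytesIO) -> bytes:
    """Read a length prefix, then that many bytes."""
    return s.read(read_compact_size(s))


def tx_weight(raw: bytes) -> int:
    s = io.BytesIO(raw)
    s.read(4)                                   # nVersion
    segwit = raw[4] == 0x00 and raw[5] != 0x00  # marker and flag
    witness_bytes = 0
    if segwit:
        s.read(2)
        witness_bytes += 2
    n_inputs = read_compact_size(s)             # a real count
    for _ in range(n_inputs):
        s.read(36)                              # outpoint: txid + index
        read_var_bytes(s)                       # scriptSig length + script
        s.read(4)                               # sequence
    for _ in range(read_compact_size(s)):       # output count
        s.read(8)                               # amount
        read_var_bytes(s)                       # scriptPubKey length + script
    if segwit:
        start = s.tell()
        for _ in range(n_inputs):               # one witness stack per input
            for _ in range(read_compact_size(s)):  # witness item count
                read_var_bytes(s)               # item length + item data
        witness_bytes += s.tell() - start
    s.read(4)                                   # nLockTime
    if s.tell() != len(raw):
        raise ValueError("unexpected trailing or missing bytes")
    total_size = len(raw)
    base_size = total_size - witness_bytes
    return base_size * 3 + total_size           # BIP 141 weight


dummy_p2wpkh = (
    bytes.fromhex("02000000") + b"\x00\x01"                      # version, marker, flag
    + b"\x01" + bytes(36) + b"\x00" + bytes.fromhex("ffffffff")  # 1 input, empty scriptSig
    + b"\x02" + (bytes(8) + b"\x16" + bytes(22)) * 2             # 2 outputs, 22-byte scripts
    + b"\x02" + b"\x48" + bytes(72) + b"\x21" + bytes(33)        # 2 witness items
    + bytes(4)                                                   # locktime
)
print(tx_weight(dummy_p2wpkh))  # 562
```

The dummy transaction is 223 bytes in total, of which 110 bytes (marker, flag and witness) are witness-side, so the base size is 113 and the weight is 113 * 3 + 223 = 562.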


Title: Re: Calculating the weight units of a transaction
Post by: NotATether on July 03, 2023, 03:15:45 PM
@witcher_sense @pooya87 Those were very helpful. Just one last problem:

With the exception of all the var_int types, are all of the multi-byte structures in the raw transactions in big endian? I suspect so, because I'm getting crazy values when parsing them as little endian.

EDIT: Sorry, apparently it was big endian conversion this whole time that was causing chaos.


Title: Re: Calculating the weight units of a transaction
Post by: pooya87 on July 03, 2023, 05:24:42 PM
Quote
With the exception of all the var_int types, are all of the multi-byte structures in the raw transactions in big endian?
Version, outpoint index, input sequence, output amount and locktime are all in little endian.
The R and S values in a signature and the public key (integer value) are in big endian.
Any integer inside a script (used by something like OP_ADD) is interpreted as little endian.
Variable length integers indicating the length of the scripts, the input/output/witness_item count and the witness_item length all use a special compact encoding with the integer encoded in little endian.
Variable length integers inside scripts, used to push something to the stack, use a different special compact encoding but also the little endian system.

4 byte representation of one:
0x01000000 <-- little endian
0x00000001 <-- big endian
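
The same thing in Python, as a small sketch with made-up helper names. It decodes fixed-width fields in both byte orders and encodes a variable length integer of the kind used for script lengths and input/output/witness item counts (not the push encoding used inside scripts).

```python
def le_uint(data: bytes) -> int:
    """Decode an unsigned integer stored in little endian."""
    return int.from_bytes(data, "little")


def be_uint(data: bytes) -> int:
    """Decode an unsigned integer stored in big endian."""
    return int.from_bytes(data, "big")


def encode_compact_size(n: int) -> bytes:
    """Encode a count or length as a variable length integer."""
    if n < 0xFD:
        return n.to_bytes(1, "little")
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")


version = bytes.fromhex("01000000")
print(le_uint(version))  # 1
print(be_uint(version))  # 16777216

# An output amount of 100000 satoshi as it appears in a raw transaction.
print((100000).to_bytes(8, "little").hex())  # a086010000000000

print(encode_compact_size(515).hex())  # fd0302
```

Reading the version field in the wrong byte order gives 16777216 instead of 1, which is the kind of "crazy value" that shows up when the endianness is mixed up.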


Title: Re: Calculating the weight units of a transaction
Post by: witcher_sense on July 04, 2023, 08:17:18 AM
The field formats of a legacy bitcoin transaction are well explained here: https://learnmeabitcoin.com/technical/transaction-data.

You can also use this article https://daniel.perez.sh/blog/2020/bitcoin-format/ and this simple bitcoin transaction parser https://github.com/danhper/simple-bitcoin-parser as a reference.