As you may know there are many numeric values used in the bitcoin blockchain, each representing a different thing: version, script lengths, locktime,... and since a block is just a sequence of octets (bytes) that is transferred between nodes and stored on disk, we have to convert these integer values into octet strings (byte arrays) and back.
But what you may not have ever noticed is that in bitcoin, depending on what that integer represents a different approach is chosen for its conversion to bytes, resulting in 6 different ways of encoding integers!
The following is an example transaction from
BIP-143 containing 5 out of 6 methods listed below with each one highlighted according their type:
01000000000102fff7f7881a8099afa6940d42d1e7f6362bec38171ea3edf433541db4e4ad969f
00000000494830450221008b9d1dc26ba6a9cb62127b02742fa9d754cd3bebf337f7a55d114c8e
5cdd30be022040529b194ba3f9281a99f2b1c0a19c0489bc22ede944ccf4ecbab4cc618ef3ed01
eeffffffef51e1b804cc89d182d279655c3aa89e815b1b309fe287d9b2b55d57b90ec68a010000
0000ffffffff02202cb206000000001976a9148280b37df378db99f66f85c95a783a76ac7a6d59
88ac9093510d000000001976a9143bde42dbee7e4dbe6a21b2d50ce2f0167faa815988ac000247
304402203609e17b84f6a7d30c80bfa610b5b4542f32a8a0d5447a12fb1366d7f01cc44a022057
3a954c4518331561406f90300e8f3358f51928d43c212a8caed02de67eebee0121025476c2e831
88368da1ff3e292e7acafcdb3566bb0ad253f62fc70f07aeee635711000000
1. Fixed length little endianThis is the easiest and most common way. It is used for block/transaction version, block time, block target, block nonce, TxIn.Outpoint.Index, TxIn.Sequence and TxOut.Amount and locktime.
The integer will be converted to a little endian byte array of fixed 4 bytes length, with the exception of TxOut.Amount which is 8 bytes.
2. Variable length big endianThis is only used for bigger values in signatures (R and S) using signed notation (most significant bit indicates sign) and public keys (X and Y coordinate). They are always preceded by their length using StackInt (scripts) or CompactInt (witnesses) format.
3. CompactIntThis is a special format used in bitcoin only that can encode from 0 to 2
64-1 values. It is used for script lengths, input/output count, witness item count and witness item length.
4. DerInt(Not an official name) This method is used in DER encoding and can encode from 0 to 2
1008 length integers. This is only used for encoding signatures (only uses up to 33 bytes lengths). It indicates the length part in DER's Tag-Length-Value encoding scheme.
5. StackInt(Not an official name) This method is used in bitcoin only to indicate length of the data that is supposed to be pushed onto the stack. It can encode from 0 to 2
32-1 values.
6. Short form integers inside scriptsIn bitcoin script language, the stack is an array of bytes. Sometimes these bytes could be interpreted as integers, or an integer could be pushed to the stack. To do that a special format is used:
if there is an OP code for the value (OP_0, OP_NegativeOne,...) that single byte is used.
if there is no OP code, integer is converted to byte array in little endian order in shortest form (no extra zeros) and sign is determined based on most significant bit.
Example: How does 254 (0b11111110) look like in each encoding?
Spoiler (select/highlight to see text):
1. 0xfe000000
3. 0xfdfe00
4. 0x81fe
5. 0x4cfe
6. 0xfe00