Could somebody please explain why the limits for the byte length have always been set way lower than what the little-endian encoding would allow?
E.g., a length of 1 would allow for values up to 255.
That is just the number of bit combinations you can have in 8 bits: counting from 0000-0000 upward without duplicates, you get a total of 256 different values (including 0).
That is an unsigned 8-bit integer, whose maximum value is 2^8 - 1 = 255.
The value in the code you posted is not a plain unsigned byte, though: one bit of the first byte is reserved as a flag. A single byte looks like xxxx-xxx1, which leaves 7 payload bits and a maximum of 2^7 - 1 = 127; anything above 127 takes 2 bytes. If only one flag bit were used, 2 bytes would give 2^15 - 1 = 32767, but the code says the limit is 16511. That is because the number of trailing zero bits in the first byte encodes the number of extra bytes: a 2-byte value looks like xxxx-xx10 xxxx-xxxx, leaving 14 payload bits, and each range is offset past the shorter encodings, giving 2^14 + 127 = 16511. The next step is 3 bytes, or 2^21 + 16511 = 2113663, and so on.
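The limits follow a simple pattern: an n-byte encoding carries 7n payload bits, offset past everything the shorter encodings can represent. A small Python sketch (the function name is mine, not from Typical's codebase) reproduces the numbers above:

```python
def max_value(num_bytes: int) -> int:
    # An n-byte encoding holds 7*n payload bits, biased past the
    # ranges covered by all shorter encodings, so the cumulative
    # maximum is (2^7 + 2^14 + ... + 2^(7n)) - 1.
    return sum(1 << (7 * k) for k in range(1, num_bytes + 1)) - 1

print(max_value(1))  # 127
print(max_value(2))  # 16511
print(max_value(3))  # 2113663
```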
The number of trailing zero bits in the first byte indicates how many subsequent bytes there are. This allows the number of bytes in an encoded integer to be efficiently determined with a single instruction (e.g., BSF or TZCNT) on most modern processors.
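To make that concrete, here is a rough Python sketch of an encoder/decoder consistent with the limits quoted above. It is one plausible implementation, not Typical's actual code; the exact bit placement and biasing in the real library may differ.

```python
def trailing_zeros(byte: int) -> int:
    # Count trailing zero bits in one byte (what BSF/TZCNT would do).
    if byte == 0:
        return 8  # all eight bits zero: eight extra bytes follow
    return (byte & -byte).bit_length() - 1

def encode(value: int) -> bytes:
    # Find the smallest byte count n whose (biased) range covers value.
    n, bias, limit = 1, 0, 127
    while value > limit:
        bias = limit + 1
        n += 1
        limit += 1 << (7 * n)
    # Shift the payload past the n marker bits (a 1 preceded by n-1
    # trailing zeros) and emit the whole word little-endian.
    word = ((value - bias) << n) | (1 << (n - 1))
    return word.to_bytes(n, "little")

def decode(buf: bytes) -> int:
    n = trailing_zeros(buf[0]) + 1  # total number of bytes
    raw = int.from_bytes(buf[:n], "little") >> n
    bias = sum(1 << (7 * k) for k in range(1, n))
    return bias + raw
```

For example, 127 encodes to the single byte 0xFF, while 128 spills into two bytes whose first byte ends in binary 10, signalling one extra byte.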
Source:
https://github.com/stepchowfun/typical

Edited: added more details.