How much rarer are short bitcoin addresses?

RoxxR (OP)

Full Member

Offline

Activity: 209
Merit: 148

How much rarer are short bitcoin addresses?

November 23, 2013, 11:13:00 AM

I'm not very good at math... but I have noticed that 33-character addresses are about 20 times less common than 34-character addresses. Similarly, are 32-character addresses 20 times less common than 33-character addresses (and so on)?

I would really like to see a math guru confirm this.

michagogo

Member

Offline

Activity: 80
Merit: 10

Re: How much rarer are short bitcoin addresses?

November 23, 2013, 04:44:52 PM

http://bitcoin.stackexchange.com/questions/2013/why-does-the-length-of-a-bitcoin-key-vary

BurtW

Legendary

Offline

Activity: 2646
Merit: 1137

All paid signature campaigns should be banned.

Re: How much rarer are short bitcoin addresses?

November 23, 2013, 04:56:09 PM

Quote from: michagogo on November 23, 2013, 04:44:52 PM

http://bitcoin.stackexchange.com/questions/2013/why-does-the-length-of-a-bitcoin-key-vary

That does not answer the question about the distribution of Bitcoin address lengths. I do remember seeing a post about that a long time ago. Try the search function to see if you can find it.

Our family was terrorized by Homeland Security. Read all about it here: http://www.jmwagner.com/ and http://www.burtw.com/ Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!

cr1776

Legendary

Offline

Activity: 4116
Merit: 1306

Re: How much rarer are short bitcoin addresses?

November 24, 2013, 07:31:18 AM
Last edit: November 24, 2013, 08:05:59 AM by cr1776

They just vary due to encoding, so (somewhat simplified) if you had two:
12345678
00098765 could be written as 98765

That doesn't quantify it, but this explains:

Quote

Most Bitcoin addresses are 34 characters. They consist of random digits and uppercase and lowercase letters, with the exception that the uppercase letter "O", uppercase letter "I", lowercase letter "l", and the number "0" are never used to prevent visual ambiguity.
Some Bitcoin addresses can be shorter than 34 characters (as few as 27 in theory) and still be valid. A significant percentage of Bitcoin addresses are only 33 characters, and some addresses may be even shorter. Every Bitcoin address stands for a number - somewhat like an account number. These shorter addresses are valid simply because they stand for numbers that happen to start with zeroes, and when the zeroes are omitted, the encoded address gets shorter.
Several of the characters inside a Bitcoin address are used as a checksum so that typographical errors can be automatically found and rejected. The checksum also allows Bitcoin software to confirm that a 33-character (or shorter) address is in fact valid and isn't simply an address with a missing character.

Quote from: RoxxR on November 23, 2013, 11:13:00 AM

piotr_n

Legendary

Offline

Activity: 2053
Merit: 1356

aka tonikt

Re: How much rarer are short bitcoin addresses?

November 24, 2013, 09:56:58 AM

Quote from: RoxxR on November 23, 2013, 11:13:00 AM

It's as you say, except that the factor is 58, not 20.
Each one character shorter address should be 58 times rarer...

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB 9059 1BE7 B545 CDF3 FD0E

flatfly

Legendary

Offline

Activity: 1092
Merit: 1016

760930

Re: How much rarer are short bitcoin addresses?

November 24, 2013, 10:30:07 AM

Quote from: piotr_n on November 24, 2013, 09:56:58 AM

Quote from: RoxxR on November 23, 2013, 11:13:00 AM

It's as you say, except that the factor is 58, not 20.
Each one character shorter address should be 58 times rarer...

Nope, the factor is not 58. I haven't done the math (not enough time for this now) but from my quick initial testing, the factor seems to be roughly 22.5, at least for 33-character addresses. Don't forget to take leading zeros into account.

piotr_n

Legendary

Offline

Activity: 2053
Merit: 1356

aka tonikt

Re: How much rarer are short bitcoin addresses?

November 24, 2013, 11:05:17 AM

Quote from: flatfly on November 24, 2013, 10:30:07 AM

Quote from: piotr_n on November 24, 2013, 09:56:58 AM

Quote from: RoxxR on November 23, 2013, 11:13:00 AM

It's as you say, except that the factor is 58, not 20.
Each one character shorter address should be 58 times rarer...

You're right - sorry, my mistake.
The first byte is always zero, which means that the first factor is lower.
~~13.140625, if I did the math right... (58*58/256)~~
But any further shorter ones should use the factor 58 / char.

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB 9059 1BE7 B545 CDF3 FD0E

piotr_n

Legendary

Offline

Activity: 2053
Merit: 1356

aka tonikt

Re: How much rarer are short bitcoin addresses?

November 24, 2013, 11:24:28 AM
Last edit: November 24, 2013, 11:38:47 AM by piotr_n

I believe the proper calculation for the first factor comes from the value:

Code:

0x00FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF / 58 = 0x0469ee58469ee58469ee58469ee58469ee58469ee58469ee

And the chance of 0x0469ee58469ee58469ee58469ee58469ee58469ee58469ee having... something to do with mod/div 58 - is the first factor.

Something like that... Haven't had my fifth coffee yet today Wink

EDIT:
If this time I did not get it wrong again, it might go like this (python code):

Code:

>>> x = 0x00FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF / 58
>>> 1 / (1 - float(x - 58**31) / x)
23.337985918792462

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB 9059 1BE7 B545 CDF3 FD0E

flatfly

Legendary

Offline

Activity: 1092
Merit: 1016

760930

Re: How much rarer are short bitcoin addresses?

November 24, 2013, 02:18:10 PM

Quote from: piotr_n on November 24, 2013, 11:24:28 AM

I believe the proper calculation for the first factor comes from the value:

Code:

0x00FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF / 58 = 0x0469ee58469ee58469ee58469ee58469ee58469ee58469ee

And the chance of 0x0469ee58469ee58469ee58469ee58469ee58469ee58469ee having... something to do with mod/div 58 - is the first factor.

Something like that... Haven't had my fifth coffee yet today Wink

EDIT:
If this time I did not get it wrong again, it might go like this (python code):

Code:

>>> x = 0x00FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF / 58
>>> 1 / (1 - float(x - 58**31) / x)
23.337985918792462

That looks much better

However the next factors are much, MUCH higher than 58, it seems. I haven't been able to figure it out completely, though. This question is trickier than it seems...

piotr_n

Legendary

Offline

Activity: 2053
Merit: 1356

aka tonikt

Re: How much rarer are short bitcoin addresses?

November 24, 2013, 04:58:53 PM
Last edit: November 24, 2013, 05:18:26 PM by piotr_n

#10

Quote from: flatfly on November 24, 2013, 02:18:10 PM

However the next factors are much, MUCH higher than 58, it seems. I haven't been able to figure it out completely, though. This question is trickier than it seems...

I think I know why.

It is the part of satoshi's EncodeBase58() function:

Code:

    // Leading zeroes encoded as base58 zeros
    for (const unsigned char* p = pbegin; p < pend && *p == 0; p++)
        str += pszBase58[0];

So for every zero-byte at the beginning, the encode function adds '1' in front of the string - effectively forcing the string to come out longer.

Using this function, I've made a simulation on 20 millions random addresses and indeed not even once we get a string shorter than 32 characters. The statistical results are like this:

Code:

34 chars: ~96.026255%
33 chars: ~3.973735%
32 chars: ~0.00001%

But when we modify the EncodeBase58 function to break the loop (after putting the first '1' in front), then after the 20 million rounds we get stats like this:

Code:

34 chars: 95.714475%
33 chars: 4.210685%
32 chars: 0.073435%
31 chars: 0.00139%
30 chars: 0.000015%

So in other words: if you want to replace trailing '11..' with single '1..' - then you get the expected factor 58, for the even shorter addresses.

And the bottom line is that EncodeBase58 function in the satoshi client is stupidly implemented, because there is no reason whatsoever for an address to start with more than one '1', and even if it does you can just remove the extra ones and the address will still work just fine.

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB 9059 1BE7 B545 CDF3 FD0E

flatfly

Legendary

Offline

Activity: 1092
Merit: 1016

760930

Re: How much rarer are short bitcoin addresses?

November 24, 2013, 10:37:36 PM

#11

Quote from: piotr_n on November 24, 2013, 04:58:53 PM

Quote from: flatfly on November 24, 2013, 02:18:10 PM

However the next factors are much, MUCH higher than 58, it seems. I haven't been able to figure it out completely, though. This question is trickier than it seems...

I think I know why.

It is the part of satoshi's EncodeBase58() function:

Code:

    // Leading zeroes encoded as base58 zeros
    for (const unsigned char* p = pbegin; p < pend && *p == 0; p++)
        str += pszBase58[0];

Code:

34 chars: ~96.026255%
33 chars: ~3.973735%
32 chars: ~0.00001%

But when we modify the EncodeBase58 function to break the loop (after putting the first '1' in front), then after the 20 million rounds we get stats like this:

Code:

34 chars: 95.714475%
33 chars: 4.210685%
32 chars: 0.073435%
31 chars: 0.00139%
30 chars: 0.000015%

Very interesting. Thanks for running the tests! Perhaps the function was coded this way to avoid confusing newbies with address lengths that vary too much? Hard to be sure... but i can't think of another reason.

piotr_n

Legendary

Offline

Activity: 2053
Merit: 1356

aka tonikt

⇾ Re: How much rarer are short bitcoin addresses?

November 25, 2013, 09:33:58 AM

#12

Quote from: flatfly on November 24, 2013, 10:37:36 PM

Perhaps the function was coded this way to avoid confusing newbies with address lengths that vary too much? Hard to be sure... but i can't think of another reason.

I believe it is to assure that the specific implementation of the decode function will always return 25 bytes, since no padding is being done there.

BTW, in theory there are eight possible "valid" addresses with the length of 26 characters:

Code:

11111111111111111111BZbvjr
11111111111111111111HeBAGj
11111111111111111111QekFQw
11111111111111111111UpYBrS
11111111111111111111g4hiWR
11111111111111111111jGyPM8
11111111111111111111o9FmEC
11111111111111111111ufYVpS

And I was wrong again; removing any of the trailing '1' will cause the satoshi client to assume the address as "invalid", even though it is technically quite valid, since e.g. both; 1BZbvjr and 11111111111111111111BZbvjr represent exactly the same 200-bit number (which a bitcoin address essentially is).

But IMHO it is a flaw in the implementation - e.g. my client considers addresses with redundant trailing '1' characters are valid and equal.
Hell you can even skip the first '1' - the 32-bit checksum that si there anyway should be just fine to assure no typos.

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB 9059 1BE7 B545 CDF3 FD0E

RoxxR (OP)

Full Member

Offline

Activity: 209
Merit: 148

Re: How much rarer are short bitcoin addresses?

November 25, 2013, 10:03:30 AM

#13

Thanks for this interesting discussion.

Code:

34 chars: ~96.026255%
33 chars: ~3.973735%
32 chars: ~0.00001%

So if I understand this right, I've got my answer:
32 characters are 400,000 times rarer than 33 characters.

Are 31 characters also 400,000 times rarer than 32 characters, or does the factor differ again?

piotr_n

Legendary

Offline

Activity: 2053
Merit: 1356

aka tonikt

Re: How much rarer are short bitcoin addresses?

November 25, 2013, 11:37:08 AM
Last edit: November 25, 2013, 10:29:51 PM by piotr_n

#14

Quote from: RoxxR on November 25, 2013, 10:03:30 AM

So if I understand this right, I've got my answer:
32 characters are 400,000 times rarer than 33 characters.

Statistically, but in this case, there where were only two 32-chars long addresses generated withing the 20 million random samples so the number is not very precise.
It gives you an idea about the magnitude, but the 400,000 might just as well be 150,000 or 1,000,000

Quote from: RoxxR on November 25, 2013, 10:03:30 AM

So if I understand this right, I've got my answer:
Are 31 characters also 400,000 times rarer than 32 characters, or does the factor differ again?

That's a good question...
Unfortunately I don't know the answer nor have resources to run such a long simulation.
Though back in my high school (which was very long ago), if I had asked my math teacher, she would have probable known how to calculate it

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB 9059 1BE7 B545 CDF3 FD0E

Pages: [1]

Bitcoin Forum > Bitcoin > Development & Technical Discussion > How much rarer are short bitcoin addresses?

« previous topic next topic »