Each extra character makes it 58 times more difficult to find.
Note that starting with a Capital can be 58 times faster (depending on which character you use): 1Abcdef or 1ABCDEF are much faster than 1abcdef.
Just two questions about the two points:
- Why is it 58 exactly? My guess would be: is it something like 26 + 26 + 10 = 62 (the uppercase and lowercase alphabets at 26 characters each, plus the 10 digits zero through nine) minus four illegal characters?
Yes, 62 minus four illegal characters. That equals “58 exactly”.
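As a quick illustration of the “58 times more difficult per character” claim above, here is a hypothetical sketch of my own (not from the original post) that prints the expected number of attempts for patterns of growing length, under the simplifying assumption that every base58 digit is equally likely in every position (the first position is not quite uniform, as discussed further down):

/* Hypothetical sketch, not from the post: expected attempts for a vanity
 * pattern of k characters, assuming (simplistically) that every base58
 * digit is equally likely in every position. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t attempts = 1;
    for (int k = 1; k <= 8; k++) {
        attempts *= 58;   /* each extra character: 58 times harder */
        printf("%d character(s): ~%llu attempts on average\n",
               k, (unsigned long long)attempts);
    }
    return 0;
}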
Old-style (pre-Bech32) Bitcoin addresses use base58, not base-62. Each character is a radix-58 digit, in the range [0, 57]. The following are the “digits” used by Bitcoin, from an old code snippet of mine. Observe that “I” (uppercase i), “O” (uppercase o), “0” (numeral zero), and “l” (lowercase L) are excluded.
const char base58[59] =
"123456789" /* 9, [0..8] */
"ABCDEFGHJKLMNPQRSTUVWXYZ" /* 24, [9..32] */
"abcdefghijkmnopqrstuvwxyz"; /* 25, [33..57] */
- Why are capital letters easier to find compared to regular characters, and what's the "math" behind it?
Capital letters are not generally easier to find. At the beginning of the address, however, they represent a lower digit value. Since the large integer being represented lies in a range which is not a power of 58, higher digits in the first position may be rare, or even impossible.
For an analogy:
Imagine that you are searching for a pattern of base-10 digits in a 30-bit base-2 (binary) number, written as a zero-padded 10-digit decimal. The number you seek has a range of [0, 1073741823]. Digits [2-9] are impossible in the first position; and digit 1 appears in the first position for only 73741824/1073741824 ≈ 6.9% of randomly selected 30-bit numbers.
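To make that 6.9% figure concrete, here is a small sketch (mine, purely illustrative): a 30-bit value, zero-padded to 10 decimal digits, starts with ‘1’ exactly when it is at least 10^9; everything smaller starts with ‘0’.

/* Quick check of the 6.9% figure (my own sketch): a 30-bit value,
 * zero-padded to 10 decimal digits, starts with '1' exactly when it
 * is >= 10^9; everything smaller starts with '0'. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    const double total       = (double)(1ULL << 30);   /* 1,073,741,824 */
    const double leading_one = total - 1e9;            /* 73,741,824 */

    printf("P(first digit = 1) = %.4f\n", leading_one / total);
    printf("P(first digit = 0) = %.4f\n", 1.0 - leading_one / total);
    return 0;
}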
Here, you are searching for a 192-bit base-2 (binary) number, where the upper 160 bits are uniformly distributed and the lower 32 bits are also uniformly distributed (but dependent on the upper 160 bits). You are representing that number as a base58 number. The probability of hitting the various base58 digits in the first position is left as an exercise for the reader. <g>
Hmm, I'm guessing this is more practical real-life data than an actual theoretical analysis?
Yes. The theoretical answer must be somewhere within the hashing algorithm, but that's beyond my understanding.
The theoretical answer is actually not in the hashing algorithm at all, but rather, in how a pseudorandom number uniformly distributed across a binary search space is represented in radix-58 (base58).
If you just restart it, nothing is lost. It just makes a clean random start at a different point than where you started before.
What does "nothing is lost" mean? It went through 12 quadrillion tries before crashing. Is every try completely random (it doesn't "save" a list of previous attempts or go in some methodical order)?
LoyceV provided a good explanation by analogy to dice throws. I have only this to add: this is a probabilistic search. You could hit your lucky address on the very first try (like winning a lottery). Treating your previous 12 quadrillion “losses” as if they change your odds is actually an instance of the classic Gambler’s Fallacy.
The probability of repeating one of those 12q tries is the same as trying an untried one?
In both cases, the probability is negligible = practically impossible. 12 quadrillion (1.2 × 10^16) is a drop in the ocean of a 2^160 search space (>10^48, more than a thousand quadrillion quadrillion quadrillion).
(N.b. that the search space is of size 2^160, although its input is 33 octets for compressed keys and 65 octets for uncompressed keys, and the output is a 192-bit number due to the 32-bit checksum.)
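For a rough sense of scale, a few lines of arithmetic (my own back-of-the-envelope sketch, not from the post) show what fraction of the 2^160 space those 12 quadrillion tries covered; the same figure is roughly the chance that any single new try repeats one of the old ones.

/* Back-of-the-envelope sketch (mine): what fraction of a 2^160 space
 * do 12 quadrillion tries cover? Doubles suffice, since 2^160 is only
 * about 1.46e48. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double tries = 1.2e16;            /* ~12 quadrillion attempts */
    double space = ldexp(1.0, 160);   /* 2^160 */

    printf("search space      : %.3e\n", space);
    printf("fraction explored : %.3e\n", tries / space);   /* ~8.2e-33 */
    return 0;
}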