Title: 🖤 Post by: digaran on September 29, 2023, 06:58:37 AM 🖤
Title: Re: How to calculate accurately about collisions in bit ranges? Post by: NotATether on September 29, 2023, 08:21:46 AM No, you can't do that because an address hash is made with a SHA-256 + Ripemd-160 (plush a double SHA-256 checksum) and the properties of these algorithms are non-trivial and therefore do not allow it to be mapped between any two ranges of an address.
Title: Re: How to calculate accurately about collisions in bit ranges? Post by: vjudeu on September 29, 2023, 10:22:34 AM Quote Is there any formula to calculate how many identical base58 addresses exist in each bit range? No, you can only estimate that, but you will not get the exact numbers, unless you want to do that in a very small range, where you can just check all of them.Quote Since each bit is twice the size of it's previous bit, how should I accurately calculate how many of 2^96 collisions could exist in a certain bit range? Well, if you have 2^256 on one side, and 2^160 on another, you can get 2^(256-160), which is equal to 2^96. It is just size of all public keys, divided by size of all addresses. Nothing more, nothing less, but for such huge numbers, it is only an estimation. It could be 2^95 or 2^97 as well for some specific values, nobody knows the exact number, because nobody counted all of them one-by-one, and nobody found an algorithm, to quickly and accurately count them, without breaking math problems behind them.Also note that for Script-based hashes, this number is probably much bigger, because then you can have a script, that for example pushes some 520-byte value on the stack. Quote You mean we can't find an estimation of how many probable identical addresses exist in each bit range? Estimation is always possible, but it could be wrong, because you could estimate it as 2^96, while it could be 2^97 or more, if you count compressed and uncompressed keys separately, or if you take other things into account.Quote So it would be wrong to divide N by 2^96 and then divide any bit range by that result? If you want to get the probability of a collision, then it is something like that.Quote but is the calculation above correct? To some extent yes, but you cannot be 100% sure, that the real probability is 5*10^(-29), and not for example 6*10^(-29).Quote Meaning there are no collisions in 66 bit It is almost guaranteed that there are collisions in 66 bit. If there would be absolutely no collisions, that would mean RIPEMD-160 is totally broken.Quote or you are saying we can't really tell because there is no mathematical relation? There is a mathematical relation, but it is unknown, and hard to simplify, because of the avalanche effect (https://en.wikipedia.org/wiki/Avalanche_effect). However, if you want to for example count, how many collisions are present, where you have some 32-bit hash function on input (for example reduced SHA-256 from 32-bit internal values to 4-bit internal values), and 20-bit hash function on output (for example RIPEMD-160, also reduced to 4-bit internal values), then you can use brute force, to get the exact results.Title: Re: How to calculate accurately about collisions in bit ranges? Post by: garlonicon on September 29, 2023, 01:40:05 PM Maybe you should start with some simple example on some smaller numbers to better see the whole topic. First, start with the procedure of hashing a public key: https://bitcointalk.org/index.php?topic=5468186.msg62916017#msg62916017 Then, imagine that you have only keys from 1 to 256. And that all of your addresses can contain only a single "one", and one character after that. In this case, you will have 256 valid private keys, and also 256 valid public keys, but only 58 valid addresses (a single one, and one character with base58, and you can optionally include any checksum into that, if you really want).
And then, you are trying to count collisions, and find out, which keys are used to generate which address. So, let's see, what will happen, if you compute it for the first 16 keys and addresses: Code: +---------+-------------+------------+ Title: Re: How to calculate accurately about collisions in bit ranges? Post by: NotATether on September 29, 2023, 03:20:38 PM You mean we can't find an estimation of how many probable identical addresses exist in each bit range? So it would be wrong to divide N by 2^96 and then divide any bit range by that result? You could divide the range, but then it will just be a number and there is no way to know which addresses are on either side of it. The problem is very similar to public keys, where you can't "split" a list of public keys with a line based on how big their private key might be. Only in this case, it's even worse, because there's no other information you can associate with the address hash besides itself. Title: Re: How to calculate accurately about collisions in bit ranges? Post by: albert0bsd on September 29, 2023, 05:40:39 PM You only can get some approximation of the expected number of address, but is only that an approximation. Since the address is generated from a double hash system sha256 and rmd160 the approximation can or can't match because the out of those hash look like random numbers.
I will publish the way to calculate it ASAP Title: Re: How to calculate accurately about collisions in bit ranges? Post by: garlonicon on September 29, 2023, 06:35:29 PM Quote There might be other ways but to be 100% sure that's the only way, even to find one collision? As long as some hash function is not fully broken, brute force is the only way. However, if you can break some hash function, then you can mount an attack, and potentially reach that answer faster than by brute force. But to do that, you need to understand it on a level, where you can put some hash as your input, and reach the exact number of possible messages, that could produce it, as your output. And it could be possible, if you could dig into internal operations of hash functions, and construct a proof, that a combination of some of them, will allow reaching exactly M inputs from N outputs.Even if something like that is possible, I guess it is harder than finding a single collision. However, there are some mathematical operations, where you can count something, without checking every possible case. One of those are the number of points on elliptic curve. By having p-value, and the curve equation, it is possible to calculate the exact n-value, without visiting N points, one-by-one. So, there may exist some algorithms, to do some similar things on top of hash functions, but I guess they are not simple, and I guess even if they exist, nobody revealed them yet. But, even if you could reach something like that, I guess it would be quite useless. Because currently, you can estimate it as 2^96=79228162514264337593543950336. And imagine you could have some algorithm, that could tell you, that for 18oaCk6aLBNTh6f9h6Jh9hGeSQm6S6MRC9, it would be for example exactly 23136222216537552491922889594 matching keys, but for 1EHNa6Q4Jz2uvNExL497mE43ikXhwF6kZm, it would be exactly 164065158840186683045169679 keys. Let's assume you could get that. What then? I guess it could be just some useless information, if that would work in a similar way as counting points on elliptic curves, where generating n-value won't help you in breaking it. So yes, in theory, there could exist some way, where you could get some results, similar to those ones: Code: +------------------------------------+-------------------------------+ Title: Re: How to calculate accurately about collisions in bit ranges? Post by: j2002ba2 on September 29, 2023, 10:57:12 PM So put simply, the only way to see if 2 identical addresses exist in any bit range with different private keys is to have all the 2^160 addresses in a file to compare for collisions when we brute force any range. There might be other ways but to be 100% sure that's the only way, even to find one collision? Sorry if I ask stupid questions, my head is not clear past few days. As always thank you all so much for the time you spent to answer. Let's first try to see if there is an "impossible" address, that is - no public key maps to that address. 1. When randomly mapping a n-bit number x to n-bit number y, the probability of having k distinct x-es mapping to y is e-1/k! 2. Starting with a private key (256 bit) we map it to public key, then sha256, and finally ripemd160. This is 256 bit to 160 bit random mapping. 3. Fixing the 96 MSB of the private key we get 160 bit to 160 bit mapping. The probability of not reaching y is e-1. 4. We repeat this for every possible 96 MSB. The probability of not reaching y every time is (e-1)296 = e-296 < 2-114302077158074026402637675936. Well, we could safely say there's no such address. Let's try to see if there is an address to which points only single public key. The probability of having zero mappings is the same as having one mapping: e-1(e-1)296-1 = e-296 We get the same number as above. There is no address with a single public key. We expect the number of public keys pointing to that address to be quite close to 296. --- Now let the private key lays in certain 160-bit interval (or is in a random set of 2160 distinct private keys). The probability of not reaching an address y is e-1 ≈ 36.8% The probability of reaching an address only once is the same. The probability of collision is 1-e-1-e-1 ≈ 26.4% --- Hopefully this brings some clarity. For info on the math leading to these results one could look at the article "Random Mapping Statistics", or the book "Analytic Combinatorics". |