Title: The «beauty» of probability and pure randomness Post by: BlackHatCoiner on January 21, 2021, 10:22:33 AM I really didn't know where to create this thread, so I chose this place. Even if it doesn't require much discussion I just wanted to do it. I wrote the following php code that once you run it, it counts you the total hashes that begin with zero:
Code: $count = 0; It starts from number 0 and until 16 it hashes every number. Since hexadecimal characters are 16 in total, then on average it should find 1 hash that starts with "0". If you try the above it'll find no hashes that starts with "0". If I change $maxtries to 32, then I should get on average 2 hashes that start with "0". Again, if you try it with 32, you'll find none. By changing it to 160 I should find 10 hashes on average and surprisingly I find exactly 10: Code: 39: 0b918943df0962bc7a1824c0555a389347b4febdc7cf9d1254406d80ce44e3f9 If I change it to 1600 I should get 100, but there are 115: Code: 39: 0b918943df0962bc7a1824c0555a389347b4febdc7cf9d1254406d80ce44e3f9 I know that this is completely misplaced and it's part of the function not to give exactly 100, but I find something odd with possibilities and randomness. I mean it's not random, there are some mathematical operations that the computer does and results on a hash. You just can't know it if you don't perform every possible combination until you find it. There is no proof that between 1,000,000 and 2,000,000 there is a hash that starts with zero. It's just highly improbable. But let's define what probability means. Quote Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, 0 indicates impossibility of the event and 1 indicates certainty. The higher the probability of an event, the more likely it is that the event will occur. A simple example is the tossing of a fair (unbiased) coin. Since the coin is fair, the two outcomes ("heads" and "tails") are both equally probable; the probability of "heads" equals the probability of "tails"; and since no other outcomes are possible, the probability of either "heads" or "tails" is 1/2 (which could also be written as 0.5 or 50%). There is a difference with the coin, though. If you make the calculations that are required to know the final result, you can always win. For example, if you calculate the weight of the coin, the pull of gravity, the wind etc, you can predict it correctly. On the other hand, on hash functions you can't know what input gives the analogous output that is publicly known unless if you successfully brute force it. This concludes that SHA-256 hashes are not probable. But, neither random. If you picture a database that expands through infinity and has two columns, the input, that will have every different combination of text that can ever be created, and the output that has input's SHA256 hash (or any other hash function) you will realize that every result is pre-defined. There is a bunch of characters out there that is unknown and once you perform SHA256 to it, it will give you: Code: 0000000000000000000000000000000000000000000000000000000000000000 And there are more than one. I don't have the proof, but just like before, it's highly likely. https://i.imgur.com/pt71yso.gif But wait. Since all possible combinations of a text are infinite, does it make the possibilities infinite too? Yes, here's the proof. Whether I have or don't have the text that prints theses 64 zeros once it's put through SHA256, I can prove you that there is one. And not just one. Infinite texts that return it. And that's reasonable. On the beginning of this post, I showed that 1 in 16 different hashes start with "0" on average, which also means that 1 in an infinite amount of different texts will return 64 zeros once we hash it. There is no specific reason why I created this topic beyond that I just wanted to. I don't believe that I've made a mistake somewhere. I think that it's possible to perform SHA-256 to an input no matter its length. If not, if there's a limit of characters, then it's not that important. I just wanted to give the idea. Either way, a database that expands through infinity is by itself impossible. I hope the title, doesn't mislead. Your thoughts. Title: Re: The «beauty» of probability and pure randomness Post by: o_e_l_e_o on January 21, 2021, 12:00:43 PM Since all possible combinations of a text are infinite, does it make the possibilities infinite too? Yes, here's the proof. Whether I have or don't have the text that prints theses 64 zeros once it's put through SHA256, I can prove you that there is one. And not just one. Infinite texts that return it. There are not an infinite number of possible inputs for SHA256.Then append the 64-bit block that is equal to the number l expressed using a binary representation. What this means is that any input in to SHA256 must first have the length of the input, expressed as a 64 bit number, appended to the input. Therefore, the input itself cannot be longer than 264 - 1 bits, which works out to 2 exabytes (2 million terabytes). What this means is that for each of the 2256 possible outputs from SHA256, if the inputs were evenly distributed then there would be 218446744073709551360 possible inputs per output. So although not infinite, the chance of an output having 0 possible inputs is incomprehensibly small. Title: Re: The «beauty» of probability and pure randomness Post by: NotATether on January 21, 2021, 01:23:36 PM You are describing pseudorandomness and not pure randomness because SHA256 is just XORing a bunch of numbers shifted to the right together. You know how many rounds it's going to use so you could theoretically count the number of XORs done if you know the input to guess the value of the left-most bit.
Also note that all strings, numbers and so on have to be organized into groups of 4 bytes (a double-word) in order to be fed into SHA256. Each double-word has operations applied to it as a unit. So if you change a group of 32 bits (4 bytes) in the input, you completely change all of the output bytes. So to convert an integer to bytes you'd group the lowest 8 bits of the integer into a single byte, then the next lowest 8 bits, and so on until you have four bytes. That's one double word. If there aren't enough bytes to make a double-word then 0x00 bytes are put at the end of the double-word. I.e we pad the input with bytes to the right of it. Let's say for example you wanted to hash 65536 which is 0x010000 in hex. This is turned into bytes as 0x00 0x01 0x00 0x00, using big-endian format, which is the way I spelled out the bytes just now. We never spell them out to SHA256 as little endian (where all of the bytes would be spelled in the reverse order). There is a peculiarly when hashing strings that doesn't happen in numbers. Sometimes there aren't enough bytes to form a double-word, say if the number of characters is not divisible by 4, then we put 0x80 0x00 0x00 ... at the end until we can form another a double word (the first bit we spend must always be a 1, hence the 8 in the 0x80. A consequence of this is that there are some strings that make exactly the same hash, such as "abc\x80" and "abc". But if you're only hashing strings of length N then there isn't a known collision (meaning having the same hash) yet between two such strings. Title: Re: The «beauty» of probability and pure randomness Post by: BlackHatCoiner on January 21, 2021, 07:03:16 PM What this means is that any input in to SHA256 must first have the length of the input, expressed as a 64 bit number, appended to the input. Therefore, the input itself cannot be longer than 264 - 1 bits, which works out to 2 exabytes (2 million terabytes). Thanks again, I didn't know that you can't hash something that's more than 2 million terabytes. But why 72057594037927936? How did you come up with that number?What this means is that for each of the 2256 possible outputs from SHA256, if the inputs were evenly distributed then there would be 272057594037927936 possible inputs per output. A consequence of this is that there are some strings that make exactly the same hash, such as "abc\x80" and "abc". On what function they make exactly the same hash? I tried it on SHA256 but they don't collide.abc: Code: ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad abc\x80: Code: 435a2c4c08a230aa3dc9874a02324b03c6ffbedf4e6deacf7226477a28f80133 Title: Re: The «beauty» of probability and pure randomness Post by: o_e_l_e_o on January 21, 2021, 07:52:15 PM But why 72057594037927936? How did you come up with that number? Yeah, my bad. I was a bit too quick with my calculations and have divided the exponents instead of subtracting them. The actual number is 218446744073709551360. I've fixed my reply above.Given that the maximum input can be a message of length 264 bits (we'll forget about the "- 1" for ease of explanation), then there are 2264 possible inputs. We also know that there are 2256 possible outputs. In the same way that if we know there are 50 outputs and 10 inputs then there would be 50/10 = 5 outputs per input, using our numbers above there are 2264 / 2256 outputs per input. This can be simplified to 2((264) - 256), which works out to 218446744073709551360. Title: Re: The «beauty» of probability and pure randomness Post by: gmaxwell on January 22, 2021, 10:08:28 AM What this means is that any input in to SHA256 must first have the length of the input, expressed as a 64 bit number, appended to the input. Therefore, the input itself cannot be longer than 264 - 1 bits, which works out to 2 exabytes (2 million terabytes). What this means is that for each of the 2256 possible outputs from SHA256, if the inputs were evenly distributed then there would be 218446744073709551360 possible inputs per output. So although not infinite, the chance of an output having 0 possible inputs is incomprehensibly small. It's worse than that... sha-256 has a 256-bit internal state bottleneck, and so that worst case deviation from uniform is likely massively larger than you estimate (though still extraordinarily tiny). Basically you just consider your last block of input, there is 512 bits of input text plus 256 bits of state from the prior block and outputs 256 bits. Even if you model the 256-bit prior-state as totally uniform, there is "only" 2^512 possible input values for each output value. Still, according to this counting argument it's absurdly unlikely for there to be an unreachable output ... just not quite as much as your figures suggested. Title: Re: The «beauty» of probability and pure randomness Post by: j2002ba2 on January 22, 2021, 10:57:34 AM Still, according to this counting argument it's absurdly unlikely for there to be an unreachable output ... just not quite as much as your figures suggested. Since bitcoin generally uses double sha-256, there are in fact unreachable outputs. It's expected, that half of the output values are unreachable (when using dSHA256). Theoretically it might become a problem in the future, when difficulty is insanely high, no combination of nonce/extranonce would produce a number below target. This might not be true though, it's possible that all low values are reachable. Long before this happens there will be a hardfork, since sha-256 wouldn't be secure anymore, or a softfork adding additional hash commitment, using something stronger than sha-256. Title: Re: The «beauty» of probability and pure randomness Post by: odolvlobo on January 22, 2021, 08:33:26 PM ...I wrote the following php code that once you run it, it counts you the total hashes that begin with zero. ...It starts from number 0 and until 16 it hashes every number. Since hexadecimal characters are 16 in total, then on average it should find 1 hash that starts with "0". ...If I change $maxtries to 32, then I should get on average 2 hashes that start with "0". Again, if you try it with 32, you'll find none. ...By changing it to 160 I should find 10 hashes on average and surprisingly I find exactly 10 ...If I change it to 1600 I should get 100, but there are 115 ... Random is non-deterministic by definition, so it is a mistake to believe that any aspect of randomness is deterministic. You can quote me on that. Also, has it been proven that all possible 256-bit values can be produced by SHA-256? And we assume that the output is uniform, but is that actually known? Title: Re: The «beauty» of probability and pure randomness Post by: NotATether on January 26, 2021, 10:54:06 AM A consequence of this is that there are some strings that make exactly the same hash, such as "abc\x80" and "abc". On what function they make exactly the same hash? I tried it on SHA256 but they don't collide.abc: Code: ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad abc\x80: Code: 435a2c4c08a230aa3dc9874a02324b03c6ffbedf4e6deacf7226477a28f80133 I thought it should give the same output if you hash with abc + hexadecimal byte 0x80, not the actual "\x80" characters. The backslash and "x" is C/C++ printf notation and I am used to writing hex numbers like that. My reasoning for this was that "abc" was not aligned to 512 bits so you'd have to add 0x80 and then enough 0x00 bytes to make it aligned to 512 bits. But I think my example was wrong since "abc" is just 3 bytes so the equivalent SHA256 hash would be "abc\x80\x00\x00\x00\x00(this is 8 bytes, we need more zeros to make it 64 bytes = 512 bits)". You could try making an input with 63 characters and see if it's SHA256 is identical to the same string but with a 64th byte of 0x00. Also, has it been proven that all possible 256-bit values can be produced by SHA-256? And we assume that the output is uniform, but is that actually known? According to this (https://crypto.stackexchange.com/a/1251), it isn't publicly known whether SHA1/256/512/etc hash functions are uniformly distributed because they were designed behind closed doors. |