arulbero (OP)
Legendary
Offline
Activity: 1942
Merit: 2094
|
|
November 29, 2016, 03:45:05 PM Last edit: December 01, 2016, 02:41:50 PM by arulbero |
|
Hash function (sha256) is used to mine and identify a block. The digest is a 256 bit string.
We don't know whether this function is surjective or not (whether there is an input-block header for each output-digest).
But we assume that the output of the hash function is evenly distributed (so we can calculate target and difficulty for proof-of-work algorithm).
How can we know for sure the output is evenly distributed?
|
|
|
|
|
Syke
Legendary
Offline
Activity: 3878
Merit: 1193
|
|
November 30, 2016, 05:55:53 PM |
|
And while in the subject...
Although extremely unlikely, what happens if a new block has the same hash as a previous block?
Bitcoin continues normally. It's a perfectly valid occurrence.
|
Buy & Hold
|
|
|
shorena
Copper Member
Legendary
Offline
Activity: 1498
Merit: 1540
No I dont escrow anymore.
|
|
November 30, 2016, 09:03:29 PM Last edit: December 02, 2016, 08:31:49 AM by shorena |
|
And while in the subject...
Although extremely unlikely, what happens if a new block has the same hash as a previous block?
Bitcoin continues normally. It's a perfectly valid occurrence. Is that so? Lets say there are blocks A, B, C and D with B and C having the same hash. B refers to A as normal. C refers to B which may work even though its the same hash, Im not sure. D refers to C and B because they have the same hash. What am I missing? Besides that a 256-bit collision will probably never happen. Edit: a more in depth description of this problem can be found in post #8 by Danny below. Edit2: answered: Is there something in Bitcoin Core that prevents the block at height 441372 (with a Previous Block Hash that is identical to it's own Block hash) from being added to the blockchain? Yes. No one node will receive it Because will ignore INV command thinking that already has this block in the blockchain reference -> https://bitcoin.org/en/developer-reference#getblocks
|
Im not really here, its just your imagination.
|
|
|
Fuserleer
Legendary
Offline
Activity: 1064
Merit: 1020
|
|
November 30, 2016, 09:43:22 PM |
|
And while in the subject...
Although extremely unlikely, what happens if a new block has the same hash as a previous block?
Bitcoin continues normally. It's a perfectly valid occurrence. Is that so? Lets say there are blocks A, B, C and D with B and C having the same hash. B refers to A as normal. C refers to B which may work even though its the same hash, Im not sure. D refers to C and B because they have the same hash. What am I missing? Besides that a 256-bit collision will probably never happen. Ignoring the fact that the likelihood of 2 blocks outputting the same hash, in the same consensus round (10 mins) is practically zero. There would be a network fork until the next valid blocks on all of the fork chains were created. The network would then revert to the fork where the next block/s had the most work performed upon it. A (not very good) diagram: CHAIN (WORK = 100) -> Block +1 (HASH X : WORK = 10) -> Block +2 (HASH Z : WORK 12) CHAIN (WORK = 100) -> Block +1 (HASH X' : WORK = 10) -> Block +2 (HASH Y : WORK 13) <- This one would win
|
|
|
|
shorena
Copper Member
Legendary
Offline
Activity: 1498
Merit: 1540
No I dont escrow anymore.
|
|
November 30, 2016, 10:25:44 PM |
|
And while in the subject...
Although extremely unlikely, what happens if a new block has the same hash as a previous block?
Bitcoin continues normally. It's a perfectly valid occurrence. Is that so? Lets say there are blocks A, B, C and D with B and C having the same hash. B refers to A as normal. C refers to B which may work even though its the same hash, Im not sure. D refers to C and B because they have the same hash. What am I missing? Besides that a 256-bit collision will probably never happen. Ignoring the fact that the likelihood of 2 blocks outputting the same hash, in the same consensus round (10 mins) is practically zero. There would be a network fork until the next valid blocks on all of the fork chains were created. The network would then revert to the fork where the next block/s had the most work performed upon it. A (not very good) diagram: CHAIN (WORK = 100) -> Block +1 (HASH X : WORK = 10) -> Block +2 (HASH Z : WORK 12) CHAIN (WORK = 100) -> Block +1 (HASH X' : WORK = 10) -> Block +2 (HASH Y : WORK 13) <- This one would win So block X' cant follow X?
|
Im not really here, its just your imagination.
|
|
|
Fuserleer
Legendary
Offline
Activity: 1064
Merit: 1020
|
|
November 30, 2016, 10:39:11 PM |
|
And while in the subject...
Although extremely unlikely, what happens if a new block has the same hash as a previous block?
Bitcoin continues normally. It's a perfectly valid occurrence. Is that so? Lets say there are blocks A, B, C and D with B and C having the same hash. B refers to A as normal. C refers to B which may work even though its the same hash, Im not sure. D refers to C and B because they have the same hash. What am I missing? Besides that a 256-bit collision will probably never happen. Ignoring the fact that the likelihood of 2 blocks outputting the same hash, in the same consensus round (10 mins) is practically zero. There would be a network fork until the next valid blocks on all of the fork chains were created. The network would then revert to the fork where the next block/s had the most work performed upon it. A (not very good) diagram: CHAIN (WORK = 100) -> Block +1 (HASH X : WORK = 10) -> Block +2 (HASH Z : WORK 12) CHAIN (WORK = 100) -> Block +1 (HASH X' : WORK = 10) -> Block +2 (HASH Y : WORK 13) <- This one would win So block X' cant follow X? Nope, the hash X is already used for a block X, so block X' using the same hash X would get "soft" rejected by the network.
|
|
|
|
DannyHamilton
Legendary
Offline
Activity: 3486
Merit: 4851
|
|
December 01, 2016, 03:47:05 AM |
|
Nope, the hash X is already used for a block X, so block X' using the same hash X would get "soft" rejected by the network.
I'm not understanding what you are saying. Can you point out the code in Bitcoin Core that explains what you are saying?
As a new example (saying what I think is the same thing as shorena in a different way)... Lets say we are at block height 441370 with a hash of 0000...08ac (as of the writing of this post, we actually are) Now lets say a few minutes later a mining pool solves a block. The new block at block height 441371 has: Block Hash: 0000...075c Previous Block: 0000...08ac (referring to block height 441370)
Now, lets say 15 minutes later a mining pool solves a block. The new block at block height 441372 has: Block Hash: 0000...075c Previous Block: 0000...75c (referring to block height 441371)
Is there something in Bitcoin Core that prevents the block at height 441372 (with a Previous Block Hash that is identical to it's own Block hash) from being added to the blockchain? If not, then when a miner creates the next block having: Block Hash: 0000...08ac Previous Block: 0000...075c
Then how will any Bitcoin Core node know if this new block is at height 441373 (Previous Block: 0000...075c refers to block height 441372) or a fork competing with the other block at height 441372 (Previous Block: 0000...075c refers to block height 441371)?
|
|
|
|
Dabs
Legendary
Offline
Activity: 3416
Merit: 1912
The Concierge of Crypto
|
|
December 01, 2016, 01:08:20 PM |
|
But we assume that the output of the hash function is evenly distributed (so we can calculate target and difficulty for proof-of-work algorithm).
How can we know for sure the output is evenly distributed?
I did a few million hashes of number sequences and other random stuff, for playing cards and dice simulations. It looked evenly distributed.
|
|
|
|
arulbero (OP)
Legendary
Offline
Activity: 1942
Merit: 2094
|
|
December 01, 2016, 02:39:41 PM |
|
I rephrase my question: 1) given arbitrary inputs, does the SHA-256 function really generate outputs that are evenly distributed? 2) given block headers** as inputs, does the SHA-256 function really generate outputs that are evenly distributed? **block header = Version (4 bytes) + Previous Block Hash (32 bytes) + Merkle Root (32 bytes) + Timestamp (4 bytes) + Difficulty Target (4 bytes) + Nonce (4 bytes) = 80 byte = 640 bit --> fixed length block hash = block header hash = SHA256(SHA256(Block_Header)) But we assume that the output of the hash function is evenly distributed (so we can calculate target and difficulty for proof-of-work algorithm).
How can we know for sure the output is evenly distributed?
I did a few million hashes of number sequences and other random stuff, for playing cards and dice simulations. It looked evenly distributed. In your case, what were your inputs? What kind of number sequences? Fixed length (more or less than 256 bit) or not?
|
|
|
|
amaclin
Legendary
Offline
Activity: 1260
Merit: 1019
|
|
December 01, 2016, 03:52:21 PM |
|
Is there something in Bitcoin Core that prevents the block at height 441372 (with a Previous Block Hash that is identical to it's own Block hash) from being added to the blockchain? Yes. No one node will receive it Because will ignore INV command thinking that already has this block in the blockchain
|
|
|
|
arulbero (OP)
Legendary
Offline
Activity: 1942
Merit: 2094
|
|
December 03, 2016, 05:00:28 PM |
|
I found out some interesting information about hash function: http://www.denimgroup.com/know_artic_secure_hash_functions.htmlA secure hash function has three properties: preimage resistance, collision resistance, and second preimage resistance.
A good collision resistant hash function should have each hash value be about as evenly distributed as possible
http://crypto.stackexchange.com/questions/12822/are-the-sha-family-hash-outputs-practically-randomThe SHA-256 (as well as any cryptographically secure hash algorithm) produces output that will appear like an uniformly random sequence to observer who does not know the input.
Quite a few random number generators, for example ANSI X9.31's RNG and NIST SP 800-90 Hash_DRBG use SHA family hash functions for the reason that resulting sequence is hard to distinguish from random.
If the RNG is able to produce entropy, but has large bias, SHA-256 is a very good function for collecting entropy. The output of the function has nearly 1 bit of entropy per output bit, if the input to the function contained at least 256 bits of entropy. Only very little entropy gets lost in this process.
SHA-256 using constructs like Hash_DRBG in NIST SP 800-90 can be used in situation where true entropy is available, but is slow to collect. Once you have been able to collect 256 bits of entropy (and preferable a bit more to be safe), you can use the entropy to instantiate Hash_DRBG/SHA-256, which will be able to serve billions of random numbers.
Remember these are not good uses:
If you feed SHA-256 with too little entropy (or even smaller input than 256 bits), the output may appear random, but it is not. Smart adversary can be able to abuse this In other words: if your input to hash is short or too predictable, so shall be the output.
Pseudo-Random Number Generator using SHA-256: https://www.stat.berkeley.edu/~stark/Java/Html/sha256Rand.htmBut in general randomness is not the same as collision avoidance: http://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed
|
|
|
|
Dabs
Legendary
Offline
Activity: 3416
Merit: 1912
The Concierge of Crypto
|
|
December 04, 2016, 03:06:36 AM |
|
I rephrase my question: 1) given arbitrary inputs, does the SHA-256 function really generate outputs that are evenly distributed? 2) given block headers** as inputs, does the SHA-256 function really generate outputs that are evenly distributed? **block header = Version (4 bytes) + Previous Block Hash (32 bytes) + Merkle Root (32 bytes) + Timestamp (4 bytes) + Difficulty Target (4 bytes) + Nonce (4 bytes) = 80 byte = 640 bit --> fixed length block hash = block header hash = SHA256(SHA256(Block_Header)) But we assume that the output of the hash function is evenly distributed (so we can calculate target and difficulty for proof-of-work algorithm).
How can we know for sure the output is evenly distributed?
I did a few million hashes of number sequences and other random stuff, for playing cards and dice simulations. It looked evenly distributed. In your case, what were your inputs? What kind of number sequences? Fixed length (more or less than 256 bit) or not? Why don't you try your #2 and see... we have more than 400k blocks already. I believe my inputs were more than 256 bits. I used a bunch of "random" characters ... at least 60 in length.
|
|
|
|
Mitchell4265727279
Newbie
Offline
Activity: 1
Merit: 0
|
|
December 05, 2016, 08:55:17 AM Last edit: December 05, 2016, 09:39:41 AM by Mitchell4265727279 |
|
Ignoring the fact that the likelihood of 2 blocks outputting the same hash, in the same consensus round (10 mins) is practically zero.
It is still zero regardless of time. The human brain isnt equipped to deal with very large/small numbers. It's just too far from the realm of normal experience. SHA2 is regarded as being evenly distributed. Even if that's wrong it's uniform enough for current purposes, makes little difference in the end as far as a collision is concerned.
|
|
|
|
Fuserleer
Legendary
Offline
Activity: 1064
Merit: 1020
|
|
December 05, 2016, 12:09:43 PM |
|
Ignoring the fact that the likelihood of 2 blocks outputting the same hash, in the same consensus round (10 mins) is practically zero.
It is still zero regardless of time. The human brain isnt equipped to deal with very large/small numbers. It's just too far from the realm of normal experience. SHA2 is regarded as being evenly distributed. Even if that's wrong it's uniform enough for current purposes, makes little difference in the end as far as a collision is concerned. and if I said it was zero there would be posts stating that in theory it isn't :p
|
|
|
|
Rick Storm
Newbie
Offline
Activity: 17
Merit: 0
|
|
December 05, 2016, 12:22:31 PM |
|
I rephrase my question:
1) given arbitrary inputs, does the SHA-256 function really generate outputs that are evenly distributed?
2) given block headers** as inputs, does the SHA-256 function really generate outputs that are evenly distributed?
Yes, yes. If not, it would not be a good hash function and it would be open to attacks.
|
|
|
|
arulbero (OP)
Legendary
Offline
Activity: 1942
Merit: 2094
|
|
December 05, 2016, 04:20:11 PM Last edit: December 05, 2016, 04:43:35 PM by arulbero |
|
Obviously block hashes from blockchain are not evenly distributed (because of difficulty and retarget): BLOCKS : from 0 to 442033 (TOT = 442034) height hash 0 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f 1 00000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb6048 ... 10000 0000000099c744455f58e6c6e98b671e1bf7f37346bfd4cf5d0274ad8ee660cb 10001 00000000f01df1dbc52bce6d8d31167a8fef76f1a8eb67897469cf92205e806b ... 50000 000000001aeae195809d120b5d66a39c83eb48792e068f8ea1fea19d84a4278a 50001 000000001c920d495e1eeef2452b6d1c6c229a919b28196c103ecffebabee141 ... 100000 000000000003ba27aa200b1cecaad478d2b00432346c3f1f3986da1afd33e506 100001 00000000000080b66c911bd5ba14a74260057311eaeb1982802f7010f1a9f090 ... 200000 000000000000034a7dedef4a161fa058a2d67a173a90155f3a2fe6fc132e0ebf 200001 00000000000002e3269b8a00caf315115297c626f954770e8398470d7f387e1c ... 300000 000000000000000082ccf8f1557c5d40b21edabb18d2d691cfbf87118bac7254 300001 000000000000000049a0914d83df36982c77ac1f65ade6a52bdced2ce312aba9 ... 400000 000000000000000004ec466ce4732fe6f1ed1cddc2ed4b328fff5224276e3f6f 400001 000000000000000005421b1b2ee6d06d037557d7f5ec96852542413cfed40c22 ... 442032 000000000000000000b6a5c60d2509768b29f4b49ceda2679bc6d33f99c8adba 442033 000000000000000000bf88951b0bd245474265cc6a2a1b38ac8ef78c8612158a
So I have checked the last digit distribution of every hash (16 possible values): BLOCKS : from 0 to 442033 (TOT = 442034) last digit fr. prob. unif. --> 1/16=0.0625 0 27520 0.0622576543886 1 27881 0.0630743336485 2 27952 0.0632349547772 MAX --> =+0.00073495777 = +1,16% 3 27892 0.0630992186121 4 27543 0.0623096865852 5 27339 0.0618481836239 6 27674 0.062606043879 7 27691 0.0626445024591 8 27428 0.0620495256021 9 27203 0.061540514983 MIN --> -0.000959485 = -1.54% 10 (a) 27594 0.0624250623255 11 (b) 27554 0.0623345715488 12 (c) 27863 0.063033612799 13 (d) 27616 0.0624748322527 14 (e) 27619 0.062481619061 15 (f) 27665 0.0625856834542
I think it's ok. Out of curiosity, I have checked the last 2 digits distribution (256 possible values) too: BLOCKS : from 0 to 442033 (TOT = 442034) last 2 digits fr. prob. unif. --> 1/256=0.00390625 71 (47) 1891 0.0042779514698 MAX --> =+0.0003717 = +9,516% ................................................... ................................................... ................................................... 138 (8a) 1622 0.0036694009963 MIN --> =-0.000236849 = -6,06%
and the last 3 digits distribution (4096 possible values): BLOCKS : from 0 to 442033 (TOT = 442034) last 3 digits fr. prob. unif. --> 1/4096=0.000024414 2686 (A7E) 147 0.000332553604474 MAX --> 0.00008841 = +36,21% ................................................. ................................................. ................................................. 3139 (C43) 75 0.000169670206364 MIN --> -0.000074470419 = -30,5%
How do you think about these results? These distributions are okay?
|
|
|
|
Dabs
Legendary
Offline
Activity: 3416
Merit: 1912
The Concierge of Crypto
|
|
December 05, 2016, 05:41:08 PM |
|
Try a loop. Count from 1 to a billion. Hash those. Include a large random salt.
For example:
xxxxxxxxxxxxxxxxxxxxxxxxxx00000001 xxxxxxxxxxxxxxxxxxxxxxxxxx00000002 xxxxxxxxxxxxxxxxxxxxxxxxxx00000003 xxxxxxxxxxxxxxxxxxxxxxxxxx00000004 ... xxxxxxxxxxxxxxxxxxxxxxxxxx99999996 xxxxxxxxxxxxxxxxxxxxxxxxxx99999997 xxxxxxxxxxxxxxxxxxxxxxxxxx99999998 xxxxxxxxxxxxxxxxxxxxxxxxxx99999999
Or, let it run for a day or a few hours, then stop whatever you have generated. That should be more than enough for you or most other people. I've already done it, but it's better for you to try it yourself.
|
|
|
|
arulbero (OP)
Legendary
Offline
Activity: 1942
Merit: 2094
|
|
December 07, 2016, 03:37:38 PM |
|
Try a loop. Count from 1 to a billion. Hash those. Include a large random salt.
For example:
xxxxxxxxxxxxxxxxxxxxxxxxxx00000001 xxxxxxxxxxxxxxxxxxxxxxxxxx00000002 xxxxxxxxxxxxxxxxxxxxxxxxxx00000003 xxxxxxxxxxxxxxxxxxxxxxxxxx00000004 ... xxxxxxxxxxxxxxxxxxxxxxxxxx99999996 xxxxxxxxxxxxxxxxxxxxxxxxxx99999997 xxxxxxxxxxxxxxxxxxxxxxxxxx99999998 xxxxxxxxxxxxxxxxxxxxxxxxxx99999999
Ok, I took your advice. I picked a block (# 277316, hash 0000000000000001b6b9a13b095e96db41c4a928b97ef2d944a9b31b2cc7bdc4) and I made a loop with the nonce from 0 to 2^20: #!/usr/bin/env python import hashlib
nomefile="dist_hash" maxnonce=2**20
version='00000002' previousblockhash='0000000000000002a7bbd25a417c0374cc55261021e8a9ca74442b01284f0569' merkleroot='c91c008c26e50763e9f548bb8b2fc323735f73577effbc55502c51eb4cc7cf2e' timestamp='52be093a' difficultytarget='1903a30c' nonce=0
def byteswap(a): b=''; for i in range(0,len(a)/2+1): b=a[2*i:2*i+2]+b return b
header=byteswap(version)+byteswap(previousblockhash)+byteswap(merkleroot)+byteswap(timestamp)+byteswap(difficultytarget)
with open(nomefile, 'w') as f: f.write("NONCE" + "\t\t" + "BLOCK HEADER HASH" + "\n") for nonce in range(0,maxnonce) : #924591752 is the best nonce_st = byteswap("{0:0{1}x}".format(nonce,8)) #format: hex (8 bytes), little endian hash_result = hashlib.sha256((header+nonce_st).decode('hex')).digest() hash_result = hashlib.sha256(hash_result).hexdigest() f.write("{0:0{1}}".format(nonce,8) + "\t" + byteswap(hash_result) + "\n")
Output sorted by ascending hash: NONCE BLOCK HEADER HASH 00073675 00001161c27946bfbc99e631e5cfd1bf979c27bd68207991cb97e3d054c3192b 00563775 00001f485a9ad567aeb6a18a514f28b807f6a5c43a166debdc74b7ac46a765ea 00108164 000024a77201aaa21f86a43878bc50ef8be511105b403cad9c4b99e6f6b5563c 00172165 00002f3ea713eef6411ca3b2ead29ce859b1330d05df76a298bed3cf9807d666 00661616 000033f44a25e97b6dca43e4ba8dc9fe0e0b48777cc936f774433ce6244c71d4 ... ... 00052351 ffffaa0a77c61b614e110c280bbb06c6d48561df1d6b064e4718a75ca97d509f 00845204 ffffc35f3b2a105c5a9d057b03e5eea26bb83f5d6482070be0eaa28a4dcb2f6b 00999281 ffffc4ee9128c730094017f32614562b434c3c782263b4adeb2a64fdc14a8584 00245490 ffffe3579d3529b5fe46b12db6eb472a6d5df707d4a738b7259ea7b2c62bae9b 00282384 ffffef1566a137349da9e6e02ecd638c7508933c08428f099a7f0808f10d7c6f
The last digit distribution: TOT = 2^20 = 1048576 last digit fr. prob. unif. --> 1/16=0.0625 0 62401 0.062401 1 62764 0.062764 2 62608 0.062608 3 62638 0.062638 4 62465 0.062465 5 62695 0.062695 6 62351 0.062351 7 62706 0.062706 8 62093 0.062093 --> MIN = -0.000407 = -0.65% 9 62340 0.06234 10 (a) 62884 0.062884 --> MAX = +0.000384 = +0.61% 11 (b) 62412 0.062412 12 (c) 62455 0.062455 13 (d) 62204 0.062204 14 (e) 62220 0.06222 15 (f) 62764 0.062764
and the last 2 digits distribution: last 2 digits fr. prob. unif. --> 1/256=0.00390625
78 (4E) 3751 0.003751 MIN --> -0.00015525 = -3,97% ........................................... ........................................... 47 (8A) 4117 0.004117 MAX --> +0.00021075 = +5.40%
|
|
|
|
Dabs
Legendary
Offline
Activity: 3416
Merit: 1912
The Concierge of Crypto
|
|
December 07, 2016, 07:14:44 PM |
|
Looks good. For more than 2 digits distribution, you should do a few million more, or a few billion more, or 2^32 or something, to get a better picture of the distribution. At one point, you'll see that, maybe there's no point going further.
No one needs pi to more than 100 decimal places, but geeks have been computing it to billions of digits.
|
|
|
|
|