Bitcoin Forum
Author Topic: Double hashing: less entropy?  (Read 3995 times)
Kazimir (OP)
Legendary
Activity: 1176
Merit: 1003
June 11, 2012, 10:31:25 AM
 #1

If I understood correctly, in most situations where hashing occurs, the Bitcoin protocol applies double hashing: sha256(sha256(input)) rather than sha256(input).

Why is this, doesn't this actually reduce entropy?

As far as I know, we cannot guarantee that there aren't many different inputs x and y which have different hashes, but the same double hashes. That is, sha256 is not a one-to-one mapping on the 256 bit space, right?
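
For reference, a minimal Python sketch of the two constructions being compared, using the standard hashlib module. The helper names single_sha256 and double_sha256 are made up for illustration and are not taken from any Bitcoin codebase.

```python
import hashlib


def single_sha256(data: bytes) -> bytes:
    """Plain SHA-256 of the input, returned as 32 raw bytes."""
    return hashlib.sha256(data).digest()


def double_sha256(data: bytes) -> bytes:
    """sha256(sha256(input)): the second pass hashes the 32-byte digest of the first."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


if __name__ == "__main__":
    message = b"hello"
    print("single:", single_sha256(message).hex())
    print("double:", double_sha256(message).hex())
```

Note that the second pass always receives exactly 32 bytes, whatever the length of the original input.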


In theory, there's no difference between theory and practice. In practice, there is.
Insert coin(s): 1KazimirL9MNcnFnoosGrEkmMsbYLxPPob
kokjo
Legendary
Activity: 1050
Merit: 1000
You are WRONG!
June 11, 2012, 10:48:37 AM
 #2

that is correct, but the design decision about double sha256, is i think to slow mining down. and using double-sha256 all-around is just for consistency.

it won't be a problem, as long as you don't search the whole 256 bit space for a collision(which you don't want to do).


DISCLAIMER: im not a cryptographer.

"The whole problem with the world is that fools and fanatics are always so certain of themselves and wiser people so full of doubts." -Bertrand Russell
kokjo
June 11, 2012, 11:18:37 AM
 #3

If I understood correctly, in most situations where hashing occurs, the Bitcoin protocol applies double hashing: sha256(sha256(input)) rather than sha256(input).

Why is this, doesn't this actually reduce entropy?

As far as I know, we cannot guarantee that there aren't many different inputs x and y which have different hashes, but the same double hashes. That is, sha256 is not a one-to-one mapping on the 256 bit space, right?



First, do we actually *know* that sha-256 is *not* a one to one mapping on the 256 bit space ?
If it turns out to be, then you've got nothing. I don't know the answer, I'm not a professional cryptographer,
but looking at the code for SHA-256, there doesn't seem to be an obvious dropping of bits within
the transform step itself, but then I am too lazy to analyze it in-depth.

Would love for someone more knowledgeable than I to comment.

Second, if SHA-256 does indeed somehow drop information for a 256-bit input, IMO, if there's a reduction
in entropy, it's likely to be negligible when compared to the additional work needed to untangle the complexity
added by the second round.

Finally, what someone said: the likely intent of the team who designed bitcoin was to slow mining down, not to
add a layer of security there. Arguably, they failed because they didn't foresee the length at which people would
go to mine coins (first GPUs, then FPGAs, then dedicated ASICs).

Had they realized, they would have added an scrypt-like round to the hash step.

satoshi encouraged people to mine with gpus, he did foresee this.
Kazimir (OP)
June 11, 2012, 11:25:00 AM
 #4

Sure, I get that double hashing was intended to slow down mining, as well as vexing brute force. But perhaps sha256(sha256(input)+input) would have been better.

First, do we actually *know* that sha-256 is *not* a one to one mapping on the 256 bit space ?
If it turns out to be, then you've got nothing. I don't know the answer, I'm not a professional cryptographer,
but looking at the code for SHA-256, there doesn't seem to be an obvious dropping of bits within
the transform step itself, but then I am too lazy to analyze it in-depth.
Can't tell for sure, but obviously sha256 is designed to be irreversible (besides the mathematical fact that for inputs larger than 256 bit it's bound to lose information).

for example if it does something like X=A+B (with + being a binary addition modulo 2³²) it "drops bits" in the sense that either a 1-bit in A or in B would result in a 1-bit in X.
Similarly for X=(A<<n)+(B>>m) etc, albeit for different bits.

Quote
Second, if SHA-256 does indeed somehow drop information for a 256-bit input, IMO, if there's a reduction
in entropy, it's likely to be negligible when compared to the additional work needed to untangle the complexity
added by the second round.
Could be, or not, I really don't see why or why not that would be negligible. I hope someone with in-depth knowledge about hashing can confirm or deny this? (hopefully confirm :) )

Quote
Finally, what someone said: the likely intent of the team who designed bitcoin was to slow mining down, not to
add a layer of security there. Arguably, they failed because they didn't foresee the length at which people would
go to mine coins (first GPUs, then FPGAs, then dedicated ASICs).

Had they realized, they would have added an scrypt-like round to the hash step.
You mean bcrypt?
(not familiar with scrypt, perhaps you mean something similar, I think bcrypt would have been a very good, flexible, future-proof choice for Bitcoin)

pusle
Member
Activity: 89
Merit: 10
June 11, 2012, 11:27:42 AM
 #5


The way I understand it is that adding more rounds makes it harder to find a short cut/weakness.
The fixed length for the second sha256 stage also removes one type of "attack" often used against such algorithms.
Kazimir (OP)
June 11, 2012, 11:35:29 AM
 #6

The way I understand it is that adding more rounds makes it harder to find a short cut/weakness.
The fixed length for the second sha256 stage also removes one type of "attack" often used against such algorithms.
OK, nonetheless I don't see any demerits (yet it does overcome certain risks) from using the alternative double hashing scheme I mentioned above. Or more generally:

Code:
Hash(input,depth) := 
  Hash(input) if depth==1
  Hash(Hash(input)+input,depth-1) otherwise
(where 'Hash' can be any regular hashing function, such as sha256)

kokjo
June 11, 2012, 11:38:36 AM
 #7

Code:
Hash(input,depth) := 
  Hash(input) if depth==1
  Hash(Hash(input)+input,depth-1) otherwise
(where 'Hash' can be any regular hashing function, such as sha256)
:( don't write recursive code. its bad. for the stack and people mind.
also your code is kind of broken... what does Hash with only one input?
Kazimir (OP)
June 11, 2012, 12:06:05 PM
 #8

:( don't write recursive code. its bad. for the stack and people mind.
Well it was pseudocode, just to get the general idea.

In more detail: (this does something different than the example above)
Code:
function DeepHash( input , depth )
{
  h = '';
  while(0<(depth--)) h = Hash(h+input); // loops exactly 'depth' times; Hash(x) is a regular hashing function, e.g. sha256
  return h;
}

DeepHash(input,1) is the same as Hash(input)
DeepHash(input,2) gives Hash(Hash(input)+input)
DeepHash(input,3) gives Hash(Hash(Hash(input)+input)+input)
etc.
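
A runnable take on the same idea, as a minimal Python sketch. It assumes Hash is SHA-256, that inputs are byte strings, and that '+' means concatenation of the raw digest with the input; the names sha256 and deep_hash are only illustrative.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """The regular one-input hash (assumed to be SHA-256 here)."""
    return hashlib.sha256(data).digest()


def deep_hash(data: bytes, depth: int) -> bytes:
    """Iteratively compute Hash(...Hash(Hash(data) + data) + data...), 'depth' hashes deep."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    h = b""  # empty prefix, so the first pass is simply Hash(data)
    for _ in range(depth):
        h = sha256(h + data)  # previous digest concatenated with the original input
    return h


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, deep_hash(b"some input", d).hex())
```

Because the original input is fed into every pass, each pass sees the full input again instead of only a 256-bit digest.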

Quote
also your code is kind of broken... what does Hash with only one input?
It was a polymorphic function, a two-parameter alternative to the already existing Hash function with one input (i.e. regular sha256).

Meni Rosenfeld
Donator
Legendary
Activity: 2058
Merit: 1054
June 11, 2012, 12:15:53 PM
 #9

I did a calculation which says that every application of SHA-256 reduces entropy by about 0.5734 bits. I have no idea if that's correct.

The reason for this sacrifice is almost certainly to prevent cracks in SHA-256 from being immediately translated to an attack on Bitcoin hashing.

First, do we actually *know* that sha-256 is *not* a one to one mapping on the 256 bit space ?
If it turns out to be, then you've got nothing. I don't know the answer, I'm not a professional cryptographer,
but looking at the code for SHA-256, there doesn't seem to be an obvious dropping of bits within
the transform step itself, but then I am too lazy to analyze it in-depth.
SHA-256, as a cryptographic hash function, aspires to be indistinguishable from random. If it was in fact random, the number of preimages for every 256-bit element would follow the Poisson distribution - about 36% would have no preimage, 36% would have one, 18% two, 6% three and so on. So I'd say it's almost certain that it's not a 1-1 mapping.
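
The Poisson claim cannot be checked on the full 256-bit space, but it can be illustrated on a toy version. The sketch below assumes a made-up 16-bit function, toy_hash, built by truncating SHA-256 to its first two bytes; it counts how many 16-bit inputs land on each 16-bit output and compares the histogram with a Poisson distribution of mean 1. This says nothing definitive about SHA-256 on 256-bit inputs, it only shows what "behaves like a random function" looks like.

```python
import hashlib
import math
from collections import Counter

BITS = 16
N = 1 << BITS  # size of the toy input and output space


def toy_hash(x: int) -> int:
    """Hypothetical 16-bit hash: first two bytes of SHA-256 of the 2-byte input."""
    digest = hashlib.sha256(x.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big")


# Number of preimages for every output that is hit at least once.
preimage_counts = Counter(toy_hash(x) for x in range(N))

# histogram[k] = number of outputs with exactly k preimages.
histogram = Counter(preimage_counts.values())
histogram[0] = N - len(preimage_counts)  # outputs that are never hit

if __name__ == "__main__":
    for k in range(5):
        observed = histogram[k] / N
        expected = math.exp(-1) / math.factorial(k)  # Poisson with mean 1
        print(f"{k} preimages: observed {observed:.4f}, Poisson {expected:.4f}")
```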

Finally, what someone said: the likely intent of the team who designed bitcoin was to slow mining down, not to
add a layer of security there.
What could that mean? The difficulty controls the mining rate. If a hash function half as hard would be chosen, the difficulty would double and you'd have the same generation rate.

Arguably, they failed because they didn't foresee the length at which people would
go to mine coins (first GPUs, then FPGAs, then dedicated ASICs).
Of course they foresaw all of this, if not the timing of their advent.

Had they realized, they would have added an scrypt-like round to the hash step.
The hash function should be easy to verify - each application should be fast but block generation requires many applications. Choosing a slow hash function would be counterproductive.

satoshi encouraged people to mine with gpus, he did foresee this.
Eh. I remember hearing the opposite. I probably remember wrong.
I know of one comment Satoshi made about GPUs, and it wasn't an encouragement:
We should have a gentleman's agreement to postpone the GPU arms race as long as we can for the good of the network.  It's much easer to get new users up to speed if they don't have to worry about GPU drivers and compatibility.  It's nice how anyone with just a CPU can compete fairly equally right now.

1EofoZNBhWQ3kxfKnvWkhtMns4AivZArhr   |   Who am I?   |   bitcoin-otc WoT
Bitcoil - Exchange bitcoins for ILS (thread)   |   Israel Bitcoin community homepage (thread)
Analysis of Bitcoin Pooled Mining Reward Systems (thread, summary)  |   PureMining - Infinite-term, deterministic mining bond
Kazimir (OP)
June 11, 2012, 12:24:36 PM
 #10

I did a calculation which says that every application of SHA-256 reduces entropy by about 0.5734 bits. I have no idea if that's correct.
Ah, interesting. Well if that's true, the problem seems insignificant, although it could have been prevented entirely by an approach like I posted above (just an example, obviously there are numerous alternatives). And still covering the vulnerability for potential cracks in sha256.

kokjo
June 11, 2012, 12:32:36 PM
 #11

I did a calculation which says that every application of SHA-256 reduces entropy by about 0.5734 bits. I have no idea if that's correct.
Ah, interesting. Well if that's true, the problem seems insignificant, although it could have been prevented entirely by an approach like I posted above (just an example, obviously there are numerous alternatives). And still covering the vulnerability for potential cracks in sha256.

no! the number is wrong, the entropy after 512 rounds of sha256, will be below 0, if its true. which is not good or insignificant. its a huge error.

an acceptable reduction would be around 10^-80 bits/round

sorry dude you did your math wrong.
TangibleCryptography
Sr. Member
Activity: 476
Merit: 250
Tangible Cryptography LLC
June 11, 2012, 12:33:41 PM
 #12

There are lots of "quirks" with the protocol but the core of it remains solid and sometimes surprisingly inclusive.

Rather than iterating a single hashing function, using two different hashing functions would have provided more resistance in the event that SHA-256 is significantly degraded.

i.e. RIPEMD160(SHA-256(input)+input)

ironically a modified version of this structure is used for creating addresses from public keys but not hashing.
Pieter Wuille
Legendary
Activity: 1072
Merit: 1174
June 11, 2012, 12:48:03 PM
 #13

I did a calculation which says that every application of SHA-256 reduces entropy by about 0.5734 bits. I have no idea if that's correct.

Assuming SHA-256 is a random function (maps every input to a uniformly random independent output), you will end up having (on average) (1-1/e)*2^256 different outputs. This indeed means a loss of entropy of about half a bit. Further iterations map a smaller space to an output space of 2^256, and the loss of entropy of each further application drops very quickly. It's certainly not the case that you lose any significant amount by doing 1000 iterations or so.
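
The (1-1/e)*2^256 figure can be sketched as follows, under the same random-function assumption. The symbols N = 2^256 and y (an arbitrary fixed output) are introduced here only for illustration. Each of the N inputs misses y with probability 1 - 1/N, independently, so:

```latex
\Pr[\,y \text{ has no preimage}\,] = \left(1 - \frac{1}{N}\right)^{N} \approx e^{-1},
\qquad
\mathbb{E}\bigl[\,\lvert \mathrm{image} \rvert\,\bigr] \approx \left(1 - e^{-1}\right) N \approx 0.632\,N,
\qquad
\log_2 N - \log_2\!\bigl((1 - e^{-1})\,N\bigr) = -\log_2\!\left(1 - e^{-1}\right) \approx 0.66 \text{ bits}.
```

This counts only how many distinct outputs remain, and treats them as equally likely.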

I do Bitcoin stuff.
Kazimir (OP)
June 11, 2012, 12:53:28 PM
 #14

no! the number is wrong, the entropy after 512 rounds of sha256, will be below 0, if its true. which is not good or insignificant. its a huge error.

an acceptable reduction would be around 10^-80 bits/round

sorry dude you did your math wrong.
Eh no, it doesn't work that way. Decreasing entropy from 256 bit to (256-0.5734)=255.4266 bit means a reduction by factor 255.4266/256 = 0.997760156250

0.997760156250^512 ≈ 0.317243314, so after 512 rounds, about 31.7% or 81.2 bits of entropy remains.

FYI: the standard Bitcoin-Qt client applies many tens of thousands of hashing rounds (or sometimes hundreds of thousands, depending on your cpu speed) to create the master key for wallet encryption. Well, rest assured, all wallets do not have the same master key :)

Meni Rosenfeld
June 11, 2012, 12:57:56 PM
 #15

I did a calculation which says that every application of SHA-256 reduces entropy by about 0.5734 bits. I have no idea if that's correct.
Mmmh.

Does that mean that sha256 ^N(some 256bit input) for N sufficiently large
would always converge to the same value, independent of the actual input.
No, because the assumptions I made become less true the more rounds are done (maybe they're not even accurate enough after one round). The set of all possible images of SHA256^N becomes smaller for larger N until it converges to a fixed set (which is probably very large). Then SHA-256 is a permutation (one-to-one mapping) on this set. (This is true for every function from a space to itself).
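
The shrinking image set can be watched on a toy example. The sketch below assumes a made-up 16-bit function, toy_hash (SHA-256 truncated to two bytes), standing in for SHA-256 on the 256-bit space. It applies the function to the whole space, then to the image, and so on, until the set stops shrinking; at that point the function maps the set onto itself, i.e. it is a permutation there.

```python
import hashlib

BITS = 16
N = 1 << BITS  # size of the toy space


def toy_hash(x: int) -> int:
    """Hypothetical 16-bit hash: first two bytes of SHA-256 of the 2-byte input."""
    digest = hashlib.sha256(x.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big")


# Precompute the function once so the iteration below is only table lookups.
table = [toy_hash(x) for x in range(N)]

current = set(range(N))  # image after 0 rounds: the whole space
sizes = []               # sizes[i] = size of the image after i + 1 rounds
while True:
    image = {table[x] for x in current}
    if len(image) == len(current):
        break  # image is a subset of current with the same size, so they are equal
    sizes.append(len(image))
    current = image

if __name__ == "__main__":
    print("image size after 1 round:", sizes[0], "of", N)
    print("rounds until the set stops shrinking:", len(sizes))
    print("size of the fixed set:", len(current))
```

For a random function on N points the fixed set is expected to be on the order of sqrt(N) elements, which for N = 2^256 would still be an astronomically large set.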

Quote
First, do we actually *know* that sha-256 is *not* a one to one mapping on the 256 bit space ?
If it turns out to be, then you've got nothing. I don't know the answer, I'm not a professional cryptographer,
but looking at the code for SHA-256, there doesn't seem to be an obvious dropping of bits within
the transform step itself, but then I am too lazy to analyze it in-depth.
SHA-256, as a cryptographic hash function, aspires to be indistinguishable from random. If it was in fact random, the number of preimages for every 256-bit element would follow the Poisson distribution - about 36% would have no preimage, 36% would have one, 18% two, 6% three and so on. So I'd say it's almost certain that it's not a 1-1 mapping.
Very interesting observation, and you're probably correct.
Even more interesting would be to find a way to measure if that
is indeed the case (even if the verification is in a monte-carlo sense)
That's probably as hard as determining whether the function is broken.

Quote
Finally, what someone said: the likely intent of the team who designed bitcoin was to slow mining down, not to
add a layer of security there.
What could that mean? The difficulty controls the mining rate. If a hash function half as hard would be chosen, the difficulty would double and you'd have the same generation rate.

You're right, but then I'm not entirely convinced by the 'this buys us time the day sha-256 gets cracked' either.
If that was their concern, why not combine two wildly differing 256-bit hash algorithm and XOR the result ?
That probably could work too, but that also doesn't guarantee improved security. Maybe the two hash functions are actually connected in some unexpected way and the XOR is actually weaker than either.

SHA-256 itself is multiple iterations of a basic function. I'm guessing it is assumed that the basic function itself is ok and that the more times you apply it, the harder it is to attack.

kokjo
June 11, 2012, 12:58:14 PM
 #16

i should really stop posting in this thread, im embarrassing myself.
Kazimir (OP)
June 11, 2012, 01:05:15 PM
 #17

No, because the assumptions I made become less true the more rounds are done (maybe they're not even accurate enough after one round). The set of all possible images of SHA256^N becomes smaller for larger N until it converges to a fixed set (which is probably very large). Then SHA-256 is a permutation (one-to-one mapping) on this set. (This is true for every function from a space to itself).
Ah right, yeah. It would be interesting to know (or at least have a reasonable estimate) how large this convergence set is.

Intuitively I'd say it's quite sufficient to hold as a secure hash for the foreseeable future, although I don't have any reasonable arguments for this.

TangibleCryptography
June 11, 2012, 01:58:22 PM
 #18

Do people who design hash functions target these kind of properties or are they sort of
an after-the-fact implication of stronger sought-after properties of the function ?

It is a property they are targeting.  The ideal hashing function is described as a random oracle
http://en.wikipedia.org/wiki/Random_oracle

No hashing function to date is a perfect random oracle but that is the ideal they are striving for.
Meni Rosenfeld
June 11, 2012, 03:52:32 PM
Last edit: June 11, 2012, 04:35:58 PM by Meni Rosenfeld
 #19

I did a calculation which says that every application of SHA-256 reduces entropy by about 0.5734 bits. I have no idea if that's correct.

Assuming SHA-256 is a random function (maps every input to a uniformly random independent output), you will end up having (on average) (1-1/e)*2^256 different outputs. This indeed means a loss of entropy of about half a bit. Further iterations map a smaller space to an output space of 2^256, and the loss of entropy of each further application drops very quickly. It's certainly not the case that you lose any significant amount by doing 1000 iterations or so.
I took it one step further and considered the distribution among the outputs, which is not uniform; the result for the amount of entropy lost is (1/e)*sum(log_2 n / (n-1)!). But this is probably little more than a nice exercise, as SHA-256 is likely too far from random for this calculation to be meaningful.

And I mistakenly input ln rather than log_2 to the software, so the value I want really should be 0.827.
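
Both numbers can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes the random-function model, where an output with n preimages has probability n/N and a fraction (1/e)/n! of outputs have n preimages, which gives the formula (1/e)*sum(log n / (n-1)!). The function name entropy_loss and the cutoff of 60 terms are arbitrary choices for illustration.

```python
import math


def entropy_loss(log=math.log2, terms: int = 60) -> float:
    """(1/e) * sum over n >= 1 of log(n) / (n-1)!, truncated after 'terms' terms."""
    total = sum(log(n) / math.factorial(n - 1) for n in range(1, terms + 1))
    return total / math.e


if __name__ == "__main__":
    print(f"with log_2 (bits): {entropy_loss(math.log2):.4f}")
    print(f"with ln (nats):    {entropy_loss(math.log):.4f}")
```

The factorial in the denominator makes the terms vanish quickly, so the cutoff has no visible effect on the printed digits.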

someone42
Member
Activity: 78
Merit: 10
Chris Chua
June 11, 2012, 04:35:08 PM
 #20

No, because the assumptions I made become less true the more rounds are done (maybe they're not even accurate enough after one round). The set of all possible images of SHA256^N becomes smaller for larger N until it converges to a fixed set (which is probably very large). Then SHA-256 is a permutation (one-to-one mapping) on this set. (This is true for every function from a space to itself).

I thought it would be interesting to see what the entropy reduction is for multiple rounds. I assumed each round has its own independent random oracle which maps k * N elements to N potential output elements, where 0 <= k <= 1 and N is 2 ^ 256. For each round, I found that on average, exp(-k) * N output elements have no preimage. Therefore, each round maps k * N elements to (1 - exp(-k)) * N elements.

Iterating this, the entropy reduction (ignoring the non-uniform output distribution for now) is:
Round   Cumulative entropy reduction (bits)
1       0.6617
2       1.0938
4       1.6800
8       2.4032
16      3.2306
32      4.1285
64      5.0704
128     6.0381
256     7.0204

I don't observe any convergence, and indeed the equation k = 1 - exp(-k) has one solution: at k = 0. But this is probably because I assumed that each round had its own independent random oracle. The results may be different for a fixed function like SHA-256.
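
The iteration described above is easy to rerun. A minimal Python sketch, under the same assumption of an independent random oracle per round and ignoring the non-uniform output distribution; the function name cumulative_entropy_reduction is made up for illustration. It starts from k = 1 (the full space), applies k -> 1 - exp(-k) once per round, and reports -log_2(k).

```python
import math


def cumulative_entropy_reduction(rounds: int) -> float:
    """Bits lost after 'rounds' rounds, measured as -log2 of the surviving fraction k."""
    k = 1.0  # fraction of the N = 2^256 elements still reachable
    for _ in range(rounds):
        k = -math.expm1(-k)  # 1 - exp(-k), computed accurately for small k
    return -math.log2(k)


if __name__ == "__main__":
    print("Round   Cumulative entropy reduction (bits)")
    for r in (1, 2, 4, 8, 16, 32, 64, 128, 256):
        print(f"{r:<7d} {cumulative_entropy_reduction(r):.4f}")
```

Each doubling of the round count costs roughly one more bit, consistent with k shrinking roughly like 2/round rather than converging to a positive value.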