Double hashing: less entropy?

Bitcoin Forum

September 13, 2025, 05:25:14 AM

Welcome, Guest. Please login or register.

News: Latest Bitcoin Core release: 29.0 [Torrent]

Home

Help

Search

Login

Register

More

Bitcoin Forum > Bitcoin > Development & Technical Discussion > Double hashing: less entropy?

Pages: « 1 [2] All

« previous topic next topic »

Author

Topic: Double hashing: less entropy? (Read 4066 times)

Meni Rosenfeld

Donator
Legendary

expert

Offline

Offline

Activity: 2058
Merit: 1057

WWW

Re: Double hashing: less entropy?

June 11, 2012, 04:55:38 PM

#21

Quote from: someone42 on June 11, 2012, 04:35:08 PM

Quote from: Meni Rosenfeld on June 11, 2012, 12:57:56 PM

No, because the assumptions I made become less true the more rounds are done (maybe they're not even accurate enough after one round). The set of all possible images of SHA256^N becomes smaller for larger N until it converges to a fixed set (which is probably very large). Then SHA-256 is a permutation (one-to-one mapping) on this set. (This is true for every function from a space to itself).

But this is probably because I assumed that each round had its own independent random oracle. The results may be different for a fixed function like SHA-256.

Right, this is because of the changing function, with some support from the continuity approximation.

Let's go to the extreme: Let's say after several rounds we end up with 2 elements, so k=2/N. Then after another round we should end up with 2-2/N) elements which is a bit less, and with additional rounds we converge to 0.

Of course, we cannot actually have a noninteger number of elements. But that's not a problem when we change the function each time. Most functions will map 2 elements to 2 elements; but after enough rounds we eventually stumble upon a function that maps both elements to one.

With a consistent function, this can't happen; either it sends them to 2 elements (which must be the same ones) or it doesn't. If it does then it remains at 2 elements forever without decreasing further.

And this doesn't have to happen when we're at 2 elements; after enough rounds, we've weeded out all elements that are not in the function's cycles, and what we're left with we're stuck with forever. And the set of cycle elements can be very large, though I have no idea just how large.

1EofoZNBhWQ3kxfKnvWkhtMns4AivZArhr   |   Who am I?   |   bitcoin-otc WoT
Bitcoil - Exchange bitcoins for ILS (thread)   |   Israel Bitcoin community homepage (thread)
Analysis of Bitcoin Pooled Mining Reward Systems (thread, summary) |   PureMining - Infinite-term, deterministic mining bond

someone42

Member

Offline

Offline

Activity: 78
Merit: 11

Chris Chua

Re: Double hashing: less entropy?

June 11, 2012, 07:15:04 PM

#22

Quote from: Meni Rosenfeld on June 11, 2012, 04:55:38 PM

Right, this is because of the changing function, with some support from the continuity approximation.

Let's go to the extreme: Let's say after several rounds we end up with 2 elements, so k=2/N. Then after another round we should end up with 2-2/N) elements which is a bit less, and with additional rounds we converge to 0.

Of course, we cannot actually have a noninteger number of elements. But that's not a problem when we change the function each time. Most functions will map 2 elements to 2 elements; but after enough rounds we eventually stumble upon a function that maps both elements to one.

With a consistent function, this can't happen; either it sends them to 2 elements (which must be the same ones) or it doesn't. If it does then it remains at 2 elements forever without decreasing further.

And this doesn't have to happen when we're at 2 elements; after enough rounds, we've weeded out all elements that are not in the function's cycles, and what we're left with we're stuck with forever. And the set of cycle elements can be very large, though I have no idea just how large.

You've given me an idea on how to analyse the fixed-function case. You're better than me at this stuff, so correct me if I'm wrong.

Represent every element as a vertex in a graph. Then, add directed edges between vertices according to how SHA-256 maps elements onto other elements. The final set of cycle elements is all the stuff that is part of a cycle. The question is, how big is this set? I attempted a rough estimate, and was surprised at what I got.

The graph can be constructed by building chains/cycles. To begin, pick an element, run it through the random oracle, then add a directed edge between input and output. Then pick the output element, run it through the same random oracle ... etc. This builds up the first chain, and eventually this chain will loop back on itself, creating a cycle. How big is this cycle? Let N be the number of elements. After adding 1 edge, the probability that the chain will loop back on itself is 1 / N. After adding k edges, the probability that the chain will loop back on itself is k / N. Thus the probability that the first chain has k edges with no cycles is:

Code:

(1 - 1/N)(1 - 2/N)...(1 - k/N)

This can be approximated as (1 - k / (2N)) ^ k (as long as k is much smaller than N), and then approximated again as exp(-k ^ 2 / (2N)). Equating this to 0.5 reveals that the first chain, on average, has a length of roughly sqrt(N).

The rest of the graph can be constructed by picking an unconnected vertex and running it through the random oracle, building another chain. However, this time, the chain either loops back on itself (creating another cycle) or merges with a previous chain. Very roughly, the second chain has a 1/2 chance of creating another cycle and a 1/2 chance of merging with the first chain (because, its average length should be similar to the first chain's average length). Likewise, the ith chain has a 1/i chance of creating another cycle and a (i - 1)/i chance of merging with any one of the previous (i - 1) chains. The average number of cycles is then 1 + 1/2 + 1/3 + 1/4 + ...; the harmonic series. Even for 2 ^ 128 terms, this is only about 100.

My estimate for the size of the final set is: average cycle length * average number of cycles, and is very roughly, 100 * sqrt(N). For SHA-256, this is about 2 ^ 135. That's much lower than I expected! But to get to this, you probably have to go through an insane (like 2 ^ 128) number of rounds.

dooglus

Legendary

Offline

Offline

Activity: 2940
Merit: 1334

Re: Double hashing: less entropy?

June 11, 2012, 08:10:52 PM

#23

Quote from: znort987 on June 11, 2012, 11:21:19 AM

Quote from: kokjo on June 11, 2012, 11:18:37 AM

satoshi encouraged people to mine with gpus, he did foresee this.

Eh. I remember hearing the opposite. I probably remember wrong.

No, you're right. He foresaw it, but discouraged it:

Quote from: satoshi on December 12, 2009, 05:52:44 PM

We should have a gentleman's agreement to postpone the GPU arms race as long as we can for the good of the network. It's much easer to get new users up to speed if they don't have to worry about GPU drivers and compatibility. It's nice how anyone with just a CPU can compete fairly equally right now.

Just-Dice

██
██████████
██████████████████
██████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████
██████████████
██████

Play or Invest

██
██████████
██████████████████
██████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████████████
██████████████████████
██████████████
██████

1% House Edge

Gavin Andresen

Legendary

Offline

Offline

Activity: 1652
Merit: 2402

Chief Scientist

WWW

Re: Double hashing: less entropy?

June 12, 2012, 12:56:08 PM

#24

Relevant discussion here:
http://crypto.stackexchange.com/questions/779/hashing-or-encrypting-twice-to-increase-security

How often do you get the chance to work on a potentially world-changing project?

Meni Rosenfeld

Donator
Legendary

expert

Offline

Offline

Activity: 2058
Merit: 1057

WWW

Re: Double hashing: less entropy?

June 12, 2012, 02:02:14 PM

#25

Quote from: someone42 on June 11, 2012, 07:15:04 PM

Quote from: Meni Rosenfeld on June 11, 2012, 04:55:38 PM

Right, this is because of the changing function, with some support from the continuity approximation.

Let's go to the extreme: Let's say after several rounds we end up with 2 elements, so k=2/N. Then after another round we should end up with 2-2/N) elements which is a bit less, and with additional rounds we converge to 0.

Of course, we cannot actually have a noninteger number of elements. But that's not a problem when we change the function each time. Most functions will map 2 elements to 2 elements; but after enough rounds we eventually stumble upon a function that maps both elements to one.

With a consistent function, this can't happen; either it sends them to 2 elements (which must be the same ones) or it doesn't. If it does then it remains at 2 elements forever without decreasing further.

And this doesn't have to happen when we're at 2 elements; after enough rounds, we've weeded out all elements that are not in the function's cycles, and what we're left with we're stuck with forever. And the set of cycle elements can be very large, though I have no idea just how large.

You've given me an idea on how to analyse the fixed-function case. You're better than me at this stuff, so correct me if I'm wrong.

Represent every element as a vertex in a graph. Then, add directed edges between vertices according to how SHA-256 maps elements onto other elements. The final set of cycle elements is all the stuff that is part of a cycle. The question is, how big is this set? I attempted a rough estimate, and was surprised at what I got.

The graph can be constructed by building chains/cycles. To begin, pick an element, run it through the random oracle, then add a directed edge between input and output. Then pick the output element, run it through the same random oracle ... etc. This builds up the first chain, and eventually this chain will loop back on itself, creating a cycle. How big is this cycle? Let N be the number of elements. After adding 1 edge, the probability that the chain will loop back on itself is 1 / N. After adding k edges, the probability that the chain will loop back on itself is k / N. Thus the probability that the first chain has k edges with no cycles is:

Code:

(1 - 1/N)(1 - 2/N)...(1 - k/N)

This can be approximated as (1 - k / (2N)) ^ k (as long as k is much smaller than N), and then approximated again as exp(-k ^ 2 / (2N)). Equating this to 0.5 reveals that the first chain, on average, has a length of roughly sqrt(N).

The rest of the graph can be constructed by picking an unconnected vertex and running it through the random oracle, building another chain. However, this time, the chain either loops back on itself (creating another cycle) or merges with a previous chain. Very roughly, the second chain has a 1/2 chance of creating another cycle and a 1/2 chance of merging with the first chain (because, its average length should be similar to the first chain's average length). Likewise, the ith chain has a 1/i chance of creating another cycle and a (i - 1)/i chance of merging with any one of the previous (i - 1) chains. The average number of cycles is then 1 + 1/2 + 1/3 + 1/4 + ...; the harmonic series. Even for 2 ^ 128 terms, this is only about 100.

My estimate for the size of the final set is: average cycle length * average number of cycles, and is very roughly, 100 * sqrt(N). For SHA-256, this is about 2 ^ 135. That's much lower than I expected! But to get to this, you probably have to go through an insane (like 2 ^ 128) number of rounds.

Looks ok, there are a few small issues but they don't matter much.

1EofoZNBhWQ3kxfKnvWkhtMns4AivZArhr   |   Who am I?   |   bitcoin-otc WoT
Bitcoil - Exchange bitcoins for ILS (thread)   |   Israel Bitcoin community homepage (thread)
Analysis of Bitcoin Pooled Mining Reward Systems (thread, summary) |   PureMining - Infinite-term, deterministic mining bond

Pages: « 1 [2] All

Bitcoin Forum > Bitcoin > Development & Technical Discussion > Double hashing: less entropy?

« previous topic next topic »

Jump to:

Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines