Rule 30 automaton as hash function

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 19, 2014, 08:04:12 PM

#21

Here is a harder problem: Try to find a message that produces the following hash:

Code:

b27ac6cabaede78ee58ef73c6099e4e8dd8f1491ababe9b3f858f3d1644a51dd

http://jsfiddle.net/ALV5X/

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 20, 2014, 07:03:34 AM

#22

Wolfram's computational irreducibility, if exists, defines the perfect proof of work, therefore the discussion is on spot here or better in an alt-coin forum.

I think that a simple POW algorithm eliminates the risk of possible shortcuts that might exist in complex algorithms like the SHA group.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 20, 2014, 08:00:28 AM

#23

Quote from: grau on July 20, 2014, 07:03:34 AM

Wolfram's computational irreducibility, if exists, defines the perfect proof of work, therefore the discussion is on spot here or better in an alt-coin forum.

I think that a simple POW algorithm eliminates the risk of possible shortcuts that might exist in complex algorithms like the SHA group.

Yes, if computational irreducibility can be proved or at least assumed to be true according to experts, then it per definition means no shortcuts, no matter what fancy mathematics is used.

SHA-1 is considered broken if I understood it correctly. In that case the same may happen to SHA-2.

If SHA-2 becomes broken then Bitcoin will maybe need to replace the SHA-256 hash function, or add another hash on top of SHA-256.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 20, 2014, 12:06:02 PM
Last edit: July 20, 2014, 12:28:44 PM by Anders

#24

Maybe I should explain how my hash function works. Here is an illustration with a smaller version:

Starting with a sample message: 0123456789

For each byte in the message the initial condition is set to the byte value xor previous wrapped initial condition result, plus the sum of all previous bytes in the message, plus a random value, summed together modulo 256, resulting in something like 5396. In this simple illustration modulo 10 is used for the values and modulo 4 for the length of the initial condition.

The initial condition 5396 is then used in the rule 30 cellular automaton by setting the cell values equal to the bit values of the initial condition. Then a fixed number of rows (cellular automaton generations) are skipped to allow all cells to influence each other left to right and back again. And then after that a hash is generated as the value of every other cell in the expansion of the center column of the cellular automaton. Other values than 640 bits initial condition and 256-bit hash as in my JavaScript example can be used.

Now that I have published it here, no one else can turn this into a patent. Cool

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 20, 2014, 07:02:07 PM

#25

Exact same hash function with fewer calculations: http://jsfiddle.net/7TuBy/

The previous version calculated a lot of unnecessary cells. The cellular automaton expands like this:

Only the cells represented by the triangle need to be calculated.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 20, 2014, 08:48:15 PM

#26

Quote from: DeathAndTaxes on July 19, 2014, 06:56:23 PM

I have to agree with Peter.

Quote

Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can't break.

Schneier's law

A cryptographic primitive becomes more valuable the more it is used (which creates a self reinforcing cycle). Better known and deployed algorithms are more likely to be researched that esoteric ones. The longer the algorithm is around without any published breaks the higher the confidence that the algorithm can't be broken at the current time.

Yes, the cryptographic algorithms need to be heavily battle tested in real applications over a long period of time. The scary thing however is that many cryptographic algorithms can become broken overnight, even after several years of widespread use.

To use a rule 30 hash for Bitcoin today would probably be too shaky. Instead a rule 30 hash can be used in a new altcoin as a pilot project to test the concept in a real application with real attacks and attempts to break the algorithm.

Another scary thing I came to think of: what if some Bitcoin miner already has discovered a way to break SHA-256? And they keep that knowledge secret for their own benefit.

DeathAndTaxes

Donator
Legendary

Offline

Activity: 1218
Merit: 1079

Gerald Davis

Re: Rule 30 automaton as hash function

July 20, 2014, 09:45:38 PM

#27

Quote from: Anders on July 20, 2014, 08:48:15 PM

Yes, the cryptographic algorithms need to be heavily battle tested in real applications over a long period of time. The scary thing however is that many cryptographic algorithms can become broken overnight, even after several years of widespread use.

That is very rarely true. It depends on what you mean by broken. Faster than brute force no matter how much time or energy that would take (or even if the human race could generate energy on that scale)? Sure. Going from no know flaws to exploitable in a real world scenario. That hasn't happened to any major cryptographic hashing function in the last 30 years. Take SHA-1 for example. Collision attacks were found as early as 2005 however almost a decade later there isn't a single known preimage attack. Even MD5 which wouldn't be recommended for any new application is still resistance to first and second preimage attacks.

Quote

Another scary thing I came to think of: what if some Bitcoin miner already has discovered a way to break SHA-256? And they keep that knowledge secret for their own benefit.

If you could break SHA-2 there are a lot more valuable things you could do with it. Hell even if you didn't know what to do with it you could sell it to a three letter agency for seven figures easy. Still a flaw in SHA-2 it is unlikely to be useful to a miner. A miner isn't looking for a single hash but rather one of trillions of quadrillions of possible valid hashes. Mining is block is only on complexity 2^60. If you found a flaw in SHA-2 which would allow a preimage attack on the order of 2^80 it would be huge but would still be a thousand times slower than just mining by brute force.

smoothie

Legendary

Offline

Activity: 2492
Merit: 1473

LEALANA Bitcoin Grim Reaper

Re: Rule 30 automaton as hash function

July 20, 2014, 09:51:57 PM

#28

Quote from: DeathAndTaxes on July 19, 2014, 06:56:23 PM

I have to agree with Peter.

Quote

Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can't break.

Schneier's law

A cryptographic primitive becomes more valuable the more it is used (which creates a self reinforcing cycle). Better known and deployed algorithms are more likely to be researched that esoteric ones. The longer the algorithm is around without any published breaks the higher the confidence that the algorithm can't be broken at the current time.

In otherwords trust is through time. And the test of time of a particular algorithm or cryptocoin for that matter also needs to stand that test...

Much of these new alt coins have not stood that test of time.

. ★☆ WWW.LEALANA.COM My PGP fingerprint is A764D833. History of Monero development Visualization ★☆ .
LEALANA BITCOIN GRIM REAPER SILVER COINS.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 20, 2014, 09:57:50 PM

#29

Computational irreducibility, in the case of a hash function, only protects against shortcuts when calculating the hash. It doesn't protect from going backwards from the hash to a matching message. Fortunately, in the case of rule 30, to determine previous values quickly becomes increasingly difficult, as Stephen Wolfram pointed out.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 20, 2014, 10:10:23 PM

#30

Quote from: DeathAndTaxes on July 20, 2014, 09:45:38 PM

... Going from no know flaws to exploitable in a real world scenario. That hasn't happened to any major cryptographic hashing function in the last 30 years. Take SHA-1 for example. Collision attacks were found as early as 2005 however almost a decade later there isn't a single known preimage attack. Even MD5 which wouldn't be recommended for any new application is still resistance to first and second preimage attacks.

Or successful attacks have happen that people keep silent about! Who in the professional world would admit that their company/service got the cryptographic hash broken?

I read that SHA-1 was considered broken just because someone figured out how to find matches some three magnitudes faster than brute force. In practice this may mean that SHA-1 is still secure enough and that it's not recommended for the fear of exposure of even more efficient ways for finding matches.

I myself wouldn't fully trust either SHA-2 or a rule 30 hash function. Who knows what clever techniques the NSA and other agencies like that have. And big companies like IBM may also have methods they keep hidden from the public community.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 20, 2014, 10:29:09 PM

#31

Quote from: DeathAndTaxes on July 20, 2014, 09:45:38 PM

If you could break SHA-2 there are a lot more valuable things you could do with it. Hell even if you didn't know what to do with it you could sell it to a three letter agency for seven figures easy. Still a flaw in SHA-2 it is unlikely to be useful to a miner. A miner isn't looking for a single hash but rather one of trillions of quadrillions of possible valid hashes. Mining is block is only on complexity 2^60. If you found a flaw in SHA-2 which would allow a preimage attack on the order of 2^80 it would be huge but would still be a thousand times slower than just mining by brute force.

But what if a Bitcoin miner has figured out how to select the bits in the block to get the required number of leading zeros? A direct method that bypasses the entire trillion tries brute force approach.

DeathAndTaxes

Donator
Legendary

Offline

Activity: 1218
Merit: 1079

Gerald Davis

Re: Rule 30 automaton as hash function

July 20, 2014, 10:47:42 PM

#32

Quote from: Anders on July 20, 2014, 10:29:09 PM

But what if a Bitcoin miner has figured out how to select the bits in the block to get the required number of leading zeros? A direct method that bypasses the entire trillion tries brute force approach.

Then Bitcoin would fail. There is no evidence anything like that is possible. What if Earth collided with a blackhole tomorrow?

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 21, 2014, 07:32:36 AM

#33

I wonder if a rule 30 hash function would be quantum-safe. Here is an article about SHA-256 not being quantum-safe:

"Bitcoin Is Not Quantum-Safe, And How We Can Fix It When Needed

... Quantum computers have two major tools that make them superior to classical computers in breaking cryptography: Shor’s algorithm and Grover’s algorithm. ... A modified version of Shor’s algorithm can crack elliptic curve cryptography as well, and Grover’s algorithm attacks basically anything, including SHA256 and RIPEMD-160." -- http://bitcoinmagazine.com/6021/bitcoin-is-not-quantum-safe-and-how-we-can-fix/

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 21, 2014, 11:30:12 AM

#34

If I have a hash with 16 bits, then in theory it should take 2^N/2 = 2⁸ = 256 attempts on average to find a collision.

"As an example, if a 64-bit hash is used, there are approximately 1.8 × 10¹⁹ different outputs. If these are all equally probable (the best case), then it would take 'only' approximately 5 billion attempts (5.1 × 10⁹) to generate a collision using brute force. This value is called birthday bound[2] and for n-bit codes it could be computed as 2^n/2.[3]" -- http://en.wikipedia.org/wiki/Birthday_attack#Mathematics

When I tested my hash function I get an average of around 350, not 256:

Attempt 1: 552
Attempt 2: 100
Attempt 3: 452
Attempt 4: 360
Attempt 5: 273
Attempt 6: 336
Attempt 7: 337
Attempt 8: 117
Attempt 9: 419
Attempt 10: 742
Attempt 11: 313
Attempt 12: 408
Attempt 13: 200
Attempt 14: 383
Attempt 15: 385
Attempt 16: 185
Attempt 17: 286
Attempt 18: 265
Attempt 19: 398
Attempt 20: 566
----------------
Average: 354

Source: http://jsfiddle.net/wsEU7/

Is there an error in my code, or is Math.random() in JavaScript a poor random generator, or is the theoretical value 256 only an approximation? Huh

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 21, 2014, 11:48:54 AM

#35

I changed the collision test to use different lengths of the messages tested (1 - 160 characters). Then the average value was around 400, even higher than with fixed length test messages:

Attempt 1: 41
Attempt 2: 487
Attempt 3: 122
Attempt 4: 584
Attempt 5: 771
Attempt 6: 474
Attempt 7: 458
Attempt 8: 378
Attempt 9: 109
Attempt 10: 312
Attempt 11: 585
Attempt 12: 558
Attempt 13: 227
Attempt 14: 515
Attempt 15: 266
Attempt 16: 423
Attempt 17: 462
Attempt 18: 305
Attempt 19: 808
Attempt 20: 296
----------------
Average: 409

Source: http://jsfiddle.net/4nFz8/

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 21, 2014, 01:13:56 PM

#36

I tested with messages of up to 10,000 characters and got an average of about 350, the same as for fixed messages of length 4.

Attempt 1: 419
Attempt 2: 263
Attempt 3: 185
Attempt 4: 372
Attempt 5: 351
Attempt 6: 434
Attempt 7: 155
Attempt 8: 619
Attempt 9: 319
Attempt 10: 336
Attempt 11: 333
Attempt 12: 442
Attempt 13: 95
Attempt 14: 639
Attempt 15: 290
Attempt 16: 284
Attempt 17: 435
Attempt 18: 514
Attempt 19: 98
Attempt 20: 352
----------------
Average: 347

Source: http://jsfiddle.net/My76F/

So on average it seems that around 350 attempts are needed instead of the theoretical 256 attempts. It also shows that my wrapping method for the initial condition seems to work. And the collision-rate looks good, unless I have messed something up in my test.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 21, 2014, 08:20:06 PM

#37

My current hash function is only potentially good when the message is unknown. When the message is known it's easy to construct collisions, such as manipulating the content in a document so that it gets the same hash value. Not good.

One solution to make the hash function more general, other than using the entire message as initial condition (which will become computationally very demanding for long messages), is, as I described earlier, to use another known hash function such as SHA-512 to generate the initial condition.

It's a bit boring to have to rely on another already existing hash function. I will try to figure out a way to make the rule 30 hash function more general without being dependent on other known hash function standards.

Also, the often used Merkle–Damgård construction has problems:

"Unfortunately, this construction also has several undesirable properties:

Length extension — once an attacker has one collision, he can find more very cheaply.
Second preimage attacks against long messages are always much more efficient than brute force.
Multicollisions (many messages with the same hash) can be found with only a little more work than collisions.[5]
"Herding attacks" (first committing to an output h, then mapping messages with arbitrary starting values to h) are possible for more work than finding a collision, but much less than would be expected to do this for a random oracle.[6][7]
"Extension attacks": Given the hash H(X) of an unknown input X, it is easy to find the value of H(pad(X) || Y), where pad is the padding function of the hash. That is, it is possible to find hashes of inputs related to X even though X remains unknown.[8] A random oracle would not have this property, and this may lead to simple attacks even for natural schemes proven secure in the random oracle model.[9] Length extension attack was actually used to attack a number of commercial web message authentication schemes such as one used by Flickr.[10]

" -- http://en.wikipedia.org/wiki/Merkle%E2%80%93Damg%C3%A5rd_construction#Security_characteristics

DeathAndTaxes

Donator
Legendary

Offline

Activity: 1218
Merit: 1079

Gerald Davis

Re: Rule 30 automaton as hash function

July 21, 2014, 10:23:16 PM
Last edit: July 21, 2014, 10:45:07 PM by DeathAndTaxes

#38

Quote from: Anders on July 21, 2014, 08:20:06 PM

One solution to make the hash function more general, other than using the entire message as initial condition (which will become computationally very demanding for long messages), is, as I described earlier, to use another known hash function such as SHA-512 to generate the initial condition.

Then the function is no more secure than SHA-2 so why not just use SHA-2 if the goal is security?

It still may be useful to for a PoW by moving the nonce outside of the blockheader.

R30(nonce + H(blockheader)) < target

The security of the PoW still relies on the preimage resistance of H however if R30 is irreducible then it would prevent more efficient work in the PoW. This has the advantage of making mining hardware highly commoditized which means lower margins (anyone can do it and they work about the same) which is optimal from a security point of view.

Quote

Also, the often used Merkle–Damgård construction has problems: <snipped>

These are weaknesses known to M-D and they are what cryptographers target when attempting to "break" the hashing function. To date nobody has shown a preimage attack on SHA-1 or the more complex SHA-2 is possible.

The issue of length extension doesn't apply to PoW as the header has a fixed length and ordering. Even if you could perform a preimage attack on an existing block via length extension the resulting block would be invalid regardless of the block hash because Bitcoin blocks must be exactly 840 bytes and the elements ordered in a specific order.

In applications where the hash will protect variable length data using a HMAC over the pure hashing function is preferable. HMAC don't suffer from length extension attacks and they make collision attacks less effective. Still this is academical at this point as most hashing functions are still secure against preimage attacks (even the ancient MD5). A major goal of the SHA-3 competition was to bypass some of the weaknesses of M-D construction and as such there are theoretical length extension attacks on SHA-3. Still time trumps all, SHA-2 has been vetted more than SHA-3 at least as of today. Maybe in a decade or so but SHA-3 is a little ahead of its time as SHA-2 held up better than NIST expected it to.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 22, 2014, 06:50:11 AM
Last edit: July 22, 2014, 07:40:35 AM by Anders

#39

Quote from: DeathAndTaxes on July 21, 2014, 10:23:16 PM

Then the function is no more secure than SHA-2 so why not just use SHA-2 if the goal is security?

Hmm... Yes, the combination would perhaps not be more secure than SHA-2 alone except as a way of obfuscating the initial hash.

Quote

It still may be useful to for a PoW by moving the nonce outside of the blockheader.

R30(nonce + H(blockheader)) < target

The security of the PoW still relies on the preimage resistance of H however if R30 is irreducible then it would prevent more efficient work in the PoW. This has the advantage of making mining hardware highly commoditized which means lower margins (anyone can do it and they work about the same) which is optimal from a security point of view.

Concatenating the Bitcoin nonce and the hash of the blockheader and use it as an initial condition for an R30 hash would perhaps work. I don't know enough about the Bitcoin protocol to know for sure. As a proof of work to produce a value less than the target, the work required can easily be made harder by increasing the number of rows (generations) that should be calculated before the hash value is extracted.

In the standard rule 30 cellular automaton the initial condition is always only one bit set to 1. The trick I have done is to replace that initial condition with a whole sequence of bits. And the other part of the trick is to skip enough generations in the cellular automaton so that the initial bits influence each other left to right and back again. This ensures that the initial condition is distributed in a highly random way that is unique for each message, when the hash is long enough. The hash value is taken from every other cell in the center column of the cellular automaton after the initial (skipped) rows have been calculated.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 22, 2014, 08:56:07 AM
Last edit: July 22, 2014, 02:34:06 PM by Anders

#40

Here is a version of my hash function using a kind of Merkle–Damgård construction: http://jsfiddle.net/YkPA3/

Length extension attacks are protected against by concatenating the initial condition with the bits of the length of the message. Any change in length of the message and that bit pattern will become different, making the cellular automaton expand with different cell values.

The compression function is a simple cipher with a key that depends on the previous block (of 640 bits).