Rule 30 automaton as hash function

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 27, 2014, 10:14:20 AM

#81

Quote from: grau on July 27, 2014, 10:10:00 AM

It seems 7 is the bare minimum for a setup like in Bitcoin 80 bytes -> 32 bytes hash.

I got 7 too by trial and error. But is that the worst case pattern? And what about the leftmost value?

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 27, 2014, 10:26:10 AM

#82

Quote from: Anders on July 27, 2014, 10:14:20 AM

Quote from: grau on July 27, 2014, 10:10:00 AM

It seems 7 is the bare minimum for a setup like in Bitcoin 80 bytes -> 32 bytes hash.

I got 7 too by trial and error. But is that the worst case pattern? And what about the leftmost value?

Thats the beauty of the minimum design. One can enumerate all patterns to figure the worst case, since inputs are only 3 bit wide there are only 8 possible patterns.

The leftmost must be fine.

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 27, 2014, 10:40:50 AM

#83

Quote from: grau on July 27, 2014, 10:26:10 AM

Quote from: Anders on July 27, 2014, 10:14:20 AM

Quote from: grau on July 27, 2014, 10:10:00 AM

It seems 7 is the bare minimum for a setup like in Bitcoin 80 bytes -> 32 bytes hash.

I got 7 too by trial and error. But is that the worst case pattern? And what about the leftmost value?

Thats the beauty of the minimum design. One can enumerate all patterns to figure the worst case, since inputs are only 3 bit wide there are only 8 possible patterns.

The leftmost must be fine.

Actually there are 2 2 bit patters in addition 0,1,0,1 ... and 1,0,1,0... so we have 10 cases.

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 27, 2014, 10:52:10 AM

#84

It seems the 0,1,0,1 ... and 1,0,1,0 ... two bit patterns are the worst case that need >7N evolutions.

And there is a corner case of 0,0,0 ... hat leads to a hash of 0,0,0....

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 27, 2014, 11:12:21 AM

#85

With such hash function:
- the work needed is exactly quantified by number of logical operations, no implementation specific gimmicks.
- units implementing rule 30 are identical and utmost simple
- all units can run in parallel for a single layer
- has constant memory requirement, that is just two arrays one of the current and one for next generation of the evolution,
- irreducibility of computation could be rigorously proven

Its single parameter is the number of evolutions before emitting the hash.
This parameter should be set >= 7 times input width.

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 27, 2014, 11:28:55 AM

#86

I have no idea of hardware design, but guess that the whole net could be laid out as is without a memory, it would just need exact clocking.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 27, 2014, 01:03:45 PM

#87

Quote from: grau on July 27, 2014, 10:52:10 AM

It seems the 0,1,0,1 ... and 1,0,1,0 ... two bit patterns are the worst case that need >7N evolutions.

And there is a corner case of 0,0,0 ... hat leads to a hash of 0,0,0....

I think 7N is enough. I tried up to 15,360 bits and it works, with different hash results for UUUU...U and UUUU...V. It also works when changing the start of the message to VUUU...U.

R30 with 7N: http://jsfiddle.net/2Mxt2/

The case with all zeros is taken care of by the extra bits for the length of the message. It's only the empty message '' that results in hash value zero.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 27, 2014, 01:09:22 PM

#88

Quote from: grau on July 27, 2014, 11:12:21 AM

- has constant memory requirement, that is just two arrays one of the current and one for next generation of the evolution,

Actually, I use only one array for the cellular automaton in my implementation. I save a single value in a temporary variable, then write the result of the next generation into the array, and for the next calculation I use the temporary variable instead of the value in the array, iterated from left to right for all the cells.

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 27, 2014, 01:21:46 PM

#89

Quote from: Anders on July 27, 2014, 01:09:22 PM

Quote from: grau on July 27, 2014, 11:12:21 AM

- has constant memory requirement, that is just two arrays one of the current and one for next generation of the evolution,

Actually, I use only one array for the cellular automaton in my implementation. I save a single value in a temporary variable, then write the result of the next generation into the array, and for the next calculation I use the temporary variable instead of the value in the array, iterated from left to right for all the cells.

I do not think such algorithm should be optimised for space, rather time. Yours is rather sub-optimal doing everything sequentially although substitutions are independent and not using that even in software operating on bit arrays is faster than on individual bits. I think I will write one that beats this with a few magnitudes for fun

Later today .. now I rather play with the kids.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 27, 2014, 02:54:00 PM

#90

Fortunately it seems that 7N is enough. I tested with messages of length 100,000 bits:

The R30 hash for "UUU...U" is: e4ebe82d717cf60eb6b7d6473f647d733fb0efbfe930bd3048bb14b0cbe7e97f
The R30 hash for "UUU...V" is: 0f821effa4773fdbbe6d7a8c50bd2146c0fe75e577fab0d59dd57c8aa582ba78

But how to prove that 7N is enough mathematically? Huh

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 27, 2014, 03:38:45 PM

#91

Quote from: grau on July 27, 2014, 01:21:46 PM

Quote from: Anders on July 27, 2014, 01:09:22 PM

Quote from: grau on July 27, 2014, 11:12:21 AM

- has constant memory requirement, that is just two arrays one of the current and one for next generation of the evolution,

Actually, I use only one array for the cellular automaton in my implementation. I save a single value in a temporary variable, then write the result of the next generation into the array, and for the next calculation I use the temporary variable instead of the value in the array, iterated from left to right for all the cells.

I do not think such algorithm should be optimised for space, rather time. Yours is rather sub-optimal doing everything sequentially although substitutions are independent and not using that even in software operating on bit arrays is faster than on individual bits. I think I will write one that beats this with a few magnitudes for fun

Later today .. now I rather play with the kids.

Yes, definitely. The memory requirement is tiny. Then two arrays may be faster. And also to calculate with for example 16-bit words instead of 1 bit at a time as I do now.

If you want to compare the hash values, use my latest version 7N and with reversed bits (the least significant bit for each byte in the message now starts from the left in the automaton): http://jsfiddle.net/MA3vS/

justusranvier

Legendary

Offline

Activity: 1400
Merit: 1009

Re: Rule 30 automaton as hash function

July 27, 2014, 04:25:36 PM

#92

Tell me I'm not the only one who initially misread this headline and imagined something completely different...

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 27, 2014, 05:19:33 PM

#93

Quote from: Anders on July 27, 2014, 02:54:00 PM

But how to prove that 7N is enough mathematically? Huh

We enumerated all three bit patterns and 7N is enough to propagate any change to the middle column, even in worst case pattern. A pattern with repetition cycle of more than 3 bits or a random pattern must start local triangles that propagate at a higher slope than the worst case two bit pattern 01010 ...

Is this enough?

Well, enough for what? I think we should first be clear of properties needed for the purpose of block header hashing.
My take is:
- no known inverse - Wolfram demonstrated nicely that the middle column gives no clue of the input after a few generations.
- full propagation - the nonce should be able to flip any bit in the hash - we think given with 7N generations.
- proof of work - means no shortcuts, irreducible, given more clearly than in SHA256
- able to meet any difficulty target - thats a hard question, but did someone prove it for SHA256d? I think not. If there was an attempt to prove this, it would be easier for this one.

did I missed some?

Peter R

Legendary

Offline

Activity: 1162
Merit: 1007

Re: Rule 30 automaton as hash function

July 27, 2014, 05:38:14 PM

#94

Quote from: grau on July 27, 2014, 05:19:33 PM

Quote from: Anders on July 27, 2014, 02:54:00 PM

But how to prove that 7N is enough mathematically? Huh

We enumerated all three bit patterns and 7N is enough to propagate any change to the middle column, even in worst case pattern. A pattern with repetition cycle of more than 3 bits or a random pattern must start local triangles that propagate at a higher slope than the worst case two bit pattern 01010 ...

Is this enough?

Well, enough for what? I think we should first be clear of properties needed for the purpose of block header hashing.
My take is:
- no known inverse - Wolfram demonstrated nicely that the middle column gives no clue of the input after a few generations.
- full propagation - the nonce should be able to flip any bit in the hash - we think given with 7N generations.
- proof of work - means no shortcuts, irreducible, given more clearly than in SHA256
- able to meet any difficulty target - thats a hard question, but did someone prove it for SHA256d? I think not. If there was an attempt to prove this, it would be easier for this one.

did I missed some?

I don't believe it's a requirement to select the hash bits from the central column (although I realize that this is what Mathematica does with Random[]). Given the fact that the influence of the nonce propagates better moving right than moving left, I think it makes sense to select a column to the right of center. If your example image (nice graphics BTW) shows the worst case propagation, then perhaps this is sufficient:

Run Bitcoin Unlimited (www.bitcoinunlimited.info)

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 27, 2014, 06:08:50 PM
Last edit: July 27, 2014, 08:30:55 PM by Anders

#95

Fewest calculations so far and with 7N: http://jsfiddle.net/34FyH/

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 27, 2014, 06:11:51 PM

#96

Quote from: grau on July 27, 2014, 05:19:33 PM

Quote from: Anders on July 27, 2014, 02:54:00 PM

But how to prove that 7N is enough mathematically? Huh

We enumerated all three bit patterns and 7N is enough to propagate any change to the middle column, even in worst case pattern. A pattern with repetition cycle of more than 3 bits or a random pattern must start local triangles that propagate at a higher slope than the worst case two bit pattern 01010 ...

Is this enough?

Well, enough for what? I think we should first be clear of properties needed for the purpose of block header hashing.
My take is:
- no known inverse - Wolfram demonstrated nicely that the middle column gives no clue of the input after a few generations.
- full propagation - the nonce should be able to flip any bit in the hash - we think given with 7N generations.
- proof of work - means no shortcuts, irreducible, given more clearly than in SHA256
- able to meet any difficulty target - thats a hard question, but did someone prove it for SHA256d? I think not. If there was an attempt to prove this, it would be easier for this one.

did I missed some?

Proved for Bitcoin, yes. But what about longer messages? It seems that the slope converges to 6. something for large messages (I have tested up to 100,000 bits). 7N seems enough, but I don't know for sure if it will work for say a 1 megabyte file as the message.

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 27, 2014, 06:26:07 PM

#97

Here is a more efficient implementation, not yet parallel, but already illustrates the power of simple design for implementation.

Hashing the argument in hex (assumed block header) to 32 bytes of output after 7 * 80 * 8 rounds with rule 30.

Code:

package com.bitsofproof.r30hash;
import java.math.BigInteger;
public class Main 
{
    public static void main(String[] args)
    {
	    BigInteger input = new BigInteger (args[0], 16);
	    BigInteger result = BigInteger.ZERO;
	    int rounds = (80 * 7 + 32) * 8;
	    for ( int i = 0; i < rounds; ++i )
	    {
		    BigInteger q = input.shiftLeft (1);
		    BigInteger r = input.shiftLeft (2);
		    input = input.xor(q.or(r));
		    if ( i >= 80 * 7 * 8  )
		    {
			    if ( input.testBit (i+1) )
			    {
				    result = result.setBit (0);
			    }
			    result = result.shiftLeft (1);
		    }
	    }
	   System.out.println(result.toString (16));
    }
}

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Rule 30 automaton as hash function

July 27, 2014, 06:40:06 PM

#98

Quote from: Peter R on July 27, 2014, 05:38:14 PM

I don't believe it's a requirement to select the hash bits from the central column ...

It is apparent that chaos converges to order toward both left and right, so intuition says to stay in the middle if you want maximum entropy.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 27, 2014, 07:15:17 PM

#99

Quote from: grau on July 27, 2014, 06:26:07 PM

Here is a more efficient implementation, not yet parallel, but already illustrates the power of simple design for implementation.

Hashing the argument in hex (assumed block header) to 32 bytes of output after 7 * 80 * 8 rounds with rule 30.

Code:

package com.bitsofproof.r30hash;
import java.math.BigInteger;
public class Main 
{
    public static void main(String[] args)
    {
	    BigInteger input = new BigInteger (args[0], 16);
	    BigInteger result = BigInteger.ZERO;
	    int rounds = (80 * 7 + 32) * 8;
	    for ( int i = 0; i < rounds; ++i )
	    {
		    BigInteger q = input.shiftLeft (1);
		    BigInteger r = input.shiftLeft (2);
		    input = input.xor(q.or(r));
		    if ( i >= 80 * 7 * 8  )
		    {
			    if ( input.testBit (i+1) )
			    {
				    result = result.setBit (0);
			    }
			    result = result.shiftLeft (1);
		    }
	    }
	   System.out.println(result.toString (16));
    }
}

Neat. I noticed in this paper that indeed Rule 30 can be defined as x_i' = x_i-1 xor (x_i or x_i+1):

Cryptography with Cellular Automata -- http://www.stephenwolfram.com/publications/academic/cryptography-cellular-automata.pdf

I wonder about the performance of BigInteger though.

Anders (OP)

Full Member

Offline

Activity: 126
Merit: 100

Re: Rule 30 automaton as hash function

July 27, 2014, 08:11:37 PM

#100

Using the Java long type (64 bits) would make the algorithm fast. Tedious to implement however with all the padding, boundary matching and shifts of bits between the longs etc. I'm too lazy for that at the moment. Plus BigInteger is perhaps fast enough to match the use of longs for the implementation of R30.