an explanation of asic boost

There has been a lot of discussion about the "cheating" due to a "secret algorithm" that is ASIC BOOST. I'm going to try to explain in rather simple but technical terms why "asic boost" is an almost trivial optimisation of hashcash proof of work.

First, we have to understand very well the structure of the SHA-256 hashing function, because it contains the key to understanding asic boost. The sha-256 hashing protocol consists of two aspects:
- the overall protocol definition of how exactly a data set is cut up in pieces, and treated by a "compression function"
- the compression function itself, which is the heart of the system. It is the subtle interplay between these two aspects, which renders the asic boost optimisation almost trivial.

The compression function of the SHA-2 family is defined in a long tradition of structures of block cyphers invented by the NSA since DES: Feistel networks. In other words, the compression function of the SHA-2 family is not really invented to be a hash function, but is in fact a good symmetric block encryption algorithm, according to a typical structure used a lot by the NSA since 40 years. For a hash function, there was no need to take this structure, but it can be used, although it is "overkill". So, in order to understand this, we have to go still a step back, and understand block cyphers.

Block cyphers are algorithms that take a "block of clear data" and "a secret key", and produce a "cypher text block" of the same size as the data block. DES, AES-256, blowfish, .... are all examples of block cyphers. Now, the trick of these algorithms is to use the secret key to scramble the data, in such a way that we can unscramble the data again with that same key, but such that revealing the key is essentially impossible. As such, these algorithms take two inputs: the secret key, and the clear data, and produce the cypher data.

This is done by processing the clear data in several "rounds of scrambling" in a pipeline ; and each round is fed with a "piece of key". However, in order not to reveal the secret key, the original secret key goes through a "self-scrambling" procedure out of which come "round keys", one for each of the rounds in the data scrambling procedure.

Note that the "round keys" ONLY depend upon the secret key, but that the data scrambling will depend both on the data to be cyphered, and on the round keys, so on the secret key. Usually, deriving the key rounds is "heavy" because you're supposed to do this only once, while you have to crunch a lot of different data with the same key (and hence the same round keys).

==> this will be the key insight of ASIC BOOST: the round keys only depend on the key, not on the data, while the data output depends on both.

cypher text (256 bits) = f(key (512 bits), clear text (256 bits) )

Now, when using a block cypher type of algorithm in a hash function, the roles of "data" and "key" are inverted. Indeed, in a cypher function, clear data and the cypher data are of same length, and the key can be longer or shorter (usually longer: in AES-256, the key length is 256 bits, and the data blocks are 128 bits for instance). When using this as a hash primitive compression function, one takes what used to be the "data" as the hash input and output, and what used to be the secret key is now the to be hashed data.

Now, this is funny: it means that the output hash with SAME DATA (same "secret key") and DIFFERENT INPUT HASHES can be calculated more efficiently (we only have to re-do the hash rounds, but we can keep the "key schedule" which is dependent on the data), than if we have the same hash input and different data.

==> this is the "trivial" insight of asic boost.

Let us now look at how the overall protocol works in SHA-256.

1) we cut the data in blocks of 512 bits (64 bytes).

2) we pad the last block to make it full 64 bytes, including a word that tells us how many bytes we included in total.

Call these blocks D1, D2, .... Dn

Note that for a bitcoin block header, which is 80 bytes, this corresponds to two blocks: D1, and D2, where D2 is padded in always the same way (containing a tail that tells us that we used 80 bytes).

3) make a construction so that:

H_i = f(key = D_i , "data" = H_i-1 )

with H_0 a fixed set of 256 bits: "5be0cd19 1f83d9ab 9b05688c 510e527f a54ff53a 3c6ef372 bb67ae85 6a09e667" called IV.

this initial hash value is part of the SHA-256 protocol.

The last H_i value is the final hash of the whole data set.

Now, in our case, we have only two steps:

H1 = f(D1 , IV)

H2 = f(D2, H1)

The "standard" way to calculate the hash, is to run over the nonce in D2. But this means that each time, we have to REDO the key schedule of D2, and the fact that H1 is the same doesn't help us, because its treatment depends on the key schedule of D2. The opposite, however, does, and that's the idea of asic boost.

The ASIC boost insight is that if we loop over D1, and we can keep D2 constant, the key schedule from D2 can be RE-USED with changing H1, just like in a block cypher application, the key schedule is re-used when one uses the block cypher over many blocks of a data file to be encrypted: the SAME key schedule is then introduced in the rounds that successively applied to H1, result in H2.

That's all there is to it: keeping the key schedule in the second invocation of the compression function, by keeping D2 constant. There's no point in keeping D1 constant, because the constant H1 doesn't help us when D2 changes: the round keys are different and hence the whole computation of H1 into H2 has to be redone.

In fact, what ASIC boost simply does, is something akin to:

If you have to encrypt different data with the same key (here, the same D2), you can re-use the key schedule. If you have to encrypt the same data with different secret keys (the standard way of doing things), then you cannot re-use anything.

In bitcoin's hashcash, this last H2 output still has to be re-hashed again once more, but this is an independent invocation of SHA-256, which in any case has to be done.

We get: D3 as H2 sufficiently padded to turn the 32 bytes of H2 into a 64 byte block, and then: H3 = f(D3,IV).

I'm surprised about 2 things:

1) that this rather obvious scheme wasn't thought off before
2) that the "re-use of the key schedule" as is normally always done in the application of block cyphers, is patentable.