A extremely simple POW concept tailored for CPU mining

therealdf (OP)

Newbie

Offline

Activity: 7
Merit: 0

A extremely simple POW concept tailored for CPU mining

March 20, 2018, 05:01:34 AM

#1

Hey guys,

Long time lurker, first time posting. Let me get right to it.

I have been going through nearly every algorithms whitepaper, or implementation and ingesting all i can, because why the hell not.

Now i can not only appreciate the level of hard work that went into these algorithms, and then the optimizations that happen on some after. Some of the source for some algorithms took me weeks to fully ingest and understand on a programming and mathematical level. Now the ones that always interested me the most was CryptoNight and Blake2. I quite enjoy the characteristics of how they come up with different ways to favor the CPU. However i have been experimenting with my own modified version of these algorithms and one of them is a extremely simple idea that was so easy to implement and test that it seems i have to be missing something.

This algorithm would also favor CPUs, would have ridiculously fast validation time, and should be ASIC unfavorable, as well as not much increase on GPUs if possible at all.

PoW Algorithm

There would be a fixed length Merkel Tree of 8 or 13 gigs (relatively high), and the tree would be made of 32 byte Blake2 leafs. So lets say for the sake of the post we decided on a ~8G Tree. For a ~8G tree consisting of 32 byte leafs we would need to add 134217728 (2^27) "Base Leafs" to the tree in a loop, example:
(Pseudo code to save time)

Code:

MerkelTree MT = new MerkelTree();
for (long long i = 0; i < (2^27); i++)
{
    MT.addLeaf(Blake2.hasher(i));
}

When complete we would have a ~8G MerkelTree in the memory which are simply hashes of the index of the loop. The exact size of the tree after branches combine is (2**28 - 1) * 32 (8589934560 bytes). Now, this is a PoW post after all, so we need to include the Block Header and Nonce. In this implementation, it would literally be up to client miner where in the BASE of the MerkelTree the Block Header and Nonce would go, the algorithm would not care, as long as its in the base. So lets update our example client code to the following:

Code:

MerkelTree MT = new MerkelTree();
char* blockHeader = ...;
long long headerPOS = getRandomLong() % (2^27);
long long Nonce = getRandomLong();
for (long long i = 0; i < (2^27); i++)
{
    if(headerPOS == i)
        MT.addLeaf(Blake2.hasher(blockHeader + Nonce));
    MT.addLeaf(Blake2.hasher(i));
}

Now its also important to know that the other leafs beside the Block Header and Nonce hash can be a hash of anything the client also wants and doesn't have to be the index hashed, allowing this to be dynamic as possible, and allow a infinitely larger possible outputs.

So now at this point the machine as the "Base Leafs" entirely completed. Now we need to complete the MerkelTree Computation, and get our Root and Proofs.
The proof would simply be a simple json array or binary Struct of the siblings and siblings positions needed to achieve the Root.

An example of a proof item, which the Proofs would consist of a array of these items :

Code:

{'right': '19581e27de7ced00ff1ce50b2047e7a567c76b1cbaebabe5ef03f7c3017bb5b7'}

Which would now bring our code to:

Code:

MerkelTree MT = new MerkelTree();
char* blockHeader = ...;
long long headerPOS = getRandomLong() % (2^27);
long long Nonce = getRandomLong();
for (long long i = 0; i < (2^27); i++)
{
    if(headerPOS == i)
        MT.addLeaf(Blake2.hasher(blockHeader + Nonce));
    MT.addLeaf(Blake2.hasher(i));
}
MT.computeTree();

proofs = MT.getProof(Blake2.hasher(blockHeader + Nonce));
rootHash = MT.root;

The miner now hash attempted his first solution. If the Root, hash is lower than the set target then the Mining client can now send the following to the network:

The Nonce used
The Root Hash
The Proofs

Optimization (IMPORTANT!):
Let it be known that because most of the 8GB MerkelTree is random hashes based on the clients wishes, it is most likely a fact that instead of newly allocating 8GB of RAM each loop cycle, the client will most likely allocate once, and only slightly modify on each attempt to get a solution. This is 100% ok in my eyes, and doesn't break the PoW more than it compliments it in my opinion. There are such a vast varieties of ways to produce a different solution with this algorithm like changing the position of the Block Header, the Nonce, each and every leaf in the base, timestamps, and more.

Notes:

The amount of RAM needed is arbitrary right now and would most likely need to be agreed upon in a best case manner, with keeping in mind that we would want to favor
CPU/MEM combo over GPUs, and try to completely eliminate ASICs.
The base hashing algorithm Blake2 is insanely fast on CPUs, and secure with the right implementation.
The algorithm above can also use CryptoNight instead if needed to hit its favorable CPU goal, however i would hate to introduce the slowdown.

Validation:

First, the network would examine the Root, and Proofs hashes length to ensure they are 32 bytes in length, to migrate any attempts to use less RAM.
Second, the network would examine the Levels of the Proofs array. If a client tried to use less than the 8GB Tree, then the levels would be less and invalid.
Third, with all preconditions checked, the network then combines the Block Header and Nonce the same way the client did, and hashes them to get the Hash Target.
Fourth, the network starts with the Hash Target found and combines each Proof Level in the Proof Array. If the remaining last hash is equal to the Root, then the PoW is accepted and completed.

Notes:

The validation is one of the best things i love about this algorithm. The work sent up by the miner to the network is less than 1kb! and the only thing the network needs to do to verify is simply make sure the Block Header and Nonce hash is in the base, and that the proofs match up with the root. This is such a small computation, with only validating a MerkelTree and Blake2 hash speed, that it should in theory be among some of the fastest out there now.
The idea is that when validating, its not just the Root that matters, but the Block Header being in the base and matching has just as strong validation as that root, and when trying to game a MerkelTree, you change the whole thing. So why use anything more complicated than that.

Conclusion:

What i have purposed is a PoW based algorithm that doesn't use 2 through 10 algorithms in one to make it secure, but rather kept it simple with using the MerkelTree in a unique way. These trees provide a amazing level of security and flexibility by themselves, that when combined with a fast algorithm, seems to be up to snuff, if not better than most PoW's out there. Despite my long and poorly written post, this algorithm on paper is actually really simple compared to most of its bothern out there, and the idea of mining with a MerkelTree but in a different way than MTP, interests me for its ridiculously fast validation techniques.

I also enjoy the "Less constants, more dynamic" approach to these algorithms, where the client can choose many of the inputs for themselves, and the network doesn't care as long as the X set rules are used in those choices.

Now that i have laid my thought process and small PoC out there, i would love for some of the brilliant minds here to poke some holes in this, and attempt to break the system. If breaking it is not possible, then is there a reason why a experimental coin shouldn't be created to test this out further?

Thank you for your time.

therealdf (OP)

Newbie

Offline

Activity: 7
Merit: 0

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 05:16:42 AM

#2

Sorry, I also forgot to mention that in order to combat Moores Law, the chosen size of the MerkelTree could increase in size each calendar year, as often suggested by many similar algorithms, which would also never be large enough to effect the validation.

tromp

Legendary

Offline

Activity: 978
Merit: 1087

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 08:27:23 AM

#3

Quote from: therealdf on March 20, 2018, 05:01:34 AM

Second, the network would examine the Levels of the Proofs array. If a client tried to use less than the 8GB Tree, then the levels would be less and invalid.

What are these Levels you speak of?

This PoW doesn't need any memory at all. Instead of computing a Merkle Tree on 2^27 data items (which btw, needs only 1KB of memory) you can just pick 27 random elements in the proof array.

therealdf (OP)

Newbie

Offline

Activity: 7
Merit: 0

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 01:21:43 PM
Last edit: March 20, 2018, 01:32:36 PM by therealdf

#4

I'm confused, how would you pick 27 random elements to put in the proof array, when all those said elements have to be able to compute to the root hash.

If you could effectively pick 27 random elements well enough to where it equaled a hash, then Merkle Trees would be broke, no?

Also the proof levels are how many tiers are in the merkle proof. For example a merkle tree of 10 items would have 5. Which are 10,5,3,2,1. 5 numbers. This number helps verify the starting amount of items later if needed

Edit: could you also example how 2^27 would be 1kb memory? 2^27 is 134217728 which * 32 is 4294967296, which is 4.29GB for the base, making the whole computed tree 8GB, unless I'm missing some optimizations of handling a 32 byte hash leaf MerkleTree?

Also thank you for the reply! I'm still in the process of completely understanding your Cuckoo Cycle and think the second version is quite impressive!

tromp

Legendary

Offline

Activity: 978
Merit: 1087

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 03:03:30 PM

#5

Quote from: therealdf on March 20, 2018, 01:21:43 PM

I'm confused, how would you pick 27 random elements to put in the proof array, when all those said elements have to be able to compute to the root hash.

The 27 elements are the header leaf and the supposed hashes of the siblings. But since you don't verify that sibling hashes are computed from a subtree, they can be freely chosen.

Quote

If you could effectively pick 27 random elements well enough to where it equaled a hash, then Merkle Trees would be broke, no?

No, they still serve to prove inclusion of a data item in a large set.

Quote

Edit: could you also example how 2^27 would be 1kb memory? 2^27 is 134217728 which * 32 is 4294967296, which is 4.29GB for the base, making the whole computed tree 8GB, unless I'm missing some optimizations of handling a 32 byte hash leaf MerkleTree?

No need to store the base or tree.
You can compute the merkle tree root hash with a hash("") call to recursive function
hash(path) {
  if length(path) = 27 then return hash(path_to_index(path));
  left = hash(path++"0");
  right = hash(path++"1");
  return hash(left ++ right);
}
which only needs memory for 27 stack frames.

therealdf (OP)

Newbie

Offline

Activity: 7
Merit: 0

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 04:04:22 PM

#6

Ah! Thank you! I believe i see what you mean, and the weakness would be in the dynamic picking of the base + Block Header Hash, but not the algorithm itself.

This leads me to the conclusion that using a static Merkle Tree contents derived from a known constant seed would still allow the algorithm to compute as normal, and remove this weakness then.

Say if it was a constant in every client that the seed to use for the Base Construction of the MerkelTree was the constitution words, and everything else in the algorithm proceeded as prior, then this seems like it would achieve the rules set before as a PoW.

Interesting answer, and i knew i was missing something prior! However, this still seems like a viable easy to use PoW

tromp

Legendary

Offline

Activity: 978
Merit: 1087

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 04:31:27 PM

#7

Quote from: therealdf on March 20, 2018, 04:04:22 PM

Interesting answer, and i knew i was missing something prior! However, this still seems like a viable easy to use PoW

I think MTP (Merkle Tree Proof) is what you're trying to do:

https://medium.com/@marketing_28540/zcoin-merkle-tree-proof-version-1-2-c791c9a49a8f

https://bitcointalk.org/index.php?topic=2256042.0

https://bitcointalk.org/index.php?topic=2539743.0

therealdf (OP)

Newbie

Offline

Activity: 7
Merit: 0

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 04:42:57 PM

#8

Thanks for the links! I have exhausted my research on MTP though. As i mentioned in the first post, i believe it is overly complex for little reason, and also has issues in the calculations preformed after its MerkelTree generation.

I believe that with simply using a MerkelTree, Known Base-Target Hash in the tree, and a Nonce, that not only can large memory computation be preformed with a fast hash rate, but that validation of that solution can be sped up dramatically.

I would like to see this Algorithm similar to MTP, but instead of creating a complex function after the tree generation, simply stop at the Tree and use multiple parts of it to cause work, and validation.

I have tested my theory in my last message, and indeed using a static BASE secures the algorithm again. However weirdly in my tests, SHA-256 was running faster than Blake2 by a tremendous amount, and i may have grabbed a bad implementation, or the machine im on isn't optimal for it.

tromp

Legendary

Offline

Activity: 978
Merit: 1087

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 04:59:24 PM

#9

Quote from: therealdf on March 20, 2018, 04:42:57 PM

I have tested my theory in my last message, and indeed using a static BASE secures the algorithm again.

I fail to see how a static base would require significant memory use though...

therealdf (OP)

Newbie

Offline

Activity: 7
Merit: 0

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 05:18:29 PM

#10

My apologizes, i kind of switches theories there without filling you in properly.

I have numerous Merkel Tree type algorithm implementations that i have been creating, testing, and attacking. When you brought up the issue of computing the hash, dropping the data, and continuing, it made me realize that this exact specification would most likely fall on it face due to parallelization and no need to store data when computing on the fly is more favorable.

This lead me back to my original design which is nearly identical, but instead of using memory speed or size, it actually uses Hard Drive read speed and latency to read, because the one device everyone has other than a CPU, and Memory, is a HDD and decent ones are dirt cheap these days.

So the idea is to construct a static known MerkelTree File that is so large it cannot be placed in memory, and can either only be natively seeked, or memory mapped. Lets say ~100GB or (2^31.5 -1) * 32 for the whole tree file size. At this point, the algorithm does not care how much you retain in the memory or not, the majority of latency in the work is the latency to the HDD data and back. The static

The proofs would still be ridiculously small, and validate fast, and what is unique about this algorithm is you would be effectively mining with your HDD more than any other device, also making it different than any type of HDD type mining out there like Proof of Storage, Proof of Time, and others.

Does this make sense? The coin would still be majority CPU mining, just throttled on the speed/latency of the drive you have with it. This would also kill GPU mining, because leaving the kernel, to the memory, to the HDD, is a long ass trip.

tromp

Legendary

Offline

Activity: 978
Merit: 1087

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 07:22:47 PM

#11

Quote from: therealdf on March 20, 2018, 05:18:29 PM

So the idea is to construct a static known MerkelTree File that is so large it cannot be placed in memory, and can either only be natively seeked, or memory mapped. Lets say ~100GB or (2^31.5 -1) * 32 for the whole tree file size. At this point, the algorithm does not care how much you retain in the memory or not, the majority of latency in the work is the latency to the HDD data and back. The static
Does this make sense?

No:-(
You still haven't specified the PoW.
What exactly does your proposed proof verification look like?
Once I understand that, we can discuss what the resource requirements for efficiently solving it might be.

therealdf (OP)

Newbie

Offline

Activity: 7
Merit: 0

Re: A extremely simple POW concept tailored for CPU mining

March 20, 2018, 08:10:47 PM

#12

Instead of me wasting your time, let me write a proper PoC based on this method and im sure it will go alot further than my feeble attempts of words

LtMotioN

Member

Offline

Activity: 210
Merit: 29

Re: A extremely simple POW concept tailored for CPU mining

March 22, 2018, 05:46:14 AM

#13

I am not a master of cryptography or anything, but my understanding is that more difficult algorithms are better for CPU. SHA256 is quite easy, and that is why we see ASICs being created for it.

Should there not be an extremely difficult POW algorithm? Since CPU's are very versatile in what they can process unlike GPU's and ASICs.

Dogs are nice, I don't like cats though.