Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: charlescharles on October 07, 2014, 02:56:28 PM



Title: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 07, 2014, 02:56:28 PM
I might be looking at an old version, but it seems like the combiner function for internal nodes (described here: https://github.com/olalonde/proof-of-liabilities#internal-node), which is the following:

Code:
function NodeCombiner (left_child, right_child) {
  var n = {};
  n.sum = left_child.sum + right_child.sum;
  n.hash = sha256(string(n.sum) + '|' + left_child.hash + '|' + right_child.hash);
  return n;
}

is vulnerable to an attack in which the prover replaces the expression for n.sum with
Code:
n.sum = max(left_child.sum, right_child.sum)
This is undetectable by anyone asking for a proof of inclusion in the liabilities Merkle tree. A fix for this would be to replace the expression for n.hash with something like
Code:
n.hash = sha256(string(left_child.sum) + '|' + string(right_child.sum) + '|' + left_child.hash + '|' + right_child.hash)
Has this been brought up/addressed? Let me know if this is unclear.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 07, 2014, 09:31:00 PM
Hi charlescharles,

Good catch. This actually addressed in iwilcox's writeup (https://iwilcox.me.uk/2014/proving-bitcoin-reserves) --- grep for "similar customer pairing trick". In fact this is safe, since the sum for each node is part of its hash. This means that if you tried to zero out one of the children, that child's hash would change. So even though the unsummed child amounts are not explicitly hashed, the are committed to as part of the child nodes' hashes. I hope that make sense.

By the way, in future, if you think you've found a security/crypto bug, it's best to bug people in private (in this case olalonde or gmaxwell would be appropriate -- or go on IRC #bitcoin-wizards and ask to PM somebody). Especially if there is a possibility of loss of funds (if the wrong people know about something before it's fixed) responsible disclosure is critical.

Thanks for auditing code!

Andrew


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 07, 2014, 09:56:02 PM
Hey Andy,

I'm not sure that fixes this issue, though. Let's call the properties of each node 'value' and 'hash' ('value' instead of 'sum'). Let's look at an 'end' node, node P, with two leaves, one customer A with 10 BTC and another one customer B with 5 BTC. Let's say you're the company and you want to pretend your liabilities are less than they actually are. Then you construct this node like

Code:
Node P
value = 10
hash = sha256(10 | hashA | hashB)

/                           \
customer A           customer B
value = 10            value = 5
hash = hashA        hash = hashB

In this case, when A wants proof of inclusion, you give node P to A, and tell A that her paired customer's value is 0. This checks out, so A believes you because P.value = 10 = 10 + 0. When B wants proof of inclusion, you still give the exact same node P to B, and tell B that his paired customer's value is 5, and this checks out for him because P.value = 10 = 5 + 5. There's no way for B to verify that the value hashed into hashA is actually 10, not 5.

You repeat this procedure for all the nodes on A's branch up to the root. Note that you never alter hashes, and the Merkle property holds; all you do is falsify the stub values you tell people for the other branch. Also note that this doesn't require any special pairing; this is a defect in the node-combining function itself.

Thanks for the info, by the way! I'll do that in the future.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 07, 2014, 10:28:21 PM
My reading of iwilcox's post (well, my reading of the big diagram with nodes circled) is that the preimages of the child-node hashes (just the preimages of the hashes actually in the path from user's account to root --- the whole tree is not expanded of course!) are given to the user, so the user can verify this. I haven't verified that that's what olalonde's code actually does, but I'd expect it to be the case.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 07, 2014, 10:47:59 PM
But the thing is that in order to avoid this vulnerability, a customer would need to verify the preimages of the preimages of the hashes in the account-root path -- i.e., if the customer is node.left, he would have to know node.value, node.hash, node.right.userAccount, node.right.value, node.right.hash, and node.right.nonce. It's not enough to know node.right.hash and node.right.value because you cannot be certain that node.right.value is the same value that was hashed to create node.right.hash.

If you look at this --
Code:
hexstr( firstfourbytes( sha256( "satoshi@example.com3333ab00" ))) = e4fd9d12

the value is 3333, and the nonce is ab00. If the prover tells you, "this node's hash is e4fd9d12 and its value is 0", you have no way of verifying this is true unless you know "satoshi@example.com" and also "ab00", in which case you could see that hexstr(firstfourbytes(sha256("satoshi@example.com0ab00"))) doesn't match.

Taking my example above, if you're customer A, the prover could very well tell you that node B's hash is hashB and node B's value is 0. This would match up because P.value = 10 + 0, and P.hash = hash(10 | hashA | hashB). If you're customer B, the prover could tell you that node A's hash is hashA and node A's value is 5. This also matches up because P.value = 5 + 5 and P.hash = hash(10 | hashA | hashB). Does this make sense?


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 08, 2014, 12:48:10 PM
You are making sense. What I'm saying is that in the iwilcox writeup, all the information is provided to the user to verify the hash preimages. Which does include user data for one other user, apparently, though I recall some discussion at the time gmaxwell proposed this about cleaning that up..


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 08, 2014, 01:16:24 PM
If that's the case, then this is no longer a Merkle tree in the sense that proofs of inclusion now take O(N) (rather than O(lg(N))) space. Consider a root with four leaves and three inner nodes:

Code:
                     nodeA
                     valA = 4
                     hashA = hash(val | hashB | hashC)
                       /                                          \
                nodeB                                              nodeC
                valB =  2                                          valC = 4
                 hashB = hash(2 | hashD | hashE)       hashC = hash(4 | hashF | hashG)
                 /                  \                                 /                                     \
nodeD                            nodeE                               nodeF                                    nodeG
valD = 1                         valE = 2                            valF = 3                                valG = 4
hashD = hash('alice' | 1)  hashE = hash('bob' | 2)      hashF = hash('charlie' | 3)      hashG = hash('david' | 4)                  

if nodeE wants proof of inclusion, nodeE needs to require nodeD.name and nodeD.val in order to verify hashD in order to verify that valB is correct. Notice this is in much more than what is required for the standard Merkle tree proof of inclusion, in which nodeE would just require hashD, not valD. Now to verify valA, bob needs valC, but bob can't be sure that valC is actually the hash in hashC. So now bob also needs hashF, valF, hashG, and valG, in order to verify valC = valF + valG. But bob can't be sure about valF or valG either unless he knows the names 'charlie' and 'david' in addition to hashF and hashG, to verify those two hashes' correctness. As you can see, this is no longer just user data for one other user, but user data for every single other user in the tree.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 08, 2014, 02:00:37 PM
It's still logarithmic. You only reveal the preimages of the neighboring hashes (which include amounts for the immediate children of everything in your merkle path), there is no need to recurse further.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 08, 2014, 02:09:16 PM
That's not sufficient, though. In the above example, if node E gets node D, node B, node A, and node C (as it would in a standard Merkle inclusion proof) there would be no way to verify that nodeC.val is actually the value in nodeC.hash. The prover could falsify nodeC.val without changing nodeC.hash, and there would be no way for nodeE to detect this. In this case, the prover could tell nodeE that nodeC.val = 2 and nodeA.val = 2 + 2 = 4, and nodeE would not be able to verify this.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 08, 2014, 02:25:01 PM
The prover could falsify nodeC.val without changing nodeC.hash

The prover cannot, because nodeC.hash = H(4 || nodeC.left_child_hash || nodeC.right_child_hash) and the user is provided all three pieces of information in the hash (as part of nodeC). To change the value of nodeC the prover would need to find strings X, Y such that nodeC.hash = H(2 || X || Y), which is impossible if H is a collision-resistant hash function.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 08, 2014, 02:37:55 PM
The prover cannot, because nodeC.hash = H(4 || nodeC.left_child_hash || nodeC.right_child_hash) and the user is provided all three pieces of information in the hash (as part of nodeC).

Agreed, this is what I was saying before. Node E needs to require more than just node C from the prover; node E needs nodeC.left.hash, nodeC.left.value, nodeC.right.hash, and nodeC.right.value to verify node C.hash. But then you run into the same problem with nodeC.left and nodeC.right -- you need their hash preimages as well to verify their values and hashes.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: throwaway084575 on October 08, 2014, 03:12:38 PM
I really like the merkle tree solution to anonymising the users and holdings.

But sorry if I'm missing something here, how is any of this proof of solvency? I can see that it's proof of liability, but to prove solvency we would need to see public addresses of hot and cold wallets?



Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 08, 2014, 03:19:00 PM
This scheme doesn't anonymize that much -- you still know the value held by your "paired" user. Proof of reserves is accomplished by signing some string with private keys controlling UTXOs adding to the value of your claimed reserve. There's no need to reveal any public addresses for proof-of-liability, only for proof-of-reserve.

Anyway, the point of this thread is that the node-combining function the proof-of-liability tree is defective, for reasons I've explained above.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 08, 2014, 03:26:44 PM
The prover cannot, because nodeC.hash = H(4 || nodeC.left_child_hash || nodeC.right_child_hash) and the user is provided all three pieces of information in the hash (as part of nodeC).

Agreed, this is what I was saying before. Node E needs to require more than just node C from the prover; node E needs nodeC.left.hash, nodeC.left.value, nodeC.right.hash, and nodeC.right.value to verify node C.hash. But then you run into the same problem with nodeC.left and nodeC.right -- you need their hash preimages as well to verify their values and hashes.

You don't need their hash preimages. If these hashes are incorrect it will be detected by the users whose merkle paths go through these nodes.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 08, 2014, 03:32:46 PM
The prover does not at any point alter the hashes, only the values he's telling people for proofs of inclusion. The reason you can't trust (what the prover tells you for) nodeC.value is that you can't be sure that it is the same value that's in nodeC.hash.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 08, 2014, 03:35:09 PM
The value of each node is part of the input to the hash for that node. The verifier can compute this himself, the prover does not need to make any claims about it.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 08, 2014, 03:41:33 PM
But the verifier can't compute nodeC.hash without also knowing nodeC's children's hashes, correct?

This logic applies to nodeC's children as well, and so on until the leaves.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 08, 2014, 05:07:54 PM
The prover provides nodeC's child hashes. There is no need to verify that they are correct, that is the job of verifiers whose merkle paths go through nodeC.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 08, 2014, 05:10:47 PM
The prover provides nodeC's child hashes. There is no need to verify that they are correct

This is true, but the verifier needs to verify that nodeC.value is correct. There's no way to verify that without also verifying nodeC.hash. The only way to verify nodeC.value is the value in nodeC.hash is by computing hash(nodeC.value | nodeC.left.hash | nodeC.right.hash) and comparing this value to nodeC.hash -- do you agree?


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 08, 2014, 05:33:06 PM
Yes.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: charlescharles on October 08, 2014, 06:41:57 PM
So we've established that we need nodeC.left.val, nodeC.left.hash, nodeC.right.val, and nodeC.right.hash to verify the integrity of nodeC.val and nodeC.hash. The same logic applies recursively to nodeC.left and nodeC.right: this is why I said that proofs of inclusion now require O(N) space. You ultimately need to know all the leaves.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: andytoshi on October 08, 2014, 07:12:05 PM
You don't need to verify the integrity of nodeC.left.hash or nodeC.right.hash.


Title: Re: A vulnerability in olalonde's implementation of gmaxwell's proof-of-solvency
Post by: gmaxwell on October 08, 2014, 07:35:20 PM
The requirement is that if there is fraud it must be detectable by some user under some path and that they have the ability to create a transferable proof of their detection. You can't achieve stronger than that (e.g. that if there is fraud all users can detect it) under this approach.  The criteria is met if you show the unsummed values (as listed on iwilcox page) or just show the one step deep off-path preimage.