deuteragenie (OP)
Newbie
Offline
Activity: 36
Merit: 0
January 22, 2014, 11:29:05 AM Last edit: January 22, 2014, 07:16:47 PM by deuteragenie
I understand that downloading and verifying the blockchain takes time. As regards verification, I believe most of that time is spent checking digital signatures. Has anyone looked at batch verification of ECDSA signatures? Here is a starting point: http://cse.iitkgp.ac.in/~abhij/publications/AfricaCrypt12-ECDSA-bv.pdf

PS: After a cursory check, I did not find an implementation of these algorithms.
Altoidnerd
January 22, 2014, 01:26:50 PM
Bump. We need a math board.
TierNolan
Legendary
Offline
Activity: 1232
Merit: 1104
January 22, 2014, 02:30:46 PM
The client doesn't use the "fastest" possible implementation. It just uses a standard OpenSSL elliptic curve implementation. They have to balance speed against security.
The idea of a batch system sounds cool though. It looks like it only works if the same public key is used for all signatures?
1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
prezbo
January 22, 2014, 04:57:20 PM
The idea of a batch system sounds cool though. It looks like it only works if the same public key is used for all signatures?
See equation 3 of the naive algorithm: because the curve point multiplication with the base point P is done only once per batch of signatures, it offers roughly a 2x speedup in any case.
TierNolan
Legendary
Offline
Activity: 1232
Merit: 1104
January 22, 2014, 05:44:41 PM
See equation 3 of the naive algorithm: because the curve point multiplication with the base point P is done only once per batch of signatures, it offers roughly a 2x speedup in any case.
They have a problem that the signature is (r, s), which is 2 integers (mod n). You need to verify

sum(R) = sum(u * G) + sum(v * Q)

or

sum(R) = sum(u) * G + sum(v * Q)

The problem is that you don't have R. The parameter r is only the x-coord of R, so R satisfies

y^2 = r^3 + a*r + b (mod p)

That gives you 2 y's for each r value. They suggest trying all combinations. If you do a batch of 4, that gives 16 attempts to find sum(R).

sum(u) * G => 3 integer adds and a point multiply
sum(v * Q) => 4 point multiplies and 3 point adds

Each guess at sum(R) requires 3 point adds, and 8 guesses on average means 24 point adds. It also requires a square root step.

Batch: 5 point multiplies, 27 point adds
Normal: 8 point multiplies, 4 point adds

The batch method has 3 fewer multiplies but 23 more adds. What is the relative time for adds and multiplies?
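A quick note on the square-root step: since the secp256k1 prime satisfies p ≡ 3 (mod 4), a root of y^2 can be taken with a single exponentiation, which yields exactly the two candidate y values per r. A minimal Python sketch (plain-integer affine arithmetic, illustration only, not production code; the scalar 123456789 is an arbitrary stand-in):

```python
# Recovering the two candidate points R from an x-coordinate r on secp256k1.
# Plain-integer affine arithmetic; illustration only, not production code.
p = 2**256 - 2**32 - 977                       # secp256k1 field prime, p % 4 == 3
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P1, P2):
    """Affine point addition; None is the point at infinity."""
    if P1 is None: return P2
    if P2 is None: return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                            # P + (-P)
    if P1 == P2:
        lam = 3 * x1 * x1 * pow(2 * y1, p - 2, p) % p   # a = 0 on this curve
    else:
        lam = (y2 - y1) * pow(x2 - x1, p - 2, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P0):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P0)
        P0 = ec_add(P0, P0)
        k >>= 1
    return R

def lift_x(r):
    """The two points with x-coordinate r, or None if r is not on the curve."""
    y_sq = (pow(r, 3, p) + 7) % p              # curve equation: y^2 = x^3 + 7
    y = pow(y_sq, (p + 1) // 4, p)             # square root: one exponentiation
    if y * y % p != y_sq:
        return None
    return [(r, y), (r, p - y)]

R = ec_mul(123456789, G)                       # pretend R is lost and only r is known
candidates = lift_x(R[0])
assert len(candidates) == 2 and R in candidates
```

With a batch of 4 signatures, that two-way ambiguity per r is what produces the 16 sign combinations.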
prezbo
January 22, 2014, 06:08:23 PM
What is the relative time for adds and multiplies?
Using double-and-add, one multiplication by n requires O(log(n)) additions. I don't think there are more efficient methods. Adding another bit to the r-values to determine the points on the curve would be a very elegant solution, something to consider if it provides a significant improvement.
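To make that cost model concrete, here is a hedged sketch (plain-integer secp256k1 arithmetic, illustration only) that counts what textbook double-and-add actually performs: one doubling per bit of the scalar, plus one addition per set bit, i.e. roughly 128 additions on average for a 256-bit scalar:

```python
# Counting doublings and additions in textbook double-and-add on secp256k1.
# Plain-integer affine arithmetic; illustration only, not production code.
p = 2**256 - 2**32 - 977                       # secp256k1 field prime
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P1, P2):
    """Affine point addition; None is the point at infinity."""
    if P1 is None: return P2
    if P2 is None: return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                            # P + (-P)
    if P1 == P2:
        lam = 3 * x1 * x1 * pow(2 * y1, p - 2, p) % p   # a = 0 on this curve
    else:
        lam = (y2 - y1) * pow(x2 - x1, p - 2, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul_counted(k, P0):
    """Double-and-add, returning (result, #doublings, #additions)."""
    R, doubles, adds = None, 0, 0
    while k:
        if k & 1:
            R = ec_add(R, P0)                  # the very first 'add' is an assignment
            adds += 1
        P0 = ec_add(P0, P0)
        doubles += 1
        k >>= 1
    return R, doubles, adds

k = (1 << 255) | 0x123456789ABCDEF             # an arbitrary 256-bit scalar
R, doubles, adds = ec_mul_counted(k, G)
assert doubles == k.bit_length()               # one doubling per bit, ~256
assert adds == bin(k).count("1")               # one addition per set bit, ~128 on average
```

So with the binary logarithm, a full 256-bit multiply costs on the order of 256 doublings plus ~128 additions, and a doubling costs about the same as an addition in this representation.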
TierNolan
Legendary
Offline
Activity: 1232
Merit: 1104
January 22, 2014, 07:23:01 PM
Using double-and-add, one multiplication by n requires O(log(n)) additions. I don't think there are more efficient methods.
Is that log2? If so, then a multiply "costs" 128 adds?

Adding another bit to the r-values to determine the points on the curve would be a very elegant solution, something to consider if it provides a significant improvement.

Time is proportional to the number of unique public keys. For bitcoin, all the transactions would have different public keys, so the speedup is at most 2X.
deuteragenie (OP)
Newbie
Offline
Activity: 36
Merit: 0
January 22, 2014, 07:23:54 PM
Relevant part of the introduction:
' In this paper, we propose three algorithms to verify original ECDSA signatures in batches. Our algorithms apply to all cases of ECDSA signatures sharing the same curve parameters, although we obtain good speedup figures when all the signatures in the batch come from the same signer. Our algorithms are effective only for small batch sizes'
prezbo
January 22, 2014, 07:28:01 PM
Using double-and-add, one multiplication by n requires O(log(n)) additions. I don't think there are more efficient methods.
Is that log2? If so, then a multiply "costs" 128 adds?

Yes, that would be the binary logarithm.

Time is proportional to the number of unique public keys. For bitcoin, all the transactions would have different public keys, so the speedup is at most 2X.

That is correct; the question is whether it is significant enough to be worth bothering with.
TierNolan
Legendary
Offline
Activity: 1232
Merit: 1104
January 22, 2014, 07:41:34 PM
That is correct; the question is whether it is significant enough to be worth bothering with.
I don't think it would be top priority. The custom bitcoin curve code gives a much bigger speedup without using a potentially insecure technique.
Altoidnerd
January 22, 2014, 08:49:27 PM
See equation 3 of the naive algorithm: because the curve point multiplication with the base point P is done only once per batch of signatures, it offers roughly a 2x speedup in any case.

Sorry to go back to this point... looking at Eq (3), it refers to the situation in which addresses are re-used, a behavior we recently had a thread about (it is discouraged). This makes me suspicious.

Relevant part of the introduction: 'In this paper, we propose three algorithms to verify original ECDSA signatures in batches. Our algorithms apply to all cases of ECDSA signatures sharing the same curve parameters, although we obtain good speedup figures when all the signatures in the batch come from the same signer. Our algorithms are effective only for small batch sizes'

So am I understanding that the speedup comes from recognizing signatures by a particular key within a single block?
prezbo
January 22, 2014, 10:20:41 PM
See equation 3 of the naive algorithm: because the curve point multiplication with the base point P is done only once per batch of signatures, it offers roughly a 2x speedup in any case.

Sorry to go back to this point... looking at Eq (3), it refers to the situation in which addresses are re-used, a behavior we recently had a thread about (it is discouraged). This makes me suspicious.

Reusing addresses would provide additional speedup, but 2x would be guaranteed even with unique addresses.
deuteragenie (OP)
Newbie
Offline
Activity: 36
Merit: 0
January 23, 2014, 08:58:31 AM
It would be good if there were an implementation of this available somewhere, to figure out what the actual speedup could be; on a modern CPU architecture it is not possible to determine the speedup just by counting the number of adds/muls. Unfortunately, I have not been able to identify an implementation.

The custom bitcoin curve code gives a much bigger speedup without using a potentially insecure technique.

Is sipa's custom code used in the current implementation? Why do you say the technique is potentially "insecure"? It would only be used to verify signatures on the blockchain; how can that be "insecure"? The signature verification can be tested thoroughly against a reference implementation (OpenSSL and sipa's), so the likelihood of incorrectly verifying signatures (declaring them valid when they are not, or vice versa) is very low.
TierNolan
Legendary
Offline
Activity: 1232
Merit: 1104
January 23, 2014, 09:48:50 AM
So am I understanding that the speedup comes from recognizing signatures by a particular key within a single block?
The signature test requires 2 point multiplications. One of them is a multiplication by the generator point (G) and the other is by the public key (Q). Assuming you are checking 4 signatures, you normally check:

R1 = a1 * G + b1 * Q
R2 = a2 * G + b2 * Q
R3 = a3 * G + b3 * Q
R4 = a4 * G + b4 * Q

This can be simplified to

(R1 + R2 + R3 + R4) = (a1 + a2 + a3 + a4) * G + (b1 + b2 + b3 + b4) * Q

Point addition and normal integer addition are (relatively) very fast. That reduces the number of multiplies to 2, no matter how many signatures. However, if Q is different for each signature (different public key), then you can only combine the (a1 + a2 + a3 + a4) * G term. So, at most half of the multiplies are eliminated in that case.
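That collapse to 2 multiplies can be checked numerically. A hedged sketch (plain-integer secp256k1 arithmetic; d and the a/b lists are made-up scalars standing in for a key and the signature-derived values, not real signature data):

```python
# Same-key batch: sum(R_i) == (sum a_i)*G + (sum b_i)*Q, i.e. 2 multiplies total.
# Plain-integer affine arithmetic; illustration only, not production code.
p = 2**256 - 2**32 - 977                       # secp256k1 field prime
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P1, P2):
    """Affine point addition; None is the point at infinity."""
    if P1 is None: return P2
    if P2 is None: return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                            # P + (-P)
    if P1 == P2:
        lam = 3 * x1 * x1 * pow(2 * y1, p - 2, p) % p   # a = 0 on this curve
    else:
        lam = (y2 - y1) * pow(x2 - x1, p - 2, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P0):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P0)
        P0 = ec_add(P0, P0)
        k >>= 1
    return R

d = 271828182845904523536                      # made-up private key
Q = ec_mul(d, G)                               # the shared public key

a = [314159, 265358, 979323, 846264]           # made-up stand-ins for the a_i
b = [112358, 132134, 558914, 423337]           # made-up stand-ins for the b_i
R = [ec_add(ec_mul(ai, G), ec_mul(bi, Q)) for ai, bi in zip(a, b)]  # 8 multiplies

lhs = None
for Ri in R:                                   # sum(R_i): 3 cheap point adds
    lhs = ec_add(lhs, Ri)

rhs = ec_add(ec_mul(sum(a) % n, G),            # one multiply by G for the whole batch
             ec_mul(sum(b) % n, Q))            # one multiply by Q (same-key case)
assert lhs == rhs
```

With a distinct Q per signature only the G term collapses, matching the at-most-2X figure.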
TierNolan
Legendary
Offline
Activity: 1232
Merit: 1104
January 23, 2014, 09:51:57 AM Last edit: January 23, 2014, 10:32:09 AM by TierNolan
Why do you say the technique is potentially "insecure"? It would only be used to verify signatures on the blockchain; how can that be "insecure"? The signature verification can be tested thoroughly against a reference implementation (OpenSSL and sipa's), so the likelihood of incorrectly verifying signatures (declaring them valid when they are not, or vice versa) is very low.
The method will definitely pass valid signatures. I was concerned that there might be a way to get invalid signatures accepted somehow. From Google, the authors have since presented a system which uses "randomisers". That suggests that their original scheme had some potential (or actual) weakness. Verifying only A + C = B + D is weaker than verifying A = B and C = D separately.
TierNolan
Legendary
Offline
Activity: 1232
Merit: 1104
January 23, 2014, 05:37:35 PM
Would this work?
It is based on "Shamir's trick".
The only EC operations are double and add. This allows you build up a multiply.
a = a0 + a1*2^1 + a2*2^2 + ... + a(n-1)*2^(n-1)
which can be simplified to
a = a0 + 2*(a1 + 2*(a2 + 2*(... a(n-1))))
aG can then be expressed as
aG = a0*G + 2*(a1*G + 2*(a2*G + 2*(... a(n-1)*G)))
This is the standard double and add rule. You start from the MSB and work backwards.
I am going to assume 3-bit numbers, which shouldn't affect the argument.
aG = a0*G + 2*(a1*G + 2*(a2*G))
bQ = b0*Q + 2*(b1*Q + 2*(b2*Q))
aG + bQ = a0*G + 2*(a1*G + 2*(a2*G)) + b0*Q + 2*(b1*Q + 2*(b2*Q))
re-arrange: aG + bQ = (a0*G + b0*Q) + 2*(a1*G + 2*(a2*G)) + 2*(b1*Q + 2*(b2*Q))
factor by 2: aG + bQ = (a0*G + b0*Q) + 2*(a1*G + 2*(a2*G) + b1*Q + 2*(b2*Q))
re-arrange: aG + bQ = (a0*G + b0*Q) + 2*((a1*G + b1*Q) + 2*(a2*G) + 2*(b2*Q))
factor by 2: aG + bQ = (a0*G + b0*Q) + 2*((a1*G + b1*Q) + 2*(a2*G + b2*Q))
The effect is that you can compute the sum of 2 products with the same number of doublings as a single product.
Note: multiply by a0, b1 etc aren't really multiplies, since they are single bit numbers. You multiply by 1 or 0.
Anyway, for batch mode, that seems to give an equivalent boost.
R1 = a1 * G + b1 * Q
R2 = a2 * G + b2 * Q
R3 = a3 * G + b3 * Q
R4 = a4 * G + b4 * Q
converts to
R1 + R2 + R3 + R4 = a1 * G + b1 * Q1 + a2 * G + b2 * Q2 + a3 * G + b3 * Q3 + a4 * G + b4 * Q4
Even with different public keys you can still get the boost in speed.
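The interleaving described above can be written out directly. A hedged sketch (plain-integer secp256k1 arithmetic, illustration only; a, b, and d are arbitrary made-up scalars): one MSB-first doubling chain shared by both scalars, with at most two point adds per bit:

```python
# Shamir's trick: a*P1 + b*P2 computed with a single shared doubling chain.
# Plain-integer affine arithmetic; illustration only, not production code.
p = 2**256 - 2**32 - 977                       # secp256k1 field prime
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P1, P2):
    """Affine point addition; None is the point at infinity."""
    if P1 is None: return P2
    if P2 is None: return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                            # P + (-P)
    if P1 == P2:
        lam = 3 * x1 * x1 * pow(2 * y1, p - 2, p) % p   # a = 0 on this curve
    else:
        lam = (y2 - y1) * pow(x2 - x1, p - 2, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P0):
    """Plain double-and-add, for comparison."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P0)
        P0 = ec_add(P0, P0)
        k >>= 1
    return R

def shamir_mul(a, b, P1, P2):
    """Compute a*P1 + b*P2, doubling once per bit instead of once per product."""
    R = None
    for i in range(max(a.bit_length(), b.bit_length()) - 1, -1, -1):
        R = ec_add(R, R)                       # shared doubling (no-op while R is None)
        if (a >> i) & 1:
            R = ec_add(R, P1)
        if (b >> i) & 1:
            R = ec_add(R, P2)
    return R

a, b, d = 123456789, 987654321, 555555555      # made-up scalars
Q = ec_mul(d, G)
expected = ec_add(ec_mul(a, G), ec_mul(b, Q))  # two separate multiplies
assert shamir_mul(a, b, G, Q) == expected
```

The single-bit "multiplies" by a0, b1 etc. are just the conditional adds, as described above.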
Altoidnerd
January 23, 2014, 07:29:53 PM
I'm not sure why you defined Qk in the final step.

Moreover, I believe the algebra is fine, but I'm not convinced that a computer rearranging a block like this would actually produce a speedup. We need an analysis in a particular computer language, or simply a test to see whether anything happens faster.
TierNolan
Legendary
Offline
Activity: 1232
Merit: 1104
January 23, 2014, 09:56:46 PM
I'm not sure why you defined Qk in the final step.
I had a go at coding it. The results are:

Bouncy Castle: 2.5 ms per verify
Naive implementation: 3.3 ms per verify
Batch: 1.0 ms per verify

It is 2.5X faster. If it was combined with the internal tricks that Bouncy Castle does, then maybe it would be 4-5X Bouncy Castle's speed. The batch results show that the EC calcs are what is taking the time, so it is raw EC time limited. That means the randomizer thing might be possible: it saves lots of multiplies, but it does so by doing lots of adds. I also calculate R directly, so it doesn't have to make multiple attempts to find a solution.
deuteragenie (OP)
Newbie
Offline
Activity: 36
Merit: 0
January 23, 2014, 10:20:07 PM
Wow! Impressive how fast you did this!
- I suppose that addition can be sped up in Bouncy Castle because parameter 'a' is zero for the bitcoin curve.
- This part of your code is interesting:

for (int j = 255; j >= 0; j--) {
    sum = sum.twice();
    for (int i = 0; i < len; i++) {
        if (u2[i].testBit(j)) {
            sum = sum.add(Q[i]);
        }
    }
}
I do not know how much the 'if' and the 'testBit' method cost, but theoretically they could be replaced by a symbolic expression evaluator, at the cost of traversing a tree and a few pointers. Not sure which would be fastest.
- I am unsure whether the 'attack' on batch verification applies to signature verification of the blockchain in the bitcoin context. Presumably, one could apply batch verification up until, say, the last 1000 blocks, then switch to individual verification.
TierNolan
Legendary
Offline
Activity: 1232
Merit: 1104
January 23, 2014, 10:38:05 PM
Wow! Impressive how fast you did this!
Thanks, but it is mostly the fact that the Bouncy Castle library has an EC point library built in. Also, the formula just "worked"; I had expected an hour or two of debugging at least.

- I suppose that addition can be sped up in Bouncy Castle because parameter 'a' is zero for the bitcoin curve.

Dunno. It looks like point addition doesn't actually depend on the curve parameters, weirdly.

- This part of your code is interesting:

for (int j = 255; j >= 0; j--) {
    sum = sum.twice();
    for (int i = 0; i < len; i++) {
        if (u2[i].testBit(j)) {
            sum = sum.add(Q[i]);
        }
    }
}

I do not know how much the 'if' and the 'testBit' method cost, but theoretically they could be replaced by a symbolic expression evaluator, at the cost of traversing a tree and a few pointers. Not sure which would be fastest.

Java doesn't really do pointers like that. However, I think the EC calcs are probably the limiting factor, so that shouldn't matter. testBit is probably implemented as an array lookup + bit mask.

- I am unsure whether the 'attack' on batch verification applies to signature verification of the blockchain in the bitcoin context. Presumably, one could apply batch verification up until, say, the last 1000 blocks, then switch to individual verification.

Dunno. There is a paper that talks about protecting it by using random numbers. You multiply u1(i) and u2(i) by a constant. It means you have to multiply R(i) by the same constant though, so the LHS needs more calcs too.
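A rough sketch of that randomiser defence (plain-integer secp256k1 arithmetic; all scalars, and the small fixed multipliers 3 and 5 standing in for the large random c_i a real verifier would draw, are made-up illustration values): weighting each equation by an independent c_i rejects a forgery that only cancels in the unweighted sum.

```python
# Randomised batch check: weight each R_i = u1_i*G + u2_i*Q by a constant c_i.
# Plain-integer affine arithmetic; illustration only, not production code.
p = 2**256 - 2**32 - 977                       # secp256k1 field prime
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P1, P2):
    """Affine point addition; None is the point at infinity."""
    if P1 is None: return P2
    if P2 is None: return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                            # P + (-P)
    if P1 == P2:
        lam = 3 * x1 * x1 * pow(2 * y1, p - 2, p) % p   # a = 0 on this curve
    else:
        lam = (y2 - y1) * pow(x2 - x1, p - 2, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P0):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P0)
        P0 = ec_add(P0, P0)
        k >>= 1
    return R

d = 424242                                     # made-up private key
Q = ec_mul(d, G)
u1, u2 = [1111, 2222], [3333, 4444]            # made-up stand-ins for u1_i, u2_i
R = [ec_add(ec_mul(u1[i], G), ec_mul(u2[i], Q)) for i in range(2)]

def plain_batch(Rs):
    """Unweighted sum check: sum(R_i) == sum(u1)*G + sum(u2)*Q."""
    lhs = ec_add(Rs[0], Rs[1])
    rhs = ec_add(ec_mul(sum(u1) % n, G), ec_mul(sum(u2) % n, Q))
    return lhs == rhs

def randomised_batch(Rs, c=(3, 5)):            # c_i would be large and random in practice
    lhs = ec_add(ec_mul(c[0], Rs[0]), ec_mul(c[1], Rs[1]))
    rhs = ec_add(ec_mul((c[0] * u1[0] + c[1] * u1[1]) % n, G),
                 ec_mul((c[0] * u2[0] + c[1] * u2[1]) % n, Q))
    return lhs == rhs

# Forgery: shift R[0] by D and R[1] by -D. Each equation is now false,
# but the two shifts cancel in the plain sum.
D = ec_mul(99, G)
forged = [ec_add(R[0], D), ec_add(R[1], (D[0], p - D[1]))]

assert plain_batch(R) and randomised_batch(R)  # honest batch passes both checks
assert plain_batch(forged)                     # forgery slips past the plain sum
assert not randomised_batch(forged)            # the weighted check catches it
```

The extra cost is exactly the multiplications of the R(i) by the c_i on the left-hand side, as described above.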