1) Mathematical background:a) Groups:I recommend to read the wiki article on groups (
http://en.wikipedia.org/wiki/Group_%28mathematics%29) at least partially.
If you don't like to read it for whatever reason and still want to receive some impression of what a group is, here is the short (and not mathematical strict) version:
You know the set of integers. There is an addition defined on the set of integers. You can do calculations like
a + (b + (-a)) = a + ((-a) + b) = (a + (-a)) + b = 0 + b = b
If you have a set of elements and you can do calculations with them like the one above, the set is called an (additive) commutative group. There always is a neutral element,
denoted by 0, which you can add to any other element without changing the element. To every element a there is an inverse element, which in the case of an additive group is
denoted by -a. Adding those 2 elements together results in the neutral element: a + (-a) = 0.
How about multiplication? Does the set of integers form a group with respect to multiplication as operation? No, because there is no inverse element to most integers with
respect to multiplication, e.g. there is no integer a that satifies 2*a=1 (1 is the neutral element of multiplication). How about the rational numbers Q and multiplication?
Well, almost! The only element not having an inverse is the 0. But Q without the element 0, i.e. 0 Q\{0}, is a group with respect to multiplication. In a multiplicative
group the inverse element of a is usually denoted by a^-1 and the neutral element by 1.
b) Modular arithmetic:I recomment to read the wiki article on modular arithmetic (
http://en.wikipedia.org/wiki/Modular_arithmetic) at least partially.
Again, here is the short version:
When you were young, you didn't know about fractions. So if you had 7 apples and wanted to split them among 3 children, each child would get 2 apples and there was 1 apple
remaining: 7 = 2*3 + 1. In modular arithmetics you would say "7 is congruent 1 modulo 3" and write 7 ≡ 1 mod 3. The neat thing about remainders of divisions is that you can
calculate with them as if they were numbers:
7 ≡ 1 mod 3 and 8 ≡ 2 mod 3 and therefore 7+8 ≡ (1+2) mod 3 ≡ 0 mod 3, 7*8 ≡ (1*2) mod 3 ≡ 2 mod 3
When you do calculations modulo a prime p, the set of remainders forms a commutative group with respect to additon and the the set of remainders excluding the 0 forms a
commutative group with respect to multiplication. With those 2 operations (and some additional law which I omit for brevity), the set forms the finite field
Fp.
c) Montgomery curves:I recomment to read the wiki article on elliptic curves (
http://en.wikipedia.org/wiki/Elliptic_curves) and the article on Montgomery curves (
http://en.wikipedia.org/wiki/Montgomery_curve) at least partially.
Here is the short version:
Elliptic curves are described by equations which have the form y^2 = x^3 + ax^2 + bx + c with some numbers a,b,c. (Usually the second highest term is eliminated by a suitable translation of x).
Any point on the curve can be represented by its affine coordinates (x,y).
It is possible to define an addition for points on the curve (no, not by simply adding the x and y coordinates!) Read the wiki article for details. With this addition (and
by introducing another point called the point at infinity which serves as the neutral element), the set of points form an additive, commutative group.The addition of points
gives rise to another operation: if you add a point P to itself giving you P+P, you could as well write 2*P as a short version of P+P. You can guess what 3*P, 4*P, and so on
means. The product of a point by a number is called scalar product.
One can also restrict the allowed values for x (and thus for y) to finite fields
Fp.
Montgomery curves have the form By^2 = x^3 + Ax^2 + x. The reason why those curves are interesting for cryptography is that they allow fast arithmetic. Switching to
projective coordinates X and Z (x=X/Z) one can add points P=(Xp:Zp) and Q(Xq:Zq) without knowing the y coordinate as long as the X and Z coordinates of P-Q are known.
However, if you only know the projective coordinates of one point P=(Xp:Zp) it is impossible to recover the y coordinate. You need to know for instance the projective
coordinates of P, Q and P+Q to recover the y coordinate of P.
2) Analyzing and debugging the process of signing and verification:For the rest of post we fix p=2^255-19 and q=number of elements of the group generated by the so called base point G of curve25519. p and q are both prime numbers.
The Montgomery curve is given by the equation y^2 = x^3 + A*x^2 + x where A = 486662.
Let's start by looking at the processes at a high level, meaning that for the moment we believe that the software is actually doing what the comments of the author suggest.
The sign method of the Crypto class, which has a message and a secret phrase as input, first calls keygen(P, s, k=SHA256(secret phrase)) where k is the only [in]-argument
and P and s are [out]-arguments. keygen() clamps k and then calls core(). core() calculates the x-coordinate of P=k*G (which is the public key for k). Since s ist not null,
the calculation continues. First, the y-coordinate of the point P is recovered and then s is calculated as k^-1 mod q if the y-coordinate is positive or (-k)^-1 if the y-
coordinate is negative (we will see the reason for distinguishing those 2 cases in a moment). One could say s = (sign(P) * k)^-1 if sign(P) is defined to be the sign of the
y-coordinate of P.
The Crypto class then calculates
x = SHA256(message + s) (the "+" means the arrays are appended rather than added),
Y = public key of x (with a call to keygen(Y, null, x)) and
h = SHA256(message + Y)
and finally calls the sign method of the Curve25519 class which returns
v = (x-h)*s mod q.
The signature is defined as pair (v,h).
The process of verifying is as follows:
v and h are recovered from the signature and then Curve25519.verify(Y, v, h, P=publik key of k) is called in order to recover the same Y as above. This is tested comparing
h2 = SHA256(message + Y)
with the given h.
Let's take a closer look how Y is recovered:
For the verification to succeed we need to calculate Y = v * sign(P) * P + h * G since then we have
Y = (x - h) * s * sign(P) * P + h * G = (x - h) * (sign(P) * k)^-1 * sign(P) * k * G + h * G = (x - h) * G + h * G = x * G = public key of x (which is exactly the definition of Y in the sign method).
Oh no! We need to calculate sign(P) but we can't recover the y-coordinate of P! Fear not, a neat trick will help us.
If we knew that y is the y-coordinate of P, then the x-coordinates of P+G and P-G can be calculated to (the x denotes the x-coordinate, y the y-coordinate)
(P+G)x = (Py^2 + Gy^2 - 2 Py Gy)/(Px - Gx)^2 - Px - Gx - A =: s[0]
(P-G)x = (Py^2 + Gy^2 + 2 Py Gy)/(Px - Gx)^2 - Px - Gx - A =: s[1]
With the values for s[0] and s[1] given, the rest of the code of the verify method would calculate v*P+h*G by applying an addition chain.
Calculating Py^2 is no problem since we know the curve equation. There also is a method recip() for calculating a root for Py^2.
Let's say we switch the definition of s[0] and s[1] whenever the root returned from recip is negativ. So in that case we would have
s[0] := (P-G)x = (-((-P)+G))x = ((-P)+G)x
s[1] := (P+G)x = (-((-P)-G))x = ((-P)-G)x
where we used that negating a point is a reflexion across the x-axis and thus doesn't change the x-coordinate.
Switching the definition of s[0] and s[1] means we simply switch from P to -P in the calculation .
But we might have used the wrong y-coordinate of P! So let's check what the influence of that is:
We have to distinguish 4 cases (real y-coordinate denoted by y, y-coordinate returned from recip() denoted by y'):
i) sign(y)=1 and sign(y')=1 ==> y'=y. Thus s[0] = (P+G)x = (sign(P)*P+G)x, s[1] = (P-G)x = (sign(p)*P-G)x
ii) sign(y)=1 and sign(y')=-1 ==> y'=-y. Looking at the formula for P+G and P-G we see that P+G becomes P-G when replacing y with -y and P-G becomes P+G. Since in the case
sign(y')=-1 we switch s[0] and s[1] as well we end up with s[0] = P+G = (sign(P)*P+G)x and s[1]=(sign(P)*P-G)x as in case i).
iii) sign(y)=-1 and sign(y')=1 ==> y'=-y. We don't switch s[0] and s[1] in this case but we really calculate s[0]=(P-G)x, s[1]=(P+G)x because we switch from y to -y in the
formula. So s[0] = (P-G)x = ((-P)+G)x = (sign(P)*P+G)x, s[1] = (P+G)x = ((-P)-G)x = (sign(P)*P-G)x
iv) sign(y)=-1 and sign(y')=-1 ==> y'=y. We switch s[0] and s[1] resulting again in s[0] = (P-G)x = ((-P)+G)x = (sign(P)*P+G)x, s[1] = (P+G)x = ((-P)-G)x = (sign(P)*P-G)x.
So in all the cases we have s[0] = (sign(P)*P+G)x and s[1] = (sign(P)*P-G)x and the addition chain will output v * sign(P) * P + h * G just as desired!
This means the algorithm for signing and verifying is ok and verify() should return true every single time when we use the signature returned from sign().
And still verification fails every now and then, so there have to be bugs somewhere! Maybe a bug in the addition chain? Such a chain is hard to debug, it's simpler to write
our own methods for scalar multiplication and addition of points. I did that and it turns out that my method returns the very same Y as the addition chain every time. It
seems the addition chain is working, the bugs must be somewhere else in the code.
The next part of the code which I took a close look at was the calculation of s = (sign(p)*k)^-1 in the core() method. After the point P=k*G is calculated using a Montgomery
ladder (I have checked the code for adding points and doubling a point), the y-coordinate of P is recovered. How that? Well the ladder not only gives us P but also P+G.
Since G is known too the y-coordinate can be calculated. I have checked the derivation of the formula for y, it's valid. sign(P) is ok, what about the calculation of k^-1, how to do that?
Calculating the inverse k^-1 of k mod q can be done in 2 ways:
i) Fermat's little theorem states that for a prime q and any integer a the following equation holds: a^q ≡ a mod q.
Multiplying with a^-2 on both sides gives a^(q-2) ≡ a^-1 mod q. We simply have to calculate a power of a. This can be done similar to the Montgomery ladder giving us a time
independent algorithm.
ii) We can use the extended Euclidean algorithm. This algorithm computes the gcd (greatest common divisor) of 2 integers a and b. As a byproduct it can also output the
linear representation of the gcd: gcd(a,b) = t1*a + t2*b. That is nice because if we choose a=k and b=q and compute 1=gcd(k,q) = t1*k + t2*q then we have after taking the
equation mod q: t1*k ≡ 1 mod q which means t1=k^-1.
The author of the Curve25519 class chose the latter approach. It might be a problem because the algorithm is time dependent on the input und thus not safe against timing
attacks. I checked the algorithm and it appeared to be ok so I plugged in some numbers.
Testing different values for k and validating that the returned value is indeed the inverse of k, I gained confidence in that method. The only thing you should
not do is
setting k to a negative number like -1. It is interpreted by the algorithm as 2^256 - 1 (positive number) and will not return q-1 as it should. Note also that if egcd32()
returns a negative value for s in the method core(), then q is added to s to make it a positive number.
There is only one method left, that is Curve25519.sign(), so the bug must inside that method. Let's take a look at it.
It calculates (x-h)*s mod q.
Looks legit? Not! If x<h then x-h is negativ! That's not going to work. (x-h)*s will be negativ too (remember: s is positiv) and the mod q
reduction will not return the desired value. We have to take care of the case x<h. I think the easiest way is to reduce the x and h mod q in the beginning. Only then you can
test for a negativ result by looking at the highest bit and, in case it is set, add q to the result making it positive (If you don't reduce x and h mod q then you can't use
mula_small because you can't test the result by looking at the highest bit). So the new method Curve25519.sign() should look like this:
private static final void reduce(byte[] x) {
byte[] tmp=new byte[32];
divmod(tmp, x, 32, ORDER, 32);
if ((x[31] & 0x80) != 0)
{
// x is negativ, add q to it
mula_small(x, x , 0, ORDER, 32, 1);
}
}
public static final boolean sign(byte[] v, byte[] h, byte[] x, byte[] s) {
// v = (x - h) s mod q
int w, i;
byte[] h1 = new byte[32], x1 = new byte[32];
byte[] tmp1 = new byte[64];
byte[] tmp2 = new byte[64];
// Don't clobber the arguments, be nice!
cpy32(h1, h);
cpy32(x1, x);
// Reduce modulo group order
reduce(h1);
reduce(x1);
// v = x1 - h1
// If v is negative, add the group order to it to become positiv.
mula_small(v, x1, 0, h1, 32, -1);
if ((v[31] & 0x80) != 0)
{
mula_small(v, v , 0, ORDER, 32, 1);
}
// tmp1 = (x-h)*s mod q
mula32(tmp1, v, s, 32, 1);
divmod(tmp2, tmp1, 64, ORDER, 32);
for (w = 0, i = 0; i < 32; i++)
w |= v[i] = tmp1[i];
return w != 0;
}
I have tested the new sign() method with 10000 random pass phrases and messages and the verification has not failed a single time!
The bug I found is not the last flaw but nevertheless it is an anoying bug making verify() fail sometimes.