Does anybody have an idea why there is a keyFinderKernelWithDouble in addition to keyFinderKernel? What is the point of this additional function?
After further inspection I found that the call chain trickles down like this:
keyFinderKernelWithDouble
|
v
doIterationWithDouble -> completeBatchAddWithDouble
|
v
beginBatchAddWithDouble
The same call chain appears for keyFinderKernel but without the "WithDouble" suffix.
keyFinderKernel and doIteration don't do anything different apart from calling these differently named functions; the interesting stuff happens in beginBatchAdd and completeBatchAdd.
__device__ __forceinline__ static void beginBatchAdd(const unsigned int *px, const unsigned int *x, unsigned int *chain, int i, int batchIdx, unsigned int inverse[8])
{
    // x = Gx - x
    unsigned int t[8];
    subModP(px, x, t);
    ...
}
__device__ __forceinline__ static void beginBatchAddWithDouble(const unsigned int *px, const unsigned int *py, unsigned int *xPtr, unsigned int *chain, int i, int batchIdx, unsigned int inverse[8])
{
    unsigned int x[8];
    readInt(xPtr, i, x);
    if(equal(px, x)) {
        addModP(py, py, x);
    } else {
        // x = Gx - x
        subModP(px, x, x);
    }
    ...
}
Notice the extra argument "py" (which appears to be the generator point's y coordinate, according to the comments) in the WithDouble function; the non-double function does not have it. When px equals x, the difference px - x (Gx - x) would be zero, and zero has no modular inverse, so the WithDouble version appears to store 2*py instead. That is the denominator of the tangent slope used when a point is doubled. There is no such protective measure in the non-double function.
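To make the role of that branch concrete, here is a minimal host-side sketch in plain C. It is not the project's code: it assumes a toy prime field (p = 17) in place of the 256-bit arithmetic, and it assumes that the "chain" and "inverse" parameters implement the usual batch-inversion trick (many inverses from one modular inversion). The names slope_denominator and batch_invert are made up for illustration.

```c
#include <assert.h>

#define PRIME 17u  /* toy prime (assumption); the real code works modulo a 256-bit prime */

static unsigned add_mod(unsigned a, unsigned b) { return (a + b) % PRIME; }
static unsigned sub_mod(unsigned a, unsigned b) { return (a + PRIME - b) % PRIME; }
static unsigned mul_mod(unsigned a, unsigned b) { return (a * b) % PRIME; }

/* Modular inverse via Fermat's little theorem: a^(PRIME-2). Zero has no inverse. */
static unsigned inv_mod(unsigned a)
{
    unsigned result = 1, base = a % PRIME, e = PRIME - 2;
    assert(base != 0);
    while (e) {
        if (e & 1) result = mul_mod(result, base);
        base = mul_mod(base, base);
        e >>= 1;
    }
    return result;
}

/* Hypothetical helper mirroring the branch in beginBatchAddWithDouble:
   pick the value that will later be inverted to form the slope. */
static unsigned slope_denominator(unsigned px, unsigned py, unsigned x)
{
    if (px == x)
        return add_mod(py, py);  /* same x: use 2*Gy (tangent slope) */
    return sub_mod(px, x);       /* different x: use Gx - x (chord slope) */
}

/* Batch inversion sketch: invert n nonzero values with one inv_mod call. */
static void batch_invert(const unsigned *d, unsigned *out, int n)
{
    unsigned chain[16];          /* running products; this sketch supports n <= 16 */
    unsigned acc = 1, inv;
    int i;
    assert(n > 0 && n <= 16);
    for (i = 0; i < n; i++) {
        acc = mul_mod(acc, d[i]);
        chain[i] = acc;          /* d[0] * ... * d[i] */
    }
    inv = inv_mod(acc);          /* 1 / (d[0] * ... * d[n-1]) */
    for (i = n - 1; i >= 0; i--) {
        unsigned prefix = (i > 0) ? chain[i - 1] : 1;
        out[i] = mul_mod(inv, prefix);  /* 1 / d[i] */
        inv = mul_mod(inv, d[i]);       /* remove d[i] from the running inverse */
    }
}
```

With G = (1, 5) in this toy field, slope_denominator(1, 5, 1) gives 10 (that is 2*Gy) and slope_denominator(1, 5, 2) gives 16 (that is 1 - 2 mod 17). A single zero in the batch would make the whole product zero and spoil every inverse in it, which would explain why the equal-x case has to be caught before the values enter the chain.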
In completeBatchAddWithDouble, the only snippet that differs is this:
if(equal(px, x)) {
    // currently s = 1 / 2y
    unsigned int x2[8];
    unsigned int tx2[8];
    // 3x^2
    mulModP(x, x, x2);
    addModP(x2, x2, tx2);
    addModP(x2, tx2, tx2);
    // s = 3x^2 * 1/2y
    mulModP(tx2, s);
    // s^2
    unsigned int s2[8];
    mulModP(s, s, s2);
    // Rx = s^2 - 2px
    subModP(s2, x, newX);
    subModP(newX, x, newX);
    // Ry = s(px - rx) - py
    unsigned int k[8];
    subModP(px, newX, k);
    mulModP(s, k, newY);
    subModP(newY, py, newY);
} else {
    unsigned int rise[8];
    subModP(py, y, rise);
    mulModP(rise, s);
    // Rx = s^2 - Gx - Qx
    unsigned int s2[8];
    mulModP(s, s, s2);
    subModP(s2, px, newX);
    subModP(newX, x, newX);
    // Ry = s(px - rx) - py
    unsigned int k[8];
    subModP(px, newX, k);
    mulModP(s, k, newY);
    subModP(newY, py, newY);
}
Specifically, this part is not in the non-double counterpart, while the rest is (note that the final s^2 step also appears in the else branch above):
    // 3x^2
    mulModP(x, x, x2);
    addModP(x2, x2, tx2);
    addModP(x2, tx2, tx2);
    // s = 3x^2 * 1/2y
    mulModP(tx2, s);
    // s^2
    unsigned int s2[8];
    mulModP(s, s, s2);
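So it looks like the only changes are that beginBatchAdd puts 2*py (instead of px - x) into the value to be inverted, and completeBatchAdd builds the slope as s = 3x^2 * 1/(2y). Note that s itself is not squared; s^2 is only computed afterwards for Rx = s^2 - 2x, just as it is in the else branch.
In other words, if our search stumbles upon the point (Gx, Gy) by chance, G cannot be added to it with the ordinary formula because Gx - x is zero. So the code doubles the point instead: it inverts 2*Gy, forms s = 3*Gx^2 * 1/(2*Gy), and uses that s to calculate the next point (Rx, Ry).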
This might also explain why the main code only calls keyFinderKernelWithDouble and never keyFinderKernel.
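For anyone who wants to check the two branches by hand, here is a small plain C sketch of both formulas. It is only an illustration under stated assumptions: a toy curve y^2 = x^3 + 7 over the field with p = 17 (same shape as secp256k1, but tiny), and the made-up names Point, ec_add, ec_double and ec_add_with_double. Like the quoted snippet, it only compares x coordinates, so it does not handle the case where the second point is the negative of G.

```c
#include <assert.h>

#define PRIME 17u  /* toy prime (assumption); the real code uses a 256-bit prime */

typedef struct { unsigned x, y; } Point;

static unsigned add_mod(unsigned a, unsigned b) { return (a + b) % PRIME; }
static unsigned sub_mod(unsigned a, unsigned b) { return (a + PRIME - b) % PRIME; }
static unsigned mul_mod(unsigned a, unsigned b) { return (a * b) % PRIME; }

/* Modular inverse via Fermat's little theorem: a^(PRIME-2). Zero has no inverse. */
static unsigned inv_mod(unsigned a)
{
    unsigned result = 1, base = a % PRIME, e = PRIME - 2;
    assert(base != 0);
    while (e) {
        if (e & 1) result = mul_mod(result, base);
        base = mul_mod(base, base);
        e >>= 1;
    }
    return result;
}

/* True if the point satisfies y^2 = x^3 + 7 in the toy field. */
static int on_curve(Point p)
{
    unsigned x3 = mul_mod(mul_mod(p.x, p.x), p.x);
    return mul_mod(p.y, p.y) == add_mod(x3, 7);
}

/* Ordinary addition (the else branch): chord slope s = (Gy - Qy) / (Gx - Qx). */
static Point ec_add(Point g, Point q)
{
    unsigned s, s2;
    Point r;
    assert(g.x != q.x);  /* otherwise Gx - Qx is zero and cannot be inverted */
    s  = mul_mod(sub_mod(g.y, q.y), inv_mod(sub_mod(g.x, q.x)));
    s2 = mul_mod(s, s);
    r.x = sub_mod(sub_mod(s2, g.x), q.x);                /* Rx = s^2 - Gx - Qx */
    r.y = sub_mod(mul_mod(s, sub_mod(g.x, r.x)), g.y);   /* Ry = s(Gx - Rx) - Gy */
    return r;
}

/* Doubling (the equal-x branch): tangent slope s = 3x^2 / (2y). */
static Point ec_double(Point g)
{
    unsigned x2, tx2, s, s2;
    Point r;
    assert(g.y != 0);    /* 2y must be invertible */
    x2  = mul_mod(g.x, g.x);
    tx2 = add_mod(x2, add_mod(x2, x2));                  /* 3x^2 */
    s   = mul_mod(tx2, inv_mod(add_mod(g.y, g.y)));      /* 3x^2 * 1/(2y) */
    s2  = mul_mod(s, s);
    r.x = sub_mod(sub_mod(s2, g.x), g.x);                /* Rx = s^2 - 2x */
    r.y = sub_mod(mul_mod(s, sub_mod(g.x, r.x)), g.y);   /* Ry = s(x - Rx) - y */
    return r;
}

/* Dispatch on equal x, as the WithDouble functions appear to do. */
static Point ec_add_with_double(Point g, Point q)
{
    return (g.x == q.x) ? ec_double(g) : ec_add(g, q);
}
```

With G = (1, 5), ec_add_with_double(G, G) takes the doubling branch and gives (2, 10); adding G to that result takes the ordinary branch and gives (5, 9). Both results satisfy on_curve. Calling ec_add(G, G) directly trips the assertion, which is the toy version of the problem the plain kernel would have if the search ever landed on G itself.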