Bitcoin Forum
June 24, 2024, 07:04:50 PM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1]
  Print  
Author Topic: Need help understanding this modular inverse implementation  (Read 162 times)
NotATether (OP)
Legendary
*
Offline Offline

Activity: 1638
Merit: 6911


bitcoincleanup.com / bitmixlist.org


View Profile WWW
June 04, 2024, 01:28:24 PM
Merited by ABCbits (1), hugeblack (1)
 #1

Taken straight from VanitySearch

https://github.com/XopMC/CudaBrainSecp/blob/main/GPU/GPUMath.h#L548-L659 (different project but identical source file)

_ModInv is the modular inverse funciton that implements the extended euclidean algorithm, which is described at Wikipedia here:

https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm

It's using signed 320-bit integers, probably to catch overflow and underflow, which it deals with later. That's not the issue here.

There is the Euclidean (proper) division, which I already understand. And then there's this talk about the Bezout coefficients which I also understand. My issue here is understanding how this particular implementation implements this this using 63-bit bit shifts.

U and V (the two inputs) are loaded with the values of P and R respectively. (because R * (the return value) === 1 mod P.

And then my understanding goes downhill from there  Smiley

Can someone help me figure this out, or at least point me to a website where I can find people who might understand this sort of function?

.
.BLACKJACK ♠ FUN.
█████████
██████████████
████████████
█████████████████
████████████████▄▄
░█████████████▀░▀▀
██████████████████
░██████████████
████████████████
░██████████████
████████████
███████████████░██
██████████
CRYPTO CASINO &
SPORTS BETTING
▄▄███████▄▄
▄███████████████▄
███████████████████
█████████████████████
███████████████████████
█████████████████████████
█████████████████████████
█████████████████████████
███████████████████████
█████████████████████
███████████████████
▀███████████████▀
█████████
.
j2002ba2
Full Member
***
Online Online

Activity: 206
Merit: 447


View Profile
June 11, 2024, 08:11:32 AM
Merited by NotATether (5), hugeblack (3), ABCbits (2)
 #2

This is Extended Binary GCD

Binary GCD:
https://xlinux.nist.gov/dads/HTML/binaryGCD.html
http://www.cut-the-knot.org/blue/binary.shtml

Extended Binary GCD explanation with lots of words:
https://github.com/DavidNorman/gcd

NotATether (OP)
Legendary
*
Offline Offline

Activity: 1638
Merit: 6911


bitcoincleanup.com / bitmixlist.org


View Profile WWW
June 11, 2024, 08:43:03 AM
 #3

This is Extended Binary GCD

Binary GCD:
https://xlinux.nist.gov/dads/HTML/binaryGCD.html
http://www.cut-the-knot.org/blue/binary.shtml

Extended Binary GCD explanation with lots of words:
https://github.com/DavidNorman/gcd

Perfect. Now I'm going to try to figure out how the binary extended GCD operates with many words - this has been a pain point for me. I'll get back to you later.

.
.BLACKJACK ♠ FUN.
█████████
██████████████
████████████
█████████████████
████████████████▄▄
░█████████████▀░▀▀
██████████████████
░██████████████
████████████████
░██████████████
████████████
███████████████░██
██████████
CRYPTO CASINO &
SPORTS BETTING
▄▄███████▄▄
▄███████████████▄
███████████████████
█████████████████████
███████████████████████
█████████████████████████
█████████████████████████
█████████████████████████
███████████████████████
█████████████████████
███████████████████
▀███████████████▀
█████████
.
j2002ba2
Full Member
***
Online Online

Activity: 206
Merit: 447


View Profile
June 11, 2024, 09:12:09 AM
 #4

I mean lots of words explaining it.

Here is the code I made back in 2019, this is slower than the 5x52 version (it is 8x32), but was faster than xp-2.
Code:
__device__ static void invModP_(unsigned int value[8]) {
  //INPUT : Prime p and a in [1, p−1].
  //OUTPUT : a^−1 mod p.
  //1. u = a, v = p.
  unsigned int u[8], v[8];
  copyBigInt(value, u);
  copyBigInt(_P, v);
  //2. x1 = 1, x2 = 0.
  unsigned int x1[8] = { 0,0,0,0, 0,0,0,1 };
  unsigned int x2[8] = { 0,0,0,0, 0,0,0,0 };
  //3. While (u != 1 and v != 1) do
  for (;;) {
    if ( ((u[7] == 1) && ((u[0] | u[1] | u[2] | u[3] | u[4] | u[5] | u[6]) == 0)) ||
         ((v[7] == 1) && ((v[0] | v[1] | v[2] | v[3] | v[4] | v[5] | v[6]) == 0)) )
      break;
    //3.1 While u is even do
    while ((u[7] & 1) == 0) {
      //u = u/2.
      shift_right(u);
      unsigned int carry = 0;
      //If x1 is even then x1 = x1/2; else x1 = (x1 + p)/2.
      if ((x1[7] & 1) != 0) {
        carry = add(x1, _P, x1);
      }
      shift_right_c(x1, carry);
    }
    //3.2 While v is even do
    while ((v[7] & 1) == 0) {
      //v = v/2.
      shift_right(v);
      unsigned int carry = 0;
      //If x2 is even then x2 = x2/2; else x2 = (x2 + p)/2.
      if ((x2[7] & 1) != 0) {
        carry = add(x2, _P, x2);
      }
      shift_right_c(x2, carry);
    }
    //3.3 If u >= v then: u = u − v, x1 = x1 − x2 ;
    unsigned int t[8];
    if (sub(u, v, t)) {
      //Else: v = v − u, x2 = x2 − x1.
      sub(v, u, v);
      subModP(x2, x1, x2);
    } else {
      copyBigInt(t, u);
      subModP(x1, x2, x1);
    }
  }
  //4. If u = 1 then return(x1 mod p); else return(x2 mod p).
  if ((u[7] == 1) && ((u[0] | u[1] | u[2] | u[3] | u[4] | u[5] | u[6]) == 0)) {
    copyBigInt(x1, value);
  } else {
    copyBigInt(x2, value);
  }
}
/*
INPUT : Prime p and a in [1, p−1].
OUTPUT : a^−1 mod p.
1. u = a, v = p.
2. x1 = 1, x2 = 0.
3. While (u != 1 and v != 1) do
   3.1 While u is even do
       u = u/2.
       If x1 is even then x1 = x1/2; else x1 = (x1 + p)/2.
   3.2 While v is even do
       v = v/2.
       If x2 is even then x2 = x2/2; else x2 = (x2 + p)/2.
   3.3 If u >= v then: u = u − v, x1 = x1 − x2 ;
       Else: v = v − u, x2 = x2 − x1.
4. If u = 1 then return(x1 mod p); else return(x2 mod p).
*/
NotATether (OP)
Legendary
*
Offline Offline

Activity: 1638
Merit: 6911


bitcoincleanup.com / bitmixlist.org


View Profile WWW
June 20, 2024, 05:45:37 AM
Merited by Cricktor (2)
 #5

OK, so now, finally I got the hang of this!

The key to understanding this implementation which is not a conventional binary egcd, is that this implementation is DRS62 (delayed right shift 62 bits) - actually it says so in the comments even.

Basically, the reason why it's called such is because it implements the optimizations in this algorithm: https://github.com/pornin/bingcd/blob/main/doc/bingcd.pdf

This paper describes how to optimize EGCD when you are working in a modular field with fixed-size registers, like 64-bit registers and the like.

Although in the conventional binary gcd, we are left shifting (dividing) stuff by 2, in this implementation we are actually multiplying some of the words by 2 and then we "fix" it by dividing (right shift) by a bunch of twos at once - 62 times, to be exact.

Those lines that have MM64 and MSK62 on them followed by multiplication, addition and a right shift - that is Montgomery Reduction and that basically calculates the modulus by P. It uses some fancy modulo 2^62 stuff, but I will have more to say when I write a separate post about this.

Much thanks @j2002ba2!

.
.BLACKJACK ♠ FUN.
█████████
██████████████
████████████
█████████████████
████████████████▄▄
░█████████████▀░▀▀
██████████████████
░██████████████
████████████████
░██████████████
████████████
███████████████░██
██████████
CRYPTO CASINO &
SPORTS BETTING
▄▄███████▄▄
▄███████████████▄
███████████████████
█████████████████████
███████████████████████
█████████████████████████
█████████████████████████
█████████████████████████
███████████████████████
█████████████████████
███████████████████
▀███████████████▀
█████████
.
Pages: [1]
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!