BSGS solver for cuda

math09183

Member

Offline

Activity: 170
Merit: 58

Re: BSGS solver for cuda

October 23, 2021, 03:17:08 PM

#121

Quote from: Etar on October 23, 2021, 07:01:46 AM

Quote from: math09183 on October 23, 2021, 06:53:32 AM

LOL

That's what happens when you use ad hoc written code, without proper testing. I guess you still did not prepare any set of unit tests to proof your code works?
Good luck for the future releases, maybe somewhere around version 20 it will be stable Grin

Most of code have bugs. Are you a great programmer who does everything without mistakes?
I found this bug and solved it, what's your problem?

Relax, haters gonna hate Grin

Anyway, good job, I appreciate your work. Sh*t happens.

Etar (OP)

Sr. Member

Offline

Activity: 635
Merit: 312

⇾ Re: BSGS solver for cuda

October 23, 2021, 05:13:51 PM
Last edit: October 23, 2021, 07:01:36 PM by Etar

#122

Code:

KEY[15]: 0x49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e2452dd26bc983cd5
    Pub: 02b1985389d8ab680dedd67bba7ca781d1a9e6e5974aad2e70518125bad5783eb5
****************************
Found in 1 seconds
GPU#0 job finished
Working time 00:00:55s
FINDpubkey= (55b95bef84a6045a505d015ef15e136e0a31cc2aa00fa4bca62e5df215ee981b, 3b4d6bce33718dc6cf59f28b550648d7e8b2796ac36f25ff0c01f8bc42a16fd9)
GPU#0 Cnt:0000000000000000000000000000000000000000000000000000000000000001
GPU#0 Cnt:00000000000000000000000000000000000000000000000045ba000000000001 1110MKey/s x1073741824 2^30.12 x2^31=2^61.12
***********GPU#0************
KEY[16]: 0x49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e7ad38337c7f173c7
    Pub: 0355b95bef84a6045a505d015ef15e136e0a31cc2aa00fa4bca62e5df215ee981b
****************************
Found in 4 seconds
GPU#0 job finished
Working time 00:00:59s

double giant step defeated.
v1.7.1 released with maximum perfomance.

mamuu

Member

Offline

Activity: 73
Merit: 19

Re: BSGS solver for cuda

October 23, 2021, 07:13:47 PM

#123

Quote from: Etar on October 23, 2021, 05:13:51 PM

Code:

KEY[15]: 0x49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e2452dd26bc983cd5
    Pub: 02b1985389d8ab680dedd67bba7ca781d1a9e6e5974aad2e70518125bad5783eb5
****************************
Found in 1 seconds
GPU#0 job finished
Working time 00:00:55s
FINDpubkey= (55b95bef84a6045a505d015ef15e136e0a31cc2aa00fa4bca62e5df215ee981b, 3b4d6bce33718dc6cf59f28b550648d7e8b2796ac36f25ff0c01f8bc42a16fd9)
GPU#0 Cnt:0000000000000000000000000000000000000000000000000000000000000001
GPU#0 Cnt:00000000000000000000000000000000000000000000000045ba000000000001 1110MKey/s x1073741824 2^30.12 x2^31=2^61.12
***********GPU#0************
KEY[16]: 0x49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e7ad38337c7f173c7
    Pub: 0355b95bef84a6045a505d015ef15e136e0a31cc2aa00fa4bca62e5df215ee981b
****************************
Found in 4 seconds
GPU#0 job finished
Working time 00:00:59s

double giant step defeated.
v1.7.1 released with maximum perfomance.

Hi,
Please , can you write a little tutorial on usage?

1DWA3Sa8i6eHVWV4AG4UP2SBhYB2XrfiHW

Etar (OP)

Sr. Member

Offline

Activity: 635
Merit: 312

Re: BSGS solver for cuda

October 23, 2021, 07:16:04 PM

#124

Quote from: mamuu on October 23, 2021, 07:13:47 PM

Hi,
Please , can you write a little tutorial on usage?

did you look at readme.md on github?
first of all set -t parametr as 256 or 512
next set -b equil to SM count of your card
next set -p start from 128
then set -w as max as possible to your gpu memory, check -htsz params
-w 31 -htsz 29 need around 64GB of RAM to generate all arrays
-w 30 -htsz 28 need around 32GB of RAM to generate all arrays
-w 29 -htsz 28
-w 28 -htsz 27
-w 27 -htsz 25
if you will have free gpu memory you can increase -p or -b or -t (all params multiple of 2)

ssxb

Jr. Member

Offline

Activity: 81
Merit: 2

Re: BSGS solver for cuda

October 24, 2021, 03:30:22 AM

#125

Quote from: Etar on October 23, 2021, 07:16:04 PM

Quote from: mamuu on October 23, 2021, 07:13:47 PM

Hi,
Please , can you write a little tutorial on usage?

some great work from your side , appreciate

just a quick question, do you have any plan to enhanced it for 120bit or more to perform better than JLK?

Etar (OP)

Sr. Member

Offline

Activity: 635
Merit: 312

Re: BSGS solver for cuda

October 24, 2021, 07:55:06 AM

#126

Quote from: ssxb on October 24, 2021, 03:30:22 AM

just a quick question, do you have any plan to enhanced it for 120bit or more to perform better than JLK?

Bsgs will never faster then kangaroo on big ranges.
Example puzzle #80, width 79bit
Kanagaroo need in average 2^40.5 op to solve key
2080ti can serach DPs with speed 1500Mkey/s = 2^30.48
So you need in average 2^10 second to find key.

Bsgscuda can search 2^61/s, so 2^79 / 2^61 = 2^18 second to check whole 79bit range. Ofcourse key can be close to the begining and you can found very fast but it is not 100%.
we can devide pub to 64 pubkey and try to find one of them in range 1..3ffffffffffffffffff fortunately, the key we need is the first one on the list but it was just lucky.
by the way privkey will be 3a869719b73046d6b46 = 2^73.87
so we need 2^73.87 / 2^61/s = 2^12.87 seconds to find key. It is 7 more timer then need for kangaroo

arulbero

Legendary

Online

Activity: 1939
Merit: 2080

Re: BSGS solver for cuda

October 24, 2021, 04:32:04 PM

#127

Do you use the negation map to speed up the algorithm ? --> pag. 8-9 https://eprint.iacr.org/2015/605.pdf

You need to compute only:

sqrt(n) / 2 baby steps : for example, n = 2^60 -> 2^29 baby steps

sqrt(n) giant steps : for example, n = 2^60 -> 2^30 giant steps

It is like to shift the public key of 2^59 steps, and search in the interval [1,..., 2^59] instead of [1,...., 2^60] exploiting the symmetrie.

Besides, in order to compute a batch of k steps, you need to calculate only k/2 (instead of k) elements x^-1 mod p , at the cost of 1 inversions and 3*(k/2 - 1) multiplications mod p.

Etar (OP)

Sr. Member

Offline

Activity: 635
Merit: 312

Re: BSGS solver for cuda

October 24, 2021, 05:02:06 PM

#128

Quote from: arulbero on October 24, 2021, 04:32:04 PM

No, don`t use even don`t know about this.
Will try to understand this tweak, thanks.

arulbero

Legendary

Online

Activity: 1939
Merit: 2080

Re: BSGS solver for cuda

October 24, 2021, 05:16:14 PM

#129

Quote from: a.a on October 18, 2021, 12:20:36 PM

As you already explained few days ago, cuda BSGS is searching the keys one after another, not in parallel. I know that, I read carefully Wink

. With my previous post I meant that even if it would run in parallel it would slow down as described. So read my post in conjunctive and if someone would modify it to process in parallel I expect that behaviours.

BSGS works in this way:

suppose we know that P = k*G, and the private key k is in [1,...., 2^60] range

1) precompute 2^29 baby steps (they are simple public keys): 1*G, 2*G, 3*G, ...., 2^29 * G

2) split the public key P in many other public keys (they are called giant steps, 2^30 public keys):
P, P - 1*(2^30*G), P - 2*(2^30*G), P - 3*(2^30)G, ..., P - (2^30 - 1)*(2^30*G)

3) for each giant steps, you check if it lies in [1, ..., 2^30] range (if it is equal to a 'baby step' public key)

4) if P - a*(2^30*G) = +-b*G then P = (a*2^30 +- b)*G then the private key is k = a*2^30 +- b

If you want to search 2 public keys, P1 and P2, you can use the same baby steps, but you need to generate 2^31 giant steps instead of 2^30.

arulbero

Legendary

Online

Activity: 1939
Merit: 2080

Re: BSGS solver for cuda

October 24, 2021, 05:18:43 PM

#130

Quote from: Etar on October 24, 2021, 05:02:06 PM

Quote from: arulbero on October 24, 2021, 04:32:04 PM

Besides, in order to compute a batch of k steps, you need to calculate only k/2 (instead of k) elements x^-1 mod p , at the cost of 1 inversions and 3*(k/2 - 1) multiplications mod p.

No, don`t use even don`t know about this.
Will try to understand this tweak, thanks.

I mean: how do you compute a batch of 'consecutive' keys ?

Like P, P+G, P+2G, P+3G, P+4G, P+5G, ... ?

Etar (OP)

Sr. Member

Offline

Activity: 635
Merit: 312

Re: BSGS solver for cuda

October 24, 2021, 05:31:37 PM

#131

Quote from: arulbero on October 24, 2021, 05:18:43 PM

Quote from: Etar on October 24, 2021, 05:02:06 PM

Quote from: arulbero on October 24, 2021, 04:32:04 PM

Besides, in order to compute a batch of k steps, you need to calculate only k/2 (instead of k) elements x^-1 mod p , at the cost of 1 inversions and 3*(k/2 - 1) multiplications mod p.

No, don`t use even don`t know about this.
Will try to understand this tweak, thanks.

I mean: how do you compute a batch of 'consecutive' keys ?

Like P, P+G, P+2G, P+3G, P+4G, P+5G, ... ?

with group inverse, calculate inverse for batch and use it in addition. Sorry if i misunderstud question.
In the same way as in bitcrack https://github.com/brichard19/BitCrack/blob/6bf8059ef075eb1622298395866b0bd02375e1d9/cudaMath/secp256k1.cuh#L642
and then https://github.com/brichard19/BitCrack/blob/6bf8059ef075eb1622298395866b0bd02375e1d9/cudaMath/secp256k1.cuh#L656

arulbero

Legendary

Online

Activity: 1939
Merit: 2080

Re: BSGS solver for cuda

October 24, 2021, 05:40:09 PM

#132

Quote from: Etar on October 24, 2021, 05:31:37 PM

Quote from: arulbero on October 24, 2021, 05:18:43 PM

Quote from: Etar on October 24, 2021, 05:02:06 PM

Quote from: arulbero on October 24, 2021, 04:32:04 PM

Besides, in order to compute a batch of k steps, you need to calculate only k/2 (instead of k) elements x^-1 mod p , at the cost of 1 inversions and 3*(k/2 - 1) multiplications mod p.

No, don`t use even don`t know about this.
Will try to understand this tweak, thanks.

I mean: how do you compute a batch of 'consecutive' keys ?

Like P, P+G, P+2G, P+3G, P+4G, P+5G, ... ?

with group inverse, calculate inverse for batch and use it in addition. Sorry if i misunderstud question.

Ok.
If you have a batch of 100 points, you don't need to compute 100 inversions but only 50 inversions.

If you have to compute A + B:

If you have to compute A - B, since -B = (xB, n-yB)

1/(xb-xa) is the same as in A + B.
Because for example P+2G and P-2G use the same inverse.

Etar (OP)

Sr. Member

Offline

Activity: 635
Merit: 312

Re: BSGS solver for cuda

October 24, 2021, 05:44:48 PM

#133

Quote from: arulbero on October 24, 2021, 05:40:09 PM

Ok.
If you have a batch of 100 points, you don't need to compute 100 inversions but only 50 inversions.
Because for example P+2G and P-2G use the same inverse.

Exacly all this used in bsgs cuda(batch addition and symmetry)
befor using symmetry in addition speed was 800Mkeys with -w 30, after using symmetry speed grow to 1150Mkeys

arulbero

Legendary

Online

Activity: 1939
Merit: 2080

Re: BSGS solver for cuda

October 24, 2021, 05:52:59 PM

#134

Quote from: Etar on October 24, 2021, 05:44:48 PM

Quote from: arulbero on October 24, 2021, 05:40:09 PM

Ok.
If you have a batch of 100 points, you don't need to compute 100 inversions but only 50 inversions.
Because for example P+2G and P-2G use the same inverse.

Exacly all this used in bsgs cuda(batch addition and symmetry)
befor using symmetry in addition speed was 800Mkeys with -w 30, after using symmetry speed grow to 1150Mkeys

Ok. The square "a**2 mod p" is optimized like here ?

Etar (OP)

Sr. Member

Offline

Activity: 635
Merit: 312

Re: BSGS solver for cuda

October 24, 2021, 05:57:01 PM
Last edit: October 24, 2021, 06:07:30 PM by Etar

#135

Quote from: arulbero on October 24, 2021, 05:52:59 PM

Quote from: Etar on October 24, 2021, 05:44:48 PM

Quote from: arulbero on October 24, 2021, 05:40:09 PM

Ok.
If you have a batch of 100 points, you don't need to compute 100 inversions but only 50 inversions.
Because for example P+2G and P-2G use the same inverse.

Exacly all this used in bsgs cuda(batch addition and symmetry)
befor using symmetry in addition speed was 800Mkeys with -w 30, after using symmetry speed grow to 1150Mkeys

Ok. The square "a**2 mod p" is optimized like here ?

No. used mult mod P
i use optimized square mod P in PB https://github.com/Etayson/BSGS-cuda/blob/e41fff517b8de153b6bf9846ee7abb47524fe43e/lib/Curve64.pb#L2161
but need buffer 512bytes for this, so i did not transfer it to cuda ptx
Also used double giant step, so we substuct double giant value from pub.

arulbero

Legendary

Online

Activity: 1939
Merit: 2080

Re: BSGS solver for cuda

October 24, 2021, 06:33:05 PM

#136

Quote from: Etar on October 24, 2021, 05:57:01 PM

Also used double giant step, so we substuct double giant value from pub.

So, instead of computing

sqrt(n) / 2 baby steps
and
sqrt(n) giant steps

you compute

sqrt(n) baby steps
and
sqrt(n) / 2 giant steps

then, for example, for n = 2^60:

P - a * (2^31*G) = b*G where 'a' lies in [1, ..., 2^29] and 'b' lies in [1,...,2^30]

means:

1) P - a*(2^31*G) = b*G --> P = [a*(2^31) + b] * G --> priv key = a*(2^31) + b
or
2) P - a*(2^31*G) = -b*G --> P = [a*(2^31) - b] * G --> priv key = a*(2^31) - b

to save 2^30 giant steps.

It seems to me that the program is already optimized.

Etar (OP)

Sr. Member

Offline

Activity: 635
Merit: 312

Re: BSGS solver for cuda

October 24, 2021, 06:46:12 PM

#137

Quote from: arulbero on October 24, 2021, 06:33:05 PM

Quote from: Etar on October 24, 2021, 05:57:01 PM

Also used double giant step, so we substuct double giant value from pub.

I compute as max as possible baby array size dependency of GPU memory
Baby array is 1G,2G,3G...
So this array computed only first time and then redesigned to hashtable.
It is one HT for any ranges.
Giant array is computed with doubled value of Baby array size, for ex if Baby array have size 2^30 then Giant Array have value G*(2^31), G*(2^32), G*(2^33)...
All arrays computed only one time if not changed settings.
So you can easy used all arays for different ranges and pubkeys without recompute.

arulbero

Legendary

Online

Activity: 1939
Merit: 2080

Re: BSGS solver for cuda

October 24, 2021, 06:57:39 PM

#138

Quote from: Etar on October 24, 2021, 06:46:12 PM

You can choose:

1) 2^30 baby-steps: 1*G, 2*G, ..., 2^30*G
and
2^29 giant steps: P-1*2^31*G, P-2*2^31*G,..,P -a*2^31*G where a is in [1,2,.., 2^29 - 1]

or

2) 2^29 baby-steps: 1*G, 2*G, ..., 2^29*G
and
2^30 giant steps: P-1*2^30*G, P-2*2^30*G,..., P -a*2^30*G where a is in [1,2,..., 2^30 - 1]

I don't understand why: "G*(2^31), G*(2^32), G*(2^33)": in this way you compute only 30 giant steps

Etar (OP)

Sr. Member

Offline

Activity: 635
Merit: 312

Re: BSGS solver for cuda

October 24, 2021, 07:05:38 PM
Last edit: October 24, 2021, 07:19:00 PM by Etar

#139

Quote from: arulbero on October 24, 2021, 06:57:39 PM

Quote from: Etar on October 24, 2021, 06:46:12 PM

it is universal arrays for any ranges. You not need to recompute array if you change range or pubkeys.
size of giant array is equil to thread number * block number * pparam
for 2080ti i use 512 thread 138 blocks and pparam = 480
so totaly i have 33914880 doubled giant values
So each cuda kernel call calculate 33914880 * 2(due to +y/-y in batch additions) giat steps

jacky19790729

Jr. Member

Offline

Activity: 82
Merit: 8

Re: BSGS solver for cuda

October 29, 2021, 03:42:12 AM

#140

Quote from: ssxb on October 20, 2021, 08:05:42 AM

based on above table you can increase speed if you will utilize both bloom+bp https://github.com/iceland2k14/bsgs
so CPU cores are less powerful than cuda and i was thinking [not sure possible or not] if we load all bp in RAM and use some bloom in GPU memory perhaps their will be some dramatic speed boost

I had try iceland2k14's BSGS
Intel i7-7800X + 24 GB DDR4-2400

Code:

D:\python\BSGS_ice>python bsgs_dll_secp256k1.py -p 0385a30d8413af4f8f9e6312400f2d194fe14f02e719b24c3f83bf1fd233a8f963 -b bPfile.bin -bl bloomfile.bin -n 1000000000000000 -keyspace 40000000000000:80000000000000

[+] Starting Program : BSGS mode     Version [ 13072021 ]
[+] Search Started for the Public key:  0485a30d8413af4f8f9e6312400f2d194fe14f02e719b24c3f83bf1fd233a8f9630eb400323654cec63999b56f4ba44e8b21ab92d9d697fabe4666df3678585669
[+] Search Mode: Sequential search in the given range
[+] Reading bloom filter from file complete in : 26.70087 sec
[+] Reading Baby table from file complete in : 0.03731 sec
[+] seq value: 1000000000000000    m value : 3999999999
[+] Search Range: 0x40000000000000  to  0x80000000000000
                                                                [+] k1: 0x40000000000000
PVK not found. 1000.00000 Trillion scanned in 1.14 sec. New range [+] k1: 0x438d7ea4c68001
PVK not found. 1000.00000 Trillion scanned in 1.18 sec. New range [+] k1: 0x471afd498d0002
PVK not found. 1000.00000 Trillion scanned in 1.03 sec. New range [+] k1: 0x4aa87bee538003
PVK not found. 1000.00000 Trillion scanned in 1.05 sec. New range [+] k1: 0x4e35fa931a0004
PVK not found. 1000.00000 Trillion scanned in 0.99 sec. New range [+] k1: 0x51c37937e08005
PVK not found. 1000.00000 Trillion scanned in 1.01 sec. New range [+] k1: 0x5550f7dca70006
PVK not found. 1000.00000 Trillion scanned in 1.00 sec. New range [+] k1: 0x58de76816d8007
PVK not found. 1000.00000 Trillion scanned in 1.01 sec. New range [+] k1: 0x5c6bf526340008
PVK not found. 1000.00000 Trillion scanned in 1.09 sec. New range [+] k1: 0x5ff973cafa8009
PVK not found. 1000.00000 Trillion scanned in 0.93 sec. New range [+] k1: 0x6386f26fc1000a
PVK not found. 1000.00000 Trillion scanned in 0.96 sec. New range [+] k1: 0x6714711487800b
PVK not found. 1000.00000 Trillion scanned in 1.00 sec. New range [+] k1: 0x6aa1efb94e000c
============== KEYFOUND ============== 1
BSGS FOUND PrivateKey  0x6abe1f9b67e114
======================================
Start 1  :  2021-10-29 11:34:19
Start 1_0:  2021-10-29 11:34:45
END   1  :  2021-10-29 11:34:59

1 CPU speed:
BSGS Check 0x38D7EA4C68000 key / second

Pages: « 1 2 3 4 5 6 [7] 8 9 10 11 » All

Bitcoin Forum > Bitcoin > Development & Technical Discussion > BSGS solver for cuda

« previous topic next topic »