Title: BSGS solver for cuda Post by: Etar on October 09, 2021, 05:46:55 PM
This is my implementation of the baby-step giant-step (BSGS) algorithm for Nvidia cards (CUDA and Windows x64 only):
https://github.com/Etayson/BSGS-cuda
Let me know your speed results.

Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 09, 2021, 07:15:17 PM
It seems that you know a bit about the BSGS algorithm and x86 assembler, so I would like to ask you one question. I have already modified Jean's BSGS for the "r1" curve (BTC uses k1). What is the meaning of the start value? Jean even needs start and stop values for k1 and k2. Must the searched k lie in this interval?

Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 09, 2021, 08:40:45 PM
Awesome. Will let you know the speed on various GPUs once I run it.
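On the start-value question above: in baby-step giant-step the searched key has to lie inside the interval, because the search first shifts the target by the start value and then only covers offsets up to the interval width. Below is a minimal Python sketch of that idea. It assumes a toy multiplicative group modulo a prime in place of secp256k1, and the function name, parameters and sample numbers are illustrative only; none of it is taken from BSGS-cuda or from JLP's code.

```python
from math import isqrt

def bsgs_in_range(g, h, p, start, end):
    """Return k with start <= k <= end and pow(g, k, p) == h, else None.

    Toy stand-in: a multiplicative group mod prime p instead of a curve.
    """
    width = end - start + 1
    m = isqrt(width - 1) + 1              # table size, m * m >= width
    # Shift the target by the start value: now solve g^d = shifted, 0 <= d < width.
    shifted = h * pow(g, -start, p) % p
    # Baby steps: remember g^j for j = 0 .. m-1.
    baby = {}
    cur = 1
    for j in range(m):
        baby.setdefault(cur, j)
        cur = cur * g % p
    # Giant steps: strip g^m from the target until it lands in the table.
    giant = pow(g, -m, p)
    gamma = shifted
    for i in range(m):
        j = baby.get(gamma)
        if j is not None and start + i * m + j <= end:
            return start + i * m + j
        gamma = gamma * giant % p
    return None

p, g = 2**31 - 1, 7                       # small prime and base, illustration only
secret = 1234567890
h = pow(g, secret, p)
inside = bsgs_in_range(g, h, p, 1234000000, 1235000000)   # interval contains the key
outside = bsgs_in_range(g, h, p, 1000, 2000)              # interval misses the key
```

With the interval placed around the key the search recovers it; with an interval that does not contain the key it returns nothing, however long it runs. Negative exponents in `pow` need Python 3.8 or newer.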
Title: Re: BSGS solver for cuda Post by: COBRAS on October 10, 2021, 05:12:57 AM
Great. I think your project will be more usable than JLP Kangaroo. Tuning JLP Kangaroo is a real big shit !!!!

Title: Re: BSGS solver for cuda Post by: davidjjones on October 10, 2021, 10:35:50 AM
I tested your BSGS on a GTX 1660s; the speed was significantly slower than JeanLucPons Kangaroo:
BSGS-cuda => 330 Mkey/s
Kangaroo 2.2 => 450 Mkey/s

Title: Re: BSGS solver for cuda Post by: COBRAS on October 10, 2021, 12:12:12 PM
Need real tests of how much time is needed to find an example privkey, and which code finds it faster.

Title: Re: BSGS solver for cuda Post by: a.a on October 10, 2021, 04:23:47 PM
COBRAS, then how about you start testing and benchmarking? Or should others do that for you too?
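A note on comparing raw Mkey/s figures like the ones above: for BSGS the raw rate counts giant steps, and each giant step covers a whole block of keys. The status lines posted later in this thread (for example `859MKey/s x134217728 2^29.75 x2^28=2^57.75`) can be read as raw rate times 2^(w+1). The sketch below reproduces those figures under two assumptions inferred from the output rather than documented anywhere: that 1 MKey means 2^20 keys, and that each giant step covers 2^(w+1) keys. The function names are hypothetical.

```python
from math import log2

def effective_log2_rate(raw_mkeys, w):
    """log2 of keys covered per second for a raw rate in MKey/s and a given -w.

    Assumes 1 MKey = 2**20 and 2**(w + 1) keys per giant step (inferred
    from the status lines in the thread, not from documentation).
    """
    return log2(raw_mkeys) + 20 + (w + 1)

def worst_case_seconds(range_bits, raw_mkeys, w):
    """Seconds to sweep a full range of 2**range_bits keys for one pubkey."""
    return 2.0 ** (range_bits - effective_log2_rate(raw_mkeys, w))

rate_3070 = effective_log2_rate(859, 27)       # thread shows 2^57.75
rate_1050 = effective_log2_rate(98, 26)        # thread shows 2^53.62
sweep_2080ti = worst_case_seconds(64, 570, 26) # 570 Mkeys is quoted for a 2080 Ti
```

Under these assumptions a 2080 Ti at 570 Mkeys and -w 26 sweeps a 64-bit range in roughly 230 seconds per key in the worst case, so 16 keys average about half an hour, which is in line with the 28 minutes reported in the thread. Kangaroo rates count single group operations, so the two numbers measure different things.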
Title: Re: BSGS solver for cuda Post by: Etar on October 10, 2021, 07:12:18 PM
With v1.2 and a single 2080 Ti I solved the example pubkeys in the range
start: 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e0000000000000000
end: 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5effffffffffffffff
in 28 minutes with params -w 26. Here are the pubkeys for searching:
Code:
0459A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC994327554CED887AAE5D211A2407CDD025CFC3779ECB9C9D7F2F1A1DDF3E9FF8

Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 10, 2021, 08:04:18 PM
OK, and how fast would it be with the interval
000000000....00000000
Ffffffffffff......fffffffffffff ?

Title: Re: BSGS solver for cuda Post by: _Counselor on October 10, 2021, 09:11:25 PM
How many bytes of memory do you need to store one baby step? Does the hashtable use GPU memory or global RAM?
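Etar's reply further down gives 8 bytes per baby step and a hash table of (2^25 + 2^26) * 8 bytes for -w 26 with the default -htsz 25. A small Python sketch of that arithmetic follows. Extending the same formula to other -w and -htsz values is an assumption, and the function name is hypothetical.

```python
def hashtable_bytes(w, htsz=25, entry_bytes=8):
    """Hash-table footprint following the formula quoted in the thread:
    (2^htsz + 2^w) * 8 bytes. Other GPU buffers are not included."""
    return (2**htsz + 2**w) * entry_bytes

MIB = 2**20
size_w26 = hashtable_bytes(26)          # the case Etar describes
size_w26_mib = size_w26 // MIB
sizes_mib = {w: hashtable_bytes(w) // MIB for w in (26, 27, 28, 29)}
```

For -w 26 this gives 805306368 bytes, or 768 MiB. If the formula carries over, each extra step in -w roughly doubles the table, and at -w 29 the table alone would pass the 2^32 byte mark discussed later in the thread.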
Title: Re: BSGS solver for cuda Post by: a.a on October 10, 2021, 09:54:00 PM
Quote: COBRAS, then how about you start testing and benchmarking? Or should others do that for you too?
Quote: I suppose that was sarcasm:D
Yeah, something like sarcasm. COBRAS is a lazy lurker. And you can see that his last post does not make any sense.

Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 10, 2021, 10:20:54 PM
RTX 3070 = 1,000 MKey/s
Code:
KEY!!>49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5ebb3ef3883c1866d4
Default settings. I have not tinkered with the settings to see if the GPUs can gain any speed.

Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 10, 2021, 11:00:11 PM
I ran the same test as Etar and JLP, with 16 pubkeys:
Code:
0459A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC994327554CED887AAE5D211A2407CDD025CFC3779ECB9C9D7F2F1A1DDF3E9FF8
Total time:
Code:
GPU #4 finished
For comparison, JLP with CPU only took 3 hours and 35 minutes.

Title: Re: BSGS solver for cuda Post by: Etar on October 11, 2021, 05:25:25 AM
Quote: How many bytes of memory do you need to store one babystep? Hashtable uses GPU memory or global ram?
Each baby step uses 8 bytes of memory. The HT is stored in GPU memory. With -w 26 and -htsz 25 (default), the app generates 2^26 baby steps, which are stored in an HT of size (2^25 + 2^26) * 8 bytes.

Title: Re: BSGS solver for cuda Post by: fxsniper on October 11, 2021, 10:46:42 AM
Thanks Etar. I think BSGS-cuda works better than JLP BSGS. JLP BSGS is good but takes a very long time (for my GPU). I tested the first sample command from the GitHub page. Speed result (GPU GTX 1050 on laptop):
Code:
Found in 972 seconds
Code:
bsgscudaHT2.exe -t 512 -b 68 -p 256 -pb 59A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC994327554CED887AAE5D211A2407CDD025CFC3779ECB9C9D7F2F1A1DDF3E9FF8 -pk 0x49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e0000000000000000 -w 26
Result
Code:
GPU#0 Cnt:000000000000000000000000000000000000000000000000ba78000000000001 98MKey/s x67108864 2^26.62 x2^27=2^53.62

Title: Re: BSGS solver for cuda Post by: Etar on October 11, 2021, 07:24:13 PM
Mandatory update v.1.2.1
* Bug fixed with multi-GPU searching.

Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 11, 2021, 07:31:57 PM
Quote: I think BSGS-cuda is work better than JLP BSGS JLP BSGS is good but using very long time (for my GPU)
JLP's BSGS does not support GPU; his is CPU only.
Side-by-side tests of BSGS Cuda and JLP's Kangaroo, 4 pubkeys all in a 65-bit range:
Kangaroo total time = 2 mins 34 seconds:
Code:
[4921.81 MK/s][GPU 4517.36 MK/s][Count 2^33.89][Dead 0][04s (Avg 04s)][121.0/159.5MB]
BSGS Cuda total time = 1 min 29 seconds:
Code:
GPU#2 Cnt:000000000000000000000000000000000000000000000000b850800000000001 859MKey/s x134217728 2^29.75 x2^28=2^57.75
For at least this range (and probably more, up to a certain size) the BSGS Cuda program will be faster for checking multiple pubkeys, as the spin-up time between pubkeys (finding a pubkey and moving on to the next one) is a lot shorter than with the Kangaroo program.

Title: Re: BSGS solver for cuda Post by: fxsniper on October 12, 2021, 01:46:53 AM
Quote: JLP's BSGS does not support GPU; his is CPU only.
Correct, sorry, I forgot. I mean it runs very slowly on my laptop; sometimes I give up and end the task after waiting a long time overnight.

Title: Re: BSGS solver for cuda Post by: fxsniper on October 12, 2021, 01:56:51 AM
Quote: 4 pubkeys all in 65 bit range:
It works fast on a 65-bit range. Power is still limited, though: can it find a key in 120 bits? It still seems limited to finding keys in a range of around 65 bits, near the point where the key is hit.

Title: Re: BSGS solver for cuda Post by: hamnaz on October 13, 2021, 04:11:22 PM
Searching these 2 pubkeys in a 100-bit range:
034786ac12686480348261b5dce84efcffc27b56b512ca793a09229ed06d63058d
027ede4f01c7dd2690603cd0449fc4e4ac9ca2d11de2404ef2285ab897d2645391
Can someone help me understand which GPU hardware models you are using for the above result data? Is there any Ubuntu build or source code available for cuda 8.0 and ccap 20, g++ 4.8? Love to see your updates.

Title: Re: BSGS solver for cuda Post by: NotATether on October 13, 2021, 04:18:33 PM
CUDA toolkits don't support your CUDA version and CCap anymore, therefore it is highly unlikely you will find any brute-forcing software that works with your GPU. You're better off using a newer GPU with ccap 6.0+ (even then, there is no Linux port of this code).

Title: Re: BSGS solver for cuda Post by: hamnaz on October 13, 2021, 04:24:14 PM
Quote: CUDA toolkits don't support your CUDA version and CCap anymore, therefore it is highly unlikely you will find any brute-forcing software that works with your GPU. You're better using a newer GPU with ccap 6.0+ (even then, there is no Linux port of this code).
Title: Re: BSGS solver for cuda Post by: NotATether on October 13, 2021, 04:32:10 PM
Sorry, I made a mistake: anything with ccap 3.5+ will work. Yours is a Kepler GK210 model with ccap 3.7, so it should work fine [despite the caveat on Wikipedia (https://en.wikipedia.org/wiki/CUDA) saying that CUDA Toolkit 11.x only partially supports Kepler].

Title: Re: BSGS solver for cuda Post by: PrivatePerson on October 13, 2021, 05:50:50 PM
Is it really possible to find a 100-bit key on one video card? How long does it take?
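The replies below work with a figure of 2^51 operations for a 100-bit range, which matches the usual rule of thumb of roughly 2 * sqrt(N) group operations for the kangaroo method. The Python sketch below turns an operation count into days. The 750 million operations per second used here is a hypothetical rate, picked only because it is consistent with the 34-35 days quoted for an RTX 2060; it is not a measurement, and the function names are illustrative.

```python
def kangaroo_log2_ops(range_bits):
    """log2 of the expected operation count, using the ~2 * sqrt(N) rule of thumb."""
    return 1 + range_bits / 2

def days_for_ops(log2_ops, ops_per_second):
    """Days needed to perform 2**log2_ops operations at a steady rate."""
    return 2.0**log2_ops / ops_per_second / 86400

ASSUMED_RATE = 750e6                          # hypothetical, not measured
ops_100bit = kangaroo_log2_ops(100)           # log2 of the operation count
full_run_days = days_for_ops(ops_100bit, ASSUMED_RATE)
lucky_run_days = days_for_ops(ops_100bit - 1, ASSUMED_RATE)
```

Since the operation count is an expectation and not a hard bound, a run can finish earlier or later; hitting the key at 2^50 operations simply halves the time.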
Title: Re: BSGS solver for cuda Post by: math09183 on October 13, 2021, 06:01:36 PM
Quote: searching these 2 pubkeys in 100 bit range
Why? There are no coins.

Title: Re: BSGS solver for cuda Post by: hamnaz on October 13, 2021, 06:48:29 PM
Quote: Is it really possible to find a 100-bit key on one video card? How long does it take for this?
As far as I can see, the 100-bit puzzle was picked by telaurist, who wrote the first Kangaroo version for CPU, and he used 1 GPU to find it. Maybe the latest cards do it fast.

Title: Re: BSGS solver for cuda Post by: NotATether on October 14, 2021, 08:15:17 AM
They definitely do not do it fast, because that's what would happen if the range was 50-60 bits... Are you sure his program wasn't published after he took the #100 coins? Maybe his was the only Kangaroo program at the time and he kept it to himself until he found some private key.

Title: Re: BSGS solver for cuda Post by: Minase on October 14, 2021, 09:25:35 AM
It's quite possible to solve the 100-bit puzzle with a single video card, and not even the most powerful one (kangaroo method). On a single RTX 2060 you can find such a key in 34-35 days (2^51 operations). Sometimes you don't even need the full 2^51; you can find the key when you reach 2^50 (this means half the time, ~17 days). If we are talking about an RTX 2080, the speed is almost 50% higher than the 2060, which leads to ~23 days for the full 2^51 operations.

Title: Re: BSGS solver for cuda Post by: hamnaz on October 14, 2021, 09:45:59 AM
The 2 keys above were generated randomly, one from the first half and the second from the second half of the 100-bit range. I want to know how fast the RTX 3xxx series could find them; I need to calculate times. If you have an RTX card and some time to find the above pubkeys in 100 bits, it will help me estimate the 3xxx power and time. Thanks.

Title: Re: BSGS solver for cuda Post by: Minase on October 14, 2021, 10:08:14 AM
I don't have a 3xxx series card available, but based on specs I can calculate the average speed.
With one 3090 or 3080 Ti, 2^51 operations should be done in 6-7 days.
//edit: based on your previous post, your Tesla K80 will find (if lucky) a private key, if it's in the 100-bit range, in ~25-26 days.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 07:37:15 AM
GPU #0 launched
GPU #0 TotalBuff: 8112.000Mb
error cuMemAlloc-2
Press Enter to exit
I guess you hard-coded 4096 for GPU memory, as I tried everything but I am unable to utilize the full GPU memory. My GPU is a 3080 with 10GB. This is the max I can use:
GPU #0 launched
GPU #0 TotalBuff: 3216.000Mb
Speed is also slower than Kangaroo, around 1200M I am getting. I want to tweak it to utilize max GPU memory and max RAM with max power; increasing the item size will slow down the speed and take longer to solve. Any idea how to tweak it?

Title: Re: BSGS solver for cuda Post by: NotATether on October 15, 2021, 07:52:45 AM
Possibly due to "memory fragmentation": when the program allocates GPU memory for one struct, it may be allocated in the middle of GPU memory, and that limits the maximum contiguous memory allocation allowed on the GPU for other structs. The resolution is to allocate the largest structure first (in this case the TotalBuff) and the smaller ones last. It requires a code modification though, which is impossible to do without the source code.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 08:24:12 AM
Quote: It requires a code modification though, which is impossible to do without the source code.
The source code is available here, I guess: https://github.com/Etayson/BSGS-cuda/blob/main/bsgscudaussualHTchangeble1_2.pb Can you check, please?

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 08:29:17 AM
I think I found the problem: the information the program is pulling from the device is wrong, or these are max values intentionally hardcoded in the program. Etar, can you please make it all dynamic? I mean the device should report all parameters.
Found 1 Cuda device.
Cuda device:GeForce RTX 3080(4095Mb) wrong
Device have: MP:68 Cores+0 wrong
Shared memory total:49152 I guess this is system memory, but 128GB is available
Constant memory total:65536 not sure how this one is calculated
I am not sure, but MP is a unit of AMD cards and CUDA is for Nvidia, and the 3080 has 8k+ CUDA cores, so I am not sure what 68 cores means here. So many confusions.

Title: Re: BSGS solver for cuda Post by: NotATether on October 15, 2021, 10:11:54 AM
There is no need to wait for a patch; you can independently get these stats on an NVIDIA card using their sample deviceQuery program: https://github.com/NVIDIA/cuda-samples/blob/master/Samples/deviceQuery/deviceQuery.cpp - It needs to be compiled from source, but that is extremely easy to do since it's only a single file.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 10:21:44 AM
I agree with you, but the free PureBasic version can compile only a small number of code lines, so that's why I need help from @Etar
and the program is setting memory automatically but calculating it wrong.

Title: Re: BSGS solver for cuda Post by: Etar on October 15, 2021, 10:24:17 AM
Quote: Etar, can you please set all dynamic, I mean device should report all parameters
The program uses the CUDA driver API (not the runtime API that is usually used), and the GPU code is written in PTX.
cuda.lib, which is used to call the CUDA driver API, always returns 32-bit values, even in the x64 version. In that case you can't use or allocate more than 2**32 bytes of GPU memory. cuDeviceTotalMem() also returns a 32-bit memory value, which is why you see 4095Mb. I wrote to Nvidia about this issue a few times, but according to them there is no problem ). If you look into cuda.lib you will find unofficial commands like cuDeviceTotalMem_v2 and others. All these commands have the suffix _v2, and they return correct 64-bit values. But Nvidia says that they do not have commands with the suffix _v2 )) That is about the limitation of 2**32 bytes of GPU memory.
About "Device have: MP:68 Cores+0": it shows 0 because I didn't add Ampere to the program:
Code:
Case 2 ;Fermi
To get the correct number of cores, only this needs to be added:
Code:
Case 8; Ampere

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 10:27:15 AM
Thanks man for the information. Can you please fix the memory & Ampere issue? Is it possible? And recompile it, as I am unable to compile it with PureBasic; the free version has limitations.

Title: Re: BSGS solver for cuda Post by: Etar on October 15, 2021, 10:31:56 AM
I can't fix the memory (it is more a matter of fixing the return of 32-bit values instead of 64-bit) because I can't use unofficial _v2 commands together with official commands in the same app.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 10:36:10 AM
ahan :( I am not good at CUDA or at programming, but if I use -i in Kangaroo, it returns the correct memory parameters. Is it possible to mix in some code from the Kangaroo side? Or is there any way to hardcode the memory?

Title: Re: BSGS solver for cuda Post by: Etar on October 15, 2021, 10:43:34 AM
I tried to solve the 32-bit limitation a few years ago, as soon as the first cards with more than 4GB of memory appeared. But unfortunately this limit could not be overcome. And do you need to utilize all the memory? On my 2080 Ti, already at -w 27 the hash rate drops from 570 Mkeys to 81, while on the 3070 everything is fine. So you first need to check how your hashrate decreases with increasing parameter -w.
Here it is with cuDeviceTotalMem_v2:
Code:
APP VERSION: 1.2.1

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 10:54:05 AM
Does memory allocation on the GPU make a difference in speed? How do I know the optimal -t, -p and -b values for my card (3080)? What is the role of -w and -htsz? And what is item size? Can I occupy more RAM in the computer to give some speed boost, as I have 128GB of memory? If yes, how?

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 11:06:52 AM
Please take a look at these 2 URLs; they fixed this issue.
https://github.com/BOINC/boinc/issues/1773
https://github.com/BOINC/boinc/pull/2707
Perhaps you will get some clue.

Title: Re: BSGS solver for cuda Post by: Etar on October 15, 2021, 11:06:56 AM
-b: use 68; it should be a multiple of the SM count of your card (the 3080 has 68 SMs).
-p: use 256; this value is how many x-points each thread computes in the kernel.
-w: the number of baby steps; -w 26 means create an array of size 2^26. The larger this array, the bigger the giant step. But you should check your hashrate when you increase -w; it shouldn't drop more than 1.5 times. For example, if your hashrate with -w 26 is 1500 Mkeys and with -w 27 it is more than 1000 Mkeys, then it makes sense to increase -w.
-htsz: use the default 25; it is the size of the hash table. You can change -htsz only if you have a small baby array (-w), smaller than the hash table size.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 11:09:36 AM
Awesome, big thanks.

Title: Re: BSGS solver for cuda Post by: Etar on October 15, 2021, 01:51:19 PM
Seems like I fixed the app.. ;D Replaced most commands with unofficial _v2
Code:
GPU #0 launched
Also, speed is a little increased.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 06:33:39 PM
Awesome bro, thanks for sharing all this ~~ will test it.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 15, 2021, 08:51:50 PM
I am sorry bro, it's still the same: still showing 4095 memory, and I cannot utilize above 3200+.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 16, 2021, 04:35:50 AM
Do I have to remove 04 from the beginning of the uncompressed key, or can the software recognize it with 04 as well?
And it seems like there is one more issue. If I use a range like this, 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5eba34000000000000, it runs fine in range, but when I use the 120 range it calculates the range fine but runs far below it and shows false collisions. It started like this:
GPU#0 Cnt:0000000000000000000000000000000000000000000000000000000000000001
.....
GPU#0 Cnt:0000000000000000000000000000000000000000000000160cc3800000000001
but I set the range 0x800000000000000000000000000000 to 0xffffffffffffffffffffffffffffff. Is something wrong with the software?

Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 16, 2021, 04:47:08 AM
Quote: Do i have to remove 04 from bigging of uncompressed key or software can recognize with 04 also?
You can run with 04 in front of the uncompressed key's x,y points; you just cannot use a compressed key in any format.
Quote: is something wrong with software ?
The Cnt values are the giant steps. The program offsets the pubkey (subtracts the start of the range) on startup, and then, after all the baby steps and sorting, the GPU starts the giant steps. Nothing is wrong with the program. If you run a smaller range, you will see the same thing, and you will see it solve for the inputted key. False collisions are normal because only 8 bytes are stored in the hash table.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 16, 2021, 05:37:26 AM
Awesome man, got the point. I was worried that something was wrong with my setup.

Title: Re: BSGS solver for cuda Post by: Etar on October 16, 2021, 05:46:45 AM
Quote: i am sorry bro its still same and still showing 4095 memory and cannot utilize above 3200+
Make sure you run v1.3.1. You can check the version at the beginning.

Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 16, 2021, 05:53:09 AM
Does it search the whole 256-bit space? Or is it limited to only a few LSB bits?

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 16, 2021, 06:10:46 AM
Yes bro, this is v1.3.1 but still the same incorrect detection :(

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 16, 2021, 06:18:49 AM
I fixed it: I had not deleted the old files that were computed before by the old program. Once I deleted all the old .bin etc. files, everything was fine. Thanks man, will start testing now.

Title: Re: BSGS solver for cuda Post by: Etar on October 16, 2021, 06:19:59 AM
By the way, BSGS is fast only in small ranges like 2^64 and less. If you try to use BSGS for the #80 puzzle, for example, you will search for pubkeys much longer than with JLP Kangaroo.

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 16, 2021, 06:34:44 AM
I am a beginner, so I must be doing something wrong, but what program would you recommend for ranges above 80 bits?

Title: Re: BSGS solver for cuda Post by: davidjjones on October 16, 2021, 06:42:53 AM
Is there any script to uncompress multiple pubkeys in a file?
Like this script: BTC Addresses > HASH160 https://github.com/sezginyildirim91/btc-address-to-hash160
Title: Re: BSGS solver for cuda Post by: Etar on October 16, 2021, 06:43:34 AM
Quote: I am a beginner, so I must be doing something wrong, but what program would you recommend for ranges above 80 bits?
JLP Kangaroo. https://github.com/JeanLucPons/Kangaroo
Title: Re: BSGS solver for cuda Post by: studyroom1 on October 16, 2021, 06:52:44 AM
Total: 4294967296 bytes
Save BIN file:79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798_536870912_b.BIN
Press Enter to exit
??? new error
Title: Re: BSGS solver for cuda Post by: Etar on October 16, 2021, 07:20:28 AM
Download version 1.3.2. The chunk size is decreased to 1 GB.
Title: Re: BSGS solver for cuda Post by: studyroom1 on October 16, 2021, 07:28:59 AM
Let me try.
Title: Re: BSGS solver for cuda Post by: hamnaz on October 16, 2021, 07:51:04 AM
Quote: Is there any script to uncompress multiple pubkeys in a file?
Here is a post with code for converting compressed keys to uncompressed and uncompressed to compressed: https://bitcointalk.org/index.php?topic=5244940.msg57700007#msg57700007
Title: Re: BSGS solver for cuda Post by: a.a on October 16, 2021, 08:57:50 AM
Use bitcoin-tool.
https://github.com/matja/bitcoin-tool Title: Re: BSGS solver for cuda Post by: davidjjones on October 16, 2021, 11:59:24 AM Is there any script to uncompress multiple pubkeys in a file? here i see post related to compress to uncompress and uncompress to compress (codes)like this script: BTC Adresses > HASH160 https://github.com/sezginyildirim91/btc-address-to-hash160 https://bitcointalk.org/index.php?topic=5244940.msg57700007#msg57700007 Title: Re: BSGS solver for cuda Post by: studyroom1 on October 16, 2021, 02:25:16 PM help me to understand which one is faster when you are searching multiple keys from file
BSGS or Kangaroo? I heard that if you load a list of pubkeys in Kangaroo, it does not check them all simultaneously. Is that true? If Kangaroo is more efficient and faster than BSGS above 80 bits, please explain why.
Title: Re: BSGS solver for cuda Post by: a.a on October 16, 2021, 04:26:51 PM
Currently there is no known implementation of the kangaroo algorithm that handles multiple pubkeys. JLP's implementation processes each pubkey consecutively. With BSGS it is possible to handle all pubkeys simultaneously. Which one is faster? Depends.
Title: Re: BSGS solver for cuda Post by: studyroom1 on October 16, 2021, 04:38:19 PM
So maybe you mean that JLP first takes one key from the file and works only on that one until it finds a collision, and then processes the second one, while BSGS, as far as I know, checks all of them at the same time.
Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 16, 2021, 05:04:45 PM
In my limited tests, I can tell you that BSGS Cuda is at least faster when searching multiple pubkeys (one at a time) in the 72-bit range. BSGS gives you a 100% check of whether a key is or is not in a range. Kangaroo does not; you have to set the -m option to at least -m 6 to get a 99% rate that the key is or is not in a range.
Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 17, 2021, 06:56:36 AM
Do you mean to say that the BSGS algo gives you 100% information on whether the key is or is not in a given range k1-k2?
Or do you say BSGS would find the private key even if the k1-k2 interval is incorrect? I had a closer look at the JLP BSGS code (not the CUDA one you speak about) and it seems there is no 125- or 128-bit limit on the search interval. Is BSGS by nature searching the whole 256-bit range?
Title: Re: BSGS solver for cuda Post by: a.a on October 17, 2021, 08:25:02 AM
You should read the articles about Pollard's Kangaroo algorithm and BSGS. If you find the key with BSGS in a range, then you know 100% for sure that the key is in the range. And if no baby step is found in the defined range for the pubkey, then it will find no solution.
Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 17, 2021, 08:34:13 AM
I do not know if you are joking or serious. Your answer is totally useless and does not even touch any of my questions. Claiming that if you find the key in a range then you are 100% sure it is there is childish and raises my doubts about your mental health.
Title: Re: BSGS solver for cuda Post by: a.a on October 17, 2021, 11:17:54 AM
Actually my original answer was kind of rough and was basically
like: your questions are stupid; first inform yourself on Wikipedia about what the different cracking methods are doing, and then ask again before wasting our time. Then I thought I should be friendly and removed the toxic part. Now reading your answer encourages me to give you a rough answer. So yes, your question about BSGS is total bullshit. BSGS looks for the right baby steps. If it finds the right baby steps, then it can determine the actual value of the private key by doing the giant steps. This is clearly described on Wikipedia.
Quote: Do you mean to say that bsgs algo gives you 100% information if key is or is not in a given range k1-k2?
This question is stupid. Obviously, if BSGS ran a range and did or did not find a solution, it means that the key is or is not in the range k1-k2.
Quote: Or you say bsgs would find private key even if k1 k2 interval is incorrect?
How is this even possible, when the baby steps already don't find a value? This question is totally stupid. Please first do some fundamental research. Maybe then you won't waste your and our time by asking totally stupid questions and blaming others for giving supposedly stupid answers, when reading Wikipedia and using your own brain would give you the logical answer.
Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 17, 2021, 12:51:51 PM
If you feel you are wasting your time, you should not react at all.
Anyway, your answer is like saying the baby steps must find it (what?) first. Total bullshit.
Title: Re: BSGS solver for cuda Post by: a.a on October 17, 2021, 01:23:07 PM
Quote: If you feel wasting your time you should not react at all
I was being nice to even reach out to someone who asks such uninformed questions.
Quote: Anyway, your answer is like saying baby steps must find it (what) first. Total bullshit.
To understand baby-step giant-step, read the wiki article: https://en.wikipedia.org/wiki/Baby-step_giant-step
If you search in a range, you just need to generate a small baby-step lookup table with the potential steps in that range. But if you are looking for a value outside the provided range, the baby-step lookup table won't contain the necessary value, so you get no hit. The English Wikipedia article contains no example, but the German one does: https://de.wikipedia.org/wiki/Babystep-Giantstep-Algorithmus#Beispiel
Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 17, 2021, 01:40:07 PM
Thank you
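For readers following the range discussion above, here is a minimal Python sketch of baby-step giant-step restricted to a range. It is not code from BSGS-cuda or from any poster. The names `bsgs`, `add`, `mul` and `neg` are made up for this illustration, and it assumes Python 3.8+ (for `pow(x, -1, P)`). It uses a plain dict keyed on full points, so unlike a table that stores only 8 bytes per entry it cannot produce false collisions. The pubkey is first offset by the start of the range, as described earlier in the thread, and a key outside the range simply yields no hit.

```python
# Toy BSGS on secp256k1 (illustration only, far too slow for real ranges).
P = 2**256 - 2**32 - 977  # field prime of secp256k1
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a)
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P      # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord slope
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def neg(pt):
    return None if pt is None else (pt[0], (-pt[1]) % P)

def bsgs(pub, start, m):
    """Return k in [start, start + m*m) with k*G == pub, or None."""
    # Baby steps: table of j*G -> j for j = 0 .. m-1.
    table = {}
    pt = None
    for j in range(m):
        table[pt] = j
        pt = add(pt, G)
    # Offset the pubkey by the start of the range: q = pub - start*G.
    q = add(pub, neg(mul(start, G)))
    step = neg(mul(m, G))  # each giant step subtracts m*G
    for i in range(m):
        if q in table:
            return start + i * m + table[q]
        q = add(q, step)
    return None

start, m = 2**40, 64            # covers [2^40, 2^40 + 4096)
inside = start + 1234
outside = start + 5000
print(bsgs(mul(inside, G), start, m) == inside)   # key inside the range
print(bsgs(mul(outside, G), start, m))            # key outside the range
```

With exact point comparison, a `None` result means the key is not in the scanned range, which is the "100%" property discussed in the posts above.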
Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 17, 2021, 06:00:56 PM
Quote: Do you mean to say that bsgs algo gives you 100% information if key is or is not in a given range k1-k2?
Apart from testing a program, meaning we know a key lies in the range we are using for the test, to see if the program works and can find the key that we know is in the range we are searching....
Quote: Or you say bsgs would find private key even if k1 k2 interval is incorrect? I had a closer look at JLP bsgs code (not cuda you speak about) and it seems there is no limit to 125 or 128 bit for search interval? Is bsgs by nature searching whole 256 bit range?
So if you are searching a range and do not know whether the key lies in it, and you run the range with a BSGS program and the key is not found, then you know 100% for sure that the key is not in that range. If you use JLP Kangaroo, you have to set the -m option to at least -m 6; if the program then does not find the key, you know with 99% certainty that the key is not in the range. BSGS will tell you 100% whether the key is or is not in the range you are searching. JLP BSGS is not limited to any bit range, meaning you can search in a 10-bit range or a 256-bit range. Both programs search in the range you specify. If you specify a 64-bit range, that is what the program will search; if you specify a 256-bit range, that is what the program will search.
Title: Re: BSGS solver for cuda Post by: Etar on October 17, 2021, 06:27:47 PM
New release v1.4.0
supported compressed/uncompressed format public keys
removed binsort program (don't need sorted array any more)
baby array needed only the first time, to create the HT (or to rebuild the HT when -htsz changes)
After the HT is created and saved, next time you will need less RAM and only the HT and giant array to launch.
With a single 2080ti (parameters -w 29 -htsz 28) it finds the 16 public keys from the JLP BSGS example in 2m 30s
Code: GPU#0 Cnt:0000000000000000000000000000000000000000000000006ce8000000000001 869MKey/s x536870912 2^29.76 x2^30=2^59.76
Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 17, 2021, 07:30:27 PM
Quote: BSGS will 100% tell you if the key is or is not in the range you are searching. JLP BSGS is not limited to any bit range; meaning you can search in a 10 bit range or a 256 bit range.
Thank you for the exhaustive explanation! I hope I understand correctly that both JLP and "cuda" BSGS work up to a 256-bit search size. Maybe you noticed I migrated JLP BSGS from curve k1 to r1. It was a success. I started it for an interval of almost 256 bits:
1000000000....0000000
EFFF.....................FFFFF
I know it will probably not succeed in this century, but it is not impossible :) I want to try to find the private key for gr**npass. I continue to work on migrating Kangaroo256 to the r1 curve as well, but I was told by the author to wait a bit until he updates the hashtables (?). Anyway, I migrated the CPU part already just to see how the code behaves. I already reported my findings to him; it is here on the forum.
Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 17, 2021, 07:39:31 PM
Quote: I continue to work to migrate also Kangaroo256 to r1 curve but I was told by the author to wait a bit until he updates hastables (?)
Not sure which author you are referring to, but if this is NotATether's version, I would not use it. Very buggy and last known speed is compromised and the program does not find keys.
The only thing needed, IMO, to upgrade JLP's original Kangaroo to be able to search a 256-bit range was to update the limited 128-bit store function to a 256-bit store function (plus the + - and type bits) so the program could solve the key. I have not looked at the code of the new 256-bit version, but it seems it does not find keys. With the original JLP Kangaroo you can search a 256-bit range for a key, but since it only stores 128 bits for the distances, you will not solve the key properly: 128 bits of the distance (private key) are missing, and they are needed to reconstruct the private key of the pubkey you are searching for.
Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 18, 2021, 05:38:50 AM
Quote: Not sure which author you are referring to, but if this is NotATether's version
Yes it is. He informed me already about some problems and that I should wait until he corrects them, but as I have written already, I could not resist and I migrated it to the R1 curve just to try what it does...
Yes, you are right, as it is now it behaves in a somewhat non-repeatable way. For me it found the keys, but when solving the same problem several times, the time to resolve ranged from zero (!) seconds to minutes. I tried only searches of fewer bits, as it was considerably slower than the JLP 125-bit version searching a range of more bits.
I also noticed he changed the meaning of the DP mask (not from left to right as in the original, but from right to left), and he "regrets" this, as it confused me too.
I also noticed there is a problem when the public key to solve is given in the configuration file in 03 mode (only x coordinate): something is wrong, as it falsely reports "point not lie on curve", while in 04 mode it seems to be OK.
But let us wait and see if NotATether manages to bring it to life.
I was already thinking about whether there is any way to test the program before starting it for months... Would it be possible to provide, instead of the random numbers where the algo starts, some precalculated values so the program would find a test private key in hours? Or is there already a way to test the code?
Sorry, I do not understand the abbreviation "IMO". What does it mean? :)
Title: Re: BSGS solver for cuda Post by: davidjjones on October 18, 2021, 07:27:34 AM
Quote: New release v1.4.0
Thanks for your great code. Is there a limit to the number of pubkeys in the input txt file? Here is the list of 2800 top richest pubkeys: https://pastebin.com/YMr3BaiU
Title: Re: BSGS solver for cuda Post by: a.a on October 18, 2021, 12:02:00 PM
Interesting list you have. I would have given merits if I had some.
I don't think that they can be effectively searched in parallel. You have to divide each pubkey and check it against the baby steps. So you need to make a very expensive global memory lookup (the GPU has slow global and super fast local memory) and also load each key. So if you searched multiple keys, you would effectively reduce the performance by that factor: 10 keys in parallel means 10 times slower, 2800 keys means 2800 times slower.
Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 18, 2021, 12:10:15 PM
Quote: Is there a limit to the number of pubkeys in the input txt file?
I am sure there will be a limit, but it is probably in the millions. For similar programs, it usually caps out at around 30 million addresses, pubkeys, xpoints...
a.a, have you run this program yet? It's just that some of your answers make it seem like you have not run it at all. The program does not check keys in parallel; it runs the range with one pubkey and, once finished, moves to the next, until the last pubkey has been checked for that specific range.
Title: Re: BSGS solver for cuda Post by: a.a on October 18, 2021, 12:20:36 PM
I think I did not make myself clear. My native language is German. I apologize. I looked at it with my programmer's eyes.
As you already explained a few days ago, CUDA BSGS searches the keys one after another, not in parallel. I know that, I read carefully ;). With my previous post I meant that even if it ran in parallel, it would slow down as described. So read my post in the subjunctive: if someone modified it to process in parallel, I would expect that behaviour.
Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 18, 2021, 12:46:36 PM
Quote: I think I made myself not clear. My native language is German. I apologize. I Looked at it with my programmers eyes.
Gotcha...no worries, I just did not want to mislead anyone or anyone mislead anyone lol.
But you are right, if it did search in parallel, performance would drop, but I believe it would be due to the giant steps (the CPU performs the baby steps). If one had a higher-end card and wanted to search 2 pubkeys, then I think it would be worth it to search 2 at the same time.
I have been running the program on a slower card, my test card, for a few days now. The purpose is to see if there is a benefit or angle to attack the 120, 125, 130, etc. keys where the public key is exposed. More to come with this...
Title: Re: BSGS solver for cuda Post by: davidjjones on October 18, 2021, 01:59:43 PM
Quote: The program does not check keys in parallel, it runs range with one pubkey, once finished, it moves to the next, until the last pubkey has been checked for that specific range.
So multiple xpoints checking (in parallel) is only possible with KeyHunt-CUDA.
Title: Re: BSGS solver for cuda Post by: NotATether on October 18, 2021, 01:59:51 PM
Quote: Not sure which author you are referring to, but if this is NotATether's version, I would not use it. Very buggy and last known speed is compromised and the program does not find keys.
Yep, he was referring to mine. You should not try to update it to a 256-bit store. It's too complicated, since you'll have to find a new home for the two flag bits at the end of each store (the ones which limit the actual search range to 126 bits). This is how the hashtable got screwed. It is more logical to update it to a 254-bit store instead, so you don't have to move the flag bits anywhere.
Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 18, 2021, 02:10:13 PM
Quote: Yep he was referring to mine. You should not try to update it to 256-bit store - it's too complicated since you'll have to find a new home for the two flag bits at the end of each store (the ones which limit the actual search range to 126 bits). This is how the hashtable got screwed. It is more logical to update it to a 254-bit store instead so you don't have to move the flag bits anywhere.
Title: Re: BSGS solver for cuda Post by: sky59sky59 on October 18, 2021, 02:41:59 PM
Quote: Yep he was referring to mine.
You should not try to update it to 256-bit store - it's too complicated since you'll have to find a new home for the two flag bits at the end of each store (the ones which limit the actual search range to 126 bits). This is how the hashtable got screwed. It is more logical to update it to a 254-bit store instead so you don't have to move the flag bits anywhere.
In the meantime I found that your updated Div() is faulty. Why not keep the original functions? And it seems more of the functions tested in Check() are faulty. I would probably be the happiest man in the universe if, with all your knowledge and experience, you just updated the 125-bit version to a 254-bit version, only what is strictly necessary, without any improvements :) Then I see your great success!!
You can try this (it gives wrong results):
Code:
  // Div -------------------------------------------------------------------------------------------
  tTotal = 0.0;
  ok = true;
  for (int i = 0; i < 2 && ok; i++) {
    a.SetBase16("D51263D15FC81DE32C5CB69070ABDF3D58A2028184E15F3A6C56EB8A787C81DB");
    b.SetBase16("2AED15B34BE1B98EE4246FB3F447059A");
    // a.Rand(BISIZE);
    // b.Rand(BISIZE/2);
    d.Set(&a);  // keep copies of the numerator and denominator
    e.Set(&b);
    printf("a= %s\n", a.GetBase16().c_str());
    printf("b= %s\n", b.GetBase16().c_str());
    printf("d= %s\n", d.GetBase16().c_str());
    printf("e= %s\n", e.GetBase16().c_str());
    t0 = Timer::get_tick();
    a.Div(&b, &c);  // a becomes the quotient, c the remainder
    printf("a/b= %s\n", a.GetBase16().c_str());
    printf("rem= %s\n", c.GetBase16().c_str());
    t1 = Timer::get_tick();
    tTotal += (t1 - t0);
    a.Mult(&e);  // check: quotient * denominator + remainder == numerator
    a.Add(&c);
    if (!a.IsEqual(&d)) {
      ok = false;
      printf("Div() Results Wrong \nN: %s\nD: %s\nQ: %s\nR: %s\n",
        d.GetBase16().c_str(),
        b.GetBase16().c_str(),
        a.GetBase16().c_str(),
        c.GetBase16().c_str()
      );
      return;
    }
  }
Title: Re: BSGS solver for cuda Post by: NotATether on October 19, 2021, 05:34:38 AM
Quote: Agreed...really, can it be bumped up to a 160-bit store plus the 2 flag bits, easier than the 254-bit store? Then the program could at least cover up to the last 160 bit key for the puzzle/challenge transaction.
254 bits is easier to make than 160 bits because I can just update the structures from int128_t to int256_t in all occurrences without having to make any other changes.
Title: Re: BSGS solver for cuda Post by: Etar on October 19, 2021, 10:41:00 AM
This is the maximum that I can squeeze out of my 2080ti card:
16 pubkeys from the JLP example solved in 1m 23s
Code: NewFINDpubkey= (2375c86aa2a807fd50e4b1a2a65820244e704b8eabc8eb4dc0517393aff0c647, fad56264ae29d620205a68792091b64ae262bba359f8d013ce904d595e790ccf)
first for GPU without xpoint position, only xpoint 32bit + size htsz, in total 32+29 = 61 bits per xpoint
Second for host usage with xpoint position
Utilized 9008 MB of GPU memory.
Title: Re: BSGS solver for cuda Post by: ssxb on October 19, 2021, 12:02:39 PM
I optimized the parameters, and the key "59A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC994327554CED887AAE5D211A2407CDD025CFC3779ECB9C9D7F2F1A1DDF3E9FF8" I solved in 17 seconds. Well, can you please tune it for parallel search of pubkeys? I understand the speed will drop, but it's still worth a try. Can you?
Title: Re: BSGS solver for cuda Post by: Etar on October 19, 2021, 07:13:08 PM
Quote: well can you please tune it for parallel search for pubs , i undertand speed will drop but its still worth to try .. can you?
It is possible to make pseudo-parallelism (this means searching for the keys sequentially at each giant step). But the speed will drop by a factor equal to the number of search keys. For example, when searching 1 public key your speed is 1000 Mkeys/s. If you set up 10 keys the speed will drop to 100 Mkeys/s; with 1000 keys the speed drops to 1 Mkeys/s :) By the way, the search time for 16 keys will be exactly the same in either a sequential search or a pseudo-parallel one.
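The arithmetic in the post above can be checked with a few lines of Python. This is only a sketch of the stated numbers, not a measurement of BSGS-cuda; the function names are made up for the illustration, and it assumes the worst case in which every key needs a full scan of the range.

```python
from fractions import Fraction

def per_key_rate(total_rate, n_keys):
    """Effective speed per key when every giant step is checked against n_keys keys."""
    return Fraction(total_rate, n_keys)

def sequential_scan_time(range_size, total_rate, n_keys):
    """One key at a time at full speed: n_keys full passes over the range."""
    return n_keys * Fraction(range_size, total_rate)

def pseudo_parallel_scan_time(range_size, total_rate, n_keys):
    """One pass over the range at the reduced per-key speed."""
    return Fraction(range_size) / per_key_rate(total_rate, n_keys)

rate = 1000 * 10**6  # 1000 Mkeys/s for a single key, as in the post
for n in (1, 10, 1000):
    print(n, "keys:", float(per_key_rate(rate, n)) / 1e6, "Mkeys/s per key")

# Worst-case time to scan a 2^64 range for 16 keys, in seconds.
print(float(sequential_scan_time(2**64, rate, 16)))
print(float(pseudo_parallel_scan_time(2**64, rate, 16)))
```

Both totals come out equal, which matches the claim for a full scan. What differs between the two modes is the order in which individual keys get found, and that ordering is the point raised in the following posts.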
Title: Re: BSGS solver for cuda Post by: lostrelic on October 19, 2021, 09:40:58 PM
Quote: This is the maximum that I can squeeze out of my 2080ti card: 16 pubkeys from JLP example solved in 1m 23s
Hi Etar, what settings have you used to get that, and could you recommend what to use for a 3080? Thanks, Relic
Title: Re: BSGS solver for cuda Post by: ssxb on October 20, 2021, 03:57:51 AM
Quote: By the way the search time for 16 keys will be exactly the same, either in a sequential search or in a pseudo-parallel.
Got it, but just as an example: if you use a divisor of 32 and load 32 keys, assume the key is in position 1, lucky you. But if the key is in position 30, the program will hang on a full range scan for key 1 and then come back to the second one (my guess, I didn't test your program), perhaps after this century ;D
But one thing that I noticed: Alberto's keyhunt [updated recently] is way faster than BSGScuda [although the two work in different ways]. I solved an 80 key in the blink of an eye, but that one needs serious K and N optimization; with the wrong K and N you will never reach the goal. I am not sure if you guys have had a chance to test Alberto's KEYHUNT (https://github.com/albertobsd/keyhunt); you will find it interesting. But the dark fact is that keyhunt is a RAM-eating bug ;D so if you have less RAM (minimum 128gb) there is no point in comparing it with BSGScuda; perhaps in that case BSGScuda will do way better than keyhunt.
Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 20, 2021, 05:10:05 AM
and if you are going for this:
Quote: Got it but just as example if you do 32 divisor and load 32 keys , assume if key is on position 1~ lucky you
then just do your 32 divisor and let it search each pubkey for 1 minute; maybe lucky you.
Title: Re: BSGS solver for cuda Post by: Etar on October 20, 2021, 05:30:12 AM
Quote: Hi Etar what settings have you used to get that, and could you recommend what to use for a 3080?
It is a not yet published version (I will publish it today). These are my settings for the 2080ti: -t 512 -b 136 -p 480 -w 30 -htsz 28
Utilized around 9200 MB of GPU memory (in total the 2080ti has only 9240 MB of free memory under Windows 10)
P.S. Already released v1.6.0
Title: Re: BSGS solver for cuda Post by: ssxb on October 20, 2021, 05:48:59 AM
Quote: then just do your 32 divisor and let it search each pubkey for 1 minute; maybe lucky you.
You've got a big mouth but less sense and knowledge ;D I hate to tell you, but grow your knowledge and perhaps things will get more clear.
1 > an 80 key, not 80 keys [single key] [random mode with 4.7 Ekeys/sec] [4300000000000000000 keys/sec] [3BACAB37B62E0000 keys/sec] [whole 65 range in 1 sec]. Now compare with BSGScuda with the reference key in range 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e0000000000000000:49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5effffffffffffffff. Let me guess, the 3080 with full optimization took around 17 seconds, but keyhunt took just 1 second. I even have to reduce the k and n values to reduce the speed for this ;D
2 > do your research and then find how many keys you will get while doing 120 with a 2^40 divisor [lol]. If you load 2 keys, you will halve the keyhunt speed, and what about a billion keys? The speed will be just like your mind processing to understand my answer.
Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on October 20, 2021, 06:09:02 AM
Quote: you got big mouth but less sense and knowledge Grin
Your English reading or comprehension is less sense and knowledge.
I never said anything about 80 keys... you are saying you found 80 key; I took that as a single key in an 80-bit range, not 80 keys, because you did not pluralize the word key. So with that, I merely said: instead of trying to get someone to reprogram BSGS Cuda for multi key, run keyhunt, since it already supports multi key, and if you think it is faster, then break up the 120 key into however many keys you want, 2^5, 2^20, 2^40, and let that program eat. I said 2^40 specifically because you said an 80 key in the blink of an eye; so 2^120/2^40 = 2^80; if you found one 80 key in the blink of an eye, maybe you find the 120 key in an 80-bit range in 2 blinks of an eye. BSGS Cuda can find a 65-bit key in less than a second; it all depends on your hardware. You say
Quote: if you will load 2 keys, you will make keyhunt speed half
the same will happen to BSGS Cuda; so I am not sure what your point is really.
Title: Re: BSGS solver for cuda Post by: Etar on October 20, 2021, 06:19:11 AM
Quote: 1 > 80 key not 80 keys [single key] [random mode with 4.7 Ekeys/sec] [4300000000000000000 keys/sec] [3BACAB37B62E0000 keys/sec][ whole 65 range in 1 sec]. now compare with bsgscuda with reference key in range 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e0000000000000000:49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5effffffffffffffff.
let me guess 3080 took with full optimization around 17 second but keyhunt took just 1 second. even i have to reduce k and n value to reduce speed for this ;D. 2 > do your research and than find how many keys you will get while doing 120 to 2^40 divisor [lol]. if you will load 2 keys, you will make keyhunt speed half and what about billion keys . speed will be just like your mind processing to understand my answer. I don`t have 3080 card but i think speed will be around 1400Mkeys x BabyArraySize windows10 eat 20% of GPU memory so 3080 should have 8192 free memory, so we can use -w 30 Totaly 1400mkeys = 2^30.38 and baby array x2 = 2^31 and full perfomance = 2^61.38 and to check full 2^64 need 6.14s Only Kangaroo can solve keys faster then bsgs or keyhunt or whatever. Bsgs cuda created only because i didn`t find bsgs for gpu (maybe it useless app i don`t know) Title: Re: BSGS solver for cuda Post by: ssxb on October 20, 2021, 07:38:15 AM 1 > 80 key not 80 keys [single key] [random mode with 4.7 Ekeys/sec] [4300000000000000000 keys/sec] [3BACAB37B62E0000 keys/sec][ whole 65 range in 1 sec]. now compare with bsgscuda with reference key in range 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e0000000000000000:49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5effffffffffffffff. let me guess 3080 took with full optimization around 17 second but keyhunt took just 1 second. even i have to reduce k and n value to reduce speed for this ;D. 2 > do your research and than find how many keys you will get while doing 120 to 2^40 divisor [lol]. if you will load 2 keys, you will make keyhunt speed half and what about billion keys . speed will be just like your mind processing to understand my answer. 
I don`t have 3080 card but i think speed will be around 1400Mkeys x BabyArraySize windows10 eat 20% of GPU memory so 3080 should have 8192 free memory, so we can use -w 30 Totaly 1400mkeys = 2^30.38 and baby array x2 = 2^31 and full perfomance = 2^61.38 and to check full 2^64 need 6.14s Only Kangaroo can solve keys faster then bsgs or keyhunt or whatever. Bsgs cuda created only because i didn`t find bsgs for gpu (maybe it useless app i don`t know) i am not arguing on your math but if you have time and hardware please just try to do research on keyhunt [CPU+memory ] and by the way i appreciate your programing skills toward cuda its really impressive and wish some day you will enhanced it more to overcome 120 and by the way i know one guy who is running it with 9+Ekeys/sec [yoyodapro]. but with divisor you can get only get 1 key out of 1073741824 if you want to reach 90bit. i loaded all keys in keyhunt and i am trying my luck but on other side i was hoping if we can figure it out how to load multi keys with cudabsgs . so i will keep busy my 3080 for that as that one is just sitting idle now. Title: Re: BSGS solver for cuda Post by: ssxb on October 20, 2021, 07:56:11 AM Quote you got big mouth but less sense and knowledge Grin Your English reading or comprehension is less sense and knowledge.i hate to tell you that grow up your knowledge & perhaps things will get more clear. 1 > 80 key not 80 keys [single key] [random mode with 4.7 Ekeys/sec] [4300000000000000000 keys/sec] [3BACAB37B62E0000 keys/sec][ whole 65 range in 1 sec]. now compare with bsgscuda with reference key in range 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e0000000000000000:49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5effffffffffffffff. let me guess 3080 took with full optimization around 17 second but keyhunt took just 1 second. even i have to reduce k and n value to reduce speed for this Grin. 2 > do your research and than find how many keys you will get while doing 120 to 2^40 divisor [lol]. 
if you will load 2 keys, you will make keyhunt speed half and what about billion keys . speed will be just like your mind processing to understand my answer. I never said anything about 80 keys...you are saying you found 80 key, I took that as a single key in an 80 bit range, not 80 keys because you did not pluralize the word key. So with that, I merely said instead of trying to get someone to reprogram BSGS Cuda for multi key, run keyhunt, since it already supports multi key and if you think it is faster, then break up 120 key into however many keys you want to, 2^5, 2^20, 2^40, or however many you want to and let that program eat. I said 2^40 specifically because you said an 80 key in a blink of an eye; so 2^120/2^40 = 2^80; if you found one 80 key in a blink of an eye, maybe you find the 120 key in 80 bit range in 2 blinks of an eye. BSGS Cuda, can find 65 bit key in less than a second, it all depends on your hardware. you say Quote if you will load 2 keys, you will make keyhunt speed half the same will happen to BSGS Cuda; so I am not sure what your point is really. maybe you find the 120 key in 80 bit range in 2 blinks of an eye. ok learn basic knowledge of divisor bro , if you will do 32 times, only one key will be from 5 bit down range on unknown position other all will from uper bit ranges on exact same distance from their references values. now if you will do 2^40, you will have 1208925819614629174706176 reference values in 256 bit range and only one of key will be in 40bit range other all keys will from uper bits on exact distance from their respectively reference values. now how the hell you can work with such large number of keys and the line you said that get the 2^40 is aggressive comment without knowing my intention. 
my intention is that i already did divisor of 32 and loaded keys in Keyhunt and running it right now and i know how much speed and power i am getting from that , but i just dont know power of BSGScuda if i will load 32 keys parallel in that. so i asked Etar that if he can make such possibility who knows BSGS will out performed keyhunt. now come to the point in above post Etar said that my program is good until 80 bit and above that use JL kangaroo so i was comparing it with BSGS of alberto but i found that CPU based BSGS is more powerful than 3080 if you have good specification hardware but same time BSGScuda is better than keyhunt[CPU] if you dont have enough power of CPU and memory. Title: Re: BSGS solver for cuda Post by: ssxb on October 20, 2021, 08:05:42 AM @Etar ???
I seriously believe there is some way to use the power of the GPU cores and process all of BSGS inside computer memory. Perhaps this will give some crazy power that has never been discovered, or there will be a bottleneck, but you can confirm that when you build such a program. Assume you have the power of keyhunt and then you put the bloom on an SSD [gen4, 7000+ read/write speed]:

RAM      bpfile elements   bpfile size   bloom size
8 GB     1000000000        32 GB         5.02 GB
32 GB    5000000000        160 GB        25.11 GB
128 GB   22000000000       704 GB        110.47 GB
500 GB   90000000000       2.9 TB        451.92 GB

Based on the above table you can increase speed if you utilize both bloom+bp: https://github.com/iceland2k14/bsgs
CPU cores are less powerful than cuda, so I was thinking [not sure if it is possible or not]: if we load all bp in RAM and use some bloom in GPU memory, perhaps there will be some dramatic speed boost.

Title: Re: BSGS solver for cuda Post by: bigvito19 on October 20, 2021, 08:48:29 AM
What's the link to the divisor script?
and how many keys can I generate with the divisor?

Title: Re: BSGS solver for cuda Post by: NotATether on October 20, 2021, 09:55:44 AM
Quote What's the link to the divisor script? and how many keys can I generate with the divisor?
If you mean the one I made, it's in the Kangaroo thread, anywhere from pages 90 to 100 I think.
I think we can cut the number of baby steps made if we take into account that the correct baby step amount is going to be random-looking (in other words, no long 0 or 1 sequences). Or at least make the baby steps take a higher bit count, decreasing the number of giant steps.
I'm thinking that we can find the numbers represented by these random bits and then calculate their multiples to use as an incrementor... not perfect but it does the trick I guess.
E.g. 5 is 101, 10 is 1010, 15 is 1111, 20 is 10100, 25 is 11001, 30 is 11110, ..... etc.
Special care would need to be taken to choose a number whose multiples don't make long sequences of bits, like 15: 3*5.
I don't think that this randomness has any correlation to primality of numbers (or inverse correlation to it).

Title: Re: BSGS solver for cuda Post by: bigvito19 on October 20, 2021, 01:28:20 PM
I'm testing with the divisor keys on a smaller range, but it's not solving the key with keyhunt. Does it work the same with xpoint mode?
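The divisor idea discussed in this part of the thread can be pictured on the private-key side alone. The sketch below is an assumed reading of the trick (subtract i*G for i = 0..d-1, then multiply by the inverse of d modulo the group order); the divisor script itself is not shown in the thread, and `divided_keys` and the sample key are hypothetical. It only works with scalars, so it shows which candidate becomes small, not how to compute the points.

```python
# Scalar-only model of the "divisor" trick: no curve arithmetic, just the
# private-key side modulo the secp256k1 group order.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def divided_keys(k: int, d: int) -> list[int]:
    """Private keys of the d points (P - i*G) * d^-1 for i = 0..d-1 (assumed scheme)."""
    d_inv = pow(d, -1, N)  # modular inverse of the divisor
    return [((k - i) * d_inv) % N for i in range(d)]


k = 0x1A2B3C4D5E   # hypothetical private key, below 2^40
d = 32             # divisor 2^5
keys = divided_keys(k, d)

# Only the candidate with i == k % d divides evenly and drops by 5 bits.
small = [(i, v) for i, v in enumerate(keys) if v < (1 << 35)]
print(small)

# Every other candidate lands far up in the 256-bit space.
print(min(v.bit_length() for i, v in enumerate(keys) if i != k % d))
```

With 32 candidates, exactly one falls into the reduced range and the other 31 sit near the top of the key space, which is why the number of keys to search grows with the divisor.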
Title: Re: BSGS solver for cuda Post by: ssxb on October 20, 2021, 01:46:08 PM I'm testing with the divisor keys on a smaller range, but its not solving the key with keyhunt. does it work the same with xpoint mode? you need to adjust K and N as smaller range will be not solved if power of K and N is more than range count or if number of keys will be more or less than power of your hardware. remember tweak is seriously needed while keeping K and N according to your hardware power as well as adjust K and N according to number of keys you will load in software ~ do the test again and again and again Title: Re: BSGS solver for cuda Post by: Etar on October 20, 2021, 03:54:04 PM With last ptx optimisation (forgot about simmetry in batch point addition)
solve 16 pubkeys from JLP in 58s
Code: ...
Of course JLP would probably have done it even faster :)

Title: Re: BSGS solver for cuda Post by: studyroom1 on October 21, 2021, 08:58:37 AM
Quote With last ptx optimisation (forgot about symmetry in batch point addition) solve 16 pubkeys from JLP in 58s Code: ... Of course JLP would probably have done it even faster :)
Impressive, Etar. I have a question. Let's say you have 1m keys in a file, you load it in bsgscuda and set the scan range to only 64. Once the gpu has finished the whole 64 range scan for key1, will it abandon the search for key1 and move to key2? Is your program already doing that, or will you implement this?

Title: Re: BSGS solver for cuda Post by: Etar on October 21, 2021, 10:11:16 AM
Quote Impressive, Etar. I have a question. Let's say you have 1m keys in a file, you load it in bsgscuda and set the scan range to only 64. Once the gpu has finished the whole 64 range scan for key1, will it abandon the search for key1 and move to key2? Is your program already doing that, or will you implement this?
Use -pk to set the start of the range and -pke to set the end of the range. If a pubkey is not found in this range, the search switches to the next pubkey.

Title: Re: BSGS solver for cuda Post by: lostrelic on October 21, 2021, 10:22:37 AM
Hi Etar thanks for your continuing support for this program.
Quick question: the fastest I get is 2^60; if I try to get 2^61 it sticks on "add baby points to hashtable". I've got a 3080 16gb ram and 500gb ssd. Any ideas on settings to try, or how long should I wait for it to load? Thanks Relic

Title: Re: BSGS solver for cuda Post by: Etar on October 21, 2021, 11:46:14 AM
Quote Hi Etar thanks for your continuing support for this program. Quick question: the fastest I get is 2^60; if I try to get 2^61 it sticks on "add baby points to hashtable". I've got a 3080 16gb ram and 500gb ssd. Any ideas on settings to try, or how long should I wait for it to load? Thanks Relic
The screen I posted above is from the latest version, which is not yet published (tested). By the way, v1.6.0 should work fine for you, with slightly lower performance. At v1.6.0 the 2080 Ti speed is 826 MKey/s x 1073741824, that is 2^29.69 x 2^31 = 2^60.69.
If you have 16gb of gpu ram then try -w 31 and -htsz 29. In any case a 3080 should perform better than a 2080 Ti even with the same baby array size that I use; try -t 512 -b 136 -p 512 -w 30 -htsz 28
P.S. Maybe you stick on "add baby points to hashtable" because the PC has too little memory to generate the HT in RAM. I generate the HT for -w 30 on a PC that has 32GB of ram. For -w 31 you need 64gb of ram to create all arrays. Launching the solver with already generated arrays needs much less memory.

Title: Re: BSGS solver for cuda Post by: Etar on October 21, 2021, 12:47:06 PM
STOP using BSGScuda, I found a bug: not all public keys are found. I can't say right now in which version this bug appeared, so don't use the program until I solve the issue.
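The throughput figure Etar quotes for v1.6.0 (826 MKey/s with -w 30 giving 2^60.69) can be reproduced with a few lines of arithmetic. This is a rough sketch under two assumptions that are read off his numbers rather than stated by him: "MKey" is taken as 2^20 keys, and each giant step covers twice the baby array size. The function names are hypothetical.

```python
import math


def bsgs_rate_log2(giant_steps_per_s: float, w: int, doubled: bool = True) -> float:
    """log2 of keys covered per second: giant-step rate times the span of one step."""
    span_bits = w + (1 if doubled else 0)   # 2^w baby points, x2 if the step is doubled
    return math.log2(giant_steps_per_s) + span_bits


def full_scan_seconds(range_bits: float, rate_log2: float) -> float:
    """Worst-case time to sweep a range of 2^range_bits keys."""
    return 2.0 ** (range_bits - rate_log2)


# 826 MKey/s on a 2080 Ti with -w 30, MKey assumed to mean 2^20 keys
rate = bsgs_rate_log2(826 * 2**20, w=30)
print(f"2^{rate:.2f} keys/s")
print(f"{full_scan_seconds(64, rate):.1f} s to sweep a 64-bit range")
```

The same two functions give the worst-case sweep time for any other range width or -w setting, which makes it easy to compare configurations before generating the arrays.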
Title: Re: BSGS solver for cuda Post by: studyroom1 on October 21, 2021, 01:58:10 PM
Quote STOP using BSGScuda, I found a bug: not all public keys are found. I can't say right now in which version this bug appeared, so don't use the program until I solve the issue.
oh when can we see next update :(

Title: Re: BSGS solver for cuda Post by: Etar on October 22, 2021, 12:55:03 PM
The problem was a double giant step.
Now I have removed the double giant step, and in my opinion everything works as it should. I ran several tests with different small -w and -p options against a file of 1024 pubkeys, and all keys were found. True, the total indicator is now 2 times lower, because the step is normal. You can run all sorts of tests with keys and check. If there are any bugs, let me know. Release 1.7.0 is available on github.

Title: Re: BSGS solver for cuda Post by: _Counselor on October 22, 2021, 02:09:15 PM
Quote The problem was a double giant step. Now I have removed the double giant step, and in my opinion everything works as it should. I ran several tests with different small -w and -p options against a file of 1024 pubkeys, and all keys were found. True, the total indicator is now 2 times lower, because the step is normal. You can run all sorts of tests with keys and check. If there are any bugs, let me know. Release 1.7.0 is available on github.
What kind of problem was it? I think you exploited symmetry to double the size of the giant steps? Why did it not find some keys?

Title: Re: BSGS solver for cuda Post by: math09183 on October 23, 2021, 06:53:32 AM
Quote STOP using BSGScuda, I found a bug: not all public keys are found. I can't say right now in which version this bug appeared, so don't use the program until I solve the issue.
LOL :D That's what happens when you use ad hoc written code, without proper testing. I guess you still did not prepare any set of unit tests to prove your code works? Good luck with the future releases, maybe somewhere around version 20 it will be stable ;D

Title: Re: BSGS solver for cuda Post by: Etar on October 23, 2021, 07:01:46 AM
Quote LOL :D That's what happens when you use ad hoc written code, without proper testing. I guess you still did not prepare any set of unit tests to prove your code works? Good luck with the future releases, maybe somewhere around version 20 it will be stable ;D
I found this bug and solved it, what's your problem?
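_Counselor's question about why a doubled giant step could skip keys can be explored with a scalar-only toy model. The sketch below is an assumption about the mechanism, not the program's actual code: the baby table holds 1..m, a hit is accepted with either sign (the +y/-y symmetry), and the stride is either m or 2m. `bsgs_scalar_model` is a hypothetical name, and integers stand in for curve points.

```python
def bsgs_scalar_model(k: int, w: int, doubled: bool, max_steps: int = 1 << 12):
    """Return the recovered key, or None if this toy search misses it."""
    m = 1 << w                          # baby table holds 1..m, matched as +b or -b
    step = 2 * m if doubled else m      # giant-step stride
    for a in range(max_steps):
        r = k - a * step                # stands in for the point P - a*step*G
        if 1 <= abs(r) <= m:            # found in the baby table, either sign
            return a * step + r
        if r < -m:                      # already stepped past the key
            break
    return None


w = 4  # baby array of 16 entries, doubled stride of 32
missed = [k for k in range(1, 1 << 10)
          if bsgs_scalar_model(k, w, doubled=True) is None]
print(missed[:4])  # [32, 64, 96, 128]
```

In this model the doubled stride lands exactly on keys that are multiples of 2m, where the remainder is 0 and therefore not in a table that starts at 1; with the normal stride every key in the range is recovered.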
Title: Re: BSGS solver for cuda Post by: Etar on October 23, 2021, 07:02:51 AM
Quote What kind of problem was it? I think you exploited symmetry to double the size of the giant steps? Why did it not find some keys?
If we talk about the doubled GS (Giant Step): for example, options -p 8 -w 4 mean a baby array of 2^4 = 16, and each (doubled) giant step is 16*2 = 32. Let's say we have to find the pubkey with privkey = 32. The program subtracts the GS from the public key and looks in the baby array to check for an overlap. But if you subtract 32-32 you get 0, and this value is not present in the baby array. If we used the usual GS, 32-16 = 16, and 16 is present in the baby array: pubkey solved. So with the doubled GS, every (baby array size)*2-th key was not found.

Title: Re: BSGS solver for cuda Post by: math09183 on October 23, 2021, 03:17:08 PM
Quote LOL :D That's what happens when you use ad hoc written code, without proper testing. I guess you still did not prepare any set of unit tests to prove your code works? Good luck with the future releases, maybe somewhere around version 20 it will be stable ;D
Quote I found this bug and solved it, what's your problem?
Relax, haters gonna hate ;D Anyway, good job, I appreciate your work. Sh*t happens.

Title: Re: BSGS solver for cuda Post by: Etar on October 23, 2021, 05:13:51 PM
Code: KEY[15]: 0x49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e2452dd26bc983cd5
v1.7.1 released with maximum performance.

Title: Re: BSGS solver for cuda Post by: mamuu on October 23, 2021, 07:13:47 PM
Quote Code: KEY[15]: 0x49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e2452dd26bc983cd5 v1.7.1 released with maximum performance.
Hi, please, can you write a little tutorial on usage?

Title: Re: BSGS solver for cuda Post by: Etar on October 23, 2021, 07:16:04 PM
Quote Hi, please, can you write a little tutorial on usage?
Did you look at readme.md on github?
First of all, set the -t parameter to 256 or 512.
Next, set -b equal to the SM count of your card.
Next, set -p, starting from 128.
Then set -w as high as your gpu memory allows, and check the matching -htsz parameter:
-w 31 -htsz 29 needs around 64GB of RAM to generate all arrays
-w 30 -htsz 28 needs around 32GB of RAM to generate all arrays
-w 29 -htsz 28
-w 28 -htsz 27
-w 27 -htsz 25
If you still have free gpu memory you can increase -p or -b or -t (all params in multiples of 2).

Title: Re: BSGS solver for cuda Post by: ssxb on October 24, 2021, 03:30:22 AM
Quote Hi, please, can you write a little tutorial on usage?
Quote Did you look at readme.md on github? First of all, set the -t parameter to 256 or 512. Next, set -b equal to the SM count of your card. Next, set -p, starting from 128. Then set -w as high as your gpu memory allows, and check the matching -htsz parameter: -w 31 -htsz 29 needs around 64GB of RAM to generate all arrays, -w 30 -htsz 28 needs around 32GB of RAM to generate all arrays, -w 29 -htsz 28, -w 28 -htsz 27, -w 27 -htsz 25. If you still have free gpu memory you can increase -p or -b or -t (all params in multiples of 2).
Some great work from your side, appreciated. Just a quick question: do you have any plan to enhance it for 120bit or more, to perform better than JLK?

Title: Re: BSGS solver for cuda Post by: Etar on October 24, 2021, 07:55:06 AM
Quote Just a quick question: do you have any plan to enhance it for 120bit or more, to perform better than JLK?
Bsgs will never be faster than kangaroo on big ranges. Example: puzzle #80, width 79 bits. Kangaroo needs on average 2^40.5 operations to solve the key. A 2080 Ti can search DPs at 1500 Mkey/s = 2^30.48, so you need on average 2^10 seconds to find the key.
Bsgscuda can search 2^61 keys/s, so it needs 2^79 / 2^61 = 2^18 seconds to check the whole 79-bit range. Of course the key can be close to the beginning and you can find it very fast, but that is not guaranteed.
We can divide the pub into 64 pubkeys and try to find one of them in the range 1..3ffffffffffffffffff. Fortunately, the key we need is the first one on the list, but that was just luck. By the way, the privkey will be 3a869719b73046d6b46 = 2^73.87, so we need 2^73.87 / 2^61 = 2^12.87 seconds to find the key. That is about 7 times longer than kangaroo needs.

Title: Re: BSGS solver for cuda Post by: arulbero on October 24, 2021, 04:32:04 PM
Do you use the negation map to speed up the algorithm? --> pag. 8-9 https://eprint.iacr.org/2015/605.pdf
You need to compute only: sqrt(n) / 2 baby steps : for example, n = 2^60 -> 2^29 baby steps sqrt(n) giant steps : for example, n = 2^60 -> 2^30 giant steps It is like to shift the public key of 2^59 steps, and search in the interval [1,..., 2^59] instead of [1,...., 2^60] exploiting the symmetrie. Besides, in order to compute a batch of k steps, you need to calculate only k/2 (instead of k) elements x^-1 mod p , at the cost of 1 inversions and 3*(k/2 - 1) multiplications mod p. Title: Re: BSGS solver for cuda Post by: Etar on October 24, 2021, 05:02:06 PM Do you use the negation map to speed up the algorithm ? --> pag. 8-9 https://eprint.iacr.org/2015/605.pdf (https://eprint.iacr.org/2015/605.pdf) No, don`t use even don`t know about this.You need to compute only: sqrt(n) / 2 baby steps : for example, n = 2^60 -> 2^29 baby steps sqrt(n) giant steps : for example, n = 2^60 -> 2^30 giant steps It is like to shift the public key of 2^59 steps, and search in the interval [1,..., 2^59] instead of [1,...., 2^60] exploiting the symmetrie. Besides, in order to compute a batch of k steps, you need to calculate only k/2 (instead of k) elements x^-1 mod p , at the cost of 1 inversions and 3*(k/2 - 1) multiplications mod p. Will try to understand this tweak, thanks. Title: Re: BSGS solver for cuda Post by: arulbero on October 24, 2021, 05:16:14 PM As you already explained few days ago, cuda BSGS is searching the keys one after another, not in parallel. I know that, I read carefully ;). With my previous post I meant that even if it would run in parallel it would slow down as described. So read my post in conjunctive and if someone would modify it to process in parallel I expect that behaviours. 
BSGS works in this way: suppose we know that P = k*G, and the private key k is in the [1,...., 2^60] range.
1) precompute 2^29 baby steps (they are simple public keys): 1*G, 2*G, 3*G, ...., 2^29 * G
2) split the public key P into many other public keys (they are called giant steps, 2^30 public keys): P, P - 1*(2^30*G), P - 2*(2^30*G), P - 3*(2^30*G), ..., P - (2^30 - 1)*(2^30*G)
3) for each giant step, check whether it lies in the [1, ..., 2^30] range (whether it is equal to a 'baby step' public key)
4) if P - a*(2^30*G) = +-b*G then P = (a*2^30 +- b)*G, so the private key is k = a*2^30 +- b
If you want to search 2 public keys, P1 and P2, you can use the same baby steps, but you need to generate 2^31 giant steps instead of 2^30.

Title: Re: BSGS solver for cuda Post by: arulbero on October 24, 2021, 05:18:43 PM
Quote Besides, in order to compute a batch of k steps, you need to calculate only k/2 (instead of k) elements x^-1 mod p, at the cost of 1 inversion and 3*(k/2 - 1) multiplications mod p.
Quote No, I don't use it, I didn't even know about this. Will try to understand this tweak, thanks.
I mean: how do you compute a batch of 'consecutive' keys? Like P, P+G, P+2G, P+3G, P+4G, P+5G, ... ?

Title: Re: BSGS solver for cuda Post by: Etar on October 24, 2021, 05:31:37 PM
Quote Besides, in order to compute a batch of k steps, you need to calculate only k/2 (instead of k) elements x^-1 mod p, at the cost of 1 inversion and 3*(k/2 - 1) multiplications mod p.
Quote No, I don't use it, I didn't even know about this. Will try to understand this tweak, thanks.
Quote I mean: how do you compute a batch of 'consecutive' keys? Like P, P+G, P+2G, P+3G, P+4G, P+5G, ... ?
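The cost arulbero quotes for a batch (1 inversion plus roughly 3 multiplications per element) is what Montgomery's batch-inversion trick gives, and the halving comes from C + T and C - T sharing the denominator x_T - x_C. The sketch below illustrates both in plain Python on the secp256k1 field; it is not taken from BSGS-cuda or BitCrack, and `batch_inverse` and `add_and_sub_x` are hypothetical names.

```python
P = 2**256 - 2**32 - 977   # secp256k1 field prime


def batch_inverse(values: list[int]) -> list[int]:
    """Invert every value mod P using a single modular inversion."""
    prefix = [1]
    for v in values:                          # prefix[i] = v0 * v1 * ... * v(i-1)
        prefix.append(prefix[-1] * v % P)
    inv_running = pow(prefix[-1], -1, P)      # the only inversion
    out = [0] * len(values)
    for i in range(len(values) - 1, -1, -1):
        out[i] = inv_running * prefix[i] % P  # equals 1 / values[i]
        inv_running = inv_running * values[i] % P
    return out


def add_and_sub_x(center, table):
    """x-coordinates of C + T and C - T for every T, one shared inverse per T.

    Assumes no T has the same x-coordinate as C.
    """
    xc, yc = center
    inverses = batch_inverse([(xt - xc) % P for xt, _ in table])
    out = []
    for (xt, yt), d_inv in zip(table, inverses):
        s_plus = (yt - yc) * d_inv % P        # slope of the line through C and T
        s_minus = (-yt - yc) * d_inv % P      # slope for -T = (xt, P - yt)
        out.append(((s_plus * s_plus - xc - xt) % P,
                    (s_minus * s_minus - xc - xt) % P))
    return out


vals = [3, 5, 7, 11]
print(all(v * i % P == 1 for v, i in zip(vals, batch_inverse(vals))))  # True
```

For a batch of 100 points centred on C, this means 50 denominators, one inversion, and 100 resulting x-coordinates.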
In the same way as in bitcrack https://github.com/brichard19/BitCrack/blob/6bf8059ef075eb1622298395866b0bd02375e1d9/cudaMath/secp256k1.cuh#L642 and then https://github.com/brichard19/BitCrack/blob/6bf8059ef075eb1622298395866b0bd02375e1d9/cudaMath/secp256k1.cuh#L656 Title: Re: BSGS solver for cuda Post by: arulbero on October 24, 2021, 05:40:09 PM Besides, in order to compute a batch of k steps, you need to calculate only k/2 (instead of k) elements x^-1 mod p , at the cost of 1 inversions and 3*(k/2 - 1) multiplications mod p. No, don`t use even don`t know about this.Will try to understand this tweak, thanks. I mean: how do you compute a batch of 'consecutive' keys ? Like P, P+G, P+2G, P+3G, P+4G, P+5G, ... ? Ok. If you have a batch of 100 points, you don't need to compute 100 inversions but only 50 inversions. If you have to compute A + B: https://i.imgur.com/jbMdLFE.jpg If you have to compute A - B, since -B = (xB, n-yB) 1/(xb-xa) is the same as in A + B. Because for example P+2G and P-2G use the same inverse. Title: Re: BSGS solver for cuda Post by: Etar on October 24, 2021, 05:44:48 PM Ok. If you have a batch of 100 points, you don't need to compute 100 inversions but only 50 inversions. Because for example P+2G and P-2G use the same inverse. befor using symmetry in addition speed was 800Mkeys with -w 30, after using symmetry speed grow to 1150Mkeys Title: Re: BSGS solver for cuda Post by: arulbero on October 24, 2021, 05:52:59 PM Ok. If you have a batch of 100 points, you don't need to compute 100 inversions but only 50 inversions. Because for example P+2G and P-2G use the same inverse. befor using symmetry in addition speed was 800Mkeys with -w 30, after using symmetry speed grow to 1150Mkeys Ok. The square "a**2 mod p" is optimized like here (https://github.com/JeanLucPons/Kangaroo/blob/master/GPU/GPUMath.h#L909) ? Title: Re: BSGS solver for cuda Post by: Etar on October 24, 2021, 05:57:01 PM Ok. 
If you have a batch of 100 points, you don't need to compute 100 inversions but only 50 inversions. Because for example P+2G and P-2G use the same inverse. befor using symmetry in addition speed was 800Mkeys with -w 30, after using symmetry speed grow to 1150Mkeys Ok. The square "a**2 mod p" is optimized like here (https://github.com/JeanLucPons/Kangaroo/blob/master/GPU/GPUMath.h#L909) ? i use optimized square mod P in PB https://github.com/Etayson/BSGS-cuda/blob/e41fff517b8de153b6bf9846ee7abb47524fe43e/lib/Curve64.pb#L2161 but need buffer 512bytes for this, so i did not transfer it to cuda ptx Also used double giant step, so we substuct double giant value from pub. Title: Re: BSGS solver for cuda Post by: arulbero on October 24, 2021, 06:33:05 PM Also used double giant step, so we substuct double giant value from pub. So, instead of computing sqrt(n) / 2 baby steps and sqrt(n) giant steps you compute sqrt(n) baby steps and sqrt(n) / 2 giant steps then, for example, for n = 2^60: P - a * (2^31*G) = b*G where 'a' lies in [1, ..., 2^29] and 'b' lies in [1,...,2^30] means: 1) P - a*(2^31*G) = b*G --> P = [a*(2^31) + b] * G --> priv key = a*(2^31) + b or 2) P - a*(2^31*G) = -b*G --> P = [a*(2^31) - b] * G --> priv key = a*(2^31) - b to save 2^30 giant steps. It seems to me that the program is already optimized. Title: Re: BSGS solver for cuda Post by: Etar on October 24, 2021, 06:46:12 PM Also used double giant step, so we substuct double giant value from pub. So, instead of computing sqrt(n) / 2 baby steps and sqrt(n) giant steps you compute sqrt(n) baby steps and sqrt(n) / 2 giant steps then, for example, for n = 2^60: P - a * (2^31*G) = b*G where 'a' lies in [1, ..., 2^29] and 'b' lies in [1,...,2^30] means: 1) P - a*(2^31*G) = b*G --> P = [a*(2^31) + b] * G --> priv key = a*(2^31) + b or 2) P - a*(2^31*G) = -b*G --> P = [a*(2^31) - b] * G --> priv key = a*(2^31) - b to save 2^30 giant steps. It seems to me that the program is already optimized. 
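The relation P - a*(2m*G) = +-b*G described above can be tried out on a small range with plain Python. This is a minimal sketch on secp256k1 and not the BSGS-cuda implementation: the function names, the dictionary used as the baby table, and the explicit check for the point at infinity (the case where the key is an exact multiple of the stride) are all assumptions made for illustration.

```python
P = 2**256 - 2**32 - 977   # secp256k1 field prime
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Affine addition on y^2 = x^3 + 7; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0]:
        if (a[1] + b[1]) % P == 0:
            return None                                   # a == -b
        s = 3 * a[0] * a[0] * pow(2 * a[1] % P, -1, P) % P  # doubling
    else:
        s = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (s * s - a[0] - b[0]) % P
    return (x, (s * (a[0] - x) - a[1]) % P)


def neg(a):
    return None if a is None else (a[0], (-a[1]) % P)


def mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def bsgs(pub, w, giant_steps):
    """Search k = a*2m + b or a*2m - b, with b in 1..m and m = 2^w."""
    m = 1 << w
    baby = {}                          # x-coordinate -> (b, y) for b*G
    pt = None
    for b in range(1, m + 1):
        pt = add(pt, G)
        baby[pt[0]] = (b, pt[1])
    stride = neg(mul(2 * m, G))        # -(2m)*G, added once per giant step
    q = pub
    for a in range(giant_steps):
        if q is None:                  # key is exactly a*2m
            return a * 2 * m
        hit = baby.get(q[0])
        if hit is not None:
            b, y = hit
            return a * 2 * m + (b if y == q[1] else -b)
        q = add(q, stride)
    return None


secret = 0x1F3A7                       # hypothetical key below 2^17
print(hex(bsgs(mul(secret, G), w=8, giant_steps=512)))
```

Because the table is keyed on the x-coordinate only, one lookup covers both +b and -b, which is what lets the stride be twice the table size.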
Baby array is 1G,2G,3G... So this array computed only first time and then redesigned to hashtable. It is one HT for any ranges. Giant array is computed with doubled value of Baby array size, for ex if Baby array have size 2^30 then Giant Array have value G*(2^31), G*(2^32), G*(2^33)... All arrays computed only one time if not changed settings. So you can easy used all arays for different ranges and pubkeys without recompute. Title: Re: BSGS solver for cuda Post by: arulbero on October 24, 2021, 06:57:39 PM I compute as max as possible baby array size dependency of GPU memory Baby array is 1G,2G,3G... So this array computed only first time and then redesigned to hashtable. It is one HT for any ranges. Giant array is computed with doubled value of Baby array size, for ex if Baby array have size 2^30 then Giant Array have value G*(2^31), G*(2^32), G*(2^33)... All arrays computed only one time if not changed settings. So you can easy used all arays for different ranges and pubkeys without recompute. You can choose: 1) 2^30 baby-steps: 1*G, 2*G, ..., 2^30*G and 2^29 giant steps: P-1*2^31*G, P-2*2^31*G,..,P -a*2^31*G where a is in [1,2,.., 2^29 - 1] or 2) 2^29 baby-steps: 1*G, 2*G, ..., 2^29*G and 2^30 giant steps: P-1*2^30*G, P-2*2^30*G,..., P -a*2^30*G where a is in [1,2,..., 2^30 - 1] I don't understand why: "G*(2^31), G*(2^32), G*(2^33)": in this way you compute only 30 giant steps Title: Re: BSGS solver for cuda Post by: Etar on October 24, 2021, 07:05:38 PM I compute as max as possible baby array size dependency of GPU memory Baby array is 1G,2G,3G... So this array computed only first time and then redesigned to hashtable. It is one HT for any ranges. Giant array is computed with doubled value of Baby array size, for ex if Baby array have size 2^30 then Giant Array have value G*(2^31), G*(2^32), G*(2^33)... All arrays computed only one time if not changed settings. So you can easy used all arays for different ranges and pubkeys without recompute. 
You can choose: 1) 2^30 baby-steps: 1*G, 2*G, ..., 2^30*G and 2^29 giant steps: P-1*2^31*G, P-2*2^31*G,..,P -a*2^31*G where a is in [1,2,.., 2^29 - 1] or 2) 2^29 baby-steps: 1*G, 2*G, ..., 2^29*G and 2^30 giant steps: P-1*2^30*G, P-2*2^30*G,..., P -a*2^30*G where a is in [1,2,..., 2^30 - 1] I don't understand why: "G*(2^31), G*(2^32), G*(2^33)": in this way you compute only 30 giant steps size of giant array is equil to thread number * block number * pparam for 2080ti i use 512 thread 138 blocks and pparam = 480 so totaly i have 33914880 doubled giant values So each cuda kernel call calculate 33914880 * 2(due to +y/-y in batch additions) giat steps Title: Re: BSGS solver for cuda Post by: jacky19790729 on October 29, 2021, 03:42:12 AM based on above table you can increase speed if you will utilize both bloom+bp https://github.com/iceland2k14/bsgs (https://github.com/iceland2k14/bsgs) so CPU cores are less powerful than cuda and i was thinking [not sure possible or not] if we load all bp in RAM and use some bloom in GPU memory perhaps their will be some dramatic speed boost I had try iceland2k14's BSGS Intel i7-7800X + 24 GB DDR4-2400 Code: D:\python\BSGS_ice>python bsgs_dll_secp256k1.py -p 0385a30d8413af4f8f9e6312400f2d194fe14f02e719b24c3f83bf1fd233a8f963 -b bPfile.bin -bl bloomfile.bin -n 1000000000000000 -keyspace 40000000000000:80000000000000 1 CPU speed: BSGS Check 0x38D7EA4C68000 key / second Title: Re: BSGS solver for cuda Post by: Etar on October 29, 2021, 05:20:18 AM -snip- with bsgscuda and single 2080ti found in 1s.Code: ============== KEYFOUND ============== 1 1 CPU speed: BSGS Check 0x38D7EA4C68000 key / second Code: FINDpubkey: 0385a30d8413af4f8f9e6312400f2d194fe14f02e719b24c3f83bf1fd233a8f963 Title: Re: BSGS solver for cuda Post by: jacky19790729 on October 29, 2021, 10:51:42 AM Quote with bsgscuda and single 2080ti found in 1s. Good.... 
Use my NVIDIA GeForce RTX 3090 Founders Edition #65 only spent 38 seconds to solved I have 3 RTX 3090 Founders Edition graphics cards Code: D:\BTC\cuda_BSBG>SET pub=30210c23b1a047bc9bdbb13448e67deddc108946de6de639bcc75d47c0216b1be383c4a8ed4fac77c0d2ad737d8499a362f483f8fe39d1e86aaed578a9455dfc Title: Re: BSGS solver for cuda Post by: Etar on October 29, 2021, 12:38:42 PM Quote with bsgscuda and single 2080ti found in 1s. Good.... Use my NVIDIA GeForce RTX 3090 Founders Edition #65 only spent 38 seconds to solved I have 3 RTX 3090 Founders Edition graphics cards -snip- and you will solve this puzzle #64 16 time faster Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 01, 2021, 07:23:26 PM 3 * RTX 3090 (GPU Ram only use 5 GB ) ...but RTX3090 have 24 GB DDR6
#70 My Computer took 53 seconds to solve ..... Code:
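For readers following the baby-step/giant-step discussion earlier in the thread, here is a minimal sketch of the idea. It is only an illustration and makes some loud assumptions: it works in the multiplicative group modulo a small prime instead of on secp256k1, it runs on the CPU, and the names bsgs, secret, giant_values and giants_per_call are made up for this example. It is not how bsgscuda is implemented. The last lines just redo the giant-array arithmetic quoted for the 2080ti (512 threads, 138 blocks, pparam = 480).

```python
from math import isqrt

def bsgs(g, h, p, order):
    """Return some x in [0, order) with pow(g, x, p) == h, or None."""
    m = isqrt(order - 1) + 1           # number of baby steps, so that m*m >= order
    baby = {}
    cur = 1
    for j in range(m):                 # baby steps: g^0, g^1, ..., g^(m-1)
        baby.setdefault(cur, j)        # keep the smallest exponent for each value
        cur = cur * g % p
    stride = pow(g, -m, p)             # g^(-m) mod p (negative exponent needs Python 3.8+)
    gamma = h % p
    for i in range(m):                 # giant steps: h * g^(-i*m)
        if gamma in baby:              # hit: h == g^(i*m + j)
            return i * m + baby[gamma]
        gamma = gamma * stride % p
    return None

# Toy instance (assumption: small prime group, NOT the secp256k1 curve).
p = 1000003
g = 2
secret = 654321
h = pow(g, secret, p)
found = bsgs(g, h, p, p - 1)
print("found exponent:", found)

# Giant-array arithmetic from the thread (2080ti settings).
threads, blocks, pparam = 512, 138, 480
giant_values = threads * blocks * pparam    # 33914880
giants_per_call = giant_values * 2          # doubled because of +y/-y
print(giant_values, giants_per_call)
```

The table of baby steps is what -w and -htsz size in the real program, while -t, -b and -p decide how many giant steps one kernel call covers, which is why both sets of parameters affect speed.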
Title: Re: BSGS solver for cuda Post by: dlystyr on November 01, 2021, 10:09:28 PM Hi,
Thanks for the great program, Etar. I was wondering if there is a way to save work? Or does BSGS not work like that? Or, with the hash table, does it already continue from where it left off? Thanks Title: Re: BSGS solver for cuda Post by: Etar on November 05, 2021, 06:48:44 PM Quote Hi, Thanks for the great program Etar. I was wondering if there is a way to save work? or if BSGS does not work like that? Or with the hash table, does it continue from where it left off already? Thanks Released v1.7.2. The current state is saved to the file currentwork.txt (the file name can't be changed) every 180 s by default, but you can change this interval with -wt. If the app crashes or you stop it, you can resume from the last saved state, provided the launch configuration has not been changed: set the -wl parameter in your bat file to the name of the state file and the app will start from that state. I also added presets for each card (display only), but you can try using these presets to fill your GPU memory completely. Title: Re: BSGS solver for cuda Post by: Etar on November 05, 2021, 07:10:45 PM Quote 3 * RTX 3090 (GPU Ram only use 5 GB ) ...but RTX3090 have 24 GB DDR6 -snip- Read what I wrote above and the readme.md file on GitHub. You need to increase the -w parameter and set -htsz depending on -w. The main task is to fill the GPU memory as much as possible with the -w parameter (and -htsz), and then fill the remaining free GPU memory with the -p parameter, -b (not more than x2 of SM) and -t (not more than 512). The presets say that a good config for your RTX 3090 is: -t 512 -b 328 -p 530 -w 31 -htsz 29. That fills 20436.750 MB of the 20450.000 MB free, but you need around 58 GB of host RAM to generate all arrays. With saved arrays you need much less memory to launch the app. In this case your performance will be around 2^62 per card. Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 09, 2021, 12:51:50 AM I upgraded the DRAM to 128 GB ( DDR4 - 32 GB * 4 DIMM )
Error !!! -t 512 -b 328 -p 530 -w 31 -htsz 29 Code: D:\BTC\cuda_BSGS>bsgscudaHT_1_7_2.exe -d 0,1,2 -pb 90e6900a58d33393bc1097b5aed31f2e4e7cbd3e5466af958665bc0121248483d7319f127105f492fd15e009b103b4a83295722f28f07c95f9a5443ef8e77ce0 -pk 0x0000000000000000000000000000000000000000000000200000000000000000 -t 512 -b 328 -p 530 -w 31 -htsz 29 then....... -t 512 -b 328 -p 530 -w 31 -htsz 30 Code: D:\BTC\cuda_BSGS>bsgscudaHT_1_7_2.exe -d 0,1,2 -pb 90e6900a58d33393bc1097b5aed31f2e4e7cbd3e5466af958665bc0121248483d7319f127105f492fd15e009b103b4a83295722f28f07c95f9a5443ef8e77ce0 -pk 0x0000000000000000000000000000000000000000000000200000000000000000 -t 512 -b 328 -p 530 -w 31 -htsz 30 [moderator's note: consecutive posts merged] Title: Re: BSGS solver for cuda Post by: Etar on November 09, 2021, 06:58:24 AM I upgrade the memory DRAM to 128 GB ( DDR4 - 32 GB * 4 DIMM ) Yeeh, you are right. With -w 31 and -htsz 29 there in HT can be collision with the same values. Even with -w 31 -htsz 30 HT will have collision.Error !!! -t 512 -b 328 -p 530 -w 31 -htsz 29 So try for your configuration -t 512 -b 328 -p 796 -w 30 -htsz 29 total MB: 20430.500 or -t 512 -b 328 -p 930 -w 30 -htsz 28 total MB: 20442.750 Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 09, 2021, 06:49:30 PM test #70
3 * RTX3090 use " -t 512 -b 328 -p 930 -w 30 -htsz 28 " I get error message "error cuCtxSynchronize-700" GPU Memory used 20919 MB~~~~~ Code: D:\BTC\cuda_BSGS>bsgscudaHT_1_7_2.exe -d 0,1,2 -pb 90e6900a58d33393bc1097b5aed31f2e4e7cbd3e5466af958665bc0121248483d7319f127105f492fd15e009b103b4a83295722f28f07c95f9a5443ef8e77ce0 -pk 0x0000000000000000000000000000000000000000000000200000000000000000 -t 512 -b 328 -p 930 -w 30 -htsz 28 Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on November 09, 2021, 07:04:03 PM test #70 Have you tried running with 1 GPU or at least all GPUs that can handle the same config? It's hard to tell if your 1030 is one of the GPUs selected.3 * RTX3090 use " -t 512 -b 328 -p 930 -w 30 -htsz 28 " I get error message "error cuCtxSynchronize-700" GPU Memory used 20919 MB~~~~~ Code: D:\BTC\cuda_BSGS>bsgscudaHT_1_7_2.exe -d 0,1,2 -pb 90e6900a58d33393bc1097b5aed31f2e4e7cbd3e5466af958665bc0121248483d7319f127105f492fd15e009b103b4a83295722f28f07c95f9a5443ef8e77ce0 -pk 0x0000000000000000000000000000000000000000000000200000000000000000 -t 512 -b 328 -p 930 -w 30 -htsz 28 Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 09, 2021, 08:01:26 PM Have you tried running with 1 GPU or at least all GPUs that can handle the same config? It's hard to tell if your 1030 is one of the GPUs selected. 
YES , I use 1 GPU still show this error #0 NVIDIA GeForce RTX 3090 ( PCI-E x1 - Plug in the card PIC-E x1 to x16 USB3.0 ) #1 NVIDIA GeForce RTX 3090 ( PCI-E x1 - Plug in the card PIC-E x1 to x16 USB3.0 ) #2 NVIDIA GeForce RTX 3090 ( PCI-E x1 - Plug in the card PIC-E X1 to x16 USB3.0 ) #3 NVIDIA GeForce GT 1030 ( PCI-E x16 - Plug in the ASUS TUF x299 motherboard) When using parameters "-t 512 -b 328 -p 930 -w 30 -htsz 28 "~~ get this error I use "-t 512 -b 68 -p 256 -w 29 -htsz 28" to solve #70 only 50~60 seconds Title: Re: BSGS solver for cuda Post by: COBRAS on November 10, 2021, 12:34:46 AM Have you tried running with 1 GPU or at least all GPUs that can handle the same config? It's hard to tell if your 1030 is one of the GPUs selected. YES , I use 1 GPU still show this error #0 NVIDIA GeForce RTX 3090 ( PCI-E x1 - Plug in the card PIC-E x1 to x16 USB3.0 ) #1 NVIDIA GeForce RTX 3090 ( PCI-E x1 - Plug in the card PIC-E x1 to x16 USB3.0 ) #2 NVIDIA GeForce RTX 3090 ( PCI-E x1 - Plug in the card PIC-E X1 to x16 USB3.0 ) #3 NVIDIA GeForce GT 1030 ( PCI-E x16 - Plug in the ASUS TUF x299 motherboard) When using parameters "-t 512 -b 328 -p 930 -w 30 -htsz 28 "~~ get this error I use "-t 512 -b 68 -p 256 -w 29 -htsz 28" to solve #70 only 50~60 seconds How long you need for solve 115 and 110 (one pubkey) ? Title: Re: BSGS solver for cuda Post by: Etar on November 10, 2021, 07:51:04 AM test #70 cuCtxSynchronize-700 happened when giant array is more then 4gb(32bit pointer overflow in 1.7.2)3 * RTX3090 use " -t 512 -b 328 -p 930 -w 30 -htsz 28 " I get error message "error cuCtxSynchronize-700" GPU Memory used 20919 MB~~~~~ -snip- replaced pointer with 64bit and released v1.7.3 it should fixed issue with cuCtxSynchronize-700 https://github.com/Etayson/BSGS-cuda/releases/tag/v1.7.3 (https://github.com/Etayson/BSGS-cuda/releases/tag/v1.7.3) PS. 
I have an off-topic question: the Ethereum blockchain has some interesting transactions that are signed with an unusual R value in the signature, like this: 000000000000000000000000000000000000000000000000000000000000002D or 1820182018201820182018201820182018201820182018201820182018201820 or 8208208208208208208208208208208208208208208208208208208208208200 R = k*G, so how did he calculate k for this beautiful R? Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 10, 2021, 11:18:00 AM bsgscudaHT_1_7_3.exe works well on 3 * RTX3090 (uses 20 GB of GPU RAM)
29 seconds solved #70 ....... Speed: 0xB5EA17F0A8A2A19E / second :D :D :D Code: D:\BTC\cuda_BSGS>bsgscudaHT_1_7_3.exe -d 0,1,2 -pb 90e6900a58d33393bc1097b5aed31f2e4e7cbd3e5466af958665bc0121248483d7319f127105f492fd15e009b103b4a83295722f28f07c95f9a5443ef8e77ce0 -pk 0x0000000000000000000000000000000000000000000000200000000000000000 -t 512 -b 328 -p 930 -w 30 -htsz 28 Title: Re: BSGS solver for cuda Post by: Etar on November 10, 2021, 12:14:52 PM Quote bsgscudaHT_1_7_3.exe work well on 3 * RTX3090 (use 20GB GPU RAM) 29 seconds solved #70 ....... Speed: 0xB5EA17F0A8A2A19E / second -snip- Good that it is working for you. But you can try to play with -t, -b and -p, because I don't see a reason to use all 20 GB of GPU memory. Maybe try -t 512 -b 164 -p 512 (total 12128 MB) and compare with your current result. Title: Re: BSGS solver for cuda Post by: jovica888 on November 10, 2021, 12:49:52 PM So I need to know the public key of the address to search for the private key???
How can I find a Public key from the address - for example 1E4oDjEoBPXLS8vSYZ5dgjQEf4PZ4FLRhY Title: Re: BSGS solver for cuda Post by: bigvito19 on November 10, 2021, 12:57:58 PM So I need to know Public Key from the address to search the private key??? How can I find a Public key from the address - for example 1E4oDjEoBPXLS8vSYZ5dgjQEf4PZ4FLRhY That address doesn't have an outgoing transaction, the public key is not exposed for that address. Title: Re: BSGS solver for cuda Post by: jovica888 on November 10, 2021, 01:53:20 PM I did my research... This is a very good tool <3
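To see why the public key cannot be read from an address alone, it helps to decode one. A legacy address is a Base58Check encoding of a version byte plus a 20-byte hash of the public key, so decoding gives back only that hash, and BSGS needs the actual public key point. The sketch below uses only the standard library. The function name decode_address is made up for this example, and the sample is the well-known genesis-block address rather than the one asked about.

```python
import hashlib

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def decode_address(addr):
    """Decode a legacy Base58Check address into (version, 20-byte hash)."""
    n = 0
    for ch in addr:
        n = n * 58 + B58.index(ch)     # raises ValueError on a non-Base58 character
    raw = n.to_bytes(25, "big")        # 1 version byte + 20 hash bytes + 4 checksum bytes
    payload, checksum = raw[:-4], raw[-4:]
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    if digest[:4] != checksum:
        raise ValueError("bad checksum")
    return payload[0], payload[1:]

# Sample input: the genesis-block address (used here only as a known-valid string).
version, h160 = decode_address("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
print("version:", version)
print("hash of public key:", h160.hex())
```

The same call works for 1E4oDjEoBPXLS8vSYZ5dgjQEf4PZ4FLRhY, but the result is still just a hash. The public key itself only appears on chain once the address has spent coins, which is what the reply above refers to.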
Title: Re: BSGS solver for cuda Post by: _Counselor on November 10, 2021, 04:20:47 PM Quote PS. have an off topic question: Ethereum blockchain have some interesting transaction that signed with unusual R signature like this: 000000000000000000000000000000000000000000000000000000000000002D or 1820182018201820182018201820182018201820182018201820182018201820 or 8208208208208208208208208208208208208208208208208208208208208200 R=K*G.. how he calculated k for this beautiful R ? Those are smart contract transactions, read here: https://eips.ethereum.org/EIPS/eip-1820 Title: Re: BSGS solver for cuda Post by: jovica888 on November 10, 2021, 05:12:23 PM I did my research again. With cuBitCrack I search around 200Mkeys/s with 2x Nvidia 1060
I have a text file with 23milion addresses The range of my search will be 0 to ffffffffffffffffffffffffffffffffffffffff - 2^160 With this software, first I manually found around 20 (just 20 not 20 million) public keys then started to scan and I got around 700Mkeys/s which is good... But I realized that my searching range is now 0 to 2^256 and also I am searching only 20 keys... So why is this software better than cuBitCrack? Title: Re: BSGS solver for cuda Post by: Etar on November 10, 2021, 05:21:28 PM -snip- thanks for the link, I didn’t know that can be ignored the usual signing process.That is smart contract transactions, read here: https://eips.ethereum.org/EIPS/eip-1820 Title: Re: BSGS solver for cuda Post by: Etar on November 11, 2021, 12:21:09 PM bsgscudaHT_1_7_3.exe work well on 3 * RTX3090 (use 20GB GPU RAM) @jacky19790729 can you test prerelease, just for testing -w ?29 seconds solved #70 ....... Speed: 0xB5EA17F0A8A2A19E / second -snip- https://github.com/Etayson/BSGS-cuda/releases/tag/v.1.8.0-alpha (https://github.com/Etayson/BSGS-cuda/releases/tag/v.1.8.0-alpha) use -t 512 -b 164 -p 512 -w 31 -htsz 29 if this works for you, try a puzzle#70 for example and let me know about result. if all will be ok, try -t 256 -b 164 -p 512 -w 32 -htsz 28 with puzzle#70 for example and let me know about result. (if there are warning messages when generating an arrays, just ignore them) Thanks! Title: Re: BSGS solver for cuda Post by: jovica888 on November 11, 2021, 02:25:47 PM Can you put the option to search multiple pubic keys... To search for example 100 keys at once? Not 1by1
Title: Re: BSGS solver for cuda Post by: math09183 on November 11, 2021, 02:37:12 PM Quote Can you put the option to search multiple public keys... To search for example 100 keys at once? Not 1by1 You have no idea what you are talking about, what you are doing and what the algorithm is. Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 11, 2021, 04:07:05 PM bsgscudaHT_1_8_0.exe "-t 512 -b 164 -p 512 -w 31 -htsz 29" #75 ( 122 seconds )
bsgscudaHT_1_8_0.exe "-t 512 -b 164 -p 512 -w 31 -htsz 29" #70 ( 13 seconds ) bsgscudaHT_1_8_0.exe "-t 256 -b 164 -p 512 -w 31 -htsz 28" #70 ( 14 seconds ) 3 * RTX 3090 - 6700~6800 MKeys/s #75 result (-t 512 -b 164 -p 512 -w 31 -htsz 29) Code: D:\BTC\cuda_BSGS>bsgscudaHT_1_8_0.exe -d 0,1,2 -pb 04726b574f193e374686d8e12bc6e4142adeb06770e0a2856f5e4ad89f660447559b15322e6707090a4db3f09c7e6632a26db57f03eb07b40979fc01c827e1b0a3 -pk 0x0000000000000000000000000000000000000000000004000000000000000000 -t 512 -b 164 -p 512 -w 31 -htsz 29 #70 result (-t 512 -b 164 -p 512 -w 31 -htsz 29) Code: D:\BTC\cuda_BSGS>bsgscudaHT_1_8_0.exe -d 0,1,2 -pb 90e6900a58d33393bc1097b5aed31f2e4e7cbd3e5466af958665bc0121248483d7319f127105f492fd15e009b103b4a83295722f28f07c95f9a5443ef8e77ce0 -pk 0x0000000000000000000000000000000000000000000000200000000000000000 -t 512 -b 164 -p 512 -w 31 -htsz 29 #70 result ( -t 256 -b 164 -p 512 -w 31 -htsz 28 ) Code:
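The -pk values in the commands above are the start of the key range for each puzzle: 2^69 for #70 and 2^74 for #75. The sketch below assumes the usual convention that puzzle #n has its key in [2^(n-1), 2^n - 1], which matches those commands. The helper names puzzle_range and as_pk are made up for this example and are not part of bsgscuda.

```python
def puzzle_range(n):
    """Assumed convention: the key for puzzle #n lies in [2^(n-1), 2^n - 1]."""
    start = 1 << (n - 1)
    end = (1 << n) - 1
    return start, end

def as_pk(value):
    """Format an integer as the 64-hex-digit string used with -pk / -pke."""
    return "0x" + format(value, "064x")

for n in (70, 75):
    start, end = puzzle_range(n)
    print("#%d -pk %s -pke %s" % (n, as_pk(start), as_pk(end)))
```

For #70 this gives a -pk ending in a 2 followed by 17 zeros, the same value as in the command lines quoted in this thread.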
Title: Re: BSGS solver for cuda Post by: Etar on November 11, 2021, 05:31:39 PM bsgscudaHT_1_8_0.exe "-t 512 -b 164 -p 512 -w 31 -htsz 29" #75 ( 122 seconds ) Good and thanks! If possible try the configuration -t 256 -b 164 -p 512 -w 32 -htsz 28bsgscudaHT_1_8_0.exe "-t 512 -b 164 -p 512 -w 31 -htsz 29" #70 ( 13 seconds ) bsgscudaHT_1_8_0.exe "-t 256 -b 164 -p 512 -w 31 -htsz 28" #70 ( 14 seconds ) 3 * RTX 3090 - 6700~6800 MKeys/s -snip- Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 11, 2021, 06:58:48 PM Quote Good and thanks! If possible try the configuration -t 256 -b 164 -p 512 -w 32 -htsz 28 "-t 256 -b 164 -p 512 -w 32 -htsz 28" I try it , then get this error message "-w should be less than 32" Code: D:\BTC\cuda_BSGS>bsgscudaHT_1_8_0.exe -d 0,1,2 -pb 90e6900a58d33393bc1097b5aed31f2e4e7cbd3e5466af958665bc0121248483d7319f127105f492fd15e009b103b4a83295722f28f07c95f9a5443ef8e77ce0 -pk 0x0000000000000000000000000000000000000000000000200000000000000000 -t 256 -b 164 -p 512 -w 32 -htsz 28 Title: Re: BSGS solver for cuda Post by: Etar on November 11, 2021, 07:08:16 PM -snip- try reload release, i was update max -w parameter"-t 256 -b 164 -p 512 -w 32 -htsz 28" I try it , then get this error message "-w should be less than 32" -snip- Title: Re: BSGS solver for cuda Post by: jovica888 on November 11, 2021, 07:20:53 PM Can you put the option to search multiple pubic keys... To search for example 100 keys at once? Not 1by1 You have no idea what you are talking about, what you are doing and what is the algorithm. I saw that program search only 1 public key - what did I ask wrong? Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on November 11, 2021, 07:41:54 PM Can you put the option to search multiple pubic keys... To search for example 100 keys at once? Not 1by1 You have no idea what you are talking about, what you are doing and what is the algorithm. I saw that program search only 1 public key - what did I ask wrong? 
Someone has asked Etar already. Although it could be done, it would reduce the overall speed. So if your speed is 100 MKey/s searching for 1 pubkey and then you searched for 50 pubkeys at once, your speed would now be roughly 100/50 = 2 MKey/s. It's best to break total range into smaller subranges and search multiple pubkeys that way; or at least that way requires no additional tweaks to the main BSGS cuda code. Title: Re: BSGS solver for cuda Post by: Etar on November 11, 2021, 07:49:42 PM I saw that program search only 1 public key - what did I ask wrong? Title: Re: BSGS solver for cuda Post by: jovica888 on November 11, 2021, 08:12:05 PM In cmd it says
-infile Set file with pubkey for searching in uncompressed/compressed format (search sequential) So it will take the 1st key and search until it finds it, and then it will search for the 2nd, 3rd, 4th... to the end of the list Title: Re: BSGS solver for cuda Post by: math09183 on November 12, 2021, 07:22:28 AM Quote Can you put the option to search multiple public keys... To search for example 100 keys at once? Not 1by1 Quote You have no idea what you are talking about, what you are doing and what is the algorithm. Quote I saw that program search only 1 public key - what did I ask wrong? @jovica888: 1) you did not read the topic, the question was already asked 2) it makes no sense in terms of performance. It is like watching Gordon Ramsay preparing lunch and asking "could you also do ironing and dancing at the same moment"? It is important to understand the moment when you switch from consecutive work, which could (sooner or later) guarantee success, to playing the lottery and wishing for luck. Title: Re: BSGS solver for cuda Post by: demoinvest1 on November 12, 2021, 09:28:43 AM I tried running bsgscudaHT_1_7_3.exe and bsgscudaHT_1_8_0.exe and they work fine, but for the code on GitHub, bsgscudaussualHTchangeble1_7_3.pb, I tried to run it with PureBasic v5.70 and it does not work with the SHA1Fingerprint function. How can I fix it? I am just trying to understand how the BSGS method works. Title: Re: BSGS solver for cuda Post by: Etar on November 12, 2021, 09:45:45 AM Quote I try run bsgscudaHT_1_7_3.exe and bsgscudaHT_1_8_0.exe is work fine but for code on github bsgscudaussualHTchangeble1_7_3.pb I try use purebasic v5.70 run it but not work with SHA1Fingerprint function How can I fix it? Just try understand method BSGS how it works? ASCII mode was removed in the new versions of PB. Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 12, 2021, 11:15:12 AM "-t 256 -b 164 -p 512 -w 32 -htsz 28": generating all BIN files needs 90 GB of RAM and 4 hours of running time
Part1: -t 256 -b 164 -p 512 -w 32 -htsz 28 (search 6 keys publickey.txt) 3 keys lost Part2: -t 256 -b 164 -p 512 -w 31 -htsz 28 (search 6 keys publickey.txt ) 0 keys lost (Fix) Part1 log: 3 keys lost Code: D:\BTC\cuda_BSGS>bsgscudaHT_1_8_0.exe -d 0,1,2 -infile publickey.txt -pk 0x0000000000000000000000000000000000000000000000000000000000000001 -pke 0x0000000000000000000000000000000000000000000000800000000000000000 -t 256 -b 164 -p 512 -w 32 -htsz 28 Part2 log: 0 keys lost (Fix) Code:
Title: Re: BSGS solver for cuda Post by: Etar on November 12, 2021, 12:18:02 PM "-t 256 -b 164 -p 512 -w 32 -htsz 28" generate all BIN files need 90 GB RAM and 4 hours running time Many thanks, will investigate why lost happened.Part1: -t 256 -b 164 -p 512 -w 32 -htsz 28 (search 6 keys publickey.txt) 3 keys lost Part2: -t 256 -b 164 -p 512 -w 31 -htsz 28 (search 6 keys publickey.txt ) 1 keys lost -snip- Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 12, 2021, 12:24:22 PM "-t 256 -b 164 -p 512 -w 32 -htsz 28" generate all BIN files need 90 GB RAM and 4 hours running time Many thanks, will investigate why lost happened.Part1: -t 256 -b 164 -p 512 -w 32 -htsz 28 (search 6 keys publickey.txt) 3 keys lost Part2: -t 256 -b 164 -p 512 -w 31 -htsz 28 (search 6 keys publickey.txt ) 1 keys lost -snip- "-t 256 -b 164 -p 512 -w 31 -htsz 28" should be 0 keys lost sorry, I give error -pke end range for #75 public key Title: Re: BSGS solver for cuda Post by: Etar on November 12, 2021, 12:29:13 PM -snip- As i correct understand lost only in configuration with -w32 ?"-t 256 -b 164 -p 512 -w 31 -htsz 28" should be 0 keys lost sorry, I give error -pke end range for #75 public key With -w31 all keys found, correct? Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 12, 2021, 12:49:15 PM -snip- As i correct understand lost only in configuration with -w32 ?"-t 256 -b 164 -p 512 -w 31 -htsz 28" should be 0 keys lost sorry, I give error -pke end range for #75 public key With -w31 all keys found, correct? yes....-w31 find all keys...... -w32 lost some keys can you give me a file for testint ~~ public key list ..... 2^1 ~ 2^75 range I will try find all private key from your file .... Title: Re: BSGS solver for cuda Post by: Etar on November 12, 2021, 01:10:33 PM -snip- can you give me a file for testint ~~ public key list ..... 2^1 ~ 2^75 range I will try find all private key from your file .... 
20keys in range 2^75 -pk 0x1 -pke 0x7ffffffffffffffffff try with -w 31, because we already know that -w 32 have problem. Code: 0473f38e8621417cef51fe848ba26f00a5e78ccc00852ab1c2ca56505e80a9b37810ca73afaf222a1f072ec9a7f48929c029c762f5fca422ec3e6bf1cc4589b946 Title: Re: BSGS solver for cuda Post by: jacky19790729 on November 12, 2021, 03:37:39 PM 20keys in range 2^75 -pk 0x1 -pke 0x7ffffffffffffffffff try with -w 31, because we already know that -w 32 have problem. "-t 512 -b 164 -p 512 -w 31 -htsz 29" I try to find 1~10 keys , I think this is the fastest and no any lost key Code: KEY[1]: 0x00000000000000000000000000000000000000000000024b831f525b5544432d Title: Re: BSGS solver for cuda Post by: demoinvest1 on November 13, 2021, 01:27:35 AM you need purebasic v5.31 because need ascll mode enabled in new version of PB removed ascll mode. Thank you Now I change to use PureBasic v5.30, problem , How I can find cuda.lib ? POLINK: fatal error: File not found lib\cuda.lib Title: Re: BSGS solver for cuda Post by: demoinvest1 on November 13, 2021, 02:00:49 AM Thank you Now I change to use PureBasic v5.30, problem , How I can find cuda.lib ? POLINK: fatal error: File not found lib\cuda.lib Ok, I solve my problem already Now, I got cuda.lib from NVIDIA CUDA 10 driver C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\lib\x64 Title: Re: BSGS solver for cuda Post by: citb0in on November 15, 2022, 08:47:43 AM Was there any progress made with BSGS solver for CUDA meanwhile? I just stumbled over this old post and tried to use it, however I was not successful. I downloaded purebasic from the suggested link at the bottom of Etayson's Github repository (https://github.com/Etayson/BSGS-cuda), however the free version that is available for download on www.purebasic.com is a demo version which is limited to a few thousand lines of code and thus the loaded purebasic file will not get executed. 
OP said that we need PureBasic v5.31 but I cannot find this full version 5.31 on the webpage. Can anyone point me to a working download link for 5.31 for Linux x64, please?
Is BSGS solver useless meanwhile and there are some better tools that you would suggest? I am only aware of Keyhunts' BSGS mode which is executed in CPU threads. A CUDA version would be nice to test and hopefully get a higher rate. @Etar, are you even reading this anymore? Maybe under a different username? If so, please reply. I am trying to compile your program <bsgscudaussualHTchangeble1_7_3.pb> with PureBasic v5.31 under Linux. Unfortunately I do not succeed. At the first try I got this error message: Code: $ pbcompiler ./bsgscudaussualHTchangeble1_7_3.pb Quote ****************************************** PureBasic 5.31 (Linux - x64) ****************************************** Loading external modules... Starting compilation... Starting compilation... Error: Line 2 - File not found (~/BSGS-cuda/./Curve64.pb). This one was easy to fix, I just had to replace the backslash into a forward slash in line 2 of your program. I guess you were using Windows where folders are separated by the character '\' instead of '/' in Linux. Quote IncludeFile "lib/Curve64.pb" But then after another try I get the error indicating that no cuda.lib was found. I searched for this file but wasn't able to find, even not under my CUDA installation in /usr/local/cuda* there is absolutely no such file on a linux system. Where do we find this file? I was able to find a similar file and I thought I give a try Code: cp /usr/local/cuda-11.8/targets/x86_64-linux/lib/stubs/libcuda.so ~/BSGS-cuda/lib/ then I replaced line 42 by: Quote Import "lib/libcuda.so" but the compiler still fails, see here: Code: $ pbcompiler ./bsgscudaussualHTchangeble1_7_3.pb Quote ****************************************** PureBasic 5.31 (Linux - x64) ****************************************** Loading external modules... Starting compilation... Starting compilation... Including source: lib/Curve64.pb 10273 lines processed. Creating the executable. 
Error: Linker /usr/bin/ld: purebasic.o: warning: relocation in read-only section `.text' /usr/bin/ld: purebasic.o: relocation R_X86_64_PC32 against symbol `exit@@GLIBC_2.2.5' can not be used when making a PIE object; recompile with -fPIE /usr/bin/ld: final link failed: bad value collect2: error: ld returned 1 exit status If anyone else here is reading along and can help, I am of course also very grateful for helpful tips and recommendations. Any help appreciated. Thank you Title: Re: BSGS solver for cuda Post by: raschwarz on January 11, 2024, 03:36:43 PM Was there any progress made with BSGS solver for CUDA meanwhile? I just stumbled over this old post and tried to use it, however I was not successful. I downloaded purebasic from the suggested link at the bottom of Etayson's Github repository (https://github.com/Etayson/BSGS-cuda), however the free version that is available for download on www.purebasic.com is a demo version which is limited to a few thousand lines of code and thus the loaded purebasic file will not get executed. OP said that we need PureBasic v5.31 but I cannot find this full version 5.31 on the webpage. Can anyone point me to a working download link for 5.31 for Linux x64, please? Is BSGS solver useless meanwhile and there are some better tools that you would suggest? I am only aware of Keyhunts' BSGS mode which is executed in CPU threads. A CUDA version would be nice to test and hopefully get a higher rate. @Etar, are you even reading this anymore? Maybe under a different username? If so, please reply. I am trying to compile your program <bsgscudaussualHTchangeble1_7_3.pb> with PureBasic v5.31 under Linux. Unfortunately I do not succeed. At the first try I got this error message: Code: $ pbcompiler ./bsgscudaussualHTchangeble1_7_3.pb Quote ****************************************** PureBasic 5.31 (Linux - x64) ****************************************** Loading external modules... Starting compilation... Starting compilation... 
Error: Line 2 - File not found (~/BSGS-cuda/./Curve64.pb). This one was easy to fix, I just had to replace the backslash into a forward slash in line 2 of your program. I guess you were using Windows where folders are separated by the character '\' instead of '/' in Linux. Quote IncludeFile "lib/Curve64.pb" But then after another try I get the error indicating that no cuda.lib was found. I searched for this file but wasn't able to find, even not under my CUDA installation in /usr/local/cuda* there is absolutely no such file on a linux system. Where do we find this file? I was able to find a similar file and I thought I give a try Code: cp /usr/local/cuda-11.8/targets/x86_64-linux/lib/stubs/libcuda.so ~/BSGS-cuda/lib/ then I replaced line 42 by: Quote Import "lib/libcuda.so" but the compiler still fails, see here: Code: $ pbcompiler ./bsgscudaussualHTchangeble1_7_3.pb Quote ****************************************** PureBasic 5.31 (Linux - x64) ****************************************** Loading external modules... Starting compilation... Starting compilation... Including source: lib/Curve64.pb 10273 lines processed. Creating the executable. Error: Linker /usr/bin/ld: purebasic.o: warning: relocation in read-only section `.text' /usr/bin/ld: purebasic.o: relocation R_X86_64_PC32 against symbol `exit@@GLIBC_2.2.5' can not be used when making a PIE object; recompile with -fPIE /usr/bin/ld: final link failed: bad value collect2: error: ld returned 1 exit status If anyone else here is reading along and can help, I am of course also very grateful for helpful tips and recommendations. Any help appreciated. Thank you I tried to switch off PIE with code below at start and you can compile and build executable for Linux, but running that code ended with "Illegal instruction (core dumped)". Code: Import "-no-pie" Actually I am trying to debug it, however I am not very far with it. No success yet. 
Title: Re: BSGS solver for cuda Post by: citb0in on January 12, 2024, 07:47:04 AM Thanks for your feedback. Please let us know your findings and hopefully you will have success. I tried a lot but finally gave up without a satisfying result. However I am still interested in testing this tool.
Wish you best of luck citb0in Title: Re: BSGS solver for cuda Post by: greenAlien on March 12, 2024, 02:25:14 PM Congrats for your software @etar
It would be nice to have this program for Linux, or at least the binary that creates the HT, so we can rent a Linux server with a large amount of RAM to build the HT and download it to our computers. Title: Re: BSGS solver for cuda Post by: GTX1060x2 on March 16, 2024, 02:54:27 PM I successfully compiled it for Linux, but the program just closes without an error. Has anyone been able to solve this?
Code: # ./onlygen1_9_6File -t 256 -b 96 -p 506 -w 30 -htsz 29 Title: Re: BSGS solver for cuda Post by: citb0in on March 16, 2024, 03:45:22 PM if you provide some instructions on how to compile on Linux I might look into it.
Title: Re: BSGS solver for cuda Post by: GTX1060x2 on March 16, 2024, 05:43:28 PM Quote if you provide some instructions on how to compile on Linux I might look into it. Code: apt install build-essential gcc g++ libxxf86vm-dev libxine2-dev unixodbc-dev libsdl1.2-dev libsdl2-dev libssl-dev libgtk2.0-dev libgtk-3-dev libwebkit2gtk-4.0-dev libvlc-dev Replace lib\ with lib/ in the source and add this: Code: Import "-no-pie" Code: root@vm:~/purebasic/compilers# cat /etc/lsb-release | grep -i release Do not use Ubuntu 22. Title: Re: BSGS solver for cuda Post by: citb0in on March 17, 2024, 08:18:55 AM where to download pb ?
Title: Re: BSGS solver for cuda Post by: greenAlien on March 17, 2024, 10:01:19 AM Thanks GTX1060x2!
I will take a look too! Quote where to download pb ? From here: https://github.com/Etayson/BSGS-cuda/blob/main/onlygen1_9_6File.pb Title: Re: BSGS solver for cuda Post by: greenAlien on March 23, 2024, 01:07:34 PM Quote I successfully compiled it for Linux, but the program just closes without an error. Has anyone been able to solve this? Code: # ./onlygen1_9_6File -t 256 -b 96 -p 506 -w 30 -htsz 29 I have dedicated some time to compiling the BSGS, the onlyGen file and BSGS-fractions on Linux with your instructions. The compilation was successful for all of them; however, when running the files they display the initial text and after that they close without any output. What are we missing here? This is an example of the execution; don't take the arguments seriously, since it was just a test: Code: vboxuser@ubuntupurebasic:~/purebasic/compilers$ ./generateHT -t 256 -b 96 -p 506 -w 30 -pk 8000000000000000 -pke ffffffffffffffff -pb 03100611c54dfef604163b8358f7b7fac13ce478e02cb224ae16d45526b25d9d4d -htsz 28 I have also rented a Linux server and tried to run the binaries there, just in case they didn't work on my Linux because of GPU or RAM issues, but I had the same results... :( Title: Re: BSGS solver for cuda Post by: Cricktor on March 24, 2024, 01:47:23 PM Quote where to download pb ? From here https://github.com/Etayson/BSGS-cuda/blob/main/onlygen1_9_6File.pb I believe citb0in is asking where to download the PureBasic compiler itself, not Etar's PureBasic source code file(s). I haven't searched myself, but it shouldn't be rocket science to find the needed version of the PureBasic compiler for Linux with the help of internet search engines and/or Linux package search sites. Quote Maybe @etar can help us here ? 
You may be lucky if you would address OP with correct spelling of his username @Etar, but maybe you're still unlucky because Etar was last active in this forum around July 23rd, 2023. Title: Re: BSGS solver for cuda Post by: citb0in on March 24, 2024, 02:27:26 PM I believe citb0in is asking where to download the PureBasic compiler itself, not Etar's PureBasic source code file(s). I haven't searched myself, but it shouldn't be rocket science to find the needed version of the PureBasic compiler for Linux with the help of internet search engines and/or Linux package search sites. That was my question, absolutely. The source code is available but I wasn't able to found a working and usable PureBasic installation source. Title: Re: BSGS solver for cuda Post by: WanderingPhilospher on March 25, 2024, 02:07:34 AM I believe citb0in is asking where to download the PureBasic compiler itself, not Etar's PureBasic source code file(s). I haven't searched myself, but it shouldn't be rocket science to find the needed version of the PureBasic compiler for Linux with the help of internet search engines and/or Linux package search sites. That was my question, absolutely. The source code is available but I wasn't able to found a working and usable PureBasic installation source. Really, none of y’all could find it? https://www.purebasic.com/pricing.php (https://www.purebasic.com/pricing.php) Yes, you have to pay for it. Once you pay for it, you can download any new or legacy versions, windows and/or Linux. You will need the one that Etar mentions in his PB code. Title: Re: BSGS solver for cuda Post by: greenAlien on March 25, 2024, 09:04:31 AM I believe citb0in is asking where to download the PureBasic compiler itself, not Etar's PureBasic source code file(s). I haven't searched myself, but it shouldn't be rocket science to find the needed version of the PureBasic compiler for Linux with the help of internet search engines and/or Linux package search sites. 
That was my question, absolutely. The source code is available but I wasn't able to found a working and usable PureBasic installation source. Really, none of y’all could find it? https://www.purebasic.com/pricing.php (https://www.purebasic.com/pricing.php) Yes, you have to pay for it. Once you pay for it, you can download any new or legacy versions, windows and/or Linux. You will need the one that Etar mentions in his PB code. Exactly, you can buy it or...you can just search in the internet... Does anyone have any clue regarding the no output when running the binaries after linux compilation? The binaries just close themselves ??? Title: Re: BSGS solver for cuda Post by: anjilite7 on May 21, 2024, 09:59:43 AM so the speed is around 300MKeys, wheres my exakeys? ;D
Cnt:b8c6800000000001 [1][ 316 ] = 316 MKeys/s x2^27.0=2^55.31 Jt:00:05:00 Tt:00:05:03 Title: Re: BSGS solver for cuda Post by: CY4NiDE on May 21, 2024, 10:40:49 PM Quote so the speed is around 300MKeys, wheres my exakeys? ;D Cnt:b8c6800000000001 [1][ 316 ] = 316 MKeys/s x2^27.0=2^55.31 Jt:00:05:00 Tt:00:05:03 It says 316 Mk/s x 2^27.0 = 2^55.31, and 2^55.31 = 44665177000000000. So this is your speed: around 44 Pk/s (or 0.04 Ek/s). Title: Re: BSGS solver for cuda Post by: mahurovihamilo on May 23, 2024, 05:30:09 PM Hi there,
Is there a "version" of this for Ubuntu? Thanks.
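On the status line quoted a few posts up (316 MKeys/s x2^27.0=2^55.31), the conversion from the printed exponent to keys per second can be checked with a few lines. This is only arithmetic on the numbers shown in the thread. Whether "M" in the program's output means 10^6 or 2^20 is an assumption here, so both readings are computed, and the variable names are made up for this example.

```python
import math

printed_log2 = 55.31                  # exponent printed by the program
keys_per_s = 2 ** printed_log2        # effective keys per second
print("%.4e keys/s" % keys_per_s)
print("%.1f Pkeys/s, %.3f Ekeys/s" % (keys_per_s / 1e15, keys_per_s / 1e18))

# Rebuild the exponent from the other two numbers on the line.
mkeys, factor_log2 = 316, 27.0
log2_decimal = math.log2(mkeys * 10**6) + factor_log2   # if M means 10^6
log2_binary = math.log2(mkeys * 2**20) + factor_log2    # if M means 2^20
print("M = 10^6 -> 2^%.2f, M = 2^20 -> 2^%.2f" % (log2_decimal, log2_binary))
```

The 2^20 reading lands closer to the printed 2^55.31. The small remaining gap could simply be rounding of the displayed 316, but that is a guess.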