Etar (OP)
|
|
October 20, 2021, 06:19:11 AM |
|
1 > 80 key, not 80 keys [single key] [random mode with 4.7 Ekeys/sec] [4300000000000000000 keys/sec] [3BACAB37B62E0000 keys/sec] [whole 65 range in 1 sec]. Now compare with bsgscuda, with the reference key in range 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e0000000000000000:49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5effffffffffffffff. Let me guess: a 3080 with full optimization took around 17 seconds, but keyhunt took just 1 second. I even had to reduce the k and n values to reduce the speed for this. 2 > Do your research and then find how many keys you will get while doing 120 to 2^40 divisor [lol]. If you load 2 keys, you will halve keyhunt's speed, so what about a billion keys? The speed will be just like your mind processing to understand my answer.
4300000000000000000 is about 2^61.9, so the whole 65 range (I think you mean puzzle #65, with a 2^64 range) needs about 4.3 seconds. I don't have a 3080 card, but I think the speed will be around 1400 Mkeys x BabyArraySize. Windows 10 eats 20% of GPU memory, so a 3080 should have 8192 MB of free memory, so we can use -w 30. In total: 1400 Mkeys = 2^30.38, baby array x2 = 2^31, full performance = 2^61.38, and checking the full 2^64 needs 6.14 s. Only Kangaroo can solve keys faster than bsgs or keyhunt or whatever. BSGS cuda was created only because I didn't find a bsgs for GPU (maybe it is a useless app, I don't know).
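The throughput estimate in this post is plain arithmetic, so it can be re-checked in a few lines of Python. This is only a sketch of the numbers quoted above: the helper names are made up for illustration, and "Mkeys" is assumed to mean 10^6 keys, which is what reproduces the 2^30.38 figure.

```python
import math

def log2_rate(keys_per_sec):
    """Express a keys/sec rate as an exponent of two."""
    return math.log2(keys_per_sec)

def seconds_to_scan(range_bits, keys_per_sec):
    """Time to sweep a full 2^range_bits interval at a constant rate."""
    return 2 ** range_bits / keys_per_sec

# keyhunt figure quoted in the post
keyhunt_rate = 4_300_000_000_000_000_000
print(f"keyhunt: 2^{log2_rate(keyhunt_rate):.2f} keys/s, "
      f"2^64 in {seconds_to_scan(64, keyhunt_rate):.2f} s")

# guessed 3080 figure: 1400 Mkeys/s of giant steps, each step covering 2^31 keys
gpu_rate = 1400e6 * 2 ** 31
print(f"3080 guess: 2^{log2_rate(gpu_rate):.2f} keys/s, "
      f"2^64 in {seconds_to_scan(64, gpu_rate):.2f} s")
```

Both figures are worst-case sweep times for the whole interval; a key found partway through the range takes proportionally less.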
|
|
|
|
ssxb
Jr. Member
Offline
Activity: 81
Merit: 2
|
|
October 20, 2021, 07:38:15 AM Last edit: October 20, 2021, 01:42:40 PM by ssxb |
|
1 > 80 key, not 80 keys [single key] [random mode with 4.7 Ekeys/sec] [4300000000000000000 keys/sec] [3BACAB37B62E0000 keys/sec] [whole 65 range in 1 sec]. Now compare with bsgscuda, with the reference key in range 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e0000000000000000:49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5effffffffffffffff. Let me guess: a 3080 with full optimization took around 17 seconds, but keyhunt took just 1 second. I even had to reduce the k and n values to reduce the speed for this. 2 > Do your research and then find how many keys you will get while doing 120 to 2^40 divisor [lol]. If you load 2 keys, you will halve keyhunt's speed, so what about a billion keys? The speed will be just like your mind processing to understand my answer.
4300000000000000000 is about 2^61.9, so the whole 65 range (I think you mean puzzle #65, with a 2^64 range) needs about 4.3 seconds. I don't have a 3080 card, but I think the speed will be around 1400 Mkeys x BabyArraySize. Windows 10 eats 20% of GPU memory, so a 3080 should have 8192 MB of free memory, so we can use -w 30. In total: 1400 Mkeys = 2^30.38, baby array x2 = 2^31, full performance = 2^61.38, and checking the full 2^64 needs 6.14 s. Only Kangaroo can solve keys faster than bsgs or keyhunt or whatever. BSGS cuda was created only because I didn't find a bsgs for GPU (maybe it is a useless app, I don't know).
I am not arguing with your math, but if you have the time and hardware, please try to do some research on keyhunt [CPU + memory]. By the way, I appreciate your CUDA programming skills; they are really impressive, and I hope that some day you will enhance the program further to take on 120. I also know one guy who is running it at 9+ Ekeys/sec [yoyodapro]. But with the divisor you can only get 1 key out of 1073741824 if you want to reach 90 bit. I loaded all the keys in keyhunt and I am trying my luck, but on the other side I was hoping we could figure out how to load multiple keys with cudabsgs. So I will keep my 3080 busy with that, as it is just sitting idle now.
|
|
|
|
ssxb
Jr. Member
Offline
Activity: 81
Merit: 2
|
|
October 20, 2021, 07:56:11 AM |
|
You've got a big mouth but little sense and knowledge.
I hate to tell you, but grow your knowledge and perhaps things will get clearer.
1 > 80 key, not 80 keys [single key] [random mode with 4.7 Ekeys/sec] [4300000000000000000 keys/sec] [3BACAB37B62E0000 keys/sec] [whole 65 range in 1 sec]. Now compare with bsgscuda, with the reference key in range 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e0000000000000000:49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5effffffffffffffff. Let me guess: a 3080 with full optimization took around 17 seconds, but keyhunt took just 1 second. I even had to reduce the k and n values to reduce the speed for this. 2 > Do your research and then find how many keys you will get while doing 120 to 2^40 divisor [lol]. If you load 2 keys, you will halve keyhunt's speed, so what about a billion keys? The speed will be just like your mind processing to understand my answer.
Your English reading or comprehension is what lacks sense and knowledge. I never said anything about 80 keys... you are saying you found 80 key; I took that as a single key in an 80 bit range, not 80 keys, because you did not pluralize the word key. So with that, I merely said: instead of trying to get someone to reprogram BSGS Cuda for multi key, run keyhunt, since it already supports multi key, and if you think it is faster, then break up the 120 key into however many keys you want, 2^5, 2^20, 2^40, or however many, and let that program eat. I said 2^40 specifically because you said an 80 key in a blink of an eye; so 2^120/2^40 = 2^80; if you found one 80 key in a blink of an eye, maybe you find the 120 key in 80 bit range in 2 blinks of an eye. BSGS Cuda can find a 65 bit key in less than a second; it all depends on your hardware. You say that if you load 2 keys, you will halve keyhunt's speed; the same will happen to BSGS Cuda, so I am not sure what your point is really.
maybe you find the 120 key in 80 bit range in 2 blinks of an eye.
OK, learn the basics of the divisor, bro. If you divide by 32, only one key will land in the range 5 bits down, at an unknown position; all the others will be in upper bit ranges, at exactly the same distance from their reference values. Now if you do 2^40, you will have 1208925819614629174706176 reference values in the 256 bit range, and only one of the keys will be in the range 40 bits down; all the other keys will be in upper bits, at an exact distance from their respective reference values. Now how the hell can you work with such a large number of keys? And your line about getting 2^40 was an aggressive comment made without knowing my intention. My intention is this: I already did a divisor of 32, loaded the keys in Keyhunt, and am running it right now, so I know how much speed and power I get from that, but I just don't know the power of BSGScuda if I load 32 keys in parallel into it. So I asked Etar whether he could make that possible; who knows, BSGS might outperform keyhunt. Now, to the point: in the post above, Etar said that his program is good up to 80 bit and that above that one should use JL kangaroo, so I was comparing it with Alberto's BSGS. I found that CPU-based BSGS is more powerful than a 3080 if you have well-specced hardware, but at the same time BSGScuda is better than keyhunt [CPU] if you don't have enough CPU power and memory.
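The divisor idea described above can be illustrated with a small Python sketch. It works on private scalars modulo the secp256k1 group order instead of on public points, purely to make the effect visible, and it assumes that "divisor" means: subtract each offset i from the key, then multiply by the modular inverse of the divisor. The function name and the 20-bit sample key are hypothetical.

```python
# secp256k1 group order
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def divisor_candidates(key, divisor):
    """Return (key - i) / divisor mod N for every offset i in [0, divisor)."""
    inv = pow(divisor, -1, N)  # modular inverse, Python 3.8+
    return [((key - i) * inv) % N for i in range(divisor)]

key = 0xABCDE                      # sample 20-bit key (illustration only)
candidates = divisor_candidates(key, 32)

# Only the offset i == key % 32 divides exactly and lands 5 bits lower.
small = [(i, c) for i, c in enumerate(candidates) if c < 2 ** 15]
print("candidates in the 15-bit range:", small)

# Every other candidate is scattered across the upper part of the 256-bit space.
others = [c for i, c in enumerate(candidates) if i != key % 32]
print("smallest other candidate has", min(others).bit_length(), "bits")
```

In the real workflow the same subtraction and division are applied to the public key, so a divisor of d produces d public keys to load, and only one of them sits in the reduced range.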
|
|
|
|
ssxb
Jr. Member
Offline
Activity: 81
Merit: 2
|
|
October 20, 2021, 08:05:42 AM |
|
@Etar I seriously believe there will be some way to use the power of GPU cores and process all of BSGS inside computer memory. Perhaps this will give some crazy power that has never been discovered, or there will be a bottleneck, but you can confirm that once you build such a program. Assume you have the power of keyhunt and then you build the bloom on an SSD [7000+ read/write speed, gen4].
RAM       bpfile elements   bpfile size   bloom size
8 GB      1000000000        32 GB         5.02 GB
32 GB     5000000000        160 GB        25.11 GB
128 GB    22000000000       704 GB        110.47 GB
500 GB    90000000000       2.9 TB        451.92 GB
Based on the above table, you can increase speed if you utilize both bloom + bp. https://github.com/iceland2k14/bsgs
So CPU cores are less powerful than CUDA, and I was thinking [not sure if it is possible or not]: if we load all of bp in RAM and use some bloom in GPU memory, perhaps there will be some dramatic speed boost.
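The sizes in the table above can be reproduced with the standard Bloom filter sizing formula. The sketch below rests on two inferences that the post does not state: each bpfile element takes 32 bytes (32 GB / 1000000000 elements), and the bloom column corresponds to a false-positive rate of 1e-9 with sizes in binary gigabytes. Both happen to match all four rows, but they remain assumptions.

```python
import math

BYTES_PER_BP_ELEMENT = 32    # inferred from the table: 32 GB / 1e9 elements
FALSE_POSITIVE_RATE = 1e-9   # assumption: this rate reproduces the bloom column

def bpfile_size_gb(elements):
    """bpfile size in decimal gigabytes."""
    return elements * BYTES_PER_BP_ELEMENT / 1e9

def bloom_size_gib(elements, p=FALSE_POSITIVE_RATE):
    """Optimal Bloom filter size, m = -n * ln(p) / (ln 2)^2 bits, in GiB."""
    bits = -elements * math.log(p) / (math.log(2) ** 2)
    return bits / 8 / 2 ** 30

for n in (1_000_000_000, 5_000_000_000, 22_000_000_000, 90_000_000_000):
    print(f"{n:>12} elements: bpfile {bpfile_size_gb(n):7.0f} GB, "
          f"bloom {bloom_size_gib(n):7.2f} GiB")
```

At this rate the filter costs roughly 43 bits per element, which is why the bloom fits in RAM while the full bpfile has to live on the SSD.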
|
|
|
|
bigvito19
|
|
October 20, 2021, 08:48:29 AM |
|
What's the link to the divisor script?
and how many keys can I generate with the divisor?
|
|
|
|
NotATether
Legendary
Offline
Activity: 1778
Merit: 7372
|
|
October 20, 2021, 09:55:44 AM |
|
What's the link to the divisor script?
and how many keys can I generate with the divisor?
If you mean the one I made, it's in the Kangaroo thread, anywhere from pages 90 to 100 I think.
I think we can cut the number of baby steps made if we take into account that the correct baby step amount is going to be random-looking (in other words, no long 0 or 1 sequences). Or at least make the baby steps take a higher bit count, decreasing the number of giant steps. I'm thinking that we can find the numbers represented by these random bits and then calculate their multiples to use as an incrementor... not perfect, but it does the trick I guess. E.g. 5 is 101, 10 is 1010, 15 is 1111, 20 is 10100, 25 is 11001, 30 is 11110, etc. Special care would need to be taken to choose a number whose multiples don't make long sequences of bits, like 15 (3*5). I don't think that this randomness has any correlation to the primality of numbers (or an inverse correlation to it).
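A quick way to eyeball the bit patterns listed above is to print each multiple next to its longest run of identical bits. This is only a rough sketch: using the longest run as a stand-in for "random-looking" is an assumption made for illustration, and the helper name is hypothetical.

```python
from itertools import groupby

def longest_bit_run(n):
    """Length of the longest run of identical bits in n's binary form."""
    return max(len(list(group)) for _, group in groupby(bin(n)[2:]))

# The multiples of 5 used as examples in the post
for multiple in range(5, 35, 5):
    print(f"{multiple:3d} {bin(multiple)[2:]:>6} longest run: {longest_bit_run(multiple)}")
```

A candidate incrementor could be screened this way by rejecting values whose multiples keep producing long runs.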
|
|
|
|
bigvito19
|
|
October 20, 2021, 01:28:20 PM |
|
I'm testing with the divisor keys on a smaller range, but it's not solving the key with keyhunt. Does it work the same with xpoint mode?
|
|
|
|
ssxb
Jr. Member
Offline
Activity: 81
Merit: 2
|
|
October 20, 2021, 01:46:08 PM |
|
I'm testing with the divisor keys on a smaller range, but it's not solving the key with keyhunt. Does it work the same with xpoint mode?
You need to adjust K and N: a smaller range will not be solved if the power of K and N exceeds the range size, or if the number of keys is more or less than your hardware can handle. Remember, tweaking is seriously needed: keep K and N in line with your hardware's power, and also adjust K and N according to the number of keys you load into the software. Run the test again and again and again.
|
|
|
|
Etar (OP)
|
|
October 20, 2021, 03:54:04 PM |
|
With the last PTX optimisation (I forgot about symmetry in batch point addition) it solves 16 pubkeys from JLP in 58s
...
GPU#0 Cnt:0000000000000000000000000000000000000000000000000000000000000001
GPU#0 Cnt:0000000000000000000000000000000000000000000000004673f00000000001 1121MKey/s x1073741824 2^30.13 x2^31=2^61.13
***********GPU#0************
KEY!!>49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e7ad38337c7f173c7
Pub: 55b95bef84a6045a505d015ef15e136e0a31cc2aa00fa4bca62e5df215ee981b3b4d6bce33718dc6cf59f28b550648d7e8b2796ac36f25ff0c01f8bc42a16fd9
****************************
Found in 4 seconds
GPU#0 job finished
Working time 00:00:58s
Total time 00:06:33s
GPU#0 thread finished
cuda finished ok
Press Enter to exit
Seems like this is the maximum that I can achieve on a single 2080ti. Of course JLP would probably have done it even faster.
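The exponents printed in the status line can be re-derived from the MKey/s figure. The sketch below assumes that the program counts one MKey as 2^20 keys; that is an inference from the numbers (1121 x 2^20 gives 2^30.13, while 1121 x 10^6 would give about 2^30.06), not something confirmed by the source. The function name is hypothetical.

```python
import math

MKEY = 2 ** 20  # assumption: binary "mega", chosen because it matches the log

def status_exponents(mkeys_per_sec, baby_factor_bits=31):
    """Return (log2 of giant steps per second, log2 of keys covered per second)."""
    step_bits = math.log2(mkeys_per_sec * MKEY)
    return step_bits, step_bits + baby_factor_bits

step_bits, total_bits = status_exponents(1121)
print(f"1121MKey/s -> 2^{step_bits:.2f} x2^31=2^{total_bits:.2f}")
```

The x2^31 factor is taken straight from the log line; each giant step is counted as covering that many keys.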
|
|
|
|
studyroom1
Jr. Member
Offline
Activity: 40
Merit: 7
|
|
October 21, 2021, 08:58:37 AM |
|
With the last PTX optimisation (I forgot about symmetry in batch point addition) it solves 16 pubkeys from JLP in 58s
...
GPU#0 Cnt:0000000000000000000000000000000000000000000000000000000000000001
GPU#0 Cnt:0000000000000000000000000000000000000000000000004673f00000000001 1121MKey/s x1073741824 2^30.13 x2^31=2^61.13
***********GPU#0************
KEY!!>49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e7ad38337c7f173c7
Pub: 55b95bef84a6045a505d015ef15e136e0a31cc2aa00fa4bca62e5df215ee981b3b4d6bce33718dc6cf59f28b550648d7e8b2796ac36f25ff0c01f8bc42a16fd9
****************************
Found in 4 seconds
GPU#0 job finished
Working time 00:00:58s
Total time 00:06:33s
GPU#0 thread finished
cuda finished ok
Press Enter to exit
Seems like this is the maximum that I can achieve on a single 2080ti. Of course JLP would probably have done it even faster.
Impressive, Etar. I have a question. Let's say you have 1m keys in a file, you load it in bsgscuda and set the scan range to only 64. If the GPU finishes scanning the whole 64 range for key1, will it abandon the search for key1 and move on to key2? Is your program doing that already, or will you implement this?
|
|
|
|
Etar (OP)
|
|
October 21, 2021, 10:11:16 AM |
|
Impressive, Etar. I have a question. Let's say you have 1m keys in a file, you load it in bsgscuda and set the scan range to only 64. If the GPU finishes scanning the whole 64 range for key1, will it abandon the search for key1 and move on to key2?
Is your program doing that already, or will you implement this?
Use -pk to set the start of the range and -pke to set the end of the range. If the pubkey is not found in this range, the search switches to the next pubkey.
|
|
|
|
lostrelic
Jr. Member
Offline
Activity: 32
Merit: 1
|
|
October 21, 2021, 10:22:37 AM |
|
Hi Etar, thanks for your continuing support for this program. Quick question: the fastest I get is 2^60; if I try to get 2^61, it sticks on "add baby points to hashtable". I've got a 3080, 16gb ram and a 500gb ssd. Any ideas on settings to try? Or how long should I wait for it to load? Thanks, Relic
|
|
|
|
Etar (OP)
|
|
October 21, 2021, 11:46:14 AM |
|
Hi Etar, thanks for your continuing support for this program. Quick question: the fastest I get is 2^60; if I try to get 2^61, it sticks on "add baby points to hashtable". I've got a 3080, 16gb ram and a 500gb ssd. Any ideas on settings to try? Or how long should I wait for it to load? Thanks, Relic
The screen I posted in the post above is from the latest version, which is not yet published (still in testing). By the way, v1.6.0 should work fine for you, but with a little less performance; with v1.6.0 the 2080ti speed is 826MKey/s x1073741824 2^29.69 x2^31=2^60.69. If you have 16gb of GPU ram, then try -w 31 and -htsz 29. In any case a 3080 should have better performance than a 2080ti even with the same baby array size that I use; try setting -t 512 -b 136 -p 512 -w 30 -htsz 28.
P.S. Maybe you get stuck on "add baby points to hashtable" because the PC has too little memory to generate the HT in RAM. I generate the HT for -w 30 on a PC that has 32GB of ram. For -w 31 you need 64gb of ram to create all the arrays. Launching the solver with already generated arrays needs less memory.
|
|
|
|
Etar (OP)
|
|
October 21, 2021, 12:47:06 PM |
|
STOP using BSGScuda: I found a bug where not all public keys are found. I can't say right now in which version this bug appeared, so don't use the program until I solve the issue.
|
|
|
|
studyroom1
Jr. Member
Offline
Activity: 40
Merit: 7
|
|
October 21, 2021, 01:58:10 PM |
|
STOP using BSGScuda: I found a bug where not all public keys are found. I can't say right now in which version this bug appeared, so don't use the program until I solve the issue.
Oh, when can we see the next update?
|
|
|
|
Etar (OP)
|
|
October 22, 2021, 12:55:03 PM |
|
The problem was the double giant step. Now I have removed the double giant step and in my opinion everything works as it should. I ran several tests with different small -w -p options with a 1024-pubkey file and all keys were found. True, the total indicator is now 2 times lower, because the step is normal. You can run all sorts of tests with keys and check. If there are any bugs, let me know. Release 1.7.0 is available on github.
|
|
|
|
_Counselor
Member
Offline
Activity: 110
Merit: 61
|
|
October 22, 2021, 02:09:15 PM |
|
The problem was the double giant step. Now I have removed the double giant step and in my opinion everything works as it should. I ran several tests with different small -w -p options with a 1024-pubkey file and all keys were found. True, the total indicator is now 2 times lower, because the step is normal. You can run all sorts of tests with keys and check. If there are any bugs, let me know. Release 1.7.0 is available on github.
What kind of problem was it? I think you exploited symmetry to double the size of the giant steps? Why did it not find some keys?
|
|
|
|
math09183
Member
Offline
Activity: 170
Merit: 58
|
|
October 23, 2021, 06:53:32 AM |
|
STOP using BSGScuda: I found a bug where not all public keys are found. I can't say right now in which version this bug appeared, so don't use the program until I solve the issue.
LOL. That's what happens when you use ad hoc written code without proper testing. I guess you still have not prepared any set of unit tests to prove your code works? Good luck with the future releases; maybe somewhere around version 20 it will be stable.
|
|
|
|
Etar (OP)
|
|
October 23, 2021, 07:01:46 AM |
|
LOL. That's what happens when you use ad hoc written code without proper testing. I guess you still have not prepared any set of unit tests to prove your code works? Good luck with the future releases; maybe somewhere around version 20 it will be stable.
Most code has bugs. Are you a great programmer who does everything without mistakes? I found this bug and solved it, what's your problem?
|
|
|
|
Etar (OP)
|
|
October 23, 2021, 07:02:51 AM |
|
What kind of problem was it? I think you exploited symmetry to double the size of the giant steps? Why did it not find some keys?
If we talk about the doubled GS (giant step): for example, the options -p 8 -w 4 mean a baby array of 2^4 = 16, so each (doubled) giant step is 16*2 = 32. Let's say we need to find a pubkey with privkey = 32. The program subtracts GS from the public key and looks in the baby array to check for an overlap. But if you subtract 32 from 32, the result is not present in the baby array. With the usual GS, 32 - 16 = 16, and 16 is present in the baby array, so the pubkey is solved. So with the doubled GS, every (baby array size)*2-th key is not found.
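The missed keys can be reproduced with a toy model that uses plain integers in place of curve points. This is a sketch under stated assumptions, not the program's actual code: the baby array is modelled as the values 1..16, and a hit on either +i or -i counts as a match, which is how symmetry would let one giant step cover twice the baby array. The function names are hypothetical.

```python
def bsgs_finds(key, baby_size, giant_step, max_key):
    """Integer model of the search: True if some giant step lands in the baby array."""
    baby = set(range(1, baby_size + 1))          # stands in for the baby array
    for j in range(max_key // giant_step + 2):   # enough steps to pass max_key
        # symmetry: a hit on +i or -i both count, hence abs()
        if abs(key - j * giant_step) in baby:
            return True
    return False

def missed_keys(baby_size, giant_step, max_key):
    """All keys in [1, max_key] that the search fails to find."""
    return [k for k in range(1, max_key + 1)
            if not bsgs_finds(k, baby_size, giant_step, max_key)]

print("doubled GS (32):", missed_keys(16, 32, 200))
print("usual GS (16):  ", missed_keys(16, 16, 200))
```

In this model the doubled step misses exactly the keys that are multiples of 32, because their distance to every giant point is 0 or at least 32, and neither value is in the baby array.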
|
|
|
|
|