Etar (OP)
|
|
April 22, 2020, 05:00:42 PM |
|
Anyway, first GPU release is coming
Thanks for your work. You are really very good at algorithms. I tried to test the GPU version, but for some reason the GPU and memory load is extremely low, and the speed drops immediately from 350 to 100 Mkeys per second.
|
|
|
|
MrFreeDragon
|
|
April 22, 2020, 05:08:07 PM |
|
-snip- I tried to test the GPU version. But for some reason, loading the GPU and memory is extremely small. And the speed drops immediately from 350 to 100 Mkeys per second -snip-
I am not sure, but the speed drops when the target key changes: the program clears the kangaroo points and sets up new troops of wild kangaroos. With your 2080 Ti you should get good performance. A 64-bit range is just very small for it, and the GPU spends more time on "preparation" than on actual searching. Try the code with a larger key (for example an 80-bit key). I think that with your 2080 Ti you could solve an 80-bit key in 20-30 minutes, and you will also see your actual speed during this time.
EDIT: Running an RTX 2080 Ti on a 64-bit range is like driving a sports car in a small village. If you want to test the actual speed of your "car" you need to go to the highway!
|
|
|
|
Jean_Luc
|
|
April 22, 2020, 05:27:24 PM |
|
Yes, with a large number of threads and a small range (64 bits) you face an overhead due to the DP method. You can try to set the dpSize by hand using the -d option.
C:\C++\Kangaroo\VC_CUDA10\x64\Release>Kangaroo.exe -t 0 -d 10 -gpu ..\..\in.txt
You will get a warning but ignore it.
As MrFreeDragon said, such a GPU will be much more efficient on 80bit search or more.
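The DP overhead described above can be sketched numerically. This is only a rough model: it assumes the extra work is about (number of kangaroos) x 2^dp and that the search itself needs about 2*sqrt(range width) operations. The kangaroo count 2^22.09 is the figure the RTX 2080 Ti log reports later in this thread; the function name is made up for the illustration.

```python
def dp_overhead_ratio(range_bits, kangaroos_log2, dp_bits):
    """Ratio of assumed DP overhead to assumed expected search work.

    Assumptions (not taken from the program itself):
      expected work ~ 2 * sqrt(2^range_bits) = 2^(range_bits/2 + 1)
      DP overhead   ~ kangaroos * 2^dp_bits
    """
    expected_log2 = range_bits / 2 + 1
    overhead_log2 = kangaroos_log2 + dp_bits
    return 2 ** (overhead_log2 - expected_log2)


if __name__ == "__main__":
    # 64-bit range with 2^22.09 kangaroos: the overhead is comparable to
    # the search itself and doubles with every extra DP bit.
    for dp in (10, 11, 12):
        print(f"64-bit, DP {dp}: {dp_overhead_ratio(64, 22.09, dp):.2f}")
    # 80-bit range with DP 15: the overhead is a small fraction of the work.
    print(f"80-bit, DP 15: {dp_overhead_ratio(80, 22.09, 15):.3f}")
```

Under these assumptions a ratio near or above 1 means the card spends as much time on DP bookkeeping as on the search, which fits the advice to lower the DP size on small ranges or to move to a wider range.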
|
|
|
|
Etar (OP)
|
|
April 22, 2020, 07:13:36 PM |
|
Yes with a large number of thread and a small range (64bits) you face an overhead due to the DP method. You can try to set by hand the dpSize using the -d option.
C:\C++\Kangaroo\VC_CUDA10\x64\Release>Kangaroo.exe -t 0 -d 10 -gpu ..\..\in.txt
You will get a warning but ignore it.
As MrFreeDragon said, such a GPU will be much more efficient on 80bit search or more.
My setup and speed:
Range 2^64, DP-10, speed 500Mkeys/s
Range 2^64, DP-11, speed 700Mkeys/s
Range 2^64, DP-12, speed 900Mkeys/s, time to find 16 keys is 15m30s (about the same as for weaker cards, although the speed is higher)
Range 2^80, DP by default, speed 1100Mkeys/s, but not a single key was found in 1 hour
|
|
|
|
MrFreeDragon
|
|
April 22, 2020, 07:42:59 PM |
|
-snip- Range 2^80, DP - by default, speed 1100Mkeys/s but not a single key was found in 1 hour
Good thing: now you know your exact speed, as you tested the card "on the highway". However, it is strange that you could not solve the 80-bit range in 1 hour. For an 80-bit key the Pollard kangaroo code should make 2^40 operations on average. So, at 1100 Mkey/s you need only 2^40 / 1100M seconds = 1000 seconds = 16-17 minutes. Check the range you use for the search. For an 80-bit key it is 0x80000000000000000000 - 0xffffffffffffffffffff (2^79 to 2^80-1), or a range among higher-bit keys whose length is 80 bits.
EDIT: It is also possible that your key is outside the range you specified.
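The estimate above is simple enough to script. A minimal sketch, assuming the expected work is `factor * sqrt(range width)` operations; `factor` is a hypothetical knob (1.0 reproduces the 2^40 figure used here, 2.0 gives 2^41):

```python
def expected_seconds(range_bits, keys_per_second, factor=1.0):
    """Expected solve time, assuming factor * sqrt(2^range_bits) operations."""
    operations = factor * 2 ** (range_bits / 2)
    return operations / keys_per_second


if __name__ == "__main__":
    # 80-bit range at 1100 Mkey/s: roughly 1000 s, i.e. 16-17 minutes.
    t = expected_seconds(80, 1100e6)
    print(f"{t:.0f} s = {t / 60:.1f} min")
    # Same range with a 2*sqrt(N) estimate: twice as long.
    t2 = expected_seconds(80, 1100e6, factor=2.0)
    print(f"{t2:.0f} s = {t2 / 60:.1f} min")
```

These are averages only; any single run can finish well before or well after the estimate.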
|
|
|
|
Jean_Luc
|
|
April 23, 2020, 04:07:24 AM |
|
Range 2^64, DP-12, speed 900Mkeys/s, Time to find 16 keys is 15m30s (the time is about the same as for weaker cards, although the speed is higher)
Yes, having too large a DP size and too many threads creates an overhead. You need more iterations.
Range 2^80, DP - by default, speed 1100Mkeys/s but not a single key was found in 1 hour
I will try to run a test with an 80-bit key today and improve the creation of kangaroos, which is slow.
|
|
|
|
Jean_Luc
|
|
April 23, 2020, 07:11:43 AM |
|
Range 2^80, DP - by default, speed 1100Mkeys/s but not a single key was found in 1 hour
I published a new release 1.2 with faster kangaroo creation. I'm trying to attack an 80-bit key, but on my hardware (115 MK/s) it will take time.
in.txt
25FEEE926526B0B4F0085358DF14702F7F6F04E8EC2200000000000000000000
25FEEE926526B0B4F0085358DF14702F7F6F04E8EC22FFFFFFFFFFFFFFFFFFFF
02E9CE716922FFB1CC2306E55D4E5A4F4A9B9D050E4ABB3EB95B246E7998A2508D
|
|
|
|
Jean_Luc
|
|
April 23, 2020, 11:09:53 AM |
|
My low-cost hardware has solved the 80-bit key in 03:52:25. I will now work on endomorphism and symmetry optimizations.
Kangaroo v1.2
Start:25FEEE926526B0B4F0085358DF14702F7F6F04E8EC2200000000000000000000
Stop :25FEEE926526B0B4F0085358DF14702F7F6F04E8EC22FFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^80
Number of random walk: 2^18.58 (Max DP=19)
DP size: 19 [0xFFFFE00000000000]
GPU: GPU #0 GeForce GTX 1050 Ti (6x128 cores) Grid(12x256) (45.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^18.58 kangaroos in 1320.9ms
[115.22 MKey/s][GPU 115.22 MKey/s][Count 2^40.35][Dead 1][03:52:22][209.5MB]
Key# 0 Pub: 0x02E9CE716922FFB1CC2306E55D4E5A4F4A9B9D050E4ABB3EB95B246E7998A2508D
Priv: 0x25FEEE926526B0B4F0085358DF14702F7F6F04E8EC2243F5E7A6482FFC1F8DC4
Done: Total time 03:52:25
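The "DP size" lines in these logs pair a bit count with a 64-bit mask (19 with 0xFFFFE00000000000, 15 with 0xFFFE000000000000, 16 with 0xffff000000000000). The sketch below reproduces those masks by setting the top `dp_bits` bits of a 64-bit word. The distinguished-point test itself (masked bits all zero) is an assumption made for illustration, not something the logs confirm, and both function names are made up.

```python
def dp_mask(dp_bits, word_bits=64):
    """Mask with the top dp_bits bits of a word_bits-wide word set."""
    return ((1 << dp_bits) - 1) << (word_bits - dp_bits)


def is_distinguished(top_word, dp_bits):
    """Assumed test: a point is distinguished when the masked bits are zero."""
    return (top_word & dp_mask(dp_bits)) == 0


if __name__ == "__main__":
    for dp in (15, 16, 19):
        print(f"DP size: {dp} [0x{dp_mask(dp):016X}]")
```

With this convention, roughly one point in 2^dp_bits is distinguished, so each extra DP bit halves the number of points sent to the hash table.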
|
|
|
|
Etar (OP)
|
|
April 23, 2020, 01:40:14 PM |
|
My low cost hardware has solved the 80bit key is 03:52:25.
My high-end GPU can't find any key in 1 h)) The config txt is the same as in the example, with the range just increased a little to 80 bits; all public keys are from the example.
Kangaroo v1.2
Start:49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB4800000000000000000000
Stop :49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48FFFFFFFFFFFFFFFFFFFF
Keys :16
Number of CPU thread: 0
Range width: 2^80
Number of random walk: 2^22.09 (Max DP=15)
DP size: 15 [0xFFFE000000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x256) (417.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^22.09 kangaroos in 16571.8ms
[1017.74 MKey/s][GPU 1017.74 MKey/s][Count 2^41.74][Dead 2][01:08:13][8578.7MB]
|
|
|
|
MrFreeDragon
|
|
April 23, 2020, 01:52:03 PM |
|
My low cost hardware has solved the 80bit key is 03:52:25.
With the first Kangaroo v1.1, my GTX 1080 Ti solved an 80-bit key in 35 minutes. This time seems reasonable: at 436 MKey/s an 80-bit key should be solved (2^40 operations) in 2^40 / 436M = 2521 sec = 42 minutes. With a 2080 Ti at 1100 MKey/s the total time should be 15-20 minutes.
$ ./kangaroo -gpu -t 0 in80.txt
Kangaroo v1.1
Start:80000000000000000000
Stop :FFFFFFFFFFFFFFFFFFFF
Keys :1
Number of CPU thread: 0
Range width: 2^79
Number of random walk: 2^20.81 (Max DP=16)
DP size: 16 [0xffff000000000000]
GPU: GPU #0 GeForce GTX 1080 Ti (28x128 cores) Grid(56x256) (177.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.81 kangaroos
[436.09 MKey/s][GPU 436.09 MKey/s][Count 2^39.55][Dead 0][35:22][942.4MB]
Key# 0 Pub: 0x037E1238F7B1CE757DF94FAA9A2EB261BF0AEB9F84DBF81212104E78931C2A19DC
Priv: 0xEA1A5C66DCC11B5AD180
Done: Total time 35:45
For the test I used the 80-bit key from the 100 bitcoin transaction challenge:
80000000000000000000
ffffffffffffffffffff
037e1238f7b1ce757df94faa9a2eb261bf0aeb9f84dbf81212104e78931c2a19dc
|
|
|
|
Jean_Luc
|
|
April 23, 2020, 02:23:58 PM |
|
My Hi-End GPU can`t find any key for 1h))
That should work. For an 80-bit search, 2^41 operations is the average, with a probability of ~50% to find the key. I will try tonight. Many thanks to MrFreeDragon for testing the software.
|
|
|
|
Etar (OP)
|
|
April 23, 2020, 02:46:51 PM |
|
2 h and nothing; the speed dropped to 267 Mkey/s and GPU usage dropped to 25%.
Kangaroo v1.2
Start:49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB4800000000000000000000
Stop :49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48FFFFFFFFFFFFFFFFFFFF
Keys :16
Number of CPU thread: 0
Range width: 2^80
Number of random walk: 2^22.09 (Max DP=15)
DP size: 15 [0xFFFE000000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x256) (417.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^22.09 kangaroos in 16571.8ms
[286.75 MKey/s][GPU 286.75 MKey/s][Count 2^42.41][Dead 7][02:15:53][13583.1MB]
Here is the txt file (the same as the example, only with the 80-bit range set):
49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb4800000000000000000000
49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48ffffffffffffffffffff
0459A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC994327554CED887AAE5D211A2407CDD025CFC3779ECB9C9D7F2F1A1DDF3E9FF8
04A50FBBB20757CC0E9C41C49DD9DF261646EE7936272F3F68C740C9DA50D42BCD3E48440249D6BC78BC928AA52B1921E9690EBA823CBC7F3AF54B3707E6A73F34
0404A49211C0FE07C9F7C94695996F8826E09545375A3CF9677F2D780A3EB70DE3BD05357CAF8340CB041B1D46C5BB6B88CD9859A083B0804EF63D498B29D31DD1
040B39E3F26AF294502A5BE708BB87AEDD9F895868011E60C1D2ABFCA202CD7A4D1D18283AF49556CF33E1EA71A16B2D0E31EE7179D88BE7F6AA0A7C5498E5D97F
04837A31977A73A630C436E680915934A58B8C76EB9B57A42C3C717689BE8C0493E46726DE04352832790FD1C99D9DDC2EE8A96E50CAD4DCC3AF1BFB82D51F2494
040ECDB6359D41D2FD37628C718DDA9BE30E65801A88A00C3C5BDF36E7EE6ADBBAD71A2A535FCB54D56913E7F37D8103BA33ED6441D019D0922AC363FCC792C29A
0422DD52FCFA3A4384F0AFF199D019E481D335923D8C00BADAD42FFFC80AF8FCF038F139D652842243FC841E7C5B3E477D901F88C5AB0B88EE13D80080E413F2ED
04DB4F1B249406B8BD662F78CBA46F5E90E20FE27FC69D0FBAA2F06E6E50E536695DF83B68FD0F396BB9BFCF6D4FE312F32A43CF3FA1FE0F81DF70C877593B64E0
043BD0330D7381917F8860F1949ACBCCFDC7863422EEE2B6DB7EDD551850196687528B6D2BC0AA7A5855D168B26C6BAF9DDCD04B585D42C7B9913F60421716D37A
04332A02CA42C481EAADB7ADB97DF89033B23EA291FDA809BEA3CE5C3B73B20C49C410D1AD42A9247EB8FF217935C9E28411A08B325FBF28CC2AF8182CE2B5CE38
04513981849DE1A1327DEF34B51F5011C5070603CA22E6D868263CB7C908525F0C19EBA6BD2A8DCF651E4342512EDEACB6EA22DA323A194E25C6A1614ABD259BC0
04D4E6FA664BD75A508C0FF0ED6F2C52DA2ADD7C3F954D9C346D24318DBD2ECFC6805511F46262E10A25F252FD525AF1CBCC46016B6CD0A7705037364309198DA1
0456B468963752924DBF56112633DC57F07C512E3671A16CD7375C58469164599D1E04011D3E9004466C814B144A9BCB7E47D5BACA1B90DA0C4752603781BF5873
04D5BE7C653773CEE06A238020E953CFCD0F22BE2D045C6E5B4388A3F11B4586CBB4B177DFFD111F6A15A453009B568E95798B0227B60D8BEAC98AF671F31B0E2B
04B1985389D8AB680DEDD67BBA7CA781D1A9E6E5974AAD2E70518125BAD5783EB5355F46E927A030DB14CF8D3940C1BED7FB80624B32B349AB5A05226AF15A2228
0455B95BEF84A6045A505D015EF15E136E0A31CC2AA00FA4BCA62E5DF215EE981B3B4D6BCE33718DC6CF59F28B550648D7E8B2796AC36F25FF0C01F8BC42A16FD9
|
|
|
|
Jean_Luc
|
|
April 23, 2020, 02:50:00 PM |
|
2h and nothing, speed drop to 267Mkey, GPU usage drop to 25%
That's strange, there may be a bug somewhere. The GPU usage drop may be because the hash table becomes too big. I'll try. Did you try with other keys?
|
|
|
|
MrFreeDragon
|
|
April 23, 2020, 02:55:31 PM |
|
My Hi-End GPU can`t find any key for 1h))
That should work. For 80bit search, 2^41 is an average time, with a probability of ~50% to find the key. I will try this night. Many thanks to MrFreeDragon for testing the software
Agree with this comment. The Pollard kangaroo method solves the problem with fewer operations (about the square root of the total width), but of course not with 100% probability. Etar, if you had this:
[1017.74 MKey/s][GPU 1017.74 MKey/s][Count 2^41.74][Dead 2][01:08:13][8578.7MB]
it means that you were just unlucky. However, it does not mean that you should restart the code. At this stage the code should solve the key at any minute. Your 2080 Ti made 2^41 operations in about 40 minutes (1h8min / 2^0.74), or 2^40 operations in about 20 minutes. So everything works fine. You were just unlucky with the probability of solving...
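The rescaling above (1h8min / 2^0.74) can be written out as a small helper. A minimal sketch, assuming the operation count grows linearly with elapsed time; the function name is made up for the example and the inputs are the values from the quoted log line.

```python
def seconds_to_reach(elapsed_seconds, count_log2, target_log2):
    """Time at which the run had performed 2^target_log2 operations,
    assuming a constant speed over the whole run."""
    return elapsed_seconds / 2 ** (count_log2 - target_log2)


if __name__ == "__main__":
    elapsed = 1 * 3600 + 8 * 60 + 13          # 01:08:13 from the log
    for target in (41, 40):
        t = seconds_to_reach(elapsed, 41.74, target)
        print(f"2^{target} operations after about {t / 60:.0f} min")
```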
|
|
|
|
Etar (OP)
|
|
April 23, 2020, 03:11:03 PM |
|
2h and nothing, speed drop to 267Mkey, GPU usage drop to 25%
That's strange, may be a bug somewhere, the gpu usage drops may be because the HashTable becomes too big. I'll try Did you try with others key ?
My GPU has only 11 GB of memory, so I don't know how the hash table can be 13 GB. No, I did not try other keys, only the ones from the example.
|
|
|
|
Etar (OP)
|
|
April 23, 2020, 03:12:59 PM Last edit: April 23, 2020, 03:56:52 PM by Etar |
|
Actually at this stage the code sould solve the key within the next every minute.
Your 2080ti actually made 2^41 operations for 40 minutes (1h8min / 2^0.74) or 2^40 operations for 20 minutes. So everything works good. You just was unlucky with the probability to solve...
I thought the same, but I waited 2 h. It is not bad luck; there must be another reason. I launched this config a few times and never got a result. I think the BSGS algorithm is much more predictable: you know how much memory you need for the calculation, you know the speed, and you know how much time you need to check the range. As far as I can see, the kangaroo algorithm does not use less memory, because the hash table grows over time. Or, if the hash table is allocated at the beginning, I don't know whether I can allocate a hash table for a 2^255 range. And if I always get dead kangaroos, how will I know that I have covered the range and there is no key in it? So if the range is 128 bits, you don't know whether you will find the key, run out of memory, or keep getting dead kangaroos.
|
|
|
|
Jean_Luc
|
|
April 23, 2020, 03:23:41 PM |
|
My gpu have only 11gb of memory I dont know how hashtable can be 13 gb in this reason. No, i did not try other keys. Only from example.
The hash table is centralized and managed by the CPU. You have 7 dead kangaroos. A dead kangaroo is a collision within the same herd, and the kangaroo is re-created. There is the same probability of getting a dead kangaroo as of solving the key. Having 7 dead kangaroos without solving the key is like tossing a coin 7 times and getting "heads" seven times in a row: bad luck! On this key, I got 20 dead kangaroos! ~2^35, ~4 times more iterations than the average 2^33...
[114.85 MKey/s][GPU 114.85 MKey/s][Count 2^34.98][Dead 20][05:41][1269.3MB]
Key#11 Pub: 0x03D4E6FA664BD75A508C0FF0ED6F2C52DA2ADD7C3F954D9C346D24318DBD2ECFC6
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EE3579364DE939B0C
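The coin-toss comparison can be put into numbers. A minimal sketch, taking the statement above at face value: each collision is assumed to be, independently and with probability 1/2, either a same-herd collision (a dead kangaroo) or a solving collision. The function name is hypothetical.

```python
def prob_only_dead(collisions, p_dead=0.5):
    """Probability that all of the first `collisions` collisions are
    same-herd ones, assuming independent events with probability p_dead."""
    return p_dead ** collisions


if __name__ == "__main__":
    for n in (2, 7, 20):
        p = prob_only_dead(n)
        print(f"{n} dead kangaroos in a row: {p:.3g} (about 1 in {round(1 / p)})")
```

Under this model 7 dead kangaroos in a row is about a 1 in 128 event, and 20 in a row about 1 in a million, so streaks this long on the same key set may point to something other than chance.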
|
|
|
|
MrFreeDragon
|
|
April 23, 2020, 04:44:53 PM |
|
I have tried the same 80-bit range as you:
49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb4800000000000000000000
49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48ffffffffffffffffffff
Kangaroo v1.1, range width 2^80, 16 keys from the example. In 2 hours 30 minutes at 433 Mkey/s the program performed 2^41.7 operations and killed 7 kangaroos, but did not solve the key.
I have an idea, but I am not sure whether it is correct or could have an impact. For the range with leading zeros everything worked fine (I mean when I tried the 80-bit key from the challenge example). But we are all killing kangaroos with the 16-key example... and not solving the keys. Probably the kangaroos are jumping outside the range and making the width much wider.
What if we just shift the range to one with leading zeros and adjust the public key to search accordingly? I mean subtracting 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb4800000000000000000000 from the range and from the public key as well. So the range will be 0x00000000000000000000 - 0xffffffffffffffffffff and the public key will be Q - kG, where Q is the target public key and k is the start of the range.
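The shift Q - kG proposed above can be checked with a few lines of plain secp256k1 arithmetic. This is a minimal, unoptimized sketch (affine coordinates, double-and-add, Python 3.8+ for `pow(x, -1, P)`), not code from the Kangaroo program; all function names are made up. The private key is the one reported for Key# 0 later in this thread and is used only to confirm that the shifted key lands in [0, 2^80).

```python
# secp256k1 domain parameters
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result


def neg(pt):
    return None if pt is None else (pt[0], (-pt[1]) % P)


def translate(q, range_start):
    """Shift the target key: Q' = Q - range_start * G."""
    return add(q, neg(mul(range_start, G)))


START = 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB4800000000000000000000
# Private key reported for Key# 0 in this thread (illustration only).
D = 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EBB3EF3883C1866D4

if __name__ == "__main__":
    q = mul(D, G)
    q_shifted = translate(q, START)
    # The shifted key has private key D - START, which lies in [0, 2^80).
    print(q_shifted == mul(D - START, G), hex(D - START))
```

Searching for `q_shifted` in 0x0 - 0xffffffffffffffffffff and adding START back to the result gives the original private key, so the two formulations describe the same search.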
|
|
|
|
Etar (OP)
|
|
April 23, 2020, 04:54:04 PM Last edit: April 23, 2020, 05:15:36 PM by Etar |
|
Probably the kangaroos are jumping outside the range and make the width much wider.
What if just adjust the range to the one with leading zeros and also adjust the public key to search. I mean "to substract 49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb4800000000000000000000 from the range and from the public key as well. So the range will be 0x00000000000000000000 - 0xffffffffffffffffffff and the public key will be Q - kG , where Q is the target public key and k is the start of the range.
Kangaroos can't jump outside because the step is modulo n. If I keep getting dead kangaroos due to collisions in the same tribe, I can't see the benefit of this algorithm. Anyway, thanks for researching. You are a very, very good programmer!
|
|
|
|
Jean_Luc
|
|
April 23, 2020, 04:58:50 PM |
|
Yes, I noticed that the first key of this set often generates dead kangaroos (even in a 64-bit range). The fact that the start of the range is not zero should have no impact; it is just a translation, equivalent to what MrFreeDragon said. All kangaroos are uniformly distributed in the range and make random jumps from G up to 2*sqrt(k2-k1).G. Even if kangaroos go out of the range, it is not really a problem. I will try to change the calculation of the random jumps to see if it improves something.
[18.00 MKey/s][Count 2^34.40][20:47][Dead 10][20.2MB]
Key# 0 Pub: 0x0259A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EBB3EF3883C1866D4
[115.23 MKey/s][GPU 115.23 MKey/s][Count 2^33.52][Dead 5][02:02][463.9MB]
Key# 0 Pub: 0x0259A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EBB3EF3883C1866D4
|
|
|
|
|