mcdouglasx
|
 |
February 16, 2025, 07:53:34 PM |
|
Currently I have about 12.8GKeys/s on 4090. 5090 is a shame, I skip it and wait for next generation. Perhaps I will make all my sources public when #135 is solved, though I'm not sure, people are not interested in what I do, also I see zero good discussions on this forum about EC, so better I will spend my time for more interesting things
Yes, there are surely many people intrigued by your code; it's just that not all of us have thousands of dollars to explore or buy a high-end PC. What's more unfortunate is that those who do have the means don't offer anything just theories backed by zero code, which is a vague and empty argument. I admit that I plan to include in your final version of Rckangaroo the different kangaroo methods to verify if SOTA is the main factor or if it is the optimization of CUDA code.
|
|
|
|
RetiredCoder (OP)
Full Member
 
Offline
Activity: 131
Merit: 120
No pain, no gain!
|
 |
February 16, 2025, 08:22:51 PM |
|
Yes, there are surely many people intrigued by your code; it's just that not all of us have thousands of dollars to explore or buy a high-end PC. What's more unfortunate is that those who do have the means don't offer anything just theories backed by zero code, which is a vague and empty argument. I admit that I plan to include in your final version of Rckangaroo the different kangaroo methods to verify if SOTA is the main factor or if it is the optimization of CUDA code.
Are you blind? All my ideas I published here are proved by sources so everyone can check and confirm them on CPU:
https://github.com/RetiredC/Kang-1
https://github.com/RetiredC/Kang-2
RCKangaroo is just a proof that these ideas can be implemented efficiently on GPUs as well.
|
|
|
|
mcdouglasx
|
 |
February 16, 2025, 08:46:07 PM |
|
Yes, there are surely many people intrigued by your code; it's just that not all of us have thousands of dollars to explore or buy a high-end PC. What's more unfortunate is that those who do have the means don't offer anything just theories backed by zero code, which is a vague and empty argument. I admit that I plan to include in your final version of Rckangaroo the different kangaroo methods to verify if SOTA is the main factor or if it is the optimization of CUDA code.
Are you blind? All my ideas I published here are proved by sources so everyone can check and confirm them on CPU: https://github.com/RetiredC/Kang-1 https://github.com/RetiredC/Kang-2 RCKangaroo is just a proof that these ideas can be implemented efficiently on GPUs as well.
Yes, I saw them but they are partially implemented. Your approach was always the SOTA (state-of-the-art) method, so it's not an impartial environment, which is what I'm referring to. The final Rckangaroo should be able to compare all methods equitably with all its advantages; that would be a fair and rigorous test.
|
|
|
|
RetiredCoder (OP)
Full Member
 
Offline
Activity: 131
Merit: 120
No pain, no gain!
|
 |
February 16, 2025, 08:50:51 PM |
|
Yes, I saw them but they are partially implemented. Your approach was always the SOTA (state-of-the-art) method, so it's not an impartial environment, which is what I'm referring to.
Part #1 demonstrates full implementation of FIVE methods. So you can compare SOTA with classic methods. Easily. I have no idea what else you need, if you want to implement all these methods in RCKangaroo for some reason - no problem, it's up to you how to spend your time  But your statement that I have only theories with zero code... awesome 
|
|
|
|
mcdouglasx
|
 |
February 16, 2025, 09:12:24 PM |
|
Yes, I saw them but they are partially implemented. Your approach was always the SOTA (state-of-the-art) method, so it's not an impartial environment, which is what I'm referring to.
Part #1 demonstrates full implementation of FIVE methods. So you can compare SOTA with classic methods. Easily. I have no idea what else you need, if you want to implement all these methods in RCKangaroo for some reason - no problem, it's up to you how to spend your time  But your statement that I have only theories with zero code... awesome
I wasn't referring to you when I mentioned 'theories with zero code'; I'm talking about those who give opinions without having contributed any code, commit, or fork, saying 'this would be more efficient.' And yes, I just want to test your final version with all the methods and see it for myself. Rest assured, if the SOTA (state-of-the-art) method is better, I will say it. The truth cannot be hidden.
|
|
|
|
kTimesG
|
 |
February 17, 2025, 09:28:06 AM |
|
I wasn't referring to you when I mentioned 'theories with zero code'; I'm talking about those who give opinions without having contributed any code, commit, or fork, saying 'this would be more efficient.'
Geez... so you have software that is marked all over the place as "Proof of Concept". Then the author himself admits that the speed of his private optimized version is at least 50% faster than the public PoC. And somehow the people that have "theories with zero code" who state that the speed can be much faster and the program can be more efficient, are called bullshitters. OK. So this would mean that the author himself is a bullshitter, or not? Because he doesn't give out his optimized version? If he himself does not do that, who in their right mind would do that instead? It's like giving out a better Tesla for free to everyone, just because. Geez...
|
Off the grid, training pigeons to broadcast signed messages.
|
|
|
RetiredCoder (OP)
Full Member
 
Offline
Activity: 131
Merit: 120
No pain, no gain!
|
 |
February 17, 2025, 10:07:34 AM Last edit: May 09, 2025, 05:00:41 PM by mprep |
|
I want to clarify again: it's not my goal to publish the fastest version of ECDLP solver I have, it's not RCKangaroo. My goal is to share the best kangaroo method for solving ECDLP with proofs on CPU so you can learn/use it. And I have done it. RCKangaroo is a "Proof of Concept" of SOTA method for GPU (however, it's fastest in public) and I give it for free with sources so you can use and improve it. If someone thinks that I must do even more and publish some ultimate software for cracking #135 - it's funny 
The complexity doubles with every new range. So count how many 4090s one needs to solve 135bits or 250-256bits ranges? Kangaroo-wise solution will not do that. As of now there is no solution to do that.
#135 takes about 5.6 times more calculations than #130, so I think #135 is the last high puzzle that will be solved in this decade.
I will make public a solver that does at least 10.5 Gk/s on RTX 4090 by the end of this year. I believe I can make it reach 11 Gk/s by then. Combined with symmetry and 3-kang method, it will be at least as fast as RC's solver, per total, if not faster.
About zero code, maybe people mean this post of yours where you promise to show something? It's ok that you changed your mind
Ok, let's keep all our achievements secret, it will be a great progress for science  Take care, I will come back when something interesting happens, for example, if someone publishes a method with K lower than mine. [moderator's note: consecutive posts merged]
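The factor of about 5.6 between #135 and #130 follows from the square-root cost of kangaroo methods. A minimal sketch, assuming the expected number of group operations is roughly K * sqrt(W) for an interval of width W, and that puzzle #n hides its key in [2^(n-1), 2^n). The value of K and the helper names below are illustrative assumptions, not figures taken from this thread; the 12.8 GKeys/s speed is the one quoted earlier for a 4090.

```python
import math

def expected_ops(puzzle_bits, k):
    """Expected group operations, assuming cost ~ K * sqrt(interval width).

    Assumption: puzzle #n has its key in [2^(n-1), 2^n), so width = 2^(n-1).
    """
    return k * math.sqrt(2.0 ** (puzzle_bits - 1))

def gpu_years(puzzle_bits, k, keys_per_sec):
    """Rough single-GPU time in years; ignores DP overhead and storage."""
    seconds = expected_ops(puzzle_bits, k) / keys_per_sec
    return seconds / (365.25 * 24 * 3600)

K = 1.15  # hypothetical method constant, used only for illustration

# The ratio does not depend on K: sqrt(2^134 / 2^129) = 2^2.5
ratio = expected_ops(135, K) / expected_ops(130, K)
print(f"#135 / #130 cost ratio: {ratio:.2f}")  # 5.66

# Single-GPU estimate at the 12.8 GKeys/s quoted in the thread
print(f"GPU-years for #135: {gpu_years(135, K, 12.8e9):.0f}")
```

Each extra bit of range multiplies the kangaroo cost by sqrt(2), not by 2, so five extra bits cost 2^2.5 times more work.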
|
|
|
|
kTimesG
|
 |
February 17, 2025, 01:26:04 PM |
|
I will make public a solver that does at least 10.5 Gk/s on RTX 4090 by the end of this year. I believe I can make it reach 11 Gk/s by then. Combined with symmetry and 3-kang method, it will be at least as fast as RC's solver, per total, if not faster.
About zero code, maybe people mean this post of yours where you promise to show something? It's ok that you changed your mind
Deadline missed, health is more important. Also, I'll never actually share the code that allows for the speed I mentioned (it is the real speed). It is a CUDA binary file (precompiled in advance) loaded dynamically, optimized for the specific GPU it runs on. This is a safe way to share a CUDA kernel without compromising personal IP, and it has zero security issues (a CUDA binary can't do shit except run assembler instructions on the GPU). But since the kangaroos are computed correctly and verified at the end of the jump loop, and there is a steady rate of DPs, it proves 100% that the speed is correct: the kangaroos landed where they were supposed to, which can't be computed in advance, because there is no magic shortcut to compute the final landing spot unless each kangaroo does the entire number of jumps. But I feel you when people are asking for full solutions, they would then want the software that manages hundreds / thousands of cloud GPU instances, and so on. Lazy people will never be happy.
|
|
|
|
alexxino
Newbie
Offline
Activity: 20
Merit: 0
|
 |
February 18, 2025, 08:53:07 AM |
|
Thanks for this quick and optimized Kangaroo program, it is the fastest.
Is it possible to have the "save work" option and "load work" from file like in JLP's Kangaroo?
Thanks
|
|
|
|
mcdouglasx
|
 |
February 20, 2025, 11:04:25 PM |
|
A friend left me his PC to reinstall Windows and other programs, and I took the opportunity to run some tests. My impression was that RcKangaroo, in terms of SOTA, is the best version of the various Kangaroo methods published to date.
I recommend RetiredCoder to write a formal paper on this method if he hasn't done so yet.
|
|
|
|
Bram24732
Member

Offline
Activity: 154
Merit: 15
|
 |
February 24, 2025, 08:22:49 AM |
|
Hey RetiredCoder, I'm the guy who broke 67. Trying to DM you but I can't as a newbie. Can you please DM me ? Thanks  Signature from bc1qfk357t8n045f8mwx672rx2re4pftm5gmjzdwq7 : ICT+NVyqwPrXEel/+jHHAMttjPlU8a/P89SCu50oH1sHERdl6L3qtHK5A1RxMUwBvUCQx/xZChNH8xzeH/QkrUc=
|
|
|
|
atom13
Newbie
Offline
Activity: 12
Merit: 1
|
 |
February 24, 2025, 10:55:41 AM |
|
I have reviewed your code, and I must say I am truly impressed. Your approach is fundamentally different from anything I have seen before. I am a developer myself, but your code seems more like the work of a mathematician, a physicist – or simply an extraordinary talent, a genius, or a scientist. What impresses me the most is its efficiency: Despite my own optimization attempts, I have never been able to achieve 12.8 GKeys/s on an RTX 4090. And what astonishes me even more – I could not find any references to your method in existing literature or research. May I ask how you came up with this remarkable approach?
Currently I have about 12.8GKeys/s on 4090. 5090 is a shame, I skip it and wait for next generation. Perhaps I will make all my sources public when #135 is solved, though I'm not sure, people are not interested in what I do, also I see zero good discussions on this forum about EC, so better I will spend my time for more interesting things
|
|
|
|
Veliquant
Newbie
Offline
Activity: 10
Merit: 0
|
 |
February 26, 2025, 05:31:22 PM |
|
Good Morning RetiredCoder:
I have been studying the puzzles for a year now. I was able to communicate with professor Teske and professor Galbraith, they both worked with professor Pollard, and pointed me in the right direction. I have some questions and some original ideas about the Pollard methods. I would like to ask if you can give me some advice.
I believe the Pollard methods can be improved using these observations I have made:
The key to accelerating the Pollard method computations is to improve the efficiency of the inversion part.
Teske in her paper says it is ok to use 32 types of jumps to give enough randomness. What about using more jump types, let's say 2**32 types of jumps? Then you select every jump by the first 32 bits of the X coordinate. You store the X and Y in a database whose index is the first 32 bits of X. Now the jump formula considers the 32 most significant bits of the actual point, and adds the corresponding point in the database, with the same 32 bits.
This has the advantage that X2-X1 for the inversion, can be selected to give a number of only 224 Bits instead of 256. When you calculate the batch inversion, let's say you make 400 inversions, you can multiply a 256 bit number times a 224 bit number for the partial product part.
I also have made an algorithm for batch inversion using instead a pairwise multiplication, where I think you could improve the efficiency a little bit more, because you will begin making 224 bit*224 bit multiplications.
Does this make any sense? Will the big database search at every jump make the program slower vs the speedup of the multiplication of smaller numbers?
Thanks for your time.
|
|
|
|
RetiredCoder (OP)
Full Member
 
Offline
Activity: 131
Merit: 120
No pain, no gain!
|
 |
February 26, 2025, 06:04:56 PM Last edit: February 26, 2025, 06:29:29 PM by RetiredCoder |
|
May I ask how you came up with this remarkable approach?
Thanks. It was an interesting task. The SOTA method itself is based on the classic 3-way and mirror methods with some adjustments to the ranges for wilds and tames. As for loops, I had to examine them carefully to find a good and fast way to handle them.
Teske in her paper says it is ok to use 32 types of jumps to give enough randomness. What about using more jump types, let's say 2**32 types of jumps?
From my experience, 64 jumps are better than 32 by about 2%, so better use 64 when possible. For symmetry methods you need at least 256 jumps. Your idea is to have 2^32 jumps, right? It will take 240GB of RAM, so not suitable for GPUs. The end, sorry.
A friend left me his PC to reinstall Windows and other programs, and I took the opportunity to run some tests. My impression was that RcKangaroo, in terms of SOTA, is the best version of the various Kangaroo methods published to date. I recommend RetiredCoder to write a formal paper on this method if he hasn't done so yet.
Thanks. The last time I wrote such papers was for my PhD degree many years ago and I definitely don't want to start again, too boring and useless. I gave up science then, chose money. Now I do it only for fun.
I'm the guy who broke 67. Trying to DM you but I can't as a newbie. Can you please DM me ?
Congrats. I don't use PM at all, write your ideas here if you want.
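The post does not show how the 240GB figure for a 2^32-entry jump table was obtained. One way to arrive at a number of that size is sketched below, assuming each entry stores an affine point (X, Y) of 32 bytes per coordinate and that the 4 bytes of X used as the table index are not stored. The entry layout is an assumption made for illustration, and it leaves out the jump distance that each entry would also need.

```python
ENTRIES = 2 ** 32   # one entry per 32-bit prefix of X, as in the proposal
COORD_BYTES = 32    # a 256-bit field element
INDEX_BYTES = 4     # the 32 bits of X implied by the table index

def table_gib(entries, bytes_per_entry):
    """Table size in GiB for a flat array of fixed-size entries."""
    return entries * bytes_per_entry / 2 ** 30

# Assumed compact layout: 224 remaining bits of X plus the full Y
compact = table_gib(ENTRIES, (COORD_BYTES - INDEX_BYTES) + COORD_BYTES)
# Full layout: X and Y stored whole
full = table_gib(ENTRIES, 2 * COORD_BYTES)
# For comparison, a 64-entry table with full points
small = 64 * 2 * COORD_BYTES

print(compact, full, small)  # 240.0 256.0 4096
```

Under these assumptions a 64-jump table is only 4096 bytes, which fits in fast on-chip GPU memory, while the 2^32-entry table would need hundreds of gigabytes and a random memory lookup on every jump.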
|
|
|
|
Veliquant
Newbie
Offline
Activity: 10
Merit: 0
|
 |
February 26, 2025, 06:42:42 PM |
|
From my experience, 64 jumps are better than 32 by about 2%, so better use 64 when possible. For symmetry methods you need at least 256 jumps. Your idea is to have 2^32 jumps, right? It will take 240GB of RAM, so not suitable for GPUs. The end, sorry.
Thanks for your answer. I have another question: I have studied bitcrack and kangaroo (From Jean-Luc Pons) in detail. In the Pollard method with only 2 herds, Pollard recommends using a jump formula of powers of 2. When I tried to understand the Van-Oorschot method of parallelization, I realized that if you double the amount of Kangaroos, you must double the jump size mean. This results in huge jumps for a big range and a big number of kangaroos if you use powers of 2. Do you recommend better selecting the jump size using random K's in a given interval? Thanks
|
|
|
|
RetiredCoder (OP)
Full Member
 
Offline
Activity: 131
Merit: 120
No pain, no gain!
|
 |
February 26, 2025, 06:54:00 PM |
|
From my experience, 64 jumps are better than 32 by about 2%, so better use 64 when possible. For symmetry methods you need at least 256 jumps. Your idea is to have 2^32 jumps, right? It will take 240GB of RAM, so not suitable for GPUs. The end, sorry.
Thanks for your answer. I have another question: I have studied bitcrack and kangaroo (From Jean-Luc Pons) in detail. In the Pollard method with only 2 herds, Pollard recommends using a jump formula of powers of 2. When I tried to understand the Van-Oorschot method of parallelization, I realized that if you double the amount of Kangaroos, you must double the jump size mean. This results in huge jumps for a big range and a big number of kangaroos if you use powers of 2. Do you recommend better selecting the jump size using random K's in a given interval? Thanks
There are several ways to choose good jump sizes, I just use random sizes in some interval, check my sources for details. Also, in my approach the average size of jumps does not depend on the number of kangs used and it still works fine. There are no known ways to improve K by choosing jumps in some special way; all known good ways get the same result, so it's not important which way you use.
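As an illustration of "random sizes in some interval", here is a minimal sketch of a jump table shared by all kangaroos. The interval [2^(b-1), 2^b) with b equal to half the range bits, the function names, and the indexing by the low bits of X are assumptions chosen for the example; they are not taken from the RCKangaroo sources.

```python
import random

def make_jump_table(num_jumps, range_bits, seed=None):
    """Build jump distances drawn uniformly from [2^(b-1), 2^b).

    Assumption: b = range_bits // 2, so the mean jump is on the order of
    sqrt(range width) and does not depend on the number of kangaroos.
    """
    rng = random.Random(seed)
    half_bits = range_bits // 2
    low = 1 << (half_bits - 1)
    return [low + rng.randrange(low) for _ in range(num_jumps)]

def select_jump(x, table):
    """Pick the jump from the point's X coordinate only.

    Every kangaroo, tame or wild, uses the same table and the same rule,
    so two walks that land on the same point continue identically.
    """
    return table[x % len(table)]

# 64 jumps for a 134-bit interval (the width assumed for puzzle #135)
table = make_jump_table(num_jumps=64, range_bits=134, seed=1)
print(len(table), table[0].bit_length())
```

In a real solver each distance d in the table would be paired with the precomputed point d*G, so that one jump is a single point addition.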
|
|
|
|
Veliquant
Newbie
Offline
Activity: 10
Merit: 0
|
 |
February 26, 2025, 08:18:30 PM |
|
From my experience, 64 jumps are better than 32 by about 2%, so better use 64 when possible. For symmetry methods you need at least 256 jumps. Your idea is to have 2^32 jumps, right? It will take 240GB of RAM, so not suitable for GPUs. The end, sorry.
Thanks for your answer. I have another question: I have studied bitcrack and kangaroo (From Jean-Luc Pons) in detail. In the Pollard method with only 2 herds, Pollard recommends using a jump formula of powers of 2. When I tried to understand the Van-Oorschot method of parallelization, I realized that if you double the amount of Kangaroos, you must double the jump size mean. This results in huge jumps for a big range and a big number of kangaroos if you use powers of 2. Do you recommend better selecting the jump size using random K's in a given interval? Thanks
There are several ways to choose good jump sizes, I just use random sizes in some interval, check my sources for details. Also, in my approach the average size of jumps does not depend on the number of kangs used and it still works fine. There are no known ways to improve K by choosing jumps in some special way; all known good ways get the same result, so it's not important which way you use.
I understand that you use a base magnitude for the jumps and add a random portion to make the jumps, in a different way for each of the 3 herds, I will study this in detail. I'm not able to find an implementation for batch inversion on your code, do you use batch inversion to speed up the computation of the inverses? I understand that in the Classic Pollard method, you can use this approach, computing the next point for a group of kangaroo paths. This will make the cost for one inversion only 3 - 4 multiplications? Is that correct?
|
|
|
|
RetiredCoder (OP)
Full Member
 
Offline
Activity: 131
Merit: 120
No pain, no gain!
|
 |
February 26, 2025, 09:16:11 PM |
|
I have been studying the puzzles for a year now. I was able to communicate with professor Teske and professor Galbraith, they both worked with professor Pollard, and pointed me in the right direction. I have some questions and some original ideas about the pollard methods.
I understand that you use a base magnitude for the jumps and add a random portion to make the jumps, in a different way for each of the 3 herds, I will study this in detail.
It seems you don't know how it works even after a year. The jump table is the same for all herds, otherwise it won't work.
I'm not able to find an implementation for batch inversion on your code, do you use batch inversion to speed up the computation of the inverses? I understand that in the Classic Pollard method, you can use this approach, computing the next point for a group of kangaroo paths. This will make the cost for one inversion only 3 - 4 multiplications? Is that correct?
For CPU code I don't use batch inversion because I tried to make the code as simple as possible. That code demonstrates various methods for ECDLP and loop handling, not optimizations. For RCKangaroo I use batch inversion, check its CUDA sources.
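For readers unfamiliar with batch inversion, the usual technique is Montgomery's trick: invert the product of all values once, then peel off individual inverses with multiplications. The sketch below is a plain Python illustration over the secp256k1 field prime, not the CUDA code from RCKangaroo; the function name is hypothetical.

```python
# secp256k1 field prime
P = 2**256 - 2**32 - 977

def batch_inverse(values, p=P):
    """Invert every element of `values` mod p using one modular inversion.

    All values must be nonzero mod p. Cost: about 3 multiplications per
    element plus a single inversion shared by the whole batch.
    """
    n = len(values)
    # prefix[i] = values[0] * ... * values[i-1] mod p
    prefix = [1] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] * v % p

    # The only real inversion: inverse of the full product (Python 3.8+)
    inv_running = pow(prefix[n], -1, p)

    out = [0] * n
    for i in range(n - 1, -1, -1):
        # inv_running is the inverse of values[0..i]; strip the prefix
        out[i] = inv_running * prefix[i] % p
        # drop values[i] so inv_running becomes the inverse of values[0..i-1]
        inv_running = inv_running * values[i] % p
    return out

# In a kangaroo step the inputs would be the (x2 - x1) denominators
# of many independent point additions.
denominators = [3, 7, 123456789, P - 1]
inverses = batch_inverse(denominators)
print(all(d * inv % P == 1 for d, inv in zip(denominators, inverses)))  # True
```

This is where the figure of roughly 3 multiplications per inversion in the question comes from: the cost of the single inversion is spread across the batch, so it becomes negligible as the batch grows.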
|
|
|
|
roostam.aksenov
Newbie
Offline
Activity: 1
Merit: 0
|
 |
February 28, 2025, 04:17:51 PM |
|
Hello everyone, can you tell me how to run 135 puzzle on several machines? Something like a pool or hashtopolis server
|
|
|
|
Akito S. M. Hosana
Jr. Member
Offline
Activity: 392
Merit: 8
|
 |
March 04, 2025, 09:26:32 AM |
|
I'm not able to find an implementation for batch inversion on your code, do you use batch inversion to speed up the computation of the inverses?
The only place I've been able to find anyone publicly using Inverse Wild Herd is in this Python script here: https://github.com/mikorist/Kangaroo-256-bit-python/blob/main/kangaroo.py Everyone else seems to be hiding how it works. 
|
|
|
|
|