[PSCKangaroo] Fork of RCKangaroo — optimized for high-RAM + single GPU setups
Hi everyone,
I've been modifying RCKangaroo to get the most out of my specific hardware: an RTX 5070 paired with 128 GB of RAM. The original RCKangaroo is a brilliant piece of software (all credit to RetiredCoder for the SOTA algorithm), but I wanted to squeeze every bit of performance from my particular setup, where RAM is abundant but the GPU is a single mid-range card.
The main idea behind PSCKangaroo is a TRAP/HUNT strategy:
Phase 1 (TRAP): Fill the entire 128 GB of RAM with TAME distinguished points.
Phase 2 (HUNT): Switch the GPU to 100% WILD kangaroos that check against the massive TAME table. WILDs are never stored — they are checked and discarded.
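For the same amount of RAM, this stores roughly twice as many TAMEs as a balanced TAME/WILD split, so every WILD distinguished point has about twice as many TAME entries to collide with. That directly increases the T-W collision probability per step.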
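To make the two phases concrete, here is a minimal Python sketch of the table logic only, with no elliptic-curve math. The class name `TrapHuntTable`, the `capacity` parameter, and the `tame - wild` candidate formula are illustrative assumptions, not code from PSCKangaroo. The real solver runs on the GPU and still has to verify each candidate against the target public key.

```python
class TrapHuntTable:
    """Toy model of the TRAP/HUNT table: TAMEs are stored, WILDs are only looked up."""

    def __init__(self, capacity):
        self.capacity = capacity      # max number of TAME entries (stands in for RAM size)
        self.tames = {}               # DP key -> TAME distance

    @property
    def full(self):
        return len(self.tames) >= self.capacity

    def trap(self, key, tame_dist):
        """Phase 1: store a TAME distinguished point; refuse once the table is full."""
        if self.full:
            return False
        self.tames.setdefault(key, tame_dist)
        return True

    def hunt(self, key, wild_dist):
        """Phase 2: check a WILD distinguished point against the table, never store it."""
        tame_dist = self.tames.get(key)
        if tame_dist is None:
            return None               # no match: the WILD point is simply discarded
        return tame_dist - wild_dist  # candidate offset, to be verified by the caller


table = TrapHuntTable(capacity=2)
table.trap(10, 100)
table.trap(11, 200)
print(table.trap(12, 300))   # table is full, entry rejected
print(table.hunt(11, 50))    # match against a stored TAME
print(table.hunt(12, 1))     # no match, nothing stored
```

Because `hunt` never writes, the table size stays fixed once phase 1 ends, which is what lets all of the RAM go to TAME entries.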
To fit even more TAMEs into the same RAM, I implemented an ultra-compact 16-byte DP (distinguished point) format, down from 25 bytes in the original. This gives about 56% more entries (25/16 ≈ 1.56) at the cost of truncating 32 bits from the distance field. The truncated bits are recovered on collision via an async BSGS (baby-step giant-step) resolver running on the CPU (4 threads, ~150 ms per resolution). Yes, this introduces hash false positives (FPs) that fail verification, but real collisions are resolved correctly by the BSGS step. The FP count is tracked in the stats; it's a known trade-off, not a bug.
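The recovery step can be illustrated on a toy group. The sketch below is not the PSCKangaroo resolver. It assumes the truncated bits are the low bits of the distance, and it uses integers modulo a small prime instead of secp256k1 points. `recover_distance` and its parameters are hypothetical names. It needs Python 3.8+ for modular inverses via `pow`.

```python
from math import isqrt


def recover_distance(g, target, p, d_hi, bits):
    """Return d = (d_hi << bits) + low with pow(g, d, p) == target, or None.

    Toy baby-step giant-step in the multiplicative group mod p.
    The missing part `low` is searched in [0, 2**bits).
    """
    span = 1 << bits
    m = isqrt(span - 1) + 1                      # ceil(sqrt(span))
    known = pow(g, d_hi << bits, p)              # contribution of the stored bits
    residual = target * pow(known, -1, p) % p    # equals g**low on a true hit

    # Baby steps: remember g**j for j in [0, m)
    baby = {}
    cur = 1
    for j in range(m):
        baby.setdefault(cur, j)
        cur = cur * g % p

    # Giant steps: residual * g**(-i*m) for i in [0, m)
    giant = pow(g, -m, p)
    gamma = residual
    for i in range(m):
        j = baby.get(gamma)
        if j is not None and i * m + j < span:
            return (d_hi << bits) + i * m + j
        gamma = gamma * giant % p
    return None                                  # no consistent low bits: false positive


P, G = 65537, 3        # 3 generates the full group modulo the Fermat prime 65537
d = 0xABCD             # full distance; only the top 8 bits are "stored"
target = pow(G, d, P)
print(recover_distance(G, target, P, d >> 8, 8))        # true collision: full d recovered
print(recover_distance(G, target, P, (d >> 8) - 1, 8))  # mismatched entry: rejected
```

With 32 truncated bits, a plain BSGS like this needs on the order of 2^16 baby steps plus 2^16 giant steps per resolution. The second call shows how a false positive shows up: no value of the missing bits is consistent with the target, so the candidate is dropped.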
Other features added:
3x endomorphism using secp256k1's β/λ constants (verified against bitcoin-core/secp256k1)
XDP 8x (threshold-based DP detection, accepts 8 patterns instead of 1)
Table freeze (no rotation when full, prevents FP explosion on long runs)
Checkpoint system with auto-save and Ctrl+C safe exit
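For the XDP item, one plausible reading of "threshold-based" is sketched below. A classic DP test accepts a point only when its low bits are all zero. A threshold test accepts any of the 8 smallest values of those bits, which yields 8x as many DPs for the same mask. `DP_BITS`, `is_dp_classic`, and `is_dp_threshold` are hypothetical names, and the actual PSCKangaroo check may differ.

```python
DP_BITS = 16                      # hypothetical DP size, not taken from PSCKangaroo
MASK = (1 << DP_BITS) - 1


def is_dp_classic(x: int) -> bool:
    """Classic test: exactly one accepted pattern (all masked bits zero)."""
    return (x & MASK) == 0


def is_dp_threshold(x: int, patterns: int = 8) -> bool:
    """Threshold test: accept the `patterns` smallest values of the masked bits."""
    return (x & MASK) < patterns


sample = range(1 << 20)           # stand-in for x-coordinates of visited points
classic = sum(is_dp_classic(x) for x in sample)
relaxed = sum(is_dp_threshold(x) for x in sample)
print(classic, relaxed)           # relaxed is 8x classic for this sample
```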
Validated by solving Puzzle #80 (known answer).
I'm sharing the code as-is for anyone who might find it useful. Feel free to use, modify, or improve it.
GitHub:
https://github.com/pscamillo/PSCKangaroo
License: GPLv3 (same as the original RCKangaroo)
Feedback and contributions are welcome. I'm not claiming this is better than other approaches — it's just what works for my specific hardware profile (single GPU + lots of RAM). If you run a similar setup, it might help you too.