ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

vladkens (OP)

Newbie

Offline

Activity: 3
Merit: 8

ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

May 27, 2025, 05:44:06 PM

Merited by ABCbits (5), nomachine (3)

Hi everyone,

I'd like to share a tool I've been working on: ecloop, a CPU-optimized tool for searching for Bitcoin public keys (by hash160) on the secp256k1 curve. It combines approaches from projects like keyhunt (range/random range scanning with optional endomorphism) and brainflayer (dictionary-based search), with a focus on clean C code and SIMD acceleration. It supports both compressed and uncompressed public key formats and a Bloom filter for large-scale hash160 scanning, and it runs on macOS and Linux (Windows via WSL). This is my first time posting it here.

Main features:
- Fixed 256-bit modular arithmetic and ECC implementation in a single C file, `lib/ecc.c`
- Group (batch) inversion for point addition / precomputed table for point multiplication
- SIMD acceleration for RIPEMD-160 (AVX2 / NEON) (https://vladkens.cc/rmd160-simd/)
- Accelerated SHA-256 with SHA extension (both ARM and x86)
- Search by range, random range, private key list or words list
- Bloom filter support for efficient filtering of large hash160 sets
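
To illustrate the Bloom filter item above, here is a minimal Python sketch. It is not ecloop's C implementation: the class name, the filter size, and the double-hashing scheme are assumptions chosen for the example, and the 20-byte targets are stand-ins for real hash160 values.

```python
import hashlib

class BloomFilter:
    """Illustrative Bloom filter (hypothetical, not ecloop's data structure)."""

    def __init__(self, size_bits: int, num_hashes: int):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(item))

# Stand-in 20-byte targets (real use would load hash160 values from a file).
targets = [hashlib.sha256(i.to_bytes(4, "big")).digest()[:20] for i in range(1000)]
bloom = BloomFilter(size_bits=1 << 16, num_hashes=7)
for h in targets:
    bloom.add(h)
print(all(bloom.might_contain(h) for h in targets))
```

A filter like this never misses an inserted item but can report false positives, so a hit would still need to be confirmed against the exact target list.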

Benchmarks show ecloop running at 3.5x+ the speed of keyhunt on an x86 CPU.

Repo: https://github.com/vladkens/ecloop

AlexanderCurl

Jr. Member

Offline

Activity: 42
Merit: 193

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

May 28, 2025, 03:46:33 PM

Nice piece of code. Added to my github collection.
But asking people to buy you a coffee for techniques such as batch addition and batch inversion (Montgomery's trick), which come from the research of renowned cryptographers and mathematicians, is not quite appropriate.
BitCrack and JLP have had these implemented for a long time now.

Akito S. M. Hosana

Jr. Member

Offline

Activity: 364
Merit: 8

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

May 29, 2025, 02:34:01 PM
Last edit: May 29, 2025, 04:05:12 PM by Akito S. M. Hosana

Merited by vladkens (1)

I am using Makefile flags from @nomachine

Code:

CC = cc
CC_FLAGS ?= -m64 -Ofast -Wall -Wextra -mtune=native \
           -funroll-loops -ftree-vectorize -fstrict-aliasing \
           -fno-semantic-interposition -fvect-cost-model=unlimited \
           -fno-trapping-math -fipa-ra -flto -fassociative-math \
           -mavx2 -mbmi2 -madx -fwrapv \
           -fomit-frame-pointer -fpredictive-commoning -fgcse-sm -fgcse-las \
           -fmodulo-sched -fmodulo-sched-allow-regmoves -funsafe-math-optimizations

# Extra flags when building on an x86_64 host
ifeq ($(shell uname -m),x86_64)
	CC_FLAGS += -march=native -pthread -lpthread
endif

default: build

clean:
	@rm -rf ecloop bench main a.out *.profraw *.profdata

build: clean
	$(CC) $(CC_FLAGS) main.c -o ecloop

# ./ecloop rnd -f 71.txt -t 12 -o ./BINGO.txt -r 400000000000000000:7fffffffffffffffff -endo
threads: 12 ~ addr33: 1 ~ addr65: 0 ~ endo: 1 | filter: list (1)
----------------------------------------
[RANDOM MODE] offs: 2 ~ bits: 32

0000000000000000 0000000000000000 0000000000000042 8ddff88400000000
0000000000000000 0000000000000000 0000000000000042 8ddff887fffffffc
27.91s ~ 64.92 Mkeys/s ~ 0 / 1,811,939,328 ('p' – pause)

I get about 65 Mkeys/s. This is madness. :D

nomachine

Full Member

Offline

Activity: 714
Merit: 110

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

May 29, 2025, 02:44:48 PM

Merited by vladkens (1)

Quote from: Akito S. M. Hosana on May 29, 2025, 02:34:01 PM

I am using Makefile flags from @nomachine

You're welcome ;)

BTC: bc1qdwnxr7s08xwelpjy3cc52rrxg63xsmagv50fa8

AlexanderCurl

Jr. Member

Offline

Activity: 42
Merit: 193

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

May 30, 2025, 02:31:23 PM
Last edit: May 30, 2025, 02:58:32 PM by AlexanderCurl

Quote from: Akito S. M. Hosana on May 29, 2025, 02:34:01 PM

I am using Makefile flags from @nomachine

No wonder. The basic principle was implemented and fully described by JeanLucPons about five years ago in his BSGS.
https://github.com/JeanLucPons/BSGS/blob/master/BSGS.cpp
See `void BSGS::FillBabySteps(TH_PARAM *ph);`
The idea is very simple. You start from the center point of a group and use batch addition with batch inversion, together with batch subtraction (negation map, symmetry). Since adding and subtracting the same table point give the same denominator, you need only one batch inverse per iteration for both the addition and the subtraction batches.
That way you scan the range the fastest way possible.
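
A rough Python sketch of that idea, for readers who find the C++ hard to follow. This is not JLP's or ecloop's code: the function names are made up for illustration, and it assumes the center point is not within `steps` multiples of G of the point at infinity (so no denominator is zero).

```python
P = 2**256 - 2**32 - 977  # secp256k1 field prime
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Plain affine addition, one inversion per call (None = infinity)."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def mul(k, pt):
    """Double-and-add scalar multiplication (reference only, slow)."""
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def batch_inverse(values):
    """Montgomery's trick: invert every value using one modular inversion."""
    prefix, acc = [], 1
    for v in values:
        prefix.append(acc)
        acc = acc * v % P
    inv = pow(acc, -1, P)
    out = [0] * len(values)
    for i in range(len(values) - 1, -1, -1):
        out[i] = inv * prefix[i] % P
        inv = inv * values[i] % P
    return out

def scan_around(center, steps):
    """Return (center + i*G, center - i*G) for i = 1..steps with one inversion."""
    table = [G]                      # table[i] = (i + 1) * G
    for _ in range(steps - 1):
        table.append(add(table[-1], G))
    cx, cy = center
    # +T and -T share the same x, so one denominator serves both directions.
    inv = batch_inverse([(tx - cx) % P for tx, _ in table])
    plus, minus = [], []
    for (tx, ty), d in zip(table, inv):
        lam = (ty - cy) * d % P      # slope for center + T
        x3 = (lam * lam - cx - tx) % P
        plus.append((x3, (lam * (cx - x3) - cy) % P))
        lam = (-ty - cy) * d % P     # slope for center - T, with -T = (tx, -ty)
        x3 = (lam * lam - cx - tx) % P
        minus.append((x3, (lam * (cx - x3) - cy) % P))
    return plus, minus

plus, minus = scan_around(mul(1000, G), steps=8)
print(plus[0] == mul(1001, G), minus[0] == mul(999, G))
```

Each iteration therefore yields `2 * steps` new points for the price of a single modular inversion plus a few multiplications per point.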

vladkens (OP)

Newbie

Offline

Activity: 3
Merit: 8

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

May 31, 2025, 01:08:11 AM

Quote from: AlexanderCurl on May 28, 2025, 03:46:33 PM

Thanks for checking out my code and adding it to your collection! I appreciate the feedback, but the comment about the "buy me a coffee" link seems a bit out of place — I include that link in most of my public repos, not because I believe these techniques are mine or somehow original.

I know this work builds on well-known ideas from papers, wikis, Bitcoin Core, and other open-source projects. Honestly, JLP's code is quite hard for me to read, so I'm not exactly sure what's implemented there. I wrote ecloop from scratch, originally as a brain wallet checker, and figured out the necessary algorithms as I went.

Some concepts, like group inversion, come from Wikipedia and similar sources. I recently noticed JLP's trick with negative points and updated my code to use it. I also borrowed the multiplication (mod N) idea from JLP, since I didn't have enough time at the moment to search for relevant papers.

So no, I'm not claiming the ideas are entirely new — just that this is a clean, fast, and (hopefully) simpler implementation that runs well on both x86 and ARM. Maybe it will help others push things forward.

---

Also, I forgot to mention in the original post: if anyone knows of other mathematical ideas to improve CPU performance, I'd love to hear about them.

vladkens (OP)

Newbie

Offline

Activity: 3
Merit: 8

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

May 31, 2025, 01:21:22 AM

Quote from: Akito S. M. Hosana on May 29, 2025, 02:34:01 PM

I am using Makefile flags from @nomachine

That's cool! What CPU are you using? I don't have a good x86 processor at the moment to run proper benchmarks.

Also, which compiler did you use? On my side, Clang on Linux gives about 10% better performance compared to GCC, but I haven't figured out the reason for the difference yet.

Akito S. M. Hosana

Jr. Member

Offline

Activity: 364
Merit: 8

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

May 31, 2025, 08:58:59 AM

Quote from: vladkens on May 31, 2025, 01:21:22 AM

I have AMD Ryzen 5 3600 + GCC C++11 - Debian 12

What about the AOCC compiler that @nomachine mentioned earlier?

https://www.amd.com/en/developer/aocc.html

This is a specialized Clang for AMD processors.

AOCC can automatically vectorize scalar operations into SIMD instructions :P

nomachine

Full Member

Offline

Activity: 714
Merit: 110

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

May 31, 2025, 09:04:02 AM

Quote from: Akito S. M. Hosana on May 31, 2025, 08:58:59 AM

What about the AOCC compiler that @nomachine mentioned earlier?

It has only one flaw. You can burn the processor if you don't know what you are doing and you have inadequate cooling. :D

analyticnomad

Newbie

Offline

Activity: 48
Merit: 0

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

June 06, 2025, 02:16:39 PM

#10

Quote from: vladkens on May 27, 2025, 05:44:06 PM

Very cool! Do you know how to do this on a GPU with CUDA? If you can and want a $$pecial project, DM me.

satscollector

Newbie

Offline

Activity: 3
Merit: 0

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

July 13, 2025, 11:33:04 AM

#11

Why does `-endo` show higher speed but takes longer?

For the test below, using `-endo` takes 3x longer for the same range.

```console
$ time ./ecloop add -f data/btc-puzzles-hash -r 400000000000000000:40000000000fffffff -o ./found_71.txt -t 4
threads: 4 ~ addr33: 1 ~ addr65: 0 ~ endo: 0 | filter: list (160)
range_s: 0000000000000000 0000000000000000 0000000000000040 0000000000000000
range_e: 0000000000000000 0000000000000000 0000000000000040 000000000fffffff
----------------------------------------
11.25s ~ 23.86 Mkeys/s ~ 0 / 268,435,456
./ecloop add -f data/btc-puzzles-hash -r 400000000000000000:40000000000ffffff 44.70s user 0.03s system 397% cpu 11.256 total

$ time ./ecloop add -f data/btc-puzzles-hash -r 400000000000000000:40000000000fffffff -o ./found_71.txt -t 4 -endo
threads: 4 ~ addr33: 1 ~ addr65: 0 ~ endo: 1 | filter: list (160)
range_s: 0000000000000000 0000000000000000 0000000000000040 0000000000000000
range_e: 0000000000000000 0000000000000000 0000000000000040 000000000fffffff
----------------------------------------
33.01s ~ 48.79 Mkeys/s ~ 0 / 1,610,612,736
./ecloop add -f data/btc-puzzles-hash -r 400000000000000000:40000000000ffffff 131.16s user 0.05s system 397% cpu 33.018 total
```

kTimesG

Full Member

Offline

Activity: 532
Merit: 131

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

July 13, 2025, 01:32:26 PM

#12

Quote from: satscollector on July 13, 2025, 11:33:04 AM

Why does `-endo` show higher speed but takes longer?

For the test below, using `-endo` takes 3x longer for the same range.

11.25s ~ 23.86 Mkeys/s ~ 0 / 268,435,456
33.01s ~ 48.79 Mkeys/s ~ 0 / 1,610,612,736

Maybe because one sixth of 48.79 (the share of checked keys that are actually inside the non-endo range) is about equal to one third of 23.86? So the same key is still reached, but roughly three times more slowly.

Endomorphism is like slowing down a fast car, but having more cars instead. However, only one of the cars is the one you actually want to get to the first Starbucks, while the other 5 cars are wandering around the Milky Way.
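
The arithmetic can be checked directly from the two runs quoted above. The snippet below is only a worked calculation on those reported numbers; the variable names are illustrative.

```python
# Figures reported in the two runs quoted above.
plain_keys, plain_secs = 268_435_456, 11.25      # without -endo
endo_keys, endo_secs = 1_610_612_736, 33.01      # with -endo
VARIANTS = 6                                     # original key + 5 derived keys

# The -endo counter is exactly six times the size of the requested range.
assert endo_keys == VARIANTS * plain_keys

plain_rate = plain_keys / plain_secs / 1e6               # about 23.86 Mkeys/s
raw_endo_rate = endo_keys / endo_secs / 1e6              # about 48.79 Mkeys/s
in_range_rate = raw_endo_rate / VARIANTS                 # about 8.13 Mkeys/s
slowdown = plain_rate / in_range_rate                    # about 2.93x

print(f"plain:          {plain_rate:.2f} Mkeys/s")
print(f"endo (raw):     {raw_endo_rate:.2f} Mkeys/s")
print(f"endo (in range): {in_range_rate:.2f} Mkeys/s")
print(f"in-range slowdown with -endo: {slowdown:.2f}x")
```

So the displayed speed doubles, but progress through the requested range drops to about a third.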

Off the grid, training pigeons to broadcast signed messages.

satscollector

Newbie

Offline

Activity: 3
Merit: 0

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

July 13, 2025, 02:17:15 PM

#13

Quote from: kTimesG on July 13, 2025, 01:32:26 PM

Ok, makes sense. But this means using `-endo` causes it to search outside the range `-r 400000000000000000:40000000000ffffff` which is explicitly provided. Right?

kTimesG

Full Member

Offline

Activity: 532
Merit: 131

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

July 13, 2025, 03:01:38 PM

#14

Quote from: satscollector on July 13, 2025, 02:17:15 PM

Quote from: kTimesG on July 13, 2025, 01:32:26 PM

Ok, makes sense. But this means using `-endo` causes it to search outside the range `-r 400000000000000000:40000000000ffffff` which is explicitly provided. Right?

Right, since that's pretty much the definition of the endomorphism: a fast mapping from a point to a fixed scalar multiple of that point. The multiplier here is a 256-bit constant, so the corresponding private keys end up far outside the range.
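
A small Python check of why the mapping is "fast", offered as a sketch rather than as ecloop's code. `LAMBDA` is the same value as `A1` in the script posted later in this thread; `BETA` is not from this thread and is assumed here to be the matching field constant used by libsecp256k1. The helper names are illustrative.

```python
P = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)
LAMBDA = 0x5363AD4CC05C30E0A5261C028812645A122E22EA20816678DF02967C1B23BD72
BETA = 0x7AE96A2B657C07106E64479EAC3434E99CF0497512F58995C1396C28719501EE  # assumed

def add(a, b):
    """Affine point addition (None = point at infinity)."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def mul(k, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

k = 0x400000000000000000          # start of the range used in this thread
x, y = mul(k, G)

# One field multiplication on x gives the public key of LAMBDA * k mod N.
endo_point = (BETA * x % P, y)
print(endo_point == mul(LAMBDA * k % N, G))

# Negating y gives the public key of N - k (the negation map).
print((x, P - y) == mul(N - k, G))
print(hex(LAMBDA * k % N))        # a full-width key, far outside the 71-bit range
```

Each derived public key costs a field multiplication or a negation instead of a full point operation, which is why the raw Mkeys/s figure goes up even though the derived private keys are unrelated to the `-r` range.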

satscollector

Newbie

Offline

Activity: 3
Merit: 0

Re: ecloop – CPU-optimized secp256k1 key search tool (C, SIMD, Bloom, range scan)

July 14, 2025, 05:32:27 AM
Last edit: July 15, 2025, 08:35:36 AM by satscollector

#15

I did some tinkering with Cursor IDE. In summary: when `-endo` is used, for each key in the range given with `-r`, 5 additional private keys are generated and checked. These 5 additional keys can be well outside the range.

Here is a small Python script that shows which other 5 keys are generated for each private key in the `-r` range:

Code:

#!/usr/bin/env python3

# Calculate additional private keys checked with -endo option for a single key
# Usage: python3 endomorphism_calculator.py <private_key_hex>

import sys

# Secp256k1 order N
N = 0xfffffffffffffffffffffffffffffffebaaedce6af48a03bbfd25e8cd0364141

# Endomorphism constants (alpha and alpha^2)
A1 = 0x5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72  # alpha
A2 = 0xac9c52b33fa3cf1f5ad9e3fd77ed9ba4a880b9fc8ec739c2e0cfc810b51283ce  # alpha^2

def mod_neg(k):
    """Calculate -k (mod N)"""
    return (N - k) % N

def mod_mul(a, b):
    """Calculate a * b (mod N)"""
    return (a * b) % N

def calculate_endo_keys(base_key):
    """Calculate all 6 endomorphism variants for a given base key"""
    keys = []
    keys.append(("Original", base_key))
    keys.append(("Negated", mod_neg(base_key)))
    alpha_key = mod_mul(base_key, A1)
    keys.append(("Alpha ×", alpha_key))
    keys.append(("Neg Alpha ×", mod_neg(alpha_key)))
    alpha2_key = mod_mul(base_key, A2)
    keys.append(("Alpha² ×", alpha2_key))
    keys.append(("Neg Alpha² ×", mod_neg(alpha2_key)))
    return keys

def main():
    if len(sys.argv) != 2:
        print("Usage: python3 endomorphism_calculator.py <private_key_hex>")
        print("Example: python3 endomorphism_calculator.py 0x10000")
        sys.exit(1)
    try:
        base_key = int(sys.argv[1], 16)
    except ValueError:
        print("Error: Invalid hex value")
        sys.exit(1)

    if base_key <= 0 or base_key >= N:
        print(f"Error: Private key must be in range [1, 0x{N-1:x}]")
        sys.exit(1)

    keys = calculate_endo_keys(base_key)

    print(f"All 6 endomorphism variants for private key 0x{base_key:x}:")
    print("=" * 80)
    for i, (desc, key) in enumerate(keys):
        print(f"{i}: 0x{key:064x}  # {desc}")

if __name__ == "__main__":
    main()

So for the private key 0x400000000000000000, the additional keys that are generated with `-endo` are:

Code:

python3 endomorphism_calculator.py 0x400000000000000000
All 6 endomorphism variants for private key 0x400000000000000000:
================================================================================
0: 0x0000000000000000000000000000000000000000000000400000000000000000  # Original
1: 0xfffffffffffffffffffffffffffffffebaaedce6af489ffbbfd25e8cd0364141  # Negated
2: 0x498700a20499169f0986f9516de685ec9f0ea6a6e6f058c3cd521c4ee2fd5497  # Alpha ×
3: 0xb678ff5dfb66e960f67906ae92197a121ba0363fc8584777f280423ded38ecaa  # Neg Alpha ×
4: 0xb678ff5dfb66e960f67906ae92197a121ba0363fc8584737f280423ded38ecaa  # Alpha² ×
5: 0x498700a20499169f0986f9516de685ec9f0ea6a6e6f05903cd521c4ee2fd5497  # Neg Alpha² ×
