bcchanger
Newbie
Offline
Activity: 18
Merit: 0
|
 |
May 03, 2025, 03:17:41 PM |
|
The next thing I’m going to update is SECP256K1 itself—I’ve already removed some unnecessary files from Git
How fast can this go? For example, the Ryzen 9 7940HS achieves ~10 MK/s when using 1 thread and ~67 MK/s with 16 threads. Performance also depends on how it is compiled (GCC, Clang, etc.). Any chance you could compile a Windows-compatible version?
|
|
|
|
nomachine
|
 |
May 03, 2025, 03:20:04 PM Last edit: May 03, 2025, 03:46:56 PM by nomachine |
|
Nope. It is not a critical update for my version. Gradually, our scripts are starting to diverge in terms of functionality. The next thing I’m going to update is SECP256K1 itself. I’ve already removed some unnecessary files from Git, like Timer.cpp, Timer.h, Random.cpp, and Random.h. They were just collecting dust and serving no purpose, unnecessarily increasing the size of the executable file.
How many keys are scanned at once (batch = group_size) when using sequential or random mode? 0x10000? 0x100000? This would be interesting to know, for example, when using STRIDE. If I start at 0x400000000000000000 with STRIDE = 0x10000000 and BATCH = 0x10000, at the end of each round I just need to add BATCH * number of rounds to the start point 0x400000000000000000 (keeping the same STRIDE value). This is the same principle as in both Python scripts that I sent you by PM a few weeks ago.
That function has never worked here as it should. It just causes problems, showing either unrealistically high speeds or unrealistically low ones. I think I’ll remove it from the script altogether.
P.S. Removed the stride option to avoid further confusion, since this option does nothing.
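For reference, the start-point arithmetic described in the question quoted in this post can be sketched roughly as follows. This is only one reading of the question, not how the tool itself behaves; the function name `round_keys` and the interpretation (each round scans BATCH keys spaced STRIDE apart, then the start point advances by BATCH) are assumptions made for illustration.

```python
def round_keys(start, stride, batch, round_index):
    """Return the private keys visited in one round under the assumed scheme.

    Assumption: each round scans `batch` keys spaced `stride` apart, and the
    start point moves forward by `batch` for every completed round.
    """
    round_start = start + batch * round_index
    return [round_start + i * stride for i in range(batch)]


# Values taken from the question.
START = 0x400000000000000000
STRIDE = 0x10000000
BATCH = 0x10000

first = round_keys(START, STRIDE, BATCH, 0)
second = round_keys(START, STRIDE, BATCH, 1)
print(hex(first[0]), hex(first[1]))  # first two keys of round 0
print(hex(second[0]))                # first key of round 1
```

Under this reading, rounds only stay disjoint while BATCH * round number is smaller than STRIDE, which is the first STRIDE / BATCH = 0x1000 rounds for the values above. After that, the shifted start point lands on keys that an earlier round already visited.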
|
BTC: bc1qdwnxr7s08xwelpjy3cc52rrxg63xsmagv50fa8
|
|
|
Akito S. M. Hosana
Jr. Member
Offline
Activity: 392
Merit: 8
|
 |
May 03, 2025, 04:03:48 PM |
|
P.S. Removed the stride option—to avoid further confusion, since this option does nothing.
I think it's my fault. I asked for that option—to make it similar to Keyhunt. 
|
|
|
|
nomachine
|
 |
May 03, 2025, 04:10:30 PM |
|
P.S. Removed the stride option—to avoid further confusion, since this option does nothing.
I think it's my fault. I asked for that option, to make it similar to Keyhunt.
But this isn’t even close to the same as Keyhunt. Here, you have batches of x8 AVX2 doing the work. Try implementing SHA-256 and RIPEMD-160 with 8x parallel processing in Keyhunt, and let me know what kind of confusion you get with the speed and counter calculations. And how did you solve the stride problem where everything is processed with 8x parallelism?
|
|
|
|
POD5
Member

Offline
Activity: 323
Merit: 10
Keep smiling if you're loosing!
|
 |
May 03, 2025, 04:17:58 PM |
|
P.S. Removed the stride option—to avoid further confusion, since this option does nothing.
I think it's my fault. I asked for that option, to make it similar to Keyhunt.
STRIDE is not working in KEYHUNT either. In any case, the range file option is way better than stride...
|
bc1qtmtmhzp54yvkz7asnqxc9j7ls6y5g93hg08msa
|
|
|
nochkin
Member

Offline
Activity: 74
Merit: 12
|
 |
May 03, 2025, 04:32:14 PM |
|
- is there a way to prove that all of the above are unprovable?
I'm offering a 9000 trillion multiverse BTC bounty if someone proves that the last one is unprovable to be proven. Or something like that. Please no AI. Easy-peasy. The answer to the proof is 42. Where is my reward?
|
|
|
|
nomachine
|
 |
May 03, 2025, 04:32:40 PM |
|
P.S. Removed the stride option—to avoid further confusion, since this option does nothing.
I think it's my fault. I asked for that option, to make it similar to Keyhunt.
STRIDE is not working in KEYHUNT either. In any case, the range file option is way better than stride...
It’s much easier to talk about a particular option than to make it work in practice. Better yet, how much time did you spend on it to actually make it work? I’d rather go fishing.
|
|
|
|
Akito S. M. Hosana
Jr. Member
Offline
Activity: 392
Merit: 8
|
 |
May 03, 2025, 04:47:33 PM |
|
I’d rather go fishing.
Maybe you would have solved the puzzle if you hadn't gone fishing?
|
|
|
|
nochkin
Member

Offline
Activity: 74
Merit: 12
|
 |
May 03, 2025, 04:50:44 PM |
|
Maybe you would have solved the puzzle if you hadn't gone fishing?
Not necessarily. But if you go phishing instead of fishing, then definitely. Not nice, though.
|
|
|
|
FrozenThroneGuy
Jr. Member
Offline
Activity: 53
Merit: 43
|
 |
May 03, 2025, 05:18:27 PM |
|
P.S. Removed the stride option—to avoid further confusion, since this option does nothing.
I think it's my fault. I asked for that option, to make it similar to Keyhunt.
STRIDE is not working in KEYHUNT either. In any case, the range file option is way better than stride...
It’s much easier to talk about a particular option than to make it work in practice. Better yet, how much time did you spend on it to actually make it work? I’d rather go fishing.
Hello NoMachine! Have you thought about using SHA-NI intrinsics to speed up SHA-256? It is not as difficult as AVX Secp256k1, but it can speed up SHA computation by up to 2-3 times. I tried to do this, but hashing got slower: the time per hash went from 6 ns to 150 ns.
|
|
|
|
kTimesG
|
 |
May 03, 2025, 05:35:24 PM |
|
It's interesting to see how all sorts of complexities attempt to simplify the situation. I guess it's now stride's week. How does it help? Well, it's boring to keep the algorithm straightforward. And it of course increases chances, since it's covering more terrain, I assume.
priv -> pub -> sha[0] -> rmd[0] -> check
[priv + 1] -> pub + G -> sha[1] -> rmd[1] -> check
...
{privs} -> {pubs batch} -> {shas} -> {rmds} -> check
Now, I have the ultimate maximum performance and speed method:
- compute k*G from 1 to 2**N / 2 (this is a one-time only step anyway)
- compute (2**(N-1) + k*(2**(N - 1) / 2))*G to get the middle key of the scan interval
- do a single batch addition to compute all the public keys of the interval; reuse the batched inverses (and you'll only ever need a single inversion! wowza!)
Voila. At some point during the last step of this batch addition, you'll hit the key. Genius. Don't thank me - script it!
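The "batched inverses" mentioned in this post refer to a standard trick (often called Montgomery's trick, or simultaneous inversion): many field elements are inverted at the cost of one modular inversion plus a few multiplications per element. Below is a minimal Python sketch of that trick, assuming plain integers modulo the secp256k1 field prime. The name `batch_inverse` is made up for illustration and is not taken from any tool discussed in this thread.

```python
# Field prime of secp256k1.
P = 2**256 - 2**32 - 977


def batch_inverse(values, p=P):
    """Invert every element of `values` modulo p using a single inversion.

    All values must be non-zero modulo p.
    """
    n = len(values)
    # prefix[i] holds the product of values[0..i-1] modulo p.
    prefix = [1] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] * v % p

    # The only modular inversion: the inverse of the product of all values.
    inv_running = pow(prefix[n], -1, p)

    result = [0] * n
    for i in range(n - 1, -1, -1):
        # inv_running is the inverse of values[0] * ... * values[i] here.
        result[i] = prefix[i] * inv_running % p
        # Strip values[i] from the running inverse for the next step.
        inv_running = inv_running * values[i] % p
    return result


values = [3, 5, 7, 2**200 + 1]
inverses = batch_inverse(values)
print(all(v * inv % P == 1 for v, inv in zip(values, inverses)))
```

In a batched point addition, the values to invert would typically be the x-coordinate differences that appear in the slope of each affine addition.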
|
Off the grid, training pigeons to broadcast signed messages.
|
|
|
nomachine
|
 |
May 03, 2025, 06:46:30 PM |
|
It is not as difficult as AVX Secp256k1
Yo, fam! If your code’s crawling at 150ns when it should be flying at 6ns, something’s off. AVX2’s 256-bit registers can handle 4 or 8 hashes at once like a pro. That W array expansion (the σ0/σ1 math) is begging for SIMD. sha256rnds2 goes brrr, like 5x faster than scalar. You’re probably vectorizing just ONE hash instead of stacking 4-8 like pancakes. Double-check that you’re using the flags: -mavx2 -mbmi2 -madx -fwrapv.
Shoutout to the guy who nailed it: https://github.com/ulhaocheng/avxecc
The core math is similar, just different numbers. Key takeaways:
- AVX2-optimized field multiplication (the slowest part of ECC).
- Parallel limb ops (256-bit registers crunching 32/64-bit chunks).
Just tweak it for Secp256k1’s constants, and you’re golden
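To make the "stack several hashes" idea concrete, here is a rough pure-Python sketch of the SHA-256 W array expansion (the σ0/σ1 step) done lane-wise over 8 independent blocks. Python lists stand in for the eight 32-bit lanes of a 256-bit register, so this only illustrates the data layout, not the speed. All function names are made up for illustration, and the demo blocks are arbitrary words rather than padded messages.

```python
MASK = 0xFFFFFFFF


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def small_sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def small_sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def expand_schedule(block_words):
    """Scalar reference: expand 16 message words into the 64-word W array."""
    w = list(block_words)
    for t in range(16, 64):
        w.append((small_sigma1(w[t - 2]) + w[t - 7]
                  + small_sigma0(w[t - 15]) + w[t - 16]) & MASK)
    return w


def expand_schedule_lanes(blocks):
    """Lane-wise version: w[t] is a list holding word t of every block.

    Each step applies the same operation to all lanes, which is the shape
    a SIMD implementation would map onto one vector instruction per step.
    """
    lanes = range(len(blocks))
    w = [[blocks[lane][t] for lane in lanes] for t in range(16)]
    for t in range(16, 64):
        w.append([(small_sigma1(w[t - 2][lane]) + w[t - 7][lane]
                   + small_sigma0(w[t - 15][lane]) + w[t - 16][lane]) & MASK
                  for lane in lanes])
    return w


# Eight arbitrary 16-word blocks, one per lane.
blocks = [[(lane * 0x01010101 + t) & MASK for t in range(16)] for lane in range(8)]
w_lanes = expand_schedule_lanes(blocks)
print(all([w_lanes[t][lane] for t in range(64)] == expand_schedule(blocks[lane])
          for lane in range(8)))
```

The point of the layout is that word t of all eight hashes sits side by side, so one vector operation advances eight schedules at once instead of one.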
|
|
|
|
|
nomachine
|
 |
May 03, 2025, 08:27:17 PM |
|
So, what's the idea here? To create a completely new form of AVX2-accelerated ECC from scratch with secp256k1 parameters?
It can’t be any other way. There is nothing similar using 'limb-slicing' published on GitHub.
|
|
|
|
360videosro
Newbie
Offline
Activity: 7
Merit: 0
|
 |
May 03, 2025, 09:22:58 PM |
|
Besides BitCrack, which hasn't been updated in 5+ years, is there any similar CUDA GPU tool that generates keys randomly within a range?
|
|
|
|
kTimesG
|
 |
May 03, 2025, 09:35:20 PM |
|
Shoutout to the guy who nailed it: https://github.com/ulhaocheng/avxecc
Key takeaways: AVX2-optimized field multiplication (the slowest part of ECC). Parallel limb ops (256-bit registers crunching 32/64-bit chunks). Just tweak it for Secp256k1’s constants, and you’re golden
I think you meant scalar multiplication there... which is basically a no-op if scanning ranges with sequential keys. I bet libsecp256k1 is faster than that though, since it's already SIMD-ed by the compiler, because it uses carry-free independent limbs. That's like, free vectorization out of the box, fewer cycles / op, and so on. Ideal for ending up with a fried CPU, which seems like it's what people want.
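The remark about scalar multiplication being close to a no-op for sequential keys can be illustrated with a small pure-Python sketch: one scalar multiplication gives the first public key, and every following key costs a single point addition with G. This is a slow affine-coordinate sketch for illustration only. The function names are made up, and it is not how libsecp256k1 or any tool in this thread implements it.

```python
# secp256k1 field prime and generator point.
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points on y^2 = x^3 + 7 (None is the point at infinity)."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P      # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P     # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point=G):
    """Double-and-add: many doublings and additions for a single key."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def sequential_pubkeys(start, count):
    """One scalar multiplication up front, then one point addition per key."""
    pub = scalar_mult(start)
    for _ in range(count):
        yield pub
        pub = point_add(pub, G)


pubs = list(sequential_pubkeys(1000, 5))
print(pubs == [scalar_mult(1000 + i) for i in range(5)])
```

Each addition here still needs one modular inversion for the slope, which is where batching the inversions across many additions pays off.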
|
|
|
|
nomachine
|
 |
May 03, 2025, 09:57:31 PM |
|
Ideal for ending up with a fried CPU, which seems like it's what people want.
A fried CPU could indeed be the problem here. 
|
|
|
|
Akito S. M. Hosana
Jr. Member
Offline
Activity: 392
Merit: 8
|
 |
May 03, 2025, 10:17:04 PM |
|
Ideal for ending up with a fried CPU, which seems like it's what people want.
A fried CPU could indeed be the problem here.
What about the AOCC compiler that was mentioned earlier? https://www.amd.com/en/developer/aocc.html
AOCC can automatically vectorize scalar operations into SIMD instructions.
|
|
|
|
fixedpaul
Jr. Member
Offline
Activity: 55
Merit: 16
|
 |
May 03, 2025, 10:20:05 PM |
|
Besides BitCrack, which hasn't been updated in 5+ years, is there any similar CUDA GPU tool that generates keys randomly within a range?
You can check my GitHub: VanitySearch-bitcrack has a random mode.
|
|
|
|
nomachine
|
 |
May 03, 2025, 10:22:23 PM |
|
This is a great thing, but on one condition: all the code must be extremely pedantic. Even then, the CPU might overheat. You’ll need very good cooling. 
|
|
|
|
|