v3.3.0
What's New in v3.3.0
Comprehensive Benchmarks (Metal + WASM)
Metal GPU benchmark (bench_metal.mm): 9 operations — Field Mul/Add/Sub/Sqr/Inv, Point Add/Double, Scalar Mul (P×k), Generator Mul (G×k). Matches CUDA benchmark format with warmup, kernel-only timing, and throughput tables.
3 new Metal GPU kernels: field_add_bench, field_sub_bench, field_inv_bench added to secp256k1_kernels.metal
WASM benchmark (bench_wasm.mjs): Node.js benchmark for all WASM-exported operations — Pubkey Create (G×k), Point Mul (P×k), Point Add (P+Q), ECDSA Sign/Verify, Schnorr Sign/Verify, SHA-256 (32B/1KB)
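The warmup, timed loop and throughput pattern mentioned above can be sketched on the host side as follows. This is a minimal illustration, not code from bench_metal.mm or bench_wasm.mjs: `BenchResult` and `run_benchmark` are hypothetical names, and a kernel-only GPU measurement would read GPU timestamps rather than `std::chrono`.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>

// Hypothetical result record: average time per operation and derived throughput.
struct BenchResult {
    double ns_per_op;
    double ops_per_sec;
};

// Runs `op` `warmup` times untimed, then `iters` times inside the timed region.
BenchResult run_benchmark(const std::function<void()>& op,
                          std::uint64_t warmup, std::uint64_t iters) {
    assert(iters > 0);
    for (std::uint64_t i = 0; i < warmup; ++i) op();  // warm caches, not measured

    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iters; ++i) op();
    const auto stop = std::chrono::steady_clock::now();

    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    const double ns_per_op = total_ns / static_cast<double>(iters);
    // Throughput is the reciprocal of latency: 1e9 ns per second / ns per op.
    return {ns_per_op, ns_per_op > 0.0 ? 1e9 / ns_per_op : 0.0};
}
```

The same reciprocal relation links the two columns of a results table: 649 ns per operation corresponds to roughly 1.54 million operations per second.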
WASM Benchmark Results (CI, Node.js v20, linux x64)
| Operation | Time/Op | Throughput |
|---|---|---|
| SHA-256 (32B) | 649 ns | 1.54 M/s |
| SHA-256 (1KB) | 6.33 µs | 158 K/s |
| Point Add (P+Q) | 22 µs | 45 K/s |
| Pubkey Create (G×k) | 70 µs | 14 K/s |
| Schnorr Sign | 92 µs | 11 K/s |
| ECDSA Sign | 146 µs | 7 K/s |
| Point Mul (P×k) | 693 µs | 1.4 K/s |
| ECDSA Verify | 825 µs | 1.2 K/s |
| Schnorr Verify | 874 µs | 1.1 K/s |
CI Hardening
WASM benchmark now runs in CI (Node.js 20 setup + execution in wasm job)
Metal: skip the generator_mul test on paravirtual devices below the Apple7 GPU family (CI fix)
Benchmark alert threshold raised from 120% to 150% (reduces false positives on shared CI runners)
Fix WASM runtime crash: removed --closure 1, added -fno-exceptions, increased WASM memory (4MB initial, 512KB stack)
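One way to read the benchmark alert threshold is as a ratio between the current and the baseline time per operation. The sketch below assumes the alert fires when the current measurement exceeds the baseline multiplied by the threshold; these notes do not say how the CI tooling actually compares results, and `is_regression` is a hypothetical helper.

```cpp
#include <cassert>

// Returns true when `current_ns` exceeds `threshold_percent` of `baseline_ns`.
// Assumption: larger time per operation means worse performance.
bool is_regression(double baseline_ns, double current_ns, double threshold_percent) {
    assert(baseline_ns > 0.0);
    return current_ns > baseline_ns * (threshold_percent / 100.0);
}
```

Under this reading, a run that is 30% slower than baseline trips a 120% threshold but not a 150% one, which is the kind of noise a shared runner can produce.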
Bug Fixes
Fix Metal shader compilation errors (MSL address space mismatches, jacobian_to_affine ordering)
Fix keccak rotl64 undefined behavior (shift by 0)
Fix macOS build flags for Clang compatibility
Fix metal2.4 shader standard for newer Xcode toolchains
Remove unused .cuh files and sorted_ecc_db
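For context on the rotl64 fix: a rotate written as `(x << n) | (x >> (64 - n))` shifts by 64 when `n` is 0, which is undefined behavior for a 64-bit operand in C and C++. A common remedy is to mask both shift counts. The sketch below shows that technique and is not necessarily the exact patch applied in this release.

```cpp
#include <cstdint>

// Rotate left by n bits; well defined for every n, including 0 and 64.
inline std::uint64_t rotl64(std::uint64_t x, unsigned n) {
    n &= 63u;  // reduce the count to the range 0..63
    // When n == 0 the right-shift count becomes (64 & 63) == 0, so no shift by 64 occurs.
    return (x << n) | (x >> ((64u - n) & 63u));
}
```

Modern compilers typically recognize this masked form and emit a single rotate instruction.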
Testing & Quality
Unified test runner (12 test files consolidated)
Selftest modes: smoke (fast), ci (full), stress (extended)
Boundary KAT vectors, field limb boundary tests, batch inverse sweep
Repro bundle support for deterministic test reproduction
Sanitizer CI integration (ASan/UBSan)
Security & Maturity
SECURITY.md v3.2, THREAT_MODEL.md
API stability guarantees documented
Fuzz testing documentation
Security contact: payysoon@gmail.com
Documentation
Batch inverse & mixed addition API reference with examples (full point, X-only, CUDA, division, scratch reuse, Montgomery trick)
README cleanup: removed AI-generated text, translated Georgian → English
Removed database/lookup/bloom references from public docs
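The Montgomery trick named in the batch inverse reference replaces n field inversions with one inversion plus roughly 3n multiplications. The following is a minimal sketch over a toy 32-bit prime field so that products fit in 64 bits. The modulus and the names `mul_mod`, `inv_mod` and `batch_inverse` are illustrative assumptions, not the library's API, which works on 256-bit secp256k1 field elements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy prime modulus (2^32 - 5) standing in for the secp256k1 field prime.
constexpr std::uint64_t P = 4294967291ULL;

std::uint64_t mul_mod(std::uint64_t a, std::uint64_t b) {
    return ((a % P) * (b % P)) % P;  // both factors < 2^32, so no overflow
}

// Single inversion via Fermat's little theorem: a^(P-2) mod P.
std::uint64_t inv_mod(std::uint64_t a) {
    assert(a % P != 0);
    std::uint64_t result = 1, base = a % P, e = P - 2;
    while (e > 0) {
        if (e & 1) result = mul_mod(result, base);
        base = mul_mod(base, base);
        e >>= 1;
    }
    return result;
}

// Montgomery's trick: inverts every element in place with one inv_mod call.
// All inputs must be nonzero modulo P.
void batch_inverse(std::vector<std::uint64_t>& v) {
    if (v.empty()) return;
    std::vector<std::uint64_t> prefix(v.size());
    std::uint64_t acc = 1;
    for (std::size_t i = 0; i < v.size(); ++i) {
        prefix[i] = acc;               // product of v[0..i-1]
        acc = mul_mod(acc, v[i]);
    }
    std::uint64_t inv = inv_mod(acc);  // inverse of the full product
    for (std::size_t i = v.size(); i-- > 0;) {
        const std::uint64_t vi = v[i];
        v[i] = mul_mod(inv, prefix[i]);  // (v[0..i])^-1 * v[0..i-1] = v[i]^-1
        inv = mul_mod(inv, vi);          // now the inverse of v[0..i-1]
    }
}
```

A chunked variant would apply the same routine to fixed-size slices of the input, trading a few extra inversions for a smaller prefix buffer.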
Metal Backend Improvements
Apple Metal GPU backend with Comba-accelerated field arithmetic
4-bit windowed scalar multiplication on GPU
Chunked Montgomery batch inverse
Branchless bloom check with coalesced memory access
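As an illustration of 4-bit windowed scalar multiplication, the sketch below uses a toy group (integers modulo M under addition) in place of curve points, so the result can be checked against ordinary multiplication. It is host-side C++ under stated assumptions, not the Metal kernel: `group_add`, `scalar_mul_w4` and the modulus are hypothetical, and a real implementation would use Jacobian point addition and doubling.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy group: integers modulo M under addition, standing in for curve points.
constexpr std::uint64_t M = 1000000007ULL;

std::uint64_t group_add(std::uint64_t a, std::uint64_t b) { return (a + b) % M; }

// Fixed 4-bit window: precompute 0*p .. 15*p, then consume the scalar one
// nibble at a time from the most significant end.
std::uint64_t scalar_mul_w4(std::uint64_t p, std::uint64_t k) {
    assert(p < M);
    std::array<std::uint64_t, 16> table{};
    table[0] = 0;  // identity element of the toy group
    for (int i = 1; i < 16; ++i) table[i] = group_add(table[i - 1], p);

    std::uint64_t acc = 0;
    for (int shift = 60; shift >= 0; shift -= 4) {
        for (int d = 0; d < 4; ++d) acc = group_add(acc, acc);  // four doublings
        acc = group_add(acc, table[(k >> shift) & 0xF]);        // one table add
    }
    return acc;
}
```

Each nibble costs four doublings and one addition, so a 256-bit scalar needs 64 table additions instead of up to 256 in plain double-and-add. The table entry is added even when the nibble is zero, which keeps the loop body uniform.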