UltrafastSecp256k1

Author	SHA1	Message	Date
shrec	8af7320c60	Harden audit and fix Windows CUDA build	2026-03-25 14:36:36 +00:00
shrec	461ecca3c1	docs: add 5 ZK/BIP-324 GPU batch ops to all docs Update 8 documentation files to reflect the new GPU C ABI surface: - CHANGELOG: 18→23 gpu functions, added entry for GPU ZK+BIP-324 C ABI - README: 8 ops → 13 ops (8 core + 5 CUDA-first) - BACKEND_ASSURANCE_MATRIX: new feature rows + parity tracking stubs - BACKEND_PARITY: new GPU batch ops section, 3.4.0 note - GPU_VALIDATION_MATRIX: 8→13 ops, CUDA 13/13, OCL/Metal 8/13 - API_REFERENCE: 5 new C ABI function signatures - FEATURE_ASSURANCE_LEDGER: 18→23 gpu functions, 5 new batch op rows - FEATURE_MATURITY: 18→23 gpu functions count	2026-03-24 16:43:52 +00:00
shrec	3a86fcef1e	Harden ABI and finish bindings validation	2026-03-23 02:30:44 +00:00
shrec	852255bf72	docs: sync GPU ABI and batch signing docs	2026-03-22 16:57:51 +00:00
shrec	c7914410db	Merge remote-tracking branch 'origin/fix/metal-validation-gpu-presets' into dev	2026-03-21 14:15:44 +00:00
shrec	fe64eb2744	infra: self-hosted GPU CI runner + workflow + release evidence - Add gpu-selfhosted.yml: CUDA CI on parking-gpu (RTX 5060 Ti) - Triggers: push/PR dev+main, nightly 04:00 UTC, manual dispatch - GPU C ABI tests, unified audit, benchmarks, backend matrix - Environment manifest + per-commit proof summary artifacts - 90-day artifact retention - Update release.yml: GPU evidence bundle in releases - Add INFRA_AMPLIFICATION_TODO.md: completed P0-P4 tracking - Fix audit/CMakeLists.txt: prefer ufsecp_shared for GPU test linking - Update GPU_VALIDATION_MATRIX.md, README, docs, ufsecp_gpu.h	2026-03-15 23:16:17 +00:00
Vano Chkheidze	d16f254f8f	fix: Metal device validation + GPU audit presets + docs + examples (#146) * fix: CUDA RIPEMD160 r2 table + ECDH y-parity + GPU-side conversion - hash160.cuh: fix transposed r2[46..47] (was 13,4 -> correct 4,13) - ecdh.cuh: compute y-parity for SHA-256(02/03||x) to match CPU ecdh_compute - gpu_backend_cuda.cu: GPU-side Jacobian<->compressed conversion via batch_jac_to_compressed_kernel and batch_compressed_to_jac_kernel; fix bytes_to_scalar/bytes_to_field byte order; add msm_reduce_and_compress_kernel for GPU-side MSM accumulation; remove dead host-side FieldElement code - gpu/CMakeLists.txt: remove gpu_backend_cuda_host.cpp - bindings/rust: add hex dev-dep, fix abi_version() call in smoke test - bindings/nodejs: add koffi-based smoke test (Node 22 compatible) GPU test results: 154 passed, 0 failed gpu_abi_gate: 44/44 gpu_ops_equivalence: 55/55 gpu_host_api_negative: 55/55 Binding tests: Rust (cargo test): 13/13 Node.js (koffi): 12/12 * examples: add 6-language example suite (C, Python, Rust, Node.js, Go, Java) Comprehensive CPU + GPU examples for all supported binding languages: - C: Direct C ABI calls, 16 demo sections (CPU + GPU) - Python: ctypes + Ufsecp wrapper, 14 sections (CPU + GPU) - Rust: Safe ufsecp crate wrapper, 9 sections (CPU only) - Node.js: koffi FFI, 12 sections (CPU + GPU) - Go: Pure cgo, 14 sections (CPU + GPU) - Java: JNA FFI, 14 sections (CPU + GPU) Each example covers: key generation, ECDSA (sign/verify/recover/DER), Schnorr (BIP-340), ECDH, hashing (SHA-256, Hash160), Bitcoin addresses (P2PKH, P2WPKH, P2TR), WIF encoding, BIP-32 HD derivation, and Taproot. GPU examples demonstrate: backend discovery, batch key generation, batch ECDSA verify, batch Hash160, and multi-scalar multiplication. All 6 examples tested and verified against RTX 5060 Ti (CUDA + OpenCL). * examples: add GPU+Pedersen to all 6 languages, expand README - Rust: add GPU sections [10]-[15] + Pedersen, add GPU+Pedersen FFI to ufsecp-sys - Node.js: add BIP-32, Taproot, Pedersen sections [8]-[10], renumber GPU [11]-[15] - Python: add Pedersen section [10] via direct ctypes - Go: add Pedersen section [10] via cgo - Java: add Pedersen JNA declarations + section [10] - Rust safe wrapper: add Context::as_ptr() for GPU FFI access - examples/README.md: comprehensive rewrite with all 6 languages, build/run instructions, feature coverage matrix, embedded platforms, troubleshooting - README.md: add Highlights, Performance, Architecture stack diagram, Hardware Compatibility table (16 platforms), Embedded Targets, Examples index, Use Cases section; expand Architecture with bindings + source tree All 6 examples tested: C 16/16, Rust 15/15, Python 15/15, Node.js 15/15, Go 15/15, Java 15/15 -- all sections pass including GPU operations. * docs: fix GPU backend maturity labels, add docs links, clean repo hygiene - .gitignore: add rules for example build artifacts (binaries, node_modules, target/, Cargo.lock, package-lock.json, .class files) - README.md: fix GPU overclaims -- OpenCL is partial (4/6 ops), Metal is experimental (discovery only); add GPU API, Validation Matrix, Feature Maturity, Supported Guarantees, Examples to Documentation table - FEATURE_MATURITY.md: fix contradictions -- ECDSA/Schnorr verify GPU column corrected from 'all 3' to 'CUDA' (OpenCL UNSUPPORTED per ufsecp_gpu.h); BIP-32 HD GPU corrected from 'all 3' to '-' (no GPU C ABI path) - GPU_VALIDATION_MATRIX.md: add CI and Local Verification table documenting that CUDA/OpenCL tests pass locally (RTX 5060 Ti), GH Actions lacks GPU runners; all 4 GPU C ABI tests (gpu_abi_gate, gpu_ops_equivalence, gpu_host_api_negative, gpu_backend_matrix) confirmed in ctest matrix (49 total) and pass * fix: Metal device index validation + canonical GPU audit presets - Metal backend: reject out-of-range device_index in init() before creating MetalRuntime (fixes gpu_host_api_negative on macOS CI) - CMakePresets.json: add cuda-audit, cuda-audit-5060ti configure/build presets and testPresets for reproducible 49-test GPU verification - docs/BUILDING.md: document canonical GPU audit build path - docs/LOCAL_CI.md: add GPU proof-path quick-reference - docs/README.md: update docs index entry --------- Co-authored-by: shrec <shrec@users.noreply.github.com>	2026-03-16 02:42:54 +04:00
shrec	6d92a882f3	docs: fix GPU backend maturity labels, add docs links, clean repo hygiene - .gitignore: add rules for example build artifacts (binaries, node_modules, target/, Cargo.lock, package-lock.json, .class files) - README.md: fix GPU overclaims -- OpenCL is partial (4/6 ops), Metal is experimental (discovery only); add GPU API, Validation Matrix, Feature Maturity, Supported Guarantees, Examples to Documentation table - FEATURE_MATURITY.md: fix contradictions -- ECDSA/Schnorr verify GPU column corrected from 'all 3' to 'CUDA' (OpenCL UNSUPPORTED per ufsecp_gpu.h); BIP-32 HD GPU corrected from 'all 3' to '-' (no GPU C ABI path) - GPU_VALIDATION_MATRIX.md: add CI and Local Verification table documenting that CUDA/OpenCL tests pass locally (RTX 5060 Ti), GH Actions lacks GPU runners; all 4 GPU C ABI tests (gpu_abi_gate, gpu_ops_equivalence, gpu_host_api_negative, gpu_backend_matrix) confirmed in ctest matrix (49 total) and pass	2026-03-15 21:15:55 +00:00
shrec	0afb34402d	gpu: complete P0-P2 GPU API TODO -- CUDA/OpenCL/Metal CUDA completeness (P0): - Implement ecdh_batch kernel (thin wrapper over cuda::ecdh_compute) - Implement MSM: scatter k*P kernels + host-side accumulation - Split compilation: extract FieldElement-dependent code to gpu_backend_cuda_host.cpp (host compiler, not nvcc) with POD-only header gpu_cuda_host_helpers.h - Fix namespace: extern kernel decls in secp256k1::cuda:: (not gpu::) - Add POSITION_INDEPENDENT_CODE for CUDA static lib (shared lib linking) Test infrastructure (P0): - test_gpu_ops_equivalence.cpp: 6 equivalence tests (CPU vs GPU) - test_gpu_host_api_negative.cpp: 8 groups, 55 negative checks - test_gpu_backend_matrix.cpp: backend enum + per-backend op probing - Wire 3 new test targets in audit/CMakeLists.txt OpenCL expansion (P1): - Implement ecdh_batch, hash160_pubkey_batch, msm (4/6 ops now) - ECDSA/Schnorr verify remain UNSUPPORTED stubs Error model hardening (P1): - Reorder all backends: count=0 check before NULL check (no-op is valid) - Unsupported stubs check is_ready() before returning - MSM n=0 returns Ok, infinity returns Arith Metal scope clarity (P2): - Mark Metal backend EXPERIMENTAL in header + ufsecp_gpu.h - All Metal stubs check is_ready() LTO/CUDA compatibility: - Guard INTERFACE -flto propagation when CUDA is enabled (nvcc LTO v12 vs GCC LTO v14 version mismatch) - Add IPO opt-out for GPU-linked test targets Documentation: - GPU_VALIDATION_MATRIX.md: feature maturity table - TEST_MATRIX.md: 3 new GPU test targets - AUDIT_TRACEABILITY.md: GPU coverage entries - ufsecp_gpu.h: feature maturity section (CUDA 6/6, OpenCL 4/6, Metal 0/6) - C example: include/ufsecp/examples/gpu_example.c	2026-03-15 18:24:12 +00:00
shrec	331beb01ac	security: comprehensive hardening + adversarial testing + debug invariants - Parser strictness: FE::parse_bytes_strict at ABI boundary (batch verify, batch identify, adaptor verify) - MuSig2 nonce reuse fix (CRITICAL): secnonce zeroed after partial_sign - BIP-32 overflow fix: uint64_t bounds check on derivation index - BIP-39 validate: null ctx guard added - ECIES: HMAC covers ephemeral pubkey, OS CSPRNG, strict prefix validation - Zeroization audit: secure_erase across 34 functions (ufsecp_impl, ecdh, ecies) Testing: - test_adversarial_protocol.cpp: 37 tests, 7 categories (MuSig2, FROST, Silent Payments, adaptor sigs, BIP-32, FFI hostile-caller), 150/150 pass - CPU-GPU cross-verification: gen_mul, field_mul, ECDSA (GPU audit 45/45) - Debug invariants deployed: 15 SECP_ASSERT_* insertions across point.cpp, ecdsa.cpp, schnorr.cpp, scalar.cpp (zero overhead in Release) Documentation: - docs/FEATURE_ASSURANCE_LEDGER.md: 96 API functions, 25 categories, 8 coverage dimensions - docs/GPU_VALIDATION_MATRIX.md: CUDA/OpenCL/Metal validation checklist - docs/ENGINEERING_WORKDOC.md All tests: 42/42 CTest PASS	2026-03-13 23:52:25 +00:00

10 Commits