UltrafastSecp256k1/examples/python_example/example.py
Vano Chkheidze d16f254f8f
fix: Metal device validation + GPU audit presets + docs + examples (#146)
* fix: CUDA RIPEMD160 r2 table + ECDH y-parity + GPU-side conversion

- hash160.cuh: fix transposed r2[46..47] (was 13,4 -> correct 4,13)
- ecdh.cuh: compute y-parity for SHA-256(02/03||x) to match CPU ecdh_compute
- gpu_backend_cuda.cu: GPU-side Jacobian<->compressed conversion
  via batch_jac_to_compressed_kernel and batch_compressed_to_jac_kernel;
  fix bytes_to_scalar/bytes_to_field byte order;
  add msm_reduce_and_compress_kernel for GPU-side MSM accumulation;
  remove dead host-side FieldElement code
- gpu/CMakeLists.txt: remove gpu_backend_cuda_host.cpp
- bindings/rust: add hex dev-dep, fix abi_version() call in smoke test
- bindings/nodejs: add koffi-based smoke test (Node 22 compatible)
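
The ecdh.cuh item above concerns hashing the compressed form of the shared point, where the prefix byte depends on the parity of y. A minimal pure-Python sketch of that convention follows. It assumes the shared point is already available as affine integers (x, y); the name `ecdh_hash_compressed` is hypothetical and is not part of the library or its bindings.

```python
import hashlib


def ecdh_hash_compressed(x: int, y: int) -> bytes:
    """Hypothetical helper: SHA-256 over the 33-byte compressed point encoding."""
    # 0x02 marks an even y coordinate, 0x03 an odd one.
    prefix = 0x03 if (y & 1) else 0x02
    return hashlib.sha256(bytes([prefix]) + x.to_bytes(32, "big")).digest()
```

Two points that share x but differ in y-parity hash to different secrets, which is presumably why a fixed prefix on the GPU side would disagree with the CPU result.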

GPU test results: 154 passed, 0 failed
  gpu_abi_gate:          44/44
  gpu_ops_equivalence:   55/55
  gpu_host_api_negative: 55/55

Binding tests:
  Rust (cargo test):     13/13
  Node.js (koffi):       12/12

* examples: add 6-language example suite (C, Python, Rust, Node.js, Go, Java)

Comprehensive CPU + GPU examples for all supported binding languages:

- C:       Direct C ABI calls, 16 demo sections (CPU + GPU)
- Python:  ctypes + Ufsecp wrapper, 14 sections (CPU + GPU)
- Rust:    Safe ufsecp crate wrapper, 9 sections (CPU only)
- Node.js: koffi FFI, 12 sections (CPU + GPU)
- Go:      Pure cgo, 14 sections (CPU + GPU)
- Java:    JNA FFI, 14 sections (CPU + GPU)

Each example covers: key generation, ECDSA (sign/verify/recover/DER),
Schnorr (BIP-340), ECDH, hashing (SHA-256, Hash160), Bitcoin addresses
(P2PKH, P2WPKH, P2TR), WIF encoding, BIP-32 HD derivation, and Taproot.
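
Of the operations listed above, WIF encoding is small enough to sketch without the library. The block below is an illustrative pure-Python Base58Check encoder for a mainnet, compressed-key WIF string. It is not the library's implementation, and the names `base58check` and `wif_encode_sketch` are hypothetical.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(payload: bytes) -> str:
    """Append a 4-byte double-SHA-256 checksum, then Base58-encode."""
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    data = payload + checksum
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def wif_encode_sketch(privkey: bytes, compressed: bool = True) -> str:
    """Mainnet WIF: 0x80 || key || optional 0x01 compression flag."""
    payload = b"\x80" + privkey + (b"\x01" if compressed else b"")
    return base58check(payload)


wif = wif_encode_sketch(bytes(31) + b"\x01")
print(wif)
```

The private key used here is the same one the Python example passes to `ctx.wif_encode`, so the two outputs can be compared by hand when the library is available.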

GPU examples demonstrate: backend discovery, batch key generation,
batch ECDSA verify, batch Hash160, and multi-scalar multiplication.
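
Multi-scalar multiplication computes the sum of k_i * P_i over a batch of scalars and points. The sketch below shows the operation in plain Python with affine secp256k1 arithmetic. It is a slow reference for illustration only and says nothing about how the GPU kernels are implemented. The function names are hypothetical; the curve constants are the standard secp256k1 parameters.

```python
# Standard secp256k1 field prime and generator point.
P = 2**256 - 2**32 - 977
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def add(a, b):
    """Affine point addition; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a)
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord slope
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, pt):
    """Double-and-add scalar multiplication (not constant time)."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def msm(scalars, points):
    """Naive MSM: sum of k_i * P_i."""
    acc = None
    for k, pt in zip(scalars, points):
        acc = add(acc, mul(k, pt))
    return acc


points = [mul(i, G) for i in range(1, 5)]
result = msm([1, 2, 3, 4], points)
```

With scalars 1..4 and points 1G..4G (the same inputs the Python example feeds to the GPU MSM call), the sum is (1 + 4 + 9 + 16)G = 30G.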

All 6 examples tested and verified against RTX 5060 Ti (CUDA + OpenCL).

* examples: add GPU+Pedersen to all 6 languages, expand README

- Rust: add GPU sections [10]-[15] + Pedersen, add GPU+Pedersen FFI to ufsecp-sys
- Node.js: add BIP-32, Taproot, Pedersen sections [8]-[10], renumber GPU [11]-[15]
- Python: add Pedersen section [10] via direct ctypes
- Go: add Pedersen section [10] via cgo
- Java: add Pedersen JNA declarations + section [10]
- Rust safe wrapper: add Context::as_ptr() for GPU FFI access
- examples/README.md: comprehensive rewrite with all 6 languages, build/run
  instructions, feature coverage matrix, embedded platforms, troubleshooting
- README.md: add Highlights, Performance, Architecture stack diagram,
  Hardware Compatibility table (16 platforms), Embedded Targets, Examples
  index, Use Cases section; expand Architecture with bindings + source tree

All 6 examples tested: C 16/16, Rust 15/15, Python 15/15, Node.js 15/15,
Go 15/15, Java 15/15 -- all sections pass including GPU operations.

* docs: fix GPU backend maturity labels, add docs links, clean repo hygiene

- .gitignore: add rules for example build artifacts (binaries, node_modules,
  target/, Cargo.lock, package-lock.json, .class files)
- README.md: fix GPU overclaims -- OpenCL is partial (4/6 ops), Metal is
  experimental (discovery only); add GPU API, Validation Matrix, Feature
  Maturity, Supported Guarantees, Examples to Documentation table
- FEATURE_MATURITY.md: fix contradictions -- ECDSA/Schnorr verify GPU column
  corrected from 'all 3' to 'CUDA' (OpenCL UNSUPPORTED per ufsecp_gpu.h);
  BIP-32 HD GPU corrected from 'all 3' to '-' (no GPU C ABI path)
- GPU_VALIDATION_MATRIX.md: add CI and Local Verification table documenting
  that CUDA/OpenCL tests pass locally (RTX 5060 Ti), GH Actions lacks GPU
  runners; all 4 GPU C ABI tests (gpu_abi_gate, gpu_ops_equivalence,
  gpu_host_api_negative, gpu_backend_matrix) confirmed in ctest matrix (49
  total) and pass

* fix: Metal device index validation + canonical GPU audit presets

- Metal backend: reject out-of-range device_index in init() before
  creating MetalRuntime (fixes gpu_host_api_negative on macOS CI)
- CMakePresets.json: add cuda-audit, cuda-audit-5060ti configure/build
  presets and testPresets for reproducible 49-test GPU verification
- docs/BUILDING.md: document canonical GPU audit build path
- docs/LOCAL_CI.md: add GPU proof-path quick-reference
- docs/README.md: update docs index entry

---------

Co-authored-by: shrec <shrec@users.noreply.github.com>
2026-03-16 02:42:54 +04:00

309 lines
12 KiB
Python

#!/usr/bin/env python3
"""
UltrafastSecp256k1 -- Python Example (CPU + GPU)

Demonstrates the full Python ctypes binding: key ops, ECDSA, Schnorr,
ECDH, hashing, Bitcoin addresses, BIP-32, Taproot, and GPU batch ops.

Usage:
    UFSECP_LIB=../../build-linux/include/ufsecp/libufsecp.so python3 example.py

    # Or if LD_LIBRARY_PATH is set:
    LD_LIBRARY_PATH=../../build-linux/include/ufsecp python3 example.py
"""

import ctypes
import os
import sys
from ctypes import (
    POINTER, byref, c_char_p, c_int, c_size_t, c_uint8, c_uint32, c_void_p,
)

# Add the bindings directory to the path
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
BINDINGS_DIR = os.path.join(SCRIPT_DIR, '..', '..', 'bindings', 'python')
sys.path.insert(0, BINDINGS_DIR)

# Point to the shared library
if 'UFSECP_LIB' not in os.environ:
    candidate = os.path.join(SCRIPT_DIR, '..', '..', 'build-linux', 'include', 'ufsecp', 'libufsecp.so')
    if os.path.exists(candidate):
        os.environ['UFSECP_LIB'] = os.path.abspath(candidate)

from ufsecp import Ufsecp, NET_MAINNET


# ── Helper ─────────────────────────────────────────────────────────────────

def hexs(data: bytes) -> str:
    return data.hex()


def section(num: int, title: str):
    print(f"\n[{num}] {title}")

# ── CPU Examples ──────────────────────────────────────────────────────────

def demo_cpu():
    print("=== CPU Operations ===")

    with Ufsecp() as ctx:
        privkey = bytes(31) + b'\x01'
        privkey2 = bytes(31) + b'\x02'

        # 1. Key Generation
        section(1, "Key Generation")
        pub = ctx.pubkey_create(privkey)
        pub_uncompressed = ctx.pubkey_create_uncompressed(privkey)
        xonly = ctx.pubkey_xonly(privkey)
        print(f" Private key: {hexs(privkey)}")
        print(f" Compressed (33B): {hexs(pub)}")
        print(f" Uncompressed (65B): {hexs(pub_uncompressed)}")
        print(f" X-only (32B): {hexs(xonly)}")

        # 2. ECDSA
        section(2, "ECDSA Sign / Verify (RFC 6979)")
        msg = ctx.sha256(b"Hello UltrafastSecp256k1!")
        print(f" Message hash: {hexs(msg)}")
        sig = ctx.ecdsa_sign(msg, privkey)
        print(f" ECDSA signature: {hexs(sig)}")
        ok = ctx.ecdsa_verify(msg, sig, pub)
        print(f" Verify: {'VALID' if ok else 'INVALID'}")

        # DER encoding
        der = ctx.ecdsa_sig_to_der(sig)
        print(f" DER length: {len(der)} bytes")
        sig_back = ctx.ecdsa_sig_from_der(der)
        print(f" DER roundtrip: {'match' if sig == sig_back else 'MISMATCH'}")

        # Recovery
        rsig = ctx.ecdsa_sign_recoverable(msg, privkey)
        recovered = ctx.ecdsa_recover(msg, rsig.signature, rsig.recovery_id)
        print(f" Recovery: recid={rsig.recovery_id}, match={'YES' if recovered == pub else 'NO'}")

        # 3. Schnorr
        section(3, "Schnorr Sign / Verify (BIP-340)")
        aux = bytes(32)
        schnorr_sig = ctx.schnorr_sign(msg, privkey, aux)
        print(f" Schnorr signature: {hexs(schnorr_sig)}")
        ok = ctx.schnorr_verify(msg, schnorr_sig, xonly)
        print(f" Verify: {'VALID' if ok else 'INVALID'}")

        # 4. ECDH
        section(4, "ECDH Key Agreement")
        pub2 = ctx.pubkey_create(privkey2)
        secret_a = ctx.ecdh(privkey, pub2)
        secret_b = ctx.ecdh(privkey2, pub)
        print(f" Secret (A->B): {hexs(secret_a)}")
        print(f" Secret (B->A): {hexs(secret_b)}")
        print(f" Match: {'YES' if secret_a == secret_b else 'NO'}")

        # 5. Hashing
        section(5, "Hashing")
        sha = ctx.sha256(pub)
        h160 = ctx.hash160(pub)
        tagged = ctx.tagged_hash("BIP0340/challenge", msg)
        print(f" SHA-256(pubkey): {hexs(sha)}")
        print(f" Hash160(pubkey): {hexs(h160)}")
        print(f" Tagged hash: {hexs(tagged)}")

        # 6. Bitcoin Addresses
        section(6, "Bitcoin Addresses")
        print(f" P2PKH: {ctx.addr_p2pkh(pub)}")
        print(f" P2WPKH: {ctx.addr_p2wpkh(pub)}")
        print(f" P2TR: {ctx.addr_p2tr(xonly)}")

        # 7. WIF
        section(7, "WIF Encoding")
        wif = ctx.wif_encode(privkey)
        print(f" WIF: {wif}")
        decoded = ctx.wif_decode(wif)
        print(f" Decode roundtrip: match={'YES' if decoded.privkey == privkey else 'NO'}")

        # 8. BIP-32
        section(8, "BIP-32 HD Key Derivation")
        seed = bytes([0x42] * 64)
        master = ctx.bip32_master(seed)
        child_key = ctx.bip32_derive_path(master, "m/44'/0'/0'/0/0")
        child_priv = ctx.bip32_privkey(child_key)
        child_pub = ctx.bip32_pubkey(child_key)
        print(f" BIP-32 child priv: {hexs(child_priv)}")
        print(f" BIP-32 child pub: {hexs(child_pub)}")

        # 9. Taproot
        section(9, "Taproot (BIP-341)")
        tap = ctx.taproot_output_key(xonly)
        print(f" Output key: {hexs(tap.output_key_x)}")
        print(f" Parity: {tap.parity}")
        ok = ctx.taproot_verify(tap.output_key_x, tap.parity, xonly)
        print(f" Verify: {'VALID' if ok else 'INVALID'}")

    # 10. Pedersen Commitment
    section(10, "Pedersen Commitment")
    lib_path = os.environ.get('UFSECP_LIB')
    if not lib_path:
        lib_path = os.path.join(SCRIPT_DIR, '..', '..', 'build-linux',
                                'include', 'ufsecp', 'libufsecp.so')
    _lib = ctypes.CDLL(lib_path)
    _lib.ufsecp_pedersen_commit.argtypes = [c_void_p, POINTER(c_uint8), POINTER(c_uint8), POINTER(c_uint8)]
    _lib.ufsecp_pedersen_commit.restype = c_int
    _lib.ufsecp_pedersen_verify.argtypes = [c_void_p, POINTER(c_uint8), POINTER(c_uint8), POINTER(c_uint8)]
    _lib.ufsecp_pedersen_verify.restype = c_int

    with Ufsecp() as ctx:
        value = (c_uint8 * 32)(*([0] * 31 + [42]))
        blinding = (c_uint8 * 32)(*([0] * 31 + [7]))
        commitment = (c_uint8 * 33)()
        rc = _lib.ufsecp_pedersen_commit(ctx._ctx, value, blinding, commitment)
        assert rc == 0, f"pedersen_commit failed: {rc}"
        print(f" Commitment: {hexs(bytes(commitment))}")
        rc = _lib.ufsecp_pedersen_verify(ctx._ctx, commitment, value, blinding)
        print(f" Verify: {'VALID' if rc == 0 else 'INVALID'}")

    print()

# ── GPU Examples ──────────────────────────────────────────────────────────

def demo_gpu():
    print("=== GPU Operations ===")

    # Load GPU functions directly from the C library
    lib_path = os.environ.get('UFSECP_LIB')
    if not lib_path:
        lib_path = os.path.join(SCRIPT_DIR, '..', '..', 'build-linux',
                                'include', 'ufsecp', 'libufsecp.so')
    lib = ctypes.CDLL(lib_path)

    # Bind GPU functions
    lib.ufsecp_gpu_backend_count.argtypes = [POINTER(c_uint32), c_uint32]
    lib.ufsecp_gpu_backend_count.restype = c_uint32
    lib.ufsecp_gpu_backend_name.argtypes = [c_uint32]
    lib.ufsecp_gpu_backend_name.restype = c_char_p
    lib.ufsecp_gpu_is_available.argtypes = [c_uint32]
    lib.ufsecp_gpu_is_available.restype = c_int
    lib.ufsecp_gpu_device_count.argtypes = [c_uint32]
    lib.ufsecp_gpu_device_count.restype = c_uint32
    lib.ufsecp_gpu_ctx_create.argtypes = [POINTER(c_void_p), c_uint32, c_uint32]
    lib.ufsecp_gpu_ctx_create.restype = c_int
    lib.ufsecp_gpu_ctx_destroy.argtypes = [c_void_p]
    lib.ufsecp_gpu_ctx_destroy.restype = None
    lib.ufsecp_gpu_generator_mul_batch.restype = c_int
    lib.ufsecp_gpu_ecdsa_verify_batch.restype = c_int
    lib.ufsecp_gpu_hash160_pubkey_batch.restype = c_int
    lib.ufsecp_gpu_msm.restype = c_int
    lib.ufsecp_gpu_error_str.argtypes = [c_int]
    lib.ufsecp_gpu_error_str.restype = c_char_p

    # 11. Backend Discovery
    section(11, "GPU Backend Discovery")
    backend_ids = (c_uint32 * 4)()
    n_backends = lib.ufsecp_gpu_backend_count(backend_ids, 4)
    print(f" Backends compiled: {n_backends}")
    use_backend = 0
    for i in range(n_backends):
        bid = backend_ids[i]
        name = lib.ufsecp_gpu_backend_name(bid).decode()
        avail = lib.ufsecp_gpu_is_available(bid)
        devs = lib.ufsecp_gpu_device_count(bid)
        print(f" Backend {bid}: {name:<8s} available={avail} devices={devs}")
        # Pick the first available backend
        if avail and not use_backend:
            use_backend = bid

    if not use_backend:
        print(" No GPU backends available -- skipping GPU demos.")
        return

    # Create GPU context
    gpu = c_void_p()
    rc = lib.ufsecp_gpu_ctx_create(byref(gpu), use_backend, 0)
    if rc != 0:
        print(f" GPU context creation failed: {lib.ufsecp_gpu_error_str(rc).decode()}")
        return

    # 12. Batch Key Generation
    section(12, "GPU Batch Key Generation (4 keys)")
    N = 4
    scalars = (c_uint8 * (N * 32))(*([0] * (N * 32)))
    for i in range(N):
        scalars[i * 32 + 31] = i + 1  # big-endian scalars 1..N
    pubkeys = (c_uint8 * (N * 33))()
    rc = lib.ufsecp_gpu_generator_mul_batch(gpu, scalars, N, pubkeys)
    if rc == 0:
        for i in range(N):
            pk = bytes(pubkeys[i*33:(i+1)*33])
            print(f" GPU pubkey[{i}]: {hexs(pk)}")
    else:
        print(f" gpu_generator_mul_batch: {lib.ufsecp_gpu_error_str(rc).decode()}")

    # 13. ECDSA Batch Verify
    section(13, "GPU ECDSA Batch Verify")
    # Sign on CPU, verify on GPU
    with Ufsecp() as ctx:
        msgs = (c_uint8 * (N * 32))()
        sigs = (c_uint8 * (N * 64))()
        pubs = (c_uint8 * (N * 33))()
        for i in range(N):
            priv = bytes(31) + bytes([i + 1])
            msg_hash = ctx.sha256(bytes([i]))
            sig = ctx.ecdsa_sign(msg_hash, priv)
            pub = bytes(pubkeys[i*33:(i+1)*33])
            for j in range(32):
                msgs[i*32+j] = msg_hash[j]
            for j in range(64):
                sigs[i*64+j] = sig[j]
            for j in range(33):
                pubs[i*33+j] = pub[j]

    results = (c_uint8 * N)()
    rc = lib.ufsecp_gpu_ecdsa_verify_batch(gpu, msgs, pubs, sigs, N, results)
    if rc == 0:
        result_str = " ".join(f"[{i}]={'VALID' if results[i] else 'INVALID'}" for i in range(N))
        print(f" Results: {result_str}")
    else:
        print(f" gpu_ecdsa_verify_batch: {lib.ufsecp_gpu_error_str(rc).decode()}")

    # 14. Hash160 Batch
    section(14, "GPU Hash160 Batch")
    hashes = (c_uint8 * (N * 20))()
    rc = lib.ufsecp_gpu_hash160_pubkey_batch(gpu, pubkeys, N, hashes)
    if rc == 0:
        for i in range(N):
            h = bytes(hashes[i*20:(i+1)*20])
            print(f" Hash160[{i}]: {hexs(h)}")
    else:
        print(f" gpu_hash160_pubkey_batch: {lib.ufsecp_gpu_error_str(rc).decode()}")

    # 15. MSM
    section(15, "GPU Multi-Scalar Multiplication")
    msm_result = (c_uint8 * 33)()
    rc = lib.ufsecp_gpu_msm(gpu, scalars, pubkeys, N, msm_result)
    if rc == 0:
        print(f" MSM result: {hexs(bytes(msm_result))}")
    else:
        print(f" gpu_msm: {lib.ufsecp_gpu_error_str(rc).decode()}")

    lib.ufsecp_gpu_ctx_destroy(gpu)
    print()

# ── Main ──────────────────────────────────────────────────────────────────

def main():
    print("UltrafastSecp256k1 -- Python Example")
    with Ufsecp() as ctx:
        print(f"ABI version: {ctx.abi_version}")
        print(f"Library: {ctx.version_string()}")
    demo_cpu()
    demo_gpu()
    print("All examples completed successfully.")


if __name__ == '__main__':
    main()