- opencl/kernels/secp256k1_zk.cl: remove #if 0 guard (re-enables 881 lines of
bulletproof code); fix range_verify_full_impl address-space qualifiers:
bp_G / bp_H are now __global const AffinePoint*, and each iteration makes a
private copy (AffinePoint g_pt = bp_G[i]) before passing it to scalar_mul_impl.
- gpu/src/gpu_backend_opencl.cpp: replace Unsupported stub with real dispatch
via range_proof_poly_batch kernel; add bp_poly_batch_ member + cleanup;
update ensure_zk_kernels() to register the new kernel; parse the 324-byte
wire format (4x65-byte uncompressed points + 2x32-byte scalars) into the
RangeProofPolyOCL GPU layout.
- gpu/src/gpu_backend_metal.mm: replace Unsupported stub with real dispatch
via range_proof_poly_batch kernel; build RangeProofPolyMetal (320B) from
324-byte wire format using be32_to_metal_fe / be32_to_metal_scalar helpers.
- docs/BACKEND_ASSURANCE_MATRIX.md: bulletproof row stub->Y for OpenCL+Metal;
parity tracking now shows zero remaining stubs.
- CHANGELOG.md: document bulletproof parity closure.

All three backends (CUDA, OpenCL, Metal) now implement bulletproof_verify_batch.
Zero Unsupported stubs remain in the GPU backend surface.
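
Illustrative sketch of the wire-format split described above. This is not the code in gpu_backend_opencl.cpp or gpu_backend_metal.mm. HostPoint, HostProof, and parse_range_proof_wire are hypothetical names. The field order (four points first, then two scalars) and the 0x04 prefix check are assumptions based on the usual uncompressed SEC1 encoding. Conversion of the big-endian bytes into backend limb layouts (for example via be32_to_metal_fe) is left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Sizes taken from the commit text: 4 x 65-byte points + 2 x 32-byte scalars.
constexpr std::size_t kPointBytes  = 65;  // 0x04 || X (32) || Y (32), assumed SEC1 uncompressed
constexpr std::size_t kScalarBytes = 32;
constexpr std::size_t kNumPoints   = 4;
constexpr std::size_t kNumScalars  = 2;
constexpr std::size_t kWireBytes =
    kNumPoints * kPointBytes + kNumScalars * kScalarBytes;
static_assert(kWireBytes == 324, "wire format must be 324 bytes");

// Hypothetical host-side structs; coordinates stay as big-endian bytes.
struct HostPoint {
    std::uint8_t x[32];
    std::uint8_t y[32];
};

struct HostProof {
    HostPoint    points[kNumPoints];
    std::uint8_t scalars[kNumScalars][kScalarBytes];
};
// Dropping the four 1-byte prefixes gives 324 - 4 = 320 bytes.
static_assert(sizeof(HostProof) == 320, "packed layout should be 320 bytes");

// Splits one 324-byte proof into points and scalars.
// Returns std::nullopt on a wrong length or a point prefix other than 0x04.
std::optional<HostProof> parse_range_proof_wire(const std::uint8_t* data,
                                                std::size_t len) {
    if (data == nullptr || len != kWireBytes) {
        return std::nullopt;
    }
    HostProof out{};
    const std::uint8_t* p = data;
    for (std::size_t i = 0; i < kNumPoints; ++i) {
        if (p[0] != 0x04) {  // assumed uncompressed-point marker
            return std::nullopt;
        }
        std::memcpy(out.points[i].x, p + 1, 32);
        std::memcpy(out.points[i].y, p + 33, 32);
        p += kPointBytes;
    }
    for (std::size_t i = 0; i < kNumScalars; ++i) {
        std::memcpy(out.scalars[i], p, kScalarBytes);
        p += kScalarBytes;
    }
    return out;
}
```

The 320-byte result matches the size quoted for RangeProofPolyMetal only in total; the real struct's field order and limb encoding may differ.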