
UltrafastSecp256k1 OpenCL Implementation

Cross-platform GPU acceleration for secp256k1 cryptographic operations.

Features

  • Cross-platform GPU support: Intel, AMD, NVIDIA GPUs via OpenCL 1.2+
  • Zero external dependencies: Only requires OpenCL runtime
  • Full field arithmetic: Addition, subtraction, multiplication, squaring, inversion
  • Point operations: Doubling, addition, scalar multiplication
  • Batch operations: High-throughput batch scalar multiplication with generator
  • Same test vectors: Uses the same test vectors as the CPU and CUDA implementations
  • Branchless operations: Critical operations are branchless for security
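
To illustrate what "branchless" means in the list above, here is a minimal C++ sketch of a mask-based conditional select. It is not taken from this library: the `Limbs` type and the `ct_select` function are hypothetical names, and the four-limb layout is an assumption made only for the example. The idea is that the choice between two values is made with bitwise masks rather than an `if` on the selector.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 256-bit value stored as four 64-bit limbs (assumed layout).
using Limbs = std::array<std::uint64_t, 4>;

// Returns a when flag == 1 and b when flag == 0, without branching on flag.
// The caller must pass exactly 0 or 1.
inline Limbs ct_select(const Limbs& a, const Limbs& b, std::uint64_t flag) {
    // mask is all ones when flag == 1 and all zeros when flag == 0.
    const std::uint64_t mask = std::uint64_t{0} - flag;
    Limbs out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        // Both inputs are always read; the mask decides which bits survive.
        out[i] = (a[i] & mask) | (b[i] & ~mask);
    }
    return out;
}
```

Calling `ct_select(a, b, 1)` yields `a` and `ct_select(a, b, 0)` yields `b`. Note that a compiler is still free to transform such code, so a source-level pattern like this is a starting point rather than a guarantee.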

Requirements

  • Windows: Intel OpenCL runtime (included with Intel GPU driver) or NVIDIA/AMD drivers
  • Linux: the ocl-icd-opencl-dev package plus vendor-specific drivers
  • CMake 3.16+ with Ninja (recommended)

Intel GPU Drivers (Windows)

Download from: https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html

Verify OpenCL Installation

# Windows (PowerShell)
clinfo | Select-String "Platform Name|Device Name"

# Linux
clinfo | grep -E "Platform Name|Device Name"

Building

# Configure (from opencl directory)
cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release

# Build
cmake --build build

# Run tests
./build/opencl_test

# Run benchmarks
./build/opencl_benchmark

API Usage

#include <vector>

#include "secp256k1_opencl.hpp"

using namespace secp256k1::opencl;

// Create context (auto-selects a GPU; Intel devices are preferred here)
DeviceConfig config;
config.prefer_intel = true;
auto ctx = Context::create(config);

// Single operations
FieldElement a = field_from_u64(7);
FieldElement b = field_from_u64(11);
FieldElement c = ctx->field_mul(a, b);  // c = 77

// Scalar multiplication with generator
Scalar k = scalar_from_u64(12345);
JacobianPoint P = ctx->scalar_mul_generator(k);  // P = 12345 * G

// Batch operations (high throughput)
// Fill `scalars` with real values before the call.
std::vector<Scalar> scalars(1000);
std::vector<JacobianPoint> results(scalars.size());
ctx->batch_scalar_mul_generator(scalars.data(), results.data(), scalars.size());

// Convert to affine coordinates
AffinePoint affine = jacobian_to_affine(P);
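
The Jacobian-to-affine step at the end of the example above can be sketched on a toy field. This assumes the common Jacobian convention x = X/Z^2, y = Y/Z^3; the README does not state which convention the library uses. All names here (`kP`, `ToyJacobian`, `ToyAffine`, `toy_jacobian_to_affine`) are hypothetical and are not part of `secp256k1_opencl.hpp`. The modulus is the small prime 2^31 - 1 so that the arithmetic fits in 64-bit integers; secp256k1 itself works over a 256-bit prime.

```cpp
#include <cstdint>

// Toy prime modulus (2^31 - 1), for illustration only.
constexpr std::uint64_t kP = 2147483647ULL;

struct ToyJacobian { std::uint64_t X, Y, Z; };
struct ToyAffine { std::uint64_t x, y; bool infinity; };

// Modular multiplication; reduced operands are below 2^31, so the product fits.
constexpr std::uint64_t mul_mod(std::uint64_t a, std::uint64_t b) {
    return (a % kP) * (b % kP) % kP;
}

// Square-and-multiply modular exponentiation.
constexpr std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp) {
    std::uint64_t result = 1;
    base %= kP;
    while (exp > 0) {
        if (exp & 1) result = mul_mod(result, base);
        base = mul_mod(base, base);
        exp >>= 1;
    }
    return result;
}

// Inverse via Fermat's little theorem: a^(p-2) mod p, valid for a != 0 mod p.
constexpr std::uint64_t inv_mod(std::uint64_t a) {
    return pow_mod(a, kP - 2);
}

// Converts (X, Y, Z) to (X / Z^2, Y / Z^3); Z == 0 encodes the point at infinity.
inline ToyAffine toy_jacobian_to_affine(const ToyJacobian& P) {
    if (P.Z % kP == 0) return ToyAffine{0, 0, true};
    const std::uint64_t z_inv = inv_mod(P.Z);
    const std::uint64_t z_inv2 = mul_mod(z_inv, z_inv);
    const std::uint64_t z_inv3 = mul_mod(z_inv2, z_inv);
    return ToyAffine{mul_mod(P.X, z_inv2), mul_mod(P.Y, z_inv3), false};
}
```

The conversion needs one field inversion, which is the most expensive field operation in the feature list. That is the usual reason results are kept in Jacobian form until affine coordinates are actually required.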

Architecture

opencl/
+-- include/secp256k1_opencl.hpp   # Main API
+-- kernels/                        # OpenCL kernel sources
+-- src/                           # Implementation
+-- tests/                         # Test suite (32+ tests)

Test Vectors

Uses the same test vectors as the CPU implementation. All 32+ tests must pass.