mineracks-ckbunker-hsm-sign
Production validation test harness for CKBunker + Coldcard Mk4 HSM deployments.
Runs a short, structured sequence of tests against a live CKBunker and exits non-zero if anything fails. Designed to be run once after setup, periodically from a monitor, or as a CI gate on configuration changes — so a silently-broken policy doesn't stay silent.
The critical test: a transaction above your auto-approve cap is submitted without 2FA. The Coldcard must reject it with a specific `rule #1: need user(s) confirmation` error. If it signs, something is catastrophically wrong with your policy and the harness exits with a loud failure.
📖 See docs/DEMO.md for a full walkthrough against a real rack-mounted production deployment, with screenshots of every test showing the expected UI state. Use it as the reference for "what good looks like" before you run the harness on your own CKBunker.
Table of contents
- What this is, what it isn't
- The test sequence
- Requirements
- Quick start
- Configuration
- Generating test PSBTs
- Running as a CLI
- Running under pytest
- Example output
- Using it as a library
- CI integration
- Design rationale
- Troubleshooting
- Project layout
- License
What this is, what it isn't
Is
- A validation harness for a CKBunker + Coldcard HSM that you already have set up and policy-loaded.
- A reusable WebSocket client library for CKBunker (`ckbunker_hsm_sign/client.py`) that you can import into your own automation (BTCPay plugins, n8n scripts, custom signers).
- A set of pytest tests that assert each axis of the policy works.
Isn't
- Not a setup tool — use upstream CKBunker's docs to get your bunker running and your policy loaded first.
- Not a key / seed tool — it never sees the seed and doesn't try to.
- Not a PSBT creator — you supply the test fixtures. See fixtures/README.md for how to make them.
- Not a broadcaster: `submit_psbt` is always called with `broadcast=False`. Nothing in this harness reaches the mempool.
The test sequence
| # | Test | What it asserts |
|---|---|---|
| 1 | `connectivity` | HTTP on the CKBunker URL answers and exposes a WebSocket path. Session cookie is obtainable. |
| 2 | `message_signing` | An arbitrary test message signs on your policy-allowed BIP32 path. Cheapest Coldcard reachability test. |
| 3 | `rule2_auto_approve` | A PSBT ≤ your auto-approve cap signs without any TOTP. |
| 4 | `rule1_without_totp_rejects` | A PSBT above your auto-approve cap is rejected when no TOTP is supplied. The critical assertion. |
| 5 | `rule1_with_totp_signs` | The same PSBT signs when a fresh TOTP code is submitted. |
| 6 | `counters_tracked` | Server-visible Approvals / Refusals counters moved by the expected amounts during tests 3–5. |
Tests 3–5 together exercise both sides of every policy rule in under a minute.
Tests are independently skippable via config.yaml or the --tests / --skip flags.
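As a rough illustration of how skipping could work, the sketch below filters the six test names from the table above. The `select_tests` helper and its `only` / `skip` parameters are hypothetical names chosen for this example, not the harness's actual internals.

```python
# Hypothetical sketch: the helper name and parameters are illustrative only.
# The test names themselves come from the test sequence table.
ALL_TESTS = [
    "connectivity",
    "message_signing",
    "rule2_auto_approve",
    "rule1_without_totp_rejects",
    "rule1_with_totp_signs",
    "counters_tracked",
]


def select_tests(only=None, skip=None):
    """Return the tests to run, preserving the documented sequence order.

    `only` mimics --tests (run just these); `skip` mimics --skip.
    Unknown names raise ValueError so a typo cannot silently disable a test.
    """
    only = list(only or [])
    skip = list(skip or [])
    unknown = [name for name in only + skip if name not in ALL_TESTS]
    if unknown:
        raise ValueError(f"unknown test name(s): {', '.join(unknown)}")
    selected = [name for name in ALL_TESTS if not only or name in only]
    return [name for name in selected if name not in skip]
```

Rejecting unknown names is a deliberate choice in this sketch: for a safety harness, a misspelled test name should be a configuration error rather than a quiet no-op.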
Requirements
- A running CKBunker (tested against `v0.9.1`, commit `8526755`).
- A Coldcard Mk4 paired to the CKBunker, in HSM mode, with a two-tier policy loaded. The harness's default expectations match the pattern documented in docs/POLICY_RECOMMENDATIONS.md, but the thresholds are configurable.
- Python 3.10+.
- Network access to the CKBunker's private ingress (Tailscale / VPN). The harness works via a Cloudflare-Access-fronted public URL for HTTP but WebSocket signing over CF Access with service tokens is unreliable — see docs/WHY.md.
- Two pre-crafted test PSBTs — see fixtures/README.md.
- The TOTP shared secret for the user named in your policy (required for test 5 only; test 4 runs without it).
Quick start
git clone https://git.mineracks.com/mineracks/mineracks-ckbunker-hsm-sign.git
cd mineracks-ckbunker-hsm-sign
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
$EDITOR .env # set CKBUNKER_URL, TOTP_SECRET, etc.
# Generate or copy in two PSBTs — see fixtures/README.md
# fixtures/small.psbt (≤ auto-approve cap)
# fixtures/large.psbt (> auto-approve cap, ≤ user-auth cap)
./hsm_validate.py
A full run takes 10–30 seconds once the bunker and Coldcard are warm.
Configuration
Three sources, in precedence order (highest wins):
- CLI flags: `--url`, `--tests`, `--skip`, `--verbose`, …
- `config.yaml` (optional), passed via `--config`. See `config.example.yaml`.
- `.env` (auto-loaded from the CWD if present). See `.env.example`.
The same loader is used by pytest, so whatever you configure for the CLI
applies to the test suite too.
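The precedence rule can be pictured as a layered merge. This is a minimal sketch, assuming each source has already been parsed into a plain dict; `merge_settings` and the key names are hypothetical and do not describe the real loader in `ckbunker_hsm_sign/config.py`.

```python
# Hypothetical sketch of "highest wins" precedence: .env < config.yaml < CLI.
def merge_settings(env_values, yaml_values, cli_values):
    """Merge three setting dicts; later sources override earlier ones.

    A value of None means "not provided by this source" and never
    overrides a value supplied by a lower-precedence source.
    """
    merged = {}
    for source in (env_values, yaml_values, cli_values):  # lowest to highest
        for key, value in source.items():
            if value is not None:
                merged[key] = value
    return merged


# Example: the CLI did not pass --url, so config.yaml beats .env.
settings = merge_settings(
    {"url": "http://10.0.0.14:9823", "hsm_user": "alice"},  # from .env
    {"url": "http://10.0.0.15:9823"},                       # from config.yaml
    {"url": None},                                          # CLI flag absent
)
```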
Required settings
| Setting | Source | Required for |
|---|---|---|
| CKBunker URL | `CKBUNKER_URL` / `--url` | all tests |
| Small PSBT | `SMALL_PSBT_PATH` | `rule2_auto_approve` |
| Large PSBT | `LARGE_PSBT_PATH` | `rule1_without_totp_rejects`, `rule1_with_totp_signs` |
| TOTP secret | `TOTP_SECRET` | `rule1_with_totp_signs` |
| HSM user | `HSM_USER` | anywhere that user auth is involved |
Optional settings
| Setting | Source | Purpose |
|---|---|---|
| Cloudflare Access id | `CF_ACCESS_CLIENT_ID` | HTTP through CF Access (not WS) |
| CF Access secret | `CF_ACCESS_CLIENT_SECRET` | HTTP through CF Access (not WS) |
| Message sign path | `MESSAGE_SIGN_PATH` | `message_signing` uses this derivation |
| Message sign address | `MESSAGE_SIGN_ADDRESS` | If set, verified against signature |
| Verbose frames | `--verbose` / `-v` | Dump every WebSocket frame to stdout |
| Save signed PSBTs | `--save-signed <dir>` | Keep the signed outputs for inspection |
Generating test PSBTs
See fixtures/README.md for three methods (Sparrow,
bitcoin-cli, reusing stale UTXOs). The short version:
- Build a watch-only wallet from your Coldcard xpub in Sparrow.
- Construct two payments from that wallet to any address you control:
  - One just under your auto-approve cap (`small.psbt`).
  - One comfortably above the cap but inside the user-auth cap (`large.psbt`).
- Export both as PSBT (binary or base64) into `fixtures/`.
The harness never broadcasts; it signs, optionally writes the signed
result to disk, and discards. large.psbt can be re-used indefinitely —
the rejection path is deterministic regardless of UTXO state.
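Because fixtures may be exported as either binary or base64, it can help to sanity-check a file before pointing the harness at it. The sketch below relies only on the BIP 174 magic bytes (`psbt` followed by `0xff`); the `decode_psbt` and `load_psbt` helpers are hypothetical and are not part of the harness.

```python
import base64
from pathlib import Path

PSBT_MAGIC = b"psbt\xff"  # BIP 174: every binary PSBT starts with these bytes


def decode_psbt(raw: bytes) -> bytes:
    """Return binary PSBT bytes from either binary or base64 input."""
    if raw.startswith(PSBT_MAGIC):
        return raw
    compact = b"".join(raw.split())  # drop newlines and other whitespace
    try:
        decoded = base64.b64decode(compact, validate=True)
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("not a binary or base64 PSBT") from exc
    if not decoded.startswith(PSBT_MAGIC):
        raise ValueError("base64 decoded, but PSBT magic bytes are missing")
    return decoded


def load_psbt(path) -> bytes:
    """Read a fixture file and normalise it to binary PSBT bytes."""
    return decode_psbt(Path(path).read_bytes())
```

This only checks the envelope; it does not parse the transaction or confirm that the amounts sit on the intended side of your policy caps.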
Running as a CLI
# Full run
./hsm_validate.py
# With a config file
./hsm_validate.py --config config.yaml
# Override a single setting
./hsm_validate.py --url http://10.0.0.14:9823
# Only the critical negative test
./hsm_validate.py --tests rule1_without_totp_rejects
# Everything except the TOTP sign test (e.g. during TOTP rotation)
./hsm_validate.py --skip rule1_with_totp_signs
# Very verbose (dumps every WebSocket frame)
./hsm_validate.py --verbose
# Save signed PSBTs for inspection
./hsm_validate.py --save-signed /tmp/hsm-validate-signed
Exit codes:
- `0`: all enabled tests passed (or were skipped).
- `1`: at least one test failed.
- `2`: configuration error.
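If you drive the CLI from another Python process, the three exit codes are the only contract you need. The wrapper below is a sketch under that assumption; `describe_exit` and `run_validation` are hypothetical names, and the default command path assumes you run it from the repository root.

```python
import subprocess

# Meanings taken from the exit code list above.
EXIT_MEANINGS = {
    0: "all enabled tests passed (or were skipped)",
    1: "at least one test failed",
    2: "configuration error",
}


def describe_exit(code: int) -> str:
    """Translate a harness exit code into a human-readable summary."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_validation(argv=("./hsm_validate.py",)):
    """Run the CLI and return (exit_code, summary, captured_stdout)."""
    completed = subprocess.run(list(argv), capture_output=True, text=True)
    return completed.returncode, describe_exit(completed.returncode), completed.stdout
```

Treating any code outside 0, 1 and 2 as unexpected keeps a crashed interpreter or a killed process from being mistaken for a clean run.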
Running under pytest
pip install pytest pytest-asyncio
pytest -v tests/
The pytest session reads the same .env / config.yaml that the CLI does.
Each test file corresponds to one test in the CLI sequence:
tests/test_01_connectivity.py
tests/test_02_message_signing.py
tests/test_03_rule2_auto_approve.py
tests/test_04_rule1_without_totp_rejects.py ← the critical negative test
tests/test_05_rule1_with_totp_signs.py
tests/test_06_counters_tracked.py
Run only the critical test:
pytest -v tests/test_04_rule1_without_totp_rejects.py
Example output
Target: http://100.80.63.14:9823
User: mineracks
Policy: ≤10000 sats auto, ≤100000 sats with TOTP
────────────────────────────────────────────────────────────────────────
✓ connectivity HTTP + WS endpoint reachable (0.3s)
WebSocket URL: ws://100.80.63.14:9823/websocket/CBG5KH5BCCG6W3BXDH5QQY5Q
Session cookies: yes
✓ message_signing signed via Coldcard (0.9s)
Address: bc1qy926zzc4yw8f0gd6tvdy2fm0hr4a4tx3u4963h
Signature: JyeJVJuBuVB0M79FFDLrfz10j7NtGRSac+7Oj0dpyZ/MePoh...
✓ rule2_auto_approve signed without TOTP (395 bytes) (1.1s)
✓ rule1_without_totp_rejects rejected as expected — Rejected: rule #1: need user(s) confirmation, rule #2: would exceed period spending (1.2s)
✓ rule1_with_totp_signs signed with TOTP (395 bytes) (1.4s)
✓ counters_tracked dashboard counters moved as expected (0.4s)
Approvals: 2 → 4 (Δ2)
Refusals: 0 → 1 (Δ1)
Amount spent: 0.00009 → 0.00109 BTC
────────────────────────────────────────────────────────────────────────
6 passed, 0 failed, 0 skipped
A failure — the one you actually want to catch — looks like this:
✗ rule1_without_totp_rejects policy NOT enforced: large PSBT was signed without TOTP — STOP AND INVESTIGATE
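The counter check in the sample output can be reasoned about as simple bookkeeping. The sketch below assumes each signed PSBT adds one to Approvals and each rejected PSBT adds one to Refusals, which is consistent with the Δ2 / Δ1 shown above but is not confirmed behaviour of every CKBunker version. The function names and outcome labels are hypothetical.

```python
def expected_deltas(outcomes):
    """Derive expected counter movement from PSBT test outcomes.

    `outcomes` maps test name -> "signed", "rejected" or "skipped".
    Assumption: one Approval per signed PSBT, one Refusal per rejection.
    """
    values = list(outcomes.values())
    return {
        "approvals": values.count("signed"),
        "refusals": values.count("rejected"),
    }


def counters_moved_as_expected(before, after, outcomes):
    """Compare observed dashboard deltas against the expected ones."""
    want = expected_deltas(outcomes)
    return all(after[key] - before[key] == want[key] for key in want)


# Numbers from the example run: Approvals 2 -> 4, Refusals 0 -> 1.
example_ok = counters_moved_as_expected(
    before={"approvals": 2, "refusals": 0},
    after={"approvals": 4, "refusals": 1},
    outcomes={
        "rule2_auto_approve": "signed",
        "rule1_without_totp_rejects": "rejected",
        "rule1_with_totp_signs": "signed",
    },
)
```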
Using it as a library
The WebSocket client is reusable standalone:
import asyncio
from pathlib import Path
from ckbunker_hsm_sign import Client

async def main():
    client = Client(
        base_url="http://100.80.63.14:9823",
        totp_secret="JBSWY3DPEHPK3PXP",
    )
    # The session keeps one WebSocket open for all calls inside the block.
    async with client.session() as session:
        psbt = Path("mytx.psbt").read_bytes()
        result = await session.sign_psbt(psbt, use_totp=True)
        if result.ok():
            Path("signed.psbt").write_bytes(result.signed_bytes)
        else:
            print("sign failed:", result.status.value, result.reason)

asyncio.run(main())
Batch signing is just sequential sign calls inside the same session — the WebSocket stays open.
See docs/PROTOCOL.md for the full protocol reference.
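A batch helper along those lines might look like the sketch below. It uses only the `sign_psbt(psbt, use_totp=...)`, `ok()` and `signed_bytes` names shown in the example above; `sign_batch`, `FakeSession` and `FakeResult` are hypothetical, and the fakes exist only so the sketch can be dry-run without a Coldcard attached.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Stand-in for a sign result; echoes the input as 'signed' bytes."""
    signed_bytes: bytes

    def ok(self):
        return True


class FakeSession:
    """Stand-in for an open session; records calls instead of signing."""

    def __init__(self):
        self.calls = []

    async def sign_psbt(self, psbt, use_totp=False):
        self.calls.append((psbt, use_totp))
        return FakeResult(signed_bytes=psbt)


async def sign_batch(session, psbts, use_totp=False):
    """Sign PSBTs one after another on an already-open session."""
    results = []
    for psbt in psbts:
        # Sequential on purpose: each request waits for the previous decision.
        results.append(await session.sign_psbt(psbt, use_totp=use_totp))
    return results


fake = FakeSession()
batch = asyncio.run(sign_batch(fake, [b"one", b"two"]))
```

With the real client you would pass the `session` object from `async with client.session()` instead of `FakeSession()`.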
CI integration
The CLI exits 0/1/2, which is all a CI runner needs. Minimal examples:
Gitea Actions / GitHub Actions
name: validate-hsm
on:
  schedule: [{ cron: "0 6 * * *" }]  # 6 AM daily
  workflow_dispatch:
jobs:
  validate:
    runs-on: self-hosted  # needs Tailscale access
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: "3.12" }
      - run: pip install -r requirements.txt
      - run: ./hsm_validate.py
        env:
          CKBUNKER_URL: ${{ secrets.CKBUNKER_URL }}
          TOTP_SECRET: ${{ secrets.TOTP_SECRET }}
          SMALL_PSBT_PATH: fixtures/small.psbt
          LARGE_PSBT_PATH: fixtures/large.psbt
Cron / oncall monitor
# Every hour, email oncall if anything fails
17 * * * * cd /opt/hsm-validate && ./hsm_validate.py >/tmp/hsm.out 2>&1 || mail -s "HSM validation FAILED" oncall@example.com < /tmp/hsm.out
Woodpecker / Drone
steps:
  - name: validate
    image: python:3.12
    commands:
      - pip install -r requirements.txt
      - ./hsm_validate.py
    secrets: [ ckbunker_url, totp_secret ]
Design rationale
Full reasoning lives in docs/WHY.md. Short version:
- Explicit rejection assertions, not "sign succeeded / no error". Policy failures are silent unless you check for the specific rejection reason.
- Two-tier policy as the default assumption: auto-approve under X, TOTP under Y, reject above. This matches what most HSM-backed Bitcoin operations look like; adjust thresholds in config.
- Pre-crafted fixtures instead of PSBT generation — keeps the harness deployment-agnostic and avoids needing the Coldcard's xpub / spendable UTXOs at harness-build time.
- Hand-rolled WebSocket client: upstream CKBunker doesn't ship a Python client library, and the `ckbunker` console script has a broken import path in v0.9.1.
- No broadcast, ever: the harness always calls `submit_psbt` with `broadcast=False`. A validation run doesn't touch the mempool.
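To make the first point concrete, here is a sketch of what an explicit rejection assertion could look like. The expected reason string is the one quoted earlier in this README; `classify_rejection_test` and its return shape are hypothetical and do not mirror the harness's real reporter.

```python
EXPECTED_REASON = "rule #1: need user(s) confirmation"


def classify_rejection_test(signed_ok, reason):
    """Judge the 'large PSBT without TOTP' test.

    Passing requires BOTH that the PSBT was not signed and that the
    rejection names the expected rule. A timeout or an unrelated error
    is not accepted as proof that the policy is enforced.
    """
    if signed_ok:
        return "FAIL", "policy NOT enforced: large PSBT was signed without TOTP"
    if EXPECTED_REASON not in (reason or ""):
        return "FAIL", f"rejected for an unexpected reason: {reason!r}"
    return "PASS", "rejected as expected"
```

The middle branch is the one that a plain "did signing fail?" check would miss: a disconnected Coldcard also fails to sign, but it tells you nothing about the policy.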
Troubleshooting
"HTTP fetch failed: 403"
You're hitting a Cloudflare-Access-protected URL without service token
credentials. Either set CF_ACCESS_CLIENT_ID + CF_ACCESS_CLIENT_SECRET
or switch CKBUNKER_URL to the private ingress (Tailscale IP).
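For reference, Cloudflare Access service tokens are sent as the `CF-Access-Client-Id` and `CF-Access-Client-Secret` request headers. The sketch below builds them from the two environment variables named above; `cf_access_headers` and `build_request` are hypothetical helpers written for this example, not functions exported by the harness.

```python
import os
import urllib.request


def cf_access_headers(env=os.environ):
    """Return Cloudflare Access service-token headers, or {} if unset.

    Both values must be present; a half-configured pair is treated as
    absent so the request fails clearly instead of sending one header.
    """
    client_id = env.get("CF_ACCESS_CLIENT_ID")
    client_secret = env.get("CF_ACCESS_CLIENT_SECRET")
    if not (client_id and client_secret):
        return {}
    return {
        "CF-Access-Client-Id": client_id,
        "CF-Access-Client-Secret": client_secret,
    }


def build_request(url, env=os.environ):
    """Build (but do not send) an HTTP request carrying the token headers."""
    return urllib.request.Request(url, headers=cf_access_headers(env))
```

This covers the HTTP leg only. As noted under Requirements, WebSocket signing through CF Access is unreliable, so the private ingress remains the safer choice for full runs.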
"timeout: no decision within 30s"
- Coldcard is not responding: check `lsusb | grep Coinkite` on the VM.
- CKBunker is running but the Coldcard was detached after VM boot. Re-attach USB passthrough.
- `ckbunker.service` is in a restart loop. Check `journalctl -u ckbunker`.
rule1_without_totp_rejects → FAIL: "policy NOT enforced"
Stop the harness. Immediately verify the policy on the Coldcard:
- Exit HSM mode via the Boot-to-HSM escape code (press `X`, code, `✔` within 60s of power on).
- Menu → Advanced → HSM → review installed policy.
- If the policy is missing or the user-auth rule is gone, reload it from your policy YAML via MicroSD.
message_signing passes but PSBT tests fail
The Coldcard is reachable but is probably in an unexpected state. Check the Coldcard's own screen for an error banner. This is usually solved by a service restart:
sudo systemctl restart ckbunker
Counters test skipped
Your CKBunker version renders the dashboard differently from what the scraper's regexes expect. This is a soft skip — the signing tests already prove correctness. File an issue with the page HTML if you want scraper support for your version.
"TOTP_SECRET not configured" but I set it
TOTP_SECRET must be a base32 secret (usually 16+ chars, letters A-Z
and digits 2-7). If you stored a QR-code URL, extract the secret=…
parameter from it.
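If you are unsure whether the stored value is a bare secret or a full QR-code URL, a small normaliser can settle it. This is a minimal sketch assuming the common `otpauth://` key URI layout with a `secret=` query parameter; `normalise_totp_secret` is a hypothetical helper, not part of the harness.

```python
import base64
from urllib.parse import parse_qs, urlparse


def normalise_totp_secret(value: str) -> str:
    """Return a bare, upper-case base32 secret from a secret or otpauth URL."""
    value = value.strip()
    if value.lower().startswith("otpauth://"):
        params = parse_qs(urlparse(value).query)
        if "secret" not in params:
            raise ValueError("otpauth URL has no secret= parameter")
        value = params["secret"][0]
    secret = value.replace(" ", "").upper()
    if not secret:
        raise ValueError("TOTP secret is empty")
    # base32 input must be a multiple of 8 characters; pad before checking.
    padded = secret + "=" * (-len(secret) % 8)
    try:
        base64.b32decode(padded)
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("TOTP secret is not valid base32") from exc
    return secret
```

Run it once by hand and paste the result into `.env`; the point is to catch a malformed secret before test 5 reports a confusing signing failure.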
Project layout
.
├── README.md ← this file
├── LICENSE ← MIT
├── requirements.txt
├── pyproject.toml ← optional `pip install -e .`
├── .env.example ← environment variable template
├── config.example.yaml ← YAML config template
├── hsm_validate.py ← CLI entry point
│
├── ckbunker_hsm_sign/ ← library
│ ├── __init__.py
│ ├── client.py ← WebSocket + HTTP client
│ ├── config.py ← .env + YAML loader
│ ├── harness.py ← CLI test runner / reporter
│ └── scraper.py ← dashboard counter scraper
│
├── tests/ ← pytest suite (same tests, different runner)
│ ├── conftest.py
│ ├── test_01_connectivity.py
│ ├── test_02_message_signing.py
│ ├── test_03_rule2_auto_approve.py
│ ├── test_04_rule1_without_totp_rejects.py ← the critical negative test
│ ├── test_05_rule1_with_totp_signs.py
│ └── test_06_counters_tracked.py
│
├── fixtures/
│ └── README.md ← how to generate test PSBTs
│
└── docs/
├── DEMO.md ← full demo against a real production deployment
├── PROTOCOL.md ← CKBunker WebSocket protocol reference
├── WHY.md ← design rationale
├── POLICY_RECOMMENDATIONS.md ← how to design a two-tier policy
└── images/ ← screenshots used in DEMO.md
License
MIT — see LICENSE.
This project is not affiliated with Coinkite or the Coldcard team. "Coldcard" and "CKBunker" are products of Coinkite Inc. This harness is an independent validation tool.