mineracks-ckbunker-hsm-sign/docs/PROTOCOL.md
mineracks 9d380f5013 Initial import: CKBunker HSM validation harness
WebSocket client + CLI harness + pytest suite that exercises each axis of
a CKBunker + Coldcard Mk4 policy and asserts the expected outcomes, including
the critical negative test that a large PSBT without TOTP is rejected with
a specific 'rule #1: need user(s) confirmation' reason.

Configuration via .env / YAML / CLI flags, two pre-crafted test PSBTs as
fixtures (generation guide in fixtures/README.md), dashboard counter
scraper as sanity check, design rationale in docs/.
2026-04-14 10:50:04 +10:00


CKBunker WebSocket protocol

Target version: CKBunker v0.9.1 (commit 8526755, 2024-08-06). This document is reverse-engineered from the running server + its Vue.js front-end. There is no formal protocol spec upstream — if a newer CKBunker release changes shapes, the client in client.py is where you'll need to adapt.

Connection setup

  1. HTTP GET / — pick up the aiohttp session cookie and the WebSocket URL. The Vue template embeds the URL as /websocket/<TOKEN> — the client's _extract_ws_url greps for that pattern (plus two fallbacks for older spellings).
  2. WebSocket connect to that URL with the session cookie in Cookie:. Without the cookie the server may accept the upgrade but ignore the first action — symptom is a client that hangs forever on _connected.
  3. Optional Cloudflare Access headers (CF-Access-Client-Id, CF-Access-Client-Secret) if the CKBunker is behind CF Access.
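The setup steps above can be sketched with the standard library alone. This is a minimal illustration, not the code in client.py: `extract_ws_path` and `build_ws_headers` are hypothetical names, the token character class is an assumption, and the two fallback spellings handled by the real `_extract_ws_url` are not shown.

```python
import re
from typing import Optional

# Assumption: the token is made of URL-safe characters; the real charset is undocumented.
_WS_PATH_RE = re.compile(r"/websocket/[A-Za-z0-9_\-]+")


def extract_ws_path(html: str) -> Optional[str]:
    """Return the first /websocket/<TOKEN> path found in the page, or None."""
    match = _WS_PATH_RE.search(html)
    return match.group(0) if match else None


def build_ws_headers(cookies: dict,
                     cf_client_id: Optional[str] = None,
                     cf_client_secret: Optional[str] = None) -> dict:
    """Build upgrade headers: session cookie plus optional CF Access service token."""
    headers = {"Cookie": "; ".join(f"{k}={v}" for k, v in cookies.items())}
    if cf_client_id and cf_client_secret:
        headers["CF-Access-Client-Id"] = cf_client_id
        headers["CF-Access-Client-Secret"] = cf_client_secret
    return headers
```

The returned headers would be passed to whichever WebSocket library performs the upgrade; the cookie name itself comes from the HTTP GET response and is not fixed here.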

Cloudflare Access + WebSocket: in practice CF Access with service tokens is unreliable on the WS upgrade. For automation, use a direct private ingress (Tailscale, WireGuard, VPN) rather than the CF-fronted hostname.

Frame format

All frames are JSON objects. Client → server frames have the shape:

{"action": "<action_name>", "args": [...]}
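A client-to-server frame of this shape can be produced by a small helper. The name `make_frame` is hypothetical and only illustrates the serialisation.

```python
import json


def make_frame(action: str, *args) -> str:
    """Serialise one client-to-server frame as a JSON text message."""
    return json.dumps({"action": action, "args": list(args)})
```

For example, `make_frame("_connected", "/")` yields the frame described under `_connected` in the action catalogue.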

Server → client frames have no action key; they carry one or more UI-update fields that the Vue app consumes:

| Server field | Meaning |
| --- | --- |
| vue_app_cb | "Vue app callback": UI state refresh (counters, etc.) |
| show_modal | Render a modal dialog; its html field carries body |
| local_download | Hand the browser a file; used to return signed PSBTs |
| message_signed | (some versions) Returned by sign_message |
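Because server frames carry no action key, a client has to dispatch on which fields are present. A minimal sketch, assuming one frame may carry several of the fields listed above (`server_fields` is a hypothetical helper name):

```python
import json

# Field names taken from the table above.
KNOWN_SERVER_FIELDS = ("vue_app_cb", "show_modal", "local_download", "message_signed")


def server_fields(raw: str) -> list:
    """Return the known UI-update fields present in one server frame, in table order."""
    frame = json.loads(raw)
    if not isinstance(frame, dict):
        raise ValueError("expected a JSON object")
    return [name for name in KNOWN_SERVER_FIELDS if name in frame]
```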

Action catalogue

_connected

Sent once immediately after the WebSocket upgrade. Tells the server which page the client is "on", so it can push the right vue_app_cb refreshes.

{"action": "_connected", "args": ["/"]}

The server replies with one or more vue_app_cb frames describing the current HSM status (approvals, refusals, amount spent, period ends).

upload_psbt

Uploads a PSBT into the server's working slot. The PSBT is base64 and must match the declared SHA-256 — the server rejects mismatches.

{"action": "upload_psbt", "args": [<size_bytes>, "<sha256_hex>", "<base64_psbt>"]}

Response: a vue_app_cb confirming the slot is populated and the preview fields are rendered. No positive acknowledgement besides the UI update.
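Building the three arguments can be sketched as follows. The assumption here, not confirmed by the text above, is that both the size and the SHA-256 refer to the raw binary PSBT rather than to its base64 encoding; `upload_psbt_frame` is a hypothetical name.

```python
import base64
import hashlib
import json


def upload_psbt_frame(psbt_bytes: bytes) -> str:
    """Build an upload_psbt frame from raw (binary) PSBT bytes."""
    # Assumption: size and digest are computed over the raw bytes, not the base64 text.
    size = len(psbt_bytes)
    digest = hashlib.sha256(psbt_bytes).hexdigest()
    encoded = base64.b64encode(psbt_bytes).decode("ascii")
    return json.dumps({"action": "upload_psbt", "args": [size, digest, encoded]})
```

The same `digest` value is what `submit_psbt` later expects as its first argument.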

auth_offer_guess

Offers a TOTP code for the currently-loaded PSBT. The three args are (slot_index, time_window_counter, code_string):

{"action": "auth_offer_guess", "args": [0, 57098745, "579322"]}
  • slot_index=0 — CKBunker supports multiple auth slots for multi-user policies; we only use one.
  • time_window_counter is int(time.time()) // 30, the index of the current 30-second TOTP window. This lets the server tolerate small clock skew without re-running TOTP for every skewed code.
  • code_string — the 6-digit code generated from the shared secret.

Response: usually silent if accepted; on rejection the server holds the code in its internal state and only surfaces "bad code" once you try submit_psbt.
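The three arguments can be assembled as sketched below. This assumes the shared secret is base32-encoded and that the code follows RFC 6238 defaults (HMAC-SHA1, 6 digits, 30-second step); neither detail is stated above, and `totp_code` and `auth_offer_guess_frame` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import struct
import time
from typing import Optional


def totp_code(secret_b32: str, counter: int, digits: int = 6) -> str:
    """RFC 6238 TOTP (HMAC-SHA1) for a given 30-second window counter."""
    key = base64.b32decode(secret_b32.upper())
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset (RFC 4226)
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def auth_offer_guess_frame(secret_b32: str, slot_index: int = 0,
                           now: Optional[float] = None) -> str:
    """Build an auth_offer_guess frame for the current (or given) time."""
    counter = int(time.time() if now is None else now) // 30
    code = totp_code(secret_b32, counter)
    return json.dumps({"action": "auth_offer_guess",
                       "args": [slot_index, counter, code]})
```

Deriving the code and the counter from the same timestamp keeps the two arguments consistent even when the call straddles a window boundary.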

submit_psbt

Commits to signing. The server hands the PSBT to the Coldcard for evaluation.

{"action": "submit_psbt", "args": ["<sha256>", <broadcast>, <finalize>, <download>]}
  • <sha256> — must match the previously-uploaded PSBT.
  • <broadcast> (bool) — have the server push the signed tx to a node. We always send false (we never want the harness to broadcast).
  • <finalize> (bool) — Coldcard combines and finalises, returns raw hex instead of PSBT.
  • <download> (bool) — request the signed bytes back in a local_download frame. We always send true.

Response: one of

  • local_download — success. Fields: data (bytes or hex), is_b64 flag.
  • show_modal with html containing "Rejected" — Coldcard refused. The human-readable reason follows "Rejected:" in the HTML.
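Telling the two outcomes apart can be sketched as below. Several details are assumptions: that `data` and `is_b64` sit inside the `local_download` object, that a false `is_b64` means hex, and that the modal HTML lives either inside `show_modal` or alongside it. `interpret_submit_response` is a hypothetical name.

```python
import base64


def interpret_submit_response(frame: dict):
    """Map a submit_psbt reply to ("signed", bytes), ("rejected", html) or ("other", frame)."""
    if "local_download" in frame:
        download = frame["local_download"]
        raw = download["data"]
        # Assumption: is_b64 true means base64 text; otherwise the payload is hex.
        signed = base64.b64decode(raw) if download.get("is_b64") else bytes.fromhex(raw)
        return ("signed", signed)
    if "show_modal" in frame:
        modal = frame["show_modal"]
        # Assumption: html may be nested in show_modal or be a sibling key.
        body = modal.get("html", "") if isinstance(modal, dict) else frame.get("html", "")
        if "Rejected" in body:
            return ("rejected", body)
    return ("other", frame)
```

Frames that match neither shape (for example an intermediate vue_app_cb refresh) come back as "other", so a caller can keep reading until a terminal frame arrives or its timeout expires.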

sign_message

Message signing on an allowed derivation path:

{"action": "sign_message", "args": ["<text>", "<bip32_path>", "<addr_format>"]}
  • <addr_format> is one of "segwit", "classic", or "p2sh".

Response shapes differ between CKBunker versions:

  • Newer: message_signed frame with {address, signature}.
  • Older: local_download with a three-line body: signature\naddress\nmessage.

The client handles both.
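Handling both shapes might look like the sketch below. It assumes the older `local_download` body is text (base64-decoded first when `is_b64` is set); `parse_signed_message` is a hypothetical name and not necessarily how client.py does it.

```python
import base64
from typing import Optional, Tuple


def parse_signed_message(frame: dict) -> Optional[Tuple[str, str]]:
    """Return (address, signature) from either response shape, or None."""
    if "message_signed" in frame:
        signed = frame["message_signed"]
        return signed["address"], signed["signature"]
    if "local_download" in frame:
        download = frame["local_download"]
        body = download["data"]
        if download.get("is_b64"):
            body = base64.b64decode(body).decode("utf-8")
        # Older shape: signature, address, message; the message may itself contain newlines.
        signature, address, _message = body.split("\n", 2)
        return address, signature
    return None
```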

Response parsing notes

Rejection text

Coldcard rejection reasons come back embedded in a rendered HTML modal. The grammar is stable:

Rejected by Coldcard.
Rejected: <reason[, reason...]>

Common reasons observed:

| Reason | Meaning |
| --- | --- |
| rule #1: need user(s) confirmation | Rule #1 applies, no user auth supplied |
| rule #2: would exceed period spending | Rule #2 cap hit, falls through to Rule #1 |
| bad TOTP code | TOTP was supplied but didn't verify |
| policy refuses this path | Message signing on a disallowed path |
| not enough funds | UTXOs for the PSBT aren't available |
| warnings rejected | PSBT carries a warning and policy doesn't allow |

The harness's SignResult.is_expected_rejection("rule #1") does a case-insensitive substring match so the actual rejection reason can be asserted without overfitting to exact Coldcard firmware wording.
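The grammar and the substring match can be sketched as two standalone functions. These are illustrative stand-ins, not the harness's `SignResult` implementation; the sketch assumes each reason line sits in its own HTML element with no inline tags inside it.

```python
import html
import re


def rejection_reasons(modal_html: str) -> list:
    """Extract the comma-separated reasons that follow "Rejected:" in a modal body."""
    # Turn every tag into a line break so the reason line ends at its closing tag.
    text = html.unescape(re.sub(r"<[^>]+>", "\n", modal_html))
    match = re.search(r"Rejected:\s*(.+)", text)
    if not match:
        return []
    return [part.strip() for part in match.group(1).split(",") if part.strip()]


def has_reason(modal_html: str, needle: str) -> bool:
    """Case-insensitive substring match against any extracted reason."""
    needle = needle.lower()
    return any(needle in reason.lower() for reason in rejection_reasons(modal_html))
```

Matching on a short, stable fragment such as "rule #1" keeps assertions valid if the firmware rewords the rest of the sentence.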

The "Amount Spent" display bug

CKBunker 0.9.1 occasionally renders Amount Spent as the sum of the Rule #1 and Rule #2 period caps instead of actual cumulative spend. The Coldcard's internal velocity counter is authoritative. The harness does not rely on the amount field for any assertion — it checks Approvals and Refusals deltas only, which are accurate.
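A delta check of that kind can be sketched as below. The dictionary keys `approvals` and `refusals` are hypothetical; the actual field names inside the vue_app_cb payload are not documented here.

```python
def counter_deltas(before: dict, after: dict, keys=("approvals", "refusals")) -> dict:
    """Return after-minus-before for each counter, ignoring any amount field."""
    return {key: after[key] - before[key] for key in keys}
```

A rejected sign attempt would then be expected to produce `{"approvals": 0, "refusals": 1}`, whatever the amount display shows.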

Timing

Coldcard signing is fast but not instant — typical round-trip under 1s for small PSBTs, 25s for TOTP-authorised PSBTs. The harness uses a 30-second timeout for sign attempts, 20 seconds for message signing. If you see timeouts regularly, check:

  • USB passthrough is still attached (lsusb | grep d13e on the VM)
  • the Coldcard isn't blocked on a screen prompt (it shouldn't be in HSM mode)
  • ckbunker.service isn't restarting under load
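The timeout handling can be sketched with asyncio. The sketch assumes incoming server frames are placed on an `asyncio.Queue` by a reader task, which is an illustrative design rather than a description of client.py; the constant and function names are hypothetical.

```python
import asyncio

SIGN_TIMEOUT_S = 30.0     # PSBT sign attempts
MESSAGE_TIMEOUT_S = 20.0  # message signing


async def wait_for_reply(queue: asyncio.Queue, timeout: float):
    """Wait for the next server frame, or return None if the timeout expires."""
    try:
        return await asyncio.wait_for(queue.get(), timeout=timeout)
    except asyncio.TimeoutError:
        return None
```

Returning None instead of raising lets the caller record a timeout as a distinct outcome and then work through the checklist above.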

What this protocol can't do

  • No policy introspection over the wire. The installed policy is only visible via the UI (and the Coldcard keypad/MicroSD log). This harness therefore relies on the operator declaring expected thresholds in config.yaml and asserts outcomes against those declared values.
  • No atomic batch sign. Each PSBT is submitted one at a time. The WebSocket can be reused, but each sign_psbt call is independent. This is fine — the Coldcard enforces per-txn limits anyway.
  • No policy change. There is no protocol action for editing the policy. This is intentional; policy changes go through keypad + MicroSD.