mineracks-ckbunker-hsm-sign/docs/PROTOCOL.md
mineracks 9d380f5013 Initial import: CKBunker HSM validation harness
WebSocket client + CLI harness + pytest suite that exercises each axis of
a CKBunker + Coldcard Mk4 policy and asserts the expected outcomes, including
the critical negative test that a large PSBT without TOTP is rejected with
a specific 'rule #1: need user(s) confirmation' reason.

Configuration via .env / YAML / CLI flags, two pre-crafted test PSBTs as
fixtures (generation guide in fixtures/README.md), dashboard counter
scraper as sanity check, design rationale in docs/.
2026-04-14 10:50:04 +10:00

# CKBunker WebSocket protocol
**Target version**: CKBunker `v0.9.1` (commit `8526755`, 2024-08-06).
This document is reverse-engineered from the running server + its Vue.js
front-end. There is no formal protocol spec upstream — if a newer CKBunker
release changes shapes, the client in [`client.py`](../ckbunker_hsm_sign/client.py)
is where you'll need to adapt.
## Connection setup
1. **HTTP GET `/`** — pick up the aiohttp session cookie and the WebSocket
URL. The Vue template embeds the URL as `/websocket/<TOKEN>` — the
client's `_extract_ws_url` greps for that pattern (plus two fallbacks
for older spellings).
2. **WebSocket connect** to that URL with the session cookie in `Cookie:`.
Without the cookie the server may accept the upgrade but ignore the
first action — symptom is a client that hangs forever on `_connected`.
3. Optional Cloudflare Access headers (`CF-Access-Client-Id`,
`CF-Access-Client-Secret`) if the CKBunker is behind CF Access.
> **Cloudflare Access + WebSocket**: in practice CF Access with *service
> tokens* is unreliable on the WS upgrade. For automation, use a direct
> private ingress (Tailscale, WireGuard, VPN) rather than the CF-fronted
> hostname.
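
The first two steps can be sketched as below. This is a minimal illustration, not the harness's `_extract_ws_url`: the helper names and the token alphabet in the regex are assumptions, and the older-spelling fallbacks are left out.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed token alphabet; the real token format is not documented upstream.
_WS_PATH = re.compile(r"/websocket/[A-Za-z0-9_\-]+")


def extract_ws_path(page_html):
    """Return the first /websocket/<TOKEN> path found in the page, or None."""
    match = _WS_PATH.search(page_html)
    return match.group(0) if match else None


def to_ws_url(base_url, ws_path):
    """Swap http(s) for ws(s) and attach the WebSocket path."""
    parts = urlsplit(base_url)
    scheme = "wss" if parts.scheme == "https" else "ws"
    return urlunsplit((scheme, parts.netloc, ws_path, "", ""))
```

The session cookie from the initial GET still has to be sent in the `Cookie:` header of the upgrade request; how that is done depends on the WebSocket library in use.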
## Frame format
All frames are JSON objects. Client → server frames have the shape:
```json
{"action": "<action_name>", "args": [...]}
```
Server → client frames have no `action` key; they carry one or more
UI-update fields that the Vue app consumes:
| Server field | Meaning |
|--------------------|--------------------------------------------------------|
| `vue_app_cb` | "Vue app callback" — UI state refresh (counters, etc.) |
| `show_modal` | Render a modal dialog; its `html` field carries body |
| `local_download` | Hand the browser a file; used to return signed PSBTs |
| `message_signed` | (some versions) Returned by `sign_message` |
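
As a rough sketch of the framing rules above, a client can build outgoing frames from an action name plus positional args, and classify incoming frames by which of the known fields they carry. The helper names here are hypothetical and not taken from `client.py`.

```python
import json

# Server-side fields listed in the table above.
SERVER_FIELDS = ("vue_app_cb", "show_modal", "local_download", "message_signed")


def make_frame(action, *args):
    """Serialise a client -> server frame."""
    return json.dumps({"action": action, "args": list(args)})


def server_fields(raw):
    """Return the known UI-update fields present in a server frame."""
    frame = json.loads(raw)
    return [name for name in SERVER_FIELDS if name in frame]
```

A frame can carry more than one field, so `server_fields` returns a list rather than a single tag.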
## Action catalogue
### `_connected`
Sent once immediately after the WebSocket upgrade. Tells the server which
page the client is "on", so it can push the right `vue_app_cb` refreshes.
```json
{"action": "_connected", "args": ["/"]}
```
The server replies with one or more `vue_app_cb` frames describing the
current HSM status (approvals, refusals, amount spent, period ends).
### `upload_psbt`
Uploads a PSBT into the server's working slot. The PSBT is base64 and
must match the declared SHA-256 — the server rejects mismatches.
```json
{"action": "upload_psbt", "args": [<size_bytes>, "<sha256_hex>", "<base64_psbt>"]}
```
Response: a `vue_app_cb` confirming the slot is populated and the
preview fields are rendered. No positive acknowledgement besides the UI
update.
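
A minimal sketch of assembling the three arguments from raw PSBT bytes. It assumes the size and SHA-256 are computed over the decoded binary PSBT rather than over the base64 text; verify that against a live server before relying on it. The function name is hypothetical.

```python
import base64
import hashlib


def upload_psbt_frame(psbt_bytes):
    """Build an upload_psbt frame from binary PSBT bytes."""
    return {
        "action": "upload_psbt",
        "args": [
            len(psbt_bytes),                                # size in bytes (assumed: raw, not base64)
            hashlib.sha256(psbt_bytes).hexdigest(),         # digest of the raw bytes (assumed)
            base64.b64encode(psbt_bytes).decode("ascii"),   # payload as base64 text
        ],
    }
```

The hex digest is worth keeping around, since `submit_psbt` expects the same value.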
### `auth_offer_guess`
Offers a TOTP code for the currently-loaded PSBT. The three args are
`(slot_index, time_window_counter, code_string)`:
```json
{"action": "auth_offer_guess", "args": [0, 57098745, "579322"]}
```
- `slot_index=0` — CKBunker supports multiple auth slots for multi-user
policies; we only use one.
- `time_window_counter`: `int(time.time()) // 30`, the index of the 30-second
window the code was generated in. This lets the server tolerate small
clock skew without re-running TOTP for every skewed code.
- `code_string` — the 6-digit code generated from the shared secret.
Response: usually silent if accepted; on rejection the server holds the
code in its internal state and only surfaces "bad code" once you try
`submit_psbt`.
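
The argument triple can be produced with the standard library alone. This sketch assumes a base32-encoded shared secret and the common RFC 6238 defaults (HMAC-SHA1, 6 digits, 30-second step); the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp_code(secret_b32, counter, digits=6):
    """RFC 6238 TOTP (HMAC-SHA1) for an explicit time-step counter."""
    key = base64.b32decode(secret_b32, casefold=True)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def auth_offer_guess_frame(secret_b32, now=None, slot_index=0):
    """Build the frame; the counter and the code come from the same instant."""
    counter = int(time.time() if now is None else now) // 30
    code = totp_code(secret_b32, counter)
    return {"action": "auth_offer_guess", "args": [slot_index, counter, code]}
```

Deriving the counter once and reusing it for both the code and the frame avoids a mismatch if the call straddles a window boundary.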
### `submit_psbt`
Commits to signing. The server hands the PSBT to the Coldcard for
evaluation.
```json
{"action": "submit_psbt", "args": ["<sha256>", <broadcast>, <finalize>, <download>]}
```
- `<sha256>` — must match the previously-uploaded PSBT.
- `<broadcast>` (bool) — have the server push the signed tx to a node. We
always send `false` (we never want the harness to broadcast).
- `<finalize>` (bool) — Coldcard combines and finalises, returns raw hex
instead of PSBT.
- `<download>` (bool) — request the signed bytes back in a
`local_download` frame. We always send `true`.
Response: one of
- `local_download` — success. Fields: `data` (bytes or hex), `is_b64` flag.
- `show_modal` with `html` containing `"Rejected"` — Coldcard refused.
The human-readable reason follows "Rejected:" in the HTML.
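
A sketch of building the frame with the harness's fixed choices (`broadcast=false`, `download=true`) and sorting the reply into signed or rejected. The exact nesting of `data`, `is_b64`, and `html` inside the server frame is an assumption here, as are the function names; adjust the lookups to whatever the running server actually sends.

```python
import base64


def submit_psbt_frame(sha256_hex, finalize=False):
    """Build submit_psbt with broadcast=False and download=True."""
    return {"action": "submit_psbt", "args": [sha256_hex, False, finalize, True]}


def interpret_submit_response(frame):
    """Return ("signed", bytes), ("rejected", html) or ("other", None)."""
    if "local_download" in frame:
        payload = frame["local_download"]  # assumed to be an object with data/is_b64
        data = payload["data"]
        if payload.get("is_b64"):
            data = base64.b64decode(data)
        elif isinstance(data, str):
            data = bytes.fromhex(data)     # assumed: non-base64 text is hex
        return "signed", data
    if "show_modal" in frame:
        modal = frame["show_modal"]
        # html may sit inside the modal object or beside it (assumption).
        body = modal.get("html", "") if isinstance(modal, dict) else frame.get("html", "")
        if "Rejected" in body:
            return "rejected", body
    return "other", None
```

Frames that are neither (for example a plain `vue_app_cb` refresh) come back as `"other"` so the caller can keep reading until one of the two terminal outcomes arrives.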
### `sign_message`
Message signing on an allowed derivation path:
```json
{"action": "sign_message", "args": ["<text>", "<bip32_path>", "<addr_format>"]}
```
- `<addr_format>`: one of `"segwit"`, `"classic"`, or `"p2sh"`.
Response shapes differ between CKBunker versions:
- Newer: `message_signed` frame with `{address, signature}`.
- Older: `local_download` with a three-line body: `signature\naddress\nmessage`.
The client handles both.
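
Handling both shapes might look like the sketch below. It assumes the older three-line body arrives as plain text in the `data` field of `local_download`; the function name is hypothetical and this is not the code in `client.py`.

```python
def parse_signed_message(frame):
    """Return (address, signature) from either response shape, else None."""
    if "message_signed" in frame:
        signed = frame["message_signed"]
        return signed["address"], signed["signature"]
    if "local_download" in frame:
        body = frame["local_download"]["data"]  # assumed: plain-text body
        # Older layout: signature, address, then the message itself.
        signature, address, _message = body.split("\n", 2)
        return address, signature
    return None
```

Splitting with a maximum of two splits keeps a multi-line message intact in the third element.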
## Response parsing notes
### Rejection text
Coldcard rejection reasons come back embedded in a rendered HTML modal. The
grammar is stable:
```
Rejected by Coldcard.
Rejected: <reason[, reason...]>
```
Common reasons observed:
| Reason | Meaning |
|-----------------------------------------------------------|-------------------------------------------------|
| `rule #1: need user(s) confirmation` | Rule #1 applies, no user auth supplied |
| `rule #2: would exceed period spending` | Rule #2 cap hit, falls through to Rule #1 |
| `bad TOTP code` | TOTP was supplied but didn't verify |
| `policy refuses this path` | Message signing on a disallowed path |
| `not enough funds` | UTXOs for the PSBT aren't available |
| `warnings rejected` | PSBT carries a warning and policy doesn't allow |
The harness's `SignResult.is_expected_rejection("rule #1")` does a
case-insensitive substring match so the actual rejection reason can be
asserted without overfitting to exact Coldcard firmware wording.
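
To illustrate the grammar and the matching approach, here is a rough sketch that strips the modal's tags, pulls the comma-separated reasons after `Rejected:`, and does a case-insensitive substring check. Both function names are hypothetical; this is not the implementation of `SignResult.is_expected_rejection`.

```python
import html
import re


def rejection_reasons(modal_html):
    """Extract the reason list that follows 'Rejected:' in the modal body."""
    # Replace tags with newlines so each block element lands on its own line.
    text = html.unescape(re.sub(r"<[^>]+>", "\n", modal_html))
    match = re.search(r"Rejected:\s*(.+)", text)
    if not match:
        return []
    return [part.strip() for part in match.group(1).split(",") if part.strip()]


def matches_reason(modal_html, expected):
    """Case-insensitive substring match against any extracted reason."""
    needle = expected.lower()
    return any(needle in reason.lower() for reason in rejection_reasons(modal_html))
```

Because the leading `Rejected by Coldcard.` line has no colon after `Rejected`, the regex skips it and anchors on the reason line.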
### The "Amount Spent" display bug
CKBunker 0.9.1 occasionally renders `Amount Spent` as the sum of the Rule #1
and Rule #2 period caps instead of actual cumulative spend. The Coldcard's
internal velocity counter is authoritative. The harness does **not** rely
on the amount field for any assertion — it checks `Approvals` and
`Refusals` deltas only, which are accurate.
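
The delta check can be as simple as the sketch below, which compares two dashboard snapshots and deliberately ignores the amount field. The dictionary keys are illustrative assumptions, not the field names the dashboard scraper actually produces.

```python
def counter_deltas(before, after, fields=("approvals", "refusals")):
    """Return the per-counter change between two dashboard snapshots."""
    return {name: after[name] - before[name] for name in fields}
```

For a rejected PSBT the expected result would be `{"approvals": 0, "refusals": 1}`, whatever the `Amount Spent` display shows.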
## Timing
Coldcard signing is fast but not instant — typical round-trip under 1s for
small PSBTs, 2–5s for TOTP-authorised PSBTs. The harness uses a 30-second
timeout for sign attempts, 20 seconds for message signing. If you see
timeouts regularly, check:
- USB passthrough is still attached (`lsusb | grep d13e` on the VM)
- the Coldcard isn't blocked on a screen prompt (it shouldn't be in HSM mode)
- `ckbunker.service` isn't restarting under load
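
One way to apply those deadlines with `asyncio` is sketched below. The constant and function names are hypothetical; returning `None` on timeout is a design choice for the example, and the harness itself may raise instead.

```python
import asyncio

SIGN_TIMEOUT_S = 30      # PSBT sign attempts
MESSAGE_TIMEOUT_S = 20   # message signing


async def with_timeout(awaitable, seconds):
    """Await with a deadline; return None on timeout instead of raising."""
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError:
        return None
```

A caller would wrap its "read frames until a terminal response" coroutine, for example `await with_timeout(read_until_done(), SIGN_TIMEOUT_S)`, where `read_until_done` stands in for whatever receive loop the client uses.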
## What this protocol can't do
- **No policy introspection over the wire.** The installed policy is only
visible via the UI (and the Coldcard keypad/MicroSD log). This harness
therefore relies on the operator declaring expected thresholds in
`config.yaml` and asserts outcomes against those declared values.
- **No atomic batch sign.** Each PSBT is submitted one at a time. The
WebSocket can be reused, but each sign_psbt call is independent. This is
fine — the Coldcard enforces per-txn limits anyway.
- **No policy change.** There is no protocol action for editing the
policy. This is intentional; policy changes go through keypad + MicroSD.