# Policy design recommendations

This is not exhaustive; consult the upstream [Coldcard HSM docs](https://coldcardwallet.com/docs/ckbunker-hsm) for the full grammar. This file captures the two-tier pattern the harness is designed around and why it's a reasonable starting point for a signing HSM that backs automation.

## The two-tier pattern

```
Rule #2 (auto-sign, no user auth)
    per-txn ≤ X sats
    period  ≤ N × X sats    (N ≈ 5, so a handful of small sends per window)

Rule #1 (user-auth via TOTP)
    per-txn ≤ Y sats        (Y ≫ X, but still a small fraction of custody)
    period  ≤ M × Y sats    (implicit)

Rule #3: anything else is rejected; on-device keypad/MicroSD required to authorise.
```

### Why two tiers

- **Single-tier "always require TOTP"** makes the HSM useless for automation: every BTCPay callback, every n8n webhook, every monitoring script wakes a human.
- **Single-tier "always auto-sign"** is indistinguishable from a hot wallet with extra steps.
- Two tiers let routine small sends go through untouched while keeping human-in-the-loop pressure on anything larger.

### Picking X (auto-approve cap)

Rule of thumb: X is the **most expensive single automated action** you're comfortable with happening unattended. Examples:

| Automation                             | Sensible X (sats) |
|----------------------------------------|-------------------|
| Lightning channel rebalance            | 50,000 – 200,000  |
| BTCPay invoice settlement              | 10,000 – 50,000   |
| Routine small withdrawals (newsletter) | 5,000 – 20,000    |
| Dev test sends                         | 1,000 – 5,000     |

Pick **the smallest X** that covers your routine traffic. Anything larger is a Rule #1 event, worth waking the TOTP holder for.

### Picking N (period multiplier)

- Too low (N=1): the first sign empties the period budget, and the second sign fails even though it's within the per-txn cap.
- Too high (N≥10): an attacker who steals the VM can drain the budget faster than a human will notice.
- Reasonable: N = 3 to 5.
  Combined with a 24 h velocity window, this caps the *catastrophic* loss from a VM compromise at N×X per day (~5×X at the top of that range).

### Picking Y (user-auth cap)

Y is a hard ceiling on what TOTP alone can authorise. For sends above Y, the only path is keypad + MicroSD, which means physical presence at the device.

Common shapes:

- **Operational float** wallet: Y = 10×X. Big enough to cover a busy day; small enough that losing the TOTP secret isn't an existential problem.
- **Hot reserve**: Y = 0 (no Rule #1). Forces all non-routine sends through physical presence.

## Velocity period

The Coldcard resets counters after `velocity_minutes` of wall-clock time. 1440 (24 h) is the standard choice.

Shorter windows (60–240 min) make the HSM safer during active use but noisier during quiet periods (routine sends hit the reset mid-day). Longer windows (> 24 h) make a compromise more painful to recover from (stolen budget persists).

## Message signing

Useful for:

- proving control of an address to auditors / regulators
- proof-of-reserves (signed message with timestamp)
- sanity-checking Coldcard reachability (the harness's `message_signing` test)

Usually safe to enable on any path, since message signing doesn't spend funds. If you need to restrict it, the policy supports a BIP32 path regex.

## Boot-to-HSM

**Always enable** for production. Without it, anyone with physical access to the device (and the PIN) can navigate out of HSM mode by tapping the menu.

**Always set a 6-digit escape code.** A device that can never leave HSM mode is terrifying and operationally wrong: you will need to enrol new users, update policy, etc. The escape code must be typed within 60 seconds of Coldcard boot, which is a reasonable safety margin.

**Record the escape code in a separate place from the seed backup.** A password manager on the TOTP holder's phone is fine; the same piece of paper as the seed words is not.

## Logging

- **MicroSD logging ON**: on-device audit trail that survives VM compromise.
  It keeps a tamper-evident record even if the VM is tampered with. Cost: you must physically eject the MicroSD to review it.
- **Fail-if-cant-log OFF**: otherwise a MicroSD hiccup halts signing. Default is fine.

## Storage locker read count

CKBunker encrypts its local state with a key held in the Coldcard's Storage Locker. The Locker has a **read counter**: typical policies allow 13 reads before the Locker self-wipes.

This means:

- CKBunker can restart up to 13 times before you need to re-install the policy.
- Heavy debugging (restarting CKBunker to try things) burns reads fast.
- After policy reinstall, the counter resets.

Monitor restart frequency. If you find yourself restarting CKBunker often, investigate *why* rather than spending Locker reads.
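
One way to monitor restart frequency is a small host-side tally of Locker reads. The following is a minimal sketch, assuming the 13-read allowance quoted above and one Locker read per CKBunker restart. `LockerBudget`, its methods, and the `warn_at` threshold are hypothetical names for illustration; they are not part of CKBunker or the Coldcard firmware, and the tally only mirrors what the device enforces on its own.

```python
from dataclasses import dataclass

# Assumption: the installed policy allows 13 Locker reads, as quoted above.
DEFAULT_READ_LIMIT = 13


@dataclass
class LockerBudget:
    """Hypothetical host-side tally of Storage Locker reads."""

    limit: int = DEFAULT_READ_LIMIT
    used: int = 0

    @property
    def remaining(self) -> int:
        """Reads left before the policy has to be re-installed."""
        return max(self.limit - self.used, 0)

    def record_restart(self) -> int:
        """Count one CKBunker restart (assumed: one read); return reads left."""
        if self.remaining == 0:
            raise RuntimeError("Locker read budget exhausted; re-install the policy")
        self.used += 1
        return self.remaining

    def reset_after_policy_reinstall(self) -> None:
        """The counter resets once the policy is re-installed."""
        self.used = 0

    def needs_attention(self, warn_at: int = 3) -> bool:
        """True once the remaining reads drop to the warning threshold."""
        return self.remaining <= warn_at


budget = LockerBudget()
for _ in range(10):  # e.g. ten restarts during a debugging session
    budget.record_restart()
print(budget.remaining, budget.needs_attention())  # prints: 3 True
```

Calling `record_restart()` from whatever wrapper starts CKBunker, and alerting when `needs_attention()` turns true, gives early warning before the budget runs out.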