Compare commits

273 Commits
v0.1.0 ... main

Author SHA1 Message Date
Peter Steinberger
0f2795d9ae
docs: add Azure changelog entry
2026-05-08 09:00:18 +01:00
Jonathan Moss
00725544c7
feat(azure): support linux and native windows leases
Add Azure as a managed provider for direct and brokered Crabbox leases.

- provision Azure Linux VMs with cloud-init, spot fallback, shared network adoption, and per-lease cleanup
- provision native Azure Windows VMs with VM Agent bootstrap and SSH/sync/run support
- add Azure broker support in the Cloudflare Worker, provider config, docs, and tests
- fix async Azure delete handling so successful 202 delete LROs do not refetch deleted resources
- keep Go core coverage above the CI threshold

Verified with CI plus live Azure Linux and native Windows leases.

Co-authored-by: Jonathan Moss <2729151+jwmoss@users.noreply.github.com>
2026-05-08 08:23:38 +01:00
Peter Steinberger
2e1194f6c0
fix: improve webvnc sharing and cursor ui 2026-05-08 07:30:30 +01:00
Peter Steinberger
188432c63a
feat: add collaborative webvnc observer mode 2026-05-08 06:25:10 +01:00
Peter Steinberger
b568298019
feat: improve desktop reliability artifacts 2026-05-08 04:52:51 +01:00
Peter Steinberger
0431fd3bb6
fix: expose share action on webvnc
2026-05-07 22:42:02 +01:00
Peter Steinberger
edd5fae230
fix: harden macos vnc password bootstrap 2026-05-07 22:38:18 +01:00
Peter Steinberger
fdef9df8af
fix: retry ssh fallback ports for desktop paths 2026-05-07 14:52:21 +01:00
Peter Steinberger
7884b1d71f
fix: fall back from coordinator pool list 2026-05-07 14:52:15 +01:00
Peter Steinberger
93a9e64998
docs: document desktop rescue UX 2026-05-07 14:13:29 +01:00
Peter Steinberger
5ed32f1bd0
feat: clarify WebVNC portal failure states 2026-05-07 14:13:26 +01:00
Peter Steinberger
4adbfc6d4a
feat: add desktop WebVNC rescue output 2026-05-07 14:13:23 +01:00
Peter Steinberger
770920e16d
docs: release 0.7.0 2026-05-07 13:46:12 +01:00
Peter Steinberger
0d3a65dfc1
feat: add lease sharing 2026-05-07 13:39:07 +01:00
Peter Steinberger
62d5c1b3d5
docs: document WebVNC clipboard controls 2026-05-07 13:18:04 +01:00
Peter Steinberger
aca01bf512
feat: harden desktop WebVNC reliability 2026-05-07 13:17:23 +01:00
Peter Steinberger
19cbc17602
fix: repair managed macos desktop readiness 2026-05-07 12:45:27 +01:00
Peter Steinberger
32a0f89627
chore: bump version to 0.7.0 2026-05-07 06:25:03 +01:00
Peter Steinberger
80c085c16b
docs: link egress from portal capabilities 2026-05-07 06:22:34 +01:00
Peter Steinberger
966d7df4bd
docs: refresh bridge command docs 2026-05-07 06:22:08 +01:00
Peter Steinberger
c638a55dbb
docs: refresh webvnc ticket docs 2026-05-07 06:21:47 +01:00
Peter Steinberger
335b1d2b28
docs: refresh desktop egress command docs 2026-05-07 06:21:23 +01:00
Peter Steinberger
88c42f96d7
docs: refresh egress cli index 2026-05-07 06:20:58 +01:00
Peter Steinberger
5abb6980cd
docs: add mediated egress flow chart 2026-05-07 06:20:26 +01:00
Peter Steinberger
d0b2c2379f
fix: allow public coordinator egress starts 2026-05-07 06:16:26 +01:00
Peter Steinberger
b40d36458a
feat: add mediated egress bridge 2026-05-07 06:10:22 +01:00
Vincent Koc
947b21ca46
fix: keep bridge tickets out of websocket urls 2026-05-06 20:29:06 -07:00
Peter Steinberger
120802c150
chore: start 0.6.2 development
2026-05-07 04:03:40 +01:00
Peter Steinberger
8c69be33a6
ci: require homebrew tap updates on release 2026-05-07 03:14:36 +01:00
Peter Steinberger
c3c111ba35
fix: sync islo workspaces before run 2026-05-07 02:30:15 +01:00
Peter Steinberger
6a45e46b1b
fix: suppress windows powershell progress output 2026-05-07 01:48:07 +01:00
Peter Steinberger
f4695953bc
fix: harden 0.6.1 runtime checks 2026-05-07 01:14:26 +01:00
Peter Steinberger
98af5a3e8f
fix: bootstrap exit-node leases over tailscale 2026-05-07 00:57:16 +01:00
Peter Steinberger
e328ead836
fix: validate tailscale exit-node egress 2026-05-07 00:48:53 +01:00
Peter Steinberger
f031e9d1aa
docs: expand crabbox user guide 2026-05-07 00:47:41 +01:00
Peter Steinberger
e82281ff08
chore: prepare 0.6.0 release 2026-05-07 00:34:03 +01:00
Peter Steinberger
e40a36f16e
docs: refresh unreleased changelog 2026-05-07 00:23:45 +01:00
Peter Steinberger
ec47b42d30
docs: preserve docs sidebar scroll 2026-05-07 00:09:52 +01:00
Vincent Koc
a8ccdfefe8
Merge remote-tracking branch 'origin/main' into crabbox/aws-auto-region-routing
* origin/main:
  fix: harden daytona auth and resource flags
2026-05-06 16:08:24 -07:00
Vincent Koc
eb1e92f680
fix(coordinator): keep capacity requests sparse 2026-05-06 16:05:01 -07:00
Vincent Koc
0e19455e57
docs(capacity): document routing hints 2026-05-06 15:49:28 -07:00
Vincent Koc
09442c1304
feat(capacity): configure broker hint policy 2026-05-06 15:49:28 -07:00
Vincent Koc
3cd1488877
feat(capacity): return broker routing hints 2026-05-06 15:44:43 -07:00
Vincent Koc
9200bdb060
fix(blacksmith): explain queued outage timeouts 2026-05-06 15:21:25 -07:00
Yossi Eliaz
0e3515023b
fix: harden daytona auth and resource flags
Use the authenticated Daytona CLI profile as a Daytona auth fallback, reject snapshot-incompatible resource flags, and document the auth path.

Verified locally with Go/docs gates and live Daytona CLI-auth run.
2026-05-06 23:17:45 +01:00
Vincent Koc
3d7b3ebfe6
docs(aws): document capacity routing 2026-05-06 15:04:02 -07:00
Vincent Koc
fa9cc0e6bc
feat(aws): route capacity across regions 2026-05-06 15:00:49 -07:00
Vincent Koc
0f192f58a0
docs(portal): document runner detail pages 2026-05-06 13:52:57 -07:00
Vincent Koc
290feaf53c
feat(portal): add external runner detail pages 2026-05-06 13:45:27 -07:00
Vincent Koc
b35717396e
docs(portal): document runner action state 2026-05-06 13:20:51 -07:00
Vincent Koc
17686bb6f5
feat(portal): surface runner action state 2026-05-06 13:18:00 -07:00
Vincent Koc
a20584cf00
fix(cli): infer mirrored testbox repos 2026-05-06 03:33:42 -07:00
Vincent Koc
eefd71fbb1
feat(portal): link testbox runners to actions 2026-05-06 03:29:09 -07:00
Vincent Koc
2a4e08af24
fix(portal): merge external runners into leases table 2026-05-06 03:18:21 -07:00
Vincent Koc
fc73712387
fix(portal): sync all testbox runner states 2026-05-06 03:06:08 -07:00
Vincent Koc
f5cdda71c3
fix(portal): batch external runner writes 2026-05-06 03:00:42 -07:00
Vincent Koc
67ac53cd28
docs(portal): document external runners 2026-05-06 02:59:20 -07:00
Vincent Koc
58435c41e1
feat(portal): show external runners 2026-05-06 02:56:41 -07:00
Vincent Koc
a17122a907
fix(telemetry): clean worker merge helpers 2026-05-06 02:31:18 -07:00
Vincent Koc
58aaa13a6c
docs(telemetry): document run trends 2026-05-06 02:28:36 -07:00
Vincent Koc
101b9c18b4
feat(telemetry): record run samples 2026-05-06 02:25:56 -07:00
Vincent Koc
24848f10cb
docs: note run detail portal polish 2026-05-06 01:59:04 -07:00
Vincent Koc
ce32e18d19
fix(portal): align run detail actions responsively 2026-05-06 01:59:04 -07:00
Vincent Koc
8b21a4d5aa
fix(portal): make run telemetry scannable 2026-05-06 01:59:04 -07:00
Vincent Koc
c06fcb0dbd
fix(portal): tighten run detail summary 2026-05-06 01:59:03 -07:00
Peter Steinberger
671d362e29
feat: support tailscale exit nodes 2026-05-06 09:47:42 +01:00
Peter Steinberger
9680656ec9
docs: add provider reference pages 2026-05-06 09:22:24 +01:00
Vincent Koc
b1298621fc
fix(portal): shorten run detail header 2026-05-06 01:18:12 -07:00
Peter Steinberger
2b8003c2ea
ci: restore provider coverage gate 2026-05-06 09:11:17 +01:00
Peter Steinberger
379b4f4faf
refactor: split provider backends 2026-05-06 09:03:19 +01:00
Vincent Koc
ffb14bf5ba
fix(portal): contain run detail overflow 2026-05-06 00:42:57 -07:00
Vincent Koc
dadf115ac9
fix(portal): unify portal headers 2026-05-06 00:30:27 -07:00
Vincent Koc
dd62f44f86
fix(portal): add crab icon to header 2026-05-06 00:18:10 -07:00
Peter Steinberger
f2210c9b38
fix: harden Daytona and Islo delegated runs 2026-05-06 08:12:12 +01:00
Vincent Koc
a521616b36
Merge remote-tracking branch 'origin/main' into main
* origin/main:
  test: cover provider config loading
  feat: add Daytona and Islo providers
2026-05-06 00:00:08 -07:00
Vincent Koc
55670610a8
Merge pull request #34 from openclaw/feat/portal-run-detail
* origin/feat/portal-run-detail:
  fix(portal): simplify run table columns
  fix(portal): keep run table actions visible
  docs(portal): cover latest portal telemetry changes
  fix(portal): fold commands into access panel
  feat(portal): chart lease telemetry history
  feat(history): summarize run telemetry
  feat(portal): record lease telemetry snapshots
  fix(portal): improve code bridge waiting state
  fix(webvnc): stop daemon child bridge
  feat(portal): show runner leases to admins
  feat(portal): compact access and sortable times
  fix(portal): shorten Windows target labels
  fix(portal): copy command rows
  feat(portal): tighten data grid layout
  feat(portal): polish lease tables
  feat(portal): show ended leases
  feat(portal): add table search controls
  feat(portal): add run detail pages

# Conflicts:
#	internal/cli/actions.go
#	internal/cli/daemon_unix.go
#	internal/cli/daemon_windows.go
#	internal/cli/run.go
#	internal/cli/status.go
#	internal/cli/webvnc.go
2026-05-05 23:58:39 -07:00
Peter Steinberger
9ac68c4fc8
test: cover provider config loading 2026-05-06 07:58:09 +01:00
Peter Steinberger
e0a85bc780
feat: add Daytona and Islo providers 2026-05-06 07:52:15 +01:00
Vincent Koc
9b3a307dc9
fix(portal): simplify run table columns 2026-05-05 23:45:26 -07:00
Peter Steinberger
6ba12e4872
fix: stabilize webvnc reconnects 2026-05-06 07:40:01 +01:00
Vincent Koc
71bedb51c6
fix(portal): keep run table actions visible 2026-05-05 23:35:06 -07:00
Vincent Koc
77a591f54c
docs(portal): cover latest portal telemetry changes 2026-05-05 23:32:18 -07:00
Vincent Koc
c7229a1c56
fix(portal): fold commands into access panel 2026-05-05 23:27:45 -07:00
Vincent Koc
4e17a91237
feat(portal): chart lease telemetry history 2026-05-05 23:19:04 -07:00
Vincent Koc
81e7603d32
feat(history): summarize run telemetry 2026-05-05 22:40:01 -07:00
Vincent Koc
7b699f4cda
feat(portal): record lease telemetry snapshots 2026-05-05 22:17:23 -07:00
Vincent Koc
45f607b43a
fix(portal): improve code bridge waiting state 2026-05-05 21:29:35 -07:00
Peter Steinberger
5aaa848d46
fix: require active coordinator lease for status readiness 2026-05-06 05:27:10 +01:00
Peter Steinberger
494f3a4d77
refactor: add provider backend registry 2026-05-06 05:23:07 +01:00
Vincent Koc
4060ba7afa
fix(webvnc): stop daemon child bridge 2026-05-05 21:10:01 -07:00
Vincent Koc
fcda716aeb
feat(portal): show runner leases to admins 2026-05-05 20:56:11 -07:00
Vincent Koc
5c9170083e
feat(portal): compact access and sortable times 2026-05-05 20:43:03 -07:00
Vincent Koc
5db3f3bb12
fix(portal): shorten Windows target labels 2026-05-05 20:33:17 -07:00
Vincent Koc
ebfab836cc
fix(portal): copy command rows 2026-05-05 20:25:33 -07:00
Vincent Koc
fba3ef8ce6
feat(portal): tighten data grid layout 2026-05-05 20:18:32 -07:00
Vincent Koc
b16372cb78
feat(portal): polish lease tables 2026-05-05 20:08:17 -07:00
Vincent Koc
9dec84ab28
feat(portal): show ended leases 2026-05-05 19:43:53 -07:00
Vincent Koc
8ca88aebfe
feat(portal): add table search controls 2026-05-05 19:33:55 -07:00
Vincent Koc
be8d830933
feat(portal): add run detail pages 2026-05-05 19:21:21 -07:00
Vincent Koc
7c1cabf5f3
Merge pull request #33 from openclaw/feat/portal-lease-detail
feat(portal): add lease detail pages
2026-05-05 19:12:34 -07:00
Vincent Koc
6e818adf49
docs(portal): document lease detail page 2026-05-05 19:07:30 -07:00
Peter Steinberger
45d73c0e0d
fix: supervise webvnc daemon bridges 2026-05-06 02:51:55 +01:00
Peter Steinberger
34c086293b
fix: keep webvnc daemon bridges alive 2026-05-06 02:46:33 +01:00
Peter Steinberger
c9e28c2bf3
fix: restore slim xfce desktop leases 2026-05-06 02:35:50 +01:00
Vincent Koc
3eae4a816d
feat(portal): add lease detail pages 2026-05-05 18:25:42 -07:00
Peter Steinberger
0bb34bdcad
docs: add Crabbox image bake runbook 2026-05-05 23:48:48 +01:00
Peter Steinberger
3df14dff23
docs: explain prebaked runner image storage
2026-05-05 21:07:36 +01:00
Vincent Koc
31c95eb7bf
Merge pull request #30 from openclaw/work/reapply-main-work-20260504233337
feat: stage desktop and WebVNC updates
2026-05-05 13:00:37 -07:00
Peter Steinberger
bbda2d46ea
perf: preinstall desktop QA helpers 2026-05-05 19:57:03 +01:00
Peter Steinberger
c1eb1dd666
fix: avoid browser cloud-init heredocs 2026-05-05 19:01:56 +01:00
Peter Steinberger
d7c07cd946
fix: align worker desktop bootstrap 2026-05-05 18:28:26 +01:00
Peter Steinberger
6e6caa018b
fix: clarify SSH readiness progress 2026-05-05 15:38:51 +01:00
Peter Steinberger
281ac8ec57
fix: speed up Linux desktop bootstrap 2026-05-05 13:54:49 +01:00
Peter Steinberger
556e1880f0
fix: expose ssh during desktop bootstrap 2026-05-05 12:31:40 +01:00
Peter Steinberger
59c506b827
fix: harden desktop warmup bootstrap retries 2026-05-05 12:08:00 +01:00
Vincent Koc
3cb9a4c5db
test(worker): update webvnc portal assertion 2026-05-05 03:16:57 -07:00
Peter Steinberger
3ffa613b61
fix: make Crabbox provider names lease-unique 2026-05-05 10:36:24 +01:00
Vincent Koc
0501865049
docs: document web code bridge behavior 2026-05-05 02:34:38 -07:00
Vincent Koc
dcf8ba40bb
fix(code): default web editor to dark theme 2026-05-05 02:34:38 -07:00
Vincent Koc
0b6e56fed1
fix(code): harden web editor bridge 2026-05-05 02:34:38 -07:00
Vincent Koc
2f336afe49
fix(code): allow large websocket frames 2026-05-05 02:34:37 -07:00
Vincent Koc
4b9337f924
fix(code): keep bridge chunks under frame limits 2026-05-05 02:34:37 -07:00
Vincent Koc
51d505aee4
fix(code): complete portal websocket handshake 2026-05-05 02:34:37 -07:00
Vincent Koc
d6ac429cf7
fix(code): trace remote websocket bridge 2026-05-05 02:34:36 -07:00
Vincent Koc
0d12be654e
fix(code): preserve websocket frame types 2026-05-05 02:34:36 -07:00
Vincent Koc
3b04d07744
fix(code): shim missing vsda static asset 2026-05-05 02:34:36 -07:00
Vincent Koc
8cfc0050bf
fix(code): clean proxied workbench html 2026-05-05 02:34:35 -07:00
Vincent Koc
a014a590ea
fix(code): align bridge chunks for base64 2026-05-05 02:34:35 -07:00
Vincent Koc
9d47a19849
fix(code): reduce bridge response chunk size 2026-05-05 02:34:35 -07:00
Vincent Koc
06472da4c3
fix(code): serialize bridge writes 2026-05-05 02:34:35 -07:00
Vincent Koc
84e9f927bc
fix(code): chunk large bridge responses 2026-05-05 02:34:34 -07:00
Vincent Koc
58fe2d85f3
fix(worker): persist portal bridge sockets 2026-05-05 02:34:34 -07:00
Vincent Koc
3395922222
fix(worker): allow code portal scripts 2026-05-05 02:34:34 -07:00
Vincent Koc
c18913b79d
fix(cli): avoid self-killing code-server launcher 2026-05-05 02:34:33 -07:00
Vincent Koc
8ccf1a3f89
fix: set HOME for code-server bootstrap 2026-05-05 02:34:33 -07:00
Vincent Koc
7a96871aa5
style(worker): format code portal changes 2026-05-05 02:34:33 -07:00
Vincent Koc
cbd25ba8c2
docs: document code lease portal 2026-05-05 02:34:32 -07:00
Vincent Koc
28418c16d0
test: cover code lease portal paths 2026-05-05 02:33:05 -07:00
Vincent Koc
b386b9a737
feat(worker): expose per-lease code portal 2026-05-05 02:33:05 -07:00
Vincent Koc
30e81c6f17
feat(cli): add code lease bridge command 2026-05-05 02:31:47 -07:00
Vincent Koc
22ea8e0a2c
Reapply "feat: polish WebVNC portal controls"
This reverts commit bd59a1f8ca.
2026-05-05 02:29:49 -07:00
Vincent Koc
40ad486039
Reapply "feat: add crabboxignore sync excludes"
This reverts commit 8ca03cb115.
2026-05-05 02:29:06 -07:00
Vincent Koc
20f7102c2f
Reapply "fix: keep WebVNC binary bridge usable"
This reverts commit ac4c1953f0.
2026-05-05 02:29:05 -07:00
Vincent Koc
5ea56ef4f2
Reapply "fix: keep WSL2 sync on fast rsync path"
This reverts commit 273dbfa0f5.
2026-05-05 02:28:07 -07:00
Vincent Koc
950f838573
Reapply "chore: use Homebrew path helper in formula test"
This reverts commit b489ad0c13.
2026-05-05 02:28:07 -07:00
Vincent Koc
5a72ff6c3e
Reapply "fix: suppress macOS xattrs in archive sync"
This reverts commit 8b7898e1ba.
2026-05-05 02:28:07 -07:00
Vincent Koc
945dade1d0
Reapply "fix: discard failed remote git seeds"
This reverts commit 5b9084975e.
2026-05-05 02:28:06 -07:00
Vincent Koc
15c10ee2e2
Reapply "fix: harden WSL2 work root and sync"
This reverts commit bbb8183eca.
2026-05-05 02:28:06 -07:00
Peter Steinberger
0ca412a8d5
feat: improve webvnc bridge ergonomics 2026-05-05 10:16:55 +01:00
Peter Steinberger
6ab061716c
fix: reconnect webvnc bridge after eof 2026-05-05 09:47:11 +01:00
Peter Steinberger
ddefc26f27
fix: keep webvnc bridge alive across viewer resets 2026-05-05 09:27:28 +01:00
Peter Steinberger
5d459e0da1
fix: suppress chrome first-run prompts 2026-05-05 09:26:39 +01:00
Vincent Koc
4e5ce36538
fix: repair Windows WebVNC credentials 2026-05-05 00:37:07 -07:00
Peter Steinberger
a0af15bd47
feat: bridge desktop launches into webvnc 2026-05-05 08:31:45 +01:00
Peter Steinberger
f353bcbee9
feat: add motion-trimmed media previews 2026-05-05 07:52:02 +01:00
Peter Steinberger
a0eb12ef24
refactor: dedupe lease and worker helpers 2026-05-05 07:37:04 +01:00
Peter Steinberger
0d949f7bab
fix: preserve shell argv quoting 2026-05-05 07:35:51 +01:00
Vincent Koc
bbb8183eca
Revert "fix: harden WSL2 work root and sync"
This reverts commit 11e6913fd9.
2026-05-04 23:32:54 -07:00
Vincent Koc
5b9084975e
Revert "fix: discard failed remote git seeds"
This reverts commit d2590b35ce.
2026-05-04 23:32:54 -07:00
Vincent Koc
8b7898e1ba
Revert "fix: suppress macOS xattrs in archive sync"
This reverts commit abe4688059.
2026-05-04 23:32:54 -07:00
Vincent Koc
b489ad0c13
Revert "chore: use Homebrew path helper in formula test"
This reverts commit ceac895abe.
2026-05-04 23:32:54 -07:00
Vincent Koc
273dbfa0f5
Revert "fix: keep WSL2 sync on fast rsync path"
This reverts commit d6ee5620dc.
2026-05-04 23:32:54 -07:00
Vincent Koc
ac4c1953f0
Revert "fix: keep WebVNC binary bridge usable"
This reverts commit 87f74e8382.
2026-05-04 23:32:53 -07:00
Vincent Koc
8ca03cb115
Revert "feat: add crabboxignore sync excludes"
This reverts commit cdc32f65d7.
2026-05-04 23:32:53 -07:00
Vincent Koc
bd59a1f8ca
Revert "feat: polish WebVNC portal controls"
This reverts commit 728cbd4e15.
2026-05-04 23:32:53 -07:00
Vincent Koc
644b66933c
Revert "chore: prepare 0.5.1 release"
This reverts commit 957b836525.
2026-05-04 23:32:53 -07:00
Vincent Koc
686f2e880b
Revert "feat: add desktop recording command"
This reverts commit 628ea8cb9a.
2026-05-04 23:32:53 -07:00
Vincent Koc
0e52767d21
Revert "fix: record full desktop by default"
This reverts commit 1e77a46627.
2026-05-04 23:32:53 -07:00
Peter Steinberger
1e77a46627
fix: record full desktop by default 2026-05-05 05:35:02 +01:00
Peter Steinberger
628ea8cb9a
feat: add desktop recording command 2026-05-05 04:44:39 +01:00
Peter Steinberger
957b836525
chore: prepare 0.5.1 release 2026-05-05 03:01:27 +01:00
Peter Steinberger
728cbd4e15
feat: polish WebVNC portal controls 2026-05-05 02:37:37 +01:00
Peter Steinberger
cdc32f65d7
feat: add crabboxignore sync excludes 2026-05-05 02:29:43 +01:00
Peter Steinberger
87f74e8382
fix: keep WebVNC binary bridge usable 2026-05-05 01:59:26 +01:00
Peter Steinberger
d6ee5620dc
fix: keep WSL2 sync on fast rsync path 2026-05-05 01:53:08 +01:00
Peter Steinberger
ceac895abe
chore: use Homebrew path helper in formula test 2026-05-05 01:00:45 +01:00
Peter Steinberger
abe4688059
fix: suppress macOS xattrs in archive sync 2026-05-05 01:00:13 +01:00
Peter Steinberger
d2590b35ce
fix: discard failed remote git seeds 2026-05-05 00:51:57 +01:00
Peter Steinberger
11e6913fd9
fix: harden WSL2 work root and sync 2026-05-05 00:43:17 +01:00
Peter Steinberger
fec348e6e8
fix: refresh WSL2 instance defaults on mode override (#25)
* fix: refresh WSL2 instance defaults on mode override

* docs: credit WSL2 mode default fix

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-05-04 16:16:15 -07:00
Vincent Koc
9bf5b20183
fix: retry incomplete WSL rootfs downloads (#27)
* fix: retry incomplete WSL rootfs downloads

* fix: harden WSL rootfs downloads
2026-05-04 16:09:39 -07:00
Peter Steinberger
7f95ff7e35
chore: prepare 0.5.0 release 2026-05-04 23:35:21 +01:00
Peter Steinberger
1d22208087
fix: keep passthrough help local 2026-05-04 23:32:22 +01:00
Peter Steinberger
a0585156af
docs: reorder 0.5.0 changelog by user impact 2026-05-04 23:28:31 +01:00
Peter Steinberger
d98475c3d2
fix: restore grouped command help forms 2026-05-04 23:28:28 +01:00
Peter Steinberger
ecdbe91adf
fix: wait for delayed blacksmith cleanup 2026-05-04 23:09:41 +01:00
Peter Steinberger
fb91f465df
ci: set up go for docs check 2026-05-04 22:49:23 +01:00
Peter Steinberger
352a6e1618
refactor: route cli commands through kong 2026-05-04 22:45:43 +01:00
Peter Steinberger
5c19a4d39a
fix: clean async blacksmith warmup leaks 2026-05-04 22:37:21 +01:00
Peter Steinberger
50ceed86bf
fix: add grouped command help docs guard 2026-05-04 22:34:31 +01:00
Peter Steinberger
41ad14cc81
fix: clean up live smoke blockers 2026-05-04 22:17:41 +01:00
Peter Steinberger
189e03657b
docs: improve Tailscale feature docs 2026-05-04 22:17:07 +01:00
Peter Steinberger
a6ddbcbe2a
fix: handle empty junit failure lists 2026-05-04 21:51:30 +01:00
Peter Steinberger
def3b5d9ce
docs: update readme source version 2026-05-04 21:24:02 +01:00
Peter Steinberger
21e7afb07f
docs: reorder changelog by user impact 2026-05-04 21:22:59 +01:00
Peter Steinberger
237ef3a64d
docs: sync provider and vnc docs (#26)
2026-05-04 21:17:49 +01:00
Vincent Koc
27f3b1c140
feat: support managed AWS Windows WSL2 (#23)
* fix: use target-aware AWS instance defaults

* feat: support managed AWS Windows WSL2

* fix: complete AWS Windows WSL2 bootstrap

* fix: default AWS WSL2 SSH to Administrator

* fix: wrap Windows WSL2 commands through PowerShell

* fix: harden AWS WSL2 command wrapper
2026-05-04 02:06:49 -07:00
Peter Steinberger
4d15c24a7f
feat: add Tailscale network support (#19)
* feat: add tailscale network support

* fix: relax tailscale network probes

* fix: use configured user for tailscale metadata
2026-05-04 08:58:45 +01:00
Vincent Koc
9ffc78e003
fix: clarify managed Windows provider boundary 2026-05-04 00:04:03 -07:00
Vincent Koc
c4f11a13de
test: stabilize run recorder output cap 2026-05-03 23:31:07 -07:00
Vincent Koc
8c6512eaef
fix(aws): configure Windows SSH ports 2026-05-03 23:27:47 -07:00
Vincent Koc
a45f308c13
fix: canonicalize portal login origin 2026-05-03 23:27:39 -07:00
Peter Steinberger
8214a13978
feat: launch apps on crabbox desktops 2026-05-04 07:19:21 +01:00
Vincent Koc
358e85fdbe
feat: add authenticated WebVNC portal bridge (#15)
* feat: add authenticated WebVNC portal bridge

* feat: require WebVNC bridge tickets

* fix: harden webvnc portal bridge

* fix: make webvnc browser path functional

* fix: handle webvnc viewer lifecycle

* fix: make webvnc credentials and assets deterministic

* fix: keep portal logout logged out

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-04 06:50:13 +01:00
Peter Steinberger
1e6e4894fa
feat: stabilize AWS desktop VNC leases (#14)
* feat: stabilize AWS desktop VNC leases

* docs: improve vnc command guide

* fix: wrap docs site code blocks
2026-05-04 05:05:19 +01:00
Peter Steinberger
abafcf7d68
fix: improve AWS desktop instance launches 2026-05-04 02:42:23 +01:00
Peter Steinberger
59afa7d720
test: cover AWS desktop targets 2026-05-04 02:28:48 +01:00
Peter Steinberger
343ca7baa9
feat: support AWS target userdata in worker 2026-05-04 02:23:58 +01:00
Peter Steinberger
ab7fc6fd71
feat: support AWS desktop targets 2026-05-04 02:22:42 +01:00
Peter Steinberger
d79cba3fa5
docs: document desktop screenshot command 2026-05-04 02:01:33 +01:00
Peter Steinberger
02bad5f369
feat: capture desktop lease screenshots 2026-05-04 01:57:39 +01:00
Peter Steinberger
04f24e9135
fix: keep static VNC host-managed (#13)
* Revert "feat: add managed static macOS VNC login (#12)"

This reverts commit 4333327f56.

* fix: keep static VNC host-managed
2026-05-04 01:50:10 +01:00
Peter Steinberger
4333327f56
feat: add managed static macOS VNC login (#12) 2026-05-04 01:03:34 +01:00
Peter Steinberger
8caabdbaa9
feat: add interactive desktop VNC support (#11) 2026-05-04 00:31:06 +01:00
Peter Steinberger
d9213f8bef
docs: plan interactive VNC leases 2026-05-03 21:29:52 +01:00
Peter Steinberger
65bc06f130
docs: add box emoji to readme title
2026-05-03 20:40:48 +01:00
Peter Steinberger
052a57b081
chore: prepare crabbox 0.4.0 release 2026-05-03 20:14:39 +01:00
Peter Steinberger
cf2d9d2d41
docs: mark 0.4.0 as unreleased 2026-05-03 19:51:18 +01:00
Peter Steinberger
82ab83b572
feat: add macOS and Windows SSH targets 2026-05-03 19:44:16 +01:00
Peter Steinberger
6e19bd1594
chore: prepare crabbox 0.3.1 release 2026-05-03 16:22:57 +01:00
Peter Steinberger
e1f8cd05bc
fix(cli): force per-lease ssh identity 2026-05-03 14:32:49 +01:00
Vincent Koc
d13c1f8237
fix(cli): suppress legacy run event warnings 2026-05-02 15:51:26 -07:00
Vincent Koc
46157e9ef1
fix(cli): defer run history on legacy coordinators 2026-05-02 14:52:36 -07:00
Vincent Koc
c30e7221a8
feat(actions): support configured hydrate fields 2026-05-02 14:49:48 -07:00
stain lu
94b15c8e2e
ci: check command docs drift
Add a docs check that reads the published CLI command list and verifies each top-level command has a command page plus README index entry. Reorder the command index to match CLI help order and note the contributor credit in the changelog.

Co-authored-by: stainlu <stainlu@newtype-ai.org>
2026-05-02 16:48:53 +01:00
Peter Steinberger
5b18b7d53c
fix: retain chunked run logs 2026-05-02 12:06:59 +01:00
Peter Steinberger
819640fbf7
docs:clarify-crabbox-skill-inspection 2026-05-02 10:17:25 +01:00
Peter Steinberger
45f9b48fbf
Revert "feat: expose crabbox plugin inspection tools"
This reverts commit 57fcb98fc8.
2026-05-02 10:16:03 +01:00
Peter Steinberger
57fcb98fc8
feat: expose crabbox plugin inspection tools
Co-authored-by: stainlu <stainlu@newtype-ai.org>
2026-05-02 09:59:17 +01:00
Peter Steinberger
7bcb028134
chore: start 0.3.1 development 2026-05-02 09:40:19 +01:00
Peter Steinberger
e54847b657
docs: refresh readme for 0.3.0 2026-05-02 09:27:23 +01:00
Peter Steinberger
e7bfbf6ca1
docs: prepare 0.3.0 release 2026-05-02 08:56:01 +01:00
Peter Steinberger
d4cae9b128
test: cover run log buffer truncation 2026-05-02 08:43:46 +01:00
Peter Steinberger
5c96a6c5db
feat: add run attach and event pagination
Co-authored-by: stainlu <stainlu@newtype-ai.org>
2026-05-02 08:38:56 +01:00
Peter Steinberger
38a28c7bd4
docs: align changelog since 0.2.0
2026-05-02 08:22:08 +01:00
Peter Steinberger
380a0c351f
fix(cli): record pre-command run failures 2026-05-02 08:14:00 +01:00
Peter Steinberger
540bbf5b68
docs: clarify direct aws fallback metadata 2026-05-02 07:49:04 +01:00
Peter Steinberger
80e1d7591a
fix(cli): include actions run url in timing json 2026-05-02 07:37:38 +01:00
Peter Steinberger
68f3d82657
feat: add durable run events 2026-05-02 07:32:15 +01:00
Peter Steinberger
8d21ba5bf2
docs: note mobile menu icon fix 2026-05-02 06:49:02 +01:00
Peter Steinberger
82100ed7cf
fix(docs): make mobile menu icon visible 2026-05-02 06:43:59 +01:00
Peter Steinberger
211c141a4b
fix: return empty blacksmith list json 2026-05-02 06:28:07 +01:00
Peter Steinberger
e1206c6f3e
feat(cli): expose capacity market and orphan hints 2026-05-02 06:13:42 +01:00
Peter Steinberger
d4448eb8a9
fix(worker): harden AWS on-demand fallback 2026-05-02 06:13:37 +01:00
Peter Steinberger
a852486c12
fix: expose resolved AWS fallback metadata 2026-05-02 05:39:06 +01:00
Peter Steinberger
950b93f9f4
docs: document crabbox remote validation flow 2026-05-02 05:20:32 +01:00
Peter Steinberger
7a448589e3
fix: harden crabbox provider timing and fallback 2026-05-02 05:20:29 +01:00
Peter Steinberger
fb5af1a424
fix: improve bootstrap wait cancellation 2026-05-02 04:15:11 +01:00
Peter Steinberger
651c014270
test: add live coordinator auth smoke 2026-05-02 03:25:12 +01:00
Peter Steinberger
d1648f5551
fix: harden coordinator auth boundaries 2026-05-02 02:59:57 +01:00
Peter Steinberger
d7d61d86bf
docs: expand access route guidance 2026-05-02 02:40:18 +01:00
Peter Steinberger
8a335c6263
docs: complete crabbox documentation audit 2026-05-02 02:39:37 +01:00
Peter Steinberger
d5e8a9c288
docs: document self-hosted github oauth 2026-05-02 02:03:01 +01:00
stainlu
a4ee15d47f
feat: add access service token headers
(cherry picked from commit 5ccf5cb93f5ae0b9b0871b217fecd74d0eca5047)
2026-05-02 02:02:22 +01:00
stainlu
89b673382b
feat: add github team login allowlist
(cherry picked from commit 8975f38741227ec9413ce0c9a49c99dda66d2dea)
2026-05-02 01:59:53 +01:00
stainlu
ab829dc0e3
fix: trust signed user token identity
(cherry picked from commit 195ffc0da4a315f8f87ac3f09f617342d5fe7289)
2026-05-02 01:59:39 +01:00
Peter Steinberger
2e43cc81ec
fix(cli): harden live provider reuse paths 2026-05-02 01:46:16 +01:00
Peter Steinberger
5f546c1ff8
fix(worker): repair AWS security group provisioning 2026-05-02 01:13:22 +01:00
Peter Steinberger
ef5fb8bc1e
docs: update changelog for image lifecycle 2026-05-02 00:48:30 +01:00
Peter Steinberger
c4832416b7
feat: add aws image bake commands
2026-05-01 20:57:53 +01:00
Vincent Koc
b28674d527
docs: prepare 0.2.0 release 2026-05-01 04:59:25 -07:00
Vincent Koc
aa1ef84ca2
fix(ssh): configure fallback ports 2026-05-01 04:51:15 -07:00
Vincent Koc
b4919bd8e7
docs: add crabbox ssh fix changelog entries 2026-05-01 04:16:00 -07:00
Vincent Koc
2bf31f5940
fix(cli): require real ssh readiness 2026-05-01 04:09:11 -07:00
Vincent Koc
6126e6e090
fix(aws): forward ssh source cidrs 2026-05-01 04:09:11 -07:00
Vincent Koc
d50f2a42a5
Update CHANGELOG.md 2026-05-01 04:02:54 -07:00
Peter Steinberger
458d6d51d2
fix: clean up blacksmith local lease state 2026-05-01 11:54:11 +01:00
Peter Steinberger
6301ddb344
feat: add blacksmith testbox workflow flags 2026-05-01 11:40:46 +01:00
Peter Steinberger
3af7656579
feat: add blacksmith provider and harden broker auth 2026-05-01 11:12:23 +01:00
Peter Steinberger
a39b5574bb
chore: start 0.2.0 development 2026-05-01 10:52:00 +01:00
Peter Steinberger
ce6b96065a
feat: add github browser login 2026-05-01 10:43:11 +01:00
Peter Steinberger
2803a876d4
docs: add crabbox skill 2026-05-01 10:11:34 +01:00
Peter Steinberger
82c2f1ce11
docs: update install tagline 2026-05-01 09:58:20 +01:00
282 changed files with 56158 additions and 2155 deletions


@ -0,0 +1,185 @@
---
name: crabbox
description: Use Crabbox for remote Linux validation, warmed reusable boxes, GitHub Actions hydration, sync timing, logs, results, caches, and lease cleanup.
---
# Crabbox
Use Crabbox when a project needs remote Linux proof, larger cloud capacity,
warm reusable runner state, GitHub Actions hydration, or fast sync from a dirty
local checkout.
## Before Running
- Run from the repository root. Crabbox sync mirrors the current checkout.
- Prefer local targeted tests for tight edit loops.
- Check repo-local `crabbox.yaml` or `.crabbox.yaml` before adding flags.
- Sanity-check the selected binary before remote work:
`command -v crabbox && crabbox --version && crabbox --help | sed -n '1,80p'`.
- Install with `brew install openclaw/tap/crabbox`.
- Auth is required for brokered operation. Normal users run `crabbox login`.
- Trusted operator automation can store the shared token with:
`printf '%s' "$CRABBOX_COORDINATOR_TOKEN" | crabbox login --url https://crabbox.openclaw.ai --provider aws --token-stdin`.
- User config lives at `~/Library/Application Support/crabbox/config.yaml` on
macOS or the platform user config dir elsewhere. It should contain:
```yaml
broker:
url: https://crabbox.openclaw.ai
token: <token>
provider: aws
```
## Common Flow
Warm a reusable box:
```sh
crabbox warmup --idle-timeout 90m
crabbox warmup --provider aws --class beast --market on-demand --idle-timeout 90m
```
Hydrate it through a repository GitHub Actions workflow when CI-like setup,
services, or secret-backed preparation are needed:
```sh
crabbox actions hydrate --id <cbx_id-or-slug>
```
Run commands:
```sh
crabbox run --id <cbx_id-or-slug> -- pnpm test:changed
crabbox run --id <cbx_id-or-slug> --shell "corepack enable && pnpm install --frozen-lockfile && pnpm test"
```
For package-manager commands on raw AWS/Hetzner boxes, hydrate first when the
repo declares an Actions workflow; bootstrap only installs Crabbox plumbing, not
project runtimes. Add `--timing-json` when comparing providers or sync phases.
Stop boxes you created before handoff:
```sh
crabbox stop <cbx_id-or-slug>
```
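The run-then-stop habit above can be sketched as a small wrapper. This is only an illustration: `run_and_stop` and `CRABBOX_BIN` are hypothetical names, not Crabbox features, and the sketch assumes the box ID or slug is already known (for example from an earlier `crabbox warmup`). It uses only the `crabbox run --id ... --` and `crabbox stop` forms shown above.

```sh
# Sketch: run one command on an existing box and always stop the box afterwards.
# CRABBOX_BIN and run_and_stop are illustrative names, not part of Crabbox.
CRABBOX_BIN="${CRABBOX_BIN:-crabbox}"

run_and_stop() {
  box_id="$1"
  shift
  # Run in a subshell so the EXIT trap fires when this call finishes,
  # whether the remote command succeeded or failed.
  (
    trap '"$CRABBOX_BIN" stop "$box_id"' EXIT
    "$CRABBOX_BIN" run --id "$box_id" -- "$@"
  )
}
```

Called as `run_and_stop <cbx_id-or-slug> pnpm test:changed`, this would suit one-shot handoffs; skip the wrapper when the box is meant to stay warm for further runs.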
## Useful Commands
```sh
crabbox status --id <id-or-slug> --wait
crabbox inspect --id <id-or-slug> --json
crabbox webvnc --id <id-or-slug> --open
crabbox webvnc daemon start --id <id-or-slug> --open
crabbox webvnc daemon status --id <id-or-slug>
crabbox webvnc daemon stop --id <id-or-slug>
crabbox webvnc status --id <id-or-slug>
crabbox webvnc reset --id <id-or-slug> --open
crabbox desktop doctor --id <id-or-slug>
crabbox desktop click --id <id-or-slug> --x 640 --y 420
crabbox desktop paste --id <id-or-slug> --text "peter@example.com"
crabbox desktop type --id <id-or-slug> --text "peter+qa@example.com"
crabbox desktop key --id <id-or-slug> ctrl+l
crabbox artifacts collect --id <id-or-slug> --all --output artifacts/<slug>
crabbox artifacts publish --dir artifacts/<slug> --pr <number>
crabbox sync-plan
crabbox history --lease <id-or-slug>
crabbox events <run_id> --json
crabbox attach <run_id>
crabbox logs <run_id>
crabbox results <run_id>
crabbox cache stats --id <id-or-slug>
crabbox ssh --id <id-or-slug>
crabbox usage --scope org
CRABBOX_LIVE=1 CRABBOX_LIVE_REPO=/path/to/openclaw scripts/live-smoke.sh
```
For human desktop demos, prefer WebVNC over native VNC because
`crabbox webvnc --open` preloads the lease password in the browser fragment.
Use native `crabbox vnc --id <id-or-slug> --open` as the fallback printed by
`crabbox webvnc status` or `crabbox webvnc reset`. For input automation, use
`crabbox desktop click/paste/type/key` instead of hand-written `xdotool`;
`desktop type` switches to clipboard paste for symbol-heavy text such as emails
and passwords. `desktop key` accepts both `--id <lease> <keys>` and positional
`<lease> <keys>` forms for shortcuts.
When desktop/WebVNC hangs, trust the inline rescue output first: `problem: VNC
bridge disconnected`, `problem: browser not launched`, `problem: input stack
dead`, or similar will be followed by exact `rescue:` commands such as
`crabbox webvnc status/reset` or `crabbox desktop doctor`.
For UI QA proof, use `crabbox artifacts collect` instead of ad hoc screenshots
and shell recordings. It can bundle screenshots, MP4 recordings, trimmed GIFs,
desktop doctor output, WebVNC status, run logs, and metadata, then
`crabbox artifacts publish --pr <n>` can publish inline-ready Markdown through
the configured coordinator artifact backend. Use explicit `--storage s3`,
`--storage r2`, or `--storage local` only as a local fallback.
## Run Inspection Workflow
Use the CLI for durable run inspection; do not expect extra OpenClaw plugin
tools for this surface.
Find recent runs:
```sh
crabbox history --limit 20
crabbox history --lease <id-or-slug> --limit 20
```
Follow an active run:
```sh
crabbox attach <run_id>
crabbox attach <run_id> --after <seq>
```
Page through recorded events:
```sh
crabbox events <run_id> --after <seq> --limit 100
crabbox events <run_id> --json
```
Inspect completed output and structured test summaries:
```sh
crabbox logs <run_id>
crabbox results <run_id>
```
Use `--debug` on `run` when measuring sync timing.
Use `--timing-json` on `warmup`, `actions hydrate`, and `run` when a stable
machine-readable timing record is needed.
Use `--market spot|on-demand` on AWS `warmup` or one-shot `run` when account
quota or capacity testing needs a temporary market override.
## Run Handles
Coordinator-backed `crabbox run` prints `recording run run_...` before leasing
starts. Keep that run ID in status updates. Use `crabbox events run_...` for
ordered lifecycle/output events, `crabbox attach run_...` to follow an active
run, and `crabbox logs run_...` or `crabbox results run_...` after completion.
Output events are a capped preview, not unlimited logs. Use `logs` for the
retained command output tail when debugging noisy runs.
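One way to keep the run ID is to pull it out of captured output. The sketch below assumes the `recording run run_...` line appears verbatim in the captured text and that the ID is the token starting with `run_`; the exact character set of run IDs and the stream the line is printed on are not stated here, so both are assumptions. `extract_run_id` is a hypothetical helper, not a Crabbox command.

```sh
# Hypothetical helper: print the first run_... token found on a
# "recording run run_<id>" line read from stdin.
# Assumes run IDs consist of letters, digits, "_" and "-".
extract_run_id() {
  sed -n 's/.*recording run \(run_[A-Za-z0-9_-]*\).*/\1/p' | head -n 1
}
```

For example, after `crabbox run --id <cbx_id-or-slug> -- pnpm test:changed 2>&1 | tee run.log`, `run_id="$(extract_run_id <run.log)"` would give a value to pass to `crabbox logs "$run_id"` or `crabbox results "$run_id"`.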
## Hydration Boundary
Repository setup belongs in the repository hydration workflow. That workflow
owns checkout, runtime setup, dependencies, services, secret-backed preparation,
the ready marker, and keepalive.
Crabbox owns runner registration, workflow dispatch, SSH sync, command
execution, logs/results, local lease claims, and idle cleanup. Do not add
project-specific setup to the Crabbox binary.
## Cleanup
Brokered leases have coordinator-owned idle expiry and local lease claims, so
projects should not maintain their own lease ledger. Default idle timeout is 30
minutes unless config or flags set a different value. Still stop boxes you
created when done.
When `crabbox list` prints `orphan=no-active-lease`, treat it as an operator
review hint: verify the provider machine is not referenced by an active
coordinator lease before deleting anything, especially if `keep=true` is set.


@ -94,6 +94,29 @@ jobs:
run: npm run build
working-directory: worker
docs:
name: Docs
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- name: Check out
uses: actions/checkout@v6
- name: Set up Node
uses: actions/setup-node@v6
with:
node-version: 24
- name: Set up Go
uses: actions/setup-go@v6
with:
go-version-file: go.mod
cache: false
- name: Check docs
run: npm run docs:check
release-check:
name: Release Check
runs-on: ubuntu-latest


@ -39,29 +39,38 @@ jobs:
RELEASE_TAG: ${{ inputs.tag }}
run: git checkout "$RELEASE_TAG"
- name: Check Homebrew tap token
id: homebrew
- name: Resolve release tag
id: release
env:
DISPATCH_TAG: ${{ inputs.tag }}
REF_NAME: ${{ github.ref_name }}
run: |
tag="${DISPATCH_TAG:-$REF_NAME}"
if [ -z "$tag" ]; then
echo "::error::could not resolve release tag"
exit 1
fi
echo "tag=$tag" >>"$GITHUB_OUTPUT"
echo "version=${tag#v}" >>"$GITHUB_OUTPUT"
- name: Verify Homebrew tap token
env:
HOMEBREW_TAP_GITHUB_TOKEN: ${{ secrets.HOMEBREW_TAP_GITHUB_TOKEN }}
run: |
if [ -z "$HOMEBREW_TAP_GITHUB_TOKEN" ]; then
echo "skip=true" >>"$GITHUB_OUTPUT"
echo "::warning::HOMEBREW_TAP_GITHUB_TOKEN is missing; skipping Homebrew tap publish"
exit 0
echo "::error::HOMEBREW_TAP_GITHUB_TOKEN is missing; cannot publish Homebrew formula"
exit 1
fi
code="$(curl -sS -o /dev/null -w '%{http_code}' \
-H "Authorization: Bearer $HOMEBREW_TAP_GITHUB_TOKEN" \
-H "Accept: application/vnd.github+json" \
https://api.github.com/repos/openclaw/homebrew-tap || true)"
if [ "$code" != "200" ]; then
echo "skip=true" >>"$GITHUB_OUTPUT"
echo "::warning::HOMEBREW_TAP_GITHUB_TOKEN cannot access openclaw/homebrew-tap (HTTP $code); skipping Homebrew tap publish"
exit 0
echo "::error::HOMEBREW_TAP_GITHUB_TOKEN cannot access openclaw/homebrew-tap (HTTP $code)"
exit 1
fi
echo "skip=false" >>"$GITHUB_OUTPUT"
- name: GoReleaser
if: ${{ steps.homebrew.outputs.skip != 'true' }}
uses: goreleaser/goreleaser-action@v7
with:
distribution: goreleaser
@ -71,12 +80,17 @@ jobs:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
HOMEBREW_TAP_GITHUB_TOKEN: ${{ secrets.HOMEBREW_TAP_GITHUB_TOKEN }}
- name: GoReleaser without Homebrew
if: ${{ steps.homebrew.outputs.skip == 'true' }}
uses: goreleaser/goreleaser-action@v7
with:
distribution: goreleaser
version: "~> v2"
args: release --clean --config /tmp/.goreleaser.yaml --skip=homebrew
- name: Verify Homebrew formula
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GH_TOKEN: ${{ secrets.HOMEBREW_TAP_GITHUB_TOKEN }}
RELEASE_VERSION: ${{ steps.release.outputs.version }}
run: |
formula="$(gh api repos/openclaw/homebrew-tap/contents/Formula/crabbox.rb --jq '.content' | base64 --decode)"
if ! grep -q "version \"$RELEASE_VERSION\"" <<<"$formula"; then
echo "::error::openclaw/homebrew-tap Formula/crabbox.rb was not updated to $RELEASE_VERSION"
exit 1
fi
if ! grep -q "releases/download/v$RELEASE_VERSION/" <<<"$formula"; then
echo "::error::openclaw/homebrew-tap Formula/crabbox.rb does not point at v$RELEASE_VERSION assets"
exit 1
fi
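
The tag resolution and formula verification steps in this workflow rest on shell parameter expansion (`${VAR:-fallback}`, `${tag#v}`) and `grep -q`. A minimal local sketch of the same logic follows; the function names are hypothetical, the formula text is read from stdin instead of being fetched with `gh api`, and nothing here is the workflow's actual code.

```sh
# Resolve the release tag: prefer the dispatch input ($1), fall back to the
# ref name ($2), and fail when both are empty.
resolve_tag() {
  tag="${1:-$2}"
  if [ -z "$tag" ]; then
    echo "could not resolve release tag" >&2
    return 1
  fi
  printf '%s\n' "$tag"
}

# Strip one leading "v" to get the version (v0.7.0 becomes 0.7.0).
tag_to_version() {
  printf '%s\n' "${1#v}"
}

# Succeed only if the formula text on stdin names the version and points at
# the matching release download path.
formula_matches() {
  version="$1"
  formula="$(cat)"
  printf '%s\n' "$formula" | grep -q "version \"$version\"" &&
    printf '%s\n' "$formula" | grep -q "releases/download/v$version/"
}
```

Usage would look like `version="$(tag_to_version "$(resolve_tag "$DISPATCH_TAG" "$REF_NAME")")"`, followed by piping the decoded formula into `formula_matches "$version"`.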


@ -55,4 +55,4 @@ brews:
install: |
bin.install "crabbox"
test: |
system "#{bin}/crabbox", "--version"
system bin/"crabbox", "--version"


@ -2,7 +2,283 @@
## Unreleased
- No unreleased changes yet.
### Added
- Added `provider: azure` for managed Azure Linux and native Windows SSH leases, including direct and brokered provisioning, shared Azure networking, SKU fallback, Azure docs, and cleanup support. Thanks @jwmoss.
## 0.7.0 - 2026-05-07
### Added
- Added mediated egress commands and browser wiring so Linux desktop leases can proxy selected app traffic through the operator machine via the coordinator bridge.
- Added WebVNC portal clipboard controls for sending local clipboard text into the remote session and copying remote clipboard text back to the local browser.
- Added rescue-first desktop/WebVNC failure output that names the failing layer and prints exact `rescue:` or native VNC fallback commands when bridges, viewers, browser launches, VNC targets, or input stacks hang.
- Added lease sharing for individual users or the owning org, including `crabbox share`, `crabbox unshare`, API access checks, and a portal share control on lease detail pages.
- Added collaborative WebVNC observer mode, with one active controller, read-only observers, and a portal takeover button that shows who is controlling the session.
- Added first-class `crabbox artifacts` commands for desktop screenshots, MP4 recordings, trimmed GIFs, logs, metadata, Mantis/OpenClaw QA templates, and PR-ready publishing through broker-owned artifact storage, AWS S3, or Cloudflare R2.
### Changed
- Changed WebVNC portal sharing to open as an in-session modal, added a standalone share-page back action, and simplified collaboration controls into a single stateful control button.
### Fixed
- Fixed `egress start --coordinator` so live public-route egress starts work when the local default coordinator is Cloudflare Access-protected.
- Fixed macOS WebVNC cursor visibility by enabling noVNC's dot-cursor fallback when Screen Sharing sends a transparent or zero-sized cursor.
- Fixed Tailscale exit-node bootstrap paths to prefer tailnet metadata and fail clearly when remote exit-node egress is not active.
- Fixed `run --no-sync` timing summaries so they report `sync_skipped=true`.
- Fixed native Windows command output so first-use PowerShell progress records do not leak CLIXML into run logs.
- Fixed Islo provider sync so `crabbox run --provider islo` uploads the local workspace, uses the correct `/workspace/<workdir>`, and falls back to chunked exec upload while the archive API returns server errors.
- Fixed Code and WebVNC bridge websocket auth so upgraded brokers receive short-lived bridge tickets in the `Authorization` header instead of logging them in URL query strings, while preserving query fallback for older brokers.
- Fixed managed AWS macOS desktop leases so readiness and WebVNC use a writable `ec2-user` work root, call `crabbox-ready` by absolute path, and read the generated Screen Sharing password via sudo.
- Fixed managed AWS macOS bootstrap so VNC password generation does not abort under `pipefail` before Screen Sharing readiness is installed.
- Fixed managed Linux bootstrap so SSH service activation cannot hang cloud-init before desktop/browser setup and readiness checks run.
- Fixed WebVNC daemon start-by-slug so coordinator-backed leases use the resolved target OS in the background bridge command.
- Fixed coordinator-backed `crabbox list` so a stale admin token no longer blocks normal logged-in users; the CLI now falls back to active user-visible leases instead of failing with `401 unauthorized`.
- Fixed desktop, screenshot, VNC, and WebVNC SSH helpers so they retry live fallback ports when a coordinator lease advertises an SSH port that is not ready yet.
## 0.6.0 - 2026-05-07
### Added
- Added `provider: daytona` for Daytona sandbox leases using Daytona's SDK/toolbox for sync and command execution, with short-lived SSH access available through `crabbox ssh`.
- Added Daytona CLI profile auth fallback so `daytona login --api-key ...` can satisfy Crabbox Daytona auth without duplicating `DAYTONA_API_KEY`.
- Added `provider: islo` for delegated Islo sandbox runs using the Islo Go SDK.
- Added a provider backend registry and authoring guide so delegated and SSH-backed providers can live in provider-owned packages while core keeps command parsing, rendering, and capability validation.
- Added `--tailscale-exit-node` and `--tailscale-exit-node-allow-lan-access` so managed Linux leases can route egress through an approved tailnet exit node.
- Added broker capacity hints for AWS leases, including selected market, attempted regions, quota/capacity advice, and configurable high-pressure class warnings.
- Added `crabbox code` and per-lease `/code/` portal URLs for authenticated code-server access on `--code` Linux leases.
- Added per-lease portal detail pages with bridge status, access-panel copy commands, recent run links, and a stop action.
- Added portal run detail pages with command metadata, result summaries, dense viewport-fitted portal tables, provider/OS badges, active/ended/provider/target filters, sticky portal chrome, and copyable retained log previews.
- Added latest lease telemetry snapshots for coordinator-backed Linux leases, including load, memory, disk, and uptime in `status --json` and the portal detail view.
- Added bounded lease telemetry history with portal sparklines and stale/high-resource badges on lease detail pages.
- Added run-level telemetry summaries with start/end Linux resource snapshots in run history JSON, human history output, and portal run tables/details.
- Added live run telemetry samples for longer Linux commands, including bounded coordinator storage and portal load/memory/disk trend lines on run detail pages.
- Added portal visibility for external Blacksmith Testbox runners synced from `crabbox list --provider blacksmith-testbox`, with owner-scoped runner rows, stale markers, GitHub Actions links, status badges, stuck filters, detail pages, and copyable local stop commands.
- Added admin portal visibility for non-owned runner leases, including `mine`/`system` filters and matching detail/code/VNC drilldowns for operator sessions.
- Added `crabbox desktop launch --webvnc --open` to launch a desktop browser/app and immediately bridge the same lease into the WebVNC portal.
- Added `crabbox webvnc --daemon`/`--background` plus `--status`/`--stop` for background WebVNC bridges without tmux.
- Added `crabbox media preview` for creating motion-trimmed GIF previews and optional trimmed MP4 clips from desktop recordings.
- Documented the prebaked runner image boundary: provider-owned AMIs/snapshots hold machine capabilities while repo/runtime caches stay in QA workflows or warm leases.
### Changed
- Changed AWS capacity fallback to route configured `CRABBOX_CAPACITY_REGIONS` across both brokered and direct AWS launches, with the deployed coordinator defaulting to a wider multi-region pool for better headroom.
- Changed coordinator lease requests to omit the default capacity block, preserving mixed-version broker compatibility while still sending explicit market, strategy, fallback, multi-region, availability-zone, or hint opt-out settings.
- Changed coordinator-backed CLI lease output to print broker capacity hints when AWS routing, quota, Spot fallback, or configured high-pressure classes are involved.
- Changed the portal lease table to merge external Blacksmith Testbox runners into the main grid as muted, disabled rows instead of rendering a separate external-runners table.
- Refactored built-in provider backend implementations into `internal/providers/<name>` packages while keeping command orchestration and rendering core-owned.
### Fixed
- Fixed Daytona SDK sync so tar creation and Daytona toolbox upload stream from disk instead of buffering large archives in memory.
- Fixed Daytona resource override handling so snapshot-only sandboxes reject generic `--class` and `--type` flags instead of accepting no-op compute settings.
- Fixed Islo delegated runs so shell-mode commands preserve raw shell strings and truncated exec streams fail instead of silently reporting success.
- Fixed provider-owned flags and target/capability validation to run through registered provider specs while preserving script-facing list JSON compatibility for coordinator and Blacksmith backends.
- Fixed Blacksmith Testbox queued/outage failures so users see the upstream queue state and practical fallback guidance instead of an opaque timeout.
- Fixed Blacksmith Testbox repo inference for mirrored repositories and portal runner sync for stale or external Testbox rows.
- Fixed managed Linux desktop/browser leases to preinstall video capture and native addon build helpers, avoiding per-scenario apt installs in browser QA runs.
- Fixed managed Linux desktop leases to use a slim XFCE session instead of bare Openbox, preserving a real panel/window-manager desktop while avoiding the full XFCE meta package.
- Fixed SSH readiness progress logs to distinguish open TCP ports, failed SSH authentication, and failed Crabbox ready checks.
- Fixed auto-shell command reconstruction so arguments with spaces stay quoted when shell operators such as `&&` are present.
- Fixed managed Linux bootstrap ordering so SSH is reachable before slow desktop/browser package setup while readiness still waits for the full desktop/browser contract.
- Fixed managed desktop/browser warmups so slow cloud-init bootstraps get a longer readiness window, retry once after SSH timeout, and clean up failed leases instead of leaking unusable VMs.
- Fixed brokered cloud server names so friendly-slug collisions with stale provider VMs do not block new leases.
- Fixed human WebVNC desktop launches to keep browser windows windowed by default and reserve fullscreen for explicit capture/video workflows.
- Fixed WebVNC portal status text and bridge commands so waiting/reset states explain the exact local bridge command to run.
- Fixed the Code portal waiting state so it shows bridge status, copy/reload controls, and automatically opens the workspace once the local bridge connects.
- Fixed `crabbox webvnc --stop` so daemon shutdown terminates the active child bridge, not only the supervisor.
- Fixed portal command rows so their copy affordance copies the matching local command instead of only labelling the section.
- Fixed portal Windows target badges to show compact `win` and `win (wsl2)` labels instead of `windows / normal`.
- Fixed portal access and time columns to use compact capability icons, relative time labels, and sortable time metadata instead of wide action buttons and Zulu timestamps.
- Fixed lease detail layout so local commands live inside the access panel instead of forcing a separate full-width commands section above recent runs.
- Fixed portal run detail layout density, responsive action alignment, and run telemetry readability so long-lived run pages fit operator viewports cleanly.
- Fixed generated docs-site navigation so the sidebar scroll position is preserved while moving between pages.
- Fixed Windows WebVNC credential handling so generated portal links preserve special characters and managed TightVNC sessions copy service passwords into the logged-in user's registry profile.
- Fixed managed Linux browser setup so Chrome/Chromium launches skip first-run and default-browser prompts.
- Fixed managed Linux browser cloud-init setup so Chrome/Chromium policy and wrapper generation cannot break YAML parsing.
- Fixed WebVNC portal passwords with escaped special characters and kept the bridge alive across viewer resets and transient coordinator EOFs.
## 0.5.1 - 2026-05-05
### Added
- Added `.crabboxignore` for repo-local sync-only exclude patterns shared by `run` and `sync-plan`.
- Added WebVNC portal controls for reconnect, fullscreen, and clipboard-ready bridge commands.
### Fixed
- Fixed managed AWS Windows WSL2 bootstrap by using the current Ubuntu WSL rootfs URL, downloading large rootfs files through `curl.exe`, and retrying empty or partial rootfs downloads instead of reusing a poisoned tarball. Thanks @vincentkoc.
- Fixed AWS Windows WSL2 mode overrides so they refresh the default instance type to a nested-virtualization-capable family. Thanks @steipete.
- Fixed AWS Windows WSL2 runs so mode overrides also refresh the default work root to `/work/crabbox` while keeping WSL2 sync on the fast rsync path.
- Fixed remote git seeding so an unfetchable local commit cannot leave an empty `.git` worktree that makes sync sanity report every tracked file as deleted.
- Skipped remote git seeding for local commits that are not present in any remote-tracking ref, avoiding slow doomed clone/fetch attempts before rsync.
- Fixed WebVNC bridge reconnects so reloading or reconnecting the browser no longer requires restarting the local bridge.
- Fixed Windows archive sync from macOS so Apple extended attributes do not spam remote tar warnings.
- Fixed the Homebrew formula test command so GoReleaser emits the expected formula syntax.
## 0.5.0 - 2026-05-04
### Added
- Added `--desktop`, `--browser`, and `crabbox vnc` for optional Linux UI/browser leases, including loopback-only VNC with per-lease passwords and headless browser support without a desktop.
- Added authenticated WebVNC portal support with `crabbox webvnc`, which bridges a desktop lease into the coordinator portal with short-lived bridge tickets and without exposing the remote VNC port.
- Added managed AWS Windows desktop leases with OpenSSH, Git for Windows, loopback TightVNC, per-lease VNC passwords, and `crabbox vnc`.
- Added managed AWS Windows WSL2 support for Linux command execution inside brokered Windows leases.
- Added AWS macOS desktop lease plumbing for EC2 Mac Dedicated Hosts, including Screen Sharing setup and per-lease credentials.
- Added `crabbox vnc --open` to start the SSH tunnel and launch the local VNC client for managed desktop leases.
- Added `crabbox desktop launch` to open a browser or app inside a visible desktop lease, including native Windows scheduled-task launch for the logged-in console session.
- Added `crabbox screenshot` to save a PNG from a desktop lease without opening a VNC client.
- Added optional Tailscale reachability for managed Linux leases with `--tailscale`, `--network auto|tailscale|public`, brokered OAuth auth-key minting, and non-secret tailnet metadata in status/inspect output.
- Added static macOS/Windows VNC endpoint discovery, including SSH-tunneled loopback VNC and trusted static direct VNC on `host:5900`.
- Added generated Windows console login details and auto-logon for managed AWS Windows desktop leases.
- Added a minimal XFCE desktop profile with panel/window manager for managed VNC leases.
- Added generated command help for grouped commands so `crabbox actions --help`, `crabbox cache --help`, `crabbox desktop --help`, and similar entrypoints exit cleanly.
### Changed
- Clarified static macOS/Windows VNC as existing-host access, not Crabbox-created boxes, so `--open` no longer launches an OS credential prompt unless `--host-managed` is passed.
- Switched top-level CLI routing to Kong while preserving existing per-command flags, passthrough remote commands, aliases, and exit-code behavior.
### Fixed
- Fixed WebVNC portal login redirects by canonicalizing broker origins before starting the browser login flow.
- Fixed AWS desktop provisioning and Windows SSH bootstrap issues that could leave managed desktop leases unreachable.
- Fixed passthrough command help such as `crabbox run --help` so it prints local usage instead of provisioning a remote lease.
- Fixed `crabbox desktop launch --browser` on freshly warmed desktop leases by creating the remote workdir before launching the app.
- Fixed failed Blacksmith Testbox warmups so printed, newly listed, or delayed `tbx_...` boxes are stopped instead of being left queued after an upstream workflow error.
- Fixed `crabbox run --junit` so all-passing JUnit files record results instead of leaving the coordinator run stuck when the failure list is empty.
- Fixed native Windows `--shell` runs so multi-statement PowerShell scripts keep their quotes instead of being re-parsed by a nested PowerShell process.
- Removed the static macOS managed-login path so static host VNC cannot be mistaken for a Crabbox-created external instance.
- Excluded macOS AppleDouble `._*` sidecar files from default sync manifests so native Windows archives do not transfer invalid TypeScript/package sidecars.
- Quoted `crabbox vnc` tunnel key paths so macOS `Application Support` lease keys can be pasted directly into a shell.
- Skipped Linux-only GitHub Actions hydration stop markers on native Windows static targets.
- Fixed brokered Tailscale requests on coordinators without OAuth secrets so they fail as disabled instead of entering the auth-key minting path.
- Fixed Worker deploy smoke to prefer the Crabbox-scoped Cloudflare token when it is present in the environment or local profile.
## 0.4.0 - 2026-05-03
### Added
- Added static SSH macOS and Windows targets with `--target macos|windows`, `--windows-mode normal|wsl2`, and config/env support for reusable hosts.
### Changed
- Brokered Hetzner and AWS leases now reject non-Linux targets clearly; use `provider: ssh` for macOS or Windows hosts.
### Fixed
- Made Blacksmith live smoke explicit opt-in so the default live smoke works in repositories without a Testbox workflow.
## 0.3.1 - 2026-05-03
### Added
- Added `actions.fields` config support so repository-specific workflow inputs are sent on every Actions hydration, with CLI `-f key=value` overrides. Thanks @vincentkoc.
- Added a command-doc drift check to `npm run docs:check` so every top-level CLI command has a matching command page and index entry. Thanks @stainlu.
### Fixed
- Deferred run-history creation against legacy coordinators until a lease is known, avoiding noisy `invalid_lease_id` failures before command execution. Thanks @vincentkoc.
- Suppressed repeated run-event append warnings when a legacy coordinator does not support the newer run-event path. Thanks @vincentkoc.
- Fixed recorded run logs so long noisy commands are stored in bounded chunks instead of losing the failure evidence between the first output events and the final tail.
- Forced SSH to use Crabbox's per-lease identity file so local SSH-agent keys cannot exhaust server auth attempts before the runner key is tried.
## 0.3.0 - 2026-05-02
Crabbox 0.3.0 makes brokered runs much easier to observe and debug, adds
trusted AWS image lifecycle commands, improves AWS and Blacksmith reliability,
and tightens coordinator auth boundaries.
### Added
- Added early durable run session handles and append-only run events, plus `crabbox events <run-id>` for inspecting the coordinator event log.
- Added `crabbox attach <run-id>` for following recorded events from active runs, plus `--after` and `--limit` pagination for `crabbox events`. Thanks @stainlu.
- Added `--timing-json` for `warmup`, `actions hydrate`, and `run` so provider comparisons can read stable sync, command, total, exit-code, and Actions run timing from one JSON record.
- Added `--market spot|on-demand` to `warmup` and `run` so AWS capacity market choice no longer requires environment-only overrides.
- Added `crabbox image create --id <cbx_id> --name <ami-name> [--wait]` for trusted operators to create AWS AMIs from active brokered AWS leases.
- Added `crabbox image promote <ami-id>` for trusted operators to promote an available AMI as the coordinator default for future brokered AWS leases.
- Added JSON output and wait polling for image creation, including `--wait-timeout` and `--no-reboot` controls.
- Added best-effort AWS vCPU quota preflight for brokered launch fallback, with concise quota-code attempt metadata when a requested instance type cannot fit the applied quota.
- Added Blacksmith Testbox timing JSON output that reports delegated sync in the same schema as AWS and Hetzner runs.
- Added coordinator-orphan hints to human `crabbox list` output when provider machines carry no active coordinator lease.
- Added the Access-protected coordinator route `https://crabbox-access.openclaw.ai` for service-token proof and hardened automation.
- Added Cloudflare Access service-token headers for coordinator CLI requests. Thanks @stainlu.
- Added optional GitHub team allowlisting for browser-login tokens with `CRABBOX_GITHUB_ALLOWED_TEAMS`. Thanks @stainlu.
- Added separate coordinator admin-token auth so shared operator tokens no longer grant admin routes.
- Added Cloudflare Access JWT verification before Access identity can affect bearer-token ownership.
- Added coordinator image routes for admin-token callers: `POST /v1/images`, `GET /v1/images/{ami-id}`, and `POST /v1/images/{ami-id}/promote`.
- Added AWS provider support for `CreateImage` and `DescribeImages`, with Crabbox-owned AMI tags.
- Added `docs/commands/image.md` and linked the image command from the CLI docs, command index, docs site, and source map.
- Added `npm run docs:check` with internal Markdown link validation plus docs-site generation, and wired it into CI.
- Added `scripts/live-smoke.sh` for opt-in AWS, Hetzner, and Blacksmith Testbox live smoke coverage from a real repository checkout.
- Added `scripts/live-auth-smoke.sh` for opt-in live proof that shared tokens cannot call admin routes, admin tokens can, Access edge auth works, and raw Access identity headers are ignored.
- Added `scripts/deploy-worker-smoke.sh` to run the Worker gate, deploy the coordinator, verify public health routes, and optionally include a short AWS lease smoke.
### Changed
- Hydrated runs now skip the expensive Git base-ref hydration fetch when the remote base is already current enough for the local base SHA.
- Brokered AWS class requests now fall back through provider candidates, account-policy launch rejections, and a small burstable fallback instead of failing on the first Free Tier-ineligible high-core type.
- Brokered AWS fallback now skips known quota-impossible candidates before calling `RunInstances`, while preserving explicit `--type` failure semantics.
- Brokered lease records now keep the requested AWS instance type plus concise provisioning-attempt metadata when fallback chooses a different type.
- Coordinator run history now records the resolved lease provider/class/type when a lease exists, avoiding stale requested-type entries after fallback.
- Brokered AWS lease creation now uses the promoted AWS image when no explicit `awsAMI` or `CRABBOX_AWS_AMI` override is supplied.
- Moved the deployed coordinator route to the OpenClaw Cloudflare account at `https://crabbox.openclaw.ai` and scoped default broker org/auth settings to `openclaw`.
- User config writes now force `0600` permissions, and `crabbox doctor` reports overly broad config permissions.
- Image route validation now rejects noncanonical lease IDs, invalid AMI IDs, invalid AMI names, non-AWS leases, and promotion attempts before an image reaches `available`.
### Fixed
- Recorded durable `run.failed` events reliably for coordinator-backed pre-command failures such as lease claim, bootstrap, sync, and remote workdir errors.
- Fixed retained run-log tails under concurrent stdout/stderr writes so `crabbox logs` does not drop lines while run events are being recorded.
- Included the GitHub Actions hydration run URL in `crabbox run --timing-json` output when an Actions-hydrated workspace marker carries a run ID.
- Preserved explicit AWS `--type` requests as exact instance-type requests; Crabbox now fails clearly instead of silently falling back when the user asked for a specific type.
- Fixed AWS On-Demand launches by omitting Spot request tag specifications when no Spot request is created.
- Fixed Blacksmith Testbox JSON list output so the CLI returns an empty array when Blacksmith reports no active testboxes.
- Fixed brokered AWS security-group creation by sending EC2's required `GroupDescription` parameter, restoring first-run AWS provisioning in fresh accounts.
- Fixed coordinator warmup waits to keep touching the lease during slow bootstrap so short idle timeouts do not release a box while the foreground CLI is still waiting.
- Fixed SSH known-host handling for macOS config paths containing spaces, restoring per-lease known-host isolation under `Library/Application Support`.
- Scoped SSH ControlMaster sockets by per-lease key path so fast IP reuse across ephemeral machines cannot inherit a stale control connection.
- Fixed `crabbox list --provider blacksmith-testbox --json` to return parsed JSON instead of rejecting the shared `--json` flag.
- Prevented caller-supplied Access identity headers from overriding signed GitHub user token identity. Thanks @stainlu.
- Canceled SSH bootstrap waits when the coordinator lease disappears or becomes inactive, and made wait progress include elapsed and remaining time.
- Warned before running JavaScript package-manager commands on an unhydrated raw box when the repo declares an Actions hydration workflow.
- Fixed the generated docs-site mobile menu icon so the hamburger bars remain visible on narrow iOS/Safari viewports.
- Fixed responsive padding on the generated docs-site frontpage body content.
- Documented self-hosted GitHub OAuth setup so external coordinator deployments can avoid `Invalid redirect_uri` login failures.
## 0.2.0 - 2026-05-01
Crabbox 0.2.0 hardens the brokered runner path after real AWS and Blacksmith Testbox use: browser login is safer, AWS SSH ingress is no longer world-open by default, SSH readiness waits for the Crabbox bootstrap marker, and fallback SSH ports are configurable instead of being hidden port-22 magic.
### Added
- Added GitHub browser login for `crabbox login`, including signed user tokens, polling-based CLI completion, `--no-browser`, and JSON output support.
- Added coordinator OAuth routes for GitHub login: `/v1/auth/github/start`, `/v1/auth/github/callback`, and `/v1/auth/github/poll`.
- Added signed non-admin user-token auth in the Worker while keeping the shared operator token for admin routes.
- Added GitHub org membership enforcement before minting browser-login tokens.
- Added the canonical coordinator endpoint configured for OAuth callback generation.
- Added Blacksmith Testbox workflow flags for `crabbox warmup` and `crabbox run`, enabling one-command Testbox runs without repo YAML or environment variables.
- Added configurable SSH fallback ports via `ssh.fallbackPorts` and `CRABBOX_SSH_FALLBACK_PORTS`.
### Changed
- Updated CLI defaults, docs, examples, and auth guidance to prefer `https://crabbox.openclaw.ai`.
- Clarified that Cloudflare Access OAuth and Crabbox CLI OAuth are separate GitHub OAuth apps with separate callback URLs.
- Scoped normal GitHub-login users to their own leases, run history, logs, and usage; shared-token admin auth remains required for pool and fleet-wide operator views.
- AWS coordinator-created security groups now allow SSH only from configured CIDRs, the CLI-detected outbound IPv4 CIDR, or the request source IP instead of adding world-open SSH ingress.
- Direct AWS security groups now honor the configured AWS SSH source CIDRs when creating managed SSH ingress.
- Direct and brokered AWS now open the same configured SSH port candidates that the CLI will try.
### Fixed
- Cleaned up Blacksmith Testbox local lease claims and per-lease SSH keys after failed warmups, explicit stops, and one-shot runs.
- Fixed `status` and `inspect` readiness reporting so active leases with a host are not marked ready until SSH and `crabbox-ready` actually respond.
- Fixed remote sync sanity failures to include the remote deletion count and sample paths instead of hiding the useful stderr behind `exit status 66`.
- Restricted Worker admin routes to shared-token admin auth so GitHub browser-login users cannot call admin endpoints.
- Fixed `whoami` reporting for GitHub browser-login tokens.
- Fixed exact `cbx_...` lookups bypassing owner-scoped slug authorization checks.
- Added cleanup and a pending-login cap for unauthenticated GitHub OAuth login starts.
## 0.1.0 - 2026-05-01

577
README.md

@ -1,320 +1,139 @@
# Crabbox
# 🦀 📦 Crabbox
Crabbox is an open source remote testbox runner for maintainers and agents. It gives a fast local loop on owned cloud capacity: provision or reuse a warm Linux box, sync the current dirty checkout, run a command remotely, stream output, and clean up.
[![CI](https://github.com/openclaw/crabbox/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/openclaw/crabbox/actions/workflows/ci.yml)
[![Release](https://github.com/openclaw/crabbox/actions/workflows/release.yml/badge.svg)](https://github.com/openclaw/crabbox/actions/workflows/release.yml)
[![Latest release](https://img.shields.io/github/v/release/openclaw/crabbox?sort=semver)](https://github.com/openclaw/crabbox/releases/latest)
The current implementation is a Go CLI plus a Cloudflare Worker/Durable Object coordinator. The CLI uses the coordinator for brokered Hetzner or AWS EC2 Spot leases, with direct provider calls kept as a debug fallback.
**Warm a box, sync the diff, run the suite.**
Documentation lives in [`docs/`](docs/README.md). Start with [How Crabbox Works](docs/how-it-works.md) for the end-to-end mental model, and use [Source Map](docs/source-map.md) to trace docs back to implementation files. The GitHub Pages site is generated from those Markdown files with a small dependency-free builder:
Crabbox is an open-source remote testbox runner for maintainers and AI agents. Lease fast managed cloud capacity, or point at an existing SSH host, sync your dirty checkout, run a command remotely, stream output, and release. Local edit-save-run loop, cloud-grade compute.
```sh
node scripts/build-docs-site.mjs
open dist/docs-site/index.html
crabbox run -- pnpm test
```
## Install
Behind that single command: a Go CLI on your laptop, a Cloudflare Worker broker that owns provider credentials and lease state, and a managed runner on Hetzner Cloud, AWS EC2, or Azure. Azure supports managed Linux and native Windows VMs. Crabbox can also wrap Blacksmith Testboxes when you choose `provider: blacksmith-testbox`, use Daytona or Islo sandboxes for direct-provider workflows, or use `provider: ssh` for existing macOS and Windows targets.
Latest release: `0.1.0`.
---
## Install
```sh
brew install openclaw/tap/crabbox
crabbox --version
```
Without Homebrew, download the matching archive from the `v0.1.0` release on GitHub:
No Homebrew? Grab a [GoReleaser archive](https://github.com/openclaw/crabbox/releases) for macOS, Linux, or Windows.
```text
https://github.com/openclaw/crabbox/releases/tag/v0.1.0
```
Prerequisites on the laptop: `git`, `ssh`, `ssh-keygen`, `rsync`, `curl`.
## How It Works
Crabbox has a small control plane and a simple data plane:
```text
developer laptop
crabbox CLI
|
| HTTPS JSON API, bearer auth
v
Cloudflare Worker
Fleet Durable Object
|
| provider API
v
Hetzner server or AWS EC2 Spot instance
developer laptop
|
| rsync + SSH
v
leased runner
```
The **CLI** is the user-facing tool. It loads config from `~/.config/crabbox/config.yaml` and from a repo-local `crabbox.yaml` or `.crabbox.yaml`, creates a per-lease SSH key, asks the broker for a lease, waits for SSH, seeds remote Git when possible, builds a Git file-list sync manifest, skips sync when the local and remote fingerprints match, rsyncs the current checkout, runs the requested command, streams output, and releases the lease unless `--keep` is set. SSH prefers the configured port and can fall back to port 22 during bootstrap.
The **broker** is the Cloudflare Worker at `crabbox-coordinator.steipete.workers.dev`. It authenticates requests with `CRABBOX_SHARED_TOKEN`, routes all fleet operations through a single Durable Object, and owns cloud-provider credentials. Local machines do not need AWS or Hetzner API keys for the normal path.
The **Fleet Durable Object** is the serialized scheduler and lease store. It creates lease IDs, records owner/profile/class/provider metadata, tracks expiry, and has an alarm that expires stale leases. Release and expiry both call the provider delete path for non-kept machines.
The **provider layer** provisions capacity:
- Hetzner: imports or reuses the SSH key, creates a server, applies Crabbox labels, and falls back across configured server types when quota or capacity rejects a request.
- AWS: signs EC2 Query API calls inside the Worker, imports or reuses the SSH key pair, creates or reuses the `crabbox-runners` security group, launches one-time Spot instances, tags instances/volumes/Spot requests, and falls back across broad C/M/R instance families. Direct AWS mode can use Spot placement scores across configured regions before provisioning.
The **runner** is just an Ubuntu machine bootstrapped by cloud-init. Bootstrap creates the `crabbox` user, enables SSH on port `2222`, installs only Crabbox plumbing (`curl`, Git, rsync, jq, OpenSSH), and prepares `/work/crabbox` plus cache directories. Project runtimes such as Go, Node, pnpm, Docker, services, and secrets belong in the repository's GitHub Actions hydration, devcontainer, Nix, mise/asdf, or setup scripts. The runner does not need broker credentials.
The normal lifecycle is:
1. `crabbox run --class standard -- <command>` loads local config.
2. CLI sends `POST /v1/leases` with provider, class, TTL, idle timeout, slug, SSH public key, and bootstrap options.
3. Worker creates a Hetzner server or AWS Spot instance and stores the lease metadata, including `lastTouchedAt` and idle expiry.
4. CLI waits for `crabbox-ready` over SSH.
5. CLI seeds remote Git when possible, then rsyncs tracked plus nonignored untracked files into `/work/crabbox/<lease>/<repo>`.
6. CLI records sync fingerprints, enforces sync size/time guardrails, runs sync sanity checks, and hydrates configured base-ref history.
7. CLI runs the command over SSH and returns the remote exit code.
8. CLI releases the lease unless it was kept; kept leases still auto-release after idle timeout.
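Steps 5 and 6 mention sync fingerprints without saying how they are computed. The sketch below shows one plausible approach, assuming a manifest of paths and content digests hashed with SHA-256. The `Entry` type, `Fingerprint`, and `NeedsSync` are hypothetical names, and Crabbox's real fingerprint inputs may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Entry is one file in a hypothetical sync manifest.
type Entry struct {
	Path   string
	Digest string // content digest of the file, however it was obtained
}

// Fingerprint hashes the manifest in path order, so the result does not
// depend on the order in which files were listed.
func Fingerprint(entries []Entry) string {
	sorted := append([]Entry(nil), entries...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Path < sorted[j].Path })
	h := sha256.New()
	for _, e := range sorted {
		// NUL separators keep "a"+"bc" distinct from "ab"+"c".
		fmt.Fprintf(h, "%s\x00%s\x00", e.Path, e.Digest)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// NeedsSync reports whether rsync has to run: the remote has no recorded
// fingerprint, or it differs from the local one.
func NeedsSync(local, remote string) bool {
	return remote == "" || local != remote
}

func main() {
	local := Fingerprint([]Entry{{"go.mod", "aa11"}, {"main.go", "bb22"}})
	remote := Fingerprint([]Entry{{"main.go", "bb22"}, {"go.mod", "aa11"}})
	fmt.Println("needs sync:", NeedsSync(local, remote))
}
```

Sorting before hashing is the design choice that matters here: two listings of the same tree must produce the same fingerprint, otherwise the no-change fast path would never trigger.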
The GitHub Actions hydration lifecycle reuses the same machines, but lets the repository's workflow define setup:
1. `crabbox warmup` leases a reusable box and prints both a stable `cbx_...` ID and a friendly slug.
2. `crabbox actions hydrate --id blue-lobster` registers that box as an ephemeral GitHub Actions runner, dispatches the configured workflow, and waits for the workflow to write a ready marker.
3. The workflow runs normal Actions steps such as checkout, dependency install, cache/service setup, and secret-backed environment hydration.
4. `crabbox run --id blue-lobster -- <command>` syncs the local dirty checkout into the hydrated `$GITHUB_WORKSPACE`, sources the workflow's non-secret env handoff, and runs commands there.
Crabbox does not parse or reimplement GitHub Actions YAML. The project-owned workflow decides what to install and when the machine is ready. GitHub secrets and OIDC request tokens remain workflow-step scoped unless that workflow intentionally persists its own short-lived handoff.
Direct provider mode still exists for debugging. If no broker is configured, `--provider aws` uses the local AWS SDK credential chain and `--provider hetzner` uses `HCLOUD_TOKEN` or `HETZNER_TOKEN`. The brokered path is the default operational model.
## Status
Working today:
- [`crabbox doctor`](docs/commands/doctor.md)
- [`crabbox login`](docs/commands/login.md)
- [`crabbox logout`](docs/commands/logout.md)
- [`crabbox whoami`](docs/commands/whoami.md)
- [`crabbox init`](docs/commands/init.md)
- [`crabbox warmup`](docs/commands/warmup.md)
- [`crabbox run`](docs/commands/run.md)
- [`crabbox sync-plan`](docs/commands/sync-plan.md)
- [`crabbox history`](docs/commands/history.md)
- [`crabbox logs`](docs/commands/logs.md)
- [`crabbox results`](docs/commands/results.md)
- [`crabbox cache`](docs/commands/cache.md)
- [`crabbox status`](docs/commands/status.md)
- [`crabbox list`](docs/commands/list.md)
- [`crabbox usage`](docs/commands/usage.md)
- [`crabbox admin`](docs/commands/admin.md)
- [`crabbox ssh`](docs/commands/ssh.md)
- [`crabbox inspect`](docs/commands/inspect.md)
- [`crabbox stop`](docs/commands/stop.md)
- [`crabbox actions`](docs/commands/actions.md)
- [`crabbox pool list`](docs/commands/list.md)
- [`crabbox machine cleanup`](docs/commands/cleanup.md)
- [`crabbox cleanup`](docs/commands/cleanup.md)
- [Cloudflare Worker coordinator on Workers/Durable Objects](docs/features/coordinator.md)
- [bearer-token coordinator auth for automation](docs/features/broker-auth-routing.md)
- [Cloudflare route for `crabbox.clawd.bot/*`](docs/features/broker-auth-routing.md)
- [Hetzner server provisioning with class fallback](docs/features/providers.md)
- [AWS EC2 Spot provisioning with class fallback](docs/features/providers.md)
- [minimal cloud-init bootstrap for SSH, Git, rsync, and work directories](docs/features/runner-bootstrap.md)
- [Git file-list rsync overlay of tracked and nonignored local files](docs/features/sync.md)
- [sync fingerprint skip for no-change hot runs](docs/features/sync.md)
- [per-lease SSH keys under the Crabbox config directory](docs/features/ssh-keys.md)
- [coordinator cost guardrails and monthly usage summaries](docs/features/cost-usage.md)
- [coordinator run history and retained run-log tails](docs/features/history-logs.md)
- [JUnit test result summaries for recorded runs](docs/features/test-results.md)
- [explicit warm-box cache controls](docs/features/cache.md)
- [operator login, identity, and admin lease controls](docs/features/auth-admin.md)
- [provider-backed price estimates with static fallback rates](docs/features/cost-usage.md)
- [sync sanity checks for mass tracked deletions](docs/features/sync.md)
- [shallow Git hydration for configured base-ref detection](docs/features/sync.md)
- [GitHub Actions-backed hydration into project-defined runner workspaces](docs/features/actions-hydration.md)
- [SSH execution on port `2222`](docs/features/runner-bootstrap.md)
Not yet done:
- untrusted multi-tenant isolation
## Quick Start
Prerequisites:
- `git`, `ssh`, `ssh-keygen`, `rsync`, and `curl`
- broker config in `~/.config/crabbox/config.yaml` or `~/Library/Application Support/crabbox/config.yaml` on macOS
Configure the deployed broker:
## Quick start
```sh
printf '%s' "$CRABBOX_COORDINATOR_TOKEN" | \
crabbox config set-broker \
--url https://crabbox-coordinator.steipete.workers.dev \
--provider aws \
--token-stdin
```
# log in once per machine (stores a broker token in user config)
crabbox login
Check local prerequisites and broker access:
```sh
# verify local prerequisites and broker reachability
crabbox doctor
```
Inspect broker config:
# one-shot: lease, sync, run, release
crabbox run -- pnpm test
```sh
crabbox config show
```
Onboard a repo for Crabbox:
```sh
crabbox init
```
Warm a reusable testbox:
```sh
crabbox warmup --profile project-check --class beast
```
Hydrate that box through the repo's GitHub Actions setup, then run local tests inside the hydrated workspace:
```sh
crabbox actions hydrate --id blue-lobster
CI=1 crabbox run --id blue-lobster -- pnpm test:changed:max
```
Use AWS EC2 Spot through the broker:
```sh
crabbox warmup --class beast
```
Run a command on an existing lease:
```sh
CI=1 crabbox run --id blue-lobster -- pnpm test:changed:max
```
Inspect and connect:
```sh
crabbox status --id blue-lobster
# or warm a box once, then reuse it
crabbox warmup # prints cbx_... + a slug
crabbox run --id blue-lobster -- pnpm test:changed
crabbox ssh --id blue-lobster
crabbox inspect --id blue-lobster --json
```
Inspect usage and estimated cost:
```sh
crabbox usage
crabbox usage --scope org --org openclaw
crabbox usage --scope all --json
```
`crabbox usage` reads coordinator history, so it requires a configured broker. Cost is an estimate for compute leases, not a provider invoice: the coordinator prefers explicit `CRABBOX_COST_RATES_JSON` overrides, then provider pricing from AWS Spot history or Hetzner server-type prices, then built-in fallback rates. Full reference: [docs/commands/usage.md](docs/commands/usage.md).
Use the OpenClaw plugin when an agent should drive Crabbox through OpenClaw tools instead of shelling out manually. The repository root is also a native OpenClaw plugin package; install it from this repo or from a packaged release, then use the `crabbox_run`, `crabbox_warmup`, `crabbox_status`, `crabbox_list`, and `crabbox_stop` tools.
Stop a kept server:
```sh
crabbox stop blue-lobster
```
Print the CLI version:
Every lease has a stable `cbx_...` ID and a friendly crustacean slug (`blue-lobster`, `swift-hermit`, …). Either works wherever an `--id` is accepted.
```sh
crabbox --version
```
## Machine Classes
`beast` is the default. Hetzner uses dedicated-server classes:
## How it works
```text
standard ccx33, cpx62, cx53
fast ccx43, cpx62, cx53
large ccx53, ccx43, cpx62, cx53
beast ccx63, ccx53, ccx43, cpx62, cx53
your laptop Cloudflare Worker cloud provider
------------- ------------------ --------------
crabbox CLI -- HTTPS --> Fleet Durable Object --> Hetzner / AWS EC2 / Azure
| lease + cost state |
| |
+------------ SSH + rsync to leased runner <--------------+
```
During verification, Hetzner rejected `ccx63`, `ccx53`, and `ccx43` because of the account dedicated-core quota, so Crabbox fell back to `cpx62`.
- **CLI** — Go binary. Loads config, mints a per-lease SSH key, asks the broker for a lease, waits for SSH, seeds remote Git, rsyncs the dirty checkout (with fingerprint skip when nothing changed), runs the command, streams output, releases.
- **Broker** — Cloudflare Worker at `crabbox.openclaw.ai` plus a single Durable Object. Owns provider credentials, serializes lease state, enforces active-lease and monthly spend caps, and expires stale leases by alarm. Auth is GitHub login or a shared bearer token.
- **Runner** — a throwaway SSH machine prepared with SSH on the primary port, default `2222`, plus configured fallback ports and Crabbox's sync/run prerequisites. Linux uses Ubuntu with cloud-init and `/work/crabbox`; native Windows uses OpenSSH, Git for Windows, and `C:\crabbox`. No broker credentials live on the box. Project runtimes (Go, Node, Docker, services, secrets) come from your repo's GitHub Actions hydration, devcontainer, Nix, mise/asdf, or setup scripts — not from Crabbox.
AWS uses flexible EC2 Spot candidate pools:
A direct-provider mode (`--provider hetzner|aws|azure` with local credentials) exists for debugging the broker itself; the brokered path is the default.
For the full mental model, see [How Crabbox Works](docs/how-it-works.md). For the doc-to-code map, see [Source Map](docs/source-map.md).
## Highlights
- **One-shot or warm.** `crabbox run` for fire-and-forget; `crabbox warmup` + `--id` for repeated runs against the same box.
- **Run observability.** Every coordinator-backed run gets an early `run_...` handle. Use `crabbox attach <run-id>` while it is active, `crabbox events <run-id> --after <seq> --limit <n>` for durable lifecycle/output events, and `crabbox logs <run-id>` for retained output after completion.
- **Stable timing records.** `--timing-json` on `run`, `warmup`, and `actions hydrate` gives scripts one machine-readable sync/command/total timing schema across AWS, Hetzner, and Blacksmith Testboxes.
- **Local-first sync.** No clean-checkout requirement. Tracked + nonignored files only, fingerprint skip on no-op runs, sanity checks against suspicious mass deletions, optional shallow base-ref hydration for changed-test workflows.
- **Brokered cloud.** Maintainers and agents share infra without sharing provider tokens. Hetzner, AWS EC2, and Azure are managed providers; AWS also owns Windows WSL2 and EC2 Mac targets. Linux defaults to Spot unless capacity config says otherwise. Providers fall back across compatible instance families when capacity or quota rejects a request.
- **Azure Linux and native Windows.** `provider: azure` provisions Linux and native Windows VMs in a configurable Azure subscription using `DefaultAzureCredential` in direct mode or service-principal secrets in the broker. Crabbox creates a shared resource group, vnet, subnet, and NSG on first use, then per-lease public IPs, NICs, and VMs. Linux uses cloud-init; Windows uses VM Agent Custom Script Extension to install OpenSSH/Git and configure the Crabbox user.
- **macOS and Windows static hosts.** `provider: ssh` reuses existing machines; it does not create macOS or Windows Crabbox boxes. macOS and Windows WSL2 use the POSIX rsync path; native Windows uses PowerShell plus tar archive sync.
- **Blacksmith Testbox wrapper.** Set `provider: blacksmith-testbox` to delegate warmup/run/list/status/stop to the Blacksmith CLI while Crabbox keeps local slugs, repo claims, timing summaries, config conventions, and portal visibility for active external runners.
- **Daytona and Islo sandboxes.** Set `provider: daytona` for Daytona SDK/toolbox execution from a snapshot with explicit SSH access when needed, or `provider: islo` for delegated Islo sandbox execution through the Islo Go SDK.
- **Trusted AWS images.** Operators can create AMIs from active brokered AWS leases and promote a known-good image as the coordinator default.
- **Cost guardrails.** Per-lease and monthly spend caps. Live pricing from EC2 Spot history or Hetzner server-type prices, with static fallbacks. `crabbox usage` summarizes spend by user, org, provider, and type.
- **GitHub Actions hydration.** `crabbox actions hydrate` registers a leased box as an ephemeral Actions runner, so the repo's own workflow installs runtimes, services, and secrets. Crabbox does not parse Actions YAML.
- **Interactive desktop and browser leases.** `--browser` provisions Chrome or Chromium for headless automation, `--desktop` provisions visible UI with tunnel-only VNC takeover on managed Linux, AWS native Windows, and AWS EC2 Mac targets. `crabbox desktop doctor` checks session, VNC, input tooling, browser, ffmpeg, screen size, screenshot capture, and WebVNC portal state; `desktop click/paste/type/key` provide first-class input helpers so agents do not hand-roll brittle `xdotool` snippets. QA systems such as Mantis own scenario logic, screenshots, and PR evidence. Azure native Windows is SSH/sync/run only; use AWS for managed Windows desktop/WSL2 or `provider: ssh` for an existing Windows host.
- **Authenticated web portal.** Browser login opens owner-scoped and explicitly shared lease/run views with searchable, paginated tables, muted external-runner rows, compact provider/OS/access icons, relative sortable times, recent run logs/events, WebVNC, code-server, and Linux lease/run telemetry charts. `crabbox share` can grant a lease to one user or the owning org, and the lease page exposes the same sharing controls for owners/managers. WebVNC is preferred for human demos because it preloads the VNC password; `webvnc status` reports local daemon, tunnel, target reachability, bridge/viewer state, recent events, URL/password, and native VNC fallback, while `webvnc reset` restarts only the selected lease's WebVNC/input stack. Admin sessions can also see non-owned runner leases behind `mine`/`system` filters.
- **Hardened coordinator auth.** GitHub browser login, owner-scoped leases, admin-only routes, optional GitHub team allowlists, Cloudflare Access JWT verification, and service-token support keep normal use and operator automation separate.
- **OpenClaw plugin.** The repo root is a native OpenClaw plugin for box lifecycle operations: `crabbox_run`, `crabbox_warmup`, `crabbox_status`, `crabbox_list`, and `crabbox_stop`. Run inspection stays in the CLI and Crabbox skill.
- **Operator surface.** `doctor`, `init`, `status`, `inspect`, `list`, `usage`, `history`, `logs`, `results`, `cache`, `admin`, `cleanup`, plus `--json` output where it matters.
## Machine classes
`beast` is the default. Each managed provider falls back across an ordered list of instance types.
```text
standard c7a.8xlarge, c7i.8xlarge, m7a.8xlarge, m7i.8xlarge, c7a.4xlarge
fast c7a.16xlarge, c7i.16xlarge, m7a.16xlarge, m7i.16xlarge, c7a.12xlarge, c7a.8xlarge
large c7a.24xlarge, c7i.24xlarge, m7a.24xlarge, m7i.24xlarge, r7a.24xlarge, c7a.16xlarge, c7a.12xlarge
beast c7a.48xlarge, c7i.48xlarge, m7a.48xlarge, m7i.48xlarge, r7a.48xlarge, c7a.32xlarge, c7i.32xlarge, m7a.32xlarge, c7a.24xlarge, c7a.16xlarge
Hetzner standard ccx33, cpx62, cx53
fast ccx43, cpx62, cx53
large ccx53, ccx43, cpx62, cx53
beast ccx63, ccx53, ccx43, cpx62, cx53
AWS Linux standard c7a/c7i/m7a/m7i.8xlarge family
fast …16xlarge family
large …24xlarge family
beast …48xlarge family, falling back to 32x/24x/16x
AWS Win standard m7i.large, m7a.large, t3.large
fast m7i.xlarge, m7a.xlarge, t3.xlarge
large m7i.2xlarge, m7a.2xlarge, t3.2xlarge
beast m7i.4xlarge, m7a.4xlarge, m7i.2xlarge
AWS WSL2 standard m8i.large, m8i-flex.large, c8i.large, r8i.large
fast m8i.xlarge, m8i-flex.xlarge, c8i.xlarge, r8i.xlarge
large m8i.2xlarge, m8i-flex.2xlarge, c8i.2xlarge, r8i.2xlarge
beast m8i.4xlarge, m8i-flex.4xlarge, c8i.4xlarge, r8i.4xlarge, m8i.2xlarge
AWS macOS all mac2.metal unless --type is set
Azure standard Standard_D32ads_v6, Standard_D32ds_v6, Standard_F32s_v2, then 16-vCPU fallbacks
fast Standard_D64ads_v6, Standard_D64ds_v6, Standard_F64s_v2, then 48/32-vCPU fallbacks
large Standard_D96ads_v6, Standard_D96ds_v6, then 64/48-vCPU fallbacks
beast Standard_D192ds_v6, Standard_D128ds_v6, then 96/64-vCPU fallbacks
Azure Win standard Standard_D2ads_v6, Standard_D2ds_v6, Standard_D2ads_v5, Standard_D2ds_v5, Standard_D2as_v6
fast Standard_D4ads_v6, Standard_D4ds_v6, Standard_D4ads_v5, Standard_D4ds_v5, Standard_D4as_v6
large Standard_D8ads_v6, Standard_D8ds_v6, Standard_D8ads_v5, Standard_D8ds_v5, Standard_D8as_v6
beast Standard_D16ads_v6, Standard_D16ds_v6, Standard_D16ads_v5, Standard_D16ds_v5, Standard_D8ads_v6
```
Set `CRABBOX_SERVER_TYPE` or pass `--type` to use another EC2 type such as `c8a.24xlarge`.
## Cloudflare Deployment
Worker source lives in `worker/`.
Local checks:
```sh
npm ci --prefix worker
npm run format:check --prefix worker
npm run lint --prefix worker
npm run check --prefix worker
npm test --prefix worker
npm run build --prefix worker
```
Deploy:
```sh
export CLOUDFLARE_API_TOKEN="$CRABBOX_CLOUDFLARE_API_TOKEN"
export CLOUDFLARE_ACCOUNT_ID="$CRABBOX_CLOUDFLARE_ACCOUNT_ID"
npx wrangler deploy --config worker/wrangler.jsonc
```
Required Worker secrets:
```text
HETZNER_TOKEN
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
CRABBOX_SHARED_TOKEN
```
The Worker is deployed at:
```text
https://crabbox-coordinator.steipete.workers.dev
```
The Cloudflare route `crabbox.clawd.bot/*` is also attached and currently protected by Cloudflare Access.
## OpenClaw Verification
Verified from `/Users/steipete/Projects/openclaw` on a Cloudflare-created fallback `cpx62` runner:
```sh
CI=1 /usr/bin/time -p /Users/steipete/Projects/crabbox/bin/crabbox run --id cbx_f60f47cbc879 -- pnpm test:changed:max
```
Result:
- 61 Vitest shards completed successfully.
- End-to-end warm wall time was 93.66 seconds through the Cloudflare coordinator path.
- The timing includes rsync scan, remote Git hydration, command execution, and output streaming.
For the fastest dedicated-core verification, raise the Hetzner dedicated-core quota and re-run on `ccx63`.
Override with `--type` or `CRABBOX_SERVER_TYPE` for a specific instance.
## Configuration
Config file:
Config resolves in order: flags → env → repo `.crabbox.yaml` → user `~/.config/crabbox/config.yaml` → defaults.
```yaml
broker:
url: https://crabbox-coordinator.steipete.workers.dev
url: https://crabbox.openclaw.ai
provider: aws
token: ...
class: beast
@ -322,6 +141,7 @@ capacity:
market: spot
strategy: most-available
fallback: on-demand-after-120s
hints: true
aws:
region: eu-west-1
rootGB: 400
@ -332,124 +152,141 @@ ssh:
key: ~/.ssh/id_ed25519
user: crabbox
port: "2222"
# Ordered fallback ports tried after ssh.port; use [] to disable fallback.
fallbackPorts:
- "22"
```
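The layered resolution order for configuration (flags, then environment, then repo config, then user config, then defaults) can be sketched as a first-match lookup. This is an illustration only: the `Layer` type, the `Resolve` function, and the flat key names such as `ssh.port` are assumptions made for the example, not the CLI's real data model.

```go
package main

import "fmt"

// Layer is one config source; a missing key means "not set in this layer".
type Layer map[string]string

// Resolve returns the value from the first layer that sets key.
// Layers are passed from highest to lowest precedence.
func Resolve(key string, layers ...Layer) (string, bool) {
	for _, l := range layers {
		if v, ok := l[key]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	flags := Layer{}                                        // nothing passed on the command line
	env := Layer{"provider": "hetzner"}                     // e.g. from CRABBOX_PROVIDER
	repo := Layer{"provider": "aws", "class": "standard"}   // repo-local file
	user := Layer{"class": "fast"}                          // user config file
	defaults := Layer{"provider": "hetzner", "class": "beast", "ssh.port": "2222"}

	for _, key := range []string{"provider", "class", "ssh.port"} {
		v, _ := Resolve(key, flags, env, repo, user, defaults)
		fmt.Printf("%s=%s\n", key, v)
	}
}
```

In this example the environment wins for `provider`, the repo file wins for `class`, and `ssh.port` falls through to the default.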
Environment variables remain supported for automation and direct-provider debug:
```text
HCLOUD_TOKEN or HETZNER_TOKEN        Hetzner Cloud API token
AWS_PROFILE/AWS_*                    AWS SDK credentials for direct --provider aws fallback
CRABBOX_PROFILE                      default default
CRABBOX_PROVIDER                     default hetzner
CRABBOX_CONFIG                       optional config file override
CRABBOX_COORDINATOR                  optional broker URL override
CRABBOX_COORDINATOR_TOKEN            optional broker bearer token override
CRABBOX_DEFAULT_CLASS                default beast
CRABBOX_IDLE_TIMEOUT                 default 30m
CRABBOX_TTL                          default 90m
CRABBOX_SERVER_TYPE                  provider-specific override
CRABBOX_HETZNER_LOCATION             default fsn1
CRABBOX_HETZNER_IMAGE                default ubuntu-24.04
CRABBOX_HETZNER_SSH_KEY              default crabbox-steipete
CRABBOX_AWS_REGION                   default eu-west-1
CRABBOX_AWS_AMI                      optional Ubuntu AMI override
CRABBOX_AWS_SECURITY_GROUP_ID        optional security group override
CRABBOX_AWS_SUBNET_ID                optional subnet override
CRABBOX_AWS_INSTANCE_PROFILE         optional IAM instance profile name
CRABBOX_AWS_ROOT_GB                  default 400
CRABBOX_CAPACITY_MARKET              spot or on-demand
CRABBOX_CAPACITY_STRATEGY            most-available, price-capacity-optimized, capacity-optimized, or sequential
CRABBOX_CAPACITY_FALLBACK            default on-demand-after-120s
CRABBOX_CAPACITY_REGIONS             comma-separated AWS region candidates for Spot placement score
CRABBOX_CAPACITY_AVAILABILITY_ZONES  comma-separated AWS availability zone candidates
CRABBOX_SSH_KEY                      default ~/.ssh/id_ed25519
CRABBOX_SSH_USER                     default crabbox
CRABBOX_SSH_PORT                     default 2222
CRABBOX_WORK_ROOT                    default /work/crabbox
CRABBOX_SYNC_CHECKSUM                opt into checksum rsync
CRABBOX_SYNC_DELETE                  opt into/out of rsync --delete
CRABBOX_SYNC_GIT_SEED                opt into/out of remote Git seeding
CRABBOX_SYNC_FINGERPRINT             opt into/out of no-op sync skipping
CRABBOX_SYNC_BASE_REF                default base ref to hydrate
CRABBOX_SYNC_TIMEOUT                 default 15m
CRABBOX_SYNC_WARN_FILES/BYTES        large-sync warning thresholds
CRABBOX_SYNC_FAIL_FILES/BYTES        large-sync failure thresholds
CRABBOX_SYNC_ALLOW_LARGE             bypass large-sync failure thresholds
CRABBOX_RESULTS_JUNIT                comma-separated remote JUnit XML paths
CRABBOX_CACHE_PNPM/NPM/DOCKER/GIT    opt into/out of cache command kinds
CRABBOX_CACHE_MAX_GB                 cache policy size hint
CRABBOX_CACHE_PURGE_ON_RELEASE       purge cache on release policy hint
CRABBOX_ENV_ALLOW                    comma-separated env allowlist
CRABBOX_OWNER                        bearer-auth usage owner override
CRABBOX_ORG                          bearer-auth usage org
CRABBOX_COST_RATES_JSON              explicit hourly USD cost-rate overrides
CRABBOX_EUR_TO_USD                   Hetzner EUR-to-USD conversion, default 1.08
CRABBOX_MAX_ACTIVE_LEASES            fleet active-lease limit
CRABBOX_MAX_ACTIVE_LEASES_PER_OWNER
CRABBOX_MAX_ACTIVE_LEASES_PER_ORG
CRABBOX_MAX_MONTHLY_USD              fleet reserved monthly spend limit
CRABBOX_MAX_MONTHLY_USD_PER_OWNER
CRABBOX_MAX_MONTHLY_USD_PER_ORG
```
Optional Blacksmith Testbox wrapper:
```yaml
provider: blacksmith-testbox
blacksmith:
org: openclaw
workflow: .github/workflows/ci-check-testbox.yml
job: test
ref: main
idleTimeout: 90m
```
`crabbox list --provider blacksmith-testbox` also refreshes muted external
runner rows in the portal lease table from the current all-status Testbox list
when coordinator auth is configured. When GitHub is reachable, Crabbox also
links those rows back to the inferred Actions run and workflow, surfaces the
Actions status/conclusion, flags long-queued or long-running rows as `stuck`,
and exposes a copyable local `crabbox stop --provider blacksmith-testbox ...`
command. Clicking an external row opens a visibility-only runner detail page
with owner, workflow, timestamps, boundary notes, and the same stop command.
Those rows are visibility-only records for Blacksmith-owned Testboxes, not
Crabbox leases.
Forwarded environment is intentionally narrow and project-configured:
- `NODE_OPTIONS`
- `CI`
Do not pass secret values as command-line arguments. Keep provider tokens outside the repository.
Optional Daytona sandbox:
```yaml
provider: daytona
daytona:
snapshot: crabbox-ready
workRoot: /home/daytona/crabbox
```
Optional Islo sandbox:
```yaml
provider: islo
islo:
image: docker.io/library/ubuntu:24.04
workdir: crabbox
```
Optional static macOS or Windows target:
```yaml
provider: ssh
target: windows
windows:
mode: normal # or wsl2
static:
host: win-dev.local
user: Peter
port: "22"
workRoot: C:\crabbox
```
Optional Tailscale reachability for managed Linux leases:
```yaml
tailscale:
enabled: true
network: auto
tags:
- tag:crabbox
hostnameTemplate: crabbox-{slug}
authKeyEnv: CRABBOX_TAILSCALE_AUTH_KEY
exitNode: mac-studio.example.ts.net
exitNodeAllowLanAccess: true
```
Tailscale is a network plane, not a provider. `--tailscale` joins new managed
Linux leases to the tailnet; `--network auto|tailscale|public` chooses how SSH
and VNC tunnel commands resolve the host. Brokered mode uses Worker OAuth
secrets to mint one-off keys; direct-provider mode reads the auth key from the
configured env var. `exitNode` is opt-in per lease for routing outbound internet
through an approved tailnet exit node. See [Tailscale](docs/features/tailscale.md).
Forwarded environment is intentionally narrow: `NODE_OPTIONS` and `CI`. Do not pass secrets as command-line arguments. Full env-var reference and per-command flags are in [docs/cli.md](docs/cli.md) and [docs/commands/](docs/commands/README.md).
## OpenClaw plugin
The repo root is a native OpenClaw plugin package. Once installed, it exposes Crabbox as agent tools:
- `crabbox_run`, `crabbox_warmup`, `crabbox_status`, `crabbox_list`, `crabbox_stop`
The plugin shells out to the configured `crabbox` binary, so local config, broker login, repo claims, and sync behavior stay owned by the CLI. Set `plugins.entries.crabbox.config.binary` if `crabbox` is not on `PATH`.
Durable run inspection is intentionally CLI/skill-led instead of additional plugin tools: use `crabbox history`, `crabbox events --after --limit`, `crabbox attach`, `crabbox logs`, `crabbox results`, and `crabbox usage` from a shell-capable agent.
## Development
Build from source:
```sh
# Go CLI
go build -o bin/crabbox ./cmd/crabbox
```
Run the local gate:
```sh
gofmt -w $(git ls-files '*.go')
go vet ./...
go test -race ./...
scripts/check-go-coverage.sh 85.0
go build -trimpath -o bin/crabbox ./cmd/crabbox
goreleaser release --snapshot --clean --skip=publish
# Cloudflare Worker
npm ci --prefix worker
npm run format:check --prefix worker
npm run lint --prefix worker
npm run check --prefix worker
npm test --prefix worker
npm run build --prefix worker
# Docs
npm run docs:check
# Optional live smoke, when broker/provider credentials are available
CRABBOX_LIVE=1 CRABBOX_LIVE_REPO=/path/to/openclaw scripts/live-smoke.sh
# Add Blacksmith only for repos with a Testbox workflow.
CRABBOX_LIVE=1 CRABBOX_LIVE_PROVIDERS=blacksmith-testbox scripts/live-smoke.sh
```
CI runs the same checks on pushes and pull requests.
CI runs the full gate (gofmt, vet, race tests, coverage threshold, docs link/build check, GoReleaser snapshot, Worker lint/typecheck/tests/build) on every push and PR. Tagged pushes matching `v*` publish Go archives via GoReleaser and bump the Homebrew formula at [openclaw/homebrew-tap](https://github.com/openclaw/homebrew-tap).
## Releases
Tagged pushes matching `v*` publish Go CLI archives through GoReleaser. Manual reruns can use the `release` workflow with a tag input.
GoReleaser also updates the Homebrew formula in `https://github.com/openclaw/homebrew-tap`, published to users as:
```sh
brew install openclaw/tap/crabbox
```
The release workflow needs `HOMEBREW_TAP_GITHUB_TOKEN` with write access to that tap repository.
Worker deployment, required secrets, and DNS routing live in [docs/infrastructure.md](docs/infrastructure.md).
## Docs
- [docs/architecture.md](docs/architecture.md)
- [docs/orchestrator.md](docs/orchestrator.md)
- [docs/cli.md](docs/cli.md)
- [docs/commands/README.md](docs/commands/README.md)
- [docs/infrastructure.md](docs/infrastructure.md)
- [docs/source-map.md](docs/source-map.md)
- [docs/mvp-plan.md](docs/mvp-plan.md)
- [docs/security.md](docs/security.md)
- [CHANGELOG.md](CHANGELOG.md)
- **Get the model:** [How Crabbox Works](docs/how-it-works.md), [Architecture](docs/architecture.md), [Orchestrator](docs/orchestrator.md)
- **Use the CLI:** [CLI](docs/cli.md), [Commands](docs/commands/README.md), [Features](docs/features/README.md)
- **Interactive QA:** [Interactive Desktop and VNC](docs/features/interactive-desktop-vnc.md)
- **Operate it:** [Operations](docs/operations.md), [Observability](docs/observability.md), [Troubleshooting](docs/troubleshooting.md)
- **Set it up or audit it:** [Infrastructure](docs/infrastructure.md), [Security](docs/security.md), [Source Map](docs/source-map.md), [MVP Plan](docs/mvp-plan.md)
- **Changes:** [CHANGELOG.md](CHANGELOG.md)
The GitHub Pages site at <https://openclaw.github.io/crabbox/> is generated from the `docs/` Markdown:
```sh
npm run docs:check
open dist/docs-site/index.html
```
## License
Crabbox is released under the MIT License. See [LICENSE](LICENSE).
MIT — see [LICENSE](LICENSE).


@ -6,6 +6,7 @@ import (
"os"
"github.com/openclaw/crabbox/internal/cli"
_ "github.com/openclaw/crabbox/internal/providers/all"
)
func main() {


@ -1,23 +1,25 @@
# Crabbox Docs
# 🦀 Crabbox Docs
**Warm a box, sync the diff, run the suite.**
## What Crabbox is
Crabbox is a shared remote testbox system for OpenClaw maintainers and AI agents. The goal is to keep the local developer story unchanged - edit, save, run - while moving compute and tests onto owned cloud capacity.
A `crabbox run` command leases a Linux machine, syncs your tracked and nonignored local files, executes the command remotely, streams stdout and stderr back, and releases the machine. Behind the scenes a small Cloudflare-hosted broker owns provider credentials, lease state, cleanup, usage, and cost guardrails so individual machines and CLIs never need to.
A `crabbox run` command leases a brokered cloud machine or reuses a static SSH host, syncs your tracked and nonignored local files, executes the command remotely, streams stdout and stderr back, and releases or unclaims the target. Behind the scenes a small Cloudflare-hosted broker owns cloud provider credentials, lease state, cleanup, usage, and cost guardrails so individual machines and CLIs never need to.
## How it fits together
```text
your laptop Cloudflare Worker cloud provider
------------- ------------------ --------------
crabbox CLI -- HTTPS --> Fleet Durable Object --> Hetzner / AWS Spot
crabbox CLI -- HTTPS --> Fleet Durable Object --> Hetzner / AWS EC2 / Azure
| lease + cost state |
| |
+------------ SSH + rsync to leased runner <--------------+
```
The CLI is a Go binary. The broker is a Cloudflare Worker plus a single Durable Object. Runners are vanilla Ubuntu boxes prepared by cloud-init with SSH, Git, rsync, curl, jq, and `/work/crabbox`. Project runtimes come from Actions hydration or repo-owned setup. Runners hold no broker credentials - they are leaf nodes.
The CLI is a Go binary. The broker is a Cloudflare Worker plus a single Durable Object. Brokered Linux runners are vanilla Ubuntu boxes prepared by cloud-init with SSH, Git, rsync, curl, jq, and `/work/crabbox`; AWS can also broker managed Windows/WSL2 and EC2 Mac desktop targets, while Azure can broker native Windows SSH/sync/run targets. Static hosts are existing SSH machines selected with `provider: ssh`. Project runtimes come from Actions hydration or repo-owned setup. Runners hold no broker credentials - they are leaf nodes.
## A run, end to end
@ -25,15 +27,23 @@ The CLI is a Go binary. The broker is a Cloudflare Worker plus a single Durable
2. CLI mints a per-lease SSH key and slug, then calls `POST /v1/leases` on the broker.
3. Worker checks active-lease and monthly spend caps, reserves worst-case TTL cost, provisions a server, returns host / port / user / workdir / expiry / slug.
4. CLI waits for `crabbox-ready`, seeds remote Git when possible, rsyncs the Git file-list manifest, runs sync guardrails and sanity checks, hydrates the configured base ref.
5. CLI runs the command over SSH, streams output, sends heartbeats/touches.
5. CLI runs the command over SSH, streams output, records run events, sends heartbeats/touches.
6. CLI releases the lease unless `--keep` is set; kept leases still auto-release after idle timeout, and the broker frees reserved cost when the lease closes.
See [How Crabbox Works](how-it-works.md) for the full picture, including warm-machine reuse and the brokered vs direct provider paths. See [Source Map](source-map.md) when you need to trace a documented behavior back to code.
## Install
```sh
brew install openclaw/tap/crabbox
```
Verify with `crabbox --version`.
## Quick start
```sh
# log in once per machine - stores a bearer token in the OS keychain
# log in once per machine - stores a broker token in user config
crabbox login
# one-shot run on a fresh leased box
@ -60,12 +70,16 @@ The repository root is also a native OpenClaw plugin package. Once installed in
The plugin shells out to the configured `crabbox` binary with argv arrays, so local Crabbox config, broker login, repo claims, and sync behavior stay owned by the CLI. Configure `plugins.entries.crabbox.config.binary` if the binary is not on `PATH`.
Run history and inspection are intentionally handled by the Crabbox CLI and repo skill, not extra plugin tools. Use `crabbox history`, `crabbox events --after --limit`, `crabbox attach`, `crabbox logs`, `crabbox results`, and `crabbox usage` from a shell-capable agent.
## Where to read next
Pick whichever matches your intent:
- **Get the mental model:** [How Crabbox Works](how-it-works.md), [Architecture](architecture.md), [Orchestrator](orchestrator.md).
- **Use the CLI:** [CLI](cli.md), [Commands](commands/README.md), [Features](features/README.md), [Actions hydration](features/actions-hydration.md).
- **Start here:** [Getting started](getting-started.md), [How Crabbox Works](how-it-works.md), [Concepts and glossary](concepts.md).
- **Get the mental model:** [Architecture](architecture.md), [Orchestrator](orchestrator.md).
- **Use the CLI:** [CLI](cli.md), [Commands](commands/README.md), [Features](features/README.md), [Configuration](features/configuration.md), [Actions hydration](features/actions-hydration.md), [Browser portal](features/portal.md), [Telemetry](features/telemetry.md).
- **Pick or add a target:** [Provider reference](providers/README.md), [Providers feature overview](features/providers.md), [Provider authoring](features/provider-authoring.md), [Provider backends](provider-backends.md), [AWS](providers/aws.md), [Hetzner](providers/hetzner.md), [Static SSH](providers/ssh.md), [Blacksmith Testbox](providers/blacksmith-testbox.md), [Daytona](providers/daytona.md), [Islo](providers/islo.md), [Interactive desktop and VNC](features/interactive-desktop-vnc.md).
- **Operate it:** [Operations](operations.md), [Observability](observability.md), [Troubleshooting](troubleshooting.md), [Performance](performance.md).
- **Set it up or audit it:** [Infrastructure](infrastructure.md), [Security](security.md), [Source Map](source-map.md), [MVP Plan](mvp-plan.md).
@ -76,6 +90,6 @@ Markdown in this directory is the user-facing documentation source. Implementati
Build the docs site locally:
```sh
node scripts/build-docs-site.mjs
npm run docs:check
open dist/docs-site/index.html
```


@ -6,7 +6,7 @@ Crabbox has three main parts:
- CLI: local Go binary used by maintainers and agents.
- Coordinator: Cloudflare Worker plus Durable Object state.
- Workers: Hetzner or SSH-accessible machines that run commands.
- Workers: managed cloud or SSH-accessible machines that run commands.
The coordinator leases machines. The CLI executes work. Machines do not need to call back to the coordinator in the MVP.
@ -14,12 +14,12 @@ The coordinator leases machines. The CLI executes work. Machines do not need to
developer laptop
crabbox CLI
|
| HTTPS JSON API, Cloudflare Access
| HTTPS JSON API, Crabbox auth
v
Cloudflare Worker
Durable Object lease state
|
| Hetzner API or AWS EC2 Spot API
| Hetzner API or AWS EC2 API
v
cloud machines
@ -32,17 +32,17 @@ leased machine
## Lease Flow
1. CLI loads config and authenticates to Cloudflare Access.
1. CLI loads config and authenticates with a signed GitHub login token or shared operator token.
2. CLI creates a per-lease SSH key.
3. CLI sends `POST /v1/leases` with lease ID, slug, profile, TTL, idle timeout, desired machine class, and SSH public key.
4. Coordinator validates identity and policy.
5. Durable Object chooses a provider from config and creates a Hetzner server or AWS EC2 Spot instance.
5. Durable Object chooses a provider from config and creates a Hetzner server or AWS EC2 instance.
6. Coordinator returns lease ID, slug, machine address, SSH user, workdir, and expiry.
7. CLI waits for `crabbox-ready`.
8. CLI seeds remote Git when possible, compares sync fingerprints, and syncs changed files with `rsync --delete`.
9. CLI runs sync sanity and configured base-ref hydration.
10. CLI runs the command over SSH and streams stdout/stderr.
11. CLI heartbeats while the command runs; heartbeats touch `lastTouchedAt` and recompute idle expiry up to the TTL cap.
11. CLI heartbeats while the command runs; heartbeats touch `lastTouchedAt`, recompute idle expiry up to the TTL cap, and attach a best-effort latest Linux telemetry snapshot when SSH is reachable.
12. CLI releases the lease when done.
13. Durable Object alarm cleans up stale leases and expired machines.
@ -70,7 +70,9 @@ POST /v1/admin/leases/{id-or-slug}/release
POST /v1/admin/leases/{id-or-slug}/delete
```
Admin endpoints currently use the same Worker bearer-token gate. Split user/admin tokens and GitHub team gating are future hardening.
Admin endpoints and `GET /v1/pool` require the separate admin token. GitHub browser-login tokens are user tokens for normal lease operations and are minted only after allowed GitHub org membership is verified. User-token list, exact-ID lookup, slug lookup, heartbeat, release, run history, logs, and usage are scoped to the token owner/org.
Heartbeat bodies may include a `telemetry` object. The coordinator stores the latest sanitized snapshot on the lease record and retains a bounded `telemetryHistory` ring of the latest 60 samples for portal trend charts. Current CLI snapshots include Linux load average, memory use, root-disk use, uptime, source, and capture timestamp. Runs also accept `POST /v1/runs/{run-id}/telemetry` samples while they are active, and completed run records keep bounded start/mid/end Linux telemetry so history can show resource deltas and short trends without keeping an unbounded time series.
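The bounded `telemetryHistory` ring described above can be sketched as a slice that drops its oldest entries past 60 samples. This is a hedged illustration in Go: the `Sample` fields and the `History` type are assumptions, and the real coordinator stores sanitized JSON snapshots in Durable Object state.

```go
package main

import "fmt"

// maxSamples matches the documented bound of the latest 60 samples.
const maxSamples = 60

// Sample is a hypothetical subset of a telemetry snapshot.
type Sample struct {
	Load1      float64
	MemUsedPct float64
}

// History keeps the newest maxSamples samples, oldest first.
type History struct {
	samples []Sample
}

// Add appends a sample and trims the history back to maxSamples.
func (h *History) Add(s Sample) {
	h.samples = append(h.samples, s)
	if len(h.samples) > maxSamples {
		// Copy the tail so the backing array does not grow without bound.
		h.samples = append([]Sample(nil), h.samples[len(h.samples)-maxSamples:]...)
	}
}

// Latest returns the most recent sample, if any.
func (h *History) Latest() (Sample, bool) {
	if len(h.samples) == 0 {
		return Sample{}, false
	}
	return h.samples[len(h.samples)-1], true
}

func main() {
	var h History
	for i := 0; i < 100; i++ {
		h.Add(Sample{Load1: float64(i)})
	}
	latest, _ := h.Latest()
	fmt.Println(len(h.samples), h.samples[0].Load1, latest.Load1) // 60 40 99
}
```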
## Durable Object State
@ -79,9 +81,9 @@ Use one fleet Durable Object for MVP. It owns all atomic scheduling decisions.
Core stored records:
```sql
leases(id, slug, provider, cloud_id, region, owner, org, profile, class, server_type, server_id, server_name, provider_key, host, ssh_user, ssh_port, work_root, keep, ttl_seconds, idle_timeout_seconds, estimated_hourly_usd, max_estimated_usd, state, created_at, updated_at, last_touched_at, expires_at, released_at, ended_at)
runs(id, lease_id, slug, owner, org, provider, class, server_type, command_json, state, exit_code, sync_ms, command_ms, duration_ms, log_bytes, log_truncated, results_json, started_at, ended_at)
runlog(run_id, bounded_stdout_stderr_tail)
leases(id, slug, provider, cloud_id, region, owner, org, profile, class, server_type, server_id, server_name, provider_key, host, ssh_user, ssh_port, work_root, keep, ttl_seconds, idle_timeout_seconds, estimated_hourly_usd, max_estimated_usd, state, telemetry_json, telemetry_history_json, created_at, updated_at, last_touched_at, expires_at, released_at, ended_at)
runs(id, lease_id, slug, owner, org, provider, class, server_type, command_json, state, exit_code, sync_ms, command_ms, duration_ms, log_bytes, log_truncated, results_json, telemetry_json, started_at, ended_at)
runlog(run_id, bounded_stdout_stderr_capture)
```
State transitions:
@ -101,7 +103,8 @@ Owned backends:
- `hetzner-static`: pre-created warm machines.
- `hetzner-ephemeral`: created per lease or overflow.
- `aws-spot`: one-time EC2 Spot instances for burst capacity.
- `aws`: one-time EC2 instances for burst capacity, managed Windows/WSL2, and EC2 Mac.
- `azure`: one-time Azure VMs for Linux and native Windows SSH/sync/run.
- `ssh-static`: manually managed machines reachable by SSH.
Brokered backends, later:
@ -109,7 +112,7 @@ Brokered backends, later:
- `github-actions`: register or dispatch real Actions-backed runner work when workflow parity is required.
- `external-runner`: adapter boundary for other hosted runner systems if needed.
The MVP implements `hetzner-ephemeral` and `aws-spot`, and leaves interfaces ready for `hetzner-static`.
The current broker implements `hetzner-ephemeral`, `aws`, and `azure`, and leaves interfaces ready for `hetzner-static`.
## Machine Bootstrap


@ -25,33 +25,65 @@ Primary output goes to stdout. Progress, diagnostics, and errors go to stderr. J
```text
crabbox doctor
crabbox login --url <url> --token-stdin [--provider hetzner|aws]
crabbox login [--url <url>] [--provider hetzner|aws|azure] [--no-browser]
crabbox login --url <url> --token-stdin [--provider hetzner|aws|azure]
crabbox logout
crabbox whoami [--json]
crabbox init [--force]
crabbox config show [--json]
crabbox config path
crabbox config set-broker --url <url> --token-stdin [--provider hetzner|aws]
crabbox warmup [--provider hetzner|aws] [--profile <name>] [--idle-timeout <duration>]
crabbox run [--id <lease-id-or-slug>] [--shell] [--checksum] [--debug] [--force-sync-large] -- <command...>
crabbox config set-broker --url <url> --token-stdin [--provider hetzner|aws|azure]
crabbox warmup [--provider hetzner|aws|azure|ssh|blacksmith-testbox|daytona|islo] [--target linux|macos|windows] [--desktop] [--browser] [--code] [--tailscale] [--network auto|tailscale|public] [--profile <name>] [--idle-timeout <duration>] [--timing-json]
crabbox run [--id <lease-id-or-slug>] [--provider hetzner|aws|azure|ssh|blacksmith-testbox|daytona|islo] [--target linux|macos|windows] [--windows-mode normal|wsl2] [--desktop] [--browser] [--code] [--tailscale] [--network auto|tailscale|public] [--shell] [--checksum] [--debug] [--force-sync-large] [--timing-json] [--blacksmith-workflow <workflow>] -- <command...>
crabbox desktop launch --id <lease-id-or-slug> [--browser] [--url <url>] [--egress <profile>] [--webvnc] [--open] [-- <command...>]
crabbox desktop doctor --id <lease-id-or-slug> [--network auto|tailscale|public]
crabbox desktop click --id <lease-id-or-slug> --x <n> --y <n> [--network auto|tailscale|public]
crabbox desktop paste --id <lease-id-or-slug> --text <text> [--network auto|tailscale|public]
crabbox desktop paste --id <lease-id-or-slug> [--network auto|tailscale|public] < input.txt
crabbox desktop type --id <lease-id-or-slug> --text <text> [--network auto|tailscale|public]
crabbox desktop key --id <lease-id-or-slug> <keys> [--network auto|tailscale|public]
crabbox code --id <lease-id-or-slug> [--open]
crabbox egress start --id <lease-id-or-slug> [--profile <name>|--allow <hosts>] [--listen <addr>] [--coordinator <url>] [--daemon]
crabbox egress host --id <lease-id-or-slug> [--profile <name>|--allow <hosts>]
crabbox egress client --id <lease-id-or-slug> [--listen <addr>] [--ticket <ticket>] [--session <id>]
crabbox egress status --id <lease-id-or-slug>
crabbox egress stop --id <lease-id-or-slug>
crabbox media preview --input <video> --output <preview.gif> [--trimmed-video-output <change.mp4>]
crabbox artifacts collect --id <lease-id-or-slug> [--output <dir>] [--run <run-id>] [--all] [--screenshot] [--video] [--gif] [--doctor] [--webvnc-status] [--metadata] [--duration <duration>] [--fps <n>] [--gif-width <px>] [--network auto|tailscale|public] [--json]
crabbox artifacts video --id <lease-id-or-slug> [--output <path>] [--duration <duration>] [--fps <n>]
crabbox artifacts gif --input <video> --output <preview.gif> [--trimmed-video-output <change.mp4>]
crabbox artifacts template openclaw|mantis [--summary <text>|--summary-file <path>] [--before <path>] [--after <path>] [--output <path>]
crabbox artifacts publish --dir <dir> [--pr <n>] [--repo owner/name] [--storage auto|broker|s3|cloudflare|r2|local] [--bucket <name>] [--prefix <path>] [--base-url <url>] [--region <region>] [--profile <profile>] [--endpoint-url <url>] [--acl <acl>] [--presign] [--expires <duration>] [--dry-run] [--no-comment]
crabbox screenshot --id <lease-id-or-slug> [--output <path>]
crabbox sync-plan [--limit <n>]
crabbox history [--lease <lease-id>] [--owner <email>] [--org <name>] [--limit <n>] [--json]
crabbox logs <run-id> [--json]
crabbox events <run-id> [--after <seq>] [--limit <n>] [--json]
crabbox attach <run-id> [--after <seq>] [--poll <duration>]
crabbox results <run-id> [--json]
crabbox cache stats --id <lease-id-or-slug> [--json]
crabbox cache purge --id <lease-id-or-slug> --kind pnpm|npm|docker|git|all --force
crabbox cache warm --id <lease-id-or-slug> -- <command...>
crabbox actions hydrate --id <lease-id-or-slug> [--workflow <file|name|id>] [--wait-timeout <duration>]
crabbox actions hydrate --id <lease-id-or-slug> [--workflow <file|name|id>] [--wait-timeout <duration>] [--timing-json]
crabbox actions register --id <lease-id-or-slug> [--repo owner/name]
crabbox actions dispatch [--workflow <file|name|id>] [-f key=value]
crabbox status --id <lease-id-or-slug> [--wait]
crabbox status --id <lease-id-or-slug> [--network auto|tailscale|public] [--wait]
crabbox list [--json]
crabbox share --id <lease-id-or-slug> [--user <email>] [--org] [--role use|manage] [--list] [--json]
crabbox unshare --id <lease-id-or-slug> [--user <email>] [--org] [--all] [--json]
crabbox usage [--scope user|org|all] [--user <email>] [--org <name>] [--month YYYY-MM] [--json]
crabbox admin leases [--state active|released|expired|failed] [--owner <email>] [--org <name>] [--json]
crabbox admin release <lease-id-or-slug> [--delete]
crabbox admin delete <lease-id-or-slug> --force
crabbox ssh --id <lease-id-or-slug>
crabbox inspect --id <lease-id-or-slug> [--json]
crabbox ssh --id <lease-id-or-slug> [--network auto|tailscale|public]
crabbox vnc --id <lease-id-or-slug> [--network auto|tailscale|public] [--open]
crabbox webvnc --id <lease-id-or-slug> [--network auto|tailscale|public] [--open]
crabbox webvnc daemon start --id <lease-id-or-slug> [--network auto|tailscale|public] [--open]
crabbox webvnc daemon status --id <lease-id-or-slug>
crabbox webvnc daemon stop --id <lease-id-or-slug>
crabbox webvnc status --id <lease-id-or-slug> [--network auto|tailscale|public]
crabbox webvnc reset --id <lease-id-or-slug> [--network auto|tailscale|public] [--open]
crabbox inspect --id <lease-id-or-slug> [--network auto|tailscale|public] [--json]
crabbox stop <lease-id-or-slug>
crabbox cleanup [--dry-run]
```
@ -64,7 +96,7 @@ One-shot run:
crabbox run --profile project-check -- pnpm check:changed
```
AWS EC2 Spot run:
AWS EC2 run:
```sh
crabbox run --class beast -- pnpm check:changed
@ -74,7 +106,28 @@ Warm a box, then reuse it:
```sh
crabbox warmup --profile project-check
crabbox warmup --tailscale
crabbox warmup --desktop --browser
crabbox run --id blue-lobster -- pnpm test:changed
crabbox vnc --id blue-lobster --open
crabbox webvnc --id blue-lobster --open
crabbox webvnc status --id blue-lobster
crabbox webvnc daemon start --id blue-lobster --open
crabbox code --id blue-lobster --open
crabbox desktop launch --id blue-lobster --browser --url https://example.com --webvnc --open
crabbox desktop doctor --id blue-lobster
crabbox desktop paste --id blue-lobster --text "peter@example.com"
crabbox desktop key --id blue-lobster ctrl+l
crabbox egress start --id blue-lobster --profile discord --daemon
crabbox desktop launch --id blue-lobster --browser --url https://discord.com/login --egress discord --webvnc --open
crabbox egress status --id blue-lobster
crabbox egress stop --id blue-lobster
crabbox share --id blue-lobster --user friend@example.com
crabbox share --id blue-lobster --org
crabbox screenshot --id blue-lobster --output desktop.png
crabbox media preview --input desktop.mp4 --output desktop-preview.gif --trimmed-video-output desktop-change.mp4
crabbox artifacts collect --id blue-lobster --all --output artifacts/blue-lobster
crabbox artifacts publish --dir artifacts/blue-lobster --pr 123
crabbox run --id blue-lobster --shell 'pnpm install --frozen-lockfile && pnpm test'
crabbox stop blue-lobster
```
@ -88,6 +141,49 @@ crabbox run --id blue-lobster -- pnpm test:changed
crabbox stop blue-lobster
```
Use Blacksmith Testboxes through the same Crabbox surface:
```sh
blacksmith auth login
crabbox warmup --provider blacksmith-testbox --blacksmith-workflow .github/workflows/ci-check-testbox.yml --blacksmith-job test
crabbox run --provider blacksmith-testbox --id blue-lobster -- pnpm test:changed
crabbox run --provider blacksmith-testbox --blacksmith-workflow .github/workflows/ci-check-testbox.yml --blacksmith-job test -- pnpm test
crabbox stop --provider blacksmith-testbox blue-lobster
```
Use an existing macOS or Windows SSH host:
```sh
crabbox run --provider ssh --target macos --static-host mac-studio.local -- xcodebuild test
crabbox run --provider ssh --target windows --windows-mode normal --static-host win-dev.local -- dotnet test
crabbox run --provider ssh --target windows --windows-mode wsl2 --static-host win-dev.local -- pnpm test
```
Create managed AWS desktop boxes:
```sh
crabbox warmup --provider aws --target windows --desktop
CRABBOX_AWS_MAC_HOST_ID=h-... crabbox warmup --provider aws --target macos --desktop --market on-demand
crabbox vnc --id blue-lobster
crabbox screenshot --id blue-lobster --output desktop.png
```
Managed provider targets are intentionally narrow:
- Hetzner managed provisioning supports Linux only.
- AWS supports Linux, native Windows (`--target windows --windows-mode normal`),
Windows WSL2 (`--target windows --windows-mode wsl2`), and EC2 Mac
(`--target macos`) when the Mac Dedicated Host is provided.
- Existing macOS and Windows machines belong on `provider=ssh`.
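The managed-target rules in the list above amount to a small support matrix. The Go sketch below encodes only the Hetzner and AWS rows from that list; the `target` struct, the `managed` map, and `supports` are hypothetical names, and other providers are deliberately left out.

```go
package main

import "fmt"

// target is a hypothetical pair of --target and --windows-mode values.
type target struct {
	OS          string // linux, macos, or windows
	WindowsMode string // normal or wsl2; empty for non-Windows targets
}

// managed lists the combinations named in the list above.
var managed = map[string][]target{
	"hetzner": {{OS: "linux"}},
	"aws": {
		{OS: "linux"},
		{OS: "windows", WindowsMode: "normal"},
		{OS: "windows", WindowsMode: "wsl2"},
		{OS: "macos"}, // only when a Mac Dedicated Host is provided
	},
}

// supports reports whether a managed provider can provision the target.
func supports(provider string, t target) bool {
	for _, candidate := range managed[provider] {
		if candidate == t {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(supports("aws", target{OS: "windows", WindowsMode: "wsl2"}))
	fmt.Println(supports("hetzner", target{OS: "windows", WindowsMode: "normal"}))
}
```

A combination that is missing from the matrix, such as Windows on Hetzner, is a case for `provider=ssh` with an existing machine.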
Use Tailscale as an optional network plane:
```sh
crabbox warmup --tailscale
crabbox ssh --id blue-lobster --network tailscale
crabbox vnc --id blue-lobster --network tailscale --open
```
Inspect pool:
```sh
@ -134,6 +230,8 @@ Inspect recorded runs:
crabbox run --id blue-lobster --junit junit.xml -- go test ./...
crabbox history --lease cbx_abcdef123456
crabbox logs run_123
crabbox events run_123
crabbox attach run_123
crabbox results run_123
```
@ -153,6 +251,13 @@ crabbox admin release blue-lobster
crabbox admin delete cbx_abcdef123456 --force
```
Trusted operator image controls:
```sh
crabbox image create --id cbx_abcdef123456 --name openclaw-crabbox-20260501-1246 --wait
crabbox image promote ami-1234567890abcdef0
```
## `run`
`crabbox run` is the main command.
@ -160,16 +265,17 @@ crabbox admin delete cbx_abcdef123456 --force
Behavior:
1. Load config.
2. Acquire a lease unless `--id` is provided.
3. Verify SSH readiness.
4. Use the GitHub Actions workspace when the lease has a hydration marker.
5. Sync current repo, unless a matching sync fingerprint lets Crabbox skip rsync.
6. Seed remote Git from the configured origin/base ref before first sync when possible.
7. Run command over SSH.
8. Stream remote output and retain the latest log tail in coordinator history.
9. Heartbeat coordinator leases in the background.
10. Release lease unless `--keep` is set.
11. Exit with the remote command exit code.
2. Create a durable `run_...` handle when a coordinator is configured.
3. Acquire a lease unless `--id` is provided.
4. Verify SSH readiness.
5. Use the GitHub Actions workspace when the lease has a hydration marker.
6. Sync current repo, unless a matching sync fingerprint lets Crabbox skip rsync.
7. Seed remote Git from the configured origin/base ref before first sync when possible.
8. Run command over SSH.
9. Stream remote output, append run events, and retain bounded command output in coordinator history.
10. Heartbeat coordinator leases in the background.
11. Release lease unless `--keep` is set.
12. Exit with the remote command exit code.
Fresh non-kept leases retry once with a new machine when bootstrap never reaches SSH readiness. Existing leases and `--keep` runs are not retried automatically, so commands are not duplicated on a machine the user asked to keep. Runner bootstrap retries apt and installs only Crabbox plumbing before `crabbox-ready` is allowed to pass.
@ -177,12 +283,29 @@ Flags:
```text
--id <lease-id-or-slug> reuse an existing lease
--provider <name> hetzner or aws
--provider <name> hetzner, aws, azure, ssh, blacksmith-testbox, daytona, or islo
--target <name> linux, macos, or windows
--windows-mode <mode> normal or wsl2
--static-host <host> existing SSH host for provider=ssh
--static-user <user> static SSH user override
--static-port <port> static SSH port override
--static-work-root <path> static target work root
--profile <name> profile to run on
--class <name> machine class override
--type <name> provider server or instance type override
--market spot|on-demand AWS capacity market override
--ttl <duration> maximum lease lifetime, default 90m
--idle-timeout <duration> idle expiry, default 30m
--desktop provision or require visible desktop capability
--browser provision or require browser capability
--code provision or require web code capability
--tailscale join new managed Linux leases to the configured tailnet
--tailscale-tags <csv> Tailscale tags for new managed leases
--tailscale-hostname-template <template>
--tailscale-auth-key-env <env-var>
--tailscale-exit-node <name-or-100.x>
--tailscale-exit-node-allow-lan-access
--network auto|tailscale|public
--no-sync run without syncing
--sync-only sync and exit
--force-sync-large allow a sync candidate above configured fail thresholds
@ -192,12 +315,30 @@ Flags:
--debug print sync timing and itemized rsync output
--junit <paths> comma-separated remote JUnit XML paths to attach to run history
--reclaim claim an existing lease for the current repo
--timing-json print a final JSON timing record
--blacksmith-org <org> Blacksmith organization
--blacksmith-workflow <file|name|id> Blacksmith Testbox workflow
--blacksmith-job <job> Blacksmith Testbox workflow job
--blacksmith-ref <ref> Blacksmith Testbox git ref
```
Secrets must not be accepted as flag values. Env forwarding is name-based only.
Crabbox stores local lease claims under its state directory. `warmup` and first reuse claim the lease for the current repo; later `run`, `ssh`, `cache`, and `actions hydrate/register` refuse a conflicting repo claim unless `--reclaim` is set.
With `provider: blacksmith-testbox`, Crabbox delegates machine setup, sync, and command transport to the Blacksmith CLI. `--sync-only` is unsupported, sync timing is reported as `sync=delegated`, and Blacksmith auth is handled by `blacksmith auth login`, not `crabbox login`.
With `provider: daytona`, Crabbox creates Daytona sandboxes from
`daytona.snapshot`, uploads workspaces through Daytona toolbox file APIs, and
runs commands through Daytona toolbox process APIs. `crabbox ssh` mints
short-lived Daytona SSH tokens and redacts those tokens from output. Daytona
auth can come from `DAYTONA_API_KEY` / `DAYTONA_JWT_TOKEN` env or an
authenticated Daytona CLI profile created by `daytona login --api-key`. With
`provider: islo`, Crabbox delegates sandbox setup and command execution to the
Islo Go SDK, uploads the Crabbox sync manifest as a gzipped archive into the
Islo workdir, and rejects only the SSH/rsync-specific `--sync-only` and
`--checksum` modes.
## Exit Codes
```text
@ -229,9 +370,12 @@ User config:
```yaml
broker:
url: https://crabbox-coordinator.steipete.workers.dev
url: https://crabbox.openclaw.ai
provider: aws
token: ...
access:
clientId: ...
clientSecret: ...
profile: project-check
class: beast
lease:
@ -241,6 +385,7 @@ capacity:
market: spot
strategy: most-available
fallback: on-demand-after-120s
hints: true
aws:
region: eu-west-1
rootGB: 400
@ -248,13 +393,67 @@ ssh:
key: ~/.ssh/id_ed25519
user: crabbox
port: "2222"
# Ordered fallback ports tried after ssh.port; use [] to disable fallback.
fallbackPorts:
- "22"
```
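A rough sketch of how `ssh.port` and `ssh.fallbackPorts` could combine into an ordered candidate list, and how a `CRABBOX_SSH_FALLBACK_PORTS` value might be parsed. The function names are hypothetical, and skipping duplicate ports is an assumption rather than documented behavior.

```python
def parse_fallback_ports(value: str) -> list[str]:
    """Parse a comma-separated port list; the literal "none" disables fallback."""
    value = value.strip()
    if value.lower() == "none":
        return []
    return [p.strip() for p in value.split(",") if p.strip()]


def ssh_port_candidates(port: str, fallback_ports: list[str]) -> list[str]:
    """Return the primary port first, then fallbacks in their configured order."""
    candidates = [port]
    for p in fallback_ports:
        if p not in candidates:  # assumption: duplicates are skipped
            candidates.append(p)
    return candidates
```

With the config above, the candidates would be `2222` followed by `22`; an empty `fallbackPorts: []` leaves only the primary port.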
Set broker auth without putting the token in shell history:
Static macOS target:
```yaml
provider: ssh
target: macos
static:
host: mac-studio.local
user: steipete
port: "22"
workRoot: /Users/steipete/crabbox
```
Static Windows target:
```yaml
provider: ssh
target: windows
windows:
mode: normal # normal or wsl2
static:
host: win-dev.local
user: Peter
port: "22"
workRoot: C:\crabbox
```
AWS EC2 Mac target:
```yaml
provider: aws
target: macos
aws:
macHostId: h-0123456789abcdef0
capacity:
market: on-demand
```
`windows.mode: normal` runs native PowerShell over OpenSSH and syncs with a tar
archive. `windows.mode: wsl2` runs commands through `wsl.exe --exec bash -lc`
and uses rsync inside WSL2, so `static.workRoot` should be a WSL path.
`crabbox warmup --market spot|on-demand` and `crabbox run --market spot|on-demand`
override `capacity.market` for a single AWS lease. Use this for temporary quota
or capacity shifts without rewriting repo config.
Open GitHub browser login:
```sh
crabbox login
```
Trusted operators can still set shared-token broker auth without putting the token in shell history:
```sh
printf '%s' "$TOKEN" | crabbox login \
--url https://crabbox-coordinator.steipete.workers.dev \
--url https://crabbox.openclaw.ai \
--provider aws \
--token-stdin
```
@ -269,6 +468,8 @@ class: beast
actions:
workflow: .github/workflows/crabbox.yml
ref: main
fields:
- crabbox_docker_cache=true
runnerLabels:
- crabbox
sync:
@ -304,12 +505,44 @@ cache:
purgeOnRelease: false
```
Blacksmith Testbox config:
```yaml
provider: blacksmith-testbox
blacksmith:
org: openclaw
workflow: .github/workflows/ci-check-testbox.yml
job: test
ref: main
idleTimeout: 90m
debug: false
```
## Environment Variables
```text
CRABBOX_COORDINATOR
CRABBOX_COORDINATOR_TOKEN
CRABBOX_COORDINATOR_ADMIN_TOKEN
CRABBOX_ADMIN_TOKEN alias for CRABBOX_COORDINATOR_ADMIN_TOKEN
CRABBOX_ACCESS_CLIENT_ID
CRABBOX_ACCESS_CLIENT_SECRET
CRABBOX_ACCESS_TOKEN
CRABBOX_PROVIDER
CRABBOX_TARGET
CRABBOX_TARGET_OS alias for CRABBOX_TARGET
CRABBOX_WINDOWS_MODE
CRABBOX_DESKTOP
CRABBOX_BROWSER
CRABBOX_NETWORK
CRABBOX_STATIC_ID
CRABBOX_STATIC_NAME
CRABBOX_STATIC_HOST
CRABBOX_STATIC_USER
CRABBOX_STATIC_PORT
CRABBOX_STATIC_WORK_ROOT
CRABBOX_OWNER
CRABBOX_ORG
CRABBOX_PROFILE
CRABBOX_CONFIG
CRABBOX_DEFAULT_CLASS
@ -319,7 +552,23 @@ CRABBOX_TTL
CRABBOX_SSH_KEY
CRABBOX_SSH_USER
CRABBOX_SSH_PORT
CRABBOX_SSH_FALLBACK_PORTS comma-separated fallback ports, or none
CRABBOX_WORK_ROOT
CRABBOX_AWS_REGION
CRABBOX_AWS_AMI
CRABBOX_AWS_SECURITY_GROUP_ID
CRABBOX_AWS_SUBNET_ID
CRABBOX_AWS_INSTANCE_PROFILE
CRABBOX_AWS_ROOT_GB
CRABBOX_AWS_SSH_CIDRS
CRABBOX_AWS_MAC_HOST_ID
CRABBOX_CAPACITY_MARKET
CRABBOX_CAPACITY_STRATEGY
CRABBOX_CAPACITY_FALLBACK
CRABBOX_CAPACITY_REGIONS
CRABBOX_CAPACITY_AVAILABILITY_ZONES
CRABBOX_CAPACITY_HINTS
CRABBOX_CAPACITY_LARGE_CLASSES
CRABBOX_ACTIONS_WORKFLOW
CRABBOX_ACTIONS_JOB
CRABBOX_ACTIONS_REF
@ -327,6 +576,12 @@ CRABBOX_ACTIONS_REPO
CRABBOX_ACTIONS_RUNNER_VERSION
CRABBOX_ACTIONS_RUNNER_LABELS
CRABBOX_ACTIONS_EPHEMERAL
CRABBOX_BLACKSMITH_ORG
CRABBOX_BLACKSMITH_WORKFLOW
CRABBOX_BLACKSMITH_JOB
CRABBOX_BLACKSMITH_REF
CRABBOX_BLACKSMITH_IDLE_TIMEOUT
CRABBOX_BLACKSMITH_DEBUG
CRABBOX_RESULTS_JUNIT
CRABBOX_SYNC_CHECKSUM
CRABBOX_SYNC_DELETE
@ -343,6 +598,23 @@ CRABBOX_ENV_ALLOW
CRABBOX_CACHE_PNPM/NPM/DOCKER/GIT
CRABBOX_CACHE_MAX_GB
CRABBOX_CACHE_PURGE_ON_RELEASE
CRABBOX_TAILSCALE
CRABBOX_TAILSCALE_TAGS
CRABBOX_TAILSCALE_HOSTNAME_TEMPLATE
CRABBOX_TAILSCALE_AUTH_KEY_ENV
CRABBOX_TAILSCALE_AUTH_KEY direct-provider only, via auth-key env
CRABBOX_TAILSCALE_EXIT_NODE
CRABBOX_TAILSCALE_EXIT_NODE_ALLOW_LAN_ACCESS
CRABBOX_ARTIFACTS_STORAGE default --storage for artifacts publish
CRABBOX_ARTIFACTS_BUCKET
CRABBOX_ARTIFACTS_PREFIX
CRABBOX_ARTIFACTS_BASE_URL
CRABBOX_ARTIFACTS_AWS_REGION
CRABBOX_ARTIFACTS_AWS_PROFILE
CRABBOX_ARTIFACTS_ENDPOINT_URL
CRABBOX_ARTIFACTS_S3_ACL
CRABBOX_ARTIFACTS_PRESIGN
CRABBOX_ARTIFACTS_EXPIRES
```
Provider/deploy variables live outside normal CLI operation:
@ -351,8 +623,11 @@ Provider/deploy variables live outside normal CLI operation:
CRABBOX_CLOUDFLARE_API_TOKEN
CRABBOX_CLOUDFLARE_ACCOUNT_ID
CRABBOX_CLOUDFLARE_ZONE_ID
HCLOUD_TOKEN
AWS_PROFILE/AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY
CRABBOX_CLOUDFLARE_ZONE_NAME
CRABBOX_DOMAIN
CRABBOX_FALLBACK_DOMAIN
HCLOUD_TOKEN/HETZNER_TOKEN
AWS_PROFILE/AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY/AWS_SESSION_TOKEN
GITHUB_TOKEN
```


@ -8,21 +8,34 @@ Command docs live here, one file per top-level command. Keep `docs/cli.md` as th
- [login](login.md)
- [logout](logout.md)
- [whoami](whoami.md)
- [doctor](doctor.md)
- [warmup](warmup.md)
- [run](run.md)
- [desktop](desktop.md)
- [media](media.md)
- [artifacts](artifacts.md)
- [sync-plan](sync-plan.md)
- [history](history.md)
- [logs](logs.md)
- [events](events.md)
- [attach](attach.md)
- [results](results.md)
- [cache](cache.md)
- [status](status.md)
- [list](list.md)
- [share](share.md)
- [unshare](unshare.md)
- [image](image.md)
- [usage](usage.md)
- [admin](admin.md)
- [actions](actions.md)
- [ssh](ssh.md)
- [vnc](vnc.md)
- [webvnc](webvnc.md)
- [code](code.md)
- [egress](egress.md)
- [screenshot](screenshot.md)
- [inspect](inspect.md)
- [stop](stop.md)
- [actions](actions.md)
- [cleanup](cleanup.md)
- [doctor](doctor.md)
- [config](config.md)


@ -12,7 +12,11 @@ For `actions hydrate`, Crabbox inspects the selected workflow's `workflow_dispat
Runner names and extra labels use the friendly slug when available, but workflow inputs and state-file paths keep using the canonical `cbx_...` ID.
On success, `actions hydrate` prints a concise total duration line.
Runner registration currently supports Linux targets only. Static macOS and
Windows hosts can run commands through `provider=ssh`, but `actions hydrate` and
`actions register` still install the Linux GitHub Actions runner package.
On success, `actions hydrate` prints a concise total duration line. Add `--timing-json` to emit a final JSON timing record with provider, lease ID, slug, total duration, exit code, and the GitHub Actions run URL when the workflow marker reports a run ID.
```sh
crabbox warmup --actions-runner
@ -25,9 +29,9 @@ crabbox run --id blue-lobster -- pnpm test
Subcommands:
```text
hydrate --id <lease-id-or-slug> [--repo owner/name] [--workflow <file|name|id>] [--ref <ref>] [--wait-timeout 20m] [--keep-alive-minutes 90] [--reclaim] [-f key=value]
hydrate --id <lease-id-or-slug> [--repo owner/name] [--workflow <file|name|id>] [--ref <ref>] [--wait-timeout 20m] [--keep-alive-minutes 90] [--reclaim] [--timing-json] [-f key=value] [--field key=value]
register --id <lease-id-or-slug> [--repo owner/name] [--name <runner-name>] [--labels <csv>] [--version latest] [--ephemeral=true] [--reclaim]
dispatch [--repo owner/name] [--workflow <file|name|id>] [--ref <ref>] [-f key=value]
dispatch [--repo owner/name] [--workflow <file|name|id>] [--ref <ref>] [-f key=value] [--field key=value]
```
Hydrate/register validate the local repo claim before touching the lease. Use `--reclaim` when intentionally moving a lease to the current repo.
@ -40,6 +44,8 @@ actions:
workflow: .github/workflows/crabbox.yml
job: hydrate
ref: main
fields:
- crabbox_docker_cache=true
runnerLabels:
- crabbox
runnerVersion: latest
@ -48,6 +54,7 @@ actions:
Workflow jobs should target the dynamic label printed by registration, for example `crabbox-cbx-123`, plus any static labels configured for the project.
When `actions.job` is set and the workflow declares `crabbox_job`, Crabbox sends it and verifies that the ready marker came from that job. Older workflows can omit both.
Use `actions.fields` for repository-specific workflow inputs that should be sent on every hydration. CLI `-f key=value` / `--field key=value` values override matching configured fields for that dispatch.
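The override rule can be pictured as a last-writer-wins merge over `key=value` pairs. This is a minimal sketch with a hypothetical function name; the real CLI may validate inputs differently.

```python
def merge_workflow_fields(configured: list[str], cli: list[str]) -> dict[str, str]:
    """Merge configured fields with CLI fields; CLI values win on matching keys."""
    merged: dict[str, str] = {}
    for item in configured + cli:  # CLI entries come last, so they override
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        merged[key] = value
    return merged
```

For example, a configured `crabbox_docker_cache=true` combined with `-f crabbox_docker_cache=false` would dispatch `false` for that one run while leaving the repo config untouched.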
## Hydration Flow


@ -10,9 +10,11 @@ crabbox admin release blue-lobster --delete
crabbox admin delete cbx_... --force
```
Release/delete accept a canonical `cbx_...` ID or an active slug; use the canonical ID when an admin slug lookup is ambiguous.
Release/delete accept a canonical `cbx_...` ID or an active slug; use the canonical ID when an admin slug lookup is ambiguous. Add `--json` to print the updated lease record.
Admin commands require a configured coordinator and bearer token. The current coordinator trusts the shared operator token; do not expose it to untrusted users.
Admin commands require a configured coordinator and a separate admin bearer token
stored as `broker.adminToken` or `CRABBOX_COORDINATOR_ADMIN_TOKEN`. The shared
operator token is not enough for admin routes.
## leases
@ -32,10 +34,26 @@ Flags:
Mark a lease released. Add `--delete` to delete the backing server while releasing.
Flags:
```text
--id <lease-id-or-slug>
--delete
--json
```
## delete
Delete the backing server for an active lease and mark it released. Requires `--force`.
Flags:
```text
--id <lease-id-or-slug>
--force
--json
```
Related docs:
- [Operations](../operations.md)

235
docs/commands/artifacts.md Normal file

@ -0,0 +1,235 @@
# artifacts
`crabbox artifacts` collects desktop QA evidence into a durable bundle, creates
trimmed review media, and publishes inline-ready assets for pull requests.
Use it when a desktop/WebVNC issue or UI fix needs more than a one-off
screenshot: MP4 recording, trimmed GIF, logs, doctor output, WebVNC status, and
metadata in one directory.
## Collect
```sh
crabbox artifacts collect --id blue-lobster --output artifacts/blue-lobster
crabbox artifacts collect --id blue-lobster --all --duration 20s --output artifacts/blue-lobster
crabbox artifacts collect --id blue-lobster --run run_123 --output artifacts/blue-lobster
```
By default `collect` writes:
- `metadata.json`
- `screenshot.png`
- `doctor.txt`
- `webvnc-status.json` when a coordinator login is configured
- `logs.txt` and `run.json` when `--run <run-id>` is provided
`--all` also records `screen.mp4`, creates `screen.trimmed.gif`, and writes
`screen.trimmed.mp4` using the same motion window. Video/GIF capture currently
requires a Linux desktop lease with `ffmpeg` and X11 capture support.
Useful flags:
```text
--id <lease-id-or-slug>
--output <dir>
--run <run-id>
--all
--screenshot
--video
--gif
--doctor
--webvnc-status
--metadata
--duration <duration> default 10s
--fps <n> default 15
--gif-width <px> default 640
--provider <name>
--network auto|public|tailscale
--json
```
When collection hits an unhealthy desktop, WebVNC, VNC, or input layer, it
prints the same inline `problem:`, `detail:`, and `rescue:` commands used by the
desktop and WebVNC commands. With `--json`, stdout remains valid JSON and those
same repair hints are returned in the `warnings` array instead of being printed
as text before the JSON document. If a capture step fails after the bundle has
started, the command still exits nonzero and includes an `error` object with a
stable code and message.
## Video
```sh
crabbox artifacts video --id blue-lobster --duration 15s --output screen.mp4
```
`video` records only an MP4 from a Linux desktop lease. It is useful when you
want to keep capture separate from bundle collection.
## GIF
```sh
crabbox artifacts gif \
--input screen.mp4 \
--output screen.trimmed.gif \
--trimmed-video-output screen.trimmed.mp4
```
`gif` runs the same local motion-trimmed preview logic as
[`crabbox media preview`](media.md).
## Templates
```sh
crabbox artifacts template openclaw \
--before before.png \
--after after.gif \
--summary "Login modal no longer overlaps the toolbar." \
--output summary.md
crabbox artifacts template mantis --summary-file qa-notes.md
```
Templates write Markdown with `Summary`, `Before / After`, and `Evidence`
sections sized for Mantis/OpenClaw QA comments.
## Publish
```sh
crabbox artifacts publish \
--dir artifacts/blue-lobster \
--pr 123
crabbox artifacts publish \
--dir artifacts/blue-lobster \
--pr 123 \
--storage s3 \
--bucket qa-artifacts \
--prefix pr-123/blue-lobster \
--base-url https://qa-artifacts.example.com
crabbox artifacts publish \
--dir artifacts/blue-lobster \
--pr 123 \
--storage cloudflare \
--bucket qa-artifacts \
--prefix pr-123/blue-lobster \
--base-url https://artifacts.example.com
```
`publish` uploads bundle files, writes `published-artifacts.md`, and comments
on the PR with inline images/GIFs plus links to videos, logs, and metadata.
Use `--dry-run` to generate markdown and print intended actions without upload
or comment side effects.
Storage backends:
- `--storage auto` is the default. When a coordinator is configured, Crabbox
asks the broker for upload URLs and the broker-owned artifact backend handles
storage credentials. Without a coordinator, auto falls back to local markdown.
- `--storage broker` requires a configured coordinator and uploads through
broker-minted URLs.
- `--storage s3` uses the AWS CLI and uploads to `s3://<bucket>/<prefix>/...`.
- `--storage cloudflare` uses `wrangler r2 object put --remote`.
- `--storage r2` uses the AWS CLI against an S3-compatible R2 endpoint.
- `--storage local` writes markdown only. For `--pr`, local publishing needs a
`--base-url` that already serves the files, otherwise the PR would contain
unusable local paths.
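The backend selection described in the list above could be sketched as follows. The function name and error messages are hypothetical; only the backend names and the rules come from the list.

```python
BACKENDS = {"auto", "broker", "s3", "cloudflare", "r2", "local"}


def resolve_storage(storage: str, coordinator_configured: bool, pr: bool = False, base_url: str = "") -> str:
    """Resolve the effective --storage backend (illustrative sketch only)."""
    if storage not in BACKENDS:
        raise ValueError(f"unknown storage backend {storage!r}")
    if storage == "auto":
        # auto prefers the broker and otherwise falls back to local markdown
        storage = "broker" if coordinator_configured else "local"
    if storage == "broker" and not coordinator_configured:
        raise ValueError("--storage broker requires a configured coordinator")
    if storage == "local" and pr and not base_url:
        # without a served base URL the PR comment would contain local paths
        raise ValueError("local publishing with --pr needs --base-url")
    return storage
```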
S3 flags:
```text
--bucket <name>
--prefix <path>
--base-url <url>
--region <region>
--profile <profile>
--endpoint-url <url>
--acl <acl>
--presign
--expires <duration> default 168h
```
When `--base-url` is supplied, published links use that public URL. Otherwise
`--presign` generates temporary AWS/R2 S3 URLs after upload.
Cloudflare R2 flags:
```text
--bucket <name>
--prefix <path>
--base-url <url> required for --pr inline-ready links
```
For native Cloudflare publishing, `publish` runs `wrangler` with
`CRABBOX_ARTIFACTS_CLOUDFLARE_*` when present, then the generic
`CLOUDFLARE_*` environment. Prefer brokered publishing for shared teams so
Cloudflare and object-store secrets stay on the coordinator.
For S3-compatible R2 publishing, pass `--storage r2 --endpoint-url <r2-endpoint>
--profile <r2-profile>`. When present, Crabbox uses
`CRABBOX_ARTIFACTS_R2_ENDPOINT_URL` and `CRABBOX_ARTIFACTS_R2_AWS_PROFILE`
before falling back to generic AWS defaults.
Coordinator artifact backend configuration:
```text
CRABBOX_ARTIFACTS_BACKEND=s3|r2
CRABBOX_ARTIFACTS_BUCKET
CRABBOX_ARTIFACTS_PREFIX
CRABBOX_ARTIFACTS_BASE_URL
CRABBOX_ARTIFACTS_REGION
CRABBOX_ARTIFACTS_ENDPOINT_URL
CRABBOX_ARTIFACTS_ACCESS_KEY_ID
CRABBOX_ARTIFACTS_SECRET_ACCESS_KEY
CRABBOX_ARTIFACTS_SESSION_TOKEN
CRABBOX_ARTIFACTS_UPLOAD_EXPIRES_SECONDS
CRABBOX_ARTIFACTS_URL_EXPIRES_SECONDS
```
For brokered publishing, the CLI never receives object-store credentials. It
sends artifact names, sizes, content types, and hashes to
`POST /v1/artifacts/uploads`; the coordinator returns one short-lived upload URL
per file plus the final URL to place in Markdown. Upload grants are signed with
the declared `content-length`, so the object store rejects oversized PUTs during
the grant window; the broker also caps each upload request at 5 GiB total before
signing grants. When `--prefix` is omitted for hosted publishing, the CLI derives
a unique prefix from the PR number, bundle directory, and current time so later
QA comments do not overwrite earlier evidence.
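As a sketch of the two checks described above: the 5 GiB cap is taken from the text, while the prefix layout is purely hypothetical, since the exact format the CLI derives is not specified here. Function and field names are assumptions.

```python
import os
from datetime import datetime, timezone

MAX_REQUEST_BYTES = 5 * 1024**3  # 5 GiB cap per upload request


def total_upload_bytes(files: list[dict]) -> int:
    """Sum declared sizes and reject oversized requests before signing grants."""
    total = 0
    for f in files:
        if f["size"] < 0:
            raise ValueError(f"negative size for {f['name']!r}")
        total += f["size"]
    if total > MAX_REQUEST_BYTES:
        raise ValueError("upload request exceeds 5 GiB")
    return total


def derive_prefix(pr: int, bundle_dir: str, now: datetime) -> str:
    """Hypothetical layout: pr-<number>/<bundle-name>-<UTC timestamp>."""
    name = os.path.basename(os.path.normpath(bundle_dir))
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"pr-{pr}/{name}-{stamp}"
```

Including the timestamp is what keeps a second publish for the same PR and bundle from overwriting the first.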
Coordinator artifact values split into two groups:
- Worker vars: `CRABBOX_ARTIFACTS_BACKEND`, `CRABBOX_ARTIFACTS_BUCKET`,
`CRABBOX_ARTIFACTS_PREFIX`, `CRABBOX_ARTIFACTS_BASE_URL`,
`CRABBOX_ARTIFACTS_REGION`, `CRABBOX_ARTIFACTS_ENDPOINT_URL`,
`CRABBOX_ARTIFACTS_UPLOAD_EXPIRES_SECONDS`, and
`CRABBOX_ARTIFACTS_URL_EXPIRES_SECONDS`. These describe where artifacts go and
how long URLs should live.
- Worker secrets: `CRABBOX_ARTIFACTS_ACCESS_KEY_ID`,
`CRABBOX_ARTIFACTS_SECRET_ACCESS_KEY`, and optional
`CRABBOX_ARTIFACTS_SESSION_TOKEN`. These are S3-compatible object-store keys
used only by the coordinator to sign artifact upload/read URLs.
Our deployed coordinator currently uses R2-compatible storage with public final
URLs on `https://artifacts.openclaw.ai`, bucket
`openclaw-crabbox-artifacts`, and object prefix `crabbox-artifacts`. The actual
R2 access key id and secret access key are Worker secrets; they are not required
on developer machines for normal `crabbox artifacts publish`.
Environment defaults:
```text
CRABBOX_ARTIFACTS_STORAGE
CRABBOX_ARTIFACTS_BUCKET
CRABBOX_ARTIFACTS_PREFIX
CRABBOX_ARTIFACTS_BASE_URL
CRABBOX_ARTIFACTS_AWS_REGION
CRABBOX_ARTIFACTS_AWS_PROFILE
CRABBOX_ARTIFACTS_ENDPOINT_URL
CRABBOX_ARTIFACTS_S3_ACL
CRABBOX_ARTIFACTS_PRESIGN
CRABBOX_ARTIFACTS_EXPIRES
```
`publish --pr` uses `gh issue comment <pr> --body-file ...`, so the GitHub CLI
(`gh`) must be installed and authenticated. Pass `--repo owner/name` when the
working directory is not inside the target repository.

65
docs/commands/attach.md Normal file

@ -0,0 +1,65 @@
# attach
`crabbox attach` follows recorded events for an active coordinator run.
```sh
crabbox attach run_abcdef123456
crabbox attach --id run_abcdef123456 --after 42
crabbox attach run_abcdef123456 --poll 500ms
```
## Behavior
`attach` polls the coordinator for new run events on a fixed interval,
prints them as they arrive, and exits when the run finishes.
- stdout and stderr preview events are written back to stdout and stderr,
preserving the stream split;
- lifecycle events (lease, bootstrap, sync, command-start, finish, release)
are printed to stderr with their sequence number, phase, timestamp, and
message;
- when the run has already finished, attach prints any remaining events
and exits;
- when the run is still active, attach polls until it sees a `finish`
event.
`attach` is not detached command execution. It follows the events the
original CLI is emitting; if that CLI process dies, the run state remains
inspectable through [history](history.md), [events](events.md), and
[logs](logs.md), but `attach` cannot resurrect it.
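The polling behavior can be sketched as a generator that resumes after a known sequence number and stops at the first `finish` event. The `fetch` callback and the event shape (`seq`, `phase`) are assumptions made for this illustration, not the coordinator's actual API.

```python
import time
from typing import Callable, Iterator


def follow_events(
    fetch: Callable[[int], list[dict]],
    after: int = 0,
    poll_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield events with seq > after until a finish event is seen."""
    while True:
        for event in fetch(after):  # hypothetical: returns events newer than `after`
            after = event["seq"]  # remember progress so the next poll resumes here
            yield event
            if event["phase"] == "finish":
                return
        sleep(poll_seconds)  # fixed interval between polls
```

A run that has already finished returns its remaining events, including `finish`, in the first batch, so the loop exits without sleeping.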
## Bounded Output
Output events are a bounded preview. The coordinator caps stdout/stderr
capture at 64 KiB per run and records an `output.truncated` marker when the
cap is reached. Use [logs](logs.md) for the larger retained command output
after completion.
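A minimal sketch of a bounded preview buffer, assuming the 64 KiB cap is shared between stdout and stderr for one run. The class and its event tuples are hypothetical; only the cap and the `output.truncated` marker name come from the text.

```python
CAP_BYTES = 64 * 1024  # preview cap per run


class BoundedPreview:
    """Keep output events until the cap is hit, then record one truncation marker."""

    def __init__(self, cap: int = CAP_BYTES) -> None:
        self.cap = cap
        self.used = 0
        self.truncated = False
        self.events: list[tuple[str, bytes]] = []

    def append(self, stream: str, data: bytes) -> None:
        if self.truncated:
            return  # everything after the marker is dropped from the preview
        remaining = self.cap - self.used
        chunk = data[:remaining]
        if chunk:
            self.events.append((stream, chunk))
            self.used += len(chunk)
        if len(data) > remaining:
            self.events.append(("output.truncated", b""))
            self.truncated = True
```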
## Flags
```text
--id <run-id> run id (also accepted as a positional argument)
--after <seq> resume after this event sequence number
--poll <duration> polling interval, default 1s
```
## Use Cases
- watch a long warmup or run from a second terminal without disturbing the
original CLI;
- monitor an agent-launched run while doing something else locally;
- replay events from a known sequence (`--after`) when reconnecting after
a network blip.
## Direct Mode
Direct-provider mode does not record runs centrally, so `attach` has no
event stream to follow. Use shell output from the original CLI instead.
Related docs:
- [logs](logs.md)
- [events](events.md)
- [history](history.md)
- [run](run.md)
- [History and logs](../features/history-logs.md)


@ -9,23 +9,107 @@ crabbox cache warm --id blue-lobster -- pnpm install --frozen-lockfile
crabbox cache purge --id blue-lobster --kind pnpm --force
```
`--id` accepts the stable `cbx_...` ID or an active friendly slug. Cache commands that SSH to the box touch the lease and validate the local repo claim; add `--reclaim` to move an existing claim.
Cache kinds:
## Subcommands
```text
pnpm
npm
docker
git
all
cache stats show usage for each cache kind on the lease
cache warm run a command in the synced workdir to populate caches
cache purge delete one or all cache kinds (requires --force)
```
`cache warm` runs a command in the synced repo workdir for that lease. On boxes prepared by `crabbox actions hydrate`, it uses the hydrated `$GITHUB_WORKSPACE` and sources the workflow env handoff like `crabbox run`.
`--id` accepts the canonical `cbx_...` lease ID or an active friendly
slug. Cache commands SSH to the box, touch the lease, and validate the
local repo claim. Add `--reclaim` to move an existing claim from another
repo.
Repo `cache.pnpm`, `cache.npm`, `cache.docker`, and `cache.git` toggles control which kinds `stats` reports and which kinds `purge --kind all` removes.
## Cache Kinds
```text
pnpm /var/cache/crabbox/pnpm
npm /var/cache/crabbox/npm
docker Docker layer/image cache (host-managed)
git /var/cache/crabbox/git (shared origin objects)
all every kind enabled in repo config
```
Repo `cache.pnpm`, `cache.npm`, `cache.docker`, and `cache.git` toggles
control which kinds `stats` reports and which kinds `purge --kind all`
removes. Disabled kinds are omitted from stats, are not purged by
`--kind all`, and asking to purge a disabled specific kind fails early.
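The toggle rules can be sketched like this. The function name is hypothetical, and treating a missing toggle as disabled is an assumption of the sketch.

```python
ALL_KINDS = ("pnpm", "npm", "docker", "git")


def kinds_to_purge(kind: str, enabled: dict[str, bool]) -> list[str]:
    """Resolve --kind against repo cache toggles (illustrative only)."""
    if kind == "all":
        # "all" means every enabled kind, never a disabled one
        return [k for k in ALL_KINDS if enabled.get(k, False)]
    if kind not in ALL_KINDS:
        raise ValueError(f"unknown cache kind {kind!r}")
    if not enabled.get(kind, False):
        # fail early instead of silently doing nothing
        raise ValueError(f"cache kind {kind!r} is disabled in repo config")
    return [kind]
```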
## stats
```sh
crabbox cache stats --id blue-lobster
```
Prints sizes for each enabled cache kind:
```text
pnpm 8.4GiB
npm 1.2GiB
docker 18.7GiB
git 430MiB
```
`--json` returns the same data as a structured object.
## warm
```sh
crabbox cache warm --id blue-lobster -- pnpm install --frozen-lockfile
crabbox cache warm --id blue-lobster -- docker compose pull
```
Runs a command in the synced repo workdir for that lease. On boxes
prepared by `crabbox actions hydrate`, it uses the hydrated
`$GITHUB_WORKSPACE` and sources the workflow env handoff, just like
`crabbox run` does.
Use warm for one-off cache priming when you do not want to record a full
run history entry.
## purge
```sh
crabbox cache purge --id blue-lobster --kind pnpm --force
crabbox cache purge --id blue-lobster --kind all --force
```
Removes the named cache kind from the lease. `--force` is required to
prevent accidental purges. If `cache.maxGB` is set, purge is rarely
needed because the runner trims the oldest entries automatically when
caches exceed the cap.
## Flags
```text
--id <lease-id-or-slug> target lease (required)
--kind pnpm|npm|docker|git|all for purge
--force required for purge
--reclaim move local claim from another repo
--json stats as JSON
```
## When To Use Cache
Caches are speed hints, not source of truth. The synced worktree remains
authoritative.
- Use `cache stats` to confirm a long-lived warm box is gaining benefit
from cached packages.
- Use `cache warm` to prime a fresh lease before handing it to agents that
run many short commands.
- Use `cache purge` when a corrupt cache is poisoning a build (rare;
usually the underlying tool's own cache reset works first).
Disposable leases lose cache state when the VM is deleted; kept leases
can reuse cache state across repeated agent runs. For shared baked
images, see [Prebaked runner images](../features/prebaked-images.md).
Related docs:
- [Performance](../performance.md)
- [Cache controls](../features/cache.md)
- [Performance](../performance.md)
- [run](run.md)
- [actions](actions.md)


@ -1,21 +1,77 @@
# cleanup
`crabbox cleanup` sweeps direct-provider leftovers.
`crabbox cleanup` sweeps direct-provider leftovers based on Crabbox labels.
```sh
crabbox cleanup --dry-run
crabbox cleanup
```
Cleanup refuses to run when a coordinator is configured. Brokered cleanup belongs to the Durable Object alarm.
`crabbox machine cleanup` is preserved as a compatibility alias.
Direct cleanup skips kept machines, deletes expired ready/leased/active machines, and gives running/provisioning machines an extra stale safety window. It relies on provider labels such as `lease`, `slug`, `expires_at`, and `state`.
## Behavior
Flags:
Cleanup refuses to run when a coordinator is configured. Brokered cleanup
belongs to the Durable Object alarm; sweeping provider resources behind the
coordinator can race live brokered leases.
In direct-provider mode, cleanup is intentionally conservative:
- skip machines tagged `keep=true`;
- skip machines in `running` or `provisioning` state until the extra stale
safety window passes (expiry plus 12 hours);
- delete machines that are clearly expired in `ready`, `leased`, or
`active` states;
- delete machines that have been inactive past expiry.
Selection is label-driven. Cleanup uses `lease`, `slug`, `expires_at`,
`last_touched_at`, `state`, and `keep` labels written when the machine was
created. Resources without Crabbox labels are never touched.
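The conservative selection rules above can be sketched as a pure function over a machine's labels. This is an illustration under assumptions: apart from `skip=keep`, the skip reasons are invented for the sketch, and the `expires_at` label is assumed to use the ISO 8601 form seen in dry-run output.

```python
from datetime import datetime, timedelta, timezone

STALE_WINDOW = timedelta(hours=12)  # extra safety window for running/provisioning


def cleanup_decision(labels: dict[str, str], now: datetime) -> str:
    """Return "delete" or a skip reason for one machine (illustrative only)."""
    if "lease" not in labels:
        return "skip=unlabeled"  # resources without Crabbox labels are never touched
    if labels.get("keep") == "true":
        return "skip=keep"
    expires_at = datetime.fromisoformat(labels["expires_at"].replace("Z", "+00:00"))
    if labels.get("state") in ("running", "provisioning"):
        # these states wait for expiry plus the stale window before deletion
        return "delete" if now > expires_at + STALE_WINDOW else "skip=stale-window"
    return "delete" if now > expires_at else "skip=not-expired"
```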
Static SSH targets are existing operator-owned hosts, so `provider=ssh`
has nothing to sweep. Cleanup exits early for that provider.
## Output
`--dry-run` lists every decision without taking action:
```text
--provider hetzner|aws
--dry-run
hetzner cx53 hz-12345 lease=cbx_abcdef123456 slug=blue-lobster keep=true skip=keep
hetzner cx53 hz-67890 lease=cbx_abcdef234567 slug=amber-crab expires_at=2026-05-01T17:30:00Z delete
```
`crabbox machine cleanup` remains as a compatibility alias.
Without `--dry-run`, the same lines print but each `delete` is followed by
`deleted` after the provider call returns. Failures print the provider
error and continue with the next candidate.
## Flags
```text
--provider hetzner|aws|azure provider to sweep (delegated providers do not need cleanup)
--target linux|macos|windows for AWS, restrict by target
--windows-mode normal|wsl2 when target=windows
--static-host <host> ignored (provider=ssh has nothing to sweep)
--static-user <user> ignored
--static-port <port> ignored
--static-work-root <path> ignored
--dry-run log decisions without making provider calls
```
## When To Run
- after a CLI process crashed mid-warmup and left a server behind;
- when migrating from direct mode to brokered mode (sweep first, then
switch);
- as a safety net after rotating provider credentials;
- never as part of a brokered workflow - the coordinator owns that path.
For brokered fleets, audit `crabbox admin leases --state active` and use
`crabbox admin release` instead.
Related docs:
- [stop](stop.md)
- [admin](admin.md)
- [Lifecycle cleanup](../features/lifecycle-cleanup.md)
- [Orchestrator](../orchestrator.md)
- [Operations](../operations.md)

106
docs/commands/code.md Normal file

@ -0,0 +1,106 @@
# code
`crabbox code` bridges a code-server workspace for a Linux lease into the
authenticated coordinator portal.
```sh
crabbox warmup --code
crabbox code --id blue-lobster
crabbox code --id blue-lobster --open
```
## How It Works
Create or reuse a lease with `code=true`:
```sh
crabbox warmup --code
```
The Linux bootstrap installs `code-server` only for leases that request the
capability. `crabbox code` then resolves the lease, starts `code-server` on
runner loopback, opens an SSH tunnel, mints a short-lived bridge ticket, and
registers a local bridge with the coordinator.
The editor opens the synced workspace by default. If you run `crabbox code`
from a subdirectory inside the local checkout, Crabbox maps that relative path
onto the remote workspace and opens the matching folder. Actions-hydrated
leases use the hydration workspace instead of the default `/work/crabbox/...`
path.
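The subdirectory mapping can be sketched with pure path arithmetic. The function name is hypothetical, the sketch assumes POSIX-style paths on both sides, and the remote workspace root is passed in rather than guessed.

```python
from pathlib import PurePosixPath


def remote_folder(local_root: str, cwd: str, remote_workspace: str) -> str:
    """Map the current local directory onto the matching remote folder."""
    # relative_to raises ValueError when cwd is outside the checkout
    rel = PurePosixPath(cwd).relative_to(PurePosixPath(local_root))
    return str(PurePosixPath(remote_workspace) / rel)
```

Running from the checkout root yields the workspace root itself; running from `docs/commands` inside the checkout yields `<remote workspace>/docs/commands`.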
The browser URL is lease-scoped:
```text
/portal/leases/<lease-id>/code/
```
The data path is:
```text
browser
<-> coordinator /portal/leases/<lease>/code/
<-> local crabbox code process
<-> SSH tunnel
<-> runner 127.0.0.1:8080
```
Keep the local `crabbox code` process running while using the editor. The
coordinator authenticates the browser through portal auth and authenticates the
local bridge with a one-use, short-lived ticket. The CLI sends the ticket as
an `Authorization: Bearer ...` header so it stays out of websocket URLs and
proxy/access logs; the coordinator accepts a `?ticket=` query string as a
fallback for older CLIs.
If the browser opens before the local bridge connects, the Code portal renders a
waiting state with the exact `crabbox code --id <lease> --open` command, copy
and reload controls, and bridge status. Once the bridge is connected, the page
automatically opens the mapped workspace.
Managed code-server starts with `Default Dark Modern` as the default theme. The
bridge also chunks large HTTP responses and websocket frames so VS Code assets
and extension-host traffic stay below coordinator websocket frame limits.
## Flags
```text
--id <lease-id-or-slug>
--provider hetzner|aws|azure
--target linux
--network auto|tailscale|public
--local-port <port>
--open
--reclaim
```
## Limitations
- Coordinator-backed Linux leases are supported.
- Static SSH hosts, Windows, macOS, and Blacksmith Testbox are intentionally not
supported by this portal bridge yet.
- `code-server` auth is disabled on the runner side because the trusted access
boundary is the authenticated coordinator portal plus the local bridge.
## Troubleshooting
`lease ... was not created with code=true`
Warm a new lease with the code capability:
```sh
crabbox warmup --code
```
The portal shows a bridge command
The browser can reach the coordinator, but no local bridge is registered. Use
the command shown by the portal, or start `crabbox code --id <lease> --open`
locally and keep it running.
Check bridge health with:
```sh
curl https://crabbox.openclaw.ai/portal/leases/<lease>/code/health
```
When authenticated, the health response includes whether the code bridge agent
is currently connected.


@ -6,7 +6,8 @@
crabbox config path
crabbox config show
crabbox config show --json
printf '%s' "$TOKEN" | crabbox config set-broker --url https://crabbox-coordinator.steipete.workers.dev --provider aws --token-stdin
printf '%s' "$TOKEN" | crabbox config set-broker --url https://crabbox.openclaw.ai --provider aws --token-stdin
printf '%s' "$ADMIN_TOKEN" | crabbox config set-broker --url https://crabbox.openclaw.ai --admin-token-stdin
```
Subcommands:
@ -14,13 +15,29 @@ Subcommands:
```text
path
show [--json]
set-broker --url <url> --token-stdin [--provider hetzner|aws]
set-broker --url <url> [--token-stdin] [--admin-token-stdin] [--provider hetzner|aws|azure]
```
`config show` reports broker auth as `auth` and `admin_auth`, plus
`access_auth` as `missing`, `service-token`, `token`, `service-token+token`, or
`incomplete`, without printing secret values. Store broker tokens and Access
secrets only in user config or environment variables, not repo-local config.
User config is written with `0600` permissions, and `crabbox doctor` flags
broader permissions.
User config lives under the OS user config directory. Repo-local `crabbox.yaml` or `.crabbox.yaml` can override user defaults for a checkout. Keep project-specific sync, env, capacity, and Actions policy in repo config, not in the Crabbox binary:
```yaml
profile: project-check
tailscale:
enabled: true
network: auto
tags:
- tag:crabbox
hostnameTemplate: crabbox-{slug}
authKeyEnv: CRABBOX_TAILSCALE_AUTH_KEY
exitNode: mac-studio.example.ts.net
exitNodeAllowLanAccess: true
capacity:
market: spot
strategy: most-available
@ -46,3 +63,18 @@ env:
- NODE_OPTIONS
- PROJECT_*
```
`tailscale.enabled` requests tailnet join for new managed Linux leases.
`tailscale.network` selects the SSH target resolution path:
- `auto`: prefer Tailscale when lease metadata exists and SSH is reachable;
- `tailscale`: require the tailnet path;
- `public`: force the provider/public host.
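The three modes above can be sketched as a small selection function. This is an illustration only: `pick_ssh_host`, its argument order, and the `yes|no` reachability flag are hypothetical and not part of the Crabbox CLI. It also assumes that `tailscale` mode fails as soon as the lease has no tailnet metadata.
```sh
# Hypothetical helper, not a Crabbox command.
# usage: pick_ssh_host <auto|tailscale|public> <tailnet-host-or-empty> <public-host> <tailnet-ssh-ok: yes|no>
pick_ssh_host() {
  mode=$1
  tailnet_host=$2
  public_host=$3
  tailnet_ok=$4
  case "$mode" in
    public)
      # public: always use the provider/public host
      printf '%s\n' "$public_host"
      ;;
    tailscale)
      # tailscale: the tailnet path is required (assumption: missing metadata is an error)
      if [ -n "$tailnet_host" ]; then
        printf '%s\n' "$tailnet_host"
      else
        echo "tailnet path required but lease has no tailnet metadata" >&2
        return 1
      fi
      ;;
    auto)
      # auto: prefer the tailnet host only when metadata exists and SSH is reachable
      if [ -n "$tailnet_host" ] && [ "$tailnet_ok" = yes ]; then
        printf '%s\n' "$tailnet_host"
      else
        printf '%s\n' "$public_host"
      fi
      ;;
    *)
      echo "unknown network mode: $mode" >&2
      return 2
      ;;
  esac
}
```
For example, `pick_ssh_host auto 100.64.0.5 203.0.113.10 no` prints the public host, because the tailnet address exists but SSH over it is not reachable.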
Brokered `--tailscale` leases use Worker-minted one-off auth keys. Direct
provider leases read a local one-off key from `tailscale.authKeyEnv`; do not
store that key in repo config.
`tailscale.exitNode` routes lease egress through an approved tailnet exit node.
`tailscale.exitNodeAllowLanAccess` keeps LAN access available while using that
exit node.

docs/commands/desktop.md

@ -0,0 +1,112 @@
# desktop
`crabbox desktop launch` starts an app inside a desktop lease without taking
over VNC manually.
```sh
crabbox warmup --desktop --browser
crabbox desktop launch --id blue-lobster --browser --url https://example.com
crabbox desktop launch --id blue-lobster --browser --url https://example.com --webvnc --open
crabbox desktop launch --id blue-lobster --browser --url https://discord.com/login --egress discord --webvnc --open
crabbox desktop launch --id blue-lobster -- xterm
crabbox desktop doctor --id blue-lobster
crabbox desktop click --id blue-lobster --x 640 --y 420
crabbox desktop paste --id blue-lobster --text "peter@example.com"
printf 'peter@example.com' | crabbox desktop paste --id blue-lobster
crabbox desktop type --id blue-lobster --text "hello"
crabbox desktop key --id blue-lobster ctrl+l
crabbox desktop key blue-lobster ctrl+l
```
The command resolves and touches the lease, verifies `desktop=true`, waits for
the loopback VNC service, then starts the process detached from the SSH session.
With `--browser`, Crabbox probes the target browser the same way `run --browser`
does and launches `BROWSER` when no explicit command is provided.
With `--webvnc`, the command keeps running after launch and bridges the desktop
into the authenticated WebVNC portal. Add `--open` to open that portal locally.
Browser launches default to a windowed human desktop with the remote panel and
title bar visible; use `--fullscreen` only for capture/video workflows.
`--egress <profile>` passes the active lease-local egress proxy to the launched
browser as `--proxy-server=http://127.0.0.1:3128`, so the browser exits to the
internet through the operator machine running `crabbox egress start`. Start
the egress bridge first; the flag currently requires `--browser`. Override the
proxy address with `--egress-proxy host:port` if you started egress on a
non-default port. See [egress](egress.md) for the full bridge model.
On Windows, SSH sessions cannot directly own the visible console desktop, so
Crabbox writes a one-shot PowerShell launcher under `C:\ProgramData\crabbox` and
runs it as an interactive scheduled task for the logged-in `crabbox` user. The
launcher minimizes existing windows, starts the app, and tries to foreground
the new process. On Linux and macOS, the command is detached with `setsid` or
`nohup`.
`crabbox desktop doctor` checks the selected lease without syncing the repo.
For Linux desktop leases it reports VM/session health separately from portal
health: `DISPLAY`, Xvfb/window manager/panel, VNC listener, `xdotool`,
clipboard tool, browser binary, `ffmpeg`, screen size, screenshot capture, and
WebVNC bridge/viewer state. Failures include a one-line repair suggestion so
you can tell session bugs from WebVNC/browser-portal bugs.
Desktop launch and input failures now surface the failing layer directly in the
CLI output. For example, a missing visible browser reports `problem: browser not
launched`, a dead input path reports `problem: input stack dead`, and a broken
portal path reports `problem: VNC bridge disconnected` or `problem: WebVNC
daemon not running`. The same output includes exact `rescue:` commands such as
`crabbox desktop doctor --id <lease>` or `crabbox webvnc reset --id <lease>
--open`.
Input helpers also operate on the selected lease over SSH without repo sync.
Use them instead of hand-written `xdotool` snippets. `desktop type` uses raw
`xdotool type` only for simple alphanumeric text; text with emails, passwords,
symbols such as `@` or `+`, URLs, whitespace, or long payloads goes through the
remote clipboard and paste path because keyboard layouts can otherwise corrupt
special characters.
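A minimal sketch of that routing decision, for scripts that want to predict which path a given string takes. `input_path` is a hypothetical name, and the 64-character cutoff for "long payloads" is an assumption; the real threshold is not documented here.
```sh
# Hypothetical helper, not a Crabbox command.
# usage: input_path <text>   -> prints "type" or "paste"
input_path() {
  text=$1
  max_typed_len=64   # assumed cutoff for "long payloads"
  case "$text" in
    *[!A-Za-z0-9]*)
      # anything beyond plain letters and digits (symbols, whitespace, URLs, emails)
      echo paste
      return 0
      ;;
  esac
  if [ "${#text}" -gt "$max_typed_len" ]; then
    echo paste
  else
    echo type
  fi
}
```
Under these assumptions, `input_path hello` selects the typed path, while `input_path 'peter@example.com'` selects the clipboard path because of `@` and `.`.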
`desktop paste` accepts `--text` or stdin. `desktop key` accepts either
`--id <lease> <keys>` or the positional lease form `<lease> <keys>`; the key
sequence is parsed after lease flags so common forms such as
`crabbox desktop key blue-lobster ctrl+l` and
`crabbox desktop key --id blue-lobster ctrl+l` send `ctrl+l`, not the lease id.
Flags:
```text
--id <lease-id-or-slug>
--provider hetzner|aws|azure|ssh|daytona
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
--browser
--url <url>
--webvnc
--open
--fullscreen
--egress <profile>
--egress-proxy <host:port>
--reclaim
```
Input helper flags:
```text
desktop doctor --id <lease-id-or-slug>
desktop click --id <lease-id-or-slug> --x <n> --y <n>
desktop paste --id <lease-id-or-slug> --text <text>
desktop paste --id <lease-id-or-slug> < input.txt
desktop type --id <lease-id-or-slug> --text <text>
desktop key --id <lease-id-or-slug> <keys>
desktop key <lease-id-or-slug> <keys>
desktop key --id <lease-id-or-slug> --keys <keys>
```
Related docs:
- [egress](egress.md)
- [vnc](vnc.md)
- [webvnc](webvnc.md)
- [Lease capabilities](../features/capabilities.md)
- [Mediated egress](../features/egress.md)


@ -1,16 +1,101 @@
# doctor
`crabbox doctor` checks local prerequisites and broker/provider access.
`crabbox doctor` runs the local preflight before you commit to a long
workflow. It is fast (under a second on a healthy machine), local-only, and
never calls a billable provider API.
```sh
crabbox doctor
crabbox doctor --provider aws
crabbox doctor --provider hetzner --target linux
crabbox doctor --provider ssh --target windows --windows-mode normal --static-host win-dev.local
```
It checks local tools, per-lease key generation support, coordinator health when configured, and direct-provider API access otherwise. If `CRABBOX_SSH_KEY` is explicitly set, it also validates that private key and matching `.pub` file.
Flags:
## What It Checks
```text
--provider hetzner|aws
config config files load and parse, required keys are present
auth broker token is set, signed token is valid, identity resolves
network coordinator URL reachable, DNS works, SSH transport probes work
ssh SSH key path readable, key permissions sane, ssh-keygen on PATH
tools rsync, git, ssh, ssh-keygen present and executable
```
For `--provider ssh`, doctor also probes the static host: SSH reachability
on the configured port, target-required tools (`bash`, `git`, `rsync`,
`tar` for POSIX targets; OpenSSH, PowerShell, and `tar` for native
Windows), and `static.workRoot` writability.
When `CRABBOX_SSH_KEY` is explicitly set, doctor validates the private key
and the matching `.pub` file. When unset, it skips that check because
per-lease keys do not need a global key.
For the full list of checks, including how each one decides between
`fail`, `skip`, and `ok`, see
[Doctor checks](../features/doctor.md).
## Output
```text
config:
ok user config: ~/.config/crabbox/config.yaml
ok repo config: ./.crabbox.yaml
ok provider: aws
ok target: linux
auth:
ok broker: https://crabbox.openclaw.ai
ok owner: alex@example.com
network:
ok coordinator dns
ok coordinator https
ssh:
ok ssh-keygen present
skip ssh.key unset (per-lease keys will be used)
tools:
ok git
ok rsync
ok ssh
ok ssh-keygen
```
Failures swap the leading `ok` for `fail` and add a remediation hint:
```text
auth:
fail broker token is missing - run `crabbox login`
```
Exit code is `0` on full success, `2` on any failure. Skips never change
the exit code.
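In automation, the exit code is the gate and the `fail` lines are the detail. The sketch below assumes the human output shape shown above (a `section:` line followed by `ok`/`skip`/`fail` lines); `doctor_failures` is a hypothetical helper name.
```sh
# Hypothetical helper, not a Crabbox command.
# Reads doctor's human output on stdin and prints "<section>: <message>" for each fail line.
doctor_failures() {
  awk '
    /^[a-z]+:$/ {
      # section header such as "auth:"; remember it without the colon
      section = substr($0, 1, length($0) - 1)
      next
    }
    $1 == "fail" {
      $1 = ""            # drop the status word
      sub(/^ +/, "")     # trim the separator left behind
      print section ": " $0
    }
  '
}
```
A typical use is `out=$(crabbox doctor) || printf '%s\n' "$out" | doctor_failures`, which prints only the failing checks when doctor exits non-zero.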
## Flags
```text
--provider hetzner|aws|azure|ssh provider to validate
--target linux|macos|windows target OS for ssh provider checks
--windows-mode normal|wsl2 when target=windows
--static-host <host> static SSH host
--static-user <user> static SSH user override
--static-port <port> static SSH port override
--static-work-root <path> static target work root
```
## When To Run
- before the first `crabbox run` on a new machine;
- after rotating the broker token;
- after editing the user config or repo config;
- in agent boot sequences as a sanity check;
- when triaging "Crabbox is broken" reports - doctor often catches the
problem before the user has to describe it.
Doctor is safe to run from `pre-commit`, scheduled jobs, and CI smoke
because it never provisions, never costs money, and never modifies state.
Related docs:
- [Doctor checks](../features/doctor.md)
- [Configuration](../features/configuration.md)
- [Auth and admin](../features/auth-admin.md)
- [Network and reachability](../features/network.md)
- [Troubleshooting](../troubleshooting.md)

docs/commands/egress.md

@ -0,0 +1,135 @@
# egress
`crabbox egress` bridges lease-local browser or app traffic through the machine
running the egress host agent.
```sh
crabbox egress start --id blue-lobster --profile discord
crabbox egress start --id blue-lobster --profile discord --daemon
crabbox desktop launch --id blue-lobster --browser --url https://discord.com/login --egress discord
crabbox egress status --id blue-lobster
crabbox egress stop --id blue-lobster
```
## How It Works
`egress start` installs a short-lived egress client helper on the lease, starts
a loopback HTTP proxy such as `127.0.0.1:3128`, then runs a local host bridge on
the operator machine. Both sides connect outbound to the coordinator with
one-use tickets. The coordinator pairs the two WebSockets and forwards
multiplexed proxy messages; it does not open internet connections itself.
The browser/app data path is:
```text
Chrome in lease
-> lease 127.0.0.1:3128
-> coordinator Durable Object
-> local crabbox egress host process
-> internet from the operator machine
```
`desktop launch --egress <profile>` passes the lease-local proxy to Chrome as:
```text
--proxy-server=http://127.0.0.1:3128
```
The portal lease detail page shows the active egress session, host/client
connection state, and copyable `egress status` / `egress stop` commands. It
does not expose tickets or raw proxy URLs.
## Subcommands
```text
start Start a remote lease proxy and local host bridge
host Run only the local egress host bridge
client Run only the lease-side proxy bridge
status Show coordinator bridge status
stop Stop the local host daemon and remote lease client
```
Use `host` and `client` directly when debugging tickets, custom tunnels, or a
manually installed helper.
## Profiles And Allowlist
The host side refuses to become an open proxy. Use a built-in profile or an
explicit allowlist:
```sh
crabbox egress start --id blue-lobster --profile discord
crabbox egress start --id blue-lobster --allow example.com,*.example.com
```
Built-in profiles:
- `discord`: `discord.com`, `*.discord.com`, `discordcdn.com`,
`*.discordcdn.com`, `hcaptcha.com`, `*.hcaptcha.com`
- `slack`: `slack.com`, `*.slack.com`, `slack-edge.com`, `*.slack-edge.com`
Wildcard entries match the named domain and subdomains.
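As a sketch of that matching rule, the function below treats `*.example.com` as covering both `example.com` and any subdomain, and treats every other entry as an exact host. `host_allowed` is a hypothetical helper, not the host bridge's actual matcher, and it does not normalize case.
```sh
# Hypothetical helper, not a Crabbox command.
# usage: host_allowed <host> <comma-separated-patterns>   -> exit 0 when allowed
# The body runs in a subshell so the IFS and globbing changes do not leak to the caller.
host_allowed() (
  host=$1
  IFS=,
  set -f   # keep "*.example.com" from being expanded against local files
  for pattern in $2; do
    case "$pattern" in
      \*.*)
        base=${pattern#\*.}
        case "$host" in
          "$base"|*."$base") exit 0 ;;   # named domain or any subdomain
        esac
        ;;
      *)
        [ "$host" = "$pattern" ] && exit 0   # exact match only
        ;;
    esac
  done
  exit 1
)
```
With this rule, `host_allowed cdn.example.com 'example.com,*.example.com'` succeeds, while `evil-example.com` is rejected because it is neither the named domain nor a dot-separated subdomain of it.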
## Flags
Common:
```text
--id <lease-id-or-slug>
--provider hetzner|aws
--profile <name>
--allow <comma-separated-host-patterns>
```
`start`:
```text
--listen 127.0.0.1:3128
--daemon
--coordinator <public-coordinator-url>
--target linux
--network auto|tailscale|public
```
`host` and `client` debugging:
```text
--coordinator <url>
--ticket <ticket>
--session <session-id>
```
## Limitations
- The shipped path is per-app/per-process egress, not full VM routing.
- `egress start` supports coordinator-backed Linux SSH leases.
- `egress start` refuses non-Linux targets until target-specific remote helper
install/start commands exist.
- `egress start` does not install Cloudflare Access service-token credentials
on the remote lease. If Access credentials are configured locally, use a
public coordinator route, or run `egress client` manually only when it is safe
to provide the required access headers.
- The first implementation uses JSON/base64 bridge frames. That is good enough
for browser QA but can be optimized with binary frames later.
## Troubleshooting
`egress host requires --profile or --allow`
The host bridge will not start as an open proxy. Pick a profile or pass an
explicit allowlist.
`remote egress client did not listen`
Inspect the remote helper log:
```sh
crabbox ssh --id blue-lobster
cat /tmp/crabbox-egress-client.log
```
`desktop launch --egress currently requires --browser`
The automatic proxy flag is wired for browser launches. For custom apps, pass
the app's proxy flag yourself or use the lease-local proxy address printed by
`egress start`.

docs/commands/events.md

@ -0,0 +1,81 @@
# events
`crabbox events` prints the coordinator event log for a recorded run.
```sh
crabbox events run_abcdef123456
crabbox events --id run_abcdef123456 --after 42 --limit 100
crabbox events run_abcdef123456 --json
```
## What Events Are Recorded
Coordinator-backed `crabbox run` creates a durable `run_...` handle before
it leases or syncs. The CLI appends ordered events as the run advances:
- `lease.acquire.start`, `lease.acquire.success`, `lease.acquire.fail`;
- `bootstrap.wait`, `bootstrap.ready`;
- `sync.start`, `sync.skip`, `sync.success`, `sync.fail`;
- `command.start`, `command.finish`;
- `output.stdout`, `output.stderr`, `output.truncated`;
- `release.start`, `release.success`, `release.fail`.
Each event carries a sequence number, event type, phase, optional stream
(stdout/stderr), timestamp, and short message or output text.
## Output
Human output prints sequence number, event type, phase, stream, timestamp,
and message:
```text
1 lease.acquire.start plan 2026-05-07T07:42:18Z
2 lease.acquire.success plan 2026-05-07T07:42:21Z leased=cbx_abcdef123456 slug=blue-lobster
3 bootstrap.wait provision 2026-05-07T07:42:21Z
4 bootstrap.ready provision 2026-05-07T07:43:05Z
5 sync.start sync 2026-05-07T07:43:05Z
6 sync.success sync 2026-05-07T07:43:08Z files=184 bytes=12.4MiB
7 command.start run 2026-05-07T07:43:08Z pnpm test
8 output.stdout run 2026-05-07T07:43:09Z > vitest run
9 output.stdout run 2026-05-07T07:43:11Z ✓ src/foo.test.ts (8)
...
42 command.finish run 2026-05-07T07:45:32Z exit=0
43 release.success release 2026-05-07T07:45:34Z
```
`--json` returns the raw event records.
## Bounded Output Capture
Output events are a bounded preview. The coordinator caps stdout/stderr
capture at 64 KiB per run and records an `output.truncated` marker when
the cap is reached. The retained log keeps up to 8 MiB. For the larger
retained command output, use [logs](logs.md).
## Flags
```text
--id <run-id> run id (also accepted as a positional argument)
--after <seq> only show events after this sequence number
--limit <n> maximum number of events, default 500, maximum 500
--json print JSON
```
`--after` is what `attach` uses internally - resume from a known sequence
without replaying the whole event log.
## Use Cases
- post-mortem on a failed run when you need the exact sequence of phases;
- correlating a failed step with the timestamps of surrounding sync or
bootstrap events;
- scripting a status check that filters by event type;
- archiving event records for runs that exceeded the retained log cap.
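For the scripting case, a minimal sketch that works on the human output shown above, where the first column is the sequence number and the second is the event type. Both function names are hypothetical helpers; `--json` is likely the better input for anything long-lived, but its record shape is not shown here.
```sh
# Hypothetical helpers, not Crabbox commands.
# usage: crabbox events <run-id> | events_of_type sync.fail
events_of_type() {
  awk -v want="$1" '$2 == want'
}

# usage: crabbox events <run-id> | last_event_seq
# Prints the last numeric sequence number seen, suitable as the next --after value.
last_event_seq() {
  awk '$1 ~ /^[0-9]+$/ { last = $1 } END { if (last != "") print last }'
}
```
A polling loop can store the value from `last_event_seq` and pass it back as `--after <seq>` so each call only returns new events.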
Related docs:
- [history](history.md)
- [logs](logs.md)
- [attach](attach.md)
- [results](results.md)
- [History and logs](../features/history-logs.md)


@ -20,9 +20,15 @@ Flags:
--json print JSON
```
Human output includes run ID, lease ID, state, exit code, duration, start time, and command. Use the run ID with [logs](logs.md).
Human output includes run ID, lease ID, state, phase, exit code, duration, start
time, command, and any recorded run resource summary. `--json` includes the
start/end telemetry snapshots when a coordinator-backed Linux run captured
them. Use the run ID with [events](events.md), [attach](attach.md), or
[logs](logs.md).
Related docs:
- [events](events.md)
- [attach](attach.md)
- [logs](logs.md)
- [History and logs](../features/history-logs.md)

docs/commands/image.md

@ -0,0 +1,104 @@
# image
`crabbox image` contains trusted operator controls for AWS runner images.
```sh
crabbox image create --id cbx_... --name openclaw-crabbox-20260501-1246 --wait
crabbox image promote ami-...
crabbox image promote ami-... --json
```
Image commands require a configured coordinator and admin-token auth. Set
`broker.adminToken` or `CRABBOX_COORDINATOR_ADMIN_TOKEN` locally; the Worker
checks `CRABBOX_ADMIN_TOKEN`.
They are intentionally not available to normal GitHub browser-login users.
Image bytes live in the provider account, not in git or coordinator durable
state. AWS images are AMIs backed by EBS snapshots. Crabbox stores only the
promoted AMI id and related metadata so future AWS leases can resolve the
default image. Hetzner snapshots/images should live in the Hetzner project and
be selected through `image`/`CRABBOX_HETZNER_IMAGE` until Crabbox grows
Hetzner create/promote lifecycle commands.
## create
Create an AWS AMI from an active AWS lease.
Flags:
```text
--id <cbx_id> source lease; must be a canonical AWS lease ID
--name <name> AMI name
--wait poll until the AMI is available
--wait-timeout <d> default 45m
--no-reboot default true
--json print JSON
```
The source lease must still be active in the coordinator. The Worker calls AWS
`CreateImage` from the backing instance ID and tags the image as Crabbox-owned.
Recommended bake flow:
```sh
crabbox warmup --provider aws --class standard --ttl 2h --idle-timeout 30m
crabbox run --id <slug> --shell -- 'command -v ssh git rsync curl jq && test -d /work/crabbox'
crabbox image create --id <cbx_id> --name openclaw-crabbox-YYYYMMDD-HHMM --wait
```
Use a fresh, intentionally warmed lease as the source. Do not bake personal
workspace state, local secrets, repository checkouts, or one-off debugging
artifacts into the image.
For desktop/browser or Mantis images, follow the full [Image bake runbook](../features/image-bake-runbook.md)
instead of relying only on the short smoke above.
Failure handling:
- If `--wait` times out, run `crabbox image create ... --json` or inspect the
AWS AMI state before retrying. AWS image creation can continue after the CLI
stops polling.
- If the AMI enters a failed state, leave the current promoted image in place
and create a new image from a fresh lease.
- If the source lease disappears, create a new warm lease and restart the bake;
image creation requires the backing AWS instance ID.
- If the baked image boots but never reaches `crabbox-ready`, do not promote it.
Keep the previous promoted AMI and debug bootstrap on a normal lease first.
- Cleanup of stale candidate AMIs is an AWS operator task. Promotion does not
delete old images or snapshots.
- If a Mantis timing report does not improve after promotion, treat that as a
failed performance bake even if the AMI boots.
## promote
Promote an available AMI as the coordinator's default AWS image:
```sh
crabbox image promote ami-1234567890abcdef0
```
Add `--json` to print the promoted image record for automation.
Future brokered AWS leases use the promoted image when the request does not set
an explicit `awsAMI` or `CRABBOX_AWS_AMI` override. Promotion stores coordinator
metadata only; it does not copy or modify the AMI.
Promotion and rollback:
```sh
crabbox image promote ami-new
crabbox warmup --provider aws --class standard --ttl 20m --idle-timeout 6m
crabbox run --id <slug> --shell -- 'echo image-smoke-ok && uname -srm && test -d /work/crabbox'
crabbox stop <slug>
```
If the smoke fails, promote the previous known-good AMI again. The coordinator
stores only the selected AMI ID, so rollback is another `image promote` call.
Keep the previous AMI available until at least one brokered AWS smoke succeeds
on the new image.
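The promote, smoke, and rollback sequence can be wrapped in one function. This is a sketch under stated assumptions: `promote_with_rollback` is a hypothetical helper, the smoke step is whatever command the caller passes in, and the only Crabbox call it makes is the documented `crabbox image promote <ami>`.
```sh
# Hypothetical helper, not a Crabbox command.
# usage: promote_with_rollback <new-ami> <previous-ami> <smoke-command> [args...]
promote_with_rollback() {
  new_ami=$1
  prev_ami=$2
  shift 2
  # Promote the candidate first; stop if the coordinator rejects it.
  crabbox image promote "$new_ami" || return 1
  # Run the caller-supplied smoke command against the newly promoted image.
  if "$@"; then
    echo "smoke ok: keeping $new_ami"
    return 0
  fi
  # Rollback is just another promote call with the previous known-good AMI.
  echo "smoke failed: re-promoting $prev_ami" >&2
  crabbox image promote "$prev_ami" || echo "rollback promote failed for $prev_ami" >&2
  return 1
}
```
The smoke command would typically be a small script or function that performs the warmup, run, and stop steps shown above and exits non-zero on failure.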
Related docs:
- [Image bake runbook](../features/image-bake-runbook.md)
- [Prebaked runner images](../features/prebaked-images.md)
- [Infrastructure](../infrastructure.md)
- [Runner bootstrap](../features/runner-bootstrap.md)


@ -1,27 +1,106 @@
# init
`crabbox init` onboards a repository for agent-first remote verification.
It writes the minimum config needed for `crabbox run` and sets up the
optional Actions hydration bridge and agent skill.
```sh
crabbox init
crabbox init --force
crabbox init --workflow .github/workflows/crabbox-test.yml
```
It writes:
## Files It Writes
- `.crabbox.yaml`
- `.github/workflows/crabbox.yml`
- `.agents/skills/crabbox/SKILL.md`
```text
.crabbox.yaml repo defaults (provider, profile, class, sync, env)
.github/workflows/crabbox.yml Actions hydration stub (optional)
.agents/skills/crabbox/SKILL.md agent-facing skill instructions
```
The generated workflow is intentionally conservative. It is a starting point for repo-specific hydration, not a full replacement for CI. Edit it to install dependencies, start service containers, and warm caches before agents begin repeated `crabbox run` calls.
By default `init` will not overwrite existing files. `--force` overrides
that and replaces them with freshly generated content.
The workflow contract is the same one used by `crabbox actions hydrate`: it accepts the Crabbox lease ID and dynamic runner label, runs on that self-hosted runner, writes a ready marker under `$HOME/.crabbox/actions`, and keeps the job alive for the remote command loop.
## `.crabbox.yaml`
Flags:
A starting template that includes:
- a default `profile` and `class`;
- `sync.exclude` covering common heavy directories;
- `env.allow` with conservative defaults (`CI`, `NODE_OPTIONS`,
`PROJECT_*`);
- `actions.workflow` pointing at the generated workflow stub;
- `cache` toggles for pnpm, npm, docker, and git.
Open the file after `init` and adjust it to match the repo:
- pick the right `class` for the workload;
- add repo-specific `sync.exclude` patterns;
- expand `env.allow` for project-specific tunables;
- pin `sync.baseRef` to the project's default branch.
See [Configuration](../features/configuration.md) for the full schema.
## `.github/workflows/crabbox.yml`
The generated workflow is intentionally conservative. It is a starting
point for repo-specific hydration, not a full replacement for CI. Edit it
to install dependencies, start service containers, and warm caches before
agents begin repeated `crabbox run` calls.
The workflow contract is the one used by `crabbox actions hydrate`:
- accepts the Crabbox lease ID and dynamic runner label;
- runs on the self-hosted runner registered by Crabbox;
- writes a ready marker under `$HOME/.crabbox/actions`;
- keeps the job alive so the local CLI can run repeated commands in the
hydrated workspace.
If the repo does not plan to use Actions hydration, you can delete the workflow.
`crabbox run` works fine without it - hydration is optional.
## `.agents/skills/crabbox/SKILL.md`
Repo-local agent instructions. The generated skill explains:
- when to use Crabbox vs running locally;
- how to acquire and reuse leases;
- which commands the agent should prefer (`warmup`, `run --id`, `stop`);
- what env vars the project allows;
- where to find repo-specific test commands.
Edit this file to match how you want agents to operate in the repo. The
skill is read by OpenClaw and similar agent runtimes that auto-discover
`.agents/skills/`.
## Flags
```text
--force overwrite generated files
--config <path> repo config path
--workflow <path> workflow path
--skill <path> agent skill path
--config <path> repo config path (default ./.crabbox.yaml)
--workflow <path> Actions workflow path (default .github/workflows/crabbox.yml)
--skill <path> agent skill path (default .agents/skills/crabbox/SKILL.md)
```
## Idempotency
`init` is safe to re-run. Without `--force`, it leaves existing files
alone and exits with a summary of what would be created. With `--force`,
it replaces files atomically.
## After Init
```sh
crabbox doctor # validate the config
crabbox sync-plan # preview what would sync
crabbox warmup # acquire a lease
crabbox run -- pnpm test # run a command
```
Related docs:
- [Configuration](../features/configuration.md)
- [Repository onboarding](../features/repository-onboarding.md)
- [Actions hydration](../features/actions-hydration.md)
- [Sync](../features/sync.md)
- [Getting started](../getting-started.md)


@ -1,18 +1,70 @@
# inspect
`crabbox inspect` prints detailed lease and provider metadata.
`crabbox inspect` prints detailed lease and provider metadata. Use it for
debugging coordinator state, provider labels, expiry, SSH target details,
and Tailscale metadata.
```sh
crabbox inspect --id blue-lobster
crabbox inspect --id blue-lobster --network tailscale
crabbox inspect --id blue-lobster --json
crabbox inspect --provider ssh --target windows --windows-mode wsl2 --static-host win-dev.local
```
Use this for debugging coordinator state, provider labels, expiry, and SSH target details.
## Output
Flags:
Human output prints lease state, provider, server type, public IP, work
root, owner, org, idle timeout, TTL, expiry, last touched, the resolved
SSH command for the selected network mode, and any Tailscale metadata the
lease carries.
```text
--id <lease-id-or-slug>
--provider hetzner|aws
--json
lease=cbx_abcdef123456 slug=blue-lobster
state=active provider=aws server=i-0abcdef0123456789 type=c7a.48xlarge
host=203.0.113.10 user=crabbox port=2222 work_root=/work/crabbox
owner=alex@example.com org=openclaw
idle_timeout=30m0s ttl=90m0s
created_at=2026-05-07T07:42:18Z last_touched=2026-05-07T07:55:12Z expires_at=2026-05-07T08:25:12Z
ssh: ssh -i ~/.config/crabbox/testboxes/cbx_abcdef123456/id_ed25519 -p 2222 crabbox@203.0.113.10
tailscale: state=ok ipv4=100.64.0.5 fqdn=blue-lobster.tail-scale.ts.net tags=tag:crabbox
```
JSON output returns the structured record, including non-secret Tailscale
metadata. Secrets (broker tokens, provider keys, VNC passwords) are never
included.
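For quick shell checks against the human output, a `key=value` token can be pulled out with standard tools. This sketch assumes the space-separated `key=value` layout shown above and values without spaces; `inspect_field` is a hypothetical helper, and it returns the first match, so `state` yields the lease state rather than the Tailscale state.
```sh
# Hypothetical helper, not a Crabbox command.
# usage: crabbox inspect --id <lease> | inspect_field expires_at
inspect_field() {
  # Split the output into one token per line, then print the value of the first key match.
  tr ' ' '\n' | awk -F= -v key="$1" '
    !found && $1 == key {
      print substr($0, length(key) + 2)   # everything after "key="
      found = 1
    }
  '
}
```
Automation that needs stable field access should prefer `--json`; this text-based form is only a convenience for interactive use.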
## Flags
```text
--id <lease-id-or-slug> lease to inspect; required for managed providers
--provider hetzner|aws|azure|ssh|daytona override the configured provider
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host> static SSH host for provider=ssh
--static-user <user> static SSH user override
--static-port <port> static SSH port override
--static-work-root <path> static target work root
--network auto|tailscale|public select which address inspect prints
--json print JSON
```
## Inspect vs Status vs List
- `inspect` is the long-form record for one lease, including provider
metadata, label state, and the resolved SSH command;
- `status` is the shorter "is this lease healthy right now" check, with
optional `--wait` and bounded telemetry;
- `list` is the table view across many leases, scoped by owner/org or
fleet-wide for admins.
Use `inspect` when something is unexpected and you want all the detail in
one place. Use `status` when an automation needs a quick liveness check.
Use `list` when you are looking for a specific lease across the pool.
Related docs:
- [status](status.md)
- [list](list.md)
- [ssh](ssh.md)
- [Identifiers](../features/identifiers.md)
- [Network and reachability](../features/network.md)


@ -5,14 +5,42 @@
```sh
crabbox list
crabbox list --provider aws
crabbox list --provider ssh --target macos --static-host mac-studio.local
crabbox list --provider blacksmith-testbox
crabbox list --provider daytona
crabbox list --provider islo
crabbox list --json
```
`crabbox pool list` remains as a compatibility alias.
In `provider=ssh` mode this prints the configured static target.
In `blacksmith-testbox` mode this reads `blacksmith testbox list` and renders the
same Crabbox list shape as other providers. `--json` keeps the compatibility
shape parsed from the Blacksmith table: id, status, repo, workflow, job, ref,
and created time when the upstream table exposes those columns.
When coordinator auth is configured, the same list command also refreshes
owner-scoped external runner rows in the portal lease table from the current
all-status Blacksmith list. Crabbox also attempts to infer the matching GitHub
Actions run/workflow from the row's repo, workflow, ref, and created time.
The portal shows that Actions status, tags long-queued or long-running workflow
owners as `stuck`, exposes a copyable local stop command, and links each row to
a visibility-only runner detail page. Missing runners from later syncs are
marked stale rather than treated as Crabbox leases.
In `daytona` and `islo` modes, rendering is core-owned: human output and `--json`
use the normalized Crabbox lease view.
Flags:
```text
--provider hetzner|aws
--provider hetzner|aws|azure|ssh|blacksmith-testbox|daytona|islo
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
--json
```


@ -1,10 +1,22 @@
# login
`crabbox login` stores broker credentials in the user config and verifies coordinator identity. It is currently token-based, not a GitHub browser OAuth flow.
`crabbox login` opens GitHub in the browser, waits for the coordinator callback, stores the returned broker token in the user config, and verifies identity with `GET /v1/whoami`.
```sh
crabbox login
```
If the browser cannot open automatically, print the URL and paste it manually:
```sh
crabbox login --no-browser
```
Trusted operator automation can still write the shared coordinator token over stdin:
```sh
printf '%s' "$CRABBOX_COORDINATOR_TOKEN" | crabbox login \
--url https://crabbox-coordinator.steipete.workers.dev \
--url https://crabbox.openclaw.ai \
--provider aws \
--token-stdin
```
@ -15,14 +27,34 @@ Flags:
```text
--url <url> broker URL
--provider hetzner|aws default provider to store with the broker
--token-stdin read broker token from stdin
--provider hetzner|aws|azure default provider to store with the broker
--no-browser print the GitHub login URL instead of opening it
--token-stdin read broker token from stdin for operator automation
--json print JSON
```
`login` calls `GET /v1/whoami` after writing config. If verification fails, inspect the stored config with `crabbox config show` and retry with the correct token.
The default broker URL is `https://crabbox.openclaw.ai`; pass `--url` for another coordinator. GitHub browser login issues a user-scoped Crabbox bearer token. `--token-stdin` stores the shared operator token and should stay limited to trusted maintainers.
The coordinator may still derive identity from Cloudflare Access or Git email headers, but the CLI does not yet open a browser or mint a GitHub-scoped user token.
## Self-hosted coordinators
The public `https://crabbox.openclaw.ai` coordinator uses OpenClaw-owned
GitHub OAuth credentials and only admits configured OpenClaw org or team
members. A separate organization or private deployment needs its own
Cloudflare Worker, Worker secrets, and GitHub OAuth app.
Configure that GitHub OAuth app with a callback URL that exactly matches the
coordinator public URL:
```text
https://<your-coordinator-host>/v1/auth/github/callback
```
Set the same public origin in `CRABBOX_PUBLIC_URL` on the Worker, then deploy
`CRABBOX_GITHUB_CLIENT_ID`, `CRABBOX_GITHUB_CLIENT_SECRET`,
`CRABBOX_SESSION_SECRET`, and the relevant `CRABBOX_GITHUB_ALLOWED_ORG(S)` or
`CRABBOX_GITHUB_ALLOWED_TEAMS` values. A GitHub `Invalid redirect_uri` error
means the URL generated by `crabbox login` does not match one of the callback
URLs configured on that OAuth app.
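As a rough illustration of the callback rule above, the helper below derives the callback URL from a coordinator's public origin. `callback_url` and the example host are hypothetical names used only for this sketch; only the `/v1/auth/github/callback` path comes from the text above.

```sh
# Hypothetical helper: derive the GitHub OAuth callback URL from the
# coordinator's public origin (the value you would set in CRABBOX_PUBLIC_URL).
callback_url() {
  origin="${1%/}"   # drop one trailing slash so the path is not doubled
  printf '%s/v1/auth/github/callback\n' "$origin"
}

# Example host is a placeholder, not a real coordinator.
callback_url "https://crabbox.example.com/"
```

Comparing this value with the callback configured on the OAuth app is a quick first check when GitHub reports `Invalid redirect_uri`.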
Related docs:

@ -7,9 +7,31 @@ crabbox logout
crabbox logout --json
```
The broker URL and provider are left in place so a later `crabbox login --token-stdin` can reuse them.
The broker URL and provider stay in place so a later `crabbox login` or
`crabbox login --token-stdin` can reuse them. Per-lease SSH keys, repo
claims, and history records are unaffected.
After logout:
- `crabbox whoami` exits with exit code 3 (`auth failure`);
- `crabbox run` and `crabbox warmup` against the coordinator fail with the
same code;
- direct-provider mode keeps working when local provider credentials
(AWS SDK, `HCLOUD_TOKEN`) are present, because direct mode does not need
the broker token.
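A script can branch on that exit code. The sketch below is one possible wrapper; `check_crabbox_auth` is a hypothetical name, and the only behavior taken from the text above is that `crabbox whoami` exits with code 3 on auth failure.

```sh
# Sketch: map the exit status of `crabbox whoami` to a short message.
# Exit code 3 is the documented auth failure; other codes are passed through.
check_crabbox_auth() {
  code=0
  crabbox whoami >/dev/null 2>&1 || code=$?
  case "$code" in
    0) echo "authenticated" ;;
    3) echo "auth failure: run 'crabbox login' again" ;;
    *) echo "unexpected exit code: $code" ;;
  esac
  return "$code"
}
```

Returning the original code keeps the wrapper usable in `if` conditions and CI steps.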
Use logout when:
- a token has leaked or you want to rotate it;
- you are switching the operator identity on a shared workstation;
- you are testing the unauthenticated path.
To clear everything (URL, provider, token, profile defaults), edit the user
config file directly. `crabbox config path` prints the location.
Related docs:
- [login](login.md)
- [whoami](whoami.md)
- [Auth and admin](../features/auth-admin.md)
- [Configuration](../features/configuration.md)

@ -1,18 +1,70 @@
# logs
`crabbox logs` prints the retained remote output tail for a recorded run.
`crabbox logs` prints the retained command output for a recorded run.
```sh
crabbox logs run_...
crabbox logs --id run_...
crabbox logs run_... --json
crabbox logs run_abcdef123456
crabbox logs --id run_abcdef123456
crabbox logs run_abcdef123456 --json
```
The plain form writes the log text to stdout. `--json` returns run metadata plus the log.
## What Gets Stored
Logs are bounded tails of remote stdout/stderr. They are for debugging recent runs, not unlimited archival.
When `crabbox run` runs against a coordinator, it streams remote stdout and
stderr to the local terminal *and* records a bounded copy on the
coordinator. The CLI keeps up to 8 MiB of capture per run; the coordinator
stores larger captures in chunks so a noisy parallel run does not exceed
Durable Object storage limits.
Output beyond the cap is truncated with an `output.truncated` marker on the
last event so the consumer knows the tail is missing.
## Output
The plain form writes the log text to stdout. `--json` returns run metadata
plus the log:
```json
{
"runId": "run_abcdef123456",
"leaseId": "cbx_abcdef123456",
"exitCode": 0,
"truncated": false,
"log": "..."
}
```
`--json` is stable enough for scripts that filter by exit code and want the
log text in one payload.
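For example, a script might print the log only for failed runs. This is a minimal sketch that assumes `jq` is installed and that the payload has the `exitCode` and `log` fields shown above; `print_log_if_failed` is a hypothetical helper name.

```sh
# Sketch: read the JSON from `crabbox logs <run-id> --json` on stdin and print
# the log text only when the run failed. Assumes jq is installed.
print_log_if_failed() {
  jq -r 'select(.exitCode != 0) | .log'
}

# Usage (the run id is a placeholder):
#   crabbox logs run_abcdef123456 --json | print_log_if_failed
```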
## Flags
```text
--id <run-id> run id (also accepted as a positional argument)
--json print JSON with metadata and log text
```
## When To Use Logs vs Events vs Attach
- `logs` returns the retained command output. Use it when you want the full
bounded transcript after the run finished.
- `events` returns ordered run events (lease, sync, command, output chunks,
finish). Use it when you need to know *what happened* and *when*.
- `attach` follows live events. Use it when the run is still active and you
want to watch it without re-attaching the original CLI.
Logs and events are independent surfaces - logs stay focused on command
output, events stay focused on lifecycle.
## Direct Mode
Direct-provider mode does not record runs centrally, so `crabbox logs` has
nothing to fetch. Use shell output or the local terminal log instead.
Related docs:
- [history](history.md)
- [events](events.md)
- [attach](attach.md)
- [results](results.md)
- [History and logs](../features/history-logs.md)

docs/commands/media.md Normal file
@ -0,0 +1,46 @@
# media
`crabbox media` creates lightweight review artifacts from recorded desktop
videos. It runs locally and does not need a lease.
## Preview
`crabbox media preview` converts an MP4 or other ffmpeg-readable video into a
small animated GIF that GitHub can render inline in comments and pull request
bodies.
```sh
crabbox media preview \
--input desktop.mp4 \
--output desktop-preview.gif \
--trimmed-video-output desktop-change.mp4
```
By default the preview is motion-focused:
- ffmpeg `freezedetect` finds leading and trailing static regions.
- Crabbox keeps a little padding around the first and last moving frame.
- The GIF is palette-optimized at 4 fps and 640 px wide.
- `--trimmed-video-output` writes an MP4 clip using the same motion window.
If no motion is detected, Crabbox keeps the full source video instead of
returning an empty preview.
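The defaults above can be approximated with plain ffmpeg filters. The sketch below is not Crabbox's actual invocation, which is not shown here; `gif_filter` is a hypothetical helper that builds a commonly used palette-based GIF filter graph.

```sh
# Hypothetical helper: build an ffmpeg filter graph for a palette-optimized GIF.
# $1 = frames per second, $2 = output width in pixels.
gif_filter() {
  printf 'fps=%s,scale=%s:-1:flags=lanczos,split[a][b];[a]palettegen[p];[b][p]paletteuse\n' "$1" "$2"
}

# Usage with the documented defaults (4 fps, 640 px wide):
#   ffmpeg -i desktop.mp4 -vf "$(gif_filter 4 640)" desktop-preview.gif
# Static regions can be listed separately with ffmpeg's freezedetect filter:
#   ffmpeg -i desktop.mp4 -vf "freezedetect=n=-50dB:d=0.5" -f null -
```

Generating the palette from the same frames it is applied to is what keeps the GIF small without heavy banding.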
Useful flags:
```text
--input <path>
--output <path>
--trimmed-video-output <path>
--width <px> default 640
--fps <n> default 4
--trim-static default true
--no-trim-static
--trim-padding <duration> default 750ms
--freeze-duration <dur> default 500ms
--freeze-noise <level> default -50dB
--min-duration <duration> default 1500ms
--json
```
`ffmpeg` and `ffprobe` must be on `PATH`.

@ -1,17 +1,22 @@
# results
`crabbox results` prints structured test summaries attached to a recorded run.
`crabbox results` prints structured test summaries attached to a recorded
run.
```sh
crabbox run --id cbx_... --junit junit.xml -- go test ./...
crabbox results run_...
crabbox results run_... --json
crabbox run --id cbx_abcdef123456 --junit junit.xml -- go test ./...
crabbox results run_abcdef123456
crabbox results run_abcdef123456 --json
```
Results are attached only when `crabbox run` is told where to find remote JUnit XML. Use either:
## When Results Are Attached
Results are attached only when `crabbox run` is told where to find remote
JUnit XML. Use either:
```sh
crabbox run --junit junit.xml -- <command...>
crabbox run --junit junit.xml,reports/junit.xml -- <command...>
```
or repo config:
@ -23,10 +28,76 @@ results:
- reports/junit.xml
```
Human output shows totals and failed test cases. JSON output returns the stored summary. Stored summaries keep aggregate counts but cap bulky failure details.
After the command exits, the CLI reads each remote file from the workdir,
parses JUnit, and sends only the summary to the coordinator. Raw XML is not
stored. Multiple JUnit files are merged into a single summary so a
multi-report test setup still produces one result record.
## Output
Human output shows totals and the names of failed test cases:
```text
run_abcdef123456 lease=cbx_abcdef123456 command="pnpm test"
totals: tests=412 failures=2 errors=0 skipped=4 time=42.318s
failures:
src/auth.test.ts > login → returns user
src/sync.test.ts > rsync → handles deletes
```
`--json` returns the stored structured summary:
```json
{
"runId": "run_abcdef123456",
"totals": { "tests": 412, "failures": 2, "errors": 0, "skipped": 4, "timeSeconds": 42.318 },
"failures": [
{ "suite": "src/auth.test.ts", "name": "login → returns user" },
{ "suite": "src/sync.test.ts", "name": "rsync → handles deletes" }
],
"files": [
{ "path": "junit.xml", "size": 12345 }
]
}
```
## Limits
The coordinator caps stored summaries:
- aggregate counters (tests, failures, errors, skipped) are kept verbatim;
- failed-case entries are capped to a bounded list;
- long strings (test names, suite names, message bodies) are truncated;
- file lists keep paths and sizes, never raw bytes.
This keeps the result record small enough for the lease detail page and
the run detail page to render without paging through gigabytes of XML.
## Flags
```text
--id <run-id> run id (also accepted as a positional argument)
--json print JSON
```
## When To Use Results vs Logs
- `results` is the structured summary - "did the suite pass, and which
cases failed?";
- `logs` is the retained command output - "what did the command print?".
Use `results` for dashboards and quick triage. Use `logs` when you need to
read the actual stack trace.
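For quick triage in a script, the stored failures can be flattened into one line per case. This sketch assumes `jq` is installed and that the JSON has the `failures[].suite` and `failures[].name` fields shown in the Output section; `list_failed_cases` is a hypothetical name.

```sh
# Sketch: read `crabbox results <run-id> --json` on stdin and print one
# "suite > name" line per stored failure. Assumes jq is installed.
list_failed_cases() {
  jq -r '.failures[]? | "\(.suite) > \(.name)"'
}

# Usage (the run id is a placeholder):
#   crabbox results run_abcdef123456 --json | list_failed_cases
```

Because failed-case entries are capped, an empty or short list here is not proof that every case passed; the aggregate `totals` remain the authoritative counts.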
## Future Formats
Today only JUnit XML is supported. Vitest JSON, Go `test2json`, and
flaky-test correlation across runs are tracked in
[Test results](../features/test-results.md).
Related docs:
- [run](run.md)
- [history](history.md)
- [logs](logs.md)
- [Test results](../features/test-results.md)

@ -5,27 +5,83 @@
```sh
crabbox run --id blue-lobster -- pnpm test:changed:max
crabbox run --class beast -- pnpm check
crabbox run --provider aws --class beast --market on-demand -- pnpm check
crabbox run --tailscale -- pnpm check
crabbox run --id blue-lobster --network tailscale -- pnpm test
crabbox run --browser -- google-chrome --headless --version
crabbox run --desktop --browser --shell 'echo "$DISPLAY"; "$BROWSER" --version'
crabbox run --id blue-lobster --shell 'pnpm install --frozen-lockfile && pnpm test'
crabbox run --id cbx_abcdef123456 --junit junit.xml -- go test ./...
crabbox run --provider blacksmith-testbox --blacksmith-workflow .github/workflows/ci-check-testbox.yml --blacksmith-job test -- pnpm test
crabbox run --provider daytona --daytona-snapshot crabbox-ready -- pnpm test
crabbox run --provider islo --islo-image docker.io/library/ubuntu:24.04 -- pnpm test
crabbox run --provider ssh --target macos --static-host mac-studio.local -- xcodebuild test
crabbox run --provider ssh --target windows --windows-mode normal --static-host win-dev.local -- dotnet test
crabbox run --provider ssh --target windows --windows-mode normal --static-host win-dev.local --shell 'Write-Output ("BROWSER=" + $env:BROWSER)'
crabbox run --provider ssh --target windows --windows-mode wsl2 --static-host win-dev.local -- pnpm test
```
If `--id` is omitted, Crabbox creates a fresh non-kept lease and releases it when the command exits. `--id` accepts the stable `cbx_...` ID or the active friendly slug.
With `--provider blacksmith-testbox`, `--id` accepts a Blacksmith `tbx_...` ID or a local Crabbox slug. Crabbox forwards the command to `blacksmith testbox run`, delegates sync to Blacksmith, and prints `sync=delegated` in the final timing summary.
With `--provider daytona`, `--id` accepts a Daytona-backed Crabbox `cbx_...` ID
or local slug. Crabbox uploads the sync archive through Daytona toolbox file
APIs, extracts it in the sandbox, and runs the command through Daytona toolbox
process APIs. The final timing summary reports `sync=delegated`.
With `--provider islo`, `--id` accepts an `isb_<crabbox-sandbox-name>` lease ID,
a Crabbox-created sandbox name, or a local Crabbox slug. Islo owns sandbox
workspace setup and command execution, so sync is delegated and the final timing
summary reports `sync=delegated`.
When the lease has been hydrated by `crabbox actions hydrate`, `run` reads the remote marker under `$HOME/.crabbox/actions`, syncs into the workflow's `$GITHUB_WORKSPACE`, and sources the non-secret env file written by the workflow. That preserves the setup the workflow performed: checkout path, installed dependencies, service containers, caches, runner temp/toolcache paths, and any project-specific preparation. GitHub secrets and OIDC request tokens remain workflow-step scoped unless the project explicitly persists its own short-lived credentials.
Sync uses `git ls-files --cached --others --exclude-standard` to build a file manifest, then feeds that manifest to rsync over SSH. That means tracked files plus nonignored untracked files sync, while `.git`, ignored local build output, dependency folders, and common caches stay out of the transfer. Crabbox records a local/remote sync fingerprint and skips rsync when the tracked commit plus manifest and dirty metadata have not changed. Use `--checksum` when you need a paranoid checksum scan, and `--debug` to print sync timing, progress, and itemized rsync output.
If a configured Actions hydration workflow exists and a package-manager command such as `pnpm`, `npm`, `node`, or `corepack` is run before a hydration marker exists, Crabbox warns that the raw box may not have the project runtime installed. Hydrate first for CI-like setup, or include the runtime setup explicitly in the command.
`--browser` provisions or requires a known browser binary and injects
`CRABBOX_BROWSER=1`, `BROWSER`, and `CHROME_BIN` into the remote command. It
does not imply `--desktop`; use it alone for headless browser automation.
Browser login/profile state is not managed by Crabbox; use a scenario-owned
profile directory or app-specific auth artifact when tests need a logged-in
browser.
`--desktop` provisions or requires a visible desktop/VNC session and injects
`CRABBOX_DESKTOP=1`; POSIX desktop targets also use `DISPLAY=:99`. It does not
imply a browser. Use `--desktop --browser` for headed browser automation in the
VNC-visible session.
`--tailscale` asks new managed Linux leases to join the configured tailnet.
`--network` selects how Crabbox resolves SSH for reused leases and for the final
connection after a new lease becomes ready. `auto` prefers Tailscale when
metadata exists and SSH is reachable, `tailscale` fails if the tailnet path is
not available, and `public` forces the provider host. See
[Tailscale](../features/tailscale.md).
Sync uses `git ls-files --cached --others --exclude-standard` to build a file manifest, then feeds that manifest to rsync over SSH. That means tracked files plus nonignored untracked files sync, while `.git`, ignored local build output, dependency folders, `.crabboxignore` patterns, `sync.exclude` patterns, and common caches stay out of the transfer. Crabbox records a local/remote sync fingerprint and skips rsync when the tracked commit plus manifest and dirty metadata have not changed. Use `--checksum` when you need a paranoid checksum scan, and `--debug` to print sync timing, progress, and itemized rsync output.
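The manifest step can be reproduced by hand to see roughly what would be sent. This is a minimal sketch of the Git part only; it does not apply `.crabboxignore`, `sync.exclude`, or Crabbox's default excludes, and `sync_manifest` is a hypothetical helper name.

```sh
# Sketch: print the candidate sync manifest for the Git checkout in $1
# (default: current directory), one path per line.
sync_manifest() {
  git -C "${1:-.}" ls-files --cached --others --exclude-standard
}

# Feeding the manifest to rsync could then look like this (host and paths are
# placeholders; paths with unusual characters would need `git ls-files -z`
# together with rsync's --from0):
#   sync_manifest . | rsync -a --files-from=- . user@host:/work/repo/
```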
For `provider=ssh`, `target=macos` and `target=windows windows.mode=wsl2`
use the same POSIX rsync flow. Native Windows mode uses PowerShell over OpenSSH
and sends the manifest as a tar archive into `static.workRoot`; cache purge and
GitHub Actions runner registration remain Linux-only.
On native Windows, plain argv is best for one executable such as `dotnet test`.
Use `--shell` for multi-statement PowerShell snippets, env inspection, or
commands that need PowerShell expression syntax.
Before rsync starts, Crabbox prints the candidate file count and byte estimate. Large syncs warn or fail according to `sync.warnFiles`, `sync.warnBytes`, `sync.failFiles`, and `sync.failBytes`; use `--force-sync-large` or `sync.allowLarge: true` only when the transfer size is intentional. Quiet rsync runs print a heartbeat, and `sync.timeout` kills stalled syncs.
At the end of every command, `run` prints a one-line summary with sync duration, command duration, total duration, whether sync was skipped by fingerprint, and the remote exit code.
Before the first rsync into a Git checkout, Crabbox tries to seed the remote worktree from the local `origin` remote so the first sync is a dirty-tree overlay instead of a full source upload. Project-specific excludes, env forwarding, and base ref belong in `crabbox.yaml` or `.crabbox.yaml`.
Use `--timing-json` to emit a final JSON timing record with provider, lease ID, sync phases, command duration, total duration, exit code, and Actions run URL when available. In `blacksmith-testbox` mode, sync is reported as delegated in the same schema.
Before the first rsync into a Git checkout, Crabbox tries to seed the remote worktree from the local `origin` remote so the first sync is a dirty-tree overlay instead of a full source upload. Project-specific excludes can live in `.crabboxignore` or `sync.exclude` in `crabbox.yaml` / `.crabbox.yaml`; env forwarding and base ref belong in config.
After sync, Crabbox runs a remote sanity check. If the remote checkout reports at least 200 tracked deletions, Crabbox fails before running tests unless local `CRABBOX_ALLOW_MASS_DELETIONS=1` is set.
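A comparable guard can be sketched locally. How Crabbox counts deletions on the remote is not described here, so using `git ls-files --deleted` is an assumption of this example, and `check_mass_deletions` is a hypothetical name; only the threshold of 200 and the `CRABBOX_ALLOW_MASS_DELETIONS=1` override come from the text above.

```sh
# Sketch: fail when the checkout in $1 has at least $2 tracked files missing
# from the working tree, unless CRABBOX_ALLOW_MASS_DELETIONS=1 is set.
check_mass_deletions() {
  dir="${1:-.}"
  limit="${2:-200}"
  deleted=$(git -C "$dir" ls-files --deleted | wc -l | tr -d ' ')
  if [ "$deleted" -ge "$limit" ] && [ "${CRABBOX_ALLOW_MASS_DELETIONS:-0}" != "1" ]; then
    echo "refusing to continue: $deleted tracked deletions (limit $limit)" >&2
    return 1
  fi
  echo "tracked deletions: $deleted"
}
```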
When a coordinator is configured, Crabbox records each remote command as a run history item. `crabbox history` lists those records and `crabbox logs <run-id>` prints the retained remote output tail. Log retention is intentionally bounded so a noisy command cannot fill Durable Object storage.
When a coordinator is configured, Crabbox records each remote command as a run history item. `crabbox history` lists those records and `crabbox logs <run-id>` prints retained remote output. Log retention is intentionally bounded so a noisy command cannot fill Durable Object storage.
Add `--junit <path>` or configure `results.junit` to attach JUnit XML summaries to the run record. `crabbox results <run-id>` then prints failed tests without reading the raw log tail.
Add `--junit <path>` or configure `results.junit` to attach JUnit XML summaries to the run record. `crabbox results <run-id>` then prints failed tests without reading the raw log.
Use `crabbox sync-plan` to inspect the same local manifest without leasing a box when a sync estimate looks unexpectedly large.
@ -33,12 +89,29 @@ Flags:
```text
--id <lease-id-or-slug>
--provider hetzner|aws
--provider hetzner|aws|azure|ssh|blacksmith-testbox|daytona|islo
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
--profile <name>
--class <name>
--type <provider-type>
--market spot|on-demand
--ttl <duration>
--idle-timeout <duration>
--desktop
--browser
--code
--tailscale
--tailscale-tags <comma-separated tags>
--tailscale-hostname-template <template>
--tailscale-auth-key-env <env-var>
--tailscale-exit-node <name-or-100.x>
--tailscale-exit-node-allow-lan-access
--network auto|tailscale|public
--keep
--no-sync
--sync-only
@ -48,7 +121,22 @@ Flags:
--debug
--junit <comma-separated remote XML paths>
--reclaim
--timing-json
--blacksmith-org <org>
--blacksmith-workflow <file|name|id>
--blacksmith-job <job>
--blacksmith-ref <ref>
```
`--idle-timeout` controls inactivity expiry, default `30m`. `--ttl` remains the maximum wall-clock lifetime, default `90m`.
Crabbox records a local repo claim for each reused lease. If a lease is already claimed by another repo, use `--reclaim` to move the claim intentionally.
`--code` provisions or requires a Linux lease with code-server installed. Use
`crabbox code --id <lease>` to expose the editor through the authenticated
portal.
For AWS one-shot leases, `--market` overrides `capacity.market` for this run.
Explicit `--type` keeps exact-type semantics; Crabbox reports why that type
failed rather than falling back to a different size.
Blacksmith Testbox mode does not support `--sync-only`; Blacksmith owns its own sync behavior.

@ -0,0 +1,58 @@
# screenshot
`crabbox screenshot` captures a PNG from a desktop lease without opening a VNC
client.
```sh
crabbox warmup --desktop
crabbox screenshot --id blue-lobster
crabbox screenshot --id blue-lobster --network tailscale
crabbox screenshot --id blue-lobster --output desktop.png
```
The command resolves and touches the lease like `crabbox ssh`, verifies that the
lease has `desktop=true`, waits for the loopback desktop/VNC service, then
streams a PNG over SSH. Linux captures `DISPLAY=:99`. Windows creates a
one-shot scheduled task inside the logged-in `crabbox` console session, because
non-interactive SSH sessions cannot capture the visible desktop. macOS uses
`screencapture`.
For Windows, the screenshot reflects the active console session in the
Crabbox-created instance. Managed AWS Windows desktop leases enable auto-logon
for the generated `crabbox` user, store that password under
`C:\ProgramData\crabbox`, and use it only on the instance to run the scheduled
capture task.
If `--output` is omitted, Crabbox writes:
```text
crabbox-<slug-or-id>-screenshot.png
```
Static macOS and Windows targets are existing host machines, not Crabbox-created
desktops, so `screenshot` rejects those targets instead of capturing your local
or home-host desktop by accident. Managed AWS Windows and AWS macOS desktop
leases are Crabbox-created boxes and can be captured by lease id or slug.
Flags:
```text
--id <lease-id-or-slug>
--provider hetzner|aws|azure|ssh|daytona
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
--network auto|tailscale|public
--output <path>
--reclaim
```
Related docs:
- [Interactive desktop and VNC](../features/interactive-desktop-vnc.md)
- [Linux VNC](../features/vnc-linux.md)
- [Windows VNC](../features/vnc-windows.md)
- [macOS VNC](../features/vnc-macos.md)

docs/commands/share.md Normal file
@ -0,0 +1,43 @@
# share
`crabbox share` grants access to an existing coordinator lease.
```sh
crabbox share --id blue-lobster --user friend@example.com
crabbox share --id blue-lobster --user friend@example.com --role manage
crabbox share --id blue-lobster --org
crabbox share --id blue-lobster --org --role manage
crabbox share --id blue-lobster --list
crabbox share blue-lobster --list --json
```
Roles:
```text
use see the lease and use visible portal bridges such as WebVNC/code
manage use access plus changing sharing and stopping the lease
```
`--org` shares with authenticated users whose org matches the lease org.
`--user` is repeatable and stores normalized lowercase email addresses.
SSH-based commands still require a local private key accepted by the runner.
Sharing grants coordinator and portal access; it does not copy SSH private keys
between people.
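Because `--user` is repeatable, a small wrapper can share one lease with several people in a single call. This is a sketch using only the flags listed on this page; `share_with_users` is a hypothetical helper name and the addresses are placeholders.

```sh
# Sketch: share a lease with several users by repeating --user.
# $1 = lease id or slug, remaining arguments = email addresses.
share_with_users() {
  lease="$1"
  shift
  # Rewrite "$@" from "a b" into "--user a --user b".
  for email in "$@"; do
    set -- "$@" --user "$email"
    shift
  done
  crabbox share --id "$lease" "$@"
}

# Usage:
#   share_with_users blue-lobster friend@example.com teammate@example.com
```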
Flags:
```text
--id <lease-id-or-slug>
--user <email>
--org
--role use|manage
--list
--json
```
Related docs:
- [unshare](unshare.md)
- [Auth and admin](../features/auth-admin.md)
- [Browser portal](../features/portal.md)

@ -4,16 +4,31 @@
```sh
crabbox ssh --id blue-lobster
crabbox ssh --id blue-lobster --network tailscale
crabbox ssh --provider daytona --id blue-lobster
crabbox ssh --provider ssh --target macos --static-host mac-studio.local
```
The output includes the per-lease private key path when Crabbox created one. Printing an SSH command touches coordinator leases because it signals intended manual use.
The output includes the per-lease private key path when Crabbox created one. Printing an SSH command touches coordinator leases because it signals intended manual use. In `provider=ssh` mode it resolves the configured static target. In `provider=daytona` mode the short-lived SSH token is redacted by default; pass `--show-secret` only when you need a pasteable command in a trusted terminal.
Flags:
```text
--id <lease-id-or-slug>
--provider hetzner|aws
--provider hetzner|aws|azure|ssh|daytona
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
--network auto|tailscale|public
--reclaim
--show-secret
```
`ssh` touches the lease and validates the local repo claim. Use `--reclaim` when intentionally taking over a lease from another repo.
`--network auto` prefers the tailnet host when the lease has Tailscale metadata
and this client can reach it. `--network tailscale` requires that path.
`--network public` forces the provider host.

@ -4,18 +4,45 @@
```sh
crabbox status --id blue-lobster
crabbox status --id blue-lobster --network tailscale
crabbox status --id blue-lobster --wait --wait-timeout 10m
crabbox status --id blue-lobster --json
crabbox status --provider daytona --id blue-lobster
crabbox status --provider islo --id blue-lobster
crabbox status --provider ssh --target macos --static-host mac-studio.local
```
`--id` accepts the canonical `cbx_...` ID or active slug. Plain status is read-only; `--wait` touches the lease while waiting.
`--id` accepts the canonical `cbx_...` ID or active slug. In
`blacksmith-testbox` mode it accepts a `tbx_...` ID or local slug and derives a
normalized Crabbox status view from `blacksmith testbox list --all`. In
`daytona` mode it resolves Crabbox labels and sandbox state through Daytona
APIs. In `islo` mode it accepts an `isb_...` ID, Crabbox-created sandbox name,
or local slug and renders SDK status through the core status view. In
`provider=ssh` mode `--id` is optional and resolves the configured static target
or local claim. Plain status is read-only; `--wait` touches the lease while
waiting for Crabbox brokered leases.
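A script can combine `--wait` with a follow-up command. The sketch below assumes that `crabbox status --wait` exits non-zero when the lease does not become ready in time, which this page does not state explicitly; `wait_then_ssh` is a hypothetical helper name.

```sh
# Sketch: wait for a lease to become ready, then print its SSH command.
# Assumes `crabbox status --wait` exits non-zero when the wait fails.
wait_then_ssh() {
  lease="$1"
  wait_for="${2:-10m}"
  if crabbox status --id "$lease" --wait --wait-timeout "$wait_for" >/dev/null; then
    crabbox ssh --id "$lease"
  else
    echo "lease $lease was not ready within $wait_for" >&2
    return 1
  fi
}
```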
Flags:
```text
--id <lease-id-or-slug>
--provider hetzner|aws
--provider hetzner|aws|azure|ssh|blacksmith-testbox|daytona|islo
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
--network auto|tailscale|public
--wait
--wait-timeout <duration>
--json
```
Human and JSON output include the selected network. With Tailscale metadata,
status also prints the tailnet host/state. For coordinator-backed Linux leases
that have received a recent heartbeat, status also includes the latest
best-effort telemetry snapshot: load, memory, disk, uptime, and capture age.
JSON status includes `telemetryHistory` when the coordinator has retained recent
samples for portal trend charts. The retained history is bounded to the latest
60 samples per lease.

@ -4,13 +4,22 @@
```sh
crabbox stop blue-lobster
crabbox stop --provider daytona blue-lobster
crabbox stop --provider islo blue-lobster
crabbox stop --provider ssh --static-host mac-studio.local mac-studio.local
```
`crabbox release` remains as a compatibility alias.
The argument accepts the stable `cbx_...` ID or an active friendly slug.
The argument accepts the stable `cbx_...` ID or an active friendly slug. In `blacksmith-testbox` mode it accepts a `tbx_...` ID or local slug and forwards to `blacksmith testbox stop`. In `daytona` mode it deletes the Daytona sandbox. In `islo` mode it accepts an `isb_...` ID, Crabbox-created sandbox name, or local slug and deletes the Islo sandbox. In `provider=ssh` mode it removes the local claim for the configured static target; it never deletes the host.
Flags:
```text
--provider hetzner|aws
--provider hetzner|aws|azure|ssh|blacksmith-testbox|daytona|islo
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
```

@ -1,22 +1,80 @@
# sync-plan
`crabbox sync-plan` prints the local sync manifest without leasing a box.
Use it to preview what `crabbox run` would send before paying for a cold
sync, or after editing `.crabboxignore` to confirm artifacts dropped out
of the manifest.
```sh
crabbox sync-plan
crabbox sync-plan --limit 10
crabbox sync-plan --limit 25 --json
```
It uses the same Git file-list manifest and excludes as `crabbox run`, then prints:
## What It Reads
`sync-plan` uses the same Git file-list manifest, `.crabboxignore`, and
`sync.exclude` rules as `crabbox run`:
- tracked files from `git ls-files --cached`;
- nonignored untracked files from
`git ls-files --others --exclude-standard`;
- root `.crabboxignore` patterns;
- repo-local `sync.exclude` patterns;
- Crabbox's default cache/build excludes.
It does not require a lease, does not call the broker, and does not call
any provider API.
## Output
Default output prints:
- candidate file count and total bytes;
- tracked deletes that would be applied remotely;
- largest files;
- largest first or second-level directories.
- the largest files;
- the largest first- or second-level directories.
Use it before a cold sync when the preflight estimate looks too large.
```text
files: 1843
bytes: 312.5MiB
tracked deletes: 0
largest files:
84.5MiB assets/demo.mp4
12.4MiB fixtures/sample-data.json
...
largest directories:
140.2MiB assets
80.1MiB fixtures
...
```
## Flags
```text
--limit <n> show this many files and directories in each top list (default 5)
--json print structured JSON output
```
`--limit 0` shows the full lists (use sparingly; large repos produce big
output).
## Use Cases
- preview a first sync before warming a beast-class lease;
- find sneaky directories that grew (`.cache/`, `dist/`, generated assets);
- audit `.crabboxignore` after adding new excludes;
- compare repo footprint over time as part of repo health checks.
The numbers `sync-plan` prints are upper bounds; rsync's actual transfer
size depends on what is already on the remote runner. A repeat sync after a
warmup is usually much smaller: rsync ships only changed bytes, and Crabbox
skips rsync entirely when the sync fingerprint is unchanged.
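For a simple preflight gate, the file count can be pulled out of the plain output. This sketch relies on the `files:` line from the example output above, which may not be a stable interface (the `--json` form is likely the better long-term choice, but its schema is not shown here); `sync_plan_files` is a hypothetical helper name.

```sh
# Sketch: read plain `crabbox sync-plan` output on stdin and print the value
# of the "files:" line.
sync_plan_files() {
  awk -F': *' '$1 == "files" { print $2; exit }'
}

# Usage: warn before a cold sync when the manifest is larger than expected.
#   files=$(crabbox sync-plan | sync_plan_files)
#   [ "$files" -gt 5000 ] && echo "large sync: $files files" >&2
```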
Related docs:
- [run](run.md)
- [Sync](../features/sync.md)
- [Configuration](../features/configuration.md)

docs/commands/unshare.md Normal file
@ -0,0 +1,30 @@
# unshare
`crabbox unshare` removes sharing from an existing coordinator lease.
```sh
crabbox unshare --id blue-lobster --user friend@example.com
crabbox unshare --id blue-lobster --org
crabbox unshare --id blue-lobster --all
crabbox unshare blue-lobster --all --json
```
Use `--user` to remove individual users, `--org` to remove org-wide access, or
`--all` to clear every sharing rule. Only the lease owner, a user with a
`manage` share, or an admin session can change sharing.
Flags:
```text
--id <lease-id-or-slug>
--user <email>
--org
--all
--json
```
Related docs:
- [share](share.md)
- [Auth and admin](../features/auth-admin.md)
- [Browser portal](../features/portal.md)

@ -13,7 +13,9 @@ crabbox usage --scope all --json
Usage requires a configured coordinator. Direct-provider mode has no central history to query.
Lease ownership comes from Cloudflare Access when available. In bearer-token mode, the CLI sends `CRABBOX_OWNER`, Git email env, or local `git config user.email`; set `CRABBOX_ORG` to group leases under an org.
Lease ownership comes from the signed GitHub login token for normal users. In shared bearer-token mode, the CLI sends `CRABBOX_OWNER`, Git email env, or local `git config user.email`; set `CRABBOX_ORG` to group leases under an org. Raw Cloudflare Access identity headers are ignored; only a verified Access JWT email can become the bearer-token owner.
GitHub browser-login users see their own owner/org usage regardless of requested `--scope`, `--user`, or `--org`. Fleet-wide `--scope org` and `--scope all` views require admin-token auth.
## Scopes

docs/commands/vnc.md Normal file
@ -0,0 +1,259 @@
# vnc
`crabbox vnc` prints connection details for a desktop-capable Crabbox target.
For Crabbox-created desktop leases, it gives you an SSH tunnel, a local VNC
endpoint, and the generated per-lease password. For static SSH targets, it can
describe an existing host-managed VNC service, but it will not pretend that
host is a Crabbox-created box.
Use this command when you need to look at or manually drive the visible desktop
inside a lease:
```sh
crabbox warmup --desktop
crabbox vnc --id blue-lobster
crabbox vnc --id blue-lobster --network tailscale
crabbox vnc --id blue-lobster --open
```
Managed AWS Windows and EC2 Mac desktop leases use the same command:
```sh
crabbox warmup --provider aws --target windows --desktop
crabbox vnc --id crimson-crab
CRABBOX_AWS_MAC_HOST_ID=h-... \
crabbox warmup --provider aws --target macos --desktop --market on-demand
crabbox vnc --id silver-squid
```
Static hosts are explicit and host-managed:
```sh
crabbox vnc --provider ssh --target macos --static-host mac-studio.local
crabbox vnc --provider ssh --target windows --static-host win-dev.local
```
## Output
A managed Linux lease prints:
```text
lease: cbx_... slug=blue-lobster provider=aws target=linux
managed: true
display: :99
ssh tunnel:
ssh -i ... -p 2222 -N -L 5901:127.0.0.1:5900 crabbox@203.0.113.10
vnc:
localhost:5901
password: ...
Keep the tunnel process running while connected.
```
Run the printed `ssh -N -L ...` tunnel in another terminal, then connect your
VNC client to the printed `localhost:<port>` endpoint. The tunnel forwards your
local port to `127.0.0.1:5900` on the remote box.
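If a second terminal is inconvenient, the tunnel can be run in the background for the duration of one command. This is a rough sketch, not something Crabbox provides: `with_tunnel` is a hypothetical helper, it takes the printed tunnel command as a single string, and it does not wait for the tunnel to be ready before starting the command.

```sh
# Sketch: run a tunnel command in the background, run another command
# (for example a VNC viewer), then stop the tunnel.
# $1 = tunnel command as one string, remaining arguments = command to run.
with_tunnel() {
  tunnel_cmd="$1"
  shift
  sh -c "exec $tunnel_cmd" &
  tunnel_pid=$!
  rc=0
  "$@" || rc=$?
  kill "$tunnel_pid" 2>/dev/null || true
  wait "$tunnel_pid" 2>/dev/null || true
  return "$rc"
}

# Usage: paste the tunnel command printed by `crabbox vnc` as the first
# argument, followed by the VNC client command you want to run.
```

Only pass a tunnel string you trust, since it is handed to `sh -c` unchanged.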
Use `--open` when you want Crabbox to start the tunnel and open the local VNC
URL for you:
```sh
crabbox vnc --id blue-lobster --open
```
Use `crabbox webvnc --id <lease> --open` when you want the same desktop inside
the authenticated coordinator portal instead of a native VNC client. WebVNC
still uses a local SSH tunnel and does not expose the runner's VNC port.
When `--network tailscale` is selected, only the SSH endpoint changes. Managed
VNC remains loopback-bound on the runner and is still reached through the SSH
tunnel.
Keep the tunnel process alive while you are connected.
## Credentials
Managed desktop leases use generated per-lease credentials. The password is
stored only on the instance and is retrieved over SSH when `crabbox vnc` runs.
Crabbox does not store it in provider tags, labels, or run history.
Password locations:
| Target | Password file |
| --- | --- |
| Linux | `/var/lib/crabbox/vnc.password` |
| Windows | `C:\ProgramData\crabbox\vnc.password` |
| macOS | `/var/db/crabbox/vnc.password` |
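The table above maps directly onto a lookup. The following is a minimal sketch for scripts that need the path per target; the function name `vnc_password_path` is hypothetical and not part of the Crabbox CLI.

```sh
# Hypothetical lookup mirroring the password-location table above.
vnc_password_path() {
  case "$1" in
    linux)   printf '%s\n' '/var/lib/crabbox/vnc.password' ;;
    windows) printf '%s\n' 'C:\ProgramData\crabbox\vnc.password' ;;
    macos)   printf '%s\n' '/var/db/crabbox/vnc.password' ;;
    *)       printf 'unknown target: %s\n' "$1" >&2; return 1 ;;
  esac
}
```

In normal use you should not need to read these files yourself, because `crabbox vnc` retrieves the password over SSH.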
Managed AWS Windows leases also print the generated Windows console login:
```text
password: Cb1!...
windows username: crabbox
windows password: Cb1!...
```
That login belongs to the Crabbox-created Windows instance, not your local
machine. Windows desktop bootstrap creates a local `crabbox` administrator,
configures auto-logon for that user, installs TightVNC, and keeps VNC reachable
only through the SSH tunnel.
Managed AWS macOS leases print the EC2 macOS account login:
```text
password: ...
macos username: ec2-user
macos password: ...
```
That password is generated per lease and set on the EC2 Mac account during
bootstrap.
Static macOS and Windows hosts are different. Their VNC or Screen Sharing
credentials are host-managed, because those targets are existing machines.
Crabbox does not synthesize or print those passwords.
## Managed Vs Static
Managed means Crabbox created the box and owns the desktop setup:
- cloud instance lifecycle;
- SSH key and connection metadata;
- desktop or VNC service setup;
- generated per-lease password;
- `desktop=true` lease capability;
- tunnel-only access.
Static means Crabbox is pointing at an existing SSH host:
- the host already exists;
- the operator owns VNC setup and credentials;
- the host may be your local LAN, Tailscale, or another durable machine;
- opening VNC can show that host's OS login prompt.
`--open` refuses host-managed static VNC unless you pass `--host-managed`.
That guard prevents a local Mac or durable Windows host prompt from being
mistaken for a Crabbox-created cloud box.
```sh
crabbox vnc --provider ssh --target macos --static-host mac-studio.local --host-managed --open
```
Only use `--host-managed` when you intentionally want to open the existing
host's VNC or Screen Sharing prompt.
## Provider Support
| Provider / target | Managed VNC | Notes |
| --- | --- | --- |
| Hetzner Linux | Yes | Requires `--desktop`; installs slim XFCE, Xvfb, x11vnc, and capture tools. |
| AWS Linux | Yes | Requires `--desktop`; same Linux desktop profile. |
| AWS Windows | Yes | Requires `--target windows --desktop`; installs Git for Windows and TightVNC after EC2Launch enables OpenSSH. Spot or On-Demand follows the AWS capacity config. |
| AWS macOS | Yes | Requires `--target macos --desktop --market on-demand` plus `CRABBOX_AWS_MAC_HOST_ID` or `aws.macHostId`. |
| Static Linux | Host-managed | Requires an existing loopback VNC service on the host. |
| Static macOS | Host-managed | Uses existing Screen Sharing or VNC. |
| Static Windows | Host-managed | Uses an existing VNC server. |
| Blacksmith Testbox | No | Blacksmith owns machine connectivity today. |
AWS EC2 Mac has an important cost and lifecycle constraint: Mac instances run on
allocated EC2 Mac Dedicated Hosts, are On-Demand only, and the Dedicated Host
has a 24-hour minimum allocation period. Crabbox launches onto a host id you
provide; it does not allocate or scrub EC2 Mac hosts for you.
## Security Model
Crabbox VNC is tunnel-first:
- managed VNC binds to `127.0.0.1:5900` on the remote box;
- the cloud security group does not open public VNC ingress;
- the local machine connects through SSH port forwarding;
- the normal lease TTL and idle-timeout lifecycle still apply;
- generated passwords are retrieved only on demand over SSH.
For static hosts, direct `host:5900` VNC is allowed only when that endpoint is
already reachable. Treat direct static VNC as operator-managed and keep it on a
trusted network such as Tailscale or a private LAN.
## Screenshots
Use `crabbox screenshot` when you need a PNG but do not need to open a VNC
client:
```sh
crabbox screenshot --id blue-lobster --output desktop.png
```
Screenshots share the same managed desktop boundary as VNC. Static macOS and
Windows hosts are rejected so Crabbox does not accidentally capture your local
or home-host desktop.
Windows screenshots run a one-shot scheduled task inside the logged-in
`crabbox` console session. Non-interactive SSH sessions cannot reliably capture
the visible Windows desktop.
## Troubleshooting
`lease ... was not created with desktop=true`
Warm a new lease with `--desktop`. Existing non-desktop leases do not gain a
desktop after creation:
```sh
crabbox warmup --desktop
```
`target does not expose VNC on 127.0.0.1:5900`
The SSH connection works, but the desktop or VNC service is not listening on
remote loopback. On managed boxes, inspect bootstrap logs or warm a fresh lease.
On static hosts, start or configure the host's VNC service.
VNC opens an OS credential prompt
Check `managed:` in the output. If it says `managed: false`, you opened a
static host. Static host credentials belong to that host. For Crabbox-created
Windows or macOS, use the generated username/password printed by `crabbox vnc`.
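A script can make the same `managed:` check before opening a viewer. This sketch assumes the output contains a `managed: true` or `managed: false` line as in the sample output earlier; the helper name `is_managed_lease` is hypothetical.

```sh
# Hypothetical check: read `crabbox vnc` output on stdin and succeed only
# when it reports a Crabbox-managed lease.
is_managed_lease() {
  # Tolerate leading or trailing whitespace around the field.
  grep -Eq '^[[:space:]]*managed:[[:space:]]*true[[:space:]]*$'
}
```

For example, `crabbox vnc --id blue-lobster | is_managed_lease` would return non-zero for a static host.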
Tunnel command uses port `22` instead of `2222`
That is expected on AWS Windows. EC2Launch enables the first OpenSSH foothold on
port `22`, and Crabbox records the working SSH port after probing fallbacks.
Windows screenshot is black or fails from raw SSH
Use `crabbox screenshot`, not an ad hoc PowerShell `CopyFromScreen` over SSH.
The command captures from the logged-in console session using a scheduled task.
macOS launch fails with missing host id
Set `CRABBOX_AWS_MAC_HOST_ID` or `aws.macHostId`, use `--market on-demand`, and
make sure the Dedicated Host is allocated in the selected AWS region.
## Flags
```text
--id <lease-id-or-slug>
--provider hetzner|aws|azure|ssh|daytona
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
--network auto|tailscale|public
--local-port <port>
--open
--host-managed
--reclaim
```
Related docs:
- [screenshot](screenshot.md)
- [warmup](warmup.md)
- [Interactive desktop and VNC](../features/interactive-desktop-vnc.md)
- [Linux VNC](../features/vnc-linux.md)
- [Windows VNC](../features/vnc-windows.md)
- [macOS VNC](../features/vnc-macos.md)
- [Tailscale](../features/tailscale.md)
- [Providers](../features/providers.md)


@ -4,32 +4,145 @@
```sh
crabbox warmup --class beast
crabbox warmup --provider aws --class beast --market on-demand
crabbox warmup --browser
crabbox warmup --tailscale
crabbox warmup --desktop --browser
crabbox warmup --provider aws --target windows --desktop
crabbox warmup --provider azure --target windows
crabbox warmup --provider aws --target macos --desktop --market on-demand --type mac2.metal
crabbox warmup --actions-runner
crabbox warmup --provider blacksmith-testbox --blacksmith-workflow .github/workflows/ci-check-testbox.yml --blacksmith-job test
crabbox warmup --provider daytona --daytona-snapshot crabbox-ready
crabbox warmup --provider islo --islo-image docker.io/library/ubuntu:24.04
crabbox warmup --provider ssh --target macos --static-host mac-studio.local
crabbox warmup --provider ssh --target windows --windows-mode normal --static-host win-dev.local --static-work-root 'C:\crabbox' --browser
```
The command returns a stable `cbx_...` lease ID and a friendly slug. Reuse either for subsequent `run`, `status`, `ssh`, `inspect`, and `stop` commands; scripts should keep using the canonical ID.
With `--provider blacksmith-testbox`, the canonical ID is the Blacksmith `tbx_...` ID returned by `blacksmith testbox warmup`; Crabbox still assigns and stores a local slug for reuse.
With `--provider daytona`, the canonical ID is a Crabbox `cbx_...` lease backed
by a Daytona sandbox created from `daytona.snapshot`. `run` uses Daytona
SDK/toolbox APIs; `ssh` mints short-lived Daytona SSH access tokens and redacts
them from output.
With `--provider islo`, the canonical ID is
`isb_<crabbox-sandbox-name>`. Crabbox stores a local slug, but Islo owns sandbox
setup and command execution.
With `--provider ssh`, warmup claims an existing static SSH host instead of
creating cloud capacity. Use `--target macos`, `--target windows
--windows-mode normal`, or `--target windows --windows-mode wsl2` to select the
remote command/sync contract. Native Windows static hosts must already have
OpenSSH Server reachable, PowerShell, Git, `tar`, and a writable
`static.workRoot`. Restart `sshd` after installing Git so new SSH sessions see
the updated PATH.
With `--provider hetzner`, managed provisioning supports Linux only. Hetzner can
run Windows through ISO/snapshot installation flows, but Crabbox does not manage
that path today. Use `--provider aws --target windows` for managed Windows
desktop or WSL2, `--provider azure --target windows` for native Windows
SSH/sync/run, or `--provider ssh --target windows` for an existing Hetzner
Windows host.
With `--provider aws --target windows --windows-mode normal --desktop`, Crabbox
creates a real AWS Windows Server lease. EC2Launch user data installs OpenSSH
Server, Git for Windows, TightVNC Server, a per-lease local administrator named
`crabbox`, and a loopback VNC password retrievable through
`crabbox vnc --id <lease>`.
With `--provider aws --target windows --windows-mode wsl2`, Crabbox still
creates a Windows Server host, then enables WSL, VirtualMachinePlatform, and
HypervisorPlatform, reboots as needed, updates the WSL kernel from the web,
imports an Ubuntu rootfs, and prepares the Linux-side `crabbox-ready` toolchain.
The AWS launch enables nested virtualization and uses C8i, M8i, or R8i instance
families for this mode. Commands and sync then use the POSIX WSL contract.
With `--provider azure --target windows`, Crabbox creates a native Windows
Server lease, uses the Azure VM Agent Custom Script Extension to install
OpenSSH Server and Git for Windows, and configures the `crabbox` user for
SSH/sync/run. Azure Windows does not provision VNC/browser/WSL2.
With `--provider aws --target macos --desktop`, Crabbox launches an EC2 Mac
instance on an already allocated Dedicated Host. Set `CRABBOX_AWS_MAC_HOST_ID`
or `aws.macHostId`, use `--market on-demand`, and expect EC2 Mac host lifecycle
rules to dominate cleanup and cost. The default SSH user is `ec2-user`; the VNC
password printed by `crabbox vnc` is the per-lease macOS account password set by
bootstrap.
On success, `warmup` prints a concise total duration line. Add `--timing-json` to emit a final JSON timing record with provider, lease ID, slug, total duration, and exit code.
Flags:
```text
--provider hetzner|aws|azure|ssh|blacksmith-testbox|daytona|islo
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
--profile <name>
--class <name>
--type <provider-type>
--market spot|on-demand
--ttl <duration>
--idle-timeout <duration>
--desktop
--browser
--code
--tailscale
--tailscale-tags <comma-separated tags>
--tailscale-hostname-template <template>
--tailscale-auth-key-env <env-var>
--tailscale-exit-node <name-or-100.x>
--tailscale-exit-node-allow-lan-access
--network auto|tailscale|public
--keep
--actions-runner
--reclaim
--timing-json
--blacksmith-org <org>
--blacksmith-workflow <file|name|id>
--blacksmith-job <job>
--blacksmith-ref <ref>
```
`--idle-timeout` releases the lease after no touch for that duration, default `30m`. `--ttl` remains the maximum wall-clock lifetime, default `90m`.
Warmup records a local claim tying the lease to the current repo; `--reclaim` overwrites an existing local claim for that lease.
`--browser` provisions a known browser binary and records it in
`/var/lib/crabbox/browser.env`. It can be used without `--desktop` for headless
browser automation. Managed Linux tries Google Chrome stable first, then a
Chromium package fallback.
`--desktop` provisions Xvfb, slim XFCE, and loopback-bound x11vnc for visible UI
automation and operator takeover. It does not imply a browser. Use
`--desktop --browser` when a headed browser should run in the visible display.
`--code` provisions `code-server` for Linux leases and enables
`crabbox code --id <lease>` to bridge the workspace through the authenticated
portal at `/portal/leases/<lease>/code/`.
`--tailscale` joins newly created managed Linux leases to the configured
tailnet. `--network` controls the SSH endpoint printed after readiness:
`auto` prefers the tailnet when reachable, `tailscale` requires it, and
`public` forces the provider/public host. Tailscale is a reachability layer, not
a provider; static hosts should put a MagicDNS name or 100.x address in
`static.host` instead. See [Tailscale](../features/tailscale.md).
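The three `--network` modes described above can be sketched as a small selection function. This is an illustration of the documented behavior, not Crabbox's implementation; the function name, argument order, and the `yes`/`no` reachability flag are assumptions.

```sh
# Hypothetical sketch of the --network selection rules:
#   auto      prefers the tailnet endpoint when it is reachable
#   tailscale requires the tailnet endpoint
#   public    always uses the provider/public host
select_ssh_endpoint() {
  mode="$1"
  tailnet_host="$2"
  public_host="$3"
  tailnet_reachable="$4"  # "yes" or "no"
  case "$mode" in
    public)
      printf '%s\n' "$public_host" ;;
    tailscale)
      if [ "$tailnet_reachable" = yes ]; then
        printf '%s\n' "$tailnet_host"
      else
        printf 'tailnet endpoint required but not reachable\n' >&2
        return 1
      fi ;;
    auto)
      if [ "$tailnet_reachable" = yes ]; then
        printf '%s\n' "$tailnet_host"
      else
        printf '%s\n' "$public_host"
      fi ;;
    *)
      printf 'unknown network mode: %s\n' "$mode" >&2
      return 1 ;;
  esac
}
```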
For AWS, `--market` overrides `capacity.market` for this lease. Use
`--market on-demand` when Spot capacity is blocked or when a quota request was
approved only for the standard On-Demand quota. Explicit `--type` still means
exact type: Crabbox reports quota/capacity/policy failures instead of silently
changing capacity.
`--actions-runner` immediately registers the warm box as an ephemeral self-hosted GitHub Actions runner for the current repository. Most projects should prefer `crabbox actions hydrate --id <lease-id-or-slug>` after warmup because it also dispatches the workflow and waits for the ready marker.
`--actions-runner` is not supported with `blacksmith-testbox` because Blacksmith owns Testbox workflow hydration.
New leases use per-lease SSH keys under the user config directory:
```text

236
docs/commands/webvnc.md Normal file

@ -0,0 +1,236 @@
# webvnc
`crabbox webvnc` bridges a desktop lease into the authenticated coordinator
portal.
Use it when you want the same VNC desktop that `crabbox vnc` opens, but inside
a browser tab instead of a native VNC client.
```sh
crabbox warmup --desktop
crabbox webvnc --id blue-lobster
crabbox webvnc --id blue-lobster --network tailscale
crabbox webvnc --id blue-lobster --open
crabbox webvnc daemon start --id blue-lobster --open
crabbox webvnc daemon status --id blue-lobster
crabbox webvnc daemon stop --id blue-lobster
crabbox webvnc status --id blue-lobster
crabbox webvnc status --id blue-lobster --network tailscale
crabbox webvnc reset --id blue-lobster --open
```
## How It Works
The command resolves the lease like `crabbox vnc`, verifies that the lease has
`desktop=true`, starts the normal SSH tunnel to the runner's loopback VNC
service, mints a short-lived bridge ticket over the authenticated coordinator
API, and opens a websocket bridge to the coordinator with that ticket. The
browser connects to `/portal/leases/<lease>/vnc` after GitHub portal auth, and
the Durable Object pairs that browser websocket with the local bridge process.
The data path is:
```text
browser noVNC
<-> coordinator portal websocket
<-> local crabbox webvnc process
<-> SSH tunnel
<-> runner 127.0.0.1:5900
```
That means the local `crabbox webvnc` process is not just a launcher. It is the
live bridge between the browser and the SSH-tunneled VNC socket. Keep it
running while the browser tab is open. If the browser tab reloads or drops, the
command re-registers a fresh bridge so the portal retry can reconnect.
## Security Boundary
This keeps the security boundary the same as `crabbox vnc`:
- VNC stays bound to runner loopback.
- The cloud provider does not open public VNC ingress.
- The coordinator authenticates the browser through portal auth and the bridge
through a one-use short-lived ticket. The CLI sends the ticket as an
`Authorization: Bearer ...` header so it stays out of websocket URLs and
proxy/access logs; the coordinator falls back to a `?ticket=` query string
for older CLIs.
- The noVNC client is served from the coordinator origin, not a third-party CDN.
- The local `crabbox webvnc` process must keep running while the browser uses
the desktop.
Use `crabbox webvnc daemon start --id <lease> --open` to keep the bridge
running without a tmux or foreground shell. Crabbox writes the bridge log and
pid file under its local state directory and prints both paths. Use
`crabbox webvnc daemon status --id <lease>` for the local pid/log check, and
`crabbox webvnc daemon stop --id <lease>` to kill the background bridge for
that lease. Shutdown terminates both the daemon supervisor and the active child
bridge process.
The bridge keeps a warm pool of backend VNC sessions open (default 4 slots,
which is what the `slots=` field in `webvnc status` reports). That lets
multiple portal viewers join the same lease: one viewer is the controller,
later viewers start in observer mode, and any viewer can press **take over**
to become the controller — including the prior controller, who stays connected
as an observer and can reclaim control the same way. Observer mode is a
collaboration UX for trusted shared leases; it relies on the portal noVNC
client staying read-only and is not a hostile-client isolation boundary.
The older `crabbox webvnc --id <lease> --daemon`, `--background`, `--status`,
and `--stop` forms remain accepted as compatibility aliases, but new docs and
automation should use the explicit `daemon` subcommands.
Use `crabbox webvnc status --id <lease>` for the full health view: local daemon
pid/log, SSH tunnel command, target VNC reachability, coordinator bridge/viewer
state, recent bridge events, portal URL/password, and the exact native VNC
fallback command. If status or reset is run with `--network public` or
`--network tailscale`, the printed native VNC fallback carries the same network
selection.
Typical status output is meant to be directly actionable:
```text
webvnc daemon: pid=12345 log=...
vnc target: reachable 127.0.0.1:5900 managed=true
ssh tunnel: ssh ... -L 5901:127.0.0.1:5900 ...
portal bridge: connected=true viewers=2 observers=1 slots=2
portal controller: peter
event: 2026-05-07T12:00:00Z bridge_connected
webvnc: https://crabbox.openclaw.ai/portal/leases/cbx_.../vnc#password=...
fallback: crabbox vnc --provider aws --target linux --network tailscale --id cbx_... --open
```
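Because the status lines use plain `key=value` fields, a script can pull one value out of the `portal bridge:` line. The sketch below assumes the line format shown in the sample output above stays stable; `portal_bridge_field` is a hypothetical helper name.

```sh
# Hypothetical parser: read `crabbox webvnc status` output on stdin and print
# the value of one key=value field from the `portal bridge:` line.
portal_bridge_field() {
  field="$1"
  # Keep only the fields of the portal bridge line, one per line,
  # then print the value of the requested key.
  sed -n 's/^[[:space:]]*portal bridge:[[:space:]]*//p' |
    tr ' ' '\n' |
    sed -n "s/^${field}=//p"
}
```

For example, piping the status output into `portal_bridge_field viewers` would print the current viewer count.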
When a layer is unhealthy, the CLI prints `problem:`, optional `detail:`, and
one or more exact `rescue:` commands in the command output, not only in docs.
Common problems include `VNC bridge disconnected`, `WebVNC daemon not running`,
`waiting for an available WebVNC observer slot`, and `VNC target unreachable`.
If the browser portal path looks unhealthy but the target VNC service is
reachable, the output also prints the native `crabbox vnc ... --open` fallback
command with the same provider/target/network flags.
Use `crabbox webvnc reset --id <lease> --open` when the portal is stuck on a
stale bridge/viewer/session. Reset closes only that lease's coordinator
WebVNC sockets, stops only that lease's local daemon pid after verifying it is
a Crabbox WebVNC process, restarts the target desktop helper/VNC services, then
starts a fresh background bridge and prints the new portal URL.
`--network tailscale` changes only the SSH endpoint used for the local tunnel.
The runner VNC service stays bound to loopback.
## Portal And Passwords
`--open` opens the portal page after the bridge starts. If the VNC password is
available, the command also places it in the URL fragment for the local browser
tab. URL fragments are not sent to the coordinator, and Crabbox preserves
special characters such as `!` when building the fragment. If the portal login
flow redirects first, the page may still prompt for the VNC password; use the
password printed by the command. If an old browser tab is retrying with a stale
fragment, close it before opening the new bridge URL.
The portal page may show `WebVNC daemon not running` or `waiting for VNC
bridge` until the local command has connected. If you opened the portal first,
start:
```sh
crabbox webvnc --id <lease-id-or-slug>
```
in a terminal and leave it running.
For human demos, prefer WebVNC over native VNC because `crabbox webvnc --open`
preloads the per-lease password in the local browser URL fragment. Use native
VNC only as the fallback printed by `crabbox webvnc status` or
`crabbox webvnc reset`.
The WebVNC toolbar includes clipboard controls. The paste control reads the
local browser clipboard, sends it through noVNC, and then sends the target paste
shortcut: Command-V for macOS targets, Ctrl-V for Linux and Windows targets.
When the remote VNC server publishes clipboard text, the copy-remote control is
enabled; click it to write that remote text into the local browser clipboard.
Browsers require a user gesture for clipboard writes, so remote-to-local copy is
explicit instead of fully automatic.
## Flags
Flags:
```text
--id <lease-id-or-slug>
--provider hetzner|aws|azure
--target linux|macos|windows
--windows-mode normal|wsl2
--static-host <host>
--static-user <user>
--static-port <port>
--static-work-root <path>
--network auto|tailscale|public
--local-port <port>
--open
status
reset
daemon start
daemon status
daemon stop
--reclaim
```
## Limitations
Limitations:
- Coordinator-backed Hetzner, AWS, and Azure Linux desktop leases are supported.
- Static SSH hosts are intentionally not supported yet because the portal cannot
prove that host-managed VNC credentials and prompts are safe to expose.
- Blacksmith Testbox still owns its own machine connectivity.
## Troubleshooting
`webvnc requires a configured coordinator login`
Run `crabbox login` for the coordinator you are using. WebVNC needs both the CLI
bridge and the browser portal to authenticate with the coordinator.
`webvnc currently supports coordinator-backed hetzner/aws/azure desktop leases`
WebVNC is not available for static SSH hosts or Blacksmith Testbox. Use
`crabbox vnc` for static hosts when you explicitly trust the host-managed VNC
service.
`target does not expose VNC on 127.0.0.1:5900`
The lease is reachable over SSH, but the desktop service is not ready or was not
provisioned. Create the lease with `--desktop`, or wait for bootstrap to finish
and retry.
The portal keeps saying `WebVNC daemon not running` or `waiting for VNC bridge`
The browser can reach the coordinator, but no local bridge is currently paired
with that lease. Start or restart `crabbox webvnc daemon start --id <lease>
--open`, or run `crabbox webvnc reset --id <lease> --open` when stale tabs or
session state are likely. If the command is still running, wait for the portal
retry or reload the browser tab.
`waiting for an available WebVNC observer slot`
The portal is reachable, but all bridge slots are already paired with viewers.
Restart the bridge with a current Crabbox CLI so it opens the default backend
pool. If the portal still cannot get a slot, run:
```sh
crabbox webvnc reset --id <lease-id-or-slug> --open
```
If WebVNC remains unreliable, use the exact native fallback command printed by
`crabbox webvnc status --id <lease-id-or-slug>`.
VNC authentication fails
Use the password printed by `crabbox webvnc`. With `--open`, the command tries
to pass the password in the browser URL fragment, but a portal login redirect
can lose that fragment before noVNC sees it.
Related docs:
- [Interactive desktop and VNC](../features/interactive-desktop-vnc.md)
- [Linux VNC](../features/vnc-linux.md)
- [Windows VNC](../features/vnc-windows.md)
- [macOS VNC](../features/vnc-macos.md)


@ -1,21 +1,77 @@
# whoami
`crabbox whoami` verifies broker auth and prints the identity the
coordinator sees.
```sh
crabbox whoami
crabbox whoami --json
```
Human output:
## Human Output
```text
user=steipete@gmail.com org=openclaw auth=bearer broker=https://crabbox-coordinator.steipete.workers.dev
user=alex@example.com org=openclaw auth=github broker=https://crabbox.openclaw.ai
```
Identity comes from Cloudflare Access email when present. In bearer-token mode, the CLI sends `X-Crabbox-Owner` from `CRABBOX_OWNER`, Git email env, or `git config user.email`, and `X-Crabbox-Org` from `CRABBOX_ORG`.
The fields:
- `user` - the resolved owner email.
- `org` - the organization namespace, when set.
- `auth` - the authentication mode the coordinator accepted (`github` for
signed login tokens, `bearer` for shared automation tokens).
- `broker` - the configured coordinator URL.
## JSON Output
```json
{
"owner": "alex@example.com",
"org": "openclaw",
"auth": "github",
"broker": "https://crabbox.openclaw.ai",
"tokenSource": "user-config",
"accessJwtVerified": false
}
```
JSON output also reports the forwarded auth mode, where the token came
from (`user-config`, `env`, `stdin`), and whether a verified Cloudflare
Access JWT was present.
## Identity Sources
Identity normally comes from the signed GitHub login token. The browser
flow embeds the verified GitHub email and allowed-org membership in a
short-lived signed token; the coordinator extracts owner/org from that
token, not from headers.
Shared bearer-token automation reports owner/org from `X-Crabbox-Owner` and
`X-Crabbox-Org`. The CLI fills those headers from:
- `CRABBOX_OWNER` env (highest precedence);
- `GIT_AUTHOR_EMAIL` or `GIT_COMMITTER_EMAIL` env;
- `git config user.email`;
- `CRABBOX_ORG` env for the org header.
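The owner precedence in the list above can be sketched in shell. This is an approximation, not the CLI's code: the function name `resolve_owner` is hypothetical, and checking `GIT_AUTHOR_EMAIL` before `GIT_COMMITTER_EMAIL` is an assumption, since the list does not state their relative order.

```sh
# Hypothetical sketch of the owner lookup order for bearer-token mode.
resolve_owner() {
  if [ -n "${CRABBOX_OWNER:-}" ]; then
    printf '%s\n' "$CRABBOX_OWNER"          # highest precedence
  elif [ -n "${GIT_AUTHOR_EMAIL:-}" ]; then
    printf '%s\n' "$GIT_AUTHOR_EMAIL"       # assumed to win over committer
  elif [ -n "${GIT_COMMITTER_EMAIL:-}" ]; then
    printf '%s\n' "$GIT_COMMITTER_EMAIL"
  else
    git config user.email                   # last resort: local Git config
  fi
}
```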
Raw Cloudflare Access identity headers are ignored. Only a verified Access
JWT email (with the JWT validated against the Cloudflare team's public
keys) can become the bearer-token owner.
## Exit Codes
```text
0 identity resolved successfully
2 broker URL or token missing
3 auth failure (token rejected, GitHub org membership missing, etc.)
```
Use `whoami` in CI scripts before any long workflow to fail fast on auth
issues.
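A CI script can turn the documented exit codes into readable messages. The helper below is a sketch that only restates the exit-code table above; `explain_whoami_exit` is a hypothetical name.

```sh
# Hypothetical helper: map a `crabbox whoami` exit code to the meaning
# documented in the exit-code table.
explain_whoami_exit() {
  case "$1" in
    0) printf '%s\n' 'identity resolved successfully' ;;
    2) printf '%s\n' 'broker URL or token missing' ;;
    3) printf '%s\n' 'auth failure' ;;
    *) printf 'unexpected exit code: %s\n' "$1" ;;
  esac
}
```

A workflow step could run `crabbox whoami`, capture `$?`, pass it to this helper, and stop the job on any non-zero value.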
Related docs:
- [login](login.md)
- [logout](logout.md)
- [Auth and admin](../features/auth-admin.md)
- [Broker auth and routing](../features/broker-auth-routing.md)

256
docs/concepts.md Normal file

@ -0,0 +1,256 @@
# Concepts
Read when:
- you encounter a Crabbox term you do not recognize;
- you are writing docs and want to stay consistent with existing usage;
- you need a single page that lays out the vocabulary.
This page is a glossary. It defines the nouns and the verbs Crabbox uses
across the CLI, broker, providers, and docs. When two synonyms exist, the
preferred form is in **bold**.
## Compute Vocabulary
**Lease** - a time-bounded reservation of a remote runner that Crabbox
created or resolved. Has a canonical ID (`cbx_...`), a friendly slug, an
idle timeout, a TTL, and a state (`active`, `released`, `expired`,
`failed`). Leases are the unit of cost accounting and cleanup.
**Runner** - the remote machine itself. Provisioned by the provider,
prepared by cloud-init, used for one or more leases. Crabbox does not
distinguish between a Hetzner cloud server, an AWS EC2 instance, and a
static SSH host beyond what the provider backend tells it - all are
runners.
**Box** / **Testbox** - informal synonyms for runner. Used in the README and
some early docs. Prefer "runner" in new docs unless the surrounding context
is talking about leases as a product (in which case "box" reads better).
**Pool** - the set of currently active runners visible to a user, org, or
the whole fleet. `crabbox list` and `/v1/pool` both expose it.
**Slug** - the friendly name for a lease. Looks like `blue-lobster`.
Generated from a stable hash of the lease ID; collisions append a 4-hex
suffix. See [Identifiers](features/identifiers.md).
**Lease ID** - the canonical machine-friendly identifier
(`cbx_abcdef123456`). Used in labels, logs, and APIs. Always 16 chars.
**Run** - a single `crabbox run` invocation against a coordinator. Has a
`run_...` ID, an owning lease, a command, an exit code, and a record in
coordinator history.
## Roles
**CLI** - the local Go binary `crabbox`. Owns config, sync, command
execution, output streaming, and per-lease SSH keys. See
[Architecture](architecture.md).
**Broker** / **Coordinator** - the Cloudflare Worker plus Fleet Durable
Object. Owns provider credentials, lease state, expiry, cleanup alarms,
usage, and cost. Both terms are used interchangeably; "coordinator" is
preferred in feature docs that emphasize state, "broker" when emphasizing
the trust boundary between CLI and provider.
**Provider** - a Crabbox component that knows how to acquire, resolve,
list, and release runners on a backing service. Built-in providers: AWS,
Hetzner, Static SSH, Blacksmith Testbox, Daytona, Islo. See
[Provider reference](providers/README.md).
**Backend** - the Go interface a provider implements:
`SSHLeaseBackend` for providers that hand Crabbox a real SSH target,
`DelegatedRunBackend` for providers that own command execution
themselves. See [Provider backends](provider-backends.md).
**Operator** - a person with broker-side access (admin token, Cloudflare
config). Operators run `crabbox admin` commands and image bake/promote
flows.
**Agent** - an LLM-backed process invoking Crabbox through the CLI or the
OpenClaw plugin. Agents are first-class users of Crabbox; the docs
intentionally write for both humans and agents.
## Modes
**Brokered mode** / **coordinator mode** - the normal path, where the CLI
talks to the Cloudflare Worker for lease creation, lease state, and
cleanup. Provider secrets stay broker-side. Used for shared team
infrastructure.
**Direct mode** / **direct-provider mode** - the local-debug fallback, where
the CLI talks straight to the provider API (AWS SDK, Hetzner API, Daytona
SDK, Islo SDK). No coordinator, no central history, no spend caps. Use
when you are debugging the broker itself.
**Static mode** - lease behavior for `provider: ssh`. The host is
operator-owned; Crabbox does not provision or delete it. Bypasses both
broker and direct provisioning paths.
**Delegated mode** - the path used by Blacksmith, Islo, and the Daytona
`run` flow. The provider owns command execution and streams output back to
Crabbox. Crabbox-owned sync (`--sync-only`, `--checksum`) is rejected;
sync timing reports `sync=delegated`.
## Commands
**warmup** - acquire a lease and keep it ready. No command runs yet.
**run** - acquire or reuse a lease, sync, run a command, stream output,
release.
**stop** - release a specific lease and delete its provider resources.
**cleanup** - sweep direct-provider leftovers based on labels. Refuses
when a coordinator is configured.
**reuse** - using `--id` (or a slug) to pick an existing lease instead of
creating a new one. Both `warmup` (idempotent) and `run` accept `--id`.
**reclaim** - move a local claim from one repo to another so a lease
created in repo A can be reused from repo B. Required because Crabbox
binds leases to repos by default.
**hydrate** - prepare a runner with project dependencies, usually by
dispatching a real GitHub Actions job that registers an ephemeral
self-hosted runner. The CLI then runs the local command in the hydrated
workspace. See [Actions hydration](features/actions-hydration.md).
## State
**Idle timeout** - the duration a lease may go without heartbeats before
the broker auto-releases it. Default 30m. Reset by every heartbeat or
explicit touch.
**TTL** - the absolute maximum wall-clock lifetime of a lease. Default
90m. Cannot be extended by heartbeats. `expiresAt = min(createdAt + ttl,
lastTouchedAt + idleTimeout)`.
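The `expiresAt` formula can be checked with integer arithmetic. This sketch assumes all four inputs are expressed in seconds (epoch timestamps and durations); the function name `lease_expires_at` is hypothetical.

```sh
# Hypothetical sketch of
#   expiresAt = min(createdAt + ttl, lastTouchedAt + idleTimeout)
# with all values in seconds.
lease_expires_at() {
  created_at="$1"
  ttl="$2"
  last_touched_at="$3"
  idle_timeout="$4"
  ttl_deadline=$((created_at + ttl))
  idle_deadline=$((last_touched_at + idle_timeout))
  # Whichever deadline comes first ends the lease.
  if [ "$ttl_deadline" -lt "$idle_deadline" ]; then
    printf '%s\n' "$ttl_deadline"
  else
    printf '%s\n' "$idle_deadline"
  fi
}
```

With the defaults (TTL 90m, idle timeout 30m), a lease last touched ten minutes after creation expires at the 40-minute mark, while one touched late in its life is capped by the TTL.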
**Heartbeat** - a `POST /v1/leases/{id}/heartbeat` call sent by the CLI
during long-running commands. Updates `lastTouchedAt`, can ship telemetry
samples, and can update idle timeout when explicitly requested.
**Touch** - the lower-level operation that updates a lease's state and idle
timer. The provider's `Touch` method handles direct-provider state updates;
heartbeat is the brokered equivalent.
**Reserved cost** - the worst-case TTL cost the broker reserves for a
lease at creation time (`hourlyRate × ttl`). Charged against the monthly
spend cap until the lease ends; freed on release. Distinct from elapsed
runtime cost, which is reported by `crabbox usage`.
**Estimated cost** - elapsed-runtime cost for a lease, computed from the
hourly rate and the time spent in `active`. What `crabbox usage` reports
as a billing approximation.
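
A rough sketch of how the two cost figures differ, assuming a flat hourly rate and ignoring rounding or currency handling. The function names and the example rate are hypothetical.

```python
def reserved_cost(hourly_rate: float, ttl_hours: float) -> float:
    """Worst-case cost held against the spend cap: hourlyRate x ttl."""
    return hourly_rate * ttl_hours


def estimated_cost(hourly_rate: float, active_seconds: float) -> float:
    """Elapsed-runtime cost for the time a lease spent in `active`."""
    return hourly_rate * active_seconds / 3600.0


rate = 2.0  # hypothetical hourly rate
reserved = reserved_cost(rate, ttl_hours=1.5)            # reserved at creation
estimated = estimated_cost(rate, active_seconds=1800)    # 30 minutes of runtime
```

With these inputs the reservation is three times the elapsed estimate, which is the gap that gets freed when the lease is released early.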
## Sync
**Manifest** - the NUL-delimited list of paths Crabbox will sync, built
from `git ls-files --cached` and `git ls-files --others --exclude-standard`.
**Fingerprint** - a hash of the commit, dirty file metadata, and manifest.
When the local fingerprint matches the remote one, Crabbox skips rsync.
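
The manifest and fingerprint entries can be pictured with the sketch below. It is an assumption-heavy illustration: the glossary only says the fingerprint is "a hash of the commit, dirty file metadata, and manifest", so the choice of SHA-256, the `(size, mtime_ns)` metadata tuple, and all function names are hypothetical.

```python
import hashlib


def parse_manifest(raw: bytes) -> list[str]:
    """Split NUL-delimited `git ls-files -z` style output into paths."""
    return [p.decode("utf-8", "surrogateescape") for p in raw.split(b"\0") if p]


def fingerprint(commit: str, dirty: dict[str, tuple[int, int]],
                manifest: list[str]) -> str:
    """Hash the commit, dirty-file metadata, and manifest (assumed layout)."""
    h = hashlib.sha256()
    h.update(commit.encode() + b"\0")
    for path in sorted(dirty):
        size, mtime_ns = dirty[path]
        h.update(f"{path}\0{size}\0{mtime_ns}\0".encode("utf-8", "surrogateescape"))
    for path in sorted(manifest):
        h.update(path.encode("utf-8", "surrogateescape") + b"\0")
    return h.hexdigest()


def needs_rsync(local_fp: str, remote_fp: str) -> bool:
    """Sync is skipped only when both fingerprints match."""
    return local_fp != remote_fp
```

Sorting the inputs keeps the hash stable regardless of the order in which Git reports files, so an unchanged tree always produces the same fingerprint.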
**Git seeding** - the optional first-sync step where Crabbox fetches the
configured origin/base ref into the runner's Git directory before rsync,
so changed-file diffs are available remotely.
**Base ref** - the Git ref that Crabbox seeds and hydrates. Default
`main`. Configurable per repo in `sync.baseRef`.
**Sanity check** - a guardrail run after rsync that detects mass tracked
deletions, missing manifest entries, and other suspicious sync outcomes.
## Capabilities
**Desktop** - lease capability that adds Xvfb + XFCE + x11vnc. Required
for `crabbox vnc`, `crabbox webvnc`, and most `--browser` UI runs.
**Browser** - lease capability that installs Chrome/Chromium and exports
`BROWSER`/`CHROME_BIN`. Useful for Playwright/Vitest/etc. without a full
QA harness.
**Code** - lease capability that installs code-server bound to loopback.
Used by `crabbox code` and the portal `/code/` bridge.
**Tailscale** - optional reachability layer for managed Linux leases.
Joins the lease to the configured tailnet so clients on the tailnet can
reach the runner without the public IP. Distinct from the network mode
(`--network tailscale`) that selects which plane the CLI uses.
## Backplane
**Durable Object** - the Cloudflare Worker primitive that holds Crabbox
fleet state. Crabbox uses one fleet Durable Object so all scheduling
decisions are serialized.
**Alarm** - the Durable Object scheduling primitive that fires on a future
timestamp. Crabbox uses alarms for idle-timeout sweeps and TTL cleanup.
**Portal** - the server-rendered web UI hosted by the same Worker. Pages
under `/portal/...`. See [Browser portal](features/portal.md).
**Bridge** - a portal endpoint that proxies traffic to a loopback service
on the lease (VNC, code-server). Bridges authenticate against the portal
session, then talk to the lease over the internal SSH plane.
## Identity
**Owner** - the email address that owns a lease. Resolved from the signed
GitHub login token, `CRABBOX_OWNER`, Git env, or `git config user.email`.
**Org** - the GitHub-style organization namespace for a lease. Resolved
from the signed token or `CRABBOX_ORG`. Used for usage scoping and
multi-tenant cost caps.
**Allowed org** - the GitHub org membership the broker requires before
issuing a signed login token. Configured per Cloudflare Worker.
**Admin token** - the separately scoped token required for `/v1/pool`,
admin lease routes, and fleet-wide listing. Held more closely than the
shared automation token.
**Cloudflare Access** - optional protection layer in front of the Worker.
When configured, the Worker only trusts the `CF-Access-Jwt-Assertion`
header (verified upstream); raw identity headers from the client are
ignored.
## Storage
**State directory** - where the CLI keeps local state (claims, per-lease
keys, known_hosts). Defaults to `$XDG_STATE_HOME/crabbox`, falling back to
the platform-specific user config directory.
**Claim** - a JSON file under the state directory binding a lease to a
repo. Required for `crabbox run --id` to resolve slugs and to refuse
cross-repo reuse without `--reclaim`.
**Workdir** / **work root** - the directory on the runner where Crabbox
syncs the repo. Default `/work/crabbox` on Linux; provider-specific on
Windows and macOS.
## Documentation
**Source map** - the doc page that points each user-facing behavior at the
implementation file behind it. Updated when behavior changes. See
[Source map](source-map.md).
**Feature page** - a doc under `docs/features/<name>.md` describing what
Crabbox does in one capability area. Owns the conceptual story; commands
and providers cross-link from here.
**Command page** - a doc under `docs/commands/<name>.md` describing the
flags, behavior, and exit codes of one CLI command. One per top-level
command, kept in sync with `--help` by `scripts/check-command-docs.mjs`.
**Provider page** - a doc under `docs/providers/<name>.md` describing one
provider's targets, config keys, env vars, sync behavior, and expected
failures.
Related docs:
- [How Crabbox Works](how-it-works.md)
- [Architecture](architecture.md)
- [CLI](cli.md)
- [Configuration](features/configuration.md)
- [Provider backends](provider-backends.md)

@ -8,25 +8,65 @@ Read when:
- you are deciding where a behavior belongs;
- you need the feature-level contract before changing code.
Core features:
## Foundations
- [Configuration](configuration.md): precedence, YAML schema, profiles, classes, env vars.
- [Identifiers](identifiers.md): lease IDs, slugs, run IDs, claims, and how lookup resolves.
- [Doctor checks](doctor.md): what `crabbox doctor` validates and how to extend it.
- [Network and reachability](network.md): `--network auto|tailscale|public`, port fallback, public/tailnet planes.
- [Lease capabilities](capabilities.md): `--desktop`, `--browser`, and `--code` selection rules.
- [Environment forwarding](env-forwarding.md): name-based env allowlist for the remote command.
## Brokered fleet
- [Coordinator](coordinator.md): brokered leases through Cloudflare Workers and Durable Objects.
- [Broker auth and routing](broker-auth-routing.md): bearer tokens, Cloudflare Access identity, and Worker routes.
- [Providers](providers.md): Hetzner and AWS EC2 Spot provisioning, classes, and fallback.
- [Browser portal](portal.md): authenticated lease/run UI, detail pages, bridge routes, and runner visibility.
- [Broker auth and routing](broker-auth-routing.md): GitHub login, shared bearer tokens, optional Cloudflare Access, and Worker routes.
- [Auth and admin](auth-admin.md): login/logout/whoami and trusted operator controls.
- [Telemetry](telemetry.md): lightweight Linux load, memory, disk, uptime, and run resource samples.
- [History and logs](history-logs.md): coordinator run records, events, and retained remote output.
- [Cost and usage](cost-usage.md): guardrails, provider-backed pricing, and reporting.
- [Lifecycle cleanup](lifecycle-cleanup.md): release, expiry, keep mode, and direct cleanup.
## Providers
- [Providers](providers.md): provider overview, target matrix, classes, and fallback.
- [Capacity and fallback](capacity-fallback.md): class chains, market spot/on-demand, region/AZ routing.
- [Provider backends](../provider-backends.md): contract reference for backend interfaces and registration.
- [Authoring a provider](provider-authoring.md): step-by-step guide to writing a new provider.
- [AWS](aws.md): EC2 Linux, Windows, WSL2, EC2 Mac, capacity, AMIs, and security groups.
- [Azure](azure.md): Azure Linux/native Windows, shared infra, capacity, and cleanup.
- [Hetzner](hetzner.md): Linux-only managed Hetzner behavior, classes, and cleanup.
- [Blacksmith Testbox](blacksmith-testbox.md): delegated Testbox backend behavior.
- [Daytona](daytona.md): Daytona SDK/toolbox sandbox leases with optional short-lived SSH access.
- [Islo](islo.md): delegated Islo sandbox runs using the Islo Go SDK.
## Runners and reachability
- [Tailscale](tailscale.md): optional tailnet reachability for managed Linux leases and static hosts.
- [Mediated egress](egress.md): browser/app egress through an operator machine using the Cloudflare Worker mediator.
- [Runner bootstrap](runner-bootstrap.md): cloud-init, installed tools, SSH port, and readiness.
- [Prebaked runner images](prebaked-images.md): provider-owned image storage and the image/cache/state boundary.
- [Image bake runbook](image-bake-runbook.md): exact AWS bake, candidate smoke, promotion, rollback, and cleanup flow.
- [SSH keys](ssh-keys.md): per-lease keys, provider key cleanup, and local storage.
## Sync, run, and recording
- [Sync](sync.md): Git file-list manifests, rsync, fingerprints, excludes, guardrails, and sanity checks.
- [Actions hydration](actions-hydration.md): let GitHub Actions prepare a runner, then sync local work into that workspace.
- [SSH keys](ssh-keys.md): per-lease keys, provider key cleanup, and local storage.
- [Cost and usage](cost-usage.md): guardrails, provider-backed pricing, and reporting.
- [History and logs](history-logs.md): coordinator run records and retained remote output tails.
- [Interactive desktop and VNC](interactive-desktop-vnc.md): VNC hub, support matrix, tunnel model, and QA boundaries.
- [Artifacts](artifacts.md): screenshots, video, trimmed GIFs, logs, metadata, templates, and PR publishing.
- [Linux VNC](vnc-linux.md), [Windows VNC](vnc-windows.md), [macOS VNC](vnc-macos.md): OS-specific desktop setup and troubleshooting.
- [Test results](test-results.md): JUnit summaries attached to recorded runs.
- [Cache controls](cache.md): inspect, purge, and warm remote package/build caches.
- [Auth and admin](auth-admin.md): login/logout/whoami and trusted operator controls.
- [Lifecycle cleanup](lifecycle-cleanup.md): release, expiry, keep mode, and direct cleanup.
## Integrations
- [OpenClaw plugin](openclaw-plugin.md): agent tools that wrap the CLI.
- [Repository onboarding](repository-onboarding.md): `crabbox init`, repo config, workflow stub, and agent skill.
- [Source map](../source-map.md): implementation files behind documented behavior.
Command docs:
## Command docs
- [doctor](../commands/doctor.md)
- [init](../commands/init.md)
@ -35,11 +75,13 @@ Command docs:
- [history](../commands/history.md)
- [logs](../commands/logs.md)
- [results](../commands/results.md)
- [artifacts](../commands/artifacts.md)
- [cache](../commands/cache.md)
- [status](../commands/status.md)
- [list](../commands/list.md)
- [usage](../commands/usage.md)
- [ssh](../commands/ssh.md)
- [vnc](../commands/vnc.md)
- [inspect](../commands/inspect.md)
- [stop](../commands/stop.md)
- [actions](../commands/actions.md)

@ -8,6 +8,10 @@ Read when:
Actions hydration lets a repository reuse its existing GitHub Actions setup without putting repository-specific setup code in the Crabbox binary.
Runner registration is currently Linux-only. Brokered Hetzner/AWS/Azure Linux
targets work; static macOS/Windows and managed Windows/macOS targets are for
direct `crabbox run` loops until platform-specific runner installation is added.
The flow:
1. `crabbox warmup` leases a machine and prints both `cbx_...` and a friendly slug.

119
docs/features/artifacts.md Normal file
@ -0,0 +1,119 @@
# Artifacts
Read when:
- collecting screenshots, videos, logs, or metadata from a desktop lease;
- turning a desktop recording into a trimmed GIF;
- publishing QA proof into a GitHub PR;
- deciding whether AWS S3 or Cloudflare R2 should host inline assets.
Crabbox artifacts are a local bundle plus optional hosted URLs. The command is
designed for QA handoff: capture the state of a lease, preserve enough metadata
to reproduce what happened, and publish a concise before/after/summary comment.
## Bundle Contract
`crabbox artifacts collect --id <lease>` writes a directory such as
`artifacts/blue-lobster` with:
- `metadata.json`: Crabbox version, lease id, slug, provider, network, target,
run id when provided, and capture time.
- `screenshot.png`: a desktop screenshot captured through the managed VNC
boundary.
- `doctor.txt`: the same desktop/session checks as `crabbox desktop doctor`.
- `webvnc-status.json`: bridge/viewer status when the lease is coordinator
backed.
- `logs.txt` and `run.json`: retained run output and run metadata when
`--run <run-id>` is set.
- `screen.mp4`, `screen.trimmed.gif`, and `screen.trimmed.mp4` when video/GIF
capture is requested.
Failures keep the rescue-first UX. If the input stack is dead, the VNC bridge
is disconnected, the browser did not launch, or screenshot/video capture fails,
the command prints a concrete `problem:` plus exact `rescue:` commands before
returning. In `--json` mode those hints are kept in `warnings`, stdout remains
parseable JSON, and post-start capture failures add an `error` object while
still returning a nonzero exit code.
## Media
Video capture is intentionally lease-local and Linux-first. The CLI records
the X11 desktop with remote `ffmpeg` and streams the MP4 back over SSH. GIF
generation then reuses the local motion-trimming logic from `crabbox media
preview`: leading/trailing static regions are removed and an optional trimmed
MP4 can be emitted beside the GIF.
Use `desktop launch --fullscreen` only when the artifact should show a
browser-only capture. The standard human QA profile remains windowed so panel
and window chrome stay visible.
## Publishing
GitHub comments cannot directly upload arbitrary local files through the issue
comment API. `crabbox artifacts publish --pr <n>` therefore uploads files to a
storage backend first, renders Markdown with inline image/GIF links, writes the
same body to `published-artifacts.md`, and posts that body with `gh`.
Supported storage:
- Brokered coordinator publishing through `crabbox artifacts publish` with no
storage flags. The coordinator owns object-store credentials and returns
short-lived upload URLs plus final public URLs.
- AWS S3 through the `aws` CLI.
- Cloudflare R2 through `wrangler r2 object put`.
- Local/hosted mode through `--storage local --base-url <url>` when another
process already serves the bundle.
For AWS S3, use either public/custom-domain URLs through `--base-url` or
temporary links through `--presign --expires <duration>`. For Cloudflare R2,
provide a public bucket/custom-domain `--base-url` when publishing to a PR;
without it, the upload can succeed, but the PR will contain only `r2://` object
identifiers, not inline-ready links.
## Broker Secret Model
Brokered publishing is intentionally asymmetric. Local users and agents only
need normal Crabbox coordinator auth. The coordinator holds the storage keys and
uses them to sign one upload request per artifact. Each upload grant includes a
signed `content-length`, so the configured size cap is enforced by the storage
backend, not only by the request metadata. The broker enforces both a 1 GiB
per-file cap and a 5 GiB per-request aggregate cap before minting upload URLs.
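
The two size caps can be sketched as a pre-flight check run before any upload URL is minted. This is a minimal illustration, not the broker's code: the function name is hypothetical, and treating a file of exactly 1 GiB as allowed is an assumption.

```python
GIB = 1024 ** 3
MAX_FILE_BYTES = 1 * GIB      # per-file cap from the paragraph above
MAX_REQUEST_BYTES = 5 * GIB   # per-request aggregate cap


def check_upload_request(sizes: dict[str, int]) -> None:
    """Raise ValueError if any file or the whole request exceeds its cap."""
    for name, size in sizes.items():
        if size > MAX_FILE_BYTES:
            raise ValueError(f"{name}: {size} bytes exceeds the per-file cap")
    total = sum(sizes.values())
    if total > MAX_REQUEST_BYTES:
        raise ValueError(f"request total {total} bytes exceeds the aggregate cap")
```

Because each grant also carries a signed `content-length`, a client that passes this check with one size cannot later upload a larger object with the same URL.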
When users do not pass `--prefix`, hosted publishing adds a unique
PR/bundle/timestamp prefix so later artifact bundles cannot overwrite links from
earlier QA comments.
Coordinator artifact vars describe the backend:
- `CRABBOX_ARTIFACTS_BACKEND`: `s3` or `r2`.
- `CRABBOX_ARTIFACTS_BUCKET`: destination bucket.
- `CRABBOX_ARTIFACTS_PREFIX`: root object prefix for all brokered uploads.
- `CRABBOX_ARTIFACTS_BASE_URL`: public URL prefix for final Markdown links.
- `CRABBOX_ARTIFACTS_REGION` and `CRABBOX_ARTIFACTS_ENDPOINT_URL`: S3/R2 signing
endpoint details.
- `CRABBOX_ARTIFACTS_UPLOAD_EXPIRES_SECONDS`: lifetime for write grants.
- `CRABBOX_ARTIFACTS_URL_EXPIRES_SECONDS`: lifetime for signed read URLs when
no public base URL is configured.
Coordinator artifact secrets authorize signing:
- `CRABBOX_ARTIFACTS_ACCESS_KEY_ID`
- `CRABBOX_ARTIFACTS_SECRET_ACCESS_KEY`
- `CRABBOX_ARTIFACTS_SESSION_TOKEN` when the backend uses temporary
credentials.
These keys are object-store credentials, not Crabbox provider credentials. They
should be scoped to the artifact bucket/prefix and should not grant Worker
deployment, Cloudflare account administration, lease creation, or cloud VM
provider access. The CLI receives only pre-signed URLs and final asset URLs.
## Templates
`crabbox artifacts template openclaw` and `crabbox artifacts template mantis`
produce Markdown with:
- `Summary`
- `Before / After`
- `Evidence`
The publish command uses the same layout, so local preview and PR comments stay
consistent.

@ -6,23 +6,29 @@ Read when:
- changing trusted operator controls;
- debugging who owns a lease or run.
Crabbox currently supports bearer-token broker auth. `crabbox login` stores the broker URL, provider, and token in the user config, then verifies the token with `GET /v1/whoami`. It is not yet a GitHub browser OAuth flow.
Crabbox supports GitHub browser login for normal users, shared bearer-token login for trusted operator automation, and a separate admin token for fleet-wide routes. `crabbox login` opens GitHub; the coordinator exchanges the OAuth code and verifies active membership in the allowed GitHub org and any configured allowed team slugs; the CLI then stores a signed user token in the user config. `crabbox login --token-stdin` stores the shared operator token instead.
Identity sent to the coordinator:
```text
Cloudflare Access email, when present
signed GitHub login token from browser auth
X-Crabbox-Owner from CRABBOX_OWNER, Git email env, or git config user.email
X-Crabbox-Org from CRABBOX_ORG
Verified Cloudflare Access JWT email, when configured and present
CRABBOX_DEFAULT_ORG fallback in the Worker
```
Commands:
```sh
crabbox login
crabbox login --no-browser
crabbox login --url <url> --token-stdin
crabbox whoami
crabbox logout
crabbox share --id blue-lobster --user friend@example.com
crabbox share --id blue-lobster --org
crabbox unshare --id blue-lobster --user friend@example.com
```
Trusted operator controls:
@ -33,7 +39,29 @@ crabbox admin release blue-lobster
crabbox admin delete cbx_... --force
```
Admin commands use the same coordinator token as normal broker calls. Do not distribute the shared token to untrusted users. A future access-control pass should split operator and user tokens before Crabbox is opened beyond trusted maintainers.
Admin commands require the separate admin token. GitHub browser-login tokens can create and use normal leases only after allowed-org membership, and configured team membership when present, is verified. They cannot call admin routes.
Normal user tokens are owner/org scoped:
```text
GET /v1/leases own and shared leases only
GET /v1/leases/{id-or-slug} exact ID and slug lookup must be visible
POST /v1/leases/{id}/heartbeat own or shared leases
PUT/DELETE /v1/leases/{id}/share owner, manage share, or admin only
POST /v1/leases/{id}/release owner, manage share, or admin only
GET /v1/runs and logs own runs only
GET /v1/usage own usage only
GET /v1/pool admin token only
```
Lease sharing grants coordinator and portal access without distributing the
shared bearer token or admin token. A `use` share can see the lease and open
visible portal bridges such as WebVNC/code. A `manage` share can also change
sharing and stop the lease. `--org` shares with authenticated users whose org
matches the lease org. SSH-based CLI use still requires a local private key
accepted by the runner; sharing does not copy SSH private keys between users.
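
The sharing rules for normal user tokens can be modeled roughly as follows. This is a sketch under stated assumptions: the lease field names, the `relation` and `allowed` helpers, and the action labels are hypothetical, and admin-token access (which goes through separate admin routes) is not modeled.

```python
# Actions a normal user token may perform, keyed by relation to the lease.
PERMISSIONS = {
    "owner":  {"view", "heartbeat", "bridge", "share", "release"},
    "manage": {"view", "heartbeat", "bridge", "share", "release"},
    "use":    {"view", "heartbeat", "bridge"},
}


def relation(lease: dict, email: str, org: str):
    """Return 'owner', 'manage', 'use', or None for a caller."""
    if email == lease["owner"]:
        return "owner"
    if email in lease["user_shares"]:
        return lease["user_shares"][email]
    # An org share applies to authenticated users whose org matches the lease org.
    if lease["org_share"] and org == lease["org"]:
        return lease["org_share"]
    return None


def allowed(lease: dict, email: str, org: str, action: str) -> bool:
    return action in PERMISSIONS.get(relation(lease, email, org), set())


lease = {
    "owner": "me@example.com",
    "org": "openclaw",
    "user_shares": {"friend@example.com": "use"},
    "org_share": None,
}
```

Note that `allowed(...)` returning true for `bridge` says nothing about SSH: a shared user still needs a private key the runner accepts.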
Do not distribute the shared token or admin token to untrusted users. Keep the admin token narrower and more closely held than the shared automation token.
Related docs:

146
docs/features/aws.md Normal file
@ -0,0 +1,146 @@
# AWS
Read when:
- choosing AWS as the Crabbox provider;
- debugging EC2 capacity, quotas, AMIs, security groups, or EC2 Mac hosts;
- changing AWS provisioning code in the CLI or Worker.
AWS is Crabbox's broadest managed provider. It supports Linux, native Windows,
Windows WSL2, and EC2 Mac targets. Brokered mode keeps AWS credentials in the
Cloudflare Worker; direct mode uses the local AWS credential chain for provider
debugging.
## Targets
| Target | Managed | Notes |
| --- | --- | --- |
| Linux | Yes | Spot by default; On-Demand optional; cloud-init bootstrap. |
| Windows native | Yes | EC2Launch, OpenSSH, Git for Windows, TightVNC, archive sync. |
| Windows WSL2 | Yes | Nested virtualization on C8i/M8i/R8i families; POSIX sync through WSL. |
| macOS | Yes | EC2 Mac Dedicated Host id required; On-Demand only. |
Examples:
```sh
crabbox warmup --provider aws --class beast
crabbox run --provider aws --class beast --market on-demand -- pnpm check
crabbox warmup --provider aws --target windows --desktop
crabbox warmup --provider aws --target windows --windows-mode wsl2
CRABBOX_AWS_MAC_HOST_ID=h-... crabbox warmup --provider aws --target macos --desktop --market on-demand
```
## Capacity
AWS Linux defaults to Spot. Use `--market on-demand` for one lease when Spot is
blocked or when an account only has On-Demand quota. `capacity.fallback` can
fall back to On-Demand after Spot capacity/quota failures when configured.
Set `CRABBOX_CAPACITY_REGIONS` or `capacity.regions` to give AWS more regional
headroom. Brokered and direct AWS launches try the primary region first, then
the configured capacity regions in order. The public coordinator defaults to:
```sh
CRABBOX_CAPACITY_REGIONS=eu-west-1,eu-west-2,eu-central-1,us-east-1,us-west-2
```
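
The region ordering described above can be sketched like this. It is an illustration only: the helper name is hypothetical, `eu-west-1` as the primary region is an assumption for the example, and dropping duplicates of the primary region is a design guess rather than documented behavior.

```python
def region_order(primary: str, capacity_regions: str) -> list[str]:
    """Primary region first, then configured capacity regions in order."""
    ordered = [primary]
    for region in capacity_regions.split(","):
        region = region.strip()
        if region and region not in ordered:
            ordered.append(region)
    return ordered


order = region_order(
    "eu-west-1",
    "eu-west-1,eu-west-2,eu-central-1,us-east-1,us-west-2",
)
```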
Prefer `standard` or `fast` during capacity incidents. `beast` starts at
48xlarge candidates and can consume 192 vCPUs per request before fallback.
Brokered AWS leases return capacity hints in the lease payload and CLI output.
Hints include the selected region/market, failed attempt regions, quota
pressure, Spot-to-On-Demand fallback, and high-pressure class warnings. Set
`capacity.hints: false` or `CRABBOX_CAPACITY_HINTS=0` to suppress them. Set
`CRABBOX_CAPACITY_LARGE_CLASSES=beast,large` when an installation wants warning
hints for a different set of classes.
These fields are wire-compatible with mixed CLI/broker versions. Upgraded
brokers add optional response fields that older clients ignore. Upgraded
clients keep the lease request sparse: they omit default hint and routing fields
and do not send the capacity block at all for broker defaults, unless an
operator configures a non-default market/strategy/fallback, a multi-region pool,
pinned availability zones, or `capacity.hints: false`.
Crabbox tries ordered instance candidates for the requested class. Explicit
`--type` is exact: if EC2 rejects it, Crabbox fails clearly instead of silently
choosing another type.
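
The difference between class candidates and an explicit `--type` can be sketched as below. This is a hedged illustration, not the provisioning code: `pick_instance_type` and the `try_launch` callback are hypothetical stand-ins for the real EC2 launch attempt.

```python
from typing import Callable, Optional, Sequence


def pick_instance_type(candidates: Sequence[str],
                       try_launch: Callable[[str], bool],
                       explicit_type: Optional[str] = None) -> str:
    """Walk ordered class candidates, or treat an explicit type as exact."""
    if explicit_type is not None:
        if try_launch(explicit_type):
            return explicit_type
        # No silent substitution when the caller pinned a type.
        raise RuntimeError(f"EC2 rejected explicit type {explicit_type}")
    for instance_type in candidates:
        if try_launch(instance_type):
            return instance_type
    raise RuntimeError("no candidate instance type could be launched")


standard = ["c7a.8xlarge", "c7i.8xlarge", "m7a.8xlarge", "m7i.8xlarge", "c7a.4xlarge"]
```

Keeping the explicit path separate from the loop makes the "fail clearly" behavior hard to break by accident when the candidate list changes.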
Current class defaults:
```text
AWS Linux
standard c7a.8xlarge, c7i.8xlarge, m7a.8xlarge, m7i.8xlarge, c7a.4xlarge
fast c7a.16xlarge, c7i.16xlarge, m7a.16xlarge, m7i.16xlarge, c7a.12xlarge, c7a.8xlarge
large c7a.24xlarge, c7i.24xlarge, m7a.24xlarge, m7i.24xlarge, r7a.24xlarge, c7a.16xlarge, c7a.12xlarge
beast c7a.48xlarge, c7i.48xlarge, m7a.48xlarge, m7i.48xlarge, r7a.48xlarge, c7a.32xlarge, c7i.32xlarge, m7a.32xlarge, c7a.24xlarge, c7a.16xlarge
AWS Windows
standard m7i.large, m7a.large, t3.large
fast m7i.xlarge, m7a.xlarge, t3.xlarge
large m7i.2xlarge, m7a.2xlarge, t3.2xlarge
beast m7i.4xlarge, m7a.4xlarge, m7i.2xlarge
AWS Windows WSL2
standard m8i.large, m8i-flex.large, c8i.large, r8i.large
fast m8i.xlarge, m8i-flex.xlarge, c8i.xlarge, r8i.xlarge
large m8i.2xlarge, m8i-flex.2xlarge, c8i.2xlarge, r8i.2xlarge
beast m8i.4xlarge, m8i-flex.4xlarge, c8i.4xlarge, r8i.4xlarge, m8i.2xlarge
AWS macOS
all mac2.metal unless `--type` is set
```
## Broker Secrets And Env
Worker secrets:
```text
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN optional
CRABBOX_AWS_MAC_HOST_ID optional; required for brokered target=macos
```
CLI/direct env and config:
```text
AWS_PROFILE
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN
CRABBOX_AWS_REGION
CRABBOX_AWS_AMI
CRABBOX_AWS_SECURITY_GROUP_ID
CRABBOX_AWS_SUBNET_ID
CRABBOX_AWS_INSTANCE_PROFILE
CRABBOX_AWS_ROOT_GB
CRABBOX_AWS_SSH_CIDRS
CRABBOX_AWS_MAC_HOST_ID
CRABBOX_CAPACITY_REGIONS
CRABBOX_CAPACITY_AVAILABILITY_ZONES
CRABBOX_CAPACITY_HINTS
CRABBOX_CAPACITY_LARGE_CLASSES
```
## Security And Networking
Crabbox imports or reuses an EC2 key pair, creates or reuses the
`crabbox-runners` security group when no security group is supplied, and opens
only SSH ports to configured CIDRs or the detected request source. VNC stays
behind the SSH tunnel. Supplying `CRABBOX_AWS_SECURITY_GROUP_ID` makes ingress
policy your responsibility.
## Images
Linux resolves the latest Ubuntu 24.04 x86_64 AMI unless overridden. Windows
resolves the latest Windows Server 2022 English Full Base AMI unless overridden.
Operators can create and promote trusted AWS images with `crabbox image`.
Related docs:
- [Providers](providers.md)
- [Linux VNC](vnc-linux.md)
- [Windows VNC](vnc-windows.md)
- [macOS VNC](vnc-macos.md)
- [Infrastructure](../infrastructure.md)
- [image command](../commands/image.md)

133
docs/features/azure.md Normal file
@ -0,0 +1,133 @@
# Azure
Read when:
- choosing Azure as the Crabbox provider;
- debugging Azure VM capacity, quotas, images, or SSH readiness;
- changing Azure provisioning code in the CLI.
Azure is a managed provider for Linux and native Windows SSH leases. It
creates VMs in a shared resource group, tags them with Crabbox lease
metadata, and bootstraps the normal SSH/sync contract through cloud-init
on Linux or Custom Script Extension on Windows. It works in direct mode with
local Azure auth and in brokered mode through Worker-owned service principal
secrets.
## Targets
| Target | Managed | Notes |
| --- | --- | --- |
| Linux | Yes | Cloud-init bootstrap, SSH, rsync, optional desktop/browser/code. |
| Windows | Yes | Native Windows SSH/sync/run only. No Azure desktop/browser/WSL2. |
| macOS | No | Azure does not offer managed macOS; use AWS EC2 Mac or static SSH. |
Examples:
```sh
crabbox warmup --provider azure --class beast
crabbox run --provider azure --class standard -- pnpm test
crabbox warmup --provider azure --target windows --class standard
crabbox warmup --provider azure --desktop --browser
crabbox vnc --id blue-lobster --open
```
## Classes
```text
standard Standard_D32ads_v6, Standard_D32ds_v6, Standard_F32s_v2, then D/F 16-vCPU fallbacks
fast Standard_D64ads_v6, Standard_D64ds_v6, Standard_F64s_v2, then D/F 48-vCPU and 32-vCPU fallbacks
large Standard_D96ads_v6, Standard_D96ds_v6, then D/F 64-vCPU and 48-vCPU fallbacks
beast Standard_D192ds_v6, Standard_D128ds_v6, then D/F 96-vCPU and 64-vCPU fallbacks
```
Native Windows uses the smaller AWS Windows class scale:
```text
standard Standard_D2ads_v6, Standard_D2ds_v6, Standard_D2ads_v5, Standard_D2ds_v5, then Standard_D2as_v6
fast Standard_D4ads_v6, Standard_D4ds_v6, Standard_D4ads_v5, Standard_D4ds_v5, then Standard_D4as_v6
large Standard_D8ads_v6, Standard_D8ds_v6, Standard_D8ads_v5, Standard_D8ds_v5, then Standard_D8as_v6
beast Standard_D16ads_v6, Standard_D16ds_v6, Standard_D16ads_v5, Standard_D16ds_v5, then Standard_D8ads_v6
```
Crabbox falls back through the candidate list when Azure rejects a SKU for
capacity or quota. Explicit `--type` is exact and fails clearly when the
SKU cannot be created. Spot leases fall back to on-demand when
`capacity.fallback` starts with `on-demand`.
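
The spot-to-on-demand rule can be read as a prefix check on the fallback setting. The sketch below assumes `capacity.fallback` is a plain string; the helper name and the `"none"` value used in the usage note are hypothetical.

```python
def market_attempts(market: str, fallback: str) -> list[str]:
    """Markets to try in order for one lease request."""
    attempts = [market]
    if market == "spot" and fallback.startswith("on-demand"):
        attempts.append("on-demand")
    return attempts
```

For example, `market_attempts("spot", "on-demand")` yields both markets in order, while `market_attempts("spot", "none")` stays on spot only.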
Default Azure Linux class candidates mirror the vCPU scale of the AWS Linux
class table. Default Azure native Windows candidates mirror the AWS native
Windows class table. Crabbox asks Azure Resource SKUs whether the selected VM
supports ephemeral OS disks; ephemeral-capable sizes use local OS disks,
while exact `--type` requests for non-ephemeral sizes use managed
`StandardSSD_LRS` OS disks.
## Direct Auth And Env
Service principal env vars. `DefaultAzureCredential` consumes the tenant, client, and secret values; `AZURE_SUBSCRIPTION_ID` selects the subscription:
```text
AZURE_TENANT_ID
AZURE_CLIENT_ID
AZURE_CLIENT_SECRET
AZURE_SUBSCRIPTION_ID
```
Crabbox-specific overrides:
```text
CRABBOX_AZURE_SUBSCRIPTION_ID
CRABBOX_AZURE_TENANT_ID
CRABBOX_AZURE_CLIENT_ID
CRABBOX_AZURE_LOCATION
CRABBOX_AZURE_RESOURCE_GROUP
CRABBOX_AZURE_IMAGE
CRABBOX_AZURE_VNET
CRABBOX_AZURE_SUBNET
CRABBOX_AZURE_NSG
CRABBOX_AZURE_SSH_CIDRS
```
The service principal needs the
[Contributor](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#contributor)
role on the target resource group (or subscription, if you want Crabbox to
create the resource group on first use).
Brokered Azure uses `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`,
`AZURE_CLIENT_SECRET`, and `AZURE_SUBSCRIPTION_ID` on the Worker. Operators
own the shared infra settings through `CRABBOX_AZURE_*`. Lease requests may
override only `azureLocation` and `azureImage`.
## Shared Infra
The first acquire in an empty subscription creates:
- a resource group (default `crabbox-leases`);
- a virtual network and subnet (`10.42.0.0/16` / `10.42.0.0/24`);
- a network security group with SSH rules derived from `azure.sshCIDRs`,
the configured SSH port, and fallback ports.
These resources are created with `createOrUpdate` and reused across leases.
Per-lease provisioning creates only the public IP, NIC, VM, and OS disk.
Azure pricing is not hardcoded. Use `CRABBOX_COST_RATES_JSON` for exact
Azure cost guardrails.
## Desktop
Azure desktop leases use the standard Linux VNC path: Xvfb, a lightweight
desktop session, x11vnc bound to `127.0.0.1:5900`, and an SSH local tunnel
created by `crabbox vnc`. Azure native Windows currently supports SSH, sync,
and run only. Use AWS for managed Windows desktop or Windows WSL2.
## Cleanup
Direct cleanup is best-effort through Crabbox lease tags. `crabbox cleanup
--provider azure` enumerates VMs in the configured resource group, skips
kept or unexpired leases, and cascade-deletes expired ones. The shared
resource group, vnet, subnet, and NSG are preserved.
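
The selection step of direct cleanup can be sketched as a filter over tagged VMs. The tag names (`keep`, `expires_at`), the ISO timestamp format, and the function name are assumptions for illustration; the real lease tags may differ.

```python
from datetime import datetime, timezone


def expired_leases(vms: list[dict], now: datetime) -> list[str]:
    """Return names of VMs that are neither kept nor still unexpired."""
    doomed = []
    for vm in vms:
        tags = vm["tags"]
        if tags.get("keep") == "true":
            continue  # kept leases are never swept
        if datetime.fromisoformat(tags["expires_at"]) > now:
            continue  # unexpired leases are skipped
        doomed.append(vm["name"])
    return doomed


now = datetime(2026, 5, 8, 12, 0, tzinfo=timezone.utc)
vms = [
    {"name": "vm-old", "tags": {"expires_at": "2026-05-08T09:00:00+00:00"}},
    {"name": "vm-kept", "tags": {"expires_at": "2026-05-08T09:00:00+00:00", "keep": "true"}},
    {"name": "vm-live", "tags": {"expires_at": "2026-05-08T13:00:00+00:00"}},
]
to_delete = expired_leases(vms, now)
```

Only the per-lease resources of the returned VMs would be deleted; the shared resource group, vnet, subnet, and NSG stay in place.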
Related docs:
- [Providers](providers.md)
- [Linux VNC](vnc-linux.md)
- [cleanup command](../commands/cleanup.md)

@ -0,0 +1,160 @@
# Blacksmith Testbox
Read when:
- choosing a provider/service page;
- choosing `provider: blacksmith-testbox`;
- changing Blacksmith CLI forwarding;
- deciding what Crabbox owns versus Blacksmith owns.
Crabbox can use Blacksmith Testboxes as the machine backend without using the Crabbox broker. Select it with `--provider blacksmith-testbox` for one command, or put `provider: blacksmith-testbox` in config when a repo or machine should use it by default.
Blacksmith is a delegated service integration. Crabbox does not provision,
bootstrap, sync, or expose VNC for the Testbox itself; it forwards to the
Blacksmith CLI and keeps local Crabbox ergonomics around that CLI.
## One-Liners
If you already have a Blacksmith Testbox ID, no Crabbox YAML is required:
```sh
crabbox run --provider blacksmith-testbox --id tbx_123 -- pnpm test
```
If Crabbox has already claimed a friendly slug for that Testbox, the slug works too:
```sh
crabbox run --provider blacksmith-testbox --id blue-lobster -- pnpm test:changed
crabbox status --provider blacksmith-testbox --id blue-lobster
crabbox stop --provider blacksmith-testbox blue-lobster
```
That path only needs Blacksmith auth and a reachable Testbox. Crabbox resolves the ID or slug, preserves the local repo claim, forwards the command to `blacksmith testbox run`, and prints `sync=delegated` in the final summary.
To create a fresh Testbox without YAML, provide the workflow details as flags:
```sh
crabbox warmup \
--provider blacksmith-testbox \
--blacksmith-org openclaw \
--blacksmith-workflow .github/workflows/ci-check-testbox.yml \
--blacksmith-job test \
--blacksmith-ref main \
--idle-timeout 90m
```
The same flags work for one-shot `run` when no `--id` is supplied:
```sh
crabbox run \
--provider blacksmith-testbox \
--blacksmith-workflow .github/workflows/ci-check-testbox.yml \
--blacksmith-job test \
-- pnpm test
```
YAML is a convenience, not a requirement, when the command line already tells Crabbox which backend and workflow to use. Environment variables such as `CRABBOX_BLACKSMITH_WORKFLOW`, `CRABBOX_BLACKSMITH_JOB`, `CRABBOX_BLACKSMITH_REF`, and `CRABBOX_BLACKSMITH_ORG` are also supported for shell defaults or scripts.
## Repo Config
Use repo config when every agent or maintainer should get the same Blacksmith defaults without repeating flags:
```yaml
provider: blacksmith-testbox
blacksmith:
org: openclaw
workflow: .github/workflows/ci-check-testbox.yml
job: test
ref: main
idleTimeout: 90m
```
For repos that already use Crabbox Actions hydration, `blacksmith.workflow`, `blacksmith.job`, and `blacksmith.ref` can be omitted when `actions.workflow`, `actions.job`, and `actions.ref` carry the same values.
`blacksmith` is accepted as a shorthand provider alias, but docs and scripts should prefer `blacksmith-testbox`.
## Forwarded Commands
Crabbox forwards lifecycle and run operations to the Blacksmith CLI:
```sh
blacksmith testbox warmup <workflow> --job <job> --ref <ref> --ssh-public-key <key> --idle-timeout <minutes>
blacksmith testbox run --id <tbx_id> --ssh-private-key <key> <command>
blacksmith testbox list
blacksmith testbox list --all
blacksmith testbox stop --id <tbx_id>
```
The wrapper is deliberately thin for warmup, run, and stop. `crabbox list` and
`crabbox status` normalize Blacksmith data into Crabbox's common list/status
views so rendering stays core-owned across providers. Status currently reads
`blacksmith testbox list --all` to build that view.
If `blacksmith testbox list --all` and `crabbox status --provider
blacksmith-testbox --id <tbx_id>` work but new warmups remain `queued` with no
IP, treat it as Blacksmith service, queue, org-limit, or billing pressure
instead of a Crabbox provisioning bug. Stop queued IDs you created and switch to
another provider until the Blacksmith account or service recovers.
`crabbox list --provider blacksmith-testbox --json` parses the Blacksmith table
output into compatibility JSON rows with the fields Crabbox can see. That parser is a
compatibility layer, not a Blacksmith API contract. If the Blacksmith CLI adds
native JSON output, Crabbox should switch to that and drop table parsing.
When coordinator auth is configured, `crabbox list --provider blacksmith-testbox`
also performs a best-effort sync of the current all-status Blacksmith list into
the portal lease table. Those muted rows are owner-scoped visibility records for
Blacksmith-owned Testboxes. When the row includes enough context, Crabbox queries
GitHub Actions and links the row to the closest workflow run plus the workflow
definition. The portal also renders the Actions status/conclusion, adds a
`stuck` filter for long-queued or long-running workflow owners, and offers a
copyable local `crabbox stop --provider blacksmith-testbox ...` command for
operator cleanup. Clicking the row opens a visibility-only detail page with
owner/org, Actions ownership, timestamps, boundary notes, and the same local stop
command. These rows are not Crabbox leases, do not expose box access actions, do not
heartbeat, do not participate in Crabbox expiry or cost control, and become stale
when a later sync does not see the runner.
## Auth
Auth stays with Blacksmith. Run `blacksmith auth login` before using this provider. Crabbox does not call the Crabbox login broker, does not send work to the Cloudflare coordinator, and does not hold Blacksmith credentials.
## Ownership Boundary
- Blacksmith owns provisioning, workflow hydration, remote workspace setup, sync, command transport, logs emitted by its CLI, and idle expiry.
- Crabbox owns local YAML/env config, per-Testbox SSH keys, friendly slugs, repo claims, provider selection, command quoting, and final timing summaries.
Because Blacksmith owns sync in this mode, Crabbox sync flags such as `--sync-only`, `--checksum`, `--force-sync-large`, and sync guardrails do not apply. `crabbox run` prints `sync=delegated` in the final summary.
`blacksmith.workflow` is required only when Crabbox needs to warm or acquire a Testbox. Reusing an existing `tbx_...` ID or slug does not need workflow config.
## Desktop And VNC
Blacksmith can run headless browser automation through its own runner setup, but
Crabbox does not currently expose `crabbox vnc`, `crabbox webvnc`, or managed
screenshots for `provider=blacksmith-testbox`. Blacksmith owns machine
connectivity in this mode. Crabbox should add VNC only after Blacksmith exposes
a stable SSH tunnel or connection-info API that preserves the same security
boundary as managed Crabbox leases.
## Choosing The Path
Use the one-liner when:
- you already have `tbx_...`;
- you are trying Blacksmith on one command;
- an agent can pass provider and workflow directly as flags.
Use repo YAML when:
- the repo should default to Blacksmith;
- multiple agents should share the same workflow/job/ref;
- you want `crabbox warmup` to work without extra env.
Related docs:
- [Providers](providers.md)
- [Interactive desktop and VNC](interactive-desktop-vnc.md)
- [run command](../commands/run.md)
- [warmup command](../commands/warmup.md)
- [Source map](../source-map.md)


@ -4,16 +4,76 @@ Read when:
- changing coordinator authentication;
- changing Cloudflare routes or Access policy;
- debugging bearer-token automation.
- debugging bearer-token automation or GitHub browser login.
The broker is exposed through Cloudflare Workers routes:
```text
https://crabbox-coordinator.steipete.workers.dev
https://crabbox.openclaw.ai
https://crabbox-access.openclaw.ai
https://crabbox-coordinator.services-91b.workers.dev
crabbox.clawd.bot/*
```
Normal automation uses a shared bearer token configured in the CLI and Worker. The CLI sends:
## Route Model
`https://crabbox.openclaw.ai` is the normal coordinator route. It is public at
the Cloudflare edge so `crabbox login` can complete a browser-based GitHub
OAuth flow. The Worker still requires Crabbox auth for every non-health route.
`https://crabbox-access.openclaw.ai` is the same Worker behind a Cloudflare
Access application. It exists for automation and as proof that Crabbox still works
when an operator wants an outer Cloudflare gate in front of the coordinator. Requests
to this route must pass two checks:
1. Cloudflare Access accepts the service-token headers before the request
reaches the Worker.
2. The Crabbox Worker accepts one of: the shared operator bearer token, the
separate admin bearer token for admin routes, or a signed Crabbox user token.
That means the Access service token is not a Crabbox admin token. It only gets
the HTTP request through Cloudflare Access. The Worker still decides what the
caller can do.
The current Access app is `Crabbox Coordinator Service Token` on
`crabbox-access.openclaw.ai`. Its policy is `non_identity` service-token auth,
scoped to the local Crabbox CLI service token rather than any token in the
account.
Normal users run `crabbox login`, which opens GitHub and stores a signed Crabbox user token. The coordinator needs a GitHub OAuth app with callback:
```text
https://crabbox.openclaw.ai/v1/auth/github/callback
```
Self-hosted coordinators need their own GitHub OAuth app. The callback URL on
that app must exactly match the public Worker URL plus
`/v1/auth/github/callback`, and the Worker `CRABBOX_PUBLIC_URL` must use that
same public origin.
Worker secrets:
```text
CRABBOX_GITHUB_CLIENT_ID
CRABBOX_GITHUB_CLIENT_SECRET
CRABBOX_GITHUB_ALLOWED_ORG
CRABBOX_GITHUB_ALLOWED_ORGS
CRABBOX_GITHUB_ALLOWED_TEAMS
CRABBOX_SESSION_SECRET
```
GitHub browser login requires active membership in the allowed GitHub org before
the coordinator mints a Crabbox user token. Set `CRABBOX_GITHUB_ALLOWED_ORG` or
comma-separated `CRABBOX_GITHUB_ALLOWED_ORGS`; if unset, the Worker falls back
to `CRABBOX_DEFAULT_ORG`, then `openclaw`. The OAuth app must request the
`read:user user:email read:org` scopes.
Set comma-separated `CRABBOX_GITHUB_ALLOWED_TEAMS` to require membership in at
least one team after org membership passes. Entries are GitHub team slugs. Use
`team-slug` for the selected org or `org/team-slug` when multiple orgs are
allowed.
Trusted automation can still use the shared operator bearer token configured in the CLI and Worker. Shared-token callers are normal automation, not admin callers. The CLI sends:
```text
Authorization: Bearer <token>
@ -21,6 +81,52 @@ X-Crabbox-Owner: <email>
X-Crabbox-Org: <org>
```
If the coordinator route is also protected by Cloudflare Access, the CLI can
send Access credentials before the Worker receives the request. Configure
`CRABBOX_ACCESS_CLIENT_ID` and `CRABBOX_ACCESS_CLIENT_SECRET` for a Cloudflare
Access service token, or `CRABBOX_ACCESS_TOKEN` to forward an already minted
Access JWT as `cf-access-token`. These Access credentials only satisfy
Cloudflare Access; the Worker still requires the Crabbox bearer token or a
signed Crabbox user token. When `CRABBOX_ACCESS_TEAM_DOMAIN` and
`CRABBOX_ACCESS_AUD` are configured, the Worker verifies
`Cf-Access-Jwt-Assertion` against Cloudflare Access certs before using any
Access identity. Raw `cf-access-authenticated-user-email` headers are ignored.
The live Access-protected route is `https://crabbox-access.openclaw.ai`. Its Access app is service-token-only (`non_identity`) and currently allows the local Crabbox CLI service token, so automated clients can prove both layers independently: first Cloudflare Access, then the Worker bearer or signed user token.
Local config shape:
```yaml
broker:
url: https://crabbox.openclaw.ai
token: <crabbox-shared-token-or-user-token>
adminToken: <crabbox-admin-token>
access:
clientId: <cloudflare-access-client-id>
clientSecret: <cloudflare-access-client-secret>
provider: aws
```
Set `CRABBOX_COORDINATOR=https://crabbox-access.openclaw.ai` when you want a
command to use the Access-protected route without changing the default public
broker URL. `crabbox config show` reports the Access credential state as
`access_auth=service-token` without printing secrets.
Useful proof commands:
```sh
curl -i https://crabbox-access.openclaw.ai/v1/health
CRABBOX_COORDINATOR=https://crabbox-access.openclaw.ai bin/crabbox doctor
CRABBOX_COORDINATOR=https://crabbox-access.openclaw.ai bin/crabbox whoami
CRABBOX_LIVE=1 CRABBOX_COORDINATOR=https://crabbox-access.openclaw.ai CRABBOX_BIN=bin/crabbox scripts/live-auth-smoke.sh
CRABBOX_LIVE=1 CRABBOX_LIVE_PROVIDERS=aws CRABBOX_COORDINATOR=https://crabbox-access.openclaw.ai CRABBOX_BIN=bin/crabbox scripts/live-smoke.sh
```
The first command should fail at Cloudflare Access without credentials. The auth
smoke should pass when local Access credentials, shared broker auth, and admin
broker auth are configured. The provider smoke additionally proves the same
route can lease, run, and release a real machine.
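For contrast with the unauthenticated `curl` above, here is a minimal sketch of the same health check with a Cloudflare Access service token attached. It assumes the standard Cloudflare service-token headers (`CF-Access-Client-Id`, `CF-Access-Client-Secret`) and reads the credentials from the `CRABBOX_ACCESS_*` variables described earlier. The function name `access_health_check` is hypothetical.

```sh
# Hypothetical helper: call the Access-protected health route with a
# Cloudflare Access service token taken from the environment.
access_health_check() {
  if [ -z "${CRABBOX_ACCESS_CLIENT_ID:-}" ] || [ -z "${CRABBOX_ACCESS_CLIENT_SECRET:-}" ]; then
    echo "set CRABBOX_ACCESS_CLIENT_ID and CRABBOX_ACCESS_CLIENT_SECRET" >&2
    return 1
  fi
  # Default to the Access-protected route named in this document.
  base_url=${CRABBOX_COORDINATOR:-https://crabbox-access.openclaw.ai}
  curl -sS -i \
    -H "CF-Access-Client-Id: $CRABBOX_ACCESS_CLIENT_ID" \
    -H "CF-Access-Client-Secret: $CRABBOX_ACCESS_CLIENT_SECRET" \
    "$base_url/v1/health"
}
```

This only exercises the outer Access layer. Any non-health route would still need the Crabbox bearer token or a signed user token on top of these headers.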
Owner selection for bearer-token requests:
```text
@ -30,9 +136,18 @@ GIT_COMMITTER_EMAIL
git config user.email
```
`CRABBOX_ORG` sets the org header. When Cloudflare Access identity is present, Access email wins over the CLI-provided owner.
`CRABBOX_ORG` sets the org header. Raw Cloudflare Access email headers do not
override CLI-provided owner/org headers. If the Worker can verify an Access JWT
and that JWT contains an email, that verified Access email becomes the bearer
request owner. Normal `crabbox login` requests use the signed GitHub token
identity.
The `crabbox.clawd.bot/*` route is protected by Cloudflare Access. The worker.dev route is useful for automation and direct health checks when configured with bearer auth.
GitHub user tokens are signed by the Worker and are not admin tokens. Admin
routes require the separate admin token. The `crabbox.openclaw.ai/*` route is
the canonical CLI and browser-login endpoint. `crabbox-access.openclaw.ai/*` is
the service-token-protected endpoint.
`https://crabbox-coordinator.services-91b.workers.dev` and `crabbox.clawd.bot/*`
are fallbacks.
Related docs:


@ -0,0 +1,199 @@
# Lease Capabilities
Read when:
- adding `--desktop`, `--browser`, or `--code` to a workflow;
- changing how Crabbox detects whether a lease can host a visible desktop;
- adding a new lease capability flag.
Lease capabilities are opt-in features that change what a managed runner can
do beyond running headless commands. They are a separate concept from the
provider feature set declared in `ProviderSpec.Features`: feature set says
"this provider can support a desktop"; lease capability says "this lease was
created with a desktop and exposes one right now".
## The Three Capabilities
```text
--desktop visible desktop with a loopback VNC server
--browser Chrome/Chromium installed and exported via $BROWSER and $CHROME_BIN
--code code-server bound to a loopback port for portal/code bridging
```
All three default to off. They have to be requested at lease creation time
(`crabbox warmup --desktop`) and reused afterwards. A lease created without a
capability cannot grow it later.
## Selection And Validation
Capability flags follow a two-step validation:
1. **Provider feature check.** When the user sets a capability flag,
`validateRequestedCapabilities` looks up the selected provider's
`Spec.Features` and rejects the request if the matching feature
(`FeatureDesktop`, `FeatureBrowser`, `FeatureCode`) is missing. Hetzner
Linux supports all three; Blacksmith Testbox supports none.
2. **Lease label check.** When reusing a lease (`--id`),
`enforceManagedLeaseCapabilities` checks the matching label
(`desktop=true`, `browser=true`, `code=true`) on the existing lease. If
the label is missing, Crabbox refuses with a hint to warm a new lease.
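The two checks above can be sketched in shell as follows. This is an illustration of the order of the checks only, not the Go implementation: `has_word` and `check_capability` are hypothetical names, and provider features and lease labels are modeled as space-separated strings.

```sh
# has_word LIST WORD: succeed when WORD appears in the space-separated LIST.
has_word() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_capability FEATURES LABELS CAP
# Step 1: the provider must list the feature.
# Step 2: the reused lease must carry the CAP=true label.
check_capability() {
  features=$1
  labels=$2
  cap=$3
  if ! has_word "$features" "$cap"; then
    echo "provider does not support $cap" >&2
    return 1
  fi
  if ! has_word "$labels" "$cap=true"; then
    echo "lease was not created with $cap; warm a new lease" >&2
    return 1
  fi
  echo "ok"
}
```

For example, `check_capability "desktop browser code" "desktop=true browser=false" browser` fails at the second step, which mirrors the "warm a new lease" hint.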
For static SSH targets, label enforcement is skipped because Crabbox does not
own the host. The capability is detected probe-by-probe instead - `--desktop`
on a static target probes the loopback VNC port; `--browser` on a static
target probes for Chrome and exports `BROWSER`/`CHROME_BIN` from what it
finds.
`--code` is currently restricted to managed Linux leases. The validator
rejects it for Windows, macOS, and static SSH.
## Desktop
When a managed Linux lease is created with `--desktop`, bootstrap installs:
- Xvfb (virtual framebuffer);
- a slim XFCE session;
- x11vnc bound to `127.0.0.1:5900`;
- a randomized VNC password at `/var/lib/crabbox/vnc.password`;
- screenshot tooling (`scrot`) and ffmpeg.
`crabbox vnc --id ...` opens an SSH tunnel to that loopback port. The user's
local VNC viewer talks through the tunnel and uses the password the CLI
fetches from `/var/lib/crabbox/vnc.password`. There is no public VNC port; the
loopback bind is the security boundary.
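To make the tunnel concrete, the sketch below prints a plain `ssh` port-forward roughly equivalent to what is described above. It is an assumption about the shape of the tunnel, not the command `crabbox vnc` actually runs. The function name `vnc_tunnel_command` is hypothetical, and the user, host, and SSH port are supplied by the caller.

```sh
# Hypothetical helper: print an ssh command that forwards a local port to the
# lease's loopback VNC server at 127.0.0.1:5900.
# Arguments: USER HOST [SSH_PORT=22] [LOCAL_PORT=5900]
vnc_tunnel_command() {
  user=$1
  host=$2
  ssh_port=${3:-22}
  local_port=${4:-5900}
  # -N: no remote command, forwarding only. -L: local port forward.
  printf 'ssh -N -p %s -L %s:127.0.0.1:5900 %s@%s\n' \
    "$ssh_port" "$local_port" "$user" "$host"
}
```

A VNC viewer pointed at `127.0.0.1:5900` on the local machine would then reach the remote loopback server through the tunnel.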
Static targets must already expose loopback VNC at `127.0.0.1:5900`. macOS
hosts can enable Screen Sharing; Windows hosts need a VNC server bound to
loopback (TightVNC works).
For per-OS detail and known limits, see:
- [Linux VNC](vnc-linux.md);
- [Windows VNC](vnc-windows.md);
- [macOS VNC](vnc-macos.md);
- [Interactive desktop and VNC](interactive-desktop-vnc.md).
When the run injects environment, Crabbox also sets:
```text
DISPLAY=:99
CRABBOX_DESKTOP=1
```
Tools that respect `DISPLAY` will draw onto the desktop the lease created.
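A script that runs inside the lease can guard on these variables before starting GUI tooling. The sketch below assumes only the two variables listed above. The function name `require_desktop` is hypothetical.

```sh
# Hypothetical guard for a script that runs inside a --desktop lease.
# Fails when Crabbox did not inject the desktop environment.
require_desktop() {
  if [ "${CRABBOX_DESKTOP:-}" != "1" ] || [ -z "${DISPLAY:-}" ]; then
    echo "no Crabbox desktop in this environment" >&2
    return 1
  fi
  echo "desktop available on DISPLAY=$DISPLAY"
}
```

Calling `require_desktop || exit 1` at the top of a test script gives a clear failure instead of an obscure X11 connection error.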
## Browser
`--browser` adds a usable browser to the lease without dragging in a full QA
test environment.
On managed Linux:
- Google Chrome stable when available;
- Chromium fallback;
- native addon build helpers (`build-essential`, `libgbm-dev`,
`libnss3-dev`, etc.) so dependency installs that compile against Chromium
succeed.
On static targets, Crabbox probes for an existing browser and reports an
error if none is found. `requestedCapabilityEnv` shells out to the host:
- macOS: `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`;
- Windows: `chrome.exe` or `msedge.exe` from PATH or the standard install
directories;
- Linux: `$BROWSER`, `$CHROME_BIN`, then `google-chrome`, `chromium`, or
`chromium-browser` from PATH.
The detected path is exported into the run as:
```text
BROWSER=/path/to/browser
CHROME_BIN=/path/to/browser
CRABBOX_BROWSER=1
```
Test runners that read `BROWSER` or `CHROME_BIN` (Vitest, Playwright, etc.)
work without extra plumbing. If a browser is requested but no binary is
found, the run aborts before the command starts.
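For scripts that launch the browser directly, a small lookup over the exported variables is enough. This is a sketch under the assumption that `CHROME_BIN` should be preferred over `BROWSER` when both are set. The function name `crabbox_browser_bin` is hypothetical.

```sh
# Hypothetical helper: print the browser binary Crabbox exported, if any.
# Prefers CHROME_BIN and falls back to BROWSER (ordering is an assumption).
crabbox_browser_bin() {
  bin=${CHROME_BIN:-${BROWSER:-}}
  if [ "${CRABBOX_BROWSER:-}" != "1" ] || [ -z "$bin" ]; then
    echo "no Crabbox-provided browser" >&2
    return 1
  fi
  printf '%s\n' "$bin"
}
```

For example, `"$(crabbox_browser_bin)" --version` inside a `--browser` run should print the version of whichever browser was detected.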
For browser QA where the remote service is sensitive to source IP (Discord
login, Slack workspace bootstrap, regional CDN behavior), pair `--browser`
with [mediated egress](egress.md). `crabbox egress start` opens a lease-local
proxy that exits to the internet through the operator machine, and `crabbox
desktop launch --egress <profile>` passes that proxy to Chrome.
## Code
`--code` provisions code-server on managed Linux leases:
- installs the binary at `/usr/local/bin/code-server`;
- binds to a loopback port (default `8080`);
- generates an auth token stored in coordinator state.
The portal and `crabbox code --id ...` open a code-server tab through the
authenticated portal bridge at `/portal/leases/{id-or-slug}/code/`. The bridge
proxies HTTP and WebSocket traffic to the loopback port; the code-server
auth token is injected by the bridge so the user does not see it. There is no
public code-server port.
Code is managed-Linux-only because the bridge depends on the lease shape and
the cloud-init that prepares the binary. Windows, macOS, and static SSH are
intentionally not supported today.
## Capability Labels
Managed lease records carry capability labels so list, status, and detail
pages can render the capability matrix without re-probing the host:
```text
desktop=true|false
browser=true|false
code=true|false
```
`enforceManagedLeaseCapabilities` reads these labels to gate `--desktop`,
`--browser`, and `--code` on `--id` reuse paths. The labels are written when
the lease is created and never flipped on a live lease.
## Composing Capabilities
Capabilities are independent - any combination is allowed where the
provider supports them:
```sh
crabbox warmup --desktop # desktop only
crabbox warmup --desktop --browser # browser running on the desktop
crabbox warmup --desktop --browser --code # full interactive box
crabbox warmup --browser # headless browser, no VNC
crabbox warmup --code # editor-only Linux lease
```
Capability bootstrap adds installation time. A bare lease is the fastest to
warm; a lease with all three takes the longest. Use the lightest combination
that satisfies the workflow.
## Static Targets
For static SSH hosts, capability validation degrades to probe-based detection:
- `--desktop`: probe `127.0.0.1:5900` over SSH; fail with a clear error if
the port is not bound;
- `--browser`: probe for a browser binary using the OS-specific search list;
fail if none found;
- `--code` is rejected (managed Linux only).
This is intentional. Crabbox is not responsible for installing software on
operator-owned static hosts; if the box does not expose the capability, the
run should not silently fall back.
Related docs:
- [warmup command](../commands/warmup.md)
- [run command](../commands/run.md)
- [vnc command](../commands/vnc.md)
- [webvnc command](../commands/webvnc.md)
- [code command](../commands/code.md)
- [egress command](../commands/egress.md)
- [Interactive desktop and VNC](interactive-desktop-vnc.md)
- [Mediated egress](egress.md)
- [Browser portal](portal.md)


@ -0,0 +1,215 @@
# Capacity And Fallback
Read when:
- adding or changing machine classes;
- debugging "why did Crabbox pick this instance type?";
- working on AWS spot/on-demand fallback or Hetzner location fallback;
- configuring multi-region or multi-AZ capacity for AWS.
Crabbox cares about capacity in three ways:
1. **Class fallback** - the ordered list of provider types that satisfy a
class request.
2. **Market fallback** - AWS-specific Spot to On-Demand failover within a
class.
3. **Region/AZ routing** - where the broker tries to provision when capacity
is tight in a single zone.
Hetzner only deals with class fallback. AWS deals with all three. Static
SSH, Blacksmith, Daytona, and Islo do not have capacity fallback because
the operator or external service controls the underlying resources.
## Classes
Class names are provider-agnostic intent labels:
```text
standard typical CI lane
fast ~2x more cores than standard for parallel-friendly suites
large memory-heavy or many-process workloads
beast maximum capacity within the provider's burstable family
```
Each provider maps the four class names to an ordered list of concrete
instance types. The list is the fallback chain: try the first; if rejected,
try the second; and so on.
The full Hetzner and AWS class tables live in
[Providers](providers.md#hetzner-summary). That page also lists the AWS
Windows, Windows WSL2, and macOS class maps.
## When Class Fallback Triggers
Hetzner falls back when:
- the requested server type is unavailable in the configured location;
- the project quota rejects the request;
- the API returns a transient capacity error.
AWS falls back when:
- the instance type is rejected by capacity in the chosen Availability Zone;
- the account policy denies the type (e.g. quota = 0 vCPUs);
- the spot request is rejected by capacity.
Quota rejections are detected from the API error code rather than scraped
from the message string, so the fallback is deterministic. The next
candidate in the chain is tried until either one succeeds or the chain is
exhausted.
When the chain is exhausted, Crabbox returns exit code 4 (`no capacity`) and
the error includes `provisioningAttempts` that record which types were
tried, why each failed, and where (region/AZ for AWS). The same metadata is
attached to the failed lease record on the coordinator so operators can
inspect what went wrong without rerunning the workflow.
## Explicit Type Override
`--type c7a.16xlarge` and the matching `type:` config key skip the class
fallback chain and request that specific instance type. The contract is
"give me this exact type, not a fallback". If the provider rejects it,
Crabbox fails loudly with exit code 4 and does not silently choose a
different type.
Use `--type` when:
- you want deterministic capacity for benchmarks;
- you are pinning a specific generation for a known-bug workaround;
- you are debugging the capacity layer itself.
For everything else, prefer a class - the fallback chain handles transient
rejections without operator intervention.
## AWS Market Fallback
AWS supports two markets: `spot` and `on-demand`.
```yaml
capacity:
market: spot
fallback: on-demand-after-120s
```
`capacity.market: spot` requests Spot capacity first. `capacity.fallback:
on-demand-after-120s` falls back to On-Demand for the same instance type
when Spot fails to come up within 120 seconds. Set `fallback` to `none` (or
omit it) to never fall back to On-Demand.
Per-command overrides:
```sh
crabbox warmup --market spot
crabbox run --market on-demand -- pnpm test
```
The `--market` flag overrides `capacity.market` for one lease without
rewriting repo config. Use it when an account is temporarily out of Spot
quota or when Spot interruption rates spike.
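Where changing `capacity.fallback` is not an option, the override can be scripted. The wrapper below is a sketch that assumes a Spot rejection surfaces as exit code 4 (`no capacity`), as described for exhausted chains. The function name `run_spot_then_on_demand` is hypothetical.

```sh
# Hypothetical wrapper: try Spot first and retry once on On-Demand when
# Crabbox reports exit code 4 (no capacity). Other exit codes pass through.
run_spot_then_on_demand() {
  status=0
  crabbox run --market spot -- "$@" || status=$?
  if [ "$status" -eq 4 ]; then
    echo "spot had no capacity; retrying on-demand" >&2
    status=0
    crabbox run --market on-demand -- "$@" || status=$?
  fi
  return "$status"
}

# Usage (not executed here):
#   run_spot_then_on_demand pnpm test
```

Unlike `fallback: on-demand-after-120s`, this retries only after the Spot attempt has fully failed, so it is slower than the built-in fallback.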
## AWS Capacity Hints
The brokered AWS path uses Service Quotas and EC2 placement scoring to
preflight large requests:
```yaml
capacity:
hints: true
largeClasses:
- large
- beast
```
When `hints: true` and the class is in `largeClasses`:
- the broker calls Service Quotas to check applied Spot or On-Demand vCPU
limits;
- candidates that exceed quota are recorded as quota attempts and skipped;
- remaining candidates are scored with `GetSpotPlacementScores` (Spot mode)
to pick the most-available region/AZ.
The result is a single provisioning attempt that picks the best location
and skips known-rejected types instead of letting the chain stumble through
them sequentially.
Hints apply only on the brokered (Worker) path. Direct AWS mode still falls
back through the class chain but does not run quota or placement preflight.
## Region And Availability Zone Routing
```yaml
capacity:
regions:
- eu-west-1
- us-east-1
availabilityZones:
- eu-west-1a
- eu-west-1b
```
`regions` is the ordered list of AWS regions the broker considers when
multiple regions are configured. Single-region setups use `aws.region` and
leave `capacity.regions` empty; multi-region setups list every region the
broker may launch into.
`availabilityZones` narrows the per-region zone selection. The broker uses
Spot placement scoring across the listed AZs and picks the highest-scoring
zone that has capacity.
Regions are tried in order; AZs within a region are scored. If every AZ in
a region rejects the request, Crabbox advances to the next region.
## Fallback Strategies
```yaml
capacity:
strategy: most-available
```
| Value | Behavior |
|:------|:---------|
| `most-available` (default) | use placement scoring or class chain order |
| `cheapest` | prefer types with the lowest live hourly price (when known) |
| `provider-default` | follow the provider's own placement defaults |
`cheapest` is currently honored on the brokered AWS path that has live
pricing. Hetzner does not differentiate strategies because its server-type
prices are consistent across locations.
## Direct Mode Differences
Direct provider mode (no coordinator) supports class fallback but has no
quota preflight, no placement score, no `provisioningAttempts` metadata, and
no central history. Direct AWS still respects `--market` and the `fallback`
config key, so spot-to-on-demand failover works locally - just without the
diagnostic richness the broker provides.
If a direct AWS run exits with code 4, run the same command through the
broker once to get structured `provisioningAttempts` evidence; then go back
to direct mode for the rest of the iteration loop.
## Failure Surface
Capacity failures map to:
```text
exit 4 no capacity every candidate in the chain was rejected
exit 5 provisioning failed a candidate was accepted but never reached SSH
exit 8 lease expired long warmup exceeded the configured TTL before SSH
```
The accompanying error message names the chain, the markets that were
tried, and (for brokered runs) `provisioningAttempts` you can inspect with:
```sh
crabbox history --lease cbx_...
```
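In CI wrappers it can help to turn these codes into readable messages. The function below is a minimal sketch that covers only the three codes listed above. The name `describe_capacity_exit` is hypothetical.

```sh
# Hypothetical helper: map the capacity-related exit codes to their meaning.
describe_capacity_exit() {
  case "$1" in
    4) echo "no capacity: every candidate in the chain was rejected" ;;
    5) echo "provisioning failed: a candidate was accepted but never reached SSH" ;;
    8) echo "lease expired: warmup exceeded the configured TTL before SSH" ;;
    *) echo "not a capacity failure (exit $1)" ;;
  esac
}

# Usage (not executed here):
#   crabbox warmup --market spot || describe_capacity_exit "$?" >&2
```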
Related docs:
- [Providers](providers.md)
- [AWS](../providers/aws.md)
- [Hetzner](../providers/hetzner.md)
- [Cost and usage](cost-usage.md)
- [Orchestrator](../orchestrator.md)
- [Operations](../operations.md)


@ -0,0 +1,399 @@
# Configuration
Read when:
- adding a new config key, env override, or flag;
- debugging "why is Crabbox using value X here?";
- onboarding a repo and choosing what belongs in repo config vs user config;
- reviewing the YAML schema that `crabbox config show` and `crabbox init`
emit.
Crabbox configuration is layered. The CLI loads values from five sources and
merges them in a deterministic order. Each source is optional - the binary
boots with sane defaults for everything.
## Precedence
```text
flags > env > repo-local crabbox.yaml/.crabbox.yaml > user config > defaults
```
Layers are applied from lowest to highest precedence: defaults first, then
user config, then repo config, then env vars, then flags. Every
override only replaces fields that are explicitly set; unset fields fall
through.
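For a single key, the merge reduces to "first explicitly set value wins", checked from highest precedence down. The sketch below models that with positional arguments. It is not the CLI's merge code, the name `first_set` is hypothetical, and the `30m`/`90m` values are placeholders.

```sh
# Hypothetical helper: print the first non-empty argument.
# Call it as: first_set FLAG ENV REPO USER DEFAULT
# An empty string stands for "not set at this layer".
first_set() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage (not executed here): no flag, env set to 90m, user config 30m.
#   first_set "" "90m" "" "30m" "30m"
```

In the usage line, the env layer supplies the value because the flag layer is empty, and the user-config and default values are never consulted.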
`crabbox config show` prints the merged configuration as the CLI sees it after
all five layers run. `--json` is stable enough to diff in scripts.
`crabbox config path` prints the user config file path so other tools can
edit it without parsing prose.
## File Locations
```text
macOS user: ~/Library/Application Support/crabbox/config.yaml
Linux user: ~/.config/crabbox/config.yaml
XDG override: $XDG_CONFIG_HOME/crabbox/config.yaml
repo: ./crabbox.yaml or ./.crabbox.yaml at repo root
explicit: $CRABBOX_CONFIG (any path)
```
If `CRABBOX_CONFIG` is set, it overrides the repo-local search and replaces
the effective repo config. User config is never replaced by the env override.
State that does not belong in either YAML file:
- live lease records (those are coordinator-owned);
- per-lease SSH private keys (those live under the user config dir but not in
`config.yaml`);
- provider secrets (those live in the broker environment, your shell env, or
a credential manager).
## YAML Schema
The full schema below merges what `crabbox init` emits and what advanced
operators set in user config. Most repos only need a small subset.
### Top-level
```yaml
broker:
url: https://crabbox.openclaw.ai
provider: aws
token: <signed-github-token-or-shared-token>
access:
clientId: <cloudflare-access-service-token-id>
clientSecret: <cloudflare-access-service-token-secret>
provider: aws # default provider when --provider is not set
target: linux # default target OS
windows:
mode: normal # normal or wsl2 when target=windows
profile: project-check
class: beast # standard | fast | large | beast
type: c7a.48xlarge # explicit provider type, overrides class fallback
network: auto # auto | tailscale | public
lease:
idleTimeout: 30m
ttl: 90m
```
### Capacity
```yaml
capacity:
market: spot # spot | on-demand
strategy: most-available
fallback: on-demand-after-120s
hints: true
regions:
- eu-west-1
- us-east-1
availabilityZones:
- eu-west-1a
- eu-west-1b
largeClasses:
- large
- beast
```
### AWS
```yaml
aws:
region: eu-west-1
ami: ami-0123456789abcdef0
securityGroupId: sg-0abcdef0123456789
subnetId: subnet-0abcdef0123456789
instanceProfile: crabbox-runner
rootGB: 400
sshCidrs:
- 203.0.113.0/24
macHostId: h-0123456789abcdef0
```
### Hetzner
Hetzner credentials and image come from broker-side config. Repos do not need
a `hetzner:` block unless they pin a class or location.
### Static SSH
```yaml
provider: ssh
target: macos
static:
host: mac-studio.local
user: steipete
port: "22"
workRoot: /Users/steipete/crabbox
```
### Blacksmith Testbox
```yaml
provider: blacksmith-testbox
blacksmith:
org: openclaw
workflow: .github/workflows/ci-check-testbox.yml
job: test
ref: main
idleTimeout: 90m
debug: false
```
### Daytona
```yaml
provider: daytona
daytona:
snapshot: openclaw-crabbox
apiKey: <daytona-api-key> # prefer DAYTONA_API_KEY env
```
### Sync
```yaml
sync:
delete: true
checksum: false
gitSeed: true
fingerprint: true
baseRef: main
timeout: 15m
warnFiles: 50000
warnBytes: 5368709120
failFiles: 150000
failBytes: 21474836480
allowLarge: false
exclude:
- node_modules
- .turbo
- dist
```
A `.crabboxignore` file at the repo root appends to `sync.exclude`. See
[Sync](sync.md) for the matcher rules.
### Env Forwarding
```yaml
env:
allow:
- CI
- NODE_OPTIONS
- PROJECT_*
```
`env.allow` is name-based and supports trailing wildcards. Crabbox forwards
matching local env vars to the remote command. Secrets do not belong in
`env.allow`; pass them through provider-side mechanisms.
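The matching rule can be sketched as follows, assuming entries are either exact names or a prefix followed by a single trailing `*`. This is an illustration, not the CLI's matcher, and the function name `env_allowed` is hypothetical.

```sh
# Hypothetical helper: succeed when NAME matches one of the allow patterns.
# Call it as: env_allowed NAME PATTERN...
# Patterns are plain names or names with a trailing "*", as in env.allow.
env_allowed() {
  name=$1
  shift
  for pattern in "$@"; do
    case "$pattern" in
      *\*)
        # Trailing wildcard: compare against the prefix before the "*".
        prefix=${pattern%\*}
        case "$name" in
          "$prefix"*) return 0 ;;
        esac
        ;;
      *)
        # Exact name match.
        if [ "$name" = "$pattern" ]; then
          return 0
        fi
        ;;
    esac
  done
  return 1
}
```

With the allowlist above, `env_allowed PROJECT_NAME CI NODE_OPTIONS 'PROJECT_*'` succeeds, while `env_allowed AWS_SECRET_ACCESS_KEY CI NODE_OPTIONS 'PROJECT_*'` fails.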
### Actions
```yaml
actions:
workflow: .github/workflows/crabbox.yml
job: test
ref: main
fields:
- crabbox_docker_cache=true
runnerLabels:
- crabbox
ephemeral: true
runnerVersion: latest
```
### Cache
```yaml
cache:
pnpm: true
npm: true
docker: true
git: true
maxGB: 80
purgeOnRelease: false
```
### Results
```yaml
results:
junit:
- junit.xml
- reports/junit.xml
```
### SSH
```yaml
ssh:
key: ~/.ssh/id_ed25519
user: crabbox
port: "2222"
fallbackPorts:
- "22"
```
### Tailscale
```yaml
tailscale:
enabled: false
tags:
- tag:crabbox
hostnameTemplate: crabbox-{slug}
authKeyEnv: CRABBOX_TAILSCALE_AUTH_KEY
exitNode: ""
exitNodeAllowLanAccess: false
```
### Mediated Egress
Mediated egress is a browser/app QA feature where a lease exits to the internet
through an operator machine over the Cloudflare Worker mediator. The first
implementation is opt-in and profile-based.
```yaml
egress:
enabled: false
listen: 127.0.0.1:3128
browserProxy: true
profiles:
discord:
allow:
- discord.com
- "*.discord.com"
- discordcdn.com
- "*.discordcdn.com"
slack:
allow:
- slack.com
- "*.slack.com"
- slack-edge.com
- "*.slack-edge.com"
```
See [Mediated egress](egress.md) for the design, security model, and command
surface. The current CLI ships built-in `discord` and `slack` profiles; the
YAML shape is the intended config surface for making those profiles
user-configurable.
## Profiles
Profiles are named bundles of config that get applied as a layer on top of
user/repo config. They live under a `profiles:` map and are selected by
`--profile` or `profile:` in repo config.
```yaml
profiles:
project-check:
class: beast
sync:
baseRef: main
env:
allow:
- PROJECT_*
smoke:
class: standard
lease:
ttl: 30m
```
Use profiles when one repo has multiple test lanes with different machine
classes, sync rules, or env allowlists. A repo without profiles never needs
the block.
## Machine Classes
A machine class is a provider-agnostic name for "standard", "fast", "large",
or "beast" capacity. Each provider maps the class to a list of concrete
instance/server types and falls back through the list when the first
candidate cannot be provisioned.
| Class | Intent |
|:------|:-------|
| `standard` | typical CI lane |
| `fast` | ~2x more cores than standard for parallel-friendly suites |
| `large` | memory-heavy or many-process workloads |
| `beast` | maximum capacity within the provider's burstable family |
Class-to-type mappings live in [Providers](providers.md). When you set
`type:`, that exact provider type wins and the class is ignored. The
`--type` and `type:` paths intentionally do not fall back; they fail loudly
if the provider rejects the type.
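The difference between class fallback and an explicit type can be sketched as follows. This is a minimal Python illustration, not Crabbox's code: `CLASS_TYPES`, the `type-*` names, `ProvisionError`, and `fake_create` are all hypothetical stand-ins, and the real class-to-type mappings are provider-specific.
```python
class ProvisionError(Exception):
    """Raised when a provider cannot create the requested type."""


# Hypothetical class-to-type table; the real mappings are per provider.
CLASS_TYPES = {
    "standard": ["type-a", "type-b"],
    "beast": ["type-x", "type-y", "type-z"],
}


def provision(machine_class, explicit_type, try_create):
    """try_create(type_name) returns a server id or raises ProvisionError."""
    if explicit_type:
        # An explicit type wins over the class and never falls back.
        return try_create(explicit_type)
    errors = []
    for candidate in CLASS_TYPES[machine_class]:
        try:
            return try_create(candidate)
        except ProvisionError as exc:
            errors.append(f"{candidate}: {exc}")
    raise ProvisionError("all candidates failed: " + "; ".join(errors))


def fake_create(type_name):
    # Stand-in for a provider API call; pretends "type-x" is out of capacity.
    if type_name == "type-x":
        raise ProvisionError("no capacity")
    return f"srv-{type_name}"
```
With `fake_create`, `provision("beast", None, fake_create)` falls through to the second candidate, while `provision("beast", "type-x", fake_create)` raises immediately.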
## Environment Variables
Every YAML key has a `CRABBOX_*` env override. The full list is in
[CLI](../cli.md#environment-variables). Common ones:
```text
CRABBOX_COORDINATOR
CRABBOX_COORDINATOR_TOKEN
CRABBOX_PROVIDER
CRABBOX_TARGET
CRABBOX_PROFILE
CRABBOX_DEFAULT_CLASS
CRABBOX_IDLE_TIMEOUT
CRABBOX_TTL
CRABBOX_NETWORK
CRABBOX_OWNER
CRABBOX_ORG
```
Provider credentials live outside the Crabbox env namespace because they are
provider-native:
```text
HCLOUD_TOKEN / HETZNER_TOKEN
AWS_PROFILE / AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN
DAYTONA_API_KEY / DAYTONA_JWT_TOKEN
BLACKSMITH_* (read by the Blacksmith CLI)
ISLO_API_KEY (read by the Islo SDK)
```
## What Belongs Where
| Setting | User config | Repo config | Profile | Notes |
|:--------|:------------|:------------|:--------|:------|
| `broker.url` and `broker.token` | yes | no | no | Per-machine identity. |
| `provider`, `class`, `type` | optional default | yes | yes | Per-repo defaults; profiles for lanes. |
| `sync.exclude`, `sync.fingerprint`, `sync.baseRef` | no | yes | yes | Lives with the repo. |
| `env.allow` | no | yes | yes | Repo decides what is safe to forward. |
| Per-user SSH key path | yes | no | no | Personal preference. |
| `aws.region`, `aws.ami` | optional | yes | yes | Repos can pin region. |
| Tailscale tags and template | yes | yes | yes | Both layers can set this. |
| Profiles | yes | yes | n/a | Either layer can define profiles. |
The rule of thumb: anything collaborators should inherit when they clone the
repo goes in repo config; anything tied to one operator's machine goes in user
config.
## Validation
The CLI validates config eagerly:
- `parseNetworkMode` rejects `--network` values outside `auto|tailscale|public`;
- `validateNetworkConfig` requires `tailscale.tags` when `tailscale.enabled`
is true and rejects Tailscale on Blacksmith and static providers;
- `validateRequestedCapabilities` rejects `--desktop`, `--browser`, or
`--code` for providers whose `Spec.Features` does not list the matching
feature flag;
- `crabbox doctor` runs a richer set of checks against config, network
reachability, and SSH keys.
When validation fails, `crabbox` exits with code 2 and a message that names
the offending field.
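A minimal sketch of the eager validation described above, written in Python for illustration only. The function names loosely mirror `parseNetworkMode` and `validateNetworkConfig`, but the bodies, the `ConfigError` type, and the provider identifiers `blacksmith` and `static` are assumptions rather than the CLI's actual code.
```python
VALID_NETWORK_MODES = ("auto", "tailscale", "public")


class ConfigError(ValueError):
    """Carries the offending field so the message can name it."""

    def __init__(self, field, message):
        super().__init__(f"{field}: {message}")
        self.field = field


def parse_network_mode(value):
    if value not in VALID_NETWORK_MODES:
        allowed = "|".join(VALID_NETWORK_MODES)
        raise ConfigError("network", f"must be one of {allowed}, got {value!r}")
    return value


def validate_network_config(provider, tailscale):
    if not tailscale.get("enabled"):
        return
    if not tailscale.get("tags"):
        raise ConfigError("tailscale.tags", "required when tailscale.enabled is true")
    if provider in ("blacksmith", "static"):  # assumed provider identifiers
        raise ConfigError("tailscale.enabled", f"not supported on provider {provider!r}")


def run_validation(checks):
    """Run zero-argument checks; return (exit_code, message), 2 on first failure."""
    for check in checks:
        try:
            check()
        except ConfigError as exc:
            return 2, str(exc)
    return 0, ""
```
The `(2, message)` return value stands in for the process exit code and the field-naming error message.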
Related docs:
- [CLI](../cli.md)
- [config command](../commands/config.md)
- [doctor command](../commands/doctor.md)
- [Sync](sync.md)
- [Providers](providers.md)
- [Capacity and fallback](capacity-fallback.md)
- [Network and reachability](network.md)


@ -10,10 +10,11 @@ The coordinator is the Cloudflare Worker plus Fleet Durable Object. Normal Crabb
Responsibilities:
- authenticate broker requests with the shared token and Cloudflare Access context when present;
- authenticate broker requests with signed GitHub user tokens, the shared operator token, or the separate admin token, with optional verified Cloudflare Access context on protected fallback routes;
- serialize fleet state in one Durable Object;
- create, heartbeat, release, expire, and look up leases;
- own provider credentials;
- own artifact storage credentials and mint scoped artifact upload URLs;
- create and delete provider resources;
- list the pool;
- enforce cost and active-lease guardrails;
@ -35,12 +36,64 @@ POST /v1/runs
GET /v1/runs/{run-id}
GET /v1/runs/{run-id}/logs
POST /v1/runs/{run-id}/finish
POST /v1/artifacts/uploads
GET /v1/runners
POST /v1/runners/sync
GET /v1/usage
GET /v1/admin/leases
POST /v1/admin/leases/{id-or-slug}/release
POST /v1/admin/leases/{id-or-slug}/delete
```
Browser portal surface:
```text
GET /portal
GET /portal/leases/{id-or-slug}
POST /portal/leases/{id-or-slug}/release
GET /portal/leases/{id-or-slug}/vnc
GET /portal/leases/{id-or-slug}/code/
GET /portal/runs/{run-id}
GET /portal/runs/{run-id}/logs
GET /portal/runs/{run-id}/events
GET /portal/runners/{provider}/{runner-id}
```
`/portal` renders a searchable/paginated/sortable lease data grid with compact
provider/target badges, icon-only access capabilities, relative time cells,
dense rows, sticky column headers, and active, ended, provider, target, and all
filters. Normal browser sessions are owner/org scoped. Admin/operator sessions
can also see non-owned runner leases, with `mine` and `system` filters so
Blacksmith/Testbox-style coordinator leases are visible without leaking them to
normal users. It defaults to active leases when any are active, and falls back to
all visible leases when the active list is empty. External runner rows, currently
Blacksmith Testboxes synced from the CLI's current all-status list, render in the
same grid as muted, disabled rows with search, pagination, status/provider
filters, inferred GitHub Actions run/workflow links and status badges when
available, `stuck` markers for long-queued or long-running Actions owners, a
copyable local stop command, and stale markers when the next sync no longer sees
a previously visible runner. Clicking an external runner opens
`/portal/runners/{provider}/{runner-id}`, a visibility-only detail page with
owner/org, Actions ownership, lifecycle timestamps, boundary notes, and the local
stop command.
`/portal/leases/{id-or-slug}` is the authenticated lease detail page. It shows
the lease state, bridge status, compact provider/target badges, latest Linux
telemetry, access-panel copy commands for `ssh`, `run`, WebVNC, and code, a
viewport-fitted recent runs grid with state filters, and a stop action for the
visible lease. When multiple telemetry samples are present, the detail page
adds load, memory, and disk sparklines plus stale/high-resource status pills.
Portal run links mirror the `/v1/runs/...` resources but use the browser
session cookie, so users can inspect logs and events without copying a bearer
token into the browser. The run detail page at `/portal/runs/{run-id}` renders
the command, owner, lease, provider metadata, exit status, JUnit summary when
present, a searchable/paginated event table with event-type filters, and a
copyable retained log tail. Longer Linux runs include bounded load, memory, and
disk trend lines collected from run telemetry samples; `/logs` and `/events`
remain raw/plain resources for copying and automation.
GitHub browser-login tokens are owner/org scoped for lease, run, log, and usage routes. Shared-token admin auth is required for `GET /v1/pool`, admin lease routes, and fleet-wide usage/listing.
Lease responses include the canonical `cbx_...` ID, friendly slug when present, provider metadata, owner/org, `createdAt`, `lastTouchedAt`, `idleTimeoutSeconds`, `ttlSeconds`, and computed `expiresAt`. Heartbeat is a touch and can update idle timeout only when the request explicitly sends `idleTimeoutSeconds`.
The CLI owns local config, per-lease SSH keys, SSH readiness, sync, command execution, output streaming, and local fallback handling.


@ -34,10 +34,11 @@ CRABBOX_DEFAULT_ORG
Identity for usage:
- Cloudflare Access email wins when present;
- bearer-token CLI requests send `X-Crabbox-Owner`;
- signed GitHub login tokens carry owner/org identity;
- shared bearer-token CLI requests send `X-Crabbox-Owner`;
- `X-Crabbox-Owner` comes from `CRABBOX_OWNER`, Git email env, or `git config user.email`;
- `CRABBOX_ORG` sends `X-Crabbox-Org`.
- raw Cloudflare Access identity headers are ignored; only a verified Access JWT email can become the bearer-token owner.
`estimatedUSD` is elapsed runtime cost. `reservedUSD` is TTL worst-case cost reserved before provisioning. Provider extras such as static IP charges, egress, snapshots, taxes, credits, and discounts are not fully modeled.
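As a worked example of the two figures, here is a small Python sketch. It assumes both values are a simple hourly rate multiplied by time, which the text implies but does not spell out; the `runtime_cost_usd` helper and the $0.40/hour price are hypothetical.
```python
def runtime_cost_usd(hourly_usd, seconds):
    """Cost of a machine billed at hourly_usd for the given number of seconds."""
    return hourly_usd * seconds / 3600.0


hourly_usd = 0.40          # hypothetical provider price
ttl_seconds = 90 * 60      # lease TTL: the worst case reserved before provisioning
elapsed_seconds = 15 * 60  # runtime so far

reserved_usd = runtime_cost_usd(hourly_usd, ttl_seconds)
estimated_usd = runtime_cost_usd(hourly_usd, elapsed_seconds)
```
With these numbers the reservation is about $0.60 and the elapsed estimate about $0.10; neither includes the provider extras listed above.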

85
docs/features/daytona.md Normal file

@ -0,0 +1,85 @@
# Daytona
Read when:
- choosing `provider: daytona`;
- configuring Daytona auth, snapshots, or SSH access;
- reviewing Daytona provider behavior.
`provider: daytona` provisions Daytona sandboxes from snapshots. `run` and
`warmup` use Daytona's SDK/toolbox for workspace upload and command execution;
`ssh` mints a short-lived Daytona SSH token only when interactive shell access is
requested.
## Auth
Run Daytona's CLI login:
```sh
daytona login --api-key ...
```
Crabbox uses the active Daytona CLI profile when no explicit Daytona auth
environment variables are set.
Alternatively, set one of:
```sh
export DAYTONA_API_KEY=...
```
or:
```sh
export DAYTONA_JWT_TOKEN=...
export DAYTONA_ORGANIZATION_ID=...
```
`DAYTONA_ORGANIZATION_ID` is required when JWT auth is used. `DAYTONA_API_URL`
or `daytona.apiUrl` can override the default `https://app.daytona.io/api`.
Explicit environment or Crabbox config values override the Daytona CLI profile.
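The precedence above can be sketched in Python as follows. This is an illustration under assumptions, not the provider's code: the `resolve_daytona_auth` helper, the returned dict shape, and the `cli_profile` structure are hypothetical, and checking `DAYTONA_API_KEY` before `DAYTONA_JWT_TOKEN` when both are set is a guess.
```python
DEFAULT_API_URL = "https://app.daytona.io/api"


def resolve_daytona_auth(env, cli_profile=None):
    """Pick an auth source: explicit env values win over the Daytona CLI profile."""
    explicit_url = env.get("DAYTONA_API_URL")
    if env.get("DAYTONA_API_KEY"):
        return {"kind": "api-key", "apiUrl": explicit_url or DEFAULT_API_URL}
    if env.get("DAYTONA_JWT_TOKEN"):
        org = env.get("DAYTONA_ORGANIZATION_ID")
        if not org:
            raise ValueError("DAYTONA_ORGANIZATION_ID is required when JWT auth is used")
        return {
            "kind": "jwt",
            "apiUrl": explicit_url or DEFAULT_API_URL,
            "organizationId": org,
        }
    if cli_profile:
        # Hypothetical profile shape: a dict that may carry its own apiUrl.
        profile_url = cli_profile.get("apiUrl", DEFAULT_API_URL)
        return {"kind": "cli-profile", "apiUrl": explicit_url or profile_url}
    raise ValueError("no Daytona auth found: run `daytona login` or set DAYTONA_API_KEY")
```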
## Config
Daytona's first Crabbox integration is snapshot-first. The snapshot owns CPU,
memory, disk, and installed tooling. Crabbox does not expose Daytona resource
flags in this mode.
```yaml
provider: daytona
target: linux
daytona:
snapshot: crabbox-ready
target: ""
user: daytona
workRoot: /home/daytona/crabbox
sshGatewayHost: ssh.app.daytona.io # fallback when the API omits sshCommand
sshAccessMinutes: 30
```
Equivalent flags:
```sh
crabbox warmup --provider daytona --daytona-snapshot crabbox-ready
crabbox run --provider daytona --id <slug> -- pnpm test
crabbox stop --provider daytona <slug>
```
## Behavior
- `warmup` creates a Daytona sandbox from `daytona.snapshot`, waits for the
sandbox, records Crabbox labels, then prints a normal Crabbox lease ID and
slug.
- `run --id` resolves a Daytona sandbox, uploads a Crabbox manifest archive
through Daytona toolbox file APIs, extracts it in the sandbox, and executes the
command through Daytona toolbox process APIs.
- `list`, `status`, and `stop` use Daytona sandbox labels to find Crabbox-owned
sandboxes.
- `ssh` mints a fresh Daytona SSH token, parses the host and port returned by
Daytona's `sshCommand`, and redacts the token as `<token>` unless
`--show-secret` is used.
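The `ssh` bullet can be illustrated with a small Python sketch. The exact shape of Daytona's `sshCommand` is not given here, so the sketch assumes something like `ssh [-p PORT] TOKEN@HOST`; `parse_ssh_command` and `redact` are hypothetical helpers and the sample token is invented.
```python
import shlex


def parse_ssh_command(ssh_command, default_port=22):
    """Extract (token, host, port) from an assumed `ssh [-p PORT] TOKEN@HOST` string."""
    args = shlex.split(ssh_command)
    port = default_port
    target = None
    i = 1  # skip the leading "ssh"
    while i < len(args):
        if args[i] == "-p" and i + 1 < len(args):
            port = int(args[i + 1])
            i += 2
            continue
        if "@" in args[i]:
            target = args[i]
        i += 1
    if target is None:
        raise ValueError("no token@host target found in sshCommand")
    token, host = target.rsplit("@", 1)
    return token, host, port


def redact(text, token, show_secret=False):
    """Replace the token with <token> unless the caller asked to show it."""
    if show_secret or not token:
        return text
    return text.replace(token, "<token>")


command = "ssh -p 2222 tok_abc123@ssh.app.daytona.io"
token, host, port = parse_ssh_command(command)
printable = redact(command, token)
```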
Daytona is a hybrid backend: core rendering, lease labels, sync manifests, and
repo claim checks stay Crabbox-owned, while the actual `run` transport is
Daytona SDK/toolbox. Actions runner hydration is not supported for Daytona
warmup because it requires a normal long-lived SSH runner host.

172
docs/features/doctor.md Normal file

@ -0,0 +1,172 @@
# Doctor Checks
Read when:
- adding a new precheck before users run long workflows;
- debugging an unexpected `doctor` failure;
- deciding whether a check belongs in `doctor` or somewhere else.
`crabbox doctor` is the local preflight. It validates the things that have
silently broken commands in the past so users get an answer before they
spend ten minutes on a failed lease.
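The command is fast (under a second on a healthy machine), runs entirely from
the local machine, is non-destructive, and never talks to provider APIs that
might cost money. Its only remote calls are the coordinator reachability and
`whoami` probes described below.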
## Categories
Doctor groups checks under five categories:
```text
config config files load and parse, required keys are present
auth broker token is set, signed token is valid, identity resolves
network coordinator URL reachable, DNS works, SSH transport probes work
ssh SSH key path readable, key type acceptable, ssh-keygen on PATH
tools rsync, git, ssh, ssh-keygen present and executable
```
Each category emits one or more pass/fail/skip lines. Failures are listed
first; passes and skips follow in deterministic order so the output is
diffable across runs.
## What `config` Checks
- The user config file parses without error.
- The repo config (when present) parses without error.
- Provider name resolves through `ProviderFor`.
- Target OS is one of `linux`, `macos`, `windows`.
- Network mode is one of `auto`, `tailscale`, `public`.
- Tailscale config validates when `tailscale.enabled: true` (tags non-empty,
hostname template non-empty, exit-node-allow-lan-access requires an
exit node, target is `linux`, provider is not Blacksmith or static).
- Class is one of `standard`, `fast`, `large`, `beast` when set; explicit
`type:` values are accepted as-is.
## What `auth` Checks
- A broker URL is configured if the user expects coordinator mode.
- A broker token is present when the URL is configured.
- The signed token (when GitHub login was used) decodes and is not expired.
- Owner can be resolved from `CRABBOX_OWNER`, Git env, or
`git config user.email`.
- `whoami` succeeds against the configured coordinator with the stored
token.
When auth is missing, doctor prints `crabbox login` as the next step.
## What `network` Checks
- The coordinator URL resolves via DNS.
- The coordinator is reachable over HTTPS within a small timeout.
- When `--network tailscale` is configured, `tailscale status` reports a
joined client.
- SSH transport probes succeed for the primary port and fall back to the
configured fallback ports.
DNS is checked before HTTPS so a broken DNS responder does not look like a
broker outage.
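A sketch of that ordering in Python, for illustration only. The `check_coordinator` helper and its tuple result shape are hypothetical; reporting the HTTPS probe as `skip` after a DNS failure is an assumption, and a real check would likely treat HTTP error statuses more carefully than this one does.
```python
import socket
import urllib.request
from urllib.parse import urlsplit


def check_coordinator(url, resolve=socket.getaddrinfo, fetch=None, timeout=3.0):
    """Return (subject, status, detail) tuples; DNS is probed before HTTPS."""
    host = urlsplit(url).hostname
    results = []
    try:
        resolve(host, 443)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        results.append(("coordinator dns", "fail", str(exc)))
        results.append(("coordinator https", "skip", "dns lookup failed"))
        return results
    results.append(("coordinator dns", "ok", host))

    if fetch is None:
        def fetch(target):
            urllib.request.urlopen(target, timeout=timeout).close()
    try:
        fetch(url)
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        results.append(("coordinator https", "fail", str(exc)))
    else:
        results.append(("coordinator https", "ok", url))
    return results
```
The `resolve` and `fetch` parameters exist so the ordering can be exercised without touching the network.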
## What `ssh` Checks
- The configured SSH key path (`ssh.key` or `CRABBOX_SSH_KEY`) is readable
when set.
- The key file has a sensible permissions mode (warn on group/world
readable).
- `ssh-keygen` is on PATH so per-lease key generation works.
- The user's `~/.ssh/known_hosts` is writable (if it exists).
When `ssh.key` is unset, doctor skips the path validation - per-lease keys
do not need a global key.
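The key-path checks could look roughly like the following Python sketch (POSIX permissions assumed). `check_ssh_key` is a hypothetical helper, and the separate `warn` status is an assumption based on the "warn on group/world readable" note; doctor's real statuses are pass, fail, and skip.
```python
import os
import stat


def check_ssh_key(path):
    """Return (status, detail) for the configured SSH key path."""
    if not path:
        return ("skip", "ssh.key unset (per-lease keys will be used)")
    if not os.access(path, os.R_OK):
        return ("fail", f"{path} is not readable")
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:  # any group or world permission bit is set
        return ("warn", f"{path} has mode {mode:o}; consider chmod 600")
    return ("ok", path)
```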
## What `tools` Checks
- `git` is on PATH.
- `rsync` is on PATH.
- `ssh` is on PATH.
- `ssh-keygen` is on PATH.
The check is path-based, not version-based. Crabbox tolerates any reasonably
modern version of these tools.
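A path-based tool check is easy to sketch with Python's `shutil.which`. The `check_tools` helper, its tuple results, and the remediation wording are hypothetical; only the list of tools comes from the section above.
```python
import shutil

REQUIRED_TOOLS = ("git", "rsync", "ssh", "ssh-keygen")


def check_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return (tool, status, detail); presence on PATH is the only criterion."""
    results = []
    for tool in tools:
        path = which(tool)
        if path:
            results.append((tool, "ok", path))
        else:
            results.append((tool, "fail", f"install {tool} and make sure it is on PATH"))
    return results
```
No version is parsed, which matches the path-based intent; `which` is injectable so tests do not depend on the host machine.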
## What Doctor Does Not Do
Doctor stays local on purpose. It does not:
- start a real lease or provision a server;
- talk to AWS, Hetzner, Daytona, Islo, or any provider API;
- run `git ls-files` against the repo (that belongs in `crabbox sync-plan`);
- estimate costs;
- modify config or rotate keys.
Anything that costs money or has side effects belongs in a different
command. Doctor is for "before I run anything, is my machine sane?" and
should be safe to run from `pre-commit` hooks, agent boot, or CI smoke.
## Output Shape
```text
config:
ok user config: ~/.config/crabbox/config.yaml
ok repo config: ./.crabbox.yaml
ok provider: aws
ok target: linux
ok network: auto
auth:
ok broker: https://crabbox.openclaw.ai
ok owner: alex@example.com
ok org: openclaw
network:
ok coordinator dns
ok coordinator https
ssh:
ok ssh-keygen present
skip ssh.key unset (per-lease keys will be used)
tools:
ok git
ok rsync
ok ssh
ok ssh-keygen
```
Failures swap the leading `ok` for `fail` and add a remediation hint:
```text
auth:
fail broker token is missing - run `crabbox login`
```
Skips swap `ok` for `skip` and explain why the check did not run:
```text
network:
skip coordinator unconfigured (direct provider mode)
```
Exit code is `0` on full success, `2` on any failure. Skips do not change
the exit code.
## Adding A Check
Doctor checks live in `internal/cli/doctor.go`. Each check returns a
`doctorResult{ Status, Category, Subject, Detail, Remediation }`. The CLI
sorts results by category, then by subject, so output stays stable.
Rules for new checks:
- they must run in under ~100ms;
- they must not call out to a paid API or write any state;
- they must produce a `Remediation` string when they fail;
- they should `skip` (not `fail`) when the configuration genuinely does
not apply (e.g. SSH key check when `ssh.key` is unset).
Tests in `doctor_test.go` exercise the result struct and ordering. Add a
test for the new check that asserts the failure message and remediation
text so future refactors do not silently regress the user-facing output.
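To make the result shape and ordering concrete, here is a Python sketch. The real type is the Go `doctorResult` struct; this `DoctorResult` dataclass, the helper names, and the assumption that categories sort in the order they are listed above (rather than alphabetically) are illustrative only.
```python
from dataclasses import dataclass

# Assumed category order, taken from the category list in this document.
CATEGORY_ORDER = ["config", "auth", "network", "ssh", "tools"]


@dataclass(frozen=True)
class DoctorResult:
    status: str  # "ok", "fail", or "skip"
    category: str
    subject: str
    detail: str = ""
    remediation: str = ""


def sort_results(results):
    """Sort by category, then subject, so output stays stable across runs."""
    return sorted(results, key=lambda r: (CATEGORY_ORDER.index(r.category), r.subject))


def exit_code(results):
    """0 on full success, 2 on any failure; skips do not change the code."""
    return 2 if any(r.status == "fail" for r in results) else 0


def missing_remediation(results):
    """Failures without a remediation string break the rules for new checks."""
    return [r for r in results if r.status == "fail" and not r.remediation]
```
A test for a new check could assert on `sort_results`, `exit_code`, and an empty `missing_remediation` list in the same spirit as the Go tests.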
Related docs:
- [doctor command](../commands/doctor.md)
- [Configuration](configuration.md)
- [Network and reachability](network.md)
- [SSH keys](ssh-keys.md)
- [Source map](../source-map.md)

489
docs/features/egress.md Normal file

@ -0,0 +1,489 @@
# Mediated Egress
Read when:
- browser or app QA needs a lease to use the same public internet path as an
operator workstation;
- adding the `crabbox egress` command family;
- comparing mediated browser/app egress with Tailscale exit nodes, Cloudflare
Tunnel, or full-VM routing;
- wiring Mantis-style visual QA for Discord, Slack, or other web apps that
are sensitive to source IP, browser login, or regional routing.
Status: implemented as a CLI-first bridge. The shipped slice supports
`egress start`, `host`, `client`, `status`, and `stop`, plus browser launches
with `desktop launch --egress`.
## Goal
Some QA scenarios need the runner to look like it is browsing from the same
network as the human or agent driving the test. Discord and Slack are good
examples: login, bot verification, abuse heuristics, and regional behavior can
change when the browser comes from a fresh cloud IP.
The first Crabbox egress goal is:
```text
Chrome or an app inside a Crabbox lease
uses a local proxy inside the lease
and exits to the internet from the operator machine running Crabbox.
```
This is intentionally per-app/per-process egress. It should make browser QA and
Mantis scenarios reproducible without changing every route on the VM. Full
machine routing can be added later through a Linux exit node or a dedicated
gateway when a scenario truly needs all traffic to move.
## Non-Goals
Mediated egress is not:
- a public open proxy;
- a replacement for provider firewalls or SSH access controls;
- a transparent VM-wide VPN in the first implementation;
- a way for the Cloudflare Worker itself to become the internet egress point;
- a place to store browser login state, app credentials, or provider secrets.
The Cloudflare Worker is the mediator. The operator machine is the egress point.
## Existing Pieces
Crabbox already has two bridge models that are close to the desired shape:
- WebVNC: `crabbox webvnc` keeps an SSH tunnel to the lease VNC service and
connects a local bridge process to the coordinator with a one-use ticket. The
browser portal then talks to that bridge through the Worker Durable Object.
- Code portal: `crabbox code` starts a code-server process on the lease and
proxies HTTP/WebSocket traffic through a ticketed coordinator bridge.
Those bridges establish the important boundaries:
- the Worker owns authenticated routing, tickets, status, and cleanup;
- bridge agents connect outbound to the Worker;
- each bridge is tied to one lease and short-lived ticket material;
- the portal is not allowed to reach private runner services by itself.
Mediated egress should reuse that model instead of introducing an unrelated
proxy service.
## Architecture
Mediated egress has two long-running agents and one Worker Durable Object
session.
```text
Cloudflare Worker / Fleet Durable Object
+----------------------------------------+
| ticket auth, socket pairing, status, |
| allowlist metadata, cleanup, counters |
+-------------------+--------------------+
|
paired WebSocket streams over HTTPS
|
+------------------------------+------------------------------+
| |
+-------v-----------------+ +-------------v------+
| lease egress client | | host egress agent |
| runs inside the lease | | runs on operator |
| listens on 127.0.0.1 | | machine / gateway |
+-----------+-------------+ +-------------+------+
| |
| HTTP CONNECT / proxy | TCP
| |
+-----v------+ +------v-----+
| Chrome / | | internet |
| Slack app | | from host |
+------------+ +------------+
```
The lease side exposes a loopback proxy such as `127.0.0.1:3128`. Chrome or a
desktop app is launched with:
```sh
--proxy-server=http://127.0.0.1:3128
```
The host side opens the real outbound TCP connections. Remote services see the
operator machine's internet path, not the cloud provider's default egress IP.
## Setup And Traffic Flow
```text
Operator CLI
|
| crabbox egress start --id blue-lobster --profile discord --daemon
v
Resolve lease through coordinator
|
+-- if local coordinator is Access-protected:
| use --coordinator https://crabbox.openclaw.ai
| so the lease can connect without private Access credentials
|
v
Create shared egress session
|
+--> create client ticket
| |
| v
| SSH to lease
| |
| v
| install/run crabbox egress client
| |
| v
| listen on 127.0.0.1:3128 inside lease
|
+--> create host ticket
|
v
run local crabbox egress host
|
v
connect outbound to coordinator
Runtime browser request
|
| Chrome --proxy-server=http://127.0.0.1:3128
v
Lease-local proxy
|
| HTTP CONNECT host:443
v
Cloudflare Worker / Fleet Durable Object
|
| pair lease client + host agent by leaseID/sessionID
v
Host egress agent on operator machine
|
| enforce allowlist, open TCP connection
v
Internet service sees operator public IP
```
Teardown runs in the opposite direction: `crabbox egress stop` stops the local
host daemon and asks the lease to kill the remote client; releasing a lease also
clears coordinator-side egress sockets and session status.
## Command Shape
The CLI is explicit enough for debugging but ergonomic for the common
desktop-browser case.
Low-level commands:
```sh
crabbox egress host --id blue-lobster --profile discord
crabbox egress client --id blue-lobster --listen 127.0.0.1:3128
crabbox egress status --id blue-lobster
crabbox egress stop --id blue-lobster
```
Operator-friendly orchestration:
```sh
crabbox egress start --id blue-lobster --profile discord --daemon
crabbox desktop launch --id blue-lobster \
--browser \
--url https://discord.com/login \
--egress discord \
--webvnc \
--open
```
`egress start`:
1. resolve the lease;
2. create a host ticket and start the host bridge locally;
3. create a client ticket and start the lease-side proxy over SSH;
4. write the active proxy endpoint into lease-local state;
5. print status and cleanup commands.
Today the orchestrated `egress start` path is Linux-only because it installs a
Linux helper and starts it with POSIX shell commands. Non-Linux targets should
use manual target-specific setup until Crabbox grows native helper install
commands for those operating systems. If your coordinator needs Cloudflare
Access credentials, use a public coordinator route for `egress start`, or run
the low-level pieces manually with an explicit secret-handling plan.
`desktop launch --egress <profile>` passes the configured lease-local proxy to
the browser command. Start `egress start` first so something is listening on the
lease proxy port.
## Worker API
The coordinator exposes ticketed routes next to the WebVNC and code bridge
routes:
```text
POST /v1/leases/{leaseID}/egress/ticket
GET /v1/leases/{leaseID}/egress/host?ticket=...
GET /v1/leases/{leaseID}/egress/client?ticket=...
GET /v1/leases/{leaseID}/egress/status
```
The ticket request should include:
```json
{
"role": "host",
"profile": "discord",
"allow": ["discord.com", "*.discord.com"],
"sessionID": "egress_..."
}
```
The Worker tracks enough state to answer status and clean up stale bridges:
```text
leaseID
sessionID
owner/org
profile
allowlist
hostConnected
clientConnected
activeConnections
bytesIn
bytesOut
lastHostnames
createdAt
lastSeenAt
expiresAt
```
Like WebVNC/code, agent WebSocket upgrades should be accepted only after a
one-use ticket is consumed by the Fleet Durable Object. Cloudflare Access
service-token headers may get the request through the edge, but Crabbox ticket
auth still owns the bridge authorization.
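The one-use ticket idea can be sketched as an in-memory store in Python. This is not the Fleet Durable Object's code: `TicketStore`, the 60-second lifetime, and the token format are assumptions chosen to show consume-once semantics bound to lease and role.
```python
import secrets
import time


class TicketStore:
    """Issues short-lived tickets that can be consumed exactly once."""

    def __init__(self, ttl_seconds=60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._tickets = {}

    def issue(self, lease_id, role, session_id):
        ticket = secrets.token_urlsafe(32)
        expires_at = self._clock() + self._ttl
        self._tickets[ticket] = (lease_id, role, session_id, expires_at)
        return ticket

    def consume(self, ticket, lease_id, role):
        """Return the session id if the ticket is valid; the ticket is removed either way."""
        entry = self._tickets.pop(ticket, None)
        if entry is None:
            return None
        ticket_lease, ticket_role, session_id, expires_at = entry
        if self._clock() > expires_at:
            return None
        if ticket_lease != lease_id or ticket_role != role:
            return None
        return session_id
```
In this sketch a WebSocket upgrade would be accepted only when `consume` returns a session id; a replayed, expired, or mismatched ticket yields `None`.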
## Stream Protocol
WebVNC can forward one raw byte stream. Egress needs many concurrent TCP
connections because a browser opens several sockets at once.
The bridge protocol needs multiplexed streams:
```text
hello { role, sessionID, protocolVersion }
open { connId, host, port }
open_ok { connId }
data { connId, bytes }
close { connId }
error { connId, message }
stats { activeConnections, bytesIn, bytesOut }
```
The lease egress client parses HTTP proxy requests from Chrome. For `CONNECT
host:port`, it asks the host agent to open a TCP connection. For plain HTTP
absolute-form requests, it can either proxy them directly or translate them to
a stream to port 80.
The first implementation may use JSON control frames and base64 data chunks for
simplicity. The protocol should reserve a version field so a later binary frame
format can avoid base64 overhead without changing the CLI surface.
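A minimal Python sketch of the lease-side pieces: parsing a `CONNECT` request line and encoding frames as JSON with base64 payloads. The frame names and fields follow the list above, but the `type` key, the JSON layout, and `PROTOCOL_VERSION = 1` are assumptions, not the shipped wire format.
```python
import base64
import json

PROTOCOL_VERSION = 1  # hypothetical value for the reserved version field


def parse_connect(request_line):
    """Parse 'CONNECT host:port HTTP/1.1' into (host, port)."""
    parts = request_line.strip().split()
    if len(parts) != 3 or parts[0].upper() != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("CONNECT target must be host:port")
    return host.strip("[]"), int(port)  # strip brackets around IPv6 literals


def encode_frame(kind, **fields):
    """Serialize a control or data frame; raw bytes are base64-encoded."""
    if "bytes" in fields:
        fields["bytes"] = base64.b64encode(fields["bytes"]).decode("ascii")
    return json.dumps({"type": kind, **fields}, separators=(",", ":"))


def decode_frame(text):
    frame = json.loads(text)
    if "bytes" in frame:
        frame["bytes"] = base64.b64decode(frame["bytes"])
    return frame


host, port = parse_connect("CONNECT discord.com:443 HTTP/1.1")
hello = encode_frame("hello", role="client", sessionID="egress_example",
                     protocolVersion=PROTOCOL_VERSION)
open_frame = encode_frame("open", connId=1, host=host, port=port)
data_frame = encode_frame("data", connId=1, bytes=b"\x16\x03\x01")
```
Each browser socket gets its own `connId`, which is what lets many TCP connections share one WebSocket.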
## Security Model
Mediated egress must default closed.
Required guardrails:
- no listener bound to a public interface;
- one-use, short-lived tickets bound to lease, owner/org, role, and session;
- explicit domain allowlist or named profile;
- idle timeout and lease TTL cleanup;
- bounded active connections per session;
- bounded per-frame size;
- hostname logging only, not URLs or payload;
- no proxy passwords, tickets, or credentials in logs;
- host agent refuses destinations outside the allowlist;
- session closes when either side disconnects for longer than a short grace
period.
The host agent is powerful because it opens internet connections from the
operator network. It should show a clear startup summary before connecting:
```text
lease: blue-lobster
profile: discord
allowed: discord.com, *.discord.com, discordcdn.com, *.discordcdn.com
listening: none public; outbound websocket only
```
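Host-side allowlist enforcement might be sketched like this in Python. The matching semantics are an assumption: `*.example.com` is treated as "any subdomain, but not the apex", which is consistent with the profiles listing both forms; `host_allowed` is a hypothetical helper.
```python
def host_allowed(hostname, allow):
    """Return True when hostname matches an exact entry or a '*.' subdomain wildcard."""
    host = hostname.lower().rstrip(".")
    for pattern in allow:
        pattern = pattern.lower()
        if pattern.startswith("*."):
            suffix = pattern[1:]  # ".discord.com"
            if host.endswith(suffix) and len(host) > len(suffix):
                return True
        elif host == pattern:
            return True
    return False


DISCORD_ALLOW = ["discord.com", "*.discord.com", "discordcdn.com", "*.discordcdn.com"]
```
Matching on the leading dot keeps look-alike hosts such as `evildiscord.com` out, and anything not matched is refused, which keeps the default closed.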
## Profiles
Profiles keep common browser QA scenarios repeatable without turning egress
into a blanket tunnel.
Intended config shape:
```yaml
egress:
enabled: false
listen: 127.0.0.1:3128
browserProxy: true
profiles:
discord:
allow:
- discord.com
- "*.discord.com"
- discordcdn.com
- "*.discordcdn.com"
- hcaptcha.com
- "*.hcaptcha.com"
slack:
allow:
- slack.com
- "*.slack.com"
- slack-edge.com
- "*.slack-edge.com"
```
Profiles should be merged like other config: flags over env over repo config
over user config over defaults. Repo config can define scenario profiles; user
config can define local preferences such as the default listen address.
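The precedence chain can be illustrated with a short Python sketch. The `resolve` helper and the sample layer contents are hypothetical; the only thing taken from the text is the order flags, env, repo config, user config, defaults.
```python
def resolve(key, *layers, default=None):
    """Return the first set value for key; layers are ordered highest precedence first."""
    for layer in layers:
        value = layer.get(key)
        if value is not None:
            return value
    return default


# Illustrative layers for the egress listen address.
flags = {"listen": None}               # flag not passed on the command line
env = {}                               # no env override set
repo = {"listen": "127.0.0.1:3129"}    # repo config
user = {"listen": "127.0.0.1:8888"}    # user config
defaults = {"listen": "127.0.0.1:3128"}

listen = resolve("listen", flags, env, repo, user, defaults)
```
Here the repo value wins because neither a flag nor an env override is set.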
## Browser And Desktop Integration
`--browser` leases already install a browser wrapper exposed through `BROWSER`
and `CHROME_BIN`. Egress should integrate at that seam.
Planned behavior:
- `crabbox egress start` launches the lease-local proxy at
`127.0.0.1:3128` by default;
- `crabbox desktop launch --egress <profile>` passes
`--proxy-server=http://127.0.0.1:<port>` when launching Chrome/Chromium;
- a later `crabbox run --egress <profile>` may opt command processes into
`HTTP_PROXY`, `HTTPS_PROXY`, and `ALL_PROXY`, but should never do this by
default for every run.
This keeps browser QA easy while avoiding surprising build or package-manager
traffic through a workstation.
## Portal Integration
The portal lease detail page shows egress status when a session exists:
- profile and allowlist;
- host/client connected state;
- copyable start/status/stop commands.
The portal should not expose raw proxy URLs or ticket values. It should treat
egress like WebVNC/code: a bridge that exists only while local agents are
running.
Connection counts, byte counters, and recent hostnames are still CLI/API-only
follow-ups once the bridge reports structured runtime stats.
## Comparison With Alternatives
### Tailscale Exit Node
A Tailscale exit node can route the whole VM through another machine. That is
useful when every process must share the same egress path. It is also more
fragile: OS forwarding, NAT, ACLs, and route approval all have to line up.
Use Tailscale exit nodes later for full-machine scenarios. Use mediated egress
first for browser/app QA.
### Cloudflare Tunnel TCP
A named Cloudflare Tunnel plus Access can expose private TCP services without a
public listener. It is useful as an operational building block, but it still
needs host and lease processes plus lifecycle management. Keeping the first
implementation inside the existing Worker/Durable Object bridge gives Crabbox
one auth, status, and cleanup model.
### Cloudflare Worker As Egress
Workers should not be the source of browser internet traffic for this feature.
The goal is not "use Cloudflare's IP"; it is "use the operator machine's
internet". The Worker mediates the two sides.
## Implementation Plan
### Phase 1: CLI-Only Mediated Proxy
Done:
- egress ticket and status routes in the Fleet Durable Object;
- host/client WebSocket bridge attachments;
- multiplexed stream protocol with connection IDs;
- `crabbox egress host`, `client`, `start`, `status`, and `stop`;
- domain allowlist enforcement on the host side;
- tests for ticket use, allowlist rejection, request parsing, and status
reporting.
### Phase 2: Browser Wiring
- Add `desktop launch --egress <profile>`. Done.
- Add optional browser wrapper support for `CRABBOX_BROWSER_PROXY_SERVER`.
- Add lease-local egress state beyond the active proxy port.
- Add a live smoke that launches a browser through the proxy and proves the
observed public IP matches the host agent path.
### Phase 3: Portal And Daemon UX
Done:
- portal egress status on the lease detail page;
- daemon supervisor behavior matching WebVNC;
- duplicate-daemon replacement and cleanup;
- clearer cleanup on lease stop/expiry.
Remaining:
- Add docs and examples for Discord and Slack QA.
### Phase 4: Full-Machine Options
- Keep mediated per-app egress as the default.
- Add a separate full-route mode only when the target is a suitable Linux
gateway or a confirmed Tailscale exit node.
- Document full-route mode as higher-risk and provider/OS dependent.
## Verification
Useful proof for the first implementation:
```sh
crabbox warmup --provider hetzner --desktop --browser
crabbox egress start --id blue-lobster --profile discord --daemon
crabbox desktop launch --id blue-lobster \
--browser \
--url https://discord.com/login \
--egress discord \
--webvnc \
--open
```
Expected evidence:
- `egress status` reports host and client connected;
- a browser IP check shows the host-side egress IP;
- Discord loads inside the WebVNC desktop;
- the host agent logs only allowed hostnames and byte counters;
- stopping the lease tears down the bridge and local proxy.
## Source Map
Planned implementation files:
- CLI command router: `internal/cli/cli_kong.go`
- egress command implementation: `internal/cli/egress.go`
- coordinator client ticket/status calls: `internal/cli/coordinator.go`
- desktop/browser launch integration: `internal/cli/desktop.go`
- browser wrapper bootstrap: `internal/cli/bootstrap.go`, `worker/src/bootstrap.ts`
- Worker top-level WebSocket routing: `worker/src/index.ts`
- Fleet Durable Object bridge state and routes: `worker/src/fleet.ts`
- Worker request/record types: `worker/src/types.ts`
- portal lease detail status: `worker/src/portal.ts`
Related docs:
- [Interactive desktop and VNC](interactive-desktop-vnc.md)
- [Broker auth and routing](broker-auth-routing.md)
- [Browser portal](portal.md)
- [Tailscale](tailscale.md)
- [Configuration](configuration.md)


@ -0,0 +1,155 @@
# Environment Forwarding
Read when:
- adding a new env var that the remote command needs to see;
- debugging "why is `$CI` empty inside `crabbox run`?";
- writing a repo config that lets agents set tunable values without flags;
- reviewing a PR that loosens or tightens the env allowlist.
By default, `crabbox run` does not forward arbitrary local environment
variables to the remote command. Forwarding is opt-in and name-based: the
repo declares which variable names are allowed, and Crabbox forwards only
those that are present locally.
## Why Allowlist
Agents and CI environments run with rich and sometimes sensitive
environments: tokens, private credentials, terminal paths, vendor-specific
debug flags. Forwarding everything would:
- leak secrets to remote runners;
- introduce non-determinism between local and CI runs;
- make it impossible to reason about what affects a remote command.
Allowlist forwarding makes the contract explicit. The repo decides what
"counts" as input to the remote command, and the user can audit the
allowlist in `crabbox.yaml`.
## Configuration
```yaml
env:
allow:
- CI
- NODE_OPTIONS
- PROJECT_*
```
Rules:
- entries are env var names, not values;
- a trailing `*` is a prefix wildcard (`PROJECT_*` matches `PROJECT_FOO`,
`PROJECT_BAR`);
- inline wildcards (`PROJECT_*_DEBUG`) are not supported;
- apart from the trailing wildcard, matching is exact and case-sensitive;
- empty entries are ignored.
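The rules above can be sketched as a small matcher. This is an illustration under stated assumptions, not Crabbox's actual implementation: `matchesEntry` and `allowed` are hypothetical names, and rejecting a bare `*` is a conservative guess because the rules do not say how it is treated.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesEntry reports whether one allowlist entry permits name.
// Hypothetical helper for illustration only.
func matchesEntry(entry, name string) bool {
	if entry == "" || entry == "*" {
		// Empty entries are ignored. A bare "*" is rejected here as an
		// assumption; the documented rules do not cover it.
		return false
	}
	if strings.HasSuffix(entry, "*") {
		prefix := strings.TrimSuffix(entry, "*")
		if strings.Contains(prefix, "*") {
			return false // more than one wildcard is not a plain prefix match
		}
		return strings.HasPrefix(name, prefix)
	}
	if strings.Contains(entry, "*") {
		return false // inline wildcard such as PROJECT_*_DEBUG is unsupported
	}
	return entry == name // exact and case-sensitive
}

// allowed reports whether any entry in allow permits name.
func allowed(allow []string, name string) bool {
	for _, entry := range allow {
		if matchesEntry(entry, name) {
			return true
		}
	}
	return false
}

func main() {
	allow := []string{"CI", "NODE_OPTIONS", "PROJECT_*", ""}
	for _, name := range []string{"CI", "ci", "PROJECT_FOO", "NODE_ENV"} {
		fmt.Printf("%-12s %v\n", name, allowed(allow, name))
	}
}
```

The matcher looks only at names, which mirrors the rule that entries are env var names and never values.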
The user-side override is `CRABBOX_ENV_ALLOW`, a comma-separated list:
```sh
CRABBOX_ENV_ALLOW='CI,NODE_OPTIONS,PROJECT_*' crabbox run -- pnpm test
```
`CRABBOX_ENV_ALLOW` replaces the repo allowlist for that command rather than
appending to it. Use it for one-off tests; persistent allowances belong in
`env.allow`.
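A minimal sketch of the replace-not-append behavior, assuming the override is split on commas and blank entries are dropped. `effectiveAllowlist` is a hypothetical name, and trimming whitespace around entries is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// effectiveAllowlist returns the allowlist for one invocation.
// override is the raw CRABBOX_ENV_ALLOW value and set reports whether the
// variable was present. When set, it replaces repoAllow entirely.
func effectiveAllowlist(repoAllow []string, override string, set bool) []string {
	if !set {
		return repoAllow
	}
	var out []string
	for _, entry := range strings.Split(override, ",") {
		// Trimming is an assumption; empty entries are ignored as in env.allow.
		if entry = strings.TrimSpace(entry); entry != "" {
			out = append(out, entry)
		}
	}
	return out
}

func main() {
	repoAllow := []string{"CI", "NODE_OPTIONS"} // stands in for env.allow
	override, set := os.LookupEnv("CRABBOX_ENV_ALLOW")
	fmt.Println(effectiveAllowlist(repoAllow, override, set))
}
```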
## What Gets Forwarded
For each env var in the allowlist, Crabbox checks whether the variable is
set locally. If it is, the variable is forwarded to the remote command with
the same name and value. If it is not set locally, nothing is forwarded -
Crabbox does not invent values.
The remote command sees the variables as part of its environment when run
through SSH:
```sh
ssh runner 'cd workdir && CI=true NODE_OPTIONS=--max_old_space_size=4096 pnpm test'
```
Quoting and escaping happen automatically. Values that contain shell
metacharacters are passed through safely.
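One common way to get that safety is POSIX single-quote escaping. The sketch below is an assumption about the general technique, not a description of Crabbox's code: `shellQuote` and `remoteCommand` are hypothetical helpers. It places the assignments directly before the command so they apply to the command rather than to `cd`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// shellQuote wraps s in single quotes for a POSIX shell and escapes any
// embedded single quote as '\''.
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// remoteCommand builds "cd <workdir> && K='v' ... <command>".
// command is assumed to be already shell-safe; only values are quoted here.
func remoteCommand(workdir string, env map[string]string, command string) string {
	names := make([]string, 0, len(env))
	for name := range env {
		names = append(names, name)
	}
	sort.Strings(names) // stable order keeps the output reproducible

	parts := []string{"cd", shellQuote(workdir), "&&"}
	for _, name := range names {
		parts = append(parts, name+"="+shellQuote(env[name]))
	}
	parts = append(parts, command)
	return strings.Join(parts, " ")
}

func main() {
	env := map[string]string{"CI": "true", "MSG": "it's $HOME; rm -rf /"}
	fmt.Println(remoteCommand("workdir", env, "pnpm test"))
}
```

Inside single quotes the remote shell expands nothing, so `$HOME` and `;` in a value arrive as literal text.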
## Capability-Injected Env
A small set of env vars is injected by Crabbox itself when the matching
capability is requested. These bypass the allowlist because Crabbox owns
them:
```text
DISPLAY=:99 when --desktop
CRABBOX_DESKTOP=1 when --desktop
BROWSER=<path> when --browser, after probe
CHROME_BIN=<path> when --browser, after probe
CRABBOX_BROWSER=1 when --browser
```
User-allowed env vars override capability-injected ones if they overlap.
Repos that need a different `BROWSER` value can include `BROWSER` in
`env.allow` and set it locally.
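The precedence can be pictured as a two-step map merge. This is only a sketch: `mergeEnv` is a hypothetical name and the browser paths are made-up placeholder values.

```go
package main

import "fmt"

// mergeEnv layers user-forwarded variables over capability-injected ones,
// so a forwarded name wins when both maps define it.
func mergeEnv(injected, forwarded map[string]string) map[string]string {
	out := make(map[string]string, len(injected)+len(forwarded))
	for name, value := range injected {
		out[name] = value
	}
	for name, value := range forwarded {
		out[name] = value // allowlisted local value overrides the injected one
	}
	return out
}

func main() {
	injected := map[string]string{
		"DISPLAY": ":99",
		"BROWSER": "/usr/bin/example-browser", // placeholder path
	}
	forwarded := map[string]string{"BROWSER": "/opt/custom/browser"} // placeholder path
	fmt.Println(mergeEnv(injected, forwarded))
}
```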
## Secrets
Do not put secrets in `env.allow` even if forwarding seems convenient.
Secrets belong in:
- the broker environment (Cloudflare Worker secrets) for provider
credentials;
- the operator's credential store (`op`, AWS Vault, etc.) for short-lived
tokens;
- per-runner image bake when the secret should be on every lease;
- post-bootstrap secret injection in repo-owned setup scripts (devcontainer,
mise, repo-controlled `bin/setup`).
Crabbox forwards values it sees locally. If a secret leaks into the
allowlist, every run by every contributor will leak it.
## Examples
```yaml
env:
allow:
- CI # mark a remote command as CI-driven
- NODE_OPTIONS # adjust Node memory in test suites
- PYTEST_ADDOPTS # tune pytest flags from the local env
- PROJECT_* # repo's own debug knobs
- VITEST_* # let agents override vitest config
- DEBUG # `debug` package selector
```
Common things you usually do not allow:
```text
HOME, USER, PATH, SHELL runner already has its own
SSH_* leaks SSH agent state
GITHUB_TOKEN use Actions hydration or runner setup
AWS_* use IAM roles or instance profile
*_API_KEY, *_TOKEN use a secret manager
```
## Inspecting Forwarding
`crabbox run --debug` prints the set of env vars that were forwarded for
that invocation. Use it to verify that the allowlist matches expectations
before debugging "why does the remote command not see this variable?".
```sh
$ crabbox run --debug -- env | grep '^PROJECT'
[crabbox] forwarding env: CI NODE_OPTIONS PROJECT_FOO PROJECT_BAR
PROJECT_FOO=value
PROJECT_BAR=other-value
```
Variables that match the allowlist but are unset locally are not in the
forwarded list, so the debug line is the source of truth for "what did the
remote command actually see".
Related docs:
- [Sync](sync.md)
- [Configuration](configuration.md)
- [run command](../commands/run.md)
- [Capabilities](capabilities.md)
- [Security](../security.md)

docs/features/hetzner.md Normal file

@ -0,0 +1,78 @@
# Hetzner
Read when:
- choosing Hetzner as the Crabbox provider;
- debugging Hetzner capacity, quotas, images, or SSH readiness;
- changing Hetzner provisioning code in the CLI or Worker.
Hetzner is Crabbox's Linux-only managed provider. It creates Ubuntu servers,
labels them with Crabbox lease metadata, bootstraps the normal SSH/sync
contract, and optionally adds Linux desktop/browser capability.
## Targets
| Target | Managed | Notes |
| --- | --- | --- |
| Linux | Yes | Cloud-init bootstrap, SSH, rsync, optional desktop/browser. |
| Windows | No | Use AWS for managed Windows or `provider: ssh` for an existing Windows host. |
| macOS | No | Use AWS EC2 Mac or `provider: ssh` for an existing Mac. |
Examples:
```sh
crabbox warmup --provider hetzner --class beast
crabbox run --provider hetzner --class standard -- pnpm test
crabbox warmup --provider hetzner --desktop --browser
crabbox vnc --id blue-lobster --open
```
## Classes
```text
standard ccx33, cpx62, cx53
fast ccx43, cpx62, cx53
large ccx53, ccx43, cpx62, cx53
beast ccx63, ccx53, ccx43, cpx62, cx53
```
Dedicated-core types can hit account quota. Crabbox falls back through the
configured server types when Hetzner rejects a candidate for capacity or quota.
Explicit `--type` is exact and fails clearly when the type cannot be created.
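The fallback rule can be sketched as a loop over the class candidates. This is a simplified illustration: `provision`, `createFunc`, and `errUnavailable` are hypothetical names, and a real provider call would classify capacity and quota errors from the Hetzner API response.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable stands in for a capacity or quota rejection.
var errUnavailable = errors.New("capacity or quota rejected")

// classTypes mirrors the class table above.
var classTypes = map[string][]string{
	"standard": {"ccx33", "cpx62", "cx53"},
	"fast":     {"ccx43", "cpx62", "cx53"},
	"large":    {"ccx53", "ccx43", "cpx62", "cx53"},
	"beast":    {"ccx63", "ccx53", "ccx43", "cpx62", "cx53"},
}

// createFunc stands in for the provider call that creates one server.
type createFunc func(serverType string) error

// provision returns the server type that was created. An explicit type is
// tried exactly once; a class falls back through its candidates in order.
func provision(class, explicitType string, create createFunc) (string, error) {
	if explicitType != "" {
		if err := create(explicitType); err != nil {
			return "", fmt.Errorf("create %s: %w", explicitType, err)
		}
		return explicitType, nil
	}
	candidates, ok := classTypes[class]
	if !ok {
		return "", fmt.Errorf("unknown class %q", class)
	}
	var lastErr error
	for _, serverType := range candidates {
		err := create(serverType)
		if err == nil {
			return serverType, nil
		}
		if !errors.Is(err, errUnavailable) {
			return "", fmt.Errorf("create %s: %w", serverType, err)
		}
		lastErr = err // capacity or quota: try the next candidate
	}
	return "", fmt.Errorf("no server type available for class %s: %w", class, lastErr)
}

func main() {
	// Simulate an account whose dedicated-core quota is exhausted.
	create := func(serverType string) error {
		if serverType == "ccx33" {
			return errUnavailable
		}
		return nil
	}
	fmt.Println(provision("standard", "", create))
	fmt.Println(provision("standard", "ccx33", create))
}
```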
## Broker Secrets And Env
Worker secret:
```text
HETZNER_TOKEN
```
Direct/provider env and config:
```text
HCLOUD_TOKEN
HETZNER_TOKEN
CRABBOX_HETZNER_IMAGE
CRABBOX_HETZNER_LOCATION
CRABBOX_HETZNER_SSH_KEY
```
## Desktop
Hetzner desktop leases use the Linux VNC path: Xvfb, a lightweight desktop
session, x11vnc bound to `127.0.0.1:5900`, and an SSH local tunnel created by
`crabbox vnc`. Crabbox does not manage Windows desktop boxes on Hetzner.
## Cleanup
Brokered cleanup belongs to the Durable Object alarm. Direct cleanup is
best-effort through provider labels and `crabbox cleanup`; it skips kept
machines and deletes expired direct-provider leftovers.
Related docs:
- [Providers](providers.md)
- [Linux VNC](vnc-linux.md)
- [Infrastructure](../infrastructure.md)
- [cleanup command](../commands/cleanup.md)


@ -6,7 +6,12 @@ Read when:
- debugging failed remote commands;
- deciding what belongs in coordinator history.
Coordinator-backed `crabbox run` records a run before the remote command starts. When the command exits, the CLI finishes that run with:
Coordinator-backed `crabbox run` creates a durable `run_...` handle before
leasing starts. As the CLI advances, it appends ordered events for leasing,
bootstrap, sync, command start, stdout/stderr chunks, command finish, and lease
release. Stdout/stderr events are capped at 64 KiB per run and followed by an
`output.truncated` marker when the cap is reached. When the command exits, the
CLI finishes that run with:
- exit code;
- sync duration;
@ -14,22 +19,39 @@ Coordinator-backed `crabbox run` records a run before the remote command starts.
- total duration;
- owner and org;
- provider, class, and server type;
- retained remote output tail.
- optional Linux telemetry snapshots for load, memory, disk, and uptime, including bounded mid-run samples for longer runs;
- retained remote output.
Use:
```sh
crabbox history
crabbox history --lease cbx_...
crabbox events run_...
crabbox attach run_...
crabbox logs run_...
```
History records live in the Fleet Durable Object. Log text is stored separately from run metadata and intentionally capped to the latest tail so noisy commands cannot exhaust storage.
In the authenticated browser portal, `/portal/runs/<run-id>` renders the same
run as a human page with command metadata, result summary, searchable/paginated
recent events, compact resource deltas, short telemetry trend lines, and a
copyable retained log tail.
`/portal/runs/<run-id>/logs` stays a plain-text log endpoint, and
`/portal/runs/<run-id>/events` stays JSON for copying or browser-side
inspection.
History records and run events live in the Fleet Durable Object. Log text is
stored separately from run metadata and intentionally capped so noisy commands
cannot exhaust storage. Logs larger than one storage value are chunked by the
coordinator and reassembled by `crabbox logs`. Event output capture is also
bounded; use `crabbox attach` for active run previews and `crabbox logs` for the
retained command output.
Direct-provider mode does not have central history. Use shell output or local terminal logs there.
Related docs:
- [history command](../commands/history.md)
- [attach command](../commands/attach.md)
- [logs command](../commands/logs.md)
- [Observability](../observability.md)


@ -0,0 +1,199 @@
# Identifiers
Read when:
- changing how Crabbox names leases, slugs, runs, or claims;
- debugging "why does `crabbox run --id` not find this lease?";
- adding a new lookup form (alias, provider id, anything that resolves to a
lease).
Crabbox names every long-lived thing twice: once with a stable canonical ID
that machines compare, and once with a friendly slug that humans type. This
page lists the identifiers, where they come from, and how lookup resolves
across them.
## Lease ID
Canonical lease IDs look like:
```text
cbx_abcdef123456
```
The pattern is fixed: the literal `cbx_` prefix followed by 12 hex characters.
`isCanonicalLeaseID` enforces it as a regex; anything else is treated as a
slug or alias.
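A check of this shape could look like the sketch below. It is written from the description above rather than copied from the source, and it assumes lowercase hex digits only.

```go
package main

import (
	"fmt"
	"regexp"
)

// canonicalLeaseID matches "cbx_" followed by exactly 12 hex characters.
// Lowercase-only hex is an assumption based on the example ID.
var canonicalLeaseID = regexp.MustCompile(`^cbx_[0-9a-f]{12}$`)

// isCanonicalLeaseID reports whether s is a canonical lease ID; anything
// else would be treated as a slug or alias.
func isCanonicalLeaseID(s string) bool {
	return canonicalLeaseID.MatchString(s)
}

func main() {
	for _, id := range []string{"cbx_abcdef123456", "blue-lobster", "cbx_abc"} {
		fmt.Printf("%-18s %v\n", id, isCanonicalLeaseID(id))
	}
}
```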
The CLI mints a provisional lease ID before calling the broker. The broker
may return a different final ID (when the Worker dedupes a retried request,
for example); the CLI then moves the local SSH key directory and claim file
from the provisional ID to the final ID with `MoveStoredTestboxKey` and
re-keys references accordingly.
Provider resources reference the lease ID through Crabbox labels:
```text
crabbox-lease=cbx_abcdef123456
```
That label is what `crabbox cleanup` and `crabbox list` use to map a provider
machine back to a Crabbox lease.
## Slug
Slugs are friendly, human-typeable lease names. They look like:
```text
blue-lobster
amber-crab
silver-shrimp
```
Slugs are generated from a stable hash of the lease ID, so the same lease
always gets the same slug. The vocabulary is small (14 adjectives, 8 nouns)
because Crabbox is intentionally a small fleet. When a slug collides with an
existing active lease, `slugWithCollisionSuffix` appends a 4-hex suffix
keyed by the seed:
```text
blue-lobster-1234
```
The collision path is rare in normal use - a single user's active leases
rarely exceed the 14 × 8 = 112 unique base slugs.
Slugs are normalized everywhere they are accepted. `normalizeLeaseSlug`
lowercases the input, reduces it to `[a-z0-9-]` by treating other characters
such as `_` and spaces as separators, collapses runs of separators into a
single dash, and trims leading/trailing dashes. `Blue_Lobster` and
`BLUE-LOBSTER` resolve to `blue-lobster`.
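A normalizer with this behavior might look like the following sketch. It is inferred from the documented inputs and outputs, and it assumes that every character outside `[a-z0-9]` acts as a separator; the real function may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeLeaseSlug lowercases s, turns every run of characters outside
// [a-z0-9] into a single dash, and trims dashes at both ends.
func normalizeLeaseSlug(s string) string {
	var b strings.Builder
	lastDash := true // start true so leading separators are dropped
	for _, r := range strings.ToLower(s) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
			lastDash = false
			continue
		}
		if !lastDash {
			b.WriteByte('-') // first separator in a run becomes one dash
			lastDash = true
		}
	}
	return strings.TrimRight(b.String(), "-")
}

func main() {
	for _, in := range []string{"Blue_Lobster", "BLUE-LOBSTER", "  Blue  Lobster--"} {
		fmt.Printf("%q -> %q\n", in, normalizeLeaseSlug(in))
	}
}
```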
## Provider Name
Each managed lease also gets a per-provider resource name that includes the
slug and a hash of the lease ID, so the provider console shows useful names:
```text
crabbox-blue-lobster-7f8a2c1d
```
That name is what shows up as the EC2 `Name` tag, the Hetzner server name,
and the Daytona sandbox name. It is derived from `leaseProviderName(leaseID,
slug)`; the function falls back to `crabbox-cbx-...` if the slug is empty.
## Run ID
Each `crabbox run` against a coordinator also gets a durable run handle:
```text
run_abcdef123456
```
A run is created before the lease is acquired so events can be appended for
leasing failures, sync failures, and command output even when the run never
reaches command-start. Run IDs are stable across a single invocation;
retrying the same command produces a new run.
`crabbox history`, `crabbox events`, `crabbox attach`, `crabbox logs`, and
`crabbox results` all accept run IDs. Slugs do not resolve to runs - only to
leases.
## Local Claims
Reusable leases get a JSON claim file stored under the user state directory:
```text
$XDG_STATE_HOME/crabbox/claims/cbx_abcdef123456.json
```
When `XDG_STATE_HOME` is not set, claims live next to user config in
`~/Library/Application Support/crabbox/state/claims` on macOS or
`~/.config/crabbox/state/claims` on Linux.
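The lookup above can be sketched as a pure function. `claimsDir` and `claimPath` are hypothetical names, and treating every non-macOS platform like Linux is an assumption, since only macOS and Linux are described.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// claimsDir returns the directory holding claim files. Inputs are passed in
// explicitly so the function stays easy to test.
func claimsDir(xdgStateHome, home, goos string) string {
	if xdgStateHome != "" {
		return filepath.Join(xdgStateHome, "crabbox", "claims")
	}
	if goos == "darwin" {
		return filepath.Join(home, "Library", "Application Support", "crabbox", "state", "claims")
	}
	// Assumption: all other platforms follow the Linux layout.
	return filepath.Join(home, ".config", "crabbox", "state", "claims")
}

// claimPath returns the claim file for one canonical lease ID.
func claimPath(dir, leaseID string) string {
	return filepath.Join(dir, leaseID+".json")
}

func main() {
	home, _ := os.UserHomeDir()
	dir := claimsDir(os.Getenv("XDG_STATE_HOME"), home, runtime.GOOS)
	fmt.Println(claimPath(dir, "cbx_abcdef123456"))
}
```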
The claim payload looks like:
```json
{
"leaseID": "cbx_abcdef123456",
"slug": "blue-lobster",
"provider": "aws",
"repoRoot": "/Users/steipete/Projects/openclaw",
"claimedAt": "2026-05-07T07:42:18Z",
"lastUsedAt": "2026-05-07T07:55:12Z",
"idleTimeoutSeconds": 1800
}
```
Claims do three things:
- bind a lease to one repo so wrappers and agents do not silently reuse a
lease against a different checkout;
- give `crabbox run --id blue-lobster` a slug-to-canonical-ID translation
without round-tripping the broker;
- power "is this lease still mine?" checks before destructive operations
(`stop`, `cleanup`, `actions register`).
A conflicting claim (same lease, different repo) refuses commands by default;
`--reclaim` overrides the check and rewrites the claim atomically.
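As a rough sketch, the claim payload and the repo-conflict check could be modeled like this. The struct fields follow the JSON shown above; `parseClaim`, `checkClaim`, and the error wording are hypothetical, and the atomic rewrite on `--reclaim` is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Claim mirrors the claim payload shown above.
type Claim struct {
	LeaseID            string    `json:"leaseID"`
	Slug               string    `json:"slug"`
	Provider           string    `json:"provider"`
	RepoRoot           string    `json:"repoRoot"`
	ClaimedAt          time.Time `json:"claimedAt"`
	LastUsedAt         time.Time `json:"lastUsedAt"`
	IdleTimeoutSeconds int       `json:"idleTimeoutSeconds"`
}

// sampleClaim is example data reusing the values from the payload above.
const sampleClaim = `{
  "leaseID": "cbx_abcdef123456",
  "slug": "blue-lobster",
  "provider": "aws",
  "repoRoot": "/Users/steipete/Projects/openclaw",
  "claimedAt": "2026-05-07T07:42:18Z",
  "lastUsedAt": "2026-05-07T07:55:12Z",
  "idleTimeoutSeconds": 1800
}`

// parseClaim decodes one claim file.
func parseClaim(data []byte) (Claim, error) {
	var c Claim
	err := json.Unmarshal(data, &c)
	return c, err
}

// checkClaim refuses a lease bound to a different repo unless reclaim is set.
func checkClaim(c Claim, repoRoot string, reclaim bool) error {
	if c.RepoRoot == repoRoot || reclaim {
		return nil
	}
	return fmt.Errorf("lease %s is claimed by %s; pass --reclaim to take it over",
		c.LeaseID, c.RepoRoot)
}

func main() {
	c, err := parseClaim([]byte(sampleClaim))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(checkClaim(c, "/tmp/other-checkout", false))
	fmt.Println(checkClaim(c, "/tmp/other-checkout", true))
}
```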
Static SSH leases tag their claims with `provider: ssh` so the resolver knows
the lease bypasses the coordinator. Coordinator-backed claims leave
`provider` blank because the coordinator owns provider tracking.
## SSH Key Storage
Per-lease SSH key directories are keyed by lease ID:
```text
~/.config/crabbox/testboxes/cbx_abcdef123456/id_ed25519
~/.config/crabbox/testboxes/cbx_abcdef123456/id_ed25519.pub
~/.config/crabbox/testboxes/cbx_abcdef123456/known_hosts
```
The provisional → final lease ID move uses `os.Rename` on the directory so
the key, public key, and known_hosts file all migrate atomically. The
provider key name (`crabbox-cbx-abcdef123456`) is what the cloud account
sees.
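The directory move can be illustrated with a throwaway directory. This is a sketch, not the real `MoveStoredTestboxKey`: `moveKeyDir` and `demoMove` are hypothetical helpers and the key content is a placeholder. Note that `os.Rename` is only atomic when source and destination are on the same filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// moveKeyDir renames <root>/<provisionalID> to <root>/<finalID> so every file
// in the directory moves together.
func moveKeyDir(root, provisionalID, finalID string) error {
	if provisionalID == finalID {
		return nil // broker kept the provisional ID; nothing to move
	}
	return os.Rename(filepath.Join(root, provisionalID), filepath.Join(root, finalID))
}

// demoMove builds a temporary key directory, moves it, and reports whether
// the key file is reachable under the final ID.
func demoMove() (bool, error) {
	root, err := os.MkdirTemp("", "crabbox-keys-")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)

	oldDir := filepath.Join(root, "cbx_000000000001")
	if err := os.MkdirAll(oldDir, 0o700); err != nil {
		return false, err
	}
	key := filepath.Join(oldDir, "id_ed25519")
	if err := os.WriteFile(key, []byte("placeholder"), 0o600); err != nil {
		return false, err
	}
	if err := moveKeyDir(root, "cbx_000000000001", "cbx_abcdef123456"); err != nil {
		return false, err
	}
	_, err = os.Stat(filepath.Join(root, "cbx_abcdef123456", "id_ed25519"))
	return err == nil, nil
}

func main() {
	moved, err := demoMove()
	fmt.Println("moved:", moved, "err:", err)
}
```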
## Resolving An Identifier
`crabbox <command> --id <value>` accepts:
- a canonical `cbx_...` lease ID;
- a normalized slug (`blue-lobster`, `Blue Lobster`, `BLUE_LOBSTER` all resolve
to the same lease);
- in coordinator mode, also the slug as known to the broker, regardless of
case.
Resolution order:
1. Read the local claim store for the literal identifier or any slug match
in `claims/`.
2. If a matching claim exists, use its `leaseID` as the canonical handle.
3. If no claim is found and a coordinator is configured, ask the coordinator
to resolve the identifier (slug or canonical ID).
4. For static SSH and direct-provider modes, fall back to the provider's
`Resolve` implementation (`SSHLeaseBackend.Resolve`).
The first source that returns a hit wins. This is why `--id blue-lobster`
works from any directory, even when the warmup ran in a different repo: the
local claim translates the slug to a lease ID before the broker is involved.
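The resolution order can be sketched as a chain of lookups where the first hit wins. The types and names here (`resolver`, `resolveLease`, `mapResolver`, `errNotFound`) are hypothetical, and the in-memory maps stand in for the claim store and the coordinator.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("lease not found")

// resolver maps an identifier (canonical ID or slug) to a canonical lease ID.
type resolver func(id string) (string, error)

// resolveLease tries each source in order and returns the first hit.
// A nil source is skipped, for example when no coordinator is configured.
func resolveLease(id string, sources ...resolver) (string, error) {
	for _, source := range sources {
		if source == nil {
			continue
		}
		leaseID, err := source(id)
		if err == nil {
			return leaseID, nil
		}
		if !errors.Is(err, errNotFound) {
			return "", err // real failures stop the chain
		}
	}
	return "", fmt.Errorf("%q: %w", id, errNotFound)
}

// mapResolver builds a resolver from a fixed lookup table.
func mapResolver(table map[string]string) resolver {
	return func(id string) (string, error) {
		if leaseID, ok := table[id]; ok {
			return leaseID, nil
		}
		return "", errNotFound
	}
}

func main() {
	claims := mapResolver(map[string]string{"blue-lobster": "cbx_abcdef123456"})
	coordinator := mapResolver(map[string]string{"amber-crab": "cbx_0123456789ab"})

	fmt.Println(resolveLease("blue-lobster", claims, coordinator))
	fmt.Println(resolveLease("amber-crab", claims, coordinator))
	fmt.Println(resolveLease("silver-shrimp", claims, coordinator))
}
```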
## Identifier Lifetime
```text
provisional lease ID newLeaseID() call → broker returns final ID
final lease ID broker accepts → stored in claim, key dir, labels
slug computed on first lease creation, stable forever
provider name derived from lease ID + slug
run ID minted per crabbox run when a coordinator is configured
```
Slugs are not reserved or deliberately reassigned. When a lease ends, its slug
becomes free for any future lease that happens to hash to it; the small
vocabulary makes that collision-by-hash possible but rare in practice.
Related docs:
- [Coordinator](coordinator.md)
- [SSH keys](ssh-keys.md)
- [Lifecycle cleanup](lifecycle-cleanup.md)
- [Source map](../source-map.md)


@ -0,0 +1,212 @@
# Image Bake Runbook
Read when:
- baking a new Crabbox AWS image;
- promoting or rolling back the default AWS image;
- preparing a desktop/browser image for Mantis or other UI QA;
- checking whether state belongs in the image or in a warm lease.
This runbook is for trusted operators. Image commands need coordinator admin
auth and can create provider-side artifacts that cost money until cleaned up.
## Naming
Use names that identify owner, purpose, and UTC bake time:
```text
openclaw-crabbox-linux-desktop-browser-YYYYMMDD-HHMM
openclaw-mantis-linux-desktop-browser-YYYYMMDD-HHMM
```
Use a generic `openclaw-crabbox-*` image when the contents are useful to many
repositories. Use `openclaw-mantis-*` only when the image is specifically tuned
for OpenClaw Mantis QA.
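A name in this scheme can be generated from a timestamp, which avoids typing the UTC bake time by hand. `imageName` and `sampleBakeTime` are hypothetical names used only for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// sampleBakeTime is fixed example data in a non-UTC zone, used to show that
// the name always carries UTC time.
var sampleBakeTime = time.Date(2026, time.May, 8, 7, 5, 0, 0, time.FixedZone("UTC+2", 2*60*60))

// imageName builds "<prefix>-<purpose>-YYYYMMDD-HHMM" with the time in UTC.
func imageName(prefix, purpose string, bakedAt time.Time) string {
	// "20060102-1504" is Go's reference layout for YYYYMMDD-HHMM.
	return fmt.Sprintf("%s-%s-%s", prefix, purpose, bakedAt.UTC().Format("20060102-1504"))
}

func main() {
	fmt.Println(imageName("openclaw-crabbox", "linux-desktop-browser", sampleBakeTime))
	fmt.Println(imageName("openclaw-mantis", "linux-desktop-browser", time.Now()))
}
```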
## What To Bake
Bake machine capabilities:
- current OS security updates;
- SSH, Git, rsync, curl, jq, and readiness helpers;
- Xvfb/slim XFCE/VNC for desktop leases;
- Chrome/Chromium for browser leases;
- `ffmpeg`, `ffprobe`, `scrot`, `xdotool`, and other capture helpers;
- Node 22, npm, corepack, pnpm;
- build-essential, Python, and common native-addon headers;
- empty cache directories such as `/var/cache/crabbox/pnpm`.
Do not bake scenario state:
- secrets, tokens, or provider credentials;
- browser profiles, cookies, Slack/Discord/WhatsApp sessions, or OAuth state;
- source checkouts, `node_modules`, `dist`, PR artifacts, screenshots, or
videos;
- local operator notes or one-off debugging files.
## Create A Candidate AMI
Warm a source lease:
```bash
crabbox warmup \
--provider aws \
--class standard \
--desktop \
--browser \
--ttl 2h \
--idle-timeout 30m
```
Capture the lease id from the output. Use the canonical `cbx_...` id for image
commands, not only the friendly slug.
Verify the source lease:
```bash
crabbox run \
--provider aws \
--id <cbx_id> \
--no-sync \
--shell -- \
'set -euo pipefail
command -v ssh
command -v git
command -v rsync
command -v jq
command -v node
command -v pnpm
command -v ffmpeg
command -v scrot
command -v x11vnc
command -v google-chrome || command -v chromium || command -v chromium-browser
test -d /work/crabbox
sudo mkdir -p /var/cache/crabbox/pnpm
sudo chmod 1777 /var/cache/crabbox /var/cache/crabbox/pnpm'
```
Create the candidate image:
```bash
crabbox image create \
--id <cbx_id> \
--name openclaw-crabbox-linux-desktop-browser-YYYYMMDD-HHMM \
--wait \
--json
```
Keep the JSON output. At minimum, record the AMI id, name, source lease id,
creation time, and operator.
## Smoke Candidate Before Promotion
Boot the candidate explicitly. Use the provider image override supported by the
current environment, for example:
```bash
CRABBOX_AWS_AMI=ami-1234567890abcdef0 \
crabbox warmup \
--provider aws \
--class standard \
--desktop \
--browser \
--ttl 30m \
--idle-timeout 10m
```
Run a smoke on the candidate:
```bash
crabbox run \
--provider aws \
--id <candidate-cbx_id-or-slug> \
--no-sync \
--shell -- \
'set -euo pipefail
echo image-smoke-ok
uname -srm
command -v node
command -v pnpm
command -v ffmpeg
command -v scrot
command -v google-chrome || command -v chromium || command -v chromium-browser
test -d /work/crabbox'
```
For Mantis images, also run a real desktop/browser proof:
```bash
crabbox screenshot --provider aws --id <candidate-cbx_id-or-slug> --output /tmp/crabbox-image-smoke.png
```
Do not promote if SSH readiness, browser startup, screenshot capture, or the
package/tool checks fail.
## Promote
Promote only after a candidate smoke passes:
```bash
crabbox image promote ami-1234567890abcdef0 --json
```
Then verify a normal brokered lease without overrides uses the promoted image:
```bash
crabbox warmup \
--provider aws \
--class standard \
--desktop \
--browser \
--ttl 30m \
--idle-timeout 10m
crabbox run \
--provider aws \
--id <new-cbx_id-or-slug> \
--no-sync \
--shell -- \
'echo promoted-image-smoke-ok && command -v ffmpeg && command -v node'
```
Keep the previous promoted AMI available until at least one normal brokered
lease and one relevant QA lane pass on the new image.
## Roll Back
Rollback is another promotion:
```bash
crabbox image promote ami-previous-good --json
```
Run the normal brokered smoke again. Do not delete the failed AMI immediately;
keep it long enough to inspect tags, logs, and source-lease details.
## Cleanup
Promotion does not delete old AMIs or EBS snapshots. Cleanup is a provider
operator task:
- keep the current promoted AMI;
- keep the previous known-good AMI until the new one has real QA proof;
- deregister stale failed/candidate AMIs after investigation;
- delete their orphaned EBS snapshots in the AWS account.
Do not rely on Crabbox coordinator state as the source of truth for old image
storage costs. Check AWS directly.
## Hetzner Status
Hetzner image bytes belong in the Hetzner project. Crabbox can boot a configured
image through `image` or `CRABBOX_HETZNER_IMAGE`, but Hetzner image
create/promote lifecycle commands are not implemented yet. Until then, create
and manage Hetzner snapshots with Hetzner tooling, then configure Crabbox to use
the selected image.
Related docs:
- [Prebaked runner images](prebaked-images.md)
- [image command](../commands/image.md)
- [Runner bootstrap](runner-bootstrap.md)
- [Interactive desktop and VNC](interactive-desktop-vnc.md)


@ -0,0 +1,247 @@
# Interactive Desktop And VNC
Read when:
- choosing a desktop target for browser/UI QA;
- opening a lease with VNC or WebVNC;
- diagnosing stale WebVNC viewers, bridge disconnects, or broken desktop
sessions;
- driving desktop input from agents without hand-written `xdotool`;
- deciding which layer owns desktop setup, browser state, screenshots, or
credentials.
Crabbox treats desktop access as a lease capability, not a separate remote
access product. A desktop lease still uses the normal Crabbox boundaries:
provider lifecycle, per-lease SSH keys, SSH tunnels, idle expiry, cleanup, and
run history. VNC is a way to inspect or drive the visible session inside that
boundary.
## Quick Start
```sh
crabbox warmup --desktop --browser
crabbox webvnc --id blue-lobster --open
crabbox webvnc status --id blue-lobster
crabbox desktop doctor --id blue-lobster
crabbox vnc --id blue-lobster --open
crabbox screenshot --id blue-lobster --output desktop.png
```
AWS Windows and EC2 Mac use the same VNC command once the desktop lease exists:
```sh
crabbox warmup --provider aws --target windows --desktop
crabbox vnc --id crimson-crab --open
CRABBOX_AWS_MAC_HOST_ID=h-... \
crabbox warmup --provider aws --target macos --desktop --market on-demand
crabbox vnc --id silver-squid --open
```
Static hosts are explicit and host-managed:
```sh
crabbox vnc --provider ssh --target macos --static-host mac-studio.local --host-managed --open
crabbox vnc --provider ssh --target windows --static-host win-dev.local --host-managed --open
```
## What Crabbox Owns
Crabbox owns:
- the lease lifecycle and cleanup;
- per-lease SSH keys and known_hosts scoping;
- SSH local forwarding to the target's loopback VNC service;
- generated per-lease VNC or OS passwords for managed desktop leases;
- `desktop=true` and `browser=true` lease metadata;
- screenshots and desktop launch commands that operate inside the lease.
Scenario systems such as Mantis own:
- product-specific login and app credentials;
- browser profile import/export;
- screenshots that prove a bug before and after a fix;
- PR comments, issue triage, and artifact summaries.
## Support Matrix
| Target | Managed by Crabbox | Desktop access | Primary page |
| --- | --- | --- | --- |
| Linux on Hetzner | Yes | Xvfb/XFCE/x11vnc over SSH tunnel | [Linux VNC](vnc-linux.md) |
| Linux on AWS | Yes | Xvfb/XFCE/x11vnc over SSH tunnel | [Linux VNC](vnc-linux.md) |
| Linux on Azure | Yes | Xvfb/XFCE/x11vnc over SSH tunnel | [Linux VNC](vnc-linux.md) |
| AWS Windows | Yes | TightVNC over SSH tunnel | [Windows VNC](vnc-windows.md) |
| AWS EC2 Mac | Yes | Screen Sharing/VNC over SSH tunnel | [macOS VNC](vnc-macos.md) |
| Azure Windows | No | SSH/sync/run only | [Azure](azure.md) |
| Static Linux | Host-managed | Existing loopback VNC service | [Linux VNC](vnc-linux.md) |
| Static macOS | Host-managed | Existing Screen Sharing/VNC | [macOS VNC](vnc-macos.md) |
| Static Windows | Host-managed | Existing VNC service | [Windows VNC](vnc-windows.md) |
| Blacksmith Testbox | No | Not exposed through Crabbox VNC today | [Blacksmith Testbox](blacksmith-testbox.md) |
## Commands
Use `crabbox webvnc` for the authenticated coordinator portal. This is the
preferred path for human demos because `--open` preloads the VNC password in
the local browser fragment:
```sh
crabbox webvnc --id blue-lobster --open
crabbox webvnc status --id blue-lobster
crabbox webvnc reset --id blue-lobster --open
```
Use `crabbox vnc` for a native VNC client when WebVNC status/reset says the
portal/browser path is unhealthy or when you need a native client feature:
```sh
crabbox vnc --id blue-lobster
crabbox vnc --id blue-lobster --network tailscale
crabbox vnc --id blue-lobster --open
```
WebVNC uses the same runner-side VNC service as `crabbox vnc`. The difference
is the viewer path: a local `crabbox webvnc` process keeps an SSH tunnel open,
connects to the coordinator with a one-use bridge ticket, and the browser uses
bundled noVNC from the authenticated portal. The portal does not connect to the
runner by itself; the local bridge must keep running.
WebVNC supports collaborative viewing. The local bridge keeps a warm pool of
backend VNC sessions (default 4 slots), the first browser viewer controls the
lease, and additional viewers join as read-only observers. Any viewer — a new
observer or the prior controller — can press **take over** to become the
controller; whoever loses control stays connected as an observer and sees who
took over. Observer mode is intended for trusted shared leases; it is not a
hostile-client security boundary.
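The controller/observer hand-off can be modeled as a tiny state machine. This is a conceptual sketch only: the `session` type and its methods are hypothetical, it ignores the backend VNC slot pool, and leaving control unassigned when the controller disconnects is an assumption.

```go
package main

import "fmt"

// session tracks which viewer controls a lease; every other viewer observes.
type session struct {
	controller string
	viewers    map[string]bool
}

func newSession() *session {
	return &session{viewers: map[string]bool{}}
}

// join adds a viewer. The first viewer to join becomes the controller.
func (s *session) join(id string) {
	s.viewers[id] = true
	if s.controller == "" {
		s.controller = id
	}
}

// role reports "controller", "observer", or "" for an unknown viewer.
func (s *session) role(id string) string {
	switch {
	case !s.viewers[id]:
		return ""
	case s.controller == id:
		return "controller"
	default:
		return "observer"
	}
}

// takeOver makes id the controller and returns the previous controller, who
// stays connected as an observer.
func (s *session) takeOver(id string) (previous string, ok bool) {
	if !s.viewers[id] {
		return "", false
	}
	previous = s.controller
	s.controller = id
	return previous, true
}

// leave removes a viewer. Control is left unassigned if the controller
// leaves (an assumption; the hand-off rule is not described above).
func (s *session) leave(id string) {
	delete(s.viewers, id)
	if s.controller == id {
		s.controller = ""
	}
}

func main() {
	s := newSession()
	s.join("alice")
	s.join("bob")
	fmt.Println(s.role("alice"), s.role("bob"))
	previous, _ := s.takeOver("bob")
	fmt.Println("took over from", previous, "->", s.role("alice"), s.role("bob"))
}
```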
The portal toolbar supports explicit clipboard exchange. Paste reads the local
browser clipboard, forwards it to the remote VNC server, and sends the target
paste shortcut. Copy-remote becomes available once the remote server publishes
clipboard text; clicking it writes that text to the local browser clipboard,
because browsers generally block fully automatic clipboard writes without a
user gesture.
Use `crabbox screenshot` when you need a PNG without taking over the session:
```sh
crabbox screenshot --id blue-lobster --output desktop.png
```
Use `crabbox artifacts` when QA needs a durable proof bundle instead of a
single screenshot:
```sh
crabbox artifacts collect --id blue-lobster --all --output artifacts/blue-lobster
crabbox artifacts publish --dir artifacts/blue-lobster --pr 123 --storage s3 --bucket qa-artifacts
```
Use `crabbox desktop launch` to start a browser or app inside the visible
session without keeping the SSH command attached:
```sh
crabbox desktop launch --id blue-lobster --browser --url https://example.com --webvnc --open
```
For human demos, Crabbox keeps launched browsers windowed so the remote desktop
panel, title bar, and surrounding session remain visible. Use
`desktop launch --fullscreen` only when you intentionally want browser-only
video or capture output.
Use `crabbox desktop doctor --id <lease>` before blaming WebVNC. It checks the
lease's desktop session, VNC service, input tooling, browser binary, ffmpeg,
screen geometry, and screenshot capture, then separately reports WebVNC
bridge/viewer status with one-line repair suggestions.
Failure output is designed for rescue-first debugging. When a desktop command
cannot prove the expected state, Crabbox prints the failed layer as
`problem: browser not launched`, `problem: input stack dead`, `problem: VNC
bridge disconnected`, `problem: WebVNC daemon not running`, or similar, followed
by an exact `rescue:` command. WebVNC status/reset also prints the exact native
`crabbox vnc ... --open` fallback when the native viewer is the better next
step.
Use first-class input helpers instead of hand-rolled `xdotool`:
```sh
crabbox desktop click --id blue-lobster --x 640 --y 420
crabbox desktop paste --id blue-lobster --text "peter@example.com"
printf 'peter@example.com' | crabbox desktop paste --id blue-lobster
crabbox desktop type --id blue-lobster --text "hello"
crabbox desktop key --id blue-lobster ctrl+l
crabbox desktop key blue-lobster ctrl+l
```
Prefer `desktop paste` or symbol-aware `desktop type` for emails, passwords,
URLs, and text containing characters such as `@` or `+`; raw key-symbol typing
can vary with the target X keyboard layout. `desktop key` is for shortcuts and
special keys, and supports both `--id <lease> <keys>` and positional
`<lease> <keys>` forms.
## Network Model
Managed VNC is tunnel-first:
- VNC binds to `127.0.0.1:5900` on the target.
- The cloud firewall/security group opens SSH only, not VNC.
- `crabbox vnc` forwards a local port such as `localhost:5901` to remote
`127.0.0.1:5900`.
- `--network tailscale` changes only the SSH endpoint used by that tunnel.
- WebVNC keeps the same local SSH tunnel and adds an authenticated browser
websocket through the coordinator.
- WebVNC browser websockets are paired with local bridge backend sessions
inside the coordinator Durable Object. One viewer is the controller; other
viewers are observers until they press **take over**. If a browser view
disconnects, only its paired backend session is reset and the local command
reconnects a fresh bridge slot for the next portal retry.
- `crabbox webvnc status` reports the local daemon pid/log, SSH tunnel command,
target VNC reachability, coordinator bridge/viewer state, recent bridge
events, portal URL/password, and the exact native `crabbox vnc ... --open`
fallback. The fallback preserves explicit `--network public` or
`--network tailscale` selections.
- `crabbox webvnc reset` closes only the selected lease's WebVNC sockets,
stops only that lease's verified local WebVNC daemon, restarts the target
desktop/VNC services, then prints the fresh portal URL.
- WebVNC and desktop commands print rescue commands inline when the bridge,
viewer, browser launch, VNC target, or input stack fails, so operators do not
need to dig through troubleshooting docs during a demo.
Crabbox does not bind managed VNC directly to a public IP or Tailscale 100.x
address. Static hosts can expose direct `host:5900` only when the operator has
already made that endpoint reachable on a trusted network.
## Browser State
`--browser` guarantees a browser binary and env such as `BROWSER` and
`CHROME_BIN`; it does not create, unlock, sync, or migrate a logged-in profile.
On managed Linux leases, these env vars point to a Crabbox wrapper that disables
Chrome/Chromium first-run and default-browser prompts for repeatable VNC use.
On managed targets, manual browser login through VNC lasts only for that lease
unless the caller intentionally exports an artifact. On static hosts, any
existing browser profile belongs to that host.
For repeatable logged-in tests, use scenario-owned state such as a Playwright
storage-state file or an app-specific short-lived token. Avoid syncing full
browser profile directories between operating systems; browser credentials are
often machine- and user-encrypted.
## Security Rules
- Never expose managed VNC directly to the public internet.
- Do not expose managed VNC directly on a Tailscale interface.
- Prefer SSH local forwarding such as
`localhost:5901 -> 127.0.0.1:5900`.
- Generate per-lease passwords for managed desktop leases.
- Redact passwords from logs, provider metadata, and run records.
- Keep TTL and idle-timeout cleanup in force.
- Require `--host-managed` before opening static-host VNC prompts.
## Where To Go Next
- [Linux VNC](vnc-linux.md): Hetzner/AWS Linux desktop services and static Linux.
- [Windows VNC](vnc-windows.md): AWS Windows, native Windows static hosts, and WSL2 boundaries.
- [macOS VNC](vnc-macos.md): AWS EC2 Mac and static Mac Screen Sharing.
- [AWS](aws.md): AWS target matrix, capacity, AMIs, and EC2 Mac host requirements.
- [Hetzner](hetzner.md): Linux-only managed Hetzner behavior.
- [Blacksmith Testbox](blacksmith-testbox.md): delegated Testbox behavior and why VNC is not a Crabbox feature there yet.
- [vnc command](../commands/vnc.md), [webvnc command](../commands/webvnc.md), [screenshot command](../commands/screenshot.md), [desktop command](../commands/desktop.md), [artifacts command](../commands/artifacts.md), [egress command](../commands/egress.md).
- [Mediated egress](egress.md): per-app browser/app egress through the operator
machine for Discord, Slack, and similar source-IP-sensitive QA.

docs/features/islo.md Normal file

@ -0,0 +1,64 @@
# Islo
Read when:
- choosing `provider: islo`;
- configuring Islo sandbox image, sizing, or gateway profile;
- reviewing delegated provider behavior.
`provider: islo` delegates sandbox setup and command execution to Islo. Crabbox
uses the Islo Go SDK for auth, sandbox lifecycle, list, status, and stop. It
builds the normal Crabbox sync manifest and uploads it as a gzipped archive into
the sandbox workdir before executing the command. The SDK's current exec stream
helper coalesces output, so Crabbox keeps a small SSE reader for
`POST /sandboxes/{name}/exec/stream` while still using the SDK auth provider.
## Auth
```sh
export ISLO_API_KEY=ak_...
```
`ISLO_BASE_URL` or `islo.baseUrl` can override the default
`https://api.islo.dev`.
## Config
```yaml
provider: islo
target: linux
islo:
  image: docker.io/library/ubuntu:24.04
  workdir: crabbox
  gatewayProfile: ""
  snapshotName: ""
  vcpus: 2
  memoryMB: 4096
  diskGB: 20
```
Equivalent flags:
```sh
crabbox warmup --provider islo --islo-image docker.io/library/ubuntu:24.04
crabbox run --provider islo -- pnpm test
crabbox status --provider islo --id <slug>
crabbox stop --provider islo <slug>
```
## Behavior
- `warmup` creates a `crabbox-...` Islo sandbox and stores a local lease ID of
the form `isb_<crabbox-sandbox-name>` plus a Crabbox slug.
- `run` creates or reuses a sandbox, syncs the local Git-managed working set
into `/workspace/<islo.workdir>`, streams stdout/stderr from Islo's SSE exec
endpoint, and returns the remote exit code.
- `--sync-only` and `--checksum` are rejected because Islo does not expose a
Crabbox SSH/rsync target. Large-sync guardrails still apply, and
`--force-sync-large` is honored for intentional large archive syncs.
- `list`, `status`, and `stop` use the Islo SDK and return core-rendered
Crabbox views for Crabbox-created sandboxes only.
Islo is not an SSH lease backend today. Commands that require a Crabbox SSH
target, such as `ssh`, `vnc`, `code`, and Actions runner hydration, should use
Hetzner, AWS, static SSH, or Daytona instead.

195
docs/features/network.md Normal file
View File

@ -0,0 +1,195 @@
# Network And Reachability
Read when:
- choosing between `--network auto`, `tailscale`, or `public`;
- debugging "Crabbox can SSH but my browser can't reach the desktop";
- changing how Crabbox falls back between the public IP and the tailnet IP;
- adjusting SSH port fallbacks for restrictive operator networks.
A Crabbox lease can be reachable through more than one network plane.
Brokered Linux leases can join a Tailscale tailnet, brokered AWS Windows and
EC2 Mac leases stay public, and static SSH targets can be on either depending
on how the operator configured them. The CLI picks one plane per command and
prints which it picked.
## Modes
```text
--network auto       prefer tailnet when reachable, otherwise fall back to public
--network tailscale  require tailnet reachability; fail otherwise
--network public     ignore tailnet metadata and use the public address
```
`auto` is the default. It optimizes for "do not surprise me": prefer tailnet
when both client and runner are on the tailnet, fall back transparently to
the public path when the client is off-tailnet.
`tailscale` is the strict mode. Use it when you specifically want to verify
tailnet reachability or when the public IP is firewalled to a CI runner that
your local box cannot reach.
`public` is the escape hatch. Use it when the tailnet metadata is stale, when
you are debugging public-network issues, or when the client cannot reach the
tailnet for unrelated reasons.
The mode applies to `crabbox ssh`, `crabbox run`, `crabbox vnc`, and
`crabbox webvnc`. `crabbox status --network auto` also resolves through this
path so the printed address matches what later commands will use.
## How `auto` Picks A Plane
For a lease with tailnet metadata, `auto` mode:
1. reads `tailscale_fqdn`, `tailscale_ipv4`, and `tailscale_hostname` from the
server labels;
2. probes the first non-empty value with a 5-second TCP connection probe
against the SSH port;
3. uses that target if the probe succeeds;
4. falls back to the public IP and prints `network=public` with the reason
`tailscale_unreachable`.
For a lease with no tailnet metadata, `auto` is just public mode.
Static SSH targets behave the same way when the static host name is a
MagicDNS or `100.x` address. If the operator points `static.host` at a
MagicDNS name, `--network tailscale` works without any other configuration -
the address is already on the tailnet.
## Public Reachability
Brokered AWS Linux, AWS Windows, AWS Mac, Hetzner Linux, Daytona, and Islo
leases all expose at least one public address. Crabbox stores the public
address on the server record and uses it whenever the network mode resolves
to `public`.
Public addresses are gated by the provider's security group / firewall. AWS
managed leases use the `crabbox-runners` security group with SSH ingress
limited to the configured CIDRs or the request source IP. Hetzner managed
leases use the cloud firewall attached to the project; the broker keeps it
limited to the operator's IPs.
If your client IP changes during a long warmup, the existing security group
rule may not include the new IP. Re-running `crabbox status` adds the
current IP back and updates the rule.
## Tailnet Reachability
When a managed Linux lease is created with `--tailscale`, cloud-init:
- installs the Tailscale package;
- joins the tailnet with the configured tags (default `tag:crabbox`);
- writes non-secret metadata to `/var/lib/crabbox/tailscale-*`;
- extends `crabbox-ready` with a bounded check that a `100.x` address has
been assigned;
- discards the auth key after `tailscale up` so it never persists.
The metadata Crabbox stores on the lease record:
```text
tailscale=true
tailscale_hostname=blue-lobster
tailscale_fqdn=blue-lobster.tail-scale.ts.net
tailscale_ipv4=100.64.0.5
tailscale_state=ok
tailscale_tags=tag:crabbox
tailscale_exit_node=...
tailscale_exit_node_allow_lan_access=true|false
```
Brokered leases get a one-shot auth key minted by the Worker via Tailscale
OAuth (`worker/src/tailscale.ts`). Direct-provider leases use a key from
`CRABBOX_TAILSCALE_AUTH_KEY`. The auth key is never stored on the runner.
When the metadata says the lease is on the tailnet but the client cannot
reach it, the most common reasons are:
- the client is not joined to the tailnet (`tailscale status` on the client);
- ACLs block the tag pair from reaching `100.x`;
- the runner's `tailscaled` process died (rare; readiness probes catch it
before the lease is handed back).
`crabbox status --id <lease> --network tailscale` is the fastest way to test
tailnet reachability after lease creation.
## SSH Port And Fallback
Crabbox runs SSH on a non-standard port by default to keep noise out of the
provider firewall logs:
```yaml
ssh:
  port: "2222"
  fallbackPorts:
    - "22"
```
`ssh.port` is the primary port the bootstrap binds to. `ssh.fallbackPorts` is
an ordered list of additional ports the CLI will try when the primary port
is unreachable - typically because the operator's egress is restricted, the
sshd has not bound the new port yet, or cloud-init is still mid-flight.
Fallback rules:
- the CLI tries primary first, then each fallback in order;
- the first port that opens a TCP connection wins for that command;
- success is sticky for the run; the next command repeats the probe;
- the CLI prints `ssh-port-fallback=22` when fallback was used.
Set `ssh.fallbackPorts: []` or `CRABBOX_SSH_FALLBACK_PORTS=none` to disable
fallback entirely. Some networks prefer this so a misconfigured `2222` rule
fails loud instead of quietly using `22`.
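The fallback rules can be sketched in Go as below. This is an illustration under stated assumptions: `pickSSHPort` and the injected `dialFunc` are hypothetical names, and the 5-second dial timeout in `tcpDial` is an assumption rather than a documented value for this probe.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialFunc reports whether host:port accepts a TCP connection.
type dialFunc func(host, port string) bool

// tcpDial is a real dialer; the timeout value is an assumption.
func tcpDial(host, port string) bool {
	conn, err := net.DialTimeout("tcp", net.JoinHostPort(host, port), 5*time.Second)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// pickSSHPort tries the primary port first, then each fallback in order.
// It returns the winning port, whether a fallback was used, and success.
func pickSSHPort(host, primary string, fallbacks []string, dial dialFunc) (port string, fellBack, ok bool) {
	if dial(host, primary) {
		return primary, false, true
	}
	for _, p := range fallbacks {
		if dial(host, p) {
			return p, true, true
		}
	}
	return "", false, false
}

func main() {
	// Stub dialer simulating a network where only port 22 is reachable.
	only22 := func(_, port string) bool { return port == "22" }
	port, fellBack, ok := pickSSHPort("203.0.113.10", "2222", []string{"22"}, only22)
	if ok && fellBack {
		fmt.Printf("ssh-port-fallback=%s\n", port)
	}
}
```

Passing an empty fallback slice models `ssh.fallbackPorts: []`: a blocked primary port then fails immediately instead of quietly using `22`.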
## Loopback-Bound Capabilities
Lease capabilities (desktop, code) are bound to loopback on purpose so they
do not need provider firewall changes:
```text
VNC          127.0.0.1:5900  reached via SSH tunnel
code-server  127.0.0.1:8080  reached via portal bridge
```
The network mode does not change loopback bindings. `--network` only changes
which interface the SSH tunnel or portal bridge uses to talk to the lease.
Loopback-bound services stay reachable from the runner itself in every mode.
## Static Hosts
Static SSH targets honor the same modes:
- `--network public` uses `static.host` as configured;
- `--network tailscale` requires `static.host` to be a MagicDNS name or
`100.x` address, then probes for SSH reachability;
- `--network auto` defers to the resolved address: if `static.host` is on
the tailnet, that is what `auto` uses; otherwise it is public.
Tailscale-managed bootstrap (`--tailscale`) is rejected for static providers.
Static hosts are operator-owned; Crabbox does not install Tailscale on them.
Set `static.host` to a tailnet address and select `--network tailscale`
explicitly.
## Failure Surface
When a network mode cannot be satisfied, the CLI exits with code 5 and a
message that names the mode and the lease:
```text
network=tailscale requested but lease cbx_... has no tailnet address
network=tailscale requested for static host mac-studio but SSH is not reachable
network=tailscale requested but blue-lobster.tail-scale.ts.net is not reachable over SSH
```
`auto` mode never fails on a tailnet probe; it falls back to public and
records the reason. The `network=public reason=tailscale_unreachable` log
line is the diagnostic signal that the tailnet plane is unhealthy even
though the command kept working.
Related docs:
- [Tailscale](tailscale.md)
- [Runner bootstrap](runner-bootstrap.md)
- [SSH keys](ssh-keys.md)
- [vnc command](../commands/vnc.md)
- [ssh command](../commands/ssh.md)
- [doctor command](../commands/doctor.md)


@ -0,0 +1,165 @@
# OpenClaw Plugin
Read when:
- enabling Crabbox as a plugin inside OpenClaw;
- changing the plugin tools, schema, or wrapper behavior;
- understanding why some Crabbox surfaces are CLI-only and not plugin tools.
The Crabbox repository root is also a native OpenClaw plugin package. When
OpenClaw loads the plugin, it exposes a small set of agent tools that shell
out to the user's installed `crabbox` binary. The plugin does not embed the
CLI or duplicate any of its logic - it is a thin contract for safe, allowlisted
invocations.
## Plugin Manifest
`openclaw.plugin.json` declares the plugin id, the tools it owns, and the
config schema:
```json
{
  "id": "crabbox",
  "name": "Crabbox",
  "description": "Run Crabbox remote testbox checks from OpenClaw.",
  "activation": { "onStartup": true },
  "contracts": {
    "tools": [
      "crabbox_run",
      "crabbox_warmup",
      "crabbox_status",
      "crabbox_list",
      "crabbox_stop"
    ]
  },
  "configSchema": { ... }
}
```
The runtime entrypoint is `index.js`. Tests in `index.test.js` lock the tool
schemas, argv shapes, output trimming, and config validation so a future
refactor cannot silently change the agent-facing contract.
## Tools
```text
crabbox_run     run a command on a leased remote box
crabbox_warmup  acquire a warm box for repeated commands
crabbox_status  query a lease's state
crabbox_list    list visible leases for the current owner/org
crabbox_stop    stop a lease and release its provider resources
```
Each tool accepts an argv array of `string` plus an optional `env` object of
string values. The plugin enforces these as JSON schema before invoking the
binary, so an agent cannot pass arbitrary shell commands or non-string env
values.
`crabbox_run`, `crabbox_warmup`, and `crabbox_stop` can be disabled per
install by setting `allowRun`, `allowWarmup`, or `allowStop` to `false` in
plugin config. `crabbox_status` and `crabbox_list` are read-only and always
allowed.
## Config
The plugin accepts only six config keys, all optional:
```json
{
  "binary": "crabbox",
  "maxOutputBytes": 60000,
  "timeoutSeconds": 1800,
  "allowRun": true,
  "allowWarmup": true,
  "allowStop": true
}
```
| Key | Default | Effect |
|:----|:--------|:-------|
| `binary` | `crabbox` | Path to the Crabbox binary. Set when the binary is not on PATH. |
| `maxOutputBytes` | 60000 | Max captured stdout/stderr returned to the model per call. |
| `timeoutSeconds` | 1800 | Default wrapper timeout for a Crabbox CLI invocation. |
| `allowRun` | true | Gate `crabbox_run`. |
| `allowWarmup` | true | Gate `crabbox_warmup`. |
| `allowStop` | true | Gate `crabbox_stop`. |
Crabbox config (broker URL, provider, token, profile, class) lives in the
user/repo config files. The plugin does not duplicate those keys; it inherits
them from whatever `crabbox config show` would return for the agent's
working directory.
## Output Handling
The plugin captures stdout and stderr separately, trims each to
`maxOutputBytes`, and reports the exit code, the trimmed bytes, and a
truncation flag back to the model. Truncated output gets a tail marker so
agents know they did not get the full transcript:
```text
... [output truncated; 12345 of 87654 bytes shown]
```
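The trimming rule can be illustrated with a short Go sketch. The plugin itself is JavaScript (`index.js`), so this only models the logic: `trimOutput` is a hypothetical name, and keeping the head of the output with the marker appended at the end is an assumption based on the "tail marker" wording above.

```go
package main

import "fmt"

// trimOutput keeps at most max bytes of out and reports whether it truncated.
// Slicing by bytes may split a multi-byte UTF-8 character at the cut point.
func trimOutput(out []byte, max int) (string, bool) {
	if max < 0 {
		max = 0
	}
	if len(out) <= max {
		return string(out), false
	}
	marker := fmt.Sprintf("\n... [output truncated; %d of %d bytes shown]", max, len(out))
	return string(out[:max]) + marker, true
}

func main() {
	text, truncated := trimOutput([]byte("line one\nline two\nline three\n"), 9)
	fmt.Println(truncated)
	fmt.Println(text)
}
```

In this sketch stdout and stderr would each go through `trimOutput` separately, so one noisy stream cannot consume the other's budget.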
Long-running tools still respect `timeoutSeconds`. When the wrapper times
out, the plugin sends SIGTERM, waits a short grace period, then escalates to
SIGKILL. The exit code in the response reflects the wrapper outcome, not the
inner remote command.
## What Belongs In The CLI Instead
History, log inspection, attach, results, usage, and admin operations are
intentionally not plugin tools. They are best run from a shell-capable agent:
```sh
crabbox history --lease cbx_...
crabbox events run_... --after 0 --limit 50
crabbox attach run_...
crabbox logs run_...
crabbox results run_...
crabbox usage --scope user
crabbox admin leases --state active
crabbox cleanup --dry-run
```
Reasons for keeping these out of the plugin:
- they often produce more output than `maxOutputBytes` can usefully capture;
- agents tend to want raw logs they can grep, not trimmed model output;
- admin tools are easier to gate at the shell level (env, allowlists) than
through plugin config;
- `crabbox attach` is interactive by design.
## Provider Allowlist
The plugin schema constrains the `provider` argument to the providers
Crabbox actually supports:
```text
aws | hetzner | ssh | blacksmith-testbox | blacksmith | daytona | islo
```
Adding a provider to the CLI requires updating this list in `index.js` and
the test fixture in `index.test.js`. The schema is the agent-facing contract;
without the update, the new provider would be rejected by JSON validation
before reaching the binary.
## When To Update
Edit the plugin when you:
- add or remove a provider;
- add a new agent-safe tool (read-only, owner-scoped, bounded output);
- change argv conventions across all `crabbox` commands (rare);
- update default timeouts or output budgets.
Run `node --test index.test.js` after every change. The tests exercise the
schema, argv handling, and output trimming end-to-end.
Related docs:
- [docs/README.md](../README.md) - top-level overview includes the plugin.
- [Source map](../source-map.md) - `package.json`, `openclaw.plugin.json`,
`index.js`, `index.test.js`.
- [run command](../commands/run.md) - what `crabbox_run` ultimately invokes.
- [warmup command](../commands/warmup.md) - what `crabbox_warmup` invokes.
- [stop command](../commands/stop.md) - what `crabbox_stop` invokes.

168
docs/features/portal.md Normal file

@ -0,0 +1,168 @@
# Browser Portal
Read when:
- using the web UI to inspect leases or runs;
- changing portal pages or page-level routes;
- deciding whether a feature should land in the CLI, the API, or the portal.
The browser portal is a small server-rendered web UI hosted by the same
Cloudflare Worker that backs the Crabbox API. It is not a separate frontend
or single-page app: every page is HTML rendered by the Worker, with light
client-side JavaScript only for filtering, sorting, and clipboard copy.
## URL Map
```text
GET /portal
GET /portal/leases/{id-or-slug}
GET /portal/leases/{id-or-slug}/share
POST /portal/leases/{id-or-slug}/share
POST /portal/leases/{id-or-slug}/release
GET /portal/leases/{id-or-slug}/vnc
GET /portal/leases/{id-or-slug}/code/
GET /portal/runs/{run-id}
GET /portal/runs/{run-id}/logs
GET /portal/runs/{run-id}/events
GET /portal/runners/{provider}/{runner-id}
```
Portal authentication uses a browser session cookie minted after a successful
GitHub login through the same OAuth flow as `crabbox login`. The cookie
carries owner/org claims; the Worker scopes every page to that identity. Raw
Cloudflare Access headers are not trusted - only a verified Access JWT email
can become the portal owner.
## Lease Index `/portal`
The index renders a searchable, paginated, sortable lease grid. Columns
include compact provider/target badges, icon-only access capabilities (SSH,
VNC, code, browser), relative time cells, dense rows, and sticky column
headers. Filters at the top of the page select active, ended, provider,
target, or all.
Default view rules:
- Defaults to active leases when any are active.
- Falls back to all visible leases when the active list is empty.
- Normal browser sessions see their own leases plus leases shared directly
with them or with their org.
- Admin sessions also see non-owned runner leases. `mine` and `system`
filters distinguish personal leases from external runners (Blacksmith
Testboxes synced from CLI list output) so external rows do not leak to
normal users.
External runner rows render in the same grid as muted, disabled rows. They
include status/provider filters, inferred GitHub Actions run/workflow links,
status badges, `stuck` markers for long-queued or long-running Actions
owners, a copyable local stop command, and stale markers when the next
runner sync no longer sees a previously visible runner. Clicking an
external runner opens `/portal/runners/{provider}/{runner-id}`, a
visibility-only detail page.
## Lease Detail `/portal/leases/{id-or-slug}`
The lease detail page shows:
- compact provider/target badges and the lease state pill;
- bridge status for the WebVNC, code-server, and mediated egress bridges,
including host/client connection state for an active egress session;
- the latest Linux telemetry sample as gauges, with sparklines when multiple
samples are present;
- stale-telemetry, high-load, high-memory, and high-disk status pills when
thresholds are exceeded;
- an access panel with copy-to-clipboard commands for `crabbox ssh`,
`crabbox run`, `crabbox webvnc`, `crabbox code`, and (when an egress
session is active) `crabbox egress status` / `crabbox egress stop`;
- a viewport-fitted "recent runs" grid with state filters;
- a stop action when the lease is releasable.
Owners and users with `manage` access see a share control in the top-right
lease header. The share page can add individual users, set org-wide access, or
clear sharing. `use` shares can open visible lease pages and portal bridges;
`manage` shares can also change sharing and stop the lease.
`/portal/leases/{id-or-slug}/vnc` and `/portal/leases/{id-or-slug}/code/`
are bridges, not portal pages. They proxy WebSocket and HTTP traffic to the
matching capability on the lease so a user does not need an SSH tunnel to
open the desktop or editor. The mediated egress bridge has its own
ticketed websocket route under `/v1/leases/{id-or-slug}/egress/...` rather
than a portal path, because egress is operator-driven and never opens an
HTML view. See [Interactive desktop and VNC](interactive-desktop-vnc.md),
[code command](../commands/code.md), and [Mediated egress](egress.md).
All bridge tickets travel as `Authorization: Bearer ...` headers on the
agent websocket upgrade, with a `?ticket=` query string fallback for older
CLIs. The portal never echoes ticket values back to the browser.
## Run Detail `/portal/runs/{run-id}`
Run detail mirrors the `/v1/runs/...` resources but uses the browser session
cookie, so users can inspect logs and events without copying a bearer token
into the browser. The page renders:
- the command, owner, lease, provider metadata, and exit status;
- a JUnit summary when the run attached results;
- a searchable, paginated event table with event-type filters;
- a copyable retained log tail;
- bounded load, memory, and disk trend lines for longer Linux runs that
attached mid-run telemetry samples.
`/portal/runs/{run-id}/logs` returns the retained log as plain text.
`/portal/runs/{run-id}/events` returns the events as JSON. Both stay raw on
purpose so they are easy to copy or pipe.
## Runner Detail `/portal/runners/{provider}/{runner-id}`
External runner detail is visibility-only. It shows:
- owner/org;
- inferred GitHub Actions ownership (workflow, run id, status);
- lifecycle timestamps;
- boundary notes that explain Crabbox cannot stop or release the runner;
- a copyable local stop command for the operator's terminal.
External runners do not heartbeat through Crabbox and do not participate in
Crabbox lease expiry, cleanup, or cost accounting. The detail page exists so
operators have a single URL to share when an external runner is stuck.
## Authentication And Scope
```text
session  authenticated GitHub user (owner/org embedded)
admin    portal sessions with the admin token role
```
Per-route scope rules:
- Lease index, lease detail, run detail: the session's own leases and runs,
plus leases shared directly with the user or with their org.
- Admin filters and external runner visibility: admin sessions only.
- VNC and code bridges: only when the lease has the matching capability and
the session owns the lease or holds a `use` or `manage` share on it.
Tokens for `/v1/...` API calls are separate. The portal never echoes a
bearer token back to the browser.
## Why Server-Rendered
The portal is intentionally a thin server-rendered surface, not a SPA:
- the Worker already owns lease and run data; rendering at the edge avoids a
separate API/UI deployment;
- pages stay copy-pasteable - URLs deep-link to a specific lease or run;
- there is no build step, no JavaScript framework, and no offline session
management to maintain;
- the portal cannot drift from the API because both serve the same Durable
Object state.
Adding a portal feature usually means a new render in `worker/src/portal.ts`,
a new endpoint in `worker/src/fleet.ts`, and a doc update here.
Related docs:
- [Coordinator](coordinator.md)
- [Broker auth and routing](broker-auth-routing.md)
- [History and logs](history-logs.md)
- [Telemetry](telemetry.md)
- [Interactive desktop and VNC](interactive-desktop-vnc.md)
- [Source map](../source-map.md)


@ -0,0 +1,86 @@
# Prebaked Runner Images
Read when:
- creating or promoting Crabbox runner images;
- speeding up desktop/browser QA leases;
- deciding whether state belongs in a provider image, a warm lease, or a repo cache.
Prebaked images store machine capabilities, not scenario state.
## Where Images Live
Provider-owned image storage is the source of truth:
- AWS: AMIs plus their EBS snapshots live in the AWS account. `crabbox image
promote` stores the selected AMI id in coordinator metadata so future AWS
brokered leases can use it.
- Hetzner: snapshots/images live in the Hetzner project. Crabbox can already
boot a configured image through `image`/`CRABBOX_HETZNER_IMAGE`, but
create/promote lifecycle commands are not implemented for Hetzner yet.
- Blacksmith Testbox: images are owned by Blacksmith/GitHub runner
infrastructure, not Crabbox.
Do not store image bytes in git, release artifacts, or coordinator durable
state. The coordinator should hold only the current provider image identifier,
promotion metadata, and enough tags to explain provenance.
## Bake Into Images
Good prebake contents:
- OS patches and base packages;
- SSH, Git, rsync, curl, jq, and readiness helpers;
- desktop/browser capabilities for `--desktop --browser` leases;
- screenshot and recording tools such as `scrot`, `ffmpeg`, `xdotool`, and VNC;
- Node 22, corepack/pnpm, build-essential, Python, and common native-addon
headers when the image targets browser/channel QA;
- empty shared cache directories such as `/var/cache/crabbox/pnpm`.
Bad prebake contents:
- personal or CI secrets;
- browser profiles, Slack/Discord/WhatsApp login state, cookies, or OAuth
tokens;
- repository checkouts, `node_modules`, built product `dist/`, or PR artifacts;
- one-off debugging files.
## Runtime Caches
Runtime caches belong outside the image:
- warm leases can keep `/var/cache/crabbox/pnpm` and browser profiles for
short-lived operator sessions;
- GitHub Actions should cache candidate pnpm stores by lockfile and platform;
- product-specific runtime bundles and evidence artifacts belong in the repo
workflow workspace, for example under `.artifacts/qa-e2e/...`;
- long-lived reusable volumes should be keyed by repo, lockfile, Node version,
platform, and image id before Crabbox mounts them into leases.
This split keeps images reusable across repositories while still letting slow QA
systems skip repeated dependency work when they deliberately reuse a warm lease
or a keyed external cache.
## Operator Flow
Use the [Image bake runbook](image-bake-runbook.md) for the exact AWS bake,
candidate smoke, promotion, rollback, and cleanup commands. At a high level:
1. Warm a fresh `--desktop --browser` AWS lease.
2. Verify the machine capability contract on that lease.
3. Create an AMI with `crabbox image create --wait`.
4. Boot the AMI explicitly through an image override and smoke it.
5. Promote the AMI with `crabbox image promote`.
6. Run a normal brokered lease and the relevant QA lane.
7. Keep the previous known-good AMI until the new image has real QA proof.
For Mantis, image bake success is not just "Chrome exists." A useful image must
reduce `crabbox.warmup` or `crabbox.remote_run` time in the Mantis timing
report while keeping Slack/browser login state outside the image.
Related docs:
- [Image bake runbook](image-bake-runbook.md)
- [image command](../commands/image.md)
- [Runner bootstrap](runner-bootstrap.md)
- [Interactive desktop and VNC](interactive-desktop-vnc.md)


@ -0,0 +1,496 @@
# Authoring A Provider
Read when:
- adding a new Crabbox provider end to end;
- porting a hosted runner service into Crabbox;
- learning what core owns versus what your backend owns.
This page is the step-by-step guide. The contract reference for backend
interfaces, registration, and review checklist lives in
[Provider backends](../provider-backends.md). Read this page first, then use
that reference as a checklist while you implement.
## What A Provider Does
A Crabbox provider answers four questions:
1. What execution model does the provider expose?
2. What targets and capabilities can it satisfy?
3. How does it acquire, resolve, list, and release a runner?
4. What flags and config does it own that core does not?
Everything else - command parsing, sync, command streaming, recorded runs,
heartbeats, slugs, claims, list/status rendering, JSON output - belongs to
core. A provider that needs to fork those concerns is fighting the design.
## Step 1. Pick The Backend Shape
Two execution models exist:
- `SSHLeaseBackend` - the provider hands Crabbox a real SSH target. Core owns
sync, command streaming, results, heartbeats, and release. Use this when you
can populate `LeaseTarget.SSH` with host, port, user, key, work root, and
target OS.
- `DelegatedRunBackend` - the provider owns command execution and streams output
back to Crabbox. Use this when you cannot give Crabbox a stable SSH contract
(Blacksmith Testbox, Islo, Daytona's `run` path).
If you can give Crabbox SSH, prefer `SSHLeaseBackend`. The CLI has more invested
in the SSH path, including Actions hydration, VNC, code-server, screenshot,
and cache stats/warm/purge. A delegated backend cannot reuse those without a
stable connection contract.
| Capability | SSH lease | Delegated run |
|:-----------|:----------|:--------------|
| `crabbox run` | yes | yes |
| `crabbox warmup` | yes | yes |
| `crabbox ssh` | yes | only if you implement short-lived SSH |
| `crabbox vnc / webvnc / code` | yes (Linux + browser) | no |
| `crabbox actions hydrate` | yes (Linux) | no |
| `crabbox cache stats / purge / warm` | yes | no |
| Crabbox-owned sync | yes | no - your backend owns sync |
| Coordinator support | optional | not used |
## Step 2. Lay Out The Package
Built-in providers live under `internal/providers/<name>`:
```text
internal/providers/example/
  provider.go      # Provider type, init() registration, Spec()
  backend.go       # SSH lease or delegated run implementation
  flags.go         # provider-specific flag struct (optional)
  example.go       # API client, helpers, types
  example_test.go  # backend tests, no live calls
```
Then add the side-effect import in `internal/providers/all/all.go`:
```go
import _ "github.com/openclaw/crabbox/internal/providers/example"
```
`cmd/crabbox` imports `internal/providers/all` already; nothing else needs to
change for the binary to see the new provider.
Tests inside `internal/cli` cannot import `internal/providers/all` because that
creates a cycle. If you need a test provider for core dispatch, register it from
a same-package test file.
## Step 3. Register The Provider
A provider is a small struct that satisfies `cli.Provider`:
```go
package example

import (
	"flag"

	core "github.com/openclaw/crabbox/internal/cli"
)

func init() {
	core.RegisterProvider(Provider{})
}

type Provider struct{}

func (Provider) Name() string      { return "example" }
func (Provider) Aliases() []string { return nil }

func (Provider) Spec() core.ProviderSpec {
	return core.ProviderSpec{
		Name: "example",
		Kind: core.ProviderKindSSHLease,
		Targets: []core.TargetSpec{
			{OS: core.TargetLinux},
		},
		Features: core.FeatureSet{
			core.FeatureSSH,
			core.FeatureCrabboxSync,
			core.FeatureCleanup,
		},
		Coordinator: core.CoordinatorNever,
	}
}

func (Provider) RegisterFlags(*flag.FlagSet, core.Config) any {
	return core.NoProviderFlags()
}

func (Provider) ApplyFlags(*core.Config, *flag.FlagSet, any) error {
	return nil
}

func (p Provider) Configure(cfg core.Config, rt core.Runtime) (core.Backend, error) {
	return NewExampleBackend(p.Spec(), cfg, rt), nil
}
```
`Name()` is the canonical name used in docs, config (`provider: example`), and
the `--provider` flag. Aliases are for compatibility - Blacksmith uses
`blacksmith` as an alias for `blacksmith-testbox`. Do not invent aliases for
new providers; pick one canonical name.
`Spec()` is the source of truth for what the provider can do. Read on.
## Step 4. Be Honest In `Spec`
`ProviderSpec` is command-facing metadata. Help text, target validation, and
feature gating all read from it.
```go
type ProviderSpec struct {
	Name        string
	Kind        ProviderKind
	Targets     []TargetSpec
	Features    FeatureSet
	Coordinator CoordinatorMode
}
```
Rules:
- `Kind` must match the real execution model. Do not declare `SSHLease` if you
cannot return a usable `SSHTarget`.
- `Targets` lists only OS combinations you actually support end to end. Hetzner
is `linux` only. AWS lists `linux`, `windows` (normal and `wsl2`), and
`macos`. Static SSH lists all three but does no setup; the host must already
match.
- `Features` lists concrete capabilities:
- `FeatureSSH` - plain SSH access works.
- `FeatureCrabboxSync` - core can rsync a manifest into the runner.
- `FeatureCleanup` - implement `CleanupBackend` for orphan cleanup.
- `FeatureDesktop`, `FeatureBrowser`, `FeatureCode` - lease can host a visible
desktop, browser, or code-server instance.
- `FeatureTailscale` - lease can join a tailnet via cloud-init/`--tailscale`.
- `Coordinator` is `CoordinatorSupported` only when the Cloudflare Worker can
provision your runners. Direct-only providers, including all delegated run
backends and Static SSH, set `CoordinatorNever`.
Actions runner hydration is not a feature flag. Core checks for an SSH lease
backend on `target=linux` instead. Declaring `FeatureSSH` on a provider that
also supports non-Linux targets is fine; listing `linux` in `Targets` on a
backend that cannot satisfy it is not.
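To make the gating rule concrete, here is a self-contained Go sketch that uses stand-in types instead of the real `core` package. The names `spec`, `KindSSHLease`, `KindDelegatedRun`, and `canHydrateActions` are hypothetical; only the rule itself (SSH lease backend plus `target=linux`) comes from this page.

```go
package main

import "fmt"

// ProviderKind is a stand-in for the real kind enum.
type ProviderKind string

const (
	KindSSHLease     ProviderKind = "ssh-lease"
	KindDelegatedRun ProviderKind = "delegated-run"
)

// spec is a reduced stand-in for ProviderSpec.
type spec struct {
	Name    string
	Kind    ProviderKind
	Targets []string
}

// supportsTarget reports whether the spec declares the given target OS.
func (s spec) supportsTarget(target string) bool {
	for _, t := range s.Targets {
		if t == target {
			return true
		}
	}
	return false
}

// canHydrateActions applies the rule: SSH lease backend on target=linux.
func canHydrateActions(s spec, target string) bool {
	return s.Kind == KindSSHLease && target == "linux" && s.supportsTarget(target)
}

func main() {
	sshLinux := spec{Name: "example", Kind: KindSSHLease, Targets: []string{"linux"}}
	delegated := spec{Name: "example-delegated", Kind: KindDelegatedRun, Targets: []string{"linux"}}
	fmt.Println(canHydrateActions(sshLinux, "linux"))
	fmt.Println(canHydrateActions(delegated, "linux"))
}
```

Because the check derives from `Kind` and `Targets`, an inaccurate `Spec` would let core offer hydration on a backend that cannot deliver it.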
## Step 5. Own Provider-Specific Flags
Go's `flag` package rejects unknown flags, so provider flags must be registered
before parse and applied only after a provider is selected.
```go
type exampleFlagValues struct {
	Region *string
}

func (Provider) RegisterFlags(fs *flag.FlagSet, defaults core.Config) any {
	return exampleFlagValues{
		Region: fs.String("example-region", defaults.Example.Region, "Example region"),
	}
}

func (Provider) ApplyFlags(cfg *core.Config, fs *flag.FlagSet, values any) error {
	v, ok := values.(exampleFlagValues)
	if !ok {
		return nil
	}
	if core.FlagWasSet(fs, "example-region") {
		cfg.Example.Region = *v.Region
	}
	return nil
}
```
Conventions:
- Prefix every flag name with the provider name (`--blacksmith-org`,
`--aws-region`). Crabbox does not gate flag visibility per provider, so the
prefix is the only thing keeping namespaces clean.
- `RegisterFlags` must be cheap and side-effect free. It runs for every
provider on every command, even when that provider is not selected.
- Apply only flags that were explicitly set with `FlagWasSet`. Otherwise zero
values from one command will overwrite intentional config from another.
- For providers that need rich config but have no flags, return
`core.NoProviderFlags()` from `RegisterFlags` and ignore the values in
`ApplyFlags`.
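The "apply only when explicitly set" convention can be sketched with the standard library alone. This is a minimal illustration, assuming a helper like `core.FlagWasSet` is built on `flag.FlagSet.Visit`; `flagWasSet` and `resolveRegion` are hypothetical names, not Crabbox APIs.

```go
package main

import (
	"flag"
	"io"
)

// flagWasSet reports whether name was explicitly passed on the command line.
// Hypothetical stand-in for core.FlagWasSet; the real helper may differ.
func flagWasSet(fs *flag.FlagSet, name string) bool {
	set := false
	// Visit walks only flags that were set during Parse, unlike VisitAll.
	fs.Visit(func(f *flag.Flag) {
		if f.Name == name {
			set = true
		}
	})
	return set
}

// resolveRegion parses args and returns the flag value when --example-region
// was passed, otherwise the already-configured value.
func resolveRegion(args []string, configured string) (string, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of the demo output
	region := fs.String("example-region", "", "Example region")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if flagWasSet(fs, "example-region") {
		return *region, nil
	}
	return configured, nil
}
```

With this shape, an unset flag leaves the configured region alone, while an explicit `--example-region=` still overrides it with the empty string.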
Never accept secrets as flag arguments. Pull them from environment variables,
SDK config, the coordinator, or the operator's credential store. Flags are
visible in shell history, process listings, and recorded run logs.
## Step 6. Implement The Backend
Pick the interface that matches the kind you declared.
### SSH Lease Backend
```go
type SSHLeaseBackend interface {
Backend
Acquire(ctx context.Context, req AcquireRequest) (LeaseTarget, error)
Resolve(ctx context.Context, req ResolveRequest) (LeaseTarget, error)
List(ctx context.Context, req ListRequest) ([]LeaseView, error)
ReleaseLease(ctx context.Context, req ReleaseLeaseRequest) error
Touch(ctx context.Context, req TouchRequest) (Server, error)
}
```
`Acquire` is the heavy lifter. A complete implementation:
1. validates direct-mode prerequisites (credentials, region, image);
2. accepts the lease ID from `req` or generates one if the provider needs it;
3. ensures or installs the per-lease SSH key with the provider;
4. provisions the machine or sandbox with Crabbox labels/tags;
5. waits for the provider to assign an address;
6. populates `SSHTarget` with host, port, user, key, work root, target OS, and
any Windows mode;
7. waits for SSH readiness when the provider owns boot;
8. flips provider labels/tags to `ready`;
9. returns the populated `LeaseTarget`.
`Resolve` handles `crabbox run --id`, `crabbox ssh --id`, and similar reuse
paths. Accept canonical lease IDs; accept slugs and provider-native IDs when
you can. Return the stored per-lease SSH key when available so reuse does not
need a fresh key.
`List` returns `[]LeaseView` (an alias for `Server`). Do not print from `List`;
core renders the table.
`Touch` updates idle/state metadata on the provider when possible. Use
`provider_labels.go` helpers for safe label encoding. For static providers, an
in-memory update is enough.
`ReleaseLease` is called when a lease ends or expires. Make it idempotent;
treat `not found` as success. Remove local claims and per-lease key
directories after the provider release succeeds.
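An idempotent release might look like the sketch below. The `deleter` interface, the `errNotFound` sentinel, and the `removeLocal` callback are all hypothetical; a real backend would use its provider SDK's own not-found detection.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel a provider client might return.
var errNotFound = errors.New("not found")

// deleter is a hypothetical slice of a provider client.
type deleter interface {
	DeleteServer(id string) error
}

// releaseLease deletes the provider resource and treats "not found" as
// success, so a retried or duplicate release is harmless. Local claims and
// keys are removed only after the provider delete has succeeded.
func releaseLease(client deleter, id string, removeLocal func(id string) error) error {
	if err := client.DeleteServer(id); err != nil && !errors.Is(err, errNotFound) {
		return fmt.Errorf("release %s: %w", id, err)
	}
	return removeLocal(id)
}

// fakeCloud is an in-memory deleter used only for illustration.
type fakeCloud struct{ servers map[string]bool }

func (f *fakeCloud) DeleteServer(id string) error {
	if !f.servers[id] {
		return errNotFound
	}
	delete(f.servers, id)
	return nil
}
```

Calling `releaseLease` twice for the same ID succeeds both times, which is the property a contract test should assert.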
If cleanup is meaningful, also implement:
```go
type CleanupBackend interface {
Backend
Cleanup(ctx context.Context, req CleanupRequest) error
}
```
Cleanup must honor `DryRun`, log every skip/delete decision to stderr, and
filter by Crabbox labels so it never touches unrelated machines. When a
coordinator is configured, core refuses to call provider cleanup at all -
brokered cleanup belongs to the Durable Object alarm.
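The dry-run and label-filter rules can be sketched as follows. The `machine` type, the `crabbox` label key, and the `logf`/`del` callbacks are assumptions for illustration; in a real backend `logf` would write to `rt.Stderr` and the decision would use the shared label helpers.

```go
package main

import "fmt"

// machine is a hypothetical provider listing entry.
type machine struct {
	ID     string
	Labels map[string]string
}

// cleanup deletes only machines carrying the (assumed) "crabbox" label and
// logs every skip/delete decision through logf. With dryRun set it never
// calls del.
func cleanup(logf func(format string, args ...any), machines []machine, dryRun bool, del func(id string) error) error {
	for _, m := range machines {
		if m.Labels["crabbox"] != "true" {
			logf("skip %s: not a Crabbox machine", m.ID)
			continue
		}
		if dryRun {
			logf("would delete %s (dry run)", m.ID)
			continue
		}
		logf("delete %s", m.ID)
		if err := del(m.ID); err != nil {
			return fmt.Errorf("delete %s: %w", m.ID, err)
		}
	}
	return nil
}
```

A test can pass a recording `del` and assert that it is never invoked when `dryRun` is true.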
### Delegated Run Backend
```go
type DelegatedRunBackend interface {
Backend
Warmup(ctx context.Context, req WarmupRequest) error
Run(ctx context.Context, req RunRequest) (RunResult, error)
List(ctx context.Context, req ListRequest) ([]LeaseView, error)
Status(ctx context.Context, req StatusRequest) (StatusView, error)
Stop(ctx context.Context, req StopRequest) error
}
```
`Warmup` should validate workflow/config, create or warm the provider resource,
claim the resource locally with provider name and slug, and print the standard
warmup summary.
`Run` should:
1. reject Crabbox sync options the provider cannot honor:
```go
if err := core.RejectDelegatedSyncOptions(p.Name(), req); err != nil {
return core.RunResult{}, err
}
```
2. acquire a resource or resolve an existing id/slug;
3. claim or reclaim the resource for the calling repo;
4. stream provider output through `rt.Stdout` and `rt.Stderr`;
5. return `RunResult` with command duration, exit code, and
`SyncDelegated: true`;
6. stop temporary resources when `Keep` is false.
`Status` returns a normalized `StatusView`. If the provider only emits a table,
parse it inside the backend and return structured fields - do not print the
native table.
`Stop` should stop the provider resource, remove local claims, and remove
per-resource keys the backend created.
Delegated backends should refuse `crabbox ssh`, `vnc`, `webvnc`, `screenshot`,
`code`, and Actions hydration unless the provider can keep Crabbox's security
boundary intact across those flows.
### Optional JSON Compatibility
If your provider already exposes a script-facing JSON shape that callers
depend on, add `JSONListBackend`:
```go
type JSONListBackend interface {
Backend
ListJSON(ctx context.Context, req ListRequest) (any, error)
}
```
This is an escape hatch for compatibility. New providers should not use it;
return normalized `[]LeaseView` from `List` instead and let core render JSON.
## Step 7. Use The Runtime
Backends receive a narrow runtime instead of touching package-level state:
```go
type Runtime struct {
Stdout io.Writer
Stderr io.Writer
Clock Clock
HTTP *http.Client
Exec CommandRunner
}
```
Rules:
- Use `rt.Exec.Run(ctx, core.LocalCommandRequest{...})` for every subprocess.
Never call `exec.CommandContext` directly. Tests pass a fake `CommandRunner`
to assert on argv without spawning real processes.
- Use `rt.Clock.Now()` for timing inside the backend. The default is
wall-clock; tests can pass a fake clock for deterministic timing assertions.
- Use `rt.Stdout` and `rt.Stderr` for streaming and warnings. Do not write
directly to `os.Stdout`/`os.Stderr`.
- Use `rt.HTTP` for outbound HTTP when the provider has a JSON API. Tests can
inject a stubbed transport.
Anything that bypasses runtime breaks tests and parallel safety.
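To show why routing everything through the runtime pays off, here is a sketch with simplified stand-ins. The `commandRunner` and `clock` interfaces, `miniRuntime`, and the `example-cli` argv are hypothetical; the real `CommandRunner` takes a `core.LocalCommandRequest` and may differ in shape.

```go
package main

import (
	"context"
	"time"
)

// Hypothetical, simplified stand-ins for the core runtime types.
type commandRunner interface {
	Run(ctx context.Context, name string, args ...string) error
}

type clock interface{ Now() time.Time }

type miniRuntime struct {
	Clock clock
	Exec  commandRunner
}

// recordingRunner captures argv instead of spawning a process.
type recordingRunner struct{ calls [][]string }

func (r *recordingRunner) Run(_ context.Context, name string, args ...string) error {
	r.calls = append(r.calls, append([]string{name}, args...))
	return nil
}

// fixedClock always returns the same instant.
type fixedClock struct{ t time.Time }

func (c fixedClock) Now() time.Time { return c.t }

// stopSandbox is a toy backend method: it shells out through rt.Exec and
// measures duration with rt.Clock, never touching os/exec or time.Now.
func stopSandbox(ctx context.Context, rt miniRuntime, id string) (time.Duration, error) {
	start := rt.Clock.Now()
	err := rt.Exec.Run(ctx, "example-cli", "sandbox", "stop", id)
	return rt.Clock.Now().Sub(start), err
}

// runStopWithFakes wires the fakes together the way a unit test would.
func runStopWithFakes(id string) (argv []string, elapsed time.Duration, err error) {
	rec := &recordingRunner{}
	rt := miniRuntime{Clock: fixedClock{time.Unix(0, 0)}, Exec: rec}
	elapsed, err = stopSandbox(context.Background(), rt, id)
	return rec.calls[0], elapsed, err
}
```

Because both dependencies are injected, the test sees the exact argv and a deterministic duration without any process or wall-clock access.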
## Step 8. Hand-Off Boundaries
The most common review feedback on new providers is "this belongs in core."
Use this map:
| Concern | Owned by |
|:--------|:---------|
| `--provider`, `--target`, `--id`, `--profile` parsing | core |
| Config precedence (flags → env → repo → user → defaults) | core |
| Friendly slug generation, normalization, collisions | core |
| Local claim files and `--reclaim` behavior | core |
| SSH key creation and storage under user config | core |
| `crabbox-ready` readiness wait | core |
| Repo manifest, fingerprints, rsync, sanity checks | core |
| Heartbeats, idle expiry math | core (coordinator) or core direct labels |
| Recorded runs, retained logs, telemetry samples | core |
| List/status table rendering and JSON output | core |
| Provider lifecycle (create, delete, list, label) | provider |
| Provider-native auth (SDK config, env, CLI tokens) | provider |
| Translating provider state into normalized lease views | provider |
| Rejecting unsupported delegated options | provider helper |
If your provider needs to own one of the core-owned concerns, raise it in the
PR description. The fix is usually a small core helper, not a fork.
## Step 9. Test Without Live Credentials
Land the provider with tests that prove the contract without hitting a real
account. Cover:
- Provider registration: canonical name resolves through `ProviderFor`,
declared aliases resolve, `Spec()` returns the right kind/targets/features,
flag values apply only when that provider is selected.
- SSH lease backends: `Acquire` populates a complete `LeaseTarget`, partial
failures release what they created, `Resolve` accepts the supported lookup
shapes, `List` returns normalized views, `Touch` updates state/idle, and
`ReleaseLease` is idempotent. If you implement `Cleanup`, assert dry-run
prints decisions and does not call destructive APIs.
- Delegated run backends: sync-only/checksum/force-large are rejected, fresh
`Run` acquires/streams/stops, existing `--id` resolves and reuses, `List`
and `Status` parse provider output into normalized values, `Stop` removes
claims, every subprocess goes through `rt.Exec`.
Use the existing fakes:
- a recording `CommandRunner` for argv assertions;
- a fake clock for timing;
- an `http.RoundTripper` test transport for API calls;
- per-provider test client where the provider has a typed SDK.
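An `http.RoundTripper` test transport needs only a few lines of standard library code. The sketch below is illustrative; `stubClient`, `fetchBody`, and the example URL are made-up names, and a real test would hand the stubbed client to the backend through `rt.HTTP`.

```go
package main

import (
	"io"
	"net/http"
	"strings"
)

// roundTripFunc adapts a function to http.RoundTripper.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// stubClient returns a client whose every request gets the canned response
// and whose method and path are recorded in seen. No network traffic occurs.
func stubClient(status int, body string, seen *[]string) *http.Client {
	return &http.Client{Transport: roundTripFunc(func(r *http.Request) (*http.Response, error) {
		*seen = append(*seen, r.Method+" "+r.URL.Path)
		return &http.Response{
			StatusCode: status,
			Header:     make(http.Header),
			Body:       io.NopCloser(strings.NewReader(body)),
			Request:    r,
		}, nil
	})}
}

// fetchBody is a toy API call standing in for provider client code.
func fetchBody(c *http.Client, url string) (string, error) {
	resp, err := c.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}
```

The recorded `seen` slice lets a test assert which endpoints were hit, for example that a dry-run cleanup issued no `DELETE` requests.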
Run at least:
```sh
go test -count=1 ./internal/cli ./internal/providers/...
go test -count=1 ./...
go vet ./...
npm run docs:check
```
Add a live smoke only when the provider can be exercised cheaply with
explicit credentials. Wire it into `scripts/live-smoke.sh` so it runs in the
same place as the others.
## Step 10. Document The Provider
Three doc surfaces care about a new provider:
- `docs/providers/<name>.md` - one page in the provider reference. Use the
existing pages as a template: target matrix, config keys, env vars, sync
behavior, expected failures.
- `docs/features/<name>.md` - feature page when the provider has interesting
semantics worth a separate read (capacity fallback, sandbox lifecycle,
workflow integration). Skip when the reference page already covers it.
- `docs/source-map.md` - add the new package paths under
`Providers And Runner Bootstrap` so the source map keeps tracking
implementation truth.
Also add the provider to:
- the provider table in `docs/providers/README.md`;
- the feature matrix in the same file;
- the index in `docs/features/README.md` if you added a feature page;
- the related-doc lists at the bottom of any pages you cross-link from.
Run `npm run docs:check` before pushing - it builds the CLI, validates the
command/help surface, checks every internal link, and rebuilds the docs site.
## Step 11. Ship The PR
A reviewable provider PR includes:
- a folder under `internal/providers/<name>` with `provider.go`, `backend.go`,
helpers, and tests;
- registration in `internal/providers/all/all.go`;
- doc pages in `docs/providers/<name>.md` and (optionally)
`docs/features/<name>.md`;
- index updates in `docs/providers/README.md`, `docs/features/README.md`, and
`docs/source-map.md`;
- tests that pass without live credentials;
- a CHANGELOG entry under `Unreleased` describing the new provider.
Keep the diff focused. If you find yourself touching `run.go`, `repo.go`,
`coordinator.go`, or `provider_backend.go`, stop and check whether the change
is really provider-specific or whether it should be a shared helper landed in
a separate PR.
## External Process Plugins
External provider plugins are not implemented yet. Do not add a provider that
depends on an undocumented stdio protocol. The intended direction is:
- a built-in Go provider package configures and launches the external process;
- the process speaks JSON over stdio for capabilities, acquire, resolve, list,
release, touch, run, status, and stop;
- the Go side adapts that to `SSHLeaseBackend` or `DelegatedRunBackend`;
- core commands still own list/status rendering and SSH workflows where the
provider exposes them.
When that protocol exists, a plugin will look like a normal registered provider
to the rest of Crabbox.
Related docs:
- [Provider backends](../provider-backends.md): contract reference and review
checklist.
- [Provider reference](../providers/README.md): one page per built-in backend.
- [Source map](../source-map.md): files behind documented behavior.
- [Architecture](../architecture.md): system overview and lease flow.
- [Coordinator](coordinator.md): brokered lease contract.


@ -2,18 +2,47 @@
Read when:
- changing Hetzner or AWS provisioning;
- changing Hetzner, AWS, Azure, or Blacksmith Testbox provisioning;
- adding a backend;
- adjusting machine classes, fallback order, regions, or images.
Crabbox currently supports two brokered providers:
Crabbox currently supports three brokered providers:
```text
hetzner Hetzner Cloud servers
aws AWS EC2 one-time Spot instances
aws AWS EC2 instances
azure Azure Virtual Machines
```
Hetzner behavior:
Brokered Hetzner leases are Linux targets. Brokered AWS supports Linux, native
Windows Server, Windows WSL2, and EC2 Mac when a Dedicated Host is configured.
Brokered Azure supports Linux and native Windows SSH/sync/run. Static SSH still
exists for reusing existing macOS and Windows machines:
```text
ssh Existing SSH host selected by static.host
```
Direct provider backends can also run without the Crabbox coordinator:
```text
daytona Daytona sandboxes with SDK/toolbox run and short-lived SSH access
islo Islo sandboxes with delegated command execution
```
## Provider Pages
- [Provider reference](../providers/README.md): one page per built-in backend.
- [AWS](../providers/aws.md): EC2 Linux, Windows, WSL2, EC2 Mac, capacity, AMIs, and security groups.
- [Azure](../providers/azure.md): Azure Linux/native Windows, shared infra, capacity, and cleanup.
- [Hetzner](../providers/hetzner.md): Linux-only managed provider behavior, classes, and cleanup.
- [Static SSH](../providers/ssh.md): existing Linux, macOS, and Windows SSH hosts.
- [Blacksmith Testbox](../providers/blacksmith-testbox.md): delegated Testbox backend behavior.
- [Daytona](../providers/daytona.md): Daytona SDK/toolbox sandbox leases.
- [Islo](../providers/islo.md): delegated Islo sandbox execution.
- [Provider backends](../provider-backends.md): implementation guide for adding a new provider/backend/plugin.
## Hetzner Summary
- imports or reuses the lease SSH key;
- creates a server with Crabbox labels;
@ -21,18 +50,32 @@ Hetzner behavior:
- falls back across class server types when capacity or quota rejects a request;
- fetches server-type hourly prices when cost estimates need provider pricing.
AWS behavior:
## AWS Summary
- signs EC2 Query API calls inside the Worker;
- imports or reuses an EC2 key pair;
- creates or reuses the `crabbox-runners` security group;
- launches one-time Spot instances;
- creates or reuses the `crabbox-runners` security group with SSH ingress limited to configured CIDRs or the request source IP;
- launches one-time Linux Spot or On-Demand instances;
- launches AWS Windows Server desktop leases with EC2Launch PowerShell user
data, OpenSSH, Git for Windows, and TightVNC when `target=windows`;
- launches EC2 Mac leases only with an explicit Dedicated Host id
(`CRABBOX_AWS_MAC_HOST_ID` or `aws.macHostId`) and On-Demand capacity;
- tags instances, volumes, and Spot requests;
- falls back across broad C/M/R instance families;
- falls back across broad C/M/R instance families for class requests, including account policy and capacity rejections;
- can fall back to a small burstable type when account policy rejects the high-core class candidates;
- preflights applied Spot or On-Demand vCPU quotas in brokered mode when Service Quotas allows it, then records skipped candidates as quota attempts;
- supports `--market spot|on-demand` on `warmup` and `run` for one-off capacity-market overrides;
- uses Spot placement score across configured regions in direct AWS mode;
- can fall back to On-Demand after Spot capacity/quota failures when configured;
- fetches Spot price history when cost estimates need provider pricing.
Explicit `--type` requests are treated as exact provider type requests. If that type is rejected, Crabbox fails clearly instead of silently choosing a different instance type. Remove `--type` and use a machine class when fallback is desired.
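The difference between an exact `--type` request and class fallback can be sketched like this. It is a simplified model, not the Worker's implementation: `errCapacity` is a single hypothetical sentinel standing in for capacity, quota, and account-policy rejections, and `try` is a made-up launch callback.

```go
package main

import (
	"errors"
	"fmt"
)

// errCapacity is a hypothetical sentinel for fallback-eligible rejections.
var errCapacity = errors.New("capacity or quota rejected")

// launch tries provider types in order. An explicit type is attempted once
// and never substituted; class candidates fall through on capacity errors.
func launch(explicitType string, classCandidates []string, try func(t string) error) (string, error) {
	if explicitType != "" {
		if err := try(explicitType); err != nil {
			return "", fmt.Errorf("exact type %s rejected: %w", explicitType, err)
		}
		return explicitType, nil
	}
	lastErr := errors.New("no candidates configured")
	for _, t := range classCandidates {
		err := try(t)
		if err == nil {
			return t, nil
		}
		if !errors.Is(err, errCapacity) {
			return "", err // not a fallback-eligible failure
		}
		lastErr = fmt.Errorf("%s: %w", t, err)
	}
	return "", fmt.Errorf("all candidates rejected, last: %w", lastErr)
}
```

With `c7a.8xlarge` rejected, a class request moves on to `c7i.8xlarge`, while `--type c7a.8xlarge` fails outright.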
`crabbox list` marks brokered provider machines as `orphan=no-active-lease`
when their provider label references a lease that is no longer active in the
coordinator. This is an operator hint only; `keep=true` machines are not
deleted automatically.
Machine classes map to provider-specific types:
```text
@ -47,12 +90,99 @@ standard c7a.8xlarge, c7i.8xlarge, m7a.8xlarge, m7i.8xlarge, c7a.4xlarge
fast c7a.16xlarge, c7i.16xlarge, m7a.16xlarge, m7i.16xlarge, c7a.12xlarge, c7a.8xlarge
large c7a.24xlarge, c7i.24xlarge, m7a.24xlarge, m7i.24xlarge, r7a.24xlarge, c7a.16xlarge, c7a.12xlarge
beast c7a.48xlarge, c7i.48xlarge, m7a.48xlarge, m7i.48xlarge, r7a.48xlarge, c7a.32xlarge, c7i.32xlarge, m7a.32xlarge, c7a.24xlarge, c7a.16xlarge
AWS Windows
standard m7i.large, m7a.large, t3.large
fast m7i.xlarge, m7a.xlarge, t3.xlarge
large m7i.2xlarge, m7a.2xlarge, t3.2xlarge
beast m7i.4xlarge, m7a.4xlarge, m7i.2xlarge
AWS Windows WSL2
standard m8i.large, m8i-flex.large, c8i.large, r8i.large
fast m8i.xlarge, m8i-flex.xlarge, c8i.xlarge, r8i.xlarge
large m8i.2xlarge, m8i-flex.2xlarge, c8i.2xlarge, r8i.2xlarge
beast m8i.4xlarge, m8i-flex.4xlarge, c8i.4xlarge, r8i.4xlarge, m8i.2xlarge
AWS macOS
all mac2.metal unless `--type` is set
```
Direct provider mode still exists when no coordinator is configured. It uses local AWS credentials or `HCLOUD_TOKEN`/`HETZNER_TOKEN` and should stay secondary to the brokered path.
Tailscale is not a provider. Use `--tailscale` to add tailnet reachability to
new managed Linux leases, or set a static host to a MagicDNS name/100.x address
when the existing host is already on a tailnet. See [Tailscale](tailscale.md).
Direct smoke shape:
```sh
tmp="$(mktemp)"
printf 'provider: hetzner\n' > "$tmp"
CRABBOX_CONFIG="$tmp" CRABBOX_COORDINATOR= crabbox warmup --provider hetzner --class standard --ttl 15m --idle-timeout 4m
CRABBOX_CONFIG="$tmp" CRABBOX_COORDINATOR= crabbox run --provider hetzner --id <slug> --no-sync -- echo direct-hetzner-ok
CRABBOX_CONFIG="$tmp" CRABBOX_COORDINATOR= crabbox stop --provider hetzner <slug>
rm -f "$tmp"
```
Use `--provider aws` with AWS SDK credentials for direct AWS smoke. Direct mode
has no Durable Object alarm; cleanup is best-effort through provider labels and
manual `crabbox cleanup`. Direct AWS fallback can retry provider types, but the
structured quota preflight and `provisioningAttempts` metadata belong to the
brokered Worker path.
Crabbox can also wrap Blacksmith Testboxes with `provider: blacksmith-testbox`. That backend does not use the Crabbox broker or direct cloud credentials. It shells out to the authenticated Blacksmith CLI for `testbox warmup`, `run`, `status`, `list`, and `stop`, while Crabbox keeps local slugs, repo claims, config, and timing summaries. See [Blacksmith Testbox](blacksmith-testbox.md).
Crabbox can use Daytona sandboxes with `provider: daytona`. Crabbox creates a
sandbox from `daytona.snapshot`, syncs and executes `run` through Daytona's
SDK/toolbox APIs, and mints short-lived SSH tokens only for explicit `ssh`
access. See [Daytona](daytona.md).
Crabbox can use Islo sandboxes with `provider: islo`. Islo is a delegated run
backend: the Islo Go SDK owns sandbox lifecycle and Crabbox streams command
output from Islo's exec SSE endpoint. See [Islo](islo.md).
Static SSH targets:
```yaml
provider: ssh
target: macos
static:
host: mac-studio.local
user: steipete
port: "22"
workRoot: /Users/steipete/crabbox
```
```yaml
provider: ssh
target: windows
windows:
mode: normal
static:
host: win-dev.local
user: Peter
port: "22"
workRoot: C:\crabbox
```
`target: windows` supports `windows.mode: normal` and `windows.mode: wsl2`.
Normal mode uses PowerShell over OpenSSH and syncs the manifest as a tar archive.
WSL2 mode needs nested virtualization, so managed AWS WSL2 leases use
C8i, M8i, or R8i families and enable nested virtualization at launch. Static
WSL2 hosts keep the POSIX SSH contract: commands run through
`wsl.exe --exec bash -lc`, rsync uses `wsl.exe rsync`, and `static.workRoot`
should be a WSL path such as `/home/peter/crabbox`. macOS also uses the POSIX
contract and needs `git`, `rsync`, `tar`, and SSH.
Related docs:
- [Infrastructure](../infrastructure.md)
- [Provider reference](../providers/README.md)
- [AWS](../providers/aws.md)
- [Hetzner](../providers/hetzner.md)
- [Tailscale](tailscale.md)
- [Blacksmith Testbox](../providers/blacksmith-testbox.md)
- [Daytona](../providers/daytona.md)
- [Islo](../providers/islo.md)
- [Runner bootstrap](runner-bootstrap.md)
- [Cost and usage](cost-usage.md)


@ -6,13 +6,13 @@ Read when:
- debugging machines that never become SSH-ready;
- changing the minimal runner contract or readiness checks.
Each runner is an Ubuntu machine prepared by cloud-init. It does not need coordinator credentials.
Brokered cloud runners are Ubuntu machines prepared by cloud-init. They do not need coordinator credentials.
Bootstrap creates:
- the `crabbox` user;
- SSH key-only access;
- SSH on port `2222`;
- SSH on the primary port, default `2222`, and configured fallback ports, default `22`;
- `/work/crabbox`;
- shared package caches.
@ -24,13 +24,44 @@ Bootstrap installs:
- jq;
- OpenSSH server.
Bootstrap intentionally does not install project language runtimes such as Go, Node, pnpm, Docker, databases, or service dependencies. Those belong in GitHub Actions hydration, devcontainers, Nix, mise/asdf, or repository setup scripts. A machine should not pass readiness until `crabbox-ready` succeeds over SSH.
Bootstrap intentionally does not install project language runtimes such as Go, Node, pnpm, Docker, databases, or service dependencies. Those belong in GitHub Actions hydration, devcontainers, Nix, mise/asdf, or repository setup scripts. A brokered machine should not pass readiness until `crabbox-ready` succeeds over SSH.
The CLI prefers the configured SSH port and can fall back to port 22 during early bootstrap. Long term, snapshots or provider images can replace slow cloud-init once the bootstrap contract is stable.
Interactive desktop and browser tooling are optional lease profiles, not part
of the minimal bootstrap. The desktop profile installs Xvfb/slim XFCE, x11vnc,
screenshots, and video capture tools. The browser profile installs
Chrome/Chromium plus native addon build helpers that browser-channel QA often
needs during dependency fallback installs. Crabbox owns these machine
capabilities; scenario systems still own browser automation and proof artifacts.
For slow QA lanes, bake these machine capabilities into provider images while
keeping secrets, browser profiles, repository checkouts, and built artifacts out
of the image. See [Interactive desktop and VNC](interactive-desktop-vnc.md) and
[Prebaked runner images](prebaked-images.md).
Tailscale is optional too. `--tailscale` on a managed Linux lease installs the
Tailscale package, joins the configured tailnet, writes non-secret metadata
under `/var/lib/crabbox`, and extends `crabbox-ready` with a bounded 100.x
address check. The bootstrap does not persist the auth key after `tailscale up`.
Brokered leases receive a one-off key from the coordinator; direct-provider
leases read it from `CRABBOX_TAILSCALE_AUTH_KEY`. See [Tailscale](tailscale.md).
Static SSH targets are not bootstrapped by Crabbox. They are assumed to be
operator-managed:
- macOS and Windows WSL2 targets need SSH, `bash`, `git`, `rsync`, and `tar`;
- native Windows targets need OpenSSH, PowerShell, `git`, and `tar`;
- `static.workRoot` must point at a writable directory for that target mode.
For native Windows, install Git before the Crabbox check or restart OpenSSH
Server afterward so new non-interactive SSH sessions inherit Git and `tar` on
PATH.
The CLI prefers the configured SSH port and can fall back through `ssh.fallbackPorts` during early bootstrap or operator-network egress restrictions. Set `ssh.fallbackPorts: []` or `CRABBOX_SSH_FALLBACK_PORTS=none` when the fallback should be disabled. Long term, snapshots or provider images can replace slow cloud-init once the bootstrap contract is stable.
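The port ordering could be computed roughly as below. This is a sketch under assumptions: `candidatePorts` is a hypothetical helper, and treating a non-`none` `CRABBOX_SSH_FALLBACK_PORTS` value as a comma-separated list is a guess about the format rather than documented behavior.

```go
package main

import (
	"strconv"
	"strings"
)

// candidatePorts returns the ports to try in order: the primary port, then
// fallbacks. envValue mimics CRABBOX_SSH_FALLBACK_PORTS: "none" disables
// fallback and an empty string keeps the defaults. The comma-separated list
// form is an assumption.
func candidatePorts(primary int, defaults []int, envValue string) ([]int, error) {
	fallbacks := defaults
	switch v := strings.TrimSpace(envValue); {
	case v == "none":
		fallbacks = nil
	case v != "":
		fallbacks = nil
		for _, part := range strings.Split(v, ",") {
			p, err := strconv.Atoi(strings.TrimSpace(part))
			if err != nil {
				return nil, err
			}
			fallbacks = append(fallbacks, p)
		}
	}
	ports := []int{primary}
	for _, p := range fallbacks {
		if p != primary { // never list the primary port twice
			ports = append(ports, p)
		}
	}
	return ports, nil
}
```

With the documented defaults this yields `2222` followed by `22`, and only `2222` when fallback is disabled.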
Related docs:
- [Providers](providers.md)
- [Prebaked runner images](prebaked-images.md)
- [Tailscale](tailscale.md)
- [SSH keys](ssh-keys.md)
- [run command](../commands/run.md)
- [doctor command](../commands/doctor.md)


@ -11,10 +11,12 @@ Crabbox creates a fresh SSH key per lease by default. This avoids sharing a long
Local key storage is under the Crabbox user config directory, outside the repository:
```text
macOS: ~/Library/Application Support/crabbox/keys/<lease>/
Linux: ~/.config/crabbox/keys/<lease>/
macOS: ~/Library/Application Support/crabbox/testboxes/<lease>/id_ed25519
Linux: ~/.config/crabbox/testboxes/<lease>/id_ed25519
```
A per-lease `known_hosts` file lives beside the key. SSH ControlMaster sockets are also scoped to the key path, so reused provider IPs do not poison the user's global `~/.ssh/known_hosts` and do not cross streams between leases.
The CLI sends only the public key to the coordinator. The Worker imports or reuses that public key in the provider:
- Hetzner SSH key;


@ -11,7 +11,8 @@ It syncs the Git-managed working set, not the whole directory tree:
- tracked files from `git ls-files --cached`;
- nonignored untracked files from `git ls-files --others --exclude-standard`;
- repo-local `sync.exclude` patterns and Crabbox's default cache/build excludes.
- root `.crabboxignore` patterns, repo-local `sync.exclude` patterns, and
Crabbox's default cache/build excludes.
Ignored build output, dependency folders, `.git`, and common local caches stay out of the transfer. This keeps first syncs close to the code that CI would see while still letting agents test uncommitted edits.
@ -28,7 +29,7 @@ Sync flow:
9. run sanity checks for mass tracked deletions;
10. hydrate configured base-ref history for changed-test workflows.
The remote manifest deletion step only removes paths Crabbox previously synced. It does not delete workflow-created state, package caches, `.git`, or other local runner files outside the managed file list.
The remote manifest deletion step only removes paths Crabbox previously synced. It does not delete workflow-created state, package caches, `.git`, or other local runner files outside the managed file list. Native Windows static targets use the same Git manifest but transfer it as a tar archive over OpenSSH instead of rsync.
In remote Git worktrees, Crabbox stores its sync metadata under `.git/crabbox` so repository status stays clean. Crabbox does not delete files under the worktree `.crabbox/` directory; that path remains available for repository-owned files and config.
@ -67,6 +68,12 @@ Use `crabbox sync-plan` to inspect the local manifest before leasing a box. It p
Repo-local config should hold project-specific excludes and env allowlists. Secrets must not be passed as command-line arguments or broad env globs.
Use `.crabboxignore` when you only need repo-local sync exclusions. The file is
read from the repository root. Blank lines and lines starting with `#` are
ignored; remaining lines are appended to `sync.exclude` and use the same matcher
as config excludes. Crabbox intentionally supports only `.crabboxignore`; there
is no short alias.
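The `.crabboxignore` rules described above are simple enough to sketch directly. `parseIgnore` and `mergeExcludes` are hypothetical names, and trimming surrounding whitespace before the blank/comment check is an assumption.

```go
package main

import (
	"bufio"
	"strings"
)

// parseIgnore returns the patterns in a .crabboxignore body, skipping blank
// lines and lines that start with '#'. Whitespace trimming is an assumption.
func parseIgnore(content string) []string {
	var patterns []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		patterns = append(patterns, line)
	}
	return patterns
}

// mergeExcludes appends ignore-file patterns after the configured excludes
// without mutating the caller's slice.
func mergeExcludes(configExcludes []string, ignoreContent string) []string {
	merged := append([]string{}, configExcludes...)
	return append(merged, parseIgnore(ignoreContent)...)
}
```

For example, a config exclude of `dist/` combined with a file containing a comment, a blank line, `node_modules/`, and `*.log` produces three patterns in that order.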
Related docs:
- [CLI](../cli.md)

191
docs/features/tailscale.md Normal file

@ -0,0 +1,191 @@
# Tailscale
Read when:
- adding or debugging tailnet reachability;
- deciding whether a host is provider-owned or only network-reachable;
- changing SSH, VNC, or coordinator bootstrap behavior.
Tailscale is an optional Crabbox reachability layer. It is not a provider.
Providers still own machines: Hetzner, AWS, Azure, static SSH hosts, and Blacksmith
Testbox. Tailscale only changes which host Crabbox dials for SSH-backed work.
V1 support:
- managed Linux leases can join a tailnet with `--tailscale`;
- static hosts can use MagicDNS names or 100.x addresses in `static.host`;
- managed Windows and EC2 Mac Tailscale provisioning is not enabled yet;
- Blacksmith Testbox connectivity remains Blacksmith-owned.
## Commands
Create a managed Linux lease that joins the configured tailnet:
```sh
crabbox warmup --tailscale
crabbox run --tailscale -- pnpm test
crabbox run --tailscale --desktop --browser -- pnpm test:e2e
```
Choose the connection path for SSH, VNC, screenshots, WebVNC, status, inspect,
and reused `run --id` leases:
```sh
crabbox ssh --id blue-lobster --network auto
crabbox ssh --id blue-lobster --network tailscale
crabbox vnc --id blue-lobster --network tailscale --open
crabbox run --id blue-lobster --network public -- pnpm test
```
Network modes:
- `auto`: prefer Tailscale when lease metadata exists and SSH is reachable,
otherwise use the provider/public host;
- `tailscale`: require a tailnet host and fail clearly when this client cannot
reach it;
- `public`: force the provider/public host for debugging.
When `auto` falls back to the public host, Crabbox prints the selected network
in ready/status output instead of silently hiding the path.
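The three network modes amount to a small selection function. The sketch below is illustrative only; `pickHost` and its `reachable` probe are hypothetical, and the real CLI may check reachability differently.

```go
package main

import (
	"errors"
	"fmt"
)

// pickHost chooses the SSH host for a lease and reports which network was
// selected. tailnetHost is empty when the lease has no Tailscale metadata;
// reachable is a hypothetical SSH reachability probe.
func pickHost(mode, publicHost, tailnetHost string, reachable func(host string) bool) (host, network string, err error) {
	switch mode {
	case "public":
		return publicHost, "public", nil
	case "tailscale":
		if tailnetHost == "" {
			return "", "", errors.New("lease has no tailnet host")
		}
		if !reachable(tailnetHost) {
			return "", "", errors.New("tailnet host is not reachable from this client")
		}
		return tailnetHost, "tailscale", nil
	case "auto":
		if tailnetHost != "" && reachable(tailnetHost) {
			return tailnetHost, "tailscale", nil
		}
		return publicHost, "public", nil
	default:
		return "", "", fmt.Errorf("unknown network mode %q", mode)
	}
}
```

Returning the selected network alongside the host is what lets the caller print it rather than hide a silent fallback.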
## Config
```yaml
tailscale:
enabled: true
network: auto
tags:
- tag:crabbox
hostnameTemplate: crabbox-{slug}
authKeyEnv: CRABBOX_TAILSCALE_AUTH_KEY
exitNode: mac-studio.example.ts.net
exitNodeAllowLanAccess: true
```
Environment overrides:
```text
CRABBOX_TAILSCALE=1
CRABBOX_NETWORK=auto|tailscale|public
CRABBOX_TAILSCALE_TAGS=tag:crabbox,tag:ci
CRABBOX_TAILSCALE_HOSTNAME_TEMPLATE=crabbox-{slug}
CRABBOX_TAILSCALE_AUTH_KEY=<direct-provider only>
CRABBOX_TAILSCALE_EXIT_NODE=mac-studio.example.ts.net
CRABBOX_TAILSCALE_EXIT_NODE_ALLOW_LAN_ACCESS=1
```
`tailscale.enabled` and `--tailscale` request tailnet join for newly created
managed Linux leases. `tailscale.network` and `--network` choose target
resolution for SSH-backed commands. Hostname templates support `{id}`, `{slug}`,
and `{provider}`.
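Template expansion for the three documented placeholders can be sketched with `strings.NewReplacer`. `expandHostname` is a hypothetical helper, and any hostname sanitizing Crabbox applies afterwards is not shown.

```go
package main

import "strings"

// expandHostname fills the documented {id}, {slug}, and {provider}
// placeholders. Unknown placeholders are left untouched.
func expandHostname(template, id, slug, provider string) string {
	return strings.NewReplacer(
		"{id}", id,
		"{slug}", slug,
		"{provider}", provider,
	).Replace(template)
}
```

With the default template `crabbox-{slug}` and slug `blue-lobster`, this returns `crabbox-blue-lobster`.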
Direct-provider mode reads the one-off auth key from `tailscale.authKeyEnv`.
Brokered mode does not require a local Tailscale key.
`tailscale.exitNode` asks the lease to route outbound internet through a
tailnet exit node after it joins Tailscale. Use a MagicDNS name or 100.x address
for an approved exit node. `tailscale.exitNodeAllowLanAccess` maps to
Tailscale's LAN-access flag and requires `tailscale.exitNode`. In
`network: auto`, exit-node leases bootstrap over the tailnet host once it
appears because the public/provider SSH path can become asymmetric after the
lease selects the exit node.
## Brokered Mode
The Worker mints a fresh auth key per requested lease using Tailscale OAuth.
Secrets live in Worker configuration:
```text
CRABBOX_TAILSCALE_CLIENT_ID
CRABBOX_TAILSCALE_CLIENT_SECRET
CRABBOX_TAILSCALE_TAILNET optional, defaults to -
CRABBOX_TAILSCALE_TAGS default/allowed comma-separated tags
CRABBOX_TAILSCALE_ENABLED set 0 to disable
```
Flow:
1. The CLI sends `tailscale`, `tailscaleTags`, `tailscaleHostname`, and optional
exit-node settings in `CreateLease`.
2. The Worker validates requested tags against `CRABBOX_TAILSCALE_TAGS`.
3. The Worker uses OAuth to mint a one-off, ephemeral, pre-approved, tagged auth
key.
4. The key is injected only into cloud-init user-data.
5. The runner installs Tailscale, runs `tailscale up`, and writes non-secret
metadata under `/var/lib/crabbox`.
6. After SSH readiness, the CLI reads that metadata and posts it back to the
coordinator.
The auth key is never stored in lease records, provider labels, run logs, or
local config. User-data can still contain the short-lived key at the provider,
so use one-off ephemeral keys and avoid long-lived reusable keys.
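Step 2 of the flow, validating requested tags against the allow-list, can be modeled as below. The Worker itself is not written in Go, so this is only a sketch of the logic; `validateTags` is a hypothetical name, and returning the allowed list as the default when no tags are requested is an interpretation of "default/allowed".

```go
package main

import (
	"fmt"
	"strings"
)

// validateTags checks requested tags against a comma-separated allow-list
// such as the value of CRABBOX_TAILSCALE_TAGS. With no requested tags it
// returns the allowed list as the default (an assumption).
func validateTags(allowedCSV string, requested []string) ([]string, error) {
	allowed := map[string]bool{}
	var defaults []string
	for _, t := range strings.Split(allowedCSV, ",") {
		if t = strings.TrimSpace(t); t != "" {
			allowed[t] = true
			defaults = append(defaults, t)
		}
	}
	if len(requested) == 0 {
		return defaults, nil
	}
	for _, t := range requested {
		if !allowed[t] {
			return nil, fmt.Errorf("tailscale tag %q is not allowed", t)
		}
	}
	return requested, nil
}
```

Rejecting unknown tags before minting the key keeps a client from requesting broader tailnet access than the operator configured.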
## Exit Nodes
Exit-node egress is opt-in per lease:
```sh
crabbox warmup --tailscale --tailscale-exit-node mac-studio.example.ts.net --tailscale-exit-node-allow-lan-access
crabbox run --tailscale --tailscale-exit-node 100.100.100.100 -- curl -4 https://ifconfig.me
```
The exit node must already advertise exit-node capability and be approved in
Tailscale admin. ACLs/grants must allow the lease's tags, such as
`tag:crabbox`, to access `autogroup:internet` through exit nodes.
After the lease is reachable, Crabbox verifies that the selected exit node can
reach the public internet. If that check fails, the run stops before sync or the
remote command and reports the exit-node egress failure. This usually means the
exit node is not approved for internet routing, the tailnet policy does not
grant `autogroup:internet` to the lease tag, or the exit-node machine itself is
not forwarding traffic.
## VNC And SSH
Crabbox continues to use OpenSSH and per-lease SSH keys. Tailscale SSH is not
enabled in v1.
Managed VNC remains loopback-bound:
```text
local localhost:5901 -> SSH -> remote 127.0.0.1:5900
```
Tailscale only changes the SSH endpoint from the public/provider host to the
tailnet host. Crabbox does not bind managed VNC to 100.x addresses, and does
not use Tailscale Serve, Funnel, or noVNC for managed leases.
## Static Hosts
Static hosts are operator-managed. Point `static.host` at a MagicDNS name or a
100.x address:
```yaml
provider: ssh
target: macos
static:
host: mac-studio.example.ts.net
user: steipete
port: "22"
workRoot: /Users/steipete/crabbox
```
For static hosts, `--network tailscale` is a reachability assertion. Crabbox
does not install or join Tailscale on the host.
## Tailscale References
- [Auth keys](https://tailscale.com/kb/1085/auth-keys)
- [Ephemeral nodes](https://tailscale.com/docs/features/ephemeral-nodes)
- [OAuth clients](https://tailscale.com/kb/1215/oauth-clients)
- [ACL tags](https://tailscale.com/kb/1068/acl-tags)
- [Secure auth-key CLI usage](https://tailscale.com/kb/1595/secure-auth-key-cli)
- [tailscale up flags](https://tailscale.com/kb/1241/tailscale-up)
Related docs:
- [Providers](providers.md)
- [Runner bootstrap](runner-bootstrap.md)
- [Interactive desktop and VNC](interactive-desktop-vnc.md)
- [Security](../security.md)
- [Troubleshooting](../troubleshooting.md#tailscale-path-fails)

151
docs/features/telemetry.md Normal file

@ -0,0 +1,151 @@
# Telemetry
Read when:
- changing how Crabbox samples runner load, memory, disk, or uptime;
- adding new metrics to lease records or run history;
- debugging missing portal sparklines or stale telemetry pills;
- understanding where telemetry stops and full observability begins.
Crabbox captures lightweight runner telemetry so a lease detail page or run
record can answer "is this box healthy right now?" and "did this command spike
memory?" without standing up Prometheus or shipping a logging agent. Telemetry
is best-effort, capped, and only exists for managed Linux leases.
## What Gets Captured
For Linux runners, the CLI runs a small remote script through the lease SSH
target whenever it has a reason to talk to the box (heartbeat,
warmup-complete, status check, mid-run sample). The script reads:
- `load1`, `load5`, `load15` from `/proc/loadavg`;
- `memoryTotalBytes`, `memoryUsedBytes`, `memoryPercent` derived from
`MemTotal` and `MemAvailable` in `/proc/meminfo`;
- `diskTotalBytes`, `diskUsedBytes`, `diskPercent` from `df -PB1 /`;
- `uptimeSeconds` from `/proc/uptime`.
Each sample is parsed into a `LeaseTelemetry` record:
```json
{
  "capturedAt": "2026-05-07T07:42:18Z",
  "source": "ssh-linux",
  "load1": 0.42,
  "load5": 0.30,
  "load15": 0.18,
  "memoryUsedBytes": 5368709120,
  "memoryTotalBytes": 16777216000,
  "memoryPercent": 32.0,
  "diskUsedBytes": 21474836480,
  "diskTotalBytes": 107374182400,
  "diskPercent": 20.0,
  "uptimeSeconds": 38400
}
```
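As a rough illustration of the parsing step, the sketch below turns raw probe output into a record with the field names shown above. It is not the parser in `internal/cli/telemetry.go`; the function name, the rounding to one decimal, and computing percentages as used/total are assumptions made for the example.

```python
def _percent(used, total):
    # Assumption: percent is used/total rounded to one decimal place.
    return round(100 * used / total, 1) if total else 0.0


def parse_lease_telemetry(loadavg, meminfo, df_root, uptime, captured_at):
    """Turn raw Linux probe output into a LeaseTelemetry-shaped dict (illustrative only)."""
    # /proc/loadavg: "0.42 0.30 0.18 1/234 5678" -> the first three fields are load averages.
    load1, load5, load15 = (float(v) for v in loadavg.split()[:3])

    # /proc/meminfo: "MemTotal:  16384000 kB"; the two values used here are in KiB.
    mem_kib = {}
    for line in meminfo.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            mem_kib[key.strip()] = int(fields[0])
    mem_total = mem_kib["MemTotal"] * 1024
    mem_used = mem_total - mem_kib["MemAvailable"] * 1024

    # `df -PB1 /`: a header line, then "filesystem total used available capacity mount".
    disk = df_root.strip().splitlines()[-1].split()
    disk_total, disk_used = int(disk[1]), int(disk[2])

    return {
        "capturedAt": captured_at,
        "source": "ssh-linux",
        "load1": load1,
        "load5": load5,
        "load15": load15,
        "memoryUsedBytes": mem_used,
        "memoryTotalBytes": mem_total,
        "memoryPercent": _percent(mem_used, mem_total),
        "diskUsedBytes": disk_used,
        "diskTotalBytes": disk_total,
        "diskPercent": _percent(disk_used, disk_total),
        # /proc/uptime: "38400.12 150000.00" -> first field is seconds since boot.
        "uptimeSeconds": int(float(uptime.split()[0])),
    }
```

Only numeric fields leave the function, which matches the intent that raw `/proc` text never reaches storage.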
Non-Linux targets (managed Windows, EC2 Mac, static SSH macOS/Windows) are
intentionally excluded from telemetry capture today. The collector returns
`nil` for non-Linux targets and the coordinator silently skips storing
samples for them.
## Where It Lives
Telemetry lives in two places on the coordinator:
- **Lease record.** The Fleet Durable Object stores the most recent sanitized
snapshot on the lease (`telemetry`) and a bounded ring of the latest 60
samples (`telemetryHistory`). The ring is keyed by `capturedAt`; older
samples drop off as new ones arrive.
- **Run record.** When a `run_...` is in progress, the CLI POSTs samples to
`/v1/runs/{run-id}/telemetry`. The run record keeps a bounded `start`,
`end`, and a small `samples[]` array so longer commands have a short
load/memory/disk trend instead of just two endpoints.
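The bounded lease ring can be pictured with a few lines of Python. This is a sketch under assumptions, not the Durable Object code in `worker/src/fleet.ts`: the function name is hypothetical, and treating a repeated `capturedAt` as a replacement is one plausible reading of "keyed by `capturedAt`".

```python
RING_SIZE = 60  # documented cap on lease telemetry history


def append_sample(history, sample, limit=RING_SIZE):
    """Return a new history list with `sample` merged in, oldest first, capped at `limit`.

    Assumes capturedAt values are same-format UTC ISO-8601 strings, so plain
    string ordering matches chronological ordering.
    """
    by_time = {s["capturedAt"]: s for s in history}
    by_time[sample["capturedAt"]] = sample  # same timestamp replaces the old entry
    ordered = sorted(by_time.values(), key=lambda s: s["capturedAt"])
    return ordered[-limit:]  # older samples drop off the front
```

Returning a fresh list keeps the helper free of side effects, which makes the drop-off behavior easy to check in isolation.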
Static SSH and delegated providers do not produce telemetry. Their lease
records have no `telemetry` field; their portal rows render a quiet "no
telemetry" pill.
## How Samples Get Sent
The CLI samples in three contexts:
1. **Heartbeat.** While a command runs, the heartbeat goroutine asks for a
fresh sample with a short 5-second timeout, attaches it to the heartbeat
body, and lets the coordinator update the lease record and append to the
ring. Heartbeats that fail to collect just send no telemetry; the command
keeps running.
2. **Warmup and status.** `crabbox warmup`, `crabbox status`, and
`crabbox inspect` collect a one-off sample so the user sees current load
on the same line that prints lease state.
3. **Run telemetry.** Long commands periodically post samples through the run
telemetry endpoint while the command is active; the run record captures
start, end, and a trimmed series.
All collection runs through `collectLeaseTelemetryBestEffort`, which wraps the
collector in a 5-second timeout. A failed sample is never an error - it's a
signal that the box was busy or temporarily unreachable.
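The best-effort pattern can be sketched as follows. This is a Python illustration of the idea only; `collectLeaseTelemetryBestEffort` itself lives in the Go CLI, and the function name and thread-based timeout here are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_best_effort(collector, timeout_seconds=5.0):
    """Run `collector()` with a deadline; return its sample, or None on any failure.

    A timeout or exception is swallowed on purpose: a missing sample must never
    fail the command that triggered the collection.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(collector)
    try:
        return future.result(timeout=timeout_seconds)
    except Exception:
        # Covers both the deadline expiring and the collector raising.
        return None
    finally:
        # Do not block the caller waiting for a slow collector to finish.
        pool.shutdown(wait=False)
```

A caller would attach the result to a heartbeat only when it is not `None`, and send the heartbeat either way.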
## What Shows Up Where
- **`crabbox status --id ...`**: prints `load=0.42 mem=5.0GiB/16.0GiB
disk=20.0GiB/100.0GiB uptime=10h40m telemetry=2s` when a sample is
available. Older samples render as `telemetry=4m12s` so freshness is
obvious at a glance.
- **`crabbox history`** and **`crabbox events`**: include start/end snapshots
plus a memory delta on completed runs.
- **`/portal/leases/{id-or-slug}`**: shows the latest sample as gauges and
renders load, memory, and disk sparklines when more than one sample is
present. Stale samples (>5 minutes) get a `stale telemetry` pill;
high-resource samples get `high load`, `high memory`, or `high disk` pills
on the same row.
- **`/portal/runs/{run-id}`**: renders a compact resource delta line and short
trend lines for runs with mid-run samples.
The coordinator never serves raw `/proc` content - only the parsed numeric
fields above. Tests assert that hostnames, kernel versions, mount points, and
process tables never reach storage.
## Limits And Defaults
- Sampler timeout: 5 seconds per call.
- Lease telemetry ring: 60 samples per lease.
- Run telemetry samples: bounded to a small ring (start, end, plus a small
middle series) and serialized once on `POST /v1/runs/{run-id}/finish`.
- High-resource pill thresholds: load > number of CPUs, memory percent > 90,
disk percent > 90.
- Stale telemetry threshold: 5 minutes since `capturedAt`.
These thresholds are operational hints, not alerts - Crabbox does not page or
auto-action on telemetry. Use observability tooling for that.
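The thresholds above can be expressed as a small pure function. This is a hedged sketch, not the renderer in `worker/src/portal.ts`: the function name is hypothetical, `cpu_count` is passed in because the telemetry record shown earlier does not carry it, and using `load1` for the load comparison is an assumption.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=5)


def telemetry_pills(sample, cpu_count, now):
    """Return the hint pills a sample would earn under the documented thresholds."""
    pills = []
    captured = datetime.strptime(sample["capturedAt"], "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    if now - captured > STALE_AFTER:
        pills.append("stale telemetry")
    if sample["load1"] > cpu_count:  # assumption: load1 is the compared value
        pills.append("high load")
    if sample["memoryPercent"] > 90:
        pills.append("high memory")
    if sample["diskPercent"] > 90:
        pills.append("high disk")
    return pills
```

Because the function only returns labels, it stays a hint generator and never triggers an action.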
## When To Use Full Observability Instead
Telemetry is intentionally narrow. It is an "is the box healthy?" pulse, not a
metrics pipeline. For per-process traces, per-command flame graphs, or
historical correlations across many runs, scrape the runner with a real
agent or ship logs to a real backend. Crabbox does not try to replace that
layer; see [Observability](../observability.md) for what we plumb upstream.
## Configuration
Telemetry has no user-facing toggle. Disabling it would not save meaningful
runtime but would remove the most useful health signal in the portal. There
is no env flag to silence sampling.
If you need to extend the captured fields, add them in:
- the parser in `internal/cli/telemetry.go`;
- the coordinator schema in `worker/src/types.ts`;
- the lease/run portal renderers in `worker/src/portal.ts`;
- the storage in `worker/src/fleet.ts`.
Keep new fields numeric, sanitized, and bounded. Free-form strings, hostnames,
and process names do not belong on the telemetry record.
Related docs:
- [Coordinator](coordinator.md)
- [Orchestrator](../orchestrator.md)
- [History and logs](history-logs.md)
- [Observability](../observability.md)
- [Source map](../source-map.md)


@ -6,7 +6,7 @@ Read when:
- changing how failed tests are summarized;
- debugging why `crabbox results` has no data.
Crabbox can attach JUnit XML summaries to coordinator run history. The agent uses this so a failed run can answer "which tests failed?" without scraping a large log tail.
Crabbox can attach JUnit XML summaries to coordinator run history. The agent uses this so a failed run can answer "which tests failed?" without scraping a large raw log.
Configure per run:

111
docs/features/vnc-linux.md Normal file

@ -0,0 +1,111 @@
# Linux VNC
Read when:
- using `--desktop` on Hetzner or AWS Linux;
- debugging Xvfb, XFCE/Openbox, x11vnc, or screenshots on a Linux lease;
- preparing a static Linux host for Crabbox VNC.
Linux is the simplest managed desktop path. Hetzner and AWS Linux leases use
the same bootstrap shape: install a lightweight desktop, run it on `DISPLAY=:99`,
bind x11vnc to loopback, and let the CLI create an SSH tunnel.
## Managed Linux
```sh
crabbox warmup --desktop --browser
crabbox run --id blue-lobster --desktop --browser -- google-chrome --version
crabbox desktop doctor --id blue-lobster
crabbox webvnc --id blue-lobster --open
crabbox vnc --id blue-lobster --open
crabbox screenshot --id blue-lobster --output linux.png
```
Managed Linux desktop leases include:
- Xvfb on `:99`;
- a lightweight desktop/window-manager session;
- x11vnc bound to `127.0.0.1:5900`;
- screenshot and video capture tools (`scrot` and `ffmpeg`);
- input helpers (`xdotool`) and clipboard paste tools (`xclip`/`xsel`);
- a generated per-lease VNC password at `/var/lib/crabbox/vnc.password`;
- optional Chrome stable or Chromium fallback, first-run suppression, and native
addon build helpers when `--browser` is requested;
- readiness checks that verify desktop services when `desktop=true`.
`crabbox run --desktop` injects `CRABBOX_DESKTOP=1` and `DISPLAY=:99`.
`crabbox run --browser` injects `CRABBOX_BROWSER=1`, `BROWSER`, and
`CHROME_BIN` after probing the target.
## Static Linux
Static Linux is host-managed. Crabbox does not install packages or start a
desktop service on an existing machine. The host must already provide a VNC
service reachable from SSH loopback:
```yaml
provider: ssh
target: linux
static:
  host: linux-box.tailnet-name.ts.net
  user: crabbox
  port: "22"
  workRoot: /home/crabbox/work
```
```sh
crabbox vnc --provider ssh --target linux --static-host linux-box.tailnet-name.ts.net
```
For static Linux, keep x11vnc or another VNC server bound to
`127.0.0.1:5900`. Direct `host:5900` is accepted only when reachable and should
be limited to a trusted LAN or tailnet.
## Troubleshooting
`lease ... was not created with desktop=true`
Warm a new lease with `--desktop`; existing leases do not gain the capability
after creation.
`target=linux does not expose a loopback VNC desktop`
For managed leases, inspect cloud-init and service logs or warm a fresh box.
For static hosts, start Xvfb/desktop services and x11vnc on
`127.0.0.1:5900`.
Black screen
Check that the app was launched into `DISPLAY=:99`. For detached browser work,
use:
```sh
crabbox desktop launch --id blue-lobster --browser --url https://example.com
```
Run `crabbox desktop doctor --id blue-lobster` to separate session problems
from WebVNC/browser-portal problems. Each failed check (`xfwm4`, `xfce4-panel`,
x11vnc, clipboard tools, browser, ffmpeg, screen size, or screenshot capture)
gets a specific repair line.
Input symbols are wrong
Use Crabbox's desktop helpers instead of raw `xdotool type`:
```sh
crabbox desktop paste --id blue-lobster --text "peter+qa@example.com"
crabbox desktop type --id blue-lobster --text "peter+qa@example.com"
```
`desktop type` uses clipboard paste for symbol-heavy text, so `@`, `+`,
password-like values, and URLs do not depend on the target X keyboard layout.
Related docs:
- [Interactive desktop and VNC](interactive-desktop-vnc.md)
- [Hetzner](hetzner.md)
- [AWS](aws.md)
- [vnc command](../commands/vnc.md)
- [webvnc command](../commands/webvnc.md)
- [desktop command](../commands/desktop.md)
- [screenshot command](../commands/screenshot.md)


@ -0,0 +1,90 @@
# macOS VNC
Read when:
- launching managed AWS EC2 Mac desktop leases;
- preparing a static Mac for Crabbox VNC;
- debugging Screen Sharing credentials or EC2 Mac host requirements.
Crabbox supports macOS in two ways:
- managed AWS EC2 Mac leases on an operator-provided Dedicated Host;
- static Macs reached through `provider: ssh`.
## Managed AWS EC2 Mac
```sh
CRABBOX_AWS_MAC_HOST_ID=h-... \
crabbox warmup --provider aws --target macos --desktop --market on-demand
crabbox vnc --id silver-squid --open
crabbox screenshot --id silver-squid --output macos.png
```
EC2 Mac requirements:
- an allocated EC2 Mac Dedicated Host in the selected region;
- `CRABBOX_AWS_MAC_HOST_ID` or `aws.macHostId`;
- On-Demand capacity;
- the default `mac2.metal` instance type unless `--type` is set.
Bootstrap enables Screen Sharing for `ec2-user`, sets a generated per-lease
password, stores it at `/var/db/crabbox/vnc.password`, and keeps access behind
the SSH tunnel. Managed EC2 Mac leases use `/Users/ec2-user/crabbox` as the
default work root because the macOS system volume is read-only. `crabbox vnc`
prints:
```text
macos username: ec2-user
macos password: ...
```
AWS EC2 Mac has a provider-level lifecycle constraint: Mac instances run on
allocated Dedicated Hosts with a 24-hour minimum host allocation period.
Crabbox launches onto a host id you provide; it does not allocate, scrub, or
retire Mac hosts for you.
## Static Mac
Static Mac targets are existing machines:
```yaml
provider: ssh
target: macos
static:
  host: mac-studio.tailnet-name.ts.net
  user: steipete
  port: "22"
  workRoot: /Users/steipete/crabbox
```
```sh
crabbox vnc --provider ssh --target macos --static-host mac-studio.tailnet-name.ts.net --host-managed --open
```
The Mac must already have SSH, `git`, `rsync`, `tar`, and Screen Sharing or a
VNC-compatible service. Credentials are host-managed. `--open` requires
`--host-managed` because the visible login prompt belongs to that Mac, not to a
Crabbox-created cloud lease.
Static Macs work well over Tailscale: put the MagicDNS name or 100.x address in
`static.host` and keep Screen Sharing limited to trusted networks.
## Troubleshooting
Missing host id
Set `CRABBOX_AWS_MAC_HOST_ID` or `aws.macHostId`, use `--market on-demand`, and
verify the Dedicated Host is allocated in the selected AWS region.
VNC prompt asks for host credentials
If `managed: false`, you opened a static Mac. Use the Mac's own Screen Sharing
credentials. Managed AWS EC2 Mac leases print the generated `ec2-user`
password.
Related docs:
- [Interactive desktop and VNC](interactive-desktop-vnc.md)
- [AWS](aws.md)
- [vnc command](../commands/vnc.md)
- [screenshot command](../commands/screenshot.md)


@ -0,0 +1,120 @@
# Windows VNC
Read when:
- using managed AWS Windows desktop leases;
- choosing between native Windows and WSL2;
- preparing a static Windows host for Crabbox VNC.
Crabbox has two Windows execution contracts:
- native Windows: PowerShell over OpenSSH, archive sync, Windows desktop;
- WSL2: POSIX commands through WSL, Linux-style sync, no separate managed VNC
contract beyond the underlying Windows host.
Managed Windows desktop support is AWS-only.
## Managed AWS Windows
```sh
crabbox warmup --provider aws --target windows --desktop
crabbox vnc --id crimson-crab --open
crabbox screenshot --id crimson-crab --output windows.png
```
Bootstrap flow:
- EC2Launch v2 enables the first OpenSSH foothold on port `22`.
- Crabbox installs Git for Windows and TightVNC.
- Crabbox creates a local `crabbox` administrator.
- Windows auto-logon starts a visible console session for that user.
- TightVNC runs in that logged-in user session, with its HKCU password values
copied from the service configuration during startup.
- The generated password is stored at
`C:\ProgramData\crabbox\vnc.password`.
- VNC remains reachable only through the SSH tunnel.
`crabbox vnc` prints both the VNC password and the generated Windows console
login:
```text
windows username: crabbox
windows password: ...
```
That login belongs to the Crabbox-created EC2 instance. It is not your local
Windows account and is not stored in coordinator history.
## WSL2
Managed AWS WSL2 leases are Windows instances with nested virtualization
enabled and an Ubuntu rootfs imported into WSL. Commands and sync use the POSIX
WSL contract:
```sh
crabbox warmup --provider aws --target windows --windows-mode wsl2
crabbox run --id blue-lobster -- pnpm test
```
Use native Windows mode when you need the Windows desktop. Use WSL2 when you
need Linux tooling on Windows-capable AWS instance families.
## Static Windows
Static Windows is host-managed:
```yaml
provider: ssh
target: windows
windows:
  mode: normal
static:
  host: win-dev.local
  user: Peter
  port: "22"
  workRoot: C:\crabbox
```
```sh
crabbox vnc --provider ssh --target windows --static-host win-dev.local --host-managed --open
```
The static host must already have OpenSSH Server, PowerShell, Git, `tar`, a
writable `static.workRoot`, and a VNC-compatible service. `--open` requires
`--host-managed` because the visible password prompt belongs to that durable
host, not to a Crabbox-created lease.
For static WSL2, set `windows.mode: wsl2` and use a WSL path such as
`/home/peter/crabbox` for `static.workRoot`.
## Troubleshooting
Tunnel command uses port `22`
Expected for AWS Windows. EC2Launch enables OpenSSH on port `22`, and Crabbox
records the working SSH port after probing fallbacks.
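Port probing of this kind can be sketched as a loop over candidate ports that keeps the first one accepting a TCP connection. The helper below is hypothetical and only checks TCP reachability; Crabbox's actual probe (and whether it also verifies the SSH handshake) is not shown in this document.

```python
import socket


def first_reachable_port(host, ports, timeout=2.0):
    """Return the first port in `ports` that accepts a TCP connection, or None.

    Illustrative only: callers would pass the primary SSH port first, followed
    by any fallback ports, and record whichever port this returns.
    """
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return port
        except OSError:
            # Refused, unreachable, or timed out: try the next candidate.
            continue
    return None
```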
Screenshot is black from raw SSH
Use `crabbox screenshot`. It runs a scheduled task inside the logged-in console
session; an ad hoc non-interactive SSH PowerShell session cannot reliably
capture the visible desktop.
VNC opens an OS credential prompt
Check `managed:` in `crabbox vnc` output. If it is `false`, you opened a static
host. Use that host's credentials and pass `--host-managed` intentionally.
WebVNC keeps retrying in the browser
Close any older retrying tab and start a fresh `crabbox webvnc` bridge. A stale
tab can keep reconnecting with an old URL fragment. On managed AWS Windows,
Crabbox configures TightVNC in the logged-in user's registry profile; if direct
VNC auth also fails, recreate the lease with a current Crabbox build.
Related docs:
- [Interactive desktop and VNC](interactive-desktop-vnc.md)
- [AWS](aws.md)
- [vnc command](../commands/vnc.md)
- [screenshot command](../commands/screenshot.md)

232
docs/getting-started.md Normal file

@ -0,0 +1,232 @@
# Getting Started
Read when:
- you are new to Crabbox and want a working `run` in 10 minutes;
- you are evaluating Crabbox for a repo and want to see the shape;
- you want a reference for what a typical onboarding looks like.
This is a cookbook, not a reference. It walks through one repo end to end,
from install to `crabbox run -- pnpm test`. For deeper coverage, follow the
links in each step.
## Step 1. Install
```sh
brew install openclaw/tap/crabbox
```
Verify the install:
```sh
crabbox --version
crabbox doctor
```
`crabbox doctor` should print `ok` for `tools` (git, rsync, ssh,
ssh-keygen). It is fine if `auth` and `network` are still missing - we set
those next.
If you do not have Homebrew, GitHub Releases ship signed tarballs for macOS,
Linux, and Windows. Download the matching archive from
<https://github.com/openclaw/crabbox/releases>.
## Step 2. Log In
```sh
crabbox login
```
`login` opens a browser to the GitHub OAuth flow. The broker exchanges the
OAuth code, verifies your GitHub org membership, and writes a signed token
to your user config. From then on, every `crabbox` command authenticates
automatically.
```sh
crabbox whoami
```
`whoami` confirms the resolved owner, org, broker URL, and selected provider.
If you are running Crabbox in a CI environment that cannot open a browser,
use shared-token auth:
```sh
printf '%s' "$TOKEN" | crabbox login \
--url https://crabbox.openclaw.ai \
--provider aws \
--token-stdin
```
See [Auth and admin](features/auth-admin.md) for the full identity model.
## Step 3. Onboard A Repo
Inside the repo:
```sh
crabbox init
```
`init` writes three files:
```text
.crabbox.yaml                     repo defaults (profile, class, sync, env)
.github/workflows/crabbox.yml     Actions hydration stub (optional)
.agents/skills/crabbox/SKILL.md   agent-facing skill instructions
```
Open `.crabbox.yaml` and fill in:
- `profile`: a name for this lane (e.g. `project-check`);
- `class`: `standard`, `fast`, `large`, or `beast`;
- `sync.exclude`: directories that should not be sent to the runner;
- `env.allow`: env vars the remote command should see.
Then run:
```sh
crabbox sync-plan
```
`sync-plan` previews what would be sent: file count, total bytes, the
biggest files. If it shows surprises (a `dist/` folder, a `.cache/` you
forgot, a 2 GiB asset), tighten `sync.exclude` and re-run. The first sync
to a fresh runner is bound by this size.
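To get a feel for what such a preview computes, here is a minimal local approximation. It is not how `crabbox sync-plan` is implemented: the function name is hypothetical, and excludes are matched against directory names only, which is a simplification of whatever pattern rules `sync.exclude` really supports.

```python
import os


def sync_plan(root, exclude=(), top=3):
    """Summarize what a sync of `root` would send: file count, bytes, biggest files.

    Assumption: `exclude` holds plain directory names to skip at any depth.
    """
    excluded = set(exclude)
    files = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            path = os.path.join(dirpath, name)
            files.append((os.path.getsize(path), os.path.relpath(path, root)))
    files.sort(reverse=True)  # largest first
    return {
        "fileCount": len(files),
        "totalBytes": sum(size for size, _ in files),
        "biggest": files[:top],
    }
```

Comparing the summary before and after adding a name to `exclude` shows how much a single forgotten build directory can contribute to the first sync.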
## Step 4. Warm A Box
```sh
crabbox warmup
```
Warmup acquires a lease through the broker, provisions the runner,
bootstraps SSH and tooling, and prints a slug + lease ID:
```text
leased cbx_abcdef123456 slug=blue-lobster provider=aws server=i-0123 type=c7a.48xlarge ip=203.0.113.10 idle_timeout=30m0s expires=2026-05-07T17:30:00Z
```
The lease is now waiting for commands. Idle timeout (default 30m) and TTL
(default 90m) bound how long it lives before the broker reclaims it.
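If a script needs the slug or lease ID from that line, a small parser is enough. The sketch below assumes the `leased <id> key=value ...` shape shown above stays stable, which this document does not guarantee; the function name is hypothetical.

```python
def parse_lease_line(line):
    """Parse a `leased <id> key=value ...` line into a dict (illustrative only)."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "leased":
        raise ValueError(f"not a lease line: {line!r}")
    info = {"id": parts[1]}
    for token in parts[2:]:
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            info[key] = value
    return info
```

For example, `parse_lease_line(output)["slug"]` yields the value to pass to `crabbox run --id`.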
## Step 5. Run A Command
```sh
crabbox run --id blue-lobster -- pnpm test
```
What happens:
1. The CLI verifies SSH readiness on the lease.
2. It seeds remote Git from your origin/base ref, then rsyncs the dirty
working tree.
3. It runs the command over SSH, streaming stdout/stderr.
4. It heartbeats the broker so the lease does not idle out mid-test.
5. It records a `run_...` history entry with sync time, command time, exit
code, and (for Linux) bounded telemetry samples.
You can omit `--id` for a one-shot run:
```sh
crabbox run -- pnpm test
```
That acquires a fresh lease, runs the command, and releases the lease when
the command exits. Use this for ad-hoc tests; use `warmup` + `--id` for
iterative work.
## Step 6. Inspect History
```sh
crabbox history
crabbox events run_abcdef123456
crabbox logs run_abcdef123456
crabbox results run_abcdef123456
```
`history` lists recent runs for the lease or owner. `events` prints ordered
events (lease, sync, command, output chunks, finish). `logs` returns the
retained command output. `results` parses any JUnit reports the run
attached.
`/portal/runs/run_abcdef123456` renders the same data as a browser page if
you prefer a UI.
## Step 7. Stop The Lease
When you are done:
```sh
crabbox stop blue-lobster
```
Stop releases the lease, deletes the provider machine, removes the local
claim, and frees reserved cost. If you forget, the broker idle alarm
releases the lease automatically.
```sh
crabbox cleanup --dry-run
```
`cleanup` is a sweep for direct-provider leftovers. It refuses to run when
a coordinator is configured because brokered cleanup is the alarm's job.
## Common Variations
Keep one lease for a long working session:
```sh
crabbox warmup --idle-timeout 4h --ttl 8h
crabbox run --id blue-lobster -- pnpm test
crabbox run --id blue-lobster -- pnpm bench
crabbox stop blue-lobster
```
Open a desktop session:
```sh
crabbox warmup --desktop
crabbox vnc --id blue-lobster --open
```
Open a code-server tab:
```sh
crabbox warmup --code
crabbox code --id blue-lobster --open
```
Use a Mac Studio you already own:
```yaml
# .crabbox.yaml
provider: ssh
target: macos
static:
  host: mac-studio.local
  user: steipete
  port: "22"
  workRoot: /Users/steipete/crabbox
```
```sh
crabbox run -- xcodebuild test
```
Use AWS instead of the configured default:
```sh
crabbox run --provider aws --class beast -- pnpm test
```
## Where To Go Next
- [How Crabbox Works](how-it-works.md) - the mental model.
- [CLI](cli.md) - the full command surface and exit codes.
- [Commands](commands/README.md) - one page per command.
- [Features](features/README.md) - one page per feature.
- [Configuration](features/configuration.md) - YAML schema and precedence.
- [Providers](features/providers.md) - which provider to pick.
- [Provider authoring](features/provider-authoring.md) - add a new provider.
- [Troubleshooting](troubleshooting.md) - what to do when a step fails.


@ -28,7 +28,7 @@ Cloud machines are vanilla Ubuntu runners that hold no broker secrets. They are
| | provider API
| v
| +------------------------------+
| SSH (port 2222) + | Hetzner Cloud / AWS Spot |
| SSH (primary + fallback) | Hetzner Cloud / AWS EC2 |
+----------- rsync ------------> | Ubuntu runner |
| /work/crabbox/<lease>/<repo> |
+------------------------------+
@ -42,8 +42,8 @@ The CLI talks to the broker over HTTPS, then talks **directly** to the leased ru
|:------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **CLI** | config + flags; per-lease SSH key; SSH readiness; Git seeding + rsync; sync fingerprints + sanity checks; remote command + streaming; heartbeats; release |
| **Broker** | request auth + identity; serialized lease state; provider credentials; machine create/delete; lease expiry; pool/status/inspect; usage; spend caps |
| **Provider** | raw compute: Hetzner Cloud servers or AWS EC2 Spot instances |
| **Runner** | nothing durable: Ubuntu prepared by cloud-init with SSH, Git, rsync, curl, jq, `/work/crabbox`; project runtimes come from repo-owned setup |
| **Provider** | raw compute: Hetzner Cloud servers or AWS EC2 instances |
| **Runner** | nothing durable for brokered boxes: Linux prepared by cloud-init with SSH, Git, rsync, curl, jq, `/work/crabbox`; AWS Windows/WSL2/macOS targets have provider-specific bootstrap; static targets are existing SSH hosts; project runtimes come from repo-owned setup |
## What `crabbox run` does
@ -53,7 +53,7 @@ A single `crabbox run` command walks through five phases:
**2. Lease.** `POST /v1/leases` to the broker with class, provider, TTL, idle timeout, slug, bootstrap options, and the SSH public key. Worker authenticates, then forwards to the Fleet Durable Object. Durable Object enforces active-lease and monthly spend caps, asks the provider for live pricing, reserves the worst-case TTL cost, provisions the machine, and returns host / SSH user / port / work root / expiry / lease ID / slug. CLI re-keys its local key dir if the broker assigned a different final lease ID.
**3. Sync.** Wait for SSH and the `crabbox-ready` marker. Seed remote Git when possible. Compare local and remote sync fingerprints; skip rsync if nothing changed. Otherwise rsync the dirty checkout into `/work/crabbox/<lease>/<repo>`, run sanity checks, hydrate the configured base ref.
**3. Sync.** Wait for SSH and the target readiness probe. Seed remote Git when possible. Compare local and remote sync fingerprints; skip rsync if nothing changed. Otherwise rsync the dirty checkout into `/work/crabbox/<lease>/<repo>` for POSIX targets, or send a manifest tar archive for native Windows, then run sanity checks and hydrate the configured base ref when supported.
**4. Run.** Start heartbeats in the background. Run the requested command over SSH and stream stdout/stderr.
@ -92,9 +92,13 @@ CLI -> runner over SSH/rsync
Direct mode needs local provider credentials (AWS SDK chain or `HCLOUD_TOKEN`). It has no central usage history and no brokered heartbeat. It is handy for diagnosing the broker itself, not for day-to-day work.
Static SSH targets use `provider: ssh` and bypass the broker even when a broker
URL exists in config. macOS and Windows WSL2 use the POSIX rsync contract;
native Windows uses PowerShell plus tar archive sync.
## Auth And Identity
The broker accepts bearer-token automation and can also use Cloudflare Access identity when present. Bearer-token CLI requests send:
The broker accepts signed GitHub login tokens for normal users and shared bearer tokens for trusted automation. Fallback routes can also sit behind Cloudflare Access before the Worker sees the request. Bearer-token CLI requests send:
```text
Authorization: Bearer <token>
@ -102,7 +106,7 @@ X-Crabbox-Owner: <email>
X-Crabbox-Org: <org>
```
Owner is resolved from `CRABBOX_OWNER`, the Git email env, or `git config user.email`. `CRABBOX_ORG` sets the org. Cloudflare Access email wins when both are present.
Owner is resolved from the signed GitHub token for `crabbox login` users. In shared-token mode, owner comes from `CRABBOX_OWNER`, the Git email env, or `git config user.email`; `CRABBOX_ORG` sets the org. Raw Cloudflare Access identity headers are ignored; only a verified Access JWT email can become the bearer-token owner.
## Sync Model


@ -5,22 +5,28 @@
Canonical Worker endpoint:
```text
https://crabbox-coordinator.steipete.workers.dev
https://crabbox.openclaw.ai
```
Cloudflare Access protected route:
Access-protected Worker endpoint:
```text
https://crabbox-access.openclaw.ai
```
Legacy fallback route:
```text
https://crabbox.clawd.bot
```
Intended future product endpoint:
Workers.dev fallback endpoint:
```text
https://crabbox.openclaw.ai
https://crabbox-coordinator.services-91b.workers.dev
```
The Worker route is the stable automation endpoint today. `crabbox.clawd.bot/*` is attached for Access-protected browser/user flows. Move to `crabbox.openclaw.ai` once that zone is available in Cloudflare.
The `crabbox.openclaw.ai/*` Worker route is the stable automation and browser-login endpoint. `crabbox-access.openclaw.ai/*` is the Cloudflare Access-protected route for service-token proof and hardened automation. `crabbox.clawd.bot/*` and the workers.dev URL remain fallback routes.
## Cloudflare
@ -30,17 +36,17 @@ Use Cloudflare for:
- Access auth.
- Worker runtime.
- Durable Object lease state.
- DNS/custom domain once the target zone is available.
- DNS/custom domain routing.
Known setup:
- Access org: `openclaw-crabbox.cloudflareaccess.com`.
- Access org: `crabbox-openclaw.cloudflareaccess.com`.
- Access enabled.
- Current IdPs: one-time PIN and GitHub.
- GitHub IdP name: `GitHub OpenClaw`.
- GitHub IdP restriction: org `openclaw`.
- Fallback Access app: `Crabbox Coordinator` on `crabbox.clawd.bot`.
- Fallback Access policy readback verifies the GitHub org include rule for `openclaw`.
- Service-token Access app: `Crabbox Coordinator Service Token` on `crabbox-access.openclaw.ai`.
- Service-token Access policy: `CLI service token`, `non_identity`, include the local Crabbox CLI service token.
Required env:
@ -52,15 +58,17 @@ CRABBOX_CLOUDFLARE_ZONE_NAME
CRABBOX_DOMAIN
CRABBOX_FALLBACK_DOMAIN
CRABBOX_GITHUB_ALLOWED_ORG
CRABBOX_GITHUB_ALLOWED_ORGS
CRABBOX_GITHUB_ALLOWED_TEAMS
```
GitHub IdP needs a GitHub OAuth app:
Crabbox browser login needs a GitHub OAuth app owned by the `openclaw` org:
```text
GitHub org: openclaw
App name: Crabbox Access
Homepage URL: https://crabbox.openclaw.ai
Callback URL: https://openclaw-crabbox.cloudflareaccess.com/cdn-cgi/access/callback
Callback URL: https://crabbox.openclaw.ai/v1/auth/github/callback
```
Store resulting values outside the repo:
@ -68,36 +76,70 @@ Store resulting values outside the repo:
```text
CRABBOX_GITHUB_OAUTH_CLIENT_ID
CRABBOX_GITHUB_OAUTH_CLIENT_SECRET
CRABBOX_GITHUB_CLIENT_ID
CRABBOX_GITHUB_CLIENT_SECRET
CRABBOX_GITHUB_ALLOWED_ORG
CRABBOX_GITHUB_ALLOWED_TEAMS
CRABBOX_SESSION_SECRET
```
Optional Tailscale brokered reachability uses a Tailscale OAuth client with the
`auth_keys` scope and only the tags Crabbox may assign, usually `tag:crabbox`.
Store OAuth credentials as Worker secrets:
```text
CRABBOX_TAILSCALE_CLIENT_ID
CRABBOX_TAILSCALE_CLIENT_SECRET
```
Optional Worker config:
```text
CRABBOX_TAILSCALE_ENABLED=1
CRABBOX_TAILSCALE_TAILNET=- # or explicit tailnet/org
CRABBOX_TAILSCALE_TAGS=tag:crabbox # allowlist/default tags
```
Operator checklist:
1. Create a Tailscale OAuth client with the `auth_keys` scope.
2. Limit the OAuth client to tags Crabbox may assign, usually `tag:crabbox`.
3. Store the client ID and secret as Worker secrets.
4. Set `CRABBOX_TAILSCALE_TAGS` to the same allowed tag list.
5. Verify with `crabbox warmup --tailscale --network tailscale`.
The Worker mints one-off ephemeral pre-approved auth keys per lease and injects
the key only into cloud-init. Lease records and provider labels store only
non-secret Tailscale metadata such as hostname, FQDN, 100.x address, state, and
tags.
Current local status:
- Core Cloudflare, Hetzner, and GitHub tokens are present in local `~/.profile`.
- The Crabbox Cloudflare token is mirrored to MacBook Pro `~/.profile`.
- `CRABBOX_COORDINATOR` and `CRABBOX_COORDINATOR_TOKEN` are present in local and MacBook Pro `~/.profile`.
- GitHub OAuth client ID and secret are present in local and MacBook Pro `~/.profile`.
- Cloudflare Access GitHub IdP is created.
- Cloudflare Access fallback app is created for `crabbox.clawd.bot`.
- The GitHub OAuth client ID and secret may be stored locally as `CRABBOX_GITHUB_OAUTH_*` and deployed to the Worker as `CRABBOX_GITHUB_CLIENT_*`.
- Cloudflare Access service-token CLI credentials can be stored locally as `CRABBOX_ACCESS_CLIENT_ID` and `CRABBOX_ACCESS_CLIENT_SECRET`; `CRABBOX_ACCESS_TOKEN` can carry an already minted Access JWT for protected fallback routes.
- Crabbox browser-login OAuth secrets are deployed as Worker secrets `CRABBOX_GITHUB_CLIENT_ID`, `CRABBOX_GITHUB_CLIENT_SECRET`, and `CRABBOX_SESSION_SECRET`.
- Worker routes are attached for `crabbox.openclaw.ai/*` and `crabbox-access.openclaw.ai/*`.
- `CRABBOX_COORDINATOR`, `CRABBOX_PROFILE`, `CRABBOX_CONFIG`, `CRABBOX_FLEET_CONFIG`, `CRABBOX_SSH_KEY`, `CRABBOX_NO_COLOR`, and `CRABBOX_LOG` are optional CLI defaults and are not required to build the MVP.
The Cloudflare token `crabbox-deploy` is scoped to `Steipete@gmail.com's Account` and the `clawd.bot` zone. It verifies access to Workers scripts, Access applications, Access identity providers, Access keys, DNS records, and zone Worker routes from both the local machine and MacBook Pro.
The Cloudflare token `crabbox-deploy` is scoped to the OpenClaw Cloudflare account and the Crabbox/OpenClaw routes it manages. It verifies access to Workers scripts, Access applications, Access identity providers, Access keys, DNS records, and zone Worker routes from both the local machine and MacBook Pro.
## DNS Decision
## DNS State
Preferred path:
Current path:
1. Add `openclaw.ai` to Cloudflare.
2. Copy existing DNS records exactly.
3. Add `crabbox.openclaw.ai`.
4. Switch nameservers at registrar.
5. Deploy Worker custom domain.
1. Keep the main `openclaw.ai` website on Vercel.
2. Manage `crabbox.openclaw.ai` in the OpenClaw Cloudflare account.
3. Proxy `crabbox.openclaw.ai/*` and `crabbox-access.openclaw.ai/*` to the `crabbox-coordinator` Worker.
4. Set `CRABBOX_PUBLIC_URL=https://crabbox.openclaw.ai`.
5. Configure the GitHub OAuth callback at `https://crabbox.openclaw.ai/v1/auth/github/callback`.
Temporary path:
Fallback path:
1. Deploy Worker under `crabbox.clawd.bot`.
2. Keep `CRABBOX_DOMAIN=crabbox.openclaw.ai` as intended target.
3. Use fallback domain for early testing.
4. Move to `openclaw.ai` once DNS is ready.
1. Use the workers.dev URL for health checks if DNS is disrupted.
2. Use `crabbox.clawd.bot` only as a legacy fallback.
## Hetzner
@ -118,7 +160,10 @@ location: fsn1
serverType: ccx63
image: ubuntu-24.04
sshUser: crabbox
sshPort: 2222
sshPort: "2222"
# Ordered fallback ports tried after sshPort; use [] to disable fallback.
sshFallbackPorts:
- "22"
workdir: /work/crabbox
```
@ -144,12 +189,12 @@ Current direct-CLI status:
- The `beast` class tries `ccx63`, `ccx53`, `ccx43`, `cpx62`, then `cx53`.
- Dedicated-core types currently fail on the available account quota, so the verified runner used `cpx62`.
- Cloud-init installs only Crabbox plumbing: OpenSSH, curl/CA certificates, Git, rsync, jq, and a readiness probe through a retrying bootstrap script. Project runtimes and services are supplied by Actions hydration or repo-owned setup.
- SSH prefers port 2222 and falls back to port 22 during AWS bootstrap when the base image exposes default SSH before the custom port restart lands.
- SSH prefers the configured primary port, default `2222`, and then tries `ssh.fallbackPorts`, default `["22"]`. Set `ssh.fallbackPorts: []` or `CRABBOX_SSH_FALLBACK_PORTS=none` to disable fallback dialing/opening.
- The verified kept lease was `cbx_f782c469c9ce` on server `128694755`, `cpx62`, `188.245.91.84`.
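The SSH port ordering described in the list above can be sketched as follows. This is a minimal illustration, not Crabbox's implementation: the function name `ssh_port_candidates`, its parameters, the rule that the environment value overrides the configured list, and the de-duplication step are all assumptions made for the example. Only the defaults (`2222`, then `22`) and the `none` keyword come from the text.

```python
def ssh_port_candidates(primary="2222", fallback_env=None, configured_fallback=("22",)):
    """Return the ports to dial in order: primary first, then fallbacks.

    fallback_env stands in for the CRABBOX_SSH_FALLBACK_PORTS value
    (comma-separated ports, or "none"). Treating it as an override of the
    configured list is an assumption of this sketch.
    """
    if fallback_env is None:
        fallbacks = list(configured_fallback)
    elif fallback_env.strip().lower() == "none":
        fallbacks = []  # fallback dialing disabled
    else:
        fallbacks = [p.strip() for p in fallback_env.split(",") if p.strip()]

    ordered = [primary]
    for port in fallbacks:
        if port not in ordered:  # never dial the same port twice
            ordered.append(port)
    return ordered


print(ssh_port_candidates())                     # defaults
print(ssh_port_candidates(fallback_env="none"))  # fallback disabled
```

With the defaults this yields the primary port followed by port 22; passing `none` leaves only the primary port.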
## AWS EC2 Spot
## AWS EC2
Use AWS as the first non-Hetzner burst backend. The Cloudflare coordinator brokers AWS EC2 Spot by default; the CLI direct provider remains available with `--provider aws` when no broker is configured.
Use AWS as the first non-Hetzner burst backend. The Cloudflare coordinator brokers AWS EC2 Spot by default for Linux, can launch managed Windows and WSL2 targets, and can launch EC2 Mac instances on an operator-provided Dedicated Host. The CLI direct provider remains available with `--provider aws` when no broker is configured.
Brokered AWS credentials live as Worker secrets:
@ -157,6 +202,7 @@ Brokered AWS credentials live as Worker secrets:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN optional
CRABBOX_AWS_MAC_HOST_ID optional; required only for brokered target=macos
```
Direct fallback env is whatever the AWS SDK can resolve, such as:
@ -172,14 +218,25 @@ AWS-specific Crabbox env:
```text
CRABBOX_AWS_REGION default eu-west-1
CRABBOX_AWS_AMI optional Ubuntu 24.04 x86_64 AMI override
CRABBOX_AWS_AMI optional AMI override for selected AWS target
CRABBOX_AWS_SECURITY_GROUP_ID optional security group override
CRABBOX_AWS_SUBNET_ID optional subnet override
CRABBOX_AWS_INSTANCE_PROFILE optional IAM instance profile name
CRABBOX_AWS_ROOT_GB default 400
CRABBOX_AWS_SSH_CIDRS optional comma-separated SSH source CIDRs
CRABBOX_AWS_MAC_HOST_ID EC2 Mac Dedicated Host id for target=macos
CRABBOX_SSH_FALLBACK_PORTS optional comma-separated SSH fallback ports, or none
```
The AWS provider imports the local SSH public key as an EC2 key pair when needed, creates or reuses a `crabbox-runners` security group when no security group is supplied, launches one-time Spot instances, tags instances and volumes with Crabbox lease metadata, and terminates non-kept instances after the command.
The AWS provider imports the local SSH public key as an EC2 key pair when needed, creates or reuses a `crabbox-runners` security group when no security group is supplied, launches one-time EC2 instances, tags instances and volumes with Crabbox lease metadata, and terminates non-kept instances after the command.
Grant the Worker AWS principal EC2 launch/list/tag/terminate permissions plus
`servicequotas:GetServiceQuota`. Service Quotas access is best-effort: when it
is available, Crabbox can skip known quota-impossible instance types before
calling `RunInstances`; when it is missing, EC2 launch errors are still
classified after the failed call.
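As a rough sketch of the best-effort quota behaviour described above: candidates are tried in order, a quota check (when available) skips types that cannot fit, and launch errors are still recorded when the check is missing. The names `launch_first_available`, `quota_ok`, and `fake_launch` are hypothetical stand-ins, not Crabbox or AWS SDK functions.

```python
def launch_first_available(candidates, launch, quota_ok=None):
    """Try instance types in order and return (result, notes).

    quota_ok is optional, mirroring best-effort Service Quotas access:
    when it is None, nothing is skipped and failures are only classified
    after the launch call raises.
    """
    notes = {}
    for instance_type in candidates:
        if quota_ok is not None and not quota_ok(instance_type):
            notes[instance_type] = "skipped: quota"
            continue
        try:
            return launch(instance_type), notes
        except RuntimeError as exc:
            notes[instance_type] = f"failed: {exc}"
    raise RuntimeError(f"no candidate could be launched: {notes}")


def fake_launch(instance_type):
    """Hypothetical launcher: only the smallest type has capacity."""
    if instance_type == "c7a.4xlarge":
        return "i-hypothetical"
    raise RuntimeError("InsufficientInstanceCapacity")


result, notes = launch_first_available(
    ["c7a.8xlarge", "c7i.8xlarge", "c7a.4xlarge"],
    launch=fake_launch,
    quota_ok=lambda t: t != "c7a.8xlarge",  # pretend quota rules this one out
)
print(result, notes)
```

The design point is that the quota check only saves a failed call; the loop stays correct without it.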
SSH ingress for AWS security groups is source-scoped. If `CRABBOX_AWS_SSH_CIDRS` is set, Crabbox adds those CIDRs. Otherwise, the CLI sends its detected outbound IPv4 `/32` to the broker; when that is unavailable, the Worker falls back to `CF-Connecting-IP` as `/32` or `/128`. Direct and brokered AWS open the primary SSH port plus configured fallback ports. Crabbox also revokes the old managed `0.0.0.0/0` SSH ingress rule when the broker touches the managed security group. Supplying `CRABBOX_AWS_SECURITY_GROUP_ID` makes network policy your responsibility.
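The source-CIDR precedence in the paragraph above might look like this. It is a sketch under stated assumptions: the function name `ssh_source_cidrs` and its parameters are hypothetical, and returning an empty list when no source is known is a choice made for the example, not documented behaviour.

```python
import ipaddress


def ssh_source_cidrs(configured=None, cli_ipv4=None, cf_connecting_ip=None):
    """Pick SSH ingress CIDRs in the documented order of precedence.

    configured       -- stand-in for the CRABBOX_AWS_SSH_CIDRS value
    cli_ipv4         -- outbound IPv4 detected by the CLI, if any
    cf_connecting_ip -- stand-in for the CF-Connecting-IP header value
    """
    if configured:
        return [
            str(ipaddress.ip_network(c.strip(), strict=False))
            for c in configured.split(",")
            if c.strip()
        ]
    if cli_ipv4:
        return [f"{ipaddress.IPv4Address(cli_ipv4)}/32"]
    if cf_connecting_ip:
        addr = ipaddress.ip_address(cf_connecting_ip)
        prefix = 32 if addr.version == 4 else 128  # single-host prefix
        return [f"{addr}/{prefix}"]
    return []  # assumption: no known source means no rule is added


print(ssh_source_cidrs(cf_connecting_ip="2001:db8::1"))
```

The addresses used here are documentation ranges (RFC 5737 and RFC 3849), not real Crabbox hosts.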
## Machine Classes
@ -211,20 +268,27 @@ classes:
Current AWS defaults:
```yaml
classes:
standard:
provider: aws
serverTypes: [c7a.8xlarge, c7a.4xlarge]
fast:
provider: aws
serverTypes: [c7a.16xlarge, c7a.12xlarge, c7a.8xlarge]
large:
provider: aws
serverTypes: [c7a.24xlarge, c7a.16xlarge, c7a.12xlarge]
beast:
provider: aws
serverTypes: [c7a.48xlarge, c7a.32xlarge, c7a.24xlarge, c7a.16xlarge]
```text
AWS Linux
standard c7a.8xlarge, c7i.8xlarge, m7a.8xlarge, m7i.8xlarge, c7a.4xlarge
fast c7a.16xlarge, c7i.16xlarge, m7a.16xlarge, m7i.16xlarge, c7a.12xlarge, c7a.8xlarge
large c7a.24xlarge, c7i.24xlarge, m7a.24xlarge, m7i.24xlarge, r7a.24xlarge, c7a.16xlarge, c7a.12xlarge
beast c7a.48xlarge, c7i.48xlarge, m7a.48xlarge, m7i.48xlarge, r7a.48xlarge, c7a.32xlarge, c7i.32xlarge, m7a.32xlarge, c7a.24xlarge, c7a.16xlarge
AWS Windows
standard m7i.large, m7a.large, t3.large
fast m7i.xlarge, m7a.xlarge, t3.xlarge
large m7i.2xlarge, m7a.2xlarge, t3.2xlarge
beast m7i.4xlarge, m7a.4xlarge, m7i.2xlarge
AWS Windows WSL2
standard m8i.large, m8i-flex.large, c8i.large, r8i.large
fast m8i.xlarge, m8i-flex.xlarge, c8i.xlarge, r8i.xlarge
large m8i.2xlarge, m8i-flex.2xlarge, c8i.2xlarge, r8i.2xlarge
beast m8i.4xlarge, m8i-flex.4xlarge, c8i.4xlarge, r8i.4xlarge, m8i.2xlarge
AWS macOS
all mac2.metal unless `--type` is set
```
Profiles choose a default class, and commands can override with `--class`.
@ -250,31 +314,67 @@ Deployment should:
3. Set Worker secrets.
4. Deploy Worker.
5. Verify `/v1/health` on `workers.dev`.
6. Configure route/custom domain on `crabbox.clawd.bot`.
7. Verify `/v1/health` on the fallback domain.
6. Configure route/custom domain on `crabbox.openclaw.ai`.
7. Verify `/v1/health` on the canonical and fallback domains.
Use `npx wrangler` from the Worker package unless `wrangler` is installed globally. Do not assume `hcloud` is installed; the implementation can use the Hetzner API directly from Go or from the Worker.
Current deployed coordinator:
```text
https://crabbox-coordinator.steipete.workers.dev
crabbox.clawd.bot/* -> crabbox-coordinator, protected by Cloudflare Access
https://crabbox.openclaw.ai
https://crabbox-access.openclaw.ai
https://crabbox-coordinator.services-91b.workers.dev
crabbox.clawd.bot/* -> crabbox-coordinator fallback
```
Current Worker secrets:
Current Worker secrets and settings:
```text
HETZNER_TOKEN
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN optional
CRABBOX_AWS_MAC_HOST_ID optional; required only for brokered target=macos
CRABBOX_SHARED_TOKEN
CRABBOX_ADMIN_TOKEN optional; required for admin routes and image promotion
CRABBOX_GITHUB_CLIENT_ID
CRABBOX_GITHUB_CLIENT_SECRET
CRABBOX_GITHUB_ALLOWED_ORG
CRABBOX_GITHUB_ALLOWED_ORGS optional
CRABBOX_GITHUB_ALLOWED_TEAMS optional
CRABBOX_DEFAULT_ORG
CRABBOX_SESSION_SECRET
CRABBOX_ACCESS_TEAM_DOMAIN
CRABBOX_ACCESS_AUD
CRABBOX_TAILSCALE_ENABLED optional
CRABBOX_TAILSCALE_CLIENT_ID optional; required for brokered --tailscale
CRABBOX_TAILSCALE_CLIENT_SECRET optional; required for brokered --tailscale
CRABBOX_TAILSCALE_TAILNET optional
CRABBOX_TAILSCALE_TAGS optional
CRABBOX_ARTIFACTS_BACKEND optional; currently r2
CRABBOX_ARTIFACTS_BUCKET optional; currently openclaw-crabbox-artifacts
CRABBOX_ARTIFACTS_PREFIX optional; currently crabbox-artifacts
CRABBOX_ARTIFACTS_BASE_URL optional; currently https://artifacts.openclaw.ai
CRABBOX_ARTIFACTS_REGION optional; currently auto
CRABBOX_ARTIFACTS_ENDPOINT_URL optional; currently the R2 S3-compatible endpoint
CRABBOX_ARTIFACTS_ACCESS_KEY_ID optional; Worker secret when artifacts backend is enabled
CRABBOX_ARTIFACTS_SECRET_ACCESS_KEY optional; Worker secret when artifacts backend is enabled
CRABBOX_ARTIFACTS_SESSION_TOKEN optional; Worker secret for temporary credentials
CRABBOX_ARTIFACTS_UPLOAD_EXPIRES_SECONDS optional
CRABBOX_ARTIFACTS_URL_EXPIRES_SECONDS optional
```
Artifact credentials on the coordinator are storage-only S3-compatible keys.
They exist so the Worker can sign one upload URL per artifact and return the
final asset URL. They are not Cloudflare deploy tokens, not Crabbox bearer/admin
tokens, and not VM provider credentials. Keep direct local S3/R2 credentials as
operator fallback only; normal artifact publishing should go through the
coordinator.
## Verified OpenClaw Run
Warm-run command from `/Users/steipete/Projects/openclaw` through the Cloudflare coordinator:
Historical warm-run command from an OpenClaw checkout through the Cloudflare coordinator:
```sh
CI=1 /usr/bin/time -p /Users/steipete/Projects/crabbox/bin/crabbox run --id cbx_f60f47cbc879 -- pnpm test:changed:max
@ -287,6 +387,14 @@ Result:
- Runner class: requested `beast`, actual fallback `cpx62`.
- Sync path: rsync overlay plus remote Git hydrate for shallow checkout merge-base support.
Current live smoke command:
```sh
CRABBOX_LIVE=1 CRABBOX_LIVE_REPO=/Users/steipete/Projects/clawdbot6 /Users/steipete/Projects/crabbox/scripts/live-smoke.sh
```
The smoke covers brokered AWS, direct Hetzner, Blacksmith Testbox delegation, slug reuse, status/inspect/cache/history/logs, stop, and final active-lease cleanup checks.
## Local, MacBook Pro, And Mac Studio
The same required env should exist on the local machine, MacBook Pro, and Mac Studio. Do not commit these values.


@ -127,7 +127,7 @@ Build in this order:
9. Access/auth
- Primary org: GitHub `openclaw`.
- Cloudflare Access org: `openclaw-crabbox.cloudflareaccess.com`.
- Cloudflare Access org: `crabbox-openclaw.cloudflareaccess.com`.
- Cloudflare OTP remains available for early fallback.
- GitHub OAuth app exists under the `openclaw` org as `Crabbox Access`.
- GitHub IdP exists in Cloudflare Access as `GitHub OpenClaw`.
@ -172,21 +172,23 @@ And proves:
## Known Current Infra Facts
- Direct CLI execution is implemented and verified. It can create/reuse a Hetzner server, bootstrap it, sync a local checkout with rsync, hydrate shallow Git history enough for changed-test detection, run commands over SSH, stream output, and release/delete leases.
- The Cloudflare coordinator and Durable Object lease store are implemented and deployed. The CLI uses them when `CRABBOX_COORDINATOR` is set, and falls back to direct Hetzner otherwise.
- Intended primary domain: `crabbox.openclaw.ai`.
- The Cloudflare coordinator and Durable Object lease store are implemented and deployed. The CLI uses them when a broker URL is configured, and direct provider mode remains a debug fallback.
- Primary domain: `crabbox.openclaw.ai`.
- Access-protected service-token domain: `crabbox-access.openclaw.ai`.
- Current Cloudflare-manageable fallback domain: `crabbox.clawd.bot`.
- `openclaw.ai` is currently not visible as a Cloudflare zone in the available account; DNS is on Namecheap nameservers.
- Workers.dev fallback: `https://crabbox-coordinator.services-91b.workers.dev`.
- `crabbox.openclaw.ai/*` is attached as a Worker route in the OpenClaw Cloudflare account. The main `openclaw.ai` website can stay on Vercel; only the Crabbox subdomain needs to route to Cloudflare/Workers.
- Cloudflare account ID and Crabbox Cloudflare token are available in local and MacBook Pro `~/.profile`.
- The current Crabbox Cloudflare token is `crabbox-deploy`, scoped to `Steipete@gmail.com's Account` and the `clawd.bot` zone.
- The current Crabbox Cloudflare token is `crabbox-deploy`, scoped to the OpenClaw Cloudflare account and the routes/zones Crabbox manages.
- The current Crabbox Cloudflare token verifies Workers scripts, Access apps, Access IdPs, Access keys, DNS records, and zone Worker routes.
- Cloudflare Access is enabled.
- Current Access IdPs are OTP and GitHub.
- GitHub OAuth app `Crabbox Access` exists under the `openclaw` org.
- GitHub OAuth client ID and secret are present in local and MacBook Pro `~/.profile`.
- GitHub OAuth app `Crabbox Access` exists under the `openclaw` org for Cloudflare Access.
- Crabbox browser login uses a GitHub OAuth callback at `/v1/auth/github/callback` and stores OAuth client values as Worker secrets.
- Cloudflare Access GitHub IdP `GitHub OpenClaw` exists.
- Cloudflare Access app `Crabbox Coordinator` exists for `crabbox.clawd.bot`.
- Worker `crabbox-coordinator` is deployed at `https://crabbox-coordinator.steipete.workers.dev` and routed from `crabbox.clawd.bot/*`.
- Coordinator bearer auth uses `CRABBOX_COORDINATOR_TOKEN` locally and `CRABBOX_SHARED_TOKEN` in the Worker.
- Worker `crabbox-coordinator` is deployed at `https://crabbox-coordinator.services-91b.workers.dev`, routed from `crabbox.openclaw.ai/*` and `crabbox-access.openclaw.ai/*`, and optionally reachable through fallback routes.
- Coordinator auth supports GitHub browser-login user tokens plus shared-token operator automation. Shared-token auth uses `CRABBOX_COORDINATOR_TOKEN` locally and `CRABBOX_SHARED_TOKEN` in the Worker.
- Hetzner token is available in local and Mac Studio `~/.profile`.
- The Hetzner account currently hits a dedicated-core quota/resource limit for `ccx63`, `ccx53`, and `ccx43`. The `beast` class falls back to `cpx62` until quota is raised.
- Public SSH on port 22 was not usable from the tested network path; cloud-init opens SSH on port 2222 and the CLI uses that by default.
@ -197,8 +199,7 @@ And proves:
## Next Implementation Milestones
1. Raise Hetzner dedicated-core quota so `beast` can use `ccx63` instead of falling back to `cpx62`.
2. Replace shared-token login with Cloudflare Access/GitHub OAuth user tokens.
3. Add Cloudflare Access service-token support for non-browser CLI use on `crabbox.clawd.bot`.
4. Add one-shot `run --profile` cleanup semantics coverage in integration tests.
5. Add coordinator drain controls beyond release/delete.
6. Re-run OpenClaw `pnpm test:changed:max` on `ccx63` and compare against the current Crabbox baseline.
2. Add one-shot `run --profile` cleanup semantics coverage in integration tests.
3. Add coordinator drain controls beyond release/delete.
4. Re-run OpenClaw `pnpm test:changed:max` on `ccx63` and compare against the current Crabbox baseline after quota is raised.
5. Add generated CLI docs or a docs drift check so command pages cannot silently diverge from actual flags.


@ -7,7 +7,7 @@ Read when:
- finding a remote machine for SSH inspection;
- correlating Actions hydration with the remote workspace.
Crabbox exposes operational visibility through CLI commands, coordinator usage summaries, retained run history/log tails, provider labels, GitHub Actions run links, and Worker logs. The reliable path is to keep the lease ID and run ID together.
Crabbox exposes operational visibility through CLI commands, coordinator usage summaries, retained run history/logs, provider labels, GitHub Actions run links, and Worker logs. The reliable path is to keep the lease ID and run ID together.
## Lease State
@ -47,7 +47,9 @@ Reports include lease count, active lease count, elapsed runtime, estimated elap
## Run History And Logs
Coordinator-backed `crabbox run` creates a run record before the remote command starts and finishes it with exit code, timing, and the latest retained output tail.
Coordinator-backed `crabbox run` creates a durable run record before leasing
starts, appends lifecycle events while the CLI progresses, and finishes the run
with exit code, timing, and retained command output.
Use:
@ -55,11 +57,17 @@ Use:
bin/crabbox history
bin/crabbox history --lease cbx_...
bin/crabbox history --owner steipete@gmail.com --json
bin/crabbox events run_...
bin/crabbox attach run_...
bin/crabbox logs run_...
bin/crabbox results run_...
```
History is for command debugging, not unlimited log archival. Logs are bounded tails of remote stdout/stderr. Test results are stored as structured summaries when `--junit` or `results.junit` is configured.
History is for command debugging, not unlimited log archival. Events are an
ordered stream of phase changes and output chunks for reconnecting and
inspection, and `attach` can follow those events while the original CLI is still
alive. Logs are bounded, retained captures of remote stdout/stderr. Test results
are stored as structured summaries when `--junit` or `results.junit` is
configured.
## Remote Debugging


@ -23,15 +23,42 @@ Run these before a release or after changing secrets:
go test ./...
npm run check --prefix worker
npm test --prefix worker
node scripts/build-docs-site.mjs
npm run docs:check
bin/crabbox doctor
bin/crabbox whoami
bin/crabbox status --json
bin/crabbox list --json
bin/crabbox usage --scope all --json
bin/crabbox history --limit 5
```
`crabbox doctor` checks local prerequisites and coordinator reachability. `crabbox whoami` verifies identity. `crabbox status` confirms the broker can answer lease state. `crabbox usage` proves the cost accounting path is reachable. `crabbox history` proves run history is reachable.
`crabbox doctor` checks local prerequisites and coordinator reachability. `crabbox whoami` verifies identity. `crabbox list` confirms the broker can answer lease state. `crabbox usage` proves the cost accounting path is reachable. `crabbox history` proves run history is reachable.
When broker/provider credentials are available and infra changed, run the live smoke:
```sh
CRABBOX_LIVE=1 CRABBOX_LIVE_REPO=/path/to/openclaw scripts/live-smoke.sh
```
To narrow the live matrix while debugging, set `CRABBOX_LIVE_PROVIDERS`:
```sh
CRABBOX_LIVE=1 CRABBOX_LIVE_PROVIDERS=aws CRABBOX_LIVE_REPO=/path/to/openclaw scripts/live-smoke.sh
CRABBOX_LIVE=1 CRABBOX_LIVE_PROVIDERS=hetzner CRABBOX_LIVE_REPO=/path/to/openclaw scripts/live-smoke.sh
CRABBOX_LIVE=1 CRABBOX_LIVE_PROVIDERS=blacksmith-testbox CRABBOX_LIVE_REPO=/path/to/openclaw scripts/live-smoke.sh
```
For direct-provider smoke, disable the coordinator with a scratch config and run the same commands manually:
```sh
tmp="$(mktemp)"
printf 'provider: hetzner\n' > "$tmp"
CRABBOX_CONFIG="$tmp" CRABBOX_COORDINATOR= bin/crabbox warmup --provider hetzner --class standard --ttl 15m --idle-timeout 4m
CRABBOX_CONFIG="$tmp" CRABBOX_COORDINATOR= bin/crabbox run --provider hetzner --id <slug> --no-sync -- echo direct-hetzner-ok
CRABBOX_CONFIG="$tmp" CRABBOX_COORDINATOR= bin/crabbox stop --provider hetzner <slug>
rm -f "$tmp"
```
Use `--provider aws` with AWS SDK credentials for the direct AWS equivalent.
## Deployment
@ -47,6 +74,20 @@ npm run build --prefix worker
npx wrangler deploy --config worker/wrangler.jsonc
```
The repeatable deploy proof is:
```sh
scripts/deploy-worker-smoke.sh
```
It runs Worker format, lint, typecheck, tests, dry-run build, deploy, and public
health checks for `crabbox.openclaw.ai` plus the workers.dev fallback. To include
a short AWS lease smoke after deploy:
```sh
CRABBOX_DEPLOY_SMOKE_AWS=1 CRABBOX_LIVE_REPO=/path/to/openclaw scripts/deploy-worker-smoke.sh
```
Required Worker secrets:
```text
@ -56,6 +97,59 @@ AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
```
Conditional Worker secrets and settings:
```text
AWS_SESSION_TOKEN optional
CRABBOX_AWS_MAC_HOST_ID required only for brokered target=macos
CRABBOX_ADMIN_TOKEN required for admin routes and image promotion
CRABBOX_GITHUB_CLIENT_ID required for browser login
CRABBOX_GITHUB_CLIENT_SECRET required for browser login
CRABBOX_SESSION_SECRET required for browser login
CRABBOX_GITHUB_ALLOWED_ORG or CRABBOX_GITHUB_ALLOWED_ORGS
CRABBOX_GITHUB_ALLOWED_TEAMS optional
CRABBOX_ACCESS_TEAM_DOMAIN required for Access JWT verification
CRABBOX_ACCESS_AUD required for Access JWT verification
CRABBOX_TAILSCALE_CLIENT_ID required for brokered --tailscale
CRABBOX_TAILSCALE_CLIENT_SECRET required for brokered --tailscale
CRABBOX_TAILSCALE_TAILNET optional
CRABBOX_TAILSCALE_TAGS optional
CRABBOX_TAILSCALE_ENABLED optional; set 0 to disable brokered Tailscale
CRABBOX_ARTIFACTS_BACKEND optional; enables brokered artifact publishing
CRABBOX_ARTIFACTS_BUCKET required when artifact backend is enabled
CRABBOX_ARTIFACTS_PREFIX optional
CRABBOX_ARTIFACTS_BASE_URL optional; public final artifact URL prefix
CRABBOX_ARTIFACTS_REGION optional
CRABBOX_ARTIFACTS_ENDPOINT_URL optional; required for R2/custom S3 endpoints
CRABBOX_ARTIFACTS_ACCESS_KEY_ID required when artifact backend is enabled
CRABBOX_ARTIFACTS_SECRET_ACCESS_KEY required when artifact backend is enabled
CRABBOX_ARTIFACTS_SESSION_TOKEN optional
CRABBOX_ARTIFACTS_UPLOAD_EXPIRES_SECONDS optional
CRABBOX_ARTIFACTS_URL_EXPIRES_SECONDS optional
```
Artifact backend vars are ordinary Worker vars except
`CRABBOX_ARTIFACTS_ACCESS_KEY_ID`, `CRABBOX_ARTIFACTS_SECRET_ACCESS_KEY`, and
optional `CRABBOX_ARTIFACTS_SESSION_TOKEN`, which must be Worker secrets. These
object-store keys let the coordinator sign short-lived artifact upload/read
URLs; they should be scoped to the artifact bucket or prefix and should not have
Cloudflare account, Worker deployment, lease-provider, or VM permissions.
Our current coordinator artifact config is R2-compatible:
```text
CRABBOX_ARTIFACTS_BACKEND=r2
CRABBOX_ARTIFACTS_BUCKET=openclaw-crabbox-artifacts
CRABBOX_ARTIFACTS_PREFIX=crabbox-artifacts
CRABBOX_ARTIFACTS_BASE_URL=https://artifacts.openclaw.ai
CRABBOX_ARTIFACTS_REGION=auto
CRABBOX_ARTIFACTS_ENDPOINT_URL=<account>.r2.cloudflarestorage.com
```
The corresponding R2 access key id and secret access key are deployed as Worker
secrets, not local CLI defaults. Normal users should run
`crabbox artifacts publish` without direct S3/R2 credentials.
Cost-control secrets and settings:
```text
@ -75,10 +169,32 @@ CRABBOX_DEFAULT_ORG
The canonical Worker URL is:
```text
https://crabbox-coordinator.steipete.workers.dev
https://crabbox.openclaw.ai
```
The `crabbox.clawd.bot/*` route is attached and protected by Cloudflare Access. Bearer-token CLI automation talks to the Worker with `CRABBOX_SHARED_TOKEN`/`CRABBOX_COORDINATOR_TOKEN`.
The Access-protected Worker URL is:
```text
https://crabbox-access.openclaw.ai
```
The `crabbox.openclaw.ai/*` route is attached to the coordinator Worker for normal CLI and browser-login use. `crabbox-access.openclaw.ai/*` is attached to the same Worker behind Cloudflare Access for service-token proof and hardened automation. Bearer-token CLI automation talks to the Worker with `CRABBOX_SHARED_TOKEN`/`CRABBOX_COORDINATOR_TOKEN`; GitHub browser login stores a user-scoped signed token. Access-protected routes also require `CRABBOX_ACCESS_CLIENT_ID` plus `CRABBOX_ACCESS_CLIENT_SECRET`, or `CRABBOX_ACCESS_TOKEN` for an already minted Access JWT.
Use the protected route when testing the Cloudflare Access layer:
```sh
CRABBOX_COORDINATOR=https://crabbox-access.openclaw.ai bin/crabbox doctor
CRABBOX_COORDINATOR=https://crabbox-access.openclaw.ai bin/crabbox whoami
CRABBOX_LIVE=1 CRABBOX_COORDINATOR=https://crabbox-access.openclaw.ai CRABBOX_BIN=bin/crabbox scripts/live-auth-smoke.sh
CRABBOX_LIVE=1 CRABBOX_LIVE_PROVIDERS=aws CRABBOX_COORDINATOR=https://crabbox-access.openclaw.ai CRABBOX_BIN=bin/crabbox scripts/live-smoke.sh
```
`doctor` should report `access=service-token`. `scripts/live-auth-smoke.sh`
proves the auth boundary without leasing a machine: requests without Access
headers are denied at the edge, shared-token user auth works, raw Access
identity spoofing is ignored, shared-token admin calls fail, and admin-token
admin calls pass. A raw request without Access headers to
`https://crabbox-access.openclaw.ai/v1/health` should return a Cloudflare Access
`403`.
Use `crabbox config show` to confirm which URL and provider the CLI will use:
@ -125,11 +241,27 @@ Cost is an estimate for compute leases, not an invoice. See [Cost And Usage](fea
## Release Checklist
Before handing off:
Before tagging a release:
- `go test ./...`
- Worker format, lint, typecheck, tests, and build.
- `node scripts/build-docs-site.mjs`
- docs link check, when a link checker is available.
- Reorder `CHANGELOG.md` with the user-facing changes first, date the release
section, and keep contributor thanks/co-author notes intact.
- Update package metadata that carries the project version, including
`package.json`, `worker/package.json`, and `worker/package-lock.json`.
- `go vet ./...`
- `go test -race ./...`
- `go build -trimpath -o bin/crabbox ./cmd/crabbox`
- `scripts/check-go-coverage.sh 85.0`
- Worker format, lint, typecheck, tests, and build:
`npm run format:check --prefix worker && npm run lint --prefix worker && npm run check --prefix worker && npm test --prefix worker && npm run build --prefix worker`
- `npm run docs:check`
- `git diff --check`
- live `crabbox doctor` if broker credentials are available.
- Live smoke at least one coordinator-backed `crabbox run`, then verify
`crabbox attach`, `crabbox events`, `crabbox logs`, and lease cleanup.
- Push, pull, and wait for CI green on the release commit.
- Tag and push `vX.Y.Z`, then wait for the release workflow. The workflow
publishes GitHub release assets and directly pushes the generated
`Formula/crabbox.rb` update to `openclaw/homebrew-tap` with
`HOMEBREW_TAP_GITHUB_TOKEN`; missing tap access is a release failure.
- Verify the GitHub release assets and Homebrew formula update.
- `brew update`, install or upgrade `openclaw/tap/crabbox`, run
`crabbox --version`, and run a short live smoke from the installed binary.


@ -43,8 +43,19 @@ The Worker stores coordinator leases as `active`, `released`, `expired`, or `fai
`crabbox warmup --idle-timeout 30m` and `crabbox run --idle-timeout 30m` set inactivity expiry. `--ttl` is a separate maximum wall-clock lifetime. The CLI sends coordinator heartbeats while a lease is in use; each heartbeat updates `lastTouchedAt` and recomputes `expiresAt = min(createdAt + ttl, lastTouchedAt + idleTimeout)`.
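A worked instance of the `expiresAt` formula above, as a minimal sketch. The function name `compute_expires_at` and the sample timestamps are illustrative assumptions; only the `min(...)` rule itself comes from the text.

```python
from datetime import datetime, timedelta, timezone


def compute_expires_at(created_at, last_touched_at, ttl, idle_timeout):
    """Return the earlier of the wall-clock TTL cap and the idle deadline."""
    return min(created_at + ttl, last_touched_at + idle_timeout)


created = datetime(2026, 5, 8, 9, 0, tzinfo=timezone.utc)
ttl = timedelta(minutes=90)    # hard cap: 10:30
idle = timedelta(minutes=30)

# Heartbeat at 09:20: the idle deadline (09:50) is earlier than the TTL cap.
print(compute_expires_at(created, created + timedelta(minutes=20), ttl, idle))

# Heartbeat at 10:15: the idle deadline would be 10:45, so the TTL cap wins.
print(compute_expires_at(created, created + timedelta(minutes=75), ttl, idle))
```

Each heartbeat can only extend the lease up to `createdAt + ttl`; activity never pushes expiry past that cap.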
For Linux leases, heartbeats also attach best-effort telemetry when SSH is reachable. The Durable Object keeps the latest sanitized load, memory, disk, uptime, source, and capture timestamp on the lease record, plus a bounded `telemetryHistory` ring of the latest 60 samples for compact portal trends. Active runs can append their own bounded telemetry samples through the run telemetry endpoint, so longer commands show short load, memory, and disk trends on the run detail page instead of only start/end deltas.
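The bounded telemetry history can be pictured as a fixed-size ring. The sketch below assumes a hypothetical `LeaseTelemetry` class and made-up sample fields (`load1`, `capturedAt`); the real Durable Object record shape is not shown in this document. Only the limit of 60 samples comes from the text.

```python
from collections import deque

TELEMETRY_HISTORY_LIMIT = 60  # "latest 60 samples" per the text


class LeaseTelemetry:
    """Keeps the newest sample plus a bounded history ring."""

    def __init__(self):
        self.latest = None
        # deque with maxlen silently drops the oldest entry when full.
        self.history = deque(maxlen=TELEMETRY_HISTORY_LIMIT)

    def record(self, sample):
        self.latest = sample
        self.history.append(sample)


lease = LeaseTelemetry()
for i in range(75):
    lease.record({"load1": i / 10, "capturedAt": i})

# 75 samples recorded, only the newest 60 (15..74) are retained.
print(len(lease.history), lease.history[0]["capturedAt"], lease.latest["capturedAt"])
```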
Direct-provider mode does not have a central heartbeat or alarm. It labels machines with `created_at`, `last_touched_at`, `idle_timeout_secs`, `expires_at`, `state`, `lease`, and `slug`; `crabbox cleanup` uses those labels conservatively.
Delegated external runners, such as Blacksmith Testboxes, are visibility-only
records in the coordinator. `crabbox list --provider blacksmith-testbox` syncs
the current all-status Blacksmith table into muted `/portal` lease-grid rows and
adds inferred GitHub Actions run/workflow links and status/conclusion badges
when available; a later sync marks missing runners stale. Long-queued or
long-running Actions owners are tagged as `stuck`, and each row can open a
visibility-only runner detail page. These rows do not heartbeat and do not
participate in Crabbox lease expiry, cleanup, or cost accounting.
## Cleanup
Brokered cleanup is owned by the Durable Object alarm. `crabbox cleanup` refuses to run when a coordinator is configured, because sweeping provider resources behind the coordinator can delete live leases.
@ -85,7 +96,7 @@ CRABBOX_MAX_MONTHLY_USD_PER_ORG
CRABBOX_DEFAULT_ORG
```
The CLI sends `X-Crabbox-Owner` from `CRABBOX_OWNER`, Git author/committer email env, or local `git config user.email`. It sends `X-Crabbox-Org` from `CRABBOX_ORG` when set. Cloudflare Access email still wins when present.
For signed GitHub login tokens, owner/org is embedded in the token that the Worker forwards to the Fleet Durable Object. In shared-token automation, the CLI sends `X-Crabbox-Owner` from `CRABBOX_OWNER`, Git author/committer email env, or local `git config user.email`, and sends `X-Crabbox-Org` from `CRABBOX_ORG` when set. Raw Cloudflare Access identity headers are ignored; only a verified Access JWT email can become the bearer-token owner.
If a new lease would exceed a configured active-lease or monthly reserved-cost limit, the coordinator returns `cost_limit_exceeded` and does not provision the machine.
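A minimal sketch of the pre-provisioning limit check described above. The function `check_cost_limits` and its parameters are hypothetical; the actual coordinator logic, limit scopes, and cost estimation are not shown in this document. Only the `cost_limit_exceeded` code and the two limit kinds come from the text.

```python
def check_cost_limits(active_leases, month_reserved_usd, new_lease_usd,
                      max_active_leases=None, max_monthly_usd=None):
    """Return None when the lease may proceed, else the error code.

    A limit of None means that limit is not configured (an assumption
    of this sketch).
    """
    if max_active_leases is not None and active_leases + 1 > max_active_leases:
        return "cost_limit_exceeded"
    if max_monthly_usd is not None and month_reserved_usd + new_lease_usd > max_monthly_usd:
        return "cost_limit_exceeded"
    return None


# Within both limits: provisioning may continue.
print(check_cost_limits(2, 40.0, 5.0, max_active_leases=4, max_monthly_usd=100.0))
# The new lease would push reserved cost past the monthly cap.
print(check_cost_limits(2, 98.0, 5.0, max_active_leases=4, max_monthly_usd=100.0))
```

The check runs before any provider call, which matches the statement that the machine is not provisioned when a limit would be exceeded.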


@ -52,6 +52,8 @@ pnpm check:changed
git diff --name-only origin/main...
```
When a box is already Actions-hydrated and the remote checkout already has the configured base ref at the same SHA as the local `origin/<baseRef>`, Crabbox skips the extra Git hydration fetch and records the skip reason in the sync summary. This keeps dirty-overlay reruns focused on rsync plus the command instead of repeatedly fetching base history.
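The skip condition above reduces to a small predicate. This is an illustrative sketch: the function name `git_hydrate_decision`, its parameters, and the reason string are assumptions, not the wording Crabbox writes into the sync summary.

```python
def git_hydrate_decision(actions_hydrated, remote_base_sha, local_base_sha):
    """Return (should_fetch, skip_reason) for the extra Git hydration step."""
    if actions_hydrated and remote_base_sha and remote_base_sha == local_base_sha:
        # Remote checkout already has the base ref at the local SHA.
        return False, "remote base ref already matches local origin SHA"
    return True, ""


print(git_hydrate_decision(True, "abc123", "abc123"))   # skip the fetch
print(git_hydrate_decision(True, "abc123", "def456"))   # base moved: fetch
print(git_hydrate_decision(False, "abc123", "abc123"))  # not hydrated: fetch
```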
## Package And Tool Caches
Runner bootstrap prepares shared cache directories, but does not install project runtimes. Package-manager and Docker caches are best-effort speedups once the repository setup installs those tools; they must not be treated as source of truth.
@ -84,7 +86,7 @@ Typical choices:
- `large`: broad test shards or heavy builds.
- `beast`: high-core changed-test runs.
Hetzner dedicated classes can hit account quota. AWS Spot classes can hit regional capacity. For AWS, `CRABBOX_CAPACITY_STRATEGY=most-available` and multiple `CRABBOX_CAPACITY_REGIONS` give the coordinator more room to find capacity.
Hetzner dedicated classes can hit account quota. AWS Spot classes can hit regional capacity or account policy limits. For AWS, class requests try the configured high-core candidates first and can fall back to a small burstable type when the account rejects those candidates. Multiple `CRABBOX_CAPACITY_REGIONS` let brokered and direct AWS launches move to another region before giving up.
## Measure The Loop
@ -94,4 +96,4 @@ Use wall-clock timing around the whole command, not just the remote test process
/usr/bin/time -p bin/crabbox run --id cbx_... -- pnpm test:changed:max
```
The useful number includes lease wait, SSH readiness, sync, Git hydration, command execution, and release. For warm leases, sync fingerprints and package caches should make repeated runs much faster than cold runs.
The useful number includes lease wait, SSH readiness, sync, Git hydration, command execution, and release. Add `--timing-json` when comparing providers or checking whether a run paid for `rsync`, `git_hydrate`, or only the remote command. For warm leases, sync fingerprints and package caches should make repeated runs much faster than cold runs.

484
docs/plan/vnc.md Normal file

@ -0,0 +1,484 @@
# Interactive Desktop, VNC, And Browser Plan
Read when:
- implementing `--desktop`, `--browser`, or `crabbox vnc`;
- changing Linux UI bootstrap or browser provisioning;
- deciding how static macOS/Windows hosts participate in interactive QA;
- reviewing the security boundary for desktop takeover.
## Goal
Implement the first real Crabbox interactive-desktop vertical slice so
Mantis/OpenClaw can request a UI-capable machine, run browser automation in a
visible session, and let Peter take over through a tunnel.
Crabbox owns machine capability:
- lease lifecycle, TTL, idle touch, cleanup, and claims;
- provider-specific bootstrap and SSH connection details;
- desktop services, browser installation/probing, and connection metadata;
- tunnel-only VNC instructions.
Mantis/OpenClaw own scenario logic:
- Discord or app credentials;
- browser profiles, Playwright/Selenium scripts, assertions, screenshots, and
videos;
- PR comments, artifacts, and pass/fail reporting.
## Capability Flags
Use two explicit capability flags:
```sh
crabbox warmup --desktop
crabbox warmup --desktop --browser
crabbox run --desktop --browser -- <command...>
```
`--desktop` means the lease should expose a visible UI session and takeover
path. On managed Linux this provisions desktop/VNC services. On static targets
it probes existing operator-managed services.
`--browser` means the target should have a known browser binary for automation.
It is separate because browser installation is heavier and more provider/OS
specific than a basic display session.
For `run`, `--browser` never implies `--desktop`. It supports headless browser
automation on a machine with a known browser binary. Use `--desktop --browser`
only when the browser should run in the visible VNC session.
Store both capabilities on leases:
```json
{
"desktop": true,
"browser": true
}
```
Provider labels/tags should include:
```text
desktop=true
browser=true
```
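A minimal sketch of how the two booleans might map onto both the lease JSON and the provider labels. The `Capabilities` type and its method names are assumptions made for illustration; the real lease record may name or nest these fields differently. This version always emits explicit `true`/`false` label values, which is only one of the two label policies this plan allows.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
)

// Capabilities is a hypothetical struct for this sketch, not the real
// Crabbox lease record.
type Capabilities struct {
	Desktop bool `json:"desktop"`
	Browser bool `json:"browser"`
}

// Labels renders the capabilities as provider labels/tags with explicit
// true/false values.
func (c Capabilities) Labels() map[string]string {
	return map[string]string{
		"desktop": strconv.FormatBool(c.Desktop),
		"browser": strconv.FormatBool(c.Browser),
	}
}

// LeaseJSON returns the JSON fragment stored on the lease, or an empty
// string if marshalling fails.
func (c Capabilities) LeaseJSON() string {
	b, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	c := Capabilities{Desktop: true, Browser: true}
	fmt.Println(c.LeaseJSON())

	// Sort keys so the label output is stable across runs.
	labels := c.Labels()
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, labels[k])
	}
}
```

Deriving both representations from one struct keeps the lease record and the cloud labels from drifting apart.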
## CLI Surface
Add:
```sh
crabbox warmup --desktop [--browser]
crabbox run --desktop [--browser] -- <command...>
crabbox vnc --id <lease-or-slug>
```
`crabbox vnc` should resolve a lease the same way `crabbox ssh` does, claim and
touch it as any manual use would, and print a concise connection block:
```text
lease: cbx_... slug=blue-lobster provider=aws target=linux
display: :99
ssh tunnel:
ssh -i ... -p 2222 -N -L 5901:127.0.0.1:5900 crabbox@203.0.113.10
vnc:
localhost:5901
Keep the tunnel process running while connected.
```
JSON output can come later. Text output is enough for v0.
If noVNC is implemented later, extend the block with a local browser URL. Do
not implement public noVNC in this slice.
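The tunnel line in the connection block above could be assembled roughly as follows. This is a sketch under assumptions: `tunnelTarget` and `tunnelArgs` are hypothetical names, and the key path is a placeholder rather than a real Crabbox location. Only standard OpenSSH options (`-i`, `-p`, `-N`, `-L`) are used.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// tunnelTarget holds the SSH details needed for the tunnel. The type and
// field names are hypothetical, not Crabbox's real SSH target struct.
type tunnelTarget struct {
	User    string
	Host    string
	Port    int
	KeyPath string
}

// tunnelArgs builds ssh arguments that forward a local port to the
// loopback-bound VNC port on the runner. -N means no remote command runs.
func tunnelArgs(t tunnelTarget, localPort, remotePort int) []string {
	forward := fmt.Sprintf("%d:127.0.0.1:%d", localPort, remotePort)
	return []string{
		"-i", t.KeyPath,
		"-p", strconv.Itoa(t.Port),
		"-N",
		"-L", forward,
		t.User + "@" + t.Host,
	}
}

func main() {
	t := tunnelTarget{
		User:    "crabbox",
		Host:    "203.0.113.10",
		Port:    2222,
		KeyPath: "/path/to/lease-key", // placeholder path
	}
	fmt.Println("ssh " + strings.Join(tunnelArgs(t, 5901, 5900), " "))
	fmt.Println("vnc: localhost:5901")
}
```

Because the forward target is always `127.0.0.1` on the runner, the printed command keeps working with a VNC server that never binds a public interface.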
## Security Boundary
Hard requirements:
- never expose VNC/noVNC to the public internet;
- bind runner-side VNC to `127.0.0.1`;
- do not add provider firewall/security-group ingress for VNC;
- print SSH tunnel commands only;
- do not put VNC passwords in command-line arguments, provider labels, run
history, or logs;
- keep TTL and idle-timeout behavior unchanged;
- cleanup remains VM deletion or static-host no-op, as today.
For Linux v0, use loopback-bound x11vnc with a per-lease password:
```sh
x11vnc -display :99 -localhost -rfbport 5900 -forever -shared -rfbauth /var/lib/crabbox/vnc.pass
```
Generate a per-lease remote password file, do not log it, and have
`crabbox vnc` retrieve and print it only when needed.
## Managed Linux Bootstrap
Default bootstrap must remain tiny. Desktop/browser packages are installed only
when requested.
### `--desktop`
Install the smallest useful visible-session stack:
```text
xvfb
xfce4-session
xfwm4
xfce4-panel
xfdesktop4
xfce4-terminal
x11vnc
xauth
dbus-x11
fonts-dejavu
fonts-liberation
ca-certificates
```
Use systemd units so the desktop survives command boundaries on kept leases:
- `crabbox-xvfb.service`
- `crabbox-desktop.service`
- `crabbox-x11vnc.service`
Suggested unit behavior:
```text
crabbox-xvfb:
Xvfb :99 -screen 0 1920x1080x24 -nolisten tcp -ac
crabbox-desktop:
DISPLAY=:99 startxfce4
crabbox-x11vnc:
x11vnc -display :99 -localhost -rfbport 5900 -forever -shared -rfbauth /var/lib/crabbox/vnc.pass
```
`crabbox-ready` should check desktop readiness only when `desktop=true`:
```sh
systemctl is-active --quiet crabbox-xvfb.service
systemctl is-active --quiet crabbox-desktop.service
systemctl is-active --quiet crabbox-x11vnc.service
ss -ltn | grep -q '127.0.0.1:5900'
```
Normal non-desktop leases must not run these checks.
### `--browser`
Browser support should be opt-in.
For managed Linux, install Chrome stable where feasible and fall back to a
verified Chromium package when the distro provides one. Prefer Chrome stable
over Ubuntu `chromium-browser`, because Ubuntu's Chromium commonly routes
through Snap, which is awkward in minimal cloud images.
Preferred managed Linux path:
1. install Google signing key into `/etc/apt/keyrings`;
2. add the Chrome apt source;
3. install `google-chrome-stable`;
4. write a small metadata file with the discovered browser path.
Example metadata:
```text
/var/lib/crabbox/browser.env
```
Content:
```sh
CHROME_BIN=/usr/bin/google-chrome
BROWSER=/usr/bin/google-chrome
```
`crabbox-ready` should check the browser only when `browser=true`:
```sh
test -x /usr/bin/google-chrome
/usr/bin/google-chrome --version
```
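One way to keep the default readiness path untouched is to build the extra checks from the two capability flags. The sketch below assumes a hypothetical `readinessChecks` helper; the check commands themselves are copied from this plan.

```go
package main

import "fmt"

// readinessChecks returns the extra shell checks for a lease. Default leases
// get none, so the base readiness path stays unchanged.
func readinessChecks(desktop, browser bool) []string {
	var checks []string
	if desktop {
		checks = append(checks,
			"systemctl is-active --quiet crabbox-xvfb.service",
			"systemctl is-active --quiet crabbox-desktop.service",
			"systemctl is-active --quiet crabbox-x11vnc.service",
			"ss -ltn | grep -q '127.0.0.1:5900'",
		)
	}
	if browser {
		checks = append(checks,
			"test -x /usr/bin/google-chrome",
			"/usr/bin/google-chrome --version",
		)
	}
	return checks
}

func main() {
	// Print the checks a desktop+browser lease would run.
	for _, c := range readinessChecks(true, true) {
		fmt.Println(c)
	}
}
```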
## Runtime Environment
When `run --desktop` executes on a Linux desktop-capable target, inject:
```sh
DISPLAY=:99
CRABBOX_DESKTOP=1
```
When `run --desktop --browser` knows a browser path, also inject:
```sh
CRABBOX_BROWSER=1
CHROME_BIN=/usr/bin/google-chrome
BROWSER=/usr/bin/google-chrome
```
This should merge with the existing allowed-env and Actions env-file behavior.
Do not leak secrets; these values are static machine metadata.
If `--desktop` is requested against an existing lease that was not provisioned
with `desktop=true`, fail clearly before running:
```text
lease cbx_... was not created with desktop=true; warm a new lease with --desktop
```
Static Linux can instead probe services and proceed if they are already present.
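The injection and enforcement rules in this section can be sketched as a single function. The `lease` type, its fields, and `runEnv` are hypothetical. The sketch also assumes that browser variables are injected whenever `--browser` is requested and a path is known, including headless runs without `--desktop`, which this plan does not state explicitly.

```go
package main

import "fmt"

// lease is a hypothetical view of the stored lease capabilities.
type lease struct {
	ID      string
	Desktop bool
	Browser bool
	Chrome  string // discovered browser path, empty when unknown
}

// runEnv returns the static machine metadata to inject for a run, or an
// error when --desktop is requested on a lease without desktop=true.
func runEnv(l lease, wantDesktop, wantBrowser bool) (map[string]string, error) {
	env := map[string]string{}
	if wantDesktop {
		if !l.Desktop {
			return nil, fmt.Errorf(
				"lease %s was not created with desktop=true; warm a new lease with --desktop", l.ID)
		}
		env["DISPLAY"] = ":99"
		env["CRABBOX_DESKTOP"] = "1"
	}
	// Assumption: browser env is injected only when a path is actually known.
	if wantBrowser && l.Browser && l.Chrome != "" {
		env["CRABBOX_BROWSER"] = "1"
		env["CHROME_BIN"] = l.Chrome
		env["BROWSER"] = l.Chrome
	}
	return env, nil
}

func main() {
	l := lease{ID: "cbx_example", Desktop: true, Browser: true, Chrome: "/usr/bin/google-chrome"}
	env, err := runEnv(l, true, true)
	fmt.Println(env, err)

	_, err = runEnv(lease{ID: "cbx_plain"}, true, false)
	fmt.Println(err)
}
```

Failing before the run starts keeps a misprovisioned lease from producing a confusing display error halfway through a test command.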
## Provider Behavior
### Brokered AWS/Hetzner
Support both `--desktop` and `--browser`.
Flow:
1. CLI sends `desktop` and `browser` in the lease request.
2. Worker validates Linux-only target as today.
3. Worker stores both booleans on `LeaseRecord`.
4. Worker labels/tags cloud machines with `desktop` and `browser`.
5. Worker cloud-init appends optional desktop/browser bootstrap blocks.
6. CLI receives the booleans back from `CoordinatorLease`.
7. `run` and `vnc` enforce/probe the capability before use.
Do not change AWS security group ingress. SSH remains the only public ingress.
### Direct AWS/Hetzner
Support both `--desktop` and `--browser` with the same optional cloud-init path
as the Worker.
Direct labels should include the booleans so `findLease` can detect whether an
existing lease is desktop/browser-capable.
### Static Linux
Support `crabbox vnc` if services already exist. Do not install packages on
static hosts in v0.
Probe:
```sh
test "${DISPLAY:-:99}" = ":99" || true
pgrep -f 'Xvfb :99'
pgrep -f x11vnc
ss -ltn | grep -q '127.0.0.1:5900'
```
For browser:
```sh
command -v google-chrome || command -v chromium || command -v chromium-browser
```
If missing, fail with clear operator instructions.
### Static macOS
Do not install or enable services in v0.
Support browser probing:
```sh
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --version
```
For takeover, macOS Screen Sharing uses VNC-compatible port `5900`, but enabling
it requires administrator configuration. `crabbox vnc` can print a tunnel only
if a VNC service is reachable on the host's loopback interface
(`127.0.0.1:5900` or `localhost:5900`).
If not reachable:
```text
target=macos does not expose a localhost VNC service; enable Screen Sharing or use a preconfigured VNC server
```
### Static Windows
Do not install or enable services in v0.
Support browser probing for common paths or `where`:
```powershell
where chrome.exe
where msedge.exe
```
Windows native takeover is RDP, not VNC. For v0, `crabbox vnc` should fail
unless a VNC server is already bound to loopback and reachable through SSH.
Clear failure:
```text
target=windows does not support managed VNC in v0; configure a loopback VNC server or use an OS-native remote desktop path
```
Do not open firewall rules or install a VNC server automatically.
### Blacksmith Testbox
`--desktop` and `crabbox vnc` are unsupported until Blacksmith exposes a stable
tunnel/connection API.
Headless browser automation can remain possible through Blacksmith-owned
workflow setup, but Crabbox should fail clearly for desktop takeover:
```text
desktop/VNC is not supported for provider=blacksmith-testbox; Blacksmith owns machine connectivity
```
## Implementation Files
CLI:
- `internal/cli/app.go`: route `vnc`, top-level help.
- `internal/cli/config.go`: `Desktop`, `Browser`, YAML/env parsing.
- `internal/cli/run.go`: `--desktop`, `--browser`, lease acquisition, existing
lease enforcement, run env injection.
- `internal/cli/bootstrap.go`: optional desktop/browser cloud-init blocks.
- `internal/cli/coordinator.go`: request/response structs and lease conversion.
- `internal/cli/provider_labels.go`: direct provider labels.
- `internal/cli/static.go`: static target probe behavior.
- `internal/cli/ssh_cmd.go`: reuse patterns for claim/touch.
- `internal/cli/vnc.go`: new command.
- `internal/cli/target.go`: provider/target validation helpers.
Worker:
- `worker/src/types.ts`: `desktop`, `browser` on request/record.
- `worker/src/config.ts`: config coercion/defaults.
- `worker/src/bootstrap.ts`: optional desktop/browser bootstrap.
- `worker/src/provider-labels.ts`: cloud labels.
- `worker/src/fleet.ts`: persist booleans and return them in leases.
Docs:
- `docs/features/interactive-desktop-vnc.md`
- `docs/features/runner-bootstrap.md`
- `docs/commands/warmup.md`
- `docs/commands/run.md`
- `docs/commands/vnc.md`
- `docs/commands/README.md`
- `docs/features/README.md`
- `README.md`
- `docs/source-map.md`
## Tests
Go tests:
- `cloudInit(baseConfig())` does not include desktop/browser packages or units.
- `cloudInit(Config{Desktop:true})` includes desktop packages, units, and
desktop readiness checks.
- `cloudInit(Config{Desktop:true, Browser:true})` includes Chrome setup and
browser readiness checks.
- `--desktop` and `--browser` parse for `warmup` and `run`.
- `run --desktop` injects `DISPLAY=:99` and `CRABBOX_DESKTOP=1`.
- `run --desktop --browser` injects `CHROME_BIN`, `BROWSER`, and
`CRABBOX_BROWSER=1` when metadata exists or managed Linux defaults apply.
- `crabbox vnc --id <lease>` prints SSH tunnel, VNC endpoint, display, and
tunnel warning.
- `crabbox vnc` rejects Blacksmith and unsupported static macOS/Windows cases
with clear messages.
- Existing `warmup` and `run` tests confirm default behavior remains unchanged.
Worker tests:
- `leaseConfig` defaults `desktop=false`, `browser=false`.
- `leaseConfig({ desktop:true, browser:true })` preserves both.
- Worker cloud-init excludes desktop/browser blocks by default.
- Worker cloud-init includes desktop/browser blocks only when requested.
- Fleet create response stores `desktop` and `browser`.
- Provider labels include `desktop=true` and `browser=true` only when requested,
or carry explicit `false` values if label consistency is preferred.
Docs tests:
- `npm run docs:check` must pass after adding `docs/commands/vnc.md`.
## Gates
Focused during implementation:
```sh
go test ./internal/cli
npm test --prefix worker -- bootstrap config provider-labels fleet
npm run docs:check
```
Before handoff:
```sh
gofmt -w $(git ls-files '*.go')
go vet ./...
go test -race ./...
scripts/check-go-coverage.sh 85.0
npm run check
npm run docs:check
npm run format:check --prefix worker
npm run lint --prefix worker
npm run check --prefix worker
npm test --prefix worker
npm run build --prefix worker
git diff --check
```
Live proof:
```sh
go build -trimpath -o bin/crabbox ./cmd/crabbox
bin/crabbox warmup --provider aws --type t3.small --desktop --browser --ttl 20m --idle-timeout 5m
bin/crabbox run --id <slug> --desktop --browser -- google-chrome --version
bin/crabbox run --id <slug> --desktop --browser --shell 'echo "$DISPLAY"; echo "$CHROME_BIN"'
bin/crabbox vnc --id <slug>
bin/crabbox stop <slug>
bin/crabbox admin leases --state active --json
```
For the first live run, also verify over SSH that VNC is loopback-bound:
```sh
ss -ltn | grep 5900
```
Expected remote bind:
```text
127.0.0.1:5900
```
## Acceptance Criteria
1. Existing `warmup` and `run` behavior is unchanged without `--desktop` or
`--browser`.
2. `warmup --desktop` requests and provisions a Linux lease with desktop
bootstrap.
3. `warmup --desktop --browser` additionally provisions a known browser binary.
4. `run --desktop --browser -- <cmd>` runs with `DISPLAY=:99` and browser env.
5. `crabbox vnc --id <lease>` prints a usable SSH tunnel command and endpoint.
6. VNC is never exposed publicly; no provider firewall ingress is added.
7. Static Linux can participate if services already exist.
8. Static macOS/Windows fail clearly when VNC/browser prerequisites are missing.
9. Blacksmith desktop/VNC fails clearly.
10. Docs and tests are updated.
11. The repo is clean except for intentional commits.
## Deferred
- noVNC/websockify.
- Automatic static macOS Screen Sharing enablement.
- Automatic Windows VNC/RDP service installation.
- Browser profile lifecycle management.
- Scenario screenshots, videos, assertions, and PR comments.
- Blacksmith Testbox desktop integration.

582
docs/provider-backends.md Normal file

@ -0,0 +1,582 @@
# Provider Backends
Read when:
- adding a new Crabbox provider;
- deciding between an SSH lease backend and a delegated run backend;
- adding provider-specific flags or config;
- reviewing a provider PR for the right ownership boundary;
- designing a future external provider plugin protocol.
Crabbox providers are built around one rule:
Providers configure backends. Core commands own workflows.
That keeps `crabbox run`, `warmup`, `list`, `status`, `stop`, `cleanup`,
Actions hydration, sync, result collection, rendering, and timing consistent
across providers. A provider should describe what it can do and return a backend
object. It should not fork the command surface.
## Choose The Backend Shape
Start by choosing the execution model.
### SSH Lease Backend
Use `SSHLeaseBackend` when the provider can hand Crabbox an SSH target.
Examples:
- Hetzner Cloud
- AWS EC2
- static SSH hosts
Crabbox core owns the normal workflow after acquisition:
- claim and slug handling;
- SSH readiness checks;
- network target resolution;
- sync and sync guardrails;
- command wrapping and streaming;
- JUnit/result collection;
- Actions runner hydration over SSH;
- heartbeat/touch;
- release.
The backend owns only provider lifecycle:
```go
type SSHLeaseBackend interface {
Backend
Acquire(ctx context.Context, req AcquireRequest) (LeaseTarget, error)
Resolve(ctx context.Context, req ResolveRequest) (LeaseTarget, error)
List(ctx context.Context, req ListRequest) ([]LeaseView, error)
ReleaseLease(ctx context.Context, req ReleaseLeaseRequest) error
Touch(ctx context.Context, req TouchRequest) (Server, error)
}
```
Implement this when `LeaseTarget.SSH` can be populated with host, port, user,
key, work root, target OS, and Windows mode.
### Delegated Run Backend
Use `DelegatedRunBackend` when the provider owns execution instead of exposing
Crabbox-managed SSH.
Examples:
- Blacksmith Testbox
- Islo sandboxes, where Islo owns workspace setup and command streaming
- Daytona sandboxes for `run`, where Daytona toolbox owns file upload and
process execution while `crabbox ssh` still uses short-lived SSH tokens
- a future external runner service that accepts a command and streams output
The delegated backend owns warmup, command execution, output streaming, and
stop. Crabbox core still owns provider selection, config loading, local claims,
friendly slugs, timing summaries, and normalized list/status rendering.
```go
type DelegatedRunBackend interface {
Backend
Warmup(ctx context.Context, req WarmupRequest) error
Run(ctx context.Context, req RunRequest) (RunResult, error)
List(ctx context.Context, req ListRequest) ([]LeaseView, error)
Status(ctx context.Context, req StatusRequest) (StatusView, error)
Stop(ctx context.Context, req StopRequest) error
}
```
Delegated backends return normalized `StatusView` values. Rendering remains
core-owned, so provider packages should not print their own `status` or `list`
tables unless a compatibility interface explicitly asks for native output.
A delegated backend must reject sync-only options that Crabbox cannot honor:
```go
if err := cli.RejectDelegatedSyncOptions(providerName, req); err != nil {
return RunResult{}, err
}
```
Do not pretend a delegated provider is SSH-like unless the provider has a stable
SSH contract. If Crabbox cannot run rsync and remote commands itself, use
`DelegatedRunBackend`.
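For illustration only, a check of this kind might look like the sketch below. It is not the body of `cli.RejectDelegatedSyncOptions`; the `syncOptions` struct, its fields, and the error wording are assumptions. The three option names come from the test list later in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// syncOptions lists the Crabbox-owned sync knobs a delegated provider cannot
// honor. The struct and field names are hypothetical.
type syncOptions struct {
	SyncOnly   bool
	Checksum   bool
	ForceLarge bool
}

// rejectSyncOptions returns an error naming every unsupported option that
// was set, or nil when none were requested.
func rejectSyncOptions(provider string, o syncOptions) error {
	var set []string
	if o.SyncOnly {
		set = append(set, "sync-only")
	}
	if o.Checksum {
		set = append(set, "checksum")
	}
	if o.ForceLarge {
		set = append(set, "force-large")
	}
	if len(set) == 0 {
		return nil
	}
	return fmt.Errorf("provider=%s does not support %s; the provider owns sync",
		provider, strings.Join(set, ", "))
}

func main() {
	fmt.Println(rejectSyncOptions("islo", syncOptions{}))
	fmt.Println(rejectSyncOptions("islo", syncOptions{Checksum: true, ForceLarge: true}))
}
```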
### Optional Interfaces
Add optional capabilities as small interfaces instead of widening every backend.
Cleanup is already optional:
```go
type CleanupBackend interface {
Backend
Cleanup(ctx context.Context, req CleanupRequest) error
}
```
List JSON compatibility is optional:
```go
type JSONListBackend interface {
Backend
ListJSON(ctx context.Context, req ListRequest) (any, error)
}
```
`JSONListBackend` is a compatibility escape hatch for script-facing JSON shapes.
Use it only when an existing provider already exposed a different JSON schema
than the normalized `[]LeaseView` shape.
Future provider-specific capability areas should follow the same pattern, for
example pricing or image management.
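Core code can discover an optional capability with a type assertion. The sketch below uses simplified stand-ins: the one-method `Backend`, the `CleanupRequest` fields, and `runCleanup` are assumptions, not the real `internal/cli` definitions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Backend is a simplified stand-in for the core backend interface.
type Backend interface{ Name() string }

// CleanupRequest is a stand-in; the real request likely carries more fields.
type CleanupRequest struct{ DryRun bool }

// CleanupBackend is the optional capability, mirroring the shape above.
type CleanupBackend interface {
	Backend
	Cleanup(ctx context.Context, req CleanupRequest) error
}

var errCleanupUnsupported = errors.New("cleanup not supported")

// runCleanup calls Cleanup only when the backend opts in to the interface.
func runCleanup(ctx context.Context, b Backend, req CleanupRequest) error {
	cb, ok := b.(CleanupBackend)
	if !ok {
		return fmt.Errorf("provider %s: %w", b.Name(), errCleanupUnsupported)
	}
	return cb.Cleanup(ctx, req)
}

type plainBackend struct{}

func (plainBackend) Name() string { return "plain" }

type cleaningBackend struct{ calls int }

func (*cleaningBackend) Name() string { return "cleaning" }

func (c *cleaningBackend) Cleanup(context.Context, CleanupRequest) error {
	c.calls++
	return nil
}

// demoOptionalCleanup runs cleanup against both backends and reports how
// often Cleanup ran plus the error from the backend that lacks it.
func demoOptionalCleanup() (int, error) {
	ctx := context.Background()
	plainErr := runCleanup(ctx, plainBackend{}, CleanupRequest{DryRun: true})
	c := &cleaningBackend{}
	_ = runCleanup(ctx, c, CleanupRequest{DryRun: true})
	return c.calls, plainErr
}

func main() {
	calls, err := demoOptionalCleanup()
	fmt.Println(calls, err)
}
```

Backends that have nothing to clean up never implement the method, so the base interface stays small.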
## Package Layout
Built-in providers live under `internal/providers/<name>`:
```text
internal/providers/all
internal/providers/hetzner
internal/providers/aws
internal/providers/ssh
internal/providers/blacksmith
internal/providers/daytona
internal/providers/islo
```
Each provider package owns registration, provider name, aliases, spec,
provider-specific flags, backend configuration, provider clients, provider
lifecycle code, and provider-specific tests. `cmd/crabbox` imports
`internal/providers/all` for side-effect registration:
```go
import (
"github.com/openclaw/crabbox/internal/cli"
_ "github.com/openclaw/crabbox/internal/providers/all"
)
```
The core provider contract remains in `internal/cli`; built-in implementations
live in their provider folders:
```text
internal/cli/provider_backend.go # interfaces, registry, request/result types
internal/cli/provider_coordinator.go # brokered coordinator lease wrapper
internal/cli/provider_labels.go # shared direct-provider label helpers
internal/providers/shared # shared direct SSH retry/touch/cleanup helpers
internal/providers/aws # AWS SSH lease backend
internal/providers/hetzner # Hetzner SSH lease backend
internal/providers/ssh # static SSH backend
internal/providers/blacksmith # Blacksmith delegated backend
internal/providers/daytona # Daytona SSH + delegated SDK backend
internal/providers/islo # Islo delegated backend
```
Provider packages may use small exported core helpers for claims, labels,
sync preflight, timing JSON, and SSH key storage. Keep that helper surface
narrow: if a provider needs broad command orchestration, the behavior probably
belongs in core instead.
## Provider Registration
A provider implements `cli.Provider`:
```go
type Provider interface {
Name() string
Aliases() []string
Spec() ProviderSpec
RegisterFlags(fs *flag.FlagSet, defaults Config) any
ApplyFlags(cfg *Config, fs *flag.FlagSet, values any) error
Configure(cfg Config, rt Runtime) (Backend, error)
}
```
Minimal SSH provider package:
```go
package example
import (
"flag"
"github.com/openclaw/crabbox/internal/cli"
)
func init() {
cli.RegisterProvider(Provider{})
}
type Provider struct{}
func (Provider) Name() string { return "example" }
func (Provider) Aliases() []string { return nil }
func (Provider) Spec() cli.ProviderSpec {
return cli.ProviderSpec{
Name: "example",
Kind: cli.ProviderKindSSHLease,
Targets: []cli.TargetSpec{
{OS: "linux"},
},
Features: cli.FeatureSet{
cli.FeatureSSH,
cli.FeatureCrabboxSync,
},
Coordinator: cli.CoordinatorNever,
}
}
func (Provider) RegisterFlags(*flag.FlagSet, cli.Config) any {
return cli.NoProviderFlags()
}
func (Provider) ApplyFlags(*cli.Config, *flag.FlagSet, any) error {
return nil
}
func (p Provider) Configure(cfg cli.Config, rt cli.Runtime) (cli.Backend, error) {
return cli.NewExampleLeaseBackend(p.Spec(), cfg, rt), nil
}
```
`NewExampleLeaseBackend` stands in for the backend constructor you add for the
provider. Existing providers use constructors such as `NewAWSLeaseBackend` and
`NewBlacksmithBackend`.
Then add the provider to `internal/providers/all/all.go`:
```go
import _ "github.com/openclaw/crabbox/internal/providers/example"
```
Tests in `internal/cli` do not import `internal/providers/all`, because that
would create an import cycle. Register test providers from a same-package test
file when testing core dispatch.
## Provider Spec
`ProviderSpec` is command-facing metadata:
```go
type ProviderSpec struct {
Name string
Kind ProviderKind
Targets []TargetSpec
Features FeatureSet
Coordinator CoordinatorMode
}
```
Use canonical provider names in docs and config. Aliases are for compatibility.
Pick `Kind` carefully:
- `ProviderKindSSHLease`: provider returns SSH targets and Crabbox owns sync/run.
- `ProviderKindDelegatedRun`: provider owns execution and output streaming.
Targets should describe what the provider can actually satisfy. Do not list
`windows`, `macos`, `desktop`, `browser`, or `code` unless the backend supports
that path end to end.
Feature flags should be concrete:
```go
cli.FeatureSSH
cli.FeatureCrabboxSync
cli.FeatureCleanup
cli.FeatureDesktop
cli.FeatureBrowser
cli.FeatureCode
cli.FeatureTailscale
```
Actions runner hydration is intentionally not a provider feature. It is a core
SSH-over-Linux workflow. It requires:
- an SSH lease backend;
- `target=linux`;
- no delegated execution.
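The three requirements above reduce to a small predicate. This is a sketch: `canHydrateActions` is a hypothetical name, and the `ProviderKind` constants are redeclared locally so the example is self-contained.

```go
package main

import "fmt"

// ProviderKind is redeclared here as a stand-in for the core type.
type ProviderKind int

const (
	ProviderKindSSHLease ProviderKind = iota
	ProviderKindDelegatedRun
)

// canHydrateActions reports whether Actions runner hydration applies:
// it needs an SSH lease backend and a Linux target.
func canHydrateActions(kind ProviderKind, targetOS string) bool {
	return kind == ProviderKindSSHLease && targetOS == "linux"
}

func main() {
	fmt.Println(canHydrateActions(ProviderKindSSHLease, "linux"))
	fmt.Println(canHydrateActions(ProviderKindDelegatedRun, "linux"))
	fmt.Println(canHydrateActions(ProviderKindSSHLease, "windows"))
}
```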
Only set `CoordinatorSupported` when the Crabbox coordinator can provision that
provider. A direct-only SSH provider should use `CoordinatorNever`.
## Flags And Config
Provider flags are registered before parsing because Go's `flag` package rejects
unknown flags. `RegisterFlags` must be cheap and side-effect free. It returns an
opaque values struct that is passed back into `ApplyFlags` only after config and
common flags select the provider.
Pattern, when the provider has an exported flag helper or lives in `internal/cli`:
```go
type exampleFlagValues struct {
Region *string
}
func (Provider) RegisterFlags(fs *flag.FlagSet, defaults cli.Config) any {
return exampleFlagValues{
Region: fs.String("example-region", defaults.Example.Region, "Example region"),
}
}
func (Provider) ApplyFlags(cfg *cli.Config, fs *flag.FlagSet, values any) error {
v, ok := values.(exampleFlagValues)
if !ok {
return nil
}
if cli.FlagWasSet(fs, "example-region") {
cfg.Example.Region = *v.Region
}
return nil
}
```
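A helper like `cli.FlagWasSet` can be built on the standard library's `FlagSet.Visit`, which only walks flags that were explicitly set. The sketch below shows that technique with a local `flagWasSet`; it is not necessarily how Crabbox implements the helper, and `parseRegion` is a hypothetical wrapper added for the demo.

```go
package main

import (
	"flag"
	"fmt"
)

// flagWasSet reports whether name was explicitly passed on the command line.
// FlagSet.Visit only walks flags that were set, unlike VisitAll.
func flagWasSet(fs *flag.FlagSet, name string) bool {
	set := false
	fs.Visit(func(f *flag.Flag) {
		if f.Name == name {
			set = true
		}
	})
	return set
}

// parseRegion returns the effective region and whether the flag overrode
// the configured value.
func parseRegion(args []string, configured string) (string, bool, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	region := fs.String("example-region", configured, "Example region")
	if err := fs.Parse(args); err != nil {
		return "", false, err
	}
	if flagWasSet(fs, "example-region") {
		return *region, true, nil
	}
	return configured, false, nil
}

func main() {
	region, overridden, err := parseRegion([]string{"--example-region", "eu-west"}, "us-east")
	fmt.Println(region, overridden, err)
}
```

Checking whether the flag was set, rather than comparing against the default, lets a user pass a value that happens to equal the default and still have it count as an explicit override.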
`Config` does not yet have a generic provider config bag. New provider packages
should either:
- add typed config fields and use `cli.FlagWasSet` from the provider package; or
- expose a small provider-specific flag helper from `internal/cli`, as
Blacksmith does, when the config type is not ready to export cleanly.
If a provider needs durable config, add typed config fields in `Config` and env
overrides in `config.go`. Keep compatibility shims for existing top-level
provider config, but prefer `providers.<name>` for new provider families once
that config bag lands.
Never pass provider secrets as command-line arguments. Use environment variables,
local SDK config, the coordinator, or a credential store outside repo config.
## Runtime
Backends receive a narrow runtime:
```go
type Runtime struct {
Stdout io.Writer
Stderr io.Writer
Clock Clock
HTTP *http.Client
Exec CommandRunner
}
```
Use it instead of `App`, global clocks, or package-level command hooks.
Delegated CLI integrations must use `Runtime.Exec`:
```go
result, err := rt.Exec.Run(ctx, cli.LocalCommandRequest{
Name: "provider-cli",
Args: args,
Stdout: rt.Stdout,
Stderr: rt.Stderr,
})
```
This gives tests a fake command runner and avoids package-level
`exec.CommandContext` seams.
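A fake runner for tests might look like the following sketch. `CommandRunner`, `LocalCommandRequest`, and `CommandResult` are simplified stand-ins declared locally; the real types in `internal/cli` may carry more fields, and `listViaCLI` is a hypothetical function under test.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"strings"
)

// LocalCommandRequest is a stand-in mirroring the fields used above.
type LocalCommandRequest struct {
	Name   string
	Args   []string
	Stdout io.Writer
	Stderr io.Writer
}

// CommandResult is a hypothetical result shape for this sketch.
type CommandResult struct{ ExitCode int }

// CommandRunner is the seam that backends call instead of os/exec.
type CommandRunner interface {
	Run(ctx context.Context, req LocalCommandRequest) (CommandResult, error)
}

// fakeRunner records every request and replays canned stdout.
type fakeRunner struct {
	calls  []LocalCommandRequest
	output string
}

func (f *fakeRunner) Run(_ context.Context, req LocalCommandRequest) (CommandResult, error) {
	f.calls = append(f.calls, req)
	if req.Stdout != nil {
		if _, err := io.WriteString(req.Stdout, f.output); err != nil {
			return CommandResult{ExitCode: 1}, err
		}
	}
	return CommandResult{}, nil
}

// listViaCLI is the code under test: it shells out through the runner and
// returns whatever the provider CLI printed.
func listViaCLI(ctx context.Context, runner CommandRunner) (string, error) {
	var out strings.Builder
	_, err := runner.Run(ctx, LocalCommandRequest{
		Name:   "provider-cli",
		Args:   []string{"list"},
		Stdout: &out,
	})
	return out.String(), err
}

// demoFakeRunner exercises listViaCLI without spawning a process.
func demoFakeRunner() (string, int) {
	fake := &fakeRunner{output: "tbx_123 ready\n"}
	out, err := listViaCLI(context.Background(), fake)
	if err != nil {
		return "", len(fake.calls)
	}
	return out, len(fake.calls)
}

func main() {
	out, calls := demoFakeRunner()
	fmt.Printf("calls=%d output=%q\n", calls, out)
}
```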
Use `Runtime.Clock` for timing in backend code. Use `Runtime.Stdout` and
`Runtime.Stderr` for streaming and warnings.
## Implementing An SSH Lease Backend
An SSH lease backend should return a complete `LeaseTarget`:
```go
type LeaseTarget struct {
Server Server
SSH SSHTarget
LeaseID string
Coordinator *CoordinatorClient
}
```
`Acquire` should:
1. validate direct-provider prerequisites;
2. mint or accept the lease id handled by the request path;
3. ensure or install the SSH key;
4. provision the machine or sandbox;
5. wait until an address exists;
6. populate `SSHTarget`;
7. wait for SSH readiness when the provider owns boot;
8. mark provider labels/tags as ready;
9. return `LeaseTarget`.
`Resolve` should accept canonical lease IDs, provider IDs, names, and slugs
where the provider can support them. It should return the stored per-lease SSH
key when available.
`List` returns normalized `LeaseView` values. Do not print from `List`; command
rendering belongs to core.
`Touch` should update provider labels/tags with idle and state metadata when the
provider supports it. Static providers can update only the in-memory view.
`ReleaseLease` should be idempotent where practical. Remove local claims after
the provider release succeeds or is known to be unnecessary.
If cleanup is meaningful, implement `CleanupBackend`. Cleanup should honor
`DryRun`, log skip/delete decisions to stderr, and use provider labels to avoid
deleting unrelated machines.
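The cleanup rules above can be sketched as a filter over listed machines. The `machine` type, the `crabbox` label key, and the log wording are assumptions made for illustration; a real backend would use its own provider client and label scheme.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// machine is a hypothetical provider resource with labels.
type machine struct {
	ID     string
	Labels map[string]string
}

// cleanup deletes only machines carrying the (assumed) crabbox label, logs
// each decision to w, and deletes nothing when dryRun is true.
func cleanup(w io.Writer, machines []machine, dryRun bool, del func(id string) error) ([]string, error) {
	var deleted []string
	for _, m := range machines {
		if m.Labels["crabbox"] != "true" {
			fmt.Fprintf(w, "skip %s: not crabbox-managed\n", m.ID)
			continue
		}
		if dryRun {
			fmt.Fprintf(w, "would delete %s (dry run)\n", m.ID)
			continue
		}
		if err := del(m.ID); err != nil {
			return deleted, fmt.Errorf("delete %s: %w", m.ID, err)
		}
		fmt.Fprintf(w, "deleted %s\n", m.ID)
		deleted = append(deleted, m.ID)
	}
	return deleted, nil
}

// sampleMachines returns one managed and one unrelated machine.
func sampleMachines() []machine {
	return []machine{
		{ID: "vm-1", Labels: map[string]string{"crabbox": "true"}},
		{ID: "vm-2", Labels: map[string]string{"team": "other"}},
	}
}

// demoCleanup runs cleanup with a no-op delete and discards the log.
func demoCleanup(dryRun bool) []string {
	deleted, _ := cleanup(io.Discard, sampleMachines(), dryRun, func(string) error { return nil })
	return deleted
}

func main() {
	// Decisions go to stderr, matching the guidance above.
	deleted, err := cleanup(os.Stderr, sampleMachines(), true, func(string) error { return nil })
	fmt.Println(deleted, err)
}
```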
## Implementing A Delegated Run Backend
A delegated backend should preserve Crabbox ergonomics while letting the provider
own the remote workflow.
`Warmup` should:
1. validate provider-specific workflow config;
2. create or warm the provider resource;
3. claim the resource locally with provider name and slug;
4. print the standard warmup summary;
5. write timing JSON when requested.
`Run` should:
1. reject unsupported Crabbox sync options;
2. acquire a resource or resolve an existing id/slug;
3. claim/reclaim the resource for the repo;
4. stream provider output through `Runtime.Stdout` and `Runtime.Stderr`;
5. return `RunResult`;
6. stop temporary resources when `Keep` is false.
`List` and `Status` should return normalized views. If the provider only offers
a table or lossy native status shape, keep that parsing inside the backend.
`Stop` should stop the provider resource, remove local claims, and remove local
per-resource keys if the backend created them.
Do not make delegated providers support `crabbox ssh`, `vnc`, `webvnc`,
`screenshot`, `code`, or Actions runner hydration unless the provider exposes a
stable connection contract that preserves Crabbox's security boundary.
## Rendering
Backends return values. Core renders output.
`ListRequest` and `StatusRequest` intentionally do not carry JSON flags. The
command handler decides whether to render human output or JSON.
`JSONListBackend` is the exception for compatibility with older script-facing
JSON schemas. It should not be used for new providers.
That rule keeps:
- `crabbox list --json`;
- `crabbox status --json`;
- human tables;
- future UI/plugin consumers;
consistent across backend kinds.
## External Provider Plugins
External process plugins are not implemented yet. Do not add a provider that
depends on an undocumented stdio protocol.
The intended direction is:
- a built-in Go provider package discovers/configures the external process;
- the process speaks JSON over stdio;
- the Go side adapts it to `SSHLeaseBackend` or `DelegatedRunBackend`;
- core commands still render list/status and own SSH workflows where applicable.
Expected rough command shape:
```text
provider-plugin capabilities
provider-plugin acquire
provider-plugin resolve
provider-plugin list
provider-plugin release
provider-plugin touch
provider-plugin run
provider-plugin status
provider-plugin stop
```
The external protocol should not bypass the backend interfaces. It is an
implementation detail behind a normal registered provider.
## Tests
Add tests at the lowest level that proves the contract.
For provider registration:
- canonical name resolves through `ProviderFor`;
- aliases resolve where promised;
- `Spec` has the expected kind, targets, features, and coordinator mode;
- provider-specific flags apply only after selection.
For SSH lease backends:
- acquire success returns a `LeaseTarget` with host, user, port, key, lease id;
- acquire failure releases partial resources when possible;
- resolve supports lease id and supported aliases;
- list returns normalized views without printing;
- touch updates labels/tags and honors state/idle timeout;
- release removes claims and provider resources;
- cleanup honors dry-run.
For delegated run backends:
- sync-only/checksum/force-large options are rejected;
- new run acquires, claims, streams, and stops when `Keep=false`;
- existing id/slug resolves and claims correctly;
- list/status parse provider output into normalized views;
- stop removes claims and local keys;
- all subprocess calls go through `Runtime.Exec`.
Use fake `CommandRunner`, fake clocks, fake HTTP clients, and provider test
clients. Avoid live provider calls in unit tests.
Run at least:
```sh
go test -count=1 ./internal/cli ./internal/providers/...
go test -count=1 ./...
go vet ./...
npm run docs:check
```
For high-risk provider changes, also run:
```sh
go test -race -count=1 ./internal/cli
go build -trimpath -o bin/crabbox ./cmd/crabbox
```
Add live smoke only when credentials and cost boundaries are explicit.
## Review Checklist
Before landing a new backend:
- The provider has a folder under `internal/providers/<name>`.
- The provider is imported by `internal/providers/all`.
- `Name` is canonical and docs use that name.
- Compatibility aliases are intentional and tested.
- `ProviderSpec.Kind` matches the real execution model.
- Targets and features describe implemented behavior only.
- Coordinator mode is `CoordinatorNever` unless the coordinator can provision it.
- Provider flags are registered before parse and applied only after selection.
- Secrets are not stored in repo config or passed in argv.
- `list` and `status` return normalized values instead of printing.
- Delegated providers reject unsupported sync options.
- SSH providers do not own core sync/run/rendering.
- Tests cover command dispatch and backend behavior without live credentials.
- Docs and source map are updated.

80
docs/providers/README.md Normal file

@ -0,0 +1,80 @@
# Provider Reference
Read when:
- choosing a Crabbox provider for a repo or one-off command;
- debugging provider-specific provisioning, sync, or command execution;
- changing provider registration, flags, config, or backend behavior.
Crabbox supports managed SSH lease providers, delegated run providers, and one
static SSH provider for existing machines.
| Provider | Backend kind | Targets | Best for |
| --- | --- | --- | --- |
| [AWS](aws.md) | SSH lease | Linux, Windows, macOS | broad managed capacity, Windows, EC2 Mac |
| [Azure](azure.md) | SSH lease | Linux, Windows | Azure-backed Linux and native Windows capacity |
| [Hetzner](hetzner.md) | SSH lease | Linux | fast Linux capacity at low cost |
| [Static SSH](ssh.md) | SSH lease | Linux, macOS, Windows | reusing an existing host |
| [Blacksmith Testbox](blacksmith-testbox.md) | delegated run | Linux | existing Blacksmith Testbox workflows |
| [Daytona](daytona.md) | hybrid delegated run + SSH | Linux | Daytona snapshot sandboxes |
| [Islo](islo.md) | delegated run | Linux | Islo-owned sandbox execution |
## Shared Rules
Core Crabbox owns provider selection, config loading, friendly slugs, local repo
claims, timing summaries, command rendering, and normalized list/status output.
Providers own only their backend boundary: provisioning or delegated command
execution.
Use `--provider <name>` for one command, or set `provider: <name>` in Crabbox
config. Provider flags are registered by provider packages before command-line
parsing, so provider-specific flags work even when that provider is not the
default.
```sh
crabbox warmup --provider aws --class beast
crabbox run --provider hetzner -- pnpm test
crabbox run --provider blacksmith-testbox --id tbx_123 -- pnpm test
```
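The register-before-parse, apply-after-selection rule can be sketched with the standard `flag` package. This is a minimal illustration, not the Crabbox implementation: the `providerFlags` type, `newRegistry`, and `parseArgs` are hypothetical names, and only the flag names `--provider`, `--daytona-snapshot`, and `--blacksmith-org` come from these docs.

```go
package main

import (
	"flag"
	"fmt"
)

// providerFlags is a hypothetical per-provider hook: register declares flags
// before parsing, apply copies parsed values into the run settings.
type providerFlags struct {
	register func(fs *flag.FlagSet)
	apply    func(settings map[string]string)
}

// newRegistry builds two illustrative providers that each own one flag.
func newRegistry() map[string]providerFlags {
	var snapshot, org string
	return map[string]providerFlags{
		"daytona": {
			register: func(fs *flag.FlagSet) {
				fs.StringVar(&snapshot, "daytona-snapshot", "", "Daytona snapshot")
			},
			apply: func(s map[string]string) { s["snapshot"] = snapshot },
		},
		"blacksmith-testbox": {
			register: func(fs *flag.FlagSet) {
				fs.StringVar(&org, "blacksmith-org", "", "Blacksmith org")
			},
			apply: func(s map[string]string) { s["org"] = org },
		},
	}
}

// parseArgs registers every provider's flags, parses once, and then applies
// only the flags that belong to the selected provider.
func parseArgs(args []string, defaultProvider string) (string, map[string]string, error) {
	fs := flag.NewFlagSet("crabbox", flag.ContinueOnError)
	provider := fs.String("provider", defaultProvider, "provider name")
	registry := newRegistry()
	for _, p := range registry {
		p.register(fs) // all providers register, selected or not
	}
	if err := fs.Parse(args); err != nil {
		return "", nil, err
	}
	selected, ok := registry[*provider]
	if !ok {
		return "", nil, fmt.Errorf("unknown provider %q", *provider)
	}
	settings := map[string]string{}
	selected.apply(settings) // only the selected provider's values are used
	return *provider, settings, nil
}

func main() {
	name, settings, err := parseArgs(
		[]string{"--provider", "daytona", "--daytona-snapshot", "crabbox-ready"},
		"blacksmith-testbox",
	)
	fmt.Println(name, settings, err)
}
```

Because every provider registers its flags up front, `--daytona-snapshot` parses cleanly even though the default provider in this sketch is `blacksmith-testbox`.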
## Brokered Versus Direct
AWS, Azure, and Hetzner can run through the Crabbox coordinator or directly
from the CLI.
Coordinator mode is the normal shared-team path: the Worker owns cloud
credentials, cost state, cleanup alarms, and lease accounting.
Direct mode is for local operator debugging or non-brokered setups. It uses local
provider credentials and best-effort cleanup through provider labels.
Delegated providers do not use the Crabbox coordinator:
- Blacksmith uses the authenticated Blacksmith CLI.
- Daytona uses the Daytona API and SDK/toolbox APIs.
- Islo uses the Islo API and SDK auth.
## Feature Matrix
| Provider | `run` | `warmup` | `ssh` | VNC/code | Crabbox sync | Provider sync |
| --- | --- | --- | --- | --- | --- | --- |
| AWS | yes | yes | yes | yes | yes | no |
| Azure | yes | yes | yes | Linux VNC/code | yes | no |
| Hetzner | yes | yes | yes | Linux VNC/code | yes | no |
| Static SSH | yes | resolves host | yes | host-dependent | yes | no |
| Blacksmith Testbox | yes | yes | no | no | no | yes |
| Daytona | yes | yes | yes | no | archive via Daytona toolbox | no |
| Islo | yes | yes | no | no | no | yes |
Actions runner hydration runs core-over-SSH and requires a normal SSH lease on
Linux. Use AWS, Azure, Hetzner, or Static SSH for that path.
## Implementation
Provider implementation lives under `internal/providers/<name>`. The command
orchestration and renderer surface stays in `internal/cli`.
Related docs:
- [Provider backends](../provider-backends.md)
- [Feature overview](../features/providers.md)
- [Source map](../source-map.md)

130
docs/providers/aws.md Normal file

@ -0,0 +1,130 @@
# AWS Provider
Read when:
- choosing `provider: aws`;
- debugging EC2 capacity, quotas, AMIs, security groups, or EC2 Mac hosts;
- changing `internal/providers/aws` or brokered AWS provisioning.
AWS is the broad managed provider. It supports Linux, native Windows, Windows
WSL2, and EC2 Mac leases. The backend is an SSH lease provider: after
provisioning, Crabbox owns SSH readiness, sync, command execution, results,
desktop tunnels, and cleanup.
## When To Use
Use AWS when you need:
- managed Windows or WSL2 test machines;
- EC2 Mac desktops through a configured Dedicated Host;
- broad Linux capacity with Spot and On-Demand fallback;
- coordinator-owned cloud credentials and cost accounting.
Use Hetzner for cheaper Linux-only capacity. Use Static SSH when a known host
already exists.
## Commands
```sh
crabbox warmup --provider aws --class standard
crabbox run --provider aws --class fast -- pnpm test
crabbox run --provider aws --market on-demand -- pnpm check
crabbox warmup --provider aws --target windows --desktop
crabbox warmup --provider aws --target windows --windows-mode wsl2
crabbox warmup --provider aws --target macos --desktop --market on-demand
```
`--type` is exact. If EC2 rejects that type, Crabbox fails instead of silently
choosing another instance. Use `--class` when fallback is desired.
## Config
```yaml
provider: aws
target: linux
class: beast
market: spot
aws:
  region: us-east-1
  ami: ""
  securityGroupId: ""
  subnetId: ""
  instanceProfile: ""
  rootGB: 120
  sshCIDRs: []
```
Important direct-mode environment:
```text
AWS_PROFILE
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN
CRABBOX_AWS_REGION
CRABBOX_AWS_AMI
CRABBOX_AWS_SECURITY_GROUP_ID
CRABBOX_AWS_SUBNET_ID
CRABBOX_AWS_INSTANCE_PROFILE
CRABBOX_AWS_ROOT_GB
CRABBOX_AWS_SSH_CIDRS
CRABBOX_AWS_MAC_HOST_ID
CRABBOX_CAPACITY_REGIONS
CRABBOX_CAPACITY_AVAILABILITY_ZONES
CRABBOX_CAPACITY_HINTS
CRABBOX_CAPACITY_LARGE_CLASSES
```
Brokered AWS credentials belong in the Worker, not on developer machines.
## Targets
| Target | Notes |
| --- | --- |
| Linux | Ubuntu bootstrap, SSH, rsync, optional desktop/browser/code. |
| Windows native | EC2Launch, OpenSSH, Git for Windows, TightVNC, archive sync. |
| Windows WSL2 | Nested virtualization families; POSIX sync and commands through WSL. |
| macOS | Requires `CRABBOX_AWS_MAC_HOST_ID` or `aws.macHostId`; On-Demand only. |
## Lifecycle
1. Import or reuse the lease SSH key.
2. Select region, market, instance type, subnet, AMI, and security group.
3. Launch EC2 instance, Spot request, Windows instance, or EC2 Mac host-backed
instance.
4. Tag instance, volumes, and Spot requests with Crabbox lease labels.
5. Wait for SSH and `crabbox-ready`.
6. Let core sync and run over SSH.
7. Terminate on release, cleanup, or coordinator expiry.
Brokered cleanup is coordinator-owned. Direct cleanup is best-effort through
provider labels and `crabbox cleanup`.
## Capabilities
- SSH: yes.
- Crabbox sync: yes.
- Desktop/browser/code: yes, target-dependent.
- Tailscale: Linux managed leases.
- Actions hydration: Linux SSH leases only.
- Coordinator: yes.
## Gotchas
- Spot capacity and quota errors are normal. Prefer classes over exact `--type`
when you want fallback.
- Brokered leases include `capacityHints` unless disabled with
`capacity.hints: false` or `CRABBOX_CAPACITY_HINTS=0`.
- During capacity pressure, prefer `standard` or `fast` plus multiple
`CRABBOX_CAPACITY_REGIONS`; `beast` starts at 48xlarge candidates and can
consume 192 vCPUs per request.
- Windows WSL2 needs nested virtualization instance families.
- EC2 Mac needs an explicit Dedicated Host id.
- VNC stays behind SSH tunnels; do not expose VNC ports directly.
Related docs:
- [Feature: AWS](../features/aws.md)
- [Windows VNC](../features/vnc-windows.md)
- [macOS VNC](../features/vnc-macos.md)
- [Provider backends](../provider-backends.md)

215
docs/providers/azure.md Normal file

@ -0,0 +1,215 @@
# Azure Provider
Read when:
- choosing `provider: azure`;
- debugging Azure VM capacity, quotas, images, or SSH readiness;
- changing `internal/providers/azure` or the direct Azure provisioning code.
Azure is a managed provider for Linux and native Windows SSH leases. Azure
provisions the VM, public IP, NIC, and OS disk, then Crabbox owns SSH
readiness, sync, command execution, results, and cleanup.
## When To Use
Use Azure when the team's cloud capacity lives in an Azure subscription, or
when Microsoft tooling, Entra ID, or Azure-specific networking constraints
make AWS or Hetzner inappropriate. Use Hetzner for cheaper Linux-only
capacity and AWS for Windows desktop, Windows WSL2, or macOS targets.
Azure supports direct mode and brokered Linux/native Windows leases. Direct
mode uses local Azure credentials. Brokered mode uses the operator-owned
Azure service principal configured on the Worker.
## Commands
```sh
crabbox warmup --provider azure --class beast
crabbox run --provider azure --class standard -- pnpm test
crabbox warmup --provider azure --target windows --class standard
crabbox warmup --provider azure --desktop --browser
crabbox ssh --provider azure --id blue-lobster
crabbox stop --provider azure blue-lobster
crabbox cleanup --provider azure
```
`--type` is exact (e.g. `--type Standard_D32ads_v6`). Use `--class` when SKU
fallback is desired.
## Config
```yaml
provider: azure
target: linux
class: beast
azure:
  subscriptionId: 00000000-0000-0000-0000-000000000000
  tenantId: 00000000-0000-0000-0000-000000000000
  clientId: 00000000-0000-0000-0000-000000000000
  location: eastus
  resourceGroup: crabbox-leases
  image: Canonical:0001-com-ubuntu-server-jammy:22_04-lts-gen2:latest
  vnet: crabbox-vnet
  subnet: crabbox-subnet
  nsg: crabbox-nsg
  sshCIDRs: []
```
`subscriptionId`, `tenantId`, and `clientId` may be set in config or sourced
from environment variables. The client secret is never read from config; it
must come from the environment.
Important direct-mode environment:
```text
AZURE_SUBSCRIPTION_ID
AZURE_TENANT_ID
AZURE_CLIENT_ID
AZURE_CLIENT_SECRET
CRABBOX_AZURE_SUBSCRIPTION_ID
CRABBOX_AZURE_TENANT_ID
CRABBOX_AZURE_CLIENT_ID
CRABBOX_AZURE_LOCATION
CRABBOX_AZURE_RESOURCE_GROUP
CRABBOX_AZURE_IMAGE
CRABBOX_AZURE_VNET
CRABBOX_AZURE_SUBNET
CRABBOX_AZURE_NSG
CRABBOX_AZURE_SSH_CIDRS
```
`AZURE_*` are the standard service principal env vars consumed by
`DefaultAzureCredential`. Crabbox reads the client secret only from the
environment and never prints it.
Brokered mode uses the same Azure service-principal secrets on the Worker:
`AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, and
`AZURE_SUBSCRIPTION_ID`. Operators own the resource group, vnet, subnet,
NSG, and SSH CIDR defaults through `CRABBOX_AZURE_*` env vars. A lease
request may override only `azureLocation` and `azureImage`.
## Auth
If `azure.tenantId` and `azure.clientId` (or `CRABBOX_AZURE_TENANT_ID` /
`CRABBOX_AZURE_CLIENT_ID`) are configured and `AZURE_CLIENT_SECRET` is set
in the environment, Crabbox builds a `ClientSecretCredential` from those
explicit values. Otherwise it falls back to
[`azidentity.NewDefaultAzureCredential`](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity#DefaultAzureCredential),
which scans environment, workload identity, managed identity, and CLI
credentials in order. The simplest setup is a service principal with the
[Contributor](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#contributor)
role scoped to the resource group, configured via:
```sh
export AZURE_TENANT_ID=...
export AZURE_CLIENT_ID=...
export AZURE_CLIENT_SECRET=...
export AZURE_SUBSCRIPTION_ID=...
```
See [Authenticate Go apps to Azure services with service principals](https://learn.microsoft.com/azure/developer/go/sdk/authentication/local-development-service-principal).
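The selection rule above can be sketched as a small pure function. This is an illustration under assumptions: `credentialMode` and `chooseCredentialMode` are hypothetical names, and the sketch only decides which path to take. In real code the two branches would call `azidentity.NewClientSecretCredential` and `azidentity.NewDefaultAzureCredential` from the Azure SDK for Go.

```go
package main

import (
	"fmt"
	"os"
)

// credentialMode names the two paths described in the text (hypothetical type).
type credentialMode string

const (
	modeClientSecret credentialMode = "client-secret" // explicit tenant, client, secret
	modeDefaultChain credentialMode = "default-chain" // DefaultAzureCredential fallback
)

// chooseCredentialMode picks the explicit credential only when tenant ID,
// client ID, and client secret are all present; otherwise it falls back.
func chooseCredentialMode(tenantID, clientID, clientSecret string) credentialMode {
	if tenantID != "" && clientID != "" && clientSecret != "" {
		return modeClientSecret
	}
	return modeDefaultChain
}

func main() {
	// The secret is read from the environment only and is never printed.
	mode := chooseCredentialMode(
		os.Getenv("CRABBOX_AZURE_TENANT_ID"),
		os.Getenv("CRABBOX_AZURE_CLIENT_ID"),
		os.Getenv("AZURE_CLIENT_SECRET"),
	)
	fmt.Println("credential mode:", mode)
}
```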
## Lifecycle
1. Resolve credentials per the rules above.
2. Ensure the shared resource group, virtual network, subnet, and network
security group exist. Crabbox first issues `Get` calls against each
resource. If a resource exists without the `managed_by=crabbox` tag,
Crabbox refuses to mutate it and returns an adopt-or-rename error. If a
resource exists with the tag, it is left alone (Crabbox does not
overwrite tags, address spaces, subnets, or rules on subsequent
acquires). If a resource is missing, it is created with Crabbox tags
and the configured layout. Inbound SSH rules are derived from
`azure.sshCIDRs`, the configured SSH port, and any fallback ports.
3. Mint a per-lease SSH key.
4. Pick the configured class SKU candidates and try each in order.
5. For each lease: create a public IP, NIC, and VM with cloud-init in
`osProfile.customData` and the SSH key in
`osProfile.linuxConfiguration.ssh.publicKeys` for Linux. Native Windows
uses a Windows Server small-disk Gen2 image, Windows `osProfile` fields
(`adminPassword`, `computerName`, and `windowsConfiguration`), and a
Custom Script Extension that runs the Crabbox bootstrap saved in
`C:\AzureData\CustomData.bin`.
6. Query Azure Resource SKUs for the VM size. If Azure reports ephemeral OS
disk support, use a local ephemeral OS disk. Otherwise use a managed
`StandardSSD_LRS` OS disk.
7. Tag the VM, NIC, and public IP with Crabbox lease metadata.
8. Wait for the public IP to allocate, then for SSH and `crabbox-ready`.
9. Let core sync and run over SSH.
10. On release/cleanup, cascade-delete VM → NIC → public IP → OS disk. The
shared infra remains.
## Classes
Default Linux SKUs:
```text
standard Standard_D32ads_v6, Standard_D32ds_v6, Standard_F32s_v2, Standard_D32ads_v5, Standard_D32ds_v5, then D/F 16-vCPU fallbacks
fast Standard_D64ads_v6, Standard_D64ds_v6, Standard_F64s_v2, Standard_D64ads_v5, Standard_D64ds_v5, then D/F 48-vCPU and 32-vCPU fallbacks
large Standard_D96ads_v6, Standard_D96ds_v6, Standard_D96ads_v5, Standard_D96ds_v5, then D/F 64-vCPU and 48-vCPU fallbacks
beast Standard_D192ds_v6, Standard_D128ds_v6, then D/F 96-vCPU and 64-vCPU fallbacks
```
Default native Windows SKUs:
```text
standard Standard_D2ads_v6, Standard_D2ds_v6, Standard_D2ads_v5, Standard_D2ds_v5, then Standard_D2as_v6
fast Standard_D4ads_v6, Standard_D4ds_v6, Standard_D4ads_v5, Standard_D4ds_v5, then Standard_D4as_v6
large Standard_D8ads_v6, Standard_D8ds_v6, Standard_D8ads_v5, Standard_D8ds_v5, then Standard_D8as_v6
beast Standard_D16ads_v6, Standard_D16ds_v6, Standard_D16ads_v5, Standard_D16ds_v5, then Standard_D8ads_v6
```
Class-based provisioning falls back across the candidate list when Azure
rejects a SKU for capacity or quota
(`SkuNotAvailable`, `QuotaExceeded`, `AllocationFailed`,
`OverconstrainedAllocationRequest`). Spot leases fall back to on-demand when
`capacity.fallback` starts with `on-demand`. Explicit `--type` is exact.
The default Linux candidates mirror the AWS Linux class table's vCPU scale.
The default Windows candidates mirror the AWS native Windows class table's
vCPU scale. Azure native Windows support covers SSH, sync, and run; Windows
WSL2 and macOS remain AWS or static-SSH targets.
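The class fallback described above can be sketched as a loop over SKU candidates. This is a minimal illustration, assuming a hypothetical `provisionError` type that carries the Azure error code and a caller-supplied `create` function; neither is taken from the Crabbox source. An explicit `--type` corresponds to a one-element candidate list, so no fallback happens.

```go
package main

import (
	"errors"
	"fmt"
)

// provisionError is a hypothetical error type carrying an Azure error code.
type provisionError struct{ Code string }

func (e *provisionError) Error() string { return "azure: " + e.Code }

// retryableCodes lists the capacity and quota codes named in the text.
var retryableCodes = map[string]bool{
	"SkuNotAvailable":                  true,
	"QuotaExceeded":                    true,
	"AllocationFailed":                 true,
	"OverconstrainedAllocationRequest": true,
}

// isCapacityError reports whether err is a capacity or quota rejection.
func isCapacityError(err error) bool {
	var pe *provisionError
	return errors.As(err, &pe) && retryableCodes[pe.Code]
}

// provisionWithFallback tries each SKU in order. It moves to the next
// candidate only on capacity or quota errors and stops on anything else.
func provisionWithFallback(candidates []string, create func(sku string) error) (string, error) {
	lastErr := errors.New("no SKU candidates")
	for _, sku := range candidates {
		err := create(sku)
		if err == nil {
			return sku, nil
		}
		if !isCapacityError(err) {
			return "", err
		}
		lastErr = err
	}
	return "", lastErr
}

func main() {
	candidates := []string{"Standard_D32ads_v6", "Standard_D32ds_v6", "Standard_F32s_v2"}
	// Simulated backend: the first SKU has no capacity, the second succeeds.
	create := func(sku string) error {
		if sku == "Standard_D32ads_v6" {
			return &provisionError{Code: "SkuNotAvailable"}
		}
		return nil
	}
	sku, err := provisionWithFallback(candidates, create)
	fmt.Println(sku, err)
}
```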
## Capabilities
- SSH: yes.
- Crabbox sync: yes.
- Native Windows: yes for SSH, sync, and run.
- Desktop / browser / code: Linux only on Azure.
- Tailscale: Linux managed leases.
- Actions hydration: yes, Linux SSH leases.
- Coordinator: yes, brokered Linux/native Windows leases.
## Gotchas
- Azure VM names are constrained to 1-64 characters and cannot contain
underscores. The `leaseProviderName` helper replaces underscores with
dashes; if you customize naming, keep that constraint in mind.
- Windows computer names are limited to 15 characters. Crabbox keeps the VM
resource name stable and derives a shorter Windows `computerName`.
- The first acquire in an empty subscription pays the cost of creating the
shared resource group, vnet, and NSG. Subsequent acquires only create
per-lease resources.
- If you already have a resource group / vnet / NSG with the configured
names, Crabbox will refuse to mutate them unless they carry
`managed_by=crabbox` as a tag. Either tag them to adopt, choose
different names in `azure.*` config, or let Crabbox create dedicated
resources.
- `crabbox stop --provider azure <name>` will only act on VMs that carry
`crabbox=true` (and either no `provider` tag or `provider=azure`). A
manually-named VM in the resource group will not be deleted by Crabbox.
- The default SSH NSG rule allows `0.0.0.0/0` when `azure.sshCIDRs` is
empty. Set explicit CIDRs for any production-adjacent setup.
- Azure costs are not hardcoded in Crabbox. Set `CRABBOX_COST_RATES_JSON`
when you need exact Azure cost guardrails.
- Azure native Windows uses Custom Script Extension because Windows custom
data is saved to disk but not executed by Azure provisioning. Do not add
rebooting bootstrap work to that extension path.
- Azure does not provide managed Windows WSL2 or macOS through this provider.
Use AWS or `provider: ssh` for those targets.
- Direct-mode cleanup is best effort. Use `crabbox cleanup --provider azure`
to sweep expired direct leases.
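Two of the rules above, the underscore replacement in VM names and the tag check before `stop`, can be sketched in a few lines. The function names `vmResourceName` and `stopAllowed`, the plain `map[string]string` tag shape, and the sample lease ID are assumptions for illustration; the actual `leaseProviderName` helper may apply further rules.

```go
package main

import (
	"fmt"
	"strings"
)

// vmResourceName replaces underscores with dashes, the substitution the text
// attributes to leaseProviderName. Length limits are not handled here.
func vmResourceName(leaseID string) string {
	return strings.ReplaceAll(leaseID, "_", "-")
}

// stopAllowed reports whether a VM's tags mark it as a Crabbox-owned Azure
// lease: crabbox=true, and either no provider tag or provider=azure.
func stopAllowed(tags map[string]string) bool {
	if tags["crabbox"] != "true" {
		return false
	}
	provider, ok := tags["provider"]
	return !ok || provider == "azure"
}

func main() {
	fmt.Println(vmResourceName("crabbox_blue_lobster"))
	fmt.Println(stopAllowed(map[string]string{"crabbox": "true"}))
	fmt.Println(stopAllowed(map[string]string{"crabbox": "true", "provider": "aws"}))
	fmt.Println(stopAllowed(map[string]string{"owner": "someone"}))
}
```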
Related docs:
- [Feature: Azure](../features/azure.md)
- [Linux VNC](../features/vnc-linux.md)
- [Provider backends](../provider-backends.md)


@ -0,0 +1,115 @@
# Blacksmith Testbox Provider
Read when:
- choosing `provider: blacksmith-testbox`;
- wrapping an existing Blacksmith Testbox workflow with Crabbox;
- changing `internal/providers/blacksmith`.
Blacksmith Testbox is a delegated run provider. Crabbox does not provision,
bootstrap, rsync, or expose VNC for the remote machine. It shells out to the
authenticated Blacksmith CLI and keeps Crabbox ergonomics around IDs, slugs,
repo claims, timing, and normalized output.
## When To Use
Use Blacksmith when the repo already has a Testbox workflow and the remote
workspace should be owned by Blacksmith. Use AWS, Hetzner, Static SSH, or Daytona
when Crabbox must own SSH sync and interactive access.
## Commands
Reuse an existing Testbox:
```sh
crabbox run --provider blacksmith-testbox --id tbx_123 -- pnpm test
crabbox status --provider blacksmith-testbox --id tbx_123
crabbox stop --provider blacksmith-testbox tbx_123
```
Warm a fresh Testbox:
```sh
crabbox warmup \
  --provider blacksmith-testbox \
  --blacksmith-org openclaw \
  --blacksmith-workflow .github/workflows/ci-check-testbox.yml \
  --blacksmith-job test \
  --blacksmith-ref main
```
`blacksmith` is accepted as an alias, but docs and scripts should prefer
`blacksmith-testbox`.
## Config
```yaml
provider: blacksmith-testbox
blacksmith:
  org: openclaw
  workflow: .github/workflows/ci-check-testbox.yml
  job: test
  ref: main
  idleTimeout: 90m
```
Environment variables can provide the same defaults:
```text
CRABBOX_BLACKSMITH_ORG
CRABBOX_BLACKSMITH_WORKFLOW
CRABBOX_BLACKSMITH_JOB
CRABBOX_BLACKSMITH_REF
```
Blacksmith authentication stays in the Blacksmith CLI. Run
`blacksmith auth login` before using this provider.
## Lifecycle
Crabbox forwards:
```sh
blacksmith testbox warmup ...
blacksmith testbox run ...
blacksmith testbox list
blacksmith testbox list --all
blacksmith testbox stop ...
```
If list/status calls work but new warmups sit `queued` with no IP, the
Blacksmith service or organization is accepting requests but not assigning
capacity. Stop queued IDs you created and use AWS, Hetzner, Static SSH, or
Daytona until Blacksmith service, billing, or org limits are healthy again.
Crabbox stores a per-Testbox SSH key locally, claims the Testbox for the current
repo, maps IDs to friendly slugs, and prints a normal Crabbox timing summary.
When coordinator auth is configured, `crabbox list --provider blacksmith-testbox`
also syncs visibility-only Testbox rows into the portal lease table. If Crabbox
can infer the owning GitHub Actions run, the portal links the row to the run and
workflow, shows the Actions status/conclusion, flags long-queued or long-running
rows as `stuck`, exposes a copyable local stop command, and provides a
visibility-only detail page for the row.
## Capabilities
- SSH: no Crabbox SSH lease.
- Crabbox sync: no.
- Provider sync: yes, Blacksmith-owned.
- Desktop/browser/code: no Crabbox VNC/code surface.
- Actions hydration: Blacksmith-owned workflow setup, not Crabbox SSH hydration.
- Coordinator: no.
## Gotchas
- `--sync-only`, `--checksum`, and `--force-sync-large` do not apply because
Blacksmith owns sync.
- `list` and `status` are core-rendered from parsed Blacksmith CLI output.
- `blacksmith.workflow` is required only when Crabbox needs to create a Testbox.
Reusing an existing ID or slug does not need workflow config.
Related docs:
- [Feature: Blacksmith Testbox](../features/blacksmith-testbox.md)
- [Provider backends](../provider-backends.md)

115
docs/providers/daytona.md Normal file

@ -0,0 +1,115 @@
# Daytona Provider
Read when:
- choosing `provider: daytona`;
- configuring Daytona API auth, snapshots, or SSH access;
- changing `internal/providers/daytona`.
Daytona is a hybrid provider. `run` and `warmup` use Daytona SDK/toolbox APIs
for sandbox lifecycle, archive upload, extraction, and process execution.
Explicit `ssh` access mints a short-lived Daytona SSH token and then uses the
normal Crabbox SSH client.
## When To Use
Use Daytona when the sandbox image should come from a Daytona snapshot and
command execution should stay inside Daytona's toolbox APIs. Use AWS, Hetzner,
or Static SSH when you need a normal long-lived SSH lease for Actions hydration
or VNC/code workflows.
## Commands
```sh
crabbox warmup --provider daytona --daytona-snapshot crabbox-ready
crabbox run --provider daytona --daytona-snapshot crabbox-ready -- pnpm test
crabbox run --provider daytona --id blue-lobster -- pnpm test:changed
crabbox ssh --provider daytona --id blue-lobster
crabbox stop --provider daytona blue-lobster
```
## Auth
Use the Daytona CLI login:
```sh
daytona login --api-key ...
```
Crabbox reads the active Daytona CLI profile when no Daytona auth environment
variables are set.
You can also use explicit environment auth with an API key:
```sh
export DAYTONA_API_KEY=...
```
or JWT auth:
```sh
export DAYTONA_JWT_TOKEN=...
export DAYTONA_ORGANIZATION_ID=...
```
`DAYTONA_ORGANIZATION_ID` is required with JWT auth.
Explicit environment or Crabbox config values override the Daytona CLI profile.
## Config
```yaml
provider: daytona
target: linux
daytona:
  apiUrl: https://app.daytona.io/api
  snapshot: crabbox-ready
  target: ""
  user: daytona
  workRoot: /home/daytona/crabbox
  sshGatewayHost: ssh.app.daytona.io
  sshAccessMinutes: 30
```
Provider flags:
```text
--daytona-api-url
--daytona-snapshot
--daytona-target
--daytona-user
--daytona-work-root
--daytona-ssh-gateway-host
--daytona-ssh-access-minutes
```
## Lifecycle
1. Create or resolve a Daytona sandbox from `daytona.snapshot`.
2. Store Crabbox labels and local repo claims.
3. For `run`, build the Crabbox sync manifest, create a gzipped tar archive,
stream the archive to Daytona toolbox upload, extract it, and execute through
Daytona process APIs.
4. For `ssh`, request short-lived SSH access, parse Daytona's `sshCommand`, and
redact the token in normal output.
5. Delete the sandbox on release unless the lease is kept.
## Capabilities
- SSH: yes, explicit short-lived token access.
- Crabbox sync: yes, archive sync through Daytona toolbox.
- Desktop/browser/code: no current Crabbox VNC/code surface.
- Actions hydration: no.
- Coordinator: no.
## Gotchas
- `daytona.snapshot` is required when creating a sandbox.
- Snapshot contents own CPU, memory, disk, and installed tooling in this mode.
- Daytona `run` is delegated to toolbox APIs; it is not the same as core-over-SSH
execution.
- `--actions-runner` is rejected because it needs a normal SSH lease host.
Related docs:
- [Feature: Daytona](../features/daytona.md)
- [Provider backends](../provider-backends.md)

Some files were not shown because too many files have changed in this diff.