docs: add Crabbox image bake runbook
parent 3df14dff23
commit 0bb34bdcad
@@ -49,6 +49,8 @@ crabbox image create --id <cbx_id> --name openclaw-crabbox-YYYYMMDD-HHMM --wait
Use a fresh, intentionally warmed lease as the source. Do not bake personal
workspace state, local secrets, repository checkouts, or one-off debugging
artifacts into the image.
For desktop/browser or Mantis images, follow the full [Image bake runbook](../features/image-bake-runbook.md)
instead of relying only on the short smoke above.

Failure handling:

@@ -63,6 +65,8 @@ Failure handling:
  Keep the previous promoted AMI and debug bootstrap on a normal lease first.
- Cleanup of stale candidate AMIs is an AWS operator task. Promotion does not
  delete old images or snapshots.
- If a Mantis timing report does not improve after promotion, treat that as a
  failed performance bake even if the AMI boots.

## promote

@@ -94,6 +98,7 @@ on the new image.

Related docs:

- [Image bake runbook](../features/image-bake-runbook.md)
- [Prebaked runner images](../features/prebaked-images.md)
- [Infrastructure](../infrastructure.md)
- [Runner bootstrap](../features/runner-bootstrap.md)
@@ -19,6 +19,7 @@ Core features:
- [Tailscale](tailscale.md): optional tailnet reachability for managed Linux leases and static hosts.
- [Runner bootstrap](runner-bootstrap.md): cloud-init, installed tools, SSH port, and readiness.
- [Prebaked runner images](prebaked-images.md): provider-owned image storage and the image/cache/state boundary.
- [Image bake runbook](image-bake-runbook.md): exact AWS bake, candidate smoke, promotion, rollback, and cleanup flow.
- [Sync](sync.md): Git file-list manifests, rsync, fingerprints, excludes, guardrails, and sanity checks.
- [Actions hydration](actions-hydration.md): let GitHub Actions prepare a runner, then sync local work into that workspace.
- [Interactive desktop and VNC](interactive-desktop-vnc.md): VNC hub, support matrix, tunnel model, and QA boundaries.
docs/features/image-bake-runbook.md (new file, 212 lines added)
@@ -0,0 +1,212 @@
# Image Bake Runbook

Read when:

- baking a new Crabbox AWS image;
- promoting or rolling back the default AWS image;
- preparing a desktop/browser image for Mantis or other UI QA;
- checking whether state belongs in the image or in a warm lease.

This runbook is for trusted operators. Image commands need coordinator admin
auth and can create provider-side artifacts that cost money until cleaned up.

## Naming

Use names that identify owner, purpose, and UTC bake time:

```text
openclaw-crabbox-linux-desktop-browser-YYYYMMDD-HHMM
openclaw-mantis-linux-desktop-browser-YYYYMMDD-HHMM
```

Use a generic `openclaw-crabbox-*` image when the contents are useful to many
repositories. Use `openclaw-mantis-*` only when the image is specifically tuned
for OpenClaw Mantis QA.
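
The `YYYYMMDD-HHMM` suffix can be generated rather than typed. A minimal sketch, assuming a POSIX shell with `date`; `bake_name` is a hypothetical helper for illustration, not a Crabbox command:

```bash
# bake_name (hypothetical): print "<prefix>-YYYYMMDD-HHMM" using the current UTC time.
bake_name() {
  prefix="$1"   # for example: openclaw-crabbox-linux-desktop-browser
  printf '%s-%s\n' "$prefix" "$(date -u +%Y%m%d-%H%M)"
}

bake_name openclaw-crabbox-linux-desktop-browser
```

Using `date -u` keeps the timestamp in UTC regardless of the operator's local timezone.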

## What To Bake

Bake machine capabilities:

- current OS security updates;
- SSH, Git, rsync, curl, jq, and readiness helpers;
- Xvfb/Openbox/VNC for desktop leases;
- Chrome/Chromium for browser leases;
- `ffmpeg`, `ffprobe`, `scrot`, `xdotool`, and other capture helpers;
- Node 22, npm, corepack, pnpm;
- build-essential, Python, and common native-addon headers;
- empty cache directories such as `/var/cache/crabbox/pnpm`.

Do not bake scenario state:

- secrets, tokens, or provider credentials;
- browser profiles, cookies, Slack/Discord/WhatsApp sessions, or OAuth state;
- source checkouts, `node_modules`, `dist`, PR artifacts, screenshots, or
  videos;
- local operator notes or one-off debugging files.

## Create A Candidate AMI

Warm a source lease:

```bash
crabbox warmup \
  --provider aws \
  --class standard \
  --desktop \
  --browser \
  --ttl 2h \
  --idle-timeout 30m
```

Capture the lease id from the output. Use the canonical `cbx_...` id for image
commands, not only the friendly slug.
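
If the warmup output was saved to a file, the canonical id can be pulled out with `grep`. This is only a sketch: `first_lease_id` is a hypothetical helper, and the assumption that an id is `cbx_` followed by letters, digits, or underscores is not confirmed by the Crabbox docs.

```bash
# first_lease_id (hypothetical): print the first cbx_ token found on stdin.
# Assumption: ids match cbx_ followed by letters, digits, or underscores.
first_lease_id() {
  grep -oE 'cbx_[A-Za-z0-9_]+' | head -n 1
}

# Usage sketch, with warmup-output.txt standing in for saved command output:
# first_lease_id < warmup-output.txt
```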

Verify the source lease:

```bash
crabbox run \
  --provider aws \
  --id <cbx_id> \
  --no-sync \
  --shell -- \
  'set -euo pipefail
command -v ssh
command -v git
command -v rsync
command -v jq
command -v node
command -v pnpm
command -v ffmpeg
command -v scrot
command -v x11vnc
command -v google-chrome || command -v chromium || command -v chromium-browser
test -d /work/crabbox
sudo mkdir -p /var/cache/crabbox/pnpm
sudo chmod 1777 /var/cache/crabbox /var/cache/crabbox/pnpm'
```
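
Because the script above runs under `set -e`, it stops at the first missing tool. When debugging a bootstrap it can help to list every gap in one pass. A minimal sketch of that variant, where `missing_tools` is a hypothetical helper meant to run inside the lease shell:

```bash
# missing_tools (hypothetical): print each named command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Lists every missing tool instead of stopping at the first one.
missing_tools ssh git rsync jq node pnpm ffmpeg scrot x11vnc
```

Empty output means every listed tool was found.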

Create the candidate image:

```bash
crabbox image create \
  --id <cbx_id> \
  --name openclaw-crabbox-linux-desktop-browser-YYYYMMDD-HHMM \
  --wait \
  --json
```

Keep the JSON output. At minimum, record the AMI id, name, source lease id,
creation time, and operator.
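
One way to keep that record is a small append-only log next to the saved JSON. A minimal sketch, assuming a tab-separated file is acceptable; `record_bake` and the file name `bakes.tsv` are hypothetical, and the operator is taken from `$USER`:

```bash
# record_bake (hypothetical): append one tab-separated bake record to a log file.
# Columns: AMI id, image name, source lease id, UTC creation time, operator.
record_bake() {
  log_file="$1"; ami_id="$2"; image_name="$3"; lease_id="$4"
  printf '%s\t%s\t%s\t%s\t%s\n' \
    "$ami_id" "$image_name" "$lease_id" \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${USER:-unknown}" >> "$log_file"
}

# Usage sketch with placeholder values:
# record_bake bakes.tsv ami-1234567890abcdef0 openclaw-crabbox-linux-desktop-browser-YYYYMMDD-HHMM '<cbx_id>'
```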

## Smoke Candidate Before Promotion

Boot the candidate explicitly. Use the provider image override supported by the
current environment, for example:

```bash
CRABBOX_AWS_AMI=ami-1234567890abcdef0 \
crabbox warmup \
  --provider aws \
  --class standard \
  --desktop \
  --browser \
  --ttl 30m \
  --idle-timeout 10m
```

Run a smoke on the candidate:

```bash
crabbox run \
  --provider aws \
  --id <candidate-cbx_id-or-slug> \
  --no-sync \
  --shell -- \
  'set -euo pipefail
echo image-smoke-ok
uname -srm
command -v node
command -v pnpm
command -v ffmpeg
command -v scrot
command -v google-chrome || command -v chromium || command -v chromium-browser
test -d /work/crabbox'
```

For Mantis images, also run a real desktop/browser proof:

```bash
crabbox screenshot --provider aws --id <candidate-cbx_id-or-slug> --output /tmp/crabbox-image-smoke.png
```

Do not promote if SSH readiness, browser startup, screenshot capture, or the
package/tool checks fail.
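
The "do not promote on failure" rule can be enforced mechanically by chaining the checks in front of the promote command. A minimal sketch, where `run_gate` is a hypothetical helper and the check strings stand in for the smoke commands above:

```bash
# run_gate (hypothetical): run each check command string in order and stop at
# the first failure, so a promote step chained after it with && is never reached.
run_gate() {
  for check in "$@"; do
    if ! sh -c "$check" >/dev/null 2>&1; then
      printf 'gate failed: %s\n' "$check" >&2
      return 1
    fi
  done
  printf 'gate passed\n'
}

# Usage sketch; smoke_cmd and screenshot_cmd are hypothetical variables holding
# the candidate smoke and screenshot command lines:
# run_gate "$smoke_cmd" "$screenshot_cmd" && crabbox image promote <ami-id> --json
```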

## Promote

Promote only after a candidate smoke passes:

```bash
crabbox image promote ami-1234567890abcdef0 --json
```

Then verify a normal brokered lease without overrides uses the promoted image:

```bash
crabbox warmup \
  --provider aws \
  --class standard \
  --desktop \
  --browser \
  --ttl 30m \
  --idle-timeout 10m

crabbox run \
  --provider aws \
  --id <new-cbx_id-or-slug> \
  --no-sync \
  --shell -- \
  'echo promoted-image-smoke-ok && command -v ffmpeg && command -v node'
```

Keep the previous promoted AMI available until at least one normal brokered
lease and one relevant QA lane pass on the new image.

## Roll Back

Rollback is another promotion, this time of the previous known-good AMI:

```bash
crabbox image promote <previous-good-ami-id> --json
```

Run the normal brokered smoke again. Do not delete the failed AMI immediately;
keep it long enough to inspect tags, logs, and source-lease details.

## Cleanup

Promotion does not delete old AMIs or EBS snapshots. Cleanup is a provider
operator task:

- keep the current promoted AMI;
- keep the previous known-good AMI until the new one has real QA proof;
- deregister stale failed/candidate AMIs after investigation;
- delete their orphaned EBS snapshots in the AWS account.

Do not rely on Crabbox coordinator state as the source of truth for old image
storage costs. Check AWS directly.
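
The deregister and delete steps map onto the AWS CLI calls `aws ec2 deregister-image` and `aws ec2 delete-snapshot`. The sketch below only prints those commands for review; `print_cleanup_plan` is a hypothetical helper, and the ids are placeholders. It is easier to list an AMI's snapshot ids before deregistering it.

```bash
# print_cleanup_plan (hypothetical): print, without running, the AWS CLI commands
# that deregister one AMI and delete the snapshot ids given after it.
print_cleanup_plan() {
  ami_id="$1"; shift
  printf 'aws ec2 deregister-image --image-id %s\n' "$ami_id"
  for snapshot_id in "$@"; do
    printf 'aws ec2 delete-snapshot --snapshot-id %s\n' "$snapshot_id"
  done
}

# Snapshot ids backing an AMI can be listed first, against the real account:
# aws ec2 describe-images --image-ids ami-1234567890abcdef0 \
#   --query 'Images[].BlockDeviceMappings[].Ebs.SnapshotId' --output text

print_cleanup_plan ami-1234567890abcdef0 snap-0123456789abcdef0
```

Printing the plan first gives a second operator something to review before anything is removed.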

## Hetzner Status

Hetzner image bytes belong in the Hetzner project. Crabbox can boot a configured
image through `image` or `CRABBOX_HETZNER_IMAGE`, but Hetzner image
create/promote lifecycle commands are not implemented yet. Until then, create
and manage Hetzner snapshots with Hetzner tooling, then configure Crabbox to use
the selected image.

Related docs:

- [Prebaked runner images](prebaked-images.md)
- [image command](../commands/image.md)
- [Runner bootstrap](runner-bootstrap.md)
- [Interactive desktop and VNC](interactive-desktop-vnc.md)
@@ -61,8 +61,26 @@ This split keeps images reusable across repositories while still letting slow QA
systems skip repeated dependency work when they deliberately reuse a warm lease
or a keyed external cache.

## Operator Flow

Use the [Image bake runbook](image-bake-runbook.md) for the exact AWS bake,
candidate smoke, promotion, rollback, and cleanup commands. At a high level:

1. Warm a fresh `--desktop --browser` AWS lease.
2. Verify the machine capability contract on that lease.
3. Create an AMI with `crabbox image create --wait`.
4. Boot the AMI explicitly through an image override and smoke it.
5. Promote the AMI with `crabbox image promote`.
6. Run a normal brokered lease and the relevant QA lane.
7. Keep the previous known-good AMI until the new image has real QA proof.

For Mantis, image bake success is not just "Chrome exists." A useful image must
reduce `crabbox.warmup` or `crabbox.remote_run` time in the Mantis timing
report while keeping Slack/browser login state outside the image.
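
The timing criterion reduces to a before/after comparison. A minimal sketch, assuming the relevant durations have already been read out of two timing reports as whole seconds; `timing_improved` is a hypothetical helper and the numbers are made up:

```bash
# timing_improved (hypothetical): succeed only if the candidate duration is
# strictly lower than the baseline. Both arguments are whole seconds.
timing_improved() {
  baseline_s="$1"; candidate_s="$2"
  [ "$candidate_s" -lt "$baseline_s" ]
}

# Example values are invented, not taken from a real Mantis report.
if timing_improved 180 95; then
  echo "bake reduced the measured time"
else
  echo "treat as a failed performance bake"
fi
```

An equal duration counts as no improvement, which matches treating a non-improving bake as failed.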

Related docs:

- [Image bake runbook](image-bake-runbook.md)
- [image command](../commands/image.md)
- [Runner bootstrap](runner-bootstrap.md)
- [Interactive desktop and VNC](interactive-desktop-vnc.md)