diff --git a/docs/README.md b/docs/README.md index f92d303..43c4702 100644 --- a/docs/README.md +++ b/docs/README.md @@ -76,8 +76,9 @@ Run history and inspection are intentionally handled by the Crabbox CLI and repo Pick whichever matches your intent: -- **Get the mental model:** [How Crabbox Works](how-it-works.md), [Architecture](architecture.md), [Orchestrator](orchestrator.md). -- **Use the CLI:** [CLI](cli.md), [Commands](commands/README.md), [Features](features/README.md), [Actions hydration](features/actions-hydration.md), [Browser portal](features/portal.md), [Telemetry](features/telemetry.md). +- **Start here:** [Getting started](getting-started.md), [How Crabbox Works](how-it-works.md), [Concepts and glossary](concepts.md). +- **Get the mental model:** [Architecture](architecture.md), [Orchestrator](orchestrator.md). +- **Use the CLI:** [CLI](cli.md), [Commands](commands/README.md), [Features](features/README.md), [Configuration](features/configuration.md), [Actions hydration](features/actions-hydration.md), [Browser portal](features/portal.md), [Telemetry](features/telemetry.md). - **Pick or add a target:** [Provider reference](providers/README.md), [Providers feature overview](features/providers.md), [Provider authoring](features/provider-authoring.md), [Provider backends](provider-backends.md), [AWS](providers/aws.md), [Hetzner](providers/hetzner.md), [Static SSH](providers/ssh.md), [Blacksmith Testbox](providers/blacksmith-testbox.md), [Daytona](providers/daytona.md), [Islo](providers/islo.md), [Interactive desktop and VNC](features/interactive-desktop-vnc.md). - **Operate it:** [Operations](operations.md), [Observability](observability.md), [Troubleshooting](troubleshooting.md), [Performance](performance.md). - **Set it up or audit it:** [Infrastructure](infrastructure.md), [Security](security.md), [Source Map](source-map.md), [MVP Plan](mvp-plan.md). 
diff --git a/docs/commands/attach.md b/docs/commands/attach.md index 5b60bf1..e823c32 100644 --- a/docs/commands/attach.md +++ b/docs/commands/attach.md @@ -3,27 +3,63 @@ `crabbox attach` follows recorded events for an active coordinator run. ```sh -crabbox attach run_... -crabbox attach --id run_... --after 42 +crabbox attach run_abcdef123456 +crabbox attach --id run_abcdef123456 --after 42 +crabbox attach run_abcdef123456 --poll 500ms ``` -Stdout and stderr preview events are written back to stdout and stderr. -Lifecycle events are printed to stderr with their sequence number, phase, -timestamp, and message. When the run has already finished, `attach` prints any -remaining events and exits. +## Behavior -Flags: +`attach` polls the coordinator for new run events on a fixed interval, +prints them as they arrive, and exits when the run finishes. + +- stdout and stderr preview events are written back to stdout and stderr, + preserving the stream split; +- lifecycle events (lease, bootstrap, sync, command-start, finish, release) + are printed to stderr with their sequence number, phase, timestamp, and + message; +- when the run has already finished, attach prints any remaining events + and exits; +- when the run is still active, attach polls until it sees a `finish` + event. + +`attach` is not detached command execution. It follows the events the +original CLI is emitting; if that CLI process dies, the run state remains +inspectable through [history](history.md), [events](events.md), and +[logs](logs.md), but `attach` cannot resurrect it. + +## Bounded Output + +Output events are a bounded preview. The coordinator caps stdout/stderr +capture at 64 KiB per run and records an `output.truncated` marker when the +cap is reached. Use [logs](logs.md) for the larger retained command output +after completion. 
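
The polling behavior and the bounded preview described above can be modeled in a few lines. This is a rough Python sketch, not Crabbox's implementation: `fetch_events`, `follow`, `cap_preview`, and the event dictionary shape (`seq`, `type`) are hypothetical names chosen for illustration. Only the resume-after-sequence semantics, the `finish` event, and the 64 KiB cap come from the text.

```python
import time
from typing import Callable, Iterable

PREVIEW_CAP = 64 * 1024  # 64 KiB per run, as stated in the text


def follow(fetch_events: Callable[[int], list[dict]], after: int = 0,
           poll_seconds: float = 1.0, sleep=time.sleep) -> list[dict]:
    """Poll for events newer than `after` until a finish event is seen.

    `fetch_events` is a hypothetical stand-in for the coordinator call.
    """
    seen = []
    while True:
        for event in fetch_events(after):
            seen.append(event)
            after = max(after, event["seq"])  # remember where to resume
            if event["type"] == "finish":
                return seen
        sleep(poll_seconds)


def cap_preview(chunks: Iterable[bytes], cap: int = PREVIEW_CAP):
    """Keep at most `cap` bytes of output and report whether it was cut."""
    kept = bytearray()
    for chunk in chunks:
        room = cap - len(kept)
        if len(chunk) > room:
            kept += chunk[:room]
            return bytes(kept), True  # the caller would record a truncation marker
        kept += chunk
    return bytes(kept), False
```

Tracking the highest sequence seen is what makes reconnecting cheap: a second `follow` call can start from that number instead of replaying the whole log.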
+ +## Flags ```text ---id run id ---after resume after this event sequence +--id run id (also accepted as a positional argument) +--after resume after this event sequence number --poll polling interval, default 1s ``` -`attach` follows events emitted by the original CLI. It is not detached command -execution. If the original CLI process dies, the last recorded phase remains -inspectable through [history](history.md), [events](events.md), and -[logs](logs.md). +## Use Cases -Output events are a bounded preview. Use [logs](logs.md) for the retained -command output after completion. +- watch a long warmup or run from a second terminal without disturbing the + original CLI; +- monitor an agent-launched run while doing something else locally; +- replay events from a known sequence (`--after`) when reconnecting after + a network blip. + +## Direct Mode + +Direct-provider mode does not record runs centrally, so `attach` has no +event stream to follow. Use shell output from the original CLI instead. + +Related docs: + +- [logs](logs.md) +- [events](events.md) +- [history](history.md) +- [run](run.md) +- [History and logs](../features/history-logs.md) diff --git a/docs/commands/cache.md b/docs/commands/cache.md index 27fb49a..cf7bc98 100644 --- a/docs/commands/cache.md +++ b/docs/commands/cache.md @@ -9,23 +9,107 @@ crabbox cache warm --id blue-lobster -- pnpm install --frozen-lockfile crabbox cache purge --id blue-lobster --kind pnpm --force ``` -`--id` accepts the stable `cbx_...` ID or an active friendly slug. Cache commands that SSH to the box touch the lease and validate the local repo claim; add `--reclaim` to move an existing claim. - -Cache kinds: +## Subcommands ```text -pnpm -npm -docker -git -all +cache stats show usage for each cache kind on the lease +cache warm run a command in the synced workdir to populate caches +cache purge delete one or all cache kinds (requires --force) ``` -`cache warm` runs a command in the synced repo workdir for that lease. 
On boxes prepared by `crabbox actions hydrate`, it uses the hydrated `$GITHUB_WORKSPACE` and sources the workflow env handoff like `crabbox run`. +`--id` accepts the canonical `cbx_...` lease ID or an active friendly +slug. Cache commands SSH to the box, touch the lease, and validate the +local repo claim. Add `--reclaim` to move an existing claim from another +repo. -Repo `cache.pnpm`, `cache.npm`, `cache.docker`, and `cache.git` toggles control which kinds `stats` reports and which kinds `purge --kind all` removes. +## Cache Kinds + +```text +pnpm /var/cache/crabbox/pnpm +npm /var/cache/crabbox/npm +docker Docker layer/image cache (host-managed) +git /var/cache/crabbox/git (shared origin objects) +all every kind enabled in repo config +``` + +Repo `cache.pnpm`, `cache.npm`, `cache.docker`, and `cache.git` toggles +control which kinds `stats` reports and which kinds `purge --kind all` +removes. Disabled kinds are omitted from stats, are not purged by +`--kind all`, and asking to purge a disabled specific kind fails early. + +## stats + +```sh +crabbox cache stats --id blue-lobster +``` + +Prints sizes for each enabled cache kind: + +```text +pnpm 8.4GiB +npm 1.2GiB +docker 18.7GiB +git 430MiB +``` + +`--json` returns the same data as a structured object. + +## warm + +```sh +crabbox cache warm --id blue-lobster -- pnpm install --frozen-lockfile +crabbox cache warm --id blue-lobster -- docker compose pull +``` + +Runs a command in the synced repo workdir for that lease. On boxes +prepared by `crabbox actions hydrate`, it uses the hydrated +`$GITHUB_WORKSPACE` and sources the workflow env handoff, just like +`crabbox run` does. + +Use warm for one-off cache priming when you do not want to record a full +run history entry. + +## purge + +```sh +crabbox cache purge --id blue-lobster --kind pnpm --force +crabbox cache purge --id blue-lobster --kind all --force +``` + +Removes the named cache kind from the lease. `--force` is required to +prevent accidental purges. 
If `cache.maxGB` is set, purge is rarely +needed - the runner trims the oldest entries automatically when caches +exceed the cap. + +## Flags + +```text +--id target lease (required) +--kind pnpm|npm|docker|git|all for purge +--force required for purge +--reclaim move local claim from another repo +--json stats as JSON +``` + +## When To Use Cache + +Caches are speed hints, not source of truth. The synced worktree remains +authoritative. + +- Use `cache stats` to confirm a long-lived warm box is gaining benefit + from cached packages. +- Use `cache warm` to prime a fresh lease before handing it to agents that + run many short commands. +- Use `cache purge` when a corrupt cache is poisoning a build (rare; + usually the underlying tool's own cache reset works first). + +Disposable leases lose cache state when the VM is deleted; kept leases +can reuse cache state across repeated agent runs. For shared baked +images, see [Prebaked runner images](../features/prebaked-images.md). Related docs: -- [Performance](../performance.md) - [Cache controls](../features/cache.md) +- [Performance](../performance.md) +- [run](run.md) +- [actions](actions.md) diff --git a/docs/commands/cleanup.md b/docs/commands/cleanup.md index 4badc4e..6951b86 100644 --- a/docs/commands/cleanup.md +++ b/docs/commands/cleanup.md @@ -1,29 +1,77 @@ # cleanup -`crabbox cleanup` sweeps direct-provider leftovers. +`crabbox cleanup` sweeps direct-provider leftovers based on Crabbox labels. ```sh crabbox cleanup --dry-run crabbox cleanup ``` -Cleanup refuses to run when a coordinator is configured. Brokered cleanup belongs to the Durable Object alarm. +`crabbox machine cleanup` is preserved as a compatibility alias. -Direct cleanup skips kept machines, deletes expired ready/leased/active machines, and gives running/provisioning machines an extra stale safety window. It relies on provider labels such as `lease`, `slug`, `expires_at`, and `state`. 
+## Behavior -Static SSH targets are existing hosts, so `provider=ssh` has nothing to sweep. +Cleanup refuses to run when a coordinator is configured. Brokered cleanup +belongs to the Durable Object alarm; sweeping provider resources behind the +coordinator can race live brokered leases. -Flags: +In direct-provider mode, cleanup is intentionally conservative: + +- skip machines tagged `keep=true`; +- skip machines in `running` or `provisioning` state until the extra stale + safety window passes (expiry plus 12 hours); +- delete machines that are clearly expired in `ready`, `leased`, or + `active` states; +- delete machines that have been inactive past expiry. + +Selection is label-driven. Cleanup uses `lease`, `slug`, `expires_at`, +`last_touched_at`, `state`, and `keep` labels written when the machine was +created. Resources without Crabbox labels are never touched. + +Static SSH targets are existing operator-owned hosts, so `provider=ssh` +has nothing to sweep. Cleanup exits early for that provider. + +## Output + +`--dry-run` lists every decision without taking action: ```text ---provider hetzner|aws ---target linux|macos|windows ---windows-mode normal|wsl2 ---static-host ---static-user ---static-port ---static-work-root ---dry-run +hetzner cx53 hz-12345 lease=cbx_abcdef123456 slug=blue-lobster keep=true skip=keep +hetzner cx53 hz-67890 lease=cbx_abcdef234567 slug=amber-crab expires_at=2026-05-01T17:30:00Z delete ``` -`crabbox machine cleanup` remains as a compatibility alias. +Without `--dry-run`, the same lines print but each `delete` is followed by +`deleted` after the provider call returns. Failures print the provider +error and continue with the next candidate. 
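
The selection rules in the Behavior section can be read as a small decision function over a machine's labels. The sketch below is a hedged Python illustration, not the actual cleanup code: `decide` is a hypothetical name, and every reason string other than `skip=keep` (which appears in the sample output) is invented for readability. It also folds the "inactive past expiry" rule into the plain expiry comparison, which is an assumption.

```python
from datetime import datetime, timedelta, timezone

STALE_WINDOW = timedelta(hours=12)  # "expiry plus 12 hours" from the Behavior section


def decide(labels: dict, now: datetime) -> str:
    """Return 'delete' or 'skip=<reason>' for one machine's label set."""
    # Resources without Crabbox labels are never touched.
    if "lease" not in labels or "expires_at" not in labels:
        return "skip=unlabeled"
    if labels.get("keep") == "true":
        return "skip=keep"
    expires_at = datetime.fromisoformat(labels["expires_at"].replace("Z", "+00:00"))
    state = labels.get("state", "")
    if state in ("running", "provisioning"):
        # Possibly still in use: wait for the extra safety window.
        return "delete" if now > expires_at + STALE_WINDOW else "skip=safety-window"
    if state in ("ready", "leased", "active"):
        return "delete" if now > expires_at else "skip=not-expired"
    return "skip=unknown-state"  # assumption: unknown states are left alone


if __name__ == "__main__":
    now = datetime(2026, 5, 1, 18, 0, tzinfo=timezone.utc)
    labels = {"lease": "cbx_abcdef234567", "state": "ready",
              "expires_at": "2026-05-01T17:30:00Z"}
    print(decide(labels, now))
```

Checking `keep` before anything else mirrors the conservative ordering of the list: a kept machine is never a deletion candidate, whatever its state or expiry.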
+ +## Flags + +```text +--provider hetzner|aws provider to sweep (delegated providers do not need cleanup) +--target linux|macos|windows for AWS, restrict by target +--windows-mode normal|wsl2 when target=windows +--static-host ignored (provider=ssh has nothing to sweep) +--static-user ignored +--static-port ignored +--static-work-root ignored +--dry-run log decisions without making provider calls +``` + +## When To Run + +- after a CLI process crashed mid-warmup and left a server behind; +- when migrating from direct mode to brokered mode (sweep first, then + switch); +- as a safety net after rotating provider credentials; +- never as part of a brokered workflow - the coordinator owns that path. + +For brokered fleets, audit `crabbox admin leases --state active` and use +`crabbox admin release` instead. + +Related docs: + +- [stop](stop.md) +- [admin](admin.md) +- [Lifecycle cleanup](../features/lifecycle-cleanup.md) +- [Orchestrator](../orchestrator.md) +- [Operations](../operations.md) diff --git a/docs/commands/doctor.md b/docs/commands/doctor.md index 9d7b3ba..c80dd85 100644 --- a/docs/commands/doctor.md +++ b/docs/commands/doctor.md @@ -1,29 +1,101 @@ # doctor -`crabbox doctor` checks local prerequisites and broker/provider access. +`crabbox doctor` runs the local preflight before you commit to a long +workflow. It is fast (under a second on a healthy machine), local-only, and +never calls a billable provider API. ```sh crabbox doctor crabbox doctor --provider aws +crabbox doctor --provider hetzner --target linux crabbox doctor --provider ssh --target windows --windows-mode normal --static-host win-dev.local ``` -It checks local tools, user config permissions, per-lease key generation support, -coordinator health when configured, and direct-provider API access otherwise. If -`CRABBOX_SSH_KEY` is explicitly set, it also validates that private key and -matching `.pub` file. 
- -For `provider=ssh`, doctor checks that the static SSH host is reachable and has -the tools required by the selected target mode. - -Flags: +## What It Checks ```text ---provider hetzner|aws|ssh ---target linux|macos|windows ---windows-mode normal|wsl2 ---static-host ---static-user ---static-port ---static-work-root +config config files load and parse, required keys are present +auth broker token is set, signed token is valid, identity resolves +network coordinator URL reachable, DNS works, SSH transport probes work +ssh SSH key path readable, key permissions sane, ssh-keygen on PATH +tools rsync, git, ssh, ssh-keygen present and executable ``` + +For `--provider ssh`, doctor also probes the static host: SSH reachability +on the configured port, target-required tools (`bash`, `git`, `rsync`, +`tar` for POSIX targets; OpenSSH, PowerShell, and `tar` for native +Windows), and `static.workRoot` writability. + +When `CRABBOX_SSH_KEY` is explicitly set, doctor validates the private key +and the matching `.pub` file. When unset, it skips that check because +per-lease keys do not need a global key. + +For the full list of checks, including how each one decides between +`fail`, `skip`, and `ok`, see +[Doctor checks](../features/doctor.md). + +## Output + +```text +config: + ok user config: ~/.config/crabbox/config.yaml + ok repo config: ./.crabbox.yaml + ok provider: aws + ok target: linux +auth: + ok broker: https://crabbox.openclaw.ai + ok owner: alex@example.com +network: + ok coordinator dns + ok coordinator https +ssh: + ok ssh-keygen present + skip ssh.key unset (per-lease keys will be used) +tools: + ok git + ok rsync + ok ssh + ok ssh-keygen +``` + +Failures swap the leading `ok` for `fail` and add a remediation hint: + +```text +auth: + fail broker token is missing - run `crabbox login` +``` + +Exit code is `0` on full success, `2` on any failure. Skips never change +the exit code. 
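
The exit-code rule above is simple enough to state as code. This is a minimal Python sketch under stated assumptions: the `results` mapping (group name to a list of `(status, message)` pairs) and the function name `exit_code` are hypothetical, and only the three statuses and the `0`/`2` codes come from the text.

```python
def exit_code(results: dict[str, list[tuple[str, str]]]) -> int:
    """Return 2 if any check failed, otherwise 0. Skips are ignored."""
    for checks in results.values():
        for status, _message in checks:
            if status == "fail":
                return 2
    return 0


if __name__ == "__main__":
    report = {
        "auth": [("ok", "broker"), ("ok", "owner")],
        "ssh": [("ok", "ssh-keygen present"), ("skip", "ssh.key unset")],
    }
    print(exit_code(report))
```

A wrapper script can therefore treat any non-zero exit as "stop here" without parsing the human output.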
+ +## Flags + +```text +--provider hetzner|aws|ssh provider to validate +--target linux|macos|windows target OS for ssh provider checks +--windows-mode normal|wsl2 when target=windows +--static-host static SSH host +--static-user static SSH user override +--static-port static SSH port override +--static-work-root static target work root +``` + +## When To Run + +- before the first `crabbox run` on a new machine; +- after rotating the broker token; +- after editing `~/.crabbox.yaml` or repo config; +- in agent boot sequences as a sanity check; +- when triaging "Crabbox is broken" reports - doctor often catches the + problem before the user has to describe it. + +Doctor is safe to run from `pre-commit`, scheduled jobs, and CI smoke +because it never provisions, never costs money, and never modifies state. + +Related docs: + +- [Doctor checks](../features/doctor.md) +- [Configuration](../features/configuration.md) +- [Auth and admin](../features/auth-admin.md) +- [Network and reachability](../features/network.md) +- [Troubleshooting](../troubleshooting.md) diff --git a/docs/commands/events.md b/docs/commands/events.md index 6c46207..569108c 100644 --- a/docs/commands/events.md +++ b/docs/commands/events.md @@ -3,34 +3,79 @@ `crabbox events` prints the coordinator event log for a recorded run. ```sh -crabbox events run_... -crabbox events --id run_... --after 42 --limit 100 -crabbox events run_... --json +crabbox events run_abcdef123456 +crabbox events --id run_abcdef123456 --after 42 --limit 100 +crabbox events run_abcdef123456 --json ``` -Coordinator-backed `crabbox run` creates a durable `run_...` handle before it -leases or syncs. The CLI appends lifecycle events as the run advances through -leasing, bootstrap, sync, command execution, output streaming, finish, and -release. +## What Events Are Recorded -Human output includes sequence number, event type, phase, stream, timestamp, and -short message or output text. JSON output returns the raw event records. 
-Output events are a bounded preview: stdout/stderr capture stops after 64 KiB -per run and records an `output.truncated` marker. Use `crabbox logs` for the -larger retained command output. +Coordinator-backed `crabbox run` creates a durable `run_...` handle before +it leases or syncs. The CLI appends ordered events as the run advances: -Flags: +- `lease.acquire.start`, `lease.acquire.success`, `lease.acquire.fail`; +- `bootstrap.wait`, `bootstrap.ready`; +- `sync.start`, `sync.skip`, `sync.success`, `sync.fail`; +- `command.start`, `command.finish`; +- `output.stdout`, `output.stderr`, `output.truncated`; +- `release.start`, `release.success`, `release.fail`. + +Each event carries a sequence number, event type, phase, optional stream +(stdout/stderr), timestamp, and short message or output text. + +## Output + +Human output prints sequence number, event type, phase, stream, timestamp, +and message: ```text ---id run id ---after only show events after this sequence ---limit default 500, maximum 500 ---json print JSON + 1 lease.acquire.start plan 2026-05-07T07:42:18Z + 2 lease.acquire.success plan 2026-05-07T07:42:21Z leased=cbx_abcdef123456 slug=blue-lobster + 3 bootstrap.wait provision 2026-05-07T07:42:21Z + 4 bootstrap.ready provision 2026-05-07T07:43:05Z + 5 sync.start sync 2026-05-07T07:43:05Z + 6 sync.success sync 2026-05-07T07:43:08Z files=184 bytes=12.4MiB + 7 command.start run 2026-05-07T07:43:08Z pnpm test + 8 output.stdout run 2026-05-07T07:43:09Z > vitest run + 9 output.stdout run 2026-05-07T07:43:11Z ✓ src/foo.test.ts (8) + ... + 42 command.finish run 2026-05-07T07:45:32Z exit=0 + 43 release.success release 2026-05-07T07:45:34Z ``` -Related: +`--json` returns the raw event records. + +## Bounded Output Capture + +Output events are a bounded preview. The coordinator caps stdout/stderr +capture at 64 KiB per run and records an `output.truncated` marker when +the cap is reached. The retained log keeps up to 8 MiB. 
For the larger +retained command output, use [logs](logs.md). + +## Flags + +```text +--id run id (also accepted as a positional argument) +--after only show events after this sequence number +--limit maximum number of events, default 500, maximum 500 +--json print JSON +``` + +`--after` is what `attach` uses internally - resume from a known sequence +without replaying the whole event log. + +## Use Cases + +- post-mortem on a failed run when you need the exact sequence of phases; +- correlating a failed step with the timestamps of surrounding sync or + bootstrap events; +- scripting a status check that filters by event type; +- archiving event records for runs that exceeded the retained log cap. + +Related docs: - [history](history.md) -- [attach](attach.md) - [logs](logs.md) +- [attach](attach.md) +- [results](results.md) - [History and logs](../features/history-logs.md) diff --git a/docs/commands/init.md b/docs/commands/init.md index 302285a..9f98827 100644 --- a/docs/commands/init.md +++ b/docs/commands/init.md @@ -1,27 +1,106 @@ # init `crabbox init` onboards a repository for agent-first remote verification. +It writes the minimum config needed for `crabbox run` and sets up the +optional Actions hydration bridge and agent skill. ```sh crabbox init crabbox init --force +crabbox init --workflow .github/workflows/crabbox-test.yml ``` -It writes: +## Files It Writes -- `.crabbox.yaml` -- `.github/workflows/crabbox.yml` -- `.agents/skills/crabbox/SKILL.md` +```text +.crabbox.yaml repo defaults (provider, profile, class, sync, env) +.github/workflows/crabbox.yml Actions hydration stub (optional) +.agents/skills/crabbox/SKILL.md agent-facing skill instructions +``` -The generated workflow is intentionally conservative. It is a starting point for repo-specific hydration, not a full replacement for CI. Edit it to install dependencies, start service containers, and warm caches before agents begin repeated `crabbox run` calls. 
+By default `init` will not overwrite existing files. `--force` overrides +that and replaces them with freshly generated content. -The workflow contract is the same one used by `crabbox actions hydrate`: it accepts the Crabbox lease ID and dynamic runner label, runs on that self-hosted runner, writes a ready marker under `$HOME/.crabbox/actions`, and keeps the job alive for the remote command loop. +## `.crabbox.yaml` -Flags: +A starting template that includes: + +- a default `profile` and `class`; +- `sync.exclude` covering common heavy directories; +- `env.allow` with conservative defaults (`CI`, `NODE_OPTIONS`, + `PROJECT_*`); +- `actions.workflow` pointing at the generated workflow stub; +- `cache` toggles for pnpm, npm, docker, and git. + +Open the file after `init` and adjust it to match the repo: + +- pick the right `class` for the workload; +- add repo-specific `sync.exclude` patterns; +- expand `env.allow` for project-specific tunables; +- pin `sync.baseRef` to the project's default branch. + +See [Configuration](../features/configuration.md) for the full schema. + +## `.github/workflows/crabbox.yml` + +The generated workflow is intentionally conservative. It is a starting +point for repo-specific hydration, not a full replacement for CI. Edit it +to install dependencies, start service containers, and warm caches before +agents begin repeated `crabbox run` calls. + +The workflow contract is the one used by `crabbox actions hydrate`: + +- accepts the Crabbox lease ID and dynamic runner label; +- runs on that self-hosted runner registered by Crabbox; +- writes a ready marker under `$HOME/.crabbox/actions`; +- keeps the job alive so the local CLI can run repeated commands in the + hydrated workspace. + +If the repo has no Actions hydration plans, you can delete the workflow. +`crabbox run` works fine without it - hydration is optional. + +## `.agents/skills/crabbox/SKILL.md` + +Repo-local agent instructions. 
The generated skill explains: + +- when to use Crabbox vs running locally; +- how to acquire and reuse leases; +- which commands the agent should prefer (`warmup`, `run --id`, `stop`); +- what env vars the project allows; +- where to find repo-specific test commands. + +Edit this file to match how you want agents to operate in the repo. The +skill is read by OpenClaw and similar agent runtimes that auto-discover +`.agents/skills/`. + +## Flags ```text --force overwrite generated files ---config repo config path ---workflow workflow path ---skill agent skill path +--config repo config path (default ./.crabbox.yaml) +--workflow Actions workflow path (default .github/workflows/crabbox.yml) +--skill agent skill path (default .agents/skills/crabbox/SKILL.md) ``` + +## Idempotency + +`init` is safe to re-run. Without `--force`, it leaves existing files +alone and exits with a summary of what would be created. With `--force`, +it replaces files atomically. + +## After Init + +```sh +crabbox doctor # validate the config +crabbox sync-plan # preview what would sync +crabbox warmup # acquire a lease +crabbox run -- pnpm test # run a command +``` + +Related docs: + +- [Configuration](../features/configuration.md) +- [Repository onboarding](../features/repository-onboarding.md) +- [Actions hydration](../features/actions-hydration.md) +- [Sync](../features/sync.md) +- [Getting started](../getting-started.md) diff --git a/docs/commands/inspect.md b/docs/commands/inspect.md index bdab9b9..c0bd4d4 100644 --- a/docs/commands/inspect.md +++ b/docs/commands/inspect.md @@ -1,6 +1,8 @@ # inspect -`crabbox inspect` prints detailed lease and provider metadata. +`crabbox inspect` prints detailed lease and provider metadata. Use it for +debugging coordinator state, provider labels, expiry, SSH target details, +and Tailscale metadata. 
```sh crabbox inspect --id blue-lobster @@ -9,23 +11,60 @@ crabbox inspect --id blue-lobster --json crabbox inspect --provider ssh --target windows --windows-mode wsl2 --static-host win-dev.local ``` -Use this for debugging coordinator state, provider labels, expiry, and SSH target details. +## Output -Flags: +Human output prints lease state, provider, server type, public IP, work +root, owner, org, idle timeout, TTL, expiry, last touched, the resolved +SSH command for the selected network mode, and any Tailscale metadata the +lease carries. ```text ---id ---provider hetzner|aws|ssh ---target linux|macos|windows ---windows-mode normal|wsl2 ---static-host ---static-user ---static-port ---static-work-root ---network auto|tailscale|public ---json +lease=cbx_abcdef123456 slug=blue-lobster +state=active provider=aws server=i-0abcdef0123456789 type=c7a.48xlarge +host=203.0.113.10 user=crabbox port=2222 work_root=/work/crabbox +owner=alex@example.com org=openclaw +idle_timeout=30m0s ttl=90m0s +created_at=2026-05-07T07:42:18Z last_touched=2026-05-07T07:55:12Z expires_at=2026-05-07T08:25:12Z +ssh: ssh -i ~/.config/crabbox/testboxes/cbx_abcdef123456/id_ed25519 -p 2222 crabbox@203.0.113.10 +tailscale: state=ok ipv4=100.64.0.5 fqdn=blue-lobster.tail-scale.ts.net tags=tag:crabbox ``` -JSON output includes non-secret Tailscale metadata when present. Human output -prints both the provider host and the resolved SSH command for the selected -network. +JSON output returns the structured record, including non-secret Tailscale +metadata. Secrets (broker tokens, provider keys, VNC passwords) are never +included. 
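
For a quick sanity check of the timing fields in the human output, the `key=value` tokens can be parsed and compared. The Python sketch below is illustrative only: the helper names are hypothetical, and the relation it checks (expiry equals the earlier of `last_touched + idle_timeout` and `created_at + ttl`) is an assumption that happens to match the sample above, not a rule the source states. Scripts should prefer `--json`.

```python
import re
from datetime import datetime, timedelta

# A subset of the sample output shown above.
SAMPLE = """\
lease=cbx_abcdef123456 slug=blue-lobster
idle_timeout=30m0s ttl=90m0s
created_at=2026-05-07T07:42:18Z last_touched=2026-05-07T07:55:12Z expires_at=2026-05-07T08:25:12Z
"""


def parse_fields(text: str) -> dict[str, str]:
    """Collect whitespace-separated key=value tokens into a dict."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields


def parse_duration(value: str) -> timedelta:
    """Parse durations written like 1h30m0s, 30m0s, or 45s."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", value)
    if not value or not match:
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(group or 0) for group in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def parse_time(value: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp with a trailing Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


fields = parse_fields(SAMPLE)
idle_deadline = parse_time(fields["last_touched"]) + parse_duration(fields["idle_timeout"])
ttl_deadline = parse_time(fields["created_at"]) + parse_duration(fields["ttl"])
# Assumed rule: the lease expires at whichever deadline comes first.
print(min(idle_deadline, ttl_deadline) == parse_time(fields["expires_at"]))
```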
+ +## Flags + +```text +--id lease to inspect; required for managed providers +--provider hetzner|aws|ssh override the configured provider +--target linux|macos|windows +--windows-mode normal|wsl2 +--static-host static SSH host for provider=ssh +--static-user static SSH user override +--static-port static SSH port override +--static-work-root static target work root +--network auto|tailscale|public select which address inspect prints +--json print JSON +``` + +## Inspect vs Status vs List + +- `inspect` is the long-form record for one lease, including provider + metadata, label state, and the resolved SSH command; +- `status` is the shorter "is this lease healthy right now" check, with + optional `--wait` and bounded telemetry; +- `list` is the table view across many leases, scoped by owner/org or + fleet-wide for admins. + +Use `inspect` when something is unexpected and you want all the detail in +one place. Use `status` when an automation needs a quick liveness check. +Use `list` when you are looking for a specific lease across the pool. + +Related docs: + +- [status](status.md) +- [list](list.md) +- [ssh](ssh.md) +- [Identifiers](../features/identifiers.md) +- [Network and reachability](../features/network.md) diff --git a/docs/commands/logout.md b/docs/commands/logout.md index 6c15a81..a399285 100644 --- a/docs/commands/logout.md +++ b/docs/commands/logout.md @@ -7,9 +7,31 @@ crabbox logout crabbox logout --json ``` -The broker URL and provider are left in place so a later `crabbox login` or `crabbox login --token-stdin` can reuse them. +The broker URL and provider stay in place so a later `crabbox login` or +`crabbox login --token-stdin` can reuse them. Per-lease SSH keys, repo +claims, and history records are unaffected. 
+ +After logout: + +- `crabbox whoami` exits with auth code 3 (`auth failure`); +- `crabbox run` and `crabbox warmup` against the coordinator fail with the + same code; +- direct-provider mode keeps working when local provider credentials + (AWS SDK, `HCLOUD_TOKEN`) are present, because direct mode does not need + the broker token. + +Use logout when: + +- a token has leaked or you want to rotate it; +- you are switching the operator identity on a shared workstation; +- you are testing the unauthenticated path. + +To clear everything (URL, provider, token, profile defaults), edit the user +config file directly. `crabbox config path` prints the location. Related docs: - [login](login.md) - [whoami](whoami.md) +- [Auth and admin](../features/auth-admin.md) +- [Configuration](../features/configuration.md) diff --git a/docs/commands/logs.md b/docs/commands/logs.md index f5b1bfe..f84c1a1 100644 --- a/docs/commands/logs.md +++ b/docs/commands/logs.md @@ -1,20 +1,70 @@ # logs -`crabbox logs` prints the retained remote output for a recorded run. +`crabbox logs` prints the retained command output for a recorded run. ```sh -crabbox logs run_... -crabbox logs --id run_... -crabbox logs run_... --json +crabbox logs run_abcdef123456 +crabbox logs --id run_abcdef123456 +crabbox logs run_abcdef123456 --json ``` -The plain form writes the log text to stdout. `--json` returns run metadata plus the log. +## What Gets Stored -Logs are bounded remote stdout/stderr captures. The CLI keeps up to 8 MiB per run and the coordinator stores larger captures in chunks, so failures from noisy parallel runs remain visible without turning run history into unlimited archival storage. +When `crabbox run` runs against a coordinator, it streams remote stdout and +stderr to the local terminal *and* records a bounded copy on the +coordinator. 
The CLI keeps up to 8 MiB of capture per run; the coordinator +stores larger captures in chunks so a noisy parallel run does not exceed +Durable Object storage limits. + +Output beyond the cap is truncated with an `output.truncated` marker on the +last event so the consumer knows the tail is missing. + +## Output + +The plain form writes the log text to stdout. `--json` returns run metadata +plus the log: + +```json +{ + "runId": "run_abcdef123456", + "leaseId": "cbx_abcdef123456", + "exitCode": 0, + "truncated": false, + "log": "..." +} +``` + +`--json` is stable enough for scripts that filter by exit code and want the +log text in one payload. + +## Flags + +```text +--id run id (also accepted as a positional argument) +--json print JSON with metadata and log text +``` + +## When To Use Logs vs Events vs Attach + +- `logs` returns the retained command output. Use when you want the full + bounded transcript after the run finished. +- `events` returns ordered run events (lease, sync, command, output chunks, + finish). Use when you need to know *what happened* and *when*. +- `attach` follows live events. Use when the run is still active and you + want to watch it without re-attaching the original CLI. + +Logs and events are independent surfaces - logs stay focused on command +output, events stay focused on lifecycle. + +## Direct Mode + +Direct-provider mode does not record runs centrally, so `crabbox logs` has +nothing to fetch. Use shell output or the local terminal log instead. Related docs: - [history](history.md) - [events](events.md) - [attach](attach.md) +- [results](results.md) - [History and logs](../features/history-logs.md) diff --git a/docs/commands/results.md b/docs/commands/results.md index d1b1ce5..c49d925 100644 --- a/docs/commands/results.md +++ b/docs/commands/results.md @@ -1,17 +1,22 @@ # results -`crabbox results` prints structured test summaries attached to a recorded run. 
+`crabbox results` prints structured test summaries attached to a recorded +run. ```sh -crabbox run --id cbx_... --junit junit.xml -- go test ./... -crabbox results run_... -crabbox results run_... --json +crabbox run --id cbx_abcdef123456 --junit junit.xml -- go test ./... +crabbox results run_abcdef123456 +crabbox results run_abcdef123456 --json ``` -Results are attached only when `crabbox run` is told where to find remote JUnit XML. Use either: +## When Results Are Attached + +Results are attached only when `crabbox run` is told where to find remote +JUnit XML. Use either: ```sh crabbox run --junit junit.xml -- +crabbox run --junit junit.xml,reports/junit.xml -- ``` or repo config: @@ -23,10 +28,76 @@ results: - reports/junit.xml ``` -Human output shows totals and failed test cases. JSON output returns the stored summary. Stored summaries keep aggregate counts but cap bulky failure details. +After the command exits, the CLI reads each remote file from the workdir, +parses JUnit, and sends only the summary to the coordinator. Raw XML is not +stored. Multiple JUnit files are merged into a single summary so a multi- +report test setup still produces one result record. 
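The read, parse, and merge step described above can be sketched roughly as follows. This is a minimal Python illustration rather than the CLI's Go implementation: `summarize_junit` and `merge_summaries` are hypothetical helper names, the summary field names are assumed for the example, and treating `<error>` cases as failed cases is also an assumption.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Reduce one JUnit XML document to counters plus failed-case names."""
    root = ET.fromstring(xml_text)
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0, "timeSeconds": 0.0}
    failures = []
    # iter() also yields the root element, so both a <testsuites> wrapper and a
    # bare <testsuite> root work; findall() only returns direct children, so
    # nested suites are not counted twice.
    for suite in root.iter("testsuite"):
        for case in suite.findall("testcase"):
            totals["tests"] += 1
            totals["timeSeconds"] += float(case.get("time") or 0)
            entry = {"suite": suite.get("name", ""), "name": case.get("name", "")}
            if case.find("failure") is not None:
                totals["failures"] += 1
                failures.append(entry)
            elif case.find("error") is not None:
                totals["errors"] += 1
                failures.append(entry)  # assumption: errors are listed too
            elif case.find("skipped") is not None:
                totals["skipped"] += 1
    return {"totals": totals, "failures": failures}


def merge_summaries(summaries):
    """Fold several per-file summaries into one record; raw XML is dropped."""
    merged = {
        "totals": {"tests": 0, "failures": 0, "errors": 0, "skipped": 0, "timeSeconds": 0.0},
        "failures": [],
    }
    for summary in summaries:
        for key, value in summary["totals"].items():
            merged["totals"][key] += value
        merged["failures"].extend(summary["failures"])
    return merged
```

Only the merged dictionary would leave the machine in this sketch, which mirrors the "summary only, no raw XML" contract.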
+ +## Output + +Human output shows totals and the names of failed test cases: + +```text +run_abcdef123456 lease=cbx_abcdef123456 command="pnpm test" +totals: tests=412 failures=2 errors=0 skipped=4 time=42.318s +failures: + src/auth.test.ts > login → returns user + src/sync.test.ts > rsync → handles deletes +``` + +`--json` returns the stored structured summary: + +```json +{ + "runId": "run_abcdef123456", + "totals": { "tests": 412, "failures": 2, "errors": 0, "skipped": 4, "timeSeconds": 42.318 }, + "failures": [ + { "suite": "src/auth.test.ts", "name": "login → returns user" }, + { "suite": "src/sync.test.ts", "name": "rsync → handles deletes" } + ], + "files": [ + { "path": "junit.xml", "size": 12345 } + ] +} +``` + +## Limits + +The coordinator caps stored summaries: + +- aggregate counters (tests, failures, errors, skipped) are kept verbatim; +- failed-case entries are capped to a bounded list; +- long strings (test names, suite names, message bodies) are truncated; +- file lists keep paths and sizes, never raw bytes. + +This keeps the result record small enough for the lease detail page and +the run detail page to render without paging through gigabytes of XML. + +## Flags + +```text +--id run id (also accepted as a positional argument) +--json print JSON +``` + +## When To Use Results vs Logs + +- `results` is the structured summary - "did the suite pass, and which + cases failed?"; +- `logs` is the retained command output - "what did the command print?". + +Use `results` for dashboards and quick triage. Use `logs` when you need to +read the actual stack trace. + +## Future Formats + +Today only JUnit XML is supported. Vitest JSON, Go `test2json`, and flaky- +test correlation across runs are tracked in +[Test results](../features/test-results.md). 
Related docs: - [run](run.md) - [history](history.md) +- [logs](logs.md) - [Test results](../features/test-results.md) diff --git a/docs/commands/sync-plan.md b/docs/commands/sync-plan.md index fe48d91..7e5cfd8 100644 --- a/docs/commands/sync-plan.md +++ b/docs/commands/sync-plan.md @@ -1,25 +1,80 @@ # sync-plan `crabbox sync-plan` prints the local sync manifest without leasing a box. +Use it to preview what `crabbox run` would send before paying for a cold +sync, or after editing `.crabboxignore` to confirm artifacts dropped out +of the manifest. ```sh crabbox sync-plan crabbox sync-plan --limit 10 +crabbox sync-plan --limit 25 --json ``` -It uses the same Git file-list manifest, `.crabboxignore`, and config excludes -as `crabbox run`, then prints: +## What It Reads + +`sync-plan` uses the same Git file-list manifest, `.crabboxignore`, and +`sync.exclude` rules as `crabbox run`: + +- tracked files from `git ls-files --cached`; +- nonignored untracked files from + `git ls-files --others --exclude-standard`; +- root `.crabboxignore` patterns; +- repo-local `sync.exclude` patterns; +- Crabbox's default cache/build excludes. + +It does not require a lease, does not call the broker, and does not call +any provider API. + +## Output + +Default output prints: - candidate file count and total bytes; - tracked deletes that would be applied remotely; -- largest files; -- largest first or second-level directories. +- the largest files; +- the largest first or second-level directories. -Use it before a cold sync when the preflight estimate looks too large, or after -editing `.crabboxignore` to confirm that local artifacts dropped out of the -manifest. +```text +files: 1843 +bytes: 312.5MiB +tracked deletes: 0 + +largest files: + 84.5MiB assets/demo.mp4 + 12.4MiB fixtures/sample-data.json + ... + +largest directories: + 140.2MiB assets + 80.1MiB fixtures + ... 
+``` + +## Flags + +```text +--limit show this many files and directories in each top list (default 5) +--json print structured JSON output +``` + +`--limit 0` shows the full lists (use sparingly; large repos produce big +output). + +## Use Cases + +- preview a first sync before warming a beast-class lease; +- find sneaky directories that grew (`.cache/`, `dist/`, generated assets); +- audit `.crabboxignore` after adding new excludes; +- compare repo footprint over time as part of repo health checks. + +The numbers `sync-plan` prints are upper bounds; rsync's actual transfer +size depends on what is already on the remote runner. Repeat sync after a +warmup is much smaller because the manifest matches the remote fingerprint +and rsync ships only changed bytes. Related docs: - [run](run.md) - [Sync](../features/sync.md) +- [Configuration](../features/configuration.md) diff --git a/docs/commands/whoami.md b/docs/commands/whoami.md index 8693e53..925688e 100644 --- a/docs/commands/whoami.md +++ b/docs/commands/whoami.md @@ -1,21 +1,77 @@ # whoami -`crabbox whoami` verifies broker auth and prints the identity the coordinator sees. +`crabbox whoami` verifies broker auth and prints the identity the +coordinator sees. ```sh crabbox whoami crabbox whoami --json ``` -Human output: +## Human Output ```text -user=steipete@gmail.com org=openclaw auth=github broker=https://crabbox.openclaw.ai +user=alex@example.com org=openclaw auth=github broker=https://crabbox.openclaw.ai ``` -Identity normally comes from the signed GitHub login token. Shared bearer-token automation reports owner/org from `X-Crabbox-Owner` and `X-Crabbox-Org`; the CLI fills those from `CRABBOX_OWNER`, Git email env, `git config user.email`, and `CRABBOX_ORG`. Raw Cloudflare Access identity headers are ignored; only a verified Access JWT email can become the bearer-token owner. JSON output also reports the forwarded auth mode, such as `github` or `bearer`. +The fields: + +- `user` - the resolved owner email. 
+- `org` - the organization namespace, when set. +- `auth` - the authentication mode the coordinator accepted (`github` for + signed login tokens, `bearer` for shared automation tokens). +- `broker` - the configured coordinator URL. + +## JSON Output + +```json +{ + "owner": "alex@example.com", + "org": "openclaw", + "auth": "github", + "broker": "https://crabbox.openclaw.ai", + "tokenSource": "user-config", + "accessJwtVerified": false +} +``` + +JSON output also reports the forwarded auth mode, where the token came +from (`user-config`, `env`, `stdin`), and whether a verified Cloudflare +Access JWT was present. + +## Identity Sources + +Identity normally comes from the signed GitHub login token. The browser +flow embeds the verified GitHub email and allowed-org membership in a +short-lived signed token; the coordinator extracts owner/org from that +token, not from headers. + +Shared bearer-token automation reports owner/org from `X-Crabbox-Owner` and +`X-Crabbox-Org`. The CLI fills those headers from: + +- `CRABBOX_OWNER` env (highest precedence); +- `GIT_AUTHOR_EMAIL` or `GIT_COMMITTER_EMAIL` env; +- `git config user.email`; +- `CRABBOX_ORG` env for the org header. + +Raw Cloudflare Access identity headers are ignored. Only a verified Access +JWT email (with the JWT validated against the Cloudflare team's public +keys) can become the bearer-token owner. + +## Exit Codes + +```text +0 identity resolved successfully +2 broker URL or token missing +3 auth failure (token rejected, GitHub org membership missing, etc.) +``` + +Use `whoami` in CI scripts before any long workflow to fail fast on auth +issues. 
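A fail-fast preflight along these lines might look like the sketch below. It is a Python illustration under stated assumptions: `preflight_auth` and `WHOAMI_EXIT_CODES` are hypothetical names, and the only Crabbox behavior relied on is the exit-code table above.

```python
import subprocess

# Exit codes as listed in the table above.
WHOAMI_EXIT_CODES = {
    0: "identity resolved successfully",
    2: "broker URL or token missing",
    3: "auth failure",
}


def preflight_auth(run=subprocess.run):
    """Run `crabbox whoami` and return (ok, reason).

    `run` is injectable so the check can be exercised without the CLI installed.
    """
    result = run(["crabbox", "whoami"], capture_output=True, text=True)
    code = result.returncode
    reason = WHOAMI_EXIT_CODES.get(code, f"unexpected exit code {code}")
    return code == 0, reason
```

A CI step could call `preflight_auth()` first and stop the job when the first element is false, printing the reason before any lease is requested.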
Related docs: - [login](login.md) +- [logout](logout.md) +- [Auth and admin](../features/auth-admin.md) - [Broker auth and routing](../features/broker-auth-routing.md) diff --git a/docs/concepts.md b/docs/concepts.md new file mode 100644 index 0000000..4ecd8da --- /dev/null +++ b/docs/concepts.md @@ -0,0 +1,256 @@ +# Concepts + +Read when: + +- you encounter a Crabbox term you do not recognize; +- you are writing docs and want to stay consistent with existing usage; +- you need a single page that lays out the vocabulary. + +This page is a glossary. It defines the nouns and the verbs Crabbox uses +across the CLI, broker, providers, and docs. When two synonyms exist, the +preferred form is in **bold**. + +## Compute Vocabulary + +**Lease** - a time-bounded reservation of a remote runner that Crabbox +created or resolved. Has a canonical ID (`cbx_...`), a friendly slug, an +idle timeout, a TTL, and a state (`active`, `released`, `expired`, +`failed`). Leases are the unit of cost accounting and cleanup. + +**Runner** - the remote machine itself. Provisioned by the provider, +prepared by cloud-init, used for one or more leases. Crabbox does not +distinguish between a Hetzner cloud server, an AWS EC2 instance, and a +static SSH host beyond what the provider backend tells it - all are +runners. + +**Box** / **Testbox** - informal synonym for runner. Used in the README and +some early docs. Prefer "runner" in new docs unless the surrounding context +is talking about leases as a product (in which case "box" reads better). + +**Pool** - the set of currently active runners visible to a user, org, or +the whole fleet. `crabbox list` and `/v1/pool` both expose it. + +**Slug** - the friendly name for a lease. Looks like `blue-lobster`. +Generated from a stable hash of the lease ID; collisions append a 4-hex +suffix. See [Identifiers](features/identifiers.md). + +**Lease ID** - the canonical machine-friendly identifier +(`cbx_abcdef123456`). Used in labels, logs, and APIs. 
Always 16 chars. + +**Run** - a single `crabbox run` invocation against a coordinator. Has a +`run_...` ID, an owning lease, a command, an exit code, and a record in +coordinator history. + +## Roles + +**CLI** - the local Go binary `crabbox`. Owns config, sync, command +execution, output streaming, and per-lease SSH keys. See +[Architecture](architecture.md). + +**Broker** / **Coordinator** - the Cloudflare Worker plus Fleet Durable +Object. Owns provider credentials, lease state, expiry, cleanup alarms, +usage, and cost. Both terms are used interchangeably; "coordinator" is +preferred in feature docs that emphasize state, "broker" when emphasizing +the trust boundary between CLI and provider. + +**Provider** - a Crabbox component that knows how to acquire, resolve, +list, and release runners on a backing service. Built-in providers: AWS, +Hetzner, Static SSH, Blacksmith Testbox, Daytona, Islo. See +[Provider reference](providers/README.md). + +**Backend** - the Go interface a provider implements: +`SSHLeaseBackend` for providers that hand Crabbox a real SSH target, +`DelegatedRunBackend` for providers that own command execution +themselves. See [Provider backends](provider-backends.md). + +**Operator** - a person with broker-side access (admin token, Cloudflare +config). Operators run `crabbox admin` commands and image bake/promote +flows. + +**Agent** - an LLM-backed process invoking Crabbox through the CLI or the +OpenClaw plugin. Agents are first-class users of Crabbox; the docs +intentionally write for both humans and agents. + +## Modes + +**Brokered mode** / **coordinator mode** - the normal path, where the CLI +talks to the Cloudflare Worker for lease creation, lease state, and +cleanup. Provider secrets stay broker-side. Used for shared team +infrastructure. + +**Direct mode** / **direct-provider mode** - the local-debug fallback, where +the CLI talks straight to the provider API (AWS SDK, Hetzner API, Daytona +SDK, Islo SDK). 
No coordinator, no central history, no spend caps. Use +when you are debugging the broker itself. + +**Static mode** - lease behavior for `provider: ssh`. The host is operator- +owned; Crabbox does not provision or delete it. Bypasses both broker and +direct provisioning paths. + +**Delegated mode** - the path used by Blacksmith, Islo, and the Daytona +`run` flow. The provider owns command execution and streams output back to +Crabbox. Crabbox-owned sync (`--sync-only`, `--checksum`) is rejected; +sync timing reports `sync=delegated`. + +## Commands + +**warmup** - acquire a lease and keep it ready. No command runs yet. + +**run** - acquire or reuse a lease, sync, run a command, stream output, +release. + +**stop** - release a specific lease and delete its provider resources. + +**cleanup** - sweep direct-provider leftovers based on labels. Refuses +when a coordinator is configured. + +**reuse** - using `--id` (or a slug) to pick an existing lease instead of +creating a new one. Both `warmup` (idempotent) and `run` accept `--id`. + +**reclaim** - move a local claim from one repo to another so a lease +created in repo A can be reused from repo B. Required because Crabbox +binds leases to repos by default. + +**hydrate** - prepare a runner with project dependencies, usually by +dispatching a real GitHub Actions job that registers an ephemeral +self-hosted runner. The CLI then runs the local command in the hydrated +workspace. See [Actions hydration](features/actions-hydration.md). + +## State + +**Idle timeout** - the duration a lease may go without heartbeats before +the broker auto-releases it. Default 30m. Reset by every heartbeat or +explicit touch. + +**TTL** - the absolute maximum wall-clock lifetime of a lease. Default +90m. Cannot be extended by heartbeats. `expiresAt = min(createdAt + ttl, +lastTouchedAt + idleTimeout)`. + +**Heartbeat** - a `POST /v1/leases/{id}/heartbeat` call sent by the CLI +during long-running commands. 
Updates `lastTouchedAt`, can ship telemetry +samples, and can update idle timeout when explicitly requested. + +**Touch** - lower-level synonym for "update lease state and idle". The +provider's `Touch` method is what handles direct-provider state updates; +heartbeat is the brokered equivalent. + +**Reserved cost** - the worst-case TTL cost the broker reserves for a +lease at creation time (`hourlyRate × ttl`). Charged against the monthly +spend cap until the lease ends; freed on release. Distinct from elapsed +runtime cost, which is reported by `crabbox usage`. + +**Estimated cost** - elapsed-runtime cost for a lease, computed from the +hourly rate and the time spent in `active`. What `crabbox usage` reports +as a billing approximation. + +## Sync + +**Manifest** - the NUL-delimited list of paths Crabbox will sync, built +from `git ls-files --cached` and `git ls-files --others --exclude-standard`. + +**Fingerprint** - a hash of the commit, dirty file metadata, and manifest. +When the local fingerprint matches the remote one, Crabbox skips rsync. + +**Git seeding** - the optional first-sync step where Crabbox fetches the +configured origin/base ref into the runner's Git directory before rsync, +so changed-file diffs are available remotely. + +**Base ref** - the Git ref that Crabbox seeds and hydrates. Default +`main`. Configurable per repo in `sync.baseRef`. + +**Sanity check** - a guardrail run after rsync that detects mass tracked +deletions, missing manifest entries, and other suspicious sync outcomes. + +## Capabilities + +**Desktop** - lease capability that adds Xvfb + XFCE + x11vnc. Required +for `crabbox vnc`, `crabbox webvnc`, and most `--browser` UI runs. + +**Browser** - lease capability that installs Chrome/Chromium and exports +`BROWSER`/`CHROME_BIN`. Useful for Playwright/Vitest/etc. without a full +QA harness. + +**Code** - lease capability that installs code-server bound to loopback. +Used by `crabbox code` and the portal `/code/` bridge. 
+ +**Tailscale** - optional reachability layer for managed Linux leases. +Joins the lease to the configured tailnet so clients on the tailnet can +reach the runner without the public IP. Distinct from the network mode +(`--network tailscale`) that selects which plane the CLI uses. + +## Backplane + +**Durable Object** - the Cloudflare Worker primitive that holds Crabbox +fleet state. Crabbox uses one fleet Durable Object so all scheduling +decisions are serialized. + +**Alarm** - the Durable Object scheduling primitive that fires on a future +timestamp. Crabbox uses alarms for idle-timeout sweeps and TTL cleanup. + +**Portal** - the server-rendered web UI hosted by the same Worker. Pages +under `/portal/...`. See [Browser portal](features/portal.md). + +**Bridge** - a portal endpoint that proxies traffic to a loopback service +on the lease (VNC, code-server). Bridges authenticate against the portal +session, then talk to the lease over the internal SSH plane. + +## Identity + +**Owner** - the email address that owns a lease. Resolved from the signed +GitHub login token, `CRABBOX_OWNER`, Git env, or `git config user.email`. + +**Org** - the GitHub-style organization namespace for a lease. Resolved +from the signed token or `CRABBOX_ORG`. Used for usage scoping and +multi-tenant cost caps. + +**Allowed org** - the GitHub org membership the broker requires before +issuing a signed login token. Configured per Cloudflare Worker. + +**Admin token** - the separately scoped token required for `/v1/pool`, +admin lease routes, and fleet-wide listing. Held more closely than the +shared automation token. + +**Cloudflare Access** - optional protection layer in front of the Worker. +When configured, the Worker only trusts the `CF-Access-Jwt-Assertion` +header (verified upstream); raw identity headers from the client are +ignored. + +## Storage + +**State directory** - where the CLI keeps local state (claims, per-lease +keys, known_hosts). 
Defaults to `$XDG_STATE_HOME/crabbox`, falling back to +the platform-specific user config directory. + +**Claim** - a JSON file under the state directory binding a lease to a +repo. Required for `crabbox run --id` to resolve slugs and to refuse +cross-repo reuse without `--reclaim`. + +**Workdir** / **work root** - the directory on the runner where Crabbox +syncs the repo. Default `/work/crabbox` on Linux; provider-specific on +Windows and macOS. + +## Documentation + +**Source map** - the doc page that points each user-facing behavior at the +implementation file behind it. Updated when behavior changes. See +[Source map](source-map.md). + +**Feature page** - a doc under `docs/features/.md` describing what +Crabbox does in one capability area. Owns the conceptual story; commands +and providers cross-link from here. + +**Command page** - a doc under `docs/commands/.md` describing the +flags, behavior, and exit codes of one CLI command. One per top-level +command, kept in sync with `--help` by `scripts/check-command-docs.mjs`. + +**Provider page** - a doc under `docs/providers/.md` describing one +provider's targets, config keys, env vars, sync behavior, and expected +failures. + +Related docs: + +- [How Crabbox Works](how-it-works.md) +- [Architecture](architecture.md) +- [CLI](cli.md) +- [Configuration](features/configuration.md) +- [Provider backends](provider-backends.md) diff --git a/docs/features/README.md b/docs/features/README.md index 2901119..0f0eb45 100644 --- a/docs/features/README.md +++ b/docs/features/README.md @@ -8,39 +8,62 @@ Read when: - you are deciding where a behavior belongs; - you need the feature-level contract before changing code. -Core features: +## Foundations + +- [Configuration](configuration.md): precedence, YAML schema, profiles, classes, env vars. +- [Identifiers](identifiers.md): lease IDs, slugs, run IDs, claims, and how lookup resolves. +- [Doctor checks](doctor.md): what `crabbox doctor` validates and how to extend it. 
+- [Network and reachability](network.md): `--network auto|tailscale|public`, port fallback, public/tailnet planes. +- [Lease capabilities](capabilities.md): `--desktop`, `--browser`, and `--code` selection rules. +- [Environment forwarding](env-forwarding.md): name-based env allowlist for the remote command. + +## Brokered fleet - [Coordinator](coordinator.md): brokered leases through Cloudflare Workers and Durable Objects. -- [Broker auth and routing](broker-auth-routing.md): GitHub login, shared bearer tokens, optional Cloudflare Access, and Worker routes. - [Browser portal](portal.md): authenticated lease/run UI, detail pages, bridge routes, and runner visibility. +- [Broker auth and routing](broker-auth-routing.md): GitHub login, shared bearer tokens, optional Cloudflare Access, and Worker routes. +- [Auth and admin](auth-admin.md): login/logout/whoami and trusted operator controls. +- [Telemetry](telemetry.md): lightweight Linux load, memory, disk, uptime, and run resource samples. +- [History and logs](history-logs.md): coordinator run records, events, and retained remote output. +- [Cost and usage](cost-usage.md): guardrails, provider-backed pricing, and reporting. +- [Lifecycle cleanup](lifecycle-cleanup.md): release, expiry, keep mode, and direct cleanup. + +## Providers + - [Providers](providers.md): provider overview, target matrix, classes, and fallback. -- [Provider backends](../provider-backends.md): implementation guide for adding a new provider/backend/plugin. -- [Provider authoring](provider-authoring.md): step-by-step guide for adding a provider package. +- [Capacity and fallback](capacity-fallback.md): class chains, market spot/on-demand, region/AZ routing. +- [Provider backends](../provider-backends.md): contract reference for backend interfaces and registration. +- [Authoring a provider](provider-authoring.md): step-by-step guide to writing a new provider. - [AWS](aws.md): EC2 Linux, Windows, WSL2, EC2 Mac, capacity, AMIs, and security groups. 
- [Hetzner](hetzner.md): Linux-only managed Hetzner behavior, classes, and cleanup. - [Blacksmith Testbox](blacksmith-testbox.md): delegated Testbox backend behavior. - [Daytona](daytona.md): Daytona SDK/toolbox sandbox leases with optional short-lived SSH access. - [Islo](islo.md): delegated Islo sandbox runs using the Islo Go SDK. + +## Runners and reachability + - [Tailscale](tailscale.md): optional tailnet reachability for managed Linux leases and static hosts. - [Runner bootstrap](runner-bootstrap.md): cloud-init, installed tools, SSH port, and readiness. - [Prebaked runner images](prebaked-images.md): provider-owned image storage and the image/cache/state boundary. - [Image bake runbook](image-bake-runbook.md): exact AWS bake, candidate smoke, promotion, rollback, and cleanup flow. +- [SSH keys](ssh-keys.md): per-lease keys, provider key cleanup, and local storage. + +## Sync, run, and recording + - [Sync](sync.md): Git file-list manifests, rsync, fingerprints, excludes, guardrails, and sanity checks. - [Actions hydration](actions-hydration.md): let GitHub Actions prepare a runner, then sync local work into that workspace. - [Interactive desktop and VNC](interactive-desktop-vnc.md): VNC hub, support matrix, tunnel model, and QA boundaries. - [Linux VNC](vnc-linux.md), [Windows VNC](vnc-windows.md), [macOS VNC](vnc-macos.md): OS-specific desktop setup and troubleshooting. -- [SSH keys](ssh-keys.md): per-lease keys, provider key cleanup, and local storage. -- [Cost and usage](cost-usage.md): guardrails, provider-backed pricing, and reporting. -- [History and logs](history-logs.md): coordinator run records, events, and retained remote output. -- [Telemetry](telemetry.md): lightweight Linux load, memory, disk, uptime, and run resource samples. - [Test results](test-results.md): JUnit summaries attached to recorded runs. - [Cache controls](cache.md): inspect, purge, and warm remote package/build caches. 
-- [Auth and admin](auth-admin.md): login/logout/whoami and trusted operator controls. -- [Lifecycle cleanup](lifecycle-cleanup.md): release, expiry, keep mode, and direct cleanup. + +## Integrations + +- [OpenClaw plugin](openclaw-plugin.md): agent tools that wrap the CLI. - [Repository onboarding](repository-onboarding.md): `crabbox init`, repo config, workflow stub, and agent skill. - [Source map](../source-map.md): implementation files behind documented behavior. -Command docs: +## Command docs - [doctor](../commands/doctor.md) - [init](../commands/init.md) diff --git a/docs/features/capabilities.md b/docs/features/capabilities.md new file mode 100644 index 0000000..b330080 --- /dev/null +++ b/docs/features/capabilities.md @@ -0,0 +1,191 @@ +# Lease Capabilities + +Read when: + +- adding `--desktop`, `--browser`, or `--code` to a workflow; +- changing how Crabbox detects whether a lease can host a visible desktop; +- adding a new lease capability flag. + +Lease capabilities are opt-in features that change what a managed runner can +do beyond running headless commands. They are a separate concept from the +provider feature set declared in `ProviderSpec.Features`: feature set says +"this provider can support a desktop"; lease capability says "this lease was +created with a desktop and exposes one right now". + +## The Three Capabilities + +```text +--desktop visible desktop with a loopback VNC server +--browser Chrome/Chromium installed and exported via $BROWSER and $CHROME_BIN +--code code-server bound to a loopback port for portal/code bridging +``` + +All three default to off. They have to be requested at lease creation time +(`crabbox warmup --desktop`) and reused afterwards. A lease created without a +capability cannot grow it later. + +## Selection And Validation + +Capability flags follow a two-step validation: + +1. 
**Provider feature check.** When the user sets a capability flag, + `validateRequestedCapabilities` looks up the selected provider's + `Spec.Features` and rejects the request if the matching feature + (`FeatureDesktop`, `FeatureBrowser`, `FeatureCode`) is missing. Hetzner + Linux supports all three; Blacksmith Testbox supports none. +2. **Lease label check.** When reusing a lease (`--id`), + `enforceManagedLeaseCapabilities` checks the matching label + (`desktop=true`, `browser=true`, `code=true`) on the existing lease. If + the label is missing, Crabbox refuses with a hint to warm a new lease. + +For static SSH targets, label enforcement is skipped because Crabbox does not +own the host. The capability is detected probe-by-probe instead - `--desktop` +on a static target probes the loopback VNC port; `--browser` on a static +target probes for Chrome and exports `BROWSER`/`CHROME_BIN` from what it +finds. + +`--code` is currently restricted to managed Linux leases. The validator +rejects it for Windows, macOS, and static SSH. + +## Desktop + +When a managed Linux lease is created with `--desktop`, bootstrap installs: + +- Xvfb (virtual framebuffer); +- a slim XFCE session; +- x11vnc bound to `127.0.0.1:5900`; +- a randomized VNC password at `/var/lib/crabbox/vnc.password`; +- screenshot tooling (`scrot`) and ffmpeg. + +`crabbox vnc --id ...` opens an SSH tunnel to that loopback port. The user's +local VNC viewer talks through the tunnel and uses the password the CLI +fetches from `/var/lib/crabbox/vnc.password`. There is no public VNC port; the +loopback bind is the security boundary. + +Static targets must already expose loopback VNC at `127.0.0.1:5900`. macOS +hosts can enable Screen Sharing; Windows hosts need a VNC server bound to +loopback (TightVNC works). + +For per-OS detail and known limits, see: + +- [Linux VNC](vnc-linux.md); +- [Windows VNC](vnc-windows.md); +- [macOS VNC](vnc-macos.md); +- [Interactive desktop and VNC](interactive-desktop-vnc.md). 
+ +When the run injects environment, Crabbox also sets: + +```text +DISPLAY=:99 +CRABBOX_DESKTOP=1 +``` + +Tools that respect `DISPLAY` will draw onto the desktop the lease created. + +## Browser + +`--browser` adds a usable browser to the lease without dragging in a full QA +test environment. + +On managed Linux: + +- Google Chrome stable when available; +- Chromium fallback; +- native addon build helpers (`build-essential`, `libgbm-dev`, + `libnss3-dev`, etc.) so dependency installs that compile against Chromium + succeed. + +On static targets, Crabbox probes for an existing browser and reports an +error if none is found. `requestedCapabilityEnv` shells out to the host: + +- macOS: `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`; +- Windows: `chrome.exe` or `msedge.exe` from PATH or the standard install + directories; +- Linux: `$BROWSER`, `$CHROME_BIN`, then `google-chrome`, `chromium`, or + `chromium-browser` from PATH. + +The detected path is exported into the run as: + +```text +BROWSER=/path/to/browser +CHROME_BIN=/path/to/browser +CRABBOX_BROWSER=1 +``` + +Test runners that read `BROWSER` or `CHROME_BIN` (Vitest, Playwright, etc.) +work without extra plumbing. If a browser is requested but no binary is +found, the run aborts before the command starts. + +## Code + +`--code` provisions code-server on managed Linux leases: + +- installs the binary at `/usr/local/bin/code-server`; +- binds to a loopback port (default `8080`); +- generates an auth token stored in coordinator state. + +The portal and `crabbox code --id ...` open a code-server tab through the +authenticated portal bridge at `/portal/leases/{id-or-slug}/code/`. The bridge +proxies HTTP and WebSocket traffic to the loopback port; the code-server +auth token is injected by the bridge so the user does not see it. There is no +public code-server port. + +Code is managed-Linux-only because the bridge depends on the lease shape and +the cloud-init that prepares the binary. 
Windows, macOS, and static SSH are +intentionally not supported today. + +## Capability Labels + +Managed lease records carry capability labels so list, status, and detail +pages can render the capability matrix without re-probing the host: + +```text +desktop=true|false +browser=true|false +code=true|false +``` + +`enforceManagedLeaseCapabilities` reads these labels to gate `--desktop`, +`--browser`, and `--code` on `--id` reuse paths. The labels are written when +the lease is created and never flipped on a live lease. + +## Composing Capabilities + +Capabilities are independent - any combination is allowed where the +provider supports them: + +```sh +crabbox warmup --desktop # desktop only +crabbox warmup --desktop --browser # browser running on the desktop +crabbox warmup --desktop --browser --code # full interactive box +crabbox warmup --browser # headless browser, no VNC +crabbox warmup --code # editor-only Linux lease +``` + +Capability bootstrap adds installation time. A bare lease is the fastest to +warm; a lease with all three takes the longest. Use the lightest combination +that satisfies the workflow. + +## Static Targets + +For static SSH hosts, capability validation degrades to probe-based detection: + +- `--desktop`: probe `127.0.0.1:5900` over SSH; fail with a clear error if + the port is not bound; +- `--browser`: probe for a browser binary using the OS-specific search list; + fail if none found; +- `--code` is rejected (managed Linux only). + +This is intentional. Crabbox is not responsible for installing software on +operator-owned static hosts; if the box does not expose the capability, the +run should not silently fall back. 
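The selection rules on this page (provider feature check, then lease label check, with static hosts probed instead of labeled) can be sketched as below. This is a hypothetical Python model, not the Go code behind `validateRequestedCapabilities` or `enforceManagedLeaseCapabilities`; the function names, error type, and messages are invented for illustration.

```python
class CapabilityError(Exception):
    """Raised when a requested capability cannot be honored."""


def validate_requested(provider_features, requested):
    """Step 1: reject capabilities the selected provider does not declare."""
    for cap in requested:
        if cap not in provider_features:
            raise CapabilityError(f"provider does not support --{cap}")


def enforce_lease_labels(labels, requested, static=False):
    """Step 2: on --id reuse, require the matching label on a managed lease."""
    if static:
        # Static SSH hosts carry no Crabbox labels; they are probed instead.
        return
    for cap in requested:
        if labels.get(cap) != "true":
            raise CapabilityError(f"lease was created without --{cap}; warm a new lease")
```

Keeping the two steps separate matches the text: the first answers "can this provider ever do it?", the second answers "was this particular lease created with it?".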
+ +Related docs: + +- [warmup command](../commands/warmup.md) +- [run command](../commands/run.md) +- [vnc command](../commands/vnc.md) +- [webvnc command](../commands/webvnc.md) +- [code command](../commands/code.md) +- [Interactive desktop and VNC](interactive-desktop-vnc.md) +- [Browser portal](portal.md) diff --git a/docs/features/capacity-fallback.md b/docs/features/capacity-fallback.md new file mode 100644 index 0000000..0be2c47 --- /dev/null +++ b/docs/features/capacity-fallback.md @@ -0,0 +1,215 @@ +# Capacity And Fallback + +Read when: + +- adding or changing machine classes; +- debugging "why did Crabbox pick this instance type?"; +- working on AWS spot/on-demand fallback or Hetzner location fallback; +- configuring multi-region or multi-AZ capacity for AWS. + +Crabbox cares about capacity in three ways: + +1. **Class fallback** - the ordered list of provider types that satisfy a + class request. +2. **Market fallback** - AWS-specific Spot to On-Demand failover within a + class. +3. **Region/AZ routing** - where the broker tries to provision when capacity + is tight in a single zone. + +Hetzner only deals with class fallback. AWS deals with all three. Static +SSH, Blacksmith, Daytona, and Islo do not have capacity fallback because +the operator or external service controls the underlying resources. + +## Classes + +Class names are provider-agnostic intent labels: + +```text +standard typical CI lane +fast ~2x more cores than standard for parallel-friendly suites +large memory-heavy or many-process workloads +beast maximum capacity within the provider's burstable family +``` + +Each provider maps the four class names to an ordered list of concrete +instance types. The list is the fallback chain: try the first; if rejected, +try the second; and so on. + +The full Hetzner and AWS class tables live in +[Providers](providers.md#hetzner-summary). The table also lists the AWS +Windows, Windows WSL2, and macOS class maps. 
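The "try the first; if rejected, try the second" behavior can be sketched as a small loop. This is an illustrative Python model only: `provision_with_fallback`, `NoCapacity`, and the `type-a`/`type-b`/`type-c` names are hypothetical, and the real chain entries come from each provider's class table.

```python
class NoCapacity(Exception):
    """Raised when every candidate in the chain was rejected."""

    def __init__(self, attempts):
        super().__init__("no capacity: fallback chain exhausted")
        self.attempts = attempts


def provision_with_fallback(chain, try_provision):
    """Try each instance type in order and return the first that succeeds.

    `try_provision` returns None on success or a short rejection reason.
    """
    attempts = []
    for instance_type in chain:
        error = try_provision(instance_type)
        if error is None:
            return instance_type, attempts
        # Remember why this candidate failed before moving down the chain.
        attempts.append({"type": instance_type, "error": error})
    raise NoCapacity(attempts)
```

Recording every rejected candidate, instead of only the last one, is what lets an operator see the whole path the request took.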
+ +## When Class Fallback Triggers + +Hetzner falls back when: + +- the requested server type is unavailable in the configured location; +- the project quota rejects the request; +- the API returns a transient capacity error. + +AWS falls back when: + +- the instance type is rejected by capacity in the chosen Availability Zone; +- the account policy denies the type (e.g. quota = 0 vCPUs); +- the spot request is rejected by capacity. + +Quota rejections are detected from the API error code rather than scraped +from the message string, so the fallback is deterministic. The next +candidate in the chain is tried until either one succeeds or the chain is +exhausted. + +When the chain is exhausted, Crabbox returns exit code 4 (`no capacity`) and +the error includes `provisioningAttempts` that record which types were +tried, why each failed, and where (region/AZ for AWS). The same metadata is +attached to the failed lease record on the coordinator so operators can +inspect what went wrong without rerunning the workflow. + +## Explicit Type Override + +`--type c7a.16xlarge` and the matching `type:` config key skip the class +fallback chain and request that specific instance type. The contract is +"give me this exact type, not a fallback". If the provider rejects it, +Crabbox fails loudly with exit code 4 and does not silently choose a +different type. + +Use `--type` when: + +- you want deterministic capacity for benchmarks; +- you are pinning a specific generation for a known-bug workaround; +- you are debugging the capacity layer itself. + +For everything else, prefer a class - the fallback chain handles transient +rejections without operator intervention. + +## AWS Market Fallback + +AWS supports two markets: `spot` and `on-demand`. + +```yaml +capacity: + market: spot + fallback: on-demand-after-120s +``` + +`capacity.market: spot` requests Spot capacity first. 
`capacity.fallback: +on-demand-after-120s` falls back to On-Demand for the same instance type +when Spot fails to come up within 120 seconds. Set `fallback` to `none` (or +omit it) to never fall back to On-Demand. + +Per-command overrides: + +```sh +crabbox warmup --market spot +crabbox run --market on-demand -- pnpm test +``` + +The `--market` flag overrides `capacity.market` for one lease without +rewriting repo config. Use it when an account is temporarily out of Spot +quota or when Spot interruption rates spike. + +## AWS Capacity Hints + +The brokered AWS path uses Service Quotas and EC2 placement scoring to +preflight large requests: + +```yaml +capacity: + hints: true + largeClasses: + - large + - beast +``` + +When `hints: true` and the class is in `largeClasses`: + +- the broker calls Service Quotas to check applied Spot or On-Demand vCPU + limits; +- candidates that exceed quota are recorded as quota attempts and skipped; +- remaining candidates are scored with `GetSpotPlacementScores` (Spot mode) + to pick the most-available region/AZ. + +The result is a single provisioning attempt that picks the best location +and skips known-rejected types instead of letting the chain stumble through +them sequentially. + +Hints apply only on the brokered (Worker) path. Direct AWS mode still falls +back through the class chain but does not run quota or placement preflight. + +## Region And Availability Zone Routing + +```yaml +capacity: + regions: + - eu-west-1 + - us-east-1 + availabilityZones: + - eu-west-1a + - eu-west-1b +``` + +`regions` is the ordered list of AWS regions the broker considers when +multiple regions are configured. Single-region setups use `aws.region` and +leave `capacity.regions` empty; multi-region setups list every region the +broker may launch into. + +`availabilityZones` narrows the per-region zone selection. The broker uses +Spot placement scoring across the listed AZs and picks the highest-scoring +zone that has capacity. 
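A minimal sketch of this routing, assuming placement scores are already known: regions are walked in list order, and within a region the highest-scoring zone that has capacity wins. The `Zone` type, the `pickZone` name, and the score values are hypothetical; the broker's real scoring comes from the AWS placement API rather than a static list.

```go
package main

import "fmt"

// Zone is a hypothetical view of one Availability Zone candidate.
type Zone struct {
	Region      string
	Name        string
	Score       int  // placement score, higher is better (assumed scale)
	HasCapacity bool // whether the zone accepted the request
}

// pickZone honors region order first, then zone score within the region.
// A region is skipped entirely when none of its zones has capacity.
func pickZone(regions []string, zones []Zone) (Zone, bool) {
	for _, region := range regions {
		var best Zone
		found := false
		for _, z := range zones {
			if z.Region != region || !z.HasCapacity {
				continue
			}
			if !found || z.Score > best.Score {
				best, found = z, true
			}
		}
		if found {
			return best, true
		}
	}
	return Zone{}, false
}

func main() {
	regions := []string{"eu-west-1", "us-east-1"}
	zones := []Zone{
		{Region: "eu-west-1", Name: "eu-west-1a", Score: 3, HasCapacity: true},
		{Region: "eu-west-1", Name: "eu-west-1b", Score: 7, HasCapacity: true},
		{Region: "us-east-1", Name: "us-east-1a", Score: 9, HasCapacity: true},
	}
	z, ok := pickZone(regions, zones)
	fmt.Println(z.Name, ok)
}
```

Note that a higher score in a later region does not beat an earlier region with capacity; list order is the primary key in this sketch.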
+ +Regions are tried in order; AZs within a region are scored. If every AZ in +a region rejects the request, Crabbox advances to the next region. + +## Fallback Strategies + +```yaml +capacity: + strategy: most-available +``` + +| Value | Behavior | +|:------|:---------| +| `most-available` (default) | use placement scoring or class chain order | +| `cheapest` | prefer types with the lowest live hourly price (when known) | +| `provider-default` | follow the provider's own placement defaults | + +`cheapest` is currently honored on the brokered AWS path that has live +pricing. Hetzner does not differentiate strategies because its server-type +prices are consistent across locations. + +## Direct Mode Differences + +Direct provider mode (no coordinator) supports class fallback but has no +quota preflight, no placement score, no `provisioningAttempts` metadata, and +no central history. Direct AWS still respects `--market` and the `fallback` +config key, so spot-to-on-demand failover works locally - just without the +diagnostic richness the broker provides. + +If a direct AWS run exits with code 4, run the same command through the +broker once to get structured `provisioningAttempts` evidence; then go back +to direct mode for the rest of the iteration loop. + +## Failure Surface + +Capacity failures map to: + +```text +exit 4 no capacity every candidate in the chain was rejected +exit 5 provisioning failed a candidate was accepted but never reached SSH +exit 8 lease expired long warmup exceeded the configured TTL before SSH +``` + +The accompanying error message names the chain, the markets that were +tried, and (for brokered runs) `provisioningAttempts` you can inspect with: + +```sh +crabbox history --lease cbx_... 
+``` + +Related docs: + +- [Providers](providers.md) +- [AWS](../providers/aws.md) +- [Hetzner](../providers/hetzner.md) +- [Cost and usage](cost-usage.md) +- [Orchestrator](../orchestrator.md) +- [Operations](../operations.md) diff --git a/docs/features/configuration.md b/docs/features/configuration.md new file mode 100644 index 0000000..97144c0 --- /dev/null +++ b/docs/features/configuration.md @@ -0,0 +1,368 @@ +# Configuration + +Read when: + +- adding a new config key, env override, or flag; +- debugging "why is Crabbox using value X here?"; +- onboarding a repo and choosing what belongs in repo config vs user config; +- reviewing the YAML schema that `crabbox config show` and `crabbox init` + emit. + +Crabbox configuration is layered. The CLI loads values from five sources and +merges them in a deterministic order. Each source is optional - the binary +boots with sane defaults for everything. + +## Precedence + +```text +flags > env > repo-local crabbox.yaml/.crabbox.yaml > user config > defaults +``` + +Reading order is the lowest precedence first: defaults are applied, then +overridden by user config, then repo config, then env vars, then flags. Every +override only replaces fields that are explicitly set; unset fields fall +through. + +`crabbox config show` prints the merged configuration as the CLI sees it after +all five layers run. `--json` is stable enough to diff in scripts. +`crabbox config path` prints the user config file path so other tools can +edit it without parsing prose. + +## File Locations + +```text +macOS user: ~/Library/Application Support/crabbox/config.yaml +Linux user: ~/.config/crabbox/config.yaml +XDG override: $XDG_CONFIG_HOME/crabbox/config.yaml +repo: ./crabbox.yaml or ./.crabbox.yaml at repo root +explicit: $CRABBOX_CONFIG (any path) +``` + +If `CRABBOX_CONFIG` is set, it overrides the repo-local search and replaces +the effective repo config. User config is never replaced by the env override. 
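To make the precedence rule concrete, here is a minimal sketch of layered merging in which each layer only overrides the keys it explicitly sets. Representing a layer as a flat `map[string]string` is a simplifying assumption (the real config is nested YAML), and the `mergeLayers` name is hypothetical.

```go
package main

import "fmt"

// mergeLayers applies layers from lowest to highest precedence.
// A key absent from a layer is "unset" and falls through to lower layers.
func mergeLayers(lowestFirst ...map[string]string) map[string]string {
	merged := map[string]string{}
	for _, layer := range lowestFirst {
		for key, value := range layer {
			merged[key] = value
		}
	}
	return merged
}

func main() {
	defaults := map[string]string{"provider": "aws", "class": "standard", "network": "auto"}
	user := map[string]string{"class": "fast"}
	repo := map[string]string{"class": "beast"}
	env := map[string]string{"provider": "hetzner"}
	flags := map[string]string{}

	// Order matches: defaults < user config < repo config < env < flags.
	fmt.Println(mergeLayers(defaults, user, repo, env, flags))
}
```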
+ +State that does not belong in either YAML file: + +- live lease records (those are coordinator-owned); +- per-lease SSH private keys (those live under the user config dir but not in + `config.yaml`); +- provider secrets (those live in the broker environment, your shell env, or + a credential manager). + +## YAML Schema + +The full schema below merges what `crabbox init` emits and what advanced +operators set in user config. Most repos only need a small subset. + +### Top-level + +```yaml +broker: + url: https://crabbox.openclaw.ai + provider: aws + token: + access: + clientId: + clientSecret: + +provider: aws # default provider when --provider is not set +target: linux # default target OS +windows: + mode: normal # normal or wsl2 when target=windows + +profile: project-check +class: beast # standard | fast | large | beast +type: c7a.48xlarge # explicit provider type, overrides class fallback +network: auto # auto | tailscale | public + +lease: + idleTimeout: 30m + ttl: 90m +``` + +### Capacity + +```yaml +capacity: + market: spot # spot | on-demand + strategy: most-available + fallback: on-demand-after-120s + hints: true + regions: + - eu-west-1 + - us-east-1 + availabilityZones: + - eu-west-1a + - eu-west-1b + largeClasses: + - large + - beast +``` + +### AWS + +```yaml +aws: + region: eu-west-1 + ami: ami-0123456789abcdef0 + securityGroupId: sg-0abcdef0123456789 + subnetId: subnet-0abcdef0123456789 + instanceProfile: crabbox-runner + rootGB: 400 + sshCidrs: + - 203.0.113.0/24 + macHostId: h-0123456789abcdef0 +``` + +### Hetzner + +Hetzner credentials and image come from broker-side config. Repos do not need +a `hetzner:` block unless they pin a class or location. 
+ +### Static SSH + +```yaml +provider: ssh +target: macos +static: + host: mac-studio.local + user: steipete + port: "22" + workRoot: /Users/steipete/crabbox +``` + +### Blacksmith Testbox + +```yaml +provider: blacksmith-testbox +blacksmith: + org: openclaw + workflow: .github/workflows/ci-check-testbox.yml + job: test + ref: main + idleTimeout: 90m + debug: false +``` + +### Daytona + +```yaml +provider: daytona +daytona: + snapshot: openclaw-crabbox + apiKey: # prefer DAYTONA_API_KEY env +``` + +### Sync + +```yaml +sync: + delete: true + checksum: false + gitSeed: true + fingerprint: true + baseRef: main + timeout: 15m + warnFiles: 50000 + warnBytes: 5368709120 + failFiles: 150000 + failBytes: 21474836480 + allowLarge: false + exclude: + - node_modules + - .turbo + - dist +``` + +A `.crabboxignore` file at the repo root appends to `sync.exclude`. See +[Sync](sync.md) for the matcher rules. + +### Env Forwarding + +```yaml +env: + allow: + - CI + - NODE_OPTIONS + - PROJECT_* +``` + +`env.allow` is name-based and supports trailing wildcards. Crabbox forwards +matching local env vars to the remote command. Secrets do not belong in +`env.allow`; pass them through provider-side mechanisms. 
+ +### Actions + +```yaml +actions: + workflow: .github/workflows/crabbox.yml + job: test + ref: main + fields: + - crabbox_docker_cache=true + runnerLabels: + - crabbox + ephemeral: true + runnerVersion: latest +``` + +### Cache + +```yaml +cache: + pnpm: true + npm: true + docker: true + git: true + maxGB: 80 + purgeOnRelease: false +``` + +### Results + +```yaml +results: + junit: + - junit.xml + - reports/junit.xml +``` + +### SSH + +```yaml +ssh: + key: ~/.ssh/id_ed25519 + user: crabbox + port: "2222" + fallbackPorts: + - "22" +``` + +### Tailscale + +```yaml +tailscale: + enabled: false + tags: + - tag:crabbox + hostnameTemplate: crabbox-{slug} + authKeyEnv: CRABBOX_TAILSCALE_AUTH_KEY + exitNode: "" + exitNodeAllowLanAccess: false +``` + +## Profiles + +Profiles are named bundles of config that get applied as a layer on top of +user/repo config. They live under a `profiles:` map and are selected by +`--profile` or `profile:` in repo config. + +```yaml +profiles: + project-check: + class: beast + sync: + baseRef: main + env: + allow: + - PROJECT_* + smoke: + class: standard + lease: + ttl: 30m +``` + +Use profiles when one repo has multiple test lanes with different machine +classes, sync rules, or env allowlists. A repo without profiles never needs +the block. + +## Machine Classes + +A machine class is a provider-agnostic name for "standard", "fast", "large", +or "beast" capacity. Each provider maps the class to a list of concrete +instance/server types and falls back through the list when the first +candidate cannot be provisioned. + +| Class | Intent | +|:------|:-------| +| `standard` | typical CI lane | +| `fast` | ~2x more cores than standard for parallel-friendly suites | +| `large` | memory-heavy or many-process workloads | +| `beast` | maximum capacity within the provider's burstable family | + +Class-to-type mappings live in [Providers](providers.md). When you set +`type:`, that exact provider type wins and the class is ignored. 
The +`--type` and `type:` paths intentionally do not fall back; they fail loud +if the provider rejects the type. + +## Environment Variables + +Every YAML key has a `CRABBOX_*` env override. The full list is in +[CLI](../cli.md#environment-variables). Common ones: + +```text +CRABBOX_COORDINATOR +CRABBOX_COORDINATOR_TOKEN +CRABBOX_PROVIDER +CRABBOX_TARGET +CRABBOX_PROFILE +CRABBOX_DEFAULT_CLASS +CRABBOX_IDLE_TIMEOUT +CRABBOX_TTL +CRABBOX_NETWORK +CRABBOX_OWNER +CRABBOX_ORG +``` + +Provider credentials live outside the Crabbox env namespace because they are +provider-native: + +```text +HCLOUD_TOKEN / HETZNER_TOKEN +AWS_PROFILE / AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN +DAYTONA_API_KEY / DAYTONA_JWT_TOKEN +BLACKSMITH_* (read by the Blacksmith CLI) +ISLO_API_KEY (read by the Islo SDK) +``` + +## What Belongs Where + +| Setting | User config | Repo config | Profile | Notes | +|:--------|:------------|:------------|:--------|:------| +| `broker.url` and `broker.token` | yes | no | no | Per-machine identity. | +| `provider`, `class`, `type` | optional default | yes | yes | Per-repo defaults; profiles for lanes. | +| `sync.exclude`, `sync.fingerprint`, `sync.baseRef` | no | yes | yes | Lives with the repo. | +| `env.allow` | no | yes | yes | Repo decides what is safe to forward. | +| Per-user SSH key path | yes | no | no | Personal preference. | +| `aws.region`, `aws.ami` | optional | yes | yes | Repos can pin region. | +| Tailscale tags and template | yes | yes | yes | Both layers can set this. | +| Profiles | yes | yes | n/a | Either layer can define profiles. | + +The rule of thumb: anything other repos should inherit when they clone goes in +repo config; anything tied to one operator's machine goes in user config. 
+ +## Validation + +The CLI validates config eagerly: + +- `parseNetworkMode` rejects `--network` values outside `auto|tailscale|public`; +- `validateNetworkConfig` requires `tailscale.tags` when `tailscale.enabled` + is true and rejects Tailscale on Blacksmith and static providers; +- `validateRequestedCapabilities` rejects `--desktop`, `--browser`, or + `--code` for providers whose `Spec.Features` does not list the matching + feature flag; +- `crabbox doctor` runs a richer set of checks against config, network + reachability, and SSH keys. + +When validation fails, `crabbox` exits with code 2 and a message that names +the offending field. + +Related docs: + +- [CLI](../cli.md) +- [config command](../commands/config.md) +- [doctor command](../commands/doctor.md) +- [Sync](sync.md) +- [Providers](providers.md) +- [Capacity and fallback](capacity-fallback.md) +- [Network and reachability](network.md) diff --git a/docs/features/doctor.md b/docs/features/doctor.md new file mode 100644 index 0000000..3e46cd8 --- /dev/null +++ b/docs/features/doctor.md @@ -0,0 +1,172 @@ +# Doctor Checks + +Read when: + +- adding a new precheck before users run long workflows; +- debugging an unexpected `doctor` failure; +- deciding whether a check belongs in `doctor` or somewhere else. + +`crabbox doctor` is the local preflight. It validates the things that have +silently broken commands in the past so users get an answer before they +spend ten minutes on a failed lease. + +The command is fast (under a second on a healthy machine), local-only, +non-destructive, and never talks to provider APIs that might cost money. 
+ +## Categories + +Doctor groups checks under five categories: + +```text +config config files load and parse, required keys are present +auth broker token is set, signed token is valid, identity resolves +network coordinator URL reachable, DNS works, SSH transport probes work +ssh SSH key path readable, key type acceptable, ssh-keygen on PATH +tools rsync, git, ssh, ssh-keygen present and executable +``` + +Each category emits one or more pass/fail/skip lines. Failures are listed +first; passes and skips follow in deterministic order so the output is +diffable across runs. + +## What `config` Checks + +- The user config file parses without error. +- The repo config (when present) parses without error. +- Provider name resolves through `ProviderFor`. +- Target OS is one of `linux`, `macos`, `windows`. +- Network mode is one of `auto`, `tailscale`, `public`. +- Tailscale config validates when `tailscale.enabled: true` (tags non-empty, + hostname template non-empty, exit-node-allow-lan-access requires an + exit node, target is `linux`, provider is not Blacksmith or static). +- Class is one of `standard`, `fast`, `large`, `beast` when set; explicit + `type:` values are accepted as-is. + +## What `auth` Checks + +- A broker URL is configured if the user expects coordinator mode. +- A broker token is present when the URL is configured. +- The signed token (when GitHub login was used) decodes and is not expired. +- Owner can be resolved from `CRABBOX_OWNER`, Git env, or + `git config user.email`. +- `whoami` succeeds against the configured coordinator with the stored + token. + +When auth is missing, doctor prints `crabbox login` as the next step. + +## What `network` Checks + +- The coordinator URL resolves via DNS. +- The coordinator is reachable over HTTPS within a small timeout. +- When `--network tailscale` is configured, `tailscale status` reports a + joined client. 
+- SSH transport probes succeed for the primary port and fall back to the + configured fallback ports. + +DNS is checked before HTTPS so a broken DNS responder does not look like a +broker outage. + +## What `ssh` Checks + +- The configured SSH key path (`ssh.key` or `CRABBOX_SSH_KEY`) is readable + when set. +- The key file has a sensible permissions mode (warn on group/world + readable). +- `ssh-keygen` is on PATH so per-lease key generation works. +- The user's `~/.ssh/known_hosts` is writable (if it exists). + +When `ssh.key` is unset, doctor skips the path validation - per-lease keys +do not need a global key. + +## What `tools` Checks + +- `git` is on PATH. +- `rsync` is on PATH. +- `ssh` is on PATH. +- `ssh-keygen` is on PATH. + +The check is path-based, not version-based. Crabbox tolerates any reasonably +modern version of these tools. + +## What Doctor Does Not Do + +Doctor stays local on purpose. It does not: + +- start a real lease or provision a server; +- talk to AWS, Hetzner, Daytona, Islo, or any provider API; +- run `git ls-files` against the repo (that belongs in `crabbox sync-plan`); +- estimate costs; +- modify config or rotate keys. + +Anything that costs money or has side effects belongs in a different +command. Doctor is for "before I run anything, is my machine sane?" and +should be safe to run from `pre-commit` hooks, agent boot, or CI smoke. 
+ +## Output Shape + +```text +config: + ok user config: ~/.config/crabbox/config.yaml + ok repo config: ./.crabbox.yaml + ok provider: aws + ok target: linux + ok network: auto +auth: + ok broker: https://crabbox.openclaw.ai + ok owner: alex@example.com + ok org: openclaw +network: + ok coordinator dns + ok coordinator https +ssh: + ok ssh-keygen present + skip ssh.key unset (per-lease keys will be used) +tools: + ok git + ok rsync + ok ssh + ok ssh-keygen +``` + +Failures swap the leading `ok` for `fail` and add a remediation hint: + +```text +auth: + fail broker token is missing - run `crabbox login` +``` + +Skips swap `ok` for `skip` and explain why the check did not run: + +```text +network: + skip coordinator unconfigured (direct provider mode) +``` + +Exit code is `0` on full success, `2` on any failure. Skips do not change +the exit code. + +## Adding A Check + +Doctor checks live in `internal/cli/doctor.go`. Each check returns a +`doctorResult{ Status, Category, Subject, Detail, Remediation }`. The CLI +sorts results by category, then by subject, so output stays stable. + +Rules for new checks: + +- they must run in under ~100ms; +- they must not call out to a paid API or write any state; +- they must produce a `Remediation` string when they fail; +- they should `skip` (not `fail`) when the configuration genuinely does + not apply (e.g. SSH key check when `ssh.key` is unset). + +Tests in `doctor_test.go` exercise the result struct and ordering. Add a +test for the new check that asserts the failure message and remediation +text so future refactors do not silently regress the user-facing output. 
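A new check following these rules might look like the sketch below. The field names come from the `doctorResult` description above, but their string types, the `checkSSHKey` and `sortResults` names, the injected `readable` hook, and the category ranking are assumptions. In particular, ordering subjects lexically is a guess; the real code may use a different key.

```go
package main

import (
	"fmt"
	"sort"
)

// doctorResult mirrors the fields named in the docs; string types are assumed.
type doctorResult struct {
	Status, Category, Subject, Detail, Remediation string
}

// categoryRank follows the order in which the categories are listed (assumption).
var categoryRank = map[string]int{"config": 0, "auth": 1, "network": 2, "ssh": 3, "tools": 4}

// checkSSHKey skips when the key is unset and attaches a remediation on failure.
func checkSSHKey(keyPath string, readable func(string) bool) doctorResult {
	r := doctorResult{Category: "ssh", Subject: "ssh.key"}
	switch {
	case keyPath == "":
		r.Status, r.Detail = "skip", "ssh.key unset (per-lease keys will be used)"
	case !readable(keyPath):
		r.Status, r.Detail = "fail", "cannot read "+keyPath
		r.Remediation = "fix the path or permissions of ssh.key"
	default:
		r.Status, r.Detail = "ok", keyPath
	}
	return r
}

// sortResults keeps output stable: by category rank, then by subject.
func sortResults(results []doctorResult) {
	sort.SliceStable(results, func(i, j int) bool {
		ri, rj := categoryRank[results[i].Category], categoryRank[results[j].Category]
		if ri != rj {
			return ri < rj
		}
		return results[i].Subject < results[j].Subject
	})
}

func main() {
	results := []doctorResult{
		{Status: "ok", Category: "tools", Subject: "git"},
		checkSSHKey("", nil),
		{Status: "ok", Category: "config", Subject: "provider"},
	}
	sortResults(results)
	for _, r := range results {
		fmt.Printf("%-7s %-5s %s %s\n", r.Category, r.Status, r.Subject, r.Detail)
	}
}
```

Injecting `readable` keeps the check free of filesystem access in tests, which makes it straightforward to assert on the failure message and remediation text.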
+ +Related docs: + +- [doctor command](../commands/doctor.md) +- [Configuration](configuration.md) +- [Network and reachability](network.md) +- [SSH keys](ssh-keys.md) +- [Source map](../source-map.md) diff --git a/docs/features/env-forwarding.md b/docs/features/env-forwarding.md new file mode 100644 index 0000000..947dc97 --- /dev/null +++ b/docs/features/env-forwarding.md @@ -0,0 +1,155 @@ +# Environment Forwarding + +Read when: + +- adding a new env var that the remote command needs to see; +- debugging "why is `$CI` empty inside `crabbox run`?"; +- writing a repo config that lets agents set tunable values without flags; +- reviewing a PR that loosens or tightens the env allowlist. + +By default, `crabbox run` does not forward arbitrary local environment +variables to the remote command. Forwarding is opt-in and name-based: the +repo declares which variable names are allowed, and Crabbox forwards only +those that are present locally. + +## Why Allowlist + +Agents and CI environments run with rich and sometimes sensitive +environments: tokens, private credentials, terminal paths, vendor-specific +debug flags. Forwarding everything would: + +- leak secrets to remote runners; +- introduce non-determinism between local and CI runs; +- make it impossible to reason about what affects a remote command. + +Allowlist forwarding makes the contract explicit. The repo decides what +"counts" as input to the remote command, and the user can audit the +allowlist in `crabbox.yaml`. + +## Configuration + +```yaml +env: + allow: + - CI + - NODE_OPTIONS + - PROJECT_* +``` + +Rules: + +- entries are env var names, not values; +- a trailing `*` is a prefix wildcard (`PROJECT_*` matches `PROJECT_FOO`, + `PROJECT_BAR`); +- inline wildcards (`PROJECT_*_DEBUG`) are not supported; +- match is exact and case-sensitive; +- empty entries are ignored. 
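The matching rules above can be sketched as follows. This is an illustration under stated assumptions, not the Crabbox matcher: the `allowed` and `forward` names are hypothetical, and ignoring a bare `*` entry is a conservative choice made here because the rules do not say how it is treated.

```go
package main

import (
	"fmt"
	"strings"
)

// allowed reports whether name matches any allowlist entry.
// Only exact names and a single trailing "*" (prefix wildcard) are honored.
func allowed(name string, allow []string) bool {
	for _, entry := range allow {
		if entry == "" || entry == "*" {
			continue // empty entries are ignored; bare "*" skipped by assumption
		}
		if strings.HasSuffix(entry, "*") {
			prefix := strings.TrimSuffix(entry, "*")
			if strings.Contains(prefix, "*") {
				continue // inline wildcards are not supported
			}
			if strings.HasPrefix(name, prefix) {
				return true
			}
			continue
		}
		if strings.Contains(entry, "*") {
			continue // inline wildcard such as PROJECT_*_DEBUG
		}
		if entry == name { // exact, case-sensitive
			return true
		}
	}
	return false
}

// forward returns only the locally set variables that the allowlist permits.
func forward(local map[string]string, allow []string) map[string]string {
	out := map[string]string{}
	for name, value := range local {
		if allowed(name, allow) {
			out[name] = value
		}
	}
	return out
}

func main() {
	allow := []string{"CI", "NODE_OPTIONS", "PROJECT_*"}
	local := map[string]string{"CI": "true", "PROJECT_FOO": "value", "HOME": "/home/me"}
	fmt.Println(forward(local, allow))
}
```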
+ +The user-side override is `CRABBOX_ENV_ALLOW`, a comma-separated list: + +```sh +CRABBOX_ENV_ALLOW='CI,NODE_OPTIONS,PROJECT_*' crabbox run -- pnpm test +``` + +`CRABBOX_ENV_ALLOW` replaces the repo allowlist for that command rather than +appending to it. Use it for one-off tests; persistent allowances belong in +`env.allow`. + +## What Gets Forwarded + +For each env var in the allowlist, Crabbox checks whether the variable is +set locally. If it is, the variable is forwarded to the remote command with +the same name and value. If it is not set locally, nothing is forwarded - +Crabbox does not invent values. + +The remote command sees the variables as part of its environment when run +through SSH: + +```sh +ssh runner 'cd workdir && CI=true NODE_OPTIONS=--max_old_space_size=4096 pnpm test' +``` + +Quoting and escaping happen automatically. Values that contain shell +metacharacters are passed through safely. + +## Capability-Injected Env + +A small set of env vars is injected by Crabbox itself when the matching +capability is requested. These bypass the allowlist because Crabbox owns +them: + +```text +DISPLAY=:99 when --desktop +CRABBOX_DESKTOP=1 when --desktop +BROWSER= when --browser, after probe +CHROME_BIN= when --browser, after probe +CRABBOX_BROWSER=1 when --browser +``` + +User-allowed env vars override capability-injected ones if they overlap. +Repos that need a different `BROWSER` value can include `BROWSER` in +`env.allow` and set it locally. + +## Secrets + +Do not put secrets in `env.allow` even if forwarding seems convenient. +Secrets belong in: + +- the broker environment (Cloudflare Worker secrets) for provider + credentials; +- the operator's credential store (`op`, AWS Vault, etc.) for short-lived + tokens; +- per-runner image bake when the secret should be on every lease; +- post-bootstrap secret injection in repo-owned setup scripts (devcontainer, + mise, repo-controlled `bin/setup`). + +Crabbox forwards values it sees locally. 
If a secret leaks into the +allowlist, every run of every contributor will leak it. + +## Examples + +```yaml +env: + allow: + - CI # mark a remote command as CI-driven + - NODE_OPTIONS # adjust Node memory in test suites + - PYTEST_ADDOPTS # tune pytest flags from the local env + - PROJECT_* # repo's own debug knobs + - VITEST_* # let agents override vitest config + - DEBUG # `debug` package selector +``` + +Common things you usually do not allow: + +```text +HOME, USER, PATH, SHELL runner already has its own +SSH_* leaks SSH agent state +GITHUB_TOKEN use Actions hydration or runner setup +AWS_* use IAM roles or instance profile +*_API_KEY, *_TOKEN use a secret manager +``` + +## Inspecting Forwarding + +`crabbox run --debug` prints the set of env vars that were forwarded for +that invocation. Use it to verify that the allowlist matches expectations +before debugging "why does the remote command not see this variable?". + +```sh +$ crabbox run --debug -- env | grep '^PROJECT' +[crabbox] forwarding env: CI NODE_OPTIONS PROJECT_FOO PROJECT_BAR +PROJECT_FOO=value +PROJECT_BAR=other-value +``` + +Variables that match the allowlist but are unset locally are not in the +forwarded list, so the debug line is the source of truth for "what did the +remote command actually see". + +Related docs: + +- [Sync](sync.md) +- [Configuration](configuration.md) +- [run command](../commands/run.md) +- [Capabilities](capabilities.md) +- [Security](../security.md) diff --git a/docs/features/identifiers.md b/docs/features/identifiers.md new file mode 100644 index 0000000..72e99d8 --- /dev/null +++ b/docs/features/identifiers.md @@ -0,0 +1,199 @@ +# Identifiers + +Read when: + +- changing how Crabbox names leases, slugs, runs, or claims; +- debugging "why does `crabbox run --id` not find this lease?"; +- adding a new lookup form (alias, provider id, anything that resolves to a + lease). 
+ +Crabbox names every long-lived thing twice: once with a stable canonical ID +that machines compare, and once with a friendly slug that humans type. This +page lists the identifiers, where they come from, and how lookup resolves +across them. + +## Lease ID + +Canonical lease IDs look like: + +```text +cbx_abcdef123456 +``` + +The pattern is fixed: the literal `cbx_` prefix followed by 12 hex characters. +`isCanonicalLeaseID` enforces it as a regex; anything else is treated as a +slug or alias. + +The CLI mints a provisional lease ID before calling the broker. The broker +may return a different final ID (when the Worker dedupes a retried request, +for example); the CLI then moves the local SSH key directory and claim file +from the provisional ID to the final ID with `MoveStoredTestboxKey` and +re-keys references accordingly. + +Provider resources reference the lease ID through Crabbox labels: + +```text +crabbox-lease=cbx_abcdef123456 +``` + +That label is what `crabbox cleanup` and `crabbox list` use to map a provider +machine back to a Crabbox lease. + +## Slug + +Slugs are friendly, human-typeable lease names. They look like: + +```text +blue-lobster +amber-crab +silver-shrimp +``` + +Slugs are generated from a stable hash of the lease ID, so the same lease +always gets the same slug. The vocabulary is small (14 adjectives, 8 nouns) +because Crabbox is intentionally a small fleet. When a slug collides with an +existing active lease, `slugWithCollisionSuffix` appends a 4-hex suffix +keyed by the seed: + +```text +blue-lobster-1234 +``` + +The collision path is rare in normal use - a single user's active leases +rarely exceed the 14 × 8 = 112 unique base slugs. + +Slugs are normalized everywhere they are accepted. `normalizeLeaseSlug` keeps +only `[a-z0-9-]`, collapses runs of separators, and trims leading/trailing +dashes. `Blue_Lobster` and `BLUE-LOBSTER` resolve to `blue-lobster`. 
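The two identifier shapes described so far can be told apart with a pattern check plus a normalizer. The sketch below is illustrative only: `looksCanonical` and `normalizeSlugSketch` are stand-ins for `isCanonicalLeaseID` and `normalizeLeaseSlug`, whose real bodies are not shown here. Lowercase-only hex and treating every other character as a separator are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// canonicalLeaseID matches "cbx_" followed by 12 hex characters.
// Restricting to lowercase hex is an assumption of this sketch.
var canonicalLeaseID = regexp.MustCompile(`^cbx_[0-9a-f]{12}$`)

func looksCanonical(id string) bool {
	return canonicalLeaseID.MatchString(id)
}

// normalizeSlugSketch lowercases the input, keeps [a-z0-9], turns every run
// of other characters into a single dash, and trims dashes at both ends.
func normalizeSlugSketch(s string) string {
	var b strings.Builder
	lastDash := true // true at the start so leading separators are dropped
	for _, r := range strings.ToLower(s) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
			lastDash = false
			continue
		}
		if !lastDash {
			b.WriteByte('-')
			lastDash = true
		}
	}
	return strings.TrimRight(b.String(), "-")
}

func main() {
	for _, in := range []string{"cbx_abcdef123456", "Blue_Lobster", "BLUE-LOBSTER"} {
		fmt.Println(in, looksCanonical(in), normalizeSlugSketch(in))
	}
}
```

Anything that fails the canonical check would then be normalized and treated as a slug or alias.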
+ +## Provider Name + +Each managed lease also gets a per-provider resource name that includes the +slug and a hash of the lease ID, so the provider console shows useful names: + +```text +crabbox-blue-lobster-7f8a2c1d +``` + +That name is what shows up as the EC2 `Name` tag, the Hetzner server name, +and the Daytona sandbox name. It is derived from `leaseProviderName(leaseID, +slug)`; the function falls back to `crabbox-cbx-...` if the slug is empty. + +## Run ID + +Each `crabbox run` against a coordinator also gets a durable run handle: + +```text +run_abcdef123456 +``` + +A run is created before the lease is acquired so events can be appended for +leasing failures, sync failures, and command output even when the run never +reaches command-start. Run IDs are stable across a single invocation; +retrying the same command produces a new run. + +`crabbox history`, `crabbox events`, `crabbox attach`, `crabbox logs`, and +`crabbox results` all accept run IDs. Slugs do not resolve to runs - only to +leases. + +## Local Claims + +Reusable leases get a JSON claim file stored under the user state directory: + +```text +$XDG_STATE_HOME/crabbox/claims/cbx_abcdef123456.json +``` + +When `XDG_STATE_HOME` is not set, claims live next to user config in +`~/Library/Application Support/crabbox/state/claims` on macOS or +`~/.config/crabbox/state/claims` on Linux. + +The claim payload looks like: + +```json +{ + "leaseID": "cbx_abcdef123456", + "slug": "blue-lobster", + "provider": "aws", + "repoRoot": "/Users/steipete/Projects/openclaw", + "claimedAt": "2026-05-07T07:42:18Z", + "lastUsedAt": "2026-05-07T07:55:12Z", + "idleTimeoutSeconds": 1800 +} +``` + +Claims do three things: + +- bind a lease to one repo so wrappers and agents do not silently reuse a + lease against a different checkout; +- give `crabbox run --id blue-lobster` a slug-to-canonical-ID translation + without round-tripping the broker; +- power "is this lease still mine?" 
checks before destructive operations + (`stop`, `cleanup`, `actions register`). + +A conflicting claim (same lease, different repo) refuses commands by default; +`--reclaim` overrides the check and rewrites the claim atomically. + +Static SSH leases tag their claims with `provider: ssh` so the resolver knows +the lease bypasses the coordinator. Coordinator-backed claims leave +`provider` blank because the coordinator owns provider tracking. + +## SSH Key Storage + +Per-lease SSH key directories are keyed by lease ID: + +```text +~/.config/crabbox/testboxes/cbx_abcdef123456/id_ed25519 +~/.config/crabbox/testboxes/cbx_abcdef123456/id_ed25519.pub +~/.config/crabbox/testboxes/cbx_abcdef123456/known_hosts +``` + +The provisional → final lease ID move uses `os.Rename` on the directory so +the key, public key, and known_hosts file all migrate atomically. The +provider key name (`crabbox-cbx-abcdef123456`) is what the cloud account +sees. + +## Resolving An Identifier + +`crabbox --id ` accepts: + +- a canonical `cbx_...` lease ID; +- a normalized slug (`blue-lobster`, `Blue Lobster`, `BLUE_LOBSTER` all resolve + to the same lease); +- in coordinator mode, also the slug as known to the broker, regardless of + case. + +Resolution order: + +1. Read the local claim store for the literal identifier or any slug match + in `claims/`. +2. If a matching claim exists, use its `leaseID` as the canonical handle. +3. If no claim is found and a coordinator is configured, ask the coordinator + to resolve the identifier (slug or canonical ID). +4. For static SSH and direct-provider modes, fall back to the provider's + `Resolve` implementation (`SSHLeaseBackend.Resolve`). + +The first source that returns a hit wins. This is why `--id blue-lobster` +works from any directory once the warmup ran in some other repo - the local +claim translates slug to lease ID before the broker is involved. 
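The resolution order can be pictured as a chain of lookups in which the first hit wins. The sketch below is a simplification with hypothetical names (`lookup`, `resolve`, `fromMap`); it models an unconfigured coordinator as a nil source and uses in-memory maps in place of the claim store, the broker, and the provider backend.

```go
package main

import "fmt"

// lookup resolves an identifier to a canonical lease ID, if it knows it.
type lookup func(id string) (leaseID string, found bool)

// resolve consults sources in order and returns the first hit.
// Nil sources (for example, no coordinator configured) are skipped.
func resolve(id string, sources ...lookup) (string, bool) {
	for _, src := range sources {
		if src == nil {
			continue
		}
		if leaseID, found := src(id); found {
			return leaseID, true
		}
	}
	return "", false
}

// fromMap builds a lookup backed by an in-memory table (test double).
func fromMap(table map[string]string) lookup {
	return func(id string) (string, bool) {
		leaseID, ok := table[id]
		return leaseID, ok
	}
}

func main() {
	claims := fromMap(map[string]string{"blue-lobster": "cbx_abcdef123456"})
	var coordinator lookup // nil: direct mode, no coordinator configured
	provider := fromMap(map[string]string{})

	fmt.Println(resolve("blue-lobster", claims, coordinator, provider))
}
```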
+ +## Identifier Lifetime + +```text +provisional lease ID newLeaseID() call → broker returns final ID +final lease ID broker accepts → stored in claim, key dir, labels +slug computed on first lease creation, stable forever +provider name derived from lease ID + slug +run ID minted per crabbox run when a coordinator is configured +``` + +Slugs are not recycled. When a lease ends, the slug stays free for any future +lease that happens to hash to it; the small vocabulary makes that +collision-by-hash possible but rare in practice. + +Related docs: + +- [Coordinator](coordinator.md) +- [SSH keys](ssh-keys.md) +- [Lifecycle cleanup](lifecycle-cleanup.md) +- [Source map](../source-map.md) diff --git a/docs/features/network.md b/docs/features/network.md new file mode 100644 index 0000000..3c14d85 --- /dev/null +++ b/docs/features/network.md @@ -0,0 +1,195 @@ +# Network And Reachability + +Read when: + +- choosing between `--network auto`, `tailscale`, or `public`; +- debugging "Crabbox can SSH but my browser can't reach the desktop"; +- changing how Crabbox falls back between the public IP and the tailnet IP; +- adjusting SSH port fallbacks for restrictive operator networks. + +A Crabbox lease can be reachable through more than one network plane. +Brokered Linux leases can join a Tailscale tailnet, brokered AWS Windows and +EC2 Mac leases stay public, and static SSH targets can be on either depending +on how the operator configured them. The CLI picks one plane per command and +prints which it picked. + +## Modes + +```text +--network auto prefer tailnet when reachable, otherwise fall back to public +--network tailscale require tailnet reachability; fail otherwise +--network public ignore tailnet metadata and use the public address +``` + +`auto` is the default. It optimizes for "do not surprise me": prefer tailnet +when both client and runner are on the tailnet, fall back transparently to +the public path when the client is off-tailnet. 
+ +`tailscale` is the strict mode. Use it when you specifically want to verify +tailnet reachability or when the public IP is firewalled to a CI runner that +your local box cannot reach. + +`public` is the escape hatch. Use it when the tailnet metadata is stale, when +you are debugging public-network issues, or when the client cannot reach the +tailnet for unrelated reasons. + +The mode applies to `crabbox ssh`, `crabbox run`, `crabbox vnc`, and +`crabbox webvnc`. `crabbox status --network auto` also resolves through this +path so the printed address matches what later commands will use. + +## How `auto` Picks A Plane + +For a lease with tailnet metadata, `auto` mode: + +1. reads `tailscale_fqdn`, `tailscale_ipv4`, and `tailscale_hostname` from the + server labels; +2. probes the first non-empty option over SSH with a 5-second TCP transport + probe; +3. uses that target if the probe succeeds; +4. falls back to the public IP and prints `network=public` with the reason + `tailscale_unreachable`. + +For a lease with no tailnet metadata, `auto` is just public mode. + +Static SSH targets behave the same way when the static host name is a +MagicDNS or `100.x` address. If the operator points `static.host` at a +MagicDNS name, `--network tailscale` works without any other configuration - +the address is already on the tailnet. + +## Public Reachability + +Brokered AWS Linux, AWS Windows, AWS Mac, Hetzner Linux, Daytona, and Islo +leases all expose at least one public address. Crabbox stores the public +address on the server record and uses it whenever the network mode resolves +to `public`. + +Public addresses are gated by the provider's security group / firewall. AWS +managed leases use the `crabbox-runners` security group with SSH ingress +limited to the configured CIDRs or the request source IP. Hetzner managed +leases use the cloud firewall attached to the project; the broker keeps it +limited to the operator's IPs. 
+ +If your client IP changes during a long warmup, the existing security group +rule may not include the new IP. Re-running `crabbox status` adds the +current IP back and updates the rule. + +## Tailnet Reachability + +When a managed Linux lease is created with `--tailscale`, cloud-init: + +- installs the Tailscale package; +- joins the tailnet with the configured tags (default `tag:crabbox`); +- writes non-secret metadata to `/var/lib/crabbox/tailscale-*`; +- extends `crabbox-ready` with a bounded check that a `100.x` address has + been assigned; +- discards the auth key after `tailscale up` so it never persists. + +The metadata Crabbox stores on the lease record: + +```text +tailscale=true +tailscale_hostname=blue-lobster +tailscale_fqdn=blue-lobster.tail-scale.ts.net +tailscale_ipv4=100.64.0.5 +tailscale_state=ok +tailscale_tags=tag:crabbox +tailscale_exit_node=... +tailscale_exit_node_allow_lan_access=true|false +``` + +Brokered leases get a one-shot auth key minted by the Worker via Tailscale +OAuth (`worker/src/tailscale.ts`). Direct-provider leases use a key from +`CRABBOX_TAILSCALE_AUTH_KEY`. The auth key is never stored on the runner. + +When the metadata says the lease is on the tailnet but the client cannot +reach it, the most common reasons are: + +- the client is not joined to the tailnet (`tailscale status` on the client); +- ACLs block the tag pair from reaching `100.x`; +- the runner's `tailscaled` process died (rare; readiness probes catch it + before the lease is handed back). + +`crabbox status --id --network tailscale` is the fastest way to test +tailnet reachability after lease creation. + +## SSH Port And Fallback + +Crabbox runs SSH on a non-standard port by default to keep noise out of the +provider firewall logs: + +```yaml +ssh: + port: "2222" + fallbackPorts: + - "22" +``` + +`ssh.port` is the primary port the bootstrap binds to. 
`ssh.fallbackPorts` is +an ordered list of additional ports the CLI will try when the primary port +is unreachable - typically because the operator's egress is restricted, the +sshd has not bound the new port yet, or cloud-init is still mid-flight. + +Fallback rules: + +- the CLI tries primary first, then each fallback in order; +- the first port that opens a TCP connection wins for that command; +- success is sticky for the run; the next command repeats the probe; +- the CLI prints `ssh-port-fallback=22` when fallback was used. + +Set `ssh.fallbackPorts: []` or `CRABBOX_SSH_FALLBACK_PORTS=none` to disable +fallback entirely. Some networks prefer this so a misconfigured `2222` rule +fails loud instead of quietly using `22`. + +## Loopback-Bound Capabilities + +Lease capabilities (desktop, code) are bound to loopback on purpose so they +do not need provider firewall changes: + +```text +VNC 127.0.0.1:5900 reached via SSH tunnel +code-server 127.0.0.1:8080 reached via portal bridge +``` + +The network mode does not change loopback bindings. `--network` only changes +which interface the SSH tunnel or portal bridge uses to talk to the lease. +Loopback is loopback; it is reachable from the runner regardless. + +## Static Hosts + +Static SSH targets honor the same modes: + +- `--network public` uses `static.host` as configured; +- `--network tailscale` requires `static.host` to be a MagicDNS name or + `100.x` address, then probes for SSH reachability; +- `--network auto` defers to the resolved address: if `static.host` is on + the tailnet, that is what `auto` uses; otherwise it is public. + +Tailscale-managed bootstrap (`--tailscale`) is rejected for static providers. +Static hosts are operator-owned; Crabbox does not install Tailscale on them. +Set `static.host` to a tailnet address and select `--network tailscale` +explicitly. 
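
The static-host rules above can be sketched in Go as follows. This is a hedged illustration rather than the CLI's real code: `onTailnet` and `pickPlane` are hypothetical names, the error strings are illustrative, and the tailnet test is an assumption (an IP inside the `100.64.0.0/10` range that Tailscale assigns from, or a name ending in `.ts.net`). A short MagicDNS name without that suffix would need a different check.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// tailnetRange is the CGNAT block that Tailscale node addresses come from.
var tailnetRange = netip.MustParsePrefix("100.64.0.0/10")

// onTailnet reports whether host looks like a tailnet address (assumed rule).
func onTailnet(host string) bool {
	if addr, err := netip.ParseAddr(host); err == nil {
		return tailnetRange.Contains(addr)
	}
	return strings.HasSuffix(strings.ToLower(host), ".ts.net")
}

// pickPlane maps a --network mode and a static host to a network plane.
// sshReachable is a caller-supplied probe; it is only consulted in strict mode.
func pickPlane(mode, host string, sshReachable func(string) bool) (string, error) {
	switch mode {
	case "public":
		return "public", nil
	case "tailscale":
		if !onTailnet(host) {
			return "", fmt.Errorf("static host %s is not a tailnet address", host)
		}
		if !sshReachable(host) {
			return "", fmt.Errorf("static host %s is not reachable over SSH", host)
		}
		return "tailscale", nil
	case "auto":
		// auto defers to the resolved address and never fails here.
		if onTailnet(host) {
			return "tailscale", nil
		}
		return "public", nil
	default:
		return "", fmt.Errorf("unknown network mode %q", mode)
	}
}

func main() {
	reachable := func(string) bool { return true }
	for _, host := range []string{"100.64.0.5", "mac-studio.local"} {
		plane, err := pickPlane("auto", host, reachable)
		fmt.Println(host, plane, err)
	}
}
```

Keeping the probe as a parameter lets the strict mode fail loudly while `auto` stays probe-free for static hosts, matching the bullets above.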
+ +## Failure Surface + +When a network mode cannot be satisfied, the CLI exits with code 5 and a +message that names the mode and the lease: + +```text +network=tailscale requested but lease cbx_... has no tailnet address +network=tailscale requested for static host mac-studio but SSH is not reachable +network=tailscale requested but blue-lobster.tail-scale.ts.net is not reachable over SSH +``` + +`auto` mode never fails on a tailnet probe; it falls back to public and +records the reason. The `network=public reason=tailscale_unreachable` log +line is the diagnostic signal that the tailnet plane is unhealthy even +though the command kept working. + +Related docs: + +- [Tailscale](tailscale.md) +- [Runner bootstrap](runner-bootstrap.md) +- [SSH keys](ssh-keys.md) +- [vnc command](../commands/vnc.md) +- [ssh command](../commands/ssh.md) +- [doctor command](../commands/doctor.md) diff --git a/docs/features/openclaw-plugin.md b/docs/features/openclaw-plugin.md new file mode 100644 index 0000000..4432995 --- /dev/null +++ b/docs/features/openclaw-plugin.md @@ -0,0 +1,165 @@ +# OpenClaw Plugin + +Read when: + +- enabling Crabbox as a plugin inside OpenClaw; +- changing the plugin tools, schema, or wrapper behavior; +- understanding why some Crabbox surfaces are CLI-only and not plugin tools. + +The Crabbox repository root is also a native OpenClaw plugin package. When +OpenClaw loads the plugin, it exposes a small set of agent tools that shell +out to the user's installed `crabbox` binary. The plugin does not embed the +CLI or duplicate any of its logic - it is a thin contract for safe, allowlisted +invocations. 
+ +## Plugin Manifest + +`openclaw.plugin.json` declares the plugin id, the tools it owns, and the +config schema: + +```json +{ + "id": "crabbox", + "name": "Crabbox", + "description": "Run Crabbox remote testbox checks from OpenClaw.", + "activation": { "onStartup": true }, + "contracts": { + "tools": [ + "crabbox_run", + "crabbox_warmup", + "crabbox_status", + "crabbox_list", + "crabbox_stop" + ] + }, + "configSchema": { ... } +} +``` + +The runtime entrypoint is `index.js`. Tests in `index.test.js` lock the tool +schemas, argv shapes, output trimming, and config validation so a future +refactor cannot silently change the agent-facing contract. + +## Tools + +```text +crabbox_run run a command on a leased remote box +crabbox_warmup acquire a warm box for repeated commands +crabbox_status query a lease's state +crabbox_list list visible leases for the current owner/org +crabbox_stop stop a lease and release its provider resources +``` + +Each tool accepts an argv array of `string` plus an optional `env` object of +string values. The plugin enforces these as JSON schema before invoking the +binary, so an agent cannot pass arbitrary shell commands or non-string env +values. + +`crabbox_run`, `crabbox_warmup`, and `crabbox_stop` can be disabled per +install by setting `allowRun`, `allowWarmup`, or `allowStop` to `false` in +plugin config. `crabbox_status` and `crabbox_list` are read-only and always +allowed. + +## Config + +The plugin accepts only six config keys, all optional: + +```json +{ + "binary": "crabbox", + "maxOutputBytes": 60000, + "timeoutSeconds": 1800, + "allowRun": true, + "allowWarmup": true, + "allowStop": true +} +``` + +| Key | Default | Effect | +|:----|:--------|:-------| +| `binary` | `crabbox` | Path to the Crabbox binary. Set when the binary is not on PATH. | +| `maxOutputBytes` | 60000 | Max captured stdout/stderr returned to the model per call. | +| `timeoutSeconds` | 1800 | Default wrapper timeout for a Crabbox CLI invocation. 
| +| `allowRun` | true | Gate `crabbox_run`. | +| `allowWarmup` | true | Gate `crabbox_warmup`. | +| `allowStop` | true | Gate `crabbox_stop`. | + +Crabbox config (broker URL, provider, token, profile, class) lives in the +user/repo config files. The plugin does not duplicate those keys; it inherits +them from whatever `crabbox config show` would return for the agent's +working directory. + +## Output Handling + +The plugin captures stdout and stderr separately, trims each to +`maxOutputBytes`, and reports the exit code, the trimmed bytes, and a +truncation flag back to the model. Truncated output gets a tail marker so +agents know they did not get the full transcript: + +```text +... [output truncated; 12345 of 87654 bytes shown] +``` + +Long-running tools still respect `timeoutSeconds`. When the wrapper times +out, the plugin sends SIGTERM, waits a short grace period, then escalates to +SIGKILL. The exit code in the response reflects the wrapper outcome, not the +inner remote command. + +## What Belongs In The CLI Instead + +History, log inspection, attach, results, usage, and admin operations are +intentionally not plugin tools. They are best run from a shell-capable agent: + +```sh +crabbox history --lease cbx_... +crabbox events run_... --after 0 --limit 50 +crabbox attach run_... +crabbox logs run_... +crabbox results run_... +crabbox usage --scope user +crabbox admin leases --state active +crabbox cleanup --dry-run +``` + +Reasons for keeping these out of the plugin: + +- they often produce more output than `maxOutputBytes` can usefully capture; +- agents tend to want raw logs they can grep, not trimmed model output; +- admin tools are easier to gate at the shell level (env, allowlists) than + through plugin config; +- `crabbox attach` is interactive by design. 
+ +## Provider Allowlist + +The plugin schema constrains the `provider` argument to the providers +Crabbox actually supports: + +```text +aws | hetzner | ssh | blacksmith-testbox | blacksmith | daytona | islo +``` + +Adding a provider to the CLI requires updating this list in `index.js` and +the test fixture in `index.test.js`. The schema is the agent-facing contract; +without the update, the new provider would be rejected by JSON validation +before reaching the binary. + +## When To Update + +Edit the plugin when you: + +- add or remove a provider; +- add a new agent-safe tool (read-only, owner-scoped, bounded output); +- change argv conventions across all `crabbox` commands (rare); +- update default timeouts or output budgets. + +Run `node --test index.test.js` after every change. The tests exercise the +schema, argv handling, and output trimming end-to-end. + +Related docs: + +- [docs/README.md](../README.md) - top-level overview includes the plugin. +- [Source map](../source-map.md) - `package.json`, `openclaw.plugin.json`, + `index.js`, `index.test.js`. +- [run command](../commands/run.md) - what `crabbox_run` ultimately invokes. +- [warmup command](../commands/warmup.md) - what `crabbox_warmup` invokes. +- [stop command](../commands/stop.md) - what `crabbox_stop` invokes. diff --git a/docs/getting-started.md b/docs/getting-started.md new file mode 100644 index 0000000..9969488 --- /dev/null +++ b/docs/getting-started.md @@ -0,0 +1,232 @@ +# Getting Started + +Read when: + +- you are new to Crabbox and want a working `run` in 10 minutes; +- you are evaluating Crabbox for a repo and want to see the shape; +- you want a reference for what a typical onboarding looks like. + +This is a cookbook, not a reference. It walks through one repo end to end, +from install to `crabbox run -- pnpm test`. For deeper coverage, follow the +links in each step. + +## Step 1. 
Install + +```sh +brew install openclaw/tap/crabbox +``` + +Verify the install: + +```sh +crabbox --version +crabbox doctor +``` + +`crabbox doctor` should print `ok` for `tools` (git, rsync, ssh, +ssh-keygen). It is fine if `auth` and `network` are still missing - we set +those next. + +If you do not have Homebrew, GitHub Releases ship signed tarballs for macOS, +Linux, and Windows. Download the matching archive from +. + +## Step 2. Log In + +```sh +crabbox login +``` + +`login` opens a browser to the GitHub OAuth flow. The broker exchanges the +OAuth code, verifies your GitHub org membership, and writes a signed token +to your user config. From then on, every `crabbox` command authenticates +automatically. + +```sh +crabbox whoami +``` + +Confirms the resolved owner, org, broker URL, and selected provider. + +If you are running Crabbox in a CI environment that cannot open a browser, +use shared-token auth: + +```sh +printf '%s' "$TOKEN" | crabbox login \ + --url https://crabbox.openclaw.ai \ + --provider aws \ + --token-stdin +``` + +See [Auth and admin](features/auth-admin.md) for the full identity model. + +## Step 3. Onboard A Repo + +Inside the repo: + +```sh +crabbox init +``` + +`init` writes three files: + +```text +.crabbox.yaml repo defaults (profile, class, sync, env) +.github/workflows/crabbox.yml Actions hydration stub (optional) +.agents/skills/crabbox/SKILL.md agent-facing skill instructions +``` + +Open `.crabbox.yaml` and fill in: + +- `profile`: a name for this lane (e.g. `project-check`); +- `class`: `standard`, `fast`, `large`, or `beast`; +- `sync.exclude`: directories that should not be sent to the runner; +- `env.allow`: env vars the remote command should see. + +Then run: + +```sh +crabbox sync-plan +``` + +`sync-plan` previews what would be sent: file count, total bytes, the +biggest files. If it shows surprises (a `dist/` folder, a `.cache/` you +forgot, a 2 GiB asset), tighten `sync.exclude` and re-run. 
The first sync +to a fresh runner is bound by this size. + +## Step 4. Warm A Box + +```sh +crabbox warmup +``` + +Warmup acquires a lease through the broker, provisions the runner, +bootstraps SSH and tooling, and prints a slug + lease ID: + +```text +leased cbx_abcdef123456 slug=blue-lobster provider=aws server=i-0123 type=c7a.48xlarge ip=203.0.113.10 idle_timeout=30m0s expires=2026-05-07T17:30:00Z +``` + +The lease is now waiting for commands. Idle timeout (default 30m) and TTL +(default 90m) bound how long it lives before the broker reclaims it. + +## Step 5. Run A Command + +```sh +crabbox run --id blue-lobster -- pnpm test +``` + +What happens: + +1. The CLI verifies SSH readiness on the lease. +2. It seeds remote Git from your origin/base ref, then rsyncs the dirty + working tree. +3. It runs the command over SSH, streaming stdout/stderr. +4. It heartbeats the broker so the lease does not idle out mid-test. +5. It records a `run_...` history entry with sync time, command time, exit + code, and (for Linux) bounded telemetry samples. + +You can omit `--id` for a one-shot run: + +```sh +crabbox run -- pnpm test +``` + +That acquires a fresh lease, runs the command, and releases the lease when +the command exits. Use this for ad-hoc tests; use `warmup` + `--id` for +iterative work. + +## Step 6. Inspect History + +```sh +crabbox history +crabbox events run_abcdef123456 +crabbox logs run_abcdef123456 +crabbox results run_abcdef123456 +``` + +`history` lists recent runs for the lease or owner. `events` prints ordered +events (lease, sync, command, output chunks, finish). `logs` returns the +retained command output. `results` parses any JUnit reports the run +attached. + +`/portal/runs/run_abcdef123456` renders the same data as a browser page if +you prefer a UI. + +## Step 7. Stop The Lease + +When you are done: + +```sh +crabbox stop blue-lobster +``` + +Stop releases the lease, deletes the provider machine, removes the local +claim, and frees reserved cost. 
If you forget, the broker idle alarm +releases the lease automatically. + +```sh +crabbox cleanup --dry-run +``` + +`cleanup` is a sweep for direct-provider leftovers. It refuses to run when +a coordinator is configured because brokered cleanup is the alarm's job. + +## Common Variations + +Use a kept lease across days: + +```sh +crabbox warmup --idle-timeout 4h --ttl 8h +crabbox run --id blue-lobster -- pnpm test +crabbox run --id blue-lobster -- pnpm bench +crabbox stop blue-lobster +``` + +Open a desktop session: + +```sh +crabbox warmup --desktop +crabbox vnc --id blue-lobster --open +``` + +Open a code-server tab: + +```sh +crabbox warmup --code +crabbox code --id blue-lobster --open +``` + +Use a Mac Studio you already own: + +```yaml +# .crabbox.yaml +provider: ssh +target: macos +static: + host: mac-studio.local + user: steipete + port: "22" + workRoot: /Users/steipete/crabbox +``` + +```sh +crabbox run -- xcodebuild test +``` + +Use AWS instead of the configured default: + +```sh +crabbox run --provider aws --class beast -- pnpm test +``` + +## Where To Go Next + +- [How Crabbox Works](how-it-works.md) - the mental model. +- [CLI](cli.md) - the full command surface and exit codes. +- [Commands](commands/README.md) - one page per command. +- [Features](features/README.md) - one page per feature. +- [Configuration](features/configuration.md) - YAML schema and precedence. +- [Providers](features/providers.md) - which provider to pick. +- [Provider authoring](features/provider-authoring.md) - add a new provider. +- [Troubleshooting](troubleshooting.md) - what to do when a step fails.