Add Azure as a managed provider for direct and brokered Crabbox leases. - provision Azure Linux VMs with cloud-init, spot fallback, shared network adoption, and per-lease cleanup - provision native Azure Windows VMs with VM Agent bootstrap and SSH/sync/run support - add Azure broker support in the Cloudflare Worker, provider config, docs, and tests - fix async Azure delete handling so successful 202 delete LROs do not refetch deleted resources - keep Go core coverage above the CI threshold Verified with CI plus live Azure Linux and native Windows leases. Co-authored-by: Jonathan Moss <2729151+jwmoss@users.noreply.github.com>
293 lines
16 KiB
Markdown
293 lines
16 KiB
Markdown
# 🦀 📦 Crabbox
|
|
|
|
[](https://github.com/openclaw/crabbox/actions/workflows/ci.yml)
|
|
[](https://github.com/openclaw/crabbox/actions/workflows/release.yml)
|
|
[](https://github.com/openclaw/crabbox/releases/latest)
|
|
|
|
**Warm a box, sync the diff, run the suite.**
|
|
|
|
Crabbox is an open-source remote testbox runner for maintainers and AI agents. Lease fast managed cloud capacity, or point at an existing SSH host, sync your dirty checkout, run a command remotely, stream output, and release. Local edit-save-run loop, cloud-grade compute.
|
|
|
|
```sh
|
|
crabbox run -- pnpm test
|
|
```
|
|
|
|
Behind that single command: a Go CLI on your laptop, a Cloudflare Worker broker that owns provider credentials and lease state, and a managed runner on Hetzner Cloud, AWS EC2, or Azure. Azure supports managed Linux and native Windows VMs. Crabbox can also wrap Blacksmith Testboxes when you choose `provider: blacksmith-testbox`, use Daytona or Islo sandboxes for direct-provider workflows, or use `provider: ssh` for existing macOS and Windows targets.
|
|
|
|
---
|
|
|
|
## Install
|
|
|
|
```sh
|
|
brew install openclaw/tap/crabbox
|
|
crabbox --version
|
|
```
|
|
|
|
No Homebrew? Grab a [GoReleaser archive](https://github.com/openclaw/crabbox/releases) for macOS, Linux, or Windows.
|
|
|
|
Prerequisites on the laptop: `git`, `ssh`, `ssh-keygen`, `rsync`, `curl`.
|
|
|
|
## Quick start
|
|
|
|
```sh
|
|
# log in once per machine (stores a broker token in user config)
|
|
crabbox login
|
|
|
|
# verify local prerequisites and broker reachability
|
|
crabbox doctor
|
|
|
|
# one-shot: lease, sync, run, release
|
|
crabbox run -- pnpm test
|
|
|
|
# or warm a box once, then reuse it
|
|
crabbox warmup # prints cbx_... + a slug
|
|
crabbox run --id blue-lobster -- pnpm test:changed
|
|
crabbox ssh --id blue-lobster
|
|
crabbox stop blue-lobster
|
|
```
|
|
|
|
Every lease has a stable `cbx_...` ID and a friendly crustacean slug (`blue-lobster`, `swift-hermit`, …). Either works wherever an `--id` is accepted.
|
|
|
|
## How it works
|
|
|
|
```text
|
|
your laptop Cloudflare Worker cloud provider
|
|
------------- ------------------ --------------
|
|
crabbox CLI -- HTTPS --> Fleet Durable Object --> Hetzner / AWS EC2 / Azure
|
|
| lease + cost state |
|
|
| |
|
|
+------------ SSH + rsync to leased runner <--------------+
|
|
```
|
|
|
|
- **CLI** — Go binary. Loads config, mints a per-lease SSH key, asks the broker for a lease, waits for SSH, seeds remote Git, rsyncs the dirty checkout (with fingerprint skip when nothing changed), runs the command, streams output, releases.
|
|
- **Broker** — Cloudflare Worker at `crabbox.openclaw.ai` plus a single Durable Object. Owns provider credentials, serializes lease state, enforces active-lease and monthly spend caps, and expires stale leases by alarm. Auth is GitHub login or a shared bearer token.
|
|
- **Runner** — a throwaway SSH machine prepared with SSH on the primary port, default `2222`, plus configured fallback ports and Crabbox's sync/run prerequisites. Linux uses Ubuntu with cloud-init and `/work/crabbox`; native Windows uses OpenSSH, Git for Windows, and `C:\crabbox`. No broker credentials live on the box. Project runtimes (Go, Node, Docker, services, secrets) come from your repo's GitHub Actions hydration, devcontainer, Nix, mise/asdf, or setup scripts — not from Crabbox.
|
|
|
|
A direct-provider mode (`--provider hetzner|aws|azure` with local credentials) exists for debugging the broker itself; the brokered path is the default.
|
|
|
|
For the full mental model, see [How Crabbox Works](docs/how-it-works.md). For the doc-to-code map, see [Source Map](docs/source-map.md).
|
|
|
|
## Highlights
|
|
|
|
- **One-shot or warm.** `crabbox run` for fire-and-forget; `crabbox warmup` + `--id` for repeated runs against the same box.
|
|
- **Run observability.** Every coordinator-backed run gets an early `run_...` handle. Use `crabbox attach <run-id>` while it is active, `crabbox events <run-id> --after <seq> --limit <n>` for durable lifecycle/output events, and `crabbox logs <run-id>` for retained output after completion.
|
|
- **Stable timing records.** `--timing-json` on `run`, `warmup`, and `actions hydrate` gives scripts one machine-readable sync/command/total timing schema across AWS, Hetzner, and Blacksmith Testboxes.
|
|
- **Local-first sync.** No clean-checkout requirement. Tracked + nonignored files only, fingerprint skip on no-op runs, sanity checks against suspicious mass deletions, optional shallow base-ref hydration for changed-test workflows.
|
|
- **Brokered cloud.** Maintainers and agents share infra without sharing provider tokens. Hetzner, AWS EC2, and Azure are managed providers; AWS also owns Windows WSL2 and EC2 Mac targets. Linux defaults to Spot unless capacity config says otherwise. Providers fall back across compatible instance families when capacity or quota rejects a request.
|
|
- **Azure Linux and native Windows.** `provider: azure` provisions Linux and native Windows VMs in a configurable Azure subscription using `DefaultAzureCredential` in direct mode or service-principal secrets in the broker. Crabbox creates a shared resource group, vnet, subnet, and NSG on first use, then per-lease public IPs, NICs, and VMs. Linux uses cloud-init; Windows uses VM Agent Custom Script Extension to install OpenSSH/Git and configure the Crabbox user.
|
|
- **macOS and Windows static hosts.** `provider: ssh` reuses existing machines; it does not create macOS or Windows Crabbox boxes. macOS and Windows WSL2 use the POSIX rsync path; native Windows uses PowerShell plus tar archive sync.
|
|
- **Blacksmith Testbox wrapper.** Set `provider: blacksmith-testbox` to delegate warmup/run/list/status/stop to the Blacksmith CLI while Crabbox keeps local slugs, repo claims, timing summaries, config conventions, and portal visibility for active external runners.
|
|
- **Daytona and Islo sandboxes.** Set `provider: daytona` for Daytona SDK/toolbox execution from a snapshot with explicit SSH access when needed, or `provider: islo` for delegated Islo sandbox execution through the Islo Go SDK.
|
|
- **Trusted AWS images.** Operators can create AMIs from active brokered AWS leases and promote a known-good image as the coordinator default.
|
|
- **Cost guardrails.** Per-lease and monthly spend caps. Live pricing from EC2 Spot history or Hetzner server-type prices, with static fallbacks. `crabbox usage` summarizes spend by user, org, provider, and type.
|
|
- **GitHub Actions hydration.** `crabbox actions hydrate` registers a leased box as an ephemeral Actions runner, so the repo's own workflow installs runtimes, services, and secrets. Crabbox does not parse Actions YAML.
|
|
- **Interactive desktop and browser leases.** `--browser` provisions Chrome or Chromium for headless automation, `--desktop` provisions visible UI with tunnel-only VNC takeover on managed Linux, AWS native Windows, and AWS EC2 Mac targets. `crabbox desktop doctor` checks session, VNC, input tooling, browser, ffmpeg, screen size, screenshot capture, and WebVNC portal state; `desktop click/paste/type/key` provide first-class input helpers so agents do not hand-roll brittle `xdotool` snippets. QA systems such as Mantis own scenario logic, screenshots, and PR evidence. Azure native Windows is SSH/sync/run only; use AWS for managed Windows desktop/WSL2 or `provider: ssh` for an existing Windows host.
|
|
- **Authenticated web portal.** Browser login opens owner-scoped and explicitly shared lease/run views with searchable, paginated tables, muted external-runner rows, compact provider/OS/access icons, relative sortable times, recent run logs/events, WebVNC, code-server, and Linux lease/run telemetry charts. `crabbox share` can grant a lease to one user or the owning org, and the lease page exposes the same sharing controls for owners/managers. WebVNC is preferred for human demos because it preloads the VNC password; `webvnc status` reports local daemon, tunnel, target reachability, bridge/viewer state, recent events, URL/password, and native VNC fallback, while `webvnc reset` restarts only the selected lease's WebVNC/input stack. Admin sessions can also see non-owned runner leases behind `mine`/`system` filters.
|
|
- **Hardened coordinator auth.** GitHub browser login, owner-scoped leases, admin-only routes, optional GitHub team allowlists, Cloudflare Access JWT verification, and service-token support keep normal use and operator automation separate.
|
|
- **OpenClaw plugin.** The repo root is a native OpenClaw plugin for box lifecycle operations: `crabbox_run`, `crabbox_warmup`, `crabbox_status`, `crabbox_list`, and `crabbox_stop`. Run inspection stays in the CLI and Crabbox skill.
|
|
- **Operator surface.** `doctor`, `init`, `status`, `inspect`, `list`, `usage`, `history`, `logs`, `results`, `cache`, `admin`, `cleanup`, plus `--json` output where it matters.
|
|
|
|
## Machine classes
|
|
|
|
`beast` is the default. Both providers fall back across an ordered list of instance types.
|
|
|
|
```text
|
|
Hetzner standard ccx33, cpx62, cx53
|
|
fast ccx43, cpx62, cx53
|
|
large ccx53, ccx43, cpx62, cx53
|
|
beast ccx63, ccx53, ccx43, cpx62, cx53
|
|
|
|
AWS Linux standard c7a/c7i/m7a/m7i.8xlarge family
|
|
fast …16xlarge family
|
|
large …24xlarge family
|
|
beast …48xlarge family, falling back to 32x/24x/16x
|
|
|
|
AWS Win standard m7i.large, m7a.large, t3.large
|
|
fast m7i.xlarge, m7a.xlarge, t3.xlarge
|
|
large m7i.2xlarge, m7a.2xlarge, t3.2xlarge
|
|
beast m7i.4xlarge, m7a.4xlarge, m7i.2xlarge
|
|
|
|
AWS WSL2 standard m8i.large, m8i-flex.large, c8i.large, r8i.large
|
|
fast m8i.xlarge, m8i-flex.xlarge, c8i.xlarge, r8i.xlarge
|
|
large m8i.2xlarge, m8i-flex.2xlarge, c8i.2xlarge, r8i.2xlarge
|
|
beast m8i.4xlarge, m8i-flex.4xlarge, c8i.4xlarge, r8i.4xlarge, m8i.2xlarge
|
|
|
|
AWS macOS all mac2.metal unless --type is set
|
|
|
|
Azure standard Standard_D32ads_v6, Standard_D32ds_v6, Standard_F32s_v2, then 16-vCPU fallbacks
|
|
fast Standard_D64ads_v6, Standard_D64ds_v6, Standard_F64s_v2, then 48/32-vCPU fallbacks
|
|
large Standard_D96ads_v6, Standard_D96ds_v6, then 64/48-vCPU fallbacks
|
|
beast Standard_D192ds_v6, Standard_D128ds_v6, then 96/64-vCPU fallbacks
|
|
|
|
Azure Win standard Standard_D2ads_v6, Standard_D2ds_v6, Standard_D2ads_v5, Standard_D2ds_v5, Standard_D2as_v6
|
|
fast Standard_D4ads_v6, Standard_D4ds_v6, Standard_D4ads_v5, Standard_D4ds_v5, Standard_D4as_v6
|
|
large Standard_D8ads_v6, Standard_D8ds_v6, Standard_D8ads_v5, Standard_D8ds_v5, Standard_D8as_v6
|
|
beast Standard_D16ads_v6, Standard_D16ds_v6, Standard_D16ads_v5, Standard_D16ds_v5, Standard_D8ads_v6
|
|
```
|
|
|
|
Override with `--type` or `CRABBOX_SERVER_TYPE` for a specific instance.
|
|
|
|
## Configuration
|
|
|
|
Config resolves in order: flags → env → repo `.crabbox.yaml` → user `~/.config/crabbox/config.yaml` → defaults.
|
|
|
|
```yaml
|
|
broker:
|
|
url: https://crabbox.openclaw.ai
|
|
provider: aws
|
|
token: ...
|
|
class: beast
|
|
capacity:
|
|
market: spot
|
|
strategy: most-available
|
|
fallback: on-demand-after-120s
|
|
hints: true
|
|
aws:
|
|
region: eu-west-1
|
|
rootGB: 400
|
|
lease:
|
|
idleTimeout: 30m
|
|
ttl: 90m
|
|
ssh:
|
|
key: ~/.ssh/id_ed25519
|
|
user: crabbox
|
|
port: "2222"
|
|
# Ordered fallback ports tried after ssh.port; use [] to disable fallback.
|
|
fallbackPorts:
|
|
- "22"
|
|
```
|
|
|
|
Optional Blacksmith Testbox wrapper:
|
|
|
|
```yaml
|
|
provider: blacksmith-testbox
|
|
blacksmith:
|
|
org: openclaw
|
|
workflow: .github/workflows/ci-check-testbox.yml
|
|
job: test
|
|
ref: main
|
|
idleTimeout: 90m
|
|
```
|
|
|
|
`crabbox list --provider blacksmith-testbox` also refreshes muted external
|
|
runner rows in the portal lease table from the current all-status Testbox list
|
|
when coordinator auth is configured. When GitHub is reachable, Crabbox also
|
|
links those rows back to the inferred Actions run and workflow, surfaces the
|
|
Actions status/conclusion, flags long-queued or long-running rows as `stuck`,
|
|
and exposes a copyable local `crabbox stop --provider blacksmith-testbox ...`
|
|
command. Clicking an external row opens a visibility-only runner detail page
|
|
with owner, workflow, timestamps, boundary notes, and the same stop command.
|
|
Those rows are visibility-only records for Blacksmith-owned Testboxes, not
|
|
Crabbox leases.
|
|
|
|
Optional Daytona sandbox:
|
|
|
|
```yaml
|
|
provider: daytona
|
|
daytona:
|
|
snapshot: crabbox-ready
|
|
workRoot: /home/daytona/crabbox
|
|
```
|
|
|
|
Optional Islo sandbox:
|
|
|
|
```yaml
|
|
provider: islo
|
|
islo:
|
|
image: docker.io/library/ubuntu:24.04
|
|
workdir: crabbox
|
|
```
|
|
|
|
Optional static macOS or Windows target:
|
|
|
|
```yaml
|
|
provider: ssh
|
|
target: windows
|
|
windows:
|
|
mode: normal # or wsl2
|
|
static:
|
|
host: win-dev.local
|
|
user: Peter
|
|
port: "22"
|
|
workRoot: C:\crabbox
|
|
```
|
|
|
|
Optional Tailscale reachability for managed Linux leases:
|
|
|
|
```yaml
|
|
tailscale:
|
|
enabled: true
|
|
network: auto
|
|
tags:
|
|
- tag:crabbox
|
|
hostnameTemplate: crabbox-{slug}
|
|
authKeyEnv: CRABBOX_TAILSCALE_AUTH_KEY
|
|
exitNode: mac-studio.example.ts.net
|
|
exitNodeAllowLanAccess: true
|
|
```
|
|
|
|
Tailscale is a network plane, not a provider. `--tailscale` joins new managed
|
|
Linux leases to the tailnet; `--network auto|tailscale|public` chooses how SSH
|
|
and VNC tunnel commands resolve the host. Brokered mode uses Worker OAuth
|
|
secrets to mint one-off keys; direct-provider mode reads the auth key from the
|
|
configured env var. `exitNode` is opt-in per lease for routing outbound internet
|
|
through an approved tailnet exit node. See [Tailscale](docs/features/tailscale.md).
|
|
|
|
Forwarded environment is intentionally narrow: `NODE_OPTIONS` and `CI`. Do not pass secrets as command-line arguments. Full env-var reference and per-command flags are in [docs/cli.md](docs/cli.md) and [docs/commands/](docs/commands/README.md).
|
|
|
|
## OpenClaw plugin
|
|
|
|
The repo root is a native OpenClaw plugin package. Once installed, it exposes Crabbox as agent tools:
|
|
|
|
- `crabbox_run`, `crabbox_warmup`, `crabbox_status`, `crabbox_list`, `crabbox_stop`
|
|
|
|
The plugin shells out to the configured `crabbox` binary, so local config, broker login, repo claims, and sync behavior stay owned by the CLI. Set `plugins.entries.crabbox.config.binary` if `crabbox` is not on `PATH`.
|
|
|
|
Durable run inspection is intentionally CLI/skill-led instead of additional plugin tools: use `crabbox history`, `crabbox events --after --limit`, `crabbox attach`, `crabbox logs`, `crabbox results`, and `crabbox usage` from a shell-capable agent.
|
|
|
|
## Development
|
|
|
|
```sh
|
|
# Go CLI
|
|
go build -o bin/crabbox ./cmd/crabbox
|
|
go test -race ./...
|
|
scripts/check-go-coverage.sh 85.0
|
|
|
|
# Cloudflare Worker
|
|
npm ci --prefix worker
|
|
npm test --prefix worker
|
|
npm run build --prefix worker
|
|
|
|
# Docs
|
|
npm run docs:check
|
|
|
|
# Optional live smoke, when broker/provider credentials are available
|
|
CRABBOX_LIVE=1 CRABBOX_LIVE_REPO=/path/to/openclaw scripts/live-smoke.sh
|
|
# Add Blacksmith only for repos with a Testbox workflow.
|
|
CRABBOX_LIVE=1 CRABBOX_LIVE_PROVIDERS=blacksmith-testbox scripts/live-smoke.sh
|
|
```
|
|
|
|
CI runs the full gate (gofmt, vet, race tests, coverage threshold, docs link/build check, GoReleaser snapshot, Worker lint/typecheck/tests/build) on every push and PR. Tagged pushes matching `v*` publish Go archives via GoReleaser and bump the Homebrew formula at [openclaw/homebrew-tap](https://github.com/openclaw/homebrew-tap).
|
|
|
|
Worker deployment, required secrets, and DNS routing live in [docs/infrastructure.md](docs/infrastructure.md).
|
|
|
|
## Docs
|
|
|
|
- **Get the model:** [How Crabbox Works](docs/how-it-works.md), [Architecture](docs/architecture.md), [Orchestrator](docs/orchestrator.md)
|
|
- **Use the CLI:** [CLI](docs/cli.md), [Commands](docs/commands/README.md), [Features](docs/features/README.md)
|
|
- **Interactive QA:** [Interactive Desktop and VNC](docs/features/interactive-desktop-vnc.md)
|
|
- **Operate it:** [Operations](docs/operations.md), [Observability](docs/observability.md), [Troubleshooting](docs/troubleshooting.md)
|
|
- **Set it up or audit it:** [Infrastructure](docs/infrastructure.md), [Security](docs/security.md), [Source Map](docs/source-map.md), [MVP Plan](docs/mvp-plan.md)
|
|
- **Changes:** [CHANGELOG.md](CHANGELOG.md)
|
|
|
|
The GitHub Pages site at <https://openclaw.github.io/crabbox/> is generated from the `docs/` Markdown:
|
|
|
|
```sh
|
|
npm run docs:check
|
|
open dist/docs-site/index.html
|
|
```
|
|
|
|
## License
|
|
|
|
MIT — see [LICENSE](LICENSE).
|