🦀 Crabbox
Warm a box, sync the diff, run the suite.
Crabbox is an open-source remote testbox runner for maintainers and AI agents. Lease a fast Linux machine on owned cloud capacity, sync your dirty checkout, run a command remotely, stream output, and release. Local edit-save-run loop, cloud-grade compute.
crabbox run -- pnpm test
Behind that single command: a Go CLI on your laptop, a Cloudflare Worker broker that owns provider credentials and lease state, and a vanilla Ubuntu runner on Hetzner Cloud or AWS EC2 Spot. Crabbox can also wrap Blacksmith Testboxes when you choose provider: blacksmith-testbox.
Install
brew install openclaw/tap/crabbox
crabbox --version
No Homebrew? Grab a GoReleaser archive for macOS, Linux, or Windows.
Prerequisites on the laptop: git, ssh, ssh-keygen, rsync, curl.
Quick start
# log in once per machine (stores a bearer token in the OS keychain)
crabbox login
# verify local prerequisites and broker reachability
crabbox doctor
# one-shot: lease, sync, run, release
crabbox run -- pnpm test
# or warm a box once, then reuse it
crabbox warmup # prints cbx_... + a slug
crabbox run --id blue-lobster -- pnpm test:changed
crabbox ssh --id blue-lobster
crabbox inspect --id blue-lobster --json
Inspect usage and estimated cost:
crabbox usage
crabbox usage --scope org --org openclaw
crabbox usage --scope all --json
crabbox usage reads coordinator history, so it requires a configured broker. Cost is an estimate for compute leases, not a provider invoice: the coordinator prefers explicit CRABBOX_COST_RATES_JSON overrides, then provider pricing from AWS Spot history or Hetzner server-type prices, then built-in fallback rates. Full reference: docs/commands/usage.md.
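The rate precedence described above can be sketched as a lookup chain. This is a minimal illustration, not Crabbox's actual code: the function name resolveHourlyRate, the map shapes, and every price below are hypothetical. Only the ordering (explicit overrides, then provider pricing, then built-in fallback rates) comes from the text.

```go
package main

import "fmt"

// resolveHourlyRate returns the first rate found for serverType, checking
// the sources in precedence order. Names and numbers are hypothetical.
func resolveHourlyRate(serverType string, overrides, providerPrices, fallbacks map[string]float64) (float64, string, bool) {
	sources := []struct {
		name  string
		rates map[string]float64
	}{
		{"override", overrides},      // e.g. parsed from CRABBOX_COST_RATES_JSON
		{"provider", providerPrices}, // e.g. Spot history or server-type prices
		{"fallback", fallbacks},      // built-in static rates
	}
	for _, s := range sources {
		if rate, ok := s.rates[serverType]; ok {
			return rate, s.name, true
		}
	}
	return 0, "", false
}

func main() {
	// Made-up hourly rates, for illustration only.
	overrides := map[string]float64{"ccx63": 0.50}
	provider := map[string]float64{"ccx63": 0.62, "ccx53": 0.31}
	fallbacks := map[string]float64{"ccx63": 0.70, "ccx53": 0.35, "cx53": 0.05}

	for _, t := range []string{"ccx63", "ccx53", "cx53"} {
		rate, source, _ := resolveHourlyRate(t, overrides, provider, fallbacks)
		fmt.Printf("%s: %.2f/h (from %s)\n", t, rate, source)
	}
}
```

Returning the source name alongside the rate makes it easy to show in a report whether a figure came from an override or from a fallback.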
Stop a kept server:
crabbox stop blue-lobster
Every lease has a stable cbx_... ID and a friendly crustacean slug (blue-lobster, swift-hermit, …). Either works wherever an --id is accepted.
How it works
your laptop                Cloudflare Worker              cloud provider
-------------              ------------------             --------------
crabbox CLI -- HTTPS -->   Fleet Durable Object  -->      Hetzner / AWS Spot
     |                     lease + cost state                   |
     |                                                          |
     +------------ SSH + rsync to leased runner <---------------+
- CLI: Go binary. Loads config, mints a per-lease SSH key, asks the broker for a lease, waits for SSH, seeds remote Git, rsyncs the dirty checkout (with fingerprint skip when nothing changed), runs the command, streams output, releases.
- Broker: Cloudflare Worker at crabbox.openclaw.ai plus a single Durable Object. Owns provider credentials, serializes lease state, enforces active-lease and monthly spend caps, and expires stale leases by alarm. Auth is GitHub login or a shared bearer token.
- Runner: vanilla Ubuntu prepared by cloud-init with SSH on port 2222, Git, rsync, curl, jq, and /work/crabbox. No broker credentials live on the box. Project runtimes (Go, Node, Docker, services, secrets) come from your repo's GitHub Actions hydration, devcontainer, Nix, mise/asdf, or setup scripts, not from Crabbox.
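The fingerprint skip mentioned for the CLI can be pictured as hashing the set of files that would be synced and comparing the result with the value from the previous run. The README does not say how Crabbox computes its fingerprint, so the sketch below is an assumption throughout: the function names, the SHA-256 choice, and the in-memory file map are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// fingerprint hashes paths and contents in sorted path order, so the result
// does not depend on map iteration order. Hypothetical helper.
func fingerprint(files map[string][]byte) string {
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths)

	h := sha256.New()
	for _, p := range paths {
		// Length-prefix each field so adjacent fields cannot blur together.
		fmt.Fprintf(h, "%d:%s%d:", len(p), p, len(files[p]))
		h.Write(files[p])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// needsSync returns the current fingerprint and whether it differs from the
// one recorded after the last successful sync.
func needsSync(last string, files map[string][]byte) (string, bool) {
	current := fingerprint(files)
	return current, current != last
}

func main() {
	files := map[string][]byte{
		"go.mod":  []byte("module demo\n"),
		"main.go": []byte("package main\n"),
	}

	fp, changed := needsSync("", files)
	fmt.Println("first run, sync needed:", changed)

	_, changed = needsSync(fp, files)
	fmt.Println("no edits, sync needed:", changed)

	files["main.go"] = []byte("package main\n\nfunc main() {}\n")
	_, changed = needsSync(fp, files)
	fmt.Println("after edit, sync needed:", changed)
}
```

A real implementation would more likely hash file metadata or Git object IDs than full contents. The point here is only the compare-then-skip shape.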
A direct-provider mode (--provider hetzner|aws with local credentials) exists for debugging the broker itself; the brokered path is the default.
For the full mental model, see How Crabbox Works. For the doc-to-code map, see Source Map.
Highlights
- One-shot or warm. crabbox run for fire-and-forget; crabbox warmup + --id for repeated runs against the same box.
- Local-first sync. No clean-checkout requirement. Tracked + nonignored files only, fingerprint skip on no-op runs, sanity checks against suspicious mass deletions, optional shallow base-ref hydration for changed-test workflows.
- Brokered cloud. Maintainers and agents share infra without sharing provider tokens. Hetzner and AWS EC2 Spot are first-class; both fall back across instance families when capacity or quota rejects a request.
- Blacksmith Testbox wrapper. Set provider: blacksmith-testbox to delegate warmup/run/list/status/stop to the Blacksmith CLI while Crabbox keeps local slugs, repo claims, timing summaries, and config conventions.
- Cost guardrails. Per-lease and monthly spend caps. Live pricing from EC2 Spot history or Hetzner server-type prices, with static fallbacks. crabbox usage summarizes spend by user, org, provider, and type.
- GitHub Actions hydration. crabbox actions hydrate registers a leased box as an ephemeral Actions runner, so the repo's own workflow installs runtimes, services, and secrets. Crabbox does not parse Actions YAML.
- AWS image cache. crabbox image current|list|create lets trusted operators inspect the active AMI and capture scrubbed, warmed AWS runner images after hydration.
- Operator surface. doctor, init, status, inspect, list, usage, history, logs, results, cache, admin, cleanup, plus --json output where it matters.
Machine classes
beast is the default. Both providers fall back across an ordered list of instance types.
Hetzner    standard  ccx33, cpx62, cx53
           fast      ccx43, cpx62, cx53
           large     ccx53, ccx43, cpx62, cx53
           beast     ccx63, ccx53, ccx43, cpx62, cx53
AWS Spot   standard  c7a/c7i/m7a/m7i.8xlarge family
           fast      …16xlarge family
           large     …24xlarge family
           beast     …48xlarge family, falling back to 32x/24x/16x
Override with --type or CRABBOX_SERVER_TYPE for a specific instance.
Configuration
Config resolves in order: flags → env → repo .crabbox.yaml → user ~/.config/crabbox/config.yaml → defaults.
broker:
  url: https://crabbox.openclaw.ai
  provider: aws
  token: ...
class: beast
capacity:
  market: spot
  strategy: most-available
  fallback: on-demand-after-120s
aws:
  region: eu-west-1
  rootGB: 400
lease:
  idleTimeout: 30m
  ttl: 90m
ssh:
  key: ~/.ssh/id_ed25519
  user: crabbox
  port: "2222"
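The resolution order (flags, env, repo file, user file, defaults) amounts to "first layer that sets the key wins". The sketch below illustrates that rule only. The layer type, the flattened key names such as lease.ttl, and the sample values are hypothetical and do not reflect how the CLI stores its config internally.

```go
package main

import "fmt"

// layer is one configuration source; values uses flattened, made-up keys.
type layer struct {
	name   string
	values map[string]string
}

// resolve walks the layers from highest to lowest precedence and returns the
// first non-empty value together with the layer it came from.
func resolve(key string, layers []layer) (string, string, bool) {
	for _, l := range layers {
		if v, ok := l.values[key]; ok && v != "" {
			return v, l.name, true
		}
	}
	return "", "", false
}

func main() {
	// Highest precedence first, matching the order in the README.
	layers := []layer{
		{"flags", map[string]string{"class": "fast"}},
		{"env", map[string]string{}},
		{"repo .crabbox.yaml", map[string]string{"class": "large", "lease.ttl": "90m"}},
		{"user config.yaml", map[string]string{"lease.ttl": "60m", "ssh.user": "crabbox"}},
		{"defaults", map[string]string{"class": "beast", "ssh.port": "2222"}},
	}

	for _, key := range []string{"class", "lease.ttl", "ssh.user", "ssh.port"} {
		value, source, _ := resolve(key, layers)
		fmt.Printf("%-10s = %-8s (from %s)\n", key, value, source)
	}
}
```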
Optional Blacksmith Testbox wrapper:
provider: blacksmith-testbox
blacksmith:
  org: openclaw
  workflow: .github/workflows/ci-check-testbox.yml
  job: test
  ref: main
  idleTimeout: 90m
Forwarded environment is intentionally narrow: NODE_OPTIONS and CI. Do not pass secrets as command-line arguments. Full env-var reference and per-command flags are in docs/cli.md and docs/commands/.
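A narrow forwarding policy like this is usually implemented as an allowlist over the local environment. The following is a minimal sketch under that assumption: the filterEnv helper is hypothetical, and only the two variable names come from the text.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// forwarded lists the variables the README names; everything else stays local.
var forwarded = map[string]bool{"NODE_OPTIONS": true, "CI": true}

// filterEnv keeps only allowlisted NAME=value entries, sorted for stable output.
func filterEnv(environ []string) []string {
	var out []string
	for _, kv := range environ {
		name, _, ok := strings.Cut(kv, "=")
		if ok && forwarded[name] {
			out = append(out, kv)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Sample environment; in a real program this would be os.Environ().
	local := []string{
		"HOME=/home/dev",
		"CI=true",
		"AWS_SECRET_ACCESS_KEY=do-not-forward",
		"NODE_OPTIONS=--max-old-space-size=4096",
	}
	for _, kv := range filterEnv(local) {
		fmt.Println(kv)
	}
}
```

An allowlist fails closed: a newly exported secret on the laptop never reaches the runner unless someone adds it on purpose.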
Development
# Go CLI
go build -o bin/crabbox ./cmd/crabbox
go test -race ./...
scripts/check-go-coverage.sh 85.0
# Cloudflare Worker
npm ci --prefix worker
npm test --prefix worker
npm run build --prefix worker
CI runs the full gate (gofmt, vet, race tests, coverage threshold, GoReleaser snapshot, Worker lint/typecheck/tests/build) on every push and PR. Tagged pushes matching v* publish Go archives via GoReleaser and bump the Homebrew formula at openclaw/homebrew-tap.
Worker deployment, required secrets, and DNS routing live in docs/infrastructure.md.
Docs
- Get the model: How Crabbox Works, Architecture, Orchestrator
- Use the CLI: CLI, Commands, Features
- Operate it: Operations, Observability, Troubleshooting
- Set it up or audit it: Infrastructure, Security, Source Map, MVP Plan
- Changes: CHANGELOG.md
The GitHub Pages site at https://openclaw.github.io/crabbox/ is generated from the docs/ Markdown:
node scripts/build-docs-site.mjs
open dist/docs-site/index.html
Status
Crabbox 0.1.0 (2026-05-01) is the first public release. Working today: brokered Hetzner + AWS Spot provisioning, warm-box reuse, GitHub Actions hydration, cost guardrails, run history, JUnit summaries, and the full CLI surface listed above. Not yet: untrusted multi-tenant isolation — Crabbox today assumes shared trust between operators of a single broker.
License
MIT — see LICENSE.