9.4 KiB
Troubleshooting
Read when:
- a lease fails to create;
- SSH never becomes ready;
- tailnet reachability behaves unexpectedly;
- sync behaves unexpectedly;
- Actions hydration times out;
- docs deployment fails.
Start with:
bin/crabbox doctor
bin/crabbox config show
bin/crabbox list --json
bin/crabbox usage --scope all --json
Broker Auth Fails
Symptoms:
401;403;missing broker token;- GitHub
Invalid redirect_uri; - Cloudflare Access page instead of JSON.
Checks:
bin/crabbox config show
printenv CRABBOX_COORDINATOR
printenv CRABBOX_COORDINATOR_TOKEN
printenv CRABBOX_PUBLIC_URL
Fixes:
- configure the broker with
crabbox config set-broker; - ensure the CLI points at the Worker URL or the Access-protected route intentionally;
- ensure
CRABBOX_COORDINATOR_TOKENmatches the WorkerCRABBOX_SHARED_TOKEN. - for self-hosted GitHub browser login, create your own GitHub OAuth app and set its callback URL to
https://<your-coordinator-host>/v1/auth/github/callback; - ensure the Worker's
CRABBOX_PUBLIC_URLuses the same public origin as that GitHub OAuth callback.
SSH Host Key Or Control Socket Fails
Symptoms:
- SSH warns that host identification changed after a provider reused an IP;
- a reused warm lease connects to the wrong ControlMaster socket;
- paths under
~/Library/Application Supportappear split at the space.
Checks:
bin/crabbox inspect --id blue-lobster --json
bin/crabbox ssh --id blue-lobster
Fixes:
- upgrade to a build that quotes SSH config values with spaces;
- keep per-lease keys under the Crabbox config
testboxes/<lease>directory; - avoid manually overriding
UserKnownHostsFileorControlPathunless debugging SSH itself.
Lease Rejected By Cost Control
Symptoms:
cost_limit_exceeded;- lease request fails before provider creation.
Checks:
bin/crabbox usage --scope user --user "$(git config user.email)"
bin/crabbox usage --scope org --org openclaw
Fixes:
- raise the relevant monthly or active-lease limit;
- shorten
--idle-timeout; - choose a smaller
--class; - stop kept leases.
Provider Capacity Or Quota Fails
Symptoms:
- class falls back from dedicated machines to smaller machines;
- AWS Spot request cannot be fulfilled;
- AWS reports
VcpuLimitExceededfor large On-Demand instances; - server create fails before SSH.
Checks:
bin/crabbox list --json
bin/crabbox usage --scope all
CRABBOX_CAPACITY_REGIONS=eu-west-1,eu-west-2,eu-central-1,us-east-1,us-west-2 \
bin/crabbox warmup --provider aws --class standard --market on-demand --timing-json
Fixes:
- choose a smaller class;
- use
--market on-demandor--market spotfor a one-off AWS capacity-market override; - set
CRABBOX_CAPACITY_REGIONSso brokered and direct AWS launches can try multiple regions; - set
CRABBOX_CAPACITY_AVAILABILITY_ZONESonly when you intentionally want a specific zone in those regions; - set
CRABBOX_CAPACITY_STRATEGY=most-available; - keep capacity hints enabled, or set
CRABBOX_CAPACITY_LARGE_CLASSESwhen your installation wants warnings for classes beyondbeast; - raise the AWS
Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instancesquota for C/M/R/T/Z families, or the matching Spot quota when using Spot; - raise Hetzner dedicated-core quota when dedicated classes are required;
- temporarily use AWS fallback capacity.
Brokered AWS launch fallback records provisioning attempts. Quota preflight
uses AWS Service Quotas when available and reports the quota code, applied vCPU
limit, requested type, and required vCPUs before trying the next candidate.
Brokered responses also include capacityHints so callers can surface the
selected region/market and next operator action instead of parsing provider
errors.
Provider Machine Looks Orphaned
Symptoms:
crabbox listshowsorphan=no-active-lease;- provider console has a
crabbox-cbx_...machine butcrabbox inspectreturns not found.
Checks:
bin/crabbox list --provider hetzner
bin/crabbox list --provider aws
bin/crabbox admin leases --state active
Fixes:
- do not delete
keep=truemachines automatically; - stop or delete only after checking that no active coordinator lease references the machine;
- use
crabbox stop <lease-id-or-slug>for active leases, and provider/admin cleanup only for confirmed orphaned machines.
SSH Never Becomes Ready
Symptoms:
- lease exists but
crabbox runwaits until SSH timeout; - the primary SSH port, default
2222, and all fallback ports are unreachable; crabbox-readyis missing.
Checks:
bin/crabbox inspect --id cbx_... --json
ssh -p 2222 crabbox@HOST crabbox-ready
ssh -p 2222 crabbox@HOST test -f /var/lib/crabbox/bootstrapped
ssh -p 22 crabbox@HOST crabbox-ready
Fixes:
- wait for cloud-init to finish on fresh machines;
- verify security group or firewall allows the primary SSH port and the configured fallback ports;
- set
CRABBOX_SSH_FALLBACK_PORTS=nonewhen fallback port 22 should not be opened or tried; - inspect provider console output for cloud-init failures;
- retry the lease if bootstrap failed before creating the ready marker.
Tailscale Path Fails
Symptoms:
--tailscalelease creation fails withtailscale_unavailable,tailscale_disabled, orinvalid_tailscale_tags;--network tailscalesays the lease has no tailnet address;--network tailscalesays the tailnet host is unreachable over SSH;--network autofalls back topublic;tailscale exit node ... joined but remote internet egress failed.
Checks:
bin/crabbox config show
bin/crabbox inspect --id blue-lobster
bin/crabbox inspect --id blue-lobster --json
bin/crabbox ssh --id blue-lobster --network tailscale
tailscale status
tailscale ping <tailscale-fqdn-or-100.x-address>
Fixes:
- for brokered leases, configure Worker secrets
CRABBOX_TAILSCALE_CLIENT_IDandCRABBOX_TAILSCALE_CLIENT_SECRET; - keep
CRABBOX_TAILSCALE_ENABLEDunset or1; set it to0only to disable brokered Tailscale intentionally; - ensure requested tags are in the Worker
CRABBOX_TAILSCALE_TAGSallowlist; - ensure the local client is joined to the same tailnet and ACLs allow SSH to the tagged node;
- for exit nodes, ensure the exit node is approved and that tailnet grants or
ACLs allow the lease tag, for example
tag:crabbox, to reachautogroup:internet; - if the exit node is a personal Mac, verify Tailscale still advertises it as an exit node and that the Mac can actually forward internet traffic for clients;
- use
--network publicto prove the provider SSH path independently; - use
--network autowhen fallback to public is acceptable; - use
--network tailscalewhen a missing or unreachable tailnet path should fail the command.
Crabbox still uses OpenSSH and per-lease SSH keys over the selected host. Tailscale SSH, Serve, Funnel, and direct VNC binding are not part of managed lease support.
Sync Looks Wrong
Symptoms:
- changed-test detection sees the wrong base;
- deleted files unexpectedly appear remotely;
- sync aborts on mass tracked deletions.
- sync warns or fails before rsync because the candidate tree is too large.
Checks:
git status --short
git ls-files --cached --others --exclude-standard | wc -l
bin/crabbox run --id cbx_... -- git status --short
bin/crabbox run --id cbx_... --sync-only --debug
Fixes:
- commit, stash, or intentionally keep local deletions before syncing;
- add generated directories to
.gitignoreor.crabbox.yamlsync.exclude; - keep
.git, build caches, and package caches out of the repo source tree; - use
--force-sync-largeonly after verifying the candidate file count and bytes are expected; - check repo-local
.crabbox.yamlsync excludes; - rerun without relying on the sync fingerprint after large tree changes;
- verify base-ref hydration in repo config.
Sync Stalls Or Times Out
Symptoms:
- rsync prints little output for a long time;
rsync timed out after ...;- a local cache directory made the first sync unexpectedly huge.
Checks:
bin/crabbox config show
bin/crabbox run --id cbx_... --sync-only --debug
Fixes:
- inspect the printed sync candidate estimate before retrying;
- lower
sync.timeoutfor quick failure in agent loops, or raise it for intentionally large source transfers; - tune
sync.warnFiles,sync.warnBytes,sync.failFiles, andsync.failBytesin repo config; - stop and warm a fresh lease if the remote workspace looks corrupted.
Actions Hydration Times Out
Symptoms:
crabbox actions hydratedispatches a run but never sees the ready marker;- later
crabbox run --iddoes not enter the expected Actions workspace.
Checks:
bin/crabbox actions hydrate --id blue-lobster
bin/crabbox inspect --id blue-lobster --json
Fixes:
- open the workflow run URL and find the failed setup step;
- ensure the generated workflow writes the ready marker;
- confirm the workflow has permission to register or use the runner;
- keep secrets inside the workflow and only write non-secret handoff data.
Docs Site Fails To Publish
Symptoms:
- Pages workflow fails during Pages setup;
- local docs build succeeds.
Checks:
npm run docs:check
gh run list --workflow pages.yml
Fixes:
- enable GitHub Pages for the repository or organization;
- rerun the Pages workflow after Pages is allowed;
- keep Markdown links relative so the static builder can rewrite them.
- fix broken internal Markdown links before checking whether Pages itself is down.