diff --git a/CHANGELOG.md b/CHANGELOG.md
index 34499ce..0a65131 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -24,10 +24,12 @@
 - Added best-effort GitHub Actions run and workflow links for external Blacksmith Testbox rows in the portal.
 - Added GitHub Actions status badges, stuck filters, and copyable local stop commands for external Blacksmith Testbox rows in the portal.
 - Added external runner detail pages in the portal with owner, Actions, lifecycle timestamps, boundary notes, and copyable stop commands.
+- Added broker capacity hints for AWS leases, including selected market, attempted regions, quota/capacity advice, and configurable high-pressure class warnings.
 
 ### Changed
 
 - Changed AWS capacity fallback to route configured `CRABBOX_CAPACITY_REGIONS` across both brokered and direct AWS launches, with the deployed coordinator defaulting to a wider multi-region pool for better headroom.
+- Changed coordinator-backed CLI lease output to print broker capacity hints when AWS routing, quota, Spot fallback, or configured high-pressure classes are involved.
 - Changed the portal lease table to merge external Blacksmith Testbox runners into the main grid as muted, disabled rows instead of rendering a separate external-runners table.
 - Refactored built-in provider backend implementations into `internal/providers/` packages while keeping command orchestration and rendering core-owned.
 
diff --git a/README.md b/README.md
index 7ce19a4..37604f2 100644
--- a/README.md
+++ b/README.md
@@ -130,6 +130,7 @@ capacity:
   market: spot
   strategy: most-available
   fallback: on-demand-after-120s
+  hints: true
 aws:
   region: eu-west-1
   rootGB: 400
diff --git a/docs/cli.md b/docs/cli.md
index 74245bf..6c508cc 100644
--- a/docs/cli.md
+++ b/docs/cli.md
@@ -346,6 +346,7 @@ capacity:
   market: spot
   strategy: most-available
   fallback: on-demand-after-120s
+  hints: true
 aws:
   region: eu-west-1
   rootGB: 400
@@ -527,6 +528,8 @@ CRABBOX_CAPACITY_STRATEGY
 CRABBOX_CAPACITY_FALLBACK
 CRABBOX_CAPACITY_REGIONS
 CRABBOX_CAPACITY_AVAILABILITY_ZONES
+CRABBOX_CAPACITY_HINTS
+CRABBOX_CAPACITY_LARGE_CLASSES
 CRABBOX_ACTIONS_WORKFLOW
 CRABBOX_ACTIONS_JOB
 CRABBOX_ACTIONS_REF
diff --git a/docs/features/aws.md b/docs/features/aws.md
index 41442a5..7bcc9a3 100644
--- a/docs/features/aws.md
+++ b/docs/features/aws.md
@@ -47,6 +47,13 @@ CRABBOX_CAPACITY_REGIONS=eu-west-1,eu-west-2,eu-central-1,us-east-1,us-west-2
 Prefer `standard` or `fast` during capacity incidents. `beast` starts at 48xlarge
 candidates and can consume 192 vCPUs per request before fallback.
 
+Brokered AWS leases return capacity hints in the lease payload and CLI output.
+Hints include the selected region/market, failed attempt regions, quota
+pressure, Spot-to-On-Demand fallback, and high-pressure class warnings. Set
+`capacity.hints: false` or `CRABBOX_CAPACITY_HINTS=0` to suppress them. Set
+`CRABBOX_CAPACITY_LARGE_CLASSES=beast,large` when an installation wants warning
+hints for a different set of classes.
+
 Crabbox tries ordered instance candidates for the requested class. Explicit
 `--type` is exact: if EC2 rejects it, Crabbox fails clearly instead of silently
 choosing another type.
@@ -104,6 +111,8 @@ CRABBOX_AWS_SSH_CIDRS
 CRABBOX_AWS_MAC_HOST_ID
 CRABBOX_CAPACITY_REGIONS
 CRABBOX_CAPACITY_AVAILABILITY_ZONES
+CRABBOX_CAPACITY_HINTS
+CRABBOX_CAPACITY_LARGE_CLASSES
 ```
 
 ## Security And Networking
diff --git a/docs/providers/aws.md b/docs/providers/aws.md
index dd1efae..af1a5f3 100644
--- a/docs/providers/aws.md
+++ b/docs/providers/aws.md
@@ -71,6 +71,8 @@ CRABBOX_AWS_SSH_CIDRS
 CRABBOX_AWS_MAC_HOST_ID
 CRABBOX_CAPACITY_REGIONS
 CRABBOX_CAPACITY_AVAILABILITY_ZONES
+CRABBOX_CAPACITY_HINTS
+CRABBOX_CAPACITY_LARGE_CLASSES
 ```
 
 Brokered AWS credentials belong in the Worker, not on developer machines.
@@ -111,6 +113,8 @@ provider labels and `crabbox cleanup`.
 
 - Spot capacity and quota errors are normal. Prefer classes over exact `--type`
   when you want fallback.
+- Brokered leases include `capacityHints` unless disabled with
+  `capacity.hints: false` or `CRABBOX_CAPACITY_HINTS=0`.
 - During capacity pressure, prefer `standard` or `fast` plus multiple
   `CRABBOX_CAPACITY_REGIONS`; `beast` starts at 48xlarge candidates and can
   consume 192 vCPUs per request.
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 39b994e..ab81ba9 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -112,6 +112,7 @@ Fixes:
 - set `CRABBOX_CAPACITY_REGIONS` so brokered and direct AWS launches can try multiple regions;
 - set `CRABBOX_CAPACITY_AVAILABILITY_ZONES` only when you intentionally want a specific zone in those regions;
 - set `CRABBOX_CAPACITY_STRATEGY=most-available`;
+- keep capacity hints enabled, or set `CRABBOX_CAPACITY_LARGE_CLASSES` when your installation wants warnings for classes beyond `beast`;
 - raise the AWS `Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances` quota for C/M/R/T/Z families, or the matching Spot quota when using Spot;
 - raise Hetzner dedicated-core quota when dedicated classes are required;
 - temporarily use AWS fallback capacity.
@@ -119,6 +120,9 @@ Fixes:
 Brokered AWS launch fallback records provisioning attempts. Quota preflight uses
 AWS Service Quotas when available and reports the quota code, applied vCPU
 limit, requested type, and required vCPUs before trying the next candidate.
+Brokered responses also include `capacityHints` so callers can surface the
+selected region/market and next operator action instead of parsing provider
+errors.
 
 ## Provider Machine Looks Orphaned
 