diff --git a/docs/repair/README.md b/docs/repair/README.md index 023d577145..cb58c8eb36 100644 --- a/docs/repair/README.md +++ b/docs/repair/README.md @@ -226,7 +226,7 @@ For a maintainer-facing architecture map of the automation lanes, see [`docs/INTERNAL_FEATURES.md`](docs/INTERNAL_FEATURES.md). For the ClawSweeper feedback loop that updates existing generated PRs, see -[`docs/auto-update-prs.md`](docs/auto-update-prs.md). +[`docs/repair/auto-update-prs.md`](auto-update-prs.md). That loop is marker-driven. ClawSweeper comments use hidden `clawsweeper-verdict:*` markers, and only actionable PR feedback includes @@ -312,7 +312,7 @@ Supported commands: ``` `status` and `explain` post a short status reply. `fix ci`, `address review`, -and `rebase` dispatch the normal `cluster-worker.yml` repair path, but only for +and `rebase` dispatch the normal `repair-cluster-worker.yml` repair path, but only for existing ClawSweeper PRs identified by the `clawsweeper/*` branch. `automerge` opts an open PR into the bounded review/fix/merge loop. `approve` is maintainer-only exact-head approval after a human-review pause; it clears diff --git a/docs/repair/auto-update-prs.md b/docs/repair/auto-update-prs.md index 76d44caf93..151937b9a7 100644 --- a/docs/repair/auto-update-prs.md +++ b/docs/repair/auto-update-prs.md @@ -21,7 +21,7 @@ The loop is intentionally small: to review that PR head. 3. The comment router sees trusted ClawSweeper feedback. 4. ClawSweeper dispatches the existing or adopted job through - `cluster-worker.yml`. + `repair-cluster-worker.yml`. 5. The repair worker pushes another commit to the source branch if it finds a safe, narrow fix, or opens a credited replacement when the source branch cannot be safely updated. @@ -177,7 +177,7 @@ five automatic ClawSweeper-triggered repair iterations. The per-PR cap is total across all head SHAs and stops the automatic review/repair loop even when every iteration produces a new commit. -Runs for the same job path and mode share the `cluster-worker.yml` concurrency +Runs for the same job path and mode share the `repair-cluster-worker.yml` concurrency group, so repeated dispatches queue instead of racing the same branch. ClawSweeper edits one durable review comment in place. The router keys its @@ -227,13 +227,13 @@ as: Workflow: -- `.github/workflows/comment-router.yml` +- `.github/workflows/repair-comment-router.yml` Scripts: -- `scripts/comment-router.ts` -- `scripts/comment-router-core.ts` -- `scripts/comment-router-utils.ts` +- `src/repair/comment-router.ts` +- `src/repair/comment-router-core.ts` +- `src/repair/comment-router-utils.ts` Durable state: @@ -255,7 +255,7 @@ Syntax and workflow checks: ```bash pnpm run check -actionlint .github/workflows/comment-router.yml +actionlint .github/workflows/repair-comment-router.yml ``` Dry-run the router against live recent comments: diff --git a/docs/repair/internal-features.md b/docs/repair/internal-features.md index 16eb32f66a..9bb904dd37 100644 --- a/docs/repair/internal-features.md +++ b/docs/repair/internal-features.md @@ -106,7 +106,7 @@ replacement PR. Direct mutation still happens outside Codex. ## Cloud Worker Flow -Workflow: `.github/workflows/cluster-worker.yml` +Workflow: `.github/workflows/repair-cluster-worker.yml` The cluster worker has two jobs: @@ -285,7 +285,7 @@ The finalizer scans open ClawSweeper PRs in the target repo. It finds PRs by the - security hold When `--dispatch-repairs --execute` is enabled, it dispatches the existing -cluster job back through `cluster-worker.yml` instead of creating another PR. +cluster job back through `repair-cluster-worker.yml` instead of creating another PR. The idempotency key includes target repo, PR number, and head SHA, so the same PR/head is not repeatedly repaired unless `--allow-repeat` is used. @@ -295,8 +295,8 @@ clearly transient jobs, and pass branch-caused failures into the repair prompt. ## Self-Heal Failed ClawSweeper Runs -Workflow: `.github/workflows/self-heal.yml` -Script: `scripts/self-heal-failed-runs.ts` +Workflow: `.github/workflows/repair-self-heal.yml` +Script: `src/repair/self-heal-failed-runs.ts` Self-heal retries failed ClawSweeper cluster-worker runs. It reads published `results/runs/*.json`, selects the latest failed run per source job, skips jobs @@ -309,11 +309,11 @@ finalizer/comment command repair path. ## Maintainer Comment Routing -Workflow: `.github/workflows/comment-router.yml` +Workflow: `.github/workflows/repair-comment-router.yml` Scripts: -- `scripts/comment-router.ts` -- `scripts/comment-router-core.ts` +- `src/repair/comment-router.ts` +- `src/repair/comment-router-core.ts` Comment routing scans recent target-repo issue/PR comments and accepts only maintainer-authored commands. Default allowed GitHub `author_association` @@ -326,7 +326,7 @@ values: Contributor comments are ignored without a reply. The generated-PR auto-update design is documented in -[`docs/auto-update-prs.md`](auto-update-prs.md). That lane lets trusted +[`docs/repair/auto-update-prs.md`](auto-update-prs.md). That lane lets trusted ClawSweeper comments dispatch a repair run for an existing ClawSweeper PR or a PR explicitly opted into `clawsweeper:automerge` without allowing arbitrary comment authors to trigger work. @@ -372,7 +372,7 @@ Behavior: Repair commands apply to existing ClawSweeper PRs and PRs opted into `clawsweeper:automerge`. The router finds ClawSweeper PRs by the `clawsweeper/*` branch, resolves or creates the backing job, posts one -idempotent response marker, and dispatches `cluster-worker.yml`. +idempotent response marker, and dispatches `repair-cluster-worker.yml`. Trusted ClawSweeper comments become `clawsweeper_auto_repair`. Preferred comments use hidden `clawsweeper-verdict:*` markers and include diff --git a/docs/repair/operations.md b/docs/repair/operations.md index 6a160842db..eb659f6901 100644 --- a/docs/repair/operations.md +++ b/docs/repair/operations.md @@ -5,7 +5,7 @@ commands, finalizers, self-heal, gates, and ledgers, see [`docs/INTERNAL_FEATURES.md`](INTERNAL_FEATURES.md). For the trusted ClawSweeper-to-ClawSweeper PR repair loop, see -[`docs/auto-update-prs.md`](auto-update-prs.md). +[`docs/repair/auto-update-prs.md`](auto-update-prs.md). For commit-review findings, ClawSweeper dispatches `clawsweeper_commit_finding` to this repository. ClawSweeper fetches the latest @@ -194,7 +194,7 @@ Repair commands apply to existing ClawSweeper PRs and to PRs opted into `clawsweeper/*` branch prefix. Opted-in non-ClawSweeper PRs get an adopted job at `jobs//inbox/automerge---.md`. The router posts one idempotent reply with a hidden marker and dispatches the -normal `cluster-worker.yml` repair path. It records processed comment versions +normal `repair-cluster-worker.yml` repair path. It records processed comment versions in `results/comment-router.json`. For durable ClawSweeper comments, idempotency is per comment id plus GitHub `updated_at`, and response markers include the target PR head SHA. That lets edited ClawSweeper comments wake diff --git a/src/repair/config.ts b/src/repair/config.ts index 4ff66d836e..878129ba73 100644 --- a/src/repair/config.ts +++ b/src/repair/config.ts @@ -2,7 +2,12 @@ import type { JsonValue, LooseRecord } from "./json-types.js"; import { DEFAULT_ALLOWED_REPOSITORY_PERMISSIONS } from "./comment-router-core.js"; import { currentProjectRepo, readMaxLiveWorkers } from "./lib.js"; import { assertRepo, commaSet, positiveInteger } from "./comment-router-utils.js"; -import { DEFAULT_HEAD_PREFIX, DEFAULT_TARGET_REPO } from "./constants.js"; +import { + DEFAULT_HEAD_PREFIX, + DEFAULT_TARGET_REPO, + REPAIR_CLUSTER_WORKFLOW, + SWEEP_WORKFLOW, +} from "./constants.js"; export { DEFAULT_HEAD_PREFIX, DEFAULT_TARGET_REPO } from "./constants.js"; const DEFAULT_ALLOWED_ASSOCIATIONS = ["OWNER", "MEMBER", "COLLABORATOR"]; @@ -44,7 +49,7 @@ export function readCommentRouterConfig(args: LooseRecord): CommentRouterConfig ); const workflow = stringSetting( args.workflow ?? process.env.CLAWSWEEPER_COMMENT_WORKFLOW, - "cluster-worker.yml", + REPAIR_CLUSTER_WORKFLOW, ); const reviewRepo = stringSetting( args["review-repo"] ?? process.env.CLAWSWEEPER_REVIEW_REPO, @@ -52,7 +57,7 @@ export function readCommentRouterConfig(args: LooseRecord): CommentRouterConfig ); const reviewWorkflow = stringSetting( args["review-workflow"] ?? process.env.CLAWSWEEPER_REVIEW_WORKFLOW, - "sweep.yml", + SWEEP_WORKFLOW, ); const runner = stringSetting( args.runner ?? process.env.CLAWSWEEPER_WORKER_RUNNER, diff --git a/src/repair/constants.ts b/src/repair/constants.ts index b1b31dae75..44685be1ed 100644 --- a/src/repair/constants.ts +++ b/src/repair/constants.ts @@ -2,6 +2,9 @@ export const DEFAULT_TARGET_REPO = "openclaw/openclaw"; export const DEFAULT_HEAD_PREFIX = "clawsweeper/"; export const DEFAULT_LABEL = "clawsweeper"; +export const REPAIR_CLUSTER_WORKFLOW = "repair-cluster-worker.yml"; +export const SWEEP_WORKFLOW = "sweep.yml"; + export const CLAWSWEEPER_LABEL = "clawsweeper"; export const CLAWSWEEPER_LABEL_COLOR = "F97316"; export const CLAWSWEEPER_LABEL_DESCRIPTION = "Tracked by ClawSweeper automation"; diff --git a/src/repair/dispatch-jobs.ts b/src/repair/dispatch-jobs.ts index e09d04aa33..8f65e638c7 100755 --- a/src/repair/dispatch-jobs.ts +++ b/src/repair/dispatch-jobs.ts @@ -15,6 +15,7 @@ import { waitForLiveWorkerCapacity, } from "./lib.js"; import { sleepMs } from "./timing.js"; +import { REPAIR_CLUSTER_WORKFLOW } from "./constants.js"; const args = parseArgs(process.argv.slice(2)); const defaultRunner = process.env.CLAWSWEEPER_WORKER_RUNNER ?? "blacksmith-4vcpu-ubuntu-2404"; @@ -23,7 +24,7 @@ const defaultExecutionRunner = const mode = args.mode ?? "plan"; const runner = args.runner ?? defaultRunner; const executionRunner = args["execution-runner"] ?? args.execution_runner ?? defaultExecutionRunner; -const workflow = args.workflow ?? "cluster-worker.yml"; +const workflow = args.workflow ?? REPAIR_CLUSTER_WORKFLOW; const repo = String(args.repo ?? currentProjectRepo()); const model = String(args.model ?? process.env.CLAWSWEEPER_MODEL ?? "gpt-5.5"); const maxLiveWorkers = readMaxLiveWorkers(args); diff --git a/src/repair/finalize-open-prs.ts b/src/repair/finalize-open-prs.ts index 4298c7d075..0352a09fd8 100644 --- a/src/repair/finalize-open-prs.ts +++ b/src/repair/finalize-open-prs.ts @@ -14,7 +14,7 @@ import { } from "./lib.js"; import { ghJson, ghText } from "./github-cli.js"; import { sleepMs } from "./timing.js"; -import { DEFAULT_TARGET_REPO, REVIEW_BOTS } from "./constants.js"; +import { DEFAULT_TARGET_REPO, REPAIR_CLUSTER_WORKFLOW, REVIEW_BOTS } from "./constants.js"; import { numberEnv } from "./env-utils.js"; import { compactText, escapeRegExp } from "./text-utils.js"; @@ -35,7 +35,7 @@ const writeReport = Boolean(args["write-report"]); const execute = Boolean(args.execute); const dispatchRepairs = Boolean(args["dispatch-repairs"] || args.dispatch || execute); const workflow = String( - args.workflow ?? process.env.CLAWSWEEPER_FINALIZER_WORKFLOW ?? "cluster-worker.yml", + args.workflow ?? process.env.CLAWSWEEPER_FINALIZER_WORKFLOW ?? REPAIR_CLUSTER_WORKFLOW, ); const runner = String( args.runner ?? process.env.CLAWSWEEPER_WORKER_RUNNER ?? "blacksmith-4vcpu-ubuntu-2404", diff --git a/src/repair/live-worker-capacity.ts b/src/repair/live-worker-capacity.ts index 1c2d8d8f2e..c0c3c5a823 100644 --- a/src/repair/live-worker-capacity.ts +++ b/src/repair/live-worker-capacity.ts @@ -1,5 +1,6 @@ import { ghJson } from "./github-cli.js"; import type { JsonValue, LooseRecord } from "./json-types.js"; +import { REPAIR_CLUSTER_WORKFLOW } from "./constants.js"; import { currentProjectRepo } from "./project-repo.js"; import { sleepMs } from "./timing.js"; @@ -20,7 +21,7 @@ export function readMaxLiveWorkers(args: LooseRecord = {}) { export function liveWorkerCapacity({ repo = currentProjectRepo(), - workflow = "cluster-worker.yml", + workflow = REPAIR_CLUSTER_WORKFLOW, requested = 1, maxLiveWorkers = DEFAULT_MAX_LIVE_WORKERS, }: LooseRecord = {}) { @@ -61,7 +62,7 @@ export function waitForLiveWorkerCapacity(options: LooseRecord = {}) { ); if (requestedCount > max) { throw new Error( - `refusing dispatch: requested ${requestedCount} ${options.workflow ?? "cluster-worker.yml"} workers exceeds max-live-workers=${max}`, + `refusing dispatch: requested ${requestedCount} ${options.workflow ?? REPAIR_CLUSTER_WORKFLOW} workers exceeds max-live-workers=${max}`, ); } const pollMs = readPositiveInteger( @@ -91,13 +92,13 @@ export function waitForLiveWorkerCapacity(options: LooseRecord = {}) { } throw new Error( - `timed out waiting for ${options.workflow ?? "cluster-worker.yml"} capacity: ${latest?.active ?? "unknown"} active + ${requestedCount} requested exceeds max-live-workers=${max}`, + `timed out waiting for ${options.workflow ?? REPAIR_CLUSTER_WORKFLOW} capacity: ${latest?.active ?? "unknown"} active + ${requestedCount} requested exceeds max-live-workers=${max}`, ); } export function listActiveWorkflowRuns({ repo = currentProjectRepo(), - workflow = "cluster-worker.yml", + workflow = REPAIR_CLUSTER_WORKFLOW, }: LooseRecord = {}) { const runs: LooseRecord[] = []; for (const status of ACTIVE_WORKFLOW_STATUSES) { diff --git a/src/repair/requeue-job.ts b/src/repair/requeue-job.ts index a3a77dc35d..76d194efdc 100644 --- a/src/repair/requeue-job.ts +++ b/src/repair/requeue-job.ts @@ -16,9 +16,10 @@ import { } from "./lib.js"; import { ghJson, ghText } from "./github-cli.js"; import { sleepMs } from "./timing.js"; +import { REPAIR_CLUSTER_WORKFLOW } from "./constants.js"; const DEFAULT_REPO = currentProjectRepo(); -const DEFAULT_WORKFLOW = "cluster-worker.yml"; +const DEFAULT_WORKFLOW = REPAIR_CLUSTER_WORKFLOW; const DEFAULT_RUNNER = process.env.CLAWSWEEPER_WORKER_RUNNER ?? "blacksmith-4vcpu-ubuntu-2404"; const DEFAULT_EXECUTION_RUNNER = process.env.CLAWSWEEPER_EXECUTION_RUNNER ?? "blacksmith-16vcpu-ubuntu-2404"; diff --git a/src/repair/self-heal-failed-runs.ts b/src/repair/self-heal-failed-runs.ts index 3e03a2fa14..601138d701 100644 --- a/src/repair/self-heal-failed-runs.ts +++ b/src/repair/self-heal-failed-runs.ts @@ -15,9 +15,10 @@ import { } from "./lib.js"; import { ghJson, ghText } from "./github-cli.js"; import { sleepMs } from "./timing.js"; +import { REPAIR_CLUSTER_WORKFLOW } from "./constants.js"; const DEFAULT_REPO = currentProjectRepo(); -const DEFAULT_WORKFLOW = "cluster-worker.yml"; +const DEFAULT_WORKFLOW = REPAIR_CLUSTER_WORKFLOW; const DEFAULT_RUNNER = process.env.CLAWSWEEPER_WORKER_RUNNER ?? "blacksmith-4vcpu-ubuntu-2404"; const DEFAULT_EXECUTION_RUNNER = process.env.CLAWSWEEPER_EXECUTION_RUNNER ?? "blacksmith-16vcpu-ubuntu-2404"; diff --git a/src/repair/sweep-openclaw-jobs.ts b/src/repair/sweep-openclaw-jobs.ts index dae1ee8f6c..4bf55f461d 100644 --- a/src/repair/sweep-openclaw-jobs.ts +++ b/src/repair/sweep-openclaw-jobs.ts @@ -4,6 +4,7 @@ import fs from "node:fs"; import path from "node:path"; import { hasSecuritySignalText, parseArgs, parseJob, repoRoot, validateJob } from "./lib.js"; import { ghJson } from "./github-cli.js"; +import { REPAIR_CLUSTER_WORKFLOW } from "./constants.js"; import { readJsonFileIfExists as readJson } from "./json-file.js"; const args = parseArgs(process.argv.slice(2)); @@ -291,7 +292,7 @@ function readActiveClusterRuns() { "--repo", repo, "--workflow", - "cluster-worker.yml", + REPAIR_CLUSTER_WORKFLOW, "--status", status, "--limit", diff --git a/test/repair/comment-router-core.test.mjs b/test/repair/comment-router-core.test.mjs index 615be5476e..b11d7bf7d6 100644 --- a/test/repair/comment-router-core.test.mjs +++ b/test/repair/comment-router-core.test.mjs @@ -289,7 +289,7 @@ test("renderResponse reports trusted repair dispatches without losing guardrails target: { head_sha: "def456" }, }, { - workflow: "cluster-worker.yml", + workflow: "repair-cluster-worker.yml", job_path: "jobs/openclaw/inbox/example.md", mode: "autonomous", model: "gpt-5.5", @@ -298,7 +298,7 @@ test("renderResponse reports trusted repair dispatches without losing guardrails assert.match(body, /Thanks, ClawSweeper/); assert.match(body, /clawsweeper-command:456:2026-04-29T07:12:31Z:clawsweeper_auto_repair:def456/); - assert.match(body, /cluster-worker\.yml/); + assert.match(body, /repair-cluster-worker\.yml/); assert.match(body, /safe credited replacement/); assert.match(body, /narrow fix/); assert.doesNotMatch(body, /ClawSweeper Repair/i); @@ -359,7 +359,7 @@ test("renderResponse reports automerge repair dispatches", () => { target: { head_sha: "def457" }, }, { - workflow: "cluster-worker.yml", + workflow: "repair-cluster-worker.yml", job_path: "jobs/openclaw/inbox/automerge-openclaw-openclaw-74156.md", mode: "autonomous", model: "gpt-5.5", @@ -367,7 +367,7 @@ test("renderResponse reports automerge repair dispatches", () => { ); assert.match(body, /picked up the repair feedback/); - assert.match(body, /cluster-worker\.yml/); + assert.match(body, /repair-cluster-worker\.yml/); assert.match(body, /automerge-openclaw-openclaw-74156/); assert.doesNotMatch(body, /did not dispatch/); });