fix: retain chunked run logs

This commit is contained in:
Peter Steinberger 2026-05-02 12:06:59 +01:00
parent 819640fbf7
commit 5b18b7d53c
No known key found for this signature in database
20 changed files with 261 additions and 29 deletions

View File

@ -8,6 +8,8 @@
### Fixed
- Fixed recorded run logs so long noisy commands are stored in bounded chunks instead of losing the failure evidence between the first output events and the final tail.
## 0.3.0 - 2026-05-02
Crabbox 0.3.0 makes brokered runs much easier to observe and debug, adds

View File

@ -81,7 +81,7 @@ Core stored records:
```sql
leases(id, slug, provider, cloud_id, region, owner, org, profile, class, server_type, server_id, server_name, provider_key, host, ssh_user, ssh_port, work_root, keep, ttl_seconds, idle_timeout_seconds, estimated_hourly_usd, max_estimated_usd, state, created_at, updated_at, last_touched_at, expires_at, released_at, ended_at)
runs(id, lease_id, slug, owner, org, provider, class, server_type, command_json, state, exit_code, sync_ms, command_ms, duration_ms, log_bytes, log_truncated, results_json, started_at, ended_at)
runlog(run_id, bounded_stdout_stderr_tail)
runlog(run_id, bounded_stdout_stderr_capture)
```
State transitions:

View File

@ -189,7 +189,7 @@ Behavior:
6. Sync current repo, unless a matching sync fingerprint lets Crabbox skip rsync.
7. Seed remote Git from the configured origin/base ref before first sync when possible.
8. Run command over SSH.
9. Stream remote output, append run events, and retain the latest log tail in coordinator history.
9. Stream remote output, append run events, and retain bounded command output in coordinator history.
10. Heartbeat coordinator leases in the background.
11. Release lease unless `--keep` is set.
12. Exit with the remote command exit code.

View File

@ -26,5 +26,4 @@ inspectable through [history](history.md), [events](events.md), and
[logs](logs.md).
Output events are a bounded preview. Use [logs](logs.md) for the retained
command output tail after completion.
command output after completion.

View File

@ -17,7 +17,7 @@ Human output includes sequence number, event type, phase, stream, timestamp, and
short message or output text. JSON output returns the raw event records.
Output events are a bounded preview: stdout/stderr capture stops after 64 KiB
per run and records an `output.truncated` marker. Use `crabbox logs` for the
retained command output tail.
larger retained command output.
Flags:

View File

@ -1,6 +1,6 @@
# logs
`crabbox logs` prints the retained remote output tail for a recorded run.
`crabbox logs` prints the retained remote output for a recorded run.
```sh
crabbox logs run_...
@ -10,7 +10,7 @@ crabbox logs run_... --json
The plain form writes the log text to stdout. `--json` returns run metadata plus the log.
Logs are bounded tails of remote stdout/stderr. They are for debugging recent runs, not unlimited archival.
Logs are bounded remote stdout/stderr captures. The CLI keeps up to 8 MiB per run and the coordinator stores larger captures in chunks, so failures from noisy parallel runs remain visible without turning run history into unlimited archival storage.
Related docs:

View File

@ -31,9 +31,9 @@ Before the first rsync into a Git checkout, Crabbox tries to seed the remote wor
After sync, Crabbox runs a remote sanity check. If the remote checkout reports at least 200 tracked deletions, Crabbox fails before running tests unless local `CRABBOX_ALLOW_MASS_DELETIONS=1` is set.
When a coordinator is configured, Crabbox records each remote command as a run history item. `crabbox history` lists those records and `crabbox logs <run-id>` prints the retained remote output tail. Log retention is intentionally bounded so a noisy command cannot fill Durable Object storage.
When a coordinator is configured, Crabbox records each remote command as a run history item. `crabbox history` lists those records and `crabbox logs <run-id>` prints retained remote output. Log retention is intentionally bounded so a noisy command cannot fill Durable Object storage.
Add `--junit <path>` or configure `results.junit` to attach JUnit XML summaries to the run record. `crabbox results <run-id>` then prints failed tests without reading the raw log tail.
Add `--junit <path>` or configure `results.junit` to attach JUnit XML summaries to the run record. `crabbox results <run-id>` then prints failed tests without reading the raw log.
Use `crabbox sync-plan` to inspect the same local manifest without leasing a box when a sync estimate looks unexpectedly large.

View File

@ -19,7 +19,7 @@ Core features:
- [Actions hydration](actions-hydration.md): let GitHub Actions prepare a runner, then sync local work into that workspace.
- [SSH keys](ssh-keys.md): per-lease keys, provider key cleanup, and local storage.
- [Cost and usage](cost-usage.md): guardrails, provider-backed pricing, and reporting.
- [History and logs](history-logs.md): coordinator run records, events, and retained remote output tails.
- [History and logs](history-logs.md): coordinator run records, events, and retained remote output.
- [Test results](test-results.md): JUnit summaries attached to recorded runs.
- [Cache controls](cache.md): inspect, purge, and warm remote package/build caches.
- [Auth and admin](auth-admin.md): login/logout/whoami and trusted operator controls.

View File

@ -19,7 +19,7 @@ CLI finishes that run with:
- total duration;
- owner and org;
- provider, class, and server type;
- retained remote output tail.
- retained remote output.
Use:
@ -32,10 +32,11 @@ crabbox logs run_...
```
History records and run events live in the Fleet Durable Object. Log text is
stored separately from run metadata and intentionally capped to the latest tail
so noisy commands cannot exhaust storage. Event output capture is also bounded;
use `crabbox attach` for active run previews and `crabbox logs` for the retained
command output tail.
stored separately from run metadata and intentionally capped so noisy commands
cannot exhaust storage. Logs larger than one storage value are chunked by the
coordinator and reassembled by `crabbox logs`. Event output capture is also
bounded; use `crabbox attach` for active run previews and `crabbox logs` for the
retained command output.
Direct-provider mode does not have central history. Use shell output or local terminal logs there.

View File

@ -6,7 +6,7 @@ Read when:
- changing how failed tests are summarized;
- debugging why `crabbox results` has no data.
Crabbox can attach JUnit XML summaries to coordinator run history. The agent uses this so a failed run can answer "which tests failed?" without scraping a large log tail.
Crabbox can attach JUnit XML summaries to coordinator run history. The agent uses this so a failed run can answer "which tests failed?" without scraping a large raw log.
Configure per run:

View File

@ -7,7 +7,7 @@ Read when:
- finding a remote machine for SSH inspection;
- correlating Actions hydration with the remote workspace.
Crabbox exposes operational visibility through CLI commands, coordinator usage summaries, retained run history/log tails, provider labels, GitHub Actions run links, and Worker logs. The reliable path is to keep the lease ID and run ID together.
Crabbox exposes operational visibility through CLI commands, coordinator usage summaries, retained run history/logs, provider labels, GitHub Actions run links, and Worker logs. The reliable path is to keep the lease ID and run ID together.
## Lease State
@ -49,7 +49,7 @@ Reports include lease count, active lease count, elapsed runtime, estimated elap
Coordinator-backed `crabbox run` creates a durable run record before leasing
starts, appends lifecycle events while the CLI progresses, and finishes the run
with exit code, timing, and the latest retained output tail.
with exit code, timing, and retained command output.
Use:
@ -65,8 +65,8 @@ bin/crabbox results run_...
History is for command debugging, not unlimited log archival. Events are ordered
phase and output chunks for reconnect/inspection, and `attach` can follow those
events while the original CLI is still alive. Logs are bounded tails of remote
stdout/stderr. Test results are stored as structured summaries when `--junit`
events while the original CLI is still alive. Logs are bounded retained remote
stdout/stderr captures. Test results are stored as structured summaries when `--junit`
or `results.junit` is configured.
## Remote Debugging

View File

@ -150,7 +150,7 @@ Do not store:
- file contents;
- SSH keys.
Coordinator run records keep bounded stdout/stderr tails and optional structured JUnit summaries for debugging.
Coordinator run records keep bounded stdout/stderr captures and optional structured JUnit summaries for debugging.
## Future Audit Trail

View File

@ -509,11 +509,13 @@ func (c *CoordinatorClient) CreateRun(ctx context.Context, leaseID string, cfg C
func (c *CoordinatorClient) FinishRun(ctx context.Context, runID string, exitCode int, sync, command time.Duration, log string, truncated bool, results *TestResultSummary) (CoordinatorRun, error) {
var res CoordinatorRunResponse
logChunks := splitRunLogChunks(log)
err := c.do(ctx, http.MethodPost, "/v1/runs/"+url.PathEscape(runID)+"/finish", map[string]any{
"exitCode": exitCode,
"syncMs": sync.Milliseconds(),
"commandMs": command.Milliseconds(),
"log": log,
"log": runLogFallbackPreview(log, truncated),
"logChunks": logChunks,
"logTruncated": truncated,
"results": results,
}, &res)

View File

@ -116,6 +116,41 @@ func TestCoordinatorRunEvents(t *testing.T) {
}
}
func TestCoordinatorFinishRunSendsLogChunks(t *testing.T) {
var finishBody map[string]any
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost || r.URL.Path != "/v1/runs/run_123/finish" {
t.Fatalf("unexpected request %s %s", r.Method, r.URL.Path)
}
if err := json.NewDecoder(r.Body).Decode(&finishBody); err != nil {
t.Fatal(err)
}
_, _ = w.Write([]byte(`{"run":{"id":"run_123","leaseID":"","owner":"peter@example.com","org":"openclaw","provider":"aws","class":"standard","serverType":"t3.small","command":["pnpm","test"],"state":"failed","phase":"failed","exitCode":1,"logBytes":0,"logTruncated":false,"startedAt":"2026-05-02T00:00:00Z"}}`))
}))
defer server.Close()
client := CoordinatorClient{BaseURL: server.URL, Client: server.Client()}
log := strings.Repeat("x", coordinatorRunLogChunkBytes) + "tail"
if _, err := client.FinishRun(context.Background(), "run_123", 1, time.Second, 2*time.Second, log, false, nil); err != nil {
t.Fatal(err)
}
chunks, ok := finishBody["logChunks"].([]any)
if !ok {
t.Fatalf("logChunks body=%#v", finishBody["logChunks"])
}
if len(chunks) != 2 {
t.Fatalf("logChunks=%d, want 2", len(chunks))
}
if got := chunks[0].(string); len(got) != coordinatorRunLogChunkBytes {
t.Fatalf("first chunk length=%d, want %d", len(got), coordinatorRunLogChunkBytes)
}
if got := chunks[1].(string); got != "tail" {
t.Fatalf("second chunk=%q, want tail", got)
}
if got := finishBody["log"].(string); len(got) != runLogFallbackPreviewBytes || !strings.HasSuffix(got, "tail") {
t.Fatalf("fallback log length=%d suffix=%q", len(got), got[len(got)-4:])
}
}
func TestCurlConfigKeepsBearerTokenInConfig(t *testing.T) {
client := CoordinatorClient{
BaseURL: "https://example.test",

View File

@ -89,7 +89,7 @@ func outputTruncatedEventInput() CoordinatorRunEventInput {
return CoordinatorRunEventInput{
Type: "output.truncated",
Phase: "command",
Message: fmt.Sprintf("stdout/stderr event capture capped at %d bytes; use crabbox logs for the retained output tail", runEventOutputMaxBytes),
Message: fmt.Sprintf("stdout/stderr event capture capped at %d bytes; use crabbox logs for retained command output", runEventOutputMaxBytes),
}
}

View File

@ -2,7 +2,11 @@ package cli
import "sync"
const maxRunLogBytes = 64 * 1024
const (
maxRunLogBytes = 8 * 1024 * 1024
coordinatorRunLogChunkBytes = 64 * 1024
runLogFallbackPreviewBytes = 64 * 1024
)
type runLogBuffer struct {
mu sync.Mutex
@ -39,3 +43,35 @@ func (b *runLogBuffer) Truncated() bool {
defer b.mu.Unlock()
return b.truncated
}
func splitRunLogChunks(log string) []string {
if len(log) == 0 {
return nil
}
chunks := make([]string, 0, (len(log)+coordinatorRunLogChunkBytes-1)/coordinatorRunLogChunkBytes)
start := 0
size := 0
for index, char := range log {
charSize := len(string(char))
if size > 0 && size+charSize > coordinatorRunLogChunkBytes {
chunks = append(chunks, log[start:index])
start = index
size = 0
}
size += charSize
}
if start < len(log) {
chunks = append(chunks, log[start:])
}
return chunks
}
func runLogFallbackPreview(log string, truncated bool) string {
if !truncated && len(log) <= runLogFallbackPreviewBytes {
return log
}
if len(log) <= runLogFallbackPreviewBytes {
return log
}
return log[len(log)-runLogFallbackPreviewBytes:]
}

View File

@ -65,3 +65,28 @@ func TestRunLogBufferConcurrentWrites(t *testing.T) {
t.Fatalf("log missing expected output: %q", log)
}
}
func TestSplitRunLogChunks(t *testing.T) {
log := strings.Repeat("a", coordinatorRunLogChunkBytes) + "tail"
chunks := splitRunLogChunks(log)
if len(chunks) != 2 {
t.Fatalf("chunks=%d, want 2", len(chunks))
}
if len(chunks[0]) != coordinatorRunLogChunkBytes {
t.Fatalf("first chunk length=%d, want %d", len(chunks[0]), coordinatorRunLogChunkBytes)
}
if got := strings.Join(chunks, ""); got != log {
t.Fatalf("joined chunks length=%d, want %d", len(got), len(log))
}
}
func TestRunLogFallbackPreviewKeepsTail(t *testing.T) {
log := strings.Repeat("a", runLogFallbackPreviewBytes) + "tail"
preview := runLogFallbackPreview(log, true)
if len(preview) != runLogFallbackPreviewBytes {
t.Fatalf("preview length=%d, want %d", len(preview), runLogFallbackPreviewBytes)
}
if !strings.HasSuffix(preview, "tail") {
t.Fatalf("preview does not keep tail: suffix=%q", preview[len(preview)-8:])
}
}

View File

@ -25,6 +25,10 @@ import type {
import { costLimits, enforceCostLimits, leaseCost, requestOrg, usageSummary } from "./usage";
const fleetID = "default";
const maxStoredRunLogBytes = 8 * 1024 * 1024;
const runLogChunkBytes = 64 * 1024;
const textEncoder = new TextEncoder();
const textDecoder = new TextDecoder();
export class FleetDurableObject implements DurableObject {
constructor(
@ -400,7 +404,7 @@ export class FleetDurableObject implements DurableObject {
if (!run || !this.runVisibleToRequest(run, request)) {
return notFound();
}
const log = (await this.state.storage.get<string>(runLogKey(runID))) ?? "";
const log = await this.readRunLog(runID);
return new Response(log, {
headers: { "content-type": "text/plain; charset=utf-8" },
});
@ -453,13 +457,13 @@ export class FleetDurableObject implements DurableObject {
run.state = run.exitCode === 0 ? "succeeded" : "failed";
run.phase = run.state;
run.endedAt = now.toISOString();
const log = input.log ?? "";
run.logBytes = new TextEncoder().encode(log).byteLength;
run.logTruncated = Boolean(input.logTruncated);
const logInput = normalizeRunLogInput(input);
run.logBytes = logInput.bytes;
run.logTruncated = logInput.truncated;
if (input.results) {
run.results = boundedTestResults(input.results);
}
await this.state.storage.put(runLogKey(runID), log);
await this.writeRunLog(runID, logInput.log);
await this.putRun(run);
await this.appendRunEventRecord(run, {
type: "command.finished",
@ -469,6 +473,35 @@ export class FleetDurableObject implements DurableObject {
return json({ run });
}
private async readRunLog(runID: string): Promise<string> {
const chunks = await this.state.storage.list<string>({ prefix: runLogChunkPrefix(runID) });
if (chunks.size > 0) {
return [...chunks.entries()]
.toSorted(([left], [right]) => left.localeCompare(right))
.map(([, chunk]) => chunk)
.join("");
}
return (await this.state.storage.get<string>(runLogKey(runID))) ?? "";
}
private async writeRunLog(runID: string, log: string): Promise<void> {
await this.deleteRunLogChunks(runID);
if (textEncoder.encode(log).byteLength <= runLogChunkBytes) {
await this.state.storage.put(runLogKey(runID), log);
return;
}
await this.state.storage.put(runLogKey(runID), "");
const chunks = splitRunLogByBytes(log, runLogChunkBytes);
await Promise.all(
chunks.map((chunk, index) => this.state.storage.put(runLogChunkKey(runID, index), chunk)),
);
}
private async deleteRunLogChunks(runID: string): Promise<void> {
const chunks = await this.state.storage.list<string>({ prefix: runLogChunkPrefix(runID) });
await Promise.all([...chunks.keys()].map((key) => this.state.storage.delete(key)));
}
private async listRuns(request: Request): Promise<Response> {
const url = new URL(request.url);
const leaseID = url.searchParams.get("leaseID") ?? "";
@ -746,6 +779,14 @@ function runLogKey(runID: string): string {
return `runlog:${runID}`;
}
function runLogChunkPrefix(runID: string): string {
return `runlog:${runID}:chunk:`;
}
function runLogChunkKey(runID: string, index: number): string {
return `${runLogChunkPrefix(runID)}${String(index).padStart(6, "0")}`;
}
function runEventPrefix(runID: string): string {
return `runevent:${runID}:`;
}
@ -816,6 +857,52 @@ function finiteQueryNumber(value: string | null): number | undefined {
return Number.isFinite(parsed) && parsed >= 0 ? Math.trunc(parsed) : undefined;
}
function normalizeRunLogInput(input: RunFinishRequest): {
log: string;
bytes: number;
truncated: boolean;
} {
const chunkLog = Array.isArray(input.logChunks)
? input.logChunks.map((chunk) => String(chunk)).join("")
: "";
const rawLog = chunkLog || input.log || "";
const bounded = truncateUtf8Tail(rawLog, maxStoredRunLogBytes);
const rawBytes = textEncoder.encode(rawLog).byteLength;
return {
log: bounded,
bytes: Math.min(rawBytes, maxStoredRunLogBytes),
truncated: Boolean(input.logTruncated) || rawBytes > maxStoredRunLogBytes,
};
}
function splitRunLogByBytes(log: string, maxBytes: number): string[] {
const chunks: string[] = [];
let current = "";
let currentBytes = 0;
for (const char of log) {
const charBytes = textEncoder.encode(char).byteLength;
if (current && currentBytes + charBytes > maxBytes) {
chunks.push(current);
current = "";
currentBytes = 0;
}
current += char;
currentBytes += charBytes;
}
if (current) {
chunks.push(current);
}
return chunks;
}
function truncateUtf8Tail(value: string, maxBytes: number): string {
const encoded = textEncoder.encode(value);
if (encoded.byteLength <= maxBytes) {
return value;
}
return textDecoder.decode(encoded.slice(encoded.byteLength - maxBytes));
}
const MAX_RESULT_FILES = 50;
const MAX_RESULT_FAILURES = 100;
const MAX_RESULT_STRING_BYTES = 4096;

View File

@ -163,6 +163,7 @@ export interface RunFinishRequest {
syncMs?: number;
commandMs?: number;
log?: string;
logChunks?: string[];
logTruncated?: boolean;
results?: TestResultSummary;
}

View File

@ -549,6 +549,50 @@ describe("fleet run history", () => {
expect(await logs.text()).toBe("ok\n");
});
it("records chunked run logs so failures do not disappear from long output", async () => {
const storage = new MemoryStorage();
const fleet = testFleet(storage);
const create = await fleet.fetch(
request("POST", "/v1/runs", {
body: {
leaseID: "cbx_000000000001",
provider: "aws",
class: "beast",
serverType: "c7a.48xlarge",
command: ["pnpm", "test"],
},
}),
);
expect(create.status).toBe(201);
const { run } = (await create.json()) as { run: { id: string } };
const chunkA = `${"a".repeat(70_000)}\nFAIL src/example.test.ts\n`;
const chunkB = `${"b".repeat(70_000)}\nELIFECYCLE Test failed\n`;
const finish = await fleet.fetch(
request("POST", `/v1/runs/${run.id}/finish`, {
body: {
exitCode: 1,
log: "fallback tail only\n",
logChunks: [chunkA, chunkB],
},
}),
);
expect(finish.status).toBe(200);
const finished = (await finish.json()) as {
run: { state: string; logBytes: number; logTruncated: boolean };
};
expect(finished.run.state).toBe("failed");
expect(finished.run.logBytes).toBe(chunkA.length + chunkB.length);
expect(finished.run.logTruncated).toBe(false);
expect(storage.value<string>(`runlog:${run.id}`)).toBe("");
const logs = await fleet.fetch(request("GET", `/v1/runs/${run.id}/logs`));
const logText = await logs.text();
expect(logText).toContain("FAIL src/example.test.ts");
expect(logText).toContain("ELIFECYCLE Test failed");
expect(logText).not.toContain("fallback tail only");
});
it("records resolved lease metadata instead of caller-supplied fallback guesses", async () => {
const storage = new MemoryStorage();
const fleet = testFleet(storage);