45 Commits

Author SHA1 Message Date
Josh Lehman
55b0797c03
fix: clone-before-skills ordering + proper git rebase in workspace hooks
Some checks failed
make-all / make-all (push) Has been cancelled
- Wrap skill copying in copy_skills() helper function
- Call copy_skills() before early exit for lightweight states (Triage/Closure/Request Changes)
- Clone openclaw repo BEFORE copying skills (fixes empty-dir clone failure)
- Call copy_skills() AFTER clone + checkout for normal states
- Replace 'git pull --rebase origin HEAD' with 'git rebase origin/main' in before_run
- Add regression test to verify hook ordering in core_test.exs
2026-03-16 21:49:29 -07:00
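The ordering this commit describes can be sketched as a small shell hook. This is an illustration only, not the repository's actual hook: `WORKSPACE`, `SKILLS_SRC`, `REPO_SRC`, `ISSUE_STATE`, the `skills/` destination, and the `clone_repo` helper are hypothetical names introduced here. Only `copy_skills` and the state names come from the commit message.

```shell
#!/bin/sh
# Sketch of the after_create ordering. All variable names are assumptions.

# Copy skill files into the workspace, resolving symlinks along the way.
copy_skills() {
  mkdir -p "$WORKSPACE/skills"
  cp -RL "$SKILLS_SRC/." "$WORKSPACE/skills/"
}

# Hypothetical clone step. git refuses to clone into a non-empty directory,
# so this has to run before copy_skills creates anything in the workspace.
clone_repo() {
  git clone --quiet "$REPO_SRC" "$WORKSPACE"
}

after_create() {
  case "$ISSUE_STATE" in
    Triage|Closure|"Request Changes")
      # Lightweight states skip the clone but still need the skills.
      copy_skills
      return 0
      ;;
  esac
  clone_repo || return 1   # 1. clone while the workspace is still empty
  # (a checkout of the PR branch would follow here)
  copy_skills              # 2. only then add the skills directory
}
```

Keeping the copy in one helper means both the early-exit path and the normal path run the same copy logic, so the two cannot drift apart.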
Josh Lehman
cb46c7ab51
fix: point workspace origin to GitHub, fetch before each run
Local clone for speed, then reset origin to github.com/openclaw/openclaw.
Add git fetch origin to before_run hook so repeat Prepare runs
get fresh origin/main refs instead of stale local-clone state.
2026-03-16 20:29:34 -07:00
Josh Lehman
ddb6821c45
feat: activity labels (REVIEWING, PREPARING, etc.) applied during agent work, removed on exit
2026-03-12 22:03:23 -07:00
Josh Lehman
2c91e85ac0
Revert "feat: Prepare uses --no-test gates, full suite deferred to Test phase"
This reverts commit 3f2b323829.
2026-03-12 16:37:56 -07:00
Josh Lehman
3f2b323829
feat: Prepare uses --no-test gates, full suite deferred to Test phase
2026-03-12 16:22:59 -07:00
Josh Lehman
0b8db57582
fix: make Rebase build check advisory-only
Build failures after rebase are usually mainline API drift, not bad
conflict resolution. Push anyway and let Prepare handle code fixes.
2026-03-12 15:59:37 -07:00
Josh Lehman
72be61c620
feat: add Rebase state for lightweight branch updates
Side-lane state that rebases PR branch onto main, resolves mechanical
conflicts, force pushes, and drops back to Todo for human decision.
No review, no gates, no tests -- just freshens the branch.

Linear state: de50ceb9-a0ef-4f13-849f-bf31a65392ee
2026-03-12 15:36:32 -07:00
Josh Lehman
3599008802
refactor: remove test plan/kit generation from Prepare phase
Testing is now handled by the dedicated Test state lane.
Prepare phase focuses on: fix findings, run gates, push, report.
2026-03-12 15:10:20 -07:00
Josh Lehman
2e2d871330
feat: add Test + Pre-merge states for split testing pipeline
- Created Test state (591e5db0) and Pre-merge state (3f6e88cf) in Linear
- Test is an optional lane: Prepare Complete -> Test -> Pre-merge -> Merge
- Josh can also skip Test and go Prepare Complete -> Merge directly
- Test phase runs full suite, distinguishes pre-existing vs PR-introduced failures
- Pre-merge is a human gate with notification
- Max 1 concurrent test agent (resource constraint, same as prepare)
2026-03-12 12:44:05 -07:00
Josh Lehman
0cd174764c
prepare: add test-kit generation and test-gateway.sh
- WORKFLOW.md: Prepare phase now generates executable test scripts in
  .local/test-kit/ for PRs with API-observable behavioral changes
- common.sh template with gw_call() and assert_jq() helpers
- Numbered test scripts per scenario, independently runnable
- Linear comment includes worktree path to test-kit
- priv/scripts/test-gateway.sh: standalone tool to run a PR build
  on alternate port (18790) without touching the production gateway
  Supports start/stop/status/call subcommands
2026-03-10 16:15:44 -07:00
Josh Lehman
161317304d
fix: include full test plan in Prepare Complete comment, not just a summary
2026-03-09 16:28:29 -07:00
Josh Lehman
c568a31582
prepare: add live test plan generation step
After fixes and gates pass, the prepare agent now generates .local/test-plan.md
with concrete verification steps executable on a live OpenClaw instance.

Test plans include:
- Risk tier classification (GREEN/YELLOW/RED)
- Specific CLI commands and expected outcomes
- Rollback instructions
- Pre-test checklist for branch checkout/build/restart cycle
- Untestable changes section for transparency
2026-03-09 12:20:52 -07:00
Josh Lehman
4ae2369cf3
pipeline: agents read Linear comments for maintainer guidance
Review and Prepare agents now query issue comments before starting
work. Maintainer can leave fix directions, focus areas, or constraints
on the Linear issue and agents will incorporate them.
2026-03-06 14:37:29 -08:00
Josh Lehman
3ec50e1878
docs: rewrite README for caclawphony pipeline
2026-03-06 13:32:40 -08:00
Josh Lehman
f7c07fe129
pipeline: add Triage + Request Changes states, require duplicate assessments
- Move skill copy before early-exit in after_create hook (fixes pr-cluster missing for Triage agents)
- Add Triage and Request Changes as active states
- Move enrichment template from Backlog to Triage
- Add prior GitHub CHANGES_REQUESTED review check to Triage template
- Require structured duplicate assessment comments on both duplicate paths
- Add Request Changes template (posts gh pr review, moves to Backlog)
- Add missing state IDs to states map (review, review_complete, prepare, prepare_complete)
- Default mix caclawphony.review to Triage state (--direct skips to Review)
- gitignore workspaces/
2026-03-06 13:30:18 -08:00
Josh Lehman
35ea2f96a2
fix: scope no-PR-comments rule to non-Closure phases
Closure agents were refusing to post GitHub closing comments because
the 'Never comment on the PR' rule appeared in a global Rules section.
Now conditionally excluded for Closure phase, which needs gh pr comment
and gh pr close to do its job.
2026-03-05 12:12:32 -08:00
Josh Lehman
8458200c66
fix: verify duplicate relations + auto-transition to Closure on merge
Two bugs caused MAR-55 duplicates to get stuck:
1. Enrichment created Duplicate issues but relations weren't verified
2. Merge phase didn't move duplicates to Closure state

Fixes:
- Add relation verification step after issueRelationCreate in enrichment
- Add auto-transition of Duplicate issues to Closure in merge phase
2026-03-05 12:08:23 -08:00
Josh Lehman
dd1326576e
feat: cluster-aware enrichment + closure workflow
2026-03-05 10:36:25 -08:00
Josh Lehman
4cd32ea819
fix: NimbleOptions validation for labels/gates/states + test fixes
- Use {:map, :string, :any} and {:map, :string, :string} types for labels/states
- Return empty map instead of :omit when config sections are absent
- Fix test using self() as agent pid (causes EXIT shutdown)
- Update in-repo WORKFLOW.md integration tests to match current prompt
- Fix list ordering assertions for gate states (use Enum.sort)
2026-03-05 09:48:22 -08:00
Josh Lehman
141c296cce
feat: make notifications, gates, labels, and retry config-driven
- Add notifications section to WORKFLOW.yaml with telegram credentials and template
- Add gates section with state_id, assignee, and notify for each gate
- Add labels section with recommendation and subsystem label UUIDs
- Add states section for state name to ID mappings
- Add retry_base_ms and continuation_delay_ms to agent config
- Config module parses all new sections with env var resolution
- Notifier reads from Config instead of env vars directly
- Orchestrator uses Config for gate states and retry delays
- PromptBuilder exposes labels/gates/states as Solid template variables
- Tests cover all new config paths
2026-03-05 09:30:55 -08:00
Josh Lehman
ba3975479b
chore: stage pebbles events and gitignore
2026-03-05 09:14:21 -08:00
Josh Lehman
de8bffe9b8
fix: post Linear comment before state transition
State transition to a non-active state (e.g. Todo, Review Complete)
causes the orchestrator to kill the agent. If the comment mutation
comes after the state transition, it never executes.

Reordered all four phases to: comment first, state transition last.
2026-03-05 09:03:39 -08:00
Josh Lehman
8d4b8607c7
fix: replace non-ASCII chars in WORKFLOW.md
Jason.EncodeError on em dashes/emoji when serializing prompt
to Codex app-server. Replace with ASCII equivalents.
2026-03-05 08:11:54 -08:00
Josh Lehman
ff15423e89
fix: correct project ID in triage task
2026-03-05 08:02:51 -08:00
Josh Lehman
e3dcb73c58
feat: wire up labels, priority, estimate, assignee
Linear labels created:
- Recommendation: review (green), wait (yellow), skip (gray)
- Subsystem: gateway, channels, browser, agents, config, cli,
  runtime, auth, providers, docs

Enrichment agent now sets all metadata in a single issueUpdate:
- Title with [REVIEW]/[WAIT]/[SKIP] prefix
- Priority (0-4 based on recommendation + scope)
- Estimate (Fibonacci complexity 1/2/3/5/8)
- Labels (one recommendation + matching subsystem labels)
- Assignee (Josh at human gates, for review queue filtering)

Review/Prepare phases also assign to maintainer at gate transitions.

Triage task gains --priority flag for manual override on intake.
2026-03-05 07:50:40 -08:00
Josh Lehman
d561b6426a
feat: Backlog enrichment pipeline
- Add Backlog to active_states in WORKFLOW.md
- Add triage/enrichment prompt template for Backlog state
- Skip repo clone and skill copy for Backlog agents (lightweight)
- Add mix caclawphony.triage task for batch PR intake into Backlog
- Flow: Backlog (enrichment) -> Todo (human gate) -> Review -> ...
2026-03-05 07:39:40 -08:00
Josh Lehman
2f0fbb46d1
caclawphony-5c2: add Telegram notifications for gate states
2026-03-05 01:35:19 -08:00
Josh Lehman
a56e62be8f
caclawphony-9ed: suppress stale rollout-path startup noise
2026-03-05 01:35:19 -08:00
Josh Lehman
0ddb5aff0b
fix: agents post summary comments on Linear issues at each gate
Review: recommendation + findings summary
Prepare: findings fixed + gate results + push status
Merge: commit SHA + PR URL + cleanup
2026-03-05 01:11:25 -08:00
Josh Lehman
fa18e8d22f
fix: turn_sandbox_policy dangerFullAccess — enables networking for gh/git
2026-03-05 00:59:47 -08:00
Josh Lehman
515d1f737d
fix: copy skills into workspace via hook instead of inlining in prompt
after_create hook now copies skill files from maintainers repo into
the workspace with cp -RL (resolving symlinks). Skills stay in one
place, workspaces get fresh copies, prompt just says 'read the skill'.
2026-03-05 00:55:18 -08:00
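The effect of `cp -RL` mentioned in this commit can be seen in a small self-contained demo. The directory layout and file names below are made up for illustration and do not reflect the real maintainers repo.

```shell
#!/bin/sh
# Demo: with cp -RL, symlinks in the source become regular files in the copy.
# All paths are temporary and hypothetical.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/real" "$demo_root/skills_src" "$demo_root/workspace"

echo "review steps" > "$demo_root/real/SKILL.md"
# The skills directory only holds a symlink to the real file.
ln -s "$demo_root/real/SKILL.md" "$demo_root/skills_src/SKILL.md"

# -R copies recursively; -L follows each symlink and copies its target.
cp -RL "$demo_root/skills_src/." "$demo_root/workspace/"

if [ -L "$demo_root/workspace/SKILL.md" ]; then
  echo "still a symlink"
else
  echo "regular file copy"
fi
```

Because the workspace ends up with plain files, a sandbox that cannot follow links back to the host filesystem can still read the skill content.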
Josh Lehman
8a17f11c62
fix: approval_policy never — on-failure sends requestApproval RPCs that Symphony can't handle
2026-03-05 00:51:57 -08:00
Josh Lehman
02fb7d1333
prompt: inline review-pr/prepare-pr/merge-pr skill content
Codex sandbox can't read symlinked skill files from host filesystem.
Inline the full step-by-step instructions directly in the prompt template
so the agent has everything it needs without file reads.
2026-03-05 00:50:51 -08:00
Josh Lehman
d5bd8fcf80
pb: close caclawphony-87b (Marie → Linear integration via TOOLS.md)
2026-03-05 00:48:57 -08:00
Josh Lehman
0eae14ff20
merge stash: workspace.ex env vars + WORKFLOW.md symlink
2026-03-05 00:40:24 -08:00
Josh Lehman
619fce724b
pb: close fa2, 18b, 432, f2f
2026-03-05 00:40:19 -08:00
Josh Lehman
b68e3d1d94
fix(orchestrator): clean terminal issue workspaces via rm_rf
Summary:
- remove workspaces directly from orchestrator terminal-state cleanup
- build cleanup path as Config.workspace_root() <> "/" <> issue identifier
- log workspace cleanup actions when terminal issues are detected

Rationale:
- enforce immediate workspace cleanup when Linear issues reach terminal states
- align cleanup path and deletion behavior with requested implementation

Tests:
- cd elixir && mix compile
- cd elixir && mix test (first run had one flaky timing failure; rerun passed)

Refs: caclawphony-f2f

Regeneration-Prompt: |
  Implement workspace cleanup in the orchestrator when an issue reaches a
  terminal state (Done, Canceled, Duplicate). Keep cleanup limited to terminal
  transitions only, not active or gate states. Update the cleanup helper to
  construct the workspace path from Config.workspace_root() and issue
  identifier, remove it with File.rm_rf/1, and emit a log entry describing the
  cleanup. Verify by compiling and running tests, then close pebbles issue
  caclawphony-f2f and commit the change.

Co-authored-by: Codex <codex@openai.com>
2026-03-05 00:40:10 -08:00
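The cleanup rule in this commit (terminal states only, path built from the workspace root plus the issue identifier) can be sketched in shell. The real change is in Elixir and uses `File.rm_rf/1`; this is a translation for illustration. `WORKSPACE_ROOT` stands in for `Config.workspace_root()`, and the function name and log wording are assumptions.

```shell
#!/bin/sh
# Shell sketch of terminal-state workspace cleanup. Names are hypothetical.
cleanup_workspace() {
  issue_identifier=$1
  issue_state=$2

  # Refuse to run with an empty identifier, which would target the root itself.
  [ -n "$issue_identifier" ] || return 1

  case "$issue_state" in
    Done|Canceled|Duplicate)
      workspace_path="$WORKSPACE_ROOT/$issue_identifier"
      echo "cleaning workspace $workspace_path"
      rm -rf "$workspace_path"
      ;;
    *)
      # Active and gate states keep their workspace.
      ;;
  esac
}
```

A call such as `cleanup_workspace MAR-55 Done` would remove `$WORKSPACE_ROOT/MAR-55`, while the same call with a non-terminal state leaves the directory in place.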
Josh Lehman
7d3a2cdc06
feat(review): add PR intake mix task for Linear
Summary:
- Add `mix caclawphony.review` task for batch PR intake
- Fetch PR title/url via `gh pr view` per PR number
- Create Linear issues in MAR team Review state + Caclawphony project
- Print created Linear identifier and URL for each PR

Rationale:
- Provide a direct CLI for converting PR numbers into Linear review work
- Keep Linear GraphQL interactions aligned with existing client patterns

Tests:
- cd elixir && mix compile
- cd elixir && LINEAR_API_KEY=test mix caclawphony.review --help

Issue:
- caclawphony-432

Regeneration-Prompt: |
  Implement a new Mix task `mix caclawphony.review` that accepts one or
  more PR numbers and creates one Linear issue per PR for review intake.
  Preserve existing project behavior by reusing the repository's
  `SymphonyElixir.Linear.Client.graphql/3` helper instead of inventing a
  new HTTP client path.

  For each PR number, call GitHub CLI to fetch title exactly via
  `gh pr view <num> --json title -q .title` and gather the PR URL for the
  issue description. Build Linear issue titles as `PR #<num>: <title>`,
  include PR metadata in the description, and set `stateId` to Review and
  `projectId` to Caclawphony using the provided IDs.

  Determine the team ID dynamically by querying Linear for team key `MAR`
  before creating issues. Require `LINEAR_API_KEY` through existing config
  resolution, fail loudly on GraphQL errors or command failures, and print
  the created issue identifier for each PR.

Co-authored-by: Codex <codex@openai.com>
2026-03-05 00:40:10 -08:00
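The title-building step from the prompt above can be sketched in shell. The actual task is a Mix task in Elixir. The function names here are hypothetical; only the `gh pr view` invocation and the `PR #<num>: <title>` format come from the commit message.

```shell
#!/bin/sh
# Fetch a PR title using the gh invocation quoted in the commit message.
fetch_pr_title() {
  gh pr view "$1" --json title -q .title
}

# Build the Linear issue title in the "PR #<num>: <title>" format.
# Returns non-zero if the title lookup fails, so callers can fail loudly.
linear_issue_title() {
  pr_number=$1
  pr_title=$(fetch_pr_title "$pr_number") || return 1
  printf 'PR #%s: %s\n' "$pr_number" "$pr_title"
}
```

Keeping the `gh` call in its own function makes the formatting logic easy to exercise with a stubbed title lookup.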
Josh Lehman
036a3b696f
caclawphony-18b: ensure logfile parent directory exists
2026-03-05 00:40:10 -08:00
Josh Lehman
e4176a0210
caclawphony-fa2: skip after_create for git-initialized workspaces
2026-03-05 00:40:09 -08:00
Josh Lehman
f882ab141f
pb: close caclawphony-5fb (sandbox networking)
2026-03-05 00:24:47 -08:00
Josh Lehman
b5a3110f7a
codex: switch to danger-full-access sandbox for network access
prepare-pr and merge-pr need gh CLI, pnpm install, and network access
that workspace-write sandbox blocks. Using full access for now.

Closes caclawphony-5fb
2026-03-05 00:24:47 -08:00
Josh Lehman
101f2d0447
Add PLAN.md and initialize pebbles with 8 issues
P0: sandbox networking, workspace reuse, log dir creation
P1: PR intake CLI, Marie Clawndo integration, workspace cleanup
P2: Telegram notifications, suppress rollout errors
2026-03-05 00:24:23 -08:00
Alex Kotliarskyi
b0e0ff0082
Move Elixir observability dashboard to Phoenix (#29)
#### Context

Replace the custom Elixir observability TCP server with Phoenix while
keeping the shipped escript and existing operational API calls working
the same way they did before.

#### TL;DR

*Move the Elixir observability dashboard and API onto Phoenix without
breaking the escript or curl workflows.*

#### Summary

- replace the custom HTTP server with a Phoenix endpoint, router,
controller, and shared presenter
- add a LiveView operations dashboard with PubSub-driven updates and
embed its CSS and JS assets in code
- preserve runtime compatibility by accepting form-style POSTs and
extending the endpoint coverage around those paths

#### Alternatives

- keep extending the hand-rolled TCP server, but that keeps custom
parsing and routing complexity in the app
- serve dashboard assets from priv directories, but the shipped escript
cannot rely on those files existing on disk

#### Test Plan

- [x] `make -C elixir all`
- [x] `mix -C elixir test test/symphony_elixir/extensions_test.exs`
- [x] `./bin/symphony --i-understand-that-this-will-be-running-without-the-usual-guardrails --port 40123`
- [x] `curl -si http://127.0.0.1:40123/dashboard.css`
- [x] `curl -si http://127.0.0.1:40123/vendor/phoenix/phoenix.js`
- [x] `curl -si -X POST -d '' http://127.0.0.1:40123/api/v1/refresh`
- [x] `curl -si -X POST -d '' http://127.0.0.1:40123/api/v1/state`

---------

Co-authored-by: Codex <codex@openai.com>
2026-03-04 17:24:56 -08:00
Alex Kotliarskyi
fa75ec68c2
🎼
2026-03-04 09:29:11 -08:00