Compare commits

..

9 Commits
main ... fix-ci

Author SHA1 Message Date
Peter Steinberger
7b9cb9587e chore: bump Commander and TauTUI 2025-11-24 07:43:59 +01:00
Peter Steinberger
420e7ce999 style: apply swiftformat defaults 2025-11-24 07:43:52 +01:00
Peter Steinberger
7889f1e26e chore: tighten changelog summary parsing 2025-11-24 07:43:34 +01:00
Peter Steinberger
71448013a7 ci: scrub DYLD vars in macOS builds 2025-11-24 07:43:30 +01:00
Peter Steinberger
b365cbd21c chore: bump commander submodule 2025-11-22 20:04:21 +01:00
Peter Steinberger
dc6f88b4c0 chore: bump commander submodule 2025-11-22 20:03:20 +01:00
Peter Steinberger
0fd29a65ac chore: bump commander submodule 2025-11-22 19:52:57 +01:00
Peter Steinberger
7e7d13c6c5 chore: bump commander submodule 2025-11-22 19:52:14 +01:00
Peter Steinberger
c3ed7eb907 chore: bump commander submodule 2025-11-22 19:49:37 +01:00
1399 changed files with 93066 additions and 153173 deletions

View File

@ -1,711 +0,0 @@
---
name: crabbox
description: Use the Crabbox wrapper for OpenClaw remote validation across Linux, macOS, Windows, and WSL2, including delegated Blacksmith Testbox proof. Report the actual provider and id.
---
# Crabbox
Use the Crabbox wrapper when OpenClaw needs remote Linux proof for broad tests,
CI-parity checks, secrets, hosted services, Docker/E2E/package lanes, warmed
reusable boxes, sync timing, logs/results, cache inspection, or lease cleanup.
Crabbox is the transport/orchestration surface. The actual backend can be:
- brokered AWS Crabbox: direct provider, `provider=aws`, lease ids like
`cbx_...`, `syncDelegated=false`
- Blacksmith Testbox through Crabbox: delegated provider,
`provider=blacksmith-testbox`, ids like `tbx_...`, `syncDelegated=true`
For OpenClaw maintainer broad `pnpm` gates, Blacksmith Testbox through the
Crabbox wrapper is acceptable and often preferred when the standing Testbox
rules apply. Do not describe those runs as "AWS Crabbox"; report them as
Testbox-through-Crabbox with the `tbx_...` id and Actions run.
Use the repo `.crabbox.yaml` brokered AWS path when the task specifically needs
direct AWS Crabbox behavior, persistent direct-provider leases, `--fresh-pr`,
`--full-resync`, environment forwarding, capture/download support, or provider
comparison. Use `--provider blacksmith-testbox` when the task needs OpenClaw
maintainer Testbox proof, prepared CI environment, broad/heavy pnpm gates, or
the user asks for Testbox/Blacksmith.
## First Checks
- Run from the repo root. Crabbox sync mirrors the current checkout.
- Check the wrapper and providers before remote work:
```sh
command -v crabbox
../crabbox/bin/crabbox --version
pnpm crabbox:run -- --help | sed -n '1,120p'
../crabbox/bin/crabbox desktop launch --help
../crabbox/bin/crabbox webvnc --help
```
- OpenClaw scripts prefer `../crabbox/bin/crabbox` when present. The user PATH
shim can be stale.
- Check `.crabbox.yaml` for direct-provider defaults. Omitting `--provider`
means brokered AWS today.
- The brokered AWS default is a Linux developer image in `eu-west-1`; the repo
config pins hot `eu-west-1a/b/c` placement so Fast Snapshot Restore can apply.
If warmup drifts well past the minute-scale path, verify image promotion,
region/AZ placement, and FSR state before blaming OpenClaw.
- For broad OpenClaw maintainer `pnpm` gates, prefer the repo wrapper with
`--provider blacksmith-testbox` or the repo Testbox helpers when the standing
Testbox policy applies.
- Always report the actual provider and id. `cbx_...` means AWS Crabbox;
`tbx_...` means Blacksmith Testbox through Crabbox. If the output only says
`blacksmith testbox list`, use `blacksmith testbox list --all` before
concluding no box exists.
- If a warm direct-provider lease smells stale, retry with `--full-resync`
(alias `--fresh-sync`) before replacing the lease. This resets the remote
workdir, skips the fingerprint fast path, reseeds Git when possible, and
uploads the checkout from scratch.
- For live/provider bugs, use the configured secret workflow before downgrading
to mocks. Copy only the exact needed key into the remote process environment
for that one command. Do not print it, do not sync it as a repo file, and do
not leave it in remote shell history or logs. If no secret-safe injection path
is available, say true live provider auth is blocked instead of silently using
a fake key.
- Prefer local targeted tests for tight edit loops. Broad gates belong remote.
- Do not treat inherited shell env as operator intent. In particular,
`OPENCLAW_LOCAL_CHECK_MODE=throttled` from the local shell is not permission
to move broad `pnpm check:changed`, `pnpm test:changed`, full `pnpm test`, or
lint/typecheck fan-out onto the laptop.
- Only use `OPENCLAW_LOCAL_CHECK_MODE=throttled|full` when the user explicitly
asks for local proof in the current task. If Testbox is queued or capacity is
constrained, report the blocker and keep only targeted local edit-loop checks
running.
## macOS And Windows Targets
Use these only when the task needs an existing non-Linux host. OpenClaw broad
Linux validation uses the repo Crabbox config unless a provider is explicitly
requested.
Native brokered Windows is available for Windows-specific proof. Use the AWS
developer image in `us-west-2` on demand; it has the expected OpenClaw developer
toolchain and Docker image cache. Keep broad Linux gates on Linux/Testbox unless
the bug is Windows-specific:
```sh
../crabbox/bin/crabbox warmup \
--provider aws \
--target windows \
--windows-mode normal \
--region us-west-2 \
--market on-demand \
--timing-json
```
The hydrate workflow assumes Docker should already be baked into Linux images
and only installs it as a fallback. Do not add per-run Docker installs to proof
commands unless the image probe shows Docker is actually missing.
When the user explicitly asks for brokered macOS runners, use Crabbox AWS
macOS only after confirming the deployed coordinator supports EC2 Mac host
lifecycle/image routes and the operator has AWS EC2 Mac Dedicated Host quota
and IAM. Prefer `CRABBOX_HOST_ID` for a known Crabbox-managed Dedicated Host,
or run the no-spend preflight first:
```sh
crabbox admin hosts quota --provider aws --target macos --region eu-west-1 --type mac2.metal --json
crabbox admin hosts allocate --provider aws --target macos --region eu-west-1 --type mac2.metal --dry-run --json
CRABBOX_MACOS_TYPES=all scripts/macos-host-region-preflight.sh
```
Do not silently substitute AWS macOS for normal OpenClaw Linux proof. Report
paid-host blockers as quota, IAM, coordinator deployment, or host availability
instead of falling back to local macOS.
Crabbox supports static SSH targets:
```sh
../crabbox/bin/crabbox run --provider ssh --target macos --static-host mac-studio.local -- xcodebuild test
../crabbox/bin/crabbox run --provider ssh --target windows --windows-mode normal --static-host win-dev.local -- pwsh -NoProfile -Command "dotnet test"
../crabbox/bin/crabbox run --provider ssh --target windows --windows-mode wsl2 --static-host win-dev.local -- pnpm test
```
- `target=macos` and `target=windows --windows-mode wsl2` use the POSIX SSH,
bash, Git, rsync, and tar contract.
- Native Windows uses OpenSSH, PowerShell, Git, and tar; sync is manifest tar
archive transfer into `static.workRoot`. Direct native Windows runs support
`--script*`, `--env-from-profile`, `--preflight`, and PowerShell `--shell`.
- `crabbox actions hydrate/register` are Linux-only today; use plain
`crabbox run` loops for static macOS and Windows hosts.
- Live proof needs a reachable, operator-managed SSH host. Without one, verify
with `../crabbox/bin/crabbox run --help`, config/flag tests, and the Crabbox
Go test suite.
## Direct Brokered AWS Backend
Use this when the task needs direct AWS Crabbox semantics rather than the
prepared Blacksmith Testbox CI environment.
Changed gate:
```sh
pnpm crabbox:run -- \
--idle-timeout 90m \
--ttl 240m \
--timing-json \
--shell -- \
"env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed"
```
Full suite:
```sh
pnpm crabbox:run -- \
--idle-timeout 90m \
--ttl 240m \
--timing-json \
--shell -- \
"env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test"
```
Focused rerun:
```sh
pnpm crabbox:run -- \
--idle-timeout 90m \
--ttl 240m \
--timing-json \
--shell -- \
"env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test <path-or-filter>"
```
Read the JSON summary. Useful fields:
- `provider`: `aws`
- `leaseId`: `cbx_...`
- `syncDelegated`: `false`
- `commandPhases`: populated when the command prints `CRABBOX_PHASE:<name>`
- `commandMs` / `totalMs`
- `exitCode`
Crabbox should stop one-shot AWS leases automatically after the run. Verify
cleanup when a run fails, is interrupted, or the command output is unclear:
```sh
../crabbox/bin/crabbox list --provider aws
```
## Blacksmith Testbox Through Crabbox
Use this for OpenClaw maintainer broad/heavy `pnpm` gates when the prepared CI
environment is the right proof surface:
```sh
node scripts/crabbox-wrapper.mjs run \
--provider blacksmith-testbox \
--blacksmith-org openclaw \
--blacksmith-workflow .github/workflows/ci-check-testbox.yml \
--blacksmith-job check \
--blacksmith-ref main \
--idle-timeout 90m \
--ttl 240m \
--timing-json \
-- \
CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 OPENCLAW_TESTBOX=1 OPENCLAW_TESTBOX_REMOTE_RUN=1 pnpm check:changed
```
Read the JSON summary and the Testbox line. Useful fields:
- `provider`: `blacksmith-testbox`
- `leaseId`: `tbx_...`
- `syncDelegated`: `true`
- `syncPhases`: delegated/skipped because Blacksmith owns checkout/sync
- Actions run URL/id from the Testbox output
- `exitCode`
`blacksmith testbox list` may hide hydrating or ready boxes. Use:
```sh
blacksmith testbox list --all
blacksmith testbox status <tbx_id>
```
## Observability Flags
Use these on debugging runs before inventing ad hoc logging:
- `--preflight`: prints run context, workspace mode, SSH target, remote user/cwd,
and target-specific tool probes. Defaults cover `git`, `tar`, `node`, `npm`,
`corepack`, `pnpm`, `yarn`, `bun`, `docker`, plus POSIX
`sudo`/`apt`/`bubblewrap` and native Windows
`powershell`/`execution_policy`/`longpaths`/`temp`/`pwsh`. Add
`--preflight-tools node,bun,docker`, `CRABBOX_PREFLIGHT_TOOLS`, or repo
`run.preflightTools` to replace the list. `default` expands built-ins; `none`
prints only the workspace summary. Preflight is diagnostic only; install
toolchains through Actions hydration, images, devcontainer/Nix/mise/asdf, or
the run script. On `blacksmith-testbox`, this prints a delegated-unsupported
note because the workflow owns setup.
- `CRABBOX_ENV_ALLOW=NAME,...`: forwards only listed local env vars for direct
providers and prints `set len=N secret=true` style summaries. On
`blacksmith-testbox`, env forwarding is unsupported; put secrets in the
Testbox workflow instead.
- `--env-from-profile <file>` plus `--allow-env NAME`: loads simple
`export NAME=value` / `NAME=value` lines from a local profile without
executing it, then forwards only allowlisted names. `--allow-env` is
repeatable and comma-separated. Profile values override ambient allowlisted
env values for that run. Direct POSIX, WSL2, and native Windows runs are
supported; delegated providers are not. Crabbox probes the uploaded profile
remotely and prints redacted presence/length metadata before the command.
- `--env-helper <name>`: with `--env-from-profile` on POSIX SSH targets,
persists `.crabbox/env/<name>` and `.crabbox/env/<name>.env` so follow-up
commands on the same lease can run through `./.crabbox/env/<name> <command>`.
Use only on leases you control; the profile stays until cleanup, lease reset,
or `--full-resync`.
- `--script <file>` / `--script-stdin`: upload a local script into
`.crabbox/scripts/` and execute it on the remote box. Shebang scripts execute
directly on POSIX; scripts without a shebang run through `bash`. Native
Windows uploads run through Windows PowerShell, and Crabbox appends `.ps1`
when needed. Arguments after `--` become script args.
- `--fresh-pr owner/repo#123|URL|number`: skip dirty local sync and create a
fresh remote checkout of the GitHub PR. Bare numbers use the current repo's
GitHub origin. Add `--apply-local-patch` only when the current local
`git diff --binary HEAD` should be applied on top of that PR checkout.
- `--full-resync` / `--fresh-sync`: reset a stale direct-provider workdir
before syncing. Use after sync fingerprints look wrong, SSH times out before
sync, or rsync watchdog output suggests it. It is redundant with
`--fresh-pr`, incompatible with `--no-sync`, and unsupported by delegated
providers.
- `--capture-stdout <path>` / `--capture-stderr <path>`: write remote streams to
local files and keep binary/noisy output out of retained logs. Parent
directories must already exist. These are direct-provider only.
- `--capture-on-fail`: on non-zero direct-provider exits, downloads
`.crabbox/captures/*.tar.gz` with `test-results`, `playwright-report`,
`coverage`, JUnit XML, and nearby logs. Treat as secret-bearing until reviewed.
- `--keep-on-failure`: leave a failed one-shot lease alive for live debugging
until idle/TTL expiry. Useful on direct providers and delegated one-shots.
- `--timing-json`: final machine-readable timing. Add
`echo CRABBOX_PHASE:install`, `CRABBOX_PHASE:test`, etc. in long shell
commands; direct providers and Blacksmith Testbox both report them as
`commandPhases`.
Live-provider debug template for direct AWS/Hetzner leases:
```sh
mkdir -p .crabbox/logs
pnpm crabbox:run -- --provider aws \
--preflight \
--allow-env OPENAI_API_KEY,OPENAI_BASE_URL \
--timing-json \
--capture-stdout .crabbox/logs/live-provider.stdout.log \
--capture-stderr .crabbox/logs/live-provider.stderr.log \
--capture-on-fail \
--shell -- \
"echo CRABBOX_PHASE:install; pnpm install --frozen-lockfile; echo CRABBOX_PHASE:test; pnpm test:live"
```
Do not pass `--capture-*`, `--download`, `--checksum`, `--force-sync-large`, or
`--sync-only` to delegated providers. Also do not pass `--script*`,
`--fresh-pr`, `--full-resync`, or `--env-helper` there. Crabbox rejects these
because the provider owns sync or command transport. `--keep-on-failure` is OK
for delegated one-shots when you need to inspect a failed lease.
## Efficient Bug E2E Verification
Use the smallest Crabbox lane that proves the reported user path, not just the
touched code. Aim for one after-fix E2E proof before commenting, closing, or
opening a PR for a user-visible bug.
When the user says "test in Crabbox", do not simply copy tests to the remote
box and run them there. Crabbox is for remote real-scenario proof: copy or
install OpenClaw as the user would, run the same setup/update/CLI/Gateway/API
call that failed, and capture behavior from that entrypoint. For regressions or
bug reports, prove the broken state first when feasible, then run the same
scenario after the fix.
Pick the lane by symptom:
- Docker/setup/install bug: build a package tarball and run the matching
`scripts/e2e/*-docker.sh` or package script. This proves npm packaging,
install paths, runtime deps, config writes, and container behavior.
- Provider/model/auth bug: prefer true live E2E. Use the configured secret
workflow, then inject the single needed key into Crabbox if needed. Scrub
unrelated provider env vars in the child command so interactive defaults do
not drift to another provider. If only a dummy key is used, label the proof
narrowly, e.g. "UI/install path only; live provider auth not exercised."
- Channel delivery bug: use the channel Docker/live lane when available; include
setup, config, gateway start, send/receive or agent-turn proof, and redacted
logs.
- Gateway/session/tool bug: prefer an end-to-end CLI or Gateway RPC command that
creates real state and inspects the resulting files/API output.
- Pure parser/config bug: targeted tests may be enough, but still run a
Crabbox command when OS, package, Docker, secrets, or service lifecycle could
change behavior.
Efficient flow:
1. Reproduce or prove the pre-fix symptom from the real user-facing entrypoint
when feasible. If the issue cannot be reproduced, capture the exact command
and observed behavior instead.
2. Patch locally and run narrow local tests for edit speed.
3. Run one Crabbox E2E command that starts from the user-facing entrypoint:
package install, Docker setup, onboarding, channel add, gateway start, or
agent turn as appropriate.
4. Record proof as: Testbox id, command, environment shape, redacted secret
source, and copied success/failure output.
5. If the issue says "cannot reproduce", ask for the missing config/log fields
that would distinguish the tested path from the reporter's path.
Keep it efficient:
- Reuse existing E2E scripts and helper assertions before writing ad hoc shell.
- Use `--script <file>` or `--script-stdin` for multi-line E2E commands instead
of quote-heavy `--shell` strings on direct SSH providers.
- Use `--fresh-pr <pr>` when validating an upstream PR in isolation from the
local dirty tree. Add `--apply-local-patch` only when testing a local fixup on
top of that PR.
- Use `--full-resync` before replacing a warmed direct-provider lease when the
remote workdir or sync fingerprint appears stale.
- Use one-shot Crabbox for a single proof; use a reusable Testbox only when
several commands must share built images, installed packages, or live state.
- Prefer `OPENCLAW_CURRENT_PACKAGE_TGZ` with Docker/package lanes when testing a
candidate tarball; prefer the repo's package helper instead of direct source
execution when the bug might be packaging/install related.
- Keep secrets redacted. It is fine to report key presence, source, and length;
never print secret values.
- Include `--timing-json` on broad or flaky runs when command duration or sync
behavior matters.
Before/after PR proof on delegated Testbox:
- For PRs that should prove "broken before, fixed after", compare base and PR
on the same Testbox when practical. Fetch both refs, create detached temp
worktrees under `/tmp`, install in each, then run the same harness twice.
- Do not checkout base/PR refs in the synced repo root. Delegated Testbox sync
may leave the root dirty with local files; `git checkout` can abort or mix
proof state.
- Temp harness files under `/tmp` do not resolve repo packages by default. Put
the harness inside the worktree, or in ESM use
`createRequire(path.join(process.cwd(), "package.json"))` before requiring
workspace deps such as `@lydell/node-pty`.
- For full-screen TUI/CLI bugs, a PTY harness is stronger than helper-only
assertions. Use a real PTY, wait for visible lifecycle markers, send input,
then send control keys and assert process exit/stuck behavior.
- When validating a rebased local branch before push, remember delegated sync
usually validates synced file content on a detached dirty checkout, not a
remote commit object. Record the local head SHA, changed files, Testbox id,
and final success markers; after pushing, ensure the pushed SHA has the same
file content.
- If GitHub CI is still queued but the exact changed content passed Testbox
`pnpm check:changed`, `pnpm check:test-types`, and the real E2E proof, it is
reasonable to merge once required checks allow it. Note any still-running
unrelated shards in the proof comment instead of waiting forever.
Interactive CLI/onboarding:
- For full-screen or prompt-heavy CLI flows, run the target command inside tmux
on the Crabbox and drive it with `tmux send-keys`; capture proof with
`tmux capture-pane`, redacted through `sed`.
- Prefer deterministic arrow navigation over search typing for Clack-style
searchable selects. Raw `send-keys -l openai` may not trigger filtering in a
tmux pane; inspect option order locally or on-box and send exact Down/Enter
sequences.
- Isolate mutable state with `OPENCLAW_STATE_DIR=$(mktemp -d)`. Plugin npm
installs live under that state dir (`npm/node_modules/...`), not under
`OPENCLAW_CONFIG_DIR`. Verify downloads by checking the state dir, package
lock, and installed package metadata.
- To test automatic setup installs against local package artifacts, use
`OPENCLAW_ALLOW_PLUGIN_INSTALL_OVERRIDES=1` plus
`OPENCLAW_PLUGIN_INSTALL_OVERRIDES='{"plugin-id":"npm-pack:/tmp/plugin.tgz"}'`.
Pack with `npm pack`, set an isolated `OPENCLAW_STATE_DIR`, and verify the
package under `npm/node_modules`. Overrides are test-only and must not be
treated as official/trusted-source installs.
- For OpenAI/Codex onboarding proof, the useful markers are the UI line
`Installed Codex plugin`, `npm/node_modules/@openclaw/codex`, and the
package-lock entry showing the bundled `@openai/codex` dependency. A dummy
OpenAI-shaped key can prove only UI/install behavior; it is not live auth.
## Reuse And Keepalive
For most Crabbox calls, one-shot is enough. Use reuse only when you need
multiple manual commands on the same hydrated box.
If Crabbox returns a reusable id or you intentionally keep a lease:
```sh
pnpm crabbox:run -- --id <cbx_id-or-slug> --no-sync --timing-json --shell -- "pnpm test <path>"
```
Stop boxes you created before handoff:
```sh
pnpm crabbox:stop -- <id-or-slug>
blacksmith testbox stop --id <tbx_id>
```
## Interactive Desktop And WebVNC
Prefer WebVNC for human inspection because the browser portal can preload the
lease VNC password and avoids a native VNC client's copy/paste/password dance.
Use native `crabbox vnc` only when WebVNC is unavailable, the browser portal is
broken, or the user explicitly wants a local VNC client.
Common desktop flow:
```sh
../crabbox/bin/crabbox warmup --provider hetzner --desktop --browser --class standard --idle-timeout 60m --ttl 240m
../crabbox/bin/crabbox desktop launch --provider hetzner --id <cbx_id-or-slug> --browser --url https://example.com --webvnc --open --take-control
```
Useful WebVNC commands:
```sh
../crabbox/bin/crabbox webvnc --provider hetzner --id <cbx_id-or-slug> --open --take-control
../crabbox/bin/crabbox webvnc daemon start --provider hetzner --id <cbx_id-or-slug> --open --take-control
../crabbox/bin/crabbox webvnc daemon status --provider hetzner --id <cbx_id-or-slug>
../crabbox/bin/crabbox webvnc daemon stop --provider hetzner --id <cbx_id-or-slug>
../crabbox/bin/crabbox webvnc status --provider hetzner --id <cbx_id-or-slug>
../crabbox/bin/crabbox webvnc reset --provider hetzner --id <cbx_id-or-slug> --open --take-control
../crabbox/bin/crabbox desktop doctor --provider hetzner --id <cbx_id-or-slug>
../crabbox/bin/crabbox desktop click --provider hetzner --id <cbx_id-or-slug> --x 640 --y 420
../crabbox/bin/crabbox desktop paste --provider hetzner --id <cbx_id-or-slug> --text "user@example.com"
../crabbox/bin/crabbox desktop key --provider hetzner --id <cbx_id-or-slug> ctrl+l
../crabbox/bin/crabbox artifacts collect --id <cbx_id-or-slug> --all --output artifacts/<slug>
../crabbox/bin/crabbox artifacts publish --dir artifacts/<slug> --pr <number>
```
`desktop launch --webvnc --open` is usually the nicest one-shot: it starts the
browser/app inside the visible session, bridges the lease into the authenticated
WebVNC portal, and opens the portal. Keep browsers windowed for human QA; use
`--fullscreen` only for capture/video workflows.
For human handoff, include `--take-control` so the opened portal viewer gets
keyboard/mouse control automatically instead of landing as an observer.
Human handoff preflight:
- Do not assume a visible desktop or launched browser means the repo CLI/app is
installed, built, or on the interactive terminal's `PATH`.
- Before handing WebVNC to a human tester, prove the expected command from the
same kept lease and from a neutral directory such as `~`.
- If the handoff needs repo-local code, sync/build/link it explicitly on that
lease. Source-tree CLIs often need build output before a symlink works.
- Prefer a real `command -v <expected-command> && <expected-command> --version`
check over a repo-root-only `pnpm ...` command.
Generic handoff repair pattern:
```sh
../crabbox/bin/crabbox run --id <cbx_id-or-slug> --full-resync --shell -- \
"set -euo pipefail
pnpm install --frozen-lockfile
pnpm build
sudo ln -sf \"\$PWD/<cli-entry>\" /usr/local/bin/<expected-command>
cd ~
command -v <expected-command>
<expected-command> --version"
```
## If Crabbox Fails
Keep the fallback narrow. First decide whether the failure is Crabbox itself,
the brokered AWS lease, Blacksmith/Testbox, repo hydration, sync, or the test
command.
Fast checks:
```sh
command -v crabbox
../crabbox/bin/crabbox --version
pnpm crabbox:run -- --help | sed -n '1,140p'
../crabbox/bin/crabbox doctor
command -v blacksmith
blacksmith --version
blacksmith testbox list
```
Common Crabbox-only failures:
- Provider missing or old CLI: use `../crabbox/bin/crabbox` from the sibling
repo, or update/install Crabbox before retrying.
- Bad local config: inspect `.crabbox.yaml`, `crabbox config show`, and
`crabbox whoami`; normal OpenClaw proof should use brokered AWS without
asking for cloud keys.
- Slug/claim confusion: use the raw `cbx_...` / `tbx_...` id, or run one-shot
without `--id`.
- Sync/timing bug: add `--debug --timing-json`; capture the final JSON and the
printed Actions URL. Large sync warnings now include top source directories
by file count and a hint to update `.crabboxignore` / `sync.exclude`; inspect
those before reaching for `--force-sync-large`. Quiet rsync watchdogs and SSH
timeouts now print `next_action=` hints; follow them, usually `--full-resync`
first and a fresh lease second.
- Cleanup uncertainty: run `crabbox list --provider aws`; for explicit
Blacksmith runs, use `blacksmith testbox list` and stop only boxes you
created.
- Testbox queued/capacity pressure: do not retry Blacksmith repeatedly. Rerun
once without `--provider` so `.crabbox.yaml` routes to brokered AWS, or report
the Blacksmith blocker if Testbox itself is the requested proof.
If brokered AWS cannot dispatch, sync, attach, or stop, retry once with
`--debug` and `--timing-json`:
```sh
pnpm crabbox:run -- --debug --timing-json -- \
CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed
```
Full suite:
```sh
pnpm crabbox:run -- --debug --timing-json -- \
CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test
```
Auth fallback, only when `blacksmith` says auth is missing:
```sh
blacksmith auth login --non-interactive --organization openclaw
```
Raw Blacksmith footguns:
- Run from repo root. The CLI syncs the current directory.
- Save the returned `tbx_...` id in the session.
- Reuse that id for focused reruns; stop it before handoff.
- Raw commit SHAs are not reliable `warmup --ref` refs; use a branch or tag.
- Treat `blacksmith testbox list` as cleanup diagnostics, not a shared reusable
queue.
Use Blacksmith only when the task is specifically about Testbox, brokered AWS
is unavailable, or an explicit comparison is needed. If Blacksmith is down or
quota-limited, do not keep probing it; stay on brokered AWS and note the
delegated-provider outage.
## Blacksmith Backend Notes
Crabbox Blacksmith backend delegates setup to:
- org: `openclaw`
- workflow: `.github/workflows/ci-check-testbox.yml`
- job: `check`
- ref: `main` unless testing a branch/tag intentionally
The hydration workflow owns checkout, Node/pnpm setup, dependency install,
secrets, ready marker, and keepalive. Crabbox owns dispatch, sync, SSH command
execution, timing, logs/results, and cleanup.
Minimal Blacksmith-backed Crabbox run, from repo root:
```sh
pnpm crabbox:run -- --provider blacksmith-testbox --timing-json -- \
CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test:changed
```
Use direct Blacksmith only when Crabbox is the broken layer and you are
isolating a Crabbox bug. Prefer direct `blacksmith testbox list` for cleanup
diagnostics, not as a reusable work queue.
Important Blacksmith footguns:
- Always run from repo root. The CLI syncs the current directory.
- Raw commit SHAs are not reliable `warmup --ref` refs; use a branch or tag.
- If auth is missing and browser auth is acceptable:
```sh
blacksmith auth login --non-interactive --organization openclaw
```
## Brokered AWS
Use AWS for normal OpenClaw remote proof. The repo `.crabbox.yaml` already
selects brokered AWS, so omit `--provider` unless you are testing a different
provider deliberately.
```sh
pnpm crabbox:warmup -- --class beast --market on-demand --idle-timeout 90m
pnpm crabbox:hydrate -- --id <cbx_id-or-slug>
pnpm crabbox:run -- --id <cbx_id-or-slug> --timing-json --shell -- "env NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed"
pnpm crabbox:stop -- <cbx_id-or-slug>
```
Install/auth for owned Crabbox if needed:
```sh
brew install openclaw/tap/crabbox
crabbox login --url https://crabbox.openclaw.ai --provider aws
```
New users should self-resolve broker auth before anyone asks for AWS keys:
```sh
crabbox config show
crabbox doctor
crabbox whoami
```
- If broker auth is missing, run `crabbox login --url https://crabbox.openclaw.ai --provider aws`.
- If the CLI asks for `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, or AWS
profile setup during normal OpenClaw validation, assume the agent selected
the wrong path. Use brokered `crabbox login` or an existing brokered lease
before asking the user for cloud credentials.
- Ask for AWS keys only for explicit direct-provider/account administration,
not for normal brokered OpenClaw proof.
- Trusted automation may still use
`printf '%s' "$CRABBOX_COORDINATOR_TOKEN" | crabbox login --url https://crabbox.openclaw.ai --provider aws --token-stdin`.
macOS config lives at:
```text
~/Library/Application Support/crabbox/config.yaml
```
It should include `broker.url`, `broker.token`, and usually `provider: aws`
for OpenClaw lanes. Let that config drive normal validation.
### Interactive Desktop / WebVNC
For human desktop demos, prefer `webvnc` over native `vnc` and keep the remote
desktop visible/windowed. Do not fullscreen the remote browser or hide the XFCE
panel/window chrome unless the explicit goal is video/capture output. After
launch, verify a screenshot shows the desktop panel plus browser title bar. If
Chrome is fullscreen, toggle it back with:
```sh
crabbox run --id <lease> --shell -- 'DISPLAY=:99 xdotool search --onlyvisible --class google-chrome windowactivate key F11'
```
## Diagnostics
```sh
crabbox status --id <id-or-slug> --wait
crabbox inspect --id <id-or-slug> --json
crabbox sync-plan
crabbox history --limit 20
crabbox history --lease <id-or-slug>
crabbox attach <run_id>
crabbox events <run_id> --json
crabbox logs <run_id>
crabbox results <run_id>
crabbox cache stats --id <id-or-slug>
crabbox ssh --id <id-or-slug>
blacksmith testbox list
```
Use `--debug` on `run` when measuring sync timing.
Use `--timing-json` on warmup, hydrate, and run when comparing backends.
Use `--market spot|on-demand` only on AWS warmup/one-shot runs.
## Failure Triage
- Crabbox cannot find provider: verify `../crabbox/bin/crabbox --help` lists
the provider selected by `.crabbox.yaml`; update Crabbox before falling back.
- Hydration stuck or failed: open the printed GitHub Actions run URL and inspect
the hydration step.
- Sync failed: rerun with `--debug`; check changed-file count and whether the
checkout is dirty.
- Command failed: rerun only the failing shard/file first. Do not rerun a full
suite until the focused failure is understood.
- Cleanup uncertain: `crabbox list --provider aws`; for explicit Blacksmith
runs, use `blacksmith testbox list` and stop owned `tbx_...` leases you
created.
- Crabbox broken but Blacksmith works: use the direct Blacksmith fallback above,
then file/fix the Crabbox issue.
## Boundary
Do not add OpenClaw-specific setup to Crabbox itself. Put repo setup in the
hydration workflow and keep Crabbox generic around lease, sync, command
execution, logs/results, timing, and cleanup.

View File

@ -1,140 +0,0 @@
---
name: release-peekaboo
description: "Peekaboo release: notarization, npm/GitHub release, appcast, verify, closeout."
metadata: {"clawdbot":{"emoji":"👁️","requires":{"bins":["pnpm","op","tmux","gh","xcrun","jq","node","npm"]}}}
---
# Peekaboo Release
Release `~/Projects/Peekaboo` as the npm package `@steipete/peekaboo` plus signed/notarized macOS app assets.
Use `$one-password`, `$browser-use`, `$npm`, `$autoreview`, and repo `AGENTS.md` rules. Load `$release-private` if it exists before resolving Peter-owned credential locators. Read `$npm` before any npm auth, token, or publish recovery work. Keep all `op` secret work inside one persistent tmux session. Never print `.p8`, npm tokens, passwords, or OTPs.
## Current Secrets
- Peter-owned credential item names, key ids, issuer ids, keychain paths, and npm token locators live in `$release-private`.
- Required ASC fields: `key_id`, `issuer_id`, `private_key_p8`.
- Stale/revoked key symptom: `xcrun notarytool submit` fails with `HTTP status code: 401. Unauthenticated`.
- All ASC fields must come from the same current item; do not mix profile values with 1Password refs.
Sparkle key:
- Repo `.mac-release.env` has the current fallback.
- Do not set `SPARKLE_PRIVATE_KEY_FILE` for normal releases.
Developer ID release keychain:
- Resolve the release keychain item/path from `$release-private`.
- If macOS shows `codesign wants to use the release keychain`, enter the keychain item password, not the Developer ID `.p12` password.
- The Developer ID certificate password is only for importing the `.p12` while creating the keychain.
- After setup/import, run `security unlock-keychain` and `security set-key-partition-list -S apple-tool:,apple:,codesign: -s -k "$KEYCHAIN_PASSWORD" "$KEYCHAIN_PATH"` so `codesign` can use the identity without GUI prompts.
npm publish token:
- Resolve token/TOTP locators from `$release-private`.
- Use `$npm` rules. Run inside the same tmux session, write only a temp npmrc, delete it immediately, and use the `npmjs` TOTP item for web auth if npm prompts.
- Do not create short-lived/granular bypass tokens for a normal Peekaboo publish. They add cleanup risk and did not help the 3.2.1 slow-upload/web-auth path.
## Notary Credential Check
Use the service account from `$release-private` first. Put the token in the tmux environment without printing it:
```bash
# Resolve SERVICE_ACCOUNT_TOKEN from $release-private first.
tmux -S "$SOCKET" set-environment -t "$SESSION" OP_SERVICE_ACCOUNT_TOKEN "$SERVICE_ACCOUNT_TOKEN"
```
Create a temp env file with service-account refs from `$release-private`:
```text
APP_STORE_CONNECT_API_KEY_P8=<1Password ref from release-private>
APP_STORE_CONNECT_KEY_ID=<1Password ref from release-private>
APP_STORE_CONNECT_ISSUER_ID=<1Password ref from release-private>
```
Before a release, verify shape and Apple auth without printing values:
```bash
op run --env-file "$ENVFILE" -- bash -c '
set -euo pipefail
KEY_FILE="/tmp/AuthKey_${APP_STORE_CONNECT_KEY_ID}.p8"
printf "%s\n" "$APP_STORE_CONNECT_API_KEY_P8" > "$KEY_FILE"
chmod 600 "$KEY_FILE"
xcrun notarytool history \
--key "$KEY_FILE" \
--key-id "$APP_STORE_CONNECT_KEY_ID" \
--issuer "$APP_STORE_CONNECT_ISSUER_ID" \
--output-format json >/dev/null
rm -f "$KEY_FILE"
'
```
Peekaboo forces `notarytool submit --no-s3-acceleration`; the default S3 accelerated upload path can return a misleading `401` even when `history` auth succeeds.
If both `history` and non-S3 `submit` fail, suspect wrong access level or stale key. Browser route:
1. Use `$browser-use` real Chrome profile.
2. Open `https://appstoreconnect.apple.com/access/integrations/api`.
3. Generate Team Key named `Peekaboo Release <version>` with `Admin` access.
4. Download `.p8` once from the key row.
5. Store immediately into the private credential map; verify `notarytool history`; delete `~/Downloads/AuthKey_<key_id>.p8`.
6. Revoke the older Peekaboo release key after the new key validates.
## Release Flow
1. Start on clean `main`; pull ff-only if needed.
2. Set version in:
- `package.json`
- `version.json`
- `Apps/CLI/Sources/Resources/version.json`
- README npm badge
- `Core/PeekabooCore/Sources/PeekabooAgentRuntime/MCP/PeekabooMCPVersion.swift`
- Xcode marketing versions under `Apps/*`
3. Date `CHANGELOG.md` and `Apps/CLI/CHANGELOG.md` for the release.
4. Run focused proof or release script preflight. Release gates must be warning-free.
5. Use `$autoreview` before commit unless the change is trivial/docs-only.
6. Commit release prep with `committer`.
7. Push `main`.
8. Run:
```bash
op run --env-file "$ENVFILE" -- \
bash -c 'printf "y\n" | ./scripts/release-binaries.sh --create-github-release --publish-npm'
```
The script builds universal CLI, npm package, signed/notarized app zip, appcast, checksums, draft GitHub release, and npm publish.
Use a non-login shell: profile exports can replace current 1Password ASC IDs with stale values while leaving the current `.p8`, producing a misleading `401`.
Notarized releases must sign with `Developer ID Application: Peter Steinberger (Y5PE65HELJ)`, not `Apple Development`. If your shell has `SIGN_IDENTITY` exported for CLI builds, override it for the release command.
If npm upload is slow and TOTP expires, use the stored npm token through a temp npmrc and complete npm web auth immediately when prompted with the configured TOTP. Do not create granular bypass tokens for this; if one was created by mistake, delete it before closeout.
## Verify
Required before closeout:
```bash
npm view @steipete/peekaboo@<version> version dist-tags dist.tarball dist.integrity time --json
(cd /tmp && npm exec --yes --package=@steipete/peekaboo@<version> -- peekaboo --version)
gh release view v<version> --repo openclaw/Peekaboo --json tagName,isDraft,isPrerelease,url,assets,body
xmllint --noout appcast.xml
git status --short --branch
```
Confirm:
- npm version exists and `latest` points to it.
- npm-downloaded CLI reports the release version from a neutral cwd.
- GitHub release/tag/assets exist; release body is from changelog.
- app zip asset exists and appcast points at `v<version>`.
- `appcast.xml` changes are committed and pushed.
- Publish draft release if the script leaves it draft.
## Closeout
1. Add next patch `Unreleased` section to root and CLI changelogs.
2. Commit with `committer "docs(changelog): open <next-version>" CHANGELOG.md Apps/CLI/CHANGELOG.md`.
3. Push.
4. Watch release/homebrew/CI workflows if triggered.
5. `git checkout main && git pull --ff-only && git status --short --branch`.
6. Clear tmux `OP_SERVICE_ACCOUNT_TOKEN`, remove temp env/key files, and final with what landed.

28
.claude/hooks/pre_bash.py Executable file
View File

@ -0,0 +1,28 @@
#!/usr/bin/env python3
import json
import sys
import re
import os
try:
data = json.load(sys.stdin)
cmd = data.get("tool_input", {}).get("command", "")
# ALWAYS block git reset --hard, regardless of project
if re.search(r'\bgit\s+reset\s+--hard\b', cmd):
print("BLOCKED: git reset --hard is NEVER allowed for AI agents", file=sys.stderr)
print(f"Attempted: {cmd}", file=sys.stderr)
print("Only the user can run this command directly.", file=sys.stderr)
sys.exit(2)
# If ./runner exists, enforce stricter rules
if os.path.exists('./runner'):
if re.search(r'\bgit\s+', cmd) and './runner' not in cmd and 'runner git' not in cmd:
print("BLOCKED: All git commands must use ./runner in this project", file=sys.stderr)
print(f"Attempted: {cmd}", file=sys.stderr)
print("Use: ./runner git <subcommand>", file=sys.stderr)
sys.exit(2)
sys.exit(0)
except:
sys.exit(0)

View File

@ -0,0 +1,21 @@
{
"enableAllProjectMcpServers": false,
"hooks": {
"PreToolUse": [
{
"tool": "Bash",
"command": [
"python3",
".claude/hooks/pre_bash.py"
]
}
]
},
"permissions": {
"allow": [
"Bash(git -C /Users/steipete/Projects/Peekaboo diff --stat Apps/CLI/Sources/PeekabooCLI/CLI/PeekabooEntryPoint.swift)"
],
"deny": [],
"ask": []
}
}

View File

@ -1,50 +0,0 @@
profile: peekaboo-check
provider: aws
class: standard
capacity:
market: spot
strategy: most-available
fallback: on-demand-after-120s
hints: true
regions:
- eu-west-1
- eu-west-2
- eu-central-1
- us-east-1
- us-west-2
actions:
workflow: .github/workflows/crabbox-hydrate.yml
job: hydrate
ref: main
runnerLabels:
- crabbox
- openclaw
- peekaboo
runnerVersion: latest
ephemeral: true
aws:
region: eu-west-1
rootGB: 160
sync:
delete: true
checksum: false
gitSeed: true
fingerprint: true
baseRef: main
exclude:
- .artifacts
- .codex
- .DS_Store
- coverage
- dist
- node_modules
- .build
env:
allow:
- CI
- NODE_OPTIONS
- PNPM_*
- NPM_CONFIG_*
ssh:
user: crabbox
port: "2222"

108
.cursor/rules/agent.mdc Normal file
View File

@ -0,0 +1,108 @@
---
description:
globs:
alwaysApply: false
---
# Agent Instructions
This file provides guidance to AI assistants when working with code in this repository.
## Project Overview
This is the `peekaboo` project, which provides a Model Context Protocol (MCP) server that enables executing AppleScript and JavaScript for Automation (JXA) scripts on macOS. The server features a knowledge base of pre-defined scripts accessible by ID and supports inline scripts, script files, and argument passing.
## Architecture
- **Server Configuration**: The server reads configuration from environment variables like `LOG_LEVEL` and `KB_PARSING`.
- **MCP Tools**: Two main tools are provided:
1. `execute_script`: Executes AppleScript/JXA from inline content, file path, or knowledge base ID
2. `get_scripting_tips`: Retrieves information from the knowledge base
- **Knowledge Base**: A collection of pre-defined scripts stored as Markdown files in `knowledge_base/` directory with YAML frontmatter
- **ScriptExecutor**: Core component that executes scripts via `osascript` command
## Knowledge Base System
The knowledge base (`knowledge_base/` directory) contains numerous Markdown files organized by category:
- Each file has YAML frontmatter with metadata: `id`, `title`, `description`, `language`, etc.
- The actual script code is contained in the Markdown body in a fenced code block
- Scripts can use placeholders like `--MCP_INPUT:keyName` and `--MCP_ARG_N` for parameter substitution
## Common Development Commands
```bash
# Install dependencies
npm install
# Run the server in development mode with hot reloading
npm run dev
# Build the TypeScript project
npm run build
# Start the compiled server
npm run start
# Lint the codebase
npm run lint
# Format the codebase
npm run format
# Validate the knowledge base
npm run validate
```
## Environment Variables
- `LOG_LEVEL`: Set logging level (`DEBUG`, `INFO`, `WARN`, `ERROR`) - default is `INFO`
- `KB_PARSING`: Controls when knowledge base is parsed:
- `lazy` (default): Parsed on first request
- `eager`: Parsed when server starts
## Working with the Knowledge Base
When adding new scripts to the knowledge base:
1. Create a new `.md` file in the appropriate category folder
2. Include required YAML frontmatter (`title`, `description`, etc.)
3. Add the script code in a fenced code block
4. Run `npm run validate` to ensure the new content is correctly formatted
## Code Execution Flow
1. The `server.ts` file defines the MCP server and its tools
2. `knowledgeBaseService.ts` loads and indexes scripts from the knowledge base
3. `ScriptExecutor.ts` handles the actual execution of scripts
4. Input validation is handled via Zod schemas in `schemas.ts`
5. Logging is managed by the `Logger` class in `logger.ts`
## Security and Permissions
Remember that scripts run on macOS require specific permissions:
- Automation permissions for controlling applications
- Accessibility permissions for UI scripting via System Events
- Full Disk Access for certain file operations
## Agent Operational Learnings and Debugging Strategies
This section captures key operational strategies and debugging techniques for the agent (me) based on collaborative sessions.
### Prioritizing Log Visibility for Debugging
When an external tool or script (like AppleScript via `osascript`) returns cryptic errors, or when agent-generated code/substitutions might be faulty:
1. **Suspect Dynamic Content**: Issues often stem from the dynamic content being passed to the external tool (e.g., incorrect placeholder substitutions leading to syntax errors in the target language).
2. **Enable/Add Detailed Logging**: Prioritize enabling any built-in detailed logging features of the tool in question (e.g., `includeSubstitutionLogs: true` for this project's `execute_script` tool).
3. **Ensure Log Visibility**: If standard debug logging doesn't appear in the primary output channel the user is observing, attempt to modify the code to force critical diagnostic information (like step-by-step transformations, variable states, or the exact content being passed externally) into that main output. This might involve temporarily altering the structure of the success or error messages to include these logs.
* **Confirm Restarts and Code Version**: For changes requiring server restarts (common in this project), leverage any features that confirm the new code is active. For example, the server startup timestamp and execution mode info appended to `get_scripting_tips` output helps verify that a restart was successful and the intended code version (e.g., TypeScript source via `tsx` vs. compiled `dist/server.js`) is running.
### Iterative Simplification for Complex Patterns (e.g., Regex)
If a complex pattern (like a regular expression) in code being generated or modified by the agent is not working as expected, and the cause isn't immediately obvious:
1. **Isolate the Pattern**: Identify the specific complex pattern (e.g., a regex for string replacement).
2. **Drastically Simplify**: Reduce the pattern to its most basic form that should still achieve a part of the goal or match a core component of the target string. (e.g., simplifying `/(?:["'])--MCP_INPUT:(\w+)(?:["'])/g` to `/--MCP_INPUT:/g` to test basic matching of the placeholder prefix).
3. **Test the Simple Form**: Verify if this simplified pattern works. If it does, the core string manipulation mechanism is likely sound.
4. **Incrementally Rebuild & Test**: Gradually add back elements of the original complexity (e.g., capture groups, character sets, quantifiers, lookarounds, backreferences like `\1`). Test at each incremental step to pinpoint which specific construct or combination introduces the failure. This process helped identify that `(?:["'])` was problematic in our placeholder regex, leading to a solution using a capturing group and a backreference like `/(["'])--MCP_INPUT:(\w+)\1/g`.
5. **Verify Replacement Logic**: Ensure that if the pattern involves capturing groups for use in a replacement, the replacement logic correctly utilizes these captures and produces the intended output format (e.g., `valueToAppleScriptLiteral` for AppleScript).
This methodical approach is more effective than repeatedly trying minor variations of an already complex and failing pattern.

View File

@ -0,0 +1,98 @@
---
description:
globs:
alwaysApply: false
---
Rule Name: mcp-inspector
Description: Debugging and verifying the `macos-automator-mcp` server via the MCP Inspector, using Playwright for UI automation and direct terminal commands for server management. This rule prioritizes stability and detailed verification through Playwright's introspection capabilities.
**Required Tools:**
- `run_terminal_cmd`
- `mcp_playwright_browser_navigate`
- `mcp_playwright_browser_type`
- `mcp_playwright_browser_click`
- `mcp_playwright_browser_snapshot`
- `mcp_playwright_browser_console_messages`
- `mcp_playwright_browser_wait_for`
**User Workspace Path Placeholder:**
- The path to the `start.sh` script will be specified as `[WORKSPACE_PATH]/start.sh`.
- The AI assistant executing this rule **MUST** replace `[WORKSPACE_PATH]` with the absolute path to the user's current project workspace (e.g., as found in the `<user_info>` context block during rule execution).
- Example of a resolved path if the workspace is `/Users/username/Projects/my-mcp-project`: `/Users/username/Projects/my-mcp-project/start.sh`.
---
**Main Flow:**
**Phase 1: Start MCP Inspector Server**
1. **Kill Existing Inspector Processes:**
* Action: Call `run_terminal_cmd`.
* `command`: `pkill -f 'npx @modelcontextprotocol/inspector' || true`
* `is_background`: `false`
* Expected: Cleans up any lingering Inspector processes.
2. **Start New Inspector Process:**
* Action: Call `run_terminal_cmd`.
* `command`: `npx @modelcontextprotocol/inspector`
* `is_background`: `true`
* Expected: MCP Inspector starts in the background.
3. **Wait for Inspector Initialization:**
* Action: Call `mcp_playwright_browser_wait_for`.
* `time`: `10` (seconds)
* Expected: Allows ample time for the Inspector server to be ready. This step requires an active Playwright page, so it's implicitly preceded by navigation in Phase 2 if the browser isn't already open.
**Phase 2: Connect to Server via Playwright**
1. **Navigate to Inspector URL:**
* Action: Call `mcp_playwright_browser_navigate`.
* `url`: `http://127.0.0.1:6274`
* Expected: Playwright opens the MCP Inspector web UI.
* Snapshot: Take a snapshot (`mcp_playwright_browser_snapshot`) to confirm page load and identify initial form element references (`ref`).
2. **Fill Form (Command & Args only):**
* **Set Command:**
* Action: Call `mcp_playwright_browser_type`.
* `element`: "Command textbox" (Obtain `ref` from snapshot).
* `text`: `macos-automator-mcp`
* **Set Arguments:**
* Action: Call `mcp_playwright_browser_type`.
* `element`: "Arguments textbox" (Obtain `ref` from snapshot).
* `text`: `[WORKSPACE_PATH]/start.sh` (This placeholder MUST be replaced by the AI executing the rule with the absolute path to the user's current workspace).
* *(Note: Environment Variables are skipped in this flow for simplicity and stability, as issues were previously observed when setting LOG_LEVEL=DEBUG during connection.)*
3. **Click "Connect":**
* Action: Call `mcp_playwright_browser_click`.
* `element`: "Connect button" (Obtain `ref` from snapshot).
* Expected: Connection to the `macos-automator-mcp` server is established.
* Snapshot: Take a snapshot. Verify connection status (e.g., text changes to "Connected") and check for initial server logs in the UI.
**Phase 3: Interact with a Tool via Playwright**
1. **List Tools:**
* Action: Call `mcp_playwright_browser_click`.
* `element`: "List Tools button" (Obtain `ref` from the latest snapshot).
* Expected: The list of available tools from the `macos-automator-mcp` server is displayed.
* Snapshot: Take a snapshot. Verify tools like `execute_script` and `get_scripting_tips` are visible.
2. **Select 'get_scripting_tips' Tool:**
* Action: Call `mcp_playwright_browser_click`.
* `element`: "get_scripting_tips tool in list" (Obtain `ref` by identifying it in the snapshot's tool list).
* Expected: The parameters form for `get_scripting_tips` is displayed in the right-hand panel.
* Snapshot: Take a snapshot. Verify the right panel shows details for `get_scripting_tips` (e.g., its name, description, and parameter fields like 'searchTerm', 'listCategories', etc.).
3. **Execute 'get_scripting_tips' (default parameters):**
* Action: Call `mcp_playwright_browser_click`.
* `element`: "Run Tool button" (Obtain `ref` for the 'Run Tool' button specific to the `get_scripting_tips` form in the right panel from the snapshot).
* Expected: The `get_scripting_tips` tool is executed with its default parameters.
* Snapshot: Take a snapshot.
**Phase 4: Verify Tool Execution and Logs in Playwright**
1. **Check for Results in UI:**
* Action: Examine the latest snapshot.
* Look for: The results of the `get_scripting_tips` call (e.g., a list of script categories if `listCategories` was implicitly true by default, or an empty result if no default search term was run).
* The results should appear in the 'Result from tool' or a similarly named section within the right-hand panel where the tool's form was.
2. **Check Console Logs (Optional but Recommended):**
* Action: Call `mcp_playwright_browser_console_messages`.
* Expected: Review for any errors or relevant messages from the Inspector or the tool interaction.
3. **Check MCP Server Logs in UI:**
* Action: Examine the latest snapshot.
* Look for: Logs related to the `get_scripting_tips` tool execution in the main server log panel (usually bottom-left, titled "Error output from MCP server" or similar, but also shows general logs).
**Troubleshooting Notes:**
- If connection fails, check the `run_terminal_cmd` output for the Inspector to ensure it started correctly.
- Check Playwright console messages for clues.
- Ensure the `[WORKSPACE_PATH]` was correctly resolved and points to an existing `start.sh` script.
- Element `ref` values can change. Always use the latest snapshot to get correct `ref` values before an interaction.
- Shadow DOM: The MCP Inspector UI uses Shadow DOM extensively for the tool details and results panels. Playwright's default selectors should pierce Shadow DOM, but if issues arise with finding elements *within* the tool panel (right-hand side after selecting a tool), be mindful of this. The provided flow assumes Playwright's auto-piercing handles this sufficiently.

216
.cursor/rules/safari.mdc Normal file
View File

@ -0,0 +1,216 @@
---
description:
globs:
alwaysApply: false
---
### Meta Note
This file, `safari.mdc`, serves as a repository for detailed working notes, observations, and learnings acquired during the process of automating Safari interactions, particularly for the MCP Inspector UI. It's intended to capture the nuances of trial-and-error, debugging steps, and insights into what worked, what didn't, and why.
This contrasts with `mcp-inspector.mdc`, which is designed to be the concise, polished, and operational ruleset for future automated runs once a specific automation flow (like connecting to the MCP Inspector) has been stabilized and proven reliable. `mcp-inspector.mdc` should contain the 'final' working scripts and minimal necessary commentary, while `safari.mdc` is the space for the extended antechamber of discovery.
---
### Key Learnings and Observations from Safari Automation (MCP Inspector)
#### 1. Managing Safari Windows and Tabs for the Inspector
* **Objective:** Reliably direct Safari to the MCP Inspector URL (`http://127.0.0.1:6274`) in a predictable way, preferably using a single, consistent browser window and tab to avoid disrupting the user's workspace or losing context.
* **Initial Challenges & Evolution:
* Simply using `make new document with properties {URL:"..."}` could lead to multiple windows/tabs if not managed.
* Attempts to close all existing Inspector tabs first (`repeat with w in windows... close t...`) were functional but could be overly aggressive if the user had other work in Safari.
* Identifying and reusing an *existing specific tab* for the Inspector requires careful targeting (e.g., `first tab whose URL starts with "..."`). If this tab was from a previous, unconfigured session, just switching to it wasn't enough; it needed to be reloaded/reset.
* **Refined & Recommended Approach (as implemented in `mcp-inspector.mdc`):
```applescript
tell application "Safari"
activate
delay 0.2 -- Allow Safari to become the frontmost application
if (count of windows) is 0 then
-- No Safari windows are open, so create a new one.
make new document with properties {URL:"http://127.0.0.1:6274"}
else
-- Safari has windows open; use the frontmost one.
tell front window
set inspectorTab to missing value
try
-- Check if a tab for the Inspector is already open in this window.
set inspectorTab to (first tab whose URL starts with "http://127.0.0.1:6274")
end try
if inspectorTab is not missing value then
-- An Inspector tab exists: set its URL again (to refresh/reset) and make it active.
set URL of inspectorTab to "http://127.0.0.1:6274"
set current tab to inspectorTab
else
-- No specific Inspector tab found: set the URL of the *current active tab*.
set URL of current tab to "http://127.0.0.1:6274"
end if
end tell
end if
delay 1 -- Pause to allow the page to begin loading.
end tell
```
This logic aims to use the existing front window and either reuse/refresh an Inspector tab or repurpose the current active tab, falling back to creating a new window only if Safari isn't open.
#### 2. Clicking Elements Programmatically (The "Connect" Button Saga)
* **The Core Challenge:** Programmatically clicking the "Connect" button in the MCP Inspector UI to initiate the server connection.
* **Strategies Explored & Lessons:
* **CSS Selectors (`querySelector`):**
* Simple selectors like `[data-testid='env-vars-button']` worked for some buttons but required escaping single quotes in AppleScript: `do JavaScript "document.querySelector('[data-testid=\\\'env-vars-button\\']').click();"`.
* A complex `querySelector` for the "Connect" button (e.g., `'button[data-testid*=connect-button], button:not([disabled])... > span:contains(Connect)...'.click()`) ran without JS error but didn't reliably establish the connection, suggesting it might not have found the exact interactable element or the click wasn't registering correctly.
* **XPath (`document.evaluate`):**
* **Highly Specific XPaths:** An initial XPath based on the rule (`//button[contains(., 'Connect') and .//svg[.//polygon[@points='6 3 20 12 6 21 6 3']]]`) was very difficult to embed correctly in AppleScript due to nested single quotes requiring complex escaping (`\'`). This often led to AppleScript parsing errors (`-2741`).
* **`character id 39` for AppleScript String Construction:** To combat escaping issues, building the JavaScript string in AppleScript using `set sQuote to character id 39` for internal single quotes was effective for getting the AppleScript parser to accept the command. Example:
```applescript
set sQuote to character id 39
set jsConnectText to "Connect"
set specificXPath to "//button[contains(., " & sQuote & jsConnectText & sQuote & ") and .//svg[.//polygon[@points=" & sQuote & "6 3 20 12 6 21 6 3" & sQuote & "]]]"
set jsCommand to "document.evaluate(" & sQuote & specificXPath & sQuote & ", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue.click();"
```
While this made the AppleScript runnable, this very specific XPath still didn't reliably trigger the connection.
* **Successful XPath:** The breakthrough came with a slightly less specific but more robust XPath: `//button[.//text()='Connect']`. This finds a button that *contains* a text node exactly matching "Connect".
* JavaScript: `document.evaluate("//button[.//text()='Connect']", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue.click();`
* AppleScript embedding (note `\"` for JS string quotes):
```applescript
set jsCommand to "document.evaluate(\"//button[.//text()='Connect']\", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue.click();"
do JavaScript jsCommand in front document
```
This method proved successful in clicking the button and establishing the connection.
* **`dispatchEvent(new MouseEvent('click', ...))`:** This was tried as an alternative to `.click()` but did not yield a different outcome for the "Connect" button in this specific scenario.
#### 3. JavaScript Construction and Execution in AppleScript
* **`do JavaScript "..."`:** This is the fundamental command.
* **String Literals and Escaping:**
* If the AppleScript command itself is enclosed in double quotes (`"..."`), then any literal double quotes *within the JavaScript code* must be escaped as `\\"`.
* Single quotes (`'`) within the JavaScript code usually do not need escaping in this context.
* Example: `do JavaScript "var el = document.getElementById(\"myId\"); el.value = 'Hello\';"`
* **Long/Multiline JavaScript:**
* Concatenating multiple AppleScript string literals using `&` (and optionally `¬` for line continuation) can build up a long JavaScript command. However, this can be fragile if not every part is perfectly quoted and escaped. Often, AppleScript parsing errors (`-2741`) occur before the JS is even attempted.
* For complex JS, it's often more robust to ensure the entire JavaScript code is a single, well-formed string literal from AppleScript's perspective. If the JS itself is very complex, pre-constructing parts of it in AppleScript variables (especially strings that need careful quoting, like XPaths) can help.
* **Returning Values:** The `do JavaScript` command returns the result of the last JavaScript statement executed. This can be invaluable for debugging, e.g., `return 'Found element';` or `return element !== null;`.
#### 4. Asynchronicity and Delays
* **Essential `delay` commands (Strategic vs. Tactical):**
* **Strategic Delay (Crucial):** A critical lesson was the necessity of a significant delay (e.g., ~5 seconds) *after* an external process like the MCP Inspector is launched (e.g., via `npx` in iTerm) and *before* Safari automation attempts to interact with its web UI. This allows the external process and its web server to fully initialize. Without this, Safari automation might target a page that isn't ready or fully functional, leading to failures.
* **Tactical Delays (Within Safari UI Automation - Often Avoidable):** Initially, small `delay` commands were used within Safari AppleScripts after actions like clicks or page loads (e.g., `delay 0.25`, `delay 1`). While these can sometimes help ensure the DOM is updated, the latest successful runs showed that if the backend/server (Inspector) is fully ready (due to the strategic delay), rapid Safari UI interactions (form filling, sequential clicks) can often be performed reliably *without* these internal micro-delays. Removing them can speed up the automation if the underlying application is responsive enough.
* **Context is Key:** The need for tactical delays depends on how quickly the web application updates its DOM and responds to JavaScript events. For the MCP Inspector, once it's running, its UI seems to respond quickly enough to handle a sequence of JavaScript commands without interspersed AppleScript delays, provided the commands themselves are valid and target the correct elements.
* **Checking for Results:** When verifying an action (e.g., checking if `document.body.innerText.includes('Connected')`), it's vital that this check happens *after* the action has had a chance to complete and the UI to reflect the change. If running without tactical delays, this check should still be performed after the relevant JavaScript action that's supposed to cause the change.
#### 5. MCP Inspector Specifics
* **URL Consistency:** The MCP Inspector URL (`http://127.0.0.1:6274`) was found to be consistent between runs, simplifying Safari targeting.
* **Server Logs in the Inspector UI:** It was confirmed that after the `macos-automator-mcp` server connects via the MCP Inspector, its startup and operational logs (e.g., `[macos_automator_server] [INFO] Starting...`) are displayed directly within the MCP Inspector's web interface in Safari. This is the primary place to check for these server-specific logs, rather than the iTerm console running the `npx @modelcontextprotocol/inspector` command (which shows the Inspector's own proxy/connection logs). The Safari UI shows "Connected" status, and the server logs within the UI provide detailed confirmation of the server's state.
#### 6. Automating iTerm via AppleScript and Advanced Timing Considerations
* **Full iTerm Automation via AppleScript:** Due to persistent issues with iTerm-specific MCP tools (e.g., `mcp_iterm_send_control_character`, `mcp_iterm_write_to_terminal` consistently failing with "Tool not found" errors), a robust AppleScript workaround was developed and successfully implemented to manage the iTerm portion of the MCP Inspector setup. This script handles:
* Activating iTerm.
* Ensuring a window is available.
* Sending a Control-C command to the current session using `System Events` (for reliability, targeting the iTerm process) to terminate any running commands.
* Writing the `npx @modelcontextprotocol/inspector` command to the iTerm session to start the inspector.
* The successful AppleScript structure is as follows (and now part of `mcp-inspector.mdc`):
```applescript
tell application "iTerm"
activate
if (count of windows) is 0 then
create window with default profile
delay 0.5 # Brief delay for window creation
end if
end tell
delay 0.2 # Ensure iTerm is frontmost
tell application "System Events"
# Note: 'iTerm' process name might need to be 'iTerm2' for iTerm3+.
tell process "iTerm"
keystroke "c" using control down
end tell
end tell
delay 0.2 # Pause after Ctrl-C
tell application "iTerm"
tell current window
tell current session
write text "npx @modelcontextprotocol/inspector"
end tell
end tell
end tell
```
* **iTerm Process Name in System Events:** When using `System Events` to control iTerm (e.g., for `keystroke`), the `tell process "iTerm"` command might need to be `tell process "iTerm2"` if using iTerm version 3 or later, as the application's registered process name can vary.
* **Reinforcing the Strategic Delay:** The success of running Safari UI automation steps *without* internal (tactical) delays is highly dependent on the *strategic* delay implemented *after* initiating the MCP Inspector in iTerm and *before* beginning any Safari interaction. A delay of approximately 5 seconds was found to be effective, allowing `npx` and the Inspector server to fully initialize. Attempting Safari automation too soon, especially without tactical delays, will likely result in failures as the web UI won't be ready or responsive.
#### 7. Interacting with Shadow DOM (Advanced)
* **Identifying Shadow DOM:** Some web UIs, including potentially parts of the MCP Inspector (especially complex, self-contained components like the tool details and results panels), may use Shadow DOM to encapsulate their structure and styles. Standard `document.querySelector` or `document.evaluate` calls from the main document context will *not* pierce these shadow boundaries.
* **Symptoms of Shadow DOM:** If `document.body.innerText` seems to miss details of an active UI component, or if standard selectors fail for visible elements that are clearly part of a specific component, Shadow DOM may be in use.
* **Accessing Elements within Shadow DOM (Conceptual JavaScript Approach):**
To interact with elements inside a shadow root, you first need a reference to the host element, then access its `shadowRoot` property, and then query within that root.
```javascript
// 1. Find the host element (custom element tag name, e.g., 'tool-details-panel')
const hostElement = document.querySelector('your-shadow-host-tag-name');
if (hostElement && hostElement.shadowRoot) {
const shadowRoot = hostElement.shadowRoot;
// 2. Query within the shadowRoot for target elements
const targetElementInShadow = shadowRoot.querySelector('#some-element-inside-shadow');
if (targetElementInShadow) {
// targetElementInShadow.click();
// return targetElementInShadow.textContent;
} else {
// return 'Element not found within shadowRoot';
}
} else {
// return 'Shadow host not found or no shadowRoot attached';
}
```
* **Recursive Deep Query Helper (Conceptual):** For nested shadow DOMs or when the exact host is unknown, a recursive or iterative deep query function can be useful. This function would traverse the DOM, checking each element for a `shadowRoot` and searching within it.
```javascript
function $deep(selector, rootNode = document) {
const stack = [rootNode];
while (stack.length) {
const currentNode = stack.shift();
if (currentNode.nodeType === Node.ELEMENT_NODE && currentNode.matches(selector)) {
return currentNode;
}
if (currentNode.shadowRoot) {
stack.push(currentNode.shadowRoot);
}
// Check children only if it's an Element or DocumentFragment (like a shadowRoot)
if (currentNode.nodeType === Node.ELEMENT_NODE || currentNode.nodeType === Node.DOCUMENT_FRAGMENT_NODE) {
if (currentNode.children) { // Ensure children property exists
stack.push(...currentNode.children);
}
}
}
return null;
}
// Usage: const someButton = $deep('button.some-class-in-shadow');
```
* **Challenges with AppleScript `do JavaScript`:**
* **Return Value Limitations:** Complex objects (like DOM elements) or very large strings (like extensive `outerHTML`) returned from `do JavaScript` can sometimes result in `missing value` or empty strings in AppleScript, making debugging difficult.
* **Debugging:** Direct console logging from `do JavaScript` is not visible to the AppleScript environment, complicating troubleshooting of JavaScript execution within Safari.
* **Reliability:** For highly dynamic UIs with extensive Shadow DOM, the AppleScript `do JavaScript` bridge may not always be reliable enough for complex, multi-step interactions, especially when precise timing or access to nuanced DOM states is required. Direct API/tool calls, if available, are often more robust for verification in such cases.
* **Discovering Shadow Host Tag Names:** If the specific tag name of a shadow host is unknown, one might attempt to list all elements that have a `shadowRoot`:
```javascript
// JavaScript to be executed via AppleScript to list shadow host tag names
// (Note: Return value handling by AppleScript needs to be robust, e.g., JSON stringify)
// let hosts = [...document.querySelectorAll('*')]\
// .filter(el => el.shadowRoot)\
// .map(el => el.tagName);\
// return JSON.stringify(hosts);\
```
However, successful execution and return of this data via AppleScript `do JavaScript` can be unreliable, as experienced in attempts to automate the MCP Inspector.
These notes capture the iterative process and key takeaways from the Safari automation for the MCP Inspector. The successful methods are now enshrined in `mcp-inspector.mdc`, while this document provides the background and context.
---
### Meta-Level Collaboration & Rule Evolution Notes
* **Rule Refinement for Readability (User Feedback):** Based on user feedback, the main operational rule file (`mcp-inspector.mdc`) was refactored to move lengthy scripts (like the Safari tab setup AppleScript) into an Appendix section (e.g., `[Setup Safari Tab for Inspector]`). This keeps the main flow of the rule concise and readable for both humans and models, while still providing the full implementation details in a structured way. The `safari.mdc` file is designated for the more verbose, evolutionary notes and debugging narratives.
* **Tool Usage Preferences (User Feedback):** User indicated a preference for using the `edit_file` tool for modifying rule files (like `.mdc` files) rather than `claude_code`. This allows the user to review the diff in their IDE before the change is effectively applied by the AI. This preference will be honored for future rule file modifications.

195
.cursor/safari.mdc Normal file
View File

@ -0,0 +1,195 @@
---
description:
globs:
alwaysApply: false
---
#### 5. MCP Inspector Specifics
* **URL Consistency:** The MCP Inspector URL (`http://127.0.0.1:6274`) was found to be consistent between runs, simplifying Safari targeting.
* **"Connected" State vs. iTerm Logs:** A key finding was that the Safari Inspector UI can show "Connected" (and tools subsequently work) even if detailed `DEBUG`-level logs from the launched server process (`start.sh` -> `node dist/server.js`) do not appear in the iTerm console where `npx @modelcontextprotocol/inspector` is running. The Inspector seems to show its own proxying/connection logs, but the full stdout/stderr of the child might not always be visible there. This means successful connection and tool usability are the primary indicators, and absence of detailed server logs in the iTerm console is not necessarily a showstopper for basic interaction, though it would affect deeper debugging of the server itself.
These notes capture the iterative process and key takeaways from the Safari automation for the MCP Inspector. The successful methods are now enshrined in `mcp-inspector.mdc`, while this document provides the background and context.
This contrasts with `mcp-inspector.mdc`, which is designed to be the concise, polished, and operational ruleset for future automated runs once a specific automation flow (like connecting to the MCP Inspector) has been stabilized and proven reliable. `mcp-inspector.mdc` should contain the 'final' working scripts and minimal necessary commentary, while `safari.mdc` is the space for the extended antechamber of discovery.
* **Clarification on `[WORKSPACE_PATH]` Resolution:** The placeholder `[WORKSPACE_PATH]` used in rules (e.g., for script paths like `[WORKSPACE_PATH]/start.sh`) must be dynamically replaced by the AI with the absolute path of the current project workspace. This path is typically available to the AI from its context (e.g., derived from `user_info.workspace_path` or a similar environment variable). It is crucial that the AI ensures the resolved path is correctly quoted if it's used in shell commands or script arguments, especially if the path might contain spaces or special characters. For instance, a path like `/Users/username/My Projects/project-name` should be passed as `'/Users/username/My Projects/project-name'` in a shell command.
---
### Strategies for Robust Element Selection
When automating UI interactions, the reliability of your scripts heavily depends on how you identify and select HTML elements. Here's a hierarchy of preferences and tips for making your selectors more robust:
1. **`data-testid` Attributes (Gold Standard):**
* **Why:** These are custom attributes specifically added for testing and automation. They are decoupled from styling and functional implementation details, making them the most resilient to UI changes.
* **Example (CSS):** `[data-testid='user-login-button']`
* **Example (XPath):** `//*[@data-testid='user-login-button']`
2. **Unique `id` Attributes:**
* **Why:** `id` attributes are *supposed* to be unique within a page. If developers adhere to this, they are very reliable.
* **Example (CSS):** `#submit-form`
* **Example (XPath):** `//*[@id='submit-form']`
3. **Stable `aria-label`, `aria-labelledby`, `role`, or other Accessibility Attributes:**
* **Why:** Accessibility attributes are often more stable than class names used for styling, as they relate to the element's function and purpose.
* **Example (CSS):** `button[aria-label='Open settings']`
* **Example (XPath):** `//button[@aria-label='Open settings']`
4. **Stable Class Names (Used for Structure/Function, Not Just Styling):**
* **Why:** Some class names indicate the structure or function of an element rather than just its appearance. These can be reasonably stable. Avoid classes that are purely presentational (e.g., `color-blue`, `margin-small`).
* **Example (CSS):** `.user-profile-card .username` (Contextual selection)
* **Example (XPath):** `//div[contains(@class, 'user-profile-card')]//span[contains(@class, 'username')]`
5. **Structural XPaths (Based on DOM hierarchy):**
* **Why:** Relying on the element's position within the DOM (e.g., "the second `div` inside a `section` with a specific header"). These are more brittle than attribute-based selectors because any structural change can break them. Use sparingly and keep them as simple as possible.
* **Example (XPath):** `//section[@id='main-content']/div[2]/p`
6. **Text-Based XPaths (Using visible text):**
* **Why:** Selecting elements based on their visible text content (e.g., a button with the text "Submit"). Can be useful, but prone to breakage if the text changes (e.g., for localization or wording updates).
* **Example (XPath):** `//button[text()='Submit']` or `//button[contains(text(), 'Submit')]`
* **Tip for Robustness:** Use XPath's `normalize-space()` function to handle variations in whitespace (leading, trailing, multiple internal spaces).
* `//button[normalize-space(text())='Submit']` (Matches " Submit ", "Submit", " Submit" etc.)
* `//a[contains(normalize-space(.), 'Learn More')]` (Checks within any descendant text nodes)
**General Tips for Selectors:**
* **Prefer CSS Selectors for Simplicity and Speed:** When applicable, CSS selectors are often more concise and can be faster than XPaths.
* **Use Browser Developer Tools:** Actively use the "Inspect Element" feature in your browser to test and refine your CSS selectors and XPaths. Most dev tools allow you to directly test them.
* **Avoid Generated IDs/Classes:** Be wary of IDs or class names that look auto-generated (e.g., `id="ext-gen1234"`), as these are likely to change between page loads or application versions.
* **Context is Key:** Instead of overly complex global selectors, try to select a stable parent element first, then find the target element within that parent's context. This often leads to simpler and more reliable selectors.
---
### Debugging AppleScript `do JavaScript` Execution Flow
Successfully executing JavaScript via AppleScript's `do JavaScript` command often involves navigating two potential layers of errors: AppleScript parsing errors and JavaScript runtime errors. Here's how to approach debugging:
**1. Differentiating Error Types:**
* **AppleScript Compile-Time/Parsing Errors (e.g., `-2741`):**
* **Symptom:** The AppleScript editor shows an error, or the script fails immediately when run, often with error messages like "Syntax Error," "Expected end of line but found...", or specific error codes like `-2741` (which typically means the command couldn't be parsed correctly, often due to malformed strings or incorrect quoting).
* **Cause:** The AppleScript interpreter itself cannot understand the structure of your `do JavaScript "..."` command, usually due to incorrect quoting or escaping of characters *within the AppleScript string that defines the JavaScript code*.
* **The JavaScript code itself hasn't even been sent to Safari yet.**
* **JavaScript Runtime Errors:**
* **Symptom:** The AppleScript command runs without an immediate AppleScript error, but the desired action doesn't occur in Safari, or `do JavaScript` returns an error message from the JavaScript engine (e.g., "TypeError: null is not an object" or "SyntaxError: Unexpected identifier").
* **Cause:** The JavaScript code was successfully passed to Safari, but the JavaScript engine encountered an error while trying to execute it (e.g., trying to access a property of a non-existent element, incorrect JS syntax, etc.).
**2. Debugging AppleScript Syntax/Parsing Errors:**
* **Simplify the JavaScript String:** Start with the simplest possible JavaScript that should work, e.g.:
```applescript
tell application "Safari"
do JavaScript "'test';" in front document
end tell
```
* **Log the Constructed JavaScript String:** Before the `do JavaScript` line, use AppleScript's `log` command to print the exact JavaScript string you are about to send. This helps you visually inspect it for quoting issues.
```applescript
set jsCommand to "document.getElementById(\"myButton\").click();"
log jsCommand
tell application "Safari"
do JavaScript jsCommand in front document
end tell
```
Check the logged output carefully in Script Editor's "Messages" tab.
* **Build Complex Strings Incrementally:** If your JavaScript is complex, build it in parts using AppleScript variables. This can make it easier to manage quoting for each part.
* **Master Quoting:**
* If AppleScript string is in double quotes (`"..."`): Escape internal JS double quotes as `\"`. JS single quotes usually don't need escaping.
* Use `character id 39` for single quotes if constructing JS with many internal single quotes to avoid confusion: `set sQuote to character id 39`. `set jsCommand to "var name = " & sQuote & "Pete" & sQuote & ";"`
**3. Debugging JavaScript Runtime Errors:**
* **Test in Safari's Web Inspector Console:** The most effective way to debug the JavaScript itself is to open Safari, navigate to the target page, open the Web Inspector (Develop > Show Web Inspector), and paste your JavaScript snippet directly into the Console. This provides immediate feedback, error messages, and allows for interactive debugging.
* **Use `try...catch` in Your JavaScript:** Wrap your JavaScript code in a `try...catch` block to capture and return error messages back to AppleScript. This can make it much easier to see what went wrong inside Safari.
```applescript
set jsCommand to "try { document.getElementById('nonExistentElement').value = 'test'; return 'Success'; } catch(e) { return 'JS Error: ' + e.name + ': ' + e.message; }"
tell application "Safari"
set jsResult to do JavaScript jsCommand in front document
log jsResult
end tell
```
* **Return Values for Debugging:** Have your JavaScript return intermediate values or status indicators to AppleScript to understand its state.
```applescript
set jsCommand to "var el = document.getElementById('myField'); if (el) { return 'Element found!'; } else { return 'Element NOT found.'; }"
log (do JavaScript jsCommand in front document)
```
By systematically checking for AppleScript parsing issues first, then moving to debug the JavaScript logic within Safari's environment, you can effectively troubleshoot `do JavaScript` commands.
---
### Advanced Asynchronous Handling: Polling for Conditions
Web pages load and update content asynchronously. Relying on fixed `delay` commands in AppleScript after an action (like a click or page navigation) can be unreliable because the actual time needed for the UI to update can vary due to network speed, server load, or client-side processing.
A more robust approach is to actively poll for a specific condition to be met (e.g., an element appearing, text changing, a certain JavaScript variable becoming true) before proceeding. This makes your scripts more resilient to timing variations.
**How Polling Works:**
1. Define the JavaScript code that checks for your desired condition (this should return `true` or `false`).
2. In AppleScript, create a loop that:
* Executes the JavaScript check.
* If the condition is met, exit the loop.
* If not, wait for a short interval (e.g., 0.5 seconds).
* Include a counter or timeout mechanism to prevent the loop from running indefinitely if the condition is never met.
**Example: Polling for 'Connected' Status in MCP Inspector**
This AppleScript snippet demonstrates polling for the text "Connected" to appear on the page after clicking the connect button:
```applescript
-- JavaScript to check if the page body contains the text "Connected"
set jsCheckConnected to "document.body.innerText.includes('Connected');"
set isNowConnected to false
set attempts to 0
set maxAttempts to 20 -- Set a reasonable limit, e.g., 20 attempts
set pollInterval to 0.5 -- Wait 0.5 seconds between attempts
log "Polling for 'Connected' status..."
tell application "Safari"
tell front document
repeat while isNowConnected is false and attempts < maxAttempts
try
if (do JavaScript jsCheckConnected) is true then
set isNowConnected to true
log "Status changed to 'Connected' after " & (attempts + 1) & " attempts."
else
delay pollInterval
end if
on error errMsg number errNum
log "Error during JavaScript check (attempt " & (attempts + 1) & "): " & errMsg & " (Number: " & errNum & ")"
-- Decide if you want to stop on error or just log and continue
delay pollInterval -- Still delay even if JS itself errored, maybe it's a temporary issue
end try
set attempts to attempts + 1
end repeat
end tell
end tell
if isNowConnected then
log "Successfully confirmed 'Connected' status via polling."
-- Proceed with next actions that depend on being connected
else
log "Failed to see 'Connected' status within " & (maxAttempts * pollInterval) & " seconds."
-- Handle the failure case (e.g., log error, stop script)
end if
```
**Benefits of Polling:**
* **Increased Reliability:** Scripts wait only as long as necessary, adapting to real-time conditions rather than fixed, potentially too short or too long, delays.
* **Reduced Brittleness:** Less likely to fail due to unexpected slowdowns.
* **Clearer Intent:** The script explicitly states what condition it's waiting for.
**Considerations:**
* **Timeout:** Always implement a maximum number of attempts or a total timeout to prevent infinite loops if the condition never occurs.
* **Poll Interval:** Choose a reasonable interval. Too short can be resource-intensive; too long can make the script feel sluggish.
* **Error Handling:** Include `try...on error` blocks within your loop to gracefully handle potential errors during the JavaScript execution (e.g., if the page is still transitioning and elements are not yet available).
---
### Meta-Level Collaboration & Rule Evolution Notes

774
.cursor/scripts/terminator.scpt Executable file
View File

@ -0,0 +1,774 @@
#!/usr/bin/osascript
--------------------------------------------------------------------------------
-- terminator.scpt - v0.6.1 Enhanced "T-1000"
-- Enhanced Terminal session management with smart session reuse and better error reporting
-- Features: Smart session reuse, enhanced error reporting, improved timing, better output formatting
--------------------------------------------------------------------------------
--#region Configuration Properties
property maxCommandWaitTime : 15.0 -- Increased from 10.0 for better reliability
property pollIntervalForBusyCheck : 0.1
property startupDelayForTerminal : 0.7
property minTailLinesOnWrite : 100 -- Increased from 15 for better build log visibility
property defaultTailLines : 100 -- Increased from 30 for better build log visibility
property tabTitlePrefix : "🤖💥 " -- For the window/tab title itself
property scriptInfoPrefix : "Terminator 🤖💥: " -- For messages generated by this script
property projectIdentifierInTitle : "Project: "
property taskIdentifierInTitle : " - Task: "
property enableFuzzyTagGrouping : true
property fuzzyGroupingMinPrefixLength : 4
-- Safe enhanced properties (minimal additions)
property enhancedErrorReporting : true
property verboseLogging : false
--#endregion Configuration Properties
--#region Helper Functions
on isValidPath(thePath)
if thePath is not "" and (thePath starts with "/") then
if not (thePath contains " -") then -- Basic heuristic
return true
end if
end if
return false
end isValidPath
on getPathComponent(thePath, componentIndex)
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to "/"
set pathParts to text items of thePath
set AppleScript's text item delimiters to oldDelims
set nonEmptyParts to {}
repeat with aPart in pathParts
if aPart is not "" then set end of nonEmptyParts to aPart
end repeat
if (count nonEmptyParts) = 0 then return ""
try
if componentIndex is -1 then
return item -1 of nonEmptyParts
else if componentIndex > 0 and componentIndex ≤ (count nonEmptyParts) then
return item componentIndex of nonEmptyParts
end if
on error
return ""
end try
return ""
end getPathComponent
on generateWindowTitle(taskTag as text, projectGroup as text)
if projectGroup is not "" then
return tabTitlePrefix & projectIdentifierInTitle & projectGroup & taskIdentifierInTitle & taskTag
else
return tabTitlePrefix & taskTag
end if
end generateWindowTitle
on bufferContainsMeaningfulContentAS(multiLineText, knownInfoPrefix as text, commonShellPrompts as list)
if multiLineText is "" then return false
-- Simple approach: if the trimmed content is substantial and not just our info messages, consider it meaningful
set trimmedText to my trimWhitespace(multiLineText)
if (length of trimmedText) < 3 then return false
-- Check if it's only our script info messages
if trimmedText starts with knownInfoPrefix then
-- If it's ONLY our message and nothing else meaningful, return false
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to linefeed
set textLines to text items of multiLineText
set AppleScript's text item delimiters to oldDelims
set nonInfoLines to 0
repeat with aLine in textLines
set currentLine to my trimWhitespace(aLine as text)
if currentLine is not "" and not (currentLine starts with knownInfoPrefix) then
set nonInfoLines to nonInfoLines + 1
end if
end repeat
-- If we have substantial non-info content, consider it meaningful
return (nonInfoLines > 2)
end if
-- If content doesn't start with our info prefix, likely contains command output
return true
end bufferContainsMeaningfulContentAS
-- Enhanced error reporting helper
on formatErrorMessage(errorType, errorMsg, context)
if enhancedErrorReporting then
set formattedMsg to scriptInfoPrefix & errorType & ": " & errorMsg
if context is not "" then
set formattedMsg to formattedMsg & " (Context: " & context & ")"
end if
return formattedMsg
else
return scriptInfoPrefix & errorMsg
end if
end formatErrorMessage
-- Enhanced logging helper
on logVerbose(message)
if verboseLogging then
log "🔍 " & message
end if
end logVerbose
--#endregion Helper Functions
--#region Main Script Logic (on run)
on run argv
set appSpecificErrorOccurred to false
try
my logVerbose("Starting Terminator v0.6.0 Safe Enhanced")
tell application "System Events"
if not (exists process "Terminal") then
launch application id "com.apple.Terminal"
delay startupDelayForTerminal
end if
end tell
set originalArgCount to count argv
if originalArgCount < 1 then return my usageText()
set projectPathArg to ""
set actualArgsForParsing to argv
if originalArgCount > 0 then
set potentialPath to item 1 of argv
if my isValidPath(potentialPath) then
set projectPathArg to potentialPath
my logVerbose("Detected project path: " & projectPathArg)
if originalArgCount > 1 then
set actualArgsForParsing to items 2 thru -1 of argv
else
return my formatErrorMessage("Argument Error", "Project path \"" & projectPathArg & "\" provided, but no task tag or command specified." & linefeed & linefeed & my usageText(), "")
end if
end if
end if
if (count actualArgsForParsing) < 1 then return my usageText()
set taskTagName to item 1 of actualArgsForParsing
my logVerbose("Task tag: " & taskTagName)
if (length of taskTagName) > 40 or (not my tagOK(taskTagName)) then
set errorMsg to "Task Tag missing or invalid: \"" & taskTagName & "\"." & linefeed & linefeed & ¬
"A 'task tag' (e.g., 'build', 'tests') is a short name (1-40 letters, digits, -, _) " & ¬
"to identify a specific task, optionally within a project session." & linefeed & linefeed
return my formatErrorMessage("Validation Error", errorMsg & my usageText(), "tag validation")
end if
set doWrite to false
set shellCmd to ""
set originalUserShellCmd to ""
set currentTailLines to defaultTailLines
set explicitLinesProvided to false
set argCountAfterTagOrPath to count actualArgsForParsing
if argCountAfterTagOrPath > 1 then
set commandParts to items 2 thru -1 of actualArgsForParsing
if (count commandParts) > 0 then
set lastOfCmdParts to item -1 of commandParts
if my isInteger(lastOfCmdParts) then
set currentTailLines to (lastOfCmdParts as integer)
set explicitLinesProvided to true
my logVerbose("Explicit lines requested: " & currentTailLines)
if (count commandParts) > 1 then
set commandParts to items 1 thru -2 of commandParts
else
set commandParts to {}
end if
end if
end if
if (count commandParts) > 0 then
set originalUserShellCmd to my joinList(commandParts, " ")
my logVerbose("Command detected: " & originalUserShellCmd)
end if
else if argCountAfterTagOrPath = 1 then
-- Only taskTagName was provided after potential projectPathArg
-- This is a read operation by default.
my logVerbose("Read-only operation detected")
end if
if originalUserShellCmd is not "" and (my trimWhitespace(originalUserShellCmd) is not "") then
set doWrite to true
set shellCmd to originalUserShellCmd
else if projectPathArg is not "" and originalUserShellCmd is "" then
-- Path provided, task tag, and empty command string "" OR no command string but lines_to_read was there
set doWrite to true
set shellCmd to "" -- will become 'cd path'
my logVerbose("CD-only operation for path: " & projectPathArg)
else
set doWrite to false
set shellCmd to ""
end if
if currentTailLines < 1 then set currentTailLines to 1
if doWrite and (shellCmd is not "" or projectPathArg is not "") and currentTailLines < minTailLinesOnWrite then
set currentTailLines to minTailLinesOnWrite
my logVerbose("Increased tail lines for write operation: " & currentTailLines)
end if
if projectPathArg is not "" and doWrite then
set quotedProjectPath to quoted form of projectPathArg
if shellCmd is not "" then
set shellCmd to "cd " & quotedProjectPath & " && " & shellCmd
else
set shellCmd to "cd " & quotedProjectPath
end if
my logVerbose("Final command: " & shellCmd)
end if
set derivedProjectGroup to ""
if projectPathArg is not "" then
set derivedProjectGroup to my getPathComponent(projectPathArg, -1)
if derivedProjectGroup is "" then set derivedProjectGroup to "DefaultProject"
my logVerbose("Project group: " & derivedProjectGroup)
end if
set allowCreation to false
if doWrite then
set allowCreation to true
else if explicitLinesProvided then
set allowCreation to true
end if
set effectiveTabTitleForLookup to my generateWindowTitle(taskTagName, derivedProjectGroup)
my logVerbose("Tab title: " & effectiveTabTitleForLookup)
set tabInfo to my ensureTabAndWindow(taskTagName, derivedProjectGroup, allowCreation, effectiveTabTitleForLookup)
if tabInfo is missing value then
if not allowCreation then
set errorMsg to "Terminal session \"" & effectiveTabTitleForLookup & "\" not found." & linefeed & ¬
"To create this session, provide a command (even an empty string \"\" if only 'cd'-ing to a project path), " & ¬
"or specify lines to read (e.g., ... \"" & taskTagName & "\" 1)." & linefeed
if projectPathArg is not "" then
set errorMsg to errorMsg & "Project path was specified as: \"" & projectPathArg & "\"." & linefeed
else
set errorMsg to errorMsg & "If this is for a new project, provide the absolute project path as the first argument." & linefeed
end if
return my formatErrorMessage("Session Error", errorMsg & linefeed & my usageText(), "session lookup")
else
return my formatErrorMessage("Creation Error", "Could not find or create Terminal tab for \"" & effectiveTabTitleForLookup & "\". Check permissions/Terminal state.", "tab creation")
end if
end if
set targetTab to targetTab of tabInfo
set parentWindow to parentWindow of tabInfo
set wasNewlyCreated to wasNewlyCreated of tabInfo
set createdInExistingViaFuzzy to createdInExistingWindowViaFuzzy of tabInfo
my logVerbose("Tab info - new: " & wasNewlyCreated & ", fuzzy: " & createdInExistingViaFuzzy)
set bufferText to ""
set commandTimedOut to false
set tabWasBusyOnRead to false
set previousCommandActuallyStopped to true
set attemptMadeToStopPreviousCommand to false
set identifiedBusyProcessName to ""
set theTTYForInfo to ""
if not doWrite and wasNewlyCreated then
if createdInExistingViaFuzzy then
return scriptInfoPrefix & "New tab \"" & effectiveTabTitleForLookup & "\" created in existing project window and ready."
else
return scriptInfoPrefix & "New tab \"" & effectiveTabTitleForLookup & "\" (in new window) created and ready."
end if
end if
tell application id "com.apple.Terminal"
try
set index of parentWindow to 1
set selected tab of parentWindow to targetTab
if wasNewlyCreated and doWrite then
delay 0.4
else
delay 0.1
end if
if doWrite and shellCmd is not "" then
my logVerbose("Executing command: " & shellCmd)
set canProceedWithWrite to true
if busy of targetTab then
if not wasNewlyCreated or createdInExistingViaFuzzy then
set attemptMadeToStopPreviousCommand to true
set previousCommandActuallyStopped to false
try
set theTTYForInfo to my trimWhitespace(tty of targetTab)
end try
set processesBefore to {}
try
set processesBefore to processes of targetTab
end try
set commonShells to {"login", "bash", "zsh", "sh", "tcsh", "ksh", "-bash", "-zsh", "-sh", "-tcsh", "-ksh", "dtterm", "fish"}
set identifiedBusyProcessName to ""
if (count of processesBefore) > 0 then
repeat with i from (count of processesBefore) to 1 by -1
set aProcessName to item i of processesBefore
if aProcessName is not in commonShells then
set identifiedBusyProcessName to aProcessName
exit repeat
end if
end repeat
end if
my logVerbose("Busy process identified: " & identifiedBusyProcessName)
set processToTargetForKill to identifiedBusyProcessName
set killedViaPID to false
if theTTYForInfo is not "" and processToTargetForKill is not "" then
set shortTTY to text 6 thru -1 of theTTYForInfo
set pidsToKillText to ""
try
set psCommand to "ps -t " & shortTTY & " -o pid,comm | awk '$2 == \"" & processToTargetForKill & "\" {print $1}'"
set pidsToKillText to do shell script psCommand
end try
if pidsToKillText is not "" then
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to linefeed
set pidList to text items of pidsToKillText
set AppleScript's text item delimiters to oldDelims
repeat with aPID in pidList
set aPID to my trimWhitespace(aPID)
if aPID is not "" then
try
do shell script "kill -INT " & aPID
delay 0.3
do shell script "kill -0 " & aPID
try
do shell script "kill -KILL " & aPID
delay 0.2
try
do shell script "kill -0 " & aPID
on error
set previousCommandActuallyStopped to true
end try
end try
on error
set previousCommandActuallyStopped to true
end try
end if
if previousCommandActuallyStopped then
set killedViaPID to true
exit repeat
end if
end repeat
end if
end if
if not previousCommandActuallyStopped and busy of targetTab then
activate
delay 0.5
tell application "System Events" to keystroke "c" using control down
delay 0.6
if not (busy of targetTab) then
set previousCommandActuallyStopped to true
if identifiedBusyProcessName is not "" and (identifiedBusyProcessName is in (processes of targetTab)) then
set previousCommandActuallyStopped to false
end if
end if
else if not busy of targetTab then
set previousCommandActuallyStopped to true
end if
if not previousCommandActuallyStopped then
set canProceedWithWrite to false
end if
else if wasNewlyCreated and not createdInExistingViaFuzzy and busy of targetTab then
delay 0.4
if busy of targetTab then
set attemptMadeToStopPreviousCommand to true
set previousCommandActuallyStopped to false
set identifiedBusyProcessName to "extended initialization"
set canProceedWithWrite to false
else
set previousCommandActuallyStopped to true
end if
end if
end if
if canProceedWithWrite then
-- Clear before write to prevent output truncation (only for reused tabs)
if not wasNewlyCreated then
do script "clear" in targetTab
delay 0.1
end if
do script shellCmd in targetTab
set commandStartTime to current date
set commandFinished to false
repeat while ((current date) - commandStartTime) < maxCommandWaitTime
if not (busy of targetTab) then
set commandFinished to true
exit repeat
end if
delay pollIntervalForBusyCheck
end repeat
if not commandFinished then set commandTimedOut to true
if commandFinished then delay 0.2 -- Increased from 0.1 for better output settling
my logVerbose("Command execution completed, timeout: " & commandTimedOut)
end if
else if not doWrite then
if busy of targetTab then
set tabWasBusyOnRead to true
try
set theTTYForInfo to my trimWhitespace(tty of targetTab)
end try
set processesReading to processes of targetTab
set commonShells to {"login", "bash", "zsh", "sh", "tcsh", "ksh", "-bash", "-zsh", "-sh", "-tcsh", "-ksh", "dtterm", "fish"}
set identifiedBusyProcessName to ""
if (count of processesReading) > 0 then
repeat with i from (count of processesReading) to 1 by -1
set aProcessName to item i of processesReading
if aProcessName is not in commonShells then
set identifiedBusyProcessName to aProcessName
exit repeat
end if
end repeat
end if
my logVerbose("Tab busy during read with: " & identifiedBusyProcessName)
end if
end if
set bufferText to history of targetTab
on error errMsg number errNum
set appSpecificErrorOccurred to true
return my formatErrorMessage("Terminal Error", errMsg, "error " & errNum)
end try
end tell
set appendedMessage to ""
set ttyInfoStringForMessage to ""
if theTTYForInfo is not "" then set ttyInfoStringForMessage to " (TTY " & theTTYForInfo & ")"
if attemptMadeToStopPreviousCommand then
set processNameToReport to "process"
if identifiedBusyProcessName is not "" and identifiedBusyProcessName is not "extended initialization" then
set processNameToReport to "'" & identifiedBusyProcessName & "'"
else if identifiedBusyProcessName is "extended initialization" then
set processNameToReport to "tab's extended initialization"
end if
if previousCommandActuallyStopped then
set appendedMessage to linefeed & scriptInfoPrefix & "Previous " & processNameToReport & ttyInfoStringForMessage & " was interrupted. ---"
else
set appendedMessage to linefeed & scriptInfoPrefix & "Attempted to interrupt previous " & processNameToReport & ttyInfoStringForMessage & ", but it may still be running. New command NOT executed. ---"
end if
end if
if commandTimedOut then
set cmdForMsg to originalUserShellCmd
if projectPathArg is not "" and originalUserShellCmd is not "" then set cmdForMsg to originalUserShellCmd & " (in " & projectPathArg & ")"
if projectPathArg is not "" and originalUserShellCmd is "" then set cmdForMsg to "(cd " & projectPathArg & ")"
set appendedMessage to appendedMessage & linefeed & scriptInfoPrefix & "Command '" & cmdForMsg & "' may still be running. Returned after " & maxCommandWaitTime & "s timeout. ---"
else if tabWasBusyOnRead then
set processNameToReportOnRead to "process"
if identifiedBusyProcessName is not "" then set processNameToReportOnRead to "'" & identifiedBusyProcessName & "'"
set busyProcessInfoString to ""
if identifiedBusyProcessName is not "" then set busyProcessInfoString to " with " & processNameToReportOnRead
set appendedMessage to appendedMessage & linefeed & scriptInfoPrefix & "Tab" & ttyInfoStringForMessage & " was busy" & busyProcessInfoString & " during read. Output may be from an ongoing process. ---"
end if
if appendedMessage is not "" then
if bufferText is "" then
set bufferText to my trimWhitespace(appendedMessage)
else
set bufferText to bufferText & appendedMessage
end if
end if
set tailedOutput to my tailBufferAS(bufferText, currentTailLines)
set finalResult to my trimBlankLinesAS(tailedOutput)
if finalResult is "" then
set effectiveOriginalCmdForMsg to originalUserShellCmd
if projectPathArg is not "" and originalUserShellCmd is "" then
set effectiveOriginalCmdForMsg to "(cd " & projectPathArg & ")"
else if projectPathArg is not "" and originalUserShellCmd is not "" then
set effectiveOriginalCmdForMsg to originalUserShellCmd & " (in " & projectPathArg & ")"
end if
set baseMsgInfo to "Session \"" & effectiveTabTitleForLookup & "\", requested " & currentTailLines & " lines."
set specificAppendedInfo to my trimWhitespace(appendedMessage)
set suffixForReturn to ""
if specificAppendedInfo is not "" then set suffixForReturn to linefeed & specificAppendedInfo
if attemptMadeToStopPreviousCommand and not previousCommandActuallyStopped then
return my formatErrorMessage("Process Error", "Previous command/initialization in session \"" & effectiveTabTitleForLookup & "\"" & ttyInfoStringForMessage & " may not have terminated. New command '" & effectiveOriginalCmdForMsg & "' NOT executed." & suffixForReturn, "process termination")
else if commandTimedOut then
return my formatErrorMessage("Timeout Error", "Command '" & effectiveOriginalCmdForMsg & "' timed out after " & maxCommandWaitTime & "s. No other output. " & baseMsgInfo & suffixForReturn, "command timeout")
else if tabWasBusyOnRead then
return my formatErrorMessage("Busy Error", "Tab for session \"" & effectiveTabTitleForLookup & "\" was busy during read. No other output. " & baseMsgInfo & suffixForReturn, "read busy")
else if doWrite and shellCmd is not "" then
return scriptInfoPrefix & "Command '" & effectiveOriginalCmdForMsg & "' executed in session \"" & effectiveTabTitleForLookup & "\". No output captured."
else
return scriptInfoPrefix & "No meaningful content found in session \"" & effectiveTabTitleForLookup & "\"."
end if
end if
my logVerbose("Returning " & (length of finalResult) & " characters of output")
return finalResult
on error generalErrorMsg number generalErrorNum
if appSpecificErrorOccurred then error generalErrorMsg number generalErrorNum
return my formatErrorMessage("Execution Error", generalErrorMsg, "error " & generalErrorNum)
end try
end run
--#endregion Main Script Logic (on run)
--#region Helper Functions
on ensureTabAndWindow(taskTagName as text, projectGroupName as text, allowCreate as boolean, desiredFullTitle as text)
set wasActuallyCreated to false
set createdInExistingViaFuzzy to false
tell application id "com.apple.Terminal"
try
repeat with w in windows
repeat with tb in tabs of w
try
if custom title of tb is desiredFullTitle then
set selected tab of w to tb
return {targetTab:tb, parentWindow:w, wasNewlyCreated:false, createdInExistingWindowViaFuzzy:false}
end if
end try
end repeat
end repeat
end try
if allowCreate and enableFuzzyTagGrouping and projectGroupName is not "" then
set projectGroupSearchPatternForWindowName to tabTitlePrefix & projectIdentifierInTitle & projectGroupName
try
repeat with w in windows
try
-- Look for any window that contains our project name
if name of w contains projectGroupSearchPatternForWindowName or name of w contains (projectIdentifierInTitle & projectGroupName) then
if not frontmost then activate
delay 0.2
set newTabInGroup to do script "" in w
delay 0.3
set custom title of newTabInGroup to desiredFullTitle
delay 0.2
set selected tab of w to newTabInGroup
return {targetTab:newTabInGroup, parentWindow:w, wasNewlyCreated:true, createdInExistingWindowViaFuzzy:true}
end if
end try
end repeat
end try
end if
-- Enhanced fallback: if no project-specific window found, try to use any existing Terminator window
if allowCreate and enableFuzzyTagGrouping then
try
repeat with w in windows
try
if name of w contains tabTitlePrefix then
-- Found an existing Terminator window, use it for grouping
if not frontmost then activate
delay 0.2
set newTabInGroup to do script "" in w
delay 0.3
set custom title of newTabInGroup to desiredFullTitle
delay 0.2
set selected tab of w to newTabInGroup
return {targetTab:newTabInGroup, parentWindow:w, wasNewlyCreated:true, createdInExistingWindowViaFuzzy:true}
end if
end try
end repeat
end try
end if
if allowCreate then
try
if not frontmost then activate
delay 0.3
set newTabInNewWindow to do script ""
set wasActuallyCreated to true
delay 0.4
set custom title of newTabInNewWindow to desiredFullTitle
delay 0.2
set parentWinOfNew to missing value
try
set parentWinOfNew to window of newTabInNewWindow
on error
if (count of windows) > 0 then set parentWinOfNew to front window
end try
if parentWinOfNew is not missing value then
if custom title of newTabInNewWindow is desiredFullTitle then
set selected tab of parentWinOfNew to newTabInNewWindow
return {targetTab:newTabInNewWindow, parentWindow:parentWinOfNew, wasNewlyCreated:wasActuallyCreated, createdInExistingWindowViaFuzzy:false}
end if
end if
repeat with w_final_scan in windows
repeat with tb_final_scan in tabs of w_final_scan
try
if custom title of tb_final_scan is desiredFullTitle then
set selected tab of w_final_scan to tb_final_scan
return {targetTab:tb_final_scan, parentWindow:w_final_scan, wasNewlyCreated:wasActuallyCreated, createdInExistingWindowViaFuzzy:false}
end if
end try
end repeat
end repeat
return missing value
on error
return missing value
end try
else
return missing value
end if
end tell
end ensureTabAndWindow
on tailBufferAS(txt, n)
set AppleScript's text item delimiters to linefeed
set lst to text items of txt
if (count lst) = 0 then return ""
set startN to (count lst) - (n - 1)
if startN < 1 then set startN to 1
set slice to items startN thru -1 of lst
set outText to slice as text
set AppleScript's text item delimiters to ""
return outText
end tailBufferAS
on lineIsEffectivelyEmptyAS(aLine)
if aLine is "" then return true
set trimmedLine to my trimWhitespace(aLine)
return (trimmedLine is "")
end lineIsEffectivelyEmptyAS
on trimBlankLinesAS(txt)
if txt is "" then return ""
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to {linefeed}
set originalLines to text items of txt
set linesToProcess to {}
repeat with aLineRef in originalLines
set aLine to contents of aLineRef
if my lineIsEffectivelyEmptyAS(aLine) then
set end of linesToProcess to ""
else
set end of linesToProcess to aLine
end if
end repeat
set firstContentLine to 1
repeat while firstContentLine ≤ (count linesToProcess) and (item firstContentLine of linesToProcess is "")
set firstContentLine to firstContentLine + 1
end repeat
set lastContentLine to count linesToProcess
repeat while lastContentLine ≥ firstContentLine and (item lastContentLine of linesToProcess is "")
set lastContentLine to lastContentLine - 1
end repeat
if firstContentLine > lastContentLine then
set AppleScript's text item delimiters to oldDelims
return ""
end if
set resultLines to items firstContentLine thru lastContentLine of linesToProcess
set AppleScript's text item delimiters to linefeed
set trimmedTxt to resultLines as text
set AppleScript's text item delimiters to oldDelims
return trimmedTxt
end trimBlankLinesAS
on trimWhitespace(theText)
set whitespaceChars to {" ", tab}
set newText to theText
repeat while (newText is not "") and (character 1 of newText is in whitespaceChars)
if (length of newText) > 1 then
set newText to text 2 thru -1 of newText
else
set newText to ""
end if
end repeat
repeat while (newText is not "") and (character -1 of newText is in whitespaceChars)
if (length of newText) > 1 then
set newText to text 1 thru -2 of newText
else
set newText to ""
end if
end repeat
return newText
end trimWhitespace
on isInteger(v)
try
v as integer
return true
on error
return false
end try
end isInteger
on tagOK(t)
try
do shell script "/bin/echo " & quoted form of t & " | /usr/bin/grep -E -q '^[A-Za-z0-9_-]+$'"
return true
on error
return false
end try
end tagOK
on joinList(theList, theDelimiter)
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to theDelimiter
set theText to theList as text
set AppleScript's text item delimiters to oldDelims
return theText
end joinList
on usageText()
set LF to linefeed
set scriptName to "terminator.scpt"
set exampleProject to "/Users/name/Projects/FancyApp"
set exampleProjectNameForTitle to my getPathComponent(exampleProject, -1)
if exampleProjectNameForTitle is "" then set exampleProjectNameForTitle to "DefaultProject"
set exampleTaskTag to "build_frontend"
set exampleFullCommand to "npm run build"
set generatedExampleTitle to my generateWindowTitle(exampleTaskTag, exampleProjectNameForTitle)
set outText to scriptName & " - v0.6.0 Enhanced \"T-1000\" AppleScript Terminal helper" & LF & LF
set outText to outText & "Enhancements: Smart session reuse, enhanced error reporting, verbose logging (optional)" & LF & LF
set outText to outText & "Manages dedicated, tagged Terminal sessions, grouped by project path." & LF & LF
set outText to outText & "Core Concept:" & LF
set outText to outText & " 1. For a NEW project, provide the absolute project path FIRST, then task tag, then command:" & LF
set outText to outText & " osascript " & scriptName & " \"" & exampleProject & "\" \"" & exampleTaskTag & "\" \"" & exampleFullCommand & "\"" & LF
set outText to outText & " The script will 'cd' into the project path and run the command." & LF
set outText to outText & " The tab will be titled like: \"" & generatedExampleTitle & "\"" & LF
set outText to outText & " 2. For SUBSEQUENT commands for THE SAME PROJECT, use the project path and task tag:" & LF
set outText to outText & " osascript " & scriptName & " \"" & exampleProject & "\" \"" & exampleTaskTag & "\" \"another_command\"" & LF
set outText to outText & " 3. To simply READ from an existing session (path & tag must identify an existing session):" & LF
set outText to outText & " osascript " & scriptName & " \"" & exampleProject & "\" \"" & exampleTaskTag & "\"" & LF
set outText to outText & " A READ operation on a non-existent tag (without path/command to create) will error." & LF & LF
set outText to outText & "Title Format: \"" & tabTitlePrefix & projectIdentifierInTitle & "<ProjectName>" & taskIdentifierInTitle & "<TaskTag>\"" & LF
set outText to outText & "Or if no project path provided: \"" & tabTitlePrefix & "<TaskTag>\"" & LF & LF
set outText to outText & "Enhanced Features:" & LF
set outText to outText & " • Smart session reuse for same project paths" & LF
set outText to outText & " • Enhanced error reporting with context information" & LF
set outText to outText & " • Optional verbose logging for debugging" & LF
set outText to outText & " • No automatic clearing to prevent interrupting builds" & LF
set outText to outText & " • 100-line default output for better build log visibility" & LF
set outText to outText & " • Automatically 'cd's into project path if provided with a command." & LF
set outText to outText & " • Groups new task tabs into existing project windows if fuzzy grouping enabled." & LF
set outText to outText & " • Interrupts busy processes in reused tabs." & LF & LF
set outText to outText & "Usage Examples:" & LF
set outText to outText & " # Start new project session, cd, run command, get 100 lines:" & LF
set outText to outText & " osascript " & scriptName & " \"" & exampleProject & "\" \"frontend_build\" \"npm run build\" 100" & LF
set outText to outText & " # Create/use 'backend_tests' task tab in the 'FancyApp' project window:" & LF
set outText to outText & " osascript " & scriptName & " \"" & exampleProject & "\" \"backend_tests\" \"pytest\"" & LF
set outText to outText & " # Prepare/create a new session by just cd'ing into project path (empty command):" & LF
set outText to outText & " osascript " & scriptName & " \"" & exampleProject & "\" \"dev_shell\" \"\" 1" & LF
set outText to outText & " # Read from an existing session:" & LF
set outText to outText & " osascript " & scriptName & " \"" & exampleProject & "\" \"frontend_build\" 50" & LF & LF
set outText to outText & "Parameters:" & LF
set outText to outText & " [\"/absolute/project/path\"]: (Optional First Arg) Base path for project. Enables 'cd' and grouping." & LF
set outText to outText & " \"<task_tag_name>\": Required. Specific task name for the tab (e.g., 'build', 'tests')." & LF
set outText to outText & " [\"<shell_command_parts...>\"]: (Optional) Command. If path provided, 'cd path &&' is prepended." & LF
set outText to outText & " Use \"\" for no command (will just 'cd' if path given)." & LF
set outText to outText & " [[lines_to_read]]: (Optional Last Arg) Number of history lines. Default: " & defaultTailLines & "." & LF & LF
set outText to outText & "Notes:" & LF
set outText to outText & " • Provide project path on first use for a project for best window grouping and auto 'cd'." & LF
set outText to outText & " • Ensure Automation permissions for Terminal.app & System Events.app." & LF
set outText to outText & " • Works within Terminal.app's AppleScript limitations for reliable operation." & LF
return outText
end usageText
--#endregion Helper Functions

9
.github/CODEOWNERS vendored
View File

@ -1,9 +0,0 @@
# Protect ownership and automation rules.
/.github/CODEOWNERS @openclaw/openclaw-secops
/.github/workflows/ @openclaw/openclaw-secops
/package.json @openclaw/openclaw-secops
/pnpm-lock.yaml @openclaw/openclaw-secops
/Package.swift @openclaw/openclaw-secops
/.github/actionlint.yaml @openclaw/openclaw-secops
/.agents/skills/ @openclaw/openclaw-secops
/.crabbox.yaml @openclaw/openclaw-secops

View File

@ -1,5 +0,0 @@
self-hosted-runner:
labels:
- crabbox
- openclaw
- peekaboo

View File

@ -13,9 +13,9 @@ on:
jobs:
macos-host:
runs-on: macos-15
runs-on: macos-latest
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
submodules: recursive
- name: Swift version
@ -26,7 +26,7 @@ jobs:
run: swift test
apple-simulators:
runs-on: macos-15
runs-on: macos-latest
needs: macos-host
strategy:
matrix:
@ -44,7 +44,7 @@ jobs:
run:
working-directory: Commander
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
submodules: recursive
- name: Build for ${{ matrix.platform }}
@ -60,7 +60,7 @@ jobs:
runs-on: ubuntu-24.04
needs: macos-host
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
submodules: recursive
- uses: SwiftyLab/setup-swift@v1
@ -69,3 +69,22 @@ jobs:
- name: Test (Linux)
working-directory: Commander
run: swift test
android:
if: github.event_name == 'workflow_dispatch'
continue-on-error: true
runs-on: ubuntu-22.04
needs: macos-host
steps:
- uses: actions/checkout@v4
with:
submodules: recursive
- name: Android build + test
uses: skiptools/swift-android-action@v2
with:
swift-version: '6.2.1'
swift-branch: 'release/6.2.1'
package-path: 'Commander'
swift-configuration: 'debug'
android-api-level: 34
android-emulator-options: '-no-window -noaudio -no-boot-anim'

View File

@ -1,126 +0,0 @@
name: Crabbox Hydrate
on:
workflow_dispatch:
inputs:
crabbox_id:
description: "Crabbox lease ID"
required: true
type: string
ref:
description: "Git ref to hydrate"
required: false
type: string
crabbox_runner_label:
description: "Dynamic Crabbox runner label"
required: true
type: string
crabbox_job:
description: "Hydration job identifier expected by Crabbox"
required: false
default: "hydrate"
type: string
crabbox_keep_alive_minutes:
description: "Minutes to keep the hydrated job alive"
required: false
default: "90"
type: string
permissions:
contents: read
env:
NODE_VERSION: "24"
PNPM_VERSION: "11.1.2"
jobs:
hydrate:
name: hydrate
runs-on: [self-hosted, crabbox, openclaw, peekaboo, "${{ inputs.crabbox_runner_label }}"]
timeout-minutes: 120
steps:
- uses: actions/checkout@v6
with:
ref: ${{ inputs.ref || github.ref }}
- uses: pnpm/action-setup@v6.0.8
with:
version: ${{ env.PNPM_VERSION }}
- uses: actions/setup-node@v6
with:
node-version: ${{ env.NODE_VERSION }}
cache: pnpm
- name: Prepare pnpm and Swift workspace
shell: bash
run: |
set -euo pipefail
git fetch --no-tags --depth=50 origin "+refs/heads/main:refs/remotes/origin/main"
pnpm install --frozen-lockfile
node --version
pnpm --version
swift --version
- name: Mark Crabbox ready
shell: bash
env:
CRABBOX_ID: ${{ inputs.crabbox_id }}
CRABBOX_JOB: ${{ inputs.crabbox_job }}
run: |
set -euo pipefail
job="${CRABBOX_JOB}"
if [ -z "$job" ]; then job=hydrate; fi
case "$CRABBOX_ID" in
''|*[!A-Za-z0-9._-]*)
echo "Invalid crabbox_id" >&2
exit 2
;;
esac
mkdir -p "$HOME/.crabbox/actions"
state="$HOME/.crabbox/actions/${CRABBOX_ID}.env"
env_file="$HOME/.crabbox/actions/${CRABBOX_ID}.env.sh"
{
for key in CI GITHUB_ACTIONS GITHUB_WORKSPACE GITHUB_REPOSITORY GITHUB_RUN_ID GITHUB_RUN_NUMBER GITHUB_RUN_ATTEMPT GITHUB_REF GITHUB_REF_NAME GITHUB_SHA GITHUB_EVENT_NAME GITHUB_ACTOR RUNNER_OS RUNNER_ARCH RUNNER_TEMP RUNNER_TOOL_CACHE PATH; do
value="${!key-}"
if [ -n "$value" ]; then
printf 'export %s=%q\n' "$key" "$value"
fi
done
} > "${env_file}.tmp"
mv "${env_file}.tmp" "$env_file"
tmp="${state}.tmp"
{
echo "WORKSPACE=${GITHUB_WORKSPACE}"
echo "RUN_ID=${GITHUB_RUN_ID}"
echo "JOB=${job}"
echo "ENV_FILE=${env_file}"
echo "READY_AT=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
} > "$tmp"
mv "$tmp" "$state"
- name: Keep Crabbox job alive
shell: bash
env:
CRABBOX_ID: ${{ inputs.crabbox_id }}
CRABBOX_KEEP_ALIVE_MINUTES: ${{ inputs.crabbox_keep_alive_minutes }}
run: |
set -euo pipefail
case "$CRABBOX_ID" in
''|*[!A-Za-z0-9._-]*)
echo "Invalid crabbox_id" >&2
exit 2
;;
esac
minutes="${CRABBOX_KEEP_ALIVE_MINUTES}"
case "$minutes" in
''|*[!0-9]*) minutes=90 ;;
esac
stop="$HOME/.crabbox/actions/${CRABBOX_ID}.stop"
deadline=$(( $(date +%s) + minutes * 60 ))
while [ "$(date +%s)" -lt "$deadline" ]; do
if [ -f "$stop" ]; then
exit 0
fi
sleep 15
done

View File

@ -10,32 +10,29 @@ concurrency:
group: macos-ci-${{ github.ref }}
cancel-in-progress: true
env:
SWIFT_TOOLCHAIN_ID: swift-6.2-RELEASE
SWIFT_TOOLCHAIN_NAME: swift
SWIFT_TOOLCHAIN_URL: https://download.swift.org/swift-6.2-release/xcode/swift-6.2-RELEASE/swift-6.2-RELEASE-osx.pkg
jobs:
peekaboo-core:
name: PeekabooCore build & tests
runs-on: macos-15
runs-on: macos-latest
env:
PEEKABOO_INCLUDE_AUTOMATION_TESTS: "false"
RUN_AUTOMATION_TESTS: "false"
RUN_LOCAL_TESTS: "false"
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
submodules: recursive
fetch-depth: 1
- name: Install Bun runtime
uses: oven-sh/setup-bun@v2
with:
bun-version: "latest"
- name: Docs lint
run: node scripts/docs-lint.mjs
- name: Select Xcode 26.2 (if present) or fallback to default
- name: Select Xcode 26.1.1 (if present) or fallback to default
run: |
set -euo pipefail
for candidate in /Applications/Xcode_26.2.app /Applications/Xcode_26.1.app /Applications/Xcode_26.0.app /Applications/Xcode_16.4.app /Applications/Xcode_16.3.app /Applications/Xcode.app; do
for candidate in /Applications/Xcode_26.1.1.app /Applications/Xcode_26.1.app /Applications/Xcode.app; do
if [[ -d "$candidate" ]]; then
sudo xcode-select -s "$candidate"
echo "DEVELOPER_DIR=${candidate}/Contents/Developer" >> "$GITHUB_ENV"
@ -74,7 +71,7 @@ jobs:
echo "key=${CACHE_PREFIX}${HASH}" >> "$GITHUB_OUTPUT"
- name: Cache SwiftPM (PeekabooCore)
uses: actions/cache@v5
uses: actions/cache@v4
with:
path: |
~/.swiftpm
@ -84,64 +81,60 @@ jobs:
restore-keys: |
${{ runner.os }}-spm-core-
- name: Clean SwiftPM trait state (PeekabooCore)
- name: Install Swift 6.2 toolchain
run: |
set -euo pipefail
# SwiftPM caches evaluated manifests in an sqlite DB. We've seen this get stale across
# `swift-configuration` trait renames (e.g. `JSONSupport` -> `JSON`) and break resolution.
for root in "$HOME/Library/Caches/org.swift.swiftpm" "$HOME/.cache/org.swift.swiftpm"; do
if [ -d "$root" ]; then
find "$root" -type f -name "manifest.db*" -print -delete || true
find "$root" -type f -name "manifests.db*" -print -delete || true
find "$root" -type f -name "package-collection.db*" -print -delete || true
fi
done
# SwiftPM traits can be persisted into `.swiftpm/configuration/traits.json` (and can get stuck in caches).
# When upstream packages rename/remove traits, stale state can break builds.
find ~/.swiftpm -type f -name traits.json -print -delete || true
if [ -d ~/.cache/org.swift.swiftpm ]; then
find ~/.cache/org.swift.swiftpm -type d -name .swiftpm -prune -print -exec rm -rf {} + || true
if [ ! -d "/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain" ]; then
curl -L "${SWIFT_TOOLCHAIN_URL}" -o /tmp/swift-toolchain.pkg
sudo installer -pkg /tmp/swift-toolchain.pkg -target /
fi
if [ -d Core/PeekabooCore/.swiftpm ]; then
find Core/PeekabooCore/.swiftpm -type f -name traits.json -print -delete || true
fi
if [ -d Core/PeekabooCore/.build/checkouts ]; then
find Core/PeekabooCore/.build/checkouts -type d -name .swiftpm -prune -print -exec rm -rf {} + || true
- name: Show Swift toolchain version
run: xcrun --toolchain ${SWIFT_TOOLCHAIN_NAME} swift --version
- name: Export Swift toolchain binary
run: |
echo "SWIFT_BIN=/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain/usr/bin/swift" >> "$GITHUB_ENV"
- name: Export Swift runtime library path
run: |
RUNTIME_PATH="/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain/usr/lib/swift/macosx"
echo "SWIFT_RUNTIME_LIB=${RUNTIME_PATH}" >> "$GITHUB_ENV"
if [ -n "${DYLD_LIBRARY_PATH:-}" ]; then
echo "DYLD_LIBRARY_PATH=${RUNTIME_PATH}:${DYLD_LIBRARY_PATH}" >> "$GITHUB_ENV"
else
echo "DYLD_LIBRARY_PATH=${RUNTIME_PATH}" >> "$GITHUB_ENV"
fi
- name: Show Xcode version
run: xcodebuild -version
- name: Show Swift toolchain version
run: swift --version
- name: Build PeekabooCore
working-directory: Core/PeekabooCore
run: |
swift build --configuration debug
"$SWIFT_BIN" build --configuration debug
- name: Run focused Swift tests
working-directory: Core/PeekabooCore
run: |
swift test --no-parallel --filter ScreenCaptureServiceFlowTests
"$SWIFT_BIN" test --filter ScreenCaptureServiceFlowTests
peekaboo-cli:
name: Peekaboo CLI build & tests
runs-on: macos-15
runs-on: macos-latest
needs: peekaboo-core
env:
PEEKABOO_INCLUDE_AUTOMATION_TESTS: "false"
PEEKABOO_SKIP_AUTOMATION: "1"
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
submodules: recursive
fetch-depth: 1
- name: Select Xcode 26.2 (if present) or fallback to default
- name: Select Xcode 26.1.1 (if present) or fallback to default
run: |
set -euo pipefail
for candidate in /Applications/Xcode_26.2.app /Applications/Xcode_26.1.app /Applications/Xcode_26.0.app /Applications/Xcode_16.4.app /Applications/Xcode_16.3.app /Applications/Xcode.app; do
for candidate in /Applications/Xcode_26.1.1.app /Applications/Xcode_26.1.app /Applications/Xcode.app; do
if [[ -d "$candidate" ]]; then
sudo xcode-select -s "$candidate"
echo "DEVELOPER_DIR=${candidate}/Contents/Developer" >> "$GITHUB_ENV"
@ -180,7 +173,7 @@ jobs:
echo "key=${CACHE_PREFIX}${HASH}" >> "$GITHUB_OUTPUT"
- name: Cache SwiftPM (CLI)
uses: actions/cache@v5
uses: actions/cache@v4
with:
path: |
~/.swiftpm
@ -190,35 +183,29 @@ jobs:
restore-keys: |
${{ runner.os }}-spm-cli-
- name: Clean SwiftPM trait state (CLI)
- name: Install Swift 6.2 toolchain
run: |
set -euo pipefail
# SwiftPM caches evaluated manifests in an sqlite DB. We've seen this get stale across
# `swift-configuration` trait renames (e.g. `JSONSupport` -> `JSON`) and break resolution.
for root in "$HOME/Library/Caches/org.swift.swiftpm" "$HOME/.cache/org.swift.swiftpm"; do
if [ -d "$root" ]; then
find "$root" -type f -name "manifest.db*" -print -delete || true
find "$root" -type f -name "manifests.db*" -print -delete || true
find "$root" -type f -name "package-collection.db*" -print -delete || true
fi
done
# SwiftPM traits can be persisted into `.swiftpm/configuration/traits.json` (and can get stuck in caches).
# When upstream packages rename/remove traits, stale state can break builds.
find ~/.swiftpm -type f -name traits.json -print -delete || true
if [ -d ~/.cache/org.swift.swiftpm ]; then
find ~/.cache/org.swift.swiftpm -type d -name .swiftpm -prune -print -exec rm -rf {} + || true
if [ ! -d "/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain" ]; then
curl -L "${SWIFT_TOOLCHAIN_URL}" -o /tmp/swift-toolchain.pkg
sudo installer -pkg /tmp/swift-toolchain.pkg -target /
fi
if [ -d Apps/CLI/.swiftpm ]; then
find Apps/CLI/.swiftpm -type f -name traits.json -print -delete || true
fi
if [ -d Apps/CLI/.build/checkouts ]; then
find Apps/CLI/.build/checkouts -type d -name .swiftpm -prune -print -exec rm -rf {} + || true
fi
# Avoid caching any package-local state that might remember old trait selections.
rm -rf Apps/CLI/.build || true
- name: Show Swift toolchain version
run: swift --version
run: xcrun --toolchain ${SWIFT_TOOLCHAIN_NAME} swift --version
- name: Export Swift toolchain binary
run: |
echo "SWIFT_BIN=/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain/usr/bin/swift" >> "$GITHUB_ENV"
- name: Export Swift runtime library path
run: |
RUNTIME_PATH="/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain/usr/lib/swift/macosx"
echo "SWIFT_RUNTIME_LIB=${RUNTIME_PATH}" >> "$GITHUB_ENV"
if [ -n "${DYLD_LIBRARY_PATH:-}" ]; then
echo "DYLD_LIBRARY_PATH=${RUNTIME_PATH}:${DYLD_LIBRARY_PATH}" >> "$GITHUB_ENV"
else
echo "DYLD_LIBRARY_PATH=${RUNTIME_PATH}" >> "$GITHUB_ENV"
fi
- name: Show Xcode version
run: xcodebuild -version
@ -226,31 +213,30 @@ jobs:
- name: Build CLI target
working-directory: Apps/CLI
run: |
swift build --configuration debug
echo "PEEKABOO_CLI_BINARY=$(swift build --configuration debug --show-bin-path)/peekaboo" >> "$GITHUB_ENV"
"$SWIFT_BIN" build --configuration debug
- name: Run CLI unit tests (skip automation)
working-directory: Apps/CLI
run: |
swift test --no-parallel -Xswiftc -DPEEKABOO_SKIP_AUTOMATION
"$SWIFT_BIN" test -Xswiftc -DPEEKABOO_SKIP_AUTOMATION
tachikoma:
name: Tachikoma build & tests
runs-on: macos-15
runs-on: macos-latest
needs: peekaboo-cli
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
submodules: recursive
fetch-depth: 1
- name: Select Xcode 26.2 (if present) or fallback to default
- name: Select Xcode 26.1.1 (if present) or fallback to default
run: |
set -euo pipefail
for candidate in /Applications/Xcode_26.2.app /Applications/Xcode_26.1.app /Applications/Xcode_26.0.app /Applications/Xcode_16.4.app /Applications/Xcode_16.3.app /Applications/Xcode.app; do
for candidate in /Applications/Xcode_26.1.1.app /Applications/Xcode_26.1.app /Applications/Xcode.app; do
if [[ -d "$candidate" ]]; then
sudo xcode-select -s "$candidate"
echo "DEVELOPER_DIR=${candidate}/Contents/Developer" >> "$GITHUB_ENV"
@ -294,7 +280,7 @@ jobs:
echo "key=${CACHE_PREFIX}${HASH}" >> "$GITHUB_OUTPUT"
- name: Cache SwiftPM (Tachikoma)
uses: actions/cache@v5
uses: actions/cache@v4
with:
path: |
~/.swiftpm
@ -304,33 +290,29 @@ jobs:
restore-keys: |
${{ runner.os }}-spm-tachikoma-
- name: Clean SwiftPM trait state (Tachikoma)
- name: Install Swift 6.2 toolchain
run: |
set -euo pipefail
# SwiftPM caches evaluated manifests in an sqlite DB. We've seen this get stale across
# `swift-configuration` trait renames (e.g. `JSONSupport` -> `JSON`) and break resolution.
for root in "$HOME/Library/Caches/org.swift.swiftpm" "$HOME/.cache/org.swift.swiftpm"; do
if [ -d "$root" ]; then
find "$root" -type f -name "manifest.db*" -print -delete || true
find "$root" -type f -name "manifests.db*" -print -delete || true
find "$root" -type f -name "package-collection.db*" -print -delete || true
fi
done
# SwiftPM traits can be persisted into `.swiftpm/configuration/traits.json` (and can get stuck in caches).
# When upstream packages rename/remove traits, stale state can break builds.
find ~/.swiftpm -type f -name traits.json -print -delete || true
if [ -d ~/.cache/org.swift.swiftpm ]; then
find ~/.cache/org.swift.swiftpm -type d -name .swiftpm -prune -print -exec rm -rf {} + || true
fi
if [ -d Tachikoma/.swiftpm ]; then
find Tachikoma/.swiftpm -type f -name traits.json -print -delete || true
fi
if [ -d Tachikoma/.build/checkouts ]; then
find Tachikoma/.build/checkouts -type d -name .swiftpm -prune -print -exec rm -rf {} + || true
if [ ! -d "/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain" ]; then
curl -L "${SWIFT_TOOLCHAIN_URL}" -o /tmp/swift-toolchain.pkg
sudo installer -pkg /tmp/swift-toolchain.pkg -target /
fi
- name: Show Swift toolchain version
run: swift --version
run: xcrun --toolchain ${SWIFT_TOOLCHAIN_NAME} swift --version
- name: Export Swift toolchain binary
run: |
echo "SWIFT_BIN=/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain/usr/bin/swift" >> "$GITHUB_ENV"
- name: Export Swift runtime library path
run: |
RUNTIME_PATH="/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain/usr/lib/swift/macosx"
echo "SWIFT_RUNTIME_LIB=${RUNTIME_PATH}" >> "$GITHUB_ENV"
if [ -n "${DYLD_LIBRARY_PATH:-}" ]; then
echo "DYLD_LIBRARY_PATH=${RUNTIME_PATH}:${DYLD_LIBRARY_PATH}" >> "$GITHUB_ENV"
else
echo "DYLD_LIBRARY_PATH=${RUNTIME_PATH}" >> "$GITHUB_ENV"
fi
- name: Show Xcode version
run: xcodebuild -version
@ -338,27 +320,27 @@ jobs:
- name: Build Tachikoma
working-directory: Tachikoma
run: |
swift build --configuration debug
"$SWIFT_BIN" build --configuration debug
- name: Run Tachikoma unit tests
working-directory: Tachikoma
run: |
swift test --no-parallel --filter unit
"$SWIFT_BIN" test --filter unit
mac-apps:
name: Build macOS apps (Peekaboo + Inspector)
runs-on: macos-15
runs-on: macos-latest
needs: [peekaboo-cli, tachikoma]
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
submodules: recursive
fetch-depth: 1
- name: Select Xcode 26.2 (if present) or fallback to default
- name: Select Xcode 26.1.1 (if present) or fallback to default
run: |
set -euo pipefail
for candidate in /Applications/Xcode_26.2.app /Applications/Xcode_26.1.app /Applications/Xcode_26.0.app /Applications/Xcode_16.4.app /Applications/Xcode_16.3.app /Applications/Xcode.app; do
for candidate in /Applications/Xcode_26.1.1.app /Applications/Xcode_26.1.app /Applications/Xcode.app; do
if [[ -d "$candidate" ]]; then
sudo xcode-select -s "$candidate"
echo "DEVELOPER_DIR=${candidate}/Contents/Developer" >> "$GITHUB_ENV"
@ -367,6 +349,13 @@ jobs:
done
/usr/bin/xcodebuild -version
- name: Install Swift 6.2 toolchain
run: |
if [ ! -d "/Library/Developer/Toolchains/${SWIFT_TOOLCHAIN_ID}.xctoolchain" ]; then
curl -L "${SWIFT_TOOLCHAIN_URL}" -o /tmp/swift-toolchain.pkg
sudo installer -pkg /tmp/swift-toolchain.pkg -target /
fi
- name: Build Peekaboo app (Xcode)
working-directory: Apps
run: |
@ -407,10 +396,10 @@ jobs:
lint:
name: SwiftLint (core + CLI)
runs-on: macos-15
runs-on: macos-latest
needs: [peekaboo-cli, tachikoma, mac-apps]
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
- name: Install SwiftLint
run: brew install swiftlint

View File

@ -1,63 +0,0 @@
name: Website (GitHub Pages)
on:
push:
branches: [main]
paths:
- "docs/**"
- "scripts/build-docs-site.mjs"
- "scripts/docs-site-assets.mjs"
- ".github/workflows/pages.yml"
workflow_dispatch:
permissions:
contents: read
pages: write
id-token: write
concurrency:
group: pages
cancel-in-progress: false
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- name: Checkout
uses: actions/checkout@v6
- name: Set up Node
uses: actions/setup-node@v6
with:
node-version: "24"
- name: Build docs site
run: node scripts/build-docs-site.mjs
- name: Validate docs site artifact
run: |
test -f _site/.nojekyll
test -f _site/.well-known/security.txt
test -f _site/security.txt
test -f _site/llms.txt
- name: Configure Pages
uses: actions/configure-pages@v6
- name: Upload artifact
uses: actions/upload-pages-artifact@v5
with:
path: _site
include-hidden-files: true
deploy:
needs: build
runs-on: ubuntu-latest
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v5

23
.github/workflows/run-gemini-cli.yml vendored Normal file
View File

@ -0,0 +1,23 @@
name: Run Gemini CLI
on:
pull_request:
types: [opened, synchronize]
issue_comment:
types: [created]
workflow_dispatch:
jobs:
run-gemini:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Run Gemini CLI
uses: google-github-actions/run-gemini-cli@main
with:
gemini_api_key: ${{ secrets.GEMINI_API_KEY }}
# Optional: Specify custom prompt or configuration
# prompt: "Review this code for improvements"

View File

@ -13,57 +13,89 @@ jobs:
update-homebrew-formula:
runs-on: ubuntu-latest
steps:
- name: Resolve release tag
- name: Checkout repository
uses: actions/checkout@v4
- name: Set version
id: version
run: |
if [ "${{ github.event_name }}" = "release" ]; then
echo "RELEASE_TAG=${{ github.event.release.tag_name }}" >> "$GITHUB_ENV"
VERSION="${{ github.event.release.tag_name }}"
else
echo "RELEASE_TAG=v${{ github.event.inputs.version }}" >> "$GITHUB_ENV"
VERSION="v${{ github.event.inputs.version }}"
fi
# Remove 'v' prefix if present
VERSION="${VERSION#v}"
echo "version=${VERSION}" >> $GITHUB_OUTPUT
echo "tag=v${VERSION}" >> $GITHUB_OUTPUT
- name: Dispatch tap formula update
env:
GH_TOKEN: ${{ secrets.HOMEBREW_TAP_TOKEN }}
- name: Download release artifact
run: |
if [ -z "$GH_TOKEN" ]; then
echo "::error::Set HOMEBREW_TAP_TOKEN with workflow access to steipete/homebrew-tap"
exit 1
fi
VERSION="${{ steps.version.outputs.version }}"
TAG="${{ steps.version.outputs.tag }}"
echo "Downloading release artifact for ${TAG}..."
curl -L -o peekaboo-macos-universal.tar.gz \
"https://github.com/steipete/peekaboo/releases/download/${TAG}/peekaboo-macos-universal.tar.gz"
request_id="peekaboo-${RELEASE_TAG}-${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}"
expected_title="Update peekaboo for ${RELEASE_TAG} (${request_id})"
- name: Calculate SHA256
id: sha256
run: |
SHA256=$(sha256sum peekaboo-macos-universal.tar.gz | cut -d' ' -f1)
echo "sha256=${SHA256}" >> $GITHUB_OUTPUT
echo "SHA256: ${SHA256}"
gh workflow run update-formula.yml \
--repo steipete/homebrew-tap \
--ref main \
-f formula=peekaboo \
-f tag="$RELEASE_TAG" \
-f repository="${{ github.repository }}" \
-f macos_artifact="peekaboo-macos-universal.tar.gz" \
-f request_id="$request_id"
- name: Update Homebrew formula
run: |
VERSION="${{ steps.version.outputs.version }}"
SHA256="${{ steps.sha256.outputs.sha256 }}"
# Update the formula file
sed -i "s|url \".*\"|url \"https://github.com/steipete/peekaboo/releases/download/v${VERSION}/peekaboo-macos-universal.tar.gz\"|" homebrew/peekaboo.rb
sed -i "s|sha256 \".*\"|sha256 \"${SHA256}\"|" homebrew/peekaboo.rb
sed -i "s|version \".*\"|version \"${VERSION}\"|" homebrew/peekaboo.rb
run_id=""
for _ in {1..30}; do
run_id=$(gh run list \
--repo steipete/homebrew-tap \
--workflow update-formula.yml \
--branch main \
--event workflow_dispatch \
--limit 20 \
--json databaseId,displayTitle \
--jq ".[] | select(.displayTitle == \"$expected_title\") | .databaseId" | head -n1)
if [ -n "$run_id" ]; then
break
fi
sleep 5
done
- name: Checkout homebrew tap
uses: actions/checkout@v4
with:
repository: steipete/homebrew-tap
token: ${{ secrets.HOMEBREW_TAP_TOKEN }}
path: homebrew-tap
if [ -z "$run_id" ]; then
echo "::error::Could not find tap workflow run with title: $expected_title"
exit 1
fi
- name: Copy updated formula to tap
run: |
mkdir -p homebrew-tap/Formula
cp homebrew/peekaboo.rb homebrew-tap/Formula/
gh run watch "$run_id" \
--repo steipete/homebrew-tap \
--exit-status \
--interval 10
- name: Commit and push to tap
run: |
cd homebrew-tap
git config user.name "GitHub Actions"
git config user.email "actions@github.com"
VERSION="${{ steps.version.outputs.version }}"
git add Formula/peekaboo.rb
git commit -m "Update Peekaboo to v${VERSION}" || echo "No changes to commit"
git push
- name: Update formula in main repo
if: github.event_name == 'release'
run: |
git config user.name "GitHub Actions"
git config user.email "actions@github.com"
VERSION="${{ steps.version.outputs.version }}"
git add homebrew/peekaboo.rb
git commit -m "Update Homebrew formula for v${VERSION}" || echo "No changes to commit"
# Create a PR instead of pushing directly to main
git checkout -b update-homebrew-formula-v${VERSION}
git push origin update-homebrew-formula-v${VERSION}
# Create PR using GitHub CLI
gh pr create \
--title "Update Homebrew formula for v${VERSION}" \
--body "Automated update of Homebrew formula to version ${VERSION}" \
--base main \
--head update-homebrew-formula-v${VERSION}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

5
.gitignore vendored
View File

@ -76,8 +76,6 @@ lib-cov/
.c9/
*.launch
.settings/
.claude/settings.local.json
_site/
*.sublime-workspace
*.sublime-project
@ -134,7 +132,6 @@ timeline.xctimeline
/DerivedData/**/*.app
/Apps/Mac/build/*.app
/Apps/Mac/DerivedData/**/*.app
/Apps/peekaboo
*.ipa
*.dSYM.zip
*.dSYM
@ -214,8 +211,6 @@ Apps/CLI/.generated/
Commander/Commander.tar.gz
# Test images and screenshots
Core/PeekabooCore/..png
Core/PeekabooCore/..png_annotated.png
*_screenshot.png
*_Screenshot_*.png
Calculator_*.png

8
.gitmodules vendored
View File

@ -1,20 +1,12 @@
[submodule "AXorcist"]
path = AXorcist
url = https://github.com/steipete/AXorcist.git
branch = main
[submodule "Tachikoma"]
path = Tachikoma
url = https://github.com/steipete/Tachikoma.git
branch = main
[submodule "Commander"]
path = Commander
url = https://github.com/steipete/Commander.git
branch = main
[submodule "TauTUI"]
path = TauTUI
url = https://github.com/steipete/TauTUI.git
branch = main
[submodule "Swiftdansi"]
path = Swiftdansi
url = https://github.com/steipete/Swiftdansi.git
branch = main

View File

@ -1,21 +0,0 @@
MAC_RELEASE_APP_NAME=Peekaboo
MAC_RELEASE_REPO=openclaw/Peekaboo
MAC_RELEASE_BUNDLE_ID=boo.peekaboo.mac
MAC_RELEASE_VERSION_FILE=/dev/null
MARKETING_VERSION=$(node -p "require('./package.json').version")
MAC_RELEASE_APPCAST=appcast.xml
MAC_RELEASE_INFO_PLIST=Apps/Mac/Peekaboo/Info.plist
MAC_RELEASE_SUPUBLIC_ED_KEY=AGCY8w5vHirVfGGDGc8Szc5iuOqupZSh9pMj/Qs67XI=
MAC_RELEASE_SIGNING_KEY_FILE='$HOME/Library/CloudStorage/Dropbox/Backup/Sparkle/sparkle-private-key-OBSOLETE-not-for-BlackBar-publickey-AGCY8w5vHirVfGGDGc8Szc5iuOqupZSh9pMj_Qs67XI-2026-05-21.txt'
MAC_RELEASE_APP_ZIP='${RELEASE_DIR:-release}/Peekaboo-${MARKETING_VERSION}.app.zip'
MAC_RELEASE_ARTIFACT_PREFIX='Peekaboo-'
MAC_RELEASE_REQUIRE_DSYM=0
MAC_RELEASE_FEED_URL='https://raw.githubusercontent.com/openclaw/Peekaboo/main/appcast.xml'
MAC_RELEASE_DOWNLOAD_URL_PREFIX='https://github.com/openclaw/Peekaboo/releases/download/v${MARKETING_VERSION}/'
MAC_RELEASE_PRECHECK='node scripts/prepare-release.js'
MAC_RELEASE_PACKAGE_CMD='scripts/release-macos-app.sh --no-appcast'
MAC_RELEASE_TAG_SIGNED=0
MAC_RELEASE_TAG_FORCE=0
MAC_RELEASE_TAG_ANNOTATED=0

View File

@ -48,4 +48,4 @@
--allman false
# Exclusions
--exclude .build,.swiftpm,DerivedData,node_modules,dist,coverage,xcuserdata,AXorcist,Commander,Swiftdansi,Tachikoma,TauTUI,Core/PeekabooCore/Sources/PeekabooCore/Extensions/NSArray+Extensions.swift
--exclude .build,.swiftpm,DerivedData,node_modules,dist,coverage,xcuserdata,Core/PeekabooCore/Sources/PeekabooCore/Extensions/NSArray+Extensions.swift

View File

@ -72,4 +72,4 @@
"generated_at": "2025-11-22T11:35:16.426Z",
"total_exclusions": 53
}
}
}

View File

@ -13,12 +13,11 @@
## Build, Test, and Development Commands
- Current local baseline is macOS 26.1 on arm64. If youre on an older SDK/OS, expect menubar/accessibility flakiness; re-run with the 26 SDK before chasing Peekaboo regressions.
- Run tools directly (runner removed). Use pnpm (Corepack-enabled).
- Always route tools through `./runner` (already baked into package scripts). Use pnpm (Corepack-enabled).
- Build the CLI: `pnpm run build:cli` (debug) or `pnpm run build:swift:all` (universal release). For arm64-only: `pnpm run build:swift`.
- Rapid rebuilds while editing Swift: `pnpm run poltergeist:haunt` → check with `pnpm run poltergeist:status`, stop via `pnpm run poltergeist:rest`.
- Validate before handoff: `pnpm run lint` (SwiftLint), `pnpm run format` (SwiftFormat check/fix), then `pnpm run test:safe`. Full automation/UI tests: `pnpm run test:automation` or `pnpm run test:all`.
- Tachikoma live provider checks: `pnpm run tachikoma:test:integration`.
- You may run `peekaboo` CLI commands locally for repros/debugging; be mindful they capture the host desktop (screen recording/accessibility permissions required).
## Coding Style & Naming Conventions
- Swift 6.2, 4-space indent, 120-column wrap; explicit `self` is required (SwiftFormat enforces). Run `pnpm run format` before committing.
@ -32,14 +31,7 @@
## Commit & Pull Request Guidelines
- Conventional Commits (`feat|fix|chore|docs|test|refactor|build|ci|style|perf`); scope optional: `feat(cli): add capture retry`.
- Use `./scripts/committer "type(scope): summary" <paths…>` to stage and create commits; avoid raw `git add`.
- Batch git network ops in groups: commit related repo changes first, then push/pull repos together so submodule gitlinks stay coherent.
- PRs should summarize intent, list test commands executed, mention doc updates, and include screenshots or terminal snippets when behavior changes.
- Never release or publish without an explicit release command.
- Peekaboo releases: follow `$release-peekaboo`; current Mac + existing 1Password credentials first. App Store Connect changes last resort, only after same-item `notarytool history` and non-S3 `submit` both fail.
- Credentialed release wrappers: `bash -c`, never login shells; profile exports can override ASC IDs and mix credentials.
- Published CLI proof: run `npm exec` from `/tmp`; repo cwd may shadow the downloaded package with a local binary.
- During PR triage, keep moving autonomously: fix defects, add obvious scoped features, and rewrite or land what makes sense.
- Before landing every PR, run autoreview until no actionable findings remain and fix or rerun CI until green.
## Security & Configuration Tips
- Secrets and provider tokens live under `~/.peekaboo` (managed by Tachikoma); never commit credentials or sample keys.

@ -1 +1 @@
Subproject commit c276ac88a0ebddb2a618b31092715d6df87456e0
Subproject commit a6b912702ae04c637bf0e2aa8feb2cf25b39e557

View File

@ -19,7 +19,7 @@ final class AcceleratedTextDetector {
// MARK: - Properties
/// Sobel kernels as Int16 for vImage convolution
// Sobel kernels as Int16 for vImage convolution
private let sobelXKernel: [Int16] = [
-1, 0, 1,
-2, 0, 2,
@ -42,7 +42,7 @@ final class AcceleratedTextDetector {
private let maxBufferWidth: Int = 200
private let maxBufferHeight: Int = 100
/// Edge detection threshold (0-255 scale)
// Edge detection threshold (0-255 scale)
private let edgeThreshold: UInt8 = 30
// MARK: - Initialization

View File

@ -18,7 +18,7 @@ final class SmartLabelPlacer {
private let labelSpacing: CGFloat = 3
private let cornerInset: CGFloat = 2
/// Label placement debugging
// Label placement debugging
private let debugMode: Bool
// MARK: - Initialization

View File

@ -5,139 +5,10 @@ All notable changes to Peekaboo CLI will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [3.5.3] - 2026-06-13
### Fixed
- Public CLI, agent, MCP, and API guidance now treats runtime element IDs as opaque strings to copy exactly instead of implying role-specific ID shapes. Thanks @coygeek for #194.
- JSON-only `peekaboo see` runs without `--path` now keep required screenshots in snapshot storage instead of leaving files on Desktop or exposing their temporary paths. Thanks @coygeek for #196.
- Background element/query/coordinate clicks now pin actions to the requested process and exact window, reject mismatched window/PID selectors and unverifiable snapshots, invalidate implicit latest snapshots without deleting history, and no longer require Event Synthesizing when Accessibility completes the click.
- App launch, open, and inventory commands now use the selected runtime host, fixing sandboxed LaunchServices failures; launch/open preserve `--no-focus` and caller-relative app paths, relaunch preflights and keeps quit/wait/launch in one daemon-held transaction, build-scoped fallback daemons remain reusable and controllable across native/Rosetta execution and executable upgrades, incompatible legacy hosts no longer force sandboxed local fallback, and inventory ignores unrelated input overrides.
- Agent, MCP, script, CLI, and bridge mutations now advance implicit-snapshot watermarks at host-confirmed completion or observation boundaries, keep durable pending barriers across client timeouts/disconnects without hiding the acting command's own snapshot, carry remote script observation certificates, recover safely from PID reuse, ignore unavailable alternate hosts after protecting the selected/local stores, and preserve explicit snapshot history.
## [3.5.2] - 2026-06-13
### Changed
- `peekaboo type` and the MCP `type` tool now default to zero-delay linear typing; supplying `--wpm`/`wpm` still opts into human cadence.
### Fixed
- Synchronized Tachikoma's OpenAI `gpt-5-chat-latest` catalog metadata so configured models apply the correct GPT-5 parameter filtering.
## [3.5.1] - 2026-06-12
### Fixed
- `peekaboo see` now returns at its configured wall-clock deadline when suspended capture or detection work ignores task cancellation, while preserving explicit command cancellation.
## [3.5.0] - 2026-06-12
## [Unreleased]
### Added
- `peekaboo agent` now supports explicit Claude Fable 5 (`claude-fable-5`) selection with 1M context and 128K max output while keeping Anthropic defaults on Opus 4.8 for zero-retention compatibility.
### Changed
- Agent runs now honor the saved `agent.temperature` and `agent.maxTokens` values shared by the CLI and macOS Settings UI, clamp them to each provider's capabilities, infer Fable limits through compatible providers, and omit unsupported sampling parameters for GPT-5 and current Anthropic reasoning models.
- Project, issue, build, release, and app About links now use the canonical `openclaw/Peekaboo` repository.
### Fixed
- Bridge hosts now use atomic lease-backed socket ownership and bounded nonblocking transport, keep Peekaboo.app and the reusable daemon on distinct paths while preserving the healthy app's TCC-backed fallback, preserve lifecycle settings while migrating legacy daemons, prevent MCP from hosting a bridge listener, safely recover stale sockets, and release abandoned client connections instead of wedging. Thanks @Artifact-LV for #184.
- Legacy screen and area capture now fails with a permission or native capture error instead of returning wallpaper-only/redacted pixels from background sessions. Thanks @VishalJ99 for #185.
## [3.4.1] - 2026-06-10
### Fixed
- `peekaboo agent` now resolves saved custom providers, xAI/Grok, Gemini 3.5 Flash, Claude Opus 4.8, and GPT-5.5 model selections before falling back to unavailable built-in defaults. Thanks @udiedrichsen for #182.
## [3.4.0] - 2026-06-07
## [3.3.0] - 2026-06-01
## [3.2.3] - 2026-05-24
## [3.2.2] - 2026-05-22
### Fixed
- `peekaboo agent` now accepts OpenRouter model IDs and can use `OPENROUTER_API_KEY` from env or credentials. Thanks @delort for #155.
## [3.2.1] - 2026-05-18
### Fixed
- `peekaboo click --coords` now treats coordinates as target-window-relative when app/window target flags are supplied, reports resolved target metadata, and requires `--global-coords` for targeted global clicks.
- `peekaboo-mcp` now shuts down cleanly during restart backoff and repairs executable permissions without shelling out through an install path.
- `pnpm run peekaboo:dev` no longer depends on a hardcoded local checkout path.
- `peekaboo agent` now tells models to use the current tool schema instead of stale tool names and arguments. Thanks @vyctorbrzezowski for #139.
- AX element detection now honors traversal budgets and reports truncation when depth, count, or per-node child limits are reached. Thanks @vyctorbrzezowski for #140.
- `peekaboo agent` and MCP clients now have an `inspect_ui` tool for AX-only UI text/control inspection without capturing screenshots. Thanks @vyctorbrzezowski for #141.
- Window-mode capture now falls back to desktop-independent ScreenCaptureKit filters when multi-display setups cannot map a window to an enumerated display. Thanks @lonexreb for #147.
- `peekaboo agent` guidance now routes AX-only observation through `inspect_ui` consistently while keeping screenshot-backed checks on `see`. Thanks @vyctorbrzezowski for #144.
- Custom provider docs, CLI help, and macOS settings now prefer `${VAR}` API key references and shell examples that preserve them literally. Thanks @scotthuang for #142.
- `peekaboo agent` now refreshes desktop context before each model turn and wires opt-in action verification through the configured capture strategy. Thanks @lonexreb for #148.
- AX traversal budgets now have wider defaults plus CLI, MCP, and environment overrides for complex app trees. Thanks @widdowson for #150 and #151.
- `peekaboo agent` now keeps OAuth access tokens on Bearer auth paths instead of misclassifying them as API keys, including config-dir overrides and audio transcription. Thanks @Crux0453 for #154.
## [3.2.0] - 2026-05-15
### Fixed
- Release automation now verifies CLI, npm, macOS app, checksum, appcast, and uploaded GitHub assets before publish.
- `peekaboo type --json` now separates requested text from executed key actions, making escaped special keys such as `\n` visible to agents without losing backwards-compatible `typedText`.
- `peekaboo permissions status --all-sources` now compares Bridge and local TCC permission state side by side, so daemon grants are no longer confused with CLI grants.
- `peekaboo mcp serve --transport ...` now rejects invalid transport names instead of silently starting stdio mode.
- `peekaboo paste --app ...` now fails before mutating the clipboard when the requested app cannot be found.
- `peekaboo agent` no longer sends stale Anthropic extended-thinking options to Claude Opus 4.7 and now exits with failure when agent execution fails.
- Command timeout JSON now reports the intended timeout error instead of occasionally surfacing cancellation as an unknown error.
- Refreshed CLI docs and quickstart examples to use current flags such as `image --path`, `click --coords`, `type --return`, `press --count`, and `scroll --amount`.
### Performance
- Debug CLI startup no longer spawns `git config` on every launch when build-staleness checking is disabled, cutting startup-heavy command latency by more than 30% in local testing.
## [3.1.2] - 2026-05-11
### Fixed
- Release automation now writes artifacts under `build/release` so clean release builds no longer embed `-dirty` in CLI version metadata.
## [3.1.1] - 2026-05-11
### Added
- `peekaboo image --path -` now writes a single captured image to stdout for shell pipelines.
- The npm package now allows Intel Macs when shipping the universal CLI binary.
### Fixed
- Agent tool schemas now preserve MCP `anyOf`/`oneOf` parameters so Gemini no longer rejects `peekaboo agent` requests with orphan `required` entries.
- `peekaboo see --capture-engine cg` now keeps frontmost/window captures on the CoreGraphics path instead of falling through to `SCScreenshotManager`.
## [3.1.0] - 2026-05-10
### Added
- `peekaboo agent --model` now understands GPT-5.5 and Claude Opus 4.7 identifiers, defaults to `gpt-5.5`, and rejects old GPT/Claude model families.
- Automation-oriented CLI commands now auto-start a warm Peekaboo daemon, reuse it across bursty invocations, and let it exit after an idle timeout.
- Bridge protocol 1.5 adds a daemon-side desktop observation operation so screenshot and `see` flows can execute fully in the warm daemon while returning compact metadata.
### Fixed
- MCP stdio servers now default to the local runtime instead of probing an existing Bridge host, avoiding recursive capture timeouts for `see` and `image` tool calls.
- MCP `image` now returns an `isError: true` tool result when Screen Recording permission is missing instead of surfacing an internal server error.
- MCP `analyze` now honors configured AI providers and per-call `provider_config` models instead of hardcoding an OpenAI model.
- Peekaboo.app now signs with the AppleEvents automation entitlement so macOS can prompt for Automation permission.
- The CLI bundle metadata and bundled Homebrew formula now advertise the macOS 15 minimum that the SwiftPM package already requires.
- `peekaboo see --annotate` now aligns labels using captured window bounds instead of guessing from the first detected element.
- Window capture on macOS 26 now resolves native Retina scale from `NSScreen.backingScaleFactor` before falling back to ScreenCaptureKit display ratios.
- `peekaboo image --app ... --window-title/--window-index` now captures the resolved window by stable window ID, avoiding mismatches between listed window indexes and ScreenCaptureKit window ordering.
- `peekaboo image --app ...` now prefers titled app windows over untitled helper windows, avoiding blank Chrome captures.
- `peekaboo image --capture-engine` is now accepted by Commander-based live parsing.
- Concurrent ScreenCaptureKit screenshot requests now queue through an in-process and cross-process capture gate instead of racing into continuation leaks or transient TCC-denied failures.
- Concurrent `peekaboo see` calls now queue the local screenshot/detection pipeline across processes, avoiding ReplayKit/ScreenCaptureKit continuation hangs under parallel usage.
- Natural-language automation examples now use `peekaboo agent "..."`.
### Performance
- `peekaboo see`, `image`, UI interaction, window, menu, dock, dialog, and app commands now prefer the warm on-demand daemon by default, avoiding repeated service startup cost across command bursts.
- `peekaboo tools`, `peekaboo list apps`, `peekaboo app list`, and purely local metadata commands still avoid daemon startup. Pass `--bridge-socket` to target a Bridge host explicitly where supported.
- Daemon-backed screenshot and `see` calls now write screenshot artifacts in the daemon and avoid sending image bytes through Bridge JSON, preventing large-payload timeouts and making warm calls substantially faster.
- Capture engine `auto` now tries the CoreGraphics path before ScreenCaptureKit, which makes repeated screenshot calls faster locally and avoids observed ScreenCaptureKit continuation hangs; explicit `--capture-engine modern` still forces ScreenCaptureKit.
- `peekaboo image --app` avoids redundant application/window-count lookups during screenshot setup and skips auto-focus work when the target app is already frontmost.
- `peekaboo image --app` now uses a CoreGraphics-only window selection fast path before falling back to full AX-enriched window enumeration, reducing warm Playground screenshot capture from about 350ms to 290ms.
- `peekaboo image` skips a redundant CLI-side screen-recording preflight and relies on the capture service's permission check, shaving about 8ms from warm one-shot app screenshots.
- `peekaboo see --app` avoids re-focusing the target window when Accessibility already reports the captured window as focused.
- `peekaboo see` avoids recursive AX child-text lookups for elements whose labels cannot use them, reducing Playground element detection from about 201ms to 134ms in local testing.
- `peekaboo see` batches per-element Accessibility descriptor reads and skips avoidable action/editability probes, reducing local Playground element detection from about 205ms to 176ms.
- `peekaboo see` limits expensive AX action and keyboard-shortcut probes to roles that can use them, reducing Playground element detection from about 286ms to roughly 180-190ms in local testing.
- `peekaboo see` skips a redundant CLI-side screen-recording preflight and relies on the capture service's permission check, shaving a fixed TCC probe from screenshot-plus-AX runs.
- `peekaboo see` now keeps AX traversal scoped to the captured window and skips web-content focus probing once a rich native AX tree is already visible, avoiding sibling-window elements and cutting native Playground detection from about 220ms to 130ms.
- `peekaboo agent --model` now understands the GPT-5.1 identifiers and defaults to `gpt-5.1`, matching the latest OpenAI release while keeping backward-compatible aliases for GPT-5 and GPT-4o inputs.
## [2.0.2] - 2025-07-03

View File

@ -24,23 +24,20 @@ let swiftTestingSettings = cliConcurrencySettings + [
let includeAutomationTests = ProcessInfo.processInfo.environment["PEEKABOO_INCLUDE_AUTOMATION_TESTS"] == "true"
var targets: [Target] = [
.target(
name: "PeekabooCLI",
dependencies: [
.product(name: "Commander", package: "Commander"),
.product(name: "MCP", package: "swift-sdk"),
.product(name: "Spinner", package: "Spinner"),
.product(name: "TauTUI", package: "TauTUI"),
.product(name: "PeekabooCore", package: "PeekabooCore"),
.product(name: "PeekabooBridge", package: "PeekabooCore"),
.product(name: "PeekabooVisualizer", package: "PeekabooVisualizer"),
.product(name: "Tachikoma", package: "Tachikoma"),
.product(name: "TachikomaMCP", package: "Tachikoma"),
.product(name: "Swiftdansi", package: "Swiftdansi"),
],
path: "Sources/PeekabooCLI",
swiftSettings: cliConcurrencySettings),
.executableTarget(
.target(
name: "PeekabooCLI",
dependencies: [
.product(name: "Commander", package: "Commander"),
.product(name: "MCP", package: "swift-sdk"),
.product(name: "Spinner", package: "Spinner"),
.product(name: "TauTUI", package: "TauTUI"),
.product(name: "PeekabooCore", package: "PeekabooCore"),
.product(name: "Tachikoma", package: "Tachikoma"),
.product(name: "TachikomaMCP", package: "Tachikoma"),
],
path: "Sources/PeekabooCLI",
swiftSettings: cliConcurrencySettings),
.executableTarget(
name: "peekaboo",
dependencies: [
"PeekabooCLI",
@ -62,9 +59,6 @@ var targets: [Target] = [
dependencies: [
"PeekabooCLI",
.product(name: "PeekabooFoundation", package: "PeekabooFoundation"),
.product(name: "PeekabooAutomation", package: "PeekabooCore"),
.product(name: "PeekabooAgentRuntime", package: "PeekabooCore"),
.product(name: "PeekabooCore", package: "PeekabooCore"),
],
path: "Tests/CoreCLITests",
swiftSettings: swiftTestingSettings),
@ -104,7 +98,7 @@ if includeAutomationTests {
let package = Package(
name: "peekaboo",
platforms: [
.macOS(.v15),
.macOS(.v14),
],
products: [
.executable(
@ -113,15 +107,13 @@ let package = Package(
],
dependencies: [
.package(path: "../../Commander"),
.package(url: "https://github.com/modelcontextprotocol/swift-sdk.git", "0.12.0" ..< "0.13.0"),
.package(url: "https://github.com/modelcontextprotocol/swift-sdk.git", from: "0.10.2"),
.package(url: "https://github.com/dominicegginton/Spinner", from: "2.1.0"),
.package(url: "https://github.com/swiftlang/swift-subprocess.git", from: "0.2.1"),
.package(path: "../../TauTUI"),
.package(path: "../../Core/PeekabooFoundation"),
.package(path: "../../Core/PeekabooVisualizer"),
.package(path: "../../Core/PeekabooCore"),
.package(path: "../../Tachikoma"),
.package(path: "../../Swiftdansi"),
],
targets: targets,
swiftLanguageModes: [.v6])

View File

@ -10,20 +10,20 @@ struct CommanderCommandDescriptor {
let subcommands: [CommanderCommandDescriptor]
}
struct CommanderCommandSummary: Codable {
struct Argument: Codable {
struct CommanderCommandSummary: Codable, Sendable {
struct Argument: Codable, Sendable {
let label: String
let help: String?
let isOptional: Bool
}
struct Option: Codable {
struct Option: Codable, Sendable {
let names: [String]
let help: String?
let parsing: String
}
struct Flag: Codable {
struct Flag: Codable, Sendable {
let names: [String]
let help: String?
}
@ -70,12 +70,12 @@ enum CommanderRegistryBuilder {
return lookup
}
static func buildDescriptor(for type: any ParsableCommand.Type) -> CommanderCommandDescriptor {
private static func buildDescriptor(for type: any ParsableCommand.Type) -> CommanderCommandDescriptor {
let description = type.commandDescription
let commandInstance = type.init()
let signature = self.resolveSignature(for: type, instance: commandInstance)
.flattened()
.withPeekabooRuntimeFlags()
.withStandardRuntimeFlags()
let childDescriptors = description.subcommands.map { self.buildDescriptor(for: $0) }
let defaultName = description.defaultSubcommand.map { self.commandName(for: $0) }
let metadata = CommandDescriptor(

View File

@ -1,34 +1,11 @@
import Commander
import Foundation
import PeekabooAutomationKit
/// Commands or runtime contexts that can specify a preferred capture engine.
protocol CaptureEngineConfigurable: AnyObject {
var captureEngine: String? { get }
}
enum CommanderRuntimeExecutorMessage {
static let snapshotInvalidationWarning =
"Warning: The requested action succeeded, but stale UI snapshots could not be invalidated after retry. " +
"Do not retry the action."
}
enum CommanderRuntimeExecutorError: LocalizedError {
case snapshotCatchUpFailed(any Error)
case mutationBarrierFailed(any Error)
var errorDescription: String? {
switch self {
case let .snapshotCatchUpFailed(error):
"Could not synchronize the selected host's UI snapshot watermark before execution: " +
"the requested command was not executed, so retrying later is safe. " + error.localizedDescription
case let .mutationBarrierFailed(error):
"Could not establish the desktop mutation barrier before execution: " +
"the requested command was not executed, so retrying later is safe. " + error.localizedDescription
}
}
}
@MainActor
enum CommanderRuntimeExecutor {
static func resolveAndRun(arguments: [String]) async throws {
@ -43,214 +20,18 @@ enum CommanderRuntimeExecutor {
)
if var runtimeCommand = command as? any AsyncRuntimeCommand {
let runtimeOptions = try CommanderCLIBinder.makeRuntimeOptions(
from: resolved.parsedValues,
commandType: resolved.type
)
let runtimeOptions = try CommanderCLIBinder.makeRuntimeOptions(from: resolved.parsedValues)
if let capturePreference = runtimeOptions.captureEnginePreference,
!capturePreference.isEmpty {
// Respect explicit engine choice; also allow disabling CG globally.
setenv("PEEKABOO_CAPTURE_ENGINE", capturePreference, 1)
}
let runtime = await CommandRuntime.makeDefaultAsync(options: runtimeOptions)
try await self.catchUpSelectedHostIfNeeded(
using: runtime,
required: runtimeOptions.requiresImplicitSnapshotInvalidation ||
runtimeOptions.usesPerToolSnapshotInvalidation
)
try await DeferredCommandOutput.run(
bufferingOutput: runtimeOptions.requiresImplicitSnapshotInvalidation
) {
try await self.runWithImplicitSnapshotInvalidation(
using: runtime,
required: runtimeOptions.requiresImplicitSnapshotInvalidation,
requiresCallerBarrier: runtimeOptions.requiresCallerDesktopMutationBarrier
) {
try await runtimeCommand.run(using: runtime)
}
}
let runtime = CommandRuntime.makeDefault(options: runtimeOptions)
try await runtimeCommand.run(using: runtime)
return
}
var plainCommand = command
try await plainCommand.run()
}
static func catchUpSelectedHostIfNeeded(
using runtime: CommandRuntime,
required: Bool
) async throws {
guard required else { return }
try Task.checkCancellation()
let cutoff = runtime.services.snapshots.effectiveImplicitLatestInvalidationWatermark
try Task.checkCancellation()
guard let cutoff else { return }
do {
_ = try await runtime.services.snapshots.invalidateImplicitLatestSnapshot(
through: cutoff,
preserving: nil,
preservedAt: nil
)
try Task.checkCancellation()
} catch let error as CancellationError {
throw error
} catch {
throw CommanderRuntimeExecutorError.snapshotCatchUpFailed(error)
}
}
static func runWithImplicitSnapshotInvalidation<T>(
using runtime: CommandRuntime,
required: Bool,
requiresCallerBarrier: Bool = false,
operation: () async throws -> T
) async throws -> T {
let mutationSequenceAtStart = runtime.interactionMutationTracker.mutationSequence
let needsCallerBarrier = required &&
(runtime.selectedRemoteSocketPath == nil || requiresCallerBarrier)
let createdDurableMutation: Bool
if needsCallerBarrier {
do {
createdDurableMutation = try runtime.interactionMutationTracker.beginDurableMutation()
} catch {
throw CommanderRuntimeExecutorError.mutationBarrierFailed(error)
}
} else {
createdDurableMutation = false
}
let result: T
do {
result = try await runtime.interactionMutationTracker.withPendingDurableMutationVisible(
createdByCurrentCommand: createdDurableMutation,
operation: operation
)
try Task.checkCancellation()
} catch {
_ = await self.invalidateSnapshotsAfterCommandIfNeeded(
using: runtime,
required: required,
succeeded: false,
mutationSequenceAtStart: mutationSequenceAtStart,
createdDurableMutation: createdDurableMutation
)
throw error
}
let hadPendingMutation = required && runtime.interactionMutationTracker.mutationStartedAt != nil
let invalidated = await invalidateSnapshotsAfterCommandIfNeeded(
using: runtime,
required: required,
succeeded: true,
mutationSequenceAtStart: mutationSequenceAtStart,
createdDurableMutation: createdDurableMutation
)
do {
try Task.checkCancellation()
} catch {
if hadPendingMutation {
_ = await self.invalidateSnapshots(
using: runtime,
reason: "command cancellation",
through: Date(),
preserving: nil,
preservedAt: nil
)
}
throw error
}
if !invalidated {
fputs("\(CommanderRuntimeExecutorMessage.snapshotInvalidationWarning)\n", stderr)
}
return result
}
private static func invalidateSnapshotsAfterCommandIfNeeded(
using runtime: CommandRuntime,
required: Bool,
succeeded: Bool,
mutationSequenceAtStart: UInt64,
createdDurableMutation: Bool
) async -> Bool {
let completion = Date()
guard required else { return true }
guard runtime.interactionMutationTracker.mutationStartedAt != nil else {
guard createdDurableMutation else {
return !runtime.interactionMutationTracker.hasPendingDurableMutation
}
do {
try runtime.interactionMutationTracker.cancelDurableMutation()
return true
} catch {
return false
}
}
guard let requestedCutoff = runtime.interactionMutationTracker.invalidationCutoff(
commandCompletedAt: completion,
succeeded: succeeded
)
else { return true }
let durableCompletion: DesktopMutationWatermarkStore.MutationCompletion?
do {
if createdDurableMutation,
runtime.interactionMutationTracker.mutationSequence == mutationSequenceAtStart {
try runtime.interactionMutationTracker.cancelDurableMutation()
durableCompletion = nil
} else {
durableCompletion = try runtime.interactionMutationTracker.completeDurableMutation(
through: succeeded ? requestedCutoff : completion
)
}
} catch {
runtime.interactionMutationTracker.markInvalidationFailed(through: completion)
return false
}
let cutoff = max(requestedCutoff, durableCompletion?.cutoff ?? requestedCutoff)
let preservationAllowed = durableCompletion?.allowsObservationPreservation ?? true
let preservedSnapshotID = succeeded && preservationAllowed
? runtime.interactionMutationTracker.preservedSnapshotID
: nil
let preservedAt = preservedSnapshotID == nil
? nil
: runtime.interactionMutationTracker.preservedAt
return await self.invalidateSnapshots(
using: runtime,
reason: "command execution",
through: cutoff,
preserving: preservedSnapshotID,
preservedAt: preservedAt
)
}
private static func invalidateSnapshots(
using runtime: CommandRuntime,
reason: String,
through cutoff: Date,
preserving preservedSnapshotID: String?,
preservedAt: Date?
) async -> Bool {
let targets = runtime.interactionMutationTargets
let isRetry = runtime.interactionMutationTracker.hasFailedInvalidationAttempt
let invalidated = await InteractionObservationInvalidator.invalidateAfterMutation(
targets: targets,
logger: runtime.logger,
reason: reason,
through: cutoff,
preserving: preservedSnapshotID,
preservedAt: preservedAt
)
if invalidated {
return true
}
if isRetry {
return false
}
return await InteractionObservationInvalidator.invalidateAfterMutation(
targets: targets,
logger: runtime.logger,
reason: "\(reason) retry",
through: cutoff,
preserving: preservedSnapshotID,
preservedAt: preservedAt
)
}
}

View File

@ -1,196 +0,0 @@
import Commander
import Foundation
extension CommanderRuntimeRouter {
static let categoryLookup: [ObjectIdentifier: CommandRegistryEntry.Category] = {
var lookup: [ObjectIdentifier: CommandRegistryEntry.Category] = [:]
for entry in CommandRegistry.entries {
lookup[ObjectIdentifier(entry.type)] = entry.category
}
return lookup
}()
static func makeHelpTheme() -> HelpTheme {
let capabilities = TerminalDetector.detectCapabilities()
if let forcedMode = TerminalDetector.shouldForceOutputMode() {
return HelpTheme(useColors: forcedMode.supportsColors)
}
return HelpTheme(useColors: capabilities.supportsColors)
}
static func renderRootUsageCard(theme: HelpTheme) -> String {
var lines: [String] = []
lines.append(theme.heading("Usage"))
lines.append(" \(theme.accent("peekaboo <command> [options]"))")
lines.append("")
lines.append(theme.heading("Tip"))
lines.append(" When developing locally, run via \(theme.accent("polter peekaboo")) to ensure fresh builds.")
return lines.joined(separator: "\n")
}
static func renderUsageCard(
for descriptor: CommanderCommandDescriptor,
path: [String],
theme: HelpTheme
) -> String {
let usageLine = self.buildUsageLine(path: path, signature: descriptor.metadata.signature)
var lines: [String] = []
lines.append(theme.heading("Usage"))
lines.append(" \(theme.accent(usageLine))")
let abstract = descriptor.metadata.abstract.trimmingCharacters(in: .whitespacesAndNewlines)
if !abstract.isEmpty {
lines.append("")
lines.append(theme.heading("Summary"))
lines.append(" \(abstract)")
}
lines.append("")
lines.append(theme.heading("Tip"))
lines.append(" When developing locally, run via \(theme.accent("polter peekaboo")) to ensure fresh builds.")
return lines.joined(separator: "\n")
}
static func globalFlagSummaries(theme: HelpTheme) -> [String] {
[
theme.bullet(label: "--json/-j (alias: --json-output)", description: "Emit machine-readable JSON output"),
theme.bullet(label: "--verbose/-v", description: "Enable verbose logging"),
theme.bullet(
label: "--log-level <level>",
description: "trace | verbose | debug | info | warning | error | critical"
),
theme.bullet(
label: "--no-remote",
description: "Force local services; skip remote bridge hosts even if available"
),
theme.bullet(
label: "--bridge-socket <path>",
description: "Override the Peekaboo Bridge socket path"
),
theme.bullet(
label: "--input-strategy <mode>",
description: "Override UI input strategy: actionFirst | synthFirst | actionOnly | synthOnly"
)
]
}
static func renderGlobalFlagsSection(theme: HelpTheme) -> String {
var lines: [String] = []
lines.append(theme.heading("Global Runtime Flags"))
for entry in self.globalFlagSummaries(theme: theme) {
lines.append(" \(entry)")
}
return lines.joined(separator: "\n")
}
static func renderCommandList(
for commands: [CommanderCommandDescriptor],
theme: HelpTheme,
indent: String = " "
) -> [String] {
let sorted = commands.sorted { $0.metadata.name < $1.metadata.name }
let maxNameLength = sorted.map(\.metadata.name.count).max() ?? 0
let columnWidth = min(max(maxNameLength, 8), 24)
return sorted.map { descriptor in
let name = descriptor.metadata.name
let summary = descriptor.metadata.abstract.isEmpty ? "No description provided." : descriptor.metadata
.abstract
let paddedName: String = if name.count >= columnWidth {
name
} else {
name + String(repeating: " ", count: columnWidth - name.count)
}
let displayName = theme.command(paddedName)
return "\(indent)\(displayName) \(summary)"
}
}
static func buildUsageLine(path: [String], signature: CommandSignature) -> String {
var tokens = ["peekaboo"]
let commandPath = path.isEmpty ? ["<command>"] : path
tokens.append(contentsOf: commandPath)
for argument in signature.arguments {
let placeholder = self.argumentPlaceholder(for: argument)
tokens.append(argument.isOptional ? "[\(placeholder)]" : "<\(placeholder)>")
}
if !signature.options.isEmpty || !signature.flags.isEmpty {
tokens.append("[options]")
}
return tokens.joined(separator: " ")
}
static func argumentPlaceholder(for argument: ArgumentDefinition) -> String {
let lowered = argument.label.replacingOccurrences(of: "_", with: "-")
return Self.kebabCased(lowered)
}
static func kebabCased(_ value: String) -> String {
guard !value.isEmpty else { return value }
var scalars: [Character] = []
for character in value {
if character.isUppercase {
if !scalars.isEmpty && scalars.last != "-" {
scalars.append("-")
}
scalars.append(contentsOf: character.lowercased())
} else if character == " " || character == "-" {
if scalars.last != "-" { scalars.append("-") }
} else {
scalars.append(character)
}
}
return String(scalars)
}
}
struct HelpTheme {
let useColors: Bool
func heading(_ text: String) -> String {
guard self.useColors else { return text }
return "\(TerminalColor.bold)\(TerminalColor.cyan)\(text)\(TerminalColor.reset)"
}
func accent(_ text: String) -> String {
guard self.useColors else { return text }
return "\(TerminalColor.magenta)\(text)\(TerminalColor.reset)"
}
func command(_ text: String) -> String {
guard self.useColors else { return text }
return "\(TerminalColor.bold)\(text)\(TerminalColor.reset)"
}
func dim(_ text: String) -> String {
guard self.useColors else { return text }
return "\(TerminalColor.gray)\(text)\(TerminalColor.reset)"
}
func bullet(label: String, description: String) -> String {
let prefix = self.useColors ? "\(TerminalColor.gray)\(TerminalColor.reset)" : "-"
let labelText = self.useColors ? "\(TerminalColor.bold)\(label)\(TerminalColor.reset)" : label
return "\(prefix) \(labelText) \(description)"
}
}
extension CommandRegistryEntry.Category {
var displayName: String {
switch self {
case .core:
"Core Commands"
case .interaction:
"Interaction"
case .system:
"System"
case .vision:
"Vision"
case .ai:
"AI"
case .mcp:
"MCP"
}
}
}

View File

@ -25,9 +25,6 @@ enum CommanderRuntimeRouter {
if try Self.handleHelpRequest(arguments: trimmedArgs, descriptors: descriptors) {
throw ExitCode.success
}
if let alias = try Self.resolveAgentPermissionAlias(arguments: trimmedArgs, originalArgv: argv) {
return alias
}
let program = Program(descriptors: descriptors.map(\.metadata))
let invocation = try program.resolve(argv: argv)
guard let descriptor = Self.findDescriptor(in: descriptors, matching: invocation.path) else {
@ -71,22 +68,13 @@ enum CommanderRuntimeRouter {
guard !arguments.isEmpty else { return false }
if arguments[0].caseInsensitiveCompare("help") == .orderedSame {
let tokens = Array(arguments.dropFirst())
if self.handleAgentPermissionHelp(tokens: tokens) {
return true
}
let path = self.resolveHelpPath(from: tokens, descriptors: descriptors)
let path = Array(arguments.dropFirst())
try self.printHelp(for: path, descriptors: descriptors)
return true
}
let helpSearchArguments = Array(arguments.prefix { $0 != "--" })
if let index = helpSearchArguments.firstIndex(where: { self.isHelpToken($0) }) {
let tokens = Array(helpSearchArguments.prefix(index))
if self.handleAgentPermissionHelp(tokens: tokens) {
return true
}
let path = self.resolveHelpPath(from: tokens, descriptors: descriptors)
if let index = arguments.firstIndex(where: { self.isHelpToken($0) }) {
let path = Array(arguments.prefix(index))
try self.printHelp(for: path, descriptors: descriptors)
return true
}
@ -94,64 +82,6 @@ enum CommanderRuntimeRouter {
return false
}
private static func handleAgentPermissionHelp(tokens: [String]) -> Bool {
guard tokens.count >= 2,
tokens[0].caseInsensitiveCompare("agent") == .orderedSame,
tokens[1].caseInsensitiveCompare("permission") == .orderedSame else {
return false
}
let rootDescriptor = CommanderRegistryBuilder.buildDescriptor(for: PermissionCommand.self)
let permissionPath = ["permission"] + tokens.dropFirst(2)
guard let descriptor = self.findDescriptor(in: [rootDescriptor], matching: permissionPath) else {
return false
}
self.printCommandHelp(descriptor, path: ["agent"] + permissionPath)
return true
}
private static func resolveAgentPermissionAlias(
arguments: [String],
originalArgv: [String]
) throws -> CommanderResolvedCommand? {
guard arguments.count >= 2,
arguments[0].caseInsensitiveCompare("agent") == .orderedSame,
arguments[1].caseInsensitiveCompare("permission") == .orderedSame else {
return nil
}
let rootDescriptor = CommanderRegistryBuilder.buildDescriptor(for: PermissionCommand.self)
let executable = originalArgv.first ?? "peekaboo"
let aliasArgv = [executable, "permission"] + arguments.dropFirst(2)
let program = Program(descriptors: [rootDescriptor.metadata])
let invocation = try program.resolve(argv: Array(aliasArgv))
guard let descriptor = self.findDescriptor(in: [rootDescriptor], matching: invocation.path) else {
throw CommanderProgramError.unknownCommand(invocation.path.joined(separator: ":"))
}
return CommanderResolvedCommand(
metadata: descriptor.metadata,
type: descriptor.type,
parsedValues: invocation.parsedValues
)
}
private static func resolveHelpPath(
from tokens: [String],
descriptors: [CommanderCommandDescriptor]
) -> [String] {
guard !tokens.isEmpty else { return [] }
for length in stride(from: tokens.count, through: 1, by: -1) {
let candidate = Array(tokens.prefix(length))
if self.findDescriptor(in: descriptors, matching: candidate) != nil {
return candidate
}
}
// Preserve previous behavior for unknown paths: let printHelp throw with the original tokens.
return tokens
}
private static func handleVersionRequest(arguments: [String]) -> Bool {
guard let first = arguments.first else { return false }
guard self.isVersionToken(first) else { return false }
@ -239,3 +169,187 @@ enum CommanderRuntimeRouter {
}
}
}
// MARK: - Usage Card + Theming
extension CommanderRuntimeRouter {
private static let categoryLookup: [ObjectIdentifier: CommandRegistryEntry.Category] = {
var lookup: [ObjectIdentifier: CommandRegistryEntry.Category] = [:]
for entry in CommandRegistry.entries {
lookup[ObjectIdentifier(entry.type)] = entry.category
}
return lookup
}()
private static func makeHelpTheme() -> HelpTheme {
let capabilities = TerminalDetector.detectCapabilities()
if let forcedMode = TerminalDetector.shouldForceOutputMode() {
return HelpTheme(useColors: forcedMode.supportsColors)
}
return HelpTheme(useColors: capabilities.supportsColors)
}
private static func renderRootUsageCard(theme: HelpTheme) -> String {
var lines: [String] = []
lines.append(theme.heading("Usage"))
lines.append(" \(theme.accent("polter peekaboo <command> [options]"))")
lines.append("")
lines.append(theme.heading("Tip"))
lines.append(" Run via \(theme.accent("polter peekaboo")) to ensure fresh builds.")
return lines.joined(separator: "\n")
}
private static func renderUsageCard(
for descriptor: CommanderCommandDescriptor,
path: [String],
theme: HelpTheme
) -> String {
let usageLine = self.buildUsageLine(path: path, signature: descriptor.metadata.signature)
var lines: [String] = []
lines.append(theme.heading("Usage"))
lines.append(" \(theme.accent(usageLine))")
let abstract = descriptor.metadata.abstract.trimmingCharacters(in: .whitespacesAndNewlines)
if !abstract.isEmpty {
lines.append("")
lines.append(theme.heading("Summary"))
lines.append(" \(abstract)")
}
lines.append("")
lines.append(theme.heading("Tip"))
lines.append(" Run via \(theme.accent("polter peekaboo")) to ensure fresh builds.")
return lines.joined(separator: "\n")
}
private static func globalFlagSummaries(theme: HelpTheme) -> [String] {
[
theme.bullet(label: "--json/-j", description: "Emit machine-readable JSON output"),
theme.bullet(label: "--verbose/-v", description: "Enable verbose logging"),
theme.bullet(
label: "--log-level <level>",
description: "trace | verbose | debug | info | warning | error | critical"
)
]
}
private static func renderGlobalFlagsSection(theme: HelpTheme) -> String {
var lines: [String] = []
lines.append(theme.heading("Global Runtime Flags"))
for entry in self.globalFlagSummaries(theme: theme) {
lines.append(" \(entry)")
}
return lines.joined(separator: "\n")
}
private static func renderCommandList(
for commands: [CommanderCommandDescriptor],
theme: HelpTheme,
indent: String = " "
) -> [String] {
let sorted = commands.sorted { $0.metadata.name < $1.metadata.name }
let maxNameLength = sorted.map(\.metadata.name.count).max() ?? 0
let columnWidth = min(max(maxNameLength, 8), 24)
return sorted.map { descriptor in
let name = descriptor.metadata.name
let summary = descriptor.metadata.abstract.isEmpty ? "No description provided." : descriptor.metadata
.abstract
let paddedName: String = if name.count >= columnWidth {
name
} else {
name + String(repeating: " ", count: columnWidth - name.count)
}
let displayName = theme.command(paddedName)
return "\(indent)\(displayName) \(summary)"
}
}
private static func buildUsageLine(path: [String], signature: CommandSignature) -> String {
var tokens = ["polter", "peekaboo"]
let commandPath = path.isEmpty ? ["<command>"] : path
tokens.append(contentsOf: commandPath)
for argument in signature.arguments {
let placeholder = self.argumentPlaceholder(for: argument)
tokens.append(argument.isOptional ? "[\(placeholder)]" : "<\(placeholder)>")
}
if !signature.options.isEmpty || !signature.flags.isEmpty {
tokens.append("[options]")
}
return tokens.joined(separator: " ")
}
private static func argumentPlaceholder(for argument: ArgumentDefinition) -> String {
let lowered = argument.label.replacingOccurrences(of: "_", with: "-")
return Self.kebabCased(lowered)
}
private static func kebabCased(_ value: String) -> String {
guard !value.isEmpty else { return value }
var scalars: [Character] = []
for character in value {
if character.isUppercase {
if !scalars.isEmpty && scalars.last != "-" {
scalars.append("-")
}
scalars.append(contentsOf: character.lowercased())
} else if character == " " || character == "-" {
if scalars.last != "-" { scalars.append("-") }
} else {
scalars.append(character)
}
}
return String(scalars)
}
}
struct HelpTheme {
let useColors: Bool
func heading(_ text: String) -> String {
guard self.useColors else { return text }
return "\(TerminalColor.bold)\(TerminalColor.cyan)\(text)\(TerminalColor.reset)"
}
func accent(_ text: String) -> String {
guard self.useColors else { return text }
return "\(TerminalColor.magenta)\(text)\(TerminalColor.reset)"
}
func command(_ text: String) -> String {
guard self.useColors else { return text }
return "\(TerminalColor.bold)\(text)\(TerminalColor.reset)"
}
func dim(_ text: String) -> String {
guard self.useColors else { return text }
return "\(TerminalColor.gray)\(text)\(TerminalColor.reset)"
}
func bullet(label: String, description: String) -> String {
let prefix = self.useColors ? "\(TerminalColor.gray)\(TerminalColor.reset)" : "-"
let labelText = self.useColors ? "\(TerminalColor.bold)\(label)\(TerminalColor.reset)" : label
return "\(prefix) \(labelText) \(description)"
}
}
extension CommandRegistryEntry.Category {
fileprivate var displayName: String {
switch self {
case .core:
"Core Commands"
case .interaction:
"Interaction"
case .system:
"System"
case .vision:
"Vision"
case .ai:
"AI"
case .mcp:
"MCP"
}
}
}

View File

@ -1,226 +0,0 @@
import Foundation
/// Renders a self-contained bash completion script that queries shared
/// completion tables emitted from Swift metadata.
struct BashCompletionRenderer: ShellCompletionRendering {
func render(document: CompletionScriptDocument) -> String {
let lines = self.commonHeader(
shell: "bash",
install: CompletionsCommand.Shell.bash.installationSnippet
) + [
"__peekaboo_bash_subcommands() {",
self.renderBashChoiceSwitch(document.pathsIncludingRoot, accessor: \.subcommands),
"}",
"",
"__peekaboo_bash_options() {",
self.renderBashOptionSwitch(document: document),
"}",
"",
"__peekaboo_bash_argument_values() {",
self.renderBashArgumentSwitch(document.pathsIncludingRoot),
"}",
"",
"__peekaboo_bash_option_values() {",
self.renderBashOptionValueSwitch(document.pathsIncludingRoot),
"}",
"",
"__peekaboo_bash_has_subcommand() {",
" local path=\"$1\"",
" local candidate=\"$2\"",
" while IFS=$'\\t' read -r value _; do",
" [[ \"$value\" == \"$candidate\" ]] && return 0",
" done < <(__peekaboo_bash_subcommands \"$path\")",
" return 1",
"}",
"",
"__peekaboo_bash_complete() {",
" local cur=\"${COMP_WORDS[COMP_CWORD]}\"",
" local path=\"\"",
" local index=1",
" local previous=\"\"",
" local token",
" COMPREPLY=()",
"",
" while (( index < COMP_CWORD )); do",
" token=\"${COMP_WORDS[index]}\"",
" [[ \"$token\" == -* ]] && break",
" if __peekaboo_bash_has_subcommand \"$path\" \"$token\"; then",
" path=\"${path:+$path }$token\"",
" (( index++ ))",
" else",
" break",
" fi",
" done",
"",
" if (( COMP_CWORD > 0 )); then",
" previous=\"${COMP_WORDS[COMP_CWORD - 1]}\"",
" fi",
"",
" local option_values",
" option_values=\"$(__peekaboo_bash_option_values \"$path\" \"$previous\" | cut -f1 | tr '\\n' ' ')\"",
" if [[ -n \"$option_values\" ]]; then",
" COMPREPLY=($(compgen -W \"$option_values\" -- \"$cur\"))",
" return",
" fi",
"",
" if [[ \"$cur\" == -* ]]; then",
" local option_names",
" option_names=\"$(__peekaboo_bash_options \"$path\" | cut -f1 | tr '\\n' ' ')\"",
" COMPREPLY=($(compgen -W \"$option_names\" -- \"$cur\"))",
" return",
" fi",
"",
" local subcommands",
" subcommands=\"$(__peekaboo_bash_subcommands \"$path\" | cut -f1 | tr '\\n' ' ')\"",
" if [[ -n \"$subcommands\" ]]; then",
" COMPREPLY=($(compgen -W \"$subcommands\" -- \"$cur\"))",
" return",
" fi",
"",
" local argument_index=$(( COMP_CWORD - index ))",
" local values",
" values=\"$(__peekaboo_bash_argument_values \"$path\" \"$argument_index\" | cut -f1 | tr '\\n' ' ')\"",
" if [[ -n \"$values\" ]]; then",
" COMPREPLY=($(compgen -W \"$values\" -- \"$cur\"))",
" fi",
"}",
"",
"complete -F __peekaboo_bash_complete \(document.commandName)",
]
return lines.joined(separator: "\n")
}
private func renderBashChoiceSwitch(
_ paths: [CompletionPath],
accessor: KeyPath<CompletionPath, [CompletionChoice]>
) -> String {
var lines = [" case \"$1\" in"]
lines.append(contentsOf: self.renderCases(paths: paths) { path in
path[keyPath: accessor].map { choice in
self.tabSeparated(choice.value, choice.help)
}
})
lines.append(contentsOf: [
" *)",
" ;;",
" esac",
])
return lines.joined(separator: "\n")
}
private func renderBashOptionSwitch(document: CompletionScriptDocument) -> String {
var lines = [" case \"$1\" in", " '')"]
lines.append(contentsOf: self.heredocLines(items: document.rootOptions.map { option in
option.names.map { name in
self.tabSeparated(name, option.help)
}
}.flatMap(\.self), indent: " "))
lines.append(" ;;")
lines.append(contentsOf: self.renderCases(paths: document.flattenedPaths) { path in
path.options.flatMap { option in
option.names.map { name in
self.tabSeparated(name, option.help)
}
}
})
lines.append(contentsOf: [
" *)",
" ;;",
" esac",
])
return lines.joined(separator: "\n")
}
private func renderBashArgumentSwitch(_ paths: [CompletionPath]) -> String {
var lines = [" case \"$1:$2\" in"]
for path in paths {
for (index, argument) in path.arguments.enumerated() where !argument.choices.isEmpty {
lines.append(" '\(self.caseLabel(path.key)):\(index)')")
lines.append(contentsOf: self.heredocLines(items: argument.choices.map {
self.tabSeparated($0.value, $0.help)
}, indent: " "))
lines.append(" ;;")
}
}
lines.append(contentsOf: [
" *)",
" ;;",
" esac",
])
return lines.joined(separator: "\n")
}
private func renderBashOptionValueSwitch(_ paths: [CompletionPath]) -> String {
var lines = [" case \"$1:$2\" in"]
for path in paths {
for option in path.options where !option.valueChoices.isEmpty {
for name in option.names {
lines.append(" '\(self.caseLabel(path.key)):\(self.caseLabel(name))')")
lines.append(contentsOf: self.heredocLines(items: option.valueChoices.map {
self.tabSeparated($0.value, $0.help)
}, indent: " "))
lines.append(" ;;")
}
}
}
lines.append(contentsOf: [
" *)",
" ;;",
" esac",
])
return lines.joined(separator: "\n")
}
private func renderCases(
paths: [CompletionPath],
content: (CompletionPath) -> [String]
) -> [String] {
paths.map { path in
let items = content(path)
if items.isEmpty {
return [
" '\(self.caseLabel(path.key))')",
" ;;",
]
}
return [
" '\(self.caseLabel(path.key))')",
] + self.heredocLines(items: items, indent: " ") + [
" ;;",
]
}.flatMap(\.self)
}
private func heredocLines(items: [String], indent: String) -> [String] {
guard !items.isEmpty else { return [] }
return [
"\(indent)cat <<'EOF'",
] + items + [
"EOF",
]
}
private func tabSeparated(_ value: String, _ help: String?) -> String {
let tab = "\t"
let description = (help ?? "").replacingOccurrences(of: "\t", with: " ").replacingOccurrences(
of: "\n",
with: " "
)
return "\(value)\(tab)\(description)"
}
private func caseLabel(_ label: String) -> String {
label.replacingOccurrences(of: "'", with: "'\\''")
}
private func commonHeader(shell: String, install: String) -> [String] {
[
"# \(shell.capitalized) completion for peekaboo",
"# Generated from Commander descriptors via `peekaboo completions \(shell)`.",
"# Install with:",
"# \(install)",
"",
]
}
}

View File

@ -1,286 +0,0 @@
import Commander
import Foundation
/// Shell-completion document rendered from Commander metadata.
///
/// `CompletionScriptDocument` is the single source of truth for completion
/// generation. It is derived from `CommanderCommandDescriptor` values, which are
/// already the canonical source for help output and command discovery.
struct CompletionScriptDocument {
let commandName: String
let commands: [CompletionCommand]
let rootOptions: [CompletionOption]
var topLevelChoices: [CompletionChoice] {
self.commands.map { command in
CompletionChoice(value: command.name, help: command.abstract)
}
}
var flattenedPaths: [CompletionPath] {
self.commands.flatMap { command in
command.flattenedPaths(prefix: [])
}
}
var pathsIncludingRoot: [CompletionPath] {
[
CompletionPath(
path: [],
subcommands: self.topLevelChoices,
options: self.rootOptions,
arguments: []
),
] + self.flattenedPaths
}
static func make(
commandName: String = "peekaboo",
descriptors: [CommanderCommandDescriptor]
) -> CompletionScriptDocument {
let commands = descriptors
.sorted { $0.metadata.name < $1.metadata.name }
.map { CompletionCommand(descriptor: $0, path: [$0.metadata.name]) }
let helpMirror = CompletionCommand.helpMirror(commands: commands)
return CompletionScriptDocument(
commandName: commandName,
commands: [helpMirror] + commands,
rootOptions: [
.flag(names: ["-h", "--help"], help: "Show help information"),
.flag(names: ["-V", "--version"], help: "Show version information"),
]
)
}
}
struct CompletionCommand {
let name: String
let abstract: String
let arguments: [CompletionArgument]
let options: [CompletionOption]
let subcommands: [CompletionCommand]
var subcommandChoices: [CompletionChoice] {
self.subcommands.map { command in
CompletionChoice(value: command.name, help: command.abstract)
}
}
init(descriptor: CommanderCommandDescriptor, path: [String]) {
self.name = descriptor.metadata.name
self.abstract = descriptor.metadata.abstract
self.arguments = descriptor.metadata.signature.arguments.enumerated().map { index, argument in
CompletionArgument(
label: argument.label,
isOptional: argument.isOptional,
choices: CompletionValueCatalog.argumentChoices(for: path, index: index, label: argument.label)
)
}
self.options = Self.makeOptions(from: descriptor.metadata.signature, path: path)
self.subcommands = descriptor.subcommands
.sorted { $0.metadata.name < $1.metadata.name }
.map { subcommand in
CompletionCommand(descriptor: subcommand, path: path + [subcommand.metadata.name])
}
}
private init(
name: String,
abstract: String,
arguments: [CompletionArgument],
options: [CompletionOption],
subcommands: [CompletionCommand]
) {
self.name = name
self.abstract = abstract
self.arguments = arguments
self.options = options
self.subcommands = subcommands
}
func flattenedPaths(prefix: [String]) -> [CompletionPath] {
let path = prefix + [self.name]
let current = CompletionPath(
path: path,
subcommands: self.subcommandChoices,
options: self.options,
arguments: self.arguments
)
return [current] + self.subcommands.flatMap { subcommand in
subcommand.flattenedPaths(prefix: path)
}
}
static func helpMirror(commands: [CompletionCommand]) -> CompletionCommand {
CompletionCommand(
name: "help",
abstract: "Show help for commands",
arguments: [],
options: [],
subcommands: commands.map { command in
CompletionCommand(
name: command.name,
abstract: command.abstract,
arguments: [],
options: [],
subcommands: self.helpSubcommands(from: command.subcommands)
)
}
)
}
private static func helpSubcommands(from commands: [CompletionCommand]) -> [CompletionCommand] {
commands.map { command in
CompletionCommand(
name: command.name,
abstract: command.abstract,
arguments: [],
options: [],
subcommands: self.helpSubcommands(from: command.subcommands)
)
}
}
private static func makeOptions(from signature: CommandSignature, path: [String]) -> [CompletionOption] {
let flags = signature.flags.map { flag in
CompletionOption.flag(
names: self.uniqueNames(flag.names.map(\.completionSpelling)),
help: flag.help ?? "No description provided"
)
}
let options = signature.options.map { option in
let names = self.uniqueNames(option.names.map(\.completionSpelling))
return CompletionOption.option(
names: names,
valueName: option.label,
help: option.help ?? "No description provided",
valueChoices: CompletionValueCatalog.optionChoices(
for: path,
label: option.label,
names: names
)
)
}
return flags + options + [
.flag(names: ["-h", "--help"], help: "Show help information"),
]
}
private static func uniqueNames(_ names: [String]) -> [String] {
var seen: Set<String> = []
var ordered: [String] = []
for name in names where !seen.contains(name) {
seen.insert(name)
ordered.append(name)
}
return ordered
}
}
struct CompletionPath {
let path: [String]
let subcommands: [CompletionChoice]
let options: [CompletionOption]
let arguments: [CompletionArgument]
var key: String {
self.path.joined(separator: " ")
}
}
struct CompletionArgument {
let label: String
let isOptional: Bool
let choices: [CompletionChoice]
}
struct CompletionOption {
let names: [String]
let help: String
let valueName: String?
let valueChoices: [CompletionChoice]
var takesValue: Bool {
self.valueName != nil
}
static func flag(names: [String], help: String) -> CompletionOption {
CompletionOption(names: names, help: help, valueName: nil, valueChoices: [])
}
static func option(
names: [String],
valueName: String,
help: String,
valueChoices: [CompletionChoice]
) -> CompletionOption {
CompletionOption(names: names, help: help, valueName: valueName, valueChoices: valueChoices)
}
}
/// A single suggested completion value with optional help text.
///
/// `CompletionChoice` is used for subcommands and curated value suggestions for
/// positional arguments or option values.
struct CompletionChoice {
let value: String
let help: String?
}
/// Central registry for curated completion values that cannot be inferred from
/// Commander metadata alone.
///
/// Most command structure comes directly from descriptors. This catalog is only
/// for constrained value sets such as `completions [shell]` or `--log-level`.
enum CompletionValueCatalog {
static func argumentChoices(for path: [String], index: Int, label: String) -> [CompletionChoice] {
if path == ["completions"], index == 0, label == "shell" {
return CompletionsCommand.Shell.allCases.map { shell in
CompletionChoice(value: shell.rawValue, help: shell.helpText)
}
}
return []
}
static func optionChoices(for path: [String], label: String, names: [String]) -> [CompletionChoice] {
if names.contains("--log-level"), label == "logLevel" {
return LogLevel.allCases.map { level in
CompletionChoice(value: level.cliValue, help: nil)
}
}
return []
}
}
/// Dispatches shell-completion rendering to the appropriate shell-specific
/// renderer.
enum CompletionScriptRenderer {
static func render(document: CompletionScriptDocument, for targetShell: CompletionsCommand.Shell) -> String {
switch targetShell {
case .bash:
BashCompletionRenderer().render(document: document)
case .zsh:
ZshCompletionRenderer().render(document: document)
case .fish:
FishCompletionRenderer().render(document: document)
}
}
}
protocol ShellCompletionRendering {
func render(document: CompletionScriptDocument) -> String
}
extension CommanderName {
var completionSpelling: String {
switch self {
case let .short(value), let .aliasShort(value):
"-\(value)"
case let .long(value), let .aliasLong(value):
"--\(value)"
}
}
}

View File

@ -1,191 +0,0 @@
import Foundation
/// Renders a fish completion script using fish-native helper functions and a
/// single dynamic `complete -a` callback.
struct FishCompletionRenderer: ShellCompletionRendering {
func render(document: CompletionScriptDocument) -> String {
let lines = [
"# Fish completion for peekaboo",
"# Generated from Commander descriptors via `peekaboo completions fish`.",
"# Install with:",
"# \(CompletionsCommand.Shell.fish.installationSnippet)",
"",
"function __peekaboo_fish_subcommands",
self.renderFishChoiceSwitch(document.pathsIncludingRoot, accessor: \.subcommands),
"end",
"",
"function __peekaboo_fish_options",
self.renderFishOptionSwitch(document: document),
"end",
"",
"function __peekaboo_fish_argument_values",
self.renderFishArgumentSwitch(document.pathsIncludingRoot),
"end",
"",
"function __peekaboo_fish_option_values",
self.renderFishOptionValueSwitch(document.pathsIncludingRoot),
"end",
"",
"function __peekaboo_fish_has_subcommand",
" set -l path $argv[1]",
" set -l candidate $argv[2]",
" for line in (__peekaboo_fish_subcommands \"$path\")",
" set -l parts (string split \\t -- $line)",
" if test (count $parts) -gt 0; and test \"$parts[1]\" = \"$candidate\"",
" return 0",
" end",
" end",
" return 1",
"end",
"",
"function __peekaboo_fish_append_path",
" if test -n \"$argv[1]\"",
" printf '%s %s\\n' \"$argv[1]\" \"$argv[2]\"",
" else",
" printf '%s\\n' \"$argv[2]\"",
" end",
"end",
"",
"function __peekaboo_fish_complete",
" set -l tokens (commandline -opc)",
" if test (count $tokens) -gt 0",
" set -e tokens[1]",
" end",
" set -l current (commandline -ct)",
" set -l path ''",
" set -l index 1",
" set -l previous ''",
" set -l token_count (count $tokens)",
"",
" while test $index -le $token_count",
" set -l token $tokens[$index]",
" if string match -qr '^-' -- $token",
" break",
" end",
" if __peekaboo_fish_has_subcommand \"$path\" \"$token\"",
" set path (__peekaboo_fish_append_path \"$path\" \"$token\")",
" set index (math \"$index + 1\")",
" else",
" break",
" end",
" end",
"",
" if test $token_count -gt 0",
" set previous $tokens[$token_count]",
" end",
"",
" set -l option_values (__peekaboo_fish_option_values \"$path\" \"$previous\")",
" if test (count $option_values) -gt 0",
" printf '%s\\n' $option_values",
" return",
" end",
"",
" if string match -qr '^-' -- $current",
" __peekaboo_fish_options \"$path\"",
" return",
" end",
"",
" set -l subcommands (__peekaboo_fish_subcommands \"$path\")",
" if test (count $subcommands) -gt 0",
" printf '%s\\n' $subcommands",
" return",
" end",
"",
" set -l argument_index (math \"$token_count - $index + 1\")",
" __peekaboo_fish_argument_values \"$path\" \"$argument_index\"",
"end",
"",
"complete -c \(document.commandName) -f -a '(__peekaboo_fish_complete)'",
]
return lines.joined(separator: "\n")
}
private func renderFishChoiceSwitch(
_ paths: [CompletionPath],
accessor: KeyPath<CompletionPath, [CompletionChoice]>
) -> String {
var lines = [" switch $argv[1]"]
for path in paths {
lines.append(" case '\(self.fishEscaped(path.key))'")
for choice in path[keyPath: accessor] {
lines.append(self.printfLine(value: choice.value, help: choice.help ?? ""))
}
}
lines.append(contentsOf: [
" case '*'",
" end",
])
return lines.joined(separator: "\n")
}
private func renderFishOptionSwitch(document: CompletionScriptDocument) -> String {
var lines = [" switch $argv[1]", " case ''"]
for option in document.rootOptions {
for name in option.names {
lines.append(self.printfLine(value: name, help: option.help))
}
}
for path in document.flattenedPaths {
lines.append(" case '\(self.fishEscaped(path.key))'")
for option in path.options {
for name in option.names {
lines.append(self.printfLine(value: name, help: option.help))
}
}
}
lines.append(contentsOf: [
" case '*'",
" end",
])
return lines.joined(separator: "\n")
}
private func renderFishArgumentSwitch(_ paths: [CompletionPath]) -> String {
var lines = [" switch \"$argv[1]:$argv[2]\""]
for path in paths {
for (index, argument) in path.arguments.enumerated() where !argument.choices.isEmpty {
lines.append(" case '\(self.fishEscaped(path.key)):\(index)'")
for choice in argument.choices {
lines.append(self.printfLine(value: choice.value, help: choice.help ?? ""))
}
}
}
lines.append(contentsOf: [
" case '*'",
" end",
])
return lines.joined(separator: "\n")
}
private func renderFishOptionValueSwitch(_ paths: [CompletionPath]) -> String {
var lines = [" switch \"$argv[1]:$argv[2]\""]
for path in paths {
for option in path.options where !option.valueChoices.isEmpty {
for name in option.names {
lines.append(" case '\(self.fishEscaped(path.key)):\(self.fishEscaped(name))'")
for choice in option.valueChoices {
lines.append(self.printfLine(value: choice.value, help: choice.help ?? ""))
}
}
}
}
lines.append(contentsOf: [
" case '*'",
" end",
])
return lines.joined(separator: "\n")
}
private func printfLine(value: String, help: String) -> String {
" printf '%s\\t%s\\n' '\(self.fishEscaped(value))' '\(self.fishEscaped(help))'"
}
private func fishEscaped(_ value: String) -> String {
value
.replacingOccurrences(of: "\\", with: "\\\\")
.replacingOccurrences(of: "'", with: "\\'")
.replacingOccurrences(of: "\t", with: " ")
.replacingOccurrences(of: "\n", with: " ")
}
}

View File

@ -1,197 +0,0 @@
import Foundation
/// Renders a zsh completion script using `compdef` plus dynamic helper
/// functions backed by the shared completion document.
struct ZshCompletionRenderer: ShellCompletionRendering {
func render(document: CompletionScriptDocument) -> String {
let lines = [
"#compdef \(document.commandName)",
"# Zsh completion for peekaboo",
"# Generated from Commander descriptors via `peekaboo completions zsh`.",
"# Install with:",
"# \(CompletionsCommand.Shell.zsh.installationSnippet)",
"",
"__peekaboo_zsh_subcommands() {",
self.renderZshChoiceSwitch(document.pathsIncludingRoot, accessor: \.subcommands),
"}",
"",
"__peekaboo_zsh_options() {",
self.renderZshOptionSwitch(document: document),
"}",
"",
"__peekaboo_zsh_argument_values() {",
self.renderZshArgumentSwitch(document.pathsIncludingRoot),
"}",
"",
"__peekaboo_zsh_option_values() {",
self.renderZshOptionValueSwitch(document.pathsIncludingRoot),
"}",
"",
"__peekaboo_zsh_has_subcommand() {",
" local path=\"$1\"",
" local candidate=\"$2\"",
" local line value description",
" while IFS=$'\\t' read -r value description; do",
" [[ \"$value\" == \"$candidate\" ]] && return 0",
" done < <(__peekaboo_zsh_subcommands \"$path\")",
" return 1",
"}",
"",
"__peekaboo_zsh_compadd_with_help() {",
" local line value description",
" local -a values descriptions",
" while IFS=$'\\t' read -r value description; do",
" values+=(\"$value\")",
" descriptions+=(\"$description\")",
" done",
" if (( ${#values[@]} == 0 )); then",
" return 1",
" fi",
" compadd -Q -d descriptions -- \"${values[@]}\"",
"}",
"",
"_peekaboo() {",
" local path=\"\"",
" local index=2",
" local token current_word previous_word",
" current_word=\"${words[CURRENT]}\"",
"",
" while (( index < CURRENT )); do",
" token=\"${words[index]}\"",
" [[ \"$token\" == -* ]] && break",
" if __peekaboo_zsh_has_subcommand \"$path\" \"$token\"; then",
" path=\"${path:+$path }$token\"",
" (( index++ ))",
" else",
" break",
" fi",
" done",
"",
" if (( CURRENT > 2 )); then",
" previous_word=\"${words[CURRENT - 1]}\"",
" fi",
"",
" if __peekaboo_zsh_option_values \"$path\" \"$previous_word\" | __peekaboo_zsh_compadd_with_help; then",
" return",
" fi",
"",
" if [[ \"$current_word\" == -* ]]; then",
" __peekaboo_zsh_options \"$path\" | __peekaboo_zsh_compadd_with_help",
" return",
" fi",
"",
" if __peekaboo_zsh_subcommands \"$path\" | __peekaboo_zsh_compadd_with_help; then",
" return",
" fi",
"",
" local argument_index=$(( CURRENT - index ))",
" __peekaboo_zsh_argument_values \"$path\" \"$argument_index\" | __peekaboo_zsh_compadd_with_help",
"}",
"",
"compdef _peekaboo \(document.commandName)",
]
return lines.joined(separator: "\n")
}
private func renderZshChoiceSwitch(
_ paths: [CompletionPath],
accessor: KeyPath<CompletionPath, [CompletionChoice]>
) -> String {
var lines = [" case \"$1\" in"]
for path in paths {
lines.append(" '\(self.caseLabel(path.key))')")
for choice in path[keyPath: accessor] {
lines.append(self.printLine(value: choice.value, help: choice.help ?? ""))
}
lines.append(" ;;")
}
lines.append(contentsOf: [
" *)",
" ;;",
" esac",
])
return lines.joined(separator: "\n")
}
private func renderZshOptionSwitch(document: CompletionScriptDocument) -> String {
var lines = [" case \"$1\" in", " '')"]
for option in document.rootOptions {
for name in option.names {
lines.append(self.printLine(value: name, help: option.help))
}
}
lines.append(" ;;")
for path in document.flattenedPaths {
lines.append(" '\(self.caseLabel(path.key))')")
for option in path.options {
for name in option.names {
lines.append(self.printLine(value: name, help: option.help))
}
}
lines.append(" ;;")
}
lines.append(contentsOf: [
" *)",
" ;;",
" esac",
])
return lines.joined(separator: "\n")
}
private func renderZshArgumentSwitch(_ paths: [CompletionPath]) -> String {
var lines = [" case \"$1:$2\" in"]
for path in paths {
for (index, argument) in path.arguments.enumerated() where !argument.choices.isEmpty {
lines.append(" '\(self.caseLabel(path.key)):\(index)')")
for choice in argument.choices {
lines.append(self.printLine(value: choice.value, help: choice.help ?? ""))
}
lines.append(" ;;")
}
}
lines.append(contentsOf: [
" *)",
" ;;",
" esac",
])
return lines.joined(separator: "\n")
}
private func renderZshOptionValueSwitch(_ paths: [CompletionPath]) -> String {
var lines = [" case \"$1:$2\" in"]
for path in paths {
for option in path.options where !option.valueChoices.isEmpty {
for name in option.names {
lines.append(" '\(self.caseLabel(path.key)):\(self.caseLabel(name))')")
for choice in option.valueChoices {
lines.append(self.printLine(value: choice.value, help: choice.help ?? ""))
}
lines.append(" ;;")
}
}
}
lines.append(contentsOf: [
" *)",
" ;;",
" esac",
])
return lines.joined(separator: "\n")
}
private func caseLabel(_ label: String) -> String {
label.replacingOccurrences(of: "'", with: "'\\''")
}
private func printLine(value: String, help: String) -> String {
" print -r -- $'\(self.zshEscaped(value))\\t\(self.zshEscaped(help))'"
}
private func zshEscaped(_ value: String) -> String {
value
.replacingOccurrences(of: "\\", with: "\\\\")
.replacingOccurrences(of: "'", with: "\\'")
.replacingOccurrences(of: "\t", with: " ")
.replacingOccurrences(of: "\n", with: " ")
}
}

View File

@ -1,7 +1,7 @@
import Foundation
import PeekabooCore
/// Re-use the Configuration type from PeekabooCore
// Re-use the Configuration type from PeekabooCore
typealias Configuration = PeekabooCore.Configuration
/// CLI-specific configuration manager that extends PeekabooCore's ConfigurationManager
@ -10,7 +10,7 @@ typealias Configuration = PeekabooCore.Configuration
final class ConfigurationManager: @unchecked Sendable {
static let shared = ConfigurationManager()
/// Use PeekabooCore's ConfigurationManager for core functionality
// Use PeekabooCore's ConfigurationManager for core functionality
private let coreManager = PeekabooCore.ConfigurationManager.shared
private init() {}
@ -54,11 +54,6 @@ final class ConfigurationManager: @unchecked Sendable {
self.coreManager.loadConfiguration()
}
/// Get cached configuration, loading only if needed.
func getConfiguration() -> Configuration? {
self.coreManager.getConfiguration()
}
/// Strip comments from JSONC content
func stripJSONComments(from json: String) -> String {
// Strip comments from JSONC content

View File

@ -34,8 +34,6 @@ enum CommandRegistry {
static let entries: [CommandRegistryEntry] = [
.init(type: ImageCommand.self, category: .core),
.init(type: CaptureCommand.self, category: .core),
.init(type: BridgeCommand.self, category: .core),
.init(type: DaemonCommand.self, category: .core),
.init(type: ListCommand.self, category: .core),
.init(type: ToolsCommand.self, category: .core),
.init(type: ConfigCommand.self, category: .core),
@ -44,12 +42,9 @@ enum CommandRegistry {
.init(type: SeeCommand.self, category: .vision),
.init(type: ClickCommand.self, category: .interaction),
.init(type: TypeCommand.self, category: .interaction),
.init(type: SetValueCommand.self, category: .interaction),
.init(type: PerformActionCommand.self, category: .interaction),
.init(type: PressCommand.self, category: .interaction),
.init(type: ScrollCommand.self, category: .interaction),
.init(type: HotkeyCommand.self, category: .interaction),
.init(type: PasteCommand.self, category: .interaction),
.init(type: SwipeCommand.self, category: .interaction),
.init(type: DragCommand.self, category: .interaction),
.init(type: MoveCommand.self, category: .interaction),
@ -66,11 +61,7 @@ enum CommandRegistry {
.init(type: SpaceCommand.self, category: .system),
.init(type: VisualizerCommand.self, category: .system),
.init(type: ClipboardCommand.self, category: .system),
.init(type: CompletionsCommand.self, category: .core),
.init(type: CommanderCommand.self, category: .core),
.init(type: AgentCommand.self, category: .ai),
.init(type: BrowserCommand.self, category: .mcp),
.init(type: InspectUICommand.self, category: .mcp),
.init(type: MCPCommand.self, category: .mcp),
]

View File

@ -42,7 +42,7 @@ final class Logger: @unchecked Sendable {
private let queue = DispatchQueue(label: "logger.queue", attributes: .concurrent)
private let iso8601Formatter: ISO8601DateFormatter
/// Performance tracking
// Performance tracking
private nonisolated(unsafe) var performanceTimers: [String: Date] = [:]
private init() {

View File

@ -1,248 +0,0 @@
import Darwin
import Foundation
enum DeferredCommandOutput {
static func run<T>(
bufferingOutput: Bool,
operation: () async throws -> T
) async throws -> T {
guard bufferingOutput else {
return try await operation()
}
let inheritedTerminalOutput = TerminalDetector.standardOutputFileDescriptor
let capture = try FileDescriptorOutputCapture()
let terminalOutput = inheritedTerminalOutput ?? capture.originalStandardOutputDescriptor
let result: T
do {
result = try await TerminalDetector.$standardOutputFileDescriptor.withValue(terminalOutput) {
try await operation()
}
} catch {
let shouldReplay = !(error is CancellationError)
// Preserve the command's primary error even if restoring or replaying output fails.
Logger.shared.flush()
try? capture.finish(replayingOutput: shouldReplay)
throw error
}
Logger.shared.flush()
try capture.finish(replayingOutput: true)
return result
}
}
private nonisolated enum DeferredCommandOutputError: LocalizedError {
case posix(operation: String, code: Int32)
var errorDescription: String? {
switch self {
case let .posix(operation, code):
"Failed to \(operation): \(String(cString: strerror(code)))"
}
}
}
private final nonisolated class FileDescriptorOutputCapture {
private var stdoutCapture: Int32 = -1
private var stderrCapture: Int32 = -1
private var originalStdout: Int32 = -1
private var originalStderr: Int32 = -1
private var stdoutRedirected = false
private var stderrRedirected = false
private var finished = false
var originalStandardOutputDescriptor: Int32 {
self.originalStdout
}
init() throws {
// Keep output emitted before this command outside its deferred transaction.
_ = fflush(nil)
do {
self.stdoutCapture = try Self.makeTemporaryFile(named: "stdout")
self.stderrCapture = try Self.makeTemporaryFile(named: "stderr")
self.originalStdout = try Self.duplicate(STDOUT_FILENO, named: "stdout")
self.originalStderr = try Self.duplicate(STDERR_FILENO, named: "stderr")
try Self.redirect(self.stdoutCapture, to: STDOUT_FILENO, named: "stdout")
self.stdoutRedirected = true
try Self.redirect(self.stderrCapture, to: STDERR_FILENO, named: "stderr")
self.stderrRedirected = true
} catch {
self.restoreIgnoringErrors()
self.closeDescriptors()
throw error
}
}
deinit {
guard !self.finished else { return }
_ = fflush(nil)
self.restoreIgnoringErrors()
self.closeDescriptors()
}
func finish(replayingOutput: Bool) throws {
guard !self.finished else { return }
_ = fflush(nil)
try self.restore()
defer {
self.finished = true
self.closeDescriptors()
}
if replayingOutput {
try Self.replay(from: self.stdoutCapture, to: self.originalStdout, named: "stdout")
try Self.replay(from: self.stderrCapture, to: self.originalStderr, named: "stderr")
}
}
private func restore() throws {
var firstError: (any Error)?
if self.stdoutRedirected {
do {
try Self.redirect(self.originalStdout, to: STDOUT_FILENO, named: "stdout")
self.stdoutRedirected = false
} catch {
firstError = error
}
}
if self.stderrRedirected {
do {
try Self.redirect(self.originalStderr, to: STDERR_FILENO, named: "stderr")
self.stderrRedirected = false
} catch {
firstError = firstError ?? error
}
}
if let firstError {
throw firstError
}
}
private func restoreIgnoringErrors() {
if self.stdoutRedirected, dup2(self.originalStdout, STDOUT_FILENO) != -1 {
self.stdoutRedirected = false
}
if self.stderrRedirected, dup2(self.originalStderr, STDERR_FILENO) != -1 {
self.stderrRedirected = false
}
}
private func closeDescriptors() {
Self.close(&self.stdoutCapture)
Self.close(&self.stderrCapture)
Self.close(&self.originalStdout)
Self.close(&self.originalStderr)
}
private static func makeTemporaryFile(named stream: String) throws -> Int32 {
var template = Array("\(NSTemporaryDirectory())peekaboo-\(stream).XXXXXX".utf8CString)
let descriptor = template.withUnsafeMutableBufferPointer { buffer -> Int32 in
guard let baseAddress = buffer.baseAddress else { return -1 }
let descriptor = mkstemp(baseAddress)
if descriptor != -1 {
_ = unlink(baseAddress)
}
return descriptor
}
guard descriptor != -1 else {
throw DeferredCommandOutputError.posix(
operation: "create deferred \(stream) output",
code: errno
)
}
Self.setCloseOnExec(descriptor)
return descriptor
}
private static func duplicate(_ descriptor: Int32, named stream: String) throws -> Int32 {
let duplicate = dup(descriptor)
guard duplicate != -1 else {
throw DeferredCommandOutputError.posix(
operation: "duplicate \(stream)",
code: errno
)
}
Self.setCloseOnExec(duplicate)
return duplicate
}
private static func redirect(_ source: Int32, to destination: Int32, named stream: String) throws {
guard dup2(source, destination) != -1 else {
throw DeferredCommandOutputError.posix(
operation: "redirect \(stream)",
code: errno
)
}
}
private static func replay(from source: Int32, to destination: Int32, named stream: String) throws {
guard lseek(source, 0, SEEK_SET) != -1 else {
throw DeferredCommandOutputError.posix(
operation: "rewind deferred \(stream) output",
code: errno
)
}
var buffer = [UInt8](repeating: 0, count: 64 * 1024)
while true {
let bytesRead = buffer.withUnsafeMutableBytes { bytes in
Darwin.read(source, bytes.baseAddress, bytes.count)
}
if bytesRead == 0 {
return
}
if bytesRead == -1 {
if errno == EINTR {
continue
}
throw DeferredCommandOutputError.posix(
operation: "read deferred \(stream) output",
code: errno
)
}
var offset = 0
while offset < bytesRead {
let bytesWritten = buffer.withUnsafeBytes { bytes in
Darwin.write(
destination,
bytes.baseAddress?.advanced(by: offset),
bytesRead - offset
)
}
if bytesWritten == -1 {
if errno == EINTR {
continue
}
throw DeferredCommandOutputError.posix(
operation: "replay deferred \(stream) output",
code: errno
)
}
offset += bytesWritten
}
}
}
private static func setCloseOnExec(_ descriptor: Int32) {
let flags = fcntl(descriptor, F_GETFD)
if flags != -1 {
_ = fcntl(descriptor, F_SETFD, flags | FD_CLOEXEC)
}
}
private static func close(_ descriptor: inout Int32) {
guard descriptor != -1 else { return }
_ = Darwin.close(descriptor)
descriptor = -1
}
}

View File

@ -66,7 +66,6 @@ struct ErrorInfo: Codable {
enum ErrorCode: String, Codable {
case PERMISSION_ERROR_SCREEN_RECORDING
case PERMISSION_ERROR_ACCESSIBILITY
case PERMISSION_ERROR_EVENT_SYNTHESIZING
case PERMISSION_ERROR_APPLESCRIPT
case PERMISSION_DENIED
case APP_NOT_FOUND
@ -86,8 +85,6 @@ enum ErrorCode: String, Codable {
case NO_ACTIVE_DIALOG
case ELEMENT_NOT_FOUND
case SESSION_NOT_FOUND
case SNAPSHOT_NOT_FOUND
case SNAPSHOT_STALE
case APPLICATION_NOT_FOUND
case NO_POINT_SPECIFIED
case INVALID_COORDINATES
@ -177,11 +174,6 @@ func outputError(message: String, code: ErrorCode, details: String? = nil, logge
outputJSON(JSONResponse(success: false, messages: nil, debugLogs: debugLogs, error: error), logger: logger)
}
func outputFailure(message: String, logger: Logger, error: (any Error)? = nil) {
let details = error.map { "\($0)" }
outputError(message: message, code: .UNKNOWN_ERROR, details: details, logger: logger)
}
/// Empty type for successful responses with no data
struct Empty: Codable {}

View File

@ -1,26 +0,0 @@
import Foundation
extension LogLevel: CaseIterable {
public static var allCases: [LogLevel] {
[.trace, .verbose, .debug, .info, .warning, .error, .critical]
}
var cliValue: String {
switch self {
case .trace:
"trace"
case .verbose:
"verbose"
case .debug:
"debug"
case .info:
"info"
case .warning:
"warning"
case .error:
"error"
case .critical:
"critical"
}
}
}

View File

@ -28,18 +28,14 @@ struct TerminalCapabilities {
/// Terminal detection utilities following modern CLI best practices
enum TerminalDetector {
@TaskLocal
static var standardOutputFileDescriptor: Int32?
/// Detect comprehensive terminal capabilities
static func detectCapabilities() -> TerminalCapabilities {
// Detect comprehensive terminal capabilities
let outputFileDescriptor = self.standardOutputFileDescriptor ?? STDOUT_FILENO
let isInteractive = self.isInteractiveTerminal(outputFileDescriptor)
let (width, height) = self.getTerminalDimensions(outputFileDescriptor)
let isInteractive = self.isInteractiveTerminal()
let (width, height) = self.getTerminalDimensions()
let termType = ProcessInfo.processInfo.environment["TERM"]
let isCI = self.isCIEnvironment()
let isPiped = self.isPipedOutput(outputFileDescriptor)
let isPiped = self.isPipedOutput()
let supportsColors = self.detectColorSupport(termType: termType, isInteractive: isInteractive)
let supportsTrueColor = self.detectTrueColorSupport()
@ -58,15 +54,15 @@ enum TerminalDetector {
// MARK: - Core Detection Methods
/// Check if stdout is connected to an interactive terminal
private static func isInteractiveTerminal(_ outputFileDescriptor: Int32) -> Bool {
private static func isInteractiveTerminal() -> Bool {
// Check if stdout is connected to an interactive terminal
isatty(outputFileDescriptor) != 0
isatty(STDOUT_FILENO) != 0
}
/// Check if output is being piped or redirected
private static func isPipedOutput(_ outputFileDescriptor: Int32) -> Bool {
private static func isPipedOutput() -> Bool {
// Check if output is being piped or redirected
isatty(outputFileDescriptor) == 0
isatty(STDOUT_FILENO) == 0
}
/// Detect CI/automation environments
@ -83,7 +79,7 @@ enum TerminalDetector {
"AZURE_PIPELINES", "TF_BUILD",
"BITBUCKET_COMMIT", "BITBUCKET_BUILD_NUMBER",
"DRONE", "DRONE_BUILD_NUMBER",
"SEMAPHORE", "SEMAPHORE_BUILD_NUMBER",
"SEMAPHORE", "SEMAPHORE_BUILD_NUMBER"
]
let env = ProcessInfo.processInfo.environment
@ -91,11 +87,11 @@ enum TerminalDetector {
}
/// Get terminal dimensions using ioctl
private static func getTerminalDimensions(_ outputFileDescriptor: Int32) -> (width: Int, height: Int) {
private static func getTerminalDimensions() -> (width: Int, height: Int) {
// Get terminal dimensions using ioctl
var windowSize = winsize()
guard ioctl(outputFileDescriptor, TIOCGWINSZ, &windowSize) == 0 else {
guard ioctl(STDOUT_FILENO, TIOCGWINSZ, &windowSize) == 0 else {
// Fallback to environment variables
let width = Int(ProcessInfo.processInfo.environment["COLUMNS"] ?? "80") ?? 80
let height = Int(ProcessInfo.processInfo.environment["LINES"] ?? "24") ?? 24
@ -124,7 +120,7 @@ enum TerminalDetector {
if let term = termType {
let colorTermPatterns = [
"color", "256color", "truecolor", "24bit",
"xterm-256", "screen-256", "tmux-256",
"xterm-256", "screen-256", "tmux-256"
]
if colorTermPatterns.contains(where: term.contains) {
@ -135,7 +131,7 @@ enum TerminalDetector {
let colorTerminals = [
"xterm", "screen", "tmux", "rxvt", "konsole",
"gnome", "mate", "xfce", "terminology", "kitty",
"alacritty", "iterm", "hyper", "vscode",
"alacritty", "iterm", "hyper", "vscode"
]
if colorTerminals.contains(where: term.contains) {
@ -167,7 +163,7 @@ enum TerminalDetector {
if let term = env["TERM"] {
let trueColorTerminals = [
"iterm", "kitty", "alacritty", "wezterm",
"hyper", "vscode", "gnome-terminal",
"hyper", "vscode", "gnome-terminal"
]
return trueColorTerminals.contains(where: term.contains)
}

View File

@ -9,7 +9,7 @@ import PeekabooFoundation
typealias SavedFile = PeekabooCore.SavedFile
typealias ImageCaptureData = PeekabooCore.ImageCaptureData
/// Extend PeekabooCore types to conform to Commander argument parsing for CLI usage
// Extend PeekabooCore types to conform to Commander argument parsing for CLI usage
extension PeekabooCore.CaptureMode: @retroactive ExpressibleFromArgument {
public init?(argument: String) {
self.init(rawValue: argument.lowercased())
@ -40,12 +40,12 @@ typealias WindowListData = PeekabooCore.WindowListData
// MARK: - Window Specifier
/// Re-export WindowSpecifier from PeekabooCore
// Re-export WindowSpecifier from PeekabooCore
typealias WindowSpecifier = PeekabooCore.WindowSpecifier
// MARK: - Window Details Options
/// Re-export WindowDetailOption from PeekabooCore
// Re-export WindowDetailOption from PeekabooCore
typealias WindowDetailOption = PeekabooCore.WindowDetailOption
// MARK: - Window Management
@ -55,7 +55,7 @@ typealias WindowDetailOption = PeekabooCore.WindowDetailOption
/// Used internally for window operations, containing all available
/// information about a window including its Core Graphics identifier and bounds.
/// This is CLI-specific and not shared with PeekabooCore.
struct WindowData {
struct WindowData: Sendable {
let windowId: UInt32
let title: String
let bounds: CGRect
@ -65,5 +65,5 @@ struct WindowData {
// MARK: - Error Types
/// Re-export CaptureError from PeekabooFoundation
// Re-export CaptureError from PeekabooFoundation
typealias CaptureError = PeekabooFoundation.CaptureError

View File

@ -21,11 +21,8 @@ func executePeekabooCLI(arguments: [String]) async -> Int32 {
// Initialize CoreGraphics silently to prevent CGS_REQUIRE_INIT error
_ = CGMainDisplayID()
// Load configuration at startup. The singleton initializer already performs
// the initial load, so avoid a second credentials/config read on every CLI invocation.
_ = ConfigurationManager.shared.getConfiguration()
let shouldEmitJSONErrors = containsJSONOutputFlag(arguments)
// Load configuration at startup
_ = ConfigurationManager.shared.loadConfiguration()
do {
try await CommanderRuntimeExecutor.resolveAndRun(arguments: arguments)
@ -33,58 +30,25 @@ func executePeekabooCLI(arguments: [String]) async -> Int32 {
} catch let exit as ExitCode {
return exit.rawValue
} catch let programError as CommanderProgramError {
printCommanderError(programError, jsonOutput: shouldEmitJSONErrors)
printCommanderError(programError)
return EXIT_FAILURE
} catch {
printGenericError(error, jsonOutput: shouldEmitJSONErrors)
fputs("Error: \(error.localizedDescription)\n", stderr)
return EXIT_FAILURE
}
}
private func containsJSONOutputFlag(_ arguments: [String]) -> Bool {
arguments.contains("--json") || arguments.contains("-j") || arguments.contains("--json-output")
}
private func commanderErrorMessage(_ error: CommanderProgramError) -> String {
private func printCommanderError(_ error: CommanderProgramError) {
switch error {
case let .parsingError(parsing):
parsing.description
fputs("Error: \(parsing.description)\n", stderr)
case let .unknownCommand(name):
"Unknown command '\(name)'"
fputs("Error: Unknown command '\(name)'\n", stderr)
case let .unknownSubcommand(command, name):
"Unknown subcommand '\(name)' for command '\(command)'"
fputs("Error: Unknown subcommand '\(name)' for command '\(command)'\n", stderr)
case .missingCommand:
"No command specified"
fputs("Error: No command specified\n", stderr)
case let .missingSubcommand(command):
"Command '\(command)' requires a subcommand"
fputs("Error: Command '\(command)' requires a subcommand\n", stderr)
}
}
private func printCommanderError(_ error: CommanderProgramError, jsonOutput: Bool) {
let message = commanderErrorMessage(error)
guard jsonOutput else {
fputs("Error: \(message)\n", stderr)
return
}
let logger = Logger.shared
logger.setJsonOutputMode(true)
outputError(message: message, code: .INVALID_ARGUMENT, logger: logger)
}
private func printGenericError(_ error: any Error, jsonOutput: Bool) {
let code: ErrorCode = if error is CommanderBindingError {
.INVALID_ARGUMENT
} else {
.UNKNOWN_ERROR
}
guard jsonOutput else {
fputs("Error: \(error.localizedDescription)\n", stderr)
return
}
let logger = Logger.shared
logger.setJsonOutputMode(true)
outputError(message: error.localizedDescription, code: code, logger: logger)
}

View File

@ -12,32 +12,6 @@ protocol ApplicationResolvable {
}
extension ApplicationResolvable {
/// Returns a PID when the command explicitly targets one, including the documented `--app PID:<pid>` form.
func resolveExplicitPIDObservationTarget() throws -> Int32? {
if let pid, self.app == nil {
return pid
}
guard let appValue = self.app?.trimmingCharacters(in: .whitespacesAndNewlines),
appValue.uppercased().hasPrefix("PID:")
else {
return nil
}
let appPidString = String(appValue.dropFirst("PID:".count))
guard let appPid = Int32(appPidString) else {
throw PeekabooError.invalidInput("Invalid PID format in --app: '\(appValue)'")
}
if let pid, pid != appPid {
throw PeekabooError.invalidInput(
"Conflicting PIDs: --app specifies PID \(appPid) but --pid is \(pid)"
)
}
return appPid
}
/// Resolves the application identifier from app and/or pid parameters
/// Supports lenient handling for redundant but non-conflicting parameters
func resolveApplicationIdentifier() throws -> String {
@ -95,7 +69,5 @@ protocol ApplicationResolvablePositional: ApplicationResolvable {
}
extension ApplicationResolvablePositional {
var app: String? {
self.positionalAppIdentifier
}
var app: String? { self.positionalAppIdentifier }
}

View File

@ -3,7 +3,30 @@ import Foundation
/// Check if the CLI binary is stale compared to the current git state.
/// Only runs in debug builds when git config 'peekaboo.check-build-staleness' is true.
func checkBuildStaleness() {
guard isBuildStalenessCheckEnabled() else { return }
// Check if staleness checking is enabled via git config
let configCheck = Process()
configCheck.executableURL = URL(fileURLWithPath: "/usr/bin/git")
configCheck.arguments = ["config", "peekaboo.check-build-staleness"]
let configPipe = Pipe()
configCheck.standardOutput = configPipe
configCheck.standardError = Pipe() // Silence stderr
do {
try configCheck.run()
configCheck.waitUntilExit()
// Only proceed if the config value is "true"
let configData = configPipe.fileHandleForReading.readDataToEndOfFile()
let configValue = String(data: configData, encoding: .utf8)?
.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
guard configValue == "true" else {
return // Staleness checking is disabled
}
} catch {
return // Git config command failed, skip check
}
// Check 1: Git commit comparison
checkGitCommitStaleness()
@ -12,108 +35,6 @@ func checkBuildStaleness() {
checkFileModificationStaleness()
}
/// Return true when `peekaboo.check-build-staleness` is enabled.
///
/// This runs on every debug CLI start, so avoid spawning `git config` for the common
/// disabled path. Environment override keeps a cheap opt-in for CI and local debugging.
func isBuildStalenessCheckEnabled(
environment: [String: String] = ProcessInfo.processInfo.environment,
currentDirectory: String = FileManager.default.currentDirectoryPath,
gitConfigPaths: [String]? = nil
) -> Bool {
if let override = environment["PEEKABOO_CHECK_BUILD_STALENESS"]?.trimmingCharacters(in: .whitespacesAndNewlines)
.lowercased(),
!override.isEmpty {
return override == "1" || override == "true" || override == "yes"
}
var setting: Bool?
for path in gitConfigPaths ?? defaultGitConfigPaths(environment: environment, currentDirectory: currentDirectory) {
guard let contents = try? String(contentsOfFile: path, encoding: .utf8),
let parsedSetting = parseBuildStalenessSetting(from: contents)
else {
continue
}
setting = parsedSetting
}
return setting == true
}
func parseBuildStalenessSetting(from gitConfig: String) -> Bool? {
var inPeekabooSection = false
for rawLine in gitConfig.components(separatedBy: .newlines) {
let line = rawLine.trimmingCharacters(in: .whitespacesAndNewlines)
guard !line.isEmpty, !line.hasPrefix("#"), !line.hasPrefix(";") else { continue }
if line.hasPrefix("[") && line.hasSuffix("]") {
let section = line.dropFirst().dropLast().trimmingCharacters(in: .whitespacesAndNewlines)
inPeekabooSection = section == "peekaboo"
continue
}
guard inPeekabooSection else { continue }
let parts = line.split(separator: "=", maxSplits: 1).map {
$0.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
}
guard parts.count == 2, parts[0] == "check-build-staleness" else { continue }
return parts[1] == "true" || parts[1] == "1" || parts[1] == "yes"
}
return nil
}
private func defaultGitConfigPaths(environment: [String: String], currentDirectory: String) -> [String] {
var paths = ["/etc/gitconfig"]
if let xdgConfigHome = environment["XDG_CONFIG_HOME"], !xdgConfigHome.isEmpty {
paths.append(URL(fileURLWithPath: xdgConfigHome).appendingPathComponent("git/config").path)
} else if let home = environment["HOME"], !home.isEmpty {
paths.append(URL(fileURLWithPath: home).appendingPathComponent(".config/git/config").path)
}
if let home = environment["HOME"], !home.isEmpty {
paths.append(URL(fileURLWithPath: home).appendingPathComponent(".gitconfig").path)
}
if let localConfigPath = findGitConfigPath(startingAt: currentDirectory) {
paths.append(localConfigPath)
}
return paths
}
private func findGitConfigPath(startingAt path: String) -> String? {
let fileManager = FileManager.default
var directory = URL(fileURLWithPath: path, isDirectory: true).standardizedFileURL
while true {
let dotGit = directory.appendingPathComponent(".git").path
var isDirectory: ObjCBool = false
if fileManager.fileExists(atPath: dotGit, isDirectory: &isDirectory) {
if isDirectory.boolValue {
return URL(fileURLWithPath: dotGit).appendingPathComponent("config").path
}
if let contents = try? String(contentsOfFile: dotGit, encoding: .utf8),
let gitDirLine = contents.components(separatedBy: .newlines).first(where: {
$0.trimmingCharacters(in: .whitespaces).hasPrefix("gitdir:")
}) {
let rawGitDir = gitDirLine
.replacingOccurrences(of: "gitdir:", with: "")
.trimmingCharacters(in: .whitespacesAndNewlines)
let gitDirURL = URL(fileURLWithPath: rawGitDir, relativeTo: directory).standardizedFileURL
return gitDirURL.appendingPathComponent("config").path
}
}
let parent = directory.deletingLastPathComponent()
if parent.path == directory.path { return nil }
directory = parent
}
}
/// Check if the embedded git commit differs from the current git commit
private func checkGitCommitStaleness() {
// Get current git commit hash

View File

@ -1,3 +1,8 @@
//
// AcceleratedTextDetector.swift
// PeekabooCore
//
import Accelerate
import AppKit
import CoreGraphics
@ -16,17 +21,17 @@ final class AcceleratedTextDetector {
// MARK: - Properties
/// Sobel kernels as Int16 for vImage convolution
// Sobel kernels as Int16 for vImage convolution
private let sobelXKernel: [Int16] = [
-1, 0, 1,
-2, 0, 2,
-1, 0, 1,
-1, 0, 1
]
private let sobelYKernel: [Int16] = [
-1, -2, -1,
0, 0, 0,
1, 2, 1,
1, 2, 1
]
// Pre-allocated buffers for performance
@ -39,14 +44,14 @@ final class AcceleratedTextDetector {
private let maxBufferWidth: Int = 200
private let maxBufferHeight: Int = 100
/// Edge detection threshold (0-255 scale)
// Edge detection threshold (0-255 scale)
private let edgeThreshold: UInt8 = 30
private let logger: ObservationAnnotationLog
private let logger: Logger
// MARK: - Initialization
init(logger: ObservationAnnotationLog = .disabled) {
init(logger: Logger = Logger.shared) {
self.logger = logger
self.allocateBuffers()
}
@ -97,7 +102,7 @@ final class AcceleratedTextDetector {
self.logger.verbose("Edge detection for region", category: "LabelPlacement", metadata: [
"rect": "\(rect)",
"density": result.density,
"hasText": result.hasText,
"hasText": result.hasText
])
// More aggressive scoring to avoid text
@ -156,7 +161,7 @@ final class AcceleratedTextDetector {
CGPoint(x: rect.maxX, y: rect.minY),
CGPoint(x: rect.midX, y: rect.midY),
CGPoint(x: rect.minX, y: rect.maxY),
CGPoint(x: rect.maxX, y: rect.maxY),
CGPoint(x: rect.maxX, y: rect.maxY)
]
guard let bitmap = getBitmapRep(from: image) else { return nil }
@ -263,7 +268,8 @@ final class AcceleratedTextDetector {
3,
1, // Divisor
128, // Bias (to keep values positive)
vImage_Flags(kvImageEdgeExtend))
vImage_Flags(kvImageEdgeExtend)
)
// Apply Sobel Y kernel
vImageConvolve_Planar8(
@ -277,7 +283,8 @@ final class AcceleratedTextDetector {
3,
1, // Divisor
128, // Bias (to keep values positive)
vImage_Flags(kvImageEdgeExtend))
vImage_Flags(kvImageEdgeExtend)
)
return (gradX, gradY)
}
@ -334,8 +341,7 @@ final class AcceleratedTextDetector {
private func getBitmapRep(from image: NSImage) -> NSBitmapImageRep? {
guard let tiffData = image.tiffRepresentation,
let bitmap = NSBitmapImageRep(data: tiffData)
else {
let bitmap = NSBitmapImageRep(data: tiffData) else {
return nil
}
return bitmap
@ -346,8 +352,7 @@ final class AcceleratedTextDetector {
let y = Int(bitmap.size.height - point.y - 1) // Flip Y coordinate
guard x >= 0, x < bitmap.pixelsWide,
y >= 0, y < bitmap.pixelsHigh
else {
y >= 0, y < bitmap.pixelsHigh else {
return nil
}

View File

@ -1,177 +0,0 @@
import Foundation
import PeekabooAgentRuntime
@MainActor
final class AgentChatEventDelegate: AgentEventDelegate {
private weak var ui: AgentChatUI?
private var lastToolArguments: [String: [String: Any]] = [:]
init(ui: AgentChatUI) {
self.ui = ui
}
func agentDidEmitEvent(_ event: AgentEvent) {
guard let ui else { return }
switch event {
case .started:
break
case let .assistantMessage(content):
ui.appendAssistant(content)
case let .thinkingMessage(content):
ui.updateThinking(content)
case let .toolCallStarted(name, arguments):
self.handleToolStarted(name: name, arguments: arguments, ui: ui)
case let .toolCallCompleted(name, result):
self.handleToolCompleted(name: name, result: result, ui: ui)
case let .toolCallUpdated(name, arguments):
self.handleToolUpdated(name: name, arguments: arguments, ui: ui)
case .verificationCompleted, .desktopContextRefreshed:
break
case let .error(message):
ui.showError(message)
case .completed:
ui.finishStreaming()
case .queueDrained:
break
}
}
private func handleToolStarted(name: String, arguments: String, ui: AgentChatUI) {
let args = self.parseArguments(arguments)
self.lastToolArguments[name] = args
let formatter = self.toolFormatter(for: name)
let toolType = ToolType(rawValue: name)
let summary = formatter?.formatStarting(arguments: args) ??
name.replacingOccurrences(of: "_", with: " ")
ui.showToolStart(
name: name,
summary: summary,
icon: toolType?.icon,
displayName: toolType?.displayName
)
}
private func handleToolCompleted(name: String, result: String, ui: AgentChatUI) {
let summary = self.toolResultSummary(name: name, result: result)
let success = self.successFlag(from: result)
let toolType = ToolType(rawValue: name)
ui.showToolCompletion(
name: name,
success: success,
summary: summary,
icon: toolType?.icon,
displayName: toolType?.displayName
)
}
private func handleToolUpdated(name: String, arguments: String, ui: AgentChatUI) {
let args = self.parseArguments(arguments)
if let previous = self.lastToolArguments[name], self.dictionariesEqual(previous, args) {
return
}
let formatter = self.toolFormatter(for: name)
let toolType = ToolType(rawValue: name)
let summary = self.diffSummary(for: name, newArgs: args)
?? formatter?.formatStarting(arguments: args)
?? name.replacingOccurrences(of: "_", with: " ")
ui.showToolUpdate(
name: name,
summary: summary,
icon: toolType?.icon,
displayName: toolType?.displayName
)
self.lastToolArguments[name] = args
}
private func toolFormatter(for name: String) -> (any ToolFormatter)? {
if let type = ToolType(rawValue: name) {
return ToolFormatterRegistry.shared.formatter(for: type)
}
return nil
}
private func parseArguments(_ jsonString: String) -> [String: Any] {
guard let data = jsonString.data(using: .utf8),
let json = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else {
return [:]
}
return json
}
private func parseResult(_ jsonString: String) -> [String: Any]? {
guard let data = jsonString.data(using: .utf8),
let json = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else {
return nil
}
return json
}
private func toolResultSummary(name: String, result: String) -> String? {
guard let json = self.parseResult(result) else { return nil }
if let summary = ToolEventSummary.from(resultJSON: json)?.shortDescription(toolName: name) {
return summary
}
let formatter = self.toolFormatter(for: name)
return formatter?.formatResultSummary(result: json)
}
private func successFlag(from result: String) -> Bool {
guard let json = self.parseResult(result) else { return true }
return (json["success"] as? Bool) ?? true
}
/// Minimal diff between previous and new args for the same tool name.
private func diffSummary(for toolName: String, newArgs: [String: Any]) -> String? {
guard let previous = self.lastToolArguments[toolName] else { return nil }
var changes: [String] = []
for (key, newValue) in newArgs {
guard let prevValue = previous[key] else {
changes.append("+\(key)")
continue
}
if !self.valuesEqual(prevValue, newValue) {
let rendered = self.renderValue(newValue)
changes.append("\(key): \(rendered)")
}
if changes.count >= 3 { break }
}
if changes.isEmpty { return nil }
return changes.joined(separator: ", ")
}
private func valuesEqual(_ lhs: Any, _ rhs: Any) -> Bool {
switch (lhs, rhs) {
case let (l as String, r as String): l == r
case let (l as Int, r as Int): l == r
case let (l as Double, r as Double): l == r
case let (l as Bool, r as Bool): l == r
default: false
}
}
private func dictionariesEqual(_ lhs: [String: Any], _ rhs: [String: Any]) -> Bool {
guard lhs.count == rhs.count else { return false }
for (key, lval) in lhs {
guard let rval = rhs[key], self.valuesEqual(lval, rval) else { return false }
}
return true
}
private func renderValue(_ value: Any) -> String {
switch value {
case let str as String:
let max = 32
if str.count > max {
let idx = str.index(str.startIndex, offsetBy: max)
return String(str[..<idx]) + ""
}
return str
case let num as Int: return String(num)
case let num as Double: return String(format: "%.3f", num)
case let bool as Bool: return bool ? "true" : "false"
default: return ""
}
}
}

View File

@ -1,85 +0,0 @@
import TauTUI
/// Minimal loader component to keep chat rendering responsive without pulling in full spinner logic.
@MainActor
final class AgentChatLoader: Component {
private var message: String
init(tui: TUI, message: String) {
self.message = message
}
func setMessage(_ message: String) {
self.message = message
}
func stop() {}
func render(width: Int) -> [String] {
["\(self.message)"]
}
}
@MainActor
final class AgentChatInput: Component {
private let editor = Editor()
var onSubmit: ((String) -> Void)?
var onCancel: (() -> Void)?
var onInterrupt: (() -> Void)?
var onQueueWhileLocked: (() -> Void)?
var isLocked: Bool = false {
didSet {
if !self.isLocked {
self.editor.disableSubmit = false
}
}
}
init() {
self.editor.onSubmit = { [weak self] value in
self?.onSubmit?(value)
}
}
func render(width: Int) -> [String] {
self.editor.render(width: width)
}
func handle(input: TerminalInput) {
switch input {
case let .key(.character(char), modifiers):
if modifiers.contains(.control) {
let lower = String(char).lowercased()
if lower == "c" || lower == "d" {
self.onInterrupt?()
return
}
}
case .key(.escape, _):
if self.isLocked {
self.onCancel?()
return
}
case .key(.end, _):
if self.isLocked {
// End lets a user keep typing while the current run owns normal submit.
self.onQueueWhileLocked?()
return
}
default:
break
}
self.editor.handle(input: input)
}
func clear() {
self.editor.setText("")
}
func currentText() -> String {
self.editor.getText()
}
}

View File

@ -5,8 +5,96 @@
import Foundation
import PeekabooAgentRuntime
import Tachikoma
import TauTUI
// Minimal loader component to keep chat rendering responsive without pulling in full spinner logic.
@MainActor
private final class Loader: Component {
private var message: String
init(tui: TUI, message: String) {
self.message = message
}
func setMessage(_ message: String) {
self.message = message
}
func stop() {}
func render(width: Int) -> [String] {
["\(self.message)"]
}
}
// MARK: - Input
@MainActor
final class AgentChatInput: Component {
private let editor = Editor()
var onSubmit: ((String) -> Void)?
var onCancel: (() -> Void)?
var onInterrupt: (() -> Void)?
var onQueueWhileLocked: (() -> Void)?
var isLocked: Bool = false {
didSet {
if !self.isLocked {
self.editor.disableSubmit = false
}
}
}
init() {
self.editor.onSubmit = { [weak self] value in
self?.onSubmit?(value)
}
}
func render(width: Int) -> [String] {
self.editor.render(width: width)
}
func handle(input: TerminalInput) {
switch input {
case let .key(.character(char), modifiers):
if modifiers.contains(.control) {
let lower = String(char).lowercased()
if lower == "c" || lower == "d" {
self.onInterrupt?()
return
}
}
case .key(.escape, _):
if self.isLocked {
self.onCancel?()
return
}
case .key(.end, _):
if self.isLocked {
self.onQueueWhileLocked?()
return
}
default:
break
}
self.editor.handle(input: input)
}
func clear() {
self.editor.setText("")
}
func currentText() -> String {
self.editor.getText()
}
}
// MARK: - TauTUI Chat UI
@MainActor
final class AgentChatUI {
var onCancelRequested: (() -> Void)?
@ -29,7 +117,7 @@ final class AgentChatUI {
private let thinkingGray = AnsiStyling.color(246)
private var promptContinuation: AsyncStream<String>.Continuation?
private var loader: AgentChatLoader?
private var loader: Loader?
private var assistantBuffer = ""
private var assistantComponent: MarkdownComponent?
private var thinkingBlocks: [MarkdownComponent] = []
@ -105,7 +193,7 @@ final class AgentChatUI {
func beginRun(prompt: String) {
self.setRunning(true)
self.removeLoader()
self.loader = AgentChatLoader(tui: self.tui, message: "Running…")
self.loader = Loader(tui: self.tui, message: "Running…")
if let loader {
self.messages.addChild(loader)
}
@ -338,3 +426,166 @@ final class AgentChatUI {
return "\(base)\(mode)"
}
}
// MARK: - Event delegate
@MainActor
final class AgentChatEventDelegate: AgentEventDelegate {
private weak var ui: AgentChatUI?
private var lastToolArguments: [String: [String: Any]] = [:]
init(ui: AgentChatUI) {
self.ui = ui
}
func agentDidEmitEvent(_ event: AgentEvent) {
guard let ui else { return }
switch event {
case .started:
break
case let .assistantMessage(content):
ui.appendAssistant(content)
case let .thinkingMessage(content):
ui.updateThinking(content)
case let .toolCallStarted(name, arguments):
let args = self.parseArguments(arguments)
self.lastToolArguments[name] = args
let formatter = self.toolFormatter(for: name)
let toolType = ToolType(rawValue: name)
let summary = formatter?.formatStarting(arguments: args) ??
name.replacingOccurrences(of: "_", with: " ")
ui.showToolStart(
name: name,
summary: summary,
icon: toolType?.icon,
displayName: toolType?.displayName
)
case let .toolCallCompleted(name, result):
let summary = self.toolResultSummary(name: name, result: result)
let success = self.successFlag(from: result)
let toolType = ToolType(rawValue: name)
ui.showToolCompletion(
name: name,
success: success,
summary: summary,
icon: toolType?.icon,
displayName: toolType?.displayName
)
case let .toolCallUpdated(name, arguments):
let args = self.parseArguments(arguments)
if let previous = self.lastToolArguments[name], self.dictionariesEqual(previous, args) {
break // skip no-op updates
}
let formatter = self.toolFormatter(for: name)
let toolType = ToolType(rawValue: name)
let summary = self.diffSummary(for: name, newArgs: args)
?? formatter?.formatStarting(arguments: args)
?? name.replacingOccurrences(of: "_", with: " ")
ui.showToolUpdate(
name: name,
summary: summary,
icon: toolType?.icon,
displayName: toolType?.displayName
)
self.lastToolArguments[name] = args
case let .error(message):
ui.showError(message)
case .completed:
ui.finishStreaming()
case .queueDrained:
break
}
}
private func toolFormatter(for name: String) -> (any ToolFormatter)? {
if let type = ToolType(rawValue: name) {
return ToolFormatterRegistry.shared.formatter(for: type)
}
return nil
}
private func parseArguments(_ jsonString: String) -> [String: Any] {
guard let data = jsonString.data(using: .utf8),
let json = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else {
return [:]
}
return json
}
private func parseResult(_ jsonString: String) -> [String: Any]? {
guard let data = jsonString.data(using: .utf8),
let json = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else {
return nil
}
return json
}
private func toolResultSummary(name: String, result: String) -> String? {
guard let json = self.parseResult(result) else { return nil }
if let summary = ToolEventSummary.from(resultJSON: json)?.shortDescription(toolName: name) {
return summary
}
let formatter = self.toolFormatter(for: name)
return formatter?.formatResultSummary(result: json)
}
private func successFlag(from result: String) -> Bool {
guard let json = self.parseResult(result) else { return true }
return (json["success"] as? Bool) ?? true
}
/// Minimal diff between previous and new args for the same tool name.
private func diffSummary(for toolName: String, newArgs: [String: Any]) -> String? {
guard let previous = self.lastToolArguments[toolName] else { return nil }
var changes: [String] = []
for (key, newValue) in newArgs {
guard let prevValue = previous[key] else {
changes.append("+\(key)")
continue
}
if !self.valuesEqual(prevValue, newValue) {
let rendered = self.renderValue(newValue)
changes.append("\(key): \(rendered)")
}
if changes.count >= 3 { break }
}
if changes.isEmpty { return nil }
return changes.joined(separator: ", ")
}
private func valuesEqual(_ lhs: Any, _ rhs: Any) -> Bool {
switch (lhs, rhs) {
case let (l as String, r as String): l == r
case let (l as Int, r as Int): l == r
case let (l as Double, r as Double): l == r
case let (l as Bool, r as Bool): l == r
default: false
}
}
private func dictionariesEqual(_ lhs: [String: Any], _ rhs: [String: Any]) -> Bool {
guard lhs.count == rhs.count else { return false }
for (key, lval) in lhs {
guard let rval = rhs[key], self.valuesEqual(lval, rval) else { return false }
}
return true
}
private func renderValue(_ value: Any) -> String {
switch value {
case let str as String:
let max = 32
if str.count > max {
let idx = str.index(str.startIndex, offsetBy: max)
return String(str[..<idx]) + ""
}
return str
case let num as Int: return String(num)
case let num as Double: return String(format: "%.3f", num)
case let bool as Bool: return bool ? "true" : "false"
default: return ""
}
}
}

View File

@ -51,7 +51,7 @@ extension AgentCommand {
private func transcribeAudio(using audioService: AudioInputService) async throws -> String {
if let audioPath = self.audioFile {
let url = URL(fileURLWithPath: PathResolver.expandPath(audioPath))
let url = URL(fileURLWithPath: audioPath)
return try await audioService.transcribeAudioFile(url)
} else {
try await audioService.startRecording()

View File

@ -69,8 +69,7 @@ extension AgentCommand {
return
} catch {
self.printAgentExecutionError(
"Failed to launch TauTUI chat: \(error.localizedDescription). Falling back to basic chat."
)
"Failed to launch TauTUI chat: \(error.localizedDescription). Falling back to basic chat.")
}
}
@ -91,28 +90,31 @@ extension AgentCommand {
capabilities: TerminalCapabilities,
queueMode: QueueMode
) async throws {
var turnContext = ChatTurnContext(
sessionId: nil,
requestedModel: requestedModel,
queueMode: queueMode,
queuedWhileRunning: []
)
var queuedWhileRunning: [String] = []
var activeSessionId: String?
do {
turnContext.sessionId = try await self.initialChatSessionId(agentService)
activeSessionId = try await self.initialChatSessionId(agentService)
} catch {
self.printAgentExecutionError(error.localizedDescription)
return
}
self.printChatWelcome(
sessionId: turnContext.sessionId,
sessionId: activeSessionId,
modelDescription: self.describeModel(requestedModel),
queueMode: queueMode
)
self.printChatHelpIntro()
if let seed = initialPrompt {
try await self.performChatTurn(seed, agentService: agentService, context: &turnContext)
try await self.performChatTurn(
seed,
agentService: agentService,
sessionId: &activeSessionId,
requestedModel: requestedModel,
queueMode: queueMode,
queuedWhileRunning: &queuedWhileRunning
)
}
while true {
@ -134,7 +136,14 @@ extension AgentCommand {
let batchedPrompt = trimmed
do {
try await self.performChatTurn(batchedPrompt, agentService: agentService, context: &turnContext)
try await self.performChatTurn(
batchedPrompt,
agentService: agentService,
sessionId: &activeSessionId,
requestedModel: requestedModel,
queueMode: queueMode,
queuedWhileRunning: &queuedWhileRunning
)
} catch {
self.printAgentExecutionError(error.localizedDescription)
break
@ -208,17 +217,14 @@ extension AgentCommand {
let tuiDelegate = AgentChatEventDelegate(ui: chatUI)
let sessionForRun = activeSessionId
let tuiContext = AgentRunContext(
sessionId: sessionForRun,
requestedModel: requestedModel,
queueMode: queueMode,
delegate: tuiDelegate
)
currentRun = Task { @MainActor in
try await self.runAgentTurnForTUI(
batchedPrompt,
agentService: agentService,
context: tuiContext
sessionId: sessionForRun,
requestedModel: requestedModel,
queueMode: queueMode,
delegate: tuiDelegate
)
}
@ -240,23 +246,15 @@ extension AgentCommand {
}
}
struct AgentRunContext {
var sessionId: String?
var requestedModel: LanguageModel?
var queueMode: QueueMode
var delegate: any AgentEventDelegate
}
@MainActor
private func runAgentTurnForTUI(
_ input: String,
agentService: PeekabooAgentService,
context: AgentRunContext
sessionId: String?,
requestedModel: LanguageModel?,
queueMode: QueueMode,
delegate: any AgentEventDelegate
) async throws -> AgentExecutionResult {
let sessionId = context.sessionId
let requestedModel = context.requestedModel
let queueMode = context.queueMode
let delegate = context.delegate
if let existingSessionId = sessionId {
return try await agentService.continueSession(
sessionId: existingSessionId,
@ -311,25 +309,19 @@ extension AgentCommand {
return readLine()
}
struct ChatTurnContext {
var sessionId: String?
var requestedModel: LanguageModel?
var queueMode: QueueMode
var queuedWhileRunning: [String]
}
private func performChatTurn(
_ input: String,
agentService: PeekabooAgentService,
context: inout ChatTurnContext
sessionId: inout String?,
requestedModel: LanguageModel?,
queueMode: QueueMode,
queuedWhileRunning: inout [String]
) async throws {
let startingSessionId = context.sessionId
let queueMode = context.queueMode
let requestedModel = context.requestedModel
let startingSessionId = sessionId
var batchedInput = input
if queueMode == .all {
let extras = context.queuedWhileRunning
context.queuedWhileRunning.removeAll()
let extras = queuedWhileRunning
queuedWhileRunning.removeAll()
batchedInput = ([input] + extras).joined(separator: "\n\n")
}
@ -380,7 +372,7 @@ extension AgentCommand {
}
if let updatedSessionId = result.sessionId {
context.sessionId = updatedSessionId
sessionId = updatedSessionId
}
self.printChatTurnSummary(result)
@ -404,6 +396,6 @@ extension AgentCommand {
}
private func describeModel(_ requestedModel: LanguageModel?) -> String {
requestedModel?.description ?? "default (gpt-5.5)"
requestedModel?.description ?? "default (gpt-5.1)"
}
}

View File

@ -1,176 +0,0 @@
import Commander
import Foundation
import PeekabooAgentRuntime
import PeekabooCore
import PeekabooFoundation
import Tachikoma
import TauTUI
@available(macOS 14.0, *)
extension AgentCommand {
func ensureAgentHasCredentials(
selectedModel: LanguageModel
) -> Bool {
if self.isLocalModel(selectedModel) {
return true
}
if self.hasCredentials(for: selectedModel) {
return true
}
let providerName = self.providerDisplayName(for: selectedModel)
let envVar = self.providerEnvironmentVariable(for: selectedModel)
self.printAgentExecutionError(
"Missing API key for \(providerName). Set \(envVar) and retry."
)
return false
}
/// Render the agent execution result using either JSON output or a rich CLI transcript.
@MainActor
func displayResult(_ result: AgentExecutionResult, delegate: AgentOutputDelegate? = nil) {
if self.jsonOutput {
let response = [
"success": true,
"result": [
"content": result.content,
"sessionId": result.sessionId as Any,
"toolCalls": result.messages.flatMap { message in
message.content.compactMap { content in
if case let .toolCall(toolCall) = content {
return [
"id": toolCall.id,
"name": toolCall.name,
"arguments": String(describing: toolCall.arguments)
]
}
return nil
}
},
"metadata": [
"executionTime": result.metadata.executionTime,
"toolCallCount": result.metadata.toolCallCount,
"modelName": result.metadata.modelName
],
"usage": result.usage.map { usage in
[
"inputTokens": usage.inputTokens,
"outputTokens": usage.outputTokens,
"totalTokens": usage.totalTokens
]
} as Any
]
] as [String: Any]
if let jsonData = try? JSONSerialization.data(withJSONObject: response, options: .prettyPrinted) {
print(String(data: jsonData, encoding: .utf8) ?? "{}")
}
} else if self.outputMode == .quiet {
print(result.content)
}
delegate?.showFinalSummaryIfNeeded(result)
}
func makeDisplayDelegate(for task: String) -> AgentOutputDelegate? {
guard !self.jsonOutput, !self.quiet else { return nil }
return AgentOutputDelegate(outputMode: self.outputMode, jsonOutput: self.jsonOutput, task: task)
}
func makeStreamingDelegate(using displayDelegate: AgentOutputDelegate?) -> (any AgentEventDelegate)? {
if let displayDelegate {
return displayDelegate
}
if self.jsonOutput || self.quiet {
return SilentAgentEventDelegate()
}
return nil
}
final class SilentAgentEventDelegate: AgentEventDelegate {
func agentDidEmitEvent(_ event: AgentEvent) {}
}
func printAgentExecutionError(_ message: String) {
if self.jsonOutput {
let error: [String: Any] = ["success": false, "error": message]
if let jsonData = try? JSONSerialization.data(withJSONObject: error, options: .prettyPrinted),
let jsonString = String(data: jsonData, encoding: .utf8) {
print(jsonString)
} else {
print("{\"success\":false,\"error\":\"\(message)\"}")
}
} else {
print("\(TerminalColor.red)Error: \(message)\(TerminalColor.reset)")
}
}
func executeAgentTask(
_ agentService: PeekabooAgentService,
task: String,
requestedModel: LanguageModel?,
maxSteps: Int,
queueMode: QueueMode
) async throws -> AgentExecutionResult {
let outputDelegate = self.makeDisplayDelegate(for: task)
let streamingDelegate = self.makeStreamingDelegate(using: outputDelegate)
do {
let result = try await agentService.executeTask(
task,
maxSteps: maxSteps,
sessionId: nil,
model: requestedModel,
dryRun: self.dryRun,
queueMode: queueMode,
eventDelegate: streamingDelegate,
verbose: self.verbose
)
self.displayResult(result, delegate: outputDelegate)
let duration = String(format: "%.2f", result.metadata.executionTime)
let sessionId = result.sessionId ?? "none"
let finalTokens = result.usage?.totalTokens ?? 0
let status = result.metadata.context["status"] ?? "completed"
AutomationEventLogger.log(
.agent,
"result status=\(status) task='\(task)' model=\(result.metadata.modelName) duration=\(duration)s "
+ "tools=\(result.metadata.toolCallCount) dry_run=\(self.dryRun) "
+ "session=\(sessionId) tokens=\(finalTokens)"
)
return result
} catch {
self.printAgentExecutionError("Agent execution failed: \(error.localizedDescription)")
throw ExitCode.failure
}
}
var normalizedTaskInput: String? {
guard let task else { return nil }
let trimmed = task.trimmingCharacters(in: .whitespacesAndNewlines)
return trimmed.isEmpty ? nil : trimmed
}
var hasTaskInput: Bool {
self.normalizedTaskInput != nil || self.audio || self.audioFile != nil
}
var resolvedMaxSteps: Int {
self.maxSteps ?? 100
}
func resolvedQueueMode() throws -> QueueMode {
guard let raw = self.queueMode?.trimmingCharacters(in: .whitespacesAndNewlines), !raw.isEmpty else {
return .oneAtATime
}
switch raw.lowercased() {
case "one", "one-at-a-time", "single", "sequential", "1":
return .oneAtATime
case "all", "batch", "together":
return .all
default:
throw PeekabooError.invalidInput("Invalid queue mode '\(raw)'. Use one-at-a-time or all.")
}
}
}

View File

@ -1,322 +0,0 @@
import Foundation
import PeekabooCore
import PeekabooFoundation
import Tachikoma
@available(macOS 14.0, *)
extension AgentCommand {
@MainActor
func parseModelString(
_ modelString: String,
configuration: PeekabooCore.ConfigurationManager? = nil
) -> LanguageModel? {
let trimmed = modelString.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return nil }
let explicitProvider = trimmed
.split(separator: "/", maxSplits: 1)
.first
.map { String($0).lowercased() }
if trimmed.caseInsensitiveCompare("claude") == .orderedSame ||
trimmed.caseInsensitiveCompare("anthropic") == .orderedSame {
return .anthropic(.opus48)
}
if let configuration {
switch self.parseConfiguredCustomModel(
trimmed,
explicitProvider: explicitProvider,
configuration: configuration
) {
case let .resolved(model):
return model
case .unresolved:
break
}
}
guard let parsed = LanguageModel.parse(from: trimmed) else {
return nil
}
return self.supportedParsedModel(parsed, explicitProvider: explicitProvider)
}
@MainActor
private func supportedParsedModel(_ parsed: LanguageModel, explicitProvider: String?) -> LanguageModel? {
switch parsed {
case let .openai(model):
if Self.supportedOpenAIInputs.contains(model) {
return .openai(.gpt55)
}
case let .anthropic(model):
if Self.supportedAnthropicInputs.contains(model) {
return .anthropic(model)
}
case let .google(model):
if Self.supportedGoogleInputs.contains(model) {
return .google(model)
}
case .grok:
return parsed.supportsTools ? parsed : nil
case let .minimax(model):
if Self.supportedMiniMaxInputs.contains(model) {
return .minimax(model)
}
case let .minimaxCN(model):
if Self.supportedMiniMaxInputs.contains(model) {
return .minimaxCN(model)
}
case .ollama, .lmstudio:
return parsed.supportsTools ? parsed : nil
case .openRouter:
if let explicitProvider, Self.reservedProviderInputs.contains(explicitProvider) {
return nil
}
return parsed.supportsTools ? parsed : nil
default:
break
}
return nil
}
@MainActor
private func parseConfiguredCustomModel(
_ modelString: String,
explicitProvider: String?,
configuration: PeekabooCore.ConfigurationManager
) -> ConfiguredModelResolution {
if let configuredModel = PeekabooAIService(configuration: configuration).resolveConfiguredModel(modelString),
case .custom = configuredModel {
return .resolved(configuredModel.supportsTools ? configuredModel : nil)
}
if let explicitProvider,
configuration.listCustomProviders().contains(where: { providerID, provider in
provider.enabled && providerID.caseInsensitiveCompare(explicitProvider) == .orderedSame
}) {
return .resolved(nil)
}
return .unresolved
}
private enum ConfiguredModelResolution {
case resolved(LanguageModel?)
case unresolved
}
@MainActor
func validatedModelSelection(configuration: PeekabooCore.ConfigurationManager? = nil) throws -> LanguageModel? {
guard let modelString = self.model else { return nil }
guard let parsed = self.parseModelString(modelString, configuration: configuration) else {
throw PeekabooError.invalidInput(
"Unsupported model '\(modelString)'. Allowed values: \(Self.allowedModelList)"
)
}
return parsed
}
private static let supportedOpenAIInputs: Set<LanguageModel.OpenAI> = [
.gpt55,
.gpt54,
.gpt54Mini,
.gpt54Nano,
.gpt5,
.gpt5Pro,
.gpt5Mini,
.gpt5Nano,
]
private static let supportedAnthropicInputs: Set<LanguageModel.Anthropic> = [
.fable5,
.opus48,
.opus47,
.opus45,
.opus4,
.sonnet46,
.sonnet45,
.haiku45,
]
private static let supportedGoogleInputs: Set<LanguageModel.Google> = [
.gemini35Flash,
.gemini31ProPreview,
.gemini31FlashLite,
.gemini3Flash,
.gemini25Pro,
.gemini25Flash,
.gemini25FlashLite,
]
private static let supportedMiniMaxInputs: Set<LanguageModel.MiniMax> = [
.m27,
.m27Highspeed,
]
private static let reservedProviderInputs: Set<String> = [
"openai",
"anthropic",
"google",
"gemini",
"grok",
"xai",
"minimax",
"minimax-cn",
"minimax_cn",
"minimaxi",
"ollama",
"lmstudio",
"lm-studio",
]
private static var allowedModelList: String {
let openAIModels = Self.supportedOpenAIInputs.map(\.modelId)
let anthropicModels = Self.supportedAnthropicInputs.map(\.modelId)
let googleModels = Self.supportedGoogleInputs.map(\.userFacingModelId)
let miniMaxModels = Self.supportedMiniMaxInputs.map(\.modelId)
return (openAIModels + anthropicModels + googleModels + miniMaxModels + [
"grok/<model>",
"xai/<model>",
"minimax-cn/<model>",
"ollama/<model>",
"lmstudio/<model>",
"openrouter/<provider>/<model>",
"<custom-provider>/<model>",
])
.sorted()
.joined(separator: ", ")
}
@MainActor
func firstAvailableToolModel(from service: PeekabooAIService) -> LanguageModel? {
service.availableModels().first { model in
model.supportsTools && service.isModelAvailable(model)
}
}
@MainActor
func configuredDefaultToolModel(
from service: PeekabooAIService,
configuration: PeekabooCore.ConfigurationManager
) -> LanguageModel? {
guard let defaultModel = configuration.getAgentModel(),
let model = service.resolveConfiguredModel(defaultModel),
model.supportsTools,
service.isModelAvailable(model)
else {
return nil
}
return model
}
@MainActor
func implicitToolModel(
from service: PeekabooAIService,
configuration: PeekabooCore.ConfigurationManager,
existingAgentModel: LanguageModel?
) -> LanguageModel? {
if let existingAgentModel {
return existingAgentModel
}
if configuration.hasExplicitAIProviderList() {
return nil
}
return self.configuredDefaultToolModel(from: service, configuration: configuration) ??
self.firstAvailableToolModel(from: service)
}
@MainActor
func hasCredentials(for model: LanguageModel) -> Bool {
let configuration = self.services.configuration
switch model {
case .ollama, .lmstudio:
return true
case .openai:
return configuration.hasOpenAIAuth()
case .anthropic:
return configuration.hasAnthropicAuth()
case .google:
return configuration.getGeminiAPIKey()?.isEmpty == false
case .minimax:
return configuration.getMiniMaxAPIKey()?.isEmpty == false
case .minimaxCN:
return configuration.getMiniMaxChinaAPIKey()?.isEmpty == false
case .grok:
return configuration.getGrokAPIKey()?.isEmpty == false
case .openRouter:
return configuration.getOpenRouterAPIKey()?.isEmpty == false
case let .custom(provider):
return provider.apiKey?.isEmpty == false
default:
return false
}
}
func providerDisplayName(for model: LanguageModel) -> String {
switch model {
case .openai:
"OpenAI"
case .anthropic:
"Anthropic"
case .google:
"Google"
case .minimax:
"MiniMax"
case .minimaxCN:
"MiniMax China"
case .ollama:
"Ollama"
case .lmstudio:
"LM Studio"
case .openRouter:
"OpenRouter"
case .grok:
"xAI"
case let .custom(provider):
"custom provider \(provider.modelId)"
default:
"the selected provider"
}
}
func providerEnvironmentVariable(for model: LanguageModel) -> String {
switch model {
case .openai:
"OPENAI_API_KEY"
case .anthropic:
"ANTHROPIC_API_KEY"
case .google:
"GEMINI_API_KEY"
case .minimax:
"MINIMAX_API_KEY"
case .minimaxCN:
"MINIMAX_CN_API_KEY or MINIMAX_API_KEY"
case .ollama:
"OLLAMA_BASE_URL or PEEKABOO_OLLAMA_BASE_URL"
case .lmstudio:
"LM Studio local server URL"
case .openRouter:
"OPENROUTER_API_KEY"
case .grok:
"X_AI_API_KEY, XAI_API_KEY, or GROK_API_KEY"
case .custom:
"the custom provider API key reference"
default:
"provider API key"
}
}
func isLocalModel(_ model: LanguageModel?) -> Bool {
switch model {
case .ollama, .lmstudio:
true
default:
false
}
}
}

View File

@ -1,240 +0,0 @@
import Foundation
import PeekabooAgentRuntime
import PeekabooCore
import PeekabooFoundation
import Tachikoma
import TauTUI
/// Temporary session info struct until PeekabooAgentService implements session management
struct AgentSessionInfo: Codable {
let id: String
let task: String
let created: Date
let lastModified: Date
let messageCount: Int
}
@available(macOS 14.0, *)
extension AgentCommand {
struct ResumeAgentSessionRequest {
let sessionId: String
let task: String
let requestedModel: LanguageModel?
let maxSteps: Int
let queueMode: QueueMode
}
func handleSessionResumption(
_ agentService: PeekabooAgentService,
requestedModel: LanguageModel?,
maxSteps: Int,
queueMode: QueueMode
) async throws -> Bool {
if let sessionId = self.resumeSession {
guard let continuationTask = self.task else {
self.printMissingTaskError(
message: "Task argument required when resuming session",
usage: "Usage: peekaboo agent --resume-session <session-id> \"<continuation-task>\""
)
return true
}
try await self.resumeAgentSession(
agentService,
request: ResumeAgentSessionRequest(
sessionId: sessionId,
task: continuationTask,
requestedModel: requestedModel,
maxSteps: maxSteps,
queueMode: queueMode
)
)
return true
}
if self.resume {
guard let continuationTask = self.task else {
self.printMissingTaskError(
message: "Task argument required when resuming",
usage: "Usage: peekaboo agent --resume \"<continuation-task>\""
)
return true
}
let sessions = try await agentService.listSessions()
if let mostRecent = sessions.first {
try await self.resumeAgentSession(
agentService,
request: ResumeAgentSessionRequest(
sessionId: mostRecent.id,
task: continuationTask,
requestedModel: requestedModel,
maxSteps: maxSteps,
queueMode: queueMode
)
)
} else {
if self.jsonOutput {
let error = ["success": false, "error": "No sessions found to resume"] as [String: Any]
let jsonData = try JSONSerialization.data(withJSONObject: error, options: .prettyPrinted)
print(String(data: jsonData, encoding: .utf8) ?? "{}")
} else {
print("\(TerminalColor.red)Error: No sessions found to resume\(TerminalColor.reset)")
}
}
return true
}
return false
}
func printMissingTaskError(message: String, usage: String) {
if self.jsonOutput {
let error = ["success": false, "error": message] as [String: Any]
if let jsonData = try? JSONSerialization.data(withJSONObject: error, options: .prettyPrinted),
let jsonString = String(data: jsonData, encoding: .utf8) {
print(jsonString)
} else {
print("{\"success\":false,\"error\":\"\(message)\"}")
}
} else {
print("\(TerminalColor.red)Error: \(message)\(TerminalColor.reset)")
if !usage.isEmpty {
print(usage)
}
}
}
@MainActor
func showSessions(_ agentService: any AgentServiceProtocol) async throws {
guard let peekabooService = agentService as? PeekabooAgentService else {
throw PeekabooError.commandFailed("Agent service not properly initialized")
}
let sessionSummaries = try await peekabooService.listSessions()
let sessions = sessionSummaries.map { summary in
AgentSessionInfo(
id: summary.id,
task: summary.summary ?? "Unknown task",
created: summary.createdAt,
lastModified: summary.lastAccessedAt,
messageCount: summary.messageCount
)
}
guard !sessions.isEmpty else {
self.printNoAgentSessions()
return
}
if self.jsonOutput {
self.printSessionsJSON(sessions)
} else {
self.printSessionsList(sessions)
}
}
private func printNoAgentSessions() {
if self.jsonOutput {
let response = ["success": true, "sessions": []] as [String: Any]
let jsonData = try? JSONSerialization.data(withJSONObject: response, options: .prettyPrinted)
print(String(data: jsonData ?? Data(), encoding: .utf8) ?? "{}")
} else {
print("No agent sessions found.")
}
}
private func printSessionsJSON(_ sessions: [AgentSessionInfo]) {
let sessionData = sessions.map { session in
[
"id": session.id,
"createdAt": ISO8601DateFormatter().string(from: session.created),
"updatedAt": ISO8601DateFormatter().string(from: session.lastModified),
"messageCount": session.messageCount
]
}
let response = ["success": true, "sessions": sessionData] as [String: Any]
if let jsonData = try? JSONSerialization.data(withJSONObject: response, options: .prettyPrinted) {
print(String(data: jsonData, encoding: .utf8) ?? "{}")
}
}
private func printSessionsList(_ sessions: [AgentSessionInfo]) {
let headerLine = [
"\(TerminalColor.cyan)\(TerminalColor.bold)Agent Sessions:\(TerminalColor.reset)",
"\n"
].joined()
print(headerLine)
let dateFormatter = DateFormatter()
dateFormatter.dateStyle = .short
dateFormatter.timeStyle = .short
for (index, session) in sessions.prefix(10).indexed() {
self.printSessionLine(index: index, session: session, dateFormatter: dateFormatter)
if index < sessions.count - 1 {
print()
}
}
if sessions.count > 10 {
print([
"\n",
"\(TerminalColor.dim)... and \(sessions.count - 10) more sessions\(TerminalColor.reset)"
].joined())
}
let resumeHintLine = [
"\n",
"\(TerminalColor.dim)To resume: peekaboo agent --resume <session-id>",
" \"<continuation>\"\(TerminalColor.reset)"
].joined()
print(resumeHintLine)
}
private func printSessionLine(index: Int, session: AgentSessionInfo, dateFormatter: DateFormatter) {
let timeAgo = formatTimeAgo(session.lastModified)
let sessionLine = [
"\(TerminalColor.blue)\(index + 1).\(TerminalColor.reset)",
" ",
"\(TerminalColor.bold)\(session.id.prefix(8))\(TerminalColor.reset)"
].joined()
print(sessionLine)
print(" Messages: \(session.messageCount)")
print(" Last activity: \(timeAgo)")
}
private func resumeAgentSession(
_ agentService: PeekabooAgentService,
request: ResumeAgentSessionRequest
) async throws {
if !self.jsonOutput {
let resumingLine = [
"\(TerminalColor.cyan)\(TerminalColor.bold)",
"\(AgentDisplayTokens.Status.info)",
" Resuming session \(request.sessionId.prefix(8))...",
"\(TerminalColor.reset)",
"\n"
].joined()
print(resumingLine)
}
let outputDelegate = self.makeDisplayDelegate(for: request.task)
let streamingDelegate = self.makeStreamingDelegate(using: outputDelegate)
do {
let result = try await agentService.continueSession(
sessionId: request.sessionId,
userMessage: request.task,
model: request.requestedModel,
maxSteps: request.maxSteps,
dryRun: self.dryRun,
queueMode: request.queueMode,
eventDelegate: streamingDelegate
)
self.displayResult(result, delegate: outputDelegate)
} catch {
self.printAgentExecutionError("Failed to resume session: \(error.localizedDescription)")
throw error
}
}
}

View File

@ -1,199 +0,0 @@
import Darwin
import Dispatch
import Foundation
import PeekabooAgentRuntime
import PeekabooCore
import TauTUI
@MainActor
private final class TerminalModeGuard {
private let fd: Int32
private var original = termios()
private var active = false
init?(fd: Int32 = STDIN_FILENO) {
guard isatty(fd) == 1 else { return nil }
guard tcgetattr(fd, &self.original) == 0 else { return nil }
var raw = self.original
cfmakeraw(&raw)
raw.c_lflag |= tcflag_t(ISIG) // keep signals like Ctrl+C enabled
guard tcsetattr(fd, TCSANOW, &raw) == 0 else { return nil }
self.fd = fd
self.active = true
}
var fileDescriptor: Int32 {
self.fd
}
func restore() {
guard self.active else { return }
_ = tcsetattr(self.fd, TCSANOW, &self.original)
self.active = false
}
@MainActor
deinit {
self.restore()
}
}
final class EscapeKeyMonitor {
private var source: (any DispatchSourceRead)?
private var terminalGuard: TerminalModeGuard?
private let handler: @Sendable () async -> Void
private let queue = DispatchQueue(label: "peekaboo.escape.monitor")
init(handler: @escaping @Sendable () async -> Void) {
self.handler = handler
}
func start() {
guard self.source == nil else { return }
guard let termGuard = TerminalModeGuard() else { return }
let fd = termGuard.fileDescriptor
let handler = self.handler
let source = DispatchSource.makeReadSource(fileDescriptor: fd, queue: self.queue)
source.setEventHandler {
var buffer = [UInt8](repeating: 0, count: 16)
let count = read(fd, &buffer, buffer.count)
guard count > 0 else { return }
if buffer[..<count].contains(0x1B) {
Task.detached(priority: .userInitiated) {
await handler()
}
}
}
source.setCancelHandler {
termGuard.restore()
}
source.resume()
self.source = source
self.terminalGuard = termGuard
}
func stop() {
self.source?.cancel()
self.source = nil
self.terminalGuard = nil
}
}
@available(macOS 14.0, *)
extension AgentCommand {
func printChatWelcome(sessionId: String?, modelDescription: String, queueMode: QueueMode) {
guard !self.quiet else { return }
let header = [
TerminalColor.cyan,
TerminalColor.bold,
"Interactive agent chat",
TerminalColor.reset,
" model: ",
modelDescription,
" • queue: ",
queueMode == .all ? "all" : "one-at-a-time"
].joined()
print(header)
if let sessionId {
print("\(TerminalColor.dim)Resuming session \(sessionId.prefix(8))\(TerminalColor.reset)")
} else {
print("\(TerminalColor.dim)A new session will be created on the first prompt\(TerminalColor.reset)")
}
print()
}
func printChatHelpIntro() {
guard !self.quiet else { return }
print("Type /help for chat commands (Ctrl+C to exit).")
self.printChatHelpMenu()
}
func printChatHelpMenu() {
guard !self.quiet else { return }
self.chatHelpLines.forEach { print($0) }
}
private var chatHelpText: String {
"""
Chat commands:
Type any prompt and press Return to run it.
/help Show this menu again.
Esc Cancel the active run (if one is in progress).
Ctrl+C Cancel when running; exit immediately when idle.
Ctrl+D Exit when idle (EOF).
"""
}
var chatHelpLines: [String] {
self.chatHelpText
.split(separator: "\n", omittingEmptySubsequences: false)
.map(String.init)
}
private func printCapabilityFlag(_ label: String, supported: Bool, detail: String? = nil) {
let status = supported ? AgentDisplayTokens.Status.success : AgentDisplayTokens.Status.failure
let detailSuffix = detail.map { " (\($0))" } ?? ""
print("\(label): \(status)\(detailSuffix)")
}
/// Print detailed terminal detection debugging information
func printTerminalDetectionDebug(_ capabilities: TerminalCapabilities, actualMode: OutputMode) {
print("\n" + String(repeating: "=", count: 60))
print("\(TerminalColor.bold)\(TerminalColor.cyan)TERMINAL DETECTION DEBUG (-vv)\(TerminalColor.reset)")
print(String(repeating: "=", count: 60))
print("[term] \(TerminalColor.bold)Terminal Type:\(TerminalColor.reset) \(capabilities.termType ?? "unknown")")
print(
"[size] \(TerminalColor.bold)Dimensions:\(TerminalColor.reset) \(capabilities.width)x\(capabilities.height)"
)
print("\(AgentDisplayTokens.Status.running) \(TerminalColor.bold)Capabilities:\(TerminalColor.reset)")
self.printCapabilityFlag("Interactive", supported: capabilities.isInteractive, detail: "isatty check")
self.printCapabilityFlag("Colors", supported: capabilities.supportsColors, detail: "ANSI support")
self.printCapabilityFlag("True Color", supported: capabilities.supportsTrueColor, detail: "24-bit")
print(" • Dimensions: \(capabilities.width)x\(capabilities.height)")
print("[env] \(TerminalColor.bold)Environment:\(TerminalColor.reset)")
self.printCapabilityFlag("CI Environment", supported: capabilities.isCI)
self.printCapabilityFlag("Piped Output", supported: capabilities.isPiped)
let env = ProcessInfo.processInfo.environment
print("\(AgentDisplayTokens.Status.running) \(TerminalColor.bold)Environment Variables:\(TerminalColor.reset)")
print(" • TERM: \(env["TERM"] ?? "not set")")
print(" • COLORTERM: \(env["COLORTERM"] ?? "not set")")
print(" • NO_COLOR: \(env["NO_COLOR"] != nil ? "set" : "not set")")
print(" • FORCE_COLOR: \(env["FORCE_COLOR"] ?? "not set")")
print(" • PEEKABOO_OUTPUT_MODE: \(env["PEEKABOO_OUTPUT_MODE"] ?? "not set")")
let recommendedMode = capabilities.recommendedOutputMode
print("[focus] \(TerminalColor.bold)Recommended Mode:\(TerminalColor.reset) \(recommendedMode.description)")
print("[focus] \(TerminalColor.bold)Actual Mode:\(TerminalColor.reset) \(actualMode.description)")
if recommendedMode != actualMode {
let modeOverrideLine = [
"\(AgentDisplayTokens.Status.warning) ",
"\(TerminalColor.yellow)Mode Override Detected\(TerminalColor.reset)",
" - explicit flag or environment variable used"
].joined()
print(modeOverrideLine)
}
if !capabilities.isInteractive || capabilities.isCI || capabilities.isPiped {
print(" → Minimal mode (non-interactive/CI/piped)")
} else if capabilities.supportsColors {
print(" → Enhanced mode (colors available)")
} else {
print(" → Compact mode (basic terminal)")
}
print(String(repeating: "=", count: 60) + "\n")
}
}

View File

@ -1,13 +1,27 @@
import Commander
import Darwin
import Dispatch
import Foundation
import Logging
import PeekabooAgentRuntime
import PeekabooCore
import PeekabooFoundation
import Spinner
import Tachikoma
import TachikomaMCP
import TauTUI
/// Simple debug logging check
// Temporary session info struct until PeekabooAgentService implements session management
// Test: Icon notifications are now working
struct AgentSessionInfo: Codable {
let id: String
let task: String
let created: Date
let lastModified: Date
let messageCount: Int
}
// Simple debug logging check
private var isDebugLoggingEnabled: Bool {
// Check if verbose mode is enabled via log level
if let logLevel = ProcessInfo.processInfo.environment["PEEKABOO_LOG_LEVEL"]?.lowercased() {
@ -21,6 +35,8 @@ private var isDebugLoggingEnabled: Bool {
return false
}
private let defaultMCPServerName = "chrome-devtools"
private func aiDebugPrint(_ message: String) {
if isDebugLoggingEnabled {
print(message)
@ -86,14 +102,7 @@ struct AgentCommand: RuntimeOptionsConfigurable {
@Option(name: .long, help: "Queue mode for queued prompts: one-at-a-time (default) or all")
var queueMode: String?
@Option(
name: .long,
help: """
AI model to use (for example: gpt-5.5, claude-fable-5, \
gemini-3.5-flash, grok-4.3, minimax-m2.7, minimax-cn/m2.7, \
ollama/<model>, lmstudio/<model>, or <custom-provider>/<model>)
"""
)
@Option(name: .long, help: "AI model to use (allowed: gpt-5.1 or claude-sonnet-4.5)")
var model: String?
@Flag(name: .long, help: "Resume the most recent session (use with task argument)")
var resume = false
@ -126,7 +135,7 @@ struct AgentCommand: RuntimeOptionsConfigurable {
var chat = false
/// Computed property for output mode with smart detection and progressive enhancement
var outputMode: OutputMode {
private var outputMode: OutputMode {
// Explicit user overrides first
if self.quiet { return .quiet }
if self.verbose || self.debugTerminal { return .verbose }
@ -144,13 +153,7 @@ struct AgentCommand: RuntimeOptionsConfigurable {
}
@RuntimeStorage private var runtime: CommandRuntime?
var runtimeOptions: CommandRuntimeOptions = {
var options = CommandRuntimeOptions()
// Remote GUI bridge mode is optional and can fail to expose auth state.
// Keep agent execution local by default unless an explicit runtime option overrides it.
options.preferRemote = false
return options
}()
var runtimeOptions = CommandRuntimeOptions()
private var resolvedRuntime: CommandRuntime {
guard let runtime else {
@ -164,12 +167,90 @@ struct AgentCommand: RuntimeOptionsConfigurable {
self.resolvedRuntime.services
}
var jsonOutput: Bool {
self.runtime?.configuration.jsonOutput ?? self.runtimeOptions.jsonOutput
private var logger: Logger {
self.resolvedRuntime.logger
}
var verbose: Bool {
self.runtime?.configuration.verbose ?? self.runtimeOptions.verbose
var jsonOutput: Bool { self.runtime?.configuration.jsonOutput ?? self.runtimeOptions.jsonOutput }
var verbose: Bool { self.runtime?.configuration.verbose ?? self.runtimeOptions.verbose }
}
@MainActor
private final class TerminalModeGuard {
private let fd: Int32
private var original = termios()
private var active = false
init?(fd: Int32 = STDIN_FILENO) {
guard isatty(fd) == 1 else { return nil }
guard tcgetattr(fd, &self.original) == 0 else { return nil }
var raw = self.original
cfmakeraw(&raw)
raw.c_lflag |= tcflag_t(ISIG) // keep signals like Ctrl+C enabled
guard tcsetattr(fd, TCSANOW, &raw) == 0 else { return nil }
self.fd = fd
self.active = true
}
var fileDescriptor: Int32 { self.fd }
func restore() {
guard self.active else { return }
_ = tcsetattr(self.fd, TCSANOW, &self.original)
self.active = false
}
@MainActor
deinit {
self.restore()
}
}
final class EscapeKeyMonitor {
private var source: (any DispatchSourceRead)?
private var terminalGuard: TerminalModeGuard?
private let handler: @Sendable () async -> Void
private let queue = DispatchQueue(label: "peekaboo.escape.monitor")
init(handler: @escaping @Sendable () async -> Void) {
self.handler = handler
}
func start() {
guard self.source == nil else { return }
guard let termGuard = TerminalModeGuard() else { return }
let fd = termGuard.fileDescriptor
let handler = self.handler
let source = DispatchSource.makeReadSource(fileDescriptor: fd, queue: self.queue)
source.setEventHandler {
var buffer = [UInt8](repeating: 0, count: 16)
let count = read(fd, &buffer, buffer.count)
guard count > 0 else { return }
if buffer[..<count].contains(0x1B) {
Task.detached(priority: .userInitiated) {
await handler()
}
}
}
source.setCancelHandler {
termGuard.restore()
}
source.resume()
self.source = source
self.terminalGuard = termGuard
}
func stop() {
self.source?.cancel()
self.source = nil
self.terminalGuard = nil
}
}
@ -177,7 +258,7 @@ struct AgentCommand: RuntimeOptionsConfigurable {
extension AgentCommand {
@MainActor
mutating func run() async throws {
let runtime = await CommandRuntime.makeDefaultAsync(options: self.runtimeOptions)
let runtime = CommandRuntime.makeDefault()
try await self.run(using: runtime)
}
@ -211,84 +292,46 @@ extension AgentCommand {
let services = runtime.services
let requestedModel: LanguageModel?
do {
requestedModel = try self.validatedModelSelection(configuration: services.configuration)
} catch {
self.printAgentExecutionError(error.localizedDescription)
throw ExitCode.failure
}
let configuredAIService = PeekabooAIService(configuration: services.configuration)
let existingAgent = services.agent as? PeekabooAgentService
let mutationCoordinator = runtime.toolSnapshotMutationCoordinator
existingAgent?.configureSnapshotMutationCoordinator(mutationCoordinator)
let existingAgentModel = existingAgent.flatMap {
configuredAIService.resolveConfiguredModel($0.defaultModelSelection) ??
LanguageModel.parse(from: $0.defaultModelSelection)
}
let selectedModel = requestedModel ??
self.implicitToolModel(
from: configuredAIService,
configuration: services.configuration,
existingAgentModel: existingAgentModel
)
if self.listSessions {
let listingModel = selectedModel ?? existingAgentModel ?? .anthropic(.opus48)
let agentService: any AgentServiceProtocol = if let existing = existingAgent {
existing
} else {
try PeekabooAgentService(
services: services,
defaultModel: listingModel,
snapshotMutationCoordinator: mutationCoordinator
)
}
try await self.showSessions(agentService)
return
}
guard let selectedModel else {
guard let agentService = services.agent else {
self.emitAgentUnavailableMessage()
return
}
guard self.hasCredentials(for: selectedModel) || self.isLocalModel(selectedModel) else {
if requestedModel != nil {
let providerName = self.providerDisplayName(for: selectedModel)
let envVar = self.providerEnvironmentVariable(for: selectedModel)
self.printAgentExecutionError(
"Missing API key for \(providerName). Set \(envVar) and retry."
)
} else {
self.emitAgentUnavailableMessage()
}
return
}
let agentService: any AgentServiceProtocol = if let existing = existingAgent {
existing
} else {
try PeekabooAgentService(
services: services,
defaultModel: selectedModel,
snapshotMutationCoordinator: mutationCoordinator
)
}
let terminalCapabilities = TerminalDetector.detectCapabilities()
if self.debugTerminal {
self.printTerminalDetectionDebug(terminalCapabilities, actualMode: self.outputMode)
}
if self.listSessions {
try await self.showSessions(agentService)
return
}
guard self.hasConfiguredAIProvider(configuration: services.configuration) else {
self.emitAgentUnavailableMessage()
return
}
let shouldSuppressMCPLogs = !self.verbose && !self.debugTerminal
self.configureLogging(suppressingMCPLogs: shouldSuppressMCPLogs)
// Warm up MCP servers off the main actor so chat can start immediately.
Task.detached(priority: .utility) {
await Self.initializeMCP()
}
guard let peekabooAgent = agentService as? PeekabooAgentService else {
throw PeekabooError.commandFailed("Agent service not properly initialized")
}
guard self.ensureAgentHasCredentials(selectedModel: selectedModel) else {
let requestedModel: LanguageModel?
do {
requestedModel = try self.validatedModelSelection()
} catch {
self.printAgentExecutionError(error.localizedDescription)
return
}
guard await self.ensureAgentHasCredentials(peekabooAgent, requestedModel: requestedModel) else {
return
}
@ -306,7 +349,7 @@ extension AgentCommand {
queueMode = try self.resolvedQueueMode()
} catch {
self.printAgentExecutionError(error.localizedDescription)
throw ExitCode.failure
return
}
switch chatPolicy.strategy(for: chatContext) {
@ -326,12 +369,7 @@ extension AgentCommand {
break
}
if try await self.handleSessionResumption(
peekabooAgent,
requestedModel: requestedModel,
maxSteps: self.maxSteps ?? 100,
queueMode: queueMode
) {
if try await self.handleSessionResumption(peekabooAgent, requestedModel: requestedModel) {
return
}
@ -369,14 +407,517 @@ extension AgentCommand {
}
}
func emitAgentUnavailableMessage() {
private static func initializeMCP() async {
if ProcessInfo.processInfo.environment["PEEKABOO_ENABLE_BROWSER_MCP"] == "1" {
let defaultChromeDevTools = ChromeDevToolsServerFactory.tachikomaConfig(timeout: 60.0, autoReconnect: true)
TachikomaMCPClientManager.shared.registerDefaultServers(
[defaultMCPServerName: defaultChromeDevTools])
}
await TachikomaMCPClientManager.shared.initializeFromProfile()
}
private func ensureAgentHasCredentials(
_ peekabooAgent: PeekabooAgentService,
requestedModel: LanguageModel?
) async -> Bool {
if let requestedModel {
if self.hasCredentials(for: requestedModel) {
return true
}
let providerName = self.providerDisplayName(for: requestedModel)
let envVar = self.providerEnvironmentVariable(for: requestedModel)
self.printAgentExecutionError(
"Missing API key for \(providerName). Set \(envVar) and retry."
)
return false
}
let hasCredential = await peekabooAgent.maskedApiKey != nil
if !hasCredential {
self.emitAgentUnavailableMessage()
}
return hasCredential
}
private func handleSessionResumption(
_ agentService: PeekabooAgentService,
requestedModel: LanguageModel?
) async throws -> Bool {
if let sessionId = self.resumeSession {
guard let continuationTask = self.task else {
self.printMissingTaskError(
message: "Task argument required when resuming session",
usage: "Usage: peekaboo agent --resume-session <session-id> \"<continuation-task>\""
)
return true
}
try await self.resumeAgentSession(
agentService,
sessionId: sessionId,
task: continuationTask,
requestedModel: requestedModel
)
return true
}
if self.resume {
guard let continuationTask = self.task else {
self.printMissingTaskError(
message: "Task argument required when resuming",
usage: "Usage: peekaboo agent --resume \"<continuation-task>\""
)
return true
}
let sessions = try await agentService.listSessions()
if let mostRecent = sessions.first {
try await self.resumeAgentSession(
agentService,
sessionId: mostRecent.id,
task: continuationTask,
requestedModel: requestedModel
)
} else {
if self.jsonOutput {
let error = ["success": false, "error": "No sessions found to resume"] as [String: Any]
let jsonData = try JSONSerialization.data(withJSONObject: error, options: .prettyPrinted)
print(String(data: jsonData, encoding: .utf8) ?? "{}")
} else {
print("\(TerminalColor.red)Error: No sessions found to resume\(TerminalColor.reset)")
}
}
return true
}
return false
}
func printMissingTaskError(message: String, usage: String) {
if self.jsonOutput {
let error = ["success": false, "error": message] as [String: Any]
if let jsonData = try? JSONSerialization.data(withJSONObject: error, options: .prettyPrinted),
let jsonString = String(data: jsonData, encoding: .utf8) {
print(jsonString)
} else {
print("{\"success\":false,\"error\":\"\(message)\"}")
}
} else {
print("\(TerminalColor.red)Error: \(message)\(TerminalColor.reset)")
if !usage.isEmpty {
print(usage)
}
}
}
/// Render the agent execution result using either JSON output or a rich CLI transcript.
@MainActor
func displayResult(_ result: AgentExecutionResult, delegate: AgentOutputDelegate? = nil) {
if self.jsonOutput {
let response = [
"success": true,
"result": [
"content": result.content,
"sessionId": result.sessionId as Any,
"toolCalls": result.messages.flatMap { message in
message.content.compactMap { content in
if case let .toolCall(toolCall) = content {
return [
"id": toolCall.id,
"name": toolCall.name,
"arguments": String(describing: toolCall.arguments)
]
}
return nil
}
},
"metadata": [
"executionTime": result.metadata.executionTime,
"toolCallCount": result.metadata.toolCallCount,
"modelName": result.metadata.modelName
],
"usage": result.usage.map { usage in
[
"inputTokens": usage.inputTokens,
"outputTokens": usage.outputTokens,
"totalTokens": usage.totalTokens
]
} as Any
]
] as [String: Any]
if let jsonData = try? JSONSerialization.data(withJSONObject: response, options: .prettyPrinted) {
print(String(data: jsonData, encoding: .utf8) ?? "{}")
}
} else if self.outputMode == .quiet {
// Quiet mode - only show final result
print(result.content)
}
delegate?.showFinalSummaryIfNeeded(result)
}
// MARK: - Session Management
@MainActor
func showSessions(_ agentService: any AgentServiceProtocol) async throws {
guard let peekabooService = agentService as? PeekabooAgentService else {
throw PeekabooError.commandFailed("Agent service not properly initialized")
}
let sessionSummaries = try await peekabooService.listSessions()
let sessions = sessionSummaries.map { summary in
AgentSessionInfo(
id: summary.id,
task: summary.summary ?? "Unknown task",
created: summary.createdAt,
lastModified: summary.lastAccessedAt,
messageCount: summary.messageCount
)
}
guard !sessions.isEmpty else {
self.printNoAgentSessions()
return
}
if self.jsonOutput {
self.printSessionsJSON(sessions)
} else {
self.printSessionsList(sessions)
}
}
private func printNoAgentSessions() {
if self.jsonOutput {
let response = ["success": true, "sessions": []] as [String: Any]
let jsonData = try? JSONSerialization.data(withJSONObject: response, options: .prettyPrinted)
print(String(data: jsonData ?? Data(), encoding: .utf8) ?? "{}")
} else {
print("No agent sessions found.")
}
}
private func printSessionsJSON(_ sessions: [AgentSessionInfo]) {
let sessionData = sessions.map { session in
[
"id": session.id,
"createdAt": ISO8601DateFormatter().string(from: session.created),
"updatedAt": ISO8601DateFormatter().string(from: session.lastModified),
"messageCount": session.messageCount
]
}
let response = ["success": true, "sessions": sessionData] as [String: Any]
if let jsonData = try? JSONSerialization.data(withJSONObject: response, options: .prettyPrinted) {
print(String(data: jsonData, encoding: .utf8) ?? "{}")
}
}
private func printSessionsList(_ sessions: [AgentSessionInfo]) {
let headerLine = [
"\(TerminalColor.cyan)\(TerminalColor.bold)Agent Sessions:\(TerminalColor.reset)",
"\n"
].joined()
print(headerLine)
let dateFormatter = DateFormatter()
dateFormatter.dateStyle = .short
dateFormatter.timeStyle = .short
for (index, session) in sessions.prefix(10).indexed() {
self.printSessionLine(index: index, session: session, dateFormatter: dateFormatter)
if index < sessions.count - 1 {
print()
}
}
if sessions.count > 10 {
print([
"\n",
"\(TerminalColor.dim)... and \(sessions.count - 10) more sessions\(TerminalColor.reset)"
].joined())
}
let resumeHintLine = [
"\n",
"\(TerminalColor.dim)To resume: peekaboo agent --resume <session-id>",
" \"<continuation>\"\(TerminalColor.reset)"
].joined()
print(resumeHintLine)
}
private func printSessionLine(index: Int, session: AgentSessionInfo, dateFormatter: DateFormatter) {
let timeAgo = formatTimeAgo(session.lastModified)
let sessionLine = [
"\(TerminalColor.blue)\(index + 1).\(TerminalColor.reset)",
" ",
"\(TerminalColor.bold)\(session.id.prefix(8))\(TerminalColor.reset)"
].joined()
print(sessionLine)
print(" Messages: \(session.messageCount)")
print(" Last activity: \(timeAgo)")
}
func resumeAgentSession(
_ agentService: PeekabooAgentService,
sessionId: String,
task: String,
requestedModel: LanguageModel?
) async throws {
if !self.jsonOutput {
let resumingLine = [
"\(TerminalColor.cyan)\(TerminalColor.bold)",
"\(AgentDisplayTokens.Status.info)",
" Resuming session \(sessionId.prefix(8))...",
"\(TerminalColor.reset)",
"\n"
].joined()
print(resumingLine)
}
let outputDelegate = self.makeDisplayDelegate(for: task)
let streamingDelegate = self.makeStreamingDelegate(using: outputDelegate)
do {
let result = try await agentService.resumeSession(
sessionId: sessionId,
model: requestedModel,
eventDelegate: streamingDelegate
)
self.displayResult(result, delegate: outputDelegate)
} catch {
self.printAgentExecutionError("Failed to resume session: \(error.localizedDescription)")
throw error
}
}
func makeDisplayDelegate(for task: String) -> AgentOutputDelegate? {
guard !self.jsonOutput, !self.quiet else { return nil }
return AgentOutputDelegate(outputMode: self.outputMode, jsonOutput: self.jsonOutput, task: task)
}
func makeStreamingDelegate(using displayDelegate: AgentOutputDelegate?) -> (any AgentEventDelegate)? {
if let displayDelegate {
return displayDelegate
}
if self.jsonOutput || self.quiet {
return SilentAgentEventDelegate()
}
return nil
}
final class SilentAgentEventDelegate: AgentEventDelegate {
func agentDidEmitEvent(_ event: AgentEvent) {}
}
func printAgentExecutionError(_ message: String) {
if self.jsonOutput {
let error: [String: Any] = ["success": false, "error": message]
if let jsonData = try? JSONSerialization.data(withJSONObject: error, options: .prettyPrinted),
let jsonString = String(data: jsonData, encoding: .utf8) {
print(jsonString)
} else {
print("{\"success\":false,\"error\":\"\(message)\"}")
}
} else {
print("\(TerminalColor.red)Error: \(message)\(TerminalColor.reset)")
}
}
func executeAgentTask(
_ agentService: PeekabooAgentService,
task: String,
requestedModel: LanguageModel?,
maxSteps: Int,
queueMode: QueueMode
) async throws -> AgentExecutionResult {
let outputDelegate = self.makeDisplayDelegate(for: task)
let streamingDelegate = self.makeStreamingDelegate(using: outputDelegate)
do {
let result = try await agentService.executeTask(
task,
maxSteps: maxSteps,
sessionId: nil,
model: requestedModel,
dryRun: self.dryRun,
queueMode: queueMode,
eventDelegate: streamingDelegate,
verbose: self.verbose
)
self.displayResult(result, delegate: outputDelegate)
let duration = String(format: "%.2f", result.metadata.executionTime)
let sessionId = result.sessionId ?? "none"
let finalTokens = result.usage?.totalTokens ?? 0
let status = result.metadata.context["status"] ?? "completed"
AutomationEventLogger.log(
.agent,
"result status=\(status) task='\(task)' model=\(result.metadata.modelName) duration=\(duration)s "
+ "tools=\(result.metadata.toolCallCount) dry_run=\(self.dryRun) "
+ "session=\(sessionId) tokens=\(finalTokens)"
)
return result
} catch {
self.printAgentExecutionError("Agent execution failed: \(error.localizedDescription)")
throw error
}
}
private var normalizedTaskInput: String? {
guard let task else { return nil }
let trimmed = task.trimmingCharacters(in: .whitespacesAndNewlines)
return trimmed.isEmpty ? nil : trimmed
}
private var hasTaskInput: Bool {
self.normalizedTaskInput != nil || self.audio || self.audioFile != nil
}
var resolvedMaxSteps: Int { self.maxSteps ?? 100 }
private func resolvedQueueMode() throws -> QueueMode {
guard let raw = self.queueMode?.trimmingCharacters(in: .whitespacesAndNewlines), !raw.isEmpty else {
return .oneAtATime
}
switch raw.lowercased() {
case "one", "one-at-a-time", "single", "sequential", "1":
return .oneAtATime
case "all", "batch", "together":
return .all
default:
throw PeekabooError.invalidInput("Invalid queue mode '\(raw)'. Use one-at-a-time or all.")
}
}
func printChatWelcome(sessionId: String?, modelDescription: String, queueMode: QueueMode) {
guard !self.quiet else { return }
let header = [
TerminalColor.cyan,
TerminalColor.bold,
"Interactive agent chat",
TerminalColor.reset,
" model: ",
modelDescription,
" • queue: ",
queueMode == .all ? "all" : "one-at-a-time"
].joined()
print(header)
if let sessionId {
print("\(TerminalColor.dim)Resuming session \(sessionId.prefix(8))\(TerminalColor.reset)")
} else {
print("\(TerminalColor.dim)A new session will be created on the first prompt\(TerminalColor.reset)")
}
print()
}
func printChatHelpIntro() {
guard !self.quiet else { return }
print("Type /help for chat commands (Ctrl+C to exit).")
self.printChatHelpMenu()
}
func printChatHelpMenu() {
guard !self.quiet else { return }
self.chatHelpLines.forEach { print($0) }
}
private var chatHelpText: String {
"""
Chat commands:
Type any prompt and press Return to run it.
/help Show this menu again.
Esc Cancel the active run (if one is in progress).
Ctrl+C Cancel when running; exit immediately when idle.
Ctrl+D Exit when idle (EOF).
"""
}
var chatHelpLines: [String] {
self.chatHelpText
.split(separator: "\n", omittingEmptySubsequences: false)
.map(String.init)
}
private func printCapabilityFlag(_ label: String, supported: Bool, detail: String? = nil) {
let status = supported ? AgentDisplayTokens.Status.success : AgentDisplayTokens.Status.failure
let detailSuffix = detail.map { " (\($0))" } ?? ""
print("\(label): \(status)\(detailSuffix)")
}
/// Print detailed terminal detection debugging information
func printTerminalDetectionDebug(_ capabilities: TerminalCapabilities, actualMode: OutputMode) {
// Print detailed terminal detection debugging information
print("\n" + String(repeating: "=", count: 60))
print("\(TerminalColor.bold)\(TerminalColor.cyan)TERMINAL DETECTION DEBUG (-vv)\(TerminalColor.reset)")
print(String(repeating: "=", count: 60))
// Basic terminal info
print("[term] \(TerminalColor.bold)Terminal Type:\(TerminalColor.reset) \(capabilities.termType ?? "unknown")")
print(
"[size] \(TerminalColor.bold)Dimensions:\(TerminalColor.reset) \(capabilities.width)x\(capabilities.height)"
)
// Capability flags
print("\(AgentDisplayTokens.Status.running) \(TerminalColor.bold)Capabilities:\(TerminalColor.reset)")
self.printCapabilityFlag("Interactive", supported: capabilities.isInteractive, detail: "isatty check")
self.printCapabilityFlag("Colors", supported: capabilities.supportsColors, detail: "ANSI support")
self.printCapabilityFlag("True Color", supported: capabilities.supportsTrueColor, detail: "24-bit")
print(" • Dimensions: \(capabilities.width)x\(capabilities.height)")
// Environment info
print("[env] \(TerminalColor.bold)Environment:\(TerminalColor.reset)")
self.printCapabilityFlag("CI Environment", supported: capabilities.isCI)
self.printCapabilityFlag("Piped Output", supported: capabilities.isPiped)
// Environment variables
let env = ProcessInfo.processInfo.environment
print("\(AgentDisplayTokens.Status.running) \(TerminalColor.bold)Environment Variables:\(TerminalColor.reset)")
print(" • TERM: \(env["TERM"] ?? "not set")")
print(" • COLORTERM: \(env["COLORTERM"] ?? "not set")")
print(" • NO_COLOR: \(env["NO_COLOR"] != nil ? "set" : "not set")")
print(" • FORCE_COLOR: \(env["FORCE_COLOR"] ?? "not set")")
print(" • PEEKABOO_OUTPUT_MODE: \(env["PEEKABOO_OUTPUT_MODE"] ?? "not set")")
// Recommended vs actual mode
let recommendedMode = capabilities.recommendedOutputMode
print("[focus] \(TerminalColor.bold)Recommended Mode:\(TerminalColor.reset) \(recommendedMode.description)")
print("[focus] \(TerminalColor.bold)Actual Mode:\(TerminalColor.reset) \(actualMode.description)")
if recommendedMode != actualMode {
let modeOverrideLine = [
"\(AgentDisplayTokens.Status.warning) ",
"\(TerminalColor.yellow)Mode Override Detected\(TerminalColor.reset)",
" - explicit flag or environment variable used"
].joined()
print(modeOverrideLine)
}
// Show decision logic
if !capabilities.isInteractive || capabilities.isCI || capabilities.isPiped {
print(" → Minimal mode (non-interactive/CI/piped)")
} else if capabilities.supportsColors {
print(" → Enhanced mode (colors available)")
} else {
print(" → Compact mode (basic terminal)")
}
print(String(repeating: "=", count: 60) + "\n")
}
private func hasConfiguredAIProvider(configuration: PeekabooCore.ConfigurationManager) -> Bool {
let hasOpenAI = configuration.getOpenAIAPIKey()?.isEmpty == false
let hasAnthropic = configuration.getAnthropicAPIKey()?.isEmpty == false
return hasOpenAI || hasAnthropic
}
private func emitAgentUnavailableMessage() {
if self.jsonOutput {
let message = "Agent service not available. Please set OPENAI_API_KEY, ANTHROPIC_API_KEY, " +
"GEMINI_API_KEY, X_AI_API_KEY, MINIMAX_API_KEY, MINIMAX_CN_API_KEY, OPENROUTER_API_KEY, " +
"or configure ollama/<model>, lmstudio/<model>, or a custom provider."
let error = [
"success": false,
"error": message
"error": "Agent service not available. Please set OPENAI_API_KEY environment variable."
] as [String: Any]
if let jsonData = try? JSONSerialization.data(withJSONObject: error, options: .prettyPrinted),
let jsonString = String(data: jsonData, encoding: .utf8) {
@ -387,14 +928,115 @@ extension AgentCommand {
} else {
let errorPrefix = [
"\(TerminalColor.red)Error: Agent service not available.",
" Please set OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY, X_AI_API_KEY,",
" MINIMAX_API_KEY, MINIMAX_CN_API_KEY, OPENROUTER_API_KEY,",
" or configure ollama/<model>, lmstudio/<model>, or a custom provider."
" Please set OPENAI_API_KEY environment variable."
].joined()
let errorMessageLine = [errorPrefix, "\(TerminalColor.reset)"].joined()
print(errorMessageLine)
}
}
// MARK: - Model Parsing
func parseModelString(_ modelString: String) -> LanguageModel? {
let trimmed = modelString.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return nil }
guard let parsed = LanguageModel.parse(from: trimmed) else {
return nil
}
switch parsed {
case let .openai(model):
if Self.supportedOpenAIInputs.contains(model) {
return .openai(.gpt51)
}
case let .anthropic(model):
if Self.supportedAnthropicInputs.contains(model) {
return .anthropic(.sonnet45)
}
default:
break
}
return nil
}
func validatedModelSelection() throws -> LanguageModel? {
guard let modelString = self.model else { return nil }
guard let parsed = self.parseModelString(modelString) else {
throw PeekabooError.invalidInput(
"Unsupported model '\(modelString)'. Allowed values: \(Self.allowedModelList)"
)
}
return parsed
}
private static let supportedOpenAIInputs: Set<LanguageModel.OpenAI> = [
.gpt51,
.gpt51Mini,
.gpt51Nano,
.gpt5,
.gpt5Pro,
.gpt5Mini,
.gpt5Nano,
.gpt5Thinking,
.gpt5ThinkingMini,
.gpt5ThinkingNano,
.gpt5ChatLatest,
.gpt4o,
.gpt4oMini,
.gpt4oRealtime,
.o4Mini,
]
private static let supportedAnthropicInputs: Set<LanguageModel.Anthropic> = [
.sonnet45,
.sonnet4,
.sonnet4Thinking,
.opus4,
.opus4Thinking,
]
private static var allowedModelList: String {
let openAIModels = Self.supportedOpenAIInputs.map(\.modelId)
let anthropicModels = Self.supportedAnthropicInputs.map(\.modelId)
return (openAIModels + anthropicModels).sorted().joined(separator: ", ")
}
@MainActor
private func hasCredentials(for model: LanguageModel) -> Bool {
let configuration = self.services.configuration
switch model {
case .openai:
return configuration.getOpenAIAPIKey()?.isEmpty == false
case .anthropic:
return configuration.getAnthropicAPIKey()?.isEmpty == false
default:
return false
}
}
private func providerDisplayName(for model: LanguageModel) -> String {
switch model {
case .openai:
"OpenAI"
case .anthropic:
"Anthropic"
default:
"the selected provider"
}
}
private func providerEnvironmentVariable(for model: LanguageModel) -> String {
switch model {
case .openai:
"OPENAI_API_KEY"
case .anthropic:
"ANTHROPIC_API_KEY"
default:
"provider API key"
}
}
}
extension AgentCommand: ParsableCommand {}

View File

@ -1,402 +0,0 @@
//
// AgentOutputDelegate+Formatting.swift
// Peekaboo
//
import Foundation
import PeekabooCore
@available(macOS 14.0, *)
extension AgentOutputDelegate {
// MARK: - Helper Methods
func shouldSkipCommunicationOutput(for toolType: ToolType?) -> Bool {
guard let toolType else { return false }
return [ToolType.taskCompleted, .needMoreInformation, .needInfo].contains(toolType)
}
func printToolCallStart(
displayName: String,
args: [String: Any],
rawArguments: String,
formatter: any ToolFormatter
) {
let sanitizedName = self.cleanToolPrefix(displayName)
switch self.outputMode {
case .minimal:
print(sanitizedName, terminator: "")
case .verbose:
print("\(TerminalColor.blue)\(TerminalColor.bold)\(sanitizedName)\(TerminalColor.reset)")
if rawArguments.isEmpty || rawArguments == "{}" {
print("\(TerminalColor.gray)Arguments: (none)\(TerminalColor.reset)")
} else if let formatted = formatJSON(rawArguments) {
print("\(TerminalColor.gray)Arguments:\(TerminalColor.reset)")
print(formatted)
}
case .enhanced:
let startMessage = self.cleanToolPrefix(formatter.formatStarting(arguments: args))
print(
"\(TerminalColor.blue)\(TerminalColor.bold)\(startMessage)\(TerminalColor.reset)",
terminator: ""
)
default: // .normal, .compact
print(
"\(TerminalColor.blue)\(TerminalColor.bold)\(sanitizedName)\(TerminalColor.reset)",
terminator: ""
)
let summary = formatter.formatCompactSummary(arguments: args)
if !summary.isEmpty {
print(" \(TerminalColor.gray)\(summary)\(TerminalColor.reset)", terminator: "")
}
}
fflush(stdout)
}
/// Remove leading glyph tokens like "[sh]" from tool narration so agent output reads naturally.
func cleanToolPrefix(_ text: String) -> String {
var result = text.trimmingCharacters(in: .whitespacesAndNewlines)
while result.hasPrefix("[") {
guard let closing = result.firstIndex(of: "]") else { break }
let next = result.index(after: closing)
result = String(result[next...]).trimmingCharacters(in: .whitespacesAndNewlines)
}
return result
}
func successStatusLine(resultSummary: String, durationString: String) -> String {
if resultSummary.isEmpty {
return " \(durationString)"
}
let summarySegment = [
" ",
TerminalColor.bold,
resultSummary,
TerminalColor.reset
].joined()
return "\(summarySegment)\(durationString)"
}
func failureStatusLine(message: String, durationString: String) -> String {
let statusPrefix = [
" ",
TerminalColor.red,
AgentDisplayTokens.Status.failure
].joined()
return [
statusPrefix,
" ",
message,
TerminalColor.reset,
durationString
].joined()
}
func completionSummaryLine(totalElapsed: TimeInterval, toolsText: String, tokenInfo: String) -> String {
let summaryPrefix = "\(TerminalColor.gray)Task completed in \(formatDuration(totalElapsed))"
return [
"\n",
summaryPrefix,
" with \(toolsText)\(tokenInfo)",
TerminalColor.reset
].joined()
}
func durationString(for toolName: String) -> String {
if let startTime = self.toolStartTimes[toolName] {
self.toolStartTimes.removeValue(forKey: toolName)
let elapsed = Date().timeIntervalSince(startTime)
return " \(TerminalColor.gray)(\(formatDuration(elapsed)))\(TerminalColor.reset)"
}
return ""
}
func printInvalidResult(rawResult: String, durationString: String) {
if self.outputMode == .verbose {
let failureBadge = [
" ",
TerminalColor.red,
AgentDisplayTokens.Status.failure
].joined()
let invalidJsonMessage = [
failureBadge,
" Invalid JSON result",
TerminalColor.reset,
durationString
].joined()
print(invalidJsonMessage)
let rawResultLine = [
TerminalColor.gray,
"Raw result: \(rawResult.prefix(200))",
TerminalColor.reset
].joined()
print(rawResultLine)
} else {
let failureBadge = [
" ",
TerminalColor.red,
AgentDisplayTokens.Status.failure
].joined()
let invalidResultMessage = [
failureBadge,
" Invalid result",
TerminalColor.reset,
durationString
].joined()
print(invalidResultMessage)
}
}
func toolFormatter(for name: String) -> (any ToolFormatter, ToolType?) {
if let type = ToolType(rawValue: name) {
return (ToolFormatterRegistry.shared.formatter(for: type), type)
}
return (UnknownToolFormatter(toolName: name), nil)
}
/// Produce a compact diff summary between previous and new arguments for the same tool name.
func diffSummary(for toolName: String, newArgs: [String: Any]) -> String? {
guard let previous = self.lastToolArguments[toolName] else { return nil }
var changes: [String] = []
for (key, newValue) in newArgs {
guard let prevValue = previous[key] else {
changes.append("+\(key)")
continue
}
if !self.valuesEqual(prevValue, newValue) {
let rendered = self.renderValue(newValue)
changes.append("\(key): \(rendered)")
}
if changes.count >= 3 { break }
}
if changes.isEmpty {
return nil
}
return changes.joined(separator: ", ")
}
func valuesEqual(_ lhs: Any, _ rhs: Any) -> Bool {
switch (lhs, rhs) {
case let (l as String, r as String): l == r
case let (l as Int, r as Int): l == r
case let (l as Double, r as Double): l == r
case let (l as Bool, r as Bool): l == r
default:
false
}
}
func dictionariesEqual(_ lhs: [String: Any], _ rhs: [String: Any]) -> Bool {
guard lhs.count == rhs.count else { return false }
for (key, lval) in lhs {
guard let rval = rhs[key], self.valuesEqual(lval, rval) else { return false }
}
return true
}
func renderValue(_ value: Any) -> String {
switch value {
case let str as String:
let max = 40
if str.count > max {
let idx = str.index(str.startIndex, offsetBy: max)
return String(str[..<idx]) + ""
}
return str
case let num as Int: return String(num)
case let num as Double: return String(format: "%.3f", num)
case let bool as Bool: return bool ? "true" : "false"
default:
if let data = try? JSONSerialization.data(withJSONObject: ["v": value], options: []),
let text = String(data: data, encoding: .utf8) {
return text.replacingOccurrences(of: "{\"v\":", with: "")
.trimmingCharacters(in: CharacterSet(charactersIn: "}"))
}
return ""
}
}
func resultSummary(
for name: String,
json: [String: Any],
formatter: any ToolFormatter,
summary: ToolEventSummary?
) -> String {
if let summaryText = summary?.shortDescription(toolName: name) {
return summaryText
}
var fallback = formatter.formatResultSummary(result: json)
guard name == "app" else {
return self.cleanToolPrefix(fallback)
}
if let meta = json["meta"] as? [String: Any],
let appName = meta["app_name"] as? String,
let content = json["content"] as? [[String: Any]],
let firstContent = content.first,
let text = firstContent["text"] as? String {
switch text {
case let value where value.contains("Launched"):
fallback = "\(appName) launched"
case let value where value.contains("Quit"):
fallback = "\(appName) quit"
case let value where value.contains("Focused") || value.contains("Switched"):
fallback = "\(appName) focused"
case let value where value.contains("Hidden"):
fallback = "\(appName) hidden"
case let value where value.contains("Unhidden"):
fallback = "\(appName) shown"
default:
break
}
}
return self.cleanToolPrefix(fallback)
}
func handleSuccess(
resultSummary: String,
durationString: String,
result: String,
json: [String: Any]
) {
switch self.outputMode {
case .minimal:
let prefix = resultSummary.isEmpty ? "" : " \(resultSummary)"
print("\(prefix)\(durationString)")
case .verbose:
print(" \(durationString)")
if let formatted = formatJSON(result) {
print("\(TerminalColor.gray)Result:\(TerminalColor.reset)")
print(formatted)
}
default:
print(self.successStatusLine(resultSummary: resultSummary, durationString: durationString))
self.printResultDetails(from: json)
}
}
func handleFailure(message: String, durationString: String, json: [String: Any], tool: String) {
if self.outputMode == .minimal {
print(" FAILED\(durationString)")
} else {
print(self.failureStatusLine(message: message, durationString: durationString))
}
self.displayEnhancedError(tool: tool, json: json)
}
func handleCommunicationToolComplete(name: String, toolType: ToolType) {
if self.outputMode == .verbose {
let toolName = toolType.rawValue
.replacingOccurrences(of: "_", with: " ")
.capitalized
print("\n\(AgentDisplayTokens.Status.success) \(toolName) completed")
}
}
func displayEnhancedError(tool: String, json: [String: Any]) {
guard self.outputMode != .minimal && self.outputMode != .quiet else { return }
if let error = json["error"] as? String {
print(" \(TerminalColor.gray)Error: \(error)\(TerminalColor.reset)")
}
if let suggestion = json["suggestion"] as? String {
print(" \(TerminalColor.yellow)💡 Suggestion: \(suggestion)\(TerminalColor.reset)")
}
if self.outputMode == .verbose,
let details = json["details"] as? [String: Any],
let formatted = try? JSONSerialization.data(withJSONObject: details, options: .prettyPrinted),
let detailsStr = String(data: formatted, encoding: .utf8) {
print(" \(TerminalColor.gray)Details:\(TerminalColor.reset)")
print(detailsStr)
}
}
func printResultDetails(from json: [String: Any]) {
guard self.outputMode != .minimal && self.outputMode != .quiet else { return }
guard let detail = self.primaryResultMessage(from: json) else { return }
let snippet = detail.trimmingCharacters(in: .whitespacesAndNewlines)
let sanitized = self.cleanToolPrefix(snippet)
guard !sanitized.isEmpty else { return }
print("\n \(TerminalColor.gray)\(sanitized.prefix(240))\(TerminalColor.reset)")
}
func primaryResultMessage(from json: [String: Any]) -> String? {
if let message = json["message"] as? String, !message.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return message
}
if let content = json["content"] as? [[String: Any]] {
for item in content {
if let text = item["text"] as? String,
!text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return text
}
}
}
if let meta = json["meta"] as? [String: Any],
let message = meta["message"] as? String,
!message.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return message
}
return nil
}
}
// MARK: - Supporting Types
/// Formatter for unknown tools.
private class UnknownToolFormatter: BaseToolFormatter {
private let toolName: String
override nonisolated init(toolType: ToolType) {
fatalError("Use init(toolName:)")
}
init(toolName: String) {
self.toolName = toolName
// Use wait as the inert placeholder so unknown tools still get a formatter base.
super.init(toolType: .wait)
}
override nonisolated func formatStarting(arguments: [String: Any]) -> String {
"\(self.toolName.replacingOccurrences(of: "_", with: " ").capitalized)"
}
override nonisolated func formatCompleted(result: [String: Any], duration: TimeInterval) -> String {
"→ completed"
}
override nonisolated func formatError(error: String, result: [String: Any]) -> String {
"\(AgentDisplayTokens.Status.failure) \(error)"
}
override nonisolated func formatCompactSummary(arguments: [String: Any]) -> String {
""
}
override nonisolated func formatResultSummary(result: [String: Any]) -> String {
""
}
override nonisolated func formatForTitle(arguments: [String: Any]) -> String {
self.toolName
}
}

View File

@ -13,14 +13,14 @@ import Tachikoma
final class AgentOutputDelegate: PeekabooCore.AgentEventDelegate {
// MARK: - Properties
let outputMode: OutputMode
private let outputMode: OutputMode
private let jsonOutput: Bool
private let task: String?
// Tool tracking
private var currentTool: String?
var toolStartTimes: [String: Date] = [:]
var lastToolArguments: [String: [String: Any]] = [:]
private var toolStartTimes: [String: Date] = [:]
private var lastToolArguments: [String: [String: Any]] = [:]
private var toolCallCount = 0
private var totalTokens = 0
@ -66,9 +66,6 @@ extension AgentOutputDelegate {
case let .thinkingMessage(content):
self.handleThinkingMessage(content)
case .verificationCompleted, .desktopContextRefreshed:
break
case let .error(message):
self.handleError(message)
@ -322,4 +319,397 @@ extension AgentOutputDelegate {
))
self.hasShownFinalSummary = true
}
// MARK: - Helper Methods
private func shouldSkipCommunicationOutput(for toolType: ToolType?) -> Bool {
guard let toolType else { return false }
return [ToolType.taskCompleted, .needMoreInformation, .needInfo].contains(toolType)
}
private func printToolCallStart(
displayName: String,
args: [String: Any],
rawArguments: String,
formatter: any ToolFormatter
) {
let sanitizedName = self.cleanToolPrefix(displayName)
switch self.outputMode {
case .minimal:
print(sanitizedName, terminator: "")
case .verbose:
print("\(TerminalColor.blue)\(TerminalColor.bold)\(sanitizedName)\(TerminalColor.reset)")
if rawArguments.isEmpty || rawArguments == "{}" {
print("\(TerminalColor.gray)Arguments: (none)\(TerminalColor.reset)")
} else if let formatted = formatJSON(rawArguments) {
print("\(TerminalColor.gray)Arguments:\(TerminalColor.reset)")
print(formatted)
}
case .enhanced:
let startMessage = self.cleanToolPrefix(formatter.formatStarting(arguments: args))
print(
"\(TerminalColor.blue)\(TerminalColor.bold)\(startMessage)\(TerminalColor.reset)",
terminator: ""
)
default: // .normal, .compact
print(
"\(TerminalColor.blue)\(TerminalColor.bold)\(sanitizedName)\(TerminalColor.reset)",
terminator: ""
)
let summary = formatter.formatCompactSummary(arguments: args)
if !summary.isEmpty {
print(" \(TerminalColor.gray)\(summary)\(TerminalColor.reset)", terminator: "")
}
}
fflush(stdout)
}
/// Remove leading glyph tokens like "[sh]" from tool narration so agent output reads naturally.
private func cleanToolPrefix(_ text: String) -> String {
var result = text.trimmingCharacters(in: .whitespacesAndNewlines)
while result.hasPrefix("[") {
guard let closing = result.firstIndex(of: "]") else { break }
let next = result.index(after: closing)
result = String(result[next...]).trimmingCharacters(in: .whitespacesAndNewlines)
}
return result
}
private func successStatusLine(resultSummary: String, durationString: String) -> String {
if resultSummary.isEmpty {
return " \(durationString)"
}
let summarySegment = [
" ",
TerminalColor.bold,
resultSummary,
TerminalColor.reset
].joined()
return "\(summarySegment)\(durationString)"
}
private func failureStatusLine(message: String, durationString: String) -> String {
let statusPrefix = [
" ",
TerminalColor.red,
AgentDisplayTokens.Status.failure
].joined()
return [
statusPrefix,
" ",
message,
TerminalColor.reset,
durationString
].joined()
}
private func completionSummaryLine(totalElapsed: TimeInterval, toolsText: String, tokenInfo: String) -> String {
let summaryPrefix = "\(TerminalColor.gray)Task completed in \(formatDuration(totalElapsed))"
return [
"\n",
summaryPrefix,
" with \(toolsText)\(tokenInfo)",
TerminalColor.reset
].joined()
}
private func durationString(for toolName: String) -> String {
if let startTime = self.toolStartTimes[toolName] {
self.toolStartTimes.removeValue(forKey: toolName)
let elapsed = Date().timeIntervalSince(startTime)
return " \(TerminalColor.gray)(\(formatDuration(elapsed)))\(TerminalColor.reset)"
}
return ""
}
private func printInvalidResult(rawResult: String, durationString: String) {
if self.outputMode == .verbose {
let failureBadge = [
" ",
TerminalColor.red,
AgentDisplayTokens.Status.failure
].joined()
let invalidJsonMessage = [
failureBadge,
" Invalid JSON result",
TerminalColor.reset,
durationString
].joined()
print(invalidJsonMessage)
let rawResultLine = [
TerminalColor.gray,
"Raw result: \(rawResult.prefix(200))",
TerminalColor.reset
].joined()
print(rawResultLine)
} else {
let failureBadge = [
" ",
TerminalColor.red,
AgentDisplayTokens.Status.failure
].joined()
let invalidResultMessage = [
failureBadge,
" Invalid result",
TerminalColor.reset,
durationString
].joined()
print(invalidResultMessage)
}
}
private func toolFormatter(for name: String) -> (any ToolFormatter, ToolType?) {
if let type = ToolType(rawValue: name) {
return (ToolFormatterRegistry.shared.formatter(for: type), type)
}
return (UnknownToolFormatter(toolName: name), nil)
}
/// Produce a compact diff summary between previous and new arguments for the same tool name.
private func diffSummary(for toolName: String, newArgs: [String: Any]) -> String? {
guard let previous = self.lastToolArguments[toolName] else { return nil }
var changes: [String] = []
for (key, newValue) in newArgs {
guard let prevValue = previous[key] else {
changes.append("+\(key)")
continue
}
if !self.valuesEqual(prevValue, newValue) {
let rendered = self.renderValue(newValue)
changes.append("\(key): \(rendered)")
}
if changes.count >= 3 { break }
}
if changes.isEmpty {
return nil
}
return changes.joined(separator: ", ")
}
private func valuesEqual(_ lhs: Any, _ rhs: Any) -> Bool {
switch (lhs, rhs) {
case let (l as String, r as String): l == r
case let (l as Int, r as Int): l == r
case let (l as Double, r as Double): l == r
case let (l as Bool, r as Bool): l == r
default:
false
}
}
private func dictionariesEqual(_ lhs: [String: Any], _ rhs: [String: Any]) -> Bool {
guard lhs.count == rhs.count else { return false }
for (key, lval) in lhs {
guard let rval = rhs[key], self.valuesEqual(lval, rval) else { return false }
}
return true
}
private func renderValue(_ value: Any) -> String {
switch value {
case let str as String:
let max = 40
if str.count > max {
let idx = str.index(str.startIndex, offsetBy: max)
return String(str[..<idx]) + ""
}
return str
case let num as Int: return String(num)
case let num as Double: return String(format: "%.3f", num)
case let bool as Bool: return bool ? "true" : "false"
default:
if let data = try? JSONSerialization.data(withJSONObject: ["v": value], options: []),
let text = String(data: data, encoding: .utf8) {
return text.replacingOccurrences(of: "{\"v\":", with: "")
.trimmingCharacters(in: CharacterSet(charactersIn: "}"))
}
return ""
}
}
private func resultSummary(
for name: String,
json: [String: Any],
formatter: any ToolFormatter,
summary: ToolEventSummary?
) -> String {
if let summaryText = summary?.shortDescription(toolName: name) {
return summaryText
}
var fallback = formatter.formatResultSummary(result: json)
guard name == "app" else {
return self.cleanToolPrefix(fallback)
}
if let meta = json["meta"] as? [String: Any],
let appName = meta["app_name"] as? String,
let content = json["content"] as? [[String: Any]],
let firstContent = content.first,
let text = firstContent["text"] as? String {
switch text {
case let value where value.contains("Launched"):
fallback = "\(appName) launched"
case let value where value.contains("Quit"):
fallback = "\(appName) quit"
case let value where value.contains("Focused") || value.contains("Switched"):
fallback = "\(appName) focused"
case let value where value.contains("Hidden"):
fallback = "\(appName) hidden"
case let value where value.contains("Unhidden"):
fallback = "\(appName) shown"
default:
break
}
}
return self.cleanToolPrefix(fallback)
}
private func handleSuccess(
resultSummary: String,
durationString: String,
result: String,
json: [String: Any]
) {
switch self.outputMode {
case .minimal:
let prefix = resultSummary.isEmpty ? "" : " \(resultSummary)"
print("\(prefix)\(durationString)")
case .verbose:
print(" \(durationString)")
if let formatted = formatJSON(result) {
print("\(TerminalColor.gray)Result:\(TerminalColor.reset)")
print(formatted)
}
default:
print(self.successStatusLine(resultSummary: resultSummary, durationString: durationString))
self.printResultDetails(from: json)
}
}
private func handleFailure(message: String, durationString: String, json: [String: Any], tool: String) {
if self.outputMode == .minimal {
print(" FAILED\(durationString)")
} else {
print(self.failureStatusLine(message: message, durationString: durationString))
}
self.displayEnhancedError(tool: tool, json: json)
}
private func handleCommunicationToolComplete(name: String, toolType: ToolType) {
if self.outputMode == .verbose {
let toolName = toolType.rawValue
.replacingOccurrences(of: "_", with: " ")
.capitalized
print("\n\(AgentDisplayTokens.Status.success) \(toolName) completed")
}
}
private func displayEnhancedError(tool: String, json: [String: Any]) {
guard self.outputMode != .minimal && self.outputMode != .quiet else { return }
if let error = json["error"] as? String {
print(" \(TerminalColor.gray)Error: \(error)\(TerminalColor.reset)")
}
if let suggestion = json["suggestion"] as? String {
print(" \(TerminalColor.yellow)💡 Suggestion: \(suggestion)\(TerminalColor.reset)")
}
if self.outputMode == .verbose,
let details = json["details"] as? [String: Any],
let formatted = try? JSONSerialization.data(withJSONObject: details, options: .prettyPrinted),
let detailsStr = String(data: formatted, encoding: .utf8) {
print(" \(TerminalColor.gray)Details:\(TerminalColor.reset)")
print(detailsStr)
}
}
private func printResultDetails(from json: [String: Any]) {
guard self.outputMode != .minimal && self.outputMode != .quiet else { return }
guard let detail = self.primaryResultMessage(from: json) else { return }
let snippet = detail.trimmingCharacters(in: .whitespacesAndNewlines)
let sanitized = self.cleanToolPrefix(snippet)
guard !sanitized.isEmpty else { return }
print("\n \(TerminalColor.gray)\(sanitized.prefix(240))\(TerminalColor.reset)")
}
private func primaryResultMessage(from json: [String: Any]) -> String? {
if let message = json["message"] as? String, !message.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return message
}
if let content = json["content"] as? [[String: Any]] {
for item in content {
if let text = item["text"] as? String,
!text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return text
}
}
}
if let meta = json["meta"] as? [String: Any],
let message = meta["message"] as? String,
!message.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return message
}
return nil
}
}
// MARK: - Supporting Types
/// Formatter for unknown tools
private class UnknownToolFormatter: BaseToolFormatter {
private let toolName: String
override nonisolated init(toolType: ToolType) {
fatalError("Use init(toolName:)")
}
init(toolName: String) {
self.toolName = toolName
// Create a synthetic ToolType for unknown tools
// We'll use wait as a placeholder since it's a simple tool
super.init(toolType: .wait)
}
override nonisolated func formatStarting(arguments: [String: Any]) -> String {
"\(self.toolName.replacingOccurrences(of: "_", with: " ").capitalized)"
}
override nonisolated func formatCompleted(result: [String: Any], duration: TimeInterval) -> String {
"→ completed"
}
override nonisolated func formatError(error: String, result: [String: Any]) -> String {
"\(AgentDisplayTokens.Status.failure) \(error)"
}
override nonisolated func formatCompactSummary(arguments: [String: Any]) -> String {
""
}
override nonisolated func formatResultSummary(result: [String: Any]) -> String {
""
}
override nonisolated func formatForTitle(arguments: [String: Any]) -> String {
self.toolName
}
}

View File

@ -0,0 +1,197 @@
//
// PixelAnalyzer.swift
// PeekabooCore
//
import AppKit
import Foundation
/// Analyzes pixel regions to find uniform (boring) areas for optimal label placement
struct PixelAnalyzer {
private let image: NSImage
private let bitmapRep: NSBitmapImageRep?
private let textDetector: AcceleratedTextDetector
init?(image: NSImage) {
self.image = image
self.textDetector = AcceleratedTextDetector()
// Get bitmap representation for fallback pixel access
if let tiffData = image.tiffRepresentation,
let bitmap = NSBitmapImageRep(data: tiffData) {
self.bitmapRep = bitmap
} else {
self.bitmapRep = nil
}
}
/// Scores a region based on absence of text/edges (higher score = better for labels)
func scoreRegion(_ rect: NSRect) -> Float {
// Clamp rect to image bounds
let imageRect = NSRect(origin: .zero, size: image.size)
let clampedRect = rect.intersection(imageRect)
// If rect is outside image, return low score
guard !clampedRect.isEmpty else { return 0 }
// Use Accelerated Sobel edge detection to find text
return self.textDetector.scoreRegionForLabelPlacement(clampedRect, in: self.image)
}
/// Scores a region using simple variance (fallback method)
func scoreRegionSimple(_ rect: NSRect) -> Float {
// Scores a region using simple variance (fallback method)
guard self.bitmapRep != nil else { return 0 }
// Clamp rect to image bounds
let imageRect = NSRect(origin: .zero, size: image.size)
let clampedRect = rect.intersection(imageRect)
// If rect is outside image, return low score
guard !clampedRect.isEmpty else { return 0 }
// Sample pixels in a 7x7 grid for better coverage
let samples = self.samplePixels(in: clampedRect, gridSize: 7)
// Calculate contrast instead of variance
let contrast = self.calculateContrast(samples)
// Convert to score: lower contrast = higher score (more uniform)
// Add small epsilon to avoid division by zero
return 1.0 / (contrast + 0.001)
}
/// Finds the best position from candidates based on background uniformity
func findBestPosition(from candidates: [NSRect]) -> (rect: NSRect, score: Float)? {
// Finds the best position from candidates based on background uniformity
var bestPosition: (rect: NSRect, score: Float)?
for candidate in candidates {
let score = self.scoreRegion(candidate)
if bestPosition == nil || score > bestPosition!.score {
bestPosition = (rect: candidate, score: score)
}
}
return bestPosition
}
// MARK: - Private Methods
private func samplePixels(in rect: NSRect, gridSize: Int) -> [NSColor] {
var colors: [NSColor] = []
let stepX = rect.width / CGFloat(gridSize - 1)
let stepY = rect.height / CGFloat(gridSize - 1)
for row in 0..<gridSize {
for col in 0..<gridSize {
let x = rect.minX + CGFloat(col) * stepX
let y = rect.minY + CGFloat(row) * stepY
if let color = getPixelColor(at: CGPoint(x: x, y: y)) {
colors.append(color)
}
}
}
return colors
}
private func getPixelColor(at point: CGPoint) -> NSColor? {
guard let bitmap = bitmapRep else { return nil }
// Convert to bitmap coordinates (flip Y if needed)
let x = Int(point.x)
let y = Int(image.size.height - point.y - 1) // Flip Y coordinate
// Check bounds
guard x >= 0, x < bitmap.pixelsWide,
y >= 0, y < bitmap.pixelsHigh else {
return nil
}
return bitmap.colorAt(x: x, y: y)
}
private func calculateBrightnessVariance(_ colors: [NSColor]) -> Float {
guard !colors.isEmpty else { return 0 }
// Calculate brightness for each color
let brightnesses = colors.map { color -> Float in
// Convert to RGB color space if needed
guard let rgbColor = color.usingColorSpace(.deviceRGB) else {
return 0.5 // Default middle brightness
}
// Calculate luminance using standard formula
return Float(rgbColor.redComponent) * 0.299 +
Float(rgbColor.greenComponent) * 0.587 +
Float(rgbColor.blueComponent) * 0.114
}
// Calculate mean brightness
let mean = brightnesses.reduce(0, +) / Float(brightnesses.count)
// Calculate variance
let squaredDiffs = brightnesses.map { pow($0 - mean, 2) }
let variance = squaredDiffs.reduce(0, +) / Float(brightnesses.count)
return variance
}
private func calculateContrast(_ colors: [NSColor]) -> Float {
guard !colors.isEmpty else { return 0 }
// Calculate brightness for each color
let brightnesses = colors.map { color -> Float in
guard let rgbColor = color.usingColorSpace(.deviceRGB) else {
return 0.5
}
return Float(rgbColor.redComponent) * 0.299 +
Float(rgbColor.greenComponent) * 0.587 +
Float(rgbColor.blueComponent) * 0.114
}
// Calculate contrast as difference between min and max brightness
let minBrightness = brightnesses.min() ?? 0
let maxBrightness = brightnesses.max() ?? 0
return maxBrightness - minBrightness
}
}
// Extension for checking if a region has high contrast (text, edges)
extension PixelAnalyzer {
/// Quick check if region likely contains text or edges
func hasHighContrast(in rect: NSRect) -> Bool {
// Sample just 5 pixels in a cross pattern
let center = CGPoint(x: rect.midX, y: rect.midY)
let points = [
center,
CGPoint(x: rect.minX + rect.width * 0.25, y: rect.midY),
CGPoint(x: rect.maxX - rect.width * 0.25, y: rect.midY),
CGPoint(x: rect.midX, y: rect.minY + rect.height * 0.25),
CGPoint(x: rect.midX, y: rect.maxY - rect.height * 0.25)
]
let colors = points.compactMap { self.getPixelColor(at: $0) }
guard colors.count >= 2 else { return false }
// Check if colors vary significantly
let brightnesses = colors.map { color -> Float in
guard let rgbColor = color.usingColorSpace(.deviceRGB) else { return 0.5 }
return Float(rgbColor.redComponent) * 0.299 +
Float(rgbColor.greenComponent) * 0.587 +
Float(rgbColor.blueComponent) * 0.114
}
let minBrightness = brightnesses.min() ?? 0
let maxBrightness = brightnesses.max() ?? 0
// If brightness range > 0.3, we likely have text or edges
return (maxBrightness - minBrightness) > 0.3
}
}

View File

@ -1,135 +0,0 @@
import Commander
import Foundation
import PeekabooCore
import PeekabooFoundation
@MainActor
extension SeeCommand {
func detectElements(
imageData: Data,
windowContext: WindowContext?,
snapshotID: String? = nil
) async throws -> ElementDetectionResult {
self.logger.operationStart("element_detection")
defer { self.logger.operationComplete("element_detection") }
let timeoutSeconds = Self.detectionTimeoutSeconds(
configuredTimeoutSeconds: self.timeoutSeconds,
analyze: self.analyze
)
do {
return try await Self.detectElements(
automation: self.services.automation,
imageData: imageData,
windowContext: windowContext,
timeoutSeconds: timeoutSeconds,
snapshotID: snapshotID,
interactionMutationTracker: self.resolvedRuntime.observationTimeoutMutationTracker
)
} catch is TimeoutError {
throw CaptureError.detectionTimedOut(timeoutSeconds)
}
}
static func detectionTimeoutSeconds(
configuredTimeoutSeconds: Int?,
analyze: String?
) -> TimeInterval {
TimeInterval(configuredTimeoutSeconds ?? ((analyze == nil) ? 20 : 60))
}
static func remoteDetectionRequestTimeoutSeconds(for timeoutSeconds: TimeInterval) -> TimeInterval {
timeoutSeconds + 5
}
static func detectElements(
automation: any UIAutomationServiceProtocol,
imageData: Data,
windowContext: WindowContext?,
timeoutSeconds: TimeInterval,
snapshotID: String? = nil,
interactionMutationTracker: InteractionMutationTracker? = nil
) async throws -> ElementDetectionResult {
try await withWallClockTimeout(
seconds: timeoutSeconds,
interactionMutationTracker: interactionMutationTracker
) {
if let timeoutAdjustingAutomation = automation as? any DetectElementsRequestTimeoutAdjusting {
return try await timeoutAdjustingAutomation.detectElements(
in: imageData,
snapshotId: snapshotID,
windowContext: windowContext,
requestTimeoutSec: Self.remoteDetectionRequestTimeoutSeconds(for: timeoutSeconds)
)
}
return try await AutomationServiceBridge.detectElements(
automation: automation,
imageData: imageData,
snapshotId: snapshotID,
windowContext: windowContext
)
}
}
func resolveCaptureContext() async throws -> CaptureContext {
if self.menubar {
if let popover = try await self.captureMenuBarPopover() {
return CaptureContext(
captureResult: popover.captureResult,
captureBounds: popover.windowBounds,
prefersOCR: true,
ocrMethod: "OCR",
windowIdOverride: popover.windowId
)
}
self.logger.verbose("No menu bar popover detected; capturing menu bar area", category: "Capture")
let rect = try self.menuBarRect()
let result = try await self.services.screenCapture.captureArea(rect)
return CaptureContext(
captureResult: result,
captureBounds: rect,
prefersOCR: true,
ocrMethod: "OCR",
windowIdOverride: nil
)
}
let result = try await self.performLegacyScreenCapture()
return CaptureContext(
captureResult: result,
captureBounds: nil,
prefersOCR: false,
ocrMethod: nil,
windowIdOverride: nil
)
}
private func performLegacyScreenCapture() async throws -> CaptureResult {
let effectiveMode = self.determineMode()
self.logger.verbose(
"Determined capture mode",
category: "Capture",
metadata: ["mode": effectiveMode.rawValue]
)
self.logger.operationStart("capture_phase", metadata: ["mode": effectiveMode.rawValue])
switch effectiveMode {
case .screen:
// Handle screen capture with multi-screen support
let result = try await self.performScreenCapture()
self.logger.operationComplete("capture_phase", metadata: ["mode": effectiveMode.rawValue])
return result
case .multi:
// Commander currently treats multi captures as multi-display screen grabs
let result = try await self.performScreenCapture()
self.logger.operationComplete("capture_phase", metadata: ["mode": effectiveMode.rawValue])
return result
case .window, .frontmost, .area:
throw ValidationError("\(effectiveMode.rawValue) captures must use the desktop observation pipeline")
}
}
}

View File

@ -1,119 +0,0 @@
import CoreGraphics
import Foundation
import PeekabooCore
import PeekabooFoundation
@MainActor
extension SeeCommand {
var usesTemporaryScreenshotOutput: Bool {
self.jsonOutput && self.path == nil
}
func screenshotOutputPath(snapshotID: String? = nil) -> String {
if self.usesTemporaryScreenshotOutput {
return self.temporaryScreenshotDirectory(snapshotID: snapshotID)
.appendingPathComponent("raw.png")
.path
}
let timestamp = Date().timeIntervalSince1970
let filename = "peekaboo_see_\(Int(timestamp)).png"
return ObservationCommandSupport.outputPath(
path: self.path,
format: .png,
defaultDirectory: ConfigurationManager.shared.getDefaultSavePath(cliValue: nil),
defaultFileName: filename
)
}
func saveScreenshot(_ imageData: Data, snapshotID: String) throws -> String {
let outputPath = self.screenshotOutputPath(snapshotID: snapshotID)
let directory = (outputPath as NSString).deletingLastPathComponent
try FileManager.default.createDirectory(
atPath: directory,
withIntermediateDirectories: true
)
try imageData.write(to: URL(fileURLWithPath: outputPath))
self.logger.verbose("Saved screenshot to: \(outputPath)")
return outputPath
}
func cleanupTemporaryScreenshotOutput(snapshotID: String) {
guard self.usesTemporaryScreenshotOutput else { return }
try? FileManager.default.removeItem(at: self.temporaryScreenshotDirectory(snapshotID: snapshotID))
}
private func temporaryScreenshotDirectory(snapshotID: String?) -> URL {
FileManager.default.temporaryDirectory
.appendingPathComponent("peekaboo-see", isDirectory: true)
.appendingPathComponent(snapshotID ?? UUID().uuidString, isDirectory: true)
}
func resolveSeeWindowIndex(appIdentifier: String, titleFragment: String?) async throws -> Int? {
guard let fragment = titleFragment, !fragment.isEmpty else {
return nil
}
let appInfo = try await self.services.applications.findApplication(identifier: appIdentifier)
let snapshot = try await WindowListMapper.shared.snapshot()
let appWindows = WindowListMapper.scWindows(
for: appInfo.processIdentifier,
in: snapshot.scWindows
)
guard !appWindows.isEmpty else {
throw CaptureError.windowNotFound
}
if let index = WindowListMapper.scWindowIndex(
for: appInfo.processIdentifier,
titleFragment: fragment,
in: snapshot
) {
return index
}
if let index = WindowListMapper.scWindowIndex(for: fragment, in: appWindows) {
return index
}
throw CaptureError.windowNotFound
}
func resolveWindowId(appIdentifier: String, titleFragment: String?) async throws -> Int? {
guard let fragment = titleFragment, !fragment.isEmpty else {
return nil
}
let windows = try await self.services.windows.listWindows(
target: .applicationAndTitle(app: appIdentifier, title: fragment)
)
return windows.first?.windowID
}
func generateAnnotatedScreenshot(
snapshotId: String,
originalPath: String
) async throws -> String? {
guard let detectionResult = try await self.services.snapshots.getDetectionResult(snapshotId: snapshotId)
else {
self.logger.info("No detection result found for snapshot")
return nil
}
let renderer = ObservationAnnotationRenderer(debugMode: self.verbose)
let annotatedPath = try renderer.renderAnnotatedScreenshot(
originalPath: originalPath,
detectionResult: detectionResult
)
guard let annotatedPath else {
return nil
}
self.logger.verbose("Created annotated screenshot: \(annotatedPath)")
return annotatedPath
}
}

View File

@ -19,12 +19,6 @@ extension SeeCommand: CommanderSignatureProviding {
help: "Specific window title to capture",
long: "window-title"
),
.commandOption(
"windowId",
help: "Capture a specific window by CoreGraphics window id "
+ "(window_id from `peekaboo window list --json`)",
long: "window-id"
),
.commandOption(
"mode",
help: "Capture mode (screen, window, frontmost)",
@ -35,11 +29,6 @@ extension SeeCommand: CommanderSignatureProviding {
help: "Output path for screenshot",
long: "path"
),
.commandOption(
"captureEngine",
help: "Capture engine: auto|classic|cg|modern|sckit (defaults to auto)",
long: "capture-engine"
),
.commandOption(
"screenIndex",
help: "Specific screen index to capture (0-based)",
@ -50,11 +39,6 @@ extension SeeCommand: CommanderSignatureProviding {
help: "Analyze captured content with AI",
long: "analyze"
),
.commandOption(
"timeoutSeconds",
help: "Overall timeout in seconds (default: 20, or 60 when --analyze is set)",
long: "timeout-seconds"
),
],
flags: [
.commandFlag(
@ -62,16 +46,6 @@ extension SeeCommand: CommanderSignatureProviding {
help: "Generate annotated screenshot with interaction markers",
long: "annotate"
),
.commandFlag(
"menubar",
help: "Capture menu bar popovers via window list + OCR",
long: "menubar"
),
.commandFlag(
"noWebFocus",
help: "Skip web-content focus fallback when no text fields are detected",
long: "no-web-focus"
),
]
)
}

View File

@ -1,177 +0,0 @@
import Foundation
import PeekabooCore
@available(macOS 14.0, *)
@MainActor
extension SeeCommand {
func performCaptureWithDetection(snapshotID: String) async throws -> CaptureAndDetectionResult {
if let observationResult = try await self.performObservationCaptureWithDetectionIfPossible(
snapshotID: snapshotID
) {
return observationResult
}
let captureContext = try await self.resolveCaptureContext()
let captureResult = captureContext.captureResult
self.logger.startTimer("file_write")
let outputPath = try saveScreenshot(captureResult.imageData, snapshotID: snapshotID)
self.logger.stopTimer("file_write")
let windowContext = WindowContext(
applicationName: captureResult.metadata.applicationInfo?.name,
applicationBundleId: captureResult.metadata.applicationInfo?.bundleIdentifier,
applicationProcessId: captureResult.metadata.applicationInfo?.processIdentifier,
windowTitle: captureResult.metadata.windowInfo?.title,
windowID: captureContext.windowIdOverride ?? captureResult.metadata.windowInfo?.windowID,
windowBounds: captureContext.captureBounds ?? captureResult.metadata.windowInfo?.bounds,
shouldFocusWebContent: self.noWebFocus ? false : true,
traversalBudget: self.axTraversalBudget()
)
let detectionResult = try await self.detectElements(
for: captureContext,
windowContext: windowContext,
snapshotID: snapshotID
)
let resultWithPath = ElementDetectionResult(
snapshotId: snapshotID,
screenshotPath: outputPath,
elements: detectionResult.elements,
metadata: detectionResult.metadata
)
try await self.services.snapshots.storeScreenshot(
SnapshotScreenshotRequest(
snapshotId: snapshotID,
screenshotPath: outputPath,
applicationBundleId: captureResult.metadata.applicationInfo?.bundleIdentifier,
applicationProcessId: captureResult.metadata.applicationInfo.map { Int32($0.processIdentifier) },
applicationName: windowContext.applicationName,
windowTitle: windowContext.windowTitle,
windowBounds: windowContext.windowBounds
)
)
try await self.services.snapshots.storeDetectionResult(
snapshotId: snapshotID,
result: resultWithPath
)
return CaptureAndDetectionResult(
snapshotId: snapshotID,
screenshotPath: outputPath,
annotatedPath: nil,
elements: detectionResult.elements,
metadata: detectionResult.metadata,
observation: nil
)
}
private func detectElements(
for captureContext: CaptureContext,
windowContext: WindowContext,
snapshotID: String
) async throws -> ElementDetectionResult {
let captureResult = captureContext.captureResult
let detectionStart = Date()
if captureContext.prefersOCR {
self.logger.verbose("Running OCR for menu bar popover", category: "Capture")
let ocrElements = try self.ocrElements(
imageData: captureResult.imageData,
windowBounds: captureContext.captureBounds ?? captureResult.metadata.windowInfo?.bounds
)
let warnings = ocrElements.isEmpty ? ["OCR produced no elements"] : []
let metadata = DetectionMetadata(
detectionTime: Date().timeIntervalSince(detectionStart),
elementCount: ocrElements.count,
method: captureContext.ocrMethod ?? "OCR",
warnings: warnings,
windowContext: windowContext,
isDialog: false
)
return ElementDetectionResult(
snapshotId: snapshotID,
screenshotPath: "",
elements: DetectedElements(other: ocrElements),
metadata: metadata
)
}
let detectionResult = try await self.detectElements(
imageData: captureResult.imageData,
windowContext: windowContext,
snapshotID: snapshotID
)
return ElementDetectionResult(
snapshotId: snapshotID,
screenshotPath: detectionResult.screenshotPath,
elements: detectionResult.elements,
metadata: detectionResult.metadata
)
}
private func performObservationCaptureWithDetectionIfPossible(
snapshotID: String
) async throws -> CaptureAndDetectionResult? {
guard let target = try self.observationTargetForCaptureWithDetectionIfPossible() else {
return nil
}
self.logger.verbose("Using desktop observation pipeline", category: "Capture", metadata: [
"target": self.observationTargetDescription(target)
])
let mode = self.determineMode()
self.logger.operationStart("capture_phase", metadata: ["mode": mode.rawValue])
let observation: DesktopObservationResult
do {
observation = try await self.services.desktopObservation
.observe(self.makeObservationRequest(target: target, snapshotID: snapshotID))
} catch DesktopObservationError.targetNotFound(_) where self.menubar {
self.logger.verbose("No observation-backed menu bar popover found; falling back", category: "Capture")
self.logger.operationComplete("capture_phase", success: false, metadata: [
"mode": mode.rawValue,
"fallback": "legacy_menubar",
])
return nil
}
self.logger.operationComplete("capture_phase", metadata: [
"mode": mode.rawValue
])
self.logObservationSpans(observation.timings)
guard let outputPath = observation.files.rawScreenshotPath else {
throw CaptureError.captureFailure("Observation completed without a saved screenshot path")
}
guard let detectionResult = observation.elements else {
throw CaptureError.captureFailure("Observation completed without element detection")
}
return CaptureAndDetectionResult(
snapshotId: snapshotID,
screenshotPath: outputPath,
annotatedPath: observation.files.annotatedScreenshotPath,
elements: detectionResult.elements,
metadata: detectionResult.metadata,
observation: SeeObservationDiagnostics(
timings: observation.timings,
diagnostics: observation.diagnostics
)
)
}
private func logObservationSpans(_ timings: ObservationTimings) {
for span in timings.spans {
self.logger.verbose("Desktop observation span", category: "Performance", metadata: [
"span": span.name,
"duration_ms": Int(span.durationMS.rounded()),
])
}
}
}

View File

@ -1,257 +0,0 @@
import CoreGraphics
import Foundation
import PeekabooCore
@MainActor
extension SeeCommand {
struct MenuBarPopoverContext {
let extras: [MenuExtraInfo]
let ownerPidSet: Set<pid_t>
let canFilterByOwnerPid: Bool
let appHint: String?
let hintExtra: MenuExtraInfo?
let openExtra: MenuExtraInfo?
let preferredExtra: MenuExtraInfo?
let preferredOwnerName: String?
let preferredOwnerPid: pid_t?
let preferredX: CGFloat?
var shouldRelaxFilter: Bool {
self.openExtra != nil || self.appHint != nil
}
var hintName: String? {
self.appHint ?? self.preferredExtra?.title ?? self.preferredExtra?.ownerName
}
}
struct MenuBarCandidateState {
var candidates: [MenuBarPopoverCandidate]
var windowInfoMap: [Int: MenuBarPopoverWindowInfo]
var usedFilteredWindowList: Bool
}
func captureMenuBarPopover(allowAreaFallback: Bool = false) async throws -> MenuBarPopoverCapture? {
let context = try await self.makeMenuBarPopoverContext()
self.logOpenMenuExtraIfNeeded(context)
let snapshot = self.menuBarWindowSnapshot()
var state = self.resolveInitialCandidates(context: context, snapshot: snapshot)
state = self.relaxCandidatesIfNeeded(
context: context,
snapshot: snapshot,
state: state
)
state = self.applyOwnerNameFallbackIfNeeded(
context: context,
snapshot: snapshot,
state: state
)
if state.candidates.isEmpty {
if let capture = try await self.fallbackCaptureForEmptyCandidates(
context: context,
snapshot: snapshot,
state: &state
) {
return capture
}
}
guard !state.candidates.isEmpty else { return nil }
return try await self.capturePopoverFromCandidates(
context: context,
allowAreaFallback: allowAreaFallback,
state: state
)
}
private func makeMenuBarPopoverContext() async throws -> MenuBarPopoverContext {
let extras = try await self.services.menu.listMenuExtras()
let ownerPidSet = Set(extras.compactMap(\.ownerPID))
let canFilterByOwnerPid = !ownerPidSet.isEmpty
let appHint = self.menuBarAppHint()
let hintExtra = self.resolveMenuExtraHint(appHint: appHint, extras: extras)
let openExtra = try await self.resolveOpenMenuExtra(from: extras)
let preferredExtra = appHint != nil ? (hintExtra ?? openExtra) : (openExtra ?? hintExtra)
let preferredOwnerName = appHint ?? preferredExtra?.ownerName ?? preferredExtra?.title
let preferredX = preferredExtra?.position.x
let preferredOwnerPid = preferredExtra?.ownerPID
return MenuBarPopoverContext(
extras: extras,
ownerPidSet: ownerPidSet,
canFilterByOwnerPid: canFilterByOwnerPid,
appHint: appHint,
hintExtra: hintExtra,
openExtra: openExtra,
preferredExtra: preferredExtra,
preferredOwnerName: preferredOwnerName,
preferredOwnerPid: preferredOwnerPid,
preferredX: preferredX
)
}
private func logOpenMenuExtraIfNeeded(_ context: MenuBarPopoverContext) {
guard let openExtra = context.openExtra, let openPid = openExtra.ownerPID else { return }
self.logger.verbose(
"Detected open menu extra",
category: "Capture",
metadata: [
"title": openExtra.title,
"ownerPID": openPid
]
)
}
private func fallbackCaptureForEmptyCandidates(
context: MenuBarPopoverContext,
snapshot: ObservationMenuBarPopoverSnapshot,
state: inout MenuBarCandidateState
) async throws -> MenuBarPopoverCapture? {
if let openMenuCapture = try await self.captureMenuBarPopoverFromOpenMenu(
openExtra: context.openExtra ?? context.hintExtra,
appHint: context.appHint
) {
return openMenuCapture
}
if let preferredX = context.preferredX {
let bandCandidates = self.menuBarPopoverCandidatesByBand(
snapshot: snapshot,
preferredX: preferredX
)
if !bandCandidates.isEmpty {
state.candidates = bandCandidates
state.windowInfoMap = snapshot.windowInfoByID
state.usedFilteredWindowList = false
}
}
return nil
}
private func capturePopoverFromCandidates(
context: MenuBarPopoverContext,
allowAreaFallback: Bool,
state: MenuBarCandidateState
) async throws -> MenuBarPopoverCapture? {
let windowInfoMap = state.windowInfoMap
let selectionCandidates = self.selectCandidates(
from: state.candidates,
preferredOwnerName: context.preferredOwnerName,
windowInfoMap: windowInfoMap,
openExtra: context.openExtra
)
guard let selectionCandidates else { return nil }
let hints = MenuBarPopoverResolverContext.normalizedHints([
context.hintName,
context.preferredOwnerName
])
let resolverContext = MenuBarPopoverResolverContext(
appHint: context.hintName,
preferredOwnerName: context.preferredOwnerName,
ownerPID: context.preferredOwnerPid,
preferredX: context.preferredX,
ocrHints: hints
)
let allowOCR = selectionCandidates.count > 1 && !hints.isEmpty
let allowArea = (context.openExtra != nil || allowAreaFallback)
let candidateOCR = allowOCR ? self.menuBarCandidateOCRMatcher(hints: hints) : nil
let areaOCR = allowArea ? self.menuBarAreaOCRMatcher() : nil
let options = MenuBarPopoverResolver.ResolutionOptions(
allowOCR: allowOCR,
allowAreaFallback: allowArea,
candidateOCR: candidateOCR,
areaOCR: areaOCR
)
guard let resolution = try await MenuBarPopoverResolver.resolve(
candidates: selectionCandidates,
windowInfoById: windowInfoMap,
context: resolverContext,
options: options
) else {
return nil
}
return try await self.captureMenuBarPopover(from: resolution, windowInfoMap: windowInfoMap)
}
private func captureMenuBarPopover(
from resolution: MenuBarPopoverResolution,
windowInfoMap: [Int: MenuBarPopoverWindowInfo]
) async throws -> MenuBarPopoverCapture? {
self.logPopoverResolution(resolution, windowInfoMap: windowInfoMap)
if let captureResult = resolution.captureResult,
let bounds = resolution.bounds {
return MenuBarPopoverCapture(
captureResult: captureResult,
windowBounds: bounds,
windowId: resolution.windowId
)
}
guard let windowId = resolution.windowId,
let bounds = resolution.bounds else {
return nil
}
let captureResult = try await self.services.screenCapture.captureWindow(windowID: CGWindowID(windowId))
return MenuBarPopoverCapture(
captureResult: captureResult,
windowBounds: bounds,
windowId: windowId
)
}
private func logPopoverResolution(
_ resolution: MenuBarPopoverResolution,
windowInfoMap: [Int: MenuBarPopoverWindowInfo]
) {
switch resolution.reason {
case .ocr:
if let windowId = resolution.windowId {
self.logger.verbose(
"Selected menu bar popover via OCR",
category: "Capture",
metadata: [
"windowId": windowId
]
)
}
case .ocrArea:
if let bounds = resolution.bounds {
self.logger.verbose(
"Selected menu bar popover via area capture",
category: "Capture",
metadata: [
"rect": "\(bounds)"
]
)
}
default:
if let windowId = resolution.windowId,
let info = windowInfoMap[windowId] {
self.logger.verbose(
"Selected menu bar popover window",
category: "Capture",
metadata: [
"windowId": windowId,
"owner": info.ownerName ?? "unknown",
"title": info.title ?? "",
"reason": resolution.reason.rawValue
]
)
}
}
}
}

View File

@ -1,234 +0,0 @@
import CoreGraphics
import Foundation
import PeekabooCore
@MainActor
extension SeeCommand {
func menuBarWindowSnapshot() -> ObservationMenuBarPopoverSnapshot {
ObservationMenuBarWindowCatalog.currentPopoverSnapshot(
screens: self.services.screens.listScreens()
)
}
func resolveInitialCandidates(
context: MenuBarPopoverContext,
snapshot: ObservationMenuBarPopoverSnapshot
) -> MenuBarCandidateState {
let filteredCandidates: [MenuBarPopoverCandidate] = if context.canFilterByOwnerPid {
snapshot.candidates.filter { candidate in
context.ownerPidSet.contains(candidate.ownerPID)
}
} else {
snapshot.candidates
}
let usedFilteredWindowList = context.canFilterByOwnerPid &&
!filteredCandidates.isEmpty &&
filteredCandidates.count != snapshot.candidates.count
let baseCandidates = usedFilteredWindowList ? filteredCandidates : snapshot.candidates
var candidates = self.menuBarPopoverCandidates(
candidates: baseCandidates,
ownerPID: context.preferredOwnerPid
)
if candidates.isEmpty, context.preferredOwnerPid != nil {
candidates = self.menuBarPopoverCandidates(
candidates: baseCandidates,
ownerPID: nil
)
}
return MenuBarCandidateState(
candidates: candidates,
windowInfoMap: snapshot.windowInfoByID,
usedFilteredWindowList: usedFilteredWindowList
)
}
func relaxCandidatesIfNeeded(
context: MenuBarPopoverContext,
snapshot: ObservationMenuBarPopoverSnapshot,
state: MenuBarCandidateState
) -> MenuBarCandidateState {
guard state.candidates.isEmpty,
context.shouldRelaxFilter,
state.usedFilteredWindowList else {
return state
}
self.logger.debug("Relaxing menu bar popover filter to full window list")
var candidates = self.menuBarPopoverCandidates(
candidates: snapshot.candidates,
ownerPID: context.preferredOwnerPid
)
if candidates.isEmpty, context.preferredOwnerPid != nil {
candidates = self.menuBarPopoverCandidates(
candidates: snapshot.candidates,
ownerPID: nil
)
}
return MenuBarCandidateState(
candidates: candidates,
windowInfoMap: snapshot.windowInfoByID,
usedFilteredWindowList: false
)
}
func applyOwnerNameFallbackIfNeeded(
context: MenuBarPopoverContext,
snapshot: ObservationMenuBarPopoverSnapshot,
state: MenuBarCandidateState
) -> MenuBarCandidateState {
guard let preferredOwnerName = context.preferredOwnerName,
!preferredOwnerName.isEmpty,
state.usedFilteredWindowList else {
return state
}
let normalized = preferredOwnerName.lowercased()
let ownerMatches = state.candidates.filter { candidate in
let ownerName = state.windowInfoMap[candidate.windowId]?.ownerName?.lowercased() ?? ""
return ownerName == normalized || ownerName.contains(normalized)
}
guard ownerMatches.isEmpty else { return state }
var candidates = self.menuBarPopoverCandidates(
candidates: snapshot.candidates,
ownerPID: context.preferredOwnerPid
)
if candidates.isEmpty, context.preferredOwnerPid != nil {
candidates = self.menuBarPopoverCandidates(
candidates: snapshot.candidates,
ownerPID: nil
)
}
return MenuBarCandidateState(
candidates: candidates,
windowInfoMap: snapshot.windowInfoByID,
usedFilteredWindowList: false
)
}
func selectCandidates(
from candidates: [MenuBarPopoverCandidate],
preferredOwnerName: String?,
windowInfoMap: [Int: MenuBarPopoverWindowInfo],
openExtra: MenuExtraInfo?
) -> [MenuBarPopoverCandidate]? {
guard let preferredOwnerName, !preferredOwnerName.isEmpty else { return candidates }
let normalized = preferredOwnerName.lowercased()
let ownerMatches = candidates.filter { candidate in
let ownerName = windowInfoMap[candidate.windowId]?.ownerName?.lowercased() ?? ""
return ownerName == normalized || ownerName.contains(normalized)
}
if !ownerMatches.isEmpty {
return ownerMatches
}
return openExtra == nil ? nil : candidates
}
private func menuBarPopoverCandidates(
candidates: [MenuBarPopoverCandidate],
ownerPID: pid_t?
) -> [MenuBarPopoverCandidate] {
guard let ownerPID else { return candidates }
return candidates.filter { $0.ownerPID == ownerPID }
}
func menuBarPopoverCandidatesByBand(
snapshot _: ObservationMenuBarPopoverSnapshot,
preferredX: CGFloat
) -> [MenuBarPopoverCandidate] {
ObservationMenuBarWindowCatalog.currentBandCandidates(
preferredX: preferredX,
screens: self.services.screens.listScreens()
)
}
func menuBarAppHint() -> String? {
guard let app = self.app?.trimmingCharacters(in: .whitespacesAndNewlines),
!app.isEmpty else {
return nil
}
let lower = app.lowercased()
if lower == "menubar" || lower == "frontmost" {
return nil
}
return app
}
func resolveMenuExtraHint(
appHint: String?,
extras: [MenuExtraInfo]
) -> MenuExtraInfo? {
guard let appHint else { return nil }
let normalized = appHint.lowercased()
return extras.first { extra in
let candidates = [
extra.title,
extra.rawTitle,
extra.ownerName,
extra.bundleIdentifier,
extra.identifier
].compactMap { $0?.lowercased() }
return candidates.contains(where: { $0 == normalized }) ||
candidates.contains(where: { $0.contains(normalized) })
}
}
func resolveOpenMenuExtra(from extras: [MenuExtraInfo]) async throws -> MenuExtraInfo? {
for extra in extras {
let candidates = [
extra.title,
extra.ownerName,
extra.rawTitle,
extra.identifier,
extra.bundleIdentifier,
].compactMap { $0?.trimmingCharacters(in: .whitespacesAndNewlines) }
for candidate in candidates where !candidate.isEmpty {
let ownerPID: pid_t? = if let extraOwnerPID = extra.ownerPID {
extraOwnerPID
} else {
await self.resolveMenuExtraOwnerPID(extra)
}
let isOpen = await (try? self.services.menu.isMenuExtraMenuOpen(
title: candidate,
ownerPID: ownerPID
)) ?? false
if isOpen {
return extra
}
}
}
return nil
}
func resolveMenuExtraOwnerPID(_ extra: MenuExtraInfo) async -> pid_t? {
if let ownerPID = extra.ownerPID {
return ownerPID
}
guard let runningApps = try? await self.services.applications.listApplications().data.applications else {
return nil
}
if let bundleIdentifier = extra.bundleIdentifier,
let match = runningApps.first(where: { $0.bundleIdentifier == bundleIdentifier }) {
return match.processIdentifier
}
if let ownerName = extra.ownerName {
if let match = runningApps.first(where: { $0.name == ownerName }) {
return match.processIdentifier
}
let normalizedOwner = ownerName.lowercased()
if let match = runningApps.first(where: {
($0.bundleIdentifier ?? "").lowercased().contains(normalizedOwner)
}) {
return match.processIdentifier
}
}
return nil
}
}

View File

@ -1,32 +0,0 @@
import CoreGraphics
import Foundation
import PeekabooCore
import PeekabooFoundation
@MainActor
extension SeeCommand {
func menuBarRect() throws -> CGRect {
let screens = self.services.screens.listScreens()
guard let mainScreen = screens.first(where: \.isPrimary) ?? screens.first else {
throw PeekabooError.captureFailed("No main screen found")
}
let menuBarHeight = self.menuBarHeight(for: mainScreen)
return CGRect(
x: mainScreen.frame.origin.x,
y: mainScreen.frame.origin.y + mainScreen.frame.height - menuBarHeight,
width: mainScreen.frame.width,
height: menuBarHeight
)
}
func menuBarHeight(for screen: MenuBarPopoverDetector.ScreenBounds) -> CGFloat {
let height = max(0, screen.frame.maxY - screen.visibleFrame.maxY)
return height > 0 ? height : 24.0
}
private func menuBarHeight(for screen: ScreenInfo) -> CGFloat {
let height = max(0, screen.frame.maxY - screen.visibleFrame.maxY)
return height > 0 ? height : 24.0
}
}

View File

@ -1,106 +0,0 @@
import CoreGraphics
import Foundation
import PeekabooCore
@MainActor
extension SeeCommand {
func menuBarCandidateOCRMatcher(hints: [String]) -> MenuBarPopoverResolver.CandidateOCR {
let selector = self.menuBarPopoverOCRSelector()
return { candidate, _ in
guard let match = try await selector.matchCandidate(
windowID: CGWindowID(candidate.windowId),
bounds: candidate.bounds,
hints: hints
)
else { return nil }
return MenuBarPopoverResolver.OCRMatch(
captureResult: match.captureResult,
bounds: match.bounds
)
}
}
func menuBarAreaOCRMatcher() -> MenuBarPopoverResolver.AreaOCR {
let selector = self.menuBarPopoverOCRSelector()
return { preferredX, hints in
guard let match = try await selector.matchArea(preferredX: preferredX, hints: hints) else { return nil }
return MenuBarPopoverResolver.OCRMatch(
captureResult: match.captureResult,
bounds: match.bounds
)
}
}
func captureMenuBarPopoverFromOpenMenu(
openExtra: MenuExtraInfo?,
appHint: String?
) async throws -> MenuBarPopoverCapture? {
let ownerPID: pid_t? = if let openExtra {
await self.resolveMenuExtraOwnerPID(openExtra)
} else {
nil
}
let titles = [
openExtra?.title,
openExtra?.ownerName,
openExtra?.rawTitle,
appHint,
].compactMap { $0?.trimmingCharacters(in: .whitespacesAndNewlines) }
for candidate in titles where !candidate.isEmpty {
if let frame = try? await self.services.menu.menuExtraOpenMenuFrame(
title: candidate,
ownerPID: ownerPID
),
let capture = try await self.captureMenuBarPopoverByFrame(
frame,
hint: appHint ?? openExtra?.title,
ownerHint: openExtra?.ownerName
) {
return capture
}
}
return nil
}
func ocrElements(imageData: Data, windowBounds: CGRect?) throws -> [DetectedElement] {
guard let windowBounds else { return [] }
let result = try OCRService().recognizeText(in: imageData)
return ObservationOCRMapper.elements(from: result, windowBounds: windowBounds)
}
private func captureMenuBarPopoverByFrame(
_ frame: CGRect,
hint: String?,
ownerHint: String?
) async throws -> MenuBarPopoverCapture? {
let selector = self.menuBarPopoverOCRSelector()
if let match = try await selector.matchFrame(
frame,
hints: MenuBarPopoverResolverContext.normalizedHints([hint, ownerHint])
) {
self.logger.verbose(
"Selected menu bar popover via AX menu frame",
category: "Capture",
metadata: [
"rect": "\(match.bounds)"
]
)
return MenuBarPopoverCapture(
captureResult: match.captureResult,
windowBounds: match.bounds,
windowId: nil
)
}
return nil
}
private func menuBarPopoverOCRSelector() -> ObservationMenuBarPopoverOCRSelector {
ObservationMenuBarPopoverOCRSelector(
screenCapture: self.services.screenCapture,
screens: self.services.screens.listScreens()
)
}
}

View File

@ -1,181 +0,0 @@
import Commander
import CoreGraphics
import Foundation
import PeekabooAutomationKit
import PeekabooCore
@available(macOS 14.0, *)
@MainActor
extension SeeCommand {
func determineMode() -> PeekabooCore.CaptureMode {
if let mode = self.mode {
mode
} else if self.app != nil || self.pid != nil || self.windowTitle != nil || self.windowId != nil {
.window
} else {
.frontmost
}
}
func observationTargetForCaptureWithDetectionIfPossible() throws -> DesktopObservationTargetRequest? {
if self.menubar {
let hint = self.menuBarAppHint()
return .menubarPopover(
hints: MenuBarPopoverResolverContext.normalizedHints([hint]),
openIfNeeded: MenuBarPopoverOpenOptions(clickHint: hint)
)
}
switch self.determineMode() {
case .window:
if let windowId {
return .windowID(CGWindowID(windowId))
}
if let appValue = self.app?.lowercased() {
switch appValue {
case "menubar":
return .menubar
case "frontmost":
return .frontmost
default:
break
}
}
if let pid = try self.resolveExplicitPIDObservationTarget() {
return .pid(pid, window: self.seeWindowSelection)
}
if self.app != nil || self.pid != nil {
return try .app(identifier: self.resolveApplicationIdentifier(), window: self.seeWindowSelection)
}
throw ValidationError("Provide --window-id, or --app/--pid for window mode")
case .frontmost:
return .frontmost
case .screen:
if let screenIndex {
return .screen(index: screenIndex)
}
if self.analyze != nil {
return .screen(index: 0)
}
return nil
case .multi:
return nil
case .area:
throw ValidationError(
"Area capture mode is not supported by `see`; use `image --mode area --region x,y,width,height` " +
"or a window/screen target."
)
}
}
func makeObservationRequest(
target: DesktopObservationTargetRequest,
snapshotID: String? = nil
) -> DesktopObservationRequest {
DesktopObservationRequest(
target: target,
capture: DesktopCaptureOptions(
engine: self.observationCaptureEnginePreference,
scale: .logical1x,
visualizerMode: .screenshotFlash
),
detection: self.observationDetectionOptions(for: target),
output: DesktopObservationOutputOptions(
path: self.screenshotOutputPath(snapshotID: snapshotID),
saveRawScreenshot: true,
saveAnnotatedScreenshot: self.annotate && self.allowsAnnotation(for: target),
saveSnapshot: true,
snapshotID: snapshotID
)
)
}
func observationTargetDescription(_ target: DesktopObservationTargetRequest) -> String {
switch target {
case let .screen(index):
"screen:\(index.map(String.init) ?? "primary")"
case .allScreens:
"all-screens"
case .frontmost:
"frontmost"
case let .app(identifier, _):
"app:\(identifier)"
case let .pid(pid, _):
"pid:\(pid)"
case let .windowID(windowID):
"window-id:\(windowID)"
case let .area(rect):
"area:\(Int(rect.origin.x)),\(Int(rect.origin.y)),\(Int(rect.width))x\(Int(rect.height))"
case .menubar:
"menubar"
case .menubarPopover:
"menubar-popover"
}
}
private var seeWindowSelection: WindowSelection {
if let windowTitle {
return .title(windowTitle)
}
return .automatic
}
func allowsAnnotation(for target: DesktopObservationTargetRequest) -> Bool {
switch target {
case .screen, .allScreens, .menubar:
false
default:
true
}
}
private func observationDetectionOptions(for target: DesktopObservationTargetRequest) -> DesktopDetectionOptions {
switch target {
case .menubarPopover:
DesktopDetectionOptions(
mode: .none,
allowWebFocusFallback: false,
preferOCR: true,
traversalBudget: self.axTraversalBudget()
)
default:
DesktopDetectionOptions(
mode: .accessibility,
allowWebFocusFallback: !self.noWebFocus,
traversalBudget: self.axTraversalBudget()
)
}
}
func axTraversalBudget() -> AXTraversalBudget {
AXTraversalBudget.resolved(
maxDepth: self.validatedTraversalLimit(self.maxDepth, option: "--max-depth"),
maxElementCount: self.validatedTraversalLimit(self.maxElements, option: "--max-elements"),
maxChildrenPerNode: self.validatedTraversalLimit(self.maxChildren, option: "--max-children")
)
}
private func validatedTraversalLimit(_ value: Int?, option: String) -> Int? {
guard let value else { return nil }
guard value > 0 else {
self.logger.warn("\(option) must be positive; using default AX traversal budget")
return nil
}
return value
}
private var observationCaptureEnginePreference: CaptureEnginePreference {
ObservationCommandSupport.captureEnginePreference(
cliValue: self.captureEngine,
configuredValue: self.configuredCaptureEnginePreference
)
}
}

View File

@ -1,186 +0,0 @@
import Foundation
import PeekabooCore
@available(macOS 14.0, *)
@MainActor
extension SeeCommand {
func renderResults(context: SeeCommandRenderContext) throws {
try Task.checkCancellation()
if self.jsonOutput {
try self.outputJSONResults(context: context)
} else {
try self.outputTextResults(context: context)
}
}
/// Fetches the menu bar summary only when verbose output is requested, with a short timeout.
func fetchMenuBarSummaryIfEnabled() async -> MenuBarSummary? {
guard self.verbose else { return nil }
do {
return try await Self.withWallClockTimeout(seconds: 2.5) {
try Task.checkCancellation()
return await self.getMenuBarItemsSummary()
}
} catch {
self.logger.debug(
"Skipping menu bar summary",
category: "Menu",
metadata: ["reason": error.localizedDescription]
)
return nil
}
}
/// Drives the deadline independently while the MainActor operation is suspended.
/// Synchronous MainActor calls cannot be preempted.
static func withWallClockTimeout<T: Sendable>(
seconds: TimeInterval,
timeoutErrorSeconds: TimeInterval? = nil,
interactionMutationTracker: InteractionMutationTracker? = nil,
operation: @escaping @MainActor @Sendable () async throws -> T
) async throws -> T {
try await withMainActorCommandTimeout(
seconds: seconds,
operationName: "see",
timeoutError: { CaptureError.detectionTimedOut(timeoutErrorSeconds ?? seconds) },
interactionMutationTracker: interactionMutationTracker,
operation: { try await operation() }
)
}
func performAnalysisDetailed(imagePath: String, prompt: String) async throws -> SeeAnalysisData {
let ai = PeekabooAIService()
let res = try await ai.analyzeImageFileDetailed(at: imagePath, question: prompt, model: nil)
return SeeAnalysisData(provider: res.provider, model: res.model, text: res.text)
}
private func outputJSONResults(context: SeeCommandRenderContext) throws {
let uiElements: [UIElementSummary] = context.elements.all.map { element in
UIElementSummary(
id: element.id,
role: element.type.rawValue,
title: element.attributes["title"],
label: element.label,
description: element.attributes["description"],
role_description: element.attributes["roleDescription"],
help: element.attributes["help"],
identifier: element.attributes["identifier"],
bounds: UIElementBounds(element.bounds),
is_actionable: element.isEnabled,
keyboard_shortcut: element.attributes["keyboardShortcut"]
)
}
let snapshotPaths = self.snapshotPaths(for: context)
let output = SeeResult(
snapshot_id: context.snapshotId,
screenshot_raw: snapshotPaths.raw,
screenshot_annotated: snapshotPaths.annotated,
ui_map: snapshotPaths.map,
application_name: context.metadata.windowContext?.applicationName,
window_title: context.metadata.windowContext?.windowTitle,
is_dialog: context.metadata.isDialog,
element_count: context.metadata.elementCount,
interactable_count: context.elements.all.count { $0.isEnabled },
capture_mode: self.determineMode().rawValue,
analysis: context.analysis,
execution_time: context.executionTime,
ui_elements: uiElements,
menu_bar: context.menuBar,
truncation: SeeTruncationSummary(metadata: context.metadata),
observation: context.observation
)
outputSuccessCodable(data: output, logger: self.outputLogger)
}
private func getMenuBarItemsSummary() async -> MenuBarSummary {
var menuExtras: [MenuExtraInfo] = []
do {
menuExtras = try await self.services.menu.listMenuExtras()
} catch {
menuExtras = []
}
let menus = menuExtras.map { extra in
MenuBarSummary.MenuSummary(
title: extra.title,
item_count: 1,
enabled: true,
items: [
MenuBarSummary.MenuItemSummary(
title: extra.title,
enabled: true,
keyboard_shortcut: nil
)
]
)
}
return MenuBarSummary(menus: menus)
}
private func outputTextResults(context: SeeCommandRenderContext) throws {
try Task.checkCancellation()
print("🖼️ Screenshot saved to: \(context.screenshotPath)")
if let annotatedPath = context.annotatedPath {
print("📝 Annotated screenshot: \(annotatedPath)")
}
if let appName = context.metadata.windowContext?.applicationName {
print("📱 Application: \(appName)")
}
if let windowTitle = context.metadata.windowContext?.windowTitle {
let windowType = context.metadata.isDialog ? "Dialog" : "Window"
let icon = context.metadata.isDialog ? "🗨️" : "[win]"
print("\(icon) \(windowType): \(windowTitle)")
}
print("🧊 Detection method: \(context.metadata.method)")
print("📊 UI elements detected: \(context.metadata.elementCount)")
print("⚙️ Interactable elements: \(context.elements.all.count { $0.isEnabled })")
if let truncationInfo = context.metadata.truncationInfo, truncationInfo.isTruncated {
print("⚠️ \(truncationInfo.remediationMessage(budget: context.metadata.windowContext?.traversalBudget))")
}
let formattedDuration = String(format: "%.2f", context.executionTime)
print("⏱️ Execution time: \(formattedDuration)s")
if let analysis = context.analysis {
print("\n🤖 AI Analysis\n\(analysis.text)")
}
if context.metadata.elementCount > 0 {
print("\n🔍 Element Summary")
for element in context.elements.all.prefix(10) {
let summaryLabel = element.label ?? element.attributes["title"] ?? element.value ?? "Untitled"
print("\(element.id) (\(element.type.rawValue)) - \(summaryLabel)")
}
if context.metadata.elementCount > 10 {
print(" ...and \(context.metadata.elementCount - 10) more elements")
}
}
if self.annotate, context.annotatedPath != nil {
print("\n📝 Annotated screenshot created")
}
print("\nSnapshot ID: \(context.snapshotId)")
let terminalCapabilities = TerminalDetector.detectCapabilities()
if terminalCapabilities.recommendedOutputMode == .minimal {
print("Agent: Use a tool like view_image to inspect it.")
}
}
private func snapshotPaths(for context: SeeCommandRenderContext) -> SnapshotPaths {
let publishesScreenshotPaths = !self.usesTemporaryScreenshotOutput
return SnapshotPaths(
raw: publishesScreenshotPaths ? context.screenshotPath : "",
annotated: publishesScreenshotPaths ? context.annotatedPath ?? "" : "",
map: self.services.snapshots.getSnapshotStoragePath() + "/\(context.snapshotId)/snapshot.json"
)
}
}

View File

@ -1,160 +0,0 @@
import Algorithms
import Foundation
import PeekabooCore
@MainActor
extension SeeCommand {
func performScreenCapture() async throws -> CaptureResult {
if self.annotate {
self.logger.info("Annotation is disabled for full screen captures due to performance constraints")
}
self.logger.verbose("Initiating screen capture", category: "Capture")
self.logger.startTimer("screen_capture")
defer {
self.logger.stopTimer("screen_capture")
}
if let index = self.screenIndex ?? (self.analyze != nil ? 0 : nil) {
self.logger.verbose("Capturing specific screen", category: "Capture", metadata: ["screenIndex": index])
let result = try await self.services.screenCapture.captureScreen(displayIndex: index)
if !self.jsonOutput, let displayInfo = result.metadata.displayInfo {
self.printScreenDisplayInfo(index: index, displayInfo: displayInfo)
}
self.logger.verbose("Screen capture completed", category: "Capture", metadata: [
"mode": "screen-index",
"screenIndex": index,
"imageBytes": result.imageData.count
])
return result
}
self.logger.verbose("Capturing all screens", category: "Capture")
let results = try await self.captureAllScreens()
if results.isEmpty {
throw CaptureError.captureFailure("Failed to capture any screens")
}
if !self.jsonOutput {
print("📸 Captured \(results.count) screen(s):")
}
for (index, result) in results.indexed() {
if index > 0 {
let screenPath = self.screenOutputPath(for: index)
try result.imageData.write(to: URL(fileURLWithPath: screenPath))
if !self.jsonOutput, let displayInfo = result.metadata.displayInfo {
let fileSize = self.getFileSize(screenPath) ?? 0
let suffix = "\(screenPath) (\(self.formatFileSize(Int64(fileSize))))"
self.printScreenDisplayInfo(
index: index,
displayInfo: displayInfo,
indent: " ",
suffix: suffix
)
}
} else if !self.jsonOutput, let displayInfo = result.metadata.displayInfo {
self.printScreenDisplayInfo(
index: index,
displayInfo: displayInfo,
indent: " ",
suffix: "(primary)"
)
}
}
self.logger.verbose("Multi-screen capture completed", category: "Capture", metadata: [
"count": results.count,
"primaryBytes": results.first?.imageData.count ?? 0
])
return results[0]
}
func captureAllScreens() async throws -> [CaptureResult] {
var results: [CaptureResult] = []
let displays = self.services.screens.listScreens()
self.logger.info("Found \(displays.count) display(s) to capture")
for display in displays {
self.logger.verbose("Capturing display \(display.index)", category: "MultiScreen", metadata: [
"displayID": display.displayID,
"width": display.frame.width,
"height": display.frame.height
])
do {
let result = try await self.services.screenCapture.captureScreen(displayIndex: display.index)
results.append(result)
} catch {
self.logger.error("Failed to capture display \(display.index): \(error)")
// Continue capturing other screens even if one fails
}
}
if results.isEmpty {
throw CaptureError.captureFailure("Failed to capture any screens")
}
return results
}
func formatFileSize(_ bytes: Int64) -> String {
let formatter = ByteCountFormatter()
formatter.countStyle = .file
return formatter.string(fromByteCount: bytes)
}
private func screenOutputPath(for index: Int) -> String {
if let basePath = self.path {
let expanded = (basePath as NSString).expandingTildeInPath
if ObservationOutputPathResolver.isDirectoryLike(expanded) {
return URL(fileURLWithPath: expanded, isDirectory: true)
.appendingPathComponent(self.defaultScreenOutputFilename(for: index))
.path
}
let directory = (expanded as NSString).deletingLastPathComponent
let filename = (expanded as NSString).lastPathComponent
let nameWithoutExt = (filename as NSString).deletingPathExtension
let ext = (filename as NSString).pathExtension
let fileExtension = ext.isEmpty ? "png" : ext
return (directory as NSString)
.appendingPathComponent("\(nameWithoutExt)_screen\(index).\(fileExtension)")
}
return self.defaultScreenOutputFilename(for: index)
}
private func defaultScreenOutputFilename(for index: Int) -> String {
let timestamp = ISO8601DateFormatter().string(from: Date())
return "screenshot_\(timestamp)_screen\(index).png"
}
private func screenDisplayBaseText(index: Int, displayInfo: DisplayInfo) -> String {
let displayName = displayInfo.name ?? "Display \(index)"
let bounds = displayInfo.bounds
let resolution = "(\(Int(bounds.width))×\(Int(bounds.height)))"
return "[scrn] Display \(index): \(displayName) \(resolution)"
}
private func printScreenDisplayInfo(
index: Int,
displayInfo: DisplayInfo,
indent: String = "",
suffix: String? = nil
) {
var line = self.screenDisplayBaseText(index: index, displayInfo: displayInfo)
if let suffix {
line += "\(suffix)"
}
print("\(indent)\(line)")
}
}

View File

@ -1,240 +0,0 @@
import CoreGraphics
import Foundation
import PeekabooCore
struct CaptureContext {
let captureResult: CaptureResult
let captureBounds: CGRect?
let prefersOCR: Bool
let ocrMethod: String?
let windowIdOverride: Int?
}
struct MenuBarPopoverCapture {
let captureResult: CaptureResult
let windowBounds: CGRect
let windowId: Int?
}
struct CaptureAndDetectionResult {
let snapshotId: String
let screenshotPath: String
let annotatedPath: String?
let elements: DetectedElements
let metadata: DetectionMetadata
let observation: SeeObservationDiagnostics?
}
struct SnapshotPaths {
let raw: String
let annotated: String
let map: String
}
struct SeeCommandRenderContext {
let snapshotId: String
let screenshotPath: String
let annotatedPath: String?
let metadata: DetectionMetadata
let elements: DetectedElements
let analysis: SeeAnalysisData?
let executionTime: TimeInterval
let observation: SeeObservationDiagnostics?
let menuBar: MenuBarSummary?
}
struct UIElementSummary: Codable {
let id: String
let role: String
let title: String?
let label: String?
let description: String?
let role_description: String?
let help: String?
let identifier: String?
let bounds: UIElementBounds
let is_actionable: Bool
let keyboard_shortcut: String?
}
struct UIElementBounds: Codable {
let x: Double
let y: Double
let width: Double
let height: Double
init(_ rect: CGRect) {
self.x = rect.origin.x
self.y = rect.origin.y
self.width = rect.size.width
self.height = rect.size.height
}
}
struct SeeAnalysisData: Codable {
let provider: String
let model: String
let text: String
}
struct SeeObservationDiagnostics: Codable {
let spans: [SeeObservationSpan]
let warnings: [String]
let state_snapshot: SeeDesktopStateSnapshotSummary?
let target: SeeObservationTargetDiagnostics?
init(timings: ObservationTimings, diagnostics: DesktopObservationDiagnostics) {
self.spans = timings.spans.map(SeeObservationSpan.init)
self.warnings = diagnostics.warnings
self.state_snapshot = diagnostics.stateSnapshot.map(SeeDesktopStateSnapshotSummary.init)
self.target = diagnostics.target.map(SeeObservationTargetDiagnostics.init)
}
}
struct SeeObservationTargetDiagnostics: Codable {
let requested_kind: String
let resolved_kind: String
let source: String
let hints: [String]
let open_if_needed: Bool
let click_hint: String?
let window_id: Int?
let bounds: CGRect?
let capture_scale_hint: CGFloat?
init(_ diagnostics: DesktopObservationTargetDiagnostics) {
self.requested_kind = diagnostics.requestedKind
self.resolved_kind = diagnostics.resolvedKind
self.source = diagnostics.source
self.hints = diagnostics.hints
self.open_if_needed = diagnostics.openIfNeeded
self.click_hint = diagnostics.clickHint
self.window_id = diagnostics.windowID
self.bounds = diagnostics.bounds
self.capture_scale_hint = diagnostics.captureScaleHint
}
}
struct SeeObservationSpan: Codable {
let name: String
let duration_ms: Double
let metadata: [String: String]
init(_ span: ObservationSpan) {
self.name = span.name
self.duration_ms = span.durationMS
self.metadata = span.metadata
}
}
struct SeeDesktopStateSnapshotSummary: Codable {
let display_count: Int
let running_application_count: Int
let window_count: Int
let frontmost_application_name: String?
let frontmost_bundle_identifier: String?
let frontmost_window_title: String?
let frontmost_window_id: Int?
init(_ summary: DesktopStateSnapshotSummary) {
self.display_count = summary.displayCount
self.running_application_count = summary.runningApplicationCount
self.window_count = summary.windowCount
self.frontmost_application_name = summary.frontmostApplication?.name
self.frontmost_bundle_identifier = summary.frontmostApplication?.bundleIdentifier
self.frontmost_window_title = summary.frontmostWindow?.title
self.frontmost_window_id = summary.frontmostWindow?.windowID
}
}
struct SeeTruncationSummary: Codable {
let max_depth_reached: Bool
let max_element_count_reached: Bool
let max_children_per_node_reached: Bool
let warning: String
init?(metadata: DetectionMetadata) {
guard let truncationInfo = metadata.truncationInfo, truncationInfo.isTruncated else {
return nil
}
self.max_depth_reached = truncationInfo.maxDepthReached
self.max_element_count_reached = truncationInfo.maxElementCountReached
self.max_children_per_node_reached = truncationInfo.maxChildrenPerNodeReached
self.warning = truncationInfo.remediationMessage(budget: metadata.windowContext?.traversalBudget)
}
}
struct SeeResult: Codable {
let snapshot_id: String
let screenshot_raw: String
let screenshot_annotated: String
let ui_map: String
let application_name: String?
let window_title: String?
let is_dialog: Bool
let element_count: Int
let interactable_count: Int
let capture_mode: String
let analysis: SeeAnalysisData?
let execution_time: TimeInterval
let ui_elements: [UIElementSummary]
let truncation: SeeTruncationSummary?
let menu_bar: MenuBarSummary?
let observation: SeeObservationDiagnostics?
var success: Bool = true
init(
snapshot_id: String,
screenshot_raw: String,
screenshot_annotated: String,
ui_map: String,
application_name: String?,
window_title: String?,
is_dialog: Bool,
element_count: Int,
interactable_count: Int,
capture_mode: String,
analysis: SeeAnalysisData?,
execution_time: TimeInterval,
ui_elements: [UIElementSummary],
menu_bar: MenuBarSummary?,
truncation: SeeTruncationSummary? = nil,
observation: SeeObservationDiagnostics? = nil,
success: Bool = true
) {
self.snapshot_id = snapshot_id
self.screenshot_raw = screenshot_raw
self.screenshot_annotated = screenshot_annotated
self.ui_map = ui_map
self.application_name = application_name
self.window_title = window_title
self.is_dialog = is_dialog
self.element_count = element_count
self.interactable_count = interactable_count
self.capture_mode = capture_mode
self.analysis = analysis
self.execution_time = execution_time
self.ui_elements = ui_elements
self.truncation = truncation
self.menu_bar = menu_bar
self.observation = observation
self.success = success
}
}
struct MenuBarSummary: Codable {
let menus: [MenuSummary]
struct MenuSummary: Codable {
let title: String
let item_count: Int
let enabled: Bool
let items: [MenuItemSummary]
}
struct MenuItemSummary: Codable {
let title: String
let enabled: Bool
let keyboard_shortcut: String?
}
}

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,618 @@
//
// SmartLabelPlacer.swift
// PeekabooCore
//
import AppKit
import Foundation
import PeekabooCore
import PeekabooFoundation
protocol SmartLabelPlacerTextDetecting: AnyObject {
func scoreRegionForLabelPlacement(_ rect: NSRect, in image: NSImage) -> Float
func analyzeRegion(_ rect: NSRect, in image: NSImage) -> AcceleratedTextDetector.EdgeDensityResult
}
extension AcceleratedTextDetector: SmartLabelPlacerTextDetecting {}
/// Handles intelligent label placement for UI element annotations
final class SmartLabelPlacer {
static let defaultScoreRegionPadding: CGFloat = 6
// MARK: - Properties
private let image: NSImage
private let imageSize: NSSize
private let textDetector: any SmartLabelPlacerTextDetecting
private let fontSize: CGFloat
private let labelSpacing: CGFloat = 3
private let cornerInset: CGFloat = 2
private let scoreRegionPadding: CGFloat
// Label placement debugging
private let debugMode: Bool
private let logger: Logger
// MARK: - Initialization
init(
image: NSImage,
fontSize: CGFloat = 8,
debugMode: Bool = false,
logger: Logger = Logger.shared,
textDetector: (any SmartLabelPlacerTextDetecting)? = nil
) {
self.image = image
self.imageSize = image.size
self.textDetector = textDetector ?? AcceleratedTextDetector(logger: logger)
self.fontSize = fontSize
self.debugMode = debugMode
self.logger = logger
self.scoreRegionPadding = Self.defaultScoreRegionPadding
}
// MARK: - Public Methods
/// Finds the best position for a label given an element's bounds
/// - Parameters:
/// - element: The detected UI element
/// - elementRect: The element's rectangle in drawing coordinates (Y-flipped)
/// - labelSize: The size of the label to place
/// - existingLabels: Already placed labels to avoid overlapping
/// - allElements: All elements to avoid overlapping with
/// - Returns: Tuple of (labelRect, connectionPoint) or nil if no good position found
func findBestLabelPosition(
for element: DetectedElement,
elementRect: NSRect,
labelSize: NSSize,
existingLabels: [(rect: NSRect, element: DetectedElement)],
allElements: [(element: DetectedElement, rect: NSRect)]
) -> (labelRect: NSRect, connectionPoint: NSPoint?)? {
// Finds the best position for a label given an element's bounds
if self.debugMode {
self.logger.verbose(
"Finding position for \(element.id) (\(element.type)) with \(element.label ?? "no label")",
category: "LabelPlacement"
)
}
// Check if element is horizontally constrained (has neighbors on sides)
let isHorizontallyConstrained = self.isElementHorizontallyConstrained(
element: element,
elementRect: elementRect,
allElements: allElements
)
// Generate candidate positions based on element type and constraints
let candidates = self.generateCandidatePositions(
for: element,
elementRect: elementRect,
labelSize: labelSize,
prioritizeVertical: isHorizontallyConstrained
)
// Filter out positions that overlap with other elements or labels
let validPositions = self.filterValidPositions(
candidates: candidates,
element: element,
existingLabels: existingLabels,
allElements: allElements,
logRejections: self.debugMode
)
if self.debugMode {
self.logger.verbose(
"Found \(validPositions.count) valid external positions out of \(candidates.count) candidates",
category: "LabelPlacement"
)
}
// If no valid positions, try with relaxed constraints before falling back to internal
if validPositions.isEmpty {
if self.debugMode {
self.logger.verbose(
"No valid positions with strict constraints, trying relaxed constraints",
category: "LabelPlacement"
)
}
// Try with relaxed constraints (allow slight boundary overflow)
let relaxedCandidates = self.generateCandidatePositions(
for: element,
elementRect: elementRect,
labelSize: labelSize,
prioritizeVertical: isHorizontallyConstrained,
relaxedSpacing: true
)
let relaxedValidPositions = self.filterValidPositions(
candidates: relaxedCandidates,
element: element,
existingLabels: existingLabels,
allElements: allElements,
allowBoundaryOverflow: true,
logRejections: self.debugMode
)
if !relaxedValidPositions.isEmpty {
if self.debugMode {
self.logger.verbose(
"Found \(relaxedValidPositions.count) valid positions with relaxed constraints",
category: "LabelPlacement"
)
}
// Score and pick best relaxed position
let scoredRelaxed = self.scorePositions(relaxedValidPositions, elementRect: elementRect)
if let best = scoredRelaxed.max(by: { $0.score < $1.score }) {
let connectionPoint = self.calculateConnectionPoint(
for: best.index,
elementRect: elementRect,
isExternal: true
)
return (labelRect: best.rect, connectionPoint: connectionPoint)
}
}
// Only use internal placement as absolute last resort
if self.debugMode {
self.logger.info(
"No valid external positions even with relaxed constraints, falling back to internal placement",
category: "LabelPlacement"
)
}
return self.findInternalPosition(
for: element,
elementRect: elementRect,
labelSize: labelSize
)
}
// Score each valid position using edge detection
let scoredPositions = self.scorePositions(validPositions, elementRect: elementRect)
// Pick the best scoring position
guard let best = scoredPositions.max(by: { $0.score < $1.score }) else {
if self.debugMode {
self.logger.verbose("No scored positions available", category: "LabelPlacement")
}
return nil
}
if self.debugMode {
self.logger.verbose(
"""
Best position for \(element.id): type \(best.type) with score \(best.score) \
(higher = better, 1.0 = clear area, 0.0 = text/edges)
""",
category: "LabelPlacement",
metadata: [
"elementId": element.id,
"positionType": best.type.rawValue,
"score": best.score
]
)
}
// Calculate connection point if needed
let connectionPoint = self.calculateConnectionPoint(
for: best.index,
elementRect: elementRect,
isExternal: best.index < candidates.count
)
return (labelRect: best.rect, connectionPoint: connectionPoint)
}
// MARK: - Private Methods
private func isElementHorizontallyConstrained(
element: DetectedElement,
elementRect: NSRect,
allElements: [(element: DetectedElement, rect: NSRect)]
) -> Bool {
// Check if there are elements close to the left and right
let horizontalThreshold: CGFloat = 20 // pixels
var hasLeftNeighbor = false
var hasRightNeighbor = false
for (otherElement, otherRect) in allElements {
guard otherElement.id != element.id else { continue }
// Check if vertically aligned (similar Y position)
let verticalOverlap = min(elementRect.maxY, otherRect.maxY) - max(elementRect.minY, otherRect.minY)
guard verticalOverlap > elementRect.height * 0.5 else { continue }
// Check horizontal proximity
if otherRect.maxX < elementRect.minX && elementRect.minX - otherRect.maxX < horizontalThreshold {
hasLeftNeighbor = true
}
if otherRect.minX > elementRect.maxX && otherRect.minX - elementRect.maxX < horizontalThreshold {
hasRightNeighbor = true
}
}
return hasLeftNeighbor || hasRightNeighbor
}
private func generateCandidatePositions(
for element: DetectedElement,
elementRect: NSRect,
labelSize: NSSize,
prioritizeVertical: Bool = false,
relaxedSpacing: Bool = false
) -> [(rect: NSRect, index: Int, type: PositionType)] {
var positions: [(rect: NSRect, index: Int, type: PositionType)] = []
let spacing = relaxedSpacing ? self.labelSpacing * 2 : self.labelSpacing
// ALWAYS generate above/below positions first for ALL element types
// This is the key fix - buttons need these positions too!
positions.append(contentsOf: [
// Above (priority position for horizontally constrained elements)
(NSRect(
x: elementRect.midX - labelSize.width / 2,
y: elementRect.maxY + spacing,
width: labelSize.width,
height: labelSize.height
), 0, .externalAbove),
// Below
(NSRect(
x: elementRect.midX - labelSize.width / 2,
y: elementRect.minY - labelSize.height - spacing,
width: labelSize.width,
height: labelSize.height
), 1, .externalBelow),
])
// For buttons and links, add corner positions
if element.type == .button || element.type == .link {
// External corners (less intrusive)
positions.append(contentsOf: [
// Top-left external
(NSRect(
x: elementRect.minX - labelSize.width - spacing,
y: elementRect.maxY - labelSize.height,
width: labelSize.width,
height: labelSize.height
), 2, .externalTopLeft),
// Top-right external
(NSRect(
x: elementRect.maxX + spacing,
y: elementRect.maxY - labelSize.height,
width: labelSize.width,
height: labelSize.height
), 3, .externalTopRight),
// Bottom-left external
(NSRect(
x: elementRect.minX - labelSize.width - spacing,
y: elementRect.minY,
width: labelSize.width,
height: labelSize.height
), 4, .externalBottomLeft),
// Bottom-right external
(NSRect(
x: elementRect.maxX + spacing,
y: elementRect.minY,
width: labelSize.width,
height: labelSize.height
), 5, .externalBottomRight),
])
}
// Add side positions
positions.append(contentsOf: [
// Right side
(NSRect(
x: elementRect.maxX + spacing,
y: elementRect.midY - labelSize.height / 2,
width: labelSize.width,
height: labelSize.height
), 6, .externalRight),
// Left side
(NSRect(
x: elementRect.minX - labelSize.width - spacing,
y: elementRect.midY - labelSize.height / 2,
width: labelSize.width,
height: labelSize.height
), 7, .externalLeft),
])
// If element is horizontally constrained, prioritize vertical positions
if prioritizeVertical {
// Move above/below positions to the front of the array
positions.sort { a, b in
let aIsVertical = a.type == .externalAbove || a.type == .externalBelow
let bIsVertical = b.type == .externalAbove || b.type == .externalBelow
if aIsVertical && !bIsVertical { return true }
if !aIsVertical && bIsVertical { return false }
return a.index < b.index
}
}
return positions
}
private func filterValidPositions(
candidates: [(rect: NSRect, index: Int, type: PositionType)],
element: DetectedElement,
existingLabels: [(rect: NSRect, element: DetectedElement)],
allElements: [(element: DetectedElement, rect: NSRect)],
allowBoundaryOverflow: Bool = false,
logRejections: Bool = false
) -> [(rect: NSRect, index: Int, type: PositionType)] {
candidates.filter { candidate in
// Check if within image bounds (with optional relaxation)
if !allowBoundaryOverflow {
let withinBounds = candidate.rect.minX >= -5 && // Allow slight overflow on edges
candidate.rect.maxX <= self.imageSize.width + 5 &&
candidate.rect.minY >= -5 &&
candidate.rect.maxY <= self.imageSize.height + 5
if !withinBounds {
if logRejections {
self.logger.verbose(
"Position \(candidate.type) rejected: outside image bounds",
category: "LabelPlacement",
metadata: [
"rect": "\(candidate.rect)",
"imageBounds": "0,0 \(self.imageSize.width)x\(self.imageSize.height)"
]
)
}
return false
}
}
// Check overlap with other elements
for (otherElement, otherRect) in allElements {
if otherElement.id != element.id && candidate.rect.intersects(otherRect) {
if logRejections {
self.logger.verbose(
"Position \(candidate.type) rejected: overlaps with element \(otherElement.id)",
category: "LabelPlacement",
metadata: [
"candidateRect": "\(candidate.rect)",
"elementRect": "\(otherRect)"
]
)
}
return false
}
}
// Check overlap with existing labels
for (existingLabel, labelElement) in existingLabels where candidate.rect.intersects(existingLabel) {
if logRejections {
self.logger.verbose(
"Position \(candidate.type) rejected: overlaps with label for \(labelElement.id)",
category: "LabelPlacement",
metadata: [
"candidateRect": "\(candidate.rect)",
"existingLabelRect": "\(existingLabel)"
]
)
}
return false
}
return true
}
}
private func scorePositions(
_ positions: [(rect: NSRect, index: Int, type: PositionType)],
elementRect: NSRect
) -> [(rect: NSRect, index: Int, type: PositionType, score: Float)] {
positions.map { position in
// Convert from drawing coordinates to image coordinates for analysis
// Drawing has Y=0 at top, image has Y=0 at bottom
let imageRect = NSRect(
x: position.rect.origin.x,
y: self.imageSize.height - position.rect.origin.y - position.rect.height,
width: position.rect.width,
height: position.rect.height
)
// Expand the sampled area slightly so we avoid busy regions around the label,
// not just underneath it. This helps place annotations over calmer backgrounds.
// NOTE: this is a critical tweakby sampling beyond the label bounds we detect noisy
// backgrounds that would otherwise not register, which is what keeps labels from
// covering interesting UI areas (graphs, text blocks, etc.).
let scoringRect = Self.clampedRect(
imageRect.insetBy(dx: -self.scoreRegionPadding, dy: -self.scoreRegionPadding),
within: NSRect(origin: .zero, size: self.imageSize)
)
// Score using edge detection
var score = self.textDetector.scoreRegionForLabelPlacement(scoringRect, in: self.image)
// Boost score for preferred positions
if position.type == .externalAbove {
score *= 1.2 // Prefer above position
} else if position.type == .externalBelow {
score *= 1.1 // Second preference for below
}
// Ensure score stays in valid range
score = min(1.0, score)
if self.debugMode {
self.logger.verbose(
"Scoring position \(position.index) (\(position.type))",
category: "LabelPlacement",
metadata: [
"index": position.index,
"type": position.type.rawValue,
"drawingRect": "\(position.rect)",
"imageRect": "\(imageRect)",
"score": score
]
)
}
return (rect: position.rect, index: position.index, type: position.type, score: score)
}
}
private func findInternalPosition(
for element: DetectedElement,
elementRect: NSRect,
labelSize: NSSize
) -> (labelRect: NSRect, connectionPoint: NSPoint?)? {
let insidePositions: [NSRect] = if element.type == .button || element.type == .link {
// For buttons, use corners with small inset
[
// Top-left corner
NSRect(
x: elementRect.minX + self.cornerInset,
y: elementRect.maxY - labelSize.height - self.cornerInset,
width: labelSize.width,
height: labelSize.height
),
// Top-right corner
NSRect(
x: elementRect.maxX - labelSize.width - self.cornerInset,
y: elementRect.maxY - labelSize.height - self.cornerInset,
width: labelSize.width,
height: labelSize.height
),
]
} else {
// For other elements
[
// Top-left
NSRect(
x: elementRect.minX + 2,
y: elementRect.maxY - labelSize.height - 2,
width: labelSize.width,
height: labelSize.height
),
]
}
// Find first position that fits
for candidateRect in insidePositions where elementRect.contains(candidateRect) {
// Score this internal position
let imageRect = NSRect(
x: candidateRect.origin.x,
y: self.imageSize.height - candidateRect.origin.y - candidateRect.height,
width: candidateRect.width,
height: candidateRect.height
)
let score = self.textDetector.scoreRegionForLabelPlacement(imageRect, in: self.image)
// Only use if score is acceptable (low edge density)
if score > 0.5 {
return (labelRect: candidateRect, connectionPoint: nil)
}
}
// Ultimate fallback - center
let centerRect = NSRect(
x: elementRect.midX - labelSize.width / 2,
y: elementRect.midY - labelSize.height / 2,
width: labelSize.width,
height: labelSize.height
)
return (labelRect: centerRect, connectionPoint: nil)
}
private func calculateConnectionPoint(
for positionIndex: Int,
elementRect: NSRect,
isExternal: Bool
) -> NSPoint? {
guard isExternal else { return nil }
// Connection points for external positions
// Updated to match new position indices
switch positionIndex {
case 0: // Above
return NSPoint(x: elementRect.midX, y: elementRect.maxY)
case 1: // Below
return NSPoint(x: elementRect.midX, y: elementRect.minY)
case 2, 3, 4, 5: // Corner positions
return NSPoint(x: elementRect.midX, y: elementRect.midY)
case 6: // Right
return NSPoint(x: elementRect.maxX, y: elementRect.midY)
case 7: // Left
return NSPoint(x: elementRect.minX, y: elementRect.midY)
default:
return nil
}
}
// MARK: - Types
private enum PositionType: String {
case externalTopLeft
case externalTopRight
case externalBottomLeft
case externalBottomRight
case externalLeft
case externalRight
case externalAbove
case externalBelow
case internalTopLeft
case internalTopRight
case internalCenter
}
}
extension SmartLabelPlacer {
/// Returns a rect clamped to the provided bounds. If there is no overlap,
/// it returns the original rect to avoid zero-sized inputs.
private static func clampedRect(_ rect: NSRect, within bounds: NSRect) -> NSRect {
let intersection = rect.intersection(bounds)
if intersection.isNull {
return rect
}
return intersection
}
}
// MARK: - Debug Visualization
extension SmartLabelPlacer {
/// Creates a debug image showing edge detection results
func createDebugVisualization(for rect: NSRect) -> NSImage? {
// Convert to image coordinates
let imageRect = NSRect(
x: rect.origin.x,
y: self.imageSize.height - rect.origin.y - rect.height,
width: rect.width,
height: rect.height
)
let result = self.textDetector.analyzeRegion(imageRect, in: self.image)
// Create visualization showing edge density
let debugImage = NSImage(size: rect.size)
debugImage.lockFocus()
// Draw background color based on edge density
let color = if result.hasText {
NSColor.red.withAlphaComponent(0.5) // Bad for labels
} else {
NSColor.green.withAlphaComponent(0.5) // Good for labels
}
color.setFill()
NSRect(origin: .zero, size: rect.size).fill()
// Draw edge density percentage
let text = String(format: "%.1f%%", result.density * 100)
let attributes: [NSAttributedString.Key: Any] = [
.font: NSFont.systemFont(ofSize: 10),
.foregroundColor: NSColor.white
]
text.draw(at: NSPoint(x: 2, y: 2), withAttributes: attributes)
debugImage.unlockFocus()
return debugImage
}
}

View File

@ -1,360 +0,0 @@
import Commander
import CoreGraphics
import Foundation
import PeekabooCore
import PeekabooFoundation
extension PermissionCommand {
struct RequestScreenRecordingSubcommand: OutputFormattable {
nonisolated(unsafe) static var commandDescription: CommandDescription {
MainActorCommandDescription.describe {
CommandDescription(
commandName: "request-screen-recording",
abstract: "Trigger screen recording permission prompt"
)
}
}
@RuntimeStorage private var runtime: CommandRuntime?
private var resolvedRuntime: CommandRuntime {
guard let runtime else {
preconditionFailure("CommandRuntime must be configured before accessing runtime resources")
}
return runtime
}
private var services: any PeekabooServiceProviding {
self.resolvedRuntime.services
}
private var logger: Logger {
self.resolvedRuntime.logger
}
var outputLogger: Logger {
self.logger
}
var jsonOutput: Bool {
self.resolvedRuntime.configuration.jsonOutput
}
/// Trigger the screen recording permission prompt using the best available mechanism.
@MainActor
mutating func run(using runtime: CommandRuntime) async throws {
self.prepare(using: runtime)
if await self.renderIfAlreadyGranted() { return }
let result = await PermissionHelpers.performInteractivePermissionRequest(using: runtime) {
await self.requestScreenRecordingPermission()
}
self.render(result: result)
}
private mutating func prepare(using runtime: CommandRuntime) {
self.runtime = runtime
self.logger.setJsonOutputMode(self.jsonOutput)
}
private func renderIfAlreadyGranted() async -> Bool {
let hasPermission = await self.services.screenCapture.hasScreenRecordingPermission()
guard hasPermission else { return false }
let payload = AgentPermissionActionResult(
action: "request-screen-recording",
already_granted: true,
prompt_triggered: false,
granted: true
)
self.render(result: payload)
return true
}
private func requestScreenRecordingPermission() async -> AgentPermissionActionResult {
if !self.jsonOutput {
print("Requesting Screen Recording permission...\n")
print("Triggering permission prompt...\n")
}
if #available(macOS 10.15, *) {
return self.handleModernPrompt()
} else {
return self.handleLegacyPrompt()
}
}
private func handleModernPrompt() -> AgentPermissionActionResult {
let granted = CGRequestScreenCaptureAccess()
if !self.jsonOutput {
self.printModernResult(granted: granted)
}
return AgentPermissionActionResult(
action: "request-screen-recording",
already_granted: false,
prompt_triggered: true,
granted: granted
)
}
private func handleLegacyPrompt() -> AgentPermissionActionResult {
// Minimum supported macOS is 15+, so reuse the modern path.
self.handleModernPrompt()
}
private func printModernResult(granted: Bool) {
guard !self.jsonOutput else { return }
if granted {
print("✅ Screen Recording permission granted!")
return
}
print("❌ Screen Recording permission denied\n")
print("To grant manually:")
print("1. Open System Settings")
print("2. Go to Privacy & Security > Screen Recording")
print("3. Enable Peekaboo")
}
private func render(result: AgentPermissionActionResult) {
if self.jsonOutput {
outputSuccessCodable(data: result, logger: self.logger)
} else if result.already_granted {
print("✅ Screen Recording permission is already granted!")
}
}
}
struct RequestAccessibilitySubcommand: OutputFormattable {
nonisolated(unsafe) static var commandDescription: CommandDescription {
MainActorCommandDescription.describe {
CommandDescription(
commandName: "request-accessibility",
abstract: "Request accessibility permission"
)
}
}
@RuntimeStorage private var runtime: CommandRuntime?
private var resolvedRuntime: CommandRuntime {
guard let runtime else {
preconditionFailure("CommandRuntime must be configured before accessing runtime resources")
}
return runtime
}
private var services: any PeekabooServiceProviding {
self.resolvedRuntime.services
}
private var logger: Logger {
self.resolvedRuntime.logger
}
var outputLogger: Logger {
self.logger
}
var jsonOutput: Bool {
self.resolvedRuntime.configuration.jsonOutput
}
/// Prompt the user to grant accessibility permission and open the relevant System Settings pane.
@MainActor
mutating func run(using runtime: CommandRuntime) async throws {
self.prepare(using: runtime)
if await self.renderIfAlreadyGranted() { return }
let granted = await PermissionHelpers.performInteractivePermissionRequest(using: runtime) {
self.promptAccessibilityDialog()
}
self.renderAccessibilityResult(granted: granted)
}
private mutating func prepare(using runtime: CommandRuntime) {
self.runtime = runtime
self.logger.setJsonOutputMode(self.jsonOutput)
}
private func renderIfAlreadyGranted() async -> Bool {
let hasPermission = await AutomationServiceBridge
.hasAccessibilityPermission(automation: self.services.automation)
guard hasPermission else { return false }
let payload = AgentPermissionActionResult(
action: "request-accessibility",
already_granted: true,
prompt_triggered: false,
granted: true
)
self.renderAccessibilityResult(payload: payload)
return true
}
private func promptAccessibilityDialog() -> Bool {
if !self.jsonOutput {
print("Requesting Accessibility permission...\n")
print("Opening System Settings to Accessibility permissions...\n")
}
return self.services.permissions.requestAccessibilityPermission(interactive: true)
}
private func renderAccessibilityResult(granted: Bool) {
let payload = AgentPermissionActionResult(
action: "request-accessibility",
already_granted: false,
prompt_triggered: true,
granted: granted
)
self.renderAccessibilityResult(payload: payload)
}
private func renderAccessibilityResult(payload: AgentPermissionActionResult) {
if self.jsonOutput {
outputSuccessCodable(data: payload, logger: self.logger)
return
}
guard !payload.already_granted else {
print("✅ Accessibility permission is already granted!")
return
}
if payload.granted == true {
print("✅ Accessibility permission granted!")
} else {
print("A dialog should have appeared.\n")
print("To grant permission:")
print("1. Click 'Open System Settings' in the dialog")
print("2. Enable Peekaboo in the Accessibility list")
print("3. You may need to restart Peekaboo after granting")
}
}
}
struct RequestEventSynthesizingSubcommand: ErrorHandlingCommand, OutputFormattable {
nonisolated(unsafe) static var commandDescription: CommandDescription {
MainActorCommandDescription.describe {
CommandDescription(
commandName: "request-event-synthesizing",
abstract: "Request event-synthesizing permission for background input"
)
}
}
@RuntimeStorage private var runtime: CommandRuntime?
private var resolvedRuntime: CommandRuntime {
guard let runtime else {
preconditionFailure("CommandRuntime must be configured before accessing runtime resources")
}
return runtime
}
private var services: any PeekabooServiceProviding {
self.resolvedRuntime.services
}
private var logger: Logger {
self.resolvedRuntime.logger
}
var outputLogger: Logger {
self.logger
}
var jsonOutput: Bool {
self.resolvedRuntime.configuration.jsonOutput
}
/// Prompt macOS for event-posting access used by process-targeted hotkeys.
@MainActor
mutating func run(using runtime: CommandRuntime) async throws {
self.prepare(using: runtime)
do {
let payload = try await self.requestEventSynthesizingPermission()
self.renderEventSynthesizingResult(payload: payload)
} catch {
self.handleError(error)
throw ExitCode.failure
}
}
private mutating func prepare(using runtime: CommandRuntime) {
self.runtime = runtime
self.logger.setJsonOutputMode(self.jsonOutput)
}
private func requestEventSynthesizingPermission() async throws -> AgentPermissionActionResult {
let result = try await PermissionHelpers.requestEventSynthesizingPermission(
services: self.services,
runtime: self.resolvedRuntime
)
return AgentPermissionActionResult(
action: result.action,
source: result.source,
already_granted: result.already_granted,
prompt_triggered: result.prompt_triggered,
granted: result.granted
)
}
private func renderEventSynthesizingResult(payload: AgentPermissionActionResult) {
if self.jsonOutput {
outputSuccessCodable(data: payload, logger: self.logger)
return
}
guard !payload.already_granted else {
print("✅ Event Synthesizing permission is already granted!")
return
}
if payload.granted == true {
print("✅ Event Synthesizing permission granted!")
} else {
print("❌ Event Synthesizing permission denied\n")
print("To grant manually:")
print("1. Open System Settings")
print("2. Go to Privacy & Security > Accessibility")
if payload.source == "bridge" {
print("3. Enable the process that showed the prompt")
} else {
print("3. Enable Peekaboo")
}
}
}
}
}
private struct AgentPermissionActionResult: Codable {
let action: String
let source: String?
let already_granted: Bool
let prompt_triggered: Bool
let granted: Bool?
init(
action: String,
source: String? = nil,
already_granted: Bool,
prompt_triggered: Bool,
granted: Bool?
) {
self.action = action
self.source = source
self.already_granted = already_granted
self.prompt_triggered = prompt_triggered
self.granted = granted
}
}
extension PermissionCommand.RequestScreenRecordingSubcommand: ParsableCommand {}
extension PermissionCommand.RequestScreenRecordingSubcommand: AsyncRuntimeCommand {}
extension PermissionCommand.RequestAccessibilitySubcommand: ParsableCommand {}
extension PermissionCommand.RequestAccessibilitySubcommand: AsyncRuntimeCommand {}
extension PermissionCommand.RequestEventSynthesizingSubcommand: ParsableCommand {}
extension PermissionCommand.RequestEventSynthesizingSubcommand: AsyncRuntimeCommand {}

View File

@ -1,120 +0,0 @@
import Commander
import Foundation
import PeekabooCore
import PeekabooFoundation
extension PermissionCommand {
struct StatusSubcommand: OutputFormattable {
nonisolated(unsafe) static var commandDescription: CommandDescription {
MainActorCommandDescription.describe {
CommandDescription(
commandName: "status",
abstract: "Check current permission status"
)
}
}
@RuntimeStorage private var runtime: CommandRuntime?
private var resolvedRuntime: CommandRuntime {
guard let runtime else {
preconditionFailure("CommandRuntime must be configured before accessing runtime resources")
}
return runtime
}
private var services: any PeekabooServiceProviding {
self.resolvedRuntime.services
}
private var logger: Logger {
self.resolvedRuntime.logger
}
var outputLogger: Logger {
self.logger
}
var jsonOutput: Bool {
self.resolvedRuntime.configuration.jsonOutput
}
/// Summarize the current permission state for the agent-centric workflow.
@MainActor
mutating func run(using runtime: CommandRuntime) async throws {
self.prepare(using: runtime)
let status = await self.fetchPermissionStatus()
self.render(status: status)
}
private mutating func prepare(using runtime: CommandRuntime) {
self.runtime = runtime
self.logger.setJsonOutputMode(self.jsonOutput)
}
@MainActor
private func fetchPermissionStatus() async -> AgentPermissionStatusPayload {
if let remoteServices = self.services as? RemotePeekabooServices,
let status = try? await remoteServices.permissionsStatus() {
return AgentPermissionStatusPayload(
screen_recording: status.screenRecording,
accessibility: status.accessibility,
event_synthesizing: status.postEvent
)
}
let screenRecording = await self.services.screenCapture.hasScreenRecordingPermission()
let accessibility = await AutomationServiceBridge
.hasAccessibilityPermission(automation: self.services.automation)
let eventSynthesizing = self.services.permissions.checkPostEventPermission()
return AgentPermissionStatusPayload(
screen_recording: screenRecording,
accessibility: accessibility,
event_synthesizing: eventSynthesizing
)
}
private func render(status: AgentPermissionStatusPayload) {
if self.jsonOutput {
outputSuccessCodable(data: status, logger: self.logger)
return
}
print("Peekaboo Permission Status")
print("==========================\n")
self.printStatusLine(label: "Screen Recording", granted: status.screen_recording)
self.printStatusLine(label: "Accessibility", granted: status.accessibility)
self.printStatusLine(label: "Event Synthesizing", granted: status.event_synthesizing)
if !status.screen_recording || !status.accessibility {
print("\nTo grant missing required permissions:")
if !status.screen_recording {
print("- Run: peekaboo agent permission request-screen-recording")
}
if !status.accessibility {
print("- Run: peekaboo agent permission request-accessibility")
}
}
if !status.event_synthesizing {
print("\nOptional for background input:")
print("- Run: peekaboo agent permission request-event-synthesizing")
}
}
private func printStatusLine(label: String, granted: Bool) {
let state = granted ? "✅ Granted" : "❌ Not granted"
print("\(label): \(state)")
}
}
}
private struct AgentPermissionStatusPayload: Codable {
let screen_recording: Bool
let accessibility: Bool
let event_synthesizing: Bool
}
extension PermissionCommand.StatusSubcommand: ParsableCommand {}
extension PermissionCommand.StatusSubcommand: AsyncRuntimeCommand {}

View File

@ -1,4 +1,9 @@
import AXorcist
import Commander
import CoreGraphics
import Foundation
import PeekabooCore
import PeekabooFoundation
/// Manage and request system permissions
struct PermissionCommand: ParsableCommand {
@ -17,16 +22,379 @@ struct PermissionCommand: ParsableCommand {
# Request accessibility permission
peekaboo agent permission request-accessibility
# Request event-synthesizing permission for background input
peekaboo agent permission request-event-synthesizing
""",
subcommands: [
StatusSubcommand.self,
RequestScreenRecordingSubcommand.self,
RequestAccessibilitySubcommand.self,
RequestEventSynthesizingSubcommand.self
RequestAccessibilitySubcommand.self
],
defaultSubcommand: StatusSubcommand.self
)
}
extension PermissionCommand {
// MARK: - Status Subcommand
struct StatusSubcommand: OutputFormattable {
nonisolated(unsafe) static var commandDescription: CommandDescription {
MainActorCommandDescription.describe {
CommandDescription(
commandName: "status",
abstract: "Check current permission status"
)
}
}
@RuntimeStorage private var runtime: CommandRuntime?
private var resolvedRuntime: CommandRuntime {
guard let runtime else {
preconditionFailure("CommandRuntime must be configured before accessing runtime resources")
}
return runtime
}
private var services: any PeekabooServiceProviding { self.resolvedRuntime.services }
private var logger: Logger { self.resolvedRuntime.logger }
var outputLogger: Logger { self.logger }
var jsonOutput: Bool { self.resolvedRuntime.configuration.jsonOutput }
/// Summarize the current permission state for the agent-centric workflow.
@MainActor
mutating func run(using runtime: CommandRuntime) async throws {
self.prepare(using: runtime)
let status = await self.fetchPermissionStatus()
self.render(status: status)
}
private mutating func prepare(using runtime: CommandRuntime) {
self.runtime = runtime
self.logger.setJsonOutputMode(self.jsonOutput)
}
@MainActor
private func fetchPermissionStatus() async -> AgentPermissionStatusPayload {
let screenRecording = await self.services.screenCapture.hasScreenRecordingPermission()
let accessibility = await AutomationServiceBridge
.hasAccessibilityPermission(automation: self.services.automation)
return AgentPermissionStatusPayload(
screen_recording: screenRecording,
accessibility: accessibility
)
}
private func render(status: AgentPermissionStatusPayload) {
if self.jsonOutput {
outputSuccessCodable(data: status, logger: self.logger)
return
}
print("Peekaboo Permission Status")
print("==========================\n")
self.printStatusLine(label: "Screen Recording", granted: status.screen_recording)
self.printStatusLine(label: "Accessibility", granted: status.accessibility)
guard !status.screen_recording || !status.accessibility else { return }
print("\nTo grant missing permissions:")
if !status.screen_recording {
print("- Run: peekaboo agent permission request-screen-recording")
}
if !status.accessibility {
print("- Run: peekaboo agent permission request-accessibility")
}
}
private func printStatusLine(label: String, granted: Bool) {
let state = granted ? "✅ Granted" : "❌ Not granted"
print("\(label): \(state)")
}
}
// MARK: - Request Screen Recording Subcommand
struct RequestScreenRecordingSubcommand: OutputFormattable {
nonisolated(unsafe) static var commandDescription: CommandDescription {
MainActorCommandDescription.describe {
CommandDescription(
commandName: "request-screen-recording",
abstract: "Trigger screen recording permission prompt"
)
}
}
@RuntimeStorage private var runtime: CommandRuntime?
private var resolvedRuntime: CommandRuntime {
guard let runtime else {
preconditionFailure("CommandRuntime must be configured before accessing runtime resources")
}
return runtime
}
private var services: any PeekabooServiceProviding { self.resolvedRuntime.services }
private var logger: Logger { self.resolvedRuntime.logger }
var outputLogger: Logger { self.logger }
var jsonOutput: Bool { self.resolvedRuntime.configuration.jsonOutput }
/// Trigger the screen recording permission prompt using the best available mechanism.
@MainActor
mutating func run(using runtime: CommandRuntime) async throws {
self.prepare(using: runtime)
if await self.renderIfAlreadyGranted() { return }
let result = await self.requestScreenRecordingPermission()
self.render(result: result)
}
private mutating func prepare(using runtime: CommandRuntime) {
self.runtime = runtime
self.logger.setJsonOutputMode(self.jsonOutput)
}
private func renderIfAlreadyGranted() async -> Bool {
let hasPermission = await self.services.screenCapture.hasScreenRecordingPermission()
guard hasPermission else { return false }
let payload = AgentPermissionActionResult(
action: "request-screen-recording",
already_granted: true,
prompt_triggered: false,
granted: true
)
self.render(result: payload)
return true
}
private func requestScreenRecordingPermission() async -> AgentPermissionActionResult {
if !self.jsonOutput {
print("Requesting Screen Recording permission...\n")
print("Triggering permission prompt...\n")
}
if #available(macOS 10.15, *) {
return self.handleModernPrompt()
} else {
return self.handleLegacyPrompt()
}
}
private func handleModernPrompt() -> AgentPermissionActionResult {
let granted = CGRequestScreenCaptureAccess()
if !self.jsonOutput {
self.printModernResult(granted: granted)
}
return AgentPermissionActionResult(
action: "request-screen-recording",
already_granted: false,
prompt_triggered: true,
granted: granted
)
}
private func handleLegacyPrompt() -> AgentPermissionActionResult {
if #available(macOS 14.0, *) {
// Should never reach on modern macOS; keep for completeness.
return self.handleModernPrompt()
}
if !self.jsonOutput {
print("Attempting screen capture to trigger permission prompt...")
}
if #unavailable(macOS 14.0) {
self.triggerLegacyScreenRecordingPrompt()
}
if !self.jsonOutput {
self.printLegacyGuidance()
}
return AgentPermissionActionResult(
action: "request-screen-recording",
already_granted: false,
prompt_triggered: true,
granted: nil
)
}
/// Legacy (< macOS 14) probe to provoke the Screen Recording prompt.
/// We intentionally keep CGWindowListCreateImage so older systems see the dialog;
/// we arent modernizing this path yet. It can be disabled via env if needed.
@available(
macOS,
introduced: 10.15,
deprecated: 14.0,
message: "ScreenCaptureKit handles permission prompts on macOS 14+."
)
private func triggerLegacyScreenRecordingPrompt() {
guard #unavailable(macOS 14.0) else {
return
}
let enableLegacy = ProcessInfo.processInfo.environment["PEEKABOO_ALLOW_LEGACY_CAPTURE"]?.lowercased()
let allowed = enableLegacy.map { ["1", "true", "yes"].contains($0) } ?? true
if !allowed { return }
_ = CGWindowListCreateImage(
CGRect(x: 0, y: 0, width: 1, height: 1),
.optionAll,
kCGNullWindowID,
.nominalResolution
)
}
private func printModernResult(granted: Bool) {
guard !self.jsonOutput else { return }
if granted {
print("✅ Screen Recording permission granted!")
return
}
print("❌ Screen Recording permission denied\n")
print("To grant manually:")
print("1. Open System Settings")
print("2. Go to Privacy & Security > Screen Recording")
print("3. Enable Peekaboo")
}
private func printLegacyGuidance() {
guard !self.jsonOutput else { return }
print("")
print("If a permission dialog appeared:")
print("- Click 'Open System Settings'")
print("- Enable Screen Recording for Peekaboo")
print("")
print("If no dialog appeared, grant manually in:")
print("System Settings > Privacy & Security > Screen Recording")
}
private func render(result: AgentPermissionActionResult) {
if self.jsonOutput {
outputSuccessCodable(data: result, logger: self.logger)
} else if result.already_granted {
print("✅ Screen Recording permission is already granted!")
}
}
}
// MARK: - Request Accessibility Subcommand
struct RequestAccessibilitySubcommand: OutputFormattable {
nonisolated(unsafe) static var commandDescription: CommandDescription {
MainActorCommandDescription.describe {
CommandDescription(
commandName: "request-accessibility",
abstract: "Request accessibility permission"
)
}
}
@RuntimeStorage private var runtime: CommandRuntime?
private var resolvedRuntime: CommandRuntime {
guard let runtime else {
preconditionFailure("CommandRuntime must be configured before accessing runtime resources")
}
return runtime
}
private var services: any PeekabooServiceProviding { self.resolvedRuntime.services }
private var logger: Logger { self.resolvedRuntime.logger }
var outputLogger: Logger { self.logger }
var jsonOutput: Bool { self.resolvedRuntime.configuration.jsonOutput }
/// Prompt the user to grant accessibility permission and open the relevant System Settings pane.
@MainActor
mutating func run(using runtime: CommandRuntime) async throws {
self.prepare(using: runtime)
if await self.renderIfAlreadyGranted() { return }
let granted = self.promptAccessibilityDialog()
self.renderAccessibilityResult(granted: granted)
}
private mutating func prepare(using runtime: CommandRuntime) {
self.runtime = runtime
self.logger.setJsonOutputMode(self.jsonOutput)
}
private func renderIfAlreadyGranted() async -> Bool {
let hasPermission = await AutomationServiceBridge
.hasAccessibilityPermission(automation: self.services.automation)
guard hasPermission else { return false }
let payload = AgentPermissionActionResult(
action: "request-accessibility",
already_granted: true,
prompt_triggered: false,
granted: true
)
self.renderAccessibilityResult(payload: payload)
return true
}
private func promptAccessibilityDialog() -> Bool {
if !self.jsonOutput {
print("Requesting Accessibility permission...\n")
print("Opening System Settings to Accessibility permissions...\n")
}
return AXPermissionHelpers.askForAccessibilityIfNeeded()
}
private func renderAccessibilityResult(granted: Bool) {
let payload = AgentPermissionActionResult(
action: "request-accessibility",
already_granted: false,
prompt_triggered: true,
granted: granted
)
self.renderAccessibilityResult(payload: payload)
}
private func renderAccessibilityResult(payload: AgentPermissionActionResult) {
if self.jsonOutput {
outputSuccessCodable(data: payload, logger: self.logger)
return
}
guard !payload.already_granted else {
print("✅ Accessibility permission is already granted!")
return
}
if payload.granted == true {
print("✅ Accessibility permission granted!")
} else {
print("A dialog should have appeared.\n")
print("To grant permission:")
print("1. Click 'Open System Settings' in the dialog")
print("2. Enable Peekaboo in the Accessibility list")
print("3. You may need to restart Peekaboo after granting")
}
}
}
}
// MARK: - Response Types
private struct AgentPermissionStatusPayload: Codable {
let screen_recording: Bool
let accessibility: Bool
}
private struct AgentPermissionActionResult: Codable {
let action: String
let already_granted: Bool
let prompt_triggered: Bool
let granted: Bool?
}
extension PermissionCommand.StatusSubcommand: ParsableCommand {}
extension PermissionCommand.StatusSubcommand: AsyncRuntimeCommand {}
extension PermissionCommand.RequestScreenRecordingSubcommand: ParsableCommand {}
extension PermissionCommand.RequestScreenRecordingSubcommand: AsyncRuntimeCommand {}
extension PermissionCommand.RequestAccessibilitySubcommand: ParsableCommand {}
extension PermissionCommand.RequestAccessibilitySubcommand: AsyncRuntimeCommand {}

View File

@ -1,304 +0,0 @@
import Commander
import Foundation
import PeekabooBridge
import PeekabooCore
import PeekabooFoundation
// MARK: - Error Handling Protocol
/// Protocol for commands that need standardized error handling
@MainActor
protocol ErrorHandlingCommand {
var jsonOutput: Bool { get }
}
extension ErrorHandlingCommand {
/// Handle errors with appropriate output format
func handleError(_ error: any Error, customCode: ErrorCode? = nil) {
if jsonOutput {
let errorCode = customCode ?? self.mapErrorToCode(error)
let logger: Logger = if let formattable = self as? any OutputFormattable {
formattable.outputLogger
} else {
Logger.shared
}
outputError(
message: errorMessage(for: error),
code: errorCode,
details: errorDetails(for: error),
logger: logger
)
} else {
let errorMessage: String = if let peekabooError = error as? PeekabooError {
peekabooError.errorDescription ?? String(describing: error)
} else if let captureError = error as? CaptureError {
captureError.errorDescription ?? String(describing: error)
} else if error
.localizedDescription == "The operation couldn't be completed. (PeekabooCore.PeekabooError error 0.)" ||
error.localizedDescription == "Error" {
String(describing: error)
} else {
error.localizedDescription
}
fputs("Error: \(errorMessage)\n", stderr)
}
}
/// Map various error types to error codes
private func mapErrorToCode(_ error: any Error) -> ErrorCode {
switch error {
case let focusError as FocusError:
self.mapFocusErrorToCode(focusError)
case let peekabooError as PeekabooError:
self.mapPeekabooErrorToCode(peekabooError)
case let captureError as CaptureError:
self.mapCaptureErrorToCode(captureError)
case let observationError as DesktopObservationError:
self.mapObservationErrorToCode(observationError)
case let bridgeError as PeekabooBridgeErrorEnvelope:
errorCode(for: bridgeError)
case let posixError as POSIXError:
errorCode(for: posixError)
case is Commander.ValidationError:
.VALIDATION_ERROR
default:
.INTERNAL_SWIFT_ERROR
}
}
private func mapObservationErrorToCode(_ error: DesktopObservationError) -> ErrorCode {
switch error {
case .targetNotFound:
.WINDOW_NOT_FOUND
case .unsupportedTarget:
.VALIDATION_ERROR
}
}
private func mapPeekabooErrorToCode(_ error: PeekabooError) -> ErrorCode {
if let lookupCode = lookupErrorCode(for: error) {
return lookupCode
}
if let permissionCode = permissionErrorCode(for: error) {
return permissionCode
}
if let timeoutCode = timeoutErrorCode(for: error) {
return timeoutCode
}
if let automationCode = automationErrorCode(for: error) {
return automationCode
}
if let inputCode = inputErrorCode(for: error) {
return inputCode
}
if let credentialCode = credentialErrorCode(for: error) {
return credentialCode
}
return .UNKNOWN_ERROR
}
private func lookupErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .appNotFound:
.APP_NOT_FOUND
case .ambiguousAppIdentifier:
.AMBIGUOUS_APP_IDENTIFIER
case .windowNotFound:
.WINDOW_NOT_FOUND
case .elementNotFound:
.ELEMENT_NOT_FOUND
case .sessionNotFound:
.SESSION_NOT_FOUND
case .snapshotNotFound:
.SNAPSHOT_NOT_FOUND
case .snapshotStale:
.SNAPSHOT_STALE
case .menuNotFound:
.MENU_BAR_NOT_FOUND
case .menuItemNotFound:
.MENU_ITEM_NOT_FOUND
default:
nil
}
}
private func permissionErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .permissionDeniedScreenRecording:
.PERMISSION_ERROR_SCREEN_RECORDING
case .permissionDeniedAccessibility:
.PERMISSION_ERROR_ACCESSIBILITY
case .permissionDeniedEventSynthesizing:
.PERMISSION_ERROR_EVENT_SYNTHESIZING
default:
nil
}
}
private func timeoutErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .captureTimeout, .timeout:
.TIMEOUT
default:
nil
}
}
private func automationErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .captureFailed, .clickFailed, .typeFailed:
.CAPTURE_FAILED
case .serviceUnavailable, .networkError, .apiError, .commandFailed, .encodingError:
.UNKNOWN_ERROR
default:
nil
}
}
private func inputErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .invalidCoordinates:
.INVALID_COORDINATES
case .fileIOError:
.FILE_IO_ERROR
case .invalidInput:
.INVALID_INPUT
default:
nil
}
}
private func credentialErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .noAIProviderAvailable, .authenticationFailed:
.MISSING_API_KEY
case .aiProviderError:
.AGENT_ERROR
default:
nil
}
}
private func mapCaptureErrorToCode(_ error: CaptureError) -> ErrorCode {
switch error {
case .screenRecordingPermissionDenied, .permissionDeniedScreenRecording:
.PERMISSION_ERROR_SCREEN_RECORDING
case .accessibilityPermissionDenied:
.PERMISSION_ERROR_ACCESSIBILITY
case .appleScriptPermissionDenied:
.PERMISSION_ERROR_APPLESCRIPT
case .noDisplaysAvailable, .noDisplaysFound:
.CAPTURE_FAILED
case .invalidDisplayID, .invalidDisplayIndex:
.INVALID_ARGUMENT
case .captureCreationFailed, .windowCaptureFailed, .captureFailed, .captureFailure:
.CAPTURE_FAILED
case .windowNotFound, .noWindowsFound:
.WINDOW_NOT_FOUND
case .windowTitleNotFound:
.WINDOW_NOT_FOUND
case .fileWriteError, .fileIOError:
.FILE_IO_ERROR
case .appNotFound:
.APP_NOT_FOUND
case .invalidWindowIndexOld, .invalidWindowIndex:
.INVALID_ARGUMENT
case .invalidArgument:
.INVALID_ARGUMENT
case .unknownError:
.UNKNOWN_ERROR
case .noFrontmostApplication:
.WINDOW_NOT_FOUND
case .invalidCaptureArea:
.INVALID_ARGUMENT
case .ambiguousAppIdentifier:
.AMBIGUOUS_APP_IDENTIFIER
case .imageConversionFailed:
.CAPTURE_FAILED
case .detectionTimedOut:
.TIMEOUT
}
}
private func mapFocusErrorToCode(_ error: FocusError) -> ErrorCode {
errorCode(for: error)
}
}
func errorMessage(for error: any Error) -> String {
if let bridgeError = error as? PeekabooBridgeErrorEnvelope {
return bridgeError.message
}
return error.localizedDescription
}
func applicationLaunchErrorCode(for error: any Error) -> ErrorCode? {
guard let bridgeError = error as? PeekabooBridgeErrorEnvelope,
bridgeError.code == .notFound
else {
return nil
}
return .APP_NOT_FOUND
}
func errorDetails(for error: any Error) -> String? {
guard let bridgeError = error as? PeekabooBridgeErrorEnvelope else {
return nil
}
var details: [String] = []
if let bridgeDetails = bridgeError.details, !bridgeDetails.isEmpty {
details.append(bridgeDetails)
}
if let permission = bridgeError.permission {
details.append("permission: \(permission.rawValue)")
}
return details.isEmpty ? nil : details.joined(separator: "\n")
}
func errorCode(for focusError: FocusError) -> ErrorCode {
switch focusError {
case .applicationNotRunning:
.APP_NOT_FOUND
case .focusVerificationTimeout, .timeoutWaitingForCondition:
.TIMEOUT
default:
.WINDOW_NOT_FOUND
}
}
func errorCode(for bridgeError: PeekabooBridgeErrorEnvelope) -> ErrorCode {
switch bridgeError.code {
case .permissionDenied:
switch bridgeError.permission {
case .screenRecording:
.PERMISSION_ERROR_SCREEN_RECORDING
case .accessibility:
.PERMISSION_ERROR_ACCESSIBILITY
case .postEvent:
.PERMISSION_ERROR_EVENT_SYNTHESIZING
case .appleScript:
.PERMISSION_ERROR_APPLESCRIPT
case .none:
.PERMISSION_DENIED
}
case .timeout:
.TIMEOUT
case .invalidRequest:
.INVALID_ARGUMENT
case .operationNotSupported:
.VALIDATION_ERROR
case .notFound:
.UNKNOWN_ERROR
case .versionMismatch, .unauthorizedClient, .decodingFailed, .internalError, .serverBusy:
.UNKNOWN_ERROR
}
}
func errorCode(for posixError: POSIXError) -> ErrorCode {
switch posixError.code {
case .ETIMEDOUT:
.TIMEOUT
default:
.INTERNAL_SWIFT_ERROR
}
}

View File

@ -17,7 +17,7 @@ struct CommandHelpRenderer {
let fallbackSignature = CommandSignature.describe(type.init())
.flattened()
.withPeekabooRuntimeFlags()
.withStandardRuntimeFlags()
return self.renderHelp(
abstract: description.abstract,
discussion: description.discussion,
@ -74,8 +74,7 @@ struct CommandHelpRenderer {
private static func renderArguments(_ arguments: [ArgumentDefinition], theme: HelpTheme?) -> String? {
guard !arguments.isEmpty else { return nil }
let rows = arguments.map { argument -> (String, String?) in
let placeholder = self.kebabCased(argument.label)
let label = argument.isOptional ? "[\(placeholder)]" : "<\(placeholder)>"
let label = argument.isOptional ? "[\(argument.label)]" : "<\(argument.label)>"
return (label, argument.help)
}
return self.makeSection(title: "ARGUMENTS", lines: self.renderKeyValueRows(rows, theme: theme), theme: theme)
@ -88,48 +87,12 @@ struct CommandHelpRenderer {
.filter { !$0.isAlias }
.map(\.cliSpelling)
.joined(separator: ", ")
let valuePlaceholder = " <\(self.optionValuePlaceholder(for: option))>"
let valuePlaceholder = " <\(option.label)>"
return (names + valuePlaceholder, option.help)
}
return self.makeSection(title: "OPTIONS", lines: self.renderKeyValueRows(rows, theme: theme), theme: theme)
}
private static func optionValuePlaceholder(for option: OptionDefinition) -> String {
if let longName = option.names.compactMap(\.primaryLongComponent).first {
return longName
}
return self.kebabCased(self.optionLabel(option.label))
}
private static func optionLabel(_ label: String) -> String {
let suffix = "Option"
guard label.hasSuffix(suffix), label.count > suffix.count else {
return label
}
return String(label.dropLast(suffix.count))
}
private static func kebabCased(_ value: String) -> String {
guard !value.isEmpty else { return value }
var output = ""
for character in value {
if character.isUppercase {
if !output.isEmpty, output.last != "-" {
output.append("-")
}
output.append(contentsOf: character.lowercased())
} else if character == "_" || character == " " {
if !output.isEmpty, output.last != "-" {
output.append("-")
}
} else {
output.append(character)
}
}
return output
}
private static func renderFlags(_ flags: [FlagDefinition], theme: HelpTheme?) -> String? {
guard !flags.isEmpty else { return nil }
let rows = flags.map { flag -> (String, String?) in
@ -162,7 +125,8 @@ struct CommandHelpRenderer {
let padding = min(max(rows.map(\.0.count).max() ?? 0, 12), 32)
return rows.map { key, value in
guard let value, !value.isEmpty else {
return theme?.command(key) ?? key
let displayKey = theme?.command(key) ?? key
return displayKey
}
let paddedKey: String = if key.count >= padding {
key
@ -192,13 +156,4 @@ extension CommanderName {
"--\(value)"
}
}
fileprivate var primaryLongComponent: String? {
switch self {
case let .long(value):
value
case .short, .aliasShort, .aliasLong:
nil
}
}
}

View File

@ -0,0 +1,33 @@
import AppKit
import AXorcist
import Commander
import Foundation
import PeekabooCore
import PeekabooFoundation
// MARK: - Action Extensions
extension Attribute where T == String {
static var hide: Attribute<String> { Attribute("AXHide") }
static var unhide: Attribute<String> { Attribute("AXUnhide") }
}
// MARK: - Application Finding
/// Async wrapper for finding applications using PeekabooCore services
@MainActor
func findApplication(
identifier: String,
services: any PeekabooServiceProviding
) async throws -> (app: Element, runningApp: NSRunningApplication) {
// Use PeekabooServices to find the application
let appInfo = try await services.applications.findApplication(identifier: identifier)
// Get the NSRunningApplication
guard let runningApp = NSRunningApplication(processIdentifier: appInfo.processIdentifier) else {
throw PeekabooError.appNotFound(identifier)
}
let axApp = AXApp(runningApp)
return (app: axApp.element, runningApp: runningApp)
}

View File

@ -1,55 +0,0 @@
import Foundation
import PeekabooCore
// MARK: - Output Formatting Protocol
/// Protocol for commands that support both JSON and human-readable output
@MainActor
protocol OutputFormattable {
var jsonOutput: Bool { get }
var outputLogger: Logger { get }
}
extension OutputFormattable {
/// Output data in appropriate format
func output(_ data: some Codable, humanReadable: () -> Void) {
if jsonOutput {
outputSuccessCodable(data: data, logger: self.outputLogger)
} else {
humanReadable()
}
}
/// Output success with optional data
func outputSuccess(data: (some Codable)? = nil as Empty?) {
if jsonOutput {
if let data {
outputSuccessCodable(data: data, logger: self.outputLogger)
} else {
outputJSON(JSONResponse(success: true), logger: self.outputLogger)
}
}
}
}
// MARK: - Permission Checking
/// Check and require screen recording permission
@MainActor
func requireScreenRecordingPermission(services: any PeekabooServiceProviding) async throws {
let hasPermission = await Task { @MainActor in
await services.screenCapture.hasScreenRecordingPermission()
}.value
guard hasPermission else {
throw CaptureError.screenRecordingPermissionDenied
}
}
/// Check and require accessibility permission
@MainActor
func requireAccessibilityPermission(services: any PeekabooServiceProviding) throws {
if !services.permissions.checkAccessibilityPermission() {
throw CaptureError.accessibilityPermissionDenied
}
}

View File

@ -18,12 +18,12 @@ extension AsyncRuntimeCommand {
/// and executes the async implementation on the main actor.
mutating func run() throws {
var commandCopy = self
let runtime = CommandRuntime.makeDefault()
let semaphore = DispatchSemaphore(value: 0)
var thrownError: (any Error)?
Task { @MainActor in
do {
let runtime = await CommandRuntime.makeDefaultAsync()
try await commandCopy.run(using: runtime)
} catch {
thrownError = error

View File

@ -3,75 +3,30 @@
// PeekabooCLI
//
import Darwin
import Foundation
import PeekabooAutomation
import PeekabooAutomationKit
import PeekabooBridge
import PeekabooCore
import PeekabooFoundation
import PeekabooProtocols
/// Shared options that control logging and output behavior.
struct CommandRuntimeOptions {
struct CommandRuntimeOptions: Sendable {
var verbose = false
var jsonOutput = false
var logLevel: LogLevel?
var captureEnginePreference: String?
var inputStrategy: UIInputStrategy?
var preferRemote = true
var remoteIsolationRequested = false
var autoStartDaemon = true
var bridgeSocketPath: String?
var requiresElementActions = false
var requiresInspectAccessibilityTree = false
var requiresBrowserMCP = false
var requiresApplicationLaunchOptions = false
var requiresApplicationRelaunch = false
var requiresSurvivingApplicationHost = false
var requiresHostApplicationInventory = false
var requiresImplicitSnapshotInvalidation = false
var requiresCallerDesktopMutationBarrier = false
var usesPerToolSnapshotInvalidation = false
var requiresExactWindowTargetedClicks = false
var requiresPostEventClickPermission = false
func makeConfiguration() -> CommandRuntime.Configuration {
CommandRuntime.Configuration(
verbose: self.verbose,
jsonOutput: self.jsonOutput,
logLevel: self.logLevel,
captureEnginePreference: self.captureEnginePreference,
inputStrategy: self.inputStrategy
captureEnginePreference: self.captureEnginePreference
)
}
func applyingEnvironmentOverrides(environment: [String: String]) -> CommandRuntimeOptions {
var options = self
if options.captureEnginePreference == nil,
let captureEngine = Self.captureEnginePreference(environment: environment) {
options.captureEnginePreference = captureEngine
if !options.requiresApplicationLaunchOptions && !options.requiresHostApplicationInventory {
options.preferRemote = false
}
}
return options
}
static func captureEnginePreference(environment: [String: String]) -> String? {
guard let value = environment["PEEKABOO_CAPTURE_ENGINE"]?
.trimmingCharacters(in: .whitespacesAndNewlines),
!value.isEmpty else {
return nil
}
return value
}
}
/// Runtime context passed to runtime-aware commands.
struct CommandRuntime {
static let defaultDaemonIdleTimeoutSeconds: TimeInterval = 300
@TaskLocal
private static var serviceOverride: PeekabooServices?
@ -80,49 +35,19 @@ struct CommandRuntime {
var jsonOutput: Bool
var logLevel: LogLevel?
var captureEnginePreference: String?
var inputStrategy: UIInputStrategy?
}
let configuration: Configuration
let hostDescription: String
let selectedRemoteSocketPath: String?
let selectedRemoteHostProcessIdentifier: pid_t?
let snapshotInvalidationRemoteSocketPaths: [String]
let applicationRelaunchAllowed: Bool
let interactionMutationTracker: InteractionMutationTracker
@MainActor let services: any PeekabooServiceProviding
@MainActor let logger: Logger
@MainActor
var observationTimeoutMutationTracker: InteractionMutationTracker? {
if self.selectedRemoteSocketPath == nil || self.interactionMutationTracker.hasPendingDurableMutation {
return self.interactionMutationTracker
}
return nil
}
@MainActor
init(
configuration: Configuration,
services: any PeekabooServiceProviding,
hostDescription: String = "local (in-process)",
selectedRemoteSocketPath: String? = nil,
selectedRemoteHostProcessIdentifier: pid_t? = nil,
snapshotInvalidationRemoteSocketPaths: [String] = [],
applicationRelaunchAllowed: Bool = true,
interactionMutationTracker: InteractionMutationTracker = InteractionMutationTracker()
services: any PeekabooServiceProviding
) {
// Keep Tachikoma credential/profile resolution aligned with Peekaboo CLI storage.
PeekabooCore.ConfigurationManager.configureTachikomaProfileDirectory()
self.configuration = configuration
self.services = services
self.hostDescription = hostDescription
self.selectedRemoteSocketPath = selectedRemoteSocketPath
self.selectedRemoteHostProcessIdentifier = selectedRemoteHostProcessIdentifier
self.snapshotInvalidationRemoteSocketPaths = snapshotInvalidationRemoteSocketPaths
self.applicationRelaunchAllowed = applicationRelaunchAllowed
self.interactionMutationTracker = interactionMutationTracker
self.logger = Logger.shared
services.installAgentRuntimeDefaults()
@ -156,11 +81,8 @@ struct CommandRuntime {
}
VisualizationClient.shared.setConsoleLogLevelOverride(visualizerConsoleLevel)
VisualizationClient.shared.setConsoleMirroringEnabled(configuration.verbose)
self.services.ensureVisualizerConnection()
self.logger.debug("Runtime host: \(hostDescription)")
}
@MainActor
@ -172,9 +94,10 @@ struct CommandRuntime {
extension CommandRuntime {
@MainActor
static func makeDefault(options: CommandRuntimeOptions) -> CommandRuntime {
let effectiveOptions = options.applyingEnvironmentOverrides(environment: ProcessInfo.processInfo.environment)
let services = self.serviceOverride ?? self.makeLocalServices(options: effectiveOptions)
return CommandRuntime(configuration: effectiveOptions.makeConfiguration(), services: services)
CommandRuntime(
options: options,
services: self.serviceOverride ?? PeekabooServices()
)
}
@MainActor
@ -182,30 +105,6 @@ extension CommandRuntime {
self.makeDefault(options: CommandRuntimeOptions())
}
@MainActor
static func makeDefaultAsync(options: CommandRuntimeOptions) async -> CommandRuntime {
let effectiveOptions = options.applyingEnvironmentOverrides(environment: ProcessInfo.processInfo.environment)
if let override = serviceOverride {
return CommandRuntime(options: effectiveOptions, services: override)
}
let resolution = await resolveServices(options: effectiveOptions)
return CommandRuntime(
configuration: effectiveOptions.makeConfiguration(),
services: resolution.services,
hostDescription: resolution.hostDescription,
selectedRemoteSocketPath: resolution.selectedRemoteSocketPath,
selectedRemoteHostProcessIdentifier: resolution.selectedRemoteHostProcessIdentifier,
snapshotInvalidationRemoteSocketPaths: resolution.snapshotInvalidationRemoteSocketPaths,
applicationRelaunchAllowed: resolution.applicationRelaunchAllowed
)
}
@MainActor
static func makeDefaultAsync() async -> CommandRuntime {
await self.makeDefaultAsync(options: CommandRuntimeOptions())
}
@MainActor
static func withInjectedServices<T>(
_ services: PeekabooServices,
@ -215,116 +114,6 @@ extension CommandRuntime {
try await operation()
}
}
@MainActor
private static func resolveServices(options: CommandRuntimeOptions) async -> RuntimeHostResolver.Resolution {
await RuntimeHostResolver.resolveServices(options: options)
}
static func explicitBridgeSocket(
options: CommandRuntimeOptions,
environment: [String: String]
) -> String? {
BridgeSocketResolver.explicitBridgeSocket(options: options, environment: environment)
}
static func shouldAutoStartDaemon(
options: CommandRuntimeOptions,
environment: [String: String]
) -> Bool {
DaemonLaunchPolicy.shouldAutoStartDaemon(options: options, environment: environment)
}
static func daemonSocketPath(environment: [String: String]) -> String {
DaemonLaunchPolicy.daemonSocketPath(environment: environment)
}
static func daemonIdleTimeoutSeconds(environment: [String: String]) -> TimeInterval {
DaemonLaunchPolicy.daemonIdleTimeoutSeconds(environment: environment)
}
static func onDemandDaemonArguments(socketPath: String, idleTimeoutSeconds: TimeInterval) -> [String] {
DaemonLaunchPolicy.onDemandDaemonArguments(socketPath: socketPath, idleTimeoutSeconds: idleTimeoutSeconds)
}
@MainActor
private static func makeLocalServices(options: CommandRuntimeOptions) -> PeekabooServices {
RuntimeServiceFactory.makeLocalServices(options: options)
}
static func hasInputStrategyEnvironmentOverride(environment: [String: String]) -> Bool {
RuntimeInputPolicyResolver.hasEnvironmentOverride(environment: environment)
}
static func hasInputStrategyConfigOverride(input: PeekabooAutomation.Configuration.InputConfig?) -> Bool {
RuntimeInputPolicyResolver.hasConfigOverride(input: input)
}
static func supportsRemoteRequirements(
for handshake: PeekabooBridgeHandshakeResponse,
options: CommandRuntimeOptions
) -> Bool {
BridgeCapabilityPolicy.supportsRemoteRequirements(for: handshake, options: options)
}
static func supportsTargetedHotkeys(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsTargetedHotkeys(for: handshake)
}
static func supportsTargetedTypeActions(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsTargetedTypeActions(for: handshake)
}
static func supportsTargetedClicks(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsTargetedClicks(for: handshake)
}
static func supportsApplicationLaunchOptions(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsApplicationLaunchOptions(for: handshake)
}
static func supportsApplicationRelaunch(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsApplicationRelaunch(for: handshake)
}
static func supportsImplicitSnapshotInvalidation(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsImplicitSnapshotInvalidation(for: handshake)
}
static func supportsElementActions(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsElementActions(for: handshake)
}
static func supportsDesktopObservation(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsDesktopObservation(for: handshake)
}
static func supportsInspectAccessibilityTree(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsInspectAccessibilityTree(for: handshake)
}
static func supportsBrowserMCP(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsBrowserMCP(for: handshake)
}
static func supportsPostEventPermissionRequest(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
BridgeCapabilityPolicy.supportsPostEventPermissionRequest(for: handshake)
}
static func targetedHotkeyAvailability(for handshake: PeekabooBridgeHandshakeResponse)
-> (isEnabled: Bool, unavailableReason: String?, missingPermissions: Set<PeekabooBridgePermissionKind>) {
BridgeCapabilityPolicy.targetedHotkeyAvailability(for: handshake)
}
static func targetedTypeAvailability(for handshake: PeekabooBridgeHandshakeResponse)
-> (isEnabled: Bool, unavailableReason: String?, missingPermissions: Set<PeekabooBridgePermissionKind>) {
BridgeCapabilityPolicy.targetedTypeAvailability(for: handshake)
}
static func targetedClickAvailability(for handshake: PeekabooBridgeHandshakeResponse)
-> (isEnabled: Bool, unavailableReason: String?, missingPermissions: Set<PeekabooBridgePermissionKind>) {
BridgeCapabilityPolicy.targetedClickAvailability(for: handshake)
}
}
/// Commands that need access to verbose/json flags even before a runtime is injected
@ -335,12 +124,12 @@ protocol RuntimeOptionsConfigurable {
extension RuntimeOptionsConfigurable {
mutating func setRuntimeOptions(_ options: CommandRuntimeOptions) {
runtimeOptions = options
self.runtimeOptions = options
}
}
@propertyWrapper
struct RuntimeStorage<Value: ExpressibleByNilLiteral> {
struct RuntimeStorage<Value> where Value: ExpressibleByNilLiteral {
private var storage: Value
init() {

View File

@ -1,497 +0,0 @@
import CoreGraphics
import Foundation
import PeekabooCore
import PeekabooFoundation
// MARK: - Service Bridges
enum AutomationServiceBridge {
static func waitForElement(
automation: any UIAutomationServiceProtocol,
target: ClickTarget,
timeout: TimeInterval,
snapshotId: String?
) async throws -> WaitForElementResult {
let result = try await Task { @MainActor in
try await automation.waitForElement(target: target, timeout: timeout, snapshotId: snapshotId)
}.value
if !result.warnings.isEmpty {
Logger.shared.debug(
"waitForElement warnings: \(result.warnings.joined(separator: ","))",
category: "Automation"
)
}
return result
}
static func click(
automation: any UIAutomationServiceProtocol,
target: ClickTarget,
clickType: ClickType,
snapshotId: String?
) async throws {
try await Task { @MainActor in
try await automation.click(target: target, clickType: clickType, snapshotId: snapshotId)
}.value
}
static func click(
automation: any UIAutomationServiceProtocol,
target: ClickTarget,
clickType: ClickType,
snapshotId: String?,
targetProcessIdentifier: pid_t,
targetWindowID: Int? = nil
) async throws {
try await Task { @MainActor in
guard let targetedClickService = automation as? any TargetedClickServiceProtocol else {
throw PeekabooError.serviceUnavailable(
"Background clicks require an automation service that supports targeted click delivery"
)
}
guard targetedClickService.supportsTargetedClicks else {
throw self.targetedClickUnavailableError(service: targetedClickService)
}
if let targetWindowID {
guard let exactWindowService = targetedClickService as? any ExactWindowTargetedClickServiceProtocol
else {
throw PeekabooError.serviceUnavailable(
"Background clicks with an exact window require a compatible automation service"
)
}
try await exactWindowService.click(
target: target,
clickType: clickType,
snapshotId: snapshotId,
targetProcessIdentifier: targetProcessIdentifier,
targetWindowID: targetWindowID
)
} else {
try await targetedClickService.click(
target: target,
clickType: clickType,
snapshotId: snapshotId,
targetProcessIdentifier: targetProcessIdentifier
)
}
}.value
}
static func typeActions(
automation: any UIAutomationServiceProtocol,
request: TypeActionsRequest
) async throws -> TypeResult {
try await Task { @MainActor in
try await automation.typeActions(
request.actions,
cadence: request.cadence,
snapshotId: request.snapshotId
)
}.value
}
static func typeActions(
automation: any UIAutomationServiceProtocol,
request: TypeActionsRequest,
targetProcessIdentifier: pid_t
) async throws -> TypeResult {
try await Task { @MainActor in
guard let targetedTypeService = automation as? any TargetedTypeServiceProtocol else {
throw PeekabooError.serviceUnavailable(
"Background typing requires an automation service that supports targeted type delivery"
)
}
guard targetedTypeService.supportsTargetedTypeActions else {
throw self.targetedTypeUnavailableError(service: targetedTypeService)
}
return try await targetedTypeService.typeActions(
request.actions,
cadence: request.cadence,
snapshotId: request.snapshotId,
targetProcessIdentifier: targetProcessIdentifier
)
}.value
}
static func scroll(
automation: any UIAutomationServiceProtocol,
request: ScrollRequest
) async throws {
try await Task { @MainActor in
try await automation.scroll(request)
}.value
}
static func setValue(
automation: any UIAutomationServiceProtocol,
target: String,
value: UIElementValue,
snapshotId: String?
) async throws -> ElementActionResult {
try await Task { @MainActor in
guard let automation = automation as? any ElementActionAutomationServiceProtocol else {
throw PeekabooError.serviceUnavailable(
"This automation host does not support direct accessibility value setting"
)
}
return try await automation.setValue(target: target, value: value, snapshotId: snapshotId)
}.value
}
static func performAction(
automation: any UIAutomationServiceProtocol,
target: String,
actionName: String,
snapshotId: String?
) async throws -> ElementActionResult {
try await Task { @MainActor in
guard let automation = automation as? any ElementActionAutomationServiceProtocol else {
throw PeekabooError.serviceUnavailable(
"This automation host does not support direct accessibility action invocation"
)
}
return try await automation.performAction(target: target, actionName: actionName, snapshotId: snapshotId)
}.value
}
static func hotkey(automation: any UIAutomationServiceProtocol, keys: String, holdDuration: Int) async throws {
try await Task { @MainActor in
try await automation.hotkey(keys: keys, holdDuration: holdDuration)
}.value
}
static func hotkey(
automation: any UIAutomationServiceProtocol,
keys: String,
holdDuration: Int,
targetProcessIdentifier: pid_t
) async throws {
try await Task { @MainActor in
guard let targetedHotkeyService = automation as? any TargetedHotkeyServiceProtocol else {
throw PeekabooError.serviceUnavailable(
"Background hotkeys require an automation service that supports targeted hotkey delivery"
)
}
guard targetedHotkeyService.supportsTargetedHotkeys else {
throw self.targetedHotkeyUnavailableError(service: targetedHotkeyService)
}
try await targetedHotkeyService.hotkey(
keys: keys,
holdDuration: holdDuration,
targetProcessIdentifier: targetProcessIdentifier
)
}.value
}
private static func targetedHotkeyUnavailableError(service: any TargetedHotkeyServiceProtocol) -> PeekabooError {
if service.targetedHotkeyRequiresEventSynthesizingPermission {
return .permissionDeniedEventSynthesizing
}
return .serviceUnavailable(
service.targetedHotkeyUnavailableReason ??
"Remote bridge host does not support background hotkeys; use --no-remote or update the host"
)
}
private static func targetedTypeUnavailableError(service: any TargetedTypeServiceProtocol) -> PeekabooError {
if service.targetedTypeRequiresEventSynthesizingPermission {
return .permissionDeniedEventSynthesizing
}
return .serviceUnavailable(
service.targetedTypeUnavailableReason ??
"Remote bridge host does not support background typing; use --no-remote or update the host"
)
}
private static func targetedClickUnavailableError(service: any TargetedClickServiceProtocol) -> PeekabooError {
if service.targetedClickRequiresEventSynthesizingPermission {
return .permissionDeniedEventSynthesizing
}
return .serviceUnavailable(
service.targetedClickUnavailableReason ??
"Remote bridge host does not support background clicks; use --no-remote or update the host"
)
}
static func swipe(
automation: any UIAutomationServiceProtocol,
request: SwipeRequest
) async throws {
try await Task { @MainActor in
try await automation.swipe(
from: request.from,
to: request.to,
duration: request.duration,
steps: request.steps,
profile: request.profile
)
}.value
}
static func drag(
automation: any UIAutomationServiceProtocol,
request: DragRequest
) async throws {
try await Task { @MainActor in
try await automation.drag(
DragOperationRequest(
from: request.from,
to: request.to,
duration: request.duration,
steps: request.steps,
modifiers: request.modifiers,
profile: request.profile
)
)
}.value
}
static func moveMouse(
automation: any UIAutomationServiceProtocol,
to point: CGPoint,
duration: Int,
steps: Int,
profile: MouseMovementProfile
) async throws {
try await Task { @MainActor in
try await automation.moveMouse(to: point, duration: duration, steps: steps, profile: profile)
}.value
}
static func detectElements(
automation: any UIAutomationServiceProtocol,
imageData: Data,
snapshotId: String?,
windowContext: WindowContext?
) async throws -> ElementDetectionResult {
try await Task { @MainActor in
try await automation.detectElements(
in: imageData,
snapshotId: snapshotId,
windowContext: windowContext
)
}.value
}
static func hasAccessibilityPermission(automation: any UIAutomationServiceProtocol) async -> Bool {
await Task { @MainActor in
await automation.hasAccessibilityPermission()
}.value
}
}
struct TypeActionsRequest {
let actions: [TypeAction]
let cadence: TypingCadence
let snapshotId: String?
}
struct SwipeRequest {
let from: CGPoint
let to: CGPoint
let duration: Int
let steps: Int
let profile: MouseMovementProfile
}
struct DragRequest {
let from: CGPoint
let to: CGPoint
let duration: Int
let steps: Int
let modifiers: String?
let profile: MouseMovementProfile
}
enum WindowServiceBridge {
static func closeWindow(windows: any WindowManagementServiceProtocol, target: WindowTarget) async throws {
try await Task { @MainActor in
try await windows.closeWindow(target: target)
}.value
}
static func minimizeWindow(windows: any WindowManagementServiceProtocol, target: WindowTarget) async throws {
try await Task { @MainActor in
try await windows.minimizeWindow(target: target)
}.value
}
static func maximizeWindow(windows: any WindowManagementServiceProtocol, target: WindowTarget) async throws {
try await Task { @MainActor in
try await windows.maximizeWindow(target: target)
}.value
}
static func moveWindow(
windows: any WindowManagementServiceProtocol,
target: WindowTarget,
to origin: CGPoint
) async throws {
try await Task { @MainActor in
try await windows.moveWindow(target: target, to: origin)
}.value
}
static func resizeWindow(
windows: any WindowManagementServiceProtocol,
target: WindowTarget,
to size: CGSize
) async throws {
try await Task { @MainActor in
try await windows.resizeWindow(target: target, to: size)
}.value
}
static func setWindowBounds(
windows: any WindowManagementServiceProtocol,
target: WindowTarget,
bounds: CGRect
) async throws {
try await Task { @MainActor in
try await windows.setWindowBounds(target: target, bounds: bounds)
}.value
}
static func focusWindow(windows: any WindowManagementServiceProtocol, target: WindowTarget) async throws {
try await Task { @MainActor in
try await windows.focusWindow(target: target)
}.value
}
static func listWindows(
windows: any WindowManagementServiceProtocol,
target: WindowTarget
) async throws -> [ServiceWindowInfo] {
try await Task { @MainActor in
try await windows.listWindows(target: target)
}.value
}
static func getFocusedWindow(windows: any WindowManagementServiceProtocol) async throws -> ServiceWindowInfo? {
try await Task { @MainActor in
try await windows.getFocusedWindow()
}.value
}
}
enum MenuServiceBridge {
static func listMenus(menu: any MenuServiceProtocol, appIdentifier: String) async throws -> MenuStructure {
try await Task { @MainActor in
try await menu.listMenus(for: appIdentifier)
}.value
}
static func listFrontmostMenus(menu: any MenuServiceProtocol) async throws -> MenuStructure {
try await Task { @MainActor in
try await menu.listFrontmostMenus()
}.value
}
static func listMenuExtras(menu: any MenuServiceProtocol) async throws -> [MenuExtraInfo] {
try await Task { @MainActor in
try await menu.listMenuExtras()
}.value
}
static func clickMenuItem(menu: any MenuServiceProtocol, appIdentifier: String, itemPath: String) async throws {
try await Task { @MainActor in
try await menu.clickMenuItem(app: appIdentifier, itemPath: itemPath)
}.value
}
static func clickMenuItemByName(
menu: any MenuServiceProtocol,
appIdentifier: String,
itemName: String
) async throws {
try await Task { @MainActor in
try await menu.clickMenuItemByName(app: appIdentifier, itemName: itemName)
}.value
}
static func clickMenuExtra(menu: any MenuServiceProtocol, title: String) async throws {
try await Task { @MainActor in
try await menu.clickMenuExtra(title: title)
}.value
}
static func isMenuExtraMenuOpen(
menu: any MenuServiceProtocol,
title: String,
ownerPID: pid_t?
) async throws -> Bool {
try await Task { @MainActor in
try await menu.isMenuExtraMenuOpen(title: title, ownerPID: ownerPID)
}.value
}
static func listMenuBarItems(menu: any MenuServiceProtocol, includeRaw: Bool = false) async throws
-> [MenuBarItemInfo] {
try await Task { @MainActor in
try await menu.listMenuBarItems(includeRaw: includeRaw)
}.value
}
static func clickMenuBarItem(named name: String, menu: any MenuServiceProtocol) async throws -> PeekabooCore
.ClickResult {
try await Task<PeekabooCore.ClickResult, any Error> { @MainActor in
try await menu.clickMenuBarItem(named: name)
}.value
}
static func clickMenuBarItem(at index: Int, menu: any MenuServiceProtocol) async throws -> PeekabooCore
.ClickResult {
try await Task<PeekabooCore.ClickResult, any Error> { @MainActor in
try await menu.clickMenuBarItem(at: index)
}.value
}
}
enum DockServiceBridge {
static func launchFromDock(dock: any DockServiceProtocol, appName: String) async throws {
try await Task { @MainActor in
try await dock.launchFromDock(appName: appName)
}.value
}
static func findDockItem(dock: any DockServiceProtocol, name: String) async throws -> DockItem {
try await Task { @MainActor in
try await dock.findDockItem(name: name)
}.value
}
static func rightClickDockItem(dock: any DockServiceProtocol, appName: String, menuItem: String?) async throws {
try await Task { @MainActor in
try await dock.rightClickDockItem(appName: appName, menuItem: menuItem)
}.value
}
static func hideDock(dock: any DockServiceProtocol) async throws {
try await Task { @MainActor in
try await dock.hideDock()
}.value
}
static func showDock(dock: any DockServiceProtocol) async throws {
try await Task { @MainActor in
try await dock.showDock()
}.value
}
static func listDockItems(dock: any DockServiceProtocol, includeAll: Bool) async throws -> [DockItem] {
try await Task { @MainActor in
try await dock.listDockItems(includeAll: includeAll)
}.value
}
}

View File

@ -1,43 +0,0 @@
import Commander
extension CommandSignature {
/// Add Peekaboo's standard runtime flags and options (extends Commander defaults).
func withPeekabooRuntimeFlags() -> CommandSignature {
let base = self.withStandardRuntimeFlags()
let bridgeSocketOption = OptionDefinition.make(
label: "bridge-socket",
names: [
.long("bridge-socket"),
.aliasLong("bridgeSocket"),
],
help: "Override the socket path for a Peekaboo Bridge host",
parsing: .singleValue
)
let noRemoteFlag = FlagDefinition.make(
label: "no-remote",
names: [
.long("no-remote"),
],
help: "Force local execution; skip remote hosts even if available"
)
let inputStrategyOption = OptionDefinition.make(
label: "inputStrategy",
names: [
.long("input-strategy"),
.aliasLong("inputStrategy"),
],
help: "Override UI input strategy: actionFirst, synthFirst, actionOnly, or synthOnly",
parsing: .singleValue
)
return CommandSignature(
arguments: base.arguments,
options: base.options + [bridgeSocketOption, inputStrategyOption],
flags: base.flags + [noRemoteFlag],
optionGroups: base.optionGroups
)
}
}

View File

@ -1,9 +1,651 @@
import AppKit
import Commander
import CoreGraphics
import Foundation
import PeekabooAutomationKit
import PeekabooCore
import PeekabooFoundation
// MARK: - Error Handling Protocol
/// Protocol for commands that need standardized error handling
@MainActor
protocol ErrorHandlingCommand {
var jsonOutput: Bool { get }
}
extension ErrorHandlingCommand {
/// Handle errors with appropriate output format
func handleError(_ error: any Error, customCode: ErrorCode? = nil) {
// Handle errors with appropriate output format
if jsonOutput {
let errorCode = customCode ?? self.mapErrorToCode(error)
let logger: Logger = if let formattable = self as? any OutputFormattable {
formattable.outputLogger
} else {
Logger.shared
}
outputError(message: error.localizedDescription, code: errorCode, logger: logger)
} else {
// Get a more descriptive error message
let errorMessage: String = if let peekabooError = error as? PeekabooError {
peekabooError.errorDescription ?? String(describing: error)
} else if let captureError = error as? CaptureError {
captureError.errorDescription ?? String(describing: error)
} else if error
.localizedDescription == "The operation couldn't be completed. (PeekabooCore.PeekabooError error 0.)" ||
error.localizedDescription == "Error" {
// For generic errors, try to get more info
String(describing: error)
} else {
error.localizedDescription
}
fputs("Error: \(errorMessage)\n", stderr)
}
}
/// Map various error types to error codes
private func mapErrorToCode(_ error: any Error) -> ErrorCode {
// Map various error types to error codes
switch error {
// FocusError mappings
case let focusError as FocusError:
self.mapFocusErrorToCode(focusError)
// PeekabooError mappings
case let peekabooError as PeekabooError:
self.mapPeekabooErrorToCode(peekabooError)
// CaptureError mappings
case let captureError as CaptureError:
self.mapCaptureErrorToCode(captureError)
// Commander ValidationError
case is Commander.ValidationError:
.VALIDATION_ERROR
// Default
default:
.INTERNAL_SWIFT_ERROR
}
}
private func mapPeekabooErrorToCode(_ error: PeekabooError) -> ErrorCode {
if let lookupCode = self.lookupErrorCode(for: error) {
return lookupCode
}
if let permissionCode = self.permissionErrorCode(for: error) {
return permissionCode
}
if let timeoutCode = self.timeoutErrorCode(for: error) {
return timeoutCode
}
if let automationCode = self.automationErrorCode(for: error) {
return automationCode
}
if let inputCode = self.inputErrorCode(for: error) {
return inputCode
}
if let credentialCode = self.credentialErrorCode(for: error) {
return credentialCode
}
return .UNKNOWN_ERROR
}
private func lookupErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .appNotFound:
.APP_NOT_FOUND
case .ambiguousAppIdentifier:
.AMBIGUOUS_APP_IDENTIFIER
case .windowNotFound:
.WINDOW_NOT_FOUND
case .elementNotFound:
.ELEMENT_NOT_FOUND
case .sessionNotFound:
.SESSION_NOT_FOUND
case .menuNotFound:
.MENU_BAR_NOT_FOUND
case .menuItemNotFound:
.MENU_ITEM_NOT_FOUND
default:
nil
}
}
private func permissionErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .permissionDeniedScreenRecording:
.PERMISSION_ERROR_SCREEN_RECORDING
case .permissionDeniedAccessibility:
.PERMISSION_ERROR_ACCESSIBILITY
default:
nil
}
}
private func timeoutErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .captureTimeout, .timeout:
.TIMEOUT
default:
nil
}
}
private func automationErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .captureFailed, .clickFailed, .typeFailed:
.CAPTURE_FAILED
case .serviceUnavailable, .networkError, .apiError, .commandFailed, .encodingError:
.UNKNOWN_ERROR
default:
nil
}
}
private func inputErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .invalidCoordinates:
.INVALID_COORDINATES
case .fileIOError:
.FILE_IO_ERROR
case .invalidInput:
.INVALID_INPUT
default:
nil
}
}
private func credentialErrorCode(for error: PeekabooError) -> ErrorCode? {
switch error {
case .noAIProviderAvailable, .authenticationFailed:
.MISSING_API_KEY
case .aiProviderError:
.AGENT_ERROR
default:
nil
}
}
private func mapCaptureErrorToCode(_ error: CaptureError) -> ErrorCode {
switch error {
case .screenRecordingPermissionDenied, .permissionDeniedScreenRecording:
.PERMISSION_ERROR_SCREEN_RECORDING
case .accessibilityPermissionDenied:
.PERMISSION_ERROR_ACCESSIBILITY
case .appleScriptPermissionDenied:
.PERMISSION_ERROR_APPLESCRIPT
case .noDisplaysAvailable, .noDisplaysFound:
.CAPTURE_FAILED
case .invalidDisplayID, .invalidDisplayIndex:
.INVALID_ARGUMENT
case .captureCreationFailed, .windowCaptureFailed, .captureFailed, .captureFailure:
.CAPTURE_FAILED
case .windowNotFound, .noWindowsFound:
.WINDOW_NOT_FOUND
case .windowTitleNotFound:
.WINDOW_NOT_FOUND
case .fileWriteError, .fileIOError:
.FILE_IO_ERROR
case .appNotFound:
.APP_NOT_FOUND
case .invalidWindowIndexOld, .invalidWindowIndex:
.INVALID_ARGUMENT
case .invalidArgument:
.INVALID_ARGUMENT
case .unknownError:
.UNKNOWN_ERROR
case .noFrontmostApplication:
.WINDOW_NOT_FOUND
case .invalidCaptureArea:
.INVALID_ARGUMENT
case .ambiguousAppIdentifier:
.AMBIGUOUS_APP_IDENTIFIER
case .imageConversionFailed:
.CAPTURE_FAILED
}
}
private func mapFocusErrorToCode(_ error: FocusError) -> ErrorCode {
errorCode(for: error)
}
}
func errorCode(for focusError: FocusError) -> ErrorCode {
switch focusError {
case .applicationNotRunning:
.APP_NOT_FOUND
case .focusVerificationTimeout, .timeoutWaitingForCondition:
.TIMEOUT
default:
.WINDOW_NOT_FOUND
}
}
// MARK: - Output Formatting Protocol
/// Protocol for commands that support both JSON and human-readable output
@MainActor
protocol OutputFormattable {
var jsonOutput: Bool { get }
var outputLogger: Logger { get }
}
extension OutputFormattable {
/// Output data in appropriate format
func output(_ data: some Codable, humanReadable: () -> Void) {
// Output data in appropriate format
if jsonOutput {
outputSuccessCodable(data: data, logger: self.outputLogger)
} else {
humanReadable()
}
}
/// Output success with optional data
func outputSuccess(data: (some Codable)? = nil as Empty?) {
// Output success with optional data
if jsonOutput {
if let data {
outputSuccessCodable(data: data, logger: self.outputLogger)
} else {
outputJSON(JSONResponse(success: true), logger: self.outputLogger)
}
}
}
}
// MARK: - Permission Checking
/// Check and require screen recording permission
@MainActor
func requireScreenRecordingPermission(services: any PeekabooServiceProviding) async throws {
// Check and require screen recording permission
let hasPermission = await Task { @MainActor in
await services.screenCapture.hasScreenRecordingPermission()
}.value
guard hasPermission else {
throw CaptureError.screenRecordingPermissionDenied
}
}
/// Check and require accessibility permission
@MainActor
func requireAccessibilityPermission(services: any PeekabooServiceProviding) throws {
if !services.permissions.checkAccessibilityPermission() {
throw CaptureError.accessibilityPermissionDenied
}
}
// MARK: - Service Bridges
enum AutomationServiceBridge {
static func waitForElement(
automation: any UIAutomationServiceProtocol,
target: ClickTarget,
timeout: TimeInterval,
sessionId: String?
) async throws -> WaitForElementResult {
let result = try await Task { @MainActor in
try await automation.waitForElement(target: target, timeout: timeout, sessionId: sessionId)
}.value
if !result.warnings.isEmpty {
Logger.shared.debug(
"waitForElement warnings: \(result.warnings.joined(separator: ","))",
category: "Automation"
)
}
return result
}
static func click(
automation: any UIAutomationServiceProtocol,
target: ClickTarget,
clickType: ClickType,
sessionId: String?
) async throws {
try await Task { @MainActor in
try await automation.click(target: target, clickType: clickType, sessionId: sessionId)
}.value
}
static func typeActions(
automation: any UIAutomationServiceProtocol,
request: TypeActionsRequest
) async throws -> TypeResult {
try await Task { @MainActor in
try await automation.typeActions(
request.actions,
cadence: request.cadence,
sessionId: request.sessionId
)
}.value
}
static func scroll(
automation: any UIAutomationServiceProtocol,
request: ScrollRequest
) async throws {
try await Task { @MainActor in
try await automation.scroll(request)
}.value
}
static func hotkey(automation: any UIAutomationServiceProtocol, keys: String, holdDuration: Int) async throws {
try await Task { @MainActor in
try await automation.hotkey(keys: keys, holdDuration: holdDuration)
}.value
}
// swiftlint:disable:next function_parameter_count
static func swipe(
automation: any UIAutomationServiceProtocol,
from: CGPoint,
to: CGPoint,
duration: Int,
steps: Int,
profile: MouseMovementProfile
) async throws {
try await Task { @MainActor in
try await automation.swipe(from: from, to: to, duration: duration, steps: steps, profile: profile)
}.value
}
static func drag(
automation: any UIAutomationServiceProtocol,
request: DragRequest
) async throws {
try await Task { @MainActor in
try await automation.drag(
from: request.from,
to: request.to,
duration: request.duration,
steps: request.steps,
modifiers: request.modifiers,
profile: request.profile
)
}.value
}
static func moveMouse(
automation: any UIAutomationServiceProtocol,
to point: CGPoint,
duration: Int,
steps: Int,
profile: MouseMovementProfile
) async throws {
try await Task { @MainActor in
try await automation.moveMouse(to: point, duration: duration, steps: steps, profile: profile)
}.value
}
static func detectElements(
automation: any UIAutomationServiceProtocol,
imageData: Data,
sessionId: String?,
windowContext: WindowContext?
) async throws -> ElementDetectionResult {
try await Task { @MainActor in
try await automation.detectElements(
in: imageData,
sessionId: sessionId,
windowContext: windowContext
)
}.value
}
static func hasAccessibilityPermission(automation: any UIAutomationServiceProtocol) async -> Bool {
await Task { @MainActor in
await automation.hasAccessibilityPermission()
}.value
}
}
struct TypeActionsRequest: Sendable {
let actions: [TypeAction]
let cadence: TypingCadence
let sessionId: String?
}
struct DragRequest: Sendable {
let from: CGPoint
let to: CGPoint
let duration: Int
let steps: Int
let modifiers: String?
let profile: MouseMovementProfile
}
enum CursorMovementProfileSelection: String {
case linear
case human
}
struct CursorMovementParameters {
let profile: MouseMovementProfile
let duration: Int
let steps: Int
let smooth: Bool
let profileName: String
}
enum CursorMovementResolver {
// swiftlint:disable:next function_parameter_count
static func resolve(
selection: CursorMovementProfileSelection,
durationOverride: Int?,
stepsOverride: Int?,
baseSmooth: Bool,
distance: CGFloat,
defaultDuration: Int,
defaultSteps: Int
) -> CursorMovementParameters {
switch selection {
case .linear:
let resolvedDuration = durationOverride ?? (baseSmooth ? defaultDuration : 0)
let resolvedSteps = baseSmooth ? max(stepsOverride ?? defaultSteps, 1) : 1
return CursorMovementParameters(
profile: .linear,
duration: resolvedDuration,
steps: resolvedSteps,
smooth: baseSmooth,
profileName: selection.rawValue
)
case .human:
let resolvedDuration = durationOverride ?? Self.humanDuration(for: distance)
let resolvedSteps = max(stepsOverride ?? Self.humanSteps(for: distance), 30)
return CursorMovementParameters(
profile: .human(),
duration: resolvedDuration,
steps: resolvedSteps,
smooth: true,
profileName: selection.rawValue
)
}
}
private static func humanDuration(for distance: CGFloat) -> Int {
let distanceFactor = log2(Double(distance) + 1) * 90
let perPixel = Double(distance) * 0.45
let estimate = 280 + distanceFactor + perPixel
return min(max(Int(estimate), 300), 1700)
}
private static func humanSteps(for distance: CGFloat) -> Int {
let scaled = Int(distance * 0.35)
return min(max(scaled, 40), 140)
}
}
enum WindowServiceBridge {
static func closeWindow(windows: any WindowManagementServiceProtocol, target: WindowTarget) async throws {
try await Task { @MainActor in
try await windows.closeWindow(target: target)
}.value
}
static func minimizeWindow(windows: any WindowManagementServiceProtocol, target: WindowTarget) async throws {
try await Task { @MainActor in
try await windows.minimizeWindow(target: target)
}.value
}
static func maximizeWindow(windows: any WindowManagementServiceProtocol, target: WindowTarget) async throws {
try await Task { @MainActor in
try await windows.maximizeWindow(target: target)
}.value
}
static func moveWindow(
windows: any WindowManagementServiceProtocol,
target: WindowTarget,
to origin: CGPoint
) async throws {
try await Task { @MainActor in
try await windows.moveWindow(target: target, to: origin)
}.value
}
static func resizeWindow(
windows: any WindowManagementServiceProtocol,
target: WindowTarget,
to size: CGSize
) async throws {
try await Task { @MainActor in
try await windows.resizeWindow(target: target, to: size)
}.value
}
static func setWindowBounds(
windows: any WindowManagementServiceProtocol,
target: WindowTarget,
bounds: CGRect
) async throws {
try await Task { @MainActor in
try await windows.setWindowBounds(target: target, bounds: bounds)
}.value
}
static func focusWindow(windows: any WindowManagementServiceProtocol, target: WindowTarget) async throws {
try await Task { @MainActor in
try await windows.focusWindow(target: target)
}.value
}
static func listWindows(
windows: any WindowManagementServiceProtocol,
target: WindowTarget
) async throws -> [ServiceWindowInfo] {
try await Task { @MainActor in
try await windows.listWindows(target: target)
}.value
}
}
enum MenuServiceBridge {
static func listMenus(menu: any MenuServiceProtocol, appIdentifier: String) async throws -> MenuStructure {
try await Task { @MainActor in
try await menu.listMenus(for: appIdentifier)
}.value
}
static func listFrontmostMenus(menu: any MenuServiceProtocol) async throws -> MenuStructure {
try await Task { @MainActor in
try await menu.listFrontmostMenus()
}.value
}
static func listMenuExtras(menu: any MenuServiceProtocol) async throws -> [MenuExtraInfo] {
try await Task { @MainActor in
try await menu.listMenuExtras()
}.value
}
static func clickMenuItem(menu: any MenuServiceProtocol, appIdentifier: String, itemPath: String) async throws {
try await Task { @MainActor in
try await menu.clickMenuItem(app: appIdentifier, itemPath: itemPath)
}.value
}
static func clickMenuItemByName(
menu: any MenuServiceProtocol,
appIdentifier: String,
itemName: String
) async throws {
try await Task { @MainActor in
try await menu.clickMenuItemByName(app: appIdentifier, itemName: itemName)
}.value
}
static func clickMenuExtra(menu: any MenuServiceProtocol, title: String) async throws {
try await Task { @MainActor in
try await menu.clickMenuExtra(title: title)
}.value
}
static func listMenuBarItems(menu: any MenuServiceProtocol, includeRaw: Bool = false) async throws
-> [MenuBarItemInfo] {
try await Task { @MainActor in
try await menu.listMenuBarItems(includeRaw: includeRaw)
}.value
}
static func clickMenuBarItem(named name: String, menu: any MenuServiceProtocol) async throws -> PeekabooCore
.ClickResult {
try await Task<PeekabooCore.ClickResult, any Error> { @MainActor in
try await menu.clickMenuBarItem(named: name)
}.value
}
static func clickMenuBarItem(at index: Int, menu: any MenuServiceProtocol) async throws -> PeekabooCore
.ClickResult {
try await Task<PeekabooCore.ClickResult, any Error> { @MainActor in
try await menu.clickMenuBarItem(at: index)
}.value
}
}
enum DockServiceBridge {
static func launchFromDock(dock: any DockServiceProtocol, appName: String) async throws {
try await Task { @MainActor in
try await dock.launchFromDock(appName: appName)
}.value
}
static func findDockItem(dock: any DockServiceProtocol, name: String) async throws -> DockItem {
try await Task { @MainActor in
try await dock.findDockItem(name: name)
}.value
}
static func rightClickDockItem(dock: any DockServiceProtocol, appName: String, menuItem: String?) async throws {
try await Task { @MainActor in
try await dock.rightClickDockItem(appName: appName, menuItem: menuItem)
}.value
}
static func hideDock(dock: any DockServiceProtocol) async throws {
try await Task { @MainActor in
try await dock.hideDock()
}.value
}
static func showDock(dock: any DockServiceProtocol) async throws {
try await Task { @MainActor in
try await dock.showDock()
}.value
}
static func listDockItems(dock: any DockServiceProtocol, includeAll: Bool) async throws -> [DockItem] {
try await Task { @MainActor in
try await dock.listDockItems(includeAll: includeAll)
}.value
}
}
// MARK: - Timeout Utilities
/// Execute an async operation with a timeout
@ -11,6 +653,7 @@ func withTimeout<T: Sendable>(
seconds: TimeInterval,
operation: @escaping @Sendable () async throws -> T
) async throws -> T {
// Execute an async operation with a timeout
let task = Task {
try await operation()
}
@ -33,196 +676,29 @@ func withTimeout<T: Sendable>(
}
}
private typealias TimeoutRaceResult = Result<any Sendable, any Error>
private final class TimeoutRace: @unchecked Sendable {
private let lock = NSLock()
private nonisolated(unsafe) var continuation: (@Sendable (TimeoutRaceResult) -> Void)?
private nonisolated(unsafe) var pendingResult: TimeoutRaceResult?
private nonisolated(unsafe) var completed = false
nonisolated func setContinuation<T: Sendable>(_ continuation: CheckedContinuation<T, any Error>) {
let pendingResult: TimeoutRaceResult?
self.lock.lock()
if self.completed {
pendingResult = self.pendingResult
self.pendingResult = nil
} else {
pendingResult = nil
self.continuation = { result in
switch result {
case let .success(value):
guard let value = value as? T else {
continuation
.resume(throwing: PeekabooError.operationError(message: "Timeout result type mismatch"))
return
}
continuation.resume(returning: value)
case let .failure(error):
continuation.resume(throwing: error)
}
}
}
self.lock.unlock()
if let pendingResult {
self.resume(continuation: continuation, with: pendingResult)
}
}
nonisolated func resume<T: Sendable>(with result: Result<T, any Error>) {
let result = result.map { value in value as any Sendable }
let continuation: (@Sendable (TimeoutRaceResult) -> Void)?
self.lock.lock()
if self.completed {
self.lock.unlock()
return
}
self.completed = true
continuation = self.continuation
self.continuation = nil
if continuation == nil {
self.pendingResult = result
}
self.lock.unlock()
continuation?(result)
}
private nonisolated func resume<T: Sendable>(
continuation: CheckedContinuation<T, any Error>,
with result: TimeoutRaceResult
) {
switch result {
case let .success(value):
guard let value = value as? T else {
continuation.resume(throwing: PeekabooError.operationError(message: "Timeout result type mismatch"))
return
}
continuation.resume(returning: value)
case let .failure(error):
continuation.resume(throwing: error)
}
}
}
/// Race an operation against a wall-clock timeout, even if the operation ignores cancellation.
func withCommandTimeout<T: Sendable>(
seconds: TimeInterval,
operationName: String,
operation: @escaping @Sendable () async throws -> T
) async throws -> T {
guard seconds > 0 else {
throw PeekabooError.invalidInput("Timeout must be greater than 0 seconds")
}
let race = TimeoutRace()
let workTask = Task {
do {
let value = try await operation()
race.resume(with: .success(value))
} catch {
race.resume(with: Result<T, any Error>.failure(error))
}
}
let timeoutTask = Task.detached {
do {
try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
} catch {
return
}
race.resume(with: Result<T, any Error>.failure(PeekabooError.timeout(
operation: operationName,
duration: seconds
)))
workTask.cancel()
}
return try await withTaskCancellationHandler {
try await withCheckedThrowingContinuation { continuation in
race.setContinuation(continuation)
}
} onCancel: {
race.resume(with: Result<T, any Error>.failure(CancellationError()))
workTask.cancel()
timeoutTask.cancel()
}
}
@MainActor
func withMainActorCommandTimeout<T: Sendable>(
seconds: TimeInterval,
operationName: String,
timeoutError: (@Sendable () -> any Error)? = nil,
desktopMutationWatermarkStore: DesktopMutationWatermarkStore? = nil,
interactionMutationTracker: InteractionMutationTracker? = nil,
operation: @escaping @MainActor () async throws -> T
) async throws -> T {
guard seconds > 0 else {
throw PeekabooError.invalidInput("Timeout must be greater than 0 seconds")
}
let race = TimeoutRace()
let pendingMutation = try desktopMutationWatermarkStore?.beginMutation()
do {
try interactionMutationTracker?.retainDurableMutationLease()
} catch {
if let desktopMutationWatermarkStore, let pendingMutation {
try? desktopMutationWatermarkStore.cancelMutation(pendingMutation)
}
throw error
}
let workTask = Task { @MainActor in
let result: Result<T, any Error>
do {
result = try await .success(operation())
} catch {
result = .failure(error)
}
if let desktopMutationWatermarkStore, let pendingMutation {
_ = try? desktopMutationWatermarkStore.completeMutation(pendingMutation)
}
_ = try? interactionMutationTracker?.completeDurableMutation(through: Date())
race.resume(with: result)
}
let timeoutTask = Task.detached {
do {
try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
} catch {
return
}
let error = timeoutError?() ?? PeekabooError.timeout(operation: operationName, duration: seconds)
race.resume(with: Result<T, any Error>.failure(error))
workTask.cancel()
}
return try await withTaskCancellationHandler {
try await withCheckedThrowingContinuation { continuation in
race.setContinuation(continuation)
}
} onCancel: {
race.resume(with: Result<T, any Error>.failure(CancellationError()))
workTask.cancel()
timeoutTask.cancel()
}
}
// MARK: - Window Target Extensions
extension WindowIdentificationOptions {
/// Create a window target from options
func createTarget() throws -> WindowTarget {
try self.toWindowTarget()
func createTarget() -> WindowTarget {
// Create a window target from options
if let app {
if let index = windowIndex {
return .index(app: app, index: index)
} else if let title = windowTitle {
return .title(title)
} else {
return .application(app)
}
}
return .frontmost
}
/// Select a window from a list based on options
@MainActor
func selectWindow(from windows: [ServiceWindowInfo]) -> ServiceWindowInfo? {
if let windowId {
windows.first(where: { $0.windowID == windowId })
} else if let title = windowTitle {
// Select a window from a list based on options
if let title = windowTitle {
windows.first { $0.title.localizedCaseInsensitiveContains(title) }
} else if let index = windowIndex, index < windows.count {
windows[index]
@ -260,6 +736,36 @@ extension WindowIdentificationOptions {
}
}
// MARK: - Common Command Base Classes
// Note: WindowCommandBase is currently unused and has been commented out
// to avoid compilation issues with overlapping Commander option metadata.
/*
/// Base struct for commands that work with windows
struct WindowCommandBase: @MainActor MainActorAsyncParsableCommand, ErrorHandlingCommand, OutputFormattable {
@Option(name: .shortAndLong, help: "Target application name or bundle ID")
var app: String?
@Option(name: .customShort("i", allowingJoined: false), help: "Window index (0-based)")
var windowIndex: Int?
@Option(name: .long, help: "Window title (partial match)")
var windowTitle: String?
@Flag(name: .long, help: "Output in JSON format")
var jsonOutput = false
/// Get window identification options
var windowOptions: WindowIdentificationOptions {
WindowIdentificationOptions(
app: app,
windowTitle: windowTitle,
windowIndex: windowIndex
)
}
}
*/
// MARK: - Application Resolution
/// Marker protocol for commands that need to resolve applications using injected services.
@ -295,6 +801,7 @@ extension Error {
return captureError
}
// Map PeekabooError to CaptureError
if let peekabooError = self as? PeekabooError {
switch peekabooError {
case let .appNotFound(identifier):
@ -306,6 +813,7 @@ extension Error {
}
}
// Default
return .unknownError(self.localizedDescription)
}
}

View File

@ -1,6 +1,5 @@
import Commander
import Foundation
import PeekabooAutomationKit
// MARK: - Binder
@ -10,7 +9,7 @@ enum CommanderCLIBinder {
parsedValues: ParsedValues
) throws -> any ParsableCommand {
var command = type.init()
let runtimeOptions = try makeRuntimeOptions(from: parsedValues, commandType: type)
let runtimeOptions = try self.makeRuntimeOptions(from: parsedValues)
if var bindable = command as? any CommanderBindableCommand {
try bindable.applyCommanderValues(.init(parsedValues: parsedValues))
guard let rebound = bindable as? any ParsableCommand else {
@ -28,46 +27,18 @@ enum CommanderCLIBinder {
return command
}
static func instantiateCommand<T: ParsableCommand>(
static func instantiateCommand<T>(
ofType type: T.Type,
parsedValues: ParsedValues
) throws -> T {
) throws -> T where T: ParsableCommand {
guard let command = try instantiateCommand(type: type, parsedValues: parsedValues) as? T else {
preconditionFailure("Commander instantiation failed to produce expected type \(T.self)")
}
return command
}
static func makeRuntimeOptions(
from parsedValues: ParsedValues,
commandType: (any ParsableCommand.Type)? = nil
) throws -> CommandRuntimeOptions {
static func makeRuntimeOptions(from parsedValues: ParsedValues) throws -> CommandRuntimeOptions {
var options = CommandRuntimeOptions()
options.requiresApplicationLaunchOptions = Self.requiresApplicationLaunchOptions(commandType)
options.requiresApplicationRelaunch = commandType == AppCommand.RelaunchSubcommand.self
options.requiresSurvivingApplicationHost = commandType == AppCommand.QuitSubcommand.self
options.requiresHostApplicationInventory = Self.requiresHostApplicationInventory(commandType)
options.requiresImplicitSnapshotInvalidation = Self.requiresImplicitSnapshotInvalidation(
commandType,
parsedValues: parsedValues
)
let clipboardMayMutate = commandType == ClipboardCommand.self &&
Self.clipboardMayMutate(parsedValues)
options.requiresCallerDesktopMutationBarrier = commandType == SwitchSubcommand.self ||
commandType == MoveWindowSubcommand.self ||
commandType == CaptureActionCommand.self ||
clipboardMayMutate
options.requiresExactWindowTargetedClicks = Self.requiresExactWindowTargetedClicks(
commandType,
parsedValues: parsedValues
)
options.requiresPostEventClickPermission = Self.requiresPostEventClickPermission(
commandType,
parsedValues: parsedValues
)
options.usesPerToolSnapshotInvalidation = commandType == AgentCommand.self ||
commandType == MCPCommand.Serve.self ||
commandType == InspectUICommand.self
options.verbose = parsedValues.flags.contains("verbose")
options.jsonOutput = parsedValues.flags.contains("jsonOutput")
let values = CommanderBindableValues(parsedValues: parsedValues)
@ -78,319 +49,9 @@ enum CommanderCLIBinder {
.trimmingCharacters(in: .whitespacesAndNewlines),
!captureEngine.isEmpty {
options.captureEnginePreference = captureEngine
if !options.requiresApplicationLaunchOptions && !options.requiresHostApplicationInventory {
options.preferRemote = false
}
}
if let rawInputStrategy = values.singleOption("inputStrategy")?
.trimmingCharacters(in: .whitespacesAndNewlines),
!rawInputStrategy.isEmpty {
guard let strategy = UIInputStrategy(rawValue: rawInputStrategy) else {
throw CommanderBindingError.invalidArgument(
label: "input-strategy",
value: rawInputStrategy,
reason: "expected one of \(UIInputStrategy.allCases.map(\.rawValue).joined(separator: ", "))"
)
}
options.inputStrategy = strategy
}
if values.flag("no-remote") {
options.preferRemote = false
options.remoteIsolationRequested = true
}
let explicitBridgeSocket = values.singleOption("bridge-socket")?.trimmingCharacters(in: .whitespacesAndNewlines)
if commandType == AgentCommand.self && !values.flag("no-remote") {
// Agent execution should stay local by default unless explicitly overridden.
options.preferRemote = false
}
if Self.isDaemonCommand(commandType) {
options.preferRemote = false
options.autoStartDaemon = false
}
if Self.requiresCallerLocalRuntime(commandType) {
options.preferRemote = false
} else if Self.prefersLocalRuntime(commandType), !values.flag("no-remote"),
explicitBridgeSocket?.isEmpty ?? true {
options.preferRemote = false
}
if let socketPath = explicitBridgeSocket, !socketPath.isEmpty {
options.bridgeSocketPath = socketPath
}
if commandType == SetValueCommand.self || commandType == PerformActionCommand.self {
options.requiresElementActions = true
}
if commandType == InspectUICommand.self {
options.requiresInspectAccessibilityTree = true
}
if commandType == BrowserCommand.self {
options.requiresBrowserMCP = true
}
return options
}
private static func requiresApplicationLaunchOptions(_ commandType: (any ParsableCommand.Type)?) -> Bool {
commandType == OpenCommand.self ||
commandType == AppCommand.LaunchSubcommand.self ||
commandType == AppCommand.RelaunchSubcommand.self
}
private static func requiresHostApplicationInventory(_ commandType: (any ParsableCommand.Type)?) -> Bool {
commandType == ListCommand.AppsSubcommand.self ||
commandType == AppCommand.ListSubcommand.self
}
private static func requiresImplicitSnapshotInvalidation(
_ commandType: (any ParsableCommand.Type)?,
parsedValues: ParsedValues
) -> Bool {
if commandType == ClipboardCommand.self {
return self.clipboardMayMutate(parsedValues)
}
if commandType == MenuBarCommand.self {
return parsedValues.positional.first?.lowercased() == "click"
}
if commandType == BrowserCommand.self {
return BrowserCommand.actionMayMutate(parsedValues.positional.first ?? "status")
}
if commandType == SeeCommand.self {
return true
}
if self.isInteractivePermissionRequest(commandType) {
return true
}
if commandType == DialogCommand.ListSubcommand.self {
return self.dialogListMayFocus(parsedValues)
}
if commandType == MenuCommand.ListSubcommand.self {
return self.menuListMayFocus(parsedValues)
}
if commandType == ImageCommand.self ||
commandType == CaptureLiveCommand.self ||
commandType == CaptureWatchAlias.self {
return self.captureCommandMayFocus(commandType, parsedValues: parsedValues)
}
return commandType == OpenCommand.self ||
commandType == AppCommand.LaunchSubcommand.self ||
commandType == AppCommand.RelaunchSubcommand.self ||
commandType == AppCommand.QuitSubcommand.self ||
commandType == AppCommand.HideSubcommand.self ||
commandType == AppCommand.UnhideSubcommand.self ||
commandType == AppCommand.SwitchSubcommand.self ||
commandType == ClickCommand.self ||
commandType == MoveCommand.self ||
commandType == TypeCommand.self ||
commandType == PressCommand.self ||
commandType == HotkeyCommand.self ||
commandType == PasteCommand.self ||
commandType == ScrollCommand.self ||
commandType == SwipeCommand.self ||
commandType == DragCommand.self ||
commandType == SetValueCommand.self ||
commandType == PerformActionCommand.self ||
commandType == CaptureActionCommand.self ||
commandType == WindowCommand.FocusSubcommand.self ||
commandType == WindowCommand.CloseSubcommand.self ||
commandType == WindowCommand.MinimizeSubcommand.self ||
commandType == WindowCommand.MaximizeSubcommand.self ||
commandType == WindowCommand.MoveSubcommand.self ||
commandType == WindowCommand.ResizeSubcommand.self ||
commandType == WindowCommand.SetBoundsSubcommand.self ||
commandType == DialogCommand.ClickSubcommand.self ||
commandType == DialogCommand.DismissSubcommand.self ||
commandType == DialogCommand.InputSubcommand.self ||
commandType == DialogCommand.FileSubcommand.self ||
commandType == MenuCommand.ClickSubcommand.self ||
commandType == MenuCommand.ClickExtraSubcommand.self ||
commandType == DockCommand.LaunchSubcommand.self ||
commandType == DockCommand.RightClickSubcommand.self ||
commandType == DockCommand.HideSubcommand.self ||
commandType == DockCommand.ShowSubcommand.self ||
commandType == SwitchSubcommand.self ||
commandType == MoveWindowSubcommand.self ||
commandType == RunCommand.self
}
private static func isInteractivePermissionRequest(
_ commandType: (any ParsableCommand.Type)?
) -> Bool {
commandType == PermissionsCommand.RequestScreenRecordingSubcommand.self ||
commandType == PermissionsCommand.RequestEventSynthesizingSubcommand.self ||
commandType == PermissionCommand.RequestScreenRecordingSubcommand.self ||
commandType == PermissionCommand.RequestAccessibilitySubcommand.self ||
commandType == PermissionCommand.RequestEventSynthesizingSubcommand.self
}
private static func clipboardMayMutate(_ parsedValues: ParsedValues) -> Bool {
let values = CommanderBindableValues(parsedValues: parsedValues)
let positionalAction = values.positionalValue(at: 0)?
.trimmingCharacters(in: .whitespacesAndNewlines)
let action = (positionalAction?.isEmpty == false ? positionalAction : nil) ??
values.singleOption("actionOption") ??
values.singleOption("action")
return ClipboardCommand.actionMayMutate(action)
}
private static func menuListMayFocus(_ parsedValues: ParsedValues) -> Bool {
let values = CommanderBindableValues(parsedValues: parsedValues)
return !values.flag("noAutoFocus")
}
private static func dialogListMayFocus(_ parsedValues: ParsedValues) -> Bool {
let values = CommanderBindableValues(parsedValues: parsedValues)
let hasWindowTarget = values.singleOption("windowId") != nil ||
values.singleOption("windowTitle") != nil ||
values.singleOption("windowIndex") != nil
if hasWindowTarget {
return true
}
guard !values.flag("noAutoFocus") else { return false }
let app = values.singleOption("app")?
.trimmingCharacters(in: .whitespacesAndNewlines)
return app?.isEmpty == false ||
values.singleOption("pid") != nil
}
private static func captureCommandMayFocus(
_ commandType: (any ParsableCommand.Type)?,
parsedValues: ParsedValues
) -> Bool {
let values = CommanderBindableValues(parsedValues: parsedValues)
let focus = values.singleOption("captureFocus")?
.trimmingCharacters(in: .whitespacesAndNewlines)
.lowercased()
guard focus != "background" else { return false }
let app = values.singleOption("app")?
.trimmingCharacters(in: .whitespacesAndNewlines)
let hasApplicationTarget = app?.isEmpty == false || values.singleOption("pid") != nil
if commandType == ImageCommand.self {
let normalizedApp = app?.lowercased()
guard normalizedApp != "menubar", normalizedApp != "frontmost" else { return false }
let mode = values.singleOption("mode")?
.trimmingCharacters(in: .whitespacesAndNewlines)
.lowercased() ?? Self.inferredImageCaptureMode(values)
switch mode {
case "window":
return values.singleOption("windowId") == nil && hasApplicationTarget
case "multi":
return hasApplicationTarget
default:
return false
}
}
let mode = values.singleOption("mode")?
.trimmingCharacters(in: .whitespacesAndNewlines)
.lowercased() ?? Self.inferredLiveCaptureMode(values)
return mode == "window" && hasApplicationTarget
}
private static func inferredImageCaptureMode(_ values: CommanderBindableValues) -> String {
if values.singleOption("region") != nil { return "area" }
if values.singleOption("app") != nil ||
values.singleOption("pid") != nil ||
values.singleOption("windowTitle") != nil ||
values.singleOption("windowIndex") != nil ||
values.singleOption("windowId") != nil {
return "window"
}
return "frontmost"
}
private static func inferredLiveCaptureMode(_ values: CommanderBindableValues) -> String {
if values.singleOption("region") != nil { return "area" }
if values.singleOption("app") != nil ||
values.singleOption("pid") != nil ||
values.singleOption("windowTitle") != nil ||
values.singleOption("windowIndex") != nil {
return "window"
}
return "frontmost"
}
private static func requiresExactWindowTargetedClicks(
_ commandType: (any ParsableCommand.Type)?,
parsedValues: ParsedValues
) -> Bool {
guard commandType == ClickCommand.self else { return false }
let values = CommanderBindableValues(parsedValues: parsedValues)
guard self.usesBackgroundClickDelivery(values) else { return false }
let hasWindowSelector = values.singleOption("windowId") != nil ||
values.singleOption("windowTitle") != nil ||
values.singleOption("windowIndex") != nil
if hasWindowSelector {
return true
}
let hasProcessTarget = values.singleOption("app") != nil || values.singleOption("pid") != nil
return values.singleOption("coords") != nil && hasProcessTarget && !values.flag("globalCoords")
}
private static func requiresPostEventClickPermission(
_ commandType: (any ParsableCommand.Type)?,
parsedValues: ParsedValues
) -> Bool {
guard commandType == ClickCommand.self else { return false }
let values = CommanderBindableValues(parsedValues: parsedValues)
guard self.usesBackgroundClickDelivery(values) else { return false }
if values.singleOption("coords") != nil {
return true
}
// ClickCommand resolves conflicting flags as right-click first, then double-click.
return values.flag("double") && !values.flag("right")
}
private static func usesBackgroundClickDelivery(_ values: CommanderBindableValues) -> Bool {
if values.flag("focusBackground") { return true }
return !values.flag("foreground") &&
!values.flag("noAutoFocus") &&
!values.flag("spaceSwitch") &&
!values.flag("bringToCurrentSpace") &&
values.singleOption("focusTimeoutSeconds") == nil &&
values.singleOption("focusRetryCount") == nil
}
private static func prefersLocalRuntime(_ commandType: (any ParsableCommand.Type)?) -> Bool {
commandType == MCPCommand.Serve.self ||
commandType == ToolsCommand.self ||
commandType == SleepCommand.self ||
commandType == LearnCommand.self ||
commandType == CleanCommand.self ||
commandType == ConfigCommand.InitCommand.self ||
commandType == ConfigCommand.ShowCommand.self ||
commandType == ConfigCommand.EditCommand.self ||
commandType == ConfigCommand.ValidateCommand.self ||
commandType == ConfigCommand.AddCommand.self ||
commandType == ConfigCommand.LoginCommand.self ||
commandType == ConfigCommand.SetCredentialCommand.self ||
commandType == ConfigCommand.AddProviderCommand.self ||
commandType == ConfigCommand.ListProvidersCommand.self ||
commandType == ConfigCommand.TestProviderCommand.self ||
commandType == ConfigCommand.RemoveProviderCommand.self ||
commandType == ConfigCommand.ModelsProviderCommand.self ||
commandType == ListCommand.ScreensSubcommand.self ||
commandType == PermissionsCommand.RequestScreenRecordingSubcommand.self ||
commandType == PermissionCommand.RequestScreenRecordingSubcommand.self ||
commandType == PermissionCommand.RequestAccessibilitySubcommand.self
}
private static func requiresCallerLocalRuntime(_ commandType: (any ParsableCommand.Type)?) -> Bool {
commandType == PermissionsCommand.RequestScreenRecordingSubcommand.self ||
commandType == PermissionCommand.RequestScreenRecordingSubcommand.self ||
commandType == PermissionCommand.RequestAccessibilitySubcommand.self
}
private static func isDaemonCommand(_ commandType: (any ParsableCommand.Type)?) -> Bool {
commandType == DaemonCommand.self ||
commandType == DaemonCommand.Start.self ||
commandType == DaemonCommand.Stop.self ||
commandType == DaemonCommand.Status.self ||
commandType == DaemonCommand.Run.self
}
}
// MARK: - Bindable Protocol
@ -505,56 +166,27 @@ extension CommanderBindableValues {
if let pid: Int32 = try decodeOption("pid", as: Int32.self) {
options.pid = pid
}
if let windowId: Int = try decodeOption("windowId", as: Int.self) {
options.windowId = windowId
}
options.windowTitle = self.singleOption("windowTitle")
if let index: Int = try decodeOption("windowIndex", as: Int.self) {
options.windowIndex = index
}
}
func makeInteractionTargetOptions() throws -> InteractionTargetOptions {
var options = InteractionTargetOptions()
try fillInteractionTargetOptions(into: &options)
return options
}
func fillInteractionTargetOptions(into options: inout InteractionTargetOptions) throws {
options.app = self.singleOption("app")
if let pid: Int32 = try decodeOption("pid", as: Int32.self) {
options.pid = pid
}
if let windowId: Int = try decodeOption("windowId", as: Int.self) {
options.windowId = windowId
}
options.windowTitle = self.singleOption("windowTitle")
if let index: Int = try decodeOption("windowIndex", as: Int.self) {
options.windowIndex = index
}
}
func makeFocusOptions(includeBackgroundDelivery: Bool = false) throws -> FocusCommandOptions {
func makeFocusOptions() throws -> FocusCommandOptions {
var options = FocusCommandOptions()
try fillFocusOptions(into: &options, includeBackgroundDelivery: includeBackgroundDelivery)
try fillFocusOptions(into: &options)
return options
}
func fillFocusOptions(
into options: inout FocusCommandOptions,
includeBackgroundDelivery: Bool = false
) throws {
func fillFocusOptions(into options: inout FocusCommandOptions) throws {
options.noAutoFocus = self.flag("noAutoFocus")
options.spaceSwitch = self.flag("spaceSwitch")
options.bringToCurrentSpace = self.flag("bringToCurrentSpace")
if includeBackgroundDelivery && self.flag("focusBackground") {
options.focusBackground = true
}
if let timeout: TimeInterval = try decodeOption("focusTimeoutSeconds", as: TimeInterval.self) {
options.focusTimeoutSeconds = timeout
}
if let retries: Int = try decodeOption("focusRetryCount", as: Int.self) {
options.focusRetryCount = retries
if let retries: Int = try decodeOption("focusRetryCountValue", as: Int.self) {
options.focusRetryCountValue = retries
}
}
}
@ -564,7 +196,7 @@ protocol CommanderBindableCommand {
mutating func applyCommanderValues(_ values: CommanderBindableValues) throws
}
enum CommanderBindingError: LocalizedError, Equatable {
enum CommanderBindingError: LocalizedError, Sendable, Equatable {
case missingArgument(label: String)
case invalidArgument(label: String, value: String, reason: String)

View File

@ -1,65 +0,0 @@
import CoreGraphics
import Foundation
import PeekabooCore
enum CursorMovementProfileSelection: String {
case linear
case human
}
struct CursorMovementParameters {
let profile: MouseMovementProfile
let duration: Int
let steps: Int
let smooth: Bool
let profileName: String
}
struct CursorMovementResolutionRequest {
let selection: CursorMovementProfileSelection
let durationOverride: Int?
let stepsOverride: Int?
let baseSmooth: Bool
let distance: CGFloat
let defaultDuration: Int
let defaultSteps: Int
}
enum CursorMovementResolver {
static func resolve(_ request: CursorMovementResolutionRequest) -> CursorMovementParameters {
switch request.selection {
case .linear:
let resolvedDuration = request.durationOverride ?? (request.baseSmooth ? request.defaultDuration : 0)
let resolvedSteps = request.baseSmooth ? max(request.stepsOverride ?? request.defaultSteps, 1) : 1
return CursorMovementParameters(
profile: .linear,
duration: resolvedDuration,
steps: resolvedSteps,
smooth: request.baseSmooth,
profileName: request.selection.rawValue
)
case .human:
let resolvedDuration = request.durationOverride ?? Self.humanDuration(for: request.distance)
let resolvedSteps = max(request.stepsOverride ?? Self.humanSteps(for: request.distance), 30)
return CursorMovementParameters(
profile: .human(),
duration: resolvedDuration,
steps: resolvedSteps,
smooth: true,
profileName: request.selection.rawValue
)
}
}
private static func humanDuration(for distance: CGFloat) -> Int {
let distanceFactor = log2(Double(distance) + 1) * 90
let perPixel = Double(distance) * 0.45
let estimate = 280 + distanceFactor + perPixel
return min(max(Int(estimate), 300), 1700)
}
private static func humanSteps(for distance: CGFloat) -> Int {
let scaled = Int(distance * 0.35)
return min(max(scaled, 40), 140)
}
}

View File

@ -5,7 +5,7 @@ extension ParsableCommand {
let instance = Self()
let signature = CommandSignature.describe(instance)
.flattened()
.withPeekabooRuntimeFlags()
.withStandardRuntimeFlags()
let parser = CommandParser(signature: signature)
let parsedValues = try parser.parse(arguments: arguments)
return try CommanderCLIBinder.instantiateCommand(ofType: Self.self, parsedValues: parsedValues)

View File

@ -1,307 +0,0 @@
import Foundation
import PeekabooAutomationKit
import PeekabooBridge
import PeekabooFoundation
enum BridgeCapabilityPolicy {
static func supportsRemoteRequirements(
for handshake: PeekabooBridgeHandshakeResponse,
options: CommandRuntimeOptions
) -> Bool {
guard handshake.supportedOperations.contains(.captureScreen) else {
return false
}
if options.requiresElementActions, !self.supportsElementActions(for: handshake) {
return false
}
if options.requiresInspectAccessibilityTree, !self.supportsInspectAccessibilityTree(for: handshake) {
return false
}
if options.requiresBrowserMCP, !self.supportsBrowserMCP(for: handshake) {
return false
}
if options.requiresApplicationLaunchOptions, !self.supportsApplicationLaunchOptions(for: handshake) {
return false
}
if options.requiresApplicationRelaunch, !self.supportsApplicationRelaunch(for: handshake) {
return false
}
if options.requiresSurvivingApplicationHost, handshake.hostKind != .onDemand {
return false
}
if options.requiresHostApplicationInventory, !self.supportsHostApplicationInventory(for: handshake) {
return false
}
if options.requiresExactWindowTargetedClicks,
!self.supportsExactWindowTargetedClicks(for: handshake) {
return false
}
if options.requiresPostEventClickPermission,
handshake.permissions?.postEvent != true {
return false
}
if options.requiresImplicitSnapshotInvalidation || options.usesPerToolSnapshotInvalidation,
!self.supportsImplicitSnapshotInvalidation(for: handshake) {
return false
}
return true
}
static func supportsTargetedHotkeys(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
self.targetedHotkeyAvailability(for: handshake).isEnabled
}
static func supportsTargetedTypeActions(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
self.targetedTypeAvailability(for: handshake).isEnabled
}
static func supportsTargetedClicks(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
self.targetedClickAvailability(for: handshake).isEnabled
}
static func supportsApplicationLaunchOptions(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 9) &&
handshake.supportedOperations.contains(.launchApplicationWithOptions)
}
static func supportsApplicationRelaunch(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
guard handshake.hostKind == .onDemand,
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 9),
handshake.supportedOperations.contains(.relaunchApplicationWithOptions)
else {
return false
}
let enabledOperations = handshake.enabledOperations ?? handshake.supportedOperations
return enabledOperations.contains(.relaunchApplicationWithOptions)
}
static func supportsHostApplicationInventory(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
guard handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 0),
handshake.supportedOperations.contains(.listApplications)
else {
return false
}
let enabledOperations = handshake.enabledOperations ?? handshake.supportedOperations
return enabledOperations.contains(.listApplications)
}
static func supportsImplicitSnapshotInvalidation(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
guard handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 9),
handshake.supportedOperations.contains(.invalidateImplicitLatestSnapshot)
else {
return false
}
let enabledOperations = handshake.enabledOperations ?? handshake.supportedOperations
return enabledOperations.contains(.invalidateImplicitLatestSnapshot)
}
static func supportsElementActions(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 3) &&
handshake.supportedOperations.contains(.setValue) &&
handshake.supportedOperations.contains(.performAction)
}
static func supportsDesktopObservation(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 5) &&
handshake.supportedOperations.contains(.desktopObservation)
}
static func supportsInspectAccessibilityTree(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 7) &&
handshake.supportedOperations.contains(.inspectAccessibilityTree)
}
static func supportsBrowserMCP(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 4) &&
handshake.supportedOperations.contains(.browserStatus) &&
handshake.supportedOperations.contains(.browserConnect) &&
handshake.supportedOperations.contains(.browserDisconnect) &&
handshake.supportedOperations.contains(.browserExecute)
}
static func supportsPostEventPermissionRequest(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 2) &&
handshake.supportedOperations.contains(.requestPostEventPermission)
}
static func targetedHotkeyAvailability(for handshake: PeekabooBridgeHandshakeResponse)
-> (isEnabled: Bool, unavailableReason: String?, missingPermissions: Set<PeekabooBridgePermissionKind>) {
guard
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 1),
handshake.supportedOperations.contains(.targetedHotkey)
else {
return (false, nil, [])
}
let enabledOperations = handshake.enabledOperations ?? handshake.supportedOperations
if enabledOperations.contains(.targetedHotkey) {
return (true, nil, [])
}
let missingPermissions = missingPermissions(for: .targetedHotkey, handshake: handshake)
guard !missingPermissions.isEmpty else {
return (
false,
"Remote bridge host supports background hotkeys, but they are disabled by current permissions",
[]
)
}
return (
false,
"Remote bridge host supports background hotkeys, but current permissions are missing: " +
self.missingPermissionNames(missingPermissions).joined(separator: ", "),
missingPermissions
)
}
static func targetedClickAvailability(for handshake: PeekabooBridgeHandshakeResponse)
-> (isEnabled: Bool, unavailableReason: String?, missingPermissions: Set<PeekabooBridgePermissionKind>) {
guard
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 6),
handshake.supportedOperations.contains(.targetedClick)
else {
return (false, nil, [])
}
let enabledOperations = handshake.enabledOperations ?? handshake.supportedOperations
if enabledOperations.contains(.targetedClick) {
let missingVariantPermissions: Set<PeekabooBridgePermissionKind> =
handshake.permissions?.postEvent == false ? [.postEvent] : []
return (true, nil, missingVariantPermissions)
}
let requestAwarePermissions =
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 9) &&
handshake.permissionTags[PeekabooBridgeOperation.targetedClick.rawValue]?.isEmpty == true
if requestAwarePermissions,
handshake.permissions?.accessibility == false,
handshake.permissions?.postEvent == false {
return (
false,
"Remote bridge host background clicks require Accessibility or Event Synthesizing permission",
[]
)
}
let missingPermissions = missingPermissions(for: .targetedClick, handshake: handshake)
guard !missingPermissions.isEmpty else {
return (
false,
"Remote bridge host supports background clicks, but they are disabled by current permissions",
[]
)
}
return (
false,
"Remote bridge host supports background clicks, but current permissions are missing: " +
self.missingPermissionNames(missingPermissions).joined(separator: ", "),
missingPermissions
)
}
static func supportsExactWindowTargetedClicks(for handshake: PeekabooBridgeHandshakeResponse) -> Bool {
guard handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 9),
handshake.supportedOperations.contains(.exactWindowTargetedClick)
else {
return false
}
return (handshake.enabledOperations ?? handshake.supportedOperations)
.contains(.exactWindowTargetedClick)
}
static func targetedTypeAvailability(for handshake: PeekabooBridgeHandshakeResponse)
-> (isEnabled: Bool, unavailableReason: String?, missingPermissions: Set<PeekabooBridgePermissionKind>) {
guard
handshake.negotiatedVersion >= PeekabooBridgeProtocolVersion(major: 1, minor: 8),
handshake.supportedOperations.contains(.targetedTypeActions)
else {
return (false, nil, [])
}
let enabledOperations = handshake.enabledOperations ?? handshake.supportedOperations
if enabledOperations.contains(.targetedTypeActions) {
return (true, nil, [])
}
let missingPermissions = missingPermissions(for: .targetedTypeActions, handshake: handshake)
guard !missingPermissions.isEmpty else {
return (
false,
"Remote bridge host supports background typing, but it is disabled by current permissions",
[]
)
}
return (
false,
"Remote bridge host supports background typing, but current permissions are missing: " +
self.missingPermissionNames(missingPermissions).joined(separator: ", "),
missingPermissions
)
}
private static func missingPermissions(
for operation: PeekabooBridgeOperation,
handshake: PeekabooBridgeHandshakeResponse
) -> Set<PeekabooBridgePermissionKind> {
let requiredPermissions = Set(
handshake.permissionTags[operation.rawValue] ?? Array(operation.requiredPermissions)
)
let grantedPermissions = grantedPermissions(from: handshake.permissions)
return requiredPermissions.subtracting(grantedPermissions)
}
private static func missingPermissionNames(_ permissions: Set<PeekabooBridgePermissionKind>) -> [String] {
permissions.map(\.displayName).sorted()
}
private static func grantedPermissions(from status: PermissionsStatus?) -> Set<PeekabooBridgePermissionKind> {
guard let status else { return [] }
var granted: Set<PeekabooBridgePermissionKind> = []
if status.screenRecording {
granted.insert(.screenRecording)
}
if status.accessibility {
granted.insert(.accessibility)
}
if status.appleScript {
granted.insert(.appleScript)
}
if status.postEvent {
granted.insert(.postEvent)
}
return granted
}
}
extension PeekabooBridgePermissionKind {
fileprivate var displayName: String {
switch self {
case .screenRecording:
"Screen Recording"
case .accessibility:
"Accessibility"
case .postEvent:
"Event Synthesizing"
case .appleScript:
"AppleScript"
}
}
}

View File

@ -1,14 +0,0 @@
enum BridgeSocketResolver {
static func explicitBridgeSocket(
options: CommandRuntimeOptions,
environment: [String: String]
) -> String? {
if let socket = options.bridgeSocketPath, !socket.isEmpty {
return socket
}
if let socket = environment["PEEKABOO_BRIDGE_SOCKET"], !socket.isEmpty {
return socket
}
return nil
}
}

View File

@ -1,605 +0,0 @@
import Darwin
import Foundation
import MachO
import PeekabooBridge
enum DaemonLaunchPolicy {
enum ImplicitRuntimeCandidateRole: Equatable {
case reusableDaemon
case defaultAppFallback
}
struct LaunchResult {
let status: PeekabooDaemonStatus
let processID: pid_t
var ownsObservedDaemon: Bool {
self.status.pid == self.processID
}
}
enum SocketAvailability: Equatable {
case available
case reusableDaemon
case timedOut
}
enum LegacyStopRaceResolution: Equatable {
case keepReplacement
case useLegacy(socketPath: String)
}
static func shouldAutoStartDaemon(
options: CommandRuntimeOptions,
environment: [String: String]
) -> Bool {
options.autoStartDaemon &&
BridgeSocketResolver.explicitBridgeSocket(options: options, environment: environment) == nil
}
static func daemonSocketPath(environment: [String: String]) -> String {
if let socket = environment["PEEKABOO_DAEMON_SOCKET"]?.trimmingCharacters(in: .whitespacesAndNewlines),
!socket.isEmpty {
return socket
}
return PeekabooBridgeConstants.daemonSocketPath
}
static func runtimeBuildIdentity(
executableURL: URL? = Bundle.main.executableURL,
executableUUIDProvider: (URL) -> [String] = executableUUIDs
) -> String {
let protocolVersion = PeekabooBridgeConstants.protocolVersion
let identityPrefix = "\(protocolVersion.major).\(protocolVersion.minor)|" +
PeekabooBridgeConstants.buildIdentifier
let resolvedURL = executableURL?.resolvingSymlinksInPath()
if let resolvedURL {
let executableUUIDs = executableUUIDProvider(resolvedURL).sorted()
if !executableUUIDs.isEmpty {
return "\(identityPrefix)|\(executableUUIDs.joined(separator: ","))"
}
}
let executablePath = resolvedURL?.path ?? CommandLine.arguments.first ?? "unknown"
let attributes = try? FileManager.default.attributesOfItem(atPath: executablePath)
let fileSize = (attributes?[.size] as? NSNumber)?.uint64Value ?? 0
let modificationBits = (attributes?[.modificationDate] as? Date)?
.timeIntervalSinceReferenceDate.bitPattern ?? 0
return [
identityPrefix,
executablePath,
"\(fileSize)",
"\(modificationBits)",
].joined(separator: "|")
}
private enum ByteOrder {
case little
case big
}
private nonisolated static func executableUUIDs(_ executableURL: URL) -> [String] {
guard let data = try? Data(contentsOf: executableURL, options: .mappedIfSafe) else {
return []
}
return self.machoUUIDs(in: data)
}
nonisolated static func machoUUIDs(in data: Data) -> [String] {
guard let magic = readUInt32(data, at: 0, order: .little) else { return [] }
switch magic {
case UInt32(FAT_CIGAM), UInt32(FAT_CIGAM_64):
return self.fatMachOUUIDs(
in: data,
order: .big,
uses64BitArchitectureRecords: magic == UInt32(FAT_CIGAM_64)
)
case UInt32(FAT_MAGIC), UInt32(FAT_MAGIC_64):
return self.fatMachOUUIDs(
in: data,
order: .little,
uses64BitArchitectureRecords: magic == UInt32(FAT_MAGIC_64)
)
default:
return self.machOUUID(in: data, sliceOffset: 0).map { [$0] } ?? []
}
}
private nonisolated static func fatMachOUUIDs(
in data: Data,
order: ByteOrder,
uses64BitArchitectureRecords: Bool
) -> [String] {
guard let architectureCount = readUInt32(data, at: 4, order: order) else { return [] }
let recordSize = uses64BitArchitectureRecords ? 32 : 20
guard architectureCount <= 64 else { return [] }
var uuids: [String] = []
for index in 0..<Int(architectureCount) {
let recordOffset = 8 + index * recordSize
let rawSliceOffset: UInt64? = if uses64BitArchitectureRecords {
self.readUInt64(data, at: recordOffset + 8, order: order)
} else {
self.readUInt32(data, at: recordOffset + 8, order: order).map(UInt64.init)
}
guard let rawSliceOffset, rawSliceOffset <= UInt64(Int.max) else { return [] }
if let uuid = machOUUID(in: data, sliceOffset: Int(rawSliceOffset)) {
uuids.append(uuid)
}
}
return uuids
}
private nonisolated static func machOUUID(in data: Data, sliceOffset: Int) -> String? {
guard let magic = readUInt32(data, at: sliceOffset, order: .little) else { return nil }
let order: ByteOrder
let headerSize: Int
switch magic {
case UInt32(MH_MAGIC):
order = .little
headerSize = 28
case UInt32(MH_MAGIC_64):
order = .little
headerSize = 32
case UInt32(MH_CIGAM):
order = .big
headerSize = 28
case UInt32(MH_CIGAM_64):
order = .big
headerSize = 32
default:
return nil
}
guard let commandCount = readUInt32(data, at: sliceOffset + 16, order: order),
let commandBytes = readUInt32(data, at: sliceOffset + 20, order: order),
commandCount <= 16384
else { return nil }
var commandOffset = sliceOffset + headerSize
let commandsEnd = commandOffset + Int(commandBytes)
guard commandsEnd >= commandOffset, commandsEnd <= data.count else { return nil }
for _ in 0..<Int(commandCount) {
guard let command = readUInt32(data, at: commandOffset, order: order),
let rawCommandSize = readUInt32(data, at: commandOffset + 4, order: order)
else { return nil }
let commandSize = Int(rawCommandSize)
guard commandSize >= 8, commandOffset + commandSize <= commandsEnd else { return nil }
if command == UInt32(LC_UUID), commandSize >= 24 {
let uuidRange = (commandOffset + 8)..<(commandOffset + 24)
return data[uuidRange].map { String(format: "%02x", $0) }.joined()
}
commandOffset += commandSize
}
return nil
}
private nonisolated static func readUInt32(_ data: Data, at offset: Int, order: ByteOrder) -> UInt32? {
guard offset >= 0, offset + 4 <= data.count else { return nil }
let bytes = data[offset..<(offset + 4)]
return bytes.enumerated().reduce(UInt32(0)) { partial, pair in
let shift = switch order {
case .little: pair.offset * 8
case .big: (3 - pair.offset) * 8
}
return partial | UInt32(pair.element) << UInt32(shift)
}
}
private nonisolated static func readUInt64(_ data: Data, at offset: Int, order: ByteOrder) -> UInt64? {
guard offset >= 0, offset + 8 <= data.count else { return nil }
let bytes = data[offset..<(offset + 8)]
return bytes.enumerated().reduce(UInt64(0)) { partial, pair in
let shift = switch order {
case .little: pair.offset * 8
case .big: (7 - pair.offset) * 8
}
return partial | UInt64(pair.element) << UInt64(shift)
}
}
static func autoStartSocketPath(
daemonSocketPath: String,
defaultSocketWasOccupiedAndRejected: Bool,
runtimeBuildIdentity: String
) -> String {
guard defaultSocketWasOccupiedAndRejected,
let buildScopedSocketPath = buildScopedDaemonSocketPath(
daemonSocketPath: daemonSocketPath,
runtimeBuildIdentity: runtimeBuildIdentity
)
else {
return daemonSocketPath
}
return buildScopedSocketPath
}
static func buildScopedDaemonSocketPath(
daemonSocketPath: String,
runtimeBuildIdentity: String
) -> String? {
guard self.standardizedSocketPath(daemonSocketPath) ==
self.standardizedSocketPath(PeekabooBridgeConstants.daemonSocketPath)
else { return nil }
return URL(fileURLWithPath: daemonSocketPath)
.deletingLastPathComponent()
.appendingPathComponent("daemon-\(self.stableHash(runtimeBuildIdentity)).sock")
.path
}
private static func stableHash(_ value: String) -> String {
var hash: UInt64 = 14_695_981_039_346_656_037
for byte in value.utf8 {
hash ^= UInt64(byte)
hash &*= 1_099_511_628_211
}
return String(format: "%016llx", hash)
}
static func daemonIdleTimeoutSeconds(environment: [String: String]) -> TimeInterval {
guard let raw = environment["PEEKABOO_DAEMON_IDLE_TIMEOUT_SECONDS"]?
.trimmingCharacters(in: .whitespacesAndNewlines),
let value = TimeInterval(raw),
value > 0 else {
return CommandRuntime.defaultDaemonIdleTimeoutSeconds
}
return value
}
static func shouldMigrateLegacyDaemon(targetSocketPath: String) -> Bool {
self.standardizedSocketPath(targetSocketPath) ==
self.standardizedSocketPath(PeekabooBridgeConstants.daemonSocketPath)
}
static func implicitRuntimeCandidateRole(
socketPath: String,
daemonSocketPath: String,
buildScopedDaemonSocketPath: String? = nil
) -> ImplicitRuntimeCandidateRole? {
let candidate = self.standardizedSocketPath(socketPath)
if candidate == self.standardizedSocketPath(daemonSocketPath) ||
buildScopedDaemonSocketPath.map(self.standardizedSocketPath) == candidate {
return .reusableDaemon
}
if self.shouldMigrateLegacyDaemon(targetSocketPath: daemonSocketPath),
candidate == self.standardizedSocketPath(PeekabooBridgeConstants.peekabooSocketPath) {
return .defaultAppFallback
}
return nil
}
static func isSelectableImplicitRuntimeCandidate(
role: ImplicitRuntimeCandidateRole,
handshake: PeekabooBridgeHandshakeResponse,
daemonStatus: PeekabooDaemonStatus?
) -> Bool {
switch role {
case .reusableDaemon:
daemonStatus.map(DaemonControlClient.isReusableDaemonStatus) == true
case .defaultAppFallback:
handshake.hostKind == .gui ||
daemonStatus.map(DaemonControlClient.isReusableDaemonStatus) == true
}
}
static func onDemandDaemonArguments(socketPath: String, idleTimeoutSeconds: TimeInterval) -> [String] {
self.daemonArguments(
socketPath: socketPath,
mode: .auto,
idleTimeoutSeconds: idleTimeoutSeconds
)
}
static func daemonArguments(
socketPath: String,
mode: PeekabooDaemonMode,
pollIntervalMs: Int? = nil,
idleTimeoutSeconds: TimeInterval
) -> [String] {
var arguments = [
"daemon",
"run",
"--mode",
mode.rawValue,
"--bridge-socket",
socketPath,
]
if let pollIntervalMs, pollIntervalMs > 0 {
arguments.append(contentsOf: [
"--poll-interval-ms",
"\(pollIntervalMs)",
])
}
if mode == .auto {
arguments.append(contentsOf: [
"--idle-timeout-seconds",
String(format: "%.3f", idleTimeoutSeconds),
])
}
return arguments
}
static func migratedDaemonArguments(
socketPath: String,
status: PeekabooDaemonStatus,
fallbackIdleTimeoutSeconds: TimeInterval
) -> [String]? {
guard let mode = DaemonControlClient.migrationMode(for: status) else { return nil }
let idleTimeoutSeconds = status.activity?.idleTimeoutSeconds.flatMap { $0 > 0 ? $0 : nil }
?? fallbackIdleTimeoutSeconds
return self.daemonArguments(
socketPath: socketPath,
mode: mode,
pollIntervalMs: status.windowTracker?.cgPollIntervalMs,
idleTimeoutSeconds: idleTimeoutSeconds
)
}
static func startOnDemandDaemon(socketPath: String, environment: [String: String]) async -> String? {
let client = DaemonControlClient(socketPath: socketPath)
let lockHandle = DaemonPaths.openDaemonStartupLock()
if let fileDescriptor = lockHandle?.fileDescriptor {
flock(fileDescriptor, LOCK_EX)
}
defer {
if let fileDescriptor = lockHandle?.fileDescriptor {
flock(fileDescriptor, LOCK_UN)
}
try? lockHandle?.close()
}
if await client.fetchReusableDaemonStatus() != nil {
return socketPath
}
switch await self.waitForDaemonSocketAvailability(
socketPath: socketPath,
client: client,
timeout: TimeInterval(DaemonControlClient.defaultShutdownWaitSeconds)
) {
case .available:
break
case .reusableDaemon:
return socketPath
case .timedOut:
return nil
}
let fallbackIdleTimeoutSeconds = self.daemonIdleTimeoutSeconds(environment: environment)
var launchArguments = self.daemonArguments(
socketPath: socketPath,
mode: .auto,
idleTimeoutSeconds: fallbackIdleTimeoutSeconds
)
let legacyClient = DaemonControlClient(socketPath: PeekabooBridgeConstants.peekabooSocketPath)
if self.shouldMigrateLegacyDaemon(targetSocketPath: socketPath),
let legacyStatus = await legacyClient.fetchReusableDaemonStatus(),
let migrationArguments = migratedDaemonArguments(
socketPath: socketPath,
status: legacyStatus,
fallbackIdleTimeoutSeconds: fallbackIdleTimeoutSeconds
) {
if DaemonControlClient.supportsSafeMigration(legacyStatus),
DaemonControlClient.isIdleForMigration(legacyStatus) {
launchArguments = migrationArguments
guard let replacement = await launchDaemon(
socketPath: socketPath,
arguments: launchArguments
)
else {
return await self.compatibleLegacyFallbackSocketPath {
await legacyClient.fetchReusableDaemonStatus()
}
}
do {
let stopped = try await legacyClient.stopAndWait(
waitSeconds: DaemonControlClient.defaultShutdownWaitSeconds,
expectedPID: legacyStatus.pid,
requireIdentityMatch: true
)
if !stopped {
if let currentLegacyStatus = await legacyClient.fetchReusableDaemonStatus() {
return await self.resolveLegacyStopRace(
legacyStatus: currentLegacyStatus,
client: client,
replacement: replacement,
replacementSocketPath: socketPath
)
}
}
} catch {
if let currentLegacyStatus = await legacyClient.fetchReusableDaemonStatus() {
return await self.resolveLegacyStopRace(
legacyStatus: currentLegacyStatus,
client: client,
replacement: replacement,
replacementSocketPath: socketPath
)
}
}
return await client.fetchReusableDaemonStatus() != nil ? socketPath : nil
}
if let fallback = self.compatibleLegacyFallbackSocketPath(for: legacyStatus) {
return fallback
}
// An incompatible legacy host cannot satisfy this caller. Leave it running and
// start the current daemon on the free canonical socket instead.
launchArguments = self.daemonArguments(
socketPath: socketPath,
mode: .auto,
idleTimeoutSeconds: fallbackIdleTimeoutSeconds
)
}
return await self.launchDaemon(
socketPath: socketPath,
arguments: launchArguments
) != nil ? socketPath : nil
}
static func compatibleLegacyFallbackSocketPath(for status: PeekabooDaemonStatus) -> String? {
guard DaemonControlPlanner.supportsCurrentDaemon(status) else {
return nil
}
return PeekabooBridgeConstants.peekabooSocketPath
}
static func compatibleLegacyFallbackSocketPath(
refreshingWith fetchStatus: () async -> PeekabooDaemonStatus?
) async -> String? {
guard let currentStatus = await fetchStatus() else { return nil }
return self.compatibleLegacyFallbackSocketPath(for: currentStatus)
}
static func legacyStopRaceResolution(for status: PeekabooDaemonStatus) -> LegacyStopRaceResolution {
if let fallback = self.compatibleLegacyFallbackSocketPath(for: status) {
return .useLegacy(socketPath: fallback)
}
return .keepReplacement
}
static func legacyStopRaceSocketPath(
replacementCleanupSucceeded: Bool,
replacementIsReusable: Bool,
legacySocketPath: String,
replacementSocketPath: String
) -> String? {
if replacementCleanupSucceeded {
return legacySocketPath
}
return replacementIsReusable ? replacementSocketPath : nil
}
private static func resolveLegacyStopRace(
legacyStatus: PeekabooDaemonStatus,
client: DaemonControlClient,
replacement: LaunchResult,
replacementSocketPath: String
) async -> String? {
switch self.legacyStopRaceResolution(for: legacyStatus) {
case .keepReplacement:
return await client.fetchReusableDaemonStatus() != nil ? replacementSocketPath : nil
case let .useLegacy(socketPath):
let cleanedUp = await self.stopReplacement(client: client, replacement: replacement)
var replacementIsReusable = false
if !cleanedUp {
replacementIsReusable = await client.fetchReusableDaemonStatus() != nil
}
return self.legacyStopRaceSocketPath(
replacementCleanupSucceeded: cleanedUp,
replacementIsReusable: replacementIsReusable,
legacySocketPath: socketPath,
replacementSocketPath: replacementSocketPath
)
}
}
static func waitForDaemonSocketAvailability(
socketPath: String,
client: DaemonControlClient,
timeout: TimeInterval
) async -> SocketAvailability {
guard self.bridgeLeaseIsHeld(socketPath: socketPath) else {
return .available
}
let deadline = Date().addingTimeInterval(timeout)
while Date() < deadline {
if await client.fetchReusableDaemonStatus() != nil {
return .reusableDaemon
}
if !self.bridgeLeaseIsHeld(socketPath: socketPath) {
return .available
}
try? await Task.sleep(nanoseconds: 100_000_000)
}
return self.bridgeLeaseIsHeld(socketPath: socketPath) ? .timedOut : .available
}
private static func bridgeLeaseIsHeld(socketPath: String) -> Bool {
let fd = open(
"\(socketPath).lock",
O_RDWR | O_CLOEXEC | O_NOFOLLOW
)
guard fd >= 0 else { return false }
defer { close(fd) }
if flock(fd, LOCK_EX | LOCK_NB) == 0 {
flock(fd, LOCK_UN)
return false
}
return errno == EWOULDBLOCK || errno == EAGAIN
}
static func launchDaemon(
socketPath: String,
arguments: [String],
timeout: TimeInterval = 3
) async -> LaunchResult? {
let executable = CommandLine.arguments.first ?? "/usr/local/bin/peekaboo"
let process = Process()
process.executableURL = URL(fileURLWithPath: executable)
process.arguments = arguments
let logHandle = DaemonPaths.openDaemonLogForAppend() ?? FileHandle.nullDevice
process.standardOutput = logHandle
process.standardError = logHandle
process.standardInput = FileHandle.nullDevice
do {
try process.run()
} catch {
return nil
}
let deadline = Date().addingTimeInterval(timeout)
let client = DaemonControlClient(socketPath: socketPath)
while Date() < deadline {
if let status = await client.fetchReusableDaemonStatus() {
let processID = process.processIdentifier
if status.pid != processID, process.isRunning {
process.terminate()
}
return LaunchResult(status: status, processID: processID)
}
try? await Task.sleep(nanoseconds: 100_000_000)
}
if process.isRunning {
process.terminate()
}
return nil
}
static func stopReplacement(
client: DaemonControlClient,
replacement: LaunchResult
) async -> Bool {
guard replacement.ownsObservedDaemon else { return true }
let expectedPID = replacement.processID
let deadline = Date().addingTimeInterval(
TimeInterval(DaemonControlClient.defaultShutdownWaitSeconds)
)
while Date() < deadline {
guard let status = await client.fetchControllableDaemonStatus(),
status.pid == expectedPID
else {
return true
}
_ = try? await client.stopDaemon(expectedPID: expectedPID)
try? await Task.sleep(nanoseconds: 200_000_000)
}
return await client.fetchControllableDaemonStatus()?.pid != expectedPID
}
private static func standardizedSocketPath(_ path: String) -> String {
let expanded = (path as NSString).expandingTildeInPath
return (expanded as NSString).standardizingPath
}
}

View File

@ -1,468 +0,0 @@
import Darwin
import Foundation
import PeekabooAutomation
import PeekabooBridge
import PeekabooCore
@MainActor
enum RuntimeHostResolver {
struct Resolution {
let services: any PeekabooServiceProviding
let hostDescription: String
let selectedRemoteSocketPath: String?
let selectedRemoteHostProcessIdentifier: pid_t?
let snapshotInvalidationRemoteSocketPaths: [String]
let applicationRelaunchAllowed: Bool
}
struct ImplicitRemoteCandidate: Equatable {
let socketPath: String
let requireReusableDaemon: Bool
let requiredHostKind: PeekabooBridgeHostKind?
let requiresValidatedHistoricalDaemon: Bool
}
struct RemoteCandidatePlan {
let explicitSocket: String?
let daemonSocketPath: String
let runtimeBuildIdentity: String
let buildScopedDaemonSocketPath: String?
let historicalBuildScopedDaemonSocketPaths: [String]
let candidates: [ImplicitRemoteCandidate]
}
struct RemoteCandidateValidation {
let reusableDaemonStatus: PeekabooDaemonStatus?
}
enum InitialRoutingDecision: Equatable {
case local(snapshotInvalidationRemoteSocketPaths: [String])
case remote
}
static func resolveServices(options: CommandRuntimeOptions) async -> Resolution {
let environment = ProcessInfo.processInfo.environment
let configurationInput = PeekabooAutomation.ConfigurationManager.shared.getConfiguration()?.input
guard self.shouldResolveKnownRemoteEndpoints(
options: options,
environment: environment,
configurationInput: configurationInput
) else {
return Resolution(
services: RuntimeServiceFactory.makeLocalServices(options: options),
hostDescription: "local (in-process)",
selectedRemoteSocketPath: nil,
selectedRemoteHostProcessIdentifier: nil,
snapshotInvalidationRemoteSocketPaths: [],
applicationRelaunchAllowed: true
)
}
let candidatePlan = await self.remoteCandidatePlan(options: options, environment: environment)
let explicitSocket = candidatePlan.explicitSocket
let daemonSocketPath = candidatePlan.daemonSocketPath
let runtimeBuildIdentity = candidatePlan.runtimeBuildIdentity
let buildScopedDaemonSocketPath = candidatePlan.buildScopedDaemonSocketPath
let historicalBuildScopedDaemonSocketPaths = candidatePlan.historicalBuildScopedDaemonSocketPaths
let snapshotInvalidationRemoteSocketPaths = snapshotInvalidationRemoteSocketPaths(
explicitSocket: explicitSocket,
daemonSocketPath: daemonSocketPath,
buildScopedDaemonSocketPath: buildScopedDaemonSocketPath,
historicalBuildScopedDaemonSocketPaths: historicalBuildScopedDaemonSocketPaths
)
if case let .local(localSnapshotInvalidationPaths) = initialRoutingDecision(
options: options,
environment: environment,
configurationInput: configurationInput,
knownSnapshotInvalidationRemoteSocketPaths: snapshotInvalidationRemoteSocketPaths
) {
return Resolution(
services: RuntimeServiceFactory.makeLocalServices(options: options),
hostDescription: "local (in-process)",
selectedRemoteSocketPath: nil,
selectedRemoteHostProcessIdentifier: nil,
snapshotInvalidationRemoteSocketPaths: localSnapshotInvalidationPaths,
applicationRelaunchAllowed: true
)
}
let identity = PeekabooBridgeClientIdentity(
bundleIdentifier: Bundle.main.bundleIdentifier,
teamIdentifier: nil,
processIdentifier: getpid(),
hostname: Host.current().name
)
if let resolved = await resolveRemoteServices(
candidates: candidatePlan.candidates,
identity: identity,
options: options,
snapshotInvalidationRemoteSocketPaths: snapshotInvalidationRemoteSocketPaths
) {
return resolved
}
if DaemonLaunchPolicy.shouldAutoStartDaemon(options: options, environment: environment) {
let rejectedDefaultSocketOccupant =
await DaemonControlClient(socketPath: daemonSocketPath).fetchStatus() != nil
let autoStartSocketPath = DaemonLaunchPolicy.autoStartSocketPath(
daemonSocketPath: daemonSocketPath,
defaultSocketWasOccupiedAndRejected: rejectedDefaultSocketOccupant,
runtimeBuildIdentity: runtimeBuildIdentity
)
if let resolvedDaemonSocket = await DaemonLaunchPolicy.startOnDemandDaemon(
socketPath: autoStartSocketPath,
environment: environment
),
let resolved = await resolveRemoteServices(
candidates: [ImplicitRemoteCandidate(
socketPath: resolvedDaemonSocket,
requireReusableDaemon: true,
requiredHostKind: nil,
requiresValidatedHistoricalDaemon: false
)],
identity: identity,
options: options,
snapshotInvalidationRemoteSocketPaths: snapshotInvalidationRemoteSocketPaths
) {
return resolved
}
}
return Resolution(
services: RuntimeServiceFactory.makeLocalServices(options: options),
hostDescription: "local (in-process fallback)",
selectedRemoteSocketPath: nil,
selectedRemoteHostProcessIdentifier: nil,
snapshotInvalidationRemoteSocketPaths: snapshotInvalidationRemoteSocketPaths,
applicationRelaunchAllowed: !options.requiresApplicationRelaunch
)
}
static func remoteRoutingAllowed(
options: CommandRuntimeOptions,
environment: [String: String],
configurationInput: PeekabooAutomation.Configuration.InputConfig?
) -> Bool {
self.initialRoutingDecision(
options: options,
environment: environment,
configurationInput: configurationInput,
knownSnapshotInvalidationRemoteSocketPaths: []
) == .remote
}
static func remoteCandidatePlan(
options: CommandRuntimeOptions,
environment: [String: String]
) async -> RemoteCandidatePlan {
let explicitSocket = BridgeSocketResolver.explicitBridgeSocket(options: options, environment: environment)
let daemonSocketPath = DaemonLaunchPolicy.daemonSocketPath(environment: environment)
let runtimeBuildIdentity = DaemonLaunchPolicy.runtimeBuildIdentity()
let buildScopedDaemonSocketPath = DaemonLaunchPolicy.buildScopedDaemonSocketPath(
daemonSocketPath: daemonSocketPath,
runtimeBuildIdentity: runtimeBuildIdentity
)
let historicalBuildScopedDaemonSocketPaths: [String] = if self.shouldDiscoverHistoricalDaemons(
explicitSocket: explicitSocket,
daemonSocketPath: daemonSocketPath
) {
await DaemonControlResolver.validatedHistoricalTargets(
daemonSocketPath: daemonSocketPath,
currentBuildScopedSocketPath: buildScopedDaemonSocketPath
)
.filter { DaemonControlPlanner.supportsCurrentDaemon($0.status) }
.map(\.client.socketPath)
} else {
[]
}
let candidates: [ImplicitRemoteCandidate] = if let explicitSocket, !explicitSocket.isEmpty {
[ImplicitRemoteCandidate(
socketPath: explicitSocket,
requireReusableDaemon: false,
requiredHostKind: nil,
requiresValidatedHistoricalDaemon: false
)]
} else {
self.implicitRemoteCandidates(
options: options,
daemonSocketPath: daemonSocketPath,
buildScopedDaemonSocketPath: buildScopedDaemonSocketPath,
historicalBuildScopedDaemonSocketPaths: historicalBuildScopedDaemonSocketPaths
)
}
return RemoteCandidatePlan(
explicitSocket: explicitSocket,
daemonSocketPath: daemonSocketPath,
runtimeBuildIdentity: runtimeBuildIdentity,
buildScopedDaemonSocketPath: buildScopedDaemonSocketPath,
historicalBuildScopedDaemonSocketPaths: historicalBuildScopedDaemonSocketPaths,
candidates: candidates
)
}
static func initialRoutingDecision(
options: CommandRuntimeOptions,
environment: [String: String],
configurationInput: PeekabooAutomation.Configuration.InputConfig?,
knownSnapshotInvalidationRemoteSocketPaths: [String]
) -> InitialRoutingDecision {
guard !self.remoteIsolationRequested(options: options, environment: environment) else {
return .local(snapshotInvalidationRemoteSocketPaths: [])
}
if self.inputPolicyRequiresLocal(
options: options,
environment: environment,
configurationInput: configurationInput
) {
return .local(
snapshotInvalidationRemoteSocketPaths: knownSnapshotInvalidationRemoteSocketPaths
)
}
if !options.preferRemote,
options.requiresImplicitSnapshotInvalidation || options.usesPerToolSnapshotInvalidation {
return .local(
snapshotInvalidationRemoteSocketPaths: knownSnapshotInvalidationRemoteSocketPaths
)
}
guard options.preferRemote else {
return .local(snapshotInvalidationRemoteSocketPaths: [])
}
return .remote
}
static func shouldResolveKnownRemoteEndpoints(
options: CommandRuntimeOptions,
environment: [String: String],
configurationInput: PeekabooAutomation.Configuration.InputConfig?
) -> Bool {
guard !self.remoteIsolationRequested(options: options, environment: environment) else {
return false
}
return options.preferRemote ||
options.requiresImplicitSnapshotInvalidation ||
options.usesPerToolSnapshotInvalidation ||
self.inputPolicyRequiresLocal(
options: options,
environment: environment,
configurationInput: configurationInput
)
}
static func remoteIsolationRequested(
options: CommandRuntimeOptions,
environment: [String: String]
) -> Bool {
options.remoteIsolationRequested || environment["PEEKABOO_NO_REMOTE"] != nil
}
static func snapshotInvalidationRemoteSocketPaths(
explicitSocket: String?,
daemonSocketPath: String,
buildScopedDaemonSocketPath: String? = nil,
historicalBuildScopedDaemonSocketPaths: [String] = []
) -> [String] {
var seen = Set<String>()
var candidatePaths = [
explicitSocket,
PeekabooBridgeConstants.peekabooSocketPath,
daemonSocketPath,
buildScopedDaemonSocketPath,
]
.compactMap(\.self)
candidatePaths.append(contentsOf: historicalBuildScopedDaemonSocketPaths)
return candidatePaths
.map { NSString(string: $0).standardizingPath }
.filter { !$0.isEmpty && seen.insert($0).inserted }
}
static func shouldDiscoverHistoricalDaemons(
explicitSocket: String?,
daemonSocketPath: String
) -> Bool {
explicitSocket == nil && DaemonLaunchPolicy.shouldMigrateLegacyDaemon(targetSocketPath: daemonSocketPath)
}
static func inputPolicyRequiresLocal(
options: CommandRuntimeOptions,
environment: [String: String],
configurationInput: PeekabooAutomation.Configuration.InputConfig?
) -> Bool {
guard !options.requiresApplicationLaunchOptions,
!options.requiresHostApplicationInventory
else {
return false
}
return options.inputStrategy != nil ||
RuntimeInputPolicyResolver.hasEnvironmentOverride(environment: environment) ||
RuntimeInputPolicyResolver.hasConfigOverride(input: configurationInput)
}
static func implicitRemoteCandidates(
options: CommandRuntimeOptions,
daemonSocketPath: String,
buildScopedDaemonSocketPath: String? = nil,
historicalBuildScopedDaemonSocketPaths: [String] = []
) -> [ImplicitRemoteCandidate] {
var seenDaemonPaths = Set<String>()
var daemons: [ImplicitRemoteCandidate] = []
for socketPath in [daemonSocketPath, buildScopedDaemonSocketPath].compactMap(\.self) {
guard seenDaemonPaths.insert(NSString(string: socketPath).standardizingPath).inserted else { continue }
daemons.append(ImplicitRemoteCandidate(
socketPath: socketPath,
requireReusableDaemon: true,
requiredHostKind: nil,
requiresValidatedHistoricalDaemon: false
))
}
for socketPath in historicalBuildScopedDaemonSocketPaths {
guard seenDaemonPaths.insert(NSString(string: socketPath).standardizingPath).inserted else { continue }
daemons.append(ImplicitRemoteCandidate(
socketPath: socketPath,
requireReusableDaemon: true,
requiredHostKind: .onDemand,
requiresValidatedHistoricalDaemon: true
))
}
let gui = ImplicitRemoteCandidate(
socketPath: PeekabooBridgeConstants.peekabooSocketPath,
requireReusableDaemon: false,
requiredHostKind: .gui,
requiresValidatedHistoricalDaemon: false
)
if options.requiresApplicationRelaunch || options.requiresSurvivingApplicationHost {
return daemons
}
if options.requiresApplicationLaunchOptions || options.requiresHostApplicationInventory {
return [gui] + daemons
}
if DaemonLaunchPolicy.shouldMigrateLegacyDaemon(targetSocketPath: daemonSocketPath) {
return daemons + [gui]
}
return daemons
}
private static func resolveRemoteServices(
candidates: [ImplicitRemoteCandidate],
identity: PeekabooBridgeClientIdentity,
options: CommandRuntimeOptions,
snapshotInvalidationRemoteSocketPaths: [String]
)
async -> Resolution? {
for candidate in candidates {
let socketPath = candidate.socketPath
let client = PeekabooBridgeClient(socketPath: socketPath)
do {
let handshake = try await client.handshake(client: identity, requestedHost: nil)
guard let validation = await self.validateRemoteCandidate(
candidate,
handshake: handshake,
options: options
) else { continue }
let reusableDaemonStatus = validation.reusableDaemonStatus
let targetedHotkeyAvailability = BridgeCapabilityPolicy.targetedHotkeyAvailability(for: handshake)
let targetedTypeAvailability = BridgeCapabilityPolicy.targetedTypeAvailability(for: handshake)
let targetedClickAvailability = BridgeCapabilityPolicy.targetedClickAvailability(for: handshake)
let hostDescription = "remote \(handshake.hostKind.rawValue) via \(socketPath)" +
(handshake.build.map { " (build \($0))" } ?? "")
return Resolution(
services: RemotePeekabooServices(
client: client,
supportsTargetedHotkeys: targetedHotkeyAvailability.isEnabled,
targetedHotkeyUnavailableReason: targetedHotkeyAvailability.unavailableReason,
targetedHotkeyRequiresEventSynthesizingPermission:
targetedHotkeyAvailability.missingPermissions.contains(.postEvent),
supportsTargetedTypeActions: targetedTypeAvailability.isEnabled,
targetedTypeUnavailableReason: targetedTypeAvailability.unavailableReason,
targetedTypeRequiresEventSynthesizingPermission:
targetedTypeAvailability.missingPermissions.contains(.postEvent),
supportsTargetedClicks: targetedClickAvailability.isEnabled,
targetedClickUnavailableReason: targetedClickAvailability.unavailableReason,
targetedClickRequiresEventSynthesizingPermission:
targetedClickAvailability.missingPermissions.contains(.postEvent),
supportsExactWindowTargetedClicks:
BridgeCapabilityPolicy.supportsExactWindowTargetedClicks(for: handshake),
supportsInspectAccessibilityTree: BridgeCapabilityPolicy.supportsInspectAccessibilityTree(
for: handshake
),
supportsPostEventPermissionRequest: BridgeCapabilityPolicy.supportsPostEventPermissionRequest(
for: handshake
),
supportsElementActions: BridgeCapabilityPolicy.supportsElementActions(for: handshake),
supportsDesktopObservation: BridgeCapabilityPolicy.supportsDesktopObservation(for: handshake),
supportsImplicitLatestSnapshotInvalidation:
BridgeCapabilityPolicy.supportsImplicitSnapshotInvalidation(for: handshake),
supportsApplicationLaunchOptions:
BridgeCapabilityPolicy.supportsApplicationLaunchOptions(for: handshake),
supportsApplicationRelaunch:
BridgeCapabilityPolicy.supportsApplicationRelaunch(for: handshake),
allowLocalApplicationFallback: handshake.hostKind == .onDemand,
desktopMutationWatermarkStore: DesktopMutationWatermarkStore()
),
hostDescription: hostDescription,
selectedRemoteSocketPath: NSString(string: socketPath).standardizingPath,
selectedRemoteHostProcessIdentifier: reusableDaemonStatus?.pid,
snapshotInvalidationRemoteSocketPaths: snapshotInvalidationRemoteSocketPaths,
applicationRelaunchAllowed: BridgeCapabilityPolicy.supportsApplicationRelaunch(for: handshake)
)
} catch {
continue
}
}
return nil
}
static func validateRemoteCandidate(
_ candidate: ImplicitRemoteCandidate,
handshake: PeekabooBridgeHandshakeResponse,
options: CommandRuntimeOptions,
fetchReusableDaemonStatus: (String) async -> PeekabooDaemonStatus? = { socketPath in
await DaemonControlClient(socketPath: socketPath).fetchReusableDaemonStatus()
}
) async -> RemoteCandidateValidation? {
guard candidate.requiredHostKind == nil || handshake.hostKind == candidate.requiredHostKind else {
return nil
}
guard BridgeCapabilityPolicy.supportsRemoteRequirements(for: handshake, options: options) else {
return nil
}
let requiresReusableHost = candidate.requireReusableDaemon ||
options.requiresApplicationRelaunch ||
options.requiresSurvivingApplicationHost
let reusableDaemonStatus: PeekabooDaemonStatus? = if requiresReusableHost {
await fetchReusableDaemonStatus(candidate.socketPath)
} else {
nil
}
guard !requiresReusableHost || reusableDaemonStatus != nil else { return nil }
if candidate.requiresValidatedHistoricalDaemon {
guard let reusableDaemonStatus,
DaemonControlResolver.isValidatedHistoricalTarget(
status: reusableDaemonStatus,
socketPath: candidate.socketPath
),
DaemonControlPlanner.supportsCurrentDaemon(reusableDaemonStatus)
else {
return nil
}
}
if options.requiresApplicationRelaunch || options.requiresSurvivingApplicationHost,
reusableDaemonStatus?.pid == nil {
return nil
}
return RemoteCandidateValidation(reusableDaemonStatus: reusableDaemonStatus)
}
}

View File

@ -1,48 +0,0 @@
import Foundation
import PeekabooAutomation
import PeekabooAutomationKit
enum RuntimeInputPolicyResolver {
static func hasEnvironmentOverride(environment: [String: String]) -> Bool {
[
"PEEKABOO_INPUT_STRATEGY",
"PEEKABOO_CLICK_INPUT_STRATEGY",
"PEEKABOO_SCROLL_INPUT_STRATEGY",
"PEEKABOO_TYPE_INPUT_STRATEGY",
"PEEKABOO_HOTKEY_INPUT_STRATEGY",
"PEEKABOO_SET_VALUE_INPUT_STRATEGY",
"PEEKABOO_PERFORM_ACTION_INPUT_STRATEGY",
].contains { key in
guard let value = environment[key] else {
return false
}
return UIInputStrategy(rawValue: value.trimmingCharacters(in: .whitespacesAndNewlines)) != nil
}
}
static func hasConfigOverride(input: PeekabooAutomation.Configuration.InputConfig?) -> Bool {
guard let input else {
return false
}
if input.defaultStrategy != nil ||
input.click != nil ||
input.scroll != nil ||
input.type != nil ||
input.hotkey != nil ||
input.setValue != nil ||
input.performAction != nil {
return true
}
return input.perApp?.values.contains { appInput in
appInput.defaultStrategy != nil ||
appInput.click != nil ||
appInput.scroll != nil ||
appInput.type != nil ||
appInput.hotkey != nil ||
appInput.setValue != nil ||
appInput.performAction != nil
} ?? false
}
}

View File

@ -1,16 +0,0 @@
import PeekabooAutomation
import PeekabooCore
@MainActor
enum RuntimeServiceFactory {
static func makeLocalServices(options: CommandRuntimeOptions) -> PeekabooServices {
PeekabooServices(
snapshotManager: SnapshotManager(
desktopMutationWatermarkStore: DesktopMutationWatermarkStore()
),
inputPolicy: PeekabooAutomation.ConfigurationManager.shared.getUIInputPolicy(
cliStrategy: options.inputStrategy
)
)
}
}

View File

@ -1,216 +0,0 @@
import Foundation
import PeekabooAutomation
import PeekabooBridge
import Security
struct BridgeDiagnostics {
private let logger: Logger
init(logger: Logger) {
self.logger = logger
}
@MainActor
func run(runtimeOptions: CommandRuntimeOptions) async -> BridgeStatusReport {
let environment = ProcessInfo.processInfo.environment
let effectiveOptions = runtimeOptions.applyingEnvironmentOverrides(environment: environment)
let configurationInput = PeekabooAutomation.ConfigurationManager.shared.getConfiguration()?.input
let remoteSkipReason = Self.remoteSkipReason(
runtimeOptions: effectiveOptions,
environment: environment,
configurationInput: configurationInput
)
let identity = PeekabooBridgeClientIdentity(
bundleIdentifier: Bundle.main.bundleIdentifier,
teamIdentifier: Self.currentTeamIdentifier(),
processIdentifier: getpid(),
hostname: Host.current().name
)
if let remoteSkipReason {
let candidates = Self.diagnosticSocketPaths(
runtimeOptions: effectiveOptions,
environment: environment
)
self.logger.debug("Bridge status: remote skipped (\(remoteSkipReason))")
return BridgeStatusReport(
remoteSkipped: true,
remoteSkipReason: remoteSkipReason,
selected: .local(),
candidates: candidates.map { BridgeCandidateReport(socketPath: $0, result: .skipped) },
client: .init(identity: identity)
)
}
let candidatePlan = await RuntimeHostResolver.remoteCandidatePlan(
options: effectiveOptions,
environment: environment
)
let runtimeCandidates = candidatePlan.candidates
let candidates = Self.diagnosticSocketPaths(
runtimeCandidateSocketPaths: runtimeCandidates.map(\.socketPath),
hasExplicitSocket: candidatePlan.explicitSocket != nil
)
var runtimeCandidateByPath: [String: RuntimeHostResolver.ImplicitRemoteCandidate] = [:]
for candidate in runtimeCandidates {
let path = NSString(string: candidate.socketPath).standardizingPath
if runtimeCandidateByPath[path] == nil {
runtimeCandidateByPath[path] = candidate
}
}
var results: [BridgeCandidateReport] = []
var selected: BridgeSelectionReport?
for socketPath in candidates {
let client = PeekabooBridgeClient(socketPath: socketPath)
do {
let handshake = try await client.handshake(client: identity, requestedHost: nil)
let report = BridgeHandshakeReport(from: handshake)
self.logger.debug(
"Bridge status: handshake OK \(handshake.hostKind.rawValue) via \(socketPath)",
category: "Bridge"
)
results.append(.init(socketPath: socketPath, result: .success(report)))
let candidatePath = NSString(string: socketPath).standardizingPath
if selected == nil,
let runtimeCandidate = runtimeCandidateByPath[candidatePath] {
let validation = await RuntimeHostResolver.validateRemoteCandidate(
runtimeCandidate,
handshake: handshake,
options: effectiveOptions
)
if validation != nil {
selected = .remote(socketPath: socketPath, handshake: report)
}
}
} catch let envelope as PeekabooBridgeErrorEnvelope {
self.logger.debug(
"Bridge status: handshake error \(envelope.code.rawValue) via \(socketPath): \(envelope.message)",
category: "Bridge"
)
results.append(.init(socketPath: socketPath, result: .failure(.bridgeEnvelope(envelope))))
} catch {
self.logger.debug(
"Bridge status: handshake error via \(socketPath): \(String(describing: error))",
category: "Bridge"
)
results.append(.init(socketPath: socketPath, result: .failure(.other(error))))
}
}
return BridgeStatusReport(
remoteSkipped: false,
remoteSkipReason: nil,
selected: selected ?? .local(),
candidates: results,
client: .init(identity: identity)
)
}
static func remoteSkipReason(
runtimeOptions: CommandRuntimeOptions,
environment: [String: String],
configurationInput: PeekabooAutomation.Configuration.InputConfig?
) -> String? {
let decision = RuntimeHostResolver.initialRoutingDecision(
options: runtimeOptions,
environment: environment,
configurationInput: configurationInput,
knownSnapshotInvalidationRemoteSocketPaths: []
)
guard case .local = decision else { return nil }
if environment["PEEKABOO_NO_REMOTE"] != nil {
return "PEEKABOO_NO_REMOTE"
}
if runtimeOptions.remoteIsolationRequested {
return "--no-remote"
}
if RuntimeHostResolver.inputPolicyRequiresLocal(
options: runtimeOptions,
environment: environment,
configurationInput: configurationInput
) {
return "input strategy policy"
}
return "local runtime policy"
}
static func runtimeCandidateSocketPaths(
runtimeOptions: CommandRuntimeOptions,
environment: [String: String],
historicalBuildScopedDaemonSocketPaths: [String] = []
) -> [String] {
if let explicitPath = BridgeSocketResolver.explicitBridgeSocket(
options: runtimeOptions,
environment: environment
) {
return [explicitPath]
}
let daemonPath = DaemonLaunchPolicy.daemonSocketPath(environment: environment)
let buildScopedPath = DaemonLaunchPolicy.buildScopedDaemonSocketPath(
daemonSocketPath: daemonPath,
runtimeBuildIdentity: DaemonLaunchPolicy.runtimeBuildIdentity()
)
return RuntimeHostResolver.implicitRemoteCandidates(
options: runtimeOptions,
daemonSocketPath: daemonPath,
buildScopedDaemonSocketPath: buildScopedPath,
historicalBuildScopedDaemonSocketPaths: historicalBuildScopedDaemonSocketPaths
).map(\.socketPath)
}
static func diagnosticSocketPaths(
runtimeOptions: CommandRuntimeOptions,
environment: [String: String],
historicalBuildScopedDaemonSocketPaths: [String] = []
) -> [String] {
let runtimePaths = self.runtimeCandidateSocketPaths(
runtimeOptions: runtimeOptions,
environment: environment,
historicalBuildScopedDaemonSocketPaths: historicalBuildScopedDaemonSocketPaths
)
return self.diagnosticSocketPaths(
runtimeCandidateSocketPaths: runtimePaths,
hasExplicitSocket: BridgeSocketResolver.explicitBridgeSocket(
options: runtimeOptions,
environment: environment
) != nil
)
}
private static func diagnosticSocketPaths(
runtimeCandidateSocketPaths runtimePaths: [String],
hasExplicitSocket: Bool
) -> [String] {
if hasExplicitSocket { return runtimePaths }
let additionalPaths = [
PeekabooBridgeConstants.peekabooSocketPath,
PeekabooBridgeConstants.claudeSocketPath,
PeekabooBridgeConstants.clawdbotSocketPath,
]
return runtimePaths + additionalPaths.filter { !runtimePaths.contains($0) }
}
private static func currentTeamIdentifier() -> String? {
var code: SecCode?
guard SecCodeCopySelf([], &code) == errSecSuccess, let code else { return nil }
var staticCode: SecStaticCode?
guard SecCodeCopyStaticCode(code, SecCSFlags(), &staticCode) == errSecSuccess,
let sCode = staticCode
else { return nil }
var infoCF: CFDictionary?
let flags = SecCSFlags(rawValue: UInt32(kSecCSSigningInformation))
guard SecCodeCopySigningInformation(sCode, flags, &infoCF) == errSecSuccess,
let info = infoCF as? [String: Any]
else { return nil }
return info[kSecCodeInfoTeamIdentifier as String] as? String
}
}

View File

@ -1,193 +0,0 @@
import Foundation
import PeekabooBridge
import PeekabooCore
struct BridgeStatusReport: Codable {
let remoteSkipped: Bool
let remoteSkipReason: String?
let selected: BridgeSelectionReport
let candidates: [BridgeCandidateReport]
let client: BridgeClientReport
var bridgeScreenRecordingHint: String? {
guard let candidate = self.candidates.first(where: { $0.screenRecordingDenied }) else { return nil }
let hostKind = candidate.hostKind ?? "Bridge host"
return "Hint: \(hostKind) at \(candidate.socketPath) does not have Screen Recording. Grant it to " +
"the host app, or run capture commands with --no-remote --capture-engine cg when the caller " +
"process already has permission."
}
}
struct BridgeClientReport: Codable {
let bundleIdentifier: String?
let teamIdentifier: String?
let processIdentifier: pid_t
let hostname: String?
init(identity: PeekabooBridgeClientIdentity) {
self.bundleIdentifier = identity.bundleIdentifier
self.teamIdentifier = identity.teamIdentifier
self.processIdentifier = identity.processIdentifier
self.hostname = identity.hostname
}
var humanSummary: String {
let bundle = self.bundleIdentifier ?? "<unknown bundle>"
let team = self.teamIdentifier ?? "<unsigned>"
return "pid=\(self.processIdentifier) bundle=\(bundle) team=\(team)"
}
}
struct BridgeCandidateReport: Codable {
let socketPath: String
let result: BridgeCandidateResult
var hostKind: String? {
if case let .success(handshake) = self.result {
return handshake.hostKind.rawValue
}
return nil
}
var screenRecordingDenied: Bool {
if case let .success(handshake) = self.result {
return handshake.permissions?.screenRecording == false
}
return false
}
var humanSummary: String {
switch self.result {
case .skipped:
return "\(self.socketPath) — skipped"
case let .success(handshake):
let enabled = handshake.enabledOperations?.count
let supported = handshake.supportedOperations.count
let opsSummary = if let enabled {
"ops: \(enabled)/\(supported) enabled"
} else {
"ops: \(supported)"
}
let permissionsSummary = handshake.permissions.map { status in
let sr = status.screenRecording ? "Y" : "N"
let ax = status.accessibility ? "Y" : "N"
let appleScript = status.appleScript ? "Y" : "N"
let eventSynthesizing = status.postEvent ? "Y" : "N"
return "perm: SR=\(sr) AX=\(ax) AS=\(appleScript) ES=\(eventSynthesizing)"
}
if let permissionsSummary {
return "\(self.socketPath) — OK (\(handshake.hostKind.rawValue), \(opsSummary), \(permissionsSummary))"
}
return "\(self.socketPath) — OK (\(handshake.hostKind.rawValue), \(opsSummary))"
case let .failure(error):
return "\(self.socketPath)\(error.humanSummary)"
}
}
}
enum BridgeCandidateResult: Codable {
case skipped
case success(BridgeHandshakeReport)
case failure(BridgeCandidateErrorReport)
}
struct BridgeHandshakeReport: Codable {
let negotiatedVersion: PeekabooBridgeProtocolVersion
let hostKind: PeekabooBridgeHostKind
let build: String?
let supportedOperations: [PeekabooBridgeOperation]
let permissions: PermissionsStatus?
let enabledOperations: [PeekabooBridgeOperation]?
let permissionTags: [String: [PeekabooBridgePermissionKind]]
init(from handshake: PeekabooBridgeHandshakeResponse) {
self.negotiatedVersion = handshake.negotiatedVersion
self.hostKind = handshake.hostKind
self.build = handshake.build
self.supportedOperations = handshake.supportedOperations
self.permissions = handshake.permissions
self.enabledOperations = handshake.enabledOperations
self.permissionTags = handshake.permissionTags
}
}
struct BridgeCandidateErrorReport: Codable {
let kind: String
let code: String?
let message: String
let details: String?
let hint: String?
static func bridgeEnvelope(_ envelope: PeekabooBridgeErrorEnvelope) -> BridgeCandidateErrorReport {
let hint: String? = switch envelope.code {
case .unauthorizedClient:
"Client not signed by an allowed TeamID. For local dev, set " +
"PEEKABOO_ALLOW_UNSIGNED_SOCKET_CLIENTS=1 in the host."
case .decodingFailed:
"Host returned a non-Bridge response. This commonly means you hit a different socket protocol " +
"or the host closed early due to code-sign checks."
case .internalError:
"Host closed the connection without a valid response. This commonly indicates code-sign checks " +
"or a mismatched Bridge protocol."
default:
nil
}
return BridgeCandidateErrorReport(
kind: "bridge",
code: envelope.code.rawValue,
message: envelope.message,
details: envelope.details,
hint: hint
)
}
static func other(_ error: any Error) -> BridgeCandidateErrorReport {
BridgeCandidateErrorReport(
kind: "system",
code: nil,
message: error.localizedDescription,
details: String(describing: error),
hint: nil
)
}
var humanSummary: String {
if let code {
return "\(code): \(self.message)"
}
return self.message
}
}
struct BridgeSelectionReport: Codable {
enum Source: String, Codable {
case remote
case local
}
let source: Source
let socketPath: String?
let handshake: BridgeHandshakeReport?
static func local() -> BridgeSelectionReport {
BridgeSelectionReport(source: .local, socketPath: nil, handshake: nil)
}
static func remote(socketPath: String, handshake: BridgeHandshakeReport) -> BridgeSelectionReport {
BridgeSelectionReport(source: .remote, socketPath: socketPath, handshake: handshake)
}
var humanSummary: String {
switch self.source {
case .local:
return "local (in-process)"
case .remote:
let kind = self.handshake?.hostKind.rawValue ?? "remote"
let buildSuffix = self.handshake?.build.map { " (build \($0))" } ?? ""
if let socketPath {
return "remote \(kind) via \(socketPath)\(buildSuffix)"
}
return "remote \(kind)\(buildSuffix)"
}
}
}

View File

@ -1,137 +0,0 @@
import Commander
/// Diagnose Peekaboo Bridge host connectivity and resolution.
struct BridgeCommand: ParsableCommand {
static let commandDescription = CommandDescription(
commandName: "bridge",
abstract: "Inspect Peekaboo Bridge host connectivity",
discussion: """
Peekaboo Bridge lets the CLI run permission-bound operations (Screen Recording, Accessibility,
AppleScript) via a host app that already has the needed TCC grants.
By default, automation commands use the dedicated Peekaboo daemon and fall back to local execution.
Peekaboo.app, Claude.app, and ClawdBot.app sockets are shown for diagnostics and can be selected explicitly.
Examples:
peekaboo bridge status
peekaboo bridge status --json
peekaboo bridge status --verbose
peekaboo bridge status --bridge-socket ~/Library/Application\\ Support/clawdbot/bridge.sock
peekaboo bridge status --no-remote
""",
subcommands: [
StatusSubcommand.self
],
defaultSubcommand: StatusSubcommand.self,
showHelpOnEmptyInvocation: true
)
}
extension BridgeCommand {
@MainActor
struct StatusSubcommand: OutputFormattable, RuntimeOptionsConfigurable {
nonisolated(unsafe) static var commandDescription: CommandDescription {
MainActorCommandDescription.describe {
CommandDescription(
commandName: "status",
abstract: "Report which Bridge host would be used"
)
}
}
@RuntimeStorage private var runtime: CommandRuntime?
var runtimeOptions = CommandRuntimeOptions()
private var resolvedRuntime: CommandRuntime {
guard let runtime else {
preconditionFailure("CommandRuntime must be configured before accessing runtime resources")
}
return runtime
}
private var configuration: CommandRuntime.Configuration {
if let runtime {
return runtime.configuration
}
return self.runtimeOptions.makeConfiguration()
}
private var logger: Logger {
self.resolvedRuntime.logger
}
var outputLogger: Logger {
self.logger
}
var jsonOutput: Bool {
self.configuration.jsonOutput
}
private var verbose: Bool {
self.configuration.verbose
}
@MainActor
mutating func run(using runtime: CommandRuntime) async throws {
self.runtime = runtime
self.logger.setJsonOutputMode(self.jsonOutput)
let report = await BridgeDiagnostics(logger: self.logger).run(runtimeOptions: self.runtimeOptions)
if self.jsonOutput {
outputSuccessCodable(data: report, logger: self.outputLogger)
return
}
self.printHumanReadable(report: report)
}
private func printHumanReadable(report: BridgeStatusReport) {
print("Peekaboo Bridge")
print("===============")
print("")
print("Selected: \(report.selected.humanSummary)")
if let hint = report.bridgeScreenRecordingHint {
print("")
print(hint)
}
if report.remoteSkipped {
print("Remote: skipped (\(report.remoteSkipReason ?? "disabled"))")
return
}
guard self.verbose else {
if report.selected.source == .local {
print("")
print("Tip: run with --verbose to see remote host probe results.")
}
return
}
print("")
print("Client: \(report.client.humanSummary)")
if report.client.teamIdentifier == nil {
print("Note: unsigned clients may be rejected by host code-sign checks.")
}
print("")
print("Candidates:")
for candidate in report.candidates {
print("- \(candidate.humanSummary)")
if case let .failure(error) = candidate.result, let hint = error.hint {
print(" hint: \(hint)")
}
}
}
}
}
extension BridgeCommand.StatusSubcommand: AsyncRuntimeCommand {}
@MainActor
extension BridgeCommand.StatusSubcommand: CommanderBindableCommand {
mutating func applyCommanderValues(_: CommanderBindableValues) throws {
// No command-specific flags; runtime flags are bound via RuntimeOptionsConfigurable.
}
}

Some files were not shown because too many files have changed in this diff Show More