Compare commits


64 Commits
v0.2.0 ... main

Author SHA1 Message Date
Peter Steinberger
469d89bc1a
chore: prepare gitcrawl 0.3.1
Some checks failed
CI / Go / ${{ matrix.os }} (macos-latest) (push) Has been cancelled
CI / Go / ${{ matrix.os }} (ubuntu-latest) (push) Has been cancelled
Pages / Deploy docs (push) Has been cancelled
Security Gate: Secret Scanning / Scan for Verified Secrets (push) Has been cancelled
2026-05-08 09:56:02 +01:00
Peter Steinberger
a94a53217d
docs: update gitcrawl changelog and command docs 2026-05-08 09:50:20 +01:00
Peter Steinberger
7671a6b999
fix: harden gitcrawl command surface 2026-05-08 09:50:17 +01:00
Peter Steinberger
f2d60276f9
feat: prepare gitcrawl 0.3.0 2026-05-08 06:20:35 +01:00
Peter Steinberger
a1be2e57c5
docs: clarify gitcrawl skill paths 2026-05-08 01:13:01 +01:00
Vincent Koc
01d62c1afc
docs: note dependency updates
Some checks are pending
CI / Go / ${{ matrix.os }} (macos-latest) (push) Waiting to run
CI / Go / ${{ matrix.os }} (ubuntu-latest) (push) Waiting to run
Security Gate: Secret Scanning / Scan for Verified Secrets (push) Waiting to run
2026-05-07 02:52:17 -07:00
dependabot[bot]
fc7001e21e
chore(deps): bump goreleaser/goreleaser-action from 7.1.0 to 7.2.1 (#11)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 7.1.0 to 7.2.1.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](https://github.com/goreleaser/goreleaser-action/compare/v7.1.0...v7.2.1)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-version: 7.2.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-07 02:41:45 -07:00
Peter Steinberger
025e92b858
ci: update homebrew tap on release
Some checks failed
CI / Go / ${{ matrix.os }} (macos-latest) (push) Waiting to run
CI / Go / ${{ matrix.os }} (ubuntu-latest) (push) Waiting to run
Security Gate: Secret Scanning / Scan for Verified Secrets (push) Waiting to run
Pages / Deploy docs (push) Has been cancelled
2026-05-07 03:56:51 +01:00
Vincent Koc
eafeabf8fd
build(deps): bump crawlkit to v0.4.1 (#13) 2026-05-06 14:52:53 -07:00
Vincent Koc
fdc3f7473e
fix(docs): avoid regex tag stripping in toc (#12) 2026-05-06 02:10:03 -07:00
Vincent Koc
858f824719
Merge pull request #10 from openclaw/ci-security-baseline
chore(ci): add crawl security baseline
2026-05-06 01:55:22 -07:00
Vincent Koc
2a011cfef3
docs: document SQL archive queries 2026-05-06 01:54:34 -07:00
Vincent Koc
71d32d8ef2
docs: update changelog for agent skill 2026-05-06 01:38:43 -07:00
Vincent Koc
a4ab91b035
chore(security): add verified secret scanning 2026-05-06 01:37:04 -07:00
Vincent Koc
f205d3abe4
chore: add Go repository hygiene files 2026-05-06 01:37:03 -07:00
Vincent Koc
86f67bea8b
docs: add gitcrawl agent skill 2026-05-06 01:29:00 -07:00
Vincent Koc
ad2a4344a6
chore(ci): rely on CodeQL default setup 2026-05-06 00:42:35 -07:00
Vincent Koc
94a25db94a
chore(ci): add stale issue automation 2026-05-06 00:30:16 -07:00
Vincent Koc
43d9491b81
chore(ci): add CodeQL analysis 2026-05-06 00:30:14 -07:00
Vincent Koc
c35210ad31
chore(security): add protected automation owners 2026-05-06 00:30:13 -07:00
Vincent Koc
bed1da5471
docs: document crawlkit control surface 2026-05-05 19:16:51 -07:00
Vincent Koc
ec7a91465c
test(ci): cover crawlkit control commands 2026-05-05 18:48:48 -07:00
Vincent Koc
1ca61691c0
merge: use crawlkit infrastructure
* feat/use-crawlkit: (33 commits)
  fix(tui): allow empty json smoke
  chore(deps): use crawlkit v0.4.0
  fix(tui): use compact-pane crawlkit
  fix(tui): pick up crawlkit renderer
  fix(sync): log thread progress percentages
  chore(deps): bump crawlkit to v0.3.13
  chore(deps): bump crawlkit to v0.3.12
  chore(deps): update crawlkit to v0.3.11
  chore(deps): tidy crawlkit checksums
  chore(deps): update crawlkit to v0.3.10
  chore(deps): tidy crawlkit checksum
  chore(deps): update crawlkit to v0.3.9
  chore(deps): update crawlkit to v0.3.8
  docs(changelog): note TUI alignment
  chore(deps): update crawlkit to v0.3.7
  chore(deps): update crawlkit to v0.3.6
  chore(deps): update crawlkit to v0.3.5
  fix(tui): use crawlkit empty-json fix
  fix(tui): use crawlkit safe renderer
  fix(cli): document portable help
  ...
2026-05-05 18:20:49 -07:00
Vincent Koc
7342912545
fix(tui): allow empty json smoke 2026-05-05 18:16:02 -07:00
Vincent Koc
b7176c3569
chore(deps): use crawlkit v0.4.0 2026-05-05 18:16:02 -07:00
Vincent Koc
b78370f2ba
fix(tui): use compact-pane crawlkit 2026-05-05 18:16:02 -07:00
Vincent Koc
11455a6a17
fix(tui): pick up crawlkit renderer 2026-05-05 18:16:01 -07:00
Vincent Koc
5d8b59c79b
fix(sync): log thread progress percentages 2026-05-05 18:16:00 -07:00
Vincent Koc
ec8de7a53d
chore(deps): bump crawlkit to v0.3.13 2026-05-05 18:13:56 -07:00
Vincent Koc
32f4a13a8e
chore(deps): bump crawlkit to v0.3.12 2026-05-05 18:13:56 -07:00
Vincent Koc
c39fad757f
chore(deps): update crawlkit to v0.3.11 2026-05-05 18:13:55 -07:00
Vincent Koc
990d2616d6
chore(deps): tidy crawlkit checksums 2026-05-05 18:13:55 -07:00
Vincent Koc
ef89a1d876
chore(deps): update crawlkit to v0.3.10 2026-05-05 18:13:55 -07:00
Vincent Koc
b68852bde2
chore(deps): tidy crawlkit checksum 2026-05-05 18:13:55 -07:00
Vincent Koc
8c460f1a34
chore(deps): update crawlkit to v0.3.9 2026-05-05 18:13:55 -07:00
Vincent Koc
be832aa57c
chore(deps): update crawlkit to v0.3.8 2026-05-05 18:13:54 -07:00
Vincent Koc
87616b1860
docs(changelog): note TUI alignment 2026-05-05 18:13:54 -07:00
Vincent Koc
360037b3ad
chore(deps): update crawlkit to v0.3.7 2026-05-05 18:13:54 -07:00
Vincent Koc
73de21871d
chore(deps): update crawlkit to v0.3.6 2026-05-05 18:13:54 -07:00
Vincent Koc
b543bdc172
chore(deps): update crawlkit to v0.3.5 2026-05-05 18:13:54 -07:00
Vincent Koc
6b3032649b
fix(tui): use crawlkit empty-json fix 2026-05-05 18:13:53 -07:00
Vincent Koc
75215c9389
fix(tui): use crawlkit safe renderer 2026-05-05 18:13:53 -07:00
Vincent Koc
7a9bac31b5
fix(cli): document portable help 2026-05-05 18:13:53 -07:00
Vincent Koc
bf21271477
chore(deps): tidy crawlkit module sums 2026-05-05 18:13:52 -07:00
Vincent Koc
3bfaef0761
ci: smoke crawlkit control surface 2026-05-05 18:13:52 -07:00
Vincent Koc
e13976fbea
feat(cli): add crawlkit control surface 2026-05-05 18:13:52 -07:00
Vincent Koc
92c839cae2
chore: bump crawlkit to v0.3.1 2026-05-05 18:13:52 -07:00
Vincent Koc
77f981725e
chore: tidy crawlkit module sums 2026-05-05 18:13:52 -07:00
Vincent Koc
863d8599e5
refactor: use crawlkit package nouns 2026-05-05 18:13:52 -07:00
Vincent Koc
8cd92156f8
chore: use crawlkit v0.2.0 2026-05-05 18:13:52 -07:00
Vincent Koc
90c90204c1
docs(tui): mark gitcrawl as browser reference 2026-05-05 18:13:51 -07:00
Vincent Koc
af0ea88c98
chore: use crawlkit v0.1.1 2026-05-05 18:13:51 -07:00
Vincent Koc
afed848dc7
chore: use crawlkit v0.1.0 2026-05-05 18:13:51 -07:00
Vincent Koc
511603d0b1
refactor(store): use crawlkit sqlite openers 2026-05-05 18:13:51 -07:00
Vincent Koc
c352cb4e6a
refactor(config): route paths through crawlkit 2026-05-05 18:13:51 -07:00
Vincent Koc
47cc722d33
chore: add crawlkit module dependency 2026-05-05 18:13:51 -07:00
Peter Steinberger
3e43d1a5d5
docs: style homepage action links 2026-05-06 00:20:13 +01:00
Peter Steinberger
d91eec3973
docs: add syntax highlighting 2026-05-05 23:43:45 +01:00
Peter Steinberger
54f7107df9
test: enforce 85 percent coverage gate 2026-05-05 22:00:07 +01:00
Peter Steinberger
e5621d1b78
feat: improve gh shim cache observability 2026-05-05 21:23:39 +01:00
Peter Steinberger
5e441a9e48
docs: open 0.3.0 changelog
Some checks failed
CI / Go / ${{ matrix.os }} (macos-latest) (push) Has been cancelled
CI / Go / ${{ matrix.os }} (ubuntu-latest) (push) Has been cancelled
Pages / Deploy docs (push) Has been cancelled
2026-05-05 09:31:21 +01:00
Peter Steinberger
1350779782
docs: fix homepage command rendering 2026-05-05 09:19:54 +01:00
Peter Steinberger
d5530b3dd9
docs: sharpen homepage positioning 2026-05-05 09:17:35 +01:00
Peter Steinberger
ae1b334ccb
docs: fix gitcrawl brew install path 2026-05-05 09:15:00 +01:00
64 changed files with 4578 additions and 295 deletions


@ -0,0 +1,109 @@
---
name: gitcrawl
description: Use for local GitHub issue/PR archive search, sync freshness, clusters, durable maintainer triage, gh-shim cache reads, and Gitcrawl repo/release work.
---
# Gitcrawl
Use local archive data first for GitHub issue and pull request questions. Browse
or hit live GitHub APIs only when the local archive is stale, missing the
requested scope, or the user asks for current external context.
## Sources
- Config: `~/.config/gitcrawl/config.toml`
- DB: resolve with `gitcrawl doctor --json`; portable-store installs may point at `~/.config/gitcrawl/stores/gitcrawl-store/data/openclaw__openclaw.sync.db` instead of the default local DB
- Cache: `~/.config/gitcrawl/cache`
- Vectors: `~/.config/gitcrawl/vectors`
- Repo: `openclaw/gitcrawl`; on ClawSweeper this is checked out at `~/clawsweeper-workspace/gitcrawl`
- Preferred CLI: `gitcrawl`; fallback to `go run ./cmd/gitcrawl` from a verified repo checkout if the installed binary is stale
## Freshness
For recent/current questions, check freshness before analysis:
```bash
gitcrawl doctor --json
```
Routine refresh:
```bash
gitcrawl doctor
gitcrawl refresh owner/repo
```
Targeted refresh:
```bash
gitcrawl sync owner/repo --numbers 123,456 --with pr-details
```
For agent-driven discovery, prefer bounded freshness:
```bash
gitcrawl search issues "query" -R owner/repo --state open --sync-if-stale 5m --json number,title,url
```
## Query Workflow
1. Resolve scope: owner/repo, issue/PR number, cluster id, keyword, label, author, state, or date range.
2. Check freshness for recent/current requests.
3. Use CLI for normal reads; use read-only SQL for precise counts/rankings.
4. Report absolute date spans, repo names, issue/PR numbers, cluster ids, and known gaps.

Common commands:
```bash
gitcrawl search issues "query" -R owner/repo --state open --json number,title,url
gitcrawl clusters owner/repo --sort size --min-size 5
gitcrawl cluster-detail owner/repo --id <id>
gitcrawl gh pr view 123 -R owner/repo --json number,title,state,url
```
## SQL
`gitcrawl` does not currently expose a first-class `sql` command. For exact
local archive counts or rankings, use SQLite read-only mode against the
configured DB and prefer CLI commands for normal reads.

Useful examples:
```bash
db="$(gitcrawl doctor --json | jq -r .db_path)"
sqlite3 -readonly "$db" \
"select count(*) as threads from threads;"
sqlite3 -readonly "$db" \
"select r.full_name, count(*) as threads from threads t join repositories r on r.id = t.repo_id group by r.full_name order by threads desc limit 20;"
sqlite3 -readonly "$db" \
"select state, count(*) as threads from threads group by state;"
```
Do not run mutating SQL against the archive. Use local maintainer commands for
overrides instead of writing database rows directly.

When the installed CLI lacks a new feature, build or run from
a verified `openclaw/gitcrawl` checkout before concluding the feature is missing.
## Maintainer Boundaries
`close-thread`, `close-cluster`, exclusions, and canonical-member choices are
local maintainer overrides; they do not write back to GitHub. Set
`GITCRAWL_GH_PATH` explicitly when using the gh shim so it cannot recurse into
itself.
## Verification
For repo edits, prefer existing Go gates:
```bash
GOWORK=off go test ./...
```
Then run targeted CLI smoke for the touched surface, for example:
```bash
gitcrawl doctor --json
gitcrawl status --json
gitcrawl search issues "test" -R openclaw/gitcrawl --state open --limit 5
```

12
.editorconfig Normal file

@ -0,0 +1,12 @@
root = true
[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
indent_style = tab
indent_size = 4
[*.{md,yml,yaml,json,toml}]
indent_style = space
indent_size = 2

6
.gitattributes vendored Normal file

@ -0,0 +1,6 @@
* text=auto
*.go text eol=lf
*.md text eol=lf
*.toml text eol=lf
*.yml text eol=lf
*.yaml text eol=lf

11
.github/CODEOWNERS vendored Normal file

@ -0,0 +1,11 @@
# Protect ownership and automation rules.
/.github/CODEOWNERS @openclaw/openclaw-secops
/.github/dependabot.yml @openclaw/openclaw-secops
/.github/workflows/ @openclaw/openclaw-secops
# Release and package integrity surfaces.
/.goreleaser.yaml @openclaw/openclaw-secops
/go.mod @openclaw/openclaw-secops
/go.sum @openclaw/openclaw-secops
/scripts/*release* @openclaw/openclaw-secops
/scripts/*publish* @openclaw/openclaw-secops

13
.github/dependabot.yml vendored Normal file

@ -0,0 +1,13 @@
version: 2
updates:
  - package-ecosystem: gomod
    directory: /
    schedule:
      interval: weekly
    open-pull-requests-limit: 10
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly
    open-pull-requests-limit: 10


@ -54,14 +54,28 @@ jobs:
      - name: Vet
        run: go vet ./...
      - name: Test
        run: go test ./...
      - name: Test with coverage
        run: |
          go test ./... -covermode=atomic -coverprofile=coverage.out
          total="$(go tool cover -func=coverage.out | awk '/^total:/ { sub(/%/, "", $3); print $3 }')"
          echo "total coverage: ${total}%"
          awk -v total="$total" 'BEGIN { if (total + 0 < 85.0) { printf("coverage %.1f%% is below 85.0%%\n", total); exit 1 } }'
      - name: Build
        run: go build -ldflags "-X github.com/openclaw/gitcrawl/internal/cli.version=${GITHUB_SHA:0:7}" -o bin/gitcrawl ./cmd/gitcrawl
      - name: Smoke test TUI help
        run: |
          set -euo pipefail
          test -n "$(./bin/gitcrawl --version)"
          ./bin/gitcrawl metadata --json | grep -q '"schema_version"'
          ./bin/gitcrawl status --json | grep -q '"databases"'
          output="$(./bin/gitcrawl help tui)"
          printf '%s\n' "$output"
          printf '%s' "$output" | grep -q "gitcrawl tui"
      - name: Snapshot release build
        uses: goreleaser/goreleaser-action@v7.1.0
        uses: goreleaser/goreleaser-action@v7.2.1
        with:
          distribution: goreleaser
          version: "~> v2"
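
The coverage step in this hunk extracts the total from `go tool cover -func` output and fails the job below 85%. A minimal sketch of the same gate as a reusable shell function follows. It assumes the summary line keeps the `total:` / `(statements)` / percentage layout that the workflow's own `awk` expression relies on; the names `coverage_total` and `coverage_gate` are hypothetical and do not exist in the repository.

```bash
# Hypothetical helpers for illustration; not part of gitcrawl.

# Read `go tool cover -func` output on stdin and print the total
# percentage without the trailing percent sign.
coverage_total() {
  awk '/^total:/ { sub(/%/, "", $3); print $3 }'
}

# Read the same output on stdin and succeed only when the total meets
# the minimum passed as the first argument. A missing total counts as 0.
coverage_gate() {
  _min="$1"
  _total="$(coverage_total)"
  awk -v total="$_total" -v min="$_min" \
    'BEGIN { if (total + 0 < min + 0) { printf("coverage %.1f%% is below %.1f%%\n", total, min); exit 1 } }'
}
```

Under those assumptions it could be called as `go tool cover -func=coverage.out | coverage_gate 85`, which keeps the threshold in one place instead of repeating it in the workflow and the Makefile.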


@ -37,10 +37,69 @@ jobs:
        run: git checkout ${{ inputs.tag }}
      - name: GoReleaser
        uses: goreleaser/goreleaser-action@v7.1.0
        uses: goreleaser/goreleaser-action@v7.2.1
        with:
          distribution: goreleaser
          version: "~> v2"
          args: release --clean --config /tmp/.goreleaser.yaml
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  update-homebrew-tap:
    runs-on: ubuntu-latest
    needs: goreleaser
    steps:
      - name: Resolve release tag
        run: |
          if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
            echo "RELEASE_TAG=${{ inputs.tag }}" >> "$GITHUB_ENV"
          else
            echo "RELEASE_TAG=${{ github.ref_name }}" >> "$GITHUB_ENV"
          fi
      - name: Dispatch tap formula update
        env:
          GH_TOKEN: ${{ secrets.HOMEBREW_TAP_TOKEN }}
        run: |
          if [ -z "$GH_TOKEN" ]; then
            echo "::error::Set HOMEBREW_TAP_TOKEN with workflow access to openclaw/homebrew-tap"
            exit 1
          fi
          request_id="gitcrawl-${RELEASE_TAG}-${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}"
          expected_title="Update gitcrawl for ${RELEASE_TAG} (${request_id})"
          gh workflow run update-formula.yml \
            --repo openclaw/homebrew-tap \
            --ref main \
            -f formula=gitcrawl \
            -f tag="$RELEASE_TAG" \
            -f repository=openclaw/gitcrawl \
            -f artifact_template="{formula}_{version}_{target}.tar.gz" \
            -f request_id="$request_id"
          run_id=""
          for _ in {1..30}; do
            run_id=$(gh run list \
              --repo openclaw/homebrew-tap \
              --workflow update-formula.yml \
              --branch main \
              --event workflow_dispatch \
              --limit 20 \
              --json databaseId,displayTitle \
              --jq ".[] | select(.displayTitle == \"$expected_title\") | .databaseId" | head -n1)
            if [ -n "$run_id" ]; then
              break
            fi
            sleep 5
          done
          if [ -z "$run_id" ]; then
            echo "::error::Could not find tap workflow run with title: $expected_title"
            exit 1
          fi
          gh run watch "$run_id" \
            --repo openclaw/homebrew-tap \
            --exit-status \
            --interval 10
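
The dispatch step above triggers the tap workflow and then polls `gh run list` up to 30 times until a run with the expected title shows up. A minimal sketch of that bounded-polling pattern in isolation follows, assuming a hypothetical helper named `poll_for_output` that is not part of this workflow. It is written in portable `sh` rather than the bash brace expansion the workflow uses.

```bash
# Hypothetical helper for illustration; not part of the release workflow.
# Usage: poll_for_output ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it prints something, then echoes that output.
# Returns 1 when every attempt produced empty output.
poll_for_output() {
  _attempts="$1"
  _delay="$2"
  shift 2
  _i=0
  while [ "$_i" -lt "$_attempts" ]; do
    # A failing command is treated the same as empty output.
    _out="$("$@" 2>/dev/null || true)"
    if [ -n "$_out" ]; then
      printf '%s\n' "$_out"
      return 0
    fi
    _i=$((_i + 1))
    # Only wait when another attempt remains.
    if [ "$_i" -lt "$_attempts" ]; then
      sleep "$_delay"
    fi
  done
  return 1
}
```

With such a helper, the lookup could be wrapped in a small function and called as `run_id="$(poll_for_output 30 5 find_tap_run)"`, where `find_tap_run` is likewise a hypothetical wrapper around the `gh run list` query shown in the hunk.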

63
.github/workflows/secret-scan.yml vendored Normal file

@ -0,0 +1,63 @@
name: "Security Gate: Secret Scanning"
on:
  push:
    branches: ["**"]
  pull_request:
    branches: [main, master]
permissions: {}
jobs:
  trufflehog:
    name: Scan for Verified Secrets
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout code
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
      - name: Resolve scan range
        id: scan_range
        env:
          EVENT_NAME: ${{ github.event_name }}
          PR_BASE_SHA: ${{ github.event.pull_request.base.sha }}
          PR_HEAD_SHA: ${{ github.event.pull_request.head.sha }}
          PUSH_BASE_SHA: ${{ github.event.before }}
          PUSH_HEAD_SHA: ${{ github.sha }}
          DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
        run: |
          set -euo pipefail
          zero_sha="0000000000000000000000000000000000000000"
          if [[ "$EVENT_NAME" == "pull_request" ]]; then
            base="$PR_BASE_SHA"
            head="$PR_HEAD_SHA"
          else
            base="$PUSH_BASE_SHA"
            head="$PUSH_HEAD_SHA"
            if [[ -z "$base" || "$base" == "$zero_sha" ]]; then
              base="origin/$DEFAULT_BRANCH"
            fi
          fi
          echo "base=$base" >> "$GITHUB_OUTPUT"
          echo "head=$head" >> "$GITHUB_OUTPUT"
      - name: TruffleHog OSS
        id: trufflehog
        uses: trufflesecurity/trufflehog@v3.95.2
        with:
          path: ./
          base: ${{ steps.scan_range.outputs.base }}
          head: ${{ steps.scan_range.outputs.head }}
          extra_args: --only-verified --debug
      - name: Notify on failure
        if: steps.trufflehog.outcome == 'failure'
        run: |
          echo "::error::Verified secrets found. Rotate the credential before merging."
          exit 1
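
The "Resolve scan range" step above picks the base commit from the pull request or, for pushes, from `github.event.before`, falling back to the default branch when that value is empty or all zeros (as it is for a newly created branch). A minimal sketch of that decision as a standalone function follows, assuming a hypothetical name `resolve_scan_base` and positional arguments in place of the workflow's environment variables.

```bash
# Hypothetical helper for illustration; mirrors the branch logic of the
# workflow step but is not part of the repository.
# Usage: resolve_scan_base EVENT_NAME PR_BASE_SHA PUSH_BASE_SHA DEFAULT_BRANCH
resolve_scan_base() {
  _event="$1"
  _pr_base="$2"
  _push_base="$3"
  _default_branch="$4"
  _zero_sha="0000000000000000000000000000000000000000"
  if [ "$_event" = "pull_request" ]; then
    printf '%s\n' "$_pr_base"
  elif [ -z "$_push_base" ] || [ "$_push_base" = "$_zero_sha" ]; then
    # No usable "before" commit, so compare against the default branch.
    printf 'origin/%s\n' "$_default_branch"
  else
    printf '%s\n' "$_push_base"
  fi
}
```

Keeping the decision in a pure function like this makes the fallback cases easy to exercise locally without triggering a workflow run.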

86
.github/workflows/stale.yml vendored Normal file

@ -0,0 +1,86 @@
name: Stale
on:
  schedule:
    - cron: "21 4 * * *"
  workflow_dispatch:
permissions: {}
jobs:
  stale:
    permissions:
      issues: write
      pull-requests: write
    runs-on: ubuntu-latest
    steps:
      - name: Mark stale unassigned issues and pull requests
        uses: actions/stale@v10
        with:
          days-before-issue-stale: 14
          days-before-issue-close: 7
          days-before-pr-stale: 14
          days-before-pr-close: 7
          stale-issue-label: stale
          stale-pr-label: stale
          exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale
          exempt-pr-labels: maintainer,no-stale
          operations-per-run: 1000
          ascending: true
          exempt-all-assignees: true
          remove-stale-when-updated: true
          stale-issue-message: |
            This issue has been automatically marked as stale due to inactivity.
            Please add updated gitcrawl details or it will be closed.
          stale-pr-message: |
            This pull request has been automatically marked as stale due to inactivity.
            Please update it or it will be closed.
          close-issue-message: |
            Closing due to inactivity.
            If this still affects gitcrawl, open a new issue with current reproduction details.
          close-issue-reason: not_planned
          close-pr-message: |
            Closing due to inactivity.
            If this PR should be revived, reopen it with current context and validation.
      - name: Mark stale assigned issues
        uses: actions/stale@v10
        with:
          days-before-issue-stale: 30
          days-before-issue-close: 10
          days-before-pr-stale: -1
          days-before-pr-close: -1
          stale-issue-label: stale
          exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale
          operations-per-run: 1000
          ascending: true
          include-only-assigned: true
          remove-stale-when-updated: true
          stale-issue-message: |
            This assigned issue has been automatically marked as stale after 30 days of inactivity.
            Please add an update or it will be closed.
          close-issue-message: |
            Closing due to inactivity.
            If this still affects gitcrawl, reopen or file a new issue with current evidence.
          close-issue-reason: not_planned
      - name: Mark stale assigned pull requests
        uses: actions/stale@v10
        with:
          days-before-issue-stale: -1
          days-before-issue-close: -1
          days-before-pr-stale: 27
          days-before-pr-close: 7
          stale-pr-label: stale
          exempt-pr-labels: maintainer,no-stale
          operations-per-run: 1000
          ascending: true
          include-only-assigned: true
          ignore-pr-updates: true
          remove-stale-when-updated: true
          stale-pr-message: |
            This assigned pull request has been automatically marked as stale after being open for 27 days.
            Please add an update or it will be closed.
          close-pr-message: |
            Closing due to inactivity.
            If this PR should be revived, reopen it with current context and validation.


@ -1,8 +1,30 @@
# Changelog
## Unreleased
## 0.3.1 - 2026-05-08
- Fix gh-shim portable-store auto-hydration so exact issue/PR refreshes write to the runtime mirror instead of dirtying the Git checkout, clear stale portable refresh locks, and make empty open issue discovery fall through when only targeted sync history exists.
- Keep `cluster-detail` aligned with the default cluster list by showing closed historical members unless `--hide-closed` is passed, and fail fast when `GITCRAWL_GH_PATH` points back at the `gitcrawl` shim.
## 0.3.0 - 2026-05-08
- Bump routine release workflow dependencies.
- Add a repo-local `gitcrawl` agent skill for local archive, freshness, gh-shim, cluster, and verification workflows.
- Accept full GitHub issue and pull request URLs anywhere `gitcrawl` expects a thread number, including sync filters, gh-shim views/diffs, governance commands, neighbor lookup, embedding, and TUI jumps.
- Document read-only SQLite query examples in the repo-local agent skill so agents can do exact local archive counts without mutating state.
- Document the crawlkit control surface now available on `main`, including `metadata --json`, `status --json`, and `doctor --json` for local launchers and CI.
- Clarify that `gitcrawl tui` remains the reference terminal browser for the crawl app family while shared `crawlkit/tui` converges on the same panes, sorting, action menus, and status chrome.
- Add command-reference coverage for the read-only metadata/status commands.
- Add broader CLI, gh-shim, TUI, and store regression coverage for the verified release surface.
## 0.2.1 - 2026-05-05
- Improve `gh` shim cache coordination and observability with stale-while-revalidate reads, finer Actions/API TTLs, recent-window stats, top miss keys, and `xcache snapshot`.
## 0.2.0 - 2026-05-05
- Add Homebrew tap installation via `brew install steipete/tap/gitcrawl`.
- Add Homebrew tap installation via `brew install openclaw/tap/gitcrawl`.
- Improve the `gh` shim cache with canonicalized keys, targeted mutation invalidation, stale-on-rate-limit fallback reads, completed-run TTLs, hit-rate stats, counter reset, and issue auto-hydration.
- Add dark-mode support, a theme toggle, and clearer navigation styling to the generated docs site.
- Force embedding refreshes when the embedding input rune cap changes, so stale larger-cap vectors are not reused.
@ -11,6 +33,11 @@
- Auto-hydrate one exact pull request when local PR detail reads miss or check/run data is stale, using `gh auth token` if `GITHUB_TOKEN` is absent, then retry from SQLite before falling back to live `gh`.
- Cache more ghx-style read-only fallthroughs, including release, workflow, secret, variable, project, ruleset, gist, org, and search reads; cache repeat read failures by default; and clear the fallthrough cache after the corresponding mutating `gh` commands.
- Promote portable backups to the v2 format: keep compact comments, PR files, commits, checks, and workflow runs while stripping raw JSON, generated documents, vectors, clusters, and run history.
- Add crawlkit control metadata/status surfaces with command-local `metadata --json`, `status --json`, and `doctor --json`.
- Include the primary SQLite database inventory in status JSON so local control surfaces can discover archive storage without opening live stores.
- Route config path handling and SQLite openers through `crawlkit` so GitHub archive tooling shares the same foundation as the Slack, Discord, and Notion crawlers.
- Keep shared crawl app TUI nomenclature aligned while `gitcrawl tui` remains the richer cluster-browser reference implementation.
- Keep the existing `gitcrawl tui` as the family reference terminal interface and add CI smoke coverage for its help surface.
## 0.1.2 - 2026-05-01


@ -1,7 +1,7 @@
BINARY := gitcrawl
VERSION ?= dev
.PHONY: build test run clean
.PHONY: build test test-coverage run clean
build:
go build -ldflags "-X github.com/openclaw/gitcrawl/internal/cli.version=$(VERSION)" -o bin/$(BINARY) ./cmd/gitcrawl
@ -9,6 +9,12 @@ build:
test:
go test ./...
test-coverage:
go test ./... -covermode=atomic -coverprofile=coverage.out
@total="$$(go tool cover -func=coverage.out | awk '/^total:/ { sub(/%/, "", $$3); print $$3 }')"; \
echo "total coverage: $${total}%"; \
awk -v total="$$total" 'BEGIN { if (total + 0 < 85.0) { printf("coverage %.1f%% is below 85.0%%\n", total); exit 1 } }'
run:
go run ./cmd/gitcrawl $(ARGS)


@ -15,9 +15,12 @@ Early bootstrap. The implementation is being built in small commits.
```bash
gitcrawl init
gitcrawl doctor
gitcrawl metadata --json
gitcrawl status --json
gitcrawl sync owner/repo
gitcrawl sync owner/repo --state open
gitcrawl sync owner/repo --numbers 123,456 --include-comments
gitcrawl sync owner/repo --numbers https://github.com/owner/repo/issues/123 --with pr-details
gitcrawl refresh owner/repo
gitcrawl cluster owner/repo --threshold 0.80
gitcrawl clusters owner/repo
@ -25,6 +28,7 @@ gitcrawl durable-clusters owner/repo
gitcrawl cluster-detail owner/repo --id 123
gitcrawl cluster-explain owner/repo --id 123
gitcrawl close-thread owner/repo --number 123 --reason "duplicate handled"
gitcrawl close-thread owner/repo --number https://github.com/owner/repo/issues/123 --reason "handled"
gitcrawl reopen-thread owner/repo --number 123
gitcrawl close-cluster owner/repo --id 123 --reason "handled"
gitcrawl reopen-cluster owner/repo --id 123
@ -32,6 +36,7 @@ gitcrawl exclude-cluster-member owner/repo --id 123 --number 456 --reason "not t
gitcrawl include-cluster-member owner/repo --id 123 --number 456
gitcrawl set-cluster-canonical owner/repo --id 123 --number 456
gitcrawl neighbors owner/repo --number 123 --limit 10
gitcrawl neighbors owner/repo --number https://github.com/owner/repo/pull/456 --limit 10
gitcrawl search owner/repo --query "download stalls"
gitcrawl search issues "download stalls" -R owner/repo --state open --json number,title,state,url,updatedAt,labels --limit 30
gitcrawl search prs "manifest cache" -R owner/repo --state open --json number,title,state,url,updatedAt,isDraft,author --limit 20
@ -39,6 +44,8 @@ gitcrawl search issues "hot loop" -R owner/repo --state open --sync-if-stale 5m
gitcrawl sync owner/repo --numbers 123 --with pr-details
gitcrawl gh search issues "download stalls" -R owner/repo --state open --match comments --json number,title,url
gitcrawl gh pr view 123 -R owner/repo --json number,title,state,url
gitcrawl gh pr view https://github.com/owner/repo/pull/123 --json number,title,state,url
gitcrawl gh pr checks https://github.com/owner/repo/pull/123 --json name,state,conclusion
gitcrawl gh run view 123456789 -R owner/repo --json status,conclusion
gitcrawl gh xcache stats
gitcrawl tui
@ -46,14 +53,17 @@ gitcrawl tui owner/repo
```
`gitcrawl clusters` and `gitcrawl tui` match ghcrawl's display view: latest raw run clusters first, closed durable rows merged as historical context, sorted by size by default. Pass `--hide-closed` to focus only currently open clusters. `gitcrawl durable-clusters` stays on governed durable rows and needs `--include-closed` for inactive rows.
`gitcrawl metadata --json`, `gitcrawl status --json`, and `gitcrawl doctor --json` are crawlkit control surfaces for launchers, local automation, and CI checks. They are read-only and do not mutate archive data.
`gitcrawl cluster` and `gitcrawl refresh` build ghcrawl-shaped durable clusters by default (`--threshold 0.80`, `--min-size 1`, `--max-cluster-size 40`, `--k 16`, `--cross-kind-threshold 0.93`): every active vector-backed thread is represented, singleton rows use `singleton_orphan`, multi-member rows use `duplicate_candidate`, and stable IDs are derived from the representative thread. They also add deterministic GitHub reference evidence for direct issue/PR links such as `#123`, `issues/123`, and `pull/123`. Weak embedding edges need concrete title-token overlap unless their similarity is already high, which keeps generic low-confidence bridges from forming unrelated clusters.
`gitcrawl tui` infers the most recently updated local repository when `owner/repo` is omitted. `serve` is intentionally not part of `gitcrawl`.
`gitcrawl sync` fetches open issues and pull requests by default. Pass `--state all` or `--state closed` for explicit backfill workflows; incremental open syncs with `--since` also sweep recently closed items so local open state does not rot.
Pass `--numbers` to refresh exact issue or pull request rows without relying on list ordering or updated-time windows.
Thread-reference inputs accept bare numbers, `#123`, `issues/123`, `pull/123`, `owner/repo#123`, and full GitHub issue/PR URLs. This applies to sync filters, `--number` flags, governance member commands, neighbor/embed lookups, gh-shim `view`/`checks`/`diff`, and TUI jump input. For gh-shim view/checks/diff, a full GitHub URL also supplies the repository, so `-R owner/repo` can be omitted.
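
To illustrate the accepted reference shapes listed above, here is a minimal sketch of how a thread number could be pulled out of each form. It assumes a hypothetical shell helper, `thread_number`; gitcrawl's real parser is not shown in this diff and is likely stricter. In particular, this sketch does not check that a path segment is `issues` or `pull`, and it does not handle URLs with a fragment such as a comment anchor.

```bash
# Hypothetical helper for illustration; not gitcrawl's actual parser.
# Prints the trailing issue/PR number of a thread reference, or returns 1
# when the reference does not end in a number.
thread_number() {
  _ref="${1%/}"             # drop one trailing slash, if any
  _num="${_ref##*[#/]}"     # keep what follows the last '#' or '/'
  case "$_num" in
    '' | *[!0-9]*) return 1 ;;
    *) printf '%s\n' "$_num" ;;
  esac
}
```

For example, `thread_number "owner/repo#123"` and `thread_number "https://github.com/owner/repo/pull/123"` both print `123` under these assumptions.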
Pass `--with pr-details` or `--include-pr-details` to hydrate pull request files, commits, checks, and workflow runs for local review. The `gh` shim can also auto-hydrate one exact PR on a PR-detail miss, then retry locally.
`gitcrawl search issues|prs` accepts the common `gh search` shape (`<query> -R owner/repo --state open --json fields --limit N`) and answers from the local SQLite cache. It is intended for discovery without spending GitHub REST search quota; use `gh` for final live verification and GitHub write actions. Pass `--sync-if-stale 5m` to perform one metadata sync before the cached search when the local repository mirror is older than that duration.
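
The `--sync-if-stale 5m` behavior described above comes down to comparing the age of the local mirror against a duration. A minimal sketch of that check follows, assuming hypothetical helpers `duration_to_seconds` and `is_stale` and assuming only single-unit durations with `s`, `m`, or `h` suffixes. The source shows only `5m`, so any other accepted syntax is a guess.

```bash
# Hypothetical helpers for illustration; not part of gitcrawl.

# Convert a duration such as 30s, 5m, or 2h into seconds.
# Returns 1 for anything else (missing unit, non-digits, empty value).
duration_to_seconds() {
  _value="${1%[smh]}"
  case "$_value" in
    '' | *[!0-9]*) return 1 ;;
  esac
  case "$1" in
    *s) echo "$_value" ;;
    *m) echo $((_value * 60)) ;;
    *h) echo $((_value * 3600)) ;;
    *) return 1 ;;
  esac
}

# Usage: is_stale LAST_SYNC_EPOCH NOW_EPOCH MAX_AGE
# Succeeds when the mirror is older than MAX_AGE; returns 2 on a bad duration.
is_stale() {
  _max="$(duration_to_seconds "$3")" || return 2
  [ $(($2 - $1)) -gt "$_max" ]
}
```

A caller could then run one metadata sync only when `is_stale "$last_sync" "$(date +%s)" 5m` succeeds, and answer from the local cache otherwise.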
`gitcrawl gh` is a gh-compatible shim for agent workflows. It answers broad `gh search issues|prs`, `gh issue/pr list`, supported `gh issue/pr view --json` fields, hydrated `gh pr checks`, and hydrated `gh run list/view` from local SQLite, then falls through to the real GitHub CLI for unsupported commands. Local `gh issue/pr list` supports common filters such as `--author`, `--assignee`, and repeated `--label`. Read-only fallthroughs such as `gh pr diff`, `gh repo view/list`, `gh release list/view`, `gh workflow list/view`, `gh secret list`, `gh variable get/list`, `gh label list`, read-only `gh search` kinds, GET-only REST `gh api` calls, and read-only `gh api graphql` queries use a command-aware persistent cache under `cache/gh-shim`; Actions run/job logs get longer TTLs, completed run views are kept much longer than active CI status, user profile reads get a 7-day TTL, read-only GraphQL gets a 6-hour TTL, and `gh pr diff` entries are keyed by the cached PR head SHA when available. Explicit API paths and explicit repositories share cache entries across sibling checkouts even when agents set different `GH_REPO` values; implicit repo reads stay isolated by `GH_REPO` or current working directory. Cache keys canonicalize common flags such as `-R`/`--repo` and sorted `--json` fields so equivalent agent commands coalesce. Repeat read failures are cached by default so agents do not rediscover the same missing release or workflow, but rate-limit error entries expire quickly; if GitHub rate-limits a refresh and an expired successful entry exists, the shim serves the stale response with a warning instead of failing the read. Set `GITCRAWL_GH_CACHE_ERRORS=0` to disable error caching. Mutating commands pass through, increment write counters, and invalidate matching cache tags instead of flushing unrelated entries. 
`gh xcache stats|keys|gc|flush|reset` inspects, garbage-collects, clears, or resets fallthrough-cache counters, including hit rate plus per-command and per-route backend miss counters. Set `GITCRAWL_GH_PATH` to choose the backend `gh`, and symlink or install the binary as `gh`/`gitcrawl-gh` to run the shim directly.
`gitcrawl gh` is a gh-compatible shim for agent workflows. It answers broad `gh search issues|prs`, `gh issue/pr list`, supported `gh issue/pr view --json` fields, hydrated `gh pr checks`, and hydrated `gh run list/view` from local SQLite, then falls through to the real GitHub CLI for unsupported commands. Local `gh issue/pr list` supports common filters such as `--author`, `--assignee`, and repeated `--label`; empty open issue discovery falls through when the local repo only has targeted sync history. Read-only fallthroughs such as `gh pr diff`, `gh repo view/list`, `gh release list/view`, `gh workflow list/view`, `gh secret list`, `gh variable get/list`, `gh label list`, read-only `gh search` kinds, GET-only REST `gh api` calls, and read-only `gh api graphql` queries use a command-aware persistent cache under `cache/gh-shim`; Actions run/job logs get longer TTLs, completed run/job reads are kept much longer than active CI status, user profile reads get a 7-day TTL, read-only GraphQL gets a 6-hour TTL, and `gh pr diff` entries are keyed by the cached PR head SHA when available. Explicit API paths and explicit repositories share cache entries across sibling checkouts even when agents set different `GH_REPO` values; implicit repo reads stay isolated by `GH_REPO` or current working directory. Cache keys canonicalize common flags such as `-R`/`--repo` and sorted `--json` fields so equivalent agent commands coalesce. Repeat read failures are cached by default so agents do not rediscover the same missing release or workflow, but rate-limit error entries expire quickly; if GitHub rate-limits a refresh and an expired successful entry exists, the shim serves the stale response with a warning instead of failing the read. When another process is refreshing an expired successful entry, peers may serve stale inside a short grace window instead of joining the backend stampede. 
Set `GITCRAWL_GH_STALE_GRACE=0` to disable stale-while-revalidate, or `GITCRAWL_GH_CACHE_ERRORS=0` to disable error caching. Mutating commands pass through, increment write counters, and invalidate matching cache tags instead of flushing unrelated entries. `gh xcache stats|keys|gc|flush|reset|snapshot` inspects, garbage-collects, clears, resets, or snapshots fallthrough-cache counters, including hit rate plus per-command, per-route, per-key, and `--since` recent-window miss counters. Set `GITCRAWL_GH_PATH` to choose the backend `gh`, and symlink or install the binary as `gh`/`gitcrawl-gh` to run the shim directly.
The TUI starts at `--min-size 5` and `--sort size`, like ghcrawl's saved default, so the first screen is the useful cluster workload instead of singleton noise. Pass `--min-size 1` when you intentionally want singleton clusters. Mouse support is built in: click rows, wheel panes, and right-click for copy, sort, filter, jump, link, neighbor, local close/reopen, and member triage actions. Press `a` to open the same action menu from the keyboard, `#` to jump directly to an issue or PR number, `p` to switch between repositories already present in the local store, or `n` to load neighbors for the selected issue or PR. Enter from the members pane also loads neighbors before opening detail. The TUI quietly refreshes from the local store every 15 seconds.
`gitcrawl tui` remains the reference terminal interaction model for the crawl app family: pane focus, sortable headers, mouse/right-click actions, detail rendering, and status chrome are the behavior the shared `crawlkit/tui` browser is converging on for Slack, Discord, and Notion archives.
## Local Defaults
@ -74,7 +84,7 @@ The TUI starts at `--min-size 5` and `--sort size`, like ghcrawl's saved default
Install from Homebrew:
```bash
brew install steipete/tap/gitcrawl
brew install openclaw/tap/gitcrawl
```
Or download a release archive from GitHub releases or build from source:
@ -91,4 +101,5 @@ go build -ldflags "-X github.com/openclaw/gitcrawl/internal/cli.version=$(git de
```bash
go test ./...
go build ./cmd/gitcrawl
go run ./cmd/gitcrawl help tui
```


@ -122,9 +122,10 @@ gitcrawl gh xcache stats
gitcrawl gh xcache keys
gitcrawl gh xcache reset
gitcrawl gh xcache flush
gitcrawl gh xcache snapshot [--reset]
```
The cache key includes the resolved gitcrawl config path, current working directory, `GH_HOST`, `GH_REPO`, stable PR-diff identity when available, and canonicalized `gh` arguments. This keeps sibling checkouts and portable stores isolated while still coalescing equivalent agent calls such as reordered flags or sorted `--json` fields. Concurrent cache misses use a lock file so one process populates the entry while peers wait for the result.
The cache key includes the resolved gitcrawl config path, current working directory, `GH_HOST`, `GH_REPO`, stable PR-diff identity when available, and canonicalized `gh` arguments. This keeps sibling checkouts and portable stores isolated while still coalescing equivalent agent calls such as reordered flags or sorted `--json` fields. Concurrent cache misses use a lock file so one process populates the entry while peers wait for the result; if an expired successful entry is still inside its stale grace window, peers may serve stale while the lock holder refreshes it. `xcache stats --since <duration>` reports recent-window counters from hourly buckets, and miss maps include command, normalized route, and canonical key views.
## Config


@ -122,6 +122,10 @@ gh issue comment 456 -R owner/repo --body "Duplicate of #123"
# Periodically log cache stats — watch local_hits climb relative to backend_misses.
gitcrawl gh xcache stats --json \
| jq '{local: .counters.local_hits, fallback: .counters.fallback_hits, github: .counters.backend_misses}'
# During release/debug sessions, compare a recent window or snapshot before reset.
gitcrawl gh xcache stats --since 1h --json
gitcrawl gh xcache snapshot --reset --json
```
## Multi-repo automation


@ -88,7 +88,7 @@ gitcrawl cluster-explain owner/repo --id 123 # alias
| `--id <n>` | _(required)_ | Cluster ID |
| `--member-limit <n>` | _(no limit)_ | Maximum members to return |
| `--body-chars <n>` | `280` | Body snippet length per member |
| `--include-closed` | _(off)_ | Include closed members |
| `--hide-closed` | _(off)_ | Hide locally closed members |
`cluster-explain` is the same command — it exists so the verb reads naturally in agent prompts ("explain why these things ended up together").


@ -32,6 +32,8 @@ These work on every command.
| --- | --- | --- |
| `gitcrawl init [--db --portable-store --portable-db --store-dir --json]` | Create config, database, runtime directories; optionally clone a portable store | [Installation](/installation/), [Portable stores](/portable-stores/) |
| `gitcrawl doctor [--json]` | Health check for config, database, credentials, model selection, repo/thread counts | [Configuration](/configuration/#gitcrawl-doctor) |
| `gitcrawl metadata [--json]` | Print the crawlkit command/control manifest for launchers and automation | — |
| `gitcrawl status [--json]` | Print read-only archive status, database inventory, and control state | — |
| `gitcrawl configure [--summary-model --embed-model --embedding-basis --json]` | Update model fields in `config.toml` | [Configuration](/configuration/#gitcrawl-configure) |
| `gitcrawl version` | Print version | — |
@ -39,9 +41,9 @@ These work on every command.
| Command | Purpose | Docs |
| --- | --- | --- |
| `gitcrawl sync owner/repo [--state --since --numbers --limit --include-comments --include-pr-details --with pr-details --json]` | Sync issues and PRs from GitHub into local SQLite | [Sync](/sync/) |
| `gitcrawl sync owner/repo [--state --since --numbers <refs> --limit --include-comments --include-pr-details --with pr-details --json]` | Sync issues and PRs from GitHub into local SQLite | [Sync](/sync/) |
| `gitcrawl refresh owner/repo [--no-sync --no-embed --no-cluster ...]` | Wrapper that runs sync → embed → cluster | [Refresh and embed](/refresh-and-embed/) |
| `gitcrawl embed owner/repo [--number --limit --force --include-closed --json]` | Generate OpenAI embeddings for thread documents | [Refresh and embed](/refresh-and-embed/#embed) |
| `gitcrawl embed owner/repo [--number <ref> --limit --force --include-closed --json]` | Generate OpenAI embeddings for thread documents | [Refresh and embed](/refresh-and-embed/#embed) |
| `gitcrawl runs owner/repo [--kind sync\|embedding\|cluster --limit --json]` | List recorded run history | [Refresh and embed](/refresh-and-embed/#runs) |
## Inspect
@ -51,7 +53,23 @@ These work on every command.
| `gitcrawl threads owner/repo [--include-closed --numbers --limit --json]` | List threads from local cache | — |
| `gitcrawl search owner/repo --query <text> [--mode keyword\|semantic\|hybrid --limit --json]` | Local search (direct mode) | [Search](/search/) |
| `gitcrawl search issues\|prs <query> -R owner/repo [--state --json --limit --sync-if-stale]` | Local search (`gh search` shape) | [Search](/search/#gh-search-compatibility-mode) |
| `gitcrawl neighbors owner/repo --number <n> [--limit --threshold --json]` | Vector-similar threads to a specific issue/PR | [Clustering](/clustering/#find-similar-threads-neighbors) |
| `gitcrawl neighbors owner/repo --number <ref> [--limit --threshold --json]` | Vector-similar threads to a specific issue/PR | [Clustering](/clustering/#find-similar-threads-neighbors) |
## Thread References
Commands that accept a thread number also accept thread references:
- bare numbers: `123`
- hash references: `#123`
- path references: `issues/123`, `pull/123`
- scoped references: `owner/repo#123`
- full GitHub issue or pull request URLs
This applies to `sync --numbers`, `threads --numbers`, `embed --number`,
`neighbors --number`, all governance `--number` flags, gh-shim
`issue/pr view`, `pr checks`, `pr diff`, and TUI jump input. In gh-shim
`view`/`checks`/`diff`, a full GitHub URL also supplies `owner/repo`, so
`-R owner/repo` is optional.
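A minimal sketch of how these reference forms could be normalized is shown below. It is not the project's parser: `threadRef` and `parseThreadRef` are hypothetical names, and the regular expressions are simplified variants written for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// threadRef is a hypothetical result type. Owner and Repo stay empty when
// the reference does not name a repository.
type threadRef struct {
	Owner, Repo string
	Number      int
}

var (
	urlRef    = regexp.MustCompile(`(?i)^https?://github\.com/([\w.-]+)/([\w.-]+)/(?:issues|pull)/(\d+)(?:[/?#].*)?$`)
	scopedRef = regexp.MustCompile(`^([\w.-]+)/([\w.-]+)#(\d+)$`)
	pathRef   = regexp.MustCompile(`(?i)^(?:issues|pull)/(\d+)$`)
)

// parseThreadRef accepts bare numbers, #123, issues/123, pull/123,
// owner/repo#123, and full GitHub issue or pull request URLs.
func parseThreadRef(input string) (threadRef, error) {
	s := strings.TrimSpace(input)
	var ref threadRef
	digits := strings.TrimPrefix(s, "#") // covers "123" and "#123"
	if m := urlRef.FindStringSubmatch(s); m != nil {
		ref.Owner, ref.Repo, digits = m[1], m[2], m[3]
	} else if m := scopedRef.FindStringSubmatch(s); m != nil {
		ref.Owner, ref.Repo, digits = m[1], m[2], m[3]
	} else if m := pathRef.FindStringSubmatch(s); m != nil {
		digits = m[1]
	}
	n, err := strconv.Atoi(digits)
	if err != nil || n <= 0 {
		return threadRef{}, fmt.Errorf("not a thread reference: %q", input)
	}
	ref.Number = n
	return ref, nil
}

func main() {
	inputs := []string{"123", "#123", "issues/123", "owner/repo#123",
		"https://github.com/owner/repo/pull/123"}
	for _, in := range inputs {
		ref, err := parseThreadRef(in)
		fmt.Println(in, "->", ref, err)
	}
}
```

In this sketch only the URL and scoped forms carry a repository, which is the property that lets the gh-shim commands drop `-R owner/repo` when given a full URL.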
## Cluster
@ -60,20 +78,20 @@ These work on every command.
| `gitcrawl cluster owner/repo [--threshold --min-size --max-cluster-size --k --cross-kind-threshold --limit --model --basis --include-closed --json]` | Build durable clusters from vectors | [Clustering](/clustering/#generate-clusters) |
| `gitcrawl clusters owner/repo [--sort size\|recent\|oldest --min-size --limit --hide-closed --json]` | Latest-run cluster summary, merged with closed durable rows | [Clustering](/clustering/#list-clusters) |
| `gitcrawl durable-clusters owner/repo [--include-closed --sort --min-size --limit --json]` | Strict durable-cluster audit view | [Clustering](/clustering/#list-clusters) |
| `gitcrawl cluster-detail owner/repo --id <n> [--member-limit --body-chars --include-closed --json]` | Cluster + members detail | [Clustering](/clustering/#inspect-a-cluster) |
| `gitcrawl cluster-detail owner/repo --id <n> [--member-limit --body-chars --hide-closed --json]` | Cluster + members detail | [Clustering](/clustering/#inspect-a-cluster) |
| `gitcrawl cluster-explain owner/repo --id <n> [...]` | Alias for `cluster-detail` | [Clustering](/clustering/#inspect-a-cluster) |
## Governance
| Command | Purpose | Docs |
| --- | --- | --- |
| `gitcrawl close-thread owner/repo --number <n> [--reason --json]` | Local close on a thread | [Governance](/governance/#local-close) |
| `gitcrawl reopen-thread owner/repo --number <n> [--json]` | Inverse | — |
| `gitcrawl close-thread owner/repo --number <ref> [--reason --json]` | Local close on a thread | [Governance](/governance/#local-close) |
| `gitcrawl reopen-thread owner/repo --number <ref> [--json]` | Inverse | — |
| `gitcrawl close-cluster owner/repo --id <n> [--reason --json]` | Local close on a cluster | [Governance](/governance/#local-close) |
| `gitcrawl reopen-cluster owner/repo --id <n> [--json]` | Inverse | — |
| `gitcrawl exclude-cluster-member owner/repo --id <n> --number <m> [--reason --json]` | Pull a thread out of a cluster | [Governance](/governance/#member-exclusion) |
| `gitcrawl include-cluster-member owner/repo --id <n> --number <m> [--reason --json]` | Inverse | — |
| `gitcrawl set-cluster-canonical owner/repo --id <n> --number <m> [--reason --json]` | Pin canonical thread for a cluster | [Governance](/governance/#canonical-member) |
| `gitcrawl exclude-cluster-member owner/repo --id <n> --number <ref> [--reason --json]` | Pull a thread out of a cluster | [Governance](/governance/#member-exclusion) |
| `gitcrawl include-cluster-member owner/repo --id <n> --number <ref> [--reason --json]` | Inverse | — |
| `gitcrawl set-cluster-canonical owner/repo --id <n> --number <ref> [--reason --json]` | Pin canonical thread for a cluster | [Governance](/governance/#canonical-member) |
## TUI
@ -86,12 +104,12 @@ These work on every command.
| Command | Purpose | Docs |
| --- | --- | --- |
| `gitcrawl gh search issues\|prs <query> -R owner/repo [...]` | Local-first `gh search` | [gh shim](/gh-shim/) |
| `gitcrawl gh issue view <n> -R owner/repo --json <fields>` | Local-first thread view | [gh shim](/gh-shim/) |
| `gitcrawl gh pr view <n> -R owner/repo --json <fields>` | Same, for PRs (with auto-hydration) | [gh shim](/gh-shim/) |
| `gitcrawl gh issue view <n-or-url> [-R owner/repo] --json <fields>` | Local-first thread view | [gh shim](/gh-shim/) |
| `gitcrawl gh pr view <n-or-url> [-R owner/repo] --json <fields>` | Same, for PRs (with auto-hydration) | [gh shim](/gh-shim/) |
| `gitcrawl gh issue list -R owner/repo [--state --search --author --assignee --label --json]` | Local-first list | [gh shim](/gh-shim/) |
| `gitcrawl gh pr list -R owner/repo [...]` | Same, for PRs | [gh shim](/gh-shim/) |
| `gitcrawl gh pr checks <n> -R owner/repo --json <fields>` | Cached PR checks (auto-hydrates if stale) | [gh shim](/gh-shim/) |
| `gitcrawl gh pr diff <n> -R owner/repo` | Falls through; cached by head SHA | [gh shim](/gh-shim/) |
| `gitcrawl gh pr checks <n-or-url> [-R owner/repo] --json <fields>` | Cached PR checks (auto-hydrates if stale) | [gh shim](/gh-shim/) |
| `gitcrawl gh pr diff <n-or-url> [-R owner/repo]` | Falls through; cached by head SHA | [gh shim](/gh-shim/) |
| `gitcrawl gh run list -R owner/repo [--branch --commit --json]` | Cached workflow runs | [gh shim](/gh-shim/) |
| `gitcrawl gh run view <run-id> -R owner/repo [--json]` | Same, single run | [gh shim](/gh-shim/) |
| `gitcrawl gh repo view\|list ...` | Falls through; cached briefly | [gh shim](/gh-shim/) |
@ -101,7 +119,7 @@ These work on every command.
| `gitcrawl gh label list ...` | Falls through; cached briefly | [gh shim](/gh-shim/) |
| `gitcrawl gh api <GET path>` | Falls through; cached briefly (GET-only REST) | [gh shim](/gh-shim/) |
| `gitcrawl gh api graphql -f query=...` | Falls through; read-only queries are cached | [gh shim](/gh-shim/#read-only-fallthroughs-cached) |
| `gitcrawl gh xcache stats\|keys\|gc\|flush\|reset [--json]` | Cache inspection / housekeeping | [gh shim](/gh-shim/#cache-inspection-xcache) |
| `gitcrawl gh xcache stats [--since <duration>] \| keys \| gc \| flush \| reset \| snapshot [--reset] [--json]` | Cache inspection / housekeeping | [gh shim](/gh-shim/#cache-inspection-xcache) |
| _Anything else_ | Falls through to real `gh` | [gh shim](/gh-shim/) |
The shim binary can be installed standalone by symlinking the `gitcrawl` binary as `gh` or `gitcrawl-gh`.


@ -102,6 +102,7 @@ checkout_dir = "/Users/me/.config/gitcrawl/portable"
| `GITCRAWL_GH_PATH` | Path to the real `gh` binary used for fallthrough |
| `GITCRAWL_GH_AUTO_HYDRATE` | Set to `0` to disable PR auto-hydration on cache miss |
| `GITCRAWL_GH_CACHE_TTL` | Override fallthrough cache TTL (e.g., `5m`, `1h`) |
| `GITCRAWL_GH_STALE_GRACE` | Override stale-while-revalidate grace for expired successful fallthrough entries |
| `GITCRAWL_GH_CACHE_ERRORS` | Set to `0` to avoid caching non-zero read-only fallthroughs |
If `GITCRAWL_GH_PATH` is unset, the shim probes common Homebrew install paths and then your `PATH`. Set it explicitly when you symlink the gitcrawl binary as `gh` (otherwise the shim will recurse into itself).


@ -52,8 +52,13 @@ Answered from the local FTS index. Honors `--state`, `--json`, `--limit`. `--mat
```bash
gh issue view 123 -R owner/repo --json number,title,state,url,body,labels,author
gh pr view 123 -R owner/repo --json number,title,state,url,isDraft,author,headRef,baseRef
gh issue view https://github.com/owner/repo/issues/123 --json number,title,url
gh pr view https://github.com/owner/repo/pull/123 --json number,title,url
```
Full GitHub issue/PR URLs provide both the repository and thread number when
`-R`/`--repo` is omitted.
Supported JSON fields include `number`, `title`, `state`, `url`, `body`, `author`, `createdAt`, `updatedAt`, `closedAt`, `labels`, plus PR-specific `isDraft`, `headRef`, `baseRef`. PR detail fields (`files`, `commits`, `checks`, `statusCheckRollup`) are answered from cached PR detail and trigger [auto-hydration](#auto-hydration) on miss.
### `gh issue list` / `gh pr list`
@ -71,10 +76,14 @@ Supports `--state`, `--search` (keyword search), `--author`, `--assignee`, repea
```bash
gh pr checks 123 -R owner/repo --json name,state,conclusion,detailsUrl
gh pr checks https://github.com/owner/repo/pull/123 --json name,state,conclusion
```
Returns the cached check/status summary for the PR. If the cached PR detail is older than 90 seconds or its head SHA is stale, [auto-hydration](#auto-hydration) refreshes it before answering. Supported fields: `name`, `state`, `status`, `conclusion`, `detailsUrl`, `workflow`, `startedAt`, `completedAt`.
Like `gh pr view`, a full pull request URL can supply both repository and
number.
### `gh run list` / `gh run view`
```bash
@ -89,7 +98,7 @@ Workflow runs come from cached PR detail. Filters: `--branch`, `--commit` (head
These commands always run real `gh` but the response body is cached for the next caller in the same workspace:
- `gh pr diff` — keyed by the cached PR head SHA when available, so the cache is stable across many sequential agent reads
- `gh pr diff <number-or-url>` — keyed by the cached PR head SHA when available, so the cache is stable across many sequential agent reads; full PR URLs can omit `-R`
- `gh issue list/status/view`, `gh pr list/status/view/checks`, and unsupported read-only local shim shapes
- `gh release list/view`, `gh workflow list/view`, `gh secret list`, and `gh variable get/list`
- `gh project list/view/field-list/item-list`, `gh ruleset check/list/view`, `gh gist list/view`, and `gh org list`
@ -101,9 +110,9 @@ These commands always run real `gh` but the response body is cached for the next
Common Actions REST reads such as run status, job lists, and logs get Actions-aware TTLs.
Default cache TTLs are command-aware: active `gh run list` and run-status reads use `2m`; completed run views are kept for `12h`; completed run lists are kept for `30m`; workflow, job detail, and Actions job-list reads use `5m`; search reads use `15m`; release metadata uses `30m`; GitHub user profile reads use `7d`; read-only GraphQL queries use `6h`; completed-style run/job log reads use `12h`; `gh pr diff` uses `5m` without a stable SHA and `7d` with one. Most other read-only fallthroughs use `5m` to `10m`. Override with `GITCRAWL_GH_CACHE_TTL=5m` or similar.
Default cache TTLs are command-aware: active `gh run list` and run-status reads use `30s`; completed run views, completed Actions job lists, and run/job logs are kept for `12h`; completed run lists are kept for `30m`; workflow reads use `15m`; search reads use `15m`; release metadata uses `1h`; GitHub user profile reads use `7d`; read-only GraphQL queries use `6h`; GitHub Pages metadata uses `15m` to `30m`; tagged/SHA `contents` API reads use `7d`; `gh pr diff` uses `5m` without a stable SHA and `7d` with one. Most other read-only fallthroughs use `5m` to `10m`. Override with `GITCRAWL_GH_CACHE_TTL=5m` or similar.
Repeat read failures are cached by default too. That avoids a fleet of agents all rediscovering the same missing release, workflow, secret, or unsupported field. Error entries are capped to shorter lifetimes, and rate-limit errors are capped at `2m` so a reset is not masked all day. If GitHub returns a rate-limit error while refreshing an expired successful entry, the shim serves that stale success with a warning instead of failing the read. Set `GITCRAWL_GH_CACHE_ERRORS=0` to cache successful reads only.
Repeat read failures are cached by default too. That avoids a fleet of agents all rediscovering the same missing release, workflow, secret, or unsupported field. Error entries are capped to shorter lifetimes, and rate-limit errors are capped at `2m` so a reset is not masked all day. If GitHub returns a rate-limit error while refreshing an expired successful entry, the shim serves that stale success with a warning instead of failing the read. When another process is already refreshing an expired successful entry, peers can serve that stale entry within a short command-aware grace window instead of joining the backend stampede. Set `GITCRAWL_GH_STALE_GRACE=0` to disable stale-while-revalidate, or `GITCRAWL_GH_CACHE_ERRORS=0` to cache successful reads only.
## Auto-hydration
@ -116,6 +125,8 @@ When a local issue or PR read misses the cache, the shim can auto-hydrate exactl
This keeps `gh issue view`, `gh pr view`, `gh pr checks`, and `gh run` reads cheap and fresh without manual sync orchestration. Disable with `GITCRAWL_GH_AUTO_HYDRATE=0` if you want the shim to be strictly cache-or-fallthrough.
When the configured database comes from a portable store, auto-hydration writes to the local runtime mirror, not the Git checkout. Broad empty open-issue discovery is also guarded: if `gh issue list` or empty-query `gh search issues --state open` would return no rows but the repo only has targeted sync history, the shim falls through to the real `gh` instead of treating that incomplete local snapshot as authoritative.
## Cache inspection: `xcache`
```bash
@ -124,9 +135,10 @@ gitcrawl gh xcache keys # per-entry detail
gitcrawl gh xcache gc # remove expired entries + stale lock files
gitcrawl gh xcache flush # clear everything
gitcrawl gh xcache reset # reset counters without deleting entries
gitcrawl gh xcache snapshot # write a counter snapshot for later comparison
```
All accept `--json` for scripting.
All accept `--json` for scripting. `stats` accepts `--since 1h` for recent-window counters. `snapshot` accepts `--reset` to checkpoint counters before a noisy release/debugging session.
`stats` JSON:
@ -152,6 +164,9 @@ All accept `--json` for scripting.
},
"backend_misses_by_route": {
"api repos/:owner/:repo/actions/runs/:id/logs": 3
},
"backend_misses_by_key": {
"api repos/openclaw/gitcrawl/actions/runs/123/logs -i": 2
}
},
"commands": {
@ -161,7 +176,7 @@ All accept `--json` for scripting.
}
```
`local_hits` are answered from SQLite; `fallback_hits` are answered from the fallthrough cache; `stale_hits` are expired successful cache entries served after a backend rate-limit response; `backend_misses` actually hit GitHub. The per-command and per-route miss maps show which shapes still escape the cache, which is usually the fastest way to find the next optimization.
`local_hits` are answered from SQLite; `fallback_hits` are answered from the fallthrough cache; `stale_hits` are expired successful cache entries served after a backend rate-limit response or while another process refreshes the key; `backend_misses` actually hit GitHub. The per-command, per-route, and per-key miss maps show which shapes still escape the cache, which is usually the fastest way to find the next optimization.
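As a rough illustration of how these counters relate, the sketch below derives a hit rate from `xcache stats --json` output. The JSON shape is assumed to be a `counters` object with the four fields named above, and the formula (all hits divided by hits plus backend misses) is an assumption for this example; the tool reports its own hit rate, which may be defined differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stats mirrors only the counters this sketch needs (assumed JSON shape).
type stats struct {
	Counters struct {
		LocalHits     int `json:"local_hits"`
		FallbackHits  int `json:"fallback_hits"`
		StaleHits     int `json:"stale_hits"`
		BackendMisses int `json:"backend_misses"`
	} `json:"counters"`
}

// hitRate returns the share of reads that never reached GitHub.
// The definition is an assumption made for this example.
func hitRate(raw []byte) (float64, error) {
	var s stats
	if err := json.Unmarshal(raw, &s); err != nil {
		return 0, err
	}
	c := s.Counters
	hits := c.LocalHits + c.FallbackHits + c.StaleHits
	total := hits + c.BackendMisses
	if total == 0 {
		return 0, nil // no reads recorded yet
	}
	return float64(hits) / float64(total), nil
}

func main() {
	raw := []byte(`{"counters":{"local_hits":2,"fallback_hits":1,"stale_hits":0,"backend_misses":1}}`)
	rate, err := hitRate(raw)
	fmt.Println(rate, err)
}
```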
## Cache key composition


@ -25,6 +25,7 @@ Mark a thread or a cluster as "handled locally — do not show me this again."
```bash
gitcrawl close-thread owner/repo --number 123 --reason "duplicate handled"
gitcrawl close-thread owner/repo --number https://github.com/owner/repo/issues/123 --reason "duplicate handled"
gitcrawl reopen-thread owner/repo --number 123
gitcrawl close-cluster owner/repo --id 42 --reason "all members handled"
@ -47,6 +48,7 @@ Pull a single thread out of a cluster, or pull it back in.
```bash
gitcrawl exclude-cluster-member owner/repo --id 42 --number 456 --reason "different repro"
gitcrawl exclude-cluster-member owner/repo --id 42 --number owner/repo#456 --reason "different repro"
gitcrawl include-cluster-member owner/repo --id 42 --number 456
```
@ -72,6 +74,12 @@ gitcrawl set-cluster-canonical owner/repo --id 42 --number 123 --reason "main tr
The chosen `--number` must already be a member of the cluster. The TUI's right-click menu has a "set canonical" entry that calls this command.
All governance `--number` flags accept the same thread-reference forms as sync:
bare numbers, `#123`, `issues/123`, `pull/123`, `owner/repo#123`, and full
GitHub issue or pull request URLs. The command still applies only to the
`owner/repo` argument you pass to gitcrawl; URL input is accepted so copied
GitHub links can be pasted directly.
## Reopen and undo
There is no separate `undo`. The inverse commands are explicit:


@ -9,7 +9,7 @@ permalink: /
# gitcrawl
{: .fs-9 }
A local-first GitHub issue and pull request crawler for maintainer triage. Sync, search, cluster, and review related threads from a SQLite cache that lives entirely on your machine.
A local-first GitHub triage tool **and** a drop-in caching `gh` shim. Sync issues and PRs into SQLite for search and clustering — then let agents call `gh` against that same cache so you stop burning the API rate limit.
{: .fs-6 .fw-300 }
[Quickstart](/quickstart/){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 }
@ -17,13 +17,13 @@ A local-first GitHub issue and pull request crawler for maintainer triage. Sync,
---
## What gitcrawl does
## Two jobs, one binary
`gitcrawl` mirrors a GitHub repository's issues and pull requests into local SQLite, then layers semantic clustering, full-text search, and a `gh`-compatible shim on top so a maintainer (or an agent acting on their behalf) can triage threads without burning live API quota.
`gitcrawl` mirrors a GitHub repository's issues and pull requests into local SQLite, then layers semantic clustering, full-text search, and a `gh`-compatible shim on top so a maintainer (or an agent acting on their behalf) can triage threads *and* serve everyday `gh` reads without burning live API quota.
- **Local SQLite first.** All issues, PRs, comments, reviews, files, commits, checks, and workflow runs land in `~/.config/gitcrawl/gitcrawl.db`. Queries hit the disk, not GitHub.
- **Drop-in `gh` cache.** Symlink `gitcrawl-gh` as `gh` and most read-only calls (`gh search`, `gh issue/pr view`, `gh pr checks`, `gh run`, REST GETs, GraphQL queries) answer from local SQLite. Agents stop hitting rate limits; mutating commands pass through unchanged. Run `gh xcache stats` to see hit rate, per-command misses, and evictions.
- **Semantic clustering.** OpenAI embeddings group related reports, with deterministic GitHub reference evidence (`#123`, `pull/123`) preventing weak similarity bridges from forming mega-clusters.
- **`gh`-compatible shim.** Drop `gitcrawl gh` (or symlink it as `gh`) into agent workflows and most read-only `gh search`, `gh issue/pr view`, `gh pr checks`, and `gh run` calls answer from local cache instead of the GitHub API.
- **Terminal UI.** `gitcrawl tui` is a keyboard- and mouse-driven cluster browser with bidirectional sort, jump-to-number, neighbors, and member-level governance actions.
- **Agent-friendly JSON.** Every command supports `--json` for clean automation surfaces.


@ -23,7 +23,7 @@ gitcrawl runs on macOS and Linux. Windows is not actively tested.
## Install from Homebrew
```bash
brew install steipete/tap/gitcrawl
brew install openclaw/tap/gitcrawl
```
Homebrew installs the `gitcrawl` binary. If you also want the GitHub CLI shim behavior, add a `gh` or `gitcrawl-gh` symlink as shown below.


@ -56,6 +56,7 @@ Write commands (`embed`, `refresh`, `cluster`, neighbor generation) need to pers
This separation means:
- You can `gitcrawl embed` against a portable store without dirtying the Git checkout
- gh-shim exact-thread auto-hydration writes into the same runtime mirror
- Local cluster overrides (`close-cluster`, exclusions, canonicals) live in the runtime mirror
- Only the publishing workflow writes back into the portable checkout


@ -104,17 +104,21 @@ Override the config root with `--config <path>` or `GITCRAWL_CONFIG`.
| Cache class | TTL |
| --- | --- |
| Most read-only fallthroughs | `5m`-`10m` |
| `gh run list` / run status | `2m` |
| `gh run list` / run status | `30s` |
| `gh run view --log` / `--log-failed` | `12h` |
| `gh run view --job` | `5m` |
| `gh run view --job` | `1m` |
| `gh search ...` | `15m` |
| `gh release ...` | `30m` |
| `gh api` Actions run status | `2m` |
| `gh api` Actions job lists / workflow reads | `5m` |
| `gh release ...` | `1h` |
| `gh api` Actions run status | `30s` |
| `gh api` Actions job lists | `1m` active, `12h` completed |
| `gh api` workflow reads | `15m` |
| `gh api` Actions run/job logs | `12h` |
| `gh api` Pages metadata | `15m`-`30m` |
| `gh api` tagged/SHA contents | `7d` |
| `gh pr diff` without stable head SHA | `5m` |
| `gh pr diff` with stable head SHA | `7d` |
| Override | `GITCRAWL_GH_CACHE_TTL` |
| Stale-while-revalidate grace | command-aware; override with `GITCRAWL_GH_STALE_GRACE` |
| Cache read failures | on by default; error TTL is capped (`2m` for rate-limit errors); disable with `GITCRAWL_GH_CACHE_ERRORS=0` |
## gh shim cache key composition


@ -71,11 +71,14 @@ Generates OpenAI embeddings for any thread whose document hash has changed since
| Flag | Default | Description |
| --- | --- | --- |
| `--number <n>` | _(any)_ | Embed a single issue/PR by number |
| `--number <ref>` | _(any)_ | Embed a single issue/PR by number or copied GitHub URL |
| `--limit <n>` | _(no limit)_ | Maximum rows to embed in this run |
| `--force` | _(off)_ | Re-embed every selected row, ignoring content hash |
| `--include-closed` | _(off)_ | Include closed threads |
`--number` accepts bare numbers, `#123`, `issues/123`, `pull/123`,
`owner/repo#123`, and full GitHub issue or pull request URLs.
### When to `--force`
You should rarely need it. The pipeline auto-forces a rebuild when:


@ -54,10 +54,15 @@ gitcrawl sync owner/repo --state all --since 2026-04-01T00:00:00Z
```bash
gitcrawl sync owner/repo --numbers 123,456 --include-comments
gitcrawl sync owner/repo --numbers https://github.com/owner/repo/issues/123 --with pr-details
```
`--numbers` is the safest way to refresh specific issues or PRs — it bypasses list ordering and the updated-time window, fetching exactly the rows you ask for. Pair it with `--include-comments` and/or `--include-pr-details` to hydrate the conversation and PR-only data at the same time.
`--numbers` accepts comma-separated thread references, not just integers:
`123`, `#123`, `issues/123`, `pull/123`, `owner/repo#123`, and full GitHub
issue or pull request URLs.
This is also what the `gh` shim uses internally for [auto-hydration](/gh-shim/#auto-hydration).
## Hydration depth


@ -10,6 +10,11 @@ permalink: /tui/
`gitcrawl tui` is the interactive cluster browser. Keyboard-first, mouse-friendly, refreshes from local SQLite every 15 seconds.
{: .fs-6 .fw-300 }
It is also the reference terminal interaction model for the crawl app family.
The shared `crawlkit/tui` browser used by Slack, Discord, and Notion archives
is expected to match its pane focus, sortable headers, mouse/right-click
actions, detail rendering, and status chrome wherever the data model allows it.
1. TOC
{:toc}
@ -48,7 +53,7 @@ The view auto-refreshes from the local store every 15 seconds. There is no GitHu
| `Tab` / `Shift+Tab` | Switch panes |
| `Enter` | Open detail for selected cluster or member; on a member, loads neighbors first |
| `a` | Open the action menu (cluster or member, depending on focus) |
| `#` | Jump to a specific issue or PR number |
| `#` | Jump to an issue or PR by number or by a copied GitHub issue/PR URL |
| `n` | Load neighbors for the selected issue or PR |
| `p` | Switch between repositories already present in the local store |
| `s` | Cycle sort mode (`size` ↔ `recent` ↔ `oldest`, both directions) |
@ -57,6 +62,10 @@ The view auto-refreshes from the local store every 15 seconds. There is no GitHu
The action menu opened with `a` mirrors the right-click menu, so every mouse action has a keyboard equivalent.
Jump input accepts the same thread references as the CLI: bare numbers, `#123`,
`issues/123`, `pull/123`, `owner/repo#123`, and full GitHub issue or pull
request URLs.
## Mouse
Mouse support is built in and works in most modern terminals (iTerm2, Kitty, Alacritty, WezTerm, recent macOS Terminal):

go.mod

@ -8,8 +8,7 @@ require (
github.com/charmbracelet/lipgloss v1.1.1-0.20250404203927-76690c660834
github.com/charmbracelet/x/ansi v0.11.7
github.com/mattn/go-isatty v0.0.22
github.com/pelletier/go-toml/v2 v2.3.1
modernc.org/sqlite v1.50.0
github.com/vincentkoc/crawlkit v0.4.1
)
require (
@ -30,6 +29,7 @@ require (
github.com/muesli/cancelreader v0.2.2 // indirect
github.com/muesli/termenv v0.16.0 // indirect
github.com/ncruces/go-strftime v1.0.0 // indirect
github.com/pelletier/go-toml/v2 v2.3.1 // indirect
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
github.com/rivo/uniseg v0.4.7 // indirect
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e // indirect
@ -38,4 +38,5 @@ require (
modernc.org/libc v1.72.1 // indirect
modernc.org/mathutil v1.7.1 // indirect
modernc.org/memory v1.11.0 // indirect
modernc.org/sqlite v1.50.0 // indirect
)

go.sum

@ -56,6 +56,8 @@ github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec h1:W09IVJc94
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec/go.mod h1:qqbHyh8v60DhA7CoWK5oRCqLrMHRGoxYCSS9EjAz6Eo=
github.com/rivo/uniseg v0.4.7 h1:WUdvkW8uEhrYfLC4ZzdpI2ztxP1I582+49Oc5Mq64VQ=
github.com/rivo/uniseg v0.4.7/go.mod h1:FN3SvrM+Zdj16jyLfmOkMNblXMcoc8DfTHruCPUcx88=
github.com/vincentkoc/crawlkit v0.4.1 h1:qDUF+Kk7nqADmpGMcnWTHEQMiX3bSD2DdFywKyT3kWs=
github.com/vincentkoc/crawlkit v0.4.1/go.mod h1:/ioLA/tyZ/927kAOGg0M8Mrqk7pnTZLpCKWfpul9zoE=
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e h1:JVG44RsyaB9T2KIHavMF/ppJZNG9ZpyihvCd0w101no=
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e/go.mod h1:RbqR21r5mrJuqunuUZ/Dhy/avygyECGrLceyNeo4LiM=
golang.org/x/exp v0.0.0-20231006140011-7918f672742d h1:jtJma62tbqLibJ5sFQz8bKtEM8rJBtfilJ2qTU199MI=


@ -7,6 +7,7 @@ import (
"flag"
"fmt"
"io"
"log/slog"
"os"
"os/exec"
"path/filepath"
@ -23,6 +24,7 @@ import (
"github.com/openclaw/gitcrawl/internal/store"
"github.com/openclaw/gitcrawl/internal/syncer"
"github.com/openclaw/gitcrawl/internal/vector"
"github.com/vincentkoc/crawlkit/control"
)
const (
@ -39,6 +41,9 @@ const (
)
var threadReferencePattern = regexp.MustCompile(`(?i)(?:\b[\w.-]+/[\w.-]+#(\d+)|(?:issues|pull)/(\d+)|#(\d{2,}))`)
var githubThreadURLPattern = regexp.MustCompile(`(?i)^https?://github\.com/([\w.-]+)/([\w.-]+)/(?:issues|pull)/(\d+)(?:[/?#].*)?$`)
var ownerRepoThreadPattern = regexp.MustCompile(`(?i)^([\w.-]+)/([\w.-]+)#(\d+)$`)
var pathThreadPattern = regexp.MustCompile(`(?i)(?:^|/)(?:issues|pull)/(\d+)(?:[/?#].*)?$`)
var titleTokenPattern = regexp.MustCompile(`[A-Za-z0-9]{4,}`)
type referenceEvidence struct {
@ -124,12 +129,16 @@ func (a *App) Run(ctx context.Context, args []string) error {
switch rest[0] {
case "version":
return a.writeOutput("version", map[string]string{"version": version}, false)
case "metadata":
return a.runMetadata(rest[1:])
case "serve":
return usageErr(fmt.Errorf("serve is not supported in gitcrawl"))
case "init":
return a.runInit(ctx, rest[1:])
case "doctor":
return a.runDoctor(ctx, rest[1:])
case "status":
return a.runStatus(ctx, rest[1:])
case "sync":
return a.runSync(ctx, rest[1:])
case "threads":
@ -455,7 +464,7 @@ func (a *App) runNeighbors(ctx context.Context, args []string) error {
if err != nil {
return usageErr(err)
}
number, err := parseRequiredPositiveInt("number", *numberRaw)
number, err := parseRequiredThreadNumber("number", *numberRaw)
if err != nil {
return usageErr(err)
}
@ -690,7 +699,7 @@ func (a *App) runEmbed(ctx context.Context, args []string) error {
if err != nil {
return usageErr(err)
}
number, err := parseOptionalPositiveInt(*numberRaw)
number, err := parseOptionalThreadNumber(*numberRaw)
if err != nil {
return usageErr(err)
}
@ -1077,23 +1086,35 @@ func (a *App) runTUI(ctx context.Context, args []string) error {
rt, err = a.openLocalRuntimeReadOnly(ctx)
}
if err != nil {
if !interactive && errors.Is(err, os.ErrNotExist) {
cfg := config.Default()
if cfgErr := cfg.Normalize(); cfgErr != nil {
return cfgErr
}
sort, sortErr := resolveTUISort(*sortMode, cfg)
if sortErr != nil {
return sortErr
}
return a.writeOutput("tui", emptyClusterBrowserPayload(ctx, cfg, cfg.DBPath, sort, minSize, limit, *hideClosed), true)
}
return err
}
defer rt.Store.Close()
repo, inferred, err := a.resolveOptionalRepository(ctx, rt, fs.Args())
if err != nil {
if !interactive && len(fs.Args()) == 0 && strings.Contains(err.Error(), "no local repositories found") {
sort, sortErr := resolveTUISort(*sortMode, rt.Config)
if sortErr != nil {
return sortErr
}
return a.writeOutput("tui", emptyClusterBrowserPayload(ctx, rt.Config, rt.SourceDBPath, sort, minSize, limit, *hideClosed), true)
}
return err
}
sort := strings.TrimSpace(*sortMode)
if sort == "" {
sort = strings.TrimSpace(rt.Config.TUI.DefaultSort)
}
if sort == "" {
sort = "size"
}
if sort != "recent" && sort != "oldest" && sort != "size" {
return usageErr(fmt.Errorf("unsupported sort %q", sort))
sort, err := resolveTUISort(*sortMode, rt.Config)
if err != nil {
return err
}
showClosed := !*hideClosed || *includeClosed
@ -1148,6 +1169,38 @@ func (a *App) runTUI(ctx context.Context, args []string) error {
return a.runInteractiveTUI(ctx, rt.Store, repo.ID, payload)
}
func resolveTUISort(raw string, cfg config.Config) (string, error) {
sort := strings.TrimSpace(raw)
if sort == "" {
sort = strings.TrimSpace(cfg.TUI.DefaultSort)
}
if sort == "" {
sort = "size"
}
if sort != "recent" && sort != "oldest" && sort != "size" {
return "", usageErr(fmt.Errorf("unsupported sort %q", sort))
}
return sort, nil
}
func emptyClusterBrowserPayload(ctx context.Context, cfg config.Config, sourceDBPath, sort string, minSize, limit int, hideClosed bool) clusterBrowserPayload {
if strings.TrimSpace(sourceDBPath) == "" {
sourceDBPath = cfg.DBPath
}
return clusterBrowserPayload{
Mode: "cluster-browser",
DBSource: databaseSourceKind(sourceDBPath),
DBLocation: databaseSourceLocation(ctx, sourceDBPath),
Sort: sort,
MinSize: minSize,
Limit: limit,
HideClosed: hideClosed,
EmbedModel: cfg.OpenAI.EmbedModel,
EmbeddingBasis: cfg.EmbeddingBasis,
Clusters: []store.ClusterSummary{},
}
}
func databaseSourceKind(dbPath string) string {
if _, ok := portableStoreRoot(dbPath); ok {
return "remote"
@ -1252,7 +1305,8 @@ func (a *App) runClusterDetail(ctx context.Context, args []string) error {
clusterIDRaw := fs.String("id", "", "cluster id")
memberLimitRaw := fs.String("member-limit", "", "maximum member rows")
bodyCharsRaw := fs.String("body-chars", "", "maximum body snippet characters")
includeClosed := fs.Bool("include-closed", false, "include closed clusters and members")
includeClosed := fs.Bool("include-closed", false, "deprecated; closed cluster members are shown by default")
hideClosed := fs.Bool("hide-closed", false, "hide locally closed members")
jsonOut := fs.Bool("json", false, "write JSON output")
if err := fs.Parse(normalizeCommandArgs(args, map[string]bool{"id": true, "member-limit": true, "body-chars": true})); err != nil {
return usageErr(err)
@ -1293,7 +1347,7 @@ func (a *App) runClusterDetail(ctx context.Context, args []string) error {
detail, err := rt.Store.ClusterDetail(ctx, store.ClusterDetailOptions{
RepoID: repo.ID,
ClusterID: int64(clusterID),
IncludeClosed: *includeClosed,
IncludeClosed: *includeClosed || !*hideClosed,
MemberLimit: memberLimit,
BodyChars: bodyChars,
})
@ -1368,7 +1422,7 @@ func (a *App) runThreads(ctx context.Context, args []string) error {
if err != nil {
return usageErr(err)
}
numbers, err := parseOptionalPositiveIntList(*numbersRaw)
numbers, err := parseOptionalThreadNumberList(*numbersRaw)
if err != nil {
return usageErr(err)
}
@ -1419,7 +1473,7 @@ func (a *App) runCloseThread(ctx context.Context, args []string) error {
if err != nil {
return usageErr(err)
}
number, err := parseOptionalPositiveInt(*numberRaw)
number, err := parseOptionalThreadNumber(*numberRaw)
if err != nil {
return usageErr(err)
}
@ -1464,7 +1518,7 @@ func (a *App) runReopenThread(ctx context.Context, args []string) error {
if err != nil {
return usageErr(err)
}
number, err := parseOptionalPositiveInt(*numberRaw)
number, err := parseOptionalThreadNumber(*numberRaw)
if err != nil {
return usageErr(err)
}
@ -1735,7 +1789,7 @@ func (a *App) runSync(ctx context.Context, args []string) error {
if err != nil {
return usageErr(err)
}
numbers, err := parseOptionalPositiveIntList(*numbersRaw)
numbers, err := parseOptionalThreadNumberList(*numbersRaw)
if err != nil {
return usageErr(err)
}
@ -1796,14 +1850,14 @@ func (a *App) syncRepository(ctx context.Context, owner, repo string, options sy
if err := config.EnsureRuntimeDirs(cfg); err != nil {
return syncer.Stats{}, err
}
st, err := store.Open(ctx, cfg.DBPath)
rt, err := a.openLocalRuntime(ctx)
if err != nil {
return syncer.Stats{}, err
}
defer st.Close()
defer rt.Store.Close()
client := gh.New(gh.Options{Token: token.Value, BaseURL: githubBaseURL()})
service := syncer.New(client, st)
service := syncer.New(client, rt.Store)
stats, err := service.Sync(ctx, syncer.Options{
Owner: owner,
Repo: repo,
@ -1816,6 +1870,7 @@ func (a *App) syncRepository(ctx context.Context, owner, repo string, options sy
Reporter: func(message string) {
fmt.Fprintln(a.Stderr, message)
},
Logger: progressLogger(a.Stderr),
})
if err != nil {
return syncer.Stats{}, err
@ -1823,6 +1878,17 @@ func (a *App) syncRepository(ctx context.Context, owner, repo string, options sy
return stats, nil
}
func progressLogger(w io.Writer) *slog.Logger {
return slog.New(slog.NewTextHandler(w, &slog.HandlerOptions{
ReplaceAttr: func(_ []string, attr slog.Attr) slog.Attr {
if attr.Key == slog.TimeKey {
return slog.Attr{}
}
return attr
},
}))
}
func (a *App) runInit(ctx context.Context, args []string) error {
fs := flag.NewFlagSet("init", flag.ContinueOnError)
fs.SetOutput(io.Discard)
@ -1887,6 +1953,8 @@ func (a *App) runPortable(ctx context.Context, args []string) error {
return usageErr(fmt.Errorf("portable requires a subcommand"))
}
switch args[0] {
case "help", "--help", "-h":
return a.printCommandUsage("portable")
case "prune":
return a.runPortablePrune(ctx, args[1:])
default:
@ -2197,6 +2265,113 @@ func (a *App) runDoctor(ctx context.Context, args []string) error {
}, true)
}
func (a *App) runMetadata(args []string) error {
fs := flag.NewFlagSet("metadata", flag.ContinueOnError)
fs.SetOutput(io.Discard)
jsonOut := fs.Bool("json", false, "write JSON output")
if err := fs.Parse(normalizeCommandArgs(args, nil)); err != nil {
return usageErr(err)
}
a.applyCommandJSON(*jsonOut)
if fs.NArg() != 0 {
return usageErr(fmt.Errorf("metadata takes flags only"))
}
cfg := config.Default()
manifest := control.NewManifest("gitcrawl", "Git Crawl", "gitcrawl")
manifest.Description = "Local-first GitHub issue and pull request crawler."
manifest.Branding = control.Branding{SymbolName: "point.3.connected.trianglepath.dotted", AccentColor: "#2da44e"}
manifest.Paths = control.Paths{
DefaultConfig: config.ResolvePath(""),
ConfigEnv: config.DefaultConfigEnv,
DefaultDatabase: cfg.DBPath,
DefaultCache: cfg.CacheDir,
DefaultLogs: cfg.LogDir,
}
manifest.Capabilities = []string{"metadata", "status", "doctor", "sync", "search", "tui", "portable", "clusters", "embeddings"}
manifest.Privacy = control.Privacy{ContainsPrivateMessages: false, ExportsSecrets: false, LocalOnlyScopes: []string{"github", "sqlite", "portable"}}
manifest.Commands = map[string]control.Command{
"status": {Title: "Status", Argv: []string{"gitcrawl", "status", "--json"}, JSON: true},
"doctor": {Title: "Doctor", Argv: []string{"gitcrawl", "doctor", "--json"}, JSON: true},
"sync": {Title: "Sync repository", Argv: []string{"gitcrawl", "sync", "--json"}, JSON: true, Mutates: true},
"search": {Title: "Search", Argv: []string{"gitcrawl", "search", "--json"}, JSON: true},
"tui": {Title: "Terminal cluster browser", Argv: []string{"gitcrawl", "tui"}},
"tui-json": {Title: "Terminal cluster data", Argv: []string{"gitcrawl", "tui", "--json"}, JSON: true},
"portable": {Title: "Portable store tools", Argv: []string{"gitcrawl", "portable", "prune", "--json"}, JSON: true, Mutates: true},
"clusters": {Title: "Clusters", Argv: []string{"gitcrawl", "clusters", "--json"}, JSON: true},
"legacy-sync-api": {Title: "Legacy sync-status alias", Argv: []string{"gitcrawl", "sync-status"}, Legacy: true, Deprecated: true},
}
return a.writeOutput("metadata", manifest, false)
}
func (a *App) runStatus(ctx context.Context, args []string) error {
fs := flag.NewFlagSet("status", flag.ContinueOnError)
fs.SetOutput(io.Discard)
jsonOut := fs.Bool("json", false, "write JSON output")
if err := fs.Parse(normalizeCommandArgs(args, nil)); err != nil {
return usageErr(err)
}
a.applyCommandJSON(*jsonOut)
if fs.NArg() != 0 {
return usageErr(fmt.Errorf("status takes flags only"))
}
cfg, err := config.Load(a.configPath)
if err != nil {
if !errors.Is(err, os.ErrNotExist) {
return err
}
cfg = config.Default()
if err := cfg.Normalize(); err != nil {
return err
}
}
status := store.Status{DBPath: cfg.DBPath}
if _, err := os.Stat(cfg.DBPath); err == nil {
st, err := store.OpenReadOnly(ctx, cfg.DBPath)
if err != nil {
return err
}
defer st.Close()
status, err = st.Status(ctx)
if err != nil {
return err
}
} else if !errors.Is(err, os.ErrNotExist) {
return err
}
status.DBPath = cfg.DBPath
return a.writeOutput("status", controlStatus(config.ResolvePath(a.configPath), cfg, status), false)
}
func controlStatus(configPath string, cfg config.Config, status store.Status) control.Status {
counts := []control.Count{
control.NewCount("repositories", "Repositories", int64(status.RepositoryCount)),
control.NewCount("threads", "Threads", int64(status.ThreadCount)),
control.NewCount("open_threads", "Open threads", int64(status.OpenThreadCount)),
control.NewCount("clusters", "Clusters", int64(status.ClusterCount)),
}
out := control.NewStatus("gitcrawl", fmt.Sprintf("%d threads across %d repositories", status.ThreadCount, status.RepositoryCount))
out.State = "current"
out.ConfigPath = configPath
out.DatabasePath = status.DBPath
out.Counts = counts
if !status.LastSyncAt.IsZero() {
out.LastSyncAt = status.LastSyncAt.UTC().Format(time.RFC3339)
}
db := control.SQLiteDatabase("primary", "GitHub archive", "archive", status.DBPath, true, counts)
out.DatabaseBytes = db.Bytes
out.WALBytes = fileSize(status.DBPath + "-wal")
out.Databases = []control.Database{db}
return out
}
func fileSize(path string) int64 {
info, err := os.Stat(path)
if err != nil {
return 0
}
return info.Size()
}
func (a *App) applyCommandJSON(enabled bool) {
if enabled {
a.format = FormatJSON
@ -2227,6 +2402,9 @@ func resolveOutputFormat(value string, jsonOut bool) (OutputFormat, error) {
}
func parseOwnerRepo(value string) (string, string, error) {
if ref, ok := parseThreadReference(value); ok && ref.Owner != "" && ref.Repo != "" {
return ref.Owner, ref.Repo, nil
}
parts := strings.Split(value, "/")
if len(parts) != 2 || strings.TrimSpace(parts[0]) == "" || strings.TrimSpace(parts[1]) == "" {
return "", "", fmt.Errorf("expected owner/repo, got %q", value)
@ -2234,6 +2412,60 @@ func parseOwnerRepo(value string) (string, string, error) {
return strings.TrimSpace(parts[0]), strings.TrimSpace(parts[1]), nil
}
type threadReference struct {
Owner string
Repo string
Number int
}
func (ref threadReference) FullName() string {
if ref.Owner == "" || ref.Repo == "" {
return ""
}
return ref.Owner + "/" + ref.Repo
}
func parseThreadReference(value string) (threadReference, bool) {
value = strings.TrimSpace(value)
value = strings.Trim(value, "<>()[]{}\"'`")
value = strings.TrimRight(value, ".,;")
if value == "" {
return threadReference{}, false
}
if number, ok := parsePositiveIntLiteral(value); ok {
return threadReference{Number: number}, true
}
if strings.HasPrefix(value, "#") {
if number, ok := parsePositiveIntLiteral(strings.TrimPrefix(value, "#")); ok {
return threadReference{Number: number}, true
}
}
if match := githubThreadURLPattern.FindStringSubmatch(value); match != nil {
if number, ok := parsePositiveIntLiteral(match[3]); ok {
return threadReference{Owner: match[1], Repo: match[2], Number: number}, true
}
}
if match := ownerRepoThreadPattern.FindStringSubmatch(value); match != nil {
if number, ok := parsePositiveIntLiteral(match[3]); ok {
return threadReference{Owner: match[1], Repo: match[2], Number: number}, true
}
}
if match := pathThreadPattern.FindStringSubmatch(value); match != nil {
if number, ok := parsePositiveIntLiteral(match[1]); ok {
return threadReference{Number: number}, true
}
}
return threadReference{}, false
}
func parsePositiveIntLiteral(value string) (int, bool) {
if !isDecimalString(value) {
return 0, false
}
number, err := strconv.Atoi(value)
return number, err == nil && number > 0
}
func parseOptionalPositiveInt(value string) (int, error) {
if strings.TrimSpace(value) == "" {
return 0, nil
@ -2256,6 +2488,28 @@ func parseRequiredPositiveInt(name, value string) (int, error) {
return parsed, nil
}
func parseOptionalThreadNumber(value string) (int, error) {
if strings.TrimSpace(value) == "" {
return 0, nil
}
ref, ok := parseThreadReference(value)
if !ok || ref.Number <= 0 {
return 0, fmt.Errorf("expected positive issue or pull request number, got %q", value)
}
return ref.Number, nil
}
func parseRequiredThreadNumber(name, value string) (int, error) {
parsed, err := parseOptionalThreadNumber(value)
if err != nil {
return 0, err
}
if parsed == 0 {
return 0, fmt.Errorf("missing --%s", name)
}
return parsed, nil
}
func parseClusterMemberCommandIDs(command, clusterIDRaw, numberRaw string) (int, int, error) {
clusterID, err := parseOptionalPositiveInt(clusterIDRaw)
if err != nil {
@ -2264,7 +2518,7 @@ func parseClusterMemberCommandIDs(command, clusterIDRaw, numberRaw string) (int,
if clusterID == 0 {
return 0, 0, fmt.Errorf("%s requires --id", command)
}
number, err := parseOptionalPositiveInt(numberRaw)
number, err := parseOptionalThreadNumber(numberRaw)
if err != nil {
return 0, 0, err
}
@ -2619,6 +2873,22 @@ func parseOptionalPositiveIntList(value string) ([]int, error) {
return out, nil
}
func parseOptionalThreadNumberList(value string) ([]int, error) {
if strings.TrimSpace(value) == "" {
return nil, nil
}
parts := strings.Split(value, ",")
out := make([]int, 0, len(parts))
for _, part := range parts {
parsed, err := parseOptionalThreadNumber(strings.TrimSpace(part))
if err != nil {
return nil, err
}
out = append(out, parsed)
}
return out, nil
}
func (a *App) writeOutput(title string, payload any, allowLog bool) error {
switch a.format {
case FormatJSON:
@ -2682,7 +2952,17 @@ func (a *App) printUsage() {
}
func (a *App) printCommandUsage(command string) error {
if text, ok := commandUsageTexts[command]; ok {
fmt.Fprint(a.Stdout, text)
return nil
}
switch command {
case "cluster-explain":
fmt.Fprint(a.Stdout, commandUsageTexts["cluster-detail"])
return nil
case "portable":
fmt.Fprint(a.Stdout, portableUsageText)
return nil
case "tui":
fmt.Fprint(a.Stdout, tuiUsageText)
return nil
@ -2704,10 +2984,13 @@ Global flags:
--version print version
Core commands:
metadata print crawlkit control metadata
status print fast read-only archive status
init create config, optionally from a portable store
doctor check config, token, and database readiness
sync sync GitHub issue and pull request metadata
refresh run sync, enrichment, embedding, and clustering pipeline
embed generate OpenAI embeddings for local thread documents
threads list local issue and pull request rows
cluster build durable clusters from local thread vectors
close-thread locally hide one issue or pull request row
@ -2733,6 +3016,131 @@ Core commands:
No API server is provided. There is intentionally no serve command.
`
var commandUsageTexts = map[string]string{
"metadata": `gitcrawl metadata prints crawlkit control metadata.
Usage:
gitcrawl metadata [--json]
`,
"status": `gitcrawl status prints fast read-only archive status.
Usage:
gitcrawl status [--json]
`,
"init": `gitcrawl init creates a local config and SQLite database.
Usage:
gitcrawl init [--db path] [--portable-store URL] [--json]
`,
"configure": `gitcrawl configure updates model fields in the config.
Usage:
gitcrawl configure [--summary-model name] [--embed-model name] [--embedding-basis title_original] [--json]
`,
"doctor": `gitcrawl doctor checks config, token, and database readiness.
Usage:
gitcrawl doctor [--json]
`,
"sync": `gitcrawl sync mirrors GitHub issue and pull request metadata.
Usage:
gitcrawl sync owner/repo [--state open|closed|all] [--numbers refs] [--with pr-details] [--include-pr-details] [--json]
`,
"refresh": `gitcrawl refresh runs sync, enrichment, embedding, and clustering.
Usage:
gitcrawl refresh owner/repo [--state open|closed|all] [--sync-if-stale duration] [--no-sync] [--no-embed] [--no-cluster] [--json]
`,
"embed": `gitcrawl embed generates OpenAI embeddings for local thread documents.
Usage:
gitcrawl embed owner/repo [--number ref] [--limit N] [--force] [--include-closed] [--json]
`,
"threads": `gitcrawl threads lists local issue and pull request rows.
Usage:
gitcrawl threads owner/repo [--include-closed] [--numbers refs] [--limit N] [--json]
`,
"search": `gitcrawl search queries local thread documents, or accepts gh-shaped issue and PR search.
Usage:
gitcrawl search owner/repo --query text [--mode keyword|semantic] [--limit N] [--json]
gitcrawl search issues|prs <query> -R owner/repo [--state open|closed|all] [--json fields] [--limit N]
`,
"cluster": `gitcrawl cluster builds durable clusters from local thread vectors.
Usage:
gitcrawl cluster owner/repo [--threshold N] [--min-size N] [--max-cluster-size N] [--k N] [--cross-kind-threshold N] [--limit N] [--model name] [--basis semantic|references|hybrid] [--include-closed] [--json]
`,
"clusters": `gitcrawl clusters lists latest display clusters with durable fallback.
Usage:
gitcrawl clusters owner/repo [--sort size|recent|oldest] [--min-size N] [--limit N] [--hide-closed] [--json]
`,
"durable-clusters": `gitcrawl durable-clusters lists governed durable cluster groups.
Usage:
gitcrawl durable-clusters owner/repo [--include-closed] [--sort size|recent|oldest] [--min-size N] [--limit N] [--json]
`,
"cluster-detail": `gitcrawl cluster-detail dumps one cluster and its member rows.
Usage:
gitcrawl cluster-detail owner/repo --id N [--member-limit N] [--body-chars N] [--hide-closed] [--json]
`,
"neighbors": `gitcrawl neighbors lists vector-nearest local issue and pull request rows.
Usage:
gitcrawl neighbors owner/repo --number ref [--limit N] [--json]
`,
"runs": `gitcrawl runs lists local pipeline run history.
Usage:
gitcrawl runs owner/repo [--kind sync|summary|embedding|cluster] [--limit N] [--json]
`,
"close-thread": `gitcrawl close-thread locally hides one issue or pull request row.
Usage:
gitcrawl close-thread owner/repo --number ref [--reason text] [--json]
`,
"reopen-thread": `gitcrawl reopen-thread clears a local thread hide.
Usage:
gitcrawl reopen-thread owner/repo --number ref [--json]
`,
"close-cluster": `gitcrawl close-cluster locally hides one durable cluster.
Usage:
gitcrawl close-cluster owner/repo --id N [--reason text] [--json]
`,
"reopen-cluster": `gitcrawl reopen-cluster clears a local cluster hide.
Usage:
gitcrawl reopen-cluster owner/repo --id N [--json]
`,
"exclude-cluster-member": `gitcrawl exclude-cluster-member locally removes one row from a durable cluster.
Usage:
gitcrawl exclude-cluster-member owner/repo --id N --number ref [--reason text] [--json]
`,
"include-cluster-member": `gitcrawl include-cluster-member restores one row to a durable cluster.
Usage:
gitcrawl include-cluster-member owner/repo --id N --number ref [--json]
`,
"set-cluster-canonical": `gitcrawl set-cluster-canonical sets the canonical row for a durable cluster.
Usage:
gitcrawl set-cluster-canonical owner/repo --id N --number ref [--reason text] [--json]
`,
"gh": `gitcrawl gh runs a gh-compatible local cache shim with fallback to real gh.
Usage:
gitcrawl gh <gh command>
gitcrawl gh xcache stats|keys|gc|flush|reset|snapshot [--json]
`,
}
const tuiUsageText = `gitcrawl tui opens the local terminal cluster browser.
Usage:
@ -2748,3 +3156,12 @@ Press n to load neighbors for the selected issue or PR.
Enter from the members pane also loads neighbors before opening detail.
The TUI quietly refreshes from the local store every 15 seconds and leaves the current status alone when nothing changed.
`
const portableUsageText = `gitcrawl portable manages local portable-store snapshots.
Usage:
gitcrawl portable prune [--body-chars N] [--no-vacuum] [--json]
Subcommands:
prune prune volatile payloads from the configured portable store
`
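
The `progressLogger` helper added to the sync path earlier in this diff builds a `log/slog` text handler whose `ReplaceAttr` hook drops the timestamp, so progress lines written to stderr stay short. Below is a minimal standalone sketch of the same technique. `render` is a hypothetical helper added here only to capture the output as a string; it is not part of the change.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
)

// progressLogger mirrors the helper in the diff: a text handler that omits
// the built-in time attribute.
func progressLogger(w io.Writer) *slog.Logger {
	return slog.New(slog.NewTextHandler(w, &slog.HandlerOptions{
		ReplaceAttr: func(_ []string, attr slog.Attr) slog.Attr {
			if attr.Key == slog.TimeKey {
				return slog.Attr{} // a zero Attr tells the handler to skip it
			}
			return attr
		},
	}))
}

// render is a hypothetical helper that logs one record into a buffer and
// returns the formatted line.
func render(msg string, args ...any) string {
	var buf bytes.Buffer
	progressLogger(&buf).Info(msg, args...)
	return buf.String()
}

func main() {
	fmt.Print(render("synced", "threads", 42))
	// prints: level=INFO msg=synced threads=42
}
```

Because the hook compares only the key, a caller-supplied attribute named `time` would be dropped as well. That is harmless for progress output, but worth keeping in mind if the logger is reused elsewhere.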


@ -4,6 +4,7 @@ import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"net/http"
"net/http/httptest"
@ -58,6 +59,181 @@ func TestInitDefaultOutputIsHumanReadable(t *testing.T) {
}
}
func TestMetadataStatusAndControlStatusJSON(t *testing.T) {
ctx := context.Background()
dir := t.TempDir()
configPath := filepath.Join(dir, "config.toml")
dbPath := filepath.Join(dir, "gitcrawl.db")
init := New()
if err := init.Run(ctx, []string{"--config", configPath, "init", "--db", dbPath}); err != nil {
t.Fatalf("init: %v", err)
}
if err := os.WriteFile(dbPath+"-wal", []byte("wal"), 0o600); err != nil {
t.Fatalf("write wal: %v", err)
}
for _, tc := range []struct {
name string
args []string
want string
}{
{name: "metadata", args: []string{"--config", configPath, "metadata", "--json"}, want: "commands"},
{name: "status", args: []string{"--config", configPath, "status", "--json"}, want: "databases"},
{name: "status missing config", args: []string{"--config", filepath.Join(dir, "missing.toml"), "status", "--json"}, want: "counts"},
} {
t.Run(tc.name, func(t *testing.T) {
app := New()
var stdout bytes.Buffer
app.Stdout = &stdout
if err := app.Run(ctx, tc.args); err != nil {
t.Fatalf("run %s: %v", tc.name, err)
}
var payload map[string]any
if err := json.Unmarshal(stdout.Bytes(), &payload); err != nil {
t.Fatalf("decode %s output %q: %v", tc.name, stdout.String(), err)
}
if payload["app_id"] != "gitcrawl" && payload["id"] != "gitcrawl" {
t.Fatalf("expected gitcrawl payload, got %#v", payload)
}
if _, ok := payload[tc.want]; !ok {
t.Fatalf("expected %s in %#v", tc.want, payload)
}
})
}
cfg, err := config.Load(configPath)
if err != nil {
t.Fatalf("load config: %v", err)
}
sizePath := filepath.Join(dir, "sized.db")
if err := os.WriteFile(sizePath, []byte("db"), 0o600); err != nil {
t.Fatalf("write sized db: %v", err)
}
if err := os.WriteFile(sizePath+"-wal", []byte("wal"), 0o600); err != nil {
t.Fatalf("write sized wal: %v", err)
}
lastSync := time.Unix(100, 0)
out := controlStatus(configPath, cfg, store.Status{
DBPath: sizePath,
RepositoryCount: 2,
ThreadCount: 3,
OpenThreadCount: 1,
ClusterCount: 4,
LastSyncAt: lastSync,
})
if out.DatabaseBytes == 0 {
t.Fatalf("database bytes should be populated: %#v", out)
}
if out.WALBytes != 3 {
t.Fatalf("wal bytes = %d, want 3", out.WALBytes)
}
if out.LastSyncAt != lastSync.UTC().Format(time.RFC3339) {
t.Fatalf("last sync = %q", out.LastSyncAt)
}
if len(out.Databases) != 1 || out.Databases[0].Path != sizePath || !out.Databases[0].IsPrimary {
t.Fatalf("database metadata = %#v", out.Databases)
}
if got := fileSize(filepath.Join(dir, "missing.db")); got != 0 {
t.Fatalf("missing file size = %d, want 0", got)
}
var helpOut bytes.Buffer
help := New()
help.Stdout = &helpOut
if err := help.printCommandUsage("portable"); err != nil {
t.Fatalf("portable help: %v", err)
}
if !strings.Contains(helpOut.String(), "portable") {
t.Fatalf("portable help output = %q", helpOut.String())
}
helpOut.Reset()
if err := help.printCommandUsage("tui"); err != nil {
t.Fatalf("tui help: %v", err)
}
if !strings.Contains(helpOut.String(), "cluster browser") {
t.Fatalf("tui help output = %q", helpOut.String())
}
for _, topic := range []string{"metadata", "status", "init", "configure", "doctor", "sync", "refresh", "embed", "threads", "search", "cluster", "clusters", "durable-clusters", "cluster-detail", "cluster-explain", "neighbors", "runs", "close-thread", "reopen-thread", "close-cluster", "reopen-cluster", "exclude-cluster-member", "include-cluster-member", "set-cluster-canonical", "gh"} {
helpOut.Reset()
if err := help.printCommandUsage(topic); err != nil {
t.Fatalf("%s help: %v", topic, err)
}
if !strings.Contains(helpOut.String(), "Usage:") {
t.Fatalf("%s help output = %q", topic, helpOut.String())
}
}
if err := New().Run(ctx, []string{"--config", configPath, "status", "extra"}); err == nil {
t.Fatal("status extra arg should fail")
}
}
func TestControlRepositoryAndClusterHelperBranches(t *testing.T) {
ctx := context.Background()
dir := t.TempDir()
cfg := config.Default()
cfg.DBPath = filepath.Join(dir, "gitcrawl.db")
payload := emptyClusterBrowserPayload(ctx, cfg, "", "recent", 2, 50, true)
if payload.DBSource != "local" || payload.DBLocation != "gitcrawl.db" {
t.Fatalf("empty payload source = %s/%s", payload.DBSource, payload.DBLocation)
}
if payload.Sort != "recent" || payload.MinSize != 2 || payload.Limit != 50 || !payload.HideClosed {
t.Fatalf("empty payload options = %#v", payload)
}
rt := localRuntime{Config: cfg}
if got := remoteRefreshSource(rt); got != "" {
t.Fatalf("local refresh source = %q", got)
}
if got := remoteRuntimePath(rt); got != "" {
t.Fatalf("local runtime path = %q", got)
}
rt.RemoteSource = true
rt.SourceDBPath = filepath.Join(dir, "store", "data", "archive.db")
if got := remoteRefreshSource(rt); got != rt.SourceDBPath {
t.Fatalf("remote refresh source = %q", got)
}
if got := remoteRuntimePath(rt); got != cfg.DBPath {
t.Fatalf("remote runtime path = %q", got)
}
if got := githubRepoFromRemote("git@github.com:openclaw/gitcrawl-store.git"); got != "openclaw/gitcrawl-store" {
t.Fatalf("ssh remote repo = %q", got)
}
if got := githubRepoFromRemote("https://github.com/openclaw/gitcrawl-store.git"); got != "openclaw/gitcrawl-store" {
t.Fatalf("https remote repo = %q", got)
}
if got := githubRepoFromRemote("ssh://git@github.com/openclaw/gitcrawl-store.git"); got != "openclaw/gitcrawl-store" {
t.Fatalf("ssh url remote repo = %q", got)
}
if got := githubRepoFromRemote("https://example.com/openclaw/gitcrawl-store.git"); got != "" {
t.Fatalf("non-github remote repo = %q", got)
}
if got := githubRepoFromRemote("https://github.com/openclaw"); got != "" {
t.Fatalf("short github remote repo = %q", got)
}
with, err := parseSyncWith(" pr-details, ")
if err != nil || !with["pr-details"] {
t.Fatalf("parse sync with = %#v, %v", with, err)
}
if _, err := parseSyncWith("reviews"); err == nil {
t.Fatal("unsupported sync --with value should fail")
}
maxSize, fanout, crossKind, err := parseClusterShapeOptions("cluster", "", "", "")
if err != nil {
t.Fatalf("default cluster shape: %v", err)
}
if maxSize != defaultClusterMaxSize || fanout != defaultClusterFanout || crossKind != defaultCrossKindMinScore {
t.Fatalf("default cluster shape = %d/%d/%f", maxSize, fanout, crossKind)
}
if _, _, _, err := parseClusterShapeOptions("cluster", "2", "3", "1.5"); err == nil {
t.Fatal("out-of-range cross-kind threshold should fail")
}
if !stateIncludesClosed("all") || !stateIncludesClosed(" closed ") || stateIncludesClosed("open") {
t.Fatal("state closed helper mismatch")
}
}
func TestInitRejectsDBAndPortableStore(t *testing.T) {
dir := t.TempDir()
app := New()
@ -763,6 +939,18 @@ func TestAppOutputModesAndUsageBranches(t *testing.T) {
if _, err := parseOptionalPositiveIntList("1, 0"); err == nil {
t.Fatal("bad int list should fail")
}
if owner, repo, err := parseOwnerRepo("https://github.com/openclaw/openclaw/issues/78601"); err != nil || owner != "openclaw" || repo != "openclaw" {
t.Fatalf("full issue URL owner/repo = %q/%q err=%v", owner, repo, err)
}
if got, err := parseOptionalThreadNumber("https://github.com/openclaw/openclaw/issues/78601"); err != nil || got != 78601 {
t.Fatalf("full issue URL number = %d err=%v", got, err)
}
if got, err := parseOptionalThreadNumber("https://github.com/openclaw/openclaw/pull/78602#issuecomment-1"); err != nil || got != 78602 {
t.Fatalf("full pull URL number = %d err=%v", got, err)
}
if got, err := parseOptionalThreadNumberList("https://github.com/openclaw/openclaw/issues/78601, openclaw/openclaw#78602, pull/78603, #78604"); err != nil || len(got) != 4 || got[0] != 78601 || got[1] != 78602 || got[2] != 78603 || got[3] != 78604 {
t.Fatalf("thread ref list = %#v err=%v", got, err)
}
if _, _, _, err := parseClusterShapeOptions("test", "bad", "1", "0.5"); err == nil {
t.Fatal("bad cluster shape should fail")
}
@ -801,7 +989,7 @@ func TestGlobalCommandBranches(t *testing.T) {
}{
{args: []string{"--help"}, wantOut: "Usage:"},
{args: []string{"help"}, wantOut: "Usage:"},
{args: []string{"help", "sync"}, wantErr: true, exitCode: 2},
{args: []string{"help", "sync"}, wantOut: "gitcrawl sync"},
{args: []string{"--version"}, wantOut: "dev"},
{args: []string{"version"}, wantOut: "dev"},
{args: []string{"--json", "version"}, wantOut: `"version"`},
@ -1022,6 +1210,60 @@ func TestTUIInfersRepository(t *testing.T) {
}
}
func TestTUIJSONUsesDefaultsWhenConfigMissing(t *testing.T) {
ctx := context.Background()
dir := t.TempDir()
configPath := filepath.Join(dir, "missing.toml")
t.Setenv("GITCRAWL_DB_PATH", filepath.Join(dir, "missing.db"))
run := New()
var stdout bytes.Buffer
run.Stdout = &stdout
if err := run.Run(ctx, []string{"--config", configPath, "tui", "--json"}); err != nil {
t.Fatalf("tui: %v", err)
}
var payload map[string]any
if err := json.Unmarshal(stdout.Bytes(), &payload); err != nil {
t.Fatalf("decode tui payload: %v\n%s", err, stdout.String())
}
if payload["mode"] != "cluster-browser" {
t.Fatalf("mode = %#v", payload["mode"])
}
clusters, ok := payload["clusters"].([]any)
if !ok || len(clusters) != 0 {
t.Fatalf("clusters = %#v", payload["clusters"])
}
if _, err := os.Stat(configPath); !errors.Is(err, os.ErrNotExist) {
t.Fatalf("config file should not be created, stat err=%v", err)
}
}
func TestTUIJSONHandlesEmptyStoreWithoutRepository(t *testing.T) {
ctx := context.Background()
dir := t.TempDir()
configPath := filepath.Join(dir, "config.toml")
dbPath := filepath.Join(dir, "gitcrawl.db")
app := New()
if err := app.Run(ctx, []string{"--config", configPath, "init", "--db", dbPath}); err != nil {
t.Fatalf("init: %v", err)
}
run := New()
var stdout bytes.Buffer
run.Stdout = &stdout
if err := run.Run(ctx, []string{"--config", configPath, "tui", "--json"}); err != nil {
t.Fatalf("tui: %v", err)
}
var payload map[string]any
if err := json.Unmarshal(stdout.Bytes(), &payload); err != nil {
t.Fatalf("decode tui payload: %v\n%s", err, stdout.String())
}
clusters, ok := payload["clusters"].([]any)
if !ok || len(clusters) != 0 {
t.Fatalf("clusters = %#v", payload["clusters"])
}
}
func TestTUIRequiresInteractiveTerminalByDefault(t *testing.T) {
ctx := context.Background()
dir := t.TempDir()
@ -2093,6 +2335,36 @@ func TestClustersDefaultShowsActivePrimaryMembers(t *testing.T) {
if len(all.Clusters) != 1 || all.Clusters[0].MemberCount != 1 {
t.Fatalf("hide-closed should focus active members, got %#v", all.Clusters)
}
stdout.Reset()
detail := New()
detail.Stdout = &stdout
if err := detail.Run(ctx, []string{"--config", configPath, "--json", "cluster-detail", "openclaw/openclaw", "--id", "90"}); err != nil {
t.Fatalf("cluster-detail: %v", err)
}
var detailPayload struct {
Members []store.ClusterMemberDetail `json:"members"`
}
if err := json.Unmarshal(stdout.Bytes(), &detailPayload); err != nil {
t.Fatalf("decode cluster detail: %v\n%s", err, stdout.String())
}
if len(detailPayload.Members) != 2 {
t.Fatalf("default cluster-detail should match visible cluster members, got %#v", detailPayload.Members)
}
stdout.Reset()
hideDetail := New()
hideDetail.Stdout = &stdout
if err := hideDetail.Run(ctx, []string{"--config", configPath, "--json", "cluster-detail", "openclaw/openclaw", "--id", "90", "--hide-closed"}); err != nil {
t.Fatalf("cluster-detail hide closed: %v", err)
}
detailPayload.Members = nil
if err := json.Unmarshal(stdout.Bytes(), &detailPayload); err != nil {
t.Fatalf("decode hide-closed cluster detail: %v\n%s", err, stdout.String())
}
if len(detailPayload.Members) != 1 || detailPayload.Members[0].Thread.Number != 90 {
t.Fatalf("hide-closed cluster-detail should focus open members, got %#v", detailPayload.Members)
}
}
func TestClusterMemberOverrideCommands(t *testing.T) {

@ -0,0 +1,268 @@
package cli
import (
"bytes"
"context"
"path/filepath"
"strconv"
"strings"
"testing"
"time"
"github.com/openclaw/gitcrawl/internal/config"
"github.com/openclaw/gitcrawl/internal/store"
)
func TestCLIAppCommandCoveragePaths(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimRepo(t, ctx)
cfg, err := config.Load(configPath)
if err != nil {
t.Fatalf("load config: %v", err)
}
st, err := store.Open(ctx, cfg.DBPath)
if err != nil {
t.Fatalf("open store: %v", err)
}
repo, err := st.RepositoryByFullName(ctx, "openclaw/openclaw")
if err != nil {
t.Fatalf("repo: %v", err)
}
threads, err := st.ListThreadsFiltered(ctx, store.ThreadListOptions{RepoID: repo.ID, IncludeClosed: true, Numbers: []int{10, 12}})
if err != nil {
t.Fatalf("threads: %v", err)
}
if len(threads) != 2 {
t.Fatalf("seed threads = %+v", threads)
}
result, err := st.SaveDurableClusters(ctx, repo.ID, []store.DurableClusterInput{{
StableKey: "cli:10,12",
StableSlug: "cli-10-12",
RepresentativeThreadID: threads[0].ID,
Title: "CLI command cluster",
Members: []store.DurableClusterMemberInput{
{ThreadID: threads[0].ID, Role: "canonical"},
{ThreadID: threads[1].ID, Role: "member"},
},
}})
if err != nil {
t.Fatalf("save cluster: %v", err)
}
if _, err := st.RecordRun(ctx, store.RunRecord{RepoID: repo.ID, Kind: "sync", Scope: "open", Status: "success", StartedAt: "2026-05-08T01:00:00Z", FinishedAt: "2026-05-08T01:00:01Z", StatsJSON: "{}"}); err != nil {
t.Fatalf("record run: %v", err)
}
clusterID, err := st.ClusterIDForThreadNumber(ctx, repo.ID, 10, true)
if err != nil {
t.Fatalf("cluster id: %v", err)
}
if result.RunID == 0 {
t.Fatal("cluster run id should be non-zero")
}
if err := st.Close(); err != nil {
t.Fatalf("close store: %v", err)
}
commands := [][]string{
{"--config", configPath, "--json", "configure", "--summary-model", "gpt-test", "--embed-model", "embed-test", "--embedding-basis", "title_original"},
{"--config", configPath, "--json", "metadata"},
{"--config", configPath, "--json", "status"},
{"--config", configPath, "--json", "threads", "openclaw/openclaw", "--numbers", "https://github.com/openclaw/openclaw/issues/10,https://github.com/openclaw/openclaw/pull/12", "--include-closed", "--limit", "2"},
{"--config", configPath, "--json", "runs", "openclaw/openclaw", "--kind", "sync", "--limit", "1"},
{"--config", configPath, "--json", "clusters", "openclaw/openclaw", "--include-closed", "--sort", "oldest", "--min-size", "1", "--limit", "5"},
{"--config", configPath, "--json", "durable-clusters", "openclaw/openclaw", "--include-closed", "--sort", "size", "--min-size", "1", "--limit", "5"},
{"--config", configPath, "--json", "cluster-detail", "openclaw/openclaw", "--id", strconv.FormatInt(clusterID, 10), "--member-limit", "2", "--body-chars", "10", "--include-closed"},
{"--config", configPath, "--json", "close-thread", "openclaw/openclaw", "--number", "https://github.com/openclaw/openclaw/issues/10", "--reason", "covered"},
{"--config", configPath, "--json", "reopen-thread", "openclaw/openclaw", "--number", "10"},
{"--config", configPath, "--json", "close-cluster", "openclaw/openclaw", "--id", strconv.FormatInt(clusterID, 10), "--reason", "covered"},
{"--config", configPath, "--json", "reopen-cluster", "openclaw/openclaw", "--id", strconv.FormatInt(clusterID, 10)},
{"--config", configPath, "--json", "exclude-cluster-member", "openclaw/openclaw", "--id", strconv.FormatInt(clusterID, 10), "--number", "12", "--reason", "covered"},
{"--config", configPath, "--json", "include-cluster-member", "openclaw/openclaw", "--id", strconv.FormatInt(clusterID, 10), "--number", "12", "--reason", "covered"},
{"--config", configPath, "--json", "set-cluster-canonical", "openclaw/openclaw", "--id", strconv.FormatInt(clusterID, 10), "--number", "12", "--reason", "covered"},
}
for _, args := range commands {
app := New()
var stdout, stderr bytes.Buffer
app.Stdout = &stdout
app.Stderr = &stderr
if err := app.Run(ctx, args); err != nil {
t.Fatalf("%v failed: %v\nstdout=%s\nstderr=%s", args, err, stdout.String(), stderr.String())
}
if stdout.Len() == 0 {
t.Fatalf("%v produced no output", args)
}
}
if clusterID <= 0 {
t.Fatalf("cluster id = %d", clusterID)
}
}
func TestCLIAppHumanAndLogOutputE2E(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimRepo(t, ctx)
textCommands := [][]string{
{"--config", configPath, "version"},
{"--config", configPath, "metadata"},
{"--config", configPath, "status"},
{"--config", configPath, "doctor"},
{"--config", configPath, "help", "portable"},
{"--config", configPath, "help", "tui"},
}
for _, args := range textCommands {
app := New()
var stdout bytes.Buffer
app.Stdout = &stdout
if err := app.Run(ctx, args); err != nil {
t.Fatalf("%v failed: %v", args, err)
}
if strings.TrimSpace(stdout.String()) == "" {
t.Fatalf("%v produced no text output", args)
}
}
logCommands := [][]string{
{"--config", configPath, "--format", "log", "configure", "--summary-model", "gpt-log"},
{"--config", configPath, "--format", "log", "doctor"},
}
for _, args := range logCommands {
app := New()
var stdout bytes.Buffer
app.Stdout = &stdout
if err := app.Run(ctx, args); err != nil {
t.Fatalf("%v failed: %v", args, err)
}
if !strings.Contains(stdout.String(), "=") {
t.Fatalf("%v log output = %q", args, stdout.String())
}
}
jsonVersion := New()
var jsonOut bytes.Buffer
jsonVersion.Stdout = &jsonOut
if err := jsonVersion.Run(ctx, []string{"--config", configPath, "--format", "json", "version"}); err != nil {
t.Fatalf("json version: %v", err)
}
if !strings.Contains(jsonOut.String(), `"version"`) {
t.Fatalf("json version output = %q", jsonOut.String())
}
}
func TestCLIAppVectorFallbackCoveragePaths(t *testing.T) {
ctx := context.Background()
dir := t.TempDir()
configPath := filepath.Join(dir, "config.toml")
dbPath := filepath.Join(dir, "gitcrawl.db")
app := New()
if err := app.Run(ctx, []string{"--config", configPath, "init", "--db", dbPath}); err != nil {
t.Fatalf("init: %v", err)
}
repoID, firstID, secondID := seedCommandFlowStore(t, dbPath)
st, err := store.Open(ctx, dbPath)
if err != nil {
t.Fatalf("open store: %v", err)
}
now := time.Now().UTC().Format(time.RFC3339Nano)
for _, vector := range []store.ThreadVector{
{ThreadID: firstID, Basis: "other_basis", Model: "other-model", Dimensions: 2, ContentHash: "v1", Vector: []float64{1, 0}, CreatedAt: now, UpdatedAt: now},
{ThreadID: secondID, Basis: "other_basis", Model: "other-model", Dimensions: 2, ContentHash: "v2", Vector: []float64{0.95, 0.05}, CreatedAt: now, UpdatedAt: now},
} {
if err := st.UpsertThreadVector(ctx, vector); err != nil {
t.Fatalf("upsert vector: %v", err)
}
}
if err := st.Close(); err != nil {
t.Fatalf("close store: %v", err)
}
configure := New()
if err := configure.Run(ctx, []string{"--config", configPath, "configure", "--embed-model", "missing-model", "--embedding-basis", "missing-basis"}); err != nil {
t.Fatalf("configure: %v", err)
}
for _, args := range [][]string{
{"--config", configPath, "--json", "neighbors", "openclaw/openclaw", "--number", "101", "--limit", "1", "--threshold", "0.99"},
{"--config", configPath, "--json", "cluster", "openclaw/openclaw", "--threshold", "0.5", "--min-size", "2", "--limit", "2"},
{"--config", configPath, "--json", "refresh", "openclaw/openclaw", "--no-sync", "--no-embed", "--threshold", "0.5", "--min-size", "2"},
{"--config", configPath, "--json", "search", "openclaw/openclaw", "--query", "gateway", "--mode", ""},
} {
run := New()
var stdout bytes.Buffer
run.Stdout = &stdout
if err := run.Run(ctx, args); err != nil {
t.Fatalf("%v failed: %v\n%s", args, err, stdout.String())
}
}
if repoID == 0 {
t.Fatal("seed repo id should be non-zero")
}
}
func TestCLIAppUsageBranches(t *testing.T) {
ctx := context.Background()
configPath := filepath.Join(t.TempDir(), "config.toml")
cases := [][]string{
{"--format", "yaml", "status"},
{"serve"},
{"unknown"},
{"configure", "--bad"},
{"metadata", "extra"},
{"status", "extra"},
{"portable"},
{"portable", "unknown"},
{"portable", "prune", "extra"},
{"portable", "prune", "--body-chars", "bad"},
{"threads"},
{"threads", "bad-repo"},
{"threads", "openclaw/openclaw", "--numbers", "bad"},
{"threads", "openclaw/openclaw", "--limit", "bad"},
{"runs"},
{"runs", "openclaw/openclaw", "--limit", "bad"},
{"cluster-detail", "openclaw/openclaw", "--id", "bad"},
{"close-thread", "openclaw/openclaw"},
{"reopen-thread", "openclaw/openclaw", "--number", "bad"},
{"close-cluster", "openclaw/openclaw"},
{"reopen-cluster", "openclaw/openclaw", "--id", "bad"},
{"exclude-cluster-member", "openclaw/openclaw", "--id", "1"},
{"include-cluster-member", "openclaw/openclaw", "--id", "bad", "--number", "1"},
{"set-cluster-canonical", "openclaw/openclaw", "--id", "1", "--number", "bad"},
{"sync", "openclaw/openclaw", "--with", "bad"},
{"refresh"},
{"refresh", "openclaw/openclaw", "--no-sync", "--no-embed", "--no-cluster"},
{"refresh", "bad-repo"},
{"refresh", "openclaw/openclaw", "--limit", "bad"},
{"refresh", "openclaw/openclaw", "--threshold", "bad"},
{"refresh", "openclaw/openclaw", "--threshold", "2"},
{"refresh", "openclaw/openclaw", "--min-size", "bad"},
{"refresh", "openclaw/openclaw", "--k", "bad"},
{"search"},
{"search", "openclaw/openclaw"},
{"search", "bad-repo", "--query", "x"},
{"search", "openclaw/openclaw", "--query", "x", "--limit", "bad"},
{"search", "openclaw/openclaw", "--query", "x", "--mode", "bad"},
{"neighbors"},
{"neighbors", "bad-repo"},
{"neighbors", "openclaw/openclaw"},
{"neighbors", "openclaw/openclaw", "--number", "bad"},
{"neighbors", "openclaw/openclaw", "--number", "1", "--limit", "bad"},
{"neighbors", "openclaw/openclaw", "--number", "1", "--threshold", "bad"},
{"cluster"},
{"cluster", "bad-repo"},
{"cluster", "openclaw/openclaw", "--threshold", "bad"},
{"cluster", "openclaw/openclaw", "--threshold", "2"},
{"cluster", "openclaw/openclaw", "--min-size", "bad"},
{"cluster", "openclaw/openclaw", "--max-cluster-size", "bad"},
{"cluster", "openclaw/openclaw", "--limit", "bad"},
{"embed"},
{"embed", "bad-repo"},
{"embed", "openclaw/openclaw", "--number", "bad"},
{"embed", "openclaw/openclaw", "--limit", "bad"},
{"tui", "one", "two"},
{"tui", "--sort", "bad"},
}
for _, args := range cases {
app := New()
app.Stdout = &bytes.Buffer{}
app.Stderr = &bytes.Buffer{}
full := append([]string{"--config", configPath}, args...)
if err := app.Run(ctx, full); err == nil {
t.Fatalf("%v succeeded, want error", args)
}
}
}

internal/cli/gh_path.go (new file, 67 lines)

@ -0,0 +1,67 @@
package cli
import (
"fmt"
"os"
"os/exec"
"path/filepath"
"strings"
)
func resolveRealGHPath() (string, error) {
envPath := strings.TrimSpace(os.Getenv("GITCRAWL_GH_PATH"))
candidates := []string{}
if envPath != "" {
candidates = append(candidates, envPath)
}
candidates = append(candidates,
"/opt/homebrew/opt/gh/bin/gh",
"/usr/local/opt/gh/bin/gh",
"/usr/local/bin/gh",
"/usr/bin/gh",
)
if lookPath, err := exec.LookPath("gh"); err == nil {
candidates = append(candidates, lookPath)
}
seen := map[string]bool{}
for _, candidate := range candidates {
candidate = strings.TrimSpace(candidate)
if candidate == "" || seen[candidate] {
continue
}
seen[candidate] = true
info, err := os.Stat(candidate)
if err != nil || info.IsDir() {
if envPath != "" && candidate == envPath {
return "", fmt.Errorf("real gh not found at GITCRAWL_GH_PATH %q", envPath)
}
continue
}
if isGitcrawlShimPath(candidate) {
if envPath != "" && candidate == envPath {
return "", fmt.Errorf("GITCRAWL_GH_PATH points to the gitcrawl shim (%s); set it to the real gh binary", envPath)
}
continue
}
return candidate, nil
}
return "", fmt.Errorf("real gh not found; set GITCRAWL_GH_PATH")
}
func isGitcrawlShimPath(path string) bool {
if path == "" {
return false
}
resolved := path
if eval, err := filepath.EvalSymlinks(path); err == nil {
resolved = eval
}
for _, value := range []string{path, resolved} {
base := strings.ToLower(filepath.Base(value))
if base == "gitcrawl" || base == "gitcrawl-gh" {
return true
}
}
return false
}
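
Reviewer note: the candidate walk in resolveRealGHPath can be reasoned about without touching the filesystem. The following is a minimal sketch, not the code in this commit: pickRealGH and looksLikeShim are hypothetical names, the exists callback stands in for os.Stat, and the GITCRAWL_GH_PATH error cases and symlink resolution are deliberately left out.

```go
package main

import (
	"path/filepath"
	"strings"
)

// looksLikeShim mirrors the base-name comparison in isGitcrawlShimPath,
// without the filepath.EvalSymlinks step (assumption: no symlinks involved).
func looksLikeShim(path string) bool {
	base := strings.ToLower(filepath.Base(path))
	return base == "gitcrawl" || base == "gitcrawl-gh"
}

// pickRealGH is a hypothetical, filesystem-free version of the candidate loop:
// it returns the first de-duplicated candidate that exists and is not the shim.
func pickRealGH(candidates []string, exists func(string) bool) (string, bool) {
	seen := map[string]bool{}
	for _, candidate := range candidates {
		candidate = strings.TrimSpace(candidate)
		if candidate == "" || seen[candidate] {
			continue
		}
		seen[candidate] = true
		if !exists(candidate) || looksLikeShim(candidate) {
			continue
		}
		return candidate, true
	}
	return "", false
}
```

Passing the existence check in as a function keeps the ordering and shim-skipping rules testable on any machine, whether or not gh is installed.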

@ -104,6 +104,15 @@ func (a *App) runGHSearch(ctx context.Context, args []string) error {
if err != nil {
return err
}
if len(threads) == 0 && ghSearchNeedsLiveEmptyCheck(kind, query, state) {
lastSync, err := rt.Store.LastSuccessfulListSyncAt(ctx, repo.ID, state)
if err != nil {
return err
}
if lastSync.IsZero() {
return localGHUnsupported(fmt.Errorf("empty local %s search has no broad %s sync", args[0], ghDefaultListState(state)))
}
}
jsonFields := strings.TrimSpace(*jsonFieldsRaw)
if jsonFields != "" || a.format == FormatJSON {
@ -126,7 +135,7 @@ func (a *App) runGHSearch(ctx context.Context, args []string) error {
}
func (a *App) syncGHSearchIfStale(ctx context.Context, owner, repoName, state string, maxAge time.Duration) error {
- stale, lastSync, err := a.ghSearchCacheStale(ctx, owner, repoName, maxAge)
+ stale, lastSync, err := a.ghSearchCacheStale(ctx, owner, repoName, state, maxAge)
if err != nil {
return err
}
@ -142,7 +151,7 @@ func (a *App) syncGHSearchIfStale(ctx context.Context, owner, repoName, state st
return err
}
- func (a *App) ghSearchCacheStale(ctx context.Context, owner, repoName string, maxAge time.Duration) (bool, time.Time, error) {
+ func (a *App) ghSearchCacheStale(ctx context.Context, owner, repoName, state string, maxAge time.Duration) (bool, time.Time, error) {
rt, err := a.openLocalRuntimeReadOnly(ctx)
if err != nil {
if errors.Is(err, os.ErrNotExist) {
@ -158,7 +167,7 @@ func (a *App) ghSearchCacheStale(ctx context.Context, owner, repoName string, ma
}
return false, time.Time{}, err
}
- lastSync, err := rt.Store.LastSuccessfulSyncAt(ctx, repo.ID)
+ lastSync, err := rt.Store.LastSuccessfulListSyncAt(ctx, repo.ID, state)
if err != nil {
return false, time.Time{}, err
}
@ -168,6 +177,20 @@ func (a *App) ghSearchCacheStale(ctx context.Context, owner, repoName string, ma
return time.Since(lastSync) > maxAge, lastSync, nil
}
func ghSearchNeedsLiveEmptyCheck(kind, query, state string) bool {
if strings.TrimSpace(query) != "" || kind != "issue" {
return false
}
return ghDefaultListState(state) == "open"
}
func ghDefaultListState(state string) string {
if strings.TrimSpace(state) == "" {
return "open"
}
return strings.TrimSpace(state)
}
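
Reviewer note: the empty-result guard added in this hunk reduces to one decision. An empty local result is trusted unless it is the default unfiltered open-issue listing and no broad sync has been recorded. The sketch below restates that rule as a pure function. mustFallBackToLiveGH and its hasBroadSync parameter are hypothetical; in the diff the equivalent signal is a zero time from LastSuccessfulListSyncAt.

```go
package main

import "strings"

// defaultListState mirrors ghDefaultListState: an empty state means "open".
func defaultListState(state string) string {
	if strings.TrimSpace(state) == "" {
		return "open"
	}
	return strings.TrimSpace(state)
}

// mustFallBackToLiveGH is a hypothetical restatement of the guard. It reports
// whether an empty local result should be treated as unsupported, so that the
// caller can defer to the real gh instead of reporting "no issues".
func mustFallBackToLiveGH(kind, query, state string, resultCount int, hasBroadSync bool) bool {
	if resultCount > 0 {
		return false // local data answered the request
	}
	if kind != "issue" || strings.TrimSpace(query) != "" {
		return false // only the unfiltered issue listing is checked
	}
	if defaultListState(state) != "open" {
		return false
	}
	return !hasBroadSync
}
```

The point of the guard is that "no rows" and "never synced" look identical in the local store, so only the second case needs to leave the local path.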
func parseGHSearchQuery(value string) (query string, repo string, state string) {
var queryParts []string
for _, part := range strings.Fields(value) {

@ -66,6 +66,16 @@ func TestGHSearchCacheStaleUsesRepoSyncRuns(t *testing.T) {
t.Fatalf("repo: %v", err)
}
finishedAt := time.Now().UTC().Add(-1 * time.Hour).Format(time.RFC3339Nano)
if _, err := st.RecordRun(ctx, store.RunRecord{
RepoID: repoID,
Kind: "sync",
Scope: "numbers:13",
Status: "success",
StartedAt: time.Now().UTC().Format(time.RFC3339Nano),
FinishedAt: time.Now().UTC().Format(time.RFC3339Nano),
}); err != nil {
t.Fatalf("record targeted sync: %v", err)
}
if _, err := st.RecordRun(ctx, store.RunRecord{
RepoID: repoID,
Kind: "sync",
@ -74,7 +84,7 @@ func TestGHSearchCacheStaleUsesRepoSyncRuns(t *testing.T) {
StartedAt: finishedAt,
FinishedAt: finishedAt,
}); err != nil {
- t.Fatalf("record sync: %v", err)
+ t.Fatalf("record broad sync: %v", err)
}
if err := st.Close(); err != nil {
t.Fatalf("close store: %v", err)
@ -82,14 +92,14 @@ func TestGHSearchCacheStaleUsesRepoSyncRuns(t *testing.T) {
run := New()
run.configPath = configPath
- stale, lastSync, err := run.ghSearchCacheStale(ctx, "openclaw", "openclaw", 2*time.Hour)
+ stale, lastSync, err := run.ghSearchCacheStale(ctx, "openclaw", "openclaw", "open", 2*time.Hour)
if err != nil {
t.Fatalf("freshness check: %v", err)
}
if stale || lastSync.IsZero() {
t.Fatalf("expected cache to be fresh, stale=%v lastSync=%s", stale, lastSync)
}
- stale, _, err = run.ghSearchCacheStale(ctx, "openclaw", "openclaw", 30*time.Minute)
+ stale, _, err = run.ghSearchCacheStale(ctx, "openclaw", "openclaw", "open", 30*time.Minute)
if err != nil {
t.Fatalf("stale freshness check: %v", err)
}
@ -110,7 +120,7 @@ func TestGHSearchCacheStaleWhenRepoMissing(t *testing.T) {
run := New()
run.configPath = configPath
- stale, lastSync, err := run.ghSearchCacheStale(ctx, "openclaw", "missing", time.Minute)
+ stale, lastSync, err := run.ghSearchCacheStale(ctx, "openclaw", "missing", "open", time.Minute)
if err != nil {
t.Fatalf("freshness check: %v", err)
}

@ -11,7 +11,6 @@ import (
"os"
"os/exec"
"path/filepath"
- "strconv"
"strings"
"github.com/openclaw/gitcrawl/internal/store"
@ -107,13 +106,18 @@ func (a *App) runGHThreadView(ctx context.Context, resource string, args []strin
return usageErr(err)
}
if fs.NArg() != 1 {
- return usageErr(fmt.Errorf("gh %s view requires a number", resource))
+ return usageErr(fmt.Errorf("gh %s view requires a number or GitHub URL", resource))
}
+ ref, _ := parseThreadReference(fs.Arg(0))
number, err := parseThreadNumber(fs.Arg(0))
if err != nil {
return usageErr(err)
}
- repoValue, err := a.resolveGHRepo(ctx, firstNonEmpty(*repoShort, *repoLong))
+ repoArg := firstNonEmpty(*repoShort, *repoLong)
+ if repoArg == "" {
+ repoArg = ref.FullName()
+ }
+ repoValue, err := a.resolveGHRepo(ctx, repoArg)
if err != nil {
return localGHUnsupported(err)
}
@ -202,6 +206,22 @@ func (a *App) runGHThreadList(ctx context.Context, resource string, args []strin
if err != nil {
return err
}
if len(threads) == 0 && ghThreadListNeedsLiveEmptyCheck(ghThreadListRequest{
Kind: ghResourceKind(resource),
State: strings.TrimSpace(*stateRaw),
Query: strings.TrimSpace(*searchRaw),
Author: strings.TrimSpace(*authorRaw),
Assignee: strings.TrimSpace(*assigneeRaw),
Labels: labels.Values(),
}) {
fresh, err := a.localGHThreadListHasBroadSync(ctx, repoValue, strings.TrimSpace(*stateRaw))
if err != nil {
return err
}
if !fresh {
return localGHUnsupported(fmt.Errorf("empty local %s list has no broad %s sync", resource, ghDefaultListState(*stateRaw)))
}
}
jsonFields := strings.TrimSpace(*jsonFieldsRaw)
if jsonFields != "" || strings.TrimSpace(*jqRaw) != "" || a.format == FormatJSON {
if jsonFields == "" {
@ -289,6 +309,34 @@ func (a *App) localGHThreads(ctx context.Context, req ghThreadListRequest) ([]st
})
}
func ghThreadListNeedsLiveEmptyCheck(req ghThreadListRequest) bool {
if req.Kind != "issue" || strings.TrimSpace(req.Query) != "" || strings.TrimSpace(req.Author) != "" || strings.TrimSpace(req.Assignee) != "" || len(req.Labels) > 0 {
return false
}
return ghDefaultListState(req.State) == "open"
}
func (a *App) localGHThreadListHasBroadSync(ctx context.Context, repoValue, state string) (bool, error) {
owner, repoName, err := parseOwnerRepo(repoValue)
if err != nil {
return false, err
}
rt, err := a.openLocalRuntimeReadOnly(ctx)
if err != nil {
return false, localGHUnsupported(err)
}
defer rt.Store.Close()
repo, err := rt.repository(ctx, owner, repoName)
if err != nil {
return false, localGHUnsupported(err)
}
lastSync, err := rt.Store.LastSuccessfulListSyncAt(ctx, repo.ID, state)
if err != nil {
return false, err
}
return !lastSync.IsZero(), nil
}
func (a *App) resolveGHRepo(ctx context.Context, explicit string) (string, error) {
if strings.TrimSpace(explicit) != "" {
return strings.TrimSpace(explicit), nil
@ -309,17 +357,9 @@ func (a *App) resolveGHRepo(ctx context.Context, explicit string) (string, error
}
func (a *App) execRealGH(ctx context.Context, args []string) error {
- ghPath := strings.TrimSpace(os.Getenv("GITCRAWL_GH_PATH"))
- if ghPath == "" {
- if _, err := os.Stat("/opt/homebrew/opt/gh/bin/gh"); err == nil {
- ghPath = "/opt/homebrew/opt/gh/bin/gh"
- } else {
- var err error
- ghPath, err = exec.LookPath("gh")
- if err != nil {
- return fmt.Errorf("real gh not found; set GITCRAWL_GH_PATH")
- }
- }
+ ghPath, err := resolveRealGHPath()
+ if err != nil {
+ return err
}
cmd := exec.CommandContext(ctx, ghPath, args...)
cmd.Stdin = os.Stdin
@ -356,12 +396,7 @@ func ghResourceKind(resource string) string {
}
func parseThreadNumber(value string) (int, error) {
- value = strings.TrimSpace(strings.TrimPrefix(value, "#"))
- number, err := strconv.Atoi(value)
- if err != nil || number <= 0 {
- return 0, fmt.Errorf("expected positive issue or pull request number, got %q", value)
- }
- return number, nil
+ return parseOptionalThreadNumber(value)
}
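
Reviewer note: parseThreadNumber now delegates to parseOptionalThreadNumber, whose body is not part of this diff. To illustrate the reference shapes the new tests exercise (full issue or pull URLs, owner/repo#N, pull/N, #N), here is a rough sketch of one way such a parser could work. threadNumberFromRef is a hypothetical helper and should not be read as the project's implementation.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// threadNumberFromRef is a hypothetical helper that extracts a positive thread
// number from a bare number, "#N", "pull/N", "owner/repo#N", or a GitHub URL.
func threadNumberFromRef(ref string) (int, error) {
	ref = strings.TrimSpace(ref)
	candidate := ref
	if strings.HasPrefix(ref, "https://") || strings.HasPrefix(ref, "http://") {
		u, err := url.Parse(ref)
		if err != nil {
			return 0, fmt.Errorf("invalid thread URL %q: %w", ref, err)
		}
		// Expected path shape: /owner/repo/issues/N or /owner/repo/pull/N.
		// The URL fragment (for example #issuecomment-1) is not part of u.Path.
		parts := strings.Split(strings.Trim(u.Path, "/"), "/")
		if len(parts) < 4 || (parts[2] != "issues" && parts[2] != "pull") {
			return 0, fmt.Errorf("not an issue or pull request URL: %q", ref)
		}
		candidate = parts[3]
	} else if i := strings.LastIndex(ref, "#"); i >= 0 {
		candidate = ref[i+1:] // "#N" or "owner/repo#N"
	} else if i := strings.LastIndex(ref, "/"); i >= 0 {
		candidate = ref[i+1:] // "pull/N" or "issues/N"
	}
	number, err := strconv.Atoi(candidate)
	if err != nil || number <= 0 {
		return 0, fmt.Errorf("expected positive issue or pull request number, got %q", ref)
	}
	return number, nil
}
```

Treating "#" as a fragment only for URLs is what lets "owner/repo#78602" and ".../pull/78602#issuecomment-1" both resolve to 78602.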
func ownerRepoFromGitRemote(value string) (string, error) {

@ -41,8 +41,8 @@ func (a *App) execRealGHMaybeCached(ctx context.Context, args []string) error {
lockPath := entryPath + ".lock"
lock, locked := tryGHCommandCacheLock(lockPath)
if !locked {
- if entry, ok := waitGHCommandCache(entryPath, lockPath, ttl); ok {
- _ = a.incrementGHXCacheCounter("fallback_hits")
+ if entry, hit, ok := waitGHCommandCache(entryPath, lockPath, ttl, staleEntry, hasStaleEntry); ok {
+ _ = a.incrementGHXCacheCounter(hit)
return a.writeGHCommandCacheEntry(entry)
}
lock, locked = tryGHCommandCacheLock(lockPath)
@ -60,7 +60,7 @@ func (a *App) execRealGHMaybeCached(ctx context.Context, args []string) error {
stdout, stderr, exitCode, err := a.captureRealGH(ctx, args)
_ = a.incrementGHXCacheBackendMiss(args)
- if err != nil && hasStaleEntry && staleEntry.ExitCode == 0 && ghCommandOutputLooksRateLimited(stdout, stderr) {
+ if err != nil && hasStaleEntry && ghCommandCacheEntryCanServeStale(staleEntry, ttl) && ghCommandOutputLooksRateLimited(stdout, stderr) {
_ = a.incrementGHXCacheCounter("stale_hits")
_, _ = fmt.Fprintf(a.Stderr, "gitcrawl: GitHub rate limited; serving stale cached gh response from %s ago\n", time.Since(staleEntry.CreatedAt).Round(time.Second))
return a.writeGHCommandCacheEntry(staleEntry)
@ -85,24 +85,16 @@ func cacheGHReadErrors() bool {
}
func (a *App) captureRealGH(ctx context.Context, args []string) (string, string, int, error) {
- ghPath := strings.TrimSpace(os.Getenv("GITCRAWL_GH_PATH"))
- if ghPath == "" {
- if _, err := os.Stat("/opt/homebrew/opt/gh/bin/gh"); err == nil {
- ghPath = "/opt/homebrew/opt/gh/bin/gh"
- } else {
- var err error
- ghPath, err = exec.LookPath("gh")
- if err != nil {
- return "", "", 127, fmt.Errorf("real gh not found; set GITCRAWL_GH_PATH")
- }
- }
+ ghPath, err := resolveRealGHPath()
+ if err != nil {
+ return "", "", 127, err
}
var stdout, stderr bytes.Buffer
cmd := exec.CommandContext(ctx, ghPath, args...)
cmd.Stdin = os.Stdin
cmd.Stdout = &stdout
cmd.Stderr = &stderr
- err := cmd.Run()
+ err = cmd.Run()
exitCode := 0
if err != nil {
exitCode = 1
@ -216,6 +208,47 @@ func ghCommandCacheEntryTTL(entry ghCommandCacheEntry, ttl time.Duration) time.D
return ttl
}
func ghCommandCacheEntryCanServeStale(entry ghCommandCacheEntry, ttl time.Duration) bool {
if entry.ExitCode != 0 || entry.CreatedAt.IsZero() {
return false
}
age := time.Since(entry.CreatedAt)
if age <= ghCommandCacheEntryTTL(entry, ttl) {
return true
}
return age <= ghCommandCacheEntryTTL(entry, ttl)+ghCommandCacheStaleGrace(entry.Args)
}
func ghCommandCacheStaleGrace(args []string) time.Duration {
if raw := strings.TrimSpace(os.Getenv("GITCRAWL_GH_STALE_GRACE")); raw != "" {
if duration, err := time.ParseDuration(raw); err == nil && duration >= 0 {
return duration
}
}
if len(args) == 0 {
return 5 * time.Minute
}
switch args[0] {
case "run":
return 2 * time.Minute
case "api":
route := normalizeGHAPIRoute(args[1:])
switch {
case strings.Contains(route, "/actions/runs"):
return 2 * time.Minute
case strings.Contains(route, "/pages"):
return 30 * time.Minute
case strings.Contains(route, "/contents"):
return 6 * time.Hour
case strings.HasPrefix(route, "api users/"):
return 24 * time.Hour
}
case "release", "workflow", "repo":
return 30 * time.Minute
}
return 10 * time.Minute
}
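
Reviewer note: the stale-serving rule added here comes down to "successful entry, and age within TTL plus a grace period", with the grace period optionally overridden through GITCRAWL_GH_STALE_GRACE. The sketch below isolates those two pieces as pure functions. staleGrace and canServeStale are hypothetical names, and the override is passed in as a string rather than read from the environment.

```go
package main

import (
	"strings"
	"time"
)

// staleGrace returns the override when it parses as a non-negative duration
// (for example "90s" or "2m"), and the fallback otherwise. This follows the
// same acceptance rule as the environment override in the diff.
func staleGrace(override string, fallback time.Duration) time.Duration {
	if raw := strings.TrimSpace(override); raw != "" {
		if duration, err := time.ParseDuration(raw); err == nil && duration >= 0 {
			return duration
		}
	}
	return fallback
}

// canServeStale reports whether a cached entry of the given age may still be
// served. Because grace is never negative, the separate "age <= ttl" check in
// the diff is implied by the single comparison used here.
func canServeStale(exitCode int, age, ttl, grace time.Duration) bool {
	if exitCode != 0 {
		return false // failed responses are never served stale
	}
	return age <= ttl+grace
}
```

With a 30-second TTL and a 2-minute grace, for instance, an entry that is 35 seconds old is still servable under this rule, while one that is 3 minutes old is not.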
func ghCommandCacheEntryLooksRateLimited(entry ghCommandCacheEntry) bool {
return ghCommandOutputLooksRateLimited(entry.Stdout, entry.Stderr)
}
@ -270,19 +303,28 @@ func tryGHCommandCacheLock(path string) (*os.File, bool) {
return lock, true
}
- func waitGHCommandCache(entryPath, lockPath string, ttl time.Duration) (ghCommandCacheEntry, bool) {
+ func waitGHCommandCache(entryPath, lockPath string, ttl time.Duration, staleEntry ghCommandCacheEntry, hasStaleEntry bool) (ghCommandCacheEntry, string, bool) {
+ if hasStaleEntry && ghCommandCacheEntryCanServeStale(staleEntry, ttl) {
+ time.Sleep(250 * time.Millisecond)
+ if entry, ok := readGHCommandCache(entryPath, ttl); ok {
+ return entry, "fallback_hits", true
+ }
+ if _, err := os.Stat(lockPath); err == nil {
+ return staleEntry, "stale_hits", true
+ }
+ }
deadline := time.Now().Add(30 * time.Second)
for time.Now().Before(deadline) {
time.Sleep(100 * time.Millisecond)
if entry, ok := readGHCommandCache(entryPath, ttl); ok {
- return entry, true
+ return entry, "fallback_hits", true
}
if _, err := os.Stat(lockPath); os.IsNotExist(err) {
- return ghCommandCacheEntry{}, false
+ return ghCommandCacheEntry{}, "", false
}
}
_ = os.Remove(lockPath)
- return ghCommandCacheEntry{}, false
+ return ghCommandCacheEntry{}, "", false
}
func (a *App) ghCommandCacheKey(ctx context.Context, args []string) string {

@ -3,6 +3,7 @@ package cli
import (
"context"
"encoding/json"
"net/url"
"os"
"strings"
"time"
@ -189,7 +190,7 @@ func ghCommandCacheTTLBase(args []string, stablePRDiff bool) time.Duration {
func ghRunCacheTTL(args []string) time.Duration {
if len(args) == 0 {
- return 2 * time.Minute
+ return 30 * time.Second
}
switch args[0] {
case "view":
@ -197,13 +198,13 @@ func ghRunCacheTTL(args []string) time.Duration {
return 12 * time.Hour
}
if hasAnyGHFlag(args[1:], "--job") {
- return 5 * time.Minute
+ return 1 * time.Minute
}
- return 2 * time.Minute
+ return 30 * time.Second
case "list":
- return 2 * time.Minute
+ return 30 * time.Second
default:
- return 2 * time.Minute
+ return 30 * time.Second
}
}
@ -214,20 +215,35 @@ func ghAPICacheTTL(args []string) time.Duration {
return 6 * time.Hour
case strings.HasPrefix(route, "api users/"):
return 7 * 24 * time.Hour
+ case strings.Contains(route, "/contents"):
+ if ghAPIContentRefIsStable(args) {
+ return 7 * 24 * time.Hour
+ }
+ return 30 * time.Minute
+ case strings.Contains(route, "/pages/builds/latest"):
+ return 2 * time.Minute
+ case strings.Contains(route, "/pages/health"):
+ return 15 * time.Minute
+ case strings.Contains(route, "/pages"):
+ return 30 * time.Minute
case strings.Contains(route, "/actions/runs/:id/logs"):
return 12 * time.Hour
case strings.Contains(route, "/actions/jobs/:id/logs"):
return 12 * time.Hour
case strings.Contains(route, "/actions/runs/:id/jobs"):
- return 5 * time.Minute
+ return 1 * time.Minute
+ case strings.Contains(route, "/actions/jobs/:id"):
+ return 1 * time.Minute
+ case strings.Contains(route, "/pending_deployments"):
+ return 30 * time.Second
case strings.Contains(route, "/actions/runs/:id"):
- return 2 * time.Minute
+ return 30 * time.Second
case strings.Contains(route, "/actions/workflows/"):
- return 5 * time.Minute
+ return 15 * time.Minute
case strings.Contains(route, "/actions/runs"):
- return 2 * time.Minute
+ return 30 * time.Second
case strings.Contains(route, "/releases"):
- return 30 * time.Minute
+ return 1 * time.Hour
case strings.Contains(route, "/branches") || strings.Contains(route, "/commits"):
return 10 * time.Minute
default:
@ -235,6 +251,62 @@ func ghAPICacheTTL(args []string) time.Duration {
}
}
func ghAPIContentRefIsStable(args []string) bool {
path := ghAPIPathArg(args)
_, rawQuery, found := strings.Cut(path, "?")
if !found {
return false
}
for _, part := range strings.Split(rawQuery, "&") {
name, value, ok := strings.Cut(part, "=")
if !ok || name != "ref" {
continue
}
value = strings.TrimSpace(value)
if decoded, err := url.QueryUnescape(value); err == nil {
value = strings.TrimSpace(decoded)
}
if len(value) == 40 && isHexString(value) {
return true
}
if ghAPIContentRefIsStableReleaseTag(value) {
return true
}
}
return false
}
func ghAPIContentRefIsStableReleaseTag(value string) bool {
value = strings.TrimSpace(value)
if strings.HasPrefix(value, "refs/heads/") {
return false
}
value = strings.TrimPrefix(value, "refs/tags/")
if strings.HasPrefix(value, "refs/") {
return false
}
if strings.HasPrefix(value, "v") {
value = strings.TrimPrefix(value, "v")
}
core := value
if before, _, found := strings.Cut(core, "+"); found {
core = before
}
if before, _, found := strings.Cut(core, "-"); found {
core = before
}
parts := strings.Split(core, ".")
if len(parts) != 3 {
return false
}
for _, part := range parts {
if !isDecimalString(part) {
return false
}
}
return true
}
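
Reviewer note: the long TTL for /contents requests hinges on the ref query parameter naming something immutable, such as a full 40-character commit SHA. As an alternative to the hand-rolled query split in ghAPIContentRefIsStable, the sketch below shows the same check using net/url and encoding/hex from the standard library. refFromAPIPath and isFullSHA are hypothetical names. url.ParseQuery decodes "+" as a space and Get returns only the first ref value, so the behaviour is close to the diff's loop but not identical.

```go
package main

import (
	"encoding/hex"
	"net/url"
	"strings"
)

// refFromAPIPath extracts the decoded "ref" query parameter from a gh api
// path such as "repos/o/r/contents/README.md?ref=refs%2Ftags%2Fv1.2.3".
func refFromAPIPath(path string) (string, bool) {
	_, rawQuery, found := strings.Cut(path, "?")
	if !found {
		return "", false
	}
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", false
	}
	ref := strings.TrimSpace(values.Get("ref"))
	return ref, ref != ""
}

// isFullSHA reports whether ref is a 40-character hexadecimal string, the
// shape of a full SHA-1 commit identifier.
func isFullSHA(ref string) bool {
	if len(ref) != 40 {
		return false
	}
	_, err := hex.DecodeString(ref)
	return err == nil
}
```

A branch name such as main fails isFullSHA and would keep the shorter TTL. The release-tag rule in the diff is a separate check that this sketch does not reproduce.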
func isGHPRDiff(args []string) bool {
return len(args) >= 2 && args[0] == "pr" && args[1] == "diff"
}
@ -262,6 +334,9 @@ func parseGHPRDiffIdentityArgs(args []string) (string, int, bool) {
if strings.HasPrefix(arg, "-") || number != 0 {
continue
}
if ref, ok := parseThreadReference(arg); ok && ref.FullName() != "" && repo == "" {
repo = ref.FullName()
}
parsed, err := parseThreadNumber(arg)
if err != nil {
return "", 0, false
@ -305,6 +380,14 @@ func normalizeGHAPIRoute(args []string) string {
if part == "" {
continue
}
if index >= 4 && len(parts) > 3 && parts[3] == "contents" {
parts = append(parts[:4], ":path")
break
}
if index >= 5 && len(parts) > 4 && parts[3] == "git" && parts[4] == "ref" {
parts = append(parts[:5], ":ref")
break
}
switch {
case isDecimalString(part):
parts[index] = ":id"
@ -356,6 +439,18 @@ func isDecimalString(value string) bool {
return true
}
func isHexString(value string) bool {
if value == "" {
return false
}
for _, r := range value {
if (r < '0' || r > '9') && (r < 'a' || r > 'f') && (r < 'A' || r > 'F') {
return false
}
}
return true
}
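For comparison, the `ref` lookup and the 40-character SHA check can be sketched with `net/url` instead of manual splitting. This is an alternative technique, not what the diff does: `url.ParseQuery` reports an error for a malformed escape, whereas the hunk above falls back to the raw value when `url.QueryUnescape` fails. `refFromPath`, `isFullSHA`, and `isHex` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isHex reports whether s is non-empty and contains only hex digits.
func isHex(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if (r < '0' || r > '9') && (r < 'a' || r > 'f') && (r < 'A' || r > 'F') {
			return false
		}
	}
	return true
}

// refFromPath returns the decoded "ref" query parameter of an API path.
// Unlike the hunk above, it gives up on a query that fails to parse.
func refFromPath(path string) (string, bool) {
	_, rawQuery, found := strings.Cut(path, "?")
	if !found {
		return "", false
	}
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", false
	}
	ref := strings.TrimSpace(values.Get("ref"))
	return ref, ref != ""
}

// isFullSHA reports whether ref looks like a full 40-character commit SHA.
func isFullSHA(ref string) bool {
	return len(ref) == 40 && isHex(ref)
}

func main() {
	paths := []string{
		"repos/openclaw/openclaw/contents/README.md?ref=refs%2Ftags%2Fv0.2.0",
		"repos/openclaw/openclaw/contents/README.md?ref=0123456789abcdef0123456789abcdef01234567",
		"repos/openclaw/openclaw/contents/README.md",
	}
	for _, p := range paths {
		ref, ok := refFromPath(p)
		fmt.Printf("ref=%q found=%v sha=%v\n", ref, ok, isFullSHA(ref))
	}
}
```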
func mutatingGHCommand(args []string) bool {
if len(args) < 2 {
return false


@@ -121,6 +121,110 @@ echo "call-$count:$*"
}
}
func TestGHXCacheCommandsReportAndCleanCacheState(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimRepo(t, ctx)
app := New()
app.configPath = configPath
var stdout bytes.Buffer
app.Stdout = &stdout
dir, err := app.ghCommandCacheDir()
if err != nil {
t.Fatalf("cache dir: %v", err)
}
now := time.Now()
freshPath := filepath.Join(dir, "fresh.json")
expiredPath := filepath.Join(dir, "expired.json")
if err := writeGHCommandCache(freshPath, ghCommandCacheEntry{CreatedAt: now.Add(-time.Minute), Args: []string{"api", "users/octocat"}, ExitCode: 0, Stdout: "{}"}); err != nil {
t.Fatalf("write fresh cache: %v", err)
}
if err := writeGHCommandCache(expiredPath, ghCommandCacheEntry{CreatedAt: now.Add(-8 * 24 * time.Hour), Args: []string{"api", "users/octocat"}, ExitCode: 0, Stdout: "{}"}); err != nil {
t.Fatalf("write expired cache: %v", err)
}
lockPath := filepath.Join(dir, "stale.lock")
if err := os.WriteFile(lockPath, []byte("123\n"), 0o600); err != nil {
t.Fatalf("write lock: %v", err)
}
old := now.Add(-3 * time.Minute)
if err := os.Chtimes(lockPath, old, old); err != nil {
t.Fatalf("age lock: %v", err)
}
if err := os.WriteFile(filepath.Join(dir, "broken.json"), []byte("{"), 0o600); err != nil {
t.Fatalf("write broken entry: %v", err)
}
if _, ok := ghCommandCacheKeyInfoFromDirEntry(dir, mustDirEntry(t, dir, "broken.json")); ok {
t.Fatal("broken cache entry should be ignored")
}
if err := app.incrementGHXCacheCounter("local_hits"); err != nil {
t.Fatalf("increment hit: %v", err)
}
if err := app.incrementGHXCacheBackendMiss([]string{"api", "repos/openclaw/gitcrawl/actions/runs/1/jobs"}); err != nil {
t.Fatalf("increment miss: %v", err)
}
if err := app.runGHXCache([]string{"stats", "--since", "2h"}); err != nil {
t.Fatalf("stats: %v", err)
}
statsText := stdout.String()
if !strings.Contains(statsText, "hit rate") || !strings.Contains(statsText, "Backend Misses by Route") {
t.Fatalf("stats output = %q", statsText)
}
stdout.Reset()
if err := app.runGHXCache([]string{"keys"}); err != nil {
t.Fatalf("keys: %v", err)
}
if !strings.Contains(stdout.String(), "api users/octocat") {
t.Fatalf("keys output = %q", stdout.String())
}
stdout.Reset()
if err := app.runGHXCache([]string{"snapshot", "--reset"}); err != nil {
t.Fatalf("snapshot: %v", err)
}
if !strings.Contains(stdout.String(), "Reset xcache counters") {
t.Fatalf("snapshot output = %q", stdout.String())
}
stdout.Reset()
if err := app.runGHXCache([]string{"gc"}); err != nil {
t.Fatalf("gc: %v", err)
}
if !strings.Contains(stdout.String(), "Removed 1 expired entrie(s), 1 stale lock(s)") {
t.Fatalf("gc output = %q", stdout.String())
}
stdout.Reset()
if err := app.runGHXCache([]string{"flush"}); err != nil {
t.Fatalf("flush: %v", err)
}
if !strings.Contains(stdout.String(), "Flushed") {
t.Fatalf("flush output = %q", stdout.String())
}
if err := app.clearGHCommandCache(); err != nil {
t.Fatalf("clear cache: %v", err)
}
if err := app.runGHXCache([]string{}); err == nil {
t.Fatal("missing xcache command should fail")
}
if err := app.runGHXCache([]string{"stats", "--since", "nope"}); err == nil {
t.Fatal("invalid since should fail")
}
if err := app.runGHXCache([]string{"mystery"}); err == nil {
t.Fatal("unknown xcache command should fail")
}
}
func mustDirEntry(t *testing.T, dir, name string) os.DirEntry {
t.Helper()
entries, err := os.ReadDir(dir)
if err != nil {
t.Fatalf("read dir: %v", err)
}
for _, entry := range entries {
if entry.Name() == name {
return entry
}
}
t.Fatalf("missing dir entry %s", name)
return nil
}
func TestGHShimCachesGHXStyleReadOnlyFallbackCommands(t *testing.T) {
for _, args := range [][]string{
{"gh", "release", "view", "v1.2.3", "-R", "openclaw/openclaw"},
@@ -146,11 +250,11 @@ func TestGHShimCommandAwareCacheTTLs(t *testing.T) {
if got := ghCommandCacheTTL([]string{"run", "view", "123", "--log"}); got != 12*time.Hour {
t.Fatalf("run log ttl = %s, want 12h", got)
}
-if got := ghCommandCacheTTL([]string{"run", "view", "123", "--job", "456"}); got != 5*time.Minute {
-t.Fatalf("run job ttl = %s, want 5m", got)
+if got := ghCommandCacheTTL([]string{"run", "view", "123", "--job", "456"}); got != time.Minute {
+t.Fatalf("run job ttl = %s, want 1m", got)
}
-if got := ghCommandCacheTTL([]string{"run", "list", "-R", "openclaw/openclaw"}); got != 2*time.Minute {
-t.Fatalf("run list ttl = %s, want 2m", got)
+if got := ghCommandCacheTTL([]string{"run", "list", "-R", "openclaw/openclaw"}); got != 30*time.Second {
+t.Fatalf("run list ttl = %s, want 30s", got)
}
if got := ghCommandCacheTTL([]string{"search", "issues", "cache"}); got != 15*time.Minute {
t.Fatalf("search ttl = %s, want 15m", got)
@@ -158,8 +262,26 @@ func TestGHShimCommandAwareCacheTTLs(t *testing.T) {
if got := ghCommandCacheTTL([]string{"api", "-i", "repos/openclaw/openclaw/actions/runs/123/logs"}); got != 12*time.Hour {
t.Fatalf("actions log api ttl = %s, want 12h", got)
}
-if got := ghCommandCacheTTL([]string{"api", "repos/openclaw/openclaw/actions/runs/123"}); got != 2*time.Minute {
-t.Fatalf("actions run api ttl = %s, want 2m", got)
+if got := ghCommandCacheTTL([]string{"api", "repos/openclaw/openclaw/actions/runs/123"}); got != 30*time.Second {
+t.Fatalf("actions run api ttl = %s, want 30s", got)
}
if got := ghCommandCacheTTL([]string{"api", "repos/openclaw/openclaw/pages"}); got != 30*time.Minute {
t.Fatalf("pages api ttl = %s, want 30m", got)
}
if got := ghCommandCacheTTL([]string{"api", "repos/openclaw/openclaw/contents/README.md?ref=v0.2.0"}); got != 7*24*time.Hour {
t.Fatalf("tagged contents api ttl = %s, want 7d", got)
}
if got := ghCommandCacheTTL([]string{"api", "repos/openclaw/openclaw/contents/README.md?ref=refs%2Ftags%2Fv0.2.0"}); got != 7*24*time.Hour {
t.Fatalf("refs/tags contents api ttl = %s, want 7d", got)
}
if got := ghCommandCacheTTL([]string{"api", "repos/openclaw/openclaw/contents/README.md?ref=0123456789abcdef0123456789abcdef01234567"}); got != 7*24*time.Hour {
t.Fatalf("sha contents api ttl = %s, want 7d", got)
}
if got := ghCommandCacheTTL([]string{"api", "repos/openclaw/openclaw/contents/README.md?ref=vnext"}); got != 30*time.Minute {
t.Fatalf("mutable vnext contents api ttl = %s, want 30m", got)
}
if got := ghCommandCacheTTL([]string{"api", "repos/openclaw/openclaw/contents/README.md?ref=refs%2Fheads%2Fv0.2.0"}); got != 30*time.Minute {
t.Fatalf("v-prefixed branch contents api ttl = %s, want 30m", got)
}
if got := normalizeGHAPIRoute([]string{"repos/openclaw/openclaw/actions/runs?per_page=1"}); got != "api repos/:owner/:repo/actions/runs" {
t.Fatalf("normalized actions route = %q", got)
@@ -167,6 +289,9 @@ func TestGHShimCommandAwareCacheTTLs(t *testing.T) {
if got := normalizeGHAPIRoute([]string{"--paginate", "repos/openclaw/openclaw/issues?state=all&creator=octocat", "--jq", ".[].number"}); got != "api repos/:owner/:repo/issues" {
t.Fatalf("normalized paginated issues route = %q", got)
}
if got := normalizeGHAPIRoute([]string{"repos/openclaw/openclaw/contents/.github/workflows/ci.yml?ref=main"}); got != "api repos/:owner/:repo/contents/:path" {
t.Fatalf("normalized contents route = %q", got)
}
entry := ghCommandCacheEntry{CreatedAt: time.Now().Add(-3 * time.Minute), ExitCode: 1, Stderr: "HTTP 403: API rate limit exceeded"}
if ttl := ghCommandCacheEntryTTL(entry, 12*time.Hour); ttl != 2*time.Minute {
t.Fatalf("rate-limit error ttl = %s, want 2m", ttl)
@@ -187,6 +312,14 @@ func TestGHShimCommandAwareCacheTTLs(t *testing.T) {
if ttl := ghCommandCacheEntryTTL(completedRuns, 2*time.Minute); ttl != 30*time.Minute {
t.Fatalf("completed run list ttl = %s, want 30m", ttl)
}
completedJobs := ghCommandCacheEntry{
Args: []string{"api", "repos/openclaw/openclaw/actions/runs/123/jobs"},
ExitCode: 0,
Stdout: `{"jobs":[{"status":"completed","conclusion":"success"}]}`,
}
if ttl := ghCommandCacheEntryTTL(completedJobs, time.Minute); ttl != 12*time.Hour {
t.Fatalf("completed jobs ttl = %s, want 12h", ttl)
}
}
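The route expectations asserted in this test can be reproduced with a much smaller normalizer. The sketch below is a simplified assumption about the behavior, not the project's `normalizeGHAPIRoute`: it takes a bare path (no flag handling), and `normalizeRoute` and `isDigits` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// isDigits reports whether s is non-empty and made only of ASCII digits.
func isDigits(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// normalizeRoute drops the query string, replaces owner and repo with
// placeholders, collapses everything under contents/ into ":path", and
// turns numeric segments into ":id".
func normalizeRoute(path string) string {
	path, _, _ = strings.Cut(path, "?")
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) >= 3 && parts[0] == "repos" {
		parts[1], parts[2] = ":owner", ":repo"
	}
	if len(parts) > 4 && parts[3] == "contents" {
		// File paths are unbounded, so keep a single placeholder segment.
		parts = append(parts[:4], ":path")
	}
	for i, p := range parts {
		if isDigits(p) {
			parts[i] = ":id"
		}
	}
	return "api " + strings.Join(parts, "/")
}

func main() {
	fmt.Println(normalizeRoute("repos/openclaw/openclaw/contents/.github/workflows/ci.yml?ref=main"))
	fmt.Println(normalizeRoute("repos/openclaw/openclaw/actions/runs/123/logs"))
	fmt.Println(normalizeRoute("repos/openclaw/openclaw/actions/runs?per_page=1"))
}
```

Collapsing the contents path keeps the per-route miss counters bounded: every file under `contents/` lands in one bucket instead of one bucket per file.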
func TestGHShimCanonicalizesEquivalentCacheKeys(t *testing.T) {
@@ -350,6 +483,66 @@ func TestGHShimTracksBackendMissesByCommandAndRoute(t *testing.T) {
if stats.Counters.BackendMissesByRoute["api repos/:owner/:repo/actions/runs/:id/logs"] != 1 {
t.Fatalf("backend misses by route = %#v", stats.Counters.BackendMissesByRoute)
}
if stats.Counters.BackendMissesByKey["api repos/openclaw/openclaw/actions/runs/123/logs -i"] != 1 {
t.Fatalf("backend misses by key = %#v", stats.Counters.BackendMissesByKey)
}
}
func TestGHShimXCacheStatsSinceAndSnapshot(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimRepo(t, ctx)
dir := t.TempDir()
ghPath := filepath.Join(dir, "gh")
if err := os.WriteFile(ghPath, []byte("#!/bin/sh\necho repo:$*\n"), 0o755); err != nil {
t.Fatalf("write fake gh: %v", err)
}
t.Setenv("GITCRAWL_GH_PATH", ghPath)
t.Setenv("GH_REPO", "stats-since/"+filepath.Base(dir))
t.Setenv("GITCRAWL_GH_CACHE_TTL", "1m")
run := New()
var stdout bytes.Buffer
run.Stdout = &stdout
args := []string{"--config", configPath, "gh", "repo", "view", "openclaw/gitcrawl", "--json", "nameWithOwner"}
if err := run.Run(ctx, args); err != nil {
t.Fatalf("repo view: %v", err)
}
stdout.Reset()
if err := run.Run(ctx, []string{"--config", configPath, "gh", "xcache", "stats", "--since", "1h", "--json"}); err != nil {
t.Fatalf("xcache stats --since: %v", err)
}
var stats ghCommandCacheStats
if err := json.Unmarshal(stdout.Bytes(), &stats); err != nil {
t.Fatalf("decode stats: %v\n%s", err, stdout.String())
}
if stats.Since != "1h0m0s" || stats.CumulativeCounters == nil || stats.Counters.BackendMisses != 1 {
t.Fatalf("since stats = %+v", stats)
}
stdout.Reset()
if err := run.Run(ctx, []string{"--config", configPath, "gh", "xcache", "snapshot", "--reset", "--json"}); err != nil {
t.Fatalf("xcache snapshot: %v", err)
}
var snap ghCommandCacheSnapshotResult
if err := json.Unmarshal(stdout.Bytes(), &snap); err != nil {
t.Fatalf("decode snapshot: %v\n%s", err, stdout.String())
}
if snap.SnapshotPath == "" || !snap.Reset {
t.Fatalf("snapshot result = %+v", snap)
}
if _, err := os.Stat(snap.SnapshotPath); err != nil {
t.Fatalf("snapshot file: %v", err)
}
stdout.Reset()
if err := run.Run(ctx, []string{"--config", configPath, "gh", "xcache", "stats", "--json"}); err != nil {
t.Fatalf("xcache stats after snapshot reset: %v", err)
}
if err := json.Unmarshal(stdout.Bytes(), &stats); err != nil {
t.Fatalf("decode reset stats: %v\n%s", err, stdout.String())
}
if stats.Counters.BackendMisses != 0 {
t.Fatalf("snapshot reset counters = %+v", stats.Counters)
}
}
func TestGHShimCachesReadOnlyFallbackErrors(t *testing.T) {
@@ -466,6 +659,80 @@ exit 1
}
}
func TestGHShimServesStaleWhileAnotherProcessRefreshes(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimRepo(t, ctx)
dir := t.TempDir()
countPath := filepath.Join(dir, "count")
ghPath := filepath.Join(dir, "gh")
script := `#!/bin/sh
count=0
if [ -f "$GH_SHIM_COUNT" ]; then
count=$(cat "$GH_SHIM_COUNT")
fi
count=$((count + 1))
printf "%s" "$count" > "$GH_SHIM_COUNT"
if [ "$count" != "1" ]; then
sleep 1
fi
echo "release-$count"
`
if err := os.WriteFile(ghPath, []byte(script), 0o755); err != nil {
t.Fatalf("write fake gh: %v", err)
}
t.Setenv("GITCRAWL_GH_PATH", ghPath)
t.Setenv("GH_SHIM_COUNT", countPath)
t.Setenv("GITCRAWL_GH_CACHE_TTL", "1ns")
t.Setenv("GITCRAWL_GH_STALE_GRACE", "1h")
args := []string{"--config", configPath, "gh", "release", "view", "v1", "-R", "openclaw/openclaw"}
run := New()
var stdout bytes.Buffer
run.Stdout = &stdout
if err := run.Run(ctx, args); err != nil {
t.Fatalf("seed read: %v", err)
}
stdout.Reset()
var wg sync.WaitGroup
outputs := make(chan string, 2)
errs := make(chan error, 2)
for i := 0; i < 2; i++ {
wg.Add(1)
go func() {
defer wg.Done()
run := New()
var out bytes.Buffer
run.Stdout = &out
if err := run.Run(ctx, args); err != nil {
errs <- err
return
}
outputs <- strings.TrimSpace(out.String())
}()
}
wg.Wait()
close(errs)
close(outputs)
for err := range errs {
t.Fatalf("stale while refresh run: %v", err)
}
seen := map[string]int{}
for out := range outputs {
seen[out]++
}
if seen["release-1"] != 1 || seen["release-2"] != 1 {
t.Fatalf("outputs = %#v, want one stale and one refresh", seen)
}
countData, err := os.ReadFile(countPath)
if err != nil {
t.Fatalf("read count: %v", err)
}
if strings.TrimSpace(string(countData)) != "2" {
t.Fatalf("fake gh call count = %q, want 2", countData)
}
}
func TestGHShimMutatingFallbackClearsMatchingCacheForGHXStyleMutations(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimRepo(t, ctx)


@@ -0,0 +1,360 @@
package cli
import (
"bytes"
"context"
"encoding/json"
"os"
"path/filepath"
"strings"
"testing"
"time"
"github.com/openclaw/gitcrawl/internal/config"
"github.com/openclaw/gitcrawl/internal/store"
)
func TestGHCacheDescriptorAndPolicyBranches(t *testing.T) {
if got := canonicalGHCommandArgs(nil); got != nil {
t.Fatalf("nil canonical args = %+v", got)
}
canonical := canonicalGHCommandArgs([]string{"pr", "view", "12", "--json", "title,number", "-R", " openclaw/openclaw ", "--method", "get", "--flag"})
if strings.Join(canonical, " ") != "pr view 12 --flag --json=number,title --method=GET --repo=openclaw/openclaw" {
t.Fatalf("canonical args = %+v", canonical)
}
if got := canonicalGHCommandArgs([]string{"pr", "view", "--repo"}); strings.Join(got, " ") != "pr view --repo" {
t.Fatalf("missing value canonical args = %+v", got)
}
if !ghCacheTagsMatch([]string{"repo:openclaw/openclaw", "issues"}, stringSet([]string{"issues", "repo:openclaw/openclaw"})) {
t.Fatal("specific issue tag should match")
}
if ghCacheTagsMatch([]string{"repo:openclaw/openclaw"}, stringSet([]string{"repo:openclaw/openclaw", "issues"})) {
t.Fatal("repo tag alone should not match specific mutation")
}
app := New()
t.Setenv("GH_REPO", "openclaw/from-env")
tagCases := [][]string{
app.ghCommandCacheTags(context.Background(), []string{"issue", "view", "https://github.com/openclaw/openclaw/issues/10", "-R", "openclaw/openclaw"}),
app.ghCommandCacheTags(context.Background(), []string{"pr", "view", "12"}),
app.ghMutationInvalidationTags(context.Background(), []string{"run", "rerun", "99", "-R", "openclaw/openclaw"}),
app.ghCommandCacheTags(context.Background(), []string{"workflow", "view", "ci.yml", "-R", "openclaw/openclaw"}),
app.ghCommandCacheTags(context.Background(), []string{"release", "view", "v0.7.0", "-R", "openclaw/openclaw"}),
app.ghCommandCacheTags(context.Background(), []string{"api", "repos/openclaw/openclaw/actions/runs/99/jobs"}),
app.ghMutationInvalidationTags(context.Background(), []string{"cache", "delete"}),
}
for _, tags := range tagCases {
if len(tags) == 0 {
t.Fatalf("empty tags")
}
}
if repo := ghCommandRepo([]string{"repo", "view", "openclaw/openclaw"}); repo != "openclaw/openclaw" {
t.Fatalf("repo view repo = %q", repo)
}
if repo := ghAPIRepo([]string{"https://api.github.com/repos/openclaw/openclaw/issues/10"}); repo != "openclaw/openclaw" {
t.Fatalf("api repo = %q", repo)
}
if tags := ghAPITags([]string{"repos/openclaw/openclaw/releases/latest"}); len(tags) < 2 || tags[1] != "releases" {
t.Fatalf("release api tags = %+v", tags)
}
if got := firstGHNumberArg([]string{"--repo", "openclaw/openclaw", "https://github.com/openclaw/openclaw/pull/12"}); got != "12" {
t.Fatalf("first number = %q", got)
}
if got := uniqueStrings([]string{"", "a", " a ", "b"}); len(got) != 2 || got[0] != "a" || got[1] != "b" {
t.Fatalf("unique = %+v", got)
}
completedRun := ghCommandCacheEntry{Args: []string{"run", "view", "99"}, Stdout: `{"status":"completed"}`}
if ttl := ghCompletedRunCacheTTL(completedRun); ttl != 12*time.Hour {
t.Fatalf("run view ttl = %s", ttl)
}
completedList := ghCommandCacheEntry{Args: []string{"api", "repos/openclaw/openclaw/actions/runs"}, Stdout: `{"workflow_runs":[{"status":"completed"}]}`}
if ttl := ghCompletedRunCacheTTL(completedList); ttl != 30*time.Minute {
t.Fatalf("run list ttl = %s", ttl)
}
jobs := ghCommandCacheEntry{Args: []string{"api", "repos/openclaw/openclaw/actions/runs/99/jobs"}, Stdout: `{"jobs":[{"conclusion":"success"}]}`}
if ttl := ghCompletedRunCacheTTL(jobs); ttl != 12*time.Hour {
t.Fatalf("jobs ttl = %s", ttl)
}
if ghJSONStatusCompleted(`{`) || ghJSONCollectionCompleted(`[]`) || allGHStatusMapsCompleted([]map[string]any{{"status": "queued"}}) {
t.Fatal("incomplete JSON status classified as completed")
}
if !cacheableGHRead([]string{"label", "list"}) || !cacheableGHRead([]string{"org", "list"}) || !cacheableGHRead([]string{"search", "repos"}) {
t.Fatal("expected read-only gh commands to be cacheable")
}
if ghCommandName(nil) != "" || ghCommandName([]string{"pr"}) != "pr" || ghCommandName([]string{"api", "repos/x/y"}) != "api" {
t.Fatal("gh command name mismatch")
}
if ghRunCacheTTL(nil) != 30*time.Second || ghRunCacheTTL([]string{"view", "--job", "1"}) != time.Minute || ghRunCacheTTL([]string{"rerun"}) != 30*time.Second {
t.Fatal("run ttl mismatch")
}
if ttl := ghAPICacheTTL([]string{"repos/openclaw/openclaw/actions/runs/99/jobs"}); ttl != time.Minute {
t.Fatalf("jobs ttl = %s", ttl)
}
if ttl := ghAPICacheTTL([]string{"repos/openclaw/openclaw/contents/file?ref=main"}); ttl != 30*time.Minute {
t.Fatalf("unstable content ttl = %s", ttl)
}
if !ghAPIContentRefIsStableReleaseTag("refs/tags/v1.2.3") || !ghAPIContentRefIsStableReleaseTag("v1.2.3+build") || ghAPIContentRefIsStableReleaseTag("v1.2") {
t.Fatal("version ref classification mismatch")
}
}
func TestPortableRuntimeHelperBranches(t *testing.T) {
ctx := context.Background()
dir := t.TempDir()
root := filepath.Join(dir, "store")
dbPath := filepath.Join(root, "data", "openclaw__openclaw.sync.db")
if err := os.MkdirAll(filepath.Dir(dbPath), 0o755); err != nil {
t.Fatalf("mkdir db dir: %v", err)
}
if err := os.Mkdir(filepath.Join(root, ".git"), 0o755); err != nil {
t.Fatalf("mkdir git dir: %v", err)
}
if err := os.WriteFile(dbPath, []byte("db-v1"), 0o644); err != nil {
t.Fatalf("write db: %v", err)
}
app := New()
app.configPath = filepath.Join(dir, "config.toml")
mirror, err := app.portableRuntimeDBPath(dbPath)
if err != nil {
t.Fatalf("runtime path: %v", err)
}
changed, err := refreshPortableRuntimeDB(ctx, dbPath, mirror, false)
if err != nil || !changed {
t.Fatalf("initial runtime copy changed=%v err=%v", changed, err)
}
changed, err = refreshPortableRuntimeDB(ctx, dbPath, mirror, false)
if err != nil || changed {
t.Fatalf("second runtime copy changed=%v err=%v", changed, err)
}
if needs, err := portableRuntimeNeedsCopy(filepath.Join(dir, "missing.db"), mirror); err == nil || needs {
t.Fatalf("missing source needs=%v err=%v", needs, err)
}
if _, ok := portableStoreRoot(filepath.Join(dir, "plain", "db.sqlite")); ok {
t.Fatal("plain db should not have portable root")
}
if gitWorktreeClean(ctx, root) {
t.Fatal("fake git directory should not be a clean worktree")
}
statePath := portableStoreRefreshStatePath(mirror)
state := portableStoreRefreshState{LastSuccess: time.Now().UTC().Format(time.RFC3339Nano)}
if err := writePortableStoreRefreshState(statePath, state); err != nil {
t.Fatalf("write state: %v", err)
}
if got := readPortableStoreRefreshState(statePath); got.LastSuccess == "" {
t.Fatalf("read state = %+v", got)
}
if got := readPortableStoreRefreshState(filepath.Join(dir, "missing.json")); got.LastSuccess != "" {
t.Fatalf("missing state = %+v", got)
}
if !recentPortableRefresh(state.LastSuccess, time.Now().UTC(), time.Hour) || recentPortableRefresh("bad", time.Now().UTC(), time.Hour) || recentPortableRefresh("", time.Now().UTC(), time.Hour) {
t.Fatal("recent refresh classification mismatch")
}
t.Setenv("GITCRAWL_PORTABLE_REFRESH_TTL", "0")
if portableStoreRefreshInterval() != 0 {
t.Fatal("zero refresh ttl not honored")
}
if err := copyFileAtomic(filepath.Join(dir, "missing"), filepath.Join(dir, "out", "db")); err == nil {
t.Fatal("missing source copy should fail")
}
}
func TestGHCacheClearMatchingBranches(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimRepo(t, ctx)
cfg, err := config.Load(configPath)
if err != nil {
t.Fatalf("load config: %v", err)
}
app := New()
app.configPath = configPath
dir, err := app.ghCommandCacheDir()
if err != nil {
t.Fatalf("cache dir: %v", err)
}
entry := ghCommandCacheEntry{
Args: []string{"issue", "view", "10", "-R", "openclaw/openclaw"},
Stdout: "{}",
Stderr: "",
ExitCode: 0,
CreatedAt: time.Now(),
Tags: []string{"repo:openclaw/openclaw", "issues", "issue:10"},
}
data, err := json.Marshal(entry)
if err != nil {
t.Fatalf("marshal entry: %v", err)
}
entryPath := filepath.Join(dir, "entry.json")
if err := os.WriteFile(entryPath, data, 0o644); err != nil {
t.Fatalf("write entry: %v", err)
}
if err := os.WriteFile(filepath.Join(dir, "entry.lock"), []byte("lock"), 0o644); err != nil {
t.Fatalf("write lock: %v", err)
}
if err := os.WriteFile(filepath.Join(dir, "ignore.txt"), []byte("x"), 0o644); err != nil {
t.Fatalf("write ignored entry: %v", err)
}
if err := app.clearGHCommandCacheMatching([]string{"issue:10"}); err != nil {
t.Fatalf("clear matching: %v", err)
}
if _, err := os.Stat(entryPath); !os.IsNotExist(err) {
t.Fatalf("entry still exists: %v", err)
}
if _, err := os.Stat(filepath.Join(dir, "entry.lock")); !os.IsNotExist(err) {
t.Fatalf("lock still exists: %v", err)
}
if err := os.WriteFile(entryPath, data, 0o644); err != nil {
t.Fatalf("rewrite entry: %v", err)
}
if err := app.clearGHCommandCacheForMutation(ctx, []string{"cache", "delete"}); err != nil {
t.Fatalf("clear global mutation: %v", err)
}
if _, err := os.Stat(entryPath); !os.IsNotExist(err) {
t.Fatalf("global clear left entry: %v", err)
}
if cfg.CacheDir == "" {
t.Fatal("seed config cache dir should not be empty")
}
}
func TestGHMetricsSearchRunsAndXCacheBranches(t *testing.T) {
var counters ghXCacheCounters
if incrementGHXCacheCounters(&counters, "unknown", nil) {
t.Fatal("unknown counter should not increment")
}
for _, name := range []string{"local_hits", "fallback_hits", "stale_hits", "backend_misses", "pass_through_writes"} {
if !incrementGHXCacheCounters(&counters, name, []string{"api", "repos/openclaw/openclaw/actions/runs/99"}) {
t.Fatalf("counter %s did not increment", name)
}
}
var bucket ghXCacheCounterBucket
for _, name := range []string{"local_hits", "fallback_hits", "stale_hits", "backend_misses", "pass_through_writes"} {
if !incrementGHXCacheCounterBucket(&bucket, name, []string{"pr", "view", "12", "-R", "openclaw/openclaw"}) {
t.Fatalf("bucket counter %s did not increment", name)
}
}
if incrementGHXCacheCounterBucket(&bucket, "bad", nil) {
t.Fatal("unknown bucket counter should not increment")
}
if got := ghCommandMissKey([]string{"pr", "view", strings.Repeat("x", 220)}); len(got) != 180 || !strings.HasSuffix(got, "...") {
t.Fatalf("miss key = %q len=%d", got, len(got))
}
if route := ghCommandRoute([]string{"api", "repos/openclaw/openclaw/actions/runs/99"}); !strings.Contains(route, "/actions/runs/:id") {
t.Fatalf("api route = %q", route)
}
if route := ghCommandRoute([]string{"pr"}); route != "pr" {
t.Fatalf("single route = %q", route)
}
now := time.Now().UTC()
counters.Hourly = map[string]ghXCacheCounterBucket{
"old": {StartedAt: now.Add(-2 * time.Hour), LocalHits: 9},
"new": {StartedAt: now.Add(-5 * time.Minute), LocalHits: 1, BackendMissesByCommand: map[string]int64{"api": 2}},
"zero": {LocalHits: 8},
}
recent := counters.since(time.Hour, now)
if recent.LocalHits != 1 || recent.BackendMissesByCommand["api"] != 2 {
t.Fatalf("recent counters = %+v", recent)
}
mergeCounterMap(&recent.BackendMissesByRoute, map[string]int64{"r": 3})
if recent.BackendMissesByRoute["r"] != 3 {
t.Fatalf("merged counters = %+v", recent.BackendMissesByRoute)
}
buckets := map[string]ghXCacheCounterBucket{"old": {StartedAt: now.Add(-8 * 24 * time.Hour)}, "new": {StartedAt: now}}
pruneGHXCacheBuckets(buckets, now.Add(-7*24*time.Hour))
if _, ok := buckets["old"]; ok || buckets["new"].StartedAt.IsZero() {
t.Fatalf("pruned buckets = %+v", buckets)
}
if _, start := ghXCacheCurrentBucket(now); !start.Equal(now.Truncate(time.Hour)) {
t.Fatalf("bucket start = %s", start)
}
if !staleGHCommandCacheLock(fakeFileInfo{mod: now.Add(-3 * time.Minute)}) || staleGHCommandCacheLock(fakeFileInfo{mod: now}) {
t.Fatal("stale lock classification mismatch")
}
thread := store.Thread{
GitHubID: "99", Number: 99, Title: "Title", State: "open", HTMLURL: "https://example.com/99",
LabelsJSON: `["bug",""]`, AuthorLogin: "alice", AuthorType: "User", Body: "body",
UpdatedAt: "2026-05-08T00:00:00Z", CreatedAtGitHub: "2026-05-07T00:00:00Z", ClosedAtGitHub: "", IsDraft: true,
}
fields := "number,id,title,state,url,updatedAt,createdAt,closedAt,mergedAt,labels,isDraft,author,body"
rows, err := ghSearchJSONRows([]store.Thread{thread}, fields)
if err != nil || rows[0]["number"] != 99 {
t.Fatalf("search rows=%+v err=%v", rows, err)
}
if labels := ghLabelsFromJSON(`not-json`); labels != nil {
t.Fatalf("bad labels = %+v", labels)
}
if labels := ghLabelsFromJSON(`[{"name":"bug","color":"red"}]`); len(labels) != 1 || labels[0].Name != "bug" {
t.Fatalf("object labels = %+v", labels)
}
if _, err := ghSearchJSONRows([]store.Thread{thread}, "unsupported"); err == nil {
t.Fatal("unsupported search json field should fail")
}
if _, err := ghSearchJSONRows([]store.Thread{thread}, " "); err == nil {
t.Fatal("empty search json fields should fail")
}
query, repo, state := parseGHSearchQuery("repo:openclaw/openclaw is:pr is:open crash")
if query != "crash" || repo != "openclaw/openclaw" || state != "open" {
t.Fatalf("query=%q repo=%q state=%q", query, repo, state)
}
if !isGHSearchKind("pull-requests") || ghSearchKind("pulls") != "pull_request" || ghSearchKind("issues") != "issue" {
t.Fatal("search kind mismatch")
}
if _, err := parseGHSearchDuration("0"); err == nil {
t.Fatal("zero duration should fail")
}
if duration, err := parseGHSearchDuration("5"); err != nil || duration != 5*time.Second {
t.Fatalf("seconds duration=%s err=%v", duration, err)
}
if _, err := parseGHSearchLimit("5", "6"); err == nil {
t.Fatal("disagreeing limits should fail")
}
runs := []store.WorkflowRun{{
RunID: "99", RunNumber: 7, WorkflowName: "CI", Status: "completed", Conclusion: "success",
HTMLURL: "https://example.com/run", Event: "push", HeadBranch: "main", HeadSHA: "abc",
CreatedAtGH: "2026-05-08T00:00:00Z", UpdatedAtGH: "2026-05-08T00:01:00Z",
}, {RunID: "not-number", WorkflowName: "Deploy"}}
runRows := ghWorkflowRunJSONRows(runs, "databaseId,id,number,workflowName,name,displayTitle,status,conclusion,url,event,headBranch,headSha,createdAt,updatedAt")
if runRows[0]["databaseId"] != int64(99) || runRows[1]["databaseId"] != "not-number" {
t.Fatalf("run rows = %+v", runRows)
}
dir := t.TempDir()
entry := ghCommandCacheEntry{Args: []string{"run", "list"}, CreatedAt: time.Now().Add(-time.Hour), Stdout: "[]"}
data, err := json.Marshal(entry)
if err != nil {
t.Fatalf("marshal entry: %v", err)
}
if err := os.WriteFile(filepath.Join(dir, "good.json"), data, 0o644); err != nil {
t.Fatalf("write entry: %v", err)
}
if err := os.WriteFile(filepath.Join(dir, "bad.json"), []byte("{"), 0o644); err != nil {
t.Fatalf("write bad entry: %v", err)
}
entries, err := os.ReadDir(dir)
if err != nil {
t.Fatalf("read dir: %v", err)
}
found := false
for _, entry := range entries {
if info, ok := ghCommandCacheKeyInfoFromDirEntry(dir, entry); ok && info.Key == "good" {
found = true
}
}
if !found {
t.Fatal("cache key info did not parse good entry")
}
var buf bytes.Buffer
printGHXCacheMisses(&buf, "Misses", map[string]int64{"b": 1, "a": 2})
if !strings.Contains(buf.String(), "Misses") {
t.Fatalf("miss output = %q", buf.String())
}
}
type fakeFileInfo struct{ mod time.Time }
func (f fakeFileInfo) Name() string { return "fake" }
func (f fakeFileInfo) Size() int64 { return 0 }
func (f fakeFileInfo) Mode() os.FileMode { return 0 }
func (f fakeFileInfo) ModTime() time.Time { return f.mod }
func (f fakeFileInfo) IsDir() bool { return false }
func (f fakeFileInfo) Sys() any { return nil }


@@ -6,6 +6,7 @@ import (
"os"
"path/filepath"
"sort"
"strconv"
"strings"
"time"
)
@@ -308,9 +309,8 @@ func firstGHNumberArg(args []string) string {
}
continue
}
-arg = strings.TrimPrefix(strings.TrimSpace(arg), "#")
-if isDecimalString(arg) {
-return arg
+if ref, ok := parseThreadReference(arg); ok && ref.Number > 0 {
+return strconv.Itoa(ref.Number)
}
}
return ""
@@ -347,6 +347,12 @@ func ghCompletedRunCacheTTL(entry ghCommandCacheEntry) time.Duration {
}
if entry.Args[0] == "api" {
route := normalizeGHAPIRoute(entry.Args[1:])
if strings.Contains(route, "/actions/runs/:id/jobs") && ghJSONJobsCompleted(entry.Stdout) {
return 12 * time.Hour
}
if strings.Contains(route, "/actions/jobs/:id") && ghJSONStatusCompleted(entry.Stdout) {
return 12 * time.Hour
}
if strings.Contains(route, "/actions/runs/:id") && ghJSONStatusCompleted(entry.Stdout) {
return 12 * time.Hour
}
@@ -357,6 +363,16 @@ func ghCompletedRunCacheTTL(entry ghCommandCacheEntry) time.Duration {
return 0
}
func ghJSONJobsCompleted(raw string) bool {
var payload struct {
Jobs []map[string]any `json:"jobs"`
}
if err := json.Unmarshal([]byte(raw), &payload); err == nil {
return len(payload.Jobs) > 0 && allGHStatusMapsCompleted(payload.Jobs)
}
return ghJSONCollectionCompleted(raw)
}
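A standalone sketch of the jobs-completion check follows. `allCompleted` is a hypothetical stand-in for `allGHStatusMapsCompleted`, which is not shown in this diff. Its rule (status is `"completed"` or conclusion is non-empty) is an assumption inferred from the tests, where `{"conclusion":"success"}` alone earns the long TTL and `{"status":"queued"}` does not.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// allCompleted assumes an item is finished when its status is "completed"
// or it carries a non-empty conclusion (assumption; see lead-in).
func allCompleted(items []map[string]any) bool {
	for _, item := range items {
		status, _ := item["status"].(string)
		conclusion, _ := item["conclusion"].(string)
		if status != "completed" && conclusion == "" {
			return false
		}
	}
	return true
}

// jobsCompleted accepts either {"jobs":[...]} or a bare JSON array and
// reports whether there is at least one job and every job has finished.
func jobsCompleted(raw string) bool {
	var payload struct {
		Jobs []map[string]any `json:"jobs"`
	}
	if err := json.Unmarshal([]byte(raw), &payload); err == nil {
		return len(payload.Jobs) > 0 && allCompleted(payload.Jobs)
	}
	// A top-level array fails to decode into the struct, so retry as a list.
	var list []map[string]any
	if err := json.Unmarshal([]byte(raw), &list); err != nil {
		return false
	}
	return len(list) > 0 && allCompleted(list)
}

func main() {
	fmt.Println(jobsCompleted(`{"jobs":[{"status":"completed","conclusion":"success"}]}`))
	fmt.Println(jobsCompleted(`{"jobs":[{"status":"in_progress","conclusion":null}]}`))
	fmt.Println(jobsCompleted(`[]`))
}
```

Requiring at least one job matters: an empty `jobs` array usually means the run has not scheduled anything yet, so caching it for 12 hours would hide the real result.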
func ghJSONStatusCompleted(raw string) bool {
var payload map[string]any
if err := json.Unmarshal([]byte(raw), &payload); err != nil {


@@ -6,6 +6,7 @@ import (
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/openclaw/gitcrawl/internal/config"
@@ -54,6 +55,17 @@ func TestGHShimViewAndListUseLocalCache(t *testing.T) {
t.Fatalf("checks = %#v", checks)
}
stdout.Reset()
if err := run.Run(ctx, []string{"--config", configPath, "gh", "pr", "checks", "https://github.com/openclaw/openclaw/pull/12", "--json", "name,state"}); err != nil {
t.Fatalf("gh pr checks URL: %v", err)
}
if err := json.Unmarshal(stdout.Bytes(), &checks); err != nil {
t.Fatalf("decode URL checks: %v\n%s", err, stdout.String())
}
if len(checks) != 1 || checks[0]["name"] != "test" || checks[0]["state"] != "SUCCESS" {
t.Fatalf("URL checks = %#v", checks)
}
stdout.Reset()
if err := run.Run(ctx, []string{"--config", configPath, "gh", "run", "list", "-R", "openclaw/openclaw", "--branch", "manifest-cache", "--json", "databaseId,workflowName,status,conclusion,headSha"}); err != nil {
t.Fatalf("gh run list: %v", err)
@@ -112,6 +124,28 @@ func TestGHShimViewAndListUseLocalCache(t *testing.T) {
if len(list) != 1 || int(list[0]["number"].(float64)) != 10 {
t.Fatalf("filtered list = %#v", list)
}
stdout.Reset()
if err := run.Run(ctx, []string{"--config", configPath, "gh", "issue", "view", "10", "-R", "openclaw/openclaw"}); err != nil {
t.Fatalf("gh issue human view: %v", err)
}
if got := stdout.String(); !strings.Contains(got, "title:\tHot loop burns CPU") || !strings.Contains(got, "runtime has a hot loop") {
t.Fatalf("human issue view = %q", got)
}
stdout.Reset()
if err := run.Run(ctx, []string{"--config", configPath, "gh", "issue", "list", "-R", "openclaw/openclaw", "--limit", "1"}); err != nil {
t.Fatalf("gh issue human list: %v", err)
}
if got := stdout.String(); !strings.Contains(got, "10\tHot loop burns CPU") {
t.Fatalf("human issue list = %q", got)
}
stdout.Reset()
if err := run.Run(ctx, []string{"--config", configPath, "gh", "pr", "list", "-R", "openclaw/openclaw", "--limit", "1"}); err != nil {
t.Fatalf("gh pr human list: %v", err)
}
if got := stdout.String(); !strings.Contains(got, "12\tManifest cache update") {
t.Fatalf("human pr list = %q", got)
}
}
func TestGHShimAutoHydratesPRDetailsOnMiss(t *testing.T) {


@@ -4,17 +4,32 @@ import (
"encoding/json"
"os"
"path/filepath"
"strings"
"time"
)
type ghXCacheCounters struct {
-LocalHits int64 `json:"local_hits"`
-FallbackHits int64 `json:"fallback_hits"`
-StaleHits int64 `json:"stale_hits"`
-BackendMisses int64 `json:"backend_misses"`
-PassThroughWrites int64 `json:"pass_through_writes"`
+LocalHits int64 `json:"local_hits"`
+FallbackHits int64 `json:"fallback_hits"`
+StaleHits int64 `json:"stale_hits"`
+BackendMisses int64 `json:"backend_misses"`
+PassThroughWrites int64 `json:"pass_through_writes"`
+BackendMissesByCommand map[string]int64 `json:"backend_misses_by_command,omitempty"`
+BackendMissesByRoute map[string]int64 `json:"backend_misses_by_route,omitempty"`
+BackendMissesByKey map[string]int64 `json:"backend_misses_by_key,omitempty"`
+Hourly map[string]ghXCacheCounterBucket `json:"hourly,omitempty"`
}
type ghXCacheCounterBucket struct {
StartedAt time.Time `json:"started_at"`
LocalHits int64 `json:"local_hits,omitempty"`
FallbackHits int64 `json:"fallback_hits,omitempty"`
StaleHits int64 `json:"stale_hits,omitempty"`
BackendMisses int64 `json:"backend_misses,omitempty"`
PassThroughWrites int64 `json:"pass_through_writes,omitempty"`
BackendMissesByCommand map[string]int64 `json:"backend_misses_by_command,omitempty"`
BackendMissesByRoute map[string]int64 `json:"backend_misses_by_route,omitempty"`
BackendMissesByKey map[string]int64 `json:"backend_misses_by_key,omitempty"`
}
func (a *App) ghXCacheCounters() (ghXCacheCounters, error) {
@ -57,6 +72,28 @@ func (a *App) incrementGHXCacheCounterWithArgs(name string, args []string) error
_ = os.Remove(lockPath)
}()
stats := readGHXCacheCounters(path)
if !incrementGHXCacheCounters(&stats, name, args) {
return nil
}
bucketKey, bucketStart := ghXCacheCurrentBucket(time.Now())
if stats.Hourly == nil {
stats.Hourly = map[string]ghXCacheCounterBucket{}
}
bucket := stats.Hourly[bucketKey]
if bucket.StartedAt.IsZero() {
bucket.StartedAt = bucketStart
}
_ = incrementGHXCacheCounterBucket(&bucket, name, args)
stats.Hourly[bucketKey] = bucket
pruneGHXCacheBuckets(stats.Hourly, time.Now().Add(-7*24*time.Hour))
data, err := json.Marshal(stats)
if err != nil {
return err
}
return writeAtomicFile(path, data, 0o600)
}
func incrementGHXCacheCounters(stats *ghXCacheCounters, name string, args []string) bool {
switch name {
case "local_hits":
stats.LocalHits++
@ -66,29 +103,69 @@ func (a *App) incrementGHXCacheCounterWithArgs(name string, args []string) error
stats.StaleHits++
case "backend_misses":
stats.BackendMisses++
if len(args) > 0 {
if stats.BackendMissesByCommand == nil {
stats.BackendMissesByCommand = map[string]int64{}
}
command := ghCommandName(args)
stats.BackendMissesByCommand[command]++
if route := ghCommandRoute(args); route != "" {
if stats.BackendMissesByRoute == nil {
stats.BackendMissesByRoute = map[string]int64{}
}
stats.BackendMissesByRoute[route]++
}
}
incrementGHXCacheMissMaps(&stats.BackendMissesByCommand, &stats.BackendMissesByRoute, &stats.BackendMissesByKey, args)
case "pass_through_writes":
stats.PassThroughWrites++
default:
return nil
return false
}
data, err := json.Marshal(stats)
if err != nil {
return err
return true
}
func incrementGHXCacheCounterBucket(bucket *ghXCacheCounterBucket, name string, args []string) bool {
switch name {
case "local_hits":
bucket.LocalHits++
case "fallback_hits":
bucket.FallbackHits++
case "stale_hits":
bucket.StaleHits++
case "backend_misses":
bucket.BackendMisses++
incrementGHXCacheMissMaps(&bucket.BackendMissesByCommand, &bucket.BackendMissesByRoute, &bucket.BackendMissesByKey, args)
case "pass_through_writes":
bucket.PassThroughWrites++
default:
return false
}
return writeAtomicFile(path, data, 0o600)
return true
}
func incrementGHXCacheMissMaps(byCommand, byRoute, byKey *map[string]int64, args []string) {
if len(args) == 0 {
return
}
if *byCommand == nil {
*byCommand = map[string]int64{}
}
(*byCommand)[ghCommandName(args)]++
if route := ghCommandRoute(args); route != "" {
if *byRoute == nil {
*byRoute = map[string]int64{}
}
(*byRoute)[route]++
}
if key := ghCommandMissKey(args); key != "" {
if *byKey == nil {
*byKey = map[string]int64{}
}
(*byKey)[key]++
}
}
func ghCommandMissKey(args []string) string {
if len(args) == 0 {
return ""
}
canonical := canonicalGHCommandArgs(args)
if len(canonical) == 0 {
return ghCommandName(args)
}
key := strings.Join(canonical, " ")
if len(key) > 180 {
key = key[:177] + "..."
}
return key
}
func ghCommandRoute(args []string) string {
@ -116,6 +193,53 @@ func readGHXCacheCounters(path string) ghXCacheCounters {
return stats
}
func (c ghXCacheCounters) since(since time.Duration, now time.Time) ghXCacheCounters {
if since <= 0 {
return c
}
cutoff := now.Add(-since)
var out ghXCacheCounters
for _, bucket := range c.Hourly {
if bucket.StartedAt.IsZero() || bucket.StartedAt.Before(cutoff) {
continue
}
out.LocalHits += bucket.LocalHits
out.FallbackHits += bucket.FallbackHits
out.StaleHits += bucket.StaleHits
out.BackendMisses += bucket.BackendMisses
out.PassThroughWrites += bucket.PassThroughWrites
mergeCounterMap(&out.BackendMissesByCommand, bucket.BackendMissesByCommand)
mergeCounterMap(&out.BackendMissesByRoute, bucket.BackendMissesByRoute)
mergeCounterMap(&out.BackendMissesByKey, bucket.BackendMissesByKey)
}
return out
}
func mergeCounterMap(dst *map[string]int64, src map[string]int64) {
if len(src) == 0 {
return
}
if *dst == nil {
*dst = map[string]int64{}
}
for key, value := range src {
(*dst)[key] += value
}
}
func ghXCacheCurrentBucket(now time.Time) (string, time.Time) {
start := now.UTC().Truncate(time.Hour)
return start.Format("2006-01-02T15:00:00Z"), start
}
func pruneGHXCacheBuckets(buckets map[string]ghXCacheCounterBucket, cutoff time.Time) {
for key, bucket := range buckets {
if !bucket.StartedAt.IsZero() && bucket.StartedAt.Before(cutoff) {
delete(buckets, key)
}
}
}
func writeAtomicFile(path string, data []byte, perm os.FileMode) error {
if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
return err


@ -0,0 +1,251 @@
package cli
import (
"bytes"
"context"
"errors"
"os"
"path/filepath"
"strings"
"testing"
"time"
"github.com/openclaw/gitcrawl/internal/store"
)
func TestGHShimPRCacheAndPolicyHelperBranches(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimRepo(t, ctx)
app := New()
app.configPath = configPath
var stdout bytes.Buffer
app.Stdout = &stdout
if err := app.Run(ctx, []string{"--config", configPath, "gh", "pr", "checks", "12", "-R", "openclaw/openclaw"}); err != nil {
t.Fatalf("human pr checks: %v", err)
}
if !strings.Contains(stdout.String(), "test\tcompleted\tsuccess") {
t.Fatalf("human checks = %q", stdout.String())
}
cache, err := app.localGHPullRequestCache(ctx, "openclaw/openclaw", 12)
if err != nil {
t.Fatalf("local pr cache: %v", err)
}
if _, err := app.loadGHPullRequestCache(ctx, "openclaw/openclaw", 12, false); err != nil {
t.Fatalf("load cached pr detail without freshness: %v", err)
}
if _, err := app.loadGHPullRequestCache(ctx, "openclaw/openclaw", 12, true); err != nil {
t.Fatalf("load fresh cached pr detail: %v", err)
}
if !ghPullRequestCacheFresh(cache) {
t.Fatalf("seeded cache should be fresh: %+v", cache.Detail)
}
cache.Detail.RawJSON = `{"head":{"sha":"different"}}`
if ghPullRequestCacheFresh(cache) {
t.Fatal("mismatched raw head sha should be stale")
}
cache.Detail.RawJSON = `{"head":{"sha":"abc123"}}`
cache.Detail.FetchedAt = "bad"
if ghPullRequestCacheFresh(cache) {
t.Fatal("bad fetched timestamp should be stale")
}
if !app.shouldAutoHydrateGHPRDetails(localGHUnsupported(errors.New("pull request detail: sql: no rows in result set"))) {
t.Fatal("missing local PR cache should auto-hydrate")
}
t.Setenv("GITCRAWL_GH_AUTO_HYDRATE", "0")
if app.shouldAutoHydrateGHThread(nil) {
t.Fatal("auto-hydrate env disable not honored")
}
if _, err := app.loadGHPullRequestCache(ctx, "openclaw/openclaw", 9999, true); err == nil {
t.Fatal("missing PR cache with auto-hydrate disabled should fail")
}
t.Setenv("GITCRAWL_GH_AUTO_HYDRATE", "")
if isMissingLocalPRCache(nil) || !isMissingLocalPRCache(localGHUnsupported(errors.New("cached PR branch \"x\" was not found"))) {
t.Fatal("missing cache classification mismatch")
}
number, err := app.findGHPullRequestNumberByBranch(ctx, "openclaw/openclaw", "manifest-cache")
if err != nil || number != 12 {
t.Fatalf("branch lookup number=%d err=%v", number, err)
}
if _, err := app.findGHPullRequestNumberByBranch(ctx, "openclaw/openclaw", "missing"); err == nil {
t.Fatal("missing branch lookup should fail")
}
if got := ghPRHeadRefFromRawJSON(`{"head":{"ref":" feature/cache "}}`); got != "feature/cache" {
t.Fatalf("head ref = %q", got)
}
if got := ghPRHeadRefFromRawJSON(`{`); got != "" {
t.Fatalf("invalid head ref = %q", got)
}
if !ghPRFieldsNeedFresh([]string{"number", "statusCheckRollup"}) || !ghPRFieldsNeedFresh([]string{"mergeStateStatus"}) || ghPRFieldsNeedFresh([]string{"files"}) {
t.Fatal("fresh field detection mismatch")
}
thread := store.Thread{IsDraft: true}
for _, field := range []string{"headRepositoryOwner", "headRepository", "mergeStateStatus", "additions", "deletions", "changedFiles", "isDraft"} {
if _, err := ghPRDetailJSONValue(thread, cache, field); err != nil {
t.Fatalf("field %s: %v", field, err)
}
}
if _, err := ghPRDetailJSONValue(thread, cache, "unsupported"); err == nil {
t.Fatal("unsupported PR detail field should fail")
}
var out bytes.Buffer
app.Stdout = &out
if err := app.writeJSONValue(map[string]any{"value": 1}, ""); err != nil || !strings.Contains(out.String(), `"value": 1`) {
t.Fatalf("write json out=%q err=%v", out.String(), err)
}
if err := app.writeJSONValue(make(chan int), ""); err == nil {
t.Fatal("unmarshalable JSON value should fail")
}
out.Reset()
if err := app.writeJSONValue(map[string]any{"value": 2}, ".value"); err != nil || strings.TrimSpace(out.String()) != "2" {
t.Fatalf("write json jq out=%q err=%v", out.String(), err)
}
t.Setenv("PATH", "")
if err := app.writeJSONValue(map[string]any{"value": 2}, ".value"); err == nil {
t.Fatal("jq expression without jq executable should fail")
}
}
func TestGHShimCachePolicyExtraBranches(t *testing.T) {
if cacheableGHRead(nil) || cacheableGHRead([]string{"repo", "view", "--web"}) {
t.Fatal("interactive or empty gh commands should not be cacheable")
}
if !cacheableGHRead([]string{"gist", "view", "1"}) || !cacheableGHRead([]string{"project", "item-list"}) || !cacheableGHRead([]string{"cache", "list"}) {
t.Fatal("expected read-only command to be cacheable")
}
if ghAPIReadOnly([]string{"repos/openclaw/gitcrawl/issues", "-f", "title=x"}) || ghAPIReadOnly([]string{"repos/openclaw/gitcrawl", "-X"}) || ghAPIReadOnly([]string{"repos/openclaw/gitcrawl", "--method=PATCH"}) {
t.Fatal("mutating or malformed API command should not be read-only")
}
if got := ghAPIPathArg([]string{"--paginate", "-H", "Accept: json", "--jq", ".[]", "--template", "{{.}}", "repos/openclaw/gitcrawl/issues"}); got != "repos/openclaw/gitcrawl/issues" {
t.Fatalf("api path with skipped flags = %q", got)
}
if got := ghAPIPathArg([]string{"-f", "x=y"}); got != "" {
t.Fatalf("api path with only fields = %q", got)
}
if !ghAPIReadOnly([]string{"repos/openclaw/gitcrawl", "--method=GET"}) {
t.Fatal("GET API command should be read-only")
}
if ghGraphQLReadOnly([]string{"graphql"}) || ghGraphQLReadOnly([]string{"graphql", "-X"}) || ghGraphQLReadOnly([]string{"graphql", "-X", "PUT", "-f", "query={ viewer { login } }"}) || ghGraphQLReadOnly([]string{"graphql", "--field=query=@query.graphql"}) {
t.Fatal("malformed or mutating GraphQL command should not be read-only")
}
if !ghGraphQLReadOnly([]string{"graphql", "--field=query=query { viewer { login } }"}) {
t.Fatal("GraphQL query should be read-only")
}
t.Setenv("GITCRAWL_GH_CACHE_TTL", "2m")
if got := ghCommandCacheTTL([]string{"repo", "view"}); got != 2*time.Minute {
t.Fatalf("env ttl = %s", got)
}
t.Setenv("GITCRAWL_GH_CACHE_TTL", "")
ttlCases := []struct {
args []string
want time.Duration
}{
{[]string{"api", "repos/openclaw/gitcrawl/pages/builds/latest"}, 2 * time.Minute},
{[]string{"api", "repos/openclaw/gitcrawl/pages/health"}, 15 * time.Minute},
{[]string{"api", "repos/openclaw/gitcrawl/actions/jobs/123/logs"}, 12 * time.Hour},
{[]string{"api", "repos/openclaw/gitcrawl/actions/jobs/123"}, time.Minute},
{[]string{"api", "repos/openclaw/gitcrawl/actions/runs/123/pending_deployments"}, 30 * time.Second},
{[]string{"api", "repos/openclaw/gitcrawl/actions/workflows/ci.yml"}, 15 * time.Minute},
{[]string{"api", "repos/openclaw/gitcrawl/releases/latest"}, time.Hour},
{[]string{"api", "repos/openclaw/gitcrawl/branches/main"}, 10 * time.Minute},
{[]string{"workflow", "list"}, 15 * time.Minute},
{[]string{"issue", "view"}, 5 * time.Minute},
{[]string{"unknown"}, 5 * time.Minute},
}
for _, tc := range ttlCases {
if got := ghCommandCacheTTL(tc.args); got != tc.want {
t.Fatalf("ttl %v = %s, want %s", tc.args, got, tc.want)
}
}
if !ghAPIContentRefIsStable([]string{"repos/openclaw/gitcrawl/contents/a?ref=v1.2.3-beta+1"}) || ghAPIContentRefIsStable([]string{"repos/openclaw/gitcrawl/contents/a?ref=refs/heads/v1.2.3"}) || ghAPIContentRefIsStable([]string{"repos/openclaw/gitcrawl/contents/a?ref=v1.2"}) {
t.Fatal("stable content ref classification mismatch")
}
t.Setenv("GH_REPO", "openclaw/from-env")
repo, number, ok := parseGHPRDiffIdentityArgs([]string{"pr", "diff", "42"})
if !ok || repo != "openclaw/from-env" || number != 42 {
t.Fatalf("diff identity repo=%q number=%d ok=%v", repo, number, ok)
}
repo, number, ok = parseGHPRDiffIdentityArgs([]string{"pr", "diff", "https://github.com/openclaw/openclaw/pull/78601"})
if !ok || repo != "openclaw/openclaw" || number != 78601 {
t.Fatalf("diff URL identity repo=%q number=%d ok=%v", repo, number, ok)
}
repo, number, ok = parseGHPRDiffIdentityArgs([]string{"pr", "diff", "https://github.com/openclaw/openclaw/issues/78601"})
if !ok || repo != "openclaw/openclaw" || number != 78601 {
t.Fatalf("diff issue URL identity repo=%q number=%d ok=%v", repo, number, ok)
}
for _, args := range [][]string{{"issue", "close"}, {"pr", "merge"}, {"project", "item-add"}, {"release", "upload"}, {"repo", "delete"}, {"run", "rerun"}, {"secret", "set"}, {"variable", "delete"}, {"workflow", "disable"}, {"api", "repos/openclaw/gitcrawl/issues", "-f", "title=x"}} {
if !mutatingGHCommand(args) {
t.Fatalf("%v should be mutating", args)
}
}
if mutatingGHCommand([]string{"pr", "checkout"}) || mutatingGHCommand([]string{"repo", "view"}) || mutatingGHCommand([]string{"api", "repos/openclaw/gitcrawl"}) {
t.Fatal("read-only commands classified as mutating")
}
for _, remote := range []string{"git@github.com:openclaw/gitcrawl.git", "https://github.com/openclaw/gitcrawl.git", "ssh://git@github.com/openclaw/gitcrawl.git"} {
if got, err := ownerRepoFromGitRemote(remote); err != nil || got != "openclaw/gitcrawl" {
t.Fatalf("remote %q => %q err=%v", remote, got, err)
}
}
if _, err := ownerRepoFromGitRemote("not-a-github-remote"); err == nil {
t.Fatal("bad remote should fail")
}
app := New()
if got, err := app.resolveGHRepo(context.Background(), " openclaw/explicit "); err != nil || got != "openclaw/explicit" {
t.Fatalf("explicit repo = %q err=%v", got, err)
}
if got, err := app.resolveGHRepo(context.Background(), ""); err != nil || got != "openclaw/from-env" {
t.Fatalf("env repo = %q err=%v", got, err)
}
t.Setenv("GH_REPO", "")
repoDir := t.TempDir()
if err := runGit(context.Background(), repoDir, "init", "-b", "main"); err != nil {
t.Fatalf("init git repo: %v", err)
}
if err := runGit(context.Background(), repoDir, "remote", "add", "origin", "https://github.com/openclaw/gitcrawl.git"); err != nil {
t.Fatalf("add origin: %v", err)
}
original, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
defer func() { _ = os.Chdir(original) }()
if err := os.Chdir(repoDir); err != nil {
t.Fatalf("chdir repo: %v", err)
}
if got, err := app.resolveGHRepo(context.Background(), ""); err != nil || got != "openclaw/gitcrawl" {
t.Fatalf("git remote repo = %q err=%v", got, err)
}
ghPath := filepath.Join(t.TempDir(), "gh")
if err := os.WriteFile(ghPath, []byte("#!/bin/sh\necho real-gh:$*\n"), 0o755); err != nil {
t.Fatalf("write fake gh: %v", err)
}
t.Setenv("GITCRAWL_GH_PATH", ghPath)
var ghOut bytes.Buffer
app.Stdout = &ghOut
if err := app.runGHShim(context.Background(), nil); err != nil {
t.Fatalf("empty gh shim fallback: %v", err)
}
if strings.TrimSpace(ghOut.String()) != "real-gh:" {
t.Fatalf("empty gh shim output = %q", ghOut.String())
}
shimPath := filepath.Join(t.TempDir(), "gitcrawl-gh")
if err := os.WriteFile(shimPath, []byte("#!/bin/sh\necho shim\n"), 0o755); err != nil {
t.Fatalf("write fake shim: %v", err)
}
shimLink := filepath.Join(t.TempDir(), "gh")
if err := os.Symlink(shimPath, shimLink); err != nil {
t.Fatalf("symlink fake shim: %v", err)
}
t.Setenv("GITCRAWL_GH_PATH", shimLink)
if _, err := resolveRealGHPath(); err == nil || !strings.Contains(err.Error(), "gitcrawl shim") {
t.Fatalf("shim path should fail fast, err=%v", err)
}
t.Setenv("GITCRAWL_GH_STALE_GRACE", "3m")
if got := ghCommandCacheStaleGrace([]string{"api", "users/octocat"}); got != 3*time.Minute {
t.Fatalf("env stale grace = %s", got)
}
t.Setenv("GITCRAWL_GH_STALE_GRACE", "")
if got := ghCommandCacheStaleGrace([]string{"api", "users/octocat"}); got != 24*time.Hour {
t.Fatalf("user stale grace = %s", got)
}
}
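The remote-parsing cases exercised in the test above (SCP-style, HTTPS, and `ssh://` remotes, plus a rejected non-GitHub string) could be satisfied by something like the following. This is a sketch only: `ownerRepoFromRemote` is a hypothetical function written for illustration, and the real `ownerRepoFromGitRemote` is not shown in this diff and may behave differently on inputs the test does not cover.

```go
package main

import (
	"fmt"
	"strings"
)

// ownerRepoFromRemote is a hypothetical sketch, not the project's
// ownerRepoFromGitRemote. It accepts the three remote shapes used in the test
// and returns "owner/repo".
func ownerRepoFromRemote(remote string) (string, error) {
	trimmed := strings.TrimSuffix(strings.TrimSpace(remote), ".git")
	prefixes := []string{
		"git@github.com:",       // SCP-style SSH remote
		"https://github.com/",   // HTTPS remote
		"ssh://git@github.com/", // explicit ssh:// remote
	}
	for _, prefix := range prefixes {
		if !strings.HasPrefix(trimmed, prefix) {
			continue
		}
		parts := strings.Split(strings.TrimPrefix(trimmed, prefix), "/")
		if len(parts) == 2 && parts[0] != "" && parts[1] != "" {
			return parts[0] + "/" + parts[1], nil
		}
	}
	return "", fmt.Errorf("not a GitHub remote: %q", remote)
}

func main() {
	remotes := []string{
		"git@github.com:openclaw/gitcrawl.git",
		"https://github.com/openclaw/gitcrawl.git",
		"ssh://git@github.com/openclaw/gitcrawl.git",
		"not-a-github-remote",
	}
	for _, remote := range remotes {
		repo, err := ownerRepoFromRemote(remote)
		fmt.Printf("%q => %q err=%v\n", remote, repo, err)
	}
}
```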


@ -192,13 +192,18 @@ func (a *App) runGHPRChecks(ctx context.Context, args []string) error {
return usageErr(err)
}
if fs.NArg() != 1 {
return usageErr(fmt.Errorf("gh pr checks requires a number"))
return usageErr(fmt.Errorf("gh pr checks requires a number or GitHub URL"))
}
ref, _ := parseThreadReference(fs.Arg(0))
number, err := parseThreadNumber(fs.Arg(0))
if err != nil {
return usageErr(err)
}
repoValue, err := a.resolveGHRepo(ctx, firstNonEmpty(*repoShort, *repoLong))
repoArg := firstNonEmpty(*repoShort, *repoLong)
if repoArg == "" {
repoArg = ref.FullName()
}
repoValue, err := a.resolveGHRepo(ctx, repoArg)
if err != nil {
return localGHUnsupported(err)
}
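The change above lets `gh pr checks` take either a number or a GitHub URL, with an explicit `-R` value taking precedence over the repository embedded in the URL. A minimal sketch of that precedence follows. The `threadRef` type, `parseRef`, and `repoFor` are hypothetical names invented here; the project's `parseThreadReference` and `firstNonEmpty` are not shown in this hunk, so their exact behavior is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// threadRef is a hypothetical stand-in for the value parseThreadReference returns.
type threadRef struct {
	Owner, Repo string
	Number      int
}

// FullName returns "owner/repo", or "" when the reference was a bare number.
func (r threadRef) FullName() string {
	if r.Owner == "" || r.Repo == "" {
		return ""
	}
	return r.Owner + "/" + r.Repo
}

// parseRef accepts a positive number or a github.com pull/issues URL.
func parseRef(arg string) (threadRef, error) {
	arg = strings.TrimSpace(arg)
	if n, err := strconv.Atoi(arg); err == nil {
		if n <= 0 {
			return threadRef{}, fmt.Errorf("invalid number %q", arg)
		}
		return threadRef{Number: n}, nil
	}
	u, err := url.Parse(arg)
	if err != nil || u.Host != "github.com" {
		return threadRef{}, fmt.Errorf("invalid reference %q", arg)
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) != 4 || (parts[2] != "pull" && parts[2] != "issues") {
		return threadRef{}, fmt.Errorf("invalid reference %q", arg)
	}
	n, err := strconv.Atoi(parts[3])
	if err != nil || n <= 0 {
		return threadRef{}, fmt.Errorf("invalid reference %q", arg)
	}
	return threadRef{Owner: parts[0], Repo: parts[1], Number: n}, nil
}

// repoFor prefers the explicit flag value and falls back to the URL's repository.
func repoFor(flagRepo string, ref threadRef) string {
	if repo := strings.TrimSpace(flagRepo); repo != "" {
		return repo
	}
	return ref.FullName()
}

func main() {
	ref, err := parseRef("https://github.com/openclaw/openclaw/pull/12")
	fmt.Println(ref.Number, repoFor("", ref), err)
	fmt.Println(repoFor("openclaw/explicit", ref))
}
```

When both the flag and the URL are empty of repository information, `repoFor` returns an empty string, which in the diff is then handed to `resolveGHRepo` for the environment and git-remote fallbacks.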


@ -4,6 +4,8 @@ import (
"bytes"
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"strings"
@ -64,6 +66,213 @@ func TestGHShimFallsBackForUnsupportedRead(t *testing.T) {
}
}
func TestGHShimFallsBackForEmptyOpenIssueListWithoutBroadSync(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimEmptyRepo(t, ctx)
dir := t.TempDir()
ghPath := filepath.Join(dir, "gh")
if err := os.WriteFile(ghPath, []byte("#!/bin/sh\necho fallback:$*\n"), 0o755); err != nil {
t.Fatalf("write fake gh: %v", err)
}
t.Setenv("GITCRAWL_GH_PATH", ghPath)
run := New()
var stdout bytes.Buffer
run.Stdout = &stdout
if err := run.Run(ctx, []string{"--config", configPath, "gh", "issue", "list", "-R", "openclaw/openclaw", "--state", "open", "--json", "number"}); err != nil {
t.Fatalf("fallback: %v", err)
}
if got := strings.TrimSpace(stdout.String()); got != "fallback:issue list -R openclaw/openclaw --state open --json number" {
t.Fatalf("fallback output = %q", got)
}
}
func TestGHShimSearchFallsBackForEmptyOpenRepoWithoutBroadSync(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimEmptyRepo(t, ctx)
dir := t.TempDir()
ghPath := filepath.Join(dir, "gh")
if err := os.WriteFile(ghPath, []byte("#!/bin/sh\necho fallback:$*\n"), 0o755); err != nil {
t.Fatalf("write fake gh: %v", err)
}
t.Setenv("GITCRAWL_GH_PATH", ghPath)
run := New()
var stdout bytes.Buffer
run.Stdout = &stdout
if err := run.Run(ctx, []string{"--config", configPath, "gh", "search", "issues", "-R", "openclaw/openclaw", "--state", "open", "--json", "number"}); err != nil {
t.Fatalf("fallback: %v", err)
}
if got := strings.TrimSpace(stdout.String()); got != "fallback:search issues -R openclaw/openclaw --state open --json number" {
t.Fatalf("fallback output = %q", got)
}
}
func TestGHShimAutoHydratePortableStoreWritesRuntimeMirror(t *testing.T) {
ctx := context.Background()
dir := t.TempDir()
remoteDir := filepath.Join(dir, "remote")
checkoutDir := filepath.Join(dir, "checkout")
dbRel := filepath.Join("data", "openclaw__openclaw.sync.db")
if err := os.MkdirAll(filepath.Join(remoteDir, "data"), 0o755); err != nil {
t.Fatalf("mkdir remote data: %v", err)
}
if err := runGit(ctx, remoteDir, "init", "-b", "main"); err != nil {
t.Fatalf("git init: %v", err)
}
seedPortableThread(t, filepath.Join(remoteDir, dbRel), 1, "portable issue")
if err := runGit(ctx, remoteDir, "add", dbRel); err != nil {
t.Fatalf("git add seed: %v", err)
}
if err := runGit(ctx, remoteDir, "-c", "user.email=test@example.com", "-c", "user.name=Test", "commit", "-m", "seed store"); err != nil {
t.Fatalf("git commit seed: %v", err)
}
if _, err := syncPortableStore(ctx, remoteDir, checkoutDir); err != nil {
t.Fatalf("clone portable store: %v", err)
}
configPath := filepath.Join(dir, "config.toml")
app := New()
if err := app.Run(ctx, []string{"--config", configPath, "init", "--db", filepath.Join(checkoutDir, dbRel)}); err != nil {
t.Fatalf("init config: %v", err)
}
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/repos/openclaw/openclaw":
_ = json.NewEncoder(w).Encode(map[string]any{"id": 101, "full_name": "openclaw/openclaw"})
case "/repos/openclaw/openclaw/issues/2":
_ = json.NewEncoder(w).Encode(map[string]any{
"id": 502,
"number": 2,
"state": "open",
"title": "runtime-only issue",
"body": "hydrate into runtime mirror",
"html_url": "https://github.com/openclaw/openclaw/issues/2",
"created_at": "2026-05-08T00:00:00Z",
"updated_at": "2026-05-08T00:00:00Z",
"labels": []map[string]any{},
"assignees": []map[string]any{},
"user": map[string]any{"login": "alice", "type": "User"},
})
default:
t.Fatalf("unexpected path: %s", r.URL.String())
}
}))
defer server.Close()
t.Setenv("GITHUB_TOKEN", "test-token")
t.Setenv("GITCRAWL_GITHUB_BASE_URL", server.URL)
run := New()
var stdout bytes.Buffer
run.Stdout = &stdout
if err := run.Run(ctx, []string{"--config", configPath, "gh", "issue", "view", "2", "-R", "openclaw/openclaw", "--json", "number,title"}); err != nil {
t.Fatalf("gh issue view: %v", err)
}
if !strings.Contains(stdout.String(), `"number": 2`) || !strings.Contains(stdout.String(), "runtime-only issue") {
t.Fatalf("view output = %q", stdout.String())
}
if !gitWorktreeClean(ctx, checkoutDir) {
t.Fatal("auto-hydrate dirtied portable checkout")
}
assertPortableThreadPresence(t, ctx, filepath.Join(checkoutDir, dbRel), 2, false)
mirrorPath, err := run.portableRuntimeDBPath(filepath.Join(checkoutDir, dbRel))
if err != nil {
t.Fatalf("runtime db path: %v", err)
}
assertPortableThreadPresence(t, ctx, mirrorPath, 2, true)
}
func TestGHShimViewAcceptsFullGitHubURL(t *testing.T) {
ctx := context.Background()
configPath := seedGHShimRepo(t, ctx)
run := New()
var stdout bytes.Buffer
run.Stdout = &stdout
if err := run.Run(ctx, []string{
"--config", configPath,
"gh", "issue", "view", "https://github.com/openclaw/openclaw/issues/10",
"--json", "number,title,url",
}); err != nil {
t.Fatalf("gh issue view URL: %v", err)
}
var row map[string]any
if err := json.Unmarshal(stdout.Bytes(), &row); err != nil {
t.Fatalf("decode issue view: %v\n%s", err, stdout.String())
}
if int(row["number"].(float64)) != 10 || row["url"] != "https://github.com/openclaw/openclaw/issues/10" {
t.Fatalf("row = %#v", row)
}
}
func seedGHShimEmptyRepo(t *testing.T, ctx context.Context) string {
t.Helper()
dir := t.TempDir()
configPath := filepath.Join(dir, "config.toml")
dbPath := filepath.Join(dir, "gitcrawl.db")
app := New()
if err := app.Run(ctx, []string{"--config", configPath, "init", "--db", dbPath}); err != nil {
t.Fatalf("init: %v", err)
}
cfg, err := config.Load(configPath)
if err != nil {
t.Fatalf("load config: %v", err)
}
cfg.CacheDir = filepath.Join(dir, "cache")
if err := config.Save(configPath, cfg); err != nil {
t.Fatalf("save config: %v", err)
}
st, err := store.Open(ctx, dbPath)
if err != nil {
t.Fatalf("open store: %v", err)
}
repoID, err := st.UpsertRepository(ctx, store.Repository{
Owner: "openclaw",
Name: "openclaw",
FullName: "openclaw/openclaw",
RawJSON: "{}",
UpdatedAt: "2026-05-08T00:00:00Z",
})
if err != nil {
t.Fatalf("seed repository: %v", err)
}
if _, err := st.RecordRun(ctx, store.RunRecord{
RepoID: repoID,
Kind: "sync",
Scope: "numbers:13",
Status: "success",
StartedAt: "2026-05-08T00:00:00Z",
FinishedAt: "2026-05-08T00:00:01Z",
}); err != nil {
t.Fatalf("record targeted sync: %v", err)
}
if err := st.Close(); err != nil {
t.Fatalf("close store: %v", err)
}
return configPath
}
func assertPortableThreadPresence(t *testing.T, ctx context.Context, dbPath string, number int, want bool) {
t.Helper()
st, err := store.OpenReadOnly(ctx, dbPath)
if err != nil {
t.Fatalf("open store %s: %v", dbPath, err)
}
defer st.Close()
repo, err := st.RepositoryByFullName(ctx, "openclaw/openclaw")
if err != nil {
t.Fatalf("repository %s: %v", dbPath, err)
}
threads, err := st.ListThreadsFiltered(ctx, store.ThreadListOptions{RepoID: repo.ID, IncludeClosed: true, Numbers: []int{number}})
if err != nil {
t.Fatalf("list threads %s: %v", dbPath, err)
}
got := len(threads) > 0
if got != want {
t.Fatalf("thread %d presence in %s = %v, want %v", number, dbPath, got, want)
}
}
func seedGHShimRepo(t *testing.T, ctx context.Context) string {
t.Helper()
dir := t.TempDir()


@ -13,16 +13,18 @@ import (
)
type ghCommandCacheStats struct {
CacheDir string `json:"cache_dir"`
Entries int `json:"entries"`
Expired int `json:"expired"`
Locks int `json:"locks"`
Bytes int64 `json:"bytes"`
CacheHits int64 `json:"cache_hits"`
TotalReads int64 `json:"total_reads"`
HitRatePercent float64 `json:"hit_rate_percent"`
Counters ghXCacheCounters `json:"counters"`
Commands map[string]ghCommandCacheCount `json:"commands"`
CacheDir string `json:"cache_dir"`
Entries int `json:"entries"`
Expired int `json:"expired"`
Locks int `json:"locks"`
Bytes int64 `json:"bytes"`
Since string `json:"since,omitempty"`
CacheHits int64 `json:"cache_hits"`
TotalReads int64 `json:"total_reads"`
HitRatePercent float64 `json:"hit_rate_percent"`
Counters ghXCacheCounters `json:"counters"`
CumulativeCounters *ghXCacheCounters `json:"cumulative_counters,omitempty"`
Commands map[string]ghCommandCacheCount `json:"commands"`
}
type ghCommandCacheCount struct {
@ -43,18 +45,28 @@ type ghCommandCacheKeyInfo struct {
func (a *App) runGHXCache(args []string) error {
if len(args) == 0 {
return usageErr(fmt.Errorf("usage: gh xcache <stats|keys|gc|flush|reset>"))
return usageErr(fmt.Errorf("usage: gh xcache <stats|keys|gc|flush|reset|snapshot>"))
}
fs := flag.NewFlagSet("xcache "+args[0], flag.ContinueOnError)
fs.SetOutput(io.Discard)
jsonOut := fs.Bool("json", false, "write JSON output")
sinceRaw := fs.String("since", "", "show stats for the recent duration (stats only)")
resetAfterSnapshot := fs.Bool("reset", false, "reset counters after writing a snapshot (snapshot only)")
if err := fs.Parse(args[1:]); err != nil {
return usageErr(err)
}
a.applyCommandJSON(*jsonOut)
switch args[0] {
case "stats":
return a.runGHXCacheStats()
var since time.Duration
if strings.TrimSpace(*sinceRaw) != "" {
parsed, err := time.ParseDuration(strings.TrimSpace(*sinceRaw))
if err != nil || parsed <= 0 {
return usageErr(fmt.Errorf("invalid --since duration %q", *sinceRaw))
}
since = parsed
}
return a.runGHXCacheStats(since)
case "keys":
return a.runGHXCacheKeys()
case "gc":
@ -63,13 +75,15 @@ func (a *App) runGHXCache(args []string) error {
return a.runGHXCacheFlush()
case "reset":
return a.runGHXCacheReset()
case "snapshot":
return a.runGHXCacheSnapshot(*resetAfterSnapshot)
default:
return usageErr(fmt.Errorf("unknown xcache command %q", args[0]))
}
}
func (a *App) runGHXCacheStats() error {
stats, err := a.ghCommandCacheStats()
func (a *App) runGHXCacheStats(since time.Duration) error {
stats, err := a.ghCommandCacheStats(since)
if err != nil {
return err
}
@ -87,11 +101,15 @@ func (a *App) runGHXCacheStats() error {
_, _ = fmt.Fprintf(a.Stdout, " %-16s %d entries / %d bytes\n", command, count.Entries, count.Bytes)
}
}
if stats.Since != "" {
_, _ = fmt.Fprintf(a.Stdout, "\nSince: %s\n", stats.Since)
}
_, _ = fmt.Fprintf(a.Stdout, "\nCounters:\n local hits: %d\n fallback hits: %d\n stale hits: %d\n backend misses: %d\n pass-through writes: %d\n hit rate: %.1f%% (%d/%d reads)\n",
stats.Counters.LocalHits, stats.Counters.FallbackHits, stats.Counters.StaleHits, stats.Counters.BackendMisses, stats.Counters.PassThroughWrites,
stats.HitRatePercent, stats.CacheHits, stats.TotalReads)
printGHXCacheMisses(a.Stdout, "Backend Misses by Command", stats.Counters.BackendMissesByCommand)
printGHXCacheMisses(a.Stdout, "Backend Misses by Route", stats.Counters.BackendMissesByRoute)
printGHXCacheMisses(a.Stdout, "Backend Misses by Key", stats.Counters.BackendMissesByKey)
return nil
}
@ -161,6 +179,48 @@ func (a *App) runGHXCacheReset() error {
return err
}
type ghCommandCacheSnapshotResult struct {
SnapshotPath string `json:"snapshot_path"`
Reset bool `json:"reset"`
}
func (a *App) runGHXCacheSnapshot(reset bool) error {
stats, err := a.ghCommandCacheStats(0)
if err != nil {
return err
}
dir, err := a.ghCommandCacheDir()
if err != nil {
return err
}
snapshotDir := filepath.Join(dir, "_snapshots")
if err := os.MkdirAll(snapshotDir, 0o755); err != nil {
return err
}
path := filepath.Join(snapshotDir, time.Now().UTC().Format("20060102T150405Z")+".json")
data, err := json.MarshalIndent(stats, "", " ")
if err != nil {
return err
}
if err := writeAtomicFile(path, data, 0o600); err != nil {
return err
}
if reset {
if err := a.resetGHXCacheCounters(); err != nil {
return err
}
}
result := ghCommandCacheSnapshotResult{SnapshotPath: path, Reset: reset}
if a.format == FormatJSON {
return a.writeJSONValue(result, "")
}
_, err = fmt.Fprintf(a.Stdout, "Wrote xcache snapshot: %s\n", path)
if err == nil && reset {
_, err = fmt.Fprintln(a.Stdout, "Reset xcache counters")
}
return err
}
type ghCommandCacheGCResult struct {
Removed int `json:"removed"`
LocksRemoved int `json:"locks_removed"`
@ -178,7 +238,7 @@ func (a *App) runGHXCacheGC() error {
return err
}
func (a *App) ghCommandCacheStats() (ghCommandCacheStats, error) {
func (a *App) ghCommandCacheStats(since time.Duration) (ghCommandCacheStats, error) {
dir, err := a.ghCommandCacheDir()
if err != nil {
return ghCommandCacheStats{}, err
@ -188,9 +248,15 @@ func (a *App) ghCommandCacheStats() (ghCommandCacheStats, error) {
return ghCommandCacheStats{}, err
}
counters, _ := a.ghXCacheCounters()
cumulative := counters
stats := ghCommandCacheStats{CacheDir: dir, Locks: locks, Counters: counters, Commands: map[string]ghCommandCacheCount{}}
stats.CacheHits = counters.LocalHits + counters.FallbackHits + counters.StaleHits
stats.TotalReads = stats.CacheHits + counters.BackendMisses
if since > 0 {
stats.Since = since.String()
stats.CumulativeCounters = &cumulative
stats.Counters = counters.since(since, time.Now())
}
stats.CacheHits = stats.Counters.LocalHits + stats.Counters.FallbackHits + stats.Counters.StaleHits
stats.TotalReads = stats.CacheHits + stats.Counters.BackendMisses
if stats.TotalReads > 0 {
stats.HitRatePercent = float64(stats.CacheHits) / float64(stats.TotalReads) * 100
}


@ -64,7 +64,7 @@ func candidateRealGHPaths() []string {
seen := map[string]bool{}
unique := paths[:0]
for _, path := range paths {
if path = strings.TrimSpace(path); path != "" && !seen[path] {
if path = strings.TrimSpace(path); path != "" && !seen[path] && !isGitcrawlShimPath(path) {
seen[path] = true
unique = append(unique, path)
}


@ -3,6 +3,7 @@ package cli
import (
"context"
"encoding/json"
"errors"
"fmt"
"io"
"os"
@ -26,6 +27,8 @@ const portableStoreRefreshTimeout = 15 * time.Second
const portableStoreRefreshTTL = 2 * time.Minute
const portableStoreRefreshFailureBackoff = time.Minute
var errPortableStoreDirty = errors.New("portable store checkout has local changes")
func (a *App) openLocalRuntime(ctx context.Context) (localRuntime, error) {
cfg, err := config.Load(a.configPath)
if err != nil {
@ -91,7 +94,7 @@ func refreshPortableStoreForDB(ctx context.Context, dbPath string) error {
return nil
}
if !gitWorktreeClean(ctx, root) {
return nil
return errPortableStoreDirty
}
pullCtx, cancel := context.WithTimeout(ctx, portableStoreRefreshTimeout)
defer cancel()
@ -169,6 +172,7 @@ func refreshPortableStoreForDBIfDue(ctx context.Context, sourceDBPath, mirrorPat
if err := os.MkdirAll(filepath.Dir(statePath), 0o755); err != nil {
return err
}
removeStalePortableRefreshLock(lockPath, now)
lock, locked := tryGHCommandCacheLock(lockPath)
if !locked {
return nil
@ -196,6 +200,17 @@ func refreshPortableStoreForDBIfDue(ctx context.Context, sourceDBPath, mirrorPat
return writePortableStoreRefreshState(statePath, state)
}
func removeStalePortableRefreshLock(path string, now time.Time) {
info, err := os.Stat(path)
if err != nil {
return
}
if now.Sub(info.ModTime()) <= 2*portableStoreRefreshTimeout {
return
}
_ = os.Remove(path)
}
func portableStoreRefreshInterval() time.Duration {
if raw := strings.TrimSpace(os.Getenv("GITCRAWL_PORTABLE_REFRESH_TTL")); raw != "" {
if duration, err := time.ParseDuration(raw); err == nil && duration >= 0 {

@ -0,0 +1,96 @@
package cli
import (
"context"
"os"
"path/filepath"
"testing"
"time"
)
func TestPortableRuntimeUtilityBranches(t *testing.T) {
dir := t.TempDir()
source := filepath.Join(dir, "source.db")
mirror := filepath.Join(dir, "runtime", "source.db")
if _, err := portableRuntimeNeedsCopy(source, mirror); err == nil {
t.Fatal("missing source should fail")
}
if err := os.WriteFile(source, []byte("v1"), 0o644); err != nil {
t.Fatalf("write source: %v", err)
}
needs, err := portableRuntimeNeedsCopy(source, mirror)
if err != nil || !needs {
t.Fatalf("missing mirror needs copy=%v err=%v", needs, err)
}
if err := copyFileAtomic(source, mirror); err != nil {
t.Fatalf("copy mirror: %v", err)
}
if err := os.WriteFile(mirror+"-wal", []byte("wal"), 0o644); err != nil {
t.Fatalf("write wal: %v", err)
}
if err := os.WriteFile(mirror+"-shm", []byte("shm"), 0o644); err != nil {
t.Fatalf("write shm: %v", err)
}
if err := os.Chtimes(mirror, time.Now().Add(time.Hour), time.Now().Add(time.Hour)); err != nil {
t.Fatalf("age mirror: %v", err)
}
needs, err = portableRuntimeNeedsCopy(source, mirror)
if err != nil || needs {
t.Fatalf("fresh mirror needs copy=%v err=%v", needs, err)
}
if err := copyFileAtomic(source, mirror); err != nil {
t.Fatalf("recopy mirror: %v", err)
}
if _, err := os.Stat(mirror + "-wal"); !os.IsNotExist(err) {
t.Fatalf("wal sidecar should be removed, err=%v", err)
}
if _, err := os.Stat(mirror + "-shm"); !os.IsNotExist(err) {
t.Fatalf("shm sidecar should be removed, err=%v", err)
}
statePath := portableStoreRefreshStatePath(mirror)
state := portableStoreRefreshState{LastAttempt: "attempt", LastSuccess: time.Now().UTC().Format(time.RFC3339Nano)}
if err := writePortableStoreRefreshState(statePath, state); err != nil {
t.Fatalf("write state: %v", err)
}
if got := readPortableStoreRefreshState(statePath); got.LastAttempt != "attempt" || got.LastSuccess == "" {
t.Fatalf("state = %+v", got)
}
if err := os.WriteFile(statePath, []byte("{"), 0o600); err != nil {
t.Fatalf("write invalid state: %v", err)
}
if got := readPortableStoreRefreshState(statePath); got.LastAttempt != "" {
t.Fatalf("invalid state should decode empty, got %+v", got)
}
now := time.Now().UTC()
if recentPortableRefresh("", now, time.Minute) || recentPortableRefresh("bad", now, time.Minute) || !recentPortableRefresh(now.Format(time.RFC3339Nano), now, time.Minute) {
t.Fatal("recent refresh classification mismatch")
}
lockPath := filepath.Join(dir, "refresh.lock")
if err := os.WriteFile(lockPath, []byte("123\n"), 0o600); err != nil {
t.Fatalf("write lock: %v", err)
}
removeStalePortableRefreshLock(lockPath, now)
if _, err := os.Stat(lockPath); err != nil {
t.Fatalf("fresh lock should remain: %v", err)
}
old := now.Add(-3 * portableStoreRefreshTimeout)
if err := os.Chtimes(lockPath, old, old); err != nil {
t.Fatalf("age lock: %v", err)
}
removeStalePortableRefreshLock(lockPath, now)
if _, err := os.Stat(lockPath); !os.IsNotExist(err) {
t.Fatalf("stale lock should be removed, err=%v", err)
}
t.Setenv("GITCRAWL_PORTABLE_REFRESH_TTL", "0")
if got := portableStoreRefreshInterval(); got != 0 {
t.Fatalf("zero ttl = %s", got)
}
t.Setenv("GITCRAWL_PORTABLE_REFRESH_TTL", "bad")
if got := portableStoreRefreshInterval(); got != portableStoreRefreshTTL {
t.Fatalf("bad ttl fallback = %s", got)
}
if err := refreshPortableStoreForDB(context.Background(), source); err != nil {
t.Fatalf("non-portable refresh should be no-op: %v", err)
}
}

@ -9,7 +9,6 @@ import (
"regexp"
"runtime"
"sort"
"strconv"
"strings"
"time"
@ -1113,7 +1112,7 @@ func (m *clusterBrowserModel) startJumpInput() tea.Cmd {
m.showHelp = false
m.closeMenu("")
m.searchInput.Prompt = "# "
m.searchInput.Placeholder = "issue or PR number"
m.searchInput.Placeholder = "issue, PR, or GitHub URL"
m.searchInput.SetValue("")
m.status = "Jump to issue/PR"
return m.searchInput.Focus()
@ -1123,9 +1122,9 @@ func (m clusterBrowserModel) handleJumpKey(msg tea.KeyMsg) (clusterBrowserModel,
switch msg.String() {
case "enter":
m.jumping = false
value := strings.TrimPrefix(strings.TrimSpace(m.searchInput.Value()), "#")
value := strings.TrimSpace(m.searchInput.Value())
m.searchInput.Blur()
number, err := strconv.Atoi(value)
number, err := parseOptionalThreadNumber(value)
if err != nil || number <= 0 {
m.status = "Enter a positive issue or PR number"
return m, nil

@ -0,0 +1,179 @@
package cli
import (
"context"
"strings"
"testing"
"time"
"github.com/openclaw/gitcrawl/internal/store"
)
func TestTUIRemainingActionAndErrorBranches(t *testing.T) {
thread := store.Thread{
ID: 1, Number: 10, Kind: "issue", State: "open", Title: "Thread title",
Body: "Body with https://example.com/docs", HTMLURL: "https://github.com/openclaw/openclaw/issues/10",
UpdatedAt: "2026-05-08T00:00:00Z",
}
cluster := store.ClusterSummary{
ID: 7, Source: store.ClusterSourceRun, StableSlug: "cluster-7", Status: "active",
Title: "Cluster title", RepresentativeNumber: 10, RepresentativeKind: "issue",
RepresentativeTitle: "Thread title", MemberCount: 1, UpdatedAt: "2026-05-08T00:00:00Z",
}
detail := store.ClusterDetail{
Cluster: cluster,
Members: []store.ClusterMemberDetail{{
Thread: thread,
Role: "member",
State: "active",
BodySnippet: "Body with https://example.com/docs",
Summaries: map[string]string{"problem_summary": "summary"},
}},
}
model := newClusterBrowserModel(context.Background(), nil, 0, clusterBrowserPayload{
Repository: "openclaw/openclaw",
Sort: "size",
MinSize: 1,
Clusters: []store.ClusterSummary{cluster},
})
model.detailCache[7] = detail
model.loadSelectedCluster()
model.memberIndex = 0
model.neighborCache[thread.ID] = []tuiNeighbor{{Thread: thread, Score: 0.9}}
for _, action := range []string{"sort-oldest", "member-sort-oldest", "toggle-closed", "close-menu"} {
if !model.runAction(action) {
t.Fatalf("action %s was not handled", action)
}
}
if model.payload.Sort != "oldest" || model.memberSort != memberSortOldest {
t.Fatalf("sort actions failed sort=%q member=%q", model.payload.Sort, model.memberSort)
}
t.Setenv("PATH", "")
errorActions := []string{
"open-cluster-representative",
"copy-cluster-url",
"copy-thread-detail",
"copy-body-preview",
"copy-summaries",
"copy-neighbors",
"copy-cluster-id",
"copy-cluster-name",
"copy-cluster-title",
"copy-member-list",
"copy-cluster",
"copy-visible-clusters",
"copy-reference-links",
"open",
"copy-url",
"copy-markdown",
"copy-title",
"open-first-link",
"copy-first-link",
}
for _, action := range errorActions {
model.status = ""
handled := model.runMenuItem(tuiMenuItem{label: action, action: action, value: "https://example.com/docs"})
if !handled || model.status == "" {
t.Fatalf("error action %s handled=%v status=%q", action, handled, model.status)
}
}
model.openReferenceLinkMenu("copy")
model.runAction("back-to-actions")
if model.menuTitle != "Actions" {
t.Fatalf("back to actions failed title=%q", model.menuTitle)
}
model.runMenuItem(tuiMenuItem{label: "Open picked", action: "open-picked-link", value: "https://example.com/docs"})
model.runMenuItem(tuiMenuItem{label: "Copy picked", action: "copy-picked-link", value: "https://example.com/docs"})
model.closeSelectedClusterLocally()
if !strings.Contains(model.status, "only available for durable clusters") {
t.Fatalf("raw cluster local close status=%q", model.status)
}
model.reopenSelectedClusterLocally()
model.excludeSelectedClusterMemberLocally()
model.includeSelectedClusterMemberLocally()
model.setSelectedClusterCanonicalLocally()
if !strings.Contains(model.status, "only available for durable clusters") {
t.Fatalf("raw member local action status=%q", model.status)
}
}
func TestTUIRemainingHelperBranches(t *testing.T) {
model := newClusterBrowserModel(context.Background(), nil, 0, clusterBrowserPayload{
Repository: "openclaw/openclaw",
MinSize: 1,
Limit: 1,
Clusters: []store.ClusterSummary{
{ID: 1, Status: "active", RepresentativeNumber: 101, MemberCount: 1, UpdatedAt: "2026-05-08T00:00:00Z"},
},
})
if model.currentClusterID() != 1 {
t.Fatalf("current cluster id = %d", model.currentClusterID())
}
if model.clusterRefreshLimit() != 1 {
t.Fatalf("cluster refresh limit = %d", model.clusterRefreshLimit())
}
model.ensureClusterInWorkingSet(store.ClusterSummary{ID: 2, Status: "closed", ClosedAt: "2026-05-08T00:00:00Z", MemberCount: 2})
if !model.selectClusterIDForJump(2) || !model.showClosed || model.minSize != 1 {
t.Fatalf("jump selection showClosed=%v minSize=%d selected=%d", model.showClosed, model.minSize, model.selected)
}
model.payload.Clusters = nil
if model.currentClusterID() != 0 || model.clusterSignature() != "" {
t.Fatalf("empty cluster helpers id=%d sig=%q", model.currentClusterID(), model.clusterSignature())
}
if _, ok := model.clusterFromWorkingSet(999); ok {
t.Fatal("missing working-set cluster should not resolve")
}
model.applyClusterRefresh(nil, 0)
if model.payload.Clusters == nil {
t.Fatal("nil refresh should normalize clusters")
}
model.autoRefreshFromStore()
if model.status != "Refresh unavailable for this view" {
t.Fatalf("auto refresh status=%q", model.status)
}
if cmd := model.autoRefreshCmd(); cmd != nil {
t.Fatalf("auto refresh command without store = %v", cmd)
}
model.switchRepository("")
if model.status != "Repository picker unavailable for this view" {
t.Fatalf("switch repository no store status=%q", model.status)
}
if label := (clusterBrowserModel{}).clusterPositionLabel(); label != "0" {
t.Fatalf("zero cluster position label = %q", label)
}
if label := model.clusterPositionLabel(); label != "0" {
t.Fatalf("empty model cluster position label = %q", label)
}
memberModel := model
memberModel.memberRows = []memberRow{}
if label := memberModel.memberPositionLabel(); label != "0" {
t.Fatalf("zero member position label = %q", label)
}
if got := formatRelativeTime(time.Now().Add(-30 * time.Minute).Format(time.RFC3339Nano)); got != "30m ago" {
t.Fatalf("minute age = %q", got)
}
if got := formatRelativeTime(time.Now().Add(-75 * 24 * time.Hour).Format(time.RFC3339Nano)); !strings.Contains(got, "mo ago") {
t.Fatalf("month age = %q", got)
}
if got := formatRelativeTime(""); got != "never" {
t.Fatalf("empty age = %q", got)
}
if got := formatRelativeTime("bad-time"); got != "bad-time" {
t.Fatalf("bad age = %q", got)
}
if got := wrapPlain("", 10); len(got) != 1 || got[0] != "" {
t.Fatalf("empty wrap = %+v", got)
}
if got := clampInt(5, 10, 1); got != 10 {
t.Fatalf("inverted clamp = %d", got)
}
if got := padCells("abcdef", 0); got != "" {
t.Fatalf("zero pad = %q", got)
}
if got := fitBlock("a\nb", 2, 1); got != "a " {
t.Fatalf("fit block = %q", got)
}
}

@ -0,0 +1,326 @@
package cli
import (
"bytes"
"context"
"strings"
"testing"
"github.com/charmbracelet/bubbles/textinput"
tea "github.com/charmbracelet/bubbletea"
"github.com/openclaw/gitcrawl/internal/store"
)
func TestFloatingMenuRenderingBranches(t *testing.T) {
base := strings.Join([]string{
"01234567890123456789",
"01234567890123456789",
"01234567890123456789",
"01234567890123456789",
"01234567890123456789",
"01234567890123456789",
"01234567890123456789",
"01234567890123456789",
"01234567890123456789",
}, "\n")
model := clusterBrowserModel{
width: 20,
height: 6,
menuTitle: "Actions",
menuContext: focusClusters,
menuIndex: 2,
menuOff: 1,
menuFloating: true,
menuRect: tuiRect{x: 2, y: 1, w: 16, h: 8},
menuItems: []tuiMenuItem{
tuiMenuSection("Hidden"),
{label: "Open", action: "open"},
{label: "Close", action: "close"},
{label: "Skip", action: ""},
{label: "Refresh", action: "refresh"},
},
}
rendered := model.renderFloatingMenu(base)
if rendered == base || !strings.Contains(rendered, "Actions") || !strings.Contains(rendered, "Open") {
t.Fatalf("rendered menu = %q", rendered)
}
if got := (clusterBrowserModel{}).renderFloatingMenu(base); got != base {
t.Fatalf("empty rect should keep base view")
}
submenu := model
submenu.menuTitle = "Repository"
if lines := submenu.menuLines(14); !strings.Contains(strings.Join(lines, "\n"), "b back") {
t.Fatalf("submenu lines = %#v", lines)
}
if got := actionMenuSubtitle(focusMembers); got != "selected member scope" {
t.Fatalf("member subtitle = %q", got)
}
if got := actionMenuSubtitle(focusDetail); got != "detail scope" {
t.Fatalf("detail subtitle = %q", got)
}
if got := actionMenuSubtitle(""); got != "current selection" {
t.Fatalf("default subtitle = %q", got)
}
if palette := actionMenuColors(focusMembers); palette.accent == "" || palette.background == "" {
t.Fatalf("member palette = %+v", palette)
}
if style := floatingMenuStyle(1, 1, actionMenuColors("")); style.GetWidth() != 1 || style.GetHeight() != 1 {
t.Fatalf("minimum style size width=%d height=%d", style.GetWidth(), style.GetHeight())
}
if index, ok := visibleMenuShortcutIndex("2", model.menuItems, 1, 4); !ok || index != 2 {
t.Fatalf("shortcut index=%d ok=%v", index, ok)
}
if _, ok := visibleMenuShortcutIndex("x", model.menuItems, 1, 4); ok {
t.Fatal("non-numeric shortcut should not match")
}
}
func TestTUIMenuNavigationAndWheelBranches(t *testing.T) {
model := clusterBrowserModel{
width: 100,
height: 30,
menuIndex: 0,
menuOff: 4,
menuFloating: true,
menuRect: tuiRect{x: 0, y: 0, w: 20, h: 8},
menuItems: []tuiMenuItem{
tuiMenuSection("top"),
{label: "one", action: "one"},
{label: "two", action: "two"},
tuiMenuSection("middle"),
{label: "three", action: "three"},
{label: "four", action: "four"},
},
payload: clusterBrowserPayload{Clusters: []store.ClusterSummary{
{ID: 10, Title: "first"},
{ID: 11, Title: "second"},
}},
}
if model.firstSelectableMenuIndex() != 1 || model.lastSelectableMenuIndex() != 5 {
t.Fatalf("selectable bounds first=%d last=%d", model.firstSelectableMenuIndex(), model.lastSelectableMenuIndex())
}
if got := model.nextSelectableMenuIndex(1); got != 1 {
t.Fatalf("next selectable = %d", got)
}
if got := model.nearestSelectableMenuIndex(3, 1); got != 4 {
t.Fatalf("nearest forward = %d", got)
}
if got := model.nearestSelectableMenuIndex(3, -1); got != 2 {
t.Fatalf("nearest backward = %d", got)
}
empty := clusterBrowserModel{}
if got := empty.nearestSelectableMenuIndex(10, 1); got != 0 {
t.Fatalf("empty nearest = %d", got)
}
model.menuIndex = 5
model.keepMenuVisible()
if model.menuOff > model.menuIndex {
t.Fatalf("menu off=%d index=%d", model.menuOff, model.menuIndex)
}
layout := tuiLayout{
clusters: tuiRect{x: 0, y: 2, w: 20, h: 8},
members: tuiRect{x: 20, y: 2, w: 20, h: 8},
detail: tuiRect{x: 40, y: 2, w: 20, h: 8},
}
if got := model.actionMenuContextAt(layout, 1, 3); got != focusClusters {
t.Fatalf("cluster context = %q", got)
}
if got := model.actionMenuContextAt(layout, 21, 3); got != focusMembers {
t.Fatalf("member context = %q", got)
}
if got := model.actionMenuContextAt(layout, 41, 3); got != focusDetail {
t.Fatalf("detail context = %q", got)
}
if got := model.actionMenuContextAt(layout, 99, 99); got != "" {
t.Fatalf("outside context = %q", got)
}
if index, ok := model.menuIndexAtMouse(layout, 1, 4); !ok || index != 6 {
t.Fatalf("menu index at mouse index=%d ok=%v", index, ok)
}
model.menuFloating = false
if index, ok := model.menuIndexAtMouse(layout, 41, 6); !ok || index != 5 {
t.Fatalf("detail menu index at mouse index=%d ok=%v", index, ok)
}
if _, ok := model.menuIndexAtMouse(layout, 99, 99); ok {
t.Fatal("outside mouse should not hit menu")
}
if step := (clusterBrowserModel{width: 100, height: 30}).pageStep(); step <= 0 {
t.Fatalf("cluster page step = %d", step)
}
detailModel := clusterBrowserModel{focus: focusDetail}
detailModel.detailView.Height = 3
if step := detailModel.pageStep(); step != 3 {
t.Fatalf("detail page step = %d", step)
}
model.selected = 0
cmd := model.moveClusterByWheel(1)
if cmd == nil || model.selected != 1 || model.status != "Cluster 11" {
t.Fatalf("wheel move selected=%d status=%q cmd=%v", model.selected, model.status, cmd)
}
if cmd := model.moveClusterByWheel(1); cmd != nil {
t.Fatalf("boundary wheel move should not tick: %v", cmd)
}
model.wheelDelta = -1
model.wheelFocus = focusClusters
if cmd := model.applyQueuedWheelScroll(); cmd == nil || model.focus != focusClusters {
t.Fatalf("queued wheel cmd=%v focus=%q", cmd, model.focus)
}
model.wheelDelta = 0
if cmd := model.applyQueuedWheelScroll(); cmd != nil {
t.Fatalf("zero queued wheel should be nil: %v", cmd)
}
}
func TestTUISelectionAndVisibilityHelperBranches(t *testing.T) {
model := clusterBrowserModel{
payload: clusterBrowserPayload{Limit: 2, Clusters: []store.ClusterSummary{
{ID: 1, RepresentativeNumber: 101, MemberCount: 2, UpdatedAt: "2026-05-05T10:00:00Z"},
{ID: 2, RepresentativeNumber: 202, MemberCount: 1, UpdatedAt: "2026-05-05T11:00:00Z"},
}},
allClusters: []store.ClusterSummary{
{ID: 3, RepresentativeNumber: 303, MemberCount: 5, UpdatedAt: "2026-05-05T12:00:00Z"},
},
hasDetail: true,
detail: store.ClusterDetail{
Cluster: store.ClusterSummary{ID: 9, RepresentativeNumber: 909},
Members: []store.ClusterMemberDetail{{
Thread: store.Thread{Number: 909, State: "open"},
}},
},
detailCache: map[int64]store.ClusterDetail{
8: {Cluster: store.ClusterSummary{ID: 8}, Members: []store.ClusterMemberDetail{{Thread: store.Thread{Number: 808, State: "open"}}}},
},
memberRows: []memberRow{
{label: "header"},
{selectable: true, member: store.ClusterMemberDetail{Thread: store.Thread{Number: 202, State: "open"}}},
},
}
if got := model.currentClusterID(); got != 1 {
t.Fatalf("current cluster = %d", got)
}
if got := model.clusterRefreshLimit(); got != 2 {
t.Fatalf("refresh limit = %d", got)
}
if got := model.findLoadedClusterIDForThreadNumber(909); got != 9 {
t.Fatalf("detail cluster lookup = %d", got)
}
if got := model.findLoadedClusterIDForThreadNumber(808); got != 8 {
t.Fatalf("cache cluster lookup = %d", got)
}
if got := model.findLoadedClusterIDForThreadNumber(303); got != 3 {
t.Fatalf("working-set cluster lookup = %d", got)
}
if _, ok := model.clusterFromWorkingSet(404); ok {
t.Fatal("missing cluster should not be found")
}
if !model.selectMemberByNumber(202) || model.memberIndex != 1 {
t.Fatalf("member selection index = %d", model.memberIndex)
}
if model.selectMemberByNumber(999) {
t.Fatal("missing member should not be selected")
}
openThread := store.Thread{State: "open"}
closedThread := store.Thread{State: "closed"}
localClosedThread := store.Thread{State: "open", ClosedAtLocal: "2026-05-05T00:00:00Z"}
if !threadVisible(openThread, false) || threadVisible(closedThread, false) || threadVisible(localClosedThread, false) || !threadVisible(closedThread, true) {
t.Fatal("thread visibility mismatch")
}
if got := memberDisplayState(store.ClusterMemberDetail{State: "removed", Thread: openThread}); got != "removed" {
t.Fatalf("member state = %q", got)
}
if got := memberDisplayState(store.ClusterMemberDetail{Thread: localClosedThread}); got != "local" {
t.Fatalf("local member state = %q", got)
}
if memberVisible(store.ClusterMemberDetail{State: "removed", Thread: openThread}, false) || !memberVisible(store.ClusterMemberDetail{State: "removed", Thread: closedThread}, true) {
t.Fatal("member visibility mismatch")
}
noLimit := clusterBrowserModel{payload: clusterBrowserPayload{Clusters: model.payload.Clusters}, allClusters: model.allClusters}
if got := noLimit.clusterRefreshLimit(); got < len(model.allClusters) {
t.Fatalf("no-limit refresh limit = %d", got)
}
}
func TestTUIJumpToThreadNumberLoadsClusterFromStore(t *testing.T) {
st, repoID, clusterID := seedTUIDurableStore(t)
defer st.Close()
model := clusterBrowserModel{
ctx: context.Background(),
store: st,
repoID: repoID,
detailCache: map[int64]store.ClusterDetail{},
payload: clusterBrowserPayload{Limit: 1, Sort: "recent"},
minSize: 99,
}
model.jumpToThreadNumber(0)
if model.status != "Enter a positive issue or PR number" {
t.Fatalf("bad jump status = %q", model.status)
}
model.jumpToThreadNumber(202)
if model.focus != focusMembers || !strings.Contains(model.status, "Jumped to #202") {
t.Fatalf("jump focus=%q status=%q", model.focus, model.status)
}
if len(model.payload.Clusters) == 0 || model.payload.Clusters[model.selected].ID != clusterID {
t.Fatalf("selected clusters = %+v selected=%d want cluster %d", model.payload.Clusters, model.selected, clusterID)
}
if model.memberIndex < 0 || model.memberRows[model.memberIndex].thread().Number != 202 {
t.Fatalf("member rows index=%d rows=%+v", model.memberIndex, model.memberRows)
}
if _, ok := model.detailCache[clusterID]; !ok {
t.Fatalf("detail cache missing cluster %d", clusterID)
}
model.jumpToThreadNumber(999)
if model.status == "" || strings.Contains(model.status, "Jumped") {
t.Fatalf("missing jump status = %q", model.status)
}
}
func TestTUIJumpKeyAndRefreshCommandBranches(t *testing.T) {
input := textinput.New()
input.SetValue("#0")
model := clusterBrowserModel{searchInput: input, jumping: true}
next, cmd := model.handleJumpKey(tea.KeyMsg{Type: tea.KeyEnter})
if cmd != nil || next.jumping || next.status != "Enter a positive issue or PR number" {
t.Fatalf("bad enter next=%+v cmd=%v", next, cmd)
}
input = textinput.New()
input.SetValue("https://github.com/openclaw/openclaw/issues/123")
model = clusterBrowserModel{
searchInput: input,
jumping: true,
payload: clusterBrowserPayload{Clusters: []store.ClusterSummary{{ID: 1, RepresentativeNumber: 123}}},
allClusters: []store.ClusterSummary{{ID: 1, RepresentativeNumber: 123}},
detailCache: map[int64]store.ClusterDetail{},
}
next, cmd = model.handleJumpKey(tea.KeyMsg{Type: tea.KeyEnter})
if cmd != nil || next.jumping || !strings.Contains(next.status, "outside loaded members") {
t.Fatalf("valid enter next status=%q cmd=%v", next.status, cmd)
}
model = clusterBrowserModel{searchInput: textinput.New(), jumping: true}
next, cmd = model.handleJumpKey(tea.KeyMsg{Type: tea.KeyEsc})
if cmd != nil || next.jumping || next.status != "Jump cancelled" {
t.Fatalf("esc next=%+v cmd=%v", next, cmd)
}
next, cmd = model.handleJumpKey(tea.KeyMsg{Type: tea.KeyRunes, Runes: []rune("4")})
if next.jumping != true {
t.Fatalf("rune input should keep jump mode, next=%+v cmd=%v", next, cmd)
}
if (clusterBrowserModel{}).remoteRefreshTickCmd() == nil || (clusterBrowserModel{}).autoRefreshCmd() != nil || (clusterBrowserModel{store: &store.Store{}, repoID: 1}).autoRefreshCmd() == nil {
t.Fatal("refresh tick commands should be scheduled")
}
}
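The jump input tested above accepts `#0`-style numbers and full GitHub URLs. The body of `parseOptionalThreadNumber` is not part of this diff, so the following is only a sketch of one way such parsing could work; `parseThreadRef` and its accepted URL shapes are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// parseThreadRef is a hypothetical parser. It accepts "123", "#123", or a URL
// whose path looks like /owner/repo/issues/123 or /owner/repo/pull/123.
func parseThreadRef(raw string) (int, error) {
	value := strings.TrimPrefix(strings.TrimSpace(raw), "#")
	if n, err := strconv.Atoi(value); err == nil {
		return n, nil
	}
	u, err := url.Parse(value)
	if err != nil || u.Host == "" {
		return 0, fmt.Errorf("not a thread number or URL: %q", raw)
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) < 4 || (parts[2] != "issues" && parts[2] != "pull") {
		return 0, fmt.Errorf("unsupported URL path: %q", u.Path)
	}
	return strconv.Atoi(parts[3])
}

func main() {
	for _, in := range []string{"#42", "https://github.com/openclaw/openclaw/issues/123", "nope"} {
		n, err := parseThreadRef(in)
		fmt.Println(in, n, err)
	}
}
```

As in the handler above, the caller still has to reject non-positive numbers; the parser only extracts the integer.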
func TestInteractiveTUIFallsBackToJSONForNonFileOutput(t *testing.T) {
app := New()
var out bytes.Buffer
app.Stdout = &out
if app.canRunInteractiveTUI() {
t.Fatal("buffer stdout should not be interactive")
}
payload := clusterBrowserPayload{Repository: "openclaw/openclaw", Mode: "clusters", Clusters: []store.ClusterSummary{{ID: 1, MemberCount: 2}}}
if err := app.runInteractiveTUI(context.Background(), nil, 0, payload); err != nil {
t.Fatalf("run tui fallback: %v", err)
}
if !strings.Contains(out.String(), `"repository": "openclaw/openclaw"`) || !strings.Contains(out.String(), `"clusters"`) {
t.Fatalf("fallback tui output = %q", out.String())
}
}

@ -6,7 +6,7 @@ import (
"path/filepath"
"strings"
"github.com/pelletier/go-toml/v2"
crawlconfig "github.com/vincentkoc/crawlkit/config"
)
const (
@ -49,15 +49,24 @@ type TokenResolution struct {
Source string
}
var appConfig = crawlconfig.App{Name: "gitcrawl", ConfigEnv: DefaultConfigEnv}
func Default() Config {
home := homeDir()
base := filepath.Join(home, ".config", "gitcrawl")
paths, err := appConfig.DefaultPaths()
if err != nil {
paths = crawlconfig.Paths{
DBPath: filepath.Join(homeDir(), ".config", "gitcrawl", "gitcrawl.db"),
CacheDir: filepath.Join(homeDir(), ".config", "gitcrawl", "cache"),
LogDir: filepath.Join(homeDir(), ".config", "gitcrawl", "logs"),
}
}
base := filepath.Dir(paths.DBPath)
return Config{
Version: 1,
DBPath: filepath.Join(base, "gitcrawl.db"),
CacheDir: filepath.Join(base, "cache"),
DBPath: paths.DBPath,
CacheDir: paths.CacheDir,
VectorDir: filepath.Join(base, "vectors"),
LogDir: filepath.Join(base, "logs"),
LogDir: paths.LogDir,
EmbeddingBasis: "title_original",
GitHub: GitHubConfig{
TokenEnv: DefaultTokenEnv,
@ -77,26 +86,19 @@ func Default() Config {
}
func ResolvePath(flagPath string) string {
if strings.TrimSpace(flagPath) != "" {
return expandHome(flagPath)
path, err := appConfig.ResolveConfigPath(flagPath)
if err != nil {
return filepath.Join(homeDir(), ".config", "gitcrawl", "config.toml")
}
if envPath := strings.TrimSpace(os.Getenv(DefaultConfigEnv)); envPath != "" {
return expandHome(envPath)
}
home := homeDir()
return filepath.Join(home, ".config", "gitcrawl", "config.toml")
return path
}
func Load(path string) (Config, error) {
cfg := Default()
resolved := ResolvePath(path)
data, err := os.ReadFile(resolved)
if err != nil {
if err := crawlconfig.LoadTOML(resolved, &cfg); err != nil {
return Config{}, err
}
if err := toml.Unmarshal(data, &cfg); err != nil {
return Config{}, fmt.Errorf("parse config: %w", err)
}
if err := cfg.Normalize(); err != nil {
return Config{}, err
}
@ -108,21 +110,19 @@ func Save(path string, cfg Config) error {
return err
}
resolved := ResolvePath(path)
if err := os.MkdirAll(filepath.Dir(resolved), 0o755); err != nil {
return fmt.Errorf("create config dir: %w", err)
}
data, err := toml.Marshal(cfg)
if err != nil {
return fmt.Errorf("marshal config: %w", err)
}
return os.WriteFile(resolved, data, 0o600)
return crawlconfig.WriteTOML(resolved, cfg, 0o600)
}
func EnsureRuntimeDirs(cfg Config) error {
for _, path := range []string{cfg.CacheDir, cfg.VectorDir, cfg.LogDir, filepath.Dir(cfg.DBPath)} {
if err := os.MkdirAll(expandHome(path), 0o755); err != nil {
return fmt.Errorf("create runtime dir %s: %w", path, err)
}
if err := crawlconfig.EnsureRuntimeDirs(crawlconfig.RuntimeConfig{
DBPath: cfg.DBPath,
CacheDir: cfg.CacheDir,
LogDir: cfg.LogDir,
}); err != nil {
return err
}
if err := os.MkdirAll(crawlconfig.ExpandHome(cfg.VectorDir), 0o755); err != nil {
return fmt.Errorf("create runtime dir %s: %w", cfg.VectorDir, err)
}
return nil
}
@ -200,13 +200,7 @@ func envOrDefault(primary, fallback string) string {
}
func expandHome(path string) string {
if path == "~" {
return homeDir()
}
if strings.HasPrefix(path, "~/") {
return filepath.Join(homeDir(), strings.TrimPrefix(path, "~/"))
}
return path
return crawlconfig.ExpandHome(path)
}
func homeDir() string {

@ -4,6 +4,7 @@ import (
"context"
"encoding/json"
"errors"
"io"
"net/http"
"net/http/httptest"
"strings"
@ -13,6 +14,12 @@ import (
"unicode/utf8"
)
type roundTripFunc func(*http.Request) (*http.Response, error)
func (f roundTripFunc) RoundTrip(req *http.Request) (*http.Response, error) {
return f(req)
}
func TestEmbedAcceptsLargeBatchResponse(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
var request embeddingRequest
@ -357,3 +364,79 @@ func TestEmbedRetryAfterDateForm(t *testing.T) {
t.Fatalf("expected ~3s sleep from HTTP-date Retry-After, got %v", slept)
}
}
func TestOpenAIErrorAndRetryHelpers(t *testing.T) {
apiErr := &APIError{Status: http.StatusBadGateway, Type: "overloaded_error", Code: "overloaded", Message: "try later"}
if got := apiErr.Error(); !strings.Contains(got, "status=502") || !strings.Contains(got, "message=try later") {
t.Fatalf("error string = %q", got)
}
if !apiErr.Retryable() || !apiErr.IsOverloaded() {
t.Fatalf("retryable/overloaded = %v/%v", apiErr.Retryable(), apiErr.IsOverloaded())
}
if (*APIError)(nil).Retryable() || !(&APIError{Status: http.StatusGatewayTimeout}).Retryable() || (&APIError{Status: http.StatusTooManyRequests, Type: "insufficient_quota"}).Retryable() {
t.Fatal("unexpected retryable classification")
}
if AsAPIError(nil) != nil || AsAPIError(errors.New("plain")) != nil {
t.Fatal("unexpected APIError extraction")
}
now := time.Date(2026, 5, 5, 10, 0, 0, 0, time.UTC)
if got := parseRetryAfter("1.5", now); got != 1500*time.Millisecond {
t.Fatalf("float retry-after = %s", got)
}
if got := parseRetryAfter("-1", now); got != 0 {
t.Fatalf("negative retry-after = %s", got)
}
if got := parseRetryAfter(now.Add(-time.Minute).Format(http.TimeFormat), now); got != 0 {
t.Fatalf("past retry-after = %s", got)
}
retry := RetryConfig{MaxAttempts: -1, BaseDelay: 0, MaxDelay: 50 * time.Millisecond, MaxElapsed: 0, Jitter: 0}
client := New(Options{APIKey: "test", Retry: &retry})
if client.retry.MaxAttempts != 1 {
t.Fatalf("max attempts = %d, want normalized 1", client.retry.MaxAttempts)
}
if got := client.backoff(10, 0, time.Second); got != 50*time.Millisecond {
t.Fatalf("retry-after should be clamped to max delay, got %s", got)
}
if got := client.backoff(10, 0, 0); got != 50*time.Millisecond {
t.Fatalf("exponential backoff should be clamped to max delay, got %s", got)
}
if !client.canSleep(now, 24*time.Hour) {
t.Fatal("max elapsed <= 0 should allow sleeping")
}
if err := sleepCtx(context.Background(), 0); err != nil {
t.Fatalf("zero sleep: %v", err)
}
ctx, cancel := context.WithCancel(context.Background())
cancel()
if err := sleepCtx(ctx, time.Hour); !errors.Is(err, context.Canceled) {
t.Fatalf("canceled sleep err = %v", err)
}
}
func TestEmbedRetriesTransportError(t *testing.T) {
var calls int
client := New(Options{
APIKey: "test",
BaseURL: "https://example.invalid",
Retry: &RetryConfig{MaxAttempts: 2, BaseDelay: time.Millisecond, MaxDelay: time.Millisecond, MaxElapsed: time.Hour, Jitter: 0},
Sleep: func(context.Context, time.Duration) error { return nil },
HTTPClient: &http.Client{Transport: roundTripFunc(func(req *http.Request) (*http.Response, error) {
calls++
if calls == 1 {
return nil, errors.New("temporary network break")
}
return &http.Response{
StatusCode: http.StatusOK,
Header: make(http.Header),
Body: io.NopCloser(strings.NewReader(`{"data":[{"index":0,"embedding":[0.5]}]}`)),
}, nil
})},
})
vectors, err := client.Embed(context.Background(), "model", []string{"hi"})
if err != nil {
t.Fatalf("embed: %v", err)
}
if calls != 2 || len(vectors) != 1 || vectors[0][0] != 0.5 {
t.Fatalf("calls=%d vectors=%v", calls, vectors)
}
}

@ -36,4 +36,18 @@ func TestUpsertComment(t *testing.T) {
if id == 0 {
t.Fatal("expected comment id")
}
if _, err := st.UpsertComment(ctx, Comment{
ThreadID: threadID, GitHubID: "c0", CommentType: "issue_comment",
AuthorLogin: "octobot", AuthorType: "Bot", Body: "earlier bot note", IsBot: true, RawJSON: "{}",
CreatedAtGitHub: "2026-04-25T00:00:00Z", UpdatedAtGitHub: "2026-04-25T00:01:00Z",
}); err != nil {
t.Fatalf("second comment: %v", err)
}
comments, err := st.ListComments(ctx, threadID)
if err != nil {
t.Fatalf("list comments: %v", err)
}
if len(comments) != 2 || comments[0].GitHubID != "c0" || !comments[0].IsBot || comments[1].GitHubID != "c1" {
t.Fatalf("comments = %+v", comments)
}
}

@ -254,6 +254,70 @@ func TestPortablePruneCanonicalizesSchemaAndMetadata(t *testing.T) {
}
}
func TestPortablePruneClearsPRRawJSONBlobPointersAndFingerprints(t *testing.T) {
ctx := context.Background()
st, err := Open(ctx, filepath.Join(t.TempDir(), "gitcrawl.db"))
if err != nil {
t.Fatalf("open store: %v", err)
}
defer st.Close()
repoID, threadIDs := seedVectorThreads(t, ctx, st)
threadID := threadIDs[1]
if _, err := st.DB().ExecContext(ctx, `
insert into blobs(id, sha256, media_type, compression, size_bytes, storage_kind, inline_text, created_at)
values(1, 'sha', 'application/json', 'none', 2, 'inline', '{}', '2026-05-05T00:00:00Z');
insert into thread_revisions(id, thread_id, source_updated_at, content_hash, title_hash, body_hash, labels_hash, raw_json_blob_id, created_at)
values(1, ?, '2026-05-05T00:00:00Z', 'content', 'title', 'body', 'labels', 1, '2026-05-05T00:00:00Z');
insert into thread_fingerprints(thread_revision_id, algorithm_version, fingerprint_hash, fingerprint_slug, title_tokens_json, body_token_hash, linked_refs_json, file_set_hash, module_buckets_json, simhash64, feature_json, created_at)
values(1, 'v1', 'hash', 'slug', '["token"]', 'body', '["#1"]', 'files', '["module"]', '1', '{"x":1}', '2026-05-05T00:00:00Z');
`, threadID); err != nil {
t.Fatalf("seed revision/fingerprint: %v", err)
}
if _, err := st.UpsertComment(ctx, Comment{ThreadID: threadID, GitHubID: "raw-comment", CommentType: "issue_comment", Body: "comment body that is long", RawJSON: `{"raw":true}`, CreatedAtGitHub: "2026-05-05T00:00:00Z"}); err != nil {
t.Fatalf("seed comment: %v", err)
}
if _, err := st.DB().ExecContext(ctx, `update comments set raw_json_blob_id = 1 where github_id = 'raw-comment'`); err != nil {
t.Fatalf("link comment blob: %v", err)
}
if err := st.UpsertPullRequestCache(ctx,
PullRequestDetail{ThreadID: threadID, RepoID: repoID, Number: 302, HeadSHA: "head", RawJSON: `{"detail":true}`, FetchedAt: "2026-05-05T00:00:00Z", UpdatedAt: "2026-05-05T00:00:00Z"},
[]PullRequestFile{{Path: "a.go", RawJSON: `{"file":true}`, FetchedAt: "2026-05-05T00:00:00Z"}},
[]PullRequestCommit{{SHA: "abc", RawJSON: `{"commit":true}`, FetchedAt: "2026-05-05T00:00:00Z"}},
[]PullRequestCheck{{Name: "ci", RawJSON: `{"check":true}`, FetchedAt: "2026-05-05T00:00:00Z"}},
[]WorkflowRun{{RepoID: repoID, RunID: "1", RawJSON: `{"run":true}`, FetchedAt: "2026-05-05T00:00:00Z"}},
); err != nil {
t.Fatalf("seed pr cache: %v", err)
}
stats, err := st.PrunePortablePayloads(ctx, PortablePruneOptions{BodyChars: 4})
if err != nil {
t.Fatalf("prune portable: %v", err)
}
if stats.RawJSONPruned < 6 || stats.FingerprintsPruned != 1 || stats.CommentsPruned != 1 {
t.Fatalf("portable stats = %+v", stats)
}
var commentRaw string
var commentBlob, revisionBlob any
if err := st.DB().QueryRowContext(ctx, `select raw_json, raw_json_blob_id from comments where github_id = 'raw-comment'`).Scan(&commentRaw, &commentBlob); err != nil {
t.Fatalf("read pruned comment: %v", err)
}
if commentRaw != "" || commentBlob != nil {
t.Fatalf("comment raw=%q blob=%v", commentRaw, commentBlob)
}
if err := st.DB().QueryRowContext(ctx, `select raw_json_blob_id from thread_revisions where id = 1`).Scan(&revisionBlob); err != nil {
t.Fatalf("read pruned revision: %v", err)
}
if revisionBlob != nil {
t.Fatalf("revision blob=%v", revisionBlob)
}
var titleTokens, linkedRefs, modules, features string
if err := st.DB().QueryRowContext(ctx, `select title_tokens_json, linked_refs_json, module_buckets_json, feature_json from thread_fingerprints where id = 1`).Scan(&titleTokens, &linkedRefs, &modules, &features); err != nil {
t.Fatalf("read pruned fingerprint: %v", err)
}
if titleTokens != "[]" || linkedRefs != "[]" || modules != "[]" || features != "{}" {
t.Fatalf("fingerprint title=%q refs=%q modules=%q features=%q", titleTokens, linkedRefs, modules, features)
}
}
func TestClusterHelperBranches(t *testing.T) {
summaries := []ClusterSummary{
{ID: 1, MemberCount: 1, UpdatedAt: "2026-04-30T01:00:00Z"},
@ -267,6 +331,23 @@ func TestClusterHelperBranches(t *testing.T) {
if summaries[0].ID != 1 {
t.Fatalf("recent sort = %+v", summaries)
}
summaries = []ClusterSummary{
{ID: 3, MemberCount: 2, UpdatedAt: "2026-04-30T01:00:00Z"},
{ID: 2, MemberCount: 2, UpdatedAt: "2026-04-30T01:00:00Z"},
{ID: 1, MemberCount: 3, UpdatedAt: "2026-04-30T00:00:00Z"},
}
sortClusterSummaries(summaries, "size")
if summaries[0].ID != 1 || summaries[1].ID != 2 {
t.Fatalf("size tie sort = %+v", summaries)
}
sortClusterSummaries(summaries, "oldest")
if summaries[0].ID != 1 || summaries[1].ID != 2 {
t.Fatalf("oldest tie sort = %+v", summaries)
}
sortClusterSummaries(summaries, "recent")
if summaries[0].ID != 2 || summaries[1].ID != 3 {
t.Fatalf("recent tie sort = %+v", summaries)
}
if ids := parseIDSet(`1, 2, 0, bad, 3`); len(ids) != 3 || !ids[2] {
t.Fatalf("parse id set = %+v", ids)
}
@ -276,6 +357,15 @@ func TestClusterHelperBranches(t *testing.T) {
if got := snippetRunes("abcdef", 3); got != "abc" {
t.Fatalf("snippet = %q", got)
}
if got := rowsAffected(errorResult{}); got != 0 {
t.Fatalf("error rows affected = %d", got)
}
if got := nullString(""); got.Valid {
t.Fatalf("empty null string = %+v", got)
}
if got := nullString("x"); !got.Valid || got.String != "x" {
t.Fatalf("non-empty null string = %+v", got)
}
if func() (panicked bool) {
defer func() { panicked = recover() != nil }()
_ = sqliteIdentifier(`bad"name`)
@ -756,6 +846,16 @@ func TestPortableVacuumAndVectorQueryBranches(t *testing.T) {
}
}
type errorResult struct{}
func (errorResult) LastInsertId() (int64, error) {
return 0, sql.ErrNoRows
}
func (errorResult) RowsAffected() (int64, error) {
return 0, sql.ErrNoRows
}
func seedVectorThreads(t *testing.T, ctx context.Context, st *Store) (int64, []int64) {
t.Helper()
now := time.Now().UTC().Format(time.RFC3339Nano)
@ -780,3 +880,106 @@ func seedVectorThreads(t *testing.T, ctx context.Context, st *Store) (int64, []i
}
return repoID, ids
}
func TestClosedStoreErrorBranches(t *testing.T) {
ctx := context.Background()
st, err := Open(ctx, filepath.Join(t.TempDir(), "gitcrawl.db"))
if err != nil {
t.Fatalf("open store: %v", err)
}
repoID, threadIDs := seedVectorThreads(t, ctx, st)
if _, err := st.SaveDurableClusters(ctx, repoID, []DurableClusterInput{{
StableKey: "closed-store",
RepresentativeThreadID: threadIDs[0],
Members: []DurableClusterMemberInput{{ThreadID: threadIDs[0]}, {ThreadID: threadIDs[1]}},
}}); err != nil {
t.Fatalf("seed durable cluster: %v", err)
}
if err := st.Close(); err != nil {
t.Fatalf("close store: %v", err)
}
checks := []struct {
name string
fn func() error
}{
{"display summaries", func() error {
_, err := st.ListDisplayClusterSummaries(ctx, ClusterSummaryOptions{RepoID: repoID, IncludeClosed: true})
return err
}},
{"run summaries", func() error {
_, err := st.ListRunClusterSummaries(ctx, ClusterSummaryOptions{RepoID: repoID})
return err
}},
{"durable summaries", func() error {
_, err := st.ListClusterSummaries(ctx, ClusterSummaryOptions{RepoID: repoID})
return err
}},
{"cluster detail", func() error {
_, err := st.ClusterDetail(ctx, ClusterDetailOptions{RepoID: repoID, ClusterID: 1})
return err
}},
{"durable detail", func() error {
_, err := st.DurableClusterDetail(ctx, ClusterDetailOptions{RepoID: repoID, ClusterID: 1})
return err
}},
{"thread cluster", func() error {
_, err := st.ClusterIDForThreadNumber(ctx, repoID, 301, true)
return err
}},
{"close cluster", func() error {
return st.CloseClusterLocally(ctx, repoID, 1, "closed")
}},
{"reopen cluster", func() error {
return st.ReopenClusterLocally(ctx, repoID, 1)
}},
{"save durable", func() error {
_, err := st.SaveDurableClusters(ctx, repoID, []DurableClusterInput{{
StableKey: "after-close",
RepresentativeThreadID: threadIDs[0],
Members: []DurableClusterMemberInput{{ThreadID: threadIDs[0]}},
}})
return err
}},
{"exclude member", func() error {
_, err := st.ExcludeClusterMemberLocally(ctx, repoID, 1, 301, "closed")
return err
}},
{"include member", func() error {
_, err := st.IncludeClusterMemberLocally(ctx, repoID, 1, 301, "closed")
return err
}},
{"canonical member", func() error {
_, err := st.SetClusterCanonicalLocally(ctx, repoID, 1, 301, "closed")
return err
}},
{"summaries", func() error {
_, err := st.summariesByThreadIDs(ctx, threadIDs)
return err
}},
{"portable prune", func() error {
_, err := st.PrunePortablePayloads(ctx, PortablePruneOptions{BodyChars: 8})
return err
}},
{"status", func() error {
_, err := st.Status(ctx)
return err
}},
{"repositories", func() error {
_, err := st.ListRepositories(ctx)
return err
}},
{"runs", func() error {
_, err := st.ListRuns(ctx, repoID, "sync", 1)
return err
}},
}
errorsSeen := 0
for _, check := range checks {
if err := check.fn(); err != nil {
errorsSeen++
}
}
if errorsSeen == 0 {
t.Fatal("closed store checks did not exercise any errors")
}
}

@ -0,0 +1,87 @@
package store
import (
"context"
"path/filepath"
"testing"
)
func TestPullRequestCacheRoundTripAndWorkflowFilters(t *testing.T) {
ctx := context.Background()
st, err := Open(ctx, filepath.Join(t.TempDir(), "gitcrawl.db"))
if err != nil {
t.Fatalf("open store: %v", err)
}
defer st.Close()
repoID, threadIDs := seedVectorThreads(t, ctx, st)
threadID := threadIDs[1]
fetchedAt := "2026-05-05T10:00:00Z"
detail := PullRequestDetail{
ThreadID: threadID, RepoID: repoID, Number: 302,
BaseSHA: "base", HeadSHA: "head", HeadRef: "feature/cache", HeadRepoFullName: "openclaw/gitcrawl-fork",
MergeableState: "clean", Additions: 12, Deletions: 3, ChangedFiles: 2,
RawJSON: "{}", FetchedAt: fetchedAt, UpdatedAt: "2026-05-05T09:59:00Z",
}
files := []PullRequestFile{
{Path: "z.go", Status: "modified", Additions: 2, Deletions: 1, Changes: 3, Patch: "@@", RawJSON: "{}", FetchedAt: fetchedAt},
{Path: "a.go", Status: "renamed", Additions: 10, Changes: 10, PreviousPath: "old.go", RawJSON: "{}", FetchedAt: fetchedAt},
}
commits := []PullRequestCommit{
{SHA: "abc", Message: "feat: cache", AuthorLogin: "alice", AuthorName: "Alice", CommittedAt: "2026-05-05T08:00:00Z", HTMLURL: "https://example.invalid/commit/abc", RawJSON: "{}", FetchedAt: fetchedAt},
}
checks := []PullRequestCheck{
{Name: "z-check", Status: "completed", Conclusion: "success", DetailsURL: "https://example.invalid/z", WorkflowName: "CI", StartedAt: "2026-05-05T09:00:00Z", CompletedAt: "2026-05-05T09:05:00Z", RawJSON: "{}", FetchedAt: fetchedAt},
{Name: "a-check", Status: "queued", RawJSON: "{}", FetchedAt: fetchedAt},
}
runs := []WorkflowRun{
{RepoID: repoID, RunID: "100", RunNumber: 7, HeadBranch: "main", HeadSHA: "head", Status: "completed", Conclusion: "success", WorkflowName: "CI", Event: "push", HTMLURL: "https://example.invalid/run/100", CreatedAtGH: "2026-05-05T09:00:00Z", UpdatedAtGH: "2026-05-05T09:05:00Z", RawJSON: "{}", FetchedAt: fetchedAt},
{RepoID: repoID, RunID: "101", RunNumber: 8, HeadBranch: "release", HeadSHA: "other", Status: "in_progress", WorkflowName: "release", Event: "workflow_dispatch", CreatedAtGH: "2026-05-05T09:10:00Z", UpdatedAtGH: "2026-05-05T09:11:00Z", RawJSON: "{}", FetchedAt: fetchedAt},
}
if err := st.UpsertPullRequestCache(ctx, detail, files, commits, checks, runs); err != nil {
t.Fatalf("upsert pr cache: %v", err)
}
cache, err := st.PullRequestCache(ctx, repoID, 302)
if err != nil {
t.Fatalf("pull request cache: %v", err)
}
if cache.Detail.HeadSHA != "head" || cache.Detail.MergeableState != "clean" {
t.Fatalf("detail = %+v", cache.Detail)
}
if len(cache.Files) != 2 || cache.Files[0].Path != "a.go" || cache.Files[0].PreviousPath != "old.go" {
t.Fatalf("files = %+v", cache.Files)
}
if len(cache.Commits) != 1 || cache.Commits[0].SHA != "abc" || cache.Commits[0].AuthorName != "Alice" {
t.Fatalf("commits = %+v", cache.Commits)
}
if len(cache.Checks) != 2 || cache.Checks[0].Name != "a-check" || cache.Checks[1].Conclusion != "success" {
t.Fatalf("checks = %+v", cache.Checks)
}
mainRuns, err := st.ListWorkflowRuns(ctx, repoID, WorkflowRunListOptions{Branch: "main", HeadSHA: "head", Limit: 5})
if err != nil {
t.Fatalf("list filtered runs: %v", err)
}
if len(mainRuns) != 1 || mainRuns[0].RunID != "100" || mainRuns[0].HTMLURL == "" {
t.Fatalf("main runs = %+v", mainRuns)
}
allRuns, err := st.ListWorkflowRuns(ctx, repoID, WorkflowRunListOptions{})
if err != nil {
t.Fatalf("list default runs: %v", err)
}
if len(allRuns) != 2 || allRuns[0].RunID != "101" {
t.Fatalf("all runs = %+v", allRuns)
}
detail.HeadSHA = "head-v2"
if err := st.UpsertPullRequestCache(ctx, detail, files[:1], nil, nil, []WorkflowRun{{RepoID: repoID, RunID: "100", RunNumber: 9, HeadBranch: "main", HeadSHA: "head-v2", Status: "completed", Conclusion: "failure", UpdatedAtGH: "2026-05-05T10:00:00Z", RawJSON: "{}", FetchedAt: fetchedAt}}); err != nil {
t.Fatalf("update pr cache: %v", err)
}
cache, err = st.PullRequestCache(ctx, repoID, 302)
if err != nil {
t.Fatalf("updated pull request cache: %v", err)
}
if cache.Detail.HeadSHA != "head-v2" || len(cache.Files) != 1 || len(cache.Commits) != 0 || len(cache.Checks) != 0 {
t.Fatalf("updated cache = %+v", cache)
}
}

@ -4,6 +4,7 @@ import (
"context"
"database/sql"
"fmt"
"strings"
"time"
)
@ -94,6 +95,50 @@ func (s *Store) LastSuccessfulSyncAt(ctx context.Context, repoID int64) (time.Ti
return parsed, nil
}
func (s *Store) LastSuccessfulListSyncAt(ctx context.Context, repoID int64, state string) (time.Time, error) {
scopes := listSyncScopesForState(state)
if len(scopes) == 0 {
return time.Time{}, nil
}
placeholders := make([]string, len(scopes))
args := make([]any, 0, 1+len(scopes))
args = append(args, repoID)
for i, scope := range scopes {
placeholders[i] = "?"
args = append(args, scope)
}
var lastSync string
err := s.q().QueryRowContext(ctx, `
select coalesce(max(finished_at), '')
from sync_runs
where repo_id = ? and status in ('success', 'completed') and scope in (`+strings.Join(placeholders, ",")+`)
`, args...).Scan(&lastSync)
if err != nil {
return time.Time{}, fmt.Errorf("read last successful list sync: %w", err)
}
if lastSync == "" {
return time.Time{}, nil
}
parsed, err := time.Parse(time.RFC3339Nano, lastSync)
if err != nil {
return time.Time{}, fmt.Errorf("parse last successful list sync %q: %w", lastSync, err)
}
return parsed, nil
}
func listSyncScopesForState(state string) []string {
switch strings.TrimSpace(strings.ToLower(state)) {
case "", "open":
return []string{"open", "all"}
case "closed":
return []string{"closed", "all"}
case "all":
return []string{"all"}
default:
return nil
}
}
func runTable(kind string) (string, error) {
switch kind {
case "sync":

@ -111,3 +111,42 @@ func TestLastSuccessfulSyncAt(t *testing.T) {
t.Fatalf("last sync = %s, want %s", lastSync, want)
}
}
func TestLastSuccessfulListSyncAtIgnoresTargetedRuns(t *testing.T) {
ctx := context.Background()
st, err := Open(ctx, filepath.Join(t.TempDir(), "gitcrawl.db"))
if err != nil {
t.Fatalf("open store: %v", err)
}
defer st.Close()
repoID, err := st.UpsertRepository(ctx, Repository{
Owner: "openclaw", Name: "gitcrawl", FullName: "openclaw/gitcrawl", RawJSON: "{}", UpdatedAt: "2026-04-26T00:00:00Z",
})
if err != nil {
t.Fatalf("repo: %v", err)
}
if _, err := st.RecordRun(ctx, RunRecord{
RepoID: repoID, Kind: "sync", Scope: "numbers:13", Status: "success",
StartedAt: "2026-04-26T00:03:00Z", FinishedAt: "2026-04-26T00:03:30Z",
}); err != nil {
t.Fatalf("record targeted run: %v", err)
}
if lastSync, err := st.LastSuccessfulListSyncAt(ctx, repoID, "open"); err != nil || !lastSync.IsZero() {
t.Fatalf("targeted run should not count as broad list sync: last=%s err=%v", lastSync, err)
}
if _, err := st.RecordRun(ctx, RunRecord{
RepoID: repoID, Kind: "sync", Scope: "all", Status: "success",
StartedAt: "2026-04-26T00:04:00Z", FinishedAt: "2026-04-26T00:04:30Z",
}); err != nil {
t.Fatalf("record all run: %v", err)
}
lastSync, err := st.LastSuccessfulListSyncAt(ctx, repoID, "open")
if err != nil {
t.Fatalf("last broad sync: %v", err)
}
want, _ := time.Parse(time.RFC3339Nano, "2026-04-26T00:04:30Z")
if !lastSync.Equal(want) {
t.Fatalf("last broad sync = %s, want %s", lastSync, want)
}
}
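The test above depends on two things from the store change: how a list-sync state maps to the run scopes that count as a broad sync, and how those scopes become a parameterized `in (...)` clause. A standalone sketch of that pattern follows. `scopesForState` mirrors the mapping in `listSyncScopesForState` from the diff, while `inClause` and the repo ID `7` are illustrative names and values, not part of the store's API:

```go
package main

import (
	"fmt"
	"strings"
)

// scopesForState mirrors the state-to-scope mapping shown in the diff:
// a targeted scope such as "numbers:13" maps to no broad scopes at all.
func scopesForState(state string) []string {
	switch strings.TrimSpace(strings.ToLower(state)) {
	case "", "open":
		return []string{"open", "all"}
	case "closed":
		return []string{"closed", "all"}
	case "all":
		return []string{"all"}
	default:
		return nil
	}
}

// inClause (hypothetical helper) builds one "?" placeholder per scope and
// the matching argument list, with the repo ID as the first argument.
func inClause(repoID int64, scopes []string) (string, []any) {
	placeholders := make([]string, len(scopes))
	args := make([]any, 0, 1+len(scopes))
	args = append(args, repoID)
	for i, scope := range scopes {
		placeholders[i] = "?"
		args = append(args, scope)
	}
	return strings.Join(placeholders, ","), args
}

func main() {
	for _, state := range []string{"open", " Closed ", "all", "numbers:13"} {
		scopes := scopesForState(state)
		if len(scopes) == 0 {
			fmt.Printf("%q: no broad scopes, report zero time\n", state)
			continue
		}
		clause, args := inClause(7, scopes)
		fmt.Printf("%q: scope in (%s) args=%v\n", state, clause, args)
	}
}
```

Only placeholders are concatenated into the SQL text; the scope values travel as bound arguments, so the query stays parameterized even though its shape varies with the number of scopes.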

@ -4,12 +4,9 @@ import (
"context"
"database/sql"
"fmt"
"os"
"path/filepath"
"runtime"
"time"
_ "modernc.org/sqlite"
crawlstore "github.com/vincentkoc/crawlkit/store"
)
const (
@ -39,64 +36,33 @@ type Status struct {
}
func Open(ctx context.Context, path string) (*Store, error) {
if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
return nil, fmt.Errorf("create db dir: %w", err)
}
if err := ensureDBFile(path); err != nil {
return nil, err
}
dsn := fmt.Sprintf(
"file:%s?_pragma=foreign_keys(1)&_pragma=journal_mode(WAL)&_pragma=synchronous(NORMAL)&_pragma=temp_store(MEMORY)&_pragma=mmap_size(268435456)&_pragma=busy_timeout(5000)",
path,
)
db, err := sql.Open("sqlite", dsn)
base, err := crawlstore.Open(ctx, crawlstore.Options{Path: path})
if err != nil {
return nil, fmt.Errorf("open sqlite: %w", err)
}
db.SetMaxOpenConns(1)
db.SetMaxIdleConns(1)
if err := db.PingContext(ctx); err != nil {
_ = db.Close()
return nil, fmt.Errorf("ping sqlite: %w", err)
}
if err := tightenDBFilePerms(path); err != nil {
_ = db.Close()
return nil, err
}
db := base.DB()
st := &Store{db: db, path: path}
if err := st.migrate(ctx); err != nil {
_ = db.Close()
_ = base.Close()
return nil, err
}
return st, nil
}
func OpenReadOnly(ctx context.Context, path string) (*Store, error) {
if _, err := os.Stat(path); err != nil {
return nil, fmt.Errorf("stat db file: %w", err)
}
dsn := fmt.Sprintf(
"file:%s?mode=ro&_pragma=query_only(1)&_pragma=foreign_keys(1)&_pragma=temp_store(MEMORY)&_pragma=mmap_size(268435456)&_pragma=busy_timeout(5000)",
path,
)
db, err := sql.Open("sqlite", dsn)
base, err := crawlstore.OpenReadOnly(ctx, path)
if err != nil {
return nil, fmt.Errorf("open sqlite readonly: %w", err)
}
db.SetMaxOpenConns(1)
db.SetMaxIdleConns(1)
if err := db.PingContext(ctx); err != nil {
_ = db.Close()
return nil, fmt.Errorf("ping sqlite readonly: %w", err)
return nil, err
}
db := base.DB()
st := &Store{db: db, path: path}
current, err := st.schemaVersion(ctx)
if err != nil {
_ = db.Close()
_ = base.Close()
return nil, err
}
if current > schemaVersion {
_ = db.Close()
_ = base.Close()
return nil, fmt.Errorf("database schema version %d is newer than supported version %d", current, schemaVersion)
}
return st, nil
@ -273,31 +239,3 @@ func (s *Store) schemaVersion(ctx context.Context) (int, error) {
}
return version, nil
}
func ensureDBFile(path string) error {
if _, err := os.Stat(path); err == nil {
return nil
} else if !os.IsNotExist(err) {
return fmt.Errorf("stat db file: %w", err)
}
file, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
if err != nil && !os.IsExist(err) {
return fmt.Errorf("create db file: %w", err)
}
if file != nil {
if err := file.Close(); err != nil {
return fmt.Errorf("close db file: %w", err)
}
}
return nil
}
func tightenDBFilePerms(path string) error {
if runtime.GOOS == "windows" {
return nil
}
if err := os.Chmod(path, 0o600); err != nil {
return fmt.Errorf("chmod db file: %w", err)
}
return nil
}

@ -6,6 +6,7 @@ import (
"encoding/hex"
"encoding/json"
"fmt"
"log/slog"
"strconv"
"strings"
"time"
@ -13,6 +14,7 @@ import (
"github.com/openclaw/gitcrawl/internal/documents"
gh "github.com/openclaw/gitcrawl/internal/github"
"github.com/openclaw/gitcrawl/internal/store"
"github.com/vincentkoc/crawlkit/progress"
)
type GitHubClient interface {
@ -45,6 +47,7 @@ type Options struct {
IncludeComments bool
IncludePRDetails bool
Reporter gh.Reporter
Logger *slog.Logger
}
type Stats struct {
@ -132,6 +135,15 @@ func (s *Syncer) Sync(ctx context.Context, options Options) (Stats, error) {
MetadataOnly: !options.IncludeComments,
StartedAt: started,
}
tracker := progress.New(options.Logger, progress.Options{
Name: "sync",
Unit: "threads",
Total: int64(len(rows)),
Attrs: []any{
"repository", stats.Repository,
"state", state,
},
})
persist := func(st *store.Store) error {
for _, row := range rows {
thread := mapIssueToThread(repoID, row, s.now().Format(time.RFC3339Nano))
@ -169,6 +181,11 @@ func (s *Syncer) Sync(ctx context.Context, options Options) (Stats, error) {
} else {
stats.IssuesSynced++
}
tracker.Add(1,
"number", thread.Number,
"kind", thread.Kind,
"thread_state", thread.State,
)
}
if len(numbers) == 0 && state == "open" && since != "" && options.Limit <= 0 {
closed, err := s.applyClosedOverlapSweep(ctx, st, repoID, options, since)
@ -193,13 +210,17 @@ func (s *Syncer) Sync(ctx context.Context, options Options) (Stats, error) {
}
if !options.IncludeComments {
if err := s.store.WithTx(ctx, persist); err != nil {
tracker.Finish(err)
return Stats{}, err
}
tracker.Finish(nil)
return stats, nil
}
if err := persist(s.store); err != nil {
tracker.Finish(err)
return Stats{}, err
}
tracker.Finish(nil)
return stats, nil
}

@ -1,9 +1,12 @@
package syncer
import (
"bytes"
"context"
"encoding/json"
"log/slog"
"path/filepath"
"strings"
"testing"
"time"
@ -286,7 +289,13 @@ func TestSyncPersistsIssuesAndPullRequests(t *testing.T) {
s := New(fakeGitHub{}, st)
s.now = func() time.Time { return time.Date(2026, 4, 26, 0, 0, 0, 0, time.UTC) }
stats, err := s.Sync(ctx, Options{Owner: "openclaw", Repo: "gitcrawl", IncludeComments: true})
var progressLogs bytes.Buffer
stats, err := s.Sync(ctx, Options{
Owner: "openclaw",
Repo: "gitcrawl",
IncludeComments: true,
Logger: testProgressLogger(&progressLogs),
})
if err != nil {
t.Fatalf("sync: %v", err)
}
@ -321,6 +330,18 @@ func TestSyncPersistsIssuesAndPullRequests(t *testing.T) {
if documentCount != 1 {
t.Fatalf("document count: got %d want 1", documentCount)
}
for _, want := range []string{
`msg="sync progress"`,
`state=finished`,
`unit=threads`,
`percent=100.0`,
`completion=100.0%`,
`repository=openclaw/gitcrawl`,
} {
if !strings.Contains(progressLogs.String(), want) {
t.Fatalf("missing %q in progress logs:\n%s", want, progressLogs.String())
}
}
}
func TestSyncHydratesPullReviewComments(t *testing.T) {
@ -644,3 +665,51 @@ func TestMappingHelperBranches(t *testing.T) {
t.Fatalf("comment = %+v", comment)
}
}
func TestMappingFallbackBranches(t *testing.T) {
now := time.Date(2026, 5, 5, 12, 0, 0, 123, time.UTC)
normalized, err := normalizeSince("2026-05-05T12:00:00+02:00", now)
if err != nil {
t.Fatalf("normalize iso since: %v", err)
}
if normalized != "2026-05-05T10:00:00Z" {
t.Fatalf("normalized iso since = %q", normalized)
}
if got, err := normalizeSince("2w", now); err != nil || got != "2026-04-21T12:00:00.000000123Z" {
t.Fatalf("normalize weeks = %q, %v", got, err)
}
if got := mustJSON(map[string]any{"bad": make(chan int)}); got != "{}" {
t.Fatalf("mustJSON marshal fallback = %q", got)
}
thread := mapIssueToThread(99, map[string]any{
"id": int64(123),
"number": 456,
"state": "closed",
"title": "fallbacks",
"body": "body",
"html_url": "https://github.com/openclaw/gitcrawl/issues/456",
"labels": nil,
"assignees": nil,
"created_at": "2026-05-05T10:00:00Z",
"updated_at": "2026-05-05T11:00:00Z",
"closed_at": "2026-05-05T12:00:00Z",
}, "2026-05-05T12:00:00Z")
if thread.LabelsJSON != "[]" || thread.AssigneesJSON != "[]" {
t.Fatalf("nullable label defaults: labels=%s assignees=%s", thread.LabelsJSON, thread.AssigneesJSON)
}
if thread.GitHubID != "123" || thread.Number != 456 || thread.AuthorLogin != "" || thread.ClosedAtGitHub == "" {
t.Fatalf("thread = %+v", thread)
}
}
func testProgressLogger(out *bytes.Buffer) *slog.Logger {
return slog.New(slog.NewTextHandler(out, &slog.HandlerOptions{
ReplaceAttr: func(_ []string, attr slog.Attr) slog.Attr {
if attr.Key == slog.TimeKey {
return slog.Attr{}
}
return attr
},
}))
}
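`testProgressLogger` makes log output assertable by dropping the timestamp attribute. A minimal standalone sketch of the same `log/slog` technique, assuming nothing beyond the standard library. `newTestLogger` and `render` are illustrative names, and the message and attributes are sample values rather than the progress package's real output:

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// newTestLogger returns a text logger that omits the top-level time attribute,
// so the same log call always produces the same bytes.
func newTestLogger(out *bytes.Buffer) *slog.Logger {
	return slog.New(slog.NewTextHandler(out, &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, attr slog.Attr) slog.Attr {
			if len(groups) == 0 && attr.Key == slog.TimeKey {
				return slog.Attr{} // a zero Attr is discarded by the handler
			}
			return attr
		},
	}))
}

// render (hypothetical helper) logs one Info record and returns the captured text.
func render(msg string, args ...any) string {
	var buf bytes.Buffer
	newTestLogger(&buf).Info(msg, args...)
	return buf.String()
}

func main() {
	// Prints: level=INFO msg="sync progress" state=finished unit=threads
	fmt.Print(render("sync progress", "state", "finished", "unit", "threads"))
}
```

With the timestamp gone, substring checks such as `state=finished` in the test above stay stable across runs.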

@ -172,7 +172,9 @@ function markdownToHtml(markdown, currentRel) {
const flushParagraph = () => {
if (!paragraph.length) return;
html.push(`<p>${inline(paragraph.join(" "), currentRel)}</p>`);
const text = paragraph.join(" ");
const className = currentRel === "index.md" && /^\[Quickstart\]\([^)]*\)\s+\[View on GitHub\]\(/.test(text) ? ' class="home-actions"' : "";
html.push(`<p${className}>${inline(text, currentRel)}</p>`);
paragraph = [];
};
const closeList = () => {
@ -219,7 +221,7 @@ function markdownToHtml(markdown, currentRel) {
closeList();
flushBlockquote();
if (fence) {
html.push(`<pre><code class="language-${fence.lang}">${escapeHtml(fence.lines.join("\n"))}</code></pre>`);
html.push(`<pre><code class="language-${escapeAttr(fence.lang)}">${highlightCode(fence.lines.join("\n"), fence.lang)}</code></pre>`);
fence = null;
} else {
fence = { lang: fenceMatch[1] || "text", lines: [] };
@ -304,6 +306,89 @@ function markdownToHtml(markdown, currentRel) {
return html.join("\n");
}
function highlightCode(code, lang) {
const normalized = String(lang || "text").toLowerCase();
if (["bash", "sh", "shell", "zsh"].includes(normalized)) return highlightBash(code);
if (normalized === "json") return highlightJSON(code);
if (normalized === "toml") return highlightConfig(code, "toml");
if (["yaml", "yml"].includes(normalized)) return highlightConfig(code, "yaml");
if (normalized === "cron") return highlightCron(code);
return escapeHtml(code);
}
function highlightBash(code) {
return code.split("\n").map((line) => {
if (/^\s*#/.test(line)) return span("comment", line);
return highlightSegments(line, /("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|`[^`]*`|\$\{?[A-Za-z_][A-Za-z0-9_]*\}?|--?[A-Za-z0-9][A-Za-z0-9_-]*|\b(?:brew|case|cd|curl|do|done|else|esac|export|fi|for|gh|git|gitcrawl|go|if|in|jq|ln|local|mkdir|set|then|while)\b|#.*)/g, (token) => {
if (token.startsWith("#")) return span("comment", token);
if (/^["'`]/.test(token)) return span("string", token);
if (token.startsWith("$")) return span("variable", token);
if (token.startsWith("-")) return span("option", token);
return span("keyword", token);
});
}).join("\n");
}
function highlightJSON(code) {
return highlightSegments(code, /("(?:\\.|[^"\\])*"\s*:)|("(?:\\.|[^"\\])*")|\b(?:true|false|null)\b|-?\b\d+(?:\.\d+)?(?:[eE][+-]?\d+)?\b/g, (token) => {
if (token.endsWith(":")) return `${span("key", token.slice(0, -1))}:`;
if (token.startsWith('"')) return span("string", token);
if (/^(?:true|false|null)$/.test(token)) return span("literal", token);
return span("number", token);
});
}
function highlightConfig(code, lang) {
return code.split("\n").map((line) => {
if (/^\s*#/.test(line)) return span("comment", line);
const commentMatch = line.match(/(^|[^"'])#.*/);
const commentStart = commentMatch ? commentMatch.index + commentMatch[1].length : -1;
const body = commentStart >= 0 ? line.slice(0, commentStart) : line;
const comment = commentStart >= 0 ? line.slice(commentStart) : "";
const highlighted = lang === "toml"
? highlightSegments(body, /(^\s*[A-Za-z0-9_.-]+(?=\s*=))|("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')|\b(?:true|false)\b|-?\b\d+(?:\.\d+)?\b/g, configToken)
: highlightSegments(body, /(^\s*[A-Za-z0-9_.-]+(?=\s*:))|("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')|\b(?:true|false|null)\b|-?\b\d+(?:\.\d+)?\b/g, configToken);
return highlighted + (comment ? span("comment", comment) : "");
}).join("\n");
}
function configToken(token) {
if (/^\s*[A-Za-z0-9_.-]+$/.test(token)) {
const leading = token.match(/^\s*/)[0];
return `${escapeHtml(leading)}${span("key", token.slice(leading.length))}`;
}
if (/^["']/.test(token)) return span("string", token);
if (/^(?:true|false|null)$/.test(token)) return span("literal", token);
return span("number", token);
}
function highlightCron(code) {
return code.split("\n").map((line) => {
if (/^\s*#/.test(line)) return span("comment", line);
return highlightSegments(line, /(\*|(?:\d+)(?:[-/,]\d+)*)|("[^"]*"|'[^']*')|#.*|\b[A-Z_][A-Z0-9_]*=/g, (token) => {
if (token.startsWith("#")) return span("comment", token);
if (/^["']/.test(token)) return span("string", token);
if (token.endsWith("=")) return span("key", token.slice(0, -1)) + "=";
return span("number", token);
});
}).join("\n");
}
function highlightSegments(text, pattern, classify) {
let out = "";
let last = 0;
for (const match of text.matchAll(pattern)) {
out += escapeHtml(text.slice(last, match.index));
out += classify(match[0]);
last = match.index + match[0].length;
}
return out + escapeHtml(text.slice(last));
}
function span(kind, value) {
return `<span class="hl-${kind}">${escapeHtml(value)}</span>`;
}
function inline(text, currentRel) {
const stash = [];
let out = text.replace(/`([^`]+)`/g, (_, code) => {
@ -346,10 +431,7 @@ function tocFromHtml(html) {
const re = /<h([23]) id="([^"]+)">([\s\S]*?)<\/h[23]>/g;
let m;
while ((m = re.exec(html))) {
const text = m[3]
.replace(/<a class="anchor"[^>]*>.*?<\/a>/, "")
.replace(/<[^>]+>/g, "")
.trim();
const text = htmlTextContent(m[3]).replace(/^#/, "").trim();
items.push({ level: Number(m[1]), id: m[2], text });
}
if (items.length < 2) return "";
@ -500,6 +582,33 @@ function escapeAttr(value) {
return escapeHtml(value);
}
function htmlTextContent(fragment) {
let out = "";
let inTag = false;
for (const char of fragment) {
if (char === "<") {
inTag = true;
continue;
}
if (inTag) {
if (char === ">") inTag = false;
continue;
}
out += char;
}
return decodeHtmlText(out);
}
function decodeHtmlText(value) {
return String(value).replace(/&(amp|lt|gt|quot|#39);/g, (_, entity) => ({
amp: "&",
lt: "<",
gt: ">",
quot: '"',
"#39": "'",
})[entity]);
}
function validateLinks(outputDir) {
const failures = [];
for (const file of allHtml(outputDir)) {
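The highlighter added in this file rests on one idea: walk the regex matches in order, HTML-escape the gaps between them, and wrap each match in a classed span. A sketch of that technique, written in Go for consistency with the rest of the changeset rather than in the docs builder's JavaScript. The token pattern is deliberately simplified (strings, literals, and integers only), and `html.EscapeString` may choose different entities than the builder's own `escapeHtml`:

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// span wraps an escaped token in a highlight class, like the JS helper in the diff.
func span(kind, value string) string {
	return `<span class="hl-` + kind + `">` + html.EscapeString(value) + `</span>`
}

// highlightSegments escapes the text between matches and lets classify
// render each matched token.
func highlightSegments(text string, pattern *regexp.Regexp, classify func(string) string) string {
	var out strings.Builder
	last := 0
	for _, loc := range pattern.FindAllStringIndex(text, -1) {
		out.WriteString(html.EscapeString(text[last:loc[0]]))
		out.WriteString(classify(text[loc[0]:loc[1]]))
		last = loc[1]
	}
	out.WriteString(html.EscapeString(text[last:]))
	return out.String()
}

// jsonTokens is a simplified pattern (assumption): quoted strings,
// true/false/null, and integers. Keys are not distinguished from values.
var jsonTokens = regexp.MustCompile(`"(?:\\.|[^"\\])*"|\b(?:true|false|null)\b|-?\d+`)

func highlightJSON(code string) string {
	return highlightSegments(code, jsonTokens, func(token string) string {
		switch {
		case strings.HasPrefix(token, `"`):
			return span("string", token)
		case token == "true" || token == "false" || token == "null":
			return span("literal", token)
		default:
			return span("number", token)
		}
	})
}

func main() {
	fmt.Println(highlightJSON(`{"a": 1, "b": "<x>", "c": null}`))
}
```

Because every byte of the input passes through either the gap escape or `span`, no raw markup from a code fence can reach the generated page.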

@ -1,7 +1,7 @@
export function css() {
return `
:root{--ink:#0f1115;--text:#1f2328;--text-soft:#3b4147;--muted:#6b7280;--subtle:#9aa1ab;--bg:#fafafa;--paper:#ffffff;--accent:#2563eb;--accent-strong:#1d4ed8;--accent-soft:rgba(37,99,235,.08);--line:#e5e7eb;--line-soft:#eef0f3;--branch:#d0d7de;--code-bg:#0f172a;--code-fg:#e6edf3;--code-border:#1f2937;--code-scroll:#334155;--shadow:rgba(15,17,21,.08);--shadow-strong:rgba(15,17,21,.18);--tag-bg:#ddf4ff;--tag-fg:#0969da;--ring:rgba(37,99,235,.32);color-scheme:light}
[data-theme="dark"]{--ink:#e6edf3;--text:#c9d1d9;--text-soft:#8b949e;--muted:#8b949e;--subtle:#6e7681;--bg:#0d1117;--paper:#161b22;--accent:#58a6ff;--accent-strong:#79b8ff;--accent-soft:rgba(56,139,253,.16);--line:#30363d;--line-soft:#21262d;--branch:#30363d;--code-bg:#010409;--code-fg:#e6edf3;--code-border:#21262d;--code-scroll:#30363d;--shadow:rgba(0,0,0,.5);--shadow-strong:rgba(0,0,0,.7);--tag-bg:rgba(56,139,253,.16);--tag-fg:#58a6ff;--ring:rgba(56,139,253,.4);color-scheme:dark}
:root{--ink:#0f1115;--text:#1f2328;--text-soft:#3b4147;--muted:#6b7280;--subtle:#9aa1ab;--bg:#fafafa;--paper:#ffffff;--accent:#2563eb;--accent-strong:#1d4ed8;--accent-soft:rgba(37,99,235,.08);--line:#e5e7eb;--line-soft:#eef0f3;--branch:#d0d7de;--code-bg:#0f172a;--code-fg:#e6edf3;--code-border:#1f2937;--code-scroll:#334155;--hl-comment:#94a3b8;--hl-keyword:#93c5fd;--hl-string:#86efac;--hl-number:#fbbf24;--hl-literal:#c4b5fd;--hl-key:#67e8f9;--hl-variable:#f0abfc;--hl-option:#fda4af;--shadow:rgba(15,17,21,.08);--shadow-strong:rgba(15,17,21,.18);--tag-bg:#ddf4ff;--tag-fg:#0969da;--ring:rgba(37,99,235,.32);color-scheme:light}
[data-theme="dark"]{--ink:#e6edf3;--text:#c9d1d9;--text-soft:#8b949e;--muted:#8b949e;--subtle:#6e7681;--bg:#0d1117;--paper:#161b22;--accent:#58a6ff;--accent-strong:#79b8ff;--accent-soft:rgba(56,139,253,.16);--line:#30363d;--line-soft:#21262d;--branch:#30363d;--code-bg:#010409;--code-fg:#e6edf3;--code-border:#21262d;--code-scroll:#30363d;--hl-comment:#8b949e;--hl-keyword:#79c0ff;--hl-string:#a5d6ff;--hl-number:#ffa657;--hl-literal:#d2a8ff;--hl-key:#7ee787;--hl-variable:#ff7b72;--hl-option:#f2cc60;--shadow:rgba(0,0,0,.5);--shadow-strong:rgba(0,0,0,.7);--tag-bg:rgba(56,139,253,.16);--tag-fg:#58a6ff;--ring:rgba(56,139,253,.4);color-scheme:dark}
*{box-sizing:border-box}
html{scroll-behavior:smooth;scroll-padding-top:24px}
body{margin:0;background:var(--bg);color:var(--text);font-family:"Inter",ui-sans-serif,system-ui,-apple-system,Segoe UI,sans-serif;line-height:1.65;overflow-x:hidden;-webkit-font-smoothing:antialiased;font-feature-settings:"cv02","cv03","cv04","cv11"}
@ -70,6 +70,14 @@ body:not(.home) .doc>h1:first-child{display:none}
.doc :is(h2,h3,h4) .anchor:hover{opacity:1;color:var(--accent);text-decoration:none}
.doc p{margin:0 0 1.05em}
.doc-home>p:first-of-type{font-size:1.12rem;color:var(--text-soft);line-height:1.6;margin:0 0 1.3em;max-width:60ch}
.home-actions{display:flex;flex-wrap:wrap;gap:10px;margin:0 0 1.7em!important}
.home-actions a{display:inline-flex;align-items:center;justify-content:center;min-height:42px;border:1px solid var(--line);border-radius:8px;padding:8px 14px;background:var(--paper);color:var(--ink);font-weight:600;font-size:.94rem;line-height:1.2;text-decoration:none;box-shadow:0 1px 2px var(--shadow);transition:transform .14s,border-color .14s,background .14s,color .14s,box-shadow .14s}
.home-actions a:hover{text-decoration:none;transform:translateY(-1px);border-color:var(--accent);box-shadow:0 4px 12px var(--shadow)}
.home-actions a:focus-visible{outline:2px solid var(--accent);outline-offset:2px}
.home-actions a:first-child{background:var(--accent);border-color:var(--accent);color:#fff;box-shadow:0 3px 10px var(--ring)}
.home-actions a:first-child:hover{background:var(--accent-strong);border-color:var(--accent-strong);color:#fff}
.home-actions a:first-child::before{content:"↗";font-size:.9em;margin-right:8px}
.home-actions a[href*="github.com"]::before{content:"";width:15px;height:15px;margin-right:8px;background:currentColor;clip-path:path("M7.5 0C3.36 0 0 3.45 0 7.7c0 3.4 2.15 6.28 5.13 7.3.38.07.51-.17.51-.37v-1.31c-2.08.46-2.52-1.03-2.52-1.03-.34-.89-.83-1.12-.83-1.12-.68-.48.05-.47.05-.47.75.05 1.15.79 1.15.79.67 1.17 1.75.83 2.18.64.07-.5.26-.83.47-1.02-1.66-.2-3.41-.85-3.41-3.78 0-.83.29-1.52.77-2.05-.08-.2-.34-1.02.07-2.02 0 0 .63-.21 2.06.78.6-.17 1.24-.26 1.88-.26.64 0 1.28.09 1.88.26 1.43-.99 2.06-.78 2.06-.78.41 1 .15 1.82.07 2.02.48.53.77 1.22.77 2.05 0 2.94-1.75 3.58-3.42 3.78.27.24.51.72.51 1.45v2.1c0 .2.14.44.52.37A7.7 7.7 0 0 0 15 7.7C15 3.45 11.64 0 7.5 0Z")}
.doc ul,.doc ol{padding-left:1.3rem;margin:0 0 1.15em}
.doc li{margin:.25em 0}
.doc li>p{margin:0 0 .4em}
@@ -80,6 +88,14 @@ body:not(.home) .doc>h1:first-child{display:none}
.doc pre::-webkit-scrollbar{height:8px;width:8px}
.doc pre::-webkit-scrollbar-thumb{background:var(--code-scroll);border-radius:8px}
.doc pre code{display:block;background:transparent;border:0;color:inherit;padding:0;font-size:1em;white-space:pre}
.doc pre .hl-comment{color:var(--hl-comment);font-style:italic}
.doc pre .hl-keyword{color:var(--hl-keyword);font-weight:500}
.doc pre .hl-string{color:var(--hl-string)}
.doc pre .hl-number{color:var(--hl-number)}
.doc pre .hl-literal{color:var(--hl-literal);font-weight:500}
.doc pre .hl-key{color:var(--hl-key)}
.doc pre .hl-variable{color:var(--hl-variable)}
.doc pre .hl-option{color:var(--hl-option)}
.doc pre .copy{position:absolute;top:8px;right:8px;background:rgba(255,255,255,.06);color:var(--code-fg);border:1px solid rgba(255,255,255,.16);border-radius:6px;padding:3px 9px;font:500 .7rem/1 "Inter",sans-serif;cursor:pointer;opacity:0;transition:opacity .15s,background .15s,border-color .15s}
.doc pre:hover .copy,.doc pre .copy:focus{opacity:1}
.doc pre .copy:hover{background:rgba(255,255,255,.12)}