3.1 KiB
| title | layout | nav_order | description | permalink |
|---|---|---|---|---|
| Home | home | 1 | gitcrawl is a local-first GitHub issue and pull request crawler for maintainer triage. | / |
gitcrawl
{: .fs-9 }
A local-first GitHub triage tool and a drop-in caching gh shim. Sync issues and PRs into SQLite for search and clustering — then let agents call gh against that same cache so you stop burning the API rate limit.
{: .fs-6 .fw-300 }
Quickstart{: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 } View on GitHub{: .btn .fs-5 .mb-4 .mb-md-0 }
Two jobs, one binary
gitcrawl mirrors a GitHub repository's issues and pull requests into local SQLite, then layers semantic clustering, full-text search, and a gh-compatible shim on top — so a maintainer (or an agent acting on their behalf) can triage threads and serve everyday gh reads without burning live API quota.
- Local SQLite first. All issues, PRs, comments, reviews, files, commits, checks, and workflow runs land in
~/.config/gitcrawl/gitcrawl.db. Queries hit the disk, not GitHub. - Drop-in
ghcache. Symlinkgitcrawl-ghasghand most read-only calls (gh search,gh issue/pr view,gh pr checks,gh run, REST GETs, GraphQL queries) answer from local SQLite. Agents stop hitting rate limits; mutating commands pass through unchanged. Rungh xcache statsto see hit rate, per-command misses, and evictions. - Semantic clustering. OpenAI embeddings group related reports, with deterministic GitHub reference evidence (
#123,pull/123) preventing weak similarity bridges from forming mega-clusters. - Terminal UI.
gitcrawl tuiis a keyboard- and mouse-driven cluster browser with bidirectional sort, jump-to-number, neighbors, and member-level governance actions. - Agent-friendly JSON. Every command supports
--jsonfor clean automation surfaces.
Pick your path
I want to try it
Quickstart walks you from git clone to a populated cluster view in five minutes.
I want to wire up an agent
The gh shim is the fastest way to cut GitHub API load — point your agent at gitcrawl-gh, keep the agent's gh calls intact.
I want to triage a busy repo
Read Sync to bring data local, then Clustering and the TUI for the maintainer workflow.
I want the full reference
Commands lists every flag and JSON field. Configuration covers env vars and paths.
Project status
Early bootstrap. The implementation is being built in small commits — see the changelog for what shipped recently.
The product contract in SPEC.md is the source of truth for what is in and out of scope.
Out of scope
- Local HTTP API
- Hosted service runtime
- Browser web UI
- GitHub write-back actions (use
ghfor those)
License
Released under the MIT license.