docs: document crawlkit archive surfaces

Vincent Koc 2026-05-05 19:16:52 -07:00
parent 98be6a9c11
commit f32dae98cc
4 changed files with 33 additions and 0 deletions

@@ -2,6 +2,21 @@
## 0.7.0 - Unreleased
### Changes
- Document the crawlkit-backed config/status/control, snapshot, mirror,
sync-state, output, and shared TUI surfaces now used on `main`.
- Clarify that Discord bot sync, desktop wiretap parsing, DM privacy filters,
schema ownership, FTS/ranking, embeddings, and analytics remain app-owned.
- Align terminal browser docs with the gitcrawl-style shared TUI model:
channel/person/thread groups, message rows, detail/thread panes, sorting,
mouse selection, right-click actions, and local/remote status chrome.
### Maintenance
- Document the read-only `metadata --json`, `status --json`, and
`doctor --json` control surface for launchers, automation, and CI checks.
### Fixes
- `wiretap` now uses a fast default path for Discord Chromium cache imports: it scans cheap context files plus route-bearing HTTP cache entries, checkpoints file progress in batches, and leaves exhaustive historical cache archaeology behind `--full-cache` / `desktop.full_cache`.

@@ -23,6 +23,8 @@ Wiretap DMs stay local and are never exported to the Git-backed snapshot mirror.
- imports classifiable Discord Desktop cache messages with `wiretap`, including proven DMs under `@me`
- publishes and imports private Git-backed archive snapshots for org-wide read access
- browses stored messages and local DMs in a terminal archive UI
- exposes `metadata --json`, `status --json`, and `doctor --json` for local
launchers, automation, and CI
- supports Git-only read mode with no Discord credentials on reader machines
- generates backup README activity reports, with optional AI-written field notes
- exposes read-only SQL for ad hoc analysis
@@ -171,6 +173,12 @@ discrawl tui --dm
discrawl --json tui --limit 50
```
The terminal browser uses the shared crawlkit explorer. The left pane groups
channels, people, or threads; the middle pane lists messages; the right pane
shows the selected message, surrounding conversation, and thread detail. Mouse
selection, right-click actions, sortable headers, and the local/remote footer
follow the same interaction model as `gitcrawl tui`.
### `init`
Creates the local config and discovers accessible guilds.

@@ -18,6 +18,7 @@ Mirror Discord guilds into local SQLite. Search server history without depending
- **Just want to read a shared archive?** Use [`subscribe`](commands/subscribe.html) - no token needed.
- **Need DM search?** [`wiretap`](commands/wiretap.html) imports local Discord Desktop cache.
- **Want semantic search?** Configure [Embeddings](guides/embeddings.html), then run [`embed`](commands/embed.html).
- **Wiring an agent or launcher?** `discrawl metadata --json`, `discrawl status --json`, and `discrawl doctor --json` expose the read-only crawlkit control surface.
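A launcher or CI job could smoke-test the three read-only commands along these lines. This is a minimal sketch, not part of discrawl: the helper names `check_json_surface` and `check_all` are hypothetical, and it assumes each command exits non-zero on failure and prints a JSON object or array on stdout. The output schema is not documented here, so the check only looks at the first non-whitespace character rather than parsing the JSON.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not discrawl APIs.

# Run the command given as arguments; succeed only if it exits 0 and its
# stdout looks JSON-shaped (starts with "{" or "["). This is a cheap shape
# check, not a real JSON parser.
check_json_surface() {
  local out
  out="$("$@")" || { echo "FAIL (exit status): $*" >&2; return 1; }
  # Strip leading whitespace before inspecting the first character.
  out="${out#"${out%%[![:space:]]*}"}"
  case "$out" in
    '{'*|'['*) echo "ok: $*" ;;
    *) echo "FAIL (not JSON-shaped): $*" >&2; return 1 ;;
  esac
}

# Check all three control commands; the binary name defaults to "discrawl".
# Assumption: a non-zero exit status signals a problem.
check_all() {
  local bin="${1:-discrawl}" sub rc=0
  for sub in metadata status doctor; do
    check_json_surface "$bin" "$sub" --json || rc=1
  done
  return "$rc"
}
```

In a CI step, `check_all || exit 1` would fail the job if any of the three commands misbehaves, while still reporting the result of each one.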
## At a glance
@@ -30,6 +31,10 @@ discrawl search "panic: nil pointer"
discrawl tail
```
`discrawl tui` uses the shared crawlkit terminal explorer: channel/person/thread
groups on the left, message rows in the middle, and readable message/thread
detail on the right.
## Sections
- **[Start](install.html)** - install, configure, set up the Discord bot, security notes, contact

@@ -2,6 +2,11 @@
Discrawl can publish the SQLite archive as sharded, compressed NDJSON snapshots in a private Git repo, then auto-import that repo before local read commands. This gives readers org memory without Discord credentials.
Snapshot packing/import and git mirror mechanics are shared through
`crawlkit`. Discrawl still owns Discord-specific privacy policy: `@me` direct
messages, wiretap sync state, and local-only desktop rows are excluded from
published snapshots and are preserved locally on import.
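The publish-side half of that policy can be pictured as a filter over NDJSON rows. The sketch below is purely illustrative and is not discrawl's implementation: the `filter_publishable` name and the `guild_id` field are assumptions made for the example, based only on the description of DMs living under `@me`. The real snapshot schema may differ.

```bash
#!/usr/bin/env bash
# Illustrative only: the row shape (a "guild_id" field per NDJSON line) is a
# hypothetical assumption, not the documented snapshot schema.

# Read NDJSON on stdin and drop any row whose guild_id is "@me", so direct
# messages never reach the published snapshot.
filter_publishable() {
  # grep exits 1 when every row is filtered out; treat that as success and
  # only propagate real errors (exit status 2 or higher).
  grep -Ev '"guild_id"[[:space:]]*:[[:space:]]*"@me"' || [ $? -eq 1 ]
}
```

For example, piping a guild row and an `@me` row through `filter_publishable` would emit only the guild row. A regex over raw lines is a simplification; a real filter would parse each row instead of pattern-matching it.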
## Publisher
```bash