5.3 KiB
| summary | read_when | |||
|---|---|---|---|---|
| Feature spec: import a skill from a public GitHub URL (auto-detect SKILL.md, selective file upload, provenance). |
|
GitHub import (public repos)
Goal: paste a GitHub URL → auto-detect skill → preview files → publish (selective) → persist provenance.
Non-goal (v1): private repos (no OAuth/PAT support).
Related:
docs/skill-format.md(what counts as a skill; text-only limits)docs/api.md/docs/http-api.md(REST patterns + auth)
UX
Upload page: “Import from GitHub” mode.
Flow:
- URL input
- Detect skill candidates (SKILL.md)
- If multiple candidates: choose one
- File picker: check/uncheck; smart-select referenced files
- Confirm slug/name/version/tags
- Import → publish
Accepted URLs
Allowlist: https://github.com/... only.
Supported shapes:
- Repo root:
https://github.com/<owner>/<repo> - Tree path:
https://github.com/<owner>/<repo>/tree/<ref>/<path> - Blob path (file):
https://github.com/<owner>/<repo>/blob/<ref>/<path>
Normalization:
- Strip query/hash for fetch.
- From
blob/.../SKILL.mdderivepathas parent folder. - If
refmissing: useHEAD.
Reject:
- Non-GitHub hosts.
- Unknown URL patterns.
- Paths containing
..after normalization.
Fetch strategy (public)
Download archive:
https://github.com/<owner>/<repo>/archive/<ref>.zip- Follow redirects. Final redirect usually pins a commit via
codeload.github.com/.../zip/<sha-or-branch>.
Unzip server-side (Node or Convex node action). Scan for skill candidates.
Skill candidate definition:
- Any folder containing
SKILL.mdorskill.md(also acceptskills.mdfor compatibility). - Treat repo root as a folder too.
Multiple skills:
- Return candidate list:
{ path, frontmatter.name, frontmatter.description }. - User chooses one.
Smart file selection
Defaults:
- Always select
SKILL.md(or chosen readme file). - Prefer selecting only within chosen skill folder; allow “include out-of-folder refs” if explicitly toggled.
Referenced file expansion:
- Parse Markdown links/images from selected
.mdfiles:[](<rel>),,<rel>only when relative.- Ignore
http(s):,mailto:,#anchors. - Strip query/hash from relative targets.
- Resolve against the current file’s directory.
- Normalize, reject escapes (
..). - Add referenced file if present in archive and is text-allowed.
- Recurse for newly added
.mdfiles.
Hard caps:
- Max recursion depth (e.g. 4).
- Max referenced additions (e.g. 200).
UI affordances:
- “Select referenced”
- “Select all text”
- “Clear”
- Search/filter by path
Publish behavior
Server publishes using existing pipeline:
- Text-only enforced (see
docs/skill-format.md). - Total ≤ 50MB (selected set).
- Must include
SKILL.md(or accepted variant).
Suggested defaults (UI):
displayName: frontmatternameelse folder basename → title case.slug: sanitize folder basename; if collision, suffix (-2,-3, …).version: if new skill →0.1.0; if updating own existing skill → bump patch.tags: defaultlatest.
Provenance (persist source)
Persist on each published version (server-side injection; no mutation of imported files):
- Store in
skillVersions.parsed.metadata.source:
Example:
{
"kind": "github",
"url": "https://github.com/visionik/ouracli",
"repo": "visionik/ouracli",
"ref": "HEAD",
"commit": "66ac8fb266b7c5ff6519431862be6a375bbfb883",
"path": "",
"importedAt": 1767930000000
}
Why parsed.metadata:
- Already optional and stored with each version.
- No schema churn for v1.
Future: canonical-claim
- “claim canonical” can key off
{ kind:'github', repo, path }. - Prefer commit-pinned provenance for auditability; allow UI to show “Imported from …”.
API sketch (internal actions)
Two-step (recommended):
previewGitHubImport(url)→{ commit, candidates:[...], files:[...], defaults:{...} }importGitHubSkill({ url, commit, candidatePath, selectedPaths, slug, displayName, version, tags })
Notes:
importGitHubSkillshould re-fetch by pinnedcommit(not floating branch), to avoid TOCTOU.- Validate
selectedPathssubset of fetched archive manifest.
Security / abuse controls
SSRF:
- Only
github.com(+codeload.github.comduring redirect follow). - No arbitrary redirects to other hosts.
Zip safety:
- Max compressed bytes (from
Content-Lengthif present; else streaming cap). - Max uncompressed total bytes.
- Max file count.
- Max single file size.
- Reject symlinks; reject absolute paths; reject
..segments.
Rate limits:
- Tie to existing write limits (import == publish).
- Cache preview results briefly (e.g. 60s) keyed by
{repo, commit}.
Error UX:
- “No SKILL.md found.”
- “Multiple skills found; pick one.”
- “Repo too large / too many files.”
- “Selected files exceed 50MB.”
Manual test checklist
- Repo root skill (
SKILL.mdat root). - Nested skill (
skills/foo/SKILL.md). - Multi-skill repo (two SKILL.md).
- SKILL.md references
docs/usage.md+ images; smart-select picks.mdand referenced text files; ignores external links. - Huge repo → clean “too large” error.
- Redirect pinning → import stores commit sha in provenance.