Context:
The current 40-task set is being split into a private holdout set plus a
new public set. The public repo will ship a different task set that
doesn't give away the holdout; in the meantime, stop tracking the current
tasks/ directory so benchmarking can continue locally without exposing
the set externally.
Changes:
- .gitignore: add tasks/ and lab-pr68627/ (vendored PR content, also
  moving out of the public repo).
- git rm --cached tasks/: remove from tracking (files remain on disk
  locally).
- tests/test_integration_checks.py:
  * Add a module-level pytest.mark.skipif that skips the whole file when
    tasks/ is absent, so CI against the public repo (no tasks) stays
    green once the private set moves out.
  * Update the t2-node-search-patch fixture to also define emptyNote(),
    since the task was hardened with that distractor. Without this, the
    integration test asserts score == 1.0 but gets 0.0 (the new
    "emptyNote stays empty" test fails against a fixture that never
    defines emptyNote).
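For reference, the module-level skip described above could look roughly like the sketch below. This is not the code from the commit: the helper name tasks_dir_missing, the REPO_ROOT constant, and the assumption that the test file sits one level below the repo root (in tests/) are all illustrative.

```python
from pathlib import Path

import pytest


def tasks_dir_missing(repo_root: Path) -> bool:
    """Return True when repo_root has no tasks/ directory (hypothetical helper)."""
    return not (repo_root / "tasks").is_dir()


# Assumption: this file lives in tests/, so the repo root is one level up.
REPO_ROOT = Path(__file__).resolve().parent.parent

# pytest applies a module-level `pytestmark` to every test collected from
# this file, so the whole module is skipped when tasks/ is absent.
pytestmark = pytest.mark.skipif(
    tasks_dir_missing(REPO_ROOT),
    reason="tasks/ is absent (private task set is not shipped in the public repo)",
)
```

Because the condition is evaluated at collection time, a checkout without tasks/ reports the tests as skipped rather than failed.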
Follow-up (separate work):
Public task set lands in a subsequent commit. Holdout access path
(encrypted-in-repo or private-repo) gets wired into the harness's
private_tasks_root / hidden_tasks_dir plumbing.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
23 lines
332 B
Plaintext
__pycache__/
*.pyc
*.pyo
.venv/
dist/
*.egg-info/
.pytest_cache/
results/
.env
.tmp/
data/
.DS_Store
.clawbench/
reports/
scripts/__pycache__/
scripts/*.tmp
scripts/*.local.py

# Task set is being split into a private holdout + a new public set.
# Until the public set lands, keep the current tasks/ local-only.
tasks/
lab-pr68627/