clawbench/scripts
2026-04-22 12:03:13 -07:00
_archive_cache.sh clawbench: per-sweep cache archiving + generic sweep templates 2026-04-18 12:46:45 -07:00
analyze_open_vs_closed.py ClawBench: 7-model frontier baseline + bake-off tooling 2026-04-10 19:14:11 -07:00
audit_per_run.py analysis: fair-comparison audit and rejudge pipeline 2026-04-20 19:48:43 -07:00
audit_runs.py analysis: fair-comparison audit and rejudge pipeline 2026-04-20 19:48:43 -07:00
classify_regimes.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00
compute_constraint_index.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00
container_sweep_minimal.sh clawbench: per-sweep cache archiving + generic sweep templates 2026-04-18 12:46:45 -07:00
container_sweep_single.sh sweep: per-container state isolation + qwen model-id fix 2026-04-20 19:48:30 -07:00
generate_dynamical_report.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00
generate_fair_report.py analysis: fair-comparison audit and rejudge pipeline 2026-04-20 19:48:43 -07:00
git_checkpoint.py clawbench: per-sweep cache archiving + generic sweep templates 2026-04-18 12:46:45 -07:00
ingest_real_run.py ClawBench: 7-model frontier baseline + bake-off tooling 2026-04-10 19:14:11 -07:00
inject_judge_rubrics.py ClawBench: 7-model frontier baseline + bake-off tooling 2026-04-10 19:14:11 -07:00
refactor_verifiers.py ClawBench: 7-model frontier baseline + bake-off tooling 2026-04-10 19:14:11 -07:00
rejudge_all.py analysis: fair-comparison audit and rejudge pipeline 2026-04-20 19:48:43 -07:00
run_open_vs_closed_bakeoff.py ClawBench: 7-model frontier baseline + bake-off tooling 2026-04-10 19:14:11 -07:00
run_posterior_dynamics_pipeline.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00
scale_timeouts.py ClawBench: 7-model frontier baseline + bake-off tooling 2026-04-10 19:14:11 -07:00
seed_historical_db.py ClawBench: 7-model frontier baseline + bake-off tooling 2026-04-10 19:14:11 -07:00
snr_weighted_ranking.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00
survival_analysis.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00
variance_decomp.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00