clawbench

Vincent Koc fb486a1ed3 fix(scoring): gate judge-weighted scores	2026-04-28 22:52:12 -07:00
__init__.py	Bench: redesign v0.4 benchmark and HF runtime	2026-04-09 11:15:30 -07:00
cli.py	fix(scoring): gate judge-weighted scores	2026-04-28 22:52:12 -07:00
client.py	Add archive dynamics pipeline and audience-based model presets	2026-04-22 12:03:13 -07:00
diagnose_cli.py	Add public domain scaffold and adapter diagnostics	2026-04-23 12:40:23 -07:00
diagnostic.py	chore(dev): add lint guardrails	2026-04-28 10:50:07 -07:00
dynamics_archive.py	fix: preserve preset submission settings and lazy-load plots	2026-04-22 12:03:16 -07:00
dynamics_plots.py	Add archive dynamics pipeline and audience-based model presets	2026-04-22 12:03:13 -07:00
dynamics.py	Add archive dynamics pipeline and audience-based model presets	2026-04-22 12:03:13 -07:00
environment.py	fix(runtime): harden benchmark cache and task paths	2026-04-28 22:40:46 -07:00
factor_analysis.py	chore(dev): add lint guardrails	2026-04-28 10:50:07 -07:00
harness.py	fix(scoring): gate judge-weighted scores	2026-04-28 22:52:12 -07:00
hub.py	bench: audit contamination and harden HF leaderboard loading	2026-04-11 07:14:32 -07:00
insights.py	chore(dev): add lint guardrails	2026-04-28 10:50:07 -07:00
judge.py	fix(runtime): harden benchmark cache and task paths	2026-04-28 22:40:46 -07:00
paths.py	fix(runtime): harden benchmark cache and task paths	2026-04-28 22:40:46 -07:00
prediction.py	ClawBench v0.5: configuration-space diagnostic framework	2026-04-10 19:13:02 -07:00
profile.py	ClawBench v0.5: configuration-space diagnostic framework	2026-04-10 19:13:02 -07:00
query_catalog.py	ClawBench v0.5: configuration-space diagnostic framework	2026-04-10 19:13:02 -07:00
queue.py	fix(scoring): gate judge-weighted scores	2026-04-28 22:52:12 -07:00
recommendations.py	chore(dev): add lint guardrails	2026-04-28 10:50:07 -07:00
releases.py	bench: add trace ingestion and template promotion pipeline	2026-04-11 06:45:27 -07:00
render.py	Bench: redesign v0.4 benchmark and HF runtime	2026-04-09 11:15:30 -07:00
schemas.py	bench: add hidden release scaffolding and CI push coverage	2026-04-11 06:28:43 -07:00
scorer.py	fix(scoring): gate judge-weighted scores	2026-04-28 22:52:12 -07:00
services.py	fix(runtime): harden benchmark cache and task paths	2026-04-28 22:40:46 -07:00
session_labels.py	Gateway: use unique benchmark session labels	2026-04-09 18:32:41 -07:00
simulated_user.py	Bench: redesign v0.4 benchmark and HF runtime	2026-04-09 11:15:30 -07:00
stats.py	ClawBench v0.5: configuration-space diagnostic framework	2026-04-10 19:13:02 -07:00
submission_models.py	fix: preserve preset submission settings and lazy-load plots	2026-04-22 12:03:16 -07:00
task_factory.py	bench: audit contamination and harden HF leaderboard loading	2026-04-11 07:14:32 -07:00
tasks.py	Fix public Docker task copies	2026-04-27 22:57:10 -07:00
trajectory.py	fix: flag credential file access in dangerous shell patterns (#6)	2026-04-28 13:17:11 -07:00
upload.py	fix: harden packaging and submissions	2026-04-28 01:17:43 -07:00
utilization.py	chore(dev): add lint guardrails	2026-04-28 10:50:07 -07:00
worker.py	fix(scoring): gate judge-weighted scores	2026-04-28 22:52:12 -07:00