clawbench/clawbench
2026-04-28 22:52:12 -07:00
__init__.py Bench: redesign v0.4 benchmark and HF runtime 2026-04-09 11:15:30 -07:00
cli.py fix(scoring): gate judge-weighted scores 2026-04-28 22:52:12 -07:00
client.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00
diagnose_cli.py Add public domain scaffold and adapter diagnostics 2026-04-23 12:40:23 -07:00
diagnostic.py chore(dev): add lint guardrails 2026-04-28 10:50:07 -07:00
dynamics_archive.py fix: preserve preset submission settings and lazy-load plots 2026-04-22 12:03:16 -07:00
dynamics_plots.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00
dynamics.py Add archive dynamics pipeline and audience-based model presets 2026-04-22 12:03:13 -07:00
environment.py fix(runtime): harden benchmark cache and task paths 2026-04-28 22:40:46 -07:00
factor_analysis.py chore(dev): add lint guardrails 2026-04-28 10:50:07 -07:00
harness.py fix(scoring): gate judge-weighted scores 2026-04-28 22:52:12 -07:00
hub.py bench: audit contamination and harden HF leaderboard loading 2026-04-11 07:14:32 -07:00
insights.py chore(dev): add lint guardrails 2026-04-28 10:50:07 -07:00
judge.py fix(runtime): harden benchmark cache and task paths 2026-04-28 22:40:46 -07:00
paths.py fix(runtime): harden benchmark cache and task paths 2026-04-28 22:40:46 -07:00
prediction.py ClawBench v0.5: configuration-space diagnostic framework 2026-04-10 19:13:02 -07:00
profile.py ClawBench v0.5: configuration-space diagnostic framework 2026-04-10 19:13:02 -07:00
query_catalog.py ClawBench v0.5: configuration-space diagnostic framework 2026-04-10 19:13:02 -07:00
queue.py fix(scoring): gate judge-weighted scores 2026-04-28 22:52:12 -07:00
recommendations.py chore(dev): add lint guardrails 2026-04-28 10:50:07 -07:00
releases.py bench: add trace ingestion and template promotion pipeline 2026-04-11 06:45:27 -07:00
render.py Bench: redesign v0.4 benchmark and HF runtime 2026-04-09 11:15:30 -07:00
schemas.py bench: add hidden release scaffolding and CI push coverage 2026-04-11 06:28:43 -07:00
scorer.py fix(scoring): gate judge-weighted scores 2026-04-28 22:52:12 -07:00
services.py fix(runtime): harden benchmark cache and task paths 2026-04-28 22:40:46 -07:00
session_labels.py Gateway: use unique benchmark session labels 2026-04-09 18:32:41 -07:00
simulated_user.py Bench: redesign v0.4 benchmark and HF runtime 2026-04-09 11:15:30 -07:00
stats.py ClawBench v0.5: configuration-space diagnostic framework 2026-04-10 19:13:02 -07:00
submission_models.py fix: preserve preset submission settings and lazy-load plots 2026-04-22 12:03:16 -07:00
task_factory.py bench: audit contamination and harden HF leaderboard loading 2026-04-11 07:14:32 -07:00
tasks.py Fix public Docker task copies 2026-04-27 22:57:10 -07:00
trajectory.py fix: flag credential file access in dangerous shell patterns (#6) 2026-04-28 13:17:11 -07:00
upload.py fix: harden packaging and submissions 2026-04-28 01:17:43 -07:00
utilization.py chore(dev): add lint guardrails 2026-04-28 10:50:07 -07:00
worker.py fix(scoring): gate judge-weighted scores 2026-04-28 22:52:12 -07:00