Free tools
Ready-made resources you can use today to plan, run, and communicate your LLM benchmarks. No signup needed: grab what helps and share it with your team.
A step-by-step checklist to scope datasets, judges, and reporting so runs are reproducible.
A starter prompt for LLM-as-judge with slots for scoring rubric, tie-breakers, and bias guards.
A lightweight slide/table outline to summarize results, deltas, caveats, and recommendations.
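To give a feel for what "slots" mean in a judge prompt, here is a minimal sketch in Python. It is not the library's actual template: the wording, the slot names (`rubric`, `tie_breaker`, `bias_guards`), and the `build_judge_prompt` helper are all assumptions made for illustration.

```python
from string import Template

# Hypothetical starter template. Slot names and wording are illustrative
# assumptions, not the actual prompt from the library.
JUDGE_TEMPLATE = Template("""\
You are an impartial judge comparing two answers to the same question.

Scoring rubric:
$rubric

Tie-breaker:
$tie_breaker

Bias guards:
$bias_guards

Question:
$question

Answer A:
$answer_a

Answer B:
$answer_b

Reply with exactly one of: A, B, TIE.""")


def build_judge_prompt(question, answer_a, answer_b, rubric, tie_breaker, bias_guards):
    """Fill every slot; Template.substitute raises KeyError if a slot is missing."""
    return JUDGE_TEMPLATE.substitute(
        question=question,
        answer_a=answer_a,
        answer_b=answer_b,
        rubric="\n".join(f"- {item}" for item in rubric),
        tie_breaker=tie_breaker,
        bias_guards="\n".join(f"- {item}" for item in bias_guards),
    )


prompt = build_judge_prompt(
    question="What is the capital of Australia?",
    answer_a="Canberra.",
    answer_b="Sydney, the largest city.",
    rubric=["Factual accuracy", "Directness"],
    tie_breaker="If both answers are equally accurate, prefer the shorter one.",
    bias_guards=[
        "Ignore the order in which the answers appear.",
        "Do not reward length for its own sake.",
    ],
)
print(prompt)
```

Keeping the rubric, tie-breaker, and bias guards as separate slots lets you change one of them between runs without touching the rest of the prompt, which makes it easier to see what caused a change in scores.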
More tools coming
Want a specific template or judge prompt? Tell us and we’ll add it to the library.