Maintainer
aievals.co is built and maintained by Om Bharatiya (handle @ombharatiya, email coffeewithom@gmail.com). Om has been writing about AI systems for several years and runs the AI Evals project as a citation-grounded reference for the people who ship AI in production. The site replaces a folder of personal notes with something other practitioners can quote.
Contributors
Every page declares an owner in frontmatter for editorial accountability. The list of people who have committed code or content to this repository:
- Om Bharatiya (@ombharatiya)
The full list is maintained in CONTRIBUTORS.md at the repository root. If you want to be added, the path is documented on the contributing page: ship a non-trivial pull request that meets the editorial bar and your name goes on the list.
Why this site exists
The eval space, as of May 2026, is loud and fragmented. Vendor blogs read like sales decks. Academic surveys are slow to update. Newsletters give you "Hot take #4,217." A handful of practitioners (Hamel Husain, Eugene Yan, Shreya Shankar, Chip Huyen, Lilian Weng) have written excellent essays, but each lives on its own site with its own taxonomy. There is no single, structured, opinionated reference that pulls the threads together.
aievals.co fills that gap. A serious AI engineer, product manager, or research scientist who lands here should leave with three things: a clear mental model of how evals fit into their work, a runnable next step they can take this week, and a verified citation trail they can quote in their team's RFC. Every page is built around those three outcomes.
The site is not a blog (pages are evergreen reference, versioned in git, not dated personal essays), not a directory ("Awesome AI Evals" lists already exist), and not vendor-neutral (the maintainer has favorites for production observability, RAG evals, CI harnesses, and red-team frameworks, and says so on the relevant pages, with reasoning).
Editorial discipline
Eight principles govern every page:
- Cited, or removed. Every non-obvious claim has a citation linking to a verified primary source. No "studies show" without the study.
- Opinion, with reasoning. Pages have opinions and back them. "X is correct because Z, despite Y" beats "some say X, others say Y."
- Code beats prose. If a concept can be demonstrated with 12 lines of Python, those 12 lines come first. Long-form explanation comes after.
- Binary beats Likert. We follow Hamel Husain's "binary pass/fail with critique" discipline. No "rate this 1 to 5 for helpfulness" unless multidimensional grading is the explicit point of the lesson.
- Real numbers, real cites, real dates. Benchmark numbers are dated. Every benchmark page includes a "fetched as of [date]" line and a link to the live leaderboard.
- No marketing language. A prose linter rejects banned vocabulary (filler verbs, AI-sounding nouns, marketing adjectives), em dashes (Unicode U+2014), and recognizable LLM-stamp sentence patterns. The site is positioned against AI slop; the prose models the discipline it asks of contributors.
- Respect the reader's time. Each page declares its persona tags, prerequisites, and reading time at the top. Reference pages start with a quick-reference table before any prose. We aim for the shortest page that does the job.
- Show what good looks like. Where canonical examples exist (Hamel's "Field Guide" for error analysis, Anthropic's eval cookbook for judge design), we link to them.
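The "No marketing language" principle describes an automated check, so a small sketch may help show its shape. This is not the site's actual linter: the banned-word list, the function name `lint_prose`, and the message format are assumptions made for illustration, and the LLM-stamp pattern checks are left out.

```python
import re

# Hypothetical banned vocabulary; the site's real list is not published here.
BANNED_WORDS = {"leverage", "seamless", "revolutionary"}
EM_DASH = "\u2014"  # the character the principle above bans


def lint_prose(text: str) -> list[str]:
    """Return one message per violation, in line order."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if EM_DASH in line:
            problems.append(f"line {lineno}: em dash (U+2014)")
        # Lowercase the line and pull out alphabetic words to compare.
        for word in re.findall(r"[a-z']+", line.lower()):
            if word in BANNED_WORDS:
                problems.append(f"line {lineno}: banned word '{word}'")
    return problems
```

A check like this would typically run in CI and fail the build when the returned list is non-empty.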
How to cite this site
Pages on aievals.co have stable URLs and declare updated and verified dates in frontmatter. To cite a specific page in academic or technical writing, treat each page as a versioned web reference.
APA:
Bharatiya, O. (2026). The 60-80% Rule. AI Evals. Retrieved May 29, 2026, from https://www.aievals.co/learn/error-analysis/the-60-80-rule
BibTeX:
@misc{bharatiya2026sixty,
author = {Bharatiya, Om},
title = {The 60-80\% Rule},
year = {2026},
howpublished = {AI Evals},
url = {https://www.aievals.co/learn/error-analysis/the-60-80-rule},
note = {Verified 2026-05-29}
}
Swap the title, slug, and year for the page you are citing. The note field carries the verified date from the page's frontmatter so a reader can tell when the citations on that page were last walked. Content on aievals.co is published under CC BY 4.0: you can reuse it with attribution.
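The swap described above can be sketched as a small helper that fills the same BibTeX template. The function name `bibtex_entry` and its parameters are hypothetical, and the sketch assumes the slug is the URL path after the domain and that the verified date is copied by hand from the page's frontmatter.

```python
def bibtex_entry(key: str, title: str, slug: str, year: int, verified: str,
                 author: str = "Bharatiya, Om") -> str:
    """Build a @misc entry in the shape of the example above."""
    escaped_title = title.replace("%", r"\%")  # BibTeX needs % escaped
    url = f"https://www.aievals.co/{slug}"
    lines = [
        f"@misc{{{key},",
        f"  author = {{{author}}},",
        f"  title = {{{escaped_title}}},",
        f"  year = {{{year}}},",
        "  howpublished = {AI Evals},",
        f"  url = {{{url}}},",
        f"  note = {{Verified {verified}}}",
        "}",
    ]
    return "\n".join(lines)


# Reproduces the example entry for "The 60-80% Rule".
print(bibtex_entry(
    key="bharatiya2026sixty",
    title="The 60-80% Rule",
    slug="learn/error-analysis/the-60-80-rule",
    year=2026,
    verified="2026-05-29",
))
```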