Contributing

Citation discipline, the verified-date rule, frontmatter, and commit hygiene for shipping a page to aievals.co.

Thanks for the offer. AI Evals exists because the eval space is noisy. Contributions raise the signal when they meet the bar; they lower it otherwise. This page is the long version of the rules in CONTRIBUTING.md at the repository root. Read both before opening a pull request.

What we accept

A new technique page or task-type playbook backed by citations to primary sources. A page that quotes a vendor blog without also citing the underlying paper does not count.
A correction to an existing page: a broken link, a stale benchmark number, a mis-stated tradeoff, a citation pointing at the wrong arXiv version. Small, specific, verifiable.
A new cookbook recipe with runnable code. Code beats prose. Recipes ship with a tested snippet, pinned versions, and an output expectation. Anything you can't paste into a fresh environment and run should not be a recipe yet.
A glossary term that is genuinely missing. Check the glossary first. Definitions are 1 to 2 sentences plus a "see also" link.

What we don't accept

"Awesome list" links without commentary. The site is opinionated. A bare URL with no take adds zero signal.
Vendor pitches. If your contribution is a page about your own tool, disclose the affiliation and write to the editorial bar that any external author would have to meet.
AI-generated drafts with no human verification of the citations. First-drafting with an AI assistant is fine. Submitting a draft where the author cannot defend every citation is not.
Decorative AI-generated images. The site uses Mermaid diagrams, hand-drawn SVG, and real photographs. No AI illustrations.

Citation discipline

Every non-obvious claim cites a primary source. Vendor whitepapers are cited as vendor whitepapers; arXiv preprints are cited as preprints; blog posts as blog posts. No paraphrasing without attribution. No "studies show" without the study.

To add a citation:

Open the URL in a fresh browser tab. Confirm the source matches what you are about to claim. If the source is a paper, read the relevant section; do not cite from the abstract alone.
Verify the URL responds with a 2xx (a quick curl -I is fine). Links that return 4xx or 5xx do not ship.
Add an entry to docs/requirements/08_REFERENCES_BIBLIOGRAPHY.md and the corresponding BibTeX entry in data/citations.bib.
Use the citation key in prose as [^citation-key]. The remark plugin renders it as a tooltipped superscript with a back-reference at the bottom of the page.
Run npm run verify-citations before opening a PR. The script fails on any URL returning 404 or 5xx, so it does not catch other 4xx responses; the manual 2xx check still applies.
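The link check in the steps above can be sketched as follows. This is a minimal illustration, not the repository's verify-citations script: the names classifyStatus and checkCitationUrl are hypothetical, and following redirects before judging the status is an assumption. It relies on the global fetch available in Node 18 and later.

```typescript
// Hypothetical helper names; the real script may be structured differently.
type LinkVerdict = "ok" | "broken";

// Only a 2xx status ships. Everything else counts as broken.
function classifyStatus(status: number): LinkVerdict {
  return status >= 200 && status < 300 ? "ok" : "broken";
}

async function checkCitationUrl(url: string): Promise<LinkVerdict> {
  try {
    // HEAD requests headers only, like `curl -I`.
    // Assumption: redirects are followed and the final status is judged.
    const response = await fetch(url, { method: "HEAD", redirect: "follow" });
    return classifyStatus(response.status);
  } catch {
    // Network failures (DNS, TLS, refused connection) also count as broken.
    return "broken";
  }
}
```

Some servers reject HEAD requests while serving GET normally, so a "broken" verdict from a sketch like this is a prompt to open the URL by hand, not proof that the source is gone.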

When a claim has multiple supporting sources, cite the primary one and link the others under "Further reading" rather than stacking footnotes.

The verified-date rule

Every content page declares two ISO dates in frontmatter: updated and verified.

updated is the last time the prose was edited for any reason. A typo fix counts.
verified is the last time a human re-walked every citation on the page and confirmed the linked sources still match the prose.

Pages whose verified is older than 180 days display a staleness banner. Refreshing the date means doing the work, not just bumping the value. If you cannot defend every citation on a page, leave the date alone and open an issue instead.
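The 180-day threshold can be expressed as a small date comparison. This is a sketch under stated assumptions: the function names are hypothetical, dates are compared in UTC, and "older than 180 days" is read as strictly more than 180 whole days. The site's actual banner logic may differ.

```typescript
const STALE_AFTER_DAYS = 180;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days between an ISO date (YYYY-MM-DD) and a reference date, in UTC.
function daysSinceVerified(verified: string, today: Date): number {
  const verifiedMs = Date.parse(`${verified}T00:00:00Z`);
  if (Number.isNaN(verifiedMs)) {
    throw new Error(`verified is not an ISO date: ${verified}`);
  }
  return Math.floor((today.getTime() - verifiedMs) / MS_PER_DAY);
}

// Assumption: exactly 180 days is still fresh; 181 days is stale.
function isStale(verified: string, today: Date): boolean {
  return daysSinceVerified(verified, today) > STALE_AFTER_DAYS;
}
```

For example, a page verified on 2025-01-01 is 180 days old on 2025-06-30 and crosses the threshold on 2025-07-01.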

Tone

Read the editorial credo in docs/requirements/01_VISION_AND_PRINCIPLES.md. Eight principles govern every page:

Cited, or removed.
Opinion, with reasoning.
Code beats prose.
Binary pass/fail beats Likert.
Real numbers, real cites, real dates.
No marketing language. No em dashes. No AI-sounding patterns.
Respect the reader's time.
Show what good looks like.

The prose linter at scripts/lint-prose.ts enforces a banned-vocabulary list: filler verbs, AI-sounding nouns, and marketing adjectives. The full list is in 06_CONTENT_PLAN.md. The linter also rejects em dashes (Unicode U+2014) and recognizable LLM-stamp sentence openers. Run npm run lint:prose locally before submitting; the same script runs in CI.
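The em dash rule is the simplest of these checks to illustrate. The sketch below is not the code in scripts/lint-prose.ts: the ProseFinding shape, the findEmDashes name, and the "no-em-dash" rule label are all hypothetical. It reports each U+2014 with a 1-based line and column.

```typescript
interface ProseFinding {
  line: number; // 1-based line number
  column: number; // 1-based column of the offending character
  rule: string; // hypothetical rule label
}

// Written as an escape so this source file contains no literal em dash.
const EM_DASH = "\u2014";

function findEmDashes(text: string): ProseFinding[] {
  const findings: ProseFinding[] = [];
  text.split("\n").forEach((lineText, index) => {
    let position = lineText.indexOf(EM_DASH);
    while (position !== -1) {
      findings.push({ line: index + 1, column: position + 1, rule: "no-em-dash" });
      position = lineText.indexOf(EM_DASH, position + 1);
    }
  });
  return findings;
}
```

A banned-vocabulary check would follow the same shape, with the word list loaded from wherever the project keeps it.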

Frontmatter

Every page passes the zod schema for its template. See docs/requirements/05_TECH_STACK.md for the full set of schemas. The defaults a build agent inherits:

owner: "@ombharatiya"
draft: false

Override owner when the page is co-authored or maintained by a different contributor. The owner is the single point of editorial accountability for that page.

Required fields differ by template (concept, tool, paper theme, cookbook, task type, start track, standalone). Standalone pages like the one you are reading need only title, description, updated, verified, owner, and draft. Concept pages add section, order, personas, and citations.
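To make the standalone field list concrete, here is a dependency-free sketch of a validator for it. The real schemas are zod schemas documented in 05_TECH_STACK.md; this hand-written version only mirrors the six fields and two defaults named above. The names StandaloneFrontmatter and parseStandalone are hypothetical, and treating updated and verified as YYYY-MM-DD strings is an assumption about how "ISO dates" is meant.

```typescript
interface StandaloneFrontmatter {
  title: string;
  description: string;
  updated: string; // ISO date, assumed YYYY-MM-DD
  verified: string; // ISO date, assumed YYYY-MM-DD
  owner: string;
  draft: boolean;
}

const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

const isText = (value: unknown): value is string =>
  typeof value === "string" && value.trim().length > 0;

// Shape check only; it does not reject impossible dates such as 2025-02-31.
const isIsoDate = (value: unknown): value is string =>
  typeof value === "string" && ISO_DATE.test(value);

function parseStandalone(raw: Record<string, unknown>): StandaloneFrontmatter {
  // Apply the documented defaults; explicit values in `raw` win.
  const merged: Record<string, unknown> = { owner: "@ombharatiya", draft: false, ...raw };
  const { title, description, updated, verified, owner, draft } = merged;

  if (
    isText(title) && isText(description) && isIsoDate(updated) &&
    isIsoDate(verified) && isText(owner) && typeof draft === "boolean"
  ) {
    return { title, description, updated, verified, owner, draft };
  }

  const required = ["title", "description", "updated", "verified", "owner", "draft"];
  const missing = required.filter((key) => merged[key] === undefined);
  const detail = missing.length > 0 ? `missing: ${missing.join(", ")}` : "check types and date format";
  throw new Error(`Invalid standalone frontmatter (${detail})`);
}
```

A page that omits owner and draft inherits the defaults, while a page that omits title fails before it reaches the build.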

Commit hygiene

One coherent change per commit. A page rewrite and a citation refactor go in two commits.
Imperative subject under 70 characters. "Add NurtureBoss case study to error analysis" is fine. "Updates" is not.
Body explains the why if non-obvious. Skip the body for trivial typo fixes.
No AI-sounding language in commit messages. The same banned vocabulary applies to commits as to prose.
No Co-Authored-By: Claude, Co-Authored-By: Claude Opus <noreply@anthropic.com>, or any Generated-By: trailer. The commit-msg hook installed by npm install rejects messages that include Claude, Generated with, or Co-Authored-By: lines. PR descriptions follow the same rule; strip any default "Generated with Claude Code" template before opening the PR.

The commit author stays whatever your local git config user.name is set to. Commits in this repo are authored work, not signed by a model.
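The mechanical parts of these rules can be checked before the hook does it for you. The sketch below is not the installed commit-msg hook. The function name is hypothetical, and matching the three rejected markers case-insensitively anywhere in the message is an assumption. It does not judge whether the subject is imperative or meaningful.

```typescript
const SUBJECT_LIMIT = 70;

// The three markers the text above says the hook rejects.
const REJECTED_MARKERS = ["Claude", "Generated with", "Co-Authored-By:"];

function commitMessageProblems(message: string): string[] {
  const problems: string[] = [];
  const subject = message.split("\n")[0] ?? "";

  if (subject.trim().length === 0) {
    problems.push("subject is empty");
  }
  // "Under 70 characters" is read as a length of at most 69.
  if (subject.length >= SUBJECT_LIMIT) {
    problems.push(`subject is ${subject.length} characters; keep it under ${SUBJECT_LIMIT}`);
  }

  // Assumption: markers match case-insensitively anywhere in the message.
  const lowered = message.toLowerCase();
  for (const marker of REJECTED_MARKERS) {
    if (lowered.includes(marker.toLowerCase())) {
      problems.push(`message contains "${marker}"`);
    }
  }
  return problems;
}
```

An empty result means the message clears these mechanical checks only. "Updates" would pass here and still fail review.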

AI assistance disclosure

If you used an AI assistant to first-draft a page, run a human verification pass on every citation, benchmark number, tool claim, and the prose itself. The site is about AI; it is not written by AI. Disclose AI assistance in the PR description: which model, what it produced, what you changed. The disclosure is a credibility marker, not a confession.