A board readout on AI risk is a twelve-minute slot once a quarter. The audience knows technology at a board level (which is to say, varying), reads only what is on the slide, and votes on whether to authorize spend and policy changes based on what is presented. The argument of this page is that three slides cover the readout, and the templates below have held up in front of audit committees and full boards.
The reasoning: the readout's job is to give the board an accurate signal in twelve minutes, not to demonstrate the depth of the eval program. The eval program is the substrate; the slides are the signal.
What the board wants to know
Three questions, in this order:
- Is the AI risk posture better, worse, or unchanged versus last quarter?
- What incidents (internal or external) happened, and what did the team do?
- What decisions or authorizations does the board need to make this quarter?
A readout that answers these three crisply does the job. A readout that opens with a primer on the eval framework does not.
Slide 1: Risk posture summary
A single page summarizing the state of the AI risk register at the highest level. The visual is a small grid: rows are the major risk categories (use the NIST AI RMF trustworthiness characteristics, the OWASP LLM categories, or your own taxonomy, whichever the board has seen before), columns are last quarter, this quarter, and the change between them, and the quarter cells are green / amber / red.
| Risk area | Q1 | Q2 | Change |
|---|---|---|---|
| Valid and reliable | Green | Green | Stable |
| Safe | Amber | Green | Improved |
| Secure and resilient | Amber | Amber | Stable |
| Accountable and transparent | Green | Green | Stable |
| Explainable | Amber | Amber | Stable |
| Privacy-enhanced | Green | Green | Stable |
| Fair (bias managed) | Amber | Amber | Stable |
The grid is a visual; the speaker's job is to say, in one sentence per non-green row, what is keeping it that color and what is planned to move it to green. The example above takes thirty seconds to present and accurately conveys the state of the program to a board member who is seeing it for the first time.
The grid is anchored to the NIST AI RMF characteristics [1] because those are the categories any auditor or counsel will recognize. If you use a different anchor (Anthropic's AI Safety Levels, OpenAI Preparedness categories), the grid still works; the choice is vocabulary, not substance [2, 3].
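If the register is kept in structured form, the Change column can be derived rather than typed by hand, which keeps the slide consistent with the register. The following is a minimal sketch, not a prescribed tool: the `Rag` enum, the `change_label` function, and the "Worsened" label are assumptions introduced for illustration, and the statuses are copied from the example grid above.

```python
from enum import IntEnum


class Rag(IntEnum):
    # Ordered so that a lower value means lower risk (assumed ordering).
    GREEN = 0
    AMBER = 1
    RED = 2


def change_label(previous: Rag, current: Rag) -> str:
    """Return the Change-column label for one risk area."""
    if current < previous:
        return "Improved"
    if current > previous:
        return "Worsened"  # hypothetical label; the example grid has no such row
    return "Stable"


# (last quarter, this quarter) for each row of the example grid.
posture = {
    "Valid and reliable": (Rag.GREEN, Rag.GREEN),
    "Safe": (Rag.AMBER, Rag.GREEN),
    "Secure and resilient": (Rag.AMBER, Rag.AMBER),
    "Accountable and transparent": (Rag.GREEN, Rag.GREEN),
    "Explainable": (Rag.AMBER, Rag.AMBER),
    "Privacy-enhanced": (Rag.GREEN, Rag.GREEN),
    "Fair (bias managed)": (Rag.AMBER, Rag.AMBER),
}

changes = {area: change_label(prev, cur) for area, (prev, cur) in posture.items()}

# Rows that need a one-sentence spoken explanation: anything not green now.
needs_comment = [area for area, (_, cur) in posture.items() if cur != Rag.GREEN]
```

The `needs_comment` list doubles as the speaker's checklist for the slide: one sentence per entry.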
Slide 2: Incidents and response
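A short table of incidents in the quarter. An incident is anything that triggered a documented response: a production safety failure, a near-miss in red-team testing, a customer-reported issue with material impact, or a regulatory inquiry. The underlying record for each incident has a date, summary, severity, root cause, remediation, and status. The slide itself can be narrower: the example here shows four columns, folds remediation into the status column, and leaves root cause to the speaker.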
| Date | Summary | Severity | Status |
|---|---|---|---|
| 2026-04-18 | Customer reported prompt-injection extraction of a sample document | Medium | Remediated; input filter updated; regression test added |
| 2026-05-02 | Internal red team found new jailbreak family bypassing output filter | Low (caught internally) | Remediated; defense layer updated |
| 2026-05-22 | Regulatory inquiry, GPAI obligations under EU AI Act | N/A | Response submitted; counsel engaged |
For each row, a one-sentence spoken explanation of root cause is the part the board cares about. The remediation notes tell the board the team did something. The status tells them whether it is closed.
An empty incidents slide is a red flag, not a good sign. A serious program has at least a few rows per quarter. If yours has none, either the program is not catching things or you are not reporting them.
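One way to keep the slide and the incident log in step is to store each incident as a record with the six fields named above and render the slide row from it. This is a sketch under stated assumptions: the `Incident` class, `slide_row`, and `empty_slide_warning` are hypothetical names, and the root cause text is a placeholder rather than a real finding.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Incident:
    occurred: date
    summary: str
    severity: str
    root_cause: str   # spoken by the presenter, not shown on the slide
    remediation: str
    status: str


def slide_row(incident: Incident) -> str:
    """Render one four-column slide row; remediation is folded into status."""
    return (
        f"| {incident.occurred.isoformat()} | {incident.summary} "
        f"| {incident.severity} | {incident.status}; {incident.remediation} |"
    )


def empty_slide_warning(incidents: List[Incident]) -> Optional[str]:
    """Flag a quarter with no incidents, per the rule of thumb above."""
    if not incidents:
        return "No incidents logged: check detection and reporting before presenting."
    return None


example = Incident(
    occurred=date(2026, 4, 18),
    summary="Customer reported prompt-injection extraction of a sample document",
    severity="Medium",
    root_cause="(placeholder: one-sentence root cause)",
    remediation="input filter updated; regression test added",
    status="Remediated",
)

row = slide_row(example)
```

Keeping `root_cause` in the record but off the slide matches how the slide is presented: the column stays short and the speaker supplies the sentence.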
Slide 3: Decisions and authorizations
The slide where the board does work. List the decisions you need from them this quarter. Examples:
- Authorize additional hire for the safety eval team
- Approve external red-team engagement budget for the next half
- Ratify update to the AI use policy
- Acknowledge change to the pause-or-restrict authority delegation
- Approve customer-facing trust portal launch
Each row has a sponsor, a brief description, and a recommendation. The board votes on the rows that need a vote and acknowledges the rest.
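The split between rows that need a vote and rows that only need acknowledgment can be made explicit in the data behind the slide. The sketch below assumes a hypothetical `DecisionItem` record with a `needs_vote` flag; the sponsor, description, and recommendation values are placeholders, and the two titles are taken from the example list above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DecisionItem:
    title: str
    sponsor: str
    description: str
    recommendation: str
    needs_vote: bool  # False means the board only acknowledges the item


def split_agenda(
    items: List[DecisionItem],
) -> Tuple[List[DecisionItem], List[DecisionItem]]:
    """Return (items to vote on, items to acknowledge), preserving order."""
    votes = [item for item in items if item.needs_vote]
    acknowledgments = [item for item in items if not item.needs_vote]
    return votes, acknowledgments


agenda = [
    DecisionItem(
        title="Approve external red-team engagement budget for the next half",
        sponsor="(placeholder sponsor)",
        description="(placeholder description)",
        recommendation="(placeholder recommendation)",
        needs_vote=True,
    ),
    DecisionItem(
        title="Acknowledge change to the pause-or-restrict authority delegation",
        sponsor="(placeholder sponsor)",
        description="(placeholder description)",
        recommendation="(placeholder recommendation)",
        needs_vote=False,
    ),
]

votes, acknowledgments = split_agenda(agenda)
```

Ordering the slide with the vote items first is one reasonable choice, since those are the rows that consume meeting time.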
The slide that does not hold up
A common failure: an "Eval scores" slide showing accuracy numbers across benchmarks. Boards do not act on benchmark numbers, and the numbers move week to week. The relevant signal (is the risk posture improving) is on slide 1. The detail belongs in the appendix, not in the twelve-minute slot.
The other failure is a long explanation of the eval methodology. The board does not need to understand HarmBench vs AILuminate. They need to know the program is in place, the people running it are competent, and the risks are tracked.
What goes in the appendix
The appendix is the document board members read after the meeting if they want detail. It contains the current risk register, an extract of the full eval results page for the quarter, the current versions of the model card and system card, the red-team summary, and any regulatory correspondence. Board members read the appendix at different depths; some skim the executive summary, some read the register row by row. Keeping the appendix complete means questions in the meeting are usually answerable by pointing at a page.
Cadence and prep
Preparing the readout takes the AI risk lead (CTO, head of AI, or equivalent) approximately one full day per quarter, assuming the register and eval results are kept current throughout the quarter. The day breaks down as: a half-day to update the register and write the three slides, two hours to walk the deck with the CFO and legal counsel, one hour to walk it with the executive team, and the remainder for refinements.
Preparation that takes a full week per quarter is a sign that the underlying program is not current. The board readout should be a downstream view of work that is already done.
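As a quick check on the breakdown above, the arithmetic can be written out. The eight-hour working day is an assumption (the text says only "one full day"), and the dictionary keys are illustrative labels.

```python
WORKDAY_HOURS = 8.0  # assumption: "one full day" means eight working hours

prep_hours = {
    "update register and write slides": WORKDAY_HOURS / 2,  # the half-day
    "walk deck with CFO and legal counsel": 2.0,
    "walk deck with executive team": 1.0,
}

# Whatever is left of the day is the time available for refinements.
refinement_hours = WORKDAY_HOURS - sum(prep_hours.values())
```

Under that assumption, the fixed steps take seven hours and leave one hour for refinements, which is why the budget only works when the register is already current.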
The trust the readout earns
A board readout that does this credibly three quarters in a row buys the team substantial latitude. The board stops asking detail questions because they trust the summary. The CFO stops pushing back on safety-team budget because the link from risk to spend is visible. Customer escalations to the board level (which do happen with AI products) are easier to handle because the existing posture is already on the record.
The opposite is also true. A board readout that is vague, defensive, or that hides bad news produces additional oversight on every release. The work to get to the credible version is real; the work to recover from the non-credible version is significantly larger.
What to do this quarter
- Build the three slides above for your current state. Use real numbers.
- Walk the deck with one trusted board member informally before the formal meeting. The feedback is most useful for the language rather than the substance.
- Identify the appendix. Most of it already exists (register, eval results, model card); assembling it into a single document is the work.
The next time the board asks "how is our AI risk posture", the deck answers in twelve minutes, the appendix answers in detail, and the discussion is about the decisions rather than the framing.