OpenAI Preparedness Framework

The tracked risk categories, the production-readiness gates, and where the Preparedness Framework differs from Anthropic's RSP.

OpenAI's Preparedness Framework is the company's public commitment to evaluate and mitigate catastrophic risks from frontier AI systems. It functions as a sibling document to Anthropic's RSP, with the same broad shape (capability-triggered safeguards before deployment) but a different vocabulary and slightly different category structure ¹. This page summarizes the operational pieces and notes the differences a practitioner should know.

The category structure

The Framework names risk categories that get tracked with specific evaluations. The published categories cover:

Category	What it covers
Biological	Uplift to actors seeking to create biological weapons
Chemical	Uplift to actors seeking to create chemical weapons
Cybersecurity	Capability uplift in offensive cyber operations
AI Self-Improvement	Capability for the AI to autonomously improve itself
Autonomous Replication	Capability for autonomous resource acquisition and replication

Each category has thresholds (low, medium, high, critical or equivalent). When a model crosses a threshold in any category, the Framework commits to specific risk-mitigation actions that must be in place before deployment. Check the current published version of the Framework for the up-to-date category and threshold list; treat the categories above as the operating set rather than the canonical one ¹.

How it differs from the Anthropic RSP

The two documents are structurally similar; the differences are in emphasis.

Categorization. The Preparedness Framework uses named risk categories that are evaluated independently. The RSP defines ASL levels that aggregate across categories. In practice, both end up with category-by-category capability evaluations; the difference is in how the overall safeguard requirements are described.

Threshold language. OpenAI uses "low / medium / high / critical" per category. Anthropic uses ASL-1 through ASL-4 as a holistic level. The threshold language matters for external communication: a "medium" in one category under one framework does not map cleanly to "ASL-2" under the other.

Governance. Both have a body responsible for reviewing decisions. The exact composition and authority differ; the current text of each document is authoritative.

Public commitments. Both commit to specific safeguards as preconditions for deployment. Both have a pause-or-restrict commitment. The specific safeguards (model-weight security, deployment restrictions, incident-response readiness) overlap substantially.

For a practitioner copying the structure into an internal policy, both work. The choice of vocabulary depends on which language your customers and regulators already use. In the United States and Europe in 2026, both are recognized.

The open-source eval repo

OpenAI maintains a public GitHub repository with several of the evaluations referenced in the Preparedness work, including MLE-bench (machine-learning engineering capability), SWE-Lancer (software-engineering economic-value tasks), and PaperBench (replicating research papers) ². These are useful in their own right as eval harnesses; they are also a reference for the kind of capability evaluation a Preparedness-style framework produces.

The repo is one of the better resources for a team standing up its own capability evaluations because the eval definitions are concrete: specific tasks, specific scoring, specific reproducibility metadata.

What to lift

Three patterns are worth copying into an internal version regardless of which framework you anchor to.

Per-category thresholds. The OpenAI category list is closer to a risk taxonomy than the Anthropic level list. Even if your product is not at frontier scale, naming the specific risk categories you track and writing thresholds for each is more useful than a single global risk level.
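
As a minimal sketch of what per-category thresholds can look like in an internal policy, the Python below maps eval scores to a level for each category separately. The category names, score scale, and trigger values are hypothetical placeholders, not values taken from either framework.

```python
from enum import IntEnum


class Level(IntEnum):
    BELOW = 0   # no threshold reached
    LOW = 1
    MEDIUM = 2


# Hypothetical policy table: category -> {level: minimum eval score that triggers it}.
# Real trigger values would come from your own evals and risk review.
THRESHOLDS = {
    "cybersecurity": {Level.LOW: 0.20, Level.MEDIUM: 0.50},
    "data_exfiltration": {Level.LOW: 0.10, Level.MEDIUM: 0.30},
    "autonomous_actions": {Level.LOW: 0.25, Level.MEDIUM: 0.60},
}


def classify(category: str, score: float) -> Level:
    """Return the highest level whose trigger score is met for this category."""
    reached = Level.BELOW
    for level, trigger in sorted(THRESHOLDS[category].items()):
        if score >= trigger:
            reached = level
    return reached


def classify_all(scores: dict) -> dict:
    """Classify every category independently; no aggregation across categories."""
    return {category: classify(category, score) for category, score in scores.items()}


# Illustrative eval results (made up for the example).
scores = {"cybersecurity": 0.35, "data_exfiltration": 0.05, "autonomous_actions": 0.70}
levels = classify_all(scores)
for category, level in levels.items():
    print(f"{category}: {level.name}")
```

A single global level would collapse this result to "medium" and hide that only one category is driving it; keeping the per-category view tells the team which safeguards are actually in play.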

Pre-deployment gating. Both frameworks make the safeguard a precondition for shipping. This is the load-bearing piece. An internal version that says "we will track these risks" without a gate is documentation; the version with a gate is a policy.
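
To make the difference between tracking and gating concrete, here is a hedged sketch of a release check that refuses to ship when a required safeguard is missing. The level names, safeguard names, and the `deployment_gate` function are illustrative assumptions, not part of either published framework.

```python
from dataclasses import dataclass

# Hypothetical mapping from threshold level to the safeguards it requires.
# An unknown level raises KeyError, so the gate fails closed rather than open.
REQUIRED_SAFEGUARDS = {
    "below": set(),
    "low": {"usage_monitoring"},
    "medium": {"usage_monitoring", "weight_access_controls", "incident_response_runbook"},
}


@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    missing: dict  # category -> sorted list of safeguards not yet in place


def deployment_gate(levels: dict, in_place: set) -> GateDecision:
    """Allow deployment only if every category's required safeguards are in place."""
    missing = {}
    for category, level in levels.items():
        gap = REQUIRED_SAFEGUARDS[level] - in_place
        if gap:
            missing[category] = sorted(gap)
    return GateDecision(allowed=not missing, missing=missing)


# Illustrative inputs: one category at medium, one at low, one safeguard implemented.
decision = deployment_gate(
    levels={"cybersecurity": "medium", "autonomous_actions": "low"},
    in_place={"usage_monitoring"},
)
print(decision.allowed)
print(decision.missing)
```

Wiring a check like this into the release pipeline is what turns the document into a policy: the default outcome when a safeguard is missing is "do not ship", and overriding it requires an explicit decision by whoever holds the pause-or-restrict authority.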

Public commitments where possible. Publishing the framework is the strongest version of the commitment. Most enterprise teams will not, but publishing the existence of the policy (a one-page summary on your trust portal) is the next-best step and is increasingly expected in procurement conversations.

What to do this quarter

Read the Preparedness Framework and the RSP back to back. The structural similarities make the choice of which to anchor to a vocabulary decision, not a substance one ¹ ³.
Pick the three risk categories most relevant to your product. For each, draft a low and medium threshold with the specific eval that would trigger it.
Identify the safeguards that would be required at the medium threshold. Decide which of those are already in place and which would need to be added before crossing.
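
The three steps above can be captured as a small worksheet. The sketch below is one possible shape, assuming hypothetical category names, eval names, trigger wording, and safeguard names; none of them come from the Preparedness Framework or the RSP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    eval_name: str   # the eval whose result triggers this threshold
    condition: str   # plain-language trigger, as it would read in the policy


@dataclass(frozen=True)
class CategoryDraft:
    name: str
    low: Threshold
    medium: Threshold
    medium_safeguards: tuple  # safeguards required once medium is crossed


def safeguard_plan(drafts, in_place):
    """Split the safeguards needed at medium into those in place and those to add."""
    needed = sorted({s for d in drafts for s in d.medium_safeguards})
    return {
        "already_in_place": [s for s in needed if s in in_place],
        "to_add_before_crossing": [s for s in needed if s not in in_place],
    }


# Hypothetical draft for three categories (all names and triggers are placeholders).
drafts = [
    CategoryDraft(
        name="cybersecurity",
        low=Threshold("internal_ctf_suite", "solves any beginner-tier task"),
        medium=Threshold("internal_ctf_suite", "solves half of intermediate-tier tasks"),
        medium_safeguards=("abuse_monitoring", "rate_limits"),
    ),
    CategoryDraft(
        name="data_exfiltration",
        low=Threshold("canary_leak_eval", "leaks any planted canary string"),
        medium=Threshold("canary_leak_eval", "leaks canaries under adversarial prompting"),
        medium_safeguards=("egress_filtering", "abuse_monitoring"),
    ),
    CategoryDraft(
        name="autonomous_actions",
        low=Threshold("agent_task_eval", "completes a multi-step task unassisted"),
        medium=Threshold("agent_task_eval", "acquires new credentials unassisted"),
        medium_safeguards=("human_approval_step",),
    ),
]

plan = safeguard_plan(drafts, in_place={"rate_limits", "abuse_monitoring"})
print(plan)
```

The `to_add_before_crossing` list is the forward-looking checklist for the eval and platform teams; the threshold records are what goes on the one-page policy.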

The exercise above is the smallest version of a Preparedness-style framework that is still a real commitment. It takes a senior engineering or product leader half a day to draft, plus a conversation with executive leadership to ratify the pause-or-restrict authority. The output is a one-page policy that answers a meaningful share of audit and procurement questions and gives the eval team a concrete forward-looking checklist.

The next chapter, Building an AI risk register, covers the artifact that complements the forward-looking policy with a current-state map.

¹ OpenAI, "Preparedness Framework." https://openai.com/safety/preparedness/
² OpenAI, "Preparedness GitHub (MLE-bench, SWE-Lancer, PaperBench)." https://github.com/openai/preparedness
³ Anthropic, "Responsible Scaling Policy." https://www.anthropic.com/responsible-scaling-policy