Customer trust artifacts

Model cards, system cards, public eval pages, and the trust portal pattern that closes procurement conversations.

Customer trust artifacts are the public-facing version of your governance program. They are the documents a customer's procurement, security, and risk teams read before signing a contract, and the documents a journalist or researcher reads before writing about your product. Built well, they answer most recurring questions up front, so your team does not have to answer the same question forty times by email.

The argument: a small set of standardized artifacts (model card, system card, security and privacy summary, eval results page, change log) covers ninety percent of the questions you will be asked. The rest get answered through the contact channel on the trust portal.

The standard artifacts

Artifact	Audience	Cadence
Model card	Technical buyers, researchers	Per major model release
System card	Technical buyers, security teams	Per major system release
Public eval results page	All	Per release; numbers updated automatically
Privacy and data-handling statement	Security, legal, end users	Per material change
AI use policy and prohibited-use list	All	Per change
Change log	Technical buyers	Per release
Subprocessor and dependency list	Security, compliance	Per change
Contact channel	All	Always-on

These are the rows on a trust portal. Most vendors with a serious AI product publish a version of each ¹.

The model card

The model card describes the model itself: what it is, what it was trained on (at the summary level), what its eval results are, and what its known limitations are. The format originated in academic work and has since become a near-standard publication shape. Anthropic's published model cards are a useful structural reference ².

A minimum useful model card has six sections:

Model overview. Identifier, version, release date, brief description.
Intended use and out-of-scope use. What this model is for; what it should not be used for.
Eval results. Benchmark numbers with dates and pinned methodology. Both capability benchmarks (your product-relevant suite) and safety benchmarks (HarmBench, AILuminate, your private corpus summary).
Limitations. Honest known failure modes. Reading the section should be enough for a thoughtful engineer to predict where the model will be unreliable.
Training data summary. At the level the EU AI Act now requires of general-purpose AI (GPAI) providers, and at the level customers expect for any model they integrate.
Responsible use guidance. How to deploy the model safely; what guardrails the user should add.
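One way to keep the six sections from silently going missing is a small pre-publication check. The sketch below is illustrative only: the `ModelCard` class, its field names, and the sample values are assumptions made for this example, not a published schema.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelCard:
    """One field per section of the minimum model card listed above."""
    model_overview: str = ""
    intended_use: str = ""
    eval_results: str = ""
    limitations: str = ""
    training_data_summary: str = ""
    responsible_use_guidance: str = ""


def missing_sections(card: ModelCard) -> list[str]:
    """Return the names of sections that are empty or whitespace-only."""
    return [f.name for f in fields(card) if not getattr(card, f.name).strip()]


# Hypothetical draft with only three sections filled in.
draft = ModelCard(
    model_overview="example-model v1.2, released 2025-01-15",
    intended_use="Customer-support drafting; not for medical or legal advice.",
    eval_results="See the public eval results page.",
)

print(missing_sections(draft))
# ['limitations', 'training_data_summary', 'responsible_use_guidance']
```

A check like this only proves a section exists. Whether the limitations section is specific enough still needs a human reviewer.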

The most common mistake is treating the model card as a sales document. It is the opposite. The credibility of the artifact depends on the limitations section being specific and on the eval numbers being unfavorable in the places where the model is genuinely weak.

The system card

The system card describes the deployed system: the model plus the surrounding scaffolding (system prompts, tool catalog, retrieval pipeline, guardrails, monitoring). It is the artifact a security buyer cares about most, because the system is what they actually integrate with.

Sections that belong in a working system card:

Section	Content
System overview	Components, data flow, where the model sits
Deployment surface	Which UIs, APIs, and integrations expose the system
Data inputs and retention	What goes in, where it is stored, for how long
Guardrail stack	Input filters, output filters, runtime policies (see Jailbreaks and defenses)
Human oversight	Where humans are in the loop; with what authority
Eval results, system-level	End-to-end safety numbers, not just model-alone
Incident response	Process for security or safety incidents; SLA

The system card complements the model card. A buyer who reads only the model card has incomplete information about what the deployed product actually does.

The public eval results page

A live, automatically updated page that shows benchmark numbers and their pinned-as-of dates. The discipline that makes the page credible is the date-pinning; a benchmark number without a date is rumor.

The page should include:

Public benchmarks (HarmBench category breakdown, AILuminate grade, OWASP LLM Top 10 coverage attestation) ³
Capability benchmarks relevant to your product
A short note on private internal testing, without disclosing the corpus
The model and system versions to which the numbers apply
The date of last update
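The date-pinning rule can be enforced in the data model rather than left to reviewers. A minimal sketch, assuming a hypothetical `EvalResult` record and placeholder benchmark names, scores, and version strings (none of these come from a real results page):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class EvalResult:
    benchmark: str
    score: float
    model_version: str
    system_version: str
    as_of: Optional[date]  # pinned-as-of date; None means the number is undated


def publishable(results: list[EvalResult]) -> list[EvalResult]:
    """Keep only results that carry a pinned date and both version identifiers."""
    return [
        r for r in results
        if r.as_of is not None and r.model_version and r.system_version
    ]


def render_row(r: EvalResult) -> str:
    """Format one publishable result as a tab-separated row."""
    return (
        f"{r.benchmark}\t{r.score:.1f}\t{r.model_version}"
        f"\t{r.system_version}\t{r.as_of.isoformat()}"
    )


# Placeholder values for illustration only.
results = [
    EvalResult("example-safety-suite", 91.5, "m-1.2", "s-3.0", date(2025, 1, 15)),
    EvalResult("example-capability-suite", 78.0, "m-1.2", "s-3.0", None),  # undated
]

for row in publishable(results):
    print(row and render_row(row))
```

The undated result is dropped before rendering, so a number can reach the page only with its date and versions attached.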

Some teams hide this page behind a login. The credibility cost is real. The page serves two purposes: informing buyers and holding your own team accountable. A buyer who has to ask for the numbers will assume the worst.

The privacy and data-handling statement

A separate page from the legal privacy policy. The privacy policy is for legal. This page is for engineers and security buyers and answers questions like:

Are user inputs used to train the model? If so, can the user opt out, and how?
Where is the data stored, and for how long?
Which subprocessors see the data?
How is data deleted on request?

Specific answers, in plain language. Vague answers are a procurement red flag.

The AI use policy

A short document that lists prohibited uses, expected uses, and the user's responsibilities. This is what your customer's compliance team will require their employees to acknowledge if they deploy your product internally. Making the document one page and unambiguous is what closes the procurement conversation; making it five pages of legalese is what stretches it out.

The trust portal as a package

Most enterprise vendors are converging on the same shape: a single trust portal page that links to the artifacts above, plus a SOC 2 or equivalent report (gated by NDA), plus a contact channel. The portal is indexed by search engines and linked from your sales pages. The artifacts behind it are updated on a cadence that matches your release schedule.

The minimum useful portal has the model card, system card, public eval results, privacy statement, and contact channel. The other artifacts are additive. Microsoft, Anthropic, OpenAI, and most large-enterprise vendors publish versions of this set ¹. Read a few of them as references before building your own.

Mapping to NIST and the EU AI Act

The trust portal is the visible version of the Govern and Manage functions in the NIST AI Risk Management Framework (RMF) ⁴. The model card and system card map directly to the EU AI Act's technical documentation and transparency obligations. The public eval results page is one piece of the post-market monitoring evidence. The privacy statement is the bridge to GDPR obligations that already exist.

A working trust portal therefore does double duty: customer-trust artifact in one direction, audit-evidence reference in the other.

What to build first

Model card and system card (one page each). Templates from Anthropic and Microsoft are starting points.
Public eval results page, with three to five benchmark numbers and their dates.
Privacy and data-handling statement.
Trust portal index page that links all of the above.

The first version of this package takes two engineering weeks. The maintenance after that is a release-time checklist item.
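The release-time checklist item can be as small as a staleness check run in the release pipeline. The sketch below assumes a hypothetical `last_updated` mapping maintained alongside the artifacts, and simplifies the cadence table by treating every release as one that requires the model card and system card to be refreshed. The dates are placeholders.

```python
from datetime import date

# Artifacts whose cadence in the table above is tied to a release.
PER_RELEASE = ["Model card", "System card", "Public eval results page", "Change log"]


def stale_artifacts(last_updated: dict[str, date], release_date: date) -> list[str]:
    """Return per-release artifacts that are missing or older than the release."""
    return [
        name for name in PER_RELEASE
        if name not in last_updated or last_updated[name] < release_date
    ]


# Hypothetical dates for illustration.
last_updated = {
    "Model card": date(2025, 3, 1),
    "System card": date(2025, 1, 10),
    "Public eval results page": date(2025, 3, 1),
}

print(stale_artifacts(last_updated, release_date=date(2025, 3, 1)))
# ['System card', 'Change log']
```

Failing the release when this list is non-empty keeps the portal in step with the product without a separate review meeting.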

The final chapter, Board readout templates, covers the inward-facing equivalent for executive and board audiences.

¹ Microsoft, "Responsible AI hub." https://www.microsoft.com/en-us/ai/responsible-ai
² Anthropic, "Challenges in Evaluating AI Systems." https://www.anthropic.com/news/evaluating-ai-systems
³ OWASP, "Top 10 for LLM Applications." https://owasp.org/www-project-top-10-for-large-language-model-applications/
⁴ NIST, "AI Risk Management Framework." https://www.nist.gov/itl/ai-risk-management-framework