
The US government just expanded the most consequential AI safety pact of 2026. The Center for AI Standards and Innovation, the new arm of NIST inside the Commerce Department, announced on May 5 that Google DeepMind, Microsoft, and xAI will give the government pre-release access to their frontier AI models for security and risk evaluation. The three companies join OpenAI and Anthropic, who signed similar agreements earlier this year. NIST AI testing is now the de facto pre-launch checkpoint for every major US frontier lab.
What’s actually new
The pact, which now covers five labs and was announced by NIST and the Center for AI Standards and Innovation (CAISI) at a Tuesday morning briefing, commits each company to a process that did not exist a year ago. Before a new frontier-tier model launches publicly, the company will share it with CAISI for evaluation. CAISI will run safety, cybersecurity, biosecurity, and dual-use risk assessments against the model. The lab will receive findings and recommendations before launch. The lab retains the launch decision; CAISI does not have approval authority. The arrangement is voluntary, but the political and reputational cost of withdrawing makes it effectively binding.
The pact covers frontier-tier models specifically: those at or near the leading edge of capability. Routine model updates, fine-tunes, smaller-tier variants, and customer-deployed models are not subject to the pre-release review. The labs and CAISI will jointly decide which model versions trigger the review process. In practice, this means flagship models such as Anthropic’s Claude Opus, OpenAI’s GPT-5 Reasoning, Google’s Gemini Pro, Microsoft’s frontier in-house models, and xAI’s Grok 4 family.
The findings CAISI produces will not be public. The agreement specifies that risk evaluations are shared confidentially with the originating lab. Aggregated findings or major systemic concerns may be published in CAISI’s annual report, but individual model-by-model results stay in private channels. The transparency lobby has been pushing for more public disclosure; the labs argued successfully for confidentiality on the grounds that public risk inventories give attackers a roadmap.
The legal basis is layered: the Defense Production Act executive order signed by the previous administration in 2023, as modernized through the 2025 NIST authorities, plus the voluntary commitments the labs made to the Biden administration in 2023, which the current administration has retained. Technically, the pact remains a voluntary commitment by the labs. What gives it force is the political fallout any lab would face if it were seen to skip the review on a model that subsequently caused harm.
The international dimension is the unspoken context. The UK AI Safety Institute (now AI Security Institute) signed a parallel pact with the same five labs in 2024 and 2025. The EU AI Office is building toward similar arrangements under the EU AI Act’s general-purpose AI rules. Japan, South Korea, and Singapore have established AI safety institutes that are negotiating their own pre-release access arrangements. The leading frontier labs increasingly submit each new model to a small set of national institutes before launch.
Why it matters
- Pre-launch review is now standard, not exceptional. A year ago, government pre-release review was OpenAI and Anthropic’s voluntary commitment. Today it is industry-wide. The lab that opts out becomes an outlier in a way that affects enterprise procurement.
- Launch timing likely slows by roughly two to six weeks. CAISI has not published a timeline, but industry reporting suggests its evaluation cycle adds material time between model completion and public release. Labs will plan launches around the cycle, which shapes their release cadence in a way it did not before.
- The enterprise procurement narrative just got stronger. CIOs, CISOs, and chief AI officers can point to NIST pre-release review as evidence of responsible vendor behavior, which materially helps the labs win regulated-industry deals.
- The smaller open-weights and specialized labs face a regulatory gap. Mistral, Cohere, Reka, Together, AI21, Stability, and the various Chinese labs are not part of this pact. The asymmetry creates competitive and policy questions over the next 12 months.
- National-security AI applications get a clearer pathway. CAISI does not approve or reject models, but a model that has completed CAISI review carries a documented safety evaluation that DOD, DHS, the intelligence community, and federal procurement can rely on for classified or sensitive workflow approvals.
- The transparency debate hardens. Civil society, academic researchers, and journalists wanted public risk reports; the agreement keeps them confidential. The political pressure to crack open the findings will compound.
How to use it today
The practical implications cluster by audience. Enterprise procurement teams should update their vendor due diligence to include CAISI review status. Developers building on frontier models should expect roughly the same model availability they have today but with cleaner safety posture documentation. Regulated industries (financial services, healthcare, federal) get clearer enterprise procurement justifications.
- Add a “pre-release review” question to your vendor questionnaire. Ask whether the model your vendor uses was submitted to CAISI or another national institute for pre-release evaluation. Use the answer as a procurement input.
- Update your AI governance policy. Specify that frontier-tier models used in production should come from vendors participating in pre-release review programs. Smaller-tier and specialized models can use other vendors with documented risk assessments.
- For regulated industries, document the chain. Your compliance file should include the vendor’s NIST/CAISI participation status, the version of the model deployed, and any vendor-supplied risk documentation.
- If you build for federal markets, ask vendors for CAISI summary statements. Public-sector procurement officers will increasingly ask for evidence; vendors should be able to produce it.
- If you build agentic systems, note that pre-release review focuses on the underlying model. Your agent stack, prompts, tools, and orchestration are out of scope for CAISI review. Your security and safety posture for the agent is your own responsibility.
For teams building on frontier models, the practical impact on code is minimal. The model APIs continue to work the same way. The model versions you call may shift slightly on the launch calendar. The minimum viable check, confirming that the model you depend on went through pre-release review, can be expressed as a procurement attestation record plus a small policy function.
```python
# Example attestation record collected from a vendor during procurement.
vendor_attestation = {
    "vendor": "Anthropic",
    "model_id": "claude-opus-4-7-20260505",
    "pre_release_review": {
        "caisi_review_completed": True,
        "caisi_review_date": "2026-05-01",
        "caisi_summary_doc_id": "CAISI-2026-0145",
        "other_review_bodies": ["UK AISI", "Singapore AI Safety Network"],
    },
    "safety_documentation": {
        "model_card_url": "https://www.anthropic.com/claude-opus-4-7/card",
        "system_card_url": "https://www.anthropic.com/claude-opus-4-7/system-card",
        "vendor_risk_assessment_url": "https://www.anthropic.com/claude-opus-4-7/risk",
    },
}


def vendor_passes_governance_policy(attestation, required_review_bodies=("CAISI",)):
    """Return True only if every required review body appears in the attestation."""
    review = attestation["pre_release_review"]
    # Bodies other than CAISI are listed by name in the attestation.
    completed = set(review.get("other_review_bodies", []))
    if review.get("caisi_review_completed"):
        completed.add("CAISI")
    return all(body in completed for body in required_review_bodies)
```
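The governance-policy and compliance-file recommendations above can be sketched in the same spirit. This is a minimal illustration rather than a standard schema: the `ComplianceRecord` fields, the tier labels `"frontier"` and `"smaller"`, the `meets_policy` helper, and the vendor and model names are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ComplianceRecord:
    """One compliance-file entry per deployed model (hypothetical schema)."""
    vendor: str
    model_version: str
    tier: str  # assumed labels: "frontier" or "smaller"
    review_bodies: list = field(default_factory=list)  # e.g. ["CAISI"]
    risk_docs: list = field(default_factory=list)  # vendor-supplied documents


def meets_policy(record: ComplianceRecord) -> bool:
    """Frontier-tier models need a pre-release review on file;
    other tiers need at least one documented risk assessment."""
    if record.tier == "frontier":
        return len(record.review_bodies) > 0
    return len(record.risk_docs) > 0


# Placeholder vendor and model names, for illustration only.
frontier = ComplianceRecord("ExampleVendor", "example-model-v1", "frontier",
                            review_bodies=["CAISI"])
small = ComplianceRecord("ExampleVendor", "example-small-v1", "smaller")

print(meets_policy(frontier))  # True: a review body is recorded
print(meets_policy(small))     # False: no risk documentation on file
```

Keeping the tier on the record itself means one function can apply both halves of the policy, and an auditor can see why a given model was held to a given standard.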
How it compares
The pre-release AI review landscape has consolidated into a small set of national institutes plus voluntary lab commitments. The table below summarizes the current state.
| Institute | Country | Labs in pact | Findings disclosure | Status as of May 2026 |
|---|---|---|---|---|
| CAISI (NIST) | USA | Anthropic, OpenAI, Google, Microsoft, xAI | Confidential with annual aggregated report | Active, expanded May 2026 |
| UK AI Security Institute | UK | Anthropic, OpenAI, Google, Meta, Microsoft | Confidential, periodic public summaries | Active since 2024 |
| EU AI Office (GPAI) | EU | Voluntary code; mandatory under AI Act for general-purpose AI models with systemic risk (GPAISR) | Mix of confidential and public per regulation | Code of Practice active; full GPAISR rules ramping |
| Singapore AI Safety Network | Singapore | Anthropic, OpenAI, others | Confidential | Active in 2026 |
| Japan AI Safety Institute (AISI) | Japan | Anthropic, others; expanding | Confidential | Active; coordination MOU with CAISI signed 2025 |
| Canada AI Safety Institute | Canada | Smaller-scale; coordination with CAISI | Confidential | Standing up in 2026 |
The competitive read: a top-tier model from a Western lab in 2026 will typically pass through three to five national institute reviews before public launch. The reviews are largely coordinated to avoid duplicate work. The labs that have not joined any pact (most Chinese labs, several smaller open-weights labs) face increasing market pressure as enterprise procurement asks the question.
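For a rough sense of how review coverage differs by lab, the table can be encoded as data and queried. The sketch below includes only labs the table names explicitly; the "others" entries and the EU and Canada rows, which name no specific labs, are left out, so the counts are lower bounds. `NAMED_LABS` and `institutes_for` are hypothetical names for this example.

```python
# Labs named explicitly in the table above. Rows or entries that name no
# specific lab ("others", the EU code, Canada) are intentionally omitted.
NAMED_LABS = {
    "CAISI (NIST)": {"Anthropic", "OpenAI", "Google", "Microsoft", "xAI"},
    "UK AI Security Institute": {"Anthropic", "OpenAI", "Google", "Meta", "Microsoft"},
    "Singapore AI Safety Network": {"Anthropic", "OpenAI"},
    "Japan AI Safety Institute (AISI)": {"Anthropic"},
}


def institutes_for(lab: str) -> list[str]:
    """Return the institutes whose table row names the given lab, sorted."""
    return sorted(name for name, labs in NAMED_LABS.items() if lab in labs)


for lab in ("Anthropic", "OpenAI", "xAI"):
    print(lab, len(institutes_for(lab)))
# Anthropic 4
# OpenAI 3
# xAI 1
```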
What’s next
Three threads are worth watching over the coming months. First, CAISI’s first annual aggregated report, expected in late 2026, will be the first public window into what pre-release review actually finds across the major labs, and it will shape the political and regulatory debate. Second, smaller open-weights and specialized labs may join the pact. Cohere, Mistral, and AI21 have all signaled interest; whether they sign formal agreements will affect their enterprise positioning. Third, international coordination remains an open question. The Western institutes are converging on shared methodologies; whether China and other major non-Western AI nations participate in any parallel arrangement will determine whether AI safety review becomes a global standard or a Western club.
The longer arc is that pre-release model review is becoming the regulatory floor for frontier AI in democratic countries. The pattern is similar to how new pharmaceuticals pass through FDA review before public availability, or how new aircraft designs pass through FAA certification. AI is not a regulated industry in the same legal sense, but the operational reality is converging on the regulated-industry pattern through voluntary commitments and political pressure.
Frequently Asked Questions
Does CAISI have the authority to block a model launch?
No. The pact is voluntary, and CAISI’s findings are recommendations, not approvals. The lab retains the final launch decision. The political and reputational cost of releasing a model against CAISI’s strong objections is significant, which gives the review real influence even without formal authority.
Does this apply to fine-tunes or customer-deployed models?
No. The pact targets frontier-tier base models from the participating labs. Customer fine-tunes, agent applications, prompt-tuned products, and downstream deployments are out of scope. Customers are responsible for their own risk assessments on these.
Will the findings be made public?
Individual model-by-model findings will not be public. CAISI will publish an annual aggregated report describing the overall risk patterns identified across reviewed models. Some labs may voluntarily disclose specific findings in their model cards or system cards.
How long does the review take?
CAISI has not published a specific timeline. Industry reporting suggests two to six weeks between submission and findings delivery, depending on model complexity and risk profile. Labs are planning launches with this cycle built in.
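For release planning, the reported range translates into a simple date window. This is a rough sketch that assumes the two-to-six-week figure from industry reporting, which is not a CAISI commitment; `findings_window` and the submission date are hypothetical.

```python
from datetime import date, timedelta

# Assumption: the two-to-six-week range comes from industry reporting,
# not from a timeline published by CAISI.
MIN_REVIEW = timedelta(weeks=2)
MAX_REVIEW = timedelta(weeks=6)


def findings_window(submitted: date) -> tuple[date, date]:
    """Return the earliest and latest dates findings might arrive."""
    return submitted + MIN_REVIEW, submitted + MAX_REVIEW


# Hypothetical submission date, for illustration only.
earliest, latest = findings_window(date(2026, 6, 1))
print(earliest, latest)  # 2026-06-15 2026-07-13
```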
What about Chinese AI labs?
No Chinese labs are part of the pact. China’s regulatory environment for AI is separate; the Cyberspace Administration of China runs its own approval process for large language models. The US and Chinese review regimes do not currently coordinate.
Will smaller AI companies be included?
Not in the current pact. The pact targets the five frontier labs whose models sit at or near the leading edge of capability. CAISI has signaled it may expand to additional companies as their model capabilities rise to frontier tier. Several mid-tier labs have expressed interest in joining.