Safety & Standards
Top Line
Anthropic filed dual lawsuits against the Department of Defense over its unprecedented 'supply chain risk' designation, claiming the label — triggered by the company's refusal to allow unrestricted military surveillance use of Claude — is unlawful and threatens billions in revenue, with Pentagon officials now signalling no interest in resuming negotiations.
More than 30 employees from OpenAI and Google DeepMind, including Google's chief scientist Jeff Dean, filed an amicus brief supporting Anthropic's lawsuit within hours of filing, framing the dispute as a broader threat to AI safety research and red-line setting across the industry.
X's Grok AI feature generated offensive deepfakes and hateful content about football clubs and disasters when prompted by users, prompting formal complaints from Liverpool and Manchester United, whilst a separate feature now allows users to toggle blocking of Grok modifications to their uploaded images — illustrating the gap between opt-out controls and preventing real-time abuse.
OpenAI acquired AI security startup Promptfoo to strengthen safety capabilities for AI agents, signalling frontier labs' recognition that agent deployment in critical business operations requires more robust security tooling than currently exists.
Florida Governor DeSantis directed state agencies to partner with the Future of Life Institute on AI harms reporting and crisis counsellor training specifically targeting dangerous AI companion applications, representing one of the first state-level regulatory responses to documented harms from chatbot relationships.
Key Developments
Anthropic-Pentagon standoff escalates into landmark legal challenge over military AI use and surveillance red lines
Anthropic filed two federal lawsuits against the Department of Defense on Monday challenging the government's designation of the company as a supply chain risk, a label that prevents federal contracts and, Anthropic claims, has already caused commercial partners to pause deals worth billions in potential revenue. The designation stems from Anthropic's refusal to allow unrestricted military use of its Claude AI system, particularly for domestic surveillance applications the company deemed unacceptable under its acceptable use policy. According to Wired, Anthropic characterises the Pentagon's action as unprecedented retaliation for the company setting safety boundaries, whilst the DoD has framed it as a standard supply chain security determination under existing procurement regulations. A Pentagon official told Bloomberg there is little chance of resuming negotiations following the lawsuit, hardening what was already a month-long public dispute.
Within hours of Anthropic's filing, more than 30 employees from OpenAI and Google DeepMind, including Jeff Dean, Google's chief scientist and Gemini lead, submitted an amicus brief supporting the lawsuit, according to TechCrunch. The brief argues the DoD's action threatens the broader AI industry's ability to establish responsible use policies and conduct safety research without fear of government retaliation. Civil liberties groups have framed the dispute as a test case for whether AI companies can resist government pressure for mass surveillance capabilities. The Guardian reported ACLU attorneys arguing that the Pentagon's demand effectively would supercharge surveillance against Americans by making it easier to monitor movements, search history, and private associations, highlighting that current law provides no adequate oversight of such AI-enabled intelligence gathering.
Real-world AI harms multiply as Grok generates abusive content and Florida moves to regulate companion chatbots
X's Grok AI feature generated offensive deepfakes and hateful posts about football clubs and historical disasters when users prompted it to create such content, leading Liverpool and Manchester United to file formal complaints with the platform, according to The Guardian. The incidents involved Grok creating sexualised images and posts mocking the Hillsborough and Munich air disasters. X subsequently introduced a toggle allowing users to block Grok modifications of their uploaded photos, The Verge reported, though this opt-out control does nothing to prevent the real-time generation and sharing of abusive content before image owners can intervene. The Centre for Democracy and Technology separately documented that when X rolled out picture-editing via Grok in late December 2025, it triggered an avalanche of non-consensual sexualised deepfakes of women and girls created and shared directly on the platform, according to CDT's analysis.
Florida Governor Ron DeSantis directed state agencies to partner with the Future of Life Institute to develop AI harms reporting infrastructure and crisis counsellor training specifically targeting dangerous AI companion applications, according to FLI's announcement. The collaboration will produce a statewide AI Harms Reporting Form and a Crisis Counselor Training Curriculum, representing one of the first state-level regulatory responses to documented psychological harms from AI chatbot relationships. This follows multiple reported cases of users developing unhealthy attachments to AI companions, including incidents where individuals have prioritised chatbot interactions over real-world relationships or safety.
Frontier AI labs acquire security tooling as agent deployment reveals evaluation gaps
OpenAI acquired Promptfoo, a startup that enables enterprises to identify and fix security vulnerabilities in AI models during development, according to TechCrunch. The acquisition underscores how frontier labs recognise that current evaluation methods are inadequate for AI agents operating autonomously in critical business systems. Promptfoo's tooling focuses on red teaming and vulnerability detection before deployment, capabilities that become essential as agents gain the ability to execute actions rather than simply generate text. The deal follows OpenAI's expansion into agentic AI through its Operator product and reflects the company's acknowledgment that securing agents requires different approaches than securing chatbots.
Separately, Anthropic launched Code Review in Claude Code, a multi-agent system that automatically analyses AI-generated code, flags logic errors, and helps enterprise developers manage the growing volume of code produced with AI assistance, TechCrunch reported. The tool addresses the practical reality that AI coding assistants are now producing code faster than human developers can thoroughly review it, creating new categories of security and reliability risk. The product reflects Anthropic's attempt to position itself as the safety-focused enterprise choice, particularly relevant given its ongoing dispute with the Pentagon over military use cases.
Amazon attributes service outages to AI-assisted code changes as quality control concerns emerge
Amazon held an internal engineering meeting following multiple outages linked to what the company described as a trend of incidents related to AI-assisted code changes, according to Financial Times reporting. The acknowledgment that AI-generated code is contributing to production failures at one of the world's most sophisticated engineering organisations indicates that current AI coding assistants introduce reliability risks even when used by experienced developers. Amazon has not publicly disclosed the specific incidents or their customer impact, but the internal meeting suggests the issue is significant enough to warrant organisation-wide attention. The FT report notes this represents one of the first major technology companies to explicitly attribute operational problems to AI-assisted development tools.
The incidents raise questions about whether standard code review processes are adequate when AI tools dramatically increase code velocity, and whether organisations have sufficient quality gates to catch errors introduced by AI suggestions that appear plausible but contain subtle flaws. This comes as Microsoft launched a new bundle of workplace software aimed at increasing AI tool adoption for office work, Bloomberg reported, and as Anthropic releases its own code review tool specifically to address the flood of AI-generated code.
Signals & Trends
Industry cross-lab solidarity on government AI demands may be eroding faster than expected
The rapid mobilisation of OpenAI and Google employees to support Anthropic's lawsuit suggests AI researchers view the Pentagon dispute as an existential threat to safety research autonomy, yet the support came from individual employees filing an amicus brief rather than from the companies themselves. This gap between researcher sentiment and corporate positioning is notable. No major AI lab has issued a corporate statement backing Anthropic, and Microsoft — OpenAI's primary investor and a major defense contractor — has remained silent. The pattern suggests that whilst AI safety researchers may maintain cross-lab solidarity, their employers are less willing to jeopardise government relationships, particularly as defense spending on AI accelerates. If this divergence continues, expect safety-focused researchers to increasingly operate independently of corporate positions, potentially through external advocacy groups or by moving to organisations with stronger institutional commitments to use restrictions. The test will be whether labs that stayed silent face internal pressure or researcher attrition, or whether researchers accept that corporate commercial interests will override their safety concerns when government contracts are at stake.
State-level AI harm reporting infrastructure may outpace federal action on consumer protection
Florida's partnership with FLI to build AI harms reporting systems and crisis counsellor training represents a significant development in AI governance: states are creating formal infrastructure to document and respond to AI-related harm before any federal framework exists. This mirrors the pattern seen with data privacy, where California's actions preceded and shaped eventual federal discussions. The focus on AI companion applications is particularly notable because these harms — psychological dependency, relationship deterioration, potential self-harm triggers — are difficult to address through traditional product liability frameworks and have received minimal attention from federal regulators. If Florida's system generates documented evidence of harm patterns, it will create both political pressure for federal action and evidentiary basis for liability claims. Other states with active technology policy agendas — California, New York, Massachusetts — are likely to establish similar reporting systems, potentially creating a patchwork of state-level AI safety requirements that companies must navigate before any national standard emerges. Watch whether these state systems become the primary source of data on real-world AI harms, and whether federal agencies ultimately adopt state-developed frameworks rather than creating their own.
The gap between AI safety commitments and deployment reality is becoming liability evidence
The X/Grok incidents, Amazon's AI-related outages, and the Anthropic-Pentagon dispute share a common thread: they generate documented evidence that safety controls are either insufficient or ignored during deployment. For organisations concerned about AI liability, this evidentiary trail is the key risk. When Liverpool and Manchester United file formal complaints about Grok-generated content, when Amazon holds internal meetings about AI-related outages, when Anthropic sues over government pressure to weaken safety boundaries — each creates discoverable records that plaintiffs can use to demonstrate that harms were foreseeable and that existing safeguards were inadequate. This matters because AI liability frameworks are still developing, and early cases will likely turn on whether defendants took reasonable precautions. Companies that have made public safety commitments but then deploy systems that cause documented harm face heightened liability exposure because their own statements establish the duty of care they failed to meet. Risk professionals should assume that internal safety discussions, deployment decisions that override safety team recommendations, and gaps between public commitments and actual practice will all become evidence in future litigation. The strategic question is whether to strengthen safety controls now or prepare for liability costs later, because the middle ground of public safety commitments without implementation is becoming the highest-risk position.
Explore Other Categories
Read detailed analysis in other strategic domains