AI companies are failing existential safety tests. That's not slowing them down
A sweeping safety review shows leading AI companies advancing toward superhuman systems without guardrails to stop catastrophic failure

The world's leading artificial intelligence companies all received failing or near-failing grades for their plans to control superintelligent systems, according to a new safety assessment released today, even as they race to develop technology that could surpass human intelligence.
The Future of Life Institute's winter 2025 AI Safety Index evaluated eight major AI companies across six dimensions, including risk assessment, current harms, and existential safety. The organization, which gained attention in 2023 for an open letter calling for a pause in the development of the most powerful AI systems, assembled an independent panel of leading experts to conduct this third edition.
While Anthropic, OpenAI, and Google DeepMind led the pack with C+ to C grades overall, every company scored D or F on existential safety measures — the ability to prevent loss of control over advanced AI systems.
"AI CEOs claim they know how to build superhuman AI, yet none can show how they'll prevent us from losing control," said Stuart Russell, a UC Berkeley computer science professor and one of the index's expert reviewers. The report noted that companies admit catastrophic risks could be as high as one in three, yet lack concrete plans to reduce them to acceptable levels.
The assessment revealed a widening gap between top performers and stragglers, including xAI, Meta, and Chinese companies DeepSeek, Z.ai, and Alibaba Cloud. Companies across the board performed poorly in the Current Harms domain, which evaluates how AI models perform on standardized trustworthiness benchmarks designed to measure safety, robustness, and the ability to control harmful outputs. Reviewers found that "frequent safety failures, weak robustness, and inadequate control of serious harms are universal patterns," with uniformly low performance on these benchmarks.
Anthropic scored highest in this category with a C+, while xAI received a failing grade. OpenAI's score dropped to a C- from a B in the second edition, influenced in part by recent real-world incidents. Reviewers recommended the company "increase efforts to prevent AI psychosis and suicide, and act less adversarially toward alleged victims."
"If we'd been told in 2016 that the largest tech companies would run chatbots that encourage kids to kill themselves and produce documented psychosis in long-term users, it would have sounded like a paranoid fever dream," said Tegan Maharaj, professor at HEC Montréal and a reviewer.
The index noted that while none of the tested models failed the benchmarks outright, the consistently poor scores across companies revealed systemic weaknesses in how the industry approaches immediate safety risks, even before considering the more speculative dangers of superintelligent systems.
Five companies participated in the index's detailed survey for the first time, providing unprecedented transparency into their safety practices. However, reviewers concluded that even the strongest performers fall short of emerging regulatory standards like the EU AI Code of Practice and California's SB 53, with gaps in independent oversight, transparent threat modeling, and measurable risk thresholds. "Overall, companies generally are doing poorly, and even the best are making questionable assumptions in their safety strategies," one reviewer cautioned.