As artificial intelligence advances at an unprecedented pace, one of the industry’s leading safety-focused firms is sounding the alarm and calling for unified global action to rein in development of the most powerful systems. Anthropic, the developer of the popular Claude chatbot, outlined its proposal in a public blog post published Thursday, arguing that rapid technological gains have outpaced safety preparations, creating a tangible risk that humans could ultimately lose control of increasingly autonomous AI systems.
In the post, co-authored by company co-founder Jack Clark and Marina Favaro, head of Anthropic’s independent research institute, the firm laid out the core case for a coordinated pause. Citing current industry trends, Anthropic warned that, given access to sufficient computing power, cutting-edge AI systems could soon be able to design and build their own improved successors, a scenario known as recursive self-improvement. The company acknowledged that this milestone could unlock major breakthroughs in fields ranging from medical research to scientific discovery, but emphasized that it would also dramatically amplify the risk of unaligned AI operating outside human oversight.
The proposed pause, Anthropic argued, would create critical breathing room for societal institutions and AI alignment research to catch up to rapid technical advances. Alignment, a core concept in AI safety, refers to the ongoing work of ensuring AI systems’ goals and behaviors align with human values and intentions. Anthropic also noted that a coordinated global verification mechanism would prevent bad actors from exploiting a widespread slowdown to secretly accelerate their own development, and avoid the scenario where less safety-focused firms gain an unfair advantage by pushing ahead unregulated.
The proposal comes as the AI industry is already roiled by competing perspectives on how to govern cutting-edge development. Just one day before Anthropic published its post, OpenAI, Anthropic’s main rival and developer of the ChatGPT chatbot, published a report pushing for a different approach to AI governance. OpenAI argued that democratic national governments, not private tech companies acting independently, should be the ultimate arbiters of AI rules, safeguards and accountability mechanisms. “Decisions about the pace of AI innovation should not be left to any one lab, company, or special interest group,” the company said.
Anthropic’s call for a pause also follows a separate warning issued earlier this week by a team of cybersecurity researchers at the University of Toronto. The team published research detailing how off-the-shelf AI tools can be repurposed to create a new breed of adaptive, AI-powered “worm” that evolves its hacking strategy as it spreads across connected devices, allowing it to take over entire large-scale computing networks.
Lead researcher Nicolas Papernot explained in an interview that the team built the proof-of-concept worm using a widely available open-source AI tool that is cheap and easy for bad actors to access and modify. Traditional cyberattacks focus exclusively on high-value targets such as banking systems, hospital infrastructure, or power grids. By contrast, Papernot noted, AI-powered hacking lowers the cost of attacks so dramatically that any internet-connected device, even an old unused laptop stored in a basement, can be co-opted as a launch pad for larger attacks on critical infrastructure. “Anything connected to the internet is now at risk,” he said, adding that smaller, widely available AI tools, not just the largest and most powerful frontier language models, pose meaningful security risks. Papernot notified Canadian cybersecurity authorities ahead of publishing his team’s findings, and called for expanded cross-sector collaboration between tech firms, government agencies and academic researchers to develop effective countermeasures for AI-powered cyber threats.
Widespread concern about unregulated advanced AI and its potential to cause societal harm has grown steadily as models have become more capable. Earlier this year, Anthropic’s own Mythos model sent shockwaves through the finance and tech industries after demonstrating an ability to autonomously detect unpatched vulnerabilities in existing commercial code. Despite growing risks, regulatory progress has lagged, particularly in the United States, where most of the world’s leading AI development labs are based. Earlier this week, the Trump administration issued an executive order placing responsibility for safety testing on the firms themselves, asking companies to voluntarily submit their most capable models for government cybersecurity testing before public release.
This is not the first time AI researchers and industry figures have called for a pause on advanced AI development. In 2023, the non-profit Future of Life Institute led a prominent push to halt such development for six months to allow time for the creation of binding safety guardrails, a move backed by high-profile figures including Elon Musk, owner of independent AI lab xAI. That effort failed to gain widespread industry or government traction.
Anthropic has positioned itself as a safety-first AI developer since its founding. Earlier this year, the firm drew public attention and government pushback when it refused to license its AI models to the U.S. military for use in domestic surveillance and fully autonomous weapons systems. As a result, the Pentagon placed Anthropic on a national security blacklist that is set to take effect in 2026, barring the company from federal government contracts.
Anthropic’s new proposal comes as both the firm and OpenAI are moving toward initial public offerings (IPOs), in which they would sell shares on public markets. Analysts currently estimate that Anthropic’s IPO could value the company at nearly $1 trillion, underscoring the high financial stakes at play in the global debate over AI safety and regulation.
