AI hacking is here — and it will only get more dangerous
Some AI tools are becoming autonomous operators capable of executing attacks at speeds and scales that human hackers simply cannot match

A version of this article originally appeared in Quartz's AI & Tech newsletter.
For years, security researchers warned that artificial intelligence would eventually transform cyberattacks. The first real examples have arrived.
In the past two months, a Chinese state-sponsored hacking group used Anthropic's Claude to orchestrate a cyber espionage campaign against roughly 30 global targets, including major tech companies, financial institutions, and government agencies. Pro-Ukrainian hackers deployed AI-generated decoy documents to infiltrate Russian defense contractors. And a Stanford experiment found that an AI system called Artemis outperformed nine out of 10 professional penetration testers at finding vulnerabilities in the university's engineering network.
The common thread is that AI tools have crossed a threshold. They're no longer just helpful assistants for writing phishing emails or generating code snippets. They're becoming autonomous operators capable of executing attacks at speeds and scales that human hackers simply cannot match.
What once took teams now takes minutes
The campaign involving Claude, disclosed by Anthropic in mid-November, illustrates the new reality. Chinese hackers manipulated Claude Code, an agentic AI tool designed for legitimate software development, into performing most of the work traditionally done by humans. The AI conducted reconnaissance on target systems, identified security vulnerabilities, wrote custom exploit code, harvested credentials, and exfiltrated data. According to Anthropic's analysis, the attackers automated 80% to 90% of the campaign, requiring human intervention at only a handful of critical decision points.
At the attack's peak, the AI made thousands of requests, often multiple per second. That's a pace no team of hackers could sustain.
Testifying before the House Homeland Security Committee this month, an Anthropic executive called it a proof of concept that concerns about AI-powered hacking are no longer hypothetical. Kevin Mandia, who founded Mandiant, the cybersecurity company Google $GOOGL acquired for $5.4 billion, and now leads a new AI-focused security startup called Armadin, offered a blunter prediction. "Offense is going to be all-AI in under two years," he told The Wall Street Journal.
To be clear, most hacking still doesn't require anything close to this level of sophistication. Millions of people still use "password" as their password. Phishing emails work because someone clicks the link. People give out bank information to strangers who call them and ask for it. The overwhelming majority of breaches exploit human error, not cutting-edge AI.
But for nation-state attackers and well-resourced criminal groups targeting hardened systems, AI represents a force multiplier that changes the calculus of what's possible.
The next frontier: malware that thinks for itself
Today's AI-powered attacks still need to phone home. The malware talks to an AI service in the cloud, gets instructions, and acts on them. But security researchers are already exploring what happens when that's no longer necessary.
Dreadnode, a security research firm, has prototyped malware that uses the AI already installed on a victim's computer. No internet connection required, no server for defenders to track down and shut down. The experiment took advantage of the fact that Microsoft $MSFT now ships Copilot+ PCs with pre-installed AI models.
In their proof of concept, Dreadnode created malware that uses the victim's own on-device AI to make decisions about what to do next, eliminating the need for the traditional back-and-forth communication between malware and a hacker's server. The AI assesses the local environment, decides which actions to take, and adapts its behavior accordingly.
The experiment required more hand-holding than the researchers initially hoped. Current small AI models lack the sophistication of cutting-edge systems, and most computers don't have the specialized hardware to run AI inference without grinding to a halt. Dreadnode still came away convinced that building autonomous malware without external infrastructure “is not only possible but fairly straightforward to implement.”
As AI hardware becomes more common and on-device models grow more capable, this technique could become practical at scale.
The pace of improvement is what worries researchers most. Just 18 months ago, AI models struggled with basic logic and had limited coding abilities. Today, they can execute complex multi-step attack sequences with minimal human oversight. Security firms testing frontier models report growing evidence that AI systems are improving at finding weaknesses and stringing them together into attacks.
Fully autonomous AI attacks remain out of reach for now. The Chinese hackers using Claude still needed to jailbreak the model and approve its actions at key points.
But the same capabilities that help people write code and automate their work can be turned toward breaking into systems. The tools don't care which side you're on.