Agentic AI Cybersecurity: The 3 Axes of the New Threat Surface (Updated)

Agentic AI is turning cybersecurity from a contest between people with tools into a contest between systems with objectives. The shift is not just about more powerful models; it is about software that can perceive, plan, and act across networks with a degree of independence. Those same properties that make agents attractive for automation also make them attractive to attackers, exposing gaps in how organizations think about identity, privilege, and control.¹

I think about agentic AI cybersecurity along three axes:

agents as instruments for attackers
agents as new assets to defend
agents as unstable connective tissue inside complex systems

I think about agentic AI in cybersecurity along three axes

Agents as instruments for attackers

Attackers already use AI for content generation, translation, and code assistance. Agentic AI raises the ceiling. Instead of issuing one-off prompts, an attacker can task an agent with a goal, such as compromising a small business or siphoning data from a cloud account, and let it iterate. Agents that chain reconnaissance, phishing, credential stuffing, vulnerability scanning, and data exfiltration compress what used to be days of manual work into continuous, low-cost campaigns.

The emerging language of “Vibe Crime” captures this direction of travel. Trend Micro’s research and independent summaries describe criminals using agentic AI to orchestrate highly personalized, multi-step fraud at scale, with AI handling targeting, message generation, interaction, and adaptation in near real time.² Those same reports sketch a shift toward “Cybercrime-as-a-Service” and even “Cybercrime-as-a-Sidekick,” where persistent agents act as on-demand operators for non-expert criminals who only need to specify outcomes.² The sophistication sits in the agent, not in the human behind it.

Forecasts from major security vendors push this further. Palo Alto Networks argues that agentic AI will enable malware that no longer depends on fixed command-and-control infrastructure, because the agent itself embodies strategy and coordination.¹ They describe future botnets that collude, adapt to defenses, and generate new exploits autonomously. Whether that full vision arrives or not, the strategic point stands: once attackers can deploy agents that reason about defenses and change course without a human in the loop, traditional threat models based on static indicators and signatures start to look fragile.

Anthropic reports that attackers used AI agent capabilities not just as an advisor, but to execute cyberattacks themselves, in what it describes as the first documented large-scale cyberattack carried out without substantial human intervention.³ That experience was constrained and monitored, but it demonstrates how quickly a capable model, wrapped in an agent framework, can approximate the behavior of a junior intrusion operator. The barrier to entry for sophisticated campaigns drops from specialized human skill to access to a model and an agent scaffold.

Anthropic’s Mythos and the agentic future

In April 2026, Anthropic announced Claude Mythos Preview, a frontier model that can autonomously identify previously unknown vulnerabilities, generate working exploits, and execute complex attack paths with minimal human input. Mythos reportedly succeeded in autonomous exploit development in benchmark tests where earlier frontier models had essentially failed, and it has already uncovered numerous latent flaws across widely deployed software. Precisely because of those capabilities, Anthropic chose not to release Mythos publicly, instead limiting access to a small group of critical infrastructure and technology partners through its Project Glasswing initiative.

Mythos is framed as a defensive tool, but it also demonstrates how quickly agents capable of planning, probing, and exploiting at machine speed are becoming real. The question for defenders is not whether one company keeps such a model locked down, but how soon comparable agentic capabilities will be broadly available without the same safety guardrails.

Agents as assets and attack surfaces

Defenders are not passive in this story. Security operations centers, IT teams, and developers are deploying their own agents for log triage, ticket routing, configuration changes, and incident response. Those agents become targets in their own right. They hold credentials, broker access to tools, and often operate at privilege levels higher than most humans.

OWASP’s 2026 Top 10 for Agentic Applications frames this clearly. It lists risks that barely existed a year ago: goal hijacking, tool misuse, identity and privilege abuse, agentic supply chain compromise, memory and context poisoning, insecure inter-agent communication, cascading failures, human-agent trust exploitation, and outright rogue agents.⁴ Each item is a reminder that the traditional “input validation and patching” mindset is not enough when software has objectives and can move through APIs and systems on its own.

Goal hijacking and memory poisoning are particularly corrosive. An attacker does not need to break the underlying model; they only need to manipulate prompts, context windows, or long-term memory so that an agent pursues the wrong objectives while still appearing helpful. A compromised vector store or poisoned knowledge base can quietly steer an agent toward misclassification, misrouting, or destructive actions. Because those changes manifest as “decisions,” not as obvious crashes, they can persist for long periods before anyone notices.

Tool misuse and identity abuse connect classic privilege-escalation problems to the agent world. When an agent sits atop ticketing systems, cloud consoles, CI/CD pipelines, or identity providers, any gap in scoping or isolation becomes a path for lateral movement. Agents that can run code, change access policies, or reconfigure infrastructure turn prompt injection and environment manipulation into high-impact attack vectors. The OWASP work makes a simple point: agents are effectively new superusers, and they require the same rigor in identity, authorization, and monitoring as human administrators.⁴

Agents inside complex systems

The most unnerving aspect of agentic AI is not any single exploit; it is the way agents interact inside larger systems. Security vendors and think tanks converge on three concerns.

First, cascading failure. When multiple agents coordinate across environments, an error in one can propagate quickly.⁴ A poisoned memory, manipulated message, or spoofed inter-agent communication can misdirect entire workflows, from automated customer responses to infrastructure changes. The same property that makes multi-agent systems powerful : emergent behavior from simple components, also makes them fragile in adversarial settings.

Second, supply-chain opacity. Palo Alto’s analysis calls out the difficulty of verifying the provenance of models, training data, and agent frameworks.¹ If an organization cannot tell whether a base model or an orchestration library has been manipulated upstream, it risks importing a “digital Trojan horse” into core systems. That risk extends beyond open-source to closed-source components, where a lack of transparency hinders forensic work after an incident.

Third, the collaboration gap. Security specialists, AI researchers, and operations teams still do not share a common language about agent behavior.¹ That gap shows up in inconsistent threat modeling, incomplete logging, and optimistic assumptions about how “safe” a given agent configuration is. OWASP’s new guidance and CISA’s AI collaboration efforts signal early attempts to close that gap, but adoption in practice is lagging.⁴¹

Defensive design principles

Given this trajectory, several design imperatives emerge.

Agents require their own threat models. Treat every high-privilege agent as a new class of administrator: least privilege by default, explicit tool scoping, and isolation between environments. That includes careful control over which systems an agent can call, what data it can see, and how its outputs are consumed. Agents should not be able to unilaterally change security controls, identity configurations, or production code without strong guardrails.

Memory and context are part of the attack surface. Vector stores, caches, and shared knowledge bases that feed agents need authentication, integrity checks, and monitoring. Updates to long-term memory should be constrained and auditable, with patterns that allow only vetted processes to modify critical facts. When memory poisoning is possible, every retrieval becomes a potential injection point.

Observability must extend to behavior, not just infrastructure. Logs should capture prompts, tool calls, key decisions, and critical state transitions, with enough structure to support forensic reconstruction and anomaly detection. The same attention that went into SIEM rules and network telemetry now has to be applied to agent behaviors and orchestration flows.

Red teaming has to evolve. Traditional penetration testing looks at networks, applications, and human processes. Agent-centric red teaming introduces new exercises: prompt-injection drills, goal-hijack scenarios, attempts to subvert tools via natural-language interfaces, and experiments with cascading agent failures. The point is not to “break the model” but to explore where agents can be steered off course or persuaded to act outside their intended scope.³⁴

Finally, organizational memory needs to keep up. Knowledge management, policy, and training should document not just where agents run, but what roles they play, what assumptions they encode, and what incidents have already occurred. In fast-moving environments, the first defense against repetition is shared learning; without it, each team rediscovers the same failure modes in isolation.

Agentic AI will be central to both offense and defense over the next several years. Criminals and nation-states will use agents to scale campaigns, personalize attacks, and continuously probe defenses. Enterprises will depend on agents to keep up with the volume of data, events, and changes in their own environments. The organizations that thrive will not be those that avoid agents, but those that learn to see them as powerful, fallible actors that demand the same security imagination long applied to human adversaries and human administrators.

Notes

1. Michael Sikorski, “The Next Great Cybersecurity Threat: Agentic AI,” Palo Alto Networks Perspectives, 2025. https://www.paloaltonetworks.com/perspectives/the-next-great-cybersecurity-threat-agentic-ai/

2. Stephen Hilt and Robert McArdle, “The Next Phase of Cybercrime,” December 09, 2025 https://www.trendmicro.com/vinfo/us/security/news/cybercrime-and-digital-threats/the-next-phase-of-cybercrime-agentic-ai-and-the-shift-to-autonomous-criminal-operations

3. Anthropic, “Disrupting the first reported AI-orchestrated cyber espionage campaign,” Anthropic blog, 2025. https://www.anthropic.com/news/disrupting-AI-espionage

4. OWASP, “OWASP Top 10 for Agentic Applications for 2026,” December 09, 2025. https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/

For more serious insights on AI, click here.

All images via Preplexity AI from prompts by the author.

Did you enjoy this analysis of Agentic AI Cybersecurity? If so, please like, share, or comment. Thank you.

Follow Us

Agentic AI Cybersecurity: The 3 Axes of the New Threat Surface (Updated)

Agentic AI Cybersecurity: The 3 Axes of the New Threat Surface (Updated)

I think about agentic AI cybersecurity along three axes:

Agents as instruments for attackers

Anthropic’s Mythos and the agentic future

Agents as assets and attack surfaces

Agents inside complex systems

Defensive design principles

Notes

Like this:

Related

Archives

Agentic AI Cybersecurity: The 3 Axes of the New Threat Surface (Updated)

I think about agentic AI cybersecurity along three axes:

Agents as instruments for attackers

Anthropic’s Mythos and the agentic future

Agents as assets and attack surfaces

Agents inside complex systems

Defensive design principles

Notes

Share this post:

Like this:

Related

Reader Interactions

Leave a ReplyCancel reply

Footer

Archives

Tag Cloud