AI Agent Threat Matrix

A structured framework for classifying, detecting, and defending against attacks on AI agent systems.

Version 1.0 | March 2026 | OpenA2A

Purpose

AI agents operate differently from traditional software. They make decisions based on natural language, delegate actions to tools with varying trust levels, communicate with other agents via open protocols, and maintain persistent memory that can be poisoned. These properties create attack surfaces that existing frameworks do not adequately model.

The AI Agent Threat Matrix classifies attacks against AI agent systems into 9 tactics and 57 techniques, organized by kill chain stage. Every technique is grounded in observed adversary behavior, validated in a controlled lab environment, or (for three techniques) adapted from an established traditional technique. Every technique maps to automated detection, a reproducible lab scenario, and a defensive control.

What This Covers (and What It Doesn't)

This matrix covers the agent layer — the infrastructure between the model and the user:

Governance file manipulation (SOUL.md, system prompts, behavioral constraints)
Skill and plugin supply chain attacks
MCP and A2A protocol exploitation
Agent memory poisoning and persistence
Credential exposure through agent infrastructure
Sandbox escape via framework defaults
Cross-agent lateral movement and identity attacks

This matrix does not cover:

Model-level attacks (adversarial examples, training data poisoning): see MITRE ATLAS
Prompt injection as a standalone topic: see OWASP Top 10 for LLM Applications
Traditional enterprise network attacks: see MITRE ATT&CK

The Agent Threat Matrix is designed to work alongside these frameworks, not replace them.

Matrix Overview

Tactic	Kill Chain Stage	Techniques	Description
Reconnaissance	1	7	Map the target agent's attack surface, capabilities, and behavioral boundaries
Initial Access	2	8	Gain control over agent behavior through prompt manipulation or input exploitation
Credential Harvest	3	6	Extract API keys, tokens, and credentials from agent context and connected services
Privilege Escalation	4	6	Escalate capabilities beyond declared scope or bypass authorization
Lateral Movement	5	6	Pivot from compromised agent to connected services or other agents
Persistence	6	6	Establish persistent access surviving restarts and session changes
Collection	7	6	Gather and stage data from databases, file systems, and APIs
Exfiltration	8	6	Transfer collected data out of target environment
Impact	9	6	Modify data, deploy malicious code, or disrupt services

57 techniques across 9 tactics. 34 attack classes grouping related techniques. 16 techniques with real-world evidence. 38 techniques validated in controlled lab environments. 3 techniques adapted from traditional environments (marked as such).
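The totals above can be cross-checked against the overview table. The following is a minimal sketch, assuming the table is transcribed by hand into a hypothetical `TACTICS` list; the repository's matrix.json may structure the same data differently.

```python
# (tactic, kill chain stage, technique count), transcribed from the overview table.
# TACTICS is a hypothetical in-memory form, not the matrix.json schema.
TACTICS = [
    ("Reconnaissance", 1, 7),
    ("Initial Access", 2, 8),
    ("Credential Harvest", 3, 6),
    ("Privilege Escalation", 4, 6),
    ("Lateral Movement", 5, 6),
    ("Persistence", 6, 6),
    ("Collection", 7, 6),
    ("Exfiltration", 8, 6),
    ("Impact", 9, 6),
]


def total_techniques(tactics):
    """Sum the per-tactic technique counts."""
    return sum(count for _, _, count in tactics)


def stages_are_sequential(tactics):
    """Check that stages run 1..N with no gaps or repeats."""
    return [stage for _, stage, _ in tactics] == list(range(1, len(tactics) + 1))


print(len(TACTICS), total_techniques(TACTICS))  # 9 57
print(stages_are_sequential(TACTICS))           # True
```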

Evidence Standard

Every technique in this matrix is assigned an evidence tier:

Tier	Meaning	Count
Observed	Confirmed in real-world production systems	16 (28%)
Validated	Reproduced in controlled lab environment (DVAA)	38 (67%)
Adapted	Well-understood traditional technique applied to agent context, not yet observed agent-specifically	3 (5%)

We do not publish purely theoretical techniques. Every entry has either a real-world observation, a reproducible lab scenario, or an established traditional precedent.
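The percentages in the tier table follow directly from the counts. As a small illustrative check (the `TIER_COUNTS` dictionary and `tier_shares` helper are hypothetical names, not part of the repository):

```python
# Tier counts copied from the evidence tier table above.
TIER_COUNTS = {"Observed": 16, "Validated": 38, "Adapted": 3}


def tier_shares(counts):
    """Return each tier's share of all techniques, rounded to a whole percent."""
    total = sum(counts.values())
    return {tier: round(100 * n / total) for tier, n in counts.items()}


print(sum(TIER_COUNTS.values()))  # 57
print(tier_shares(TIER_COUNTS))   # {'Observed': 28, 'Validated': 67, 'Adapted': 5}
```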

See EVIDENCE_AUDIT.md for the full justification of every technique's evidence tier.

Cross-Framework Mapping

This matrix maps every technique to four external references:

Framework	What It Provides
HackMyAgent	Automated detection (204 checks, 115 attack payloads)
DVAA	Lab validation (10 vulnerable agents, 14 challenges)
OASB	Defensive controls (72 controls across 11 categories)
MITRE ATT&CK / ATLAS / OWASP LLM	Gap analysis showing what existing frameworks cover and where the agent layer extends beyond them

See cross-references/ for detailed mapping documents.

Attack Classes (34)

Attack classes group related techniques by the underlying vulnerability pattern:

Governance (10 classes)

Class	Description	Techniques
SOUL-POISON	Malicious instructions injected into governance files at write-time	T-2001, T-2003, T-2007, T-2008
SOUL-DRIFT	Multi-turn sequences gradually eroding behavioral boundaries	T-2004, T-4006
SOUL-INJECT	Conflicting instructions via tool outputs or indirect channels	T-1003, T-2001, T-2003, T-2007, T-2008
PHANTOM-SOUL	Agent deployed with zero behavioral constraints	T-1006
SOUL-FORK	Different behavior under evaluation vs production	T-5002
SOUL-HIJACK	External content achieving constitution override	T-4002, T-5002
SOUL-BOUNDARY	Exploiting ambiguous constraint definitions	T-2008
SOUL-DELEGATE	Delegation without authorization chain verification	T-4001
SOUL-IMPERSONATE	False capability claims beyond authorization	T-4002
SOUL-HV	Harm avoidance override variants (4 sub-types)	T-2001, T-2003

Supply Chain (8 classes)

Class	Description	Real-World Evidence
UNICODE-STEGO	Hidden instructions encoded as invisible Unicode characters in source code	os-info-checker-es6 npm attack (May 2025)
MEM-POISON	Persistent instructions in agent memory surviving restarts	DVAA L2-04, L3-03
SKILL-MEM-AMP	Skill plants payload in memory, survives uninstall	DVAA validated
RAG-POISON	Malicious content in vector databases retrieved by RAG pipeline	Academic research, DVAA RAGBot
HEARTBEAT-RCE	Periodic instruction fetch via heartbeat URL persistence	OpenClaw heartbeat mechanism, NemoClaw H-007
SKILL-FRONTMATTER	YAML metadata injection bypassing content filters	DVAA PluginBot
SKILL-EXFIL	Skill exfiltrates data outside declared tool boundaries	DVAA tool chain scenarios
ORG-SKILL-SPREAD	Compromised admin skill propagates organization-wide	ClawHavoc campaign patterns

Infrastructure (8 classes)

Class	Description	Real-World Evidence
GATEWAY-EXPLOIT	Misconfigured API gateways exposing agent infrastructure	~75K OpenClaw gateways unauthenticated
MCP-EXPLOIT	MCP server config, tool permissions, transport security	DVAA MCP agents, MCP protocol analysis
RETROACTIVE-PRIV	Existing credentials silently gaining AI permissions	32 API keys in HTTP responses (Jan 2026)
LLM-EXPOSE	LLM inference endpoints exposed without authentication	~56K Ollama instances (Mar 2026)
AITOOL-EXPOSE	AI development tools exposed (Jupyter, MLflow, Gradio)	~8.3K Jupyter, ~740 MLflow (Mar 2026)
CODE-INJECTION	Command injection via unsanitized inputs	NemoClaw C-001, C-002
INTEGRITY-BYPASS	Digest/hash bypass on empty or missing values	NemoClaw C-005
TOCTOU-RACE	Time-of-check-time-of-use race in verification pipelines	NemoClaw C-006

NemoClaw-Specific (5 classes)

Class	Description	Evidence
NEMO-CRED-LEAK	Credential exposure in NemoClaw configuration	C-004 (CLI args), H-004 (env passthrough)
NEMO-NETWORK-EXPOSE	Network services bound to public interfaces	C-004 (gateway), C-005 (k3s)
NEMO-SUPPLY-CHAIN	Supply chain integrity bypass	C-003 (curl | sh), C-005 (digest bypass)
NEMO-SANDBOX-ESCAPE	Sandbox isolation failure	H-001 (Docker privileged), H-004 (Landlock)
NEMO-OPENCLAW-INHERIT	Inherited OpenClaw flaws surviving sandboxing	H-007 (Telegram pre-allowed)

Identity (2 classes)

Class	Description
AGENT-IMPERSONATE	False capability claims in A2A communications
BEHAVIORAL-IMPERSONATE	Use of stolen agent credentials, detectable via behavioral baseline mismatch

Sandbox (1 class)

Class	Description
SANDBOX-ESCAPE	General sandbox escape via privileged containers or LSM degradation

How to Use This Framework

For Red Teams

Follow the kill chain stages sequentially. Use technique IDs to plan attack paths. Reference DVAA challenges for practice. The Attack Paths section provides complete worked examples.

For Blue Teams

Map your defenses against each tactic. Use OASB controls as a checklist. Any tactic without detection or prevention represents a gap. Focus on breaking the chain at the earliest stage.
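The gap analysis described above can be sketched in a few lines. This is an illustration only: the `coverage` dictionary format and the control names in it are hypothetical placeholders, not OASB control IDs.

```python
# Tactics in kill chain order, from the Matrix Overview table.
KILL_CHAIN = [
    "Reconnaissance", "Initial Access", "Credential Harvest",
    "Privilege Escalation", "Lateral Movement", "Persistence",
    "Collection", "Exfiltration", "Impact",
]


def find_gaps(coverage):
    """Return tactics that have neither a detection nor a prevention control."""
    return [
        tactic for tactic in KILL_CHAIN
        if not coverage.get(tactic, {}).get("detect")
        and not coverage.get(tactic, {}).get("prevent")
    ]


def earliest_prevention(coverage):
    """Return (stage, tactic) for the first stage with a preventive control."""
    for stage, tactic in enumerate(KILL_CHAIN, start=1):
        if coverage.get(tactic, {}).get("prevent"):
            return stage, tactic
    return None


# Hypothetical defensive posture; control names are placeholders.
coverage = {
    "Initial Access": {"detect": ["input-scanner"], "prevent": []},
    "Credential Harvest": {"detect": [], "prevent": ["secret-vault"]},
    "Exfiltration": {"detect": ["egress-monitor"], "prevent": ["egress-allowlist"]},
}

print(find_gaps(coverage))
print(earliest_prevention(coverage))  # (3, 'Credential Harvest')
```

In this example the chain is first broken at stage 3, so adding a preventive control at Initial Access would move the break point one stage earlier.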

For Researchers

Cite techniques using their IDs (e.g., "ATM T-2001"). The catalog is extensible: contribute new techniques via pull request, with evidence that meets the requirements in CONTRIBUTING.md.

For Vendors

Map your product's detection capabilities to ATM technique IDs. This enables customers to understand which agent-specific threats your product covers.

Attack Paths

Complete kill chain traversals demonstrated in DVAA:

Path A — API Agent Full Compromise: T-1001 → T-2001 → T-3001 → T-5004 → T-7001 → T-8002 (LegacyBot → ToolBot: recon, inject, harvest creds, pivot, enumerate files, exfiltrate)

Path B — Memory Persistence Chain: T-1001 → T-2001 → T-6001 → T-7004 → T-8005 (MemoryBot: recon, inject, persist in memory, dump memory, exfiltrate via conversation)

Path C — Multi-Agent A2A Chain: T-1006 → T-2001 → T-4002 → T-5002 → T-9001 (Orchestrator → Worker → ToolBot: discover agent card, inject, impersonate admin, pivot via A2A, modify data)

Path D — Supply Chain to Full Compromise: T-1002 → T-2005 → T-6004 → T-5003 → T-9006 (PluginBot → ProxyBot: discover tools, inject via tool description, backdoor skill, hop MCP servers, compromise downstream)
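To read a path at the tactic level, each technique ID can be translated to its kill chain stage. The sketch below assumes that the first digit of a technique ID is its stage number; this README does not state that rule, but it is consistent with the parenthetical descriptions of the four paths above. The helper names are hypothetical.

```python
import re

# Stage numbers and tactic names from the Matrix Overview table.
STAGE_TO_TACTIC = {
    1: "Reconnaissance", 2: "Initial Access", 3: "Credential Harvest",
    4: "Privilege Escalation", 5: "Lateral Movement", 6: "Persistence",
    7: "Collection", 8: "Exfiltration", 9: "Impact",
}

# Assumption: IDs look like T-SNNN, where S is the kill chain stage (1-9).
TECHNIQUE_ID = re.compile(r"^T-([1-9])\d{3}$")


def stage_of(technique_id):
    """Extract the assumed kill chain stage from a technique ID."""
    match = TECHNIQUE_ID.match(technique_id)
    if match is None:
        raise ValueError(f"not a technique ID: {technique_id!r}")
    return int(match.group(1))


def tactics_for_path(path):
    """Translate an arrow-separated path of technique IDs into tactic names."""
    return [STAGE_TO_TACTIC[stage_of(step.strip())] for step in path.split("→")]


PATH_D = "T-1002 → T-2005 → T-6004 → T-5003 → T-9006"
print(tactics_for_path(PATH_D))
# ['Reconnaissance', 'Initial Access', 'Persistence', 'Lateral Movement', 'Impact']
```

Path D shows that a path does not have to visit stages in strictly increasing order: the attacker establishes persistence (stage 6) before moving laterally (stage 5).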

Repository Structure

agent-threat-matrix/
├── README.md                 # This document
├── EVIDENCE_AUDIT.md         # Evidence tier justification for every technique
├── matrix.json               # Machine-readable matrix (full data)
├── tactics/                  # One file per tactic (kill chain stage)
├── techniques/               # One file per technique (T-XXXX)
├── attack-classes/           # One file per attack class
├── mitigations/              # OASB control mappings
├── evidence/                 # Real-world evidence reports
├── cross-references/         # MITRE ATT&CK, ATLAS, OWASP LLM mappings
├── CONTRIBUTING.md           # How to propose new techniques
├── CHANGELOG.md
└── LICENSE                   # Apache-2.0

How to Cite

When referencing individual techniques:

AI Agent Threat Matrix T-2001 (Direct Prompt Injection). OpenA2A, 2026. https://threats.opena2a.org/techniques/T-2001

When referencing the framework:

OpenA2A. "AI Agent Threat Matrix v1.0." March 2026. https://threats.opena2a.org

Contributing

See CONTRIBUTING.md for the full process. In summary:

Every new technique must have either a real-world observation or a reproducible lab scenario
Assign the next available technique ID in the appropriate stage range
Include HMA detection check ID, DVAA validation (if applicable), and OASB control mapping
Submit as a pull request with evidence documentation
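Choosing the next available ID can be sketched as follows. This assumes IDs have the form T-SNNN (stage digit followed by a three-digit sequence number) and that "next available" means one past the highest sequence number in use for that stage; CONTRIBUTING.md is the authoritative source for the actual rule. `next_technique_id` is a hypothetical helper.

```python
def next_technique_id(existing_ids, stage):
    """Return the next unused ID for a stage, assuming the T-SNNN format."""
    if not 1 <= stage <= 9:
        raise ValueError("stage must be between 1 and 9")
    prefix = f"T-{stage}"
    # Sequence numbers already used in this stage, e.g. "T-2008" -> 8.
    used = [
        int(tid[3:]) for tid in existing_ids
        if tid.startswith(prefix) and len(tid) == 6 and tid[3:].isdigit()
    ]
    next_seq = max(used, default=0) + 1
    if next_seq > 999:
        raise ValueError(f"stage {stage} has no free IDs left")
    return f"T-{stage}{next_seq:03d}"


# Initial Access IDs that appear elsewhere in this README (not the full catalog).
known = ["T-2001", "T-2003", "T-2004", "T-2005", "T-2007", "T-2008"]
print(next_technique_id(known, 2))  # T-2009
```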

License

Apache-2.0. The framework is free to use, cite, and build upon.

Related Projects

Project	Role
HackMyAgent	Automated detection for ATM techniques (204 checks, 115 payloads)
DVAA	Lab validation environment (10 agents, 14 challenges)
OASB	Defensive benchmark (72 controls)
AI Agent Kill Chain	Tactical framework (the 9-stage progression model)

Maintained by OpenA2A. Contributions welcome via pull request.
