Skip to content

Latest commit

 

History

History
934 lines (656 loc) · 52.3 KB

File metadata and controls

934 lines (656 loc) · 52.3 KB

AARTS: An Open Standard for AI Agent Runtime Safety

Version 0.1 — Draft Date: 2026-02-24

Status: Draft — feedback welcome. This is an early release of the AARTS specification. We invite comments, corrections, and suggestions from the community. Nothing in this document is final. Governed under Section 13.


1. Abstract

AI agents — whether embedded in IDEs, orchestration frameworks, or standalone applications — can execute shell commands, write and modify files, make network requests, install packages, spawn sub-agents, and interact with external services, often with minimal human oversight. Today, each agent platform implements its own ad-hoc security controls (if any), creating fragmented protection with no guarantee of feature parity across environments.

AARTS (An Open Standard for AI Agent Runtime Safety) defines a vendor-neutral set of lifecycle hook points, a common data model, and behavioral requirements that agent host platforms must expose and security vendors must implement to provide consistent, in-depth security across all agentic environments.

The standard is designed so that:

  • Host vendors (IDE makers, framework authors, orchestration platforms) know exactly which hooks to expose and what data to provide at each hook point.
  • Security vendors can build a single evaluation engine that works across all compliant hosts with feature parity.
  • Users and enterprises get consistent protection regardless of which agent platform they use.

AARTS is not a security product. It is a contract between hosts and security engines.


2. Scope

2.1 In Scope

  • IDE-based agents: Cursor, Claude Code, Windsurf, GitHub Copilot, Cline, Continue, and similar AI coding assistants embedded in editors.
  • Agentic frameworks: OpenClaw, LangChain, CrewAI, AutoGen, OpenAI Agents SDK, and similar libraries for building agent applications.
  • Orchestration platforms: Systems that manage multiple agents, route tasks, and coordinate workflows (e.g., n8n, Glean, Zapier).
  • MCP (Model Context Protocol) interactions: Agents connecting to and invoking tools on MCP servers.

2.2 Out of Scope

  • Specific security product implementations (detection algorithms, threat databases, scoring models).
  • LLM provider-side safety mechanisms (content filters, RLHF guardrails).
  • Network-level security controls (firewalls, proxies, DLP).
  • Authentication and authorization protocols for agent platforms themselves. (Note: host↔engine channel security is in scope — see Section 10.1.)

2.3 Design Goals

Goal Description
Feature parity A security vendor implementing AARTS can provide identical protection across all compliant hosts.
Minimal host burden Hosts expose hooks and provide data; they do not implement security logic.
Configurable failure policy By default, security engine failures do not block the user's workflow (fail-open). Hosts MUST allow this to be configured to fail-closed for high-security environments.
Forward compatibility The hook point model is extensible. New hook points can be added without breaking existing integrations.
Implementation freedom The standard defines interfaces, not implementations. Security vendors choose their own detection strategies.

3. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Term Definition
Agent An AI system that can autonomously plan and execute actions (tool calls) on behalf of a user.
Host The platform that runs the agent — an IDE, framework, orchestrator, or application. The host is responsible for exposing hook points.
Security engine A vendor-provided component that receives hook events, evaluates them, and returns verdicts. Also referred to as "engine."
Hook point A defined moment in the agent lifecycle where the host fires an event to the security engine and (optionally) awaits a verdict before proceeding.
Hook event The data payload delivered to the security engine at a hook point.
Verdict The security engine's response: an enforcement decision with supporting metadata.
Artifact A security-relevant element extracted from a hook event — a URL, command, file path, code snippet, prompt text, or package reference.
Adapter A thin, host-specific module that translates between the host's native event format and the AARTS data model.
Tool Any capability the agent can invoke: shell execution, file I/O, web requests, MCP calls, code execution, etc.
Sub-agent An agent spawned by another agent to handle a delegated task.
Skill / Agent definition A configuration file (often Markdown with YAML frontmatter) that defines an agent's behavior, instructions, and hook registrations.
System prompt The foundational instructions provided to the LLM, typically set by the host or application author.
Custom instructions User-provided or project-provided instructions that augment the system prompt (e.g., rules files, CLAUDE.md, .cursorrules).
Context injection Any content added to the LLM prompt that is not the user's direct input — includes system prompts, custom instructions, file contents, tool results, and hook-injected context.

4. Architecture Overview

AARTS defines a hook-based interposition architecture. The host fires events at defined lifecycle points; the security engine evaluates them and returns verdicts that the host enforces.

┌─────────────────────────────────────────────────────────────────────┐
│                          Agent Host                                 │
│  (IDE, framework, orchestrator)                                     │
│                                                                     │
│   Lifecycle event occurs                                            │
│         │                                                           │
│         ▼                                                           │
│   ┌──────────────┐     ┌──────────────┐     ┌─────────────────┐     │
│   │  Hook point  │───▶│   Adapter    │────▶│ Security engine │     │
│   │  (host-side) │     │ (normalize)  │     │ (evaluate)      │     │
│   └──────────────┘     └──────────────┘     └────────┬────────┘     │
│                                                      │              │
│                                                      ▼              │
│                                                 ┌──────────┐        │
│                                                 │ Verdict  │        │
│                                                 └────┬─────┘        │
│                                                      │              │
│                                                      ▼              │
│                                            ┌──────────────────┐     │
│                                            │  Host enforces   │     │
│                                            │  (allow / deny / │     │
│                                            │   ask / modify)  │     │
│                                            └──────────────────┘     │
└─────────────────────────────────────────────────────────────────────┘

4.1 Integration Models

Hosts MAY integrate with security engines via any of the following transport mechanisms:

Model Description Typical use
Subprocess Host spawns engine as a child process; communicates via stdin/stdout JSON. IDE extensions, CLI-based agents
In-process Engine runs in the same process as the host; invoked via function calls or event callbacks. Framework plugins, embedded agents
Remote Engine runs as a separate service; communicates via HTTP, gRPC, or WebSocket. Enterprise deployments, multi-agent orchestrators

Integration considerations:

  • Subprocess: Hosts MAY keep the engine process alive to avoid startup latency. Host MUST handle engine crashes gracefully (fail-open).
  • In-process: Lowest latency but tightest coupling — engine bugs can crash the host. Hosts SHOULD isolate engine calls with error boundaries. Engine has access to host process memory; trust implications must be considered.
  • Remote: Strongest isolation; enables centralized policy. Hosts MUST implement timeouts, the configured failure policy (Section 9.1), and channel security (Section 10.1). Privacy implications of transmitting hook data over the network MUST be addressed.

4.2 Version Compatibility

Before the first hook event is delivered, the host and the security engine MUST establish that they support a compatible AARTS specification version.

4.2.1 Protocol

  1. The host sends a version check to the engine indicating its AARTS version (e.g., "1").
  2. The engine responds with whether it supports that version (compatible: true/false) and its engine_id.
  3. If the engine does not support the host's AARTS version, the host MUST log the failure and MUST NOT send hook events to that engine. The host SHOULD inform the user that the engine could not be activated.

The transport mechanism follows the integration model (Section 4.1): JSON over stdin/stdout for subprocess, function call for in-process, HTTP/gRPC endpoint for remote.

4.2.2 Version Check Request

Sent by the host to the engine at initialization.

Field Type Req Description
aarts_version string MUST The AARTS major version the host implements (e.g., "1").

4.2.3 Version Check Response

Returned by the engine.

Field Type Req Description
compatible boolean MUST Whether the engine supports the requested AARTS version.
engine_id string MUST The engine's identifier.
engine_version string SHOULD The engine's software version.

If compatible is false, the host and engine cannot interoperate. The host MUST NOT send hook events to that engine.


5. Agent Lifecycle Model

An agent interaction follows a lifecycle with well-defined phases. AARTS defines hook points at each phase where security evaluation can occur.

┌─────────────────────────────────────────────────────────────────────┐
│                                                                     │
│  SESSION START                                                      │
│    ├── Configuration & plugin loading                               │
│    ├── Skill / agent definition loading                             │
│    └── MCP server connections                                       │
│                                                                     │
│  TURN (repeats per user interaction)                                │
│    ├── User input received                                          │
│    ├── Input processing (hooks, transformations)                    │
│    ├── Prompt assembly (system prompt + context + instructions      │
│    │     + conversation history + processed input)                  │
│    ├── LLM request sent                                             │
│    ├── LLM response received                                        │
│    └── Action execution loop:                                       │
│         ├── Tool call planned                                       │
│         ├── Tool executed                                           │
│         ├── Tool result returned                                    │
│         ├── Result fed back to LLM                                  │
│         └── (repeat until LLM produces final text response)         │
│                                                                     │
│  COMPACTION (when context approaches token limit)                   │
│    ├── Context compaction triggered                                 │
│    ├── Context summarized / truncated / selectively pruned          │
│    └── Compacted context used for subsequent turns                  │
│                                                                     │
│  SUB-AGENT (optional, may nest)                                     │
│    ├── Sub-agent spawn requested                                    │
│    ├── Sub-agent session (follows same TURN lifecycle)              │
│    └── Sub-agent result returned to parent                          │
│                                                                     │
│  SESSION END                                                        │
│    ├── Cleanup                                                      │
│    └── Summary / audit finalization                                 │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

6. Data Model

This section defines the AARTS data structures in a language-agnostic notation. Field types use standard JSON types (string, number, boolean, object, array).

6.1 Event Envelope

Every hook event shares these base fields. Hosts MUST include all required envelope fields on every event. Security engines can rely on their presence without per-hook checks.

Field Type Req Description
hook_point string MUST The hook point name (e.g., "PreToolUse", "PreLLMRequest").
session_id string MUST Unique identifier for this agent session.
timestamp string (ISO 8601) MUST When the event was generated.
host_id string MUST Host platform identifier (e.g., "cursor", "claude-code", "openclaw"). See naming convention below.
host_version string SHOULD Version of the host platform.
aarts_version string MUST AARTS specification version (e.g., "0.1").
locale string (BCP 47) SHOULD The user's preferred language (e.g., "en", "cs", "ja"). Security engines SHOULD return human-readable fields (e.g., reasons) in the requested locale when available and MAY fall back to English.

host_id naming convention. The host_id value is self-assigned by the host. It is not drawn from a central registry. Hosts MUST use lowercase kebab-case matching the pattern [a-z0-9]+(-[a-z0-9]+)* (e.g., "claude-code", "cursor", "n8n"). The chosen identifier SHOULD remain stable across host versions so that audit logs and policy rules can reference it reliably.

6.2 Standard Extensions

Hook events are composed from the base envelope plus zero or more standard extensions — reusable field groups that multiple hook points share. Each hook point definition (Section 8) declares which extensions it includes.

Implementations SHOULD define these extensions as composable structures (mixins, traits, embedded structs, or equivalent) so that common handling logic is written once.

6.2.1 Turn Context

Included by events that occur within a specific conversation turn.

Field Type Req Description
turn_id string MUST Identifier for this conversation turn within the session.

6.2.2 Tool Context

Included by events related to tool invocation. Provides the tool identity, its full input, and extracted security artifacts.

Field Type Req Description
tool_name string MUST Canonical tool name from the standard taxonomy (Section 7). Use "other" when no category applies.
tool_name_native string SHOULD The host's original tool name (e.g., "WebFetch", "str_replace"). MUST be provided when tool_name is "other".
tool_input object MUST The complete tool input payload as provided by the agent.
artifacts array of Artifact MUST Security-relevant artifacts extracted from the tool input (Section 6.4).

6.2.3 Trust Source

Included by events that evaluate a component originating from a trust boundary — plugins, skills, MCP servers, or modifying hooks. Enables the security engine to apply different policies based on origin.

Field Type Req Description
source string MUST Origin classification: "managed", "project", "user", "marketplace".
source_id string SHOULD Identifier of the specific source (plugin name, file path, server URI, hook ID).

6.2.4 Modification Tracking

Included by events where content has been modified by another hook or processor. Enables the security engine to detect unauthorized tampering by comparing the original and modified content.

Field Type Req Description
original_value string or object MUST The content before modification.
modified_value string or object MUST The content after modification.
was_modified boolean MUST Whether any modification occurred.
modifier_chain array of ModifierEntry MUST Ordered list of hooks/processors that touched the content.

ModifierEntry:

Field Type Req Description
modifier_id string MUST Hook or processor identifier.
config_source string MUST Where the modifier was configured: "managed", "project", "user", "plugin".
modified boolean MUST Whether this specific modifier changed the content.

6.2.5 Provenance

Included by events where understanding the causal chain behind an action improves security evaluation — for example, a tool call is more suspicious if the prompt was tampered with.

Field Type Req Description
source_prompt_modified boolean SHOULD Whether the user's input was modified by hooks before the agent processed it.
active_skill string SHOULD The skill/agent definition currently active in the agent's context.
is_sub_agent boolean SHOULD Whether this event originates from a sub-agent.
parent_session_id string SHOULD If sub-agent, the parent session's identifier.
input_modifier_chain array of ModifierEntry SHOULD Hooks that modified the user's input for this turn.

6.3 Hook Point Reference

Consolidated reference for all hook points. Hook-specific fields are defined in Section 8.

Hook Point Phase Blocking Extensions Level
SessionStart Session Recommended Basic
SessionEnd Session Notification Comprehensive
PrePluginLoad Config Required Trust Source Standard
PreSkillLoad Config Required Trust Source Standard
PreMCPConnect Config Required Trust Source Comprehensive
PreUserInput Prompt Required Turn Comprehensive
PostInputProcessing Prompt Required Turn, Modification Standard
PreLLMRequest Prompt Required Turn Standard
PostLLMResponse Prompt Required Turn Comprehensive
PreToolUse Tool Required Turn, Tool, Provenance Basic
PostToolUse Tool Recommended Turn, Tool Standard
PreToolInputModification Tool Required Trust Source, Modification Comprehensive
PreSubAgentSpawn Sub-agent Required Comprehensive
PostSubAgentResult Sub-agent Recommended Comprehensive
PreOutputDeliver Output Recommended Turn Comprehensive
PreMemoryWrite Memory Required Comprehensive
PreMemoryRead Memory Recommended Comprehensive
PreCompact Prompt Required Turn Comprehensive
PostCompact Prompt Required Turn, Modification Comprehensive
PostAskResolution Verdict Notification Turn Standard

6.4 Artifact

A security-relevant element extracted from a hook event.

Field Type Req Description
type string MUST Artifact type (see table below).
value string MUST The raw artifact value.
context string MAY Hint about how the artifact was found or used (e.g., "base64_encoded", "piped_to_shell", "from_file_content", "in_system_prompt").
source_field string MAY Which event field this artifact was extracted from.

Artifact types:

Type Description
url A URL found in commands, file content, prompts, or tool inputs.
command A shell command or command fragment.
file_path A file system path being read, written, or deleted.
content A body of text (file content, code, prompt text). Implementations SHOULD cap at a reasonable size (e.g., 64 KB).
package A software package reference (name, version, registry).
prompt_fragment A segment of prompt text flagged for inspection.
credential A detected secret, token, or key. The value SHOULD be masked/redacted.
network_address An IP address, hostname, or connection endpoint.

6.4.1 Verbatim Data

Hosts MUST provide artifact data verbatim — exactly as the agent or tool expressed it, without normalization or transformation. Artifact extraction, decoding (Base64, URL-encoding, etc.), path resolution, and other normalization is the security engine's responsibility, not the host's.

This design principle ensures that:

  • Hosts remain simple and do not introduce subtle behavioral differences in corner cases.
  • Security engines have full control over how artifacts are interpreted.
  • The same raw data is available to all engines regardless of host.

Hosts MUST populate the source_field when the artifact is extracted from a specific event field, to support audit and debugging.

6.5 Verdict

The security engine's response to a hook event.

Field Type Req Description
decision string MUST One of: "allow", "deny", "ask".
category string MUST Threat category (e.g., "tool_execution", "network_egress", "supply_chain", "prompt_injection", "data_exfiltration", "persistence", "privilege_escalation").
severity string MUST One of: "info", "warning", "critical".
confidence number MAY Engine confidence in this verdict, 0.0 to 1.0.
source string MUST Which evaluation method produced this verdict (e.g., "heuristic", "reputation_check", "pattern_match", "behavioral_analysis").
artifacts array of string SHOULD Key artifact values involved in the verdict.
matched_rule_id string MAY Identifier of the matched detection rule, if applicable.
reasons array of string SHOULD Human-readable explanations suitable for display to the user.
directives object MAY Hook-specific directives (e.g., strip_hooks for PreSkillLoad).
suggested_resolutions array of string MAY For ask verdicts only. Resolution options the host SHOULD present beyond approve/deny. Standard values: "allow_always", "allow_similar", "deny_always", "deny_similar". See Section 6.6.

6.6 Decision Semantics

Decision Meaning
allow The action may proceed without intervention.
deny The action MUST be blocked. The host MUST surface reasons to the user.
ask The action is suspicious but not definitively malicious. The host SHOULD prompt the user for confirmation. If the host does not support interactive prompting, ask MUST be treated as deny.

Ask resolution options — When a verdict includes suggested_resolutions, the host SHOULD present these as additional choices alongside the standard approve/deny prompt. The host MUST fire PostAskResolution (Section 8.8.1) after the user responds so the security engine can update its internal state.

Resolution Meaning
approved The user approves this specific action.
denied The user denies this specific action.
allow_always The user wants to permanently allow this exact artifact. The engine SHOULD add it to its allowlist.
allow_similar The user wants to allow artifacts matching a generalized pattern. The engine determines the scope.
deny_always The user wants to permanently block this exact artifact. The engine SHOULD add it to its blocklist.
deny_similar The user wants to block artifacts matching a generalized pattern. The engine determines the scope.

Hosts MUST NOT interpret resolution semantics themselves — they pass the raw choice to the engine via PostAskResolution and the engine decides how to act on it. This preserves the "minimal host burden" principle.

Merge precedence — When multiple evaluation signals apply to a single hook event, the strictest decision wins: deny > ask > allow.

Confidence thresholds — Security engines SHOULD support configurable sensitivity levels that control when a deny is downgraded to ask. This allows teams to tune the aggressiveness of blocking. Example presets:

Preset Effect
Strict Low confidence threshold; more actions are blocked. Suitable for high-security environments.
Balanced Moderate confidence required for deny. Recommended default.
Permissive High confidence required for deny; most suspicious actions become ask instead. Suitable for exploration/development.

The specific threshold values are an implementation choice for the security engine, not prescribed by this specification.


7. Standard Tool Taxonomy

To achieve feature parity across hosts, AARTS defines a standard tool taxonomy. Host adapters MUST map host-specific tool names to these canonical categories when populating the Tool Context extension.

Category Canonical name Description Examples from hosts
Shell execution shell Execute a shell command. Bash, exec, Shell, terminal
File write file_write Create or overwrite a file. Write, write, create_file
File edit file_edit Modify part of an existing file. Edit, edit, apply_patch, str_replace
File read file_read Read file contents. Read, read, cat
File delete file_delete Delete a file. Delete, rm
Web request web_request Fetch content from a URL. WebFetch, web_fetch, curl, http_request
MCP tool call mcp_call Invoke a tool on an MCP server. MCP, mcp_tool
Code execution code_exec Execute code in a sandboxed environment. python_exec, jupyter, eval
Package install package_install Install a software package. (Derived from shell commands: npm install, pip install, etc.)
Browser action browser Interact with a web browser. browser_navigate, click, type, screenshot
Database query database Execute a database query. sql_query, db_exec
Agent delegation delegate Delegate a task to a sub-agent. Task, spawn_agent
Other other A tool that does not fit any canonical category. (host-specific)

Hosts MUST include the canonical name in the Tool Context tool_name field. When no canonical category applies, hosts MUST use "other" and MUST populate the tool_name_native field so security engines can still evaluate the tool. Hosts SHOULD populate tool_name_native for all tool calls, not only "other".


8. Hook Points

Each hook event is composed of the event envelope (Section 6.1), zero or more standard extensions (Section 6.2, summarized in Section 6.3), and hook-specific fields defined per hook below.

8.0 Verdict Enforcement

Each hook specifies a blocking mode. Decision semantics (allow, deny, ask) are defined in Section 6.6.

Mode Host behavior
Required Host MUST wait for verdict and enforce the decision. If host lacks interactive prompting, ask is treated as deny.
Recommended Same as Required, but host MAY proceed with a warning if the engine times out.
Notification Host fires the event and continues without waiting. For audit only.

Hooks using the Modification Tracking extension have different deny semantics: the modification is rejected and the original value is used, rather than blocking the underlying action.


8.1 Session Lifecycle Hooks

8.1.1 SessionStart

Session initialization, before the agent accepts its first input.

Blocking: Recommended | Extensions:

Field Type Req Description
plugins array SHOULD Installed plugins/extensions with metadata (name, version, source, file hashes).
configuration_sources array SHOULD Configuration files loaded and their origins (user, project, managed).
environment object MAY Sanitized environment metadata (OS, shell, working directory). Secrets MUST be redacted.

8.1.2 SessionEnd

Session is ending.

Blocking: Notification | Extensions:

Field Type Req Description
summary object SHOULD Session summary: total tool calls, verdicts issued, denied actions.

8.2 Configuration & Plugin Hooks

8.2.1 PrePluginLoad

A plugin or extension is about to be loaded.

Blocking: Required | Extensions: Trust Source

Field Type Req Description
plugin_id string MUST Plugin identifier (name, version).
file_paths array of string MUST Files that comprise the plugin.
file_hashes array of string SHOULD SHA-256 hashes of plugin files.
declared_hooks array MUST Hook events the plugin registers for, with handler commands/scripts.
declared_permissions array SHOULD Permissions the plugin requests (tool access, network, file system scopes).

8.2.2 PreSkillLoad

A skill or agent definition file is about to be loaded.

Blocking: Required | Extensions: Trust Source

Field Type Req Description
file_path string MUST Path to the skill/agent definition file.
file_hash string SHOULD SHA-256 hash of the file.
frontmatter object MUST Parsed structured metadata from the file (e.g., YAML frontmatter).
declared_hooks array MUST Any hooks defined in the file's metadata.
body_content string SHOULD The instruction/knowledge content of the skill.
load_context string MUST Who triggered the load: "main_session", "sub_agent", "skill_invocation".

The verdict MAY include a strip_hooks directive: if true, the host SHOULD load the skill's instruction content but discard its hook definitions.

8.2.3 PreMCPConnect

The agent is about to connect to an MCP server.

Blocking: Required | Extensions: Trust Source

Field Type Req Description
server_id string MUST MCP server identifier (name or URI).
transport string MUST Connection type: "stdio", "sse", "streamable_http".
server_uri string SHOULD Network address for remote MCP servers.
server_command string SHOULD For stdio transport, the command used to launch the server.
declared_tools array SHOULD Tools the MCP server advertises.
declared_resources array SHOULD Resources the MCP server advertises.

8.3 Prompt Lifecycle Hooks

8.3.1 PreUserInput

The user has submitted input, before any hooks or processing modify it.

Blocking: Required | Extensions: Turn

Field Type Req Description
raw_input string MUST The user's unmodified input text.
input_source string SHOULD Where the input came from: "keyboard", "file", "api", "automated".
attached_files array SHOULD Any files or images attached to the input.

8.3.2 PostInputProcessing

After all input-processing hooks have run, but before prompt assembly. This is the prompt integrity checkpoint. The Modification Tracking extension carries the raw input as original_value and the processed input as modified_value.

Blocking: Required | Extensions: Turn, Modification Tracking

(No hook-specific fields — fully covered by extensions.)

8.3.3 PreLLMRequest

The full prompt has been assembled and is about to be sent to the LLM. This is the only point with full visibility into everything influencing the agent's behavior.

Blocking: Required | Extensions: Turn

Field Type Req Description
full_prompt string or array MUST The complete assembled prompt (all messages/turns).
system_prompt string SHOULD The system prompt component, isolated.
custom_instructions array SHOULD Custom instructions from all sources, each with provenance (user, project, managed).
context_injections array SHOULD Content injected by hooks, tools, or context providers, each with its source.
user_input string SHOULD The user's input as it appears in the assembled prompt.
model_id string SHOULD The LLM model being targeted.
token_count number MAY Estimated token count of the full prompt.

8.3.4 PostLLMResponse

The LLM has returned a response, before the host parses it into actions or displays it.

Blocking: Required | Extensions: Turn

Field Type Req Description
response_text string MUST The LLM's full response text.
planned_actions array SHOULD Tool calls the LLM has requested, parsed into structured form.
stop_reason string SHOULD Why the LLM stopped generating (e.g., "end_turn", "tool_use", "max_tokens").

8.3.5 PreCompact

The host is about to compact (compress/summarize) the conversation context — typically because the context is approaching the LLM's token limit. This is a security-critical event: compaction can discard security-relevant context (prior deny verdicts, provenance signals, user instructions) and an attacker who can influence what is retained vs. discarded can manipulate future agent behavior.

Blocking: Required | Extensions: Turn

Field Type Req Description
context_before string or array MUST The full conversation context about to be compacted.
token_count_before number SHOULD Token count of the context before compaction.
compaction_reason string SHOULD Why compaction is occurring: "token_limit", "user_triggered", "automatic", "policy".
compaction_strategy string SHOULD The method the host will use: "summarization", "truncation", "selective", "hybrid".

On deny, the host MUST NOT proceed with compaction. This allows the security engine to block compaction if it determines that critical security context would be lost.

8.3.6 PostCompact

Compaction has completed, before the compacted context is used for subsequent LLM requests. The Modification Tracking extension carries the original context as original_value and the compacted context as modified_value. The security engine can diff these to detect whether critical security signals were lost.

Blocking: Required | Extensions: Turn, Modification Tracking

Field Type Req Description
token_count_after number SHOULD Token count of the context after compaction.
compaction_strategy_used string MAY The compaction method actually used (may differ from compaction_strategy declared in PreCompact).

On deny, the host MUST discard the compacted context and revert to the original (pre-compaction) context. The engine may deny if it detects that critical security signals were lost during compaction.

Context injection directive. On an allow verdict, the security engine MAY include an inject_context directive to reinject security-critical context that was lost during compaction (e.g., prior deny verdicts, active warnings, security constraints). The host MUST incorporate this content into the compacted context before using it for subsequent LLM requests.

Directive field Type Description
inject_context.position string Where to add the content: "prepend", "append", or "system".
inject_context.content string The security context to inject.

When position is "system", the host SHOULD add the content as a system-level message or instruction, separate from the conversation flow. When "prepend" or "append", the host adds it to the conversation context at the specified position.

Example verdict:

{
  "decision": "allow",
  "directives": {
    "inject_context": {
      "position": "prepend",
      "content": "[Security context preserved from compaction]\n- Tool call to curl malware.com was DENIED at 14:32\n- User instructed: never execute code from untrusted sources\n[End security context]"
    }
  }
}

8.4 Tool Execution Hooks

8.4.1 PreToolUse

The agent has decided to invoke a tool and the host is about to execute it. This is the primary enforcement point.

Blocking: Required | Extensions: Turn, Tool, Provenance

(No hook-specific fields — fully covered by extensions.)

8.4.2 PostToolUse

A tool has finished executing, before the result is returned to the LLM.

Blocking: Recommended | Extensions: Turn, Tool

Field Type Req Description
tool_result string or object MUST The tool's output/result.
exit_code number SHOULD For command execution, the process exit code.
execution_duration_ms number SHOULD How long the tool took to execute.
side_effects array MAY Observable side effects: files modified, network connections, processes spawned.

On deny, the tool result is redacted or sanitized before returning to the LLM rather than blocking the (already completed) action.

8.4.3 PreToolInputModification

A hook or plugin is about to modify a tool's input between the agent's planned invocation and actual execution.

Blocking: Required | Extensions: Trust Source, Modification Tracking

Field Type Req Description
tool_name string MUST The tool whose input is being modified.

8.5 Sub-Agent Hooks

8.5.1 PreSubAgentSpawn

An agent requests to spawn a sub-agent, before the sub-agent session is created.

Blocking: Required | Extensions:

Field Type Req Description
sub_agent_id string MUST Proposed identifier for the sub-agent session.
sub_agent_type string SHOULD Type or role (e.g., "explore", "code", "shell", "general").
delegated_task string MUST The task/prompt being delegated.
skill_files array of string SHOULD Skills/agent definitions the sub-agent will load.
declared_hooks array SHOULD Hooks that would be activated by the sub-agent's skills.
requested_tools array of string SHOULD Tools the sub-agent will have access to.
parent_permissions object SHOULD The parent agent's current permission scope, for escalation detection.

Least-privilege directive. On an allow verdict, the security engine MAY include a permitted_tools directive — an array of canonical tool names (Section 7) that the sub-agent is allowed to use. The host MUST enforce this restriction: any tool call by the sub-agent whose tool_name is not in the permitted_tools list MUST be treated as deny without consulting the security engine. If the directive is absent, the sub-agent inherits the parent's full permission scope (which remains the upper bound — see Section 11, Level 3).

Example verdict directive:

{
  "decision": "allow",
  "directives": {
    "permitted_tools": ["file_read", "web_request"]
  }
}

8.5.2 PostSubAgentResult

A sub-agent has completed its task and is about to return its result to the parent agent.

Blocking: Recommended | Extensions:

Field Type Req Description
sub_agent_id string MUST Sub-agent session identifier.
result string or object MUST The sub-agent's output being returned to the parent.
tool_calls_made array SHOULD Summary of tool calls the sub-agent executed.
verdicts_issued array SHOULD Summary of security verdicts during the sub-agent's session.

8.6 Output Hooks

8.6.1 PreOutputDeliver

The agent's final response for a turn is about to be displayed to the user.

Blocking: Recommended | Extensions: Turn

Field Type Req Description
output_text string MUST The text being delivered to the user.
actions_taken array SHOULD Summary of tool calls executed during this turn.

8.7 Memory & Persistence Hooks

8.7.1 PreMemoryWrite

The agent is about to write to persistent memory (vector store, knowledge base, preference store).

Blocking: Required | Extensions:

Field Type Req Description
memory_store string MUST Identifier of the target memory store.
content string or object MUST The content being written.
memory_type string SHOULD Type: "fact", "instruction", "preference", "conversation", "code_snippet".
write_source string SHOULD What triggered the write: "agent_decision", "user_request", "automatic".

8.7.2 PreMemoryRead

The agent is about to read from persistent memory to incorporate into its context.

Blocking: Recommended | Extensions:

Field Type Req Description
memory_store string MUST Identifier of the source memory store.
query string SHOULD The retrieval query used.
retrieved_content string or object MUST The content being loaded into context.

8.8 Verdict Resolution Hooks

8.8.1 PostAskResolution

The user has responded to an ask prompt. Fired regardless of whether the user approved or denied the action. This gives the security engine a feedback signal to update allowlists, blocklists, tune confidence thresholds, or feed behavioral models.

Blocking: Notification | Extensions: Turn

Field Type Req Description
original_hook_point string MUST The hook point that produced the ask verdict (e.g., "PreToolUse").
original_verdict object MUST The full verdict that triggered the prompt (Section 6.5).
resolution string MUST The user's choice: "approved", "denied", "allow_always", "allow_similar", "deny_always", "deny_similar".

The engine determines the scope of *_always and *_similar resolutions from the original verdict's artifacts and context. The host MUST NOT interpret resolution semantics.


9. Behavior Requirements

9.1 Failure Policy

Hosts MUST support a configurable failure policy that determines behavior when the security engine crashes, times out, or returns an invalid response. Two modes are defined:

Mode Behavior
fail-open (default) Engine failures are treated as allow. The action proceeds and the failure is logged.
fail-closed Engine failures are treated as deny. The action is blocked and the failure is logged.
  • Hosts MUST default to fail-open unless explicitly configured otherwise.
  • In either mode, only an explicit decision: "deny" from a successful evaluation — or a fail-closed failure — may block an action.
  • Hosts MUST log all engine failures regardless of the configured mode.
  • Hosts SHOULD configure a timeout for blocking hook points (RECOMMENDED: 5 seconds for tool calls, 10 seconds for session-level hooks).
  • Security note: In fail-open mode, an attacker who can crash or overload the security engine effectively bypasses all protection. Deployments with elevated threat models SHOULD use fail-closed.

9.2 Performance

  • Security engines SHOULD minimize evaluation latency. Hosts SHOULD enforce configurable timeouts on blocking hook points; if the security engine does not respond within the timeout, the host MUST apply the configured failure policy (Section 9.1).

9.3 Audit Logging

Compliant implementations SHOULD maintain audit logs:

  • All deny and ask decisions MUST be logged.
  • allow decisions SHOULD be logged at a configurable sampling rate.
  • Log entries SHOULD include: timestamp, session ID, hook point, decision, category, severity, matched rule ID, and key artifacts.
  • Logs MUST NOT contain unredacted secrets or credentials.

9.4 Privacy

  • Security engines MUST NOT transmit prompt content, tool inputs, or tool outputs to external services without explicit user consent and disclosure.
  • Reputation checks (URL, package, file hash) SHOULD use privacy-preserving techniques where possible (e.g., hash-based lookups rather than transmitting full URLs).
  • Audit logs SHOULD be stored locally by default. Remote log shipping MUST be opt-in.

10. Security Considerations

10.1 Configuration Trust

Hook configuration files from project repositories or external sources MUST be treated as untrusted by default. Hosts MUST warn users when project-level hooks are detected.

10.2 Configuration Integrity

Hosts MUST ensure users are aware of any changes to hook configuration that occur outside of their own actions. This includes:

  1. Changes made by other processes or users on the system.
  2. Changes introduced via repository updates (e.g., git pull).
  3. Changes made by plugins, extensions, or the agent itself.

When such changes are detected, the host SHOULD prompt the user for acknowledgment before the modified configuration takes effect.


11. Conformance Levels

To support incremental adoption, AARTS defines three conformance levels:

Level 1 — Basic

Minimum viable security hook support.

Requirement Details
Required hooks SessionStart, PreToolUse
Required tool coverage At minimum: shell, file_write, file_edit, web_request, mcp_call, package_install
Verdict enforcement allow and deny. ask MAY be treated as deny.

Level 2 — Standard

Comprehensive tool coverage plus prompt lifecycle protection.

Requirement Details
Required hooks All Level 1 hooks, plus: PrePluginLoad, PreSkillLoad, PostInputProcessing, PreLLMRequest, PostToolUse, PostAskResolution
Required tool coverage All standard tool categories (Section 7).
Verdict enforcement Full allow/deny/ask support with interactive prompting.
Ask resolution Host MUST support suggested_resolutions from ask verdicts and fire PostAskResolution after user response.
Provenance PreToolUse events MUST include the Provenance extension.

Level 3 — Comprehensive

Full lifecycle coverage for high-security environments.

Requirement Details
Required hooks All Level 2 hooks, plus: PreUserInput, PostLLMResponse, PreToolInputModification, PreSubAgentSpawn, PostSubAgentResult, PreMCPConnect, PreMemoryWrite, PreMemoryRead, PreCompact, PostCompact, PreOutputDeliver, SessionEnd
Sub-agent security The parent session's security policies are the ceiling (upper bound) for sub-agents. Sub-agents MUST NOT exceed the parent's permissions but MAY be further restricted. Security engines MAY return a permitted_tools directive on PreSubAgentSpawn verdicts to enforce least privilege (see Section 8.5.1).
Memory protection Memory read/write hooks enforced.

Threat coverage by conformance level is detailed in Appendix A.3.


12. Multiple Hooks

Hosts MAY support multiple hooks registered for the same hook point. These hooks can serve various purposes — security engines, logging, analytics, workflow automation, or custom integrations.

12.1 Default Behavior

When multiple hooks are registered for the same hook point:

  • All hooks are invoked. The host MUST deliver the hook event to all registered hooks.
  • Default: permissive (non-blocking). By default, the merged verdict is allow unless the host or user has configured stricter behavior. This ensures that adding hooks does not unexpectedly block agent workflows.
  • User/host control. Hosts SHOULD allow users to configure how verdicts are merged. Example policies:
    • Permissive (default): All hooks must return deny for the merged verdict to be deny.
    • Strict: Any hook returning deny results in a merged deny.
    • Majority: The majority decision wins.

12.2 Transparency

  • When a merged verdict results in deny or ask, the host SHOULD surface to the user which hook(s) triggered the decision and the reasons from those hooks.
  • Hosts SHOULD log the individual decision of each hook that was evaluated for audit purposes.

13. Governance

This section is intentionally lightweight. AARTS is a new proposal and defining a full governance model before a community exists would be premature. We welcome the community to help shape the rules as the standard matures.

13.1 Interim Stewardship

Until a formal governance body is established, the original authors of this specification act as interim stewards. The stewards' role is to:

  • Maintain the specification repository and accept contributions.
  • Facilitate community discussion on proposed changes.
  • Publish new versions of the specification.

Stewardship is a temporary arrangement. The stewards commit to transitioning governance to a community-elected body or an established standards organization once there is sufficient adoption and participation to justify it.

13.2 Contributing

Contributions are welcome via the project's public GitHub repository. To propose a change:

  1. Open an issue describing the problem and suggested approach.
  2. Submit a pull request with the proposed specification text.
  3. The stewards will review, discuss with the community, and merge or close with a written rationale.

13.3 Licensing

13.4 Future Governance

The following topics are deferred until the community is ready to address them:

  • Formal working group structure and decision-making process.
  • Versioning policy and release cadence.
  • Patent commitments and contributor agreements.
  • Submission to a standards body (e.g., OpenSSF, OASIS Open).

We invite interested parties to participate in shaping these decisions. Open an issue on the repository to start the conversation.


Appendix A — Threat Catalog

See standard_appendix_a.md for the complete threat catalog, including:

  • Threat definitions mapped to AARTS hook points
  • Threat coverage matrix by conformance level
  • OWASP LLM Top 10 (2025) mapping