Skip to content

Latest commit

 

History

History
116 lines (95 loc) · 4.36 KB

File metadata and controls

116 lines (95 loc) · 4.36 KB

SAFE-M-1: Architectural Defense - Control/Data Flow Separation

Overview

Mitigation ID: SAFE-M-1
Category: Architectural Defense
Effectiveness: High (Provable Security)
Implementation Complexity: High
First Published: 2025-01-03

Description

Control/Data Flow Separation is an architectural defense that creates a protective system layer around LLMs by explicitly separating control flow (from trusted queries) from data flow (including untrusted tool descriptions). This approach ensures that malicious instructions embedded in data cannot influence program execution.

The most notable implementation is CaMeL (Control and Memory Language), developed by researchers from Google and other institutions, which demonstrates 77% task completion with provable security guarantees.

Mitigates

Technical Implementation

Core Principles

  1. Explicit Control Flow Extraction: Parse and extract control flow from trusted sources only
  2. Data Isolation: Treat all external inputs (tool descriptions, API responses, etc.) as pure data
  3. Capability-Based Security: Implement fine-grained permissions for data access and flow

Architecture Components

┌─────────────────┐
│  Trusted Query  │
└────────┬────────┘
         │
    ┌────▼─────┐
    │  Parser  │ ← Extracts control flow
    └────┬─────┘
         │
┌────────▼────────┐     ┌──────────────┐
│ Control Engine  │────►│ Data Handler │
│ (Protected)     │     │ (Sandboxed)  │
└────────┬────────┘     └──────────────┘
         │
    ┌────▼─────┐
    │   LLM    │
    └──────────┘

Implementation Steps

  1. Design Phase:

    • Define trust boundaries
    • Identify all data sources
    • Design capability model
  2. Development Phase:

    • Implement control flow parser
    • Create data sandboxing layer
    • Build capability enforcement
  3. Deployment Phase:

    • Configure security policies
    • Set up monitoring
    • Train operations team

Benefits

  • Provable Security: Mathematical guarantees against certain attack classes
  • Defense in Depth: Works even if underlying LLM is vulnerable
  • No Model Retraining: Can be applied to existing LLMs

Limitations

  • Performance Impact: ~7% reduction in task completion rate
  • Complexity: Requires significant architectural changes
  • Not Universal: Some tasks may be incompatible with strict separation

Implementation Examples

Example 1: Tool Call Protection

# Traditional vulnerable approach
def execute_tool(tool_description, parameters):
    # Tool description can influence execution
    return llm.execute(f"{tool_description}\n{parameters}")

# CaMeL-style protected approach
def execute_tool_protected(tool_id, parameters):
    # Control flow predetermined, description is data only
    control_flow = trusted_registry.get_control_flow(tool_id)
    tool_data = untrusted_sources.get_tool_description(tool_id)
    
    # Description cannot alter execution path
    return protected_executor.run(
        control=control_flow,
        data={'description': tool_data, 'params': parameters}
    )

Testing and Validation

  1. Security Testing:

    • Attempt prompt injection with known payloads
    • Verify data cannot influence control flow
    • Test capability enforcement
  2. Functional Testing:

    • Ensure legitimate operations still work
    • Measure performance impact
    • Validate error handling

References

Related Mitigations

Version History

Version Date Changes Author
1.0 2025-01-03 Initial documentation Frederick Kautz