SAFE-M-1: Architectural Defense - Control/Data Flow Separation

July 8, 2025 · View on GitHub

Overview

Mitigation ID: SAFE-M-1
Category: Architectural Defense
Effectiveness: High (Provable Security)
Implementation Complexity: High
First Published: 2025-01-03

Control/Data Flow Separation is an architectural defense that creates a protective system layer around LLMs by explicitly separating control flow (from trusted queries) from data flow (including untrusted tool descriptions). This approach ensures that malicious instructions embedded in data cannot influence program execution.

The most notable implementation is CaMeL (Control and Memory Language), developed by researchers from Google and other institutions, which demonstrates 77% task completion with provable security guarantees.

Mitigates

SAFE-T1001: Tool Poisoning Attack (TPA)
SAFE-T1102: Prompt Injection (Multiple Vectors)
SAFE-T1401: Line Jumping

Technical Implementation

Core Principles

Explicit Control Flow Extraction: Parse and extract control flow from trusted sources only
Data Isolation: Treat all external inputs (tool descriptions, API responses, etc.) as pure data
Capability-Based Security: Implement fine-grained permissions for data access and flow

Architecture Components

┌─────────────────┐
│  Trusted Query  │
└────────┬────────┘
         │
    ┌────▼─────┐
    │  Parser  │ ← Extracts control flow
    └────┬─────┘
         │
┌────────▼────────┐     ┌──────────────┐
│ Control Engine  │────►│ Data Handler │
│ (Protected)     │     │ (Sandboxed)  │
└────────┬────────┘     └──────────────┘
         │
    ┌────▼─────┐
    │   LLM    │
    └──────────┘

Implementation Steps

Design Phase:
- Define trust boundaries
- Identify all data sources
- Design capability model
Development Phase:
- Implement control flow parser
- Create data sandboxing layer
- Build capability enforcement
Deployment Phase:
- Configure security policies
- Set up monitoring
- Train operations team

Benefits

Provable Security: Mathematical guarantees against certain attack classes
Defense in Depth: Works even if underlying LLM is vulnerable
No Model Retraining: Can be applied to existing LLMs

Limitations

Performance Impact: ~7% reduction in task completion rate
Complexity: Requires significant architectural changes
Not Universal: Some tasks may be incompatible with strict separation

Implementation Examples

Example 1: Tool Call Protection

# Traditional vulnerable approach
def execute_tool(tool_description, parameters):
    # Tool description can influence execution
    return llm.execute(f"{tool_description}\n{parameters}")

# CaMeL-style protected approach
def execute_tool_protected(tool_id, parameters):
    # Control flow predetermined, description is data only
    control_flow = trusted_registry.get_control_flow(tool_id)
    tool_data = untrusted_sources.get_tool_description(tool_id)
    
    # Description cannot alter execution path
    return protected_executor.run(
        control=control_flow,
        data={'description': tool_data, 'params': parameters}
    )

Testing and Validation

Security Testing:
- Attempt prompt injection with known payloads
- Verify data cannot influence control flow
- Test capability enforcement
Functional Testing:
- Ensure legitimate operations still work
- Measure performance impact
- Validate error handling

References

Defeating Prompt Injections by Design - Google Research (2025)

SAFE-M-2: Cryptographic Integrity
SAFE-M-3: AI-Powered Content Analysis

Version History

Version	Date	Changes	Author
1.0	2025-01-03	Initial documentation	Frederick Kautz