Octocode - MemPalace Security Hardening

# RFC: MemPalace Security Hardening

---

## Summary

The MemPalace MCP server exposes 17 tools to AI agents with write access to ChromaDB, SQLite, and the local filesystem. A security audit found 9 issues across input validation, argument injection, concurrency safety, resource exhaustion, data leakage, and error handling. This RFC proposes a phased hardening plan that addresses all findings — from HIGH (inconsistent input validation, argument injection) through MEDIUM (thread safety, resource limits, WAL data exposure) to LOW (error handling, parameter bounds, information leakage).

---

## Motivation

### Problem

MemPalace is a personal memory system where AI agents have write access to persistent storage via MCP tools. The trust boundary is the MCP JSON-RPC protocol: any connected AI agent can invoke any tool with any arguments. The current implementation has gaps where security controls were partially applied — write tools are validated but their companion read/mutate tools are not, and the tool dispatch passes unfiltered arguments to handlers.

### Current State

**Inconsistent input validation** — `sanitize_name()` is applied to some tools but not others:

| Tool | Line | Validates? |
|------|------|-----------|
| `tool_kg_add` | `mcp_server.py:422` | YES — `sanitize_name` on subject, predicate, object |
| `tool_kg_invalidate` | `mcp_server.py:449` | **NO** — raw strings to `_entity_id()` then SQL |
| `tool_diary_write` | `mcp_server.py:477` | YES — `sanitize_name` on agent_name |
| `tool_diary_read` | `mcp_server.py:543` | **NO** — raw agent_name into ChromaDB where clause |
| `tool_kg_query` | `mcp_server.py:416` | **NO** — raw entity to `_entity_id()` then SQL |
| `tool_kg_timeline` | `mcp_server.py:463` | **NO** — raw entity to SQL |

**Unfiltered argument dispatch** — `mcp_server.py:903` passes `**tool_args` with no whitelist:
```python
result = TOOLS[tool_name]["handler"](**tool_args)
```
This lets MCP callers set optional handler parameters like `added_by` (default `"mcp"`) and `source_file` to arbitrary values, spoofing the WAL audit trail.

**No concurrency protection** — `knowledge_graph.py:93` uses `check_same_thread=False` with a shared singleton connection and no locking.

**No resource limits** — 5 write tools with no rate limiting, no entry caps, and `sanitize_content` allows up to 100,000 characters per write.

**Sensitive data in WAL** — `mcp_server.py:86-96` logs content previews (first 200 chars) to a growing plaintext file with no rotation.

**Broad exception handlers** — 20+ locations use `except Exception: pass` or return generic errors, masking security-relevant failures.

**Unbounded graph traversal** — `tool_traverse_graph` accepts `max_hops` with no upper bound.

**Internal errors leaked** — `tool_check_duplicate` and others return `str(e)` to MCP callers, potentially exposing internal paths and state.

### Evidence

All findings confirmed by direct code inspection of `mempalace/mempalace/mempalace/mcp_server.py` (947 lines), `mempalace/mempalace/mempalace/knowledge_graph.py` (394 lines), and supporting modules. Each finding was validated against the actual implementation — no claims are inferred.

### Impact

- **AI agent as attacker**: A compromised or misbehaving AI agent can bypass input validation through unvalidated companion tools, spoof audit trails via argument injection, and exhaust disk space via unlimited writes.
- **Data integrity**: Concurrent writes via the shared SQLite connection could corrupt the knowledge graph.
- **Privacy**: Sensitive personal data persists indefinitely in the WAL log without rotation or encryption.
- **Operational**: Silent exception swallowing means failures — including security failures — go undetected.

### Use Cases

1. A malicious MCP client calls `tool_kg_invalidate` with crafted strings that would be rejected by `tool_kg_add`'s `sanitize_name`.
2. An AI agent floods `tool_add_drawer` with 100K-char writes, filling disk and degrading search.
3. A caller sets `added_by=trusted_agent` to impersonate another agent in the WAL log.
4. Concurrent tool calls corrupt the knowledge graph SQLite database.
5. A user stores medical or financial data; the first 200 chars persist permanently in the WAL.

---

## Guide-Level Explanation

After this RFC is implemented, every MCP tool that accepts string input will validate it through the same `sanitize_name()` / `sanitize_content()` gates. The tool dispatcher will only pass declared parameters to handlers — undeclared arguments are silently dropped. The SQLite knowledge graph will use thread-safe access. Write operations will have configurable rate limits and size caps. The WAL will rotate and can optionally hash content previews. Exception handlers will log structured errors instead of silently passing.

For existing users, this is a transparent hardening — no API changes, no new tools, no behavioral differences for well-behaved callers. The only visible change is that previously-accepted malformed input may now return validation errors.

---

## Reference-Level Explanation

### Finding 1: Inconsistent Input Validation (HIGH)

**What**: Apply `sanitize_name()` to all tools accepting entity/name parameters.

**Files changed**: `mcp_server.py`

**Affected tools and changes**:

| Tool | Line | Change |
|------|------|--------|
| `tool_kg_invalidate` | 449 | Add `sanitize_name` on `subject`, `predicate`, `object` |
| `tool_kg_query` | 416 | Add `sanitize_name` on `entity` |
| `tool_kg_timeline` | 463 | Add `sanitize_name` on `entity` (when not None) |
| `tool_diary_read` | 543 | Add `sanitize_name` on `agent_name` |
| `tool_list_rooms` | 213 | Add `sanitize_name` on `wing` (when not None) |
| `tool_search` | 249 | Add `sanitize_name` on `wing` and `room` (when not None) |
| `tool_traverse_graph` | 299 | Add `sanitize_name` on `start_room` |
| `tool_find_tunnels` | 307 | Add `sanitize_name` on `wing_a` and `wing_b` (when not None) |

**Pattern**: Wrap each tool body with:
```python
try:
    entity = sanitize_name(entity, "entity")
except ValueError as e:
    return {"error": str(e)}
```

**Compatibility**: Backward-compatible — callers using valid names see no change. Callers sending crafted strings now get a clear validation error instead of undefined behavior.

### Finding 2: MCP Dispatch Argument Injection (HIGH)

**What**: Filter `tool_args` to only keys declared in the tool's `input_schema["properties"]` before dispatch.

**File**: `mcp_server.py:894-903`

**Change**:
```python
allowed_keys = set(TOOLS[tool_name]["input_schema"].get("properties", {}).keys())
tool_args = {k: v for k, v in tool_args.items() if k in allowed_keys}
```

**Compatibility**: Backward-compatible — well-behaved callers only send declared parameters. Extra arguments are silently dropped (MCP convention). Callers relying on undeclared argument injection will see default values used instead of their injected values.

### Finding 3: SQLite Thread Safety (MEDIUM)

**What**: Replace the shared singleton connection with thread-local connections.

**File**: `knowledge_graph.py:88-96`

**Options**:
- **A (recommended)**: Use `threading.local()` for per-thread connections
- **B**: Add a `threading.Lock()` around all `_conn()` calls
- **C**: Create a new connection per operation (simpler but slower)

**Change (Option A)**:
```python
import threading

class KnowledgeGraph:
    def __init__(self, db_path=None):
        self.db_path = db_path or DEFAULT_KG_PATH
        self._local = threading.local()
        ...

    def _conn(self):
        if not hasattr(self._local, 'connection') or self._local.connection is None:
            self._local.connection = sqlite3.connect(self.db_path, timeout=10)
            self._local.connection.execute("PRAGMA journal_mode=WAL")
            self._local.connection.row_factory = sqlite3.Row
        return self._local.connection
```

**Compatibility**: Transparent — API unchanged, WAL mode enables concurrent reads.

### Finding 4: Resource Limits on Write Operations (MEDIUM)

**What**: Add configurable rate limiting and size caps.

**File**: `mcp_server.py` — new module-level rate limiter

**Design**:
```python
_WRITE_LIMITS = {
    "max_writes_per_minute": 60,
    "max_content_length": 100_000,   # already exists in sanitize_content
    "max_total_entries": 500_000,
    "max_wal_size_mb": 100,
}
```

A simple in-memory counter per tool resets each minute. Configurable via `config.json`.

**Compatibility**: No impact on normal usage (60 writes/min is generous). Only affects runaway loops.

### Finding 5: WAL Data Exposure (MEDIUM)

**What**: Add WAL rotation and optional content hashing.

**File**: `mcp_server.py:74-96`

**Design**:
- Rotate WAL when file exceeds configurable size (default 10MB)
- Keep last N rotated files (default 5)
- Add config option `wal_hash_content: true` to SHA-256 hash previews instead of storing plaintext
- Default: current behavior (plaintext) to avoid breaking existing tooling

**Compatibility**: Existing WAL consumers continue working. Rotation is transparent.

### Finding 6: Error Handling Improvements (LOW)

**What**: Replace bare `except: pass` with structured logging.

**Files**: `mcp_server.py`, `hooks_cli.py`, `config.py`, `knowledge_graph.py`

**Design**:
- All exception handlers log to `logger.debug()` at minimum
- Tool dispatch returns error category (validation_error, not_found, internal_error) instead of generic "Internal tool error"
- No raw `str(e)` returned to callers — use predefined error messages

**Compatibility**: No behavioral change for callers. Better observability.

### Finding 7: Graph Traversal Depth Bound (LOW)

**What**: Clamp `max_hops` to a maximum of 5.

**File**: `mcp_server.py:299`

**Change**:
```python
def tool_traverse_graph(start_room: str, max_hops: int = 2):
    max_hops = min(max_hops, 5)
    ...
```

**Compatibility**: Callers requesting >5 hops get 5. Default (2) unchanged.

### Finding 8: Internal Error Leakage (LOW)

**What**: Return generic error messages, log details server-side.

**Files**: `mcp_server.py` (tool_check_duplicate:290, tool dispatch:909-912)

**Change**: Replace `return {"error": str(e)}` with `return {"error": "Operation failed"}` and `logger.exception(...)`.

**Compatibility**: Callers that parse error messages for internal details will stop working — this is intentional.

### Finding 9: Duplicate Module Structure (INFO)

**What**: The project has files duplicated between `mempalace/` (top level) and `mempalace/mempalace/mempalace/` (nested package). This RFC does not propose fixing this — it is a separate cleanup concern. However, all code changes in this RFC should target `mempalace/mempalace/mempalace/mempalace/` which is the active package.

---

## Drawbacks

1. **Validation errors for edge cases**: Some tool callers may send strings that happen to pass through today but fail `sanitize_name()`. For example, names with special Unicode characters. This is a feature, not a bug — but it could surface as unexpected errors.

2. **Rate limiting false positives**: Bulk import workflows (e.g., mining a large project) go through the CLI miner, not MCP tools, so they are unaffected. However, a future MCP-based bulk import would need rate limit exemption.

3. **WAL rotation complexity**: Adding rotation introduces file management logic. If the rotation crashes mid-write, the WAL could be left in an inconsistent state. Mitigation: atomic rename on rotation.

4. **Thread-local connections**: Increases connection count under threading. For a local tool this is acceptable, but if the MCP server were to serve many concurrent requests, connection pooling would be better.

---

## Rationale and Alternatives

### Why This Design?

This RFC takes the **defense-in-depth** approach: apply validation at every entry point, restrict dispatch to declared parameters, protect shared resources with thread safety, and limit resource consumption. Each fix is small, targeted, and backward-compatible.

The key insight is that the codebase already has good security primitives (`sanitize_name`, parameterized SQL, file permissions) — they just need consistent application.

### Alternatives Considered

#### Alternative A: MCP Authentication Layer

- **What**: Add an authentication/authorization layer to the MCP server so only trusted agents can call write tools.
- **Pros**: Eliminates the untrusted-agent threat at the protocol level.
- **Cons**: MCP specification does not standardize auth. Would require custom handshake. No existing MCP clients support it. Adds complexity to setup.
- **Why not chosen**: Premature for a local-first tool. The proposed hardening achieves comparable protection through input validation and rate limiting without protocol changes.

#### Alternative B: Read-Only MCP + Separate Write API

- **What**: Remove write tools from MCP entirely. Writes go through the CLI or a separate authenticated API.
- **Pros**: Eliminates the MCP write attack surface completely.
- **Cons**: Breaks the core use case — AI agents need to write diary entries and knowledge graph facts during conversation. Forcing CLI writes defeats the purpose of the memory system.
- **Why not chosen**: Incompatible with the product's core value proposition.

#### Alternative C: Do Nothing

- **What**: Accept the current risk profile.
- **Pros**: Zero implementation cost.
- **Cons**: A single misbehaving AI agent can corrupt the palace, fill disk, spoof audit trails, and leak sensitive data. As MemPalace gains users, the attack surface grows proportionally.
- **Why not chosen**: The fixes are small and the risks are real. The cost-benefit clearly favors hardening.

### Comparison Matrix

| Dimension | Proposed (Hardening) | Alt A (MCP Auth) | Alt B (Read-Only MCP) | Alt C (Do Nothing) |
|-----------|---------------------|-------------------|-----------------------|--------------------|
| Security improvement | High | Very High | Very High | None |
| Implementation complexity | Low | High | Medium | None |
| Backward compatibility | Full | Breaking | Breaking | Full |
| User experience impact | None | Requires auth setup | Loses core feature | None |
| Maintenance burden | Low | Medium | Medium | Low |

### Trade-off Summary

The proposed approach gives the best balance of security improvement vs. implementation cost and compatibility. It hardens the existing architecture without changing the user model.

---

## Prior Art

- **ChromaDB**: Uses client-side validation for collection names and metadata keys. MemPalace's `sanitize_name` follows the same pattern — validate at the API boundary.
- **SQLite thread safety**: The [SQLite documentation](https://www.sqlite.org/threadsafe.html) recommends either serialized mode or separate connections per thread. Option A (thread-local) aligns with SQLite's recommendations.
- **MCP specification**: Does not define authentication or rate limiting. Server-side validation is the expected defense, consistent with this RFC's approach.
- **OWASP Input Validation**: The inconsistent validation finding aligns with OWASP's "validate all inputs" principle — partial validation is often worse than no validation because it creates a false sense of security.

---

## Unresolved Questions

**Before acceptance:**
- [ ] Should `sanitize_name` be relaxed to allow Unicode names (e.g., non-Latin entity names)? Current regex is ASCII-only.
- [ ] Should rate limits be per-tool or global across all write tools?

**During implementation:**
- [ ] What is the correct `max_total_entries` for ChromaDB before performance degrades?
- [ ] Should WAL rotation happen synchronously (blocking) or asynchronously?

**Out of scope:**
- [ ] MCP protocol-level authentication (tracked separately)
- [ ] Duplicate module structure cleanup (separate RFC)
- [ ] Encryption at rest for palace data

**Bikeshedding:**
- [ ] Exact rate limit numbers (60/min is a starting point)
- [ ] WAL rotation file count (5 is a starting point)

---

## Future Possibilities

- **Audit dashboard**: With structured WAL logging and rotation, a future `mempalace audit` CLI command could summarize write activity, detect anomalies, and flag suspicious patterns.
- **Per-agent quotas**: Once rate limiting infrastructure exists, it could be extended to per-agent write budgets tracked in the WAL.
- **Content encryption**: WAL hashing is a stepping stone toward full encryption at rest for sensitive palace data.

---

## References

### Code References

- `mcp_server.py:422-447` — `tool_kg_add` with `sanitize_name`; proves that validation infrastructure exists but is inconsistently applied
- `mcp_server.py:449-461` — `tool_kg_invalidate` WITHOUT `sanitize_name`; proves inconsistency with `tool_kg_add`
- `mcp_server.py:477-541` — `tool_diary_write` with `sanitize_name`; proves validation exists for write path
- `mcp_server.py:543-588` — `tool_diary_read` WITHOUT `sanitize_name`; proves inconsistency with `tool_diary_write`
- `mcp_server.py:416-420` — `tool_kg_query` WITHOUT `sanitize_name`; proves query tools bypass validation
- `mcp_server.py:463-467` — `tool_kg_timeline` WITHOUT `sanitize_name`; proves timeline bypasses validation
- `mcp_server.py:894-903` — tool dispatch with `**tool_args` and no argument filtering
- `mcp_server.py:326-330` — `tool_add_drawer` with optional `added_by="mcp"` injectable via dispatch
- `knowledge_graph.py:93` — `check_same_thread=False` without thread safety measures
- `mcp_server.py:86-96` — WAL logger writing plaintext content previews
- `mcp_server.py:299` — `tool_traverse_graph` with unbounded `max_hops`
- `mcp_server.py:290` — `tool_check_duplicate` returning `str(e)` to caller
- `config.py:19-47` — `sanitize_name` implementation; proves the validation gate is well-designed

### Related

- PR #393 (docs: Mermaid diagrams) — documentation-only, no security relevance

---

## Implementation Plan

### Approach

Implement all 8 code findings in 3 phases, ordered by severity and dependency. The duplicate module structure (Finding 9) is out of scope.

### Steps

#### Phase 1: Input Validation & Argument Filtering (HIGH findings)

- [ ] Add `sanitize_name` to `tool_kg_invalidate` — `mcp_server.py:449`
- [ ] Add `sanitize_name` to `tool_kg_query` — `mcp_server.py:416`
- [ ] Add `sanitize_name` to `tool_kg_timeline` — `mcp_server.py:463`
- [ ] Add `sanitize_name` to `tool_diary_read` — `mcp_server.py:543`
- [ ] Add `sanitize_name` to `tool_list_rooms`, `tool_search`, `tool_traverse_graph`, `tool_find_tunnels` — optional params when not None
- [ ] Add argument whitelist filter before dispatch — `mcp_server.py:894`
- [ ] Add tests: verify crafted strings are rejected by all validated tools
- [ ] Add tests: verify undeclared arguments are dropped by dispatch

#### Phase 2: Thread Safety & Resource Limits (MEDIUM findings)

- [ ] Replace singleton connection with `threading.local()` in `KnowledgeGraph` — `knowledge_graph.py:88-96`
- [ ] Add `close()` to `threading.local` connections on shutdown
- [ ] Add in-memory rate limiter for write tools — `mcp_server.py`
- [ ] Add WAL rotation on size threshold — `mcp_server.py:74-96`
- [ ] Add optional `wal_hash_content` config — `config.py`
- [ ] Add tests: concurrent write safety
- [ ] Add tests: rate limit enforcement
- [ ] Add tests: WAL rotation triggers correctly

#### Phase 3: Error Handling & Bounds (LOW findings)

- [ ] Clamp `max_hops` to 5 in `tool_traverse_graph` — `mcp_server.py:299`
- [ ] Replace `str(e)` returns with generic messages in all tool error paths
- [ ] Add `logger.debug()` to all silent `except: pass` blocks
- [ ] Add structured error categories to tool dispatch error response
- [ ] Add tests: verify no internal paths or state leak in error responses

### Risk Mitigations

| Risk | Mitigation |
|------|-----------|
| Validation breaks existing callers | Phase 1 only rejects strings that `sanitize_name` already rejects on write tools — consistent behavior |
| Rate limiting blocks legitimate bulk use | CLI miner bypasses MCP; default 60/min is generous; configurable |
| Thread-local connections increase resource use | For local-first tool with low concurrency, negligible; monitor in benchmarks |
| WAL rotation data loss | Atomic rename; keep 5 rotated files; test rotation under concurrent writes |

### Testing Strategy

| Type | Scope | Approach |
|------|-------|----------|
| Unit | Input validation | Parameterized tests: valid names pass, crafted strings rejected, for every tool |
| Unit | Argument filtering | Verify undeclared args dropped, declared args pass through |
| Unit | Rate limiting | Mock time, verify limit enforcement and reset |
| Integration | Thread safety | Concurrent write/read from multiple threads, verify no corruption |
| Integration | WAL rotation | Write until rotation triggers, verify old logs preserved and new log starts |
| Regression | All existing tests | Ensure no behavioral changes for valid inputs |

Tool	Line	Validates?
`tool_kg_add`	`mcp_server.py:422`	YES — `sanitize_name` on subject, predicate, object
`tool_kg_invalidate`	`mcp_server.py:449`	NO — raw strings to `_entity_id()` then SQL
`tool_diary_write`	`mcp_server.py:477`	YES — `sanitize_name` on agent_name
`tool_diary_read`	`mcp_server.py:543`	NO — raw agent_name into ChromaDB where clause
`tool_kg_query`	`mcp_server.py:416`	NO — raw entity to `_entity_id()` then SQL
`tool_kg_timeline`	`mcp_server.py:463`	NO — raw entity to SQL

Tool	Line	Change
`tool_kg_invalidate`	449	Add `sanitize_name` on `subject`, `predicate`, `object`
`tool_kg_query`	416	Add `sanitize_name` on `entity`
`tool_kg_timeline`	463	Add `sanitize_name` on `entity` (when not None)
`tool_diary_read`	543	Add `sanitize_name` on `agent_name`
`tool_list_rooms`	213	Add `sanitize_name` on `wing` (when not None)
`tool_search`	249	Add `sanitize_name` on `wing` and `room` (when not None)
`tool_traverse_graph`	299	Add `sanitize_name` on `start_room`
`tool_find_tunnels`	307	Add `sanitize_name` on `wing_a` and `wing_b` (when not None)

Dimension	Proposed (Hardening)	Alt A (MCP Auth)	Alt B (Read-Only MCP)	Alt C (Do Nothing)
Security improvement	High	Very High	Very High	None
Implementation complexity	Low	High	Medium	None
Backward compatibility	Full	Breaking	Breaking	Full
User experience impact	None	Requires auth setup	Loses core feature	None
Maintenance burden	Low	Medium	Medium	Low

Risk	Mitigation
Validation breaks existing callers	Phase 1 only rejects strings that `sanitize_name` already rejects on write tools — consistent behavior
Rate limiting blocks legitimate bulk use	CLI miner bypasses MCP; default 60/min is generous; configurable
Thread-local connections increase resource use	For local-first tool with low concurrency, negligible; monitor in benchmarks
WAL rotation data loss	Atomic rename; keep 5 rotated files; test rotation under concurrent writes

Type	Scope	Approach
Unit	Input validation	Parameterized tests: valid names pass, crafted strings rejected, for every tool
Unit	Argument filtering	Verify undeclared args dropped, declared args pass through
Unit	Rate limiting	Mock time, verify limit enforcement and reset
Integration	Thread safety	Concurrent write/read from multiple threads, verify no corruption
Integration	WAL rotation	Write until rotation triggers, verify old logs preserved and new log starts
Regression	All existing tests	Ensure no behavioral changes for valid inputs

Octocode - MemPalace Security Hardening #401

Description

RFC: MemPalace Security Hardening

Summary

Motivation

Problem

Current State

Evidence

Impact

Use Cases

Guide-Level Explanation

Reference-Level Explanation

Finding 1: Inconsistent Input Validation (HIGH)

Finding 2: MCP Dispatch Argument Injection (HIGH)

Finding 3: SQLite Thread Safety (MEDIUM)

Finding 4: Resource Limits on Write Operations (MEDIUM)

Finding 5: WAL Data Exposure (MEDIUM)

Finding 6: Error Handling Improvements (LOW)

Finding 7: Graph Traversal Depth Bound (LOW)

Finding 8: Internal Error Leakage (LOW)

Finding 9: Duplicate Module Structure (INFO)

Drawbacks

Rationale and Alternatives

Why This Design?

Alternatives Considered

Alternative A: MCP Authentication Layer

Alternative B: Read-Only MCP + Separate Write API

Alternative C: Do Nothing

Comparison Matrix

Trade-off Summary

Prior Art

Unresolved Questions

Future Possibilities

References

Code References

Related

Implementation Plan

Approach

Steps

Phase 1: Input Validation & Argument Filtering (HIGH findings)

Phase 2: Thread Safety & Resource Limits (MEDIUM findings)

Phase 3: Error Handling & Bounds (LOW findings)

Risk Mitigations

Testing Strategy

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions