This document explains the deliberate architectural choices in state-trace and how they differ from general-purpose temporal context graphs like Graphiti. It exists because a reasonable reviewer comparing the two would otherwise assume state-trace is a toy (in-memory graph, no Neo4j). It isn't; it's a different shape of system for a different problem.
Graphiti's problem: unbounded temporal knowledge graph for agents. Many episodes, many entities, long-lived facts that evolve over weeks or months. Multi-tenant. Needs a real graph database because the working set is larger than RAM.
state-trace's problem: bounded working memory for a single coding/debugging session. Tens to low hundreds of nodes at a time. A fix, a failing test, the file under the cursor, the hypothesis the agent is currently exploring. Cold data (closed sessions) stays on disk in SQLite; hot data lives in a networkx.MultiDiGraph and is traversed directly.
These are different systems even when the word "memory" appears in both.
The retrieval path is pure Python over networkx. No ORM, no network round-trip, no query planner. For a working set that fits in RAM (target ceiling: ~256 nodes × ~8 capacity units ≈ ~2k effective memory units per session), this is categorically faster than routing the same traversal through Neo4j or Kuzu:
| Operation | state-trace (networkx) | graph DB (typical) |
|---|---|---|
| retrieve() on ~100-node session | ~1–30 ms | ~50–500 ms |
| retrieve_brief() (adds compaction) | ~2–40 ms | ~100–800 ms |
| 2-hop causal traversal with edge prior | in-memory BFS/heap | Cypher + planner |
Those numbers are consistent with the AvgLatencyMs column across the README benchmarks. The graph-DB column is the honest read on "what would this cost if we forced it through Kuzu" and is also consistent with the head-to-head harness latency.
For agent loops that want working-memory retrieval inside every action selection, the difference compounds quickly. For long-lived knowledge bases where you're querying across sessions, weeks of history, and multiple tenants, the graph DB is the right choice — that's Graphiti's lane.
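The "in-memory BFS/heap" row above can be made concrete. The sketch below is a best-first 2-hop expansion with a per-edge-type prior, shown over a plain adjacency dict so it stands alone; the real engine walks a networkx.MultiDiGraph the same way. The edge types match the ones listed later in this document, but the weights are illustrative, not state-trace's actual tuning.

```python
# Best-first traversal with edge-type priors: pop the highest-scored frontier
# node, multiply its score by the prior of each outgoing edge type, keep the
# best score seen per node. Weights below are illustrative only.
import heapq

EDGE_PRIOR = {"patches_file": 1.0, "fails_in": 0.9, "verified_by": 0.7}

def traverse(adj, seed, max_hops=2, default_prior=0.3):
    # adj: node -> list of (edge_type, neighbor)
    best = {seed: 1.0}
    heap = [(-1.0, seed, 0)]            # heapq is a min-heap, so scores are negated
    while heap:
        neg, node, hops = heapq.heappop(heap)
        score = -neg
        if hops == max_hops or score < best.get(node, 0.0):
            continue                    # hop budget exhausted, or stale heap entry
        for edge_type, nbr in adj.get(node, []):
            s = score * EDGE_PRIOR.get(edge_type, default_prior)
            if s > best.get(nbr, 0.0):  # keep only the best path score per node
                best[nbr] = s
                heapq.heappush(heap, (-s, nbr, hops + 1))
    return best

adj = {
    "patch:42": [("patches_file", "file:engine.py")],
    "file:engine.py": [("verified_by", "test:test_engine")],
}
scores = traverse(adj, "patch:42")
# two hops through typed edges: test:test_engine scores 1.0 * 0.7
```

No planner, no round-trip: the whole query is one heap loop over RAM-resident dicts, which is where the latency gap in the table comes from.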
Because the hot graph is bounded, state-trace is built around enforce_capacity() from day one: decay, compression, and lifecycle-aware retention are part of the engine, not optional hygiene. The long-horizon pressure benchmark exists to verify this.
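To make the capacity policy concrete, here is a minimal sketch of what an enforce_capacity()-style pass can look like: decay each node's salience with age, then keep the strongest nodes until the budget is spent. The field names (salience, last_touch, cost) and the half-life are assumptions for the example; state-trace's actual policy also compresses rather than only evicting.

```python
# Hedged sketch of bounded working memory: exponential recency decay on
# salience, greedy keep-until-budget. Field names and half-life are
# hypothetical, not state-trace's real schema.
import math

def enforce_capacity(nodes, budget=2048, now=0.0, half_life=600.0):
    # nodes: node_id -> {"salience": float, "last_touch": float, "cost": int}
    def live_score(n):
        age = now - n["last_touch"]
        return n["salience"] * math.exp(-age * math.log(2) / half_life)
    ranked = sorted(nodes.items(), key=lambda kv: live_score(kv[1]), reverse=True)
    kept, used = {}, 0
    for node_id, n in ranked:
        if used + n["cost"] <= budget:
            kept[node_id] = n
            used += n["cost"]
    return kept  # evicted nodes would be compressed or spilled to SQLite, not lost

nodes = {
    "obs:old": {"salience": 1.0, "last_touch": -3600.0, "cost": 8},  # an hour stale
    "test:hot": {"salience": 0.9, "last_touch": -10.0, "cost": 8},   # just touched
}
kept = enforce_capacity(nodes, budget=8)  # only the recently touched node survives
```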
Graphiti does not have to make the same tradeoff — it has the substrate to keep growing. The flip side is that "what's still live in this session?" is a question state-trace can answer cheaply and directly (see engine.current_state() and engine.failed_hypotheses()), whereas Graphiti has to infer the same answer from its temporal facts.
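That "what's still live?" query can be sketched directly: a node is live unless something supersedes it or a test has rejected it. This mimics the shape of engine.current_state() and engine.failed_hypotheses(); the real signatures, and the exact direction of the supersedes and rejected_by edges, are assumptions here.

```python
# Liveness as set subtraction over typed edges. Assumed edge directions:
# (new, "supersedes", old) and (hypothesis, "rejected_by", test).
def current_state(nodes, edges):
    # nodes: id -> {"type": ...}; edges: list of (src, edge_type, dst)
    superseded = {dst for src, t, dst in edges if t == "supersedes"}
    rejected = {src for src, t, dst in edges if t == "rejected_by"}
    return {nid: n for nid, n in nodes.items()
            if nid not in superseded and nid not in rejected}

def failed_hypotheses(nodes, edges):
    rejected = {src for src, t, dst in edges if t == "rejected_by"}
    return {nid: n for nid, n in nodes.items()
            if nid in rejected and n["type"] == "decision"}

nodes = {
    "hyp:1": {"type": "decision"},
    "hyp:2": {"type": "decision"},
    "patch:old": {"type": "patch_hunk"},
    "patch:new": {"type": "patch_hunk"},
}
edges = [
    ("hyp:1", "rejected_by", "test:unit"),
    ("patch:new", "supersedes", "patch:old"),
]
live = current_state(nodes, edges)        # hyp:2 and patch:new remain live
failed = failed_hypotheses(nodes, edges)  # hyp:1
```

For a bounded typed graph this is two set comprehensions; for a general-purpose temporal knowledge graph the same answer requires reasoning over validity intervals.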
state-trace ships with ten-plus first-class node types — task, observation, decision, file, symbol, patch_hunk, error_signature, test, command, episode, session, goal — and a dozen-plus causal edge types including patches_file, fails_in, verified_by, rejected_by, supersedes, contradicts, derived_from. The retrieval scorer routes differently per intent (locate_file, failure_analysis, history, general) using those types.
Graphiti is intentionally more schema-light. It's the right tradeoff for a general-purpose system, but it means coding-agent-specific queries ("which file should I patch", "what did I try and reject") go through generic BM25/cosine/BFS instead of type-aware priors.
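What "type-aware priors" buys over generic scoring is easy to show: the same candidate set ranks differently per intent because node types carry different weights. The intent names come from the text above; the weight table and base scores below are made up for the example.

```python
# Illustrative intent-conditioned reranking. The per-intent weight tables are
# hypothetical, not state-trace's real scorer.
INTENT_PRIOR = {
    "locate_file":      {"file": 1.0, "symbol": 0.8, "patch_hunk": 0.6},
    "failure_analysis": {"error_signature": 1.0, "test": 0.9, "patch_hunk": 0.7},
    "history":          {"decision": 1.0, "episode": 0.8, "observation": 0.6},
}

def rank(candidates, intent, default=0.2):
    # candidates: list of (node_id, node_type, base_score from the seed stage)
    prior = INTENT_PRIOR.get(intent, {})
    return sorted(candidates,
                  key=lambda c: c[2] * prior.get(c[1], default),
                  reverse=True)

candidates = [("file:engine.py", "file", 0.5),
              ("err:KeyError", "error_signature", 0.5),
              ("test:test_retrieve", "test", 0.5)]
top_fail = rank(candidates, "failure_analysis")[0][0]  # error signature wins
top_loc = rank(candidates, "locate_file")[0][0]        # the file wins
```

A schema-light system sees three equally scored candidates; a typed one knows that under failure_analysis the error signature matters more than the file.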
When MemoryEngine(storage_path="...db") points at a .db/.sqlite/.sqlite3 file, state-trace upserts into a WAL-journaled SQLite database with an FTS5 seed index. This is deliberately not a second graph store: SQLite handles durability and cold text seeding; the graph continues to live in networkx for active retrieval.
The tradeoff, stated plainly:
- JSON backend: simple, single-writer, fine for benchmark scripts. Reloads the whole graph on load().
- SQLite+FTS5 backend: WAL journal mode, incremental upserts, process-safe reads, FTS5 for seed-stage lexical search. Recommended for long-running MCP harnesses.
- Neither is a substitute for a real graph DB at SaaS scale. That's intentional: the design brief is local-first working memory.
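The SQLite side of that split is small enough to sketch end to end: WAL journal mode for durable concurrent reads, a plain table for cold node storage, and an FTS5 virtual table as the lexical seed index. The schema is illustrative, not state-trace's actual one.

```python
# Minimal cold-store sketch: WAL + upserts + FTS5 seed search. Schema and
# column names are assumptions for the example.
import sqlite3

db = sqlite3.connect(":memory:")       # pass a *.db path for a real session store
db.execute("PRAGMA journal_mode=WAL")  # no-op for :memory:, durable on disk
db.execute("CREATE TABLE nodes (id TEXT PRIMARY KEY, type TEXT, body TEXT)")
db.execute("CREATE VIRTUAL TABLE node_fts USING fts5(id UNINDEXED, body)")

def upsert(node_id, node_type, body):
    # Incremental upsert: no full-graph rewrite, unlike the JSON backend.
    db.execute("INSERT INTO nodes VALUES (?,?,?) "
               "ON CONFLICT(id) DO UPDATE SET type=excluded.type, body=excluded.body",
               (node_id, node_type, body))
    db.execute("DELETE FROM node_fts WHERE id=?", (node_id,))
    db.execute("INSERT INTO node_fts VALUES (?,?)", (node_id, body))

upsert("err:1", "error_signature", "KeyError in retrieve() during compaction")
seeds = [r[0] for r in db.execute(
    "SELECT id FROM node_fts WHERE node_fts MATCH ? ORDER BY rank", ("KeyError",))]
```

The FTS5 hit list only seeds retrieval; ranking and traversal still happen in the networkx hot graph.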
Honest accounting of what would have to change if state-trace wanted to compete with Graphiti on its home turf (multi-tenant, weeks-of-history, cross-session retrieval):
- Graph substrate. networkx would go from authoritative to cache. The authoritative store becomes Neo4j/Kuzu/DuckPGQ or similar. The retrieval code is traversal-pattern-compatible with that (it's already heap-priority BFS over typed edges), but the hot-path latency assumptions break and we'd need query planning.
- Capacity semantics. The current enforce_capacity() is per-process. Multi-tenant needs tenant-scoped budgets, persisted across processes, with eviction audited.
- Concurrency. networkx is not thread-safe; the current engine is single-writer. SaaS scale needs optimistic concurrency or a serialized writer queue.
- Temporal reasoning. The supersedes and invalid_at primitives work well within a session but haven't been stress-tested across many parallel agents editing the same namespace. Graphiti is ahead here.
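For the concurrency bullet specifically, the cheapest upgrade path is the serialized writer queue: networkx stays effectively single-writer because every mutation is funneled through one queue-draining thread. This is a sketch of that option, not the current engine (which is simply single-process).

```python
# Serialized writer queue: callers submit mutation closures; exactly one
# thread ever touches the graph, so networkx's lack of thread safety is moot.
import queue
import threading

class SerializedWriter:
    def __init__(self, graph):
        self.graph = graph
        self.q = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def submit(self, fn):
        """Enqueue a mutation fn(graph); returns an Event set once it has applied."""
        done = threading.Event()
        self.q.put((fn, done))
        return done

    def _drain(self):
        while True:
            fn, done = self.q.get()
            fn(self.graph)  # only this thread mutates the graph
            done.set()

graph = {}  # stand-in for a networkx.MultiDiGraph
w = SerializedWriter(graph)
w.submit(lambda g: g.__setitem__("task:1", {"type": "task"})).wait(timeout=2)
```

Optimistic concurrency would scale writes better across tenants, but the queue preserves today's single-writer invariants with the least new machinery.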
None of these are blockers; they're just not work we've done, and they're not the work we should do next. The next move is depth on the coding-agent lane (see README.md benchmark section), not horizontal expansion into Graphiti's lane.
The architectural wedge does not depend on the graph substrate:
- Typed coding-agent nodes and edges.
- Bounded working memory (a policy, not an implementation detail).
- current_state/failed_hypotheses as first-class APIs — "what's true now in this debugging session" is cheap for us, expensive for a general-purpose knowledge graph.
- Compact, artifact-first briefs for small-model harnesses.
- MCP-mountable, local-first deployment.
A future Neo4j-backed state-trace would still differ from Graphiti in the same ways it does today. It would just be able to hold more sessions concurrently.