
refactor: coerce None metadatas at ChromaDB backend boundary (supersedes per-site guards from #999, #1013) #1020

@bensig

Context

Two recently merged PRs, #999 and #1013, guard against `None` metadata / document entries returned by ChromaDB `query()`.

Both fix real user-visible crashes (`AttributeError: 'NoneType' object has no attribute 'get'`) that hit frequently on chromadb 1.5.x via the queue-stall in #1006.
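For readers who have not hit it, the failure mode and the per-site fix can be sketched in a few lines. This is an illustration only: `wing_of` and the `"wing"` key are hypothetical names, not code from the repository.

```python
def wing_of(meta):
    # Unguarded access: raises AttributeError when the backend returns None
    # for this entry instead of a dict.
    return meta.get("wing", "unknown")


def wing_of_guarded(meta):
    # The per-site guard pattern: coerce None to {} before reading keys.
    meta = meta or {}
    return meta.get("wing", "unknown")


def crashes_on_none():
    # Helper for demonstration: reports whether the unguarded version fails.
    try:
        wing_of(None)
    except AttributeError:
        return True
    return False
```

The guarded version works, but it has to be repeated at every site that reads a metadata entry.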

Combined, these introduce ~9 per-site `meta = meta or {}` / `doc = doc or ""` guards across 5 files. During review of #999, @jphein proposed centralizing the coercion at the adapter boundary:

"Whack-a-mole across 8 call sites suggests the cleaner fix probably lives one layer down — either:

  1. `_get_cached_metadata` / `_fetch_all_metadata` (in `mcp_server.py`) strip `None` entries as they build the cache, OR
  2. `ChromaCollection.get()` / `ChromaCollection.query()` in `backends/chroma.py` coerce `None` metadatas to `{}` at the adapter boundary.

Option 2 is more general — every future call site would get the guard for free... Happy to convert this to the adapter-level fix if maintainers prefer."

This issue tracks the adapter-level consolidation jphein offered.

Proposal

Normalize `None` → `{}` (for metadatas) and `None` → `""` (for documents) inside `mempalace/backends/chroma.py`'s `ChromaCollection.query()` and `.get()` return paths. Every consumer inherits the guard; every per-site `meta or {}` / `doc or ""` guard introduced by #999 and #1013 gets deleted.
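A minimal sketch of what that boundary coercion might look like, assuming the raw results are dicts in the usual ChromaDB shape (flat lists for `get()`, one inner list per query for `query()`). The helper names `_fill_none`, `normalize_get_result`, and `normalize_query_result` are hypothetical and are not the actual contents of `backends/chroma.py`.

```python
from typing import Any, Callable, Optional


def _fill_none(values: Optional[list], make_default: Callable[[], Any]) -> Optional[list]:
    """Replace None entries in a flat list with a fresh default value."""
    if values is None:
        # Assumption: a field that is absent altogether (for example, not
        # requested via `include`) is left untouched rather than invented.
        return None
    return [make_default() if v is None else v for v in values]


def normalize_get_result(raw: dict) -> dict:
    """Normalize a get()-shaped result: one flat list per field."""
    out = dict(raw)  # shallow copy so the caller's dict is not mutated
    out["metadatas"] = _fill_none(raw.get("metadatas"), dict)
    out["documents"] = _fill_none(raw.get("documents"), str)
    return out


def normalize_query_result(raw: dict) -> dict:
    """Normalize a query()-shaped result: one inner list per query."""
    out = dict(raw)
    for key, make_default in (("metadatas", dict), ("documents", str)):
        rows = raw.get(key)
        if rows is not None:
            out[key] = [_fill_none(row, make_default) for row in rows]
    return out
```

With something like this applied in the `query()` and `.get()` return paths, a consumer can call `meta.get(...)` on every entry without its own guard.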

Alignment with RFC 001

This fits RFC 001's direction of formalizing backend return shapes (typed `QueryResult` / `GetResult`, normalized at the boundary). Arguably the coercion belongs in whichever PR lands the RFC 001 cleanup, so the two changes will need merge coordination.
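To make the overlap concrete, here is one way a typed result could absorb the coercion in its constructor. The class name `GetResult` comes from the RFC reference above, but the fields and the `from_raw` method are assumptions for illustration, not the RFC's actual design.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class GetResult:
    # Hypothetical field set; the real RFC 001 type may differ.
    ids: list[str]
    documents: list[str]
    metadatas: list[dict[str, Any]]

    @classmethod
    def from_raw(cls, raw: dict) -> GetResult:
        """Build a result whose entries are never None."""
        ids = list(raw.get("ids") or [])
        # Assumption: a missing field is padded to match ids so the three
        # lists always stay aligned.
        docs = raw.get("documents") or [None] * len(ids)
        metas = raw.get("metadatas") or [None] * len(ids)
        return cls(
            ids=ids,
            documents=["" if d is None else d for d in docs],
            metadatas=[{} if m is None else m for m in metas],
        )
```

If the typed results land first, the standalone coercion becomes a detail of their construction, which is why the ordering of the two PRs matters.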

Scope

Non-goals

Offer to @jphein

You raised this during #999 review and offered to convert. If you have bandwidth, happy to have you drive this one — your diagnosis is clearest on what the failure mode looks like. Alternatively, any contributor can grab it.

Related
