Skip to content
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions mempalace/backends/chroma.py
Original file line number Diff line number Diff line change
Expand Up @@ -159,6 +159,9 @@ def _pin_hnsw_threads(collection) -> None:
logger.debug("_pin_hnsw_threads modify failed", exc_info=True)


_BLOB_FIX_MARKER = ".blob_seq_ids_migrated"


def _fix_blob_seq_ids(palace_path: str) -> None:
"""Fix ChromaDB 0.6.x -> 1.5.x migration bug: BLOB seq_ids -> INTEGER.

Expand All @@ -168,10 +171,19 @@ def _fix_blob_seq_ids(palace_path: str) -> None:
type INTEGER) is not compatible with SQL type BLOB".

Must run BEFORE PersistentClient is created (the compactor fires on init).

Opening a Python sqlite3 connection against a ChromaDB 1.5.x WAL-mode
database leaves state that segfaults the next PersistentClient call. After
the migration has run once successfully, a marker file is written so
subsequent opens skip the sqlite connection entirely. Already-migrated
palaces can touch the marker manually to opt into the fast path.
"""
db_path = os.path.join(palace_path, "chroma.sqlite3")
if not os.path.isfile(db_path):
return
marker = os.path.join(palace_path, _BLOB_FIX_MARKER)
if os.path.isfile(marker):
return
Comment on lines +185 to +187
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New marker-file behavior isn’t covered by tests: there are existing tests for _fix_blob_seq_ids, but none assert (1) marker creation on successful no-op/migration, and (2) early return when the marker exists (ensuring sqlite3.connect() is not called). Adding/adjusting tests would prevent regressions on this reliability guard.

Copilot uses AI. Check for mistakes.
try:
with sqlite3.connect(db_path) as conn:
Comment on lines 188 to 189
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

migrated is assigned but never read. This looks like leftover state from an earlier version; please remove it or use it (e.g., for conditional logging/marker handling) to avoid dead code and lint noise.

Copilot uses AI. Check for mistakes.
for table in ("embeddings", "max_seq_id"):
Expand All @@ -189,6 +201,14 @@ def _fix_blob_seq_ids(palace_path: str) -> None:
conn.commit()
except Exception:
logger.exception("Could not fix BLOB seq_ids in %s", db_path)
return
# Write marker whether or not rows needed migration — the palace is now
# confirmed to be in the INTEGER-seq_id state and future opens can skip the
# sqlite3.connect() entirely.
try:
open(marker, "a").close()
except OSError:
logger.exception("Could not write migration marker %s", marker)


Comment on lines +191 to 214
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The marker is written even if both table queries hit sqlite3.OperationalError (e.g., schema mismatch / missing tables), which means we may skip future migrations without ever confirming seq_id types. Consider tracking whether any table was successfully inspected, and only writing the marker when at least one relevant table was checked (or after actual updates).

Copilot uses AI. Check for mistakes.
# ---------------------------------------------------------------------------
Expand Down