
fix(kg): validate ISO-8601 date formats at MCP boundary #1167

Merged
igorls merged 2 commits into MemPalace:develop from arnoldwender:fix/kg-date-validation
May 6, 2026


@arnoldwender
Contributor

What and Why

tool_kg_query (as_of), tool_kg_add (valid_from), and tool_kg_invalidate (ended) accepted any string and forwarded it to SQLite without format validation. Parameterized queries prevent SQL injection, but invalid date strings silently produce empty result sets — callers cannot distinguish "no fact at this time" from "your date format was unrecognized." This is especially painful for natural-language LLM callers that synthesize dates like "March 2026" or "Jan 2025".

Root Cause

mempalace/mcp_server.py:843-897 — the three kg tool wrappers sanitize subject/predicate/object but pass temporal parameters straight through to _kg.

Change Summary

  • mempalace/config.py — new sanitize_iso_date() validator alongside the other input sanitizers. Accepts YYYY, YYYY-MM, and YYYY-MM-DD; passes through None / ""; raises ValueError with a field-named message on anything else.
  • mempalace/mcp_server.py — call sanitize_iso_date(...) in tool_kg_query, tool_kg_add, and tool_kg_invalidate before values reach the storage layer.
  • tests/test_config.py — 13 unit tests for sanitize_iso_date (accepted forms, passthrough, whitespace, rejection of natural-language / US format / invalid month/day / non-string).
  • tests/test_mcp_server.py — 4 integration tests at the MCP boundary: rejection of "Jan 2025" / "March 2026" / "yesterday", and acceptance of partial ISO forms.
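
For illustration, a validator with the behavior described in the first bullet might look roughly like the sketch below. This is not the code from mempalace/config.py: the function name, the regex name, the field_name parameter, the error wording, and the choice to strip surrounding whitespace are all assumptions made for the example.

```python
import re
from datetime import date

# Hypothetical pattern: YYYY, YYYY-MM, or YYYY-MM-DD, ASCII digits only.
_PARTIAL_ISO_DATE_RE = re.compile(r"([0-9]{4})(?:-([0-9]{2})(?:-([0-9]{2}))?)?")


def sanitize_iso_date_sketch(value, field_name="date"):
    """Return a validated ISO date string; pass None and "" through unchanged."""
    if value is None or value == "":
        return value
    if not isinstance(value, str):
        raise ValueError(f"{field_name} must be a string in ISO-8601 date format")
    candidate = value.strip()  # assumption: surrounding whitespace is tolerated
    match = _PARTIAL_ISO_DATE_RE.fullmatch(candidate)
    if match is None:
        raise ValueError(
            f"{field_name} must be YYYY, YYYY-MM, or YYYY-MM-DD, got {value!r}"
        )
    year, month, day = match.groups()
    try:
        # Missing components default to 1 so the calendar check still runs.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        raise ValueError(f"{field_name} is not a real calendar date: {value!r}") from None
    return candidate
```

Naming the field in the message lets an LLM caller see which argument to correct, rather than receiving an empty result it cannot interpret.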

Test Plan

  • pytest tests/test_config.py tests/test_mcp_server.py::TestKGTools -v — all green (9/9 kg + 13/13 date validator tests)
  • ruff check — clean
  • ruff format --check — formatted

Closes #1164

@igorls added the bug, area/kg, and area/mcp labels on Apr 24, 2026
@arnoldwender force-pushed the fix/kg-date-validation branch from 32a196d to e59b526 on April 24, 2026 20:57
@Qodo-Free-For-OSS

Hi, sanitize_iso_date() accepts YYYY and YYYY-MM for tool_kg_query(as_of), but KnowledgeGraph.query_entity() compares TEXT dates lexicographically (valid_from <= as_of). Passing partial dates like '2026' or '2026-03' will silently exclude facts whose valid_from is '2026-01-01' / '2026-03-15', undermining the goal of avoiding “silent empty result sets.”
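
The comparison behavior is easy to reproduce in isolation. The sketch below uses a hypothetical one-table schema and helper name, not MemPalace's actual knowledge-graph schema or query, and only assumes that dates are stored as TEXT and filtered with valid_from <= as_of.

```python
import sqlite3

# Hypothetical stand-in for the KG table: dates stored as TEXT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (subject TEXT, valid_from TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [("alice", "2026-01-01"), ("bob", "2026-03-15")],
)


def subjects_as_of(as_of):
    """Return subjects whose valid_from sorts at or before as_of (text comparison)."""
    rows = conn.execute(
        "SELECT subject FROM facts WHERE valid_from <= ? ORDER BY subject",
        (as_of,),
    )
    return [row[0] for row in rows]


# A shorter string that is a prefix of a longer one sorts first,
# so '2026' is "earlier" than every full date in 2026.
partial_year = subjects_as_of("2026")        # []
partial_month = subjects_as_of("2026-03")    # ['alice']
full_date = subjects_as_of("2026-03-31")     # ['alice', 'bob']
```

With a full YYYY-MM-DD value on both sides, text order and calendar order agree, which is why the problem only appears for partial inputs.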

Severity: action required | Category: correctness

How to fix: Normalize or reject partial dates

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

sanitize_iso_date() currently allows YYYY and YYYY-MM, but KG filtering compares TEXT dates lexicographically (valid_from <= as_of), causing partial as_of values to silently exclude valid facts.

Issue Context

KG validity filtering is implemented in KnowledgeGraph.query_entity() by comparing valid_from/valid_to TEXT columns to as_of.

Fix Focus Areas

  • mempalace/config.py[74-99]
  • mempalace/mcp_server.py[844-854]
  • mempalace/knowledge_graph.py[240-280]

Implementation direction

Pick one consistent approach:

  1. Reject partial dates at MCP boundary (require YYYY-MM-DD for as_of), or
  2. Expand partial as_of to a full-date bound before querying (e.g., YYYY -> YYYY-12-31, YYYY-MM -> last day of month), or
  3. Change KG queries to use date-aware functions/normalized storage (store as real dates or enforce full ISO dates for stored values and filters).

Add tests that assert factual correctness for partial inputs if you keep supporting them.

We noticed a couple of other issues in this PR as well - happy to share if helpful.


Found by Qodo code review

arnoldwender added a commit to arnoldwender/mempalace that referenced this pull request Apr 26, 2026
Per qodo-ai review on PR MemPalace#1167: sanitize_iso_date() previously accepted
YYYY and YYYY-MM, but KnowledgeGraph.query_entity() compares valid_from/
valid_to TEXT columns lexicographically against as_of. Lexicographic
comparison treats '2026-01-01' as greater than '2026' (because '-' >
end-of-string), so partial as_of values silently excluded valid facts —
re-introducing the silent-empty-results problem this PR was meant to
fix.

Tighten _ISO_DATE_RE to require YYYY-MM-DD only. Update docstring and
error message accordingly. Invert the two test cases that asserted
partials were accepted.
@arnoldwender
Contributor Author

Addressed in be60e2b — tightened _ISO_DATE_RE to require full YYYY-MM-DD and inverted the two tests that asserted partials were accepted.
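
As a rough sketch of what "full YYYY-MM-DD only" means in practice (the names below are hypothetical and this is not the actual _ISO_DATE_RE or sanitize_iso_date from the PR), a strict check could pair a fixed-width pattern with a calendar check:

```python
import re
from datetime import date

# Hypothetical strict pattern: exactly YYYY-MM-DD, ASCII digits only.
_FULL_ISO_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def require_full_iso_date(value, field_name="date"):
    """Accept only full YYYY-MM-DD strings; pass None and "" through unchanged."""
    if value is None or value == "":
        return value
    if not isinstance(value, str):
        raise ValueError(f"{field_name} must be a YYYY-MM-DD string")
    candidate = value.strip()  # assumption: surrounding whitespace is tolerated
    if _FULL_ISO_DATE_RE.fullmatch(candidate) is None:
        raise ValueError(f"{field_name} must be a full YYYY-MM-DD date, got {value!r}")
    # Raises ValueError for impossible dates such as 2026-02-30.
    date.fromisoformat(candidate)
    return candidate
```

The regex runs before date.fromisoformat because newer Python versions also parse other ISO 8601 spellings there, which would not compare correctly against stored YYYY-MM-DD text.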

Picked option 1 (reject at boundary) over option 2 (expand to range) because option 2 needs different expansions for valid_from <= as_of vs valid_to >= as_of (one wants upper-bound expansion, the other lower-bound), which means a partial as_of no longer maps to a single instant and the query semantics shift. That's a feature, not a bug fix — out of scope for the silent-empty-results goal of this PR.
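
To make the asymmetry concrete, here is a minimal sketch of what option 2 would involve. The helper name is hypothetical and the input is assumed to be already validated as YYYY, YYYY-MM, or YYYY-MM-DD; nothing like this was added in the PR.

```python
import calendar


def expand_partial_iso(value):
    """Return (earliest, latest) full ISO dates covered by a validated partial date."""
    parts = [int(piece) for piece in value.split("-")]
    if len(parts) == 1:
        (year,) = parts
        return f"{year:04d}-01-01", f"{year:04d}-12-31"
    if len(parts) == 2:
        year, month = parts
        last_day = calendar.monthrange(year, month)[1]  # handles leap years
        return f"{year:04d}-{month:02d}-01", f"{year:04d}-{month:02d}-{last_day:02d}"
    return value, value  # already a full date: a single instant


earliest, latest = expand_partial_iso("2026")
# One reading of option 2: valid_from <= latest AND valid_to >= earliest,
# i.e. "facts valid at any point in 2026" rather than "facts valid at one instant".
```

Once the two comparisons use different bounds, the query asks about overlap with a period, which is the semantic shift described above.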

Full suite: 1251 passed. Ready for re-review.

tool_kg_query (as_of), tool_kg_add (valid_from), and tool_kg_invalidate
(ended) accepted any string and forwarded it to SQLite without format
validation. Parameterized queries prevent SQL injection, but invalid
date strings silently produce empty result sets — callers cannot
distinguish "no fact at this time" from "your date format was
unrecognized." This is especially painful for natural-language LLM
callers that synthesize dates like "March 2026" or "Jan 2025".

Add sanitize_iso_date() in config.py alongside the other input
validators. It accepts YYYY, YYYY-MM, and YYYY-MM-DD forms; passes
through None/empty; and raises ValueError with a field-named message
on anything else. Call it from the three kg MCP tool wrappers before
values reach the storage layer so the caller gets a clear error
instead of a silent miss.

Closes MemPalace#1164
@arnoldwender force-pushed the fix/kg-date-validation branch from be60e2b to abe8576 on April 30, 2026 13:22
@arnoldwender
Contributor Author

Rebased on latest develop (was 104 commits behind, now merge-clean). The only conflict was a one-line import addition in tests/test_config.py: develop added normalize_wing_name to the same import that this PR adds sanitize_iso_date to. Folded both into a multi-line import; no logical changes.

Verified locally:

  • ruff check . clean on touched files
  • ruff format --check clean
  • pytest tests/test_config.py tests/test_mcp_server.py — 113 passed (incl. test_iso_date_* and test_kg_query_rejects_partial_iso_dates)

Ready for review.

@igorls added this to the v3.3.5 milestone on May 2, 2026
@igorls merged commit 7ede231 into MemPalace:develop on May 6, 2026
6 checks passed
igorls added a commit that referenced this pull request May 6, 2026
…1282 #1167 #1160

Bundled CHANGELOG entries for the seven Tier-1 PRs merged today, including
the behavior-change call-out for #1167 (KG date validators now reject
non-ISO inputs that previously produced silent empty results).

Labels

area/kg Knowledge graph area/mcp MCP server and tools bug Something isn't working


Development

Successfully merging this pull request may close these issues.

fix(kg): validate ISO-8601 date formats in temporal parameters at MCP boundary

3 participants