fix(kg): validate ISO-8601 date formats at MCP boundary#1167
fix(kg): validate ISO-8601 date formats at MCP boundary#1167igorls merged 2 commits intoMemPalace:developfrom
Conversation
32a196d to
e59b526
Compare
|
Hi, sanitize_iso_date() accepts YYYY and YYYY-MM for tool_kg_query(as_of), but KnowledgeGraph.query_entity() compares TEXT dates lexicographically (valid_from <= as_of). Passing partial dates like '2026' or '2026-03' will silently exclude facts whose valid_from is '2026-01-01' / '2026-03-15', undermining the goal of avoiding “silent empty result sets.” Severity: action required | Category: correctness How to fix: Normalize or reject partial dates Agent prompt to fix - you can give this to your LLM of choice:
We noticed a couple of other issues in this PR as well - happy to share if helpful. Found by Qodo code review |
Per qodo-ai review on PR MemPalace#1167: sanitize_iso_date() previously accepted YYYY and YYYY-MM, but KnowledgeGraph.query_entity() compares valid_from/ valid_to TEXT columns lexicographically against as_of. Lexicographic comparison treats '2026-01-01' as greater than '2026' (because '-' > end-of-string), so partial as_of values silently excluded valid facts — re-introducing the silent-empty-results problem this PR was meant to fix. Tighten _ISO_DATE_RE to require YYYY-MM-DD only. Update docstring and error message accordingly. Invert the two test cases that asserted partials were accepted.
|
Addressed in be60e2b — tightened Picked option 1 (reject at boundary) over option 2 (expand to range) because option 2 needs different expansions for Full suite: 1251 passed. Ready for re-review. |
tool_kg_query (as_of), tool_kg_add (valid_from), and tool_kg_invalidate (ended) accepted any string and forwarded it to SQLite without format validation. Parameterized queries prevent SQL injection, but invalid date strings silently produce empty result sets — callers cannot distinguish "no fact at this time" from "your date format was unrecognized." This is especially painful for natural-language LLM callers that synthesize dates like "March 2026" or "Jan 2025". Add sanitize_iso_date() in config.py alongside the other input validators. It accepts YYYY, YYYY-MM, and YYYY-MM-DD forms; passes through None/empty; and raises ValueError with a field-named message on anything else. Call it from the three kg MCP tool wrappers before values reach the storage layer so the caller gets a clear error instead of a silent miss. Closes MemPalace#1164
Per qodo-ai review on PR MemPalace#1167: sanitize_iso_date() previously accepted YYYY and YYYY-MM, but KnowledgeGraph.query_entity() compares valid_from/ valid_to TEXT columns lexicographically against as_of. Lexicographic comparison treats '2026-01-01' as greater than '2026' (because '-' > end-of-string), so partial as_of values silently excluded valid facts — re-introducing the silent-empty-results problem this PR was meant to fix. Tighten _ISO_DATE_RE to require YYYY-MM-DD only. Update docstring and error message accordingly. Invert the two test cases that asserted partials were accepted.
be60e2b to
abe8576
Compare
|
Rebased on latest Verified locally:
Ready for review. |
What and Why
tool_kg_query(as_of),tool_kg_add(valid_from), andtool_kg_invalidate(ended) accepted any string and forwarded it to SQLite without format validation. Parameterized queries prevent SQL injection, but invalid date strings silently produce empty result sets — callers cannot distinguish "no fact at this time" from "your date format was unrecognized." This is especially painful for natural-language LLM callers that synthesize dates like"March 2026"or"Jan 2025".Root Cause
mempalace/mcp_server.py:843-897— the three kg tool wrappers sanitize subject/predicate/object but pass temporal parameters straight through to_kg.Change Summary
mempalace/config.py— newsanitize_iso_date()validator alongside the other input sanitizers. AcceptsYYYY,YYYY-MM, andYYYY-MM-DD; passes throughNone/""; raisesValueErrorwith a field-named message on anything else.mempalace/mcp_server.py— callsanitize_iso_date(...)intool_kg_query,tool_kg_add, andtool_kg_invalidatebefore values reach the storage layer.tests/test_config.py— 13 unit tests forsanitize_iso_date(accepted forms, passthrough, whitespace, rejection of natural-language / US format / invalid month/day / non-string).tests/test_mcp_server.py— 4 integration tests at the MCP boundary: rejection of"Jan 2025"/"March 2026"/"yesterday", and acceptance of partial ISO forms.Test Plan
pytest tests/test_config.py tests/test_mcp_server.py::TestKGTools -v— all green (9/9 kg + 13/13 date validator tests)ruff check— cleanruff format --check— formattedCloses #1164