Thanks for the Vietnamese locale — vi.json itself looks reasonable, but I need the PR scope trimmed before I can merge. Right now it touches 13 files with 465 additions, and most of that isn't Vietnamese-related. Please drop:
Unrelated ruff-reformat churn in 9 files — backends/chroma.py, tests/test_closet_llm.py, tests/test_closets.py, tests/test_convo_miner.py, tests/test_mcp_server.py, tests/test_mcp_stdio_protection.py, tests/test_normalize.py, tests/test_readme_claims.py, tests/test_sweeper.py. Looks like a newer ruff version reformatted them locally. Drop these from this PR; if you want the reformat merged, open a separate PR for it.
API surface change in mempalace/i18n/__init__.py: the PR adds direct_address_pattern (singular) as an alias of direct_address_patterns (plural) in the merged output dict. The loader has only ever emitted the plural key in its output; the singular key belongs to the JSON input schema. Your Vietnamese test handles both the isinstance(p, str) and isinstance(p, re.Pattern) branches, which suggests it was written against the wrong key. Please fix the test to use direct_address_patterns instead of adding the alias. The alias would also conflict with the schema-invariant test added in #1001 (feat(i18n): add entity detection to German, Spanish, and French locales).
vi.json end-of-file newline — missing.
Prune multi-word entries from regex.stop_words: many Vietnamese particles in the list are multi-word (cái gì, cái nào, người ta, etc.). The tokenizer extracts \w{2,} runs, which never contain whitespace, so space-containing entries never fire. Keep single-word tokens only. (See #977, feat(searcher): wire i18n stop words into BM25 tokenizer (#973), for the same issue fixed on ja / zh-CN.)
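To illustrate the stop-word item, here is a minimal sketch of why space-containing entries are dead weight. It assumes the tokenizer behaves roughly like `re.findall(r"\w{2,}", text)`; the helper names `tokenize` and `prune_stop_words` are hypothetical and not taken from the mempalace code, the NFC normalization is only there to keep the example robust, and the single-word entries in the sample list are illustrative rather than copied from vi.json.

```python
import re
import unicodedata

# Assumed tokenizer shape: runs of two or more word characters.
# This is a stand-in, not the project's actual tokenizer.
TOKEN_RE = re.compile(r"\w{2,}")


def tokenize(text):
    """Lowercase, normalize to NFC, and extract \\w{2,} tokens."""
    return TOKEN_RE.findall(unicodedata.normalize("NFC", text.lower()))


def prune_stop_words(stop_words):
    """Keep only entries the tokenizer could ever emit as one token."""
    kept = []
    for word in stop_words:
        word = unicodedata.normalize("NFC", word)
        if TOKEN_RE.fullmatch(word):
            kept.append(word)
    return kept


# First three entries come from the review; the last two are illustrative.
stop_words = ["cái gì", "cái nào", "người ta", "và", "của"]

tokens = tokenize("người ta nói cái gì")
pruned = prune_stop_words(stop_words)

# No token contains a space, so a multi-word stop word can never match one.
unmatched = [w for w in stop_words if " " in w and w not in tokens]
```

Under these assumptions, every multi-word entry ends up in `unmatched`, and `pruned` keeps only the single-word entries, which is the list the locale file should ship.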
Once this is scoped to mempalace/i18n/vi.json + test additions in tests/test_i18n.py only, it'll be quick to review and merge.
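For the direct_address_patterns item, a test shaped roughly like the sketch below would avoid the alias entirely. Everything here other than the two key names is an assumption: `fake_merged_config` is a hypothetical stand-in for whatever the loader returns, the element type (compiled patterns) is a guess, and the Vietnamese greeting pattern is illustrative only.

```python
import re


def fake_merged_config():
    """Hypothetical stand-in for the loader's merged output dict.

    Assumption: the output exposes only the plural key, holding a list
    of compiled patterns. The real loader may differ.
    """
    return {
        "direct_address_patterns": [
            re.compile(r"^chào\s+(\w+)", re.IGNORECASE),  # illustrative pattern
        ],
    }


def test_vi_direct_address_uses_plural_key():
    cfg = fake_merged_config()

    # The singular key belongs to the JSON input schema, not the output.
    assert "direct_address_pattern" not in cfg

    patterns = cfg["direct_address_patterns"]
    assert isinstance(patterns, list) and patterns

    # One element type only, so no isinstance(str) / isinstance(re.Pattern) branching.
    match = next(
        (m for p in patterns if (m := p.search("Chào Lan, hôm nay thế nào?"))),
        None,
    )
    assert match is not None
    assert match.group(1) == "Lan"
```

Reading the plural key directly keeps the test aligned with what the loader emits, so no change to mempalace/i18n/__init__.py is needed.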
What does this PR do?
direct_address_pattern (legacy/test usage)
direct_address_patterns (current internal usage)
tests/test_i18n.py:
tests/test_i18n_lang_case.py intact.
How to test
Expected result from latest run:
python -m pytest tests/ -v -> 1044 passed, 1 skipped, 106 deselected
ruff check . -> All checks passed!
Checklist
(python -m pytest tests/ -v)
(ruff check .)