
Commit fe878fd

test(data): pin tokenization-through-delimiters as design intent
The reviewer noted that TestSearchDocumentsBadSyntaxGraceful does not catch the case where a malformed input like "(kitchen" normalizes to the inner term "kitchen". In our model that is the intended behavior: the phrase-wrap escape relies on the FTS5 tokenizer to extract usable terms from whatever surrounding punctuation the user typed mid-keystroke.

Add an explicit positive test asserting that "(kitchen", "kitchen)", '"kitchen', and "kitchen*" all match a document containing "kitchen", so that a future reader doesn't try to "fix" this by routing malformed input to no-results.
1 parent 8a46ec2 commit fe878fd
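
The commit message leans on two steps: the phrase-wrap escape and the tokenizer's handling of stray punctuation. The sketch below illustrates both under stated assumptions and is not the repository's implementation. `phraseWrap` and `roughTokens` are hypothetical names. The wrapping form (one quoted FTS5 string followed by `*`) is a guess at how the escape might look, and `roughTokens` only approximates SQLite's default `unicode61` tokenizer by splitting on anything that is not a letter or digit.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// phraseWrap is a hypothetical helper, not the project's actual code.
// It turns raw user input into a single quoted FTS5 string followed by
// a prefix marker. Embedded double quotes are doubled, which is how
// FTS5 escapes a quote inside a string.
func phraseWrap(q string) string {
	return `"` + strings.ReplaceAll(q, `"`, `""`) + `" *`
}

// roughTokens approximates what a unicode61-style tokenizer does with
// the contents of the phrase: runs of letters and digits become tokens,
// every other character acts as a separator, and tokens are lowercased.
// The real tokenizer does more (for example, diacritic folding).
func roughTokens(s string) []string {
	fields := strings.FieldsFunc(s, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for i, f := range fields {
		fields[i] = strings.ToLower(f)
	}
	return fields
}

func main() {
	// The same four inputs the new test exercises.
	for _, q := range []string{`(kitchen`, `kitchen)`, `"kitchen`, `kitchen*`} {
		fmt.Printf("input=%s match=%s tokens=%v\n", q, phraseWrap(q), roughTokens(q))
	}
}
```

Under these assumptions each of the four inputs reduces to the single token `kitchen`, which is why the new test expects a hit on "Kitchen Renovation" rather than an empty result.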

1 file changed

Lines changed: 26 additions & 0 deletions

internal/data/fts_test.go

@@ -171,6 +171,32 @@ func TestSearchDocumentsUpdateReflected(t *testing.T) {
 	assert.Empty(t, results)
 }
 
+// TestSearchDocumentsMalformedTokenizes pins the design intent of the
+// phrase-wrap escape: when a user types something with stray delimiters
+// like "(kitchen" or "kitchen)", the FTS5 tokenizer extracts the inner
+// word and the prefix match still works. This is desirable for type-as-
+// you-go search where partial input should still surface relevant
+// results, not an accidental matching bug.
+func TestSearchDocumentsMalformedTokenizes(t *testing.T) {
+	t.Parallel()
+	store := newTestStore(t)
+
+	require.NoError(t, store.CreateDocument(&Document{
+		Title:         "Kitchen Renovation",
+		FileName:      "k.pdf",
+		ExtractedText: "plumber notes",
+	}))
+
+	for _, q := range []string{`(kitchen`, `kitchen)`, `"kitchen`, `kitchen*`} {
+		t.Run(q, func(t *testing.T) {
+			results, err := store.SearchDocuments(q)
+			require.NoError(t, err)
+			require.Len(t, results, 1, "delimiters around %q should not block tokenization", q)
+			assert.Equal(t, "Kitchen Renovation", results[0].Title)
+		})
+	}
+}
+
 // TestSearchDocumentsBadSyntaxGraceful verifies that inputs which would
 // be malformed FTS5 expressions if passed verbatim do not error out and
 // also do not accidentally match real documents. A document is inserted
