
Commit 09771d7

WIP - Metadata filter README
1 parent 21dade1 commit 09771d7

1 file changed (+65, -0 lines changed)

README.md

Lines changed: 65 additions & 0 deletions
@@ -367,6 +367,71 @@ $db->save();

Deleted documents are soft-deleted from the HNSW graph (kept for connectivity but excluded from results) and fully removed from the BM25 index. Document files are deleted from disk immediately.

## Metadata filtering

Filter search results by document metadata. Filters can be combined with any search method: vector, text, or hybrid.

### Creating filters

Use the `MetadataFilter` value object. All nine operators are supported:

```php
use PHPVector\MetadataFilter;

// Equality / inequality
$filter = MetadataFilter::eq('status', 'published');
$filter = MetadataFilter::neq('type', 'draft');

// Comparison operators
$filter = MetadataFilter::lt('price', 100);
$filter = MetadataFilter::lte('price', 100);
$filter = MetadataFilter::gt('rating', 4.0);
$filter = MetadataFilter::gte('rating', 4.0);

// Set membership
$filter = MetadataFilter::in('category', ['tech', 'science', 'engineering']);
$filter = MetadataFilter::notIn('status', ['deleted', 'archived']);

// Array containment: checks whether the metadata array contains the value
$filter = MetadataFilter::contains('tags', 'php'); // matches ['tags' => ['php', 'vector']]
```

### Filtering search results

Pass filters to any search method. Multiple filters are ANDed together by default.
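
The AND rule can be pictured with a small standalone sketch. It does not call PHPVector: the `$eq`, `$gte`, and `$matchesAll` closures are hypothetical stand-ins written only to illustrate the semantics described above, and the library's actual evaluation code may differ.

```php
// Illustration only: filters are modelled as plain closures, not PHPVector objects.
$eq = fn(string $key, mixed $value): Closure =>
    fn(array $meta): bool => array_key_exists($key, $meta) && $meta[$key] === $value;
$gte = fn(string $key, int|float $value): Closure =>
    fn(array $meta): bool => isset($meta[$key]) && $meta[$key] >= $value;

// A document passes only if every top-level filter accepts its metadata (logical AND).
$matchesAll = function (array $meta, array $filters): bool {
    foreach ($filters as $filter) {
        if (!$filter($meta)) {
            return false;
        }
    }
    return true;
};

$filters = [$eq('status', 'published'), $gte('rating', 4.0)];

$docs = [
    'a' => ['status' => 'published', 'rating' => 4.5],
    'b' => ['status' => 'published', 'rating' => 3.0],
    'c' => ['status' => 'draft',     'rating' => 4.8],
];

// Keep the ids of documents whose metadata satisfies all filters.
$kept = array_keys(array_filter($docs, fn(array $meta): bool => $matchesAll($meta, $filters)));
// $kept is ['a']: 'b' fails the rating filter, 'c' fails the status filter.
```

Each added filter can only narrow the result set, which is why AND is a reasonable default for combining independent conditions.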

### OR groups (nested arrays)

Wrap filters in a nested array to create OR groups. Filters at the top level are ANDed; filters inside a nested array are ORed.
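
As a rough illustration of the nesting rule, the sketch below evaluates a filter list in which one entry is a nested array. It is self-contained and does not use PHPVector: `$eq` and `$matches` are hypothetical helpers, and the real library may implement this differently.

```php
// Illustration only: filters are plain closures here, not PHPVector objects.
$eq = fn(string $key, mixed $value): Closure =>
    fn(array $meta): bool => array_key_exists($key, $meta) && $meta[$key] === $value;

// Top-level entries are ANDed; an entry that is itself an array is an OR group.
$matches = function (array $meta, array $filters): bool {
    foreach ($filters as $entry) {
        if (is_array($entry)) {
            // OR group: at least one alternative must accept the metadata.
            $anyHit = false;
            foreach ($entry as $alternative) {
                if ($alternative($meta)) {
                    $anyHit = true;
                    break;
                }
            }
            if (!$anyHit) {
                return false;
            }
        } elseif (!$entry($meta)) {
            return false;
        }
    }
    return true;
};

// status = 'published' AND (category = 'tech' OR category = 'science')
$filters = [
    $eq('status', 'published'),
    [$eq('category', 'tech'), $eq('category', 'science')],
];

$techHit   = $matches(['status' => 'published', 'category' => 'tech'], $filters);   // true
$sportsHit = $matches(['status' => 'published', 'category' => 'sports'], $filters); // false
$draftHit  = $matches(['status' => 'draft', 'category' => 'science'], $filters);    // false
```

With `MetadataFilter` objects the list would presumably have the same shape: single filters at the top level and an inner array wherever alternatives are allowed.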

### Metadata-only search

Query documents by metadata alone, without a vector or text query:

```php
// Find all documents matching filters
$results = $db->metadataSearch(
    filters: [MetadataFilter::eq('status', 'published')],
);
```

### Strict type comparison

Metadata filtering uses **strict type comparison** (PHP `===`). This means:
- String `'5'` does NOT match integer `5`
- Float `1.0` does NOT match integer `1`

```php
// Document with metadata: ['year' => 2024] (integer)
MetadataFilter::eq('year', 2024);    // ✓ matches
MetadataFilter::eq('year', '2024');  // ✗ does not match (string vs int)

// Document with metadata: ['rating' => 4.5] (float)
MetadataFilter::gt('rating', 4);     // ✓ matches (4.5 > 4)
MetadataFilter::eq('rating', 4.5);   // ✓ matches
MetadataFilter::eq('rating', '4.5'); // ✗ does not match (string vs float)
```
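
A practical consequence is that values taken from request parameters, which typically arrive as strings, need casting before they go into a filter. The sketch below uses only plain PHP comparisons to show the difference; the `$stored` array and `$yearFromRequest` variable are made-up illustrations, not part of PHPVector.

```php
// Plain PHP comparisons, independent of PHPVector.
$stored = ['year' => 2024, 'rating' => 4.5];

// Request parameters typically arrive as strings.
$yearFromRequest = '2024';

$strictRaw  = $stored['year'] === $yearFromRequest;       // false: int vs string
$strictCast = $stored['year'] === (int) $yearFromRequest; // true: int vs int
$looseRaw   = $stored['year'] == $yearFromRequest;        // true: == coerces types

$floatVsInt = 1.0 === 1;                                  // false: float vs int
$ordering   = $stored['rating'] > 4;                      // true: 4.5 > 4
```

Casting at the point where the filter is built, for example `MetadataFilter::eq('year', (int) $yearFromRequest)`, keeps the filter value in the same type as the stored metadata.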

## Custom tokenizer

Implement `TokenizerInterface` to plug in stemming, lemmatization, or any language-specific logic.
