Fix HybridRetriever cutoff and KeyError with small collections by tavian-dev · Pull Request #49 · AmenRa/retriv

tavian-dev · 2026-04-02T19:53:15Z

Summary

Fixes two related issues in HybridRetriever:

Issue #29 — KeyError: -1 when collection has fewer than 1000 documents:

search() and msearch() passed a hardcoded cutoff of 1000 to sub-retrievers
When the dense retriever (faiss) is asked for more results than the index contains, it pads the missing result slots with -1 as a placeholder ID
map_internal_ids_to_original_ids then fails with KeyError: -1
Fix: filter out -1 entries in map_internal_ids_to_original_ids as a safety net
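
A minimal sketch of the safety net described above, not the actual diff. The standalone function signature and the `id_mapping` dict are assumptions made for illustration; in retriv the mapping lives on the retriever object.

```python
from typing import Dict, Iterable, List


def map_internal_ids_to_original_ids(
    doc_ids: Iterable[int], id_mapping: Dict[int, str]
) -> List[str]:
    """Translate internal integer IDs to original document IDs.

    Hypothetical standalone version: entries equal to -1 are skipped,
    because faiss uses -1 to fill result slots it could not populate.
    """
    return [id_mapping[doc_id] for doc_id in doc_ids if doc_id != -1]


# Toy collection with three documents (illustrative data only).
id_mapping = {0: "doc_a", 1: "doc_b", 2: "doc_c"}

# What a dense search for k=5 might return on a 3-document index:
# three real hits followed by two -1 placeholders.
padded_ids = [2, 0, 1, -1, -1]

original_ids = map_internal_ids_to_original_ids(padded_ids, id_mapping)
print(original_ids)
```

Without the `doc_id != -1` guard, the lookup `id_mapping[-1]` is what raises `KeyError: -1` on small collections.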

Issue #33 — cutoff not passed to sub-retrievers:

The user's cutoff parameter was only applied after fusion, not to the sub-retriever calls
Sub-retrievers always fetched 1000 results regardless of what the user requested
Fix: use max(cutoff, 1000) so sub-retrievers respect large cutoffs while still fetching enough candidates for fusion quality
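
The cutoff logic can be sketched as follows. This is an illustration under stated assumptions, not the code in hybrid_retriever.py: `hybrid_search`, `sub_retriever_cutoff`, `fake_retriever`, and the reciprocal-rank fusion are hypothetical stand-ins, and only the `max(cutoff, 1000)` rule comes from this PR.

```python
from typing import Callable, Dict, List

MIN_CANDIDATES = 1000  # floor taken from the PR description


def sub_retriever_cutoff(cutoff: int) -> int:
    """Number of results to request from each sub-retriever."""
    return max(cutoff, MIN_CANDIDATES)


def hybrid_search(
    query: str,
    retrievers: List[Callable[[str, int], List[str]]],
    cutoff: int = 100,
) -> List[str]:
    """Toy hybrid search: query each sub-retriever, fuse, then truncate."""
    sub_cutoff = sub_retriever_cutoff(cutoff)
    fused: Dict[str, float] = {}
    for retrieve in retrievers:
        ranked_ids = retrieve(query, sub_cutoff)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Stand-in fusion (sum of reciprocal ranks), not retriv's method.
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / rank
    ranked = sorted(fused, key=fused.get, reverse=True)
    # The user's cutoff is applied only after fusion.
    return ranked[:cutoff]


requested: List[int] = []  # records the k each sub-retriever was asked for


def fake_retriever(query: str, k: int) -> List[str]:
    """Hypothetical retriever over a 5-document collection."""
    requested.append(k)
    return [f"d{i}" for i in range(min(k, 5))]


hits = hybrid_search("example query", [fake_retriever, fake_retriever], cutoff=3)
print(requested, hits)
```

With a small cutoff the sub-retrievers still fetch 1000 candidates, and with a cutoff above 1000 they fetch as many as the user asked for. Since the request can still exceed the collection size, the -1 filtering above remains necessary.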

Changes

base_retriever.py: map_internal_ids_to_original_ids now skips -1 entries
hybrid_retriever.py: search() and msearch() use max(cutoff, 1000) for sub-retrievers

Fixes #29, fixes #33

Two fixes:

1. Sub-retrievers in search() and msearch() used a hardcoded cutoff of 1000. When the collection has fewer documents, the dense retriever (faiss) returns -1 for missing entries, causing KeyError in map_internal_ids_to_original_ids. Now uses max(cutoff, 1000) so the sub-cutoff respects the user's requested cutoff while still fetching enough candidates for fusion.
2. map_internal_ids_to_original_ids now filters out -1 entries as a safety net, since faiss can return -1 when k exceeds the index size.

Fixes AmenRa#29, fixes AmenRa#33

tavian-dev wants to merge 1 commit into AmenRa:main from tavian-dev:fix/hybrid-cutoff
