Can I feed 500K documents to rank_bm25? #27

Thanks for this awesome library.

Can rank_bm25 handle a corpus of 500K documents, each around 1000 words long?

I look forward to your feedback. This is the functionality I want to use with rank_bm25:

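```python
from rank_bm25 import BM25Okapi

corpus = [
    "Hello there good man!",
    "It is quite windy in London",
    "How is the weather today?"
]

# BM25Okapi expects each document as a list of tokens.
tokenized_corpus = [doc.split(" ") for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

# The query must be tokenized the same way as the corpus.
query = "windy London"
tokenized_query = query.split(" ")

# One BM25 score per document, in corpus order.
doc_scores = bm25.get_scores(tokenized_query)
# The n highest-scoring documents, returned as the original strings.
result = bm25.get_top_n(tokenized_query, corpus, n=1)

print(result)
```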
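To make the question concrete, here is a rough sketch of how I would measure this myself on synthetic data before loading the real corpus. The helper names `make_synthetic_corpus` and `time_bm25` are hypothetical (they are not part of rank_bm25), and the vocabulary size and uniform word sampling are assumptions that will not match real text, so any timings would only be a rough indication.

```python
import random
import time


def make_synthetic_corpus(n_docs, words_per_doc=1000, vocab_size=50_000, seed=0):
    """Return n_docs tokenized documents sampled uniformly from a made-up vocabulary."""
    rng = random.Random(seed)
    vocab = [f"w{i}" for i in range(vocab_size)]
    return [rng.choices(vocab, k=words_per_doc) for _ in range(n_docs)]


def time_bm25(tokenized_corpus, tokenized_query):
    """Time index construction and a single query.

    Returns (build_seconds, query_seconds).
    """
    # Imported here so the corpus generator can be used without rank_bm25 installed.
    from rank_bm25 import BM25Okapi

    start = time.perf_counter()
    bm25 = BM25Okapi(tokenized_corpus)
    build_seconds = time.perf_counter() - start

    start = time.perf_counter()
    bm25.get_scores(tokenized_query)
    query_seconds = time.perf_counter() - start

    return build_seconds, query_seconds
```

The idea would be to call `time_bm25(make_synthetic_corpus(10_000), ["w1", "w2"])`, then repeat with larger `n_docs` while watching memory usage, to see how build time, query time, and memory grow on the way to 500K documents.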


