Congratulations on the initiative; your project has been quite useful in my work.
I would like to suggest adding a function for the BM25F method, which takes the relevance of different document fields into account before applying the BM25 saturation function.
This avoids the dangerous over-estimation of term importance that occurs when BM25 scores from different fields are combined linearly [1]. It could therefore make your project more robust for ranking structured text.
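To make the request concrete, here is a minimal sketch of the idea in plain Python. It is not based on your project's API: the function name `bm25f_scores`, its parameters, and the document layout (a dict of field name to token list) are all hypothetical. It also assumes a single `b` shared by every field, whereas [1] allows a separate value per field. The key point is that the weighted per-field term frequencies are summed first and the saturation is applied once, to the combined value.

```python
import math
from collections import Counter


def bm25f_scores(query, docs, weights, k1=1.2, b=0.75):
    """Score tokenized, multi-field documents with a BM25F-style formula.

    query:   list of query tokens
    docs:    list of dicts mapping field name -> list of tokens
    weights: dict mapping field name -> field weight
    All names here are illustrative, not part of any existing library.
    """
    if not docs:
        return []
    n_docs = len(docs)

    # Average length of each field over the collection.
    avg_len = {f: sum(len(d.get(f, [])) for d in docs) / n_docs for f in weights}

    # Document frequency: number of documents containing the term in any field.
    df = Counter()
    for d in docs:
        seen = set()
        for f in weights:
            seen.update(d.get(f, []))
        df.update(seen)

    scores = []
    for d in docs:
        counts = {f: Counter(d.get(f, [])) for f in weights}
        score = 0.0
        for term in query:
            if term not in df:
                continue
            # Combine length-normalized, weighted term frequencies across
            # fields BEFORE saturation.
            pseudo_tf = 0.0
            for f, w in weights.items():
                tf = counts[f][term]
                if tf == 0:
                    continue
                norm = 1.0 - b + b * len(d[f]) / avg_len[f]
                pseudo_tf += w * tf / norm
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturation is applied once, to the combined frequency.
            score += idf * pseudo_tf / (k1 + pseudo_tf)
        scores.append(score)
    return scores


docs = [
    {"title": ["bm25", "ranking"], "body": ["notes", "on", "ranking", "functions"]},
    {"title": ["cooking"], "body": ["bm25", "appears", "once", "here"]},
    {"title": ["gardening"], "body": ["tomato", "plants"]},
]
weights = {"title": 3.0, "body": 1.0}
scores = bm25f_scores(["bm25"], docs, weights)
```

With these weights, a match in the title counts for more than a match in the body, but a term's contribution stays bounded by its IDF no matter how many fields it appears in.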
References:
[1] https://trec.nist.gov/pubs/trec13/papers/microsoft-cambridge.web.hard.pdf
[2] https://www.researchgate.net/publication/221613382_Simple_BM25_extension_to_multiple_weighted_fields
Thank you in advance.