en/learn/faq.md (1 addition, 1 deletion)
@@ -87,7 +87,7 @@ of a double. This can happen in two cases:
 - The [ranking](../basics/ranking.html) expression used a feature which became `NaN` (Not a Number). For example, `log(0)` would produce
 -Infinity. One can use [isNan](../reference/ranking/ranking-expressions.html#isnan-x) to guard against this.
-- Surfacing low scoring hits using [grouping](../querying/grouping.html), that is, rendering low ranking hits with `each(output(summary()))` that are outside of what Vespa computed and caches on a heap. This is controlled by the [keep-rank-count](../reference/schemas/schemas.html#keep-rank-count).
+- Surfacing low scoring hits using [grouping](../querying/grouping.html), that is, rendering low ranking hits with `each(output(summary()))` that are outside what Vespa computed and caches on a heap. This is controlled by the [total-keep-rank-count](../reference/schemas/schemas.html#total-keep-rank-count) parameter.
 
 ### How to pin query results?
 To hard-code documents to positions in the result set,
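To illustrate the `isNan` guard mentioned in the hunk above, a first-phase expression can fall back to a constant when a feature evaluates to `NaN`. This is a hypothetical schema fragment: the profile name and the `popularity` attribute are made up for the example.

```
rank-profile guarded {
    first-phase {
        # log(0) produces -Infinity and downstream math can yield NaN;
        # isNan(x) returns 1 when x is NaN, so fall back to 0.0 in that case
        expression: if(isNan(log(attribute(popularity))) == 1, 0.0, log(attribute(popularity)))
    }
}
```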
-It is worth noting that parameters such as `targetHits` (for the match phase) and `rerank-count`
-(for first and second phase) are applied **per content node**. Also note that the stateless container nodes can
-also be [scaled independently](../../performance/sizing-search.html) to handle increased query load.
+The stateless container nodes can
+be [scaled independently](../../performance/sizing-search.html) to handle increased query load.
 
 ## Configuring match-phase (retrieval)
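Scaling container and content clusters independently, as the added lines describe, is done in `services.xml`. The sketch below is illustrative only: cluster ids, document type, and node counts are made-up values.

```
<!-- sketch: scale the stateless container cluster independently
     of the content cluster (all names and counts are illustrative) -->
<container id="query" version="1.0">
    <search/>
    <nodes count="4"/>   <!-- add container nodes to absorb query load -->
</container>
<content id="docs" version="1.0">
    <redundancy>2</redundancy>
    <documents>
        <document type="doc" mode="index"/>
    </documents>
    <nodes count="2"/>   <!-- content nodes sized for data volume, not query rate -->
</content>
```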
@@ -1380,8 +1379,8 @@ We run the evaluation script on a set of unseen test queries, and get the following
 ```
 
 For the first phase ranking, we care most about recall, as we just want to make sure that the candidate documents are
-ranked high enough to be included in the second-phase ranking. (the default number of documents that will be exposed to
-second-phase is 10 000, but can be controlled by the `rerank-count` parameter).
+ranked high enough to be included in the second-phase ranking. The number of documents to be reranked in second-phase
+in total over all content nodes is controlled by the `total-rerank-count` parameter.
 
 We can see that our results are already very good. This is of course due to the fact that we have a small, synthetic dataset.
 In reality, you should align the metric expectations with your dataset and test queries.
@@ -1392,7 +1391,7 @@ within your latency budget, as you want some headroom for second-phase ranking.
 
 ## Second-phase ranking
 
 For the second-phase ranking, we can afford to use a more expensive ranking expression, since we will only run it
-on the top-k documents from the first-phase ranking (defined by the `rerank-count` parameter, which defaults to 10,000 documents).
+on the top-k documents from the first-phase ranking (decided by the `total-rerank-count` parameter).
 
 This is where we can significantly improve ranking quality by using more sophisticated models and features that would
 be too expensive to compute for all matched documents.
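A cheap first phase plus an expensive, capped second phase might look like the following sketch. The profile name, fields, and model file are illustrative, and `total-rerank-count` is used here as the diff describes it (a total over all content nodes); 1000 is a made-up value.

```
rank-profile rag inherits default {
    first-phase {
        # cheap expression, evaluated for every matched document
        expression: bm25(title) + bm25(body)
    }
    second-phase {
        # cap on documents reranked, summed over all content nodes (per the diff)
        total-rerank-count: 1000
        # more expensive model, only evaluated for the surviving top hits
        expression: xgboost("model.json")
    }
}
```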
@@ -1589,7 +1588,7 @@ vespa query \
 **Performance monitoring:**
 
 * Monitor latency impact of second-phase ranking
-* Adjust `rerank-count` based on quality vs. performance trade-offs
+* Adjust `total-rerank-count` based on quality vs. performance trade-offs
 * Consider using different models for different query types or use cases
 
 The second-phase ranking represents a crucial step in building high-quality RAG applications,
@@ -1598,7 +1597,7 @@ providing the precision needed for effective LLM context while maintaining reasonable
 
 ## (Optional) Global-phase ranking
 
 We also have the option of configuring [global-phase](../../reference/schemas/schemas.html#globalphase-rank) ranking, which can rerank the top k
-(as set by `rerank-count` parameter) documents from the second-phase ranking.
+(as set by `total-rerank-count` parameter) documents from the second-phase ranking.
 
 Common options for global-phase are [cross-encoders](../../ranking/cross-encoders.html) or another GBDT model, trained for
 better separating top ranked documents on objectives such as [LambdaMART](https://xgboost.readthedocs.io/en/latest/tutorials/learning_to_rank.html). For RAG applications,
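A global-phase stage could be layered on top of an existing profile roughly as follows. This is a sketch under assumptions: the profile names, model file, and the `rerank-count` value of 100 are all made up for illustration.

```
rank-profile reranked inherits rag {
    global-phase {
        # runs on the stateless container over the merged top hits
        # from all content nodes; 100 is an illustrative value
        rerank-count: 100
        # e.g. a GBDT reranker trained on a learning-to-rank objective
        expression: xgboost("reranker.json")
    }
}
```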