---
# Copyright Vespa.ai. All rights reserved.
title: "Ranking"
redirect_from:
- /en/ranking.html
---

<p><i>Ranking</i> in Vespa is the computation done on matching documents during query execution.
These computations are specified as <a href="../ranking/ranking-expressions-features.html">ranking functions</a> in
<em>rank profiles</em> in the schema.</p>

<p>The special function named <code>first-phase</code> determines the initial <i>rank</i> of the matches,
so that the top k can be selected as the response to a query:</p>

<pre>
  rank-profile my-rank-profile {
    first-phase {
      expression: 0.7 * bm25(text) + 0.3 * attribute(popularity)
    }
  }
</pre>


<h2 id="ranking-functions-and-features">Ranking functions and features</h2>

<p>The ranking functions can be any mathematical function combining rank features,
including <a href="../ranking/tensor-user-guide.html#ranking-with-tensors">tensor math</a> and
<a href="#machine-learned-model-inference">machine-learned models</a>.</p>

<p>The rank features these functions can use fall into three categories:</p>

<ul>
    <li><b>Document features</b>, using <code>attribute(fieldName)</code>: Any document field which has <code>attribute</code> in the indexing statement.</li>
    <li><b>Query features</b>, aka inputs, using <code>query(name)</code>: Any value sent with the query as an input. When these are tensors
    (not scalars) they must be declared as an input in the rank profile.</li>
    <li><b>Match features</b>: Built-in features which say something about how well a query and a document match, e.g. <code>bm25</code> or <code>closeness</code>.</li>
</ul>

<p>Refer to the <a href="../reference/ranking/rank-features.html">full list of rank features</a>.</p>
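
<p>As a minimal sketch of where document and match features come from, the fields below would make
<code>attribute(popularity)</code> and <code>bm25(text)</code> available to a ranking expression.
The schema name <code>myapp</code>, the field names, and the field types are assumptions chosen for illustration:</p>

<pre>
schema myapp {

    document myapp {

        # Hypothetical text field: indexed for matching, with bm25 enabled
        field text type string {
            indexing: index | summary
            index: enable-bm25
        }

        # Hypothetical numeric field: "attribute" makes it usable as attribute(popularity)
        field popularity type float {
            indexing: attribute | summary
        }

    }

}
</pre>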

<p>Query features (inputs) that are tensors must be declared in the rank profile:</p>

<pre>
  rank-profile my-rank-profile {
    inputs {
      query(user_context) tensor&lt;float&gt;(x[3])
    }
    first-phase {
      expression: bm25(text) + sum(query(user_context) * attribute(document_context))
    }
  }
</pre>

<p>This is also how the type of query vectors in vector search is declared.</p>
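
<p>For the expression above to work, <code>document_context</code> must be a tensor attribute whose dimensions
match the query input. With both tensors typed as <code>x[3]</code>, the product is computed element by element and
<code>sum</code> reduces it to a single number, i.e. a dot product. A possible field definition, assumed here for
illustration since the field is not shown on this page:</p>

<pre>
schema myapp {

    document myapp {

        # Hypothetical field: same type as query(user_context) so the two can be multiplied
        field document_context type tensor&lt;float&gt;(x[3]) {
            indexing: attribute
        }

    }

}
</pre>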


<h2 id="rank-profiles">Rank profiles</h2>

<p>A schema can have any number of rank profiles specifying computations and ranking
for different use cases, experiments, and so on. Queries select one using the
<a href="../reference/api/query.html#ranking.profile">ranking.profile</a> parameter
in requests or a <a href="../querying/query-profiles.html">query profile</a>.
If no profile is specified in the request, the one called <code>default</code> is used, and
if that isn't specified in the schema, a default one ranking by the <a href="../ranking/nativerank.html">nativeRank</a>
feature is used. Another built-in rank profile <code>unranked</code> is also always available.
Specifying this boosts serving performance in queries which do not need ranking because ordering is not important or
<a href="../reference/querying/sorting-language.html">explicit field sorting</a> is used.</p>

<p>To avoid very long schema files, rank profiles can also be specified in their own files in the
application package, named
<code>schemas/[schema-name]/[profile-name].profile</code>.
See the <a href="../reference/schemas/schemas.html#rank-profile">schema reference</a> for documentation
of everything a rank profile can contain.</p>

<p>Rank profiles can inherit other profiles to avoid duplication, as in
<code>rank-profile myProfile inherits default, another</code>.</p>
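
<p>A small sketch of inheritance, using hypothetical profile and function names: the child profile reuses the
function defined in its parent and only overrides <code>first-phase</code>.</p>

<pre>
  # Hypothetical parent profile defining a reusable function
  rank-profile base {
    function textScore() {
      expression: bm25(text)
    }
    first-phase {
      expression: textScore()
    }
  }

  # Inherits textScore() from base and replaces only the first-phase expression
  rank-profile with-popularity inherits base {
    first-phase {
      expression: 0.7 * textScore() + 0.3 * attribute(popularity)
    }
  }
</pre>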


<h2 id="phased-ranking">Phased ranking</h2>

<p>In addition to <code>first-phase</code>, which specifies the initial ranking applied
to all matching documents during matching, rank profiles can also specify functions
that <i>rerank</i> the top k documents before the final result is returned.
This is useful to direct more computation towards the most promising candidate documents:</p>

<pre>
schema myapp {

    rank-profile my-rank-profile {

        first-phase {
            expression {
                attribute(quality) * freshness(timestamp) + bm25(title)
            }
        }

        second-phase {
            expression: xgboost(my_xgboost_reranker)
            total-rerank-count: 1000 # Over all nodes
        }

        global-phase {
          expression: sum(onnx(my_large_onnx_model))
          rerank-count: 20
        }

    }

}
</pre>

<p>The <code>second-phase</code> expression is executed locally on the content node, using local data.
This is efficient on thousands of candidates. The <code>global-phase</code> expression
is executed on the global result set after merging, on the container node, and is best used for
very expensive, high-quality final reranking.
See <a href="../ranking/phased-ranking.html">phased ranking</a> for details.</p>


<h2 id="ranking-functions">Ranking functions</h2>

<p>A rank profile can define any number of functions which can be used in other ranking
expressions or (when taking no arguments) be returned with results.</p>

<pre>
schema myapp {

    rank-profile my-rank-profile {

        function clickProbability() {
            expression: xgboost('myClickModel')
        }

        function textRanking(field) {
            expression: 0.7 * bm25(field) + 0.3 * nativeProximity(field)
        }

        first-phase {
            expression {
                0.1 * clickProbability() +
                0.2 * closeness(embeddingsField) +
                0.3 * textRanking(titleField) +
                0.4 * textRanking(bodyField)
            }
        }

        summary-features {
            clickProbability() # Returned with every matched document
        }

    }

}
</pre>

<p>Read more in <a href="../ranking/ranking-expressions-features.html">ranking expressions and functions</a>.</p>


<h2 id="layered-ranking">Layered ranking</h2>

<p>In addition to ranking <i>documents</i>, a rank profile can also rank and select array elements within documents.
This is most commonly used to select individual chunks within documents in RAG applications; see
<a href="../rag/working-with-chunks.html#layered-ranking-selecting-chunks-to-return">working with chunks</a>.</p>


<h2 id="machine-learned-model-inference">Machine-learned model inference</h2>

<p>The best quality is achieved by learning relevance functions using machine learning from a training set.
Vespa lets you use machine-learned models in these formats in distributed ranking (first- and second-phase):</p>

<ul>
    <li><a href="../ranking/onnx">ONNX</a>, allowing importing models from ML frameworks like TensorFlow, PyTorch and scikit-learn.</li>
    <li><a href="../ranking/xgboost">XGBoost</a></li>
    <li><a href="../ranking/lightgbm">LightGBM</a></li>
</ul>

<p>As these are exposed as rank features, they can be used in ranking expressions exactly like any other rank feature.</p>
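
<p>As a rough sketch of what using a model as a rank feature can look like, the profile below declares an ONNX model
and uses its output in a second-phase expression. The model name, the file path, and the input and output names
are all hypothetical; the real names depend on the exported model, so treat this as an illustration and
see the <a href="../ranking/onnx">ONNX</a> documentation for the authoritative syntax:</p>

<pre>
  rank-profile my-onnx-profile {

    # Hypothetical model declaration: names and path are assumptions
    onnx-model my_model {
      file: models/my_model.onnx
      input "model_input": attribute(document_context)
      output "model_output": score
    }

    first-phase {
      expression: bm25(text)
    }

    # The model output is used like any other rank feature
    second-phase {
      expression: sum(onnx(my_model).score)
    }

  }
</pre>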

<br/>
<h4>Next: <a href="operations.html">Operations</a></h4>