fix: add prerequisites and correct constant key in rate limiting examples

- Add key-auth plugin to Example 1 (per-tier), Example 3 (limit-conn),
  and the combined example: ${consumer_name} requires an authenticated
  consumer; without it the rule is silently skipped and APISIX returns 500
- Replace bare 'global' constant key with '${http_host ?? global}' in
  Example 3 (limit-conn), the ai-rate-limiting example, and the combined
  example: plain constant strings produce n_resolved=0 and are skipped
  at runtime due to an APISIX bug (filed at apache/apisix#13180)
- Add prerequisite note to the ai-rate-limiting section: the plugin is
  silently inactive without ai-proxy, which sets picked_ai_instance_name
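
For anyone reproducing the key-auth prerequisite from the first bullet: a consumer has to exist before `${consumer_name}` can resolve. A minimal sketch of such a consumer object follows. The username `alice` and the key value are hypothetical placeholders, not taken from the blog post, and the object is assumed to be submitted to the Admin API consumers endpoint.

```json
{
  "username": "alice",
  "plugins": {
    "key-auth": {
      "key": "alice-example-key"
    }
  }
}
```

With the plugin's default settings, a request should authenticate as this consumer by sending the key in the `apikey` header.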

blog/en/blog/2026/04/14/apisix-3.16-dynamic-rate-limiting.md
18 additions & 3 deletions
@@ -110,10 +110,13 @@ Variable support lets you pull rate limiting parameters directly from the reques
 
 Suppose your authentication middleware injects an `X-Rate-Quota` header based on the user's subscription tier:
 
+> **Prerequisite**: this example uses `${consumer_name}` as the rate limit key, which requires an authentication plugin on the route so that the consumer identity is available at request time. Add `key-auth` (or any other auth plugin) to the route plugins and create a consumer before testing.
+
 ```json
 {
   "uri": "/api/v1/*",
   "plugins": {
+    "key-auth": {},
     "limit-count": {
       "rules": [
         {
@@ -178,10 +181,15 @@ Tenant A calling `/api/v1/users` and Tenant B calling the same endpoint get inde
 
 The `limit-conn` plugin also supports rules and variables, enabling dynamic concurrency control:
 
+> **Prerequisite**: the per-consumer rule uses `${consumer_name}`, which requires an authentication plugin. Add `key-auth` to the route plugins and create a consumer before testing.
+>
+> **Note**: plain constant strings (e.g. `"global"`) are not supported as `key` values in the `rules` array due to a current APISIX limitation — the rule is silently skipped at runtime. Use a variable expression that always resolves instead, such as `"${http_host ?? global}"`. A fix has been filed at [apache/apisix#13180](https://github.com/apache/apisix/issues/13180).
+
 ```json
 {
   "uri": "/api/v1/inference",
   "plugins": {
+    "key-auth": {},
     "limit-conn": {
       "default_conn_delay": 0.1,
       "rules": [
@@ -193,7 +201,7 @@ The `limit-conn` plugin also supports rules and variables, enabling dynamic conc
         {
           "conn": 100,
           "burst": 20,
-          "key": "global"
+          "key": "${http_host ?? global}"
         }
       ],
       "rejected_code": 503
@@ -212,6 +220,10 @@ This limits each consumer to 5 concurrent connections while capping the total at
 
 ## AI Rate Limiting: Token Budget Management
 
+> **Prerequisite**: `ai-rate-limiting` must be used alongside the `ai-proxy` plugin. Without `ai-proxy`, the plugin is silently inactive — it relies on `ai-proxy` to populate `ctx.picked_ai_instance_name` and `ctx.ai_token_usage` at runtime. The configuration below shows `ai-rate-limiting` in isolation for clarity; in production, add your `ai-proxy` configuration to the same route.
+>
+> **Note**: the global cap rule uses `"${http_host ?? global}"` instead of a plain `"global"` string. See the note in Example 3 for the reason.
+
 For AI gateway use cases, the `ai-rate-limiting` plugin combines multiple rules with variable support for fine-grained token budget control:
 
 ```json
@@ -236,7 +248,7 @@ For AI gateway use cases, the `ai-rate-limiting` plugin combines multiple rules
         {
           "count": 1000000,
           "time_window": 60,
-          "key": "global",
+          "key": "${http_host ?? global}",
           "header_prefix": "global"
         }
       ],
@@ -264,10 +276,13 @@ As AI API costs scale directly with token usage, this kind of layered budget con
 
 The real power emerges when you combine both features. Here is a complete example for an API platform with tiered pricing:
 
+> **Prerequisite**: add an authentication plugin (e.g. `key-auth`) to the route so that `${consumer_name}` is populated at runtime. The global cap rule uses `"${http_host ?? global}"` instead of a plain `"global"` string — see the note in Example 3 for the reason.
+
 ```json
 {
   "uri": "/api/v1/*",
   "plugins": {
+    "key-auth": {},
     "limit-count": {
       "rules": [
         {
@@ -285,7 +300,7 @@ The real power emerges when you combine both features. Here is a complete exampl