
Commit 60f07cb

feat(rate-limiter): add rust-backed execution engine with benchmarks and review fixes
Introduce a Rust-backed rate limiter engine via PyO3 for the hot-path evaluation, keeping Python as the lifecycle/policy owner with full fallback support.

Engine:
- RateLimiterEngine with single evaluate_many() call per hook invocation
- Memory and Redis backends with all three algorithms (fixed_window, sliding_window, token_bucket)
- EVALSHA caching with NOSCRIPT fallback for Redis (REDIS-02)
- Process-global monotonic clock for cross-thread correctness
- Amortized key eviction for idle memory-backend keys (MEM-06)
- Process-unique sorted-set members preventing multi-instance collision
- Saturating arithmetic for token bucket overflow safety

Python integration:
- Rust fast path with async Redis bridge (evaluate_many_async)
- Python fallback preserved when Rust is unavailable
- Pre-parsed rate strings at init for both Rust and Python paths
- Zero-count rate string rejection matching Rust validation
- allow_many() return length consistency fix
- Fail-open and retry_after=min(blocked) policies documented as contracts

Auth:
- Centralised resolve_session_teams() for session-token team resolution
- JWT intersection policy via _narrow_by_jwt_teams()
- Cache stores raw DB teams, not the narrowed intersection
- tenant_id propagation from team_id for by_tenant rate limiting

Tests and benchmarks:
- 167 unit tests (15 new) covering the Rust path, sweep, and identity fallback
- Criterion benchmarks measuring the steady-state under-limit hot path
- Python vs Rust comparison harness (compare_performance.py)
- Three-tier load tests: correctness, scale, Redis capacity
- HTTP 429 classification fix in load test response handling

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
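Two of the engine properties listed above — the process-global monotonic clock and saturating refill for the token bucket — can be illustrated with a minimal sketch. This is a hedged Python illustration of the general algorithm, not the actual RateLimiterEngine API; the function names are invented for the example:

```python
import time


def make_token_bucket(capacity: float, refill_per_sec: float):
    """Illustrative token bucket limiter (not the engine's real API).

    Uses time.monotonic() so wall-clock adjustments cannot skew refill,
    and caps the bucket at `capacity` so long idle periods saturate
    rather than overflow (the Rust engine uses saturating arithmetic
    for the same reason).
    """
    state = {"tokens": capacity, "last": time.monotonic()}

    def allow(cost: float = 1.0) -> bool:
        now = time.monotonic()
        elapsed = now - state["last"]
        state["last"] = now
        # Refill proportionally to elapsed time, saturating at capacity.
        state["tokens"] = min(capacity, state["tokens"] + elapsed * refill_per_sec)
        if state["tokens"] >= cost:
            state["tokens"] -= cost
            return True
        return False

    return allow
```

With a zero refill rate the bucket simply drains: a capacity-2 bucket admits two requests and rejects the third.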
1 parent 4bfc250 commit 60f07cb

25 files changed: 7,593 additions & 64 deletions

.dockerignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -303,6 +303,7 @@ docs/build/
 # PyBuilder
 target/
 **/target/
+**/target/**

 # Jupyter Notebook
 .ipynb_checkpoints
```
Makefile

Lines changed: 34 additions & 1 deletion

```diff
@@ -2322,9 +2322,12 @@ load-test-agentgateway-mcp-server-time: ## Load test external MCP server (loc
 MCP_PROTOCOL_LOCUSTFILE ?= tests/loadtest/locustfile_mcp_protocol.py
 MCP_RATE_LIMITER_LOCUSTFILE ?= tests/loadtest/locustfile_rate_limiter_backend_correctness.py
 MCP_RATE_LIMITER_SCALE_LOCUSTFILE ?= tests/loadtest/locustfile_rate_limiter_scale.py
+MCP_RATE_LIMITER_REDIS_CAPACITY_LOCUSTFILE ?= tests/loadtest/locustfile_rate_limiter_redis_capacity.py
 RL_ALGORITHM ?= fixed_window
 RL_USERS ?= 100
 RL_SPAWN_RATE ?= 10
+RL_REQS_PER_SECOND ?= 0.25
+RL_PROMPT_ID ?=
 MCP_PROTOCOL_HOST ?= http://localhost:4444
 MCP_BENCHMARK_HOST ?= http://localhost:8080
 MCP_BENCHMARK_SERVER_ID ?= 9779b6698cbd4b4995ee04a4fab38737
@@ -2445,7 +2448,7 @@ benchmark-rate-limiter: ## Rate limiter correctness test (1
 # help: benchmark-rate-limiter-scale - Multi-user scale test showing Redis memory divergence across algorithms
 .PHONY: benchmark-rate-limiter-scale
 RL_RUN_TIME ?= 300s
-benchmark-rate-limiter-scale: ## Scale test: 500 unique users, Redis memory timeline per algorithm
+benchmark-rate-limiter-scale: ## Scale test: RL_USERS unique users (default 100), Redis memory timeline per algorithm
 	@echo "📈 Running rate limiter scale test (resource divergence)..."
 	@echo "   Algorithm: $(RL_ALGORITHM) (must match plugins/config.yaml)"
 	@echo "   Users: $(RL_USERS) unique identities (each creates own Redis key)"
@@ -2475,6 +2478,36 @@ benchmark-rate-limiter-scale: ## Scale test: 500 unique users, Red
 	--only-summary \
 	ScaleComparisonUser || true'

+
+# help: benchmark-rate-limiter-redis-capacity - Multi-instance prompt-path concurrency benchmark for Redis rate limiting
+.PHONY: benchmark-rate-limiter-redis-capacity
+benchmark-rate-limiter-redis-capacity: ## Capacity test: 3 gateways + Redis on prompt_pre_fetch path
+	@echo "🚀 Running rate limiter Redis capacity test..."
+	@echo "   Host: $(MCP_BENCHMARK_HOST)"
+	@echo "   Topology: nginx -> 3 gateways -> shared Redis"
+	@echo "   Path: REST /prompts/{id} (prompt_pre_fetch)"
+	@echo "   Users: $(RL_USERS)"
+	@echo "   Spawn rate: $(RL_SPAWN_RATE)/s"
+	@echo "   Pace: $(RL_REQS_PER_SECOND) req/s per user"
+	@echo "   Duration: $(RL_RUN_TIME)"
+	@test -d "$(VENV_DIR)" || $(MAKE) venv
+	@/bin/bash -eu -o pipefail -c 'source $(VENV_DIR)/bin/activate && \
+		LOCUST_LOG_LEVEL=ERROR \
+		RL_USERS=$(RL_USERS) \
+		RL_SPAWN_RATE=$(RL_SPAWN_RATE) \
+		RL_RUN_TIME=$(RL_RUN_TIME) \
+		RL_REQS_PER_SECOND=$(RL_REQS_PER_SECOND) \
+		RL_LIMIT_PER_MIN=$(RL_LIMIT_PER_MIN) \
+		RL_PROMPT_ID=$(RL_PROMPT_ID) \
+		locust -f $(MCP_RATE_LIMITER_REDIS_CAPACITY_LOCUSTFILE) \
+			--host=$(MCP_BENCHMARK_HOST) \
+			--users=$(RL_USERS) \
+			--spawn-rate=$(RL_SPAWN_RATE) \
+			--run-time=$(RL_RUN_TIME) \
+			--headless \
+			--only-summary \
+			CapacityPromptUser || true'
+
 .PHONY: benchmark-mcp-mixed-300
 benchmark-mcp-mixed-300: ## Distributed 300-user mixed MCP benchmark
 	@echo "📊 Running distributed mixed MCP benchmark..."
```
plugins/config.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -214,7 +214,7 @@ plugins:
 author: "Mihai Criveti"
 hooks: ["prompt_pre_fetch", "tool_pre_invoke"]
 tags: ["limits", "throttle"]
-mode: "permissive"
+mode: "enforce"
 priority: 20
 conditions: []
 config:
```