
fix(prometheus): flush expired slab memory in exporter timer #13195

Open
sihyeonn wants to merge 1 commit into apache:master from sihyeonn:fix/prometheus-flush-expired-slab

Conversation

@sihyeonn
Contributor

Summary

When metrics are configured with an expire value, nginx's slab allocator marks entries as logically expired but does not automatically return the underlying slab pages to the free-space pool. As a result, apisix_shared_dict_free_space_bytes for prometheus-metrics decreases monotonically over time — slabs are only reclaimed when explicitly flushed.

Root Cause

ngx.shared.DICT:flush_expired() must be called explicitly to reclaim slab memory from expired entries. Without it:

  • Time series expire logically (reads return nil after expire seconds)
  • But the slab memory is not returned to free space
  • free_space_bytes trends toward zero regardless of actual active time-series count
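The effect described above can be sketched with a toy model (Python, purely illustrative and not APISIX or nginx code — the class, capacity unit, and method names are simplifications): entries expire logically for reads, but the space they occupy is only returned to the free pool by an explicit flush.

```python
class ToySharedDict:
    """Toy model of an nginx shared dict: expired entries read as None,
    but their slab memory stays allocated until flush_expired() runs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # key -> (value, expiry deadline)

    def set(self, key, value, expire, now):
        self.entries[key] = (value, now + expire)

    def get(self, key, now):
        item = self.entries.get(key)
        if item is None or now >= item[1]:
            return None  # logically expired -- but memory is still held
        return item[0]

    def free_space(self):
        # One "slot" per stored entry, expired or not.
        return self.capacity - len(self.entries)

    def flush_expired(self, max_count, now):
        """Reclaim up to max_count expired entries, like
        ngx.shared.DICT:flush_expired(max_count)."""
        flushed = 0
        for key in list(self.entries):
            if flushed >= max_count:
                break
            if now >= self.entries[key][1]:
                del self.entries[key]
                flushed += 1
        return flushed


d = ToySharedDict(capacity=10)
d.set("series_a", 1, expire=5, now=0)
assert d.get("series_a", now=10) is None  # expired for reads...
assert d.free_space() == 9                # ...but space not reclaimed
d.flush_expired(max_count=1000, now=10)
assert d.free_space() == 10               # reclaimed only after flush
```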

This can be observed by comparing free_space_bytes with the active time-series count: the count fluctuates (e.g. drops significantly during low-traffic periods) while free space never recovers — even after most entries have expired.

Fix

Call dict:flush_expired(1000) inside exporter_timer, which already runs every refresh_interval (default 15 s) in the privileged agent process.

```lua
local prom_dict = ngx.shared["prometheus-metrics"]
if prom_dict then
    prom_dict:flush_expired(1000)
end
```

Why max_count=1000: Without a limit, a single flush call could hold the shared-dict write lock for an extended time if many expired entries have accumulated. Limiting to 1000 per cycle keeps the lock time well under 10 ms in practice, while remaining entries are flushed in subsequent timer ticks (every 15 s).
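The drain rate under this bound is simple arithmetic; a short sketch (Python, illustrative only — the function is not part of the patch) shows how even a large backlog of expired entries clears across timer ticks:

```python
def ticks_to_drain(backlog, max_count=1000):
    """Number of timer ticks needed to flush a backlog of expired
    entries when each tick reclaims at most max_count of them."""
    ticks = 0
    while backlog > 0:
        backlog -= min(backlog, max_count)
        ticks += 1
    return ticks


# With the default 15 s refresh_interval, a 10,000-entry backlog
# drains in 10 ticks, i.e. about 2.5 minutes.
assert ticks_to_drain(10_000) == 10
assert ticks_to_drain(500) == 1
```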

The call runs in the privileged agent process, which is separate from worker request-handling processes, so the brief write-lock has minimal impact on request throughput.

Checklist

  • No functional change to metric collection or rendering
  • Compatible with existing expire metric configuration
  • Works with any refresh_interval setting

When metrics are registered with an `expire` value, nginx's slab
allocator marks entries as logically expired but does not return
the underlying slab pages to the free-space pool automatically.
As a result, `free_space_bytes` decreases monotonically over time
even when many time series have expired, because slabs are only
reclaimed when a flush is explicitly requested.

Call `dict:flush_expired(1000)` in `exporter_timer` (which runs
every `refresh_interval`, defaulting to 15 s) so that expired slabs
are reclaimed promptly. The `max_count=1000` argument bounds the
write-lock hold time to a few milliseconds per call, avoiding any
noticeable impact on worker request processing.

Fixes the pattern where `apisix_shared_dict_free_space_bytes` for
`prometheus-metrics` decreases continuously until the dict is
exhausted, even though active time-series counts fluctuate normally.
@dosubot added labels size:S (This PR changes 10-29 lines, ignoring generated files) and bug (Something isn't working) on Apr 10, 2026
