
Commit 4d6937d

client: clean up demand RU gauge and clarify its semantics
Address review feedback on #10582:

- Delete `DemandRUPerSecGauge` label series in `cleanUpResourceGroup` so the new gauge does not leak labels when a resource group is deleted. Add a TODO tracking the remaining per-group metrics that still leak (TokenConsumedHistogram, GroupRunningKVRequestCounter, SuccessfulRequestDuration, FailedRequestCounter, ResourceGroupTokenRequestCounter, RequestRetryCounter, FailedLimitReserveDuration).
- Clarify the metric Help text and doc comment to make the pre-throttling semantics explicit: the EMA includes requests rejected by the token bucket, which is the whole reason this metric is exposed separately from the consumption-based `avg_ru_per_sec`.

Signed-off-by: JmPotato <github@ipotato.me>
1 parent 38190e7 commit 4d6937d

2 files changed: 8 additions & 2 deletions


client/resource_group/controller/global_controller.go

Lines changed: 5 additions & 0 deletions
@@ -634,6 +634,11 @@ func (c *ResourceGroupsController) cleanUpResourceGroup() {
 if gc.inactive || gc.tombstone.Load() {
 	c.groupsController.Delete(resourceGroupName)
 	metrics.ResourceGroupStatusGauge.DeleteLabelValues(resourceGroupName, resourceGroupName)
+	metrics.DemandRUPerSecGauge.DeleteLabelValues(resourceGroupName)
+	// TODO: clean up the remaining per-group metrics (e.g. TokenConsumedHistogram,
+	// GroupRunningKVRequestCounter, SuccessfulRequestDuration, FailedRequestCounter,
+	// ResourceGroupTokenRequestCounter, RequestRetryCounter, FailedLimitReserveDuration)
+	// which currently leak label series on resource group deletion.
 	return true
 }
 gc.inactive = true
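
To illustrate why the added `DeleteLabelValues` call matters, here is a minimal stdlib-only sketch of a label-series leak. The `gaugeVec` type and the `churn` helper are hypothetical stand-ins invented for this illustration; they are not the Prometheus client or the PD controller, and only model the idea that every distinct label value keeps a series alive until it is explicitly deleted.

```go
package main

import "fmt"

// gaugeVec is a hypothetical, simplified stand-in for a labeled gauge family.
// It keeps one series per resource group name, as a labeled metric would.
type gaugeVec struct {
	series map[string]float64
}

func newGaugeVec() *gaugeVec {
	return &gaugeVec{series: make(map[string]float64)}
}

// Set records v for the group, creating the series on first use.
func (g *gaugeVec) Set(group string, v float64) {
	g.series[group] = v
}

// DeleteLabelValues drops the series for the group and reports whether it existed.
func (g *gaugeVec) DeleteLabelValues(group string) bool {
	_, ok := g.series[group]
	delete(g.series, group)
	return ok
}

// Len returns the number of series still held in memory.
func (g *gaugeVec) Len() int {
	return len(g.series)
}

// churn creates n short-lived resource groups, reports a value for each, and
// then "deletes" each group. With cleanup disabled the series are never
// removed, which models the leak described in the commit message.
func churn(n int, cleanup bool) int {
	g := newGaugeVec()
	for i := 0; i < n; i++ {
		name := fmt.Sprintf("rg-%d", i)
		g.Set(name, 100)
		if cleanup {
			g.DeleteLabelValues(name)
		}
	}
	return g.Len()
}

func main() {
	fmt.Println("series left without cleanup:", churn(1000, false))
	fmt.Println("series left with cleanup:", churn(1000, true))
}
```

In this toy model the number of retained series grows with every group that was ever created unless cleanup runs, which is the behavior the TODO in the hunk above still tracks for the other per-group metrics.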

client/resource_group/controller/metrics/metrics.go

Lines changed: 3 additions & 2 deletions
@@ -31,7 +31,8 @@ const (
 var (
 	// ResourceGroupStatusGauge comments placeholder
 	ResourceGroupStatusGauge *prometheus.GaugeVec
-	// DemandRUPerSecGauge is the EMA of demanded RU/s before throttling per resource group.
+	// DemandRUPerSecGauge is the EMA of demanded RU/s per resource group, including
+	// requests rejected by the token bucket (pre-throttling demand).
 	DemandRUPerSecGauge *prometheus.GaugeVec
 	// SuccessfulRequestDuration comments placeholder
 	SuccessfulRequestDuration *prometheus.HistogramVec
@@ -76,7 +77,7 @@ func initMetrics(constLabels prometheus.Labels) {
 	Namespace: namespace,
 	Subsystem: "resource_group",
 	Name: "demand_ru_per_sec",
-	Help: "EMA of demanded RU/s before throttling for each resource group.",
+	Help: "EMA of demanded RU/s per resource group, including requests rejected by the token bucket (pre-throttling demand).",
 	ConstLabels: constLabels,
 }, []string{newResourceGroupNameLabel})
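
The distinction the new Help text draws between pre-throttling demand and consumption can be sketched with a small simulation. Everything below is an assumption made for illustration: the smoothing factor `alpha`, the per-second sampling, and the simplified limiter that admits at most `limit` RU per second are not taken from the PD client, whose actual EMA and token bucket may differ.

```go
package main

import "fmt"

// ema applies one exponential-moving-average step.
// alpha is an assumed smoothing factor in (0, 1]; higher values react faster.
func ema(prev, sample, alpha float64) float64 {
	return alpha*sample + (1-alpha)*prev
}

// simulate feeds per-second demanded RU through a simplified limiter that
// admits at most limit RU per second and rejects the rest. It returns two
// EMAs: one over everything that was asked for (demand, including rejected
// requests) and one over what was actually admitted (consumption).
func simulate(demand []float64, limit, alpha float64) (demandEMA, consumedEMA float64) {
	for _, d := range demand {
		admitted := d
		if admitted > limit {
			admitted = limit // the excess is rejected by the limiter
		}
		demandEMA = ema(demandEMA, d, alpha)
		consumedEMA = ema(consumedEMA, admitted, alpha)
	}
	return demandEMA, consumedEMA
}

func main() {
	// A workload that steadily asks for 500 RU/s against a 200 RU/s limit.
	demand := make([]float64, 60)
	for i := range demand {
		demand[i] = 500
	}
	d, c := simulate(demand, 200, 0.5)
	fmt.Printf("demand EMA: %.1f RU/s, consumed EMA: %.1f RU/s\n", d, c)
}
```

Under these assumptions the consumption-based average settles near the limit and hides how much load was turned away, while the demand-based average tracks what the workload actually requested. That gap is the reason the commit message gives for exposing `demand_ru_per_sec` separately from `avg_ru_per_sec`.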
