resource control: add client-side pre-throttling demand RU/s metric

## Enhancement Task

### Problem

Currently, there is no metric that accurately reflects the **real-time RU/s demand** from clients **before** Resource Control throttling takes effect:

- **Client-side `avgRUPerSec`** (`group_controller.go`) is computed from `getRUValueFromConsumption()` — actual post-throttling consumption. When a resource group is throttled, requests wait in `Reserve()`, consumption slows, and `avgRUPerSec` only reflects the throttled rate.
- **Server-side `read_request_unit_max_per_sec` / `write_request_unit_max_per_sec`** are derived from `Consumption.RRU/WRU` reported by clients — also post-throttling values.
- **Server-side `sampled_request_unit_per_sec`** is based on `requiredToken` in `AcquireTokenBuckets`, which is `avgRUPerSec * targetPeriod * amplification - availableTokens`. This mixes the post-throttling `avgRUPerSec` with the client's local token balance, so it is not a clean demand rate, and it lacks per-instance granularity.

This makes it impossible for operators to determine the true workload demand when Resource Control is actively throttling.
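
To illustrate the third point, here is a minimal Go sketch of the `requiredToken` expression quoted above. The function name, parameter names, units, and all numeric values are assumptions made for this example; it is not the PD implementation.

```go
package main

// requiredToken mirrors the expression quoted in the Problem section:
//   avgRUPerSec * targetPeriod * amplification - availableTokens
// Parameter names and units (seconds, RU) are assumptions for this sketch.
func requiredToken(avgRUPerSec, targetPeriodSec, amplification, availableTokens float64) float64 {
	return avgRUPerSec*targetPeriodSec*amplification - availableTokens
}

// sameRateDifferentRequests evaluates the expression for two hypothetical
// clients that consume at the same rate but hold different token balances.
func sameRateDifferentRequests() (emptyBucket, partlyFilled float64) {
	emptyBucket = requiredToken(100, 5, 2, 0)    // no tokens left locally
	partlyFilled = requiredToken(100, 5, 2, 600) // 600 tokens still available
	return emptyBucket, partlyFilled
}
```

With these hypothetical inputs the two clients request 1000 and 400 tokens despite an identical `avgRUPerSec`, which is why a metric sampled from `requiredToken` cannot be read directly as a demand rate.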

### Proposal

Add a new **client-side** Prometheus Gauge that tracks an exponential moving average (EMA) of demanded RU/s, sampled at the `acquireTokens()` entry point, before `Reserve()` applies any throttling:

- **Metric**: `resource_manager_client_resource_group_demand_ru_per_sec{resource_group="..."}`
- **Data source**: the RU cost (`v`) passed to `acquireTokens()` in `group_controller.go`, which represents the true per-request demand before any token bucket throttling
- **Smoothing**: time-aware EMA (reuse the existing `movingAvgFactor` logic)

### Expected Usage

```promql
# Per-instance demand
resource_manager_client_resource_group_demand_ru_per_sec{instance="tidb-0", resource_group="default"}

# Cluster-wide demand for a resource group
sum(resource_manager_client_resource_group_demand_ru_per_sec) by (resource_group)

# Peak demand over time
max_over_time((sum(resource_manager_client_resource_group_demand_ru_per_sec) by (resource_group))[1h:])
```

### Benefits

- **Accurate**: samples the RU cost before throttling, so it reflects true workload demand
- **Per-instance**: as a client-side metric, it naturally carries the `instance` label
- **Aggregatable**: `sum by (resource_group)` gives a cluster-wide view, for example in Grafana
- **Rolling-upgrade friendly**: pure client-side change, no proto or PD server changes required

### Related

- Part of the observability improvements tracked in #10488

