feat(observability): expose eBPF map capacity, memlock, and utilization metrics #1666

@MeloveGupta

What would you like to be added: Three new Prometheus gauges for kmesh-owned eBPF maps:

  • kmesh_map_max_entries{node_name, map_name, map_type} - map capacity
  • kmesh_map_memlock_bytes{node_name, map_name, map_type} - kernel locked memory
  • kmesh_map_entry_utilization_ratio{node_name, map_name, map_type} - entry_count / max_entries
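The ratio gauge needs a defined value when a map reports a capacity of zero. A minimal sketch of the computation follows, assuming a hypothetical helper named utilizationRatio and assuming that returning 0 for a zero-capacity map is acceptable; neither the name nor that choice comes from the kmesh code base.

```go
package main

import "fmt"

// utilizationRatio returns entryCount / maxEntries as a float64.
// Hypothetical helper: a map that reports max_entries == 0 yields 0
// instead of NaN or +Inf, so the gauge always carries a plottable value.
func utilizationRatio(entryCount, maxEntries uint32) float64 {
	if maxEntries == 0 {
		return 0
	}
	// Convert before dividing so the result is not truncated to an integer.
	return float64(entryCount) / float64(maxEntries)
}

func main() {
	fmt.Printf("%.2f\n", utilizationRatio(50, 200)) // prints 0.25
	fmt.Printf("%.2f\n", utilizationRatio(7, 0))    // prints 0.00
}
```

An alternative would be to skip the ratio series entirely for zero-capacity maps, which keeps such maps out of utilization alerts.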

Changes required:

  1. Extend MapInfo in pkg/controller/telemetry/map_metric.go to capture MaxEntries, Type, and Memlock
  2. Add map_type to kmeshMapLabels in pkg/controller/telemetry/utils.go
  3. Register the three new gauges and expose them on /status/metric
  4. Update Grafana dashboards in samples/addons/ with panels for capacity, utilization, and memlock
  5. Add unit tests in map_metric_test.go covering the new label dimensions, the utilization ratio computation, and the missing-memlock edge case
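Steps 1 and 5 could be sketched roughly as below. This assumes the extra fields are parsed from the "key: value" text the kernel exposes in /proc/<pid>/fdinfo/<fd> for a BPF map file descriptor, which is only one possible source; the issue does not say how kmesh would obtain them. The struct layout, the HasMemlock flag, and the parseFdinfo helper are all hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// MapInfo mirrors the fields the issue proposes to capture. The layout is
// hypothetical; the real struct lives in pkg/controller/telemetry/map_metric.go.
type MapInfo struct {
	Name       string
	Type       uint32
	EntryCount uint32
	MaxEntries uint32
	Memlock    uint64
	HasMemlock bool // false when no memlock line was present
}

// parseFdinfo fills Type, MaxEntries and Memlock from fdinfo-style lines.
// Keys it does not recognize (key_size, map_flags, ...) are ignored.
func parseFdinfo(text string, info *MapInfo) error {
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		key, val = strings.TrimSpace(key), strings.TrimSpace(val)
		switch key {
		case "map_type", "max_entries":
			n, err := strconv.ParseUint(val, 10, 32)
			if err != nil {
				return fmt.Errorf("%s: %w", key, err)
			}
			if key == "map_type" {
				info.Type = uint32(n)
			} else {
				info.MaxEntries = uint32(n)
			}
		case "memlock":
			n, err := strconv.ParseUint(val, 10, 64)
			if err != nil {
				return fmt.Errorf("memlock: %w", err)
			}
			info.Memlock, info.HasMemlock = n, true
		}
	}
	return sc.Err()
}

func main() {
	full := MapInfo{Name: "example_map"}
	if err := parseFdinfo("map_type:\t1\nmax_entries:\t1024\nmemlock:\t4096\n", &full); err != nil {
		panic(err)
	}
	fmt.Println(full.Type, full.MaxEntries, full.Memlock, full.HasMemlock)

	// Missing-memlock edge case: the gauge should be skipped, not set to 0.
	partial := MapInfo{Name: "example_map"}
	if err := parseFdinfo("map_type:\t2\nmax_entries:\t16\n", &partial); err != nil {
		panic(err)
	}
	fmt.Println(partial.MaxEntries, partial.HasMemlock)
}
```

Tracking presence with a separate flag lets the exporter distinguish "memlock not reported" from a genuine value of zero, which is the distinction the proposed unit test would need to assert.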

Why is this needed: Currently updatePrometheusMetric only collects mapName and entryCount.
Neither MaxEntries (capacity) nor Memlock (locked kernel memory) is exported,
leaving operators with no visibility into map headroom or memory use.

This makes scale diagnosis difficult, as raised in #945: operators cannot
tell whether a map is approaching its capacity limit or how much kernel memory
it is consuming under load. The performance monitoring proposal
(docs/proposal/performance_monitoring.md) explicitly lists these data points,
but they were never implemented.
