Make metrics-proxy JVM heap size configurable for large clusters #35830

@mbelchatowski

Problem

The metrics-proxy JVM heap size is hardcoded in MetricsProxyContainer.java with no override mechanism:

int heapSize = adminCluster ? 96 : 320;
builder.jvm.heapsize(heapSize);
builder.jvm.minHeapsize(heapSize);

For clusters with high metrics cardinality, driven by the multiplicative combination of content containers * document types * rank profiles, the metrics-proxy runs out of memory and stops emitting metrics.

In our case, a cluster with 12 document types and 120+ content containers causes the metrics-proxy to run out of memory (OOM). The cardinality of rank-profile metrics alone (with dimensions host * documenttype * rankProfile * metricName * suffix) exceeds what a 320 MB heap can handle.
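
To make the multiplicative growth concrete, here is a rough back-of-the-envelope sketch in Java. The host and document type counts come from our cluster; the rank profile, metric name, and suffix counts are hypothetical placeholders chosen only for illustration, and the class and method names are not part of Vespa.

```java
/** Rough estimate of rank-profile metric cardinality; illustrative only, not Vespa code. */
final class RankProfileCardinality {

    /** Multiplies the dimension sizes: host * documenttype * rankProfile * metricName * suffix. */
    static long estimateSeries(int hosts, int documentTypes, int rankProfiles,
                               int metricNames, int suffixes) {
        // Widen to long before multiplying so large clusters do not overflow int.
        return (long) hosts * documentTypes * rankProfiles * metricNames * suffixes;
    }

    public static void main(String[] args) {
        int hosts = 120;          // from the issue: 120+ content containers
        int documentTypes = 12;   // from the issue
        int rankProfiles = 10;    // ASSUMPTION: placeholder value
        int metricNames = 20;     // ASSUMPTION: placeholder value
        int suffixes = 5;         // ASSUMPTION: placeholder value (e.g. count, sum, max, ...)

        long series = estimateSeries(hosts, documentTypes, rankProfiles, metricNames, suffixes);
        System.out.println("Estimated distinct series: " + series);
    }
}
```

Even with modest placeholder values for the unknown dimensions, the product reaches the millions, which gives a sense of why a fixed heap stops being adequate as any one dimension grows.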

Proposed solution

Allow the metrics-proxy heap to accommodate large clusters. Some options:

  • Scale the heap dynamically based on cluster characteristics (number of content nodes, document types, rank profiles)
  • Make it configurable via services.xml
  • Increase the default to handle larger deployments

The current fixed value of 320MB breaks for larger clusters.
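
As a starting point for discussion, the first two options could be combined roughly as sketched below: an explicit override wins, otherwise the heap scales with the cluster. This is only a sketch under stated assumptions. The class name, method signature, baseline constant, and cap are all hypothetical; only the 96 and 320 defaults come from the current code, and how an override would be read from services.xml is left open.

```java
import java.util.OptionalInt;

/** Hypothetical heap-sizing sketch; not the actual MetricsProxyContainer logic. */
final class MetricsProxyHeapSketch {

    // Current hardcoded values (MB), as quoted in this issue.
    static final int ADMIN_CLUSTER_HEAP_MB = 96;
    static final int DEFAULT_HEAP_MB = 320;

    // ASSUMPTIONS: illustrative tuning constants, not measured values.
    static final long BASELINE_COMBINATIONS = 1_000; // combinations the default heap is assumed to handle
    static final int MAX_HEAP_MB = 2_048;            // upper bound so the proxy cannot starve the node

    static int heapSizeMb(boolean adminCluster, OptionalInt overrideMb,
                          int contentNodes, int documentTypes, int rankProfiles) {
        // 1. An explicit override (e.g. from a future services.xml setting) always wins.
        if (overrideMb.isPresent()) {
            int value = overrideMb.getAsInt();
            if (value <= 0) throw new IllegalArgumentException("heap size must be positive: " + value);
            return value;
        }
        // 2. Admin clusters keep the small default.
        if (adminCluster) return ADMIN_CLUSTER_HEAP_MB;

        // 3. Otherwise scale the default by the multiplicative cluster size, rounded up.
        long combinations = (long) contentNodes * documentTypes * rankProfiles;
        long factor = Math.max(1, (combinations + BASELINE_COMBINATIONS - 1) / BASELINE_COMBINATIONS);
        return (int) Math.min(MAX_HEAP_MB, DEFAULT_HEAP_MB * factor);
    }
}
```

With these placeholder constants, small clusters keep today's 320 MB, while a cluster like ours would be scaled up to the cap. The real scaling function would need to be calibrated against measured metrics-proxy memory usage.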

Context

  • The number of metrics scales multiplicatively with content nodes, document types, and rank profiles
  • There is no way to reduce metrics-proxy memory usage through consumer filtering in services.xml; the proxy tracks all metrics internally regardless of which consumers are configured
  • Vertical scaling (fewer, larger nodes) or reducing document types/rank profiles are workarounds, but the root issue is the non-configurable heap
