As language SIGs add support for exporting metrics over OTLP, they are discovering that their native export formats fit poorly, or not at all, into the OTLP structure. This issue is intended to track these friction points, the proposed changes, and the action items needed to resolve them.
Ideally the OTLP would natively support the output of every combination of Instruments and Aggregators defined in the OpenTelemetry Specification. Currently, this is not the case.
Identified Issues
First Class Support For Min and Max Values of a MinMaxSumCount Aggregation
There is no metric kind for a MinMaxSumCount aggregator, which outputs the minimum, maximum, sum, and count of the events observed.
It has been suggested that a Summary metric kind can be used to transport these values. The Summary kind does have fields for the count and sum; however, it has no dedicated field for either the maximum or the minimum.
The suggested workaround is to send the maximum as the 100th percentile and the minimum as the 0th percentile. Neither mapping is obvious, and the latter is mathematically incorrect (the 0th percentile is the value below which 0% of events occurred, whereas the minimum is the smallest value at which at least one event occurred).
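To make the workaround concrete, here is a minimal sketch of what an exporter ends up doing. The `MinMaxSumCount`, `ValueAtPercentile`, and `Summary` classes and the `aggregate` and `to_summary` helpers are hypothetical stand-ins written for this illustration; they are not the generated OTLP types or any SDK's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MinMaxSumCount:
    """Hypothetical stand-in for the output of a MinMaxSumCount aggregator."""
    min: float
    max: float
    sum: float
    count: int


@dataclass
class ValueAtPercentile:
    """Hypothetical (percentile, value) pair carried by a Summary."""
    percentile: float
    value: float


@dataclass
class Summary:
    """Hypothetical stand-in for the Summary kind: count, sum, percentiles."""
    count: int
    sum: float
    percentile_values: List[ValueAtPercentile] = field(default_factory=list)


def aggregate(values: List[float]) -> MinMaxSumCount:
    """Aggregate raw observations the way a MinMaxSumCount aggregator would."""
    return MinMaxSumCount(
        min=min(values), max=max(values), sum=sum(values), count=len(values)
    )


def to_summary(agg: MinMaxSumCount) -> Summary:
    """Apply the suggested workaround: min -> 0th, max -> 100th percentile."""
    return Summary(
        count=agg.count,
        sum=agg.sum,
        percentile_values=[
            ValueAtPercentile(percentile=0.0, value=agg.min),
            ValueAtPercentile(percentile=100.0, value=agg.max),
        ],
    )


summary = to_summary(aggregate([3.0, 1.0, 4.0]))
```

A receiver of `summary` has no way to tell that the two percentile entries are really an exact minimum and maximum rather than estimated quantiles, which is the non-obvious part of the workaround.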
HistogramValue.Bucket.Exemplar is inadequate
As outlined in the linked ticket:
- It's not possible to provide exemplars for non-histograms
- It's not possible to provide more than one exemplar per bucket
- It requires converting certain data types to a string, which will require additional specification
Instrumentation Instead of Aggregation
The OTLP metric kinds are centered on instruments, not on the output of aggregation:
...
GAUGE_INT64 = 1;
GAUGE_DOUBLE = 2;
GAUGE_HISTOGRAM = 3;
COUNTER_INT64 = 4;
COUNTER_DOUBLE = 5;
CUMULATIVE_HISTOGRAM = 6;
SUMMARY = 7;
The incongruence between the instruments these kinds were modeled after and the actual instruments of the OpenTelemetry Specification is secondary to the fact that they are modeled after instruments in the first place.
If the goal of the OTLP is to transport the output of Instrument -> Aggregator, it should be modeled after the output of the Aggregators: aggregations. And while, yes, the Histogram is one of these aggregations, the included histogram kinds are conflated with instrument qualifiers (Gauge, Cumulative).
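As a rough illustration of that conflation, the kind names listed above can be split into a leading instrument qualifier and a remainder. This is only a sketch: `split_kind` and `INSTRUMENT_QUALIFIERS` are hypothetical names, treating `COUNTER` as a qualifier alongside Gauge and Cumulative is my assumption, and the elided first entry of the enum is left out.

```python
from typing import Optional, Tuple

# Kind names copied from the enum above (the elided entry is omitted).
KINDS = [
    "GAUGE_INT64",
    "GAUGE_DOUBLE",
    "GAUGE_HISTOGRAM",
    "COUNTER_INT64",
    "COUNTER_DOUBLE",
    "CUMULATIVE_HISTOGRAM",
    "SUMMARY",
]

# Assumption: these name prefixes describe an instrument, not an aggregation.
INSTRUMENT_QUALIFIERS = {"GAUGE", "COUNTER", "CUMULATIVE"}


def split_kind(name: str) -> Tuple[Optional[str], str]:
    """Split a kind name into (instrument qualifier or None, remainder)."""
    head, sep, tail = name.partition("_")
    if sep and head in INSTRUMENT_QUALIFIERS:
        return head, tail
    return None, name


# The two histogram kinds differ only in their instrument qualifier.
histogram_kinds = {n: split_kind(n) for n in KINDS if "HISTOGRAM" in n}
```

Under these assumptions, only `SUMMARY` carries no instrument qualifier, and the same Histogram aggregation appears twice, once per qualifier.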
Nebulously Defined Need to be Compatible with OpenMetrics
A common refrain for the design of the OTLP is "We are trying to also stay compatible with OpenMetrics protocol".
While this desire to remain compatible is admirable, it is negatively impacting this project. In its pursuit of easy translation to OpenMetrics, the OTLP has become difficult to translate into from OpenTelemetry. Additionally, it has led to a design that is bloated with OpenMetrics metric kinds (GAUGE_INT64, GAUGE_DOUBLE) that OpenTelemetry does not use.
If support for the OpenMetrics protocol is desired, it seems the implementations of the OpenTelemetry Specification should support that protocol directly, instead of having the OTLP try to match that project's decisions.
Steps Forward
I'm hoping this issue starts the discussion on how we can resolve these problems. I see the metrics proto as needing a major overhaul to provide first class support for OpenTelemetry, possibly as a v2 of the proto.
cc @bogdandrutu, @jmacd, @jkwatson, @c24t