The traces proto currently contains InstrumentationLibrarySpans, which does
not have clearly defined semantics. InstrumentationLibrarySpans may contain a
number of spans, all associated with an InstrumentationLibrary. The nature of
this association is not clear.
The InstrumentationLibrary has a name and a version. It is not clear if
these fields are part of Resource identity or are attributes of a Span.
Presumably they should be interpreted as attributes of the Span.
I am not aware of any other trace protocols or backends that have an
equivalent of the InstrumentationLibrary concept. However, ultimately all span data
produced by OpenTelemetry libraries will end up in a backend, and the
InstrumentationLibrary concept must be mapped to an existing concept there. Span
attributes seem to be the only concept that fits the bill. Using attributes from
the start of the collection pipeline means that the codebases which would otherwise
have to make a mapping decision (Collector, backend ingest points, etc.) no longer
need to deal with InstrumentationLibrary at all.
Proposal 1
I suggest removing the InstrumentationLibrary message type from the protocol and
adding semantic conventions for recording the instrumentation library in Span
attributes.
The benefits of this approach over using InstrumentationLibrary are the following:
- There is no need for a new concept and a new message type at the protocol
  level. A dedicated message type adds unnecessary complexity to all codebases
  that need to read and write traces but don't care about the instrumentation
  library concept (likely the majority of codebases).
- It uses the general concept of attributes, which already exists and is well
  understood, and by doing so makes the semantics of the instrumentation library
  name clear.
There is potentially a small downside: using the InstrumentationLibrary message
may be more efficient to encode/decode, especially if there are
multiple spans from the same instrumentation library (see Proposal 2 for a
potential solution to this).
On the other hand, for Span data that does not have an associated
instrumentation library name and version, we incur the cost of the
InstrumentationLibrarySpans message even though it does not refer to any
InstrumentationLibrary. We pay the cost of the concept even if we don't need
it. In contrast, recording instrumentation library information in attributes
using semantic conventions is zero cost for cases which do not need to record
an instrumentation library.
To illustrate, here is an example. Using InstrumentationLibrarySpans we would
need to record the following:
    resource_spans:
      resource:
        ...
      instrumentation_library_spans:
        - instrumentation_library:
            name: io.opentelemetry.redis
          spans:
            - name: request
              start_time: 123
        - instrumentation_library:
            name: io.opentelemetry.apache.httpd
          spans:
            - name: request
              start_time: 456
After removing the InstrumentationLibrarySpans concept we would do this:
    resource_spans:
      resource:
        ...
      spans:
        - name: request
          start_time: 123
          attributes:
            - key: instrumentation.library.name
              value: io.opentelemetry.redis
        - name: request
          start_time: 456
          attributes:
            - key: instrumentation.library.name
              value: io.opentelemetry.apache.httpd
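For readers who want to see what this mapping would involve in a translator (for example in the Collector), here is a minimal Python sketch operating on plain dicts shaped like the YAML above. The function name `flatten_library_spans` is hypothetical, and the `instrumentation.library.version` key is my assumption by analogy with the name key; this proposal does not define it.

```python
from typing import Any, Dict, List

# The name key is taken from the example above. The version key is an
# assumption made for illustration; the proposal does not define it.
NAME_KEY = "instrumentation.library.name"
VERSION_KEY = "instrumentation.library.version"


def flatten_library_spans(resource_spans: Dict[str, Any]) -> Dict[str, Any]:
    """Move instrumentation library info into per-span attributes.

    Hypothetical helper: input and output are plain dicts mirroring the
    YAML examples, not real protobuf messages.
    """
    flat_spans: List[Dict[str, Any]] = []
    for group in resource_spans.get("instrumentation_library_spans", []):
        library = group.get("instrumentation_library") or {}
        library_attrs = []
        if library.get("name"):
            library_attrs.append({"key": NAME_KEY, "value": library["name"]})
        if library.get("version"):
            library_attrs.append({"key": VERSION_KEY, "value": library["version"]})

        for span in group.get("spans", []):
            # Copy everything except attributes, then append the library ones.
            flat = {k: v for k, v in span.items() if k != "attributes"}
            attrs = list(span.get("attributes", [])) + [dict(a) for a in library_attrs]
            if attrs:
                # Spans without library info and without attributes of their
                # own get no "attributes" field at all (the zero-cost case).
                flat["attributes"] = attrs
            flat_spans.append(flat)

    return {"resource": resource_spans.get("resource"), "spans": flat_spans}
```

Applied to the first example, this produces the second one: each span carries its own `instrumentation.library.name` attribute, and a span that came from a group without an `instrumentation_library` is passed through unchanged.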
Proposal 2
Rename InstrumentationLibrarySpans to SpanBatch and, instead of only allowing
the instrumentation library name and version to be recorded, allow recording
any span attributes at the batch level:
    resource_spans:
      resource:
        ...
      span_batches:
        - common_attributes:
            - key: instrumentation.library.name
              value: io.opentelemetry.redis
          spans:
            - name: request
              start_time: 123
        - common_attributes:
            - key: instrumentation.library.name
              value: io.opentelemetry.apache.httpd
          spans:
            - name: request
              start_time: 456
This makes it possible to represent the instrumentation library or any other
attributes that are common to a batch of spans. The protocol will require that
common_attributes and Span attributes do not contain the same key, to avoid
expensive merging in translations.
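To make the "no shared keys" rule concrete, here is a rough Python sketch of how a receiver might expand a batch back into standalone spans. It uses plain dicts shaped like the YAML above; `expand_span_batch` is a hypothetical name, and raising an error on a key clash is only one possible way to handle a violation, since the proposal does not say what a receiver should do.

```python
from typing import Any, Dict, List


def expand_span_batch(batch: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the batch's spans with common_attributes copied onto each one.

    Hypothetical helper. Because the two attribute lists must have disjoint
    keys, expansion is a plain concatenation with no per-key merge logic.
    """
    common = batch.get("common_attributes", [])
    common_keys = {attr["key"] for attr in common}

    expanded: List[Dict[str, Any]] = []
    for span in batch.get("spans", []):
        own = span.get("attributes", [])
        clash = common_keys.intersection(attr["key"] for attr in own)
        if clash:
            # Assumed behavior: reject the batch instead of guessing which
            # value should win.
            raise ValueError(f"keys present at both batch and span level: {sorted(clash)}")

        out = {k: v for k, v in span.items() if k != "attributes"}
        merged = [dict(a) for a in common] + [dict(a) for a in own]
        if merged:
            out["attributes"] = merged
        expanded.append(out)
    return expanded
```

The uniqueness check is a single set intersection per span, which is the cheap validation that the disjoint-keys requirement buys us; without that requirement a translator would need to define and implement override semantics for every duplicated key.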
The benefit of this approach is that, instead of having a special concept just
for the instrumentation library, we have a generic mechanism to record repeating
span attributes efficiently (and instrumentation library information is just one
of the use cases).
A similar approach will also work nicely for log attributes when we add log
support.