Clarify and potentially remove InstrumentationLibrarySpans #149

@tigrannajaryan

The traces proto currently contains InstrumentationLibrarySpans which does
not have clearly defined semantics. InstrumentationLibrarySpans may contain a
number of spans all associated with an InstrumentationLibrary. The nature of
this association is not clear.

The InstrumentationLibrary has a name and a version. It is not clear if
these fields are part of Resource identity or are attributes of a Span.
Presumably they should be interpreted as attributes of the Span.

I am not aware of any other trace protocols or backends that have an
equivalent of the InstrumentationLibrary concept. However, all span data
produced by OpenTelemetry libraries will ultimately end up in a backend, and
the InstrumentationLibrary concept must be mapped to an existing concept there.
Span attributes seem to be the only concept that fits the bill. Using
attributes from the start of the collection pipeline removes the need to deal
with InstrumentationLibrary in every codebase that would otherwise have to make
a mapping decision (Collector, backend ingest points, etc).

Proposal 1

I suggest removing the InstrumentationLibrary message type from the protocol
and adding semantic conventions for recording the instrumentation library in
Span attributes.

The benefits of this approach over using InstrumentationLibrary are the following:

  • There is no need for a new concept and a new message type at the protocol
    level. Such a concept adds unnecessary complexity to all codebases that need
    to read and write traces but don't care about the instrumentation library
    concept (likely the majority of codebases).

  • It uses the general concept of attributes that already exists and is well
    understood and by doing so makes the semantics of instrumentation library name
    clear.

There is potentially a small downside: using the InstrumentationLibrary message
may be more efficient to encode/decode, especially if there are multiple spans
from the same instrumentation library (see Proposal 2 for a potential solution
to this).

On the other hand, for Span data that does not have an associated
instrumentation library name and version, we incur the cost of having the
InstrumentationLibrarySpans message even if it does not refer to any
InstrumentationLibrary. We pay the cost of the concept even if we don't need
it. By contrast, recording instrumentation library information in attributes
using semantic conventions is zero cost for cases which do not need to record
the instrumentation library.

To illustrate, here is an example. Using InstrumentationLibrarySpans we would
need to record the following:

resource_spans:
  resource: 
    ...
  instrumentation_library_spans:
    - instrumentation_library:
        name: io.opentelemetry.redis
      spans:
        - name: request
          start_time: 123

    - instrumentation_library:
        name: io.opentelemetry.apache.httpd
      spans:
        - name: request
          start_time: 456

After removing the InstrumentationLibrarySpans concept we would do this:

resource_spans:
  resource: 
    ...
  spans:
    - name: request
      start_time: 123
      attributes:
        - key: instrumentation.library.name
          value: io.opentelemetry.redis

    - name: request
      start_time: 456
      attributes:
        - key: instrumentation.library.name
          value: io.opentelemetry.apache.httpd
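
The mapping between the two layouts above can be sketched in a few lines. This is only an illustration, assuming spans are plain Python dicts shaped like the YAML examples; `flatten_library_spans` is a hypothetical helper, and the `instrumentation.library.version` key is an assumed name for the version attribute, which the proposal does not spell out.

```python
# Illustrative sketch only: plain dicts mirror the YAML examples above.
LIBRARY_NAME_KEY = "instrumentation.library.name"
# Assumed key name; the proposal only shows the name attribute.
LIBRARY_VERSION_KEY = "instrumentation.library.version"


def flatten_library_spans(resource_spans):
    """Move instrumentation library info into per-span attributes."""
    flat_spans = []
    for group in resource_spans.get("instrumentation_library_spans", []):
        library = group.get("instrumentation_library") or {}
        library_attrs = []
        if library.get("name"):
            library_attrs.append({"key": LIBRARY_NAME_KEY, "value": library["name"]})
        if library.get("version"):
            library_attrs.append({"key": LIBRARY_VERSION_KEY, "value": library["version"]})
        for span in group.get("spans", []):
            new_span = dict(span)
            attrs = list(span.get("attributes", [])) + [dict(a) for a in library_attrs]
            if attrs:
                # Spans without library info or attributes carry no extra field.
                new_span["attributes"] = attrs
            flat_spans.append(new_span)
    return {"resource": resource_spans.get("resource"), "spans": flat_spans}
```

Note that a span which has no associated instrumentation library comes out unchanged, which is the "zero cost" property described above.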

Proposal 2

Rename InstrumentationLibrarySpans to SpanBatch and, instead of only allowing
the instrumentation library name and version to be recorded, allow recording
any span attributes at the batch level:

resource_spans:
  resource: 
    ...
  span_batches:
    - common_attributes:
        - key: instrumentation.library.name
          value: io.opentelemetry.redis
      spans:
        - name: request
          start_time: 123

    - common_attributes:
        - key: instrumentation.library.name
          value: io.opentelemetry.apache.httpd
      spans:
        - name: request
          start_time: 456

This makes it possible to represent the instrumentation library or any other
attributes that are common to a batch of spans. The protocol will require that
common_attributes and Span attributes do not contain the same key, to avoid
expensive merging in translations.
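
To show why the no-duplicate-keys rule keeps translation cheap, here is a rough sketch of how a receiver might expand a batch back into individual spans. It assumes plain Python dicts shaped like the YAML above; `expand_span_batch` is a hypothetical helper, and raising `ValueError` on a clash is just one possible way to handle a violation.

```python
def expand_span_batch(batch):
    """Return the batch's spans with common_attributes copied onto each one.

    Because the protocol would forbid a span from repeating a batch-level key,
    the expansion is a plain concatenation with no override logic.
    """
    common = batch.get("common_attributes", [])
    common_keys = {attr["key"] for attr in common}
    expanded = []
    for span in batch.get("spans", []):
        own = span.get("attributes", [])
        clash = common_keys.intersection(attr["key"] for attr in own)
        if clash:
            # Assumed handling: reject the batch rather than guess which value wins.
            raise ValueError(
                f"span {span.get('name')!r} repeats batch keys: {sorted(clash)}"
            )
        new_span = dict(span)
        if common or own:
            new_span["attributes"] = [dict(a) for a in common] + [dict(a) for a in own]
        expanded.append(new_span)
    return expanded
```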

The benefit of this approach is that instead of having a special concept just
for instrumentation library we have a generic mechanism to record repeating span
attributes efficiently (and instrumentation library information is just one of
the use cases).
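
On the sending side, the generic mechanism could look roughly like the sketch below, which groups spans by the values of a chosen set of attribute keys and hoists those attributes to the batch level. This is an assumption-laden illustration: `batch_by_keys` and its `hoist_keys` parameter are hypothetical, spans are plain dicts as in the YAML above, and attribute values are assumed to be strings.

```python
def batch_by_keys(spans, hoist_keys=("instrumentation.library.name",)):
    """Group spans into span batches, hoisting the attributes in hoist_keys."""
    batches = {}  # insertion-ordered: hoisted (key, value) pairs -> batch dict
    for span in spans:
        hoisted, remaining = [], []
        for attr in span.get("attributes", []):
            (hoisted if attr["key"] in hoist_keys else remaining).append(attr)
        # Spans sharing the same hoisted key/value pairs land in the same batch.
        group_id = tuple(sorted((a["key"], a["value"]) for a in hoisted))
        batch = batches.get(group_id)
        if batch is None:
            batch = {
                "common_attributes": [{"key": k, "value": v} for k, v in group_id],
                "spans": [],
            }
            batches[group_id] = batch
        new_span = {k: v for k, v in span.items() if k != "attributes"}
        if remaining:
            new_span["attributes"] = remaining
        batch["spans"].append(new_span)
    return list(batches.values())
```

With the default `hoist_keys` this reproduces the instrumentation library grouping, but any other repeating attribute could be hoisted the same way.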

A similar approach will also work nicely for log attributes when we add log
support.
