
Create shared context for updating span pipeline from TracerProvider and affecting Tracer.#650

Merged
lalitb merged 21 commits into open-telemetry:main from jsuereth:wip-trace-context
Apr 16, 2021

Conversation

@jsuereth
Contributor

@jsuereth jsuereth commented Apr 5, 2021

  • Create a new TracerContext class that provides thread-safe access to the Sampler + SpanProcessor.
  • All calls to SpanProcessor go through TracerContext, ensuring we abide by the specification, where TracerProvider calls MUST impact previously created Tracers.
  • Update all examples for new API.
  • Adds "stub" API for appending new SpanProcessors to a given TracerContext.

Replaces #590 with only the shared context portion.
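As a rough sketch of the shared-context idea described above (names approximated from this discussion, not the exact committed code): the provider and every Tracer it created hold the same TracerContext, so a processor swapped in through the provider is visible to previously created tracers.

```cpp
#include <memory>
#include <mutex>

// Hypothetical minimal types standing in for the real SDK classes.
struct SpanProcessor {
  virtual ~SpanProcessor() = default;
};
struct Sampler {
  virtual ~Sampler() = default;
};

class TracerContext {
public:
  TracerContext(std::shared_ptr<SpanProcessor> processor,
                std::shared_ptr<Sampler> sampler)
      : processor_(std::move(processor)), sampler_(std::move(sampler)) {}

  // Every Tracer routes through this accessor, so a pipeline registered on
  // the provider after a Tracer was created still affects that Tracer.
  std::shared_ptr<SpanProcessor> GetActiveProcessor() const {
    std::lock_guard<std::mutex> lock(mu_);
    return processor_;
  }

  void RegisterPipeline(std::shared_ptr<SpanProcessor> processor) {
    std::lock_guard<std::mutex> lock(mu_);
    processor_ = std::move(processor);
  }

private:
  mutable std::mutex mu_;
  std::shared_ptr<SpanProcessor> processor_;
  std::shared_ptr<Sampler> sampler_;
};
```

The committed code uses atomics rather than a mutex in places; this sketch only illustrates the ownership shape.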

@jsuereth jsuereth marked this pull request as ready for review April 5, 2021 20:27
@jsuereth jsuereth requested a review from a team April 5, 2021 20:27
@codecov

codecov Bot commented Apr 5, 2021

Codecov Report

Merging #650 (a0999fc) into main (b0f27a4) will decrease coverage by 0.01%.
The diff coverage is 90.41%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #650      +/-   ##
==========================================
- Coverage   94.66%   94.65%   -0.02%     
==========================================
  Files         204      206       +2     
  Lines        9418     9381      -37     
==========================================
- Hits         8916     8880      -36     
+ Misses        502      501       -1     
Impacted Files Coverage Δ
sdk/src/trace/span.h 100.00% <ø> (ø)
sdk/src/trace/tracer_provider.cc 76.47% <72.72%> (+8.28%) ⬆️
sdk/src/trace/tracer_context.cc 75.00% <75.00%> (ø)
ext/test/zpages/tracez_data_aggregator_test.cc 96.35% <100.00%> (ø)
ext/test/zpages/tracez_processor_test.cc 98.71% <100.00%> (+<0.01%) ⬆️
...clude/opentelemetry/sdk/common/atomic_unique_ptr.h 100.00% <100.00%> (ø)
sdk/include/opentelemetry/sdk/trace/tracer.h 100.00% <100.00%> (ø)
sdk/src/trace/span.cc 89.58% <100.00%> (ø)
sdk/src/trace/tracer.cc 76.92% <100.00%> (+1.92%) ⬆️
sdk/test/trace/tracer_provider_test.cc 100.00% <100.00%> (ø)
... and 5 more

Member

@lalitb lalitb left a comment


LGTM. Thanks for the changes.

Comment thread examples/http/BUILD
"//exporters/ostream:ostream_span_exporter",
"//sdk/src/trace",
],
)
Member


Thanks for adding the bazel build for this.

// 1. If existing processor is an "AggregateProcessor" append the new processor to it.
// 2. If the existing processor is NOT an "AggregateProcessor", create a new Aggregate of this and
// the other,
// then replace our atomic ptr with the new aggregate.
Member


Though it would be a separate PR, just curious how we'd figure out if the existing processor is an AggregateProcessor :)

Contributor Author


Oh, wow, yeah just copy-pasted the documentation. I'll rephrase this for now until the next PR lands.
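To illustrate one possible answer to the question above (these are hypothetical names; an AggregateProcessor did not exist in the SDK at this point): with a polymorphic SpanProcessor base, RTTI via dynamic_cast is enough to tell whether the current processor is already an aggregate.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real SDK classes.
struct SpanProcessor {
  virtual ~SpanProcessor() = default;
};

struct AggregateProcessor : SpanProcessor {
  void Add(std::unique_ptr<SpanProcessor> p) { children.push_back(std::move(p)); }
  std::vector<std::unique_ptr<SpanProcessor>> children;
};

// Steps 1 and 2 from the quoted comment: append when the existing processor
// is already an aggregate; otherwise wrap the existing and the new processor
// in a fresh aggregate.
std::unique_ptr<SpanProcessor> Append(std::unique_ptr<SpanProcessor> current,
                                      std::unique_ptr<SpanProcessor> added) {
  if (auto *agg = dynamic_cast<AggregateProcessor *>(current.get())) {
    agg->Add(std::move(added));
    return current;
  }
  auto agg = std::make_unique<AggregateProcessor>();
  agg->Add(std::move(current));
  agg->Add(std::move(added));
  return agg;
}
```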

Contributor

@seemk seemk left a comment


I apologize for being out of the loop a bit, but what was the problem this (and #590) solved in the context of Resource SDK? Just to get up to speed, thanks!

Comment thread sdk/include/opentelemetry/sdk/trace/tracer_context.h Outdated
@lalitb
Member

lalitb commented Apr 7, 2021

I apologize for being out of the loop a bit, but what was the problem this (and #590) solved in the context of Resource SDK? Just to get up to speed, thanks!

@seemk - This PR is to ensure that the tracer-provider has sole ownership of all the configuration (sampler, processor, resources). You can follow the discussions in #580, specifically these comments:

#580 (comment)
#580 (comment)
#580 (comment) ( the final proposal which is implemented in this PR).

Contributor

@pyohannes pyohannes left a comment


Thanks for doing that work and putting this chunk into its own PR. This makes it much easier to reason about and to review.

Comment on lines +51 to +56
* Obtain the (conceptual) active processor.
*
* Note: When more than one processor is active, this will
* return an "aggregate" processor
*/
SpanProcessor &GetActiveProcessor() const noexcept;
Contributor


What would be an inactive processor?

Member


I don't think there is any inactive processor. @jsuereth can confirm though :)

Contributor Author


'inactive' would be any constructed but not registered processor.

I'm not tied to this name, just took it from Java ;)

Contributor


Perhaps just name it GetProcessor to avoid the confusion.

* Note: When more than one processor is active, this will
* return an "aggregate" processor
*/
SpanProcessor &GetActiveProcessor() const noexcept;
Contributor


We might theoretically run into an issue here when some tracer does some work with a processor obtained by GetActiveProcessor, while another thread calls RegisterPipeline and thus causes the memory pointed to by the obtained processor to be freed.

Contributor Author


Shouldn't be the case. The interface is "RegisterPipeline", and it's a shared pointer. The TracerContext can guarantee that ALL registered processors remain alive for its own lifecycle. I.e. we're not going to delete a pipeline, we're only going to allow registration of new ones specifically to prevent this issue.

We have the inverse problem around Recordable though, specifically if we have an "active" span that did NOT get a recordable registered when a new pipeline is registered, what happens when that span ends?

If you look at my design around "ExportableSpan" where a pipeline can ask for its instance of Recordable (and handle a missing recordable), that's how I planned to solve that issue.

Contributor


I think I understand. So if a new pipeline is registered, the old one isn't just overwritten, but kept in store and just made inactive?

Member


My understanding was that all the pipelines (i.e., processors) remain active, and we can only add a new pipeline but can't remove an existing one. And once we add multi-processor support, TracerContext would actually maintain a unique_ptr to (say) a CompositeSpanProcessor which further manages all the active pipelines/processors?

Contributor Author


Yeah, I'm adopting the design I've seen elsewhere where you can "append a pipeline" but can't remove an existing pipeline.

I.e. once a pipeline is active, it's forever active. We don't have a register/unregister mechanism.
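The append-only behaviour described in this thread can be sketched as follows (hypothetical names, not the committed code): because pipelines can only be added and never removed, a processor handed out earlier is never freed underneath a running Tracer.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the real SDK class.
struct SpanProcessor {
  virtual ~SpanProcessor() = default;
};

class CompositeSpanProcessor : public SpanProcessor {
public:
  // Registration only appends; there is deliberately no Unregister(), so
  // every registered pipeline stays alive for the composite's lifetime.
  void RegisterPipeline(std::shared_ptr<SpanProcessor> p) {
    pipelines_.push_back(std::move(p));
  }

  std::size_t PipelineCount() const { return pipelines_.size(); }

private:
  std::vector<std::shared_ptr<SpanProcessor>> pipelines_;
};
```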

const opentelemetry::sdk::resource::Resource &GetResource() const noexcept;

/**
* Force all active SpanProcessors to flush any buffered spans
Contributor


I'm a bit confused by "all active SpanProcessors". Wouldn't we just have a CompositeSpanProcessor, which would encapsulate flushing and shutdown? So that this is something the CompositeSpanProcessor has to worry about, and not the TracerContext?

Contributor


Personally, I wouldn't mind leaving ForceFlush and Shutdown out. GetActiveProcessor().ForceFlush() isn't that bad.

Contributor Author


This is for users of TracerContext to understand what it does. Yes, this behavior may be implemented by a "CompositeSpanProcessor", but that's not something that would be directly exposed; users just see "RegisterPipeline", "Shutdown", and "ActiveProcessor". I.e. we still need to document the behavior of this class and how to use its methods.

Note: I'm not tied to "active" processor, I borrowed the name from the Java SDK.
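The delegation being discussed here could look roughly like this (hypothetical types; the real signatures may differ): TracerContext documents ForceFlush/Shutdown, but simply forwards to whatever processor is active, which may itself be a composite fanning out to many pipelines.

```cpp
#include <chrono>
#include <memory>

// Hypothetical stand-in for the real SDK class.
struct SpanProcessor {
  virtual ~SpanProcessor() = default;
  virtual bool ForceFlush(std::chrono::microseconds) { return true; }
  virtual bool Shutdown(std::chrono::microseconds) { return true; }
};

class TracerContext {
public:
  explicit TracerContext(std::shared_ptr<SpanProcessor> p)
      : processor_(std::move(p)) {}

  SpanProcessor &GetActiveProcessor() const { return *processor_; }

  // Thin wrappers: the composite (if any) owns the fan-out logic.
  bool ForceFlush(std::chrono::microseconds timeout) {
    return GetActiveProcessor().ForceFlush(timeout);
  }
  bool Shutdown(std::chrono::microseconds timeout) {
    return GetActiveProcessor().Shutdown(timeout);
  }

private:
  std::shared_ptr<SpanProcessor> processor_;
};
```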

* @return The sampler for this tracer provider.
*/
std::shared_ptr<Sampler> GetSampler() const noexcept;
void RegisterProcessor(std::unique_ptr<SpanProcessor> processor) noexcept;
Contributor


We have RegisterProcessor here and RegisterPipeline on the TracerContext. Could we unify the terminology here? I don't have a strong preference as to which term to use.

Comment thread sdk/src/trace/span.cc
}

Span::Span(std::shared_ptr<Tracer> &&tracer,
std::shared_ptr<SpanProcessor> processor,
Contributor


There is a reason why the processor was put onto the span: this is to guarantee that the span/recordable gets ended on the same processor it was created with.

Otherwise, this could happen:

  1. A span/recordable a is created from processor/exporter A.
  2. The processor/exporter is changed, maybe replaced by a composite processor/exporter B.
  3. When the span/recordable a is ended it is passed to the processor/exporter B and cannot be handled in this context (as it was created from processor/exporter A).

We should decide if that's still a use case we want to support. Just putting it out there that this was considered in the initial design.
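The guarantee described above can be sketched as follows (hypothetical types, not the committed code): the span captures the processor it was created from, so the recordable is always ended on the same processor/exporter that produced it, regardless of later pipeline swaps.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-ins for the real SDK classes.
struct Recordable {
  virtual ~Recordable() = default;
};

struct SpanProcessor {
  virtual ~SpanProcessor() = default;
  std::unique_ptr<Recordable> MakeRecordable() {
    return std::make_unique<Recordable>();
  }
  void OnEnd(std::unique_ptr<Recordable> r) { ended += (r != nullptr); }
  int ended = 0;  // counts recordables ended on this processor
};

class Span {
public:
  explicit Span(std::shared_ptr<SpanProcessor> processor)
      : processor_(std::move(processor)),
        recordable_(processor_->MakeRecordable()) {}

  // End() goes back to the captured processor, even if the provider's
  // active processor has been replaced in the meantime.
  void End() { processor_->OnEnd(std::move(recordable_)); }

private:
  std::shared_ptr<SpanProcessor> processor_;
  std::unique_ptr<Recordable> recordable_;
};
```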

Contributor Author


Ah, I had a different proposal where a processor asks for its recordable back later and receives a nullptr (or other "false" flag).

I can revert this for the other design. I think this is a use case that needs support, there's just more than one option for how to do it.

Contributor


I don't mind which option we take. I'm in favor though of guaranteeing that the Recordable passed to the exporter is of the correct type.

@lalitb
Member

lalitb commented Apr 14, 2021

@jsuereth - Would like to merge this PR once you and @pyohannes think it's good to go, as that will allow me to raise a PR for #664 on top of these changes :)

@jsuereth
Contributor Author

@lalitb Understood. So, to make progress, it'd be nice to know which of @pyohannes's comments are blocking and which to address now.

  • "Active processor" terminology. I don't see a counter-proposal and I haven't thought of another term. I'd vote to leave it similar to Java unless there's an alternative.
  • Keeping SpanProcessor on Span vs. relying on the "active span processor" in the tracer. I'm OK going either way here; my preference is what I've done in this PR, but I'm happy to change it.
  • I can update to consistently use RegisterPipeline everywhere, as the spec uses the term "pipeline" to denote a processor+exporter pairing.

WDYT @pyohannes ?

@pyohannes
Contributor

Please let's use RegisterPipeline consistently wherever appropriate.

Otherwise I'm fine with the current approach. I was a bit confused by the missing RegisterPipeline implementation, but I think my main concerns will be handled by a proper implementation of this method and aggregate processors.

Contributor

@pyohannes pyohannes left a comment


Thanks for getting this on the way.

Contributor

@ThomsonTan ThomsonTan left a comment


Update change log?

shared_data = std::shared_ptr<TracezSharedData>(new TracezSharedData());
processor = std::shared_ptr<TracezSpanProcessor>(new TracezSpanProcessor(shared_data));
auto resource = opentelemetry::sdk::resource::Resource::Create({});
// Note: we make a *different* processor for the tracercontext. THis is because
Contributor


Suggested change
// Note: we make a *different* processor for the tracercontext. THis is because
// Note: we make a *different* processor for the tracercontext. This is because

* Attaches a span processor to this tracer context.
*
* @param processor The new span processor for this tracer. This must not be
* a nullptr. Ownership is given to the `TracerContext`.
Contributor


Suggested change
* a nullptr. Ownership is given to the `TracerContext`.
* a nullptr. Ownership is given to the `TracerContext`.


@lalitb
Member

lalitb commented Apr 15, 2021

@lalitb Understood. So, to make progress, it'd be nice to know which of @pyohannes's comments are blocking and which to address now.

  • "Active processor" terminology. I don't see a counter-proposal and I haven't thought of another term. I'd vote to leave it similar to Java unless there's an alternative.
  • Keeping SpanProcessor on Span vs. relying on the "active span processor" in the tracer. I'm OK going either way here; my preference is what I've done in this PR, but I'm happy to change it.
  • I can update to consistently use RegisterPipeline everywhere, as the spec uses the term "pipeline" to denote a processor+exporter pairing.

WDYT @pyohannes ?

Thanks for summarising, @jsuereth. I don't see these as blockers; we can merge the PR once the formatting issue is fixed, as that is blocking the merge :)

@lalitb lalitb merged commit 7b07013 into open-telemetry:main Apr 16, 2021
@jsuereth jsuereth deleted the wip-trace-context branch April 19, 2021 19:16
GerHobbelt pushed a commit to GerHobbelt/opentelemetry-cpp that referenced this pull request Mar 6, 2026
…and-patch-dependencies

Update dependency bazel_lib to v3.2.0