[OpenTelemetry] Optimize trace sampling by martincostello · Pull Request #7057 · open-telemetry/opentelemetry-dotnet

martincostello · 2026-04-10T13:45:10Z

Changes

While looking at some profiles for an OTel-instrumented application of mine, I noticed that TracerProviderSdk.ComputeActivitySamplingResult() came up in the top 10 OTel-related samples.

This PR adopts a Copilot-authored suggestion to avoid allocating an array in SamplingResult when there are no attributes, so that TracerProviderSdk can skip enumeration.
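As a rough illustration of the idea, the sketch below stores attributes as a nullable field, keeps a non-null public accessor, and lets the caller skip enumeration when nothing was supplied. `SketchSamplingResult` and `TagApplier` are hypothetical stand-ins written for this example and are not the SDK's types; only the `AttributesOrNull` name and the `is { }` check are taken from the diff in this PR.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

var empty = new SketchSamplingResult(attributes: null);
var tagged = new SketchSamplingResult(new[]
{
    new KeyValuePair<string, object>("sampler.kind", "sketch"),
});

Console.WriteLine(TagApplier.CountApplied(empty));  // 0
Console.WriteLine(TagApplier.CountApplied(tagged)); // 1

// Hypothetical stand-in for a sampling result (assumption, not the SDK type).
public readonly struct SketchSamplingResult
{
    // Null means "the sampler supplied no attributes".
    private readonly IEnumerable<KeyValuePair<string, object>>? attributes;

    public SketchSamplingResult(IEnumerable<KeyValuePair<string, object>>? attributes)
        => this.attributes = attributes;

    // The public accessor never returns null, so callers see no behavior change.
    public IEnumerable<KeyValuePair<string, object>> Attributes
        => this.attributes ?? Array.Empty<KeyValuePair<string, object>>();

    // Internal accessor that exposes the "nothing supplied" state directly.
    internal IEnumerable<KeyValuePair<string, object>>? AttributesOrNull
        => this.attributes;
}

// Hypothetical consumer that mirrors the shape of the check in the diff.
public static class TagApplier
{
    public static int CountApplied(SketchSamplingResult result)
    {
        var applied = 0;

        // The property pattern both null-checks and binds a non-null local,
        // so no enumerator is requested when there are no attributes.
        if (result.AttributesOrNull is { } attributes)
        {
            foreach (var attribute in attributes)
            {
                applied++;
            }
        }

        return applied;
    }
}
```

A default-initialized `SketchSamplingResult` also takes the skip path, because its backing field is null.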

Benchmark results

Copilot Summary

This PR is faster in every case, with allocation improvements in one case and unchanged allocations in the others:

Method	`main` Mean	`pr-7057` Mean	Time delta	`main` Alloc	`pr-7057` Alloc	Alloc delta
NoAttributes	122.1 ns	110.2 ns	-9.8%	328 B	328 B	0.0%
WithAttributeArray	181.9 ns	166.5 ns	-8.5%	672 B	640 B	-4.8%
WithAttributeList	277.0 ns	275.5 ns	-0.5%	752 B	752 B	0.0%
Drop	104.7 ns	101.0 ns	-3.5%	328 B	328 B	0.0%
ParentBasedSampled	114.3 ns	109.5 ns	-4.2%	328 B	328 B	0.0%

This PR improves latency by about 0.5% to 9.8% across all cases. Allocation is mostly unchanged, with a small improvement in WithAttributeArray (672 B -> 640 B).

`main` + the new benchmarks

BenchmarkDotNet v0.15.8, Windows 11 (10.0.26200.8117/25H2/2025Update/HudsonValley2)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.201
  [Host]     : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3

Method	Mean	Error	StdDev	Ratio	RatioSD	Gen0	Allocated	Alloc Ratio
NoAttributes	122.1 ns	1.09 ns	1.02 ns	1.00	0.01	0.0260	328 B	1.00
WithAttributeArray	181.9 ns	1.75 ns	1.63 ns	1.49	0.02	0.0534	672 B	2.05
WithAttributeList	277.0 ns	2.85 ns	2.53 ns	2.27	0.03	0.0596	752 B	2.29
Drop	104.7 ns	0.74 ns	0.69 ns	0.86	0.01	0.0261	328 B	1.00
ParentBasedSampled	114.3 ns	0.56 ns	0.47 ns	0.94	0.01	0.0261	328 B	1.00

This PR

BenchmarkDotNet v0.15.8, Windows 11 (10.0.26200.8117/25H2/2025Update/HudsonValley2)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.201
  [Host]     : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3

Method	Mean	Error	StdDev	Ratio	RatioSD	Gen0	Allocated	Alloc Ratio
NoAttributes	110.2 ns	0.65 ns	0.61 ns	1.00	0.01	0.0261	328 B	1.00
WithAttributeArray	166.5 ns	1.11 ns	0.99 ns	1.51	0.01	0.0508	640 B	1.95
WithAttributeList	275.5 ns	1.69 ns	1.50 ns	2.50	0.02	0.0596	752 B	2.29
Drop	101.0 ns	0.75 ns	0.70 ns	0.92	0.01	0.0261	328 B	1.00
ParentBasedSampled	109.5 ns	2.18 ns	2.04 ns	0.99	0.02	0.0261	328 B	1.00

Merge requirement checklist

CONTRIBUTING guidelines followed (license requirements, nullable enabled, static analysis, etc.)
~~Unit tests added/updated~~
~~Appropriate CHANGELOG.md files updated for non-trivial changes~~
~~Changes in public API reviewed (if applicable)~~

codecov · 2026-04-10T13:53:33Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.80%. Comparing base (a2b6372) to head (b800efc).
⚠️ Report is 35 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7057      +/-   ##
==========================================
+ Coverage   88.54%   88.80%   +0.25%     
==========================================
  Files         270      270              
  Lines       12884    13110     +226     
==========================================
+ Hits        11408    11642     +234     
+ Misses       1476     1468       -8

Flag	Coverage Δ
unittests-Project-Experimental	`88.52% <100.00%> (+0.07%)`	⬆️
unittests-Project-Stable	`88.60% <100.00%> (+0.12%)`	⬆️

Files with missing lines	Coverage Δ
src/OpenTelemetry/Trace/Sampler/SamplingResult.cs	`100.00% <100.00%> (ø)`
src/OpenTelemetry/Trace/TracerProviderSdk.cs	`99.35% <100.00%> (+<0.01%)`	⬆️

... and 13 files with indirect coverage changes

Copilot

Pull request overview

This PR optimizes trace sampling in the OpenTelemetry .NET SDK by avoiding unnecessary attribute enumeration/allocations on the hot path when applying sampler-provided tags.

Changes:

Update TracerProviderSdk sampling/tag application to skip attribute iteration when none are provided, and add an array fast-path for tag application.
Adjust SamplingResult to store sampler attributes as nullable internally (enabling the skip in TracerProviderSdk) while preserving the public Attributes API.
Add BenchmarkDotNet benchmarks to measure sampling result/tagging scenarios (no attributes, array-backed attributes, enumerable-backed attributes, drop, and parent-based).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
test/Benchmarks/Trace/SamplingResultBenchmarks.cs	Adds benchmarks covering common sampling/tagging scenarios to validate perf impact.
src/OpenTelemetry/Trace/TracerProviderSdk.cs	Skips attribute processing when none are provided; adds array fast-path for applying sampling tags.
src/OpenTelemetry/Trace/Sampler/SamplingResult.cs	Stores attributes internally as nullable and exposes an internal accessor used to avoid iteration on the no-attributes path.

cijothomas · 2026-04-13T18:54:33Z

        if (activitySamplingResult > ActivitySamplingResult.PropagationData)
        {
-            foreach (var att in samplingResult.Attributes)
+            if (samplingResult.AttributesOrNull is { } attributes)


❤️ nice elimination of the alloc here!

Would we get the same benefit by comparing Attributes to the empty array (Array.Empty<KeyValuePair<string, object>>()) first, before attempting to iterate? That would be simpler in terms of code complexity, even though it may not be future-proof.

Not sure, but the compiler can use special private types for collection expressions, so checking for "not null" is probably better than assuming it's the exact object returned by Array.Empty<T>().
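To make the trade-off in this thread concrete, here is a minimal sketch of the reference-comparison approach and where it stops working. `EmptyChecks` is a hypothetical helper written for this example. It only shows that an empty collection is not necessarily the shared `Array.Empty<T>()` instance; it makes no claim about what the compiler emits for collection expressions.

```csharp
using System;
using System.Collections.Generic;

IEnumerable<KeyValuePair<string, object>> shared =
    Array.Empty<KeyValuePair<string, object>>();
IEnumerable<KeyValuePair<string, object>> freshArray =
    new KeyValuePair<string, object>[0];
IEnumerable<KeyValuePair<string, object>> emptyList =
    new List<KeyValuePair<string, object>>();

Console.WriteLine(EmptyChecks.IsSharedEmptyArray(shared));     // True
Console.WriteLine(EmptyChecks.IsSharedEmptyArray(freshArray)); // False
Console.WriteLine(EmptyChecks.IsSharedEmptyArray(emptyList));  // False

// Hypothetical helper (assumption, not part of the SDK).
public static class EmptyChecks
{
    // Only recognizes the one cached instance returned by Array.Empty<T>();
    // any other empty collection falls through to normal enumeration.
    public static bool IsSharedEmptyArray(
        IEnumerable<KeyValuePair<string, object>> attributes)
        => ReferenceEquals(attributes, Array.Empty<KeyValuePair<string, object>>());
}
```

A null check on an internal nullable field avoids this dependence on object identity, which is the design choice the reply above argues for.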

cijothomas · 2026-04-13T19:10:07Z

@@ -0,0 +1,134 @@
+// Copyright The OpenTelemetry Authors


nit: could we put in the same SamplerBenchmarks?

Sorry, I don't quite follow this comment.

Do you mean to just move the new benchmarks into SamplerBenchmarks?

cijothomas

The change itself is fine. I left a couple of suggestions to reduce complexity. This is a crucial code path, so perf gains are always worth fighting for, but I also think we should keep complexity minimal, if feasible.

No blockers. I suggest getting an additional maintainer review, given this is on such a crucial path.

Thanks for looking for performance improvements! Very excited for this (and other PRs towards perf)

Kielek

I fully agree with @cijothomas's comments.

martincostello · 2026-04-14T16:12:09Z

With the latest changes:

Copilot Summary

Overall: This PR is faster on 7/8 common benchmarks. The only regression is SamplerNotModifyingTraceState at 1.03x / +2.9% slower. Allocations are unchanged everywhere at 1.00x / 0%.

PR/main ratios are shown below; for duration, < 1.00x is faster.

Suite	Benchmark	Duration (main -> PR)	Duration ratio / change	Alloc (main -> PR)	Alloc ratio / change
`SamplerBenchmarks`	`SamplerNotModifyingTraceState`	117.6 ns -> 121.0 ns	1.03x / +2.9%	328 B -> 328 B	1.00x / 0.0%
`SamplerBenchmarks`	`SamplerModifyingTraceState`	125.1 ns -> 120.3 ns	0.96x / -3.8%	328 B -> 328 B	1.00x / 0.0%
`SamplerBenchmarks`	`SamplerAppendingTraceState`	153.7 ns -> 129.6 ns	0.84x / -15.7%	384 B -> 384 B	1.00x / 0.0%
`SamplingResultBenchmarks`	`NoAttributes`	121.4 ns -> 116.5 ns	0.96x / -4.0%	328 B -> 328 B	1.00x / 0.0%
`SamplingResultBenchmarks`	`WithAttributeArray`	177.7 ns -> 175.3 ns	0.99x / -1.4%	672 B -> 672 B	1.00x / 0.0%
`SamplingResultBenchmarks`	`WithAttributeList`	298.1 ns -> 282.9 ns	0.95x / -5.1%	752 B -> 752 B	1.00x / 0.0%
`SamplingResultBenchmarks`	`Drop`	104.3 ns -> 101.2 ns	0.97x / -3.0%	328 B -> 328 B	1.00x / 0.0%
`SamplingResultBenchmarks`	`ParentBasedSampled`	141.6 ns -> 111.4 ns	0.79x / -21.3%	328 B -> 328 B	1.00x / 0.0%

Suite-level summary

SamplerBenchmarks: geometric mean duration ratio 0.94x (about 5.9% faster overall), allocations 1.00x / 0%.
SamplingResultBenchmarks: geometric mean duration ratio 0.93x (about 7.3% faster overall), allocations 1.00x / 0%.
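The suite-level figures can be reproduced from the per-benchmark means in the table above. The sketch below computes the geometric mean of the PR/main duration ratios as the exponential of the average log ratio. `Stats.GeometricMean` is a hypothetical helper written for this example and is not part of BenchmarkDotNet.

```csharp
using System;
using System.Globalization;
using System.Linq;

// PR mean / main mean, taken from the table above (values in ns).
double[] samplerRatios = { 121.0 / 117.6, 120.3 / 125.1, 129.6 / 153.7 };
double[] samplingResultRatios =
{
    116.5 / 121.4, 175.3 / 177.7, 282.9 / 298.1, 101.2 / 104.3, 111.4 / 141.6,
};

Console.WriteLine(Stats.Format(Stats.GeometricMean(samplerRatios)));        // 0.94x
Console.WriteLine(Stats.Format(Stats.GeometricMean(samplingResultRatios))); // 0.93x

// Hypothetical helper (assumption, written for this example).
public static class Stats
{
    // Geometric mean via logs: exp(mean(ln r_i)).
    public static double GeometricMean(double[] ratios)
        => Math.Exp(ratios.Sum(r => Math.Log(r)) / ratios.Length);

    // Invariant culture keeps the decimal separator stable across locales.
    public static string Format(double ratio)
        => ratio.ToString("F2", CultureInfo.InvariantCulture) + "x";
}
```

The geometric mean is used for ratios because it treats a 2x slowdown and a 2x speedup as cancelling out, which an arithmetic mean does not.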

`main` + the new benchmarks

BenchmarkDotNet v0.15.8, Windows 11 (10.0.26200.8117/25H2/2025Update/HudsonValley2)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.201
  [Host]     : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3

Method	Mean	Error	StdDev	Gen0	Allocated
SamplerNotModifyingTraceState	117.6 ns	0.60 ns	0.53 ns	0.0261	328 B
SamplerModifyingTraceState	125.1 ns	0.36 ns	0.32 ns	0.0260	328 B
SamplerAppendingTraceState	153.7 ns	1.04 ns	0.92 ns	0.0305	384 B

BenchmarkDotNet v0.15.8, Windows 11 (10.0.26200.8117/25H2/2025Update/HudsonValley2)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.201
  [Host]     : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3

Method	Mean	Error	StdDev	Median	Ratio	RatioSD	Gen0	Allocated	Alloc Ratio
NoAttributes	121.4 ns	1.01 ns	0.90 ns	121.7 ns	1.00	0.01	0.0260	328 B	1.00
WithAttributeArray	177.7 ns	3.16 ns	5.86 ns	174.8 ns	1.46	0.05	0.0534	672 B	2.05
WithAttributeList	298.1 ns	1.81 ns	1.69 ns	298.8 ns	2.46	0.02	0.0596	752 B	2.29
Drop	104.3 ns	0.41 ns	0.39 ns	104.2 ns	0.86	0.01	0.0261	328 B	1.00
ParentBasedSampled	141.6 ns	0.94 ns	0.88 ns	141.5 ns	1.17	0.01	0.0260	328 B	1.00

This PR

BenchmarkDotNet v0.15.8, Windows 11 (10.0.26200.8117/25H2/2025Update/HudsonValley2)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.201
  [Host]     : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3

Method	Mean	Error	StdDev	Gen0	Allocated
SamplerNotModifyingTraceState	121.0 ns	1.20 ns	1.06 ns	0.0260	328 B
SamplerModifyingTraceState	120.3 ns	0.89 ns	0.79 ns	0.0260	328 B
SamplerAppendingTraceState	129.6 ns	0.47 ns	0.44 ns	0.0305	384 B

BenchmarkDotNet v0.15.8, Windows 11 (10.0.26200.8117/25H2/2025Update/HudsonValley2)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.201
  [Host]     : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 10.0.5 (10.0.5, 10.0.526.15411), X64 RyuJIT x86-64-v3

Method	Mean	Error	StdDev	Ratio	RatioSD	Gen0	Allocated	Alloc Ratio
NoAttributes	116.5 ns	1.92 ns	1.80 ns	1.00	0.02	0.0261	328 B	1.00
WithAttributeArray	175.3 ns	1.60 ns	1.50 ns	1.51	0.03	0.0534	672 B	2.05
WithAttributeList	282.9 ns	2.91 ns	2.72 ns	2.43	0.04	0.0596	752 B	2.29
Drop	101.2 ns	0.87 ns	0.77 ns	0.87	0.01	0.0261	328 B	1.00
ParentBasedSampled	111.4 ns	0.97 ns	0.90 ns	0.96	0.02	0.0261	328 B	1.00

martincostello · 2026-04-14T16:13:16Z

Dropping the special-casing of concrete types and enumerating through IEnumerable<T> instead removes the allocation gains.
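For context, special-casing a concrete type usually means pattern-matching on the runtime type before falling back to the interface. The sketch below shows one way such an array fast path could look. It is not the code that was removed from this PR. `SketchTags` and the dictionary sink are hypothetical stand-ins for the SDK's tag application.

```csharp
using System;
using System.Collections.Generic;

var fromArray = new KeyValuePair<string, object>[] { new("a", 1), new("b", 2) };
var fromList = new List<KeyValuePair<string, object>> { new("c", 3) };

var sink = new Dictionary<string, object>();
SketchTags.Apply(sink, fromArray);
SketchTags.Apply(sink, fromList);
Console.WriteLine(sink.Count); // 3

// Hypothetical helper (assumption, not the SDK's implementation).
public static class SketchTags
{
    public static void Apply(
        Dictionary<string, object> sink,
        IEnumerable<KeyValuePair<string, object>> attributes)
    {
        // Fast path: foreach over a statically typed array compiles to an
        // indexed loop, so no enumerator object is requested.
        if (attributes is KeyValuePair<string, object>[] array)
        {
            foreach (var attribute in array)
            {
                sink[attribute.Key] = attribute.Value;
            }

            return;
        }

        // General path: goes through IEnumerable<T>.GetEnumerator(), which
        // may allocate an enumerator depending on the collection and runtime.
        foreach (var attribute in attributes)
        {
            sink[attribute.Key] = attribute.Value;
        }
    }
}
```

Both paths produce the same tags; the difference is only in how the items are walked, which is why reviewers weighed the extra branch against its measured benefit.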

github-actions bot added pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package perf Performance related labels Apr 10, 2026

martincostello marked this pull request as ready for review April 10, 2026 14:29

martincostello requested a review from a team as a code owner April 10, 2026 14:29

Copilot AI review requested due to automatic review settings April 10, 2026 14:29

Copilot started reviewing on behalf of martincostello April 10, 2026 14:30 View session

Copilot AI reviewed Apr 10, 2026

Comment thread src/OpenTelemetry/Trace/Sampler/SamplingResult.cs

martincostello commented Apr 10, 2026

Comment thread src/OpenTelemetry/Trace/Sampler/SamplingResult.cs Outdated

martincostello mentioned this pull request Apr 12, 2026

[DNM] Performance improvements #7062

Closed

4 tasks

martincostello marked this pull request as draft April 13, 2026 13:07

martincostello added 4 commits April 13, 2026 15:26

[OpenTelemetry] Optimize trace sampling

eb262cb

Optimize `SamplingResult` for cases when there are no attributes to avoid allocating an enumerator and add benchmarks.

[OpenTelemetry] Fix lint warnings

8213721

Use normal dashes.

[OpenTelemetry] Revert accessibility modifier

2ec341e

The benchmark doesn't call the method directly, so it can stay private.

[OpenTelemetry] Address feedback

de0e117

Remove inaccurate comment.

martincostello force-pushed the optimize-trace-sampling branch from 0ab86a9 to de0e117 Compare April 13, 2026 14:26

martincostello marked this pull request as ready for review April 13, 2026 15:25

cijothomas reviewed Apr 13, 2026

Comment thread src/OpenTelemetry/Trace/TracerProviderSdk.cs

cijothomas reviewed Apr 13, 2026

Comment thread src/OpenTelemetry/Trace/TracerProviderSdk.cs Outdated

cijothomas approved these changes Apr 13, 2026

Kielek reviewed Apr 14, 2026

Comment thread src/OpenTelemetry/Trace/TracerProviderSdk.cs Outdated

Kielek approved these changes Apr 14, 2026

[OpenTelemetry] Address feedback

b800efc

Remove special casing for arrays.

martincostello requested review from Kielek and cijothomas April 14, 2026 16:13

martincostello added this pull request to the merge queue Apr 18, 2026

Merged via the queue into open-telemetry:main with commit 810fba4 Apr 18, 2026
63 checks passed

martincostello deleted the optimize-trace-sampling branch April 18, 2026 07:29
