OTEP: Process Context: Sharing Resource Attributes with External Readers #4719
This OTEP introduces a standard mechanism for OpenTelemetry SDKs to publish process-level resource attributes for access by out-of-process readers such as the OpenTelemetry eBPF Profiler.

External readers like the OpenTelemetry eBPF Profiler operate outside the instrumented process and cannot access resource attributes configured within OpenTelemetry SDKs. We propose a mechanism for OpenTelemetry SDKs to publish process-level resource attributes through a standard format based on Linux anonymous memory mappings.

When an SDK initializes (or updates its resource attributes), it publishes this information to a small, fixed-size memory region that external processes can discover and read. The OTEL eBPF profiler will then, upon observing a previously-unseen process, probe and read this information, associating it with any profiling samples taken from that process.

_I'm opening this PR as a draft with the intention of sharing with the Profiling SIG for an extra round of feedback before asking for a wider review._

_This OTEP is based on [Sharing Process-Level Resource Attributes with the OpenTelemetry eBPF Profiler](https://docs.google.com/document/d/1-4jo29vWBZZ0nKKAOG13uAQjRcARwmRc4P313LTbPOE/edit?tab=t.0), big thanks to everyone that provided feedback and helped refine the idea so far._
Marking as ready for review!
So this would be a new requirement for eBPF profiler implementations? My issue is the lack of safe support for Erlang/Elixir to do this, while something that could just be accessed as a file or socket wouldn't have that issue. We'd have to pull in a third-party library, or implement one ourselves, that uses a NIF to make these calls. That brings in instability many would rather not have when the goal of our SDK is to not be able to bring down a user's program if the SDK crashes, unless they specifically configure it to do so.
No, a hard requirement should not be the goal: for starters, this is Linux-only (for now), so right off the bat this means it's not going to be available everywhere. Having this discussion is exactly why it was included as one of the open questions in the doc 👍

Our view is that we should go for "recommended to implement" and "recommended to enable by default". In languages/runtimes where it's easy to do so (Go, Rust, Java 22+, possibly Ruby, ...etc?) we should be able to deliver this experience. For others, such as Erlang/Elixir and Java 8-21 (which requires a native library, similar to Erlang/Elixir), the goal would be to make it very easy to enable/use for users that want it, but still optional so as to not impact anyone that is not interested.

We should probably record the above guidance on the OTEP, if/once we're happy with it 🤔
cc @open-telemetry/specs-entities-approvers for extra eyes
This PR was marked stale due to lack of activity. It will be closed in 7 days.
Co-authored-by: Florian Lehner <[email protected]>
Following discussion so far, we can probably avoid having our home-grown `OtelProcessCtx` and instead use the common OTEL `Resource` message.
This PR adds an experimental C/C++ implementation for the "Process Context" OTEP being proposed in open-telemetry/opentelemetry-specification#4719.

This implementation previously lived in https://github.com/ivoanjo/proc-level-demo/tree/main/anonmapping-clib and as discussed during the OTEL profiling SIG meeting we want to add it to this repository so it becomes easier to find and contribute to.

I've made sure to include a README explaining how to use it. Here's the ultra-quick start (Linux-only):

```bash
$ ./build.sh
$ ./build/example_ctx --keep-running
Published: service=my-service, instance=123d8444-2c7e-46e3-89f6-6217880f7123, env=prod, version=4.5.6, sdk=example_ctx.c/c/1.2.3, resources=resource.key1=resource.value1,resource.key2=resource.value2
Continuing forever, to exit press ctrl+c...
TIP: You can now `sudo ./otel_process_ctx_dump.sh 267023` to see the context

# In another shell
$ sudo ./otel_process_ctx_dump.sh 267023 # Update this to match the PID from above
Found OTEL context for PID 267023
Start address: 756f28ce1000
00000000  4f 54 45 4c 5f 43 54 58  02 00 00 00 0b 68 55 47  |OTEL_CTX.....hUG|
00000010  70 24 7d 18 50 01 00 00  a0 82 6d 7e 6a 5f 00 00  |p$}.P.....m~j_..|
00000020
Parsed struct:
  otel_process_ctx_signature       : "OTEL_CTX"
  otel_process_ctx_version         : 2
  otel_process_ctx_published_at_ns : 1764606693650819083 (2025-12-01 16:31:33 GMT)
  otel_process_payload_size        : 336
  otel_process_payload             : 0x00005f6a7e6d82a0
Payload dump (336 bytes):
00000000  0a 25 0a 1b 64 65 70 6c  6f 79 6d 65 6e 74 2e 65  |.%..deployment.e|
00000010  6e 76 69 72 6f 6e 6d 65  6e 74 2e 6e 61 6d 65 12  |nvironment.name.|
...
Protobuf decode:
attributes {
  key: "deployment.environment.name"
  value {
    string_value: "prod"
  }
}
attributes {
  key: "service.instance.id"
  value {
    string_value: "123d8444-2c7e-46e3-89f6-6217880f7123"
  }
}
attributes {
  key: "service.name"
  value {
    string_value: "my-service"
  }
}
...
```

Note that because the upstream OTEP is still under discussion, this implementation is experimental and may need changes to match up with the final version of the OTEP.
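To make the header dump above easier to follow, here is a minimal Python sketch that decodes the same 32 bytes. The field layout (8-byte signature, `u32` version, `u64` timestamp, `u32` payload size, `u64` payload pointer, little-endian with no padding) is inferred from the hexdump and the parsed struct shown in this PR description, not taken from the OTEP text, and the names `HEADER` and `parse_header` are hypothetical.

```python
import struct
from datetime import datetime, timezone

# ASSUMPTION: layout inferred from the hexdump in this PR description, not from
# the normative spec. "<" means little-endian with no alignment padding:
# 8s = signature, I = version, Q = published_at_ns, I = payload_size, Q = payload pointer.
HEADER = struct.Struct("<8sIQIQ")


def parse_header(raw: bytes) -> dict:
    """Decode the fixed-size process-context header from raw bytes."""
    if len(raw) < HEADER.size:
        raise ValueError(f"need {HEADER.size} bytes, got {len(raw)}")
    signature, version, published_at_ns, payload_size, payload_ptr = HEADER.unpack_from(raw)
    if signature != b"OTEL_CTX":
        raise ValueError(f"unexpected signature: {signature!r}")
    return {
        "version": version,
        "published_at_ns": published_at_ns,
        "payload_size": payload_size,
        "payload_ptr": payload_ptr,
    }


# The 32 header bytes exactly as they appear in the dump above.
raw = bytes.fromhex(
    "4f 54 45 4c 5f 43 54 58 02 00 00 00 0b 68 55 47"
    "70 24 7d 18 50 01 00 00 a0 82 6d 7e 6a 5f 00 00"
)

header = parse_header(raw)
published = datetime.fromtimestamp(header["published_at_ns"] // 1_000_000_000, tz=timezone.utc)
print(header["version"], header["payload_size"], hex(header["payload_ptr"]), published.isoformat())
```

The payload pointer refers to memory inside the target process, so a real reader would still need to fetch those `payload_size` bytes from that process (for example via `/proc/<pid>/mem`) before handing them to a protobuf decoder.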
As pointed out during review, these don't necessarily exist for some resources so let's streamline the spec for now.
reyang
left a comment
I've left several suggestions, overall looks good to me!
Co-authored-by: Reiley Yang <[email protected]>
FYI, this will be merged at the end of this week if there are no additional reviews / feedback
Pull request overview
Adds a new OTEP describing a Linux-specific mechanism for OpenTelemetry SDKs to publish process-level Resource attributes into a discoverable memory mapping so out-of-process readers (e.g., the OpenTelemetry eBPF Profiler) can correlate profiles with other signals.
Changes:
- Introduces a header + protobuf payload format for “process context” published via `mmap`/`prctl` and discovered via `/proc/<pid>/maps`.
- Specifies publication, reading, and update protocols (including ordering/barrier guidance).
- Documents trade-offs, alternatives, prototypes, and open questions for standardization.
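The discovery step mentioned in the overview can be sketched roughly as follows. This is only an illustration: the mapping name `[anon:OTEL_CTX]`, the permissions, and the sample addresses are assumptions made for the example (Linux shows mappings named through `prctl` as `[anon:<name>]` in `/proc/<pid>/maps`, but the exact name the OTEP mandates is not quoted in this conversation), and `find_named_mappings` is a hypothetical helper.

```python
from typing import Iterator, NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    name: str


def find_named_mappings(maps_text: str, name: str) -> Iterator[Mapping]:
    """Yield every mapping in /proc/<pid>/maps-style text whose name matches."""
    for line in maps_text.splitlines():
        # Columns: address range, perms, offset, dev, inode, optional pathname/name.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # unnamed mapping
        addr, perms, _offset, _dev, _inode, path = fields
        if path.strip() != name:
            continue
        start_hex, end_hex = addr.split("-")
        yield Mapping(int(start_hex, 16), int(end_hex, 16), perms, path.strip())


# HYPOTHETICAL sample; a real reader would use open(f"/proc/{pid}/maps").read().
SAMPLE_MAPS = """\
5f6a7e6d6000-5f6a7e6f9000 rw-p 00000000 00:00 0                          [heap]
756f28ce1000-756f28ce2000 r--p 00000000 00:00 0                          [anon:OTEL_CTX]
7ffd1c3e0000-7ffd1c401000 rw-p 00000000 00:00 0                          [stack]
"""

found = list(find_named_mappings(SAMPLE_MAPS, "[anon:OTEL_CTX]"))
for m in found:
    print(f"candidate context mapping at {m.start:#x} ({m.end - m.start} bytes, {m.perms})")
```

A reader built along these lines would then read the header from the start address of each candidate mapping and validate its signature before trusting the contents.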
In older versions of the spec, the signature was used for synchronization, not the timestamp, and I missed fully updating this item when that changed.
…in step 2 We tweaked the text above to read it in step 3, but didn't update the reference.
Nice small improvements from copilot Co-authored-by: Copilot <[email protected]>
Co-authored-by: Robert Pająk <[email protected]>
Oops, just spotted the CI checks that are unhappy, fixing...
Ok, CI unhappiness should be addressed! @pellared can I ask you to try it now?
As open-telemetry/opentelemetry-specification#4719 looks to be merged soon, it came up as we were implementing this spec in the OTel eBPF Profiler (open-telemetry/opentelemetry-ebpf-profiler#1181) that it'd be nice to stop copy-pasting the `process_context.proto` and it's time to add it to the proper place. This is my first contribution to this repo so please do point out if I missed something! I didn't touch the collector parts since this message is not expected to be processed by the collector directly.
Changes
`CHANGELOG.md` file updated for non-trivial changes