feat: add distributed tracing for webhook handling and PipelineRun timing #2605
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,117 @@ | ||
| --- | ||
| title: Distributed Tracing | ||
| weight: 5 | ||
| --- | ||
|
|
||
| This page describes how to enable OpenTelemetry distributed tracing for Pipelines-as-Code. When enabled, PaC emits trace spans for webhook event processing and PipelineRun lifecycle timing. | ||
|
|
||
| ## Enabling tracing | ||
|
|
||
| The ConfigMap `pipelines-as-code-config-observability` controls tracing configuration. It must exist in the same namespace as the Pipelines-as-Code controller and watcher deployments. See [config/305-config-observability.yaml](https://github.com/tektoncd/pipelines-as-code/blob/main/config/305-config-observability.yaml) for the full example. | ||
|
|
||
| It contains the following tracing fields: | ||
|
|
||
| * `tracing-protocol`: Export protocol. Supported values: `grpc`, `http/protobuf`, `none`. Default is `none` (tracing disabled). | ||
| * `tracing-endpoint`: OTLP collector endpoint. Required when protocol is not `none`. The `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable takes precedence if set. | ||
| * `tracing-sampling-rate`: Fraction of traces to sample, from `0.0` (none) to `1.0` (all). Default is `0.0`. | ||
|
|
||
| ### Example | ||
|
|
||
| ```yaml | ||
| apiVersion: v1 | ||
| kind: ConfigMap | ||
| metadata: | ||
| name: pipelines-as-code-config-observability | ||
| namespace: pipelines-as-code | ||
| data: | ||
| tracing-protocol: grpc | ||
| tracing-endpoint: "http://otel-collector.observability.svc.cluster.local:4317" | ||
| tracing-sampling-rate: "1.0" | ||
| ``` | ||
|
|
||
| Changes to `tracing-protocol`, `tracing-endpoint`, and `tracing-sampling-rate` require restarting the controller and watcher pods. The trace exporter is created once at startup from the ConfigMap values at that time. Set `tracing-protocol` to `none` or remove the tracing keys to disable tracing. | ||
|
|
||
| The controller and watcher locate this ConfigMap by name via the `CONFIG_OBSERVABILITY_NAME` environment variable set in their deployment manifests. Operator-based installations may manage this differently; consult the operator documentation for details. | ||
|
|
||
| ## Emitted spans | ||
|
|
||
| The controller emits a `PipelinesAsCode:ProcessEvent` span for each webhook event. The watcher emits `waitDuration` and `executeDuration` spans for completed PipelineRuns. | ||
|
|
||
| ### Webhook event span (`PipelinesAsCode:ProcessEvent`) | ||
|
|
||
| [OTel VCS semantic conventions](https://opentelemetry.io/docs/specs/semconv/attributes-registry/vcs/): | ||
|
|
||
| | Attribute | Source | | ||
| | --- | --- | | ||
| | `vcs.provider.name` | Git provider name | | ||
| | `vcs.repository.url.full` | Repository URL | | ||
| | `vcs.ref.head.revision` | Head commit SHA | | ||
|
Member
https://opentelemetry.io/docs/specs/semconv/registry/attributes/vcs/#vcs-change-id
||
|
|
||
| PaC-specific: | ||
|
|
||
| | Attribute | Source | | ||
| | --- | --- | | ||
| | `pipelinesascode.tekton.dev.event_type` | Webhook event type | | ||
|
|
||
| ### PipelineRun timing spans (`waitDuration`, `executeDuration`) | ||
|
|
||
| Tekton-compatible bare keys (match Tekton's own reconciler spans for correlation): | ||
|
|
||
| | Attribute | Source | | ||
| | --- | --- | | ||
| | `namespace` | PipelineRun namespace | | ||
| | `pipelinerun` | PipelineRun name | | ||
|
|
||
| Cross-service delivery attributes (`delivery.tekton.dev.*`): | ||
|
|
||
| | Attribute | Source | | ||
| | --- | --- | | ||
| | `delivery.tekton.dev.pipelinerun_uid` | PipelineRun UID | | ||
| | `delivery.tekton.dev.result_message` | First failing TaskRun message; omitted on success; truncated to 1024 bytes | | ||
|
|
||
| Additional `delivery.tekton.dev.*` attributes are sourced from [configurable PipelineRun labels](#configuring-label-sourced-attributes). | ||
|
|
||
| [OTel CI/CD semantic conventions](https://opentelemetry.io/docs/specs/semconv/attributes-registry/cicd/) (`executeDuration` only): | ||
|
|
||
| | Attribute | Source | | ||
| | --- | --- | | ||
| | `cicd.pipeline.result` | Outcome enum (see below) | | ||
|
|
||
| ### `cicd.pipeline.result` enum | ||
|
|
||
| | Condition | Value | | ||
| | --- | --- | | ||
| | `Status=True` | `success` | | ||
| | `Status=False`, reason `Failed` | `failure` | | ||
| | `Status=False`, reason `PipelineRunTimeout` | `timeout` | | ||
| | `Status=False`, reason `Cancelled` or `CancelledRunningFinally` | `cancellation` | | ||
| | `Status=False`, any other reason | `error` | | ||
|
|
||
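The enum table above amounts to a small decision function. The following is a minimal sketch of that mapping, assuming the condition status and reason are available as plain strings; the function name `pipelineResult` and its signature are hypothetical and not taken from this PR, and returning an empty string for a non-terminal status is an assumption.

```go
package main

import "fmt"

// pipelineResult maps a PipelineRun condition (status and reason as plain
// strings) to a cicd.pipeline.result value, following the table above.
// Hypothetical helper for illustration; not the PR's implementation.
func pipelineResult(status, reason string) string {
	if status == "True" {
		return "success"
	}
	if status != "False" {
		// Assumption: any other status (e.g. "Unknown") means the run has
		// not finished, so no result is reported.
		return ""
	}
	switch reason {
	case "Failed":
		return "failure"
	case "PipelineRunTimeout":
		return "timeout"
	case "Cancelled", "CancelledRunningFinally":
		return "cancellation"
	default:
		return "error"
	}
}

func main() {
	fmt.Println(pipelineResult("False", "PipelineRunTimeout")) // prints "timeout"
}
```

The `default` branch is what makes `error` a catch-all: any failure reason not listed explicitly lands there rather than being dropped.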
| ## Configuring label-sourced attributes | ||
|
|
||
| Some span attributes are read from PipelineRun labels. The label names are configurable via the main `pipelines-as-code` ConfigMap so deployments can point at their existing labels without rewriting producers: | ||
|
|
||
| | ConfigMap key | PipelineRun label read (default) | Span attribute emitted | | ||
| | --- | --- | --- | | ||
| | `tracing-label-action` | `delivery.tekton.dev/action` | `cicd.pipeline.action.name` | | ||
| | `tracing-label-application` | `delivery.tekton.dev/application` | `delivery.tekton.dev.application` | | ||
| | `tracing-label-component` | `delivery.tekton.dev/component` | `delivery.tekton.dev.component` | | ||
|
|
||
| Setting a ConfigMap key to the empty string disables emission of that label-sourced attribute. Only label-sourced attributes are affected; all other span attributes are always emitted. The emitted span attribute keys are fixed regardless of which labels are read, so cross-service queries work uniformly. | ||
|
|
||
| Unlike the observability ConfigMap above (which requires a pod restart), changes to these label mappings are picked up automatically without restarting pods. | ||
|
|
||
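To illustrate the label-to-attribute behaviour described in this section, here is a minimal sketch. The `labelMapping` type and `labelAttributes` function are hypothetical names, not code from this PR. Skipping labels that are absent or empty on the PipelineRun is an assumption; the rule that an empty configured label name disables the attribute comes from the text above.

```go
package main

import "fmt"

// labelMapping pairs a configured PipelineRun label name with the fixed
// span attribute key it feeds. Hypothetical type for illustration.
type labelMapping struct {
	label     string // label to read; an empty string disables the attribute
	attribute string // emitted span attribute key (fixed)
}

// labelAttributes returns the span attributes sourced from labels.
func labelAttributes(mappings []labelMapping, labels map[string]string) map[string]string {
	attrs := map[string]string{}
	for _, m := range mappings {
		if m.label == "" {
			continue // ConfigMap key set to "": attribute disabled
		}
		// Assumption: a missing or empty label produces no attribute.
		if v, ok := labels[m.label]; ok && v != "" {
			attrs[m.attribute] = v
		}
	}
	return attrs
}

func main() {
	mappings := []labelMapping{
		{"delivery.tekton.dev/action", "cicd.pipeline.action.name"},
		{"delivery.tekton.dev/application", "delivery.tekton.dev.application"},
		{"", "delivery.tekton.dev.component"}, // tracing-label-component: ""
	}
	labels := map[string]string{
		"delivery.tekton.dev/action":    "build",
		"delivery.tekton.dev/component": "api",
	}
	fmt.Println(labelAttributes(mappings, labels))
}
```

In this usage only `cicd.pipeline.action.name` is emitted: the application label is missing from the PipelineRun, and the component mapping is disabled even though the label is present.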
| ## Trace context propagation | ||
|
|
||
| When Pipelines-as-Code creates a PipelineRun, it sets the `tekton.dev/pipelinerunSpanContext` annotation with a JSON-encoded OTel TextMapCarrier containing the W3C `traceparent`. PaC tracing works independently: PaC spans are emitted regardless of whether Tekton Pipelines has tracing enabled. | ||
|
|
||
| If Tekton Pipelines is also configured with tracing pointing at the same collector, its reconciler spans appear as children of the PaC span, providing a single end-to-end trace from webhook receipt through task execution. See the [Tekton Pipelines tracing documentation](https://github.com/tektoncd/pipeline/blob/main/docs/developers/tracing.md) for Tekton's independent tracing setup. | ||
|
|
||
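For readers unfamiliar with the carrier format, the sketch below shows roughly what the annotation value looks like. It uses only the standard library and builds the `traceparent` string by hand. In real code the OpenTelemetry propagator would normally fill a map-based carrier instead. The function name `spanContextAnnotationValue` and the sample IDs are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Annotation key named in the documentation above.
const spanContextAnnotation = "tekton.dev/pipelinerunSpanContext"

// spanContextAnnotationValue JSON-encodes a text-map carrier holding a W3C
// traceparent of the form "00-<trace-id>-<span-id>-<flags>".
// Hypothetical helper for illustration; not the PR's implementation.
func spanContextAnnotationValue(traceID, spanID string, sampled bool) (string, error) {
	flags := "00"
	if sampled {
		flags = "01"
	}
	carrier := map[string]string{
		"traceparent": fmt.Sprintf("00-%s-%s-%s", traceID, spanID, flags),
	}
	b, err := json.Marshal(carrier)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	v, err := spanContextAnnotationValue(
		"4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %s\n", spanContextAnnotation, v)
}
```

Because the carrier is a plain string map, a consumer can decode the annotation and hand it to a propagator's extract step to continue the same trace.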
| ## Deploying a trace collector | ||
|
|
||
| Pipelines-as-Code exports traces using the standard OpenTelemetry Protocol (OTLP). You need a running OTLP-compatible collector for the `tracing-endpoint` to point to. Common options include: | ||
|
|
||
| * [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) -- the vendor-neutral reference collector | ||
| * [Jaeger](https://www.jaegertracing.io/docs/latest/getting-started/) -- supports OTLP ingestion natively since v1.35 | ||
|
|
||
| Deploying and operating a collector is outside the scope of Pipelines-as-Code. Refer to your organization's observability infrastructure or the links above for setup instructions. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,143 @@ | ||
| package adapter | ||
|
|
||
| import ( | ||
| "context" | ||
| "net/http" | ||
| "testing" | ||
|
|
||
| "github.com/openshift-pipelines/pipelines-as-code/pkg/params/info" | ||
| testtracing "github.com/openshift-pipelines/pipelines-as-code/pkg/test/tracing" | ||
| "github.com/openshift-pipelines/pipelines-as-code/pkg/tracing" | ||
| "go.opentelemetry.io/otel" | ||
| "go.opentelemetry.io/otel/propagation" | ||
| sdktrace "go.opentelemetry.io/otel/sdk/trace" | ||
| semconv "go.opentelemetry.io/otel/semconv/v1.40.0" | ||
| "go.opentelemetry.io/otel/trace" | ||
| "gotest.tools/v3/assert" | ||
| ) | ||
|
|
||
| func TestSetVCSSpanAttributes(t *testing.T) { | ||
| t.Parallel() | ||
|
|
||
| eventTypeKey := string(tracing.PACEventTypeKey) | ||
| repoURLKey := string(semconv.VCSRepositoryURLFullKey) | ||
| headRevKey := string(semconv.VCSRefHeadRevisionKey) | ||
|
|
||
| tests := []struct { | ||
| name string | ||
| event *info.Event | ||
| want map[string]string | ||
| }{ | ||
| { | ||
| name: "full event", | ||
| event: &info.Event{ | ||
| EventType: "pull_request", | ||
| URL: "https://github.com/test/repo", | ||
| SHA: "abc123", | ||
| }, | ||
| want: map[string]string{ | ||
| eventTypeKey: "pull_request", | ||
| repoURLKey: "https://github.com/test/repo", | ||
| headRevKey: "abc123", | ||
| }, | ||
| }, | ||
| { | ||
| name: "event type only", | ||
| event: &info.Event{ | ||
| EventType: "push", | ||
| }, | ||
| want: map[string]string{ | ||
| eventTypeKey: "push", | ||
| }, | ||
| }, | ||
| { | ||
| name: "url without sha", | ||
| event: &info.Event{ | ||
| EventType: "issue_comment", | ||
| URL: "https://github.com/test/repo", | ||
| }, | ||
| want: map[string]string{ | ||
| eventTypeKey: "issue_comment", | ||
| repoURLKey: "https://github.com/test/repo", | ||
| }, | ||
| }, | ||
| } | ||
|
|
||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| t.Parallel() | ||
|
|
||
| exporter := &testtracing.RecordingExporter{} | ||
| tp := sdktrace.NewTracerProvider( | ||
| sdktrace.WithSampler(sdktrace.AlwaysSample()), | ||
| sdktrace.WithSyncer(exporter), | ||
| ) | ||
| defer func() { _ = tp.Shutdown(context.Background()) }() | ||
|
|
||
| ctx, span := tp.Tracer("test").Start(context.Background(), "test-span") | ||
| setVCSSpanAttributes(ctx, tt.event) | ||
| span.End() | ||
|
|
||
| spans := exporter.GetSpans() | ||
| assert.Equal(t, len(spans), 1) | ||
| got := map[string]string{} | ||
| for _, a := range spans[0].Attributes() { | ||
| got[string(a.Key)] = a.Value.AsString() | ||
| } | ||
| assert.DeepEqual(t, got, tt.want) | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| func TestProcessEventSpanHonorsIncomingTraceContext(t *testing.T) { | ||
| exporter := testtracing.SetupTracer(t) | ||
|
|
||
| // Simulate an external system sending a webhook with a traceparent header. | ||
| // Create a parent span to generate a valid trace context. | ||
| parentCtx, parentSpan := otel.Tracer("external-system").Start(context.Background(), "external-root") | ||
| expectedTraceID := parentSpan.SpanContext().TraceID() | ||
| parentSpan.End() | ||
|
|
||
| // Inject the parent context into HTTP headers (what the webhook sender would do). | ||
| req, err := http.NewRequestWithContext(context.Background(), http.MethodPost, "http://localhost", nil) | ||
| assert.NilError(t, err) | ||
| otel.GetTextMapPropagator().Inject(parentCtx, propagation.HeaderCarrier(req.Header)) | ||
|
|
||
| // This is the exact extract → start sequence from handleEvent. | ||
| tracedCtx := otel.GetTextMapPropagator().Extract(context.Background(), propagation.HeaderCarrier(req.Header)) | ||
| _, span := otel.Tracer(tracing.TracerName).Start(tracedCtx, "PipelinesAsCode:ProcessEvent", | ||
| trace.WithSpanKind(trace.SpanKindServer), | ||
| ) | ||
| span.End() | ||
|
|
||
| spans := exporter.GetSpans() | ||
| processSpan := testtracing.FindSpan(spans, "PipelinesAsCode:ProcessEvent") | ||
| assert.Assert(t, processSpan != nil, "ProcessEvent span not found") | ||
| assert.Equal(t, processSpan.Parent().TraceID(), expectedTraceID, | ||
| "ProcessEvent span should be parented under the incoming trace context, not a new root") | ||
| assert.Assert(t, processSpan.Parent().IsValid(), | ||
| "ProcessEvent span should have a valid remote parent") | ||
| } | ||
|
|
||
| func TestProcessEventSpanCreatesRootWithoutIncomingContext(t *testing.T) { | ||
| exporter := testtracing.SetupTracer(t) | ||
|
|
||
| // Webhook with no traceparent header. | ||
| req, err := http.NewRequestWithContext(context.Background(), http.MethodPost, "http://localhost", nil) | ||
| assert.NilError(t, err) | ||
|
|
||
| tracedCtx := otel.GetTextMapPropagator().Extract(context.Background(), propagation.HeaderCarrier(req.Header)) | ||
| _, span := otel.Tracer(tracing.TracerName).Start(tracedCtx, "PipelinesAsCode:ProcessEvent", | ||
| trace.WithSpanKind(trace.SpanKindServer), | ||
| ) | ||
| span.End() | ||
|
|
||
| spans := exporter.GetSpans() | ||
| processSpan := testtracing.FindSpan(spans, "PipelinesAsCode:ProcessEvent") | ||
| assert.Assert(t, processSpan != nil) | ||
| assert.Assert(t, !processSpan.Parent().IsValid(), | ||
| "ProcessEvent span should be a root when no incoming trace context is present") | ||
| } |