Commit 9f9c728

ci-operator and claude authored and committed
feat: add distributed tracing for webhook handling and PipelineRun timing
Instrument PaC with OpenTelemetry tracing. The controller emits a ProcessEvent span for each webhook, honoring incoming W3C trace context when present. The watcher emits waitDuration and executeDuration timing spans for completed PipelineRuns. Trace context is propagated onto created PipelineRuns via annotation for end-to-end delivery traces.

Tracing configuration uses the existing pipelines-as-code-config-observability ConfigMap (located via CONFIG_OBSERVABILITY_NAME env var). Label-sourced span attributes are configurable via the main pipelines-as-code ConfigMap. See docs/content/docs/operations/tracing.md for the schema.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent c9be9d6 commit 9f9c728

20 files changed

+1577
-20
lines changed
config/302-pac-configmap.yaml

Lines changed: 6 additions & 0 deletions
@@ -173,6 +173,12 @@ data:
   # Default: true
   skip-push-event-for-pr-commits: "true"

+  # PipelineRun label names PaC reads to populate tracing span attributes.
+  # Empty disables emission of the corresponding attribute.
+  tracing-label-action: "delivery.tekton.dev/action"
+  tracing-label-application: "delivery.tekton.dev/application"
+  tracing-label-component: "delivery.tekton.dev/component"
+
   # Configure a custom console here, the driver support custom parameters from
   # Repo CR along a few other template variable, see documentation for more
   # details

config/305-config-observability.yaml

Lines changed: 15 additions & 1 deletion
@@ -50,4 +50,18 @@ data:

 # metrics-export-interval specifies how often metrics are exported.
 # Only applicable for grpc and http/protobuf protocols.
-# metrics-export-interval: "30s"
+# metrics-export-interval: "30s"
+
+# tracing-protocol specifies the trace export protocol.
+# Supported values: "grpc", "http/protobuf", "none".
+# Default is "none" (tracing disabled).
+# tracing-protocol: "none"
+
+# tracing-endpoint specifies the OTLP collector endpoint.
+# Required when tracing-protocol is "grpc" or "http/protobuf".
+# The OTEL_EXPORTER_OTLP_ENDPOINT env var takes precedence if set.
+# tracing-endpoint: "http://otel-collector.observability.svc.cluster.local:4317"
+
+# tracing-sampling-rate controls the fraction of traces sampled.
+# 0.0 = none, 1.0 = all. Default is 0 (none).
+# tracing-sampling-rate: "1.0"
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
---
title: Distributed Tracing
weight: 5
---

This page describes how to enable OpenTelemetry distributed tracing for Pipelines-as-Code (PaC). When enabled, PaC emits trace spans for webhook event processing and PipelineRun lifecycle timing.

## Enabling tracing

The `pipelines-as-code-config-observability` ConfigMap controls tracing configuration. It must exist in the same namespace as the Pipelines-as-Code controller and watcher deployments. See [config/305-config-observability.yaml](https://github.com/tektoncd/pipelines-as-code/blob/main/config/305-config-observability.yaml) for the full example.

It contains the following tracing fields:

* `tracing-protocol`: Export protocol. Supported values: `grpc`, `http/protobuf`, `none`. Default is `none` (tracing disabled).
* `tracing-endpoint`: OTLP collector endpoint. Required when `tracing-protocol` is not `none`. The `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable takes precedence if set.
* `tracing-sampling-rate`: Fraction of traces to sample. `0.0` = none, `1.0` = all. Default is `0`, so setting only `tracing-protocol` and `tracing-endpoint` produces no sampled traces until the rate is raised.

### Example

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pipelines-as-code-config-observability
  namespace: pipelines-as-code
data:
  tracing-protocol: grpc
  tracing-endpoint: "http://otel-collector.observability.svc.cluster.local:4317"
  tracing-sampling-rate: "1.0"
```

Changes to `tracing-protocol`, `tracing-endpoint`, and `tracing-sampling-rate` require restarting the controller and watcher pods. The trace exporter is created once at startup from the ConfigMap values at that time. Set `tracing-protocol` to `none` or remove the tracing keys to disable tracing.

The controller and watcher locate this ConfigMap by name via the `CONFIG_OBSERVABILITY_NAME` environment variable set in their deployment manifests. Operator-based installations may manage this differently; consult the operator documentation for details.
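The documented rules for the three tracing keys can be sketched in plain Go as follows. This is a minimal illustration, not PaC's actual implementation: `TracingConfig` and `ResolveTracing` are hypothetical names, the `envEndpoint` parameter stands in for the value of `OTEL_EXPORTER_OTLP_ENDPOINT`, and rejecting a sampling rate outside `[0, 1]` is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// TracingConfig is a hypothetical holder for the resolved settings.
type TracingConfig struct {
	Protocol     string
	Endpoint     string
	SamplingRate float64
}

// Enabled reports whether any exporter would be created.
func (c TracingConfig) Enabled() bool { return c.Protocol != "none" }

// ResolveTracing applies the documented defaults and precedence to the raw
// ConfigMap data. envEndpoint stands in for OTEL_EXPORTER_OTLP_ENDPOINT.
func ResolveTracing(data map[string]string, envEndpoint string) (TracingConfig, error) {
	cfg := TracingConfig{Protocol: "none"} // documented default: disabled
	if p := data["tracing-protocol"]; p != "" {
		cfg.Protocol = p
	}
	switch cfg.Protocol {
	case "none":
		return cfg, nil
	case "grpc", "http/protobuf":
		// supported, continue below
	default:
		return cfg, fmt.Errorf("unsupported tracing-protocol %q", cfg.Protocol)
	}

	// The environment variable takes precedence over the ConfigMap key.
	cfg.Endpoint = data["tracing-endpoint"]
	if envEndpoint != "" {
		cfg.Endpoint = envEndpoint
	}
	if cfg.Endpoint == "" {
		return cfg, errors.New("tracing-endpoint is required when tracing is enabled")
	}

	// Default sampling rate is 0 (nothing sampled).
	if raw := data["tracing-sampling-rate"]; raw != "" {
		rate, err := strconv.ParseFloat(raw, 64)
		// Assumption: values outside [0, 1] (including NaN) are rejected.
		if err != nil || !(rate >= 0 && rate <= 1) {
			return cfg, fmt.Errorf("invalid tracing-sampling-rate %q", raw)
		}
		cfg.SamplingRate = rate
	}
	return cfg, nil
}

func main() {
	cfg, err := ResolveTracing(map[string]string{
		"tracing-protocol":      "grpc",
		"tracing-endpoint":      "http://otel-collector.observability.svc.cluster.local:4317",
		"tracing-sampling-rate": "1.0",
	}, "")
	fmt.Println(cfg.Enabled(), cfg.Endpoint, cfg.SamplingRate, err)
}
```

Because the exporter is built once at startup, a resolution step like this would run a single time per process, which is why edits to these keys only take effect after a pod restart.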

## Emitted spans

The controller emits a `PipelinesAsCode:ProcessEvent` span for each webhook event. The watcher emits `waitDuration` and `executeDuration` spans for completed PipelineRuns.

### Webhook event span (`PipelinesAsCode:ProcessEvent`)

Attributes from the [OTel VCS semantic conventions](https://opentelemetry.io/docs/specs/semconv/attributes-registry/vcs/):

| Attribute | Source |
| --- | --- |
| `vcs.provider.name` | Git provider name |
| `vcs.repository.url.full` | Repository URL |
| `vcs.ref.head.revision` | Head commit SHA |

PaC-specific attributes:

| Attribute | Source |
| --- | --- |
| `pipelinesascode.tekton.dev.event_type` | Webhook event type |

### PipelineRun timing spans (`waitDuration`, `executeDuration`)

Tekton-compatible bare keys, which match the keys on Tekton's own reconciler spans so the two can be correlated:

| Attribute | Source |
| --- | --- |
| `namespace` | PipelineRun namespace |
| `pipelinerun` | PipelineRun name |

Cross-service delivery attributes (`delivery.tekton.dev.*`):

| Attribute | Source |
| --- | --- |
| `delivery.tekton.dev.pipelinerun_uid` | PipelineRun UID |
| `delivery.tekton.dev.result_message` | First failing TaskRun message; omitted on success; truncated to 1024 bytes |

Additional `delivery.tekton.dev.*` attributes are sourced from [configurable PipelineRun labels](#configuring-label-sourced-attributes).
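The 1024-byte limit on `delivery.tekton.dev.result_message` is a byte limit, not a character limit, so a naive slice can cut a multi-byte UTF-8 character in half. The sketch below shows one way to truncate safely. `TruncateBytes` is a hypothetical helper, and backing off to a rune boundary is an assumption; the source only states that the message is truncated to 1024 bytes.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// maxResultMessageBytes mirrors the documented 1024-byte limit.
const maxResultMessageBytes = 1024

// TruncateBytes returns s cut to at most maxBytes bytes. If the cut would
// land inside a multi-byte character, it backs off to the previous rune
// boundary so the result stays valid UTF-8 (assumed behavior).
func TruncateBytes(s string, maxBytes int) string {
	if maxBytes <= 0 {
		return ""
	}
	if len(s) <= maxBytes {
		return s
	}
	cut := maxBytes
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

func main() {
	long := "task failed: " + strings.Repeat("é", 600)
	short := TruncateBytes(long, maxResultMessageBytes)
	fmt.Println(len(long), len(short), utf8.ValidString(short))
}
```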

Attributes from the [OTel CI/CD semantic conventions](https://opentelemetry.io/docs/specs/semconv/attributes-registry/cicd/) (`executeDuration` only):

| Attribute | Source |
| --- | --- |
| `cicd.pipeline.result` | Outcome enum (see below) |

### `cicd.pipeline.result` enum

| Condition | Value |
| --- | --- |
| `Status=True` | `success` |
| `Status=False`, reason `Failed` | `failure` |
| `Status=False`, reason `PipelineRunTimeout` | `timeout` |
| `Status=False`, reason `Cancelled` or `CancelledRunningFinally` | `cancellation` |
| `Status=False`, any other reason | `error` |

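The enum table above amounts to a small decision function. The following Go sketch restates it using plain strings for the condition status and reason. `PipelineResult` is a hypothetical name, and returning `false` for a status other than `True` or `False` is an assumption, since the table only covers completed PipelineRuns.

```go
package main

import "fmt"

// PipelineResult maps a PipelineRun's terminal condition to the
// cicd.pipeline.result value from the table. The boolean is false when the
// status is neither "True" nor "False" (assumed: the run has not completed).
func PipelineResult(status, reason string) (string, bool) {
	switch status {
	case "True":
		return "success", true
	case "False":
		switch reason {
		case "Failed":
			return "failure", true
		case "PipelineRunTimeout":
			return "timeout", true
		case "Cancelled", "CancelledRunningFinally":
			return "cancellation", true
		default:
			// Any other failure reason is reported as a generic error.
			return "error", true
		}
	default:
		return "", false
	}
}

func main() {
	for _, reason := range []string{"Failed", "PipelineRunTimeout", "Cancelled", "SomeOtherReason"} {
		result, _ := PipelineResult("False", reason)
		fmt.Printf("%s -> %s\n", reason, result)
	}
}
```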

## Configuring label-sourced attributes

Some span attributes are read from PipelineRun labels. The label names are configurable via the main `pipelines-as-code` ConfigMap so deployments can point at their existing labels without rewriting producers:

| ConfigMap key | PipelineRun label read (default) | Span attribute emitted |
| --- | --- | --- |
| `tracing-label-action` | `delivery.tekton.dev/action` | `cicd.pipeline.action.name` |
| `tracing-label-application` | `delivery.tekton.dev/application` | `delivery.tekton.dev.application` |
| `tracing-label-component` | `delivery.tekton.dev/component` | `delivery.tekton.dev.component` |

Setting a ConfigMap key to the empty string disables emission of that label-sourced attribute. Only label-sourced attributes are affected; all other span attributes are always emitted. The emitted span attribute keys are fixed regardless of which labels are read, so cross-service queries work uniformly.

Unlike the observability ConfigMap above (which requires a pod restart), changes to these label mappings are picked up automatically without restarting pods.
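The mapping from ConfigMap key to label to span attribute can be illustrated with a short Go sketch. `LabelAttributes` is a hypothetical helper, not PaC's code. It assumes the caller passes the effective configuration with defaults already applied, and that a label missing from the PipelineRun simply yields no attribute.

```go
package main

import "fmt"

// attributeForConfigKey pairs each ConfigMap key with the fixed span
// attribute it feeds, as listed in the table above.
var attributeForConfigKey = map[string]string{
	"tracing-label-action":      "cicd.pipeline.action.name",
	"tracing-label-application": "delivery.tekton.dev.application",
	"tracing-label-component":   "delivery.tekton.dev.component",
}

// LabelAttributes returns the label-sourced span attributes for one
// PipelineRun. cfg maps ConfigMap keys to the label names to read; an empty
// label name disables that attribute.
func LabelAttributes(cfg, labels map[string]string) map[string]string {
	attrs := map[string]string{}
	for cfgKey, attrKey := range attributeForConfigKey {
		labelName := cfg[cfgKey]
		if labelName == "" {
			continue // empty string disables emission
		}
		// Assumption: absent or empty labels produce no attribute.
		if value, ok := labels[labelName]; ok && value != "" {
			attrs[attrKey] = value
		}
	}
	return attrs
}

func main() {
	cfg := map[string]string{
		"tracing-label-action":      "delivery.tekton.dev/action",
		"tracing-label-application": "", // disabled
		"tracing-label-component":   "example.com/component", // custom label
	}
	labels := map[string]string{
		"delivery.tekton.dev/action":      "build",
		"delivery.tekton.dev/application": "shop",
		"example.com/component":           "cart",
	}
	fmt.Println(LabelAttributes(cfg, labels))
}
```

Note that the attribute key for the component stays `delivery.tekton.dev.component` even though the value is read from a custom label, which is what keeps cross-service queries uniform.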

## Trace context propagation

When Pipelines-as-Code creates a PipelineRun, it sets the `tekton.dev/pipelinerunSpanContext` annotation with a JSON-encoded OTel TextMapCarrier containing the W3C `traceparent`. PaC tracing works independently: you get PaC spans regardless of whether Tekton Pipelines has tracing enabled.

If Tekton Pipelines is also configured with tracing pointing at the same collector, its reconciler spans appear as children of the PaC span, providing a single end-to-end trace from webhook receipt through task execution. See the [Tekton Pipelines tracing documentation](https://github.com/tektoncd/pipeline/blob/main/docs/developers/tracing.md) for Tekton's independent tracing setup.
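To make the annotation format concrete, the sketch below encodes and decodes a carrier using only the Go standard library. It assumes the carrier serializes as a flat JSON object of string keys and values, which is what a map-based TextMapCarrier would produce; `EncodeCarrier` and `TraceIDFromAnnotation` are hypothetical helpers, and the validation is a loose check of the version `00` `traceparent` layout (`version-traceid-parentid-flags`), not a full W3C parser.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// spanContextAnnotation is the annotation PaC sets on created PipelineRuns.
const spanContextAnnotation = "tekton.dev/pipelinerunSpanContext"

// EncodeCarrier builds the annotation value for a given traceparent
// (assumed shape: a flat JSON object).
func EncodeCarrier(traceparent string) (string, error) {
	b, err := json.Marshal(map[string]string{"traceparent": traceparent})
	return string(b), err
}

// TraceIDFromAnnotation decodes the annotation value and returns the
// 32-hex-character trace ID from its traceparent entry.
func TraceIDFromAnnotation(value string) (string, error) {
	var carrier map[string]string
	if err := json.Unmarshal([]byte(value), &carrier); err != nil {
		return "", err
	}
	parts := strings.Split(carrier["traceparent"], "-")
	if len(parts) != 4 || len(parts[0]) != 2 || len(parts[1]) != 32 ||
		len(parts[2]) != 16 || len(parts[3]) != 2 {
		return "", errors.New("malformed traceparent")
	}
	for _, p := range parts {
		if _, err := hex.DecodeString(p); err != nil {
			return "", errors.New("traceparent fields must be hex")
		}
	}
	if parts[1] == strings.Repeat("0", 32) {
		return "", errors.New("all-zero trace ID is invalid")
	}
	return parts[1], nil
}

func main() {
	value, err := EncodeCarrier("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		panic(err)
	}
	traceID, err := TraceIDFromAnnotation(value)
	fmt.Println(spanContextAnnotation, value, traceID, err)
}
```

Reading the trace ID back out of a PipelineRun's annotation this way can help when looking up the corresponding trace in a collector UI.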

## Deploying a trace collector

Pipelines-as-Code exports traces using the standard OpenTelemetry Protocol (OTLP). You need a running OTLP-compatible collector for the `tracing-endpoint` to point to. Common options include:

* [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) -- the vendor-neutral reference collector
* [Jaeger](https://www.jaegertracing.io/docs/latest/getting-started/) -- supports OTLP ingestion natively since v1.35

Deploying and operating a collector is outside the scope of Pipelines-as-Code. Refer to your organization's observability infrastructure or the links above for setup instructions.

go.mod

Lines changed: 2 additions & 2 deletions
@@ -30,7 +30,9 @@ require (
 	gitlab.com/gitlab-org/api/client-go v1.46.0
 	go.opentelemetry.io/otel v1.43.0
 	go.opentelemetry.io/otel/metric v1.43.0
+	go.opentelemetry.io/otel/sdk v1.43.0
 	go.opentelemetry.io/otel/sdk/metric v1.43.0
+	go.opentelemetry.io/otel/trace v1.43.0
 	go.uber.org/zap v1.27.1
 	golang.org/x/exp v0.0.0-20260312153236-7ab1446f8b90
 	golang.org/x/oauth2 v0.36.0
@@ -91,8 +93,6 @@ require (
 	go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.43.0 // indirect
 	go.opentelemetry.io/otel/exporters/prometheus v0.65.0 // indirect
 	go.opentelemetry.io/otel/exporters/stdout/stdouttrace v1.43.0 // indirect
-	go.opentelemetry.io/otel/sdk v1.43.0 // indirect
-	go.opentelemetry.io/otel/trace v1.43.0 // indirect
 	go.opentelemetry.io/proto/otlp v1.10.0 // indirect
 	go.uber.org/atomic v1.11.0 // indirect
 	go.yaml.in/yaml/v2 v2.4.4 // indirect

pkg/adapter/adapter.go

Lines changed: 19 additions & 7 deletions
@@ -23,6 +23,11 @@ import (
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/gitea"
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/github"
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/gitlab"
+	"github.com/openshift-pipelines/pipelines-as-code/pkg/tracing"
+	"go.opentelemetry.io/otel"
+	"go.opentelemetry.io/otel/propagation"
+	semconv "go.opentelemetry.io/otel/semconv/v1.40.0"
+	"go.opentelemetry.io/otel/trace"
 	"go.uber.org/zap"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"knative.dev/eventing/pkg/adapter/v2"
@@ -191,6 +196,17 @@ func (l listener) handleEvent(ctx context.Context) http.HandlerFunc {
 		}
 		gitProvider.SetPacInfo(&pacInfo)

+		tracedCtx := otel.GetTextMapPropagator().Extract(ctx, propagation.HeaderCarrier(request.Header))
+
+		tracer := otel.Tracer(tracing.TracerName)
+		tracedCtx, span := tracer.Start(tracedCtx, "PipelinesAsCode:ProcessEvent",
+			trace.WithSpanKind(trace.SpanKindServer),
+		)
+
+		span.SetAttributes(
+			semconv.VCSProviderNameKey.String(gitProvider.GetConfig().Name),
+		)
+
 		s := sinker{
 			run: l.run,
 			vcx: gitProvider,
@@ -206,8 +222,10 @@ func (l listener) handleEvent(ctx context.Context) http.HandlerFunc {
 		localRequest := request.Clone(request.Context())

 		go func() {
-			err := s.processEvent(ctx, localRequest)
+			defer span.End()
+			err := s.processEvent(tracedCtx, localRequest)
 			if err != nil {
+				span.RecordError(err)
 				logger.Errorf("an error occurred: %v", err)
 			}
 		}()
@@ -236,12 +254,6 @@ func (l listener) processRes(processEvent bool, provider provider.Interface, log
 func (l listener) detectProvider(req *http.Request, reqBody string) (provider.Interface, *zap.SugaredLogger, error) {
 	log := *l.logger

-	// payload validation
-	var event map[string]any
-	if err := json.Unmarshal([]byte(reqBody), &event); err != nil {
-		return nil, &log, fmt.Errorf("invalid event body format: %w", err)
-	}
-
 	gitHub := github.New()
 	gitHub.Run = l.run
 	isGH, processReq, logger, reason, err := gitHub.Detect(req, reqBody, &log)

pkg/adapter/sinker.go

Lines changed: 21 additions & 0 deletions
@@ -14,6 +14,9 @@ import (
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/pipelineascode"
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider"
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/status"
+	"github.com/openshift-pipelines/pipelines-as-code/pkg/tracing"
+	semconv "go.opentelemetry.io/otel/semconv/v1.40.0"
+	"go.opentelemetry.io/otel/trace"
 	"go.uber.org/zap"
 )

@@ -117,6 +120,10 @@ func (s *sinker) processEvent(ctx context.Context, request *http.Request) error
 		}
 	}

+	// Enrich span with VCS attributes: for incoming events these are
+	// pre-populated; for webhook events ParsePayload filled them in.
+	setVCSSpanAttributes(ctx, s.event)
+
 	p := pipelineascode.NewPacs(s.event, s.vcx, s.run, s.pacInfo, s.kint, s.logger, s.globalRepo)
 	return p.Run(ctx)
 }
@@ -174,3 +181,17 @@ func (s *sinker) createSkipCIStatus(ctx context.Context) error {

 	return nil
 }
+
+func setVCSSpanAttributes(ctx context.Context, event *info.Event) {
+	span := trace.SpanFromContext(ctx)
+	if !span.IsRecording() {
+		return
+	}
+	span.SetAttributes(tracing.PACEventTypeKey.String(event.EventType))
+	if event.URL != "" {
+		span.SetAttributes(semconv.VCSRepositoryURLFullKey.String(event.URL))
+	}
+	if event.SHA != "" {
+		span.SetAttributes(semconv.VCSRefHeadRevisionKey.String(event.SHA))
+	}
+}
