Commit bd9f468

ci-operator authored and zakisk committed
feat: enable tracing for webhooks and PipelineRun execution

Instrument PaC with OpenTelemetry tracing. The controller emits a ProcessEvent span for each webhook, honoring incoming W3C trace context when present. The watcher emits waitDuration and executeDuration timing spans for completed PipelineRuns. Trace context is propagated onto created PipelineRuns via annotation for end-to-end delivery traces.

Tracing configuration uses the existing pipelines-as-code-config-observability ConfigMap (located via CONFIG_OBSERVABILITY_NAME env var). Label-sourced span attributes are configurable via the main pipelines-as-code ConfigMap. See docs/content/docs/operations/tracing.md for the schema.
1 parent 80f9be1 commit bd9f468

21 files changed: +1582 -16 lines changed

config/302-pac-configmap.yaml

Lines changed: 6 additions & 0 deletions
@@ -173,6 +173,12 @@ data:
   # Default: true
   skip-push-event-for-pr-commits: "true"

+  # PipelineRun label names PaC reads to populate tracing span attributes.
+  # Empty disables emission of the corresponding attribute.
+  tracing-label-action: "delivery.tekton.dev/action"
+  tracing-label-application: "delivery.tekton.dev/application"
+  tracing-label-component: "delivery.tekton.dev/component"
+
   # Configure a custom console here, the driver support custom parameters from
   # Repo CR along a few other template variable, see documentation for more
   # details
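
Reviewer sketch (not part of the commit): the three `tracing-label-*` keys above can be read as a lookup from a configurable PipelineRun label name to a fixed span attribute key. The Go snippet below is a minimal illustration of that rule, assuming the attribute names listed in the tracing docs of this commit. `labelAttributes` and `attributeForConfigKey` are hypothetical names, and treating a missing key like an empty one is an assumption; the real ConfigMap ships defaults.

```go
package main

import "fmt"

// attributeForConfigKey maps each ConfigMap key to the fixed span attribute
// it feeds, following the table in the tracing docs added by this commit.
var attributeForConfigKey = map[string]string{
	"tracing-label-action":      "cicd.pipeline.action.name",
	"tracing-label-application": "delivery.tekton.dev.application",
	"tracing-label-component":   "delivery.tekton.dev.component",
}

// labelAttributes is a hypothetical helper, not PaC's implementation.
// config holds the ConfigMap data; labels holds the PipelineRun labels.
func labelAttributes(config, labels map[string]string) map[string]string {
	attrs := map[string]string{}
	for configKey, attrKey := range attributeForConfigKey {
		labelName := config[configKey]
		if labelName == "" {
			// An empty setting disables this attribute. A missing key is
			// treated the same way here, which is an assumption.
			continue
		}
		if value, ok := labels[labelName]; ok && value != "" {
			attrs[attrKey] = value
		}
	}
	return attrs
}

func main() {
	config := map[string]string{
		"tracing-label-action":      "delivery.tekton.dev/action",
		"tracing-label-application": "example.com/app", // custom label name
		"tracing-label-component":   "",                // disabled
	}
	labels := map[string]string{
		"delivery.tekton.dev/action":    "build",
		"example.com/app":               "shop",
		"delivery.tekton.dev/component": "frontend",
	}
	fmt.Println(labelAttributes(config, labels))
}
```

The emitted attribute key stays the same whichever label is read, which is what lets cross-service queries work uniformly.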

config/305-config-observability.yaml

Lines changed: 15 additions & 1 deletion
@@ -50,4 +50,18 @@ data:

   # metrics-export-interval specifies how often metrics are exported.
   # Only applicable for grpc and http/protobuf protocols.
-  # metrics-export-interval: "30s"
+  # metrics-export-interval: "30s"
+
+  # tracing-protocol specifies the trace export protocol.
+  # Supported values: "grpc", "http/protobuf", "none".
+  # Default is "none" (tracing disabled).
+  # tracing-protocol: "none"
+
+  # tracing-endpoint specifies the OTLP collector endpoint.
+  # Required when tracing-protocol is "grpc" or "http/protobuf".
+  # The OTEL_EXPORTER_OTLP_ENDPOINT env var takes precedence if set.
+  # tracing-endpoint: "http://otel-collector.observability.svc.cluster.local:4317"
+
+  # tracing-sampling-rate controls the fraction of traces sampled.
+  # 0.0 = none, 1.0 = all. Default is 0 (none).
+  # tracing-sampling-rate: "1.0"
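
Reviewer sketch (not part of the commit): the rules in the comments above (default protocol `none`, endpoint required otherwise, `OTEL_EXPORTER_OTLP_ENDPOINT` taking precedence, sampling rate defaulting to 0) can be expressed as a small stdlib-only parser. `tracingConfig` and `parseTracingConfig` are hypothetical names, and rejecting unknown protocols or out-of-range rates is an assumption about reasonable validation, not a statement of what PaC does.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// tracingConfig is a hypothetical holder for the three tracing keys.
type tracingConfig struct {
	Protocol     string
	Endpoint     string
	SamplingRate float64
}

// parseTracingConfig applies the documented defaults and precedence rules.
// getenv is injected so the environment lookup can be replaced in tests.
func parseTracingConfig(data map[string]string, getenv func(string) string) (tracingConfig, error) {
	cfg := tracingConfig{Protocol: "none"} // documented default: tracing disabled
	if p := data["tracing-protocol"]; p != "" {
		cfg.Protocol = p
	}
	switch cfg.Protocol {
	case "none":
		return cfg, nil
	case "grpc", "http/protobuf":
		// supported, continue below
	default:
		return cfg, fmt.Errorf("unsupported tracing-protocol %q", cfg.Protocol)
	}

	cfg.Endpoint = data["tracing-endpoint"]
	if env := getenv("OTEL_EXPORTER_OTLP_ENDPOINT"); env != "" {
		cfg.Endpoint = env // the env var takes precedence if set
	}
	if cfg.Endpoint == "" {
		return cfg, errors.New("tracing-endpoint is required when tracing is enabled")
	}

	if raw := data["tracing-sampling-rate"]; raw != "" {
		rate, err := strconv.ParseFloat(raw, 64)
		// The negated range check also rejects NaN.
		if err != nil || !(rate >= 0 && rate <= 1) {
			return cfg, fmt.Errorf("invalid tracing-sampling-rate %q", raw)
		}
		cfg.SamplingRate = rate
	}
	return cfg, nil
}

func main() {
	cfg, err := parseTracingConfig(map[string]string{
		"tracing-protocol":      "grpc",
		"tracing-endpoint":      "http://otel-collector.observability.svc.cluster.local:4317",
		"tracing-sampling-rate": "1.0",
	}, os.Getenv)
	fmt.Println(cfg, err)
}
```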

docs/content/docs/operations/tracing.md

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
+---
+title: Distributed Tracing
+weight: 5
+---
+
+This page describes how to enable OpenTelemetry distributed tracing for Pipelines-as-Code. When enabled, PaC emits trace spans for webhook event processing and PipelineRun lifecycle timing.
+
+## Enabling tracing
+
+The ConfigMap `pipelines-as-code-config-observability` controls tracing configuration. It must exist in the same namespace as the Pipelines-as-Code controller and watcher deployments. See [config/305-config-observability.yaml](https://github.com/tektoncd/pipelines-as-code/blob/main/config/305-config-observability.yaml) for the full example.
+
+It contains the following tracing fields:
+
+* `tracing-protocol`: Export protocol. Supported values: `grpc`, `http/protobuf`, `none`. Default is `none` (tracing disabled).
+* `tracing-endpoint`: OTLP collector endpoint. Required when protocol is not `none`. The `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable takes precedence if set.
+* `tracing-sampling-rate`: Fraction of traces to sample. `0.0` = none, `1.0` = all. Default is `0`.
+
+### Example
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: pipelines-as-code-config-observability
+  namespace: pipelines-as-code
+data:
+  tracing-protocol: grpc
+  tracing-endpoint: "http://otel-collector.observability.svc.cluster.local:4317"
+  tracing-sampling-rate: "1.0"
+```
+
+Changes to `tracing-protocol`, `tracing-endpoint`, and `tracing-sampling-rate` require restarting the controller and watcher pods. The trace exporter is created once at startup from the ConfigMap values at that time. Set `tracing-protocol` to `none` or remove the tracing keys to disable tracing.
+
+The controller and watcher locate this ConfigMap by name via the `CONFIG_OBSERVABILITY_NAME` environment variable set in their deployment manifests. Operator-based installations may manage this differently; consult the operator documentation for details.
+
+## Emitted spans
+
+The controller emits a `PipelinesAsCode:ProcessEvent` span for each webhook event. The watcher emits `waitDuration` and `executeDuration` spans for completed PipelineRuns.
+
+### Webhook event span (`PipelinesAsCode:ProcessEvent`)
+
+[OTel VCS semantic conventions](https://opentelemetry.io/docs/specs/semconv/attributes-registry/vcs/):
+
+| Attribute | Source |
+| --- | --- |
+| `vcs.provider.name` | Git provider name |
+| `vcs.repository.url.full` | Repository URL |
+| `vcs.ref.head.revision` | Head commit SHA |
+
+PaC-specific:
+
+| Attribute | Source |
+| --- | --- |
+| `pipelinesascode.tekton.dev.event_type` | Webhook event type |
+
+### PipelineRun timing spans (`waitDuration`, `executeDuration`)
+
+Tekton-compatible bare keys (match Tekton's own reconciler spans for correlation):
+
+| Attribute | Source |
+| --- | --- |
+| `namespace` | PipelineRun namespace |
+| `pipelinerun` | PipelineRun name |
+
+Cross-service delivery attributes (`delivery.tekton.dev.*`):
+
+| Attribute | Source |
+| --- | --- |
+| `delivery.tekton.dev.pipelinerun_uid` | PipelineRun UID |
+| `delivery.tekton.dev.result_message` | First failing TaskRun message; omitted on success; truncated to 1024 bytes |
+
+Additional `delivery.tekton.dev.*` attributes are sourced from [configurable PipelineRun labels](#configuring-label-sourced-attributes).
+
+[OTel CI/CD semantic conventions](https://opentelemetry.io/docs/specs/semconv/attributes-registry/cicd/) (`executeDuration` only):
+
+| Attribute | Source |
+| --- | --- |
+| `cicd.pipeline.result` | Outcome enum (see below) |
+
+### `cicd.pipeline.result` enum
+
+| Condition | Value |
+| --- | --- |
+| `Status=True` | `success` |
+| `Status=False`, reason `Failed` | `failure` |
+| `Status=False`, reason `PipelineRunTimeout` | `timeout` |
+| `Status=False`, reason `Cancelled` or `CancelledRunningFinally` | `cancellation` |
+| `Status=False`, any other reason | `error` |
+
+## Configuring label-sourced attributes
+
+Some span attributes are read from PipelineRun labels. The label names are configurable via the main `pipelines-as-code` ConfigMap so deployments can point at their existing labels without rewriting producers:
+
+| ConfigMap key | PipelineRun label read (default) | Span attribute emitted |
+| --- | --- | --- |
+| `tracing-label-action` | `delivery.tekton.dev/action` | `cicd.pipeline.action.name` |
+| `tracing-label-application` | `delivery.tekton.dev/application` | `delivery.tekton.dev.application` |
+| `tracing-label-component` | `delivery.tekton.dev/component` | `delivery.tekton.dev.component` |
+
+Setting a ConfigMap key to the empty string disables emission of that label-sourced attribute. Only label-sourced attributes are affected; all other span attributes are always emitted. The emitted span attribute keys are fixed regardless of which labels are read, so cross-service queries work uniformly.
+
+Unlike the observability ConfigMap above (which requires a pod restart), changes to these label mappings are picked up automatically without restarting pods.
+
+## Trace context propagation
+
+When Pipelines-as-Code creates a PipelineRun, it sets the `tekton.dev/pipelinerunSpanContext` annotation with a JSON-encoded OTel TextMapCarrier containing the W3C `traceparent`. PaC tracing works independently — you get PaC spans regardless of whether Tekton Pipelines has tracing enabled.
+
+If Tekton Pipelines is also configured with tracing pointing at the same collector, its reconciler spans appear as children of the PaC span, providing a single end-to-end trace from webhook receipt through task execution. See the [Tekton Pipelines tracing documentation](https://github.com/tektoncd/pipeline/blob/main/docs/developers/tracing.md) for Tekton's independent tracing setup.
+
+## Deploying a trace collector
+
+Pipelines-as-Code exports traces using the standard OpenTelemetry Protocol (OTLP). You need a running OTLP-compatible collector for the `tracing-endpoint` to point to. Common options include:
+
+* [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) -- the vendor-neutral reference collector
+* [Jaeger](https://www.jaegertracing.io/docs/latest/getting-started/) -- supports OTLP ingestion natively since v1.35
+
+Deploying and operating a collector is outside the scope of Pipelines-as-Code. Refer to your organization's observability infrastructure or the links above for setup instructions.
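
Reviewer sketch (not part of the commit): the `cicd.pipeline.result` table in the docs above reads naturally as a switch on the PipelineRun condition. The snippet below is a minimal stdlib-only rendering of that table. `pipelineResult` is a hypothetical name, the condition is passed as plain strings rather than Tekton types, and returning an empty string for a run that is not yet finished is an assumption, since the table only covers `True` and `False`.

```go
package main

import "fmt"

// pipelineResult maps a PipelineRun condition status and reason to the
// cicd.pipeline.result value, row by row as in the docs table.
// It is a hypothetical helper, not the watcher's actual code.
func pipelineResult(status, reason string) string {
	if status == "True" {
		return "success"
	}
	if status != "False" {
		// Assumption: a run that is still in progress has no result yet.
		return ""
	}
	switch reason {
	case "Failed":
		return "failure"
	case "PipelineRunTimeout":
		return "timeout"
	case "Cancelled", "CancelledRunningFinally":
		return "cancellation"
	default:
		return "error"
	}
}

func main() {
	for _, c := range [][2]string{
		{"True", "Succeeded"},
		{"False", "Failed"},
		{"False", "PipelineRunTimeout"},
		{"False", "CancelledRunningFinally"},
		{"False", "CouldntGetPipeline"},
	} {
		fmt.Printf("%s/%s -> %s\n", c[0], c[1], pipelineResult(c[0], c[1]))
	}
}
```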

go.mod

Lines changed: 2 additions & 2 deletions
@@ -30,7 +30,9 @@ require (
 	gitlab.com/gitlab-org/api/client-go v1.46.0
 	go.opentelemetry.io/otel v1.43.0
 	go.opentelemetry.io/otel/metric v1.43.0
+	go.opentelemetry.io/otel/sdk v1.43.0
 	go.opentelemetry.io/otel/sdk/metric v1.43.0
+	go.opentelemetry.io/otel/trace v1.43.0
 	go.uber.org/zap v1.27.1
 	golang.org/x/exp v0.0.0-20260312153236-7ab1446f8b90
 	golang.org/x/oauth2 v0.36.0
@@ -91,8 +93,6 @@ require (
 	go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.43.0 // indirect
 	go.opentelemetry.io/otel/exporters/prometheus v0.65.0 // indirect
 	go.opentelemetry.io/otel/exporters/stdout/stdouttrace v1.43.0 // indirect
-	go.opentelemetry.io/otel/sdk v1.43.0 // indirect
-	go.opentelemetry.io/otel/trace v1.43.0 // indirect
 	go.opentelemetry.io/proto/otlp v1.10.0 // indirect
 	go.uber.org/atomic v1.11.0 // indirect
 	go.yaml.in/yaml/v2 v2.4.4 // indirect

pkg/adapter/adapter.go

Lines changed: 14 additions & 1 deletion
@@ -23,6 +23,10 @@ import (
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/gitea"
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/github"
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/gitlab"
+	"github.com/openshift-pipelines/pipelines-as-code/pkg/tracing"
+	"go.opentelemetry.io/otel"
+	"go.opentelemetry.io/otel/propagation"
+	"go.opentelemetry.io/otel/trace"
 	"go.uber.org/zap"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"knative.dev/eventing/pkg/adapter/v2"
@@ -191,6 +195,13 @@ func (l listener) handleEvent(ctx context.Context) http.HandlerFunc {
 		}
 		gitProvider.SetPacInfo(&pacInfo)

+		tracedCtx := otel.GetTextMapPropagator().Extract(ctx, propagation.HeaderCarrier(request.Header))
+
+		tracer := otel.Tracer(tracing.TracerName)
+		tracedCtx, span := tracer.Start(tracedCtx, "PipelinesAsCode:ProcessEvent",
+			trace.WithSpanKind(trace.SpanKindServer),
+		)
+
 		s := sinker{
 			run: l.run,
 			vcx: gitProvider,
@@ -206,8 +217,10 @@ func (l listener) handleEvent(ctx context.Context) http.HandlerFunc {
 		localRequest := request.Clone(request.Context())

 		go func() {
-			err := s.processEvent(ctx, localRequest)
+			defer span.End()
+			err := s.processEvent(tracedCtx, localRequest)
 			if err != nil {
+				span.RecordError(err)
 				logger.Errorf("an error occurred: %v", err)
 			}
 		}()
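
Reviewer sketch (not part of the commit): `Extract` in the hunk above reads the W3C `traceparent` request header when the webhook sender provides one. To make that input concrete, the stdlib-only snippet below parses a version `00` header of the form `version-traceid-parentid-flags`. `parseTraceParent` and `traceParent` are hypothetical names; the OpenTelemetry propagator does this work in the real code, and this sketch ignores `tracestate` and future header versions.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// traceParent holds the fields of a version 00 W3C traceparent header.
type traceParent struct {
	TraceID  string
	ParentID string
	Sampled  bool
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n || strings.ToLower(s) != s {
		return false
	}
	_, err := hex.DecodeString(s)
	return err == nil
}

// parseTraceParent is a hypothetical helper that validates a version 00
// header. It returns false for anything it cannot accept, in which case a
// caller would start a new root span.
func parseTraceParent(header string) (traceParent, bool) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return traceParent{}, false
	}
	traceID, parentID, flags := parts[1], parts[2], parts[3]
	if !isLowerHex(traceID, 32) || !isLowerHex(parentID, 16) || !isLowerHex(flags, 2) {
		return traceParent{}, false
	}
	// All-zero trace and parent IDs are invalid per the specification.
	if strings.Trim(traceID, "0") == "" || strings.Trim(parentID, "0") == "" {
		return traceParent{}, false
	}
	flagByte, _ := hex.DecodeString(flags) // already validated above
	return traceParent{
		TraceID:  traceID,
		ParentID: parentID,
		Sampled:  flagByte[0]&0x01 == 0x01, // lowest bit is the sampled flag
	}, true
}

func main() {
	tp, ok := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println(tp, ok)
}
```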

pkg/adapter/sinker.go

Lines changed: 21 additions & 0 deletions
@@ -14,6 +14,9 @@ import (
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/pipelineascode"
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider"
 	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/status"
+	"github.com/openshift-pipelines/pipelines-as-code/pkg/tracing"
+	semconv "go.opentelemetry.io/otel/semconv/v1.40.0"
+	"go.opentelemetry.io/otel/trace"
 	"go.uber.org/zap"
 )

@@ -117,6 +120,10 @@ func (s *sinker) processEvent(ctx context.Context, request *http.Request) error
 		}
 	}

+	// Enrich span with VCS attributes — for incoming events these are
+	// pre-populated; for webhook events ParsePayload filled them in.
+	setVCSSpanAttributes(ctx, s.event)
+
 	p := pipelineascode.NewPacs(s.event, s.vcx, s.run, s.pacInfo, s.kint, s.logger, s.globalRepo)
 	return p.Run(ctx)
 }
@@ -174,3 +181,17 @@ func (s *sinker) createSkipCIStatus(ctx context.Context) error {

 	return nil
 }
+
+func setVCSSpanAttributes(ctx context.Context, event *info.Event) {
+	span := trace.SpanFromContext(ctx)
+	if !span.IsRecording() {
+		return
+	}
+	span.SetAttributes(tracing.PACEventTypeKey.String(event.EventType))
+	if event.URL != "" {
+		span.SetAttributes(semconv.VCSRepositoryURLFullKey.String(event.URL))
+	}
+	if event.SHA != "" {
+		span.SetAttributes(semconv.VCSRefHeadRevisionKey.String(event.SHA))
+	}
+}

pkg/adapter/sinker_tracing_test.go

Lines changed: 143 additions & 0 deletions
@@ -0,0 +1,143 @@
+package adapter
+
+import (
+	"context"
+	"net/http"
+	"testing"
+
+	"github.com/openshift-pipelines/pipelines-as-code/pkg/params/info"
+	testtracing "github.com/openshift-pipelines/pipelines-as-code/pkg/test/tracing"
+	"github.com/openshift-pipelines/pipelines-as-code/pkg/tracing"
+	"go.opentelemetry.io/otel"
+	"go.opentelemetry.io/otel/propagation"
+	sdktrace "go.opentelemetry.io/otel/sdk/trace"
+	semconv "go.opentelemetry.io/otel/semconv/v1.40.0"
+	"go.opentelemetry.io/otel/trace"
+	"gotest.tools/v3/assert"
+)
+
+func TestSetVCSSpanAttributes(t *testing.T) {
+	t.Parallel()
+
+	eventTypeKey := string(tracing.PACEventTypeKey)
+	repoURLKey := string(semconv.VCSRepositoryURLFullKey)
+	headRevKey := string(semconv.VCSRefHeadRevisionKey)
+
+	tests := []struct {
+		name  string
+		event *info.Event
+		want  map[string]string
+	}{
+		{
+			name: "full event",
+			event: &info.Event{
+				EventType: "pull_request",
+				URL:       "https://github.com/test/repo",
+				SHA:       "abc123",
+			},
+			want: map[string]string{
+				eventTypeKey: "pull_request",
+				repoURLKey:   "https://github.com/test/repo",
+				headRevKey:   "abc123",
+			},
+		},
+		{
+			name: "event type only",
+			event: &info.Event{
+				EventType: "push",
+			},
+			want: map[string]string{
+				eventTypeKey: "push",
+			},
+		},
+		{
+			name: "url without sha",
+			event: &info.Event{
+				EventType: "issue_comment",
+				URL:       "https://github.com/test/repo",
+			},
+			want: map[string]string{
+				eventTypeKey: "issue_comment",
+				repoURLKey:   "https://github.com/test/repo",
+			},
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
+
+			exporter := &testtracing.RecordingExporter{}
+			tp := sdktrace.NewTracerProvider(
+				sdktrace.WithSampler(sdktrace.AlwaysSample()),
+				sdktrace.WithSyncer(exporter),
+			)
+			defer func() { _ = tp.Shutdown(context.Background()) }()
+
+			ctx, span := tp.Tracer("test").Start(context.Background(), "test-span")
+			setVCSSpanAttributes(ctx, tt.event)
+			span.End()
+
+			spans := exporter.GetSpans()
+			assert.Equal(t, len(spans), 1)
+			got := map[string]string{}
+			for _, a := range spans[0].Attributes() {
+				got[string(a.Key)] = a.Value.AsString()
+			}
+			assert.DeepEqual(t, got, tt.want)
+		})
+	}
+}
+
+func TestProcessEventSpanHonorsIncomingTraceContext(t *testing.T) {
+	exporter := testtracing.SetupTracer(t)
+
+	// Simulate an external system sending a webhook with a traceparent header.
+	// Create a parent span to generate a valid trace context.
+	parentCtx, parentSpan := otel.Tracer("external-system").Start(context.Background(), "external-root")
+	expectedTraceID := parentSpan.SpanContext().TraceID()
+	parentSpan.End()
+
+	// Inject the parent context into HTTP headers (what the webhook sender would do).
+	req, _ := http.NewRequestWithContext(context.Background(), http.MethodPost, "http://localhost", nil)
+	otel.GetTextMapPropagator().Inject(parentCtx, propagation.HeaderCarrier(req.Header))
+
+	// This is the exact extract → start sequence from handleEvent.
+	tracedCtx := otel.GetTextMapPropagator().Extract(context.Background(), propagation.HeaderCarrier(req.Header))
+	_, span := otel.Tracer(tracing.TracerName).Start(tracedCtx, "PipelinesAsCode:ProcessEvent",
+		trace.WithSpanKind(trace.SpanKindServer),
+	)
+	span.End()
+
+	spans := exporter.GetSpans()
+	var processSpan sdktrace.ReadOnlySpan
+	for _, s := range spans {
+		if s.Name() == "PipelinesAsCode:ProcessEvent" {
+			processSpan = s
+		}
+	}
+	assert.Assert(t, processSpan != nil, "ProcessEvent span not found")
+	assert.Equal(t, processSpan.Parent().TraceID(), expectedTraceID,
+		"ProcessEvent span should be parented under the incoming trace context, not a new root")
+	assert.Assert(t, processSpan.Parent().IsValid(),
+		"ProcessEvent span should have a valid remote parent")
+}
+
+func TestProcessEventSpanCreatesRootWithoutIncomingContext(t *testing.T) {
+	exporter := testtracing.SetupTracer(t)
+
+	// Webhook with no traceparent header.
+	req, _ := http.NewRequestWithContext(context.Background(), http.MethodPost, "http://localhost", nil)
+
+	tracedCtx := otel.GetTextMapPropagator().Extract(context.Background(), propagation.HeaderCarrier(req.Header))
+	_, span := otel.Tracer(tracing.TracerName).Start(tracedCtx, "PipelinesAsCode:ProcessEvent",
+		trace.WithSpanKind(trace.SpanKindServer),
+	)
+	span.End()
+
+	spans := exporter.GetSpans()
+	processSpan := testtracing.FindSpan(spans, "PipelinesAsCode:ProcessEvent")
+	assert.Assert(t, processSpan != nil)
+	assert.Assert(t, !processSpan.Parent().IsValid(),
+		"ProcessEvent span should be a root when no incoming trace context is present")
+}

pkg/apis/pipelinesascode/keys/keys.go

Lines changed: 2 additions & 0 deletions
@@ -68,6 +68,8 @@ const (
 	GithubApplicationID = "github-application-id"
 	GithubPrivateKey = "github-private-key"
 	ResultsRecordSummary = "results.tekton.dev/recordSummaryAnnotations"
+
+	SpanContextAnnotation = "tekton.dev/pipelinerunSpanContext"
 )

 var ParamsRe = regexp.MustCompile(`{{([^}]{2,})}}`)
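
Reviewer sketch (not part of the commit): the tracing docs in this commit describe the `tekton.dev/pipelinerunSpanContext` annotation value as a JSON-encoded text map carrier containing the W3C `traceparent`. The stdlib-only snippet below shows what such a value could look like. `withSpanContext` is a hypothetical helper; the real code presumably injects through the OpenTelemetry propagator rather than building the map by hand, and the carrier may contain further keys such as `tracestate`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// spanContextAnnotation mirrors the SpanContextAnnotation constant added above.
const spanContextAnnotation = "tekton.dev/pipelinerunSpanContext"

// withSpanContext returns a copy of annotations with the span context
// annotation set to a JSON-encoded carrier holding the given traceparent.
// It is a hypothetical helper for illustration only.
func withSpanContext(annotations map[string]string, traceparent string) (map[string]string, error) {
	carrier := map[string]string{"traceparent": traceparent}
	encoded, err := json.Marshal(carrier)
	if err != nil {
		return nil, err
	}
	out := make(map[string]string, len(annotations)+1)
	for k, v := range annotations {
		out[k] = v // copy so the caller's map is left untouched
	}
	out[spanContextAnnotation] = string(encoded)
	return out, nil
}

func main() {
	annotations, err := withSpanContext(
		map[string]string{"example.com/owner": "team-a"}, // hypothetical existing annotation
		"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(annotations[spanContextAnnotation])
}
```

A consumer reading the annotation would decode the JSON back into a map and hand it to its own propagator to continue the trace.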
