Merged
6 changes: 6 additions & 0 deletions config/302-pac-configmap.yaml
@@ -173,6 +173,12 @@ data:
# Default: true
skip-push-event-for-pr-commits: "true"

# PipelineRun label names PaC reads to populate tracing span attributes.
# Empty disables emission of the corresponding attribute.
tracing-label-action: "delivery.tekton.dev/action"
tracing-label-application: "delivery.tekton.dev/application"
tracing-label-component: "delivery.tekton.dev/component"

# Configure a custom console here. The driver supports custom parameters from
# the Repo CR along with a few other template variables; see the documentation
# for more details.
16 changes: 15 additions & 1 deletion config/305-config-observability.yaml
@@ -50,4 +50,18 @@ data:

# metrics-export-interval specifies how often metrics are exported.
# Only applicable for grpc and http/protobuf protocols.
# metrics-export-interval: "30s"

# tracing-protocol specifies the trace export protocol.
# Supported values: "grpc", "http/protobuf", "none".
# Default is "none" (tracing disabled).
# tracing-protocol: "none"

# tracing-endpoint specifies the OTLP collector endpoint.
# Required when tracing-protocol is "grpc" or "http/protobuf".
# The OTEL_EXPORTER_OTLP_ENDPOINT env var takes precedence if set.
# tracing-endpoint: "http://otel-collector.observability.svc.cluster.local:4317"

# tracing-sampling-rate controls the fraction of traces sampled.
# 0.0 = none, 1.0 = all. Default is 0 (none).
# tracing-sampling-rate: "1.0"
117 changes: 117 additions & 0 deletions docs/content/docs/operations/tracing.md
@@ -0,0 +1,117 @@
---
title: Distributed Tracing
weight: 5
---

This page describes how to enable OpenTelemetry distributed tracing for Pipelines-as-Code. When enabled, PaC emits trace spans for webhook event processing and PipelineRun lifecycle timing.

## Enabling tracing

The ConfigMap `pipelines-as-code-config-observability` controls tracing configuration. It must exist in the same namespace as the Pipelines-as-Code controller and watcher deployments. See [config/305-config-observability.yaml](https://github.com/tektoncd/pipelines-as-code/blob/main/config/305-config-observability.yaml) for the full example.

It contains the following tracing fields:

* `tracing-protocol`: Export protocol. Supported values: `grpc`, `http/protobuf`, `none`. Default is `none` (tracing disabled).
* `tracing-endpoint`: OTLP collector endpoint. Required when protocol is not `none`. The `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable takes precedence if set.
* `tracing-sampling-rate`: Fraction of traces to sample. `0.0` = none, `1.0` = all. Default is `0`.

### Example

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pipelines-as-code-config-observability
  namespace: pipelines-as-code
data:
  tracing-protocol: grpc
  tracing-endpoint: "http://otel-collector.observability.svc.cluster.local:4317"
  tracing-sampling-rate: "1.0"
```

Changes to `tracing-protocol`, `tracing-endpoint`, and `tracing-sampling-rate` require restarting the controller and watcher pods: the trace exporter is created once at startup from the ConfigMap values read at that point. Set `tracing-protocol` to `none` or remove the tracing keys to disable tracing.

The controller and watcher locate this ConfigMap by name via the `CONFIG_OBSERVABILITY_NAME` environment variable set in their deployment manifests. Operator-based installations may manage this differently; consult the operator documentation for details.
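As a sketch of that wiring, the relevant part of a controller Deployment might look like this (the container name is illustrative; only the env var name and ConfigMap name come from the manifests above):

```yaml
containers:
  - name: pac-controller
    env:
      - name: CONFIG_OBSERVABILITY_NAME
        value: pipelines-as-code-config-observability
```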

## Emitted spans

The controller emits a `PipelinesAsCode:ProcessEvent` span for each webhook event. The watcher emits `waitDuration` and `executeDuration` spans for completed PipelineRuns.

### Webhook event span (`PipelinesAsCode:ProcessEvent`)

[OTel VCS semantic conventions](https://opentelemetry.io/docs/specs/semconv/attributes-registry/vcs/):

| Attribute | Source |
| --- | --- |
| `vcs.provider.name` | Git provider name |
| `vcs.repository.url.full` | Repository URL |
| `vcs.ref.head.revision` | Head commit SHA |

> **Review comment (resolved):** https://opentelemetry.io/docs/specs/semconv/registry/attributes/vcs/#vcs-change-id
> @ci-operator what about also having `vcs.change.id`, which I guess is the pull request number? I know you've added the revision, but it would be helpful for the tracer to have the pull request number.

PaC-specific:

| Attribute | Source |
| --- | --- |
| `pipelinesascode.tekton.dev.event_type` | Webhook event type |

### PipelineRun timing spans (`waitDuration`, `executeDuration`)

Tekton-compatible bare keys (match Tekton's own reconciler spans for correlation):

| Attribute | Source |
| --- | --- |
| `namespace` | PipelineRun namespace |
| `pipelinerun` | PipelineRun name |

Cross-service delivery attributes (`delivery.tekton.dev.*`):

| Attribute | Source |
| --- | --- |
| `delivery.tekton.dev.pipelinerun_uid` | PipelineRun UID |
| `delivery.tekton.dev.result_message` | First failing TaskRun message; omitted on success; truncated to 1024 bytes |

Additional `delivery.tekton.dev.*` attributes are sourced from [configurable PipelineRun labels](#configuring-label-sourced-attributes).

[OTel CI/CD semantic conventions](https://opentelemetry.io/docs/specs/semconv/attributes-registry/cicd/) (`executeDuration` only):

| Attribute | Source |
| --- | --- |
| `cicd.pipeline.result` | Outcome enum (see below) |

### `cicd.pipeline.result` enum

| Condition | Value |
| --- | --- |
| `Status=True` | `success` |
| `Status=False`, reason `Failed` | `failure` |
| `Status=False`, reason `PipelineRunTimeout` | `timeout` |
| `Status=False`, reason `Cancelled` or `CancelledRunningFinally` | `cancellation` |
| `Status=False`, any other reason | `error` |
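The table above can be sketched as a small lookup. This is a hypothetical helper for illustration, not PaC's actual function; it takes the `Succeeded` condition's status and reason as plain strings:

```go
package main

import "fmt"

// cicdPipelineResult maps a PipelineRun's Succeeded condition
// (status plus reason) to the cicd.pipeline.result enum value.
// Hypothetical helper; the real mapping lives in the watcher.
func cicdPipelineResult(status, reason string) string {
	if status == "True" {
		return "success"
	}
	switch reason {
	case "Failed":
		return "failure"
	case "PipelineRunTimeout":
		return "timeout"
	case "Cancelled", "CancelledRunningFinally":
		return "cancellation"
	default:
		return "error"
	}
}

func main() {
	fmt.Println(cicdPipelineResult("True", "Succeeded"))  // success
	fmt.Println(cicdPipelineResult("False", "Cancelled")) // cancellation
	fmt.Println(cicdPipelineResult("False", "OOMKilled")) // error
}
```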

## Configuring label-sourced attributes

Some span attributes are read from PipelineRun labels. The label names are configurable via the main `pipelines-as-code` ConfigMap so deployments can point at their existing labels without rewriting producers:

| ConfigMap key | PipelineRun label read (default) | Span attribute emitted |
| --- | --- | --- |
| `tracing-label-action` | `delivery.tekton.dev/action` | `cicd.pipeline.action.name` |
| `tracing-label-application` | `delivery.tekton.dev/application` | `delivery.tekton.dev.application` |
| `tracing-label-component` | `delivery.tekton.dev/component` | `delivery.tekton.dev.component` |

Setting a ConfigMap key to the empty string disables emission of that label-sourced attribute. Only label-sourced attributes are affected; all other span attributes are always emitted. The emitted span attribute keys are fixed regardless of which labels are read, so cross-service queries work uniformly.
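For example, to source the application attribute from an existing in-house label and disable the component attribute entirely (the label name `example.com/app-name` is illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pipelines-as-code
  namespace: pipelines-as-code
data:
  # Read delivery.tekton.dev.application from an existing label:
  tracing-label-application: "example.com/app-name"
  # Disable emission of delivery.tekton.dev.component:
  tracing-label-component: ""
```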

Unlike the observability ConfigMap above (which requires a pod restart), changes to these label mappings are picked up automatically without restarting pods.

## Trace context propagation

When Pipelines-as-Code creates a PipelineRun, it sets the `tekton.dev/pipelinerunSpanContext` annotation with a JSON-encoded OTel TextMapCarrier containing the W3C `traceparent`. PaC tracing works independently — you get PaC spans regardless of whether Tekton Pipelines has tracing enabled.

If Tekton Pipelines is also configured with tracing pointing at the same collector, its reconciler spans appear as children of the PaC span, providing a single end-to-end trace from webhook receipt through task execution. See the [Tekton Pipelines tracing documentation](https://github.com/tektoncd/pipeline/blob/main/docs/developers/tracing.md) for Tekton's independent tracing setup.

## Deploying a trace collector

Pipelines-as-Code exports traces using the standard OpenTelemetry Protocol (OTLP). You need a running OTLP-compatible collector for the `tracing-endpoint` to point to. Common options include:

* [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) -- the vendor-neutral reference collector
* [Jaeger](https://www.jaegertracing.io/docs/latest/getting-started/) -- supports OTLP ingestion natively since v1.35

Deploying and operating a collector is outside the scope of Pipelines-as-Code. Refer to your organization's observability infrastructure or the links above for setup instructions.
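As a starting point, a minimal OpenTelemetry Collector configuration that accepts OTLP over gRPC on port 4317 and logs received spans might look like the sketch below; swap the `debug` exporter for your actual tracing backend:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  batch: {}
exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```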
4 changes: 2 additions & 2 deletions go.mod
@@ -30,7 +30,9 @@ require (
	gitlab.com/gitlab-org/api/client-go v1.46.0
	go.opentelemetry.io/otel v1.43.0
	go.opentelemetry.io/otel/metric v1.43.0
	go.opentelemetry.io/otel/sdk v1.43.0
	go.opentelemetry.io/otel/sdk/metric v1.43.0
	go.opentelemetry.io/otel/trace v1.43.0
	go.uber.org/zap v1.27.1
	golang.org/x/exp v0.0.0-20260312153236-7ab1446f8b90
	golang.org/x/oauth2 v0.36.0
@@ -91,8 +93,6 @@
	go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.43.0 // indirect
	go.opentelemetry.io/otel/exporters/prometheus v0.65.0 // indirect
	go.opentelemetry.io/otel/exporters/stdout/stdouttrace v1.43.0 // indirect
	go.opentelemetry.io/otel/sdk v1.43.0 // indirect
	go.opentelemetry.io/otel/trace v1.43.0 // indirect
	go.opentelemetry.io/proto/otlp v1.10.0 // indirect
	go.uber.org/atomic v1.11.0 // indirect
	go.yaml.in/yaml/v2 v2.4.4 // indirect
15 changes: 14 additions & 1 deletion pkg/adapter/adapter.go
@@ -23,6 +23,10 @@ import (
	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/gitea"
	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/github"
	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/gitlab"
	"github.com/openshift-pipelines/pipelines-as-code/pkg/tracing"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/trace"
	"go.uber.org/zap"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"knative.dev/eventing/pkg/adapter/v2"
@@ -191,6 +195,13 @@ func (l listener) handleEvent(ctx context.Context) http.HandlerFunc {
	}
	gitProvider.SetPacInfo(&pacInfo)

	tracedCtx := otel.GetTextMapPropagator().Extract(ctx, propagation.HeaderCarrier(request.Header))

	tracer := otel.Tracer(tracing.TracerName)
	tracedCtx, span := tracer.Start(tracedCtx, "PipelinesAsCode:ProcessEvent",
		trace.WithSpanKind(trace.SpanKindServer),
	)

	s := sinker{
		run: l.run,
		vcx: gitProvider,
@@ -206,8 +217,10 @@
	localRequest := request.Clone(request.Context())

	go func() {
		err := s.processEvent(ctx, localRequest)
		defer span.End()
		err := s.processEvent(tracedCtx, localRequest)
		if err != nil {
			span.RecordError(err)
			logger.Errorf("an error occurred: %v", err)
		}
	}()
21 changes: 21 additions & 0 deletions pkg/adapter/sinker.go
@@ -14,6 +14,9 @@ import (
	"github.com/openshift-pipelines/pipelines-as-code/pkg/pipelineascode"
	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider"
	"github.com/openshift-pipelines/pipelines-as-code/pkg/provider/status"
	"github.com/openshift-pipelines/pipelines-as-code/pkg/tracing"
	semconv "go.opentelemetry.io/otel/semconv/v1.40.0"
	"go.opentelemetry.io/otel/trace"
	"go.uber.org/zap"
)

@@ -117,6 +120,10 @@ func (s *sinker) processEvent(ctx context.Context, request *http.Request) error
		}
	}

	// Enrich span with VCS attributes — for incoming events these are
	// pre-populated; for webhook events ParsePayload filled them in.
	setVCSSpanAttributes(ctx, s.event)

	p := pipelineascode.NewPacs(s.event, s.vcx, s.run, s.pacInfo, s.kint, s.logger, s.globalRepo)
	return p.Run(ctx)
}
@@ -174,3 +181,17 @@ func (s *sinker) createSkipCIStatus(ctx context.Context) error {

	return nil
}

func setVCSSpanAttributes(ctx context.Context, event *info.Event) {
	span := trace.SpanFromContext(ctx)
	if !span.IsRecording() {
		return
	}
	span.SetAttributes(tracing.PACEventTypeKey.String(event.EventType))
	if event.URL != "" {
		span.SetAttributes(semconv.VCSRepositoryURLFullKey.String(event.URL))
	}
	if event.SHA != "" {
		span.SetAttributes(semconv.VCSRefHeadRevisionKey.String(event.SHA))
	}
}
143 changes: 143 additions & 0 deletions pkg/adapter/sinker_tracing_test.go
@@ -0,0 +1,143 @@
package adapter

import (
	"context"
	"net/http"
	"testing"

	"github.com/openshift-pipelines/pipelines-as-code/pkg/params/info"
	testtracing "github.com/openshift-pipelines/pipelines-as-code/pkg/test/tracing"
	"github.com/openshift-pipelines/pipelines-as-code/pkg/tracing"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.40.0"
	"go.opentelemetry.io/otel/trace"
	"gotest.tools/v3/assert"
)

func TestSetVCSSpanAttributes(t *testing.T) {
	t.Parallel()

	eventTypeKey := string(tracing.PACEventTypeKey)
	repoURLKey := string(semconv.VCSRepositoryURLFullKey)
	headRevKey := string(semconv.VCSRefHeadRevisionKey)

	tests := []struct {
		name  string
		event *info.Event
		want  map[string]string
	}{
		{
			name: "full event",
			event: &info.Event{
				EventType: "pull_request",
				URL:       "https://github.com/test/repo",
				SHA:       "abc123",
			},
			want: map[string]string{
				eventTypeKey: "pull_request",
				repoURLKey:   "https://github.com/test/repo",
				headRevKey:   "abc123",
			},
		},
		{
			name: "event type only",
			event: &info.Event{
				EventType: "push",
			},
			want: map[string]string{
				eventTypeKey: "push",
			},
		},
		{
			name: "url without sha",
			event: &info.Event{
				EventType: "issue_comment",
				URL:       "https://github.com/test/repo",
			},
			want: map[string]string{
				eventTypeKey: "issue_comment",
				repoURLKey:   "https://github.com/test/repo",
			},
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			t.Parallel()

			exporter := &testtracing.RecordingExporter{}
			tp := sdktrace.NewTracerProvider(
				sdktrace.WithSampler(sdktrace.AlwaysSample()),
				sdktrace.WithSyncer(exporter),
			)
			defer func() { _ = tp.Shutdown(context.Background()) }()

			ctx, span := tp.Tracer("test").Start(context.Background(), "test-span")
			setVCSSpanAttributes(ctx, tt.event)
			span.End()

			spans := exporter.GetSpans()
			assert.Equal(t, len(spans), 1)
			got := map[string]string{}
			for _, a := range spans[0].Attributes() {
				got[string(a.Key)] = a.Value.AsString()
			}
			assert.DeepEqual(t, got, tt.want)
		})
	}
}

func TestProcessEventSpanHonorsIncomingTraceContext(t *testing.T) {
	exporter := testtracing.SetupTracer(t)

	// Simulate an external system sending a webhook with a traceparent header.
	// Create a parent span to generate a valid trace context.
	parentCtx, parentSpan := otel.Tracer("external-system").Start(context.Background(), "external-root")
	expectedTraceID := parentSpan.SpanContext().TraceID()
	parentSpan.End()

	// Inject the parent context into HTTP headers (what the webhook sender would do).
	req, _ := http.NewRequestWithContext(context.Background(), http.MethodPost, "http://localhost", nil)
	otel.GetTextMapPropagator().Inject(parentCtx, propagation.HeaderCarrier(req.Header))

	// This is the exact extract → start sequence from handleEvent.
	tracedCtx := otel.GetTextMapPropagator().Extract(context.Background(), propagation.HeaderCarrier(req.Header))
	_, span := otel.Tracer(tracing.TracerName).Start(tracedCtx, "PipelinesAsCode:ProcessEvent",
		trace.WithSpanKind(trace.SpanKindServer),
	)
	span.End()

	spans := exporter.GetSpans()
	var processSpan sdktrace.ReadOnlySpan
	for _, s := range spans {
		if s.Name() == "PipelinesAsCode:ProcessEvent" {
			processSpan = s
		}
	}
	assert.Assert(t, processSpan != nil, "ProcessEvent span not found")
	assert.Equal(t, processSpan.Parent().TraceID(), expectedTraceID,
		"ProcessEvent span should be parented under the incoming trace context, not a new root")
	assert.Assert(t, processSpan.Parent().IsValid(),
		"ProcessEvent span should have a valid remote parent")
}

func TestProcessEventSpanCreatesRootWithoutIncomingContext(t *testing.T) {
	exporter := testtracing.SetupTracer(t)

	// Webhook with no traceparent header.
	req, _ := http.NewRequestWithContext(context.Background(), http.MethodPost, "http://localhost", nil)

	tracedCtx := otel.GetTextMapPropagator().Extract(context.Background(), propagation.HeaderCarrier(req.Header))
	_, span := otel.Tracer(tracing.TracerName).Start(tracedCtx, "PipelinesAsCode:ProcessEvent",
		trace.WithSpanKind(trace.SpanKindServer),
	)
	span.End()

	spans := exporter.GetSpans()
	processSpan := testtracing.FindSpan(spans, "PipelinesAsCode:ProcessEvent")
	assert.Assert(t, processSpan != nil)
	assert.Assert(t, !processSpan.Parent().IsValid(),
		"ProcessEvent span should be a root when no incoming trace context is present")
}
2 changes: 2 additions & 0 deletions pkg/apis/pipelinesascode/keys/keys.go
@@ -68,6 +68,8 @@
	GithubApplicationID  = "github-application-id"
	GithubPrivateKey     = "github-private-key"
	ResultsRecordSummary = "results.tekton.dev/recordSummaryAnnotations"

	SpanContextAnnotation = "tekton.dev/pipelinerunSpanContext"
)

var ParamsRe = regexp.MustCompile(`{{([^}]{2,})}}`)