feat: per-job oidc_tokens block for custom audiences #1008

Open
cchristous wants to merge 12 commits into semaphoreio:main from cchristous:feat/oidc-tokens-custom-audience

@cchristous cchristous commented May 1, 2026

Status: Draft — proto changes pending maintainer assistance

This PR adds a per-job oidc_tokens: block to the pipeline yaml that lets pipelines mint additional OIDC tokens with custom audiences, alongside the existing default SEMAPHORE_OIDC_TOKEN.

Why: The OIDC aud claim (RFC 7519 §4.1.3) identifies the token's intended consumer. Today Semaphore hardcodes aud = https://<org>.semaphoreci.com, which prevents Semaphore tokens from being used with services that strictly verify a specific audience. The motivating case is PyPI, which requires aud="pypi" (downstream PR pypi/warehouse#19048 implementing pypi/warehouse#18882). Other ecosystems with the same constraint include npm trusted publishing, Docker Hub OIDC, and HashiCorp Vault bound_audiences.

What's in this PR

Schema (plumber/spec/priv/v1.0.yml)

jobs:
  - name: Publish to PyPI
    oidc_tokens:
      PYPI_OIDC_TOKEN:
        aud: pypi
      NPM_TOKEN:
        aud: ["npm-prod", "npm-staging"]
    commands:
      - ...

Validation rules (yaml-level, fail fast):

  1. Each entry MUST have aud. Missing aud → schema validation failure.
  2. aud is a non-empty string (≤256 chars) OR a non-empty list of non-empty strings (1–8 entries, each ≤256 chars).
  3. Yaml key (env var name) MUST match ^[A-Z_][A-Z0-9_]*$. Up to 16 entries per job.
  4. Reserved key SEMAPHORE_OIDC_TOKEN rejected at semantic-validation time.
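
The four rules above can be sketched as a single check. This is an illustrative Python sketch only: in the PR the first three rules live in the JSON schema and the fourth in an Elixir semantic validator, and the function name `validate_oidc_tokens` and the error strings are hypothetical.

```python
import re

# Limits and patterns taken from the rules above.
ENV_VAR_NAME = re.compile(r"[A-Z_][A-Z0-9_]*")
RESERVED_NAME = "SEMAPHORE_OIDC_TOKEN"
MAX_TOKENS, MAX_AUDIENCES, MAX_AUD_LENGTH = 16, 8, 256


def _valid_aud_string(value):
    """A single audience must be a non-empty string of at most 256 chars."""
    return isinstance(value, str) and 1 <= len(value) <= MAX_AUD_LENGTH


def validate_oidc_tokens(oidc_tokens):
    """Return a list of error strings; an empty list means the block is valid."""
    errors = []
    if len(oidc_tokens) > MAX_TOKENS:
        errors.append(f"more than {MAX_TOKENS} entries")
    for name, entry in oidc_tokens.items():
        if not ENV_VAR_NAME.fullmatch(name):
            errors.append(f"{name}: invalid env var name")
        if name == RESERVED_NAME:
            errors.append(f"{name}: reserved name")
        aud = entry.get("aud") if isinstance(entry, dict) else None
        if aud is None:
            errors.append(f"{name}: missing aud")
        elif isinstance(aud, list):
            if not 1 <= len(aud) <= MAX_AUDIENCES:
                errors.append(f"{name}: aud list must have 1 to {MAX_AUDIENCES} entries")
            elif not all(_valid_aud_string(a) for a in aud):
                errors.append(f"{name}: invalid aud list entry")
        elif not _valid_aud_string(aud):
            errors.append(f"{name}: invalid aud")
    return errors
```

Applied to the example job above, `validate_oidc_tokens({"PYPI_OIDC_TOKEN": {"aud": "pypi"}, "NPM_TOKEN": {"aud": ["npm-prod", "npm-staging"]}})` returns an empty list.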

Secrethub JWT generator

secrethub/lib/secrethub/open_id_connect/jwt.ex learns to honor an optional :audience field on the request struct. When set, it overrides the default org-URL audience. A single-element list flattens to a string, per RFC 7519 convention (necessary for strict-aud consumers like PyPI). A multi-element list becomes a JSON array.

| req.audience | JWT aud claim |
| --- | --- |
| absent / [] (default) | "https://<org>.<domain>" (current behavior) |
| ["pypi"] (single-element list) | "pypi" (string) |
| ["a", "b"] (multi-element list) | ["a", "b"] (JSON array) |

Pure additive change — existing callers unaffected.
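
The mapping from req.audience to the aud claim can be sketched as follows. This is a Python illustration of the behavior described above, not the Elixir code in jwt.ex; the function name `aud_claim` and its `org` and `domain` parameters are hypothetical.

```python
def aud_claim(audience, org, domain):
    """Map the optional request audience to the JWT `aud` claim value."""
    if not audience:
        # Absent (None) or an empty list: keep the default org-URL audience.
        return f"https://{org}.{domain}"
    if len(audience) == 1:
        # Single-element list: flatten to a plain string.
        return audience[0]
    # Multi-element list: serialized as a JSON array in the token.
    return list(audience)
```

Under these assumptions, `aud_claim(["pypi"], "acme", "semaphoreci.com")` yields the string `"pypi"`, while `aud_claim([], "acme", "semaphoreci.com")` yields `"https://acme.semaphoreci.com"`.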

Semantic validator

plumber/ppl/lib/ppl/definition_reviser/oidc_tokens_validator.ex rejects SEMAPHORE_OIDC_TOKEN as a custom token name (it is the reserved name for the auto-injected default). It is wired into the definition_reviser `with` chain alongside JobMatrixValidator and ParallelismValidator.
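
A rough Python sketch of the reserved-name check, assuming the pipeline definition is a parsed map with jobs under `blocks[].task.jobs` and `after_pipeline.task.jobs`. The function names and the tagged-tuple return shape are hypothetical stand-ins for the Elixir validator.

```python
RESERVED_NAME = "SEMAPHORE_OIDC_TOKEN"


def _jobs(definition):
    """Yield every job from regular blocks and from after_pipeline."""
    for block in definition.get("blocks", []):
        yield from block.get("task", {}).get("jobs", [])
    yield from definition.get("after_pipeline", {}).get("task", {}).get("jobs", [])


def validate_reserved_names(definition):
    """Return ("ok", definition) or ("error", message) for the first offending job."""
    for job in _jobs(definition):
        if RESERVED_NAME in job.get("oidc_tokens", {}):
            return ("error", f"job {job.get('name')!r}: {RESERVED_NAME} is reserved")
    return ("ok", definition)
```

Iterating over every job, rather than stopping after the first job of a block, is what lets the check catch a reserved name on a later job or in after_pipeline.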

Documentation

User guide and reference pages updated under docs/docs/using-semaphore/openid.md and docs/docs/reference/openid.md.

What's NOT in this PR (needs maintainer assistance)

The full feature requires changes to renderedtext/internal_api (the private proto-source repo) which I don't have access to:

  1. Add optional audience field to GenerateOpenIDConnectTokenRequest in secrethub.proto. Once added and .pb.ex files regenerated, internal_grpc_api.ex can pass the field through to JWT.generate_and_sign/1 (which already reads it via Map.get(req, :audience, [])).

  2. Add OIDCTokenSpec message and oidc_tokens field on Job.Spec in jobs.proto (or whichever proto file currently defines Semaphore.Jobs.V1alpha.Job.Spec). Once added:

    • plumber/ppl needs a small change to compile yaml oidc_tokens (a map from name to {aud}) into the proto repeated OIDCTokenSpec oidc_tokens field.
    • zebra/lib/zebra/workers/job_request_factory/open_id_connect.ex needs a new load_oidc_tokens/5 function that mints one additional OIDC token per Job.Spec.oidc_tokens entry (with the requested audience) and emits the env vars. The default-token path stays untouched.

The implementation pattern for both Zebra and the plumber compilation follows the existing secrets: block precisely.
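
To make the two pending pieces concrete, here is a minimal Python sketch of the data flow. It assumes the proto message carries a name and a list of audiences; the record keys `name` and `audiences`, both function names, and the `mint_token` callback (standing in for the secrethub call) are hypothetical, since the proto changes have not been written yet.

```python
def compile_oidc_tokens(oidc_tokens):
    """Turn the yaml map {name: {aud: ...}} into a list of {name, audiences} records."""
    specs = []
    for name, entry in oidc_tokens.items():
        aud = entry["aud"]
        # Normalize the string form to a one-element list so consumers see one shape.
        audiences = [aud] if isinstance(aud, str) else list(aud)
        specs.append({"name": name, "audiences": audiences})
    return specs


def mint_env_vars(specs, mint_token):
    """Mint one token per spec and return the env vars to inject into the job."""
    return {spec["name"]: mint_token(spec["audiences"]) for spec in specs}
```

Normalizing to a list at compile time keeps the string-versus-array decision in one place, the JWT generator, which already flattens single-element lists.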

I've intentionally not modified any .pb.ex files in this PR (those are generated). Once the proto changes land, regenerating in the consuming services (plumber/proto, secrethub, zebra, front, guard, public-api, ee/gofer) plus the Zebra and plumber/ppl changes above complete the feature.

I'm happy to do those final pieces myself if you can grant me the proto-repo access, or to review/iterate if a maintainer wants to drive them.

Tests

  • secrethub/test/secrethub/open_id_connect/jwt_test.exs — 7 tests:

    • 4 audience-handling cases (default-via-absent, default-via-empty-list, single-element flatten, multi-element JSON array)
    • audience: nil falls back to default (covers proto-deserialization edge case)
    • iss invariance: overriding aud does not change iss (asserted on default and override paths)
    • JWTFilter interaction (with :open_id_connect_filter enabled): asserts aud survives when allowlisted; complementary test asserts aud is stripped when not allowlisted (plus refute of a non-allowlisted claim to prove the filter actually ran).
      Existing 134 secrethub regression tests pass.
  • plumber/spec/test/pipelines/v1.0-oidc_tokens*.yml — 11 fixtures auto-discovered by SemaphoreYamlSpecTest:

    • 1 positive (string + list aud forms)
    • 7 negative .fail.yml covering each rejection rule:
      missing_aud, invalid_key, empty_aud_list, aud_empty_string, aud_too_long (257 chars), too_many_audiences (9 entries), too_many_tokens (17 entries)
    • 3 positive at-bound fixtures (at_max_tokens = 16, at_max_audiences = 8, at_max_aud_length = 256) so a regression that tightens a bound flips a passing fixture.
  • plumber/ppl/test/definition_reviser/oidc_tokens_validator_test.exs — 6 unit tests covering happy paths, reserved name in regular block, reserved name in after_pipeline, and reserved name on second job in a block (proves block iteration isn't short-circuited).
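
The property the two JWTFilter tests check can be illustrated with a small Python sketch. It assumes the filter is a plain allowlist over claim names and that signing fills in a placeholder aud when none survives; `filter_claims`, `finalize`, and the sample claim values are hypothetical.

```python
def filter_claims(claims, allowlist):
    """Keep only the claims whose names appear in the org's allowlist."""
    return {name: value for name, value in claims.items() if name in allowlist}


def finalize(claims, default_aud="Joken"):
    """Stand-in for sign time: fill in a placeholder `aud` when none survived."""
    return {"aud": default_aud, **claims}


claims = {"iss": "https://acme.example", "sub": "job:123", "aud": "pypi"}

# `aud` allowlisted: the override survives; `sub` is dropped, proving the filter ran.
kept = finalize(filter_claims(claims, {"iss", "aud"}))
assert kept["aud"] == "pypi" and "sub" not in kept

# `aud` not allowlisted: the key is still present, but not with the user's value.
stripped = finalize(filter_claims(claims, {"iss"}))
assert stripped["aud"] != "pypi"
```

This is why the second assertion refutes equality with the override instead of asserting that the aud key is absent.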

Schema bounds

The oidc_tokens schema has explicit upper limits to prevent claim-bloat / DoS-style abuse:

  • maxProperties: 16 (at most 16 entries per job)
  • maxLength: 256 (each aud string up to 256 chars)
  • maxItems: 8 (aud list up to 8 audiences)
  • minLength: 1, minItems: 1 (empty values rejected)

Each bound has a positive fixture at the limit and a negative fixture one over.

Out-of-scope test fix

plumber/ppl/test/definition_reviser/global_job_config_test.exs had a brittle on_exit callback that called System.put_env/2 with a nil value (returned by System.get_env/1 when the env var was unset before the test). Replaced with a small restore_env/2 helper that calls System.delete_env/1 for the nil case. Strictly an improvement; the test now passes regardless of host env state. Out of scope for the feature, but it was the only thing keeping CI red on this branch.
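
The same save-and-restore pattern, sketched in Python as an analogue of the Elixir helper (the actual restore_env/2 is not shown in this description, so the body below is an assumption about its shape):

```python
import os


def restore_env(name, saved):
    """Restore `name` to its saved value, deleting it if it was unset before."""
    if saved is None:
        # The variable did not exist before the test: remove it instead of
        # trying to store None, which the environment API rejects.
        os.environ.pop(name, None)
    else:
        os.environ[name] = saved
```

A test would capture `saved = os.environ.get(name)` before mutating the variable and call `restore_env(name, saved)` in its cleanup step.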

mix format clean, mix credo --strict clean on new files.

Backward compatibility

  • Existing pipelines unchanged.
  • SEMAPHORE_OIDC_TOKEN semantics unchanged.
  • :audience field on the JWT request map is optional (read via Map.get(..., [])); old callers unaffected.
  • oidc_tokens is purely additive; pipelines that don't use it behave identically to today.

Open questions for maintainers

  1. Naming. I chose oidc_tokens to align with Semaphore's existing SEMAPHORE_OIDC_TOKEN env var vocabulary. GitLab CI uses id_tokens. Open to either.
  2. Feature flag scoping. Today's OIDC is gated by :open_id_connect. Should oidc_tokens ride this same flag, or get a new one for staged rollout?
  3. Self-hosted (CE) eligibility. OIDC is Enterprise-only on SaaS today. The original requestor of this feature (pypi/warehouse#18882) is a self-hosted user. Should oidc_tokens be available in CE?
  4. Yaml shape. I chose a map (yaml key = env var name); existing secrets: is a list of objects. Map matches GitLab and is more compact for the common single-entry case. Open to flipping if you prefer.

Related

cchristous added 5 commits May 1, 2026 13:25

Adds an optional `:audience` field to the OIDC token request handled
by `Secrethub.OpenIDConnect.JWT.generate_and_sign/1`.

Behavior:
  * absent or empty list -> existing default `https://<org>.<domain>`
  * single-element list  -> string (RFC 7519 sec 4.1.3 convention,
    required by strict-aud consumers like PyPI Trusted Publishers)
  * multi-element list   -> JSON array

Purely additive: existing callers that do not set `:audience` are
unaffected. Filed as part of the wider per-job `oidc_tokens` block
work; gRPC plumbing and request-struct field are out of scope for
this change and will land separately.

Introduces a per-job `oidc_tokens` map keyed by env var name, with each
entry requiring an `aud` value (string or non-empty list of strings).
Keys are restricted to the env var pattern via patternProperties combined
with additionalProperties: false. Reserved-name semantics
(`SEMAPHORE_OIDC_TOKEN`) are intentionally not enforced at the schema
level and will be handled by the semantic check in a follow-up task.

Adds positive and negative fixtures for the existing pipelines
directory test (missing aud, invalid env var name, empty aud list).

- Rename 4 fixture files to match existing snake_case convention
- maxProperties: 16 on oidc_tokens map (cap entries per job)
- maxLength: 256 on aud strings, maxItems: 8 on aud arrays (hardening)

Reject SEMAPHORE_OIDC_TOKEN as a custom token name (reserved for the
auto-injected default token).

Env var name format ([A-Z_][A-Z0-9_]*) is enforced at the JSON schema
layer via patternProperties. Duplicate token names are impossible at
the yaml level (map-shaped block).

Adds a "Custom audiences" section to the OIDC user guide with a complete
PyPI publishing example, and a corresponding reference section
documenting the schema, behavior, and validation errors.

cchristous added 7 commits May 1, 2026 22:52

- Handle audience: nil (key present, nil value) by falling back to default
- Add iss assertion to existing tests (locks in iss/aud independence)
- Add JWTFilter interaction test (verifies aud survives filter when enabled)

Adds .fail.yml fixtures verifying maxProperties: 16, maxLength: 256,
maxItems: 8, and minLength: 1 (empty aud string) on the oidc_tokens schema.

- Clarify that single-element aud lists flatten to strings (RFC 7519)
- Add "coming soon" banner noting Zebra runtime injection is pending
- Genericize the reserved-name validation error description
- Document schema limits in the user guide
- Add oidc_tokens cross-reference in the canonical pipeline-yaml reference

- Drop hard-coded "24h" TTL from user guide (rot risk; reference page
  intentionally omits the value)
- Re-add the "when OIDC is enabled" qualifier to the user guide's
  default-token claim (matches reference page)
- Remove leftover LLM scaffolding line in reference/openid.md
- Add @spec to OIDCTokensValidator.validate/1
- Strengthen JWTFilter test (asserts filter actually ran by omitting
  `sub` from the allowlist mock and refuting it survived into the JWT)
- Add complementary "aud not allowlisted" filter test verifying that
  a user-supplied custom audience is stripped when `aud` is not in the
  org claim allowlist
- Add positive boundary fixtures at exactly 16 tokens / 8 audiences /
  256-char aud so a regression that tightens any of those bounds flips
  a passing fixture to failing

- Mirror the "Coming soon" admonition into pipeline-yaml.md so all
  three docs (user guide, openid reference, pipeline-yaml reference)
  are in sync
- Add iss assertion to the single-element audience override test
  (the most regression-prone unwrap path)
- Add explicit `{#oidc_tokens-block}` anchor to the openid heading so
  the cross-link from pipeline-yaml.md does not depend on auto-slug
  behavior

The on_exit callback called System.put_env/2 with the value returned by
System.get_env/1, which is nil when the env var was unset before the
test ran. System.put_env/2 has no clause for nil and crashes.

Replace the bare put_env calls with a restore_env/2 helper that calls
System.delete_env/1 when the saved value was nil. The test now passes
regardless of host env state.

The previous "strips audience" test asserted `aud` was absent from the
final JWT after the filter ran. CI revealed that Joken's default config
injects a placeholder `aud` value ("Joken") when the claims map lacks
one, so `aud` is present after signing, just not with the user's override.

The security-relevant property is "user override is suppressed", not
"aud key is absent." Updated the assertion to refute equality with the
override and clarified the test name and docstring.

@cchristous cchristous marked this pull request as ready for review May 2, 2026 21:16