Design: support Gateway API's new XListenerSet resource#7839

Merged: cert-manager-prow[bot] merged 12 commits into master from proposal-gatewayapi-listenerset on Dec 11, 2025

@maelvls (Member) commented Jul 3, 2025

Top-level issue: #8251

Rendered document: 20250703.gatewayapi-listenerset.md

Pull Request Motivation

I'd like to propose a design to address:

This design supersedes two designs:

Related:

/kind design

Release Note

NONE

CyberArk tracker: VC-46888

@cert-manager-prow bot added the kind/design, release-note-none, dco-signoff: yes, and size/L labels Jul 3, 2025
@cert-manager-prow bot added the size/XL label and removed the size/L label Jul 7, 2025
@maelvls changed the title from "Proposal: support Gateway API's new ListenerSet" to "Design: support Gateway API's new ListenerSet" Jul 8, 2025
@maelvls maelvls requested a review from wallrj July 8, 2025 16:22
@wallrj (Member) left a comment

Thanks @maelvls

This looks great.

I haven't tried the Gateway API examples myself, but I will. I'd like to try updating the getting started tutorials to use Gateway API instead of Ingress so that I can understand all this better.

@cert-manager-prow cert-manager-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 9, 2025

Two workarounds have been found by cluster operators:

- **Using a wildcard certificate as hostname on the Gateway:** this solution introduces risks associated with wildcard certificates (cf. [OWASP notes](https://cheatsheetseries.owasp.org/cheatsheets/Transport_Layer_Security_Cheat_Sheet.html#carefully-consider-the-use-of-wildcard-certificates) on using wildcard certificates).

I feel compelled to point out that the point of allowing wildcard certificates on the Gateway at all was to minimize the risk of exposure of the wildcard key. That's one of the primary reasons why we built ReferenceGrant - so that very-secure keys could be consumed by Gateway owners without being able to be read by those same Gateway owners.

That said, I don't think that changes this recommendation: There are risks associated with wildcard certificates that need to be managed, and that using cert-manager to manage them currently doesn't use a separate namespace and ReferenceGrant, so the concerns called out by OWASP do apply in cert-manager's case.

Contributor

It seems pretty clear the average person has no way of understanding how to do this correctly. And it seems likely that the average implementation doesn't make it remotely easy.

Certainly from reading people trying to do things, it doesn't sound like the implementations do anything close to what the spec writers imagined.

That said, if someone can point to an implementation that does what was expected, that'd be handy to leave as a reference.

@maelvls (Member, Author) commented Nov 30, 2025

Thanks both. I wasn't aware that ReferenceGrant aims to lock down the wildcard cert's private key so that it can be used across multiple teams, each using a separate Gateway that is given "read" access to that shared wildcard cert.

I've expanded my bullet point to take that into account:

  • Using a wildcard certificate as hostname on the Gateway: Gateway API intentionally supports wildcard certificates with a secure design: the wildcard private key stays isolated in a privileged namespace (e.g., gateway-system), and Gateway owners reference it via a ReferenceGrant without being able to read the key itself. However, when cert-manager creates wildcard certificates for this workaround, it typically places the Secret in the same namespace as the Gateway or uses cluster-wide permissions, bypassing the ReferenceGrant isolation model. This means developers can read the wildcard private key directly, introducing the risks associated with wildcard certificate exposure (cf. OWASP notes on using wildcard certificates).

@youngnick

This design makes sense to me, great to see.

I'd also encourage whoever implements this to make clear to users that this is for TESTING ONLY. The X prefix on XListenerSet is in case we need to make breaking changes. Experimental resources like that should not be used in production, and migrating from XListenerSet to ListenerSet when this goes to stable will require manual action (pulling down the object, changing the Kind and Group, and re-adding a new object). This is by design, as this is for testing only.
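
The manual migration steps described above (pull down the object, change the Kind and Group, re-add it) could be sketched roughly as follows. This is only an illustration, not part of the design: the helper name `to_stable_listener_set` is hypothetical, the experimental API version is assumed to be `gateway.networking.x-k8s.io/v1alpha1`, and the stable API version is a placeholder because it is not known yet.

```python
import copy

# Assumption: the experimental resource is served under this group/version.
EXPERIMENTAL_API_VERSION = "gateway.networking.x-k8s.io/v1alpha1"
# Assumption: placeholder only; the stable group/version is not decided yet.
STABLE_API_VERSION = "gateway.networking.k8s.io/v1"

# Server-populated metadata that should not be copied onto a new object.
SERVER_FIELDS = ("uid", "resourceVersion", "creationTimestamp",
                 "generation", "managedFields")


def to_stable_listener_set(manifest: dict) -> dict:
    """Return a copy of an XListenerSet manifest rewritten as a ListenerSet."""
    if manifest.get("kind") != "XListenerSet":
        raise ValueError("expected a manifest of kind XListenerSet")
    migrated = copy.deepcopy(manifest)  # leave the caller's object untouched
    migrated["apiVersion"] = STABLE_API_VERSION
    migrated["kind"] = "ListenerSet"
    metadata = migrated.get("metadata", {})
    for field in SERVER_FIELDS:
        metadata.pop(field, None)
    migrated.pop("status", None)  # status is owned by the controller
    return migrated


if __name__ == "__main__":
    old = {
        "apiVersion": EXPERIMENTAL_API_VERSION,
        "kind": "XListenerSet",
        "metadata": {"name": "app", "namespace": "team-a", "uid": "123"},
        "spec": {"parentRef": {"name": "shared-gateway"}},
    }
    print(to_stable_listener_set(old))
```

The rewritten manifest would still have to be applied as a new object and the old XListenerSet deleted by hand, which is the manual action the comment refers to.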

Implementing this now for cert-manager is amazing though, because it will allow implementations to be sure that one of the most common use cases for this flow is tested and validated before we move the whole object to Standard/Stable.

That's going to allow us to get this design right as early as possible, and hopefully be able to move forward XListenerSet into ListenerSet as soon as possible.

Thanks for this design @maelvls, nice work.

@kflynn left a comment

Looks good to me, too.


### Issuer Annotations

What about a Gateway resource with the `cert-manager.io/issuer` and a listener

It makes me very sad that this needs to be a thing, but I think that the way you've covered it makes the most sense.


Hah, me too. But the only other option is to do some sort of Policy behavior, which would be as bad, and would break the existing-user experience too much.

@maelvls maelvls linked an issue Aug 19, 2025 that may be closed by this pull request
@maelvls changed the title from "Design: support Gateway API's new ListenerSet" to "Design: support Gateway API's new XListenerSet resource" Nov 13, 2025
@maelvls maelvls added the cybr Used by CyberArk-employed maintainers to report to line management what's being worked on. label Nov 14, 2025

## Motivation

Application developers previously using Ingress could configure both routing and TLS certificates independently. When moving to Gateway API, the TLS configuration is now centralized in the Gateway resource, typically controlled by the cluster operator (see below section about locking down Gateway resources). This change restricts developers, who can no longer configure TLS in a self-service way.
Contributor

Please avoid "see below" in favor of "see [anchor](...)" or something similar. Some of us use screen readers, where "below" is a totally unusable concept.


@maelvls maelvls force-pushed the proposal-gatewayapi-listenerset branch 2 times, most recently from 8bc5514 to 7229f98 Compare November 24, 2025 14:08
@maelvls maelvls force-pushed the proposal-gatewayapi-listenerset branch 3 times, most recently from 9c11b4e to ed5edeb Compare November 24, 2025 15:04
@maelvls (Member, Author) commented Nov 24, 2025

I've addressed all the comments, good to go!

@maelvls maelvls force-pushed the proposal-gatewayapi-listenerset branch from ed5edeb to 2f85dc9 Compare November 25, 2025 09:23
@maelvls maelvls force-pushed the proposal-gatewayapi-listenerset branch from 2f85dc9 to 3b52e7d Compare November 25, 2025 10:30

| Question | Answer |
|---|---|
| How can this feature be enabled or disabled for an existing cert-manager installation? | Feature gate `--feature-gates XGatewayAPI=true` will control enabling/disabling reconciling `XListenerSet` resources. |
@maelvls (Member, Author) commented Nov 27, 2025

In the blog post PR, I had originally shown an example of what the feature gate would look like. I had written the following snippet:

--enable-gateway-api \
--feature-gates XGatewayAPI=true

@ThatsMrTalbot made the point, in cert-manager/website#1857 (comment), that this feature gate name doesn't reflect the actual feature:

Should the feature gate be XListenerSet? XGatewayAPI seems very generic and does not tell me what it does

Contributor

Yeah XListenerSets would be a much better feature gate name

@maelvls (Member, Author) commented Nov 30, 2025

Thanks. I've changed it to:

--feature-gates XListenerSet=true

(I dropped the ending s since the gate is on the XListenerSet kind)
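
To illustrate how a value such as `XListenerSet=true` is typically interpreted, here is a minimal sketch of parsing a comma-separated `--feature-gates` string. It is not cert-manager's actual parser: the function names are hypothetical, only `true` and `false` are accepted as a simplification, and the sketch assumes the gate is off when it is not mentioned.

```python
def parse_feature_gates(value: str) -> dict[str, bool]:
    """Parse a string like "A=true,B=false" into a mapping of gate names."""
    gates: dict[str, bool] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        name, sep, raw = item.partition("=")
        if not sep or raw.strip().lower() not in ("true", "false"):
            raise ValueError(f"invalid feature gate entry: {item!r}")
        gates[name.strip()] = raw.strip().lower() == "true"
    return gates


def listener_set_enabled(flag_value: str) -> bool:
    """Report whether the XListenerSet gate is switched on.

    Assumption: the gate defaults to off when it is absent.
    """
    return parse_feature_gates(flag_value).get("XListenerSet", False)


if __name__ == "__main__":
    print(listener_set_enabled("XListenerSet=true"))
    print(listener_set_enabled("XGatewayAPI=true"))
```

With the renamed gate, the earlier generic name `XGatewayAPI` no longer switches anything on in this sketch, which mirrors the reason for the rename: the gate name should say which resource it controls.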

@wilmardo commented Nov 28, 2025

Certainly from reading people trying to do things, it doesn't sound like the implementations do anything close to what the spec writers imagined.

That said, if someone can point to an implementation that does what was expected, that'd be handy to leave as a reference.

@jsoref

See this reddit comment from one of the maintainers of GW API for some context:
https://www.reddit.com/r/kubernetes/comments/1p613rp/comment/nqnlmh4/

As I've said in other Reddit comments, this is because when we first designed this relationship, certificates were absolutely not a thing you wanted App Devs touching or owning, because they were bought from Verisign or similar and cost thousands of dollars each.

Sadly for us, but happily for everyone else, Let's Encrypt (and cert-manager for Kubernetes) helped to break the certificate monopoly and make it possible to allow App Devs to "own" their own Certificates (in the sense of asking something else to provision a certificate for them), while having that be acceptably secure.

We started in 2019, but at that stage, Let's Encrypt hadn't broken through into broad usage yet, particularly in the enterprise users that all our employers tend to focus on.

I am not sure that a Reddit comment is the reference you were looking for, but it at least gave me the context on why it currently is this way compared to the Ingress resource.

@cert-manager-prow bot added the size/XXL label and removed the size/XL label Nov 30, 2025

As suggested by @jsoref and @ThatsMrTalbot, the feature gate name should be
more specific and reflect what it actually does rather than being too generic.

Signed-off-by: Maël Valais <[email protected]>

Explains the two main concerns that led to centralizing TLS config at the
Gateway level:
- Traffic hijacking protection (preventing conflicting Ingress objects)
- Certificate cost concerns (certs were expensive when GW API was designed)

This provides important historical context for why ListenerSet emerged as the
solution to restore developer self-service while maintaining security.

Signed-off-by: Maël Valais <[email protected]>

Addresses the nuance that Gateway API intentionally supports wildcard
certificates with a secure design using ReferenceGrant for namespace isolation.
The security concern arises specifically when cert-manager creates wildcard
certificates without following this isolation model, allowing developers to
read the private key directly.

This clarification maintains the validity of the OWASP warning while
acknowledging Gateway API's thoughtful security design.

Signed-off-by: Maël Valais <[email protected]>

Signed-off-by: Maël Valais <[email protected]>

@maelvls maelvls force-pushed the proposal-gatewayapi-listenerset branch from e17bdb8 to 3a268f0 Compare November 30, 2025 13:55
@maelvls (Member, Author) commented Nov 30, 2025

@wilmardo Thanks for pointing me to that comment. I agree, Nick's comment on Reddit is helpful and sheds some light on why things are the way they are. It would be nice if this bit of information were added to Gateway API's "Key differences between Ingress API and Gateway API" page. I've added a paragraph with this important bit of context.


**Traffic hijacking protection:** with the Ingress API, one team can accidentally or maliciously capture traffic intended for another team by creating an Ingress with the same hostname but different TLS configuration. This often happens in larger clusters with many teams, where conflicting Ingress objects can silently intercept traffic meant for other services.

**Certificate cost concerns:** as [Nick Young explained](https://www.reddit.com/r/kubernetes/comments/1p613rp/comment/nqnlmh4/), when Gateway API was first designed, certificates were expensive assets bought from Verisign or similar providers, costing thousands of dollars each. You absolutely didn't want app developers touching or owning those certificates.
@maelvls (Member, Author) commented Dec 1, 2025

I've added this after reading the Reddit thread, FYI @youngnick


Thanks for that. I'll make a note that we need to update the differences page as well.

Contributor

Which pages are those?

Member Author

Contributor

Indeed, that page should really have SEO for the word hijacking.

@arifsumona left a comment

59905167

@wallrj-cyberark (Member) left a comment

Thanks @maelvls and others for putting together the design. It can be refined if necessary as you proceed with the implementation.

@wallrj (Member) commented Dec 11, 2025

/approve
/lgtm

@cert-manager-prow cert-manager-prow bot added the lgtm Indicates that a PR is ready to be merged. label Dec 11, 2025
@cert-manager-prow
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kflynn, wallrj, wallrj-cyberark

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cert-manager-prow cert-manager-prow bot merged commit 23629d5 into master Dec 11, 2025
6 checks passed
@maelvls maelvls self-assigned this Dec 12, 2025

Labels

- approved: Indicates a PR has been approved by an approver from all required OWNERS files.
- cybr: Used by CyberArk-employed maintainers to report to line management what's being worked on.
- dco-signoff: yes: Indicates that all commits in the pull request have the valid DCO sign-off message.
- kind/design: Categorizes issue or PR as related to design.
- lgtm: Indicates that a PR is ready to be merged.
- release-note-none: Denotes a PR that doesn't merit a release note.
- size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Create certificate based on HTTPRoute configuration

8 participants