Design: support Gateway API's new XListenerSet resource #7839
cert-manager-prow[bot] merged 12 commits into master
| Two workarounds have been found by cluster operators:
|
| - **Using a wildcard certificate as hostname on the Gateway:** this solution introduces risks associated with wildcard certificates (cf. [OWASP notes](https://cheatsheetseries.owasp.org/cheatsheets/Transport_Layer_Security_Cheat_Sheet.html#carefully-consider-the-use-of-wildcard-certificates) on using wildcard certificates).
I feel compelled to point out that the point of allowing wildcard certificates on the Gateway at all was to minimize the risk of exposure of the wildcard key. That's one of the primary reasons why we built ReferenceGrant: so that very-secure keys could be consumed by Gateway owners without being readable by those same Gateway owners.
That said, I don't think that changes this recommendation: there are risks associated with wildcard certificates that need to be managed, and using cert-manager to manage them currently doesn't involve a separate namespace and ReferenceGrant, so the concerns called out by OWASP do apply in cert-manager's case.
It seems pretty clear the average person has no way of understanding how to do this correctly. And it seems likely that the average implementation doesn't make it remotely easy.
Certainly from reading people trying to do things, it doesn't sound like the implementations do anything close to what the spec writers imagined.
That said, if someone can point to an implementation that does what was expected, that'd be handy to leave as a reference.
Thanks both. I wasn't aware of the fact that ReferenceGrant aims at locking down the wildcard cert's private key so that it is possible to use it across multiple teams, each of them using a separate Gateway that is given "read" access to that shared wildcard cert.
I've expanded my bullet point to take that into account:
- **Using a wildcard certificate as hostname on the Gateway:** Gateway API intentionally supports wildcard certificates with a secure design: the wildcard private key stays isolated in a privileged namespace (e.g., `gateway-system`), and Gateway owners reference it via a ReferenceGrant without being able to read the key itself. However, when cert-manager creates wildcard certificates for this workaround, it typically places the Secret in the same namespace as the Gateway or uses cluster-wide permissions, bypassing the ReferenceGrant isolation model. This means developers can read the wildcard private key directly, introducing the risks associated with wildcard certificate exposure (cf. OWASP notes on using wildcard certificates).
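For reference, the isolation model described in this bullet might look roughly like the sketch below. It is an illustration only: the names `team-a`, `wildcard-tls`, `example-class`, `allow-team-a-gateways`, and the hostname `*.example.com` are hypothetical and not taken from the design. The ReferenceGrant lives in the Secret's namespace and lets Gateways in `team-a` reference the Secret.

```yaml
# All names below (team-a, wildcard-tls, example-class, *.example.com)
# are hypothetical and only serve the illustration.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-team-a-gateways
  namespace: gateway-system        # same namespace as the wildcard Secret
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: Gateway
      namespace: team-a            # Gateways in this namespace may reference...
  to:
    - group: ""
      kind: Secret
      name: wildcard-tls           # ...this one Secret only
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: team-a-gateway
  namespace: team-a
spec:
  gatewayClassName: example-class
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: "*.example.com"
      tls:
        mode: Terminate
        certificateRefs:
          - group: ""
            kind: Secret
            name: wildcard-tls
            namespace: gateway-system   # cross-namespace ref, allowed by the grant
```

Whether the Gateway owner can actually read the key still depends on RBAC: the isolation holds only if users in `team-a` have no read access to Secrets in `gateway-system`.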
This design makes sense to me, great to see. I'd also encourage whoever implements this to make clear to users that this is for TESTING ONLY. Implementing this now for cert-manager is amazing though, because it will allow implementations to be sure that one of the most common use cases for this flow is tested and validated before we move the whole object to Standard/Stable. That's going to allow us to get this design right as early as possible, and hopefully to move XListenerSet forward into ListenerSet as soon as possible. Thanks for this design @maelvls, nice work.
| ### Issuer Annotations
|
| What about a Gateway resource with the `cert-manager.io/issuer` and a listener
It makes me very sad that this needs to be a thing, but I think that the way you've covered it makes the most sense.
Hah, me too. But the only other option is to do some sort of Policy behavior, which would be as bad, and would break the existing-user experience too much.
| ## Motivation
|
| Application developers previously using Ingress could configure both routing and TLS certificates independently. When moving to Gateway API, the TLS configuration is now centralized in the Gateway resource, typically controlled by the cluster operator (see below section about locking down Gateway resources). This change restricts developers, who can no longer configure TLS in a self-service way.
Please avoid `see below` in favor of `see [anchor](...)` or something. Some of us use screen readers where "below" is a totally unusable concept.
Signed-off-by: Maël Valais <[email protected]>
…ior will be Signed-off-by: Maël Valais <[email protected]>
Signed-off-by: Maël Valais <[email protected]> Co-authored-by: Richard Wall <[email protected]>
…nerSet in example Signed-off-by: Maël Valais <[email protected]>
I've addressed all the comments, good to go!
Address review feedback from @wallrj, @youngnick, @kflynn, and @jsoref. Signed-off-by: Maël Valais <[email protected]>
| Question | Answer |
|---|---|
| How can this feature be enabled or disabled for an existing cert-manager installation? | Feature gate `--feature-gates XGatewayAPI=true` will control enabling/disabling reconciling `XListenerSet` resources. |
In the blog post PR, I had originally shown an example of what the feature gate would look like. I had written the following snippet:
--enable-gateway-api \
--feature-gates XGatewayAPI=true
@ThatsMrTalbot, in cert-manager/website#1857 (comment), made the point that this feature gate doesn't reflect the actual feature:
Should the feature gate be XListenerSet? XGatewayAPI seems very generic and does not tell me what it does
Yeah XListenerSets would be a much better feature gate name
Thanks. I've changed it to:
--feature-gates XListenerSet=true
(I dropped the ending s since the gate is on the XListenerSet kind)
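As a rough sketch of where these flags would end up, here is a fragment of the controller Deployment's container args. The container name `cert-manager-controller` is an assumption about the installation, and `XListenerSet` is the gate name as proposed in this thread, so the name in a released version may differ. The `=` form is used because each YAML list item is passed as a single argument.

```yaml
# Fragment only: container args of the cert-manager controller Deployment.
# The container name is an assumption; the gate name is the one proposed
# in this thread and may change before release.
spec:
  template:
    spec:
      containers:
        - name: cert-manager-controller
          args:
            - --enable-gateway-api
            - --feature-gates=XListenerSet=true
```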
See this reddit comment from one of the maintainers of GW API for some context:
I am not sure that a Reddit comment is the reference you were looking for, but it at least gave me the context on why it currently is this way compared to the
As suggested by @jsoref and @ThatsMrTalbot, the feature gate name should be more specific and reflect what it actually does rather than being too generic. Signed-off-by: Maël Valais <[email protected]>
Explains the two main concerns that led to centralizing TLS config at the Gateway level:
- Traffic hijacking protection (preventing conflicting Ingress objects)
- Certificate cost concerns (certs were expensive when GW API was designed)
This provides important historical context for why ListenerSet emerged as the solution to restore developer self-service while maintaining security.
Signed-off-by: Maël Valais <[email protected]>
Addresses the nuance that Gateway API intentionally supports wildcard certificates with a secure design using ReferenceGrant for namespace isolation. The security concern arises specifically when cert-manager creates wildcard certificates without following this isolation model, allowing developers to read the private key directly. This clarification maintains the validity of the OWASP warning while acknowledging Gateway API's thoughtful security design.
Signed-off-by: Maël Valais <[email protected]>
@wilmardo Thanks for pointing me to that comment. I agree, Nick's comment on Reddit is helpful and sheds some light on why things are the way they are. It would be nice if this bit of information was added to Gateway API's "Key differences between Ingress API and Gateway API" page. I've added a paragraph with this important bit of context.
| **Traffic hijacking protection:** with the Ingress API, one team can accidentally or maliciously capture traffic intended for another team by creating an Ingress with the same hostname but different TLS configuration. This often happens in larger clusters with many teams, where conflicting Ingress objects can silently intercept traffic meant for other services.
|
| **Certificate cost concerns:** as [Nick Young explained](https://www.reddit.com/r/kubernetes/comments/1p613rp/comment/nqnlmh4/), when Gateway API was first designed, certificates were expensive assets bought from Verisign or similar providers, costing thousands of dollars each. You absolutely didn't want app developers touching or owning those certificates.
I've added this after reading the Reddit thread, FYI @youngnick
Thanks for that. I'll make a note that we need to update the differences page as well.
The "Differences" page is this one:
Indeed, that page should really have SEO for the word hijacking.
wallrj-cyberark left a comment
Thanks @maelvls and others for putting together the design. Can be refined if necessary as you proceed with the implementation.
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: kflynn, wallrj, wallrj-cyberark
Top-level issue: #8251
Rendered document: 20250703.gatewayapi-listenerset.md
Pull Request Motivation
I'd like to propose a design to address:
This design supersedes two designs:
Related:
/kind design
Release Note
CyberArk tracker: VC-46888