Skip to content

Default capture proxy to TLS with self-signed cert#2717

Open
jugal-chauhan wants to merge 7 commits intoopensearch-project:mainfrom
jugal-chauhan:fix-proxy-default-to-tls
Open

Default capture proxy to TLS with self-signed cert#2717
jugal-chauhan wants to merge 7 commits intoopensearch-project:mainfrom
jugal-chauhan:fix-proxy-default-to-tls

Conversation

@jugal-chauhan
Copy link
Copy Markdown
Collaborator

@jugal-chauhan jugal-chauhan commented Apr 14, 2026

Description

Follows the same secure-by-default pattern established for Kafka in #2412. When a user configures a capture proxy without specifying a TLS block, the config transformer now injects a self-signed cert-manager TLS configuration using the migrations-ca ClusterIssuer. Users can also explicitly opt out with tls: {mode: "plaintext"}.

Changes

Schema (userSchemas.ts):
  • Added plaintext mode to PROXY_TLS_CONFIG discriminated union. This will now be the only way to run the proxy without TLS
Config transformer (migrationConfigTransformer.ts):
  • defaultProxyTlsConfig() will generate a certManager TLS config with *.svc.cluster.local + proxy name as SANs
  • During normalization, absent TLS (with no legacy sslConfigFile) gets the default injected
  • plaintext mode is stripped before passing to Argo, preserving existing HTTP behavior
Helm (values.yaml, issuerConfigmap.yaml, migrationConsole.yaml):
  • Issuer name/kind exposed as configurable values with migrations-ca / ClusterIssuer defaults
  • ConfigMap + env var injection into migration-console pod so the transformer can read the issuer config at runtime
  • Hardcoded fallback in the transformer ensures it works in tests and local dev without the configmap

Impact of these changes

  • Existing proxy configs without tls will now get TLS. This is intentional (secure-by-default)
  • Anyone currently relying on plaintext proxy behavior must add tls: {mode: "plaintext"} to preserve the old behavior
  • sslConfigFile is respected. If the legacy sslConfigFile option is set, no default TLS is injected (they're mutually exclusive per the existing schema validation)
  • Namespace in SANs: Uses *.svc.cluster.local rather than *.ma.svc.cluster.local to avoid hardcoding the namespace
  • Requires cert-manager: The default TLS injection assumes cert-manager is installed and the migrations-ca ClusterIssuer exists (created by the selfSignedClusterIssuer helm hook). Deployments without cert-manager that use proxies will need tls: {mode: "plaintext"}

Issues Resolved

MIGRATIONS-3029

Testing

  • Updated existing proxy endpoint test (now expects https:// and allowInsecure: true)
  • Added traffic-plaintext-tls.json valid fixture for the plaintext opt-out path
  • Refreshed argo schema snapshots

Check List

  • New functionality includes testing
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Jugal Chauhan <jugaldc@amazon.com>
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.43%. Comparing base (abe6949) to head (b19ae03).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2717      +/-   ##
============================================
+ Coverage     73.28%   73.43%   +0.15%     
  Complexity      106      106              
============================================
  Files           721      721              
  Lines         33372    33372              
  Branches       2910     2910              
============================================
+ Hits          24457    24508      +51     
+ Misses         7582     7533      -49     
+ Partials       1333     1331       -2     
Flag Coverage Δ
gradle 69.85% <ø> (+0.22%) ⬆️
node 90.97% <ø> (ø)
python 77.77% <ø> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

kind: (process.env.PROXY_DEFAULT_ISSUER_KIND || "ClusterIssuer") as "ClusterIssuer" | "Issuer",
},
dnsNames: [
"*.svc.cluster.local",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this works, we need the namespace in the path here since IIRC * only goes one level in SAN

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use

dnsNames: [                                                                                                  
    proxyName,                                                                                               
    `${proxyName}.ma`,                                                                                       
    `${proxyName}.ma.svc.cluster.local`,                                                                     
],   

The alternative is to just use the short name and let clients connect with allow_insecure: true (which the transformer already sets when TLS is present). Since this is a self signed CA, clients won't trust it regardless without either allow_insecure or explicitly trusting the CA

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you suggest an approach in which we do not have to hardcode ma namespace ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we need to also include the NLB domain name (or trust all) in AWS

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Included in the latest commit

Signed-off-by: Jugal Chauhan <jugaldc@amazon.com>
@jugal-chauhan jugal-chauhan added the run-cdc-tests This label will run CDC test relevant jobs on Jenkins label Apr 15, 2026
@jugal-chauhan
Copy link
Copy Markdown
Collaborator Author

This PR will fail on cdc tests until #2721

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

run-cdc-tests This label will run CDC test relevant jobs on Jenkins run-eks-tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants