importsdk: redact sensitive source params in outward-facing errors #67719

Merged
ti-chi-bot[bot] merged 5 commits into pingcap:master from GMHDBJD:tidb2-importsdk-redact-20260413
Apr 13, 2026

@GMHDBJD GMHDBJD commented Apr 13, 2026

What problem does this PR solve?

Issue Number: close #67718

Problem Summary:

pkg/importsdk wrapped several outward-facing errors with the raw import source path. When the source path contained secret query parameters such as access-key, secret-access-key, session-token, or sas-token, those values were leaked in the returned error string.

What changed and how does it work?

  • Add a cached redactedSourcePath to fileScanner and initialize it once in NewFileScanner.
  • Route outward-facing source=%s error annotations through the redacted path for initialization, schema creation, and size-estimation failures.
  • Make the redaction best-effort by applying a second sensitive-parameter scrub after ast.RedactURL, so malformed URLs and unsupported schemes do not bypass masking.
  • Add regression tests covering both initialization failures and schema-restore failures to ensure secret parameters are not exposed in returned errors.
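
The second-pass scrub described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `scrubSensitiveParams`, `annotate`, and the regular expression are hypothetical names and choices. The parameter list comes from the problem summary and the `xxxxxx` mask from the PR's tests. The first pass through `ast.RedactURL` is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// sensitiveParamPattern matches "?key=value" or "&key=value" for the secret
// query parameters named in the PR description. It is an assumption made for
// this sketch, not the pattern used in pkg/importsdk.
var sensitiveParamPattern = regexp.MustCompile(
	`(?i)([?&](?:access-key|secret-access-key|session-token|sas-token)=)[^&#\s]*`,
)

// scrubSensitiveParams (hypothetical helper) masks the values of sensitive
// query parameters. It works on the raw string, so it does not depend on the
// input being a well-formed URL.
func scrubSensitiveParams(raw string) string {
	return sensitiveParamPattern.ReplaceAllString(raw, "${1}xxxxxx")
}

// annotate (hypothetical helper) attaches the scrubbed source to an error,
// in the spirit of the source=%s annotations mentioned above.
func annotate(err error, sourcePath string) error {
	return fmt.Errorf("%w (source=%s)", err, scrubSensitiveParams(sourcePath))
}

func main() {
	// "%zz" is an invalid percent-escape, so strict URL parsing would fail here.
	src := "s3://bucket/%zz?access-key=ak&secret-access-key=sk&session-token=token"
	fmt.Println(annotate(errors.New("failed to parse storage backend"), src))
}
```

Because the pattern operates on the raw string rather than a parsed URL, it still masks values when the path fails to parse, which is the case the third bullet calls out.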

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Manual test details:

  • In a temporary upstream/master worktree, apply the new regression tests and run:
    • go test -run 'TestFileScanner/(NewFileScannerRedactsSensitiveSourcePathInInitErrors|CreateSchemasAndTablesRedactsSensitiveSourcePathOnError)$' -tags=intest,deadlock ./pkg/importsdk
    • Confirm the tests fail before the fix because raw secret parameters appear in the error text.
  • In this branch, run:
    • go test -run 'TestFileScanner/(NewFileScannerRedactsSensitiveSourcePathInInitErrors|CreateSchemasAndTablesRedactsSensitiveSourcePathOnError)$' -tags=intest,deadlock ./pkg/importsdk
    • go test -tags=intest,deadlock ./pkg/importsdk
    • make lint

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

  • Bug Fixes
    • Sensitive credentials in source URLs are now consistently redacted in error messages across initialization, schema/table discovery, loader creation, and import size estimation — including malformed/parse-failure cases.
  • Tests
    • Added tests and a helper to verify credential query parameters are masked in initialization and runtime errors (including schema discovery failures) and updated test dependencies to support these cases.

@ti-chi-bot ti-chi-bot bot added the do-not-merge/needs-triage-completed and release-note-none (Denotes a PR that doesn't merit a release note.) labels Apr 13, 2026

pantheon-ai bot commented Apr 13, 2026

Review Complete

Findings: 0 issues
Posted: 0
Duplicates/Skipped: 0

@ti-chi-bot ti-chi-bot bot added the size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) label Apr 13, 2026
@GMHDBJD GMHDBJD added the type/bugfix (This PR fixes a bug.), sig/migrate, component/import, and component/lightning (This issue is related to Lightning of TiDB.) labels Apr 13, 2026

tiprow bot commented Apr 13, 2026

Hi @GMHDBJD. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.


coderabbitai bot commented Apr 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c708b2f9-ea0b-4886-a634-2ee22fdfeed7

📥 Commits

Reviewing files that changed from the base of the PR and between 26067ce and be0688a.

📒 Files selected for processing (2)
  • pkg/importsdk/file_scanner.go
  • pkg/importsdk/file_scanner_test.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/importsdk/file_scanner.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/importsdk/file_scanner_test.go

📝 Walkthrough

Replaces outward-facing uses of raw source paths with a redacted value on fileScanner (from ast.RedactURL) and updates error annotations to expose only the redacted source. Adds tests validating credential query-parameter redaction and updates a test dependency.

Changes

Cohort / File(s) | Summary
File scanner core: pkg/importsdk/file_scanner.go | Adds redactedSourcePath to fileScanner, computes it in NewFileScanner via ast.RedactURL(sourcePath), and updates error annotations to use the redacted path or a parsed-error-specific redacted value instead of the raw sourcePath.
Tests & test deps: pkg/importsdk/file_scanner_test.go, pkg/importsdk/BUILD.bazel | Adds tests asserting sensitive query params (access-key, secret-access-key, session-token) are redacted in init/runtime errors; introduces MySQL driver test dependency for mocking MySQL errors.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • joechenrh
  • D3Hunter

Poem

I'm a rabbit with a coder's knack,
I nip the secrets in the stack,
Query keys turned into x's, neat,
Errors tidy from head to feet,
Hopping off with a carrot treat 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name | Status | Explanation
Title check | ✅ Passed | The title accurately describes the main change: redacting sensitive source parameters in outward-facing errors from importsdk.
Description check | ✅ Passed | The description follows the template and includes issue reference (#67718), problem summary, detailed explanation of changes, test coverage with manual test instructions, and side-effects checklist.
Linked Issues check | ✅ Passed | The PR fully addresses issue #67718 by implementing cached redacted source paths, routing error annotations through redacted paths, and adding comprehensive regression tests for both initialization and schema-creation failures.
Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing the credential-leak issue in pkg/importsdk: source redaction in file_scanner.go, regression tests in file_scanner_test.go, and test dependency addition in BUILD.bazel.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

Command failed



@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
pkg/importsdk/file_scanner_test.go (1)

156-158: Prefer deriving the expected redacted source in test setup.

At Line 157, hardcoding the masked URL duplicates redaction logic and can drift when redaction rules evolve.

♻️ Suggested refactor
-		fs.sourcePath = "s3://bucket/path?access-key=ak&secret-access-key=sk&session-token=token"
-		fs.redactedSourcePath = "s3://bucket/path?access-key=xxxxxx&secret-access-key=xxxxxx&session-token=xxxxxx"
+		fs.sourcePath = "s3://bucket/path?access-key=ak&secret-access-key=sk&session-token=token"
+		fs.redactedSourcePath = redactSourcePath(fs.sourcePath)

As per coding guidelines, "Code SHOULD remain maintainable for future readers with basic TiDB familiarity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/importsdk/file_scanner_test.go` around lines 156 - 158, Replace the
hardcoded masked URL by deriving fs.redactedSourcePath from fs.sourcePath using
the package's redaction utility instead of duplicating masking logic;
specifically, set fs.redactedSourcePath = RedactSourcePath(fs.sourcePath) (or
call the existing RedactCredentials/RedactSource helper present in the package)
so the test uses the same redaction codepath as production and won't drift as
rules change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 32d2c6a2-eb3b-4000-9ce8-610353a04189

📥 Commits

Reviewing files that changed from the base of the PR and between a83fcdb and 605f512.

📒 Files selected for processing (2)
  • pkg/importsdk/file_scanner.go
  • pkg/importsdk/file_scanner_test.go

@pantheon-ai pantheon-ai bot left a comment

✅ Code looks good. No issues found.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
pkg/importsdk/file_scanner_test.go (1)

131-136: Extract duplicated redaction assertions into a helper.

Both new subtests repeat the same secret-masking checks; a small helper would keep this easier to maintain.

♻️ Optional refactor
+	assertSecretsRedacted := func(t *testing.T, err error) {
+		t.Helper()
+		require.Error(t, err)
+		require.ErrorContains(t, err, "access-key=xxxxxx")
+		require.ErrorContains(t, err, "secret-access-key=xxxxxx")
+		require.ErrorContains(t, err, "session-token=xxxxxx")
+		require.NotContains(t, err.Error(), "access-key=ak")
+		require.NotContains(t, err.Error(), "secret-access-key=sk")
+		require.NotContains(t, err.Error(), "session-token=token")
+	}
+
 	t.Run("NewFileScannerRedactsSensitiveSourcePathInInitErrors", func(t *testing.T) {
 		_, err := NewFileScanner(
 			ctx,
 			"s3://?access-key=ak&secret-access-key=sk&session-token=token",
 			db,
 			cfg,
 		)
-		require.Error(t, err)
-		require.ErrorContains(t, err, "access-key=xxxxxx")
-		require.ErrorContains(t, err, "secret-access-key=xxxxxx")
-		require.ErrorContains(t, err, "session-token=xxxxxx")
-		require.NotContains(t, err.Error(), "access-key=ak")
-		require.NotContains(t, err.Error(), "secret-access-key=sk")
-		require.NotContains(t, err.Error(), "session-token=token")
+		assertSecretsRedacted(t, err)
 	})
...
 		err = invalidScanner.CreateSchemasAndTables(ctx)
-		require.Error(t, err)
-		require.ErrorContains(t, err, "access-key=xxxxxx")
-		require.ErrorContains(t, err, "secret-access-key=xxxxxx")
-		require.ErrorContains(t, err, "session-token=xxxxxx")
-		require.NotContains(t, err.Error(), "access-key=ak")
-		require.NotContains(t, err.Error(), "secret-access-key=sk")
-		require.NotContains(t, err.Error(), "session-token=token")
+		assertSecretsRedacted(t, err)
 		require.ErrorContains(t, err, "invalid schema statement")
 		require.NoError(t, invalidMock.ExpectationsWereMet())
 	})

Also applies to: 167-172

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/importsdk/file_scanner_test.go` around lines 131 - 136, Extract the
repeated secret-masking assertions in pkg/importsdk/file_scanner_test.go into a
small test helper (e.g., func assertSecretsRedacted(t *testing.T, err error))
and call it from the two subtests instead of duplicating the five require.*
checks; update both the blocks that currently assert
access-key/secret-access-key/session-token redaction (the block around the shown
diff and the similar block at the later occurrence) to call this helper,
preserving the same require.ErrorContains and require.NotContains checks inside
the helper.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 08bfbb8f-e187-445a-b5fb-54d227baf816

📥 Commits

Reviewing files that changed from the base of the PR and between 605f512 and 7cbdaef.

📒 Files selected for processing (3)
  • pkg/importsdk/BUILD.bazel
  • pkg/importsdk/file_scanner.go
  • pkg/importsdk/file_scanner_test.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/importsdk/BUILD.bazel
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/importsdk/file_scanner.go


codecov bot commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.4439%. Comparing base (a83fcdb) to head (be0688a).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #67719        +/-   ##
================================================
- Coverage   77.6233%   77.4439%   -0.1794%     
================================================
  Files          1981       1965        -16     
  Lines        548326     548346        +20     
================================================
- Hits         425629     424661       -968     
- Misses       121887     123683      +1796     
+ Partials        810          2       -808     

Flag | Coverage Δ
integration | 40.9150% <ø> (+6.5753%) ⬆️
unit | 76.6667% <75.0000%> (+0.2974%) ⬆️

Flags with carried forward coverage won't be shown.

Components | Coverage Δ
dumpling | 61.5065% <ø> (ø)
parser | ∅ <ø> (∅)
br | 49.9460% <ø> (-10.5236%) ⬇️


GMHDBJD commented Apr 13, 2026

/test mysql-test


tiprow bot commented Apr 13, 2026

@GMHDBJD: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.


GMHDBJD commented Apr 13, 2026

/ok-to-test

@ti-chi-bot ti-chi-bot bot added the ok-to-test (Indicates a PR is ready to be tested.) label Apr 13, 2026
@ingress-bot

🔍 Starting code review for this PR...

@ingress-bot ingress-bot left a comment

This review was generated by AI and should be verified by a human reviewer.
Manual follow-up is recommended before merge.

Summary

  • Total findings: 4
  • Inline comments: 4
  • Summary-only findings (no inline anchor): 0
Findings (highest risk first)

⚠️ [Major] (1)

  1. Malformed storage URLs bypass source-path redaction in parse-error path (pkg/importsdk/file_scanner.go:62, pkg/parser/ast/misc.go:3818)

🟡 [Minor] (2)

  1. fileScanner keeps parallel source-path state with one dead representation (pkg/importsdk/file_scanner.go:51, pkg/importsdk/file_scanner_test.go:162)
  2. Parse-backend error branch changes error detail contract without intent documentation (pkg/importsdk/file_scanner.go:65, pkg/importsdk/file_scanner.go:69)

🧹 [Nit] (1)

  1. Subtest name overstates redaction coverage across initialization errors (pkg/importsdk/file_scanner_test.go:133)

Comment thread pkg/importsdk/file_scanner.go
Comment thread pkg/importsdk/file_scanner.go Outdated
Comment thread pkg/importsdk/file_scanner.go Outdated
Comment thread pkg/importsdk/file_scanner_test.go Outdated
@ti-chi-bot ti-chi-bot bot added the approved and needs-1-more-lgtm (Indicates a PR needs 1 more LGTM.) labels Apr 13, 2026

ti-chi-bot bot commented Apr 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: D3Hunter, joechenrh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the lgtm label and removed the needs-1-more-lgtm (Indicates a PR needs 1 more LGTM.) label Apr 13, 2026

ti-chi-bot bot commented Apr 13, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-04-13 09:24:58 UTC: ☑️ agreed by D3Hunter.
  • 2026-04-13 09:29:01 UTC: ☑️ agreed by joechenrh.

@ti-chi-bot ti-chi-bot bot merged commit 2cd71a2 into pingcap:master Apr 13, 2026
35 checks passed
@GMHDBJD GMHDBJD deleted the tidb2-importsdk-redact-20260413 branch April 13, 2026 10:15