objstore: support use native OSS sdk to access #65610

Merged
ti-chi-bot[bot] merged 15 commits into pingcap:master from D3Hunter:oss-sdk on Jan 20, 2026

@D3Hunter D3Hunter commented Jan 16, 2026

What problem does this PR solve?

Issue Number: close #65461

Problem Summary:

What changed and how does it work?

as title
Changes:

  • Implements native OSS storage client with credential management and region detection
  • Updates storage routing logic to direct OSS URIs to the new native implementation
  • Adds redaction for OSS URLs

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

Ran the tests in store_test.go manually against real OSS; all passed.

Ran a small table import from OSS using role-arn:

mysql> set global tidb_cloud_storage_uri = 'oss://nextgen-some-bucket/tmp-dir?role-arn=acs:ram::00000:role/oss-assume';
Query OK, 0 rows affected (2.65 sec)

mysql> import into t from 'oss://nextgen-some-bucket/datasets/t*.csv?role-arn=acs:ram::00000:role/oss-assume' with detached;
+--------+-----------+---------------------------------------------------------------------------------------------+--------------+----------+-------+---------+------------------+---------------+----------------+----------------------------+------------+----------+------------+------------------+----------+-------------------------+---------------------+-----------------------+----------------+--------------+
| Job_ID | Group_Key | Data_Source                                                                                 | Target_Table | Table_ID | Phase | Status  | Source_File_Size | Imported_Rows | Result_Message | Create_Time                | Start_Time | End_Time | Created_By | Last_Update_Time | Cur_Step | Cur_Step_Processed_Size | Cur_Step_Total_Size | Cur_Step_Progress_Pct | Cur_Step_Speed | Cur_Step_ETA |
+--------+-----------+---------------------------------------------------------------------------------------------+--------------+----------+-------+---------+------------------+---------------+----------------+----------------------------+------------+----------+------------+------------------+----------+-------------------------+---------------------+-----------------------+----------------+--------------+
|      1 | NULL      | oss://nextgen-some-bucket/datasets/t*.csv?role-arn=acs:ram::00000:role/oss-assume | `test`.`t`   |        7 |       | pending | 10B              |          NULL |                | 2026-01-19 18:16:44.792871 | NULL       | NULL     | root@%     | NULL             | NULL     | NULL                    | NULL                | NULL                  | NULL           | NULL         |
+--------+-----------+---------------------------------------------------------------------------------------------+--------------+----------+-------+---------+------------------+---------------+----------------+----------------------------+------------+----------+------------+------------------+----------+-------------------------+---------------------+-----------------------+----------------+--------------+
1 row in set (6.03 sec)

mysql> show import job 1;
+--------+-----------+---------------------------------------------------------------------------------------------+--------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+----------------------------+----------+-------------------------+---------------------+-----------------------+----------------+--------------+
| Job_ID | Group_Key | Data_Source                                                                                 | Target_Table | Table_ID | Phase | Status   | Source_File_Size | Imported_Rows | Result_Message | Create_Time                | Start_Time                 | End_Time                   | Created_By | Last_Update_Time           | Cur_Step | Cur_Step_Processed_Size | Cur_Step_Total_Size | Cur_Step_Progress_Pct | Cur_Step_Speed | Cur_Step_ETA |
+--------+-----------+---------------------------------------------------------------------------------------------+--------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+----------------------------+----------+-------------------------+---------------------+-----------------------+----------------+--------------+
|      1 | NULL      | oss://nextgen-some-bucket/datasets/t*.csv?role-arn=acs:ram::00000:role/oss-assume | `test`.`t`   |        7 |       | finished | 10B              |             5 |                | 2026-01-19 18:16:44.792871 | 2026-01-19 18:16:45.365882 | 2026-01-19 18:17:10.394838 | root@%     | 2026-01-19 18:17:10.394838 | NULL     | NULL                    | NULL                | NULL                  | NULL           | NULL         |
+--------+-----------+---------------------------------------------------------------------------------------------+--------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+----------------------------+----------+-------------------------+---------------------+-----------------------+----------------+--------------+
1 row in set (0.00 sec)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 16, 2026

tiprow bot commented Jan 16, 2026

Hi @D3Hunter. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter D3Hunter marked this pull request as draft January 16, 2026 12:56
@D3Hunter D3Hunter marked this pull request as ready for review January 20, 2026 05:49
Copilot AI review requested due to automatic review settings January 20, 2026 05:49
@D3Hunter D3Hunter changed the title [wip]objstore: support use native OSS sdk to access objstore: support use native OSS sdk to access Jan 20, 2026
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 20, 2026

Copilot AI left a comment

Pull request overview

This PR adds support for using the native Alibaba Cloud OSS SDK to access OSS storage, as an alternative to the generic S3-compatible approach.

Changes:

  • Implements native OSS storage client with credential management and region detection
  • Updates storage routing logic to direct OSS URIs to the new native implementation
  • Adds comprehensive test coverage for OSS-specific functionality

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

File | Description
pkg/objstore/storage.go | Routes OSS provider to new native OSS storage implementation
pkg/objstore/s3like/store.go | Refactors CopyFrom to support OSS wrapper through interface checking
pkg/objstore/parse.go | Adds OSS scheme parsing to backend configuration
pkg/objstore/ossstore/store_test.go | Comprehensive test suite for OSS storage operations
pkg/objstore/ossstore/store.go | Core OSS storage implementation with credential management
pkg/objstore/ossstore/BUILD.bazel | Build configuration updates for new OSS storage module
pkg/objstore/BUILD.bazel | Adds ossstore dependency to objstore package

Copilot AI review requested due to automatic review settings January 20, 2026 06:14

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

			qs.Bucket, qs.Region, detectedRegion))
	}

	log.Info("succeed to get bucket region", zap.String("region", detectedRegion))

Copilot AI Jan 20, 2026

The log message should say "succeeded" rather than "succeed". While "succeed" is grammatically correct as an infinitive, the past tense is consistent with logging conventions for completed actions.

Suggested change
- log.Info("succeed to get bucket region", zap.String("region", detectedRegion))
+ log.Info("succeeded to get bucket region", zap.String("region", detectedRegion))

Comment on lines +69 to +74
// TODO changing the input backend is a side effect, it shouldn't be part of
// the NewXXX, but we have to do it here to keep compatibility now.
//
// OSS credential through assume role need refresh periodically, if we do
// send them out to TiKV, they also need to be refreshed, not sure how this
// works for BR now, we can add it later.

Copilot AI Jan 20, 2026

The TODO comment on lines 69-74 suggests that sending OSS credentials to TiKV is not currently supported but may need to be added later. However, the code on line 76 immediately returns an error when SendCredentials is true, making this a hard restriction. Consider adding more context to the TODO about when/why this might be needed, or if this is a permanent limitation that should be documented differently. This will help future maintainers understand whether this is a temporary workaround or a design decision.

Suggested change
- // TODO changing the input backend is a side effect, it shouldn't be part of
- // the NewXXX, but we have to do it here to keep compatibility now.
- //
- // OSS credential through assume role need refresh periodically, if we do
- // send them out to TiKV, they also need to be refreshed, not sure how this
- // works for BR now, we can add it later.
+ // NOTE: changing the input backend is a side effect, it shouldn't be part of
+ // the NewXXX, but we have to do it here to keep compatibility with existing
+ // callers that expect backend.AccessKey / SecretAccessKey to be cleared.
+ //
+ // OSS credentials obtained via AssumeRole need to be refreshed periodically.
+ // If we were to send credentials to TiKV, TiKV would also need a mechanism
+ // to refresh or re-fetch them before expiry. BR / TiKV do not currently have
+ // a protocol for propagating refreshed OSS credentials, so sending them
+ // would either be unsafe (risk of using expired credentials) or require a
+ // larger design change.
+ //
+ // As a result, sending OSS credentials to TiKV is intentionally disabled
+ // here. If we ever add support for this, it must come with a well-defined
+ // credential refresh / rotation mechanism and should be documented as such.

		GetBucketPrefix() storeapi.BucketPrefix
	})
	if !ok {
		return errors.Annotatef(berrors.ErrStorageInvalidConfig, "CopyFrom is only supported by S3 storage, get %T", inStore)

Copilot AI Jan 20, 2026

The CopyFrom method now uses a duck-typed interface check instead of a concrete type assertion. While this works, the error message still says "CopyFrom is only supported by S3 storage" which is now misleading since it also supports OSS storage. The error message should be updated to reflect that it supports S3-like storage implementations.

Suggested change
- return errors.Annotatef(berrors.ErrStorageInvalidConfig, "CopyFrom is only supported by S3 storage, get %T", inStore)
+ return errors.Annotatef(berrors.ErrStorageInvalidConfig, "CopyFrom is only supported by S3-like storage implementations, got %T", inStore)

Comment on lines +124 to +182
		credRefresher = newCredentialRefresher(provider, log.L().With(
			zap.String("bucket", qs.GetBucket()),
			zap.String("prefix", qs.GetPrefix()),
		))
		if err := credRefresher.refreshOnce(); err != nil {
			return nil, errors.Annotatef(err, "failed to get initial OSS credentials")
		}
		ossCfg = ossCfg.WithCredentialsProvider(credRefresher)
	}

	if opts.AccessRecording != nil {
		ossOptFns = append(ossOptFns, func(o *oss.Options) {
			o.ResponseHandlers = append(o.ResponseHandlers, func(resp *http.Response) error {
				// nolint:bodyclose
				opts.AccessRecording.RecRequest(resp.Request)
				return nil
			})
		})
	}

	// get bucket location or check the specified region is correct
	getLocCfg := &(*ossCfg)
	if qs.Region == "" {
		getLocCfg = getLocCfg.WithRegion(defaultRegion)
	} else {
		getLocCfg = getLocCfg.WithRegion(qs.Region)
	}
	ossCli := oss.NewClient(getLocCfg, ossOptFns...)
	resp, err := ossCli.GetBucketLocation(ctx, &oss.GetBucketLocationRequest{Bucket: oss.Ptr(qs.Bucket)})
	if err != nil {
		return nil, errors.Annotatef(err, "failed to get location of bucket %s", qs.Bucket)
	}

	detectedRegion := trimOSSRegionID(tea.StringValue(resp.LocationConstraint))
	if qs.Region != "" && detectedRegion != qs.Region {
		return nil, errors.Trace(fmt.Errorf("bucket and region are not matched, bucket=%s, input region=%s, real region=%s",
			qs.Bucket, qs.Region, detectedRegion))
	}

	log.Info("succeed to get bucket region", zap.String("region", detectedRegion))

	qs.Prefix = storeapi.NewPrefix(qs.Prefix).String()
	bucketPrefix := storeapi.NewBucketPrefix(qs.Bucket, qs.Prefix)
	ossCfg = ossCfg.WithRegion(detectedRegion)

	cli := &client{
		svc:          oss.NewClient(ossCfg, ossOptFns...),
		BucketPrefix: bucketPrefix,
		options:      &qs,
	}
	if err := s3like.CheckPermissions(ctx, cli, opts.CheckPermissions); err != nil {
		return nil, errors.Annotatef(berrors.ErrStorageInvalidPermission, "check permission failed due to %v", err)
	}

	if credRefresher != nil {
		if err = credRefresher.startRefresh(); err != nil {
			return nil, errors.Annotatef(err, "failed to start OSS credential refresher")
		}
	}

Copilot AI Jan 20, 2026

The credential refresher may leak resources if the function returns an error after line 128 but before line 178. The refresher has already called refreshOnce() and initialized credentials, but if an error occurs in GetBucketLocation (line 152), region mismatch check (line 158-160), or CheckPermissions (line 174-175), the function returns without calling credRefresher.close(). Consider adding a defer statement after credRefresher creation to clean up on error, or ensure proper cleanup in all error paths.

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's OK to skip close here, because the refresh goroutine has not been started yet on those error paths.
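
The region handling in the excerpt under review (detect the bucket's location, then compare it with any region given in the URI) can be sketched roughly as below. The helper names are hypothetical, and the assumption that the location constraint carries an `oss-` prefix (for example `oss-cn-hangzhou`) that is trimmed to obtain the region ID is a guess about what trimOSSRegionID does; the PR's implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// trimRegionID turns a location constraint into a region ID.
// Assumption: locations look like "oss-cn-hangzhou" and the region ID is the
// part after the "oss-" prefix.
func trimRegionID(location string) string {
	return strings.TrimPrefix(location, "oss-")
}

// resolveRegion returns the detected region. When the caller supplied a
// region, it must match the detected one; an empty input region means
// "auto-detect" and is always accepted.
func resolveRegion(bucket, inputRegion, location string) (string, error) {
	detected := trimRegionID(location)
	if inputRegion != "" && detected != inputRegion {
		return "", fmt.Errorf(
			"bucket and region are not matched, bucket=%s, input region=%s, real region=%s",
			bucket, inputRegion, detected)
	}
	return detected, nil
}

func main() {
	region, err := resolveRegion("some-bucket", "", "oss-cn-hangzhou")
	fmt.Println(region, err)
	_, err = resolveRegion("some-bucket", "cn-beijing", "oss-cn-hangzhou")
	fmt.Println(err)
}
```

Failing fast on a mismatch surfaces a wrong region parameter at store creation time instead of as an opaque request error later.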

@D3Hunter
Contributor Author

/retest


codecov bot commented Jan 20, 2026

Codecov Report

❌ Patch coverage is 2.73973% with 142 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.6972%. Comparing base (e15515e) to head (3ced329).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #65610        +/-   ##
================================================
- Coverage   77.8228%   77.6972%   -0.1256%     
================================================
  Files          1989       1916        -73     
  Lines        543383     532702     -10681     
================================================
- Hits         422876     413895      -8981     
+ Misses       118848     118798        -50     
+ Partials       1659          9      -1650     
Flag Coverage Δ
integration 41.6886% <0.0000%> (-6.4995%) ⬇️
unit 76.8245% <2.7397%> (+0.3736%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 56.7974% <ø> (ø)
parser ∅ <ø> (∅)
br 48.7449% <ø> (-12.3064%) ⬇️

@D3Hunter
Contributor Author

/retest

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jan 20, 2026
@D3Hunter
Contributor Author

/retest


tiprow bot commented Jan 20, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jan 20, 2026

ti-chi-bot bot commented Jan 20, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-01-20 08:25:29.411504938 +0000 UTC m=+482357.025461784: ☑️ agreed by joechenrh.
  • 2026-01-20 09:07:22.168026105 +0000 UTC m=+484869.781982961: ☑️ agreed by Leavrth.

@D3Hunter
Contributor Author

/approve


ti-chi-bot bot commented Jan 20, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: D3Hunter, joechenrh, Leavrth

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Jan 20, 2026
@D3Hunter
Contributor Author

/retest


tiprow bot commented Jan 20, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter
Contributor Author

/retest


tiprow bot commented Jan 20, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter
Contributor Author

/retest


tiprow bot commented Jan 20, 2026

@D3Hunter: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot merged commit 812bd31 into pingcap:master Jan 20, 2026
29 checks passed
@D3Hunter D3Hunter deleted the oss-sdk branch January 20, 2026 16:20

Labels

approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

objstore: support access OSS using native SDK

4 participants