pkg/ddl, pkg/tici: add TiCI pre-split for DDL global sort ingest | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts by 3pointer · Pull Request #67313 · pingcap/tidb

3pointer · 2026-03-26T02:55:06Z

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

When TiDB builds FULLTEXT or HYBRID indexes with the TiCI backend in global sort mode, TiCI does not get an early pre-split signal before the write-and-ingest stage starts. As a result, TiCI cannot use the aggregated SortedKV metadata to analyze shard distribution and split internal shards ahead of ingest.

In addition, the TiDB side does not define the corresponding PreSplitImportShards RPC yet, so the pre-split request cannot be sent with the required metadata.

What changed and how does it work?

This PR is scoped to DDL backfill only (ADD FULLTEXT INDEX / ADD HYBRID INDEX). IMPORT INTO is not supported in this PR; follow-up changes can be made separately if needed.

This PR adds a TiCI pre-split hook in generateGlobalSortIngestPlan for TiCI-backed FULLTEXT and HYBRID index jobs.

The main changes are:

- Add PreSplitImportShards RPC and related request/response messages to pkg/tici/tici.proto, then regenerate pkg/tici/tici.pb.go.
- Add TiCI client support in pkg/tici/tici_manager_client.go to call PreSplitImportShards, including keyspace propagation and test failpoint hooks.
- In generateGlobalSortIngestPlan, detect ActionAddFullTextIndex and ActionAddHybridIndex, aggregate merged SortedKVMeta groups, and build a TiCI pre-split request with:
  - global start_key / end_key
  - total_kv_size / total_kv_cnt
  - data_file_count / stat_file_count
  - per-group metadata in meta_groups
- Call the TiCI pre-split RPC synchronously with a 1 minute timeout before generating the final ingest plan.
- If the TiCI pre-split call fails, log the error and degrade to the existing global-sort ingest flow instead of failing the whole DDL job.
- Add a guard in FULLTEXT index creation to require cloud storage, because TiCI pre-split only applies to the global-sort ingest path.
- Add unit tests for both the DDL scheduler flow and the TiCI client request path.
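
To make the request-building step above concrete, here is a minimal sketch of folding per-group metadata into the request-level fields. The types `metaGroup` and `preSplitSummary` and the function `summarize` are hypothetical stand-ins, not the types used in this PR, and counting each file once across groups is an assumption based on the "deduplicate file counts" step in the walkthrough.

```go
package main

import (
	"bytes"
	"fmt"
)

// metaGroup is a hypothetical stand-in for one merged SortedKVMeta group.
type metaGroup struct {
	StartKey, EndKey []byte
	KVSize, KVCnt    uint64
	DataFiles        []string
	StatFiles        []string
}

// preSplitSummary is a hypothetical mirror of the request fields listed above.
type preSplitSummary struct {
	StartKey, EndKey             []byte
	TotalKVSize, TotalKVCnt      uint64
	DataFileCount, StatFileCount int32
	MetaGroups                   []metaGroup
}

// countUnique records files in seen and returns the number of distinct files so far.
func countUnique(seen map[string]struct{}, files []string) int32 {
	for _, f := range files {
		seen[f] = struct{}{}
	}
	return int32(len(seen))
}

// summarize folds all groups into one request-level summary:
// smallest start key, largest end key, summed KV stats, distinct file counts.
func summarize(groups []metaGroup) preSplitSummary {
	s := preSplitSummary{MetaGroups: groups}
	dataSeen := map[string]struct{}{}
	statSeen := map[string]struct{}{}
	for i, g := range groups {
		if i == 0 || bytes.Compare(g.StartKey, s.StartKey) < 0 {
			s.StartKey = g.StartKey
		}
		if i == 0 || bytes.Compare(g.EndKey, s.EndKey) > 0 {
			s.EndKey = g.EndKey
		}
		s.TotalKVSize += g.KVSize
		s.TotalKVCnt += g.KVCnt
		s.DataFileCount = countUnique(dataSeen, g.DataFiles)
		s.StatFileCount = countUnique(statSeen, g.StatFiles)
	}
	return s
}

func main() {
	s := summarize([]metaGroup{
		{StartKey: []byte("k"), EndKey: []byte("t"), KVSize: 300, KVCnt: 3,
			DataFiles: []string{"d1", "d2"}, StatFiles: []string{"s1"}},
		{StartKey: []byte("a"), EndKey: []byte("k"), KVSize: 200, KVCnt: 2,
			DataFiles: []string{"d2"}, StatFiles: []string{"s1", "s2"}},
	})
	fmt.Printf("start=%s end=%s size=%d cnt=%d data=%d stat=%d\n",
		s.StartKey, s.EndKey, s.TotalKVSize, s.TotalKVCnt, s.DataFileCount, s.StatFileCount)
}
```

The sketch does not assume the groups arrive sorted, which is why it compares keys instead of taking the first start key and the last end key.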

Check List

Tests

Unit test

Suggested / used test commands:
- make failpoint-enable && (cd pkg/ddl && go test -run TestBackfillingSchedulerGlobalSortModeTiCIPreSplit --tags=intest; rc=$?; cd ../..; make failpoint-disable; exit $rc)
- go test -run TestPreSplitImportShards --tags=intest ./pkg/tici
- go test -run TestPreSplitImportShardsMock --tags=intest ./pkg/tici
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Support calling TiCI PreSplitImportShards before global-sort ingest for FULLTEXT and HYBRID index backfill jobs.


<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

## Release Notes

* **New Features**
  * Added pre-split import shards support for full-text and hybrid index creation to optimize shard distribution during index backfilling operations.
  * Enhanced distributed backfilling scheduler with TiCI integration for improved metadata-driven planning.

* **Tests**
  * Added comprehensive test coverage for pre-split import shards functionality and backfilling scheduler workflows.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->


ti-chi-bot · 2026-03-26T02:55:10Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai · 2026-03-26T02:55:14Z

Walkthrough

The changes integrate a new TiCI PreSplitImportShards RPC into the DDL backfilling scheduler to support pre-splitting import shards before full index building. The RangeSplitter API is extended to return group size and key count metrics, enabling accurate tracking of split groups. TiCI client code is added with failpoint test support, and DDL backfilling logic now conditionally invokes TiCI pre-split for full-text and hybrid index jobs.

Changes

| Cohort / File(s) | Summary |
|---|---|
| TiCI Proto & RPC Definition `pkg/tici/tici.proto` | Added `PreSplitImportShards` RPC and supporting message types (`PreSplitImportShardsRequest`, `PreSplitImportShardsResponse`, `PreSplitImportShardMeta`, `PreSplitImportIndexResult`) to enable meta-driven import shard pre-splitting. |
| TiCI Client & Manager `pkg/tici/tici_manager_client.go`, `pkg/tici/tici_manager_client_test.go`, `pkg/tici/BUILD.bazel` | Implemented `PreSplitImportShards` method with failpoint interception for testing, exposed test helpers for request capture, added keyspace codec interface, and updated build dependencies and test sharding. |
| RangeSplitter Signature & Tests `pkg/lightning/backend/external/split.go`, `pkg/lightning/backend/external/split_test.go`, `pkg/lightning/backend/external/testutil.go`, `pkg/lightning/backend/external/merge_v2.go`, `pkg/dxf/importinto/planner.go` | Extended `SplitOneRangesGroup()` to return `groupSize` and `groupKeyCnt`; updated all call sites to handle expanded return values; adjusted test assertions to validate new metrics. |
| DDL Backfilling TiCI Integration `pkg/ddl/backfilling_dist_scheduler.go`, `pkg/ddl/backfilling_dist_scheduler_internal_test.go`, `pkg/ddl/backfilling_dist_scheduler_test.go`, `pkg/ddl/BUILD.bazel` | Added `storageWithPDAndCodec` interface validation, conditional TiCI pre-split execution for specific job types, range-group aggregation logic (1GiB grouping), failpoint hooks, and comprehensive test coverage including mock store validation and TiCI request assertion. |
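
As an illustration of the "1GiB grouping" mentioned in the last row, the sketch below merges consecutive split groups until the accumulated size reaches a threshold. The `group` type, the `aggregate` function, and the exact cut-off rule (close a report group once its size is at or above the threshold) are assumptions for illustration; the PR's actual aggregation logic may differ.

```go
package main

import "fmt"

// reportGroupSize is the 1GiB target named in the summary table.
const reportGroupSize uint64 = 1 << 30

// group is a hypothetical record for one split group, using the
// groupSize / groupKeyCnt metrics that SplitOneRangesGroup now returns.
type group struct {
	EndKey []byte
	Size   uint64
	KeyCnt uint64
}

// aggregate merges consecutive groups into report groups of at least
// threshold bytes; the final report group may be smaller.
func aggregate(groups []group, threshold uint64) []group {
	var out []group
	var cur group
	pending := false
	for _, g := range groups {
		cur.EndKey = g.EndKey // the merged group ends where its last member ends
		cur.Size += g.Size
		cur.KeyCnt += g.KeyCnt
		pending = true
		if cur.Size >= threshold {
			out = append(out, cur)
			cur = group{}
			pending = false
		}
	}
	if pending {
		out = append(out, cur)
	}
	return out
}

func main() {
	const mib = 1 << 20
	in := []group{
		{EndKey: []byte("c"), Size: 600 * mib, KeyCnt: 6},
		{EndKey: []byte("f"), Size: 600 * mib, KeyCnt: 6},
		{EndKey: []byte("z"), Size: 100 * mib, KeyCnt: 1},
	}
	for _, g := range aggregate(in, reportGroupSize) {
		fmt.Printf("end=%s size=%dMiB keys=%d\n", g.EndKey, g.Size/mib, g.KeyCnt)
	}
}
```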

Sequence Diagram

sequenceDiagram
    participant Scheduler as DDL Backfilling<br/>Scheduler
    participant Storage as Storage<br/>(PD + Codec)
    participant RangeSplit as RangeSplitter
    participant TiCI as TiCI Client
    participant MetaService as TiCI Meta<br/>Service

    Scheduler->>Storage: validateStorage<br/>as storageWithPDAndCodec
    Storage-->>Scheduler: ✓ codec available

    Scheduler->>RangeSplit: SplitOneRangesGroup()
    RangeSplit-->>Scheduler: endKey, dataFiles,<br/>groupSize, groupKeyCnt

    Scheduler->>Scheduler: Aggregate groups<br/>to 1GiB report groups<br/>Deduplicate file counts

    Scheduler->>TiCI: buildTiCIPreSplitRequest<br/>(task, table, index IDs,<br/>aggregated KV stats,<br/>report groups)

    TiCI->>MetaService: PreSplitImportShards<br/>Request (timeout: 1min)
    MetaService-->>TiCI: Response (split_keys,<br/>shard_counts)

    TiCI-->>Scheduler: ✓ pre-split complete<br/>or log error & continue
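
The last two steps of the diagram (a bounded RPC, then "log error & continue") can be sketched as follows. `preSplitCall` and `tryPreSplit` are hypothetical names; the only details taken from the PR are the 1 minute timeout and the decision to degrade to the existing ingest flow on failure.

```go
package main

import (
	"context"
	"errors"
	"log"
	"time"
)

// preSplitCall is a hypothetical stand-in for the TiCI client call.
type preSplitCall func(ctx context.Context) error

// tryPreSplit runs call under a deadline and reports whether pre-split succeeded.
// A false result means "continue with the regular ingest plan", not "fail the job".
func tryPreSplit(parent context.Context, timeout time.Duration, call preSplitCall) bool {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	if err := call(ctx); err != nil {
		log.Printf("TiCI pre-split failed, falling back to plain ingest: %v", err)
		return false
	}
	return true
}

// demoSuccess simulates a call that returns promptly without error.
func demoSuccess() bool {
	return tryPreSplit(context.Background(), time.Minute, func(context.Context) error {
		return nil
	})
}

// demoFailing simulates an RPC error.
func demoFailing() bool {
	return tryPreSplit(context.Background(), time.Minute, func(context.Context) error {
		return errors.New("tici unavailable")
	})
}

// demoTimeout simulates a call that only returns once the deadline expires.
func demoTimeout() bool {
	return tryPreSplit(context.Background(), 10*time.Millisecond, func(ctx context.Context) error {
		<-ctx.Done()
		return ctx.Err()
	})
}

func main() {
	log.Printf("success applied: %v", demoSuccess())
	log.Printf("failure applied: %v", demoFailing())
	log.Printf("timeout applied: %v", demoTimeout())
}
```

Returning a boolean instead of an error keeps the caller from accidentally propagating a pre-split failure into the DDL job result.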

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

importsdk, importer, importinto: add import size estimate #67241 — Extends the same SplitOneRangesGroup() API with new return values (groupSize, groupKeyCnt) and updates all call sites accordingly.
ddl: require global sort for non-empty fulltext backfill | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts #67262 — Modifies DDL backfilling, cloud-import, and TiCI integration paths alongside global-sort and associated test infrastructure.
pkg/ddl, pkg/tici: add TiCI pre-split for DDL global sort ingest | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts #67124 — Implements identical code-level changes to add TiCI PreSplitImportShards proto/client, modify generateGlobalSortIngestPlan, and update RangeSplitter return handling.

Suggested labels

release-note

Suggested reviewers

wjhuang2016
OliverS929
GMHDBJD

Poem

🐰 Whiskers twitching with delight,
Pre-split shards now get it right!
TiCI hops through grouped-up ranges,
Splitting import schemes arranges,
Metrics tracked from split to shore! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The PR description is comprehensive and well-structured, covering the problem statement, solution approach, implementation details, testing, and release notes as required by the template. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding TiCI pre-split functionality for DDL global sort ingest operations, with affected packages listed. |


tiprow · 2026-03-26T02:55:22Z

Hi @3pointer. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

3pointer · 2026-03-26T02:56:07Z

/ok-to-test

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

pkg/tici/tici.proto (1)

600-629: Document the start_key / end_key contract.

KeyRange above explicitly defines its bounds, but these new messages do not. Since the scheduler is feeding split boundaries straight into this RPC, it would help to state whether end_key is exclusive so TiCI does not have to infer it.

Proposed comment update

 // One merged SortedKVMeta group used for import pre-split analysis.
 message PreSplitImportShardMeta {
   int64 ele_id = 1;
+  // Inclusive lower bound of this meta group.
   bytes start_key = 2;
+  // Exclusive upper bound of this meta group.
   bytes end_key = 3;
   uint64 total_kv_size = 4;
   uint64 total_kv_cnt = 5;
   int32 data_file_count = 6;
   int32 stat_file_count = 7;
 }
 
 message PreSplitImportShardsRequest {
   // TiDB unique task ID for this Import Into/Index Backfilling job.
   string tidb_task_id = 1;
   // Table ID of the target table.
   int64 table_id = 2;
   // Index ID of the target index. If this is an Import Into job that relates
   // to multiple indexes, this field should contain all the index IDs.
   repeated int64 index_ids = 3;
   uint64 scan_snapshot_ts = 4;
+  // Inclusive lower bound across all meta_groups.
   bytes start_key = 5;
+  // Exclusive upper bound across all meta_groups.
   bytes end_key = 6;
   uint64 total_kv_size = 7;

As per coding guidelines, "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/tici/tici.proto` around lines 600 - 629, The proto messages
PreSplitImportShardMeta and PreSplitImportShardsRequest lack a clear contract
for the start_key/end_key semantics; update their comments to state the exact
bounds (e.g., whether start_key is inclusive and end_key is exclusive, how
empty/null keys are treated, and any required prefix/encoding assumptions) so
callers (and TiCI scheduler) don't have to infer behavior from KeyRange; add
this clarifying text to the comments above the start_key/end_key fields in both
PreSplitImportShardMeta and PreSplitImportShardsRequest and reference KeyRange
only to note consistency with its inclusive/exclusive convention.
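
To show why the bound convention matters, here is a small sketch under the assumption that the review's proposal is adopted (inclusive `start_key`, exclusive `end_key`). The helper `inRange` is hypothetical, and the sketch deliberately does not model empty keys, since their treatment is still an open question in this thread.

```go
package main

import (
	"bytes"
	"fmt"
)

// inRange reports whether key falls in the half-open range [start, end).
// Empty start or end keys are NOT given any special meaning here.
func inRange(key, start, end []byte) bool {
	return bytes.Compare(key, start) >= 0 && bytes.Compare(key, end) < 0
}

func main() {
	// Two adjacent meta groups sharing the boundary key "m".
	first := [2][]byte{[]byte("a"), []byte("m")}
	second := [2][]byte{[]byte("m"), []byte("z")}

	key := []byte("m")
	fmt.Println("in first group: ", inRange(key, first[0], first[1]))
	fmt.Println("in second group:", inRange(key, second[0], second[1]))
}
```

With half-open ranges the shared boundary key belongs to exactly one group, so adjacent meta groups neither overlap nor leave a gap. If both bounds were inclusive, the receiver would have to guess which group owns the boundary key.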

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/tici/tici_manager_client.go`:
- Around line 431-439: The mock interception call maybeMockPreSplitImportShards
is invoked before the request is enriched with KeyspaceId, so tests that
exercise the mock path see a different payload than the real RPC; move the
maybeMockPreSplitImportShards(req) invocation to after the request is
normalized/enriched (i.e., after req.KeyspaceId = t.getKeyspaceID()) in the
PreSplitImportShards method (and the other similar entry point referenced),
ensuring ManagerCtx.getKeyspaceID() is applied before calling
maybeMockPreSplitImportShards so the mock sees the same request as the real RPC.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 11723cfd-b5c0-4d14-a299-6f6b509f63e1

📥 Commits

Reviewing files that changed from the base of the PR and between 45244ec and cf54bd8.

⛔ Files ignored due to path filters (1)

pkg/tici/tici.pb.go is excluded by !**/*.pb.go

📒 Files selected for processing (13)

pkg/ddl/BUILD.bazel
pkg/ddl/backfilling_dist_scheduler.go
pkg/ddl/backfilling_dist_scheduler_internal_test.go
pkg/ddl/backfilling_dist_scheduler_test.go
pkg/dxf/importinto/planner.go
pkg/lightning/backend/external/merge_v2.go
pkg/lightning/backend/external/split.go
pkg/lightning/backend/external/split_test.go
pkg/lightning/backend/external/testutil.go
pkg/tici/BUILD.bazel
pkg/tici/tici.proto
pkg/tici/tici_manager_client.go
pkg/tici/tici_manager_client_test.go

coderabbitai · 2026-03-26T03:18:28Z

+func (t *ManagerCtx) PreSplitImportShards(ctx context.Context, req *PreSplitImportShardsRequest) error {
+	if handled, err := maybeMockPreSplitImportShards(req); handled {
+		return err
+	}
+	if req == nil {
+		return errors.New("pre split import shards request is nil")
+	}
+	req.KeyspaceId = t.getKeyspaceID()
+


⚠️ Potential issue | 🟡 Minor

Move the mock interception after request enrichment.

Both entry points run maybeMockPreSplitImportShards(req) before req.KeyspaceId is filled, so the captured payload in the new failpoint-based tests is not the same request that the real RPC sends. That makes the mock path blind to keyspace-propagation regressions.

Suggested fix

 func (t *ManagerCtx) PreSplitImportShards(ctx context.Context, req *PreSplitImportShardsRequest) error {
-	if handled, err := maybeMockPreSplitImportShards(req); handled {
-		return err
-	}
 	if req == nil {
 		return errors.New("pre split import shards request is nil")
 	}
 	req.KeyspaceId = t.getKeyspaceID()
+	if handled, err := maybeMockPreSplitImportShards(req); handled {
+		return err
+	}
 	t.mu.RLock()
 	defer t.mu.RUnlock()
 	...
 }

 func PreSplitImportShards(ctx context.Context, store keyspaceStorage, req *PreSplitImportShardsRequest) error {
-	if handled, err := maybeMockPreSplitImportShards(req); handled {
-		return err
-	}
+	if req == nil {
+		return errors.New("pre split import shards request is nil")
+	}
+	if store != nil {
+		req.KeyspaceId = uint32(store.GetCodec().GetKeyspaceID())
+	}
+	if handled, err := maybeMockPreSplitImportShards(req); handled {
+		return err
+	}
 	etcdClient, err := getEtcdClientFunc()
 	...
 }

Also applies to: 1038-1060

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/tici/tici_manager_client.go` around lines 431 - 439, The mock interception call maybeMockPreSplitImportShards is invoked before the request is enriched with KeyspaceId, so tests that exercise the mock path see a different payload than the real RPC; move the maybeMockPreSplitImportShards(req) invocation to after the request is normalized/enriched (i.e., after req.KeyspaceId = t.getKeyspaceID()) in the PreSplitImportShards method (and the other similar entry point referenced), ensuring ManagerCtx.getKeyspaceID() is applied before calling maybeMockPreSplitImportShards so the mock sees the same request as the real RPC.
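
The ordering issue can be reproduced in isolation. The sketch below uses hypothetical types (`request`, `client`, and a plain function field in place of the failpoint hook) and is not the PR's code; it only demonstrates that a mock invoked after enrichment observes the same keyspace ID the real RPC would carry.

```go
package main

import (
	"errors"
	"fmt"
)

// request is a hypothetical, trimmed-down pre-split request.
type request struct {
	KeyspaceID uint32
	TaskID     string
}

// client is a hypothetical stand-in for the TiCI manager client.
// mock plays the role of the failpoint-based interception hook.
type client struct {
	keyspaceID uint32
	mock       func(*request) (handled bool, err error)
}

// PreSplit validates and enriches the request first, then offers it to the mock,
// so the mock path and the real RPC path see the same payload.
func (c *client) PreSplit(req *request) error {
	if req == nil {
		return errors.New("pre split import shards request is nil")
	}
	req.KeyspaceID = c.keyspaceID
	if c.mock != nil {
		if handled, err := c.mock(req); handled {
			return err
		}
	}
	// The real RPC would be issued here.
	return nil
}

// capturedKeyspaceID returns the keyspace ID the mock observed for a client
// configured with the given keyspace.
func capturedKeyspaceID(keyspace uint32) uint32 {
	var seen uint32
	c := &client{keyspaceID: keyspace, mock: func(r *request) (bool, error) {
		seen = r.KeyspaceID
		return true, nil
	}}
	if err := c.PreSplit(&request{TaskID: "demo-task"}); err != nil {
		return 0
	}
	return seen
}

func main() {
	fmt.Println("mock saw keyspace:", capturedKeyspaceID(42))
}
```

If the mock ran before the assignment to `KeyspaceID`, the captured value would be the zero value regardless of the client's keyspace, which is the regression the review comment warns about.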

codecov · 2026-03-26T03:20:29Z

Codecov Report

❌ Patch coverage is 67.04871% with 115 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (release-fts-202602@45244ec). Learn more about missing BASE report.

Additional details and impacted files

@@                   Coverage Diff                   @@
##             release-fts-202602     #67313   +/-   ##
=======================================================
  Coverage                      ?   76.7314%           
=======================================================
  Files                         ?       1962           
  Lines                         ?     558695           
  Branches                      ?          0           
=======================================================
  Hits                          ?     428695           
  Misses                        ?     128538           
  Partials                      ?       1462
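
The figures in this report can be cross-checked with a few lines of arithmetic. In the sketch below, the helper names are hypothetical, and the derived patch size (total changed lines and covered lines) is inferred from the reported percentage and the missing-line count; it is not stated in the report itself.

```go
package main

import (
	"fmt"
	"math"
)

// coveragePct returns hits/total as a percentage.
func coveragePct(hits, total int) float64 {
	return 100 * float64(hits) / float64(total)
}

// totalFromMissing infers the number of tracked lines from the count of
// uncovered lines and the reported coverage percentage.
func totalFromMissing(missing int, pct float64) int {
	return int(math.Round(float64(missing) / (1 - pct/100)))
}

func main() {
	// Patch coverage: 67.04871% with 115 lines missing.
	total := totalFromMissing(115, 67.04871)
	covered := total - 115
	fmt.Printf("patch: %d of %d changed lines covered (%.5f%%)\n",
		covered, total, coveragePct(covered, total))

	// Project coverage: hits, misses, and partials should add up to lines.
	hits, misses, partials, lines := 428695, 128538, 1462, 558695
	fmt.Println("project totals consistent:", hits+misses+partials == lines)
	fmt.Printf("project: %.4f%%\n", coveragePct(hits, lines))
}
```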

| Flag | Coverage Δ |
|---|---|
| integration | `45.3443% <0.0000%> (?)` |
| unit | `73.9331% <67.0487%> (?)` |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Components | Coverage Δ |
|---|---|
| dumpling | `56.7974% <0.0000%> (?)` |
| parser | `∅ <0.0000%> (?)` |
| br | `66.2237% <0.0000%> (?)` |


3pointer · 2026-03-26T04:27:56Z

/retest

3pointer · 2026-03-26T05:12:02Z

/retest

3pointer · 2026-03-26T08:02:12Z

/test pull-error-log-review

tiprow · 2026-03-26T08:02:37Z

@3pointer: No presubmit jobs available for pingcap/tidb@release-fts-202602

Details

In response to this:

/test pull-error-log-review

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ti-chi-bot · 2026-03-26T08:03:01Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: GMHDBJD, wjhuang2016

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [GMHDBJD,wjhuang2016]
~~pkg/ddl/OWNERS~~ [GMHDBJD,wjhuang2016]
~~pkg/dxf/OWNERS~~ [GMHDBJD,wjhuang2016]
~~pkg/lightning/OWNERS~~ [GMHDBJD]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2026-03-26T08:03:06Z

[LGTM Timeline notifier]

Timeline:

2026-03-26 02:57:05.260705849 +0000 UTC m=+409821.296776109: ☑️ agreed by GMHDBJD.
2026-03-26 08:03:05.335101135 +0000 UTC m=+428181.371171395: ☑️ agreed by wjhuang2016.

ti-chi-bot · 2026-03-26T08:03:07Z

@3pointer: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-error-log-review | `cf54bd8` | link | false | `/test pull-error-log-review` |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

pkg/ddl, pkg/tici: add TiCI pre-split for DDL global sort ingest (pin…

cf54bd8

…gcap#67124)

ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. labels Mar 26, 2026

ti-chi-bot bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 26, 2026

3pointer changed the title ~~pkg/ddl, pkg/tici: add TiCI pre-split for DDL global sort ingest (#67…~~ pkg/ddl, pkg/tici: add TiCI pre-split for DDL global sort ingest Mar 26, 2026

3pointer marked this pull request as ready for review March 26, 2026 02:55

ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 26, 2026

ti-chi-bot bot added the ok-to-test Indicates a PR is ready to be tested. label Mar 26, 2026

GMHDBJD approved these changes Mar 26, 2026

View reviewed changes

ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Mar 26, 2026

coderabbitai bot reviewed Mar 26, 2026

View reviewed changes

3pointer changed the title ~~pkg/ddl, pkg/tici: add TiCI pre-split for DDL global sort ingest~~ pkg/ddl, pkg/tici: add TiCI pre-split for DDL global sort ingest | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts Mar 26, 2026

wjhuang2016 approved these changes Mar 26, 2026

View reviewed changes

ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Mar 26, 2026

ti-chi-bot bot merged commit 2338f7d into pingcap:release-fts-202602 Mar 26, 2026
25 of 26 checks passed

coderabbitai bot mentioned this pull request Apr 15, 2026

*: use task key in TiCI requests | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts #67786

Open

13 tasks
