Import Into: potential regression after #62419 removed coarse split/scatter for large IMPORT INTO #66311

@OliverS929

Bug Report

1. Minimal reproduce step (Required)

  1. Run IMPORT INTO with a very large dataset (global sort + ingest path).
  2. During ingest, the generated split key count is large (len(splitKeys) > 100).
  3. On builds that include #62419 (rate limiter for split/ingest), region split/scatter runs in one layer only (coarse layer removed).
  4. Under this condition, region distribution becomes highly skewed across stores, and write pressure concentrates on a few stores.
  5. TiDB ingest may repeatedly hit retryable TiKV write errors (e.g. needRescan) and loop on retries.
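For illustration, the coarse layer mentioned in step 3 can be sketched as picking a sparse subset of the sorted split keys to split and scatter first. This is a minimal sketch and not the code from #59609: `pickCoarseKeys`, `makeKeys`, and the square-root step are assumptions made for this example; only the `len(splitKeys) > 100` condition comes from this report.

```go
package main

import (
	"fmt"
	"math"
)

// coarseThreshold mirrors the len(splitKeys) > 100 condition in this report.
const coarseThreshold = 100

// pickCoarseKeys is a hypothetical helper. Given sorted split keys, it returns
// every step-th key, where step = floor(sqrt(n)), as a coarse first layer.
// It returns nil when the key count does not exceed the threshold.
func pickCoarseKeys(splitKeys [][]byte) [][]byte {
	n := len(splitKeys)
	if n <= coarseThreshold {
		return nil
	}
	step := int(math.Sqrt(float64(n)))
	coarse := make([][]byte, 0, n/step+1)
	for i := step; i < n; i += step {
		coarse = append(coarse, splitKeys[i])
	}
	return coarse
}

// makeKeys builds n sorted dummy keys for the example.
func makeKeys(n int) [][]byte {
	keys := make([][]byte, 0, n)
	for i := 0; i < n; i++ {
		keys = append(keys, []byte(fmt.Sprintf("key-%04d", i)))
	}
	return keys
}

func main() {
	coarse := pickCoarseKeys(makeKeys(400))
	// With 400 keys the step is 20, so keys 20, 40, ..., 380 are selected.
	fmt.Println(len(coarse), string(coarse[0])) // 19 key-0020
}
```

Splitting and scattering at these few keys first would give PD a small number of large regions to distribute before the remaining keys are split inside them.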

Observed signals in logs/monitoring:

  • Frequent retries in the needRescan stage.
  • Errors around opening/sending write streams to TiKV (Unavailable, connection refused/reset/timeout, EOF).
  • Region count distribution skew across stores during ingest.
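One way to put a number on the region count skew is the ratio of the busiest store's region count to the mean across stores. The sketch below is only an illustration: `regionSkew` and the sample counts are hypothetical, and in practice the per-store counts would come from PD or monitoring.

```go
package main

import "fmt"

// regionSkew is a hypothetical helper. It returns the largest per-store region
// count divided by the mean count: 1.0 means perfectly balanced, and larger
// values mean more concentration on one store.
func regionSkew(regionsPerStore map[uint64]int) float64 {
	if len(regionsPerStore) == 0 {
		return 0
	}
	total, maxCount := 0, 0
	for _, c := range regionsPerStore {
		total += c
		if c > maxCount {
			maxCount = c
		}
	}
	if total == 0 {
		return 0
	}
	mean := float64(total) / float64(len(regionsPerStore))
	return float64(maxCount) / mean
}

func main() {
	// Made-up counts for illustration only, not measurements from this issue.
	balanced := map[uint64]int{1: 100, 2: 100, 3: 100}
	skewed := map[uint64]int{1: 900, 2: 50, 3: 50}
	fmt.Printf("balanced=%.2f skewed=%.2f\n", regionSkew(balanced), regionSkew(skewed))
}
```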

2. What did you expect to see? (Required)

  • For large split key sets, split/scatter should spread pressure early and avoid region concentration.
  • Existing limiter behavior from #62419 should still be retained to protect PD/TiKV from excessive split requests.
  • IMPORT INTO should not amplify store-level pressure due to missing coarse split/scatter stage.

3. What did you see instead (Required)

  • Severe region skew across stores during ingest.
  • Repeated retry/rescan loops in the ingest write path.
  • Write pressure concentrates on a subset of stores; a manual scatter may relieve it temporarily.

4. What is your TiDB version? (Required)

  • The exact build hash still needs to be confirmed.
  • The suspected regression is tied to builds containing #62419:
    • #59609 introduced 2-level split/scatter.
    • #62419 added the limiter but removed the coarse split/scatter layer.
