Enhancement Task
Background
After a region is split, the new regions may remain concentrated on the same set of stores for a while.
For load-based split, this means the split itself succeeds, but the load stays on the same stores and is not scattered soon enough.
We want PD to scatter newly split regions shortly after split, instead of relying on manual scatter or waiting for later balancing.
Proposal
Add a simplified split-scatter flow in PD:
- When PD handles `AskBatchSplit`, record the source region and newly allocated region IDs in a short-lived pending cache.
- After the new regions report heartbeats to PD, use their latest region stats to calculate a simple priority score based on CPU and byte traffic.
- Maintain a bounded priority queue in PD for these pending split regions.
- Pop regions from the queue with a configurable limit and create scatter operators one by one.
- Regions from the same split batch should use the same scatter group so they can be scattered together.
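
The scoring and queueing steps above could be sketched roughly as follows. This is only an illustration, not PD code: `pendingRegion`, `priorityScore`, `splitScatterQueue`, the weights, and the group name are all hypothetical, and a small sorted slice stands in for a heap to keep the example short.

```go
package main

import (
	"fmt"
	"sort"
)

// pendingRegion is a hypothetical record for a newly split region
// that is waiting to be scattered.
type pendingRegion struct {
	RegionID uint64
	Group    string  // scatter group shared by all regions of one split batch
	Score    float64 // priority computed from the latest heartbeat stats
}

// priorityScore combines CPU usage and byte traffic into one number.
// The weights are assumptions: 1 point per unit of CPU, 1 point per MiB/s.
func priorityScore(cpuUsage, bytesPerSec float64) float64 {
	const cpuWeight = 1.0
	const byteWeight = 1.0 / (1 << 20)
	return cpuWeight*cpuUsage + byteWeight*bytesPerSec
}

// splitScatterQueue is a bounded priority queue kept as a slice
// sorted by Score in descending order.
type splitScatterQueue struct {
	capacity int
	items    []pendingRegion
}

func newSplitScatterQueue(capacity int) *splitScatterQueue {
	return &splitScatterQueue{capacity: capacity}
}

// Push inserts r in score order. When the queue is full, the
// lowest-scored entry is dropped; Push returns false if r itself is dropped.
func (q *splitScatterQueue) Push(r pendingRegion) bool {
	if q.capacity <= 0 {
		return false
	}
	// Find the first position whose score is lower than r's.
	i := sort.Search(len(q.items), func(i int) bool {
		return q.items[i].Score < r.Score
	})
	if i >= q.capacity {
		return false // queue is full and r has the lowest score
	}
	q.items = append(q.items, pendingRegion{})
	copy(q.items[i+1:], q.items[i:])
	q.items[i] = r
	if len(q.items) > q.capacity {
		q.items = q.items[:q.capacity] // evict the lowest-scored entry
	}
	return true
}

// PopBatch removes and returns up to limit regions, highest score first.
func (q *splitScatterQueue) PopBatch(limit int) []pendingRegion {
	if limit > len(q.items) {
		limit = len(q.items)
	}
	if limit <= 0 {
		return nil
	}
	batch := make([]pendingRegion, limit)
	copy(batch, q.items[:limit])
	q.items = q.items[limit:]
	return batch
}

func main() {
	q := newSplitScatterQueue(3)
	group := "split-100" // hypothetical group name derived from the source region
	q.Push(pendingRegion{RegionID: 101, Group: group, Score: priorityScore(0.2, 4<<20)})
	q.Push(pendingRegion{RegionID: 102, Group: group, Score: priorityScore(0.9, 8<<20)})
	q.Push(pendingRegion{RegionID: 103, Group: group, Score: priorityScore(0.1, 1<<20)})

	// Pop with a configurable limit; each popped region would get one scatter operator.
	for _, r := range q.PopBatch(2) {
		fmt.Printf("scatter region %d in group %s (score %.2f)\n", r.RegionID, r.Group, r.Score)
	}
}
```

Keeping the queue bounded means a burst of splits cannot grow PD memory without limit, at the cost of dropping the coldest pending regions. Those regions would then be left to regular balancing.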