Is your feature request related to a problem? Please describe.
The iceberg-source creates one task per data file during initial load. For tables with many small files, the coordination overhead per task (DynamoDB acquire/complete operations) can dominate the actual file processing time.
Describe the solution you'd like
Group multiple data files into a single initial load task based on total file size, consistent with the approach planned for SHUFFLE_WRITE tasks.
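A rough sketch of what size-based grouping could look like, for illustration only. It assumes a simple greedy pass that closes a group once adding the next file would exceed a target size. `DataFile`, `group_files_by_size`, and `target_bytes` are hypothetical names, not existing iceberg-source code, and the real implementation would be in Java and should follow whatever grouping rule is settled on for SHUFFLE_WRITE tasks.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file entry (path plus size in bytes)."""
    path: str
    size_bytes: int


def group_files_by_size(files: Iterable[DataFile], target_bytes: int) -> List[List[DataFile]]:
    """Greedily pack files, in input order, into groups of roughly target_bytes.

    Each returned group would become one initial load task. A file larger
    than target_bytes ends up in a group of its own.
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")

    groups: List[List[DataFile]] = []
    current: List[DataFile] = []
    current_bytes = 0

    for f in files:
        # Close the current group if adding this file would exceed the target.
        if current and current_bytes + f.size_bytes > target_bytes:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(f)
        current_bytes += f.size_bytes

    # Flush the last partially filled group.
    if current:
        groups.append(current)
    return groups
```

With this approach the number of tasks, and therefore the number of acquire/complete round trips, scales with total data size divided by the target rather than with the file count.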
Additional context
Related: #6682 (source-layer shuffle), #6724