[FEATURE] Group initial load tasks by file size in iceberg-source

**Is your feature request related to a problem? Please describe.**

The iceberg-source creates one task per data file during initial load. For tables with many small files, the coordination overhead per task (DynamoDB acquire/complete operations) can dominate the actual file processing time.

**Describe the solution you'd like**

Group multiple data files into a single initial load task based on total file size, consistent with the approach planned for SHUFFLE_WRITE tasks.
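
To illustrate the idea, here is a minimal sketch of size-based grouping. It is not the iceberg-source implementation: `DataFile`, `group_files_by_size`, and the `target_bytes` threshold are hypothetical names chosen for this example, and the greedy sequential strategy is only one possible approach. It closes a group once adding the next file would push it past the target, so a file larger than the target ends up alone in its own task.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file entry (path plus size in bytes)."""
    path: str
    size_bytes: int


def group_files_by_size(files: Iterable[DataFile], target_bytes: int) -> List[List[DataFile]]:
    """Greedily batch files, in input order, into groups of at most target_bytes.

    A single file larger than target_bytes is placed in a group by itself.
    Each returned group would correspond to one initial load task.
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")

    groups: List[List[DataFile]] = []
    current: List[DataFile] = []
    current_bytes = 0

    for data_file in files:
        # Close the current group if this file would push it past the target.
        if current and current_bytes + data_file.size_bytes > target_bytes:
            groups.append(current)
            current = []
            current_bytes = 0
        current.append(data_file)
        current_bytes += data_file.size_bytes

    # Flush the final, partially filled group.
    if current:
        groups.append(current)
    return groups
```

With this scheme the number of coordination operations scales with the number of groups rather than the number of files, which is the overhead reduction this request is after. The target size would presumably be configurable.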

**Additional context**

Related: #6682 (source-layer shuffle), https://github.com/opensearch-project/data-prepper/issues/6724


[FEATURE] Group initial load tasks by file size in iceberg-source #6725
