Is your feature request related to a problem? Please describe.
When configuring multiple tables in iceberg-source, users must repeat the same catalog properties for each table entry even when the tables share the same catalog. Additionally, shuffle parameters (partitions, target_partition_size) cannot be tuned per table.
Describe the solution you'd like
- Allow a top-level catalog definition that applies to all tables by default. When a table specifies its own catalog, it fully replaces the top-level definition (no partial merge).
iceberg:
  catalog:
    type: rest
    uri: "http://iceberg-rest-catalog:8181"
    io-impl: "org.apache.iceberg.aws.s3.S3FileIO"
  tables:
    - table_name: "db.table_a"
      identifier_columns: ["id"]
    - table_name: "db.table_b"
      identifier_columns: ["id"]
      catalog:
        type: glue
        warehouse: "s3://other-bucket/warehouse"
        io-impl: "org.apache.iceberg.aws.s3.S3FileIO"
- Allow per-table overrides for shuffle parameters (partitions, target_partition_size). When a table specifies its own shuffle, it fully replaces the top-level shuffle parameters. Node-level settings like server_port and ssl remain top-level only and are not overridable per table.
iceberg:
  shuffle:
    partitions: 64
    target_partition_size: 64mb
  tables:
    - table_name: "db.large_table"
      shuffle:
        partitions: 256
        target_partition_size: 128mb
    - table_name: "db.small_table"
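To make the intended "full replacement, no partial merge" semantics concrete, here is a rough Python sketch of how the effective per-table settings could be resolved. It is only an illustration of the requested behavior, not the plugin's implementation: `resolve_block` is a hypothetical helper, and the sample config combines the two examples above into one dictionary for brevity.

```python
from typing import Optional


def resolve_block(config: dict, table: dict, key: str) -> Optional[dict]:
    """Hypothetical helper: return the effective `key` block for one table."""
    if key in table:
        # Full replacement: the table-level block is used as-is and is
        # never merged key-by-key with the top-level block.
        return dict(table[key])
    default = config.get(key)
    return dict(default) if default is not None else None


# Illustrative config mirroring the YAML examples (assumed shape).
config = {
    "catalog": {
        "type": "rest",
        "uri": "http://iceberg-rest-catalog:8181",
        "io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    },
    "shuffle": {"partitions": 64, "target_partition_size": "64mb"},
    "tables": [
        {"table_name": "db.table_a"},
        {
            "table_name": "db.table_b",
            "catalog": {
                "type": "glue",
                "warehouse": "s3://other-bucket/warehouse",
                "io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
            },
            "shuffle": {"partitions": 256, "target_partition_size": "128mb"},
        },
    ],
}

# Effective settings per table: defaults for table_a, overrides for table_b.
resolved = {
    t["table_name"]: {
        "catalog": resolve_block(config, t, "catalog"),
        "shuffle": resolve_block(config, t, "shuffle"),
    }
    for t in config["tables"]
}
```

Under this sketch, db.table_a inherits the top-level REST catalog and shuffle settings, while db.table_b uses only its own glue catalog (the top-level uri does not leak into it) and its own shuffle parameters.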
Additional context
Related PR: #6682 (source-layer shuffle implementation)