Support shared catalog config across tables in iceberg-source#6727
Support shared catalog config across tables in iceberg-source#6727lawofcycles wants to merge 2 commits intoopensearch-project:mainfrom
Conversation
Signed-off-by: Sotaro Hikita <bering1814@gmail.com>
dlvenable
left a comment
There was a problem hiding this comment.
Thanks @lawofcycles for this contribution! I have a few small comments.
|
|
||
| final Map<String, String> catalogProps = new HashMap<>(tableConfig.getCatalog()); | ||
| final Map<String, String> catalogProps = new HashMap<>( | ||
| tableConfig.getCatalog().isEmpty() ? sourceConfig.getCatalog() : tableConfig.getCatalog()); |
There was a problem hiding this comment.
Should we use != null instead? Perhaps an empty catalog can be an explicit no catalog at all.
There was a problem hiding this comment.
Good point. Changed to != null and updated TableConfig.catalog default from Collections.emptyMap() to null. This way, an unset catalog falls back to the shared config, while an explicitly empty catalog: {} is passed through as is (which will fail at catalog initialization.
| when(configB.getIdentifierColumns()).thenReturn(List.of("id")); | ||
| when(configB.isDisableExport()).thenReturn(false); | ||
|
|
||
| when(sourceConfig.getCatalog()).thenReturn(helper.catalogProperties()); |
There was a problem hiding this comment.
Should we use an alternative catalog to be sure that it is correctly selected?
There was a problem hiding this comment.
Since the integration test environment has a single REST catalog server, using a truly distinct catalog to verify selection is not straightforward. I added IcebergServiceTest with unit tests that use mockStatic(CatalogUtil.class) to capture and assert the exact catalog properties passed to buildIcebergCatalog for each table. The integration test export_with_mixed_catalog_config is kept as a smoke test to confirm the mixed configuration works end to end.
| final EnhancedSourceCoordinator coordinator = createInMemoryCoordinator(); | ||
| coordinator.createPartition(new LeaderPartition()); | ||
| final IcebergService service = new IcebergService(coordinator, sourceConfig, pluginMetrics, | ||
| acknowledgementSetManager, org.opensearch.dataprepper.event.TestEventFactory.getTestEventFactory()); |
There was a problem hiding this comment.
Use an import on the class or static import. Avoid fully qualified names.
Change catalog fallback check from isEmpty() to null check so that an explicitly empty catalog is not silently treated as unset. TableConfig.catalog default is now null instead of emptyMap. Add IcebergServiceTest to verify catalog selection logic: shared only, table override, and mixed configurations. Replace fully qualified TestEventFactory references with imports in IcebergSourceIT. Signed-off-by: Sotaro Hikita <bering1814@gmail.com>
Description
Adds support for a top level
catalogconfiguration in iceberg-source that applies to all tables by default. When a table specifies its owncatalog, it fully replaces the top level definition.This reduces configuration duplication when multiple tables share the same catalog.
Issues Resolved
Partially addresses #6726 (catalog config sharing). Per-table shuffle overrides will be addressed after #6682 is merged.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.