feat: DH-21522: allow column region optimizations in Predicate Pushdown filtering (#7666)
…delegate to the ColumnRegion for predicate pushdown.
No docs changes detected for 9fe11c9
Pull request overview
This PR refactors predicate pushdown filtering for regioned column sources to support both table-location and per-column-region optimizations, enabling more granular (region-level) pushdown planning and execution.
Changes:
- Introduces a new `RegionedPushdownAction` model (Location vs Region actions) and new regioned pushdown filter context types.
- Updates regioned pushdown execution to run per-region and merge results, enabling column-region pushdown participation.
- Refactors `ParquetTableLocation` pushdown logic from an internal enum to the new action-based API and updates the engine interfaces accordingly.
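The action-based model described in the list above can be pictured with a small standalone sketch. Everything here is hypothetical: the `Action` record, the `Scope` enum, the cost numbers, and the `plan` method are illustrative assumptions, not the `RegionedPushdownAction` API from this PR. The only ideas taken from the overview are that actions are either location-level or region-level, that each can be disabled, and that they are ordered by cost.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical, simplified model of "Location vs Region" pushdown actions.
// These are NOT the Deephaven classes; all names and costs are assumptions.
final class PushdownActionSketch {

    /** Where an action runs: against a whole table location or a single column region. */
    enum Scope { LOCATION, REGION }

    /** An action has a scope, a cost used for ordering, and a switch that can disable it. */
    record Action(String name, Scope scope, long cost, BooleanSupplier disabled) {}

    /** Returns the enabled actions whose cost fits under the ceiling, cheapest first. */
    static List<Action> plan(List<Action> candidates, long costCeiling) {
        List<Action> planned = new ArrayList<>();
        for (Action a : candidates) {
            if (!a.disabled().getAsBoolean() && a.cost() <= costCeiling) {
                planned.add(a);
            }
        }
        planned.sort(Comparator.comparingLong(Action::cost));
        return planned;
    }

    public static void main(String[] args) {
        List<Action> candidates = List.of(
                new Action("rowGroupMetadata", Scope.LOCATION, 20, () -> false),
                new Action("regionSingleValue", Scope.REGION, 10, () -> false),
                new Action("dataIndex", Scope.LOCATION, 30, () -> true)); // disabled
        for (Action a : plan(candidates, 100)) {
            System.out.println(a.scope() + " " + a.name() + " cost=" + a.cost());
        }
    }
}
```

Modelling the disable switch as a `BooleanSupplier` mirrors the lambda visible in the diff snippet later in this conversation; whether the real API evaluates it at planning time or at execution time is not shown here.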
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| extensions/parquet/table/src/test/java/io/deephaven/parquet/table/location/ParquetTableLocationTest.java | Removes the prior unit test that validated pushdown mode cost ordering. |
| extensions/parquet/table/src/main/java/io/deephaven/parquet/table/location/ParquetTableLocation.java | Implements location-level supported actions and action contexts for parquet pushdown planning/execution. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownHelper.java | Adds shared utilities for region-thread context and combining per-region pushdown results. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownFilterMatcher.java | Introduces the regioned action-based pushdown matcher API. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownFilterContext.java | Adds a regioned pushdown context carrying column definitions + rename mappings. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownFilterLocationContext.java | Extends the regioned context with access to the current table location. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedPushdownAction.java | Defines the new pushdown action abstraction (Location/Region) and related context interfaces. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceManager.java | Refactors manager-level pushdown scheduling/merging and exposes internals needed for region pushdown. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/RegionedColumnSourceBase.java | Refactors regioned column source pushdown to operate directly on regions with location-aware contexts. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/GenericColumnRegionBase.java | Adds default region-level pushdown orchestration combining region + location actions by cost. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/regioned/ColumnRegion.java | Makes column regions pushdown-capable via RegionedPushdownFilterMatcher. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/sources/UnionSourceManager.java | Updates to use the new BasePushdownFilterContext#filter() accessor. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/locations/impl/AbstractTableLocation.java | Adds default action-based pushdown planning/execution for table locations. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/locations/TableLocation.java | Switches TableLocation to the new RegionedPushdownFilterMatcher API. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/PushdownResult.java | Adds a new cost constant for region-level single-value optimizations. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/PushdownFilterMatcher.java | Provides default implementations for pushdown matcher methods. |
| engine/table/src/main/java/io/deephaven/engine/table/impl/BasePushdownFilterContext.java | Makes the base context abstract and introduces filter() accessor (encapsulation change). |
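As a rough illustration of the "combining per-region pushdown results" step mentioned for `RegionedPushdownHelper`, the sketch below merges per-region outcomes into one overall result. It is a minimal model under stated assumptions: the `Result` record with `match` and `maybeMatch` sets is hypothetical, plain `TreeSet<Long>` stands in for the engine's row-set types, and regions are assumed to cover disjoint row-key ranges.

```java
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for merging per-region pushdown results.
// Not the Deephaven PushdownResult type; the field names are assumptions.
final class RegionResultMergeSketch {

    /** Rows known to pass the filter, and rows that still need the filter applied. */
    record Result(TreeSet<Long> match, TreeSet<Long> maybeMatch) {}

    /** Unions the per-region results; assumes regions cover disjoint row-key ranges. */
    static Result merge(List<Result> perRegion) {
        TreeSet<Long> match = new TreeSet<>();
        TreeSet<Long> maybeMatch = new TreeSet<>();
        for (Result r : perRegion) {
            match.addAll(r.match());
            maybeMatch.addAll(r.maybeMatch());
        }
        return new Result(match, maybeMatch);
    }

    public static void main(String[] args) {
        Result region0 = new Result(new TreeSet<>(List.of(1L, 2L)), new TreeSet<>(List.of(3L)));
        Result region1 = new Result(new TreeSet<>(List.of(100L)), new TreeSet<>());
        Result merged = merge(List.of(region0, region1));
        System.out.println("match=" + merged.match() + " maybeMatch=" + merged.maybeMatch());
    }
}
```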
…nderneath Unioned table to activate Column Region optimizations.
Pull request overview
Copilot reviewed 64 out of 64 changed files in this pull request and generated 11 comments.
rcaudy left a comment
Shipping comments and partial review.
    private static final RegionedPushdownAction.Location ParquetRowGroupMetadata =
            new RegionedPushdownAction.Location(
                    () -> QueryTable.DISABLE_WHERE_PUSHDOWN_PARQUET_ROW_GROUP_METADATA,
Feels weird to centralize something that mentions Parquet on QueryTable.
…t. Documentation.
…ation' into nightly/DH-21522-parquettablelocation # Conflicts: # engine/table/src/main/java/io/deephaven/engine/table/impl/PushdownResult.java
…lumnSourceManager to store table definition instead of List<ColumnDefinition>
Pull request overview
Copilot reviewed 94 out of 94 changed files in this pull request and generated 5 comments.
# Conflicts: # engine/table/src/main/java/io/deephaven/engine/table/impl/sources/UnionSourceManager.java
    final long regionFirstIncludedKey = maybeIt.peekNextKey();
    final REGION_TYPE region = pageStore.lookupRegion(regionFirstIncludedKey);
    final RowSequence regionRows = maybeIt.getNextRowSequenceThrough(region.maxRow(regionFirstIncludedKey));
    final long regionFirstKey = regionFirstIncludedKey & pageStore.mask();
I'm pretty sure this is wrong. I think you should use `io.deephaven.engine.page.Page#firstRow`, as in `final long regionFirstKey = region.firstRow(regionFirstIncludedKey);`. I think you're getting the last possible row key, not the first.
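The distinction this comment draws, between masking a key and asking for a region's first row, can be illustrated with plain bit arithmetic. This is a standalone sketch under an assumed layout in which the low 40 bits of a row key address rows within a region; the mask width and all method bodies are assumptions for illustration, not the `Page` implementation. What the expression in the diff actually yields depends on which bits `pageStore.mask()` covers, which this excerpt does not show.

```java
// Hypothetical row-key layout: high bits select the region, low bits the row within it.
// The 40-bit width is an assumption chosen only to make the arithmetic concrete.
final class RegionKeySketch {

    static final long REGION_MASK = (1L << 40) - 1;

    /** First row key of the region containing rowKey: clear the within-region bits. */
    static long firstRow(long rowKey) {
        return rowKey & ~REGION_MASK;
    }

    /** Last possible row key of the region containing rowKey: set the within-region bits. */
    static long maxRow(long rowKey) {
        return rowKey | REGION_MASK;
    }

    /** Offset of rowKey inside its region: keep only the within-region bits. */
    static long offsetInRegion(long rowKey) {
        return rowKey & REGION_MASK;
    }

    public static void main(String[] args) {
        long key = (3L << 40) | 5L; // row 5 of region 3 under the assumed layout
        System.out.println("firstRow = " + firstRow(key));
        System.out.println("maxRow   = " + maxRow(key));
        System.out.println("offset   = " + offsetInRegion(key));
    }
}
```

Under this assumed layout, `key & mask` produces the offset within the region, which is neither the first nor the last key; the point of the sketch is only that `& mask`, `& ~mask`, and `| mask` answer three different questions, so a dedicated accessor is harder to get wrong.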
Code Coverage Summary: