Commit f0e02e9
refactor: skip RowFilter in DataFusion via per-run decoders (no arrow-rs changes)
Instead of adding a `with_fully_matched_row_groups` API to arrow-rs,
implement the optimization entirely in DataFusion by creating separate
ParquetPushDecoders for row groups that need filtering vs those that
are fully matched.

Key changes:
- Split row groups into consecutive runs that share the same filter
  requirement via `split_decoder_runs()`, preserving the original row
  group ordering for ordered scans.
- Each filtered run gets its own RowFilter; fully-matched runs skip it.
- Use VecDeque<ParquetPushDecoder> in PushDecoderStreamState to chain
  decoders sequentially.
- Remove the [patch.crates-io] arrow-rs fork dependency.

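The run-splitting step in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `DecoderRun` struct, the function signature, and the `fully_matched` flag slice are assumptions made for the example, and only the name `split_decoder_runs` comes from the commit message.

```rust
/// Hypothetical description of one decoder run: consecutive row groups
/// that share the same filter requirement. Not the type used in the commit.
#[derive(Debug, PartialEq)]
struct DecoderRun {
    /// Row group indices, in their original scan order.
    row_groups: Vec<usize>,
    /// True if this run still needs a RowFilter; false if fully matched.
    needs_filter: bool,
}

/// Groups row groups into consecutive runs with the same filter requirement.
/// `fully_matched[i]` is assumed to say whether `row_groups[i]` is known to
/// satisfy the predicate for every row (so the filter can be skipped).
fn split_decoder_runs(row_groups: &[usize], fully_matched: &[bool]) -> Vec<DecoderRun> {
    assert_eq!(row_groups.len(), fully_matched.len());
    let mut runs: Vec<DecoderRun> = Vec::new();
    for (&rg, &matched) in row_groups.iter().zip(fully_matched) {
        let needs_filter = !matched;
        // Extend the current run only if the requirement is unchanged;
        // otherwise start a new run. Order is never rearranged.
        let extend = runs
            .last()
            .map_or(false, |run| run.needs_filter == needs_filter);
        if extend {
            runs.last_mut().unwrap().row_groups.push(rg);
        } else {
            runs.push(DecoderRun {
                row_groups: vec![rg],
                needs_filter,
            });
        }
    }
    runs
}
```

Because runs are only ever split at a change in requirement and never merged across one, concatenating the runs reproduces the input order, which is the property ordered scans rely on.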
This aligns with the direction of per-row-group morsels: each decoder
run can naturally become a morsel when that infrastructure lands.
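
The sequential chaining of decoders through a VecDeque might look something like the sketch below. The `RunDecoder` trait, `ChainedDecoders`, and the `VecRun` mock are hypothetical stand-ins invented for illustration; they do not reflect the real ParquetPushDecoder or PushDecoderStreamState APIs, which are push-based and more involved.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a per-run decoder (NOT the ParquetPushDecoder API).
trait RunDecoder {
    type Batch;
    /// Returns the next batch, or None once this run is exhausted.
    fn next_batch(&mut self) -> Option<Self::Batch>;
}

/// Hypothetical stream state: decoders are drained front to back.
struct ChainedDecoders<D: RunDecoder> {
    decoders: VecDeque<D>,
}

impl<D: RunDecoder> ChainedDecoders<D> {
    /// Pulls from the front decoder; when it is exhausted, drops it and
    /// moves on to the next run, so output order follows run order.
    fn next_batch(&mut self) -> Option<D::Batch> {
        while let Some(front) = self.decoders.front_mut() {
            if let Some(batch) = front.next_batch() {
                return Some(batch);
            }
            self.decoders.pop_front();
        }
        None
    }
}

/// Mock decoder for demonstration: each "batch" is just a number.
struct VecRun(VecDeque<u32>);

impl RunDecoder for VecRun {
    type Batch = u32;
    fn next_batch(&mut self) -> Option<u32> {
        self.0.pop_front()
    }
}
```

Under this shape, each queue entry is an independent unit of work, which is one way to read the commit's remark that a decoder run could later become a morsel.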
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 7cff519 commit f0e02e9
4 files changed
Lines changed: 282 additions & 105 deletions
File tree
- datafusion/datasource-parquet
- benches
- src
Lines changed: 26 additions & 9 deletions