Commit c36d665
fix(datafusion): handle coalesced multi-file batches in next-scan (delta-io#4112)
# Description
Fix next-scan execution when upstream coalescing produces batches with
rows from multiple files.
Changes:
- Split incoming batches into contiguous file_id runs before applying DV
masks/transforms
- Buffer fan-out outputs via VecDeque to preserve row order
- Return `internal_datafusion_err!` on unexpected file_id column type
instead of panicking
- Add tests for interleaved file IDs, fanout, and invalid/null file_id
paths
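
A minimal sketch of the run-splitting and buffering idea listed above, using plain slices rather than Arrow record batches. The names `contiguous_runs` and `split_batch` are hypothetical and are not taken from this commit; the actual implementation operates on DataFusion batches and may differ in detail.

```rust
use std::collections::VecDeque;
use std::ops::Range;

/// Split per-row file IDs into contiguous runs.
/// Returns `(file_id, row_range)` pairs in row order. The same file ID may
/// appear in more than one run when rows from different files interleave.
fn contiguous_runs(file_ids: &[u32]) -> Vec<(u32, Range<usize>)> {
    let mut runs = Vec::new();
    let mut start = 0;
    for i in 1..=file_ids.len() {
        // Close the current run at the end of input or when the ID changes.
        if i == file_ids.len() || file_ids[i] != file_ids[start] {
            runs.push((file_ids[start], start..i));
            start = i;
        }
    }
    runs
}

/// Fan one incoming batch out into per-run pieces, queued front to back so
/// that draining the queue yields rows in their original order.
/// (Assumption: `rows` stands in for the batch columns.)
fn split_batch<T: Clone>(file_ids: &[u32], rows: &[T]) -> VecDeque<(u32, Vec<T>)> {
    assert_eq!(file_ids.len(), rows.len(), "one file ID per row");
    contiguous_runs(file_ids)
        .into_iter()
        .map(|(id, range)| (id, rows[range].to_vec()))
        .collect()
}
```

Each queued piece carries a single file ID, so a per-file DV mask or transform can be applied to it in isolation, and popping from the front of the queue preserves the input row order.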
Signed-off-by: Ethan Urbanski <ethan@urbanskitech.com>
1 parent 054c18a commit c36d665
1 file changed
Lines changed: 411 additions & 45 deletions