Fix: array list values are leaked on nested unnest operators#10689
Fix: array list values are leaked on nested unnest operators#10689alamb merged 2 commits intoapache:mainfrom
unnest operators#10689Conversation
unnest operators
| // If there is only one list column to unnest and it doesn't contain any NULL lists, | ||
| // we can return the values array directly without any copying. | ||
| if typed_arrays.len() == 1 && typed_arrays[0].null_count() == 0 { | ||
| Ok(vec![typed_arrays[0].values().clone()]) |
There was a problem hiding this comment.
this will return the underlying low level array, which include the row indices filtered out by limit/offset
jonahgao
left a comment
There was a problem hiding this comment.
Nice catch ! Thanks @duongcongtoai
Another simple way to reproduce it is
select unnest(column1) from (select * from (values([1,2,3]), ([4,5,6])) limit 1);I'm not sure if this is a bug in arrow-rs or by design. It only slices the offset data of the ListArray, without slicing the child data.
I think current impl is correct, because the offset values after slicing is still offsetted based on original array. Else we have to do shifting for each offset elements like |
I think the shifting operation should be done internally in arrow-rs if needed, as it depends on what the data in the values looks like. Moreover, we don't need to do shifting for FixedSizeListArray. The current solution looks good to me. |
…e#10689) * chore: fix issue + add test * simplify test case
Which issue does this PR close?
Closes #10672 .
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?