Skip to content

Check list size before concat in ScalarValue#10329

Merged
alamb merged 2 commits intoapache:mainfrom
timsaucer:bug/datatype_to_list_size_0
May 2, 2024
Merged

Check list size before concat in ScalarValue#10329
alamb merged 2 commits intoapache:mainfrom
timsaucer:bug/datatype_to_list_size_0

Conversation

@timsaucer
Copy link
Copy Markdown
Member

This PR resolves an issue in which we get a panic attempting to concatenate empty arrays. This prevents lead and lag functions from working on these data types.

Which issue does this PR close?

Closes #10328.

Rationale for this change

When we call WindowAggState::new with a data type of a list, we will call ScalarValue::try_from(out_type)?.to_array_of_size(0) which in turn calls list_to_array_of_size. Within this function there is no check on the return size. It could be zero due to either an empty array input or a request for 0 elements. In either of these conditions, you will get a panic from concat.

What changes are included in this PR?

This is a small change to check the return size. When zero it will just take an empty slice of the original array.

Are these changes tested?

I have included an expansion to an existing unit test to cover this. In my bug report I have also shown minimal example to reproduce the problem and the updated results from it's inclusion.

Are there any user-facing changes?

There are no user facing changes.

…d. If not, do not attempt to call concat on empty arrays.
@alamb
Copy link
Copy Markdown
Contributor

alamb commented May 1, 2024

Thanks @timsaucer -- I started the CI checks on this PR and hope to check it out later today if no one beats me to it

Copy link
Copy Markdown
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@alamb
Copy link
Copy Markdown
Contributor

alamb commented May 2, 2024

Thanks @timsaucer and @jayzhan211

@alamb alamb merged commit 2c0afce into apache:main May 2, 2024
@alamb
Copy link
Copy Markdown
Contributor

alamb commented May 2, 2024

BTW I think this would benefit from some end to end testing as well, so I filed #10345

richox pushed a commit to auron-project/datafusion that referenced this pull request May 8, 2024
* Add check in list_to_array_of_size to see if any data will be returned. If not, do not attempt to call concat on empty arrays.

* Formatting
richox pushed a commit to auron-project/datafusion that referenced this pull request May 8, 2024
* Add check in list_to_array_of_size to see if any data will be returned. If not, do not attempt to call concat on empty arrays.

* Formatting
richox pushed a commit to apache/auron that referenced this pull request May 9, 2024
 revert "split huge batch into smaller ones during projection (#448)", This reverts commit 7d33df1.
 fix issue "Nondeterministic expression xxx should be initialized before eval".
 fix issue "IO error: Bad file descriptor" while accessing spill file.
 fix jvm coredump by removing chrono dependency for printing logs.
@richox richox mentioned this pull request May 9, 2024
lihao712 pushed a commit to apache/auron that referenced this pull request May 9, 2024
revert "split huge batch into smaller ones during projection (#448)", This reverts commit 7d33df1.
 fix issue "Nondeterministic expression xxx should be initialized before eval".
 fix issue "IO error: Bad file descriptor" while accessing spill file.
 fix jvm coredump by removing chrono dependency for printing logs.

Co-authored-by: zhangli20 <zhangli20@kuaishou.com>
@timsaucer timsaucer deleted the bug/datatype_to_list_size_0 branch May 14, 2024 12:08
joroKr21 pushed a commit to coralogix/arrow-datafusion that referenced this pull request May 24, 2024
* Add check in list_to_array_of_size to see if any data will be returned. If not, do not attempt to call concat on empty arrays.

* Formatting
joroKr21 added a commit to coralogix/arrow-datafusion that referenced this pull request May 24, 2024
* Add check in list_to_array_of_size to see if any data will be returned. If not, do not attempt to call concat on empty arrays.

* Formatting

Co-authored-by: Tim Saucer <timsaucer@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unable to perform lead/lag built in functions on List and Struct data types

3 participants