minor: inconsistent group by position planning#10679
Merged
alamb merged 3 commits intoapache:mainfrom May 28, 2024
Merged
Conversation
jonahgao
approved these changes
May 27, 2024
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Contributor
findepi
pushed a commit
to findepi/datafusion
that referenced
this pull request
Jul 16, 2024
* fix: inconsistent group by position planning * Update datafusion/sql/src/utils.rs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * improved error message --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #.
Rationale for this change
While #10591 found that planning for INT literals in GROUP BY clause is inconsistent. For example:
produces following logical plan (this is logical plan before any optimizations):
1resolved as column from select clause, and3, 4, 5remain as constants. Don't think this behavior is intended + there are couple of tests insql_integration.rs(modified below), for similar cases, expect queries to fail.The suggestion is to fail while planning phase for such cases, and return error, specifying first index which can not be resolved as select column.
What changes are included in this PR?
Resolution for positions (integer literals in
GROUP BYclause) now returns error on first unresolved position.SIgnature for
resolve_positions_to_exprschanged to avoid unnecessary cloning.resolve_aliases_to_exprssignature has similar changes, for the same reasons (it's been changed just for consistency with resolve_positions_... function).Are these changes tested?
Added sqllogictests + some test coverage for
resolve_positions_to_exprsAre there any user-facing changes?
Users might experience query planning failures for
GROUP BY %idx%whereidxis out of bounds ofSELECTexpressions list (which seems to be OK from SQL semantics point of view -- PG/MySQL handle this case in the same way).