rewrite approx_median to approx_percentile_cont while planning phase #2262

Merged: andygrove merged 2 commits into apache:master from korowa:approx_median_rewrite on Apr 28, 2022

@korowa commented Apr 18, 2022

Which issue does this PR close?

Closes #2221.

Rationale for this change

Currently, the optimizer rule that replaces approx_median with approx_percentile_cont leaves the logical plan slightly inconsistent: the expressions inside the aggregate step no longer match its output schema. This works when the optimizer runs a single pass, but on a second pass the projection_push_down rule removes the approx_percentile_cont expression.
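
The failure mode can be sketched with a toy model. This is only an illustration of one plausible reading of the problem, not DataFusion's code: `Expr`, `Aggregate`, `rewrite_in_optimizer`, and `prune_unused` are hypothetical names, and the pruning step is a simplified stand-in for what a push-down rule does when it matches expressions against required column names.

```rust
// Toy model only: none of these types or functions are DataFusion APIs.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(f64),
    AggregateFunction { name: String, args: Vec<Expr> },
}

impl Expr {
    /// Output column name derived from the expression itself.
    fn name(&self) -> String {
        match self {
            Expr::Column(c) => c.clone(),
            Expr::Literal(v) => v.to_string(),
            Expr::AggregateFunction { name, args } => {
                let args: Vec<String> = args.iter().map(Expr::name).collect();
                format!("{}({})", name, args.join(", "))
            }
        }
    }
}

struct Aggregate {
    aggr_exprs: Vec<Expr>,
    /// Output field names, fixed when the node is built.
    schema: Vec<String>,
}

impl Aggregate {
    fn new(aggr_exprs: Vec<Expr>) -> Self {
        let schema = aggr_exprs.iter().map(Expr::name).collect();
        Aggregate { aggr_exprs, schema }
    }

    /// True when every expression still produces the name the schema expects.
    fn is_consistent(&self) -> bool {
        self.aggr_exprs.iter().map(Expr::name).eq(self.schema.iter().cloned())
    }
}

/// Optimizer-style rewrite that changes the expression but not the schema.
fn rewrite_in_optimizer(agg: &mut Aggregate) {
    for e in agg.aggr_exprs.iter_mut() {
        if let Expr::AggregateFunction { name, args } = e {
            if name.as_str() == "approx_median" {
                *name = "approx_percentile_cont".to_string();
                args.push(Expr::Literal(0.5));
            }
        }
    }
}

/// Simplified stand-in for a push-down pass: drop expressions whose
/// output name is not among the columns required by the parent node.
fn prune_unused(agg: &mut Aggregate, required: &[String]) {
    agg.aggr_exprs.retain(|e| required.contains(&e.name()));
}
```

In this sketch the parent node still asks for the column named after approx_median, so once the expression has been renamed in place, the pruning pass sees it as unused and drops it.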

What changes are included in this PR?

The proposal is to move the function replacement into the planning phase. This seems more appropriate: rewriting a single expression does not require the whole execution plan, and aliases, projections, and schemas no longer need to be adjusted after the replacement during optimization.
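
A minimal sketch of the idea, assuming a simplified planner: the logical plan keeps approx_median untouched, and only the step that builds the executable aggregate maps it to a percentile of 0.5, reusing the output name the logical schema already has. `LogicalAgg`, `PhysicalAgg`, and `plan_aggregate` are hypothetical names chosen for illustration and do not reflect the actual code in this PR.

```rust
// Hypothetical types for illustration only; these are not DataFusion's APIs.
#[derive(Debug, Clone, PartialEq)]
enum LogicalAgg {
    ApproxMedian { column: String },
    ApproxPercentileCont { column: String, percentile: f64 },
}

#[derive(Debug, Clone, PartialEq)]
struct PhysicalAgg {
    /// Taken from the logical schema as-is, never recomputed from the rewrite.
    output_name: String,
    column: String,
    percentile: f64,
}

/// Builds the executable aggregate. The median is expressed as the 0.5
/// percentile here, so the logical plan and its schema are never modified.
fn plan_aggregate(expr: &LogicalAgg, output_name: &str) -> PhysicalAgg {
    match expr {
        LogicalAgg::ApproxMedian { column } => PhysicalAgg {
            output_name: output_name.to_string(),
            column: column.clone(),
            percentile: 0.5,
        },
        LogicalAgg::ApproxPercentileCont { column, percentile } => PhysicalAgg {
            output_name: output_name.to_string(),
            column: column.clone(),
            percentile: *percentile,
        },
    }
}
```

Because the output name is passed in rather than derived from the rewritten expression, nothing downstream (aliases, projections, HAVING filters) has to be patched in this sketch.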

Are there any user-facing changes?

Fixes execution of queries that use approx_median as an aggregate expression, as a window function, or in a HAVING filter.

@andygrove left a comment

Thanks @korowa. I have failing tests in #2369 which are fixed by this PR so LGTM.

@andygrove
@realno @yahoNanJing fyi since you have both worked on this code. I plan to merge this soon if there are no objections.

@andygrove andygrove merged commit 7b61d52 into apache:master Apr 28, 2022
MazterQyou pushed a commit to cube-js/arrow-datafusion that referenced this pull request Jul 5, 2022
MazterQyou pushed a commit to cube-js/arrow-datafusion that referenced this pull request Sep 1, 2022
MazterQyou pushed a commit to cube-js/arrow-datafusion that referenced this pull request Sep 2, 2022
Linked issue: Aggregate func Approx_median not work with Parquet format