Commit 79e9f9d
fix: SanityCheckPlan error with window functions and NVL filter (apache#20231)
## Which issue does this PR close?
Closes apache#20194
## Rationale for this change
A query with `ROW_NUMBER() OVER (... ORDER BY CASE WHEN col='0' THEN 1
ELSE 0 END)` combined with a filter `nvl(t2.value_2_3,'0')='0'` fails
with a `SanityCheckPlan` error. This worked in 50.3.0 but broke in
52.1.0.
## What changes are included in this PR?
**Root cause**: `collect_columns_from_predicate_inner` was extracting
equality pairs where neither side was a `Column` (e.g. `nvl(col, '0') =
'0'`), creating equivalence classes between complex expressions and
literals. `normalize_expr`'s deep traversal would then replace the
literal `'0'` inside unrelated sort/window CASE WHEN expressions with
the complex NVL expression, corrupting the sort ordering and causing a
mismatch between `SortExec`'s reported output ordering and
`BoundedWindowAggExec`'s expected ordering.
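The failure mode described above can be sketched with a toy model. This is only an illustration under stated assumptions: the `Expr` enum, `EquivalenceClass`, `normalize`, and the helper constructors are hypothetical stand-ins and not DataFusion's types or API. It models only the `WHEN` condition of the sort key, and assumes the class representative is its first member.

```rust
// Toy model only: these types are hypothetical, not DataFusion's API.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    // Function call such as nvl(col, '0').
    Call(String, Vec<Expr>),
    // Equality comparison, e.g. the WHEN condition of a CASE expression.
    Eq(Box<Expr>, Box<Expr>),
}

/// One equivalence class. In this sketch every member is rewritten to
/// `members[0]` (an assumption made for illustration).
struct EquivalenceClass {
    members: Vec<Expr>,
}

/// Deep normalization: any sub-expression that belongs to the class is
/// replaced by the class representative, recursing into children.
fn normalize(expr: &Expr, class: &EquivalenceClass) -> Expr {
    if class.members.contains(expr) {
        return class.members[0].clone();
    }
    match expr {
        Expr::Call(name, args) => Expr::Call(
            name.clone(),
            args.iter().map(|a| normalize(a, class)).collect(),
        ),
        Expr::Eq(l, r) => Expr::Eq(
            Box::new(normalize(l, class)),
            Box::new(normalize(r, class)),
        ),
        other => other.clone(),
    }
}

// Small constructors to keep the example readable.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}
fn lit(value: &str) -> Expr {
    Expr::Literal(value.to_string())
}
fn nvl(arg: Expr, default: Expr) -> Expr {
    Expr::Call("nvl".to_string(), vec![arg, default])
}

/// Builds the class that the filter `nvl(value_2_3, '0') = '0'` would create
/// before the fix, then normalizes the unrelated sort condition `col = '0'`.
fn corrupted_sort_condition() -> Expr {
    let class = EquivalenceClass {
        members: vec![nvl(col("value_2_3"), lit("0")), lit("0")],
    };
    let sort_condition = Expr::Eq(Box::new(col("col")), Box::new(lit("0")));
    normalize(&sort_condition, &class)
}
```

In this sketch, `corrupted_sort_condition()` returns `col = nvl(value_2_3, '0')` instead of `col = '0'`: the literal inside an expression that has nothing to do with the filter has been swapped for the NVL call, which is the kind of rewrite that made the two operators disagree about the ordering.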
**Fix** (two changes in `filter.rs`):
1. **`collect_columns_from_predicate_inner`**: Only extract equality
pairs where at least one side is a `Column` reference. This matches the
function's documented intent ("Column-Pairs") and prevents
complex-expression-to-literal equivalence classes from being created.
2. **`extend_constants`**: Recognize `Literal` expressions as inherently
constant. Previously the function only consulted `is_expr_constant` on the
input's equivalence properties, which do not know about literals. This
ensures constant propagation still works for `complex_expr = literal`
predicates; e.g. `nvl(col, '0')` is properly marked as constant after the filter.
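The two changes can be sketched roughly as follows. This is a simplified model, not the code in `filter.rs`: the `Expr` enum and the functions `toy_collect_equal_pairs` and `toy_extend_constants` are hypothetical names introduced for illustration, and predicates are represented as plain `(left, right)` equality pairs.

```rust
// Toy model only: these types and functions are hypothetical,
// loosely modeled on the behavior described in the fix.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    // Function call such as nvl(col, '0').
    Call(String, Vec<Expr>),
}

/// Change 1 (sketch): keep an equality pair for equivalence classes only
/// when at least one side is a plain column reference.
fn toy_collect_equal_pairs(predicates: &[(Expr, Expr)]) -> Vec<(Expr, Expr)> {
    predicates
        .iter()
        .filter(|(l, r)| matches!(l, Expr::Column(_)) || matches!(r, Expr::Column(_)))
        .cloned()
        .collect()
}

/// Change 2 (sketch): a literal counts as constant on its own, so in
/// `expr = literal` the other side becomes constant after the filter.
/// `is_input_constant` stands in for whatever the input already knows.
fn toy_extend_constants(
    predicates: &[(Expr, Expr)],
    is_input_constant: impl Fn(&Expr) -> bool,
) -> Vec<Expr> {
    let is_const = |e: &Expr| matches!(e, Expr::Literal(_)) || is_input_constant(e);
    let mut constants = Vec::new();
    for (l, r) in predicates {
        if is_const(l) && !is_const(r) {
            constants.push(r.clone());
        } else if is_const(r) && !is_const(l) {
            constants.push(l.clone());
        }
    }
    constants
}
```

With the predicate `nvl(value_2_3, '0') = '0'`, the first function yields no pair (neither side is a column), while the second still reports the NVL call as constant, so the filter's information is kept without creating an equivalence class that could rewrite unrelated literals.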
## How was this tested?
- Unit test `test_collect_columns_skips_non_column_pairs` verifying the
filtering logic
- Sqllogictest reproducing the exact query from the issue
- Full test suites: equivalence tests (51 passed), physical-plan tests
(1255 passed), physical-optimizer tests (20 passed)
- Manual verification with datafusion-cli running the reproduction query
## Test plan
- [x] Unit test for `collect_columns_from_predicate_inner` column
filtering
- [x] Sqllogictest regression test for apache#20194
- [x] Existing test suites pass
- [x] Manual reproduction query succeeds
---------
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
1 parent 9dd1def commit 79e9f9d
1 file changed
Lines changed: 42 additions & 0 deletions