pkg/planner: avoid wrong outer join simplification with nested IN#67764
pkg/planner: avoid wrong outer join simplification with nested IN#67764hawkingrei wants to merge 6 commits intopingcap:masterfrom
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdded two regression tests for UNION ALL vs CTE behavior, updated null-rejection logic to detect/handle nested IN expressions, adjusted test BUILD deps, and added unit tests and helpers validating nested-IN null-rejection behavior. Changes
Sequence Diagram(s)(Skipped — changes are localized to planner expression handling and tests; no multi-component sequential flow requiring visualization.) Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.4)Command failed Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/planner/core/operator/logicalop/logical_join.go (1)
362-377: Clarify that this helper detects descendantIN, not rootIN.The current behavior is subtle (Line 362 returns
falsefor a top-levelINexpression). A short comment here will prevent accidental misuse.As per coding guidelines, "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs, and SHOULD NOT restate what the code already makes clear."Suggested clarification
+// hasNestedIn returns true when any descendant scalar-function node is `IN`. +// Note: a top-level `IN` root intentionally returns false. func hasNestedIn(expr expression.Expression) bool { sf, ok := expr.(*expression.ScalarFunction) if !ok { return false }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/operator/logicalop/logical_join.go` around lines 362 - 377, hasNestedIn currently returns false for a top-level IN and only detects descendant/child IN expressions, which is subtle and can be misused; update the code by adding a concise comment above the hasNestedIn function clarifying that the helper detects nested/descendant IN expressions (not root-level IN) and documenting the invariant/intent for callers (reference function name hasNestedIn and the check for expr.(*expression.ScalarFunction) and child.FuncName.L == ast.In) so future readers understand its exact behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/planner/core/operator/logicalop/logical_join.go`:
- Around line 362-377: hasNestedIn currently returns false for a top-level IN
and only detects descendant/child IN expressions, which is subtle and can be
misused; update the code by adding a concise comment above the hasNestedIn
function clarifying that the helper detects nested/descendant IN expressions
(not root-level IN) and documenting the invariant/intent for callers (reference
function name hasNestedIn and the check for expr.(*expression.ScalarFunction)
and child.FuncName.L == ast.In) so future readers understand its exact behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 66bdfbc4-eabd-4969-b748-e0d2bf8a7064
📒 Files selected for processing (10)
.github/workflows/update-bazel-files.ymlpkg/planner/core/casetest/schema/cannot_find_column_test.gopkg/planner/core/casetest/schema/testdata/cannot_find_column_suite_in.jsonpkg/planner/core/casetest/schema/testdata/cannot_find_column_suite_out.jsonpkg/planner/core/casetest/schema/testdata/cannot_find_column_suite_xut.jsonpkg/planner/core/issuetest/planner_issue_test.gopkg/planner/core/operator/logicalop/logical_join.gopkg/planner/core/operator/logicalop/logical_top_n.gopkg/planner/core/operator/logicalop/logicalop_test/BUILD.bazelpkg/planner/core/operator/logicalop/logicalop_test/logical_operator_test.go
b052e15 to
9dcac1b
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #67764 +/- ##
================================================
- Coverage 77.6020% 77.1480% -0.4540%
================================================
Files 1981 1964 -17
Lines 548804 548846 +42
================================================
- Hits 425883 423424 -2459
- Misses 122111 125420 +3309
+ Partials 810 2 -808
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
/retest |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/planner/util/null_misc.go (1)
128-146: Add a brief intent comment forhasUnsafeNestedInlogic.This safety rule is subtle; a short comment describing why these nested
INshapes are treated as unsafe will reduce future regressions.As per coding guidelines, "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs."Proposed patch
-func hasUnsafeNestedIn(ctx base.PlanContext, schema *expression.Schema, expr expression.Expression, skipPlanCacheCheck bool) bool { +// hasUnsafeNestedIn detects nested IN patterns that cannot be proven null-rejected +// safely via current IN-list decomposition, so callers should conservatively treat +// the whole predicate as not null-rejected. +func hasUnsafeNestedIn(ctx base.PlanContext, schema *expression.Schema, expr expression.Expression, skipPlanCacheCheck bool) bool {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/util/null_misc.go` around lines 128 - 146, Add a short intent comment above the hasUnsafeNestedIn function explaining why specific nested IN expressions are considered unsafe: note that the function detects nested ScalarFunction IN nodes (child.FuncName.L == ast.In) that are not null-rejected via isNullRejectedInList and therefore can change semantics with NULLs (and affect plan caching), and that the recursion checks nested scalar functions for this unsafe shape; reference the function name hasUnsafeNestedIn and the helper isNullRejectedInList in the comment and briefly state the invariant the function enforces (i.e., treat non-null-rejected nested INs as unsafe for planning/plan-cache reuse).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/planner/util/null_misc.go`:
- Around line 128-146: Add a short intent comment above the hasUnsafeNestedIn
function explaining why specific nested IN expressions are considered unsafe:
note that the function detects nested ScalarFunction IN nodes (child.FuncName.L
== ast.In) that are not null-rejected via isNullRejectedInList and therefore can
change semantics with NULLs (and affect plan caching), and that the recursion
checks nested scalar functions for this unsafe shape; reference the function
name hasUnsafeNestedIn and the helper isNullRejectedInList in the comment and
briefly state the invariant the function enforces (i.e., treat non-null-rejected
nested INs as unsafe for planning/plan-cache reuse).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 4bed4773-2e00-4138-8e23-a283f4eda6ff
📒 Files selected for processing (5)
pkg/planner/core/issuetest/planner_issue_test.gopkg/planner/core/operator/logicalop/logical_join.gopkg/planner/util/BUILD.bazelpkg/planner/util/column_test.gopkg/planner/util/null_misc.go
✅ Files skipped from review due to trivial changes (2)
- pkg/planner/util/BUILD.bazel
- pkg/planner/core/operator/logicalop/logical_join.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/planner/core/issuetest/planner_issue_test.go
|
/retest |
|
@pantheon-bot review |
|
@hawkingrei I've received your request and will start reviewing the pull request. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes. ℹ️ Learn more details on Pantheon AI. |
What problem does this PR solve?
Issue Number: close #67373
Problem Summary:
RIGHT OUTER JOINcan be incorrectly simplified toINNER JOINwhen the filter is considerednull-rejected too aggressively. For expressions with nested
IN, this drops rows in repeatedderived-table
UNION ALLqueries, while the equivalent CTE form keeps the correct result.What changed and how does it work?
logical_join.goconservative for predicatescontaining nested
IN.matches the CTE result.
Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
Bug Fixes
Tests