pkg/planner: avoid wrong outer join simplification with nested IN by hawkingrei · Pull Request #67764 · pingcap/tidb

hawkingrei · 2026-04-14T09:53:19Z

What problem does this PR solve?

Issue Number: close #67373

Problem Summary:

A RIGHT OUTER JOIN can be incorrectly simplified to an INNER JOIN when its filter is classified as
null-rejecting too aggressively. For predicates that contain a nested IN, this drops rows in queries
that repeat a derived table across UNION ALL branches, while the equivalent CTE form returns the correct result.
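
To illustrate why the null-reject classification matters, here is a minimal sketch of SQL three-valued logic in Go. An outer join may only be turned into an inner join when the filter evaluates to FALSE or NULL for rows whose inner-side columns are NULL. The types, helper names, and both predicates below are hypothetical and chosen for illustration; they are not TiDB's planner code and not the query from issue #67373.

```go
package main

import "fmt"

// Tri models SQL three-valued logic.
type Tri int

const (
	False Tri = iota
	True
	Unknown // SQL NULL
)

// Value is a nullable integer standing in for a column value or constant.
type Value struct {
	V    int
	Null bool
}

// Int wraps a non-NULL integer.
func Int(v int) Value { return Value{V: v} }

// NullValue is the SQL NULL value.
var NullValue = Value{Null: true}

// AsValue converts a boolean result to the 0/1/NULL value it would carry
// when used as an operand of another expression.
func AsValue(t Tri) Value {
	switch t {
	case True:
		return Int(1)
	case False:
		return Int(0)
	default:
		return NullValue
	}
}

// In evaluates `probe IN (list...)` under SQL NULL semantics.
func In(probe Value, list ...Value) Tri {
	if probe.Null {
		return Unknown
	}
	sawNull := false
	for _, item := range list {
		if item.Null {
			sawNull = true
			continue
		}
		if item.V == probe.V {
			return True
		}
	}
	if sawNull {
		return Unknown
	}
	return False
}

// NullRejected reports whether pred filters out a row whose inner-side
// column is NULL, that is, whether it evaluates to False or Unknown.
func NullRejected(pred func(col Value) Tri) bool {
	return pred(NullValue) != True
}

// ProbeSide is the hypothetical predicate `a IN (1, 2)`.
func ProbeSide(a Value) Tri { return In(a, Int(1), Int(2)) }

// ListSide is the hypothetical predicate `(1 IN (a, 1)) IN (1)`.
func ListSide(a Value) Tri {
	inner := In(Int(1), a, Int(1))
	return In(AsValue(inner), Int(1))
}

func main() {
	fmt.Println("a IN (1, 2) null-rejected:", NullRejected(ProbeSide))
	fmt.Println("(1 IN (a, 1)) IN (1) null-rejected:", NullRejected(ListSide))
}
```

In this sketch the first predicate rejects the NULL-extended row, while the second stays TRUE when `a` is NULL because the constant `1` still matches inside the list. Treating the second shape as null-rejecting would wrongly permit the outer-to-inner conversion.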

What changed and how does it work?

- Make the local outer-join null-reject check in logical_join.go conservative for predicates containing nested IN.
- Add a regression case that keeps the original issue shape and verifies the derived-table query matches the CTE result.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Fixed an issue where repeated derived tables over a RIGHT OUTER JOIN in UNION ALL could lose rows because the join was incorrectly simplified to an inner join.

Summary by CodeRabbit

Bug Fixes
- Fixed planner handling of nested IN expressions so null-rejection logic is correct.
- Corrected EXPLAIN and execution behavior for queries with “safe nested IN” to report and return empty results when appropriate.
- Ensured queries using CTEs and equivalent derived-table + UNION ALL yield consistent results.
Tests
- Added regression tests covering the nested-IN behaviors and CTE vs. UNION ALL equivalence.

coderabbitai · 2026-04-14T09:53:47Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 26757bea-76b9-413c-83d7-c449ff43a363

📥 Commits

Reviewing files that changed from the base of the PR and between ac8f4c6 and 477e749.

📒 Files selected for processing (2)

pkg/planner/core/operator/logicalop/logical_join.go
pkg/planner/util/null_misc.go

📝 Walkthrough

Walkthrough

Added two regression tests for UNION ALL vs CTE behavior, updated null-rejection logic to detect/handle nested IN expressions, adjusted test BUILD deps, and added unit tests and helpers validating nested-IN null-rejection behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Regression tests `pkg/planner/core/issuetest/planner_issue_test.go` | Added two regression test blocks: `repeated-derived-union-all-keeps-all-rows` (compares CTE vs repeated derived-table UNION ALL) and `safe-nested-in-still-allows-outer-to-inner` (plan assertion + empty-result check). |
| Planner null-rejection logic `pkg/planner/core/operator/logicalop/logical_join.go` | Added `containsNestedInDescendant` and adjusted `isNullRejected` to bypass null-rejection when nested `IN` descendants are present. |
| Null-rejection helpers & tests `pkg/planner/util/null_misc.go`, `pkg/planner/util/column_test.go` | Replaced a PlanContext type, added `hasUnsafeNestedIn` check to prevent treating predicates as null-rejected when inner `IN` is unsafe; added `null-rejected-nested-in` unit test and test helpers. |
| Build/test deps `pkg/importsdk/BUILD.bazel`, `pkg/planner/util/BUILD.bazel` | Test-only BUILD updates: added `//pkg/parser/ast` to `importsdk_test` deps and `//pkg/util/mock` to `util_test` deps. |

Sequence Diagram(s)

(Skipped — changes are localized to planner expression handling and tests; no multi-component sequential flow requiring visualization.)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

qw4990
guo-shaoge
AilinKid

Poem

🐰 I sniffed the plan where INs hid inside,

Peeked through derived tables where rows did hide,
I hopped in tests and BUILDs with a cheer,
Brought back the rows that had disappeared,
A tiny carrot dance — bug fixed, hop wide!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The pull request title clearly summarizes the main issue being fixed: avoiding incorrect simplification of outer joins when nested IN predicates are present. |
| Description check | ✅ Passed | The pull request description adequately covers the problem, solution approach, test coverage, and release notes as required by the template. |
| Linked Issues check | ✅ Passed | The changes directly address issue `#67373` by making the null-reject check conservative for nested IN predicates and adding regression tests verifying the fix. |
| Out of Scope Changes check | ✅ Passed | All changes align with the linked issue objectives: null-rejection logic refinements, helper function additions for nested IN detection, and regression test coverage. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

Command failed

coderabbitai

🧹 Nitpick comments (1)

pkg/planner/core/operator/logicalop/logical_join.go (1)

362-377: Clarify that this helper detects descendant IN, not root IN.

The current behavior is subtle (Line 362 returns false for a top-level IN expression). A short comment here will prevent accidental misuse.

Suggested clarification

+// hasNestedIn returns true when any descendant scalar-function node is `IN`.
+// Note: a top-level `IN` root intentionally returns false.
 func hasNestedIn(expr expression.Expression) bool {
 	sf, ok := expr.(*expression.ScalarFunction)
 	if !ok {
 		return false
 	}

As per coding guidelines, "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs, and SHOULD NOT restate what the code already makes clear."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/operator/logicalop/logical_join.go` around lines 362 - 377,
hasNestedIn currently returns false for a top-level IN and only detects
descendant/child IN expressions, which is subtle and can be misused; update the
code by adding a concise comment above the hasNestedIn function clarifying that
the helper detects nested/descendant IN expressions (not root-level IN) and
documenting the invariant/intent for callers (reference function name
hasNestedIn and the check for expr.(*expression.ScalarFunction) and
child.FuncName.L == ast.In) so future readers understand its exact behavior.
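
The descendant-only behavior described in this comment can be sketched on a toy expression tree. The `Expr`, `Column`, and `Func` types and the `HasDescendantIn` name are hypothetical stand-ins for `expression.Expression` and `expression.ScalarFunction`; this is an illustration of the shape of the check, not the helper from logical_join.go.

```go
package main

import "fmt"

// Expr is a toy expression node (hypothetical stand-in for TiDB's
// expression.Expression).
type Expr interface{}

// Column is a leaf that refers to a table column.
type Column struct{ Name string }

// Func is a scalar function call such as "in", "not", or "or".
type Func struct {
	Name string
	Args []Expr
}

// HasDescendantIn reports whether any node strictly below expr is an IN
// call. The root itself is never inspected, so a top-level IN returns false.
func HasDescendantIn(expr Expr) bool {
	f, ok := expr.(*Func)
	if !ok {
		return false
	}
	for _, arg := range f.Args {
		child, ok := arg.(*Func)
		if !ok {
			continue
		}
		if child.Name == "in" || HasDescendantIn(child) {
			return true
		}
	}
	return false
}

func main() {
	a := &Column{Name: "a"}
	rootIn := &Func{Name: "in", Args: []Expr{a, 1, 2}}
	notIn := &Func{Name: "not", Args: []Expr{rootIn}}

	fmt.Println("root IN:", HasDescendantIn(rootIn))
	fmt.Println("NOT(IN):", HasDescendantIn(notIn))
}
```

A caller that also needs to treat a root-level IN specially would have to check the root separately, which is the misuse risk the comment is asking the author to document.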

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/planner/core/operator/logicalop/logical_join.go`:
- Around line 362-377: hasNestedIn currently returns false for a top-level IN
and only detects descendant/child IN expressions, which is subtle and can be
misused; update the code by adding a concise comment above the hasNestedIn
function clarifying that the helper detects nested/descendant IN expressions
(not root-level IN) and documenting the invariant/intent for callers (reference
function name hasNestedIn and the check for expr.(*expression.ScalarFunction)
and child.FuncName.L == ast.In) so future readers understand its exact behavior.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 66bdfbc4-eabd-4969-b748-e0d2bf8a7064

📥 Commits

Reviewing files that changed from the base of the PR and between 5733f22 and b052e15.

📒 Files selected for processing (10)

.github/workflows/update-bazel-files.yml
pkg/planner/core/casetest/schema/cannot_find_column_test.go
pkg/planner/core/casetest/schema/testdata/cannot_find_column_suite_in.json
pkg/planner/core/casetest/schema/testdata/cannot_find_column_suite_out.json
pkg/planner/core/casetest/schema/testdata/cannot_find_column_suite_xut.json
pkg/planner/core/issuetest/planner_issue_test.go
pkg/planner/core/operator/logicalop/logical_join.go
pkg/planner/core/operator/logicalop/logical_top_n.go
pkg/planner/core/operator/logicalop/logicalop_test/BUILD.bazel
pkg/planner/core/operator/logicalop/logicalop_test/logical_operator_test.go

ti-chi-bot · 2026-04-14T10:12:13Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign winoros for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS
pkg/planner/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

codecov · 2026-04-14T10:38:15Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.1480%. Comparing base (5733f22) to head (477e749).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files

@@               Coverage Diff                @@
##             master     #67764        +/-   ##
================================================
- Coverage   77.6020%   77.1480%   -0.4540%     
================================================
  Files          1981       1964        -17     
  Lines        548804     548846        +42     
================================================
- Hits         425883     423424      -2459     
- Misses       122111     125420      +3309     
+ Partials        810          2       -808

Flag	Coverage Δ
integration	`40.8892% <48.5714%> (+6.5495%)`	⬆️
unit	`76.3372% <100.0000%> (+0.0078%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
dumpling	`61.5065% <ø> (ø)`
parser	`∅ <ø> (∅)`
br	`49.8452% <ø> (-10.5620%)`	⬇️

hawkingrei · 2026-04-14T15:30:41Z

/retest

coderabbitai

🧹 Nitpick comments (1)

pkg/planner/util/null_misc.go (1)

128-146: Add a brief intent comment for hasUnsafeNestedIn logic.

This safety rule is subtle; a short comment describing why these nested IN shapes are treated as unsafe will reduce future regressions.

Proposed patch

-func hasUnsafeNestedIn(ctx base.PlanContext, schema *expression.Schema, expr expression.Expression, skipPlanCacheCheck bool) bool {
+// hasUnsafeNestedIn detects nested IN patterns that cannot be proven null-rejected
+// safely via current IN-list decomposition, so callers should conservatively treat
+// the whole predicate as not null-rejected.
+func hasUnsafeNestedIn(ctx base.PlanContext, schema *expression.Schema, expr expression.Expression, skipPlanCacheCheck bool) bool {

As per coding guidelines, "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/util/null_misc.go` around lines 128 - 146, Add a short intent
comment above the hasUnsafeNestedIn function explaining why specific nested IN
expressions are considered unsafe: note that the function detects nested
ScalarFunction IN nodes (child.FuncName.L == ast.In) that are not null-rejected
via isNullRejectedInList and therefore can change semantics with NULLs (and
affect plan caching), and that the recursion checks nested scalar functions for
this unsafe shape; reference the function name hasUnsafeNestedIn and the helper
isNullRejectedInList in the comment and briefly state the invariant the function
enforces (i.e., treat non-null-rejected nested INs as unsafe for
planning/plan-cache reuse).
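
As a rough sketch of the invariant this comment asks to document, the following Go program flags a predicate only when it contains a nested IN that cannot be shown to reject NULL. Everything here is hypothetical: the toy `Node`, `Col`, `Lit`, and `Call` types, and in particular the safety rule in `inLooksNullRejected` (the probe is an inner-side column and every list item is a literal), which is an assumed, deliberately conservative stand-in for whatever `isNullRejectedInList` actually proves.

```go
package main

import "fmt"

// Node is a toy expression node (hypothetical, not TiDB's type).
type Node interface{}

// Col is a column; InnerSide marks the null-supplying side of the outer join.
type Col struct {
	Name      string
	InnerSide bool
}

// Lit is an integer literal.
type Lit struct{ Val int }

// Call is a scalar function call; Args[0] of an "in" call is the probe.
type Call struct {
	Name string
	Args []Node
}

// inLooksNullRejected is an ASSUMED rule for illustration: an IN call is
// taken as null-rejected only when its probe is an inner-side column and
// all list items are literals.
func inLooksNullRejected(in *Call) bool {
	if len(in.Args) < 2 {
		return false
	}
	probe, ok := in.Args[0].(*Col)
	if !ok || !probe.InnerSide {
		return false
	}
	for _, item := range in.Args[1:] {
		if _, isLit := item.(*Lit); !isLit {
			return false
		}
	}
	return true
}

// HasUnsafeNestedInSketch reports whether any IN strictly below expr fails
// the assumed rule above; callers would then treat the whole predicate as
// not null-rejected.
func HasUnsafeNestedInSketch(expr Node) bool {
	call, ok := expr.(*Call)
	if !ok {
		return false
	}
	for _, arg := range call.Args {
		child, ok := arg.(*Call)
		if !ok {
			continue
		}
		if child.Name == "in" && !inLooksNullRejected(child) {
			return true
		}
		if HasUnsafeNestedInSketch(child) {
			return true
		}
	}
	return false
}

func main() {
	a := &Col{Name: "a", InnerSide: true}
	safe := &Call{Name: "not", Args: []Node{
		&Call{Name: "in", Args: []Node{a, &Lit{1}, &Lit{2}}},
	}}
	unsafe := &Call{Name: "not", Args: []Node{
		&Call{Name: "in", Args: []Node{&Lit{1}, a, &Lit{1}}},
	}}
	fmt.Println("NOT(a IN (1, 2)) unsafe:", HasUnsafeNestedInSketch(safe))
	fmt.Println("NOT(1 IN (a, 1)) unsafe:", HasUnsafeNestedInSketch(unsafe))
}
```

The design choice illustrated is to err toward keeping the outer join: when a nested IN cannot be proven safe, the sketch reports it as unsafe rather than attempting a partial proof.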

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/planner/util/null_misc.go`:
- Around line 128-146: Add a short intent comment above the hasUnsafeNestedIn
function explaining why specific nested IN expressions are considered unsafe:
note that the function detects nested ScalarFunction IN nodes (child.FuncName.L
== ast.In) that are not null-rejected via isNullRejectedInList and therefore can
change semantics with NULLs (and affect plan caching), and that the recursion
checks nested scalar functions for this unsafe shape; reference the function
name hasUnsafeNestedIn and the helper isNullRejectedInList in the comment and
briefly state the invariant the function enforces (i.e., treat non-null-rejected
nested INs as unsafe for planning/plan-cache reuse).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4bed4773-2e00-4138-8e23-a283f4eda6ff

📥 Commits

Reviewing files that changed from the base of the PR and between 87a2c1c and ac8f4c6.

📒 Files selected for processing (5)

pkg/planner/core/issuetest/planner_issue_test.go
pkg/planner/core/operator/logicalop/logical_join.go
pkg/planner/util/BUILD.bazel
pkg/planner/util/column_test.go
pkg/planner/util/null_misc.go

✅ Files skipped from review due to trivial changes (2)

pkg/planner/util/BUILD.bazel
pkg/planner/core/operator/logicalop/logical_join.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/planner/core/issuetest/planner_issue_test.go

hawkingrei · 2026-04-15T09:19:46Z

/retest

hawkingrei · 2026-04-15T09:20:40Z

@pantheon-bot review

pantheon-ai · 2026-04-15T09:20:47Z

@hawkingrei I've received your request and will start reviewing the pull request. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

ti-chi-bot bot added labels on Apr 14, 2026: do-not-merge/needs-triage-completed; release-note (a PR that will be considered when it comes time to generate release notes); sig/planner (SIG: Planner); size/L (a PR that changes 100-499 lines, ignoring generated files).

coderabbitai bot reviewed Apr 14, 2026

pkg/planner: avoid wrong outer join simplification with nested IN

9dcac1b

hawkingrei force-pushed the issue-67373-fix-20260414 branch from b052e15 to 9dcac1b on April 14, 2026 10:08

ti-chi-bot bot added size/M (a PR that changes 30-99 lines, ignoring generated files) and removed size/L (a PR that changes 100-499 lines, ignoring generated files) on Apr 14, 2026

chore: update bazel file

87a2c1c

ti-chi-bot bot removed the do-not-merge/needs-triage-completed label Apr 14, 2026

pkg/planner: refine nested IN null-reject proof

0a9d743

ti-chi-bot bot added size/L (a PR that changes 100-499 lines, ignoring generated files) and removed size/M (a PR that changes 30-99 lines, ignoring generated files) on Apr 15, 2026

chore: update bazel file

ac8f4c6

coderabbitai bot reviewed Apr 15, 2026

hawkingrei added 2 commits April 15, 2026 10:53

pkg/planner: rename nested IN helper

bc8d98f

pkg/planner: clarify nested IN null-reject comments

477e749

hawkingrei closed this Apr 15, 2026

hawkingrei wants to merge 6 commits into pingcap:master from hawkingrei:issue-67373-fix-20260414
