fix: join swap for projected semi/anti joins#13022
Conversation
|
cc @berkaysynnada @my-vegetable-has-exploded Perhaps you could review this PR, due to still being in context after #12967? |
Dandandan
left a comment
There was a problem hiding this comment.
This makes sense to me & looks good. It would be good to get another review
| case::left(JoinType::Left, vec![1], true), | ||
| case::right(JoinType::Right, vec![1], true), | ||
| case::full(JoinType::Full, vec![1], true), | ||
| case::left_anti(JoinType::LeftAnti, vec![0], false), |
There was a problem hiding this comment.
I think it will be better to add a swap test for LeftAnti & LeftSemi?
There was a problem hiding this comment.
Do you mean a separate one? Initially I was going to do so, but realized that that it'll result in 3 almost equivalent tests ("normal" joins / left semi-anti / right semi-anti) except for minor differences, so decided that it could be better to parameterize test code to highlight these differences and reduce code duplication.
There was a problem hiding this comment.
Since small_on_right is false for LeftAnti & LeftSemi, the swap rule don't work for those tests. But it doesn't matters.
berkaysynnada
left a comment
There was a problem hiding this comment.
LGTM, thank you @korowa, nice catch
I see you've already did what #13022 (comment) mentions, but perhaps I couldn't understand the point, so not merging this yet
| }) | ||
| .collect() | ||
| }) | ||
| match join_type { |
Which issue does this PR close?
Closes #10978.
Rationale for this change
On join swap embedded projection for anti/semi joins is getting modified just like for any other join types. This leads to resulting embedded projection, containing column indices out of bounds of join output schema.
What changes are included in this PR?
The fix is not to modify output projection for semi/anti joins -- they already have one-side schema as an output, and trying to add/subtract the number of fields from the other input side to column indices will break the projection.
Are these changes tested?
Existing test for swapping join with embedded projection is extended to test all join types.
Are there any user-facing changes?
Swapping inputs for semi/anti joins with embedded projection should start working.