feat: support Set Comparison Subquery#19109
Conversation
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
| async fn set_comparison_all_empty() -> Result<()> { | ||
| let ctx = SessionContext::new(); |
There was a problem hiding this comment.
how about adding
async fn set_comparison_type_mismatch() -> Result<()> {
// SELECT v FROM t WHERE v > ANY (SELECT s FROM strings)
// INT > STRING should error with clear message
...
too?
| )?; | ||
|
|
||
| let df = ctx | ||
| .sql("select v from t where v < all(select v from e)") |
There was a problem hiding this comment.
I think tests for:
- Multiple operators (
=,!=,>=,<=) - NULL semantics (e.g.,
5 != ALL (1, NULL)
would improve test coverage
There was a problem hiding this comment.
@waynexia
Thanks for the ping.
I will check again after you fix the clippy errors.
There was a problem hiding this comment.
Hi @kosiew, CI is all green now, please take another look, thank you!
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
d61929e to
bf60b20
Compare
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
| // Grow stacker segments earlier to avoid deep unparser recursion overflow in q20. | ||
| set_minimum_stack_size(512 * 1024); | ||
| set_stack_allocation_size(8 * 1024 * 1024); |
There was a problem hiding this comment.
This case got stack overflow after I rebased against the latest main branch, and this is a workaround. I'm thinking of rewriting unparser to avoid deep nested recursion, to some iterative flavor in the follow-up PR.
There was a problem hiding this comment.
Let's create an issue to track this.
|
Thank you @kosiew ❤️ I'm going to merge this tomorrow if there are no other comments |
Add a new `Expr::ArraySubquery(Subquery)` expression variant to represent SQL `ARRAY(SELECT ...)` subqueries. This is the first step toward supporting correlated array subqueries (e.g. BigQuery-style `ARRAY(SELECT x FROM UNNEST(arr) WHERE cond)`). This commit adds the variant and all required exhaustive match arms across the codebase, following the same pattern as `SetComparison` (added in apache#19109). No SQL parsing or optimizer rewrite is included yet — those will follow in subsequent commits. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes apache#123` indicates that this PR will close issue apache#123. --> - Closes apache#19082. ## Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> Support Set Comparison Subquery (or Quantified Comparison Subquery in some systems), it looks like ```sql ... WHERE t1.a > ANY(SELECT ... FROM t2) ... ``` or ```sql ... WHERE t1.a < ALL(SELECT ... FROM t2) ... ``` ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> - New `Expr` branch `Expr::SetComparison` and corresponding `SetComparison` struct - New optimizer rule `RewriteSetComparison` to translate this kind of subquery using existing components - Corresponding substrait and proto changes `RewriteSetComparison` is a very naive implementation. We can optimize it in various ways in the follow-up PRs, like using min/max for non-equal comparison operators etc. ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> Yes, by unit test and slt ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> To SQL user, this is a new feature. To API user, `ExprPlanner::plan_any` is removed (not needed) <!-- If there are any breaking changes to public APIs, please add the `api change` label. --> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Which issue does this PR close?
Rationale for this change
Support Set Comparison Subquery (or Quantified Comparison Subquery in some systems), it looks like
or
What changes are included in this PR?
ExprbranchExpr::SetComparisonand correspondingSetComparisonstructRewriteSetComparisonto translate this kind of subquery using existing componentsRewriteSetComparisonis a very naive implementation. We can optimize it in various ways in the follow-up PRs, like using min/max for non-equal comparison operators etc.Are these changes tested?
Yes, by unit test and slt
Are there any user-facing changes?
To SQL user, this is a new feature. To API user,
ExprPlanner::plan_anyis removed (not needed)