Propagate table constraints through physical plans to optimize sort operations#14111
Propagate table constraints through physical plans to optimize sort operations#14111berkaysynnada merged 43 commits intoapache:mainfrom gokselk:feature/physical-planner-functional-dependence
Conversation
…anner-functional-dependence
|
I left my reviews here: synnada-ai#53 (review) |
Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>
ozankabak
left a comment
There was a problem hiding this comment.
We are very close to the finish line. Let's iterate over my comments
| match constraint { | ||
| Constraint::PrimaryKey(indices) => { | ||
| let new_indices = | ||
| update_elements_with_matching_indices(indices, proj_indices); |
There was a problem hiding this comment.
If you refactor the update_elements_with_matching_indices function to take two impl Iterator's (you probably need to replace the looping order to do that), this function can also accept proj_indices as an impl Iterator.
There was a problem hiding this comment.
The update_elements_with_matching_indices function uses .position() on proj_indices, which makes it necessary to clone it if we take it as an impl Iterator. I think this defeats the whole purpose of this refactoring.
There was a problem hiding this comment.
What I had in mind was to swap the loop order (iterate on proj_indices on the outer loop). That may enable us to use an impl Iterator for proj_indices. We probably will need to keep the type of entries as a slice because it does not have an ordering (though we can enforce that in a future PR). Had entries was ordered, I think we could have also taken it in as an impl Iterator -- but let's leave the latter for a future PR
There was a problem hiding this comment.
We looked into this with @berkaysynnada and it seems to have some intricacies. Let's leave this to another PR
| .map(|col| col.index()) | ||
| .collect::<Vec<_>>(); | ||
| debug_assert_eq!(mapping.map.len(), indices.len()); | ||
| self.constraints.project(&indices) |
There was a problem hiding this comment.
You will not need to collect and materialize indices if you refactor project to accept an iterator. See my comment in functional_dependencies.rs.
ozankabak
left a comment
There was a problem hiding this comment.
I went through this carefully (twice) and it LGTM
|
I'll merge this PR once the main branch is all green |
|
Great efforts! Thank you @gokselk |
Which issue does this PR close?
Closes #14110.
Rationale for this change
This PR extends the physical planner to propagate table constraints (PRIMARY KEY and UNIQUE) through the query plan. This allows us to optimize sort operations by recognizing when ordering requirements are already satisfied by existing constraints.
What changes are included in this PR?
Constraintspropagation through physical plans including:EquivalencePropertiesto consider constraints when evaluating sort requirementsAre these changes tested?
Yes, the changes include:
Are there any user-facing changes?
The changes are mostly internal optimizations, but users will see: