Commit 74ea766

Bring dynamic filtering cherry picks (#70)
* Fix dynamic filter is_used function (apache#19734)

## Which issue does this PR close?

- Closes apache#19715.

## Rationale for this change

The `is_used()` API incorrectly returned false for custom `DataSource` implementations that didn't call `reassign_expr_columns()` -> `with_new_children()`. This caused `HashJoinExec` to skip computing dynamic filters even when they were actually being used.

## What changes are included in this PR?

Updated `is_used()` to check both the outer and inner Arc counts.

## Are these changes tested?

Functionality is covered by the existing test `test_hashjoin_dynamic_filter_pushdown_is_used`. I was not sure whether to add a repro, since it would require adding a custom `DataSource`; the current tests in datafusion/core/tests/physical_optimizer/filter_pushdown/mod.rs use `FileScanConfig`.

## Are there any user-facing changes?

No.

(cherry picked from commit 278950a)

* Simplify wait_complete function (apache#19937)

## Which issue does this PR close?

## Rationale for this change

The current v52 signature `pub async fn wait_complete(self: &Arc<Self>)` (introduced in apache#19546) is a bit unergonomic. The method requires `&Arc<DynamicFilterPhysicalExpr>`, but when working with `Arc<dyn PhysicalExpr>`, downcasting only gives you `&DynamicFilterPhysicalExpr`. Since you can't convert `&DynamicFilterPhysicalExpr` to `Arc<DynamicFilterPhysicalExpr>`, the method becomes impossible to call.

The `&Arc<Self>` param was used to check `is_used()` via the Arc strong count, but this was overly defensive.

## What changes are included in this PR?

- Changed the `DynamicFilterPhysicalExpr::wait_complete` signature from `pub async fn wait_complete(self: &Arc<Self>)` to `pub async fn wait_complete(&self)`.
- Removed the `is_used()` check from `wait_complete()`. This method, like `wait_update()`, should only be called on filters that have consumers. If the caller doesn't know whether the filter has consumers, they should call `is_used()` first to avoid waiting indefinitely.

This approach avoids complex signatures and dependencies between the API's methods.

## Are these changes tested?

Yes, existing tests cover this functionality. I removed the "mock" consumer from `test_hash_join_marks_filter_complete_empty_build_side` and `test_hash_join_marks_filter_complete`, since the fix in apache#19734 makes `is_used()` check the outer struct's `strong_count` as well.

## Are there any user-facing changes?

The signature of `wait_complete` changed.

(cherry picked from commit bef1368)
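
The downcasting problem described in the rationale can be reproduced with the standard library alone. The sketch below is only an illustration: `Expr`, `DynFilter`, `is_complete_via_arc`, `is_complete_via_ref` and `check` are hypothetical stand-ins, and the methods are synchronous rather than async like the real `wait_complete`. It shows that `downcast_ref` on a type-erased `Arc<dyn Expr>` yields a plain reference, so only the `&self` form is callable.

```rust
use std::any::Any;
use std::sync::Arc;

// Hypothetical, simplified stand-in for a trait such as `PhysicalExpr`.
trait Expr {
    fn as_any(&self) -> &dyn Any;
}

// Hypothetical stand-in for `DynamicFilterPhysicalExpr`.
struct DynFilter {
    complete: bool,
}

impl Expr for DynFilter {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl DynFilter {
    // Shape of the old signature: the receiver must be `&Arc<DynFilter>`.
    fn is_complete_via_arc(self: &Arc<Self>) -> bool {
        self.complete
    }

    // Shape of the new signature: any `&DynFilter` works.
    fn is_complete_via_ref(&self) -> bool {
        self.complete
    }
}

// Starting from a type-erased handle, downcasting yields `&DynFilter`,
// never `&Arc<DynFilter>`.
fn check(expr: &Arc<dyn Expr>) -> Option<bool> {
    let filter: &DynFilter = expr.as_any().downcast_ref::<DynFilter>()?;
    // `filter.is_complete_via_arc()` would not compile here, because the
    // receiver type `&Arc<DynFilter>` cannot be produced from `&DynFilter`.
    Some(filter.is_complete_via_ref())
}

fn main() {
    let concrete = Arc::new(DynFilter { complete: true });
    // With the concrete Arc in hand, the old shape is callable.
    assert!(concrete.is_complete_via_arc());

    // After type erasure, only the `&self` shape is reachable.
    let erased: Arc<dyn Expr> = concrete;
    assert_eq!(check(&erased), Some(true));
}
```

The trade-off is the one the message names: a `&self` method can no longer inspect the outer Arc's strong count, so a caller who is unsure about consumers has to check `is_used()` separately.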
1 parent a2a464e commit 74ea766

2 files changed

Lines changed: 18 additions & 27 deletions

datafusion/physical-expr/src/expressions/dynamic_filters.rs

Lines changed: 16 additions & 13 deletions
@@ -272,6 +272,10 @@ impl DynamicFilterPhysicalExpr {
     ///
     /// This method will return when [`Self::update`] is called and the generation increases.
     /// It does not guarantee that the filter is complete.
+    ///
+    /// Producers (e.g.) HashJoinExec may never update the expression or mark it as completed if there are no consumers.
+    /// If you call this method on a dynamic filter created by such a producer and there are no consumers registered this method would wait indefinitely.
+    /// This should not happen under normal operation and would indicate a programming error either in your producer or in DataFusion if the producer is a built in node.
     pub async fn wait_update(&self) {
         let mut rx = self.state_watch.subscribe();
         // Get the current generation
@@ -283,17 +287,16 @@ impl DynamicFilterPhysicalExpr {
 
     /// Wait asynchronously until this dynamic filter is marked as complete.
     ///
-    /// This method returns immediately if the filter is already complete or if the filter
-    /// is not being used by any consumers.
+    /// This method returns immediately if the filter is already complete.
     /// Otherwise, it waits until [`Self::mark_complete`] is called.
     ///
     /// Unlike [`Self::wait_update`], this method guarantees that when it returns,
     /// the filter is fully complete with no more updates expected.
-    pub async fn wait_complete(self: &Arc<Self>) {
-        if !self.is_used() {
-            return;
-        }
-
+    ///
+    /// Producers (e.g.) HashJoinExec may never update the expression or mark it as completed if there are no consumers.
+    /// If you call this method on a dynamic filter created by such a producer and there are no consumers registered this method would wait indefinitely.
+    /// This should not happen under normal operation and would indicate a programming error either in your producer or in DataFusion if the producer is a built in node.
+    pub async fn wait_complete(&self) {
         if self.inner.read().is_complete {
             return;
         }
@@ -310,14 +313,14 @@ impl DynamicFilterPhysicalExpr {
     /// that created the filter). This is useful to avoid computing expensive filter
     /// expressions when no consumer will actually use them.
     ///
-    /// Note: We check the inner Arc's strong_count, not the outer Arc's count, because
-    /// when filters are transformed (e.g., via reassign_expr_columns during filter pushdown),
-    /// new outer Arc instances are created via with_new_children(), but they all share the
-    /// same inner `Arc<RwLock<Inner>>`. This is what allows filter updates to propagate to
-    /// consumers even after transformation.
+    /// # Implementation Details
+    ///
+    /// We check both Arc counts to handle two cases:
+    /// - Transformed filters (via `with_new_children`) share the inner Arc (inner count > 1)
+    /// - Direct clones (via `Arc::clone`) increment the outer count (outer count > 1)
     pub fn is_used(self: &Arc<Self>) -> bool {
         // Strong count > 1 means at least one consumer is holding a reference beyond the producer.
-        Arc::strong_count(&self.inner) > 1
+        Arc::strong_count(self) > 1 || Arc::strong_count(&self.inner) > 1
     }
 
     fn render(
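
The two cases named in the new doc comment can be modelled with the standard library alone. This is a minimal sketch under stated assumptions, not the DataFusion implementation: `Filter`, `Inner`, `generation` and `is_used_inner_only` are hypothetical names, `with_new_children` is reduced to "new outer Arc, same inner Arc", and `std::sync::RwLock` stands in for whatever lock the real type uses. It shows that the old inner-only check misses a direct `Arc::clone`, while the combined check catches both kinds of consumer.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for the shared mutable state of a dynamic filter.
struct Inner {
    generation: u64,
}

// Hypothetical stand-in for the filter expression itself.
struct Filter {
    inner: Arc<RwLock<Inner>>,
}

impl Filter {
    fn new() -> Arc<Self> {
        Arc::new(Filter {
            inner: Arc::new(RwLock::new(Inner { generation: 0 })),
        })
    }

    // Models a transformation: a new outer Arc that shares the inner state.
    fn with_new_children(self: &Arc<Self>) -> Arc<Self> {
        Arc::new(Filter {
            inner: Arc::clone(&self.inner),
        })
    }

    // Shape of the old check: only the inner count is consulted.
    fn is_used_inner_only(self: &Arc<Self>) -> bool {
        Arc::strong_count(&self.inner) > 1
    }

    // Shape of the new check: either count above one signals a consumer.
    fn is_used(self: &Arc<Self>) -> bool {
        Arc::strong_count(self) > 1 || Arc::strong_count(&self.inner) > 1
    }
}

fn main() {
    let producer = Filter::new();
    assert!(!producer.is_used());

    // Case 1: a consumer holds a direct clone of the outer Arc.
    let direct = Arc::clone(&producer);
    assert!(!producer.is_used_inner_only()); // the old check misses it
    assert!(producer.is_used()); // the new check sees it
    drop(direct);
    assert!(!producer.is_used());

    // Case 2: a consumer holds a transformed copy sharing the inner Arc.
    let transformed = producer.with_new_children();
    assert!(producer.is_used_inner_only());
    assert!(producer.is_used());

    // Updates made through the producer are visible to the transformed copy.
    producer.inner.write().unwrap().generation += 1;
    assert_eq!(transformed.inner.read().unwrap().generation, 1);
}
```

Case 1 corresponds to the situation described in the commit message, where a consumer keeps the filter without going through `with_new_children()`.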

datafusion/physical-plan/src/joins/hash_join/exec.rs

Lines changed: 2 additions & 14 deletions
@@ -513,10 +513,8 @@ impl HashJoinExec {
     ///
     /// This method is intended for testing only and should not be used in production code.
     #[doc(hidden)]
-    pub fn dynamic_filter_for_test(&self) -> Option<Arc<DynamicFilterPhysicalExpr>> {
-        self.dynamic_filter
-            .as_ref()
-            .map(|df| Arc::clone(&df.filter))
+    pub fn dynamic_filter_for_test(&self) -> Option<&Arc<DynamicFilterPhysicalExpr>> {
+        self.dynamic_filter.as_ref().map(|df| &df.filter)
     }
 
     /// Calculate order preservation flags for this hash join.
@@ -4635,11 +4633,6 @@ mod tests {
         let dynamic_filter = HashJoinExec::create_dynamic_filter(&on);
         let dynamic_filter_clone = Arc::clone(&dynamic_filter);
 
-        // Simulate a consumer by creating a transformed copy (what happens during filter pushdown)
-        let _consumer = Arc::clone(&dynamic_filter)
-            .with_new_children(vec![])
-            .unwrap();
-
         // Create HashJoinExec with the dynamic filter
         let mut join = HashJoinExec::try_new(
             left,
@@ -4688,11 +4681,6 @@ mod tests {
         let dynamic_filter = HashJoinExec::create_dynamic_filter(&on);
         let dynamic_filter_clone = Arc::clone(&dynamic_filter);
 
-        // Simulate a consumer by creating a transformed copy (what happens during filter pushdown)
-        let _consumer = Arc::clone(&dynamic_filter)
-            .with_new_children(vec![])
-            .unwrap();
-
         // Create HashJoinExec with the dynamic filter
         let mut join = HashJoinExec::try_new(
             left,
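
The commit message does not explain why `dynamic_filter_for_test` now returns a reference. One observable effect, though, is that an accessor which clones the Arc raises the outer strong count, which the updated `is_used()` now consults, whereas a borrowing accessor leaves it unchanged. The sketch below illustrates only that effect; `Filter`, `Join`, `filter_cloned` and `filter_ref` are hypothetical stand-ins, not DataFusion items.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the dynamic filter expression.
struct Filter;

// Hypothetical stand-in for an operator that owns an optional filter.
struct Join {
    filter: Option<Arc<Filter>>,
}

impl Join {
    // Shape of the old accessor: hands out a new owning handle.
    fn filter_cloned(&self) -> Option<Arc<Filter>> {
        self.filter.as_ref().map(Arc::clone)
    }

    // Shape of the new accessor: borrows the existing handle.
    fn filter_ref(&self) -> Option<&Arc<Filter>> {
        self.filter.as_ref()
    }
}

fn main() {
    let join = Join {
        filter: Some(Arc::new(Filter)),
    };

    // Borrowing does not change the outer strong count.
    let borrowed = join.filter_ref().unwrap();
    assert_eq!(Arc::strong_count(borrowed), 1);

    // Cloning raises it for as long as the clone is alive.
    let owned = join.filter_cloned().unwrap();
    assert_eq!(Arc::strong_count(&owned), 2);
    drop(owned);
    assert_eq!(Arc::strong_count(borrowed), 1);
}
```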
