Skip to content

Commit 97453f2

Browse files
committed
Fix dynamic filter is_used function (apache#19734)
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes apache#123` indicates that this PR will close issue apache#123. --> - Closes apache#19715. ## Rationale for this change The:is_used() API incorrectly returned false for custom `DataSource` implementations that didn't call reassign_expr_columns() -> with_new_children() . This caused `HashJoinExec` to skip computing dynamic filters even when they were actually being used. ## What changes are included in this PR? Updated is_used() to check both outer and inner Arc counts ## Are these changes tested? Functionality is covered by existing test `test_hashjoin_dynamic_filter_pushdown_is_used`. I was not sure if to add a repro since it would require adding a custom `DataSource`, the current tests in datafusion/core/tests/physical_optimizer/filter_pushdown/mod.rs use `FileScanConfig` ## Are there any user-facing changes? no (cherry picked from commit 278950a)
1 parent 9f3ddce commit 97453f2

2 files changed

Lines changed: 8 additions & 10 deletions

File tree

datafusion/physical-expr/src/expressions/dynamic_filters.rs

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -310,14 +310,14 @@ impl DynamicFilterPhysicalExpr {
310310
/// that created the filter). This is useful to avoid computing expensive filter
311311
/// expressions when no consumer will actually use them.
312312
///
313-
/// Note: We check the inner Arc's strong_count, not the outer Arc's count, because
314-
/// when filters are transformed (e.g., via reassign_expr_columns during filter pushdown),
315-
/// new outer Arc instances are created via with_new_children(), but they all share the
316-
/// same inner `Arc<RwLock<Inner>>`. This is what allows filter updates to propagate to
317-
/// consumers even after transformation.
313+
/// # Implementation Details
314+
///
315+
/// We check both Arc counts to handle two cases:
316+
/// - Transformed filters (via `with_new_children`) share the inner Arc (inner count > 1)
317+
/// - Direct clones (via `Arc::clone`) increment the outer count (outer count > 1)
318318
pub fn is_used(self: &Arc<Self>) -> bool {
319319
// Strong count > 1 means at least one consumer is holding a reference beyond the producer.
320-
Arc::strong_count(&self.inner) > 1
320+
Arc::strong_count(self) > 1 || Arc::strong_count(&self.inner) > 1
321321
}
322322

323323
fn render(

datafusion/physical-plan/src/joins/hash_join/exec.rs

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -513,10 +513,8 @@ impl HashJoinExec {
513513
///
514514
/// This method is intended for testing only and should not be used in production code.
515515
#[doc(hidden)]
516-
pub fn dynamic_filter_for_test(&self) -> Option<Arc<DynamicFilterPhysicalExpr>> {
517-
self.dynamic_filter
518-
.as_ref()
519-
.map(|df| Arc::clone(&df.filter))
516+
pub fn dynamic_filter_for_test(&self) -> Option<&Arc<DynamicFilterPhysicalExpr>> {
517+
self.dynamic_filter.as_ref().map(|df| &df.filter)
520518
}
521519

522520
/// Calculate order preservation flags for this hash join.

0 commit comments

Comments
 (0)