[SPARK-41398][SQL][FOLLOWUP] Update runtime filtering javadoc to reflect relaxed partition constraints

yyanyy · cloud-fan · commit 38e51eb5e8d4 · 2026-01-29T10:54:26.000+08:00
Update the javadoc for `SupportsRuntimeV2Filtering.filter()` and `SupportsRuntimeFiltering.filter()` to reflect the changes made in [SPARK-41398](https://issues.apache.org/jira/browse/SPARK-41398), which relaxed the constraint on partition values during runtime filtering. After that change, scans can either:

- Replace partitions with no matching data with empty `InputPartition`s, or
- Report only a subset of the original partition values (omitting those with no data)

The previous documentation stated that the "overall number of partitions" must be preserved, which is no longer required. The only remaining constraint is that the scan must not introduce partition values that were not present in the original partitioning.

### What changes were proposed in this pull request?
A javadoc update to follow up on #38924.

### Why are the changes needed?
To keep the javadoc up to date for future implementers.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Compile and checkstyle.

### Was this patch authored or co-authored using generative AI tooling?
Yes - Claude Opus 4.5

Closes #54046 from yyanyy/spark-41398-update-javadoc.

Authored-by: Yan Yan <yyanyyyy@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java
@@ -51,8 +51,10 @@ public interface SupportsRuntimeFiltering extends SupportsRuntimeV2Filtering {
    * the originally reported partitioning during runtime filtering. While applying runtime filters,
    * the scan may detect that some {@link InputPartition}s have no matching data. It can omit
    * such partitions entirely only if it does not report a specific partitioning. Otherwise,
-   * the scan can replace the initially planned {@link InputPartition}s that have no matching
-   * data with empty {@link InputPartition}s but must preserve the overall number of partitions.
+   * the scan can either replace the initially planned {@link InputPartition}s that have no
+   * matching data with empty {@link InputPartition}s, or report only a subset of the original
+   * partition values (omitting those with no data). The scan must not report new partition values
+   * that were not present in the original partitioning.
    * <p>
    * Note that Spark will call {@link Scan#toBatch()} again after filtering the scan at runtime.
    *
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeV2Filtering.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeV2Filtering.java
@@ -55,9 +55,10 @@ public interface SupportsRuntimeV2Filtering extends Scan {
    * the originally reported partitioning during runtime filtering. While applying runtime
    * predicates, the scan may detect that some {@link InputPartition}s have no matching data. It
    * can omit such partitions entirely only if it does not report a specific partitioning.
-   * Otherwise, the scan can replace the initially planned {@link InputPartition}s that have no
-   * matching data with empty {@link InputPartition}s but must preserve the overall number of
-   * partitions.
+   * Otherwise, the scan can either replace the initially planned {@link InputPartition}s that
+   * have no matching data with empty {@link InputPartition}s, or report only a subset of the
+   * original partition values (omitting those with no data). The scan must not report new
+   * partition values that were not present in the original partitioning.
    * <p>
    * Note that Spark will call {@link Scan#toBatch()} again after filtering the scan at runtime.
    *
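
The two options permitted by the updated javadoc, and the one constraint that remains, can be illustrated with a minimal sketch. It deliberately does not use the Spark connector API: partition values are modeled as plain strings, an `InputPartition` as a list of rows, and the class and method names (`RuntimeFilterPartitionSketch`, `keepWithEmptyPartitions`, `reportSubset`, `introducesNoNewValues`) are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not Spark code: a "planned scan" is a map from
// partition value to the rows of the corresponding input partition.
public class RuntimeFilterPartitionSketch {

    // Option 1: keep every originally reported partition value, but replace
    // partitions without matching data by an empty row list (a stand-in for
    // an empty InputPartition).
    static Map<String, List<String>> keepWithEmptyPartitions(
            Map<String, List<String>> planned, Set<String> valuesWithData) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : planned.entrySet()) {
            boolean hasData = valuesWithData.contains(e.getKey());
            result.put(e.getKey(), hasData ? e.getValue() : new ArrayList<String>());
        }
        return result;
    }

    // Option 2: report only the subset of original partition values that
    // still have matching data, omitting the rest.
    static Map<String, List<String>> reportSubset(
            Map<String, List<String>> planned, Set<String> valuesWithData) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : planned.entrySet()) {
            if (valuesWithData.contains(e.getKey())) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    // The remaining constraint: filtering must not introduce a partition
    // value that was absent from the original partitioning.
    static boolean introducesNoNewValues(Set<String> original, Set<String> afterFiltering) {
        return original.containsAll(afterFiltering);
    }
}
```

In this model both options satisfy `introducesNoNewValues`, while only the first preserves the number of partitions, which is the requirement the old javadoc wording imposed.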