[SPARK-55692][SQL] Fix SupportsRuntimeFiltering and SupportsRuntimeV2Filtering documentation

peter-toth · dongjoon-hyun · commit 366550689fc7 · 2026-02-26T08:59:18.000-08:00
### What changes were proposed in this pull request?

This is a follow-up to #38924 to clarify the behaviour of scans with runtime filters.

### Why are the changes needed?

Please see the discussion at #54330 (comment).

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This is a documentation change.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #54490 from peter-toth/SPARK-55692-fix-supportsruntimefiltering-docs.

Authored-by: Peter Toth <peter.toth@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java
@@ -49,10 +49,11 @@ public interface SupportsRuntimeFiltering extends SupportsRuntimeV2Filtering {
    * <p>
    * If the scan also implements {@link SupportsReportPartitioning}, it must preserve
    * the originally reported partitioning during runtime filtering. While applying runtime filters,
-   * the scan may detect that some {@link InputPartition}s have no matching data. It can omit
-   * such partitions entirely only if it does not report a specific partitioning. Otherwise,
-   * the scan can replace the initially planned {@link InputPartition}s that have no matching
-   * data with empty {@link InputPartition}s but must preserve the overall number of partitions.
+   * the scan may detect that some {@link InputPartition}s have no matching data, in which case
+   * it can either replace the initially planned {@link InputPartition}s that have no matching data
+   * with empty {@link InputPartition}s, or report only a subset of the original partition values
+   * (omitting those with no data) via {@link Batch#planInputPartitions()}. The scan must not report
+   * new partition values that were not present in the original partitioning.
    * <p>
    * Note that Spark will call {@link Scan#toBatch()} again after filtering the scan at runtime.
    *
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeV2Filtering.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeV2Filtering.java
@@ -53,11 +53,11 @@ public interface SupportsRuntimeV2Filtering extends Scan {
    * <p>
    * If the scan also implements {@link SupportsReportPartitioning}, it must preserve
    * the originally reported partitioning during runtime filtering. While applying runtime
-   * predicates, the scan may detect that some {@link InputPartition}s have no matching data. It
-   * can omit such partitions entirely only if it does not report a specific partitioning.
-   * Otherwise, the scan can replace the initially planned {@link InputPartition}s that have no
-   * matching data with empty {@link InputPartition}s but must preserve the overall number of
-   * partitions.
+   * predicates, the scan may detect that some {@link InputPartition}s have no matching data, in
+   * which case it can either replace the initially planned {@link InputPartition}s that have no
+   * matching data with empty {@link InputPartition}s, or report only a subset of the original
+   * partition values (omitting those with no data) via {@link Batch#planInputPartitions()}. The
+   * scan must not report new partition values that were not present in the original partitioning.
    * <p>
    * Note that Spark will call {@link Scan#toBatch()} again after filtering the scan at runtime.
    *
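
The revised Javadoc gives a scan that reports a partitioning two ways to handle partitions with no matching data after runtime filtering, plus one prohibition. The following is a minimal sketch of that contract in plain Java, not Spark code: `Part`, `replaceWithEmpty`, `reportSubset`, `introducesNoNewValues`, and `samplePlan` are hypothetical names invented for illustration, and a `Predicate<String>` on the partition value stands in for the runtime filter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class RuntimeFilterContractSketch {

    /** Hypothetical stand-in for an InputPartition: one partition value and its rows. */
    static final class Part {
        final String value;
        final List<String> rows;

        Part(String value, List<String> rows) {
            this.value = value;
            this.rows = rows;
        }
    }

    /** Option 1: keep every planned partition, but empty out the ones the filter rejects. */
    static List<Part> replaceWithEmpty(List<Part> planned, Predicate<String> keep) {
        List<Part> out = new ArrayList<>();
        for (Part p : planned) {
            out.add(keep.test(p.value) ? p : new Part(p.value, Collections.<String>emptyList()));
        }
        return out;
    }

    /** Option 2: report only the subset of original partition values that still match. */
    static List<Part> reportSubset(List<Part> planned, Predicate<String> keep) {
        List<Part> out = new ArrayList<>();
        for (Part p : planned) {
            if (keep.test(p.value)) {
                out.add(p);
            }
        }
        return out;
    }

    /** The prohibition: a filtered plan must not contain values absent from the original plan. */
    static boolean introducesNoNewValues(List<Part> original, List<Part> filtered) {
        Set<String> originalValues = new HashSet<>();
        for (Part p : original) {
            originalValues.add(p.value);
        }
        for (Part p : filtered) {
            if (!originalValues.contains(p.value)) {
                return false;
            }
        }
        return true;
    }

    /** Illustrative plan with three partition values: a, b, c. */
    static List<Part> samplePlan() {
        List<Part> planned = new ArrayList<>();
        planned.add(new Part("a", Collections.singletonList("row-a")));
        planned.add(new Part("b", Collections.singletonList("row-b")));
        planned.add(new Part("c", Collections.singletonList("row-c")));
        return planned;
    }

    public static void main(String[] args) {
        List<Part> planned = samplePlan();
        Predicate<String> onlyB = v -> v.equals("b"); // stand-in for a runtime filter

        List<Part> padded = replaceWithEmpty(planned, onlyB);
        List<Part> subset = reportSubset(planned, onlyB);

        System.out.println("padded partitions: " + padded.size());
        System.out.println("subset partitions: " + subset.size());
        System.out.println("padded valid: " + introducesNoNewValues(planned, padded));
        System.out.println("subset valid: " + introducesNoNewValues(planned, subset));
    }
}
```

In this sketch both strategies pass the `introducesNoNewValues` check, while a plan containing a value that was never in the original partitioning would fail it. How a real connector maps these choices onto `Batch#planInputPartitions()` depends on its own partitioning implementation.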