
chore: Remove config option for native_iceberg_compat #4019

Draft

andygrove wants to merge 12 commits into apache:main from andygrove:remove-iceberg-compat

Conversation

@andygrove (Member) commented Apr 21, 2026

Which issue does this PR close?

Partially addresses #4020.

Rationale for this change

Removing support for native_iceberg_compat reduces the maintenance burden of the project.

This PR makes it impossible for users to select native_iceberg_compat and stops running tests for that scan impl.

Subsequent PRs will remove the implementation code.

What changes are included in this PR?

  • COMET_NATIVE_SCAN_IMPL config now only allows auto or native_datafusion
  • CometScanRule no longer reads the COMET_NATIVE_SCAN_IMPL config option and always uses the native_datafusion scan
  • Tests have been updated to stop testing with native_iceberg_compat
  • Tests that were specific to native_iceberg_compat have been removed
  • Golden files regenerated - we no longer generate different versions per scan impl
  • Updated documentation
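As a rough illustration of the first two bullets, the restriction could look like the following sketch. This is not Comet's actual config code; the config key name, normalization, and error wording are assumptions for illustration only.

```java
// Hedged sketch, not Comet's actual config implementation: after this PR the
// scan-impl config accepts only "auto" or "native_datafusion".
// The key name "spark.comet.scan.impl" is an assumption for illustration.
import java.util.Set;

public class NativeScanImplConfig {
    static final String KEY = "spark.comet.scan.impl"; // assumed key name
    static final Set<String> ALLOWED = Set.of("auto", "native_datafusion");

    // Returns the normalized value if allowed, otherwise throws.
    static String checkValue(String value) {
        String v = value.toLowerCase();
        if (!ALLOWED.contains(v)) {
            throw new IllegalArgumentException(
                "Invalid value '" + value + "' for " + KEY
                    + "; allowed values: " + ALLOWED);
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(checkValue("native_datafusion"));
        try {
            checkValue("native_iceberg_compat"); // no longer selectable
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: native_iceberg_compat");
        }
    }
}
```

With validation like this, any job that still sets the removed value fails fast at configuration time rather than silently falling back.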

How are these changes tested?

andygrove marked this pull request as ready for review April 21, 2026 15:35
@comphead (Contributor)
Thanks @andygrove 👀

@mbutrovich (Contributor) left a comment

Thanks @andygrove, let's get that CI matrix and tech debt down.

@comphead (Contributor) left a comment

LGTM. I was wondering whether we need it for some debug/benchmark use cases, but nothing comes to mind, so let it go.

@parthchandra (Contributor)

Is there still a functionality gap between native_iceberg_compat and native_datafusion? I see some tests with IgnoreCometNativeDataFusion in the diff files.

@andygrove (Member, author) commented Apr 21, 2026

> Is there still a functionality gap between native_iceberg_compat and native_datafusion? I see some tests with IgnoreCometNativeDataFusion in the diff files.

```
% grep "IgnoreCometNativeDataFusion(\"" dev/diffs/3.5.8.diff
+    IgnoreCometNativeDataFusion("https://github.com/apache/datafusion-comet/issues/3442")) {
+    IgnoreCometNativeDataFusion("https://github.com/apache/datafusion-comet/issues/3720")) {
+    IgnoreCometNativeDataFusion("https://github.com/apache/datafusion-comet/issues/3720")) {
+    IgnoreCometNativeDataFusion("https://github.com/apache/datafusion-comet/issues/3720")) {
+    IgnoreCometNativeDataFusion("https://github.com/apache/datafusion-comet/issues/3720")) {
+    IgnoreCometNativeDataFusion("https://github.com/apache/datafusion-comet/issues/3720")) {
+      IgnoreCometNativeDataFusion("https://github.com/apache/datafusion-comet/issues/3720")) {
+    IgnoreCometNativeDataFusion("https://github.com/apache/datafusion-comet/issues/3720")) {
```

So #3720 and #3442. We should be able to resolve #3442 now that #4011 is merged.

@andygrove (Member, author)

#3443 is fixed in #4038, so that just leaves #3720, which is just that Comet is more lenient with type widening and/or throws a different exception than Spark. Not a correctness issue.

@andygrove (Member, author)
@parthchandra ok if we merge this one?

@parthchandra (Contributor)

My general recommendation would be that we enable the ignored tests before dropping native_iceberg_compat.
Also, the description of #3720 seems to indicate that the issue is more complex than just mismatched error messages (though I can be convinced that it is not a serious problem after all). We do have the framework to match Spark error messages, but perhaps it is not necessary for error messages to match exactly.

Here's Claude's summary of ignored tests:

1. IgnoreCometNativeDataFusion — skipped for native_datafusion and auto

AdaptiveQueryExecSuite (#3321)

| Test Name | Diffs |
| --- | --- |
| join key with multiple references on the filtering plan | 4.0.1 |
| SPARK-43402: FileSourceScanExec supports push down data filter with scalar subquery | 4.0.1 |
| alter temporary view should follow current storeAnalyzedPlanForView config | 4.0.1 |

AdaptiveQueryExecSuite (#3442)

| Test Name | Diffs |
| --- | --- |
| static scan metrics | 3.4.3, 3.5.8, 4.0.1 |

FileBasedDataSourceSuite (#3321)

| Test Name | Diffs |
| --- | --- |
| Enabling/disabling ignoreMissingFiles using parquet (conditionally tagged only when format == "parquet") | 4.0.1 |
| Enabling/disabling ignoreCorruptFiles | 4.0.1 |

ParquetFilterSuite (3.4.3 only)

| Test Name | Diffs |
| --- | --- |
| filter pushdown - StringPredicate (tagged IgnoreCometNativeDataFusion in 3.4.3; IgnoreCometNativeScan in 3.5.8/4.0.1) | 3.4.3 |

ParquetSchemaSuite (#3720)

| Test Name | Diffs |
| --- | --- |
| SPARK-35640: read binary as timestamp should throw schema incompatible error | 3.4.3, 3.5.8, 4.0.1 |
| SPARK-35640: int as long should throw schema incompatible error | 3.4.3, 3.5.8 |
| SPARK-47447: read TimestampLTZ as TimestampNTZ | 4.0.1 |
| SPARK-36182: can't read TimestampLTZ as TimestampNTZ | 3.4.3, 3.5.8 |
| SPARK-34212 Parquet should read decimals correctly | 3.4.3, 3.5.8, 4.0.1 |
| row group skipping doesn't overflow when reading into larger type | 3.4.3, 3.5.8, 4.0.1 |

ParquetSchemaEvolutionSuite (#3720)

| Test Name | Diffs |
| --- | --- |
| schema mismatch failure error message for parquet vectorized reader | 3.4.3, 3.5.8, 4.0.1 |
| SPARK-45604: schema mismatch failure error on timestamp_ntz to array<timestamp_ntz> | 3.4.3, 3.5.8, 4.0.1 |

ParquetTypeWideningSuite (#3321)

| Test Name | Diffs |
| --- | --- |
| parquet widening conversion DateType -> TimestampNTZType (conditionally tagged) | 4.0.1 |
| unsupported parquet conversion $fromType -> $toType (multiple type combos) | 4.0.1 |
| unsupported parquet timestamp conversion $fromType ($outputTimestampType) -> $toType | 4.0.1 |
| parquet decimal precision change Decimal($fromPrecision, 2) -> Decimal($toPrecision, 2) | 4.0.1 |
| parquet decimal precision and scale change Decimal($fromPrecision, $fromScale) -> Decimal($toPrecision, $toScale) | 4.0.1 |

2. assume() — runtime skip

ParquetRowIndexSuite (#3886) — 4.0.1 only

| Test Name | Condition |
| --- | --- |
| invalid row index column type - ${conf.desc} | Skipped when COMET_NATIVE_SCAN_IMPL is SCAN_NATIVE_DATAFUSION or SCAN_AUTO. Comet throws RuntimeException instead of SparkException. |

CometExpressionSuite — Comet's own test suite

| Test Name | Condition |
| --- | --- |
| get_struct_field - select primitive fields | Skipped when scanImpl == SCAN_AUTO && Spark 4.0+ |
| get_struct_field - select subset of struct | Skipped when scanImpl == SCAN_AUTO && Spark 4.0+ |
| get_struct_field - read entire struct | Skipped when scanImpl == SCAN_AUTO && Spark 4.0+ |
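The runtime-skip mechanism above differs from the compile-time tags in section 1: an exclusion tag keeps the test from running at all, while assume() evaluates a condition during the run and cancels (rather than fails) the test. The sketch below illustrates that distinction; it is not the actual Comet/Spark test code, and the names (assumeThat, runTest, the scan-impl strings) are illustrative stand-ins for ScalaTest's assume().

```java
// Hedged sketch, not actual Comet/Spark test code: a minimal stand-in for an
// assume()-style runtime skip. Cancellation is modeled as a distinct exception
// type so the harness can report "canceled" instead of "failed".
public class AssumeSketch {
    static class TestCanceledException extends RuntimeException {
        TestCanceledException(String msg) { super(msg); }
    }

    // Minimal stand-in for ScalaTest's assume(): cancel rather than fail.
    static void assumeThat(boolean condition, String clue) {
        if (!condition) {
            throw new TestCanceledException("test canceled: " + clue);
        }
    }

    // Returns "ran" if the test body executed, or the cancellation message.
    static String runTest(String scanImpl) {
        try {
            assumeThat(!scanImpl.equals("native_datafusion"),
                "Comet throws RuntimeException instead of SparkException");
            return "ran";
        } catch (TestCanceledException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runTest("native_datafusion")); // canceled at runtime
        System.out.println(runTest("jvm"));               // body runs
    }
}
```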

Summary by Tracking Issue

| Issue | Count | Description |
| --- | --- | --- |
| #3321 | ~12 | Schema evolution, corrupt/missing files, AQE, type widening |
| #3720 | ~8 | Schema mismatch errors, decimal reads, row group skipping |
| #3442 | 1 | Static scan metrics with DPP |
| #3886 | 1 | Row index column type error type mismatch |
| (no issue) | 5 | Filter pushdown / accumulator tests (IgnoreCometNativeScan) |
| (no issue) | 3 | get_struct_field tests (auto + Spark 4.0+ only) |

@parthchandra (Contributor)
More Claude analysis on schema mismatch. Claude recommends we explicitly check that the following tests fail with a different message, instead of actually succeeding (because the results will be wrong):

| Description | Test Name(s) | Diffs |
| --- | --- | --- |
| binary -> timestamp | SPARK-35640: read binary as timestamp should throw schema incompatible error | 3.4.3, 3.5.8, 4.0.1 |
| timestamp_ntz -> array<timestamp_ntz> | SPARK-45604: schema mismatch failure error on timestamp_ntz to array<timestamp_ntz> | 3.4.3, 3.5.8, 4.0.1 |
| string read as int | schema mismatch failure error message for parquet vectorized reader | 3.4.3, 3.5.8, 4.0.1 |
| decimal precision/scale overflow | SPARK-34212 Parquet should read decimals correctly | 3.4.3, 3.5.8, 4.0.1 |
| int->bigint row group overflow | row group skipping doesn't overflow when reading into larger type | 3.4.3, 3.5.8, 4.0.1 |
| int read as long (schema evolution disabled) | SPARK-35640: int as long should throw schema incompatible error | 3.4.3, 3.5.8 |
| TimestampLTZ -> TimestampNTZ | SPARK-47447: read TimestampLTZ as TimestampNTZ (4.0.1); SPARK-36182: can't read TimestampLTZ as TimestampNTZ (3.4.3, 3.5.8) | 3.4.3, 3.5.8, 4.0.1 |

@parthchandra (Contributor)

One final word from Claude:
We lose partial acceleration for queries that can't use native_datafusion:

  • Queries using input_file_name() — JVM scan + native downstream ops
  • Queries with metadata columns — JVM scan + native downstream ops
  • Queries with ignoreMissingFiles=true — JVM scan + native downstream ops
  • Queries with row indexes or field IDs — JVM scan + native downstream ops
    (I was under the impression these had been implemented.)

andygrove marked this pull request as draft April 23, 2026 12:34
@andygrove (Member, author)
> My general recommendation would be that we enable ignored tests before dropping native_iceberg_compat. Also, the description of #3720 seems to indicate that issue is more complex than just mismatched error messages (I can be convinced that it is not a serious problem after all). We do have the framework to match Spark error messages but perhaps it is not necessary to have error messages match exactly.

Alright, I have moved this to draft for now. I'll take another more detailed look at #3720 soon. Thanks @parthchandra.
