[native_datafusion] [Spark SQL Tests] Plan structure differences cause test failures #3315

@andygrove

Summary

Four Spark SQL tests fail because native_datafusion produces different plan nodes and partitioning information than the tests expect.

Failing Tests

  • ParquetV2Suite: "Fallback Parquet V2 to V1" — expects FileSourceScanExec or CometScanExec in plan, but native_datafusion uses CometNativeScan
  • BroadcastJoinSuite: "broadcast join where streamed side's output partitioning is HashPartitioning" (x2) — UnknownPartitioning(8) instead of PartitioningCollection
  • FileStreamSinkSuite: "self-union, DSv1, read via DataStreamReader API" / "self-union, DSv1, read via table API" — streaming query expects specific plan structure

Root Cause

native_datafusion plans a CometNativeScan where the tests expect CometScanExec or FileSourceScanExec, and it reports UnknownPartitioning instead of preserving the original partitioning information. Tests that inspect plan internals therefore fail.
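
To illustrate the scan-node half of the failure, here is a minimal toy model in Python. It is not Spark or Comet code: `PlanNode`, `collect`, and `has_expected_scan` are hypothetical helpers, and the plan trees are invented for illustration. Only the node names come from this issue. The sketch assumes the test passes when at least one accepted scan node appears anywhere in the plan.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PlanNode:
    """Hypothetical stand-in for a physical plan node (not a Spark/Comet class)."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)


def collect(plan: PlanNode, names: Set[str]) -> List[PlanNode]:
    """Return every node in the tree whose name is in `names` (pre-order)."""
    found = [plan] if plan.name in names else []
    for child in plan.children:
        found.extend(collect(child, names))
    return found


# Scan node names the ParquetV2Suite test is described as accepting.
EXPECTED_SCANS = {"FileSourceScanExec", "CometScanExec"}


def has_expected_scan(plan: PlanNode, accepted: Set[str] = EXPECTED_SCANS) -> bool:
    """Mimic a test that passes only if an accepted scan node is in the plan."""
    return len(collect(plan, accepted)) > 0


# Invented plan shapes: a projection over a single scan.
spark_plan = PlanNode("Project", [PlanNode("FileSourceScanExec")])
native_plan = PlanNode("Project", [PlanNode("CometNativeScan")])

print(has_expected_scan(spark_plan))   # True
print(has_expected_scan(native_plan))  # False: the scan node has a different name
# Widening the accepted set is one possible way to make such a check pass.
print(has_expected_scan(native_plan, EXPECTED_SCANS | {"CometNativeScan"}))  # True
```

Whether the real tests should accept the extra node type or be skipped for native_datafusion is a separate decision. The sketch only shows why a check keyed on specific node names stops matching.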

Related

Discovered in CI for #3307 (enable native_datafusion in auto scan mode).
