
chore: complete coverage for primitive to string casts #4009

Closed
parthchandra wants to merge 1 commit into apache:main from parthchandra:cast-primitive-string-2

Conversation

@parthchandra
Contributor

Which issue does this PR close?

Part of #286
Fixes: #1068

Rationale for this change

Comet had gaps in Decimal-to-String cast coverage:

Decimal -> String (LEGACY mode): cast_array was falling through to DataFusion's built-in cast for Decimal128 -> Utf8, which produces plain notation. Spark's LEGACY mode uses Java BigDecimal.toString(), which produces scientific notation when the adjusted exponent is less than -6. For example, Decimal(38,18) zero was cast as "0.000000000000000000" by Comet but "0E-18" by Spark.

Decimal with negative scale: When spark.sql.legacy.allowNegativeScaleOfDecimal=true, Spark formats negative-scale decimals in scientific notation (e.g. Decimal(7,-2) unscaled=123 -> "1.23E+4"). Comet now matches this. When the config is disabled, CometCast.isSupported returns Incompatible, since negative-scale decimals cannot be created via normal SQL paths.
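
The negative-scale case needs no separate rule: scale < 0 always fails the plain-notation test, so it takes the scientific branch. Using the hypothetical helper sketched above:

```rust
// Decimal(7,-2), unscaled=123: adj_exp = 3 - 1 - (-2) = 4 -> "1.23E+4"
assert_eq!(java_style_decimal_string(123, -2), "1.23E+4");
```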

Other primitive types -> String: Boolean, integer, float, double, date, timestamp, and binary casts to String were already handled correctly by DataFusion's built-in cast. The existing castTest harness already exercises LEGACY, TRY, and ANSI modes for all of these types, so no new tests were needed.

What changes are included in this PR?

  • native/spark-expr/src/conversion_funcs/numeric.rs: adds cast_decimal128_to_utf8 and decimal128_to_java_string that replicate Java BigDecimal.toString() behavior (scientific notation when adj_exp < -6); a rough sketch follows this list.
  • native/spark-expr/src/conversion_funcs/cast.rs: adds a LEGACY-only match arm for Decimal128 -> Utf8 that calls cast_decimal128_to_utf8; TRY and ANSI fall through to DataFusion's plain-notation cast, which already matches Spark.
  • spark/src/main/scala/org/apache/comet/expressions/CometCast.scala: marks all DecimalType -> StringType casts as Compatible across all eval modes, with a special Incompatible case for negative-scale decimals when the legacy config is disabled.
  • Tests: adds "cast DecimalType(38,18) to StringType" and "cast DecimalType with negative scale to StringType" tests, and extends generateDecimalsPrecision38Scale18 with small-magnitude values that trigger the scientific-notation path.
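
As a rough sketch of how the pieces above fit together (names and structure are illustrative assumptions built on the arrow crate, not the PR's actual cast_decimal128_to_utf8):

```rust
use std::sync::Arc;
use arrow::array::{Array, ArrayRef, Decimal128Array, StringBuilder};

/// Illustrative Decimal128 -> Utf8 kernel. Per the PR description, such a path
/// is taken only in LEGACY mode; TRY and ANSI fall through to DataFusion's
/// plain-notation cast.
fn cast_decimal128_to_utf8_sketch(array: &Decimal128Array) -> ArrayRef {
    let scale = array.scale() as i32;
    let mut builder = StringBuilder::with_capacity(array.len(), array.len() * 8);
    for i in 0..array.len() {
        if array.is_null(i) {
            builder.append_null();
        } else {
            // java_style_decimal_string is the formatter sketched earlier.
            builder.append_value(java_style_decimal_string(array.value(i), scale));
        }
    }
    Arc::new(builder.finish())
}
```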

How are these changes tested?

  • Unit tests in CometCastSuite covering Decimal(38,18), negative-scale decimals (with localTableScan.enabled to avoid Parquet), and the CometCast.isSupported boundary for negative-scale with config disabled.
  • Rust unit test test_decimal128_to_java_string covering all branches of the formatting logic (plain, leading zeros, scientific notation, positive/negative exponent, zero, negatives); an illustrative version follows this list.
  • Existing castTest harness already exercises LEGACY, TRY, and ANSI modes for all other primitive-to-String casts (boolean, byte, short, int, long, float, double, date, timestamp, binary), confirming compatibility.
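
For flavor, an illustrative branch-coverage test against the hypothetical formatter sketched earlier (the PR's real test is test_decimal128_to_java_string):

```rust
#[test]
fn decimal_to_java_string_branches() {
    assert_eq!(java_style_decimal_string(12345, 2), "123.45");      // plain notation
    assert_eq!(java_style_decimal_string(5, 3), "0.005");           // leading zeros
    assert_eq!(java_style_decimal_string(1, 8), "1E-8");            // scientific, negative exponent
    assert_eq!(java_style_decimal_string(123, -2), "1.23E+4");      // scientific, positive exponent
    assert_eq!(java_style_decimal_string(0, 18), "0E-18");          // zero
    assert_eq!(java_style_decimal_string(-12345, 7), "-0.0012345"); // negative value
}
```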

@parthchandra parthchandra marked this pull request as draft April 20, 2026 20:05
@parthchandra
Contributor Author

This is already done in #3939

