feat: add support for timestamp_seconds expression#3146

Open
andygrove wants to merge 12 commits into apache:main from andygrove:feature/seconds-to-timestamp

Conversation

@andygrove
Member

Summary

  • Adds native Comet support for Spark timestamp_seconds function (also known as seconds_to_timestamp)
  • Converts seconds since Unix epoch to a timestamp
  • Supports Int32, Int64, and Float64 input types

Test Plan

  • Added unit tests in CometTemporalExpressionSuite
  • Tests include random data, literal values, nulls, negative values (before epoch), and fractional seconds
  • All existing tests pass

Note: This PR was generated with AI assistance.

Closes #3111

Adds native Comet support for Spark's timestamp_seconds (SecondsToTimestamp)
function, which converts seconds since Unix epoch to a timestamp.

Supports Int32, Int64, and Float64 inputs. NaN and infinite float values
return null, matching Spark's behavior.

Closes apache#3111

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@andygrove andygrove marked this pull request as draft January 14, 2026 23:49
@codecov-commenter

codecov-commenter commented Jan 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.96%. Comparing base (f09f8af) to head (3a7d8d9).
⚠️ Report is 907 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3146      +/-   ##
============================================
+ Coverage     56.12%   59.96%   +3.83%     
- Complexity      976     1473     +497     
============================================
  Files           119      175      +56     
  Lines         11743    16170    +4427     
  Branches       2251     2682     +431     
============================================
+ Hits           6591     9696    +3105     
- Misses         4012     5126    +1114     
- Partials       1140     1348     +208     

☔ View full report in Codecov by Sentry.

@andygrove andygrove marked this pull request as ready for review January 15, 2026 03:41
…mestamp

# Conflicts:
#	docs/source/user-guide/latest/configs.md
#	native/spark-expr/src/comet_scalar_funcs.rs
#	native/spark-expr/src/lib.rs
#	spark/src/main/scala/org/apache/comet/serde/datetime.scala
#	spark/src/test/scala/org/apache/comet/CometTemporalExpressionSuite.scala
@andygrove andygrove marked this pull request as draft January 30, 2026 01:48
@andygrove
Member Author

Moving this to draft until #3328 is merged

andygrove and others added 2 commits February 10, 2026 12:11
…mestamp

# Conflicts:
#	native/spark-expr/src/comet_scalar_funcs.rs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@andygrove andygrove marked this pull request as ready for review February 11, 2026 01:03
@mbutrovich
Contributor

Thanks @andygrove! Claude summarized my notes for me. Hopefully it didn't transcribe anything wrong or hallucinate :)

PR #3146: timestamp_seconds

Overflow behavior mismatch with Spark

This is the main concern. Spark's SecondsToTimestamp for integer inputs uses Math.multiplyExact, which throws ArithmeticException("long overflow") on overflow. The Rust implementation uses checked_mul and maps overflow to None (null). So Comet silently returns null where Spark would throw.

Either return an error on overflow to match Spark, or document the deviation and mark it Incompatible in the serde's getSupportLevel.

NULL buffer handling

All three input type paths build the output via .iter().map().collect(), which constructs the null buffer row by row through Arrow's builder. For Int32 and Int64 no new nulls are introduced, so the output null buffer is identical to the input's: it could be cloned directly and the values computed with Arrow's unary kernel or direct buffer arithmetic, avoiding the per-row null tracking overhead. Float64 is trickier since NaN/Infinity introduce new nulls, so iterative construction is more justifiable there.

Missing input types

Spark accepts NumericType (Byte, Short, Int, Long, Float, Double, Decimal). The PR only handles Int32, Int64, Float64. Float32 would be trivial to add (cast to f64 and use the Float64 path). Decimal is more complex and could be deferred.

Test gaps

A few cases worth adding:

  • NaN input: SELECT timestamp_seconds(CAST('NaN' AS DOUBLE)) (should return null)
  • Infinity input: SELECT timestamp_seconds(CAST('Infinity' AS DOUBLE)) (should return null)
  • Overflow: SELECT timestamp_seconds(9223372036854775807) (Spark throws, what does Comet do?)
  • Int32 column reference (currently only bigint column in the test table)
  • Float64 column reference (currently only literal casts)

Docs

docs/spark_expressions_support.md has timestamp_seconds marked [ ]. Should be updated to [x].

- Error on Int64 overflow instead of returning null, matching Spark's
  Math.multiplyExact which throws ArithmeticException
- Use try_unary kernel for Int32/Int64 paths to avoid per-row null
  buffer tracking overhead
- Add Float32 input type support
- Properly handle scalar inputs (scalar-in/scalar-out contract)
- Add test coverage for NaN, Infinity, Int32 column, Float64 column
- Mark timestamp_seconds as supported in spark_expressions_support.md
…mestamp

# Conflicts:
#	native/spark-expr/src/comet_scalar_funcs.rs
#	native/spark-expr/src/lib.rs
#	spark/src/main/scala/org/apache/comet/serde/datetime.scala
@andygrove
Member Author

@mbutrovich this is ready for another look when you have time

@mbutrovich
Contributor

This was the only nit:

Byte/Short/Decimal are still unsupported; Comet will fall back to Spark for these, which is fine, but getSupportLevel doesn't check child types. If someone writes timestamp_seconds(CAST(x AS TINYINT)), it would hit a DataFusion planning error rather than a clean Comet fallback. An override like the one in CometArrayPosition.getSupportLevel that checks input types would be more robust.


Development

Successfully merging this pull request may close these issues.

[Feature] Support Spark expression: seconds_to_timestamp
