
docs: add compatibility documentation to all expressions #4067

Merged
andygrove merged 18 commits into apache:main from andygrove:add-more-compat-docs
Apr 24, 2026

@andygrove andygrove commented Apr 24, 2026

What issue does this close?

N/A

Rationale for this change

  • Add all known compatibility/unsupported notes into the expression serde implementations so that they get generated into the compatibility guide
  • Remove compatibility info from the list of supported expressions - the compatibility guide is the correct place for this information

This PR documents the current state. There were some surprises; follow-on issue #4074 tracks them.

How is this tested?

Manually

andygrove and others added 15 commits April 23, 2026 17:00
Wire GenerateDocs to emit incompatibility and unsupported notes into
each expression compatibility page, driven by getIncompatibleReasons
and getUnsupportedReasons on the serde traits. Add matching defaults
to CometAggregateExpressionSerde so aggregate.md is covered too.
Fix CometDateFormat.getUnsupportedReasons formatting.

Move hand-written incompatibility and unsupported notes for
CollectSet, Average, SortArray, TruncTimestamp, and StructsToJson from
the per-category markdown pages into the corresponding serde via
getIncompatibleReasons / getUnsupportedReasons, so GenerateDocs drives
the compatibility guide from a single source of truth. Clarify in the
trait scaladoc that reasons should be written in Markdown.

Create compatibility/expressions/math.md with an EXPR_COMPAT marker
block and wire it to QueryPlanSerde.mathExpressions in GenerateDocs so
CometAbs's unsupported-reason note surfaces in the guide. Add math to
the expressions toctree.

Satisfy scalafix DisableSyntax.noExplicitPublicVal by annotating
CometHour.incompatReason and CometAbs.unsupportedReason as `: String`.

… methods

Add guidance to the contributor guide covering the new documentation
methods on CometExpressionSerde. Also simplify CometDateFormat's
getUnsupportedReasons to list only the supported Spark format keys.
…ion serdes

- Remove getSupportLevel override from CometArrayAppend (always returned
  Compatible, which is the default)
- Add getIncompatibleReasons() to all always-Incompatible expression serdes
- Add getIncompatibleReasons() and/or getUnsupportedReasons() to all
  conditional expression serdes that were missing them
- Add compatibility guide pages for string, map, and misc expression categories
- Register new categories in GenerateDocs so content is auto-generated

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…for references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@andygrove andygrove changed the title from "docs: add compatibility reasons to all expression serdes" to "docs: add compatibility documentation to all expressions" Apr 24, 2026
# Conflicts:
#	docs/source/contributor-guide/adding_a_new_expression.md
#	docs/source/user-guide/latest/compatibility/expressions/index.md
#	spark/src/main/scala/org/apache/comet/GenerateDocs.scala
#	spark/src/main/scala/org/apache/comet/serde/arrays.scala
@andygrove andygrove marked this pull request as ready for review April 24, 2026 16:59
…Notes

Remove Spark-Compatible? and Compatibility Notes columns from
expressions.md; those details now live in the generated Compatibility
Guide. Add getCompatibleNotes() to CometExpressionSerde and
CometAggregateExpressionSerde for differences that are always present
and do not require opting in via allowIncompatible, rendered as a
distinct section in the compatibility guide. Backfill reasons in
serdes that previously only appeared in expressions.md.
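
As a rough illustration of how notes from the three serde methods could be turned into compatibility guide content, here is a minimal sketch. It is not the actual GenerateDocs code: `CompatNotesRenderer`, `ExprNotes`, and the section titles are hypothetical names chosen for this example. Only the idea of rendering the always-present differences as a distinct section comes from the commit message above.

```scala
// Hypothetical sketch: NOT the actual GenerateDocs implementation.
object CompatNotesRenderer {

  // Illustrative container for the three kinds of notes a serde can report.
  final case class ExprNotes(
      name: String,
      compatibleNotes: Seq[String] = Seq.empty,
      incompatibleReasons: Seq[String] = Seq.empty,
      unsupportedReasons: Seq[String] = Seq.empty)

  // Render one titled bullet list, or nothing if there are no items.
  // Section titles are assumptions, not the wording used by the real guide.
  private def section(title: String, items: Seq[String]): Seq[String] =
    if (items.isEmpty) Seq.empty
    else Seq(s"**${title}**", "") ++ items.map(i => s"- ${i}") ++ Seq("")

  // Render the Markdown fragment for a single expression.
  // An expression with no notes produces an empty string.
  def render(e: ExprNotes): String = {
    val body =
      section("Compatibility notes", e.compatibleNotes) ++
        section("Incompatible cases", e.incompatibleReasons) ++
        section("Unsupported cases", e.unsupportedReasons)
    if (body.isEmpty) "" else (Seq(s"### ${e.name}", "") ++ body).mkString("\n")
  }
}
```

Calling `CompatNotesRenderer.render(CompatNotesRenderer.ExprNotes("Foo", incompatibleReasons = Seq("reason A")))` yields a heading for `Foo` followed by a single "Incompatible cases" list. The other two sections are omitted because they are empty.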
# Conflicts:
#	docs/source/user-guide/latest/expressions.md
#	spark/src/main/scala/org/apache/comet/serde/arrays.scala
| BitXorAgg | |
| BoolAnd | `bool_and` |
| BoolOr | `bool_or` |
| CollectSet | |
Contributor

We probably need to address those gaps later. For example, count, corr, and collect_set are supported and have SQL expressions.

| UnscaledValue | Yes | |
| Expression |
| ---------------------------- |
| Alias |
Contributor

Why do we sometimes have a SQL column and sometimes not?

Member Author

No good reason. Do you think it is worth keeping the SQL column?

Member Author

Maybe this doc does not need tables now and can just use simple lists.


object CometFirst extends CometAggregateExpressionSerde[First] {

override def getCompatibleNotes(): Seq[String] = Seq(

Contributor

What's the difference between getCompatibleNotes, getIncompatibleReasons, and getUnsupportedReasons? I can see the difference between the second and third, but the first is confusing.

Member Author

  • getCompatibleNotes is for compatibility differences that we decided to accept while still accelerating the expression
  • getIncompatibleReasons is for compatibility differences where we fall back by default and the user must opt in
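
To make the distinction concrete, here is a minimal sketch under stated assumptions. `DocumentedSerde` is a hypothetical, simplified stand-in for the real serde traits; only the three method names come from this PR. The placeholder notes and the `accelerates` decision rule are invented for illustration and do not reflect how Comet actually decides support levels. The comment on `getUnsupportedReasons` is a reading of its name, since the reply above does not define it.

```scala
// Hypothetical, simplified stand-in for the serde trait. Only the three
// documentation method names are taken from the PR; everything else is assumed.
trait DocumentedSerde {
  // Differences that are accepted: the expression is still accelerated.
  def getCompatibleNotes(): Seq[String] = Seq.empty

  // Differences that cause a fallback unless the user opts in.
  def getIncompatibleReasons(): Seq[String] = Seq.empty

  // Assumed meaning: cases that are not supported.
  def getUnsupportedReasons(): Seq[String] = Seq.empty
}

// Placeholder serdes with made-up notes, purely for illustration.
object AlwaysAccelerated extends DocumentedSerde {
  override def getCompatibleNotes(): Seq[String] =
    Seq("Placeholder: an accepted difference from Spark.")
}

object OptInOnly extends DocumentedSerde {
  override def getIncompatibleReasons(): Seq[String] =
    Seq("Placeholder: differs from Spark for some inputs.")
}

object DocumentedSerdeDemo {
  // Illustrative rule (an assumption): accelerate unless there are
  // incompatible reasons and the user has not opted in.
  def accelerates(s: DocumentedSerde, allowIncompatible: Boolean): Boolean =
    s.getIncompatibleReasons().isEmpty || allowIncompatible
}
```

In this sketch, `AlwaysAccelerated` is accelerated regardless of the opt-in flag even though it documents a difference, while `OptInOnly` is accelerated only when `allowIncompatible` is true.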

Contributor

@comphead comphead left a comment

Thanks @andygrove, this is great. Finally, all incompats in a single place.

Comment thread spark/src/main/scala/org/apache/comet/serde/aggregates.scala
Comment thread spark/src/main/scala/org/apache/comet/serde/unixtime.scala
Comment thread spark/src/main/scala/org/apache/comet/serde/strings.scala
Contributor

@parthchandra parthchandra left a comment

Looks good from my side.

@andygrove andygrove merged commit 5076f63 into apache:main Apr 24, 2026
170 of 171 checks passed
@andygrove andygrove deleted the add-more-compat-docs branch April 24, 2026 21:43
@andygrove
Member Author

Merged. Thanks @parthchandra @comphead
