Skip to content

Update cranelift requirement from 0.88.0 to 0.89.0#123

Closed
dependabot[bot] wants to merge 1 commit intomasterfrom
dependabot/cargo/master/cranelift-0.89.0
Closed

Update cranelift requirement from 0.88.0 to 0.89.0#123
dependabot[bot] wants to merge 1 commit intomasterfrom
dependabot/cargo/master/cranelift-0.89.0

Conversation

@dependabot
Copy link
Copy Markdown

@dependabot dependabot Bot commented on behalf of github Oct 21, 2022

Updates the requirements on cranelift to permit the latest version.

Commits

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Updates the requirements on [cranelift](https://github.com/bytecodealliance/wasmtime) to permit the latest version.
- [Release notes](https://github.com/bytecodealliance/wasmtime/releases)
- [Changelog](https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-some-possible-changes.md)
- [Commits](https://github.com/bytecodealliance/wasmtime/commits)

---
updated-dependencies:
- dependency-name: cranelift
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot @github
Copy link
Copy Markdown
Author

dependabot Bot commented on behalf of github Oct 21, 2022

The following labels could not be found: auto-dependencies.

@dependabot @github
Copy link
Copy Markdown
Author

dependabot Bot commented on behalf of github Oct 26, 2022

Looks like cranelift is up-to-date now, so this is no longer needed.

@dependabot dependabot Bot closed this Oct 26, 2022
@dependabot dependabot Bot deleted the dependabot/cargo/master/cranelift-0.89.0 branch October 26, 2022 01:10
comphead added a commit that referenced this pull request Oct 16, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.
Related to apache#18084

## Rationale for this change

Run extended suite on PRs for critical areas, to avoid post merge
bugfixing

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead added a commit that referenced this pull request Oct 16, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

Followup on
apache#18063 (review)

## Rationale for this change

Use cheaper `NullBuffer::union` to apply null mask instead of iterator
approach

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead added a commit that referenced this pull request Oct 17, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

Followup on
apache#18063 (review)

## Rationale for this change

Use cheaper `NullBuffer::union` to apply null mask instead of iterator
approach

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

(cherry picked from commit 337378a)
comphead added a commit that referenced this pull request Oct 17, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

Followup on
apache#18063 (review)

## Rationale for this change

Use cheaper `NullBuffer::union` to apply null mask instead of iterator
approach

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

(cherry picked from commit 337378a)
comphead pushed a commit that referenced this pull request Oct 17, 2025
…unctions in proto (apache#18024)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#17417.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

- Support `null_treatment`, `distinct`, and `filter` for window function
in proto.
- Support `null_treatment` for aggregate udf in proto.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

- [x] Add `null_treatment`, `distinct`, `filter` fields to
`WindowExprNode` message and handle them in `to/from_proto.rs`.
- [x] Add `null_treatment` field to `AggregateUDFExprNode` message and
handle them in `to/from_proto.rs`.
- [ ] Docs update: I'm not sure where to add docs as declared in the
issue description.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

- Add tests to `roundtrip_window` for respectnulls, ignorenulls,
distinct, filter.
- Add tests to `roundtrip_aggregate_udf` for respectnulls, ignorenulls.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
N/A

---------

Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
comphead pushed a commit that referenced this pull request Oct 17, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Doesn't close an issue.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Hi we are hiop, a Serverless Data Logistic Platform.
We use DataFusion as a core part of our backend engine, and it plays a
crucial role in our data infrastructure. Our team members are passionate
about the project and actively try contribute to its development
(@dariocurr).

We’d love to have Hiop listed among the Known Users to show our support
and help the DataFusion community continue to grow.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Just adding hiop as known user

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 17, 2025
…8117)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#3695
- Closes apache#3797

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Was looking at above issues and I don't believe we skip the failed rules
for any tests anymore (default for the config is also `false`), apart
from this cleanup, so filing this PR so we can close the issues. Seems
we only do in this `window.slt` test after this fix:


https://github.com/apache/datafusion/blob/621a24978a7a9c6d2b27973d1853dbc8776a56b5/datafusion/sqllogictest/test_files/window.slt#L2587-L2611

Which seems intentional.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Remove unnecessary `skip_failed_rules` config.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Existing tests.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

No.

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 17, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
`EXPLAIN ANALYZE` can be used for profiling and displays the results
alongside the EXPLAIN plan. The issue is that it currently shows too
many low-level details. It would provide a better user experience if
only the most commonly used metrics were shown by default, with more
detailed metrics available through specific configuration options.

### Example
In `datafusion-cli`:
```
> CREATE EXTERNAL TABLE IF NOT EXISTS lineitem
STORED AS parquet
LOCATION '/Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem';
0 row(s) fetched.
Elapsed 0.000 seconds.

explain analyze select *
from lineitem
where l_orderkey = 3000000;
```
The parquet reader includes a large number of low-level details:
```
metrics=[output_rows=19813, elapsed_compute=14ns, batches_split=0, bytes_scanned=2147308, file_open_errors=0, file_scan_errors=0, files_ranges_pruned_statistics=18, num_predicate_creation_errors=0, page_index_rows_matched=19813, page_index_rows_pruned=729088, predicate_cache_inner_records=0, predicate_cache_records=0, predicate_evaluation_errors=0, pushdown_rows_matched=0, pushdown_rows_pruned=0, row_groups_matched_bloom_filter=0, row_groups_matched_statistics=1, row_groups_pruned_bloom_filter=0, row_groups_pruned_statistics=0, bloom_filter_eval_time=21.997µs, metadata_load_time=273.83µs, page_index_eval_time=29.915µs, row_pushdown_eval_time=42ns, statistics_eval_time=76.248µs, time_elapsed_opening=4.02146ms, time_elapsed_processing=24.787461ms, time_elapsed_scanning_total=24.17671ms, time_elapsed_scanning_until_data=23.103665ms]
```

I believe only a subset of it is commonly used, for example
`output_rows`, `metadata_load_time`, and how many file/row-group/pages
are pruned, and it would better to only display the most common ones by
default.

### Existing `VERBOSE` keyword
There is a existing verbose keyword in `EXPLAIN ANALYZE VERBOSE`,
however it's turning on per-partition metrics instead of controlling
detail level. I think it would be hard to mix this partition control and
the detail level introduced in this PR, so they're separated: the
following config will be used for detail level and the semantics of
`EXPLAIN ANALYZE VERBOSE` keep unchanged.

### This PR: configurable explain analyze level
1. Introduced a new config option `datafusion.explain.analyze_level`.
When set to `dev` (default value), all existing metrics will be shown.
If set to `summary`, only `BaselineMetrics` will be displayed (i.e.
`output_rows` and `elapsed_compute`).
Note now we only include `BaselineMetrics` for simplicity, in the
follow-up PRs we can figure out what's the commonly used metrics for
each operator, and add them to `summary` analyze level, finally set the
`summary` analyze level to default.
2. Add a `MetricType` field associated with `Metric` for detail level or
potentially category in the future. For different configurations, a
certain `MetricType` set will be shown accordingly.

#### Demo
```
-- continuing the above example
> set datafusion.explain.analyze_level = summary;
0 row(s) fetched.
Elapsed 0.000 seconds.

> explain analyze select *
from lineitem
where l_orderkey = 3000000;
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=5, elapsed_compute=25.339µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                   |   FilterExec: l_orderkey@0 = 3000000, metrics=[output_rows=5, elapsed_compute=81.221µs]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|                   |     DataSourceExec: file_groups={14 groups: [[Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-0.parquet:0..11525426], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-0.parquet:11525426..20311205, Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-1.parquet:0..2739647], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-1.parquet:2739647..14265073], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-1.parquet:14265073..20193593, Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-2.parquet:0..5596906], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-2.parquet:5596906..17122332], ...]}, projection=[l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment], file_type=parquet, predicate=l_orderkey@0 = 3000000, pruning_predicate=l_orderkey_null_count@2 != row_count@3 AND l_orderkey_min@0 <= 3000000 AND 3000000 <= l_orderkey_max@1, required_guarantees=[l_orderkey in (3000000)], metrics=[output_rows=19813, elapsed_compute=14ns] |
|                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.025 seconds.
```
Only `BaselineMetrics` are shown.


## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
4. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
UT

## Are there any user-facing changes?

No
<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Oct 17, 2025
…e#18091)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

N/A

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

There's a few functions in
`datafusion/expr-common/src/type_coercion/aggregates.rs` that are unused
elsewhere in the codebase, likely a remnant before the refactor to UDF,
so removing them. Some are still used (`coerce_avg_type()` and
`avg_return_type()`) so these are inlined into the Avg aggregate
function (similar to Sum). Also refactor some window functions to use
already available macros.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

- Remove some unused functions
- Inline avg coerce & return type logic
- Refactor Spark Avg a bit to remove unnecessary code
- Refactor ntile & nth window functions to use available macros

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Existing tests.

## Are there any user-facing changes?

Yes as these functions were publicly exported; however I'm not sure they
were meant to be used by users anyway, given what they do.

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 17, 2025
apache#18099)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

Fixes comparison errors when using dictionary-encoded types with
comparison functions like NULLIF.

## Rationale for this change

When using dictionary-encoded columns (e.g., Dictionary(Int32, Utf8)) in
comparison operations with literals or other types, DataFusion would
throw an error stating the types are not comparable. This was
particularly problematic for functions like NULLIF which rely on
comparison coercion.

The issue was that comparison_coercion_numeric didn't handle dictionary
types, even though the general comparison_coercion function did have
dictionary support.

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

1. Refactored dictionary comparison logic: Extracted common dictionary
coercion logic into dictionary_comparison_coercion_generic to avoid code
duplication.
2. Added numeric-specific dictionary coercion: Introduced
dictionary_comparison_coercion_numeric that uses numeric-preferring
comparison rules when dealing with dictionary value types.
3. Updated comparison_coercion_numeric: Added a call to
dictionary_comparison_coercion_numeric in the coercion chain to properly
handle dictionary types.
4. Added sqllogictest cases demonstrating the fix works for various
dictionary comparison scenarios.

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

Yes, added tests in datafusion/sqllogictest/test_files/nullif.slt
covering:
  - Dictionary type compared with string literal
  - String compared with dictionary type
  - Dictionary compared with dictionary

All tests pass with the fix and would fail without it.

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

This is a bug fix that enables previously failing queries to work
correctly. No breaking changes or API modifications.

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
alamb added a commit that referenced this pull request Oct 18, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18135 
## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead added a commit that referenced this pull request Oct 24, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Oct 24, 2025
…pache#17478) (apache#18130)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Related to apache#17405
- Related to apache#18072

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

See apache#17478

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->
 
See apache#17478

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

Co-authored-by: Filippo Rossi <12383260+notfilippo@users.noreply.github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Oct 24, 2025
…nics (apache#18013) (apache#18131)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Related to apache/datafusion-comet#2539
- Related to apache#18072

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Return errors rather than panicking.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

Co-authored-by: Andy Grove <agrove@apache.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Oct 24, 2025
…amps (apache#17777) (apache#18129)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Related to apache#17776
- Related to apache#18072

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

If I've understood semantic equality correctly, any two timestamps
should meet the bar for equality regardless of time units and timezones,
but the current code doesn't reflect that.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Adds a branch to this method for timestamps.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Yes

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

Yes

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Shiv Bhatia <shivbhatia10@gmail.com>
Co-authored-by: Shiv Bhatia <sbhatia@palantir.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Oct 24, 2025
…e#18161) (apache#18179)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123. -->

- Related to apache#18070
- Part of apache#18072

## Rationale for this change

Fix performance regression in Datafusion 50

## What changes are included in this PR?

Backport apache#18161 to `branch-50`

## Are these changes tested?

Yes
## Are there any user-facing changes?

Fix performance regression

Co-authored-by: Yongting You <2010youy01@gmail.com>
comphead pushed a commit that referenced this pull request Oct 27, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#16678.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

The issue has been fixed in apache#16639, this PR just adds a testcase for it.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Add a test case for `to_timestamp(double)` with vectorized input.
Similar to the one presented in the issue.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Yes

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

No
comphead pushed a commit that referenced this pull request Oct 27, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18070

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
See the above issue and its comment
apache#18070 (comment)

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->
In nested loop join, when the join column includes `List(Utf8View)`, use
`take()` instead of `to_array_of_size()` to avoid deep copying the utf8
buffers inside `Utf8View` array.

This is the quick fix, avoiding deep copy inside `to_array_of_size()` is
a bit tricky.
Here is `ListArray`'s physical layout:
https://arrow.apache.org/rust/arrow/array/struct.GenericListArray.html
If multiple elements is pointing to the same list range, the underlying
payload can't be reused.So the potential fix in `to_array_of_size` can
only avoids copying the inner-inner utf8view array buffers, but can't
avoid copying the inner array (i.e. views are still copied), and deep
copying for other primitive types also can't be avoided. Seems this can
be better solved when `ListView` type is ready 🤔

### Benchmark
I tried query 1 in apache#18070,
but only used 3 randomly sampled `places` parquet file.

49.0.0: 4s
50.0.0: stuck > 1 minute
PR: 4s

Now the performance are similar, I suspect the most time is spend
evaluating the expensive `array_has` so the optimization in
apache#16996 can't help much.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
Existing tests
## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->
No
<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Oct 27, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#17913 .

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
- Improve SQL code block rendering by upgrading `pydata-sphinx-theme`
- fix sidebar layout 

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
4. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
yes

## Are there any user-facing changes?

documentation ui
<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 27, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- part of apache#17427 

## Rationale for this change
Adds regular joins (left, right, full, inner) for PWMJ as they behave
differently in the code path.

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?
Adds classic join + physical planner

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?
Yes SLT tests  + unit tests

## Follow up work to this pull request
- Handling partitioned queries and multiple record batches (fuzz testing
will be handled with this)
- Simplify physical planning
- Add more unit tests for different types (another pr as the LOC in this
pr is getting a little daunting)

next would be to implement the existence joins

---------

Co-authored-by: Yongting You <2010youy01@gmail.com>
comphead pushed a commit that referenced this pull request Oct 27, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#17854 .

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 27, 2025
…pache#18017)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#17993

## Rationale for this change

```
DataFusion CLI v50.1.0
> SET TIME ZONE = '+08:00';
0 row(s) fetched. 
Elapsed 0.011 seconds.

> SELECT arrow_typeof(now());
+---------------------------------------+
| arrow_typeof(now())                   |
+---------------------------------------+
| Timestamp(Nanosecond, Some("+08:00")) |
+---------------------------------------+
1 row(s) fetched. 
Elapsed 0.015 seconds.

> SELECT count(1) result FROM (SELECT now() as n) a WHERE n > '2000-01-01'::date;
+--------+
| result |
+--------+
| 1      |
+--------+
1 row(s) fetched. 
Elapsed 0.029 seconds.
```

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?
When the timezone changes, re-register `now()` function

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Oct 27, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

According to three-valued logic we should return `null` and that's also
what happens when the argument is not a constant as seen in the test.

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

Updated `ArrayHas::simplify` to explicitly handle `null`

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

Updated the `array_has` SQL test and added unit tests

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

Yes, a minor change in behaviour wrt `null`

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 27, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#16820 .

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 27, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Part of apache#16602

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Now we have to search in the code comment (or even implementation) to
find the documentation of certain metrics, it would be better to open a
page in the `user-guide` for metrics.

The doc has to be manually updated, the metrics construction is
scattered in the codebase, so it's hard to make it auto-generated.

This PR only includes 2 common metrics, I plan to add more
operator-specific metrics while working on
apache#18116

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 30, 2025
…he#18316)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
Noticed this function when reviewing
apache#18313. I think it’s a good
opportunity to add more documentation.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 30, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18267.
/cc @NGA-TRAN 
## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
The `information_schema.views` does not have display `WITH ORDER` for
the definition of a table.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Added condition for writing `WITH ORDER` for CreateExternalTable.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
Did not add tests for this functionality as not other display
functionality has tests and seems like a separate PR would be
appropriate if this is needed.

This was tested manually with:

In `datafusion-cli`

```
-- Not sorted
CREATE EXTERNAL TABLE  dimension_csv
STORED AS CSV 
LOCATION '/path/to/the/attached/dimension_1.csv'
OPTIONS ('format.has_header' 'true');

-- Sorted
CREATE EXTERNAL TABLE  dimension_csv_sorted
STORED AS CSV 
WITH ORDER (env, service, host)
LOCATION '/path/to/the/attached/dimension_1.csv'
OPTIONS ('format.has_header' 'true');
```
Then running:
```
select * from information_schema.views;
```
With link to data:
[dimension_1.csv](https://github.com/user-attachments/files/23124138/dimension_1.csv)
## Are there any user-facing changes?

Yes, improves the information_schema.views display to include `WITH
ORDER`
<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 30, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18323 .

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Adds more detailed metrics, so it is easier to identify which part of
the aggregate streams are actually slow.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Added a metrics struct, and used it in the functions common to the
aggregate streams.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Yes, added some tests to verify the metrics are actually updated and can
be retrieved.

I've also ran the groupby benchmarks to ensure we don't create timers in
a way that could impact performance, and it seems ok, all the changes
are within what I'd expect as std variation on a local machine.
```
Comparing main and agg-metrics
--------------------
Benchmark h2o.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       main ┃ agg-metrics ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 1252.42 ms │  1196.62 ms │     no change │
│ QQuery 2     │ 3976.62 ms │  3392.89 ms │ +1.17x faster │
│ QQuery 3     │ 3448.29 ms │  2918.47 ms │ +1.18x faster │
│ QQuery 4     │ 1909.15 ms │  1632.98 ms │ +1.17x faster │
│ QQuery 5     │ 3056.36 ms │  2831.82 ms │ +1.08x faster │
│ QQuery 6     │ 2663.13 ms │  2594.64 ms │     no change │
│ QQuery 7     │ 2802.28 ms │  2592.43 ms │ +1.08x faster │
│ QQuery 8     │ 4489.29 ms │  4199.00 ms │ +1.07x faster │
│ QQuery 9     │ 7001.75 ms │  6622.98 ms │ +1.06x faster │
│ QQuery 10    │ 4725.80 ms │  4619.37 ms │     no change │
└──────────────┴────────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main)          │ 35325.09ms │
│ Total Time (agg-metrics)   │ 32601.19ms │
│ Average Time (main)        │  3532.51ms │
│ Average Time (agg-metrics) │  3260.12ms │
│ Queries Faster             │          7 │
│ Queries Slower             │          0 │
│ Queries with No Change     │          3 │
│ Queries with Failure       │          0 │
└────────────────────────────┴────────────┘

```

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

Nothing that is direct to the user, additional metrics will now be
available, but no breaking changes.

---------

Co-authored-by: Raz Luvaton <16746759+rluvaton@users.noreply.github.com>
Co-authored-by: Eshed Schacham <ashdnazg@gmail.com>
comphead pushed a commit that referenced this pull request Oct 30, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18026

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

This makes it possible to support temporary views in datafusion-python
without code duplication.

Ref: apache/datafusion-python#1267

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

- Add new public function `DataFrame::into_temporary_view`
- Update `DataFrameTableProvider` with a new member that determines the
`table_type`
- Add a test

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Yes, see added test `register_temporary_table`

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

Yes, there is a new public function `DataFrame::into_temporary_view`

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Oct 30, 2025
…8321)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18299

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
See writeup in apache#18297
This PR is for the remaining metrics in `DataSourceExec` with parquet
data source.

### Demo
In datafusion-cli
```
CREATE EXTERNAL TABLE IF NOT EXISTS lineitem
STORED AS parquet
LOCATION '/Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem';

set datafusion.explain.analyze_level = summary;

explain analyze select *
from lineitem
where l_orderkey = 3000000;

+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=5, elapsed_compute=48.677µs, output_bytes=1092.0 B]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|                   |   FilterExec: l_orderkey@0 = 3000000, metrics=[output_rows=5, elapsed_compute=1.65872ms, output_bytes=530.8 KB]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                   |     DataSourceExec: file_groups={14 groups: [[Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-0.parquet:0..11525426], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-0.parquet:11525426..20311205, Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-1.parquet:0..2739647], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-1.parquet:2739647..14265073], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-1.parquet:14265073..20193593, Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-2.parquet:0..5596906], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-2.parquet:5596906..17122332], ...]}, projection=[l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment], file_type=parquet, predicate=l_orderkey@0 = 3000000, pruning_predicate=l_orderkey_null_count@2 != row_count@3 AND l_orderkey_min@0 <= 3000000 AND 3000000 <= l_orderkey_max@1, required_guarantees=[l_orderkey in (3000000)], metrics=[output_rows=19813, elapsed_compute=14ns, output_bytes=5.7 MB, files_ranges_pruned_statistics=21 total → 3 matched, page_index_rows_pruned=748901 total → 19813 matched, row_groups_pruned_bloom_filter=1 total → 1 matched, row_groups_pruned_statistics=1 total → 1 matched, bytes_scanned=2147308, metadata_load_time=1.794289ms] |
|                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.081 seconds.
```

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->
Update `row_groups_pruned_statistics`, `row_groups_pruned_bloom_filter`,
`page_index_rows_pruned` with the new `PruningMetrics` metric type.

The functional changes in the pr are in
`datafusion/datasource-parquet/src/*`, it's only a few of lines, most
changes are fixing tests.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
UTs are updated for the new metrics

## Are there any user-facing changes?

No
<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Oct 30, 2025
…#18374)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Temporary file is created during test, see reproducer in
`datafusion-cli`

```sh
yongting@Yongtings-MacBook-Pro-2 ~/C/datafusion (main=)> cargo test --package datafusion --test core_integration --all-features -- dataframe::test_copy_schema --exact --nocapture
   Compiling datafusion v50.3.0 (/Users/yongting/Code/datafusion/datafusion/core)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 2.50s
     Running tests/core_integration.rs (target/debug/deps/core_integration-dee3896b38f536b2)

running 1 test
test dataframe::test_copy_schema ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 801 filtered out; finished in 0.02s

yongting@Yongtings-MacBook-Pro-2 ~/C/datafusion (main=)> git status
On branch main
Your branch is up to date with 'upstream/main'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        "datafusion/core/\"/"

nothing added to commit but untracked files present (use "git add" to track)
```

This PR fixes this test, and make CI stricter for similar temporary file
creations.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
Yes, I have run the CI without fix in my local repo, it's failing as
expected:
https://github.com/2010YOUY01/arrow-datafusion/actions/runs/18913128118/job/53989721867

After the fix, the CI should be able to pass.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 30, 2025
…ields (apache#18100)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#16997
- Part of apache#11725
- Supersedes apache#17085

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

When reviewing apache#17085 I was very confused by the fix suggested, and
tried to understand why `AccumulatorArgs` didn't have easy access to
`Field`s of its input expressions, as compared to scalar/window
functions which do. Introducing this new field should make it easier for
users to grab datatype, metadata, nullability of their input expressions
for aggregate functions.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Add a slice of `FieldRef` to `AccumulatorArgs` so users don't need to
compute the input expression fields themselves via using schema. This
addresses apache#16997 as it was confusing to have only the schema available
as there are valid (?) cases where the schema is empty (such as literal
only input).

This fix differs from apache#17085 in that it doesn't special case for when
there is literal only input; it leaves the physical `schema` provided to
`AccumulatorArgs` untouched but provides a more ergonomic (and less
confusing) API for users to retrieve `Field`s of their input arguments.

- I'm still not sure if the schema being empty for literal only inputs
is correct or not, so this might be considered a side step. If we could
remove `schema` entirely from `AccumulatorArgs` maybe we wouldn't need
to worry about this, but see my comment for why that wasn't done in this
PR

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Existing unit tests.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

Yes, new field to `AccumulatorArgs` which is publicly exposed (with all
it's fields).

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 30, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18334.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

The array-based implementation of date_trunc can produce incorrect
results for negative timestamps (i.e. dates before 1970-01-01). Check
for any such incorrect values and compensate accordingly.

Running the date_trunc benchmark suggests this fix introduces an ~9%
performance cost.

```
date_trunc_minute_1000  time:   [1.7424 µs 1.7495 µs 1.7583 µs]
                        change: [+7.9289% +8.5950% +9.1955%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe
```

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Yes, an SLT is added based on the issue.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Oct 30, 2025
…e#18317)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#15881
  - See my notes below

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Trying to move away from user defined signatures where possible; mainly
to ensure consistency of error checking/messages. The original issue is
because the function has to do this checking itself leading to
inconsistency of error used (ideally shouldn't be internal). By
uplifting away from a user defined signature we can make use of existing
code meant to handle this checking and error messages for us.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Defined range/generate_series signature via coercible API instead of
being user defined. Some accompanying changes are needed in the
signature code to make this possible.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Added SLT tests and fixed any existing ones.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

No (error messages do change though)

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Nov 4, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
It's better to let pruning metrics in parquet displayed in an order that
is the same as the actual pruning order:
```
metrics=...files_ranges_pruned_statistics=21 total → 3 matched, row_groups_pruned_statistics=1 total → 1 matched, row_groups_pruned_bloom_filter=1 total → 1 matched, page_index_rows_pruned=748901 total → 19813 matched...
```
Now it's ordered alphabetically.
See apache#18321 (comment)
for reproducing.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->
Update the sort key API in `MetricValue`, to let the parquet pruning
metrics display in the expected order.

## Are these changes tested?
UT
<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?
No
<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead added a commit that referenced this pull request Nov 4, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change
Found when was testing apache#18356

```
> select date_trunc('YY', now());
Execution error: Unsupported date_trunc granularity: yy

```

Which is confusing, I would like to get a list of supported values
<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Nov 4, 2025
…or UDAFs (apache#18397)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18280

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

`AggregateUDFImpl::is_ordered_set_aggregate` is confusingly named as all
it does currently is permit usage of `WITHIN GROUP` SQL syntax. I don't
think it would have any functionality in the future beyond this. Also
makes it easier if in future we decide to implement [hypothetical-set
aggregate
functions](https://www.postgresql.org/docs/9.4/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE)
too, since we wouldn't need a `is_hypothetical_set_aggregate` variation
either.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Rename `AggregateUDFImpl::is_ordered_set_aggregate` to
`AggregateUDFImpl::supports_within_group_clause`.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Existing tests.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

Yes. Added section to upgrade guide.

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Nov 4, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

Part of apache#18217

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
In `FilterExec`, selectivity is calculated as `output_rows/input_rows`.
This PR supports such metric. I think this metrics provides important
application-level insights, and would be commonly used, so it is
displayed in the `summary` verbose level.

### Demo in `datafusion-cli`
```
> set datafusion.explain.analyze_level = summary;
0 row(s) fetched.
Elapsed 0.000 seconds.

> explain analyze select * from generate_series(100) as t1(v1) where v1 <10;
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                   |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=10, elapsed_compute=1.763µs, output_bytes=64.0 KB]                                                                          |
|                   |   CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=10, elapsed_compute=25.833µs, output_bytes=64.0 KB]                                                                |
|                   |     FilterExec: value@0 < 10, metrics=[output_rows=10, elapsed_compute=34.888µs, output_bytes=128.0 B, selectivity=9.9% (10/101)]                                                      |
|                   |       RepartitionExec: partitioning=RoundRobinBatch(14), input_partitions=1, metrics=[]                                                                                                |
|                   |         LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=0, end=100, batch_size=8192], metrics=[output_rows=101, elapsed_compute=33.167µs, output_bytes=64.0 KB] |
|                   |                                                                                                                                                                                        |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.004 seconds.
```

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->
1. Add a new `MetricValue` for ratio.
2. Tracking selectivity in `FilterExec` with `MetricValue::Ratio`

## Are these changes tested?
UT
<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
3. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?
No
<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Nov 4, 2025
…apache#18430)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

Remove a unnecessary vec allocation in SortMergeJoinStream
initialization

## Rationale for this change

Remove a unnecessary vec allocation in SortMergeJoinStream
initialization

## What changes are included in this PR?

Remove a unnecessary vec allocation in SortMergeJoinStream
initialization

## Are these changes tested?

Covered by existing

## Are there any user-facing changes?

No
comphead pushed a commit that referenced this pull request Nov 4, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

N/A

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

When reviewing apache#18424 I noticed some refactoring that could be applied
to existing array reverse implementation.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

See my comments for the refactors & justifications.

Existing tests.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

No.

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Nov 4, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#16272.

## Rationale for this change

- Null with null type were throwing invalid casts in substrait round
trips

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

- Added a null variation const that allows NULL with NULL types to be
casted properly.
- Made this a UserDefined type from the substrait side of things

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?
- Yes added unit test in producer/types.rs
- Previously failing tests pass

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Nov 4, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes 
- apache#18204
- apache#18081
- fixes apache#18219 as a side effect

## Rationale for this change

Default timezone was previously zulu however with the recent change to
support default tz in now(), current_date(), etc which used to have no
default tz the choice was made to unset the system wide timezone.

## What changes are included in this PR?

Code, tests, upgrading doc.

## Are these changes tested?

Yes, with existing tests.

## Are there any user-facing changes?

Yes. Any query that used to use the default timezone would return a
timestamp with a timezone of 'Z' will now return a timestamp without a
timezone. This can be changed back to the previous behaviour with the
sql

```sql
SET TIMEZONE = '+00:00';
```

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Nov 4, 2025
…8363)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#17670 
- Closes apache#18419

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

Added a `flatten()` `List(LargeList)` test to the `sqllogictest`

Added support for array `flatten()` on `List(LargeList(_))` types
<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

`sqllogictest` passes, but I still need to implement a test where
offsets could not be downcasted from i64 to i32
<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

Users will be able to use `flatten` on `List(LargeList)` types
<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Nov 4, 2025
…ALYZE (apache#18455)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18410

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

This PR adds the `reduction_factor` metric to the `AggregateExec`
mode=Partial case.

e.g from the issue

```
create table t1(a int, b int);
insert into t1 values (1,10), (1, 20), (2,10), (2,30);
explain analyze select a, sum(b) from t1 group by a;

+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                         |
+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | AggregateExec: mode=FinalPartitioned, gby=[a@0 as a], aggr=[sum(t1.b)], metrics=[output_rows=2, elapsed_compute=7.856539ms, output_bytes=544.0 B]                            |
|                   |   CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=2, elapsed_compute=192.334µs, output_bytes=96.0 KB]                                                      |
|                   |     RepartitionExec: partitioning=Hash([a@0], 10), input_partitions=10, metrics=[]                                                                                           |
|                   |       RepartitionExec: partitioning=RoundRobinBatch(10), input_partitions=1, metrics=[]                                                                                      |
|                   |         AggregateExec: mode=Partial, gby=[a@0 as a], aggr=[sum(t1.b)], metrics=[output_rows=2, elapsed_compute=2.581625ms, output_bytes=544.0 B, reduction_factor=50% (2/4)] |
|                   |           DataSourceExec: partitions=1, partition_sizes=[1], metrics=[]                                                                                                      |
|                   |                                                                                                                                                                              |
+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```

Note: For AggregateExec cases where this doesn't apply, the
reduction_factor metric won't be shown. Here's an example of the explain
analyze from the modified test in `explain_analyze.rs`.

```
running query: EXPLAIN ANALYZE SELECT count(*) as cnt FROM (SELECT count(*), c1 FROM aggregate_test_100 WHERE c13 != 'C2GT5KVyOPZpgKVl110TyZO0NcJ434' GROUP BY c1 ORDER BY c1 ) AS a UNION ALL SELECT 1 as cnt UNION ALL SELECT lead(c1, 1) OVER () as cnt FROM (select 1 as c1) AS b LIMIT 3
Query Output:

+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | CoalescePartitionsExec: fetch=3, metrics=[output_rows=3, elapsed_compute=6.084µs, output_bytes=25.0 B]                                                                                                                                                                                                                                                                                                                                                                                                    |
|                   |   UnionExec, metrics=[output_rows=3, elapsed_compute=117.208µs, output_bytes=25.0 B]                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                   |     ProjectionExec: expr=[count(Int64(1))@0 as cnt], metrics=[output_rows=1, elapsed_compute=1.333µs, output_bytes=8.0 B]                                                                                                                                                                                                                                                                                                                                                                                 |
|                   |       AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))], metrics=[output_rows=1, elapsed_compute=70.542µs, output_bytes=8.0 B]                                                                                                                                                                                                                                                                                                                                                                    |
|                   |         CoalescePartitionsExec, metrics=[output_rows=3, elapsed_compute=4.958µs, output_bytes=24.0 B]                                                                                                                                                                                                                                                                                                                                                                                                     |
|                   |           AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))], metrics=[output_rows=3, elapsed_compute=51.835µs, output_bytes=24.0 B]                                                                                                                                                                                                                                                                                                                                                             |
|                   |             ProjectionExec: expr=[], metrics=[output_rows=5, elapsed_compute=2.251µs, output_bytes=0.0 B]                                                                                                                                                                                                                                                                                                                                                                                                 |
|                   |               AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1], aggr=[], metrics=[output_rows=5, elapsed_compute=76.666µs, output_bytes=48.0 KB, spill_count=0, spilled_bytes=0.0 B, spilled_rows=0, peak_mem_used=50544, aggregate_arguments_time=3ns, aggregation_time=3ns, emitting_time=5.875µs, time_calculating_group_ids=9.459µs]                                                                                                                                                            |
|                   |                 CoalesceBatchesExec: target_batch_size=4096, metrics=[output_rows=5, elapsed_compute=11.249µs, output_bytes=192.0 KB]                                                                                                                                                                                                                                                                                                                                                                     |
|                   |                   RepartitionExec: partitioning=Hash([c1@0], 3), input_partitions=3, metrics=[spill_count=0, spilled_bytes=0.0 B, spilled_rows=0, fetch_time=15.064041ms, repartition_time=149.418µs, send_time=8.672µs]                                                                                                                                                                                                                                                                                  |
|                   |                     AggregateExec: mode=Partial, gby=[c1@0 as c1], aggr=[], metrics=[output_rows=5, elapsed_compute=248.667µs, output_bytes=16.0 KB, spill_count=0, spilled_bytes=0.0 B, spilled_rows=0, skipped_aggregation_rows=0, peak_mem_used=52168, aggregate_arguments_time=3ns, aggregation_time=3ns, emitting_time=7.377µs, time_calculating_group_ids=128.46µs, reduction_factor=5.1% (5/99)]                                                                                                   |
|                   |                       CoalesceBatchesExec: target_batch_size=4096, metrics=[output_rows=99, elapsed_compute=81.459µs, output_bytes=64.0 KB]                                                                                                                                                                                                                                                                                                                                                               |
|                   |                         FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434, projection=[c1@0], metrics=[output_rows=99, elapsed_compute=503.793µs, output_bytes=1584.0 B, selectivity=99% (99/100)]                                                                                                                                                                                                                                                                                                      |
|                   |                           RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=1, metrics=[spill_count=0, spilled_bytes=0.0 B, spilled_rows=0, fetch_time=4.160958ms, repartition_time=1ns, send_time=16.085µs]                                                                                                                                                                                                                                                                             |
|                   |                             DataSourceExec: file_groups={1 group: [[Users/peter/Documents/open-source/datafusion/testing/data/csv/aggregate_test_100.csv]]}, projection=[c1, c13], file_type=csv, has_header=true, metrics=[output_rows=100, elapsed_compute=1ns, output_bytes=19.1 KB, batches_split=0, file_open_errors=0, file_scan_errors=0, time_elapsed_opening=313.458µs, time_elapsed_processing=3.974624ms, time_elapsed_scanning_total=3.771208ms, time_elapsed_scanning_until_data=3.714625ms] |
|                   |     ProjectionExec: expr=[1 as cnt], metrics=[output_rows=1, elapsed_compute=20.792µs, output_bytes=8.0 B]                                                                                                                                                                                                                                                                                                                                                                                                |
|                   |       PlaceholderRowExec, metrics=[]                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                   |     ProjectionExec: expr=[lead(b.c1,Int64(1)) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING@1 as cnt], metrics=[output_rows=1, elapsed_compute=1.333µs, output_bytes=9.0 B]                                                                                                                                                                                                                                                                                                                    |
|                   |       BoundedWindowAggExec: wdw=[lead(b.c1,Int64(1)) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: Field { "lead(b.c1,Int64(1)) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING": nullable Int64 }, frame: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING], mode=[Sorted], metrics=[output_rows=1, elapsed_compute=560µs, output_bytes=17.0 B]                                                                                                                             |
|                   |         ProjectionExec: expr=[1 as c1], metrics=[output_rows=1, elapsed_compute=2.459µs, output_bytes=8.0 B]                                                                                                                                                                                                                                                                                                                                                                                              |
|                   |           PlaceholderRowExec, metrics=[]                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
|                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```

The following cases don't include `reduction_factor` metric
- `AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))]`
- `AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))]`
- `AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1], aggr=[]`

While this case does:
- `AggregateExec: mode=Partial, gby=[c1@0 as c1], aggr=[]` ->
`reduction_factor=5.1% (5/99)`

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
Yes

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->
Yes, a new metric will be visible when running `EXPLAIN ANALYZE`

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Yongting You <2010youy01@gmail.com>
comphead pushed a commit that referenced this pull request Nov 4, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18418

## Rationale for this change

Keep dependencies up to date.

## What changes are included in this PR?

Benchmark updates.

## Are these changes tested?

Yes, every single benchmark was run.

## Are there any user-facing changes?

No.
comphead pushed a commit that referenced this pull request Nov 4, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

There's no issue for this, just some simple text fixes.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

While "the the" *can* be grammatically correct, in these instances it
was not.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

"the the" -> "the"

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Not at as such, no.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

Yes, in both code docs and in the website.
comphead pushed a commit that referenced this pull request Nov 4, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18479.

You can see an example run with the warning
[here](https://github.com/apache/datafusion/actions/runs/19056329292/job/54427495514),
just expand the `Check datafusion (no-default-features)` section.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

Fixes a warning when building without the sql feature.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

Makes a `use` declaration conditional based on the sql feature.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

I ran the same command and had no issues.

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

Nope!
comphead pushed a commit that referenced this pull request Nov 7, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- part of #apache#18142.

## Rationale for this change
As discussed in apache#18289 this PR
is for consolidating all the `flight` examples into a single example
binary. Then we can make sure we are agreed on the pattern and then we
can apply it to the remaining examples

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
comphead pushed a commit that referenced this pull request Nov 7, 2025
…st (apache#18485)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- This PR came up as part of apache#17964.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

This PR is intended to fix return type mismatch of spark `array` when
inner data type is `LargeList`, e.g.
```
query error
SELECT array(arrow_cast(array(1), 'LargeList(Int64)'))
----
DataFusion error: Internal error: Function 'array' returned value of type 'LargeList(Field { name: "element", data_type: LargeList(Field { data_type: Int64, nullable: true }), nullable: true })' while the following type was promised at planning time and expected: 'List(Field { name: "element", data_type: LargeList(Field { data_type: Int64, nullable: true }), nullable: true })'.
This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues
```

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

- Return `List` regardless of whether inner data type is `LargeList` or
not. This aligns with the behavior of datafusion `make_array` function.
- Remove `return_field_from_args` as `return_type` is already defined
and is invoked internally.

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

Yes

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

No.

---------

Co-authored-by: Michael Kleen <mkleen@gmail.com>
Co-authored-by: Sergey Zhukov <62326549+cj-zhukov@users.noreply.github.com>
Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: bubulalabu <bubulalabububu@gmail.com>
Co-authored-by: Vegard Stikbakke <vegard.stikbakke@gmail.com>
comphead pushed a commit that referenced this pull request Nov 7, 2025
apache#18481)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18407

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
This new metric will give the user better visibility to see what portion
of the possibilities is actually being matched.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->
Add selectivity metric to NestedLoopJoinExec for EXPLAIN ANALYZE

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
Added tests

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->
Yes, new metric in explain analyze
<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Yongting You <2010youy01@gmail.com>
comphead pushed a commit that referenced this pull request Nov 7, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- part of #apache#18142.

## Rationale for this change
This PR is for consolidating all the `udf` examples into a single
example binary. We are agreed on the pattern and we can apply it to the
remaining examples
<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

---------

Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>
comphead pushed a commit that referenced this pull request Nov 10, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

An initial attempt towards
apache#18467

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
### Rationale for the additional lint rule
`clippy::needless_pass_by_value`
There is a clippy lint rule that is not turned on by the current
strictness level in CI:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_value
Note it has the `Clippy` category `pedantic`, and its description is
`lints which are rather strict or have occasional false positives` from
https://doc.rust-lang.org/nightly/clippy

It seems we have been suffering from the excessive copying issue for
quite some time, and @alamb is on the front line now
apache#18413. I think this extra
lint rule is able to help.

### Implementation plan
This PR only enables this rule in `datafusion-common` package, and apply
`#[allow(clippy::needless_pass_by_value)]` for all violations.
If this PR makes sense, we can open a tracking issue and roll out this
check to the remaining workspace packages. At least this can help
prevent new inefficient patterns and identify existing issues that we
can fix gradually.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Nov 10, 2025
)

## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes apache#18505.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
Enforce clippy `needless_pass_by_value`.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
Yes

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->

No
comphead pushed a commit that referenced this pull request Nov 10, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Follow on to apache#18468

## Rationale for this change

We missed the fact that you couldn't yet add new linter rules to
subcrates via Cargo.toml overrides. Thankfully @Jefffrey sorted is out.
Let's try and avoid that again by leaving a comment

## What changes are included in this PR?

Add comments to help our future selves remember to add new lints to
lib.rs rather than Cargo.toml for subcrates

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
comphead pushed a commit that referenced this pull request Nov 10, 2025
## Which issue does this PR close?

<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->

- Closes #.

## Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
In apache#18468, there is a
inconsistent comment I forget to remove.

## What changes are included in this PR?

<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->

## Are these changes tested?

<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->

## Are there any user-facing changes?

<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->

<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants