38.0.0 (2024-05-07)
Breaking changes:
- refactor: make dfschema wrap schemaref #9595 (haohuaijin)
- Make FirstValue an UDAF, Change AggregateUDFImpl::accumulator signature, support ORDER BY for UDAFs #9874 (jayzhan211)
- Remove OwnedTableReference and OwnedSchemaReference #9933 (comphead)
- Consistent LogicalPlan subquery handling in TreeNode::apply and TreeNode::visit #9913 (peter-toth)
- Refactor Optimizer to use owned plans and TreeNode API (10% faster planning) #9948 (alamb)
- Stop copying plans in LogicalPlan::with_param_values #10016 (alamb)
- Move coalesce to datafusion-functions and remove BuiltInScalarFunction #10098 (Omega359)
- Refactor sessionconfig set fns to avoid an unnecessary enum to string conversion #10141 (psvri)
- ScalarUDF: Remove supports_zero_argument and avoid creating null array for empty args #10193 (jayzhan211)
- Clean-up: Remove AggregateExec::group_by() #10297 (berkaysynnada)
- Remove ScalarFunctionDefinition::Name #10277 (lewiszlw)
- feat: Determine ordering of file groups #9593 (suremarc)
- Split parquet bloom filter config and enable bloom filter on read by default #10306 (lewiszlw)
- Improve coerce API so it does not need DFSchema #10331 (alamb)
- Minor: Do not force analyzer to copy logical plans #10367 (alamb)
- Move Covariance(Sample) covar/covar_samp to be a User Defined Aggregate Function #10372 (jayzhan211)
Performance related:
- perf: Use Arc<str> instead of Cow<&'a> in the analyzer #9824 (comphead)
Implemented enhancements:
- feat: Add display_pg_json for LogicalPlan #9789 (liurenjie1024)
- feat: eliminate redundant sorts on monotonic expressions #9813 (suremarc)
- feat: optimize lower and upper functions #9971 (JasonLi-cn)
- feat: support unnest multiple arrays #10044 (jonahgao)
- feat: DataFrame supports unnesting multiple columns #10118 (jonahgao)
- feat: support input reordering for NestedLoopJoinExec #9676 (korowa)
- feat: add static_name() to ExecutionPlan #10266 (waynexia)
- feat: add optimizer config param to avoid grouping partitions prefer_existing_union #10259 (NGA-TRAN)
- feat: unwrap casts of string and dictionary columns #10323 (erratic-pattern)
- feat: Add CrossJoin match case to unparser #10371 (sardination)
- feat: run expression simplifier in a loop until a fixedpoint or 3 cycles #10358 (erratic-pattern)
Fixed bugs:
- fix: detect non-recursive CTEs in the recursive WITH clause #9836 (jonahgao)
- fix: improve unnest_generic_list handling of null list #9975 (jonahgao)
- fix: reduce lock contention in RepartitionExec::execute #10009 (crepererum)
- fix: RepartitionExec metrics #10025 (crepererum)
- fix: Support Dict types in in_list physical plans #10031 (advancedxy)
- fix: Specify row count in sort_batch for batch with no columns #10094 (viirya)
- fix: another non-deterministic test in joins.slt #10122 (korowa)
- fix: duplicate output for HashJoinExec in CollectLeft mode #9757 (korowa)
- fix: cargo warnings of import item #10196 (waynexia)
- fix: reduce lock contention in distributor channels #10026 (crepererum)
- fix: no longer support the substring function #10242 (jonahgao)
- fix: Correct null_count in describe() #10260 (Weijun-H)
- fix: schema error when parsing order-by expressions #10234 (jonahgao)
- fix: LogFunc simplify swaps arguments #10360 (erratic-pattern)
Documentation updates:
- Update COPY documentation to reflect changes #9754 (alamb)
- doc: Add datafusion-federation to Integrations #9853 (phillipleblanc)
- Improve AggregateUDFImpl::state_fields documentation #9919 (alamb)
- Update datafusion-cli docs, split up #10078 (alamb)
- Fix large futures causing stack overflows #10033 (sergiimk)
- Update documentation to replace Apache Arrow DataFusion with Apache DataFusion #10130 (andygrove)
- Update github repo links #10167 (lewiszlw)
- minor: fix installation section link #10179 (comphead)
- Improve documentation on TreeNode #10035 (alamb)
- Update .asf.yaml to publish docs to datafusion.apache.org #10190 (phillipleblanc)
- Update links to point to datafusion.apache.org #10195 (phillipleblanc)
- doc: fix subscribe mail link to datafusion mailing lists #10225 (jackwener)
- Fix docs.rs build for datafusion-proto (hopefully) #10254 (alamb)
- docs: add download page #10271 (tisonkun)
- Clarify docs explaining the relationship between SessionState and SessionContext #10350 (alamb)
- docs: Add DataFusion subprojects to navigation menu, other minor updates #10362 (andygrove)
Merged pull requests:
- Prepare 37.0.0 Release #9697 (andygrove)
- move Left, Lpad, Reverse, Right, Rpad functions to datafusion_functions #9841 (Omega359)
- Add non-column expression equality tracking to filter exec #9819 (mustafasrepo)
- datafusion-cli support for multiple commands in a single line #9831 (berkaysynnada)
- Add tests for filtering, grouping, aggregation of ARRAYs #9695 (alamb)
- Remove vestigal conbench integration #9855 (alamb)
- feat: Add display_pg_json for LogicalPlan #9789 (liurenjie1024)
- Update COPY documentation to reflect changes #9754 (alamb)
- Minor: Remove the bench most likely to cause OOM in CI #9858 (gruuya)
- Minor: make uuid an optional dependency on datafusion-functions #9771 (alamb)
- doc: Add Spice.ai to Known Users #9852 (phillipleblanc)
- minor: add a hint how to adjust max rows displayed #9845 (comphead)
- Exclude .github directory from release tarball #9850 (andygrove)
- move strpos, substr functions to datafusion_functions #9849 (Omega359)
- doc: Add datafusion-federation to Integrations #9853 (phillipleblanc)
- chore(deps): update cargo requirement from 0.77.0 to 0.78.1 #9844 (dependabot[bot])
- chore(deps-dev): bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /datafusion/wasmtest/datafusion-wasm-app #9741 (dependabot[bot])
- Implement semi/anti join output statistics estimation #9800 (korowa)
- move Log2, Log10, Ln to datafusion-functions #9869 (tinfoil-knight)
- Add CI compile checks for feature flags in datafusion-functions #9772 (alamb)
- move the Translate, SubstrIndex, FindInSet functions to datafusion-functions #9864 (Omega359)
- Support custom struct field names with new scalar function named_struct #9743 (gstvg)
- Allow declaring partition columns in PARTITION BY clause, backwards compatible #9599 (MohamedAbdeen21)
- Minor: Move depcheck out of datafusion crate (200 less crates to compile) #9865 (alamb)
- Minor: delete duplicate bench test #9866 (Lordworms)
- parquet: Add tests for pruning on Int8/Int16/Int64 columns #9778 (progval)
- move Atan2, Atan, Acosh, Asinh, Atanh to datafusion-function #9872 (Weijun-H)
- minor(doc): fix dead link for catalogs example #9883 (yjshen)
- parquet: Add tests for page pruning on unsigned integers #9888 (progval)
- fix(9870): common expression elimination optimization, should always re-find the correct expression during re-write. #9871 (wiedld)
- [CI] Use alias for table.struct #9894 (jayzhan211)
- fix: detect non-recursive CTEs in the recursive WITH clause #9836 (jonahgao)
- Minor: Add SIGMOD paper reference to architecture guide #9886 (alamb)
- refactor: add macro for the binary math function in datafusion-function #9889 (Weijun-H)
- Add benchmark for substr_index #9878 (Omega359)
- Add test for reading back file created with COPY ... OPTIONS (FORMAT..) options #9753 (alamb)
- Add Expr->String for SimilarTo, IsNotTrue, IsNotUnknown, Negative #9902 (yyy1000)
- refactor: make dfschema wrap schemaref #9595 (haohuaijin)
- Add spilled_rows metric to ExternalSorter by IPCWriter #9885 (erenavsarogullari)
- Minor: Add ParquetExec::table_parquet_options accessor #9909 (alamb)
- Add support for Bloom filters on unsigned integer columns in Parquet tables #9770 (progval)
- Move radians, signum, sin, sinh and sqrt functions to datafusion-functions crate #9882 (erenavsarogullari)
- refactor: make all udf function impls public #9903 (universalmind303)
- Minor: Improve math expr description #9911 (caicancai)
- perf: Use Arc<str> instead of Cow<&'a> in the analyzer #9824 (comphead)
- Use struct instead of named_struct when there are no aliases #9897 (alamb)
- Improve planning speed using impl Into<Arc<str>> to create Arc rather than &str #9916 (alamb)
- Make FirstValue an UDAF, Change AggregateUDFImpl::accumulator signature, support ORDER BY for UDAFs #9874 (jayzhan211)
- Add TPCH-DS planning benchmark #9907 (alamb)
- Simplify Expr::map_children #9876 (peter-toth)
- CrossJoin Refactor #9830 (berkaysynnada)
- Optimization: concat function #9732 (JasonLi-cn)
- Improve AggregateUDFImpl::state_fields documentation #9919 (alamb)
- chore(deps): update substrait requirement from 0.28.0 to 0.29.0 #9942 (dependabot[bot])
- test: fix intermittent failure in cte.slt #9934 (jonahgao)
- Move cbrt, cos, cosh, degrees to datafusion-functions #9938 (erenavsarogullari)
- Add Expr->String for Exists, Sort #9936 (kevinmingtarja)
- Remove OwnedTableReference and OwnedSchemaReference #9933 (comphead)
- Prune out constant expressions from output ordering. #9947 (mustafasrepo)
- Move AggregateExpr, PhysicalExpr and PhysicalSortExpr to physical-expr-core #9926 (jayzhan211)
- Minor: Update release README #9956 (alamb)
- Optimize COUNT(1): Change the sentinel value's type for COUNT(*) to Int64 #9944 (gruuya)
- Improve docs for TableProvider::supports_filters_pushdown and remove deprecated function #9923 (alamb)
- Minor: Improve documentation for AggregateUDFImpl::accumulator and AccumulatorArgs #9920 (alamb)
- Minor: improve TableReference docs #9952 (alamb)
- Fix datafusion-cli publishing #9955 (alamb)
- Simplify TreeNode recursions #9965 (peter-toth)
- Validate partitions columns in CREATE EXTERNAL TABLE if table already exists. #9912 (MohamedAbdeen21)
- Minor: Add additional documentation to CommonSubexprEliminate #9959 (alamb)
- Fix tpcds planning stack overflows - Join planning refactoring #9962 (Jefffrey)
- coercion vec[Dictionary, Utf8] to Dictionary for coalesce function #9958 (Lordworms)
- Minor: Update library documentation with new crates #9966 (alamb)
- Minor: Return InternalError rather than panic for NamedStructField should be rewritten in OperatorToFunction #9968 (alamb)
- minor: update MSRV 1.73 #9977 (comphead)
- Move First Value UDAF and builtin first / last function to aggregate-functions #9960 (jayzhan211)
- Minor: Avoid copying all expressions in Analzyer/check_plan #9974 (alamb)
- Minor: Improve documentation about optimizer #9967 (alamb)
- Minor: Use Expr::apply() instead of inspect_expr_pre() #9984 (peter-toth)
- Update documentation for COPY command #9931 (alamb)
- Minor: fix bug in pruning predicate doc #9986 (alamb)
- fix: improve unnest_generic_list handling of null list #9975 (jonahgao)
- Consistent LogicalPlan subquery handling in TreeNode::apply and TreeNode::visit #9913 (peter-toth)
- Remove unnecessary result in DFSchema::index_of_column_by_name #9990 (lewiszlw)
- Removes Bloom filter for Int8/Int16/Uint8/Uint16 #9969 (edmondop)
- Move LogicalPlan tree_node module #9995 (alamb)
- Optimize performance of substr_index and add tests #9973 (kevinmingtarja)
- move Floor, Gcd, Lcm, Pi to datafusion-functions #9976 (Omega359)
- Minor: Improve documentation on LogicalPlan::apply* and LogicalPlan::map* #9996 (alamb)
- move the Log, Power functions to datafusion-functions #9983 (tinfoil-knight)
- Remove FORMAT <..> backwards compatibility options from COPY #9985 (tinfoil-knight)
- move Trunc, Cot, Round, iszero functions to datafusion-functions #10000 (Omega359)
- Minor: Clarify documentation on PruningStatistics::row_counts and PruningStatistics::null_counts and make test match #10004 (alamb)
- Avoid LogicalPlan::clone() in LogicalPlan::map_children when possible #9999 (alamb)
- Introduce TreeNode::exists() API, avoid copying expressions #10008 (peter-toth)
- Minor: Make LogicalPlan::apply_subqueries and LogicalPlan::map_subqueries pub #9998 (alamb)
- Move Nanvl and random functions to datafusion-functions #10017 (Omega359)
- fix: reduce lock contention in RepartitionExec::execute #10009 (crepererum)
- chore(deps): update rstest requirement from 0.18.0 to 0.19.0 #10021 (dependabot[bot])
- Minor: Document LogicalPlan tree node transformations #10010 (alamb)
- Refactor Optimizer to use owned plans and TreeNode API (10% faster planning) #9948 (alamb)
- Further clarification of the supports_filters_pushdown documentation #9988 (cisaacson)
- Prune columns are all null in ParquetExec by row_counts , handle IS NOT NULL #9989 (Ted-Jiang)
- Improve the performance of ltrim/rtrim/btrim #10006 (JasonLi-cn)
- fix: RepartitionExec metrics #10025 (crepererum)
- modify emit() of TopK to emit on batch_size rather than batch_size-1 #10030 (JasonLi-cn)
- Consolidate LogicalPlan tree node walking/rewriting code into one module #10034 (alamb)
- Introduce OptimizerRule::rewrite to rewrite in place, rewrite ExprSimplifier (20% faster planning) #9954 (alamb)
- Fix DistinctCount for timestamps with time zone #10043 (joroKr21)
- Improve documentation on LogicalPlan TreeNode methods #10037 (alamb)
- chore(deps): update prost-build requirement from =0.12.3 to =0.12.4 #10045 (crepererum)
- Fix datafusion-cli cursor isn't on the right position in windows 7 cmd #10028 (colommar)
- Always pass DataType to PrimitiveDistinctCountAccumulator #10047 (joroKr21)
- Stop copying plans in LogicalPlan::with_param_values #10016 (alamb)
- fix NamedStructField should be rewritten in OperatorToFunction in subquery regression (change ApplyFunctionRewrites to use TreeNode API #10032 (alamb)
- Avoid copies in InlineTableScan via TreeNode API #10038 (alamb)
- Bump sccache-action to v0.0.4 #10060 (phillipleblanc)
- chore: add GitHub workflow to close stale PRs #10046 (andygrove)
- feat: eliminate redundant sorts on monotonic expressions #9813 (suremarc)
- Disable crypto_expressions feature properly for --no-default-features #10059 (phillipleblanc)
- Return self in EmptyExec and PlaceholderRowExec with_new_children #10052 (joroKr21)
- chore(deps): update sqllogictest requirement from 0.19.0 to 0.20.0 #10057 (dependabot[bot])
- Rename FileSinkExec to DataSinkExec #10065 (phillipleblanc)
- fix: Support Dict types in in_list physical plans #10031 (advancedxy)
- Prune pages are all null in ParquetExec by row_counts and fix NOT NULL prune #10051 (Ted-Jiang)
- Refactor EliminateOuterJoin to implement OptimizerRule::rewrite() #10081 (peter-toth)
- chore(deps): update substrait requirement from 0.29.0 to 0.30.0 #10084 (dependabot[bot])
- feat: optimize lower and upper functions #9971 (JasonLi-cn)
- Prepend sqllogictest explain result with line number #10019 (duongcongtoai)
- Use PhysicalExtensionCodec consistently #10075 (joroKr21)
- Minor: Do not truncate SHOW ALL in datafusion-cli #10079 (alamb)
- Minor: get mutable ref to SessionConfig in SessionState #10050 (MichaelScofield)
- Move ceil, exp, factorial to datafusion-functions crate #10083 (erenavsarogullari)
- feat: support unnest multiple arrays #10044 (jonahgao)
- cleanup(tests): Move tests from push_down_projections.rs to optimize_projections.rs #10071 (kavirajk)
- Move conversion of FIRST/LAST Aggregate function to independent physical optimizer rule #10061 (jayzhan211)
- Avoid copies in CountWildcardRule via TreeNode API #10066 (alamb)
- Coerce Dictionary types for scalar functions #10077 (viirya)
- Refactor UnwrapCastInComparison to implement OptimizerRule::rewrite() #10087 (peter-toth)
- Improve ApproxPercentileAccumulator merge api and fix bug #10056 (Ted-Jiang)
- Support http s3 endpoints in datafusion-cli via CREATE EXTERNAL TABLE #10080 (alamb)
- [Bug Fix]: Deem hash repartition unnecessary when input and output has 1 partition #10095 (mustafasrepo)
- fix: Specify row count in sort_batch for batch with no columns #10094 (viirya)
- Move concat, concat_ws, ends_with, initcap to datafusion-functions #10089 (Omega359)
- Update datafusion-cli docs, split up #10078 (alamb)
- Refactor physical create_initial_plan to iteratively & concurrently construct plan from the bottom up #10023 (Jefffrey)
- Adding TPCH benchmarks for Sort Merge Join #10092 (comphead)
- [minor] make parquet prune tests more readable #10112 (Ted-Jiang)
- Fix intermittent CI test failure in joins.slt #10120 (alamb)
- Update dependabot to consider datafusion-cli #10108 (Jefffrey)
- fix: another non-deterministic test in joins.slt #10122 (korowa)
- Minor: only trigger dependency check on changes to Cargo.toml #10099 (alamb)
- Refactor UnwrapCastInComparison to remove Expr clones #10115 (peter-toth)
- Fix large futures causing stack overflows #10033 (sergiimk)
- Avoid cloning in log::simplify and power::simplify #10086 (alamb)
- feat: DataFrame supports unnesting multiple columns #10118 (jonahgao)
- Minor: Refine dev/release/README.md #10129 (alamb)
- Minor: Add default for Expr #10127 (peter-toth)
- Update documentation to replace Apache Arrow DataFusion with Apache DataFusion #10130 (andygrove)
- Fix AVG groups accummulator ignoring return type #10114 (gruuya)
- Port 37.1.0 changes to main #10136 (alamb)
- chore(deps): update substrait requirement from 0.30.0 to 0.31.0 #10140 (dependabot[bot])
- Minor: Support more args for udaf #10146 (jayzhan211)
- Minor: Signature check for UDAF #10147 (jayzhan211)
- minor: avoid cloning the SetExpr during planning of SelectInto #10152 (jonahgao)
- Add distinct aggregate tests to sqllogictest #10158 (Jefffrey)
- Add test for LIKE newline handling #10160 (Jefffrey)
- minor: unparser cleanup and new roundtrip test #10150 (devinjdangelo)
- Support Duration and Union types in ScalarValue::iter_to_array #10139 (joroKr21)
- chore(deps): update sqlparser requirement from 0.44.0 to 0.45.0 #10137 (Jefffrey)
- fix: duplicate output for HashJoinExec in CollectLeft mode #9757 (korowa)
- Move coalesce to datafusion-functions and remove BuiltInScalarFunction #10098 (Omega359)
- [DOC] Add test example for backtraces #10143 (comphead)
- Update github repo links #10167 (lewiszlw)
- feat: support input reordering for NestedLoopJoinExec #9676 (korowa)
- minor: fix installation section link #10179 (comphead)
- Improve TreeNode and LogicalPlan APIs to accept owned closures, deprecate transform_down_mut() and transform_up_mut() #10126 (peter-toth)
- Projection Expression - Input Field Inconsistencies during Projection #10088 (berkaysynnada)
- implement short_circuits function for ScalarUDFImpl trait #10168 (Lordworms)
- Improve documentation on TreeNode #10035 (alamb)
- implement rewrite for ExtractEquijoinPredicate and avoid clone in filter #10165 (Lordworms)
- Update .asf.yaml to point to new mailing list #10189 (phillipleblanc)
- Update NOTICE.txt to be relevant to DataFusion #10185 (alamb)
- Update .asf.yaml to publish docs to datafusion.apache.org #10190 (phillipleblanc)
- Minor: Add Column::from(Tableref, &FieldRef), Expr::from(Column) and Expr::from(Tableref, &FieldRef) #10178 (alamb)
- implement rewrite for FilterNullJoinKeys #10166 (Lordworms)
- Implement rewrite for EliminateOneUnion and EliminateJoin #10184 (Lordworms)
- Update links to point to datafusion.apache.org #10195 (phillipleblanc)
- Minor: Introduce Expr::is_volatile(), adjust TreeNode::exists() #10191 (peter-toth)
- Doc: Modify docs to fix old naming #10199 (comphead)
- [MINOR] Remove ScalarFunction from datafusion.proto #10173 #10202 (dmitrybugakov)
- Allow expr_to_sql unparsing with no quotes #10198 (phillipleblanc)
- Minor: Avoid a clone in ArrayFunctionRewriter #10204 (alamb)
- Move coalesce function from math to core #10201 (xxxuuu)
- fix: cargo warnings of import item #10196 (waynexia)
- Minor: Remove some clone in TypeCoercion #10203 (alamb)
- doc: fix subscribe mail link to datafusion mailing lists #10225 (jackwener)
- Minor: Prevent empty datafusion-cli commands #10219 (comphead)
- Optimize date_bin (2x faster) #10215 (simonvandel)
- Refactor sessionconfig set fns to avoid an unnecessary enum to string conversion #10141 (psvri)
- fix: reduce lock contention in distributor channels #10026 (crepererum)
- Avoid Expr copies OptimizeProjection, 12% faster planning, encapsulate indicies #10216 (alamb)
- chore: Create a doap file #10233 (tisonkun)
- Allow adding user defined metadata to ParquetSink #10224 (wiedld)
- refactor EliminateDuplicatedExpr optimizer pass to avoid clone #10218 (Lordworms)
- Support for median(distinct) aggregation function #10226 (Jefffrey)
- Add tests that random() and uuid() produce unique values for each row #10248 (alamb)
- ScalarUDF: Remove supports_zero_argument and avoid creating null array for empty args #10193 (jayzhan211)
- Add Expr->String for WindowFunction #10243 (yyy1000)
- Make function modules public, add Default impl's. #10239 (Omega359)
- chore: Update release scripts to reflect move to TLP #10235 (andygrove)
- Stop copying plans in EliminateLimit #10253 (kevinmingtarja)
- Minor Clean-up in JoinSelection Tests #10249 (berkaysynnada)
- fix: no longer support the substring function #10242 (jonahgao)
- Fix docs.rs build for datafusion-proto (hopefully) #10254 (alamb)
- Minor: Possibility to strip datafusion error name #10186 (comphead)
- Docs: Add governance page to contributor guide #10238 (alamb)
- Improve documentation on ColumnarValue #10265 (alamb)
- Minor: Add comments for removed protobuf nodes #10252 (alamb)
- feat: add static_name() to ExecutionPlan #10266 (waynexia)
- Zero-copy conversion from SchemaRef to DfSchema #10298 (tustvold)
- chore: Update Error for Unnest Rewritter #10263 (Weijun-H)
- feat(CLI): print column headers for empty query results #10300 (jonahgao)
- Clean-up: Remove AggregateExec::group_by() #10297 (berkaysynnada)
- Add mailing list descriptions to documentation #10284 (alamb)
- chore(deps): update substrait requirement from 0.31.0 to 0.32.0 #10279 (dependabot[bot])
- refactor: Convert IPCWriter metrics from u64 to usize #10278 (erenavsarogullari)
- Validate ScalarUDF output rows and fix nulls for array_has and get_field for Map #10148 (duongcongtoai)
- Minor: return NULL for range and generate_series #10275 (Lordworms)
- docs: add download page #10271 (tisonkun)
- Minor: Add some more tests to map.slt #10301 (alamb)
- fix: Correct null_count in describe() #10260 (Weijun-H)
- chore: Add datatype info to error message #10307 (viirya)
- feat: add optimizer config param to avoid grouping partitions prefer_existing_union #10259 (NGA-TRAN)
- Remove ScalarFunctionDefinition::Name #10277 (lewiszlw)
- Display: Support preserve_partitioning on SortExec physical plan. #10153 (kavirajk)
- Fix build with missing use(" return internal_err!("UDF returned a different ...") #10317 (alamb)
- [Minor] Update link to list of committers in contributor guide #10312 (alamb)
- Optimize EliminateFilter to avoid unnecessary copies #10288 #10302 (dmitrybugakov)
- chore: add function to set prefer_existing_union #10322 (NGA-TRAN)
- ExecutionPlan visitor example documentation #10286 (matthewmturner)
- fix: schema error when parsing order-by expressions #10234 (jonahgao)
- Stop copying LogicalPlan and Exprs in RewriteDisjunctivePredicate #10305 (rohitrastogi)
- feat: unwrap casts of string and dictionary columns #10323 (erratic-pattern)
- feat: Determine ordering of file groups #9593 (suremarc)
- Stop copying LogicalPlan and Exprs in DecorrelatePredicateSubquery #10318 (alamb)
- Minor: Add additional coalesce tests #10334 (alamb)
- Minor: add a few more dictionary unwrap tests #10335 (alamb)
- Check list size before concat in ScalarValue #10329 (timsaucer)
- Split parquet bloom filter config and enable bloom filter on read by default #10306 (lewiszlw)
- Improve coerce API so it does not need DFSchema #10331 (alamb)
- Stop copying LogicalPlan and Exprs in PropagateEmptyRelation #10332 (dmitrybugakov)
- Stop copying LogicalPlan and Exprs in EliminateNestedUnion #10319 (emgeee)
- Fix clippy lints found by Clippy in Rust 1.78 #10353 (alamb)
- Minor: Add sql level test for lead/lag on arrays #10345 (alamb)
- fix: LogFunc simplify swaps arguments #10360 (erratic-pattern)
- Refine documentation for Transformed::{update,map,transform})_data #10355 (alamb)
- Clarify docs explaining the relationship between SessionState and SessionContext #10350 (alamb)
- Optimized push down filter #10291 #10366 (dmitrybugakov)
- Unparser: Support ORDER BY in window function definition #10370 (yyy1000)
- docs: Add DataFusion subprojects to navigation menu, other minor updates #10362 (andygrove)
- feat: Add CrossJoin match case to unparser #10371 (sardination)
- Minor: Do not force analyzer to copy logical plans #10367 (alamb)
- Minor: Move Sum aggregate function test to slt #10382 (jayzhan211)
- chore: remove DataPtr trait since Arc::ptr_eq ignores pointer metadata #10378 (intoraw)
- Move Covariance(Sample) covar/covar_samp to be a User Defined Aggregate Function #10372 (jayzhan211)
- Support limit in StreamingTableExec #10309 (lewiszlw)
- Minor: Move count test to slt #10383 (jayzhan211)
- [MINOR]: Reduce test run time #10390 (mustafasrepo)
- Fix coalesce, struct and named_strct expr_fn function to take multiple arguments #10321 (alamb)
- Minor: remove old create_physical_expr to scalar_function #10387 (jayzhan211)
- Move average unit tests to slt #10401 (lewiszlw)
- Move array_agg unit tests to slt #10402 (lewiszlw)
- feat: run expression simplifier in a loop until a fixedpoint or 3 cycles #10358 (erratic-pattern)
- Add SessionContext/SessionState::create_physical_expr() to create PhysicalExpressions from Exprs #10330 (alamb)