12.0.0 (2022-09-12)
Breaking changes:
- Pass `return_type` to `AccumulatorFunctionImplementation` for user defined aggregates #3428 (alamb)
- Use `usize` rather than `Option<usize>` to represent `Limit::skip` and `Limit::offset` #3374 [sql] (HaoYang670)
- Deprecate legacy datafusion::logical_plan module #3338 (andygrove)
- Update signature for Expr.name so that schema is no longer required #3336 (andygrove)
- MINOR: rename optimizer rule to ScalarSubqueryToJoin #3306 (kmitchener)
- Add top-level `Like`, `ILike`, `SimilarTo` expressions in logical plan #3298 [sql] (andygrove)
- Upgrade to sqlparser 0.22 #3278 [sql] (andygrove)
- `Expr` variants for boolean operations #3275 [sql] (sarahyurick)
- Upgrade to sqlparser 0.21 #3200 [sql] (andygrove)
- Add SQL planner support for `Like`, `ILike` and `SimilarTo`, with optional escape character #3101 [sql] (andygrove)
Implemented enhancements:
- support `cast` inside `values` #3446
- update TPCH test schemas to use Decimal128 from Float #3435
- Include Bitwise operators in the documentation #3434
- How to read excel file with datafusion? #3433
- Pass return type to the accumulator state factory in aggregates #3427
- Support bitwise XOR operator (`#`) #3420
- support InList with datatype Date32 #3412
- add simplification for `between` expression during logical plan optimization #3402
- Replace From trait with TryFrom trait for datafusion-proto crate #3401
- update TPC-H benchmark to Decimal types from Float #3392
- Use `usize` to represent `Limit::skip` #3369
- Avoid copying in `LogicalPlan::expressions` #3368
- Upgrade to Arrow 22 #3362
- Eliminate `OFFSET 0` in the logical plan optimization #3355
- Add ability to get unoptimized logical plan from DataFrame #3340
- Allow IDEs to recognize generated code #3332
- `CAST` should not change the name of an expression #3326
- add SQL support for unsigned integers #3325
- Review use of panic in `datafusion-proto` crate #3318
- Review use of panic in `datafusion-sql` crate #3315
- Review use of panic in `datafusion-optimizer` crate #3314
- Review use of panic in `datafusion-expr` crate #3312
- Support registration of custom TableProviders through SQL #3310
- Support binary data in sha hash functions #3308
- add SQL support for tinyint and unsigned versions of all INTs #3307
- Support binary types in InList expression #3300
- Physical planner should map `IsTrue` and similar expressions to `IsDistinctFrom` #3288
- Introduce physical plan version of `Operator` enum #3269
- Introduce `Expr` variants for `IS [NOT] TRUE / FALSE / UNKNOWN` #3268
- Add support for non-correlated subqueries #3266 [sql]
- (Re-)add support for glob patterns in ListingTableUrl #3261
- `PreCastLitInComparisonExpressions` should use ExprRewriter and support nested expressions #3259
- implement `DROP VIEW` #3251
- Upgrade to Arrow 21 #3224
- Add TypeCoercion optimizer rule #3221
- Create bench for approx_percentile_cont aggregate #3217
- Add SQL query planner support for `DISTRIBUTE BY` #3207
- Support "IS [NOT] UNKNOWN" syntax #3195
- sqlparser 0.21 upgrade #3192
- Re-implement parsing/planning for SHOW TABLES due to sqlparser changes #3188
- Support `SUM`, `AVG`, `MIN`, `MAX` on `Time` columns. #3166
- Support "IS TRUE/FALSE" syntax #3159
- Support number of histogram bins in approx_percentile_cont #3145
- Support create ApproxPercentileAccumulator with TDigest max_size #3142
- Remove support for `array` function and only support `array[]` style postgres syntax #3115
- Allow inline column aliases for create view #3108 [sql]
- Add support for Postgres `SIMILAR TO` and `ILIKE` syntax #3099 [sql]
- Update SQL reference in user guide to cover all supported syntax #3091
- DataFusion prelude should import all logical expression functions #3068
- Proposal: Add similar to operator #3016 [sql]
- Release DataFusion 11.0.0 #3012
- Implement "SHOW CREATE TABLE" for external tables #2848
- Change java package names in protobuf files #2513
- When creating `DFField` from `Expr` we should provide input plan not input schema #2456
- Support "IS NOT TRUE/FALSE" syntax #2265
- RFC: Spill-To-Disk Object Storage Download #2205
- Support for BitwiseAnd `&`, BitOr `|` binary operators #1619
- [Question] Usage of async object store APIs in consuming code #1313
- Allow User Defined Aggregates to return multiple values / structs #600
- Implement vectorized hashing for dictionary types #331
Fixed bugs:
- Intermittent build error when changing selected features #3366
- `sql::timestamp::timestamp_add_interval_months` failing since September 1st #3327
- `sql::timestamp::timestamp_add_interval_months` test fails #3322
- test case `timestamp_add_interval_months` failed on master branch #3321
- datafusion-proto does not support untyped null scalar values #3302
- `ConfigOptions` creation is slow #3295
- FilterPushDown optimization through UNION ALL results in SchemaError #3281
- Execute LogicalPlans after building for TPCH Benchmarks #3273
- `CREATE TABLE` should return empty DataFrame #3265 [sql]
- `CREATE EXTERNAL TABLE` from CSV creates a table with no columns if there is just a header row #3263
- View TableProvider ignores projections, resulting in invalid plans #3240
- CREATE VIEW should return an empty dataframe on success #3236
- `DISTRIBUTE BY` expressions get removed during optimization #3234
- datafusion cannot recognize Chinese characters. #3203
- Panicked at 'byte index 1 is out of bounds on invalid query #3190
- `like_nlike_with_null_lt` fails with latest sqlparser code #3187
- Interval Literal output inconsistent date_type #3180
- `array` function allows different data types #3123
- eq operator doesn't work on binary data #3117
- incorrect `where` clause comparison while using table alias #3073
- Some functions are incorrectly declared as unary #3069
- once now() is called in a statement, it forever returns the same value #3057
- single_distinct_to_groupby panic when group by expr is a binaryExpr #2994
- Cannot have `order by` expression that references complex `group by` expression #2360
- Fix some bugs in TypeCoercion rule #3407 (andygrove)
- MINOR: Stop ignoring `AggregateFunction::distinct` in protobuf serde code #3250 (andygrove)
- Add assertion for invariant in `create_physical_expression` and fix ViewTable projection #3242 (andygrove)
- Fix bug where optimizer was removing `Partitioning::DistributeBy` expressions #3229 (andygrove)
Documentation updates:
Closed issues:
Merged pull requests:
- minor: fix some typo. #3453 (jackwener)
- Update criterion requirement from 0.3 to 0.4 #3452 (dependabot[bot])
- Update object_store requirement from 0.4.0 to 0.5.0 #3451 (dependabot[bot])
- add `cast` support inside `values` #3447 [sql] (kmitchener)
- Use hash repartitioning for aggregates on dictionaries #3445 (isidentical)
- Review `unwrap` and `panic` from the `aggregate` directory of `datafusion-physical-expr` #3443 (iajoiner)
- MINOR: Implement protobuf serde for all binary operators #3441 (andygrove)
- MINOR: Add accessor methods to DateTimeIntervalExpr #3440 (andygrove)
- update TPCH-mimicking tests to Decimal data type from Float, matching the benchmark #3438 (kmitchener)
- Include Bitwise operators in the documentation #3436 (askoa)
- minor: make sql number parsing slightly more efficient + functional #3432 [sql] (alamb)
- Implement bitwise XOR operator (`#`) #3430 [sql] (askoa)
- Replace From trait with TryFrom trait for datafusion-proto crate #3401 #3429 (comphead)
- Tests showing user defined aggregate returning a struct #3425 (alamb)
- MINOR: update optimizer rule names to be consistent style as the rest #3415 (kmitchener)
- Support date32 and date 64 in inlist node #3413 (Ted-Jiang)
- Update sqlparser requirement from 0.22 to 0.23 #3411 [sql] (dependabot[bot])
- simplify the `between` expr during logical plan optimization #3404 (kmitchener)
- MINOR: Improve optimizer error #3403 (andygrove)
- Review panics in the sql crate #3397 [sql] (HaoYang670)
- changed TPC-H benchmark to use Decimal types #3393 (kmitchener)
- minor: remove redundant code. #3389 (jackwener)
- Add dictionary cases to merge bench #3384 (tustvold)
- Implement Eq trait for Expr and nested types #3381 (jdye64)
- Minor: Improvements to type coercion rule #3379 (alamb)
- MINOR: Note that most communication happens on github #3375 (alamb)
- minor fix: clean data type for negative operation #3370 (liukun4515)
- Fix code generation for json feature #3367 (avantgardnerio)
- Review use of panic in datafusion-proto crate #3365 (comphead)
- Upgrade to arrow 22 #3363 [sql] (avantgardnerio)
- return empty dataframe on create table, remove a duplicate optimize call #3361 (kmitchener)
- Add SQL support for `tinyint`, `smallint`, and `unsigned int` variants #3359 [sql] (kmitchener)
- Minor: add hint in README of example #3358 (jackwener)
- Collect to `HashSet` directly in `in_list` #3356 (HaoYang670)
- MINOR: Add comments about rewrite_disjunctive_predicate #3351 (alamb)
- [MINOR] Add debug logging to plan teardown #3350 (alamb)
- MINOR: add df.to_unoptimized_plan() to docs, remove erroneous comment #3348 (kmitchener)
- Replace `unwrap` in `convert_to_ordered_float` and add `downcast_value` #3347 (iajoiner)
- Remove panics from `common_subexpr_eliminate` #3346 (andygrove)
- Remove Result.unwrap from single_distinct_to_groupby #3345 (andygrove)
- Add to_unoptimized_plan #3344 (iajoiner)
- Remove panics from simplify_expressions optimizer rule #3343 (andygrove)
- Remove `unreachable!` from filter push down rule #3342 (andygrove)
- Replace panic in `datafusion-expr` crate #3341 (iajoiner)
- Re-implement ExprIdentifierVisitor::desc_expr to use Expr::Display #3339 (andygrove)
- Fix the test `timestamp_add_interval_months` #3337 (HaoYang670)
- Bump lz4-sys from 1.9.3 to 1.9.4 in /datafusion-cli #3335 (dependabot[bot])
- Make binary operator formatting consistent between logical and physical plans #3331 (andygrove)
- Fix build: Ignore failing test #3329 (andygrove)
- Add `InList` support for binary type. #3324 (HaoYang670)
- MINOR: add github action trigger #3323 (waynexia)
- add explain sql test for optimizer rule PreCastLitInComparisonExpressions #3320 (liukun4515)
- Custom / Dynamic table provider factories #3311 [sql] (avantgardnerio)
- fix: alias group_by exprs in single_distinct_to_groupby optimizer #3305 (waynexia)
- Add support for serializing null scalar values #3303 (andygrove)
- Finish integrating `Expr::Is[Not]True` and similar expressions #3301 [sql] (andygrove)
- MINOR: Remove `unwrap` calls from `single_distinct_to_groupby` optimizer rule #3299 (andygrove)
- docs: update the Python library repository #3297 (haoxins)
- fix: speed up `ConfigOptions` creation #3296 (crepererum)
- Execute LogicalPlans after building for TPCH Benchmarks #3290 (DaltonModlin)
- support for non-correlated subqueries #3287 (kmitchener)
- Add `Aggregate::try_new` with validation checks #3286 (andygrove)
- Fix SchemaError in FilterPushDown optimization with UNION ALL #3282 (jonmmease)
- Allow sorting by aggregated groups #3280 (isidentical)
- Add show external tables #3279 [sql] (psvri)
- Return from task execution if send fails as there is nothing more to do (faster cancel / limit) #3276 (nvartolomei)
- Let prelude import all expression functions #3274 (sadilet)
- Fix no schema when CSV is only header #3272 (comphead)
- support inlist for pre cast literal expression #3270 (liukun4515)
- implement `drop view` #3267 [sql] (kmitchener)
- Use `ExprRewriter` in `pre_cast_lit_in_comparison` #3260 (andygrove)
- Add type coercion for UDFs in logical plan #3254 (andygrove)
- Support "IS NOT TRUE/FALSE" syntax #3252 [sql] (sarahyurick)
- Implement `IS UNKNOWN` / `IS NOT UNKNOWN` operators #3246 [sql] (isidentical)
- support decimal data type for the optimizer rule of PreCastLitInComparisonExpressions #3245 (liukun4515)
- chore: update cranelifts to 0.87.0 #3243 (yjshen)
- Moved nullif out of unary functions #3241 (comphead)
- MINOR: documentation updates #3239 (kmitchener)
- MINOR: Add bounds check to Column physical expression #3238 (andygrove)
- CREATE VIEW should return empty dataframe #3237 (kmitchener)
- Support "IS TRUE/FALSE" syntax (redo) #3235 [sql] (sarahyurick)
- Fix propagation of optimized predicates on nested projections #3228 (isidentical)
- Add more trim test cases #3226 (ayushdg)
- Upgrade to arrow 21 #3225 [sql] (avantgardnerio)
- Add optimizer rule for type coercion (binary operations only) #3222 (andygrove)
- [Improve] Use arrow::compute::sort in approx_percentile_cont #3219 (Ted-Jiang)
- [minor] fix bench aggregate_query_sql meta #3218 (Ted-Jiang)
- minor: refactor simplify negate #3213 (jackwener)
- MINOR: update cargo.lock and rust-version for datafusion-cli #3212 (kmitchener)
- fix issue with now() returning same value across statements #3210 (kmitchener)
- Add support for inline column alias in CREATE VIEW #3209 [sql] (DaltonModlin)
- Add SQL query planner support for `DISTRIBUTE BY` #3208 [sql] (andygrove)
- minor: remove test code that's in the arrow library now #3206 (kmitchener)
- Use .get() to avoid panic #3201 [sql] (jklamer)
- [Minor] Reduce code duplication creating ScalarValue::List #3197 [sql] (alamb)
- Clean up CI workflows by removing "matrix" strategy, simplifying names #3196 (alamb)
- optimizer: add framework for the rule of pre-add cast to the literal in comparison binary #3185 (liukun4515)
- Fix clippy #3182 (alamb)
- MINOR: Add notes on writing release blog posts #3179 (andygrove)
- add min/max for time #3178 (waitingkuo)
- Recursively apply remove filter rule if filter is a true scalar value #3175 (byteink)
- Update `ahash` requirement from 0.7 to 0.8 #3161 [sql] (alamb)
- Support number of centroids in approx_percentile_cont #3146 (Ted-Jiang)
- Introduce `\i` command to execute from a file #3136 (turbo1912)
- impl binary ops between binary arrays and scalars #3124 (ozgrakkurt)