Scalar arithmetic should return error when overflows. #5811
alamb merged 4 commits into apache:main
alamb
left a comment
Thank you @zhzy0077 -- this looks like a great improvement to me
Can you also please add a test that ensures the overflow behavior is consistent with the arrow compute kernels (I can't remember if they error on overflow). Perhaps following the model of:
    }

    macro_rules! primitive_checked_op {
        ($LEFT:expr, $RIGHT:expr, $OPERATION:tt, Float64) => {
Thank you. Added a few tests in
@tustvold can you give this one a review (mostly for consistency with the semantics of arrow-rs)?
So this is not consistent with the arrow kernels that DataFusion makes use of, in particular, DataFusion is currently using the unchecked arithmetic kernels, which do not return an error on overflow. This PR will therefore introduce inconsistency between ScalarValue arithmetic, and any arithmetic involving arrays.

The major reason I'm a little apprehensive about changing this is that the checked kernels are at least an order of magnitude slower in the presence of nulls, as LLVM can't vectorise them correctly. This will likely remain the case until SIMD intrinsics are finally stabilised, which will hopefully be sometime this decade...

I definitely think whatever semantics we settle on should be consistent and well documented, but I don't have a strong opinion what it should be. However, I do feel that changing the current semantics warrants some communication due to the major performance regression it would entail.
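To make the trade-off above concrete, here is a minimal sketch using only the Rust standard library. It is not the arrow-rs implementation: the function names, the `&[bool]` validity mask, and the choice of `0` as the placeholder for null slots are all assumptions made for illustration. One plausible reading of the slowdown is that a checked kernel has to consult validity per element, because a null slot may hold an arbitrary value that must not trigger a spurious overflow error.

```rust
/// Unchecked kernel (hypothetical): every slot is added with wrapping
/// arithmetic, null or not. The loop body has no data-dependent branch,
/// which is the shape compilers tend to auto-vectorise well.
fn add_wrapping_kernel(a: &[i32], b: &[i32]) -> Vec<i32> {
    a.iter().zip(b).map(|(x, y)| x.wrapping_add(*y)).collect()
}

/// Checked kernel (hypothetical): null slots are skipped so that garbage
/// values behind a null cannot raise an error, and every valid slot is
/// tested for overflow. Assumes `a`, `b`, and `valid` have equal length.
fn add_checked_kernel(a: &[i32], b: &[i32], valid: &[bool]) -> Result<Vec<i32>, String> {
    let mut out = Vec::with_capacity(a.len());
    for ((x, y), is_valid) in a.iter().zip(b).zip(valid) {
        if *is_valid {
            let sum = x
                .checked_add(*y)
                .ok_or_else(|| format!("overflow computing {x} + {y}"))?;
            out.push(sum);
        } else {
            // Placeholder for a null slot; the value is never observed.
            out.push(0);
        }
    }
    Ok(out)
}
```

With inputs `[1, i32::MAX]` and `[2, 1]`, the wrapping version silently produces a wrapped second element, while the checked version only succeeds if the second slot is marked null.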
Thank you for the detailed explanation @tustvold. What do you think about introducing a separate set of operations, say "add_checked", "mul_checked", etc.?
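A rough sketch of what such a split could look like, keeping the existing wrapping behaviour and adding an opt-in checked variant next to it. The `Scalar` enum below is a hypothetical stand-in and not DataFusion's `ScalarValue`; the method bodies and error strings are assumptions for illustration only.

```rust
/// Hypothetical stand-in for a scalar value; `None` models SQL NULL.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Int32(Option<i32>),
    Int64(Option<i64>),
}

impl Scalar {
    /// Default addition: overflow wraps silently, matching unchecked kernels.
    fn add(&self, rhs: &Scalar) -> Result<Scalar, String> {
        match (self, rhs) {
            (Scalar::Int32(a), Scalar::Int32(b)) => Ok(Scalar::Int32(match (*a, *b) {
                (Some(x), Some(y)) => Some(x.wrapping_add(y)),
                _ => None, // NULL in, NULL out
            })),
            (Scalar::Int64(a), Scalar::Int64(b)) => Ok(Scalar::Int64(match (*a, *b) {
                (Some(x), Some(y)) => Some(x.wrapping_add(y)),
                _ => None,
            })),
            _ => Err("mismatched scalar types".to_string()),
        }
    }

    /// Opt-in checked addition: overflow is reported as an error.
    fn add_checked(&self, rhs: &Scalar) -> Result<Scalar, String> {
        match (self, rhs) {
            (Scalar::Int32(a), Scalar::Int32(b)) => match (*a, *b) {
                (Some(x), Some(y)) => x
                    .checked_add(y)
                    .map(|v| Scalar::Int32(Some(v)))
                    .ok_or_else(|| "Int32 addition overflowed".to_string()),
                _ => Ok(Scalar::Int32(None)),
            },
            (Scalar::Int64(a), Scalar::Int64(b)) => match (*a, *b) {
                (Some(x), Some(y)) => x
                    .checked_add(y)
                    .map(|v| Scalar::Int64(Some(v)))
                    .ok_or_else(|| "Int64 addition overflowed".to_string()),
                _ => Ok(Scalar::Int64(None)),
            },
            _ => Err("mismatched scalar types".to_string()),
        }
    }
}
```

Keeping both variants lets callers that need array-consistent semantics stay on `add`, while callers that prefer an explicit failure can choose `add_checked`.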
Test error:
It looks like the methods called in unit tests should also be
Unit test failure fixed.
    }

    // Verifies that ScalarValue has the same behavior as the compute kernel when it overflows.
    fn check_scalar_add_overflow<T>(left: ScalarValue, right: ScalarValue)
* Scalar arithmetic should return error when overflows.
* Add a few tests.
* add new checked_* ops.
* fix test failure.
Which issue does this PR close?
Closes #5810.
Rationale for this change
See the reproduction in the linked issue. ScalarValue returns an error for many arithmetic failures, but not for overflow.
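For readers unfamiliar with the distinction, the standard-library behaviour behind it can be sketched as follows. The helper names are hypothetical and this is not the code from the PR; it only shows how `checked_add` returns `None` on overflow, which can then be turned into an error, while `wrapping_add` silently wraps.

```rust
/// Hypothetical helper: add two i32 values and report overflow as an
/// error instead of wrapping.
fn add_or_error(left: i32, right: i32) -> Result<i32, String> {
    left.checked_add(right)
        .ok_or_else(|| format!("overflow computing {left} + {right}"))
}

/// The silent alternative: two's-complement wrap-around on overflow.
fn add_silently(left: i32, right: i32) -> i32 {
    left.wrapping_add(right)
}
```

Here `add_or_error(i32::MAX, 1)` yields an `Err`, whereas `add_silently(i32::MAX, 1)` wraps around to `i32::MIN`.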
What changes are included in this PR?
Use checked_* and check results.
Are these changes tested?
Tests added.
Are there any user-facing changes?
No.