Support arithmetic scalar operation with DictionaryArray#5151
Support arithmetic scalar operation with DictionaryArray#5151liukun4515 merged 3 commits intoapache:masterfrom
Conversation
2a33923 to
8b7fcc7
Compare
|
Thanks @viirya -- it is great to see someone else using Dictionary arrays ❤️ Looks like there are some compilation errors on this PR. Marking as draft until those are resolved. |
|
Thanks @alamb . Yea, I'm trying to fix them locally. |
e8669d4 to
699d7c2
Compare
3a0d313 to
67cd4a7
Compare
There was a problem hiding this comment.
Looks good to me -- it would be great to eventually be able to write sql level tests for this feature, but there is no good way to make dictionary arrays yet from SQL (but I am hoping to get #5016 which would allow it).
Thank you for the contribution @viirya
cc @liukun4515 as this also impacts Decimals (though I don't expect any behavior change)
| (DataType::Dictionary(_, value_type), _) => { | ||
| is_numeric(value_type) && is_numeric(rhs_type) | ||
| } | ||
| (_, DataType::Dictionary(_, value_type)) => { |
There was a problem hiding this comment.
I wonder if this need to handle the case when both lhs and rhs are Dictionaries 🤔
There was a problem hiding this comment.
Actually I'm wondering what Dictionary for ScalaValue(rhs here) means although I have seen DataFusion has it.
It sounds reasonable to add the following:
(DataType::Dictionary(_, lhs_value_type), DataType::Dictionary(_, rhs_value_type2)) => {
is_numeric(lhs_value_type) && is_numeric(rhs_value_type2)
}
|
|
||
| let mut dict_builder = PrimitiveDictionaryBuilder::<Int8Type, Int32Type>::new(); | ||
|
|
||
| dict_builder.append(2)?; |
| ) | ||
| } | ||
| Operator::Modulo => { | ||
| // todo: change to binary_primitive_array_op_dyn_scalar! once modulo is implemented |
There was a problem hiding this comment.
we can remove binary_primitive_array_op_scalar once this is done?
There was a problem hiding this comment.
yea, once modulo is ready to do.
Thank you @alamb. Yea, we have internal e2e tests that use dictionary arrays but seems I don't find similar stuff on DataFusion so far. We can definitely expand test coverage for this in the future. |
…e coercion pattern
|
BTW, I think array op array is also not supported yet on dictionary arrays. I will work on it after this. |
| macro_rules! binary_primitive_array_op_dyn_scalar { | ||
| ($LEFT:expr, $RIGHT:expr, $OP:ident) => {{ | ||
| // unwrap underlying (non dictionary) value | ||
| let right = unwrap_dict_value($RIGHT); |
There was a problem hiding this comment.
It's interesting that we have "scalar dictionary":
/// Dictionary type: index type and value
Dictionary(Box<DataType>, Box<ScalarValue>),wonder how useful this is ...
| ) | ||
| } | ||
| Operator::Modulo => { | ||
| // todo: change to binary_primitive_array_op_dyn_scalar! once modulo is implemented |
There was a problem hiding this comment.
we can remove binary_primitive_array_op_scalar once this is done?
|
Benchmark runs are scheduled for baseline = 224c682 and contender = c759865. c759865 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #5150.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?