Skip to content

[Variant]: Implement DataType::Dictionary support for cast_to_variant kernel#8173

Merged
alamb merged 2 commits intoapache:mainfrom
liamzwbao:issue-8062-variant-dict
Aug 20, 2025
Merged

[Variant]: Implement DataType::Dictionary support for cast_to_variant kernel#8173
alamb merged 2 commits intoapache:mainfrom
liamzwbao:issue-8062-variant-dict

Conversation

@liamzwbao
Copy link
Copy Markdown
Contributor

@liamzwbao liamzwbao commented Aug 20, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Implement DataType::Dictionary in cast_to_variant

Are these changes tested?

Yes

Are there any user-facing changes?

New cast type supported

@liamzwbao liamzwbao marked this pull request as ready for review August 20, 2025 00:44
Comment on lines +1864 to +1879
fn test_cast_to_variant_dictionary_with_nulls() {
// Test dictionary with null values in the values array
let values = StringArray::from(vec![Some("a"), None, Some("c")]);
let keys = Int8Array::from(vec![Some(0), Some(1), Some(2), Some(0)]);
let dict_array = DictionaryArray::<Int8Type>::try_new(keys, Arc::new(values)).unwrap();

run_test(
Arc::new(dict_array),
vec![
Some(Variant::from("a")),
None, // key 1 points to null value
Some(Variant::from("c")),
Some(Variant::from("a")),
],
);
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only contains the value. What about the key in the Dictionary?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may misunderstand the behavior. There's no corresponding datatype in Variant so what I did is to unfold the dictionary encoding and return the logic array it represent.

Do you mean we need to represent the same internal structure of the dictionary encoding in variant?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification — that makes sense to me @liamzwbao 👍

@github-actions github-actions bot added the parquet-variant parquet-variant* crates label Aug 20, 2025
Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @liamzwbao and @Weijun-H -- this makes sense to me

@Weijun-H let me know if the answer to your question makes sense

@alamb alamb merged commit 62770b6 into apache:main Aug 20, 2025
12 checks passed
@liamzwbao liamzwbao deleted the issue-8062-variant-dict branch August 20, 2025 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

parquet-variant parquet-variant* crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Variant]: Implement DataType::Dictionary support for cast_to_variant kernel

3 participants