Commit f5709e7
## Which issue does this PR close?
- This PR is part of the [Utf8View
support](apache#10918) epic. It
adds `Utf8View` support in the Spark-compat layer.
## Rationale for this change
Our internal project supports only `Utf8View` _(because of
design constraints)_, while the current implementation of `SparkConcat`
supports only `Utf8`. The `SparkConcat` function should accept
`Utf8View` and mixed string types in line with the main DataFusion
concat. This PR adds that support and follows the same patterns as
[DataFusion’s
concat](https://github.com/apache/datafusion/blob/main/datafusion/functions/src/string/concat.rs).
This prevents errors such as:
> The type of Utf8 AND Utf8View of like physical should be same.
> This issue was likely caused by a bug in DataFusion's code. Please
> help us to resolve this by filing a bug report in our issue tracker:
> https://github.com/apache/datafusion/issues

which can be triggered by a query such as:
```sql
select i_item_sk,
item_info
from
(select i_item_sk,
CONCAT('Item: ', i_item_desc) as item_info
from item) sub
where item_info LIKE 'Item: Electronic%'
order by 1;
```
## What changes are included in this PR?
- Extend the type signature to accept `Utf8View` in addition to `Utf8`
and `LargeUtf8` via `TypeSignature::Variadic(vec![Utf8View, Utf8,
LargeUtf8])`, matching DataFusion’s concat.
- In `return_field_from_args`, compute the result type with precedence
`Utf8View` > `LargeUtf8` > `Utf8`.
- In `spark_concat`, handle `Utf8View` and `LargeUtf8` in the scalar paths
(zero-argument and all-NULL).
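The precedence rule in the list above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code from this PR: `StringType` is a hypothetical stand-in for the three Arrow string `DataType` variants, `concat_result_type` is a hypothetical helper name, and falling back to `Utf8` for an empty argument list is an assumption made for the sketch.

```rust
/// Hypothetical stand-in for the three Arrow string `DataType` variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StringType {
    Utf8,
    LargeUtf8,
    Utf8View,
}

/// Picks the result type with precedence Utf8View > LargeUtf8 > Utf8.
/// Assumption: an empty argument list falls back to Utf8.
fn concat_result_type(arg_types: &[StringType]) -> StringType {
    let mut result = StringType::Utf8;
    for &arg_type in arg_types {
        match arg_type {
            // Utf8View has the highest precedence, so we can stop early.
            StringType::Utf8View => return StringType::Utf8View,
            // LargeUtf8 beats Utf8 but may still be overridden by Utf8View.
            StringType::LargeUtf8 => result = StringType::LargeUtf8,
            // Utf8 never upgrades the current result.
            StringType::Utf8 => {}
        }
    }
    result
}
```

With this rule, mixing a `Utf8` literal with a `Utf8View` column yields `Utf8View`, which is what lets a downstream `LIKE` compare operands of the same type without an extra cast.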
## Are these changes tested?
Yes.
- Unit tests: `cargo test --package datafusion-spark
function::string::concat::tests`, including `test_concat_utf8view`.
- Sqllogictest: `spark/string/concat.slt` includes a “**Utf8View: no
extra CAST in plan**” case that uses `EXPLAIN` and a temporary table to
ensure no extra CASTs are added when using `arrow_cast(..., 'Utf8View')`
with table columns.
## Are there any user-facing changes?
- **API:** `SparkConcat`’s signature is extended to include `Utf8View` in
the variadic list. No breaking changes.
_used gpt to rephrase some of these points_
1 parent c560bee commit f5709e7
2 files changed
Lines changed: 57 additions & 5 deletions
Lines changed: 24 additions & 0 deletions