Commit 2c0fc70
[SPARK-51891][SS] Squeeze the protocol of ListState GET / PUT / APPENDLIST for transformWithState in PySpark
### What changes were proposed in this pull request?
This PR proposes to squeeze the protocol of ListState GET / PUT / APPENDLIST for transformWithState in PySpark, which helps considerably when dealing with small lists in ListState.
Here are the changes:
* ListState.get() no longer requires an additional request to notice that there is no further data to read.
  * We inline the data into the proto message, which makes it easy to determine whether the iterator has been fully consumed.
* ListState.put() / ListState.appendList() do not require an additional request to send the data separately.
  * We inline the data into the proto message if the list we pass is small enough (the threshold is currently "magically" set to 100 elements and needs further investigation).
  * If the list has more than 100 elements, we fall back to the "old" Arrow send (rather than a custom protocol). This is because a pickled Python Row contains the schema information as a string, which is larger than we anticipated, so at some point Arrow becomes more efficient.

NOTE: 100 is a sort of "magic number", and we will need to refine it with more benchmarking.
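
As a rough illustration of the PUT / APPENDLIST decision described above, here is a minimal sketch in plain Python. It is not the code from this PR: `PutRequest`, `build_put_request`, `INLINE_THRESHOLD`, and the field names are hypothetical, and the exact boundary (inline up to and including 100 elements) is an assumption read from the "over 100" wording.

```python
import pickle
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed threshold, mirroring the "100 elements" magic number in the description.
INLINE_THRESHOLD = 100


@dataclass
class PutRequest:
    """Hypothetical stand-in for the proto message (not the real schema)."""
    state_name: str
    inline_rows: List[bytes] = field(default_factory=list)
    fetch_with_arrow: bool = False


def build_put_request(state_name: str, rows: List[tuple]) -> Tuple[PutRequest, List[tuple]]:
    """Return the request plus any rows that still have to travel via Arrow."""
    if len(rows) > INLINE_THRESHOLD:
        # Large list: only flag the request; the rows follow in a separate Arrow send.
        return PutRequest(state_name, fetch_with_arrow=True), list(rows)
    # Small list: pickle each row into the message itself, so no second send is needed.
    inlined = [pickle.dumps(row) for row in rows]
    return PutRequest(state_name, inline_rows=inlined), []


small_req, small_rest = build_put_request("events", [(i,) for i in range(3)])
large_req, large_rest = build_put_request("events", [(i,) for i in range(101)])
print(len(small_req.inline_rows), small_req.fetch_with_arrow, len(small_rest))
print(len(large_req.inline_rows), large_req.fetch_with_arrow, len(large_rest))
```

In this sketch a small list costs a single message, while a large list costs the flagged message plus the Arrow transfer, which is the trade-off the threshold is meant to balance.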
### Why are the changes needed?
To optimize further on ListState operations.
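
To picture the saving on the read path, the following sketch models a ListState.get() response that carries both the rows and a flag saying whether more data remains. `GetResponse`, `FakeServer`, `read_all`, and the `has_more` field are hypothetical names chosen for illustration, not the actual proto fields or classes in this PR.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class GetResponse:
    """Hypothetical response: inlined rows plus an explicit end-of-data flag."""
    rows: List[int]
    has_more: bool


class FakeServer:
    """Toy state server that counts how many requests it receives."""

    def __init__(self, data: Iterable[int], batch_size: int) -> None:
        self.data = list(data)
        self.batch_size = batch_size
        self.requests = 0

    def get(self, offset: int) -> GetResponse:
        self.requests += 1
        chunk = self.data[offset:offset + self.batch_size]
        # The flag tells the client whether another request is worthwhile.
        has_more = offset + len(chunk) < len(self.data)
        return GetResponse(chunk, has_more)


def read_all(server: FakeServer) -> Iterator[int]:
    """Yield every row, stopping as soon as the server reports no more data."""
    offset = 0
    while True:
        resp = server.get(offset)
        yield from resp.rows
        offset += len(resp.rows)
        if not resp.has_more:
            return


server = FakeServer(range(5), batch_size=10)
values = list(read_all(server))
print(values, server.requests)
```

With the flag in the response, a list that fits in one batch is read with a single request, instead of needing a follow-up request just to learn that the iterator is exhausted.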
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
New UT.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #50689 from HeartSaVioR/SPARK-51891.
Authored-by: Jungtaek Lim <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
1 parent 83db398 commit 2c0fc70
File tree
9 files changed (+553, -115 lines changed)
- python/pyspark/sql
  - streaming
    - proto
  - tests/pandas
    - helper
- sql/core/src
  - main
    - protobuf/org/apache/spark/sql/execution/streaming
    - scala/org/apache/spark/sql/execution/python/streaming
  - test/scala/org/apache/spark/sql/execution/python/streaming