Commit 3af6b39
fix: eliminate per-element heap allocs in is_in kernel for binary types (#745)
## Summary
- Add `BinaryMemoTable.ExistsDirect` that inlines the hash table probe
loop, avoiding the closure in `HashTable.Lookup` that causes `val
[]byte` to escape to the heap
- Add `isInBinaryDirect` specialized kernel path that bypasses the
`visitBinary` → `VisitBitBlocksShort` closure chain by directly
iterating with `OptionalBitBlockCounter`
- Route `BinaryDataType` dispatch in `DispatchIsIn` to the new direct
path (handles both int32 and int64 offsets)
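As a rough illustration of the direct-iteration idea in the second and third bullets, the sketch below walks an Arrow-style variable-length binary layout (offsets, data, validity bitmap) in a plain loop with no per-element callback. It is not the code from this commit: `isInDirect`, `offset`, and `bitIsSet` are hypothetical names, a Go `map` stands in for `BinaryMemoTable`, a per-bit validity check stands in for `OptionalBitBlockCounter`, and null rows are assumed to yield `false` rather than following the kernel's configurable null matching behaviors.

```go
package main

// offset covers the two offset widths used by variable-length binary
// layouts: int32 (binary, utf8) and int64 (large_binary, large_utf8).
type offset interface {
	~int32 | ~int64
}

// bitIsSet reports whether bit i is set in an LSB-first bitmap.
func bitIsSet(bitmap []byte, i int) bool {
	return bitmap[i>>3]&(1<<uint(i&7)) != 0
}

// isInDirect is a hypothetical, simplified stand-in for a direct kernel
// path. It iterates rows in a plain loop instead of handing each value
// to a visitor callback. A nil validity bitmap means every row is valid.
// Assumption: null rows produce false.
func isInDirect[O offset](offsets []O, data, validity []byte, set map[string]struct{}) []bool {
	n := len(offsets) - 1
	if n < 0 {
		n = 0
	}
	out := make([]bool, n)
	for i := 0; i < n; i++ {
		if validity != nil && !bitIsSet(validity, i) {
			continue // null row: leave out[i] as false
		}
		start, end := int(offsets[i]), int(offsets[i+1])
		v := data[start:end] // a view into data, no copy
		// The map stands in for the memo table lookup.
		_, ok := set[string(v)]
		out[i] = ok
	}
	return out
}
```

Because the function is generic over the offset type, the same loop body serves both offset widths, which is roughly what routing both int32 and int64 offsets to one direct path amounts to.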
## Motivation
The `is_in` kernel for binary types allocated once per input element
because the `[]byte` value escaped to the heap through a closure chain:
1. `visitBinary` slices `rawBytes[offsets[pos]:offsets[pos+1]]` and
passes the result to a callback
2. The callback calls `BinaryMemoTable.Exists(v)`
3. `Exists` calls `lookup` which creates a closure capturing `val`
4. The closure is passed to `HashTable.Lookup`, causing escape analysis
to move `val` to the heap
Closes #736
## Benchmark (100k rows, 10-element value set)
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| ns/op | 4,133,679 | 923,565 | **4.5x faster** |
| B/op | 2,435,327 | 33,092 | **73x less memory** |
| allocs/op | 100,075 | 70 | **1,430x fewer allocs** |
All existing `TestIsInBinary` subtests pass (binary, large\_binary,
utf8, large\_utf8 × all null matching behaviors).
1 parent: 5383f8f
3 files changed
Lines changed: 165 additions & 1 deletion
File tree
- arrow/compute
- internal/kernels
- internal/hashing