Commit 2c14a75
perf(overflow): Speed up overflow checks for small integers (#303)
### Rationale for this change
We can check the bounds of the arguments to avoid an int64 division. The
`Mul` and `Mul64` functions perform better on the systems I tested.
This is an older machine:
```
goos: linux
goarch: amd64
pkg: github.com/apache/arrow-go/v18/internal/utils
cpu: Intel(R) Core(TM) i5-7Y57 CPU @ 1.20GHz
│ bench_amd64_old.txt │ bench_amd64_new.txt │
│ sec/op │ sec/op vs base │
Add/int/8192_+_8192-4 1.990n ± 14% 2.001n ± 18% ~ (p=1.000 n=6)
Add/int/MaxInt16_+_1-4 2.248n ± 1% 2.271n ± 4% +1.07% (p=0.035 n=6)
Add/int/MaxInt16_+_5-4 2.215n ± 2% 2.270n ± 3% +2.48% (p=0.013 n=6)
Add/int/MaxInt16_+_MaxInt16-4 2.293n ± 4% 2.331n ± 9% ~ (p=0.195 n=6)
Add/int/MaxInt32_+_1-4 2.339n ± 16% 2.345n ± 67% ~ (p=0.937 n=6)
Add/int/MaxInt32_+_5-4 2.363n ± 4% 2.330n ± 2% ~ (p=0.331 n=6)
Add/int/MaxInt32_+_MaxInt32-4 2.402n ± 23% 2.343n ± 3% ~ (p=0.310 n=6)
Add/int/MaxInt_+_MaxInt-4 2.334n ± 3% 2.364n ± 2% ~ (p=0.180 n=6)
Mul/int/8192_×_8192-4 14.335n ± 3% 3.578n ± 5% -75.04% (p=0.002 n=6)
Mul/int/MaxInt16_×_1-4 14.015n ± 90% 3.718n ± 20% -73.47% (p=0.002 n=6)
Mul/int/MaxInt16_×_5-4 14.480n ± 25% 3.647n ± 3% -74.82% (p=0.002 n=6)
Mul/int/MaxInt16_×_MaxInt16-4 14.450n ± 30% 3.627n ± 35% -74.90% (p=0.002 n=6)
Mul/int/MaxInt32_×_1-4 14.725n ± 2% 3.597n ± 12% -75.57% (p=0.002 n=6)
Mul/int/MaxInt32_×_5-4 17.485n ± 31% 3.667n ± 5% -79.03% (p=0.002 n=6)
Mul/int/MaxInt32_×_MaxInt32-4 14.385n ± 31% 5.704n ± 48% -60.35% (p=0.002 n=6)
Mul/int/MaxInt_×_MaxInt-4 14.34n ± 3% 15.68n ± 4% +9.34% (p=0.002 n=6)
Mul64/int64/8192_×_8192-4 14.495n ± 6% 3.730n ± 4% -74.27% (p=0.002 n=6)
Mul64/int64/MaxInt16_×_1-4 14.300n ± 3% 3.612n ± 1% -74.74% (p=0.002 n=6)
Mul64/int64/MaxInt16_×_5-4 14.020n ± 20% 3.657n ± 3% -73.91% (p=0.002 n=6)
Mul64/int64/MaxInt16_×_MaxInt16-4 14.110n ± 2% 3.611n ± 12% -74.41% (p=0.002 n=6)
Mul64/int64/MaxInt32_×_1-4 14.330n ± 2% 3.609n ± 4% -74.82% (p=0.002 n=6)
Mul64/int64/MaxInt32_×_5-4 14.250n ± 14% 3.670n ± 2% -74.25% (p=0.002 n=6)
Mul64/int64/MaxInt32_×_MaxInt32-4 14.070n ± 2% 3.571n ± 3% -74.62% (p=0.002 n=6)
Mul64/int64/MaxInt_×_MaxInt-4 14.17n ± 2% 15.77n ± 3% +11.25% (p=0.002 n=6)
geomean 7.807n 3.583n -54.10%
```
The following is inside a `docker.io/i386/ubuntu` container on that same
machine. I don't have any 32-bit hardware.
```
goos: linux
goarch: 386
pkg: github.com/apache/arrow-go/v18/internal/utils
cpu: Intel(R) Core(TM) i5-7Y57 CPU @ 1.20GHz
│ bench_386_old.txt │ bench_386_new.txt │
│ sec/op │ sec/op vs base │
Add/int/8192_+_8192-4 3.113n ± 22% 2.848n ± 17% ~ (p=0.180 n=6)
Add/int/MaxInt16_+_1-4 3.344n ± 4% 3.294n ± 18% ~ (p=0.818 n=6)
Add/int/MaxInt16_+_5-4 3.420n ± 7% 3.411n ± 5% ~ (p=0.937 n=6)
Add/int/MaxInt16_+_MaxInt16-4 3.360n ± 3% 3.402n ± 30% ~ (p=0.240 n=6)
Add/int/MaxInt_+_1-4 3.329n ± 2% 3.433n ± 3% +3.09% (p=0.039 n=6)
Add/int/MaxInt_+_5-4 3.452n ± 5% 3.436n ± 2% ~ (p=0.937 n=6)
Add/int/MaxInt_+_MaxInt-4 3.353n ± 6% 3.454n ± 3% ~ (p=0.132 n=6)
Add/int/MaxInt_+_MaxInt#01-4 3.346n ± 13% 3.488n ± 3% ~ (p=0.132 n=6)
Mul/int/8192_×_8192-4 11.250n ± 2% 6.238n ± 6% -44.55% (p=0.002 n=6)
Mul/int/MaxInt16_×_1-4 11.565n ± 4% 6.261n ± 1% -45.86% (p=0.002 n=6)
Mul/int/MaxInt16_×_5-4 11.300n ± 3% 6.505n ± 3% -42.44% (p=0.002 n=6)
Mul/int/MaxInt16_×_MaxInt16-4 11.370n ± 4% 6.394n ± 40% -43.76% (p=0.002 n=6)
Mul/int/MaxInt_×_1-4 11.245n ± 3% 5.798n ± 2% -48.44% (p=0.002 n=6)
Mul/int/MaxInt_×_5-4 11.210n ± 18% 9.941n ± 2% -11.32% (p=0.002 n=6)
Mul/int/MaxInt_×_MaxInt-4 10.975n ± 4% 9.895n ± 2% -9.84% (p=0.002 n=6)
Mul/int/MaxInt_×_MaxInt#01-4 11.40n ± 18% 10.17n ± 2% -10.83% (p=0.002 n=6)
Mul64/int64/8192_×_8192-4 21.74n ± 70% 23.75n ± 2% ~ (p=0.065 n=6)
Mul64/int64/MaxInt16_×_1-4 22.37n ± 28% 23.87n ± 12% ~ (p=0.065 n=6)
Mul64/int64/MaxInt16_×_5-4 22.09n ± 2% 23.96n ± 23% +8.44% (p=0.002 n=6)
Mul64/int64/MaxInt16_×_MaxInt16-4 21.06n ± 3% 24.27n ± 3% +15.24% (p=0.002 n=6)
Mul64/int64/MaxInt_×_1-4 21.42n ± 8% 23.91n ± 64% +11.62% (p=0.002 n=6)
Mul64/int64/MaxInt_×_5-4 29.87n ± 4% 24.16n ± 22% -19.09% (p=0.009 n=6)
Mul64/int64/MaxInt_×_MaxInt-4 28.10n ± 2% 24.40n ± 2% -13.17% (p=0.002 n=6)
Mul64/int64/9223372036854775807_×_9223372036854775807-4 23.22n ± 6% 31.90n ± 35% +37.38% (p=0.002 n=6)
geomean 9.609n 8.523n -11.30%
```
### What changes are included in this PR?
A new generic `Add` function in `internal/utils` returns the sum of any
two integers and whether or not it overflowed.
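
As an illustration of how a generic checked addition can work, here is a minimal sketch. It is not the code from this commit: the `Signed` constraint, the name `addChecked`, and the convention that the boolean is `true` when the sum did *not* overflow are all assumptions made for this example.

```go
package main

import "fmt"

// Signed is a hypothetical constraint covering the built-in signed integer types.
type Signed interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64
}

// addChecked (hypothetical name) returns a+b and ok=true when the sum fits in T.
// On overflow it returns the wrapped sum and ok=false.
func addChecked[T Signed](a, b T) (sum T, ok bool) {
	// Signed addition wraps in Go, so the sum is always well defined.
	sum = a + b
	// Without overflow, the sum moves above a exactly when b is positive.
	ok = (sum > a) == (b > 0)
	return sum, ok
}

func main() {
	fmt.Println(addChecked(int32(8192), int32(8192))) // 16384 true
	fmt.Println(addChecked(int8(127), int8(1)))       // -128 false
}
```

The check needs only one comparison pair and no widening, which is why it works unchanged for every signed width.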
New `Mul` and `Mul64` functions in `internal/utils` return the product
of two `int` or `int64`, respectively, and whether or not it overflowed.
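In the benchmarks above, `Mul` and `Mul64` are about 60% to 79% faster on amd64 whenever both operands fit in 32 bits, and about 9% to 11% slower for `MaxInt × MaxInt`. On 386, `Mul` is roughly 10% to 48% faster, while the `Mul64` results are mixed.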
### Are these changes tested?
Yes.
### Are there any user-facing changes?
No effect on the API. This drops one dependency.
---------
Signed-off-by: Chris Bandy <bandy.chris@gmail.com>

1 parent 0f0d667 commit 2c14a75
11 files changed
Lines changed: 441 additions & 16 deletions
File tree
- arrow/compute/internal/kernels
- internal/utils
- parquet
- file
- internal/encoding
- metadata