Commit 5a94422
perf(parquet/compress): set zstd pool encoder concurrency to 1 (#717)
The zstdEncoderPool is used exclusively by EncodeAll(), which is a
single-shot synchronous call that uses exactly one inner block encoder.
However, zstd.NewWriter defaults its concurrency to runtime.GOMAXPROCS,
pre-allocating that many inner block encoders, each with its own ~1 MiB
history buffer (ensureHist). On a 10-core machine, each pooled Encoder
allocates 10 inner encoders when only 1 is ever used by EncodeAll.
With WithEncoderConcurrency(1), each pooled encoder creates a single
inner encoder, matching actual usage. The streaming Write/Close path is
unaffected — it does not use the pool.
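
The sizing argument above can be illustrated with a toy model. This is only a sketch: `toyEncoder`, `newToyEncoder`, `newPool`, and `histSize` are hypothetical names invented for illustration and do not reflect the internals of the zstd package. In the real code the equivalent change is passing `WithEncoderConcurrency(1)` to `zstd.NewWriter` when the pool constructs an encoder.

```go
package main

import (
	"fmt"
	"sync"
)

// histSize stands in for the ~1 MiB history buffer per inner encoder
// mentioned in the commit message (illustrative value only).
const histSize = 1 << 20

// toyEncoder is a hypothetical model, NOT the real zstd.Encoder.
// It eagerly allocates one history buffer per unit of concurrency.
type toyEncoder struct {
	inner [][]byte
}

func newToyEncoder(concurrency int) *toyEncoder {
	e := &toyEncoder{inner: make([][]byte, concurrency)}
	for i := range e.inner {
		e.inner[i] = make([]byte, histSize)
	}
	return e
}

// histBytes reports the total history memory held by this encoder.
func (e *toyEncoder) histBytes() int {
	n := 0
	for _, h := range e.inner {
		n += len(h)
	}
	return n
}

// encodeAll models a single-shot synchronous call: it only ever
// touches the first inner buffer, whatever the concurrency setting.
func (e *toyEncoder) encodeAll(src []byte) int {
	return copy(e.inner[0], src)
}

// newPool builds a pool whose encoders are sized for the given concurrency.
func newPool(concurrency int) *sync.Pool {
	return &sync.Pool{New: func() any { return newToyEncoder(concurrency) }}
}

func main() {
	for _, c := range []int{10, 1} {
		p := newPool(c)
		enc := p.Get().(*toyEncoder)
		enc.encodeAll([]byte("hello"))
		fmt.Printf("concurrency=%d: %d MiB held, 1 buffer used\n", c, enc.histBytes()>>20)
		p.Put(enc)
	}
}
```

In this model a pooled encoder built with concurrency 10 holds ten buffers while the single-shot call touches only one, which is the mismatch the commit removes.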
Benchmark results (Apple M4 Pro, arm64, 256 KiB semi-random data):
BenchmarkZstdPooledEncodeAll/Default-14 11000 B/op 5250 MB/s
BenchmarkZstdPooledEncodeAll/Concurrency1-14 810 B/op 5500 MB/s
About 14x less memory per operation and ~5% higher throughput from
reduced GC pressure.
In a parquet write workload (1 GiB Arrow data, ZSTD level 3), this
reduced ensureHist allocations from 22 GiB to 7 GiB and madvise kernel
CPU from 4.6s to 2.3s (10% wall-time improvement).
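
As a quick sanity check on the figures quoted above, the ratios can be recomputed directly from the reported numbers. This is a small arithmetic sketch; the helper name `ratio` is made up for illustration, and the inputs are copied from the benchmark and workload results in this message.

```go
package main

import "fmt"

// ratio returns a divided by b.
func ratio(a, b float64) float64 { return a / b }

func main() {
	// Microbenchmark: B/op and MB/s, Default vs Concurrency1.
	fmt.Printf("B/op:        %.1fx lower\n", ratio(11000, 810))
	fmt.Printf("throughput:  %.1f%% higher\n", (ratio(5500, 5250)-1)*100)

	// Parquet write workload: ensureHist allocations (GiB) and madvise CPU (s).
	fmt.Printf("ensureHist:  %.1fx lower\n", ratio(22, 7))
	fmt.Printf("madvise CPU: %.1fx lower\n", ratio(4.6, 2.3))
}
```

The results (13.6x, 4.8%, 3.1x, 2.0x) agree with the rounded "14x" and "~5%" claims.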
### Rationale for this change
High memory churn during parquet encoding
### What changes are included in this PR?
A change to the zstd encoder concurrency and a benchmark to reproduce the results.
### Are these changes tested?
Yes
### Are there any user-facing changes?
No

1 parent 7ae2e33 commit 5a94422
2 files changed
Lines changed: 69 additions & 1 deletion
File 1 of 2: 68 additions (new lines 23, 27, and 184-249); diff content not captured.
File 2 of 2: 1 addition and 1 deletion (line 67); diff content not captured.