Only GC when needed. Reduce allocs from mechanics. Add PrecompileTools workload. (#419)
* Reduce overhead in hot benchmark loop
- Make `Benchmark` parametric (`Benchmark{F,Q}`) so `samplefunc` and
  `quote_vals` have concrete types, eliminating dynamic dispatch and
  boxing on every sample call
- Skip `gcscrub()` before `gctrial`/`gcsample` when the previous
  sample (or warmup) reported zero allocations, since there is nothing
  to collect
- Pre-allocate `Trial` vectors with `sizehint!` based on the first
  real sample time, avoiding repeated heap growth and GC churn from
  the harness itself during the run
- Add a test asserting the harness itself reports zero allocations for
  a zero-allocation benchmark
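
The GC-skip and pre-allocation ideas above can be sketched roughly as follows. This is an illustration only, not the code from this change: `maybe_scrub`, `reserve_samples`, and their arguments are hypothetical names, and the capacity estimate (time budget divided by the first sample time) is an assumption about how such a hint might be derived.

```julia
# Hypothetical helper: run a full collection only if the previous sample
# allocated something. Returns true when a collection was triggered.
function maybe_scrub(prev_allocs::Integer)
    prev_allocs == 0 && return false
    GC.gc()
    return true
end

# Hypothetical helper: reserve capacity once, using an estimate derived from
# the first sample time. Assumes first_sample_ns > 0.
function reserve_samples(first_sample_ns::Float64, budget_ns::Float64, max_samples::Int)
    estimate = clamp(floor(Int, budget_ns / first_sample_ns), 1, max_samples)
    times = Float64[]
    sizehint!(times, estimate)  # later push! calls should not need to regrow
    return times, estimate
end

times, n = reserve_samples(100.0, 1000.0, 50)
for _ in 1:n
    push!(times, 100.0)
end
```

With a 1000 ns budget and a 100 ns first sample, the sketch reserves room for 10 entries before the loop starts.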
* add a PrecompileTools workload
* Use function barriers and reduce allocations in hot loops
Revert Benchmark to non-parametric (easier to pass around) and use
function barriers (_run_inner, _lineartrial_inner) so Julia specializes
the sampling loops on concrete samplefunc/quote_vals types without
parameterizing the struct.

- Skip GC scrub when warmup/sample reported zero allocations
- Temporarily set evals=1 for warmup instead of allocating new Parameters
- Use explicit push!(trial, s[1], s[2], s[3], s[4]) instead of
  s[1:(end-1)]... to avoid intermediate tuple allocation
- resize! instead of slice-copy in _lineartrial_inner

Reduces steady-state allocations from ~102K to ~96 per benchmark run.
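
For readers unfamiliar with the function-barrier pattern mentioned above, here is a minimal sketch. `Holder`, `_sum_inner`, and `run_sum` are hypothetical stand-ins and do not mirror the actual `_run_inner` or `_lineartrial_inner` signatures.

```julia
# Hypothetical stand-in for a struct that stores a callable in an
# abstractly typed field (as a non-parametric struct would).
struct Holder
    f::Function
end

# Barrier: the `where {F}` annotation makes Julia compile this loop for the
# concrete type of `f`, so calls inside the loop are statically dispatched.
function _sum_inner(f::F, n::Int) where {F}
    acc = 0
    for i in 1:n
        acc += f(i)
    end
    return acc
end

# The outer function pays for one dynamic dispatch when crossing the
# barrier; the hot loop then runs specialized code.
run_sum(h::Holder, n::Int) = _sum_inner(h.f, n)

println(run_sum(Holder(x -> 2x), 10))  # prints 110
```

The trade-off is the one the message describes: the struct stays easy to pass around, and only the inner loop is specialized.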
* add concurrency cancel in progress
* don't capture the returned result if not exposed
* Use Ref{SampleResult} to avoid heap-allocating return tuples
The samplefunc is stored as `Function` (abstract type), so every
call returned a heap-allocated tuple through dynamic dispatch.
With ~10K samples per benchmark, this was ~10K allocs per run.
Now the caller passes a pre-allocated Ref{SampleResult} that the
samplefunc writes into, reducing per-benchmark allocs from ~10K
to ~12 (all structural: Parameters copy, Trial, vectors, etc).
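
A rough sketch of the write-into-a-`Ref` pattern described above. The fields of `SampleResult` and the `sample!` function are assumptions made for illustration; the real definitions in this change may differ.

```julia
# Hypothetical result type; the actual fields are not shown in this message.
struct SampleResult
    time::Float64
    allocs::Int
end

# The sample function writes into a caller-provided Ref instead of returning
# a tuple, so a dynamically dispatched call has no return value to box.
function sample!(out::Base.RefValue{SampleResult}, xs::Vector{Float64})
    t0 = time_ns()
    total = sum(abs2, xs)  # stand-in for the benchmarked work
    t1 = time_ns()
    out[] = SampleResult(Float64(t1 - t0), 0)
    return nothing
end

# The caller allocates the Ref once and reuses it for every sample.
out = Ref(SampleResult(0.0, 0))
xs = rand(100)
for _ in 1:3
    sample!(out, xs)
end
```

Because `SampleResult` is an immutable struct of plain bits, storing it into the existing `Ref` overwrites the slot in place rather than allocating a new object per sample.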
* Update .gitignore
* format
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* don't specialize show methods on IO type
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>