Problem or motivation
Clifft has C++ Catch2 benchmarks in tests/test_benchmarks.cc and Python pytest-benchmark cases in tools/bench/, but we do not currently keep a historical record of benchmark results. This makes performance regressions harder to spot during normal maintenance and release prep.
We should add a lightweight scheduled benchmark workflow that records benchmark results over time.
Proposed solution
Add a GitHub Actions workflow that can be run manually and on a nightly schedule. The workflow should run the existing benchmark suites and store their results in a way maintainers can inspect over time.
Suggested direction:
- Add a new workflow under .github/workflows/.
- Trigger it with workflow_dispatch and a nightly schedule.
- Use ubuntu-24.04 for the first version.
- Run existing C++ and/or Python benchmarks without adding new benchmark fixtures.
- Store benchmark history outside the docs deployment branch.
- Add a short docs/development page describing where maintainers can find the benchmark history.
One possible implementation is to use benchmark-action/github-action-benchmark, which supports Catch2 and pytest-benchmark output formats, but contributors may propose another simple approach if it satisfies the requirements.
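As a rough sketch only, not a committed design, the skeleton below shows how the trigger, runner, and history-storage pieces could fit together with that action. The cron time, Python setup, pytest invocation, and the bench-data branch name are illustrative assumptions a contributor would adjust to how tools/bench/ is actually built and run.

```yaml
# Sketch of a possible .github/workflows/benchmarks.yml -- illustrative only.
name: Benchmarks

on:
  workflow_dispatch:          # manual runs by maintainers
  schedule:
    - cron: "0 3 * * *"       # nightly; exact time is a placeholder

permissions:
  contents: write             # needed to push history to a data branch

jobs:
  pytest-benchmarks:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      # Hypothetical install/run commands; replace with however the
      # tools/bench/ suite is actually invoked in this repo.
      - name: Run pytest-benchmark suite
        run: |
          python -m pip install pytest pytest-benchmark
          python -m pytest tools/bench/ --benchmark-json=bench-results.json

      # Append this run's results to the stored history. Pointing
      # gh-pages-branch at a dedicated branch keeps the history off the
      # docs deployment branch; "bench-data" echoes the open question below.
      - name: Store benchmark history
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: pytest
          output-file-path: bench-results.json
          github-token: ${{ secrets.GITHUB_TOKEN }}
          gh-pages-branch: bench-data
          auto-push: true
```

A Catch2 job could follow the same pattern by redirecting the benchmark binary's console output to a file and setting tool: catch2, though starting with a single suite (per the open question below) would keep the first version small.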
Acceptance criteria
- A scheduled/manual GitHub Actions workflow runs existing Clifft benchmarks.
- Benchmark history is persisted somewhere maintainers can inspect across runs.
- The implementation does not write to or depend on the existing gh-pages docs deployment branch.
- The workflow does not run on every pull request.
- A short docs page explains how to find and interpret the benchmark history.
- The PR explains any benchmark output format/tooling choices.
Out of scope
- PR-time benchmark comparison comments.
- Failing CI on benchmark regressions.
- Alerting integrations.
- Self-hosted runners.
- Adding new benchmark workloads.
- Reworking the existing benchmark suite.
Open questions
- Should the first version track both Catch2 and pytest-benchmark results, or start with one suite?
- What storage/viewing mechanism should we use for the history?
- Should the history live on a dedicated branch such as bench-data?