Add scheduled benchmark history workflow #38

Description

@bachase

Problem or motivation

Clifft has C++ Catch2 benchmarks in tests/test_benchmarks.cc and Python pytest-benchmark cases in tools/bench/, but we do not currently keep a historical record of benchmark results. This makes performance regressions harder to spot during normal maintenance and release prep.

We should add a lightweight scheduled benchmark workflow that records benchmark results over time.

Proposed solution

Add a GitHub Actions workflow that can be run manually and on a nightly schedule. The workflow should run the existing benchmark suites and store their results in a way maintainers can inspect over time.

Suggested direction:

  • Add a new workflow under .github/workflows/.
  • Trigger it with workflow_dispatch and a nightly schedule.
  • Use ubuntu-24.04 for the first version.
  • Run existing C++ and/or Python benchmarks without adding new benchmark fixtures.
  • Store benchmark history outside the docs deployment branch.
  • Add a short docs/development page describing where maintainers can find the benchmark history.

One possible implementation is benchmark-action/github-action-benchmark, which understands both Catch2 and pytest-benchmark output formats, but contributors are welcome to propose a different lightweight approach that satisfies the requirements below.
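As a rough illustration of the trigger and runner setup above, the skeleton below shows one way the workflow could start out. The file name, cron time, Python version, and the exact install/benchmark commands are placeholders chosen for this sketch, not decisions made in this issue.

```yaml
# .github/workflows/bench-history.yml — sketch only; names, times, and commands are illustrative
name: Benchmark history

on:
  workflow_dispatch:          # manual runs for maintainers
  schedule:
    - cron: '0 3 * * *'       # nightly; the exact time is arbitrary here

permissions:
  contents: write             # required if results are pushed to a data branch

jobs:
  bench:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      # Run the existing pytest-benchmark cases and emit machine-readable JSON.
      # The install command is a placeholder; it would need to match how the
      # project and its benchmark dependencies are actually set up.
      - run: |
          python -m pip install -e . pytest-benchmark
          pytest tools/bench/ --benchmark-json=bench.json
```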

Acceptance criteria

  • A scheduled/manual GitHub Actions workflow runs existing Clifft benchmarks.
  • Benchmark history is persisted somewhere maintainers can inspect across runs.
  • The implementation does not write to or depend on the existing gh-pages docs deployment branch.
  • The workflow does not run on every pull request.
  • A short docs page explains how to find and interpret the benchmark history.
  • The PR explains any benchmark output format/tooling choices.

Out of scope

  • PR-time benchmark comparison comments.
  • Failing CI on benchmark regressions.
  • Alerting integrations.
  • Self-hosted runners.
  • Adding new benchmark workloads.
  • Reworking the existing benchmark suite.

Open questions

  • Should the first version track both Catch2 and pytest-benchmark results, or start with one suite?
  • What storage/viewing mechanism should we use for the history?
  • Should the history live on a dedicated branch such as bench-data?
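On the last question: github-action-benchmark can push its data files to a branch other than gh-pages, so a dedicated bench-data branch is one concrete option. The step below sketches that configuration; the branch name, data directory, and the choice of the pytest tool are assumptions for illustration, not requirements.

```yaml
# Sketch of a storage step; 'bench-data' and the paths are assumptions.
- name: Store benchmark history
  uses: benchmark-action/github-action-benchmark@v1
  with:
    name: Python benchmarks
    tool: 'pytest'                     # pytest-benchmark JSON; 'catch2' is also supported
    output-file-path: bench.json
    gh-pages-branch: bench-data        # keeps history off the gh-pages docs branch
    benchmark-data-dir-path: dev/bench
    github-token: ${{ secrets.GITHUB_TOKEN }}
    auto-push: true                    # commit the updated data files to bench-data
```

If that route is taken, the short docs/development page from the proposed solution would only need to link to wherever the data on that branch is viewable.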
