Support for running criterion benches as tests? #96

Hi all!

I'm a big fan of what this project is doing.

While trying to integrate nextest into https://github.com/vectordotdev/vector, I noticed that it fails while building the test list for binaries built from criterion benchmarks. Those binaries accept `--list` but not the `--format` flag that normal test binaries support:

```
cargo nextest run --workspace --no-fail-fast --no-default-features --features "default metrics-benches codecs-benches language-benches remap-benches statistic-benches dnstap-benches benches"
    Finished test [unoptimized + debuginfo] target(s) in 1.31s
error: Found argument '--format' which wasn't expected, or isn't valid in this context

USAGE:
    limit-2c27c6bee8522ca1 --list

For more information try --help
Error:
   0: error building test list
   1: running ''/Users/jesse.szwedko/workspace/vector/target/debug/deps/limit-2c27c6bee8522ca1 --list --format terse'' failed
   2: command ["/Users/jesse.szwedko/workspace/vector/target/debug/deps/limit-2c27c6bee8522ca1", "--list", "--format", "terse"] exited with code 1

Backtrace omitted.
Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
make: *** [test] Error 1
```

Running `--help` on the test binary:

```
Criterion Benchmark 

USAGE:
    limit-2c27c6bee8522ca1 [FLAGS] [OPTIONS] [FILTER]

FLAGS:
    -h, --help       Prints help information
        --list       List all benchmarks
    -n, --noplot     Disable plot and HTML generation.
    -v, --verbose    Print additional statistical information.

OPTIONS:
    -b, --baseline <baseline>                        Compare to a named baseline.
    -c, --color <color>
            Configure coloring of output. always = always colorize output, never = never colorize output, auto =
            colorize output if output is a tty and compiled for unix. [default: auto]  [possible values: auto, always,
            never]
        --confidence-level <confidence-level>        Changes the default confidence level for this run. [default: 0.95]
        --load-baseline <load-baseline>              Load a previous baseline instead of sampling new data.
        --measurement-time <measurement-time>        Changes the default measurement time for this run. [default: 5]
        --noise-threshold <noise-threshold>          Changes the default noise threshold for this run. [default: 0.01]
        --nresamples <nresamples>
            Changes the default number of resamples for this run. [default: 100000]

        --output-format <output-format>
            Change the CLI output format. By default, Criterion.rs will use its own format. If output format is set to
            'bencher', Criterion.rs will print output in a format that resembles the 'bencher' crate. [default:
            criterion]  [possible values: criterion, bencher]
        --plotting-backend <plotting-backend>
            Set the plotting backend. By default, Criterion.rs will use the gnuplot backend if gnuplot is available, or
            the plotters backend if it isn't. [possible values: gnuplot, plotters]
        --profile-time <profile-time>
            Iterate each benchmark for approximately the given number of seconds, doing no analysis and without storing
            the results. Useful for running the benchmarks in a profiler.
        --sample-size <sample-size>                  Changes the default size of the sample for this run. [default: 100]
    -s, --save-baseline <save-baseline>              Save results under a named baseline. [default: base]
        --significance-level <significance-level>
            Changes the default significance level for this run. [default: 0.05]

        --warm-up-time <warm-up-time>                Changes the default warm up time for this run. [default: 3]

ARGS:
    <FILTER>    Skip benchmarks whose names do not contain FILTER.


This executable is a Criterion.rs benchmark.
See https://github.com/bheisler/criterion.rs for more details.

To enable debug output, define the environment variable CRITERION_DEBUG.
Criterion.rs will output more debug information and will save the gnuplot
scripts alongside the generated plots.

To test that the benchmarks work, run `cargo test --benches`

NOTE: If you see an 'unrecognized option' error using any of the options above, see:
https://bheisler.github.io/criterion.rs/book/faq.html
```

I was just curious to get thoughts on handling this. Should I stick with normal `cargo test --benches` for that target for now and use `nextest` for the other targets?
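
One possible direction, sketched here only as an illustration: a bench binary with a custom harness could recognize the `--list --format terse` call shown in the error above and answer it itself. The helper names `is_terse_list_request` and `render_terse_list` are hypothetical and are not part of Criterion.rs or nextest. The `<name>: benchmark` line shape is my assumption about what the standard test harness prints for benchmarks in terse mode, so it would need checking against what nextest actually parses.

```rust
/// Returns true when the arguments contain the listing call seen in the
/// error output: `--list` together with `--format terse`.
/// Hypothetical helper, not an API of Criterion.rs or nextest.
fn is_terse_list_request(args: &[String]) -> bool {
    let has_list = args.iter().any(|a| a == "--list");
    // `--format` must be immediately followed by `terse`.
    let has_terse_format = args
        .windows(2)
        .any(|pair| pair[0] == "--format" && pair[1] == "terse");
    has_list && has_terse_format
}

/// Renders one line per benchmark name as `<name>: benchmark`.
/// ASSUMPTION: this mirrors the terse listing of the standard test harness;
/// verify it against the runner before relying on it.
fn render_terse_list(names: &[&str]) -> String {
    names
        .iter()
        .map(|name| format!("{name}: benchmark\n"))
        .collect()
}
```

A harness `main` could check `is_terse_list_request` on `std::env::args()` before handing the arguments to the benchmark framework, print the rendered list, and exit successfully. Whether that belongs in the bench harness or in nextest is exactly the question above.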
