Support for running criterion benches as tests? #96

@jszwedko

Hi all!

I'm a big fan of what this project is doing.

While trying to integrate this into https://github.com/vectordotdev/vector, I noticed that nextest fails on test binaries built from criterion benchmarks, because they don't support the --format flag that normal test binaries accept:

cargo nextest run --workspace --no-fail-fast --no-default-features --features "default metrics-benches codecs-benches language-benches remap-benches statistic-benches dnstap-benches benches"
    Finished test [unoptimized + debuginfo] target(s) in 1.31s
error: Found argument '--format' which wasn't expected, or isn't valid in this context

USAGE:
    limit-2c27c6bee8522ca1 --list

For more information try --help
Error:
   0: error building test list
   1: running ''/Users/jesse.szwedko/workspace/vector/target/debug/deps/limit-2c27c6bee8522ca1 --list --format terse'' failed
   2: command ["/Users/jesse.szwedko/workspace/vector/target/debug/deps/limit-2c27c6bee8522ca1", "--list", "--format", "terse"] exited with code 1

Backtrace omitted.
Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
make: *** [test] Error 1

Running --help on the test binary:

Criterion Benchmark 

USAGE:
    limit-2c27c6bee8522ca1 [FLAGS] [OPTIONS] [FILTER]

FLAGS:
    -h, --help       Prints help information
        --list       List all benchmarks
    -n, --noplot     Disable plot and HTML generation.
    -v, --verbose    Print additional statistical information.

OPTIONS:
    -b, --baseline <baseline>                        Compare to a named baseline.
    -c, --color <color>
            Configure coloring of output. always = always colorize output, never = never colorize output, auto =
            colorize output if output is a tty and compiled for unix. [default: auto]  [possible values: auto, always,
            never]
        --confidence-level <confidence-level>        Changes the default confidence level for this run. [default: 0.95]
        --load-baseline <load-baseline>              Load a previous baseline instead of sampling new data.
        --measurement-time <measurement-time>        Changes the default measurement time for this run. [default: 5]
        --noise-threshold <noise-threshold>          Changes the default noise threshold for this run. [default: 0.01]
        --nresamples <nresamples>
            Changes the default number of resamples for this run. [default: 100000]

        --output-format <output-format>
            Change the CLI output format. By default, Criterion.rs will use its own format. If output format is set to
            'bencher', Criterion.rs will print output in a format that resembles the 'bencher' crate. [default:
            criterion]  [possible values: criterion, bencher]
        --plotting-backend <plotting-backend>
            Set the plotting backend. By default, Criterion.rs will use the gnuplot backend if gnuplot is available, or
            the plotters backend if it isn't. [possible values: gnuplot, plotters]
        --profile-time <profile-time>
            Iterate each benchmark for approximately the given number of seconds, doing no analysis and without storing
            the results. Useful for running the benchmarks in a profiler.
        --sample-size <sample-size>                  Changes the default size of the sample for this run. [default: 100]
    -s, --save-baseline <save-baseline>              Save results under a named baseline. [default: base]
        --significance-level <significance-level>
            Changes the default significance level for this run. [default: 0.05]

        --warm-up-time <warm-up-time>                Changes the default warm up time for this run. [default: 3]

ARGS:
    <FILTER>    Skip benchmarks whose names do not contain FILTER.


This executable is a Criterion.rs benchmark.
See https://github.com/bheisler/criterion.rs for more details.

To enable debug output, define the environment variable CRITERION_DEBUG.
Criterion.rs will output more debug information and will save the gnuplot
scripts alongside the generated plots.

To test that the benchmarks work, run `cargo test --benches`

NOTE: If you see an 'unrecognized option' error using any of the options above, see:
https://bheisler.github.io/criterion.rs/book/faq.html

I'd like to get thoughts on how to handle this. For now, should I stick with plain cargo test --benches for that target and use nextest for the other targets?
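To make the failure mode concrete, here is a minimal sketch of what a runner that lists tests via --list --format terse has to do. It assumes the standard libtest terse format of one "name: test" or "name: benchmark" entry per line. The names EntryKind, parse_terse_list, and list_tests are hypothetical and made up for illustration; this is not nextest's actual implementation.

```rust
use std::path::Path;
use std::process::Command;

/// Kind of entry in terse list output (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
enum EntryKind {
    Test,
    Benchmark,
}

/// Parses terse list output, assumed to be one `name: kind` entry per line.
/// Returns `None` as soon as a non-empty line does not fit that shape.
fn parse_terse_list(output: &str) -> Option<Vec<(String, EntryKind)>> {
    let mut entries = Vec::new();
    for line in output.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        // Split on the last ": " so names containing "::" stay intact.
        let (name, kind) = line.rsplit_once(": ")?;
        let kind = match kind {
            "test" => EntryKind::Test,
            "benchmark" => EntryKind::Benchmark,
            _ => return None,
        };
        entries.push((name.to_string(), kind));
    }
    Some(entries)
}

/// Runs `binary --list --format terse` and parses its stdout.
/// A binary that rejects `--format` (as the criterion one above does)
/// exits non-zero and is reported as an error here.
fn list_tests(binary: &Path) -> Result<Vec<(String, EntryKind)>, String> {
    let output = Command::new(binary)
        .args(&["--list", "--format", "terse"])
        .output()
        .map_err(|e| format!("failed to spawn {}: {}", binary.display(), e))?;
    if !output.status.success() {
        return Err(format!(
            "{} rejected `--list --format terse` ({})",
            binary.display(),
            output.status
        ));
    }
    let stdout = String::from_utf8_lossy(&output.stdout);
    parse_terse_list(&stdout)
        .ok_or_else(|| format!("{} printed unexpected list output", binary.display()))
}
```

Calling list_tests on the limit-2c27c6bee8522ca1 binary from the log above would take the non-zero exit branch, since that binary exits with code 1 when passed --format. A runner could use that branch to skip or separately handle binaries that don't implement the libtest interface instead of aborting the whole run.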
