Add shrinking a la delta debugging by stevana · Pull Request #345 · pragma-org/amaru

stevana · 2025-07-22T10:42:27Z

Add shrinking to the simulator, essentially bisecting the input for as long as the same error is returned by the test.
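
The idea can be sketched as follows. This is a hypothetical, minimal illustration and not the code from `shrink.rs`: the name `shrink_sketch`, its signature, and the chunk-removal strategy are all assumptions. It tries to remove chunks of the input, starting at half the input's length and halving the chunk size after each pass, and keeps a smaller candidate only when the test fails with the same error as the original input.

```rust
/// Hypothetical sketch of a delta-debugging style shrinker (not the PR's code).
///
/// `test` returns `Ok(())` when the property holds and `Err(e)` when it fails.
/// A smaller candidate is kept only if it fails with the same `error`.
/// Returns the minimised input and the number of successful shrink steps.
pub fn shrink_sketch<T, E, F>(input: Vec<T>, error: &E, mut test: F) -> (Vec<T>, u32)
where
    T: Clone,
    E: PartialEq,
    F: FnMut(&[T]) -> Result<(), E>,
{
    let mut current = input;
    let mut shrinks = 0;
    // Start by trying to drop half of the input, then ever smaller chunks.
    let mut chunk = (current.len() / 2).max(1);
    loop {
        let mut start = 0;
        let mut removed_any = false;
        while start < current.len() {
            let end = (start + chunk).min(current.len());
            // Candidate: the current input with `current[start..end]` removed.
            let mut candidate = Vec::with_capacity(current.len() - (end - start));
            candidate.extend_from_slice(&current[..start]);
            candidate.extend_from_slice(&current[end..]);
            if matches!(test(candidate.as_slice()), Err(ref e) if e == error) {
                // Same error with less input: keep the candidate and retry at
                // the same offset, since later elements have shifted into place.
                current = candidate;
                shrinks += 1;
                removed_any = true;
            } else {
                // This chunk is needed to reproduce the error; move past it.
                start = end;
            }
        }
        // Stop once single elements can no longer be removed.
        if chunk == 1 && !removed_any {
            break;
        }
        chunk = (chunk / 2).max(1);
    }
    (current, shrinks)
}
```

For example, if a test fails whenever the input contains both `3` and `11`, running this sketch on the numbers `0..16` leaves just those two elements.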

Summary by CodeRabbit

New Features
- Added automatic shrinking of failing simulation inputs, minimizing them to the smallest case that still triggers errors.
- Introduced delta debugging to help identify minimal failing scenarios in simulations.
- Added a new function to combine generated vectors element-wise for enhanced test input creation.
Refactor
- Improved simulation test loop structure for better maintainability and error handling.
- Updated error reporting to provide concise failure messages and display minimized failing inputs.
Chores
- Enhanced message generation in tests to include randomized arrival times for more robust simulation scenarios.

Signed-off-by: Stevan A <stevana@users.noreply.github.com>

coderabbitai · 2025-07-22T10:42:35Z

Walkthrough

A new delta debugging "shrink" module was introduced to minimize failing simulation inputs. The simulation logic was refactored to use this shrinker for property-based tests, with improved error reporting and test input generation via a new generate_zip_with utility. Module declarations and imports were updated accordingly.

Changes

| File(s) | Change Summary |
| --- | --- |
| simulation/amaru-sim/src/simulator/generate.rs | Added `generate_zip_with`, a generic function to combine two generated vectors element-wise. |
| simulation/amaru-sim/src/simulator/mod.rs | Declared new public module `shrink`. |
| simulation/amaru-sim/src/simulator/shrink.rs | Introduced delta debugging shrinker module with `shrink` function and unit tests. |
| simulation/amaru-sim/src/simulator/simulate.rs | Refactored simulation loop to use shrinker, reworked test input generation, and improved error handling. |
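
To illustrate what combining two generated vectors element-wise could look like, here is a minimal std-only sketch. It is not the signature of `generate_zip_with` in `generate.rs`: the `Lcg` random source, the name `generate_zip_with_sketch`, and the closure-based generator shape are all assumptions made for the example.

```rust
/// Tiny deterministic pseudo-random source (linear congruential generator).
/// A hypothetical, std-only stand-in for whatever RNG the simulator uses.
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

/// Runs two vector generators against the same random source and combines
/// their outputs pairwise. The result is as long as the shorter vector.
pub fn generate_zip_with_sketch<A, B, C>(
    rng: &mut Lcg,
    gen_a: impl Fn(&mut Lcg) -> Vec<A>,
    gen_b: impl Fn(&mut Lcg) -> Vec<B>,
    combine: impl Fn(A, B) -> C,
) -> Vec<C> {
    let xs = gen_a(&mut *rng);
    let ys = gen_b(&mut *rng);
    xs.into_iter()
        .zip(ys)
        .map(|(a, b)| combine(a, b))
        .collect()
}
```

In the spirit of this PR, one generator could produce arrival times and the other message payloads, with `combine` pairing each payload with its arrival time.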

Sequence Diagram(s)

sequenceDiagram
    participant TestRunner
    participant InputGenerator
    participant Simulator
    participant Shrinker

    TestRunner->>InputGenerator: Generate test inputs (using generate_zip_with)
    TestRunner->>Simulator: Run simulation on inputs
    Simulator-->>TestRunner: Return result (history, property check)
    alt Test fails
        TestRunner->>Shrinker: Minimize failing input (shrink)
        Shrinker->>Simulator: Re-run simulation on reduced inputs
        Shrinker-->>TestRunner: Return minimized input, error, shrink count
        TestRunner->>TestRunner: Display failure with minimized input
    end
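
The flow in the diagram can be sketched in Rust roughly as below. This is an illustrative outline only: `Failure`, `run_cases_sketch`, and the shape of the `shrinker` parameter are hypothetical names and types, not those used in `simulate.rs`. The runner executes each generated case, and on the first failure hands the input, the error, and a re-runnable simulation to the shrinker before reporting the minimized input.

```rust
/// What the runner reports for a failing case (hypothetical type).
#[derive(Debug)]
pub struct Failure<T> {
    pub case_index: usize,
    pub minimal_input: Vec<T>,
    pub error: String,
    pub shrinks: u32,
}

/// Runs `simulate` on every case. On the first failure, asks `shrinker` to
/// minimise the failing input (re-running the simulation as needed) and
/// returns the minimised input, the error, and the shrink count.
/// Returns the number of passing cases if nothing fails.
pub fn run_cases_sketch<T>(
    cases: Vec<Vec<T>>,
    mut simulate: impl FnMut(&[T]) -> Result<(), String>,
    shrinker: impl Fn(Vec<T>, &str, &mut dyn FnMut(&[T]) -> Result<(), String>) -> (Vec<T>, u32),
) -> Result<usize, Failure<T>> {
    let mut passed = 0;
    for (case_index, case) in cases.into_iter().enumerate() {
        match simulate(case.as_slice()) {
            Ok(()) => passed += 1,
            Err(error) => {
                // Hand the simulation itself to the shrinker so it can re-run
                // reduced inputs, as in the "Test fails" branch of the diagram.
                let sim: &mut dyn FnMut(&[T]) -> Result<(), String> = &mut simulate;
                let (minimal_input, shrinks) = shrinker(case, error.as_str(), sim);
                return Err(Failure {
                    case_index,
                    minimal_input,
                    error,
                    shrinks,
                });
            }
        }
    }
    Ok(passed)
}
```

Passing the shrinker as a parameter keeps the runner independent of any particular minimization strategy, which is a design choice of this sketch rather than something the PR states.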

Estimated code review effort

4 (~90 minutes)

Possibly related PRs

Initial version of Rust simulator #210: Builds upon the initial Rust simulator framework and echo test setup, extending simulation and testing infrastructure.
feat: generate more realistic arrival times #294: Both PRs modify the simulation's message generation logic, with this PR introducing a generic zip generator and the related PR adding a specific arrival time generator.

Suggested reviewers

abailly

Poem

🍀
In the code down under, where the bugs may roam,
A shrinker now helps bring failing tests home.
Inputs get zipped, simulations run tight,
Errors get smaller—like a Goomba in fright!
Debugging’s a breeze, with a cheeky new twist,
Now let’s raise a pint—no more bugs on the list!
🍻

📜 Recent review details


📥 Commits

Reviewing files that changed from the base of the PR and between fcceb41 and bd51bfe.

📒 Files selected for processing (3)

simulation/amaru-sim/src/simulator/generate.rs (1 hunks)
simulation/amaru-sim/src/simulator/shrink.rs (1 hunks)
simulation/amaru-sim/src/simulator/simulate.rs (10 hunks)

🧠 Learnings (2)

📓 Common learnings

Learnt from: stevana
PR: pragma-org/amaru#210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

simulation/amaru-sim/src/simulator/simulate.rs (9)

Learnt from: stevana
PR: #210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in simulation/amaru-sim/src/simulator/simulate.rs with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (../../target/debug/echo) and making tests more reliable.

Learnt from: rkuhn
PR: #206
File: crates/pure-stage/src/simulation/running.rs:240-242
Timestamp: 2025-05-09T13:09:47.915Z
Learning: Cloning messages in the pure-stage crate should be avoided for performance reasons. The current implementation in SimulationRunning deliberately avoids duplicating message data structures.

Learnt from: jeluard
PR: #69
File: crates/amaru/src/ledger/state/diff_epoch_reg.rs:112-117
Timestamp: 2025-01-21T15:32:17.911Z
Learning: When suggesting code changes in Rust, always verify that the types align correctly, especially when dealing with references and Options. The Fold::Registered variant in diff_epoch_reg.rs expects a reference &'a V, so unwrapping an Option<&V> requires only a single .expect().

Learnt from: rkuhn
PR: #263
File: crates/amaru-consensus/src/consensus/store.rs:220-223
Timestamp: 2025-06-14T16:38:35.449Z
Learning: In NetworkName::Preprod.into() when converting to &EraHistory, the From implementation returns a static reference to a constant value, not a temporary. This makes it safe to return directly from functions expecting &EraHistory without storing it in a struct field.

Learnt from: abailly
PR: #195
File: crates/amaru/src/stages/consensus/fetch_block.rs:0-0
Timestamp: 2025-04-23T09:12:58.872Z
Learning: In the amaru codebase, when constructing new events from existing events, it's preferred to take ownership of the original event (with a clone at the call site if needed) rather than taking a reference and explicitly cloning individual fields. This approach makes the code cleaner and more straightforward.

Learnt from: rkuhn
PR: #263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.

Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:0-0
Timestamp: 2025-04-20T17:56:39.223Z
Learning: For mpsc::channel in Tokio-based test code, use buffer sizes larger than 1 (e.g., 8) to avoid potential deadlocks when producers send multiple messages before consumers can process them.

Learnt from: rkuhn
PR: #263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.134Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.

🚧 Files skipped from review as they are similar to previous changes (2)

simulation/amaru-sim/src/simulator/generate.rs
simulation/amaru-sim/src/simulator/shrink.rs
