Add shrinking a la delta debugging#345
Conversation
Signed-off-by: Stevan A <stevana@users.noreply.github.com>
Signed-off-by: Stevan A <stevana@users.noreply.github.com>
Signed-off-by: Stevan A <stevana@users.noreply.github.com>
Signed-off-by: Stevan A <stevana@users.noreply.github.com>
WalkthroughA new delta debugging "shrink" module was introduced to minimize failing simulation inputs. The simulation logic was refactored to use this shrinker for property-based tests, with improved error reporting and test input generation via a new Changes
Sequence Diagram(s)sequenceDiagram
participant TestRunner
participant InputGenerator
participant Simulator
participant Shrinker
TestRunner->>InputGenerator: Generate test inputs (using generate_zip_with)
TestRunner->>Simulator: Run simulation on inputs
Simulator-->>TestRunner: Return result (history, property check)
alt Test fails
TestRunner->>Shrinker: Minimize failing input (shrink)
Shrinker->>Simulator: Re-run simulation on reduced inputs
Shrinker-->>TestRunner: Return minimized input, error, shrink count
TestRunner->>TestRunner: Display failure with minimized input
end
Estimated code review effort4 (~90 minutes) Possibly related PRs
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (3)
🧠 Learnings (2)📓 Common learningssimulation/amaru-sim/src/simulator/simulate.rs (9)Learnt from: stevana Learnt from: rkuhn Learnt from: jeluard Learnt from: rkuhn Learnt from: abailly Learnt from: rkuhn Learnt from: rkuhn Learnt from: rkuhn Learnt from: rkuhn 🚧 Files skipped from review as they are similar to previous changes (2)
🧰 Additional context used🧠 Learnings (2)📓 Common learningssimulation/amaru-sim/src/simulator/simulate.rs (9)Learnt from: stevana Learnt from: rkuhn Learnt from: jeluard Learnt from: rkuhn Learnt from: abailly Learnt from: rkuhn Learnt from: rkuhn Learnt from: rkuhn Learnt from: rkuhn ⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
🔇 Additional comments (9)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
simulation/amaru-sim/src/simulator/generate.rs (1)
380-396: Consider using iterators for a more rustic approach, mate!While your implementation works like a charm, you could make it more idiomatic by using
zipandmap. It's like choosing between manual transmission and automatic - both get you there, but one's more elegant!pub fn generate_zip_with<A: Copy, B: Copy, C>( generator1: impl Fn(&mut StdRng) -> Vec<A>, generator2: impl Fn(&mut StdRng) -> Vec<B>, f: impl Fn(A, B) -> C, ) -> impl Fn(&mut StdRng) -> Vec<C> { move |rng| { let xs = generator1(rng); let ys = generator2(rng); assert_eq!(xs.len(), ys.len()); - let mut zs = Vec::with_capacity(xs.len()); - - for i in 0..xs.len() { - zs.push(f(xs[i], ys[i])); - } - zs + xs.into_iter() + .zip(ys) + .map(|(x, y)| f(x, y)) + .collect() } }simulation/amaru-sim/src/simulator/shrink.rs (1)
57-57: Minor nitpick on the granularity adjustment- n = n.saturating_sub(1).max(2); + n = n.max(2) - 1;Since we know
n >= 2from the algorithm logic, this simplification works just as well and is a tad cleaner!simulation/amaru-sim/src/simulator/simulate.rs (1)
266-286: Clever extraction of the test logic, but could be cleaner!The nested function approach works, but it's like having a game within a game - sometimes it's better to make it a proper standalone level!
Consider extracting this to a separate function at module level for better readability and potential reuse. The closure captures make it a bit dense to follow.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
simulation/amaru-sim/src/simulator/generate.rs(1 hunks)simulation/amaru-sim/src/simulator/mod.rs(1 hunks)simulation/amaru-sim/src/simulator/shrink.rs(1 hunks)simulation/amaru-sim/src/simulator/simulate.rs(9 hunks)
🧠 Learnings (5)
📓 Common learnings
Learnt from: stevana
PR: pragma-org/amaru#210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.
simulation/amaru-sim/src/simulator/mod.rs (1)
Learnt from: stevana
PR: #210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in simulation/amaru-sim/src/simulator/simulate.rs with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (../../target/debug/echo) and making tests more reliable.
simulation/amaru-sim/src/simulator/shrink.rs (2)
Learnt from: stevana
PR: #210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in simulation/amaru-sim/src/simulator/simulate.rs with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (../../target/debug/echo) and making tests more reliable.
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.
simulation/amaru-sim/src/simulator/generate.rs (1)
Learnt from: stevana
PR: #236
File: simulation/amaru-sim/src/simulator/generate.rs:181-181
Timestamp: 2025-06-02T12:55:45.985Z
Learning: In recent versions of the rand crate, the gen_range method has been deprecated and renamed to random_range. The random_range method is the current, correct method to use for generating random numbers within a range.
simulation/amaru-sim/src/simulator/simulate.rs (6)
Learnt from: stevana
PR: #210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in simulation/amaru-sim/src/simulator/simulate.rs with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (../../target/debug/echo) and making tests more reliable.
Learnt from: rkuhn
PR: #206
File: crates/pure-stage/src/simulation/running.rs:240-242
Timestamp: 2025-05-09T13:09:47.915Z
Learning: Cloning messages in the pure-stage crate should be avoided for performance reasons. The current implementation in SimulationRunning deliberately avoids duplicating message data structures.
Learnt from: rkuhn
PR: #263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:0-0
Timestamp: 2025-04-20T17:56:39.223Z
Learning: For mpsc::channel in Tokio-based test code, use buffer sizes larger than 1 (e.g., 8) to avoid potential deadlocks when producers send multiple messages before consumers can process them.
Learnt from: rkuhn
PR: #263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.134Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: stevana
PR: pragma-org/amaru#210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.
simulation/amaru-sim/src/simulator/mod.rs (1)
Learnt from: stevana
PR: #210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in simulation/amaru-sim/src/simulator/simulate.rs with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (../../target/debug/echo) and making tests more reliable.
simulation/amaru-sim/src/simulator/shrink.rs (2)
Learnt from: stevana
PR: #210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in simulation/amaru-sim/src/simulator/simulate.rs with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (../../target/debug/echo) and making tests more reliable.
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.
simulation/amaru-sim/src/simulator/generate.rs (1)
Learnt from: stevana
PR: #236
File: simulation/amaru-sim/src/simulator/generate.rs:181-181
Timestamp: 2025-06-02T12:55:45.985Z
Learning: In recent versions of the rand crate, the gen_range method has been deprecated and renamed to random_range. The random_range method is the current, correct method to use for generating random numbers within a range.
simulation/amaru-sim/src/simulator/simulate.rs (6)
Learnt from: stevana
PR: #210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in simulation/amaru-sim/src/simulator/simulate.rs with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (../../target/debug/echo) and making tests more reliable.
Learnt from: rkuhn
PR: #206
File: crates/pure-stage/src/simulation/running.rs:240-242
Timestamp: 2025-05-09T13:09:47.915Z
Learning: Cloning messages in the pure-stage crate should be avoided for performance reasons. The current implementation in SimulationRunning deliberately avoids duplicating message data structures.
Learnt from: rkuhn
PR: #263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:0-0
Timestamp: 2025-04-20T17:56:39.223Z
Learning: For mpsc::channel in Tokio-based test code, use buffer sizes larger than 1 (e.g., 8) to avoid potential deadlocks when producers send multiple messages before consumers can process them.
Learnt from: rkuhn
PR: #263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.134Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.
🔇 Additional comments (7)
simulation/amaru-sim/src/simulator/mod.rs (1)
56-56: G'day mate, module declaration looks spot on!Clean addition of the shrink module to the simulator's public API. Like adding a new character to your party in Final Fantasy - simple but essential for the quest ahead!
simulation/amaru-sim/src/simulator/shrink.rs (3)
15-23: Ripper documentation, cobber!Love the clear explanation of Zeller's delta debugging algorithm. It's like the debugging equivalent of a binary search boss fight - keep halving the problem until you find the weak spot!
50-52: Smart move treating different errors as passing tests!This is a clever design choice - like how in Dark Souls, different death animations mean different things. It ensures the shrinker focuses on the specific error you're hunting.
75-161: Top-notch test coverage, legend!The test suite is comprehensive - covering successful shrinking, unresolved cases, and the panic scenario. Like a well-designed game tutorial that covers all the mechanics!
simulation/amaru-sim/src/simulator/simulate.rs (3)
293-298: Solid integration of the shrinking functionality!The way you've wired up the shrinker is ace - passing the error predicate to ensure we're shrinking the right failure. It's like having the perfect combo move in Street Fighter!
346-347: Nice touch showing the shrink count in the failure message!The minimized input display with shrink count gives great debugging context. Like showing the speedrun timer - you know exactly how much work went into finding that minimal repro!
475-491: Smooth use of the new generate_zip_with function!The random arrival times make the tests more realistic - like adding RNG to your game instead of fixed spawn patterns. Much better coverage of timing-related bugs!
Codecov Report❌ Patch coverage is
... and 2 files with indirect coverage changes 🚀 New features to boost your workflow:
|
Signed-off-by: Stevan A <stevana@users.noreply.github.com>
abailly
left a comment
There was a problem hiding this comment.
nice job! Just minor suggestion which might or might not work
|
|
||
| match world.run_world() { | ||
| Err((reason, history)) => { | ||
| match run_test(config.number_of_nodes, &spawn, &property)(&entries) { |
There was a problem hiding this comment.
can you reuse run_test(..) here, eg. something like
let test = run_test(config.number_of_nodes, &spawn, &property)
match test(&entries) {
...
}
Did not try it so take with a grain of salt
There was a problem hiding this comment.
Seems to work, will make it part of next PR, thanks!
rkuhn
left a comment
There was a problem hiding this comment.
I know I’m late, but there are some questions I have on this one.
| let mut start = 0; | ||
| let subset_length = input.len() / n; | ||
| let mut some_complement_is_failing = false; | ||
| while start < input.len() { |
There was a problem hiding this comment.
when input.len() is odd, this will run a third time with a single input element removed at the end — is this required by the algorithm, or shouldn’t we rather round up and change the second extend_from_slice to just cap the copy?
There was a problem hiding this comment.
I think removing a single input element from the end is correct...
Add shrinking to the simulator, essentially bisecting the input for as long as the same error is returned by the test.
Summary by CodeRabbit
New Features
Refactor
Chores