Commit ee00cbe

Merge pull request #352 from pragma-org/stevan/sim-readme
feat/simulation readme
2 parents a16abfa + 0bf67a6 commit ee00cbe

4 files changed

Lines changed: 147 additions & 32 deletions

simulation/amaru-sim/README.md

Lines changed: 95 additions & 9 deletions
@@ -1,23 +1,109 @@
 # Amaru Simulator

-This component aims at implementing a _Simulator_ for the Ouroboros Consensus, in Rust, using Amaru components. The main goal of this work is to be able to test the consensus part as deeply as possible, using different strategies, in increasing order of fidelity:
+This component aims at implementing a _Simulator_ for the Ouroboros Consensus,
+in Rust, using Amaru components. The main goal of this work is to be able to
+test the consensus part as deeply as possible, using different strategies, in
+increasing order of fidelity:

-1. ✅ In-process deterministic testing, completely simulating the environment, allowing arbitrary fault injections and full control over concurrency and other side-effects
-2.[Maelstrom](https://github.com/jepsen-io/maelstrom/)-like testing through stdin/stdout interface ignoring network interactions
-3. 🔴 [Jepsen](https://github.com/jepsen-io/jepsen)-like testing through full-blown deployment of a cluster and actual networking stack
+1. ✅ In-process deterministic testing, completely simulating the environment,
+   allowing arbitrary fault injections and full control over concurrency and
+   other side-effects
+2.[Maelstrom](https://github.com/jepsen-io/maelstrom/)-like testing through
+   stdin/stdout interface ignoring network interactions
+3. 🔴 [Jepsen](https://github.com/jepsen-io/jepsen)-like testing through
+   full-blown deployment of a cluster and actual networking stack
 4. 🔴 [Antithesis](https://antithesis.com) support

+## Overview
+
+The main components of the simulator are:
+
+* Test case generation, found in
+  [`src/simulator/generate.rs`](src/simulator/generate.rs), which uses the
+  pre-generated block tree saved in
+  [`tests/data/chain.json`](tests/data/chain.json);
+* The (discrete-event) simulator itself, which lives in
+  [`src/simulator/simulate.rs`](src/simulator/simulate.rs);
+* The property-based test and the property that uses the simulator, defined in
+  [`src/simulator/mod.rs`](src/simulator/mod.rs);
+* The actual Rust `#[test]` which gets picked up by `cargo test`, found in
+  [`tests/simulation.rs`](tests/simulation.rs).
+
 ## Usage

-The `simulator` executable is a pared-down version of Amaru where network communications are abstracted away. It's packaged as a test so running it amounts to:
+The `simulator` test is a pared-down version of Amaru where network
+communications are abstracted away.
+
+The test can be run as follows. Environment variables override the options; the
+default values are shown here (the `error` log filter only shows error-level logs):

 ```
-cargo test -p amaru-sim
+# Set AMARU_DISABLE_SHRINKING=1 to disable shrinking and AMARU_PERSIST_ON_SUCCESS=1 to
+# persist the pure-stage schedule on success. AMARU_TEST_SEED reproduces a test case.
+AMARU_NUMBER_OF_TESTS=50 \
+AMARU_NUMBER_OF_NODES=1 \
+AMARU_NUMBER_OF_UPSTREAM_PEERS=2 \
+AMARU_DISABLE_SHRINKING=0 \
+AMARU_TEST_SEED= \
+AMARU_PERSIST_ON_SUCCESS=0 \
+AMARU_SIMULATION_LOG="error" \
+cargo test run_simulator
 ```

-By default, it only logs `error` level and above log entries, but one filter logs by setting the `AMARU_SIMULATION_LOG` environment variable to an appropriate value.
+## Debugging failures
+
+When the test fails, the output looks something like this:
+
+```
+Minimised input (0 shrinks):
+
+Envelope { src: "c1", dest: "n1", body: Fwd { msg_id: 0, slot: Slot(31), hash: "2487bd", header: "828a0118" } }
+Envelope { src: "c1", dest: "n1", body: Fwd { msg_id: 1, slot: Slot(38), hash: "4fcd1d", header: "828a0218" } }
+Envelope { src: "c1", dest: "n1", body: Fwd { msg_id: 2, slot: Slot(41), hash: "739307", header: "828a0318" } }
+Envelope { src: "c1", dest: "n1", body: Fwd { msg_id: 3, slot: Slot(55), hash: "726ef3", header: "828a0418" } }
+Envelope { src: "c1", dest: "n1", body: Fwd { msg_id: 4, slot: Slot(93), hash: "597ea6", header: "828a0518" } }
+[...]
+
+History:
+
+0. "c1" ==> "n1" Fwd { msg_id: 0, slot: Slot(31), hash: "2487bd", header: "828a0118" }
+1. "n1" ==> "c1" Fwd { msg_id: 0, slot: Slot(31), hash: "2487bd", header: "828a0118" }
+2. "c2" ==> "n1" Fwd { msg_id: 0, slot: Slot(31), hash: "2487bd", header: "828a0118" }
+3. "c2" ==> "n1" Fwd { msg_id: 1, slot: Slot(38), hash: "4fcd1d", header: "828a0218" }
+4. "n1" ==> "c2" Fwd { msg_id: 1, slot: Slot(38), hash: "4fcd1d", header: "828a0218" }
+[...]
+
+Error message:
+
+tip of chains don't match, expected:
+(Bytes { bytes: "fcb4a51..." }, Slot(990))
+got:
+(Bytes { bytes: "gcb4a51..." }, Slot(990))
+
+Saved schedule: "./failure-1752489042.schedule"
+
+Seed: 42
+```
+
+Let's break the components down:
+
+* The minimised failing test case is printed so that it can be copy-pasted into a
+  `#[test]` to create a regression test;
+* The history is the same as the test case, but the `src` and `dest` of each
+  message have been pretty-printed to be easier to read, and it also contains the
+  responses that we got back from the system under test. The history is what the
+  property is checked against;
+* The error message describes how the property failed;
+* Saved schedule is the execution schedule from `pure-stage`, which can provide
+  low-level details about how the stage processing happened. See the following
+  [note](https://github.com/pragma-org/amaru/wiki/log-::-2025%E2%80%9006#debugging-simulation-tests-failure)
+  for more details of how to use this information;
+* The seed is what was used to produce the test case; it can be used to replay
+  the test (see `AMARU_TEST_SEED` above).

 ## References

-* [Cardano Consensus and Storage Layer](https://ouroboros-consensus.cardano.intersectmbo.org/assets/files/report-b72e7d765cfee85b26dc035c52c6de84.pdf)
-* [Ouroboros Network Specification](https://ouroboros-network.cardano.intersectmbo.org/pdfs/network-spec/network-spec.pdf)
+* [Cardano Consensus and Storage
+  Layer](https://ouroboros-consensus.cardano.intersectmbo.org/assets/files/report-b72e7d765cfee85b26dc035c52c6de84.pdf)
+* [Ouroboros Network
+  Specification](https://ouroboros-network.cardano.intersectmbo.org/pdfs/network-spec/network-spec.pdf)

simulation/amaru-sim/src/simulator/mod.rs

Lines changed: 6 additions & 1 deletion
@@ -95,6 +95,9 @@ pub struct Args {
     #[arg(long, default_value = "2")]
     pub number_of_upstream_peers: Option<u8>,

+    #[arg(long)]
+    pub disable_shrinking: bool,
+
     /// Seed for simulation testing.
     #[arg(long)]
     pub seed: Option<u64>,
@@ -352,6 +355,7 @@ pub fn run(rt: tokio::runtime::Runtime, args: Args) {
     let number_of_tests = args.number_of_tests.unwrap_or(50);
     let number_of_nodes = args.number_of_nodes.unwrap_or(1);
     let number_of_upstream_peers = args.number_of_upstream_peers.unwrap_or(2);
+    let disable_shrinking = args.disable_shrinking;
     let trace_buffer = Arc::new(parking_lot::Mutex::new(TraceBuffer::new(42, 1_000_000_000)));

     let spawn = || {
@@ -371,6 +375,7 @@ pub fn run(rt: tokio::runtime::Runtime, args: Args) {
             number_of_tests,
             seed,
             number_of_nodes,
+            disable_shrinking,
         },
         spawn,
         generate_entries(
@@ -405,7 +410,7 @@ fn chain_property(
         .expect("empty chain data");
     if actual != expected {
         return Err(format!(
-            "tip of chains don't match, expected {:?}, got {:?}",
+            "tip of chains don't match, expected:\n {:?}\n got:\n {:?}",
             expected, actual
         ));
     }
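
The last hunk above splits the tip-mismatch message over several lines so that the two tips line up for visual comparison. A minimal standalone sketch of that idea follows; the `Tip` alias and the `check_tips` function are hypothetical stand-ins for illustration, not the types or signature used by `chain_property`.

```rust
// Hypothetical stand-in for the real tip type (a hash paired with a slot).
type Tip = (String, u64);

/// Compare two chain tips. On mismatch, return a message that puts the
/// expected and actual tips on their own lines so they are easy to compare.
fn check_tips(expected: &Tip, actual: &Tip) -> Result<(), String> {
    if actual != expected {
        return Err(format!(
            "tip of chains don't match, expected:\n  {:?}\n  got:\n  {:?}",
            expected, actual
        ));
    }
    Ok(())
}
```

With this layout, two long hashes that differ in a single character end up directly above one another in the failure report.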

simulation/amaru-sim/src/simulator/simulate.rs

Lines changed: 32 additions & 18 deletions
@@ -54,6 +54,7 @@ pub struct SimulateConfig {
     pub number_of_tests: u32,
     pub seed: u64,
     pub number_of_nodes: u8,
+    pub disable_shrinking: bool,
 }

 #[derive(Debug, Clone, PartialEq, Serialize)]
@@ -289,23 +290,34 @@ pub fn simulate<Msg, F>(
     for test_number in 1..=config.number_of_tests {
         let entries: Vec<Reverse<Entry<Msg>>> = generator(&mut rng);

-        match run_test(config.number_of_nodes, &spawn, &property)(&entries) {
-            (_history, Err(reason)) => {
-                let (shrunk_entries, (shrunk_history, result), number_of_shrinks) = shrink(
-                    run_test(config.number_of_nodes, &spawn, &property),
-                    entries,
-                    |result| result.1 == Err(reason.clone()),
-                );
-                assert_eq!(Err(reason.clone()), result);
-                display_failure(
-                    test_number,
-                    config.seed,
-                    shrunk_entries,
-                    number_of_shrinks,
-                    shrunk_history,
-                    trace_buffer.clone(),
-                    reason,
-                );
+        let test = run_test(config.number_of_nodes, &spawn, &property);
+        match test(&entries) {
+            (history, Err(reason)) => {
+                if config.disable_shrinking {
+                    let number_of_shrinks = 0;
+                    display_failure(
+                        test_number,
+                        config.seed,
+                        entries,
+                        number_of_shrinks,
+                        history,
+                        trace_buffer.clone(),
+                        reason,
+                    );
+                } else {
+                    let (shrunk_entries, (shrunk_history, result), number_of_shrinks) =
+                        shrink(test, entries, |result| result.1 == Err(reason.clone()));
+                    assert_eq!(Err(reason.clone()), result);
+                    display_failure(
+                        test_number,
+                        config.seed,
+                        shrunk_entries,
+                        number_of_shrinks,
+                        shrunk_history,
+                        trace_buffer.clone(),
+                        reason,
+                    );
+                }
                 break;
             }
             (_history, Ok(())) => continue,
@@ -348,7 +360,7 @@ fn display_failure<Msg: Debug>(
          Minimised input ({number_of_shrinks} shrinks):\n\n{}\n \
          History:\n\n{}\n \
          Error message:\n\n {}\n\n \
-         {} \
+         {}\n \
          Seed: {}\n",
         test_case,
         history_string,
@@ -496,6 +508,7 @@ mod tests {
                 number_of_tests,
                 seed,
                 number_of_nodes,
+                disable_shrinking: false,
             },
             spawn,
             generator,
@@ -566,6 +579,7 @@ mod tests {
                 number_of_tests,
                 seed,
                 number_of_nodes,
+                disable_shrinking: false,
             },
             spawn,
             generate_messages,
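
The control flow added to `simulate` (report the original failing input when shrinking is disabled, otherwise shrink while the same failure reason persists) can be sketched in isolation. The code below is a simplified illustration under stated assumptions: the greedy element-dropping `shrink` and the `minimise` wrapper are hypothetical and do not mirror the signature or strategy of the `shrink` function used in this commit.

```rust
/// Hypothetical greedy shrinker: try dropping one element at a time and keep
/// the smaller input whenever the test still fails with the same reason.
/// Returns the shrunk input and the number of successful shrink steps.
fn shrink<T: Clone>(
    test: impl Fn(&[T]) -> Result<(), String>,
    mut input: Vec<T>,
    reason: &str,
) -> (Vec<T>, u32) {
    let mut shrinks = 0;
    let mut i = 0;
    while i < input.len() {
        let mut candidate = input.clone();
        candidate.remove(i);
        if test(candidate.as_slice()).err().as_deref() == Some(reason) {
            // Same failure with fewer elements: keep it and retry this index.
            input = candidate;
            shrinks += 1;
        } else {
            // Element `i` is needed to reproduce the failure; move on.
            i += 1;
        }
    }
    (input, shrinks)
}

/// Run the test once. On failure, return the (possibly shrunk) input, the
/// number of shrinks, and the failure reason. Returns `None` on success.
fn minimise<T: Clone>(
    test: impl Fn(&[T]) -> Result<(), String>,
    input: Vec<T>,
    disable_shrinking: bool,
) -> Option<(Vec<T>, u32, String)> {
    match test(input.as_slice()) {
        Ok(()) => None,
        // Shrinking disabled: report the original input with zero shrinks.
        Err(reason) if disable_shrinking => Some((input, 0, reason)),
        Err(reason) => {
            let (small, shrinks) = shrink(&test, input, &reason);
            Some((small, shrinks, reason))
        }
    }
}
```

Requiring the same failure reason during shrinking, as the commit does with `result.1 == Err(reason.clone())`, avoids minimising towards an unrelated failure. Disabling shrinking trades a larger reported input for a faster failure report.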

simulation/amaru-sim/tests/simulation.rs

Lines changed: 14 additions & 4 deletions
@@ -12,13 +12,23 @@ fn run_simulator() {
         chain_dir: "./chain.db".into(),
         block_tree_file: "tests/data/chain.json".into(),
         start_header: Hash::from([0; 32]),
-        number_of_tests: Some(50),
-        number_of_nodes: Some(1),
-        number_of_upstream_peers: Some(2),
+        number_of_tests: env::var("AMARU_NUMBER_OF_TESTS")
+            .ok()
+            .and_then(|v| v.parse::<u32>().ok())
+            .or(Some(50)),
+        number_of_nodes: env::var("AMARU_NUMBER_OF_NODES")
+            .ok()
+            .and_then(|v| v.parse::<u8>().ok())
+            .or(Some(1)),
+        number_of_upstream_peers: env::var("AMARU_NUMBER_OF_UPSTREAM_PEERS")
+            .ok()
+            .and_then(|v| v.parse::<u8>().ok())
+            .or(Some(2)),
+        disable_shrinking: std::env::var("AMARU_DISABLE_SHRINKING").is_ok_and(|v| v == "1"),
         seed: std::env::var("AMARU_TEST_SEED")
             .ok()
             .and_then(|s| s.parse().ok()),
-        persist_on_success: std::env::var("AMARU_PERSIST_ON_SUCCESS").is_ok(),
+        persist_on_success: std::env::var("AMARU_PERSIST_ON_SUCCESS").is_ok_and(|v| v == "1"),
     };

     tracing_subscriber::fmt()
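
The environment handling in this hunk follows two patterns: numeric options fall back to their default when the variable is unset or does not parse, and boolean flags are enabled only by the exact value `1`. Before this change, `AMARU_PERSIST_ON_SUCCESS` was enabled by any value, including `0`. The helpers below are a hypothetical sketch of those two patterns (the names `parse_or`, `is_enabled`, and `env_or` do not appear in the commit); the pure functions take the raw value as an argument so they can be checked without touching the process environment.

```rust
use std::env;
use std::str::FromStr;

/// Parse an optional raw value, falling back to `default` when the value is
/// missing or fails to parse (for example, out of range for the target type).
fn parse_or<T: FromStr>(raw: Option<&str>, default: T) -> T {
    raw.and_then(|v| v.parse::<T>().ok()).unwrap_or(default)
}

/// A flag is enabled only when its raw value is exactly "1".
fn is_enabled(raw: Option<&str>) -> bool {
    raw == Some("1")
}

/// Read `name` from the environment and parse it, or use `default`.
fn env_or<T: FromStr>(name: &str, default: T) -> T {
    parse_or(env::var(name).ok().as_deref(), default)
}
```

For instance, `env_or("AMARU_NUMBER_OF_TESTS", 50u32)` would yield 50 when the variable is unset. One consequence of this pattern, shared with the code in the hunk, is that a malformed value silently falls back to the default instead of raising an error.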
