feat: added EDR for errors#346
Conversation
Signed-off-by: jeluard <jeluard@users.noreply.github.com>
WalkthroughA new architecture decision record was added, laying out a standardised error handling approach for the Amaru node project. It prescribes using structured error enums with the Changes
Estimated code review effort1 (~2 minutes) Poem
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (5)
engineering-decision-records/011-error-handling.md (5)
19-23: Mind the clash with our current “panic-first” boot-strapping phaseThe ADR outlaws
panic!except in “unreachable” paths, yet we still have an explicit expect/unwrap-for-fatal policy in several subsystems (see rkuhn’s notes). Without a transition plan we’ll end up with a mixed bag of philosophies – a bit like shipping both VHS and Betamax in the same box.Consider adding one short paragraph that:
- Acknowledges the existing panic hotspots.
- Defines a milestone (e.g. “before the 0.3 release”) for migrating them to
Result.- Explains the temporary coexistence rule of thumb.
This keeps the record truthful and gives the team a north-star without blocking today’s builds.
57-60: Nit: grammar tweak & smoother wordingA couple of tiny hiccups break the reading flow:
-// If extra contexts is required, use anyhow +// If extra context is required, use `anyhow`While you’re there, maybe slip in back-ticks around
context()for consistency with the rest of the snippet.
77-81: Typo – “makese” looks like a glitch in the Matrix-// Sometimes it makese sense to have Error close to some module sub-system +// Sometimes it makes sense to keep Error close to a specific sub-system
93-107: Example can be leaner – duplicate call & verbose matchCalling
fn1_bis()twice and wrapping the result in an extramatchwith a trailing?is a tad over-caffeinated. A slimmer version reads easier and still shows down-casting:-fn main() -> anyhow::Result<()> { - fn1()?; - fn1_bis()?; - match fn1_bis() { - Ok(_) => Ok::<(), anyhow::Error>(()), - Err(e) => { - match e.downcast_ref() { - Some(Error::IoError(_)) => Ok(()), - Some(_) => Ok(()), - None => Ok(()), - } - } - }?; - Ok(()) -} +fn main() -> anyhow::Result<()> { + fn1()?; // propagates as anyhow::Error + + if let Err(err) = fn1_bis() { + if let Some(Error::IoError(_)) = err.downcast_ref::<Error>() { + // handle the specific case + } + } + Ok(()) +}Same teaching value, half the carbs.
41-49: Top-level#[error("Error")]dilutes the enum’s flavourThe derive macro already formats variants nicely; the generic “Error” string at enum level will prepend every message, turning “IO Error: …” into “Error: IO Error: …”. That double prefix feels like watching the opening credits twice.
Unless you really want that duplication, drop the outer attribute:
-#[derive(thiserror::Error, Debug)] -#[error("Error")] +#[derive(thiserror::Error, Debug)] pub enum Error {Keeps the final messages crisp.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
engineering-decision-records/011-error-handling.md(1 hunks)
🧠 Learnings (2)
📓 Common learnings
Learnt from: rkuhn
PR: pragma-org/amaru#149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via `.expect()` or `.unwrap()`) for fatal errors in the application code that would tear down the node, rather than propagating errors with `Result`. The intention is to eventually transition to proper error handling with `Result` as the codebase matures.
Learnt from: abailly
PR: pragma-org/amaru#75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: KtorZ
PR: pragma-org/amaru#0
File: :0-0
Timestamp: 2025-04-04T16:49:53.462Z
Learning: The Amaru project follows a design decision to use traits for state management interfaces that integrate validation and state updates in a single pass, while maintaining flexibility and pluggability.
Learnt from: rkuhn
PR: pragma-org/amaru#263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box<dyn Error> or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.
Learnt from: rkuhn
PR: pragma-org/amaru#149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.
Learnt from: abailly
PR: pragma-org/amaru#195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.
Learnt from: stevana
PR: pragma-org/amaru#210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.
engineering-decision-records/011-error-handling.md (5)
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via .expect() or .unwrap()) for fatal errors in the application code that would tear down the node, rather than propagating errors with Result. The intention is to eventually transition to proper error handling with Result as the codebase matures.
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.
Learnt from: rkuhn
PR: #263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.
Learnt from: abailly
PR: #75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: abailly
PR: #195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: rkuhn
PR: pragma-org/amaru#149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via `.expect()` or `.unwrap()`) for fatal errors in the application code that would tear down the node, rather than propagating errors with `Result`. The intention is to eventually transition to proper error handling with `Result` as the codebase matures.
Learnt from: abailly
PR: pragma-org/amaru#75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: KtorZ
PR: pragma-org/amaru#0
File: :0-0
Timestamp: 2025-04-04T16:49:53.462Z
Learning: The Amaru project follows a design decision to use traits for state management interfaces that integrate validation and state updates in a single pass, while maintaining flexibility and pluggability.
Learnt from: rkuhn
PR: pragma-org/amaru#263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box<dyn Error> or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.
Learnt from: rkuhn
PR: pragma-org/amaru#149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.
Learnt from: abailly
PR: pragma-org/amaru#195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.
Learnt from: stevana
PR: pragma-org/amaru#210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.
engineering-decision-records/011-error-handling.md (5)
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via .expect() or .unwrap()) for fatal errors in the application code that would tear down the node, rather than propagating errors with Result. The intention is to eventually transition to proper error handling with Result as the codebase matures.
Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.
Learnt from: rkuhn
PR: #263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.
Learnt from: abailly
PR: #75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: abailly
PR: #195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Coverage
- GitHub Check: Build on windows-latest with target x86_64-pc-windows-msvc
- GitHub Check: Snapshots (preprod, 1, 10.1.4)
- GitHub Check: Build on ubuntu-latest with target riscv32im-risc0-zkvm-elf
Codecov Report✅ All modified and coverable lines are covered by tests. 🚀 New features to boost your workflow:
|
abailly
left a comment
There was a problem hiding this comment.
I would add a couple of "counterexamples" to show what not to do
|
Any reason to close this @KtorZ ? |
|
Yes, I merged it; but manually after a rebase and a rename (other EDRs got accepted and merged in between). So Github shows it as closed 🤷. I should've commented here, my bad. I also took care of adding some points about "what not to do" and ported some of @rkuhn's comments from that gist you shared some time ago. |
|
Ahh oups 🤦 I was about to push some pitfalls, great to see you did it already :) |
|
Feel free to make a follow-up PR 🫡; I just got tired seeing it hanging there. |
Summary by CodeRabbit