feat: added EDR for errors by jeluard · Pull Request #346 · pragma-org/amaru

jeluard · 2025-07-22T13:58:32Z

Summary by CodeRabbit

Documentation
- Added a new architecture decision record outlining a standardized error handling strategy, including usage guidelines and examples for consistent error management across the project.

Signed-off-by: jeluard <jeluard@users.noreply.github.com>

coderabbitai · 2025-07-22T13:58:40Z

Walkthrough

A new architecture decision record was added, laying out a standardised error handling approach for the Amaru node project. It prescribes using structured error enums with the thiserror crate, Result<T, E> types for fallible functions, and the anyhow crate for application-level error management, complete with context-rich propagation and examples.
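As a rough illustration of the three ingredients named above, here is a minimal sketch written against the standard library only, so it compiles without extra crates. The hand-written `Display`, `Error`, and `From` impls approximate what `thiserror` would derive, and the `Contextual` wrapper is a simplified stand-in for the context that `anyhow` attaches. Every name in it (`ConfigError`, `parse_port`, `load_port`, `Contextual`) is hypothetical and not taken from the EDR.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Structured error enum for a hypothetical config subsystem.
#[derive(Debug)]
pub enum ConfigError {
    /// The port string was not a valid integer.
    InvalidPort(ParseIntError),
    /// Port 0 is rejected (an invented rule, for illustration only).
    ReservedPort,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::InvalidPort(_) => write!(f, "invalid port number"),
            ConfigError::ReservedPort => write!(f, "port 0 is reserved"),
        }
    }
}

impl Error for ConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::InvalidPort(e) => Some(e),
            ConfigError::ReservedPort => None,
        }
    }
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::InvalidPort(e)
    }
}

/// Library-level function: fallible, returns a typed error.
pub fn parse_port(raw: &str) -> Result<u16, ConfigError> {
    // `?` converts ParseIntError into ConfigError through the From impl.
    let port: u16 = raw.trim().parse()?;
    if port == 0 {
        return Err(ConfigError::ReservedPort);
    }
    Ok(port)
}

/// Simplified application-level error: a context message plus its cause.
#[derive(Debug)]
pub struct Contextual {
    context: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for Contextual {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.context)
    }
}

impl Error for Contextual {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Application-level function: adds context while propagating the error.
pub fn load_port(raw: &str) -> Result<u16, Contextual> {
    parse_port(raw).map_err(|e| Contextual {
        context: format!("failed to load port from {raw:?}"),
        source: Box::new(e),
    })
}
```

The typed enum stays matchable for callers that care about the exact failure, while the wrapper keeps the original error reachable through `source()`.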

Changes

| File(s) | Change Summary |
|---|---|
| `engineering-decision-records/011-error-handling.md` | Added architecture decision record detailing consistent error handling strategy, error enums, usage of `thiserror` and `anyhow`, and example implementations. |

Estimated code review effort

1 (~2 minutes)

Poem

G'day, mate, a record's here,
To keep our errors crystal clear.
With enums, context, never fear—
No panics lurking in the rear!
Like Mario dodging shells anew,
Our code's robust, and stylish too.
Cheers to errors handled right,
Now bugs won't keep us up at night!


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (5)

engineering-decision-records/011-error-handling.md (5)
19-23: Mind the clash with our current “panic-first” boot-strapping phase

The ADR outlaws panic! except in “unreachable” paths, yet we still have an explicit expect/unwrap-for-fatal policy in several subsystems (see rkuhn’s notes). Without a transition plan we’ll end up with a mixed bag of philosophies – a bit like shipping both VHS and Betamax in the same box. 

Consider adding one short paragraph that:

- Acknowledges the existing panic hotspots.
- Defines a milestone (e.g. “before the 0.3 release”) for migrating them to Result.
- Explains the temporary coexistence rule of thumb.

This keeps the record truthful and gives the team a north-star without blocking today’s builds.
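To make the coexistence concrete, here is a hedged sketch of what migrating a single panic hotspot to `Result` could look like. `read_slot_panicking`, `read_slot`, and `SlotError` are invented for illustration and do not come from the Amaru codebase.

```rust
use std::fmt;

/// Hypothetical error type for the migrated function.
#[derive(Debug, PartialEq)]
pub enum SlotError {
    Missing,
    NotANumber(String),
}

impl fmt::Display for SlotError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SlotError::Missing => write!(f, "slot is missing"),
            SlotError::NotANumber(raw) => write!(f, "slot {raw:?} is not a number"),
        }
    }
}

impl std::error::Error for SlotError {}

/// "Panic-first" style: any bad input tears the process down.
pub fn read_slot_panicking(input: Option<&str>) -> u64 {
    input
        .expect("slot must be present")
        .parse()
        .expect("slot must be a number")
}

/// Migrated style: same checks, but the caller decides what is fatal.
pub fn read_slot(input: Option<&str>) -> Result<u64, SlotError> {
    let raw = input.ok_or(SlotError::Missing)?;
    raw.parse()
        .map_err(|_| SlotError::NotANumber(raw.to_string()))
}
```

During a transition period, a caller that still treats the failure as fatal can call `read_slot(...)` and apply `.expect()` at the top level, which keeps the panic in one visible place.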

57-60: Nit: grammar tweak & smoother wording

A couple of tiny hiccups break the reading flow:

-// If extra contexts is required, use anyhow
+// If extra context is required, use `anyhow`

While you’re there, maybe slip in back-ticks around context() for consistency with the rest of the snippet.

77-81: Typo – “makese” looks like a glitch in the Matrix

-// Sometimes it makese sense to have Error close to some module sub-system
+// Sometimes it makes sense to keep Error close to a specific sub-system

93-107: Example can be leaner – duplicate call & verbose match

Calling fn1_bis() twice and wrapping the result in an extra match with a trailing ? is a tad over-caffeinated. A slimmer version reads easier and still shows down-casting:
-fn main() -> anyhow::Result<()> {
-    fn1()?;
-    fn1_bis()?;
-    match fn1_bis() {
-        Ok(_) => Ok::<(), anyhow::Error>(()),
-        Err(e) => {
-            match e.downcast_ref() {
-                Some(Error::IoError(_)) => Ok(()),
-                Some(_) => Ok(()),
-                None => Ok(()),
-            }
-        }
-    }?;
-    Ok(())
-}
+fn main() -> anyhow::Result<()> {
+    fn1()?;          // propagates as anyhow::Error
+
+    if let Err(err) = fn1_bis() {
+        if let Some(Error::IoError(_)) = err.downcast_ref::<Error>() {
+            // handle the specific case
+        }
+    }
+    Ok(())
+}
Same teaching value, half the carbs.
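The same down-casting idea can be tried without `anyhow`, since the standard library exposes `downcast_ref` on `dyn Error`. The sketch below is a stand-in under stated assumptions: `Error::IoError` mirrors the variant named in the diff, while `Error::Other`, the body of `fn1_bis`, and the helper `is_io_failure` are made up for illustration.

```rust
use std::error::Error as StdError;
use std::fmt;
use std::io;

/// Hypothetical error enum; only `IoError` is named in the review diff.
#[derive(Debug)]
pub enum Error {
    IoError(io::Error),
    Other(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::IoError(e) => write!(f, "IO Error: {e}"),
            Error::Other(msg) => write!(f, "{msg}"),
        }
    }
}

impl StdError for Error {}

/// Hypothetical fallible function returning a type-erased error.
pub fn fn1_bis(input: &str) -> Result<(), Box<dyn StdError>> {
    match input {
        "io" => {
            let io_err = io::Error::new(io::ErrorKind::NotFound, "missing file");
            Err(Error::IoError(io_err).into())
        }
        "other" => Err(Error::Other("something else".to_string()).into()),
        _ => Ok(()),
    }
}

/// Returns true only when the erased error is `Error::IoError`.
pub fn is_io_failure(result: &Result<(), Box<dyn StdError>>) -> bool {
    match result {
        Err(err) => matches!(err.downcast_ref::<Error>(), Some(Error::IoError(_))),
        Ok(()) => false,
    }
}
```

Down-casting recovers the concrete enum only when the caller knows which type to ask for, which is why it tends to suit application-level handling rather than library APIs.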

41-49: Top-level #[error("Error")] dilutes the enum’s flavour

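The derive macro already formats variants nicely. In thiserror, an enum-level #[error("…")] attribute acts as the fallback Display message for variants that do not declare their own; it does not prefix the per-variant messages. If every variant already carries its own #[error(...)], the generic “Error” string is dead weight, and any variant later added without a message would print a bare “Error” with no detail.

Unless you really want that fallback, drop the outer attribute: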
-#[derive(thiserror::Error, Debug)]
-#[error("Error")]
+#[derive(thiserror::Error, Debug)]
 pub enum Error {
Keeps the final messages crisp.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7757809 and 94ad105.

📒 Files selected for processing (1)

engineering-decision-records/011-error-handling.md (1 hunks)

🧠 Learnings (2)

📓 Common learnings

Learnt from: rkuhn
PR: pragma-org/amaru#149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via `.expect()` or `.unwrap()`) for fatal errors in the application code that would tear down the node, rather than propagating errors with `Result`. The intention is to eventually transition to proper error handling with `Result` as the codebase matures.

Learnt from: abailly
PR: pragma-org/amaru#75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Learnt from: KtorZ
PR: pragma-org/amaru#0
File: :0-0
Timestamp: 2025-04-04T16:49:53.462Z
Learning: The Amaru project follows a design decision to use traits for state management interfaces that integrate validation and state updates in a single pass, while maintaining flexibility and pluggability.

Learnt from: rkuhn
PR: pragma-org/amaru#263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box<dyn Error> or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.

Learnt from: rkuhn
PR: pragma-org/amaru#149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Learnt from: abailly
PR: pragma-org/amaru#195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Learnt from: stevana
PR: pragma-org/amaru#210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.
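The String-conversion learning from rkuhn in the list above can be illustrated with a small sketch. `DecodeError`, `StageState`, and `record_failure` are hypothetical names; the only point is that a `String` field keeps `#[derive(Clone)]` working where a `Box<dyn Error>` field would not.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error raised by some stage.
#[derive(Debug)]
pub struct DecodeError {
    pub offset: usize,
}

impl fmt::Display for DecodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "decode failed at byte {}", self.offset)
    }
}

impl Error for DecodeError {}

/// Simulation state that must stay cloneable so it can be snapshotted.
/// Storing `Option<Box<dyn Error>>` here would make `derive(Clone)` fail.
#[derive(Debug, Clone, PartialEq)]
pub struct StageState {
    pub processed: u64,
    pub last_error: Option<String>,
}

impl StageState {
    /// Flattens any error into its display text before storing it.
    pub fn record_failure(&mut self, err: &dyn Error) {
        self.last_error = Some(err.to_string());
    }
}
```

The trade-off is that the structured error is lost once flattened, so this conversion is best done at the boundary where the state is recorded.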

engineering-decision-records/011-error-handling.md (5)

Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via .expect() or .unwrap()) for fatal errors in the application code that would tear down the node, rather than propagating errors with Result. The intention is to eventually transition to proper error handling with Result as the codebase matures.

Learnt from: rkuhn
PR: #149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Learnt from: rkuhn
PR: #263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box<dyn Error> or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.

Learnt from: abailly
PR: #75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Learnt from: abailly
PR: #195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.
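The learning from abailly about separate result types for roll_forward and rollback can be sketched as follows. This is a toy model, not Amaru's consensus code: `Chain`, `ForwardResult`, `RollbackResult`, and the use of plain `u64` heights are all assumptions made for the example.

```rust
/// Outcome of extending the chain; it has no rollback-related variant.
#[derive(Debug, PartialEq)]
pub enum ForwardResult {
    NewTip(u64),
    NoChange,
}

/// Outcome of rolling back; it has no forward-related variant.
#[derive(Debug, PartialEq)]
pub enum RollbackResult {
    RolledBackTo(u64),
    BeyondLimit { max_rollback: usize },
}

/// Toy chain tracking strictly increasing block heights.
pub struct Chain {
    heights: Vec<u64>,
    max_rollback: usize,
}

impl Chain {
    pub fn new(max_rollback: usize) -> Self {
        Chain { heights: Vec::new(), max_rollback }
    }

    /// Appends `height` only if it is strictly above the current tip.
    pub fn roll_forward(&mut self, height: u64) -> ForwardResult {
        let tip = self.heights.last().copied();
        if matches!(tip, Some(t) if height <= t) {
            return ForwardResult::NoChange;
        }
        self.heights.push(height);
        ForwardResult::NewTip(height)
    }

    /// Drops every height above `to`, unless that exceeds the limit.
    pub fn rollback(&mut self, to: u64) -> RollbackResult {
        let keep = self.heights.iter().take_while(|&&h| h <= to).count();
        let dropped = self.heights.len() - keep;
        if dropped > self.max_rollback {
            return RollbackResult::BeyondLimit { max_rollback: self.max_rollback };
        }
        self.heights.truncate(keep);
        RollbackResult::RolledBackTo(to)
    }
}
```

Because each operation has its own result type, a caller matching on a `roll_forward` result never has to handle a rollback-only outcome, and the compiler enforces that without a runtime check.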


⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Coverage
GitHub Check: Build on windows-latest with target x86_64-pc-windows-msvc
GitHub Check: Snapshots (preprod, 1, 10.1.4)
GitHub Check: Build on ubuntu-latest with target riscv32im-risc0-zkvm-elf

codecov · 2025-07-22T14:09:11Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
see 7 files with indirect coverage changes


abailly

I would add a couple of "counterexamples" to show what not to do

KtorZ

Agreed with @abailly

jeluard · 2025-08-09T18:47:00Z

Any reason to close this @KtorZ ?

KtorZ · 2025-08-09T21:36:03Z

Yes, I merged it; but manually after a rebase and a rename (other EDRs got accepted and merged in between). So Github shows it as closed 🤷. I should've commented here, my bad.

I also took care of adding some points about "what not to do" and ported some of @rkuhn's comments from that gist you shared some time ago.

-> https://github.com/pragma-org/amaru/blob/main/engineering-decision-records/013-error-handling-strategies.md

jeluard · 2025-08-10T05:53:09Z

Ahh oups 🤦 I was about to push some pitfalls, great to see you did it already :)

KtorZ · 2025-08-10T07:13:56Z

Feel free to make a follow-up PR 🫡; I just got tired seeing it hanging there.

feat: added EDR for errors

94ad105

Signed-off-by: jeluard <jeluard@users.noreply.github.com>

coderabbitai Bot reviewed Jul 22, 2025

abailly approved these changes Jul 30, 2025

KtorZ approved these changes Aug 2, 2025

KtorZ closed this Aug 9, 2025

jeluard wants to merge 1 commit into main from jeluard/edr-errors
