feat: serve chain from store to followers#537

Merged
abailly merged 10 commits into main from abailly/follow-chain-from-store
Nov 10, 2025
@abailly abailly commented Nov 3, 2025

Contributor

This PR introduces a ChainFollower structure that keeps track of a single downstream peer's synchronization state with two modalities:

  • while the synchronisation point (initially the intersection, then advancing) is behind the anchor of the chain, the forwarded headers are loaded from the store,
  • when the synchronisation point reaches the anchor, the peer is sent the ClientOp messages accumulated in a queue while it was synchronising. This queue is initially filled with the "volatile" part of the chain, i.e. the part between the anchor and the tip.
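
The two modes described above can be illustrated with a minimal sketch. It is not the code from this PR: Slot, Store, Follower, add_op and the field names are hypothetical stand-ins, headers are reduced to their slot numbers, and only the switch from "read from store" to "drain the queue" is modelled.

```rust
use std::collections::{BTreeSet, VecDeque};
use std::ops::Bound;

/// Hypothetical stand-in: a header is identified by its slot only.
type Slot = u64;

#[derive(Debug, Clone, PartialEq)]
enum ClientOp {
    Forward(Slot),
    Backward(Slot),
}

/// Toy store holding the slots of the best chain in order.
struct Store {
    best_chain: BTreeSet<Slot>,
}

impl Store {
    fn new(slots: &[Slot]) -> Self {
        Store { best_chain: slots.iter().copied().collect() }
    }

    /// First slot on the best chain strictly after `slot`, if any.
    fn next_best_chain(&self, slot: Slot) -> Option<Slot> {
        let after: (Bound<Slot>, Bound<Slot>) = (Bound::Excluded(slot), Bound::Unbounded);
        self.best_chain.range(after).next().copied()
    }
}

/// Tracks one downstream peer (assumed shape, not the PR's struct).
struct Follower {
    sync_point: Slot,
    anchor: Slot,
    /// Ops queued while the peer catches up, seeded with the volatile part.
    ops: VecDeque<ClientOp>,
}

impl Follower {
    fn new(intersection: Slot, anchor: Slot, volatile: &[Slot]) -> Self {
        let ops = volatile.iter().map(|s| ClientOp::Forward(*s)).collect();
        Follower { sync_point: intersection, anchor, ops }
    }

    /// Queue an op produced while the peer is still synchronising.
    fn add_op(&mut self, op: ClientOp) {
        self.ops.push_back(op);
    }

    fn next_op(&mut self, store: &Store) -> Option<ClientOp> {
        if self.sync_point < self.anchor {
            // Mode 1: behind the anchor, so the next header comes from the store.
            let next = store.next_best_chain(self.sync_point)?;
            self.sync_point = next;
            Some(ClientOp::Forward(next))
        } else {
            // Mode 2: at or past the anchor, so drain the accumulated queue.
            self.ops.pop_front()
        }
    }
}
```

In this sketch, a peer intersecting at slot 1 with the anchor at slot 3 first receives the stored slots 2 and 3, and only then the queued volatile ops.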

The code is surprisingly convoluted, and there are a few edge cases one needs to deal with that should be covered by the store's invariant, e.g. the store should guarantee that if there's a child relationship between two headers, then both of them are in the store. However, it's really not obvious to me how to enforce that and expose it in the API.
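
One possible way to get such an invariant by construction, offered only as a sketch and not as what the store does today: refuse to insert a header whose parent is unknown, and return children as headers rather than hashes. HeaderStore, Header, StoreError and the u64 hash are hypothetical. A real store would also need a root case for the anchor (whose parent may have been pruned) and would have to preserve the invariant on removal, neither of which is handled here.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a header hash.
type Hash = u64;

#[derive(Debug, Clone, PartialEq)]
struct Header {
    hash: Hash,
    parent: Option<Hash>,
}

#[derive(Debug, PartialEq)]
enum StoreError {
    UnknownParent(Hash),
}

#[derive(Default)]
struct HeaderStore {
    headers: HashMap<Hash, Header>,
    children: HashMap<Hash, Vec<Hash>>,
}

impl HeaderStore {
    /// Accepts a header only if its parent (when it has one) is already stored,
    /// so every recorded parent/child edge links two stored headers.
    fn insert(&mut self, header: Header) -> Result<(), StoreError> {
        if self.headers.contains_key(&header.hash) {
            // Already stored: nothing to do, and no duplicate child entry.
            return Ok(());
        }
        if let Some(parent) = header.parent {
            if !self.headers.contains_key(&parent) {
                return Err(StoreError::UnknownParent(parent));
            }
            self.children.entry(parent).or_default().push(header.hash);
        }
        self.headers.insert(header.hash, header);
        Ok(())
    }

    /// Returns the stored child headers, so callers never hold a dangling hash.
    fn children_of(&self, hash: Hash) -> Vec<&Header> {
        self.children
            .get(&hash)
            .into_iter()
            .flatten()
            .filter_map(|h| self.headers.get(h))
            .collect()
    }
}
```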

Note that ideally I would have each ChainFollower keep an Iterator over the underlying ChainStore, but as Arc<T> is not Send here (Arc<T> is only Send when T: Send + Sync) it's seemingly not possible. I would be interested in knowing from Rust experts whether or not this is feasible, as I think it would somewhat simplify the design and possibly improve performance.
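
On the Send question, a small sketch of the standard-library rule may help: Arc<T> is Send exactly when T is Send + Sync, so an Arc over a trait object can cross threads once the object type carries those bounds. The ChainStore trait and InMemory type below are hypothetical minimal stand-ins. Whether an iterator borrowed from the real store can be made Send depends on the backing store's iterator type, which this sketch does not address.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical, minimal stand-in for the store trait.
trait ChainStore {
    fn tip(&self) -> u64;
}

struct InMemory(u64);

impl ChainStore for InMemory {
    fn tip(&self) -> u64 {
        self.0
    }
}

/// Compiles only if `T` is `Send`.
fn assert_send<T: Send>(_: &T) {}

/// Moves the shared store handle to another thread and reads the tip there.
/// The `+ Send + Sync` bounds on the trait object are what make the `Arc` `Send`.
fn read_tip_on_other_thread(store: Arc<dyn ChainStore + Send + Sync>) -> u64 {
    assert_send(&store);
    thread::spawn(move || store.tip())
        .join()
        .expect("worker thread panicked")
}
```

Dropping `+ Send + Sync` from the parameter type makes both assert_send and thread::spawn fail to compile, which is the usual symptom behind "Arc<T> is not Send".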

Fix #426

Summary by CodeRabbit

  • New Features

    • Enhanced chain store with improved next-chain lookup capability for efficient chain traversal.
  • Improvements

    • Added contextual location-aware logging to consensus validation for better error diagnostics and debugging.
  • Infrastructure

    • Added multi-client launch script to support testing multiple node deployments simultaneously.

coderabbitai Bot commented Nov 3, 2025

Contributor

Walkthrough

This pull request introduces a new next_best_chain method across the ReadOnlyChainStore trait and all its implementations (in-memory, RocksDB, Box), refactors the ChainFollower state machine from client_state.rs into a new dedicated module, updates the chain store bootstrap initialization logic to return initialized stores directly, and improves structured logging with location context in validation stages.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| ReadOnlyChainStore API Extension | crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs | Added next_best_chain(&self, point: &Point) -> Option<Point> to trait definition and forwarding implementation for Box<dyn ChainStore<H>>. |
| In-Memory Store Implementation | crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs | Implemented next_best_chain by filtering in-memory chain for the first point with slot greater than the given point's slot. |
| RocksDB Store Implementation | crates/amaru-stores/src/rocksdb/consensus/mod.rs | Implemented next_best_chain via iterator seeking to slot after given point, with four new tests validating successor resolution, genesis handling, and edge cases. |
| Store Trait Implementations | crates/amaru-consensus/src/consensus/effects/store_effects.rs, crates/amaru-consensus/src/consensus/stages/validate_header.rs | Added next_best_chain stub returning None for Store and delegating implementation for FailingStore. |
| ChainFollower Module Reorganization | crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (new), crates/amaru/src/stages/consensus/forward_chain/client_state.rs (removed), crates/amaru/src/stages/consensus/forward_chain/mod.rs | Moved ChainFollower<H> state machine to new dedicated module with extensive unit tests, test utilities (ChainStoreExt), and Debug impl; removed from client_state.rs. |
| Client Protocol Updates | crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs | Updated imports to use ChainFollower from new module, changed store clone patterns, added explicit intersection variable, improved tip logging context. |
| Bootstrap Chain Store Initialization | crates/amaru/src/stages/mod.rs | Refactored make_chain_store to return initialized Arc<dyn ChainStore<BlockHeader>> directly; removed open_chain_store; moved tip validation and anchor/best-hash setup into make_chain_store. |
| Validation & Logging | crates/amaru-consensus/src/consensus/stages/validate_block.rs | Enhanced error logs with location context (%point, %rollback_point) for failed validation, rollback, and forward-roll operations. |
| Test Infrastructure | crates/amaru/src/stages/consensus/forward_chain/test_infra.rs, crates/amaru/src/stages/consensus/forward_chain/tests.rs | Updated imports to reference test utilities from new chain_follower::tests; added chain initialization and FIRST_HEADER constant. |
| Development Dependencies | crates/amaru/Cargo.toml | Added dev-dependency on amaru-kernel with test-utils feature. |
| Multi-Client Test Script | scripts/multi-client.sh (new) | New Bash script to orchestrate launching multiple Amaru nodes with unique ports and peer discovery. |
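
To make the next_best_chain rows above more concrete, here is a hedged sketch of the trait method, a forwarding impl for Box, and an ordered-map lookup that plays the role of "seek to the slot after the given point". Only the method signature and slot_or_default come from this PR; the Point enum, OrderedStore and the BTreeMap layout are simplified assumptions, and the real RocksDB code and its genesis handling are not reproduced.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Simplified, hypothetical version of the kernel's Point type.
#[derive(Debug, Clone, PartialEq)]
enum Point {
    Origin,
    Specific(u64, [u8; 4]),
}

impl Point {
    fn slot_or_default(&self) -> u64 {
        match self {
            Point::Origin => 0,
            Point::Specific(slot, _) => *slot,
        }
    }
}

trait ReadOnlyChainStore {
    fn next_best_chain(&self, point: &Point) -> Option<Point>;
}

/// Forwarding impl so a boxed store (including a boxed trait object) is a store.
impl<S: ReadOnlyChainStore + ?Sized> ReadOnlyChainStore for Box<S> {
    fn next_best_chain(&self, point: &Point) -> Option<Point> {
        (**self).next_best_chain(point)
    }
}

/// Assumed layout: best-chain points keyed by slot, kept in slot order.
struct OrderedStore {
    best_chain: BTreeMap<u64, Point>,
}

impl ReadOnlyChainStore for OrderedStore {
    fn next_best_chain(&self, point: &Point) -> Option<Point> {
        // Start strictly after the given slot and take the first entry found.
        let after: (Bound<u64>, Bound<u64>) =
            (Bound::Excluded(point.slot_or_default()), Bound::Unbounded);
        self.best_chain.range(after).next().map(|(_, p)| p.clone())
    }
}
```

Note that with a strictly-greater comparison and Origin mapped to slot 0, a block at slot 0 would be skipped when starting from Origin; this sketch makes no claim about how the real implementations treat that case.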

Sequence Diagram(s)

sequenceDiagram
    participant Client as Downstream Client
    participant Proto as client_protocol
    participant Follower as ChainFollower
    participant Store as ReadOnlyChainStore
    
    Client->>Proto: Send intersection points
    activate Proto
    
    rect rgb(220, 250, 220)
        Note over Follower,Store: Initialization Phase
        Proto->>Follower: new(store, current_tip, points)
        activate Follower
        Follower->>Store: Load headers from current_tip backward
        Follower-->>Proto: ChainFollower ready
        deactivate Follower
    end
    
    rect rgb(220, 240, 255)
        Note over Follower,Store: Forward Chain Traversal
        loop Until intersection found or chain exhausted
            Proto->>Follower: next_op(store)
            activate Follower
            
            alt Behind anchor
                Follower->>Store: next_best_chain(current_point)
                Store-->>Follower: Option<Point>
                Follower->>Store: Load child header at Point
                Follower-->>Proto: ClientOp::Forward(header)
            else Ahead or at anchor
                Follower-->>Proto: ClientOp from ops queue
            end
            deactivate Follower
            
            Proto->>Client: Send header or intersection
        end
    end
    
    Client->>Proto: Acknowledge / Request next
    Proto-->>Client: Continue or sync complete
    deactivate Proto

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Areas requiring extra attention:

  • chain_follower.rs: Dense logic for state management, initial traversal, and forward-chain sequencing; extensive unit tests must be verified for coverage and correctness
  • stages/mod.rs: Significant bootstrap refactoring; understand migration from open_chain_store separation to unified make_chain_store initialization—verify all error paths and anchor/best-hash setup
  • next_best_chain implementations: Logic divergence between in-memory (slot filtering) and RocksDB (iterator-based) approaches; validate test coverage in RocksDB module
  • Integration between client_protocol.rs and chain_follower.rs: Store cloning patterns and intersection handling must align with the new module boundaries

Possibly related PRs

  • PR #518: Directly extends ChainFollower-driven forward-chain logic and introduces test infrastructure used by the main PR's reorganization
  • PR #464: Overlapping code-level changes to forward_chain client modules and chain-store APIs that support forward-chain logic
  • PR #228: Establishes foundational ChainStore/InMemConsensusStore functionality extended by this PR's next_best_chain additions

Suggested reviewers

  • rkuhn
  • KtorZ

Poem

🔗 The chain follower charts a steadier path,
No more the memory spike's chaotic wrath,
Like finding the next checkpoint mid-level load,
Traversing best chains down a well-lit road,
Where clients connect and don't cause a stall,
Clean refactors make the consensus enthrall! 🎮

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 61.54% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: serve chain from store to followers' accurately captures the main change: implementing ChainFollower to retrieve chain headers from the store for downstream peers. |
| Linked Issues check | ✅ Passed | The PR successfully addresses issue #426 by implementing ChainFollower that prevents memory spikes, reduces latency when serving downstream clients, and avoids stalling upstream synchronization through store-based header retrieval. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to serving chain headers from store to followers: trait additions, ChainFollower implementation, store integration, and supporting infrastructure are all aligned with issue #426 objectives. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa909f4 and 9a770b6.

📒 Files selected for processing (1)
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (15)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-05-21T18:58:48.631Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-08T14:39:50.527Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/borrowed_datum.rs:32-39
Timestamp: 2025-08-08T14:39:50.527Z
Learning: In the amaru project, when converting BorrowedDatumOption::Data to an owned DatumOption in crates/amaru-kernel/src/borrowed_datum.rs, the call `.unwrap()` refers to pallas’s KeepRaw::unwrap, which is infallible (always returns the inner value) and is not a panic risk. Future reviews should not flag this unwrap as dangerous.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-20T18:02:25.073Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via `.expect()` or `.unwrap()`) for fatal errors in the application code that would tear down the node, rather than propagating errors with `Result`. The intention is to eventually transition to proper error handling with `Result` as the codebase matures.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
🧬 Code graph analysis (1)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (5)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (4)
  • hash_point (124-129)
  • fmt (71-95)
  • fmt (114-121)
  • tip (99-104)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (6)
  • point (87-89)
  • fmt (226-245)
  • new (102-127)
  • hash (79-81)
  • mk_in_memory_store (43-77)
  • hex (83-85)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (3)
  • fmt (181-193)
  • load_from_best_chain (36-36)
  • load_from_best_chain (141-143)
crates/amaru/src/stages/mod.rs (5)
  • fmt (80-85)
  • fmt (129-134)
  • from (154-159)
  • as_tip (382-382)
  • as_tip (386-388)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (3)
  • new (35-39)
  • new (59-69)
  • load_from_best_chain (124-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Test the CLI
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build riscv32
  • GitHub Check: Build wasm32
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Test coverage
  • GitHub Check: Benches
  • GitHub Check: clippy
  • GitHub Check: Analyze (rust)


@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/amaru/Cargo.toml (1)

75-79: Remove the unnecessary build-dependency on amaru-kernel.

Good news and bad news, mate. The dev-dependency with test-utils? That's totally legit – keep it.

But here's the thing: I've had a proper squiz at your build.rs, and it only uses the built crate to write build-time info and the standard lib to set up the AMARU_NETWORK env var. There's zero usage of amaru-kernel anywhere in that build script – not even a sneaky import hiding in the background like a bonus scene in an Avengers flick.

That build-dependency entry on line 79 is pure cruft and should be binned. It's like carrying a toolbox when you only needed a hammer.

amaru-kernel = { workspace = true, features = ["test-utils"] }

[build-dependencies]
built = { workspace = true, features = ["git2"] }

Just drop the amaru-kernel.workspace = true line from build-dependencies and you're golden.

🧹 Nitpick comments (1)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)

129-146: Logic's spot on, but there's a cheeky optimization opportunity here, mate!

The core logic is absolutely sound – you're finding the next point on the best chain with a slot greater than the input point. That'll work a treat for the downstream sync use case.

That said, the implementation's doing a bit of unnecessary gymnastics with that intermediate Vec collection. You're collecting into a Vec just to check if it's empty, when the iterator could give you the answer directly. It's like making a sandwich, putting it in a box, then immediately taking it out to eat it – the box wasn't really needed, yeah?

Here's a cleaner approach that skips the intermediate collection:

 fn next_best_chain(&self, point: &Point) -> Option<Point> {
     let inner = self.inner.lock().unwrap();
     let min_slot = point.slot_or_default();
-
-    let next: Vec<&Point> = inner
-        .chain
-        .iter()
-        .filter(move |p| p.slot_or_default() > min_slot)
-        .take(1)
-        .collect();
-
-    if next.is_empty() {
-        None
-    } else {
-        Some(next[0].clone())
-    }
+    inner
+        .chain
+        .iter()
+        .find(|p| p.slot_or_default() > min_slot)
+        .cloned()
 }

Using .find() is more idiomatic and avoids the allocation – it returns Option<&Point> directly, then .cloned() gives you the owned Point. Reads cleaner too!
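
For anyone wanting to check the equivalence outside the codebase, here is a standalone sketch of both shapes. Points are reduced to plain slot numbers (a hypothetical simplification), so .copied() stands in for the .cloned() of the suggestion above.

```rust
/// Hypothetical stand-in: a point reduced to its slot number.
type Slot = u64;

/// Original shape: collect at most one match into a Vec, then inspect it.
fn next_with_collect(chain: &[Slot], min_slot: Slot) -> Option<Slot> {
    let next: Vec<&Slot> = chain
        .iter()
        .filter(|s| **s > min_slot)
        .take(1)
        .collect();

    if next.is_empty() {
        None
    } else {
        Some(*next[0])
    }
}

/// Suggested shape: `find` stops at the first match and allocates nothing.
fn next_with_find(chain: &[Slot], min_slot: Slot) -> Option<Slot> {
    chain.iter().find(|s| **s > min_slot).copied()
}
```

Both functions return the first slot strictly greater than min_slot in iteration order, or None when there is none.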

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9704206 and 92f92f4.

📒 Files selected for processing (15)
  • crates/amaru-consensus/src/consensus/effects/store_effects.rs (1 hunks)
  • crates/amaru-consensus/src/consensus/stages/validate_block.rs (3 hunks)
  • crates/amaru-consensus/src/consensus/stages/validate_header.rs (1 hunks)
  • crates/amaru-kernel/src/is_header/tests.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2 hunks)
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs (2 hunks)
  • crates/amaru/Cargo.toml (1 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (3 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/client_state.rs (0 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/mod.rs (1 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (4 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/tests.rs (1 hunks)
  • crates/amaru/src/stages/mod.rs (3 hunks)
💤 Files with no reviewable changes (1)
  • crates/amaru/src/stages/consensus/forward_chain/client_state.rs
🧰 Additional context used
🧠 Learnings (16)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/validate_header.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/mod.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/validate_header.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/validate_header.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru-consensus/src/consensus/stages/validate_block.rs
📚 Learning: 2025-05-21T18:58:48.631Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/validate_header.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/consensus/forward_chain/tests.rs
  • crates/amaru-consensus/src/consensus/effects/store_effects.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs
  • crates/amaru/src/stages/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/validate_header.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In the amaru consensus pipeline, headers are now stored before validation (store_header stage runs before validate_header stage). The validity state of headers should be tracked in the database to avoid revalidating already-valid headers and to prevent flood attacks with duplicate invalid headers.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/validate_header.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs
  • crates/amaru/src/stages/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/tests.rs
  • crates/amaru/src/stages/mod.rs
  • crates/amaru-kernel/src/is_header/tests.rs
  • crates/amaru/Cargo.toml
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru-kernel/src/is_header/tests.rs
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/tests.rs
📚 Learning: 2025-08-18T08:11:20.028Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:109-118
Timestamp: 2025-08-18T08:11:20.028Z
Learning: The proposals_roots() method in the MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is intentionally left returning all None values rather than reading from stored state, as it's planned for the upcoming major MemoryStore revamp.

Applied to files:

  • crates/amaru-consensus/src/consensus/effects/store_effects.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs
📚 Learning: 2025-08-18T08:10:32.640Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:427-433
Timestamp: 2025-08-18T08:10:32.640Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp, so unimplemented methods like set_proposals_roots and set_constitution are intentionally left as placeholders until the revamp is complete.

Applied to files:

  • crates/amaru-consensus/src/consensus/effects/store_effects.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs
  • crates/amaru/src/stages/mod.rs
📚 Learning: 2025-08-18T08:10:35.849Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:431-433
Timestamp: 2025-08-18T08:10:35.849Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp soon, so unimplemented methods like set_constitution, set_proposals_roots are intentionally left aside until the revamp is complete.

Applied to files:

  • crates/amaru-consensus/src/consensus/effects/store_effects.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru/src/stages/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-08T14:46:53.013Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/pool_params.rs:107-116
Timestamp: 2025-08-08T14:46:53.013Z
Learning: In crates/amaru-kernel/src/pool_params.rs, when serializing Relay::SingleHostAddr IPv6 to text, the project intentionally reverses each 4-byte chunk before constructing std::net::Ipv6Addr. This matches cardano-ledger’s IPv6 representation (four little-endian Word32 chunks). Do not “simplify” by passing the raw 16 bytes directly to Ipv6Addr::from; that would break ledger compatibility.

Applied to files:

  • crates/amaru-kernel/src/is_header/tests.rs
🧬 Code graph analysis (10)
crates/amaru-consensus/src/consensus/stages/validate_header.rs (3)
crates/amaru-consensus/src/consensus/effects/store_effects.rs (1)
  • next_best_chain (75-77)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)
  • next_best_chain (130-146)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2)
  • next_best_chain (40-40)
  • next_best_chain (145-147)
crates/amaru-stores/src/rocksdb/consensus/mod.rs (6)
crates/amaru-consensus/src/consensus/effects/store_effects.rs (1)
  • next_best_chain (75-77)
crates/amaru-consensus/src/consensus/stages/validate_header.rs (1)
  • next_best_chain (287-289)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)
  • next_best_chain (130-146)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2)
  • next_best_chain (40-40)
  • next_best_chain (145-147)
crates/amaru-kernel/src/point.rs (3)
  • from (65-67)
  • slot_or_default (33-38)
  • hash (40-46)
crates/amaru-stores/src/rocksdb/mod.rs (3)
  • from (157-159)
  • from (163-169)
  • iter (986-1023)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (4)
crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (4)
  • store (27-27)
  • store (50-52)
  • store (72-74)
  • store (124-126)
crates/amaru-kernel/src/point.rs (1)
  • hash (40-46)
crates/amaru-consensus/src/consensus/tip.rs (2)
  • hash (94-99)
  • hash (165-167)
crates/amaru-kernel/src/is_header/mod.rs (5)
  • hash (35-35)
  • hash (123-125)
  • hash (185-187)
  • hash (207-209)
  • header (136-138)
crates/amaru-consensus/src/consensus/effects/store_effects.rs (3)
crates/amaru-consensus/src/consensus/stages/validate_header.rs (1)
  • next_best_chain (287-289)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)
  • next_best_chain (130-146)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2)
  • next_best_chain (40-40)
  • next_best_chain (145-147)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (3)
crates/amaru-consensus/src/consensus/effects/store_effects.rs (1)
  • next_best_chain (75-77)
crates/amaru-consensus/src/consensus/stages/validate_header.rs (1)
  • next_best_chain (287-289)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2)
  • next_best_chain (40-40)
  • next_best_chain (145-147)
crates/amaru-consensus/src/consensus/stages/validate_block.rs (5)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (1)
  • point (87-89)
crates/amaru-consensus/src/consensus/effects/network_effects.rs (1)
  • point (118-123)
crates/amaru-kernel/src/consensus_events.rs (1)
  • point (115-120)
crates/amaru-consensus/src/consensus/tip.rs (2)
  • point (101-106)
  • point (169-171)
crates/amaru-kernel/src/is_header/mod.rs (1)
  • point (38-40)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (1)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1)
  • new (66-146)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (5)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (2)
  • hash_point (124-129)
  • tip (99-104)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (5)
  • point (87-89)
  • new (102-127)
  • hash (79-81)
  • mk_in_memory_store (43-77)
  • hex (83-85)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2)
  • load_from_best_chain (36-36)
  • load_from_best_chain (141-143)
crates/amaru/src/stages/mod.rs (3)
  • from (154-159)
  • as_tip (382-382)
  • as_tip (386-388)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)
  • load_from_best_chain (124-127)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (4)
crates/amaru-consensus/src/consensus/effects/store_effects.rs (1)
  • next_best_chain (75-77)
crates/amaru-consensus/src/consensus/stages/validate_header.rs (1)
  • next_best_chain (287-289)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)
  • next_best_chain (130-146)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (1)
  • point (87-89)
crates/amaru-kernel/src/is_header/tests.rs (1)
crates/amaru-kernel/src/is_header/mod.rs (1)
  • header (136-138)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Test the CLI
  • GitHub Check: Benches
  • GitHub Check: Build aarch64/macos
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Build wasm32
  • GitHub Check: Build riscv32
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Test coverage
  • GitHub Check: clippy
  • GitHub Check: Analyze (rust)
🔇 Additional comments (17)
crates/amaru-kernel/src/is_header/tests.rs (1)

64-67: Fair dinkum, this change makes total sense!

Starting the chain numbering at 1 rather than 0 aligns with blockchain conventions where there's no block 0. The explanatory comments are ripper – they make the intent crystal clear for anyone reading this test utility code later. It's like making sure everyone knows the rules before the game starts, ya know?

crates/amaru/src/stages/consensus/forward_chain/tests.rs (1)

15-15: Clean as a whistle, mate!

The import path update reflects the architectural refactor from client_state to chain_follower. No worries here – just keeping the imports aligned with the new module structure. Easy as!

crates/amaru-consensus/src/consensus/stages/validate_block.rs (1)

63-95: Bloody brilliant observability improvement!

Adding the point context to these error logs is like giving future you (or whoever's on-call) a proper map instead of just saying "something went wrong somewhere." When you're troubleshooting memory spikes or sync stalls with downstream clients (which is what this PR tackles), having that point information in the logs is gonna be absolutely clutch. It's the difference between "we failed somewhere" and "we failed at this exact point" – proper detective work, that.

crates/amaru/src/stages/consensus/forward_chain/mod.rs (1)

15-15: Module refactor looking proper!

Swapping out client_state for chain_follower makes sense given the PR's introducing a dedicated ChainFollower structure to manage downstream peer sync state. It's like reorganizing your garage so the tools are actually where you'd expect them, yeah? Keeps the architecture clean and intentional.

crates/amaru-consensus/src/consensus/stages/validate_header.rs (1)

287-289: Delegation pattern executed perfectly, champion!

The next_best_chain implementation for FailingStore follows the same delegation pattern as the other methods in this test helper – just forwards the call to the inner store. Clean as, no worries at all. It's like the FailingStore is saying "I'm not here to muck about with next_best_chain logic, I'll let the real store handle that."

crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (2)

41-41: FIRST_HEADER constant looking lonely here, mate!

You've added this FIRST_HEADER constant but it's not actually used anywhere in this file. That's totally fine though – I'm guessing it's being used in other test files that import from this test_infra module. Just wanted to give it a shout-out since constants that aren't used locally sometimes make people go "oi, what's this doing here?"


67-75: The review comment is based on a misunderstanding of how the code actually works—it's not incorrect, mate.

Here's the rub: retrieve_best_chain() doesn't depend on roll_forward_chain being called first. They're operating on two completely different levels:

  • retrieve_best_chain() walks the stored persistent header data using the anchor and best_chain_hash pointers to navigate parent-child relationships
  • roll_forward_chain() rebuilds the in-memory consensus state from those stored headers

It's like the difference between reading from a database and populating your application cache—no chicken-and-egg situation, just a sensible two-step bootstrap:

  1. Store headers + set anchors/pointers in persistent storage (lines 54-59)
  2. Walk the stored headers and replay them to rebuild in-memory state (lines 62-66)

The anchor is already set to the first header at line 54-55 before retrieve_best_chain() is called, so the function has everything it needs from day one.

Likely an incorrect or invalid review comment.

crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2)

38-40: Clean trait extension, mate!

The new next_best_chain method is well-documented and the signature makes sense. Returns the successor point on the best chain, or None if we've hit the tip or the point doesn't exist. Simple as.
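
For readers less familiar with the contract described here, a minimal in-memory sketch of a "successor on the best chain" lookup follows. The `BestChain` type, its `BTreeMap` index and the 4-byte `Hash` are hypothetical stand-ins, not the actual InMemConsensusStore, and this variant assumes that a point which is not on the chain should yield None:

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical stand-in for a header hash.
pub type Hash = [u8; 4];

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Point {
    Origin,
    Specific(u64, Hash),
}

/// Hypothetical best-chain index: slot -> header hash, ordered by slot.
#[derive(Default)]
pub struct BestChain {
    by_slot: BTreeMap<u64, Hash>,
}

impl BestChain {
    pub fn push(&mut self, slot: u64, hash: Hash) {
        self.by_slot.insert(slot, hash);
    }

    /// Successor of `point` on the best chain, or `None` when `point`
    /// is the tip or is not on the chain at all.
    pub fn next_best_chain(&self, point: &Point) -> Option<Point> {
        let lower: Bound<u64> = match point {
            // Origin precedes every stored slot.
            Point::Origin => Bound::Unbounded,
            Point::Specific(slot, hash) => {
                // An unknown point has no successor.
                if self.by_slot.get(slot) != Some(hash) {
                    return None;
                }
                // Excluding the slot avoids a `slot + 1` overflow.
                Bound::Excluded(*slot)
            }
        };
        self.by_slot
            .range((lower, Bound::<u64>::Unbounded))
            .next()
            .map(|(slot, hash)| Point::Specific(*slot, *hash))
    }
}
```

Using an excluded lower bound rather than `slot + 1` sidesteps the overflow question entirely for an ordered in-memory index.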


145-147: Standard forwarding pattern, no dramas.

The Box<dyn ChainStore> delegation is exactly what you'd expect - forwards straight through to the underlying implementation. All good here.

crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (4)

17-17: Module import updated to the new chain_follower structure.

Straightforward refactor moving ChainFollower to its own dedicated module. Clean separation of concerns - like moving from a sharehouse situation to your own flat, yeah?


190-190: Debug logging adjustment looks spot on.

Changed from our_tip.0 to just our_tip which suggests the structure changed. Debug formatting will now show the whole Tip rather than just the Point. More context in the logs is always a win!


194-207: ChainFollower initialization with store - design tradeoff acknowledged.

The ChainFollower::new now takes store.clone() and later calls to next_op also pass the store. I see from your PR description you're aware of the tradeoff here - ideally you'd keep an iterator in ChainFollower, but Arc<T> not being Send is blocking that approach.

Just for clarity: Arc::clone is cheap (just bumps a reference count), so the performance hit should be minimal. The real question is whether passing the store on every next_op call (lines 223, 231) accumulates enough overhead when serving many downstream peers.

Worth measuring if you can - spin up a downstream client that's way behind and watch the memory/CPU patterns when it catches up. If you're seeing issues with the Arc cloning frequency, might be worth exploring workarounds like:

  • Store an opaque iterator handle (if you can work around Send)
  • Cache recent chain points in ChainFollower to reduce store hits

But honestly, this looks like a solid pragmatic solution for now!


201-206: Explicit intersection handling - nice and clear!

Rather than inlining the intersection logic, you've extracted it into a variable and logged it separately. Makes the flow much easier to follow when debugging chain sync issues. Good stuff!

crates/amaru-consensus/src/consensus/effects/store_effects.rs (1)

75-77: Placeholder implementation consistent with the pattern here.

Returns None, same as load_from_best_chain above it (lines 71-73). Since Store wraps the effects system and there's no ExternalEffect defined for next_best_chain, this is just a no-op stub. The real implementations live in RocksDBStore and InMemConsensusStore. All good, mate!

crates/amaru-stores/src/rocksdb/consensus/mod.rs (1)

751-796: Test coverage is absolutely mint!

You've covered all the key scenarios:

  • Finding a valid successor (752-762)
  • Handling Origin point (764-775)
  • Gracefully returning None for invalid points (777-785)
  • Returning None when already at tip (787-796)

This gives me confidence the implementation handles edge cases properly. Nice one!

crates/amaru/src/stages/mod.rs (2)

188-192: Bootstrap flow refactor - clean separation of concerns!

The new flow is much clearer: make_chain_store returns the store directly instead of mutating a passed-in parameter. Then you load the header and build the tip. The fallback to HeaderTip::new(Point::Origin, 0) handles fresh nodes nicely. Good stuff!


272-296: Store creation and validation - thorough and clear.

The refactored make_chain_store is solid:

  1. Creates the appropriate store (in-memory or RocksDB) based on config
  2. Validates the tip exists in the store (unless it's ORIGIN - nice handling of fresh nodes there, matches the learning about Origin handling during bootstrap)
  3. Sets anchor and best chain hash to the tip
  4. Returns the store

The panic message at lines 287-290 is actually helpful - tells the user exactly what went wrong and which database path is involved. Much better than a cryptic "unwrap on None" panic.

One tiny thing to consider: the panic could theoretically be avoided by returning a Result and letting the caller decide how to handle the missing tip. But given this is bootstrap code and a missing tip in an existing database is a critical misconfiguration that should stop the node, the panic is reasonable here. Your call whether you want to make it a proper error type instead.

Comment on lines +253 to +270
fn next_best_chain(&self, point: &Point) -> Option<Point> {
    let readopts = ReadOptions::default();
    let prefix = [&CHAIN_PREFIX[..], &(u64::from(point.slot_or_default()) + 1).to_be_bytes()].concat();
    let mut iter = self.db.iterator_opt(IteratorMode::From(&prefix, rocksdb::Direction::Forward), readopts);

    if let Some(Ok((k, v))) = iter.next() {
        let slot_bytes = &k[CHAIN_PREFIX.len()..CHAIN_PREFIX.len() + 8];
        let slot = u64::from_be_bytes(slot_bytes.try_into().unwrap());
        if v.len() == HEADER_HASH_SIZE {
            let hash = HeaderHash::from(v.as_ref());
            Some(Point::Specific(slot, hash.to_vec()))
        } else {
            None
        }
    } else {
        None
    }
}

Contributor

⚠️ Potential issue | 🟠 Major

Implementation looks solid, but there's a potential panic lurking.

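The logic is straightforward - start iterating from slot+1, grab the first result, parse it, done. However, lines 259-260 can panic on a malformed key: the slice indexing on line 259 panics if the key is shorter than CHAIN_PREFIX.len() + 8 bytes, and line 260 adds an .unwrap() on top of it: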

let slot = u64::from_be_bytes(slot_bytes.try_into().unwrap());

If the database is corrupted or the key format somehow changes, this'll bring the whole node down faster than you can say "crikey".

Consider replacing with:

-                    let slot = u64::from_be_bytes(slot_bytes.try_into().unwrap());
+                    let slot_array: [u8; 8] = match slot_bytes.try_into() {
+                        Ok(arr) => arr,
+                        Err(_) => {
+                            tracing::error!("Invalid slot bytes length in chain store");
+                            return None;
+                        }
+                    };
+                    let slot = u64::from_be_bytes(slot_array);

Or at minimum use .expect() with a descriptive message so you know what went wrong when investigating the crash.

Also, a minor observation: IteratorMode::From only seeks to the first key at or after the given bytes; unless a prefix extractor and prefix_same_as_start are configured, it does not restrict iteration to keys starting with CHAIN_PREFIX. When no chain entry follows the requested slot, the first key returned may belong to another prefix, so an explicit starts_with check is worth adding.
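
In case it helps, here is a minimal, self-contained sketch of a key parser that cannot panic on a short or foreign key. The `CHAIN_PREFIX` value and the `parse_chain_slot` helper are assumptions made for illustration (the real prefix and key layout live in the store); only the standard library is used:

```rust
/// Assumed key layout: CHAIN_PREFIX followed by the slot as 8 big-endian bytes.
/// The prefix value is a placeholder, not the one used by the actual store.
pub const CHAIN_PREFIX: &[u8] = b"chain";

/// Extracts the slot from a chain key without indexing or unwrapping.
/// Returns `None` if the prefix does not match or the key is too short.
pub fn parse_chain_slot(key: &[u8]) -> Option<u64> {
    // `strip_prefix` doubles as the "is this a chain key?" check.
    let rest = key.strip_prefix(CHAIN_PREFIX)?;
    // `get(..8)` yields `None` instead of panicking on a short key.
    let slot_slice = rest.get(..8)?;
    let mut slot_bytes = [0u8; 8];
    // Lengths are both exactly 8 here, so this copy cannot panic.
    slot_bytes.copy_from_slice(slot_slice);
    Some(u64::from_be_bytes(slot_bytes))
}
```

A caller could log and return None when `parse_chain_slot` fails, which keeps a corrupted entry from taking the node down.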

@codecov

codecov Bot commented Nov 3, 2025

Codecov Report

❌ Patch coverage is 90.06024% with 33 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...c/stages/consensus/forward_chain/chain_follower.rs | 95.83% | 10 Missing ⚠️ |
| crates/amaru/src/stages/mod.rs | 0.00% | 9 Missing ⚠️ |
| ...u-consensus/src/consensus/effects/store_effects.rs | 0.00% | 3 Missing ⚠️ |
| ...u-consensus/src/consensus/stages/validate_block.rs | 0.00% | 3 Missing ⚠️ |
| ...-consensus/src/consensus/stages/validate_header.rs | 0.00% | 3 Missing ⚠️ |
| ...amaru-ouroboros-traits/src/stores/consensus/mod.rs | 0.00% | 3 Missing ⚠️ |
| crates/amaru-stores/src/rocksdb/consensus/mod.rs | 97.82% | 1 Missing ⚠️ |
| ...u/src/stages/consensus/forward_chain/test_infra.rs | 83.33% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| .../src/stores/consensus/in_memory_consensus_store.rs | 97.60% <100.00%> (+0.27%) ⬆️ |
| .../stages/consensus/forward_chain/client_protocol.rs | 62.88% <100.00%> (+0.71%) ⬆️ |
| .../amaru/src/stages/consensus/forward_chain/tests.rs | 100.00% <ø> (ø) |
| crates/amaru-stores/src/rocksdb/consensus/mod.rs | 88.54% <97.82%> (+0.59%) ⬆️ |
| ...u/src/stages/consensus/forward_chain/test_infra.rs | 74.50% <83.33%> (+0.36%) ⬆️ |
| ...u-consensus/src/consensus/effects/store_effects.rs | 63.83% <0.00%> (-0.87%) ⬇️ |
| ...u-consensus/src/consensus/stages/validate_block.rs | 63.63% <0.00%> (ø) |
| ...-consensus/src/consensus/stages/validate_header.rs | 52.82% <0.00%> (-0.83%) ⬇️ |
| ...amaru-ouroboros-traits/src/stores/consensus/mod.rs | 48.61% <0.00%> (-2.12%) ⬇️ |
| crates/amaru/src/stages/mod.rs | 10.50% <0.00%> (+0.23%) ⬆️ |

... and 1 more

... and 11 files with indirect coverage changes

@etorreborre etorreborre left a comment

Contributor

⛓️ It would be good to be able to test that logic in the simulation and see, for example, that the nodes converge on the same chain even if they start with different states.

.take(1)
.collect();

if next.is_empty() {

Contributor

next.first().cloned()?

})
}

pub fn next_op(&mut self, store: Arc<dyn ReadOnlyChainStore<H>>) -> Option<ClientOp<H>> {

Contributor

Why pass a store reference again here since we already have self.store?

Contributor Author

I don't think we have it in self, or at least I removed it! I need to pass it because Arc<dyn ChainStore> is not Send and therefore I cannot store it in self. The solution I found is to pass it to next_op.

Contributor

I tried it and that works fine. See my commit in abailly/follow-chain-from-store-eric.
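
For context on the Send question in this thread: `Arc<T>` is `Send` exactly when `T: Send + Sync`, so a bare `Arc<dyn Trait>` is not `Send`, while a trait object that carries the auto-trait bounds is. Below is a minimal sketch of a follower that owns its store handle and is moved to another thread. `ReadOnlyStore`, `MemStore` and `Follower` are hypothetical stand-ins rather than the Amaru types, and whether the real trait can carry these bounds is an assumption:

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical read-only store trait standing in for the real one.
pub trait ReadOnlyStore {
    /// First stored slot strictly greater than `slot`, if any.
    fn next(&self, slot: u64) -> Option<u64>;
}

/// Toy store holding slots in ascending order.
pub struct MemStore(pub Vec<u64>);

impl ReadOnlyStore for MemStore {
    fn next(&self, slot: u64) -> Option<u64> {
        self.0.iter().copied().find(|s| *s > slot)
    }
}

/// Follower owning its store handle. The `+ Send + Sync` bounds on the
/// trait object are what make the `Arc` (and thus `Follower`) `Send`.
pub struct Follower {
    store: Arc<dyn ReadOnlyStore + Send + Sync>,
    cursor: u64,
}

impl Follower {
    pub fn new(store: Arc<dyn ReadOnlyStore + Send + Sync>) -> Self {
        Follower { store, cursor: 0 }
    }

    /// Advances the cursor using the owned store; no store argument needed.
    pub fn next_op(&mut self) -> Option<u64> {
        let next = self.store.next(self.cursor)?;
        self.cursor = next;
        Some(next)
    }
}

/// Moves the follower to another thread and drains it there.
/// This compiles only because `Follower` is `Send`.
pub fn drain_on_thread(mut follower: Follower) -> Vec<u64> {
    thread::spawn(move || {
        let mut out = Vec::new();
        while let Some(op) = follower.next_op() {
            out.push(op);
        }
        out
    })
    .join()
    .expect("follower thread panicked")
}
```

Dropping `+ Send + Sync` from the field type makes `thread::spawn` reject the closure, which matches the compiler error described above.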

@abailly abailly force-pushed the abailly/follow-chain-from-store branch from 92f92f4 to ea28195 Compare November 4, 2025 08:12

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

♻️ Duplicate comments (3)
crates/amaru-stores/src/rocksdb/consensus/mod.rs (1)

253-270: Handle slot overflow and dodgy key bytes safely

Line 255 is doing a raw + 1 on the slot; hit u64::MAX and we wrap (release) or panic (debug), which is the opposite of “resilient followers”. Then Line 260 slices and unwrap()s without checking the key prefix/length, so a single bit-flip in RocksDB can drop the whole node faster than Ganondorf grabbing the Master Sword. Let’s fail soft instead: guard the increment with checked_add, make sure the key really starts with CHAIN_PREFIX, and only turn the bytes into a slot when the slice converts cleanly (ditch the unwrap).

-                let prefix = [&CHAIN_PREFIX[..], &(u64::from(point.slot_or_default()) + 1).to_be_bytes()].concat();
+                let Some(start_slot) = u64::from(point.slot_or_default()).checked_add(1) else {
+                    return None;
+                };
+                let prefix = [&CHAIN_PREFIX[..], &start_slot.to_be_bytes()].concat();
                 let mut iter = self.db.iterator_opt(IteratorMode::From(&prefix, rocksdb::Direction::Forward), readopts);

                 if let Some(Ok((k, v))) = iter.next() {
-                    let slot_bytes = &k[CHAIN_PREFIX.len()..CHAIN_PREFIX.len() + 8];
-                    let slot = u64::from_be_bytes(slot_bytes.try_into().unwrap());
+                    if !k.starts_with(&CHAIN_PREFIX[..]) {
+                        return None;
+                    }
+                    let slot_bytes: [u8; 8] = match k[CHAIN_PREFIX.len()..].get(..8).and_then(|slice| slice.try_into().ok()) {
+                        Some(bytes) => bytes,
+                        None => {
+                            tracing::warn!("invalid chain slot key length in RocksDB");
+                            return None;
+                        }
+                    };
+                    let slot = u64::from_be_bytes(slot_bytes);
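
The overflow guard on its own can be sketched as a small helper. This is an illustration under assumptions: `CHAIN_PREFIX` is a placeholder value and `successor_seek_key` is a hypothetical name, not a function from the store:

```rust
/// Placeholder prefix; the real value is defined by the store.
pub const CHAIN_PREFIX: &[u8] = b"chain";

/// Builds the key to seek to when looking for the first chain entry
/// strictly after `slot`. Returns `None` when `slot` is `u64::MAX`,
/// since no later slot can exist.
pub fn successor_seek_key(slot: u64) -> Option<Vec<u8>> {
    // `checked_add` replaces the wrapping/panicking `+ 1`.
    let next = slot.checked_add(1)?;
    let mut key = CHAIN_PREFIX.to_vec();
    // Big-endian encoding keeps byte-wise key order equal to slot order.
    key.extend_from_slice(&next.to_be_bytes());
    Some(key)
}
```

Because the slot is encoded big-endian, lexicographic ordering of the keys matches numeric ordering of the slots, which is what makes a forward seek from this key meaningful.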
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (2)

185-186: Mate, you're ghosting the intersection update when draining the ops queue!

So here's the deal: when you pop_front() an op at line 185, you're sending it downstream but leaving self.intersection frozen in time—it's still pointing at the old value (usually the anchor or initial intersection). Next time next_op gets called, it's making decisions with yesterday's news, and anyone calling intersection_found() is gonna see a cursor that's stuck like a scratched DVD.

Think of it like this: you're sending headers downstream but not updating your own map. It's like playing Civilization and forgetting to update the fog of war—total chaos!

Before you return the op, you need to update self.intersection to match what you're sending. Grab the tip from the op and advance the cursor:

-        self.ops.pop_front()
+        if let Some(op) = self.ops.pop_front() {
+            self.intersection = op.tip();
+            Some(op)
+        } else {
+            None
+        }

The ClientOp already has a .tip() method (see client_protocol.rs lines 98-103 in the snippets), so just use that to keep your state consistent with what you've sent, yeah?
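
To make the intended invariant concrete, here is a self-contained toy model of the queue-draining path in which the cursor always equals the tip of the last op handed out. `Tip`, `ClientOp` and `Follower` are simplified, hypothetical versions of the real types:

```rust
use std::collections::VecDeque;

/// Simplified tip: just a block height.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tip(pub u64);

/// Simplified client operations, each carrying the resulting tip.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ClientOp {
    Forward(Tip),
    Backward(Tip),
}

impl ClientOp {
    pub fn tip(&self) -> Tip {
        match self {
            ClientOp::Forward(t) | ClientOp::Backward(t) => t.clone(),
        }
    }
}

pub struct Follower {
    pub intersection: Tip,
    pub ops: VecDeque<ClientOp>,
}

impl Follower {
    /// Pops the next queued op and advances the cursor to its tip,
    /// so the follower's state reflects what was actually sent.
    pub fn next_op(&mut self) -> Option<ClientOp> {
        let op = self.ops.pop_front()?;
        self.intersection = op.tip();
        Some(op)
    }
}
```

When the queue is empty the cursor is left untouched, so repeated calls after catching up do not move it.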


108-134: Crikey! The tip header's gone walkabout – it never makes it into the ops queue.

Right, so here's the craic: your backward walk only collects parent headers (line 126), but the start_header itself—the actual tip—never gets queued. When the intersection sits at or near the anchor, you'll break early with an empty or partial headers vec, and the client ends up like Marty McFly stuck in 1955—never quite catching up to the present.

Picture this: intersection == anchor. The while loop bails immediately, headers is empty, client gets a rollback to anchor, maybe one forward for anchor+1, but the current tip? Ghost. Vanished. Gone like Keyser Söze.

To fix it, you need to enqueue the start_header before you start the walk, or append it after reversing. Something like:

 let mut current_header = start_header;
 let mut headers = vec![];
+
+// Include the starting header (the tip) in the ops
+headers.push(ClientOp::Forward(current_header.clone()));

 // walk backwards until either:
 // 1. we find our intersection
 // 2. or we find the anchor
 while let Some(parent_hash) = current_header.parent() {

or after the reverse:

 // headers contains a list of Fwd operations in reverse order
 headers.reverse();
+// Ensure the tip (start_header) is included
+headers.push(ClientOp::Forward(start_header));

Either way, make sure every block above the intersection gets sent downstream, mate. No one likes an incomplete sync!
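
A toy version of the backward walk that does include the tip might look like the sketch below. The `Header` type with integer ids and the `headers_after` helper are hypothetical; the real code walks hashes in the chain store and wraps each header in a ClientOp:

```rust
use std::collections::HashMap;

/// Hypothetical header: an id plus an optional parent id.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Header {
    pub id: u32,
    pub parent: Option<u32>,
}

/// Collects the headers strictly after `stop` up to and including `tip`,
/// oldest first. The walk ends at `stop`, at a header without a parent,
/// or at a header missing from the store.
pub fn headers_after(store: &HashMap<u32, Header>, tip: u32, stop: u32) -> Vec<Header> {
    let mut headers = Vec::new();
    let mut current = tip;
    while current != stop {
        let header = match store.get(&current) {
            Some(h) => h,
            None => break, // broken parent link: stop rather than panic
        };
        // The tip itself is pushed on the first iteration.
        headers.push(header.clone());
        match header.parent {
            Some(parent) => current = parent,
            None => break,
        }
    }
    // Collected newest-first; the client needs oldest-first.
    headers.reverse();
    headers
}
```

Pushing the current header before following its parent link is what guarantees the tip ends up in the result whenever the tip differs from the stop point.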

🧹 Nitpick comments (1)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1)

253-253: Fair dinkum, you could tidy up these test assertions with .as_tip() for consistency.

In your test assertions at lines 253, 297, and 315, you're manually constructing Tip values with hardcoded heights or points. While this works fine, it's a bit like hand-rolling your own crypto—you can do it, but why not use the library?

Consider loading the header and calling .as_tip() instead. It's more consistent with the codebase (see relevant snippets from client_protocol.rs), and you won't risk typos in the height values. Something like:

// Line 253
let start_header = store.load_header(&hash_point(&tip)).unwrap();
assert_eq!(
    chain_follower.next_op(store.clone()),
    Some(ClientOp::Backward(start_header.as_tip()))
);
// Line 297
let expected_header = store.load_header(&Hash::from(hex::decode(WINNER_47).unwrap().as_slice())).unwrap();
assert_eq!(
    chain_follower.next_op(store.clone()),
    Some(ClientOp::Backward(expected_header.as_tip()))
);

For line 315, Tip(Point::Origin, 0) is already pretty clear, so maybe leave that one as-is unless you've got a canonical Origin tip somewhere.

Not a blocker at all, just a bit of polish to keep things idiomatic. Based on past review comments suggesting .as_tip() in similar spots.

Also applies to: 297-297, 315-315

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92f92f4 and ea28195.

📒 Files selected for processing (15)
  • crates/amaru-consensus/src/consensus/effects/store_effects.rs (1 hunks)
  • crates/amaru-consensus/src/consensus/stages/validate_block.rs (3 hunks)
  • crates/amaru-consensus/src/consensus/stages/validate_header.rs (1 hunks)
  • crates/amaru-kernel/src/is_header/tests.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2 hunks)
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs (2 hunks)
  • crates/amaru/Cargo.toml (1 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (3 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/client_state.rs (0 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/mod.rs (1 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (4 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/tests.rs (1 hunks)
  • crates/amaru/src/stages/mod.rs (3 hunks)
💤 Files with no reviewable changes (1)
  • crates/amaru/src/stages/consensus/forward_chain/client_state.rs
🚧 Files skipped from review as they are similar to previous changes (5)
  • crates/amaru-consensus/src/consensus/stages/validate_block.rs
  • crates/amaru-consensus/src/consensus/effects/store_effects.rs
  • crates/amaru-consensus/src/consensus/stages/validate_header.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/tests.rs
🧰 Additional context used
🧠 Learnings (19)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-05-21T18:58:48.631Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs
  • crates/amaru/src/stages/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru-kernel/src/is_header/tests.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In the amaru consensus pipeline, headers are now stored before validation (store_header stage runs before validate_header stage). The validity state of headers should be tracked in the database to avoid revalidating already-valid headers and to prevent flood attacks with duplicate invalid headers.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
📚 Learning: 2025-08-18T08:11:20.028Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:109-118
Timestamp: 2025-08-18T08:11:20.028Z
Learning: The proposals_roots() method in the MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is intentionally left returning all None values rather than reading from stored state, as it's planned for the upcoming major MemoryStore revamp.

Applied to files:

  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-08-18T08:10:35.849Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:431-433
Timestamp: 2025-08-18T08:10:35.849Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp soon, so unimplemented methods like set_constitution, set_proposals_roots are intentionally left aside until the revamp is complete.

Applied to files:

  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-08-18T08:10:32.640Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:427-433
Timestamp: 2025-08-18T08:10:32.640Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp, so unimplemented methods like set_proposals_roots and set_constitution are intentionally left as placeholders until the revamp is complete.

Applied to files:

  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs
  • crates/amaru/src/stages/mod.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-08-08T14:46:53.013Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/pool_params.rs:107-116
Timestamp: 2025-08-08T14:46:53.013Z
Learning: In crates/amaru-kernel/src/pool_params.rs, when serializing Relay::SingleHostAddr IPv6 to text, the project intentionally reverses each 4-byte chunk before constructing std::net::Ipv6Addr. This matches cardano-ledger’s IPv6 representation (four little-endian Word32 chunks). Do not “simplify” by passing the raw 16 bytes directly to Ipv6Addr::from; that would break ledger compatibility.

Applied to files:

  • crates/amaru-kernel/src/is_header/tests.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • crates/amaru-kernel/src/is_header/tests.rs
  • crates/amaru/Cargo.toml
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru/src/stages/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-01-21T15:32:17.911Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 69
File: crates/amaru/src/ledger/state/diff_epoch_reg.rs:112-117
Timestamp: 2025-01-21T15:32:17.911Z
Learning: When suggesting code changes in Rust, always verify that the types align correctly, especially when dealing with references and Options. The `Fold::Registered` variant in `diff_epoch_reg.rs` expects a reference `&'a V`, so unwrapping an `Option<&V>` requires only a single `.expect()`.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-08-08T14:39:50.527Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/borrowed_datum.rs:32-39
Timestamp: 2025-08-08T14:39:50.527Z
Learning: In the amaru project, when converting BorrowedDatumOption::Data to an owned DatumOption in crates/amaru-kernel/src/borrowed_datum.rs, the call `.unwrap()` refers to pallas’s KeepRaw::unwrap, which is infallible (always returns the inner value) and is not a panic risk. Future reviews should not flag this unwrap as dangerous.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
🧬 Code graph analysis (6)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (5)
crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (4)
  • store (27-27)
  • store (50-52)
  • store (72-74)
  • store (124-126)
crates/amaru-consensus/src/consensus/headers_tree/data_generation/actions.rs (1)
  • hash (70-75)
crates/amaru-kernel/src/point.rs (1)
  • hash (40-46)
crates/amaru-consensus/src/consensus/tip.rs (2)
  • hash (94-99)
  • hash (165-167)
crates/amaru-kernel/src/is_header/mod.rs (5)
  • hash (35-35)
  • hash (123-125)
  • hash (185-187)
  • hash (207-209)
  • header (136-138)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (3)
crates/amaru-consensus/src/consensus/effects/store_effects.rs (1)
  • next_best_chain (75-77)
crates/amaru-consensus/src/consensus/stages/validate_header.rs (1)
  • next_best_chain (287-289)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2)
  • next_best_chain (40-40)
  • next_best_chain (145-147)
crates/amaru-kernel/src/is_header/tests.rs (1)
crates/amaru-kernel/src/is_header/mod.rs (1)
  • header (136-138)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (1)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1)
  • new (66-146)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (5)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (4)
  • hash_point (124-129)
  • fmt (71-95)
  • fmt (114-121)
  • tip (99-104)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (6)
  • point (87-89)
  • fmt (226-245)
  • new (102-127)
  • hash (79-81)
  • mk_in_memory_store (43-77)
  • hex (83-85)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (3)
  • fmt (181-193)
  • load_from_best_chain (36-36)
  • load_from_best_chain (141-143)
crates/amaru/src/stages/mod.rs (5)
  • fmt (80-85)
  • fmt (129-134)
  • from (154-159)
  • as_tip (382-382)
  • as_tip (386-388)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)
  • load_from_best_chain (124-127)
crates/amaru-stores/src/rocksdb/consensus/mod.rs (5)
crates/amaru-consensus/src/consensus/effects/store_effects.rs (1)
  • next_best_chain (75-77)
crates/amaru-consensus/src/consensus/stages/validate_header.rs (1)
  • next_best_chain (287-289)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)
  • next_best_chain (130-146)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2)
  • next_best_chain (40-40)
  • next_best_chain (145-147)
crates/amaru-kernel/src/point.rs (3)
  • from (65-67)
  • slot_or_default (33-38)
  • hash (40-46)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Test the CLI
  • GitHub Check: Benches
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build riscv32
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build aarch64/linux
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Test coverage
  • GitHub Check: Analyze (rust)
🔇 Additional comments (4)
crates/amaru/Cargo.toml (2)

74-74: Confirm the purpose of adding test-utils feature for dev-dependency.

The addition makes sense for the test scaffolding mentioned in the PR context—your new ChainFollower and next_best_chain functionality likely need test utilities. However, check that the test-utils feature is indeed what you need (rather than something else from amaru-kernel).


78-78: Clarify why the build script needs amaru-kernel.

You're adding amaru-kernel to [build-dependencies] with no feature specification. This is a bit like bringing a full toolkit to the construction site when you've already got your hammer—what's the build.rs actually using from amaru-kernel? Is this for generating code, or is there something else going on?

crates/amaru-kernel/src/is_header/tests.rs (1)

64-67: Solid as a rock, mate – this one's good to go.

Dead right on the numbering convention. I've dug through the actual test code where any_headers_chain gets used – across the rocksdb consensus tests and the select_chain stages – and there's bugger all hardcoded assertions checking for block 0 or slot 0. The tests are all structural checks: they're validating parent-child hash linkages, storing/retrieving headers, and checking the chain relationships. They're not relying on specific numeric values.

The beauty here is you've kept the parent hash logic ('prev_hash') untouched, so all that chain-linking mojo still works exactly the same. It's like swapping the house numbers on a street – the roads and connections stay put, just the addresses change. And since those addresses now match blockchain semantics (no genesis block 0 weirdness), everyone's happy.

The +1 offset is consistent across both block_number and slot, which is clean. Comments are crystal clear too – future devs won't scratch their heads wondering why you did the +1 dance.

crates/amaru/src/stages/mod.rs (1)

286-295: Love the early sanity check

Digging this guard—bailing out if the ledger tip isn’t actually in the store is like checking the Mako’s fuel before cruising across Illium. Saves us from some gnarly half-synced states later on.

Comment thread crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea28195 and 79fd911.

📒 Files selected for processing (1)
  • scripts/multi-client.sh (1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.11.0)
scripts/multi-client.sh

[warning] 14-14: Quote this to prevent word splitting.

(SC2046)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Build riscv32
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Benches
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Test the CLI
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Test coverage
  • GitHub Check: Analyze (rust)
🔇 Additional comments (1)
scripts/multi-client.sh (1)

15-15: Clarify the intent of the 60-second sleep—it's a long pause between launches.

Is this to give each node time to bootstrap and stabilize before the next one starts? If so, that'll add significant latency for testing (10 minutes for 10 nodes, innit?). If it's intentional, a comment explaining why would help future maintainers. If not, consider making it configurable or reducing it based on empirical testing.

Comment thread scripts/multi-client.sh
Comment on lines +7 to +8
NUM_NODES=$1

Contributor


⚠️ Potential issue | 🔴 Critical

Add argument validation—the script will fail cryptically if NUM_NODES is missing or bogus.

Right now, if someone forgets to pass NUM_NODES or passes something that's not a number, the script either runs forever or exits with an unhelpful error from the seq command. Like trying to find an NPC in an open-world game without a quest marker, y'know? Add a quick validation check at the top.

 NUM_NODES=$1
+
+if [[ -z "$NUM_NODES" ]] || ! [[ "$NUM_NODES" =~ ^[1-9][0-9]*$ ]]; then
+    echo "Error: NUM_NODES must be a positive integer" >&2
+    exit 1
+fi
🤖 Prompt for AI Agents
In scripts/multi-client.sh around lines 7 to 8, NUM_NODES is taken directly from
$1 with no validation causing cryptic failures; add a short validation block
that (1) checks that $1 is present, (2) ensures it is a positive integer (e.g.
regex or POSIX test), and (3) prints a clear usage message like "Usage:
multi-client.sh NUM_NODES (positive integer)" and exits with a non-zero status
if the check fails; keep the check near the top before any use of NUM_NODES so
subsequent seq/loops only run with a valid value.

Comment thread scripts/multi-client.sh Outdated
Comment on lines +11 to +12
amaru bootstrap --ledger-dir "${NODE_ID}/ledger.db" --chain-dir "${NODE_ID}/chain.db" --network preview
amaru --service-name $NODE_ID --with-open-telemetry daemon --peer-address localhost:3000 --listen-address localhost:$(( 4000 + $i)) --ledger-dir "${NODE_ID}/ledger.db" --chain-dir "${NODE_ID}/chain.db" --network preview > $NODE_ID.log 2>&1 &

Contributor


⚠️ Potential issue | 🟠 Major

No error handling if bootstrap fails—the entire script exits with set -e.

If bootstrap fails on node 2 of 5, the whole orchestration stops, leaving node 1 running orphaned. That's a rough state to debug. Consider wrapping the bootstrap in a conditional or at minimum logging the failure and continuing (depending on your UX goals).

Also, the daemon launch command on line 12 is a bit of a wall-of-text. Consider breaking it into variables for readability:

     NODE_ID=amaru-$i
+    NODE_LOG=$NODE_ID.log
+    NODE_PORT=$((4000 + i))
+
     amaru bootstrap --ledger-dir "${NODE_ID}/ledger.db" --chain-dir "${NODE_ID}/chain.db" --network preview
-    amaru --service-name $NODE_ID --with-open-telemetry daemon --peer-address localhost:3000 --listen-address localhost:$(( 4000 + $i))  --ledger-dir "${NODE_ID}/ledger.db" --chain-dir "${NODE_ID}/chain.db" --network preview > $NODE_ID.log 2>&1 &
+    amaru --service-name "$NODE_ID" \
+        --with-open-telemetry daemon \
+        --peer-address localhost:3000 \
+        --listen-address "localhost:${NODE_PORT}" \
+        --ledger-dir "${NODE_ID}/ledger.db" \
+        --chain-dir "${NODE_ID}/chain.db" \
+        --network preview \
+        > "$NODE_LOG" 2>&1 &
🤖 Prompt for AI Agents
In scripts/multi-client.sh around lines 11 to 12, the bootstrap step is
currently run under set -e so a single bootstrap failure will terminate the
entire script and leave already-started nodes orphaned; modify the logic to run
the bootstrap command in a conditional (capture its exit code), log a clear
error message (with node id and exit code) if it fails, and decide whether to
continue or retry based on that code (e.g., continue to next node after
logging). Also simplify the long daemon invocation by extracting its options
into descriptive variables (e.g., PEER_ADDR, LISTEN_ADDR, LEDGER_DIR, CHAIN_DIR,
LOG_FILE) and then using those variables in the amaru daemon command and
redirect, improving readability and making future error handling or retries
easier.

Comment thread scripts/multi-client.sh Outdated

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 0

♻️ Duplicate comments (2)
scripts/multi-client.sh (2)

16-17: Missing argument validation: the script will crater if NUM_NODES or PEER_ADDRESS is missing or dodgy.

Right now this reads like jumping into a boss fight without checking your weapon—you'll get rekt by cryptic errors from seq or the daemon if someone forgets an argument or passes garbage. NUM_NODES should be validated as a positive integer, and PEER_ADDRESS should at least be checked for non-empty.

Apply this validation block right after line 17:

 NUM_NODES=$1
 PEER_ADDRESS=$2
+
+if [[ -z "$NUM_NODES" ]] || ! [[ "$NUM_NODES" =~ ^[1-9][0-9]*$ ]]; then
+    echo "Error: NUM_NODES must be a positive integer" >&2
+    exit 1
+fi
+
+if [[ -z "$PEER_ADDRESS" ]]; then
+    echo "Error: PEER_ADDRESS cannot be empty" >&2
+    exit 1
+fi

22-22: No error handling if bootstrap fails—entire script exits leaving nodes orphaned. Plus, daemon command is a wall-of-text.

If bootstrap fails on node 2 of 5, the whole thing bails thanks to set -e, leaving node 1 just... sitting there. That's a debugging nightmare. Also, the daemon invocation across lines 23–27 is hard to parse.

Refactor with error handling and clearer variable breakdown:

     NODE_ID=amaru-$i
+    NODE_LOG="${NODE_ID}.log"
+    NODE_PORT=$((4000 + i))
+    
     echo "$(date -Iseconds) : bootstrapping $NODE_ID"
-    amaru bootstrap --ledger-dir "${NODE_ID}/ledger.db" --chain-dir "${NODE_ID}/chain.db" --network preview > "$NODE_ID.log" 2>&1
+    if ! amaru bootstrap --ledger-dir "${NODE_ID}/ledger.db" --chain-dir "${NODE_ID}/chain.db" --network preview >> "$NODE_LOG" 2>&1; then
+        echo "$(date -Iseconds) : bootstrap failed for $NODE_ID, skipping" >&2
+        continue
+    fi
+    
-    amaru daemon --peer-address "${PEER_ADDRESS}" \
-          --listen-address localhost:$(( 4000 + i)) \
-          --ledger-dir "${NODE_ID}/ledger.db" \
-          --chain-dir "${NODE_ID}/chain.db" \
-          --network preview >> "$NODE_ID.log" 2>&1 &
+    amaru daemon \
+        --peer-address "${PEER_ADDRESS}" \
+        --listen-address "localhost:${NODE_PORT}" \
+        --ledger-dir "${NODE_ID}/ledger.db" \
+        --chain-dir "${NODE_ID}/chain.db" \
+        --network preview \
+        >> "$NODE_LOG" 2>&1 &

Also applies to: 23-27

🧹 Nitpick comments (2)
scripts/multi-client.sh (2)

20-20: Quote the variable expansion for consistency.

While it works here, NODE_ID=amaru-$i is a bit inconsistent with the quoted variable usage elsewhere. Minor nit, but best practice is "amaru-${i}": quoted, with braces to make it clear where the variable boundary is.

-    NODE_ID=amaru-$i
+    NODE_ID="amaru-${i}"

30-30: 60-second delay per node might be overkill—consider making it configurable.

Starting 10 nodes would take 10 minutes just for staggered bootup. If that's intentional for load dampening, fair dinkum, but it might be worth either making it a parameter or reducing it. Quick question: is this delay meant to avoid thundering-herd bootstrap issues, or just testing throughput?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79fd911 and b3648e9.

📒 Files selected for processing (1)
  • scripts/multi-client.sh (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Test the CLI
  • GitHub Check: Benches
  • GitHub Check: Build x86_64/windows
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build riscv32
  • GitHub Check: Test coverage
  • GitHub Check: Analyze (rust)
🔇 Additional comments (1)
scripts/multi-client.sh (1)

21-21: Good on ya, the date command substitutions are properly quoted now!

The $(date -Iseconds) calls on lines 21 and 29 are correctly quoted, which prevents word splitting. Past review flagged this; looks like it's sorted. 👌

Also applies to: 29-29

@abailly

abailly commented Nov 4, 2025

Contributor Author

I ran a small experiment: a remote node running preview, with 5 clients started at 1-minute intervals on a local machine, while monitoring the RAM consumption of the remote process. It seems that this PR mostly solves the memory consumption problem.

Screenshot 2025-11-04 at 11 02 02

Signed-off-by: Arnaud Bailly <arnaud@pankzsoft.com>
@abailly abailly force-pushed the abailly/follow-chain-from-store branch from b3648e9 to b102f4c Compare November 10, 2025 08:18

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 2

♻️ Duplicate comments (4)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (3)

114-131: Crikey, the tip's gone walkabout!

Right, so this backward jaunt collects all the ancestors but never actually adds the start_header (your tip block at the beginning of the walk) into the headers vector. When a punter connects with an intersection equal to or close to the anchor, they'll get everything except the current tip—bit like ordering a burger and getting everything but the patty, yeah?

The past review nailed it: if intersection == anchor, the client gets a rollback to anchor and maybe one forward for anchor+1, but never the actual tip at current_tip.

Toss the start_header into the queue before or after the reverse—something like this should sort it:

         headers.push(ClientOp::Forward(header.clone()));
         current_header = header;
     }
     None => return None,
   }
 }

+// Include the tip itself
+headers.push(ClientOp::Forward(start_header.clone()));
+
 // headers contains a list of Fwd operations in reverse order
 headers.reverse();

Or if you reverse first, prepend it after:

 // headers contains a list of Fwd operations in reverse order
 headers.reverse();
+headers.push(ClientOp::Forward(start_header));
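
For reference, here is a minimal, self-contained sketch of a backward walk that does include the tip. The `Header`, `ClientOp`, and `volatile_ops` names and the `HashMap`-backed store are hypothetical stand-ins, not the actual Amaru types, and the sketch assumes the walk stops at the anchor (exclusive). It returns `None` on a missing parent purely to stay small.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real header and op types.
#[derive(Clone, Debug, PartialEq)]
struct Header {
    hash: u32,
    parent: Option<u32>,
}

#[derive(Clone, Debug, PartialEq)]
enum ClientOp {
    Forward(Header),
}

/// Collects Forward ops for every header strictly after `anchor`, up to and
/// including `tip`, oldest first. Returns None if a header or parent is missing.
fn volatile_ops(store: &HashMap<u32, Header>, anchor: u32, tip: u32) -> Option<Vec<ClientOp>> {
    let mut ops = Vec::new();
    let mut current = store.get(&tip)?;
    // Walk from the tip towards the anchor. Each visited header is pushed
    // before moving to its parent, so the tip itself is always included.
    while current.hash != anchor {
        ops.push(ClientOp::Forward(current.clone()));
        let parent_hash = current.parent?;
        current = store.get(&parent_hash)?;
    }
    // The ops were collected newest first; the client expects oldest first.
    ops.reverse();
    Some(ops)
}
```

Pushing the current header before stepping to its parent removes the need for a separate "include the tip" step after the loop.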

129-129: This broken chain deserves a proper rage-quit, not a quiet exit.

Returning None here masks a serious store invariant violation—if a header exists but its parent doesn't, that's like finding a sequel with no original film. The codebase already panics for similar "this should never happen" scenarios (see HeadersTree's unsafe_get_header), and the learnings confirm this pattern.

Based on learnings.

Apply this diff to fail fast with context:

-                None => return None, // FIXME: Broken chain, shouldn't we panic?
+                None => panic!(
+                    "Store invariant violated: header {} exists but parent {} not found",
+                    current_header.hash(), parent_hash
+                ),

185-186: Mate, you're ghosting the intersection update!

When you pop from self.ops here, you're handing back a Forward or Backward op but leaving self.intersection stuck in the past—typically frozen at the anchor value. After the first queued op goes out, any subsequent next_op calls are making decisions with yesterday's news, and anyone checking intersection_found() never sees the headers you just served.

The past review spotted this straight away: you need to update self.intersection to match the op you're returning, whether it's Forward or Backward.

Fix it by syncing the cursor before you return:

-    self.ops.pop_front()
+    if let Some(op) = self.ops.pop_front() {
+        self.intersection = op.tip();
+        Some(op)
+    } else {
+        None
+    }

That op.tip() method exists in client_protocol.rs (lines 98-103 in the relevant snippets) and handles both Forward and Backward variants, so you're sorted.
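
A minimal sketch of that cursor update in isolation, assuming a simplified `Tip` that carries only a slot number. The `Follower` struct and `next_queued_op` name are hypothetical; only the `pop_front` plus `tip()` pattern mirrors the suggestion above.

```rust
use std::collections::VecDeque;

// Hypothetical simplified tip: identified by slot only.
#[derive(Clone, Debug, PartialEq)]
struct Tip(u64);

#[derive(Clone, Debug, PartialEq)]
enum ClientOp {
    Forward(Tip),
    Backward(Tip),
}

impl ClientOp {
    /// The point the client ends up at after applying this op.
    fn tip(&self) -> Tip {
        match self {
            ClientOp::Forward(t) | ClientOp::Backward(t) => t.clone(),
        }
    }
}

struct Follower {
    intersection: Tip,
    ops: VecDeque<ClientOp>,
}

impl Follower {
    /// Pops the next queued op and moves the cursor to match it.
    /// The cursor is left untouched when the queue is empty.
    fn next_queued_op(&mut self) -> Option<ClientOp> {
        let op = self.ops.pop_front()?;
        self.intersection = op.tip();
        Some(op)
    }
}
```

Using `?` on `pop_front()` keeps the empty-queue case from touching `intersection`, which is the same behaviour as the `if let` form in the diff.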

crates/amaru-stores/src/rocksdb/consensus/mod.rs (1)

253-270: Guard against malformed chain keys before decoding the slot

We’re slicing the RocksDB key and calling try_into().unwrap() under the assumption that the key is perfect. If the store ever contains a truncated or unexpected key (bit-rot, manual fiddling, whatever), we’ll panic here and take the node down—been there, feels like watching Blade Runner’s spinner drop out of the sky mid-chase. Let’s fail gracefully by checking the slice length and handling the conversion error explicitly.

-                if let Some(Ok((k, v))) = iter.next() {
-                    let slot_bytes = &k[CHAIN_PREFIX.len()..CHAIN_PREFIX.len() + 8];
-                    let slot = u64::from_be_bytes(slot_bytes.try_into().unwrap());
+                if let Some(Ok((k, v))) = iter.next() {
+                    let slot_start = CHAIN_PREFIX.len();
+                    let slot_end = slot_start + std::mem::size_of::<u64>();
+                    if k.len() < slot_end {
+                        tracing::error!(
+                            "invalid chain key length ({} bytes), expected at least {}",
+                            k.len(),
+                            slot_end
+                        );
+                        return None;
+                    }
+                    let slot_bytes: [u8; std::mem::size_of::<u64>()] =
+                        match k[slot_start..slot_end].try_into() {
+                            Ok(bytes) => bytes,
+                            Err(_) => {
+                                tracing::error!("failed to decode slot bytes for chain key");
+                                return None;
+                            }
+                        };
+                    let slot = u64::from_be_bytes(slot_bytes);
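
As a standalone illustration of the same length-checked decoding, the sketch below pulls the slot out of a key without any `unwrap()`. The `CHAIN_PREFIX` value and the `decode_slot` helper are hypothetical (the real prefix is not shown in this thread), and the explicit prefix check is an extra assumption of this sketch rather than part of the suggested diff.

```rust
// Hypothetical prefix; the real CHAIN_PREFIX value is not shown in this thread.
const CHAIN_PREFIX: &[u8] = b"chain";

/// Decodes the big-endian slot that follows the prefix in a chain key.
/// Returns None for a key with the wrong prefix or fewer than 8 slot bytes.
fn decode_slot(key: &[u8]) -> Option<u64> {
    // Reject keys that do not start with the expected prefix.
    let rest = key.strip_prefix(CHAIN_PREFIX)?;
    // `get(..8)` returns None instead of panicking on a truncated key.
    let slot_slice = rest.get(..8)?;
    let mut slot_bytes = [0u8; 8];
    slot_bytes.copy_from_slice(slot_slice);
    Some(u64::from_be_bytes(slot_bytes))
}
```

Whether a malformed key should be logged and skipped, as in the diff, or treated as a fatal store corruption is a design choice for the maintainers; the helper only shows that the decoding itself can be made panic-free.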
🧹 Nitpick comments (3)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (3)

247-261: Sweet test coverage, though a tiny style nit if you're keen.

This test correctly verifies the behavior when the client's already at the tip—backwards rollback to the start position, all good. The logic is sound.

etorreborre noted in a past comment that you could use .as_tip() on line 253 instead of manually constructing the Tip. The current code works fine, but the helper method would be a touch cleaner:

-    let start = Tip(tip.clone(), store.get_height(TIP_47));
+    let start = store.load_header(&hash(TIP_47)).unwrap().as_tip();

But honestly, this is just tidying up—your current approach is perfectly clear.


281-299: Another spot where .as_tip() could save you a line.

Similar vibe to the previous test—everything works, but you're manually building a Tip when the helper exists. etorreborre flagged this one too in past reviews.

     let expected = store.get_point(WINNER_47);
+    let expected_header = store.load_header(&hash(WINNER_47)).unwrap();

     let mut chain_follower = ChainFollower::new(store.clone(), &tip, &points).unwrap();
     assert_eq!(
         chain_follower.next_op(store.clone()),
-        Some(ClientOp::Backward(Tip(expected, 8)))
+        Some(ClientOp::Backward(expected_header.as_tip()))
     );

Again, totally optional—just makes the intent a bit more obvious.


301-321: One more .as_tip() opportunity in this test.

This test verifies the "start from Origin" scenario when the requested intersection isn't on the best chain—solid coverage. The manual Tip(Point::Origin, 0) construction works, but for consistency you might consider using the pattern from other tests.

That said, for Point::Origin specifically, manually constructing Tip(Point::Origin, 0) is actually pretty clear and explicit, so this is really just for consistency with the other suggestions. Up to you whether it's worth the change.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3648e9 and b102f4c.

📒 Files selected for processing (15)
  • crates/amaru-consensus/src/consensus/effects/store_effects.rs (1 hunks)
  • crates/amaru-consensus/src/consensus/stages/validate_block.rs (3 hunks)
  • crates/amaru-consensus/src/consensus/stages/validate_header.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2 hunks)
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs (2 hunks)
  • crates/amaru/Cargo.toml (1 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (3 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/client_state.rs (0 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/mod.rs (1 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (4 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/tests.rs (1 hunks)
  • crates/amaru/src/stages/mod.rs (3 hunks)
  • scripts/multi-client.sh (1 hunks)
💤 Files with no reviewable changes (1)
  • crates/amaru/src/stages/consensus/forward_chain/client_state.rs
🚧 Files skipped from review as they are similar to previous changes (6)
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs
  • crates/amaru-consensus/src/consensus/stages/validate_header.rs
  • crates/amaru-consensus/src/consensus/effects/store_effects.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs
  • scripts/multi-client.sh
🧰 Additional context used
🧠 Learnings (20)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
📚 Learning: 2025-05-21T18:58:48.631Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru/src/stages/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/tests.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru/src/stages/consensus/forward_chain/tests.rs
  • crates/amaru/Cargo.toml
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru/src/stages/consensus/forward_chain/tests.rs
📚 Learning: 2025-08-18T08:10:32.640Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:427-433
Timestamp: 2025-08-18T08:10:32.640Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp, so unimplemented methods like set_proposals_roots and set_constitution are intentionally left as placeholders until the revamp is complete.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/mod.rs
📚 Learning: 2025-08-18T08:10:35.849Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:431-433
Timestamp: 2025-08-18T08:10:35.849Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp soon, so unimplemented methods like set_constitution, set_proposals_roots are intentionally left aside until the revamp is complete.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
📚 Learning: 2025-01-21T15:32:17.911Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 69
File: crates/amaru/src/ledger/state/diff_epoch_reg.rs:112-117
Timestamp: 2025-01-21T15:32:17.911Z
Learning: When suggesting code changes in Rust, always verify that the types align correctly, especially when dealing with references and Options. The `Fold::Registered` variant in `diff_epoch_reg.rs` expects a reference `&'a V`, so unwrapping an `Option<&V>` requires only a single `.expect()`.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
📚 Learning: 2025-08-08T14:39:50.527Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/borrowed_datum.rs:32-39
Timestamp: 2025-08-08T14:39:50.527Z
Learning: In the amaru project, when converting BorrowedDatumOption::Data to an owned DatumOption in crates/amaru-kernel/src/borrowed_datum.rs, the call `.unwrap()` refers to pallas’s KeepRaw::unwrap, which is infallible (always returns the inner value) and is not a panic risk. Future reviews should not flag this unwrap as dangerous.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru-consensus/src/consensus/stages/validate_block.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In the amaru consensus pipeline, headers are now stored before validation (store_header stage runs before validate_header stage). The validity state of headers should be tracked in the database to avoid revalidating already-valid headers and to prevent flood attacks with duplicate invalid headers.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/test_infra.rs
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru/src/stages/mod.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-20T18:02:25.073Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via `.expect()` or `.unwrap()`) for fatal errors in the application code that would tear down the node, rather than propagating errors with `Result`. The intention is to eventually transition to proper error handling with `Result` as the codebase matures.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
🧬 Code graph analysis (4)
crates/amaru-stores/src/rocksdb/consensus/mod.rs (4)
crates/amaru-consensus/src/consensus/effects/store_effects.rs (1)
  • next_best_chain (75-77)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)
  • next_best_chain (130-146)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (2)
  • next_best_chain (40-40)
  • next_best_chain (145-147)
crates/amaru-kernel/src/point.rs (4)
  • from (65-67)
  • slot_or_default (33-38)
  • hash (40-46)
  • random_hash (146-148)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (1)
crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (4)
  • store (27-27)
  • store (50-52)
  • store (72-74)
  • store (124-126)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (5)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (4)
  • hash_point (124-129)
  • fmt (71-95)
  • fmt (114-121)
  • tip (99-104)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (6)
  • point (87-89)
  • fmt (226-245)
  • new (102-127)
  • hash (79-81)
  • mk_in_memory_store (43-77)
  • hex (83-85)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (3)
  • fmt (181-193)
  • load_from_best_chain (36-36)
  • load_from_best_chain (141-143)
crates/amaru/src/stages/mod.rs (5)
  • fmt (80-85)
  • fmt (129-134)
  • from (154-159)
  • as_tip (382-382)
  • as_tip (386-388)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (1)
  • load_from_best_chain (124-127)
crates/amaru-consensus/src/consensus/stages/validate_block.rs (5)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (1)
  • point (87-89)
crates/amaru-kernel/src/consensus_events.rs (1)
  • point (128-133)
crates/amaru-kernel/src/is_header/mod.rs (1)
  • point (38-40)
crates/amaru-consensus/src/consensus/effects/network_effects.rs (1)
  • point (118-123)
crates/amaru-consensus/src/consensus/tip.rs (2)
  • point (101-106)
  • point (169-171)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: clippy
  • GitHub Check: Build wasm32
  • GitHub Check: Build riscv32
  • GitHub Check: Test the CLI
  • GitHub Check: Benches
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Test coverage
  • GitHub Check: Analyze (rust)
🔇 Additional comments (1)
crates/amaru/Cargo.toml (1)

50-50: This review comment is based on a misunderstanding of Cargo's design.

Here's the thing, mate — having amaru-kernel in both [dependencies] and [dev-dependencies] with different features is actually spot-on practice, not a duplication issue. It's like how you'd pack different gear for a road trip versus a weekend camping trip with your mates.

The evidence shows amaru-kernel is extensively used throughout the main crate: stages, CLI commands, network logic, the lot. The "tracing" feature in [dependencies] is essential for production logging. The "test-utils" feature in [dev-dependencies] is supplemental testing infrastructure that shouldn't bloat the shipped binary.

This is precisely what Cargo was designed for: specifying different feature sets per dependency scope. Combining them into one would be the actual antipattern here — you'd be shipping unnecessary test utilities into production, which is a no-go for optimization.

The three entries (main deps, dev-deps, build-deps) are all legit and serve their purpose. No changes needed.

Likely an incorrect or invalid review comment.

Comment thread crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs Outdated
Comment on lines +293 to 295
chain_store.set_anchor_hash(tip)?;
chain_store.set_best_chain_hash(tip)?;
Ok(chain_store)

@coderabbitai coderabbitai Bot Nov 10, 2025


⚠️ Potential issue | 🔴 Critical

Handle Origin tip before setting anchor

If the ledger tip is still Origin when we boot, this code calls set_anchor_hash(ORIGIN_HASH) and set_best_chain_hash(ORIGIN_HASH). The store doesn’t have a header for that hash, so it errors out and the node bails during bootstrap—kind of like getting wiped in the tutorial mission before the real game even starts. Guard the Origin case before touching the store so fresh nodes can actually come online.

Apply this diff to prevent the blow-up:

-    chain_store.set_anchor_hash(tip)?;
-    chain_store.set_best_chain_hash(tip)?;
+    if *tip != ORIGIN_HASH {
+        chain_store.set_anchor_hash(tip)?;
+        chain_store.set_best_chain_hash(tip)?;
+    }

Based on learnings

🤖 Prompt for AI Agents
In crates/amaru/src/stages/mod.rs around lines 293–295, the code unconditionally
calls chain_store.set_anchor_hash(tip) and set_best_chain_hash(tip) which fails
when the ledger tip is still Origin; check the tip first and skip setting
anchor/best when tip is Origin (e.g., if tip.is_origin() or matches the Origin
variant, return Ok(chain_store) or proceed without calling the setters),
otherwise call the two set_* methods as before and propagate errors.

Contributor Author

This is not possible, as Amaru explicitly does not support starting from ORIGIN.

Contributor

@abailly, fair play! If Amaru requires folks to bootstrap their ledger first—kind of like needing to complete the tutorial before you can actually play the game—then yeah, the ORIGIN case is off the table by design. The panic message "Have you bootstrapped your node?" already tells the story: no sneaking past the bouncer without proper ID.

Cheers for clarifying that constraint, mate. I'll leave the code as-is since the architectural decision handles it upstream. 🍻


🧠 Learnings used
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:109-118
Timestamp: 2025-08-18T08:11:20.028Z
Learning: The proposals_roots() method in the MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is intentionally left returning all None values rather than reading from stored state, as it's planned for the upcoming major MemoryStore revamp.

Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:427-433
Timestamp: 2025-08-18T08:10:32.640Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp, so unimplemented methods like set_proposals_roots and set_constitution are intentionally left as placeholders until the revamp is complete.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1)

129-129: Mate, this broken chain situation needs a proper freakout, not a quiet exit.

Right now if a parent header's gone walkabout from the store, we just return None and pretend everything's grand—but that's a full-blown store invariant violation (parent exists in theory, vanishes in practice). The project's vibe (see HeadersTree's unsafe_get_header pattern) is to fail fast with a panic when the database gets dodgy, not sweep it under the rug.

Apply this diff to make it panic with diagnostic context:

-                None => return None, // FIXME: Broken chain, shouldn't we panic?
+                None => panic!(
+                    "Store invariant violated: header {} exists but parent {} not found",
+                    current_header.hash(), parent_hash
+                ),

Based on learnings.

🧹 Nitpick comments (1)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1)

192-210: This rollback-handling logic's pretty clever, but it's doing some serious acrobatics.

The add_op method's got two modes when a Backward rolls in:

  • If the rollback target's already queued (line 197: rfind a matching Forward), truncate ops to that point—basically "don't send anything past here."
  • Otherwise, nuke the queue and push an explicit Backward op.

The logic looks sound (truncate to index + 1 keeps everything up to and including the target), but this kind of coalescing can be gnarly to debug when things go sideways. Since you're already knee-deep in edge cases (per your PR notes), might be worth a quick verification that the truncation index and queue state stay consistent across various rollback scenarios (e.g., rollback to queued point, rollback past queue, rollback to already-sent point).

Consider adding a unit test that exercises:

  1. add_op(Backward(tip)) when ops contains Forward(tip) (should truncate)
  2. add_op(Backward(old_point)) when ops is non-empty but doesn't contain old_point (should clear and push Backward)
  3. Sequence: add several Forwards, then Backward, then more Forwards—verify queue state at each step
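
The two rollback modes described above can be sketched as follows. This is a simplified model, not the actual `add_op`: ops carry a bare slot number instead of a header or tip, and the function takes the queue as a parameter rather than living on `ChainFollower`.

```rust
use std::collections::VecDeque;

// Hypothetical simplified ops: headers and rollback targets are bare slots.
#[derive(Clone, Debug, PartialEq)]
enum ClientOp {
    Forward(u64),
    Backward(u64),
}

/// Queues an op, coalescing rollbacks against Forwards not yet sent.
fn add_op(ops: &mut VecDeque<ClientOp>, op: ClientOp) {
    match op {
        ClientOp::Forward(_) => ops.push_back(op),
        ClientOp::Backward(target) => {
            // Search from the back for a queued Forward to the rollback target.
            let queued = ops.iter().rposition(|o| *o == ClientOp::Forward(target));
            match queued {
                // Target still queued: keep everything up to and including it.
                Some(index) => ops.truncate(index + 1),
                // Target not queued: drop pending ops, send an explicit rollback.
                None => {
                    ops.clear();
                    ops.push_back(ClientOp::Backward(target));
                }
            }
        }
    }
}
```

A test along the lines of scenario 3 would queue a few `Forward` ops, roll back to one of them, add another `Forward`, then roll back to a point that was never queued, asserting the queue contents after each step.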
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b102f4c and aa909f4.

📒 Files selected for processing (1)
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (14)
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-05-21T18:58:48.631Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-08T14:39:50.527Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/borrowed_datum.rs:32-39
Timestamp: 2025-08-08T14:39:50.527Z
Learning: In the amaru project, when converting BorrowedDatumOption::Data to an owned DatumOption in crates/amaru-kernel/src/borrowed_datum.rs, the call `.unwrap()` refers to pallas’s KeepRaw::unwrap, which is infallible (always returns the inner value) and is not a panic risk. Future reviews should not flag this unwrap as dangerous.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-04-20T18:02:25.073Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via `.expect()` or `.unwrap()`) for fatal errors in the application code that would tear down the node, rather than propagating errors with `Result`. The intention is to eventually transition to proper error handling with `Result` as the codebase matures.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
🧬 Code graph analysis (1)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (6)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (4)
  • hash_point (124-129)
  • fmt (71-95)
  • fmt (114-121)
  • tip (99-104)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (6)
  • point (87-89)
  • fmt (226-245)
  • new (102-127)
  • hash (79-81)
  • mk_in_memory_store (43-77)
  • hex (83-85)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (3)
  • fmt (181-193)
  • load_from_best_chain (36-36)
  • load_from_best_chain (141-143)
crates/amaru/src/stages/mod.rs (5)
  • fmt (80-85)
  • fmt (129-134)
  • from (154-159)
  • as_tip (382-382)
  • as_tip (386-388)
crates/amaru-ouroboros-traits/src/stores/consensus/in_memory_consensus_store.rs (3)
  • new (35-39)
  • new (59-69)
  • load_from_best_chain (124-127)
crates/amaru-consensus/src/consensus/effects/store_effects.rs (26)
  • new (29-31)
  • new (121-123)
  • new (150-152)
  • new (178-180)
  • new (206-208)
  • new (235-237)
  • new (263-265)
  • new (291-293)
  • new (319-321)
  • new (345-347)
  • new (371-373)
  • run (128-136)
  • run (157-165)
  • run (185-193)
  • run (213-221)
  • run (242-250)
  • run (270-278)
  • run (298-306)
  • run (326-334)
  • run (352-360)
  • run (378-386)
  • run (406-414)
  • run (434-442)
  • run (462-470)
  • run (490-498)
  • load_from_best_chain (71-73)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Test the CLI
  • GitHub Check: Benches
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build wasm32
  • GitHub Check: Build riscv32
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Test coverage
  • GitHub Check: clippy
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Analyze (rust)
🔇 Additional comments (4)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (4)

158-182: Solid handling of the behind-anchor case, panic and all!

The logic here's bang-on: when the follower's trailing the anchor, you query the store for the next best chain point, load the header, update the intersection, and forward it. The panic at lines 171-174 (when next_best_chain returns a point but the header's missing) is exactly the right move for catching invariant violations. Well played!


184-189: Nice one—intersection cursor stays in sync when draining ops!

This bit (line 185: self.intersection = op.tip()) ensures the intersection advances as we emit queued operations, keeping the follower's state consistent with what's been sent downstream. Past review flagged this as a concern, and you've nailed it.
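
To make the two modes concrete, here is a simplified sketch of how a follower might pick its next op. Everything in it is an assumption for illustration: `Follower`, `Op` and the `best_chain` slice (a sorted list of slots standing in for the store) are hypothetical, and the panic fires on a missing successor rather than on a missing header as in the real code.

```rust
use std::collections::VecDeque;

/// Hypothetical op type: a point is reduced to its slot number.
#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Forward(u64),
    Backward(u64),
}

impl Op {
    pub fn tip(&self) -> u64 {
        match self {
            Op::Forward(p) | Op::Backward(p) => *p,
        }
    }
}

/// Hypothetical follower state for a single downstream peer.
pub struct Follower {
    pub intersection: u64,
    pub anchor: u64,
    pub ops: VecDeque<Op>,
}

impl Follower {
    /// `best_chain` is a sorted slice of slots standing in for the store.
    pub fn next_op(&mut self, best_chain: &[u64]) -> Option<Op> {
        if self.intersection < self.anchor {
            // Behind the anchor: serve the next point from the store.
            let next = best_chain
                .iter()
                .copied()
                .find(|slot| *slot > self.intersection)
                .unwrap_or_else(|| {
                    panic!(
                        "Store invariant violated: no successor of {} below anchor {}",
                        self.intersection, self.anchor
                    )
                });
            self.intersection = next;
            Some(Op::Forward(next))
        } else {
            // At or past the anchor: drain the queue, keeping the cursor in sync.
            let op = self.ops.pop_front()?;
            self.intersection = op.tip();
            Some(op)
        }
    }
}
```

Starting at slot 1 with the anchor at 3 and `Forward(4)`, `Forward(5)` queued, this yields forwards for 2 and 3 from the store, then 4 and 5 from the queue, and leaves `intersection` at 5.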


286-303: Heads up: test comment reckons the intersection logic doesn't match the documented spec.

Lines 290-291 say "Note that the below scheme does not match the documented behaviour, which shall pick the first from the list that is on the same chain. But that doesn't make sense to me at all."

The implementation (line 99 in new()) uses max_by_key on slot, meaning it picks the highest slot from the requested points. If the spec actually says "pick the first point from the list that's on the best chain," that's a different algorithm (order-dependent vs. slot-based). Might want to double-check which behavior is intended, or update the comment if the max-slot approach is now canonical.

Could you clarify whether:

  1. The max-slot approach is the correct implementation (and the comment's just outdated)
  2. Or there's a genuine spec deviation here that needs addressing?
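
To show where the two readings diverge, here is a small sketch with hypothetical names (`Point`, `first_on_chain`, `highest_slot_on_chain`) and a `HashSet` standing in for "is this point on the best chain?". It is not the project's code, just the two selection rules side by side.

```rust
use std::collections::HashSet;

/// Hypothetical point: a slot and a hash, both plain integers here.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Point {
    pub slot: u64,
    pub hash: u64,
}

/// Order-dependent rule: the first requested point that is on the best chain.
pub fn first_on_chain(requested: &[Point], best_chain: &HashSet<Point>) -> Option<Point> {
    requested.iter().copied().find(|p| best_chain.contains(p))
}

/// Slot-based rule: the requested point on the best chain with the highest slot.
pub fn highest_slot_on_chain(requested: &[Point], best_chain: &HashSet<Point>) -> Option<Point> {
    requested
        .iter()
        .copied()
        .filter(|p| best_chain.contains(p))
        .max_by_key(|p| p.slot)
}
```

If the client lists its points from newest to oldest, both rules return the same point; they only disagree when the list is not sorted by descending slot, for example `[5, 9, 7]` gives slot 5 under the first rule and slot 9 under the second.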

134-145: The concern is valid—start_header never makes it into ops.

The evidence is pretty clear: start_header = store.load_header(&hash_point(current_tip)) at line 72 loads the actual current tip. Then the while loop (lines 115–132) walks backwards from it, but the loop breaks without ever pushing start_header or even the intersection point itself. After reversing, ops contains only the headers between the intersection and the tip—exclusive on both ends.

Here's the flow:

  1. Early exit (lines 82–88): If client already has current_tip in their points, it sets initial = start_header and ops = []. This case works fine.
  2. Normal path (lines 137–146): Sets initial = best_intersection (NOT start_header) and ops = walked ancestors (without start_header).

When next_op() runs (lines 151–169), the client gets a Backward to best_intersection, then consumes the ops, but never receives start_header as a Forward. The comment at lines 47–50 even hints at the gap: "intersection represents the parent of the next header to forward to client," suggesting something still needs forwarding.

Whether this is intentional design (with a follow-up sync handling the gap) or an actual bug that needs fixing would require checking how this integrates with the client sync protocol and whether there's a continuation mechanism downstream.
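
If the tip is indeed meant to be forwarded, one possible shape for the backward walk is sketched below. It assumes a hypothetical `parents` map (child hash to parent hash) in place of the store, and a hypothetical `headers_to_forward` function; the key detail is pushing the current header before stepping to its parent, so the tip is included and the intersection is not.

```rust
use std::collections::HashMap;

/// Walks from `tip` back to `intersection` and returns the hashes to forward,
/// oldest first: the intersection is excluded, the tip is included.
/// `parents` maps a header hash to its parent hash (hypothetical stand-in for the store).
/// Panics if the walk falls off the chain before reaching `intersection`.
pub fn headers_to_forward(parents: &HashMap<u64, u64>, tip: u64, intersection: u64) -> Vec<u64> {
    let mut ops = Vec::new();
    let mut current = tip;
    while current != intersection {
        // Push before stepping back, so the tip itself ends up in the result.
        ops.push(current);
        current = match parents.get(&current) {
            Some(parent) => *parent,
            None => panic!("header {} has no parent in the store", current),
        };
    }
    ops.reverse();
    ops
}
```

For a chain 1 <- 2 <- 3 <- 4 with the intersection at 1 and the tip at 4, this returns `[2, 3, 4]`; when the client is already at the tip it returns an empty list, matching the early-exit case described above.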

Signed-off-by: Arnaud Bailly <arnaud@pankzsoft.com>
@abailly abailly force-pushed the abailly/follow-chain-from-store branch from aa909f4 to 9a770b6 Compare November 10, 2025 10:05
@abailly abailly merged commit df64ec5 into main Nov 10, 2025
20 checks passed
@abailly abailly deleted the abailly/follow-chain-from-store branch November 10, 2025 12:49

Development

Successfully merging this pull request may close these issues.

Downstream clients connecting entail huge memory spike and stalls process

2 participants