Skip to content

run e2e sync test until epoch 185 (was reduced from 182 to 176 due to regression)#749

Merged
rkuhn merged 8 commits into
mainfrom
rk/preprod-sync
Apr 25, 2026
Merged

run e2e sync test until epoch 185 (was reduced from 182 to 176 due to regression)#749
rkuhn merged 8 commits into
mainfrom
rk/preprod-sync

Conversation

@rkuhn

@rkuhn rkuhn commented Apr 10, 2026

Copy link
Copy Markdown
Contributor

closes #658

Summary by CodeRabbit

  • Chores
    • Updated snapshot test workflow defaults and bumped preprod fixed-epoch
    • Added .opencode/ to .gitignore
  • Bug Fixes
    • Rate-limited frequent “adopted tip” logging, lowered noisy logs to DEBUG and added suppressed-count tracking
  • Performance
    • Reduced default deferred-request polling interval for faster scheduling
  • Tests
    • Updated trace helpers and expected traces to reflect new clock/trace ordering
  • Refactor
    • Simplified deferred-request tracking and ledger-height retrieval logic

@coderabbitai

coderabbitai Bot commented Apr 10, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Standardises CI workflow string defaults and bumps preprod epoch; ignores /.opencode/; reduces a poll default; adds time-based rate-limited logging and a clock trace helper in adopt_chain; refactors track_peers defer/ledger lookup and updates related tests.

Changes

Cohort / File(s) Summary
Workflow Configuration
/.github/workflows/snapshot-tests.yml
Changed workflow_call string input defaults from single quotes to double quotes; bumped preprod matrix target_epoch from 176185.
Repository Ignore
/.gitignore
Added /.opencode/ to ignore rules.
Adopt Chain Stage
crates/amaru-consensus/src/stages/adopt_chain/mod.rs
Added last_printed: Instant and suppressed: u32; switched to 1s time-based rate-limited logging; initialized fields in AdoptChain::new.
Adopt Chain Tests / Helpers
crates/amaru-consensus/src/stages/adopt_chain/test_setup.rs, crates/amaru-consensus/src/stages/adopt_chain/tests.rs
Added pub fn te_clock(at_stage: &str) helper; tests now include te_clock("ac-1") and expect adopted-tip at DEBUG with suppressed = 1.
Track Peers — Defer Request Next
crates/amaru-consensus/src/stages/track_peers/defer_req_next.rs
Removed per-peer peer from DeferReqNextMsg::Register and dropped Cancel { peer }; pending stores (handler, min_ledger_height); scheduling uses eff.schedule_after(Poll, ...).
Track Peers — Core & Ledger Lookup
crates/amaru-consensus/src/stages/track_peers/mod.rs
Replaced public ledger_applied_block_height with private fn ledger_applied_block_height<T: SendData>(eff: &Effects<T>) -> BlockHeight that clones effects and reads best-chain header; removed peer-specific Cancel on rollback.
Track Peers Tests / Helpers
crates/amaru-consensus/src/stages/track_peers/test_setup.rs, crates/amaru-consensus/src/stages/track_peers/tests.rs
Replaced te_volatile_tip/te_tip with te_get_best_chain_hash and te_load_header(...); registered GetBestChainHashEffect in test guards; updated roll-forward test traces.
Config Default
crates/amaru/src/stages/config.rs
Changed Config::default defer_req_next_poll_ms from 2000200 ms.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(200,200,255,0.5)
    participant Upstream as Upstream
    participant TrackPeers as TrackPeers
    participant Effects as Effects/Store
    participant Defer as DeferReqNext
    participant Initiator as InitiatorHandler
    end

    Upstream->>TrackPeers: roll_forward(header)
    TrackPeers->>Effects: ledger_applied_block_height(eff)
    Effects->>Effects: read best_chain_hash → load header
    TrackPeers->>Defer: Register { handler, min_ledger_height }
    Note right of Defer: schedule_after(Poll, poll_ms)
    Defer->>Initiator: RequestNext when ledger height >= min_h
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • abailly
  • jeluard

Poem

A second's hush in logs, the pipeline finds its beat,
Epochs nudged and quotes cleaned up — tidy as a street.
Peers trimmed to essentials, the ledger asks the hash,
Tests tap the clock, deferred tasks re-run in a flash.
Shipshape and jaunty — light the build, let's feast.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 56.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly addresses the main change: updating the preprod snapshot-test target epoch from 176 to 185, which is the primary objective of this PR per issue #658.
Linked Issues check ✅ Passed The PR partially addresses issue #658 by improving CI e2e sync testing (increasing target epoch to 185), which aligns with the acceptance criteria. However, the core objectives of moving upstream peer tracking out of select_chain and refactoring block validation are implemented through the code changes to DeferReqNext, track_peers, and AdoptChain.
Out of Scope Changes check ✅ Passed All changes directly support the PR objectives: logging rate-limiting in AdoptChain improves sync stability, DeferReqNext refactoring removes peer scoping per the issue requirements, and track_peers changes implement upstream peer tracking factoring. The .gitignore and quote-style changes are minor housekeeping supporting the refactoring.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch rk/preprod-sync

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Apr 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 67.81609% with 84 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/amaru/src/bin/amaru/cmd/dump_chain_db.rs 0.00% 38 Missing ⚠️
...ates/amaru-consensus/src/stages/track_peers/mod.rs 33.33% 12 Missing ⚠️
...maru/src/bin/amaru/cmd/remove_validation_status.rs 0.00% 10 Missing ⚠️
...consensus/src/stages/track_peers/defer_req_next.rs 0.00% 6 Missing ⚠️
crates/amaru-stores/src/rocksdb/consensus/mod.rs 0.00% 5 Missing ⚠️
...ates/amaru-consensus/src/stages/adopt_chain/mod.rs 80.00% 3 Missing ⚠️
crates/amaru-kernel/src/cardano/point.rs 0.00% 3 Missing ⚠️
crates/amaru-kernel/src/cardano/network_name.rs 50.00% 2 Missing ⚠️
...maru-ledger/src/rules/transaction/phase_two/mod.rs 96.72% 2 Missing ⚠️
...-kernel/src/cardano/protocol_parameters/default.rs 98.70% 1 Missing ⚠️
... and 2 more
Files with missing lines Coverage Δ
...aru-consensus/src/stages/adopt_chain/test_setup.rs 97.87% <100.00%> (+0.04%) ⬆️
...tes/amaru-consensus/src/stages/fetch_blocks/mod.rs 80.51% <100.00%> (ø)
...aru-consensus/src/stages/track_peers/test_setup.rs 100.00% <100.00%> (ø)
...es/amaru-kernel/src/cardano/protocol_parameters.rs 77.57% <ø> (ø)
...er/src/governance/ratification/proposals_forest.rs 94.61% <100.00%> (ø)
crates/amaru-ledger/src/rules.rs 95.31% <100.00%> (ø)
crates/amaru-ledger/src/rules/block/ex_units.rs 100.00% <ø> (ø)
crates/amaru-ledger/src/rules/block/header_size.rs 100.00% <100.00%> (ø)
...dger/src/rules/transaction/phase_one/collateral.rs 82.35% <ø> (ø)
...-ledger/src/rules/transaction/phase_one/outputs.rs 100.00% <ø> (ø)
... and 20 more

... and 12 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/amaru-consensus/src/stages/track_peers/mod.rs (1)

38-44: Heads up on a sneaky little clippy attribute, legend!

The #[expect(clippy::expect_used)] attribute on line 39 seems like a leftover from the old implementation – the current function uses unwrap_or which doesn't trigger that lint. It's like leaving the cheat codes enabled after you've already beaten the boss fair and square!

Consider removing it to keep the code tidy:

🧹 Remove unnecessary attribute
 /// Block height of the furthest ledger-applied state: volatile tip if present, otherwise stable tip.
-#[expect(clippy::expect_used)]
 fn ledger_applied_block_height<T: SendData>(eff: &Effects<T>) -> BlockHeight {
     let store = Store::new(eff.clone());
     let best = store.get_best_chain_hash();
     store.load_header(&best).map(|h| h.block_height()).unwrap_or(BlockHeight::from(0))
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-consensus/src/stages/track_peers/mod.rs` around lines 38 - 44,
The attribute #[expect(clippy::expect_used)] above the function
ledger_applied_block_height is stale (the function uses unwrap_or, not expect);
remove that attribute annotation so the codebase no longer contains an
unnecessary clippy expectation. Locate the ledger_applied_block_height<T:
SendData>(eff: &Effects<T>) -> BlockHeight function and delete the
#[expect(clippy::expect_used)] line immediately preceding it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@crates/amaru-consensus/src/stages/track_peers/mod.rs`:
- Around line 38-44: The attribute #[expect(clippy::expect_used)] above the
function ledger_applied_block_height is stale (the function uses unwrap_or, not
expect); remove that attribute annotation so the codebase no longer contains an
unnecessary clippy expectation. Locate the ledger_applied_block_height<T:
SendData>(eff: &Effects<T>) -> BlockHeight function and delete the
#[expect(clippy::expect_used)] line immediately preceding it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4f3eb964-8d4b-4f1c-bb9a-4ba74daaf33f

📥 Commits

Reviewing files that changed from the base of the PR and between ab78797 and bb22fd1.

📒 Files selected for processing (8)
  • crates/amaru-consensus/src/stages/adopt_chain/mod.rs
  • crates/amaru-consensus/src/stages/adopt_chain/test_setup.rs
  • crates/amaru-consensus/src/stages/adopt_chain/tests.rs
  • crates/amaru-consensus/src/stages/track_peers/defer_req_next.rs
  • crates/amaru-consensus/src/stages/track_peers/mod.rs
  • crates/amaru-consensus/src/stages/track_peers/test_setup.rs
  • crates/amaru-consensus/src/stages/track_peers/tests.rs
  • crates/amaru/src/stages/config.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/amaru/src/stages/config.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/amaru-consensus/src/stages/adopt_chain/tests.rs (1)

115-133: ⚠️ Potential issue | 🟠 Major

Update the expected state for the suppressed counter.

These tests now expect the DEBUG branch for "adopted tip"; that branch increments state.suppressed before returning, so the final te_state should expect suppressed = 1. Otherwise the trace assertion will be off by one — a sneaky little Elden Ring trap.

🛠️ Proposed test fix
     let mut expected = prep.state.clone();
     expected.current_best_tip = msg;
+    expected.suppressed = 1;
     assert_trace(
         &running,
         &[

Apply the same expected.suppressed = 1; update in all three adopting tests.

Also applies to: 159-187, 212-234

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-consensus/src/stages/adopt_chain/tests.rs` around lines 115 -
133, The final expected state in the adopting-tip tests must account for the
DEBUG branch incrementing state.suppressed; update the test setup to set
expected.suppressed = 1 before asserting te_state so the trace matches the
branch that increments suppressed, and apply the same change in all three
adopting tests that build expected and assert te_state (refer to the local
variable expected, the state.suppressed field, and the te_state assertions
paired with the AdoptChainMsg::new inputs).
crates/amaru-consensus/src/stages/track_peers/defer_req_next.rs (1)

27-48: ⚠️ Potential issue | 🟡 Minor

Nice cleanup, but the Cancel safety-net's gone walkabout — pending entries may orphan

Dropping peer identity from Register (and binning Cancel) makes the message shape much tidier, Marie Kondo would approve. The thing tickling my brain though: in track_peers/mod.rs a peer gets removed on validate_header or store_header errors (lines 242-243 and 254-255) and on roll-back failures (lines 338-339). With Cancel gone, any still-pending (handler, min_h) tuple that was registered for that peer just sits in state.pending forever, or at least until ledger_applied_block_height drifts past min_h — which, for a misbehaving peer we've already kicked, might be "quite a while" or, in the worst case, never (e.g. we're stuck on a dead branch with no further progress).

The eventual send(&handler, RequestNext) to a dead initiator StageRef is almost certainly a no-op on pure-stage, so it's not a correctness foot-bullet — more of a "pending vec grows monotonically with peer churn" smell. It's a trap! only if you care about long-running nodes with lots of flaky peers.

Options if you don't want to bring Cancel back:

  • Have dispatch_ready also drop entries whose handler is known-dead (if pure-stage exposes liveness).
  • Or document that the pending vec is allowed to accumulate tombstones and is pruned lazily when the ledger catches up.

Happy to leave as-is if the expected operational profile is "peer removals are rare and the ledger always advances eventually" — just flagging so it's a conscious call rather than a gotcha hiding in the long grass.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-consensus/src/stages/track_peers/defer_req_next.rs` around lines
27 - 48, The pending vector can leak entries for peers that have been removed
because Register no longer carries peer identity and there is no Cancel; to fix,
either restore a Cancel variant on DeferReqNextMsg (e.g., Cancel { handler:
StageRef<InitiatorMessage> } and remove matching tuples from
DeferReqNext::pending in stage on Cancel) or add pruning logic where entries are
dispatched (e.g., in dispatch_ready or the Poll handling in stage) to detect
dead StageRef handlers and drop their (handler, min_ledger_height) tuples; refer
to DeferReqNextMsg::{Register, Poll}, the DeferReqNext.pending field, and the
stage(...) handler to implement the chosen approach.
🧹 Nitpick comments (1)
crates/amaru-consensus/src/stages/track_peers/test_setup.rs (1)

97-99: Tiny nit — stay consistent with the sibling stage, mate.

G'day! Functionally Box::new(GetBestChainHashEffect) is grand — it's a unit struct, so this and ::new() compile down to the same thing. But over in crates/amaru-consensus/src/stages/select_chain/test_setup.rs the same helper calls GetBestChainHashEffect::new() (see context snippet 2), and new() exists specifically for this purpose (snippet 1). Like Neo picking the same pill twice, let's keep the twin helpers looking identical so future grep-archaeologists don't do a double-take.

♻️ Proposed tweak
 pub fn te_get_best_chain_hash(at_stage: &str) -> TraceEntry {
-    TraceEntry::suspend(Effect::external(at_stage, Box::new(GetBestChainHashEffect)))
+    TraceEntry::suspend(Effect::external(at_stage, Box::new(GetBestChainHashEffect::new())))
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-consensus/src/stages/track_peers/test_setup.rs` around lines 97
- 99, The helper te_get_best_chain_hash is inconsistent with its sibling:
instead of constructing the unit effect with Box::new(GetBestChainHashEffect)
use the existing constructor to match style; replace
Box::new(GetBestChainHashEffect) with Box::new(GetBestChainHashEffect::new()) so
TraceEntry::suspend(Effect::external(at_stage,
Box::new(GetBestChainHashEffect::new()))) mirrors the select_chain test setup.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/amaru-consensus/src/stages/track_peers/mod.rs`:
- Around line 39-44: The function ledger_applied_block_height currently reads
from the consensus chain store (Store::get_best_chain_hash → load_header) but is
misnamed and documented as returning the ledger-applied tip; rename the function
to best_chain_stored_block_height and update its docstring to state it returns
the best-chain tip stored in the consensus Store, and then update any callers
(e.g., the defer_req_next::dispatch_ready gate that compares ledger_height >=
min_h) to either use this renamed best_chain_stored_block_height when they
really mean the consensus store tip or, if they require the actual
ledger-applied tip, replace the Store lookup with the proper ledger-applied-tip
retrieval via Effects<T> (adapt the call to the ledger-applied API available on
Effects<T>), ensuring the semantics match the gate's backpressure intent.

---

Outside diff comments:
In `@crates/amaru-consensus/src/stages/adopt_chain/tests.rs`:
- Around line 115-133: The final expected state in the adopting-tip tests must
account for the DEBUG branch incrementing state.suppressed; update the test
setup to set expected.suppressed = 1 before asserting te_state so the trace
matches the branch that increments suppressed, and apply the same change in all
three adopting tests that build expected and assert te_state (refer to the local
variable expected, the state.suppressed field, and the te_state assertions
paired with the AdoptChainMsg::new inputs).

In `@crates/amaru-consensus/src/stages/track_peers/defer_req_next.rs`:
- Around line 27-48: The pending vector can leak entries for peers that have
been removed because Register no longer carries peer identity and there is no
Cancel; to fix, either restore a Cancel variant on DeferReqNextMsg (e.g., Cancel
{ handler: StageRef<InitiatorMessage> } and remove matching tuples from
DeferReqNext::pending in stage on Cancel) or add pruning logic where entries are
dispatched (e.g., in dispatch_ready or the Poll handling in stage) to detect
dead StageRef handlers and drop their (handler, min_ledger_height) tuples; refer
to DeferReqNextMsg::{Register, Poll}, the DeferReqNext.pending field, and the
stage(...) handler to implement the chosen approach.

---

Nitpick comments:
In `@crates/amaru-consensus/src/stages/track_peers/test_setup.rs`:
- Around line 97-99: The helper te_get_best_chain_hash is inconsistent with its
sibling: instead of constructing the unit effect with
Box::new(GetBestChainHashEffect) use the existing constructor to match style;
replace Box::new(GetBestChainHashEffect) with
Box::new(GetBestChainHashEffect::new()) so
TraceEntry::suspend(Effect::external(at_stage,
Box::new(GetBestChainHashEffect::new()))) mirrors the select_chain test setup.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5817d650-248f-408a-af6c-b086b3334ef6

📥 Commits

Reviewing files that changed from the base of the PR and between a4b58c7 and d07475a.

📒 Files selected for processing (10)
  • .github/workflows/snapshot-tests.yml
  • .gitignore
  • crates/amaru-consensus/src/stages/adopt_chain/mod.rs
  • crates/amaru-consensus/src/stages/adopt_chain/test_setup.rs
  • crates/amaru-consensus/src/stages/adopt_chain/tests.rs
  • crates/amaru-consensus/src/stages/track_peers/defer_req_next.rs
  • crates/amaru-consensus/src/stages/track_peers/mod.rs
  • crates/amaru-consensus/src/stages/track_peers/test_setup.rs
  • crates/amaru-consensus/src/stages/track_peers/tests.rs
  • crates/amaru/src/stages/config.rs
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/amaru/src/stages/config.rs
  • crates/amaru-consensus/src/stages/track_peers/tests.rs

Comment thread crates/amaru-consensus/src/stages/track_peers/mod.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/amaru-consensus/src/stages/track_peers/mod.rs (1)

39-43: ⚠️ Potential issue | 🟠 Major

Still reading the best-chain store tip, not the ledger-applied height.

This helper now calls Store::get_best_chain_hash()load_header(), so Line 315 gates RequestNext using the consensus store’s best-chain tip rather than the actual ledger-applied tip. That can make the backpressure decision fire on the wrong clock, like trusting the minimap instead of the boss-room door.

Either rename this to best-chain-store semantics, or fetch the real ledger-applied height for this gate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-consensus/src/stages/track_peers/mod.rs` around lines 39 - 43,
The helper ledger_applied_block_height currently reads the consensus store's
best-chain tip via Store::get_best_chain_hash() then load_header(), which gates
RequestNext on the wrong clock; change it to read the actual ledger-applied tip
instead (e.g. call Store::get_ledger_applied_hash() or
Store::ledger_applied_header() and then load_header()/block_height() from that
hash) so the returned height reflects ledger-applied state; if the store API
lacks a ledger_applied accessor, rename the function to best_chain_block_height
and update call sites accordingly.
🧹 Nitpick comments (2)
crates/amaru-consensus/src/stages/adopt_chain/mod.rs (2)

114-123: Rate-limited logging reads well — one tiny nitpick on the magic second.

The branch logic is spot on: log at INFO once the 1s window elapses, reset suppressed; otherwise DEBUG + bump the counter. Very "Groundhog Day but with fewer Bill Murrays".

Only nit: the Duration::from_secs(1) threshold is a policy knob hiding in the middle of stage. Pulling it into a const LOG_THROTTLE: Duration = Duration::from_secs(1); at module scope makes intent obvious and easy to tweak later. Take it or leave it.

♻️ Optional tidy-up
+const LOG_THROTTLE: Duration = Duration::from_secs(1);
+
 // ...
-    let now = store.eff().clock().await;
-    if now.saturating_since(state.last_printed) >= Duration::from_secs(1) {
+    let now = store.eff().clock().await;
+    if now.saturating_since(state.last_printed) >= LOG_THROTTLE {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-consensus/src/stages/adopt_chain/mod.rs` around lines 114 - 123,
Introduce a module-level constant for the log throttle duration (e.g. const
LOG_THROTTLE: Duration = Duration::from_secs(1);) and replace the inline
Duration::from_secs(1) in the adopt tip logging branch with LOG_THROTTLE; keep
the existing logic around state.last_printed and state.suppressed unchanged so
the INFO/DEBUG decision still uses now.saturating_since(state.last_printed) >=
LOG_THROTTLE and the counter reset/increment behavior remains intact.

43-56: Clock origin assumption worth noting for future Clock tweaks

The last_printed: Instant::at_offset(Duration::from_secs(0)) init relies on the assumption that the clock origin stays stable—so the very first adoption (which happens immediately in the simulation) lands at now - EPOCH < 1s and triggers the DEBUG branch. The tests confirm this is exactly what happens. Solid stuff.

That said, if someone ever tinkers with how pure_stage::Clock initialises its origin, this implicit dependency could be a sneaky gotcha. Not a blocker at all, but if you fancy future-proofing it, an Option<Instant> — where None forces the first adoption straight to INFO regardless of clock semantics — would be a belt-and-braces move. Totally up to you though; no drama keeping it as is.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-consensus/src/stages/adopt_chain/mod.rs` around lines 43 - 56,
The init currently sets last_printed using Instant::at_offset(...) which depends
on Clock origin; change AdoptChain.last_printed from Instant to Option<Instant>
and update AdoptChain::new (the constructor) to set last_printed = None so the
first adoption is treated as a fresh INFO regardless of clock semantics; update
any code paths that read last_printed (e.g., where the DEBUG/INFO decision is
made) to treat None as forcing INFO and to store Some(now) after the first
print.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/amaru-consensus/src/stages/track_peers/defer_req_next.rs`:
- Around line 27-35: The pending deferred requests are stored as plain
(StageRef<InitiatorMessage>, BlockHeight) so dropped peer/token cannot cancel
them; change the shape used by DeferReqNext/DeferReqNextMsg::Register to include
a cancellation handle (e.g., a CancellationToken, OwnerId, or CancelToken
struct) instead of just StageRef and store Vec<(StageRef<InitiatorMessage>,
BlockHeight, CancelHandle)> in DeferReqNext; update the Register variant to
accept/produce that CancelHandle and update all rollback/removal sites to call
the cancel/remove path using that handle to remove or mark pending entries
cancelled (i.e., add removal logic where peers are rolled back/removed to invoke
the CancelHandle for matching pending entries).

---

Duplicate comments:
In `@crates/amaru-consensus/src/stages/track_peers/mod.rs`:
- Around line 39-43: The helper ledger_applied_block_height currently reads the
consensus store's best-chain tip via Store::get_best_chain_hash() then
load_header(), which gates RequestNext on the wrong clock; change it to read the
actual ledger-applied tip instead (e.g. call Store::get_ledger_applied_hash() or
Store::ledger_applied_header() and then load_header()/block_height() from that
hash) so the returned height reflects ledger-applied state; if the store API
lacks a ledger_applied accessor, rename the function to best_chain_block_height
and update call sites accordingly.

---

Nitpick comments:
In `@crates/amaru-consensus/src/stages/adopt_chain/mod.rs`:
- Around line 114-123: Introduce a module-level constant for the log throttle
duration (e.g. const LOG_THROTTLE: Duration = Duration::from_secs(1);) and
replace the inline Duration::from_secs(1) in the adopt tip logging branch with
LOG_THROTTLE; keep the existing logic around state.last_printed and
state.suppressed unchanged so the INFO/DEBUG decision still uses
now.saturating_since(state.last_printed) >= LOG_THROTTLE and the counter
reset/increment behavior remains intact.
- Around line 43-56: The init currently sets last_printed using
Instant::at_offset(...) which depends on Clock origin; change
AdoptChain.last_printed from Instant to Option<Instant> and update
AdoptChain::new (the constructor) to set last_printed = None so the first
adoption is treated as a fresh INFO regardless of clock semantics; update any
code paths that read last_printed (e.g., where the DEBUG/INFO decision is made)
to treat None as forcing INFO and to store Some(now) after the first print.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 520b847d-736c-450a-837f-650ca01ca5da

📥 Commits

Reviewing files that changed from the base of the PR and between d07475a and e76e9d0.

📒 Files selected for processing (8)
  • crates/amaru-consensus/src/stages/adopt_chain/mod.rs
  • crates/amaru-consensus/src/stages/adopt_chain/test_setup.rs
  • crates/amaru-consensus/src/stages/adopt_chain/tests.rs
  • crates/amaru-consensus/src/stages/track_peers/defer_req_next.rs
  • crates/amaru-consensus/src/stages/track_peers/mod.rs
  • crates/amaru-consensus/src/stages/track_peers/test_setup.rs
  • crates/amaru-consensus/src/stages/track_peers/tests.rs
  • crates/amaru/src/stages/config.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/amaru/src/stages/config.rs
  • crates/amaru-consensus/src/stages/track_peers/tests.rs

Comment on lines 27 to +35
pub enum DeferReqNextMsg {
Register {
peer: Peer,
handler: StageRef<InitiatorMessage>,
min_ledger_height: BlockHeight,
},
/// Drop any deferred `RequestNext` for this peer (e.g. after rollback).
Cancel {
peer: Peer,
},
Register { handler: StageRef<InitiatorMessage>, min_ledger_height: BlockHeight },
Poll,
}

#[derive(Debug, Clone, PartialEq, serde::Serialize, serde::Deserialize)]
pub struct DeferReqNext {
pub poll_interval_ms: u64,
pub pending: Vec<(Peer, StageRef<InitiatorMessage>, BlockHeight)>,
pub pending: Vec<(StageRef<InitiatorMessage>, BlockHeight)>,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep deferred requests cancellable by owner.

Dropping the peer/cancel token means rollback or peer removal can’t clear a pending deferred RequestNext. If that chain switches away or the handler goes stale before min_ledger_height, this stage can later poke the wrong handler — bit of a Matrix “wrong phone line” moment.

Consider keeping a peer/token in pending and restoring a Cancel path from rollback/removal sites.

🛠️ Sketch of the safer shape
-use amaru_kernel::BlockHeight;
+use amaru_kernel::{BlockHeight, Peer};

 pub enum DeferReqNextMsg {
-    Register { handler: StageRef<InitiatorMessage>, min_ledger_height: BlockHeight },
+    Register { peer: Peer, handler: StageRef<InitiatorMessage>, min_ledger_height: BlockHeight },
+    Cancel { peer: Peer },
     Poll,
 }

 pub struct DeferReqNext {
     pub poll_interval_ms: u64,
-    pub pending: Vec<(StageRef<InitiatorMessage>, BlockHeight)>,
+    pub pending: Vec<(Peer, StageRef<InitiatorMessage>, BlockHeight)>,
 }

 match msg {
-    Register { handler, min_ledger_height } => {
-        state.pending.push((handler, min_ledger_height));
+    Register { peer, handler, min_ledger_height } => {
+        state.pending.push((peer, handler, min_ledger_height));
+    }
+    Cancel { peer } => {
+        state.pending.retain(|(pending_peer, _, _)| pending_peer != &peer);
     }
     Poll => {
         dispatch_ready(&mut state, &eff).await;
         let poll = Duration::from_millis(state.poll_interval_ms.max(1));
         eff.schedule_after(Poll, poll).await;
@@
-for (handler, min_h) in std::mem::take(&mut state.pending) {
+for (peer, handler, min_h) in std::mem::take(&mut state.pending) {
     if ledger_height >= min_h {
         eff.send(&handler, InitiatorMessage::RequestNext).await;
     } else {
-        remaining.push((handler, min_h));
+        remaining.push((peer, handler, min_h));
     }
 }

Also applies to: 47-49, 59-69

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-consensus/src/stages/track_peers/defer_req_next.rs` around lines
27 - 35, The pending deferred requests are stored as plain
(StageRef<InitiatorMessage>, BlockHeight) so dropped peer/token cannot cancel
them; change the shape used by DeferReqNext/DeferReqNextMsg::Register to include
a cancellation handle (e.g., a CancellationToken, OwnerId, or CancelToken
struct) instead of just StageRef and store Vec<(StageRef<InitiatorMessage>,
BlockHeight, CancelHandle)> in DeferReqNext; update the Register variant to
accept/produce that CancelHandle and update all rollback/removal sites to call
the cancel/remove path using that handle to remove or mark pending entries
cancelled (i.e., add removal logic where peers are rolled back/removed to invoke
the CancelHandle for matching pending entries).

@rkuhn rkuhn force-pushed the rk/preprod-sync branch from e76e9d0 to 023f69e Compare April 25, 2026 11:59
KtorZ and others added 7 commits April 25, 2026 17:21
…idations.

Signed-off-by: KtorZ <matthias.benkort@gmail.com>
Signed-off-by: KtorZ <matthias.benkort@gmail.com>
Signed-off-by: KtorZ <matthias.benkort@gmail.com>
… regression)

Signed-off-by: Roland Kuhn <rk@rkuhn.info>
Signed-off-by: Roland Kuhn <rk@rkuhn.info>
Signed-off-by: Roland Kuhn <rk@rkuhn.info>
also teach dump-chain-db a new trick: show ancestors of given point with validity

Signed-off-by: Roland Kuhn <rk@rkuhn.info>
@rkuhn rkuhn force-pushed the rk/preprod-sync branch from 023f69e to ce4bd77 Compare April 25, 2026 17:01
For preview there is a rather weird effect slowing it down now, looks like the Cardano node throttles chainsync to 4/sec.

Signed-off-by: Roland Kuhn <rk@rkuhn.info>
@rkuhn rkuhn merged commit d44d84c into main Apr 25, 2026
23 of 24 checks passed
@rkuhn rkuhn deleted the rk/preprod-sync branch April 25, 2026 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

move upstream peer tracking and block validation out of select_chain stage

2 participants