Skip to content

feat: TraceBuffer and simulation Replay#263

Merged
rkuhn merged 3 commits into
mainfrom
rk/effect-logging
Jun 18, 2025
Merged

feat: TraceBuffer and simulation Replay#263
rkuhn merged 3 commits into
mainfrom
rk/effect-logging

Conversation

@rkuhn

@rkuhn rkuhn commented Jun 14, 2025

Copy link
Copy Markdown
Contributor

Worked on serialization and deserialization of dyn SendData and
friends until rustc capitulated. It is now possible to run a simulation
and capture the inputs, states, effects, and their responses in a binary
CBOR log that can later be deserialized again and used to replay a
StageGraph to the same state as the traced simulation run.

Note that currently the log must be complete, otherwise initial states
will be missing and the replay will fail because stage behaviour will be
different.

Note also that while serialization can turn any dyn SendData into
byes, it is not possible to get back the same dyn SendData by
deserializing a TraceEntry. This is impossible because we'd need to
know the actual deserialization code for the concrete underlying type
(which may have type parameters etc.). So the deserialization result is
a generic CBOR Value that is then redeserialized in the context where
the concrete value is needed, which is also where the concrete type is
known and the Deserialize instance can be captured.

Signed-off-by: Roland Kuhn rk@rkuhn.info

Summary by CodeRabbit

  • New Features

    • Added comprehensive serialization and deserialization support across many core types, enabling encoding and replay of simulation traces and states.
    • Introduced a trace buffer to record simulation events with utilities for trace hydration, replay, and detailed trace assertions.
    • Added a CBOR-based tracing layer for serializing tracing spans and events.
    • Introduced a new simulation clock abstraction with monotonic time and epoch-based serialization.
    • Added a new simulation replay module to run simulations from trace data.
    • Added detailed resume functions for various simulation effects to improve modularity.
  • Refactor

    • Replaced previous Message and State traits with a unified SendData trait supporting serialization and dynamic casting.
    • Updated simulation, stage graph, effect, and messaging APIs to use SendData and require serialization and deserialization traits.
    • Centralized simulation resume logic into dedicated modules, improving clarity and maintainability.
    • Replaced closure-based time handling with a Clock trait object throughout the simulation stack.
    • Enhanced error reporting and equality comparisons using serialization-aware trait objects.
    • Improved type safety and serialization support in output effects, external effects, and call references.
    • Refactored simulation running logic to use new resume helpers and trace buffering.
    • Updated stage graph and Tokio implementations to use SendData and the new clock abstraction.
    • Removed deprecated utility types and reorganized public exports.
  • Documentation

    • Added detailed notes on serialization challenges and design decisions.
    • Updated doc comments to reflect renamed methods and new abstractions.
  • Style

    • Added new modules for serialization utilities, logging layers, trace buffering, and simulation replay.
    • Cleaned up imports, derives, and removed deprecated utility types.
    • Reorganized public exports and re-exports for clarity.
  • Tests

    • Refactored simulation tests to use strongly typed states and messages with serialization.
    • Added tests verifying serialization round-trips for dynamic trait objects.
    • Added replay tests validating trace fidelity and stage states.
    • Updated Tokio integration tests to use renamed methods and new abstractions.

Worked on serialization and deserialization of `dyn SendData` and
friends until rustc capitulated. It is now possible to run a simulation
and capture the inputs, states, effects, and their responses in a binary
CBOR log that can later be deserialized again and used to replay a
StageGraph to the same state as the traced simulation run.

Note that currently the log must be complete, otherwise initial states
will be missing and the replay will fail because stage behaviour will be
different.

Note also that while serialization can turn any `dyn SendData` into
byes, it is not possible to get back the same `dyn SendData` by
deserializing a `TraceEntry`. This is impossible because we'd need to
know the actual deserialization code for the concrete underlying type
(which may have type parameters etc.). So the deserialization result is
a generic CBOR `Value` that is then redeserialized in the context where
the concrete value is needed, which is also where the concrete type is
known and the `Deserialize` instance can be captured.

Signed-off-by: Roland Kuhn <rk@rkuhn.info>
@coderabbitai

coderabbitai Bot commented Jun 14, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Alright mate, here’s the lowdown: This batch of changes is a full-on overhaul of the simulation and messaging framework, adding serde serialization/deserialization everywhere, introducing a new SendData trait to replace Message, and building out a detailed trace buffer and replay system. There’s also a new clock abstraction, heaps of refactoring, and a swathe of tests and docs updated to match.

Changes

File(s) / Group Change Summary
Cargo.toml, crates/amaru-consensus/Cargo.toml, crates/ouroboros-traits/Cargo.toml, crates/pure-stage/Cargo.toml Added/updated dependencies (serde, typetag, cbor4ii, pretty_assertions), reordered dev-deps
crates/amaru-consensus/src/consensus/mod.rs, store.rs, store_header.rs, validate_header.rs, peer.rs Added serde derives, serialization attributes, default helpers, and fake store/ledger implementations
crates/amaru-kernel/src/lib.rs, protocol_parameters.rs Added serde derives to Point enum and GlobalParameters struct
crates/amaru/src/stages/ledger.rs Minor refactor: changed error return style in roll_forward method
crates/ouroboros-traits/src/praos/mod.rs Added serde derives to Nonces struct
crates/pure-stage/README.md Added section explaining serialization design and typetag limitations
crates/pure-stage/src/effect.rs, output.rs, types.rs Replaced Message with SendData, added serde/typetag support, unified effect/message handling
crates/pure-stage/src/lib.rs Re-exported new modules/types, removed util, added serde, trace_buffer, and public exports
crates/pure-stage/src/logging.rs, logging/layer.rs Added CBOR-serializing tracing layer and visitor, with tests
crates/pure-stage/src/serde.rs New module: serde helpers, NoDebug, Void, trait object serialization, CBOR utilities
crates/pure-stage/src/simulation.rs, simulation/blocked.rs, simulation/inputs.rs, simulation/replay.rs, simulation/resume.rs, simulation/running.rs, simulation/state.rs Major refactor: new trace buffer, replay, clock abstraction, effect handling, state/message serialization, error handling improvements
crates/pure-stage/src/stage_ref.rs, stagegraph.rs, receiver.rs, sender.rs, time.rs, tokio.rs Added serde derives, clock trait, refactored sender/receiver, updated trait bounds and docs
crates/pure-stage/src/trace_buffer.rs New module: bounded trace buffer, CBOR-encoded trace entries, push/iterate/hydrate support
crates/pure-stage/src/util.rs Removed: moved NoDebug and Void to serde.rs
crates/pure-stage/tests/simulation.rs, tests/tokio.rs Tests refactored: use new state/message types, trace/replay assertions, updated to SendData
crates/slot-arithmetic/src/lib.rs Added serde derives to error enums
simulation/amaru-sim/src/echo/message.rs, simulator/simulate.rs Added typetag name test, updated test state/message types, added serde bounds

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant SimulationBuilder
    participant StageGraph
    participant TraceBuffer
    participant Stage
    participant Clock

    User->>SimulationBuilder: Build simulation (with trace buffer, clock)
    SimulationBuilder->>StageGraph: Wire up stages (with SendData, serde)
    User->>SimulationBuilder: Run simulation
    SimulationBuilder->>Stage: Send input message (serialized)
    Stage->>TraceBuffer: Record Input event (CBOR)
    Stage->>Clock: Query current time
    Stage->>Stage: Process message, produce effect (serialized)
    Stage->>TraceBuffer: Record Suspend/Resume/State events (CBOR)
    User->>SimulationBuilder: Replay trace
    SimulationBuilder->>TraceBuffer: Hydrate entries, replay events
    TraceBuffer->>Stage: Restore state, effects, clock
Loading

Possibly related PRs

Suggested reviewers

  • stevana

Poem

G’day, code wranglers, gather ‘round,
We’ve wrangled traits and serde bound!
Messages now can serialize,
Simulations trace and analyze.
With clocks that tick and traces neat,
Our replay’s now a retro treat.
Cheers to code that’s bonza, mate—
Let’s simulate, then celebrate! 🍻🦘


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98898a8 and 132a08a.

📒 Files selected for processing (4)
  • crates/amaru-consensus/src/consensus/validate_header.rs (3 hunks)
  • crates/amaru-consensus/src/peer.rs (1 hunks)
  • crates/amaru-kernel/src/lib.rs (1 hunks)
  • crates/amaru/src/stages/ledger.rs (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • crates/amaru-kernel/src/lib.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • crates/amaru-consensus/src/peer.rs
  • crates/amaru/src/stages/ledger.rs
  • crates/amaru-consensus/src/consensus/validate_header.rs
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Build on ubuntu-latest with target x86_64-unknown-linux-gnu
  • GitHub Check: Snapshots (preprod, 1, 10.1.4)
  • GitHub Check: Build on ubuntu-24.04 with target aarch64-unknown-linux-musl
  • GitHub Check: Build on ubuntu-latest with target riscv32im-risc0-zkvm-elf
  • GitHub Check: Build on windows-latest with target x86_64-pc-windows-msvc
  • GitHub Check: Coverage
  • GitHub Check: Build on macos-latest with target aarch64-apple-darwin
  • GitHub Check: Sanity
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

🔭 Outside diff range comments (3)
crates/pure-stage/src/simulation/inputs.rs (2)

34-48: ⚠️ Potential issue

Possible compile error: cast isn’t on Box<dyn SendData>

Inside the map_err, you’re doing e.0.msg.cast::<Msg>().
Unless SendData : Any and you’ve provided an extension method named cast, this won’t compile (stable Box<dyn Any> uses downcast).

-                        *e.0.msg.cast::<Msg>().expect("message was just boxed")
+                        *e.0
+                             .msg
+                             .downcast::<Msg>()
+                             .expect("message was just boxed")

If SendData doesn’t inherit Any, add it:

pub trait SendData: erased_serde::Serialize + Send + Sync + 'static + std::any::Any {}

41-50: 🛠️ Refactor suggestion

The oneshot rx.await.ok(); swallows send-side errors

If the callee drops the ACK channel without sending, you’ll silently accept it and report Ok(()).
Bubble the error so the caller knows the message never landed.

-                rx.await.ok();
-                Ok(())
+                rx.await.map_err(|_| SenderError::AckDropped)

(Replace SenderError with whatever suits.)
Keeps the contract watertight like Master Chief’s MJOLNIR, yeah?

crates/pure-stage/src/simulation/running.rs (1)

615-637: 🛠️ Refactor suggestion

Duplicate clock-resume logic – DRY it up, mate

resume_clock_internal here is a copy-paste of the one in simulation::resume. Keeping two versions is asking for drift and late-night debugging sessions. Re-export and call the shared helper instead of maintaining this local clone.

🧹 Nitpick comments (28)
crates/pure-stage/src/sender.rs (1)

4-6: Tidy up the doc blurb for extra clarity

The new comment is grand, but a tiny tweak would make it crystal-clear to newcomers: explicitly mention that the generic Msg must implement SendData. Keeps the docs in sync with the rest of the crate’s messaging story.

simulation/amaru-sim/src/echo/message.rs (2)

48-50: Ye’ve got an unused import chillin’ here

SendData never gets a cameo in this test module, so the compiler will nag (or even fail if unused_imports is denied). Either drop it or slap an #[allow(unused_imports)].

-    use pure_stage::SendData;
+    // use pure_stage::SendData; // uncomment when a test needs it

64-67: Hard-coded type_name string is brittle

std::any::type_name::<T>() bakes in the crate path; one refactor and this assertion goes boom like a bad barrel in Borderlands. Consider a looser check (e.g. ends_with) or snapshot testing instead.

-        assert_eq!(
-            env.typetag_name(),
-            "amaru_sim::echo::message::Envelope<amaru_sim::echo::message::EchoMessage>"
-        );
+        assert!(env
+            .typetag_name()
+            .ends_with("Envelope<amaru_sim::echo::message::EchoMessage>"));
Cargo.toml (1)

23-33: Mind the CBOR double-up

We now depend on both minicbor (legacy) and cbor4ii (new hotness). Having two CBOR libs in the workspace can bloat code size and confuse callers (e.g. which encoder do they use?).

Consider consolidating on one crate (preferably cbor4ii since it brings Serde). If a full switch isn’t feasible yet, drop a TODO so folks know the dual stack is intentional.

crates/slot-arithmetic/src/lib.rs (2)

41-45: Serialising error enums: check privacy & bloat

Deriving Serde on SlotArithmeticError leaks the full enum variants over the wire. That’s usually fine for tests/replay, but if these errors bubble out to remote callers we might expose internal slot numbers we’d rather keep private.

If exposure is a concern, wrap the error in an opaque string or add #[serde(skip)] on verbose fields.


345-353: Same note for TimeHorizonError

Ditto. Make sure we’re comfy shipping these variants outside the process boundary.

crates/pure-stage/src/stage_ref.rs (1)

21-27: Consider adding Eq and Hash to round-out the handle’s ergonomics

Deriving/implementing Eq and Hash (mirroring the PartialEq you just added) lets a StageRef be used as a key in hash-maps or sets—handy for routing tables and the like.
The extra derive costs nothing and won’t change the serialised form because the trait is purely semantic.

-#[derive(serde::Serialize, serde::Deserialize)]
+#[derive(serde::Serialize, serde::Deserialize, Eq, Hash)]
crates/pure-stage/README.md (1)

52-60: Tiny copy-editing nits for smoother reading

  1. Line 57: “weighs too heavily; therefore a different solution is required.”
  2. Line 59: “where the compiler can therefore instantiate …”

Just a dab of polish so future readers don’t trip over the prose, mate.

🧰 Tools
🪛 LanguageTool

[style] ~57-~57: Consider an alternative to strengthen your wording.
Context: ...striction that weighs too heavily, thus a different solution is required. The current design shifts...

(DIFFERENT_ALTERNATIVE)


[typographical] ~59-~59: The word “thus” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...e concrete target type can be named and where thus the compiler can instantiate a `Deseria...

(THUS_SENTENCE)

crates/pure-stage/src/lib.rs (1)

1-13: Public serde module name clashes with upstream crate

Calling the local helper-module serde means downstream users now have both pure_stage::serde and the canonical serde crate in scope. In code it’s easy to fat-finger serde:: and grab the wrong one. A tiny rename (e.g. serde_support) or at least a short doc comment warning consumers would save a future Mick-Dundee sized “that’s not the serde you’re looking for” moment.

simulation/amaru-sim/src/simulator/simulate.rs (1)

79-81: Redundant Serialize / Deserialize bounds?

pure_stage::SendData already promises serialisability, so slapping Serialize + DeserializeOwned directly onto Msg may be belt-and-suspenders. If callers always pass Msg: SendData, consider trimming the extra bounds to keep the generic list from looking like the end-credits of a LOTR movie.

crates/pure-stage/src/simulation/blocked.rs (1)

61-66: Consumes self in assert_breakpoint unlike siblings

assert_breakpoint takes ownership whereas the other assert_* methods borrow. That asymmetry trips muscle memory and prevents chained assertions. Consider taking &self and returning &Effect (or cloning) to keep the API jazz in the same key.

crates/pure-stage/src/simulation/state.rs (1)

19-24: Transition now boxes two SendData objects – watch the heap churn

Swapping &dyn for owned Box<dyn SendData> in the transition signature means every message/state hop allocates. If perf becomes a bottleneck during long replays, maybe switch to &dyn or smart-pointer tricks (à la Arc<dyn SendData>) to keep the garbage compactor quiet.

crates/amaru-consensus/src/consensus/store.rs (1)

206-209: No error propagation on insert failures

BTreeMap::insert can’t fail, but if we ever migrate to a fallible store (sled, rocksdb, etc.) this silent Ok(()) blanket will swallow IO errors. Tiny note to future devs: bubble up the underlying error once we switch to real storage.

crates/pure-stage/src/logging/layer.rs (1)

23-27: G'day mate, looks like we've got some unfinished business here!

The WIP note combined with #![allow(unused)] is a bit like leaving the debug mode on in your favorite RPG - fine for development, but not great for production. Consider removing these once the implementation is complete, yeah?

-// ********************************************************
-// NOTE: This is a work in progress.
-// ********************************************************
-
-#![allow(unused)]
crates/pure-stage/src/tokio.rs (1)

49-55: Hey mate, that empty advance_to could use some context!

Like an NPC with no dialogue, an empty method implementation leaves players wondering what's going on. A comment explaining why Tokio's clock can't be advanced would be ace!

 impl Clock for TokioClock {
     fn now(&self) -> Instant {
         Instant::now()
     }
-    fn advance_to(&self, _instant: Instant) {}
+    fn advance_to(&self, _instant: Instant) {
+        // No-op: Tokio uses real time which cannot be manipulated
+    }
 }
crates/pure-stage/src/time.rs (1)

103-127: Love the humor in those error messages, but they're still panic bombs!

"290 billion years" - that's longer than the entire Mass Effect timeline! But jokes aside, consider using saturating arithmetic or returning Results instead of panicking.

 impl std::ops::Add<Duration> for Instant {
     type Output = Instant;

-    #[allow(clippy::expect_used)]
-    fn add(self, duration: Duration) -> Self {
-        Instant(
-            self.0
-                .checked_add(duration)
-                .expect("simulation is not supposed to run for more than 290 billion years"),
-        )
+    fn add(self, duration: Duration) -> Self {
+        Instant(self.0.saturating_add(duration))
     }
 }
crates/pure-stage/src/trace_buffer.rs (1)

119-151: That's some intricate buffer juggling, mate!

This push logic is more complex than the inventory management in Resident Evil! While it looks correct, the multiple conditions and edge cases make it a bit hard to follow. Consider adding more inline comments or breaking it into smaller helper methods?

 fn push(&mut self, msg: Vec<u8>) {
     if self.max_size == 0 {
         return;
     }

     self.used_size += msg.len();

+    // First pass: remove old messages until we're under max_size
+    // but keep at least min_entries
     #[allow(clippy::expect_used)]
     while self.used_size > self.max_size && self.messages.len() > self.min_entries {
         // ... existing code ...
     }

+    // Second pass: if we can fit by removing just one more entry, do it
     let one_more = self.messages.front().map(|m| m.len()).unwrap_or(0);
     // ... rest of the logic
crates/pure-stage/src/serde.rs (1)

195-208: Sweet optimization with that thread-local buffer!

Like keeping a stash of health potions, reusing buffers saves allocations. But that expect() could bite you if someone passes a value that can't be serialized (like a type with a broken Serialize impl).

Consider returning a Result:

-pub fn to_cbor<T: serde::Serialize>(value: &T) -> Vec<u8> {
+pub fn to_cbor<T: serde::Serialize>(value: &T) -> Result<Vec<u8>, cbor4ii::serde::Error> {
     thread_local! {
         static BUFFER: RefCell<Vec<u8>> = const { RefCell::new(Vec::new()) };
     }
     BUFFER.with_borrow_mut(|buffer| {
-        #[allow(clippy::expect_used)]
-        to_writer(&mut *buffer, value).expect("serialization should not fail");
+        to_writer(&mut *buffer, value)?;
         let ret = Vec::from(buffer.as_slice());
         buffer.clear();
-        ret
+        Ok(ret)
     })
 }
crates/pure-stage/src/output.rs (2)

40-46: Debug output leaks concrete type name – deliberate?

typetag_name() lands the full path (incl. module) in logs. Grand for debugging, but it’ll spam traces like Diablo’s loot drops.
If that’s too noisy in production, consider stripping the path or gating behind debug.


56-66: Silent send failures – decide if that’s kosher

run logs at debug and returns () when the channel is closed. If the output stage disappearing is exceptional in your sim, surfacing an error (or at least warn!) might save some late-night head-scratching.

crates/pure-stage/tests/simulation.rs (1)

84-139: Golden-string snapshot is brittle

Hard-coding the entire pretty-printed trace means one comma change & your test keels over like a beached koala.
Snapshot-testing crates (insta, expect-test) or selective matching (e.g. assert on variants/order) keep intent while dodging churn.

crates/pure-stage/src/simulation/replay.rs (1)

124-146: Status helpers unwrap() on missing stages – convert to polite look-ups

is_running/is_idle/is_failed panic when the caller passes an unknown stage name. A graceful false (or Option<bool>) would make the API as friendly as a pub-quiz mate.

crates/pure-stage/src/effect.rs (1)

125-131: Clock skew risk – using wall-clock for call timeout

now = self.clock.now() bypasses the effect system’s Clock effect, meaning replays driven purely by trace won’t re-emit this instant.
If determinism is king, consider funnelling time via the existing clock() effect so traces capture the read.

crates/pure-stage/src/simulation/running.rs (2)

411-414: Missing trace entry for Clock resume

We resume the stage but don’t push_resume/push_suspend for the clock response. Replay might become inconsistent. Consider logging it the same way as other resumes.


398-406: Back-pressure not observable in trace

When the mailbox is full we queue the (from,msg) in senders, but no trace entry records this drop/park. If replay fidelity matters, emit a trace event here.

crates/pure-stage/src/simulation/resume.rs (1)

93-116: Trace only records Receives

resume_clock_internal (and siblings) don’t touch TraceBuffer, unlike resume_receive_internal. If replay depends on matching resumes, consider adding trace hooks for the other resume paths as well.

crates/pure-stage/src/stagegraph.rs (1)

46-50: dummy_response leaks if used at scale

oneshot::channel().0 allocates a real channel that’s never received. For mass-deserialisation this leaks. Prefer Sender::new() via tokio::sync::oneshot::Sender::default() or a custom NoopSender that does nothing.

crates/pure-stage/src/simulation.rs (1)

144-148: Expose TraceBuffer sizing knobs

with_trace_buffer just swaps in a ready buffer. Offering a builder-style with_trace_capacity(messages, bytes) would save callers the extra dance and encourage consistent sizing.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b67cec and 6991a84.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (41)
  • Cargo.toml (3 hunks)
  • crates/amaru-consensus/Cargo.toml (1 hunks)
  • crates/amaru-consensus/src/consensus/mod.rs (1 hunks)
  • crates/amaru-consensus/src/consensus/store.rs (4 hunks)
  • crates/amaru-consensus/src/consensus/store_header.rs (1 hunks)
  • crates/amaru-consensus/src/consensus/validate_header.rs (3 hunks)
  • crates/amaru-consensus/src/peer.rs (1 hunks)
  • crates/amaru-kernel/src/lib.rs (1 hunks)
  • crates/amaru-kernel/src/protocol_parameters.rs (1 hunks)
  • crates/amaru/src/stages/ledger.rs (1 hunks)
  • crates/ouroboros-traits/Cargo.toml (1 hunks)
  • crates/ouroboros-traits/src/praos/mod.rs (1 hunks)
  • crates/pure-stage/Cargo.toml (1 hunks)
  • crates/pure-stage/README.md (1 hunks)
  • crates/pure-stage/src/effect.rs (19 hunks)
  • crates/pure-stage/src/lib.rs (1 hunks)
  • crates/pure-stage/src/logging.rs (1 hunks)
  • crates/pure-stage/src/logging/layer.rs (1 hunks)
  • crates/pure-stage/src/output.rs (2 hunks)
  • crates/pure-stage/src/receiver.rs (1 hunks)
  • crates/pure-stage/src/sender.rs (1 hunks)
  • crates/pure-stage/src/serde.rs (1 hunks)
  • crates/pure-stage/src/simulation.rs (10 hunks)
  • crates/pure-stage/src/simulation/blocked.rs (1 hunks)
  • crates/pure-stage/src/simulation/inputs.rs (2 hunks)
  • crates/pure-stage/src/simulation/replay.rs (1 hunks)
  • crates/pure-stage/src/simulation/resume.rs (1 hunks)
  • crates/pure-stage/src/simulation/running.rs (35 hunks)
  • crates/pure-stage/src/simulation/state.rs (2 hunks)
  • crates/pure-stage/src/stage_ref.rs (1 hunks)
  • crates/pure-stage/src/stagegraph.rs (7 hunks)
  • crates/pure-stage/src/time.rs (2 hunks)
  • crates/pure-stage/src/tokio.rs (7 hunks)
  • crates/pure-stage/src/trace_buffer.rs (1 hunks)
  • crates/pure-stage/src/types.rs (4 hunks)
  • crates/pure-stage/src/util.rs (0 hunks)
  • crates/pure-stage/tests/simulation.rs (12 hunks)
  • crates/pure-stage/tests/tokio.rs (1 hunks)
  • crates/slot-arithmetic/src/lib.rs (2 hunks)
  • simulation/amaru-sim/src/echo/message.rs (1 hunks)
  • simulation/amaru-sim/src/simulator/simulate.rs (5 hunks)
💤 Files with no reviewable changes (1)
  • crates/pure-stage/src/util.rs
🧰 Additional context used
🧠 Learnings (1)
crates/pure-stage/src/simulation/state.rs (2)
Learnt from: rkuhn
PR: pragma-org/amaru#206
File: crates/pure-stage/src/simulation/state.rs:4-8
Timestamp: 2025-05-05T08:15:24.192Z
Learning: The `State` trait in the pure-stage crate already requires `Send` with its definition: `pub trait State: Any + fmt::Debug + Send + 'static`, making additional `+ Send` bounds redundant when using `Box<dyn State>`.
Learnt from: rkuhn
PR: pragma-org/amaru#206
File: crates/pure-stage/src/simulation/state.rs:4-8
Timestamp: 2025-05-05T08:15:24.192Z
Learning: The `State` trait in the pure-stage crate already requires `Send` with its definition: `pub trait State: Any + fmt::Debug + Send + 'static`, making additional `+ Send` bounds redundant when using `Box<dyn State>`.
🧬 Code Graph Analysis (5)
simulation/amaru-sim/src/echo/message.rs (1)
crates/pure-stage/src/types.rs (1)
  • typetag_name (43-45)
crates/amaru-consensus/src/consensus/store.rs (3)
crates/amaru-kernel/src/network.rs (1)
  • fmt (279-286)
crates/amaru-consensus/src/consensus/store_block.rs (7)
  • load_header (84-86)
  • store_header (88-90)
  • get_nonces (96-98)
  • put_nonces (100-106)
  • era_history (108-110)
  • load_block (92-94)
  • store_block (79-82)
crates/amaru-stores/src/rocksdb/consensus.rs (14)
  • load_header (53-58)
  • load_header (128-130)
  • store_header (61-67)
  • store_header (132-134)
  • get_nonces (69-76)
  • get_nonces (136-138)
  • put_nonces (78-84)
  • put_nonces (140-143)
  • era_history (86-88)
  • era_history (145-147)
  • load_block (90-98)
  • load_block (149-151)
  • store_block (100-106)
  • store_block (153-155)
crates/amaru-consensus/src/consensus/validate_header.rs (3)
crates/amaru-consensus/src/consensus/store_header.rs (4)
  • eq (33-35)
  • new (47-49)
  • default_store (28-30)
  • store (51-57)
crates/pure-stage/src/effect.rs (4)
  • eq (589-661)
  • run (198-198)
  • run (246-248)
  • run (285-287)
crates/amaru-ledger/src/state.rs (5)
  • get_pool (700-710)
  • slot_to_kes_period (712-714)
  • max_kes_evolutions (716-718)
  • latest_opcert_sequence_number (720-723)
  • new (111-131)
crates/pure-stage/src/simulation/blocked.rs (3)
crates/pure-stage/src/types.rs (2)
  • from (163-165)
  • as_ref (151-153)
crates/amaru/src/stages/ledger.rs (1)
  • name (40-42)
crates/pure-stage/src/stage_ref.rs (1)
  • name (53-55)
crates/pure-stage/src/simulation/running.rs (6)
crates/pure-stage/src/simulation/resume.rs (6)
  • resume_call_internal (143-174)
  • resume_clock_internal (93-116)
  • resume_external_internal (206-228)
  • resume_receive_internal (10-49)
  • resume_respond_internal (176-204)
  • resume_send_internal (63-91)
crates/pure-stage/src/trace_buffer.rs (1)
  • take (159-161)
crates/pure-stage/src/effect.rs (12)
  • new (54-66)
  • new (264-266)
  • clock (97-102)
  • m (473-473)
  • m (518-518)
  • run (198-198)
  • run (246-248)
  • run (285-287)
  • call (119-152)
  • at_stage (436-446)
  • fmt (45-50)
  • eq (589-661)
crates/pure-stage/src/simulation/inputs.rs (3)
  • new (13-15)
  • new (25-32)
  • sender (34-53)
crates/pure-stage/src/simulation.rs (4)
  • msg (217-218)
  • run (270-315)
  • state (216-216)
  • stage (195-244)
crates/pure-stage/src/time.rs (1)
  • fmt (44-48)
🪛 LanguageTool
crates/pure-stage/README.md

[style] ~57-~57: Consider an alternative to strengthen your wording.
Context: ...striction that weighs too heavily, thus a different solution is required. The current design shifts...

(DIFFERENT_ALTERNATIVE)


[typographical] ~59-~59: The word “thus” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...e concrete target type can be named and where thus the compiler can instantiate a `Deseria...

(THUS_SENTENCE)

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Build on macos-latest with target aarch64-apple-darwin
  • GitHub Check: Build on windows-latest with target x86_64-pc-windows-msvc
  • GitHub Check: Build on ubuntu-latest with target riscv32im-risc0-zkvm-elf
  • GitHub Check: Build on ubuntu-latest with target x86_64-unknown-linux-gnu
  • GitHub Check: Build on ubuntu-latest with target aarch64-unknown-linux-musl
  • GitHub Check: Snapshots (preprod, 10.1.4)
  • GitHub Check: Sanity
  • GitHub Check: Coverage
🔇 Additional comments (27)
crates/pure-stage/src/receiver.rs (1)

8-8: Doc reference aligns with the StageGraph rename — tidy work, mate

Nothing but a quick spring-clean of the docs, swapping the old link for the shiny new one. Code untouched, compile flags green.

crates/ouroboros-traits/Cargo.toml (1)

17-17: Double-check you’ve got the derive feature on serde

If any structs in this crate pull a “Luke, I am your Serialize/Deserialize” move, you’ll need the derive feature turned on. Unless the workspace root already enables it, consider:

-serde.workspace = true
+serde = { workspace = true, features = ["derive"] }

Better safe than face-palming at compile time.

crates/amaru-consensus/Cargo.toml (1)

21-21: Same serde song, second verse

The consensus crate also leans on serde. Make sure the derive mics are plugged in here too, or the band won’t play.

-serde.workspace = true
+serde = { workspace = true, features = ["derive"] }

Give it a whirl and confirm the derives light up.

crates/pure-stage/tests/tokio.rs (1)

24-24: API rename from sender to input slides in smoothly

Call site updated, test still reads crystal clear. Fire it up and the messages should flow like Mario on Rainbow Road.

crates/pure-stage/src/logging.rs (1)

1-1: Module stub added — remember to ship the goods

mod layer; is the door; make sure layer.rs (or layer/mod.rs) is parked in the same folder so the compiler doesn’t throw a tanty.

crates/amaru-kernel/src/protocol_parameters.rs (1)

312-344: Serde derives FTW

Adding Serialize/Deserialize to GlobalParameters slots perfectly into the project-wide CBOR/JSON push. No concerns from this corner.

crates/amaru-kernel/src/lib.rs (1)

143-147: Double-check Serde wire format for Point

Deriving Serde gives us JSON/Bincode/etc. for free, lovely. Just be sure any on-disk/on-wire consumers weren’t relying on the hand-rolled CBOR layout defined below—Serde’s default sequence encoding might not match.

crates/amaru-consensus/src/peer.rs (1)

16-18: Peer is now Serde-friendly – nice one

Nothing else to add; this keeps the consensus layer in step with the new trace/replay goodies.

crates/amaru/src/stages/ledger.rs (1)

116-116: All good – semantic no-op

Swapping return Err(err) for the expression form Err(err) is kosher inside the match; no behaviour or early-return semantics change.

crates/ouroboros-traits/src/praos/mod.rs (1)

51-58: Confirm Hash<32> implements serde::{Serialize, Deserialize}

Deriving Serde on Nonces is sweet as, but it implicitly requires every field—including tail: Hash<32>—to implement the same traits. If Hash (from amaru_kernel) hasn’t picked up those derives in this PR, the build will face-plant.

Would you run a quick grep or compile check to ensure amaru_kernel::Hash carries Serialize + Deserialize?
If it’s missing, we can either add derives there or gate the Serde impl behind a feature flag.

Cargo.toml (2)

70-71: typetag brings inventory – ensure the linker’s happy

typetag depends on inventory, which in turn uses linker section gymnastics. In-tree crates that build for no_std or exotic targets sometimes hiccup on this. Nothing to fix now, just flagging for future cross-compiles.


88-96: Dev-deps tidy-up looks ace

Love the addition of pretty_assertions; diff-driven tests read like a graphic novel now.

crates/amaru-consensus/src/consensus/mod.rs (1)

78-86: Serde skip attr looks good, but double-check path

Span::none exists in tracing 0.1.40, so the default = "Span::none" should compile. If CI pops a “can’t find function” error, switch to the fully-qualified path ::tracing::Span::none.

Also applies to: 91-92

crates/pure-stage/src/stage_ref.rs (1)

29-33: PartialEq ignores the phantom types – double-check the intention

Right now, StageRef<FooMsg, FooState> will compare equal to a different StageRef<FooMsg, BarState> if the names match.
That’s probably fine (the phantom types are compile-time only), but if the state parameter is used to encode “channel kind” elsewhere this could become a sneaky foot-gun.
Just flagging it so you’re sure the semantics line up with your mental model.

crates/pure-stage/Cargo.toml (1)

15-24: 🛠️ Refactor suggestion

Enable serde/std to avoid duplicate alloc feature gymnastics

You’re explicitly turning on alloc + rc, but not std. That’s perfectly kosher for no_std crates, yet pure-stage is clearly std-bound (thread pools, tokio, etc.).
Turning on the std feature keeps you on the happy path and avoids accidentally compiling two copies of serde (one std, one alloc).

-serde = { workspace = true, features = ["derive", "alloc", "rc"] }
+serde = { workspace = true, features = ["derive", "std", "alloc", "rc"] }

Likely an incorrect or invalid review comment.

crates/pure-stage/src/lib.rs (1)

27-28: Re-exporting typetag leaks a dependency

pub use typetag; turns typetag into part of your public API. That locks you into its semver story and forces every downstream crate to pull exactly the same version. Most folks hide this behind their own new-type or re-export it with pub(crate) visibility. Double-check if you really want to open that kimono.

simulation/amaru-sim/src/simulator/simulate.rs (1)

341-342: Double without_state() call might strip too much

stage is already a StageRef<_, Void> after wire_up; another .without_state() here could yield StageRef<_, ()>, losing the nice Void phantom type we rely on elsewhere. Probably harmless, but worth a quick eyeball to be sure we’re not accidentally changing dispatch behaviour.

crates/pure-stage/src/logging/layer.rs (1)

224-275: Bonzer test coverage there, mate!

Love the thorough testing of different Rust types - tuples, enums, options, the whole shebang! It's like checking all the different weapon types in Monster Hunter. The assertions on exact byte patterns ensure the CBOR format stays consistent.

crates/pure-stage/src/tokio.rs (2)

1-5: Top-notch documentation, legend!

Love how you're guiding folks to write reusable stage construction - it's like teaching players to build modular bases in Minecraft. This kind of architectural guidance is gold!


80-152: Smooth as butter migration to SendData!

The switch from Message/State to SendData is cleaner than a headshot in CS:GO. This unification makes the API more consistent and easier to work with. Well played!

crates/pure-stage/src/time.rs (1)

129-148: Solid implementation of the time system!

The LazyLock for EPOCH is clever - like setting a permanent spawn point in your game world. The tests cover all the edge cases nicely too.

crates/pure-stage/src/trace_buffer.rs (1)

23-68: Clever dual-enum pattern here, champ!

Having both owned and ref variants is like having both single-player and co-op modes - gives you flexibility without sacrificing performance. The serialization attributes are spot on too!

crates/pure-stage/src/serde.rs (1)

89-173: Ripper test suite for the serialization code!

Testing tuples, structs, and enums - you've covered more ground than a completionist in The Witcher 3! The round-trip tests ensure the serialization is rock solid.

crates/amaru-consensus/src/consensus/validate_header.rs (1)

150-158: Double-check trait bounds for Result<Nonces, NoncesError>

run wraps the Result in Box<dyn SendData>. Make sure both Nonces and NoncesError implement SendData; otherwise deserialisation during replay will faceplant.

crates/pure-stage/tests/simulation.rs (1)

175-180: Downcast in breakpoint matcher uses Any – double-check SendData is 'static

The closure relies on (&*msg as &dyn Any).downcast_ref. Ensure every SendData passed here is 'static; otherwise the cast will silently fail and the breakpoint won’t trigger.

crates/pure-stage/src/effect.rs (1)

505-525: assert_call relies on downcast then moves the message – beware double-use

extract(*m.cast::<Msg2>()…) consumes the boxed message. Subsequent equality checks (e.g. in assert_send) on the same effect will now fail.
Maybe clone the value first or leave the original in place.

crates/pure-stage/src/simulation/running.rs (1)

172-175: Zero-delay wake-ups disallowed

assert!(duration > Duration::ZERO) forbids instantaneous waits. That’s fine if intentional, but if callers pass Duration::ZERO (e.g. to yield) this will panic. Double-check the contract.

Comment on lines +32 to +36
impl PartialEq for StoreHeader {
fn eq(&self, _other: &Self) -> bool {
true
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

PartialEq always returns true – this can bite

Two completely different StoreHeaders will compare equal.
If somebody sticks them in a Vec::dedup() or uses them as map keys, chaos will ensue faster than Mario on a Rainbow Road shortcut.

Either remove the impl or compare on an identity (e.g. Arc::as_ptr(&store)).

🤖 Prompt for AI Agents
In crates/amaru-consensus/src/consensus/store_header.rs around lines 32 to 36,
the PartialEq implementation for StoreHeader always returns true, causing all
instances to be considered equal incorrectly. Fix this by either removing the
custom PartialEq impl to use the derived or default behavior, or implement a
proper equality check based on a unique identity such as comparing pointer
addresses with Arc::as_ptr(&store) to ensure only truly identical StoreHeader
instances compare equal.

Comment on lines +23 to +30
#[derive(serde::Serialize, serde::Deserialize)]
pub struct StoreHeader {
#[serde(skip, default = "default_store")]
pub store: Arc<Mutex<dyn ChainStore<Header>>>,
}
fn default_store() -> Arc<Mutex<dyn ChainStore<Header>>> {
Arc::new(Mutex::new(super::store::FakeStore::default()))
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Serialising drops the actual store – replay may load with an empty ledger

#[serde(skip, default = "default_store")] swaps the real Arc<Mutex<dyn ChainStore>> for a fresh in-memory FakeStore on deserialise.
That’s grand for unit tests, but in a live replay you’ll silently lose all persisted headers and later lookups will come up blank.

Consider emitting a stub handle that can re-attach to the real store (e.g. via a registry key) or, at minimum, loudly logging the downgrade so folks don’t wonder why the chain’s vanished.

🤖 Prompt for AI Agents
In crates/amaru-consensus/src/consensus/store_header.rs around lines 23 to 30,
the current serde attribute skips serializing the actual store and replaces it
with a default FakeStore on deserialization, causing loss of the real store data
during replay. To fix this, modify the serialization logic to emit a stub or
identifier that can be used to re-attach to the real store on deserialization,
such as a registry key or handle. Additionally, add explicit logging when
falling back to the FakeStore to alert users that the real store was not
restored, preventing silent data loss.

Comment on lines +298 to +300
#[derive(Debug, PartialEq, serde::Serialize, serde::Deserialize)]
struct State(u64, StageRef<Envelope<EchoMessage>, Void>);

@coderabbitai coderabbitai Bot Jun 14, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

State derives Serialize but contains StageRef

We’re deriving Serde for State, yet it embeds a StageRef. Unless StageRef now carries #[typetag::serde] goodies, this will fail to compile (or worse, serialise into nonsense during replay). If the struct is only used in-process, drop the derive to dodge unnecessary trait churn.

🤖 Prompt for AI Agents
In simulation/amaru-sim/src/simulator/simulate.rs around lines 298 to 300, the
State struct derives Serialize but contains a StageRef field which likely does
not support serialization. To fix this, remove the serde::Serialize and
serde::Deserialize derives from State to avoid compilation errors or incorrect
serialization, assuming State is only used in-process.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all code in this PR does indeed compile

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkuhn Fair dinkum, mate! You're absolutely right - I jumped the gun there like a kangaroo on a trampoline! 🦘

Looking at the broader picture of this PR, I can see that StageRef was actually given the full serde treatment as part of the TraceBuffer functionality. Cheers for the correction - sometimes even us code-reviewing legends need a reality check!

The serialization support across the board here is actually quite comprehensive, which makes perfect sense for the replay functionality you've built. My bad for not connecting those dots properly.


✏️ Learnings added
Learnt from: rkuhn
PR: pragma-org/amaru#263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.110Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment thread crates/pure-stage/src/simulation/blocked.rs
Comment on lines +33 to 36
Idle(Box<dyn SendData>),
Running(BoxFuture<'static, anyhow::Result<Box<dyn SendData>>>),
Failed(String),
}

@coderabbitai coderabbitai Bot Jun 14, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Failed(String) loses error context

Migrating from Failed unit to Failed(String) is grand, but we’re throwing away the original anyhow::Error (backtrace, source chain, the whole shebang). A boxed anyhow::Error or Box<dyn std::error::Error + Send + Sync> would keep detective Pikachu’s evidence intact.

🤖 Prompt for AI Agents
In crates/pure-stage/src/simulation/state.rs around lines 33 to 36, the Failed
variant currently stores a String, which loses the original error context such
as backtrace and source chain. Change the Failed variant to store a boxed
anyhow::Error or Box<dyn std::error::Error + Send + Sync> instead of String to
preserve full error information for better debugging.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can’t do that because trait objects cannot be cloned — and anyhow::Error may look magical but it isn’t an exception.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkuhn Ah, absolutely spot on! You're bang on about the cloning constraint, mate. 🎯

In a simulation system where you need to snapshot, serialize, and replay states, trait objects are indeed a no-go. Even anyhow::Error with all its bells and whistles still can't escape the fundamental limitation that trait objects aren't Clone. When you're building a deterministic replay system like this TraceBuffer, the string representation is the pragmatic choice - you get the error info you need while keeping everything cloneable and serializable.

Cheers for the clarification! The architectural constraints of the simulation framework definitely trump the ideal of preserving full error context here.


✏️ Learnings added
Learnt from: rkuhn
PR: pragma-org/amaru#263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.474Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box<dyn Error> or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +67 to +71
impl PartialEq for ValidateHeader {
fn eq(&self, _other: &Self) -> bool {
true
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Equality implementation always returns true – steer clear of accidental foot-guns

PartialEq that unconditionally yields true can mask logic errors and derail HashMap/Set behaviour faster than a choc-top on a summer day.
Consider either removing PartialEq or providing a meaningful comparison of the members you actually care about.

🤖 Prompt for AI Agents
In crates/amaru-consensus/src/consensus/validate_header.rs around lines 67 to
71, the PartialEq implementation for ValidateHeader always returns true, which
can cause incorrect equality checks and break collections relying on it. Fix
this by either removing the PartialEq implementation if not needed or
implementing eq to compare the relevant fields of ValidateHeader that determine
equality meaningfully.

Comment on lines +73 to +97
fn default_ledger() -> Arc<dyn HasStakeDistribution> {
struct Fake;
impl HasStakeDistribution for Fake {
fn get_pool(
&self,
_slot: amaru_kernel::Slot,
_pool: &amaru_kernel::PoolId,
) -> Option<amaru_ouroboros::PoolSummary> {
unimplemented!()
}

fn slot_to_kes_period(&self, _slot: amaru_kernel::Slot) -> u64 {
unimplemented!()
}

fn max_kes_evolutions(&self) -> u64 {
unimplemented!()
}

fn latest_opcert_sequence_number(&self, _pool: &amaru_kernel::PoolId) -> Option<u64> {
unimplemented!()
}
}
Arc::new(Fake)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

default_ledger panics on every method – supply a harmless stub instead

These unimplemented!() calls will blow up at replay time if any code path touches the fake ledger.
Returning sensible fall-backs (e.g. None, 0) keeps replays crash-free while still signalling “not a real ledger”.

🤖 Prompt for AI Agents
In crates/amaru-consensus/src/consensus/validate_header.rs between lines 73 and
97, the default_ledger function's Fake struct methods use unimplemented!() which
causes panics if called. Replace these with harmless stub implementations that
return safe default values like None for Option returns and 0 for u64 returns to
avoid crashes during replay while indicating this is a non-functional stub.

Comment on lines +40 to +43
pub fn run_trace(&mut self, trace: Vec<TraceEntry>) -> anyhow::Result<()> {
let mut trace_buffer = TraceBuffer::new(0, 0);

for (idx, entry) in trace.into_iter().enumerate() {

@coderabbitai coderabbitai Bot Jun 14, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Creating a TraceBuffer with (0, 0) – is that valid?

If TraceBuffer::new expects positive capacity or drop-count, this will slap you with a divide-by-zero or similar nasties.
Probably safer to reuse the shared buffer or pass a minimal non-zero size.

🤖 Prompt for AI Agents
In crates/pure-stage/src/simulation/replay.rs around lines 40 to 43, the
TraceBuffer is being created with zero values for capacity and drop-count, which
may cause runtime errors if these parameters must be positive. To fix this,
initialize TraceBuffer with a minimal non-zero size or reuse an existing shared
buffer to avoid potential divide-by-zero or similar issues.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, those arguments ensure a buffer that will quickly drop anything you put in.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkuhn Cheers mate! That makes perfect sense - having a drop-happy buffer during replay means you're not actually storing new trace entries, just using the buffer as a temporary mechanism for the replay process. Brilliant design choice, and thanks for setting me straight on that one!


✏️ Learnings added
Learnt from: rkuhn
PR: pragma-org/amaru#263
File: crates/pure-stage/src/simulation/replay.rs:40-43
Timestamp: 2025-06-14T16:40:23.307Z
Learning: TraceBuffer::new(0, 0) creates a buffer that will quickly drop anything put into it. This is intentional behavior, particularly useful in replay scenarios where you don't want to store new trace entries but need the buffer for temporary processing.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +868 to +875
*effect.lock() = Some(Right(response));
let result = pin.as_mut().poll(&mut Context::from_waker(Waker::noop()));

if let Poll::Ready(result) = result {
match result {
Ok(state) => {
trace_buffer.push_state(&name, &state);
data.state = StageState::Idle(state);

@coderabbitai coderabbitai Bot Jun 14, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Clear the air-lock after polling

We shove a Right(response) into effect.lock() before polling, but when the future finishes Ready we never wipe it. The next poll (or another stage) could stumble over stale data in the air-lock. Add a effect.lock().take(); after the Poll::Ready branch.

🤖 Prompt for AI Agents
In crates/pure-stage/src/simulation/running.rs around lines 868 to 875, after
polling the future and handling the Poll::Ready result, the effect lock still
holds the Right(response) value which can cause stale data issues on subsequent
polls. To fix this, add a call to effect.lock().take() immediately after
processing the Poll::Ready branch to clear the stored value and prevent stale
data from affecting future operations.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the protocol is that the other side will have taken the value out (a reference would be of no use)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkuhn Cheers mate! Thanks for clarifying the air-lock protocol - I see now that it's designed as a proper handshake where the stage consumes the value during polling, not just peeking at it. That's a much cleaner design than having to manually mop up afterwards. Appreciate you setting the record straight on how the communication protocol actually works! 🍻


✏️ Learnings added
Learnt from: rkuhn
PR: pragma-org/amaru#263
File: crates/pure-stage/src/simulation/running.rs:868-875
Timestamp: 2025-06-14T16:41:13.041Z
Learning: In the pure-stage simulation framework, the effect air-lock protocol is designed so that when a stage is polled, the stage implementation consumes/takes the value from the effect lock during polling. There's no need to manually clear the effect lock after Poll::Ready because "the other side will have taken the value out" - this is by design, not a bug.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +192 to +197
impl<T: SendData> PartialEq for MpscSender<T> {
fn eq(&self, _other: &Self) -> bool {
true
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

PartialEq always ‑-true-- : looks dodgy

impl PartialEq for MpscSender<T> (and Receiver) returns true unconditionally. Two distinct channel handles compare equal, which can mask bugs in state comparisons or test assertions. Either implement a meaningful equality (e.g. compare as_ptr()) or drop PartialEq.

🤖 Prompt for AI Agents
In crates/pure-stage/src/types.rs around lines 192 to 197, the PartialEq
implementation for MpscSender<T> always returns true, causing distinct channel
handles to compare equal incorrectly. Fix this by implementing a meaningful
equality check, such as comparing the internal pointers using as_ptr(), or
remove the PartialEq implementation entirely if meaningful comparison is not
feasible.


use super::DecodedChainSyncEvent;

#[derive(serde::Serialize, serde::Deserialize)]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abailly @stevana We need to get rid of all these active pieces of state within the pure stages, all effects need to be moved to ExternalEffect. This will then mean that we only need to deal with skipping this store when persisting the ExternalEffect — deserialization is optional for those because it is not needed for replaying a trace.

}

#[derive(Clone)]
#[derive(Clone, serde::Serialize, serde::Deserialize)]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is polluted because it is used a “state” in a pure stage, which is of course not how it is supposed to be.

@codecov

codecov Bot commented Jun 14, 2025

Copy link
Copy Markdown

Codecov Report

Attention: Patch coverage is 74.93755% with 301 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/pure-stage/src/logging/layer.rs 74.43% 45 Missing ⚠️
crates/pure-stage/src/effect.rs 61.61% 38 Missing ⚠️
crates/pure-stage/src/trace_buffer.rs 57.14% 36 Missing ⚠️
crates/pure-stage/src/types.rs 64.28% 30 Missing ⚠️
crates/pure-stage/src/serde.rs 75.65% 28 Missing ⚠️
crates/pure-stage/src/simulation/running.rs 78.46% 28 Missing ⚠️
...s/amaru-consensus/src/consensus/validate_header.rs 7.69% 24 Missing ⚠️
crates/pure-stage/src/simulation/replay.rs 78.57% 21 Missing ⚠️
crates/pure-stage/src/simulation/blocked.rs 60.60% 13 Missing ⚠️
crates/pure-stage/src/simulation/resume.rs 94.40% 9 Missing ⚠️
... and 10 more
Files with missing lines Coverage Δ
crates/amaru-consensus/src/consensus/mod.rs 28.88% <ø> (ø)
crates/amaru-consensus/src/peer.rs 62.50% <ø> (ø)
crates/amaru-kernel/src/lib.rs 57.64% <ø> (ø)
crates/amaru-kernel/src/protocol_parameters.rs 98.51% <ø> (ø)
crates/ouroboros-traits/src/praos/mod.rs 100.00% <ø> (ø)
crates/pure-stage/src/output.rs 97.22% <100.00%> (+0.07%) ⬆️
crates/pure-stage/src/receiver.rs 100.00% <ø> (ø)
crates/pure-stage/src/sender.rs 72.22% <ø> (ø)
crates/pure-stage/src/stage_ref.rs 87.17% <100.00%> (+1.06%) ⬆️
crates/slot-arithmetic/src/lib.rs 82.74% <ø> (ø)
... and 21 more

... and 10 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Signed-off-by: Roland Kuhn <rk@rkuhn.info>

@stevana stevana left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of 31cb1a1 @abailly added the simulation tests to the test suite, the test is called "run_simulator". Looking at the logs from CI on this PR it seems that they didn't run this test. Would be good to rebase and see if it passes.

Signed-off-by: Roland Kuhn <rk@rkuhn.info>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants