Skip to content

Peerless bootstrap#410

Merged
jeluard merged 3 commits into
pragma-org:mainfrom
pgrange:peerless_bootstrap
Sep 4, 2025
Merged

Peerless bootstrap#410
jeluard merged 3 commits into
pragma-org:mainfrom
pgrange:peerless_bootstrap

Conversation

@pgrange

@pgrange pgrange commented Aug 27, 2025

Copy link
Copy Markdown
Contributor

This PR allows one to bootstrap an Amaru node without needing any cardano peer.

To do so, we store the needed chain headers in the data directory of this repository, as they are really small files. That way, they're next to all the other metadata that we currently store. They need to be kept in sync with the rest of the data directory.

data/README.md explains how to generate these files to keep them in sync (make fetch-chain-headers).

The bootstrap process itself has been changed to pick chain headers information from this data directory instead of trying to connect to a peer.

Now with this new fetch-chain-header command we have some duplication with the existing import-headers command. So the import-headers command is deleted as it does not seem to be of use anymore.

Summary by CodeRabbit

  • New Features

    • Added a CLI subcommand to fetch chain headers and save them as CBOR files.
    • Introduced a Makefile target (fetch-chain-headers) to generate headers into data//headers.
  • Improvements

    • Simplified bootstrap and import workflows to use a config directory instead of a peer address.
    • Import headers now reads locally stored headers, improving setup consistency.
  • Documentation

    • Updated data/README with a new step to dump chain headers and where they are stored.

@coderabbitai

coderabbitai Bot commented Aug 27, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Adds a new FetchChainHeaders CLI and Makefile target to fetch headers from a peer into per-network data dirs, refactors import-headers to read local CBOR files, updates bootstrap to use network/config dirs without peer address, wires new subcommand in main/mod, and updates docs.

Changes

Cohort / File(s) Summary
Build targets
Makefile
Adds fetch-chain-headers PHONY target invoking cargo run -- fetch-chain-headers with --peer-address, --config-dir, --network. Simplifies import-headers to only --config-dir. Updates bootstrap to drop --peer-address and pass --config-dir, --ledger-dir, --chain-dir, --network.
CLI wiring
crates/amaru/src/bin/amaru/main.rs, crates/amaru/src/bin/amaru/cmd/mod.rs
Registers new subcommand FetchChainHeaders and routes to cmd::fetch_chain_headers::run. Replaces mod import_headers with mod fetch_chain_headers import. Updates doc comment for ImportHeaders.
New command: fetch headers from peer
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
Introduces CLI args (network, config_dir, peer_address). Implements run, fetch_headers_for_network, and async fetch_headers using ChainSync to pull up to N headers per point; writes header.<slot>.<hash>.cbor under <config_dir>/<network>/headers.
Import headers from local files
crates/amaru/src/bin/amaru/cmd/import_headers.rs
Replaces network-based import with file-based import from <config_dir>/<network>/headers. Changes Args to { network, chain_dir, config_dir }. Adds import_headers_for_network(network, config_dir, chain_dir). Removes prior peer/session logic.
Bootstrap flow update
crates/amaru/src/bin/amaru/cmd/bootstrap.rs
Removes peer_address from Args. Switches to import_headers_for_network(network, network_dir, chain_dir). Drops Point/peer-based header loop and config-dir header parsing in favor of network-dir driven import.
Docs
data/README.md
Adds step to run make fetch-chain-headers to dump headers as CBOR to data/<network>/headers. Minor formatting tweak.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant MK as Makefile
  participant CLI as Amaru CLI
  participant FCH as fetch_chain_headers
  participant Peer as Upstream Node
  participant FS as Filesystem

  U->>MK: make fetch-chain-headers
  MK->>CLI: cargo run -- fetch-chain-headers --network --config-dir --peer-address
  CLI->>FCH: run(args)
  FCH->>FS: Read <config_dir>/<network>/headers.json (points)
  loop per point (max N headers)
    FCH->>Peer: ChainSync: request next header
    alt RollForward
      Peer-->>FCH: Header (CBOR)
      FCH->>FS: Write header.<slot>.<hash>.cbor
    else RollBackward/Await
      FCH-->>FCH: continue
    end
  end
  FCH-->>CLI: ok
  CLI-->>U: done
Loading
sequenceDiagram
  autonumber
  actor U as User
  participant MK as Makefile
  participant CLI as Amaru CLI
  participant IH as import_headers
  participant FS as Filesystem
  participant DB as RocksDBStore

  U->>MK: make import-headers
  MK->>CLI: cargo run -- import-headers --config-dir
  CLI->>IH: run(args)
  IH->>FS: List <config_dir>/<network>/headers/*.cbor
  loop each file
    IH->>FS: Read CBOR
    IH->>DB: Decode and store header by hash
  end
  IH-->>CLI: ok
  CLI-->>U: done
Loading
sequenceDiagram
  autonumber
  actor U as User
  participant MK as Makefile
  participant CLI as Amaru CLI
  participant BS as bootstrap
  participant IH as import_headers_for_network

  U->>MK: make bootstrap
  MK->>CLI: cargo run -- bootstrap --config-dir --ledger-dir --chain-dir --network
  CLI->>BS: run(args)
  BS->>IH: import_headers_for_network(network, network_dir, chain_dir)
  IH-->>BS: ok
  BS-->>CLI: continue bootstrap
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • KtorZ
  • jeluard
  • abailly

Poem

New headers in the stash, tidy as a barista’s pour,
We fetch, we file, then import—no peer chat anymore.
Like swapping boss fights for loot runs on shore,
CBOR gems sparkle in data/…/headers galore.
Ship it, mate—another quest off the lore. 🪄⛓️

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@KtorZ

KtorZ commented Aug 27, 2025

Copy link
Copy Markdown
Contributor

Somewhat related to #406

@pgrange pgrange force-pushed the peerless_bootstrap branch from 11c0da2 to 5a89fe7 Compare August 27, 2025 13:21

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
crates/amaru/src/bin/amaru/cmd/mod.rs (1)

18-24: Duplication risk with import_headers module.

Now that fetch_chain_headers exists and bootstrap reads on-disk CBOR, do we still need import_headers? If both stay, consider deprecating one, or making one delegate to the other to avoid drift.

Happy to draft a small deprecation shim or merge the code paths if you confirm the desired UX.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 11c0da2 and 5a89fe7.

📒 Files selected for processing (6)
  • Makefile (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (2 hunks)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
  • data/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
  • crates/amaru/src/bin/amaru/main.rs
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
  • Makefile
🧰 Additional context used
🪛 LanguageTool
data/README.md

[grammar] ~90-~90: There might be a mistake here.
Context: ...sed. 4. Dump chain headers as CBOR Go to home directory cd .. and make fetch-c...

(QB_NEW_EN)


[grammar] ~97-~97: There might be a mistake here.
Context: ...n be used to bootstraped the amaru node.

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
data/README.md

92-92: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build riscv32
  • GitHub Check: Test coverage
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)

Comment thread crates/amaru/src/bin/amaru/cmd/mod.rs
Comment thread data/README.md Outdated
@pgrange pgrange force-pushed the peerless_bootstrap branch from 5a89fe7 to 24cd68b Compare August 27, 2025 13:48

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
data/README.md (1)

91-95: Add code-fence language and tighten wording/path.

Small doc polish so linters stop nagging and readers don’t guess the shell.

-```
-make fetch-chain-headers
-```
+```bash
+make fetch-chain-headers
+```

-This will add `chain.<slot>.<hash>.cbor` files to data/<network>/headers, which the bootstrap command consumes to initialize the Amaru node.
+This writes `chain.<slot>.<hash>.cbor` files to `data/<NETWORK>/headers/`, which the bootstrap command consumes to initialize the Amaru node.
🧹 Nitpick comments (4)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (2)

72-75: Update clap docs: no more headers.json; fix typo.

Docs still mention headers.json, but the code now scans a headers/ directory of CBOR files. Let’s keep the CLI help honest.

-/// * `data/preview/snapshots.json`: a list of `Snapshot` vaalues,
+/// * `data/preview/snapshots.json`: a list of `Snapshot` values,
 /// * `data/preview/nonces.json`: a list of `InitialNonces` values,
-/// * `data/preview/headers.json`: a list of `Point`s.
+/// * `data/preview/headers/`: CBOR-encoded headers (`chain.<slot>.<hash>.cbor`) consumed by bootstrap.

124-131: Optional: deterministic import order.

Reading dir order is OS-dependent. Sorting by filename (slot prefix) makes runs reproducible.

-    for entry in std::fs::read_dir(config_dir.join("headers"))? {
-        let entry = entry?;
-        let path = entry.path();
+    let mut entries: Vec<_> = std::fs::read_dir(config_dir.join("headers"))?
+        .filter_map(|e| e.ok().map(|e| e.path()))
+        .collect();
+    entries.sort();
+    for path in entries {
         if path.is_file()
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (2)

225-235: Tiny I/O tweak: create the headers dir once, not per header.

We call create_dir_all on every RollForward. Pre-create in run() to save syscalls; keeps the hot path lean without touching the async/blocking trade-off.

 pub async fn run(args: Args) -> Result<(), Box<dyn Error>> {
     info!(config=?args.config_dir, peer=%args.peer_address, network=%args.network,
           "fetching chain headers",
     );
     let network = args.network;
     let network_dir = args.config_dir.join(&*network.to_string());
 
+    // Ensure destination exists up-front.
+    std::fs::create_dir_all(network_dir.join("headers"))?;
+
     fetch_headers_for_network(network, &args.peer_address, &network_dir).await?;
 
     Ok(())
 }

221-236: Optional: avoid unwrap on header decode (mirror bootstrap).

Chainsync should hand us valid CBOR, but defensive coding beats save-scumming later. If decoding fails, log and continue.

-            let header: Header = from_cbor(&content.cbor).unwrap();
+            let Some(header) = from_cbor::<Header>(&content.cbor) else {
+                tracing::warn!("Skipping malformed header from peer response");
+                return Ok(Continue);
+            };
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5a89fe7 and 24cd68b.

📒 Files selected for processing (6)
  • Makefile (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
  • data/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • crates/amaru/src/bin/amaru/cmd/mod.rs
  • Makefile
  • crates/amaru/src/bin/amaru/main.rs
🧰 Additional context used
🧬 Code graph analysis (2)
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (3)
crates/amaru/src/bin/amaru/cmd/mod.rs (1)
  • connect_to_peer (31-37)
crates/minicbor-extra/src/lib.rs (1)
  • from_cbor (31-33)
crates/progress-bar/src/terminal_progress_bar.rs (1)
  • new_terminal_progress_bar (24-29)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3)
crates/amaru-kernel/src/lib.rs (2)
  • default_chain_dir (816-818)
  • default_ledger_dir (812-814)
crates/minicbor-extra/src/lib.rs (1)
  • from_cbor (31-33)
crates/amaru-stores/src/rocksdb/mod.rs (3)
  • new (161-173)
  • new (204-210)
  • new (283-287)
🪛 markdownlint-cli2 (0.17.2)
data/README.md

91-91: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build riscv32
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build aarch64/linux
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Test coverage

Comment thread crates/amaru/src/bin/amaru/cmd/bootstrap.rs Outdated
Comment thread crates/amaru/src/bin/amaru/cmd/bootstrap.rs Outdated
Comment thread crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs Outdated
Comment thread crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs Outdated
Comment thread crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
Comment thread data/README.md Outdated
@abailly

abailly commented Sep 2, 2025

Copy link
Copy Markdown
Contributor

One thing I noticed while trying to automate deployment of amaru on one of my machines is the fact that bootstrapping currently depends on the source code, eg. having the data/ directory around. Perhaps we should start thinking of producing signed snapshots that would be available for download from some stable location? Mithril integration does not seem to be "just around the corner" but this would be a first step in that direction.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
data/README.md (4)

91-93: Add language to code fence for linting and readability

Mark the block as bash so markdownlint chills and readers know it’s a shell snippet.

-```
+```bash
 make fetch-chain-headers

---

`87-87`: **Promote to a subsection instead of a numbered step**

This reads like a new flow, not Step 4 of the previous sequence. A small header makes it clearer, mate.


```diff
-4. Dump chain headers as CBOR
+### Dump chain headers as CBOR

95-95: Clarify path, escape placeholders, and note CONFIG override

Angle brackets can vanish in Markdown and the CONFIG_FOLDER override is handy to call out.

-This will add `chain.<slot>.<hash>.cbor` files to data/<network>/headers, which the bootstrap command consumes to initialize the Amaru node.
+This writes `chain.<slot>.<hash>.cbor` files to `data/${NETWORK}/headers/` (or `${CONFIG_FOLDER}/${NETWORK}/headers/` if you override `CONFIG_FOLDER`), which the bootstrap command consumes to initialize the Amaru node.

Quick check: does the Make target always resolve to data/${NETWORK}/headers/ by default?


95-95: Explain the chain prefix (follow-up to jeluard’s question)

A one-liner helps future you: why chain.*.cbor vs header.*.cbor?

Proposed add-after:

  • “Files are prefixed with chain. to distinguish header dumps from ledger snapshots and other artifacts.”
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 24cd68b and c2da95e.

📒 Files selected for processing (6)
  • Makefile (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
  • data/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • crates/amaru/src/bin/amaru/cmd/mod.rs
  • crates/amaru/src/bin/amaru/main.rs
  • Makefile
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
data/README.md

91-91: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build riscv32
  • GitHub Check: Benches
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Test coverage

@jeluard

jeluard commented Sep 2, 2025

Copy link
Copy Markdown
Contributor

@abailly That is something we discussed with @pgrange and will soon start looking into. Also see #327

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (1)

72-75: Fix outdated help text and typo (“vaalues”) to match the new headers directory.

This clap doc still mentions headers.json. The code reads CBOR files from .../headers/. Let’s keep the CLI help crisp.

Apply this diff:

-    /// * `data/preview/snapshots.json`: a list of `Snapshot` vaalues,
+    /// * `data/preview/snapshots.json`: a list of `Snapshot` values,
     /// * `data/preview/nonces.json`: a list of `InitialNonces` values,
-    /// * `data/preview/headers.json`: a list of `Point`s.
+    /// * `data/preview/headers/`: CBOR header files named `header.<slot>.<hash>.cbor`.
♻️ Duplicate comments (2)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (2)

30-30: Avoid leaking WorkerError into this CLI path.

Using gasket::framework::WorkerError here is a bit odd; we’re not in a worker. Either propagate the concrete error or wrap with io::Error::other/anyhow. Prior question “Why WorkerError?” still stands.

Minimal change (if types permit):

-    .map_err(|_| WorkerError::Panic)?;
+    ?;

If needed for type conversion:

-    .map_err(|_| WorkerError::Panic)?;
+    .map_err(std::io::Error::other)?;

115-115: Don’t unwrap CBOR; skip malformed files instead of panicking.

Unwrapping will bring the whole bootstrap down if one file is borked. Parse defensively and carry on. This echoes prior feedback.

Apply this diff for the decode/store block:

-            let header_from_file: Header = from_cbor(&cbor_data).unwrap();
-            let hash = header_from_file.hash();
-            db.store_header(&hash, &header_from_file)
-                .map_err(|_| WorkerError::Panic)?;
+            match from_cbor::<Header>(&cbor_data) {
+                Some(header) => {
+                    let hash = header.hash();
+                    db.store_header(&hash, &header)
+                        .inspect_err(|reason| tracing::error!(hash=%hex::encode(hash), file=%path.display(), reason=%reason, "Failed to store header"))
+                        .map_err(|_| WorkerError::Panic)?;
+                }
+                None => {
+                    tracing::warn!(file=%path.display(), "Skipping malformed header CBOR");
+                    continue;
+                }
+            }

And you can drop the allowance:

-#[allow(clippy::unwrap_used)]

Also applies to: 132-143

🧹 Nitpick comments (5)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3)

85-87: Log resolved paths, not the raw args.

Right now we log None when defaults are used. Shift the log after defaults are computed so operators see the actual dirs. No dramas.

Apply this diff:

-    info!(config=?args.config_dir, ledger_dir=?args.ledger_dir, chain_dir=?args.chain_dir, network=%args.network,
-          "bootstrapping",
-    );
+    // Log after resolving defaults
+    info!(config=?args.config_dir, ledger_dir=?ledger_dir, chain_dir=?chain_dir, network=%args.network, "bootstrapping");

121-127: Gracefully handle missing headers directory.

If data/<network>/headers/ isn’t present, a friendly skip beats a hard error. Handy for folks who only want ledger snapshots on first run.

Apply this diff:

-    for entry in std::fs::read_dir(config_dir.join("headers"))? {
+    let headers_dir = config_dir.join("headers");
+    if !headers_dir.exists() {
+        tracing::info!(dir=%headers_dir.display(), "No headers directory; skipping header import");
+        return Ok(());
+    }
+    for entry in std::fs::read_dir(&headers_dir)? {

197-215: Add HTTP client timeouts (and UA) for snapshot downloads.

Default reqwest has no per-request timeout; slow mirrors could hang the whole show like a boss fight with infinite health.

Example:

use std::time::Duration;

let client = reqwest::Client::builder()
    .timeout(Duration::from_secs(60))
    .user_agent(concat!("amaru/", env!("CARGO_PKG_VERSION")))
    .build()
    .map_err(|e| BootstrapError::Io(std::io::Error::other(e)))?;
data/README.md (2)

91-93: Specify the code fence language.

Silences markdownlint (MD040) and improves highlighting.

Apply this diff:

-```
+```bash
 make fetch-chain-headers

---

`95-95`: **Style nit: format the path as code.**

Tiny readability boost.


Apply this diff:

```diff
-This will add `header.<slot>.<hash>.cbor` files to data/<network>/headers, which the bootstrap command consumes to initialize the Amaru node.
+This will add `header.<slot>.<hash>.cbor` files to `data/<network>/headers/`, which the bootstrap command consumes to initialize the Amaru node.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c2da95e and db86688.

📒 Files selected for processing (6)
  • Makefile (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
  • data/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/amaru/src/bin/amaru/cmd/mod.rs
  • crates/amaru/src/bin/amaru/main.rs
  • Makefile
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
PR: pragma-org/amaru#392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
🧬 Code graph analysis (1)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3)
crates/amaru-kernel/src/lib.rs (2)
  • default_chain_dir (816-818)
  • default_ledger_dir (812-814)
crates/minicbor-extra/src/lib.rs (1)
  • from_cbor (31-33)
crates/amaru/src/bin/amaru/cmd/import_ledger_state.rs (1)
  • fs (90-93)
🪛 markdownlint-cli2 (0.17.2)
data/README.md

91-91: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build riscv32
  • GitHub Check: Benches
  • GitHub Check: Test coverage
🔇 Additional comments (2)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (2)

21-27: Imports look right for the new flow.

Bringing IsHeader into scope for hash() and wiring RocksDBStore is spot on. Good on ya.


110-112: Nice pivot to on-disk header CBORs.

Dropping network dependency during bootstrap will make life easier in CI and air-gapped setups.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1)

22-23: Drop the gasket dependency from this CLI.

We already discussed avoiding gasket here; using WorkerError/AsWorkError leaks worker semantics into a one-off command and complicates control flow (as seen with the timeout). Consider swapping to anyhow/thiserror and returning Result<What, anyhow::Error> from helpers.

🧹 Nitpick comments (5)
data/README.md (2)

91-93: Add a language to the fenced code block (markdownlint MD040).

Use a bash fence so editors render it nicely.

-```
+```bash
 make fetch-chain-headers

---

`95-95`: **Clarify the output path placeholder.**

Use a consistent placeholder with the rest of the doc (NETWORK is used above).


```diff
-This will add `header.<slot>.<hash>.cbor` files to data/<network>/headers, which the bootstrap command consumes to initialize the Amaru node.
+This will add `header.<slot>.<hash>.cbor` files to `data/<NETWORK>/headers`, which the bootstrap command consumes to initialize the Amaru node.
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (3)

175-176: Minor log tweak.

Plural reads nicer in logs.

-    info!(total = count, "header_fetched");
+    info!(total = count, "headers_fetched");

68-81: Default peer address may surprise; consider making it required.

Hardcoding 127.0.0.1:3001 is convenient locally but easy to forget in scripts. Making it required (or at least surfacing the network default in docs) reduces “why am I not getting any headers?” moments.


95-116: Question the “2 headers per point” heuristic.

FIXME is still open. If bootstrap only needs the immediate successor to validate a boundary, fine; otherwise consider making this configurable (flag) and defaulting to a safer value.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between db86688 and ac63e4f.

📒 Files selected for processing (6)
  • Makefile (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
  • data/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/amaru/src/bin/amaru/cmd/mod.rs
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
  • crates/amaru/src/bin/amaru/main.rs
  • Makefile
🧰 Additional context used
🧬 Code graph analysis (1)
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (3)
crates/amaru/src/bin/amaru/cmd/mod.rs (1)
  • connect_to_peer (31-37)
crates/minicbor-extra/src/lib.rs (1)
  • from_cbor (31-33)
crates/progress-bar/src/terminal_progress_bar.rs (1)
  • new_terminal_progress_bar (24-29)
🪛 GitHub Actions: Coding Practices
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs

[error] 137-141: cargo clippy-amaru failed with E0308: incorrect arguments to Stage::new in crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs:137. Stage::new expects (peer: Peer, chain_sync: Client, intersection: Vec); provided (peer_session.clone(), vec![point.clone()], Arc::new(RwLock::new(true))).

🪛 markdownlint-cli2 (0.17.2)
data/README.md

91-91: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build riscv32
  • GitHub Check: Benches
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Test coverage

Comment thread crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs Outdated
Comment thread crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs Outdated
Comment thread crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (3)

135-146: Fix Stage::new args (compile error E0308) and drop the stray RwLock.

You’re passing a PeerSession and an RwLock; Stage::new wants (Peer, ChainSync client, Vec). This is the blocker in CI. The snippet also lets you delete the RwLock import. Like switching to the right dialogue option to progress the quest.

@@
-    let pull = pull::Stage::new(
-        peer_session.clone(),
-        vec![point.clone()],
-        Arc::new(RwLock::new(true)),
-    );
-
-    pull.find_intersection().await?;
-
-    let mut peer_client = peer_session.lock().await;
+    // Find intersection first with a dedicated ChainSync client
+    {
+        let mut guard = peer_session.lock().await;
+        let chain_sync_for_intersection = (*guard).chainsync();
+        let pull = pull::Stage::new(
+            Peer::new(peer_address),
+            chain_sync_for_intersection,
+            vec![point.clone()],
+        );
+        pull.find_intersection().await?;
+    }
+
+    let mut peer_client = peer_session.lock().await;
@@
-    let client = (*peer_client).chainsync();
+    let client = (*peer_client).chainsync();

And in the imports:

@@
-    sync::{Arc, RwLock},
+    sync::Arc,

201-206: Don’t abort on a 1s timeout; keep listening.

Transient quiet periods are normal on ChainSync. Treat timeout as “no message yet”, not a failure.

-    match timeout(Duration::from_secs(1), client.recv_while_must_reply()).await {
+    match timeout(Duration::from_secs(1), client.recv_while_must_reply()).await {
         Ok(result) => result
             .map_err(|_| WorkerError::Recv)
             .and_then(|next| handle_response(next, config_dir, count, progress, max)),
-        Err(_) => Err(WorkerError::Retry)?,
+        Err(_) => Ok(Continue),
     }

219-223: Avoid unwrap on CBOR decode; surface context instead.

Bad bytes shouldn’t insta-crash without breadcrumbs.

-            let header: Header = from_cbor(&content.cbor).unwrap();
+            let header: Header = from_cbor(&content.cbor).ok_or_else(|| {
+                tracing::error!("failed to decode header CBOR (bytes: {})", content.cbor.len());
+                WorkerError::Panic
+            })?;
🧹 Nitpick comments (4)
data/README.md (2)

91-93: Add language to fenced block (markdownlint MD040).

Use bash for clarity.

-```
+```bash
 make fetch-chain-headers

---

`95-95`: **Tighten wording and format the path.**

Small doc polish; keeps it crisp and consistent with earlier sections.


```diff
-This will add `header.<slot>.<hash>.cbor` files to data/<network>/headers, which the bootstrap command consumes to initialize the Amaru node.
+This writes `header.<slot>.<hash>.cbor` files to `data/<network>/headers/`, which the bootstrap command consumes to initialize the Amaru node.
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (2)

223-234: Optional: use async FS to avoid blocking the runtime.

Swap std::fs and File for tokio::fs equivalents in this async context. Not a blocker, just a smooth frame rate tweak.


173-174: Pluralise the log message.

Tiny, but reads nicer when count > 1.

-    info!(total = count, "header_fetched");
+    info!(total = count, "headers_fetched");
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ac63e4f and 32de5f0.

📒 Files selected for processing (6)
  • Makefile (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
  • data/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/amaru/src/bin/amaru/cmd/mod.rs
  • crates/amaru/src/bin/amaru/main.rs
  • Makefile
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
PR: pragma-org/amaru#392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
📚 Learning: 2025-08-08T14:35:35.562Z
Learnt from: KtorZ
PR: pragma-org/amaru#370
File: crates/amaru-kernel/src/transaction_pointer.rs:36-44
Timestamp: 2025-08-08T14:35:35.562Z
Learning: In the amaru project, when decoding CBOR arrays, prefer using minicbor_extra::heterogenous_array with the expected length to validate definite-length arrays and correctly handle indefinite-length arrays. Example: crates/amaru-kernel/src/transaction_pointer.rs Decode should use heterogenous_array(d, 2, …) instead of ignoring the length from d.array().

Applied to files:

  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
🧬 Code graph analysis (1)
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (3)
crates/amaru/src/bin/amaru/cmd/mod.rs (1)
  • connect_to_peer (31-37)
crates/minicbor-extra/src/lib.rs (1)
  • from_cbor (31-33)
crates/progress-bar/src/terminal_progress_bar.rs (1)
  • new_terminal_progress_bar (24-29)
🪛 GitHub Actions: Coding Practices
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs

[error] 135-139: Rust compile error E0308 during cargo clippy-amaru: incorrect arguments to Stage::new in fetch_chain_headers.rs. The call passes (peer_session.clone(), vec![point.clone()], Arc::new(RwLock::new(true))) but Stage::new has signature new(peer: Peer, chain_sync: Client, intersection: Vec).

🪛 markdownlint-cli2 (0.17.2)
data/README.md

91-91: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build riscv32
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Benches
  • GitHub Check: Build aarch64/linux
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Test coverage

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
Makefile (1)

18-18: Add the new target to .PHONY so make doesn’t get clever about timestamps.

Keeps rebuilds predictable, ya beauty.

-.PHONY: help bootstrap start import-ledger-state import-headers import-nonces download-haskell-config coverage-html coverage-lconv check-llvm-cov
+.PHONY: help bootstrap start import-ledger-state import-headers import-nonces download-haskell-config coverage-html coverage-lconv check-llvm-cov fetch-chain-headers
data/README.md (2)

91-95: Add a language to the code fence and format the path.

Silences MD040 and makes it clearer where files land.

-```
-make fetch-chain-headers
-```
+```bash
+make fetch-chain-headers
+```
-
-This will add `header.<slot>.<hash>.cbor` files to data/<network>/headers, which the bootstrap command consumes to initialize the Amaru node.
+This writes `header.<slot>.<hash>.cbor` files to `data/${NETWORK}/headers/`, which the bootstrap command consumes to initialize the Amaru node.

87-95: Clarify this step hits a peer (PEER_ADDRESS) so folks don’t expect offline magic here.

The bootstrap is peerless, but fetching headers isn’t. A one-liner will save a few head-scratches.

Proposed addition after the code block:

Note: `make fetch-chain-headers` connects to a Cardano node at `$PEER_ADDRESS` (defaults to `127.0.0.1:3001`). Set `PEER_ADDRESS` if your node runs elsewhere.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 32de5f0 and c7c1507.

📒 Files selected for processing (6)
  • Makefile (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
  • data/README.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • crates/amaru/src/bin/amaru/cmd/mod.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
  • crates/amaru/src/bin/amaru/main.rs
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
data/README.md

91-91: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build riscv32
  • GitHub Check: Test coverage
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build aarch64/linux
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)

Comment thread Makefile
@pgrange

pgrange commented Sep 2, 2025

Copy link
Copy Markdown
Contributor Author

I rebased on main and fixed the corresponding compilation issues related to API changes in client management.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
Makefile (1)

85-90: Make target may fail if data dir doesn’t exist; create dirs in-recipe (dup of prior feedback).

As-is, make can error on the prerequisite path. Safer to mkdir -p (including headers) before running cargo.

Apply:

-fetch-chain-headers: $(CONFIG_FOLDER)/$(NETWORK)/ ## Fetch chain headers from the network
+fetch-chain-headers: ## Fetch chain headers from the network
+	@mkdir -p $(CONFIG_FOLDER)/$(NETWORK)/headers
 	cargo run --profile $(BUILD_PROFILE) -- fetch-chain-headers \
 		--peer-address $(PEER_ADDRESS) \
 		--config-dir $(CONFIG_FOLDER) \
 		--network $(NETWORK)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (1)

115-116: Drop unwrap allowance; handle decode failures.

Letting one bad CBOR nuke the run is a bit “git gud” Dark Souls. Parse defensively so we can skip corrupted files.

Apply:

-#[allow(clippy::unwrap_used)]
 async fn import_headers_for_network(
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (2)

15-22: Trim the gasket trait use here (step towards removing the dep in this CLI).

You don’t need AsWorkError; map directly to WorkerError. Baby step toward the earlier “no gasket here” goal.

@@
-use gasket::framework::{AsWorkError, WorkerError};
+use gasket::framework::WorkerError;
@@
-    let next = client.request_next().await.or_restart()?;
+    let next = client.request_next().await.map_err(|_| WorkerError::Recv)?;

Also applies to: 175-176


194-205: Kill the unwrap on CBOR decode; fail gracefully with context.

An invalid header from a peer shouldn’t panic the process.

-#[allow(clippy::unwrap_used)]
 fn handle_response(
@@
-        NextResponse::RollForward(content, tip) => {
-            let header: Header = from_cbor(&content.cbor).unwrap();
+        NextResponse::RollForward(content, tip) => {
+            let header: Header = from_cbor(&content.cbor).ok_or_else(|| {
+                tracing::error!(
+                    len = content.cbor.len(),
+                    tip_slot = tip.0.slot_or_default(),
+                    "failed to decode Header from CBOR"
+                );
+                WorkerError::Panic
+            })?;
🧹 Nitpick comments (8)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (2)

85-87: Helpful structured log — keep it.

This context is gold when something goes sideways.

As a follow-up, considering the repo-shipped headers/snapshots: plan for signature verification (signed manifest + checksums) to avoid sneaky tampering. Think “Mithril-lite” until full Mithril lands.


110-111: Docs/config mismatch: headers now come from files, not headers.json.

Update the bootstrap help/docs to mention data//headers/header.*.cbor as the source of truth.

crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (6)

235-235: Drop redundant clippy allowance.

No unwraps in this arm; the attribute’s dead weight.

-        #[allow(clippy::unwrap_used)]
         NextResponse::RollBackward(point, tip) => {

86-86: Sanitise network dir names (Windows-safe).

NetworkName::to_string() can contain “:”, which makes paths invalid on Windows. Swap “:” for “-” (or similar) before join.

-    let network_dir = args.config_dir.join(&*network.to_string());
+    let network_dir = args.config_dir.join(network_to_dir(&network));

Add this helper somewhere near the bottom/top:

fn network_to_dir(n: &NetworkName) -> String {
    n.to_string().replace(':', "-")
}

208-219: Avoid blocking I/O on the async path.

std::fs::create_dir_all + File::create/write_all blocks the Tokio runtime. Use spawn_blocking or make handle_response async and use tokio::fs + AsyncWriteExt.


206-207: Don’t recompute slot; reuse it.

Tiny tidy-up, but why roll the dice twice?

@@
-            let slot = header.slot();
+            let slot = header.slot();
@@
-            let slot = header.slot();

Also applies to: 222-222


158-159: Pluralise the log message.

More than one header? Make the log read right, mate.

-    info!(total = count, "header_fetched");
+    info!(total = count, "headers_fetched");

110-112: Make “how many headers” a CLI knob.

Hardcoding 2 is a bit like playing Elden Ring with a broken sword. Consider a --max-headers flag with a sensible default.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c7c1507 and 834e61e.

📒 Files selected for processing (6)
  • Makefile (2 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
  • data/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/amaru/src/bin/amaru/cmd/mod.rs
  • data/README.md
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: rkuhn
PR: pragma-org/amaru#149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
PR: pragma-org/amaru#392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
📚 Learning: 2025-08-08T14:35:35.562Z
Learnt from: KtorZ
PR: pragma-org/amaru#370
File: crates/amaru-kernel/src/transaction_pointer.rs:36-44
Timestamp: 2025-08-08T14:35:35.562Z
Learning: In the amaru project, when decoding CBOR arrays, prefer using minicbor_extra::heterogenous_array with the expected length to validate definite-length arrays and correctly handle indefinite-length arrays. Example: crates/amaru-kernel/src/transaction_pointer.rs Decode should use heterogenous_array(d, 2, …) instead of ignoring the length from d.array().

Applied to files:

  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
🧬 Code graph analysis (3)
crates/amaru/src/bin/amaru/main.rs (1)
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1)
  • run (81-91)
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (3)
crates/amaru/src/bin/amaru/cmd/mod.rs (1)
  • connect_to_peer (31-37)
crates/minicbor-extra/src/lib.rs (1)
  • from_cbor (31-33)
crates/progress-bar/src/terminal_progress_bar.rs (1)
  • new_terminal_progress_bar (24-29)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3)
crates/amaru-kernel/src/lib.rs (2)
  • default_chain_dir (818-820)
  • default_ledger_dir (814-816)
crates/minicbor-extra/src/lib.rs (1)
  • from_cbor (31-33)
crates/amaru-stores/src/rocksdb/mod.rs (3)
  • new (161-173)
  • new (204-210)
  • new (284-289)
🪛 checkmake (0.2.2)
Makefile

[warning] 18-18: Missing required phony target "all"

(minphony)


[warning] 18-18: Missing required phony target "clean"

(minphony)


[warning] 18-18: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Benches
  • GitHub Check: Build riscv32
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Test coverage
🔇 Additional comments (5)
Makefile (1)

18-18: LGTM on PHONY wiring.

Nice one, mate — adding fetch-chain-headers to PHONY keeps make’s mood sunny.

crates/amaru/src/bin/amaru/cmd/bootstrap.rs (2)

21-27: Imports look right for the new flow.

Types line up with RocksDB-backed ChainStore and Header decoding.


121-123: Confirm RocksDBStore::new preconditions.

It calls assert_sufficient_snapshots(dir)? Ensure chain_dir satisfies that on a fresh node, else bootstrap will faceplant.

Run a quick repo scan to confirm expectations around chain_dir initialization in amaru-stores.

crates/amaru/src/bin/amaru/main.rs (2)

40-41: New subcommand wired cleanly.

fetch-chain-headers slots into the CLI lineup nicely — chef’s kiss.


119-120: Match arm looks good.

Consistent with existing command dispatch.

Comment thread crates/amaru/src/bin/amaru/cmd/bootstrap.rs Outdated
Comment thread crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
@pgrange

pgrange commented Sep 2, 2025

Copy link
Copy Markdown
Contributor Author

One thing I noticed while trying to automate deployment of amaru on one of my machines is the fact that bootstrapping currently depends on the source code, eg. having the data/ directory around. Perhaps we should start thinking of producing signed snapshots that would be available for download from some stable location? Mithril integration does not seem to be "just around the corner" but this would be a first step in that direction.

@abailly I don't know if that's totally related but @yHSJ pointed me to the CIP Tweag's drafting to propose a canonical ledger state. Just in case we want to look at it and see if that can make sense for these snapshots you're mentioning here.

@pgrange pgrange marked this pull request as draft September 2, 2025 13:15
@abailly

abailly commented Sep 2, 2025

Copy link
Copy Markdown
Contributor

Indeed, I am aware of this initiative which was started in Paris back in April. Yet it will take some time to come to maturity and we'll need to deploy and bootstrap amaru nodes before.

@pgrange pgrange marked this pull request as ready for review September 2, 2025 13:29
@pgrange pgrange requested a review from jeluard September 2, 2025 13:30

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
Makefile (1)

73-78: Fix target prerequisite; create dirs in-recipe so make doesn’t choke.

Having $(CONFIG_FOLDER)/$(NETWORK)/ as a prerequisite without a rule causes make to bail before the recipe runs. Create the dirs in the recipe and ensure the headers subdir exists. Also check headers.json up-front for a nicer error.

-fetch-chain-headers: $(CONFIG_FOLDER)/$(NETWORK)/ ## Fetch chain headers from the network
-	cargo run --profile $(BUILD_PROFILE) -- fetch-chain-headers \
+fetch-chain-headers: ## Fetch chain headers from the network
+	@mkdir -p "$(CONFIG_FOLDER)/$(NETWORK)/headers"
+	@if [ ! -f "$(DATA_FOLDER)/headers.json" ]; then echo "HEADERS_FILE not found: $(DATA_FOLDER)/headers.json"; exit 1; fi
+	cargo run --profile $(BUILD_PROFILE) -- fetch-chain-headers \
 		--peer-address $(PEER_ADDRESS) \
-		--config-dir $(CONFIG_FOLDER) \
+		--config-dir "$(CONFIG_FOLDER)" \
 		--network $(NETWORK)
🧹 Nitpick comments (2)
crates/amaru/src/bin/amaru/main.rs (1)

40-41: Nice addition: subcommand wiring looks clean; consider alias + help text.

Add a visible alias and a short docstring so amaru --help reads better and matches the style of ImportLedgerState.

-    FetchChainHeaders(cmd::fetch_chain_headers::Args),
+    /// Fetch chain headers and write CBOR files under `data/<network>/headers/`.
+    #[clap(visible_alias = "fetch-headers")]
+    FetchChainHeaders(cmd::fetch_chain_headers::Args),

Also, the Bootstrap docs above still mention “possibly a peer to connect to”; that’s outdated now that bootstrap reads on-disk headers. Want me to patch those docs too?

Makefile (1)

18-18: Soothe checkmake: add the usual suspects (all/clean/test).

Not mandatory, but it quiets minphony and makes local workflows predictable.

-.PHONY: help bootstrap start import-ledger-state import-nonces download-haskell-config coverage-html coverage-lconv check-llvm-cov fetch-chain-headers
+.PHONY: help bootstrap start import-ledger-state import-nonces download-haskell-config coverage-html coverage-lconv check-llvm-cov fetch-chain-headers
+.PHONY: all clean test
+
+all: all-ci-checks
+clean: clear-dbs
+test: test-e2e
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 834e61e and 0b3a92a.

📒 Files selected for processing (4)
  • Makefile (2 hunks)
  • crates/amaru/src/bin/amaru/cmd/import_headers.rs (0 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
💤 Files with no reviewable changes (1)
  • crates/amaru/src/bin/amaru/cmd/import_headers.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/amaru/src/bin/amaru/cmd/mod.rs
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: rkuhn
PR: pragma-org/amaru#149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.
🧬 Code graph analysis (1)
crates/amaru/src/bin/amaru/main.rs (2)
crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1)
  • run (81-91)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (1)
  • run (84-113)
🪛 checkmake (0.2.2)
Makefile

[warning] 18-18: Missing required phony target "all"

(minphony)


[warning] 18-18: Missing required phony target "clean"

(minphony)


[warning] 18-18: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Benches
  • GitHub Check: Build riscv32
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Test coverage
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Build riscv32
  • GitHub Check: Build x86_64/windows

Command::ImportHeaders(args) => cmd::import_headers::run(args).await,
Command::ImportNonces(args) => cmd::import_nonces::run(args).await,
Command::Bootstrap(args) => cmd::bootstrap::run(args).await,
Command::FetchChainHeaders(args) => cmd::fetch_chain_headers::run(args).await,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Match arm LGTM.

Execution path is correct and consistent with the new module layout.

Quick sweep to ensure nothing still references the removed ImportHeaders:


🏁 Script executed:

#!/bin/bash
rg -nP 'ImportHeaders|import_headers' -g '!**/target/**' -C2

Length of output: 822


Banish leftover import_headers_for_network references
bootstrap.rs (crates/amaru/src/bin/amaru/cmd/bootstrap.rs:110, 116) still invokes and defines import_headers_for_network – replace these with the new fetch_chain_headers flow or remove them entirely.

🤖 Prompt for AI Agents
In crates/amaru/src/bin/amaru/main.rs around line 114 and in
crates/amaru/src/bin/amaru/cmd/bootstrap.rs around lines 110–116, leftover
references to import_headers_for_network must be removed or replaced with the
new fetch_chain_headers flow: update any bootstrap.rs call sites that invoke
import_headers_for_network to call the new cmd::fetch_chain_headers::run
(passing the same args) or delete the call if the functionality is no longer
needed; remove the import_headers_for_network function definition and any
related imports/uses, clean up resulting unused imports/variables, and run a
build to fix any remaining matches/enum variants or tests that expect the old
function.

@yHSJ

yHSJ commented Sep 2, 2025

Copy link
Copy Markdown
Contributor

Indeed, I am aware of this initiative which was started in Paris back in April. Yet it will take some time to come to maturity and we'll need to deploy and bootstrap amaru nodes before.

The main difference is that Tweag now has funding for this work and is actively pursuing it. I agree, it will absolutely take some time to come to maturity, but maybe work that you're doing can be helpful in their work, and vice-versa. It's worth connecting with them for a conversation at least.

We fetch headers from peer to store them next to the snapshots.
The bootstrap process will start from them instead of fetching
them on demand. That way, we can move towards a peer-less
bootstrap.

Signed-off-by: Pascal Grange <pascal@grange.nom.fr>
Signed-off-by: Pascal Grange <pascal@grange.nom.fr>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
Makefile (1)

73-78: Fix build: drop nonexistent prerequisite and create dirs in-recipe (ensure headers/).

As-is, make will error with “No rule to make target ‘$(CONFIG_FOLDER)/$(NETWORK)/’” before the recipe runs. Create the needed dirs inside the recipe and add headers/ while you’re there. Same tune as earlier feedback.

Apply:

-fetch-chain-headers: $(CONFIG_FOLDER)/$(NETWORK)/ ## Fetch chain headers from the network
-	cargo run --profile $(BUILD_PROFILE) -- fetch-chain-headers \
+fetch-chain-headers: ## Fetch chain headers from the network
+	@mkdir -p $(CONFIG_FOLDER)/$(NETWORK)/headers
+	cargo run --profile $(BUILD_PROFILE) -- fetch-chain-headers \
 		--peer-address $(PEER_ADDRESS) \
 		--config-dir $(CONFIG_FOLDER) \
 		--network $(NETWORK)

Run a dry-run to confirm make no longer trips over the missing rule:

#!/bin/bash
set -euo pipefail
make -n fetch-chain-headers NETWORK=preprod CONFIG_FOLDER=data
🧹 Nitpick comments (1)
Makefile (1)

18-18: Appease checkmake: add standard phony targets (all/clean/test).

Your PHONY list looks good, but the linter nags about missing “all”, “clean”, and “test”. Easy win to wire them to existing targets.

Apply:

-.PHONY: help bootstrap start import-ledger-state import-nonces download-haskell-config coverage-html coverage-lconv check-llvm-cov fetch-chain-headers
+.PHONY: help bootstrap start import-ledger-state import-nonces download-haskell-config coverage-html coverage-lconv check-llvm-cov fetch-chain-headers all clean test

And add these aliases (anywhere reasonable below):

all: all-ci-checks
clean: clear-dbs
test: test-e2e
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6e2490a and dd1c5e5.

📒 Files selected for processing (8)
  • .github/workflows/continuous-integration.yml (0 hunks)
  • Makefile (2 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs (1 hunks)
  • crates/amaru/src/bin/amaru/cmd/import_headers.rs (0 hunks)
  • crates/amaru/src/bin/amaru/cmd/mod.rs (1 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
  • data/README.md (1 hunks)
💤 Files with no reviewable changes (2)
  • .github/workflows/continuous-integration.yml
  • crates/amaru/src/bin/amaru/cmd/import_headers.rs
🚧 Files skipped from review as they are similar to previous changes (5)
  • crates/amaru/src/bin/amaru/cmd/mod.rs
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
  • crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs
  • crates/amaru/src/bin/amaru/main.rs
  • data/README.md
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: rkuhn
PR: pragma-org/amaru#149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.
🪛 checkmake (0.2.2)
Makefile

[warning] 18-18: Missing required phony target "all"

(minphony)


[warning] 18-18: Missing required phony target "clean"

(minphony)


[warning] 18-18: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Benches
  • GitHub Check: Build riscv32
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Test coverage

@jeluard

jeluard commented Sep 3, 2025

Copy link
Copy Markdown
Contributor

@pgrange actually import-headers is needed due to our CI setup (see https://github.com/pragma-org/amaru/blob/main/.github/workflows/continuous-integration.yml#L228).
It will need to be restored but in a way that it relies on your local files probably.

The alternative is to also cache chain.db. Maybe it's not cached due to size constraints cc @KtorZ

import-headers and bootstrap both use the same peerless tactic
to import headers from source `data` directory.

Signed-off-by: Pascal Grange <pascal@grange.nom.fr>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
Makefile (1)

50-58: Stop per-header loop; import once
The current loop invokes cargo run import-headers for each entry in HEADERS_FILE, but the CLI no longer consumes that file or the $HEADER variable—so you end up re-importing the full set N times. Replace it with a single invocation and update the help text.

-import-headers: ## Import headers from $PEER_ADDRESS for demo
-	@if [ ! -f "$(HEADERS_FILE)" ]; then echo "HEADERS_FILE not found: $(HEADERS_FILE)"; exit 1; fi; \
-	HEADERS=$$(jq -r '.[]' $(HEADERS_FILE)); \
-	for HEADER in $$HEADERS; do \
-		cargo run --profile $(BUILD_PROFILE) -- import-headers \
-			--network $(NETWORK) \
-			--chain-dir $(CHAIN_DIR) \
-			--config-dir $(CONFIG_FOLDER); \
-	done
+import-headers: ## Import headers from local CBOR files under $(CONFIG_FOLDER)/$(NETWORK)/headers
+	cargo run --profile $(BUILD_PROFILE) -- import-headers \
+		--network $(NETWORK) \
+		--chain-dir $(CHAIN_DIR) \
+		--config-dir $(CONFIG_FOLDER)

Keeps the Makefile lean—no more redundant reruns.

crates/amaru/src/bin/amaru/cmd/bootstrap.rs (2)

61-76: CLI help is stale (mentions headers.json) + typo.

Docs still reference headers.json and “vaalues”. Update to headers/ directory with CBOR files.

-    /// * `data/preview/snapshots.json`: a list of `Snapshot` vaalues,
-    /// * `data/preview/nonces.json`: a list of `InitialNonces` values,
-    /// * `data/preview/headers.json`: a list of `Point`s.
+    /// * `data/preview/snapshots.json`: a list of `Snapshot` values,
+    /// * `data/preview/nonces.json`: a list of `InitialNonces` values,
+    /// * `data/preview/headers/`: CBOR header files named `header.<slot>.<hash>.cbor`.

174-206: Add HTTP timeouts + simple retry/backoff for snapshot downloads.

Right now a slow endpoint can hang your bootstrap like a boss fight with no pause button. Add client timeouts and a couple of retries.

-    // Create a reqwest client
-    let client = reqwest::Client::new();
+    // Create a reqwest client with sensible timeouts
+    let client = reqwest::Client::builder()
+        .connect_timeout(std::time::Duration::from_secs(10))
+        .timeout(std::time::Duration::from_secs(120))
+        .build()
+        .map_err(|e| BootstrapError::DownloadError("<client build>".into(), e))?;
@@
-        // Download the file
-        let response = client
-            .get(&snapshot.url)
-            .send()
-            .await
-            .map_err(|e| BootstrapError::DownloadError(snapshot.url.clone(), e))?;
+        // Download the file with basic retry/backoff
+        let mut response = None;
+        let mut delay = 1;
+        for attempt in 1..=3 {
+            match client.get(&snapshot.url).send().await {
+                Ok(rsp) => { response = Some(rsp); break; }
+                Err(e) if attempt < 3 => {
+                    tracing::warn!(url=%snapshot.url, attempt, error=%e, "Download failed, retrying...");
+                    tokio::time::sleep(std::time::Duration::from_secs(delay)).await;
+                    delay *= 2;
+                }
+                Err(e) => return Err(BootstrapError::DownloadError(snapshot.url.clone(), e)),
+            }
+        }
-        if !response.status().is_success() {
+        let response = response.unwrap();
+        if !response.status().is_success() {
♻️ Duplicate comments (1)
Makefile (1)

83-88: Don’t use a directory prerequisite; create dirs in-recipe.

As written, make can error if $(CONFIG_FOLDER)/$(NETWORK)/ has no rule. Also ensure headers/ exists before writing. Same note as earlier review.

-fetch-chain-headers: $(CONFIG_FOLDER)/$(NETWORK)/ ## Fetch chain headers from the network
-	cargo run --profile $(BUILD_PROFILE) -- fetch-chain-headers \
+fetch-chain-headers: ## Fetch chain headers from the network
+	@mkdir -p $(CONFIG_FOLDER)/$(NETWORK)/headers
+	cargo run --profile $(BUILD_PROFILE) -- fetch-chain-headers \
 		--peer-address $(PEER_ADDRESS) \
 		--config-dir $(CONFIG_FOLDER) \
 		--network $(NETWORK)
🧹 Nitpick comments (4)
Makefile (1)

18-18: Appease checkmake: add common PHONYs (optional).

If you want the minphony warnings to chill out, add these. Up to you, mate.

-.PHONY: help bootstrap start import-ledger-state import-headers import-nonces download-haskell-config coverage-html coverage-lconv check-llvm-cov fetch-chain-headers
+.PHONY: all clean test help bootstrap start import-ledger-state import-headers import-nonces download-haskell-config coverage-html coverage-lconv check-llvm-cov fetch-chain-headers

Optional stubs (outside this hunk):

all: all-ci-checks
clean: clear-dbs
test:
	@cargo test-amaru
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (2)

158-166: Minor: friendlier errors for missing snapshots file/dir (optional).

Consider hinting the expected path/network when these IO errors fire.


214-232: Consider checksum/signature verification for snapshots.

As a next step toward verifiable bootstrap, carry a SHA-256 or signature per entry in snapshots.json and verify before rename.

crates/amaru/src/bin/amaru/cmd/import_headers.rs (1)

33-37: Align default network with other commands (optional).

Bootstrap uses DEFAULT_NETWORK; this uses Preprod. Consider importing DEFAULT_NETWORK for consistency.

-use clap::Parser;
+use clap::Parser;
+use crate::cmd::DEFAULT_NETWORK;
@@
-        default_value_t = NetworkName::Preprod,
+        default_value_t = DEFAULT_NETWORK,
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between dd1c5e5 and 946e0c9.

📒 Files selected for processing (4)
  • Makefile (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs (3 hunks)
  • crates/amaru/src/bin/amaru/cmd/import_headers.rs (2 hunks)
  • crates/amaru/src/bin/amaru/main.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/amaru/src/bin/amaru/main.rs
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
PR: pragma-org/amaru#392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
PR: pragma-org/amaru#295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru/src/bin/amaru/cmd/bootstrap.rs
🧬 Code graph analysis (2)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (1)
crates/amaru/src/bin/amaru/cmd/import_headers.rs (1)
  • import_headers_for_network (72-103)
crates/amaru/src/bin/amaru/cmd/import_headers.rs (5)
crates/amaru-kernel/src/lib.rs (1)
  • default_chain_dir (826-828)
crates/minicbor-extra/src/lib.rs (1)
  • from_cbor (31-33)
crates/amaru/src/bin/amaru/cmd/import_ledger_state.rs (2)
  • fs (90-93)
  • run (72-83)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (1)
  • run (79-108)
crates/amaru-stores/src/rocksdb/consensus.rs (4)
  • era_history (119-121)
  • era_history (183-185)
  • new (37-49)
  • new (144-149)
🪛 checkmake (0.2.2)
Makefile

[warning] 18-18: Missing required phony target "all"

(minphony)


[warning] 18-18: Missing required phony target "clean"

(minphony)


[warning] 18-18: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build riscv32
  • GitHub Check: Build x86_64/windows
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Test coverage
🔇 Additional comments (2)
crates/amaru/src/bin/amaru/cmd/bootstrap.rs (2)

80-83: LGTM on the logging.

Clear, structured boot log fields. Helps heaps when tailing.


99-106: Nice handoff to file-based headers import.

This aligns with the peerless bootstrap story.

Comment on lines +71 to 103
#[allow(clippy::unwrap_used)]
pub(crate) async fn import_headers_for_network(
network: NetworkName,
config_dir: &Path,
chain_dir: &PathBuf,
) -> Result<(), Box<dyn Error>> {
let era_history = network_name.into();
let mut db = RocksDBStore::new(chain_db_dir, era_history)?;

let peer_client = connect_to_peer(peer_address, &network_name).await?;
let mut client = ChainSyncClient::new(
Peer::new(peer_address),
peer_client.chainsync,
vec![point.clone()],
);

client.find_intersection().await?;

let mut count = 0;

let mut progress: Option<Box<dyn ProgressBar>> = None;

loop {
let what = if client.has_agency() {
request_next_block(&mut client, &mut db, &mut count, &mut progress, max).await?
} else {
await_for_next_block(&mut client, &mut db, &mut count, &mut progress, max).await?
};
match what {
Continue => continue,
Stop => {
if let Some(progress) = progress {
progress.clear()
}
break;
}
}
}
info!(total = count, "header_imported");
Ok(())
}

async fn request_next_block(
client: &mut ChainSyncClient,
db: &mut RocksDBStore,
count: &mut usize,
progress: &mut Option<Box<dyn ProgressBar>>,
max: usize,
) -> Result<What, WorkerError> {
let next = client.request_next().await.or_restart()?;
handle_response(next, db, count, progress, max)
}

async fn await_for_next_block(
client: &mut ChainSyncClient,
db: &mut RocksDBStore,
count: &mut usize,
progress: &mut Option<Box<dyn ProgressBar>>,
max: usize,
) -> Result<What, WorkerError> {
match timeout(Duration::from_secs(1), client.await_next()).await {
Ok(result) => result
.map_err(|_| WorkerError::Recv)
.and_then(|next| handle_response(next, db, count, progress, max)),
Err(_) => Err(WorkerError::Retry)?,
}
}

#[expect(clippy::unwrap_used)]
fn handle_response(
next: NextResponse<HeaderContent>,
db: &mut RocksDBStore,
count: &mut usize,
progress: &mut Option<Box<dyn ProgressBar>>,
max: usize,
) -> Result<What, WorkerError> {
match next {
NextResponse::RollForward(content, tip) => {
let header: Header = from_cbor(&content.cbor).unwrap();
let hash = header.hash();

db.store_header(&hash, &header)
let era_history = network.into();
let mut db = RocksDBStore::new(chain_dir, era_history)?;

for entry in std::fs::read_dir(config_dir.join("headers"))? {
let entry = entry?;
let path = entry.path();
if path.is_file()
&& let Some(filename) = path.file_name().and_then(|f| f.to_str())
&& filename.starts_with("header.")
&& filename.ends_with(".cbor")
{
let mut file = File::open(&path).await
.inspect_err(|reason| tracing::error!(file = %path.display(), reason = %reason, "Failed to open header file"))
.map_err(|_| WorkerError::Panic)?;
let mut cbor_data = Vec::new();
file.read_to_end(&mut cbor_data).await
.inspect_err(|reason| tracing::error!(file = %path.display(), reason = %reason, "Failed to read header file"))
.map_err(|_| WorkerError::Panic)?;
let header_from_file: Header = from_cbor(&cbor_data).unwrap();
let hash = header_from_file.hash();
db.store_header(&hash, &header_from_file)
.map_err(|_| WorkerError::Panic)?;

*count += 1;

let slot = header.slot();
let tip_slot = tip.0.slot_or_default();

if let Some(progress) = progress {
progress.tick(1)
}

if *count >= max || slot == tip_slot {
Ok(Stop)
} else {
Ok(Continue)
}
}
NextResponse::RollBackward(point, tip) => {
info!(?point, ?tip, "roll_backward");
if progress.is_none() {
*progress = Some(new_terminal_progress_bar(
max,
" importing headers (~{eta} left) {bar:70} {pos:>7}/{len:7} ({percent_precise}%)",
));
}
Ok(Continue)
}
NextResponse::Await => Ok(Continue),
}

Ok(())
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Make header ingest resilient and deterministic (no unwrap; sorted; continue on errors).

One dodgy file shouldn’t nuke the run. Also, process files in a stable order.

 #[allow(clippy::unwrap_used)]
 pub(crate) async fn import_headers_for_network(
@@
-    for entry in std::fs::read_dir(config_dir.join("headers"))? {
-        let entry = entry?;
-        let path = entry.path();
-        if path.is_file()
-            && let Some(filename) = path.file_name().and_then(|f| f.to_str())
-            && filename.starts_with("header.")
-            && filename.ends_with(".cbor")
-        {
-            let mut file = File::open(&path).await
-                    .inspect_err(|reason| tracing::error!(file = %path.display(), reason = %reason, "Failed to open header file"))
-                    .map_err(|_| WorkerError::Panic)?;
-            let mut cbor_data = Vec::new();
-            file.read_to_end(&mut cbor_data).await
-                    .inspect_err(|reason| tracing::error!(file = %path.display(), reason = %reason, "Failed to read header file"))
-                    .map_err(|_| WorkerError::Panic)?;
-            let header_from_file: Header = from_cbor(&cbor_data).unwrap();
-            let hash = header_from_file.hash();
-            db.store_header(&hash, &header_from_file)
-                .map_err(|_| WorkerError::Panic)?;
-        }
-    }
+    let headers_dir = config_dir.join("headers");
+    let mut entries: Vec<std::path::PathBuf> = std::fs::read_dir(&headers_dir)?
+        .filter_map(|e| e.ok().map(|e| e.path()))
+        .filter(|p| p.is_file())
+        .collect();
+    entries.sort();
+    for path in entries {
+        let Some(filename) = path.file_name().and_then(|f| f.to_str()) else { continue };
+        if !(filename.starts_with("header.") && filename.ends_with(".cbor")) { continue; }
+
+        let mut file = match File::open(&path).await {
+            Ok(f) => f,
+            Err(reason) => { tracing::error!(file=%path.display(), %reason, "Failed to open header file"); continue; }
+        };
+        let mut cbor_data = Vec::new();
+        if let Err(reason) = file.read_to_end(&mut cbor_data).await {
+            tracing::error!(file=%path.display(), %reason, "Failed to read header file");
+            continue;
+        }
+        match from_cbor::<Header>(&cbor_data) {
+            Some(header) => {
+                let hash = header.hash();
+                if let Err(reason) = db.store_header(&hash, &header) {
+                    tracing::error!(file=%path.display(), %reason, "Failed to store header");
+                }
+            }
+            None => {
+                tracing::warn!(file=%path.display(), "Skipping malformed header CBOR");
+            }
+        }
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#[allow(clippy::unwrap_used)]
pub(crate) async fn import_headers_for_network(
network: NetworkName,
config_dir: &Path,
chain_dir: &PathBuf,
) -> Result<(), Box<dyn Error>> {
let era_history = network_name.into();
let mut db = RocksDBStore::new(chain_db_dir, era_history)?;
let peer_client = connect_to_peer(peer_address, &network_name).await?;
let mut client = ChainSyncClient::new(
Peer::new(peer_address),
peer_client.chainsync,
vec![point.clone()],
);
client.find_intersection().await?;
let mut count = 0;
let mut progress: Option<Box<dyn ProgressBar>> = None;
loop {
let what = if client.has_agency() {
request_next_block(&mut client, &mut db, &mut count, &mut progress, max).await?
} else {
await_for_next_block(&mut client, &mut db, &mut count, &mut progress, max).await?
};
match what {
Continue => continue,
Stop => {
if let Some(progress) = progress {
progress.clear()
}
break;
}
}
}
info!(total = count, "header_imported");
Ok(())
}
async fn request_next_block(
client: &mut ChainSyncClient,
db: &mut RocksDBStore,
count: &mut usize,
progress: &mut Option<Box<dyn ProgressBar>>,
max: usize,
) -> Result<What, WorkerError> {
let next = client.request_next().await.or_restart()?;
handle_response(next, db, count, progress, max)
}
async fn await_for_next_block(
client: &mut ChainSyncClient,
db: &mut RocksDBStore,
count: &mut usize,
progress: &mut Option<Box<dyn ProgressBar>>,
max: usize,
) -> Result<What, WorkerError> {
match timeout(Duration::from_secs(1), client.await_next()).await {
Ok(result) => result
.map_err(|_| WorkerError::Recv)
.and_then(|next| handle_response(next, db, count, progress, max)),
Err(_) => Err(WorkerError::Retry)?,
}
}
#[expect(clippy::unwrap_used)]
fn handle_response(
next: NextResponse<HeaderContent>,
db: &mut RocksDBStore,
count: &mut usize,
progress: &mut Option<Box<dyn ProgressBar>>,
max: usize,
) -> Result<What, WorkerError> {
match next {
NextResponse::RollForward(content, tip) => {
let header: Header = from_cbor(&content.cbor).unwrap();
let hash = header.hash();
db.store_header(&hash, &header)
let era_history = network.into();
let mut db = RocksDBStore::new(chain_dir, era_history)?;
for entry in std::fs::read_dir(config_dir.join("headers"))? {
let entry = entry?;
let path = entry.path();
if path.is_file()
&& let Some(filename) = path.file_name().and_then(|f| f.to_str())
&& filename.starts_with("header.")
&& filename.ends_with(".cbor")
{
let mut file = File::open(&path).await
.inspect_err(|reason| tracing::error!(file = %path.display(), reason = %reason, "Failed to open header file"))
.map_err(|_| WorkerError::Panic)?;
let mut cbor_data = Vec::new();
file.read_to_end(&mut cbor_data).await
.inspect_err(|reason| tracing::error!(file = %path.display(), reason = %reason, "Failed to read header file"))
.map_err(|_| WorkerError::Panic)?;
let header_from_file: Header = from_cbor(&cbor_data).unwrap();
let hash = header_from_file.hash();
db.store_header(&hash, &header_from_file)
.map_err(|_| WorkerError::Panic)?;
*count += 1;
let slot = header.slot();
let tip_slot = tip.0.slot_or_default();
if let Some(progress) = progress {
progress.tick(1)
}
if *count >= max || slot == tip_slot {
Ok(Stop)
} else {
Ok(Continue)
}
}
NextResponse::RollBackward(point, tip) => {
info!(?point, ?tip, "roll_backward");
if progress.is_none() {
*progress = Some(new_terminal_progress_bar(
max,
" importing headers (~{eta} left) {bar:70} {pos:>7}/{len:7} ({percent_precise}%)",
));
}
Ok(Continue)
}
NextResponse::Await => Ok(Continue),
}
Ok(())
}
#[allow(clippy::unwrap_used)]
pub(crate) async fn import_headers_for_network(
network: NetworkName,
config_dir: &Path,
chain_dir: &PathBuf,
) -> Result<(), Box<dyn Error>> {
let era_history = network.into();
let mut db = RocksDBStore::new(chain_dir, era_history)?;
- for entry in std::fs::read_dir(config_dir.join("headers"))? {
- let entry = entry?;
- let path = entry.path();
- if path.is_file()
- && let Some(filename) = path.file_name().and_then(|f| f.to_str())
- && filename.starts_with("header.")
- && filename.ends_with(".cbor")
- {
- let mut file = File::open(&path).await
- .inspect_err(|reason| tracing::error!(file = %path.display(), reason = %reason, "Failed to open header file"))
- .map_err(|_| WorkerError::Panic)?;
- let mut cbor_data = Vec::new();
- file.read_to_end(&mut cbor_data).await
- .inspect_err(|reason| tracing::error!(file = %path.display(), reason = %reason, "Failed to read header file"))
- .map_err(|_| WorkerError::Panic)?;
- let header_from_file: Header = from_cbor(&cbor_data).unwrap();
- let hash = header_from_file.hash();
- db.store_header(&hash, &header_from_file)
- .map_err(|_| WorkerError::Panic)?;
- }
let headers_dir = config_dir.join("headers");
let mut entries: Vec<std::path::PathBuf> = std::fs::read_dir(&headers_dir)?
.filter_map(|r| r.ok().map(|e| e.path()))
.filter(|p| p.is_file())
.collect();
entries.sort();
for path in entries {
let Some(filename) = path.file_name().and_then(|f| f.to_str()) else {
continue;
};
if !(filename.starts_with("header.") && filename.ends_with(".cbor")) {
continue;
}
// Open the file, logging and skipping on error
let mut file = match File::open(&path).await {
Ok(f) => f,
Err(reason) => {
tracing::error!(file = %path.display(), %reason, "Failed to open header file");
continue;
}
};
// Read the entire CBOR blob
let mut cbor_data = Vec::new();
if let Err(reason) = file.read_to_end(&mut cbor_data).await {
tracing::error!(file = %path.display(), %reason, "Failed to read header file");
continue;
}
// Decode and store, logging but never panicking on malformed or DB errors
match from_cbor::<Header>(&cbor_data) {
Some(header) => {
let hash = header.hash();
if let Err(reason) = db.store_header(&hash, &header) {
tracing::error!(file = %path.display(), %reason, "Failed to store header");
}
}
None => {
tracing::warn!(file = %path.display(), "Skipping malformed header CBOR");
}
}
}
Ok(())
}

@pgrange

pgrange commented Sep 4, 2025

Copy link
Copy Markdown
Contributor Author

@pgrange actually import-headers is needed due to our CI setup (see https://github.com/pragma-org/amaru/blob/main/.github/workflows/continuous-integration.yml#L228). It will need to be restored but in a way that it relies on your local files probably.

The alternative is to also cache chain.db. Maybe it's not cached due to size constraints cc @KtorZ

@jeluard I fixed the issue by factorizing the way the headers are imported in bootstrap and in import-headers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants