feat: add seed peer exclusion for header- and block sync by hansieodendaal · Pull Request #7394 · tari-project/tari

hansieodendaal · 2025-08-05T13:40:25Z

Description

Force header and block sync to use non-seed peers. The rationale here is that the seed peers' function is primarily to provide other nodes with network peer information when they have no peer database, typically when starting a new node or when the peer database is manually deleted. Also, no miners are/should be connected to seed nodes. As such, seed nodes will always be followers in the blockchain network and not leaders.

Fixes #7328.

Motivation and Context

See above.

How Has This Been Tested?

System-level testing.
Added new unit tests.

What process can a PR reviewer use to test or verify this change?

Code review.
System-level testing.

Breaking Changes

None
Requires data directory on base node to be deleted
Requires hard fork
Other - Please specify

Summary by CodeRabbit

New Features
- Optional config to include/exclude seed peers from synchronization (default: false) and an accessor for seed-bootstrap rounds.
Bug Fixes
- Sync now filters seed peers from candidate sync sources when configured, avoiding seeds as sync providers.
- Comms initialization accepts an optional seed peer for test setups.
Documentation
- Added doc comment for retrieving seed peers and commented preset options.
Tests
- Updated tests and helpers to support explicit seed-peer setups and added unit tests for seed-peer filtering.

Force header and block sync to use non-seed peers. The rationale here is that the seed peers' function is primarily to provide other nodes with network peer information when they have no peer database, typically when starting a new node or when the peer database is manually deleted. Also, no miners are/should be connected to seed nodes. As such, seed nodes will always be followers in the blockchain network and not leaders.

coderabbitai · 2025-08-05T13:42:16Z

Walkthrough

Adds seed-peer-aware filtering into the Listening state sync decision (optional via config), introduces a helper to remove seed peers from SyncStatus, adds unit tests, exposes a config flag to control behavior, and updates test and comms initialization helpers to support an optional seed peer.

Changes

Cohort / File(s)	Change Summary
Seed-peer filtering & tests `base_layer/core/src/base_node/state_machine_service/states/listening.rs`	Add private helper to filter seed peers from `SyncStatus`; apply filtering in `Listening::next_event` when `include_seed_peers_in_sync` is false; add unit tests covering mixed/all-seed and UpToDate cases.
Sync config & preset `base_layer/core/src/base_node/sync/config.rs`, `common/config/presets/c_base_node_c.toml`	Add `include_seed_peers_in_sync: bool` (public, default false) and accessor `num_initial_sync_rounds_seed_bootstrap`; update preset comments to expose options.
Test helpers: seed-peer support `base_layer/core/tests/helpers/nodes.rs`, `base_layer/core/tests/helpers/sync.rs`	Extend `BaseNodeBuilder` and network creation helpers to accept/propagate an optional seed peer; update signatures and returns to include an optional seed node.
Test callsite updates `base_layer/core/tests/tests/block_sync.rs`, `base_layer/core/tests/tests/header_sync.rs`, `base_layer/core/tests/tests/horizon_sync.rs`, `base_layer/core/tests/tests/mempool.rs`, `base_layer/core/tests/tests/node_state_machine.rs`	Update calls to new helper signatures (pass explicit bool flag) and unpack the additional optional seed-node return value; no test logic changes.
Comms init: optional seed peer `base_layer/p2p/src/initialization.rs`, `base_layer/p2p/tests/support/comms_and_services.rs`, `base_layer/wallet/tests/support/comms_and_services.rs`	Change `initialize_local_test_comms` signature to accept `peers: Vec<Peer>, seed_peer: Option<Peer>`; add seed_peer handling when initializing peer-manager; update callsites to pass `None` where applicable.
Minor doc comment `comms/core/src/connectivity/requester.rs`	Add documentation comment for `get_seeds` method.

Sequence Diagram(s)

sequenceDiagram
    participant Listening as Listening State
    participant PeerStore as ConnectivityRequester / PeerStore
    participant SyncLogic as Sync Decision Logic

    Listening->>PeerStore: get_seed_peers()
    PeerStore-->>Listening: [seed NodeIds]
    Listening->>SyncLogic: get current SyncStatus (with sync_peers)
    SyncLogic-->>Listening: SyncStatus
    Listening->>Listening: filter_out_seed_peers_from_sync_status(sync_status, seed_peers)
    alt Some(filtered)
        Listening->>SyncLogic: proceed with filtered sync_peers
    else None (all peers were seeds)
        Listening->>Listening: log warning and skip progressing this cycle
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~35 minutes

Assessment against linked issues

Objective (issue)	Addressed	Explanation
Fix initial sync only syncing from seed peers (#7328)	✅
Exclude seed peers from sync peer selection when configured (#7328)	✅
Propagate seed peer info into comms/test initialization (#7328)	✅

Assessment against linked issues: Out-of-scope changes

Code Change	Explanation
Added doc comment to `get_seeds` (comms/core/src/connectivity/requester.rs: +1 doc line)	Documentation-only change; not required by the issue objectives and does not affect functionality.

Possibly related PRs

feat: new bootstrap process #7121 — Overlaps with base_node state machine bootstrap/sync changes and seed-peer handling; likely similar areas edited.
feat: add sqlite peer_db #6963 — Related changes to initialize_local_test_comms and test comms setup (seed-peer plumbing).
fix: sync peer ordering #6802 — Related to claimed_difficulty-based peer selection logic that intersects SyncStatus network selection.

Suggested reviewers

SWvheerden
MCozhusheck

Poem

🐇
I hopped through peers both spry and meek,
I sniffed the seeds, then chose the sleek.
If all are seeds I pause and wait,
Else pick the best and sync the state.
Tail twitch, I guard the network gate 🌱✨

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 03cf184 and b3078db.

📒 Files selected for processing (1)

base_layer/core/src/base_node/state_machine_service/states/listening.rs (6 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

base_layer/core/src/base_node/state_machine_service/states/listening.rs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Cucumber tests / Base Layer

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

base_layer/core/src/base_node/state_machine_service/states/listening.rs (1)

292-299: Avoid variable shadowing for clarity.

The variable sync_mode is shadowed here, which could lead to confusion. Consider using a different name for the filtered result.

-                    let sync_mode = match filter_out_seed_peers_from_sync_status(&sync_mode, &seed_peers) {
+                    let filtered_sync_mode = match filter_out_seed_peers_from_sync_status(&sync_mode, &seed_peers) {
                         None => {
                             warn!(target: LOG_TARGET, "All potential sync peers were seed peers. Waiting for better options.");
                             continue;
                         },
                         Some(val) => val,
                     };
+                    let sync_mode = filtered_sync_mode;

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f595d6d and 0007437.

📒 Files selected for processing (12)

base_layer/core/src/base_node/state_machine_service/states/listening.rs (5 hunks)
base_layer/core/tests/helpers/nodes.rs (9 hunks)
base_layer/core/tests/helpers/sync.rs (4 hunks)
base_layer/core/tests/tests/block_sync.rs (5 hunks)
base_layer/core/tests/tests/header_sync.rs (6 hunks)
base_layer/core/tests/tests/horizon_sync.rs (3 hunks)
base_layer/core/tests/tests/mempool.rs (2 hunks)
base_layer/core/tests/tests/node_state_machine.rs (4 hunks)
base_layer/p2p/src/initialization.rs (3 hunks)
base_layer/p2p/tests/support/comms_and_services.rs (1 hunks)
base_layer/wallet/tests/support/comms_and_services.rs (1 hunks)
comms/core/src/connectivity/requester.rs (1 hunks)

🧰 Additional context used

🧠 Learnings (10)

📚 Learning: in the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, th...

Learnt from: hansieodendaal
PR: tari-project/tari#7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.

Applied to files:

base_layer/wallet/tests/support/comms_and_services.rs
base_layer/core/tests/tests/header_sync.rs
base_layer/core/tests/tests/node_state_machine.rs
base_layer/core/tests/tests/horizon_sync.rs
base_layer/core/tests/tests/mempool.rs
base_layer/core/tests/tests/block_sync.rs
base_layer/core/src/base_node/state_machine_service/states/listening.rs
base_layer/p2p/src/initialization.rs
base_layer/core/tests/helpers/nodes.rs
base_layer/core/tests/helpers/sync.rs

📚 Learning: in comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should f...

Learnt from: hansieodendaal
PR: tari-project/tari#7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.

Applied to files:

comms/core/src/connectivity/requester.rs
base_layer/core/src/base_node/state_machine_service/states/listening.rs
base_layer/p2p/src/initialization.rs
base_layer/core/tests/helpers/nodes.rs
base_layer/core/tests/helpers/sync.rs

📚 Learning: in comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_s...

Learnt from: hansieodendaal
PR: tari-project/tari#7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.

Applied to files:

comms/core/src/connectivity/requester.rs
base_layer/core/tests/tests/node_state_machine.rs
base_layer/p2p/tests/support/comms_and_services.rs
base_layer/core/tests/tests/block_sync.rs
base_layer/core/src/base_node/state_machine_service/states/listening.rs
base_layer/p2p/src/initialization.rs
base_layer/core/tests/helpers/nodes.rs
base_layer/core/tests/helpers/sync.rs

📚 Learning: in comms/dht/src/network_discovery/seed_strap.rs, the num_retries logic in get_peers is specifically...

Learnt from: hansieodendaal
PR: tari-project/tari#7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.

Applied to files:

comms/core/src/connectivity/requester.rs
base_layer/p2p/tests/support/comms_and_services.rs
base_layer/core/src/base_node/state_machine_service/states/listening.rs
base_layer/p2p/src/initialization.rs
base_layer/core/tests/helpers/nodes.rs
base_layer/core/tests/helpers/sync.rs

📚 Learning: in comms/dht/src/network_discovery/seed_strap.rs, the 10-second stream_item_timeout and retry logic ...

Learnt from: hansieodendaal
PR: tari-project/tari#7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.

Applied to files:

comms/core/src/connectivity/requester.rs
base_layer/p2p/tests/support/comms_and_services.rs
base_layer/core/src/base_node/state_machine_service/states/listening.rs
base_layer/p2p/src/initialization.rs
base_layer/core/tests/helpers/nodes.rs
base_layer/core/tests/helpers/sync.rs

📚 Learning: in base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses param...

Learnt from: hansieodendaal
PR: tari-project/tari#7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.

Applied to files:

base_layer/core/tests/tests/header_sync.rs
base_layer/core/tests/tests/node_state_machine.rs
base_layer/core/tests/tests/horizon_sync.rs
base_layer/p2p/tests/support/comms_and_services.rs
base_layer/core/tests/tests/mempool.rs
base_layer/core/tests/tests/block_sync.rs
base_layer/core/src/base_node/state_machine_service/states/listening.rs
base_layer/p2p/src/initialization.rs
base_layer/core/tests/helpers/nodes.rs

📚 Learning: in comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) inte...

Learnt from: hansieodendaal
PR: tari-project/tari#7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.

Applied to files:

base_layer/p2p/tests/support/comms_and_services.rs

📚 Learning: the discovery_peer and dial_peer methods in the tari codebase have built-in timeout mechanisms, so a...

Learnt from: SWvheerden
PR: tari-project/tari#6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.

Applied to files:

base_layer/p2p/tests/support/comms_and_services.rs
base_layer/core/src/base_node/state_machine_service/states/listening.rs
base_layer/p2p/src/initialization.rs

📚 Learning: in applications/minotari_console_wallet/src/automation/commands.rs, the consistent error handling pa...

Learnt from: hansieodendaal
PR: tari-project/tari#7284
File: applications/minotari_console_wallet/src/automation/commands.rs:0-0
Timestamp: 2025-07-15T12:23:14.650Z
Learning: In applications/minotari_console_wallet/src/automation/commands.rs, the consistent error handling pattern for command execution is to use match statements that: 1) On success: log with debug!, print user feedback, and push tx_id to tx_ids vector for monitoring, 2) On error: print error message with eprintln! using the format "{CommandName} error! {}", rather than using .unwrap() which would panic.

Applied to files:

base_layer/core/tests/tests/block_sync.rs

📚 Learning: in clients/rust/base_node_wallet_client/src/client/http.rs, the http client server selection logic i...

Learnt from: SWvheerden
PR: tari-project/tari#7301
File: clients/rust/base_node_wallet_client/src/client/http.rs:70-106
Timestamp: 2025-07-11T06:07:36.367Z
Learning: In clients/rust/base_node_wallet_client/src/client/http.rs, the HTTP client server selection logic is intentionally designed to try the local API once and then fallback to the seed server permanently without retry mechanisms. This is the preferred behavior - they want to use local API if available, otherwise use seed server, without continuously retrying the local API.

Applied to files:

base_layer/core/tests/helpers/nodes.rs
base_layer/core/tests/helpers/sync.rs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: test (testnet, esmeralda)
GitHub Check: test (mainnet, stagenet)
GitHub Check: test (nextnet, nextnet)
GitHub Check: cargo check with stable
GitHub Check: ci
GitHub Check: Cucumber tests / Base Layer
GitHub Check: Cucumber tests / FFI

🔇 Additional comments (15)

base_layer/wallet/tests/support/comms_and_services.rs (1)

56-56: LGTM: Correct adaptation to new function signature.

The addition of None for the optional seed peer parameter correctly aligns with the updated initialize_local_test_comms signature while maintaining the existing behavior.

comms/core/src/connectivity/requester.rs (1)

301-301: LGTM: Good documentation addition.

Adding documentation for the get_seeds method improves code clarity, especially since this method is used in the seed peer filtering logic for synchronization.

base_layer/p2p/tests/support/comms_and_services.rs (1)

49-49: LGTM: Consistent with updated function signature.

The addition of None for the optional seed peer parameter correctly maintains compatibility with the updated initialize_local_test_comms signature.

base_layer/core/tests/tests/mempool.rs (2)

1059-1059: LGTM: Correct adaptation to updated helper function signature.

The tuple unpacking is properly updated to accommodate the additional seed node return value, and the false parameter correctly indicates no seed node is needed for this test.

Also applies to: 1067-1067

1741-1741: LGTM: Consistent with the pattern established above.

The changes follow the same correct pattern as the previous test function - adding the boolean parameter and updating tuple unpacking for the modified helper function signature.

Also applies to: 1749-1749

base_layer/core/tests/tests/node_state_machine.rs (2)

82-96: LGTM: Test correctly updated for new helper function signature

The changes properly adapt to the new create_network_with_multiple_base_nodes_with_config signature by adding the with_seed_node: false parameter and ignoring the optional seed node interface in the return tuple. This maintains the test's original behavior while supporting the new seed peer infrastructure.

169-183: LGTM: Consistent test update for new function signature

The changes mirror those in the first test, correctly adding the with_seed_node: false parameter and handling the expanded return tuple. The test maintains its original functionality while being compatible with the new seed peer support.

base_layer/core/tests/tests/header_sync.rs (1)

36-40: LGTM: Consistent test updates across all header sync tests

All test functions have been consistently updated to use the new sync::create_network_with_multiple_nodes signature by:

Adding false as the second parameter to indicate no seed node is needed

Adding an ignored placeholder _ in the tuple unpacking to account for the optional seed node interface

These mechanical changes maintain the original test functionality while supporting the new seed peer infrastructure.

Also applies to: 134-138, 248-252, 305-309, 377-381, 439-443

base_layer/core/tests/tests/horizon_sync.rs (1)

45-58: LGTM: Horizon sync tests updated for new helper function

All horizon sync test functions have been consistently updated to match the new sync::create_network_with_multiple_nodes signature. The changes add the with_seed_node: false parameter and handle the expanded return tuple with an ignored placeholder for the seed node interface. These mechanical updates preserve the original test behavior while supporting the seed peer infrastructure.

Also applies to: 301-324, 683-706

base_layer/core/tests/tests/block_sync.rs (1)

35-39: LGTM: Block sync tests consistently updated for new infrastructure

All block sync test functions have been uniformly updated to use the new sync::create_network_with_multiple_nodes signature by adding the with_seed_node: false parameter and handling the expanded return tuple with an ignored seed node placeholder. These mechanical changes maintain the original test functionality while supporting the new seed peer infrastructure.

Also applies to: 100-104, 160-164, 223-232, 335-344
base_layer/core/src/base_node/state_machine_service/states/listening.rs (1)

167-174: Move seed peer retrieval outside the event loop to avoid repeated calls.

The seed peers are being retrieved inside the loop at line 167, which means get_seed_peers() is called on every iteration. This is inefficient and contradicts the intention suggested by the variable positioning. Move this logic before the loop starts (before line 175).
     let mut ahead_of_peers_counter = 0;
     let mut initial_sync_peer_list = Vec::new();
-    let seed_peers = shared
-        .peer_manager
-        .get_seed_peers()
-        .await
-        .unwrap_or_else(|_| vec![])
-        .iter()
-        .map(|n| n.node_id.clone())
-        .collect::<Vec<_>>();
+    let seed_peers = shared
+        .peer_manager
+        .get_seed_peers()
+        .await
+        .unwrap_or_else(|_| vec![])
+        .iter()
+        .map(|n| n.node_id.clone())
+        .collect::<Vec<_>>();
     loop {
Likely an incorrect or invalid review comment.
base_layer/p2p/src/initialization.rs (1)

149-196: LGTM!

The refactoring to separate seed peer handling from regular peers is clean and improves code clarity. The implementation correctly adds the seed peer with appropriate flags and processes regular peers individually.

base_layer/core/tests/helpers/sync.rs (1)

137-182: LGTM!

The implementation correctly extends the test network creation to support an optional seed node. The logic properly handles the additional configuration and return value.

base_layer/core/tests/helpers/nodes.rs (2)

110-155: LGTM!

The BaseNodeBuilder extension follows the established builder pattern and cleanly integrates seed peer support.

265-314: LGTM!

The implementation correctly designates the last node as the seed node when requested and ensures all other nodes are aware of it. The separation of seed node from regular nodes in the return value is appropriate.

base_layer/core/src/base_node/state_machine_service/states/listening.rs

SWvheerden

Very much against this, sync should not do any filterting of peers

…eeds

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

base_layer/core/src/base_node/state_machine_service/states/listening.rs (1)

534-598: Do not recalculate network metadata after filtering; use a HashSet for seed lookups

Recomputing the “network” metadata from filtered peers (Lines 569–581) can diverge from the metadata selected by determine_sync_mode, introducing inconsistencies in sync decisions. Preserve the original network metadata and only filter the peer list. Also, convert seed_peers to a HashSet to speed up membership checks and reduce per-event CPU.

Apply this diff to the function body:

 fn filter_out_seed_peers_from_sync_status(sync_mode: &SyncStatus, seed_peers: &[NodeId]) -> Option<SyncStatus> {
     match sync_mode {
         SyncStatus::Lagging {
             local,
             sync_peers,
             network,
         } |
         SyncStatus::BehindButNotYetLagging {
             local,
             sync_peers,
             network,
         } => {
-            let filtered_sync_peers: Vec<_> = sync_peers
-                .iter()
-                .filter(|p| {
-                    if seed_peers.contains(p.node_id()) {
-                        debug!(target: LOG_TARGET, "Peer {} is a seed peer, ignoring it", p.node_id());
-                        false
-                    } else {
-                        true
-                    }
-                })
-                .cloned()
-                .collect();
+            // Optimize membership checks
+            let seed_set: std::collections::HashSet<&NodeId> = seed_peers.iter().collect();
+            let filtered_sync_peers: Vec<_> = sync_peers
+                .iter()
+                .filter(|p| {
+                    let is_seed = seed_set.contains(p.node_id());
+                    if is_seed {
+                        debug!(target: LOG_TARGET, "Peer {} is a seed peer, ignoring it", p.node_id());
+                    }
+                    !is_seed
+                })
+                .cloned()
+                .collect();
 
             if filtered_sync_peers.is_empty() {
                 debug!(target: LOG_TARGET, "No sync peers available after filtering seed peers");
                 return None;
             }
 
             if sync_peers.len() == filtered_sync_peers.len() {
                 // No filtering was done, so we can return the original sync mode
                 Some(sync_mode.clone())
             } else {
-                // Filtering was done, so we need to create a new sync mode and select the best metadata if necessary
-                let network_metadata_unchanged = filtered_sync_peers
-                    .iter()
-                    .any(|s| s.claimed_chain_metadata() == network);
-                let network_filtered = if network_metadata_unchanged {
-                    network.clone()
-                } else {
-                    filtered_sync_peers
-                        .iter()
-                        .max_by_key(|p| p.claimed_difficulty())
-                        .map(|p| p.claimed_chain_metadata().clone())
-                        .unwrap_or(filtered_sync_peers[0].claimed_chain_metadata().clone())
-                };
                 match sync_mode {
-                    SyncStatus::Lagging { .. } => Some(SyncStatus::Lagging {
+                    SyncStatus::Lagging { .. } => Some(SyncStatus::Lagging {
                         local: (*local).clone(),
-                        network: network_filtered,
+                        // Preserve the original network metadata selected by determine_sync_mode
+                        network: network.clone(),
                         sync_peers: filtered_sync_peers,
                     }),
                     SyncStatus::BehindButNotYetLagging { .. } => Some(SyncStatus::BehindButNotYetLagging {
                         local: (*local).clone(),
-                        network: network_filtered,
+                        // Preserve the original network metadata selected by determine_sync_mode
+                        network: network.clone(),
                         sync_peers: filtered_sync_peers,
                     }),
                     _ => unreachable!(),
                 }
             }
         },
         _ => Some(sync_mode.clone()),
     }
 }

Outside this range, add the import once at the top-level to support HashSet:

use std::collections::HashSet;

🧹 Nitpick comments (3)

base_layer/core/src/base_node/sync/config.rs (1)
85-89: Fix misleading comment on accessor

The comment says “for the new field,” but the accessor is for num_initial_sync_rounds_seed_bootstrap (an existing field). Clarify to avoid confusion.

Apply this diff:
-    // Add an accessor if one doesn't exist for the new field
+    // Accessor for num_initial_sync_rounds_seed_bootstrap
     pub fn num_initial_sync_rounds_seed_bootstrap(&self) -> usize {
         self.num_initial_sync_rounds_seed_bootstrap
     }
base_layer/core/src/base_node/state_machine_service/states/listening.rs (2)
167-175: Avoid stale seed list and optimize lookups

Seed peers are fetched once before the event loop and never refreshed; if the seed set changes, filtering remains stale. Also, Vec::contains is O(n) per check. Fetch seeds on demand and use a HashSet for O(1) average lookup.

Apply this diff to remove the upfront, static fetch:
-        let seed_peers = shared
-            .peer_manager
-            .get_seed_peers()
-            .await
-            .unwrap_or_else(|_| vec![])
-            .iter()
-            .map(|n| n.node_id.clone())
-            .collect::<Vec<_>>();
Then, in the filtering block (see Lines 292–300), refresh seed peers per event and use a HashSet for lookup (diff provided in that comment). Also consider logging the error when get_seed_peers() fails instead of silently swallowing it.

664-739: Good coverage; add a test to assert network metadata is preserved post-filter

The cases cover both Lagging and BehindButNotYetLagging paths and empty results. Add a test to ensure the network metadata remains unchanged when filtered peers don’t claim the same metadata, preventing regressions related to Line 569–Line 581 logic.

Here’s a test you can add:
#[test]
fn test_filter_preserves_network_metadata_when_filtered_peers_disagree() {
    let seed_peer = random_node_id();
    let non_seed_peer = random_node_id();

    let local_metadata =
        ChainMetadata::new(999, FixedHash::from([0; 32]), 0, 0, U512::from(300), 1672531000).unwrap();

    // Original network metadata determined by determine_sync_mode
    let network_metadata =
        ChainMetadata::new(1001, FixedHash::from([1; 32]), 0, 0, U512::from(350), 1672532000).unwrap();

    // Non-seed peer claims a different (higher) network
    let different_network_metadata =
        ChainMetadata::new(1002, FixedHash::from([2; 32]), 0, 0, U512::from(360), 1672533000).unwrap();

    let sync_status = SyncStatus::Lagging {
        local: local_metadata.clone(),
        network: network_metadata.clone(),
        sync_peers: vec![
            create_sync_peer(seed_peer.clone(), network_metadata.clone()),
            create_sync_peer(non_seed_peer.clone(), different_network_metadata.clone()),
        ],
    };

    let result = filter_out_seed_peers_from_sync_status(&sync_status, &[seed_peer]);
    assert!(result.is_some());
    if let SyncStatus::Lagging { network, sync_peers, .. } = result.unwrap() {
        // Ensure the chosen 'network' remains the original one
        assert_eq!(network, network_metadata);
        // And only the non-seed peer remains
        assert_eq!(sync_peers.len(), 1);
        assert_eq!(sync_peers[0].node_id(), &non_seed_peer);
    } else {
        panic!("Expected Lagging variant");
    }
}
If you prefer, I can push a commit with this test and the associated filter refactor.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0007437 and 03cf184.

📒 Files selected for processing (3)

base_layer/core/src/base_node/state_machine_service/states/listening.rs (5 hunks)
base_layer/core/src/base_node/sync/config.rs (2 hunks)
common/config/presets/c_base_node_c.toml (1 hunks)

✅ Files skipped from review due to trivial changes (1)

common/config/presets/c_base_node_c.toml

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: 2025-08-11T10:49:01.453Z

Learnt from: hansieodendaal
PR: tari-project/tari#7401
File: base_layer/core/src/base_node/sync/config.rs:57-60
Timestamp: 2025-08-11T10:49:01.453Z
Learning: In the Tari codebase, the BlockchainSyncConfig struct works correctly without #[serde(default)] attributes on new fields when used within BaseNodeStateMachineConfig. The configuration loading mechanism or usage patterns ensure that missing fields don't cause deserialization failures in practice.

Applied to files:

base_layer/core/src/base_node/sync/config.rs

📚 Learning: 2025-07-09T08:33:29.320Z

Learnt from: hansieodendaal
PR: tari-project/tari#7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.

Applied to files:

base_layer/core/src/base_node/state_machine_service/states/listening.rs

🧬 Code Graph Analysis (1)

base_layer/core/src/base_node/state_machine_service/states/listening.rs (4)

base_layer/core/src/base_node/sync/header_sync/synchronizer.rs (1)

sync_peers (148-148)

base_layer/core/src/base_node/sync/horizon_state_sync/synchronizer.rs (2)

sync_peers (175-175)

sync (174-215)

base_layer/core/src/base_node/sync/sync_peer.rs (2)

node_id (42-44)

from (78-83)

base_layer/core/src/base_node/chain_metadata_service/handle.rs (1)

node_id (49-51)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: test (mainnet, stagenet)
GitHub Check: test (testnet, esmeralda)
GitHub Check: test (nextnet, nextnet)
GitHub Check: Cucumber tests / Base Layer
GitHub Check: ledger build tests
GitHub Check: ci
GitHub Check: cargo check with stable
GitHub Check: Cucumber tests / FFI

🔇 Additional comments (4)

base_layer/core/src/base_node/sync/config.rs (2)

79-79: LGTM: default excludes seed peers by default

Defaulting include_seed_peers_in_sync to false aligns with the PR objective to avoid syncing from seed peers unless explicitly opted-in.

57-58: Skip adding #[serde(default)] for include_seed_peers_in_sync

The BlockchainSyncConfig is only ever deserialized through the higher-level config crate (via BaseNodeStateMachineConfig), which merges in the Default impl and never uses a raw toml::from_str or serde_json::from_str on BlockchainSyncConfig itself. As confirmed by searches:

There are no direct deserializations of BlockchainSyncConfig from TOML/JSON.

All loads go through config::Config::builder().deserialize::<…>(), which applies defaults from the Default impl.

Long-term memory shows that missing fields haven’t caused errors in practice.

You can safely leave the field without #[serde(default)].

Likely an incorrect or invalid review comment.

base_layer/core/src/base_node/state_machine_service/states/listening.rs (2)

33-33: LGTM: brings NodeId into scope

Needed for seed peer filtering and unit tests.

656-662: LGTM: concise helper to build SyncPeer test fixtures

This keeps tests readable and avoids duplication.

base_layer/core/src/base_node/state_machine_service/states/listening.rs

hansieodendaal · 2025-08-15T07:04:10Z

Closed in favour of #7418.

hansieodendaal requested a review from a team as a code owner August 5, 2025 13:40

hansieodendaal changed the title ~~feat: add seed peer exclusion when selecting header sync and block sync peers~~ feat: add seed peer exclusion for header- and block sync Aug 5, 2025

coderabbitai bot reviewed Aug 5, 2025

View reviewed changes

base_layer/core/src/base_node/state_machine_service/states/listening.rs Outdated Show resolved Hide resolved

MCozhusheck reviewed Aug 5, 2025

View reviewed changes

base_layer/core/src/base_node/state_machine_service/states/listening.rs Outdated Show resolved Hide resolved

This was referenced Aug 6, 2025

feat: exclude seed targets in network discovery, dht pools, proactive dialing, header and block sync #7375

Closed

Fix initial sync only syncing from seed peers #7328

Closed

SWvheerden reviewed Aug 6, 2025

View reviewed changes

Hansie Odendaal added 2 commits August 13, 2025 08:31

review comments

03cf184

Merge remote-tracking branch 'upstream/development' into ho_sync_no_s…

7122112

…eeds

coderabbitai bot reviewed Aug 13, 2025

View reviewed changes

base_layer/core/src/base_node/state_machine_service/states/listening.rs Show resolved Hide resolved

coderabbitai

b3078db

hansieodendaal closed this Aug 15, 2025

hansieodendaal deleted the ho_sync_no_seeds branch September 17, 2025 13:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add seed peer exclusion for header- and block sync#7394

feat: add seed peer exclusion for header- and block sync#7394
hansieodendaal wants to merge 4 commits intotari-project:developmentfrom
hansieodendaal:ho_sync_no_seeds

hansieodendaal commented Aug 5, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Aug 5, 2025 •

edited

Loading

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Status, Documentation and Community

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

SWvheerden left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

hansieodendaal commented Aug 15, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

hansieodendaal commented Aug 5, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation and Context

How Has This Been Tested?

What process can a PR reviewer use to test or verify this change?

Breaking Changes

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Aug 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Assessment against linked issues

Assessment against linked issues: Out-of-scope changes

Possibly related PRs

Suggested reviewers

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

SWvheerden left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

hansieodendaal commented Aug 15, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hansieodendaal commented Aug 5, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 5, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)