
feat: refresh the peer_db #7382

Merged
SWvheerden merged 3 commits into tari-project:development from
hansieodendaal:ho_refresh_peer_db
Jul 31, 2025

Conversation

@hansieodendaal
Contributor

@hansieodendaal hansieodendaal commented Jul 30, 2025

Description

The database needs to be refreshed to get rid of corrupted data; the cause of the corruption was fixed by PR #7374.

Motivation and Context

See above

How Has This Been Tested?

System-level testing running migrations

What process can a PR reviewer use to test or verify this change?

Code review
System-level testing

Breaking Changes

  • None
  • Requires data directory on base node to be deleted
  • Requires hard fork
  • Other - Please specify

Summary by CodeRabbit

  • Bug Fixes
    • Cleared corrupt data from peer and multiaddress records to improve data integrity.
    • Reset database counters to ensure consistent data indexing.

@coderabbitai
Contributor

coderabbitai bot commented Jul 30, 2025

Walkthrough

This change adds a new database migration for the peer manager module that clears all data from the peers and multi_addresses tables to remove corrupt data. The down migration is a no-op and does not revert any changes or affect the schema.

Changes

Cohort / File(s) Change Summary
Migration: Clear peer manager tables data
comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/up.sql, comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/down.sql
Adds a migration that deletes all rows from the peers and multi_addresses tables to remove corrupt data, and resets SQLite autoincrement sequences. The down migration is a no-op and performs no operations.
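The exact contents of `up.sql` are not reproduced in this thread, but the behaviour the walkthrough describes (clear both tables, reset the SQLite autoincrement sequences) can be sketched and verified in isolation. The snippet below uses Python's `sqlite3` with illustrative, simplified table definitions (the real `peers`/`multi_addresses` columns differ), and assumes the schema uses `AUTOINCREMENT` so that `sqlite_sequence` exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Minimal stand-ins for the peer_manager schema (column lists are illustrative).
conn.executescript("""
    CREATE TABLE peers (peer_id INTEGER PRIMARY KEY AUTOINCREMENT, node_id TEXT);
    CREATE TABLE multi_addresses (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        peer_id INTEGER NOT NULL,
        address TEXT NOT NULL
    );
    INSERT INTO peers (node_id) VALUES ('aa'), ('bb');
    INSERT INTO multi_addresses (peer_id, address) VALUES (1, '/ip4/1.2.3.4/tcp/18189');
""")

# The refresh migration as described: clear both tables and reset the
# autoincrement counters (sqlite_sequence tracks AUTOINCREMENT state).
conn.executescript("""
    DELETE FROM multi_addresses;
    DELETE FROM peers;
    DELETE FROM sqlite_sequence WHERE name IN ('peers', 'multi_addresses');
""")

assert conn.execute("SELECT COUNT(*) FROM peers").fetchone()[0] == 0
# With the sequence reset, numbering restarts from 1 on the next insert.
conn.execute("INSERT INTO peers (node_id) VALUES ('cc')")
print(conn.execute("SELECT peer_id FROM peers").fetchone()[0])  # -> 1
```

Note that deleting from `sqlite_sequence` would fail with "no such table" on a schema that never declares `AUTOINCREMENT`; the real migration presumably matches the real schema here.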

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

Suggested reviewers

  • SWvheerden

Poem

A hop, a skip, the data clears,
Peers and addresses shed their years.
No schema changed, just fresh and clean,
A tidy start, a bunny's dream. 🐰✨



📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9d75ab6 and 57129c7.

📒 Files selected for processing (1)
  • comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/up.sql (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/up.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: cargo check with stable
  • GitHub Check: test (nextnet, nextnet)
  • GitHub Check: test (testnet, esmeralda)
  • GitHub Check: ledger build tests
  • GitHub Check: test (mainnet, stagenet)


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/up.sql (2)

6-23: Consider making peer_id the rowid again unless there’s a strong reason not to

Previously the schema used INTEGER PRIMARY KEY (alias for the rowid, giving free indexing and small storage).
Changing to BIGINT PRIMARY KEY removes those SQLite optimisations and disables the implicit autoincrement behaviour. If peer_id is still generated by the application, that’s fine – but if the intention was to rely on SQLite autoincrement we’ll need INTEGER PRIMARY KEY AUTOINCREMENT.

-    peer_id BIGINT PRIMARY KEY NOT NULL,
+    peer_id INTEGER PRIMARY KEY,

Leaving it as-is is perfectly valid; just make sure all insert code paths supply the value explicitly.
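The rowid-aliasing distinction the nitpick relies on is easy to check directly. A small sketch using Python's `sqlite3` (table names here are made up for illustration, not from the PR): `INTEGER PRIMARY KEY` aliases the rowid and auto-assigns missing values, while `BIGINT PRIMARY KEY NOT NULL` is an ordinary typed column whose value the application must always supply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# INTEGER PRIMARY KEY aliases the rowid: SQLite assigns a value when none is given.
conn.execute("CREATE TABLE rowid_alias (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO rowid_alias (v) VALUES ('x')")
print(conn.execute("SELECT id FROM rowid_alias").fetchone()[0])  # -> 1

# BIGINT PRIMARY KEY is an ordinary typed column, not the rowid; omitting the
# value violates the NOT NULL constraint instead of being auto-assigned.
conn.execute("CREATE TABLE app_assigned (id BIGINT PRIMARY KEY NOT NULL, v TEXT)")
try:
    conn.execute("INSERT INTO app_assigned (v) VALUES ('x')")
except sqlite3.IntegrityError:
    print("BIGINT PK: value must come from the application")

# Works once the application supplies the key itself.
conn.execute("INSERT INTO app_assigned (id, v) VALUES (42, 'x')")
```

This matches the reviewer's caveat: `BIGINT PRIMARY KEY NOT NULL` is valid as long as every insert path supplies `peer_id` explicitly.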


29-51: Add a uniqueness guard to avoid duplicate (peer_id, address) rows

The address table can accumulate duplicates over time if the same address is discovered repeatedly. A composite unique constraint (or at least an index) on (peer_id, address) prevents bloat and keeps queries deterministic.

 CREATE TABLE multi_addresses (
@@
     is_external BOOLEAN NOT NULL DEFAULT TRUE,
 
     FOREIGN KEY (peer_id) REFERENCES peers (peer_id) ON DELETE CASCADE
 );
 
+-- Prevent duplicate addresses per peer
+CREATE UNIQUE INDEX uq_multi_addr_peer_address
+    ON multi_addresses (peer_id, address);
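The effect of the suggested unique index can be exercised on its own. This sketch (Python's `sqlite3`, columns simplified to the two that matter) shows the index rejecting an exact duplicate, and `INSERT OR IGNORE` as one way insert paths could tolerate rediscovered addresses:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE multi_addresses (peer_id INTEGER NOT NULL, address TEXT NOT NULL);
    CREATE UNIQUE INDEX uq_multi_addr_peer_address
        ON multi_addresses (peer_id, address);
""")

conn.execute("INSERT INTO multi_addresses VALUES (1, '/ip4/1.2.3.4/tcp/18189')")
try:
    # Same (peer_id, address) pair again: the unique index rejects it.
    conn.execute("INSERT INTO multi_addresses VALUES (1, '/ip4/1.2.3.4/tcp/18189')")
except sqlite3.IntegrityError:
    print("duplicate (peer_id, address) rejected")

# Insert paths that may rediscover an address can use OR IGNORE to stay quiet.
conn.execute("INSERT OR IGNORE INTO multi_addresses VALUES (1, '/ip4/1.2.3.4/tcp/18189')")
assert conn.execute("SELECT COUNT(*) FROM multi_addresses").fetchone()[0] == 1
```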
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6d9287 and ad15845.

📒 Files selected for processing (2)
  • comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/down.sql (1 hunks)
  • comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/up.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: hansieodendaal
PR: tari-project/tari#7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.269Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: tari-project/tari#6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.
Learnt from: hansieodendaal
PR: tari-project/tari#7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: tari-project/tari#7123
File: comms/core/src/peer_manager/storage/database.rs:1655-1658
Timestamp: 2025-05-29T09:40:09.356Z
Learning: In the Tari codebase, node_id hex strings in the database are guaranteed to be valid because they can only be added via `update_peer_sql(peer: Peer)` which converts from valid NodeId objects, ensuring data integrity at the insertion layer.
comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/down.sql (1)

Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.269Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the source field of the multi_addresses table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.

comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/up.sql (4)

Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.269Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the source field of the multi_addresses table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.

Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the hard_delete_all_stale_peers method in comms/core/src/peer_manager/storage/database.rs, the SQL query intentionally uses exact equality (peers.features = ?) rather than bitwise operations (peers.features & ? != 0) when matching COMMUNICATION_NODE features. This is the intended behavior to match only peers with exactly the COMMUNICATION_NODE feature, excluding those with additional feature flags.

Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.

Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1655-1658
Timestamp: 2025-05-29T09:40:09.356Z
Learning: In the Tari codebase, node_id hex strings in the database are guaranteed to be valid because they can only be added via update_peer_sql(peer: Peer) which converts from valid NodeId objects, ensuring data integrity at the insertion layer.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Cucumber tests / Base Layer
  • GitHub Check: Cucumber tests / FFI
  • GitHub Check: test (mainnet, stagenet)
  • GitHub Check: test (nextnet, nextnet)
  • GitHub Check: cargo check with stable
  • GitHub Check: test (testnet, esmeralda)
  • GitHub Check: ledger build tests
🔇 Additional comments (2)
comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/up.sql (1)

1-4: Destructive drop is fine, but ensure node bootstrap is still possible

The migration wipes all peer and address data. While this solves the corruption problem, it also leaves the node with an empty peer set after the migration. Please double-check that the bootstrap/seeding logic is triggered correctly on first start-up after the migration so the node can reconnect to the network without manual intervention.

comms/core/src/peer_manager/storage/migrations/2025-07-30-093600_refresh_db/down.sql (1)

1-1: Down migration acknowledged as intentional no-op

No concerns – the irreversible approach matches previous destructive migrations used to purge corrupted data.

Collaborator

@SWvheerden SWvheerden left a comment


Won't this cause a problem where there are no addresses in the peer db?

We should dump the entire db, not just the one table, and do it from Rust.

@hansieodendaal
Contributor Author

hansieodendaal commented Jul 30, 2025

Won't this cause a problem where there are no addresses in the peer db?

We should dump the entire db, not just the one table, and do it from Rust.

The migration dumps both the peers and multi_addresses tables.

I will do a test with an empty database to confirm that the migration works; I do not see any reason why it would not.

I have previously tested:

  • New peer_db running all migrations
  • Migrate from previous version where the peer_db has existing data in both the peers and multi_addresses tables

I have now confirmed - this works:

  • Migrate from previous version where both the peers and multi_addresses tables are empty
  • Migrate from previous version where only the peers table is empty
  • Migrate from previous version where only the multi_addresses table is empty
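The empty-table scenarios above are expected to pass because `DELETE` is a no-op on an empty table, and `sqlite_sequence` is created as soon as any `AUTOINCREMENT` table exists, even before a row is inserted. A minimal reproduction (Python's `sqlite3`, illustrative schema, assuming the migration runs the statements described in the walkthrough):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema exists but no rows were ever inserted: both tables are empty.
conn.executescript("""
    CREATE TABLE peers (peer_id INTEGER PRIMARY KEY AUTOINCREMENT, node_id TEXT);
    CREATE TABLE multi_addresses (id INTEGER PRIMARY KEY AUTOINCREMENT, address TEXT);
""")

# The refresh statements run cleanly: DELETE on empty tables does nothing,
# and sqlite_sequence already exists because AUTOINCREMENT tables were created.
conn.executescript("""
    DELETE FROM multi_addresses;
    DELETE FROM peers;
    DELETE FROM sqlite_sequence WHERE name IN ('peers', 'multi_addresses');
""")
print("migration ran cleanly on an empty database")
```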

@hansieodendaal hansieodendaal marked this pull request as draft July 30, 2025 15:58
@hansieodendaal hansieodendaal marked this pull request as ready for review July 30, 2025 16:03
@SWvheerden SWvheerden merged commit fb29b4d into tari-project:development Jul 31, 2025
11 of 15 checks passed
@hansieodendaal hansieodendaal deleted the ho_refresh_peer_db branch August 5, 2025 11:54