feat: add migration to verify and fix embedded accumulated difficulty error #7387
Conversation
Walkthrough

Migration version incremented and metadata added to track accumulated-data rebuild status. The LMDB migration defers accumulated-difficulty repairs to a background rebuild task. New backend APIs expose fetch/update of rebuild status and per-height accumulated-difficulty updates; BlockchainDatabase starts an async loop to iteratively repair accumulated data.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant DB as BlockchainDatabase
    participant Backend as BlockchainBackend (LMDB)
    participant BG as BackgroundTask
    participant DC as DifficultyCalculator
    DB->>Backend: fetch_accumulated_data_rebuild_status()
    alt status.is_rebuilt == true
        DB-->>DB: no background task
    else
        DB->>BG: spawn async rebuild task
        BG->>Backend: lock & fetch next height to process
        BG->>Backend: fetch chain header(s) for height
        BG->>DC: calculate achieved difficulty (RandomX)
        BG->>Backend: update_accumulated_difficulty(height, accumulated_data, last_header)
        alt update success
            BG->>BG: increment height and repeat
        else failure
            BG->>BG: log error and stop
        end
    end
```
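The loop in the diagram above can be sketched as a small driver function. This is a hypothetical simplification: the names `RebuildStatus`, `rebuild_loop`, and the `update` callback are illustrative stand-ins, not the actual Tari API.

```rust
// Hypothetical sketch of the background rebuild loop in the diagram above.
// RebuildStatus mirrors the persisted metadata (is_rebuilt, last_rebuild_height).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RebuildStatus {
    pub is_rebuilt: bool,
    pub last_rebuild_height: u64,
}

/// Drive the rebuild one height at a time; on any update failure, log and stop,
/// leaving the persisted status where it was so a restart can resume from there.
pub fn rebuild_loop(
    mut status: RebuildStatus,
    tip_height: u64,
    mut update: impl FnMut(u64) -> Result<RebuildStatus, String>,
) -> RebuildStatus {
    while !status.is_rebuilt {
        let height = status.last_rebuild_height + 1;
        if height > tip_height {
            // Last stored chain header has been processed.
            status.is_rebuilt = true;
            break;
        }
        match update(height) {
            Ok(new_status) => status = new_status, // increment height and repeat
            Err(e) => {
                eprintln!("rebuild stopped at height {height}: {e}");
                break; // log error and stop, as in the diagram
            },
        }
    }
    status
}
```

Because the status is persisted per successful update, a node restarted after a failure resumes from `last_rebuild_height + 1` rather than from scratch.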
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~35 minutes
As per technical discussion, this PR will be changed to not roll back the db but only recalculate the accumulated difficulty table as a background task.
Added migration for accumulated difficulty, as we have an embedded accumulated difficulty error in some instances of the blockchain. It runs as a background task, and will stop when the last stored chain header has been processed.
Force-pushed 9ce4170 to 75bcc62
Actionable comments posted: 0
🧹 Nitpick comments (1)
base_layer/core/src/chain_storage/blockchain_backend.rs (1)
162-168: Fix incorrect documentation comment. The documentation comment mentions "Builds the payref indexes", which is incorrect for this method. This appears to be copied from the method above.
Apply this diff to fix the documentation:

```diff
- /// Builds the payref indexes for a given block height, with stats.
+ /// Updates accumulated difficulty for a given block height and returns rebuild status.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- base_layer/core/src/base_node/sync/header_sync/synchronizer.rs (1 hunks)
- base_layer/core/src/chain_storage/blockchain_backend.rs (3 hunks)
- base_layer/core/src/chain_storage/blockchain_database.rs (7 hunks)
- base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (11 hunks)
- base_layer/core/src/chain_storage/lmdb_db/mod.rs (1 hunks)
- base_layer/core/src/chain_storage/mod.rs (1 hunks)
- base_layer/core/src/test_helpers/blockchain.rs (3 hunks)
✅ Files skipped from review due to trivial changes (3)
- base_layer/core/src/chain_storage/mod.rs
- base_layer/core/src/base_node/sync/header_sync/synchronizer.rs
- base_layer/core/src/chain_storage/lmdb_db/mod.rs
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: the MinedInfo struct in the Tari blockchain database contains two optional fields: `input: Option<InputMinedInfo>` and `output: Option<OutputMinedInfo>`…
Learnt from: hansieodendaal
PR: tari-project/tari#7266
File: base_layer/core/src/chain_storage/blockchain_database.rs:457-461
Timestamp: 2025-06-26T13:18:55.898Z
Learning: The MinedInfo struct in the Tari blockchain database contains two optional fields: `input: Option<InputMinedInfo>` and `output: Option<OutputMinedInfo>`. This design allows methods returning MinedInfo to always return a valid struct rather than wrapping it in Option, with "not found" scenarios represented through the internal optional fields.
Applied to files:
base_layer/core/src/chain_storage/blockchain_backend.rs
base_layer/core/src/chain_storage/blockchain_database.rs
📚 Learning: the combination of `tokio::task::spawn(async move {` with `tokio::task::spawn_blocking().await` in the payref rebuild background task…
Learnt from: hansieodendaal
PR: tari-project/tari#7280
File: base_layer/core/src/chain_storage/blockchain_database.rs:361-438
Timestamp: 2025-07-04T10:56:46.079Z
Learning: The combination of `tokio::task::spawn(async move {` with `tokio::task::spawn_blocking().await` in the payref rebuild background task works well and shuts down properly with the tokio environment, as confirmed by testing in the Tari codebase.
Applied to files:
base_layer/core/src/chain_storage/blockchain_database.rs
📚 Learning: in applications/minotari_console_wallet/src/automation/commands.rs, the consistent error handling pa...
Learnt from: hansieodendaal
PR: tari-project/tari#7284
File: applications/minotari_console_wallet/src/automation/commands.rs:0-0
Timestamp: 2025-07-15T12:23:14.650Z
Learning: In applications/minotari_console_wallet/src/automation/commands.rs, the consistent error handling pattern for command execution is to use match statements that: 1) On success: log with debug!, print user feedback, and push tx_id to tx_ids vector for monitoring, 2) On error: print error message with eprintln! using the format "{CommandName} error! {}", rather than using .unwrap() which would panic.
Applied to files:
base_layer/core/src/chain_storage/blockchain_database.rs
📚 Learning: in the tari peer database, there was a known issue with json serialization corruption in the `source...
Learnt from: hansieodendaal
PR: tari-project/tari#7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.269Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Applied to files:
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
🧬 Code Graph Analysis (1)
base_layer/core/src/chain_storage/blockchain_backend.rs (2)
base_layer/core/src/test_helpers/blockchain.rs (2)
fetch_accumulated_data_rebuild_status (371-373)
update_accumulated_difficulty (388-398)

base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (2)
fetch_accumulated_data_rebuild_status (2823-2835)
update_accumulated_difficulty (2896-2927)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: ci
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: ledger build tests
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: cargo check with stable
- GitHub Check: test (mainnet, stagenet)
🔇 Additional comments (15)
base_layer/core/src/test_helpers/blockchain.rs (2)
371-373: LGTM! Clean delegation implementation. The method correctly delegates to the underlying LMDB database, maintaining consistency with the trait definition and following the established pattern used throughout this implementation.
388-398: LGTM! Proper delegation with correct parameter handling. The method correctly delegates the accumulated difficulty update to the underlying LMDB database with proper parameter passing, ensuring test database behavior matches the production LMDB backend.
base_layer/core/src/chain_storage/blockchain_backend.rs (1)
152-153: LGTM! Well-designed trait method addition. The method signature is consistent with other fetch methods in the trait, has clear documentation, and follows the established error handling pattern.
base_layer/core/src/chain_storage/blockchain_database.rs (5)
25-25: LGTM! Import additions support the new accumulated data rebuild functionality. The added imports for `min`, `AccumulatedDataRebuildStatus`, and `TargetDifficultyWindow` are properly used in the new accumulated data rebuild logic. Also applies to: 72-72, 124-124.
372-372: LGTM! Proper integration of accumulated data rebuild task. The accumulated data rebuild task is correctly integrated into the start sequence following the same pattern as the existing payref rebuild task.
377-451: LGTM! Well-structured accumulated data rebuild implementation. The method follows the established pattern from the payref rebuild task and includes proper error handling, logging, and concurrency management. The early return optimization and sequential processing approach are appropriate for this rebuilding operation.
466-487: LGTM! Good refactoring to extract a reusable helper. The extraction of `get_vm_key_for_candidate_header` reduces code duplication while maintaining the same logic. The delegation pattern preserves backward compatibility for the existing `get_vm_key_for_candidate_block` function.
949-980: LGTM! Well-implemented accumulated data processing function. The function properly handles write locking, safety checks for reorged blocks, difficulty calculation, and accumulated data rebuilding. The use of `min(height, last_chain_header.height())` is a good safety measure to prevent processing blocks that may have been reorged out.

base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (7)
3601-3601: LGTM! Well-structured metadata additions. The new `AccumulatedDataRebuildStatus` metadata key, struct, and value variant follow the established patterns in the codebase. The structure appropriately tracks rebuild state with an `is_rebuilt` flag and `last_rebuild_height` for resuming background tasks. Also applies to: 3640-3648, 3662-3662, 3679-3681.
2822-2835: LGTM! Consistent implementation pattern. The `fetch_accumulated_data_rebuild_status` method follows the same pattern as the existing `fetch_payref_rebuild_status` method, properly handling missing values with sensible defaults.
2896-2927: LGTM! Atomic update with proper completion detection. The `update_accumulated_difficulty` method correctly performs atomic updates of both the accumulated difficulty data and rebuild status. The completion detection logic (`height == last_chain_header.height()`) appropriately determines when the rebuild is finished.
3688-3688: LGTM! Appropriate version increment. The migration version increment to 6 is correct for introducing the new accumulated difficulty migration logic.
3837-3845: LGTM! Proper migration deferral strategy. Deferring the accumulated difficulty migration from version 1 to version 5 is the correct approach, as mentioned in the PR objectives. This allows the migration to handle both version 1 and version 5 known good values comprehensively in a single migration step.
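The atomicity property praised above (per-height record and rebuild status written together, with completion detected on the last update) can be illustrated with a toy transaction abstraction. This is purely illustrative, not the lmdb crate's API: `Store`, `write_txn`, and the field names are assumptions.

```rust
use std::collections::HashMap;

// Illustrative only: both the per-height record and the rebuild-status record
// are committed in one transaction, so a crash between the two writes cannot
// leave them inconsistent.
#[derive(Default, Clone)]
struct Store {
    accumulated: HashMap<u64, u128>, // height -> total accumulated difficulty
    status_height: u64,              // last height rebuilt
    is_rebuilt: bool,                // set when the tip height is reached
}

impl Store {
    /// `f` stages all writes against a copy; they are applied only if it returns Ok.
    fn write_txn(&mut self, f: impl FnOnce(&mut Store) -> Result<(), String>) -> Result<(), String> {
        let mut staged = self.clone();
        f(&mut staged)?;
        *self = staged; // commit
        Ok(())
    }

    fn update_accumulated_difficulty(&mut self, height: u64, diff: u128, tip: u64) -> Result<(), String> {
        self.write_txn(|s| {
            s.accumulated.insert(height, diff);
            s.status_height = height;
            s.is_rebuilt = height == tip; // completion detection on the last header
            Ok(())
        })
    }
}
```

The design point is that completion detection lives inside the same commit as the data write, so readers never observe a "rebuilt" flag ahead of the data it describes.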
3912-4005: LGTM! Well-designed migration strategy with proper edge case handling. The migration version 5 logic effectively addresses the accumulated difficulty corruption issue by:
- Comprehensive data merging: combines known good difficulties from both v1 and v5 migrations
- Efficient detection: finds the last height with correct accumulated difficulty without blocking the migration
- Background processing: defers the actual correction to a background task to avoid long migration times
- Edge case handling: properly handles new blockchains and databases below fork height

The approach of setting `AccumulatedDataRebuildStatus` to enable background correction is much better than doing synchronous correction during migration, which could cause extended downtime.
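The "efficient detection" step above can be sketched as a forward scan over known good checkpoints. This is a hypothetical helper: `stored` stands in for the LMDB lookup of stored accumulated difficulty, and the function name is invented for illustration.

```rust
// Hypothetical sketch of the checkpoint scan: walk the known good
// (height, accumulated_difficulty) list in ascending order and record the last
// height whose stored value matches, stopping at the first mismatch or gap.
fn last_correct_height(
    known_good: &[(u64, u128)],
    stored: impl Fn(u64) -> Option<u128>, // lookup in header accumulated data
) -> Option<u64> {
    let mut last_correct = None;
    for &(height, expected) in known_good {
        match stored(height) {
            Some(actual) if actual == expected => last_correct = Some(height),
            _ => break, // mismatch or missing entry: rebuild from here onward
        }
    }
    last_correct
}
```

The returned height becomes the starting point recorded in the rebuild status; everything above it is repaired by the background task.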
4068-4190: Manual verification required for hardcoded accumulated difficulty values. I ran a repository-wide search for these exact reference values in any `.md` or `.txt` documentation and found no matches. Since these checkpoints are critical for corruption detection, please:
- Confirm with the blockchain ops team or trusted node logs that each `(height, difficulty)` tuple is correct for both mainnet and nextnet.
- Add or link to authoritative documentation (e.g., export logs, block explorer snapshots, or in-repo docs) for future maintainers.
Actionable comments posted: 0
🧹 Nitpick comments (1)
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (1)
3914-4004: Verify the migration logic handles edge cases correctly. The migration logic looks sound overall, but I want to verify a few potential edge cases:
- Empty known difficulties list: the code correctly handles this case by setting `is_rebuilt: true`
- Database below fork height: also handled correctly with early return
- Sequential difficulty checking: the logic properly finds the last correct height by iterating through known good difficulties
However, there's a potential issue with the loop logic - if a block has incorrect accumulated difficulty, the code breaks immediately. This means if there's a single incorrect block followed by correct ones, the rebuild might start from too early a point.
Consider this alternative approach to find the actual last correct height:

```diff
-let mut last_correct_height = known_good_difficulties[0].0.saturating_sub(1);
-for (height, correct_difficulty) in known_good_difficulties {
+let mut last_correct_height = known_good_difficulties[0].0.saturating_sub(1);
+for (height, correct_difficulty) in known_good_difficulties.iter().rev() {
     let txn = db.read_transaction()?;
     let accum_data: Option<BlockHeaderAccumulatedData> =
         lmdb_get(&txn, &db.header_accumulated_data_db, &height)?;
     match accum_data {
-        Some(accum_data) if accum_data.total_accumulated_difficulty == correct_difficulty => {
+        Some(accum_data) if accum_data.total_accumulated_difficulty == *correct_difficulty => {
             info!(
                 target: LOG_TARGET,
                 "[MIGRATIONS] v{}: Block height {} has correct accumulated difficulty",
                 migrate_from_version, height
             );
-            last_correct_height = height;
+            last_correct_height = *height;
+            break;
         },
         Some(_) => {
             info!(
                 target: LOG_TARGET,
                 "[MIGRATIONS] v{}: Block height {} has incorrect accumulated difficulty",
                 migrate_from_version, height
             );
-            break;
+            continue;
         },
         None => {
             info!(
                 target: LOG_TARGET,
                 "[MIGRATIONS] v{}: No accumulated difficulty found for block height {}",
                 migrate_from_version, height
             );
-            break;
+            continue;
         },
     }
 }
```

This would find the highest height with correct accumulated difficulty rather than stopping at the first incorrect one.
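The reviewer's suggested alternative boils down to a reverse scan: take the highest checkpoint that still matches, skipping isolated mismatches instead of stopping at them. A minimal sketch (hypothetical helper, with `stored` standing in for the database lookup):

```rust
// Sketch of the reverse-scan variant: iterate the known good list from highest
// to lowest height and return the first (i.e. highest) matching checkpoint.
fn highest_correct_height(
    known_good: &[(u64, u128)],
    stored: impl Fn(u64) -> Option<u128>,
) -> Option<u64> {
    known_good
        .iter()
        .rev()
        .find(|&&(height, expected)| stored(height) == Some(expected))
        .map(|&(height, _)| height)
}
```

Compared with the forward scan that breaks on the first mismatch, this avoids restarting the rebuild from too early a point when a single corrupt checkpoint is followed by correct ones.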
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- base_layer/core/src/chain_storage/blockchain_database.rs (7 hunks)
- base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (12 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: hansieodendaal
PR: tari-project/tari#7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.269Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
📚 Learning: the combination of `tokio::task::spawn(async move {` with `tokio::task::spawn_blocking().await` in t...
Learnt from: hansieodendaal
PR: tari-project/tari#7280
File: base_layer/core/src/chain_storage/blockchain_database.rs:361-438
Timestamp: 2025-07-04T10:56:46.079Z
Learning: The combination of `tokio::task::spawn(async move {` with `tokio::task::spawn_blocking().await` in the payref rebuild background task works well and shuts down properly with the tokio environment, as confirmed by testing in the Tari codebase.
Applied to files:
base_layer/core/src/chain_storage/blockchain_database.rs
📚 Learning: in applications/minotari_console_wallet/src/automation/commands.rs, the consistent error handling pa...
Learnt from: hansieodendaal
PR: tari-project/tari#7284
File: applications/minotari_console_wallet/src/automation/commands.rs:0-0
Timestamp: 2025-07-15T12:23:14.650Z
Learning: In applications/minotari_console_wallet/src/automation/commands.rs, the consistent error handling pattern for command execution is to use match statements that: 1) On success: log with debug!, print user feedback, and push tx_id to tx_ids vector for monitoring, 2) On error: print error message with eprintln! using the format "{CommandName} error! {}", rather than using .unwrap() which would panic.
Applied to files:
base_layer/core/src/chain_storage/blockchain_database.rs
📚 Learning: the MinedInfo struct in the Tari blockchain database contains two optional fields: `input: Option<InputMinedInfo>` and `output: Option<OutputMinedInfo>`…
Learnt from: hansieodendaal
PR: tari-project/tari#7266
File: base_layer/core/src/chain_storage/blockchain_database.rs:457-461
Timestamp: 2025-06-26T13:18:55.898Z
Learning: The MinedInfo struct in the Tari blockchain database contains two optional fields: `input: Option<InputMinedInfo>` and `output: Option<OutputMinedInfo>`. This design allows methods returning MinedInfo to always return a valid struct rather than wrapping it in Option, with "not found" scenarios represented through the internal optional fields.
Applied to files:
base_layer/core/src/chain_storage/blockchain_database.rs
📚 Learning: in the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field…
Learnt from: hansieodendaal
PR: tari-project/tari#7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.269Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Applied to files:
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
🧬 Code Graph Analysis (1)
base_layer/core/src/chain_storage/blockchain_database.rs (2)
base_layer/core/src/blocks/accumulated_data.rs (7)
height (267-269)
height (331-333)
header (279-281)
header (345-347)
accumulated_data (283-285)
accumulated_data (355-357)
builder (211-213)

base_layer/core/src/validation/difficulty_calculator.rs (1)
tari_rx_vm_key_height(74-80)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Cucumber tests / FFI
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: Cucumber tests / Base Layer
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: test (mainnet, stagenet)
- GitHub Check: cargo check with stable
- GitHub Check: ledger build tests
- GitHub Check: ci
🔇 Additional comments (13)
base_layer/core/src/chain_storage/blockchain_database.rs (7)
25-25: Import addition looks good. The addition of `min` from `std::cmp` is used appropriately in the new helper function.
72-72: New type import is appropriate. The `AccumulatedDataRebuildStatus` type is correctly imported and used throughout the new functionality.
124-124: RandomXFactory import is necessary. The `RandomXFactory` import is required for the difficulty calculation in the new background task.
372-375: Background task initialization is well-placed. The accumulated data rebuild background task is appropriately called after the payref rebuild task in the startup sequence.
377-451: Robust background task implementation with good error handling. The implementation follows the established pattern from the payref rebuild task and includes proper error handling. However, there are a few considerations:
- The error handling logic logs errors but doesn't attempt retry, which is consistent with the existing pattern and the past review comments indicating this is acceptable.
- The use of `tokio::task::spawn` with `spawn_blocking` follows the established pattern proven to work well in this codebase.
- The safety mechanism using `min(height, last_chain_header.height())` in the helper function prevents processing beyond the current chain tip.
460-487: Helper function refactoring improves code organization. The extraction of `get_vm_key_for_candidate_header` from `get_vm_key_for_candidate_block` is a good refactoring that promotes reusability. The logic remains the same but is now more modular.
949-981: Well-implemented accumulated data rebuilding logic. The `process_accumulated_data_for_height` function properly:
- Validates the height: uses `min(height, last_chain_header.height())` to prevent processing beyond the current chain tip
- Calculates difficulty correctly: uses the `DifficultyCalculator` with proper error handling
- Builds accumulated data: follows the established pattern using the builder pattern
- Updates the database: calls the backend's `update_accumulated_difficulty` method

The implementation is consistent with the existing codebase patterns and includes appropriate error handling.
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (6)
3688-3688: LGTM! Migration version increment is appropriate. The increment from version 5 to 6 correctly reflects the addition of the new accumulated difficulty migration logic.
2822-2835: LGTM! New backend method follows established patterns. The `fetch_accumulated_data_rebuild_status` method correctly follows the same pattern as `fetch_payref_rebuild_status`, with proper error handling and default value provision.
2896-2927: LGTM! Atomic update method is well-implemented. The `update_accumulated_difficulty` method properly handles both the header accumulated data update and status metadata update within a single transaction, ensuring atomicity.
3601-3601: LGTM! New metadata structures are consistent. The new `AccumulatedDataRebuildStatus` struct and associated metadata key/value variants follow the existing patterns in the codebase and provide appropriate fields for tracking rebuild progress. Also applies to: 3623-3623, 3640-3648, 3662-3662, 3679-3681.
3837-3845: LGTM! Migration deferral is appropriate. Deferring the accumulated difficulty migration from v1 to v5 aligns with the new architecture where the actual fix is handled by a background task rather than during migration.
4067-4164: LGTM! Extended known good difficulties provide better coverage. The function renaming is appropriate, and the additional known good accumulated difficulty entries for both mainnet and nextnet provide much better coverage for detecting and correcting accumulated difficulty corruption. The values follow the expected exponential growth pattern.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (1)
3591-3602: Stabilize metadata-key encoding over time. `as_u32` relies on enum discriminant layout. Adding variants is fine if always appended, but accidental reordering can corrupt persisted keys. Consider pinning the representation:

```diff
-#[derive(Debug, Clone, PartialEq, Eq, Copy, Hash)]
+#[repr(u32)]
+#[derive(Debug, Clone, PartialEq, Eq, Copy, Hash)]
 pub enum MetadataKey {
```

Alternatively, explicitly document and enforce "append-only" ordering for this enum to prevent future footguns.
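Going one step further than `#[repr(u32)]`, explicit discriminants pin each persisted value outright, so neither reordering nor insertion can silently change the encoding. The variant names and values below are illustrative, not the actual Tari `MetadataKey` layout:

```rust
// Illustrative only: explicit discriminants freeze the persisted key encoding.
// Inserting or reordering variants no longer shifts existing values.
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MetadataKeySketch {
    ChainHeight = 0,
    BestBlockHash = 1,
    // New variants are appended with fixed, never-reused values.
    AccumulatedDataRebuildStatus = 12,
}

impl MetadataKeySketch {
    fn as_u32(self) -> u32 {
        self as u32 // valid for fieldless enums; yields the pinned discriminant
    }
}
```

A unit test asserting each variant's `as_u32()` value would then catch any accidental renumbering at CI time.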
🧹 Nitpick comments (4)
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (4)
2822-2835: Fix docstring: this returns accumulated-data rebuild status (not payref). Minor correctness/clarity tweak; the comment mismatches the function.

```diff
- // Returns the payref rebuild status.
+ // Returns the accumulated data rebuild status.
```

3640-3649: Fix copy/paste in docstring ("Payref" -> "Accumulated data"). Minor doc correctness.

```diff
-/// Payref rebuild status - for new base nodes or once rebuilt, this will be set to true
+/// Accumulated-data rebuild status - for new base nodes or once rebuilt, this will be set to true
```
3912-4003: v6 migration: set correct starting point for background repair; minor edge checks. The logic determines the last known-correct height from a known-good list and records it in AccumulatedDataRebuildStatus to let the background job continue. This is sound.
Two small considerations:
- If the first mismatched height is the very first known-good height, last_rebuild_height becomes first-1. Please confirm the background task starts from last_rebuild_height + 1 to avoid reprocessing below the baseline.
- You may also want to verify header existence (headers_db) alongside accumulated data presence during the loop for extra safety, although a mismatch will cause a break anyway.
If helpful, I can draft a quick test that seeds a tiny DB with an intentional mismatch at the first known-good height and asserts that the recorded last_rebuild_height is first_known_good - 1 and that the background loop starts at first_known_good.
I can also add a metrics hook so operators can track rebuild progress (e.g., current height / last stored header).
4066-4176: Known-good acc-diff table: add cheap guards to prevent regressions. The list looks sorted and monotonic. To avoid subtle ordering mistakes in future edits, consider adding a lightweight guard.
- Add a debug-only assertion after constructing the Vec to ensure strict height ordering (and optionally strictly increasing difficulties).
For example (illustrative):
```rust
let v = vec![ /* entries */ ];
debug_assert!(
    v.windows(2).all(|w| w[0].0 < w[1].0),
    "known-good heights must be strictly increasing"
);
v
```

Optionally, place these constants behind tests that validate ordering and uniqueness.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- base_layer/core/src/chain_storage/blockchain_database.rs (8 hunks)
- base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (13 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- base_layer/core/src/chain_storage/blockchain_database.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (2)
base_layer/core/src/chain_storage/blockchain_backend.rs (3)
fetch_accumulated_data_rebuild_status (153-153)
update_accumulated_difficulty (163-168)
write (50-50)

base_layer/core/src/chain_storage/lmdb_db/lmdb.rs (2)
lmdb_get (228-251)
lmdb_replace (126-161)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: test (mainnet, stagenet)
- GitHub Check: cargo check with stable
- GitHub Check: ci
- GitHub Check: ledger build tests
- GitHub Check: Cucumber tests / Base Layer
- GitHub Check: Cucumber tests / FFI
🔇 Additional comments (4)
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (4)
3611-3624: Display mapping for new metadata key looks good. String label is clear and consistent with other variants.
3662-3682: New metadata value variant and Display are correct. Serialization-friendly and matches the key addition.
3837-3845: Intentional deferral acknowledged. Skipping the old accumulated-difficulty migration at v1 in favor of the v5/v6 path aligns with the background-repair approach described in the PR.
3686-3689: Migration version bump to 6 verified
I ran a global search for MIRGRATION_VERSION/MigrationVersion references and only found usages in the migration runner itself and the metadata‐key/value enums—no CLI commands, telemetry exporters, or tests explicitly reference the old version constant. No further updates are required.
Actionable comments posted: 0
♻️ Duplicate comments (1)
base_layer/core/src/chain_storage/blockchain_database.rs (1)
404-447: Error handling strategy is acceptable; no retries needed. On failure, the task logs and stops. This aligns with the prior discussion that the node can retry on next startup and avoids masking underlying issues with blind retries.
🧹 Nitpick comments (2)
base_layer/core/src/chain_storage/blockchain_database.rs (2)
377-451: Background accumulated-data rebuild: consider reusing the existing DifficultyCalculator. The approach to run the acc-data fix as a background job is solid. One improvement: avoid constructing a new RandomXFactory by reusing the database's DifficultyCalculator (already held as Arc in the struct). This reduces memory footprint and ensures consistent PoW config.
Proposed change (minimal surface):
- Capture and pass self.difficulty_calculator.clone() into the task rather than creating a new instance.
- Update process_accumulated_data_for_height to accept `Arc<DifficultyCalculator>` to avoid cloning large internals per iteration.
Example diff for this file (limited to changed code; adjust signatures accordingly where used):
```diff
- tokio::task::spawn(async move {
-     let difficulty_calculator = DifficultyCalculator::new(rules.clone(), RandomXFactory::new(1));
+ let dc = self.difficulty_calculator.clone();
+ tokio::task::spawn(async move {
+     let difficulty_calculator = (*dc).clone();
```

Note: if DifficultyCalculator is not Clone, switch process_accumulated_data_for_height to take `Arc<DifficultyCalculator>` instead.
Additionally, to make termination on edge cases explicit, consider handling the “genesis-only DB” case in the loop by short-circuiting when the last stored height is 0 (this avoids relying on backend status semantics to break the loop).
2469-2487: VM key resolution across orphan/main chain boundary: add brief doc, and guard equality check. The logic to walk back via fetch_chain_header_in_all_chains until the main chain is reached makes sense and will select the correct VM key for RandomX. Two minor tweaks:
- Document the intent to use the main chain VM key once the walk reaches a height that is on the main chain.
- Consider a small defensive optimization to avoid a second DB fetch when already on main chain (rarely hot, but clarifies intent).
Suggested doc and small clarity tweak:
```diff
-// this is tricky as we need to find the vm_key hash for the candidate block, but it might not be in the current chain
-// so we need to search for it.
+/// Resolve the RandomX VM key for a candidate header that may live on an orphan branch.
+/// We walk back via orphan/main headers until we reach a height that is on the main chain; from that point down,
+/// the VM key anchor (vm_height) must be on the main chain, so we can use the main chain hash at vm_height.
 fn get_vm_key_for_candidate_header<T: BlockchainBackend>(
     let mut current_header = header.clone();
     while current_header.height != vm_height {
         let h = db.fetch_chain_header_in_all_chains(&current_header.prev_hash)?;
-        let chain_header = db.fetch_chain_header_by_height(h.height())?;
-        if *h.header() == *chain_header.header() {
+        // If this height is already on the main chain, we can use the main chain VM key at vm_height.
+        if db.fetch_chain_header_by_height(h.height())?.header() == h.header() {
             // Now we know the orphan links back to the main chain here
             return Ok(*db.fetch_chain_header_by_height(vm_height)?.hash());
         }
         current_header = h.header().clone();
     }
```
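The Arc-sharing suggestion above (hand the existing calculator to the background task instead of building a new one) can be sketched with std threads in place of tokio. All names here are illustrative stand-ins; `Calculator` is a toy, not the Tari `DifficultyCalculator`:

```rust
use std::sync::Arc;
use std::thread;

// Minimal sketch (std threads instead of tokio) of sharing one calculator
// instance with a background task via Arc rather than constructing a new one.
struct Calculator {
    window: u64, // stands in for PoW config held by the real calculator
}

impl Calculator {
    fn calculate(&self, height: u64) -> u64 {
        height * self.window // placeholder arithmetic, not real difficulty math
    }
}

/// The spawned task borrows the shared instance through the Arc clone; no
/// second Calculator (or RandomX VM pool) is ever constructed.
fn spawn_rebuild(calc: Arc<Calculator>, heights: Vec<u64>) -> thread::JoinHandle<Vec<u64>> {
    thread::spawn(move || heights.into_iter().map(|h| calc.calculate(h)).collect())
}
```

Cloning the `Arc` is a pointer copy, so the foreground and background paths are guaranteed to see the same configuration, which is the consistency argument made in the comment above.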
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- base_layer/core/src/chain_storage/blockchain_database.rs (8 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hansieodendaal
PR: tari-project/tari#7387
File: base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs:2896-2927
Timestamp: 2025-08-11T15:37:17.344Z
Learning: The `update_accumulated_difficulty` method in `base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs` is migration-specific code that only runs during a once-off migration when existing ChainHeader data is present in LMDB. The consistency guarantees are upheld by the calling function `process_accumulated_data_for_height`, so additional validation within this method is redundant.
📚 Learning: 2025-08-11T15:37:17.344Z
Learnt from: hansieodendaal
PR: tari-project/tari#7387
File: base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs:2896-2927
Timestamp: 2025-08-11T15:37:17.344Z
Learning: The `update_accumulated_difficulty` method in `base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs` is migration-specific code that only runs during a once-off migration when existing ChainHeader data is present in LMDB. The consistency guarantees are upheld by the calling function `process_accumulated_data_for_height`, so additional validation within this method is redundant.
Applied to files:
base_layer/core/src/chain_storage/blockchain_database.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: test (mainnet, stagenet)
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: cargo check with stable
- GitHub Check: ledger build tests
- GitHub Check: ci
- GitHub Check: Cucumber tests / Base Layer
- GitHub Check: Cucumber tests / FFI
🔇 Additional comments (7)
base_layer/core/src/chain_storage/blockchain_database.rs (7)
25-26: LGTM: import Ordering and min

The import refactor is correct and prepares for later usage.

72-77: LGTM: AccumulatedDataRebuildStatus re-export

Importing AccumulatedDataRebuildStatus here keeps the API cohesive with the backend changes in this PR.

124-125: LGTM: RandomXFactory import

Import is justified for PoW difficulty recalculation within the migration path.

371-375: Start now kicks off both background rebuild tasks

Starting the accumulated data rebuild alongside the existing payref task is correct and matches the PR objective of a non-blocking migration.

2460-2467: Refactor reduces duplication by delegating to header-based helper

The redirection to get_vm_key_for_candidate_header is a nice cleanup.

2931-2937: LGTM: Correct use of write_lock naming

Clearer than calling it a txn; aligns with feedback in prior review.

2950-2980: Off-by-one/edge-case guard for height 0 and reorged tips

You clamp the requested height to the last stored chain header with min, which is good. However, when the DB only contains genesis (last height = 0) and start_height = 1, min will drop to 0 and attempt to rebuild height 0. While this likely works, it relies on update_accumulated_difficulty to mark completion to exit the outer loop. To make the behavior explicit and avoid a potential stuck loop if the backend status is misreported, consider short-circuiting when last_chain_header.height() == 0.
Please validate this case with a DB containing only genesis and confirm the task exits quickly. If it doesn’t, add an explicit early return:
- Fast path: if last_chain_header.height() == 0, mark rebuild complete in status and return Ok(status).
If you'd like, I can propose a concrete code change once I see AccumulatedDataRebuildStatus fields and semantics (e.g., a constructor or update method to set is_rebuilt and last_rebuild_height).
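One possible shape for that early return, assuming hypothetical `is_rebuilt`/`last_rebuild_height` fields on the status type (the real fields and signatures are not shown here):

```rust
use std::cmp::min;

// Assumed shape for illustration; the real AccumulatedDataRebuildStatus may differ.
#[derive(Debug, PartialEq)]
struct AccumulatedDataRebuildStatus {
    is_rebuilt: bool,
    last_rebuild_height: u64,
}

// Returns the next height to process, or None when there is nothing to rebuild
// (genesis-only DB), marking the status complete in that case.
fn next_height_to_process(
    start_height: u64,
    last_chain_header_height: u64,
    status: &mut AccumulatedDataRebuildStatus,
) -> Option<u64> {
    if last_chain_header_height == 0 {
        // Fast path: only genesis is stored, so mark the rebuild complete and exit.
        status.is_rebuilt = true;
        status.last_rebuild_height = 0;
        return None;
    }
    // Clamp so we never try to process past the last stored chain header.
    Some(min(start_height, last_chain_header_height))
}

fn main() {
    let mut status = AccumulatedDataRebuildStatus { is_rebuilt: false, last_rebuild_height: 0 };
    assert_eq!(next_height_to_process(1, 0, &mut status), None);
    assert!(status.is_rebuilt);
    assert_eq!(next_height_to_process(500, 350, &mut status), Some(350));
    println!("ok");
}
```

With this guard the task's exit no longer depends on how the backend reports status for a genesis-only database.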
…issue

# Conflicts:
#	base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
#	base_layer/core/src/chain_storage/lmdb_db/mod.rs
- `cargo clippy --workspace --exclude tari_libtor --all-targets --all-features -- -D warnings`
- `cargo lints clippy --workspace --exclude tari_libtor --all-targets --all-features`
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
base_layer/core/src/transactions/transaction_components/test.rs (1)
438-438: Preserve error context in assertion for easier debugging

assert!(...is_ok()) hides the underlying ValidationError on failure. Capture the result first and include it in the assertion message to aid diagnosis.

```diff
-    assert!(validator.validate(&tx, None, None, u64::MAX).is_ok());
+    let res = validator.validate(&tx, None, None, u64::MAX);
+    assert!(res.is_ok(), "Internal consistency validation failed: {res:?}");
```

base_layer/core/src/proof_of_work/siphash.rs (1)
106-121: Replace rotl macro calls with rotate_left; consider inlining round for perf

rotate_left is idiomatic and constant-time; behavior unchanged. For this hot loop, consider adding #[inline] to round() to help the optimizer without forcing code bloat.

```diff
-    fn round(&mut self, rot_e: u8) {
+    #[inline]
+    fn round(&mut self, rot_e: u8) {
```

base_layer/core/src/proof_of_work/cuckaroo_pow.rs (2)
204-209: Redundant padding check after truncation

unpack_nonces already truncates the returned Vec to expected_length and validates that any extras (if parsed) are zero. This post-check operates on the truncated Vec and will never trigger. Consider removing it to reduce noise.

```diff
-    for n in &nonces[expected_length..] {
-        if *n != 0 {
-            return Err(CuckarooVerificationError::PowDataContainsNonZeroPadding);
-        }
-    }
```
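Why the check is dead can be shown with a plain `Vec`: after truncation to `expected_length`, the tail slice the loop iterates over is always empty.

```rust
// After truncating to expected_length, the slice the padding check iterates
// over is nonces[expected_length..], which is always empty.
fn dead_tail_len(expected_length: usize, mut nonces: Vec<u64>) -> usize {
    nonces.truncate(expected_length);
    nonces[expected_length..].len()
}

fn main() {
    // Even with non-zero "padding" in the raw input, truncation removes it first.
    assert_eq!(dead_tail_len(4, vec![1, 2, 3, 4, 99, 99]), 0);
    assert_eq!(dead_tail_len(4, vec![1, 2, 3, 4]), 0);
    println!("ok");
}
```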
263-267: Degree check error variant may be misleading

neighbors.len() != 2 returns NodeHasMoreThanTwoEdges even when degree < 2. If you care about precise errors, split into two variants (or adjust the wording). Otherwise, the logic is correct.

```diff
-    for neighbors in graph.values() {
-        if neighbors.len() != 2 {
-            return Err(CuckarooVerificationError::NodeHasMoreThanTwoEdges);
-        }
-    }
+    for neighbors in graph.values() {
+        match neighbors.len() {
+            2 => {},
+            _ => return Err(CuckarooVerificationError::NodeHasMoreThanTwoEdges),
+        }
+    }
```
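A sketch of the suggested two-variant split; `NodeDegreeTooLow` is a hypothetical variant name, not part of the actual `CuckarooVerificationError`:

```rust
use std::collections::HashMap;

// Illustrative error enum; NodeDegreeTooLow is a hypothetical new variant.
#[derive(Debug, PartialEq)]
enum DegreeError {
    NodeDegreeTooLow,
    NodeHasMoreThanTwoEdges,
}

// Every node on a valid cycle must have degree exactly 2; report under- and
// over-degree nodes with distinct variants.
fn check_degrees(graph: &HashMap<u64, Vec<u64>>) -> Result<(), DegreeError> {
    for neighbors in graph.values() {
        match neighbors.len() {
            2 => {},
            n if n < 2 => return Err(DegreeError::NodeDegreeTooLow),
            _ => return Err(DegreeError::NodeHasMoreThanTwoEdges),
        }
    }
    Ok(())
}

fn main() {
    let ok: HashMap<u64, Vec<u64>> = HashMap::from([(1, vec![2, 3]), (2, vec![1, 3])]);
    assert_eq!(check_degrees(&ok), Ok(()));

    let too_low = HashMap::from([(1, vec![2])]);
    assert_eq!(check_degrees(&too_low), Err(DegreeError::NodeDegreeTooLow));

    let too_high = HashMap::from([(1, vec![2, 3, 4])]);
    assert_eq!(check_degrees(&too_high), Err(DegreeError::NodeHasMoreThanTwoEdges));
    println!("ok");
}
```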
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- Cargo.toml (0 hunks)
- applications/minotari_node/src/commands/command/header_stats.rs (1 hunks)
- base_layer/core/src/chain_storage/blockchain_database.rs (8 hunks)
- base_layer/core/src/chain_storage/lmdb_db/lmdb.rs (1 hunks)
- base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (13 hunks)
- base_layer/core/src/chain_storage/lmdb_db/mod.rs (1 hunks)
- base_layer/core/src/consensus/consensus_constants.rs (2 hunks)
- base_layer/core/src/proof_of_work/cuckaroo_pow.rs (6 hunks)
- base_layer/core/src/proof_of_work/siphash.rs (3 hunks)
- base_layer/core/src/transactions/transaction_components/test.rs (1 hunks)
- base_layer/core/src/validation/header/header_full_validator.rs (1 hunks)
- scripts/source_coverage.sh (0 hunks)
💤 Files with no reviewable changes (2)
- Cargo.toml
- scripts/source_coverage.sh
✅ Files skipped from review due to trivial changes (3)
- applications/minotari_node/src/commands/command/header_stats.rs
- base_layer/core/src/validation/header/header_full_validator.rs
- base_layer/core/src/consensus/consensus_constants.rs
👮 Files not reviewed due to content moderation or server errors (2)
- base_layer/core/src/chain_storage/blockchain_database.rs
- base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hansieodendaal
PR: tari-project/tari#7387
File: base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs:2896-2927
Timestamp: 2025-08-11T15:37:17.354Z
Learning: The `update_accumulated_difficulty` method in `base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs` is migration-specific code that only runs during a once-off migration when existing ChainHeader data is present in LMDB. The consistency guarantees are upheld by the calling function `process_accumulated_data_for_height`, so additional validation within this method is redundant.
📚 Learning: 2025-08-11T15:37:17.354Z
Learnt from: hansieodendaal
PR: tari-project/tari#7387
File: base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs:2896-2927
Timestamp: 2025-08-11T15:37:17.354Z
Learning: The `update_accumulated_difficulty` method in `base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs` is migration-specific code that only runs during a once-off migration when existing ChainHeader data is present in LMDB. The consistency guarantees are upheld by the calling function `process_accumulated_data_for_height`, so additional validation within this method is redundant.
Applied to files:
- base_layer/core/src/chain_storage/lmdb_db/mod.rs
- base_layer/core/src/chain_storage/blockchain_database.rs
- base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
🧬 Code Graph Analysis (3)
base_layer/core/src/transactions/transaction_components/test.rs (2)
base_layer/core/src/validation/aggregate_body/aggregate_body_internal_validator.rs (1)
validate (102-146)

base_layer/core/src/validation/transaction/transaction_internal_validator.rs (1)
validate (60-69)
base_layer/core/src/proof_of_work/cuckaroo_pow.rs (1)
base_layer/core/src/proof_of_work/siphash.rs (1)
siphash_block (39-67)
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (3)
base_layer/core/src/chain_storage/blockchain_backend.rs (3)
fetch_accumulated_data_rebuild_status (153-153)
update_accumulated_difficulty (163-168)
write (50-50)

base_layer/core/src/chain_storage/lmdb_db/lmdb.rs (2)
lmdb_get (283-306)
lmdb_replace (140-175)

base_layer/core/src/consensus/consensus_constants.rs (2)
from_u16 (61-68)
from (85-91)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Cucumber tests / Base Layer
- GitHub Check: Cucumber tests / FFI
- GitHub Check: test (mainnet, stagenet)
- GitHub Check: ci
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: cargo check with stable
- GitHub Check: ledger build tests
🔇 Additional comments (6)
base_layer/core/src/chain_storage/lmdb_db/lmdb.rs (1)
63-64: Ignore unsized V concern

The generic `TValueType` parameter on `TypedDatabaseRef` is implicitly `Sized` (there is no `?Sized` bound), so `V` could never have been unsized in any call to `lmdb_insert_typed`. All existing call sites already pass concrete, sized value types. Removing `?Sized` from `V` in `lmdb_insert_typed` therefore introduces no new restriction and won't break existing usage.

Likely an incorrect or invalid review comment.
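The implicit `Sized` bound can be demonstrated in isolation (toy functions, not the LMDB code):

```rust
use std::mem::size_of;

// V here is implicitly `V: Sized`; instantiating it with an unsized type such
// as `str` would be a compile error, which is why dropping `?Sized` is harmless
// as long as every call site passes a concrete, sized type.
fn sized_only<V>(v: V) -> usize {
    let _ = v;
    size_of::<V>()
}

// Opting out with `?Sized` admits unsized types behind a reference.
fn maybe_unsized<V: ?Sized>(v: &V) -> usize {
    let _ = v;
    size_of::<&V>()
}

fn main() {
    assert_eq!(sized_only(0u64), 8);
    // `str` is unsized: it can only go through the `?Sized` version, and its
    // reference is a fat (pointer + length) pointer.
    assert_eq!(maybe_unsized::<str>("hello"), size_of::<&str>());
    println!("ok");
}
```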
base_layer/core/src/proof_of_work/siphash.rs (1)
43-53: Minor: clippy allow and indexing tweaks are fine

The switch to a local i_usize and the clippy allowances are benign and keep the code readable. No functional change.
base_layer/core/src/proof_of_work/cuckaroo_pow.rs (3)
190-201: Idiomatic byte iteration LGTM

Switching to for byte in pow.iter().copied() improves readability with no behavioral change. The extraction loop logic remains correct.
228-233: siphash_block call matches signature; ensure consistency across modules

Passing siphash_keys as a &[u64; 4] aligns with siphash_block's signature in proof_of_work::siphash. Good.
284-288: Neighbor selection refactor is equivalent

Selecting the neighbor not equal to the previous one using the inverted condition is correct and clearer.
base_layer/core/src/chain_storage/lmdb_db/mod.rs (1)
45-46: Publicly re-exporting AccumulatedDataRebuildStatus is appropriate

The export aligns with the new background rebuild workflow and keeps the type available to external consumers. No issues.
Description
Added migration for accumulated difficulty, as we have an embedded accumulated difficulty error in some instances of the blockchain. It runs as a background task, and will stop when the last stored chain header has been processed.
Closes #7386.
Motivation and Context
See #7386.
How Has This Been Tested?
What process can a PR reviewer use to test or verify this change?
Breaking Changes
Summary by CodeRabbit
New Features
Migrations
Bug Fixes