feat: add minotari_utils#7157
Conversation
WalkthroughA new command-line utility, Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI
participant Commands
participant DbStats
participant LMDB/SQLite
User->>CLI: Run minotari_utils with arguments
CLI->>Commands: Parse and dispatch subcommand
Commands->>DbStats: Execute dbstats
DbStats->>LMDB/SQLite: Scan and analyze databases
LMDB/SQLite-->>DbStats: Return stats
DbStats-->>User: Output results (table/JSON/CSV)
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Test Results (CI) 3 files 135 suites 35m 19s ⏱️ Results for commit 51ba274. ♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
applications/minotari_util/src/main.rs (1)
29-34: Clean main function structure with room for enhancement.The main function follows good practices with proper logging initialization and clear execution flow. Consider adding explicit error handling for better user experience:
fn main() -> anyhow::Result<()> { env_logger::init(); - let cli = Cli::parse(); - cli.execute() + let cli = Cli::parse(); + if let Err(e) = cli.execute() { + eprintln!("Error: {}", e); + std::process::exit(1); + } + Ok(()) }This would provide cleaner error messages to users without stack traces.
applications/minotari_util/src/config.rs (1)
37-49: Configuration initialization with sensible defaults.The
from_climethod provides good default values and follows expected Tari directory conventions. Consider making the database path segments configurable for future flexibility:// Future enhancement: make path configurable pub const DEFAULT_DB_SUBPATH: &str = "data/base_node/db"; // Usage: let db_path = base_path.join(DEFAULT_DB_SUBPATH);This would make it easier to support different database locations in the future.
applications/minotari_util/README.md (1)
80-80: Minor hyphenation correction needed.When used as a modifier, "Export specific" should be hyphenated as "Export-specific":
-- `export` - Export specific data sets +- `export` - Export-specific data sets🧰 Tools
🪛 LanguageTool
[uncategorized] ~80-~80: When ‘Export-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... Database backup utilities -export- Export specific data sets -compact- Database compac...(SPECIFIC_HYPHEN)
applications/minotari_util/src/commands/dbstats.rs (3)
38-60: Remove unnecessary empty line and fix formatting.There's an unnecessary empty line after the opening brace that should be removed for consistency.
fn create_readonly_lmdb_store<P: AsRef<Path>>( path: P, config: LMDBConfig, ) -> Result<LMDBStore, anyhow::Error> { - debug!("Opening LMDB store in read-only mode at {:?}", path.as_ref());
258-259: Remove unnecessary empty line and fix formatting.There's an unnecessary empty line after the opening brace that should be removed for consistency.
fn collect_database_stats(db_path: &Path) -> Result<DbStatsOutput> { - // Open LMDB store in read-only mode without exclusive lock
328-330: Remove trailing empty lines.Remove the multiple empty lines at the end of the file for cleaner formatting.
}) } - -
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (8)
Cargo.toml(1 hunks)applications/minotari_util/Cargo.toml(1 hunks)applications/minotari_util/README.md(1 hunks)applications/minotari_util/src/cli.rs(1 hunks)applications/minotari_util/src/commands/dbstats.rs(1 hunks)applications/minotari_util/src/commands/mod.rs(1 hunks)applications/minotari_util/src/config.rs(1 hunks)applications/minotari_util/src/main.rs(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
applications/minotari_util/src/main.rs (1)
applications/minotari_util/src/cli.rs (1)
parse(50-52)
applications/minotari_util/src/cli.rs (2)
applications/minotari_util/src/commands/mod.rs (1)
execute(43-47)applications/minotari_util/src/commands/dbstats.rs (1)
execute(158-179)
applications/minotari_util/src/commands/mod.rs (2)
applications/minotari_util/src/cli.rs (1)
execute(54-57)applications/minotari_util/src/commands/dbstats.rs (1)
execute(158-179)
🪛 LanguageTool
applications/minotari_util/README.md
[uncategorized] ~80-~80: When ‘Export-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... Database backup utilities - export - Export specific data sets - compact - Database compac...
(SPECIFIC_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: test (mainnet, stagenet)
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: cargo check with stable
🔇 Additional comments (9)
Cargo.toml (1)
41-41: LGTM! Workspace integration is correct.The new
minotari_utilapplication is properly integrated into the workspace in the correct alphabetical position among other applications.applications/minotari_util/src/config.rs (1)
27-34: Well-structured configuration with appropriate dead code handling.The
AppConfigstruct is well-designed for the initial implementation. The#[allow(dead_code)]attributes are appropriate since this is a new tool wherebase_pathandnetworkwill likely be used by future commands.applications/minotari_util/README.md (1)
1-103: Excellent comprehensive documentation.The README provides thorough documentation for the new CLI tool with clear examples, usage patterns, and a helpful roadmap for future development. The structure and content effectively communicate the tool's purpose and capabilities.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~80-~80: When ‘Export-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... Database backup utilities -export- Export specific data sets -compact- Database compac...(SPECIFIC_HYPHEN)
applications/minotari_util/src/commands/mod.rs (1)
29-48: LGTM! Clean command dispatcher implementation.The enum structure and execution pattern follow standard CLI best practices. The implementation is well-structured for extensibility as new commands are added in the future.
applications/minotari_util/src/cli.rs (1)
28-58: LGTM! Well-designed CLI structure with proper ownership handling.The CLI implementation follows clap best practices. The use of
std::mem::takein theexecutemethod is a clean pattern for transferring ownership of the command while maintaining a reference to the CLI context for configuration access.applications/minotari_util/Cargo.toml (1)
1-35: LGTM! Well-structured package manifest with appropriate dependencies.The Cargo.toml properly inherits workspace settings and includes all necessary dependencies for CLI parsing, LMDB database access, serialization, and output formatting. The dependency versions and features are appropriate for the intended functionality.
applications/minotari_util/src/commands/dbstats.rs (3)
37-60: LGTM! Solid read-only LMDB connection with proper safety measures.The function correctly opens LMDB in read-only mode with
NOLOCKandRDONLYflags, which is perfect for statistics collection without interfering with running base nodes. The error handling and path validation are also well implemented.
62-87: LGTM! Comprehensive CLI argument structure.The
DbStatsArgsstruct provides a good set of options for database analysis including path override, output formats, sorting, filtering, and export capabilities. The argument annotations and defaults are well-chosen.
181-246: LGTM! Well-implemented output formatting with good user experience.The output methods provide clear formatting for different use cases:
- Table output includes environment info, sorted databases, and summary
- JSON output provides structured data for programmatic use
- CSV output enables spreadsheet analysis
- Export functionality supports both JSON and CSV formats
The sorting and truncation logic is also well implemented.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
applications/minotari_util/src/commands/dbstats.rs(1 hunks)base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs(2 hunks)base_layer/core/src/chain_storage/lmdb_db/mod.rs(1 hunks)base_layer/core/src/chain_storage/mod.rs(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- base_layer/core/src/chain_storage/lmdb_db/mod.rs
- base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
base_layer/core/src/chain_storage/mod.rs (1)
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (1)
create_readonly_lmdb_environment(235-269)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: test (mainnet, stagenet)
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: cargo check with stable
- GitHub Check: file licenses
🔇 Additional comments (5)
base_layer/core/src/chain_storage/mod.rs (1)
68-68: LGTM! Clean export addition for the new utility.The addition of
create_readonly_lmdb_environmentto the public exports is well-justified and enables the newminotari_utilapplication to safely access LMDB databases in read-only mode. The function implementation (from the relevant snippets) shows proper safety measures with read-only flags and comprehensive error handling.applications/minotari_util/src/commands/dbstats.rs (4)
38-63: Well-designed CLI interface with comprehensive options.The
DbStatsArgsstructure provides a thoughtful set of command-line options that cover the key use cases mentioned in the PR objectives: path customization, multiple output formats, sorting, filtering, and export capabilities. The argument definitions are clear and well-documented.
82-127: Comprehensive data structures for database statistics.The data structures effectively capture all relevant LMDB statistics including per-database metrics (entries, sizes, page counts, depth) and environment-level information. The use of
Tabledderive for formatting andSerialize/Deserializefor export functionality is appropriate. Theformat_sizehelper usingByteSizeprovides human-readable output.
270-304: Robust database statistics collection with appropriate error handling.The implementation correctly:
- Uses read-only transactions for safe access
- Calculates comprehensive statistics including page counts and sizes
- Handles missing or inaccessible databases gracefully with silent skipping
- Computes derived metrics like average size per entry
The silent error handling (lines 295-297, 300-302) is appropriate for a read-only inspection tool, as some databases may legitimately not exist in all Tari installations.
134-155: Clean execution flow with proper validation and error handling.The execution method provides:
- Path validation before attempting database access
- Clean delegation to format-specific output methods
- Optional export functionality
- Comprehensive error propagation using
anyhow::ResultThis addresses the past review feedback by successfully implementing per-database statistics collection rather than just environment summary.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
applications/minotari_util/src/commands/dbstats.rs (2)
157-187: Consider improving table output formatting consistency.The table output mixes formatted and unformatted numbers, which could confuse users. For example, environment info shows raw numbers while the table uses formatted sizes.
Consider applying consistent formatting:
- println!(" Last Page: {}", stats.environment.last_pgno); - println!(" Last Transaction ID: {}", stats.environment.last_txnid); + println!(" Last Page: {:,}", stats.environment.last_pgno); + println!(" Last Transaction ID: {:,}", stats.environment.last_txnid);
267-271: Potential division by zero protection could be more explicit.While the code correctly handles the case where
db_stat.entriesis zero, the logic could be more explicit about this edge case for better code readability.Consider making the zero-division protection more explicit:
- let avg_size = if db_stat.entries > 0 { - total_size / db_stat.entries - } else { - 0 - }; + let avg_size = if db_stat.entries == 0 { + 0 // Avoid division by zero for empty databases + } else { + total_size / db_stat.entries + };
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
applications/minotari_util/src/commands/dbstats.rs(1 hunks)base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs(3 hunks)base_layer/core/src/chain_storage/lmdb_db/mod.rs(1 hunks)base_layer/core/src/chain_storage/mod.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- base_layer/core/src/chain_storage/lmdb_db/mod.rs
- base_layer/core/src/chain_storage/mod.rs
- base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: test (mainnet, stagenet)
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: cargo check with stable
🔇 Additional comments (8)
applications/minotari_util/src/commands/dbstats.rs (8)
32-33: Well-designed integration with Tari core libraries.The imports correctly use the newly added functions
create_readonly_lmdb_environmentandget_all_database_namesfrom the Tari core library, which addresses the previous concerns about hardcoded database names and provides a clean abstraction for read-only LMDB access.
38-63: Comprehensive CLI argument structure.The
DbStatsArgsstruct provides excellent coverage of expected functionality including path overrides, multiple output formats, sorting options, result limiting, and export capabilities. The argument definitions are clear and well-documented.
82-102: Well-structured database statistics representation.The
DatabaseStatsstruct effectively captures all relevant LMDB database metrics including B-tree structure details (leaf, branch, overflow pages). Thetabledannotations with custom size formatting provide user-friendly output.
134-155: Robust main execution flow with proper error handling.The execute function follows a clean pattern: configuration resolution, path validation, data collection, output formatting, and optional export. Error handling is comprehensive and user-friendly.
206-222: Robust export functionality with proper format detection.The export function correctly determines output format by file extension and provides clear error messages for unsupported formats. Error handling covers serialization, file writing, and encoding issues appropriately.
255-256: Excellent resolution of previous hardcoded database names issue.Using
get_all_database_names()provides a centralized, authoritative source for database names, eliminating the maintenance burden and potential errors from hardcoded lists mentioned in previous reviews.
260-294: Comprehensive per-database statistics collection addresses previous concerns.The implementation now properly iterates through individual databases and collects detailed statistics for each, resolving the previous issue where only environment summary was collected. The silent error handling for missing or inaccessible databases is appropriate for a statistics tool.
296-317: Well-implemented summary calculations with edge case handling.The summary statistics properly aggregate data across all databases and handle edge cases like empty database lists. The calculation of largest database and averages is correct and robust.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
applications/minotari_util/src/commands/dbstats.rs (1)
56-58: Unused CLI argument:include_detailedflag is defined but never used.The
include_detailedflag is defined in the CLI arguments but is not referenced anywhere in the implementation. This could confuse users who might expect additional detailed output when using this flag.Either remove the unused flag or implement the detailed functionality:
- /// Include detailed per-database stats - #[arg(long)] - pub include_detailed: bool,Or implement the functionality to show additional details when this flag is set.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
applications/minotari_util/src/commands/dbstats.rs(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: cargo check with stable
- GitHub Check: test (nextnet, nextnet)
🔇 Additional comments (6)
applications/minotari_util/src/commands/dbstats.rs (6)
23-36: Import organization looks clean and appropriate.The imports are well-organized and all dependencies appear to be used appropriately. The use of
tari_core::chain_storage::{create_readonly_lmdb_environment, get_all_database_names}addresses the previous concerns about using authoritative database names.
234-257: Excellent use of new read-only LMDB functions.The implementation correctly uses the new
create_readonly_lmdb_environmentfunction andget_all_database_namesto ensure authoritative database access and enumeration. This addresses the previous review concerns about hardcoded database names and proper LMDB access.
258-294: Robust database statistics collection with proper error handling.The implementation correctly iterates through all database names, handles cases where databases might not exist or be accessible, and collects comprehensive statistics including page counts, sizes, and depth information. The error handling gracefully continues processing other databases when individual ones fail.
296-323: Well-implemented summary statistics calculation.The summary calculation correctly aggregates data across all databases, handles edge cases (like empty database lists), and provides useful metrics like total size, largest database, and average entries per database.
206-222: Good export functionality with proper format detection.The export functionality properly detects file extensions and supports both JSON and CSV formats with appropriate error handling for unsupported formats.
224-231: Complete sorting implementation covering all defined fields.The sorting implementation correctly handles all four defined sort fields (name, size, entries, pages) with appropriate ascending/descending order for each field type.
| Err(e) => { | ||
| println!("Failed to get stats for database '{}': {}", db_name, e); | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Error messages should be written to stderr instead of stdout.
Error messages for database access failures are currently printed to stdout, which can interfere with structured output formats like JSON or CSV.
- println!("Failed to get stats for database '{}': {}", db_name, e);
+ eprintln!("Failed to get stats for database '{}': {}", db_name, e);
- println!("Failed to open database '{}': {}", db_name, e);
+ eprintln!("Failed to open database '{}': {}", db_name, e);Also applies to: 290-292
🤖 Prompt for AI Agents
In applications/minotari_util/src/commands/dbstats.rs at lines 285-287 and
290-292, error messages are printed using println!, which writes to stdout.
Change these to use eprintln! instead to write error messages to stderr,
ensuring that error output does not interfere with structured stdout output like
JSON or CSV.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
applications/minotari_util/src/commands/dbstats.rs (2)
197-367: Well-implemented helper functions with purpose clarification needed.The output formatting methods and directory scanning functions are well-implemented with proper error handling. However, these functions serve a network-wide scanning purpose rather than the detailed LMDB database analysis described in the README.
Consider separating the network scanning functionality into a different command (e.g.,
scan-network) to clarify the distinction between network-wide database discovery and detailed database statistics analysis.
369-448: Excellent implementation of detailed database statistics collection.This function properly implements the core functionality described in the README, collecting comprehensive statistics for each database using the authoritative database names from
get_all_database_names(). The implementation addresses previous review comments about collecting per-database statistics.Consider adding more detailed error logging for individual database failures:
for db_name in db_names { if let Ok(database) = Database::open(&*env, Some(db_name), &DatabaseOptions::defaults()) { if let Ok(db_stat) = ReadTransaction::new(env.clone()).and_then(|txn| txn.db_stat(&database)) { // ... existing stats collection code ... + } else { + eprintln!("Warning: Failed to get stats for database '{}'", db_name); } + } else { + eprintln!("Warning: Failed to open database '{}'", db_name); } }applications/minotari_util/README.md (1)
80-80: Minor grammatical improvement needed.The phrase should be hyphenated when used as a modifier.
-- `export` - Export specific data sets +- `export` - Export-specific data sets🧰 Tools
🪛 LanguageTool
[uncategorized] ~80-~80: When ‘Export-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... Database backup utilities -export- Export specific data sets -compact- Database compac...(SPECIFIC_HYPHEN)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
applications/minotari_util/README.md(1 hunks)applications/minotari_util/src/commands/dbstats.rs(1 hunks)applications/minotari_util/src/commands/mod.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- applications/minotari_util/src/commands/mod.rs
🧰 Additional context used
🪛 LanguageTool
applications/minotari_util/README.md
[uncategorized] ~80-~80: When ‘Export-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... Database backup utilities - export - Export specific data sets - compact - Database compac...
(SPECIFIC_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: test (testnet, esmeralda)
🔇 Additional comments (2)
applications/minotari_util/src/commands/dbstats.rs (1)
1-154: Well-structured data models and CLI interface.The imports, enums, and data structures are well-designed for the database statistics functionality. The use of appropriate crates (serde, tabled, clap) and the comprehensive data models for database statistics provide a solid foundation for the tool.
applications/minotari_util/README.md (1)
1-103:⚠️ Potential issueDocumentation doesn't match actual implementation.
The README describes detailed database statistics functionality with options like
--db-path,--network, and--verbose, but the actual implementation indbstats.rsonly performs network-wide database scanning with different options (--network-dir,--format,--sort-by, etc.).Update the documentation to match the actual implementation, or implement the functionality as documented. Key discrepancies:
- The actual CLI uses
--network-dirinstead of--db-path- The actual implementation doesn't have
--networkor--verboseoptions- The current implementation does network scanning, not detailed LMDB analysis
Consider updating the README to reflect the current implementation or modify the implementation to match the documentation.
Likely an incorrect or invalid review comment.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~80-~80: When ‘Export-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... Database backup utilities -export- Export specific data sets -compact- Database compac...(SPECIFIC_HYPHEN)
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
applications/minotari_util/src/commands/dbstats.rs (2)
359-374: Consider making component determination more robust.The current component determination relies on string matching in file paths, which could be fragile if directory structures change or if users have non-standard setups.
Consider making this more configurable or robust:
fn determine_component(path: &Path, base_dir: &Path) -> String { let relative_path = path.strip_prefix(base_dir).unwrap_or(path); let path_str = relative_path.to_string_lossy(); + // Check components in order of specificity if path_str.contains("base_node") { "Base Node".to_string() } else if path_str.contains("wallet") { "Wallet".to_string() - } else if path_str.contains("peer_db") { + } else if path_str.contains("peer_db") || path_str.contains("peer") { "Peer Database".to_string() } else if path_str.contains("dht") { "DHT".to_string() + } else if path_str.contains("console_wallet") { + "Console Wallet".to_string() } else { - "Other".to_string() + // Fallback to directory name or "Unknown" + path.file_name() + .map(|name| name.to_string_lossy().to_string()) + .unwrap_or_else(|| "Unknown".to_string()) } }This provides better fallback behavior and handles additional component types.
426-450: Consider providing feedback for skipped databases.Currently, databases that fail to open are silently skipped, which might leave users wondering why certain expected databases don't appear in the output. Consider adding optional logging or a summary of skipped databases.
// Get statistics for each database +let mut skipped_databases = Vec::new(); for db_name in db_names { - if let Ok(database) = Database::open(&*env, Some(db_name), &DatabaseOptions::defaults()) { - if let Ok(db_stat) = ReadTransaction::new(env.clone()).and_then(|txn| txn.db_stat(&database)) { + match Database::open(&*env, Some(db_name), &DatabaseOptions::defaults()) { + Ok(database) => { + match ReadTransaction::new(env.clone()).and_then(|txn| txn.db_stat(&database)) { + Ok(db_stat) => { let total_pages = db_stat.leaf_pages + db_stat.branch_pages + db_stat.overflow_pages; let total_size = total_pages * page_size; let avg_size = if db_stat.entries > 0 { total_size / db_stat.entries } else { 0 }; databases.push(DatabaseStats { name: db_name.to_string(), entries: db_stat.entries, total_size, avg_size, depth: db_stat.depth, total_pages, leaf_pages: db_stat.leaf_pages, branch_pages: db_stat.branch_pages, overflow_pages: db_stat.overflow_pages, }); + } + Err(_e) => { + // Database exists but couldn't get stats - might be corrupted or locked + skipped_databases.push((db_name, "Failed to read statistics")); + } + } + } + Err(_e) => { + // Database doesn't exist - this is expected for some databases + skipped_databases.push((db_name, "Database not found")); } } } +// Optionally log skipped databases for debugging +if !skipped_databases.is_empty() && std::env::var("TARI_UTIL_VERBOSE").is_ok() { + eprintln!("Skipped {} databases:", skipped_databases.len()); + for (name, reason) in &skipped_databases { + eprintln!(" {}: {}", name, reason); + } +}This provides better observability while maintaining the current behavior by default.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
applications/minotari_util/src/commands/dbstats.rs(1 hunks)applications/minotari_util/src/config.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- applications/minotari_util/src/config.rs
🔇 Additional comments (3)
applications/minotari_util/src/commands/dbstats.rs (3)
23-162: Well-structured foundation with appropriate dependencies.The imports, CLI argument structure, and data models are well-designed with proper serialization support and clear field naming conventions. The byte size formatting functions provide good user experience.
235-273: Past review concerns have been properly addressed.The implementation now correctly provides detailed database statistics when requested, addressing the previous concern about the execution only performing network scans. The integration with
collect_database_statsand the use ofget_all_database_names()from Tari core resolves the earlier issues about hardcoded database names and missing detailed analysis.
1-484: Well-implemented database statistics utility with comprehensive functionality.This implementation successfully provides the database analysis functionality described in the PR objectives. Key strengths include:
- Clean separation between network-wide scanning and detailed LMDB analysis
- Support for multiple output formats (table, JSON, CSV)
- Proper integration with Tari core for authoritative database names
- Comprehensive error handling with Result types
- Flexible CLI interface with sorting, filtering, and export options
The code effectively addresses the goal of providing base node operators with tools to inspect LMDB database usage and identify performance bottlenecks.
| let network_dir = self.network_dir.clone().unwrap_or_else(|| { | ||
| let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string()); | ||
| PathBuf::from(home).join(".tari").join("mainnet") | ||
| }); |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Improve cross-platform compatibility for default path handling.
The current default path logic uses the HOME environment variable, which is Unix-specific and will not work correctly on Windows systems where the equivalent would be USERPROFILE or HOMEDRIVE/HOMEPATH.
Consider using a cross-platform approach:
- let network_dir = self.network_dir.clone().unwrap_or_else(|| {
- let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
- PathBuf::from(home).join(".tari").join("mainnet")
- });
+ let network_dir = self.network_dir.clone().unwrap_or_else(|| {
+ let home = dirs::home_dir().unwrap_or_else(|| PathBuf::from("."));
+ home.join(".tari").join("mainnet")
+ });This requires adding the dirs crate as a dependency, which provides cross-platform home directory detection.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| let network_dir = self.network_dir.clone().unwrap_or_else(|| { | |
| let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string()); | |
| PathBuf::from(home).join(".tari").join("mainnet") | |
| }); | |
| let network_dir = self.network_dir.clone().unwrap_or_else(|| { | |
| let home = dirs::home_dir().unwrap_or_else(|| PathBuf::from(".")); | |
| home.join(".tari").join("mainnet") | |
| }); |
🤖 Prompt for AI Agents
In applications/minotari_util/src/commands/dbstats.rs around lines 169 to 172,
the code uses the Unix-specific HOME environment variable to determine the
default path, which breaks on Windows. Replace this logic by using the dirs
crate's home_dir() function to get the user's home directory in a cross-platform
way. Add the dirs crate as a dependency if not already present, then update the
code to call dirs::home_dir() and build the path from there, falling back to "."
if home_dir() returns None.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
applications/minotari_util/src/commands/dbstats.rs (1)
206-209: Cross-platform compatibility issue remains unaddressed.The code still uses the Unix-specific
HOMEenvironment variable, which will not work on Windows systems. This was flagged in previous reviews but hasn't been fixed.- let network_dir = self.network_dir.clone().unwrap_or_else(|| { - let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string()); - PathBuf::from(home).join(".tari").join("mainnet") - }); + let network_dir = self.network_dir.clone().unwrap_or_else(|| { + let home = dirs::home_dir().unwrap_or_else(|| PathBuf::from(".")); + home.join(".tari").join("mainnet") + });This requires adding the
dirscrate as a dependency for cross-platform home directory detection.
🧹 Nitpick comments (3)
applications/minotari_util/src/commands/dbstats.rs (3)
369-378: Remove unnecessary empty lines.There are many consecutive empty lines that reduce code readability and don't follow Rust formatting conventions.
- } - - - - - - - - - -} + } +}
508-532: Consider error handling consistency.The LMDB database iteration silently skips databases that fail to open or have statistics errors. While this may be intentional for robustness, it could hide important issues from users.
Consider adding an optional verbose mode or logging failed database access:
for db_name in db_names { if let Ok(database) = Database::open(&*env, Some(db_name), &DatabaseOptions::defaults()) { if let Ok(db_stat) = ReadTransaction::new(env.clone()).and_then(|txn| txn.db_stat(&database)) { // ... existing stats collection ... + } else { + eprintln!("Warning: Failed to get statistics for database '{}'", db_name); } + } else { + eprintln!("Warning: Failed to open database '{}'", db_name); } }
651-653: Remove trailing empty lines.The file ends with unnecessary empty lines that should be removed for cleaner formatting.
-} - - - +}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (2)
applications/minotari_util/Cargo.toml(1 hunks)applications/minotari_util/src/commands/dbstats.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- applications/minotari_util/Cargo.toml
🔇 Additional comments (1)
applications/minotari_util/src/commands/dbstats.rs (1)
599-603: 🛠️ Refactor suggestionImprove SQLite table size estimation accuracy.
The current table size estimation logic is flawed. It calculates
size_per_rowasfile_size / page_count, but then multiplies byrow_count, which doesn't make logical sense sincepage_countis the total pages in the database, not total rows.Consider using SQLite's built-in
dbstatvirtual table for more accurate size estimation:- // Estimate table size using database_list and page info - // This is an approximation since SQLite doesn't provide direct table size info - let table_size_estimate = if row_count > 0 { - // Use rough estimation: file_size * (table_rows / total_db_rows_estimate) - // This is not perfect but gives a reasonable approximation - let size_per_row = if row_count > 0 { file_size / page_count.max(1) } else { 0 }; - row_count * size_per_row - } else { - 0 - }; + // Use SQLite's dbstat virtual table for more accurate size estimation + let table_size_estimate = conn.query_row( + "SELECT COALESCE(SUM(pgsize), 0) FROM dbstat WHERE name = ?", + [&table_name], + |row| row.get::<_, u64>(0) + ).unwrap_or_else(|_| { + // Fallback to simple estimation if dbstat is not available + if row_count > 0 { file_size / 100 * row_count / page_count.max(1) } else { 0 } + });Likely an incorrect or invalid review comment.
| let row_count: u64 = conn.query_row( | ||
| &format!("SELECT COUNT(*) FROM {}", table_name), | ||
| [], | ||
| |row| row.get(0) | ||
| ).unwrap_or(0); |
There was a problem hiding this comment.
Potential SQL injection vulnerability.
Using string formatting to construct SQL queries can lead to SQL injection vulnerabilities, even though table_name comes from SQLite's internal metadata.
Use parameterized queries or properly escape the table name:
- let row_count: u64 = conn.query_row(
- &format!("SELECT COUNT(*) FROM {}", table_name),
- [],
- |row| row.get(0)
- ).unwrap_or(0);
+ let row_count: u64 = conn.query_row(
+ "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name=?
+ AND (SELECT COUNT(*) FROM pragma_table_info(?)) > 0",
+ [&table_name, &table_name],
+ |_| Ok(0) // First check if table exists and is accessible
+ ).and_then(|_| {
+ // Use identifier quoting for safety
+ conn.query_row(
+ &format!("SELECT COUNT(*) FROM \"{}\"", table_name.replace("\"", "\"\"")),
+ [],
+ |row| row.get(0)
+ )
+ }).unwrap_or(0);Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In applications/minotari_util/src/commands/dbstats.rs around lines 591 to 595,
the current code constructs an SQL query by directly formatting the table name
into the query string, which risks SQL injection. To fix this, avoid using
string formatting for the table name in the query. Instead, validate or sanitize
the table_name to ensure it is safe, or use SQLite's built-in mechanisms to
safely quote identifiers before including them in the query. This prevents
injection by ensuring the table name is treated as an identifier, not executable
code.
|
Title says minotari_utils, but the tool is |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (3)
applications/minotari_utils/src/cli.rs (1)
55-58: Simplify by consuming self directly.The use of
std::mem::takeis unnecessary here since you're already taking ownership ofself.-pub fn execute(mut self) -> anyhow::Result<()> { - let command = std::mem::take(&mut self.command); - command.execute(&self) +pub fn execute(self) -> anyhow::Result<()> { + self.command.execute(&self) }applications/minotari_utils/src/commands/dbstats.rs (2)
31-31: Remove unnecessary semicolon from import.-use csv; +use csv;
438-439: Remove unnecessary empty else blocks.These empty else blocks with clippy comments serve no purpose.
- } else { - // clippy }Also applies to: 473-474
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (11)
Cargo.toml(1 hunks)applications/minotari_utils/Cargo.toml(1 hunks)applications/minotari_utils/README.md(1 hunks)applications/minotari_utils/src/cli.rs(1 hunks)applications/minotari_utils/src/commands/dbstats.rs(1 hunks)applications/minotari_utils/src/commands/mod.rs(1 hunks)applications/minotari_utils/src/config.rs(1 hunks)applications/minotari_utils/src/main.rs(1 hunks)base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs(3 hunks)base_layer/core/src/chain_storage/lmdb_db/mod.rs(1 hunks)base_layer/core/src/chain_storage/mod.rs(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- applications/minotari_utils/README.md
- applications/minotari_utils/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (4)
- base_layer/core/src/chain_storage/mod.rs
- Cargo.toml
- base_layer/core/src/chain_storage/lmdb_db/mod.rs
- base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
applications/minotari_utils/src/main.rs (1)
applications/minotari_utils/src/cli.rs (1)
parse(51-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: test (mainnet, stagenet)
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: cargo check with stable
- GitHub Check: ci
- GitHub Check: Cucumber tests / FFI
- GitHub Check: Cucumber tests / Base Layer
🔇 Additional comments (2)
applications/minotari_utils/src/main.rs (1)
29-34: LGTM!The main entry point is well-structured and follows standard Rust patterns for CLI applications.
applications/minotari_utils/src/commands/mod.rs (1)
29-48: LGTM!Clean command dispatch implementation with proper subcommand handling.
| #[derive(Debug, Clone)] | ||
| pub struct AppConfig { | ||
| #[allow(dead_code)] | ||
| pub base_path: PathBuf, | ||
| #[allow(dead_code)] | ||
| pub network: String, | ||
| #[allow(dead_code)] | ||
| pub db_path: PathBuf, | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Remove unused configuration struct.
The AppConfig struct appears to be unused - all fields are marked with #[allow(dead_code)] and in dbstats.rs line 208, the config is created but immediately discarded with an underscore prefix. Additionally, the hardcoded db_path to "data/base_node/db" doesn't align with the tool's actual behavior of scanning entire network directories.
Consider removing this entire configuration module if it's not being used, or update it to properly support the network-wide scanning functionality described in the PR objectives.
Also applies to: 37-51
🤖 Prompt for AI Agents
In applications/minotari_utils/src/config.rs around lines 27 to 35 and 37 to 51,
the AppConfig struct and related configuration code are unused and contain
misleading hardcoded paths that do not reflect the tool's actual scanning
behavior. Remove the entire AppConfig struct and any associated configuration
code in this module to eliminate dead code and avoid confusion, unless you plan
to update it to support network-wide scanning properly.
| .query_row(&format!("SELECT COUNT(*) FROM {}", table_name), [], |row| row.get(0)) | ||
| .unwrap_or(0); |
There was a problem hiding this comment.
Fix SQL injection vulnerability.
The table name is directly interpolated into the SQL query without validation, creating a potential SQL injection vulnerability. While table names come from sqlite_master, it's still a security best practice to properly escape identifiers.
Use proper identifier quoting:
- .query_row(&format!("SELECT COUNT(*) FROM {}", table_name), [], |row| row.get(0))
+ .query_row(&format!("SELECT COUNT(*) FROM \"{}\"", table_name.replace("\"", "\"\"")), [], |row| row.get(0))Or better yet, use a prepared statement with a parameterized query if the SQLite driver supports it for identifiers.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| .query_row(&format!("SELECT COUNT(*) FROM {}", table_name), [], |row| row.get(0)) | |
| .unwrap_or(0); | |
| .query_row(&format!("SELECT COUNT(*) FROM \"{}\"", table_name.replace("\"", "\"\"")), [], |row| row.get(0)) | |
| .unwrap_or(0); |
🤖 Prompt for AI Agents
In applications/minotari_utils/src/commands/dbstats.rs around lines 599 to 600,
the SQL query directly interpolates the table name, causing a potential SQL
injection risk. To fix this, properly escape the table name using SQLite
identifier quoting (e.g., wrapping the table name in double quotes and escaping
any internal quotes) before including it in the query string. Since
parameterized queries do not support identifiers, ensure the table name is
sanitized or quoted correctly to prevent injection.
| let table_size_estimate = if row_count > 0 { | ||
| // Use rough estimation: file_size * (table_rows / total_db_rows_estimate) | ||
| // This is not perfect but gives a reasonable approximation | ||
| let size_per_row = if row_count > 0 { | ||
| file_size / page_count.max(1) | ||
| } else { | ||
| 0 | ||
| }; | ||
| row_count * size_per_row | ||
| } else { | ||
| 0 | ||
| }; |
There was a problem hiding this comment.
Fix table size estimation logic.
The current logic for estimating table size is incorrect. It uses file_size / page_count as size_per_row, which doesn't make logical sense.
The estimation should be based on the proportion of rows in the table compared to total rows in the database:
-let size_per_row = if row_count > 0 {
- file_size / page_count.max(1)
-} else {
- 0
-};
-row_count * size_per_row
+// Estimate based on proportion of database
+// This is still an approximation but more reasonable
+(file_size as f64 * (row_count as f64 / total_rows.max(1) as f64)) as u64Note: You'll need to calculate total_rows across all tables first for this approach to work.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In applications/minotari_utils/src/commands/dbstats.rs around lines 604 to 615,
the table size estimation logic is incorrect because it calculates size_per_row
as file_size divided by page_count, which is not meaningful. To fix this, first
compute the total number of rows across all tables, then estimate the table size
by multiplying the file_size by the ratio of the current table's row_count to
the total_rows. Replace the existing size_per_row calculation with this
proportional approach to get a more accurate estimate.
| #[allow(clippy::too_many_lines)] | ||
| pub fn execute(self, cli: &Cli) -> Result<()> { | ||
| let _config = AppConfig::from_cli(cli)?; | ||
|
|
||
| // Default to ~/.tari/mainnet if no network dir specified | ||
| let network_dir = self.network_dir.clone().unwrap_or_else(|| { | ||
| let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string()); | ||
| PathBuf::from(home).join(".tari").join("mainnet") | ||
| }); |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Address cross-platform compatibility and unused code.
- The function is marked with
#[allow(clippy::too_many_lines)]indicating it should be refactored into smaller functions. - The
_configis created but never used, confirming the AppConfig is unnecessary. - The
HOMEenvironment variable is Unix-specific and won't work on Windows.
For cross-platform home directory detection, use the dirs crate:
-let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
+let home = dirs::home_dir()
+ .map(|p| p.to_string_lossy().to_string())
+ .unwrap_or_else(|| ".".to_string());Also, remove the unused _config variable:
-let _config = AppConfig::from_cli(cli)?;🤖 Prompt for AI Agents
In applications/minotari_utils/src/commands/dbstats.rs around lines 206 to 214,
remove the unused variable _config since AppConfig is not used in this function.
Replace the Unix-specific HOME environment variable usage with a cross-platform
home directory retrieval using the dirs crate's home_dir function. Additionally,
consider refactoring the execute function into smaller functions to address the
clippy warning about too many lines.
Description
Adds a comprehensive
minotari_utilCLI tool for analyzing Tari database statistics across all network components. The tool scans entire Tari network directories (like~/.tari/mainnet) and provides detailed analysis of both LMDB and SQLite databases.Key features:
Motivation and Context
Previously, database analysis was limited to single LMDB files and required nodes to be stopped due to file locking. This tool addresses the need for:
How Has This Been Tested?
What process can a PR reviewer use to test or verify this change?
cd applications/minotari_util && cargo buildcargo run -- dbstats --network-dir ~/.tari/mainnetcargo run -- dbstats --network-dir ~/.tari/mainnet --include-detailedcargo run -- dbstats --format jsoncargo run -- dbstats --format csvcargo run -- dbstats --export /tmp/stats.jsoncargo run -- dbstats --helpExpected output should show:
Breaking Changes
Summary by CodeRabbit
New Features
minotari_utilscommand-line tool for analyzing Tari base node database usage.nodedbstatscommand to scan, analyze, and report statistics for LMDB and SQLite databases, with support for various output formats (table, JSON, CSV), sorting, and exporting results.minotari_utils.Documentation
Chores
minotari_utilsinto the workspace for unified build and management.