feat: add minotari_utils #7157

Merged
SWvheerden merged 22 commits into tari-project:development from fluffypony:utils
Jul 14, 2025

Conversation

@fluffypony
Contributor

@fluffypony fluffypony commented Jun 3, 2025

Description


Adds a comprehensive minotari_utils CLI tool for analyzing Tari database statistics across all network components. The tool scans entire Tari network directories (like ~/.tari/mainnet) and provides detailed analysis of both LMDB and SQLite databases.

Key features:

  • Network directory scanning: Recursively scans Tari network directories to find all databases
  • Multi-database support: Detects LMDB (data.mdb) and SQLite (.db) databases
  • Component categorization: Organizes databases by Tari component (Base Node, Wallet, DHT, Peer Database)
  • Multiple output formats: Table (default), JSON, and CSV formats
  • Export functionality: Can export results to files with automatic format detection
  • Detailed analysis: Comprehensive statistics for both LMDB and SQLite databases
  • Read-only access: Uses non-locking LMDB access so it works while Tari nodes are running
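
The directory scanning and database detection described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's implementation: the names `DbKind`, `classify`, and `scan_network_dir` are hypothetical, and the only rules taken from the description are that `data.mdb` marks an LMDB database and a `.db` extension marks a SQLite database.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Kind of database file recognised by this sketch (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DbKind {
    Lmdb,
    Sqlite,
}

/// Classify a file by name: `data.mdb` is treated as LMDB, `*.db` as SQLite.
pub fn classify(path: &Path) -> Option<DbKind> {
    let name = path.file_name()?.to_str()?;
    if name == "data.mdb" {
        Some(DbKind::Lmdb)
    } else if path.extension().and_then(|e| e.to_str()) == Some("db") {
        Some(DbKind::Sqlite)
    } else {
        None
    }
}

/// Recursively walk `dir` and append every database file found to `found`.
/// Note: `is_dir` follows symlinks, so a symlink cycle would recurse forever.
pub fn scan_network_dir(dir: &Path, found: &mut Vec<(PathBuf, DbKind)>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            scan_network_dir(&path, found)?;
        } else if let Some(kind) = classify(&path) {
            found.push((path, kind));
        }
    }
    Ok(())
}
```

Grouping the results by component (Base Node, Wallet, DHT, Peer Database) would then be a matter of inspecting the parent directories of each returned path.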

Motivation and Context


Previously, database analysis was limited to single LMDB files and required nodes to be stopped due to file locking. This tool addresses the need for:

  1. Network-wide database visibility across all Tari components
  2. Non-disruptive analysis that works while nodes are running
  3. Comprehensive SQLite analysis alongside existing LMDB capabilities
  4. Flexible output formats for integration with monitoring and analysis workflows

How Has This Been Tested?


  • Basic functionality: Tested network directory scanning with real Tari mainnet data
  • LMDB analysis: Verified detailed base node LMDB analysis with 31 databases and 20M+ entries
  • SQLite analysis: Tested detailed SQLite analysis showing table breakdowns and statistics
  • Output formats: Verified table, JSON, and CSV output formats
  • Export functionality: Tested file export with automatic format detection
  • Read-only access: Confirmed tool works while Tari base node is running
  • Error handling: Tested with various network directory configurations

What process can a PR reviewer use to test or verify this change?


  1. Build the tool: cd applications/minotari_utils && cargo build
  2. Basic network scan: cargo run -- dbstats --network-dir ~/.tari/mainnet
  3. Detailed analysis: cargo run -- dbstats --network-dir ~/.tari/mainnet --include-detailed
  4. Test output formats:
    • JSON: cargo run -- dbstats --format json
    • CSV: cargo run -- dbstats --format csv
  5. Test export: cargo run -- dbstats --export /tmp/stats.json
  6. Verify while node running: Run tool while base node is active to confirm read-only access
  7. Check help: cargo run -- dbstats --help

Expected output should show:

  • Network-wide database discovery (LMDB and SQLite)
  • Detailed LMDB analysis with environment info and per-database stats
  • Detailed SQLite analysis with database info and per-table stats
  • Clean output in all formats without compilation warnings

Breaking Changes


  • None
  • Requires data directory on base node to be deleted
  • Requires hard fork
  • Other - Please specify

Summary by CodeRabbit

  • New Features

    • Introduced the minotari_utils command-line tool for analyzing Tari base node database usage.
    • Added a nodedbstats command to scan, analyze, and report statistics for LMDB and SQLite databases, with support for various output formats (table, JSON, CSV), sorting, and exporting results.
    • Provided detailed database and environment statistics, including summary and per-database insights.
    • Added commands and CLI structure to support extensible subcommands within minotari_utils.
    • Enabled creation of read-only LMDB environments for safe database inspection without modification.
  • Documentation

    • Added a comprehensive README with installation instructions, usage examples, command options, and planned features.
  • Chores

    • Integrated minotari_utils into the workspace for unified build and management.

@coderabbitai
Contributor

coderabbitai bot commented Jun 3, 2025

Walkthrough

A new command-line utility, minotari_utils, is introduced for analyzing and reporting statistics on Tari base node LMDB and SQLite databases. Supporting modules for CLI parsing, configuration, and database statistics are implemented. The workspace and exports are updated to include this tool, and new utility functions for LMDB environments are added.

Changes

| File(s) | Change Summary |
|---|---|
| Cargo.toml | Added applications/minotari_utils to workspace members. |
| applications/minotari_utils/Cargo.toml, README.md, src/main.rs | Introduced new minotari_utils application with manifest, documentation, and main entry point. |
| applications/minotari_utils/src/cli.rs, src/commands/mod.rs, src/commands/dbstats.rs | Implemented CLI structure, subcommands, and a comprehensive dbstats command for database analysis/reporting. |
| applications/minotari_utils/src/config.rs | Added configuration module for managing base path, network, and database path. |
| base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs | Added public functions: get_all_database_names and create_readonly_lmdb_environment. |
| base_layer/core/src/chain_storage/lmdb_db/mod.rs, base_layer/core/src/chain_storage/mod.rs | Re-exported new LMDB utility functions for broader accessibility. |
| supply-chain/config.toml | Added multiple crate exemptions under "safe-to-deploy" criteria and adjusted one exemption's criteria. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant Commands
    participant DbStats
    participant LMDB/SQLite

    User->>CLI: Run minotari_utils with arguments
    CLI->>Commands: Parse and dispatch subcommand
    Commands->>DbStats: Execute dbstats
    DbStats->>LMDB/SQLite: Scan and analyze databases
    LMDB/SQLite-->>DbStats: Return stats
    DbStats-->>User: Output results (table/JSON/CSV)

Poem

🐇
In the warren, bytes abound,
Now a tool to count is found!
LMDB and SQLite,
Measured morning, noon, and night.
With stats and tables, clear and bright—
Hooray for data, neat and right!
🗃️✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51ba274 and 579bb2b.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • supply-chain/imports.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • applications/minotari_utils/Cargo.toml (1 hunks)
  • supply-chain/config.toml (8 hunks)
✅ Files skipped from review due to trivial changes (1)
  • supply-chain/config.toml
🚧 Files skipped from review as they are similar to previous changes (1)
  • applications/minotari_utils/Cargo.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: test (mainnet, stagenet)
  • GitHub Check: test (nextnet, nextnet)
  • GitHub Check: test (testnet, esmeralda)
  • GitHub Check: cargo check with stable
  • GitHub Check: ci


@github-actions

github-actions bot commented Jun 3, 2025

Test Results (CI)

    3 files    135 suites   35m 19s ⏱️
1 358 tests 1 358 ✅ 0 💤 0 ❌
4 072 runs  4 072 ✅ 0 💤 0 ❌

Results for commit 51ba274.

♻️ This comment has been updated with latest results.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (6)
applications/minotari_util/src/main.rs (1)

29-34: Clean main function structure with room for enhancement.

The main function follows good practices with proper logging initialization and clear execution flow. Consider adding explicit error handling for better user experience:

 fn main() -> anyhow::Result<()> {
     env_logger::init();
 
-    let cli = Cli::parse();
-    cli.execute()
+    let cli = Cli::parse();
+    if let Err(e) = cli.execute() {
+        eprintln!("Error: {}", e);
+        std::process::exit(1);
+    }
+    Ok(())
 }

This would provide cleaner error messages to users without stack traces.

applications/minotari_util/src/config.rs (1)

37-49: Configuration initialization with sensible defaults.

The from_cli method provides good default values and follows expected Tari directory conventions. Consider making the database path segments configurable for future flexibility:

// Future enhancement: make path configurable
pub const DEFAULT_DB_SUBPATH: &str = "data/base_node/db";

// Usage:
let db_path = base_path.join(DEFAULT_DB_SUBPATH);

This would make it easier to support different database locations in the future.

applications/minotari_util/README.md (1)

80-80: Minor hyphenation correction needed.

When used as a modifier, "Export specific" should be hyphenated as "Export-specific":

-- `export` - Export specific data sets
+- `export` - Export-specific data sets
🧰 Tools
🪛 LanguageTool

[uncategorized] ~80-~80: When ‘Export-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... Database backup utilities - export - Export specific data sets - compact - Database compac...

(SPECIFIC_HYPHEN)

applications/minotari_util/src/commands/dbstats.rs (3)

38-60: Remove unnecessary empty line and fix formatting.

There's an unnecessary empty line after the opening brace that should be removed for consistency.

 fn create_readonly_lmdb_store<P: AsRef<Path>>(
     path: P,
     config: LMDBConfig,
 ) -> Result<LMDBStore, anyhow::Error> {
-
     debug!("Opening LMDB store in read-only mode at {:?}", path.as_ref());

258-259: Remove unnecessary empty line and fix formatting.

There's an unnecessary empty line after the opening brace that should be removed for consistency.

 fn collect_database_stats(db_path: &Path) -> Result<DbStatsOutput> {
-    
     // Open LMDB store in read-only mode without exclusive lock

328-330: Remove trailing empty lines.

Remove the multiple empty lines at the end of the file for cleaner formatting.

     })
 }
-

-
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 41add9f and aa25b56.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • Cargo.toml (1 hunks)
  • applications/minotari_util/Cargo.toml (1 hunks)
  • applications/minotari_util/README.md (1 hunks)
  • applications/minotari_util/src/cli.rs (1 hunks)
  • applications/minotari_util/src/commands/dbstats.rs (1 hunks)
  • applications/minotari_util/src/commands/mod.rs (1 hunks)
  • applications/minotari_util/src/config.rs (1 hunks)
  • applications/minotari_util/src/main.rs (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
applications/minotari_util/src/main.rs (1)
applications/minotari_util/src/cli.rs (1)
  • parse (50-52)
applications/minotari_util/src/cli.rs (2)
applications/minotari_util/src/commands/mod.rs (1)
  • execute (43-47)
applications/minotari_util/src/commands/dbstats.rs (1)
  • execute (158-179)
applications/minotari_util/src/commands/mod.rs (2)
applications/minotari_util/src/cli.rs (1)
  • execute (54-57)
applications/minotari_util/src/commands/dbstats.rs (1)
  • execute (158-179)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: test (mainnet, stagenet)
  • GitHub Check: test (nextnet, nextnet)
  • GitHub Check: test (testnet, esmeralda)
  • GitHub Check: cargo check with stable
🔇 Additional comments (9)
Cargo.toml (1)

41-41: LGTM! Workspace integration is correct.

The new minotari_util application is properly integrated into the workspace in the correct alphabetical position among other applications.

applications/minotari_util/src/config.rs (1)

27-34: Well-structured configuration with appropriate dead code handling.

The AppConfig struct is well-designed for the initial implementation. The #[allow(dead_code)] attributes are appropriate since this is a new tool where base_path and network will likely be used by future commands.

applications/minotari_util/README.md (1)

1-103: Excellent comprehensive documentation.

The README provides thorough documentation for the new CLI tool with clear examples, usage patterns, and a helpful roadmap for future development. The structure and content effectively communicate the tool's purpose and capabilities.

applications/minotari_util/src/commands/mod.rs (1)

29-48: LGTM! Clean command dispatcher implementation.

The enum structure and execution pattern follow standard CLI best practices. The implementation is well-structured for extensibility as new commands are added in the future.

applications/minotari_util/src/cli.rs (1)

28-58: LGTM! Well-designed CLI structure with proper ownership handling.

The CLI implementation follows clap best practices. The use of std::mem::take in the execute method is a clean pattern for transferring ownership of the command while maintaining a reference to the CLI context for configuration access.
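
As a rough sketch of the `std::mem::take` pattern mentioned here: the command is moved out of the CLI struct, a default placeholder is left behind, and the rest of the struct stays available by reference. The `Cli` and `Command` types below are hypothetical stand-ins, not the PR's definitions, and the pattern assumes the command type implements `Default`.

```rust
/// Hypothetical subcommand enum; `None` is the placeholder left behind by `take`.
#[derive(Debug, Default, PartialEq)]
pub enum Command {
    #[default]
    None,
    DbStats { limit: usize },
}

/// Hypothetical CLI context holding global options plus the chosen command.
pub struct Cli {
    pub verbose: bool,
    pub command: Command,
}

impl Command {
    /// Consumes the command while still reading the CLI context by reference.
    pub fn execute(self, cli: &Cli) -> String {
        match self {
            Command::None => "nothing to do".to_string(),
            Command::DbStats { limit } => {
                format!("dbstats limit={} verbose={}", limit, cli.verbose)
            }
        }
    }
}

impl Cli {
    pub fn execute(&mut self) -> String {
        // Move the command out, leaving `Command::None` in its place, so that
        // `self` can still be borrowed immutably while the command runs.
        let command = std::mem::take(&mut self.command);
        command.execute(self)
    }
}
```

The trade-off is that `cli.command` holds the placeholder value after `execute` returns, so the pattern suits one-shot dispatch.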

applications/minotari_util/Cargo.toml (1)

1-35: LGTM! Well-structured package manifest with appropriate dependencies.

The Cargo.toml properly inherits workspace settings and includes all necessary dependencies for CLI parsing, LMDB database access, serialization, and output formatting. The dependency versions and features are appropriate for the intended functionality.

applications/minotari_util/src/commands/dbstats.rs (3)

37-60: LGTM! Solid read-only LMDB connection with proper safety measures.

The function correctly opens LMDB in read-only mode with NOLOCK and RDONLY flags, which is perfect for statistics collection without interfering with running base nodes. The error handling and path validation are also well implemented.


62-87: LGTM! Comprehensive CLI argument structure.

The DbStatsArgs struct provides a good set of options for database analysis including path override, output formats, sorting, filtering, and export capabilities. The argument annotations and defaults are well-chosen.


181-246: LGTM! Well-implemented output formatting with good user experience.

The output methods provide clear formatting for different use cases:

  • Table output includes environment info, sorted databases, and summary
  • JSON output provides structured data for programmatic use
  • CSV output enables spreadsheet analysis
  • Export functionality supports both JSON and CSV formats

The sorting and truncation logic is also well implemented.

@fluffypony fluffypony requested a review from a team as a code owner June 3, 2025 04:24
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa25b56 and 359b046.

📒 Files selected for processing (4)
  • applications/minotari_util/src/commands/dbstats.rs (1 hunks)
  • base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (2 hunks)
  • base_layer/core/src/chain_storage/lmdb_db/mod.rs (1 hunks)
  • base_layer/core/src/chain_storage/mod.rs (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • base_layer/core/src/chain_storage/lmdb_db/mod.rs
  • base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
base_layer/core/src/chain_storage/mod.rs (1)
base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (1)
  • create_readonly_lmdb_environment (235-269)
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: test (mainnet, stagenet)
  • GitHub Check: test (testnet, esmeralda)
  • GitHub Check: test (nextnet, nextnet)
  • GitHub Check: cargo check with stable
  • GitHub Check: file licenses
🔇 Additional comments (5)
base_layer/core/src/chain_storage/mod.rs (1)

68-68: LGTM! Clean export addition for the new utility.

The addition of create_readonly_lmdb_environment to the public exports is well-justified and enables the new minotari_util application to safely access LMDB databases in read-only mode. The function implementation (from the relevant snippets) shows proper safety measures with read-only flags and comprehensive error handling.

applications/minotari_util/src/commands/dbstats.rs (4)

38-63: Well-designed CLI interface with comprehensive options.

The DbStatsArgs structure provides a thoughtful set of command-line options that cover the key use cases mentioned in the PR objectives: path customization, multiple output formats, sorting, filtering, and export capabilities. The argument definitions are clear and well-documented.


82-127: Comprehensive data structures for database statistics.

The data structures effectively capture all relevant LMDB statistics including per-database metrics (entries, sizes, page counts, depth) and environment-level information. The use of Tabled derive for formatting and Serialize/Deserialize for export functionality is appropriate. The format_size helper using ByteSize provides human-readable output.
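
For readers unfamiliar with this kind of helper, here is a standard-library-only sketch of a human-readable size formatter. It is a stand-in for illustration: the PR is described as using ByteSize, whose exact output format may differ, and the binary units and one-decimal rounding here are assumptions.

```rust
/// Render a byte count with binary units (1 KiB = 1024 bytes).
/// Stand-in sketch; not the ByteSize-based helper from the PR.
pub fn format_size(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    // Scale down by 1024 until the value fits the current unit.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        // Plain bytes are printed exactly, without a fractional part.
        format!("{} B", bytes)
    } else {
        format!("{:.1} {}", value, UNITS[unit])
    }
}
```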


270-304: Robust database statistics collection with appropriate error handling.

The implementation correctly:

  • Uses read-only transactions for safe access
  • Calculates comprehensive statistics including page counts and sizes
  • Handles missing or inaccessible databases gracefully with silent skipping
  • Computes derived metrics like average size per entry

The silent error handling (lines 295-297, 300-302) is appropriate for a read-only inspection tool, as some databases may legitimately not exist in all Tari installations.


134-155: Clean execution flow with proper validation and error handling.

The execution method provides:

  • Path validation before attempting database access
  • Clean delegation to format-specific output methods
  • Optional export functionality
  • Comprehensive error propagation using anyhow::Result

This addresses the past review feedback by successfully implementing per-database statistics collection rather than just environment summary.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
applications/minotari_util/src/commands/dbstats.rs (2)

157-187: Consider improving table output formatting consistency.

The table output mixes formatted and unformatted numbers, which could confuse users. For example, environment info shows raw numbers while the table uses formatted sizes.

Consider applying consistent formatting:

-        println!("  Last Page: {}", stats.environment.last_pgno);
-        println!("  Last Transaction ID: {}", stats.environment.last_txnid);
+        println!("  Last Page: {:,}", stats.environment.last_pgno);
+        println!("  Last Transaction ID: {:,}", stats.environment.last_txnid);
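
Note that Rust's `std::fmt` has no `{:,}` specifier, so the suggested lines would not compile as written. One way to get grouped digits without an extra dependency is a small helper like the sketch below; the name `with_thousands` and the `u64` argument type are assumptions, since the types of `last_pgno` and `last_txnid` are not shown here.

```rust
/// Insert `,` between groups of three digits, e.g. 1234567 -> "1,234,567".
pub fn with_thousands(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::with_capacity(digits.len() + digits.len() / 3);
    for (i, ch) in digits.chars().enumerate() {
        // A separator goes before every digit whose distance from the end
        // is a multiple of three, except before the very first digit.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(ch);
    }
    out
}
```

With such a helper, the output line could read `println!("  Last Page: {}", with_thousands(stats.environment.last_pgno as u64));`, assuming the field converts losslessly to `u64`.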

267-271: Potential division by zero protection could be more explicit.

While the code correctly handles the case where db_stat.entries is zero, the logic could be more explicit about this edge case for better code readability.

Consider making the zero-division protection more explicit:

-                        let avg_size = if db_stat.entries > 0 {
-                            total_size / db_stat.entries
-                        } else {
-                            0
-                        };
+                        let avg_size = if db_stat.entries == 0 {
+                            0  // Avoid division by zero for empty databases
+                        } else {
+                            total_size / db_stat.entries
+                        };
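
An alternative that avoids the branch altogether is `checked_div`, which returns `None` for a zero divisor. A minimal sketch, assuming both values are `u64` (the actual field types in the PR may differ) and using the hypothetical name `average_entry_size`:

```rust
/// Average bytes per entry; returns 0 for an empty database.
pub fn average_entry_size(total_size: u64, entries: u64) -> u64 {
    // `checked_div` yields `None` when the divisor is zero,
    // which is mapped to 0 here instead of panicking.
    total_size.checked_div(entries).unwrap_or(0)
}
```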
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 359b046 and d5fff36.

📒 Files selected for processing (4)
  • applications/minotari_util/src/commands/dbstats.rs (1 hunks)
  • base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (3 hunks)
  • base_layer/core/src/chain_storage/lmdb_db/mod.rs (1 hunks)
  • base_layer/core/src/chain_storage/mod.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • base_layer/core/src/chain_storage/lmdb_db/mod.rs
  • base_layer/core/src/chain_storage/mod.rs
  • base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: test (mainnet, stagenet)
  • GitHub Check: test (testnet, esmeralda)
  • GitHub Check: test (nextnet, nextnet)
  • GitHub Check: cargo check with stable
🔇 Additional comments (8)
applications/minotari_util/src/commands/dbstats.rs (8)

32-33: Well-designed integration with Tari core libraries.

The imports correctly use the newly added functions create_readonly_lmdb_environment and get_all_database_names from the Tari core library, which addresses the previous concerns about hardcoded database names and provides a clean abstraction for read-only LMDB access.


38-63: Comprehensive CLI argument structure.

The DbStatsArgs struct provides excellent coverage of expected functionality including path overrides, multiple output formats, sorting options, result limiting, and export capabilities. The argument definitions are clear and well-documented.


82-102: Well-structured database statistics representation.

The DatabaseStats struct effectively captures all relevant LMDB database metrics including B-tree structure details (leaf, branch, overflow pages). The tabled annotations with custom size formatting provide user-friendly output.


134-155: Robust main execution flow with proper error handling.

The execute function follows a clean pattern: configuration resolution, path validation, data collection, output formatting, and optional export. Error handling is comprehensive and user-friendly.


206-222: Robust export functionality with proper format detection.

The export function correctly determines output format by file extension and provides clear error messages for unsupported formats. Error handling covers serialization, file writing, and encoding issues appropriately.
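
Extension-based format detection of this kind can be sketched as follows. The names `ExportFormat` and `detect_export_format`, the case-insensitive match, and the error wording are all assumptions for illustration; only the JSON/CSV choice and the rejection of unsupported formats come from the review.

```rust
use std::path::Path;

/// Export formats supported by this sketch (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum ExportFormat {
    Json,
    Csv,
}

/// Pick the export format from the file extension, case-insensitively.
pub fn detect_export_format(path: &Path) -> Result<ExportFormat, String> {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some("json") => Ok(ExportFormat::Json),
        Some("csv") => Ok(ExportFormat::Csv),
        Some(other) => Err(format!(
            "unsupported export format '{}', expected .json or .csv",
            other
        )),
        None => Err("export path has no file extension, expected .json or .csv".to_string()),
    }
}
```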


255-256: Excellent resolution of previous hardcoded database names issue.

Using get_all_database_names() provides a centralized, authoritative source for database names, eliminating the maintenance burden and potential errors from hardcoded lists mentioned in previous reviews.


260-294: Comprehensive per-database statistics collection addresses previous concerns.

The implementation now properly iterates through individual databases and collects detailed statistics for each, resolving the previous issue where only environment summary was collected. The silent error handling for missing or inaccessible databases is appropriate for a statistics tool.


296-317: Well-implemented summary calculations with edge case handling.

The summary statistics properly aggregate data across all databases and handle edge cases like empty database lists. The calculation of largest database and averages is correct and robust.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
applications/minotari_util/src/commands/dbstats.rs (1)

56-58: Unused CLI argument: include_detailed flag is defined but never used.

The include_detailed flag is defined in the CLI arguments but is not referenced anywhere in the implementation. This could confuse users who might expect additional detailed output when using this flag.

Either remove the unused flag or implement the detailed functionality:

-    /// Include detailed per-database stats
-    #[arg(long)]
-    pub include_detailed: bool,

Or implement the functionality to show additional details when this flag is set.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5fff36 and 8d451fe.

📒 Files selected for processing (1)
  • applications/minotari_util/src/commands/dbstats.rs (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: test (testnet, esmeralda)
  • GitHub Check: cargo check with stable
  • GitHub Check: test (nextnet, nextnet)
🔇 Additional comments (6)
applications/minotari_util/src/commands/dbstats.rs (6)

23-36: Import organization looks clean and appropriate.

The imports are well-organized and all dependencies appear to be used appropriately. The use of tari_core::chain_storage::{create_readonly_lmdb_environment, get_all_database_names} addresses the previous concerns about using authoritative database names.


234-257: Excellent use of new read-only LMDB functions.

The implementation correctly uses the new create_readonly_lmdb_environment function and get_all_database_names to ensure authoritative database access and enumeration. This addresses the previous review concerns about hardcoded database names and proper LMDB access.


258-294: Robust database statistics collection with proper error handling.

The implementation correctly iterates through all database names, handles cases where databases might not exist or be accessible, and collects comprehensive statistics including page counts, sizes, and depth information. The error handling gracefully continues processing other databases when individual ones fail.


296-323: Well-implemented summary statistics calculation.

The summary calculation correctly aggregates data across all databases, handles edge cases (like empty database lists), and provides useful metrics like total size, largest database, and average entries per database.
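
A sketch of such an aggregation, including the empty-list edge case, might look like this. The `DbStat` and `Summary` types and their fields are hypothetical simplifications, not the structs from the PR.

```rust
/// Simplified per-database record (hypothetical; the PR's struct has more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct DbStat {
    pub name: String,
    pub entries: u64,
    pub total_size: u64,
}

/// Aggregate figures across all databases (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Summary {
    pub database_count: usize,
    pub total_entries: u64,
    pub total_size: u64,
    pub largest_database: Option<String>,
    pub avg_entries_per_database: u64,
}

pub fn summarize(stats: &[DbStat]) -> Summary {
    let total_entries: u64 = stats.iter().map(|s| s.entries).sum();
    let total_size: u64 = stats.iter().map(|s| s.total_size).sum();
    // `max_by_key` returns `None` for an empty slice, so no special case is needed.
    let largest_database = stats
        .iter()
        .max_by_key(|s| s.total_size)
        .map(|s| s.name.clone());
    // `checked_div` guards against dividing by zero when there are no databases.
    let avg_entries_per_database = total_entries
        .checked_div(stats.len() as u64)
        .unwrap_or(0);
    Summary {
        database_count: stats.len(),
        total_entries,
        total_size,
        largest_database,
        avg_entries_per_database,
    }
}
```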


206-222: Good export functionality with proper format detection.

The export functionality properly detects file extensions and supports both JSON and CSV formats with appropriate error handling for unsupported formats.


224-231: Complete sorting implementation covering all defined fields.

The sorting implementation correctly handles all four defined sort fields (name, size, entries, pages) with appropriate ascending/descending order for each field type.
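
To illustrate, a four-field sort can be written as a single `match` over the sort key. The sketch below assumes names sort ascending and numeric fields sort descending (largest first); the review does not state the actual directions, and `SortField` and `DbRow` are hypothetical types.

```rust
/// Sort keys named in the review; the enum itself is a hypothetical sketch.
#[derive(Debug, Clone, Copy)]
pub enum SortField {
    Name,
    Size,
    Entries,
    Pages,
}

/// Simplified table row (hypothetical).
#[derive(Debug, Clone)]
pub struct DbRow {
    pub name: String,
    pub size: u64,
    pub entries: u64,
    pub pages: u64,
}

/// Sort in place: names ascending, numeric fields descending (assumed order).
pub fn sort_rows(rows: &mut [DbRow], field: SortField) {
    match field {
        SortField::Name => rows.sort_by(|a, b| a.name.cmp(&b.name)),
        SortField::Size => rows.sort_by(|a, b| b.size.cmp(&a.size)),
        SortField::Entries => rows.sort_by(|a, b| b.entries.cmp(&a.entries)),
        SortField::Pages => rows.sort_by(|a, b| b.pages.cmp(&a.pages)),
    }
}
```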

Comment on lines +285 to +287
Err(e) => {
    println!("Failed to get stats for database '{}': {}", db_name, e);
}
Contributor

🛠️ Refactor suggestion

Error messages should be written to stderr instead of stdout.

Error messages for database access failures are currently printed to stdout, which can interfere with structured output formats like JSON or CSV.

-                        println!("Failed to get stats for database '{}': {}", db_name, e);
+                        eprintln!("Failed to get stats for database '{}': {}", db_name, e);

-                println!("Failed to open database '{}': {}", db_name, e);
+                eprintln!("Failed to open database '{}': {}", db_name, e);

Also applies to: 290-292

🤖 Prompt for AI Agents
In applications/minotari_util/src/commands/dbstats.rs at lines 285-287 and
290-292, error messages are printed using println!, which writes to stdout.
Change these to use eprintln! instead to write error messages to stderr,
ensuring that error output does not interfere with structured stdout output like
JSON or CSV.
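
To illustrate why the separation matters, here is a small sketch that routes data and diagnostics to two different writers. `report` is a hypothetical helper, not part of the PR; in a real CLI the two writers would be `std::io::stdout()` and `std::io::stderr()`, while generic writers are used here so the behaviour can be checked in memory.

```rust
use std::io::{self, Write};

/// Writes one CSV line per successful result to `out` and one warning per
/// failure to `err`, so the data stream stays machine-parseable.
pub fn report<W: Write, E: Write>(
    results: &[(&str, Result<u64, String>)],
    out: &mut W,
    err: &mut E,
) -> io::Result<()> {
    for (name, result) in results {
        match result {
            Ok(entries) => writeln!(out, "{},{}", name, entries)?,
            Err(e) => writeln!(err, "Failed to get stats for database '{}': {}", name, e)?,
        }
    }
    Ok(())
}
```

With this split, piping the tool's stdout into a CSV or JSON parser is unaffected by any warnings that are emitted.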

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
applications/minotari_util/src/commands/dbstats.rs (2)

197-367: Well-implemented helper functions with purpose clarification needed.

The output formatting methods and directory scanning functions are well-implemented with proper error handling. However, these functions serve a network-wide scanning purpose rather than the detailed LMDB database analysis described in the README.

Consider separating the network scanning functionality into a different command (e.g., scan-network) to clarify the distinction between network-wide database discovery and detailed database statistics analysis.


369-448: Excellent implementation of detailed database statistics collection.

This function properly implements the core functionality described in the README, collecting comprehensive statistics for each database using the authoritative database names from get_all_database_names(). The implementation addresses previous review comments about collecting per-database statistics.

Consider adding more detailed error logging for individual database failures:

     for db_name in db_names {
         if let Ok(database) = Database::open(&*env, Some(db_name), &DatabaseOptions::defaults()) {
             if let Ok(db_stat) = ReadTransaction::new(env.clone()).and_then(|txn| txn.db_stat(&database)) {
                 // ... existing stats collection code ...
+            } else {
+                eprintln!("Warning: Failed to get stats for database '{}'", db_name);
             }
+        } else {
+            eprintln!("Warning: Failed to open database '{}'", db_name);
         }
     }
applications/minotari_util/README.md (1)

80-80: Minor grammatical improvement needed.

The phrase should be hyphenated when used as a modifier.

-- `export` - Export specific data sets
+- `export` - Export-specific data sets
🧰 Tools
🪛 LanguageTool

[uncategorized] ~80-~80: When ‘Export-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... Database backup utilities - export - Export specific data sets - compact - Database compac...

(SPECIFIC_HYPHEN)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d451fe and 27797c1.

📒 Files selected for processing (3)
  • applications/minotari_util/README.md (1 hunks)
  • applications/minotari_util/src/commands/dbstats.rs (1 hunks)
  • applications/minotari_util/src/commands/mod.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • applications/minotari_util/src/commands/mod.rs
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: test (testnet, esmeralda)
🔇 Additional comments (2)
applications/minotari_util/src/commands/dbstats.rs (1)

1-154: Well-structured data models and CLI interface.

The imports, enums, and data structures are well-designed for the database statistics functionality. The use of appropriate crates (serde, tabled, clap) and the comprehensive data models for database statistics provide a solid foundation for the tool.

applications/minotari_util/README.md (1)

1-103: ⚠️ Potential issue

Documentation doesn't match actual implementation.

The README describes detailed database statistics functionality with options like --db-path, --network, and --verbose, but the actual implementation in dbstats.rs only performs network-wide database scanning with different options (--network-dir, --format, --sort-by, etc.).

Update the documentation to match the actual implementation, or implement the functionality as documented. Key discrepancies:

  1. The actual CLI uses --network-dir instead of --db-path
  2. The actual implementation doesn't have --network or --verbose options
  3. The current implementation does network scanning, not detailed LMDB analysis

Consider updating the README to reflect the current implementation or modify the implementation to match the documentation.

Likely an incorrect or invalid review comment.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
applications/minotari_util/src/commands/dbstats.rs (2)

359-374: Consider making component determination more robust.

The current component determination relies on string matching in file paths, which could be fragile if directory structures change or if users have non-standard setups.

Consider making this more configurable or robust:

 fn determine_component(path: &Path, base_dir: &Path) -> String {
     let relative_path = path.strip_prefix(base_dir).unwrap_or(path);
     let path_str = relative_path.to_string_lossy();
     
+    // Check components in order of specificity: "console_wallet" must be
+    // tested before "wallet", otherwise the more specific branch is unreachable
     if path_str.contains("base_node") {
         "Base Node".to_string()
+    } else if path_str.contains("console_wallet") {
+        "Console Wallet".to_string()
     } else if path_str.contains("wallet") {
         "Wallet".to_string()
-    } else if path_str.contains("peer_db") {
+    } else if path_str.contains("peer_db") || path_str.contains("peer") {
         "Peer Database".to_string()
     } else if path_str.contains("dht") {
         "DHT".to_string()
     } else {
-        "Other".to_string()
+        // Fallback to directory name or "Unknown"
+        path.file_name()
+            .map(|name| name.to_string_lossy().to_string())
+            .unwrap_or_else(|| "Unknown".to_string())
     }
 }

This provides better fallback behavior and handles additional component types.
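
One way to make the ordering explicit is a pattern table that is scanned from most to least specific. The sketch below is an assumption-laden illustration: `COMPONENT_PATTERNS` is a hypothetical constant, the substrings and labels are taken from the snippet above, and the fallback to "Other" mirrors the original code rather than the suggested file-name fallback.

```rust
use std::path::Path;

/// (substring, label) pairs, most specific first. Hypothetical table; the
/// patterns mirror the substrings used in the review snippet.
const COMPONENT_PATTERNS: &[(&str, &str)] = &[
    ("console_wallet", "Console Wallet"),
    ("base_node", "Base Node"),
    ("wallet", "Wallet"),
    ("peer_db", "Peer Database"),
    ("dht", "DHT"),
];

/// Classifies a database path by the first matching pattern in the part of
/// the path below `base_dir`; unmatched paths are labelled "Other".
pub fn determine_component(path: &Path, base_dir: &Path) -> String {
    let relative = path.strip_prefix(base_dir).unwrap_or(path);
    let path_str = relative.to_string_lossy();
    COMPONENT_PATTERNS
        .iter()
        .find(|(needle, _)| path_str.contains(*needle))
        .map(|(_, label)| label.to_string())
        .unwrap_or_else(|| "Other".to_string())
}
```

Because the table is ordered, adding a new component only requires inserting a row at the right position instead of rearranging an `if`/`else` chain.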


426-450: Consider providing feedback for skipped databases.

Currently, databases that fail to open are silently skipped, which might leave users wondering why certain expected databases don't appear in the output. Consider adding optional logging or a summary of skipped databases.

 // Get statistics for each database
+let mut skipped_databases = Vec::new();
 for db_name in db_names {
-    if let Ok(database) = Database::open(&*env, Some(db_name), &DatabaseOptions::defaults()) {
-        if let Ok(db_stat) = ReadTransaction::new(env.clone()).and_then(|txn| txn.db_stat(&database)) {
+    match Database::open(&*env, Some(db_name), &DatabaseOptions::defaults()) {
+        Ok(database) => {
+            match ReadTransaction::new(env.clone()).and_then(|txn| txn.db_stat(&database)) {
+                Ok(db_stat) => {
                     let total_pages = db_stat.leaf_pages + db_stat.branch_pages + db_stat.overflow_pages;
                     let total_size = total_pages * page_size;
                     let avg_size = if db_stat.entries > 0 {
                         total_size / db_stat.entries
                     } else {
                         0
                     };

                     databases.push(DatabaseStats {
                         name: db_name.to_string(),
                         entries: db_stat.entries,
                         total_size,
                         avg_size,
                         depth: db_stat.depth,
                         total_pages,
                         leaf_pages: db_stat.leaf_pages,
                         branch_pages: db_stat.branch_pages,
                         overflow_pages: db_stat.overflow_pages,
                     });
+                }
+                Err(_e) => {
+                    // Database exists but couldn't get stats - might be corrupted or locked
+                    skipped_databases.push((db_name, "Failed to read statistics"));
+                }
+            }
+        }
+        Err(_e) => {
+            // Database doesn't exist - this is expected for some databases
+            skipped_databases.push((db_name, "Database not found"));
         }
     }
 }

+// Optionally log skipped databases for debugging
+if !skipped_databases.is_empty() && std::env::var("TARI_UTIL_VERBOSE").is_ok() {
+    eprintln!("Skipped {} databases:", skipped_databases.len());
+    for (name, reason) in &skipped_databases {
+        eprintln!("  {}: {}", name, reason);
+    }
+}

This provides better observability while maintaining the current behavior by default.
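
The same idea can be sketched independently of LMDB by partitioning per-database results into collected statistics and skipped entries. `partition_results` and `skipped_summary` are hypothetical helpers, and unlike the suggestion above this sketch keeps the underlying error text as the skip reason, which is a design assumption rather than something the PR does.

```rust
use std::fmt::Display;

/// Splits per-database results into collected stats and a list of skipped
/// databases, keeping the error text as the reason for each skip.
pub fn partition_results<T, E: Display>(
    results: Vec<(String, Result<T, E>)>,
) -> (Vec<(String, T)>, Vec<(String, String)>) {
    let mut collected = Vec::new();
    let mut skipped = Vec::new();
    for (name, result) in results {
        match result {
            Ok(stats) => collected.push((name, stats)),
            Err(e) => skipped.push((name, e.to_string())),
        }
    }
    (collected, skipped)
}

/// Formats the skipped list as text suitable for printing to stderr.
pub fn skipped_summary(skipped: &[(String, String)]) -> String {
    let mut text = format!("Skipped {} databases:", skipped.len());
    for (name, reason) in skipped {
        text.push_str(&format!("\n  {}: {}", name, reason));
    }
    text
}
```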

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 27797c1 and a7ece75.

📒 Files selected for processing (2)
  • applications/minotari_util/src/commands/dbstats.rs (1 hunks)
  • applications/minotari_util/src/config.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • applications/minotari_util/src/config.rs
🔇 Additional comments (3)
applications/minotari_util/src/commands/dbstats.rs (3)

23-162: Well-structured foundation with appropriate dependencies.

The imports, CLI argument structure, and data models are well-designed with proper serialization support and clear field naming conventions. The byte size formatting functions provide good user experience.


235-273: Past review concerns have been properly addressed.

The implementation now correctly provides detailed database statistics when requested, addressing the previous concern about the execution only performing network scans. The integration with collect_database_stats and the use of get_all_database_names() from Tari core resolves the earlier issues about hardcoded database names and missing detailed analysis.


1-484: Well-implemented database statistics utility with comprehensive functionality.

This implementation successfully provides the database analysis functionality described in the PR objectives. Key strengths include:

  • Clean separation between network-wide scanning and detailed LMDB analysis
  • Support for multiple output formats (table, JSON, CSV)
  • Proper integration with Tari core for authoritative database names
  • Comprehensive error handling with Result types
  • Flexible CLI interface with sorting, filtering, and export options

The code effectively addresses the goal of providing base node operators with tools to inspect LMDB database usage and identify performance bottlenecks.

Comment on lines +169 to +172
        let network_dir = self.network_dir.clone().unwrap_or_else(|| {
            let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
            PathBuf::from(home).join(".tari").join("mainnet")
        });

🛠️ Refactor suggestion

Improve cross-platform compatibility for default path handling.

The current default path logic uses the HOME environment variable, which is Unix-specific and will not work correctly on Windows systems where the equivalent would be USERPROFILE or HOMEDRIVE/HOMEPATH.

Consider using a cross-platform approach:

-        let network_dir = self.network_dir.clone().unwrap_or_else(|| {
-            let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
-            PathBuf::from(home).join(".tari").join("mainnet")
-        });
+        let network_dir = self.network_dir.clone().unwrap_or_else(|| {
+            let home = dirs::home_dir().unwrap_or_else(|| PathBuf::from("."));
+            home.join(".tari").join("mainnet")
+        });

This requires adding the dirs crate as a dependency, which provides cross-platform home directory detection.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        let network_dir = self.network_dir.clone().unwrap_or_else(|| {
-            let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
-            PathBuf::from(home).join(".tari").join("mainnet")
-        });
+        let network_dir = self.network_dir.clone().unwrap_or_else(|| {
+            let home = dirs::home_dir().unwrap_or_else(|| PathBuf::from("."));
+            home.join(".tari").join("mainnet")
+        });
🤖 Prompt for AI Agents
In applications/minotari_util/src/commands/dbstats.rs around lines 169 to 172,
the code uses the Unix-specific HOME environment variable to determine the
default path, which breaks on Windows. Replace this logic by using the dirs
crate's home_dir() function to get the user's home directory in a cross-platform
way. Add the dirs crate as a dependency if not already present, then update the
code to call dirs::home_dir() and build the path from there, falling back to "."
if home_dir() returns None.
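
For comparison, a standard-library-only fallback chain might look like the sketch below. This is not the `dirs` crate approach recommended above: it simply checks `HOME` and then `USERPROFILE`, which is an assumption that covers common Unix and Windows setups but not every platform case `dirs` handles. `default_network_dir` is a hypothetical helper that takes the environment lookup as a closure so it can be exercised without touching the real environment.

```rust
use std::path::PathBuf;

/// Resolves the default network directory from an environment lookup.
/// Tries HOME, then USERPROFILE, ignoring empty values, and falls back to
/// the current directory when neither is set.
pub fn default_network_dir<F>(lookup: F) -> PathBuf
where
    F: Fn(&str) -> Option<String>,
{
    let home = lookup("HOME")
        .filter(|v| !v.is_empty())
        .or_else(|| lookup("USERPROFILE").filter(|v| !v.is_empty()))
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("."));
    home.join(".tari").join("mainnet")
}
```

In a real binary the closure would typically be `|key| std::env::var(key).ok()`.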

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
applications/minotari_util/src/commands/dbstats.rs (1)

206-209: Cross-platform compatibility issue remains unaddressed.

The code still uses the Unix-specific HOME environment variable, which will not work on Windows systems. This was flagged in previous reviews but hasn't been fixed.

-        let network_dir = self.network_dir.clone().unwrap_or_else(|| {
-            let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
-            PathBuf::from(home).join(".tari").join("mainnet")
-        });
+        let network_dir = self.network_dir.clone().unwrap_or_else(|| {
+            let home = dirs::home_dir().unwrap_or_else(|| PathBuf::from("."));
+            home.join(".tari").join("mainnet")
+        });

This requires adding the dirs crate as a dependency for cross-platform home directory detection.

🧹 Nitpick comments (3)
applications/minotari_util/src/commands/dbstats.rs (3)

369-378: Remove unnecessary empty lines.

There are many consecutive empty lines that reduce code readability and don't follow Rust formatting conventions.

-    }
-
-
-
-
-
-
-
-
-
-}
+    }
+}

508-532: Consider error handling consistency.

The LMDB database iteration silently skips databases that fail to open or have statistics errors. While this may be intentional for robustness, it could hide important issues from users.

Consider adding an optional verbose mode or logging failed database access:

     for db_name in db_names {
         if let Ok(database) = Database::open(&*env, Some(db_name), &DatabaseOptions::defaults()) {
             if let Ok(db_stat) = ReadTransaction::new(env.clone()).and_then(|txn| txn.db_stat(&database)) {
                 // ... existing stats collection ...
+            } else {
+                eprintln!("Warning: Failed to get statistics for database '{}'", db_name);
             }
+        } else {
+            eprintln!("Warning: Failed to open database '{}'", db_name);
         }
     }

651-653: Remove trailing empty lines.

The file ends with unnecessary empty lines that should be removed for cleaner formatting.

-}
-
-
-
+}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a7ece75 and 1eacaff.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • applications/minotari_util/Cargo.toml (1 hunks)
  • applications/minotari_util/src/commands/dbstats.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • applications/minotari_util/Cargo.toml
🔇 Additional comments (1)
applications/minotari_util/src/commands/dbstats.rs (1)

599-603: 🛠️ Refactor suggestion

Improve SQLite table size estimation accuracy.

The current table size estimation logic is flawed. It calculates size_per_row as file_size / page_count, but then multiplies by row_count, which doesn't make logical sense since page_count is the total pages in the database, not total rows.

Consider using SQLite's built-in dbstat virtual table for more accurate size estimation:

-        // Estimate table size using database_list and page info
-        // This is an approximation since SQLite doesn't provide direct table size info
-        let table_size_estimate = if row_count > 0 {
-            // Use rough estimation: file_size * (table_rows / total_db_rows_estimate)
-            // This is not perfect but gives a reasonable approximation
-            let size_per_row = if row_count > 0 { file_size / page_count.max(1) } else { 0 };
-            row_count * size_per_row
-        } else {
-            0
-        };
+        // Use SQLite's dbstat virtual table for more accurate size estimation
+        let table_size_estimate = conn.query_row(
+            "SELECT COALESCE(SUM(pgsize), 0) FROM dbstat WHERE name = ?",
+            [&table_name],
+            |row| row.get::<_, u64>(0)
+        ).unwrap_or_else(|_| {
+            // Fallback to simple estimation if dbstat is not available
+            if row_count > 0 { file_size / 100 * row_count / page_count.max(1) } else { 0 }
+        });

Likely an incorrect or invalid review comment.

Comment on lines +591 to +595
        let row_count: u64 = conn.query_row(
            &format!("SELECT COUNT(*) FROM {}", table_name),
            [],
            |row| row.get(0)
        ).unwrap_or(0);

⚠️ Potential issue

Potential SQL injection vulnerability.

Using string formatting to construct SQL queries can lead to SQL injection vulnerabilities, even though table_name comes from SQLite's internal metadata.

Use parameterized queries or properly escape the table name:

-        let row_count: u64 = conn.query_row(
-            &format!("SELECT COUNT(*) FROM {}", table_name),
-            [],
-            |row| row.get(0)
-        ).unwrap_or(0);
+        let row_count: u64 = conn.query_row(
+            "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name=? 
+             AND (SELECT COUNT(*) FROM pragma_table_info(?)) > 0",
+            [&table_name, &table_name],
+            |_| Ok(0) // First check if table exists and is accessible
+        ).and_then(|_| {
+            // Use identifier quoting for safety
+            conn.query_row(
+                &format!("SELECT COUNT(*) FROM \"{}\"", table_name.replace("\"", "\"\"")),
+                [],
+                |row| row.get(0)
+            )
+        }).unwrap_or(0);

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In applications/minotari_util/src/commands/dbstats.rs around lines 591 to 595,
the current code constructs an SQL query by directly formatting the table name
into the query string, which risks SQL injection. To fix this, avoid using
string formatting for the table name in the query. Instead, validate or sanitize
the table_name to ensure it is safe, or use SQLite's built-in mechanisms to
safely quote identifiers before including them in the query. This prevents
injection by ensuring the table name is treated as an identifier, not executable
code.
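
A minimal sketch of the identifier quoting described above, using only string handling so it does not depend on any particular SQLite driver. `quote_identifier` and `count_rows_sql` are hypothetical helper names; the quoting rule itself (wrap in double quotes, double any embedded double quote) is standard SQL identifier quoting, which SQLite accepts.

```rust
/// Quotes a name as an SQL identifier: wraps it in double quotes and
/// doubles any embedded double quote.
pub fn quote_identifier(name: &str) -> String {
    format!("\"{}\"", name.replace('"', "\"\""))
}

/// Builds the row-count query for a table using the quoted identifier.
pub fn count_rows_sql(table_name: &str) -> String {
    format!("SELECT COUNT(*) FROM {}", quote_identifier(table_name))
}
```

The resulting string can then be passed to the driver's query call in place of the unquoted `format!` output.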

@github-actions

github-actions bot commented Jun 3, 2025

Test Results (Integration tests)

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit cc11bea. ± Comparison against base commit 41add9f.

@leet4tari

Title says minotari_utils, but the tool is minotari_util? Note the missing s. I think minotari_utils is better, as this might just be the start of many things that could be put into one minotari_utils.

SWvheerden previously approved these changes Jul 14, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (3)
applications/minotari_utils/src/cli.rs (1)

55-58: Simplify by consuming self directly.

The use of std::mem::take is unnecessary here since you're already taking ownership of self.

-pub fn execute(mut self) -> anyhow::Result<()> {
-    let command = std::mem::take(&mut self.command);
-    command.execute(&self)
+pub fn execute(self) -> anyhow::Result<()> {
+    self.command.execute(&self)
 }
applications/minotari_utils/src/commands/dbstats.rs (2)

31-31: Remove unnecessary semicolon from import.

-use csv;
+use csv;

438-439: Remove unnecessary empty else blocks.

These empty else blocks with clippy comments serve no purpose.

-        } else {
-            // clippy
         }

Also applies to: 473-474

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1eacaff and a313397.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • Cargo.toml (1 hunks)
  • applications/minotari_utils/Cargo.toml (1 hunks)
  • applications/minotari_utils/README.md (1 hunks)
  • applications/minotari_utils/src/cli.rs (1 hunks)
  • applications/minotari_utils/src/commands/dbstats.rs (1 hunks)
  • applications/minotari_utils/src/commands/mod.rs (1 hunks)
  • applications/minotari_utils/src/config.rs (1 hunks)
  • applications/minotari_utils/src/main.rs (1 hunks)
  • base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs (3 hunks)
  • base_layer/core/src/chain_storage/lmdb_db/mod.rs (1 hunks)
  • base_layer/core/src/chain_storage/mod.rs (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • applications/minotari_utils/README.md
  • applications/minotari_utils/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (4)
  • base_layer/core/src/chain_storage/mod.rs
  • Cargo.toml
  • base_layer/core/src/chain_storage/lmdb_db/mod.rs
  • base_layer/core/src/chain_storage/lmdb_db/lmdb_db.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
applications/minotari_utils/src/main.rs (1)
applications/minotari_utils/src/cli.rs (1)
  • parse (51-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: test (nextnet, nextnet)
  • GitHub Check: test (mainnet, stagenet)
  • GitHub Check: test (testnet, esmeralda)
  • GitHub Check: cargo check with stable
  • GitHub Check: ci
  • GitHub Check: Cucumber tests / FFI
  • GitHub Check: Cucumber tests / Base Layer
🔇 Additional comments (2)
applications/minotari_utils/src/main.rs (1)

29-34: LGTM!

The main entry point is well-structured and follows standard Rust patterns for CLI applications.

applications/minotari_utils/src/commands/mod.rs (1)

29-48: LGTM!

Clean command dispatch implementation with proper subcommand handling.

Comment on lines +27 to +35
#[derive(Debug, Clone)]
pub struct AppConfig {
    #[allow(dead_code)]
    pub base_path: PathBuf,
    #[allow(dead_code)]
    pub network: String,
    #[allow(dead_code)]
    pub db_path: PathBuf,
}

🛠️ Refactor suggestion

Remove unused configuration struct.

The AppConfig struct appears to be unused - all fields are marked with #[allow(dead_code)] and in dbstats.rs line 208, the config is created but immediately discarded with an underscore prefix. Additionally, the hardcoded db_path to "data/base_node/db" doesn't align with the tool's actual behavior of scanning entire network directories.

Consider removing this entire configuration module if it's not being used, or update it to properly support the network-wide scanning functionality described in the PR objectives.

Also applies to: 37-51

🤖 Prompt for AI Agents
In applications/minotari_utils/src/config.rs around lines 27 to 35 and 37 to 51,
the AppConfig struct and related configuration code are unused and contain
misleading hardcoded paths that do not reflect the tool's actual scanning
behavior. Remove the entire AppConfig struct and any associated configuration
code in this module to eliminate dead code and avoid confusion, unless you plan
to update it to support network-wide scanning properly.

Comment on lines +599 to +600
.query_row(&format!("SELECT COUNT(*) FROM {}", table_name), [], |row| row.get(0))
.unwrap_or(0);

⚠️ Potential issue

Fix SQL injection vulnerability.

The table name is directly interpolated into the SQL query without validation, creating a potential SQL injection vulnerability. While table names come from sqlite_master, it's still a security best practice to properly escape identifiers.

Use proper identifier quoting:

-    .query_row(&format!("SELECT COUNT(*) FROM {}", table_name), [], |row| row.get(0))
+    .query_row(&format!("SELECT COUNT(*) FROM \"{}\"", table_name.replace("\"", "\"\"")), [], |row| row.get(0))

Note that SQLite bind parameters apply to values, not identifiers, so a parameterized query cannot be used for the table name; quoting the identifier is the available mitigation.

📝 Committable suggestion

Suggested change
-    .query_row(&format!("SELECT COUNT(*) FROM {}", table_name), [], |row| row.get(0))
-    .unwrap_or(0);
+    .query_row(&format!("SELECT COUNT(*) FROM \"{}\"", table_name.replace("\"", "\"\"")), [], |row| row.get(0))
+    .unwrap_or(0);
🤖 Prompt for AI Agents
In applications/minotari_utils/src/commands/dbstats.rs around lines 599 to 600,
the SQL query directly interpolates the table name, causing a potential SQL
injection risk. To fix this, properly escape the table name using SQLite
identifier quoting (e.g., wrapping the table name in double quotes and escaping
any internal quotes) before including it in the query string. Since
parameterized queries do not support identifiers, ensure the table name is
sanitized or quoted correctly to prevent injection.

Comment on lines +604 to +615
        let table_size_estimate = if row_count > 0 {
            // Use rough estimation: file_size * (table_rows / total_db_rows_estimate)
            // This is not perfect but gives a reasonable approximation
            let size_per_row = if row_count > 0 {
                file_size / page_count.max(1)
            } else {
                0
            };
            row_count * size_per_row
        } else {
            0
        };

⚠️ Potential issue

Fix table size estimation logic.

The current logic for estimating table size is incorrect. It uses file_size / page_count as size_per_row, which doesn't make logical sense.

The estimation should be based on the proportion of rows in the table compared to total rows in the database:

-let size_per_row = if row_count > 0 {
-    file_size / page_count.max(1)
-} else {
-    0
-};
-row_count * size_per_row
+// Estimate based on proportion of database
+// This is still an approximation but more reasonable
+(file_size as f64 * (row_count as f64 / total_rows.max(1) as f64)) as u64

Note: You'll need to calculate total_rows across all tables first for this approach to work.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In applications/minotari_utils/src/commands/dbstats.rs around lines 604 to 615,
the table size estimation logic is incorrect because it calculates size_per_row
as file_size divided by page_count, which is not meaningful. To fix this, first
compute the total number of rows across all tables, then estimate the table size
by multiplying the file_size by the ratio of the current table's row_count to
the total_rows. Replace the existing size_per_row calculation with this
proportional approach to get a more accurate estimate.
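
The proportional estimate can be sketched as below. `estimate_table_sizes` is a hypothetical helper, and it uses integer arithmetic widened to `u128` instead of the `f64` expression in the suggestion, which is a swapped-in choice to avoid floating-point rounding on large files. The result is still only an approximation, since it assumes every row in the database occupies the same amount of space.

```rust
/// Estimates each table's share of the file by its share of total rows.
/// Returns zero for every table when the database has no rows at all.
pub fn estimate_table_sizes(file_size: u64, row_counts: &[(String, u64)]) -> Vec<(String, u64)> {
    let total_rows: u64 = row_counts.iter().map(|(_, rows)| *rows).sum();
    row_counts
        .iter()
        .map(|(name, rows)| {
            let estimate = if total_rows == 0 {
                0
            } else {
                // Widen to u128 so file_size * rows cannot overflow.
                (u128::from(file_size) * u128::from(*rows) / u128::from(total_rows)) as u64
            };
            (name.clone(), estimate)
        })
        .collect()
}
```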

Comment on lines +206 to +214
    #[allow(clippy::too_many_lines)]
    pub fn execute(self, cli: &Cli) -> Result<()> {
        let _config = AppConfig::from_cli(cli)?;

        // Default to ~/.tari/mainnet if no network dir specified
        let network_dir = self.network_dir.clone().unwrap_or_else(|| {
            let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
            PathBuf::from(home).join(".tari").join("mainnet")
        });

🛠️ Refactor suggestion

⚠️ Potential issue

Address cross-platform compatibility and unused code.

  1. The function is marked with #[allow(clippy::too_many_lines)] indicating it should be refactored into smaller functions.
  2. The _config is created but never used, confirming the AppConfig is unnecessary.
  3. The HOME environment variable is Unix-specific and won't work on Windows.

For cross-platform home directory detection, use the dirs crate:

-let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
+let home = dirs::home_dir()
+    .map(|p| p.to_string_lossy().to_string())
+    .unwrap_or_else(|| ".".to_string());

Also, remove the unused _config variable:

-let _config = AppConfig::from_cli(cli)?;
🤖 Prompt for AI Agents
In applications/minotari_utils/src/commands/dbstats.rs around lines 206 to 214,
remove the unused variable _config since AppConfig is not used in this function.
Replace the Unix-specific HOME environment variable usage with a cross-platform
home directory retrieval using the dirs crate's home_dir function. Additionally,
consider refactoring the execute function into smaller functions to address the
clippy warning about too many lines.

@SWvheerden SWvheerden merged commit 1ffeef7 into tari-project:development Jul 14, 2025
11 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Sep 27, 2025
4 tasks


3 participants