Commit cad1af5

Drive indexing: Add DB auto vacuum
1 parent 2c588bf commit cad1af5

4 files changed

Lines changed: 21 additions & 6 deletions

AGENTS.md

Lines changed: 6 additions & 0 deletions
@@ -63,6 +63,12 @@ Run the smallest set of checks possible for efficiency while maintaining confide
 
 ## Debugging
 
+- **Data directories (IMPORTANT — dev and prod are separate!)**:
+  - **Prod**: `~/Library/Application Support/com.veszelovszki.cmdr/`
+  - **Dev**: `~/Library/Application Support/com.veszelovszki.cmdr-dev/`
+  - This is set by `resolved_app_data_dir()` in `src-tauri/src/config.rs` (appends `-dev` in debug builds).
+    Settings, index DBs, font metrics, AI models, and license data all live here. When debugging data issues,
+    always check the right directory for the mode you're running. Deleting the wrong one is a real risk.
 - **Unified logging**: Frontend and backend logs appear together in the terminal and in a shared log file at
   `~/Library/Logs/com.veszelovszki.cmdr/`. The log file is also accessible from Settings > Logging > "Open log file".
 - **Svelte/TypeScript**: Use LogTape via `getAppLogger('feature')` from `$lib/logging/logger`. Levels: debug, info, warn, error.
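
The dev/prod split described in the added AGENTS.md lines can be sketched as follows. This is a minimal illustration and not the actual `resolved_app_data_dir()` from `config.rs`, whose body is not shown in this commit. The helper names `app_data_dir_name` and `index_db_path`, the example base path, and the `"root"` volume ID are all hypothetical. Only the `-dev` suffix in debug builds and the `index-{volume_id}.db` file name come from the diff.

```rust
use std::path::{Path, PathBuf};

/// App identifier as it appears in the paths in this commit.
const APP_ID: &str = "com.veszelovszki.cmdr";

/// Hypothetical helper: directory name for the app data dir.
/// Appends `-dev` when `is_debug` is true, mirroring the documented behavior.
fn app_data_dir_name(is_debug: bool) -> String {
    if is_debug {
        format!("{APP_ID}-dev")
    } else {
        APP_ID.to_string()
    }
}

/// Hypothetical helper: per-volume index DB path under an application-support base dir.
fn index_db_path(app_support: &Path, is_debug: bool, volume_id: &str) -> PathBuf {
    app_support
        .join(app_data_dir_name(is_debug))
        .join(format!("index-{volume_id}.db"))
}

fn main() {
    // `cfg!(debug_assertions)` is true in debug builds and false in release builds.
    let is_debug = cfg!(debug_assertions);
    // Example base path and volume ID, chosen only for illustration.
    let base = Path::new("/Users/me/Library/Application Support");
    println!("{}", index_db_path(base, is_debug, "root").display());
}
```

Passing the debug flag in as a parameter, rather than reading `cfg!` inside the helper, keeps both branches testable from a single build.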

apps/desktop/src-tauri/src/indexing/CLAUDE.md

Lines changed: 8 additions & 4 deletions
@@ -66,18 +66,22 @@ All writes go through a dedicated `std::thread` via a bounded `sync_channel` (20
 
 Reads happen on separate WAL connections (any thread). The global read-only store (`GLOBAL_INDEX_STORE`) provides enrichment without passing `AppHandle` through the listing pipeline.
 
-### SQLite schema (v3: integer-keyed)
+### SQLite schema (v4: integer-keyed, incremental vacuum)
 
-One DB per volume: `~/Library/Application Support/com.veszelovszki.cmdr/index-{volume_id}.db`
+One DB per volume. **Dev and prod use separate directories** (see AGENTS.md § Debugging):
+- **Prod**: `~/Library/Application Support/com.veszelovszki.cmdr/index-{volume_id}.db`
+- **Dev**: `~/Library/Application Support/com.veszelovszki.cmdr-dev/index-{volume_id}.db`
 
 Three tables:
 - `entries` (id INTEGER PK, parent_id, name COLLATE platform_case, is_directory, is_symlink, size, modified_at) with unique index `idx_parent_name(parent_id, name)`. Root sentinel: id=1, parent_id=0, name="".
 - `dir_stats` (entry_id INTEGER PK, recursive_size, recursive_file_count, recursive_dir_count)
 - `meta` (key TEXT PK, value TEXT) WITHOUT ROWID
 
-WAL mode, 16 MB page cache. Custom `platform_case` collation registered on every connection: case-insensitive + NFD normalization on macOS, binary on Linux. **Opening the DB with the sqlite3 CLI will fail** on queries touching the name column (the collation isn't registered).
+WAL mode, 16 MB page cache, `auto_vacuum = INCREMENTAL` (free pages reclaimed via `PRAGMA incremental_vacuum` after truncation). Custom `platform_case` collation registered on every connection: case-insensitive + NFD normalization on macOS, binary on Linux. **Opening the DB with the sqlite3 CLI will fail** on queries touching the name column (the collation isn't registered).
 
-**Schema v3**: Bumped from v2 to force DB rebuild after fixing orphan entry bug. Scanner, writer, aggregator, reconciler, enrichment, and IPC commands all fully migrated to integer keys. `IndexManager` owns a `PathResolver` for LRU-cached path→ID resolution in IPC commands (`get_dir_stats`, `get_dir_stats_batch`). Enrichment uses integer-keyed fast path: resolve parent once → batch child dir stats by ID. Reconciler sends integer-keyed messages exclusively. Old path-keyed `WriteMessage` variants and backward-compat shims (`ScannedEntry`, `DirStats`) still exist for post-replay verification — cleanup in milestone 6.
+History of changes:
+- **Schema v3**: Bumped from v2 to force DB rebuild after fixing orphan entry bug. Scanner, writer, aggregator, reconciler, enrichment, and IPC commands all fully migrated to integer keys. `IndexManager` owns a `PathResolver` for LRU-cached path→ID resolution in IPC commands (`get_dir_stats`, `get_dir_stats_batch`). Enrichment uses integer-keyed fast path: resolve parent once → batch child dir stats by ID. Reconciler sends integer-keyed messages exclusively. Old path-keyed `WriteMessage` variants and backward-compat shims (`ScannedEntry`, `DirStats`) still exist for post-replay verification — cleanup in milestone 6.
+- **Schema v4**: Bumped from v3 to enable `auto_vacuum = INCREMENTAL` (requires DB rebuild since the pragma must be set before table creation).
 
 ## How to test
 
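
The v3 to v4 bump in the schema history relies on a version mismatch forcing a rebuild, so that `auto_vacuum = INCREMENTAL` is applied to a fresh file before any table exists. A rough sketch of that decision, under stated assumptions, is below. The `OpenAction` enum and `open_action` function are hypothetical names, and the assumption that the stored version is read from the `meta` table is not confirmed by this diff. Only the `SCHEMA_VERSION` constant and the "mismatch means rebuild" behavior come from the commit.

```rust
/// Mirrors the constant in `store.rs` after this commit.
const SCHEMA_VERSION: &str = "4";

/// Hypothetical result of inspecting an existing index DB.
#[derive(Debug, PartialEq)]
enum OpenAction {
    /// Stored version matches: keep the existing file.
    Reuse,
    /// Missing or different version: drop the file and rescan.
    Rebuild,
}

/// Hypothetical decision function. `stored_version` is assumed to be the value
/// read from the DB (for example from the `meta` table), or `None` if absent.
fn open_action(stored_version: Option<&str>) -> OpenAction {
    match stored_version {
        Some(v) if v == SCHEMA_VERSION => OpenAction::Reuse,
        _ => OpenAction::Rebuild,
    }
}

fn main() {
    // A DB written by the previous schema ("3") is rebuilt under v4.
    println!("{:?}", open_action(Some("3")));
    // A DB already at the current version is reused.
    println!("{:?}", open_action(Some("4")));
}
```

Treating a missing version the same as a mismatch means a partially initialized file also gets rebuilt, rather than carrying on without the pragma.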

apps/desktop/src-tauri/src/indexing/store.rs

Lines changed: 3 additions & 2 deletions
@@ -19,7 +19,7 @@
 use rusqlite::{Connection, OptionalExtension, params};
 use std::path::{Path, PathBuf};
 
-const SCHEMA_VERSION: &str = "3";
+const SCHEMA_VERSION: &str = "4";
 
 /// Root entry sentinel ID. All top-level entries have `parent_id = ROOT_ID`.
 pub const ROOT_ID: i64 = 1;
@@ -280,7 +280,8 @@ fn ensure_root_sentinel(conn: &Connection) -> Result<(), IndexStoreError> {
 /// Apply WAL-mode pragmas for performance.
 fn apply_pragmas(conn: &Connection) -> Result<(), IndexStoreError> {
     conn.execute_batch(
-        "PRAGMA journal_mode = WAL;
+        "PRAGMA auto_vacuum = INCREMENTAL;
+         PRAGMA journal_mode = WAL;
          PRAGMA synchronous = NORMAL;
          PRAGMA cache_size = -16384;",
     )?;

apps/desktop/src-tauri/src/indexing/writer.rs

Lines changed: 4 additions & 0 deletions
@@ -415,6 +415,10 @@ fn process_message(conn: &rusqlite::Connection, msg: WriteMessage, stats: &Write
                     "Writer: truncated entries + dir_stats ({}ms)",
                     t.elapsed().as_millis(),
                 );
+                // Reclaim free pages from the truncation
+                if let Err(e) = conn.execute_batch("PRAGMA incremental_vacuum;") {
+                    log::warn!("Writer: incremental_vacuum after truncate failed: {e}");
+                }
             }
             Err(e) => log::warn!("Writer: truncate failed: {e}"),
         }
