Indexing: reserve the UNIQUE-conflict WARN for the case that's actually actionable

vdavid · vdavid · commit ba5a538ca4a8 · 2026-06-12T19:50:33.000+02:00
The index scan logged a WARN for every batch that skipped a row on the `(parent_id, name_folded)` UNIQUE constraint, even when the batch skipped only one row. But a few skips per scan are expected dedup (one dir reachable by two walk paths via a firmlink/symlink, or case/NFD sibling pairs on case-sensitive or cross-OS-synced trees) and nothing to act on. That turned the WARN into noise and trained the eye to ignore it, defeating the point of WARN.

Recalibrate so WARN means "do something":

- Per-batch skips drop to DEBUG (keeping the 3-row sample for diagnosis under `RUST_LOG=cmdr_lib::indexing::writer=debug`).
- `AccumulatorMaps` gains an `entries_skipped` tally; `handle_compute_all_aggregates` summarizes it once per scan via the new pure `classify_skip_severity`: silent when nothing skipped, DEBUG for sparse dedup, and WARN only when the skip ratio looks like two writers racing on one DB (≥50 skips AND >1% of the scan's rows). That racing case is the constraint's whole reason for being (a 1.83 TB ghost size was traced to it), and it's the one a reader should investigate.

The gate is a scan-wide ratio rather than a per-batch absolute count because the racing signature is a large *fraction* of rows skipped, sustained across the scan, while a giant directory of genuine collisions could falsely trip an absolute per-batch threshold. The absolute floor keeps a tiny tree with a couple of sibling collisions from warning.
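
To make that trade-off concrete, here is a minimal sketch comparing the two rules. `scan_wide_verdict` restates the thresholds from this commit under a hypothetical name, and `per_batch_absolute_warns` is a hypothetical alternative that does not exist in the codebase. The row counts are illustrative, not measured.

```rust
/// Scan-wide outcome, mirroring the three cases described above.
#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Silent,
    Debug,
    Warn,
}

/// The scan-wide rule: WARN only at >= 50 skips AND > 1% of the scan's rows.
/// (Hypothetical name; same thresholds as `classify_skip_severity`.)
fn scan_wide_verdict(inserted: u64, skipped: u64) -> Verdict {
    if skipped == 0 {
        return Verdict::Silent;
    }
    let ratio = skipped as f64 / (inserted + skipped) as f64;
    if skipped >= 50 && ratio > 0.01 {
        Verdict::Warn
    } else {
        Verdict::Debug
    }
}

/// Hypothetical alternative, NOT in the codebase: warn as soon as any single
/// batch skips at least `threshold` rows.
fn per_batch_absolute_warns(batch_skips: &[u64], threshold: u64) -> bool {
    batch_skips.iter().any(|&skips| skips >= threshold)
}
```

Under these assumptions, a 5M-row scan with one giant directory that skips 60 rows in a single batch trips the per-batch rule (`per_batch_absolute_warns(&[0, 60, 0], 50)` is `true`) yet stays at DEBUG scan-wide (`scan_wide_verdict(4_999_940, 60)`). Conversely, a racing loser that skips 40 rows in each of 1,000 batches never trips the per-batch rule but is caught by the ratio: `scan_wide_verdict(1_960_000, 40_000)` is 2% of the scan and yields `Warn`.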

Normal scans now log nothing here.
diff --git a/apps/desktop/src-tauri/src/indexing/DETAILS.md b/apps/desktop/src-tauri/src/indexing/DETAILS.md
@@ -22,7 +22,7 @@ The key UX win: showing directory sizes in file listings. Design history is in g
 - **store.rs** -- SQLite schema (integer-keyed entries with `name_folded` column on all platforms, `inode` column for hardlink dedup, `dir_stats` by entry_id, `meta`), `platform_case` collation, read queries, DB open/migrate. `resolve_component` always queries by `(parent_id, name_folded)` using the `idx_parent_name_folded` composite **UNIQUE** index. On Linux/Windows, `normalize_for_comparison()` is the identity function, so `name_folded = name` and the index behaves identically to a `(parent_id, name)` index. Schema version check: mismatch triggers drop+rebuild. `has_sized_entry_for_inode()` checks if another entry with the same inode already has non-NULL sizes; `find_entry_by_inode()` returns the first row with a given inode (used by the live event loop's rename pre-pass). Both path-keyed (backward compat) and integer-keyed APIs.
 - **metadata.rs** -- `MetadataSnapshot` struct and `extract_metadata()` function. Single location for all platform-specific metadata extraction (logical/physical size, mtime, inode, nlink). Used by scanner, reconciler, verifier, and event_loop. Symlinks get `None` everywhere. Files get sizes + inode + nlink. Directories get inode but no sizes/nlink. The inode is what the live event loop's rename pre-pass matches against to detect dir renames in place.
 - **memory_watchdog.rs** -- Background task monitoring resident memory via `mach_task_info` (macOS). Warns at 8 GB, stops indexing at 16 GB, emits `index-memory-warning` event to frontend. No-op stub on non-macOS. Started from `start_indexing()`.
-- **writer.rs** -- Single writer thread, owns the write connection, processes `WriteMessage` channel (bounded `sync_channel`, 20K capacity, backpressure via blocking). `WRITER_GENERATION: AtomicU64` (initialized to 1) bumped on every mutation (`InsertEntriesV2`, `UpsertEntryV2`, `MoveEntryV2`, `DeleteEntryById`, `DeleteSubtreeById`, `TruncateData`) for search index staleness detection. Priority: `UpdateDirStats` before `InsertEntries`. `Flush` variant + async `flush()` method let callers wait for all prior writes to commit. Has both integer-keyed variants (`InsertEntriesV2`, `UpsertEntryV2`, `MoveEntryV2`, `DeleteEntryById`, `DeleteSubtreeById`, `PropagateDeltaById`) and path-keyed backward-compat variants. `MoveEntryV2 { entry_id, new_parent_id, new_name }` updates an entry's `(parent_id, name, name_folded)` in place, preserving its `id` and (for directories) `dir_stats`. If a different entry already occupies the destination `(parent_id, name_folded)` (rename-with-overwrite, or a concurrent upsert racing ahead of the move message), the handler deletes the conflicting row first (subtree-aware, with delta propagation) so the move never fails the UNIQUE constraint; the on-disk truth after a rename is that the moved entry owns the destination name. Same-parent renames don't change ancestor totals; cross-parent moves subtract the entry's contribution from the old ancestor chain and add it to the new one (and recompute the OR-aggregated `recursive_has_symlinks` flag on both chains). The integer-keyed delete/subtree-delete handlers auto-propagate negative deltas via the `parent_id` chain (same pattern as the path-keyed variants). `propagate_delta_by_id` walks the parent chain using `get_parent_id` lookups. `UpsertEntryV2` auto-propagates deltas on both insert and update: on insert, propagates the full size (+file_count or +dir_count); on update, reads the old entry first and propagates only the size difference. This means callers never need a separate `PropagateDeltaById` for upserted entries. For new directories, also initializes a zero-valued `dir_stats` row so enrichment always has a row. Maintains `AccumulatorMaps` during `InsertEntriesV2` processing (two HashMaps: direct children stats and child dir relationships + an `entries_inserted` counter), cleared on `TruncateData`. On `ComputeAllAggregates`, passes accumulated maps to `aggregator::compute_all_aggregates_with_maps()` to skip expensive full-table-scan SQL queries. On `ComputePartialAggregates { hot_paths }` (mid-scan), `handle_compute_partial_aggregates` borrows the same maps **read-only** (no clear, no mutation, no generation bump), no-ops on empty maps with no SQL fallback, delegates the math to `aggregator::compute_partial_aggregates`, writes a depth-capped (`PARTIAL_AGG_MAX_DEPTH = 3`) subset of `dir_stats` rows, and emits `index-dir-updated { paths: ["/"] }` when an `AppHandle` is present. Accepts an optional `AppHandle` at spawn time to emit `index-aggregation-progress` events during aggregation (phase, current, total). `IndexWriter::try_send` is a non-blocking send (`Ok(true)` enqueued / `Ok(false)` channel full, dropped / `Err` writer gone) with `queue_depth()` accessor over the channel-depth atomic; the bump/undo accounting lives in the extracted `try_send_with_depth` free function (undoes the bump on both `Full` and `Disconnected` so the depth never drifts). Also emits `saving_entries` phase progress during `InsertEntriesV2` processing when the expected total is set via `set_expected_total_entries()` (an `Arc<AtomicU64>` shared between the writer thread and the `IndexWriter` handle). No index drop/recreate dance. The `idx_parent_name_folded` composite index uses binary collation and stays present during scans.
+- **writer.rs** -- Single writer thread, owns the write connection, processes `WriteMessage` channel (bounded `sync_channel`, 20K capacity, backpressure via blocking). `WRITER_GENERATION: AtomicU64` (initialized to 1) bumped on every mutation (`InsertEntriesV2`, `UpsertEntryV2`, `MoveEntryV2`, `DeleteEntryById`, `DeleteSubtreeById`, `TruncateData`) for search index staleness detection. Priority: `UpdateDirStats` before `InsertEntries`. `Flush` variant + async `flush()` method let callers wait for all prior writes to commit. Has both integer-keyed variants (`InsertEntriesV2`, `UpsertEntryV2`, `MoveEntryV2`, `DeleteEntryById`, `DeleteSubtreeById`, `PropagateDeltaById`) and path-keyed backward-compat variants. `MoveEntryV2 { entry_id, new_parent_id, new_name }` updates an entry's `(parent_id, name, name_folded)` in place, preserving its `id` and (for directories) `dir_stats`. If a different entry already occupies the destination `(parent_id, name_folded)` (rename-with-overwrite, or a concurrent upsert racing ahead of the move message), the handler deletes the conflicting row first (subtree-aware, with delta propagation) so the move never fails the UNIQUE constraint; the on-disk truth after a rename is that the moved entry owns the destination name. Same-parent renames don't change ancestor totals; cross-parent moves subtract the entry's contribution from the old ancestor chain and add it to the new one (and recompute the OR-aggregated `recursive_has_symlinks` flag on both chains). The integer-keyed delete/subtree-delete handlers auto-propagate negative deltas via the `parent_id` chain (same pattern as the path-keyed variants). `propagate_delta_by_id` walks the parent chain using `get_parent_id` lookups. `UpsertEntryV2` auto-propagates deltas on both insert and update: on insert, propagates the full size (+file_count or +dir_count); on update, reads the old entry first and propagates only the size difference. This means callers never need a separate `PropagateDeltaById` for upserted entries. For new directories, also initializes a zero-valued `dir_stats` row so enrichment always has a row. Maintains `AccumulatorMaps` during `InsertEntriesV2` processing (two HashMaps: direct children stats and child dir relationships + `entries_inserted` and `entries_skipped` counters), cleared on `TruncateData`. A per-batch `INSERT OR IGNORE` UNIQUE-conflict skip is logged at DEBUG only (with a 3-row sample) and tallied into `entries_skipped`; `handle_compute_all_aggregates` summarizes the scan-wide tally once via `classify_skip_severity` (none → silent, sparse dedup → DEBUG, racing-writer ratio (≥50 skips and >1% of rows) → WARN), so normal scans log nothing and only the actionable double-write case warns. On `ComputeAllAggregates`, passes accumulated maps to `aggregator::compute_all_aggregates_with_maps()` to skip expensive full-table-scan SQL queries. On `ComputePartialAggregates { hot_paths }` (mid-scan), `handle_compute_partial_aggregates` borrows the same maps **read-only** (no clear, no mutation, no generation bump), no-ops on empty maps with no SQL fallback, delegates the math to `aggregator::compute_partial_aggregates`, writes a depth-capped (`PARTIAL_AGG_MAX_DEPTH = 3`) subset of `dir_stats` rows, and emits `index-dir-updated { paths: ["/"] }` when an `AppHandle` is present. Accepts an optional `AppHandle` at spawn time to emit `index-aggregation-progress` events during aggregation (phase, current, total). `IndexWriter::try_send` is a non-blocking send (`Ok(true)` enqueued / `Ok(false)` channel full, dropped / `Err` writer gone) with `queue_depth()` accessor over the channel-depth atomic; the bump/undo accounting lives in the extracted `try_send_with_depth` free function (undoes the bump on both `Full` and `Disconnected` so the depth never drifts). Also emits `saving_entries` phase progress during `InsertEntriesV2` processing when the expected total is set via `set_expected_total_entries()` (an `Arc<AtomicU64>` shared between the writer thread and the `IndexWriter` handle). No index drop/recreate dance. The `idx_parent_name_folded` composite index uses binary collation and stays present during scans.
 - **scanner.rs** -- jwalk-based parallel directory walker. `scan_volume()` for full scan, `scan_subtree()` for targeted subtree rescans (used by post-replay background verification). Uses `ScanContext` (from store.rs) to assign integer IDs and parent IDs during the walk: maintains a `HashMap<PathBuf, i64>` mapping directory paths to assigned IDs, with IDs allocated from the shared `Arc<AtomicI64>` counter owned by `IndexWriter`. The scan root is mapped to `ROOT_ID` (1). Sends `InsertEntriesV2(Vec<EntryRow>)` batches to the writer. Platform-specific exclusion filters via `should_exclude` (`pub(super)`), the single exclusion gate for all code paths (scanner, reconciler, event_loop verification, per-navigation verifier). E2E scan restriction: when `CMDR_E2E_START_PATH` is set, `should_exclude` restricts scanning to only the fixture path, its children, and ancestors. Everything else is excluded (critical for Docker E2E performance). `default_exclusions()` is `#[cfg(test)]` only. Physical sizes (`st_blocks * 512`). Hardlink inode dedup: files with `nlink > 1` are tracked in a `HashSet<u64>` by inode; only the first link's size is counted, subsequent links get `size = None`. Files with `nlink == 1` (vast majority) skip the set entirely. All files store `inode` in `EntryRow.inode` (from `MetadataExt::ino()` on Unix, `None` on non-Unix). Directories and symlinks get `inode: None`.
 - **aggregator.rs** -- Dir stats computation. Bottom-up after full scan (O(N) single pass), per-subtree after subtree rescans, incremental delta propagation up ancestor chain for watcher events. Two entry points for full aggregation: `compute_all_aggregates_reported` (loads maps from SQL) and `compute_all_aggregates_with_maps` (accepts pre-built maps from the writer). Both accept an `on_progress: &mut dyn FnMut(AggregationProgress)` callback and delegate to `compute_and_write()` for the shared topological sort + bottom-up computation + batch write. Progress is reported at phase transitions and every ~1% during compute/write loops. `AggregationPhase` enum: `SavingEntries` (flushing writer channel), `LoadingDirectories`, `Sorting`, `Computing`, `Writing`. The composite indexes use binary collation so there's no per-scan index rebuild phase. `compute_partial_aggregates` is the mid-scan variant: it derives the dir list and parent relations from the borrowed accumulator maps (no SQL `load_all_directory_ids` scan), computes each dir's depth from the scan root via a memoized walk (`depth(ROOT_ID) = 0` is the explicit base case; unreachable dirs get `usize::MAX` so the depth cap never writes them), reuses the same `topological_sort_bottom_up` + `compute_bottom_up` over **all** dirs, and writes only dirs at `depth ≤ max_depth` plus each resolvable hot-path dir and its direct children. `backfill_missing_dir_stats` is a catch-up pass that finds directories without `dir_stats` rows and computes their stats bottom-up; triggered after reconciler replay and cold-start replay via `BackfillMissingDirStats` writer message.
 - **watcher.rs** -- Drive-level filesystem watcher. macOS: FSEvents via `cmdr-fsevent-stream` with event IDs and `sinceWhen` replay. Linux: `notify` crate (inotify backend) with recursive watching and synthetic event counter. Other platforms: stub. `supports_event_replay()` lets callers branch on whether journal replay is available.
diff --git a/apps/desktop/src-tauri/src/indexing/writer.rs b/apps/desktop/src-tauri/src/indexing/writer.rs
@@ -550,6 +550,10 @@ struct AccumulatorMaps {
     child_dirs: HashMap<i64, Vec<i64>>,
     /// Running count of entries inserted so far (for flushing progress).
     entries_inserted: u64,
+    /// Running count of rows the scan skipped on a UNIQUE `(parent_id,
+    /// name_folded)` conflict (`INSERT OR IGNORE`). Summarized once per scan at
+    /// `ComputeAllAggregates`; see `classify_skip_severity`.
+    entries_skipped: u64,
 }
 
 impl AccumulatorMaps {
@@ -558,6 +562,7 @@ impl AccumulatorMaps {
             direct_stats: HashMap::new(),
             child_dirs: HashMap::new(),
             entries_inserted: 0,
+            entries_skipped: 0,
         }
     }
 
@@ -586,6 +591,42 @@ impl AccumulatorMaps {
         self.direct_stats.clear();
         self.child_dirs.clear();
         self.entries_inserted = 0;
+        self.entries_skipped = 0;
+    }
+}
+
+/// Log severity for the count of rows a full scan skipped on a UNIQUE
+/// `(parent_id, name_folded)` conflict (the `INSERT OR IGNORE` path).
+#[derive(Debug, PartialEq, Eq)]
+enum SkipSeverity {
+    /// Nothing skipped: log nothing.
+    None,
+    /// Sparse skips, expected dedup (one dir reachable by two walk paths via a
+    /// firmlink/symlink, or case/NFD sibling pairs on case-sensitive or
+    /// cross-OS-synced trees). Not actionable: log at DEBUG.
+    Benign,
+    /// A large fraction of the scan skipped: the signature of two writer threads
+    /// racing on one DB (the constraint's reason for being, a 1.83 TB ghost size
+    /// was traced to exactly that). Actionable: log at WARN.
+    Suspicious,
+}
+
+/// Classify a full scan's accumulated UNIQUE-conflict skips. The absolute floor
+/// keeps a tiny tree with a couple genuine sibling collisions from tripping the
+/// warning; the ratio separates a handful of dedup hits in a multi-million-row
+/// scan from a racing writer (whose loser duplicates a large fraction of rows).
+fn classify_skip_severity(inserted: u64, skipped: u64) -> SkipSeverity {
+    const MIN_SUSPICIOUS_SKIPS: u64 = 50;
+    const SUSPICIOUS_SKIP_RATIO: f64 = 0.01;
+    if skipped == 0 {
+        return SkipSeverity::None;
+    }
+    let total = inserted + skipped;
+    let ratio = skipped as f64 / total as f64;
+    if skipped >= MIN_SUSPICIOUS_SKIPS && ratio > SUSPICIOUS_SKIP_RATIO {
+        SkipSeverity::Suspicious
+    } else {
+        SkipSeverity::Benign
     }
 }
 
@@ -945,12 +986,19 @@ fn handle_insert_entries_v2(
     // inflates `dir_stats` with phantom bytes (the constraint comment that
     // called out "1.83 TB ghost size on a 994 GB volume" is exactly this
     // failure mode).
+    //
+    // A per-batch skip is logged at DEBUG only (with a sample for diagnosis): a
+    // few skips per scan is expected dedup and not actionable. The accumulated
+    // count is summarized once per scan at `ComputeAllAggregates`, which escalates
+    // to WARN only when the skip ratio looks like a racing writer. See
+    // `classify_skip_severity`.
     match IndexStore::insert_entries_v2_batch(conn, &entries) {
         Ok(inserted) => {
             let skipped_count = inserted.iter().filter(|landed| !**landed).count();
             if skipped_count == 0 {
                 accumulator.accumulate(&entries);
             } else {
+                accumulator.entries_skipped += skipped_count as u64;
                 accumulator.accumulate(
                     entries
                         .iter()
@@ -969,7 +1017,7 @@ fn handle_insert_entries_v2(
                     })
                     .take(3)
                     .collect();
-                log::warn!(
+                log::debug!(
                     "Index writer: {skipped_count} of {batch_size} skipped due to UNIQUE conflict on (parent_id, name_folded); sample: {samples:?}",
                     batch_size = pluralize_with(count as u64, "entry", "entries")
                 );
@@ -1602,6 +1650,23 @@ fn handle_compute_all_aggregates(
             &mut on_progress,
         )
     };
+    // Summarize the scan's UNIQUE-conflict skips once, here, instead of WARNing
+    // per offending batch. Sparse skips are expected dedup; only a racing-writer
+    // ratio is worth a WARN. Read before `clear()`.
+    let inserted = accumulator.entries_inserted;
+    let skipped = accumulator.entries_skipped;
+    match classify_skip_severity(inserted, skipped) {
+        SkipSeverity::None => {}
+        SkipSeverity::Benign => log::debug!(
+            "Index scan: {skipped} of {total} entries skipped on UNIQUE conflict (expected dedup: firmlinks, case/NFD siblings)",
+            total = inserted + skipped,
+        ),
+        SkipSeverity::Suspicious => log::warn!(
+            "Index scan: {skipped} of {total} entries skipped on UNIQUE conflict ({pct:.1}%); a high ratio can mean two writers raced on one DB",
+            total = inserted + skipped,
+            pct = skipped as f64 / (inserted + skipped) as f64 * 100.0,
+        ),
+    }
     // Maps are consumed; clear to free memory.
     // Reset expected_total so subtree-scan inserts don't emit
     // spurious saving_entries progress events after the full scan.
@@ -1927,6 +1992,42 @@ mod tests {
         IndexStore::open(db_path).expect("failed to open read store")
     }
 
+    // ── Skip-severity classification ─────────────────────────────────
+
+    #[test]
+    fn skip_severity_none_when_nothing_skipped() {
+        assert_eq!(classify_skip_severity(5_000_000, 0), SkipSeverity::None);
+    }
+
+    #[test]
+    fn skip_severity_benign_for_sparse_dedup() {
+        // A handful of firmlink double-visits / case-NFD siblings in a big scan: expected, not actionable.
+        assert_eq!(classify_skip_severity(5_000_000, 3), SkipSeverity::Benign);
+        assert_eq!(classify_skip_severity(5_000_000, 49), SkipSeverity::Benign);
+    }
+
+    #[test]
+    fn skip_severity_benign_when_below_absolute_floor_even_at_high_ratio() {
+        // Tiny tree with a couple genuine sibling collisions: high ratio but few skips, stay quiet.
+        assert_eq!(classify_skip_severity(20, 10), SkipSeverity::Benign);
+    }
+
+    #[test]
+    fn skip_severity_suspicious_for_racing_writer_signature() {
+        // Two writers racing on one DB: the loser's inserts all conflict, so a large fraction skips.
+        assert_eq!(classify_skip_severity(5_000_000, 5_000_000), SkipSeverity::Suspicious);
+        // Just over both gates: 100 skips and >1% of the scan (100 / 9100 ≈ 1.1%).
+        assert_eq!(classify_skip_severity(9_000, 100), SkipSeverity::Suspicious);
+        // Exactly 1% does not trip it (the ratio gate is strict `>`): 100 / 10000.
+        assert_eq!(classify_skip_severity(9_900, 100), SkipSeverity::Benign);
+    }
+
+    #[test]
+    fn skip_severity_benign_when_over_floor_but_under_ratio() {
+        // 50 skips clears the floor but is a vanishing fraction of a 5M scan: still benign.
+        assert_eq!(classify_skip_severity(5_000_000, 50), SkipSeverity::Benign);
+    }
+
     // ── Basic lifecycle tests ────────────────────────────────────────
 
     #[test]