Skip to content

Commit eb69228

Browse files
committed
Bugfix: don't roll back the whole scan batch on one duplicate name
- `insert_entries_v2_batch` used plain `INSERT` inside a savepoint, so a single `(parent_id, name_folded)` collision (case-sensitive volumes with `Foo.txt`/`foo.txt` siblings, NFC/NFD twins from cross-OS sync, etc.) rolled back the entire 2000-entry batch — silently dropping 1999 unrelated rows plus orphaning every descendant whose `parent_id` was allocated from that batch. - Switch to `INSERT OR IGNORE` and return a `Vec<bool>` parallel to the input so the caller can tell which rows actually landed. - `handle_insert_entries_v2` filters the accumulator input by that vec, so the in-memory aggregation state still satisfies "never claim more than the DB has" (the constraint that protects against the historical 1.83 TB ghost size on `..`). - Log conflicts at WARN with a sample of `(parent_id, name)` pairs so the offending filesystem is diagnosable. - `accumulate` now takes `impl IntoIterator<Item = &EntryRow>` so the filtered-iteration path doesn't need an extra clone. - Update the existing accumulator-consistency test to assert the new per-row behavior; rename it to `handle_insert_entries_v2_only_accumulates_rows_that_landed`. Replace the old "duplicate rejected" store test with one that verifies graceful skip + survivors land. - Update `indexing/CLAUDE.md` gotchas to reflect the new contract.
1 parent 9e80891 commit eb69228

3 files changed

Lines changed: 122 additions & 55 deletions

File tree

apps/desktop/src-tauri/src/indexing/CLAUDE.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -172,7 +172,7 @@ function is shared so the two gate sites can't drift apart.
172172

173173
**Lock-first `start_indexing`, atomic phase reservation**: `start_indexing` opens a temporary `IndexStore` and atomically claims the `Disabled -> Initializing(store)` transition via `try_reserve_initializing_phase()` BEFORE constructing the heavy `IndexManager` (which spawns the writer thread). If the phase is already `Initializing`, `Running`, or `ShuttingDown`, the call is a no-op. Without the lock-first claim, two near-simultaneous calls can both spawn writer threads — each with its own `Arc<AtomicI64>` ID counter and `AccumulatorMaps` — racing on the same DB. The lock-first guard makes second-call no-op behaviour the contract; the `UNIQUE (parent_id, name_folded)` index is the safety net if the contract is ever bypassed.
174174

175-
**`accumulator.accumulate` runs only after the DB commit succeeds**: `handle_insert_entries_v2` inserts via `IndexStore::insert_entries_v2_batch` first, and only updates `AccumulatorMaps` on `Ok`. The batch is wrapped in a SQLite savepoint that rolls back on any conflict (including UNIQUE on `(parent_id, name_folded)`); if the accumulator runs before the commit, a failed batch leaves it claiming bytes that never reached the DB, and `compute_all_aggregates_with_maps` then writes inflated `dir_stats`. The contract — "in-memory state never claims more than the DB has" — must hold under any future failure mode, even though the lock-first guard makes the failure path rare in practice.
175+
**`accumulator.accumulate` runs only for rows that actually landed**: `handle_insert_entries_v2` inserts via `IndexStore::insert_entries_v2_batch`, which uses `INSERT OR IGNORE` and returns a `Vec<bool>` parallel to the input. A duplicate `(parent_id, name_folded)` (case-sensitive volumes with `Foo.txt` + `foo.txt`, NFC/NFD duplicates from cross-OS sync, etc.) skips just that row and the rest of the batch survives; without this, a single collision used to roll back the whole ~2000-entry batch via the savepoint, silently dropping huge swaths of the index. `handle_insert_entries_v2` filters `entries` by the returned flags before calling `accumulator.accumulate`, so the in-memory aggregation state never claims bytes that lost the OR-IGNORE. The contract — "in-memory state never claims more than the DB has" — must hold under any future failure mode (this was one of the mechanisms behind the historical 1.83 TB ghost size on `..` of a 994 GB volume; the other was two writers racing, closed by the lock-first guard above). Conflicts are logged at WARN with sample names so the offending filesystem is diagnosable.
176176

177177
**Progressive `index-dir-updated` emit during background verification**: `run_background_verification` emits one `index-dir-updated` event per successfully-scanned new subtree, immediately after the post-scan writer flush. Don't buffer new-dir paths and fire a single end-of-verification emit instead: that window runs up to 5 minutes for a typical home folder, and any listing opened in it stays on `<dir>` placeholders the whole time (listing-time enrichment runs before the relevant `dir_stats` rows exist; the single emit often misses the right paths because it carries `affected_paths` from the replay rather than the verification-discovered paths; and the `/` full-refresh sentinel case is filtered out by the FE's strict-descendant check). The per-subtree emit gives a progressive reveal: each folder's recursive size pops in as soon as its subtree aggregation commits. The FE handler in `index-events.ts` is throttled at 2 s per pane, so a burst of subtree completions naturally coalesces into one refresh per pane per 2 s window. The end-of-verification emit carries just `affected_paths`; `new_dir_paths` are emitted progressively.
178178

@@ -271,7 +271,7 @@ per-event detail back: `RUST_LOG=cmdr_lib::indexing::reconciler=trace,cmdr_lib::
271271

272272
**Reconciler must delete old subtree on dir-to-file type changes**: When `reconcile_subtree` matches a filesystem entry to a DB entry by name, it must check if `is_directory` changed. If a directory became a file, `DeleteSubtreeById` must be sent before `UpsertEntryV2`. Without this, `INSERT OR REPLACE` keeps the same row ID (same `parent_id + name`), and the old directory's children become logical orphans: entries parented by a file.
273273

274-
**Scanner's `insert_entries_v2_batch` uses plain `INSERT`**: `INSERT OR REPLACE` would silently delete the old row and insert a new one with a new ID, orphaning all children. The table has two unique constraints: PK on `id`, and (since v12) the composite UNIQUE `(parent_id, name_folded)`. IDs are allocated from a shared `Arc<AtomicI64>` counter owned by `IndexWriter`, and the table is truncated before full scans (or descendants deleted before subtree scans), so under a single live writer neither conflict can occur. The plain `INSERT` reflects the invariant. If a conflict ever does fire (e.g. someone defeats the lock-first guard), the batch savepoint rolls back and `handle_insert_entries_v2` skips the accumulator update, so the in-memory aggregation state stays consistent with the DB.
274+
**Scanner's `insert_entries_v2_batch` uses `INSERT OR IGNORE`**: `INSERT OR REPLACE` would silently delete the old row and insert a new one with a new ID, orphaning all children — never the right move. Plain `INSERT` would roll back the entire 2000-entry batch on any conflict via the wrapping savepoint, which is catastrophic if a single filesystem oddity (two siblings folding to the same `name_folded` on a case-sensitive volume, NFC/NFD twins from cross-OS sync) takes out 1999 unrelated rows. `INSERT OR IGNORE` skips just the conflicting row; the batch returns `Vec<bool>` so `handle_insert_entries_v2` can filter the accumulator input. The table has two unique constraints: PK on `id`, and (since v12) the composite UNIQUE `(parent_id, name_folded)`. IDs are allocated from a shared `Arc<AtomicI64>` counter owned by `IndexWriter`, and the table is truncated before full scans (or descendants deleted before subtree scans), so under a single live writer PK conflicts shouldn't occur in practice. The savepoint still wraps the batch so that non-constraint errors (disk full, etc.) roll back cleanly.
275275

276276
**IndexWriter owns the shared `next_id` counter**: All ID allocation goes through an `Arc<AtomicI64>` owned by `IndexWriter`. The scanner's `ScanContext` atomically increments it via `alloc_id()`, and the writer thread bumps it via `fetch_max` after `UpsertEntryV2` inserts (which let SQLite auto-assign). `TruncateData` resets it to 2. Don't fall back to reading `MAX(id)` from a read connection for allocation: the writer can have uncommitted inserts in its channel, so the read sees a stale value and the scanner double-assigns IDs. `IndexWriter` exposes `db_path()`. The scanner opens a temporary connection for `resolve_path` (subtree scans) and `ensure_root_sentinel` (volume scans), but not for ID allocation.
277277

apps/desktop/src-tauri/src/indexing/store.rs

Lines changed: 48 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -868,21 +868,33 @@ impl IndexStore {
868868
///
869869
/// Uses a savepoint instead of `unchecked_transaction()` so it nests correctly
870870
/// inside explicit transactions (replay's `BEGIN IMMEDIATE`).
871-
pub fn insert_entries_v2_batch(conn: &Connection, entries: &[EntryRow]) -> Result<(), IndexStoreError> {
871+
///
872+
/// Uses `INSERT OR IGNORE` so a single `(parent_id, name_folded)` collision
873+
/// (case-sensitive filesystems with `Foo.txt`/`foo.txt` siblings, NFC/NFD
874+
/// duplicates from cross-OS sync, etc.) skips just that row rather than
875+
/// rolling back the whole batch of ~2000 entries. Returns a `Vec<bool>`
876+
/// parallel to `entries` where each element indicates whether the
877+
/// corresponding row actually landed in the DB. Callers (the writer
878+
/// thread's accumulator) must consult this so the in-memory aggregation
879+
/// state never claims more than the DB actually has.
880+
pub fn insert_entries_v2_batch(conn: &Connection, entries: &[EntryRow]) -> Result<Vec<bool>, IndexStoreError> {
872881
if entries.is_empty() {
873-
return Ok(());
882+
return Ok(Vec::new());
874883
}
875884
with_savepoint(conn, "insert_entries", |conn| {
876-
// Plain INSERT: IDs are allocated from a shared AtomicI64 counter owned by
877-
// IndexWriter, so conflicts shouldn't occur. The table is truncated before
878-
// full scans and descendants are deleted before subtree scans.
885+
// INSERT OR IGNORE: the table is truncated before full scans and
886+
// descendants are deleted before subtree scans, so collisions
887+
// against existing rows are rare, but two siblings with the same
888+
// `name_folded` can show up on case-sensitive volumes / sync
889+
// sources. Skip the duplicate, keep the rest.
879890
let mut stmt = conn.prepare_cached(
880-
"INSERT INTO entries (id, parent_id, name, name_folded, is_directory, is_symlink, logical_size, physical_size, modified_at, inode)
891+
"INSERT OR IGNORE INTO entries (id, parent_id, name, name_folded, is_directory, is_symlink, logical_size, physical_size, modified_at, inode)
881892
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10)",
882893
)?;
894+
let mut inserted = Vec::with_capacity(entries.len());
883895
for e in entries {
884896
let name_folded = normalize_for_comparison(&e.name);
885-
stmt.execute(params![
897+
let rows = stmt.execute(params![
886898
e.id,
887899
e.parent_id,
888900
e.name,
@@ -894,8 +906,9 @@ impl IndexStore {
894906
e.modified_at,
895907
e.inode,
896908
])?;
909+
inserted.push(rows == 1);
897910
}
898-
Ok(())
911+
Ok(inserted)
899912
})
900913
}
901914

@@ -2049,8 +2062,15 @@ mod tests {
20492062
);
20502063
}
20512064

2065+
/// Batch insert uses `INSERT OR IGNORE`: a duplicate `(parent_id, name_folded)`
2066+
/// in the batch (or against an existing row) skips just that row, keeping
2067+
/// every other entry in the batch. The returned `Vec<bool>` flags which
2068+
/// rows actually landed. This replaces the previous roll-back-the-whole-batch
2069+
/// behavior, which silently dropped ~2000 unrelated entries every time a
2070+
/// scan encountered two siblings with colliding `name_folded` (case-sensitive
2071+
/// volumes, NFC/NFD duplicates, etc.).
20522072
#[test]
2053-
fn duplicate_parent_name_folded_rejected_batch_insert() {
2073+
fn duplicate_parent_name_folded_skipped_in_batch_insert() {
20542074
let (_store, dir) = open_temp_store();
20552075
let db_path = dir.path().join("test-index.db");
20562076
let conn = IndexStore::open_write_connection(&db_path).unwrap();
@@ -2073,21 +2093,31 @@ mod tests {
20732093
name: "dup.txt".into(),
20742094
is_directory: false,
20752095
is_symlink: false,
2076-
logical_size: Some(10),
2077-
physical_size: Some(10),
2096+
logical_size: Some(20),
2097+
physical_size: Some(20),
2098+
modified_at: None,
2099+
inode: None,
2100+
},
2101+
EntryRow {
2102+
id: 102,
2103+
parent_id: ROOT_ID,
2104+
name: "unrelated.txt".into(),
2105+
is_directory: false,
2106+
is_symlink: false,
2107+
logical_size: Some(30),
2108+
physical_size: Some(30),
20782109
modified_at: None,
20792110
inode: None,
20802111
},
20812112
];
2082-
let result = IndexStore::insert_entries_v2_batch(&conn, &entries);
2083-
assert!(
2084-
result.is_err(),
2085-
"batch with duplicate (parent_id, name_folded) must fail; got {result:?}"
2086-
);
2113+
let inserted = IndexStore::insert_entries_v2_batch(&conn, &entries).unwrap();
2114+
assert_eq!(inserted, vec![true, false, true]);
20872115

2088-
// Savepoint rolls back the whole batch, so neither row should be present.
2089-
assert!(IndexStore::get_entry_by_id(&conn, 100).unwrap().is_none());
2116+
// First duplicate wins; the second is dropped; the unrelated entry survives.
2117+
// Without the per-row skip, the savepoint used to roll back ALL THREE.
2118+
assert!(IndexStore::get_entry_by_id(&conn, 100).unwrap().is_some());
20902119
assert!(IndexStore::get_entry_by_id(&conn, 101).unwrap().is_none());
2120+
assert!(IndexStore::get_entry_by_id(&conn, 102).unwrap().is_some());
20912121
}
20922122

20932123
#[cfg(target_os = "macos")]

apps/desktop/src-tauri/src/indexing/writer.rs

Lines changed: 72 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -468,10 +468,12 @@ impl AccumulatorMaps {
468468
}
469469
}
470470

471-
/// Accumulate stats from a batch of inserted entries.
472-
fn accumulate(&mut self, entries: &[EntryRow]) {
473-
self.entries_inserted += entries.len() as u64;
471+
/// Accumulate stats from a set of inserted entries. Accepts any iterator
472+
/// of `&EntryRow` so callers can pre-filter (skipping rows that lost a
473+
/// UNIQUE conflict during `INSERT OR IGNORE`) without an extra clone.
474+
fn accumulate<'a>(&mut self, entries: impl IntoIterator<Item = &'a EntryRow>) {
474475
for entry in entries {
476+
self.entries_inserted += 1;
475477
let stats = self.direct_stats.entry(entry.parent_id).or_insert((0, 0, 0, 0, false));
476478
if entry.is_symlink {
477479
stats.4 = true;
@@ -817,14 +819,45 @@ fn handle_insert_entries_v2(
817819
) {
818820
let count = entries.len();
819821
let t = Instant::now();
820-
// Accumulate AFTER the DB commit succeeds. The savepoint inside
821-
// `insert_entries_v2_batch` rolls the whole batch back on any conflict
822-
// (PK or UNIQUE on `(parent_id, name_folded)` — schema v12), so the
823-
// accumulator must not claim rows that never landed; otherwise
824-
// `compute_all_aggregates_with_maps` inflates `dir_stats` with phantom
825-
// bytes.
822+
// Accumulate AFTER the DB commit succeeds. `insert_entries_v2_batch`
823+
// uses `INSERT OR IGNORE`, so a UNIQUE conflict on
824+
// `(parent_id, name_folded)` (case-sensitive volumes with `Foo.txt` and
825+
// `foo.txt` siblings, NFC/NFD duplicates from cross-OS sync, etc.) skips
826+
// just that row instead of rolling back the entire 2000-entry batch. The
827+
// accumulator must skip those rows too, or `compute_all_aggregates_with_maps`
828+
// inflates `dir_stats` with phantom bytes (the constraint comment that
829+
// called out "1.83 TB ghost size on a 994 GB volume" is exactly this
830+
// failure mode).
826831
match IndexStore::insert_entries_v2_batch(conn, &entries) {
827-
Ok(()) => accumulator.accumulate(&entries),
832+
Ok(inserted) => {
833+
let skipped_count = inserted.iter().filter(|landed| !**landed).count();
834+
if skipped_count == 0 {
835+
accumulator.accumulate(&entries);
836+
} else {
837+
accumulator.accumulate(
838+
entries
839+
.iter()
840+
.zip(inserted.iter())
841+
.filter_map(|(e, landed)| if *landed { Some(e) } else { None }),
842+
);
843+
let samples: Vec<(i64, &str)> = entries
844+
.iter()
845+
.zip(inserted.iter())
846+
.filter_map(|(e, landed)| {
847+
if !*landed {
848+
Some((e.parent_id, e.name.as_str()))
849+
} else {
850+
None
851+
}
852+
})
853+
.take(3)
854+
.collect();
855+
log::warn!(
856+
"Index writer: {skipped_count} of {batch_size} skipped due to UNIQUE conflict on (parent_id, name_folded); sample: {samples:?}",
857+
batch_size = pluralize_with(count as u64, "entry", "entries")
858+
);
859+
}
860+
}
828861
Err(e) => crate::log_error!("Index writer: insert_entries_v2_batch failed: {e}"),
829862
}
830863
let elapsed = t.elapsed().as_millis();
@@ -1642,22 +1675,20 @@ mod tests {
16421675
writer.shutdown();
16431676
}
16441677

1645-
// The accumulator must NOT advance when the batch INSERT fails: the
1646-
// accumulator maps drive `compute_all_aggregates_with_maps`, and counting
1647-
// bytes for rows that never landed in the DB produces inflated dir_stats.
1648-
// This was the second mechanism behind the 1.83 TB ghost size on `..` —
1649-
// the first being two writers racing (Fix #1), the schema admitting
1650-
// duplicates (Fix #3) finalises it.
1678+
// The accumulator must only count rows that actually landed in the DB.
1679+
// `insert_entries_v2_batch` uses `INSERT OR IGNORE`, so one duplicate in
1680+
// a batch skips just that row and the rest insert. The accumulator maps
1681+
// drive `compute_all_aggregates_with_maps`; counting bytes for a row that
1682+
// lost an OR-IGNORE produces inflated dir_stats (this was one of the
1683+
// mechanisms behind the 1.83 TB ghost size on `..` of a 994 GB volume).
16511684
#[test]
1652-
fn handle_insert_entries_v2_does_not_accumulate_when_db_insert_fails() {
1685+
fn handle_insert_entries_v2_only_accumulates_rows_that_landed() {
16531686
use std::sync::atomic::AtomicU64;
16541687

16551688
let (db_path, _dir) = setup_db();
16561689
let conn = IndexStore::open_write_connection(&db_path).unwrap();
16571690

1658-
// Pre-seed a row with id=100 so a second insert with the same id
1659-
// collides on the integer PK (UNIQUE on entries.id is enforced
1660-
// independently of the (parent_id, name_folded) index).
1691+
// Pre-seed: id=100, name="first.txt".
16611692
let entries_first = vec![EntryRow {
16621693
id: 100,
16631694
parent_id: ROOT_ID,
@@ -1671,17 +1702,17 @@ mod tests {
16711702
}];
16721703
IndexStore::insert_entries_v2_batch(&conn, &entries_first).unwrap();
16731704

1674-
// Second batch: re-uses id=100. The savepoint rolls back the whole batch
1675-
// on the PK collision, so neither entry lands in the DB.
1705+
// Second batch: row 0 collides on the (parent_id, name_folded) UNIQUE
1706+
// index (same `first.txt` under ROOT_ID). Row 1 is fresh and must land.
16761707
let entries_dup = vec![
16771708
EntryRow {
1678-
id: 100,
1709+
id: 200,
16791710
parent_id: ROOT_ID,
16801711
name: "first.txt".into(),
16811712
is_directory: false,
16821713
is_symlink: false,
1683-
logical_size: Some(10),
1684-
physical_size: Some(10),
1714+
logical_size: Some(999_999),
1715+
physical_size: Some(999_999),
16851716
modified_at: None,
16861717
inode: None,
16871718
},
@@ -1711,23 +1742,29 @@ mod tests {
17111742
&mutation_counter,
17121743
);
17131744

1714-
// DB unchanged: only the first.txt row from the seed.
1745+
// DB has the original first.txt (id=100) and the new second.txt (id=101).
1746+
// id=200 was the OR-IGNORE'd duplicate and must not be in the DB.
17151747
assert_eq!(
17161748
IndexStore::get_entry_by_id(&conn, 100).unwrap().unwrap().name,
17171749
"first.txt"
17181750
);
1719-
assert!(IndexStore::get_entry_by_id(&conn, 101).unwrap().is_none());
1720-
1721-
// Accumulator must reflect that 0 entries were inserted from this batch.
17221751
assert_eq!(
1723-
accumulator.entries_inserted, 0,
1724-
"accumulator advanced despite failed INSERT (would feed inflated dir_stats into aggregation)"
1752+
IndexStore::get_entry_by_id(&conn, 101).unwrap().unwrap().name,
1753+
"second.txt"
17251754
);
1726-
assert!(
1727-
accumulator.direct_stats.is_empty(),
1728-
"accumulator.direct_stats populated despite failed INSERT: {:?}",
1729-
accumulator.direct_stats
1755+
assert!(IndexStore::get_entry_by_id(&conn, 200).unwrap().is_none());
1756+
1757+
// Accumulator must reflect exactly one new entry (the row that landed),
1758+
// never the 999_999-byte phantom. If a regression makes the accumulator
1759+
// count the OR-IGNORE'd row, this assert catches it.
1760+
assert_eq!(
1761+
accumulator.entries_inserted, 1,
1762+
"accumulator must count only rows that landed in the DB"
17301763
);
1764+
let stats = accumulator.direct_stats.get(&ROOT_ID).expect("ROOT_ID stats present");
1765+
assert_eq!(stats.0, 20, "logical bytes must only count the landed row");
1766+
assert_eq!(stats.1, 20, "physical bytes must only count the landed row");
1767+
assert_eq!(stats.2, 1, "file count must only include the landed row");
17311768
}
17321769

17331770
#[test]

0 commit comments

Comments
 (0)