Commit 7ce90bb

Selection: Rust backend (M5)
- New `selection/` module mirroring `search/` but narrower (no scope, no system-dir exclusion).
- `selection/history.rs`: persistent recent-selections store at `{app_data_dir}/selection-history.json`. Atomic write, schema-versioned (v1), canonical-key dedupe over `mode|normalized_query|filters|case_sensitive`, cap eviction (default 1000, 0 disables). Re-exports `HistoryMode` and `HistoryFilters` from `search::history` so the wire shape stays in sync.
- `selection/ai/{prompt,parser,query_builder}.rs`: cloud-only LLM pipeline. Key-value response (`pattern`, `kind`, `size_min`, `size_max`, `modified_after`, `modified_before`, `note`, `label`). The prompt grounds the model in a sample of the focused folder's filenames so "every rymd file" → `*rymd*`.
- `commands/selection.rs`: 6 IPC commands wired through specta (`translate_selection_query`, `get_recent_selections`, `add_recent_selection`, `remove_recent_selection`, `clear_recent_selections`, `apply_recent_selections_max_count`). `translate_selection_query` hard-errors when the AI provider isn't `cloud`.
- Settings registry + applier wire `selection.recentSelections.maxCount` (default 1000, range 0-10000) with live-apply.
- 62 Rust unit tests for history + AI pipeline (canonical key, dedupe, cap, schema-version quarantine, prompt assembly, parser edge cases, builder filter combinations).
- 11 IPC contract tests under `lib/ipc/selection.test.ts` pin the wire shape for destructive `clear_recent_selections` and cross-window `apply_recent_selections_max_count`.
- 6 `#[ignore]`-gated real-OpenAI eval tests (`selection::ai::real_llm_eval_test`) confirm the prompt + parser produce parseable results across 6 representative intents. All 6 pass against `gpt-4o-mini`.
- Bindings regenerated; `bindings-fresh` green. Full default check suite green for Rust + new TS tests; pre-existing svelte-check/eslint failures in `query-ui/QueryDialog.*` and `search-state.svelte.ts` belong to M4 frontend work, not this commit.
- Docs: new `selection/CLAUDE.md`, `search/CLAUDE.md` updated to call out the shared history types, `docs/architecture.md` adds a `selection/` row.
1 parent 8d41729 commit 7ce90bb

21 files changed

Lines changed: 2406 additions & 0 deletions

apps/desktop/src-tauri/src/commands/mod.rs

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ pub mod network;
 pub mod rename;
 pub mod restricted_paths;
 pub mod search;
+pub mod selection;
 pub mod settings;
 pub mod smb_diagnostics;
 pub mod sync_status; // Has both macOS and non-macOS implementations

apps/desktop/src-tauri/src/commands/selection.rs

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
+//! IPC commands for the Selection dialog (M5).
+//!
+//! Thin wrappers around `crate::selection`. The selection AI translation is
+//! cloud-only: small local models can't reliably handle a 200+-name folder sample
+//! plus the structured prompt and response. `translate_selection_query` returns a
+//! hard error when the configured provider isn't `cloud` so the UI can toast the
+//! reason; the frontend hides the AI chip when the provider isn't cloud so we
+//! never reach this path in normal use.
+
+use genai::chat::ChatOptions;
+
+use crate::ai::client::AiBackend;
+use crate::ai::manager::BackendResolution;
+
+use crate::selection::ai::{self, SelectionTranslateResult, query_builder};
+use crate::selection::history::{self, SelectionHistoryEntry};
+
+/// Resolves the AI backend, requiring a cloud provider.
+///
+/// Mirrors `commands::search::resolve_ai_backend` but adds the cloud-only gate. The
+/// frontend hides the AI chip when `ai.provider !== 'cloud'`; this gate is the
+/// belt-and-braces check so a misconfigured frontend (or an MCP caller in the
+/// future) can't drive the local model with a prompt it can't handle.
+fn resolve_cloud_ai_backend() -> Result<AiBackend, String> {
+    let provider = crate::ai::manager::get_provider();
+    if provider != "cloud" {
+        return Err("AI selection needs a cloud provider. Set one in Settings > AI.".to_string());
+    }
+    match crate::ai::manager::resolve_backend() {
+        BackendResolution::Ready(b) => Ok(b),
+        BackendResolution::Off => Err("AI is not configured. Enable a cloud provider in settings.".to_string()),
+        BackendResolution::NotConfigured(reason) => Err(reason.to_string()),
+        BackendResolution::UnknownProvider(p) => Err(format!("Unknown AI provider: {p}")),
+    }
+}
+
+/// Translates a natural-language selection request into a glob/regex plus optional
+/// size and date filters.
+///
+/// The `sample_names` argument is the focused folder's filename listing (already
+/// sampled on the frontend; see `lib/selection-dialog/folder-sampler.ts` for the
+/// sampling strategy). It grounds the prompt in what's actually in the folder.
+#[tauri::command]
+#[specta::specta]
+pub async fn translate_selection_query(
+    prompt: String,
+    sample_names: Vec<String>,
+) -> Result<SelectionTranslateResult, String> {
+    let backend = resolve_cloud_ai_backend()?;
+    let system_prompt = ai::build_classification_prompt(&sample_names);
+
+    log::debug!(
+        target: "selection::ai",
+        "translate_selection_query: prompt={prompt:?}, sample_count={}, system_prompt_chars={}",
+        sample_names.len(),
+        system_prompt.len()
+    );
+
+    let options = ChatOptions::default()
+        .with_temperature(0.2)
+        .with_max_tokens(300)
+        .with_top_p(0.9);
+
+    let t0 = std::time::Instant::now();
+    let response = crate::ai::client::chat_completion(&backend, &system_prompt, &prompt, &options)
+        .await
+        .map_err(|e| {
+            log::warn!(
+                target: "selection::ai",
+                "chat_completion failed after {:.1}s for prompt={prompt:?}: {e}",
+                t0.elapsed().as_secs_f64()
+            );
+            format!("{e}")
+        })?;
+
+    log::info!(
+        target: "selection::ai",
+        "translate_selection_query: response {} chars in {:.1}s",
+        response.len(),
+        t0.elapsed().as_secs_f64()
+    );
+    log::debug!(target: "selection::ai", "translate_selection_query raw response: {response:?}");
+
+    let parsed = ai::parse_selection_response(&response);
+    Ok(query_builder::build_selection_translate_result(&parsed))
+}
+
+// ============================================================================
+// Recent selections (history) IPC
+// ============================================================================
+
+/// Returns the persisted recent-selections entries (newest first). `limit = None`
+/// returns all.
+#[tauri::command]
+#[specta::specta]
+pub fn get_recent_selections(limit: Option<u32>) -> Vec<SelectionHistoryEntry> {
+    history::list_entries(limit.map(|n| n as usize))
+}
+
+/// Adds a recent-selection entry. Dedupes against existing entries by canonical
+/// key, moves the matching one to the top, and trims to `max_count`.
+#[tauri::command]
+#[specta::specta]
+pub fn add_recent_selection(
+    app: tauri::AppHandle,
+    entry: SelectionHistoryEntry,
+    max_count: Option<u32>,
+) -> Result<(), String> {
+    let cap = max_count.map(|n| n as usize).unwrap_or_else(history::default_max_count);
+    history::add_entry(&app, entry, cap);
+    Ok(())
+}
+
+/// Removes a recent-selection entry by id. No-op when the id isn't present.
+#[tauri::command]
+#[specta::specta]
+pub fn remove_recent_selection(app: tauri::AppHandle, id: String) -> Result<(), String> {
+    history::remove_entry(&app, &id);
+    Ok(())
+}
+
+/// Clears every recent-selection entry.
+#[tauri::command]
+#[specta::specta]
+pub fn clear_recent_selections(app: tauri::AppHandle) -> Result<(), String> {
+    history::clear_entries(&app);
+    Ok(())
+}
+
+/// Live-applies a new `selection.recentSelections.maxCount` value. Trims the
+/// in-memory store and rewrites disk only when entries actually drop.
+#[tauri::command]
+#[specta::specta]
+pub fn apply_recent_selections_max_count(app: tauri::AppHandle, max_count: u32) -> Result<(), String> {
+    history::apply_max_count(&app, max_count as usize);
+    Ok(())
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn translate_result_serialization_round_trips() {
+        let r = SelectionTranslateResult {
+            pattern: Some("*.log".to_string()),
+            kind: Some("glob".to_string()),
+            size_min: Some(1024),
+            size_max: None,
+            modified_after: Some("2026-01-01".to_string()),
+            modified_before: None,
+            caveat: None,
+            label: Some("Log files".to_string()),
+        };
+        let json = serde_json::to_string(&r).unwrap();
+        assert!(json.contains("\"pattern\":\"*.log\""));
+        assert!(json.contains("\"sizeMin\":1024"));
+        assert!(json.contains("\"modifiedAfter\":\"2026-01-01\""));
+        assert!(json.contains("\"label\":\"Log files\""));
+    }
+}

apps/desktop/src-tauri/src/ipc.rs

Lines changed: 6 additions & 0 deletions
@@ -495,6 +495,12 @@ pub fn builder() -> Builder<tauri::Wry> {
 crate::commands::search::remove_recent_search,
 crate::commands::search::clear_recent_searches,
 crate::commands::search::apply_recent_searches_max_count,
+crate::commands::selection::translate_selection_query,
+crate::commands::selection::get_recent_selections,
+crate::commands::selection::add_recent_selection,
+crate::commands::selection::remove_recent_selection,
+crate::commands::selection::clear_recent_selections,
+crate::commands::selection::apply_recent_selections_max_count,
 crate::commands::e2e::get_e2e_start_path,
 crate::commands::e2e::is_e2e_mode,
 #[cfg(feature = "playwright-e2e")]

apps/desktop/src-tauri/src/ipc_collectors.rs

Lines changed: 6 additions & 0 deletions
@@ -190,6 +190,12 @@ pub(crate) fn collect_cross_platform_types(types: &mut Types) -> Vec<Function> {
 crate::commands::search::remove_recent_search,
 crate::commands::search::clear_recent_searches,
 crate::commands::search::apply_recent_searches_max_count,
+crate::commands::selection::translate_selection_query,
+crate::commands::selection::get_recent_selections,
+crate::commands::selection::add_recent_selection,
+crate::commands::selection::remove_recent_selection,
+crate::commands::selection::clear_recent_selections,
+crate::commands::selection::apply_recent_selections_max_count,
 crate::commands::e2e::get_e2e_start_path,
 crate::commands::e2e::is_e2e_mode,
 crate::commands::clipboard::copy_files_to_clipboard,

apps/desktop/src-tauri/src/lib.rs

Lines changed: 4 additions & 0 deletions
@@ -121,6 +121,7 @@ mod redact;
 mod restricted_paths;
 pub mod search;
 mod secrets;
+pub mod selection;
 mod settings;
 mod short_id;
 mod space_poller;
@@ -455,6 +456,9 @@ pub fn run() {
 // Load persisted recent search history into the in-memory cache.
 search::history::load_history(app.handle());
 
+// Same for recent selections (Selection dialog history).
+selection::history::load_history(app.handle());
+
 // Load manually-added servers and inject into discovery state
 #[cfg(any(target_os = "macos", target_os = "linux"))]
 network::manual_servers::load_manual_servers(app.handle());

apps/desktop/src-tauri/src/search/CLAUDE.md

Lines changed: 15 additions & 0 deletions
@@ -67,6 +67,21 @@ field is missing (`build_label` returns `None`).
 **Decision**: Single LLM pass, no refinement.
 **Why**: The previous two-pass system (translate + refine) caused regressions ~15% of the time (over-narrowing, flag dropping). With deterministic structure, there's nothing to refine. Also halves LLM latency.
 
+## Sharing with `selection/`
+
+`crate::selection::history` re-exports `HistoryMode` and `HistoryFilters` from this
+module's `history.rs`. The two pure data types are identical in intent across Search
+and Selection, so the wire shape stays in sync. The entry struct itself
+(`HistoryEntry` here vs `SelectionHistoryEntry`) stays separate because the canonical
+dedupe keys differ (Selection has no scope or exclude-system-dirs). If the mode set
+ever forks between consumers, drop the re-export and copy the types.
+
+The AI parser helpers in `search/ai/parser.rs` are NOT shared with Selection: the
+fields are different (Selection has no `keywords` / `type` / `scope` / `folders`), so
+sharing would have meant exporting a few low-level helpers (`is_year`, `is_range`)
+that aren't worth the coupling. Selection's parser lives independently in
+`crate::selection::ai::parser`.
+
 ## Coupling to `indexing/`
 
 `search/` is a read-only consumer of the indexing DB via `ReadPool`, `WRITER_GENERATION`, and `store::resolve_path`. This is intentional -- search reads from the index but doesn't participate in indexing. The dependency is one-way (`search` -> `indexing`, never reverse) and narrow:

apps/desktop/src-tauri/src/selection/CLAUDE.md

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
+# Selection module
+
+Backend for the Selection dialog (Select files / Deselect files). Mirrors `crate::search`
+but narrower: there is no scope, no system-dir exclusion, no in-memory index, and the
+matcher itself runs in JS against the focused folder's entries. This module owns just
+the persistent history store and the AI translation pipeline.
+
+## Module structure
+
+| File | Purpose |
+|------|---------|
+| `mod.rs` | Re-exports the public surface. |
+| `history.rs` | `SelectionHistoryEntry`, atomic JSON read/write, canonical-key dedupe, cap eviction, schema-version quarantine. Re-exports `HistoryMode` and `HistoryFilters` from `crate::search::history` so the frontend sees the same mode/filter shape for both consumers. |
+| `ai/mod.rs` | Re-exports the AI submodules. |
+| `ai/prompt.rs` | `build_classification_prompt(sample_names)` and `format_sample_block`. Pure functions; no IPC. Returns the system-prompt string the LLM receives. |
+| `ai/parser.rs` | `parse_selection_response(text)` → `ParsedSelectionLlmResponse`. Key-value line parser; mirrors `search::ai::parser` in style but with the narrower field set. |
+| `ai/query_builder.rs` | `build_selection_translate_result(parsed)` → `SelectionTranslateResult`. Assembles the result type that crosses IPC; `generate_caveat` and `build_label` are the supporting helpers. |
+
+The IPC layer is in `crate::commands::selection`.
+
+## History store
+
+Persistent recent-selections store for the dialog's footer + popover. Same atomic-write
+story as `crate::search::history`; the key tradeoffs are:
+
+- **Persistence path**: `{app_data_dir}/selection-history.json`. Schema-versioned via
+  `_schemaVersion` (currently `1`).
+- **In-memory cache + disk lock**: in-memory `Mutex<HistoryStore>` plus a separate
+  `OnceLock<Mutex<()>>` (`DISK_LOCK`) that serializes the read-modify-write cycle so
+  concurrent IPC commands can't lose writes. Cache guards drop before any `fs` call.
+- **Canonical dedupe key**: `mode | normalized_query | filters | case_sensitive`. Four
+  segments; Search's key has six (it adds `scope` and `exclude_system_dirs`). Filters
+  serialize as alphabetically-keyed `k=v,k=v` pairs with undefined fields omitted. The
+  key is never persisted; it only exists at compare time.
+- **Recovery**: parse failure or schema-version mismatch → rename file to `.broken`,
+  start fresh. The user keeps using the dialog; the corrupted file is preserved for
+  one rotation in case debugging is needed.
+- **Cap**: configurable via `selection.recentSelections.maxCount` (default 1000).
+  `apply_max_count` trims the in-memory store on live-apply; `0` clears everything and
+  short-circuits future adds.
+
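
Reviewer sketch (not part of the commit): the canonical dedupe key described in the bullets above could be assembled roughly like this. The `Mode` enum, the `canonical_key` name, and the normalization rule (trim, then lowercase only for case-insensitive queries) are assumptions for illustration; the real `history.rs` may normalize differently.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the real `HistoryMode`; the variant names are assumed.
#[derive(Clone, Copy)]
enum Mode {
    Glob,
    Regex,
}

/// Builds a four-segment key: `mode|normalized_query|filters|case_sensitive`.
/// The key is only used for comparison and is never written to disk.
fn canonical_key(
    mode: Mode,
    query: &str,
    filters: &BTreeMap<&str, Option<String>>,
    case_sensitive: bool,
) -> String {
    let mode = match mode {
        Mode::Glob => "glob",
        Mode::Regex => "regex",
    };
    // Assumed normalization: trim whitespace; fold case only when the
    // query is case-insensitive.
    let trimmed = query.trim();
    let normalized = if case_sensitive {
        trimmed.to_string()
    } else {
        trimmed.to_lowercase()
    };
    // A BTreeMap iterates in key order, which gives the alphabetical
    // `k=v,k=v` layout; unset filters are omitted entirely.
    let filters = filters
        .iter()
        .filter_map(|(k, v)| v.as_ref().map(|v| format!("{k}={v}")))
        .collect::<Vec<_>>()
        .join(",");
    format!("{mode}|{normalized}|{filters}|{case_sensitive}")
}
```

Because unset filters drop out and the rest are sorted, two entries that differ only in field order or in explicitly empty filters collapse to the same key.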
+### Decision: separate `selection-history.json` from `search-history.json`
+
+Storing both consumers' history in one file with a `kind` discriminator was rejected.
+Their schemas already diverge (`scope` and `exclude_system_dirs` are irrelevant for
+Selection), and coupling two unrelated migrations forever didn't earn its keep. The
+small cost of two files is invisible at runtime.
+
+### Decision: re-export `HistoryMode` and `HistoryFilters` from `search::history`
+
+The two pure data types are identical in intent across the two consumers. The
+`SelectionHistoryEntry` struct itself stays separate so the on-disk schema doesn't bind
+Selection to Search's canonical-key shape. If Search's mode set or filter shape ever
+diverges from Selection's, the re-export drops out and the types fork; the wiring is
+already isolated enough that the change is mechanical.
+
+## AI translation
+
+The `translate_selection_query(prompt, sample_names)` IPC orchestrates:
+
+1. Verifies the AI provider is `cloud`. Small local models (4-8K context) can't
+   reliably fit a 200+-name folder sample plus the structured prompt and response, so
+   the backend hard-errors when provider isn't cloud. The frontend hides the AI chip
+   in that case, but this gate is the belt-and-braces check for an MCP caller or a
+   misconfigured frontend.
+2. Calls `ai::build_classification_prompt(&sample_names)` to assemble the system
+   prompt with today's date and the folder sample.
+3. Runs `chat_completion` via `crate::ai::client` against the configured cloud backend
+   with `temperature: 0.2`, `max_tokens: 300`, `top_p: 0.9`.
+4. Parses the response via `ai::parse_selection_response` into a
+   `ParsedSelectionLlmResponse`.
+5. Builds the wire-result via `ai::build_selection_translate_result`.
+
+### Decision: cloud-only AI for Selection
+
+Folder samples weigh 1-3k tokens; the prompt plus completion lives ~4-5k tokens. Local
+4-8K context models often can't fit the full payload, and quality on small models is
+unreliable for pattern inference. We surface a tooltip on the gated UI in the frontend
+("AI selection needs a cloud provider. Set one in Settings > AI."); the backend
+returns the same message as a hard error for any non-cloud caller.
+
+### Decision: key-value response format, not JSON
+
+Same rationale as `crate::search::ai`. JSON generation is the #1 failure mode for
+small LLMs. Key-value lines are trivial to produce and parse, missing lines are
+individually skippable, and malformed lines never void the whole response.
+
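
Reviewer sketch (not part of the commit): a minimal illustration of why key-value lines degrade gracefully. The `Parsed` struct and `parse_response` function are hypothetical and cover only four of the fields; the `key: value` line shape is assumed from the `kind:` mention in this file, and the real `ai/parser.rs` may differ in details such as size units.

```rust
/// Hypothetical, reduced version of `ParsedSelectionLlmResponse`.
#[derive(Debug, Default, PartialEq)]
struct Parsed {
    pattern: Option<String>,
    kind: Option<String>,
    size_min: Option<u64>,
    note: Option<String>,
}

/// Parses `key: value` lines one at a time. A bad line only loses its own field.
fn parse_response(text: &str) -> Parsed {
    let mut out = Parsed::default();
    for line in text.lines() {
        // Lines without a colon are skipped rather than treated as fatal.
        let Some((key, value)) = line.split_once(':') else {
            continue;
        };
        let value = value.trim();
        if value.is_empty() {
            continue;
        }
        match key.trim().to_ascii_lowercase().as_str() {
            "pattern" => out.pattern = Some(value.to_string()),
            // Anything other than `glob` / `regex` is dropped to `None`.
            "kind" => out.kind = matches!(value, "glob" | "regex").then(|| value.to_string()),
            // An unparseable number leaves the field unset (assumes plain bytes).
            "size_min" => out.size_min = value.parse().ok(),
            "note" => out.note = Some(value.to_string()),
            _ => {} // unknown keys are ignored
        }
    }
    out
}
```

Splitting on the first colon only means a regex pattern that itself contains `:` survives intact, which a naive split on every colon would break.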
+### Decision: `pattern` + `kind` instead of structured filter types
+
+The matcher runs on the frontend in JS. There's no benefit to round-tripping a typed
+glob through Rust; the parsed string IS the contract. The kind is `glob` (full-name
+match, `*` and `?` only) or `regex` (JS RegExp). When `pattern` is missing or blank,
+`kind` is forced to `None` so the frontend doesn't compile a half-built query.
+
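
Reviewer sketch (not part of the commit): the real matcher lives in the JS frontend and is not shown in this diff. To make the `glob` contract concrete (full-name match, `*` and `?` only), here is an illustrative Rust equivalent. The `glob_matches` name is hypothetical, and case-sensitive comparison is an assumption.

```rust
/// Full-name glob match supporting only `*` (any run of characters, possibly
/// empty) and `?` (exactly one character). Comparison is case-sensitive here.
fn glob_matches(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0usize, 0usize);
    // Last `*` seen in the pattern, plus the name index it currently covers up to.
    let mut star: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            // Record the star and first try matching it against nothing.
            star = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == n[ni]) {
            pi += 1;
            ni += 1;
        } else if let Some((sp, sn)) = star {
            // Mismatch: let the last `*` absorb one more character and retry.
            pi = sp + 1;
            ni = sn + 1;
            star = Some((sp, sn + 1));
        } else {
            return false;
        }
    }
    // The name is consumed; only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}
```

With this definition `*rymd*` matches `rymd-2024.pdf`, while `*.log` does not match `app.log.txt` because the whole name must be covered.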
+### Decision: default `kind` to `glob` when the model omits it
+
+The model occasionally forgets to emit `kind:` for obvious globs (`*.png`, `*.log`).
+Defaulting saves a re-prompt. The parser still drops `kind` to `None` when the value
+isn't one of `glob`/`regex`; the builder catches the missing-kind-with-pattern case
+and substitutes `glob`.
+
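
Reviewer sketch (not part of the commit): the two-stage `kind` handling described above, written out under assumptions. The `Kind` enum and the `parse_kind` / `resolve` names are hypothetical; the actual `SelectionTranslateResult` carries `kind` as an `Option<String>`.

```rust
/// Hypothetical typed stand-in for the `kind` string.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Glob,
    Regex,
}

/// Parser side: any value other than `glob` / `regex` becomes `None`.
fn parse_kind(raw: Option<&str>) -> Option<Kind> {
    match raw.map(str::trim) {
        Some("glob") => Some(Kind::Glob),
        Some("regex") => Some(Kind::Regex),
        _ => None,
    }
}

/// Builder side: a blank pattern forces both fields to `None`; a present
/// pattern with no kind falls back to `glob`.
fn resolve(pattern: Option<&str>, kind: Option<Kind>) -> (Option<String>, Option<Kind>) {
    match pattern.map(str::trim).filter(|p| !p.is_empty()) {
        None => (None, None),
        Some(p) => (Some(p.to_string()), Some(kind.unwrap_or(Kind::Glob))),
    }
}
```

Keeping the default in the builder rather than the parser means the parser stays a faithful record of what the model said, and the fallback policy lives in one place.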
+## Real-LLM eval results
+
+The prompt + parser are pinned by `selection/ai/real_llm_eval_test.rs`, six
+`#[ignore]`-gated integration tests against the live OpenAI API. Run them with:
+
+```sh
+OPENAI_API_KEY=$(security find-generic-password -a "$USER" -s "OPENAI_API_KEY" -w) \
+  cargo nextest run --lib --run-ignored only selection::ai::real_llm_eval_test
+```
+
+The default model is `gpt-4o-mini` (cheap, fast, comparable to the model David has
+configured in his Settings UI for everyday use). When David's cloud-provider model
+changes, edit `MODEL` in the eval file and rerun.
+
+| Intent | Sample shape | Assertions | Status |
+|---|---|---|---|
+| "all log files" | mixed `.log` / `.txt` / `.md` / `.png` | pattern contains `log`, `kind` set | passing |
+| "png and jpg images" | mixed image + text extensions | pattern mentions both png and jpg/jpeg | passing |
+| "files bigger than 5 MB" | mixed sizes | `size_min` in [4 MB, 10 MB], pattern present | passing |
+| "backups from last week" | `*-backup-*` files plus noise | `modified_after` set | passing |
+| "every rymd file" | `rymd-*.pdf` plus noise | pattern matches the keyword | passing |
+| "final drafts I haven't shared" | `Final-*` files | pattern OR caveat present (no half-built query) | passing |
+
+The eval also surfaces drift: a prompt change that breaks one of these assertions
+shows up before the dialog wraps around it. Iterate the prompt, rerun the eval, ship
+the prompt change with green tests.
+
+For ad-hoc debugging (peek at the raw model response), add an `eprintln!` to the
+`translate` helper temporarily (allowed in `#[cfg(test)]` blocks for `--no-capture`
+runs); revert before commit so the crate-level deny on `print_stderr` stays clean.
+Alternatively, run the dialog through the live app and tail
+`RUST_LOG=cmdr_lib::selection::ai=debug pnpm dev`.
+
+## IPC surface
+
+All commands live in `crate::commands::selection`:
+
+| Command | Purpose |
+|---|---|
+| `translate_selection_query(prompt, sample_names)` | AI translation; cloud-only. Returns `SelectionTranslateResult` or an error string. |
+| `get_recent_selections(limit)` | Returns persisted entries (newest first). |
+| `add_recent_selection(entry, max_count)` | Adds + dedupes + caps. |
+| `remove_recent_selection(id)` | Removes by id; no-op when missing. |
+| `clear_recent_selections()` | Drops every entry. |
+| `apply_recent_selections_max_count(max_count)` | Live-applies a freshly-tuned cap. |
+
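
Reviewer sketch (not part of the commit): the "adds + dedupes + caps" and live-apply rows above, reduced to in-memory list operations. The `Entry` struct and both function signatures are hypothetical; the real functions in `history.rs` take an `AppHandle` and also persist to disk. One simplification is labeled in the code: the sketch replaces a matching entry with the new one instead of moving the existing entry.

```rust
/// Hypothetical, trimmed-down entry; `key` stands for the canonical dedupe key.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    id: String,
    key: String,
}

/// Newest-first insert with dedupe and cap eviction.
fn add_entry(entries: &mut Vec<Entry>, entry: Entry, max_count: usize) {
    if max_count == 0 {
        return; // a cap of 0 disables history, so adds short-circuit
    }
    // Simplification: drop any entry with the same key, then insert the new one.
    entries.retain(|e| e.key != entry.key);
    entries.insert(0, entry);
    // Evict the oldest entries beyond the cap.
    entries.truncate(max_count);
}

/// Trims to the new cap and reports whether anything was dropped, so the
/// caller can skip the disk rewrite when nothing changed.
fn apply_max_count(entries: &mut Vec<Entry>, max_count: usize) -> bool {
    let before = entries.len();
    entries.truncate(max_count);
    entries.len() != before
}
```

Returning a "changed" flag from `apply_max_count` is one way to get the "rewrites disk only when entries actually drop" behavior the command's doc comment describes.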
+All six are registered in `crate::ipc::builder` (runtime dispatch) and
+`crate::ipc_collectors::collect_cross_platform_types` (specta). The bindings appear
+in `apps/desktop/src/lib/ipc/bindings.ts`; the typed wrappers live in
+`apps/desktop/src/lib/tauri-commands/selection.ts`.
+
+## Coupling to other modules
+
+- `crate::search::history`: re-exports `HistoryMode` and `HistoryFilters`. One-way.
+- `crate::ai::manager` + `crate::ai::client`: backend resolution and chat completion.
+  Mirrors `crate::commands::search`'s usage exactly.
+- `crate::config::resolved_app_data_dir`: shared persistence-path resolver.
+
+No other modules depend on `selection`; the dialog frontend and command-dispatch wiring
+land in M7.
