Skip to content

Commit 11f59ea

Browse files
committed
AI search works again: finish the Search translateAi stub and surface translation failures
AI search in both Search and Select was broken in distinct ways, both invisible to the user: - **Search AI mode was an unfinished stub.** `SearchDialog.translateAi` fired the IPC and then threw the result away, so a natural-language query translated but applied nothing. It now applies the AI's writes (Pattern chip + label, size, date, scope, `caseSensitive`, hide-boring-folders), ported from the pre-`QueryDialog`-refactor `applyAiFilters`. Re-enabled the test that was `skip`ped because of the stub and added a regression that asserts the result is APPLIED, not just that the IPC was called. - **Both dialogs swallowed translation errors.** `translateAi` wrapped the call in `catch { return null }`, so a failed AI call (out of quota, key rejected, timeout, empty answer) was a silent no-op. `translateAi` now lets the error throw; `QueryDialog.runAiSearch` catches it once for both consumers and shows a specific, friendly toast. To toast a specific reason without string-matching the message (the `no-string-matching` rule): - New typed `AiTranslateError { kind, message }` crosses IPC (`ai/translate_error.rs`); the frontend branches on `kind`. - `AiError` gains `AuthFailed` (401/403), `RateLimited` (429, also OpenAI `insufficient_quota`), and `EmptyResponse`, classified by the pure `ai_error_for_status`. - Frontend `lib/ai/translate-error-toast.ts`: pure `kind -> copy` map + guard + the one `addToast` wrapper, unit-tested. Also bumped `translate_search_query` from `max_tokens=200` to `300` (matching selection) so a reasoning model doesn't spend the whole budget thinking and emit no visible answer; `EmptyResponse` now tells the user to pick a faster model when it still happens. Docs updated across `ai/`, `selection/`, `lib/ai/`, and `query-ui` CLAUDE.md.
1 parent 188ba72 commit 11f59ea

16 files changed

Lines changed: 660 additions & 65 deletions

File tree

apps/desktop/src-tauri/src/ai/CLAUDE.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,8 @@ Three provider modes:
1919
| `download.rs` | HTTP streaming download with Range-based resume. Emits `ai-download-progress` events (200ms throttle). Cooperative cancellation via function parameter (`Fn() -> bool`). |
2020
| `extract.rs` | Copies bundled `llama-server` binary + dylibs from `resources/ai/` to the AI data dir. Sets Unix permissions, handles symlinks. |
2121
| `process.rs` | Spawns child process with `DYLD_LIBRARY_PATH` set. Instant SIGKILL to stop (llama-server is stateless; macOS reclaims all GPU/mmap resources). `kill_process` for fire-and-forget (quit, orphans), `kill_and_reap_in_background` for normal operation (reaps zombie in bg thread). `kill_stale_llama_servers` for belt-and-suspenders orphan cleanup by process name. Port discovery via `bind(:0)`. |
22-
| `client.rs` | `genai`-backed chat client. `AiBackend` is a struct bundling a long-lived `genai::Client` with a model name; built via `AiBackend::local(port)` or `AiBackend::remote(api_key, base_url, model)`. The model name picks the adapter (`claude-*` → Anthropic native, `gemini-*` → Gemini native, `gpt-5*`/`*-pro`/`*-codex` → OpenAI Responses API, etc.). Auto-omits `temperature`/`top_p` for OpenAI Responses adapter and for chat-completions reasoning models (`o1*`, `o3*`, `o4*`, `chatgpt-*`, `gpt-5*` defense-in-depth) and substitutes `ReasoningEffort::Low`. Local backend forces the OpenAI adapter via a `ServiceTargetResolver` pinning endpoint to `http://127.0.0.1:<port>/v1/`. Exposes both `chat_completion` (full response) and `chat_completion_stream` (returns a `BoxStream<Result<String, AiError>>` of content chunks; reasoning/thought-signature/tool-call chunks filtered out). |
22+
| `client.rs` | `genai`-backed chat client. `AiBackend` is a struct bundling a long-lived `genai::Client` with a model name; built via `AiBackend::local(port)` or `AiBackend::remote(api_key, base_url, model)`. The model name picks the adapter (`claude-*` → Anthropic native, `gemini-*` → Gemini native, `gpt-5*`/`*-pro`/`*-codex` → OpenAI Responses API, etc.). Auto-omits `temperature`/`top_p` for OpenAI Responses adapter and for chat-completions reasoning models (`o1*`, `o3*`, `o4*`, `chatgpt-*`, `gpt-5*` defense-in-depth) and substitutes `ReasoningEffort::Low`. Local backend forces the OpenAI adapter via a `ServiceTargetResolver` pinning endpoint to `http://127.0.0.1:<port>/v1/`. Exposes both `chat_completion` (full response) and `chat_completion_stream` (returns a `BoxStream<Result<String, AiError>>` of content chunks; reasoning/thought-signature/tool-call chunks filtered out). `AiError` is typed by HTTP status via the pure `ai_error_for_status` (401/403 → `AuthFailed`, 429 → `RateLimited`, else `ServerError`); a `None` `first_text()` → `EmptyResponse`. |
23+
| `translate_error.rs` | `AiTranslateError { kind, message }` + `AiTranslateErrorKind` enum, the typed error the two translate IPC commands return so the frontend branches on `kind` (not the message string). `From<AiError>` maps transport variants; the commands map `BackendResolution` non-ready cases. Mirror enum: `lib/ai/translate-error-toast.ts`. |
2324
| `client_integration_test.rs` | `wiremock`-based tests covering request shape per adapter (chat completions vs Responses API), parsing, error mapping. Always run in CI. |
2425
| `client_streaming_test.rs` | `axum`-based SSE mock server tests for `chat_completion_stream`: chunks arrive in order, empty streams end cleanly, drop-mid-stream closes the connection, HTTP 5xx maps to `ServerError`. Always run in CI. (Wiremock can't chunk-deliver SSE bodies. See Gotchas.) |
2526
| `client_real_openai_test.rs` | `#[ignore]`-gated smoke tests against `api.openai.com`, including streaming variants for `gpt-4o-mini`, `gpt-5-mini`, `o3-mini`. Run with `OPENAI_API_KEY=$(security find-generic-password -a "$USER" -s "OPENAI_API_KEY" -w) cargo nextest run --lib --run-ignored only ai::client_real_openai_test`. Costs ~$0.001 per full run. |
@@ -60,7 +61,7 @@ Centralized in `manager::resolve_backend() -> BackendResolution`:
6061
- `Ready(AiBackend)`: backend ready to call `chat_completion` on.
6162
- `UnknownProvider(name)`: provider value isn't recognized.
6263

63-
Callers decide what to do per case. `suggestions.rs` returns empty on any non-Ready (folder suggestions are nice-to-have). `commands/search.rs::translate_search_query` returns the human-readable reason as an error so the UI can toast it.
64+
Callers decide what to do per case. `suggestions.rs` returns empty on any non-Ready (folder suggestions are nice-to-have). The two translate commands (`commands/search.rs::translate_search_query`, `commands/selection.rs::translate_selection_query`) return a typed `AiTranslateError { kind, message }` (in `translate_error.rs`) so the frontend can branch on `kind` and show a SPECIFIC toast (key rejected vs. out of quota vs. timed out vs. empty answer) without string-matching the message. The `kind` set maps both the `BackendResolution` non-ready cases (`off` / `notConfigured` / `unknownProvider`) and the `AiError` transport variants (`authFailed` / `rateLimited` / `timeout` / `unavailable` / `emptyResponse` / `serverError` / `parseError`). Frontend counterpart: `lib/ai/translate-error-toast.ts`; keep the two enums in lockstep.
6465

6566
## Download/install event sequence
6667

@@ -149,7 +150,7 @@ privacy-focused users. The architecture doesn't fight this switch: it's just a d
149150

150151
**Gotcha**: `genai 0.6` auto-routes `gpt-5*`, `*-codex`, `*-pro` to the Responses API, but `o1*`/`o3*`/`o4*`/`chatgpt-*` stay on Chat Completions even though they also reject custom `temperature`. We layer `is_openai_chat_reasoning_model()` on top to strip `temperature`/`top_p` and substitute `ReasoningEffort::Low` for those. The heuristic also matches `gpt-5*` as defense-in-depth in case `genai`'s routing rule changes.
151152

152-
**Gotcha**: For reasoning models, `max_tokens` (`max_output_tokens` on Responses API) covers reasoning + visible answer combined. Real-world finding: at `ReasoningEffort::Low`, `gpt-5-mini` consumed all 40 tokens thinking and emitted no `output_text`, so `first_text()` returned `None`. `suggestions.rs` (`max_tokens=150`) and `commands/search.rs` (`max_tokens=200`) may occasionally produce empty results when the user picks a reasoning model. Bump to `max_tokens >= 300` if empty-result rate becomes a problem; the empty-result graceful degradation already covers it functionally.
153+
**Gotcha**: For reasoning models, `max_tokens` (`max_output_tokens` on Responses API) covers reasoning + visible answer combined. Real-world finding: at `ReasoningEffort::Low`, `gpt-5-mini` consumed all 40 tokens thinking and emitted no `output_text`, so `first_text()` returned `None`. Both translate commands now request `max_tokens=300` (search bumped from 200; selection already 300) to give reasoning room before the visible answer. When `first_text()` is still `None`, `chat_completion` returns the typed `AiError::EmptyResponse` (not a generic parse error), which surfaces as a specific "the AI came back empty, try a faster model" toast. `suggestions.rs` (`max_tokens=150`) stays graceful-empty since folder suggestions are nice-to-have. Picking a non-reasoning model (the default `gpt-4.1-mini`) sidesteps this entirely.
153154

154155
**Gotcha**: `tauri::async_runtime::spawn` is used in `configure_ai` and `start_ai_server` instead of `tokio::spawn`.
155156
**Why**: These may run during Tauri setup before the tokio runtime is fully available. `tauri::async_runtime::spawn` uses Tauri's own runtime which is always ready at that point.

apps/desktop/src-tauri/src/ai/client.rs

Lines changed: 40 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -67,7 +67,14 @@ pub enum AiError {
6767
Unavailable,
6868
/// Request timed out (server too slow, or local server unhealthy).
6969
Timeout,
70-
/// Server returned an HTTP error or otherwise misbehaved.
70+
/// The provider rejected the API key (HTTP 401 / 403).
71+
AuthFailed(String),
72+
/// The provider is rate-limiting requests or the account is out of quota (HTTP 429).
73+
RateLimited(String),
74+
/// The call succeeded but the model produced no visible text. Common on reasoning models
75+
/// when `max_tokens` is fully consumed by reasoning before any answer is emitted.
76+
EmptyResponse,
77+
/// Server returned some other HTTP error or otherwise misbehaved.
7178
ServerError(String),
7279
/// Couldn't parse the response body.
7380
ParseError(String),
@@ -78,6 +85,9 @@ impl std::fmt::Display for AiError {
7885
match self {
7986
Self::Unavailable => write!(f, "AI server unavailable"),
8087
Self::Timeout => write!(f, "AI request timed out"),
88+
Self::AuthFailed(msg) => write!(f, "AI provider rejected the API key: {msg}"),
89+
Self::RateLimited(msg) => write!(f, "AI provider is rate-limiting or out of quota: {msg}"),
90+
Self::EmptyResponse => write!(f, "AI returned no text"),
8191
Self::ServerError(msg) => write!(f, "AI server error: {msg}"),
8292
Self::ParseError(msg) => write!(f, "AI response parse error: {msg}"),
8393
}
@@ -122,14 +132,11 @@ pub async fn chat_completion(
122132

123133
let text = res
124134
.first_text()
125-
.ok_or_else(|| {
126-
// Common on reasoning models (`gpt-5*`, `o3*`, `*-pro`) when `max_tokens`
127-
// gets fully consumed by reasoning before any `output_text` is emitted.
128-
// The HTTP call succeeded; there's just no visible answer to return.
129-
AiError::ParseError(String::from(
130-
"AI returned no text. Likely max_tokens fully consumed by reasoning. Increase max_tokens.",
131-
))
132-
})?
135+
// Common on reasoning models (`gpt-5*`, `o3*`, `*-pro`) when `max_tokens` gets
136+
// fully consumed by reasoning before any `output_text` is emitted. The HTTP call
137+
// succeeded; there's just no visible answer to return. Typed so callers can tell
138+
// the user to pick a simpler model or raise the token budget.
139+
.ok_or(AiError::EmptyResponse)?
133140
.to_owned();
134141

135142
log::trace!("AI chat_completion: extracted content: {text}");
@@ -254,6 +261,18 @@ fn make_resolver(endpoint: String, auth: AuthData, force_adapter: ForceAdapter)
254261
})
255262
}
256263

264+
/// Classifies a provider HTTP error status into the right [`AiError`] so the frontend can
265+
/// show a specific toast (key rejected vs. out of quota vs. generic server error). Branches
266+
/// on the numeric status, never the message body. 429 covers both rate-limiting and
267+
/// OpenAI's `insufficient_quota`; 401/403 is a rejected key.
268+
fn ai_error_for_status(status: u16, detail: String) -> AiError {
269+
match status {
270+
401 | 403 => AiError::AuthFailed(detail),
271+
429 => AiError::RateLimited(detail),
272+
_ => AiError::ServerError(detail),
273+
}
274+
}
275+
257276
/// Maps `genai`'s rich error tree to our flat [`AiError`]. Pattern-matches on the
258277
/// known transport variants instead of grepping the `Display` output.
259278
fn map_genai_error(e: genai::Error) -> AiError {
@@ -271,7 +290,7 @@ fn map_genai_error(e: genai::Error) -> AiError {
271290
W::Reqwest(req) if req.is_connect() => return AiError::Unavailable,
272291
W::Reqwest(req) => return AiError::ServerError(format!("network error: {req}")),
273292
W::ResponseFailedStatus { status, body, .. } => {
274-
return AiError::ServerError(format!("HTTP {status}: {body}"));
293+
return ai_error_for_status(status.as_u16(), format!("HTTP {status}: {body}"));
275294
}
276295
W::ResponseFailedNotJson { content_type, body } => {
277296
return AiError::ParseError(format!(
@@ -329,6 +348,7 @@ mod tests {
329348
fn test_ai_error_display() {
330349
assert_eq!(AiError::Unavailable.to_string(), "AI server unavailable");
331350
assert_eq!(AiError::Timeout.to_string(), "AI request timed out");
351+
assert_eq!(AiError::EmptyResponse.to_string(), "AI returned no text");
332352
assert_eq!(
333353
AiError::ServerError(String::from("bad")).to_string(),
334354
"AI server error: bad"
@@ -339,6 +359,16 @@ mod tests {
339359
);
340360
}
341361

362+
#[test]
363+
fn ai_error_for_status_classifies_by_code() {
364+
assert!(matches!(ai_error_for_status(401, "x".into()), AiError::AuthFailed(_)));
365+
assert!(matches!(ai_error_for_status(403, "x".into()), AiError::AuthFailed(_)));
366+
// 429 is both rate-limiting and OpenAI's `insufficient_quota`.
367+
assert!(matches!(ai_error_for_status(429, "x".into()), AiError::RateLimited(_)));
368+
assert!(matches!(ai_error_for_status(500, "x".into()), AiError::ServerError(_)));
369+
assert!(matches!(ai_error_for_status(404, "x".into()), AiError::ServerError(_)));
370+
}
371+
342372
#[test]
343373
fn test_is_openai_chat_reasoning_model() {
344374
assert!(is_openai_chat_reasoning_model("o1"));

apps/desktop/src-tauri/src/ai/mod.rs

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,9 @@ mod process;
3535
pub mod suggestions;
3636
#[cfg(test)]
3737
mod suggestions_streaming_test;
38+
pub mod translate_error;
39+
40+
pub use translate_error::{AiTranslateError, AiTranslateErrorKind};
3841

3942
use serde::{Deserialize, Serialize};
4043

Lines changed: 120 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,120 @@
1+
//! Typed error for the AI natural-language translation commands.
2+
//!
3+
//! Search and Selection both translate a prompt into a structured query via
4+
//! [`crate::ai::client::chat_completion`]. When that fails (provider off, key rejected,
5+
//! quota / rate limit, timeout, empty answer), the dialogs need to show a SPECIFIC toast,
6+
//! not a generic "something went wrong". A bare `String` error would force the frontend to
7+
//! string-match the message (banned by the `no-string-matching` rule), so we cross the IPC
8+
//! boundary with a typed `kind` plus a human-readable `message`. The frontend branches on
9+
//! `kind`; `message` is detail for logs, never for control flow.
10+
11+
use serde::Serialize;
12+
13+
use super::client::AiError;
14+
15+
/// Coarse, frontend-branchable classification of an AI translation failure.
16+
///
17+
/// Keep in lockstep with the `AiErrorKind` switch in
18+
/// `apps/desktop/src/lib/ai/translate-error-toast.ts`.
19+
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, specta::Type)]
20+
#[serde(rename_all = "camelCase")]
21+
pub enum AiTranslateErrorKind {
22+
/// AI is turned off (`provider = "off"`).
23+
Off,
24+
/// Provider is selected but not usable yet (no key, local server down, wrong provider).
25+
NotConfigured,
26+
/// The provider rejected the API key (HTTP 401 / 403).
27+
AuthFailed,
28+
/// The provider is rate-limiting requests or the account is out of quota (HTTP 429).
29+
RateLimited,
30+
/// The request timed out.
31+
Timeout,
32+
/// Couldn't reach the provider (DNS / connection refused).
33+
Unavailable,
34+
/// The model returned no usable text (often a reasoning model burning the token budget).
35+
EmptyResponse,
36+
/// The provider returned some other HTTP error or otherwise misbehaved.
37+
ServerError,
38+
/// Couldn't parse the provider's response.
39+
ParseError,
40+
/// The configured provider value isn't recognized.
41+
UnknownProvider,
42+
}
43+
44+
/// Typed error returned by `translate_search_query` / `translate_selection_query`.
45+
///
46+
/// `message` is a human-readable detail string for logs and the toast's secondary line; the
47+
/// frontend chooses the headline + tone from `kind`, never by parsing `message`.
48+
#[derive(Debug, Clone, Serialize, specta::Type)]
49+
#[serde(rename_all = "camelCase")]
50+
pub struct AiTranslateError {
51+
pub kind: AiTranslateErrorKind,
52+
pub message: String,
53+
}
54+
55+
impl AiTranslateError {
56+
pub fn new(kind: AiTranslateErrorKind, message: impl Into<String>) -> Self {
57+
Self {
58+
kind,
59+
message: message.into(),
60+
}
61+
}
62+
}
63+
64+
impl std::fmt::Display for AiTranslateError {
65+
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
66+
write!(f, "{:?}: {}", self.kind, self.message)
67+
}
68+
}
69+
70+
impl std::error::Error for AiTranslateError {}
71+
72+
impl From<AiError> for AiTranslateError {
73+
fn from(e: AiError) -> Self {
74+
use AiTranslateErrorKind as K;
75+
let kind = match e {
76+
AiError::Unavailable => K::Unavailable,
77+
AiError::Timeout => K::Timeout,
78+
AiError::AuthFailed(_) => K::AuthFailed,
79+
AiError::RateLimited(_) => K::RateLimited,
80+
AiError::EmptyResponse => K::EmptyResponse,
81+
AiError::ServerError(_) => K::ServerError,
82+
AiError::ParseError(_) => K::ParseError,
83+
};
84+
Self::new(kind, e.to_string())
85+
}
86+
}
87+
88+
#[cfg(test)]
89+
mod tests {
90+
use super::*;
91+
92+
#[test]
93+
fn maps_each_ai_error_to_its_kind() {
94+
use AiTranslateErrorKind as K;
95+
let cases = [
96+
(AiError::Unavailable, K::Unavailable),
97+
(AiError::Timeout, K::Timeout),
98+
(AiError::AuthFailed("x".into()), K::AuthFailed),
99+
(AiError::RateLimited("x".into()), K::RateLimited),
100+
(AiError::EmptyResponse, K::EmptyResponse),
101+
(AiError::ServerError("x".into()), K::ServerError),
102+
(AiError::ParseError("x".into()), K::ParseError),
103+
];
104+
for (err, expected) in cases {
105+
assert_eq!(AiTranslateError::from(err).kind, expected);
106+
}
107+
}
108+
109+
#[test]
110+
fn carries_the_detail_message() {
111+
// The detail flows through verbatim (the source error's Display), so logs / the
112+
// toast's secondary line keep the provider's wording. We compare against Display
113+
// rather than substring-matching the message (the no-string-matching rule).
114+
let src = AiError::RateLimited("HTTP 429: out of quota".into());
115+
let expected = src.to_string();
116+
let err = AiTranslateError::from(src);
117+
assert_eq!(err.kind, AiTranslateErrorKind::RateLimited);
118+
assert_eq!(err.message, expected);
119+
}
120+
}

0 commit comments

Comments
 (0)