Skip to content

Commit e12285d

Browse files
committed
MCP: harden ack contract, fix viewer-close + close-about + open-cursor
Fixes a batch of issues found while reviewing the original ack-contract PR end-to-end via real MCP traffic. - `dialog close about` now waits for `SoftDialogDisappeared("about")` instead of `WindowDisappeared("about")`. The about dialog is a soft overlay, not a Tauri window, so the prior signal was always immediately true and the close was effectively fire-and-forget. Now it ack's on the tracker losing the id; absent → instant OK, present → wait for the FE to close it. - `dialog close file-viewer` by path no longer hangs on the ALL-viewers-gone signal. Added `AckSignal::WindowCountBelow { prefix, threshold }` + `snapshot_window_count` helper, so the tool snapshots the viewer count, dispatches, and ack's when the count drops by one. Close-all uses `threshold: 1`. Fast-fail with `invalid_params` when zero viewers are open instead of timing out. - `open_under_cursor` no longer false-times-out on files. The OS-open branch produces neither a state push nor a viewer window, so the previous `Any([GenerationAdvanced, WindowAppeared("viewer")])` signal would always time out at 5 s. Routed through `mcp_round_trip` instead: FE awaits `handleNavigate(entry)` and signals completion explicitly. Removed the now-unused `AckSignal::Any` variant; round-trip is the only honest ack for multi-mode commands. - `nav_to_parent` / `nav_back` / `nav_forward` get a `NAV_ACK_TIMEOUT` of 5 s (up from 1500 ms) because a parent/back jump can land on an SMB or MTP path whose directory listing takes a few seconds even on success. - `update_pane_tabs` now delegates to `PaneStateStore::set_tabs`, a new helper that owns both the tab-list write and the generation bump. Eliminates the inline `pane.write().unwrap().tabs = ...` + `generation.fetch_add(1, Relaxed)` pair that a future contributor would forget to keep in sync. - Documented the `GenerationAdvanced` TOCTOU caveat (an unrelated state push between snapshot and dispatch can satisfy the signal) and the `SoftDialogDisappeared` test-context asymmetry (returns true when the tracker isn't registered, so close paths don't need a fixture). - `mcp/CLAUDE.md`, `docs/tooling/mcp.md`, the ack-signal table, and the round-trip list updated. `bindings.ts` regenerated for the refreshed `update_pane_tabs` doc-comment. End-to-end verified via the real MCP transport: dialog open/close for settings/about/file-viewer (single and multi-viewer); mkdir/mkfile cancel; copy/move/delete autoConfirm and manual; tab new/close; toggle_hidden/set_view_mode/sort; nav_to_parent/back/forward; open_under_cursor on both a .txt file (OS-opens) and a directory (navigates). All Rust checks (1907 unit + 30 integration tests, clippy, rustfmt, cargo-deny, cargo-audit) and Svelte (svelte-check, 1928 vitest tests, oxfmt) green.
1 parent 3c1b0dc commit e12285d

14 files changed

Lines changed: 287 additions & 102 deletions

File tree

apps/desktop/src-tauri/src/mcp/CLAUDE.md

Lines changed: 10 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -69,20 +69,23 @@ The mechanism lives in `executor/ack.rs`. Each tool:
6969
| `SoftDialogAppeared(id)` | A soft dialog with that ID is in `SoftDialogTracker` | Confirmation dialogs from `copy`/`move`/`delete` (autoConfirm: false), `mkdir`, `mkfile`; `dialog open about` |
7070
| `SoftDialogDisappeared(id)` | A soft dialog with that ID is no longer in `SoftDialogTracker` | `dialog close <confirmation>` — the FE `ModalDialog` fires `notifyDialogClosed` on unmount, so the tracker reflects the close even when cancel doesn't bump pane generation |
7171
| `WindowAppeared(label)` | A `webview_windows()` entry matches the label (exact, or `viewer-*`) | `dialog open settings|file-viewer`, `dialog focus` |
72-
| `WindowDisappeared` | The matching `webview_windows()` entry is gone | `dialog close settings|file-viewer|about` |
73-
| `Any([...])` | Logical OR — any inner signal fires | `open_under_cursor` (directory case bumps generation, file case opens a viewer window) |
72+
| `WindowDisappeared(label)` | The matching `webview_windows()` entry is gone | `dialog close settings` (single window family) |
73+
| `WindowCountBelow {prefix, threshold}` | Count of matching windows is `< threshold` | `dialog close file-viewer` — snapshot the count, ack when one closes (don't wait for *all* viewers to vanish) |
74+
| `Any([...])` | Logical OR — any inner signal fires | Reserved for future multi-mode tools |
7475

7576
The polling cadence is 250 ms for state-driven signals (matching the existing `await` tool) and 100 ms for window/soft-dialog signals (both react faster than a full pane state push).
7677

7778
**Gotcha/Why**: `dialog close <settings>` requires the settings window to listen for the `mcp-settings-close` event and close itself (`apps/desktop/src/routes/settings/+page.svelte`). Without that listener the backend keeps polling for `WindowDisappeared("settings")` and times out at 1500 ms while the window stays put. Same shape applies if you add new window-based dialogs: the FE side has to opt in.
7879

79-
The 1500 ms budget is a backend-side latency budget, not a client-facing knob: MCP clients shouldn't have to tune ack timeouts. Bump it per-call via the `Duration` argument to `wait_for_ack` if a specific operation has a known higher latency floor; don't expose it as a tool parameter.
80+
The 1500 ms budget is a backend-side latency budget, not a client-facing knob: MCP clients shouldn't have to tune ack timeouts. Bump it per-call via the `Duration` argument to `wait_for_ack` if a specific operation has a known higher latency floor; don't expose it as a tool parameter. The navigation family (`nav_to_parent`, `nav_back`, `nav_forward`, `open_under_cursor`) uses `NAV_ACK_TIMEOUT` (5 s) because a parent/back navigation can land on an SMB or MTP path whose directory listing takes a few seconds even on success. `nav_to_path` and `select_volume` use higher round-trip budgets via `mcp_round_trip_with_timeout` and aren't part of the ack-contract family.
8081

81-
**Note on `update_pane_tabs`.** Tab changes flow through this command (not `set_left`/`set_right`), so it also bumps the generation counter. Without that, the `tab` tool's ack would time out on every call.
82+
**Caveat: `GenerationAdvanced` isn't a per-action proof.** The snapshot-dispatch-wait sequence proves the FE pushed pane state *recently after* dispatch — not that the FE specifically handled our event. An unrelated push between snapshot and dispatch (the other pane's watcher, a tab refresh) will satisfy the signal as a false positive. The window is microseconds wide and the FE was clearly running (something pushed), so this is a much weaker false-positive class than the original "always OK" bug. If a real false positive ever surfaces, the fix is to switch the affected tool to `mcp_round_trip` with a request id. Tagged `TODO(mcp-ack):` in `ack.rs`.
83+
84+
**Note on `update_pane_tabs`.** Tab changes flow through this command (not `set_left`/`set_right`). It delegates to `PaneStateStore::set_tabs`, which is the single place tab mutation + generation bump live. Add any new tab-mutating path through that helper, or the `tab` MCP tool's ack will time out.
8285

8386
**Known TODO: `refresh` is still fire-and-forget.** The FE skips the `update_*_pane_state` push when the new listing is byte-identical to the cached state (correct behavior for state sync but no signal for the ack helper). Switching `refresh` to `mcp_round_trip` is the right follow-up, but requires a FE change to emit `mcp-response` after every re-list. The original "FE silently dropping the action" bug is less acute for refresh than for destructive tools, so this stays open. Search the codebase for `TODO(mcp-ack):` to find this and any future similar cases.
8487

85-
Tools where the backend can't fully validate preconditions use `mcp_round_trip`: emit an event with a `requestId`, wait for the frontend to respond via `mcp-response` with `{ requestId, ok, error? }`, and return the actual outcome. Used by `move_cursor` and `set_setting` (5s timeout). `nav_to_path` uses `mcp_round_trip_with_timeout` with a 30s timeout because it waits for the directory listing to complete (the frontend delays the response until `handleListingComplete` fires in FilePane). Resources that need frontend data use `resource_round_trip` (same pattern but returns `data` from the response). Used by `cmdr://settings`. Use these patterns for any new tool/resource where the backend would otherwise need to replicate frontend knowledge.
88+
Tools where the backend can't fully validate preconditions use `mcp_round_trip`: emit an event with a `requestId`, wait for the frontend to respond via `mcp-response` with `{ requestId, ok, error? }`, and return the actual outcome. Used by `move_cursor` and `set_setting` (5 s timeout). `nav_to_path` uses `mcp_round_trip_with_timeout` with a 30 s timeout because it waits for the directory listing to complete (the frontend delays the response until `handleListingComplete` fires in FilePane). `open_under_cursor` also uses `mcp_round_trip_with_timeout` (5 s) because Enter on a non-directory file delegates to the OS default app (no pane state push, no viewer window), so neither `GenerationAdvanced` nor `WindowAppeared` would fire — the FE explicitly awaits `handleNavigate(entry)` and signals completion. Resources that need frontend data use `resource_round_trip` (same pattern but returns `data` from the response). Used by `cmdr://settings`. Use these patterns for any new tool/resource where the backend would otherwise need to replicate frontend knowledge.
8689

8790
### Configuration (`config.rs`)
8891

@@ -114,7 +117,7 @@ Directory module split by test category:
114117

115118
### MCP action tools wait for backend ack before returning success
116119

117-
**Decision (May 2026):** Every fire-and-forget action tool waits for a typed ack signal (`AckSignal::GenerationAdvanced`, `SoftDialogAppeared`, `WindowAppeared`, `WindowDisappeared`, or `Any`) within a 1500 ms budget before returning `OK`. On timeout, the tool returns a `ToolError::internal` whose message names the missing signal and elapsed budget.
120+
**Decision (May 2026):** Every fire-and-forget action tool waits for a typed ack signal (`AckSignal::GenerationAdvanced`, `SoftDialogAppeared`/`Disappeared`, `WindowAppeared`/`Disappeared`, `WindowCountBelow`, or `Any`) within a 1500 ms budget (5 s for the nav family) before returning `OK`. On timeout, the tool returns a `ToolError::internal` whose message names the missing signal and elapsed budget.
118121

119122
**Why.** Real QA hit a paper-cut: MCP tools were returning `OK` while the FE was stalled (modal blocking input, error pane up, race during startup), so the dispatched action was silently dropped. That made MCP unreliable as an automation surface. The ack contract makes `OK` a real promise: the FE actually processed the dispatched action.
120123

@@ -201,7 +204,7 @@ The `cmdr://settings` resource and `set_setting` tool both use round-trips to th
201204

202205
`execute_tool()` is an async function. Action tools follow the ack contract (see "Action-tool ack contract" above): dispatch the event, then `wait_for_ack` for a small backend-side signal before returning. The tool's reported "OK" thus means "the FE accepted the dispatched action," not "the underlying operation completed." For long-running operations (a copy of 10 GB), the agent still has to poll via the `await` tool to observe completion. The ack-contract change made the FE-accepted line meaningful — pre-May 2026, the tool returned `OK` even when the FE wasn't listening.
203206

204-
Three categories of latency-sensitive tools exist beyond the ack contract: (1) `mcp_round_trip` tools (`nav_to_path`, `move_cursor`, `set_setting`) that wait up to 5–30 s for an explicit `mcp-response` event with success/failure, (2) search tools (`search`, `ai_search`) that load the search index via `spawn_blocking` and (for `ai_search`) call the LLM API, (3) `select_volume` which polls until the target pane's `volume_name` matches.
207+
Three categories of latency-sensitive tools exist beyond the ack contract: (1) `mcp_round_trip` tools (`nav_to_path`, `move_cursor`, `set_setting`, `open_under_cursor`) that wait up to 5–30 s for an explicit `mcp-response` event with success/failure, (2) search tools (`search`, `ai_search`) that load the search index via `spawn_blocking` and (for `ai_search`) call the LLM API, (3) `select_volume` which polls until the target pane's `volume_name` matches.
205208

206209
### Error codes are JSON-RPC standard
207210

apps/desktop/src-tauri/src/mcp/executor/ack.rs

Lines changed: 77 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -17,9 +17,29 @@
1717
//! - `WindowAppeared` / `WindowDisappeared`: a Tauri webview window with the given label
1818
//! prefix appeared (or vanished). Use this for child windows (settings, file-viewer,
1919
//! about) and for `dialog close` actions.
20-
//! - `Any`: succeeds when any of the inner signals fire (logical OR). Used by
21-
//! `open_under_cursor` where opening a directory bumps the pane generation but opening
22-
//! a file launches a viewer window.
20+
//! - `WindowCountBelow`: the number of windows matching a label prefix is strictly less
21+
//! than a snapshotted count. Use this for close-one-of-many scenarios (closing a
22+
//! specific `viewer-*` window by path) where `WindowDisappeared` would only fire when
23+
//! ALL viewers are gone.
24+
//!
25+
//! Multi-mode tools that can produce different signals depending on what happened (the
26+
//! original `open_under_cursor` was the only such case) live outside the ack contract
27+
//! entirely — they use `mcp_round_trip` and let the FE explicitly signal completion. An
28+
//! earlier `AckSignal::Any` variant tried to OR these together but couldn't cover the
29+
//! OS-open branch (Enter on a non-directory file produces neither a state push nor a
30+
//! viewer window); round-trip is the only honest ack for that shape.
31+
//!
32+
//! ## Caveat: `GenerationAdvanced` is not a per-action ack
33+
//!
34+
//! `snapshot_generation` + dispatch + wait for `GenerationAdvanced` proves the FE pushed
35+
//! pane state recently after dispatch — not that it specifically handled this action. An
36+
//! unrelated state push (other pane's MTP watcher, a tab refresh) between the snapshot
37+
//! and the dispatch can satisfy the signal without the FE having processed our event.
38+
//! In practice this is a much weaker false-positive class than the original "always OK"
39+
//! bug (the FE was almost certainly running, since something pushed state), so the
40+
//! contract is acceptable. Stronger guarantees would require either a request-id-based
41+
//! `mcp-response` round-trip (see `mcp_round_trip`) or per-tool FE acks. TODO(mcp-ack):
42+
//! revisit if real-world false positives surface.
2343
//!
2444
//! ## Timeout
2545
//!
@@ -48,6 +68,13 @@ use crate::mcp::pane_state::PaneStateStore;
4868
/// Default ack budget. Backend-side latency budget; not a client-facing knob.
4969
pub const DEFAULT_ACK_TIMEOUT: Duration = Duration::from_millis(1500);
5070

71+
/// Ack budget for navigation tools whose state push can include a directory listing
72+
/// against a remote backend (SMB/MTP). Local paths still ack in ~50 ms; remote paths
73+
/// can take a few seconds end-to-end. 5 s strikes a balance: still bounded, but no
74+
/// spurious timeouts on a working remote share. `nav_to_path` and `select_volume`
75+
/// use higher round-trip budgets via `mcp_round_trip_with_timeout`.
76+
pub const NAV_ACK_TIMEOUT: Duration = Duration::from_secs(5);
77+
5178
/// Polling cadence for state-driven signals. Matches the existing `await` tool.
5279
const STATE_POLL_INTERVAL: Duration = Duration::from_millis(250);
5380

@@ -70,11 +97,15 @@ pub enum AckSignal {
7097
/// A Tauri webview window whose label equals (or starts with, for viewers)
7198
/// the given pattern appeared.
7299
WindowAppeared(&'static str),
73-
/// A Tauri webview window matching the pattern vanished.
100+
/// A Tauri webview window matching the pattern vanished. For prefix families
101+
/// like `viewer`, this fires only when zero matching windows remain; use
102+
/// `WindowCountBelow` to wait for *one* viewer to close while others stay open.
74103
WindowDisappeared(&'static str),
75-
/// Succeeds when any inner signal fires. Used for tools where the ack
76-
/// kind depends on what got opened (for example `open_under_cursor`).
77-
Any(Vec<AckSignal>),
104+
/// The number of webview windows matching the pattern is strictly less than
105+
/// `threshold`. Used to ack "close one of N viewers" cleanly: snapshot the
106+
/// count, dispatch the close, wait for the count to drop. For close-all,
107+
/// use `threshold: 1` (i.e., wait for count to reach 0).
108+
WindowCountBelow { prefix: &'static str, threshold: usize },
78109
}
79110

80111
impl AckSignal {
@@ -88,9 +119,8 @@ impl AckSignal {
88119
AckSignal::SoftDialogDisappeared(id) => format!("soft dialog '{id}' closed"),
89120
AckSignal::WindowAppeared(label) => format!("window '{label}' opened"),
90121
AckSignal::WindowDisappeared(label) => format!("window '{label}' closed"),
91-
AckSignal::Any(signals) => {
92-
let parts: Vec<String> = signals.iter().map(|s| s.describe()).collect();
93-
format!("any of [{}]", parts.join(", "))
122+
AckSignal::WindowCountBelow { prefix, threshold } => {
123+
format!("window count for '{prefix}' < {threshold}")
94124
}
95125
}
96126
}
@@ -150,39 +180,56 @@ fn check_signal<R: Runtime>(app: &AppHandle<R>, signal: &AckSignal) -> bool {
150180
AckSignal::SoftDialogDisappeared(id) => app
151181
.try_state::<SoftDialogTracker>()
152182
.map(|tracker| !tracker.get_open_types().iter().any(|d| d == id))
153-
// If the tracker isn't registered (test contexts), treat the dialog as
154-
// gone — there's nothing to wait for.
183+
// Asymmetry vs. SoftDialogAppeared (which returns false when the tracker
184+
// isn't registered): if the tracker isn't there, no dialog is open either,
185+
// so "this dialog is gone" is trivially true. The Appeared variant must
186+
// wait — without a tracker it can never see the dialog. Lets unit tests
187+
// drive close paths without spinning up a tracker fixture, while keeping
188+
// the open path strict.
155189
.unwrap_or(true),
156190
AckSignal::WindowAppeared(pattern) => window_matches(app, pattern),
157191
AckSignal::WindowDisappeared(pattern) => !window_matches(app, pattern),
158-
AckSignal::Any(signals) => signals.iter().any(|s| check_signal(app, s)),
192+
AckSignal::WindowCountBelow { prefix, threshold } => count_windows_matching(app, prefix) < *threshold,
159193
}
160194
}
161195

162196
/// True if any Tauri webview window has a label exactly equal to `pattern`,
163197
/// or (for the `viewer` family) starting with `pattern-`.
164198
fn window_matches<R: Runtime>(app: &AppHandle<R>, pattern: &str) -> bool {
199+
count_windows_matching(app, pattern) > 0
200+
}
201+
202+
/// Count of Tauri webview windows matching `pattern`. For prefix families
203+
/// (`viewer`), counts every `viewer-*` window; for exact labels (`settings`,
204+
/// `file-viewer-help`, etc.), returns 0 or 1.
205+
fn count_windows_matching<R: Runtime>(app: &AppHandle<R>, pattern: &str) -> usize {
165206
let windows = app.webview_windows();
166-
// `viewer-` is a label prefix family — each opened file has its own
167-
// `viewer-<id>` window. Other tracked labels (settings, about) are exact.
168207
if pattern == "viewer" {
169-
windows.keys().any(|k| k.starts_with("viewer-"))
208+
windows.keys().filter(|k| k.starts_with("viewer-")).count()
209+
} else if windows.contains_key(pattern) {
210+
1
170211
} else {
171-
windows.contains_key(pattern)
212+
0
172213
}
173214
}
174215

216+
/// Snapshot the current count of Tauri webview windows matching `prefix`.
217+
/// Use with `WindowCountBelow { threshold: snapshot }` to ack on "one closed."
218+
pub fn snapshot_window_count<R: Runtime>(app: &AppHandle<R>, prefix: &'static str) -> usize {
219+
count_windows_matching(app, prefix)
220+
}
221+
175222
/// Whether any leaf in the signal tree references windows or soft dialogs.
176223
/// Both are FE-side mutations that don't require a full pane-state push, so
177224
/// they should react with the tighter cadence.
178225
fn signal_uses_windows(signal: &AckSignal) -> bool {
179226
match signal {
180227
AckSignal::WindowAppeared(_)
181228
| AckSignal::WindowDisappeared(_)
229+
| AckSignal::WindowCountBelow { .. }
182230
| AckSignal::SoftDialogAppeared(_)
183231
| AckSignal::SoftDialogDisappeared(_) => true,
184-
AckSignal::Any(signals) => signals.iter().any(signal_uses_windows),
185-
_ => false,
232+
AckSignal::GenerationAdvanced { .. } => false,
186233
}
187234
}
188235

@@ -223,13 +270,13 @@ mod tests {
223270
assert!(closed.contains("closed"));
224271
assert!(AckSignal::WindowAppeared("settings").describe().contains("settings"));
225272
assert!(AckSignal::WindowDisappeared("settings").describe().contains("settings"));
226-
let any = AckSignal::Any(vec![
227-
AckSignal::GenerationAdvanced { from: 1 },
228-
AckSignal::WindowAppeared("viewer"),
229-
]);
230-
let s = any.describe();
231-
assert!(s.contains("any of"));
232-
assert!(s.contains("viewer"));
273+
let count = AckSignal::WindowCountBelow {
274+
prefix: "viewer",
275+
threshold: 3,
276+
}
277+
.describe();
278+
assert!(count.contains("viewer"));
279+
assert!(count.contains("3"));
233280
}
234281

235282
#[test]
@@ -239,10 +286,10 @@ mod tests {
239286
assert!(signal_uses_windows(&AckSignal::SoftDialogDisappeared(
240287
"mkdir-confirmation"
241288
)));
242-
assert!(signal_uses_windows(&AckSignal::Any(vec![
243-
AckSignal::GenerationAdvanced { from: 0 },
244-
AckSignal::WindowAppeared("viewer"),
245-
])));
289+
assert!(signal_uses_windows(&AckSignal::WindowCountBelow {
290+
prefix: "viewer",
291+
threshold: 1,
292+
}));
246293
}
247294

248295
// Verifies the core promise: once the generation strictly advances past

0 commit comments

Comments
 (0)