Commit 51b6102

Error reports: Add Flow B auto-send (opt-in, debounced)

- New `error_reporter::auto_dispatcher` module: 60s ± 10s debounce window per error burst, captures the first error's metadata, increments a counter for the rest, then ships a 1MB-tail bundle via the Phase 4 builder/uploader and emits `error-report-auto-sent` with the server ID.
- New `log_error!` macro: drop-in replacement for `log::error!` at user-visible failure sites. Fires the auto-dispatcher only when opted in (single atomic load).
- Migrated 7 user-visible error sites: copy/volume-copy failures, listing FE/BE index desync, AI server spawn/start failures, MCP server crash/start failures, and the indexing memory-watchdog forced stop.
- New setting `updates.errorReports` (boolean, default false) in the Updates section. Wired through `settings-registry`, `settings-applier`, `loader.rs`, and a new `set_error_reports_enabled` Tauri command. The dispatcher's opt-in flag is set in `lib.rs::setup` *before* any user-visible error path can fire.
- New `AutoSendToastContent.svelte` + `auto-send-toast.svelte.ts` listener: shows "Error report sent / Reference ID: ERR-XXXXX" with View and Change settings actions, auto-dismiss after 10s. Mounted from `(main)/+layout.svelte`.
- Tests: 6 unit tests for the dispatcher (debounce, opt-in, first-call-wins, jitter band, crash-loop note) + a11y/unit tests for the toast.
- Docs: updated `error_reporter/CLAUDE.md` (Flow B mechanics, jitter rationale, crash-loop semantics, `log_error!` convention), `error-reporter/CLAUDE.md` (Flow B toast), and `settings/CLAUDE.md` (new field).
1 parent 6d904aa commit 51b6102

26 files changed

Lines changed: 973 additions & 24 deletions

apps/desktop/src-tauri/src/ai/manager.rs

Lines changed: 3 additions & 3 deletions
@@ -415,7 +415,7 @@ pub fn configure_ai<R: Runtime>(
             Some((pid, port))
         }
         Err(e) => {
-            log::error!("AI configure: couldn't spawn server: {e}");
+            crate::log_error!("AI configure: couldn't spawn server: {e}");
             None
         }
     }
@@ -434,7 +434,7 @@ pub fn configure_ai<R: Runtime>(
             log::info!("AI: server ready");
             let _ = app.emit("ai-server-ready", ());
         }
-        Err(e) => log::error!("AI manager: server didn't start: {e}"),
+        Err(e) => crate::log_error!("AI manager: server didn't start: {e}"),
     }
     let mut manager = MANAGER.lock_ignore_poison();
     if let Some(ref mut m) = *manager {
@@ -504,7 +504,7 @@ pub fn start_ai_server<R: Runtime>(app: AppHandle<R>, ctx_size: u32) -> Result<(
             log::info!("AI: server ready");
             let _ = app.emit("ai-server-ready", ());
         }
-        Err(e) => log::error!("AI manager: server didn't start: {e}"),
+        Err(e) => crate::log_error!("AI manager: server didn't start: {e}"),
     }
     let mut manager = MANAGER.lock_ignore_poison();
     if let Some(ref mut m) = *manager {

apps/desktop/src-tauri/src/commands/settings.rs

Lines changed: 7 additions & 0 deletions
@@ -101,6 +101,13 @@ pub fn set_max_log_storage_mb(value: u64) -> Result<(), String> {
     Ok(())
 }
 
+/// Enable or disable the Flow B error-report auto-dispatcher.
+/// Pushed live from the frontend whenever `updates.errorReports` changes.
+#[tauri::command]
+pub fn set_error_reports_enabled(value: bool) {
+    crate::error_reporter::auto_dispatcher::set_enabled(value);
+}
+
 /// Update menu accelerator for a command.
 /// Called from frontend when keyboard shortcuts are changed.
 #[tauri::command]

apps/desktop/src-tauri/src/error_reporter/CLAUDE.md

Lines changed: 74 additions & 4 deletions
@@ -39,10 +39,12 @@ SMB URIs, and UNC paths. See the redact module for the full pattern table.
 
 ## Files
 
-| File | Purpose |
-| ------------ | ---------------------------------------------------------------- |
-| `mod.rs` | `build_bundle`, `build_zip`, `generate_short_id`, `upload`, `cap_bundle_to_mb`, `save_bundle_to_disk` (debug) |
-| `tests.rs` | Unit tests: zip structure, redaction, ID format/uniqueness, capping |
+| File | Purpose |
+| -------------------------- | ------------------------------------------------------------------------------------------------------------- |
+| `mod.rs` | `build_bundle`, `build_zip`, `generate_short_id`, `upload`, `cap_bundle_to_mb`, `save_bundle_to_disk` (debug); also exports the `log_error!` macro |
+| `tests.rs` | Unit tests: zip structure, redaction, ID format/uniqueness, capping |
+| `auto_dispatcher.rs` | Flow B: opt-in auto-send on user-visible errors (60 s ± 10 s debounce, 1 MB tail, no retry on failure) |
+| `auto_dispatcher_tests.rs` | Unit tests: debounce, opt-in flag, first-call wins, jitter band, crash-loop interaction |
 
 ## Two-command frontend split (rationale)

@@ -75,6 +77,74 @@ preview dialog doesn't apply a cap — log lines are already bounded by the rota
 (`advanced.maxLogStorageMb`, default 200 MB → keep up to four 50 MB files). The server
 enforces a 10 MB hard cap anyway.
 
+## Flow B (auto-send on error)
+
+Opt-in via the `updates.errorReports` setting (default off: Flow B sends data without
+per-event consent, so the consent has to be up front). When enabled:
+
+1. The `log_error!` macro routes select call sites through
+   `auto_dispatcher::on_error_logged(category, message)` in addition to the normal
+   `log::error!` emit.
+2. The first error in a 60 s window captures `(category, first_message, error_count = 1)`
+   and schedules a flush at `now + 60 s ± 10 s of jitter`. Subsequent errors in the same
+   window only bump the counter; the first-call metadata is kept verbatim.
+3. When the timer fires: build a bundle (`BundleKind::Auto`, user note carries the count
+   and a first-error preview), trim to a 1 MB tail via `cap_bundle_to_mb`, upload, and emit
+   `error-report-auto-sent` with the server-issued ID. The frontend listens for that
+   event and shows a confirmation toast (see `apps/desktop/src/lib/error-reporter/`).
+
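The window bookkeeping in step 2 can be sketched as follows. This is an illustration only, not the module's code: `Debouncer`, `PendingReport`, and `take` are hypothetical names, and the real dispatcher presumably keeps this state behind a lock and spawns an async task rather than returning a deadline to the caller.

```rust
use std::time::{Duration, Instant};

/// Metadata captured for one debounce window (field names are illustrative).
#[derive(Debug, Clone, PartialEq)]
struct PendingReport {
    category: String,
    first_message: String,
    error_count: u32,
    flush_at: Instant,
}

/// Hypothetical in-memory state: at most one window is open at a time.
#[derive(Default)]
struct Debouncer {
    pending: Option<PendingReport>,
}

impl Debouncer {
    /// Records an error. Returns `Some(flush_at)` when this call opened a new
    /// window (the caller should schedule a flush for that instant), or `None`
    /// when it only bumped the counter of the window that is already open.
    fn on_error_logged(
        &mut self,
        category: &str,
        message: &str,
        now: Instant,
        delay: Duration,
    ) -> Option<Instant> {
        match self.pending.as_mut() {
            Some(open) => {
                // First call wins: keep its metadata, count the rest.
                open.error_count += 1;
                None
            }
            None => {
                let flush_at = now + delay;
                self.pending = Some(PendingReport {
                    category: category.to_string(),
                    first_message: message.to_string(),
                    error_count: 1,
                    flush_at,
                });
                Some(flush_at)
            }
        }
    }

    /// Called when the timer fires: closes the window and hands back its data.
    fn take(&mut self) -> Option<PendingReport> {
        self.pending.take()
    }
}
```

Passing `now` and `delay` in as arguments keeps the sketch deterministic under test; the jittered delay would be computed by the caller.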
+### Why jitter?
+
+Without jitter, a global outage (DNS, an upstream API) triggers thousands of users to
+auto-send at the same `now + 60 s` instant. The ±10 s uniform spread costs nothing on
+the client and smears the load over a 20 s window server-side.
+
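As a rough illustration of the ±10 s spread, the sketch below picks a delay in the 50 s to 70 s band. It is not the dispatcher's code: `XorShift64` and `jittered_delay` are hypothetical names, and the tiny generator only stands in for whatever random source the real module uses.

```rust
use std::time::Duration;

const BASE_SECS: u64 = 60;
const JITTER_SECS: u64 = 10;

/// Tiny xorshift64 generator, used here only to keep the sketch free of
/// external crates. It is an assumption, not the project's RNG.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must never start from zero, so swap in a fixed constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Picks a flush delay between 50 s and 70 s inclusive, at millisecond
/// granularity. The modulo introduces a bias that is negligible for a
/// 20 001-value span drawn from a 64-bit output.
fn jittered_delay(rng: &mut XorShift64) -> Duration {
    let min_ms = (BASE_SECS - JITTER_SECS) * 1000; // 50_000 ms
    let span_ms = 2 * JITTER_SECS * 1000; // 20_000 ms
    let offset_ms = rng.next_u64() % (span_ms + 1);
    Duration::from_millis(min_ms + offset_ms)
}
```

In practice the seed would come from something that differs per client, for example the current time in nanoseconds, so that clients hit by the same outage do not pick the same offset.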
+### Why no retry on upload failure?
+
+We're already debounced at 60 s. If the network's flaky, the user will hit other errors
+soon and the next debounce window will fire normally. Retrying inside a single dispatch
+risks flooding the server during real outages, and the user still has Flow A as a manual
+safety net.
+
+### Crash-loop interaction (read this!)
+
+If the app exits inside the 60 s debounce window (for example, during a panic), the
+spawned flush task is dropped before it fires. **The auto-dispatcher does not flush on
+shutdown**, by design.
+
+This is fine because:
+
+- **Panics** route through `crash_reporter`, which writes a JSON file synchronously and
+  uploads it on the next launch. That covers the "app died" case end-to-end.
+- **Soft errors** that don't kill the app are exactly what the auto-dispatcher exists
+  for, and the next `log_error!` call after the next launch will start a fresh window.
+
+If a future scenario shows we're losing important reports here, the simplest fix is to
+add a debug-only "flush now" command and let panic hooks call it. Don't add a queue or
+on-disk persistence layer; the manual flow is the safety net (matches the FE log
+bridge's `beforeunload` semantics: best-effort, no durability guarantees).
+
+### `log_error!` convention
+
+Use `log_error!` instead of `log::error!` at user-visible failure sites: anything that
+already produces a user toast or that an end user would describe as "this didn't work."
+Skip noisy library-level errors (`smb2`, `nusb`, etc.); the goal is signal, not coverage.
+
+The macro forwards to `log::error!` unconditionally, then calls
+`auto_dispatcher::on_error_logged(target, message)` which bails out on a single atomic
+load when the opt-in flag is off.
+
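A minimal sketch of that forwarding shape, under stated assumptions: `eprintln!` stands in for `log::error!` so the example needs no external crate, `module_path!()` stands in for the log target, and the `DISPATCHED` counter is a hypothetical placeholder for the debounce logic. The real macro's body is not shown in this commit view.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Opt-in flag, off by default (mirrors `updates.errorReports`).
static ENABLED: AtomicBool = AtomicBool::new(false);
/// Hypothetical counter standing in for the debounce bookkeeping.
static DISPATCHED: AtomicUsize = AtomicUsize::new(0);

fn set_enabled(value: bool) {
    ENABLED.store(value, Ordering::Relaxed);
}

fn on_error_logged(_target: &str, _message: &str) {
    // Single atomic load: when the user has not opted in, nothing else runs.
    if !ENABLED.load(Ordering::Relaxed) {
        return;
    }
    DISPATCHED.fetch_add(1, Ordering::Relaxed);
}

/// Sketch of a `log_error!`-style macro. A macro exported across modules
/// would refer to the hook through a `$crate::...` path instead of a bare name.
macro_rules! log_error {
    ($($arg:tt)+) => {{
        let message = format!($($arg)+);
        // Always emit the log line, regardless of the opt-in flag.
        eprintln!("[ERROR {}] {}", module_path!(), message);
        // Then notify the dispatcher, which decides whether to act.
        on_error_logged(module_path!(), &message);
    }};
}
```

Because the format arguments are forwarded as raw tokens, call sites keep the same syntax as `log::error!`, including inline captures such as `{e}`.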
+The current set of migrated call sites is small and deliberate; expand it as we discover
+new user-visible errors. Do not bulk-migrate.
+
+### AppHandle wiring
+
+The macro can't thread an `AppHandle` through every call site, so
+`auto_dispatcher::set_app_handle(handle)` stashes one in a `OnceLock` at app startup
+(called from `lib.rs::setup` right after `crash_reporter::init`). If the handle isn't set
+yet (init order, unit tests), the dispatcher still updates the debounce counter but
+silently skips the spawn, which is acceptable and matches the "soft errors only" contract.
+
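The `OnceLock` pattern described here could look roughly like the sketch below. `FakeHandle` is a hypothetical stand-in for `tauri::AppHandle`, and the boolean return value exists only to make the "skip the spawn" branch observable; neither is taken from the actual module.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::OnceLock;

/// Stand-in for `tauri::AppHandle`, so the sketch compiles without Tauri.
#[derive(Debug)]
struct FakeHandle;

static APP_HANDLE: OnceLock<FakeHandle> = OnceLock::new();
static ERROR_COUNT: AtomicU32 = AtomicU32::new(0);

/// Stashes the handle once at startup. Later calls are ignored.
fn set_app_handle(handle: FakeHandle) {
    let _ = APP_HANDLE.set(handle);
}

/// Always updates the counter. Returns `true` only when a handle is
/// available, which is where real code would spawn the flush task.
fn on_error_logged() -> bool {
    ERROR_COUNT.fetch_add(1, Ordering::Relaxed);
    match APP_HANDLE.get() {
        Some(_handle) => true,
        // Handle not set yet (init order, unit tests): skip silently.
        None => false,
    }
}
```

A `OnceLock` fits this case because the handle is written exactly once and then only read, so readers never need to take a lock.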
 ## Gotchas
 
 - The cached `ActiveSettings` snapshot is built lazily on the first `build_bundle` call,
