Skip to content

Commit ed4b688

Browse files
committed
SMB: drop dead acquire_client + LoggedClientGuard
`acquire_client` had its sole hot-path caller removed in `efb15479` — the streaming write path now drives on an owned `Connection` clone via smb2 0.9's owned `FileWriter`. Nothing else called it. `LoggedClientGuard` existed only to instrument that one method's release point, so it goes with it. `clone_session` keeps its ticketed `client-mutex:` debug logs: that's the one remaining lock acquire on the SMB session and a useful future diagnostic. The mutex is held for microseconds per call now (just to clone the underlying `Arc<Connection>`), so the logs sit at `debug` where they fire only when explicitly enabled. Also updates the colocated CLAUDE.md and the historical comment in `write_from_stream` to point at the regression test that pins the no-deadlock shape, not the deleted Phase C QNAP test.
1 parent e750920 commit ed4b688

2 files changed

Lines changed: 7 additions & 95 deletions

File tree

apps/desktop/src-tauri/src/file_system/volume/CLAUDE.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -411,7 +411,7 @@ spawned detached task. This is safe because the stream always lives in an async
411411
**Why**: SMB servers return NFC-normalized filenames. macOS filesystem paths use NFD. The watcher NFD-normalizes filenames before constructing display paths used for cache lookups.
412412

413413
**Gotcha (no longer applicable as of smb2 0.9)**: SMB write streaming fallback used to hold the client mutex for the whole upload
414-
**Why**: Historically `FileWriter<'a>` borrowed `&'a mut Connection` from the `SmbClient`, so `write_from_stream` had to hold the client mutex for the duration of the streaming write. Under sustained concurrent pressure on QNAP this two-phase pattern (brief `clone_session` fast-path probe → drop → long `acquire_client` for the streaming fallback) deadlocked: Phase C `smb_integration_phase_c_deadlock_repro_qnap` reproduced it reliably. smb2 0.9 rebuilt `FileWriter` to own its `Connection` and `Arc<Tree>`, removing the borrow. See the new Decision below (`write_from_stream` uses a cloned Connection + Arc<Tree>) for the current design.
414+
**Why**: Historically `FileWriter<'a>` borrowed `&'a mut Connection` from the `SmbClient`, so `write_from_stream` had to hold the client mutex for the duration of the streaming write. Under sustained concurrent pressure this two-phase pattern (brief `clone_session` fast-path probe → drop → long mutex-guarded streaming fallback) deadlocked. smb2 0.9 rebuilt `FileWriter` to own its `Connection` and `Arc<Tree>`, removing the borrow. The regression is pinned by `smb_integration_concurrent_streaming_writes_no_deadlock`. See the new Decision below (`write_from_stream` uses a cloned Connection + Arc<Tree>) for the current design.
415415

416416
**Decision**: `write_from_stream` uses a cloned `Connection` + `Arc<Tree>` via smb2 0.9's owned `FileWriter`
417417
**Why**: smb2 0.9 made `FileWriter` own its `Connection` (cheap `Arc::clone`) and `Arc<Tree>` instead of borrowing `&'a mut Connection`. `write_from_stream` now calls `clone_session` once up front and drives both the compound fast-path AND the streaming fallback on the same owned `Connection` clone. The client mutex is held only for the few microseconds of `clone_session()`, never across I/O. The previous shape — fast-path on a clone, then drop and re-acquire the client for the streaming fallback — was the deadlock reproducer in Phase C against QNAP. The architectural property we get for free: N concurrent streaming writes on one `SmbVolume` pipeline N WRITE chains over a single SMB session, multiplexed by `MessageId` in smb2's receiver task. No external locking, no mutex contention on the hot copy path.

apps/desktop/src-tauri/src/file_system/volume/smb.rs

Lines changed: 6 additions & 94 deletions
Original file line numberDiff line numberDiff line change
@@ -94,51 +94,8 @@ impl ConnectionState {
9494
}
9595
}
9696

97-
// ── Client-mutex instrumentation ────────────────────────────────────
98-
//
99-
// Pure observability around the `client` mutex on `SmbVolume`. Helps
100-
// diagnose deadlocks where the lock appears to be held indefinitely by
101-
// an unknown task. Every acquire allocates a process-global ticket and
102-
// emits three debug logs: `waiting`, `acquired`, and `released`. Grep
103-
// one log file for `client-mutex:` to reconstruct who holds the lock,
104-
// who's queued behind them, and how long each hold lasts.
105-
10697
static CLIENT_LOCK_TICKET: std::sync::atomic::AtomicU64 = std::sync::atomic::AtomicU64::new(0);
10798

108-
/// RAII wrapper around the client `MutexGuard` that logs on Drop so the
109-
/// release event is captured even if the caller returns early. Transparent
110-
/// to callers via `Deref` / `DerefMut` over `Option<SmbClient>`.
111-
struct LoggedClientGuard<'a> {
112-
inner: tokio::sync::MutexGuard<'a, Option<SmbClient>>,
113-
ticket: u64,
114-
caller: &'static str,
115-
acquired_at: std::time::Instant,
116-
}
117-
118-
impl<'a> std::ops::Deref for LoggedClientGuard<'a> {
119-
type Target = Option<SmbClient>;
120-
fn deref(&self) -> &Self::Target {
121-
&self.inner
122-
}
123-
}
124-
125-
impl<'a> std::ops::DerefMut for LoggedClientGuard<'a> {
126-
fn deref_mut(&mut self) -> &mut Self::Target {
127-
&mut self.inner
128-
}
129-
}
130-
131-
impl<'a> Drop for LoggedClientGuard<'a> {
132-
fn drop(&mut self) {
133-
log::debug!(
134-
"client-mutex: released ticket={} caller={} held_for={:?}",
135-
self.ticket,
136-
self.caller,
137-
self.acquired_at.elapsed()
138-
);
139-
}
140-
}
141-
14299
// ── Type mapping helpers ────────────────────────────────────────────
143100

144101
/// Converts an `smb2::FileTime` to seconds since the Unix epoch, matching
@@ -508,53 +465,6 @@ impl SmbVolume {
508465
}
509466
}
510467

511-
/// Acquires the client mutex and returns a guard over the `Option<SmbClient>`.
512-
/// Checks connection state first, then verifies the client is present.
513-
///
514-
/// **No longer called from any hot path.** Every read and write goes
515-
/// through `clone_session` and drives its op on an owned
516-
/// `Connection` clone (cheap `Arc::clone`) — including the streaming
517-
/// write path since smb2 0.9's owned `FileWriter`. This method is
518-
/// kept for now in case future reconnect / diagnostic paths need a
519-
/// guarded `&mut SmbClient` (the `SmbClient` itself is `!Clone`).
520-
/// Candidate for removal once we're confident no such caller is
521-
/// coming back; see also `LoggedClientGuard`.
522-
#[allow(dead_code)]
523-
async fn acquire_client(&self) -> Result<LoggedClientGuard<'_>, VolumeError> {
524-
self.check_connection()?;
525-
let ticket = CLIENT_LOCK_TICKET.fetch_add(1, Ordering::Relaxed);
526-
let start = std::time::Instant::now();
527-
log::debug!(
528-
"client-mutex: waiting ticket={} caller=acquire_client share={}",
529-
ticket,
530-
self.share_name
531-
);
532-
let guard = self.client.lock().await;
533-
log::debug!(
534-
"client-mutex: acquired ticket={} caller=acquire_client share={} waited={:?}",
535-
ticket,
536-
self.share_name,
537-
start.elapsed()
538-
);
539-
if guard.is_none() {
540-
// Drop the unwrapped guard explicitly so the released-log
541-
// doesn't fire for a "no-op" acquire that immediately bails.
542-
drop(guard);
543-
log::debug!(
544-
"client-mutex: released ticket={} caller=acquire_client held_for={:?} (no-session-bail)",
545-
ticket,
546-
start.elapsed()
547-
);
548-
return Err(VolumeError::DeviceDisconnected("SMB session not available".to_string()));
549-
}
550-
Ok(LoggedClientGuard {
551-
inner: guard,
552-
ticket,
553-
caller: "acquire_client",
554-
acquired_at: start,
555-
})
556-
}
557-
558468
/// Reads out a clone of `Arc<Tree>`. Cheap (`Arc::clone`).
559469
async fn tree_arc(&self) -> Result<Arc<Tree>, VolumeError> {
560470
self.check_connection()?;
@@ -1821,10 +1731,12 @@ impl Volume for SmbVolume {
18211731
// receiver task multiplexes responses by `MessageId`.
18221732
//
18231733
// This collapses the historical two-phase pattern (brief
1824-
// `clone_session` for the fast-path → drop → long `acquire_client`
1825-
// for the streaming fallback) into a single-phase clone. The old
1826-
// shape deadlocked under sustained concurrent pressure on QNAP
1827-
// (Phase C `smb_integration_phase_c_deadlock_repro_qnap`).
1734+
// `clone_session` for the fast-path → drop → long
1735+
// session-mutex hold for the streaming fallback) into a single
1736+
// clone. The old shape deadlocked under sustained concurrent
1737+
// pressure; the regression test
1738+
// `smb_integration_concurrent_streaming_writes_no_deadlock`
1739+
// pins this shape.
18281740
Box::pin(async move {
18291741
let smb_path = self.to_smb_path(dest);
18301742

0 commit comments

Comments
 (0)