Skip to content

[FancyZones] Harden three shutdown races in WorkArea / ZonesOverlay / OnThreadExecutor#48473

Open
crutkas wants to merge 3 commits into
microsoft:mainfrom
crutkas:user/crutkas/fix-fancyzones-shutdown-race
Open

[FancyZones] Harden three shutdown races in WorkArea / ZonesOverlay / OnThreadExecutor#48473
crutkas wants to merge 3 commits into
microsoft:mainfrom
crutkas:user/crutkas/fix-fancyzones-shutdown-race

Conversation

@crutkas

@crutkas crutkas commented Jun 11, 2026

Copy link
Copy Markdown
Member

Summary

Four small shutdown-/teardown-race fixes in FancyZones that I spotted while reading through the work-area and overlay teardown sequence for an unrelated review. Each one is independently safe in the happy path, but in combination they can crash the FancyZones host process during display changes, monitor configuration changes, a settings toggle mid-drag, or normal exit.

Issues fixed

1. ~ZonesOverlay joins a non-joinable thread when the constructor early-returns

ZonesOverlay::ZonesOverlay can return early in two places — if GetClientRect fails or if CreateHwndRenderTarget returns a failure HRESULT (both reachable in the wild during a display-driver TDR or when a monitor is disconnected mid-init). When that happens, m_renderThread is never started and stays default-constructed. The destructor unconditionally calls m_renderThread.join(), which on a non-joinable thread is undefined behavior (MSVC throws std::system_error); thrown from an implicit-noexcept destructor it calls std::terminate().

Fix: guard the wake-up-and-join sequence with if (m_renderThread.joinable()).

2. ~WorkArea returns the HWND to the window pool before the renderer is torn down

WorkArea's explicit destructor body calls windowPool.FreeZonesOverlayWindow(m_window) first, and only afterwards does implicit member destruction run ~ZonesOverlay (which joins the render thread). Between those two steps the HWND is back in the pool and immediately eligible for reuse by the next NewZonesOverlayWindow call, while the still-alive render thread is using m_renderTarget to draw into it. If the pool hands the same HWND to a freshly-built ZonesOverlay, two render targets target the same window concurrently.

Fix: reset m_zonesOverlay (which joins the render thread) before returning the window to the pool.

3. ~OnThreadExecutor writes _shutdown_request outside the mutex

The destructor mutates the shutdown flag without holding _task_mutex, then calls _task_cv.notify_one(). The worker checks the same flag inside _task_cv.wait(lock, predicate). The atomic does make the value visible eventually, but if the notify lands in the narrow window where the worker has just evaluated the predicate as false and is about to atomically release the lock and sleep, the wakeup can be missed and _worker_thread.join() hangs.

Fix: take _task_mutex around the _shutdown_request = true write so it pairs correctly with the cv.wait.

4. WindowMouseSnap keeps a dangling WorkArea* across WorkAreaConfiguration::Clear()

FancyZones::UpdateWorkAreas() rebuilds m_workAreaConfiguration whenever monitor state changes mid-session, and the SpanZonesAcrossMonitors settings toggle hits the same Clear(). If the user is mid-drag at the moment one of these runs, the WindowMouseSnap instance owned by FancyZones is still holding both a const reference to the map being cleared (m_activeWorkAreas) and a raw WorkArea* into one of the entries that's about to be destroyed (m_currentWorkArea). The next WM_MOUSEMOVE -> MoveSizeUpdate() then dereferences a freed pointer. WindowMouseSnap's destructor only resets window transparency, so relying on it doesn't help; the snapper has to be torn down explicitly.

Fix: call FancyZones::MoveSizeEnd() (which already tears down the snapper cleanly and is a no-op when the snapper is null) before each m_workAreaConfiguration.Clear() call on these paths.

Risk

Low. All four changes are localized to teardown / reconfiguration paths and only tighten existing destruction sequences — the steady-state behavior of ZonesOverlay::Render/Show/Hide, the work-area public API, OnThreadExecutor::submit/cancel, and WindowMouseSnap drag handling is unchanged. The WorkArea reordering is the most behavioral change; it now guarantees the render thread has stopped using the HWND before the pool can recycle it, which is what the existing implicit-member-destruction order already implied but couldn't enforce given the explicit destructor body.

Validation

Spot-built locally; this repo's dotnet restore runtime-pack issue (unrelated to this PR — same NU1102 pattern that's affecting other open PRs) prevents a full Build.cmd here, but the C++ FancyZones modules involved are unchanged in their public surface and are exercised by existing unit tests in FancyZonesTests for the WorkArea code paths.


ADO: https://microsoft.visualstudio.com/OS/_workitems/edit/54653316/

… OnThreadExecutor

1. ~ZonesOverlay: only join m_renderThread if it's joinable. The constructor can early-return when GetClientRect or CreateHwndRenderTarget fail (e.g. during a display-driver TDR), leaving the thread default-constructed; an unconditional join() then terminates the process from the destructor.

2. ~WorkArea: reset m_zonesOverlay (joining the render thread) before returning the HWND to windowPool. Otherwise the HWND becomes eligible for reuse by the next NewZonesOverlayWindow call while the still-live render thread continues to draw into it.

3. ~OnThreadExecutor: take _task_mutex around the _shutdown_request write so it pairs with the worker's _task_cv.wait predicate and the notify cannot be lost in the narrow window where the worker is about to sleep.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@crutkas crutkas added Needs-Review This Pull Request awaits the review of a maintainer. bug Something isn't working labels Jun 11, 2026
Comment thread src/modules/fancyzones/FancyZonesLib/WorkArea.cpp Fixed
@github-actions

This comment has been minimized.

crutkas and others added 2 commits June 10, 2026 23:02
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Follow-up read of the work-area teardown after the earlier shutdown-race
fixes turned up one more lifetime hazard, this time on the drag path.

`FancyZones::UpdateWorkAreas()` clears `m_workAreaConfiguration` (and
rebuilds it) whenever monitor state changes mid-session. The same Clear()
also runs from the `SpanZonesAcrossMonitors` settings toggle. In both
cases, if the user is mid-drag, the `WindowMouseSnap` instance owned by
`FancyZones` is still holding:

  - `m_activeWorkAreas`: `const` reference to the map being cleared, and
  - `m_currentWorkArea`: raw `WorkArea*` into one of the entries that's
    about to be destroyed.

The next `WM_MOUSEMOVE` -> `MoveSizeUpdate()` then dereferences a freed
pointer.

`WindowMouseSnap`'s destructor only resets window transparency, so
relying on it doesn't help; the snapper has to be torn down explicitly.
`FancyZones::MoveSizeEnd()` already does exactly that (calls
`snapper->MoveSizeEnd()`, disables dragging state, nulls the snapper)
and is a no-op when the snapper is null, so it's safe to call
unconditionally before each Clear().

Fix: call `MoveSizeEnd()` before each `m_workAreaConfiguration.Clear()`
call in `UpdateWorkAreas()` and in the `SpanZonesAcrossMonitors` branch
of `settingsUpdate()`. The `Destroy()` path is left alone — that one is
process teardown and the snapper is about to go away with the host.

---

ADO: https://microsoft.visualstudio.com/OS/_workitems/edit/54653316/

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Needs-Review This Pull Request awaits the review of a maintainer.

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants