Fix runner APPLICATION_HANG_QUIESCE: handle WM_ENDSESSION and skip blocking shutdown cleanup #48363

Open: crutkas wants to merge 3 commits into microsoft:main from crutkas:user/crutkas/fix-runner-quiesce-hang

@crutkas crutkas commented Jun 7, 2026

Summary

The runner WndProc (tray_icon_window_proc) does not handle WM_QUERYENDSESSION / WM_ENDSESSION, and its WM_DESTROY teardown performs blocking cross-process cleanup. Both contribute to the Watson failure APPLICATION_HANG_QUIESCE_cfffffff_PowerToys.exe!run_message_loop on OS shutdown, sign-out, or restart:

  1. Without a WM_ENDSESSION handler, DefWindowProc returns 0 without posting a quit message, so run_message_loop stays parked in GetMessageW until the OS quiesce timeout (~5 s) force-terminates the process.
  2. Even once teardown starts, WM_DESTROY calls close_settings_window(), which blocks up to 1.5 s on WaitForSingleObject against PowerToys.Settings.exe (src/runner/settings_window.cpp:712), plus Shell_NotifyIcon(NIM_DELETE) during explorer teardown. The Windows shutdown guidance is explicit that handlers must not block — that wait eats the limited quiesce budget.

This PR addresses both: a reusable helper makes the runner unwind promptly, and the runner skips the blocking cleanup the OS is already reaping in parallel.

Supersedes #48378 (same Watson bucket) by combining its no-blocking-cleanup fix with a reusable helper + unit tests. The cleanup-skip insight is credited to @yeelam-gordon.

Related (same failure class, different binary): #41260.

Root cause

In src/runner/tray_icon.cpp, tray_icon_window_proc had no case for WM_QUERYENDSESSION / WM_ENDSESSION, and WM_DESTROY unconditionally ran cross-process cleanup. On an OS-initiated shutdown the OS delivers WM_ENDSESSION to PowerToys.Settings.exe directly and reaps it independently, so the runner's 1.5 s wait on it is pure dead time against the quiesce clock.

Fix

1. Reusable helper in src/common/utils/window.h

handle_session_end_message:

  • WM_QUERYENDSESSION → returns TRUE (the runner has no unsaved user state to negotiate over).
  • WM_ENDSESSION with wparam == TRUE → sets the optional out_session_ending flag, then DestroyWindow(window), which drives the existing WM_DESTROY → PostQuitMessage(0) path and lets run_message_loop unwind cleanly.
  • WM_ENDSESSION with wparam == FALSE (another app vetoed shutdown) → leaves the window alone.

The optional bool* out_session_ending out-param lets each caller decide whether to skip its own blocking teardown on shutdown, without baking module-specific cleanup semantics into the shared helper.

tray_icon_window_proc calls the helper at the top of its dispatch and returns early when the message is handled.
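
The three cases above can be modeled as a small decision function. This is a minimal sketch, not the code in window.h: the PR description does not show the helper's signature, so `decide_session_end` and `SessionEndDecision` are hypothetical names, and the message IDs are copied from winuser.h so the sketch builds without the Windows SDK. The sketch also assumes the cancelled case (wparam == FALSE) is reported as handled, which the description does not state.

```cpp
#include <cstdint>

// Message IDs as defined in <winuser.h>, repeated so this builds anywhere.
constexpr std::uint32_t WM_QUERYENDSESSION_ID = 0x0011;
constexpr std::uint32_t WM_ENDSESSION_ID = 0x0016;

// Hypothetical result type: what the calling WndProc should do next.
struct SessionEndDecision
{
    bool handled = false;        // true: return `result` and skip normal dispatch
    std::intptr_t result = 0;    // value the WndProc returns (LRESULT in Win32)
    bool destroy_window = false; // true: the real helper would call DestroyWindow
};

// Hypothetical model of the helper's decision logic.
inline SessionEndDecision decide_session_end(std::uint32_t message,
                                             std::uintptr_t wparam,
                                             bool* out_session_ending = nullptr)
{
    SessionEndDecision decision;
    if (message == WM_QUERYENDSESSION_ID)
    {
        decision.handled = true;
        decision.result = 1; // TRUE: no unsaved state, so never veto shutdown
        return decision;
    }
    if (message == WM_ENDSESSION_ID)
    {
        decision.handled = true; // assumption: both TRUE and FALSE count as handled
        if (wparam != 0)
        {
            // Set the flag before teardown so WM_DESTROY can already see it.
            if (out_session_ending != nullptr)
            {
                *out_session_ending = true;
            }
            decision.destroy_window = true;
        }
        return decision; // wparam == FALSE: shutdown was vetoed, leave the window alone
    }
    return decision; // unrelated message: fall through to the normal switch
}
```

Returning a decision rather than calling DestroyWindow directly is only a convenience for running the logic off Windows; the helper described above performs the DestroyWindow call itself.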

2. Skip blocking cleanup on shutdown in src/runner/tray_icon.cpp

WM_DESTROY now branches on g_session_ending:

  • User-initiated close (tray → Exit): unchanged — full graceful cleanup (Shell_NotifyIcon(NIM_DELETE) + close_settings_window()).
  • OS-initiated shutdown: PostQuitMessage(0) only; the OS reaps Settings.exe and the tray icon for us. The loop unwinds in milliseconds instead of waiting on a 1.5 s cross-process handle.

Why a helper instead of inline cases?

There are 9 other places in PowerToys with the same pattern (separate run_message_loop / GetMessageW callers, no WM_ENDSESSION handling): FancyZones, AlwaysOnTop, KeyboardManagerEngine, WorkspacesWindowArranger, MeasureTool overlay, CropAndLock, GrabAndMove, ZoomIt SelectRectangle, and the notifications COM activator. They can be migrated to the same helper in follow-up PRs scoped per-module — each opts into the out_session_ending skip based on its own cleanup semantics. This PR is intentionally limited to the highest-volume contributor (the always-on runner).

Why not centralize handling inside run_message_loop?

WM_QUERYENDSESSION / WM_ENDSESSION are delivered via SendMessage and invoke the WndProc directly during the GetMessage call — they never appear as a MSG returned to the loop, so the loop cannot observe them. The handler must live in (or be reachable from) the WndProc.
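
A toy model may make the sent-versus-posted distinction concrete. `ToyThread` is a hypothetical stand-in, not a Win32 API: it mimics only the documented GetMessage behavior of dispatching pending sent messages straight to the window procedure before returning the next posted message. Unlike the real GetMessage, it returns false instead of blocking when the queue is empty.

```cpp
#include <cstdint>
#include <deque>
#include <functional>

constexpr std::uint32_t kWmEndSession = 0x0016; // WM_ENDSESSION in <winuser.h>
constexpr std::uint32_t kWmApp = 0x8000;        // WM_APP in <winuser.h>

// Hypothetical model of one UI thread's message state.
struct ToyThread
{
    std::deque<std::uint32_t> posted;           // PostMessage-style: returned to the loop
    std::deque<std::uint32_t> sent;             // SendMessage-style: never returned to the loop
    std::function<void(std::uint32_t)> wndproc; // the window procedure

    // Dispatches all pending sent messages to wndproc, then hands the next
    // posted message to the caller. Returns false when nothing is posted.
    bool get_message(std::uint32_t& out)
    {
        while (!sent.empty())
        {
            const std::uint32_t message = sent.front();
            sent.pop_front();
            wndproc(message); // the loop body never observes this message
        }
        if (posted.empty())
        {
            return false;
        }
        out = posted.front();
        posted.pop_front();
        return true;
    }
};
```

In this model a loop built on `get_message` only ever sees `kWmApp`; `kWmEndSession` reaches `wndproc` alone, which is why the handler has to live in the WndProc.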

Tests

7 tests in src/common/UnitTests-CommonUtils/Window.Tests.cpp (all green — 18/18 in the WindowTests class):

| Test | Guards |
| --- | --- |
| HandleSessionEndMessage_QueryEndSession_AllowsShutdown | WM_QUERYENDSESSION returns TRUE. |
| HandleSessionEndMessage_EndSessionCancelled_DoesNotTearDown | WM_ENDSESSION(FALSE) must not destroy the window. |
| HandleSessionEndMessage_EndSessionConfirmed_TearsDownAndExitsLoop | WM_ENDSESSION(TRUE) destroys the window and run_message_loop exits < 500 ms. |
| HandleSessionEndMessage_UnrelatedMessage_NotHandled | Unrelated messages fall through untouched. |
| HandleSessionEndMessage_EndSessionConfirmed_SignalsSessionEnding | WM_ENDSESSION(TRUE) sets out_session_ending so WM_DESTROY can skip blocking cleanup. |
| HandleSessionEndMessage_EndSessionCancelled_DoesNotSignalSessionEnding | A cancelled shutdown does not flag session-ending. |
| HandleSessionEndMessage_QueryEndSession_DoesNotSignalSessionEnding | The query phase does not flag teardown. |

Build: runner.vcxproj and UnitTests-CommonUtils both build clean (x64|Debug, 0 warnings / 0 errors).

Manual validation

  1. Build PowerToys Debug|x64 and start the runner.
  2. Initiate a sign-off (logoff) or restart.
  3. Confirm Event Viewer (Windows Logs → Application) shows no Application Hang event for PowerToys.exe.
  4. Right-click tray → Exit: confirm Settings.exe shuts down gracefully and no ghost tray icon remains (the user-close path is unchanged).

(#48378 additionally captured real logoff/restart runs showing WM_ENDSESSION → WM_DESTROY completing in 1–8 ms with 0 hang events — the same code path this PR takes.)

Quality checklist

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The runner WndProc (tray_icon_window_proc) did not handle WM_QUERYENDSESSION / WM_ENDSESSION. On OS shutdown, sign-out, or restart, the messages fell through to DefWindowProc, which returns 0 for WM_ENDSESSION without posting a quit message. The main thread stayed parked in GetMessageW until the OS quiesce timeout fired and the process was force-terminated, producing the Watson failure APPLICATION_HANG_QUIESCE_cfffffff_PowerToys.exe!run_message_loop.

Adds a reusable helper handle_session_end_message() in common/utils/window.h that the runner WndProc now calls before its switch. WM_QUERYENDSESSION returns TRUE (we have no unsaved state); WM_ENDSESSION(TRUE) calls DestroyWindow, which routes through the existing WM_DESTROY -> PostQuitMessage(0) path and lets run_message_loop unwind cleanly. WM_ENDSESSION(FALSE) is preserved so cancelled shutdowns do not tear the runner down.

Adds four unit tests in UnitTests-CommonUtils/Window.Tests.cpp covering: WM_QUERYENDSESSION returns TRUE, WM_ENDSESSION(FALSE) does not destroy the window, WM_ENDSESSION(TRUE) tears down and the message loop exits within 500ms, and unrelated messages are not handled.

AB#55588441

Related: microsoft#41260

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@crutkas crutkas force-pushed the user/crutkas/fix-runner-quiesce-hang branch from 7b7733c to d8f8564 Compare June 7, 2026 04:12
crutkas added a commit to crutkas/autoUpgradeAttempt that referenced this pull request Jun 7, 2026
Defense-in-depth follow-up to the runner fix (PR microsoft#48363, addresses APPLICATION_HANG_QUIESCE_*_PowerToys.exe!run_message_loop).

The runner fix added handle_session_end_message in common/utils/window.h and wired it into the runner's tray WndProc. This change wires the same helper into every other long-lived PowerToys module that has a registered WndProc and a run_message_loop call, so that none of them can be force-terminated on shutdown / sign-out / restart.

Modules updated:
- FancyZones (FancyZones::s_WndProc)
- AlwaysOnTop (AlwaysOnTop::WndProc_Helper)
- GrabAndMove (main.cpp WndProc)
- ZoomIt SelectRectangle (SelectRectangle::WindowProc)
- MeasureTool overlays (MeasureToolWndProc and BoundsToolWndProc)

Modules deliberately not touched:
- KeyboardManagerEngine: no top-level window; uses thread-only messages and EventWaiter on TERMINATE_KBM_SHARED_EVENT.
- WorkspacesWindowArranger: run_message_loop() in main.cpp is commented out; the process is short-lived and does not park in GetMessage.
- CropAndLock main.cpp: no WndProc lives in main.cpp; top-level windows are managed by the CropAndLockWindow class.
- common/notifications/notifications.cpp: COM activator with no top-level window (CoRegisterClassObject + RPC apartment pump only).

Tests:
- The semantics of handle_session_end_message are exhaustively tested in src/common/UnitTests-CommonUtils/Window.Tests.cpp (four cases added in the runner PR).
- This change adds one more test: HandleSessionEndMessage_WiredIntoWndProc_ShutsDownCleanly. It builds a real window with a WndProc that follows the exact recipe used by the modules above, sends WM_QUERYENDSESSION and WM_ENDSESSION(TRUE) via SendMessage, and asserts the window is destroyed and the message loop exits promptly. This test validates the integration pattern, so per-module duplicate tests would only re-prove the same thing through much heavier scaffolding.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@crutkas crutkas added Needs-Review This Pull Request awaits the review of a maintainer. bug Something isn't working labels Jun 7, 2026
yeelam-gordon pushed a commit to yeelam-gordon/PowerToys that referenced this pull request Jun 8, 2026
…cking on shutdown

Targets the same APPLICATION_HANG_QUIESCE_cfffffff_PowerToys.exe!run_message_loop Watson bucket as microsoft#48363, but applies the fix per Microsoft's documented WM_ENDSESSION best practices:

1. WM_ENDSESSION is handled inline in tray_icon_window_proc. WM_QUERYENDSESSION is intentionally NOT handled because DefWindowProc already returns TRUE for it (handling it explicitly is dead code).

2. On WM_ENDSESSION(TRUE) we route through WM_CLOSE via SendMessageW rather than calling DestroyWindow directly, so WM_CLOSE remains the single chokepoint for window teardown. Any future telemetry/flush/log added to WM_CLOSE automatically applies on the OS-shutdown path.

3. A new g_session_ending flag tells WM_DESTROY to skip cross-process cleanup on OS-initiated shutdown. Per https://learn.microsoft.com/windows/win32/shutdown/wm-endsession shutdown handlers must not block; close_settings_window() does a 1500 ms WaitForSingleObject on PowerToys.Settings.exe which the OS is shutting down in parallel and will reap independently. Keeping that wait on the shutdown path consumed ~30% of the ~5 s quiesce budget for zero benefit and could itself reproduce the hang.

User-initiated close (right-click tray -> Exit) is unchanged: full graceful cleanup runs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Builds on the WM_QUERYENDSESSION/WM_ENDSESSION handling from microsoft#48363 by folding
in the decisive fix from microsoft#48378: on OS-initiated shutdown the runner's
WM_DESTROY must not run cross-process cleanup the OS is already reaping in
parallel.

- window.h: handle_session_end_message gains an optional out_session_ending
  flag, set on WM_ENDSESSION(TRUE) before DestroyWindow, so each module can
  decide whether to skip its own blocking teardown.
- tray_icon.cpp: add g_session_ending; on shutdown skip Shell_NotifyIcon(
  NIM_DELETE) and close_settings_window() (which blocks up to 1.5s on
  PowerToys.Settings.exe). User-initiated Exit keeps full graceful cleanup.
- Window.Tests.cpp: 3 tests for the new out-param (TRUE sets it; cancelled
  FALSE and WM_QUERYENDSESSION do not).

Fixes APPLICATION_HANG_QUIESCE_cfffffff_PowerToys.exe!run_message_loop.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@crutkas crutkas changed the title Fix APPLICATION_HANG_QUIESCE in PowerToys.exe runner Fix runner APPLICATION_HANG_QUIESCE: handle WM_ENDSESSION and skip blocking shutdown cleanup Jun 9, 2026
… WM_ENDSESSION test threshold

Round-2 follow-up to the earlier WM_ENDSESSION + WM_DESTROY work on the
runner. Two small additions spotted on a re-read:

1. `src/runner/main.cpp` calls `QuickAccessHost::stop()` after
   `run_message_loop()` returns. `QuickAccessHost::stop()` does a
   `WaitForSingleObject` on the Quick Access host process up to its wait
   timeout. On OS shutdown / sign-off / restart the OS reaps that
   process in parallel, so the wait is pure dead time against the
   quiesce budget - the same class of issue the WM_DESTROY skip
   already addresses for `close_settings_window()` and the tray icon.

   Fix: expose `is_session_ending()` from `tray_icon.cpp` (the
   `g_session_ending` flag we already set when the runner observes
   `WM_ENDSESSION(TRUE)`), and gate the `QuickAccessHost::stop()` call
   on it. The user-initiated tray -> Exit path still calls `stop()`
   and unwinds cleanly; only the OS-shutdown path skips the blocking
   wait.

2. `src/common/UnitTests-CommonUtils/Window.Tests.cpp` -
   `HandleSessionEndMessage_EndSessionConfirmed_TearsDownAndExitsLoop`
   asserted `elapsed.count() < 500` against a 1000 ms `run_message_loop`
   timeout. That's tight enough to be CI-flaky on a slow VM while still
   not catching a real failure (the failure mode is "loop ran to full
   timeout"). Raise to 2000 ms - well under the 5 s OS quiesce budget,
   but with enough headroom to absorb scheduler hiccups on shared CI.
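
The gating in item 1 above can be sketched as follows. This is an illustration under assumptions: only `is_session_ending()` and `g_session_ending` are named in the commit message; `mark_session_ending`, `stop_host_unless_session_ending`, the `sketch` namespace, and the use of `std::atomic` are hypothetical choices made for this example.

```cpp
#include <atomic>

namespace sketch
{
    // Assumption: modeled as an atomic; the runner's actual flag type is not shown.
    std::atomic<bool> g_session_ending{ false };

    // Would be called when the WndProc observes WM_ENDSESSION(TRUE).
    inline void mark_session_ending() { g_session_ending.store(true); }

    // Accessor exposed to main.cpp.
    inline bool is_session_ending() { return g_session_ending.load(); }

    // Runs the blocking stop callback only on the user-initiated exit path.
    // Returns true when the callback ran.
    template <typename StopFn>
    bool stop_host_unless_session_ending(StopFn&& stop)
    {
        if (is_session_ending())
        {
            return false; // OS shutdown: the OS reaps the host process in parallel
        }
        stop(); // tray -> Exit: wait for the host and unwind cleanly
        return true;
    }
}
```

In the runner this would correspond to wrapping the `QuickAccessHost::stop()` call that follows `run_message_loop()`.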

Build: `runner.vcxproj` + `UnitTests-CommonUtils.vcxproj` build clean
(x64|Release, 0 warnings / 0 errors). 18/18 tests in `WindowTests`
class pass via vstest.console.

---

ADO: https://microsoft.visualstudio.com/DefaultCollection/OS/_workitems/edit/55588441/

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>