Fix 2-second systemd notify stalls in cloud-hypervisor#493
Fix 2-second systemd notify stalls in cloud-hypervisor#493rzhikharevich wants to merge 3 commits intomicrovm-nix:mainfrom
Conversation
|
I think |
|
@RaitoBezarius Please help review. |
|
Unfortunately, there is no EOF because cloud-hypervisor doesn't seem to propagate the socket half-close to the host socket, so |
|
|
|
By the way, should I use |
| # -T2 is required because cloud-hypervisor does not handle partial | ||
| # shutdown of the stream, like systemd v256+ does. | ||
| ${pkgs.socat}/bin/socat -T2 UNIX-LISTEN:${vsockPath}_8888,fork UNIX-SENDTO:$NOTIFY_SOCKET & | ||
| VSOCK_NOTIFY_PATH=${vsockPath}_8888 ${vsockNotifyProxy}/bin/vsock-notify-proxy & |
There was a problem hiding this comment.
| VSOCK_NOTIFY_PATH=${vsockPath}_8888 ${vsockNotifyProxy}/bin/vsock-notify-proxy & | |
| ${lib.getExe vsockNotifyProxy} ${vsockPath}_8888 & |
| path = os.environ["VSOCK_NOTIFY_PATH"] | ||
| server = await asyncio.start_unix_server(handle, path=path) |
There was a problem hiding this comment.
| path = os.environ["VSOCK_NOTIFY_PATH"] | |
| server = await asyncio.start_unix_server(handle, path=path) | |
| server = await asyncio.start_unix_server(handle, path=sys.argv[0]) |
We can make this kinda dirty here, as we control all invocations and this fails loudly when no arg is there, os.environ does IIRC return empty string.
| # An important caveat is that the message could theoretically be truncated but this should not | ||
| # happen in practice unless the message is large (systemd notifications are small). | ||
| # | ||
| # TODO(rzhikharevich): Ideally, cloud-hypervisor should propagate the half-close, then the |
There was a problem hiding this comment.
| # TODO(rzhikharevich): Ideally, cloud-hypervisor should propagate the half-close, then the | |
| # TODO: Ideally, cloud-hypervisor should propagate the half-close, then the |
There was a problem hiding this comment.
Pull request overview
This PR addresses slow microvm boot times when running cloud-hypervisor under systemd with microvm.vsock.cid enabled, by replacing the previous socat-based notify forwarding (which introduced a 2s stall per notification) with a purpose-built proxy that immediately closes the vsock-stream connection after a single read.
Changes:
- Add a Python-based
vsock-notify-proxythat forwards guest notify messages from the cloud-hypervisor vsock socket to the hostNOTIFY_SOCKET. - Replace the
socat -T2forwarding logic inpreStartwith the new proxy to avoid repeated systemd stalls.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| async def handle(reader, writer): | ||
| data = await reader.read(65536) | ||
| writer.close() |
There was a problem hiding this comment.
After calling writer.close(), it’s generally best to await writer.wait_closed() (or equivalent) to ensure the transport is actually closed and file descriptors are released promptly. This helps avoid accumulating open connections under load and prevents noisy ResourceWarnings in some Python/asyncio versions.
| writer.close() | |
| writer.close() | |
| await writer.wait_closed() |
There was a problem hiding this comment.
The 1st example in the Python asyncio documentation does this, too.
I'm sorry, I don't review anything that involves GenAI in a project on my free time. |
|
@rzhikharevich can you please incorporate the feedback? |
This is a fix for #474. See the code comment for explanation and caveats.
Relevant parts of: