Runner lifecycle: unmanaged runners created by run* launchers #7353

@agoscinski

I am opening this issue to keep a record of one particularity of this fix. I don't think we have the resources to solve the problem in the near future.

Context

PR #7344 introduced create_runner(communicator=None) in run* launchers and FunctionProcess.run_get_node to avoid eagerly connecting to the broker when running processes locally. These runners are intentionally not stored on the Manager (i.e. they don't become the global runner).

Why not use the global runner?

The global runner (Manager.get_runner()) is cached and shared. If we created it without a communicator for a run* call, a subsequent submit would get the same runner, still without a communicator, and fail. We can't set the communicator after construction because the Runner wires it up in __init__. Resetting the runner in submit is not a safe alternative either, since it can cause issues with the global event loop.
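A minimal sketch of the caching problem described above. ToyRunner and ToyManager are hypothetical stand-ins written for this illustration; they are not the real aiida-core classes, and the string used as a communicator is a placeholder.

```python
class ToyRunner:
    """Hypothetical runner: the communicator is fixed at construction time."""

    def __init__(self, communicator=None):
        self._communicator = communicator

    def run(self, process_name):
        # Running locally does not need a communicator.
        return f'ran {process_name} locally'

    def submit(self, process_name):
        # Submitting needs a communicator to reach the broker.
        if self._communicator is None:
            raise RuntimeError('runner has no communicator, cannot submit')
        return f'submitted {process_name}'


class ToyManager:
    """Hypothetical manager that caches a single global runner."""

    def __init__(self):
        self._runner = None

    def get_runner(self, communicator=None):
        # Only the first call decides how the cached runner is built.
        if self._runner is None:
            self._runner = ToyRunner(communicator=communicator)
        return self._runner


manager = ToyManager()

# A run* call builds the cached runner without a communicator ...
manager.get_runner(communicator=None).run('add')

# ... and a later submit receives the very same object, so it fails.
try:
    manager.get_runner(communicator='broker-connection').submit('add')
    submit_error = None
except RuntimeError as exc:
    submit_error = str(exc)
```

In this toy model the second get_runner call ignores its argument because the runner is already cached, which mirrors why the run* launchers create a separate, unmanaged runner instead.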

Why not close the local runner?

Runner.close() closes the underlying asyncio event loop. The runner doesn't create its own loop — it calls plumpy.get_or_create_event_loop(), which returns the global event loop if one exists. Closing it would break anything else sharing that loop (the global runner, the daemon, etc.).
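The effect can be reproduced with plain asyncio. In this sketch ToyRunner is a hypothetical class and the module-level loop is an assumed stand-in for the global event loop; only standard-library calls are used.

```python
import asyncio

# Toy stand-in for the single global event loop shared by all runners.
_shared_loop = asyncio.new_event_loop()


class ToyRunner:
    """Hypothetical runner that borrows the shared loop instead of creating one."""

    def __init__(self):
        self.loop = _shared_loop

    def close(self):
        # Closes the loop even though this runner did not create it.
        self.loop.close()


global_runner = ToyRunner()
local_runner = ToyRunner()

# Closing the local runner closes the loop for everyone.
local_runner.close()

try:
    global_runner.loop.call_soon(print, 'never scheduled')
    error = None
except RuntimeError as exc:
    error = str(exc)
```

After local_runner.close(), the other runner holds a closed loop and can no longer schedule callbacks on it.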

We could check whether an event loop existed before creating the runner and close it only if we "own" it, but this leaks event loop implementation details into the launcher layer and is fragile.
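A rough sketch of what such an ownership check might look like, and one way it can go wrong. OwningRunner, loop_exists and get_or_create_loop are hypothetical helpers written for this illustration; they do not reflect the plumpy implementation.

```python
import asyncio

_loop = None  # toy module-level "global" loop


def loop_exists():
    """Return True if a usable shared loop is already present."""
    return _loop is not None and not _loop.is_closed()


def get_or_create_loop():
    """Toy 'get or create' helper: reuse the shared loop or make a new one."""
    global _loop
    if not loop_exists():
        _loop = asyncio.new_event_loop()
    return _loop


class OwningRunner:
    """Hypothetical runner that closes the loop only if it created it."""

    def __init__(self):
        # Ownership must be decided before the loop is requested.
        self._owns_loop = not loop_exists()
        self.loop = get_or_create_loop()

    def close(self):
        if self._owns_loop:
            self.loop.close()


first = OwningRunner()   # creates the loop, so it "owns" it
second = OwningRunner()  # reuses the loop, so it does not

second.close()
still_open_after_second_close = not first.loop.is_closed()

# The owner closing first still pulls the loop away from other users.
first.close()
closed_under_second = second.loop.is_closed()
```

The non-owner closes safely, but the check does nothing for runners that outlive the owner, and the launcher now has to know how the loop is created. That is the fragility referred to above.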

Why is the leak acceptable?

plumpy.get_or_create_event_loop() maintains a single global event loop. All unmanaged runners reuse it, so repeated run* calls don't accumulate loops. The only leaked object is the Runner itself (transport queue, job manager), which is lightweight.

Proper fix

Decouple the communicator from Runner construction so it can be attached lazily. This would allow the global runner to start without a communicator and connect one on demand when submit needs it. This requires changes in plumpy (Runner/event loop ownership) and should be addressed there.
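One possible shape of lazy attachment, as a sketch only. LazyRunner and its communicator_factory parameter are hypothetical names invented here; the actual change would have to be designed in plumpy and may look quite different.

```python
class LazyRunner:
    """Hypothetical runner that connects its communicator on first use."""

    def __init__(self, communicator_factory=None):
        # Store how to connect, but do not connect yet.
        self._communicator_factory = communicator_factory
        self._communicator = None

    @property
    def communicator(self):
        # Create the communicator the first time something needs it.
        if self._communicator is None:
            if self._communicator_factory is None:
                raise RuntimeError('no communicator factory configured')
            self._communicator = self._communicator_factory()
        return self._communicator

    def run(self, process_name):
        # Local execution never touches the communicator.
        return f'ran {process_name} locally'

    def submit(self, process_name):
        # Submission triggers the connection on demand.
        return f'submitted {process_name} via {self.communicator}'


connections = []


def connect():
    """Placeholder for opening a broker connection; records each call."""
    connections.append('broker')
    return 'broker-connection'


runner = LazyRunner(communicator_factory=connect)

runner.run('add')
connections_after_run = len(connections)

result = runner.submit('add')
runner.submit('add')
connections_after_submits = len(connections)
```

With this shape a single cached runner could serve both run* and submit: running locally opens no connection, and the first submit opens exactly one that later submits reuse.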
