🐛 ZMQ broker: send immediate task acknowledgment#7356
Merged
Conversation
When a task is received, the broker now sends an immediate TASK_RESPONSE back to the sender as soon as the task is persisted to the queue on disk. This matches RabbitMQ's publisher-confirm semantics: the caller's Future resolves without waiting for a worker to process the task. Previously, the sender's Future would only resolve when a worker sent its result back, causing timeouts when no workers were connected (e.g. during `verdi process repair` with the daemon stopped). Update the internal architecture docs to reflect that the broker now sends an immediate TASK_RESPONSE when a task is queued (matching RabbitMQ publisher-confirm semantics), rather than forwarding the worker's result back to the sender.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #7356 +/- ##
==========================================
- Coverage 80.26% 80.26% -0.00%
==========================================
Files 577 577
Lines 45505 45497 -8
==========================================
- Hits 36519 36512 -7
+ Misses 8986 8985 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
agoscinski
added a commit
to agoscinski/aiida-core
that referenced
this pull request
Apr 30, 2026
When a task is received, the broker now sends an immediate TASK_RESPONSE back to the sender as soon as the task is persisted to the queue on disk. This matches RabbitMQ's publisher-confirm semantics: the caller's Future resolves without waiting for a worker to process the task. Previously, the sender's Future would only resolve when a worker sent its result back, causing timeouts when no workers were connected (e.g. during `verdi process repair` with the daemon stopped). Update the internal architecture docs to reflect that the broker now sends an immediate TASK_RESPONSE when a task is queued (matching RabbitMQ publisher-confirm semantics), rather than forwarding the worker's result back to the sender.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is super hard to understand because it is going through plumpy and kiwipy logic. Basically we immediately send an ACK on submission of a task to not let the client wait for the response (so you can spam responses). I thought I implemented it the way RMQ does it but apparently not. Because I assigned the broker to the daemon lifetime (and thus worker lifetime) this got unnoticed.
When a task is received, the broker now sends an immediate TASK_RESPONSE back to the sender as soon as the task is persisted to the queue on disk. This matches RabbitMQ's publisher-confirm semantics: the caller's Future resolves without waiting for a worker to process the task.
Previously, the sender's Future would only resolve when a worker sent its result back, causing timeouts when no workers were connected (e.g. during
verdi process repairwith the daemon stopped).Update the internal architecture docs to reflect that the broker now sends an immediate TASK_RESPONSE when a task is queued (matching RabbitMQ publisher-confirm semantics), rather than forwarding the worker's result back to the sender.