Add parallel iteration scaffold#702
Merged
Merged
Conversation
67aaac0 to
f893a0e
Compare
MatthewRBruce
approved these changes
Apr 29, 2026
akinshonibare
approved these changes
Apr 29, 2026
58cfd74 to
bda69e3
Compare
adrianna-chang-shopify
approved these changes
Apr 29, 2026
adrianna-chang-shopify
left a comment
Contributor
There was a problem hiding this comment.
Couple small questions but looks good overall!
|
|
||
| unless child_jobs.all?(&:successfully_enqueued?) | ||
| failed_count = @instances - child_jobs.count(&:successfully_enqueued?) | ||
| raise EnqueueError, "Failed to enqueue #{failed_count} out of #{@instances} child jobs" |
Contributor
There was a problem hiding this comment.
Hmm, does this mean that if any of the child jobs fail, we re-enqueue all of the children again on retry? I guess we assume idempotency and it's not really an issue?
Contributor
Author
There was a problem hiding this comment.
Yeah. This is a tricky case where I don't see any perfect options.
I suspect that usually either all the enqueues will fail, or all will succeed. If we do have a partial success, retrying the job will indeed enqueue additional copies.
bda69e3 to
4af4177
Compare
This was referenced May 6, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What and why?
It's sometimes useful to iterate over a collection in parallel to speed up the completion of the work. This PR adds a primitive to enable that:
When this job is enqueued, it first runs with a
nilcursor, causing it to enqueue the 5 child jobs, each with a cursor that looks like{ instance: x, inner_cursor: nil }, wherexranges from 0 to 4. When those jobs start, they build the inner enumerator and run as normal, except that the outer enumerator ensures that the cursor stays wrapped in this hash withinstanceandinner_cursor.sequenceDiagram participant Parent as ParentJob participant Q as ActiveJob queue participant Child as ChildJob (instance i) activate Parent Parent->>Parent: build_enumerator(cursor: nil) Note over Parent: cursor is nil →<br/>user block NOT invoked Parent->>Parent: enqueue_jobs(self.class, arguments) Parent->>Q: perform_all_later([job_0..job_N-1]) Note right of Parent: each child gets<br/>cursor_position =<br/>{instance: i, inner_cursor: nil} deactivate Parent Q->>Child: perform<br/>(cursor: {instance: i, inner_cursor: nil}) activate Child Child->>Child: build_enumerator(cursor: {instance: i, ...}) Note over Child: cursor non-nil →<br/>user block IS invoked,<br/>builds inner enum for instance i loop until done or interrupted Child->>Child: each_iteration(record) Note right of Child: cursor_position =<br/>{instance: i, inner_cursor: X} end alt interrupted Child->>Q: retry_job (carries current cursor) Q->>Child: perform<br/>(cursor: {instance: i, inner_cursor: X}) Note over Child: cursor still non-nil →<br/>no re-fan-out,<br/>resumes from inner_cursor X else completes Note over Child: on_complete callback fires end deactivate ChildGotchas
on_startmeans this particular instance is starting andon_completemeans this particular instance is done iterating. There is no "all child jobs are done iterating" callback, as that would require some form of external synchronization.Follow-up
enumerator_builder.parallel_active_record_on_records(Product.all, instances: 5, cursor: cursor)