Skip to content

Batch enumerator size should return the number of batches, not records#97

Merged
etiennebarrie merged 1 commit into
masterfrom
batch-enumerator-size
May 28, 2021
Merged

Batch enumerator size should return the number of batches, not records#97
etiennebarrie merged 1 commit into
masterfrom
batch-enumerator-size

Conversation

@etiennebarrie

@etiennebarrie etiennebarrie commented May 28, 2021

Copy link
Copy Markdown
Member

Enumerator#size returns the number of yields, not the underlying number of elements, e.g.:

[1,2,3].each_slice(2).size # => 2

Similarly BatchEnumerator#size should return the number of batches, not the number of records.

@adrianna-chang-shopify adrianna-chang-shopify left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update ActiveRecordEnumerator to return an enum with the proper size for #batches as well? I don't think it should have any downstream impact in Core if we're already basing size there on the number of batches (however that's happening 😛 ) That way we maintain consistency between the two?


def size
@base_relation.count
-@base_relation.count.div(-@batch_size)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a perform ceiling division comment here to clarify? At a glance, this is a bit confusing

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a better way to do the ceiling division without using negative numbers and the fact that divmod/div round the quotient to negative infinity. I find it less confusing.

@etiennebarrie etiennebarrie mentioned this pull request May 28, 2021
Enumerator#size returns the number of yields, not the underlying number
of elements, e.g.:

	[1,2,3].each_slice(2).size # => 2
@etiennebarrie etiennebarrie force-pushed the batch-enumerator-size branch from 8985af5 to 2cebca0 Compare May 28, 2021 19:44
@etiennebarrie

Copy link
Copy Markdown
Member Author

Can we update ActiveRecordEnumerator to return an enum with the proper size for #batches as well?

I don't really want to do it since it would change an existing behavior, potentially breaking things.

@adrianna-chang-shopify

Copy link
Copy Markdown
Contributor

I don't really want to do it since it would change an existing behavior, potentially breaking things.

Fair, I'd like to circle back to this and make the fix in Core, since technically it's an incorrect implementation of size based on what the enum is yielding.


def size
@base_relation.count
(@base_relation.count + @batch_size - 1) / @batch_size # ceiling division

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, leaning back towards using (@base_relation.count.to_f / @batch_size).ceil, then we could probably drop the comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer keeping the computation in integers rather than going to floats. 🤷

@etiennebarrie etiennebarrie merged commit dcddbfc into master May 28, 2021
@etiennebarrie etiennebarrie deleted the batch-enumerator-size branch May 28, 2021 20:20
@shopify-shipit shopify-shipit Bot temporarily deployed to rubygems May 28, 2021 20:23 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants