Batch enumerator size should return the number of batches, not records #97
adrianna-chang-shopify left a comment:
Can we update ActiveRecordEnumerator to return an enum with the proper size for #batches as well? I don't think it should have any downstream impact in Core if we're already basing size there on the number of batches (however that's happening 😛 ) That way we maintain consistency between the two?
  def size
-   @base_relation.count
+   -@base_relation.count.div(-@batch_size)
Maybe add a "perform ceiling division" comment here to clarify? At a glance, this is a bit confusing.
I found a better way to do the ceiling division, one that uses no negative numbers and does not rely on the fact that div/divmod round the quotient toward negative infinity. I find it less confusing.
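For reference, here is a minimal sketch of the three ceiling-division forms that come up in this thread, assuming a non-negative record count and a positive batch size. The method names are hypothetical and exist only for illustration; they are not part of the PR.

```ruby
# Three ways to compute ceil(count / batch_size).
# Assumes count >= 0 and batch_size > 0. Method names are illustrative only.

# Integer#div floors toward negative infinity, so dividing by the negated
# batch size and negating the result turns the floor into a ceiling.
def ceil_div_negation(count, batch_size)
  -(count.div(-batch_size))
end

# Adding (batch_size - 1) before a flooring division bumps any partial
# batch up to a whole one, using only non-negative integers.
def ceil_div_addition(count, batch_size)
  (count + batch_size - 1) / batch_size
end

# Float-based variant: convert, divide, then round up.
def ceil_div_float(count, batch_size)
  (count.to_f / batch_size).ceil
end

[[0, 3], [1, 3], [3, 3], [4, 3], [10, 3]].each do |count, batch_size|
  results = [
    ceil_div_negation(count, batch_size),
    ceil_div_addition(count, batch_size),
    ceil_div_float(count, batch_size)
  ]
  puts "count=#{count} batch_size=#{batch_size} => #{results.inspect}"
end
```

For these small inputs, all three forms agree; they differ in readability, and in whether the computation leaves integer arithmetic.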
Enumerator#size returns the number of yields, not the underlying number of elements, e.g.: [1,2,3].each_slice(2).size # => 2
8985af5 to 2cebca0
I don't really want to do it, since it would change existing behavior and could potentially break things.
Fair, I'd like to circle back to this and make the fix in Core, since technically it's an incorrect implementation of size based on what the enum is yielding.
  def size
-   @base_relation.count
+   (@base_relation.count + @batch_size - 1) / @batch_size # ceiling division
Actually, I'm leaning back towards using (@base_relation.count.to_f / @batch_size).ceil; then we could probably drop the comment.
I prefer keeping the computation in integers rather than going to floats. 🤷
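One concrete reason to stay in integers, sketched below: a Float cannot represent every integer above 2**53, so the float-based form can lose a record at extreme counts. The count used here is far larger than any realistic table, so this illustrates the principle rather than a practical bug. The method names are hypothetical.

```ruby
# Compare integer and float ceiling division at a count that a Float
# cannot represent exactly. Method names are illustrative only.

def batches_integer(count, batch_size)
  (count + batch_size - 1) / batch_size
end

def batches_float(count, batch_size)
  (count.to_f / batch_size).ceil
end

count = 2**53 + 1 # smallest positive integer with no exact Float representation
batch_size = 1

puts batches_integer(count, batch_size) == count # integer arithmetic stays exact
puts batches_float(count, batch_size) == count   # the conversion to Float rounds the count
```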
Enumerator#size returns the number of yields, not the underlying number of elements, e.g.: [1,2,3].each_slice(2).size # => 2
Similarly BatchEnumerator#size should return the number of batches, not the number of records.
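To make the intended contract concrete, here is a minimal sketch using a hypothetical ArrayBatchEnumerator that wraps a plain Array rather than an ActiveRecord relation. It is a stand-in for illustration, not the gem's BatchEnumerator; the only point is that size matches the number of batches yielded.

```ruby
# Hypothetical stand-in for illustration: batches a plain Array instead of
# an ActiveRecord relation. Not the gem's actual BatchEnumerator.
class ArrayBatchEnumerator
  include Enumerable

  def initialize(records, batch_size:)
    @records = records
    @batch_size = batch_size
  end

  # Yields one Array per batch. Without a block, returns an Enumerator
  # whose lazily computed size is the number of batches.
  def each(&block)
    return to_enum(:each) { size } unless block_given?

    @records.each_slice(@batch_size, &block)
  end

  # Number of batches, not number of records.
  def size
    (@records.size + @batch_size - 1) / @batch_size # ceiling division
  end
end

batches = ArrayBatchEnumerator.new((1..10).to_a, batch_size: 3)
puts batches.size       # number of batches
puts batches.to_a.length # number of yields when actually iterated
```

With ten records and a batch size of three, both values are 4, mirroring how [1,2,3].each_slice(2).size reports the number of slices.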