Enforce cursor be serializable #73
Turns out both Sidekiq and Resque encode job params as JSON, so technically, as long as `cursor == JSON.load(JSON.dump(cursor))`, it should be a (de-)serializable cursor. However, we need to watch out for arbitrary Ruby objects: as noted in Sidekiq's docs, that validation isn't quite good enough.
kirs
left a comment
Thanks for putting in the effort to describe the problem and how you ran into it in your app.
Code LGTM!
Thanks for the feedback @GustavoCaso! I have
I also noticed the integration tests are basically identical for Resque and Sidekiq, so I extracted a common test module in #75, to keep it independent of this change.
TL;DR: This enforces that cursors can only be composed of objects of built-in JSON-compatible classes: Array, FalseClass, Float, Hash, Integer, NilClass, String, and TrueClass.
Both Resque and Sidekiq serialize job arguments as JSON. Because `cursor_position` is not part of the ActiveJob arguments, it is subject to this serialization method. Objects other than the basic types JSON supports are serialized by calling `.to_s`, which means they are subject to one-way serialization: once serialized, there is no way (out of the box) for them to be deserialized.
One such example is Time objects, which are serialized to a reasonable String representation, but when deserialized simply result in a String containing that textual representation. Sneakily enough, because ActiveSupport "augments" `Time#==` by adding automatic coercion, `time == JSON.parse(JSON.dump(time))` holds, which means the problem isn't always noticed in tests. Nonetheless, this can result in unexpected errors, especially since it only "matters" if the job is interrupted (as otherwise the cursor is never serialized).
Therefore, we should enforce that the (de)serialization round trip results in the same cursor it was given, without corruption. The simplest way to do this is to traverse the cursor looking for unsupported classes. Since cursors should typically be very small, this should not add any significant performance penalty.
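The traversal described above could look roughly like the following. This is only a minimal sketch, not the code in this PR: `assert_json_safe_cursor!`, `UnserializableCursorError`, and `JSON_SAFE_CLASSES` are hypothetical names chosen for illustration, and requiring Hash keys to be Strings is an assumption of the sketch (JSON object keys are always strings).

```ruby
# Hypothetical names for illustration only; not the gem's actual implementation.
class UnserializableCursorError < ArgumentError; end

# The built-in JSON-compatible classes listed in the TL;DR.
JSON_SAFE_CLASSES = [
  Array, FalseClass, Float, Hash, Integer, NilClass, String, TrueClass
].freeze

# Recursively checks that a cursor only contains JSON-compatible objects.
# Returns the cursor unchanged, or raises UnserializableCursorError.
def assert_json_safe_cursor!(cursor)
  # instance_of? is deliberately strict: subclasses are rejected too.
  unless JSON_SAFE_CLASSES.any? { |klass| cursor.instance_of?(klass) }
    raise UnserializableCursorError, "unsupported cursor class: #{cursor.class}"
  end

  case cursor
  when Array
    cursor.each { |element| assert_json_safe_cursor!(element) }
  when Hash
    cursor.each do |key, value|
      # Assumption of this sketch: only String keys survive a JSON round trip.
      unless key.instance_of?(String)
        raise UnserializableCursorError, "unsupported Hash key class: #{key.class}"
      end
      assert_json_safe_cursor!(value)
    end
  end

  cursor
end
```

With this sketch, `assert_json_safe_cursor!([1, { "id" => nil }])` returns the cursor unchanged, while `assert_json_safe_cursor!(Time.now)` or `assert_json_safe_cursor!(:symbol)` raises.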
Shouldn't Symbols be allowed?
Symbol serialization is "lossy": Symbols are turned into JSON strings (including when used as Hash keys), which means they are deserialized into Strings. Therefore they are disallowed.
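To illustrate the point about Symbols, here is a small sketch using only Ruby's standard `json` library (the cursor contents are made up for the example):

```ruby
require "json"

# A made-up cursor using a Symbol key and a Symbol value.
cursor = { status: :pending }

# Simulate what the job adapter effectively does on interruption and resume.
round_tripped = JSON.parse(JSON.dump(cursor))

# Both the key and the value come back as Strings, not Symbols.
round_tripped == { "status" => "pending" } # => true
round_tripped == cursor                     # => false
```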
OK, that makes sense, but this is a breaking change, so it should not be released as a patch release.
Context & Problem
The `cursor_position` is serialized by the job adapter (Resque/Sidekiq) and bypasses Active Job's fancy serialization. Both Resque and Sidekiq effectively use `JSON.dump` to serialize it and `JSON.parse` to deserialize it.
In cases where a non-JSON-compatible cursor is used, it is silently turned into a `String` via `to_s`, which can lead to weird errors when the job resumes after interruption. These are not always straightforward to debug, and don't reliably show up in tests, as Active Support adds implicit coercions to some objects' `==` method.
Real world example
In one of our repos, we use a custom enumerator around a collection of events fetched from an API and ordered by time. We used the time the event occurred at as a cursor, which was "corrupted" by being serialized into a String. The API expected an ISO8601 timestamp (its wrapper gem converts `Time` objects automatically), but our improperly deserialized time strings were in the wrong format, so it blew up.
This went unnoticed for some time, because the job was not being interrupted, and was tricky to debug for the reasons outlined above.
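The silent `to_s` conversion can be reproduced in plain Ruby. The sketch below assumes only the standard `json` and `time` libraries are loaded (no Active Support, whose JSON encoding of `Time` may produce a different string), and the timestamp is made up:

```ruby
require "json"
require "time"

# A made-up event time used as the cursor.
cursor = Time.utc(2020, 1, 2, 3, 4, 5)

# Simulate the adapter's serialize/deserialize round trip on job resumption.
resumed_cursor = JSON.parse(JSON.dump(cursor))

resumed_cursor.class # => String
resumed_cursor       # => "2020-01-02 03:04:05 UTC"

# What an API expecting ISO8601 would want instead:
cursor.iso8601       # => "2020-01-02T03:04:05Z"
```

The resumed job receives a `String` in `Time#to_s` format rather than a `Time`, so anything relying on automatic `Time` conversion no longer applies.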
What this PR does about it
This prevents these errors by enforcing that the cursors be composed of basic Ruby objects.
This is enforced by (recursively) analyzing each cursor yielded by the enumerator before `each_iteration`. If it is composed of anything other than the following classes, a descriptive `CursorError` is raised:
- `Array`
- `Hash`
- `String`
- `Integer`
- `Float`
- `NilClass` (`nil`)
- `TrueClass` (`true`)
- `FalseClass` (`false`)
Rationale
nil, the performance penalty of checking each cursor is negligible.
Original Description
The cursor ends up being serialized as a `String`, regardless of whether it is a `String` or not (by calling `to_s`).
This can lead to subtle bugs often not found during testing.
For instance, if a `Time` is used as a cursor and the job is requeued, it instead receives the `Time#to_s` representation as its new input cursor.
Let's be explicit and save developers the headache of debugging that: let's enforce that the cursor be an instance of `String` or `NilClass`.