Is your feature request related to a problem? Please describe.
I find the ruby reaper runs slowly and uses a lot of memory when I'm processing a large volume of jobs (hundreds of thousands over a few hours). I've updated our configuration so that if the reaper fails entirely, the reaper resurrector brings it back, and the reaper mostly doesn't hit the timeout, but ideally it would run a bit faster with a smaller footprint.
Describe the solution you'd like
I suggest that the reaper should work through the oldest digests first, and that it should avoid loading all digests in ruby at once. Here's the code I'm interested in, from sidekiq-unique-jobs/lib/sidekiq_unique_jobs/orphans/ruby_reaper.rb (lines 58 to 63 at 323d4ef):

```ruby
conn.zrevrange(digests.key, 0, -1).each_with_object([]) do |digest, memo|
  next if belongs_to_job?(digest)

  memo << digest

  break memo if memo.size >= reaper_count
end
```
Currently, using `zrevrange` means we go from the highest score to the lowest. As the current timestamp is generally used for a digest's score, this means going from newest to oldest. It's certainly not perfect, but I suggest a better general guess when seeking stale digests would be to go from oldest to newest - we can do this by using `zrange` instead of `zrevrange`.
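To make the ordering concrete, here's a toy illustration (no redis involved - the hash and digest names are made up) of how `zrange` and `zrevrange` traverse a sorted set whose scores are timestamps:

```ruby
# Digests stored with their timestamp scores; lower score = older digest.
entries = {
  "uniquejobs:aaa" => 100, # oldest
  "uniquejobs:bbb" => 200,
  "uniquejobs:ccc" => 300  # newest
}

# Equivalent of ZRANGE key 0 -1: ascending score, so oldest first.
zrange_order = entries.sort_by { |_digest, score| score }.map(&:first)

# Equivalent of ZREVRANGE key 0 -1: descending score, so newest first.
zrevrange_order = zrange_order.reverse

zrange_order    # => ["uniquejobs:aaa", "uniquejobs:bbb", "uniquejobs:ccc"]
zrevrange_order # => ["uniquejobs:ccc", "uniquejobs:bbb", "uniquejobs:aaa"]
```

Since stale digests tend to have the oldest timestamps, starting from the `zrange` end should let the reaper fill its quota of orphans sooner.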
Second, perhaps more laboriously, I suggest paging through digests rather than loading the whole set. It might look similar to this:

```ruby
page = 0
per = reaper_count * 2
orphans = []
# ZRANGE's stop index is inclusive, so subtract 1 to avoid fetching
# the boundary digest twice on consecutive pages.
digests = conn.zrange(digests.key, page * per, (page + 1) * per - 1)

while digests.size > 0
  digests.each do |digest|
    next if belongs_to_job?(digest)

    orphans << digest
    break if orphans.size >= reaper_count
  end
  break if orphans.size >= reaper_count

  page += 1
  digests = conn.zrange(digests.key, page * per, (page + 1) * per - 1)
end

orphans
```
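To sanity-check the paging shape, here's a self-contained version run against a fake connection instead of redis. `FakeConn`, the digest names, and the `belongs_to_job` lambda are all made up for illustration; only the loop structure matters:

```ruby
# Stand-in for a redis connection, backed by an in-memory array that is
# already sorted by ascending score (oldest digest first).
class FakeConn
  def initialize(members)
    @members = members
  end

  # Mimics ZRANGE's inclusive start/stop indices.
  def zrange(_key, start, stop)
    @members[start..stop] || []
  end
end

# Pretend digests 0..9 exist and only even-numbered ones still belong to a job.
conn = FakeConn.new((0..9).map { |i| "digest:#{i}" })
belongs_to_job = ->(digest) { digest.split(":").last.to_i.even? }

reaper_count = 3
per = reaper_count * 2
page = 0
orphans = []

loop do
  digests = conn.zrange(:digests_key, page * per, (page + 1) * per - 1)
  break if digests.empty?

  digests.each do |digest|
    next if belongs_to_job.call(digest)

    orphans << digest
    break if orphans.size >= reaper_count
  end
  break if orphans.size >= reaper_count

  page += 1
end

orphans # => ["digest:1", "digest:3", "digest:5"]
```

The loop only ever holds one page of digests in memory and stops as soon as `reaper_count` orphans are found, which is the behavior I'm after.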
Describe alternatives you've considered
I've considered switching to the Lua reaper, but I was concerned about blocking redis. I'm also thinking about changing some of our application logic so we don't lean quite so heavily on unique jobs, but that will take a bit longer to develop.
Additional context
I'm happy to provide more detail on how we're using sidekiq-unique-jobs in case that's helpful. We tend to process large volumes of jobs (e.g., 300,000) in a short amount of time (e.g., 2 hours) and then have long periods with much less activity.