
Problem with releasing uniquejobs locks after timeout expires #169

@davehartnoll

Description


The redis/aquire_lock.lua script writes two entries to Redis for each unique job; only the first has an expiration time:

if redis.pcall('set', unique_key, job_id, 'nx', 'ex', expires) then
    redis.pcall('hsetnx', 'uniquejobs', job_id, unique_key)
    return 1
else
    return 0
end

The redis/release_lock.lua script contains this to delete the same two entries:

if redis.pcall('del', unique_key) then
    redis.pcall('hdel', 'uniquejobs', job_id)
    return 1
end

However, if the unique_key has already expired by the time this release_lock script is called, the first redis.pcall will return false and the second one never gets executed. The uniquejobs hash therefore keeps some entries forever and grows without bound. In my case it was consuming all available memory and causing Sidekiq to reject all additional jobs, even though the underlying queues were apparently empty.
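To make the failure mode concrete, here is a minimal in-memory sketch of the sequence. Plain Ruby hashes stand in for Redis, and the key/job-id values are made up for illustration; this simulates the scripts' control flow, it is not the library's code:

```ruby
# Plain Ruby hashes standing in for Redis state (a simulation, not real Redis).
keys       = {}  # top-level Redis keys: unique_key => job_id
uniquejobs = {}  # the 'uniquejobs' hash:  job_id => unique_key

# acquire_lock.lua succeeds: both entries are written
keys["uniquejobs:payment"] = "jid-1"          # hypothetical unique_key / job_id
uniquejobs["jid-1"] = "uniquejobs:payment"

# ...the expiration fires before release_lock.lua runs...
keys.delete("uniquejobs:payment")

# release_lock.lua as written: the DEL fails (nil here, since the key
# is gone), so the HDEL branch is skipped and the hash entry survives
if keys.delete("uniquejobs:payment")
  uniquejobs.delete("jid-1")
end

uniquejobs  # still holds the leaked entry for "jid-1"
```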

One simple fix may be to always remove the entry from the uniquejobs hash before testing whether the (possibly expired) key can also be deleted. I'll leave it to someone who understands the locking mechanism better to decide whether this is safe:

redis.pcall('hdel', 'uniquejobs', job_id)
if redis.pcall('del', unique_key) then
    return 1
end
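Replaying the same expired-lock scenario with the reordered cleanup shows the hash entry no longer leaks. As before, this is a sketch with plain Ruby hashes as a stand-in for Redis and made-up key/job-id values:

```ruby
# Same simulation as the leak sketch, but with the HDEL-first ordering.
keys       = {}  # top-level Redis keys: unique_key => job_id
uniquejobs = {}  # the 'uniquejobs' hash:  job_id => unique_key

keys["uniquejobs:payment"] = "jid-1"          # hypothetical unique_key / job_id
uniquejobs["jid-1"] = "uniquejobs:payment"

keys.delete("uniquejobs:payment")  # the lock expires on its own

# reordered release: drop the hash entry unconditionally first,
# then delete the key (a no-op here, since it already expired)
uniquejobs.delete("jid-1")
keys.delete("uniquejobs:payment")

uniquejobs  # empty: nothing leaks
```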

My workaround (to avoid having to change the library) is to add a configuration setting to increase the default timeout:

SidekiqUniqueJobs.configure do |config|
  config.default_queue_lock_expiration = 24 * 60 * 60
end

P.S.: the correct spelling of 'aquire' is 'acquire'
