2 votes

Well. I'm using Sidekiq to convert video files in the background, and the sidekiq-unique-jobs gem to avoid duplicate jobs with the same payload.

I run my Sidekiq process with no options, just the default queue, with a concurrency of 25.

The problem is that every job, after being processed for a long time (the video files are really big), goes back into the queue backlog, yet the count of processed jobs keeps incrementing as well.

The job ends up neither completed nor unique. I'm stuck. Thanks in advance.

UPD:

I'm running Puma as a web server.

Comments:

Are these jobs taking longer than 30 minutes to complete? – platforms
Much longer than 30 minutes. 3-4 hours is average. – Sergey Kishenin

1 Answer

4 votes

Try running it without the sidekiq-unique-jobs gem. It has only been protecting you against dupes for 30 minutes anyway: that gem sets its hash keys in Redis to auto-expire after 30 minutes (configurable). Sidekiq itself sets its jobs to auto-expire in Redis after 24 hours.

I obviously can't see your app, but I'll bet you don't want to process the same file more than once. I would control this at the application layer instead and track my own hash key, doing something similar to what the unique-jobs gem does:

hash = Digest::MD5.hexdigest(Sidekiq.dump_json(md5_arguments))
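
For illustration only, here's a rough sketch of that application-level check, assuming a Redis connection through Sidekiq.redis. The key prefix, the 6-hour TTL, and the VideoConversionWorker name are all made up; adjust them to your app:

require 'digest'
require 'sidekiq'

# Hypothetical helper: only enqueue a conversion if this exact payload
# hasn't been seen recently. Key prefix and TTL are illustrative.
def enqueue_unique_conversion(md5_arguments)
  hash = Digest::MD5.hexdigest(Sidekiq.dump_json(md5_arguments))
  key  = "video_conversion:#{hash}"

  acquired = Sidekiq.redis do |conn|
    # SET with NX succeeds only if the key doesn't exist yet; EX gives it
    # a TTL comfortably longer than your longest job (here 6 hours).
    conn.set(key, Time.now.to_i, nx: true, ex: 6 * 60 * 60)
  end

  # Assumes md5_arguments is the array of arguments your worker expects.
  VideoConversionWorker.perform_async(*md5_arguments) if acquired
end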

It's also possible that the sidekiq-unique-jobs middleware is getting in the way of Sidekiq knowing whether a job completed properly. I'll bet there aren't a lot of folks testing it with long-running jobs in a configuration like yours.

If you continue to see this behavior without the additional middleware, give Resque a try. I've never seen this kind of behavior with that gem, and failed jobs have a helpful retry option in the admin GUI.

The main benefit of Sidekiq is that it is multi-threaded. Even so, a concurrency of 25 with large video processes might be pushing it a bit. In my experience, forking is more stable and portable, with fewer worries about your application's thread-safety (YMMV).
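
If you do stick with Sidekiq, concurrency can be lowered when you start the process; the exact number is something to tune against your workload, and 5 below is just an example:

sidekiq -c 5 -q default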

Whatever you do, make sure that you are aware of the auto-expiry TTL settings that these systems put on their data in Redis. The size and nature of your jobs means that jobs could easily back up for 24 hours. These automatic deletions happen at the database layer; there are no callbacks to the application layer to warn you that a job has been deleted. In the Sidekiq code, for example, the auto-expire behavior was introduced "to avoid any possible leaking" (reference). That isn't very encouraging if you really need these jobs to execute.
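
If you want to see those TTLs for yourself, something along these lines will print the countdown on each key. The '*unique*' pattern is only a guess at what sidekiq-unique-jobs creates; check what's actually in your instance with redis-cli first:

# Peek at the auto-expiry on keys in Redis.
Sidekiq.redis do |conn|
  conn.keys('*unique*').each do |key|
    puts "#{key} expires in #{conn.ttl(key)} seconds"  # -1 means no expiry set
  end
end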