5
votes

I've just inherited a Rails project that previously ran on a typical *nix server. The decision was made to move it to Heroku for the client, and it is up to me to get the background processing working. It currently uses Whenever to schedule daily events (email, etc.) and to fire up the Delayed Job queue on boot.

Heroku's documentation provides an example of a custom clock process using Clockwork. Going by this example, can I use the same approach with Whenever? Are there any pitfalls I might come across? Will I need to create a separate worker dyno?

Scheduled Jobs and Custom Clock Processes in Ruby with Clockwork


2 Answers

12
votes

Yes -- Heroku's Cedar stack lets you run whatever you want.

The basic building block of the Cedar stack is the dyno. Each dyno gets an ephemeral copy of your application, 512 MB of RAM, and a bunch of shared CPU time. Web dynos are expected to bind an HTTP server to the port specified in the $PORT environment variable, since that's where Heroku will send HTTP requests, but other than that, web dynos are identical to other types of dynos.

Your application tells Heroku how to run its various components by defining them in the Procfile. (See Declaring and Scaling Process Types with Procfile.) The Clock Processes article demonstrates a pattern where you use a worker (i.e. non-web) dyno to enqueue work based on arbitrary criteria. Again, you can do whatever you want here -- just define it in a Procfile and Heroku will happily run it. If you go with a clock process (e.g. a 24x7 whenever), you'll be using a whole dyno ($0.05/hour) to do nothing but schedule work.
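To make the clock-process idea concrete, here is a minimal stdlib-only sketch of what such a process does: it wakes up regularly and enqueues any job that has come due. `TinyClock`, `QUEUE`, and the job name are hypothetical stand-ins invented for illustration; in a real app you would more likely use the Clockwork gem from the Heroku article and enqueue into Delayed Job rather than an array.

```ruby
# Hypothetical stand-in for a real job queue such as Delayed::Job.
QUEUE = []

# A tiny clock: each entry says how often (in seconds) to enqueue a job name.
class TinyClock
  Entry = Struct.new(:name, :interval, :next_at)

  def initialize(start_time)
    @start = start_time
    @entries = []
  end

  # Register a job to be enqueued every `interval` seconds.
  def every(interval, name)
    @entries << Entry.new(name, interval, @start + interval)
  end

  # Enqueue each job whose time has come, then schedule its next run.
  def tick(now)
    @entries.each do |entry|
      next unless entry.next_at <= now

      QUEUE << entry.name
      entry.next_at = now + entry.interval
    end
  end
end
```

The long-running part would be something like `loop { clock.tick(Time.now); sleep 1 }` in a script, declared in the Procfile as its own process type (for example a line such as `clock: bundle exec ruby clock.rb`, where the file name is an assumption). Note that the clock only enqueues; a separate worker dyno still does the actual work.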

In your case, I'd consider switching from Whenever to Heroku Scheduler. Scheduler is basically a Heroku-run cron, where the crontab entries are "spin up a dyno and run this command". You'll still pay $0.05/hour for the extra dynos, but unlike the clock + worker setup, you'll only pay for the time they actually spend running. It cleanly separates periodic tasks from the steady-state web + worker traffic, and it's usually significantly cheaper too.
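As a rough illustration of the cost difference, the arithmetic can be sketched as follows. The $0.05/hour rate is the one quoted above; the 30-day month and the 10 minutes of daily job runtime are assumptions chosen only for the example, and actual billing granularity may differ.

```ruby
RATE_CENTS_PER_HOUR = 5 # $0.05/hour, the dyno price quoted above

# Monthly cost in cents for a dyno that runs `minutes_per_day` every day.
# Uses Rational to avoid floating-point rounding in the intermediate math.
def monthly_cost_cents(minutes_per_day, days: 30)
  Rational(minutes_per_day * days * RATE_CENTS_PER_HOUR, 60)
end

clock_dyno = monthly_cost_cents(24 * 60) # always-on clock process
scheduler  = monthly_cost_cents(10)      # assumed 10 minutes of work per day

puts format('clock dyno: $%.2f', clock_dyno / 100.0) # prints "clock dyno: $36.00"
puts format('scheduler:  $%.2f', scheduler / 100.0)  # prints "scheduler:  $0.25"
```

Under these assumptions the always-on clock dyno costs about $36 a month, while the scheduled job costs well under a dollar.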

The only other word of warning is that running periodic tasks in distributed systems is complex and has complex failure modes. Some of the platform incidents (corresponding with the big EC2 outages) have resulted in things like 2 simultaneous clock processes and duplicate scheduler runs. If you're doing something that needs to run serially (like emailing people once a day), consider guarding it with RDBMS locking, and double-checking that it's actually been ~23 hours since your daily job.
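The guard described above might look roughly like this. It is only a sketch: `DailyJobGuard` is a hypothetical name, and the `Mutex` plus in-memory timestamp stand in for what would really be a database row read and updated under an RDBMS lock inside a transaction.

```ruby
# Minimum gap between runs: roughly 23 hours, as suggested above.
MIN_INTERVAL_SECONDS = 23 * 60 * 60

# Hypothetical in-memory stand-in for a database-backed guard.
# Assumption: in a real app, last_run_at lives in a table row and the
# Mutex is replaced by a row lock or similar RDBMS locking.
class DailyJobGuard
  attr_reader :last_run_at

  def initialize(last_run_at: nil)
    @last_run_at = last_run_at
    @lock = Mutex.new # stands in for the RDBMS lock
  end

  # Runs the block only if enough time has passed since the last run.
  # Returns true if the block ran, false if the run was skipped.
  def run_if_due(now: Time.now)
    @lock.synchronize do
      return false if @last_run_at && (now - @last_run_at) < MIN_INTERVAL_SECONDS

      yield
      @last_run_at = now
      true
    end
  end
end
```

A duplicate scheduler run would then call `guard.run_if_due { send_daily_emails }` (with `send_daily_emails` being whatever your job does) and simply return `false` the second time. Because the timestamp is only recorded after the block finishes, a job that raises will be retried on the next attempt.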

2
votes

Heroku Scheduler is often a bad option for production use because it's unreliable and will skip running its tasks sometimes.

The good news is that if you run a job queue dyno with Sidekiq, there are scheduling plugins for it, e.g. sidekiq-cron, so you can use the same dyno for scheduling. If you don't have a job worker yet, you will need to set one up just for scheduling if you need it to run reliably.

P.S. If you happen to use Delayed::Job for job queuing, there are scheduling plugins for it too, e.g. this one.