I have a periodic task that I am running on Heroku through a worker process defined in the Procfile:
Procfile
web: gunicorn voltbe2.wsgi --log-file - --log-level debug
worker: celery -A voltbe2 worker --beat --events --loglevel info
tasks.py
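from datetime import timedelta

from celery.task import PeriodicTask  # class-based periodic task API (older Celery versions)


class PullXXXActivityTask(PeriodicTask):
    """
    A periodic task that fetches data every minute.
    """
    run_every = timedelta(minutes=1)

    def run(self, **kwargs):
        # Pull the latest data for every MyModel row.
        abc = MyModel.objects.all()
        for rk in abc:
            rk.pull()
        logger = self.get_logger(**kwargs)
        logger.info("Running periodic task for XXX.")
        return True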
For this periodic task I need --beat: I checked by turning it off, and the task no longer repeats. So, in a way, --beat does the work of a clock process (https://devcenter.heroku.com/articles/scheduled-jobs-custom-clock-processes).
My concern is this: if I scale the worker to two dynos with heroku ps:scale worker=2, I see two beats running, one on worker.1 and one on worker.2, as the logs show:
Aug 25 09:38:11 emstaging app/worker.2: [2014-08-25 16:38:11,580: INFO/Beat] Scheduler: Sending due task apps.notification.tasks.SendPushNotificationTask (apps.notification.tasks.SendPushNotificationTask)
Aug 25 09:38:20 emstaging app/worker.1: [2014-08-25 16:38:20,239: INFO/Beat] Scheduler: Sending due task apps.notification.tasks.SendPushNotificationTask (apps.notification.tasks.SendPushNotificationTask)
The log shown is for a different periodic task, but the key point is that each worker dyno is told to run the same task by its own clock. I would expect a single clock that ticks, decides every XX seconds what is due, and hands that task to the least loaded worker.n dyno.
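To illustrate what I think is happening, here is a minimal toy model in plain Python. It does not use Celery at all; the function name simulate and its parameters are made up for illustration, and it only assumes that each dyno's embedded beat sends every due task independently of the other dynos.

```python
from collections import Counter


def simulate(num_workers, minutes, run_every=1):
    """Count task sends per minute when every worker runs its own beat.

    Hypothetical model, not Celery code: each worker's scheduler fires
    on its own and knows nothing about the other schedulers.
    """
    sent = Counter()
    for _worker in range(num_workers):
        for minute in range(0, minutes, run_every):
            sent[minute] += 1  # this beat sends the due task again
    return sent


one = simulate(num_workers=1, minutes=3)
two = simulate(num_workers=2, minutes=3)

print(sum(one.values()))  # 3 sends in 3 minutes with a single beat
print(sum(two.values()))  # 6 sends in 3 minutes with two beats
```

If this model matches what the logs show, the number of sends per interval grows with the number of worker dynos, which is not what I want for a task that should run once a minute.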
More on why a single clock is essential: https://devcenter.heroku.com/articles/scheduled-jobs-custom-clock-processes#custom-clock-processes
Is this a problem, and if so, how can I avoid it?