2
votes

I am building a project using Python and need to schedule some jobs, so I am using APScheduler. The scheduler works fine on Windows under Apache, but since I moved the project to an Amazon Ubuntu instance, every job still fires at the configured interval but runs twice instead of once, so I end up with two instances of each job running at the same time. Everything works fine on the Windows instance. I am using mod_wsgi with Python. Below is my wsgi file:

import os
import sys
import datetime

sys.path.append('C:/Django/sweetspot/src/sweetspot')
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings_server'

from jobs.FeedAndNews import FeedParse, NewsParse
from apscheduler.scheduler import Scheduler

import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()

today = datetime.datetime.today()
nex = datetime.timedelta(hours=1)
startsat = today + nex
timestr = startsat.strftime("%Y-%m-%d %H:%M:%S")   

scheduler = Scheduler()
scheduler.start()

scheduler.add_interval_job(FeedParse, hours=1, start_date=timestr)
scheduler.add_interval_job(NewsParse, hours=1, start_date=timestr)

Below are the versions of Python and APScheduler: Python 2.7, APScheduler 2.1.1.

Can someone please help me identify the issue? I appreciate your help. Thanks in advance.

1
I added the scheduler code to urls.py and deleted it from wsgi.py. Now each job runs once, but it runs again after half the interval I gave it, i.e. if I set 1 hour, another instance of the same job starts after half an hour. - planet260
You do realise that on UNIX, if using Apache/mod_wsgi, your WSGI application can be running in multiple processes at the same time? If you are dependent on there only being one process, as is the case on Windows, then you need to configure Apache/mod_wsgi appropriately, most likely to use mod_wsgi daemon mode. See code.google.com/p/modwsgi/wiki/ProcessesAndThreading - Graham Dumpleton
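Graham's suggestion of daemon mode would look roughly like this in the Apache virtual host config (a sketch; the process group name and paths are placeholders, and `processes=1` is what pins the WSGI application to a single process so the scheduler starts only once):

```
WSGIDaemonProcess sweetspot processes=1 threads=15
WSGIProcessGroup sweetspot
WSGIScriptAlias / /path/to/sweetspot/wsgi.py
```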

1 Answer

1
votes

There are two approaches to handle this problem without giving up multi-processing.

--First, use a locking mechanism. Create a file that acts as a shared resource: whichever process locks it first gets to run the jobs, so only one process actually starts the scheduler, no matter how many WSGI processes Apache spawns.
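A minimal sketch of that locking idea using `fcntl.flock` (Linux-only; the lock file path is a hypothetical choice, and you would call this from your wsgi file before starting the scheduler):

```python
import fcntl

# Hypothetical path; pick any location the Apache user can write to.
LOCK_PATH = "/tmp/sweetspot_scheduler.lock"

def acquire_scheduler_lock():
    """Return the open lock file if this process won the lock, else None.

    The returned file object must be kept open for the life of the
    process; the kernel releases the lock when it is closed.
    """
    lock_file = open(LOCK_PATH, "w")
    try:
        # LOCK_EX | LOCK_NB: take an exclusive lock, failing immediately
        # (instead of blocking) if another process already holds it.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return lock_file
    except IOError:
        lock_file.close()
        return None

lock = acquire_scheduler_lock()
if lock is not None:
    # Only the process holding the lock starts the scheduler and adds jobs.
    print("this process runs the scheduler")
```

Every worker process runs the same code, but only the first one to grab the lock starts the scheduler; the rest serve requests as usual.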

--Second, separate the jobs from the rest of the application entirely by running them as cron jobs. That way the jobs are independent of Apache and unaffected by server restarts.
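A crontab entry for that could look like the following (a sketch; `runfeeds` is a hypothetical Django management command that you would write to wrap FeedParse and NewsParse, and the paths are placeholders):

```
# Run the feed/news parsers at the top of every hour, outside Apache.
0 * * * * /usr/bin/python /path/to/sweetspot/manage.py runfeeds >> /var/log/sweetspot_feeds.log 2>&1
```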