We have an application that needs to reprocess large amounts of data both nightly and on demand.
In both of these cases, around 10,000 Quartz jobs get spawned and then run. In the nightly case, we have one Quartz cron job that spawns the 10,000 jobs, each of which individually does the work of processing the data.
The issue is that we are running with around 30 threads, so naturally the Quartz jobs misfire, and they continue to misfire until everything is processed. The processing can take up to 6 hours. Each of these 10,000 jobs pertains to a specific domain object; the jobs are completely independent and can be processed in parallel. Each job can take a variable amount of time (from half a second to a minute).
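For a sense of scale, here is a back-of-the-envelope sketch of the numbers above. It assumes the 30 threads are the total available and that they stay fully busy, which ignores misfire handling and scheduling overhead; the class and method names are made up for illustration.

```java
public class BatchTimeEstimate {

    /**
     * Ideal wall-clock seconds if every job takes secondsPerJob
     * and all worker threads stay busy the whole time.
     */
    public static double idealWallClockSeconds(int jobs, int threads, double secondsPerJob) {
        return jobs * secondsPerJob / threads;
    }

    public static void main(String[] args) {
        int jobs = 10_000;
        int threads = 30; // assumption: 30 worker threads in total

        // Bounds taken from the per-job range of half a second to a minute.
        double best = idealWallClockSeconds(jobs, threads, 0.5);
        double worst = idealWallClockSeconds(jobs, threads, 60.0);

        System.out.printf("best case:  %.0f s (%.1f minutes)%n", best, best / 60);
        System.out.printf("worst case: %.0f s (%.1f hours)%n", worst, worst / 3600);
    }
}
```

If every job took the full minute, 10,000 jobs on 30 threads would need 20,000 seconds, or roughly 5.6 hours, which is in line with the 6 hours we see in the worst case.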
My questions are:
Is there a better way to do this?
If not, what is the best way for us to schedule and set up our Quartz jobs so that a minimal amount of time is spent thrashing and dealing with misfires?
A note about our architecture: We are running two clusters with three nodes apiece. The version of Quartz is a bit old (2.0.1), and clustering is enabled in the quartz.properties file.
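For reference, the relevant part of a clustered quartz.properties looks roughly like the fragment below. This is an illustrative sketch and not a copy of our actual file: the property names are standard Quartz settings, but the instance name and every value shown are assumptions.

```properties
# Illustrative values only (assumptions, not our real configuration)
org.quartz.scheduler.instanceName = ReprocessingScheduler
org.quartz.scheduler.instanceId = AUTO

# Worker threads per scheduler instance
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 30

# Clustering requires a JDBC job store
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.isClustered = true

# How late (in ms) a trigger may fire before it counts as a misfire
org.quartz.jobStore.misfireThreshold = 60000
```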