11
votes

We have an application that needs to

  1. nightly reprocess large amounts of data, and

  2. reprocess large amounts of data on demand.

In both of these cases, around 10,000 quartz jobs get spawned and then run. In the nightly case, a single Quartz cron job spawns the 10,000 jobs, each of which individually does the work of processing the data.

The issue is that we are running with around 30 threads, so naturally the Quartz jobs misfire, and they continue to misfire until everything is processed. The processing can take up to 6 hours. Each of these 10,000 jobs pertains to a specific domain object; the jobs are completely independent and can be processed in parallel. Each job takes a variable amount of time (from half a second to a minute).

My question is:

  1. Is there a better way to do this?

  2. If not, what is the best way for us to schedule/setup our quartz jobs so that a minimal amount of time is spent thrashing and dealing with misfires?

A note about our architecture: we are running two clusters with three nodes apiece. The version of Quartz is a bit old (2.0.1), and clustering is enabled in the quartz.properties file.

4
Isn't there any way you could distribute the workload evenly over the day (e.g. with queues)? - Sami Korhonen
For the nightly run we can do this, but the on-demand run needs to execute as fast as possible. One of my thoughts was to take the number of Quartz jobs to be created and the average execution time of a single thread, and then randomly schedule the Quartz jobs accordingly, also accounting for the number of threads. This would alleviate the misfiring slightly, but the execution time of a single thread is too variable, and always assuming the worst case would take too long for on-demand processing. - Brett McLain
I think using Quartz for this kind of scenario is wrong, since all the jobs you spawn should execute immediately and not at a specific time. As other answers suggest, using queues and an executor service would make the most sense here. - Leonard Brünings

4 Answers

8
votes

In both of these cases, around 10,000 quartz jobs get spawned

No need to spawn new quartz jobs. Quartz is a scheduler - not a task manager.

For the nightly reprocess, you need only that one Quartz cron job, and all it does is invoke some service responsible for managing and running the 10,000 tasks. In the "on demand" scenario, Quartz shouldn't be involved at all; just invoke that service directly.

How does the service manage 10,000 tasks?

Typically, when only one JVM is available, you'd just use an ExecutorService. Here, since you have 6 nodes at your disposal, you can use Hazelcast. Hazelcast is a Java library that lets you cluster your nodes so that they share resources efficiently with each other. It has a straightforward way to distribute your ExecutorService, called the Distributed Executor Service. It's as easy as creating a Hazelcast ExecutorService and submitting the task to all members. Here's an example from the documentation for invoking on a single member:

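// Echo is just some Callable; it must also be Serializable to be sent to another member
Callable<String> task = new Echo(input);
HazelcastInstance hz = Hazelcast.newHazelcastInstance();
IExecutorService executorService = hz.getExecutorService("default");
// member is one cluster Member, e.g. taken from hz.getCluster().getMembers()
Future<String> future = executorService.submitToMember(task, member);
// blocks until the chosen member has run the task
String echoResult = future.get();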
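
For comparison, the single-JVM variant mentioned above could look roughly like the sketch below. It uses only java.util.concurrent and assumes Java 8 or later. `ReprocessService`, `reprocess` and the pool size of 30 (borrowed from the thread count in the question) are hypothetical names and values for illustration, not part of Quartz or Hazelcast.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical service that both the nightly cron job and the on-demand path could call.
final class ReprocessService {

    // Placeholder for the real per-domain-object work (assumption).
    static String reprocess(long domainObjectId) {
        return "processed-" + domainObjectId;
    }

    // Submits one task per id to a fixed pool and waits for all of them.
    static List<String> reprocessAll(List<Long> ids, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (Long id : ids) {
                tasks.add(() -> reprocess(id));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has completed and
            // returns the futures in the same order as the tasks.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reprocessing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a reprocessing task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= 10_000; i++) {
            ids.add(i);
        }
        List<String> results = reprocessAll(ids, 30);
        System.out.println(results.size() + " objects reprocessed");
    }
}
```

Because the pool simply queues tasks it cannot run yet, there is no notion of a misfire here: the 10,000 tasks wait their turn and each thread picks up the next one as soon as it is free.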

4
votes

I would do this by making use of a queue (RabbitMQ/ActiveMQ). The cron job (or whatever your on-demand trigger is) populates the queue with messages representing the 10,000 work instructions (i.e. the instruction to reprocess the data for a given domain object).

On each of your nodes you have a pool of executors which pull from the queue and carry out the work instruction. This solution means that each executor is kept as busy as possible while there are still work items on the queue, meaning that the overall processing is accomplished as quickly as possible.
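
As a rough, in-process illustration of that pull model, the sketch below uses a `LinkedBlockingQueue` as a stand-in for the broker. This is an assumption made to keep the example self-contained: with RabbitMQ or ActiveMQ the queue would live outside the JVM and each node would run its own workers. `QueueWorkers`, `drain` and `reprocess` are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-process stand-in for "workers pulling work instructions from a queue".
final class QueueWorkers {

    // Placeholder for the real per-domain-object work (assumption).
    static void reprocess(long domainObjectId) {
        // real processing would go here
    }

    // Starts `workers` threads that pull ids until the queue is empty;
    // returns how many work items were handled in total.
    static int drain(BlockingQueue<Long> queue, int workers) {
        AtomicInteger handled = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                Long id;
                // poll() returns null once the queue is empty, which ends this worker.
                while ((id = queue.poll()) != null) {
                    reprocess(id);
                    handled.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }

    public static void main(String[] args) {
        BlockingQueue<Long> queue = new LinkedBlockingQueue<>();
        for (long i = 1; i <= 10_000; i++) {
            queue.add(i); // the cron job or on-demand trigger would do this
        }
        System.out.println(drain(queue, 30) + " work items handled");
    }
}
```

A fast worker simply takes more items than a slow one, so the variable per-item time (half a second to a minute) balances itself out. With a real broker the workers would normally block waiting for new messages instead of exiting when the queue is empty.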

2
votes

The best way is to use a cluster of Quartz instances. This will share the jobs between the cluster nodes: http://quartz-scheduler.org/documentation/quartz-2.x/configuration/ConfigJDBCJobStoreClustering

0
votes

I would use a scheduled Quartz job to initiate the 10k tasks, but have it do so by appending the task details to a JMS queue (10k messages). That queue is monitored by a message-driven bean (Java EE EJB MDB). The MDB can run simultaneously on multiple nodes in your cluster, and each node can run multiple instances. Don't reinvent the wheel for distributing the task load: let Java EE do it.