We have an application running on GAE and plan to move it to AWS Elastic Beanstalk. GAE has a Cron feature for scheduled tasks, which Beanstalk doesn't.
Quartz is the "Cron" of the Java world, and it supports clustering, which we require. It looks like a good replacement, so we decided to give it a try.
Quartz clustering relies on a database:
(Picture from the official documentation)
From the architecture you can see that you first need a highly available database. Since AWS has RDS, this is not an issue for us.
Put a quartz.properties file on your app's classpath, and make sure every other instance gets a copy of it.
Note that every node should have the same org.quartz.scheduler.instanceName. For more detail about the file's content, please refer to the official documentation.
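As a rough illustration of what such a file contains, the sketch below builds the equivalent settings as a java.util.Properties object. The keys are standard Quartz property names. The values are assumptions for the example rather than our real configuration: the scheduler name MyClusteredScheduler matches the log output later in this post, while the data source name myDS, the MySQL driver class, the thread count and the check-in interval are placeholders.

```java
import java.util.Properties;
import java.util.TreeSet;

public class ClusteredQuartzProps {

    // Builds settings for a clustered scheduler backed by a JDBC job store.
    // jdbcUrl, user and password are placeholders supplied by the caller.
    public static Properties build(String jdbcUrl, String user, String password) {
        Properties p = new Properties();
        // Must be identical on every node of the cluster.
        p.setProperty("org.quartz.scheduler.instanceName", "MyClusteredScheduler");
        // Must be unique per node; AUTO lets Quartz generate an id.
        p.setProperty("org.quartz.scheduler.instanceId", "AUTO");
        p.setProperty("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
        p.setProperty("org.quartz.threadPool.threadCount", "5");
        // JDBC job store with clustering switched on.
        p.setProperty("org.quartz.jobStore.class", "org.quartz.impl.jdbcjobstore.JobStoreTX");
        p.setProperty("org.quartz.jobStore.driverDelegateClass",
                "org.quartz.impl.jdbcjobstore.StdJDBCDelegate");
        p.setProperty("org.quartz.jobStore.tablePrefix", "QRTZ_");
        p.setProperty("org.quartz.jobStore.isClustered", "true");
        p.setProperty("org.quartz.jobStore.clusterCheckinInterval", "20000");
        // "myDS" is a hypothetical data source name; the driver assumes MySQL.
        p.setProperty("org.quartz.jobStore.dataSource", "myDS");
        p.setProperty("org.quartz.dataSource.myDS.driver", "com.mysql.jdbc.Driver");
        p.setProperty("org.quartz.dataSource.myDS.URL", jdbcUrl);
        p.setProperty("org.quartz.dataSource.myDS.user", user);
        p.setProperty("org.quartz.dataSource.myDS.password", password);
        return p;
    }

    public static void main(String[] args) {
        Properties p = build("jdbc:mysql://example-host:3306/quartz", "quartz", "secret");
        // Print in key order, in the same key=value form used by quartz.properties.
        for (String key : new TreeSet<>(p.stringPropertyNames())) {
            System.out.println(key + "=" + p.getProperty(key));
        }
    }
}
```

The printed lines can be pasted into quartz.properties. Alternatively, the Properties object can be handed to the StdSchedulerFactory(Properties) constructor instead of relying on a file on the classpath.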
Quartz provides table-creation scripts for most kinds of databases; you can find them under downloaded_distribution_file/docs/dbTables. Run the one that matches your database before starting the scheduler.
Let's create a Logging Job:

    import org.quartz.Job;
    import org.quartz.JobExecutionContext;
    import org.quartz.JobExecutionException;
    import org.quartz.SchedulerException;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class LoggingJob implements Job {

        private static final Logger _log = LoggerFactory.getLogger(LoggingJob.class);

        @Override
        public void execute(JobExecutionContext paramJobExecutionContext) throws JobExecutionException {
            // Log the full context: trigger, job, fire times and recovery flag.
            _log.info(paramJobExecutionContext.toString());
            try {
                // Log the name of the scheduler that ran the job.
                _log.info(paramJobExecutionContext.getScheduler().getSchedulerName() + " execute!");
            } catch (SchedulerException e) {
                _log.error("Could not read the scheduler name", e);
            }
        }
    }

The scheduler runner looks like:

    import org.quartz.JobBuilder;
    import org.quartz.JobDetail;
    import org.quartz.Scheduler;
    import org.quartz.SchedulerException;
    import org.quartz.SimpleScheduleBuilder;
    import org.quartz.Trigger;
    import org.quartz.TriggerBuilder;
    import org.quartz.impl.StdSchedulerFactory;

    public class QuartzTest {

        public static void main(String[] args) {
            Scheduler scheduler = null;
            try {
                scheduler = StdSchedulerFactory.getDefaultScheduler();

                // requestRecovery() asks for the job to be re-run if the node executing it dies.
                JobDetail job = JobBuilder.newJob(LoggingJob.class)
                        .requestRecovery()
                        .withIdentity("job1", "group1")
                        .build();

                // Fire immediately, then every 2 seconds, forever.
                Trigger trigger = TriggerBuilder.newTrigger()
                        .withIdentity("trigger1", "group1")
                        .startNow()
                        .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                                .withIntervalInSeconds(2)
                                .repeatForever())
                        .build();

                // The job store is shared, so schedule the job only if no other node has done so.
                if (scheduler.getJobDetail(job.getKey()) == null) {
                    scheduler.scheduleJob(job, trigger);
                }
                scheduler.start();

                // Let the scheduler run for 100 seconds.
                try {
                    Thread.sleep(100000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            } catch (SchedulerException se) {
                se.printStackTrace();
            } finally {
                if (scheduler != null) {
                    try {
                        // Test cleanup only: clear() deletes ALL jobs and triggers from the shared store.
                        scheduler.clear();
                        scheduler.shutdown();
                    } catch (SchedulerException e) {
                        e.printStackTrace();
                    }
                }
            }
        }
    }

Run QuartzTest and you will see LoggingJob print the following to the console:
[INFO] 20 Jan 11:16:30.926 AM MyClusteredScheduler_Worker-1 [test.LoggingJob]
JobExecutionContext: trigger: 'group1.trigger1 job: group1.job1 fireTime: 'Tue Jan 20 11:16:30 CST 2015 scheduledFireTime: Tue Jan 20 11:16:05 CST 2015 previousFireTime: 'Tue Jan 20 11:16:03 CST 2015 nextFireTime: Tue Jan 20 11:16:07 CST 2015 isRecovering: false refireCount: 0
[INFO] 20 Jan 11:16:30.926 AM MyClusteredScheduler_Worker-1 [test.LoggingJob]
MyClusteredScheduler execute!
Then start more QuartzTest instances. You will find that only one LoggingJob runs at any given time. If you terminate the first instance, LoggingJob on another node takes over.
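The nodes coordinate through the shared database, so each firing is picked up by exactly one of them. The following is a toy in-memory model of that idea, not Quartz's actual implementation: an AtomicReference stands in for the database row, and the class and method names (ToyClusterStore, tryAcquire, release, recoverFrom) are invented for this sketch.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ToyClusterStore {

    // Node that currently holds the firing; null means nobody has acquired it yet.
    private final AtomicReference<String> owner = new AtomicReference<>(null);

    // A node wins the firing only if no other node holds it.
    public boolean tryAcquire(String nodeId) {
        return owner.compareAndSet(null, nodeId);
    }

    // Called by the owning node once the job run has finished.
    public void release(String nodeId) {
        owner.compareAndSet(nodeId, null);
    }

    // Called by a surviving node that has noticed deadNodeId is gone.
    public void recoverFrom(String deadNodeId) {
        owner.compareAndSet(deadNodeId, null);
    }

    public static void main(String[] args) {
        ToyClusterStore store = new ToyClusterStore();
        System.out.println("node1 acquires: " + store.tryAcquire("node1")); // true
        System.out.println("node2 acquires: " + store.tryAcquire("node2")); // false
        // node1 is terminated before releasing; node2 cleans up after it.
        store.recoverFrom("node1");
        System.out.println("node2 acquires after node1 died: " + store.tryAcquire("node2")); // true
    }
}
```

In the real cluster the shared state lives in the QRTZ_ tables, which is why every node must point at the same database.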
The Pros and Cons of Quartz Clustering:
Pros | Cons |
---|---|
Simple architecture | Relies on database high availability |
Suits small to mid-size clusters | The DB could become a bottleneck and a latent risk |
Less invasive: can run inside or outside the application | Clock sync is required between nodes |
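The clock sync requirement deserves a short illustration. As far as I understand it, each node periodically records a check-in time using its own clock, and the other nodes judge it against theirs. The sketch below uses a simplified rule, not Quartz's exact formula; the method name looksFailed, the 20 second interval and the 7.5 second grace period are assumptions made for the example.

```java
public class CheckinSkewDemo {

    // Simplified rule: a node looks failed when its last check-in is older than
    // the check-in interval plus a grace period, judged by the observer's clock.
    public static boolean looksFailed(long lastCheckinMillis, long checkinIntervalMillis,
                                      long graceMillis, long observerNowMillis) {
        return lastCheckinMillis + checkinIntervalMillis + graceMillis < observerNowMillis;
    }

    public static void main(String[] args) {
        long interval = 20_000;   // hypothetical check-in interval
        long grace = 7_500;       // hypothetical grace period
        long trueNow = 1_000_000; // "real" time, in milliseconds
        long lastCheckin = trueNow - 5_000; // node A checked in 5 seconds ago

        // Observer's clock agrees with node A's clock: node A looks healthy.
        System.out.println(looksFailed(lastCheckin, interval, grace, trueNow)); // false

        // Observer's clock runs 60 seconds ahead: node A wrongly looks failed.
        System.out.println(looksFailed(lastCheckin, interval, grace, trueNow + 60_000)); // true
    }
}
```

A node that is wrongly declared dead can have its jobs recovered by another node while it is still running them, so keep the clocks in sync (for example with NTP).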
Next I'm going to try ZooKeeper; there is now a lock implementation under its recipes directory.
20 Jan 2015