I was really surprised that there is no Spring-friendly solution that prevents concurrent execution of scheduled Spring tasks when deployed to a cluster. Since I was not able to find one, I had to implement one. I call it ShedLock. This article describes how it can be done and what I have learned by doing it.
Scheduled tasks are an important part of applications. Let's say we want to send an email to users with expiring subscriptions. With Spring schedulers, it's easy.
@Scheduled(fixedRate = ONE_HOUR)
public void sendSubscriptionExpirationWarning() {
    findUsersWithExpiringSubscriptionWhichWereNotNotified().forEach(user -> {
        sendExpirationWarning(user);
        markUserAsNotified(user);
    });
}

We just find all users with expiring subscriptions, send them an email, and mark them so we do not notify them again. If the task is not executed due to some temporary issue, we will execute the same task the next hour, and the users will eventually get notified.
This works great until you decide to deploy your service on multiple instances. Now your users might get the email multiple times, one for each instance running.
There are several ways to fix this.
Hope that tasks on different servers will not execute at the same time. I am afraid that hope is not a good strategy.

Process the users that need to be notified one by one, atomically updating their status. It's possible to implement using Mongo with findOneAndUpdate. This is a feasible strategy, although hard to read and reason about (a rough sketch follows this list).

Use Quartz. Quartz is the solution for all your scheduling needs, but I have always found Quartz configuration incredibly complex and confusing. I am still not sure if it is possible to use a Spring-configured JDBC DataSource together with ConfigJobStore.

Write your own scheduler lock, open source it, and write an article about it. Which brings us to ShedLock.
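To make the second option concrete, here is a minimal sketch of atomically claiming users with the Mongo Java driver. The collection, the subscriptionExpiresAt and notified fields, and the ExpirationNotifier class are illustrative assumptions of mine, not code from the article.

import static com.mongodb.client.model.Filters.and;
import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Filters.lt;
import static com.mongodb.client.model.Updates.set;

import com.mongodb.client.MongoCollection;
import java.util.Date;
import org.bson.Document;

public class ExpirationNotifier {

    private final MongoCollection<Document> users;

    public ExpirationNotifier(MongoCollection<Document> users) {
        this.users = users;
    }

    public void notifyExpiringUsers() {
        Document user;
        // findOneAndUpdate atomically flips the 'notified' flag and returns the matched
        // document (or null when no unnotified user is left), so two instances can never
        // claim the same user.
        while ((user = users.findOneAndUpdate(
                and(lt("subscriptionExpiresAt", new Date()), eq("notified", false)),
                set("notified", true))) != null) {
            sendExpirationWarning(user);
        }
    }

    private void sendExpirationWarning(Document user) {
        // Illustrative placeholder for the actual email-sending logic.
    }
}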
Spring APIs Are Great

The first question is how to integrate with Spring. Let's say I have a method like this:
@Scheduled(fixedRate = ONE_HOUR)
@SchedulerLock(name = "sendSubscriptionExpirationWarning")
public void sendSubscriptionExpirationWarning() {
    ...
}

When the task is being executed, I need to intercept the call and decide whether it should really be executed or skipped.
Luckily for us, it's pretty easy thanks to the Spring scheduler architecture. Spring uses TaskScheduler for scheduling. It's easy to provide an implementation that wraps all tasks that are being scheduled and delegates the execution to an existing scheduler.
public class LockableTaskScheduler implements TaskScheduler {
    private final TaskScheduler taskScheduler;
    private final LockManager lockManager;

    public LockableTaskScheduler(TaskScheduler taskScheduler, LockManager lockManager) {
        this.taskScheduler = requireNonNull(taskScheduler);
        this.lockManager = requireNonNull(lockManager);
    }

    @Override
    public ScheduledFuture<?> schedule(Runnable task, Trigger trigger) {
        return taskScheduler.schedule(wrap(task), trigger);
    }

    ...

    private Runnable wrap(Runnable task) {
        return new LockableRunnable(task, lockManager);
    }
}

Now we just need to read the @SchedulerLock annotation to get the lock name. This is, again, quite easy since Spring wraps each scheduled method call in ScheduledMethodRunnable, which provides a reference to the method to be executed, so reading the annotation is a piece of cake.
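For illustration, the lock name could be read roughly like this. The LockNameReader class, the getLockName helper, and the use of Spring's AnnotationUtils are my own assumptions; only ScheduledMethodRunnable and the @SchedulerLock annotation come from the article.

import java.lang.reflect.Method;
import java.util.Optional;
import org.springframework.core.annotation.AnnotationUtils;
import org.springframework.scheduling.support.ScheduledMethodRunnable;

class LockNameReader {
    // Returns the lock name if the scheduled task is annotated with @SchedulerLock
    // (the annotation shown earlier, assumed to be on the classpath).
    Optional<String> getLockName(Runnable task) {
        if (task instanceof ScheduledMethodRunnable) {
            Method method = ((ScheduledMethodRunnable) task).getMethod();
            SchedulerLock annotation = AnnotationUtils.findAnnotation(method, SchedulerLock.class);
            if (annotation != null) {
                return Optional.of(annotation.name());
            }
        }
        return Optional.empty();
    }
}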
Lock Provider

The actual distributed locking is delegated to a pluggable LockProvider. Since I am usually using Mongo, I have started with a LockProvider which places locks into a shared Mongo collection. The document looks like this.
{ "_id" : "lock name", "lockUntil" : ISODate("2017-01-07T16:52:04.071Z"), "lockedAt" : ISODate("2017-01-07T16:52:03.932Z"), "lockedBy" : "host name" }In _id, we have the lock name from the @SchedulerLock annotation. _id has to be unique and Mongo makes surewe do not end up with two documents for the same lock. The 'lockUntil' fieldis used for locking ― if it's in the future, the lock is held, if it isin the past, no task is running. And 'lockedAt' and 'lockedBy' are currently just for info and troubleshooting.
There are three situations that might happen when acquiring the lock.
No document with the given ID exists: We want to create it to obtain the lock.
The document exists, lockUntil is in the past: We can get the lock by updating it.
The document exists, lockUntil is in the future: The lock is already held, skip the task.
Of course, we have to execute the steps above in a thread-safe way. The algorithm is simple; a rough code sketch follows the steps below.
1. Try to create the document with 'lockUntil' set to the future. If it fails due to a duplicate key error, go to step 2. If it succeeds, we have obtained the lock.
2. Try to update the document with the condition 'lockUntil' <= now. If the update succeeds (the document is updated), we have obtained the lock.
To release the lock, set 'lockUntil' = now.
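This is not ShedLock's actual source, just a minimal sketch of the two acquisition steps plus the release with the Mongo Java driver. The field names come from the document above; the collection, the MongoLockSketch class, and the method signatures are my assumptions.

import static com.mongodb.client.model.Filters.and;
import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Filters.lte;
import static com.mongodb.client.model.Updates.combine;
import static com.mongodb.client.model.Updates.set;

import com.mongodb.MongoWriteException;
import com.mongodb.client.MongoCollection;
import java.time.Instant;
import java.util.Date;
import org.bson.Document;

class MongoLockSketch {

    private final MongoCollection<Document> locks;
    private final String hostname;

    MongoLockSketch(MongoCollection<Document> locks, String hostname) {
        this.locks = locks;
        this.hostname = hostname;
    }

    boolean lock(String name, Instant lockUntil) {
        Date now = new Date();
        try {
            // Step 1: try to insert the lock document. The unique _id means only one
            // process can succeed here.
            locks.insertOne(new Document("_id", name)
                    .append("lockUntil", Date.from(lockUntil))
                    .append("lockedAt", now)
                    .append("lockedBy", hostname));
            return true;
        } catch (MongoWriteException e) {
            // Step 2: the document already exists (treated here as a duplicate key error);
            // try to take over an expired lock by updating it only if 'lockUntil' <= now.
            Document previous = locks.findOneAndUpdate(
                    and(eq("_id", name), lte("lockUntil", now)),
                    combine(set("lockUntil", Date.from(lockUntil)),
                            set("lockedAt", now),
                            set("lockedBy", hostname)));
            return previous != null;
        }
    }

    void unlock(String name) {
        // Release the lock by moving 'lockUntil' to now.
        locks.updateOne(eq("_id", name), set("lockUntil", new Date()));
    }
}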
If the process dies while holding the lock, the lock will eventually be released when lockUntil moves into the past. By default, we are setting 'lockUntil' one hour into the future, but it's configurable by a @SchedulerLock annotation parameter.
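For example, assuming the annotation exposes a lockAtMostFor parameter in milliseconds (the parameter name is my assumption; the article does not name it), the lock duration could be shortened like this:

@Scheduled(fixedRate = ONE_HOUR)
// lockAtMostFor is an assumed parameter name; it caps how far into the future 'lockUntil' is set.
@SchedulerLock(name = "sendSubscriptionExpirationWarning", lockAtMostFor = 15 * 60 * 1000)
public void sendSubscriptionExpirationWarning() {
    ...
}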
MongoDB guarantees atomicity of the operations above, so the algorithm works even if several processes are trying to acquire the same lock. Exactly the same algorithm works with SQL databases as well; we just have a DB row instead of a document.
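As a rough illustration of the SQL variant (not ShedLock's actual code; the table name, column names, and the use of Spring's JdbcTemplate are my assumptions):

import java.sql.Timestamp;
import org.springframework.dao.DuplicateKeyException;
import org.springframework.jdbc.core.JdbcTemplate;

class JdbcLockSketch {

    private final JdbcTemplate jdbcTemplate;
    private final String hostname;

    JdbcLockSketch(JdbcTemplate jdbcTemplate, String hostname) {
        this.jdbcTemplate = jdbcTemplate;
        this.hostname = hostname;
    }

    boolean lock(String name, Timestamp lockUntil) {
        try {
            // Step 1: insert the lock row; the primary key on 'name' allows only one winner.
            jdbcTemplate.update(
                "INSERT INTO shedlock (name, lock_until, locked_at, locked_by) VALUES (?, ?, now(), ?)",
                name, lockUntil, hostname);
            return true;
        } catch (DuplicateKeyException e) {
            // Step 2: the row exists; take over the lock only if it has already expired.
            int updated = jdbcTemplate.update(
                "UPDATE shedlock SET lock_until = ?, locked_at = now(), locked_by = ? "
                    + "WHERE name = ? AND lock_until <= now()",
                lockUntil, hostname, name);
            return updated > 0;
        }
    }
}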