Why and how you should monitor scheduled tasks
Oh Dear is the monitoring SaaS that my buddy Mattias and I are running. As you might suspect, our service can monitor the uptime of sites and SSL certificates' health. What sets Oh Dear apart from the competition is that it can also monitor performance and detect broken links and mixed content on any of the pages of your site.
Today, we added a new type of monitoring: scheduled tasks monitoring. Oh Dear can now notify you whenever one of your scheduled tasks has not run or is running too late.
You can get started monitoring your schedule today. We have a free ten-day trial. When subscribing, use the coupon code MONITOR-ALL-THE-THINGS to get 30% off on the first three months.
In this blog post, I'd like to introduce how you can use scheduled task monitoring in Oh Dear, and how it works under the hood. There were a lot of interesting challenges we had to solve. I hope you're ready to dig in.
Why monitor scheduled tasks in the first place
Because I mostly write Laravel apps, I'm going to use Laravel examples in the remainder of this post, but most things apply to other frameworks and languages as well.
Before heading into how you can monitor scheduled tasks, let's first discuss why you would want to monitor them in the first place.
In Laravel you can schedule tasks in the console kernel.
Here's what a typical schedule could look like. I took this example from the Laravel app that powers the blog you are reading.
namespace App\Console;

use App\Console\Commands\PublishScheduledPostsCommand;
use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    protected function schedule(Schedule $schedule)
    {
        // Tasks that run every minute
        $schedule->command(PublishScheduledPostsCommand::class)->everyMinute();
        $schedule->command('mailcoach:calculate-statistics')->everyMinute();
        $schedule->command('mailcoach:send-scheduled-campaigns')->everyMinute();

        // Hourly and daily housekeeping
        $schedule->command('mailcoach:send-campaign-summary-mail')->hourly();
        $schedule->command('mailcoach:delete-old-unconfirmed-subscribers')->daily();
        $schedule->command('schedule-monitor:clean')->daily();
        $schedule->command('responsecache:clear')->daily();

        // Tasks that run at a fixed time
        $schedule->command('backup:clean')->daily()->at('01:00');
        $schedule->command('backup:run')->dailyAt('3:00');
        $schedule->command('mailcoach:send-email-list-summary-mail')->mondays()->at('9:00');
    }

    protected function commands()
    {
        $this->load(__DIR__.'/Commands');
    }
}
To make the scheduled tasks actually run, you need to add this entry to the cron on your server:
* * * * * cd /path-to-your-project && php artisan schedule:run >> /dev/null 2>&1
This cron entry will start Laravel every minute, and Laravel will execute the tasks that need to be run that minute.
You might think that both cron and Laravel are rock solid, and because of that, nothing can go wrong here. All tasks always run on time, right? Actually, there are a lot of things that can go wrong.
When executing the schedule, Laravel will call each of the tasks that need to be run one after the other. If a task takes a long time to run, let's say a couple of seconds, then all other tasks will have to wait. Take a look at the kernel above. If PublishScheduledPostsCommand takes 30 seconds, then the mailcoach:calculate-statistics task will only start 30 seconds after the minute.
But things can go even more wrong. Imagine PublishScheduledPostsCommand takes more than a minute to run. When a new minute starts while PublishScheduledPostsCommand is still running, cron will call schedule:run again. This will cause PublishScheduledPostsCommand to run multiple times at the same time. Of course, this can have unintended consequences.
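Laravel's scheduler has options that soften both problems. The fragment below is a minimal sketch that reuses two commands from the kernel above; whether these options suit those particular commands is an assumption on my part. Note that neither option tells you when a task did not run at all, so they complement monitoring rather than replace it.

```php
<?php

namespace App\Console;

use App\Console\Commands\PublishScheduledPostsCommand;
use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    protected function schedule(Schedule $schedule)
    {
        // Skip this run when the previous run of the same task is still busy.
        $schedule->command(PublishScheduledPostsCommand::class)
            ->everyMinute()
            ->withoutOverlapping();

        // Start the task in its own process, so a slow task
        // does not delay the tasks scheduled after it.
        $schedule->command('mailcoach:calculate-statistics')
            ->everyMinute()
            ->runInBackground();
    }
}
```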
This is just a small example. There are quite a few other things that can go wrong. The commands themselves might throw an exception. In previous versions of Laravel, this would cause all subsequent tasks not to be run at all. If you want to know more about the challenges, watch this excellent talk Michael Dyrynda gave at the Laravel Worldwide Meetup.
When a scheduled command throws an exception, a tracker like Flare might notify you. But Flare and similar services won't alert you whenever a scheduled command doesn't run or doesn't run on time, because... no exception or error is thrown.
That's why you need to monitor the execution of your scheduled tasks as well.
Using Oh Dear to monitor scheduled tasks
Let's take a look at how you can use Oh Dear to monitor your schedule.
The basic principle of monitoring scheduled tasks is quite simple. Whenever a scheduled task completes, it should ping an endpoint at Oh Dear. Whenever Oh Dear doesn't receive a ping in time, Oh Dear will send a notification via Mail, Slack, a webhook, SMS, ...
If you're using Laravel, getting your schedule synced with Oh Dear is a breeze. In addition to monitoring your schedule locally, the spatie/laravel-schedule-monitor package can sync up your schedule with Oh Dear. It will also, after each scheduled task completes, automatically ping the right endpoint at Oh Dear.
Here's a quick video where I demonstrate how the package can be used.
If you want to know more about the local monitoring part of the laravel-schedule-monitor package, or how it works under the hood, read this blog post or watch this stream.
Users of other languages and frameworks can use the scheduled tasks monitoring too. Task monitors can be manually created at Oh Dear.
In the cron expression field, you should enter... 🥁 a cron expression. When you've entered one, you get a preview of the dates and times Oh Dear expects the task to run.
In the grace time field, you can put a number of minutes Oh Dear should wait before concluding that a task is down. It's probably safe to set this to 5 minutes or so.
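To make the role of the grace time concrete, here is a small sketch of the decision in plain PHP. It is an illustration only, not how Oh Dear implements it: the function name isTaskDown and its parameters are hypothetical, and the expected run time is assumed to have been calculated from the cron expression already.

```php
<?php

// Illustrative sketch only, not Oh Dear's actual implementation.
// A task counts as down when the expected run time plus the grace time
// has passed and no ping has arrived since the expected run time.
function isTaskDown(
    DateTimeImmutable $expectedAt,
    int $graceMinutes,
    ?DateTimeImmutable $lastPingAt,
    DateTimeImmutable $now
): bool {
    $deadline = $expectedAt->modify("+{$graceMinutes} minutes");

    // Still inside the grace period: too early to tell.
    if ($now <= $deadline) {
        return false;
    }

    // Down only if no ping came in at or after the expected run time.
    return $lastPingAt === null || $lastPingAt < $expectedAt;
}

$expectedAt = new DateTimeImmutable('2021-01-01 03:00:00');
$now = new DateTimeImmutable('2021-01-01 03:06:00');

// No ping at all, and the 5 minutes of grace time are over.
var_dump(isTaskDown($expectedAt, 5, null, $now)); // bool(true)

// A ping came in at 03:02, so the task is fine.
var_dump(isTaskDown($expectedAt, 5, new DateTimeImmutable('2021-01-01 03:02:00'), $now)); // bool(false)
```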
After submitting, you'll see this screen with setup instructions. It also shows you the URL that you need to ping at the end of your scheduled tasks. For PHP, it could be as simple as adding this to the scheduled task.
@file_get_contents('https://ping.ohdear.app/7b023b17-9325-4a27-a1c8-8af83c520e78');
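If you'd rather not have a slow network call hold up your task, you could wrap the ping in a small helper with a timeout. This is a sketch, not something Oh Dear provides: the pingMonitor function and the 5-second default timeout are assumptions for illustration.

```php
<?php

// Sketch of a ping helper. The function name and the 5-second default
// timeout are assumptions for illustration.
function pingMonitor(string $url, int $timeoutSeconds = 5): bool
{
    $context = stream_context_create([
        'http' => [
            'method' => 'GET',
            // Give up quickly when the endpoint is slow or unreachable.
            'timeout' => $timeoutSeconds,
        ],
    ]);

    // Warnings are suppressed: a failed ping should never break the task itself.
    // An empty response body is still a successful ping, hence the strict check.
    return @file_get_contents($url, false, $context) !== false;
}

// At the very end of the scheduled task, you would call:
// pingMonitor('https://ping.ohdear.app/7b023b17-9325-4a27-a1c8-8af83c520e78');
```

The helper returns false when the ping fails, so the task can log that if it wants to, but it never throws.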
After you've added scheduled task monitors, either manually or through the Laravel package, you'll see a list of defined monitors like this one:
A cool thing to know is that, through the power of Livewire, this screen auto-updates. So you'll see incoming pings happen in real-time. Pretty cool!
When you click a scheduled task, you'll see this detail page.
You'll see a grid-like month view, in which each square represents a day. If everything went well running the scheduled task on a particular day, we'll color that square green; otherwise, we'll color it red. This way, you have a quick bird's-eye view of how your scheduled task has performed over the past month.
Below that month view, you see a list of all recent events for your scheduled task. If you use the Laravel package to monitor your tasks, we'll even show you the runtime of the task and memory usage.
When something goes wrong running your task (it did not run, ran too late, or an error occurred within your task), Oh Dear will send a notification. There are several notification channels available: Mail, Slack, Webhooks, SMS, ...
Here's what such a notification looks like in Slack.
Under the hood of the scheduled task monitoring
When starting out creating the scheduled monitoring check on Oh Dear, I thought, "how hard can this be?". After spending half a year on this problem, I can tell you: it's hard to get it right. There are lots of caveats. Michael Dyrynda, who's been tackling the same problem, confirmed that there are a lot of pitfalls. Let's dive into two of those.
Pitfall 1: handling a large number of incoming requests
You can get in the high numbers pretty fast. Let's say the average app has 10 scheduled tasks that run every minute. And let's assume the monitoring app has 500 teams that each have 10 sites that need to be monitored.
500 teams x 10 sites x 10 scheduled tasks x 24 hours x 60 minutes = 72 million incoming requests a day. That's a lot!
What makes this worse is that requests can also happen concurrently: 500 teams x 10 sites x 10 tasks = 50 000 requests coming in at more or less the same time.
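The same back-of-the-envelope calculation, written out so you can plug in your own numbers. The values are the assumptions from the text above, not measurements.

```php
<?php

// The numbers below are the assumptions from the text, not measurements.
$teams = 500;
$sitesPerTeam = 10;
$tasksPerSite = 10;        // each task runs every minute
$minutesPerDay = 24 * 60;

// Pings that can arrive at more or less the same moment.
$pingsPerMinute = $teams * $sitesPerTeam * $tasksPerSite;

// Total pings over a full day.
$pingsPerDay = $pingsPerMinute * $minutesPerDay;

echo number_format($pingsPerMinute), " pings per minute\n"; // 50,000 pings per minute
echo number_format($pingsPerDay), " pings per day\n";       // 72,000,000 pings per day
```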
We solved this by putting our ping receiving endpoint on Laravel Vapor. With some tuning, AWS Lambda can handle this amount of traffic.
Pitfall 2: avoiding sending false positives
I consider sending a false positive notification as bad as sending no notification at all. While building scheduled task monitoring, I've noticed that it's very easy to send a false positive notification. Imagine you log the pings that come in and send a notification whenever a ping does not arrive on time. You might think this approach would always work out. But it breaks down very fast. If your monitoring app itself goes down for a while, it will miss all incoming pings. When the monitoring app comes up again and checks the DB, it would send notifications for all the pings it missed.
To solve this problem, the Oh Dear endpoint that receives pings from scheduled tasks is not part of the Oh Dear main app.
As mentioned above, it is a separate Laravel app that runs on Laravel Vapor. The main Oh Dear app has a scheduled task called verify-scheduled-task-monitoring-works that runs every 10 seconds. It pings the ping receiving endpoint on Vapor and uses the same logic as all other scheduled tasks.
Every time that check completes, we open up a time window in which we can send notifications for the scheduled tasks we monitor for our clients. If our own verify-scheduled-task-monitoring-works does not complete, we don't send any notifications to our clients because we can't guarantee that our infrastructure is working correctly. That's how we avoid sending false positives. Of course, when verify-scheduled-task-monitoring-works fails, we send a notification to ourselves, so we know we need to fix the problem ASAP.
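The idea of a time window that only opens after a successful self-check can be sketched in a few lines of plain PHP. This is an illustration of the principle, not our actual code: the function name mayNotifyClients and the 30-second window are assumptions.

```php
<?php

// Illustrative sketch only: the function name and the 30-second window
// are assumptions, not Oh Dear's actual code.
function mayNotifyClients(
    ?DateTimeImmutable $lastSelfCheckCompletedAt,
    DateTimeImmutable $now,
    int $windowSeconds = 30
): bool {
    // The self-check never completed: we cannot trust our own infrastructure.
    if ($lastSelfCheckCompletedAt === null) {
        return false;
    }

    $elapsed = $now->getTimestamp() - $lastSelfCheckCompletedAt->getTimestamp();

    // Client notifications are only allowed shortly after a successful self-check.
    return $elapsed >= 0 && $elapsed <= $windowSeconds;
}

$completedAt = new DateTimeImmutable('2021-01-01 12:00:00');

// Ten seconds after a successful self-check: the window is open.
var_dump(mayNotifyClients($completedAt, $completedAt->modify('+10 seconds'))); // bool(true)

// Five minutes without a successful self-check: stay quiet towards clients.
var_dump(mayNotifyClients($completedAt, $completedAt->modify('+5 minutes'))); // bool(false)
```

With a gate like this, an outage of the monitoring infrastructure closes the window by itself, because the self-check stops completing.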
In closing
I hope you've enjoyed this introduction to scheduled task monitoring at Oh Dear. Even though I underestimated it, Mattias and I did have a fun time building it.
Scheduled tasks monitoring has some extra features not mentioned in this blog post. To know more, check out the documentation at Oh Dear or read the launch blog post at Oh Dear.
To try out the scheduled tasks monitoring at Oh Dear, and all the other checks (uptime, broken links, certificate health, mixed content, ...), register for a free ten-day trial. If you decide to subscribe, use the coupon code MONITOR-ALL-THE-THINGS to get 30% off on the first three months.