At some point, you’re going to need to run scheduled tasks in your application. Here are three options for how to do this in PHP.
For all of these methods I recommend you write the scheduled task in PHP because you’ll be able to reuse existing code from your application and you can develop the tasks using Test Driven Development.
1. A System Level Cron Job For Every Task
Unix and Windows each have a built-in system for creating scheduled tasks. In Windows, it’s called Scheduled Tasks and in Unix land it’s called Cron (I’m going to refer to both as Cron for simplicity in this article). With this method, you create a command line script for each task and then add an entry to Cron for it.
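As a rough illustration, a single-purpose task script might look like the sketch below. The task itself (deleting stale files), the function name removeStaleFiles, the paths, and the schedule in the comment are all made up for the example; the point is only that the work lives in a plain function you can test, and the script is what Cron calls.

```php
<?php
declare(strict_types=1);

// cleanup_sessions.php: hypothetical single-purpose task, run from Cron.
// Example crontab entry (path and schedule are assumptions):
//   */15 * * * * /usr/bin/php /var/www/app/cron/cleanup_sessions.php

/**
 * Delete files in $dir that are older than $maxAgeSeconds.
 * Returns the number of files removed.
 */
function removeStaleFiles(string $dir, int $maxAgeSeconds, int $now): int
{
    $removed = 0;
    foreach (glob($dir . '/*') ?: [] as $file) {
        if (is_file($file) && $now - filemtime($file) > $maxAgeSeconds) {
            unlink($file);
            $removed++;
        }
    }
    return $removed;
}

// Do the work when the script is run from the command line, then exit.
if (PHP_SAPI === 'cli') {
    $count = removeStaleFiles(sys_get_temp_dir() . '/myapp_sessions', 3600, time());
    echo "Removed {$count} stale files\n";
}
```

Passing the current time in as a parameter is a small design choice that makes the function easy to cover with a unit test.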
The nice part about using the built-in system is that it’s an easy way to get started and each task gets run in isolation from each other so you don’t need to worry about one script affecting any other.
There are some downsides to this approach. This setup works best when you have lots of little scripts that each perform a little task and then exit. This can cause your list of scheduled tasks to grow very large and be hard to keep in sync across your development, staging, and production servers. It’s not impossible, but even with a tool like Ansible performing the sync it can be unwieldy.
This process also doesn’t work well for long running processes. Cron doesn’t check whether the previous run has finished, so if a script gets caught in an infinite loop, Cron will keep starting new copies of it on every scheduled run, which could eventually crash your server. This is the worst case of course.
2. A Single Cron Page For Everything
The next level I usually suggest is to create a single endpoint on your web server that controls the running of your Cron jobs within PHP.
The endpoint can automatically discover new jobs (you place them in a specific folder for it to scan). By having a single endpoint that controls when the jobs fire, you can prevent the same stuck task from running over and over again and taking down your server. You’ll do this by checking a flag to see if it’s already running and exiting if it is. You’ll also be able to add better rules, like a job that must run once an hour regardless of when during the hour (which helps to balance out server load on smaller servers).
To get this started you’ll want to create a single cron job on your server that calls wget and retrieves your URL.
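Here is a minimal sketch of what such an endpoint could look like. Everything specific in it is an assumption made for illustration: the function name runDueJobs, the convention that each job file returns a callable, the JSON state file, the lock file used as the "already running" flag, and the example.com URL in the comment.

```php
<?php
declare(strict_types=1);

// cron.php: hypothetical single endpoint, hit by one system cron entry, e.g.
//   * * * * * wget -q -O /dev/null https://example.com/cron.php
// The URL, folder layout, and file names are illustrative assumptions.

/**
 * Run every job file in $jobDir that has not yet run during the current hour.
 * Each job file is assumed to `return` a callable. Returns the job names run.
 */
function runDueJobs(string $jobDir, string $stateFile, string $lockFile, int $now): array
{
    // The "already running" flag: a non-blocking exclusive file lock.
    $lock = fopen($lockFile, 'c');
    if ($lock === false) {
        return [];
    }
    if (!flock($lock, LOCK_EX | LOCK_NB)) {
        fclose($lock);
        return []; // a previous run is still in progress, so exit quietly
    }

    // Timestamps of the last run of each job, keyed by job name.
    $lastRun = is_file($stateFile)
        ? (json_decode((string) file_get_contents($stateFile), true) ?: [])
        : [];
    $ran = [];

    try {
        // Auto-discovery: any *.php file dropped into the folder is a job.
        foreach (glob($jobDir . '/*.php') ?: [] as $file) {
            $name = basename($file, '.php');
            $alreadyRanThisHour = isset($lastRun[$name])
                && intdiv((int) $lastRun[$name], 3600) === intdiv($now, 3600);
            if ($alreadyRanThisHour) {
                continue;
            }
            $job = require $file;
            $job();
            $lastRun[$name] = $now;
            $ran[] = $name;
        }
    } finally {
        // Save progress and release the flag even if a job threw.
        file_put_contents($stateFile, json_encode($lastRun));
        flock($lock, LOCK_UN);
        fclose($lock);
    }

    return $ran;
}
```

The page itself would then call something like runDueJobs(__DIR__ . '/jobs', __DIR__ . '/cron_state.json', __DIR__ . '/cron.lock', time()). Note that an exception in one job still stops the jobs after it in the same run.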
The downside to this approach is that you’re going to have to recreate a subset of Cron within your own application, but the positives usually outweigh the downsides on this one. There’s also a second catch: because the tasks all run within the same process, it’s possible for a single task to stop all the other tasks.
3. A Task Runner Resque/Gearman/Etc.
To deal with this problem it’s time to look into a task runner of some sort like Resque or RabbitMQ. Resque and RabbitMQ allow you to tell the system you want a job run, and then a worker runs the task when one is available. The nice part about this process is that each task gets run separately so there’s less worry about one task causing problems with the other tasks.
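To make the enqueue-then-work idea concrete, here is a toy in-memory sketch. It is not the Resque or RabbitMQ API: the JobQueue class and its methods are invented for this example, and real systems keep the queue in an external service and run workers as separate processes rather than in a loop in the same script.

```php
<?php
declare(strict_types=1);

// Toy in-memory stand-in for a job queue, for illustration only.
// JobQueue, enqueue(), and work() are hypothetical names, not a library API.

final class JobQueue
{
    private SplQueue $jobs;

    public function __construct()
    {
        $this->jobs = new SplQueue();
    }

    /** Ask for a job to be run later; this returns immediately. */
    public function enqueue(string $name, callable $job, array $args = []): void
    {
        $this->jobs->enqueue([$name, $job, $args]);
    }

    /**
     * Worker loop: run queued jobs one at a time. A failing job is recorded
     * and skipped, so it cannot stop the jobs queued behind it.
     *
     * @return array<string, string> job name => "ok" or an error message
     */
    public function work(): array
    {
        $results = [];
        while (!$this->jobs->isEmpty()) {
            [$name, $job, $args] = $this->jobs->dequeue();
            try {
                $job(...$args);
                $results[$name] = 'ok';
            } catch (Throwable $e) {
                $results[$name] = 'failed: ' . $e->getMessage();
            }
        }
        return $results;
    }
}

// Usage: the application enqueues work, and a worker processes it later.
$queue = new JobQueue();
$queue->enqueue('send_email', function (string $to): void {
    echo "Emailing {$to}\n";
}, ['user@example.com']);
$queue->enqueue('broken', function (): void {
    throw new RuntimeException('boom');
});
$queue->enqueue('resize_image', function (): void {
    echo "Resizing image\n";
});
print_r($queue->work());
```

The detail to notice is that the broken job is reported as failed while the jobs around it still complete, which is the isolation a real task runner gives you across processes.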
For applications I’ve developed recently, I usually start with #2 and then work my way over to #3 when load gets high enough. I think most people will find any of these options works well, but each has its pros and cons.