We run a variety of scheduled batch jobs at Cinchcast, responsible for many different tasks, including media encoding, CDN storage, and several maintenance, cleanup, and alerting duties. We’ve been using a commercial solution so far, but its pricing has become more restrictive as the number of jobs grows, and more hardware has to be allocated to handle the increasing load as the company grows.
There’s also been an internal push toward Dockerized applications and greater reliance on open source projects. The time seemed right to revisit and renew our batch job infrastructure.
Open source projects
Looking at the alternatives, the following projects seemed to fit our needs:
- Quartz is a great Java library for job scheduling. It’s got a ton of great features, including built-in clustering support, JTA transactions and DB persistence.
- Piezo fills the gaps that Quartz leaves. Quartz is simply a library, so it needs to be hosted somewhere. It also lacks a job run history and an admin user interface. Piezo is a very small wrapper around Quartz that provides all of these. We’ve already started contributing to Piezo, and will continue to do so. Together, these projects provide the same feature set we had with the commercial solution.
- Docker is great, but on its own you are pretty much limited to a single host. The Shipyard project, based on Docker Swarm, turns a pool of Docker hosts into a single, virtual Docker host.
Migrating to the new infrastructure was simple:
- The existing Java code was framework-agnostic, with only a few wrappers needed for the commercial solution. Migrating to Quartz only required a new set of wrappers.
- We Dockerized the Piezo components: the Worker component that hosts the Quartz scheduler and runs the jobs, and the Scala-based web admin. This was as simple as creating Dockerfiles that follow the deployment instructions.
- We deployed Shipyard and added several Docker hosts to the swarm. We now deploy new or updated Piezo Workers using a combination of Jenkins and Ansible.
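The wrapper idea from the first step can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Cinchcast code: `BatchTask`, `CleanupTask`, `TaskJobAdapter`, `SchedulerJob` and `JobFailedException` are all hypothetical names. In particular, `SchedulerJob` is a local stand-in for Quartz's `org.quartz.Job` interface, used here only so that the example compiles without Quartz on the classpath.

```java
import java.util.Map;

// Runs the adapter once, the way a scheduler would when a trigger fires.
class WrapperSketch {
    public static void main(String[] args) throws JobFailedException {
        CleanupTask task = new CleanupTask();
        SchedulerJob job = new TaskJobAdapter(task);
        job.execute(Map.of("maxAgeDays", "30"));
        System.out.println("maxAgeDays used: " + task.lastMaxAgeDays);
    }
}

// Framework-agnostic business logic: knows nothing about any scheduler.
interface BatchTask {
    void run(Map<String, String> params) throws Exception;
}

// Hypothetical task standing in for a real cleanup job.
class CleanupTask implements BatchTask {
    int lastMaxAgeDays = -1;

    @Override
    public void run(Map<String, String> params) {
        // Fall back to 7 days when the scheduler passes no value.
        lastMaxAgeDays = Integer.parseInt(params.getOrDefault("maxAgeDays", "7"));
    }
}

// Assumption: stand-in for the scheduler's job contract (org.quartz.Job in Quartz).
interface SchedulerJob {
    void execute(Map<String, String> jobData) throws JobFailedException;
}

// Assumption: stand-in for the scheduler's failure type.
class JobFailedException extends Exception {
    JobFailedException(String message, Throwable cause) {
        super(message, cause);
    }
}

// The only scheduler-specific code: adapts a BatchTask to the job contract.
class TaskJobAdapter implements SchedulerJob {
    private final BatchTask task;

    TaskJobAdapter(BatchTask task) {
        this.task = task;
    }

    @Override
    public void execute(Map<String, String> jobData) throws JobFailedException {
        try {
            task.run(jobData);
        } catch (Exception e) {
            // Translate any task failure into the scheduler's own exception type.
            throw new JobFailedException("task failed: " + task.getClass().getSimpleName(), e);
        }
    }
}
```

With this shape, switching schedulers means rewriting only the adapter, while the tasks stay untouched. One difference to keep in mind: by default Quartz instantiates job classes itself, so a real Quartz wrapper would typically need a no-argument constructor or a custom `JobFactory` rather than the constructor injection shown here.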
Our job infrastructure is now clustered across many servers, using only free and open source technologies: