Dockerization Part 1: Building

I’ve been long overdue for a series of articles explaining how our current build system works. One of the major projects I was involved with before this recent reorg involved overhauling our manual build process into a shiny new CI/CD system that would take the code from commit to production in a regulated, automated fashion. As always, the reward for doing a good job is more work like that; when we decided to move to Docker to better support our new team structure, I ended up doing a lot of the foundational work on our new build-test-deliver pipeline. Part one of that pipeline is, of course, building and storing containers.

Your mission, if you choose to accept it

In the old world, before we dockerized our applications, we were following a fairly typical system (that I designed): our CI server runs tests against the code, then bundles it up as an archive file. After that, one environment at a time and on request, it would SCP the tarball down to the server, stop the running process, remove the old codebase, and unpack the new before starting the process again. There were configuration files that had to be saved off and moved back in afterward in a few cases, but we had all those edge cases ironed out. It was working, and there were almost no changes to it in the year before we launched docker.

As we were preparing to go live, I didn’t want to lose the build pipelines we had worked so hard on. And yet, docker containers are fundamentally different than tarballs of code files. Furthermore, our operators (who are responsible for putting code into production) complained of having too many buttons to click: often, our servers had 3-4 codebases on them, meaning 3-4 buttons to click to update one server. They definitely didn’t want to do one button per container. On the other hand, our developers were clear on what they wanted: more deploys, faster deploys, and breaking out their monoliths into modules and microservices so they could go even faster. How to balance these concerns?

Another wrinkle emerged as well once I got my hands on our environment: we chose Rancher as our docker management tool of choice. Rancher is a great little tool, and I enjoy working with its GUI, but when most companies seem to be standardizing on Kubernetes, it was hard to find good examples and tutorials for how to work with Rancher instead.

With all those pressures bearing down on me, my task was straightforward, but far from simple.

How to build a container in 30 days

The promise of containers seemed like it resolved a lot of our headaches overall: developers control the interior of the container, and Platform Ops controls the outside of it. In this brave new world, I don’t have to care what goes in a container, but it’s my job to ensure they get to where they’re going every time without fail. In practice, however, I found I need to understand quite a bit about containers themselves.

For the purposes of this article, you don’t need to know or care about the virtualization layer; just trust that a container is isolated from everything around it, until and unless you drill holes in it (which we do. A lot. But I understand that’s common). You will need to know a little about how they’re built, however.

Picture a repository of source code. At some point, to dockerize the application contained within, you need  a Dockerfile: a file of instructions on how to build this container. Almost every container begins with an instruction to extend from another image, much like classes extending from a base class. This was really handy for us, since it means we can put anything we need into a custom base image and all the developers will have it pre-installed.

From there, there’s a series of customizations to the container. Generally, one step involves copying the code into the container, and another tells the container what executable to run when it starts. For Node.js, we ask our developers to put their code in a standard location, then execute “npm start” when the container boots up, letting them define what that means for their application.

Once you’re happy with what the container contains, it’s time to seal it up and ship it. In this case, that means two commands: a “tag” command, which gives it a name more interesting than the default (which will be something like 2b9c0185251d), and a “push” command, which uploads the docker container to a remote repository. If the container is intended to live in a central repository, it has to be tagged with that repository as part of the name (including a port number, which usually defaults to 5000 for a Docker registry unless you put an Nginx in front to make it 80): something like “artifactory.internal:5000/dt-node-base”. Appended to that is a version: this can be a sequential number, or a word or anything else. By convention, each container is tagged twice: once with a sequential number, and once with the word “latest”. That makes it so you can always pull down the very latest node base container from our Artifactory repository by asking it for “artifactory.internal:5000/dt-node-base:latest”.

The system

So we have a number of parts to this build system that the CI/CD server has to integrate with. The first piece is to begin with raw source code, including a Dockerfile; we had been using Subversion, but the developers had been asking for Git for so long we finally broke down and bought a Bitbucket server and let them migrate.

The next piece is to build the containers with Docker. Since we were using Bamboo as our CI/CD server, I installed Docker on all the remote agents; this required an OS upgrade for them to Red Hat 7, but I was able to script the install using Ansible to make doing it across our whole system less painful.

The next piece is somewhere to store the containers when we’re done with them. As you can guess by the previous example, we decided to use Artifactory for this; this is mostly because, as the developers moved to Node, they were asking for a private NPM server, and Artifactory is able to do double duty and hold both types of artifacts.

For the communication between them, my coworker put together a script we could put on each build server that the plans could use to ensure they didn’t miss any steps. It’s straightforward, looking something like this:

#!/bin/sh -e
# $1 Project Name (dt-nodejs)

docker build -t artifactory.internal:5000/$1:$bamboo_buildNumber \
 -t artifactory.internal:5000/$1:latest

docker push artifactory.internal:5000/$1:$bamboo_buildNumber
docker push artifactory.internal:5000/$1:latest
echo "$1:$bamboo_buildNumber and $1:latest pushed to Artifactory on artifactory.internal:5000"

This means that every build tags the container with the number of the build, giving us an easy source of sequential numbers for the containers without thinking about it. It does mean, however, that building a new pipeline for an existing container name will start the numbering over from 1 and overwrite old containers, but we encourage developers to edit their build plans instead of starting over where possible. If you have any ideas on how to prevent that, I’d love to hear them.

(I’ve actually enhanced this script since, but I’ll talk about that in a future entry)


How to force Bamboo to build on Linux

So let’s talk about build servers for a minute. I manage the company’s Bamboo server, which we use to do builds and continuous integration. I don’t know if this is an unusual use case or what, but some of my builds require Windows and others perform best on Linux. So we have Windows agents and Linux agents.

Some things you would think are intuitive are not. For example, there’s no way to differentiate in a Script Task between CMD and Bash. How many scripts are actually cross-compatible between the two? Not many, in my experience. Often, I’d write out script tasks for Bash and they’d get farmed out to a Windows server by mistake and fail to, say, create a tar archive or wget a resource. So how can I force those to execute on Windows?

The solution I hit upon is pretty simple: I created a new executable definition called Bash, located at /bin/bash. This will auto-detect on new Linux agents, but not on Windows agents. Then I can put my scripts into the repo (which is probably a best practice anyway) and use the Bash command to run them. I can even run one-liner scripts with this task if I use the “-c” flag before the command, like “-c grep –BROKEN– results.txt | tee broken.txt” (a command I used just yesterday to pull results out of my broken link checker). Plus, you can still use the script task as normal, as long as there’s at least one Bash task in your job to force it to build on Linux.

The inverse is simple as well: I created a Powershell executable and use Powershell scripts for my Windows builds. Problem solved, plus I get the power of Powershell to use in my scripts.

Does anyone out there have any other cool tips? Let me know in the comments!