Dockerization Part 2: Deploying

Now that we have containers, we need to push them to our subprod environments so they can be tested. Bear with me, this is where things get a little complicated.

Docker Setup

Most people take the easy way out when they move to docker: they ship their containers to the cloud and let someone else manage the installation, upgrades, and maintenance on the docker hosts. We don’t do things the easy way around these parts, though, so we have our own server farm: a series of VMs in our datacenter. Everything below the VM is maintained by another team; my team is responsible for the software layer of the VM, and the containers that run on top. We have a handful of servers in our sub-prod environments, and then a handful more in our various production DMZs.

For management, most people seem to choose Kubernetes, but again, we don’t do things the easy way around here, so we went with a less popular product called Rancher. Now, Rancher is a management interface that can sit on top of a number of underlying technologies, including Kubernetes, but we chose to use their house-brand management system, called Cattle, instead. They were nice enough to give us a bunch of training in Docker, including the advice that forms the basis for their theme: if servers were pets, carefully maintained and fed over the years, containers should be like cattle, slaughtered and replaced as soon as they seem to be ill so they don’t infect the whole herd.

Rancher is a really great tool if you’re working in the GUI. It has the concept of an Environment (which we use to separate dev from QA from demo), which spans across one or more Hosts (the servers that run Docker and manage the containers). Inside the Environment are Stacks, which are a collection of related containers with a name. It also handles a lot of the networking between containers, as it comes with its own DNS for the internal container network so you can just resolve Stackname/ContainerName to find a given container in your Environment. You can upload a docker-compose.yml file to create a stack if you’re using Compose, and the extra metadata Rancher uses can be stored in a rancher-compose.yml that also can be uploaded when you make a stack.

Rancher running on my local machine, showing a project I have in progress


Manual deployment is super easy in Rancher: create a new stack, add services, paste in the container name from our build step, and let it handle everything else. Moving between environments manually once it works in dev is also easy: download the compose files, then upload them into the next environment. But we’re doing CI/CD, and the developers are constantly asking how they can speed up their release schedule. How do we do this automatically?

There are two tools that come with Rancher that can help here. One is the extensive API; pretty much everything you can do in the GUI can be done via the JSON-based REST API. The other is the pair of command-line tools they produce: rancher-compose and the Rancher CLI. Since I was also trying to release quickly, I used the API for my initial round of deployment scripts; in a later post, I’ll talk through how I’ve begun to convert to using the CLI commands instead, as I feel they’re faster and cleaner.

For Bamboo, I needed something that could run in a Deploy Project that would update the stack in a given environment. I decided to write a Node.js script, because when all I have is a node-shaped hammer, every build script becomes a nail 😉 (Actually, it was so our Node developers could read the script themselves). I didn’t do much special here, just your standard API integration using a promise-based architecture; however, this is a chunk of a bigger library I decided to write around Rancher, so you’ll see a lot of config options:

function findStack(stackName, environment) {
    return request({
        uri: `${opts.url}/v2-beta/projects/${opts.projectIDs[environment]}/stacks?name=${stackName}`,
        auth: {
            username: opts.auth[environment].key,
            password: opts.auth[environment].secret
        },
        json: true // Automatically parses the JSON response body
    });
}

function getContainerInfo(environment, stackName, containerName) {
    log(`Getting container info for ${containerName}`);
    return findStack(stackName, environment)
    .then((body) => request({
        method: 'GET',
        uri: `${opts.url}/v1/services/?environmentId=${[0].id}&name=${containerName}`,
        auth: {
            username: opts.auth[environment].key,
            password: opts.auth[environment].secret
        },
        json: true // Automatically parses the JSON response body
    }));
}

function performAction(serviceId, action, environment, launchConfig, stackName) {
    update(`Performing action ${action} on service ${serviceId}`, 'info', stackName);
    return request({
        method: 'POST',
        uri: `${opts.url}/v1/services/${serviceId}/?action=${action}`,
        body: {
            'inServiceStrategy': {
                'batchSize': 1,
                'intervalMillis': 2000,
                'startFirst': true,
                'launchConfig': launchConfig
            }
        },
        auth: {
            username: opts.auth[environment].key,
            password: opts.auth[environment].secret
        },
        json: true
    });
}

    performServiceUpgrade: function (stackName, containerName, environment, image) {
        update(`Upgrading ${containerName} in stack ${stackName} in ${environment} to image ${image}`, 'info', stackName, environment);
        return getContainerInfo(environment, stackName, containerName).then((body) => {
            if ( <= 0) {
                throw new Error(`Could not find service ${containerName} in stack ${stackName} in ${environment}`);
            }
            let serviceId =[0].id;
            let launchConfig =[0].launchConfig;
            launchConfig.imageUuid = image;

            return performAction(serviceId, 'upgrade', environment, launchConfig, stackName)
            .catch((err) => {
                if ((err.statusCode == 422 || err.status == 422) && opts.retries.on422) {
                    log('Detected invalid state. Rolling back to retry.', 'info', stackName);
                    return performAction(serviceId, 'rollback', environment, launchConfig, stackName)
                        .then(() => this.waitForActionComplete(stackName, containerName, environment, 'active', stackName))
                        .then(() => performAction(serviceId, 'upgrade', environment, launchConfig, stackName));
                } else {
                    log('Detected error condition. Aborting', 'error', stackName);
                    throw err;
                }
            })
            .then(() => this.waitForActionComplete(stackName, containerName, environment, 'upgraded', stackName))
            .then(() => serviceId);
        });
    },

I want to call out the two lines that grab the existing launchConfig and swap in the new image, however, because they are a little strange. Rancher lets you update anything about a service using the same endpoint, which is kind of nice and kind of rough: I need to specify every single attribute of the service, or it’ll assume I meant to blank out the setting (rather than assuming I meant to leave it unchanged). To make this easier, I captured the existing launch configuration, changed just the image, and sent the whole thing back.

Upgrading a service in Rancher is a two-step process: first, you upgrade, which launches a copy of the new container for every copy of the existing container, and then you “finish” the upgrade, which removes the old containers. This is so that if there’s a problem with the new container, you can issue a “rollback” action, which turns the old containers back on and removes the new ones — much faster than trying to pull a fresh copy of the old container back. However, this means sometimes you’ll be trying to upgrade while it’s in an “upgraded” state, waiting for you to finish or roll back. When that happens, Rancher issues a status code 422. My library optionally rolls back and issues the upgrade action again if it encounters this state.
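For completeness, the “finish” half of that dance goes through the same action endpoint. Here is a minimal sketch, assuming the request/opts/update helpers shown above and the v1 API’s finishupgrade action name (check your Rancher version for the exact spelling):

// Sketch: once the new containers look healthy, tell Rancher to finish the
// upgrade so the old containers are removed. Assumes the request/opts/update
// helpers shown above; 'finishupgrade' is the v1 API action name.
function finishUpgrade(serviceId, environment, stackName) {
    update(`Finishing upgrade on service ${serviceId}`, 'info', stackName);
    return request({
        method: 'POST',
        uri: `${opts.url}/v1/services/${serviceId}/?action=finishupgrade`,
        auth: {
            username: opts.auth[environment].key,
            password: opts.auth[environment].secret
        },
        json: true
    });
}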

The hardest part was figuring out when Rancher was done upgrading. Some of our images are huge, particularly the ones that contain monoliths we’re still in the process of breaking up; it can take several minutes for these containers to download and start up. Eventually, I settled on a polling-based strategy:

waitForActionComplete: function(stackName, containerName, environment, desiredState) {
    update('Waiting for upgrade to complete', 'info', stackName, environment);
    return new Promise((resolve, reject) => {
        //Wait for the service to finish upgrading
        let retries = opts.retries.actionComplete;
        function checkState() {
            getContainerInfo(environment, stackName, containerName).then((body) => {
                let container =[0];
                log('Current state: ' + container.state);

                //Check if upgrade is done
                if (container.state == desiredState) {
                    log('Action complete');
                    return resolve();
                } else {
                    if (retries < 0) {
                        return reject('Timed out waiting for action to complete');
                    }
                    retries--;
                    log(`${retries} left, running again`);
                    return setTimeout(checkState, 1000);
                }
            });
        }
        setTimeout(checkState, 500);
    });
},

This will keep running the checkState function until either the container reaches the desired state or it runs out of retries (configured in the config for the library). I’ve had to tune the number of retries several times; right now, for our production deploy, it’s something outrageous like 600.
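For reference, here is roughly the shape of the config object those snippets lean on. This is a sketch of the assumed configuration with placeholder values, not our real settings:

// Hypothetical config shape; the field names match how opts is used above.
module.exports = {
    url: 'https://rancher.internal:8080',        // Rancher server (placeholder)
    projectIDs: {                                // Rancher environment ("project") IDs
        dev: '1a5',
        qa: '1a7'
    },
    auth: {                                      // API keypair per environment
        dev: { key: 'ACCESS_KEY', secret: 'SECRET_KEY' },
        qa: { key: 'ACCESS_KEY', secret: 'SECRET_KEY' }
    },
    retries: {
        on422: true,             // roll back and retry when Rancher returns a 422
        actionComplete: 600      // polling attempts in waitForActionComplete
    }
};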

This library is called from a simple wrapper for Bamboo’s sub-prod deploys; for production, however, I got a lot trickier. Stay tuned for that write-up next week!

Dockerization Part 1: Building

I’ve been long overdue for a series of articles explaining how our current build system works. One of the major projects I was involved with before this recent reorg involved overhauling our manual build process into a shiny new CI/CD system that would take the code from commit to production in a regulated, automated fashion. As always, the reward for doing a good job is more work like that; when we decided to move to Docker to better support our new team structure, I ended up doing a lot of the foundational work on our new build-test-deliver pipeline. Part one of that pipeline is, of course, building and storing containers.

Your mission, if you choose to accept it

In the old world, before we dockerized our applications, we were following a fairly typical system (that I designed): our CI server runs tests against the code, then bundles it up as an archive file. After that, one environment at a time and on request, it would SCP the tarball down to the server, stop the running process, remove the old codebase, and unpack the new before starting the process again. There were configuration files that had to be saved off and moved back in afterward in a few cases, but we had all those edge cases ironed out. It was working, and there were almost no changes to it in the year before we launched docker.

As we were preparing to go live, I didn’t want to lose the build pipelines we had worked so hard on. And yet, docker containers are fundamentally different than tarballs of code files. Furthermore, our operators (who are responsible for putting code into production) complained of having too many buttons to click: often, our servers had 3-4 codebases on them, meaning 3-4 buttons to click to update one server. They definitely didn’t want to do one button per container. On the other hand, our developers were clear on what they wanted: more deploys, faster deploys, and breaking out their monoliths into modules and microservices so they could go even faster. How to balance these concerns?

Another wrinkle emerged as well once I got my hands on our environment: we chose Rancher as our docker management tool of choice. Rancher is a great little tool, and I enjoy working with its GUI, but when most companies seem to be standardizing on Kubernetes, it was hard to find good examples and tutorials for how to work with Rancher instead.

With all those pressures bearing down on me, my task was straightforward, but far from simple.

How to build a container in 30 days

The promise of containers seemed like it resolved a lot of our headaches overall: developers control the interior of the container, and Platform Ops controls the outside of it. In this brave new world, I don’t have to care what goes in a container, but it’s my job to ensure they get to where they’re going every time without fail. In practice, however, I found I need to understand quite a bit about containers themselves.

For the purposes of this article, you don’t need to know or care about the virtualization layer; just trust that a container is isolated from everything around it, until and unless you drill holes in it (which we do. A lot. But I understand that’s common). You will need to know a little about how they’re built, however.

Picture a repository of source code. At some point, to dockerize the application contained within, you need a Dockerfile: a file of instructions on how to build this container. Almost every container begins with an instruction to extend from another image, much like classes extending from a base class. This was really handy for us, since it means we can put anything we need into a custom base image and all the developers will have it pre-installed.

From there, there’s a series of customizations to the container. Generally, one step involves copying the code into the container, and another tells the container what executable to run when it starts. For Node.js, we ask our developers to put their code in a standard location, then execute “npm start” when the container boots up, letting them define what that means for their application.

Once you’re happy with what the container contains, it’s time to seal it up and ship it. In this case, that means two commands: a “tag” command, which gives it a name more interesting than the default (which will be something like 2b9c0185251d), and a “push” command, which uploads the docker container to a remote repository. If the container is intended to live in a central repository, it has to be tagged with that repository as part of the name (including a port number, which usually defaults to 5000 for a Docker registry unless you put an Nginx in front to make it 80): something like “artifactory.internal:5000/dt-node-base”. Appended to that is a version: this can be a sequential number, or a word or anything else. By convention, each container is tagged twice: once with a sequential number, and once with the word “latest”. That makes it so you can always pull down the very latest node base container from our Artifactory repository by asking it for “artifactory.internal:5000/dt-node-base:latest”.

The system

So we have a number of parts to this build system that the CI/CD server has to integrate with. The first piece is to begin with raw source code, including a Dockerfile; we had been using Subversion, but the developers had been asking for Git for so long we finally broke down and bought a Bitbucket server and let them migrate.

The next piece is to build the containers with Docker. Since we were using Bamboo as our CI/CD server, I installed Docker on all the remote agents; this required an OS upgrade for them to Red Hat 7, but I was able to script the install using Ansible to make doing it across our whole system less painful.

The next piece is somewhere to store the containers when we’re done with them. As you can guess by the previous example, we decided to use Artifactory for this; this is mostly because, as the developers moved to Node, they were asking for a private NPM server, and Artifactory is able to do double duty and hold both types of artifacts.

For the communication between them, my coworker put together a script we could put on each build server that the plans could use to ensure they didn’t miss any steps. It’s straightforward, looking something like this:

#!/bin/sh -e
# $1 Project Name (dt-nodejs)

docker build -t artifactory.internal:5000/$1:$bamboo_buildNumber \
 -t artifactory.internal:5000/$1:latest .

docker push artifactory.internal:5000/$1:$bamboo_buildNumber
docker push artifactory.internal:5000/$1:latest
echo "$1:$bamboo_buildNumber and $1:latest pushed to Artifactory on artifactory.internal:5000"

This means that every build tags the container with the number of the build, giving us an easy source of sequential numbers for the containers without thinking about it. It does mean, however, that building a new pipeline for an existing container name will start the numbering over from 1 and overwrite old containers, but we encourage developers to edit their build plans instead of starting over where possible. If you have any ideas on how to prevent that, I’d love to hear them.

(I’ve actually enhanced this script since, but I’ll talk about that in a future entry)


How to force Bamboo to build on Linux

So let’s talk about build servers for a minute. I manage the company’s Bamboo server, which we use to do builds and continuous integration. I don’t know if this is an unusual use case or what, but some of my builds require Windows and others perform best on Linux. So we have Windows agents and Linux agents.

Some things you would think are intuitive are not. For example, there’s no way to differentiate in a Script Task between CMD and Bash. How many scripts are actually cross-compatible between the two? Not many, in my experience. Often, I’d write out script tasks for Bash and they’d get farmed out to a Windows server by mistake and fail to, say, create a tar archive or wget a resource. So how can I force those to execute on Windows?

The solution I hit upon is pretty simple: I created a new executable definition called Bash, located at /bin/bash. This will auto-detect on new Linux agents, but not on Windows agents. Then I can put my scripts into the repo (which is probably a best practice anyway) and use the Bash command to run them. I can even run one-liner scripts with this task if I use the “-c” flag before the command, like “-c grep --BROKEN-- results.txt | tee broken.txt” (a command I used just yesterday to pull results out of my broken link checker). Plus, you can still use the script task as normal, as long as there’s at least one Bash task in your job to force it to build on Linux.

The inverse is simple as well: I created a Powershell executable and use Powershell scripts for my Windows builds. Problem solved, plus I get the power of Powershell to use in my scripts.

Does anyone out there have any other cool tips? Let me know in the comments!

Teatime: Continuous Integration

Welcome back to Teatime! This is a weekly feature in which we sip tea and discuss some topic related to quality. Feel free to bring your tea and join in with questions in the comments section.

Tea of the week: Oprah Chai. I expected this to be boring and gimmicky, but it was surprisingly bold, and a pleasant drink all-around. I tried it at a Starbucks before I bought some, which is a nice perk.

Today’s Topic: Continuous Integration

Today’s topic is continuous integration; much of it is adapted from a book called Continuous Delivery by Jez Humble and David Farley. When I gave this talk, I gave a disclaimer that the book aims to start with the worst possible practices and walk them up to the best possible practices. Since my company is far from the worst possible state, a lot of the items were things we were already doing. I’d be interested to hear in the comments what you already do or don’t do.

The Problem

Here are some of the major problems in the industry that Humble and Farley saw when they sat down to write the book in 2011:

  • Delivering software is hard! Release day arrives; everyone’s tense, nervous. Nobody’s quite sure the product will even work, and nobody wants their part to be the bit that fails, putting them on the chopping block. We’ve got one shot at this launch, and if we botch it, we cost the company millions as customers go offline. Expect 3am phone calls from Operations — did development forget to write down a step in the process? Did someone forget to put a file in the build folder? Is all the SQL going to run first try? What if some of the data we thought was in prod has changed since we started developing? Are we really, really sure?
  • Manual Testing Sucks, Period. It takes forever to manually test a product, and nobody’s ever quite sure we covered everything. How many bugs do you get where you’re asking yourself, “Didn’t they test this? How did this ever work?” It’s so obvious in hindsight when these things come in, but it’s customers finding them. And it takes weeks, months maybe, to run a full test cycle on a brand new app if it does anything more than CRUD. Oh, and performance testing? Manually? Eh, it seems fast enough, release it. Security testing? Who has time for this crap?
  • “It worked in dev” syndrome. What are the differences between dev and prod? Can you name them all off the top of your head? When do you test in a production-like environment, and what are the differences between production-like and production? Who tested in dev? What did they test? Are you sure you understand how users will interact with your system? How many times do you get bugs where you ask yourself “Why did they even click that?!”
  • No way to test deployment. The only truly prod-like servers are prod; the only process is “a person does a thing”. You can’t test people, and there’s always going to be turnover. How do you know they did it right? How can you audit their process, or improve on it? People aren’t exactly reliable, that’s why we invented machines 😉

The Principles

So here’s what they came up with as guidelines to try and correct the system. These are necessary to pull yourself out of process hell and start building toward Continuous Integration:

  • Every Commit is a Release candidate. Every single one could potentially be released. If it adds value, and doesn’t break anything else, it’s ready to release. Whether it’s actually released is going to be up to the BA and/or PM, of course, but you don’t want to commit anything you know is broken, you’ll just waste everyone’s time. If you want the safety blanket of committing early and often, make a feature branch; when you merge that back in, it’s a release candidate.
  • Repeatable, Reliable Release Process. Once you have that commit, you want a standardized process, on paper, that can be repeated with every release, no exceptions. If there ARE exceptions, you document those too, so they’re not exceptions anymore; things like rolling back a failed deployment should be a standard, repeatable process as well. We had one week where we accidentally re-promoted a broken release because I forgot to pull it out of the QA branch after it failed in production the week before. Needless to say, after I made a round of apologies, I documented that step as well!
  • Automate all the things! Automate everything. The authors have never seen a release process that can’t be automated with sufficient work and ingenuity. After I gave this talk the first time, I embarked on a 6-month project to do just that, simplifying our convoluted multiple-branch SVN strategy into a flatter tree, and automating the deployment from Trunk. It took ages and it was painful to implement, but the new system is much more reliable, faster, and generally nicer to use.
  • Keep everything in source control. The goal is to allow a new team member to come in, sit down at a brand new workstation, run a checkout, run a build script, and have a working environment. Yes, that includes the database. Yes, that includes the version of Coldfusion or Node or whatnot. Yes, that includes the Apache or Nginx configuration. It should be possible to see at a glance what version of the application and dependencies are on the servers. Node’s package.json is a great step toward that ideal.
  • If it hurts, do it more often. Example: Merging sucks. Merging more often, in smaller chunks, is easier than delaying until the end of the project to merge and resolve conflicts. Another example: releasing sucks. So instead of releasing huge products once a quarter, release them once a month, or once a week, or once a day, or once an hour…
  • Build Quality In. This idea comes from Lean: the earlier you find a bug, the cheaper it is to fix. We QA folks tend to know that, and our mantra becomes Test Early, Test Often. If you find a bug in your code before you commit it, that’s maybe ten minutes time to fix it, max. If you find it in QA, now the person who found the bug has to write a ticket, the PM has to triage it, you have to read it and understand it, maybe some more clarification back and forth, then you have to hunt through the code to find the problem, and then you fix it. So now we’re looking at hours of time rather than minutes. And if the problem is found in production and we have to run through a whole release cycle? Plus the customers’ time lost trying to work around the bug? A disaster. This is where unit testing and integration testing is super important.
  • Done means released. In Waterfall, “done” means “built, ready for testing”. In Agile, “done” means “ready to be released”, which means the developers don’t stop caring about something until it passes testing. DevOps goes one step beyond that: “done” means “released to production”. After all, what good is something that is beautifully crafted and passed all tests if the customer can’t use it yet? This ties into the next principle:
  • Everyone is responsible for delivery. In the Waterfall way, the developer builds a thing, tosses it over the wall to QA, and walks away, expecting other people to be responsible for getting it into prod. In the DevOps world, we’re all on the same team together: it doesn’t matter whose fault it is or what went wrong, everyone’s responsible for helping get the code safely into production. The developer should be on hand to chime in with his intimate knowledge of the code while the operations folks are trying to get things running.
  • Continuous Improvement. This is my favorite principle 🙂 The general flow of work should be: Plan, Do, Study, Act. Routinely, we should get together to ask “how could we do this better next time?”. We should take controlled risks in order to improve our craft.

The Practices

In order to support the above principles, the following practices need to be in place:

  • Use CI Software. We use Atlassian’s Bamboo to build and deploy changes. It makes 0 sense to have people do this manually; people are good at creative tasks, while computers are good at repetitive, boring tasks.
  • Don’t break the build. Run the tests before you commit; don’t commit if something’s broken. An intern once asked me, “404s aren’t real errors, right?”. He was so used to popping open the console and seeing a dozen 404 errors that he didn’t notice the one that mattered. We can’t just have errors sitting around in production that we ignore, or we train ourselves to ignore real errors too.
  • Don’t move on until commit tests pass. The CI server should be fast enough to give you feedback before you move on to another task; you should wait until you’re sure your commit is good before changing gears, so that if there’s something broken, you still have all the information you need loaded into your metaphorical RAM and don’t have to metaphorically swap pages to get to it.
  • You break it, you fix it. Take responsibility for your changes! If your commit breaks something else, it’s not the other author’s problem, it’s your problem, because you made the change. Pointing fingers is a bad habit to get into.
  • Fail fast. The authors suggest failing the build for slow tests. I agree, sheepishly; my functional tests are slow as heck, but I’m always trying to tighten the feedback loop and get developers information as rapidly as possible. They also suggest failing for linting issues, because they can lead to weird bugs later on. They suggest failing for architectural breaches, things like inline SQL when you have a stored-proc architecture, or other problems like that. The more you fail in dev, the less you fail in Prod.
  • Deploy to prod-like environments. You should be deploying to environments that mimic production before you get out of the testing cycle, to make sure it’s going to be a clean deploy. More importantly, what you deploy to that environment should be exactly, byte for byte, what you deploy to prod. With the new release process I’ve set up, that’s exactly what we do: we physically move the exact files, no building on the server anymore. Staging should be the exact same hardware, the same load balancing, the same OS configuration, the same application stack, with data in a known good state.


I know that was a lot of dense information this week, but hopefully it gave you a nice clear picture of the goal state you can work toward. Was it useful? Let me know in the comments!

Teatime: Deployment Pipelines

Welcome back to Teatime! This is a weekly feature in which we sip tea and discuss some topic related to quality. Feel free to bring your tea and join in with questions in the comments section.

Tea of the week: An old standby, Twinings Ceylon Orange Pekoe. There’s no orange flavor in it; the Orange refers to the size of the leaves. It’s a good staple tea I can find in my local supermarkets, solid and dependable — just like a deployment pipeline should be.

Deployment Pipelines

Today’s topic is a little more afield from last week’s discussion of testing types, but I feel it firmly falls under the umbrella of quality. A good deployment pipeline, as you will see shortly, improves the maintainability of code and prevents unwanted regressions.

Like last week, much of this talk touches on concepts laid out in Continuous Delivery by Jez Humble and David Farley. If your company isn’t already performing continuous delivery, I highly recommend the book, as it talks through the benefits and how to get there in small increments. In the book, they lay out a simple goal:

“Our goal as software professionals is to deliver useful, working software to users as quickly as possible”

Note that they said “software professionals”, not developers. After all, isn’t that the ultimate goal of SQA as well? And of the BAs and project managers?

Feedback Loops

In order to achieve the goal — delivering software that is both useful and working — Humble and Farley suggest that there needs to be a tight feedback loop of information about how well the software works and how useful it is to the end user, delivered back to the development team so they can adjust their course in response. In order to validate traditional software, one typically has to build it first; they advocate building the software after every change so that the build is always up to date and ready for validation. Automate this process, including the delivery of build results to the development team, and you have created a feedback loop — specifically, the first step in a deployment pipeline.

Automated Deployment

In order to validate software that builds correctly, it must be installed, either on an end-user-like testing machine or to a web server that will then serve up the content (depending on the type of software). This, too, can be automated — and now you’ve gained benefits for the development team (who get feedback right away when they make a change that breaks the installation) as well as the testing team (who always have a fresh build ready to test). Furthermore, your infrastructure and/or operations teams have benefits now as well; when they need to spin up a new instance for testing or for a developer to use, they now can deploy to it using the same automated script.

Automated deployment is a must for delivering working software. The first deploy is always the most painful; when it’s done by hand at 2am in production, you’ve already lost the war for quality. Not only should your deploys be automated, they should be against production-like systems, ideally created automatically as well (humans make mistakes, after all).

Continuous Testing

And now we see how this pipeline connects to QA’s more traditional role: testing. Once we have the basic structure in place, typically using a CI Server to automatically build on every commit, we can start adding automatic quality checks into the process to give development feedback on the quality of the code they’ve committed. This can include static checks like linting (automated maintainability checking) as well as simple dynamic tests like unit tests or performance tests. Ideally, however, you want to keep your feedback loop tight; don’t run an eight-hour automated regression suite on every commit. The key is to get information back to the developer before they get bored and wander off to get coffee 🙂

Essential Practices

In order to make this really work for your organization, there are a number of practices that must be upheld, according to the authors of Continuous Delivery. These are basic maintenance sort of things, required for code to keep the level of quality it has over time. They are:

  • Commit early, commit often. Uncommitted code can’t be built, and thus, can’t be analysed.
  • Don’t commit broken code. Developers love to “code first, test later”, and, if they’re not used to this principle, tend to commit code with broken unit tests, intending to go back and clean it up “later”. Over time, the broken windows of old failing tests inoculate people against the warning tests can give. They become complacent; “oh, that always fails, pay it no mind”, they say, and then you might as well not have tests at all.
  • Wait for feedback before moving on. If your brain’s on the next task already, you’ll file away a broken unit test under the “I’ll fix it later” category, and then the above will happen. Especially, never go home on a broken build!
  • Never comment out failing tests. Why are they failing? What needs to be fixed? Commenting them out means removing all their value.


Do any of you use continuous testing and/or a deployment pipeline? Maybe with software like Jenkins, Travis CI, or Bamboo? Let’s chat in the comments!

Quick tip: Passing parameters from Jenkins to Maven

I saw bits and pieces of information all over the internet about parameters and properties and command-line arguments, but what I was looking for I didn’t find: a simple, straightforward explanation of how to use a Parameterized Build in Jenkins to pass arguments through to the jUnit tests that run the functional tests I’ve built on WebDriver. So: here it is!

Step 1: Command-line via Maven to jUnit

Use System.getProperty to read system properties in Java:

        remoteHost = System.getProperty("remoteHost");
    	if (remoteHost == null) remoteHost = "http://localhost:4444/wd/hub";
    	browserName = System.getProperty("browserName");
    	if (browserName == null) browserName = "Internet Explorer";

And use -D to pass them with Maven:

mvn test -DbrowserName=Firefox



Step 2: Jenkins to Maven

First, make the build a parameterized build:



Then adjust the maven build step as shown:




(Quotes are important here because of Internet Explorer)




CI with Jenkins for Javascript: Part 3: Scheduling and reporting

In Part One, we set up a Jenkins server and some unit testing. In Part Two, we added some static analysis tools to our build. But we’re still manually running all this, even if it’s all tied together now. Let’s talk about some of the features Jenkins brings to the table.

Building automatically

Our code release pipeline is going through some revisions to make better use of branching, so I have the good fortune of being able to detail for you two different build strategies for two different types of branching strategies. Today I will detail our old style, and in a future post, I will detail the updates we did to make a more branch-heavy system work.

Our original strategy involved a branch for each codebase representing our demo environment; to promote a project to demo, the code would be merged into the demo branch using Subversion. This is the easier strategy to set up, because you always know where in the repository to point Jenkins.

The first change we made was to symlink the location our repo was checked out by Jenkins to a network share on a demo server. This allows Jenkins to check out the code directly to the server, where it can then run the unit tests. That was the simplest way for us to get the code onto our servers, but there are many ways you can go about this step, including using FTP or SSH to update the server; if you have many servers you want Jenkins to update, that’s probably the best way to do it. We used a symlink because it plays nicely with Jenkins’ preferred build pipeline: First it checks out the code, then it runs the tests, then it would deploy to other servers. Our code does not need to be compiled before being deployed, and Jenkins was not running on a machine configured to run as a Coldfusion server, so by checking out the code directly onto a server, we had it up and running as fast as possible.

Once you’ve figured out your deployment strategy, you’re ready to trigger Jenkins to automatically build based on code promotion. There are two strategies to accomplish this task: polling and a post-commit hook. Polling is the easiest to set up; there’s literally a checkbox under “Build Triggers” called “Poll SCM”. This allows you to set up a poll strategy using a similar syntax to the one used to configure cron jobs; for example, to poll every fifteen minutes, you use the string “H/15 * * * *”. This can be configured without ever leaving Jenkins, and it will only build when there are new changes.

Post-commit hooks require some work in Subversion. With this strategy, you configure Subversion to activate Jenkins whenever a commit is pushed. I didn’t do this myself, but there are some details in the Subversion plugin notes about how you might set this up. Honestly, the more I read about it, the less interesting it looked. Polling every ten minutes or so would achieve the same level of detail for my organization; remember, I’m talking about major code promotions to demo that happen probably no more often than once a day.


Information Radiators

So, you have your Jenkins server pointed to your repo. It’s polling every fifteen minutes, and it reports out on the unit tests, linting results, and code complexity. You’re feeling pretty proud of yourself: this is a nice spiffy setup, capable of giving a good sense of the long-term health of the project.

Too bad nobody looks at it.

Oh sure, you can give them the dashboard link. Maybe one or two of them will poke at it every week or so. For a while. Until they get bored and wander off. IT people are humans too, and humans are notoriously averse to reading anything or seeking out information on their own. How can you make the information more in-their-face?

One answer is to present the information in a pretty easily-understood graph or chart and display that on a monitor in the hallway. As people walk past it, the information is thrust into their face, and they tend to stop and take a look at it. The nicer the visualization, the more likely people will stop to look at it and accidentally ingest the information you’re trying to get across 🙂

Jenkins has a lovely API for retrieving information about a build: on any page, add “/api” to the end. If you just add /api, it gives you a description of the formats you can retrieve the api information in; to get the JSON data, you add /api/json to any page. For human-readability, add “?pretty=true”. You can also get the data in xml format using the same method.
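As a quick illustration, pulling the latest test results for a job is one call once you know the URL pattern; the host and job name here are placeholders, and the jsonp=? trick sidesteps same-origin restrictions when the dashboard lives on a different server:

// Fetch the test counts for a job's last completed build from the Jenkins API.
$.getJSON('http://jenkins.internal:8080/job/my-project/lastCompletedBuild/testReport/api/json?jsonp=?')
    .done(function (report) {
        console.log(report.passCount + ' passed, ' +
                    report.failCount + ' failed, ' +
                    report.skipCount + ' skipped');
    });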

With that in mind, I wrote a quick javascript app that polls Jenkins for data about unit tests using Backbone to abstract away all the details. The model is something like:

var TestResult = Backbone.Model.extend({
    baseURL: "",
    build: "lastCompletedBuild",
    url: function() {
        // holds the Jenkins job name; build defaults to the last completed build
        return this.baseURL + "/job/" + + "/" + + "/testReport/api/json?jsonp=?";
    }
});

And the view something like:

var PlatformView = Backbone.View.extend({
    initialize: function(options) {
        this.options = options;
        this.model.on("change", this.render, this);
        this.model.on("error", this.renderErrorState, this);
    },
    render: function() {
        var tpl = Handlebars.compile($("#platformTemplate").html());
        var data = this.model.toJSON();
        var ts = new Date(this.options.timestamp);
        data.timestamp = ts.getMonth() + 1 + "-" + ts.getDate() + "-" + ts.getFullYear() + " " + ts.getHours() + ":" + (ts.getMinutes() < 10 ? "0" : "") + ts.getMinutes();
        var html = tpl(data);
        this.$el.html(html);

        var model_id = this.model.get("id");
        var chartdata = [
                {label: "Pass", value: this.model.get("passCount")},
                {label: "Fail", value: this.model.get("failCount")},
                {label: "Skip", value: this.model.get("skipCount")}
        ];
        testResultsPieChart.drawTestResultsGraph(chartdata, "#" + model_id + "-chart");
        return this;
    },
    renderErrorState: function() {
        var tpl = Handlebars.compile($("#errorTemplate").html());
        var data = this.model.toJSON();
        var html = tpl(data);
        this.$el.html(html);

        var model_id = this.model.get("id");
        var chartdata = [
                {label: "Pass", value: 0},
                {label: "Fail", value: 1},
                {label: "Skip", value: 0}
        ];
        testResultsPieChart.drawTestResultsGraph(chartdata, "#" + model_id + "-chart");
    }
});

Where testResultsPieChart uses the d3 library to convert the data into a pie chart. I tossed all this into a basic Bootstrap page, because I’m not much of a designer 🙂 The result ends up looking like:

Note that one project had managed to break their test runner while I was taking this screenshot. You’ll see that result if qUnit never finishes executing.
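For anyone building something similar, the glue between the models and views looked roughly like the sketch below; the Jenkins host, job names, element IDs, and refresh interval are all placeholders rather than my exact dashboard code:

// One model/view pair per Jenkins job, refreshed on a timer.
var jobs = ['project-one', 'project-two'];            // Jenkins job names (placeholders)

jobs.forEach(function (jobName) {
    var model = new TestResult({ id: jobName });
    model.baseURL = 'http://jenkins.internal:8080';   // Jenkins host (placeholder) = jobName;                             // used to build the /job/<name>/ URL

    var view = new PlatformView({
        model: model,
        el: '#' + jobName,                            // one container div per job
        timestamp:

    // Poll Jenkins; the model's "change" event triggers a re-render.
    var refresh = function () {
        view.options.timestamp =;
        model.fetch();
    };
    refresh();
    setInterval(refresh, 5 * 60 * 1000);              // every five minutes (assumption)
});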


And that’s where I’d gotten when someone told me we were changing branching strategy to remove the idea of a single demo branch 😀 Moving goalposts keeps life interesting.

CI with Jenkins for Javascript: Part 2: Static Analysis

Part one

So. We’re up, we’re unit testing, we’re publishing results. But unit testing is only as good as the tests themselves, and that depends heavily on the programmers’ ability to write good tests. Maybe we want more than that. Maybe we want a metric that isn’t essentially self-reported. Maybe we want static analysis.

What is Static Analysis

Static Analysis¬†is a category of testing techniques that covers any metric of code that can be collected without executing the code. These techniques can be used to measure code against an agreed-upon standard without needing anything more from a developer than the code they’ve already written. This is a great way to touch in on the quality of the code when your developers are already over-worked, stressed, and up against the wall; they can put off writing unit tests until “later”, but they can’t prevent you from looking at the code and evaluating it. Obviously, you don’t want to just spring new metrics on people, but once a standard is in place, holding people to it shouldn’t be unreasonable.


Linting with ESLint

The most common static analysis tool you’ll hear JS developers talking about is¬†linting. If you already know about this, feel free to skip down to the practical section, but if you’re picturing the fuzzy stuff that comes out of your dryer after you wash a load of blankets, allow me to dispel the confusion a little. The basic metaphor comes from using a lint roller to clean little bits of lint (or cat hair) off a sweater so you look neater and more presentable. Linting, therefore, is the act of cleaning up little stylistic issues to make the overall code look neat and tidy.

The most widely known linter is JSLint; I believe that’s where the name came from in the first place. You can test out how JSLint works at their website. Notice the checkboxes below the input box; JSLint is configurable, but not overly so. It was designed to enforce the Crockford Conventions, which some JS developers hold to be the best possible standard for Javascript code style. However, as with all things in JS land, the “standard” is hotly debated, and in many places rejected entirely. Therefore, for linting, I prefer a tool called ESLint. Every single rule in ESLint is configurable; at the minimum, this means it has three levels of enforcement: Do not enforce, Warn, or Error. Many rules also have configurable options, such as whether spaces should be before a comma, after a comma, both, or neither.
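To make that concrete, an ESLint config is just a map of rule names to an enforcement level (0 = off, 1 = warn, 2 = error), with options on the rules that take them. Here is an illustrative sketch, not our actual standard:

// Illustrative ESLint config, not our real ruleset.
module.exports = {
    env: { browser: true },
    rules: {
        'semi': 2,                                             // error on missing semicolons
        'no-unused-vars': 1,                                   // warn only
        'eqeqeq': 0,                                           // don't enforce
        'comma-spacing': [2, { before: false, after: true }]   // no space before a comma, one after
    }
};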

So let’s say you’ve got ESLint, talked with your team, and come up with a configuration file that enforces your standards. We can fairly simply add that into our gruntfile for Jenkins to execute, using a package like grunt-eslint. However, we now have a problem. Unlike grunt-qunit-junit, grunt-eslint does NOT allow for writing to a file. We’d have to pipe the output, and that includes any output from grunt itself, which might make our file no longer conform to the desired output format without more massaging. So I prefer to install eslint as a standalone console application, as detailed here.

Now our buildfile has two items:



That command line breaks down as follows:

  • eslint calls the linter
  • -c eslint.conf points it to our custom configuration file
  • -f checkstyle outputs the results in checkstyle format. This can be other formats like jslint, junit, or tap, but I found the checkstyle plugin to be to my liking.
  • file paths indicate what files should be linted. Here I’m only linting the models and views for my project
  • > lintresults.xml is the Linux way to redirect the output into a file. This can be any file.
  • || echo is, as with last time, a way to ensure that the build does not fail when linting fails. Again, the reporting plugin will take care of marking the build as unstable when the linting fails. Without this, linting errors will prevent Jenkins from moving on to the unit tests.

We can then use the Checkstyle plugin (or any other plugin that can process Checkstyle reports) to display the results:



And voila!



Complexity with Plato

Another static analysis that can be useful to shed some light on code quality is¬†complexity analysis. Now, if you’re planning to write angry comments, please keep in mind that all of these metrics measure one aspect of quality, and I don’t believe any of them are infallible be-all end-all measures. But complexity can tell you a little about what parts of your application are going to be more troublesome to maintain.

The most common metric for complexity is cyclomatic complexity. This is a rough measure of code complexity created in 1976 by Thomas McCabe, defined as the count of the number of linearly independent paths through the source code. Basically, this tells you how much branching, looping, and nesting is present in a piece of code. Lower is easier to understand and maintain, but obviously, code with a complexity of 1 doesn’t do very much that’s interesting at all; it’s 100% deterministic, and will always do exactly the same thing, with no change in behavior based on inputs. Your basic “Hello World” program has a cyclomatic complexity of 1; FizzBuzz tends to be around 6 or so.
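As a tiny example, the function below has a cyclomatic complexity of 3: the straight-through path, plus one for the if and one for the loop:

// Cyclomatic complexity 3: base path + the if + the for loop.
function describeScores(scores) {
    if (scores.length === 0) {
        return 'no scores yet';
    }
    var total = 0;
    for (var i = 0; i < scores.length; i++) {
        total += scores[i];
    }
    return 'average: ' + (total / scores.length);
}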

Another metric of complexity is Halstead Complexity. This is a more robust set of measures proposed in 1977 by Maurice Halstead. These are calculated by counting the number of distinct operators, total number of operators, number of operands, and other such analysis to produce a slew of numbers. One such number is the difficulty index, which is half the number of distinct operators times the total number of operands divided by the number of distinct operands. In theory, this measures how difficult code is to maintain over time.
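In code form, the difficulty index described above works out to the following (a direct transcription of the formula as stated, not any particular tool’s implementation):

// Halstead difficulty: (distinct operators / 2) * (total operands / distinct operands)
function halsteadDifficulty(distinctOperators, totalOperands, distinctOperands) {
    return (distinctOperators / 2) * (totalOperands / distinctOperands);
}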

As both of these metrics are strongly correlated with lines of code, the Maintainability Index seeks to relate them to each other and to the LOC to get an overall quick-and-dirty number to represent how difficult code is to maintain. This index was created in 1991 by Paul Oman and Jack Hagemeister, and it ranges from negative infinity to a “perfect” score of 171, achieved only by an empty file with 0 lines of code. They proposed that code scoring above about 65 should be considered easy to maintain.
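The commonly cited form of the index (there are a few variants in the wild, so treat this as illustrative) combines Halstead volume, cyclomatic complexity, and lines of code like so:

// Maintainability Index, commonly cited form:
// 171 - 5.2*ln(Halstead volume) - 0.23*(cyclomatic complexity) - 16.2*ln(LOC)
function maintainabilityIndex(halsteadVolume, cyclomaticComplexity, linesOfCode) {
    return 171
        - 5.2 * Math.log(halsteadVolume)
        - 0.23 * cyclomaticComplexity
        - 16.2 * Math.log(linesOfCode);
}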

These metrics are all measured with a tool called JSComplexity, a tool written by Phil Booth to easily measure the complexity of javascript code. The command-line version of this tool is complexity-report, and there’s a nicely formatted HTML reporter using that tool called Plato. From that we have a Grunt wrapper called grunt-plato, which we can use to generate an HTML report that can be included in Jenkins automatically. Still with me? 🙂

The grunt setup is pretty straightforward, as before. We can add it to our existing file with a few lines:

module.exports = function(grunt) {
  // Project configuration.
  grunt.initConfig({
    plato: {
      complexity: {
        options: {
          jshint: false
        },
        files: {
          'reports': ['../src/source/model/*/*.js', '../src/source/ui/*/*.js']
        }
      }
    }
  });

  // These plugins provide necessary tasks.
  grunt.loadNpmTasks('grunt-plato');

  // Default task.
  grunt.registerTask('default', ['plato','qunit_junit','qunit']);
};

I’ve turned off JSHint because I’m using ESLint above. If you like JSHint, you can leave it included and skip the whole section above about checkstyle.

We already have the grunt file being run by jenkins, so we just add the report like so:

And voila! It’ll appear on the left as a link:


Which takes you right to the report.


There’s a lot more in the world of static analysis that I’d love to be able to show you. There are tools to generate dependency analyses, tools to find common bugs, tools for finding duplicated or dead code, tools to find potential security holes… but unfortunately, the tooling for javascript is rather limited. Compiled languages are always easier to analyse than interpreted ones, and strongly typed languages are easier to analyse than weakly typed ones. Frankly, though, with all the wonders javascript developers are able to produce, I have to wonder: does the community really care about quality? Maybe the tooling is limited because beyond linting and maybe some complexity, javascript developers aren’t interested in writing these kinds of tools.

Maybe I’ll write some myself.

CI with Jenkins for Javascript: Part 1: Unit Testing

In a lot of ways, the Javascript world feels like it’s trapped in the year 2k: the dot com bubble is swelling huge, and nobody has time for best practices, it’s time to reinvent everything and strike it rich. As an SQA professional, it’s immensely frustrating to outline a technique and be told “Javascript doesn’t do that.” (That’s one of three answers that ought to be banned from a webdev’s vocabulary; the other two are “I think jQuery does that” and “Maybe with Node?” Protip: using the latest shiny library is no substitute for using your brain. But I digress.)

Anyway, so let’s set the scene: A young, frazzled SQA professional, trying to get a sandbox install of Jenkins full of shiny things to prove to the Directors that really, we do need more tools, we’re not just being lazy. Jenkins’ install was a tale for another blog, but it’s up and running. Now what?


Unit testing with QUnit

The first thing, the key component for any sort of continuous-testing exercise (nevermind the integration part for now, this is only a demo) is to automate the unit testing. In my case, we used qUnit for our tests, which is pretty standard. Or was it? Since we don’t do TDD and we’re backporting testing into legacy apps that weren’t built with testability in mind, I ended up putting Coldfusion to work for us. I created a series of drop-down menus, customized for each platform we were testing, that would let you drill down to a specific component to test (model, view, library element, et cetera). It would then read a json file to find any dependencies that were required (yes, the developers were given a lecture about minimizing dependencies. No, that doesn’t mean our legacy apps would detangle themselves magically overnight. Yes, they really insist on testing views with the real Handlebars templates stored in separate files. Sure, why not.) and include them on the page, then the item under test, then the test code.

How do I make jenkins do this? The qUnit “way” seems to be to generate this page on the fly, but there ended up being quite a bit of logic implemented in Coldfusion I didn’t want to remake. And why should I? Add an “all” option to the dropdown and I had the exact result-set I wanted to include in Jenkins. What I really needed was a way to hit that page and retrieve the results.

Enter Grunt.

Grunt is an automation system made to run javascript tasks, particularly in a node environment. I found it works pretty well to treat it like ant or maven in your javascript stack: migrate the nitty-gritty logic to grunt, then execute the grunt script from jenkins. I didn’t see a plugin to do this, so I used the shell plugin to execute grunt. That gets the tests run and result files generated.

To get Grunt installed, you need Node (and the Node Package Manager). Once you have a working install of Node, you can install grunt with npm install -g grunt, which does a global install of grunt using the node package manager. Of course, this isn’t enough to get running. You then have to install the command-line interface for grunt, which is packaged separately (because nothing shiny can be simple): npm install -g grunt-cli.

The Tao of Node involves creating projects, much like you do with Java and Eclipse. We’re not actually building anything here, there’s no asset pipeline involved at this stage, but you still have to have a project. So we create one. The simplest way to create a node package is with npm init, which will create a file called “package.json”. You probably never need to edit this file directly, but you can if you want. It’s just a json file.

The Tao of Grunt then involves a file you’ll be editing heavily: “Gruntfile.js”. This is where the instructions for Grunt go. These instructions come in two flavors: a list of configuration options for the specific plugin you’re using, and a list of plugins to activate for a given build target. This is kind of like ant, but with JSON instead of XML. Your basic gruntfile looks like:

module.exports = function(grunt) {

  // Project configuration.
  grunt.initConfig({
    // JSON for config options here
  });

  // Load the plugins
  grunt.loadNpmTasks('grunt-some-plugin');

  // Default task(s).
  grunt.registerTask('default', ['sometask']);
};

For this use case, we want to use grunt-contrib-qunit to run my tests and snag the results. So we set that up first. Since I want to use an existing runner, I use the urls option to pass in the URL of the runner. My gruntfile then looks like:

module.exports = function(grunt) {

  // Project configuration.
  grunt.initConfig({
    qunit: {
      all: {
        options: {
          urls: [
            // URL of the Coldfusion-generated runner page goes here
          ]
        }
      }
    }
  });

  // Load the plugins
  grunt.loadNpmTasks('grunt-contrib-qunit');

  // Default task(s).
  grunt.registerTask('default', ['qunit']);
};


From the command line, I can then run sudo grunt from the folder with the gruntfile and bam, there’s my results. (Make sure there’s no authentication required to get to your test runner. That’s for the advanced class. Also, if you find someone teaching the advanced class, I’d love to sign up 🙂 ).
Of course, I can’t actually do that until I install grunt-contrib-qunit. Thankfully, these plugins are all available via npm. This is probably a good time to make an aside note about how npm works. See, we installed grunt globally, because we want it to be available to multiple jenkins projects, but you’re not supposed to do that often. The better way to install dependencies is without the -g flag and with the --save-dev flag. This will do two things:

  • Download the module to the project’s dependencies folder
  • Add the module to your package.json file

So in this case, we want to run npm install grunt-contrib-qunit --save-dev. Do that for any other plugin I discuss and you’ll be up and running in no time.
Now, that’s all fine and dandy, except that qunit prints the output in a human-readable format to the screen. We want the output in a jenkins-readable file instead. Jenkins can read xUnit, TAP, and HTML reports with the help of some readily-available plugins, so basically anything standard will do; luckily, there’s another grunt plugin that takes the output from grunt-contrib-qunit and massages it into JUnit format before saving to a file. It’s called grunt-qunit-junit, because naming conventions are weird.

module.exports = function(grunt) {

  // Project configuration.
  grunt.initConfig({
    qunit_junit: {
      options: {
        // defaults are fine; JUnit-format results land in test-reports/
      }
    },
    qunit: {
      all: {
        options: {
          urls: [
            // URL of the Coldfusion-generated runner page goes here
          ]
        }
      }
    }
  });

  // Load the plugins
  grunt.loadNpmTasks('grunt-qunit-junit');
  grunt.loadNpmTasks('grunt-contrib-qunit');

  // Default task(s).
  grunt.registerTask('default', ['qunit_junit','qunit']);
};


Note that qunit-junit wants to be loaded before qunit. This will write the output in junit format to a folder called test-reports. Make sure that’s writeable by jenkins!
Speaking of Jenkins…

The Jenkins build step that runs grunt via a shell task

The or (||) and echo output aren’t strictly necessary; what they do is allow Jenkins to continue building if the unit tests failed. Later, in the reporting step, a build will be marked “unstable” if the tests failed, but if you don’t have this you won’t be able to execute any later steps.

Speaking of reporting, here’s how I configured the junit reporter for jenkins:

The JUnit test result report configuration in Jenkins

And voila! Click “build” and you’ll see your results right away.