Teatime: Continuous Integration

Welcome back to Teatime! This is a weekly feature in which we sip tea and discuss some topic related to quality. Feel free to bring your tea and join in with questions in the comments section.

Tea of the week: Oprah Chai. I expected this to be boring and gimmicky, but it was surprisingly bold, and a pleasant drink all-around. I tried it at a Starbucks before I bought some, which is a nice perk.

Today’s Topic: Continuous Integration

Today’s topic is continuous integration; much of it is adapted from a book called Continuous Delivery by Jez Humble and David Farley. When I gave this talk, I gave a disclaimer that the book aims to start with the worst possible practices and walk them up to the best possible practices. Since my company is far from the worst possible state, a lot of the items were things we were already doing. I’d be interested to hear in the comments what you already do or don’t do.

The Problem

Here are some of the major problems in the industry that Humble and Farley saw when they wrote the book, which was published in 2010:

  • Delivering software is hard! Release day arrives; everyone’s tense, nervous. Nobody’s quite sure the product will even work, and nobody wants their part to be the bit that fails, putting them on the chopping block. We’ve got one shot at this launch, and if we botch it, we cost the company millions as customers go offline. Expect 3am phone calls from Operations — did development forget to write down a step in the process? Did someone forget to put a file in the build folder? Is all the SQL going to run first try? What if some of the data we thought was in prod has changed since we started developing? Are we really, really sure?
  • Manual Testing Sucks, Period. It takes forever to manually test a product, and nobody’s ever quite sure we covered everything. How many bugs do you get where you’re asking yourself, “Didn’t they test this? How did this ever work?” It’s so obvious in hindsight when these things come in, but it’s customers finding them. And it takes weeks, months maybe, to run a full test cycle on a brand new app if it does anything more than CRUD. Oh, and performance testing? Manually? Eh, it seems fast enough, release it. Security testing? Who has time for this crap?
  • “It worked in dev” syndrome. What are the differences between dev and prod? Can you name them all off the top of your head? When do you test in a production-like environment, and what are the differences between production-like and production? Who tested in dev? What did they test? Are you sure you understand how users will interact with your system? How many times do you get bugs where you ask yourself “Why did they even click that?!”
  • No way to test deployment. The only truly prod-like servers are prod; the only process is “a person does a thing”. You can’t test people, and there’s always going to be turnover. How do you know they did it right? How can you audit their process, or improve on it? People aren’t exactly reliable, that’s why we invented machines 😉

The Principles

So here’s what they came up with as guidelines to try and correct the system. These are necessary to pull yourself out of process hell and start building toward Continuous Integration:

  • Every Commit is a Release candidate. Every single one could potentially be released. If it adds value, and doesn’t break anything else, it’s ready to release. Whether it’s actually released is going to be up to the BA and/or PM, of course, but you don’t want to commit anything you know is broken, you’ll just waste everyone’s time. If you want the safety blanket of committing early and often, make a feature branch; when you merge that back in, it’s a release candidate.
  • Repeatable, Reliable Release Process. Once you have that commit, you want a standardized process, on paper, that can be repeated with every release, no exceptions. If there ARE exceptions, you document those too, so they’re not exceptions anymore; things like rolling back a failed deployment should be a standard, repeatable process as well. We had one week where we accidentally re-promoted a broken release because I forgot to pull it out of the QA branch after it failed in production the week before. Needless to say, after I made a round of apologies, I documented that step as well!
  • Automate all the things! Automate everything. The authors have never seen a release process that can’t be automated with sufficient work and ingenuity. After I gave this talk the first time, I embarked on a 6-month project to do just that, simplifying our convoluted multiple-branch SVN strategy into a flatter tree, and automating the deployment from Trunk. It took ages and it was painful to implement, but the new system is much more reliable, faster, and generally nicer to use.
  • Keep everything in source control. The goal is to allow a new team member to come in, sit down at a brand new workstation, run a checkout, run a build script, and have a working environment. Yes, that includes the database. Yes, that includes the version of ColdFusion or Node or whatnot. Yes, that includes the Apache or Nginx configuration. It should be possible to see at a glance what versions of the application and its dependencies are on the servers. Node’s package.json is a great step toward that ideal.
  • If it hurts, do it more often. Example: Merging sucks. Merging more often, in smaller chunks, is easier than delaying until the end of the project to merge and resolve conflicts. Another example: releasing sucks. So instead of releasing huge products once a quarter, release them once a month, or once a week, or once a day, or once an hour…
  • Build Quality In. This idea comes from Lean: the earlier you find a bug, the cheaper it is to fix. We QA folks tend to know that, and our mantra becomes Test Early, Test Often. If you find a bug in your code before you commit it, that’s maybe ten minutes’ time to fix it, max. If you find it in QA, now the person who found the bug has to write a ticket, the PM has to triage it, you have to read it and understand it, maybe there’s some more clarification back and forth, then you have to hunt through the code to find the problem, and then you fix it. So now we’re looking at hours of time rather than minutes. And if the problem is found in production and we have to run through a whole release cycle? Plus the customers’ time lost trying to work around the bug? A disaster. This is where unit testing and integration testing are super important.
  • Done means released. In Waterfall, “done” means “built, ready for testing”. In Agile, “done” means “ready to be released”, which means the developers don’t stop caring about something until it passes testing. DevOps goes one step beyond that: “done” means “released to production”. After all, what good is something that is beautifully crafted and passed all tests if the customer can’t use it yet? This ties into the next principle:
  • Everyone is responsible for delivery. In the Waterfall way, the developer builds a thing, tosses it over the wall to QA, and walks away, expecting other people to be responsible for getting it into prod. In the DevOps world, we’re all on the same team together: it doesn’t matter whose fault it is or what went wrong, everyone’s responsible for helping get the code safely into production. The developer should be on hand to chime in with their intimate knowledge of the code while the operations folks are trying to get things running.
  • Continuous Improvement. This is my favorite principle 🙂 The general flow of work should be: Plan, Do, Study, Act. Routinely, we should get together to ask “how could we do this better next time?” We should take controlled risks in order to improve our craft.

The Practices

In order to support the above principles, the following practices need to be in place:

  • Use CI Software. We use Atlassian’s Bamboo to build and deploy changes. It makes 0 sense to have people do this manually; people are good at creative tasks, while computers are good at repetitive, boring tasks.
  • Don’t break the build. Run the tests before you commit; don’t commit if something’s broken. An intern once asked me, “404s aren’t real errors, right?” He was so used to popping open the console and seeing a dozen 404 errors that he didn’t notice the one that mattered. We can’t just have errors sitting around in production that we ignore, or we train ourselves to ignore real errors too.
  • Don’t move on until commit tests pass. The CI server should be fast enough to give you feedback before you move on to another task; you should wait until you’re sure your commit is good before changing gears, so that if there’s something broken, you still have all the information you need loaded into your metaphorical RAM and don’t have to metaphorically swap pages to get to it.
  • You break it, you fix it. Take responsibility for your changes! If your commit breaks something else, it’s not the other author’s problem, it’s your problem, because you made the change. Pointing fingers is a bad habit to get into.
  • Fail fast. The authors suggest failing the build for slow tests. I agree, sheepishly; my functional tests are slow as heck, but I’m always trying to tighten the feedback loop and get developers information as rapidly as possible. They also suggest failing for linting issues, because they can lead to weird bugs later on. They suggest failing for architectural breaches, things like inline SQL when you have a stored-proc architecture, or other problems like that. The more you fail in dev, the less you fail in Prod.
  • Deploy to prod-like environments. You should be deploying to environments that mimic production before you get out of the testing cycle, to make sure it’s going to be a clean deploy. More importantly, what you deploy to that environment should be exactly, byte for byte, what you deploy to prod. With the new release process I’ve set up, that’s exactly what we do: we physically move the exact files, no building on the server anymore. Staging should be the exact same hardware, the same load balancing, the same OS configuration, the same application stack, with data in a known good state.
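
The “byte for byte” idea in the last practice can be checked mechanically. Here is a minimal sketch in Python, assuming the staged build and the released build each live in a directory on disk; the function names (sha256_of, manifest, differences) are made up for illustration and are not part of Bamboo or any other CI tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> dict:
    """Map each file path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differences(staged: Path, released: Path) -> list:
    """List relative paths that are missing or differ between the two trees."""
    a, b = manifest(staged), manifest(released)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from differences() means the two trees hold exactly the same files with the same contents; anything else names the files to look at before promoting.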


I know that was a lot of dense information this week, but hopefully it gave you a nice clear picture of the goal state you can work toward. Was it useful? Let me know in the comments!

Teatime: Containers and VMs

Tea of the week: Ceylon by Sub Rosa Tea. This is a nice, basic, bold tea, very astringent; it’s great for blending so long as you don’t choose delicate flavors to blend with. It really adds a kick!

Today’s Topic: Containers and virtualization

Today, I’m going to give you a brief overview of a technology I think might be helpful when running a test lab. So often as testers we neglect to follow trends in development; we figure, devs love their fancy toys, but the processes for testing software really don’t change, so there’s no need to pay much heed to what they’re doing. Too often we forget that, especially as automation engineers, we are writing software and using software and immersing ourselves in software just like they are. So it’s worth taking the time to attend tooling talks from time to time, see if there’s anything worth picking up.


A tool I’ve picked up and put down a few times over the past year or so is Vagrant. Vagrant makes it very easy to provision VMs; you can store the configuration for the server needed to run software right with the source code or binaries. Adopting a system in which developers keep the Vagrantfiles up to date and testers use them to spin up test instances can ensure that every test we run is on a valid system configuration, and both teams know what the supported configurations entail.

At a high level, the workflow is simple:

  1. Create a Vagrantfile
  2. On the command line, type “vagrant up”
  3. Wait for your VM to finish booting

In order for this to work, however, you have to have what’s called a “provider” configured with Vagrant. This is the specific VM technology that you’re using at your workplace; in my experiments, I’ve used VirtualBox, but if you’re already using something like VMware or a cloud provider like AWS for your test lab, there are integrations with those systems as well.

When creating the Vagrantfile, you first select a base image to use. Typically, this will be a machine with a given version of a given OS and possibly some software that’s more complex to install (to save time). HashiCorp, makers of Vagrant, provide a number of base machines that can be used, or you can create your own. This of course means that every VM you bring up has the same OS and patch level to begin with.

The next step is provisioning the box with the specific software you’re using. This is where you would install your application, any dependencies it has, and any dependencies of those dependencies, and so on. Since everything is installed automatically, everything is installed at the same version and with the same configuration, making it really easy to load up a fresh box with a known good state. Provisioning can be as simple as a handful of shell scripts, or it can use any of a number of provisioning systems, such as Chef, Ansible, or Puppet.

Here is a sample Vagrantfile:

# -*- mode: ruby -*-
# vi: set ft=ruby :

# Provisioning script, kept at the top so it's easy to find
$provisionScript = <<SCRIPT
  # Node & NPM
  sudo apt-get install -y curl
  # We have to install from a newer location, the repo version is too old
  curl -sL https://deb.nodesource.com/setup | sudo bash -
  sudo apt-get install -y nodejs
  sudo ln -s /usr/bin/nodejs /usr/bin/node
  cd /vagrant
  sudo npm install --no-bin-links
SCRIPT

# All Vagrant configuration is done below. The "2" in Vagrant.configure
# configures the configuration version (we support older styles for
# backwards compatibility). Please don't change it unless you know what
# you're doing.
Vagrant.configure(2) do |config|
  # The most common configuration options are documented and commented below.
  # For a complete reference, please see the online documentation at
  # https://docs.vagrantup.com.

  # Every Vagrant development environment requires a box. You can search for
  # boxes at https://atlas.hashicorp.com/search.
  config.vm.box = "hashicorp/precise64"

  # Allow symlinks to be created inside the VirtualBox shared folder
  config.vm.provider "virtualbox" do |v|
    v.customize ["setextradata", :id, "VBoxInternal2/SharedFoldersEnableSymlinksCreate/v-root", "1"]
  end

  config.vm.network "private_network", ip: ""

  # Hosts file plugin
  # To install: vagrant plugin install vagrant-hostsupdater
  # This will let you access the VM at servercooties.local once it's up
  config.vm.hostname = "servercooties.local"

  # Run the shell script defined at the top of this file
  config.vm.provision "shell", inline: $provisionScript
end


I left a good deal of the tutorial text in place, just in case I needed to reference it. We’re using Ubuntu Precise Pangolin 64-bit as the base box, distributed by HashiCorp, and I use a plugin that modifies my hosts file so that I can always find the machine in my browser at a known host. The provision script is just a simple shell script embedded within the config; I’ve placed it at the top so it’s easy to find.

One other major feature that I haven’t yet played with is the ability for a single Vagrantfile to bring up multiple machines. If your cluster generally consists of, say, two web servers, a database server, and a load balancer, you can encode that all in a single Vagrantfile to bring up a fresh cluster on demand. This makes it simple to bring up new testing environments with just one command.


I haven’t played much with Docker, but everyone seems to be raving about it, so I figured I’d touch on it as an alternative to Vagrant. Docker takes the metaphor of shipping containers, which revolutionized the shipping industry by abstracting away the handling of specific types of goods from the underlying business of moving goods around, and extends it to software. Before standard shipping containers, different goods packed differently, required different packaging material to keep them safe, and shipped in different amounts and weights; cargo handlers had to learn all these things, and merchants were a little wary of trusting their precious goods to someone who was less experienced. The invention of the standard shipping container changed all that: shipping companies just had to understand how to load and transport containers, and it was up to the manufacturers to figure out how to pack them. Docker does the same thing for software: operations staff just have to know how to deploy containers, while it’s up to the application developers to understand how to pack them.

Inside a Docker container, the application, its dependencies, and its required libraries reside, all pinned to the right versions and nestled inside the container. Outside, the operating system and any system-wide dependencies can be maintained by the operational staff. When it’s time to upgrade, they just remove the existing container and deploy the new one in its place. Different containers with different versions of the same dependency can live side by side; each one can only see its own contents, plus whatever the host explicitly shares with it.

And thus, we reach the limit of my knowledge of Docker. Do you have more knowledge? Do you have experience with Vagrant? Share in the comments!

Teatime: Programming Paradigms

Tea of the week: Rootbeer Rooibos from Sub Rosa Tea. A nice change from my typical tea-flavored-teas and chais, this fun, quirky tea really brightens up a dull day. 

Today’s Topic: Programming Paradigms

Programming languages come in many forms, and are intended to be used in many different ways. Understanding the organization and design patterns that apply to your language helps considerably with maintainability. This stuff might be old hat to devs, but as more and more QA folks pick up a bit of programming for automation, it can be helpful to have a refresher.

Procedural Code

The very first thing we designed (programmable) computers to do was to take in a list of instructions and execute them. Fundamentally, that’s the heart of all programming; as such, everything’s built on the procedural code paradigm.

The next improvement we made was to add subroutines (sometimes called “functions”), small bits of code that could be executed as a single custom instruction. This provides code re-use as well as increased readability. Basically everything can be written this way, but it’s the primary paradigm for C, BASIC, Go, Python, and PHP.

Object-Oriented Programming

Object-oriented programming attempts to model the system as a series of discrete objects that both encapsulate data and contain the logic necessary to work with the data. The basic building block here is the object, which contains both properties (pieces of data) and methods (encapsulated bits of logic, much like subroutines). This is where you start to get your classic Design Patterns, like Singleton or Factory patterns.

Objects can be composed; a Cat object might contain a Skeleton object that has a list of bones and keeps track of whether any are broken. Objects can also have inheritance, where one object fundamentally is another object but with added properties or methods; for example, a Cat object might inherit from a Mammal object the ability to nurse its young and be petted. In classical inheritance, you have a Class definition which explains what all objects of a given type look like, and specific instances created from that template; Classes inherit from other Classes. In prototypal inheritance, every object is a unique snowflake that can dynamically mix in the properties of another object, much like how bacteria can incorporate the genes of other bacteria they swallow. Objects in this system inherit directly from other objects.
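
To make the Cat example concrete, here is a small sketch in Python, which uses classical (class-based) inheritance. The bone names and method names are invented for illustration; the point is only to show “is a” (Cat inherits from Mammal) next to “has a” (Cat contains a Skeleton).

```python
class Skeleton:
    """Tracks a list of bones and which of them are broken."""

    def __init__(self, bones):
        self.bones = list(bones)
        self.broken = set()

    def break_bone(self, bone):
        if bone not in self.bones:
            raise ValueError(f"unknown bone: {bone}")
        self.broken.add(bone)

    def has_broken_bones(self):
        return bool(self.broken)


class Mammal:
    """Base class: behaviour shared by every mammal."""

    def nurse_young(self):
        return f"{type(self).__name__} nurses its young"

    def be_petted(self):
        return f"{type(self).__name__} is petted"


class Cat(Mammal):
    """A Cat is a Mammal (inheritance) and has a Skeleton (composition)."""

    def __init__(self):
        self.skeleton = Skeleton(["skull", "spine", "femur"])


cat = Cat()
cat.skeleton.break_bone("femur")
print(cat.be_petted())                  # method inherited from Mammal
print(cat.skeleton.has_broken_bones())  # work delegated to the composed Skeleton
```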

Primarily object-oriented languages include Java, Ruby, and C#. In Java or C#, you know you’re dealing with an object-oriented language when you have to declare a Class with a Main method in order to provide a starting point for even a simple application.

Functional Programming

The basic building block in a pure Functional programming paradigm is the Function. This isn’t the same as a subroutine; instead, this is a mathematical function, or “pure function”. A pure function takes inputs and returns outputs, with no state, side effects, or other changes. Functions are immutable, and the values are immutable; for a given immutable input, a function returns an immutable output. Rather than have extensive loops and conditionals, you instead are expected to compose functions together until your data is in the state you expect it to be in. If you’ve ever heard of MapReduce, the famous programming model from Google, this is functional programming (Map takes a set of data and applies a function to each element; Reduce feeds a set of data through a function that combines it into a single value).
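
Python isn’t a purely functional language, but it has enough of the pieces to sketch the idea. In the example below, square, add, and sum_of_squares are illustrative names; each is a pure function, and the result is built by composing map and reduce rather than by writing a loop.

```python
from functools import reduce


def square(x):
    """Pure: the result depends only on the argument."""
    return x * x


def add(total, x):
    """Pure: combines two values without touching any outside state."""
    return total + x


def sum_of_squares(values):
    """Compose map and reduce instead of writing an explicit loop."""
    return reduce(add, map(square, values), 0)


numbers = (1, 2, 3, 4)          # a tuple, so the input cannot be mutated
print(sum_of_squares(numbers))  # 1 + 4 + 9 + 16 = 30
```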

Primarily functional languages include Lisp, Haskell, F#, Clojure, and Scala.

Event-Based Programming

Event-driven programming was invented basically to handle the special case of GUIs. When you’re running a Graphical User Interface, you typically want to sit idle waiting for the user to perform an action before you respond to it. Rather than have every element on the screen poll to see if it’s been clicked every so often, you instead have an “event loop” that polls for any click anywhere. Different elements subscribe to specific events, such as “was a button clicked” or “did a request from the server complete” or whatnot.
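
Here is a toy version of that subscribe-and-dispatch pattern in Python. It is only a sketch: the EventLoop class and the event names are made up, and a real GUI toolkit would block waiting for input from the operating system instead of draining a queue we filled ourselves.

```python
from collections import defaultdict, deque


class EventLoop:
    """A toy event loop: handlers subscribe, events queue up, run() dispatches."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.queue = deque()

    def subscribe(self, event_name, handler):
        self.handlers[event_name].append(handler)

    def emit(self, event_name, payload=None):
        self.queue.append((event_name, payload))

    def run(self):
        # Hand each queued event to every handler subscribed to it.
        while self.queue:
            event_name, payload = self.queue.popleft()
            for handler in self.handlers[event_name]:
                handler(payload)


log = []
loop = EventLoop()
loop.subscribe("button_clicked", lambda name: log.append(f"clicked {name}"))
loop.subscribe("request_completed", lambda status: log.append(f"server said {status}"))

loop.emit("button_clicked", "Save")
loop.emit("request_completed", 200)
loop.emit("key_pressed", "x")  # nobody subscribed, so nothing happens
loop.run()
print(log)
```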

Primarily event-driven environments include Node.js. JavaScript in general is an odd mix of Functional, Procedural, and Event-Driven code, with some Objects thrown in there for extra fun.

What paradigm are you most comfortable programming in? Have you tried all of the above? Let me know in the comments 🙂


Teatime: Measuring Maintainability

Tea of the week: Today I’m living it up with Teavana’s Monkey-picked Oolong tea. A lot of people have complained that it’s not a very good Oolong, but I’m a black tea drinker most of the time, and I found it quite delightful. Your mileage may vary, of course!

Today’s Topic: Measuring Maintainability

Last week we touched on maintainable code, and one design pattern you could use to make your code more maintainable. Today, I wanted to pull back a bit and talk about how to measure the maintainability of your code. What can you use to objectively determine how maintainable your code is and measure improvement over time?


First of all, what is maintainability? The ISO/IEC 25010 standard (the successor to ISO 9126) describes five key components:

  • Modularity
  • Reusability
  • Analyzability
  • Modifiability
  • Testability

(I know you all like the last one ;)). Modular code is known to be a lot easier to maintain than blobs of spaghetti; that’s why procedures, objects, and modules were invented in the first place instead of keeping all code in assembly with gotos. Reusable code is easier to maintain because you can change the code in one place and it’s updated everywhere. Code that is easy to analyze is easier to maintain, because the learning curve is lessened. Code that cannot be modified obviously cannot be maintained at all. And finally, code that is easy to test is easier to maintain because you can test if your changes broke anything (using regression tests, typically unit tests in this case).

These things are hard to measure objectively, though; it’s much easier to give a gut feel guided by these measures than a hard number.


Complexity is a key metric to measure when talking about maintainability. Now, before we dive in, I want to touch on the difference between code that is complex and code that is complicated. Complicated code is difficult to understand, but with enough time and effort, can be known. Complexity, on the other hand, is a measure of the number of interactions between entities. As the number of entities grows, the potential interactions between them grow much faster (n entities have n(n-1)/2 possible pairs, and the number of possible groupings grows exponentially), and at some point, the software becomes too complex for your brain to keep in working memory. At that point, nobody really “knows” the software, and it’s difficult to maintain.

An example of 0-complexity code:

print('hello world')

This is Conway’s Game of Life written in a language called APL:

⍎'⎕',∈N⍴⊂S←'←⎕←(3=T)∨M∧2=T←⊃+/(V⌽"⊂M),(V⊖"⊂M),(V,⌽V)⌽"(V,V ←1¯1)⊖"⊂M'

So there are your boundaries when talking about complexity 🙂

Halstead Complexity

In 1977, Maurice Howard Halstead developed a measure of complexity for C programs in an attempt to quantify software development. He wanted to identify measurable aspects of code quality and calculate the relationships between them. His measure goes something like this:

  • Define N1 as the total number of operators in the software. This includes things like arithmetic, equality, assignment, logical operators, control words, function calls, array definitions, et cetera.
  • Define N2 as the total number of operands in the software. This includes identifiers, variables, literals, labels, function names, et cetera. ‘1 + 2’ has one operator and two operands, and ‘1 + 1 + 1’ has three operands and two operators.
  • Define n1 as the number of distinct operators in the software; basically, N1 with duplicates removed.
  • Define n2 as the number of distinct operands in the software; basically, N2 with duplicates removed.
  • The Vocabulary (n) is defined as n1 + n2
  • The Length (N) is defined as N1 + N2
  • The Volume (V) is defined as N * log2(n)
  • The Difficulty (D) is defined as (n1 / 2) * (N2 / n2)
  • The Effort (E) is defined as V * D
  • The Time required to write the software is calculated as E / 18 seconds
  • The number of bugs expected is V / 3000
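
The formulas above are easy to turn into a few lines of Python. This sketch assumes you have already counted the operators and operands by hand (or with a tokenizer); HalsteadCounts and halstead are illustrative names, and the code assumes at least one operand so the Difficulty division is defined.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HalsteadCounts:
    total_operators: int     # N1
    total_operands: int      # N2
    distinct_operators: int  # n1
    distinct_operands: int   # n2


def halstead(c):
    """Apply the Halstead formulas to a set of counts."""
    vocabulary = c.distinct_operators + c.distinct_operands
    length = c.total_operators + c.total_operands
    volume = length * math.log2(vocabulary)
    difficulty = (c.distinct_operators / 2) * (c.total_operands / c.distinct_operands)
    effort = volume * difficulty
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
        "time_seconds": effort / 18,
        "expected_bugs": volume / 3000,
    }


# '1 + 1 + 1': two operators (one distinct), three operands (one distinct)
result = halstead(HalsteadCounts(2, 3, 1, 1))
print(result["volume"])      # 5 * log2(2) = 5.0
print(result["difficulty"])  # (1 / 2) * (3 / 1) = 1.5
print(result["effort"])      # 5.0 * 1.5 = 7.5
```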

This seems like magic, and it is, a little bit: it’s not super accurate, but it’s a reasonable starting place. Personally I love the part where you can calculate the time it took you to write the code only after you’ve written it 🙂

Cyclomatic Complexity

Thankfully, that’s not the only measure we have anymore. A much more popular measure is that of Cyclomatic Complexity, developed by Thomas McCabe Sr in 1976. Cyclomatic Complexity is defined as the number of linearly independent paths through a piece of code. To calculate it, you construct a directed control flow graph, in which each node is a block of code and each edge is a possible jump from one block to the next.


Then you can calculate the complexity as follows:

  • Define E as the number of edges in the graph (lines)
  • Define N as the number of nodes in the graph (circles)
  • Define P as the number of connected components (a complete standalone segment of the graph which is not connected to any other subgraph; the graph for a single function has one component)
  • The complexity is computed as E - N + 2P
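
As a worked example of the graph formula, here is a short Python sketch. It assumes the control flow graph is given as a list of (from, to) edges, so every node must appear in at least one edge; the node names and the cyclomatic_complexity function are invented for illustration.

```python
def cyclomatic_complexity(edges):
    """Compute E - N + 2P for a control flow graph given as (from, to) pairs."""
    nodes = {node for edge in edges for node in edge}

    # Connected components ignore edge direction, so build an undirected view.
    neighbours = {node: set() for node in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbours[node] - seen)

    return len(edges) - len(nodes) + 2 * components


# A single if/else: the test branches two ways, and both branches rejoin.
if_else = [("test", "then"), ("test", "else"), ("then", "exit"), ("else", "exit")]
print(cyclomatic_complexity(if_else))  # E=4, N=4, P=1, so 4 - 4 + 2 = 2

# Straight-line code: one edge from start to end.
print(cyclomatic_complexity([("start", "end")]))  # E=1, N=2, P=1, so 1 - 2 + 2 = 1
```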

However, a much more practical method has arisen that is a simplification of the above:

  • Start with a complexity of 1
  • Add one for every “if”, “else if”, “case” or other branch (such as “catch” or “then”); a plain “else” adds no new decision, so it doesn’t count
  • Add one for each loop (“while”, “do”, “for”)
  • Add one for each complex condition (“and”, “or”)
  • The final sum is your complexity.

Automated complexity measuring tools typically use that method of calculation. A function with a Cyclomatic Complexity of 10 or under is simple, 11-20 is moderately complex, 21-50 is complex, and over 50 is very high risk and difficult to test thoroughly.
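
To show roughly what such a tool does, here is a minimal sketch that applies the counting method to Python source using the standard ast module. It is an approximation, not how any particular tool works: it counts Python’s own keywords (“elif” shows up as a nested “if”, “except” stands in for “catch”), treats a plain “else” as adding nothing, and ignores comprehensions and match statements. The name simplified_complexity is made up.

```python
import ast


def simplified_complexity(source):
    """Start at 1, then add one per branch, loop, and extra boolean operand."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.IfExp, ast.ExceptHandler)):
            complexity += 1  # "if", "elif", conditional expression, "except"
        elif isinstance(node, (ast.For, ast.AsyncFor, ast.While)):
            complexity += 1  # loops
        elif isinstance(node, ast.BoolOp):
            complexity += len(node.values) - 1  # each "and" / "or"
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0 and n % 2 == 0:
        return "negative even"
    elif n < 0:
        return "negative odd"
    for i in range(n):
        if i == 3:
            break
    return "non-negative"
'''

# 1 (base) + if + and + elif + for + inner if = 6
print(simplified_complexity(SAMPLE))
```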

Bus Factor

Another great measure I like to talk about when talking about maintainability is the Bus Factor[1] of the code. The Bus Factor is defined as the number of developers who would have to be hit by a bus before the project can no longer be maintained and is abandoned. This risk can be mitigated by thorough documentation, but honestly, how many of us document a project super thoroughly? Usually we assume we can ask questions of the senior devs who have their fingers in all the pies. The Bus Factor tells us how many (or how few) devs really worked on a single application over the years.

You can calculate the Bus Factor as follows:

  • Determine the primary author of each file in the repository (based on the original developer and the number of changes since)
  • Remove the author with the most files
  • Repeat until 50% of the files no longer have an author. The number of authors removed is your Bus Factor.
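
The steps above can be sketched in a few lines of Python. This assumes the first step is already done and you have a mapping from each file to a single primary author; how you pick that author from the history is up to you. The function name, file names, and author names below are illustrative, and this is not the code of either tool mentioned in this post.

```python
from collections import Counter


def bus_factor(file_authors, threshold=0.5):
    """Count how many top authors must go before `threshold` of files are orphaned.

    file_authors maps each file path to its single primary author.
    """
    total = len(file_authors)
    if total == 0:
        return 0

    files_per_author = Counter(file_authors.values())
    orphaned = 0
    removed = 0
    while orphaned < threshold * total:
        # Remove whichever remaining author owns the most files.
        author, owned = files_per_author.most_common(1)[0]
        del files_per_author[author]
        orphaned += owned
        removed += 1
    return removed


repo = {
    "app.js": "alice", "db.js": "alice", "auth.js": "alice",
    "api.js": "bob", "ui.js": "bob", "css.js": "bob",
    "docs.md": "carol", "build.sh": "carol",
    "test.js": "dave", "lint.cfg": "dave",
}
# Losing alice orphans 3 of 10 files; losing bob too makes it 6 of 10.
print(bus_factor(repo))  # 2
```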

I created a fun tool you can run on your own repositories to determine the Bus Factor: https://github.com/yamikuronue/BusFactor

The folks at http://mtov.github.io/Truck-Factor/ created their own tool, which they didn’t share, but they found the following results:

  • Rails scored a bus factor of 7
  • Ruby scored a 4
  • Android scored a 12
  • Linux scored a 90
  • Grunt scored a 1

Most systems I’ve scanned have scored a 1 or a 2, which is pretty scary to be honest. How about you? What do your systems score? Are there any other metrics you like to use?

[1]: Also known as the Lotto Factor at my work, defined as the number of people who would have to win the lottery and quit before the project is unmaintainable.