Teatime: Programming Paradigms

Welcome back to Teatime! This is a weekly feature in which we sip tea and discuss some topic related to quality. Feel free to bring your tea and join in with questions in the comments section.

Tea of the week: Rootbeer Rooibos from Sub Rosa Tea. A nice change from my typical tea-flavored-teas and chais, this fun, quirky tea really brightens up a dull day. 
teaset2pts

Today’s Topic: Programming Paradigms

Programming languages come in many forms, and are intended to be used in many different ways. Understanding the organization and design patterns that apply to your language helps considerably with maintainability. This stuff might be old hat to devs, but as more and more QA folks pick up a bit of programming for automation, it can be helpful to have a refresher.

Procedural Code

The very first thing we designed (programmable) computers to do was to take in a list of instructions and execute them. Fundamentally, that’s the heart of all programming; as such, everything’s built on the procedural code paradigm.

The next improvement we made was to add subroutines (sometimes called “functions”), small bits of code that could be executed as a single custom instruction. This provides code re-use as well as increased readability. Basically everything can be written this way, but it’s the primary paradigm for C, BASIC, Go, Python, and PHP.

Object-Oriented Programming

Object-oriented programming attempts to model the system as a series of discrete objects that both encapsulate data and contain the logic necessary to work with the data. The basic building block here is the object, which contains both properties (pieces of data) and methods (encapsulated bits of logic, much like subroutines). This is where you start to get your classic Design Patterns, like Singleton or Factory patterns.

Objects can be composed; a Cat object might contain a Skeleton object that has a list of bones and keeps track if any are broken.  Objects can also have inheritance, where one object fundamentally is another object but with added properties or methods; for example, a Cat object might inherit from a Mammal object the ability to nurse its young and be petted. In classical inheritance, you have a Class definition which explains what all objects of a given type look like, and specific instances created from that template; Classes inherit from other Classes. In prototypical inheritance, every object is a unique snowflake that can dynamically mix-in the properties of another object, much like how bacteria can incorporate the genes of other bacteria they swallow. Objects in this system inherit directly from other objects.

Primarily object-oriented languages include Java, Ruby, and C#. You know you’re dealing with an object-oriented language when you have to declare a Class with a Main method in order to provide a starting point for even a simple application.

Functional Programming

The basic building block in a pure Functional programming paradigm is the Function. This isn’t the same as a subroutine; instead, this is a mathematical function, or “pure function”. A pure function takes inputs and returns outputs, with no state, side effects, or other changes. Functions are immutable, and the values are immutable; for a given immutable input, a function returns an immutable output. Rather than have extensive loops and conditionals, you instead are expected to compose functions together until your data is in the state you expect it to be in. If you’ve ever heard of Map-Reduce, the famous algorithm from Google, this is functional programming (Map takes a set of data and applies a function to each element; Reduce takes a set of data into a function that composes it into a single value).

Primarily functional languages include Lisp, Haskell, F#, Clojure, and Scala.

Event-Based Programming

Event-driven programming was invented basically to handle the special case of GUIs. When you’re running a Graphical User Interface, you typically want to sit idle waiting for the user to perform an action before you respond to it. Rather than have every element on the screen poll to see if it’s been clicked every so often, you instead have an “event loop” that polls for any click anywhere. Different elements subscribe to specific events, such as “was a button clicked” or “did a request from the server complete” or whatnot.

Primarily event-driven languages include Node.JS. Javascript in general is an odd mix of Functional, Procedural, and Event-Driven code, with some Objects thrown in there for extra fun.

What paradigm are you most comfortable programming in? Have you tried all of the above? Let me know in the comments 🙂

 

Teatime: Measuring Maintainability

Welcome back to Teatime! This is a weekly feature in which we sip tea and discuss some topic related to quality. Feel free to bring your tea and join in with questions in the comments section.

Tea of the week: Today I’m living it up with Teavana’s Monkey-picked Oolong tea. A lot of people have complained that it’s not a very good Oolong, but I’m a black tea drinker most of the time, and I found it quite delightful. Your mileage may vary, of course!
teaset2pts

Today’s Topic: Measuring Maintainability

Last week we touched on maintainable code, and one design pattern you could use to make your code more maintainable. Today, I wanted to pull back a bit and talk about how to measure the maintainability of your code. What can you use to objectively determine how maintainable your code is and measure improvement over time?

Maintainability

First of all, what is maintainability? The ISO 9126 standard describes five key components:

  • Modularity
  • Reusability
  • Analyzability
  • Modifiability
  • Testability

(I know you all like the last one ;)). Modular code is known to be a lot easier to maintain than blobs of spaghetti; that’s why procedures, objects, and modules were invented in the first place instead of keeping all code in assembly with gotos. Reusable code is easier to maintain because you can change the code in one place and it’s updated everywhere. Code that is easy to analyze is easier to maintain, because the learning curve is lessened. Code that cannot be  modified obviously cannot be maintained at all. And finally, code that is easy to test is easier to maintain because you can test if your changes broke anything (using regression tests, typically unit tests in this case).

These things are hard to measure objectively, though; it’s much easier to give a gut feel guided by these measures than a hard number.

Complexity

Complexity is a key metric to measure when talking about maintainability. Now, before we dive in, I want to touch on the difference between code that is complex and code that is complicated. Complicated code is difficult to understand,  but with enough time and effort, can be known. Complexity, on the other hand, is a measure of the number of interactions between entities. As the number of entities grows, the potential interactions between them grows literally exponentially, and at some point, the software becomes too complex for your brain to physically keep in working memory. At that point, nobody really “knows” the software, and it’s difficult to maintain.

An example of 0-complexity code:

print('hello world')

This is Conway’s Game of Life written in a language called APL:

⍎'⎕',∈N⍴⊂S←'←⎕←(3=T)∨M∧2=T←⊃+/(V⌽"⊂M),(V⊖"⊂M),(V,⌽V)⌽"(V,V ←1¯1)⊖"⊂M'

So there’s your boundaries when talking about complexity 🙂

Halstead Complexity

In 1977, Maurice Howard Halstead developed a measure of complexity for C programs in an attempt to quantify software development. He wanted to identify measurable aspects of code quality and calculate the relationships between them. His measure goes something like this:

  • Define N1 as the total number of operators in the software. This includes things like arithmatic, equality, assignmant, logical operators, control words, function calls, array definitions, et cetera.
  • Define N2 as the total number of operands in the software. This includes identifiers, variables, literals, labels, function names, et cetera. ‘1 + 2’ has one operator and two operands, and ‘1 + 1 + 1’ has three operands and two operators.
  • Define n1 as the number of distinct operators in the software; basically, N1 with duplicates removed.
  • Define n2 as the number of distinct operands in the software; basically, N2 with duplicates removed.
  • The Vocabulary (n) is defined as  n1 +  n2
  • The Length (N) is defined as N1 +  N2
  • The Volume (V) is defined as N * log2n
  • The Difficulty (D) is defined as (n1/2) * (N2/n2)
  • The Effort (E) is defined as V * D
  • The Time required to write the software is calculated as E/18 seconds
  • The number of bugs expected is V/3000.

This seems like magic, and it is, a little bit: it’s not super accurate, but it’s a reasonable starting place. Personally I love the part where you can calculate the time it took you to write the code only after you’ve written it 🙂

Cyclomatic Complexity

Thankfully, that’s not the only measure we have anymore. A much  more popular measure is that of Cyclomatic Complexity, developed by Thomas McCabe Sr in 1976. Cyclomatic Complexity is defined as the number of linearly independent paths through a piece of code. To calculate it, you construct a directed control flow graph, like so:

cyclogrpah

Then you can calculate the complexity as follows:

  • Define E as the number of edges in the graph (lines)
  • Define N as the number of nodes in the graph (circles)
  • Define P as the number of connected components (a complete standalone segment of the graph which is not connected to any other subgraph; this graph has one component)
  • The complexity is computed as E – N + 2P

However, a much more practical method has arisen that is a simplification of the above:

  • Start with a complexity of 1
  • Add one for every “if”, “else”, “case” or other branch (such as “catch” or “then”)
  • Add one for each loop (“while”, “do”, “for”)
  • Add one for each complex condition (“and”, “or”)
  • The final sum is your complexity.

Automated complexity measuring tools typically use that method of calculation. A function with a Cyclomatic Complexity under 10 is a simple program, 20-50 is complex, and over 50 is very high risk and difficult to test thoroughly.

Bus Factor

Another great measure I like to talk about when talking about maintainability is the Bus Factor[1] of the code. The Bus Factor is defined as the number of developers who would have to be hit by a bus before the project can no longer be maintained and is abandoned. This risk can be mitigated by thorough documentation, but honestly, how many of us document a project super thoroughly? Usually we assume we can ask questions of the senior devs who have their fingers in all the pies. The Bus Factor tells us how many (or how few) devs really worked on a single application over the years.

You can calculate the Bus Factor as follows:

  • Calculate the author of each file in the repository (based on the original developer and the number of changes since)
  • Remove the author with the most files
  • Repeat until 50% of the files no longer have an author. The number of authors removed is your Bus Factor

I created a fun tool you can run on your own repositories to determine the Bus Factor: https://github.com/yamikuronue/BusFactor

The folks at http://mtov.github.io/Truck-Factor/ created their own tool, which they didn’t share, but they found the following results:

  • Rails scored a bus factor of 7
  • Ruby scored a 4
  • Android scored a 12
  • Linux scored a 90
  • Grunt scored a 1

Most systems I’ve scanned have scored a 1 or a 2, which is pretty scary to be honest. How about you? What do your systems score? Are there any other metrics you like to use?


[1]: Also known as the Lotto Factor at my work, defined as the number of people who would have to win the lottery and quit before the project is unmaintainable.

Teatime: Maintainable Code, MV*, and Backbone

Welcome back to Teatime! This is a weekly feature in which we sip tea and discuss some topic related to quality. Feel free to bring your tea and join in with questions in the comments section.

Tea of the week: Royal Milk Tea. In Japan, when they want black tea, they buy it powdered with milk powder already added so they don’t have to go buy milk. This stuff is super simple to make since it’s instant, but it tastes amazing. You can often find it at an Asian grocery store if you happen to have one in your area.
teaset2pts

Today’s topic: Maintainable code with MV*

Today we’re venturing into the area of maintainability. There are a lot of design patterns out there to increase the maintainability of your code, but none are as widespread in the web world as MV*

What? MV*?

The MV[something] family of patterns (abbreviated here as MV*, where * is a wildcard character) consist of a Model, a View, and… something else. The most common types are as follows:

  • MVC: Model, View, Controller. This is the classic model, the basic template that the others are following.
  • MVVM: Model, View, View-Model
  • MVP: Model, View, Presenter

Backbone, a popular MV* framework for Javascript, provides tools to create Models and Views, but leaves the third piece up to you. We’ll be having a look at the pieces in the next few sections here.

Models

The model contains the data the application needs to work. Generally, this is an Object in an OOP paradigm; you might have, say, a Product model that knows the description and price of a single product, plus can work out the sales tax given the country, or some such. This is independant of any particular presentation, and should be global to your application. The model layer contains the business logic pertaining to the domain; for example, a Car model knows it can only have four Tires attached, but a Truck might have 18.

The thing to keep in mind here is that the model is a layer like any other, and can contain more than just your specific object models. It can also contain data mappers that store the models into a database, services to translate models into a different format, data in the database held in a  normalized format, and the code needed to reconstruct those model objects out of the normalized data. There’s a lot going on here, and all of it can contain business logic that needs to be tested.

Views

Views are responsible for providing an interface to the user to interact with the data. In a classic server-based web app, your view is HTML, maybe a little injected javascript. In a single-page Javascript app, you’ll have View objects that construct widgets on the DOM, plus the HTML that’s actually injected. In a Windows Presentation Foundation app, you’ll have some XML defining your view.

View-models

This confused me for the longest time until I realized what the name was trying to imply. The Model layer models your application; the View-Model is a model of your ViewEverything you need to render a view — the data, the methods, the callbacks — live in a model, and your actual view — the DOM, or whatever — just renders what’s in the view-model. This is the pattern Knockout.js uses: the DOM is the view, and your methods bind to it from the view-model. You never write code to update the view, it just happens automatically.

Controllers

The controller represents the logic of the application (or page, in a multi-page js app). It determines what models are needed and what views should be displayed based on the context. For example, in ASP.NET, you might have a bit of HTML (your view) showing a record from the Users table (your Model). When you click the edit pencil, the view dynamically changes out to a form (a different view) showing the same record. When you click save, it swaps back. The controller contains this sort of logic.

In an MVP or MVVM setup, without a controller, often the logic is controlled via a REST-style “route”: one page has one view and one model and they compose directly, and to do anything with that page, you make another HTTP request which returns another view with another model.

Practical Example: Backbone

Let’s take a look at one Javascript library that’s become very popular in recent years. These code samples are from cdnjs, by the way, which is a great place to learn about Backbone. Backbone can be used in an MVC style very easily: your controller summons up views and models and composes the two together.[1] One model maps to one endpoint in the API layer, which serves up one model; for example, this might be a model for a user:

var UserModel = Backbone.Model.extend({ 
    urlRoot: '/user', 
    defaults: { 
        name: '', 
        email: '' 
    } 
});

user.save(userDetails, { 
    success: function (user) { alert(user.toJSON()); } 
});

You can see here how we extend the basic model to create our own, and only override what we want to use. The urlRoot tells it where to look for the API backend; the defaults tell us what properties we expect to see back as well as providing sensible defaults in case a new object is created during the course of our application.

A view might look something like this:

var SearchView = Backbone.View.extend({
    initialize: function(){
      this.render();
    },
    render: function(){
      var template = _.template( $("#search_template").html(), {} );
      this.$el.html( template );
    },
    events: {
      "click input[type=button]": "doSearch"
    },
    doSearch: function( event ){
      // Button clicked, you can access the element that was clicked with event.currentTarget
      alert( "Search for " + $("#search_input").val() );
    }
  });

  var search_view = new SearchView({ el: $("#search_container") });

Here we have an element ($el) we will bind our template to; that’s usually passed in when the view is instantiated. We also have a template here, in this case an underscore template. We then render the template when we’re asked to render.

We use Backbone’s events array to register for certain events on our element, and when the button is clicked, we perform an action (in this case, an alert).

Do you think this code is easier to maintain than the usual giant closure full of methods? Why or why not? Have you seen any maintenance nightmares recently you think could have been sorted out by using MV* better?


[1]: This assumes a REST backend, which we can talk about another day.