Sustainable Software Development, Part 2: Managing Complexity

As the code base grows, the complexity of your code increases. The pace of software development often slows down as the product matures. Why is this? What can we do to manage complexity?

Sustainable software development is about retaining software development velocity over the lifetime of your product. In part one, I covered the technical debt side of sustainable software development. In essence, when you postpone refactoring, bug fixing, testing etc., you accumulate debt. Like all debt, it has to be repaid. But even worse, you will pay interest. Every time you think about, swear about, discuss or plan around your known problems, or do anything with them other than fix them, you pay.

In this second part, we address another challenge to a sustainable pace of development: the increased complexity of the software. In order to avoid letting the complexity make your life miserable, we need to manage the complexity. For brevity, we will touch only upon these four areas:

  • Simplicity
  • Dependencies
  • Hierarchy
  • Abstraction

Fred Brooks makes the distinction between essential and accidental complexity (also discussed in his book “The Mythical Man-Month”). Essential complexity comes from your problem domain. The problem will have an inherent complexity you cannot get away from. Accidental complexity comes from the solution you’ve chosen. The “Keep it simple, stupid!” (KISS) design principle encourages developers to prefer simple solutions over more complex ones. Thus, in order to manage complexity, it is important to keep the accidental complexity to a minimum. One way to do that is to divide your code according to the Single Responsibility Principle, so that different pieces of functionality are not tangled up in a single class.

Another example of accidental complexity is the introduction of unnecessary dependencies in your software. Every subset of your code will have connections and communication with other subsets of your code. In a naive software implementation, the number of connections could grow roughly quadratically with the size of the software, since in the worst case every part talks to every other part. The interconnectedness of your code is one aspect of complexity that needs to be managed. Decomposition and decoupling will help you, and they should be applied to all levels of your system: service, sub-system, module, class, function etc. Various design principles (e.g. Dependency Inversion Principle) and patterns can guide you.
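To get a feeling for how quickly connections can pile up, here is a small back-of-the-envelope sketch. It assumes the worst case, where every component is connected to every other component, and the function name max_connections is made up for this illustration.

```python
def max_connections(n: int) -> int:
    """Upper bound on pairwise connections among n components.

    Assumes the worst case: every component talks to every other one.
    """
    return n * (n - 1) // 2


for n in (5, 10, 50, 100):
    print(f"{n} components: up to {max_connections(n)} connections")
```

Going from 10 to 100 components multiplies the number of components by ten, but the worst-case number of connections by more than a hundred. Real systems are rarely this bad, but the trend is what decomposition and decoupling are fighting.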

Steve McConnell discusses hierarchy and abstraction in his article “Keep it simple” (and in his book “Code Complete”). Structuring your software as a hierarchy is a means of decomposing your solution into more manageable pieces. Normally, different kinds of hierarchies exist within your code. For example, think about the difference between the relation between objects (“which object creates/destroys/contains/depends on/owns the memory of which”) and the relation between functions (“which function calls which”). One architectural pattern that can help the high-level structure of your code is Layers (defined in the book “Pattern-Oriented Software Architecture: A System of Patterns” by Buschmann et al.). One upside of a structured approach like Layers is that tooling, such as automatic mapping of which classes depend on which, can help you enforce your structure over time.
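As a rough idea of what such tooling does, the sketch below checks a dependency map against a layer order. Everything in it is an assumption made for the example: the layer names, the class names, and the rule that a class may only depend on classes in its own layer or a lower one. A real tool would extract the dependency map from your code.

```python
# Hypothetical layer order, highest layer first.
LAYERS = ["ui", "application", "network"]


def find_layer_violations(layer_of, depends_on, layers=LAYERS):
    """Return (class, dependency) pairs where a class depends on a higher layer."""
    rank = {name: index for index, name in enumerate(layers)}
    violations = []
    for cls, deps in sorted(depends_on.items()):
        for dep in sorted(deps):
            # A smaller rank means a higher layer; depending upwards is not allowed.
            if rank[layer_of[dep]] < rank[layer_of[cls]]:
                violations.append((cls, dep))
    return violations


layer_of = {"Button": "ui", "Application": "application", "Network": "network"}
depends_on = {
    "Button": {"Application"},
    "Application": {"Network"},
    "Network": {"Application"},  # a lower layer reaching upwards
}
print(find_layer_violations(layer_of, depends_on))
```

Run as part of a build, a check like this turns a drawing on a whiteboard into a rule that is actually enforced.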

Abstraction is all about what level of detail is visible at what level (see also my post on abstraction). For example, the file system is an abstraction hiding the hard drive with its tracks and sectors. When writing a program, you refer only to “file x”, and not to “track y, sector z”, which helps tremendously. Abstraction will help you reason about your program. Abstractions are largely domain dependent. Your domain will decide what is a good level of abstraction, hiding unnecessary detail while not constraining you.

Last words from McConnell’s “Keep it simple” article: “Neither hierarchies nor abstractions reduce the total number of details in a program — they might actually increase the total number. Their benefit arises from organizing details in such a way that fewer details have to be considered at any particular time.”

How do you fight complexity in your daily work?

The Diamond Shape

How can your classes talk to each other if they don’t know of each other? Hint: a diamond might come in handy.

Revisiting a topic from my KTH lecture in October, I wanted to discuss the “diamond shape”.

Imagine you want to write a simple networking application. You have at least two responsibilities represented: first, an Application class is responsible for implementing the application logic. Second, a Network class is responsible for implementing sending and receiving packets over the network. As an example, let’s assume that when the user presses a button, the Application needs to tell the Network class to send some data. When a response is received over the network, the Network class needs to tell the Application class that some data was received. Each class needs to talk to the other class.

The interaction between Application and Network requires bi-directional communication. Despite this, we don’t want the two classes to refer directly to each other, since this would lead to problems (tight coupling, poor testability, unnecessary knowledge in the Network class about the full public API of Application etc.). Thus, we want bi-directional communication between two classes, but they mustn’t know of each other or depend on each other. Quite a dilemma.

To resolve the dilemma, we introduce two interfaces. The Network class implements an interface INetworkSender (with a function sendData(data)). When the Application wants to send something, it calls a function in the INetworkSender interface which in turn ends up in the Network class. Conversely, we let Application implement INetworkReceiver (with a function onDataReceived(data)). When a network packet arrives, the Network class calls a function in the INetworkReceiver interface which ends up in Application.
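A minimal sketch of the two classes and two interfaces could look like this. It is written in Python, so sendData and onDataReceived become send_data and on_data_received. The set_receiver, packet_arrived and button_pressed methods, as well as the lists that stand in for the wire, are assumptions added only to make the example runnable.

```python
from abc import ABC, abstractmethod


class INetworkSender(ABC):
    @abstractmethod
    def send_data(self, data: bytes) -> None: ...


class INetworkReceiver(ABC):
    @abstractmethod
    def on_data_received(self, data: bytes) -> None: ...


class Network(INetworkSender):
    """Knows INetworkReceiver, but nothing about Application."""

    def __init__(self) -> None:
        self._receiver = None
        self.sent = []  # stand-in for the real wire

    def set_receiver(self, receiver: INetworkReceiver) -> None:
        self._receiver = receiver

    def send_data(self, data: bytes) -> None:
        self.sent.append(data)

    def packet_arrived(self, data: bytes) -> None:
        # Hypothetical hook that simulates a packet coming in from the network.
        if self._receiver is not None:
            self._receiver.on_data_received(data)


class Application(INetworkReceiver):
    """Knows INetworkSender, but nothing about Network."""

    def __init__(self, sender: INetworkSender) -> None:
        self._sender = sender
        self.received = []

    def button_pressed(self) -> None:
        self._sender.send_data(b"hello")

    def on_data_received(self, data: bytes) -> None:
        self.received.append(data)


# Only the wiring code sees both concrete classes.
network = Network()
app = Application(network)
network.set_receiver(app)
```

Note that neither class mentions the other by name. The only place where both concrete classes appear is the wiring at the bottom.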

What is important here is the way the Application class communicates with the Network class and vice versa. If you draw a diagram of the two classes and two interfaces, the picture resembles a diamond shape. The right-hand path is where Application talks to Network (top to bottom) through INetworkSender. The left-hand path is where Network talks to Application (bottom to top) through INetworkReceiver. In general, one path is for function calls in one direction while the other path is in the other direction.

The diamond shape decouples our classes despite bi-directional communication. This is a powerful thing and it’s widely used when building large software systems. In particular, it is used in the Layers architectural pattern. We keep the upper layers unaware of lower layers to obtain a flexible and reusable solution.

The decoupling of Application and Network helps us keep the application logic separate from the networking. It also gives us testability. If we let the Application class talk to a mock class instead of Network, we can verify the application logic fast and reliably, without complicating things with a real network. Also, with a clear separation, it is easier to spot where application logic depends on network behavior.
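Here is one way such a test might look, using Python's standard unittest.mock module. The Application class below is a cut-down assumption for the example, with a hypothetical button_pressed method that sends the bytes b"hello".

```python
from abc import ABC, abstractmethod
from unittest.mock import Mock


class INetworkSender(ABC):
    @abstractmethod
    def send_data(self, data: bytes) -> None: ...


class Application:
    def __init__(self, sender: INetworkSender) -> None:
        self._sender = sender

    def button_pressed(self) -> None:
        self._sender.send_data(b"hello")


def test_button_press_sends_data() -> None:
    # The mock takes the place of Network; no real network is involved.
    sender = Mock(spec=INetworkSender)
    Application(sender).button_pressed()
    sender.send_data.assert_called_once_with(b"hello")


test_button_press_sends_data()
```

The test runs in memory and never touches a socket, which is what makes it fast and repeatable.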

Object-Oriented Programming Lecture at KTH: Slides etc.

Thank you to all who participated in my lecture at KTH on October 26, 2011. I had a lot of fun, and we had some good discussions. For those of you who were not there, it was about object-oriented programming and how to write good code. We used some example code to discuss OO principles and testing. Here’s the presentation (pdf)!

A couple of posts of relevance:

Design Principles by Example: Talk to an Interface or an Abstraction?

What is the relation between design principles “Talk to an interface, not an implementation” and “Talk to an abstraction, not a concrete”? When you apply them, you want to achieve different goals.

Two important design principles for writing good software are

  1. “Talk to an interface, not an implementation” and
  2. “Talk to an abstraction, not a concrete”.

Admittedly, they sound very much alike, so what is the difference between them?

Assume you are writing a client implementation that needs to communicate with a server somewhere. You have chosen to use a web socket for sending messages over the wire. To that end, you will use a class ClientWebSocketSenderImplementation. Now, the “Talk to an interface, not an implementation” design principle suggests that talking to the web socket implementation class directly is inappropriate. Instead, you should talk to an interface ClientWebSocketSender.

Following the first design principle has several upsides. First, it will make your code easier to test. In this case, using the web socket implementation directly would use the network. That would make your unit tests slow and unreliable and might require a complicated setup phase. Second, talking to the web socket implementation directly would couple your client code to that specific implementation. If needed, changing web socket implementations would be difficult. Also, your code would not be reusable without shipping the web socket implementation.

We have chosen to send messages using a web socket. But there is really no need for our client application to know how messages are sent over the network. The second design principle says “Talk to an abstraction, not a concrete”. The concrete here is a web socket. A suitable abstraction in this context could be the ability to send messages over the web without specifying how. So we introduce an interface ClientWebSender. Depending on the application, we could take it one step further. It might make sense to abstract away the fact that we’re sending messages over the internet (for example, it could be over an IPC channel). We would end up with an interface ClientSender.
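As a sketch of where this ends up, the client below knows only a ClientSender. The two implementations are placeholders that record messages in a list instead of touching a real web socket or IPC channel. Their names, the send method and the greet method are assumptions for this example.

```python
from abc import ABC, abstractmethod


class ClientSender(ABC):
    """The abstraction: send a message, without saying how."""

    @abstractmethod
    def send(self, message: str) -> None: ...


class WebSocketClientSender(ClientSender):
    """Placeholder for a web socket based sender (no real socket is opened)."""

    def __init__(self) -> None:
        self.frames = []

    def send(self, message: str) -> None:
        self.frames.append(message)


class IpcClientSender(ClientSender):
    """Placeholder for a sender that uses an IPC channel."""

    def __init__(self) -> None:
        self.channel = []

    def send(self, message: str) -> None:
        self.channel.append(message)


class Client:
    def __init__(self, sender: ClientSender) -> None:
        self._sender = sender

    def greet(self) -> None:
        # No web socket vocabulary leaks into the client code.
        self._sender.send("hello")
```

Swapping the transport then means passing a different ClientSender to the Client, while the Client itself stays unchanged.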

The second design principle will make your application more resilient to change. Without the abstraction, the web socket details might propagate throughout your code. For example, functions, arguments and return types could be specific to web sockets. If you would like to change your application to support message sending over e.g. HTTP or your own proprietary protocol over TCP, you would have to chase down all references to web sockets.

Last, the “Talk to an abstraction, not a concrete” principle does not require us to talk to an interface. You might have a class that represents the “abstraction” part and hides the “concrete” part (e.g. by delegating to a web socket implementation). So our two design principles serve different purposes and do not necessarily overlap. That said, they work very well in combination to write decoupled, testable and change-resilient software.
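To illustrate an abstraction without an interface, here is a small sketch of a plain class that hides a web socket by delegating to it. FakeWebSocket and its send_text_frame method are invented stand-ins and do not correspond to any real web socket library. MessageSender and send_message are likewise names chosen for the example.

```python
class FakeWebSocket:
    """Invented stand-in for a web socket object from some library."""

    def __init__(self) -> None:
        self.frames = []

    def send_text_frame(self, payload: str) -> None:
        self.frames.append(payload)


class MessageSender:
    """A concrete class acting as the abstraction.

    Callers only see send_message; the web socket stays hidden inside.
    """

    def __init__(self, socket: FakeWebSocket) -> None:
        self._socket = socket

    def send_message(self, message: str) -> None:
        self._socket.send_text_frame(message)


socket = FakeWebSocket()
MessageSender(socket).send_message("hello")
print(socket.frames)
```

Callers are shielded from the web socket details, but they are still tied to this one concrete class, which is why the first principle remains useful on top of the second.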

Object-Orientation is not Really About Objects

How do we write good object-oriented programs? Despite the name, it is not by focusing on objects.

Object-oriented (OO) programming presents several concepts to the programmer not present in a procedural language. Given the name “object-orientation”, objects are clearly a central concept. Objects are the basic building blocks of our program and they carry out all the important work of our program. This should be contrasted with procedural programs, where functions are global and data is passed as arguments.

From time to time, you come across OO programs that have a procedural flavor to them. There are few, if any, interfaces, and classes are primarily used to organize related functions with data. Admittedly, this is a slight administrative improvement over procedural programming with structs and global functions. The problem is, OO programs like this have the same problem as procedural programs: coupling.

Robert C. Martin illustrates the concept very well in his article Design Principles and Design Patterns (Figure 2-17 on page 13). Assume you have a main function. Main calls three other functions, which in turn call two other functions. Main now depends on all its descendants! High-level logic should not depend on low-level implementation.

The same applies to the “procedural style OO program” described above. Main might use three objects and call a function in each. Those functions might call two functions on some other classes and so forth. Main will depend on all classes and functions. Before you know it, main depends on your whole system.

The point is, even though objects do all the important work, they do not take us very far when trying to write good, decoupled software. Instead, have a look at Figure 2-18 in the article. Assume that main calls functions on three different interfaces. Main would not know or care which class implements each interface. Main depends only on the interfaces.
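The difference can be made concrete by counting what main transitively depends on in each design. The two dependency maps below are invented for this sketch, under the assumption that each of the three functions calls two others. They are not taken from the figures in the article.

```python
def transitive_dependencies(graph, start):
    """Return everything that `start` depends on, directly or indirectly."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


# Procedural style: main calls three functions, each calling two more.
procedural = {
    "main": ["read_input", "process", "write_output"],
    "read_input": ["open_file", "parse"],
    "process": ["validate", "compute"],
    "write_output": ["format", "save"],
}

# Interface style: main sees three interfaces; implementations point at them.
with_interfaces = {
    "main": ["Reader", "Processor", "Writer"],
    "FileReader": ["Reader", "open_file", "parse"],
    "DefaultProcessor": ["Processor", "validate", "compute"],
    "FileWriter": ["Writer", "format", "save"],
}

print(len(transitive_dependencies(procedural, "main")))
print(len(transitive_dependencies(with_interfaces, "main")))
```

In the first map, main reaches all nine functions. In the second, the arrows from the implementations point towards the interfaces, so main reaches only the three interfaces.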

Thus, when writing object-oriented software, think primarily about abstractions and interfaces. After that, you can think about objects.

Decoupling Starts with an Interface, but Where Does It End?

Writing decoupled software components means writing them so that they depend on each other as little as possible. If one of your classes uses another class directly, they will inevitably be coupled together. Use one, and you will have to use the other. To decouple components, we normally think of letting our classes talk to interfaces instead of concrete classes. Any implementing class can hide behind the interface, so the coupling between the classes is reduced. One good example of “talking to interfaces” is the Factory Method design pattern. Instead of using new on a concrete class, you will ask an interface to create the object for you. But decoupling is more than just talking to interfaces.
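A small sketch of the idea, with class names made up for the example: the client function asks a factory interface for a Connection and never names the concrete class.

```python
from abc import ABC, abstractmethod


class Connection(ABC):
    @abstractmethod
    def describe(self) -> str: ...


class TcpConnection(Connection):
    def describe(self) -> str:
        return "tcp"


class ConnectionFactory(ABC):
    @abstractmethod
    def create_connection(self) -> Connection: ...


class TcpConnectionFactory(ConnectionFactory):
    def create_connection(self) -> Connection:
        return TcpConnection()


def open_and_describe(factory: ConnectionFactory) -> str:
    # The client asks the interface for an object instead of
    # constructing TcpConnection itself.
    return factory.create_connection().describe()


print(open_and_describe(TcpConnectionFactory()))
```

The choice of concrete class has moved out of the client, but someone still has to pick TcpConnectionFactory in code.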

When I first started programming Java, I came across Spring and its inversion of control (IoC) containers. Compared to the Factory Method above, Spring IoC takes object creation decoupling a step further. By using XML files, you specify which object to create and which arguments to give to the constructor (e.g. other objects). Coming from C++, it was pure magic the first time I saw it. Nowhere in the code do your classes refer to each other. Now, that is decoupling!
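To take some of the magic out of it, here is a toy sketch of the idea. It is not Spring and does not use Spring's API or XML format. A plain dictionary plays the role of the configuration file, and the wire function, the registry and the two classes are all assumptions made for the example. Circular references are not handled.

```python
class Network:
    pass


class Application:
    def __init__(self, sender) -> None:
        self.sender = sender


def wire(config, registry):
    """Create every configured object, resolving constructor arguments by name."""
    objects = {}

    def get(name):
        if name not in objects:
            spec = config[name]
            args = [get(ref) for ref in spec.get("args", [])]
            objects[name] = registry[spec["class"]](*args)
        return objects[name]

    for name in config:
        get(name)
    return objects


# Stands in for the configuration file: which objects to create, and with what.
config = {
    "application": {"class": "Application", "args": ["network"]},
    "network": {"class": "Network"},
}
registry = {"Application": Application, "Network": Network}

objects = wire(config, registry)
print(type(objects["application"].sender).__name__)
```

Application never mentions Network. Which object it receives is decided entirely by the configuration.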

Later, I started working with OSGi. It is a Java framework which allows for independent life-cycles of modules (called bundles). For example, you could replace a bundle with a newer version during run-time. Other bundles won’t notice they are communicating with something new. Bundles communicate with each other by publishing and consuming services, which are just plain interfaces. XML is used to specify what services are published and consumed by a bundle. Similar to above, individual modules never refer to each other in the code. Again, that is decoupling.

Web APIs, such as a REST or SOAP API over HTTP, are often used to provide access to web or cloud services. Also, more and more enterprises expose their internal sub-systems as web services. By assembling software using web APIs, you can combine software components that may execute anywhere in the world. The pattern here seems to be that the larger the software component, the more decoupling is possible (or required?). Components decoupled through interfaces still need to be compiled together. Components decoupled through web APIs need not even be in the same time zone. The downside is that more powerful decoupling techniques require a lot of work.

Combining decoupling on different levels will make your system more resilient to change and open up for use in new and unexpected ways. The next time you think of ways to make your code more loosely coupled, remember that decoupling is not only about interface classes!