Abstraction vs Compression

In our daily communication, you might hear things like “a higher level of abstraction”, but what is abstraction? And how does it relate to compression?

Wikipedia says this about Abstraction: “Abstractions may be formed by reducing the information content of a concept or an observable phenomenon, typically to retain only information which is relevant for a particular purpose.” For example, a ball is an abstraction of a football and other types of balls. Abstraction in software development is about removing details from something concrete; there’s a loss of information. For example, a class for sending packets over a TCP socket is a concrete concept. An abstraction could be a Network interface with a send function. In the abstraction, we remove the details of exactly how the data is sent.
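The Network example can be sketched in a few lines (the interface name comes from the text; the method signature and the implementation body are my assumptions):

```java
// Sketch: a concrete TCP sender and the Network abstraction above it.
// The abstraction keeps only the ability to send; the TCP details are gone.
interface Network {
    void send(byte[] data);
}

class TcpSocketSender implements Network {
    int bytesSent = 0;  // stand-in for real socket I/O in this sketch

    @Override
    public void send(byte[] data) {
        // A real implementation would open a socket, split the data into
        // TCP packets and write them; callers of Network see none of that.
        bytesSent += data.length;
    }
}
```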

Compression, on the other hand, is about hiding information. Although not visible, the information is there and can be retrieved if necessary (like unzipping compressed content). One example is procedural programming: Reading a function should give you a good picture of what the function does. If you need more information, you can always go into the called functions for details. No information is lost.

Using an interface in object-oriented programming introduces an abstraction. At runtime, in the general case, you cannot know which class implements an interface. You lose information. This can make your object-oriented code hard to understand, review etc. Some may argue that this is not a problem: if your interfaces are clear, it does not matter who implements them. It should be sufficient to know that the subclass carries out the work according to the specification of the interface (honoring the Liskov Substitution Principle). Still, the information loss can be a challenge.

Other uses of the word “abstraction” in software development may appear when people mention things like “programming at a higher level of abstraction”. The Network interface from above is a good example. It is on a higher level of abstraction than the more low-level TCP implementation (which itself hides raw socket operations). What if we had a Network class with a send function, implementing the TCP socket sending itself? Is this an abstraction? The public functions of Network hide the details of TCP packet sending etc. By going into the Network class, we can retrieve all the details of exactly how packets are sent. Information is hidden, but not lost. If no information is lost, this is rather “programming at a higher level of compression”. :)
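As a sketch, the “compressed” Network class might look like this (the private helper and its body are hypothetical):

```java
// Sketch: Network as a plain class. The TCP details are hidden behind
// the public send function but still fully present inside the class,
// so information is hidden, not lost.
class Network {
    int packetsSent = 0;  // observable stand-in for real socket I/O

    public void send(byte[] data) {
        sendOverTcp(data);  // the "compressed" view: just a send
    }

    private void sendOverTcp(byte[] data) {
        // Unzipping the compression: reading this private function would
        // reveal exactly how packets are sent.
        packetsSent++;
    }
}
```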

After putting you through this, I must say that in our daily communication, we (myself included) don’t pay much attention to the distinction between abstraction and compression: “abstraction” normally means any kind of information hiding or removal. But it’s useful to know the difference since it affects the understanding and readability of your code.

The Diamond Shape

How can your classes talk to each other if they don’t know of each other? Hint: a diamond might come in handy.

Revisiting a topic from my KTH lecture in October, I wanted to discuss the “diamond shape”.

Imagine you want to write a simple networking application. You have at least two responsibilities represented: first, an Application class is responsible for implementing the application logic. Second, a Network class is responsible for implementing sending and receiving packets over the network. As an example, let’s assume that when the user presses a button, the Application needs to tell the Network class to send some data. When a response is received over the network, the Network class needs to tell the Application class that some data was received. Each class needs to talk to the other class.

The interaction between Application and Network requires bi-directional communication. Despite this, we don’t want the two classes to refer directly to each other, since this would lead to problems (tight coupling, poor testability, unnecessary knowledge in the Network class about the full public API of Application etc.). Thus, we want bi-directional communication between two classes, but they mustn’t know of each other or depend on each other. Quite a dilemma.

To resolve the dilemma, we introduce two interfaces. The Network class implements an interface INetworkSender (with a function sendData(data)). When the Application wants to send something, it calls a function in the INetworkSender interface, which in turn ends up in the Network class. Conversely, we let Application implement INetworkReceiver (with a function onDataReceived(data)). When a network packet arrives, the Network class calls a function in the INetworkReceiver interface, which ends up in Application.
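The setup can be sketched in a few lines of Java (the interface names come from the text; the exact signatures and the wiring are my assumptions):

```java
// A minimal sketch of the diamond: two classes, two interfaces,
// bi-directional communication without direct references.
interface INetworkSender {
    void sendData(String data);
}

interface INetworkReceiver {
    void onDataReceived(String data);
}

class Application implements INetworkReceiver {
    private final INetworkSender sender;
    String lastReceived;  // exposed for the sketch

    Application(INetworkSender sender) {
        this.sender = sender;
    }

    // right-hand path: Application -> INetworkSender -> Network
    void onButtonPressed() {
        sender.sendData("hello");
    }

    // left-hand path ends here: Network -> INetworkReceiver -> Application
    public void onDataReceived(String data) {
        lastReceived = data;
    }
}

class Network implements INetworkSender {
    private INetworkReceiver receiver;
    String lastSent;  // exposed for the sketch

    void setReceiver(INetworkReceiver receiver) {
        this.receiver = receiver;
    }

    public void sendData(String data) {
        lastSent = data;  // a real Network would transmit over TCP here
    }

    void onPacketArrived(String data) {
        receiver.onDataReceived(data);
    }
}
```

Note that neither class mentions the other by name; each one knows only an interface.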

What is important here is the way the Application class communicates with the Network class and vice versa. If you draw a diagram of the two classes and two interfaces, the picture resembles a diamond shape. The right-hand path is where Application talks to Network (top to bottom) through INetworkSender. The left-hand path is where Network talks to Application (bottom to top) through INetworkReceiver. In general, one path is for function calls in one direction while the other path is in the other direction.

The diamond shape decouples our classes despite bi-directional communication. This is a powerful thing and it’s widely used when building large software systems. In particular, it is used in the Layers architectural pattern. We keep the upper layers unaware of lower layers to obtain a flexible and reusable solution.

The decoupling of Application and Network helps us keep the application logic separate from the networking code. It also gives us testability. If we let the Application class talk to a mock class instead of Network, we can verify the application logic quickly and reliably, without complicating things with a real network. Also, with a clear separation, it is easier to spot where application logic depends on network behavior.

Object-Oriented Programming Lecture at KTH: Slides etc.

Thank you to all who participated in my lecture at KTH on October 26, 2011. I had a lot of fun, and we had some good discussions. For those of you who were not there, it was about object-oriented programming and how to write good code. We used some example code to discuss OO principles and testing. Here’s the presentation (pdf)!

Design Patterns By Example: Implementing a State Machine

How do you implement the State pattern, while separating the different concerns? We use an example to discuss how to write code easy to understand and maintain.

When writing code, our classes often go through a series of transformations. What starts out as a simple class will grow as behavior is added. If care is not taken, your code will become difficult to understand and maintain. For example, assume you’re implementing a telephone. First, you support only the simplest of usages: your phone is not connected to the telephone jack. Thus, the phone can be either on-hook or off-hook.

You implement this as a class Telephone with a single boolean member offHook. Your phone has two operations, pickUp() and hangUp(), which manipulate the boolean. Great. Now you plug your phone into the jack and want to implement the next use case: the user picks up the phone (after which he should hear a tone) and presses a key (after which the tone should stop). You introduce a new boolean hasPressedFirstKey and a function pressKey(int key).
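A sketch of this boolean-oriented version (the tone handling is only hinted at in comments, since the text does not spell it out):

```java
// Boolean-oriented Telephone: three valid states modeled with two booleans.
class Telephone {
    boolean offHook = false;
    boolean hasPressedFirstKey = false;

    void pickUp() { offHook = true; /* start tone */ }

    void hangUp() {
        offHook = false;
        hasPressedFirstKey = false;
    }

    void pressKey(int key) {
        if (offHook && !hasPressedFirstKey) {
            hasPressedFirstKey = true;  /* stop tone */
        }
    }
}
```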

Boolean-Oriented Programming

Already, you have a couple of problems with your code. First, you have modeled three states (on hook, off hook while waiting for first key to be pressed, off hook while waiting for more keys to be pressed) with two boolean variables. Thus, there is one combination of values that does not correspond to a valid state for your telephone: offHook = false and hasPressedFirstKey = true. The state of a bigger system might involve more variables and might even be scattered over many classes. In that case, “boolean-oriented programming” like this makes the code very hard to understand.

Second, member variables live for as long as the object does. In our telephone example, the variable hasPressedFirstKey makes little sense as soon as the full phone number has been dialed and someone on the other side has answered. So the actual lifetime of hasPressedFirstKey is shorter than its “physical” lifetime. If we strive for self-documenting code, this is pretty far from it. And of course, this is even worse in the case of a complex system. So what can we do about it? State design pattern to the rescue.

The State Design Pattern

The Wikipedia page on the State design pattern says that the purpose of State is to “represent the state of an object”. In our telephony example, we would create three state classes: e.g. OnHookState, OffHookWaitForFirstDigitState, OffHookWaitForMoreDigitsState. State classes will only model valid states of the telephone, removing the first problem from above. Also, since there is always a valid state for our telephone, we remove the lifetime problem from above.

I use the State pattern for two reasons: First, it captures the behavior of the code in a single place. This will make your code easy to understand. Second, it makes it easy for you to separate what the system does (the behavior) from how it’s done (the implementation). This will make your code easy to maintain and test. Let’s go through an example.

States, Events and Actions

We have already mentioned the state classes. They all inherit from a common interface ITelephoneState. The state interface defines the events that the system accepts (here in Java):

interface ITelephoneState {
    void pickUp();
    void hangUp();
    void pressKey(int key);
}

As said earlier, we want the state class to be explicit about the system behavior, but without involving implementation details. Instead of letting the state class contain implementation details, we delegate to an action interface ITelephoneAction (shown later). Let’s implement the OffHookWaitForFirstDigitState class:

class OffHookWaitForFirstDigitState implements ITelephoneState {
    OffHookWaitForFirstDigitState(ITelephoneAction action) {
        this.action = action;
    }
    public void pickUp() { /* do nothing */ }
    public void hangUp() {
        action.stopTone();
        action.changeState(new OnHookState(action));
    }
    public void pressKey(int key) {
        action.stopTone();
        action.changeState(new OffHookWaitForMoreDigitsState(action));
    }
    private ITelephoneAction action;
}

We see that the state conforms to the ITelephoneState interface. ITelephoneAction is defined like this:

interface ITelephoneAction {
    void stopTone();
    // ... more telephone specific functions here ...
    void changeState(ITelephoneState newState);
}

Thus, the responsibility of the state class is to implement the behavior (what the system does). The responsibility of the action class is to provide the implementation (how things are done). This makes the state code very easy to read.

The ITelephoneAction interface is normally implemented by the Telephone class:

class Telephone implements ITelephoneAction {
    private ITelephoneState state;

    Telephone() {
        state = new OnHookState(this);  // start state
    }

    // public interface
    public void pickUp() { state.pickUp(); }
    public void hangUp() { state.hangUp(); }
    public void pressKey(int key) { state.pressKey(key); }

    // implements ITelephoneAction
    public void stopTone() { /* do something */ }
    public void changeState(ITelephoneState newState) {
        state = newState;
    }
}

Note that the state never talks directly to the Telephone class. This ensures that the state uses only what’s needed from Telephone, and not the full range of public functions in Telephone. Furthermore, talking to an interface will allow you to unit test the logic of the state machine without using Telephone. To summarize the interactions between Telephone and its state: the Telephone class talks to the state class through the ITelephoneState interface; the state class talks to the Telephone class through the ITelephoneAction interface.
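For example, such a test could use a hand-rolled fake for ITelephoneAction. The sketch below repeats the interfaces and the state from the text so it is self-contained; the RecordingAction class and the stub states are my own additions:

```java
import java.util.ArrayList;
import java.util.List;

interface ITelephoneState {
    void pickUp();
    void hangUp();
    void pressKey(int key);
}

interface ITelephoneAction {
    void stopTone();
    void changeState(ITelephoneState newState);
}

// Stub states, simplified for the sketch.
class OnHookState implements ITelephoneState {
    OnHookState(ITelephoneAction action) {}
    public void pickUp() {}
    public void hangUp() {}
    public void pressKey(int key) {}
}

class OffHookWaitForMoreDigitsState implements ITelephoneState {
    OffHookWaitForMoreDigitsState(ITelephoneAction action) {}
    public void pickUp() {}
    public void hangUp() {}
    public void pressKey(int key) {}
}

// The state under test, as in the text.
class OffHookWaitForFirstDigitState implements ITelephoneState {
    private final ITelephoneAction action;
    OffHookWaitForFirstDigitState(ITelephoneAction action) {
        this.action = action;
    }
    public void pickUp() { /* do nothing */ }
    public void hangUp() {
        action.stopTone();
        action.changeState(new OnHookState(action));
    }
    public void pressKey(int key) {
        action.stopTone();
        action.changeState(new OffHookWaitForMoreDigitsState(action));
    }
}

// The fake: records every action call so a test can assert on the behavior
// of the state machine without ever touching the real Telephone class.
class RecordingAction implements ITelephoneAction {
    List<String> calls = new ArrayList<>();
    public void stopTone() { calls.add("stopTone"); }
    public void changeState(ITelephoneState newState) {
        calls.add("changeState:" + newState.getClass().getSimpleName());
    }
}
```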

Trace Logging

As an added bonus, having a clear separation between state, event and action will make it easy for you to implement nice trace logs. If we trace every state change, event function call and action function call, we can use indentation to show the flow through the state machine:

OnHookState
    pickUp                       // event
        changeState              // action
OffHookWaitForFirstDigitState    // new state
    pressKey
        stopTone
        changeState
OffHookWaitForMoreDigitsState
    ...

As you see, states are not indented, events are indented one level and actions resulting from the event are indented two levels.

Design Principles by Example: Talk to an Interface or an Abstraction?

What is the relation between design principles “Talk to an interface, not an implementation” and “Talk to an abstraction, not a concrete”? When you apply them, you want to achieve different goals.

Two important design principles for writing good software are

  1. “Talk to an interface, not an implementation” and
  2. “Talk to an abstraction, not a concrete”.

Admittedly, they sound very much alike, so what is the difference between them?

Assume you are writing a client implementation that needs to communicate with a server somewhere. You have chosen to use a web socket for sending messages over the wire. To that end, you will use a class ClientWebSocketSenderImplementation. Now, the “Talk to an interface, not an implementation” design principle suggests that talking to the web socket implementation class directly is inappropriate. Instead, you should talk to an interface ClientWebSocketSender.

Following the first design principle has several upsides. First, it will make your code easier to test. In this case, using the web socket implementation directly would use the network. That would make your unit tests slow and unreliable and might require a complicated setup phase. Second, talking to the web socket implementation directly would couple your client code to that specific implementation. If needed, changing web socket implementations would be difficult. Also, your code would not be reusable without shipping the web socket implementation.

We have chosen to send messages using a web socket. But there is really no need for our client application to know how messages are sent over the network. The second design principle says “Talk to an abstraction, not a concrete”. The concrete here is a web socket. A suitable abstraction in this context could be the ability to send messages over the web without specifying how. So we introduce an interface ClientWebSender. Depending on the application, we could take it one step further. It might make sense to abstract away the fact that we’re sending messages over the internet (for example, it could be over an IPC channel). We would end up with an interface ClientSender.
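The ladder of abstractions might be sketched like this (the interface names come from the text; the method signature and bodies are my guesses):

```java
// Most abstract: just the ability to send a message, no mention of how.
interface ClientSender {
    void send(String message);
}

// One step more concrete: we are sending over the web.
interface ClientWebSender extends ClientSender {
    // web-specific additions could go here
}

// The concrete implementation hides behind the abstractions.
class ClientWebSocketSenderImplementation implements ClientWebSender {
    String lastSent;  // stand-in for real web socket I/O in this sketch

    public void send(String message) {
        // a real version would frame the message and write it to a web socket
        lastSent = message;
    }
}
```

Client code that only refers to ClientSender never needs to change when the transport does.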

The second design principle will make your application more resilient to change. Without the abstraction, the web socket details might propagate throughout your code. For example, functions, arguments and return types could be specific to web sockets. If you would like to change your application to support message sending over e.g. HTTP or your own proprietary protocol over TCP, you would have to chase down all references to web sockets.

Last, the “Talk to an abstraction, not a concrete” principle does not require us to talk to an interface. You might have a class that represents the “abstraction” part and hides the “concrete” part (e.g. by delegating to a web socket implementation). So our two design principles serve different purposes and do not necessarily overlap. That said, they work very well in combination to write decoupled, testable and change-resilient software.

Object-Orientation is not Really About Objects

How do we write good object-oriented programs? Despite the name, it is not by focusing on objects.

Object-oriented (OO) programming presents several concepts to the programmer not present in a procedural language. Given the name “object-orientation”, objects are clearly a central concept. Objects are the basic building blocks of our program, and they carry out all the important work. This should be contrasted with procedural programs, where functions are global and data is passed as arguments.

From time to time, you come across OO programs that have a procedural flavor to them. There are few, if any, interfaces, and classes are primarily used to organize related functions with data. Admittedly, this is a slight administrative improvement over procedural programming with structs and global functions. The problem is, OO programs like this have the same problem as procedural programs: coupling.

Martin Fowler illustrates the concept very well in his article Design Principles and Design Patterns (Figure 2-17 on page 13). Assume you have a main function. Main calls three other functions, which in turn call two other functions. Main now depends on all its descendants! High-level logic should not depend on low-level implementation.

The same applies to the “procedural style OO program” described above. Main might use three objects and call a function in each. Those functions might call two functions on some other classes and so forth. Main will depend on all classes and functions. Before you know it, main depends on your whole system.

The point is, even though objects do all the important work, they do not take us very far when trying to write good, decoupled software. Instead, have a look at Figure 2-18 in the article. Assume that main calls functions on three different interfaces. Main would not know or care which class implements each interface. Main depends only on the interfaces.
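A tiny sketch of the idea (the Source and Sink interfaces are hypothetical names of my own):

```java
// Main depends only on two interfaces, not on the classes behind them.
interface Source { String read(); }
interface Sink { void write(String s); }

class Main {
    static void run(Source in, Sink out) {
        // high-level logic, ignorant of which implementations it is given
        out.write(in.read().toUpperCase());
    }
}
```

Any class (or lambda) can stand behind Source and Sink; Main never knows or cares.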

Thus, when writing object-oriented software, think primarily about abstractions and interfaces. After that, you can think about objects.

Decoupling Starts with an Interface, but Where Does It End?

Writing decoupled software components means writing them so that they depend on each other as little as possible. If one of your classes uses another class directly, they will inevitably be coupled together. Use one, and you will have to use the other. To decouple components, we normally think of letting our classes talk to interfaces instead of concrete classes. Any subclass can hide behind the interface, so the coupling between the classes is reduced. One good example of “talking to interfaces” is the Factory method design pattern. Instead of using new on a concrete class, you ask an interface to create the object for you. But decoupling is more than just talking to interfaces.

When I first started programming in Java, I came across Spring and its inversion of control (IoC) container. Compared to the Factory method above, Spring IoC takes object creation decoupling a step further. Using XML files, you specify which objects to create and which arguments to give to their constructors (e.g. other objects). Coming from C++, it was pure magic the first time I saw it. Nowhere in the code do your classes refer to each other. Now, that is decoupling!

Later, I started working with OSGi. It is a Java framework which allows for independent life-cycles of modules (called bundles). For example, you could replace a bundle with a newer version during run-time. Other bundles won’t notice they are communicating with something new. Bundles communicate with each other by publishing and consuming services, which are just plain interfaces. XML is used to specify what services are published and consumed by a bundle. Similar to above, individual modules never refer to each other in the code. Again, that is decoupling.

Web APIs, such as a REST or SOAP API over HTTP, are often used to provide access to web or cloud services. Also, more and more enterprises expose their internal sub-systems as web services. To assemble software using web APIs, you can combine software components that may execute anywhere in the world. The pattern here seems to be that the larger the software component, the more decoupling is possible (or required?). Components decoupled through interfaces need to be compiled together. Components decoupled through web APIs need not even be in the same time zone. The downside is that more powerful decoupling techniques require a lot of work.

Combining decoupling on different levels will make your system more resilient to change and open up for use in new and unexpected ways. The next time you think of ways to make your code more loosely coupled, remember that decoupling is not only about interface classes!

Test-Driven Development Done Right

A couple of years ago, I had at least two misconceptions about Test-Driven Development: (1) you should write all your tests up-front and (2) after verifying that a test case can fail, you should make it pass right away. To better understand TDD, I got a copy of the book “Growing Object-Oriented Software Guided by Tests” by Freeman and Pryce (now one of my absolute favorites). Although the book does a great job explaining the concepts, it took me ten chapters to admit I had been wrong. Let’s never do that again. :)

Let me walk you through my misconceptions so that you don’t have to repeat my mistakes:

Misconception 1: write all tests up-front. Thinking about potential test cases up-front is not a bad thing. It will exercise your imagination and with some luck, many of them will still be applicable after the code is written. But don’t waste energy trying to compile an exhaustive list of tests. At least for me, this approach didn’t work since my imagination appears too limited to come anywhere near the final list. But most of all, I wanted to get going writing some test code!

You are better off writing a few happy-path test cases. Filling in the test code will get you started working on the user interface of your classes. When the test code starts acting as a “user” of your interface, it will be obvious to you whether the API is okay or awkward to work with. The tests will drive you to improve your user interface. Creating the user interface will invariably make you think about error cases and how the API can be abused or misunderstood. You will come up with more test cases, and implement these. With some effort, but surprisingly little so, you will grow your test suite.

Misconception 2: fail the test, then make it pass right away. When you have written your test code, filled in the production code to get it all to compile and seen the test fail, it is very tempting to just fix things. Make the changes to have the test pass. Actually, you can do that. But there are at least two reasons not to.

First, I strongly prefer an incremental approach. I fix only the problem reported by the test! If the test says “null pointer exception”, I will fix it. Running the test again, you will get another failure and fix that. This is the convenient/lazy approach, you just let the test drive you. Also, it will result in minimal increments, which is very helpful if another test case would break while changing the production code.

Second, fixing the failed test case right away will throw away a lot of information in the process. When a test fails, it provides you with valuable information on what went wrong. If you cannot immediately understand what the problem is, maybe you should improve your test or production code? For example, if you get a “null pointer exception”, maybe error handling or an assert earlier in the production code could make sure your program never gets into that kind of corrupted state. Alternatively, you could improve your test code with all kinds of helpful diagnostics. The idea is, if it takes time for you to understand what went wrong today, imagine how much time will be wasted when the same test case fails in six months. “Growing Object-Oriented Software Guided by Tests” says you should “let the tests talk to you”. There you go, you are test-driven.

Edge-To-Edge Unit Tests

The term unit test implies at least two different things. First, it means testing your code at the smallest unit, which is a function or a class (well, technically, you test the methods of a class, but they would make little sense in isolation). Second, it means writing test code in a language to test functionality in the same language.

Normally, when I write C++ code to test my C++ functionality, I tend to stay away from the “unit” level. Instead, I like tests that exercise the system edge-to-edge, resembling the interactions with the outside world as much as possible. Now, full edge-to-edge testing is normally not possible since the peripheral parts of a system are often hard to control. For example, let’s say I have a system with a network on one side and a GUI on the other side. A realistic test case would have traffic over the network and a GUI reflecting that. But taking the network as an example, it complicates testing due to issues like slow response times, the need for a remote side and network failures. So I settle for testing up to the interfaces of the network and GUI: you would inject network “traffic” (or time-outs) on the network interface, verify that the GUI interface is told to show something, do some user input on the GUI interface and watch outgoing network traffic being generated.
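A hypothetical sketch of such a test setup in Java (the original is about C++, and the port interfaces here are my own invention): the real core logic sits between fake network and GUI boundary interfaces.

```java
import java.util.ArrayList;
import java.util.List;

interface NetworkPort { void onPacket(String packet); }  // traffic into the system
interface GuiPort { void show(String text); }            // what the GUI is told

class Core implements NetworkPort {
    private final GuiPort gui;

    Core(GuiPort gui) {
        this.gui = gui;
    }

    public void onPacket(String packet) {
        // application logic under test, exercised edge-to-edge:
        // a packet arrives at one edge, the GUI reacts at the other
        gui.show("received: " + packet);
    }
}
```

A test injects “traffic” on NetworkPort and asserts on what reaches GuiPort, with no real network or GUI involved.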

As with everything, there are pros and cons when testing at this level. To me, the main benefits compared to low-level unit tests are:

  • Having tests at that level gives me confidence that there’s a reasonably low probability of faulty interaction with the outside world.
  • The boundaries of your system are much less likely to change than the internals. This means you are less likely to spend time changing your tests.
  • It is easy to argue for the business value of the tests. They correspond well to what the customer expects and are used to guarantee the quality.
  • The tests are a decent measure of progress. Having a passing test means you are close to something to show to your customer.
  • It’s fun! And you can test-run your system before you have a network and a GUI.

The downsides I’ve experienced compared to testing at the unit level are:

  • Tests like these can make it harder to achieve decent code coverage. For example, your code might involve randomness, timing issues or use of the current time. You will have to make sure these can be controlled from the test context.
  • High-level testing can be hard to introduce late in the development process. For this to succeed, the whole system must be designed for testability. See the previous item.
  • The tests become monolithic. I’ve come across the situation where parts of my system were broken out to form a new shared component. The new component has to have tests of its own (or someone changing it won’t notice it’s broken until they run your tests). Your tests use the classes of your system, which are not suitable to use in a shared component since dependencies would go the wrong way.
  • It might be overkill for testing some parts of the system. For example, if you have some deep-down string manipulation code, you should go ahead and unit test it (in the true sense of the word). It’s all about choosing the proper tools for the problem.
  • Due to complexity in the lower levels of your software, you might be facing a combinatorial explosion of different test cases. You will have to select a few representative test cases and resort to normal unit testing to test the low-level parts. See the previous item.
  • Testing on this level poses sort of a communication problem. If I call my tests “unit tests”, most people think only of tests on the lowest level. If I call them “acceptance tests” or “functional tests”, someone will inevitably assume I have properly tested the system from the outside, edge-to-edge (which is definitely necessary, even with the tests described above). Calling them e.g. “functional unit tests” only adds to the confusion. (“What do you mean? Is it a unit test? Is it a functional test? Surely, it can’t be both.”) If you know of terminology that could help, let me know. Until I hear from you, I will just call them Edge-to-Edge Unit Tests.

As I said before, it’s about choosing the proper tool for your problem. If at all possible, I resort to “unit testing” at the highest possible level. If you haven’t done so, you should give it a try.