Abstraction vs Compression

In our daily communication you might hear things like “higher level of abstraction”, but what is abstraction? And how does it relate to compression?

Wikipedia says this about Abstraction: “Abstractions may be formed by reducing the information content of a concept or an observable phenomenon, typically to retain only information which is relevant for a particular purpose.” For example, a ball is an abstraction of a football and other types of balls. Abstraction in software development is about removing details from something concrete; there’s a loss of information. For example, a class for sending packets over a TCP socket is a concrete concept. An abstraction could be a Network interface with a send function. In the abstraction, we remove the details of exactly how the data is sent.

Compression, on the other hand, is about hiding information. Although not visible, the information is there and can be retrieved if necessary (like unzipping compressed content). One example is procedural programming: Reading a function should give you a good picture of what the function does. If you need more information, you can always go into the called functions for details. No information is lost.

Using an interface in object-oriented programming introduces an abstraction. At runtime, in the general case, you cannot know which class implements an interface. You lose information. This can make your object-oriented code hard to understand, review etc. Some may argue that this is not a problem: If your interfaces are clear, it does not matter who implements them. It should be sufficient to know that the implementing class carries out the work according to the specification of the interface (honoring the Liskov Substitution Principle). Still, the information loss can be a challenge.

Other uses of the word “abstraction” in software development may appear when people mention things like “programming at a higher level of abstraction”. The Network interface from above is a good example. It is on a higher level of abstraction than the more low-level TCP implementation (which itself hides raw socket operations). What if we had a concrete Network class with a send function that itself implements the TCP socket sending? Is this an abstraction? The public functions of Network hide the details of TCP packet sending etc. By going into the Network class, we can retrieve all details of exactly how packets are sent. Information is hidden, but not lost. If no information is lost, this is rather “programming at a higher level of compression”. :)
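To make the distinction concrete, here is a minimal Java sketch. All names (INetwork, TcpNetwork, the framing helper) and the use of String for the data are my assumptions for illustration, and the "TCP sending" is only simulated with a list. The interface is the abstraction: a caller holding an INetwork cannot see how data is sent. The concrete Network class is the compression: the details are hidden behind send, but you can open the class and read them.

```java
import java.util.ArrayList;
import java.util.List;

// Abstraction: the interface says nothing about how the data is sent.
interface INetwork {
    void send(String data);
}

// One possible implementation. A caller that only holds an INetwork
// cannot tell that this is the class doing the work.
class TcpNetwork implements INetwork {
    final List<String> sentPackets = new ArrayList<>();

    @Override
    public void send(String data) {
        // Stand-in for real TCP socket code (assumption for this sketch).
        sentPackets.add("TCP:" + data);
    }
}

// Compression: a concrete class whose public function hides the details.
// Nothing is lost; reading the class reveals exactly what happens.
class Network {
    final List<String> sentPackets = new ArrayList<>();

    void send(String data) {
        sentPackets.add(frame(data));
    }

    // The hidden detail, retrievable by going into the class.
    private String frame(String data) {
        return "TCP:" + data;
    }
}
```

Code that calls `new Network().send("hello")` can always be traced into frame. Code that calls `send` on an INetwork parameter cannot, unless you also know how the object was wired up.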

After putting you through this, I must say that in our daily communication, we (myself included) don’t pay much attention to the distinction between abstraction and compression: “abstraction” normally means any kind of information hiding or removal. But it’s useful to know the difference since it affects the understanding and readability of your code.

The Diamond Shape

How can your classes talk to each other if they don’t know of each other? Hint: a diamond might come in handy.

Revisiting a topic from my KTH lecture in October, I wanted to discuss the “diamond shape”.

Imagine you want to write a simple networking application. You have at least two responsibilities represented: first, an Application class is responsible for implementing the application logic. Second, a Network class is responsible for implementing sending and receiving packets over the network. As an example, let’s assume that when the user presses a button, the Application needs to tell the Network class to send some data. When a response is received over the network, the Network class needs to tell the Application class that some data was received. Each class needs to talk to the other class.

The interaction between Application and Network requires bi-directional communication. Despite this, we don’t want the two classes to refer directly to each other, since this would lead to problems (tight coupling, poor testability, unnecessary knowledge in the Network class about the full public API of Application etc.). Thus, we want bi-directional communication between two classes, but they mustn’t know of each other or depend on each other. Quite a dilemma.

To resolve the dilemma, we introduce two interfaces. The Network class implements an interface INetworkSender (with a function sendData(data)). When the Application wants to send something, it calls a function in the INetworkSender interface which in turn ends up in the Network class. Conversely, we let Application implement INetworkReceiver (with a function onDataReceived(data)). When a network packet arrives, the Network class calls a function in the INetworkReceiver interface which ends up in Application.

What is important here is the way the Application class communicates with the Network class and vice versa. If you draw a diagram of the two classes and two interfaces, the picture resembles a diamond shape. The right-hand path is where Application talks to Network (top to bottom) through INetworkSender. The left-hand path is where Network talks to Application (bottom to top) through INetworkReceiver. In general, one path is for function calls in one direction while the other path is in the other direction.
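The four corners of the diamond can be sketched in Java roughly like this. The interface and function names come from the text; the String data type, the constructor and setter wiring, and the packetArrived function that simulates an incoming packet are assumptions made for this sketch.

```java
// Right-hand path: Application -> INetworkSender -> Network
interface INetworkSender {
    void sendData(String data);
}

// Left-hand path: Network -> INetworkReceiver -> Application
interface INetworkReceiver {
    void onDataReceived(String data);
}

class Application implements INetworkReceiver {
    private final INetworkSender sender;   // Application never sees Network
    String lastReceived;

    Application(INetworkSender sender) {
        this.sender = sender;
    }

    void onButtonPressed() {
        sender.sendData("ping");
    }

    @Override
    public void onDataReceived(String data) {
        lastReceived = data;
    }
}

class Network implements INetworkSender {
    private INetworkReceiver receiver;     // Network never sees Application
    String lastSent;

    void setReceiver(INetworkReceiver receiver) {
        this.receiver = receiver;
    }

    @Override
    public void sendData(String data) {
        lastSent = data;                   // stand-in for writing to a socket
    }

    // Simulates a packet arriving from the network (hypothetical helper).
    void packetArrived(String data) {
        if (receiver != null) {
            receiver.onDataReceived(data);
        }
    }
}

class DiamondDemo {
    public static void main(String[] args) {
        // Only the wiring code knows both concrete classes.
        Network network = new Network();
        Application app = new Application(network);
        network.setReceiver(app);

        app.onButtonPressed();             // top to bottom
        network.packetArrived("pong");     // bottom to top
        System.out.println(network.lastSent + " / " + app.lastReceived);
    }
}
```

Note that neither Application nor Network mentions the other by name. Only the wiring code in main knows both.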

The diamond shape decouples our classes despite bi-directional communication. This is a powerful thing and it’s widely used when building large software systems. In particular, it is used in the Layers architectural pattern. We keep the upper layers unaware of lower layers to obtain a flexible and reusable solution.

The decoupling of Application and Network helps us keep the application logic separate from the networking. It also gives us testability. If we let the Application class talk to a mock class instead of Network, we can verify the application logic fast and reliably, without complicating things with a real network. Also, with a clear separation, it is easier to spot where application logic depends on network behavior.
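As a rough illustration of the testability point, here is a hand-written mock in Java. The login rule in Application, the message formats and the MockNetworkSender class are invented for this sketch; only the two interface names come from the text.

```java
import java.util.ArrayList;
import java.util.List;

interface INetworkSender {
    void sendData(String data);
}

interface INetworkReceiver {
    void onDataReceived(String data);
}

// Application logic under test. The "login" behavior is made up for the example.
class Application implements INetworkReceiver {
    private final INetworkSender sender;
    boolean loggedIn;

    Application(INetworkSender sender) {
        this.sender = sender;
    }

    void onLoginButtonPressed(String user) {
        sender.sendData("LOGIN " + user);
    }

    @Override
    public void onDataReceived(String data) {
        loggedIn = data.equals("OK");
    }
}

// Mock that records what the application tried to send. No real network involved.
class MockNetworkSender implements INetworkSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void sendData(String data) {
        sent.add(data);
    }
}

class ApplicationTest {
    public static void main(String[] args) {
        MockNetworkSender mock = new MockNetworkSender();
        Application app = new Application(mock);

        app.onLoginButtonPressed("alice");
        app.onDataReceived("OK");          // the test plays the role of the network

        System.out.println(mock.sent + " loggedIn=" + app.loggedIn);
    }
}
```

The test drives both paths of the diamond itself: it inspects what went out through INetworkSender and feeds the response in through INetworkReceiver.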

Object-Oriented Programming Lecture at KTH: Slides etc.

Thank you to all who participated in my lecture at KTH on October 26, 2011. I had a lot of fun, and we had some good discussions. For those of you who were not there, it was about object-oriented programming and how to write good code. We used some example code to discuss OO principles and testing. Here’s the presentation (pdf)!

A couple of posts of relevance:

Lecture on Object-Oriented Programming at KTH October 26, 2011

I will be giving a lecture at KTH (The Royal Institute of Technology) in Stockholm on Wednesday, October 26, 2011, 13.00-15.00 in lecture hall K2.

The lecture will be on object-oriented programming and lessons learned. I will address the constraints we face in programming (e.g. cost/quality/time-to-market) and give examples of what we do in object-oriented programming to face our challenges (things like design principles, decoupling and test-driven development).

The lecture is free and open to all, but the number of seats is limited. The lecture will be held in Swedish. Send me an email (johnny@johnnybigert.se) if you plan to attend!

Software Development is a Strange Profession

Software development is similar to other creative professions in many ways. For example, working with buildings as an architect involves planning, recognition and application of well-known patterns, problem solving and so forth. But software development is different from most other professions in one very important way.

Imagine you are a painter, an architect or a musician looking for a job. What kind of information would you supply in your job application? Obviously, you would show proof of your skills (photographs of paintings or buildings or recordings of music). As a software developer, what do you show? An application?

Assume it’s a very nice application (or a very bad one). Sure, you can draw some conclusions from the externally observable quality and behavior of an application. But development is normally a team effort: your code might be perfect while others’ isn’t, and vice versa… Showing an application also assumes it can be shown. Your stuff might be deeply integrated in a backend server somewhere, or worse.

A friend of mine brought up the example of an electrician. After a job well done, the wiring is normally not visible, somewhat like the programmer’s. Nevertheless, the electrician can still take photos before walls cover up their work. Unfortunately, the code you write is normally a well-kept secret of the company you work for (and might be off-limits in terms of photography :). The problem is, as a software developer you have nothing to show! This is not only a problem for you, but also for people trying to hire you.

As I’ve said before, your CV does not say much, since experience is not really an indicator of quality code. Admittedly, there are ways to prove your skills: doing a test task, participating in open source software development, creating a portfolio of sample code to show etc. But this doesn’t change the fact that software development is a strange profession. Can you think of a profession that has the same problem?

Software Architecture Built to Survive Change

How do you create an object-oriented architecture that can survive change? We discuss how to isolate the parts of your system that never change, while still making it easy to add new functionality.

What is it that makes some software architectures fragile, while others are solid as a rock? Recently, I have come across a very appealing idea from multiple sources. First, a colleague of mine brought it up (hi, D!), then I read about it in the book “Lean Architecture for Agile Software Development” by James Coplien and Gertrud Bjørnvig. The book introduced me to the DCI software architecture, which I might cover in more detail in a later post (what I describe here is just parts of it). Learning a new idea is like learning a new word: once you know it, you see it everywhere.

Let’s walk through a simple (and highly contrived :) example. Assume we have a bank system with users, accounts, transactions, transaction logs etc. We have a use case to implement: pay interest to a user’s account. Without giving it much thought, we could add a function addInterest to the Account class which calculates and deposits interest and updates the transaction log. Given a few hundred of these types of features, your Account class will be cluttered with functions. As you can imagine, this will eventually lead to poor readability, increased risk of breaking working code and problems parallelizing work. Over time, your code might become complex enough to prohibit any kind of progress. Actually, this is the path many projects take.

Let’s try a different take. In the terminology of the “Lean Architecture” book, we want to separate “what the system is” (e.g. accounts, users) from “what the system does” (e.g. pay interest, transfer money). We create classes Account, TransactionLog etc. and define simple interfaces IAccount and ITransactionLog. The interfaces expose only administrative operations like getters/setters and add/remove functions (but no behavior or use case logic!). Among other functions, IAccount might expose addMoney while ITransactionLog might expose addLogEntry. We then implement our use cases as “algorithms” that operate on the interfaces. For example:

interface IUseCase
{
    void execute();
}

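class PayInterestUseCase implements IUseCase
{
    private final IAccount account;
    private final ITransactionLog transactionLog;

    PayInterestUseCase(IAccount account,
        ITransactionLog transactionLog)
    {
        this.account = account;
        this.transactionLog = transactionLog;
    }

    // Interface functions are public, so the implementation must be too.
    public void execute()
    {
        // calculateInterest is not shown in this post.
        double interest = calculateInterest(account);
        account.addMoney(interest);
        transactionLog.addLogEntry(account.getId(),
            interest);
    }
}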

Later, when we add a “money transfer” use case, we already have the addMoney and addLogEntry functions. We create a new use case class TransferMoneyUseCase. We withdraw money from one account (e.g. by providing a negative number to addMoney), deposit the money in the other account and update the log. The Account/TransactionLog objects and interfaces do not change at all, but we can still extend the system! After a few use cases, we have a set of stable interfaces on which we can build almost anything. Changes and additions to the interfaces will be fewer and fewer.
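A possible shape for the money transfer use case is sketched below. The text only names addMoney, addLogEntry and getId, so the exact signatures, the String id, the getBalance getter, and the simple Account and TransactionLog implementations are assumptions for illustration. I use double for amounts to match the snippet above; a real bank system would probably want a decimal type.

```java
import java.util.ArrayList;
import java.util.List;

interface IUseCase {
    void execute();
}

// "What the system is": administrative operations only, no use case logic.
interface IAccount {
    String getId();
    double getBalance();
    void addMoney(double amount);
}

interface ITransactionLog {
    void addLogEntry(String accountId, double amount);
}

class Account implements IAccount {
    private final String id;
    private double balance;

    Account(String id, double balance) {
        this.id = id;
        this.balance = balance;
    }

    @Override public String getId() { return id; }
    @Override public double getBalance() { return balance; }
    @Override public void addMoney(double amount) { balance += amount; }
}

class TransactionLog implements ITransactionLog {
    final List<String> entries = new ArrayList<>();

    @Override
    public void addLogEntry(String accountId, double amount) {
        entries.add(accountId + ":" + amount);
    }
}

// "What the system does": a new use case, built on the unchanged interfaces.
class TransferMoneyUseCase implements IUseCase {
    private final IAccount from;
    private final IAccount to;
    private final double amount;
    private final ITransactionLog transactionLog;

    TransferMoneyUseCase(IAccount from, IAccount to, double amount,
                         ITransactionLog transactionLog) {
        this.from = from;
        this.to = to;
        this.amount = amount;
        this.transactionLog = transactionLog;
    }

    @Override
    public void execute() {
        from.addMoney(-amount);            // withdraw via a negative amount
        to.addMoney(amount);
        transactionLog.addLogEntry(from.getId(), -amount);
        transactionLog.addLogEntry(to.getId(), amount);
    }
}
```

Adding this class required no change to Account, TransactionLog or their interfaces, which is the whole point of the separation.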

To me, the concept of separating domain objects (what-the-system-is) from business logic (what-the-system-does) is very appealing. One changes slowly, if at all. The other will evolve and change all the time. I like it. What drawbacks/benefits do you see? Have you tried it? Leave your comments below.

My Favorite Interview Question

Do you interview to hire a software developer? Maybe you’re being interviewed. In any case, this is my favorite question to determine how skilled you are.

What is the ultimate interview question? What question reveals whether or not a person is the one you’re looking for? Clearly, it depends on what you want. But even if you have a job advertisement outlining very specific technical skills, I argue in my previous post that what you first and foremost need is someone who has a good sense of software structure. In object-orientation, structuring a program is really about understanding and applying object-oriented design principles. Now that we know what skills we’re looking for, it is easier to formulate a question.

I like evidence-based interviews where we discuss what the potential hire did in the past. Thus, my favorite question is “Can you give me an example of good object-oriented code from your working career?” Sometimes I even add “it doesn’t have to be code that you wrote”. You would certainly expect a professional programmer to be able to recall at least one good example from his working career. The point is not really to test whether or not he can remember code he wrote or read. Rather, it is to make sure he has an opinion. But most importantly, to have something to discuss. Anyone can do namedropping, so we need a concrete piece of code or a drawing to talk about. The upside with something that I didn’t choose is that the problem and solution domain (its terminology, background etc.) is known to my potential hire. That makes the interview situation less stressful. And I get to see if he is good at communicating ideas.

Surprisingly, this seems to be a tough question and I sometimes have to settle without an answer. Maybe it’s the pressure of the interview situation (hey, maybe it’s me!) but it reflects badly on the interview subject. If he can’t recall or doesn’t have an opinion on what is good code, how could I expect him to write any? So I have to move on. “How about bad examples of object-oriented programming?” This is a little bit easier. Everybody has opinions on other people’s code, and we can pick up from there: “Why is it bad? How can it be improved? What is the difference between bad and good?”. With luck, we get some examples to discuss.

I recently looked for decent examples as a basis for discussions around object-oriented programming. I came across this blog post by Reginald Braithwaite discussing the game of Monopoly. The main ideas are that Monopoly is well-known, but still complex enough to require the interview subject to discuss requirements. It also has some non-local properties, e.g. you can only build a house if you own all streets in that neighborhood. I like it. If the above fails, it will open up the discussion on object-oriented design principles in a nice way.

I think I will take a summer break from the writing now. A few weeks at least. See you!

Skills That Will Get You Hired

What is it that companies look for when they hire programmers? Surprisingly, they probably want something other than what’s in the job ad.

Job advertisements often focus on technologies (network programming, Windows programming, XML, you name it) or expert skills in a specific language. As a consequence, most CVs focus on the same thing. This is only natural, since people want to get hired. With otherwise equally strong candidates, the technology skills listed in the CV might affect who’s hired. But highlighting technology skills in a job ad is probably misguided, since I think these skills are not what companies really need. Let me explain what I mean.

In order to get a software system to do what we want, we need the above-mentioned technology knowledge. So, it’s definitely required, don’t get me wrong. But for the project to be successful over time, technologies are worth nothing unless the structure of the code is good enough for maintenance. (Somebody said that your code is in maintenance mode after the first line of code has been checked in. There is some truth in that.)

Also, acquiring a new piece of technology is not rocket science. Give it a couple of months, and you will have good enough knowledge. You will not become an expert, but there is no need for everyone to be an expert. To me, the skill to properly structure software is an essential one. Nevertheless, it is a rare one. Could it be that it is a skill that is hard to acquire? My point is this: avoid hiring people for knowledge that can be taught without making sure they have what is essential. More specifically, what is that?

Here it is: I value people who write code that is testable, modular, readable and extensible. Without testability, I dare not touch your code. So I would have to throw it all away and start over, risk breaking it while making it testable or just dive in with my eyes closed. None of these is acceptable. With good modularity, the worst case scenario is that modules with unreadable code can be replaced or wrapped with tests. Modularity will also help readability and understanding. If you can understand the interfaces, understanding the internals becomes less important. Once we get down to the internals of a module, readability becomes crucial. Only if I can read it (= understand it), can I maintain and extend it. Extensibility allows your software to meet customers’ needs and survive over time.

In object-orientation, there are design principles that will help you write testable, modular, readable and extensible software (see e.g. these three posts). If someone showed up with a good sense of how to write well-structured software using these design principles, he would certainly be a good hire. And conversely, if someone wasn’t convincing enough on his skills in writing structured software, he would be hard to hire. Regardless of his skills in other areas.

Design Patterns By Example: Implementing a State Machine

How do you implement the State pattern, while separating the different concerns? We use an example to discuss how to write code easy to understand and maintain.

When writing code, our classes often go through a series of transformations. What starts out as a simple class will grow as behavior is added. If care is not taken, your code will become difficult to understand and maintain. For example, assume you’re implementing a telephone. First, you support only the simplest of usages: your phone is not connected to the telephone jack. Thus, the phone can be either on-hook or off-hook.

You implement this as a class Telephone with a single boolean member offHook. Your phone has two operations pickUp() and hangUp() which manipulate the boolean. Great. Now you jack up your phone and want to implement the next use-case: the user picks up the phone (after which he should hear a tone) and presses a key (after which the tone should stop). You introduce a new boolean hasPressedFirstKey and a function pressKey(int key).

Boolean-Oriented Programming

Already, you have a couple of problems with your code. First, you have modeled three states (on hook, off hook while waiting for first key to be pressed, off hook while waiting for more keys to be pressed) with two boolean variables. Thus, there is one combination of values that does not correspond to a valid state for your telephone: offHook = false and hasPressedFirstKey = true. The state of a bigger system might involve more variables and might even be scattered over many classes. In that case, “boolean-oriented programming” like this makes the code very hard to understand.
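To see how the invalid combination can sneak in, here is a small Java sketch of the boolean-based telephone. The class name BooleanTelephone, the isInValidState helper and the specific bug (hangUp forgetting to reset a flag) are my assumptions; the two booleans and the three operations are from the text.

```java
// A sketch of "boolean-oriented programming": three states, two booleans.
class BooleanTelephone {
    boolean offHook;
    boolean hasPressedFirstKey;

    void pickUp() {
        offHook = true;
    }

    void hangUp() {
        // Easy mistake: hasPressedFirstKey is not reset here.
        offHook = false;
    }

    void pressKey(int key) {
        if (offHook) {
            hasPressedFirstKey = true;
        }
    }

    // The one invalid combination: on hook, yet a key has "been pressed".
    boolean isInValidState() {
        return offHook || !hasPressedFirstKey;
    }
}
```

After pickUp(), pressKey(5) and hangUp(), the object holds offHook = false and hasPressedFirstKey = true. Nothing in the type system stops this, so every function has to remember to keep the two flags consistent.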

Second, member variables live for as long as the object does. In our telephone example, the variable hasPressedFirstKey makes little sense as soon as the full phone number has been dialed and someone on the other side has answered. So the actual lifetime of hasPressedFirstKey is shorter than its “physical” lifetime. If we strive for self-documenting code, this is pretty far from it. And of course, this is even worse in the case of a complex system. So what can we do about it? State design pattern to the rescue.

The State Design Pattern

The Wikipedia page on the State design pattern says that the purpose of State is to “represent the state of an object”. In our telephony example, we would create three state classes: e.g. OnHookState, OffHookWaitForFirstDigitState, OffHookWaitForMoreDigitsState. State classes will only model valid states of the telephone, removing the first problem from above. Also, since there is always a valid state for our telephone, we remove the lifetime problem from above.

I use the State pattern for two reasons: First, it captures the behavior of the code in a single place. This will make your code easy to understand. Second, it makes it easy for you to separate what the system does (the behavior) from how it’s done (the implementation). This will make your code easy to maintain and test. Let’s go through an example.

States, Events and Actions

We have already mentioned the state classes. They all implement a common interface ITelephoneState. The state interface defines the events that the system accepts (here in Java):

interface ITelephoneState {
    void pickUp();
    void hangUp();
    void pressKey(int key);
}

As said earlier, we want the state class to be explicit about the system behavior, but without involving implementation details. Instead of letting the state class contain implementation details, we delegate to an action interface ITelephoneAction (shown later). Let’s implement the OffHookWaitForFirstDigitState class:

class OffHookWaitForFirstDigitState implements ITelephoneState {
    OffHookWaitForFirstDigitState(ITelephoneAction action) {
        this.action = action;
    }
    // Interface functions are public, so the implementations must be too.
    public void pickUp() { /* do nothing */ }
    public void hangUp() {
        action.stopTone();
        // Each state gets the action interface, as in new OnHookState(this) in Telephone.
        action.changeState(new OnHookState(action));
    }
    public void pressKey(int key) {
        action.stopTone();
        action.changeState(new OffHookWaitForMoreDigitsState(action));
    }
    private final ITelephoneAction action;
}

We see that the state conforms to the ITelephoneState interface. ITelephoneAction is defined like this:

interface ITelephoneAction {
    void stopTone();
    // ... more telephone specific functions here ...
    void changeState(ITelephoneState newState);
}

Thus, the responsibility of the state class is to implement the behavior (what the system does). The responsibility of the action class is to provide the implementation (how things are done). This makes the state code very easy to read.

The ITelephoneAction interface is normally implemented by the Telephone class:

class Telephone implements ITelephoneAction {
    private ITelephoneState state;  // current state

    Telephone() {
        state = new OnHookState(this);  // start state
    }

    // public interface
    public void pickUp() { state.pickUp(); }
    public void hangUp() { state.hangUp(); }
    public void pressKey(int key) { state.pressKey(key); }

    // implements ITelephoneAction (interface functions must be public)
    public void stopTone() { /* do something */ }
    public void changeState(ITelephoneState newState) {
        state = newState;
    }
}

Note that the state never talks directly to the Telephone class. This ensures that the state uses only what’s needed from Telephone, and not the full range of public functions in Telephone. Furthermore, talking to an interface will allow you to unit test the logic of the state machine without using Telephone. To summarize the interactions between Telephone and its state: the Telephone class talks to the state class through the ITelephoneState interface; the state class talks to the Telephone class through the ITelephoneAction interface.
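Here is a sketch of what unit testing the state logic without Telephone could look like. The two interfaces and the state class names are from the post. The bodies of OnHookState and OffHookWaitForMoreDigitsState are not shown in the post, so what I wrote for them is a guess, and RecordingAction is a hypothetical test double.

```java
import java.util.ArrayList;
import java.util.List;

interface ITelephoneState {
    void pickUp();
    void hangUp();
    void pressKey(int key);
}

interface ITelephoneAction {
    void stopTone();
    void changeState(ITelephoneState newState);
}

class OnHookState implements ITelephoneState {
    private final ITelephoneAction action;
    OnHookState(ITelephoneAction action) { this.action = action; }
    public void pickUp() { action.changeState(new OffHookWaitForFirstDigitState(action)); }
    public void hangUp() { /* do nothing */ }
    public void pressKey(int key) { /* do nothing */ }
}

class OffHookWaitForFirstDigitState implements ITelephoneState {
    private final ITelephoneAction action;
    OffHookWaitForFirstDigitState(ITelephoneAction action) { this.action = action; }
    public void pickUp() { /* do nothing */ }
    public void hangUp() {
        action.stopTone();
        action.changeState(new OnHookState(action));
    }
    public void pressKey(int key) {
        action.stopTone();
        action.changeState(new OffHookWaitForMoreDigitsState(action));
    }
}

class OffHookWaitForMoreDigitsState implements ITelephoneState {
    private final ITelephoneAction action;
    OffHookWaitForMoreDigitsState(ITelephoneAction action) { this.action = action; }
    public void pickUp() { /* do nothing */ }
    public void hangUp() { action.changeState(new OnHookState(action)); }
    public void pressKey(int key) { /* collect more digits; omitted */ }
}

// Test double standing in for Telephone: records every action call.
class RecordingAction implements ITelephoneAction {
    final List<String> calls = new ArrayList<>();
    ITelephoneState current;

    public void stopTone() { calls.add("stopTone"); }

    public void changeState(ITelephoneState newState) {
        calls.add("changeState:" + newState.getClass().getSimpleName());
        current = newState;
    }
}
```

A test creates a RecordingAction, hands it to the state under test, fires an event such as pressKey(5), and then checks that calls contains stopTone followed by the expected state change.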

Trace Logging

As an added bonus, having a clear separation between state, event and action will make it easy for you to implement nice trace logs. If we trace every state change, event function call and action function call, we can use indentation to show the flow through the state machine:

OnHookState
    pickUp                       // event
        changeState              // action
OffHookWaitForFirstDigitState    // new state
    pressKey
        stopTone
        changeState
OffHookWaitForMoreDigitsState
    ...

As you see, states are not indented, events are indented one level and actions resulting from the event are indented two levels.
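One simple way to get that indentation is a small helper with one function per level. The TraceLog class below is a hypothetical sketch, not part of the code above. It keeps the lines in memory so they are easy to inspect; a real version would write to a file or a logging framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one function per kind of trace line.
class TraceLog {
    private static final String INDENT = "    ";
    final List<String> lines = new ArrayList<>();

    // States are not indented.
    void state(String name) { lines.add(name); }

    // Events are indented one level.
    void event(String name) { lines.add(INDENT + name); }

    // Actions resulting from an event are indented two levels.
    void action(String name) { lines.add(INDENT + INDENT + name); }

    String dump() { return String.join("\n", lines); }
}

class TraceDemo {
    public static void main(String[] args) {
        TraceLog trace = new TraceLog();
        trace.state("OnHookState");
        trace.event("pickUp");
        trace.action("changeState");
        trace.state("OffHookWaitForFirstDigitState");
        System.out.println(trace.dump());
    }
}
```

In the telephone example, Telephone would be a natural place for the calls: event(...) in pickUp, hangUp and pressKey before delegating to the state, action(...) in stopTone and changeState, and state(...) at the end of changeState. The state classes themselves stay free of logging code.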

Replay: Recreate Every Single Bug

Do you ever spend lots of time trying to understand and recreate a bug scenario? Is there a bullet-proof way to reproduce every single bug? By logging the right information, it should be possible.

Reproducing bugs in complex systems is often hard. Even if the use case that caused the bug is explained in detail, this may not help much. Other factors, such as database state, configuration, timers, network activity and randomness, affect code execution. Without exact knowledge of these factors, you cannot determine which path was taken through your code. Often, debugging information is written to a trace log file to mitigate the problem. However, this is intrusive and clutters the code. Furthermore, by the time a bug is found in a live system, it is too late to add more tracing to the code. We need something else.

As said above, in order to reproduce a bug, we need to know exactly which path was taken through our code. In order to do that, we need to know exactly what decision is taken for every possible branch (if, for, switch etc.). In certain circumstances, this might actually be possible.

Recording the Input

First, let’s assume we have a module without concurrent data access (either single-threaded, by explicit synchronization or by design). Second, we identify incoming events to the system, such as calls on a public API, GUI user interaction or network packets. They are the entry points to your code. Third, we identify all other external sources of data that might affect the execution. For example, calling a readFromDatabase function will return some data. The use of this data will affect the code execution. Thus, we treat all reads from the database as input to our module. The same goes for configuration, random values and all the other factors mentioned earlier. Combined, we think of the incoming events and the data from external functions as the input to our module.

Fourth, after identifying all module input, we introduce a mechanism to eavesdrop on the incoming data. For each event (e.g. callback userClickedButton or call to a public API function), we store the function arguments to a file. Let’s call this file the interaction log. Similarly, for each external function call (such as a database read), we store the return value or exception thrown.

Replaying From File

In order to reproduce the scenario of the bug, we replay the events from the file by injecting them as function calls into the module. The module will execute its code until it reaches an external function call such as readFromDatabase. Instead of calling the function, we retrieve the return value (or exception) from the file and return (or throw) that instead.

Now, there are a couple of challenges when implementing this approach. First, we’ve restricted ourselves to code that never accesses data concurrently. There’s probably nothing we can do about this. Anyway, from my experience, it is better to organize the software to avoid these concurrency problems since the alternative is just too painful (see post on concurrency).

Second, we want the calls from our module to an external function (e.g. readFromDatabase) to either call a real function (e.g. in the database implementation) or to replay from file. Obviously, we don’t want our module to be aware whether we are replaying or not. In object-orientation, we can achieve this by implementing an interface. The module talks to the interface, and behind the interface, there’s either a real entity (the database) or something that replays from file. Thus, all calls from your code to external functions must go through an interface, and never to a concrete implementation directly.
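A hand-written version of this idea might look like the Java sketch below. Only readFromDatabase is named in the text; the IDatabase interface, its String signature, the two implementations and the GreetingModule are assumptions for illustration, and the "file" is replaced by an in-memory queue to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// The module only ever sees this interface.
interface IDatabase {
    String readFromDatabase(String key);
}

// Stand-in for the real database implementation.
class InMemoryDatabase implements IDatabase {
    private final Map<String, String> rows = new HashMap<>();

    void put(String key, String value) { rows.put(key, value); }

    @Override
    public String readFromDatabase(String key) { return rows.get(key); }
}

// Replays previously recorded return values, in the order they were recorded.
class ReplayDatabase implements IDatabase {
    private final Deque<String> recorded;

    ReplayDatabase(String... recordedValues) {
        this.recorded = new ArrayDeque<>(Arrays.asList(recordedValues));
    }

    @Override
    public String readFromDatabase(String key) {
        // The key is ignored: during replay only the recorded order matters.
        return recorded.removeFirst();
    }
}

// The module under investigation. It cannot tell which implementation it got.
class GreetingModule {
    private final IDatabase database;

    GreetingModule(IDatabase database) { this.database = database; }

    String greet(String userId) {
        return "Hello, " + database.readFromDatabase(userId);
    }
}
```

Running `new GreetingModule(new ReplayDatabase("Ada")).greet("u1")` takes the same path through greet as the original run that read "Ada" from the real database.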

Third, external function calls (such as a database read) have arguments. What if the external function decides to manipulate one of the arguments? For example, it could call a setter method or change a public member. We would have a real problem. The replayed execution would not call the setter method, and the system state would not be equivalent to the bug scenario. Now, to me, manipulating the arguments of a function is poor style. Doing it in an API (for e.g. a database system) is even worse. So hopefully, situations like these are rare. But when they do arise, we will have to restructure our code slightly if we want to be able to replay.

Implementing the Replay Functionality

Java has a very handy mechanism to support the implementation of the replay functionality: reflection. We might have a large number of interfaces through which we call external functions. Nevertheless, using reflection, we can create a single wrapper class that can handle all interfaces. Let’s call it LoggingWrapper. We would create an instance of LoggingWrapper and supply it with an instance of the real class (e.g. the database implementation). We would give the LoggingWrapper object to our module, and our module would think it is talking to the real entity (the database). When our module calls an external function (e.g. readFromDatabase), the LoggingWrapper would forward the function call to the real entity (database) and then log the return value (or exception thrown) to file. If we don’t want to log to file, we would not create a LoggingWrapper. Thus, we would not suffer any performance penalty.
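One way to build such a wrapper is with Java’s dynamic proxies (java.lang.reflect.Proxy and InvocationHandler). The sketch below is only my take on it, under some assumptions: the wrap factory function, the IDatabase interface and the InMemoryDatabase stand-in are made up for the example, and the interaction log is an in-memory list rather than a file.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.List;

// One wrapper class that can sit in front of any interface.
class LoggingWrapper implements InvocationHandler {
    private final Object real;
    private final List<Object> log;    // stand-in for the interaction log file

    private LoggingWrapper(Object real, List<Object> log) {
        this.real = real;
        this.log = log;
    }

    // Returns an object that implements iface, forwards every call to real,
    // and records the outcome of each call.
    static <T> T wrap(Class<T> iface, T real, List<Object> log) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                new LoggingWrapper(real, log));
        return iface.cast(proxy);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        try {
            Object result = method.invoke(real, args);
            log.add(result);           // record the return value
            return result;
        } catch (InvocationTargetException e) {
            log.add(e.getCause());     // record the exception the real entity threw
            throw e.getCause();
        }
    }
}

// Hypothetical external interface and a stand-in for the real implementation.
interface IDatabase {
    String readFromDatabase(String key);
}

class InMemoryDatabase implements IDatabase {
    @Override
    public String readFromDatabase(String key) {
        if (key == null) {
            throw new IllegalArgumentException("key must not be null");
        }
        return "row-for-" + key;
    }
}
```

The module would receive `LoggingWrapper.wrap(IDatabase.class, new InMemoryDatabase(), log)` instead of the database itself. A real implementation would also have to serialize the logged values to file, which this sketch skips.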

When we want to replay from file, we create a ReplayWrapper and give that to our module. From some other class (e.g. EventReplayer), we would read an event and its arguments (e.g. “the user clicked button X”) from the file and call the corresponding function on the module. When an external function is called (e.g. readFromDatabase), the ReplayWrapper would read a return value (or exception) from file and return it (or throw the exception). As an extension, we could also verify the integrity of the interaction log while replaying. The downside is that this requires some extra information to be recorded in the log (such as the full name and argument values of each function call). An integrity check would be able to detect a number of things, such as if the arguments to an external function call differ from when the log was written.
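The replay side can be sketched with a dynamic proxy as well. As before, this is an illustration under assumptions: the create factory function and the IDatabase interface are hypothetical, the recorded outcomes come from an in-memory list instead of a file, and there is no integrity check.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Iterator;
import java.util.List;

// Hypothetical external interface, as seen by the module.
interface IDatabase {
    String readFromDatabase(String key);
}

// Answers every call on the interface with the next recorded outcome.
class ReplayWrapper implements InvocationHandler {
    private final Iterator<Object> recorded;

    private ReplayWrapper(List<Object> recorded) {
        this.recorded = recorded.iterator();
    }

    static <T> T create(Class<T> iface, List<Object> recorded) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                new ReplayWrapper(recorded));
        return iface.cast(proxy);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // Throws NoSuchElementException if the module makes more calls than were recorded.
        Object next = recorded.next();
        // Simplification: a recorded Throwable is always thrown, never returned.
        if (next instanceof Throwable) {
            throw (Throwable) next;
        }
        return next;
    }
}
```

With a recorded list containing "Ada" followed by an IllegalStateException, the first call to readFromDatabase on the proxy returns "Ada" and the second throws the exception, regardless of the arguments. An integrity check as described above would compare method.getName() and args against extra information stored with each log entry.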

You could imagine implementing the replay functionality per module. But a replay implementation seems complex enough to be non-trivial. A general-purpose Replay framework would be useful. For fun, I have started sketching one. Time will tell if/when it will be in good enough shape to be released to the public. All design/code/idea contributions are welcome. :)