Strands, Concurrency and Message Passing

When writing multi-threaded programs, we need to protect data from concurrent access from multiple threads. Using locks for synchronization makes software development difficult, with run-time problems such as deadlocks and live-locks. There are also design problems: locks break modularity since they do no obey module boundaries. It is very possible that a lock taken in one module will cause a deadlock in another. This might require exposing locks that should have been private (for an example, see article Beautiful Concurrency).

One solution to the above-mentioned problems is simply to avoid thread-shared data. Message passing is one way to achieve this. It is typically implemented as active objects, where each module would have a thread of its own with a synchronized queue. MessageOne downside compared to using threads-and-locks is that return values often need to be asynchronous, which makes the source code harder to follow. Also, given your computer’s number of cores and hardware threads, excessive numbers of concurrent threads will result in poor performance due to context switching and page faults. We would like a more optimal number of threads, while still keeping the message passing paradigm.

Some time ago, I came across the concept of strands: message passing that uses a fixed number of threads while still supporting a very large number of message receiver objects (called strands, which means thread contexts). Conceptually, you can think of a strands handler as a set of message receiver objects, a job queue and a fixed set of worker threads. When a worker thread becomes available, the first job in the queue will execute. Executing the job means passing a message to a receiver. The job contains the message and an receiver identifier. The identifier uniquely identifies one of the message receiver objects in the strands handler. Typically, “passing a message to a receiver” means executing a member function on the message receiver object with some parameters. When the execution of the member function finishes, the thread will be free for new jobs.

The important thing is to guarantee that you do not read/write the same data at the same time from multiple threads. To achieve this, the strand handler makes sure the message receiver object is assigned to at most one thread at a time. This also means the message receiver objects should not be accessible outside of the strand handler.

The C++ Boost library has an implementation of strands in asio. Unfortunately, I don’t know of a strands library for Java. Let me know if you know of one!

 

Update (May 7, 2011): A colleague of mine made me aware of the concept of an actor, a concept very similar to a Strand, but more elegant I must say. And there are libraries for Java, e.g. GPars (see also example code in Dr Dobb’s).