Handling Concurrency

posted on

We all hear about how we need to make our applications more concurrent. There is no longer a free lunch for software developers, where processor cores will get faster, giving us a performance boosts for free. Instead we need to try and run more code in parallel.

Unfortunately we also hear about how hard concurrency is. It isn't something you can just throw in easily. Concurrency is hard to conceptualise and incredibly hard to debug given the time-sensitive nature of many bugs. So you have to deal with locks to try and prevent things like race conditions. Basically it's a scary ball of complexity that keeps most developers away.

Thankfully, it doesn't have to be quite so scary. If you look at concurrency you realise that it isn't a multitude of problems but one problem that causes all the hurt: data mutability. And by thinking about concurrency in a new way you can simplify and to a large degree eliminate these problems, removing a lot of the need for things like locks in your own code, which can cause performance issues.

Data Mutability

So what is the problem with data mutability and concurrency? Well let me give a simple example. Imagine the following block of code:

int i = 0;

void increment() {
    int locali = i;
    locali += 1;
    i = locali;

A simple enough thing, every time you call increment the value of i is incremented. So if we were to run it twice and then print out i we would get 2. Now lets say we run it twice concurrently, we could very well get 2. But we could also occasionally get 1 as the output. Why is this? Well lets see what happens with the two function calls "inc1" and "inc2" in serial and parallel.

When run serially they run like so:

  • inc1: store i in locali
  • inc1: increment locali
  • inc1: set i to locali
  • inc2: store i in locali
  • inc2: increment locali
  • inc2: set i to locali

However, when run in parallel this might happen:

  • inc1: store i in locali
  • inc1: increment locali
  • inc2: store i in locali
  • inc1: set i to locali
  • inc2: increment locali
  • inc2: set i to locali

The second call of increment starts before the first one has finished, after locali has been incremented, but before i has been set to the new value. This means that when the second call gets the value of i, it is still 0.

The Solution

So the issue here is reading and writing data in variables being out of sync. When you start reading to or writing from a variable on multiple threads all hell breaks loose. There are 2 simple solutions to this problem though.

The first is to get rid of variables. This is an approach taken up in functional languages. Functional languages are very big on immutability of data. Take an array for example. You wouldn't append or insert an item into the existing array, but create a new array containing the appended or inserted item. This immutability means that you can't have two different threads modifying one object at the same time, as no thread can modify it. This eliminates so many issues with concurrency, but we can't just remove mutability from every language/API as it is still very useful in many situations.

The second is to conceptualise a background task as a black box. You pass in some data, it works on that data and it returns some new data. It knows nothing about the rest of your application and does not mutate any application data. All data reading and writing is done in chunks. Essentially you follow 3 steps:

  1. Read all the data you need
  2. Work on that data
  3. Write the result of the work

1 and 3 would usually be on the main thread and 2 is what happens on the background thread. This is basically how things such as pixel shaders work on graphics cards. When you write a shader you are writing step 2. As it is self contained, many instances can be run concurrently very easy with no fear of conflicts.

Take for example inverting the colours of an image, where every pixel is modified independently. So lets assume that operation takes 0.01 seconds per pixel and we have a 10,000 pixel image. If we did it sequentially it would take 100 seconds, but if we had 100 cores we can do the same task in 1 second. And you don't need to write any special code to handle concurrency as each shader instance only cares about the single pixel it is supplied.

Concurrent Cocoa

So how do you handle this in Cocoa? Well thankfully we have a class that is designed this very task: NSOperation. NSOperation is essentially a class for creating a self contained work unit. Operations are put on a queue which then handles all the implementation details of how many threads to run and which thread to run the operation on. Much like the pixel shader example above, it abstracts away many of the implementation details. A simple example of this is below. This uses NSData but it can be any immutable data type (and before you say anything, I'm using Garbage Collection in this example, so NO it does not leak):

NSData *someData = [iVarData copy];
[myOperationQueue addOperationWithBlock:^{
    NSData *newData = //result of some processor intensive action with someData;
    [[NSOperationQueue mainQueue] addOperationWithBlock:^{
        iVarData = [newData mutableCopy];

What we've done is copied our mutable ivar into a local variable. This is then used within the outer block to perform some intensive work and the result it put into newData. Once it is done we then put a new block on the main queue which will set the value of the ivar to our new data. Very simple and doesn't require any sort of locks in your own code.

Now this example can still cause issues, but it is easier to manage them. For example, while it eliminates the problem of potentially reading in invalid data, if you added the operation twice they wouldn't necessarily run one after another. Thankfully NSOperation allows you to make it so they do, by making one operation depend on another finishing. You can also subclass NSOperation to allow for more complex operations, which I would recommend if you need to pass a lot of data in.

Concurrency isn't simple, but it needn't be very hard either. By conceptualising background code as black boxes, which know about nothing but the data you pass in, and by making sure that data is immutable, you can remove a massive amount of issues. By working this way you can eliminate almost all locks in your own code as you will never have two items reading or writing a value at the same time. It also simplifies your code as you can draw the lines between what runs in the background and what on the main thread at object boundaries.

So just remember these few rules:

  1. Only pass immutable data across threads
  2. Treat background tasks as black boxes
  3. Only mutate data on a single thread (usually the main thread)

Follow those and most of your concurrency problems vanish. It doesn't work for every situation, but for most of the concurrent code an application developer will need to write it works fine.