Java concurrency with volatile

One of the main assumptions we developers often times make is that programs have some sort of sequential consistency[0]. This might be probably due to the fact that it’s not only easier for us humans to think in terms of consistent sequential steps, but also because we may have learnt it that way during college or at work.

Although we are constantly challenged with concurrent daily tasks, the first thoughts when developing an algorithm or a solution are similar to “we begin with step 1, then step 2, then if this happens, do step 3 or go to step 4, then end”.

However, reality differs from that, and any programming language that wants to offer the possibility to execute concurrent code will have to deal with it. Typically, the way this mechanism works is defined in the language memory model specifications.

Enough philosophical thoughts. A couple of days ago I reviewed some code that was supposed to be multi-threaded. It was indeed multi-threaded, however, when I saw the volatile keyword, it rang a bell. So, I tried to understand how the code was supposed to work. In the end, it turned out to introduce a race condition, which means that the concurrent code was incorrect (and obviously difficult to test, still, it doesn’t mean it won’t happen) and its output depended on the timing of events. It was quite similar to the following snippet (instead of using a thread pool, the code was using Akka actors):

If you try to run it locally, you’ll get always different results. For example:

Exception in thread "pool-1-thread-7" Finished all threads
1
8
7
6
4
2
9
3
5java.lang.ArrayIndexOutOfBoundsException: 33

10
11
at java.util.ArrayList.add(ArrayList.java:459)
12
at VolatileRaceCondition$MyRunnable.run(VolatileRaceCondition.java:43)13

14 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

15
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)16

17
at java.lang.Thread.run(Thread.java:745)18

19
20
21
22
23
25
24
26
28
27
29
31
32
33
34
null
36
35
37
38
39
40
..

The multi-threaded code is not correct, therefore the output is non-deterministic. In fact, we have N threads and each of them tries to execute some actions on a (volatile) static variable. The Java language specification for the volatile modifier says:

The Java programming language allows threads to access shared variables (§17.1). As a rule, to ensure that shared variables are consistently and reliably updated, a thread should ensure that it has exclusive use of such variables by obtaining a lock that, conventionally, enforces mutual exclusion for those shared variables.
The Java programming language provides a second mechanism, volatile fields, that is more convenient than locking for some purposes.
A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable (§17.4).

The way the Java Memory Model does it is 1) by not caching the variable in such a way that it can’t be seen outside the thread (for a quicker access), and 2) by not reordering the operations on that variable (this is done by compilers as an optimization step).

This means that volatile variables are just about visibility, not atomicity – you need locking for mutual exclusivity and read-modify-write operations. The code above will call the method ArrayList.add() concurrently, each time potentially with different state – multiple threads are reading, modifying and writing the arraylist, therefore, each thread may be given control before or after any of these operations (which is the reason why we have an ArrayIndexOutOfBoundsException):

    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

If we were able to have a data structure that allowed to add an element atomically, volatile would probably be OK, even if more threads were operating on it concurrently, because writing means performing a “single operation”, without any dependency on the current value.

One alternative, if we want to use an existing collection and depending on the use case, would be to use a CopyOnWriteArrayList, that provides a thread-safe ArrayList that doesn’t need any external synchronization.

When to use volatile variables?

As a rule of thumb, follow what B. Goetz writes[1]:

You can use volatile variables only when all the following criteria are met:

• Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;

• The variable does not participate in invariants with other state variables; and

• Locking is not required for any other reason while the variable is being accessed.

Lessons learnt:

  • When you see concurrency/multi-threading related code in code reviews, pay more attention and try to understand what it does.
  • Thread-safety is a hard topic, and even experienced developers may lack some of the deep understanding it takes to design thread-safe code. Ask questions, do some research, try to understand as much as you can. Thread-related bugs are really hard to find, especially because of the syndrome: “it won’t happen here, we are not having the scale of Google“.

External References

[0]: Java Concurrency in practice, p. 338, B. Goetz.

[1]: Java Concurrency in practice, p. 39, B. Goetz.

[2]: Java 8 Memory Model

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.