Java concurrency with volatile

Posted on Nov 3, 2018

One of the main assumptions we developers often make is that programs are sequentially consistent. This is probably because it is easier for us humans to think in terms of consistent sequential steps, and because that is how we learnt it at college or at work.

Although we are constantly challenged with concurrent daily tasks, the first thoughts when developing an algorithm or a solution are similar to “we begin with step 1, then step 2, then if this happens, do step 3 or go to step 4, then end”.

However, reality differs from that, and any programming language that wants to offer concurrent execution has to deal with it. Typically, the way this works is defined in the language’s concurrency (or memory!) model specification.

Enough philosophical thoughts. A couple of days ago I reviewed some code that was supposed to be multi-threaded. It was indeed multi-threaded, but the volatile keyword in it rang an alarm bell, so I tried to understand how the code was supposed to work. In the end, it turned out to contain a race condition: the output of the concurrent code depended on the timing of events, which makes it incorrect. Such a bug is hard to reproduce in a test, but that doesn’t mean it won’t happen. The code was quite similar to the following snippet (the original used Akka actors instead of a thread pool):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import java.util.List;
import java.util.ArrayList;


public class VolatileRaceCondition {
    private static final int NTHREADS = 10;

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(NTHREADS);
        for (int i = 0; i < 100; i++) {
            Runnable worker = new MyRunnable(i);
            executor.execute(worker);
        }
        // Stop accepting new tasks; tasks that were already
        // submitted still run to completion
        executor.shutdown();
        // Wait (up to 60 seconds) until all submitted tasks have finished
        executor.awaitTermination(60, TimeUnit.SECONDS);
        System.out.println("Finished all threads");
        for(Integer i: WithGlobalVariable.nums) {
            System.out.println(i);
        }
    }

    private static class WithGlobalVariable {
        // volatile covers only the nums reference, not the ArrayList's internal state
        public static volatile List<Integer> nums = new ArrayList<Integer>();
    }

    private static class MyRunnable implements Runnable {
        private final int countUntil;

        MyRunnable(int countUntil) {
            this.countUntil = countUntil;
        }

        @Override
        public void run() {
            WithGlobalVariable.nums.add(this.countUntil);
        }
    }
}

If you run it locally, you’ll get different results on each run. For example:

Exception in thread "pool-1-thread-7" Finished all threads
1
8
7
6
4
2
9
3
5java.lang.ArrayIndexOutOfBoundsException: 33

10
11
at java.util.ArrayList.add(ArrayList.java:459)
12
at VolatileRaceCondition$MyRunnable.run(VolatileRaceCondition.java:43)13

14 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

15
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)16

17
at java.lang.Thread.run(Thread.java:745)18

19
20
21
22
23
25
24
26
28
27
29
31
32
33
34
null
36
35
37
38
39
40
..

The multi-threaded code is not correct, so the output is non-deterministic: we have N threads, and each of them calls a method on the object behind a (volatile) static variable without any coordination. The Java Language Specification says about the volatile modifier (§8.3.1.4):

The Java programming language allows threads to access shared variables (§17.1). As a rule, to ensure that shared variables are consistently and reliably updated, a thread should ensure that it has exclusive use of such variables by obtaining a lock that, conventionally, enforces mutual exclusion for those shared variables.

The Java programming language provides a second mechanism, volatile fields, that is more convenient than locking for some purposes. A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable (§17.4).

In practice, the Java Memory Model achieves this in two ways: 1) a write to the variable is never kept somewhere only the writing thread can see it (for example in a register, for quicker access), and 2) the compiler and the CPU may not reorder operations around accesses to that variable, as they otherwise would as an optimization step.
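To make the visibility guarantee concrete, here is a minimal sketch of a case where volatile alone is enough: one thread sets a flag, the other only reads it. The class name VolatileStopFlag, the method runAndStop and the timings are made up for illustration and are not part of the reviewed code.

```java
// Hypothetical example: a stop flag shared between two threads.
final class VolatileStopFlag {
    // Written by one thread, read by another. The write does not
    // depend on the previous value, so no lock is needed.
    private static volatile boolean stopRequested = false;

    // Starts a spinning worker, asks it to stop, and reports
    // whether it terminated within the timeout.
    static boolean runAndStop() {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                // busy-wait; every iteration re-reads the volatile field
            }
        });
        worker.start();
        try {
            Thread.sleep(50);      // let the worker spin for a moment
            stopRequested = true;  // single write, visible to the worker
            worker.join(5_000);    // wait at most 5 seconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            stopRequested = true;  // make sure the worker can still exit
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runAndStop());
    }
}
```

Without volatile on the flag, the JIT compiler would be allowed to hoist the read out of the loop, and the worker might never see the update.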

This means that volatile variables are just about visibility, not atomicity: you need locking for mutual exclusion and for read-modify-write operations. Note also that volatile here applies only to the nums reference, not to the internal state of the ArrayList it points to. The code above calls ArrayList.add() concurrently, and every call reads, modifies and writes the list’s internal state (size and elementData). A thread can be suspended before or after any of these steps, so two threads may read the same size and write to the same slot, or one may write past the end of an array that has not been grown yet. That is why the output above contains a null entry and an ArrayIndexOutOfBoundsException:

    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }
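One way to repair the snippet, sketched here under the assumption that a plain lock is acceptable for the use case, is to guard every access to the list with the same monitor. The class name SynchronizedAdd, the method collect and the lock object are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical fix: all accesses to the list go through one lock.
final class SynchronizedAdd {
    static List<Integer> collect(int nThreads, int nTasks) {
        // The reference never changes, so final is enough; no volatile needed.
        final List<Integer> nums = new ArrayList<>();
        final Object lock = new Object();

        ExecutorService executor = Executors.newFixedThreadPool(nThreads);
        for (int i = 0; i < nTasks; i++) {
            final int value = i;
            executor.execute(() -> {
                // Only one thread at a time runs ArrayList.add()
                synchronized (lock) {
                    nums.add(value);
                }
            });
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Read under the same lock and hand out a private copy
        synchronized (lock) {
            return new ArrayList<>(nums);
        }
    }

    public static void main(String[] args) {
        System.out.println(collect(10, 100).size());
    }
}
```

The order of the elements still depends on scheduling, but no element is lost and no exception is thrown.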

If we had a data structure that could add an element atomically, volatile would probably be OK, even with several threads operating on it concurrently, because each write would be a “single operation” without any dependency on the current value.

One alternative, if we want to use an existing collection and depending on the use case, is CopyOnWriteArrayList, a thread-safe variant of ArrayList that doesn’t need any external synchronization. Every write copies the underlying array, so it fits best when reads far outnumber writes.
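As a rough sketch of that alternative, the same workload can be run against a CopyOnWriteArrayList. The class name CopyOnWriteAdd and the method collect are invented for this example; the thread pool setup mirrors the original snippet.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical variant of the snippet using a thread-safe list.
final class CopyOnWriteAdd {
    static List<Integer> collect(int nThreads, int nTasks) {
        // add() is thread-safe; each call copies the backing array
        final List<Integer> nums = new CopyOnWriteArrayList<>();

        ExecutorService executor = Executors.newFixedThreadPool(nThreads);
        for (int i = 0; i < nTasks; i++) {
            final int value = i;
            executor.execute(() -> nums.add(value));
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return nums;
    }

    public static void main(String[] args) {
        for (Integer i : collect(10, 100)) {
            System.out.println(i);
        }
    }
}
```

With 100 small writes the copying cost is negligible; for write-heavy workloads a lock or a concurrent queue is likely the better choice.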

When to use volatile variables?

As a rule of thumb, follow what B. Goetz writes (2nd reference):

You can use volatile variables only when all the following criteria are met:

  1. Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;

  2. The variable does not participate in invariants with other state variables; and

  3. Locking is not required for any other reason while the variable is being accessed.
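The first criterion is the one that is easiest to violate. The sketch below (the class name CounterComparison and the method run are hypothetical) increments a volatile int and an AtomicInteger from several threads. The volatile increment is a read-modify-write on the current value, so updates can be lost, while the atomic counter always reaches the expected total.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical comparison of a volatile counter and an atomic one.
final class CounterComparison {
    private static volatile int volatileCounter = 0;
    private static final AtomicInteger atomicCounter = new AtomicInteger();

    // Returns { final volatile counter, final atomic counter }.
    static int[] run(int threads, int incrementsPerThread) {
        volatileCounter = 0;
        atomicCounter.set(0);

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    volatileCounter++;               // read, add, write: not atomic
                    atomicCounter.incrementAndGet(); // one atomic operation
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { volatileCounter, atomicCounter.get() };
    }

    public static void main(String[] args) {
        int[] result = run(8, 100_000);
        System.out.println("expected: " + (8 * 100_000));
        System.out.println("volatile: " + result[0]);
        System.out.println("atomic:   " + result[1]);
    }
}
```

How many increments the volatile counter loses depends on the machine and the run; it may even lose none on a lightly loaded system, which is exactly what makes this kind of bug hard to catch.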

Lessons learnt

  • When you see concurrency/multi-threading related code in code reviews, pay more attention and try to understand what it does.
  • Thread-safety is a hard topic, and even experienced developers may lack some of the deep understanding it takes to design thread-safe code. Ask questions, do some research, try to understand as much as you can. Thread-related bugs are really hard to find, especially given the common mindset of “it won’t happen here, we don’t have the scale of Google”.

External References

  1. B. Goetz, Java Concurrency in Practice, p. 338.

  2. B. Goetz, Java Concurrency in Practice, p. 39.

  3. Java 8 Memory Model