Behavior of memory barrier in Java

Question

After reading more blogs and articles, I am now really confused about how loads and stores behave before and after a memory barrier.

The following are two quotes from Doug Lea, taken from one of his articles clarifying the JMM. Both are very straightforward:

Anything that was visible to thread A when it writes to volatile field f becomes visible to thread B when it reads f.
Note that it is important for both threads to access the same volatile variable in order to properly set up the happens-before relationship. It is not the case that everything visible to thread A when it writes volatile field f becomes visible to thread B after it reads volatile field g.

But then, when I looked into another blog post about memory barriers, I found these:

A store barrier, “sfence” instruction on x86, forces all store instructions prior to the barrier to happen before the barrier and have the store buffers flushed to cache for the CPU on which it is issued.
A load barrier, “lfence” instruction on x86, forces all load instructions after the barrier to happen after the barrier and then wait on the load buffer to drain for that CPU.

To me, Doug Lea's clarification is stricter than the other one: it means that if the load barrier and the store barrier are on different monitors, data consistency is not guaranteed. The latter one, however, means that data consistency is guaranteed even if the barriers are on different monitors. I am not sure whether I understand these two correctly, and I am also not sure which of them is correct.

Consider the following code:

  public class MemoryBarrier {
    volatile int i = 1, j = 2;
    int x;

    public void write() {
      x = 14; //W01
      i = 3;  //W02
    }

    public void read1() {
      if (i == 3) {  //R11
        if (x == 14) //R12
          System.out.println("Foo");
        else
          System.out.println("Bar");
      }
    }

    public void read2() {
      if (j == 2) {  //R21
        if (x == 14) //R22
          System.out.println("Foo");
        else
          System.out.println("Bar");
      }
    }
  }

Let's say one writer thread, TW1, first calls MemoryBarrier's write() method, and then two reader threads, TR1 and TR2, call MemoryBarrier's read1() and read2() methods. Assume this program runs on a CPU that does not preserve ordering (x86 does preserve ordering in such cases, so assume it is not x86). According to the memory model, there will be a StoreStore barrier (let's say SB1) between W01 and W02, as well as two LoadLoad barriers, one between R11 and R12 and one between R21 and R22 (let's say RB1 and RB2).

Since SB1 and RB1 are on the same monitor, i, thread TR1, which calls read1, should always see 14 in x, so "Foo" is always printed.
SB1 and RB2 are on different monitors. If Doug Lea is correct, thread TR2 is not guaranteed to see 14 in x, which means "Bar" may be printed occasionally. But if memory barriers work the way Martin Thompson describes in the blog, where the store barrier pushes all data to main memory and the load barrier pulls all data from main memory into the cache/buffer, then TR2 is also guaranteed to see 14 in x.

I am not sure which one is correct, or whether both are and what Martin Thompson describes applies only to the x86 architecture. In that case, the JMM would not guarantee that the change to x is visible to TR2, but the x86 implementation would.

Thanks~

You shouldn't care about memory barriers on x86. The semantics of Java and the Java Memory Model are defined on an abstract machine. That's the only thing that matters. The Java runtime takes care, that the guarantees made by the abstract machine are fulfilled during runtime. — nosid
As a matter of fact the x86 semantics (including the cache coherency) are stronger than what the jmm demands, but there's no reason for you to care about any of that if you're not working on a java runtime for x86 as nosid correctly points out. — Voo
Your concern is valid. It is /possible/ that reader 2 would print Bar. However, unless the reader threads had previously interacted with the memory barrier class and cached the value of x, reader 2 will print foo because it will be accessing x for the first time. The interaction with write means the change to x will be visible. Perhaps a more interesting test is to have read1 and read2 execute both before and after W1. — Brett Okken
CountDownLatch introduces an additional synchronization. So, if you use CountDownLatch to make sure, that read2 is executed after write, then read2 will always print "Foo". — nosid
@asticx: The answer is: "Bar" is obviously possible, if there is no synchronization between write and read2, and it is not possible, if there is a synchronization. I guess that's not what you are interested in. So, please provide more information what you really want to know. — nosid

nosid · Accepted Answer · 2014-06-28T18:33:12

Doug Lea is right. You can find the relevant part in §17.4.4 of the Java Language Specification:

§17.4.4 Synchronization Order

[..] A write to a volatile variable v (§8.3.1.4) synchronizes-with all subsequent reads of v by any thread (where "subsequent" is defined according to the synchronization order). [..]

The memory model of the concrete machine doesn't matter, because the semantics of the Java programming language are defined in terms of an abstract machine, independent of the concrete machine. It is the responsibility of the Java runtime environment to execute the code in a way that complies with the guarantees given by the Java Language Specification.
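
As a rough illustration of the synchronizes-with rule quoted above, here is a minimal sketch in which the reader waits on the same volatile variable that the writer writes. The class name SameVolatileDemo and the helper runOnce() are hypothetical and not part of the question's code. Because the volatile read of i that observes 3 comes after the volatile write in the synchronization order, the earlier plain write to x is guaranteed to be visible.

```java
class SameVolatileDemo {
    int x;                  // plain field, as in the question
    volatile int i = 1;     // the one volatile variable both threads use

    void write() {
        x = 14;             // ordinary write
        i = 3;              // volatile write: synchronizes-with later reads of i
    }

    int readAfterFlag() {
        while (i != 3) {    // volatile read of the same variable
            Thread.yield(); // keep polling until the volatile write is observed
        }
        return x;           // ordered after the write to x via happens-before
    }

    // Hypothetical driver: runs the writer and the reader on separate threads.
    static int runOnce() {
        SameVolatileDemo d = new SameVolatileDemo();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = d.readAfterFlag());
        Thread writer = new Thread(d::write);
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();  // join() makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 14
    }
}
```

If the reader polled a different volatile variable instead, no such ordering with the write to x would be established by the specification.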

Regarding the actual question:

If there is no further synchronization, the method read2 can print "Bar", because read2 can be executed before write.
If there is an additional synchronization with a CountDownLatch to make sure that read2 is executed after write, then method read2 will never print "Bar", because the synchronization with CountDownLatch removes the data race on x.
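
The CountDownLatch case could look roughly like the sketch below. It assumes a latch field named written is added to the question's class (a hypothetical name, as are LatchedMemoryBarrier and runOnce()), and it returns the string instead of printing it so the result can be checked. Actions before countDown() happen-before a successful return from await() in another thread, so read2 sees x == 14 even though it only reads the unrelated volatile j.

```java
import java.util.concurrent.CountDownLatch;

class LatchedMemoryBarrier {
    volatile int i = 1, j = 2;
    int x;                                                 // plain field, as in the question
    final CountDownLatch written = new CountDownLatch(1);  // hypothetical addition

    void write() {
        x = 14;               // W01
        i = 3;                // W02
        written.countDown();  // happens-before any await() that returns afterwards
    }

    String read2() throws InterruptedException {
        written.await();      // blocks until write() has counted down
        if (j == 2) {         // R21: a different volatile than the one written
            return (x == 14) ? "Foo" : "Bar"; // R22
        }
        return "none";
    }

    // Hypothetical driver: starts the reader first, then the writer.
    static String runOnce() {
        LatchedMemoryBarrier mb = new LatchedMemoryBarrier();
        String[] result = new String[1];
        Thread reader = new Thread(() -> {
            try {
                result[0] = mb.read2();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread writer = new Thread(mb::write);
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();    // join() makes result[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints Foo
    }
}
```

Here the visibility guarantee comes from the latch, not from the volatile read of j.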

Independent volatile variables:

Does it make sense that a write to a volatile variable does not synchronize-with a read of any other volatile variable?

Yes, it makes sense. If two threads need to interact with each other, they usually have to use the same volatile variable in order to exchange information. On the other hand, if a thread uses a volatile variable without needing to interact with all other threads, we don't want to pay the cost of a memory barrier.

It is actually important in practice. Consider an example. The following class uses a volatile member variable:

class Int {
    public volatile int value;
    public Int(int value) { this.value = value; }
}

Imagine this class is used only locally within a method. The JIT compiler can easily detect that the object is only used within this method (escape analysis).

public int deepThought() {
    return new Int(42).value;
}

With the above rule, the JIT compiler can remove all effects of the volatile reads and writes, because the volatile variable cannot be accessed from any other thread.

This optimization actually exists in the Java JIT compiler:

src/share/vm/opto/memnode.cpp
