7
votes

The official notes say that

Writing to a volatile field has the same memory effect as a monitor release, and reading from a volatile field has the same memory effect as a monitor acquire.

and

Effectively, the semantics of volatile have been strengthened substantially, almost to the level of synchronization. Each read or write of a volatile field acts like "half" a synchronization, for purposes of visibility.

from here.

Does that mean that any write to a volatile variable makes the executing thread flush its cache into main memory, and every read from a volatile field makes the thread reread its variables from main memory?

I am asking because the very same text contains this statement

Important Note: Note that it is important for both threads to access the same volatile variable in order to properly set up the happens-before relationship. It is not the case that everything visible to thread A when it writes volatile field f becomes visible to thread B after it reads volatile field g. The release and acquire have to "match" (i.e., be performed on the same volatile field) to have the right semantics.

And this statement makes me very confused. I know for sure that it's not true for regular lock acquire and release with the synchronized statement: if some thread releases any monitor, then all changes it made become visible to all other threads (Update: actually not true, see the best answer). There was even a question about it on Stack Overflow. Yet it is stated that, for whatever reason, this is not the case for volatile fields. I can't imagine any implementation of the happens-before guarantee that doesn't make changes visible to other threads, including threads that don't read the same volatile variable. At least, I can't imagine an implementation that doesn't contradict the first two quotes.

Moreover, before posting this question I did some research, and there is, for example, this article, which contains this sentence:

After executing these instructions, all writes are visible to all other threads through cache subsystem or main memory.

The mentioned instructions are the ones executed when a write to a volatile field is made.

So what is that important note supposed to mean? Or am I missing something? Or maybe that note is just plain wrong?

Answer?

After doing some more research, I was only able to find this statement in the official documentation about volatile fields and their effect on changes in non-volatile fields:

Using volatile variables reduces the risk of memory consistency errors, because any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up the change.

from here.

I don't know if that is enough to conclude that the happens-before relation is guaranteed only for threads reading the same volatile. So for now I can only summarize that the results are inconclusive.

But in practice I would recommend assuming that changes made by thread A before it writes to a volatile field are guaranteed to be visible to thread B only if thread B then reads the same volatile field. The above quote from the official source strongly implies that.
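To make that recommendation concrete, here is a minimal sketch of the "matching" release and acquire. The class, field, and method names (VolatileHandoff, data, ready, runOnce) are hypothetical and chosen only for illustration. The writer performs an ordinary write and then a volatile write; the reader polls the same volatile field and only then reads the ordinary field.

```java
// Illustrative sketch only: all names here are hypothetical.
class VolatileHandoff {
    static int data;                // plain, non-volatile field
    static volatile boolean ready;  // the one volatile field both threads use

    // Runs one writer/reader pair and returns the value of data the reader saw.
    static int runOnce() {
        data = 0;
        ready = false;
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin: each iteration is a volatile read of ready (acquire)
            }
            // The volatile read that saw true happens-after the write of 42.
            seen[0] = data;
        });

        Thread writer = new Thread(() -> {
            data = 42;     // ordinary write
            ready = true;  // volatile write (release) publishes the write above
        });

        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

If the reader instead polled some other volatile field g, the Java Memory Model would give no happens-before edge between the two threads, so nothing in the specification would guarantee that the reader sees data == 42, even if a particular JVM and CPU happen to make it visible.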

2
What is guaranteed if two threads synchronize on two different mutexes? - curiousguy
"After we exit a synchronized block, we release the monitor, which has the effect of flushing the cache to main memory, so that writes made by this thread can be visible to other threads." Same link. So this guarantees that after a thread releases a lock, all other threads will see all changes made by that thread. ALL changes, not just the ones made inside the lock. - Nik Kotovski
I'm pretty sure nothing is "flushed" outside the CPU cache in 99% of implementations. - curiousguy
Ok, what am I even quoting? This link is being quoted all over the site, but is it an official JMM description or a description made by some smart yet random people? - Nik Kotovski
@NikKotovski You're quoting something totally valid. Bill Pugh was the lead writer for the JMM redesign in 2004 for Java 5. It was written at a time when cache flushing to main memory was still a thing. curiousguy isn't wrong when saying most implementations no longer flush right to memory. You can read up on cache coherence: a volatile store may simply notify the other CPUs of an update and share the writes with them, in which case a write may never make it directly to memory if the field is updated again. - John Vint

2 Answers

4
votes

You are looking at this from entirely the wrong angle. First you quote the JLS and then you talk about flushing, which would be an implementation detail of that specification. The only thing you need to rely on is the JLS; anything else may be good to know, but it does not prove the specification right or wrong in any shape or form.

And the fundamental place where you are wrong is this:

I know for sure that it's not true for regular lock acquire...

In practice, on x86, you might be right, but the JLS and the official Oracle tutorial mandate that:

When a thread releases an intrinsic lock, a happens-before relationship is established between that action and any subsequent acquisition of the same lock.

Happens-before is established between subsequent actions (read that as two actions, if that is simpler for you). One thread releases the lock and the other acquires it; these are subsequent (release-acquire semantics).

The same thing happens for a volatile: some thread writes to it, and when some other thread observes that write via a subsequent read, happens-before is established.
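As a rough illustration of the release-acquire pairing on the same lock, here is a minimal sketch. The names (SameLockHandoff, lock, data, published, runOnce) are hypothetical and not taken from the question. Both fields are plain, non-volatile fields; the only thing ordering the two threads is that the reader acquires the same monitor the writer released.

```java
// Illustrative sketch only: all names here are hypothetical.
class SameLockHandoff {
    static final Object lock = new Object();
    static int data;           // plain field, only accessed while holding lock
    static boolean published;  // plain field, only accessed while holding lock

    // Runs one writer/reader pair and returns the value of data the reader saw.
    static int runOnce() {
        synchronized (lock) {
            data = 0;
            published = false;
        }
        int[] seen = new int[1];

        Thread writer = new Thread(() -> {
            synchronized (lock) {
                data = 7;
                published = true;
            } // monitor release
        });

        Thread reader = new Thread(() -> {
            while (true) {
                synchronized (lock) { // subsequent acquire of the same monitor
                    if (published) {
                        seen[0] = data; // ordered after the writer's release
                        return;
                    }
                }
                Thread.yield(); // give the writer a chance to take the lock
            }
        });

        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 7
    }
}
```

If the reader synchronized on a different object, the specification would establish no happens-before relationship between the writer's release and the reader's acquire, which is the same "matching" requirement the important note describes for volatile fields.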

2
votes

Does that mean that any write to a volatile variable makes the executing thread flush its cache into main memory, and every read from a volatile field makes the thread reread its variables from main memory?

No, it does not mean that, and thinking that way is a common mistake. All it means is what is specified in the Java Memory Model.

On Intel CPUs there are instructions to flush a cache line, clflush and clflushopt, and it would be extremely inefficient to perform that kind of flush every time a volatile write occurs.

To provide an example, let's take a look at how volatile variables are implemented (for this example) by

Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)

for my Haswell. Let's write this simple example:

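package com.test;

// Class and package names as they appear in the compiler output
public class Volatlee {

    // Volatile counter shared by both threads
    public static volatile long a = 0;

    public static void main(String[] args) {
        // Reader thread: keeps performing volatile reads of a
        Thread t1 = new Thread(() -> {
            while (true) {
                // to avoid DCE (dead code elimination)
                if (String.valueOf(String.valueOf(a).hashCode()).equals(String.valueOf(System.nanoTime()))) {
                    System.out.print(a);
                }
            }
        });

        // Writer thread: keeps performing volatile writes via inc()
        Thread t2 = new Thread(() -> {
            while (true) {
                inc();
            }
        });

        t1.start();
        t2.start();
    }

    // Volatile read followed by a volatile write (not atomic as a whole)
    public static void inc() {
        a++;
    }
}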

I disabled tiered compilation and ran it with the C2 compiler as follows:

java -server -XX:-TieredCompilation -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,*Volatile.inc -jar target/test-0.0.1.jar

The output is the following:

  # {method} {0x00007f87d87c6620} 'inc' '()V' in 'com/test/Volatlee'
  #           [sp+0x20]  (sp of caller)
  0x00007f87d1085860: sub     $0x18,%rsp
  0x00007f87d1085867: mov     %rbp,0x10(%rsp)   ;*synchronization entry
                                                ; - com.test.Volatlee::inc@-1 (line 26)

  0x00007f87d108586c: movabs  $0x7191fab68,%r10  ;   {oop(a 'java/lang/Class' = 'com/test/Volatlee')}
  0x00007f87d1085876: mov     0x68(%r10),%r11
  0x00007f87d108587a: add     $0x1,%r11
  0x00007f87d108587e: mov     %r11,0x68(%r10)
  0x00007f87d1085882: lock addl $0x0,(%rsp)     ;*putstatic a
                                                ; - com.test.Volatlee::inc@5 (line 26)

  0x00007f87d1085887: add     $0x10,%rsp
  0x00007f87d108588b: pop     %rbp
  0x00007f87d108588c: test    %eax,0xca8376e(%rip)  ;   {poll_return}
  0x00007f87d1085892: retq
  ; tons of hlt omitted

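So in this simple example, the volatile write compiles to an ordinary mov followed by a locked instruction (lock addl $0x0,(%rsp)), not to a cache flush. The locked instruction requires the cache line to be in an exclusive state before it can execute (probably sending a read-invalidate signal to other cores if it is not).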