Whilst trying to understand how SubmissionPublisher (source code in Java SE 10, OpenJDK | docs), a new class added to Java SE in version 9, is implemented, I stumbled across a few API calls to VarHandle I wasn't previously aware of:

fullFence, acquireFence, releaseFence, loadLoadFence and storeStoreFence.

After doing some research, especially regarding the concept of memory barriers/fences (I have heard of them previously, yes; but never used them, thus was quite unfamiliar with their semantics), I think I have a basic understanding of what they are for. Nonetheless, as my questions might arise from a misconception, I want to ensure that I got it right in the first place:

  1. Memory barriers are reordering constraints regarding reading and writing operations.

  2. Memory barriers can be categorized into two main categories: unidirectional and bidirectional memory barriers, depending on whether they set constraints on either reads or writes or both.

  3. C++ supports a variety of memory barriers; however, these do not match up one-to-one with those provided by VarHandle. Still, some of the memory barriers available in VarHandle provide ordering effects that are compatible with their corresponding C++ memory barriers.

    • #fullFence is compatible with atomic_thread_fence(memory_order_seq_cst)
    • #acquireFence is compatible with atomic_thread_fence(memory_order_acquire)
    • #releaseFence is compatible with atomic_thread_fence(memory_order_release)
    • #loadLoadFence and #storeStoreFence have no compatible C++ counterpart

The word compatible seems to be really important here, since the semantics clearly differ when it comes to the details. For instance, all C++ barriers are bidirectional, whereas Java's barriers aren't (necessarily).

  4. Most memory barriers also have synchronization effects. These depend on the barrier type used and on the barrier instructions previously executed in other threads. As the full implications of a barrier instruction are hardware-specific, I'll stick with the higher-level (C++) barriers. In C++, for instance, changes made prior to a release barrier instruction are visible to a thread executing an acquire barrier instruction.
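
To make the last point concrete, here is a minimal sketch of how I picture the release/acquire pairing when written with the VarHandle fences. The class FenceHandoff and all of its members are made up for illustration, and the flag is accessed in opaque mode on the assumption that this plays the role of a relaxed atomic in the equivalent C++ fence idiom; the JLS itself does not spell any of this out.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical example class: hands one int from a writer to a reader thread.
final class FenceHandoff {
    private int data;       // plain field, published via the fences below
    private boolean ready;  // flag, only accessed through READY in opaque mode

    private static final VarHandle READY;
    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(FenceHandoff.class, "ready", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void publish(int value) {
        data = value;                 // plain write
        VarHandle.releaseFence();     // earlier loads/stores may not move below this point
        READY.setOpaque(this, true);  // signal the reader
    }

    int awaitValue() {
        while (!(boolean) READY.getOpaque(this)) {
            Thread.onSpinWait();      // spin until the flag becomes visible
        }
        VarHandle.acquireFence();     // later loads/stores may not move above this point
        return data;                  // plain read, ordered after the flag read
    }

    // Runs one handoff and returns what the reader thread observed.
    static int run() {
        FenceHandoff handoff = new FenceHandoff();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = handoff.awaitValue());
        reader.start();
        handoff.publish(42);
        try {
            reader.join();            // join() makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If my understanding is right, the reader can only ever return 42 here, which is the same guarantee the C++ release fence / acquire fence pair would give.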

Are my assumptions correct? If so, my resulting questions are:

  1. Do the memory barriers available in VarHandle cause any kind of memory synchronization?

  2. Regardless of whether they cause memory synchronization or not, what may reordering constraints be useful for in Java? The Java Memory Model already gives some very strong guarantees regarding ordering when volatile fields, locks or VarHandle operations like #compareAndSet are involved.

In case you're looking for an example: The aforementioned BufferedSubscription, an inner class of SubmissionPublisher (source linked above), establishes a full fence in line 1079 (function growAndAdd; as the linked website doesn't support fragment identifiers, just CTRL+F for it). However, it is unclear to me what it is there for.

I've tried to answer, but to put it very simply: they exist because people want a weaker mode than what Java has. In ascending order of strength, these would be: plain -> opaque -> release/acquire -> volatile (sequential consistency). - Eugene

1 Answer

This is mainly a non-answer, really (I initially wanted to make it a comment, but as you can see, it's far too long). It's just that I questioned this myself a lot, did a lot of reading and research, and at this point in time I can safely say: this is complicated. I even wrote multiple tests with jcstress to figure out how they really work (while looking at the generated assembly code), and while some of them somehow made sense, the subject in general is by no means easy.

The very first thing you need to understand:

The Java Language Specification (JLS) does not mention barriers anywhere. For Java, this would be an implementation detail: the specification really acts in terms of happens-before semantics. To be able to properly specify these fences according to the JMM (Java Memory Model), the JMM would have to change quite a lot.

This is work in progress.

Second, if you really want to scratch the surface here, this is the very first thing to watch. The talk is incredible. My favorite part is when Herb Sutter raises his 5 fingers and says, "This is how many people can really and correctly work with these." That should give you a hint of the complexity involved. Nevertheless, there are some trivial examples that are easy to grasp (like a counter updated by multiple threads that does not care about other memory guarantees, but only cares that it is itself incremented correctly).

Another example is when (in Java) you want a volatile flag to tell threads to stop/start. You know, the classical:

volatile boolean stop = false; // one thread writes, one thread reads this

If you work with Java, you would know that without volatile this code is broken (you can read why double-checked locking is broken without it, for example). But do you also know that for some people who write high-performance code this is too much? A volatile read/write also guarantees sequential consistency, which is a strong guarantee, and some people want a weaker version of this.

A thread-safe flag, but not volatile? Yes, exactly: VarHandle::setOpaque/getOpaque.
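
A minimal sketch of such a flag, as far as I understand opaque mode: the class OpaqueStopFlag and its method names are invented for this example, and it relies on my reading that opaque writes become visible to other threads eventually, without ordering any other memory accesses around them.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical example class: a stop flag that is not volatile.
final class OpaqueStopFlag {
    private boolean stop;  // deliberately not volatile

    private static final VarHandle STOP;
    static {
        try {
            STOP = MethodHandles.lookup()
                    .findVarHandle(OpaqueStopFlag.class, "stop", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void requestStop() {
        STOP.setOpaque(this, true);           // opaque write: no ordering with other variables
    }

    boolean stopRequested() {
        return (boolean) STOP.getOpaque(this); // opaque read: cannot be hoisted out of a loop
    }

    // Starts a spinning worker, asks it to stop, and reports whether it terminated.
    static boolean run() {
        OpaqueStopFlag flag = new OpaqueStopFlag();
        Thread worker = new Thread(() -> {
            while (!flag.stopRequested()) {
                Thread.onSpinWait();
            }
        });
        worker.setDaemon(true);               // never keep the JVM alive in this demo
        worker.start();
        flag.requestStop();
        try {
            worker.join(5_000);               // generous timeout for the demo
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + run());
    }
}
```

Note that this only makes the flag itself safe to poll; anything else the writer changed before calling requestStop() is not guaranteed to be visible to the worker, which is exactly what volatile would have added.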

And why might someone need that, you ask? Not everyone is interested in all the changes that are piggy-backed by a volatile.

Let's see how we can achieve this in Java. First of all, such exotic things already existed in the API: AtomicInteger::lazySet. This is unspecified in the Java Memory Model and has no clear definition there; still, people used it (LMAX, afaik, or this for more reading). IMHO, AtomicInteger::lazySet is a VarHandle::releaseFence (or VarHandle::storeStoreFence) followed by a plain write; since Java 9, its Javadoc describes its memory effects in terms of VarHandle::setRelease.


Let's try to answer why someone needs these.

JMM has basically two ways to access a field: plain and volatile (which guarantees sequential consistency). All these methods that you mention are there to bring something in-between these two - release/acquire semantics; there are cases, I guess, where people actually need this.

An even further relaxation from release/acquire would be opaque, which I am still trying to fully understand.
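
To show where these modes surface in the API, here is a small sketch that walks through them from weakest to strongest using the per-mode methods AtomicInteger has had since Java 9. The class name AccessModeLadder is made up, and the demo is single-threaded on purpose: it only shows which method belongs to which mode, not the ordering differences between them (for that you would need something like jcstress).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class: one write/read pair per access mode.
final class AccessModeLadder {
    static int[] run() {
        AtomicInteger value = new AtomicInteger();
        int[] seen = new int[5];

        value.setPlain(1);               // plain: like an ordinary field access
        seen[0] = value.getPlain();

        value.setOpaque(2);              // opaque: coherent per variable, no cross-variable ordering
        seen[1] = value.getOpaque();

        value.setRelease(3);             // release write ...
        seen[2] = value.getAcquire();    // ... paired with an acquire read

        value.lazySet(4);                // documented with the memory effects of setRelease
        seen[3] = value.getAcquire();

        value.set(5);                    // volatile: the strongest mode
        seen[4] = value.get();

        return seen;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

The same modes are available on any field through a VarHandle (setOpaque, setRelease, getAcquire, setVolatile, and so on); AtomicInteger is used here only because it needs no lookup boilerplate.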


Thus, bottom line (your understanding is fairly correct, btw): if you plan to use these in Java, they have no formal specification in the JMM at the moment, so do it at your own risk. If you do want to understand them, their C++ equivalent modes are the place to start.