2 votes

Background

I have a class whose instances are used to collect and publish data (uses Guava's HashMultimap):

public class DataCollector {

    private final SetMultimap<String, String> valueSetsByLabel
            = HashMultimap.create();

    public void addLabelValue(String label, String value) {
        valueSetsByLabel.put(label, value);
    }

    public Set<String> getLabels() {
        return valueSetsByLabel.keySet();
    }

    public Set<String> getLabelValues(String label) {
        return valueSetsByLabel.get(label);
    }
}

Instances of this class will now be passed between threads, so I need to modify it for thread-safety. Since Guava's Multimap implementations aren't thread-safe, I used a LoadingCache that lazily creates concurrent hash sets instead (see the CacheBuilder and MapMaker javadocs for details):

public class ThreadSafeDataCollector {

    private final LoadingCache<String, Set<String>> valueSetsByLabel
            = CacheBuilder.newBuilder()
            .concurrencyLevel(1)
            .build(new CacheLoader<String, Set<String>>() {
                @Override
                public Set<String> load(String label) {
                    // make and return a concurrent hash set
                    final ConcurrentMap<String, Boolean> map = new MapMaker()
                            .concurrencyLevel(1)
                            .makeMap();
                    return Collections.newSetFromMap(map);
                }
            });

    public void addLabelValue(String label, String value) {
        valueSetsByLabel.getUnchecked(label).add(value);
    }

    public Set<String> getLabels() {
        return valueSetsByLabel.asMap().keySet();
    }

    public Set<String> getLabelValues(String label) {
        return valueSetsByLabel.getUnchecked(label);
    }
}

You'll notice I'm setting the concurrency level of both the loading cache and the nested concurrent hash sets to 1 (meaning each one reads from and writes to a single underlying table rather than several segments). This is because I only expect one thread at a time to read from or write to these objects.

(To quote the concurrencyLevel javadoc, "A value of one permits only one thread to modify the map at a time, but since read operations can proceed concurrently, this still yields higher concurrency than full synchronization.")

Problem

Because I can assume there will only be a single reader/writer at a time, I feel that using many concurrent hash maps per object is heavy-handed. Such structures are meant to handle concurrent reads and writes, and guarantee atomicity of concurrent writes. But in my case atomicity is unimportant - I only need to make sure each thread sees the last thread's changes.

In my search for a more optimal solution I came across this answer by erickson, which says:

Any data that is shared between threads needs a "memory barrier" to ensure its visibility.

[...]

Changes to any member that is declared volatile are visible to all threads. In effect, the write is "flushed" from any cache to main memory, where it can be seen by any thread that accesses main memory.

Now it gets a bit trickier. Any writes made by a thread before that thread writes to a volatile variable are also flushed. Likewise, when a thread reads a volatile variable, its cache is cleared, and subsequent reads may repopulate it from main memory.

[...]

One way to make this work is to have the thread that is populating your shared data structure assign the result to a volatile variable. [...] When other threads access that variable, not only are they guaranteed to get the most recent value for that variable, but also any changes made to the data structure by the thread before it assigned the value to the variable.

(See this InfoQ article for a further explanation of memory barriers.)
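
In code, the pattern erickson describes looks roughly like this (just a sketch for illustration; the class and method names here are made up):

import java.util.HashMap;
import java.util.Map;

class VolatilePublisher {

    private volatile Map<String, String> published; // null until first publish

    void publish(Map<String, String> data) {
        Map<String, String> copy = new HashMap<>(data); // populate fully first...
        published = copy;                               // ...then a single volatile write
    }

    String read(String key) {
        Map<String, String> snapshot = published;       // volatile read
        return snapshot == null ? null : snapshot.get(key);
    }
}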

The problem erickson is addressing is slightly different in that the data structure in question is fully populated and then assigned to a variable that he suggests be made volatile, whereas my structures are assigned to final variables and gradually populated across multiple threads. But his answer suggests I could use a volatile dummy variable to manually trigger memory barriers:

public class ThreadVisibleDataCollector {

    private final SetMultimap<String, String> valueSetsByLabel
            = HashMultimap.create();

    private volatile boolean dummy;

    private void readMainMemory() {
        // volatile read: makes writes published by the last volatile write visible
        if (dummy) { }
    }

    private void writeMainMemory() {
        // volatile write: publishes this thread's preceding writes
        dummy = false;
    }

    public void addLabelValue(String label, String value) {
        readMainMemory();
        valueSetsByLabel.put(label, value);
        writeMainMemory();
    }

    public Set<String> getLabels() {
        readMainMemory();
        return valueSetsByLabel.keySet();
    }

    public Set<String> getLabelValues(String label) {
        readMainMemory();
        return valueSetsByLabel.get(label);
    }
}

Theoretically, I could take this a step further and leave it to the calling code to trigger memory barriers, in order to avoid unnecessary volatile reads and writes between calls on the same thread (potentially by using Unsafe.loadFence and Unsafe.storeFence, which were added in Java 8). But that seems too extreme and hard to maintain.

Question

Have I drawn the correct conclusions from my reading of erickson's answer (and the JMM) and implemented ThreadVisibleDataCollector correctly? I wasn't able to find examples of using a volatile dummy variable to trigger memory barriers, so I want to verify that this code will behave as expected across architectures.

It seems ThreadVisibleDataCollector would be flushing/reading the entire data structure to and from main memory, rather than just the label values being accessed. – S.D.
@S.D. Wouldn't this also be the case with concurrent hash maps? They use explicit locking, but Lock guarantees the same memory synchronization. – Paul Bellora
True. What I was wondering about was whether volatile could be used for each { String label, Set<String> values } object, so that when you access a "label", only that "label" and its set of "values" are synced with main memory, and not all the other labels. This could be true with locking as well, i.e. more granular, "label"-level locking. – S.D.
@S.D. Interesting, that reminds me of Guava's Striped. I'll consider whether it could be used, though in my particular use case most labels are written by a single thread, and all labels are read by another single thread. – Paul Bellora

3 Answers

1 vote

What you are trying to do is called “premature optimization”. You don't have a real performance problem, yet you are making your program more complicated and possibly error-prone, without any gain.

The reason why you will never experience any (notable) gain lies in the way a lock works. You can learn a lot about it by studying the documentation of the class AbstractQueuedSynchronizer.

A Lock is formed around a simple int value with volatile semantics and atomic updates. In the simplest form, i.e. without contention, locking and unlocking each consist of a single atomic update of this int variable. Since you claim you can be sure that only one thread will access the data at a given time, there will be no contention, and the lock-state update has performance characteristics similar to your volatile boolean attempt, with the difference that the Lock code works reliably and is heavily tested.
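
To make the point concrete, here is a toy sketch of that idea (illustration only, not how AbstractQueuedSynchronizer is actually implemented): a lock whose entire state is one atomically updated int.

import java.util.concurrent.atomic.AtomicInteger;

final class ToySpinLock {

    private final AtomicInteger state = new AtomicInteger(0); // 0 = free, 1 = held

    void lock() {
        // uncontended case: a single successful CAS and we are done
        while (!state.compareAndSet(0, 1)) {
            Thread.yield(); // contended case: a real Lock would park the thread instead
        }
    }

    void unlock() {
        state.set(0); // volatile write: publishes everything done while holding the lock
    }
}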

The ConcurrentMap approach goes a step further and allows a lock-free read that has the potential to be even more efficient than your volatile read (depending on the actual implementation).

So you are creating a potentially slower and possibly error-prone program just because you “feel that using many concurrent hash maps per object is heavy-handed”. The only answer can be: don't feel. Measure. Or just leave it as is, as long as there is no real performance problem.

1 vote

A write of a value to a volatile variable happens-before any subsequent read of that value from it. As a consequence, the visibility guarantees you want will be achieved by reading and writing it, so the answer is yes, this solves the visibility issue.

Besides the problems mentioned by Darren Gilroy in his answer, I'd like to point out that in Java 8 there are explicit memory barrier methods in the Unsafe class:

/**
 * Ensures lack of reordering of loads before the fence
 * with loads or stores after the fence.
 */
void loadFence();

/**
 * Ensures lack of reordering of stores before the fence
 * with loads or stores after the fence.
 */
void storeFence();

/**
 * Ensures lack of reordering of loads or stores before the fence
 * with loads or stores after the fence.
 */
void fullFence();

Although Unsafe is not a public API, I still recommend at least considering it if you're using Java 8.
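
For reference, here is a sketch of how Unsafe is typically obtained reflectively (the theUnsafe field name is a HotSpot implementation detail, so this is not guaranteed to work on every JVM):

import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class Fences {

    private static final Unsafe UNSAFE;

    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            UNSAFE = (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static void acquireFence() { UNSAFE.loadFence(); }  // call before reading shared state
    static void releaseFence() { UNSAFE.storeFence(); } // call after writing shared state
}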

One more solution comes to mind. You have set your concurrencyLevel to 1, which means that only one thread at a time can modify the collection. IMO plain Java synchronized or ReentrantLock (for cases of high contention) will also fit your task and provide the visibility guarantees. If you want a one-writer, many-readers access pattern, consider using ReentrantReadWriteLock.
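
For example, a synchronized variant of the original collector might look like the sketch below (the class name is made up, and the getters return defensive copies so callers never touch the multimap outside the lock):

import com.google.common.collect.HashMultimap;
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.SetMultimap;
import java.util.Set;

public class SynchronizedDataCollector {

    private final SetMultimap<String, String> valueSetsByLabel
            = HashMultimap.create();

    public synchronized void addLabelValue(String label, String value) {
        valueSetsByLabel.put(label, value);
    }

    public synchronized Set<String> getLabels() {
        return ImmutableSet.copyOf(valueSetsByLabel.keySet());
    }

    public synchronized Set<String> getLabelValues(String label) {
        return ImmutableSet.copyOf(valueSetsByLabel.get(label));
    }
}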

0 votes

Well, that's still not particularly safe, because it depends a lot on the underlying implementation of HashMultimap.

You might take a look at the following blog post for a discussion: http://mailinator.blogspot.com/2009/06/beautiful-race-condition.html

For this type of thing, a common pattern is to load a "most recent version" into a volatile variable and have your readers read immutable versions through that. This is how CopyOnWriteArrayList is implemented.

Something like ...

class Collector {
   private volatile HashMultimap<String, String> values = HashMultimap.create();

   public void add(String k, String v) {
      HashMultimap<String, String> t = HashMultimap.create(values); // copy the current map
      t.put(k, v);
      this.values = t;  // this volatile write acts as the memory barrier
   }

   public Set<String> get(String k) {
      return values.get(k); // this volatile read acts as the memory barrier
   }
}

However, both your solution and mine still have a bit of a problem: we are both returning mutable views of the underlying data structure. I might change the HashMultimap to an ImmutableMultimap to fix the mutability issue. Be aware also that, because the returned Set is a view, callers end up retaining a reference to the full internal map, not just to the returned Set.
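
A sketch of that immutable variant, using ImmutableSetMultimap (the class name is made up; a fresh immutable copy is built and republished on every write):

import com.google.common.collect.ImmutableSetMultimap;
import java.util.Set;

class ImmutableCollector {

    private volatile ImmutableSetMultimap<String, String> values
            = ImmutableSetMultimap.of();

    public void add(String k, String v) {
        // rebuild an immutable copy containing the new entry,
        // then publish it with a single volatile write
        values = ImmutableSetMultimap.<String, String>builder()
                .putAll(values)
                .put(k, v)
                .build();
    }

    public Set<String> get(String k) {
        return values.get(k); // an immutable snapshot, safe to hand to callers
    }
}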

Creating a new copy can seem somewhat wasteful, but I suspect that if you have only one thread writing, you have a good sense of the rate of change and can decide whether that's reasonable. For example, if you wanted to return Set<String> instances that update dynamically as things change, then the MapMaker-based solution doesn't seem heavy-handed.