4
votes

The testMethod() in the below code is accessed by so many threads at runtime. It accepts a 'firstName' and immediately returns 'lastName' if the entry is found in the map. If not it looks up in API for the last name, updates the map and returns the same. Now this method does put and get operation in the same map data-structure which i think is not 'thread-safe'. I am now confused whether to make the function 'synchronized' or use a ConcurrentHashMap instead of a HashMap

public class Sample {

    Map<String, String> firstNameToLastName = new HashMap<>();

    public String testMethod(String firstName) {
        String lastName = firstNameToLastName.get(firstName);

        if (lastName!= null)
            return lastName;

        String generateLastName = SomeAPI.generateLastName(firstName);

        firstNameToLastName.put(firstName, generateLastName);

        return generateLastName;
    }
}
3
IMO ConcurrentHashMap should be used because as this map is instance variable, it is better to make it thread safe. And it seems that other data in your method is local and hence it is already thread safe.Jeet
Both would work for you. But ConcurrentHashMap is better solution, because it's likely, that somebody would write additional methods and forget add synchronized keyword. You have only one operation on firstNameToLastName but synchronized testMethod would block whole method not just one critical operation.Alex Baranowski
For this particular use case, take a look at Guava's LoadingCache. When the cache entry for a key is missing, only one thread calculates the value while any other threads getting the same key wait patiently. This can reduce load on your back-end. The CacheBuilder also allows for configurable eviction strategies.dnault

3 Answers

6
votes

You are right that your code is not thread safe. This results in the following problems, with the nasty downside that most of the time it will work fine:

  1. Updates from one thread might never be shown to other threads.
  2. Two threads might check for a first name, find nothing and both add the lastname (before the change is shown to eachother).
  3. Theads might see 'incomplete' objects.

Basic Fix using synchronized

A very basic fix is to allow only one thread at the same time to access the concurrent part of your function, using the synchronized keyword (this can be added to the function definition, but you should use a private object to synchronize on).

public class Sample {

    Map<String, String> firstNameToLastName = new HashMap<>();
    private final Object nameMapLock = new Object();

    public String testMethod(String firstName) {
        synchronized(nameMapLock){
            String lastName = firstNameToLastName.get(firstName);

            if (lastName!= null)
                return lastName;

            String generateLastName = SomeAPI.generateLastName(firstName);

            firstNameToLastName.put(firstName, generateLastName);

            return generateLastName;
        }
    }
}

If multiple threads try to access the data at the same time, they have to wait until a another thread is done. You also have to make sure you don't introduce deadlocks in the locking.

Synchronizing on a private Object

In reply to the comments, I'll add a little explanation of why synchronization is done on a private object instead of on the full method (by adding synchronized to the method definition) or on the map.

The reason for using a private object, is that you can be 100% sure that no other class is also using your object (read lock) to synchronize on.

When you use the synchronized keyword on the method, you are in reality synchronizing on this (the current object), meaning anyone using your class could also do that. When you synchronize on the map, the map itself might synchronize on that object as well, or other classes that you pass the map to.

Note that in some very rare cases you do want others to be able to use the same lock, but that means you have a lot of extra documenting to do, and run the risk of others misusing your locking.

The way I showed in the above example, is how most people do this. However there are plenty of other ways that would do the same.

Fix using ConcurrentHashMap

Using a ConcurrentHashMap will solve problem 1 and 3 (as numbered above). But you still have to take special measure for point two. Since Java 8 you can do this quiet elegantly using ConcurrentHashMap.computeIfAbsent(). This would work as follows:

public class Sample {

    ConcurrentHashMap<String, String> firstNameToLastName = new ConcurrentHashMap<>();

    public String testMethod(String firstName) {
        return firstNameToLastName.computeIfAbsent(firstName, 
                    name -> SomeAPI.generateLastName(name));

        }
    }
}

As you see this can make the implementation very elegant. However if you have more (and more complex) operations on the map, you might run into trouble.

0
votes

You can use ReentrantReadWriteLock, this way you can have multiple threads reading.

public class Sample {

final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
Map<String, String> firstNameToLastName = new HashMap<>();

public String testMethod(String firstName) {
    rwl.readLock().lock();
    String lastName = firstNameToLastName.get(firstName);
    rwl.readLock().unlock();

    if (lastName!= null)
        return lastName;

    lastName = SomeAPI.generateLastName(firstName);

    // Must release read lock before acquiring write lock, it is already released
    rwl.writeLock().lock();
    //now another thread could already put a last name, so we need to check again
    lastName = firstNameToLastName.get(firstName);
    if (lastName== null)
        firstNameToLastName.put(firstName, lastName);

    rwl.writeLock().unlock();
    return lastName;
}

}

0
votes

IMO you only need to synchronize parts of your code that try to access a shared resource(e.g a collection).

In your code aside from the api that you are calling (which we dont know anything about) the only shared resource is your first name to last name map, so if you make it a concurrent collection (Concurrent hashMap), you data in your map will be fine (in a scenario in which two threads enter the "testMethod" and can`t find a name in map and in a race-condition one of them succeeds to call put method first and adds the last name to map then the other thread calls put method with the same key/value but eventually your map has the correct values).

But in your code the overall operation of testMethod is unexpected and for example in one thread may not find a key and call the the api to generate last name while another thread is updating the map with the same key.