I have some data (two HashSets and a timestamp Instant) that I'd like all requests to my JIRA (OpenSocial?) gadget/plugin to share, both because it takes a couple of minutes to generate and because sharing it keeps the requests fast.

Occasionally (very rarely), a request might include a parameter that indicates this shared data should be refreshed. And of course the first time it's needed, it gets populated. It is okay for the data to represent a stale answer -- it is based on things that change slowly and used to visualize trends so off-by-one errors are tolerable.

I imagine that when JIRA starts up (or I upload a new version of my add-on) and multiple requests come in during the first couple of minutes, I'd need to populate this expensive shared data in a thread-safe way. Currently the results look fine, but as I understand it, that has only been due to chance.

Only one thread needs to do the work of populating. On start-up, the other threads will have to wait, of course, because they can't skip ahead empty-handed. (If every thread did the expensive initialization, that would put a lot of unnecessary load on the server.)

But after the initial cost, if multiple concurrent requests come in and one of them includes the 'refresh' parameter, only that one thread needs to pay the price -- I'm fine with the other threads using an old copy of the expensive data and thereby staying performant, and including in the response that "yes someone out there is refreshing the data but here's a result using an old copy".

More about the data: The two HashSets and the timestamp are intended to represent a consistent snapshot in time. The HashSet contents depend on values in the database only, and the timestamp is just the time of the most recent refresh. None of this data depends on any earlier snapshot in time. And none of it depends on program state either. The timestamp is only used to answer the question "how old is this data" in a rough sense. Every time the data is refreshed, I'd expect the timestamp to be more recent but nothing is going to break if it's wrong. It's just for debugging and transparency. Since a snapshot doesn't depend on earlier snapshots or the program state, it could be wrapped and marked as volatile.
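
To make "wrapped and marked as volatile" concrete, here is a minimal sketch of what that could look like. The names Snapshot and SnapshotHolder and the String element type are placeholders chosen for illustration, not part of my actual plugin.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Immutable snapshot: both sets and the timestamp are fixed at construction.
// (Hypothetical class; String elements are an assumption.)
final class Snapshot
{
    final Set<String> first;
    final Set<String> second;
    final Instant takenAt;

    Snapshot(Set<String> first, Set<String> second, Instant takenAt)
    {
        // Defensive copies, so later changes to the caller's sets cannot leak in.
        this.first = Collections.unmodifiableSet(new HashSet<>(first));
        this.second = Collections.unmodifiableSet(new HashSet<>(second));
        this.takenAt = takenAt;
    }
}

class SnapshotHolder
{
    // One volatile write publishes all three values together, so a reader
    // sees either the old snapshot or the new one, never a mix of the two.
    private volatile Snapshot current;

    Snapshot get()
    {
        return current; // null until the first snapshot is published
    }

    void publish(Snapshot fresh)
    {
        current = fresh;
    }
}
```

This only covers safe publication of a finished snapshot. It does not by itself stop several threads from doing the expensive generation at the same time, which is the part I'm asking about.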

Is there an obvious choice for the best way to go about this? Pros and cons of alternatives?

1 Answer

You'll want to use locks (from java.util.concurrent.locks) to guard the sections of your code that only one thread may execute at a time. There are plenty of resources on Stack Overflow and in the Oracle Java docs that show how to use locks in more detail, but something like this should do the trick.

The idea is that you want to maintain a copy of the most-recently generated set of results, and you always return that copy until you have a new set of data available.

import java.util.concurrent.locks.ReentrantLock;

public class MyClass
{
    private volatile MyObject completedResults;
    private final ReentrantLock resultsLock;
    private final ReentrantLock refreshLock;

    public MyClass()
    {
        // This must be a singleton class (such as a servlet) for this to work, since every
        // thread needs to be accessing the same lock.

        resultsLock = new ReentrantLock();
        refreshLock = new ReentrantLock();
    }

    public MyObject myMethodToRequestResults(boolean refresh)
    {
        MyObject resultsToReturn;

        // Serialize access to get the most-recently completed set of results; if none exists,
        // we need to generate it and all requesting threads need to wait.

        resultsLock.lock();

        try
        {
            if (completedResults == null)
            {
                completedResults = generateResults();
                refresh = false; // we just generated it, so no point in redoing it below
            }

            resultsToReturn = completedResults;
        }
        finally
        {
            resultsLock.unlock();
        }

        if (refresh)
        {
            // If someone else is regenerating, we just return the old data and tell the caller that.

            if (!refreshLock.tryLock())
            {
                // create a copy of the results to return, since we're about to modify it on the next line
                // and we don't want to change the (shared) original!

                resultsToReturn = new MyObject(resultsToReturn);  
                resultsToReturn.setSomeoneElseIsRegeneratingTheStuffRightNow(true);
            }
            else
            {
                try
                {
                    completedResults = generateResults();
                    resultsToReturn = completedResults;
                }
                finally
                {
                    refreshLock.unlock();
                }
            }
        }

        return resultsToReturn;
    }
}
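
The class above leaves MyObject and generateResults() undefined. As a rough sketch only, MyObject could look something like the following, assuming it wraps the two sets and the timestamp from the question. The field names, the String element type, and the getters are assumptions made for illustration; only the copy constructor and setSomeoneElseIsRegeneratingTheStuffRightNow are implied by the code above.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical shape for MyObject: field names and element type are illustrative.
class MyObject
{
    private final Set<String> firstSet;
    private final Set<String> secondSet;
    private final Instant generatedAt;

    // Per-response flag; only ever set on a private copy, never on the shared instance.
    private boolean someoneElseIsRegenerating;

    MyObject(Set<String> firstSet, Set<String> secondSet, Instant generatedAt)
    {
        // Copy and wrap so the shared data cannot be modified after construction.
        this.firstSet = Collections.unmodifiableSet(new HashSet<>(firstSet));
        this.secondSet = Collections.unmodifiableSet(new HashSet<>(secondSet));
        this.generatedAt = generatedAt;
    }

    // Copy constructor used by myMethodToRequestResults before setting the flag.
    MyObject(MyObject other)
    {
        // The sets are unmodifiable, so sharing the references is safe and cheap.
        this.firstSet = other.firstSet;
        this.secondSet = other.secondSet;
        this.generatedAt = other.generatedAt;
        this.someoneElseIsRegenerating = other.someoneElseIsRegenerating;
    }

    Set<String> getFirstSet() { return firstSet; }
    Set<String> getSecondSet() { return secondSet; }
    Instant getGeneratedAt() { return generatedAt; }
    boolean isSomeoneElseRegenerating() { return someoneElseIsRegenerating; }

    void setSomeoneElseIsRegeneratingTheStuffRightNow(boolean value)
    {
        this.someoneElseIsRegenerating = value;
    }
}
```

With this shape, generateResults() would build the two sets from the database and return new MyObject(first, second, Instant.now()). Because the sets are unmodifiable, the copy made on the "someone else is refreshing" path costs almost nothing and cannot disturb the shared instance.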