7
votes

So here is the problem I am trying to solve - I have an Object with two integer fields that I want to cache

public class MyObject {
   int x;
   int y;
   ....
}

Now the field x is what I mainly match on - but there can be duplicates in which case I want to fall back on the second field (so that this.x=that.x and this.y=that.y). y can only be 25 distinct values. Now I know I could just combine the two as a String and use that as the cache key, but then I would have to try x+[25 possible values] to actually determine if it was not in the cache - making cache misses very expensive. I was thinking of trying to store a List<Integer> as the cache value for the field x and then if their was more then one, iterate down the list and look for a match on y.

Now if I use a ConcurrentList (or a Set if I care about duplicates - lets ignore that for now) will multiple threads be able to add to it and then put it back into the cache without race conditions? Is it possible that Ehcache might return two different List Objects to two threads and then when they add their new value to the list and attempt to put it back to the cache I could get undeterministic results? Do you see a better way of constructing this cache?

EDIT : I appreciate the answers below, but everyone seems to be missing the main point. Will this work? Could Ehcache actually return two different objects for the same cacheKey (say if the object was on disk during the call and it's serialized it twice, once for each call).

5
Is there a reason you can't use a hash as the key?Falmarri
As I stated in the question that would mean for any given value of x I would have to check the cache possibly 25 times to actually know it was not in the cache. I want to match on x no matter the value of y - but if there are multiple x then the best value of y.Gandalf
Yes it is possible to have a value added twice using the method you mentioned. Rather use a ConcurrentMap.Keegan Carruthers-Smith
Yeah I understand that Keegan (I'll change the question to reflect) - the real question is whether or not there could possible be two completely different List (or Set as you pointed out) objects created.Gandalf
What do you mean by the best value of y?John Vint

5 Answers

5
votes

It's absolutely possible that you get two different instances of your List (or of any Serializable)! Try this:

public static void main(final String[] args) throws Exception {
    final Cache cache = CacheManager.getInstance().getCache("smallCache");

    final List<String> list = new ArrayList<String>();
    cache.put(new Element("A", list));

    /* We put in a second element. Since maxElementsInMemory="1", this means
     * that "A" will be evicted from memory and written to disk. */
    cache.put(new Element("B", new ArrayList<String>())); 
    Thread.sleep(2000); // We need to wait a bit, until "A" is evicted.

    /* Imagine, the following happens in Thread 1: */
        final List<String> retrievedList1 =
                   (List<String>) cache.get("A").getValue();
        retrievedList1.add("From Thread 1");

    /* Meanwhile, someone puts something in the cache: */
        cache.put(new Element("C", new ArrayList<String>())); 

    Thread.sleep(2000); // Once again, we wait a bit, until "A" is evicted.

    /* Now the following happens in Thread 2: */
        final List<String> retrievedList2 =
                   (List<String>) cache.get("A").getValue();
        retrievedList2.add("From Thread 2");
        cache.put(new Element("A", retrievedList2));

    /* Meanwhile in Thread 1: */    
        cache.put(new Element("A", retrievedList1));

    /* Now let's see the result: */
    final List<String> resultingList =
                        (List<String>) cache.get("A").getValue();
    for (final String string : resultingList) {
        System.out.println(string);
    } /* Prints only "From Thread 1". "From Thread 2" is lost.
                 But try it with maxElementsInMemory="3", too!! */

    CacheManager.getInstance().shutdown();
}

I used the following in ehcache.xml:

<cache name="smallCache"
       maxElementsInMemory="1"
       eternal="true"
       overflowToDisk="true"
       diskPersistent="true"
       maxElementsOnDisk="200"
       memoryStoreEvictionPolicy="LRU"
       transactionalMode="off"
       >
</cache>

One solution may be to use Explicit Locking, which seems to be available for standalone (non-Terracotta) caches, too (since ehcache 2.1).

Another solution would be to only have one thread which can modify the List. If you have multiple threads which can modify it, and you don't use locking on the cache, then you can get exactly the undeterministic results you described!

2
votes

I have a different approach for you, which I just read in an article about geographic range searches.

Put two key-value pairs in the cache: One with only x as the key, and one with both x and y as the key. When you look in the cache, look for the x-and-y key first. If it's there, you found a perfect match. If it's not there, look for the x key and possibly find a match with different y value.

1
votes

I would create a method to get the value for your object. Use a semaphore to restrict access to the method (or use synchronized).

In your method, test for X-only matches, and if that returns multiple results, text for XY matches.

Once the object is outside of the cache, any modifications to the object will modify the object within the cache as well (since they are pointing to the same instance).

If you want to be super careful, use synchronized methods to get/set the member variables within the MyObject, and include a lock which is the MyObject instance.

public void setX( int x ) {
     synchronized( this ) {
         this.x = x;
     }
}
0
votes

You could use a Map containing a sorted set as the value. The first map could index on X and then you can pick the first element from the sorted set where the sort is based on Y.

I guess the google collection api got lots of neat stuff that you could use, for instance the SortedSetMultimap:

http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/SortedSetMultimap.html

0
votes
  • Make a Key class of x and y, ie. class Key { int x,y }
  • implement a separate comparison operation for you "lexical ordering" on x and y,
  • put it into a Map<Key,Value>