2
votes

I want to cache the result of an expensive operation, and I don't want other threads in different JVMs running the same operation at the same time. We are guaranteed to get at least 5 near-real-time requests for the same computation at once, and we have no control over streamlining these requests.

Solutions I could think of:

  1. The other threads could wait to acquire a lock (Hazelcast), but if there are many threads, the last one to acquire the lock could spend a long time waiting.

Is there a way for these other threads to simply "wait for the lock to be released" and NOT acquire the lock themselves, since they are only reading from the cache?

  2. Use polling. First, a blocking `threadId = cache.putIfAbsent(key)` determines which thread will do the computation; the others keep polling a second cache entry keyed by `threadId` for the result. The polling is wasteful; is there a way to "wait for a read from cache"?

  3. An actual distributed "Shared Reentrant Read Write Lock" seems to be the solution, but the Apache Curator library does not seem lightweight, and I am looking for a simple async P2P distributed cache approach.

Alternatively, how do I achieve the same using Hazelcast? Overall, isn't blocking and avoiding the computation (in our case CPU- and IO-bound) in the first place a better approach than letting all threads compute, having the database/cache reject the additional writes, and returning the first computation's result?
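For what it's worth, the "wait without acquiring the lock" idea in option 1 is essentially the single-flight pattern. Here is a minimal single-JVM sketch using `ConcurrentHashMap` and `CompletableFuture` (the class and method names are my own, purely illustrative); a distributed version would need the same idea layered on a Hazelcast map:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Single-JVM sketch of "compute once, others wait": the first caller for a
// key installs a CompletableFuture and runs the computation; later callers
// find the existing future and block on it, with no polling and no lock.
public class SingleFlightCache<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

    public V get(K key, Supplier<V> expensiveComputation) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing.join(); // just wait for the winner's result
        }
        try {
            V value = expensiveComputation.get();
            mine.complete(value);   // entry stays in the map and acts as the cache
            return value;
        } catch (RuntimeException e) {
            inFlight.remove(key, mine); // let a later caller retry after failure
            mine.completeExceptionally(e);
            throw e;
        }
    }
}
```

Completed entries remain cached here; real code would evict them on some TTL.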

1
Nice interview question ;). - M.K.

1 Answer

0
votes

I did the following a while ago. First, there is an existing solution using Hazelcast: https://github.com/ThoughtWire/hazelcast-locks.

This library works as a drop-in replacement providing a distributed "Shared Reentrant Read Write Lock": writes take an exclusive lock while reads do not block each other, so we are guaranteed the computation happens only once. The only problem we faced was that the lock-release notification took longer to reach the waiting threads than we expected. That is besides the fact that you have to explicitly clean up the (expensive) locks after a certain time, since each lock is tied to a specific key.
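For readers unfamiliar with the discipline, here is what that read-write pattern looks like within a single JVM using the JDK's `ReentrantReadWriteLock` (this is my own sketch, not the library's API; the distributed version behaves analogously, plus the release-notification latency noted above):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Read-write discipline: one writer computes and populates the cache while
// readers block; once the write lock is released, readers proceed concurrently.
public class RwCache {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final Map<String, String> cache = new HashMap<>();

    public String get(String key, java.util.function.Function<String, String> compute) {
        rw.readLock().lock();
        try {
            String v = cache.get(key);
            if (v != null) return v; // shared access: readers never block each other
        } finally {
            rw.readLock().unlock();
        }
        rw.writeLock().lock();
        try {
            // re-check: another writer may have filled the entry while we waited
            return cache.computeIfAbsent(key, compute);
        } finally {
            rw.writeLock().unlock();
        }
    }
}
```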

We ended up implementing custom logic in a forward proxy just before the requests hit the Tomcat containers: it routed requests to a specific server based on the key, and a local JVM lock on a shared concurrent-map entry for the same key did the job. The cleanup logic was also much simpler.
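The per-key local lock can be sketched like this (a minimal illustration with names of my own choosing; once the proxy pins all requests for a key to one JVM, a shared map yields one lock object per key, and cleanup is a plain `remove`):

```java
import java.util.concurrent.ConcurrentHashMap;

// One lock object per key, created lazily; all threads asking for the same
// key get the same object, so synchronizing on it serializes the computation.
public class KeyedLocks {
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    public Object lockFor(String key) {
        return locks.computeIfAbsent(key, k -> new Object());
    }

    public void release(String key) {
        locks.remove(key); // local-only state, so cleanup is trivial
    }
}

// usage:
// synchronized (keyedLocks.lockFor(key)) { /* compute once, then cache */ }
```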