Distributed cache for non-serializable objects

Question

My application needs to cache non-serializable objects for performance reasons. These non-serializable objects are in-memory models built from an external resource. For example, a validation template is stored as XML in the database, and an in-memory model is constructed by parsing the XML. The in-memory model is relatively expensive to build, so caching improves performance. However, the in-memory model needs to be reloaded from the database when the underlying record is changed.

In a single-application scenario, I stored the objects in a simple map. When a record is changed in the database, the in-memory model is rebuilt and replaces the old entry in the map.
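For illustration, the single-node setup described above might look roughly like the sketch below. The class names, the `loader` function, and the `Object[]` stand-in for the parsed model are all hypothetical and are not taken from the actual application.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Single-node cache of expensive in-memory models, keyed by record id. */
class LocalModelCache<V> {
    private final Map<String, V> models = new ConcurrentHashMap<>();
    // Hypothetical loader: reads the record and builds the model from it.
    private final Function<String, V> loader;

    LocalModelCache(Function<String, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached model, building it on first access. */
    V get(String recordId) {
        return models.computeIfAbsent(recordId, loader);
    }

    /** Call when the record changes: rebuild the model and replace the old entry. */
    V reload(String recordId) {
        V rebuilt = loader.apply(recordId);
        models.put(recordId, rebuilt);
        return rebuilt;
    }
}

class LocalModelCacheDemo {
    public static void main(String[] args) {
        // Stand-in for the database table that holds the XML templates.
        Map<String, String> database = new ConcurrentHashMap<>();
        database.put("template-1", "<rule max=\"10\"/>");

        // Object[] stands in for the parsed in-memory model.
        LocalModelCache<Object[]> cache =
                new LocalModelCache<>(id -> new Object[] {database.get(id)});

        Object[] first = cache.get("template-1");
        System.out.println(first == cache.get("template-1")); // cached instance is reused

        database.put("template-1", "<rule max=\"20\"/>");
        cache.reload("template-1");
        System.out.println(cache.get("template-1")[0]);
    }
}
```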

In a distributed scenario, I need the invalidation message to propagate across the cluster so that all nodes rebuild the in-memory model when the record changes. I have looked at Infinispan and Hazelcast and they both require all cached objects to be serializable. However, if the cache operates in an invalidation mode (where data is not sent across the wire), I don't see why the cached objects need to be serializable.
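To make the invalidation idea concrete, here is a minimal sketch in which only the record key travels between nodes and each node rebuilds its own model locally. `InvalidationBus` is a hypothetical in-process stand-in for whatever cluster-wide messaging would be used in practice, and `NodeCache` and the loader are likewise assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical stand-in for a cluster-wide topic; only record keys pass through it. */
class InvalidationBus {
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> listener) {
        listeners.add(listener);
    }

    void publish(String recordId) {
        listeners.forEach(listener -> listener.accept(recordId));
    }
}

/** One node's local cache; the cached models never leave the node. */
class NodeCache<V> {
    private final Map<String, V> models = new ConcurrentHashMap<>();
    private final Function<String, V> loader; // hypothetical: builds the model from the database

    NodeCache(InvalidationBus bus, Function<String, V> loader) {
        this.loader = loader;
        // On invalidation, evict the stale entry; it is rebuilt lazily on the next get().
        bus.subscribe(recordId -> models.remove(recordId));
    }

    V get(String recordId) {
        return models.computeIfAbsent(recordId, loader);
    }
}

class InvalidationDemo {
    public static void main(String[] args) {
        Map<String, String> database = new ConcurrentHashMap<>(); // shared source of truth
        database.put("template-1", "<rule max=\"10\"/>");

        InvalidationBus bus = new InvalidationBus();
        NodeCache<Object[]> nodeA = new NodeCache<>(bus, id -> new Object[] {database.get(id)});
        NodeCache<Object[]> nodeB = new NodeCache<>(bus, id -> new Object[] {database.get(id)});
        nodeA.get("template-1");
        nodeB.get("template-1");

        // The record changes: update the database, then broadcast only the key.
        database.put("template-1", "<rule max=\"20\"/>");
        bus.publish("template-1");

        System.out.println(nodeA.get("template-1")[0]);
        System.out.println(nodeB.get("template-1")[0]);
    }
}
```

In this sketch the only thing that would need to be serializable is the key, since the model itself is always rebuilt on the node that uses it.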

What techniques are commonly used in this scenario? Is this scenario unusual (i.e. should I be doing something different)?

Read the Hazelcast documentation on distributed caching; it may help you understand the whole scenario. — Nirav Prajapati
I've read a lot of the Hazelcast documentation. Is there a particular part of the documentation you think would be relevant? — Nathan
Yup, I couldn't quite grasp the concept either. I am using Hazelcast to track online/offline users in a distributed environment and am facing the same issue: when one node has a problem, it gives me a stack trace caused by serializing the map. — Nirav Prajapati
From their documentation, Hazelcast has a custom serialization feature, which should do what you need. — Lolo

javabrew · Accepted Answer · 2014-08-22T06:58:40

However, if the cache operates in an invalidation mode (where data is not sent across the wire)

I'm not exactly sure what this means. Why store objects in a distributed cache then? And how did you get them into the cache in the first place?

Your objects do not have to be serializable in the pure Java sense, i.e., they do not have to implement the Serializable interface. But since your cache is distributed, be it Hazelcast, Memcached, or Ehcache, you need to get your Java objects across the wire and store them in the cache in some external format, and then be able to get them back from the cache and restore them as Java objects. This is called marshaling/unmarshaling, or serialization/deserialization. There are a variety of formats you can consider: XML, JSON, BSON, YAML, Thrift, etc. There are numerous frameworks and libraries that can help you work with these different serialization schemes: XStream, JAXB, Jackson, Apache Camel, etc.
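As a rough sketch of that marshal/unmarshal round trip, the example below writes a model out as an XML string and rebuilds it with the JDK's built-in DOM parser. `LengthRule` and its XML layout are hypothetical and chosen only for illustration; a real model would more likely go through one of the libraries named above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical model; it deliberately does not implement java.io.Serializable. */
class LengthRule {
    final int min;
    final int max;

    LengthRule(int min, int max) {
        this.min = min;
        this.max = max;
    }

    boolean accepts(String value) {
        return value.length() >= min && value.length() <= max;
    }

    /** Marshal: write the model out in an external format (XML here). */
    String toXml() {
        return "<lengthRule min=\"" + min + "\" max=\"" + max + "\"/>";
    }

    /** Unmarshal: rebuild the model from the external format. */
    static LengthRule fromXml(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        return new LengthRule(
                Integer.parseInt(root.getAttribute("min")),
                Integer.parseInt(root.getAttribute("max")));
    }
}

class MarshalDemo {
    public static void main(String[] args) throws Exception {
        LengthRule original = new LengthRule(2, 5);

        // The String is what would be put into the distributed cache.
        String wireFormat = original.toXml();

        // Any node can turn the cached String back into a working model.
        LengthRule restored = LengthRule.fromXml(wireFormat);
        System.out.println(restored.accepts("abc"));
    }
}
```

Note that this trades the serialization requirement for a parse on every cache read, which matters if the model is expensive to build.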

As far as Hazelcast goes, its documentation explicitly says: "All your distributed objects such as your key and value objects, objects you offer into distributed queue and your distributed callable/runnable objects have to be Serializable." Maybe you could consider Guava's in-memory cache?
