I would be very careful about putting all the data from your underlying data source (e.g. an RDBMS) into memory in a java.util.Map, since it would be very easy to run out of memory (hence an OutOfMemoryError) pretty quickly, depending on the size of the result set.
Nonetheless, if you want an example of this, see here; configuration is here.
Essentially, I am using a Spring BeanPostProcessor, the RegionPutAllBeanPostProcessor, to put a Map of data into a "target" Region.
For example, I have a Region (i.e. "RegionOne"), and I can use the RegionPutAllBeanPostProcessor to target this Region and put the data from the Map into it.
Obviously, you have many different options when it comes to triggering this Region load/"warming": a GemFire Initializer, a Spring BeanPostProcessor (docs here), or even a Spring ApplicationListener listening for ApplicationContextEvents, such as a ContextRefreshedEvent (docs here).
However, while the Map in this test is hard-coded in XML, you could envision populating this Map from any data source, including a java.sql.ResultSet derived from a SQL query executed against the RDBMS.
So, perhaps a better approach that would not eat up as much memory would be to use a BPP "injected" with Spring's JdbcTemplate or a JPA EntityManager, or better yet, Spring Data JPA, then load the data with your framework of choice and put it directly into the Region. After all, if Region.putAll(:Map) is essentially just iterating the Map.Entry objects of the incoming Map and calling Region.put(key, value) individually for each Map.Entry (this, this and this), then clearly it is not buying you that much, and it certainly does not justify putting all the data in memory before putting it into the Region.
For instance, most ResultSets are implemented with a DB cursor that allows you to fetch a certain number of rows at a time rather than all possible rows at once. Your SQL query can also be more selective about which rows are returned, based on interest/pertinence: think of loading only a subset of the most important data, or applying some other criteria specifiable in the query predicate. Then simply put the data into the Region while iterating the ResultSet.
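To make that last step concrete, here is a minimal sketch of what "put while iterating" could look like. The RegionWarmer class and its streamInto method are hypothetical names of mine, not part of GemFire or Spring, and I am assuming the key is in column 1 and the value in column 2 of the query result. The sink is a plain BiConsumer so the sketch compiles with nothing but the JDK.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.function.BiConsumer;

/** Hypothetical helper (not a GemFire or Spring class): streams rows into a key/value sink. */
final class RegionWarmer {

    private RegionWarmer() {}

    /**
     * Hands each row to the sink as soon as it is read, so no intermediate
     * Map holding the whole result set is ever built.
     * Assumption: column 1 holds the key and column 2 holds the value.
     *
     * @return the number of rows passed to the sink
     */
    static long streamInto(ResultSet rows, BiConsumer<Object, Object> sink)
            throws SQLException {
        long count = 0;
        while (rows.next()) {
            sink.accept(rows.getObject(1), rows.getObject(2));
            count++;
        }
        return count;
    }
}
```

With a Region<Object, Object> in hand, you could presumably pass region::put as the sink. Calling Statement.setFetchSize(int) before executing the query is a hint to the JDBC driver about how many rows to pull per round trip; whether and how the driver honors it varies.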
Food for thought.
-John