
I have a requirement in my project to cache 9 million rows from an Oracle database in Hazelcast. But apparently Hazelcast is consuming more heap space than it is supposed to. I have allotted 8 GB of heap space for the app, but I am still getting an out-of-memory error.

Below is my data loader class.

public class CustomerProfileLoader implements ApplicationContextAware, MapLoader<Long, CustomerProfile> {

    private static CustomerProfileRepository customerProfileRepository;

    @Override
    public CustomerProfile load(Long key) {
        log.info("load({})", key);
        return customerProfileRepository.findById(key).get();
    }

    @Override
    public Map<Long, CustomerProfile> loadAll(Collection<Long> keys) {
        log.info("load all in loader executed");
        Map<Long, CustomerProfile> result = new HashMap<>();
        for (Long key : keys) {
            CustomerProfile customerProfile = this.load(key);
            if (customerProfile != null) {
                result.put(key, customerProfile);
            }
        }
        return result;
    }

    @Override
    public Iterable<Long> loadAllKeys() {
        log.info("Find all keys in loader executed");
        return customerProfileRepository.findAllId();
    }

    @Override
    public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        customerProfileRepository = applicationContext.getBean(CustomerProfileRepository.class);
    }
}

Below is the repository query. If I change this query so that it limits the result to, say, 2 million rows, then everything works fine.

 @Query("SELECT b.id FROM CustomerProfile b ")
    Iterable<Long> findAllId();
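
As an aside: the loadAll above calls load(key) in a loop, so it issues one SELECT per key. A batched variant would cut the number of database round trips considerably. Here is a minimal sketch, assuming the repository extends Spring Data's CrudRepository (whose findAllById fetches a whole batch in one query; on Spring Data versions before 2.0 the equivalent method is findAll(Iterable)) and assuming CustomerProfile has a getId() getter:

@Override
public Map<Long, CustomerProfile> loadAll(Collection<Long> keys) {
    Map<Long, CustomerProfile> result = new HashMap<>();
    // One batched query (translated to an IN clause by the JPA provider)
    // instead of one query per key.
    for (CustomerProfile profile : customerProfileRepository.findAllById(keys)) {
        result.put(profile.getId(), profile);  // getId() assumed to exist
    }
    return result;
}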

Below is my map configuration in the hazelcast.xml file. Here I set the backup count to zero; before it was 1, but that didn't make any difference.

<?xml version="1.0" encoding="UTF-8"?>
<hazelcast
        xsi:schemaLocation="http://www.hazelcast.com/schema/config
        http://www.hazelcast.com/schema/config/hazelcast-config-3.11.xsd"
        xmlns="http://www.hazelcast.com/schema/config"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <!-- Use port 5701 and upwards on this machine for cluster members -->

    <network>
        <port auto-increment="true">5701</port>

        <join>
            <multicast enabled="false"/>
            <tcp-ip enabled="true">
                <interface>127.0.0.1</interface>
            </tcp-ip>
        </join>
    </network>

    <map name="com.sample.hazelcast.domain.CustomerProfile">
        <indexes>
            <!-- custom attribute without an extraction parameter -->
            <index ordered="false">postalCode</index>
        </indexes>
        <backup-count>0</backup-count>
        <map-store enabled="true" initial-mode="EAGER">
            <class-name>com.sample.hazelcast.CustomerProfileLoader</class-name>
        </map-store>
    </map>
</hazelcast>

Database table structure:

ID             NOT NULL NUMBER(19)
LOGIN_ID       NOT NULL VARCHAR2(32 CHAR)
FIRSTNAME               VARCHAR2(50 CHAR)
LASTNAME                VARCHAR2(50 CHAR)
ADDRESS_LINE1           VARCHAR2(50 CHAR)
ADDRESS_LINE2           VARCHAR2(50 CHAR)
CITY                    VARCHAR2(30 CHAR)
POSTAL_CODE             VARCHAR2(20 CHAR)
COUNTRY                 VARCHAR2(30 CHAR)
CREATION_DATE  NOT NULL DATE
UPDATED_DATE   NOT NULL DATE
REGISTER_NUM   NOT NULL VARCHAR2(10 CHAR)
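
For reference, the JPA entity implied by that table presumably looks something like the sketch below (field names are inferred from the column names, so the real class may differ). Besides the Long id, that is nine String fields and two Date fields per record.

import java.io.Serializable;
import java.util.Date;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "CUSTOMER_PROFILE")
public class CustomerProfile implements Serializable {

    @Id
    private Long id;               // ID NUMBER(19)
    private String loginId;       // LOGIN_ID VARCHAR2(32 CHAR)
    private String firstname;     // FIRSTNAME VARCHAR2(50 CHAR)
    private String lastname;      // LASTNAME VARCHAR2(50 CHAR)
    private String addressLine1;  // ADDRESS_LINE1 VARCHAR2(50 CHAR)
    private String addressLine2;  // ADDRESS_LINE2 VARCHAR2(50 CHAR)
    private String city;          // CITY VARCHAR2(30 CHAR)
    private String postalCode;    // POSTAL_CODE VARCHAR2(20 CHAR)
    private String country;       // COUNTRY VARCHAR2(30 CHAR)
    private Date creationDate;    // CREATION_DATE DATE
    private Date updatedDate;     // UPDATED_DATE DATE
    private String registerNum;   // REGISTER_NUM VARCHAR2(10 CHAR)

    // getters and setters omitted
}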

Other points:

  • I have only one instance of the Hazelcast server running now, with heap space allocated as JAVA_OPTS=-Xmx8192m (8 GB). Before it was 4 GB, but when I got the heap space error I increased it to 8 GB; no luck.
  • For the time being, the MapLoader is executed when the map is accessed for the first time.
  • The table (CUSTOMER_PROFILE) has 12 columns, none of which are binary types. It holds just basic values like first name and last name.
  • The Hazelcast version used is 3.8.

The problems I now face are:

I am getting a heap space error (java.lang.OutOfMemoryError: Java heap space) when it fetches all the data and loads it into the map. The table now has 9 million rows in it.

It is also taking a lot of time to load the data; I can probably fix this by running multiple Hazelcast server instances.

I am a newbie with Hazelcast, so any help would be greatly appreciated :)

Comments:

  • What is the average size of a row in that table? – Stephen C
  • @StephenC It's approximately 100 bytes. – vipin cp
  • 100 bytes represented as what? – Stephen C
  • @StephenC You mean the datatype of the columns? – vipin cp
  • @StephenC I added the database table structure to the question as well. – vipin cp

1 Answer


It sounds to me like the real problem is that you have just too much data to hold in an 8GB heap.

You say you have 100 bytes of data on average per row represented as string data.

Here are some estimates¹ of the space needed to represent 9,000,000 rows of that data as a HashMap, assuming that each record holds 9 strings, 2 dates and an int.

  • In a 64 bit JVM, a String has an overhead of 48 bytes + 2 bytes per character. So 9 Java strings representing ~100 bytes of character data amount to roughly 650 bytes.
  • A Date is 32 bytes x 2 -> 64 bytes
  • A record representing 9 strings, 2 dates and one int will be 112 bytes.
  • A key (say an Integer) will be 24 bytes.
  • A HashMap entry will be 40 bytes.
  • (650 + 64 + 112 + 24 + 40) x 9,000,000 -> ~8,000,000,000 bytes
  • The HashMap's main array will be 2^24 x 8 bytes == ~134,000,000 bytes
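
A quick arithmetic check of those estimates (plain Java; the per-item figures are the ones listed above):

public class HeapEstimate {
    public static void main(String[] args) {
        long perEntry = 650 + 64 + 112 + 24 + 40;  // strings + dates + record + key + entry = 890 bytes
        long rows = 9_000_000L;
        long bucketArray = (1L << 24) * 8;         // HashMap's 2^24-slot reference array
        long total = perEntry * rows + bucketArray;
        System.out.println(total);                 // 8144217728 -> roughly 8.1 GB
    }
}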

As you can see, that is over 8 GB of actual data. Then take account of the fact that a Java heap needs a fair amount of working space; say 30% at least.

It is not at all surprising that you are getting OOMEs. My guesstimate is that your heap needs to be 50% larger ... and that assumes that your estimate of 100 bytes per row is accurate.


This is purely based on your loadAll method, which appears to materialize all rows in the database as a regular HashMap. It doesn't take any account of the heap space or other memory that Hazelcast uses for caching.

While you could just expand the heap, I think it would make more sense to change your code so that it doesn't materialize the rows like that. Whether that is viable will depend on how the map is used.
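
For example, if read-through caching is acceptable (instead of holding a full copy of the table in memory), you could let Hazelcast load entries on demand and cap the map's heap usage with eviction. The sketch below uses the 3.x programmatic Config API; the equivalent settings can also go in hazelcast.xml, and the 20% free-heap threshold is an arbitrary starting point, not a recommendation:

import com.hazelcast.config.Config;
import com.hazelcast.config.EvictionPolicy;
import com.hazelcast.config.MapConfig;
import com.hazelcast.config.MapStoreConfig;
import com.hazelcast.config.MaxSizeConfig;
import com.hazelcast.core.Hazelcast;

public class BoundedCacheMember {
    public static void main(String[] args) {
        Config config = new Config();
        MapConfig mapConfig = config.getMapConfig("com.sample.hazelcast.domain.CustomerProfile");
        mapConfig.setBackupCount(0);
        // Evict least-recently-used entries once free heap drops below 20%
        mapConfig.setEvictionPolicy(EvictionPolicy.LRU);
        mapConfig.setMaxSizeConfig(
                new MaxSizeConfig(20, MaxSizeConfig.MaxSizePolicy.FREE_HEAP_PERCENTAGE));
        // LAZY: entries are loaded on first access rather than all up front
        mapConfig.setMapStoreConfig(new MapStoreConfig()
                .setEnabled(true)
                .setInitialLoadMode(MapStoreConfig.InitialLoadMode.LAZY)
                .setClassName("com.sample.hazelcast.CustomerProfileLoader"));
        Hazelcast.newHazelcastInstance(config);
    }
}

With LAZY initial mode, the first access to each key pays a database round trip, so this trades startup time and heap for read latency.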


¹ I am assuming that you are using Java 8.