22
votes

This is an in-memory store (serving as a cache) that consists of nothing but a static ConcurrentHashMap (CHM).

All incoming HTTP request data is stored in this ConcurrentHashMap. An asynchronous scheduler process takes the data from the same ConcurrentHashMap and removes each key/value pair after storing it in the database.
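As a rough sketch of the design described above (the names RequestBuffer, store and drainOnce are hypothetical, and a plain list stands in for the real database):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RequestBuffer {
    // Static map acting as the in-memory buffer between HTTP threads and the database writer.
    private static final ConcurrentHashMap<String, String> BUFFER = new ConcurrentHashMap<>();

    // Called from a request-handling thread (hypothetical entry point).
    public static void store(String key, String payload) {
        BUFFER.put(key, payload);
    }

    // One pass of the asynchronous writer: persist each entry, then remove it.
    // Returns the number of entries removed from the buffer.
    public static int drainOnce(List<String> database) {
        int drained = 0;
        for (Map.Entry<String, String> e : BUFFER.entrySet()) {
            database.add(e.getValue()); // stand-in for the real INSERT
            // remove(key, value) only removes the entry if the mapping is unchanged.
            if (BUFFER.remove(e.getKey(), e.getValue())) {
                drained++;
            }
        }
        return drained;
    }

    public static int size() {
        return BUFFER.size();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> database = Collections.synchronizedList(new ArrayList<>());
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> drainOnce(database), 0, 100, TimeUnit.MILLISECONDS);
        for (int i = 0; i < 1000; i++) {
            store("req-" + i, "payload-" + i);
        }
        Thread.sleep(500);
        scheduler.shutdown();
        System.out.println("remaining in buffer: " + size() + ", written: " + database.size());
    }
}
```

Note that in this sketch the map itself drops its reference as soon as remove succeeds, so the buffer alone would not explain memory that stays occupied after the entries are written.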

This system runs fine and smoothly, but I just discovered that under the following conditions the memory was fully utilized (2.5GB) and all CPU time was spent performing GC:

-1000 concurrent HTTP hits per second

-the same load sustained for a period of 15 minutes

The asynchronous process logs the remaining size of the CHM every time it writes to the database. CHM.size() stays between roughly 300 and 3500.

I thought there was a memory leak in this application, so I used Eclipse MAT to look at the heap dump. After running the Leak Suspects report, I got these comments from MAT:

One instance of "org.apache.catalina.session.StandardManager" loaded by "org.apache.catalina.loader.StandardClassLoader @ 0x853f0280" occupies 2,135,429,456 (94.76%) bytes. The memory is accumulated in one instance of "java.util.concurrent.ConcurrentHashMap$Segment[]" loaded by "".

3,646,166 instances of java.util.concurrent.ConcurrentHashMap$Segment retain >= 2,135,429,456 bytes.

and

Length    # Objects      Shallow Heap      Retained Heap 
0         3,646,166      482,015,968       >= 2,135,429,456 

I interpret the length 0 above as the empty entries left inside the CHM (one each time I call the CHM.remove() method). It is consistent with the number of records in the database: 3,646,166 records were in the database when this dump was created.

The strange part is this: if I pause the stress test, heap usage gradually drops to 25MB. This takes about 30-45 minutes. I have re-run the simulation and the curve looks similar to the VisualVM graph below: [image: VisualVM heap graph]

Here are the questions:

1) Does this look like a memory leak?

2) Each call to remove(Object key, Object value) removes a <key:value> pair from the CHM. Does the removed object get garbage collected?

3) Does this have something to do with the GC settings? I have added the following GC parameters, but they did not help:

-XX:+UseParallelGC

-XX:+UseParallelOldGC

-XX:GCTimeRatio=19

-XX:+PrintGCTimeStamps

-XX:ParallelGCThreads=6

-verbose:gc

4) Any ideas on how to resolve this are very much appreciated! :)

NEW 5) Could it be because all my references are hard references? My understanding is that once the HTTP session has ended, all variables that are not static become available for GC.

NEW Note: I tried replacing the CHM with ehcache 2.2.0, but I get the same OutOfMemoryError problem. I suppose ehcache is also using ConcurrentHashMap.

Server Spec:

-Xeon Quad core, 8 threads.

-4GB Memory

-Windows 2008 R2

-Tomcat 6.0.29

5
How hard would it be to replace the hash map with an instance of EhCache? These libraries are optimized for this kind of task. - Boris Pavlović
At the moment we are trying not to change the existing code much because we have yet to analyse the impact. EhCache was considered initially but somehow was not chosen as the implementation. - Reusable

5 Answers

11
votes

This problem bugged me for a solid 7 days, and I finally found the real cause! Below is what I tried that failed to solve the OutOfMemoryError:

-changed from ConcurrentHashMap to ehcache (it turns out ehcache also uses ConcurrentHashMap)

-changed all the hard references to soft references

-overrode AbstractMap alongside ConcurrentHashMap, as suggested by Dr. Heinz M. Kabutz

The million-dollar question is really: why does memory start being released back to the heap pool 30-45 minutes later?

The actual root cause was that something else was still holding on to the session data, and the culprit was the HTTP session within Tomcat, which was still active! Even though the HTTP request had completed, with a timeout setting of 30 minutes Tomcat holds the session information for 30 minutes before the JVM can GC it. The problem was solved immediately after changing the timeout setting to 1 minute as a test.

$tomcat_folder\conf\web.xml

<session-config>
    <session-timeout>1</session-timeout>
</session-config>

Hope this helps anyone out there with a similar problem.

10
votes

I think you have more session data than fits in memory at once. Try this:

  1. Edit bin/setenv.sh, or wherever the JVM args are set in your Tomcat launcher:

    Append -Dorg.apache.catalina.session.StandardSession.ACTIVITY_CHECK=true

    e.g.

    # Default Java options
    if [ -z "$JAVA_OPTS" ]; then
            JAVA_OPTS="-server -Djava.awt.headless=true -XX:MaxPermSize=384m -Xmx1024m -Dorg.apache.catalina.session.StandardSession.ACTIVITY_CHECK=true"
    fi
    
  2. Edit conf/context.xml, before </Context> add this:

    <Manager className="org.apache.catalina.session.PersistentManager"
            maxIdleBackup="60" maxIdleSwap="300">
        <Store className="org.apache.catalina.session.FileStore"/>
    </Manager>
    

Restart Tomcat and your problem should be gone, since it will swap idle sessions out to the filesystem instead of keeping them all in memory.

In my view, setting session-timeout = 1 is a workaround that masks the root of the problem, and it is unusable in most apps, where you actually need a sufficiently long session-timeout. Our (Bippo's) apps usually have a session-timeout of 2880 minutes, i.e. 2 days.

Reference: Tomcat 7.0 Session Manager Configuration

3
votes

1) Does this look like a memory leak?

Yes, if the application keeps on putting objects in the map and never removes them, then that could very well be a memory leak.

2) Each call to remove(Object key, Object value) removes a <key:value> pair from the CHM. Does the removed object get garbage collected?

Objects can only be garbage collected when they are no longer reachable from any live reference, such as a running thread's stack or a static field. The map is only one place where there is a reference to the object. There could still be other places that reference the same object, in which case removing it from the map is not enough. Keeping the object in the map will always prevent it from being garbage collected.
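A minimal sketch of that point, with hypothetical names: CACHE stands in for the application's static map and SESSION_ATTRIBUTES for any second holder (for example, a container-managed session). It does not trigger GC; it only shows that a strong reference survives the remove call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReachabilityDemo {
    // Stand-in for the application's static cache.
    static final Map<String, byte[]> CACHE = new ConcurrentHashMap<>();
    // Stand-in for a second holder, e.g. a session's attribute map (assumption).
    static final Map<String, byte[]> SESSION_ATTRIBUTES = new ConcurrentHashMap<>();

    // Returns true if the payload is still strongly referenced after
    // it has been removed from the cache.
    static boolean stillReferencedAfterRemove() {
        byte[] payload = new byte[1024];
        CACHE.put("k", payload);
        SESSION_ATTRIBUTES.put("k", payload);

        // The cache lets go of the object...
        CACHE.remove("k", payload);

        // ...but the other holder still points at the same instance,
        // so the collector cannot reclaim it.
        return !CACHE.containsKey("k") && SESSION_ATTRIBUTES.get("k") == payload;
    }

    public static void main(String[] args) {
        System.out.println("still referenced: " + stillReferencedAfterRemove());
    }
}
```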

3) Does this have something to do with the GC settings?

No; if an object is referenced, it cannot be garbage collected; it doesn't matter how you tweak the garbage collector.

1
votes

Of course, it is too late to answer, but this is for other people who find this question by searching. It might be useful.

These 2 links are very useful:
https://issues.apache.org/bugzilla/show_bug.cgi?id=50685
http://wiki.apache.org/tomcat/OutOfMemory

Briefly, in most cases the cause is a flawed test or flawed test software. When custom software opens a URL and cannot manage the HTTP session, Tomcat creates a new session for each request. This can be checked with some simple code added to a JSP:

System.out.println("session id: " + session.getId());
System.out.println("session obj: " + session);
System.out.println("session getCreationTime: " + (new Date(session.getCreationTime())).toString());
System.out.println("session.getValueNames().length: " + session.getValueNames().length);

If the session ID stays the same for one user, from the load test's point of view, that is fine. If each request generates a new session ID, the testing software does not manage sessions properly and the test result does not represent load from real users.
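To illustrate the difference on the client side, here is a small self-contained sketch (Java 11+). It is not Tomcat: a tiny local server from the JDK's com.sun.net.httpserver package mimics a container by issuing a JSESSIONID cookie whenever none is presented, and the class and method names are hypothetical. A client with a CookieManager sends the cookie back and reuses one session; a client without one causes a new session per request.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.CookieManager;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.atomic.AtomicInteger;

public class SessionReuseDemo {
    // Sends `requests` GETs to a local test server and returns how many
    // "sessions" the server had to create.
    static int sessionsCreated(boolean keepCookies, int requests) {
        AtomicInteger created = new AtomicInteger();
        HttpServer server = null;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                // Mimic a servlet container: new session cookie if none was sent.
                String cookie = exchange.getRequestHeaders().getFirst("Cookie");
                if (cookie == null || !cookie.contains("JSESSIONID=")) {
                    int id = created.incrementAndGet();
                    exchange.getResponseHeaders().add("Set-Cookie", "JSESSIONID=s" + id + "; Path=/");
                }
                exchange.sendResponseHeaders(204, -1); // no response body
                exchange.close();
            });
            server.start();

            HttpClient.Builder builder = HttpClient.newBuilder();
            if (keepCookies) {
                builder.cookieHandler(new CookieManager()); // stores and resends cookies
            }
            HttpClient client = builder.build();
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            for (int i = 0; i < requests; i++) {
                client.send(HttpRequest.newBuilder(uri).GET().build(),
                        HttpResponse.BodyHandlers.discarding());
            }
            return created.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("with cookies:    " + sessionsCreated(true, 5));
        System.out.println("without cookies: " + sessionsCreated(false, 5));
    }
}
```

A load-testing tool configured like the second client would fill the container with one session per request, each kept alive until the session timeout expires.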

For some applications session.getValueNames().length is also important. For example, when a normal user works it remains the same, but when load testing software does the same, it grows. This also means that the load testing software does not represent the real workload very well. In my case session.getValueNames().length for a normal user was about 100, but with load testing software it was about 500 after 10 minutes, and finally the system crashed with the same OutOfMemory error and MAT showed the same:

org.apache.catalina.loader.StandardClassLoader @ 0x853f0280" occupies 2,135,429,456 (94.76%) bytes.

0
votes

If you get this exception and are using Spring Boot version 1.4.4.RELEASE or lower, set the value of the property "server.session-timeout" in minutes, rather than in the suggested seconds, so that the sessions on the heap are cleaned up in time. Alternatively, you can use an EmbeddedServletContainerCustomizer bean, but the provided value will also be interpreted in minutes.

Example (session timeout of 10 minutes):

server.session-timeout=10 (set in properties file)
container.setSessionTimeout(10, TimeUnit.SECONDS); (set in EmbeddedServletContainerCustomizer)