Why would a java heap allocation resizing cause an OOME?
We are seeing OutOfMemoryExceptions in our logs, and they seem to coincide with the java heap commit size growing from ~1G to ~2.4G. Despite the error message, it does not appear that we are running out of heap space. Aside from the exception being thrown (and heap dump generated), the resize seems to eventually succeed and the application continues on without a problem (with a ~2.4G heap commit size).
Here's an example of the log output:
INFO | jvm 1 | 2013/08/16 12:08:05 | [GC [PSYoungGen: 328000K->2997K(339200K)] 645686K->320683K(1038272K), 0.0101580 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
INFO | jvm 1 | 2013/08/16 12:09:14 | [GC [PSYoungGen: 331509K->3487K(338816K)] 649195K->322153K(1037888K), 0.0115600 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
INFO | jvm 1 | 2013/08/16 12:09:59 | [GC [PSYoungGen: 331999K->2928K(340032K)] 650665K->322608K(1039104K), 0.0099300 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
INFO | jvm 1 | 2013/08/16 12:10:48 | [GC [PSYoungGen: 333104K->2723K(339648K)] 652784K->323240K(1038720K), 0.0100130 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
INFO | jvm 1 | 2013/08/16 12:11:28 | [GC [PSYoungGen: 332885K->3884K(340864K)] 653402K->325089K(1039936K), 0.0106250 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
INFO | jvm 1 | 2013/08/16 12:11:39 | [GC [PSYoungGen: 23694K->463K(340352K)] 344899K->323656K(2437504K), 0.0070330 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
INFO | jvm 1 | 2013/08/16 12:11:39 | [GC [PSYoungGen: 463K->0K(340608K)] 323656K->323592K(2437760K), 0.0044440 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
INFO | jvm 1 | 2013/08/16 12:11:39 | [Full GC
INFO | jvm 1 | 2013/08/16 12:11:40 | [PSYoungGen: 0K->0K(340608K)] [PSOldGen: 323592K->323592K(699072K)] 323592K->323592K(1039680K) [PSPermGen: 159297K->159297K(262144K)], 1.2076900 secs] [Times: user=1.20 sys=0.00, real=1.21 secs]
INFO | jvm 1 | 2013/08/16 12:11:40 | [GC [PSYoungGen: 0K->0K(340736K)] 323592K->323592K(2437888K), 0.0046330 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
INFO | jvm 1 | 2013/08/16 12:11:40 | [Full GC
INFO | jvm 1 | 2013/08/16 12:11:42 | [PSYoungGen: 0K->0K(340736K)] [PSOldGen: 323592K->279953K(744512K)] 323592K->279953K(1085248K) [PSPermGen: 159297K->159062K(262144K)], 1.7593100 secs] [Times: user=1.75 sys=0.00, real=1.76 secs]
INFO | jvm 1 | 2013/08/16 12:11:42 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 1 | 2013/08/16 12:11:42 | Dumping heap to java_pid28908.hprof ...
INFO | jvm 1 | 2013/08/16 12:11:48 | Heap dump file created [463314899 bytes in 6.037 secs]
INFO | jvm 1 | 2013/08/16 12:12:36 | [GC [PSYoungGen: 331840K->6044K(352192K)] 611793K->285998K(2449344K), 0.0164060 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
INFO | jvm 1 | 2013/08/16 12:13:28 | [GC [PSYoungGen: 352156K->6161K(364160K)] 632110K->286114K(2461312K), 0.0152330 secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
INFO | jvm 1 | 2013/08/16 12:14:47 | [GC [PSYoungGen: 364113K->6575K(374144K)] 644066K->288169K(2471296K), 0.0179930 secs] [Times: user=0.02 sys=0.01, real=0.02 secs]
Notice that immediately prior to the OOME, the total committed heap oscillates between 1GB and 2.4GB. We can see that it was pretty steady at 1GB beforehand and pretty steady at 2.4GB afterwards.
Among the javaopts for this 1.6.0._24 JVM are:
- -Xmx3072m
- -XX:+HeapDumpOnOutOfMemoryError
- -XX:-UseGCOverheadLimit
- -verbose:gc
- -Xss256k
- -XX:MaxPermSize=256m
- -server
- -XX:+PrintGCDetails
The JVM is running 1.6.0._24. We can't change versions now, but will have a window to do so in the next month or two. If 1.6.0_45 is that much more stable, we will aim to switch to that. We're currently testing it.
The machine only has 4GB of total system ram. In addition, there is a small RAM-disk in use as well. I'm concerned that the Xmx setting is already too high for this environment.
This is puzzling us, as it doesn't seem like the heap usage is very big at the time of the exception. Why are we getting this OOME?
UPDATE: We are attempting to prevent this condition by setting the initial memory (Xms) equal to the maximum memory (Xmx). So far, these experiments have been promising though we've yet to introduce the change in production. It still doesn't explain why the OOME occurred in the first place, though it does seem to indicate that the OOME can be avoided without increasing the max heap size (or reducing the app memory footprint). So the mystery remains of why the heap resizing caused an OOME?