14
votes

When G1 decides it needs to start doing mixed collections, it aggressively shrinks our Eden space from 10g to about 1g.

{Heap before GC invocations=294 (full 0):
 garbage-first heap   total 20480000K, used 18304860K [0x00000002de000000, 0x00000002de804e20, 0x00000007c0000000)
  region size 8192K, 1363 young (11165696K), 11 survivors (90112K)
 Metaspace       used 37327K, capacity 37826K, committed 38096K, reserved 1083392K
  class space    used 3935K, capacity 4081K, committed 4096K, reserved 1048576K
2016-03-31T20:57:31.002+0000: 7196.427: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 717225984 bytes, new threshold 1 (max 1)
- age   1:   41346816 bytes,   41346816 total
 7196.427: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 144693, predicted base time: 48.88 ms, remaining time: 951.12 ms, target pause time: 1000.00 ms]
 7196.427: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1352 regions, survivors: 11 regions, predicted young region time: 20.72 ms]
 7196.427: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 1352 regions, survivors: 11 regions, old: 0 regions, predicted pause time: 69.60 ms, target pause time: 1000.00 ms]
 7196.494: [G1Ergonomics (Mixed GCs) start mixed GCs, reason: candidate old regions available, candidate old regions: 789 regions, reclaimable: 4703761904 bytes (22.43 %), threshold: 5.00 %]
, 0.0673540 secs]
   [Parallel Time: 60.1 ms, GC Workers: 18]
      [GC Worker Start (ms): Min: 7196427.8, Avg: 7196428.1, Max: 7196428.2, Diff: 0.4]
      [Ext Root Scanning (ms): Min: 7.3, Avg: 7.5, Max: 7.7, Diff: 0.4, Sum: 134.2]
      [Update RS (ms): Min: 28.2, Avg: 28.8, Max: 29.9, Diff: 1.7, Sum: 518.4]
         [Processed Buffers: Min: 41, Avg: 57.7, Max: 80, Diff: 39, Sum: 1039]
      [Scan RS (ms): Min: 0.1, Avg: 0.2, Max: 0.5, Diff: 0.4, Sum: 3.7]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [Object Copy (ms): Min: 22.1, Avg: 23.1, Max: 23.4, Diff: 1.3, Sum: 416.2]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 18]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 2.5]
      [GC Worker Total (ms): Min: 59.5, Avg: 59.7, Max: 60.0, Diff: 0.5, Sum: 1075.1]
      [GC Worker End (ms): Min: 7196487.7, Avg: 7196487.8, Max: 7196487.9, Diff: 0.2]
   [Code Root Fixup: 0.2 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 1.9 ms]
   [Other: 5.2 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.5 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.5 ms]
      [Humongous Register: 0.2 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 2.3 ms]
   [Eden: 10.6G(10.6G)->0.0B(848.0M) Survivors: 88.0M->152.0M Heap: 17.5G(19.5G)->7128.3M(19.5G)]
Heap after GC invocations=295 (full 0):
 garbage-first heap   total 20480000K, used 7299344K [0x00000002de000000, 0x00000002de804e20, 0x00000007c0000000)
  region size 8192K, 19 young (155648K), 19 survivors (155648K)
 Metaspace       used 37327K, capacity 37826K, committed 38096K, reserved 1083392K
  class space    used 3935K, capacity 4081K, committed 4096K, reserved 1048576K
}
 [Times: user=1.09 sys=0.00, real=0.07 secs]
2016-03-31T20:57:31.070+0000: 7196.495: Total time for which application threads were stopped: 0.0699324 seconds, Stopping threads took: 0.0003462 seconds

This is after it's been running with 10-11g of Eden for 60 or more collections.

Here are the appropriate JVM GC parameters we're running with

-Xms20000m -Xmx20000m
-XX:+UseG1GC 
-XX:G1RSetUpdatingPauseTimePercent=5 
-XX:MaxGCPauseMillis=1000 
-XX:GCTimeRatio=99 
-XX:InitiatingHeapOccupancyPercent=35 
-XX:MaxTenuringThreshold=1 
-XX:G1ConcRefinementThreads=6 
-XX:ConcGCThreads=18 
-XX:ParallelGCThreads=18

-XX:+PrintGCDetails"
-XX:+PrintGCDateStamps"
-XX:+PrintHeapAtGC"
-XX:+PrintTenuringDistribution"
-XX:+PrintGCApplicationStoppedTime"
-XX:+PrintPromotionFailure"
-XX:+PrintAdaptiveSizePolicy"

According to page 55 of this presentation, it needs to resize Eden so that max pause target accounts for the entire heap, not just to the new generation. Why is the collector being so aggressive?

Average young generation pause times are between 50-150ms for a heap size of 10g. If the presentation is correct (I haven't found anything else to support the statement), I would expect shrinking by half (20g heap), not 10x.

2
I've just read through that presentation (it's quite good really), but I don't see the same things on slide 55 that you mention in your post. And, it shows an example of eden reducing from 12.5gb to <1gb, which is rather similar to yours. Can you explain in more detail why you expect eden to shrink less than it has?Software Engineer
And, maybe consider adding the print-adaptive-size-policy flag.Software Engineer
@EngineerDollery The PrintAdaptiveSizePolicy parameter is enabled (added to question). In the presentation, issue #3, it states "Must account for old regions, not only young, Currently G1 shrinks the young generation". Ideally, I wouldn't expect the generation to shrink at all. We've set our pause goal to 1000ms and collections are much quicker than that. Having to collect twice the garbage (which is already marked) shouldn't take more than twice the time.Savior
you're putting lots of constraints on G1. are you certain that each parameter is justified as it is right now? And if the new gen shrinking really is the concern, why not set a minimum size for it?the8472
@the8472 We can have that discussion in chat (though you'd have to create it, I don't have enough reputation).Savior

2 Answers

12
votes

You can find answer for your query in slide no : 56

Young generation shrunk by a factor of 20X

So shrinking by a factory of 10X is not a surprise.

From infoQ article by Monica Beckwith on tips for tuning G1GC :

The Full GC could have been avoided by letting the nursery / young gen shrink to the default minimum (5% of the total Java heap)

Since you have not set young gen size explicitly, default value is

-XX:G1NewSizePercent=5

Sets the percentage of the heap to use as the minimum for the young generation size.

So to respect your pause time goal of

-XX:MaxGCPauseMillis=1000 

the young gen can shrink up-to 5% of total heap.

I have found one good google group article regarding G1GC at https://groups.google.com/a/jclarity.com/forum/#!msg/friends/hsZiz6HTm9M/klrRjBclCwAJ

If G1 predicted pause time goal is larger than the target pause time goal, then shrink young generation, but no more than G1NewSizePercent of the current Java heap size, (not the max size). Again, overall Java heap will grow (or shrink) based on the value computed GC time ratio versus value of GCTimeRatio.

Note: G1NewSizePercent, and G1MaxNewSizePercent are not to be confused with NewSize or MaxNewSize.

G1NewSizePercent and G1MaxNewSizePercent put a lower and upper bound respectively on how small or how large young gen can be sized by G1.

On a different note, you have configured many parameters which may be un-necessary sine G1GC works fine if most of the default parameters have been set to default values. Refer to this SE question for more details.

Java 7 (JDK 7) garbage collection and documentation on G1

In summary: Depending on pause time goal, young gen will shrink. If you are really worried about shrinking of young gen to low value, configure -XX:G1NewSizePercent. But I won't recommend it as long as -XX:MaxGCPauseMillis has been met

EDIT:

From G1GC ergonomics page,

It is typical that the size of the heap will oscillate as the garbage collector tries to satisfy competing goals. This is true even if the application has reached a steady state. The pressure to achieve a throughput goal (which may require a larger heap) competes with the goals for a maximum pause time and a minimum footprint (which both may require a small heap).

0
votes

With MaxTenuringThreshold=1 you are telling the GC to promoted an object to the Old Generation after surviving only 1 GC cycle. In most of the application this will lead to many premature promotions. If too many objects are promoted too early your Old Gen will grow rapidly thus it has to shrink the Youg Gen to make space for the Old Gen