I have two questions. And one of them will be ragarding the topic:)
1) I have faced with a problem of not being able to find the full information about how different garbage collector work in HotSpot. But I am not talking about general descriptions of garbage collector's work (we have a lot of this information on the internet), I am talking about concrete algorithms. I have found this whitepaper (Memory managment in the Java HotSpot Virtual Machine) http://www.oracle.com/technetwork/java/javase/tech/memorymanagement-whitepaper-1-150020.pdf . But it has only general ideas. It has a good description (probably it is not so good - see my second question) of the Parallel compacting algorithm (I mean parallel mark-sweep-compact) but it doesn't explain other garbage collector's algorithms. However this whitepaper is the best information I was able to find on the Internet. What I would like to know is where to get a full description/information about how different garbage collectors (for young gen I mean: ParNew, DefNew, PSYoungGen; for old gen: PSOLdGen, ParOldGen, Concurrent-Mark-Sweep) work. Can't believe that this information is not available to the users.
2) The question regarding the Parallel Compacting Collector algorithm (ParOldGen or Parallel Mark-Sweep-Compact). The whitepaper (see the first question) has a description of it's work. Let me paste a quote from the whitepaper (please, spend a minute to take a look on it):
The things I cannot understand are listed below:
Regarding summary phase:
Due to compactions from previous collections, it is typical that some portion of the left side of each generation will be dense, containing mostly live objects. The amount of space that could be recovered from such dense regions is not worth the cost of compacting them.
well, does it mean that when we have a region which consist of 98-99% of live objects and 2-1% of dead objects (in other words a very small percentage of the dead objects) than compacting of this region is not worth the space that could be recovered from such a region. However this tiny free spaces (holes) will be eventually filled up and there will be no holes after garbage collections is finished.
So the first thing the summary phase does is examine the density of the regions, starting with the leftmost one, until it reaches a point where the space that could be recovered from a region and those to the right of it is worth the cost of compacting those regions.
well, if we have a big percentage of the dead objects than this region is worth compacting, right?
The regions to the left of that point are referred to as the dense prefix, and no objects are moved in those regions.
"and no objects are moved in those regions", but there might be some little free spaces in those regions, am I right? Can't uderstand the point
The regions to the right of that point will be compacted, eliminating all dead space.
please clarify how they will be compacted. Every region will be compacted separately? I guess no. So maybe some kind of shifting will be here?
The summary phase calculates and stores the new location of the first byte of live data for each compacted region.
to understand it I need to understand the previous question I suppose.
Regarding compaction phase:
In the compaction phase, the garbage collection threads use the
summary data to identify regions that need to be filled, and the
threads can independently copy data into the regions. This produces a heap that is densely packed on one end, with a single large empty
block at the other end.
I am totally confused. So no compaction happened on the "summary phase"? Was the previous phase's purpose only to find all free spaces?
Please help me to get a clear picture.