I have this program:

    public static void main(String[] args) {
        int sum = 0;
        LinkedList<Integer> ll = new LinkedList<>();
        for(int i=0;i<Long.MAX_VALUE;i++) {
            sum += i;
            if (sum % 3 == 0){
                ll.add(i);
                if (ll.size() % 1000 == 0)
                    System.out.println("Linked List size: " + ll.size());
            }
        }
    }

What I expected to see was Integer objects being created in the young generation, with the ones added to the linked list eventually moving to the old generation. So I expected young-generation GCs to happen regularly, with objects being moved to the survivor spaces and from there to the old generation. Instead, I find that old-generation GC is happening all the time and young-generation GC is hardly happening at all. Is this some kind of optimization by the JVM, where the objects are created directly in the old generation? As the image below shows, young GC has happened only twice, while old GC has happened 41 times.

[Image: Old Generation GC only]

Next I tried the same code, except that instead of adding the Integer object to a linked list I just created a new Object(), and to my surprise there were no young or old GCs at all.

    public static void main(String[] args) {
        int sum = 0;
        LinkedList<Integer> ll = new LinkedList<>();
        for(int i=0;i<Long.MAX_VALUE;i++) {
            sum += i;
            Object obj = new Object();
        }
    }

[Image: No Young or Old GC]

Then I created random string objects:

    public static void main(String[] args) {
        int sum = 0;
        LinkedList<Integer> ll = new LinkedList<>();
        for(int i=0;i<Long.MAX_VALUE;i++) {
            sum += i;
            String s = String.valueOf(Math.random());
        }
    }

Now I see the familiar see-saw pattern, with the objects being transferred to the survivor spaces:

    5.833: [GC (Allocation Failure) [PSYoungGen: 1048576K->1984K(1223168K)] 1048576K->2000K(4019712K), 0.0035789 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
    12.678: [GC (Allocation Failure) [PSYoungGen: 1050560K->2000K(1223168K)] 1050576K->2024K(4019712K), 0.0023286 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
    18.736: [GC (Allocation Failure) [PSYoungGen: 1050576K->1968K(1223168K)] 1050600K->2000K(4019712K), 0.0016530 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
    24.346: [GC (Allocation Failure) [PSYoungGen: 1050544K->2000K(1223168K)] 1050576K->2040K(4019712K), 0.0016131 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
    31.257: [GC (Allocation Failure) [PSYoungGen: 1050576K->1952K(1223168K)] 1050616K->2000K(4019712K), 0.0018461 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
    38.519: [GC (Allocation Failure) [PSYoungGen: 1050528K->1984K(1395712K)] 1050576K->2040K(4192256K), 0.0022490 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
    47.998: [GC (Allocation Failure) [PSYoungGen: 1395648K->256K(1394176K)] 1395704K->2153K(4190720K), 0.0024607 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]

So my question is: is the GC smart enough to see that the created objects are not used anywhere and discard them? And if so, why not with Strings?

1 Answer

In your first example, you are constantly adding objects to the LinkedList, which is the oldest object in your application code. As soon as this list has been promoted to the old generation, adding an object to it means modifying a member of the old generation. This implies that the next garbage collection has to check whether newly created objects are reachable from the old object, which is actually reasonable, as many of them are (about two out of every three values of i pass the sum % 3 == 0 test, at least until sum overflows).
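To check which collectors actually run without relying on a monitoring tool, the JVM's collection counters can be read from inside the program. The following is a minimal sketch, not the asker's setup: the class name GcCountProbe and the bounded fillList helper are made up for illustration, and the collector names it prints depend on which GC the JVM was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.Map;

class GcCountProbe {
    /** Snapshot of the collection count of every collector, keyed by collector name. */
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    /** Bounded variant of the question's first loop (hypothetical helper). */
    static LinkedList<Integer> fillList(int iterations) {
        int sum = 0;
        LinkedList<Integer> ll = new LinkedList<>();
        for (int i = 0; i < iterations; i++) {
            sum += i;
            if (sum % 3 == 0) {
                ll.add(i);
            }
        }
        return ll;
    }

    public static void main(String[] args) {
        Map<String, Long> before = collectionCounts();
        LinkedList<Integer> ll = fillList(5_000_000);
        Map<String, Long> after = collectionCounts();

        System.out.println("Linked List size: " + ll.size());
        // Print how many collections each collector performed during the loop.
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            System.out.println(e.getKey() + ": " + delta + " collections");
        }
    }
}
```

Running it with different heap sizes (for example via -Xmx) should show how the balance between young and old collections shifts as the retained list grows.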

In your second example, you are just creating an Object without any side effect. The HotSpot optimizer can eliminate such an allocation, and after that there is no garbage at all. In fact, the entire loop could be eliminated, since the additions could be replaced by a single multiplication. But whether that happens or not is irrelevant to the garbage collector activity (or the lack of it).
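One way to observe this elimination is to measure how many bytes the current thread allocates while running the loop. This is only a sketch under assumptions: it relies on the HotSpot-specific com.sun.management.ThreadMXBean interface, the class name AllocationProbe and the iteration counts are arbitrary, and the exact numbers will vary by JVM version and settings.

```java
import java.lang.management.ManagementFactory;

class AllocationProbe {
    /** Bytes allocated so far by the current thread (HotSpot-specific API). */
    static long allocatedBytes() {
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        return bean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    /** Bounded variant of the question's second loop. */
    static int loop(int iterations) {
        int sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += i;
            Object obj = new Object(); // never escapes, so the JIT may remove it
        }
        return sum;
    }

    public static void main(String[] args) {
        // Warm-up runs, so that the JIT has a chance to compile loop().
        for (int i = 0; i < 20; i++) {
            loop(1_000_000);
        }
        long before = allocatedBytes();
        loop(1_000_000);
        long after = allocatedBytes();
        System.out.println("Bytes allocated by measured run: " + (after - before));
    }
}
```

If the allocation is being optimized away, the reported number should be small compared to one million objects. Starting the JVM with -XX:-DoEscapeAnalysis or -Xint would be expected to bring the allocations back, which makes for a useful comparison.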

Your third example invokes Math.random(), which has a globally visible effect that cannot be removed. Regardless of whether you use the returned number, the call advances the state of the shared global random number generator that Math.random() uses internally. I suppose the arithmetic of the pseudorandom number generator is too complex to allow the loop to be transformed into a single computation step.
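The side effect is easy to demonstrate with a seeded java.util.Random, which is the kind of generator Math.random() is documented to use internally. In this small sketch (the class and method names are made up for illustration), discarding a drawn value still changes what the next draw returns, so the call cannot simply be dropped:

```java
import java.util.Random;

class RandomSideEffect {
    /** First value produced by a generator with the given seed. */
    static double firstDraw(long seed) {
        return new Random(seed).nextDouble();
    }

    /** Second value: the first draw is thrown away, but it still advances the state. */
    static double secondDraw(long seed) {
        Random rng = new Random(seed);
        rng.nextDouble(); // result unused, state change is not
        return rng.nextDouble();
    }

    public static void main(String[] args) {
        System.out.println("first draw:  " + firstDraw(42L));
        System.out.println("second draw: " + secondDraw(42L));
    }
}
```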

In principle, the creation of the unused String instances could still be eliminated, but it seems that being interleaved with the non-removable code hits a limitation of the optimizer. It might also be that, given the complexity of the random number generation code, the creation of a temporary string was not considered relevant to performance here.

So you see, the second and third examples have little to do with the garbage collector and much more to do with the HotSpot optimizer. Further, regarding your first example, you have to consider how a garbage collector works. Despite its name, it is not processing the garbage but the live objects, to find out which objects are referenced by live objects and hence are alive themselves. So it doesn't matter in which generational space the objects are created, but which objects can potentially reach them. If no old object has been modified since the last collection, a local collection can be performed, as the unmodified old objects can't reach the young ones. But if an old object has been modified, its references have to be traversed to find out whether it references new objects.
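The rule described above can be sketched as a toy model. This is not how HotSpot is implemented (real collectors track modified old memory with write barriers and card tables); the Obj class, the modifiedOld set and the scenario helper are all hypothetical names used only to illustrate that a young-only collection traces from the roots plus the old objects modified since the last collection.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class MinorGcModel {
    static final class Obj {
        final String name;
        final boolean old; // true if the object lives in the old generation
        final List<Obj> refs = new ArrayList<>();
        Obj(String name, boolean old) { this.name = name; this.old = old; }
    }

    /** Names of the young objects found alive by a young-only collection. */
    static Set<String> minorCollect(List<Obj> roots, Set<Obj> modifiedOld) {
        Deque<Obj> work = new ArrayDeque<>(roots);
        // Only old objects modified since the last collection get their references scanned.
        for (Obj o : modifiedOld) {
            work.addAll(o.refs);
        }
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Set<String> survivors = new TreeSet<>();
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (o.old || !seen.add(o)) {
                continue; // old space is not traversed; young objects are visited once
            }
            survivors.add(o.name);
            work.addAll(o.refs);
        }
        return survivors;
    }

    /** An old list, a young object on the stack, and optionally a young object added to the list. */
    static Set<String> scenario(boolean addToOldList) {
        Obj list = new Obj("list", true);
        Obj a = new Obj("a", false);
        Obj b = new Obj("b", false);
        Obj c = new Obj("c", false); // never referenced: garbage
        Set<Obj> modifiedOld = new HashSet<>();
        if (addToOldList) {
            list.refs.add(a);
            modifiedOld.add(list); // stands in for the write barrier recording the change
        }
        return minorCollect(Arrays.asList(list, b), modifiedOld);
    }

    public static void main(String[] args) {
        System.out.println("old list untouched: " + scenario(false));
        System.out.println("old list modified:  " + scenario(true));
    }
}
```

In the first scenario only the object held by a root survives and the old list is never looked at. In the second, the modified list has to be scanned, which is how the object added to it is found alive.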