67
votes

What are the roots in garbage collection?

I have read the definition of root as "any reference that you program can access to" and definition of live is that an object that is being used, which can be a local variable, static variable.

I m little confused with discriminating the difference between root and live objects.

What is path to root? How does root and live objects work?

Can someone elaborate ?

5
What crummy definitions :) I would start at Garbage Collectionuser166390
@user177833 - where did you read those definitions?Stephen C
the definition in that page for the root is: "any object reference your program can access directly, without going through another object". That is vastly different from "any reference that you program can access to". It is very specific in that your program holds the references to the said managed object, and that your program does not need to traverse the heap to arrive at the root.Vineet Reynolds
you'll need to visualize the JVM/CLR as the actual processes that manage the heap. The only objects in the heap, that the process is aware of, is the set of thread stack frames under execution, the classes that have been loaded, amongst a few others. This is the GC root; every other object in the heap is either reachable or unreachable from this set.Vineet Reynolds

5 Answers

89
votes

If you think of the objects in memory as a tree, the "roots" would be the root nodes - every object immediately accessible by your program.

Person p = new Person();
p.car = new Car(RED);
p.car.engine = new Engine();
p.car.horn = new AnnoyingHorn();

There are four objects; a person, a red car, its engine and horn. Draw the reference graph:

     Person [p]
        |
     Car (red)
   /           \
Engine    AnnoyingHorn

And you'll end up with Person at the "root" of the tree. It's live because it's referenced by a local variable, p, which the program might use at any time to refer to the Person object. This also goes for the other objects, through p.car, p.car.engine, etc.

Since Person and all other objects recursively connected to it are live, there would be trouble if the GC collected them.

Consider, however, if the following is run after a while:

p.car = new Car(BLUE);

And redraw the graph:

     Person [p]
        |
     Car (blue)       Car (red)
                    /           \
                Engine    AnnoyingHorn

Now the Person is accessible through p and the blue car through p.car, but there is no way the red car or its parts can ever be accessed again - they are not connected to a live root. They can be safely collected.

So it's really a matter of taking every starting point (every local variable, globals, statics, everything in other threads and stack frames) — every root — and recursively following all the references to make up a list of all the "live" objects: objects which are in use and unsuitable for deletion. Everything else is garbage, waiting to be collected.

39
votes

The GC (Garbage Collector) roots are objects special for garbage collector. The Garbage Collector collects those objects that are not GC roots and are not accessible by references from GC roots.

There are several kinds of GC roots. One object can belong to more than one kind of root. The root kinds are:

  • Class - class loaded by system class loader. Such classes can never be unloaded. They can hold objects via static fields. Please note that classes loaded by custom class loaders are not roots, unless corresponding instances of java.lang.Class happen to be roots of other kind(s).
  • Thread - live thread
  • Stack Local - local variable or parameter of Java method
  • JNI Local - local variable or parameter of JNI method
  • JNI Global - global JNI reference
  • Monitor Used - objects used as a monitor for synchronization
  • Held by JVM - objects held from garbage collection by JVM for its purposes. Actually the list of such objects depends on JVM implementation. Possible known cases are: the system class loader, a few important exception classes which the JVM knows about, a few pre-allocated objects for exception handling, and custom class loaders when they are in the process of loading classes. Unfortunately, JVM provides absolutely no additional detail for such objects. Thus it is up to the analyst to decide to which case a certain "Held by JVM" belongs.

(credit to YourKit's website)

Not mentioned by YourKit is the fact that objects awaiting finalization will be retained as roots until the GC runs the finalize() method. That can cause transient retention of large graphs somewhat unexpectedly. The general rule of thumb is not to use finalizers (but that's a different question).

31
votes

Roots or garbage collection roots are the objects that are always reachable. If an object is always reachable, then it is not eligible for garbage collection; roots therefore are always ineligible for collection. It is the initial set of objects from where reachability of all other objects on the heap are determined.

Other objects on the heap reachable from the garbage collection roots are considered to be live objects, and ineligible for collection; the objects that are unreachable can be marked for reclamation.

I know Java more than the .Net platform, so I'll speak only for one. On the Java platform, the GC roots are actually implementation dependent. In most runtime however, the GC roots tend to be the operands on the stack (for they are currently in use by threads) and class (static) members of classes. Reachability is calculated from these objects in most JVMs. There are other cases where local parameters and operands used by JNI calls will be considered part of the root set, and also used to calculate reachability.

I hope this clears any lingering doubts over what is a root (set) and what is a live object.

18
votes

The IBM web site lists the following as GC roots.

Note that some of these are artificial constructs done by a memory analyzer, but still important to be aware of if you are looking at a heap dump.

  • System class

    A class that was loaded by the bootstrap loader, or the system class loader. For example, this category includes all classes in the rt.jar file (part of the Java runtime environment), such as those in the java.util.* package.

  • JNI local

    A local variable in native code, for example user-defined JNI code or JVM internal code.

  • JNI global

    A global variable in native code, for example user-defined JNI code or JVM internal code.

  • Thread block

    An object that was referenced from an active thread block.

  • Thread

    A running thread.

  • Busy monitor

    Everything that called the wait() or notify() methods, or that is synchronized, for example by calling the synchronized(Object) method or by entering a synchronized method. If the method was static, the root is a class, otherwise it is an object.

  • Java local

    A local variable. For example, input parameters, or locally created objects of methods that are still in the stack of a thread. Native stack

    Input or output parameters in native code, for example user-defined JNI code or JVM internal code. Many methods have native parts, and the objects that are handled as method parameters become garbage collection roots. For example, parameters used for file, network, I/O, or reflection operations.

  • Finalizer

    An object that is in a queue, waiting for a finalizer to run.

  • Unfinalized

    An object that has a finalize method, but was not finalized, and is not yet on the finalizer queue.

  • Unreachable

    An object that is unreachable from any other root, but was marked as a root by Memory Analyzer so that the object can be included in an analysis.

    Unreachable objects are often the result of optimizations in the garbage collection algorithm. For example, an object might be a candidate for garbage collection, but be so small that the garbage collection process would be too expensive. In this case, the object might not be garbage collected, and might remain as an unreachable object.

    By default, unreachable objects are excluded when Memory Analyzer parses the heap dump. These objects are therefore not shown in the histogram, dominator tree, or query results. You can change this behavior by clicking File > Preferences... > IBM Diagnostic Tools for Java - Memory Analyzer, then selecting the Keep unreachable objects check box.

  • Java stack frame

    A Java stack frame, which holds local variables. This type of garbage collection root is only generated if you set the Preferences to treat Java stack frames as objects. For more information, see Java Basics: Threads and thread stack queries.

  • Unknown

    An object of unknown root type. Some dumps, such as IBM Portable Heap Dump (.phd) files, do not have root information. In this case, the Memory Analyzer parser marks objects that have no inbound references, or are unreachable from any other root, as unknown. This action ensures that Memory Analyzer retains all the objects in the dump.

3
votes

In java I would say threads are the root objects. Every live object can be back traced to a live thread. For example, a static object is referenced by a class, which is referenced by a class loader, which is referenced by another class, which is referenced by an instance of that class, ... which is referenced by a Runnable, which is referenced by a live thread. (Note, classes can be GC'ed, they can't be roots)

We can also consider a "real" root for all threads, however that is out of the realm of standard Java. We can't say what it is, and how it references all the threads.