2
votes

According to MDN, Mark and Sweep is a better algorithm for garbage collection than reference counting as it avoids memory leaks which can occur with cyclic references. However, I don't understand why this is the case? Can someone explain (with code) how Mark and Sweep avoids such memory leaks? Feel free to use below code from MDN, which explains how reference counting works:

var x = { 
  a: {
    b: 2
  }
}; 

// 2 objects are created. One is referenced by the other as one of its properties.
// The other is referenced by virtue of being assigned to the 'x' variable.
// Obviously, none can be garbage-collected.

var y = x;      // The 'y' variable is the second thing that has a reference to the object.

x = 1;          // Now, the object that was originally in 'x' has a unique reference
                //   embodied by the 'y' variable.

var z = y.a;    // Reference to 'a' property of the object.
                //   This object now has 2 references: one as a property, 
                //   the other as the 'z' variable.

y = 'mozilla';  // The object that was originally in 'x' has now zero
                //   references to it. It can be garbage-collected.
                //   However its 'a' property is still referenced by 
                //   the 'z' variable, so it cannot be freed.

z = null;       // The 'a' property of the object originally in x 
                //   has zero references to it. It can be garbage collected.

Also, are global variables never garbage collected?

1

1 Answers

1
votes

Mark-sweep works by marking roots, such as temporary variables (in the current method) and global (static) variables, and then sweeping the heap removing any unmarked references

For example the first line in your example creates a nested object and assigns it to variable x. For MS to know whether its field b is alive or not it first marks the roots.

This works by checking all global variables and local variables in the current method (frame) and then marking the references they point to. For example one would be the outer object of x.

Here has to be noted that each object in the internal object layout has (atleast) two bits reserved for marking whether an object is referenced or not.

var x = { a: {b: 2}}; 

So the outer object that contains the field a, is marked as alive since it's referenced by a local variable. Now it has to mark the objects its fields reference too. Fields a and a.b are such objects. Once all references that an object's fields and it's fields fields refer to are marked the second bit of the outer object and it's inner objects are marked.

For the sake of simplicity let's assume that this is the only live object on the heap. Once the marking phase is finished, the heap (can imagine it as an array or list of generic objects) can be swept. Hereby all objects are either fully marked (2-bits) or are unmarked aka have no references. All unmarked objects can be freed and the memory they used to partake can be re-used.

Now the problem of cyclic references is that an object a refers to another object b and vice-versa. With reference counting it's is the case that as long as an object has a reference to itself its reference count field will be atleast 1. The problem is that no root object or an object indirectly referenced by the root (aka a live object) refers to the cyclic object, but according to the algorithm a reference counter equal or higher than 1 means it's alive.

Mark-sweep's definition of a live object is different, namely that it's directly or indirectly referenced by a root object.