5
votes

I know the reference-counting technique, but I had never heard of the mark-sweep technique until today, when I was reading the book "Concepts of Programming Languages".
According to the book:

The original mark-sweep process of garbage collection operates as follows: The runtime system allocates storage cells as requested and disconnects pointers from cells as necessary, without regard for storage reclamation (allowing garbage to accumulate), until it has allocated all available cells. At this point, a mark-sweep process is begun to gather all the garbage left floating around in the heap. To facilitate the process, every heap cell has an extra indicator bit or field that is used by the collection algorithm.

From my limited understanding, smart pointers in C++ libraries use the reference-counting technique. Is there any C++ library that implements smart pointers with mark-sweep instead? Since the book is purely theoretical, I cannot visualize how the implementation is done. An example demonstrating the idea would be greatly valuable. Please correct me if I'm wrong.

Thanks,


2 Answers

2
votes

There is one difficulty with using garbage collection in C++: identifying what is a pointer and what is not.

If you can tweak a compiler to provide this information for each and every object type, then you're done. If you cannot, then you need to use a conservative approach: scanning memory for any bit pattern that may look like a pointer. There is also the difficulty of "bit stuffing" here, where people stuff bits into pointers (the higher bits are mostly unused on 64-bit platforms) or XOR two different pointers to "save space". In both cases the stored value no longer looks like the address it encodes, so a scanner can miss it.
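To illustrate the conservative approach and why hidden pointers defeat it, here is a minimal sketch. The `Block` record, the made-up addresses 0x1000 and 0x2000, and all function names are assumptions for the illustration; a real conservative collector scans actual stacks, registers and heap words rather than a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one heap block known to the collector.
struct Block {
    std::uintptr_t begin;
    std::size_t size;
    bool marked;
};

// Two pretend blocks at made-up addresses, 16 bytes each.
std::vector<Block> sampleBlocks() {
    std::vector<Block> blocks;
    blocks.push_back(Block{0x1000, 16, false});
    blocks.push_back(Block{0x2000, 16, false});
    return blocks;
}

// Conservative rule: any word whose value falls inside a known block
// is treated as a pointer to that block, whatever its real type is.
void conservativeScan(const std::vector<std::uintptr_t>& words,
                      std::vector<Block>& blocks) {
    for (std::uintptr_t w : words)
        for (Block& b : blocks)
            if (w >= b.begin && w < b.begin + b.size) b.marked = true;
}

// Scans the given words and reports how many blocks were kept alive.
std::size_t countMarked(const std::vector<std::uintptr_t>& words) {
    std::vector<Block> blocks = sampleBlocks();
    assert(!blocks.empty());
    conservativeScan(words, blocks);
    std::size_t n = 0;
    for (const Block& b : blocks)
        if (b.marked) ++n;
    return n;
}

// A genuine pointer to the first block next to an unrelated integer.
std::size_t plainPointerCase() {
    std::vector<std::uintptr_t> words{42, 0x1000};
    return countMarked(words);
}

// Both addresses XORed into one word: neither block is recognised.
std::size_t xorHiddenCase() {
    std::vector<std::uintptr_t> words{0x1000 ^ 0x2000};
    return countMarked(words);
}

// An integer that merely looks like an interior pointer keeps a block alive.
std::size_t lookalikeIntegerCase() {
    std::vector<std::uintptr_t> words{0x1008};
    return countMarked(words);
}
```

The second case is the dangerous one: both blocks would be freed although the program can still recover their addresses. The third case is only wasteful, since a block is retained that may in fact be garbage.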

Now, in C++0x the Standard Committee introduced a small standard API to help implement garbage collection. In n3225 you can find it at 20.9.11 Pointer safety [util.dynamic.safety]. This supposes that people will implement those functions for their types, of course:

void declare_reachable(void* p); // throw std::bad_alloc
template <typename T> T* undeclare_reachable(T* p) noexcept;

void declare_no_pointers(char* p, size_t n) noexcept;
void undeclare_no_pointers(char* p, size_t n) noexcept;

pointer_safety get_pointer_safety() noexcept;

When implemented, it will allow you to plug any garbage collection scheme (one that defines those functions) into your application. It will of course require some work to actually call those operations wherever they are needed. One solution could be to simply replace operator new and operator delete, but that does not account for pointer arithmetic...

Finally, there are many strategies for garbage collection: reference counting (with cycle-detection algorithms) and mark-and-sweep are the two main families, and each comes in various flavors (generational or not, copying/compacting or not, ...).
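To make the mark-and-sweep idea from the question concrete, here is a minimal sketch of a precise (non-conservative) collector. Everything in it is an assumption for illustration: `Cell`, `Heap`, `addRoot` and `demoMarkSweep` are hypothetical names, the outgoing pointers are listed explicitly in `refs` so the collector knows exactly where they are, and collection is triggered by hand rather than when memory runs out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical managed cell: "marked" is the extra indicator bit from the
// book quote; "refs" are the outgoing pointers the collector follows.
struct Cell {
    bool marked = false;
    std::vector<Cell*> refs;
};

class Heap {
public:
    // The heap owns every cell so that sweep() can visit all of them.
    Cell* allocate() {
        cells_.push_back(std::make_unique<Cell>());
        return cells_.back().get();
    }

    // Roots stand in for globals and stack variables.
    void addRoot(Cell* c) { roots_.push_back(c); }

    std::size_t liveCells() const { return cells_.size(); }

    // Runs one full collection and returns the number of cells reclaimed.
    std::size_t collect() {
        for (Cell* r : roots_) mark(r);
        return sweep();
    }

private:
    // Mark phase: iterative depth-first traversal from one root.
    void mark(Cell* start) {
        std::vector<Cell*> stack{start};
        while (!stack.empty()) {
            Cell* c = stack.back();
            stack.pop_back();
            if (c == nullptr || c->marked) continue;
            c->marked = true;
            for (Cell* next : c->refs) stack.push_back(next);
        }
    }

    // Sweep phase: keep marked cells (clearing the bit), drop the rest.
    std::size_t sweep() {
        std::size_t freed = 0;
        std::vector<std::unique_ptr<Cell>> survivors;
        for (auto& c : cells_) {
            if (c->marked) {
                c->marked = false;
                survivors.push_back(std::move(c));
            } else {
                ++freed;  // destroyed when cells_ is replaced below
            }
        }
        cells_ = std::move(survivors);
        return freed;
    }

    std::vector<std::unique_ptr<Cell>> cells_;
    std::vector<Cell*> roots_;
};

// Demo: a -> b is reachable from a root; c <-> d is an unreachable cycle.
std::size_t demoMarkSweep() {
    Heap heap;
    Cell* a = heap.allocate();
    Cell* b = heap.allocate();
    Cell* c = heap.allocate();
    Cell* d = heap.allocate();
    a->refs.push_back(b);
    c->refs.push_back(d);
    d->refs.push_back(c);
    heap.addRoot(a);
    std::size_t freed = heap.collect();
    assert(heap.liveCells() == 2);
    return freed;
}
```

Note that the cycle between `c` and `d` is reclaimed without any special handling, because reachability from the roots is all that matters. That is the property plain reference counting lacks.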

1
votes

Although they may have upgraded it by now, Mozilla Firefox used to use a hybrid approach in which reference-counted smart pointers were used when possible, with a mark-and-sweep garbage collector running in parallel to clean up reference cycles. It's possible other projects have adopted this approach, though I'm not fully sure.
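The reason a reference-counting scheme needs such a backup collector can be shown with the standard `std::shared_ptr`. This is only a sketch of the cycle problem, not of Firefox's actual implementation; `Node`, `g_destroyed` and both functions are hypothetical names.

```cpp
#include <cassert>
#include <memory>

static int g_destroyed = 0;  // hypothetical counter of destructor calls

struct Node {
    std::shared_ptr<Node> other;
    ~Node() { ++g_destroyed; }
};

// Without a cycle, both nodes are destroyed when the scope ends.
int destroyedAfterChain() {
    g_destroyed = 0;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->other = b;  // a -> b only
    }
    return g_destroyed;
}

// With a cycle, each node keeps the other's count above zero, so neither
// destructor runs even though the program holds no owning reference.
bool cycleSurvivesScope() {
    g_destroyed = 0;
    std::weak_ptr<Node> observer;  // non-owning, used only to inspect
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->other = b;
        b->other = a;  // a <-> b
        observer = a;
    }
    bool leaked = !observer.expired();
    assert(!leaked || g_destroyed == 0);
    // Break the cycle by hand so that the demo itself does not leak.
    if (auto a = observer.lock()) a->other.reset();
    return leaked;
}
```

A tracing collector running alongside would find that nothing reachable from the roots points at the pair and reclaim it, which is the role the mark-and-sweep pass plays in the hybrid approach.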

The main reason that I could see C++ programmers avoiding this type of garbage collection is that it means that object destructors would run asynchronously. This means that if any objects were created that held on to important resources, such as network connections or physical hardware, the cleanup wouldn't be guaranteed to occur in a timely fashion. Moreover, the destructors would have to be very careful to use appropriate synchronization if they were to access shared resources, while in a single-threaded, straight reference-counting solution this wouldn't be necessary.

The other complexity of this approach is that C++ allows for raw arithmetic operations on pointers, which greatly complicates the implementation of any garbage collector. It's possible to conservatively solve this problem (look at the Boehm GC, for example), though it's a significant barrier to building a system of this sort.