
How does one get the garbage collector in Mono to do anything useful? At the bottom of this post is a simple C# test program that generates two large strings. After the first string is generated, the variable is dereferenced, the scope is exited, and the garbage collector is triggered manually. Despite this, the memory used does not decrease, and the program dies with an OutOfMemoryException while constructing the second string.

Unhandled Exception: OutOfMemoryException
[ERROR] FATAL UNHANDLED EXCEPTION: System.OutOfMemoryException: Out of memory
  at (wrapper managed-to-native) string:InternalAllocateStr (int)
  at System.String.Concat (System.String str0, System.String str1) [0x00000] in :0
  at GCtest.Main (System.String[] args) [0x00000] in :0

After some research I found Mono's --gc=sgen switch, which selects a different garbage collection algorithm. The result is even worse: the runtime crashes with the following stack trace:

Stacktrace:

  at (wrapper managed-to-native) string.InternalAllocateStr (int) <0xffffffff>
  at string.Concat (string,string) <0x0005b>
  at GCtest.Main (string[]) <0x00243>
  at (wrapper runtime-invoke) .runtime_invoke_void_object (object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:

0   mono-sgen          0x000bc086 mono_handle_native_sigsegv + 422
1   mono-sgen          0x0000466e mono_sigsegv_signal_handler + 334
2   libsystem_c.dylib  0x913c659b _sigtramp + 43
3   ???                0xffffffff 0x0 + 4294967295
4   mono-sgen          0x0020306d mono_gc_alloc_obj_nolock + 363
5   mono-sgen          0x0020394a mono_gc_alloc_string + 153
6   mono-sgen          0x001c9a10 mono_string_new_size + 147
7   mono-sgen          0x0022a6d1 ves_icall_System_String_InternalAllocateStr + 28
8   ???                0x004c450c 0x0 + 4998412
9   ???                0x004ceec4 0x0 + 5041860
10  ???                0x004c0f74 0x0 + 4984692
11  ???                0x004c1163 0x0 + 4985187
12  mono-sgen          0x00010164 mono_jit_runtime_invoke + 164
13  mono-sgen          0x001c5791 mono_runtime_invoke + 137
14  mono-sgen          0x001c7f92 mono_runtime_exec_main + 669
15  mono-sgen          0x001c72cc mono_runtime_run_main + 843
16  mono-sgen          0x0008c617 mono_main + 8551
17  mono-sgen          0x00002606 start + 54
18  ???                0x00000003 0x0 + 3

Debug info from gdb:

/tmp/mono-gdb-commands.2aCwlD:1: Error in sourced command file: unable to debug self

Got a SIGSEGV while executing native code. This usually indicates a fatal error in the mono runtime or one of the native libraries used by your application.

/Users/fraser/Documents/diff-match-patch/csharp/GCtest.command: line 12: 41011 Abort trap: 6 mono --gc=sgen GCtest.exe

Here's the code:

using System;
public class GCtest {
  public static void Main(string[] args) {
    Console.WriteLine("Memory: " + (GC.GetTotalMemory(true) / 1024) + " KB");

    {
      // Generate the first string.
      string text1 = "hello old world.";
      for (int i = 0; i < 25; i++) {
        text1 = text1 + text1;
      }
      // Dereference variable.
      text1 = null;
      // Drop out of scope.
    }

    GC.Collect();
    GC.WaitForPendingFinalizers();
    Console.WriteLine("Memory: " + (GC.GetTotalMemory(true) / 1024) + " KB");

    // Generate the second string.
    string text2 = "HELLO NEW WORLD!";
    for (int i = 0; i < 25; i++) {
      text2 = text2 + text2;
    }

    Console.WriteLine("Memory: " + (GC.GetTotalMemory(true) / 1024) + " KB");
  }
}
Comment (CodesInChaos): Strange use of the word "dereference". Dereferencing is the process of following a pointer to its destination. I'd say "assign null to the variable".

1 Answer


Is that a 32-bit Mono? I believe the behaviour you describe is caused by the fact that, with the Boehm GC, at least the stack is scanned conservatively. That means the values on it are treated as if they were pointers. If such a value points to an object (or into its interior), that object will not be collected. This makes it clear why big objects are problematic here: they can easily fill a large part of the virtual address space of a 32-bit process, so there is a good chance that some value on the stack points somewhere into one of them, and then the whole object is not collected. What are the sources of such nasty fake pointers? The best known to me are hashes, random values, and date/time values (ordinary ints are usually small, so they rarely look like heap addresses).
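
To put rough numbers on this, here is a small back-of-the-envelope sketch. It assumes 2 bytes per char (UTF-16) and ignores the object header; the class and method names are made up for illustration and are not part of the test program.

```csharp
using System;

public static class StringSizeEstimate {
  // Length in chars of a seed string after it has been
  // concatenated with itself `doublings` times.
  public static long CharsAfterDoubling(int seedLength, int doublings) {
    return (long)seedLength << doublings;
  }

  // Payload in bytes, assuming 2 bytes per UTF-16 char (header ignored).
  public static long PayloadBytes(long chars) {
    return chars * 2;
  }

  public static void Main() {
    long chars = CharsAfterDoubling("hello old world.".Length, 25);
    long bytes = PayloadBytes(chars);
    // Share of a full 4 GiB (2^32 byte) address space covered by the string.
    double fraction = (double)bytes / 4294967296.0;
    Console.WriteLine("Chars: " + chars);
    Console.WriteLine("Bytes: " + bytes + " (" + (bytes / 1024) + " KB)");
    Console.WriteLine("Fraction of a 4 GiB address space: " + fraction);
  }
}
```

The final string has 16 * 2^25 = 536,870,912 chars, which is 1 GiB of payload, or a quarter of everything a 32-bit pointer can address. If a stack word were uniformly random (real values are not, so treat this only as an intuition), it would have about a one in four chance of landing inside that single object. Note also that the last concatenation needs the 512 MiB operand and the 1 GiB result alive at the same time.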

How does one get the garbage collector in Mono to do anything useful?

The method you've used is correct, but I think you've run into the problem described above. Typical programs do not suffer as much, because objects this huge are quite rare. Tomorrow I'll also test it on 64-bit Mono.
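
For comparison, the same collect-and-measure pattern can be tried at a smaller scale, where a stray stack value is much less likely to hit the object. This is only a sketch under my own assumptions: BuildDoubled and BuildAndDrop are hypothetical helpers, and 20 doublings (about 32 MiB) is an arbitrary choice. Building the string in a separate method is a common precaution so that Main never holds the reference, not a guarantee under a conservative collector.

```csharp
using System;

public static class SmallGCtest {
  // Hypothetical helper: concatenates `seed` with itself `doublings` times.
  public static string BuildDoubled(string seed, int doublings) {
    string text = seed;
    for (int i = 0; i < doublings; i++) {
      text = text + text;
    }
    return text;
  }

  // Builds the large string in its own method and returns only its length,
  // so the caller never stores a reference to the string itself.
  public static int BuildAndDrop(int doublings) {
    return BuildDoubled("hello old world.", doublings).Length;
  }

  public static void Main() {
    Console.WriteLine("Before: " + (GC.GetTotalMemory(true) / 1024) + " KB");
    int length = BuildAndDrop(20);  // 16 * 2^20 chars, about 32 MiB of payload
    Console.WriteLine("Built " + length + " chars");
    GC.Collect();
    GC.WaitForPendingFinalizers();
    Console.WriteLine("After:  " + (GC.GetTotalMemory(true) / 1024) + " KB");
  }
}
```

If the "After" figure falls back close to the "Before" figure, the collector is reclaiming the string as expected at this size; the exact numbers depend on the runtime and the collector in use.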

Nevertheless, the question that naturally arises is: why doesn't the Mono project write another GC that is not conservative? Such a garbage collector is, as you have found, sgen. Besides being a precise collector, I think the point of sgen is that it is compacting, which is very important for long-running applications, and that it has (sometimes much) better performance. However, sgen is still in its beta stage, and one can observe crashes here and there. Precise stack scanning has also been switched on and off in different releases of Mono, and now and then a regression slips through, so you may find that an older release of Mono works better than a newer one. Still, sgen is being actively developed (as one can see by browsing the commit history on GitHub) and should soon become the default garbage collector in Mono, which should finally resolve issues like the one described.

By the way, my version of Mono (still 32-bit) passes this test with sgen:

$ mono --gc=sgen GCTest.exe 
Memory: 4098 KB
Memory: 4140 KB
Memory: 1052716 KB

Hope this was helpful; ask if something is unclear (especially because of my second-rate English).

Edit:

On my 64-bit machine Boehm works well:

$ mono GCTest.exe
Memory: 132 KB
Memory: 280 KB
Memory: 1048860 KB

(sgen works too, naturally). This is Mono 2.10.5 on Linux.
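
When comparing results across machines like this, it helps to print what the test is actually running on. The sketch below uses only standard calls (IntPtr.Size, Type.GetType, GC.MaxGeneration); the class and method names are my own. As far as I know, Boehm reports a MaxGeneration of 0 while sgen reports a higher value, but treat that as an assumption and not as documented behaviour.

```csharp
using System;

public static class RuntimeInfo {
  // Pointer width of the current process in bits (32 or 64).
  public static int PointerBits() {
    return IntPtr.Size * 8;
  }

  // True when running on Mono; the Mono.Runtime type is looked up by name
  // and is not expected to exist on other runtimes.
  public static bool IsMono() {
    return Type.GetType("Mono.Runtime") != null;
  }

  public static void Main() {
    Console.WriteLine("Pointer size:     " + PointerBits() + " bit");
    Console.WriteLine("Running on Mono:  " + IsMono());
    Console.WriteLine("GC.MaxGeneration: " + GC.MaxGeneration);
  }
}
```

Running it once with plain mono and once with mono --gc=sgen shows whether both the bitness and the collector really differ between two test runs.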