I am writing a C++ memory profiler.

An executable may contain many global objects, and so may the DLLs it loads. The CRT initializes these objects at startup: in the executable, this happens after the entry point and before WinMain; in a DLL, the entry point is _DllMainCRTStartup, which initializes all of the DLL's global objects before calling DllMain.

A global object may allocate memory, so a memory profiler must be initialized before any global object is. After a lot of research, I found that this is not an easy job.

One idea is to call CreateProcess with the CREATE_SUSPENDED flag to get a chance to run first, then use CreateRemoteThread to call LoadLibrary in the target process so that it loads and initializes an injection DLL. But this doesn't work, because all of the executable's implicitly linked DLLs are loaded first. Maybe CreateRemoteThread triggers this behavior?

So how can the profiler get to run first?

1 Answer

There may be very platform-specific ways to do this, but one way to solve the problem is to combine lazy initialization with dynamic library (dylib) loading.

For example, say your memory allocator functions are exported like this:

API void* exported_alloc();
API void exported_free(void* mem);

... inside a dylib called mem.dll.

In this case, to ensure that all of your other dylibs can reach these functions while they are being loaded, we can create a central statically linked library (e.g. sdk.lib) that all of your dylibs link against, with a header like so:

#ifndef MEMORY_H
#define MEMORY_H

// Memory.h
void* my_alloc();
void my_free(void* mem);

#endif

... which we can implement like so:

// Function pointers, resolved from mem.dll on first use.
static void* (*exported_alloc)() = 0;
static void (*exported_free)(void* mem) = 0;

static void initialize()
{
    if (!exported_alloc)
    {
        // Load 'mem.dll' (ex: 'LoadLibrary') and look up
        // symbols for `exported_alloc` and `exported_free`
        // (ex: GetProcAddress).
    }
}

void* my_alloc()
{
    initialize();
    return exported_alloc();
}

void my_free(void* mem)
{
    initialize();
    exported_free(mem);
}

... then call FreeLibrary at an appropriate time, once you're finished with the DLL. This incurs a little run-time overhead (similar to the overhead of accessing a singleton), but the approach is cross-platform, provided that you have a cross-platform means of loading and unloading dylibs/shared libraries at run time.

With this solution, any DLL that allocates memory at global scope loads mem.dll lazily, before it performs its first allocation, so your memory functions are always available at the time they are needed.
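To illustrate the ordering, here is a minimal, portable sketch in which a global object allocates during static initialization and thereby triggers the lazy load before main runs. It is not the actual Windows code: stub_alloc, stub_free, MemApi, load_mem_api, load_count and GlobalBuffer are hypothetical names introduced for the example, and load_mem_api is only a placeholder for the LoadLibrary/GetProcAddress step. It also assumes C++11, where a function-local static is initialized exactly once, which replaces the manual `if (!exported_alloc)` check.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-ins for the functions mem.dll would export.
static void* stub_alloc() { return std::malloc(16); }
static void stub_free(void* mem) { std::free(mem); }

// The resolved entry points, grouped so they are initialized together.
struct MemApi
{
    void* (*alloc_fn)();
    void (*free_fn)(void* mem);
};

// Counts how often the library is "loaded". It is zero-initialized at
// compile time, so it is valid before any constructor runs.
static int load_count = 0;

// Placeholder for the LoadLibrary/GetProcAddress step (assumption:
// a real version would load mem.dll and look up both symbols here).
static MemApi load_mem_api()
{
    ++load_count;
    return MemApi{&stub_alloc, &stub_free};
}

// Function-local static: initialized on the first call, exactly once.
static const MemApi& mem_api()
{
    static const MemApi api = load_mem_api();
    return api;
}

void* my_alloc() { return mem_api().alloc_fn(); }
void my_free(void* mem) { mem_api().free_fn(mem); }

// A global object that allocates during static initialization.
struct GlobalBuffer
{
    void* data;
    GlobalBuffer() : data(my_alloc()) { assert(data != nullptr); }
    ~GlobalBuffer() { my_free(data); }
};

static GlobalBuffer g_buffer; // constructed before main()
```

By the time main starts, the constructor of g_buffer has already forced the load, and later calls to my_alloc reuse the same resolved pointers without loading again.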