I am calling a C++ DLL from my C++ application via LoadLibrary(). The application crashes at random with 0xC0000005 (access violation) errors. I have read a lot about DLLs having their own heaps and the problems that causes.

Things I've made sure to do so far:

In the DLL :

  1. All allocations use C++ new/delete (no malloc or calloc).
  2. Every new has a reachable matching delete.
  3. No memory allocated inside the DLL is freed in the host exe, or vice versa.
  4. Data is passed between the two only as POD (specifically char*). No STL types.
  5. All exported functions use the __stdcall calling convention.
  6. The DLL is built with extern "C" and a DEF file.

In the Host Exe:

  1. Memory is allocated using HeapAlloc() with GetProcessHeap().
  2. The pointer is passed to the DLL, which copies bytes into it using memcpy().
  3. The DLL function typedefs are correct.
  4. The DLL and the exe are built with the same compiler (VS2010).
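
For illustration only, the buffer hand-off described above has roughly the following shape. This is not the real code: FillBuffer, HostCall and the payload string are hypothetical names, a std::vector stands in for the HeapAlloc'd block so the sketch stays portable, and the explicit capacity parameter is an assumption about how an overrun in memcpy() would be ruled out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an exported DLL function: copies at most
// `capacity` bytes into the caller-owned buffer and returns the byte count.
extern "C" std::size_t FillBuffer(char* dest, std::size_t capacity) {
    static const char payload[] = "application list";
    const std::size_t wanted = sizeof(payload);          // includes the '\0'
    const std::size_t n = wanted < capacity ? wanted : capacity;
    if (dest == nullptr || n == 0) return 0;
    std::memcpy(dest, payload, n);
    dest[n - 1] = '\0';                                  // terminate even when truncated
    return n;
}

// Host side: owns the buffer before, during and after the call.
std::vector<char> HostCall(std::size_t capacity) {
    std::vector<char> buffer(capacity);                  // HeapAlloc() in the real host
    const std::size_t written = FillBuffer(buffer.data(), buffer.size());
    assert(written <= buffer.size());                    // callee must respect capacity
    buffer.resize(written);
    return buffer;
}
```

With this shape the DLL never allocates or frees anything the host sees, and it cannot write past the end of the block as long as the capacity passed in is honest.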

The crashes occur at random locations :

  1. While debugging, I observed that the exception occurs just as we step over the closing brace "}" at the end of the function in the DLL.
  2. After returning successfully from the DLL call, a crash occurs at some random later point.

All the event logs show the DLL as the "Faulting module name".

Taking into account all the points stated above, I would appreciate it if anyone could guide me on where to look for the cause of the exception.

Also, does the pointer I send to the DLL get resolved to the correct heap in memcpy()? The data is correct in the host exe, though. GetProcessHeaps() returns 4 heaps.

EDIT: I cannot post the full code due to policies. (Again, note that I have accounted for most of the common mistakes.)

Function where the error occurs (DLL)

extern "C"  void __stdcall BuildApplicationsList();

Typedef in exe

typedef void(__stdcall *buildAppsList)(void);

UPDATE

In response to @RalfFriedl: you were right! The program crashes at this location.

}

5822593F  mov         byte ptr [esp+7A0h],7  
58225947  cmp         dword ptr [esp+0A0h],0  
5822594F  jne         BuildApplicationsList+1CE2h (58225992h)  
58225951  mov         eax,dword ptr [esp+74h]  
58225955  test        eax,eax  
58225957  je          BuildApplicationsList+1CB1h (58225961h)  
58225959  mov         ecx,dword ptr [eax]  
5822595B  mov         edx,dword ptr [ecx+8]    // Crash Occurs here. 
5822595E  push        eax  
5822595F  call        edx  
58225961  mov         eax,dword ptr [esp+70h]  
58225965  test        eax,eax  
58225967  je          BuildApplicationsList+1CC1h (58225971h)  
58225969  mov         ecx,dword ptr [eax]  
5822596B  mov         edx,dword ptr [ecx+8]  
5822596E  push        eax  
5822596F  call        edx  
58225971  mov         eax,dword ptr [esp+6Ch]  
58225975  test        eax,eax  
58225977  je          BuildApplicationsList+1CD1h (58225981h)  
58225979  mov         ecx,dword ptr [eax]  
5822597B  mov         edx,dword ptr [ecx+8]  
5822597E  push        eax  
5822597F  call        edx  
58225981  call        dword ptr [__imp__CoUninitialize@0 (5823F2C8h)]  

edx and ecx are 0 and obviously accessing 0x00000008 is a violation. Where to next?

The Windows "heaps" are parts of the same memory space - there's no need to "resolve" pointers. These heaps have nothing to do with the thing people call "heap" in C++. - molbdnilo
The next place to look is BuildApplicationsList. (CoUninitialize in a DLL looks a bit strange, by the way.) - molbdnilo
mov ecx,dword ptr [eax]: fetch the vtable pointer from the object at eax; mov edx,dword ptr [ecx+8]: fetch the function pointer at offset 8 in the vtable; push eax: add the this argument; call edx: call the function. eax isn't zero, but edx is, so the vtable pointer must be null. - molbdnilo
mov ecx, dword ptr [eax] retrieves the value at the address eax, not the value of eax. (That is, it's dereferencing a pointer.) - molbdnilo
Just realised that I wrote edx when I meant ecx in the comment that explains the virtual function call. Too late to edit, but I might as well nitpick myself. - molbdnilo
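
To make the four instructions in the comments above concrete, here is a hand-rolled sketch of the same dispatch in portable C++. Everything in it (Object, VTable, CallSlot2, the slot functions) is hypothetical and only mimics what a compiler generates; the null check on the table pointer is added so the sketch can report the failure instead of faulting, and the generated code in the question has no such check.

```cpp
#include <cassert>
#include <cstddef>

struct Object;                                  // forward declaration
typedef int (*Method)(Object* self);            // every slot receives `this`

struct VTable { Method slots[3]; };             // slots[2] is at byte offset 8 in a 32-bit build

struct Object {
    const VTable* vptr;                         // first member, like a compiler-generated vptr
    int value;
};

static int Slot0(Object*)      { return 0; }
static int Slot1(Object* self) { return ++self->value; }
static int Slot2(Object* self) { return --self->value; }

static const VTable kTable = { { Slot0, Slot1, Slot2 } };

// Returns true and stores the result if the call could be made.
bool CallSlot2(Object* obj, int* result) {
    assert(result != nullptr);
    if (obj == nullptr) return false;           // test eax,eax / je
    const VTable* table = obj->vptr;            // mov ecx,dword ptr [eax]
    if (table == nullptr) return false;         // absent in the real code, hence the fault
    Method fn = table->slots[2];                // mov edx,dword ptr [ecx+8]
    *result = fn(obj);                          // push eax / call edx
    return true;
}
```

An object whose first word has been zeroed (for example because its memory was freed and reused) passes the first check and fails at the slot load, which matches the faulting instruction in the listing.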

1 Answer

When you step over the end of the function, the local destructors are called. Switch to debugging the assembler code and find out which destructor causes the problem.
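
As a small portable illustration of why the closing brace is where things go wrong, the sketch below records the order in which locals are destroyed. Tracer and RunScope are hypothetical names made up for this example; nothing here comes from the code in the question.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records its name when destroyed so the order can be inspected afterwards.
class Tracer {
public:
    Tracer(const std::string& name, std::vector<std::string>& log)
        : name_(name), log_(log) {}
    ~Tracer() { log_.push_back(name_); }
private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Hypothetical function body: the destructors run at the inner closing brace.
std::vector<std::string> RunScope() {
    std::vector<std::string> log;
    {
        Tracer first("first", log);
        Tracer second("second", log);
        Tracer third("third", log);
        assert(log.empty());           // nothing is destroyed while still in scope
    }                                  // destroyed here in reverse order of construction
    return log;
}
```

Stepping over that brace in a debugger executes all three destructors at once, so a fault in any of them is reported on the "}" line unless you step through the disassembly.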

Your issue is probably not related to the separation between the main program and the DLL. It is more likely just a reference to an invalidated pointer, which could occur in any program.

Edit

58225951  mov         eax,dword ptr [esp+74h]  
58225955  test        eax,eax  
58225957  je          BuildApplicationsList+1CB1h (58225961h)  
58225959  mov         ecx,dword ptr [eax]  
5822595B  mov         edx,dword ptr [ecx+8]    // Crash Occurs here. 

At esp+74h you have a local variable that seems to be a pointer to a class with virtual functions. The value of this pointer is nonzero. Since destructors are not called for the target of a raw pointer, you probably have a class that encapsulates a pointer and calls delete on it from its own destructor. The problem is probably that the target object has already been destroyed and its memory freed.
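
Here is a sketch of how such a wrapper can end up touching an object that is already gone. All names (Counted, Holder, CountBadReleases) are hypothetical. The disassembly calls a virtual function at vtable offset 8 rather than delete directly, so the sketch uses a Release-style call; offset 8 is where IUnknown::Release sits in a COM vtable, and the CoUninitialize call right after makes COM smart pointers a plausible candidate, but that is a guess. The object here deliberately never frees itself, so the bad call can be counted instead of crashing.

```cpp
#include <cassert>

// Hypothetical ref-counted object that counts releases made after the
// count reached zero; real code would be touching freed memory there.
struct Counted {
    int refs;
    int releasesAfterZero;
    Counted() : refs(1), releasesAfterZero(0) {}
    virtual ~Counted() {}
    virtual void AddRef() { ++refs; }
    virtual void Release() {
        if (refs == 0) { ++releasesAfterZero; return; }
        --refs;
    }
};

// Hypothetical wrapper: holds a raw pointer and releases it on destruction.
class Holder {
public:
    Holder(Counted* p, bool addRef) : p_(p) { if (p_ && addRef) p_->AddRef(); }
    ~Holder() { if (p_) p_->Release(); }    // null check, then a virtual call
private:
    Holder(const Holder&);                  // non-copyable
    Holder& operator=(const Holder&);
    Counted* p_;
};

int CountBadReleases(bool secondHolderAddsRef) {
    Counted object;                         // starts with one reference
    {
        Holder a(&object, false);           // adopts the initial reference
        Holder b(&object, secondHolderAddsRef);
    }                                       // b, then a, release at this brace
    assert(object.refs == 0);
    return object.releasesAfterZero;
}
```

When the second holder wraps the same raw pointer without taking its own reference, the last destructor to run makes one release too many. In the real program that extra call lands on freed memory, where the vtable pointer may already have been overwritten with zero.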

Just before stepping into the end of the function, find out the value of esp+74h, that is, the address esp+74h itself, not the value stored at that address. Then check the addresses of all local variables. One of them must be equal to esp+74h. That variable's destructor is the one that causes the problem.

You could also try disabling inlining. Then your debugger will probably stop at the correct place.