2
votes

We have an existing network messaging server that implements a custom communications protocol that we are migrating to an in-process COM object. The server is implemented as a free threaded out-of-process COM server. The clients can register with the server (think publish-subscribe) to receive messages.

After migration we noticed several dead locks when calling GUI related functions, e.g. SetWindowPos, RedrawWindow, EnumWindows, etc. After doing some research I found that this was due to callbacks happening on a thread other than the main GUI thread (message pump). The callbacks are custom COM callbacks that derive from IUnknown, so we are not using connection points.

What's interesting is if a create a simple MFC dialog project, everything works. But in our main application it fails. Our main application (20+ years old and doing things now it was never designed for) is an MFC dialog project. In response to a message, the MFC dialog project loads a DLL that, in turn, creates an MFC dialog and registers with the COM server for messages. When the DLL dialog receives a message it calls one of the three example GUI related functions above, and blocks.

This use to work for an out-of-process COM server, so I know there is a way to get the callback to be within the context of the clients main GUI thread but have been unable find the 'magic' code to make it work. I did stumble upon SendMessageCallback but can't get the asynchronous callback function to be called.

The COM callbacks into the client are handled via a thread inside the in-process COM server. The thread is initialized with CoInitialize, which from my research means STA. I've tried changing this to CoInitializeEx and tested with MTA. I've also tried changing the COM server threading model to STA, MTA, Both, and Free. As you can see, I'm really starting to throw darts. here.

Any help would be appreciated.

I did order the Don Box books, Essential COM and Effective COM, but they will not arrive until later this week.

EDIT:

Our main application: Inside of CApp derived class' InitInstance

  1. AfxOleInit()

Inside of CDialog derived class' OnInitDialog

  1. CoCreateInstance (Messaging COM Object)
  2. Allocate callback pointer. The client callback is an interface that derives from IUnknown. Then a callback class derives from the client callback and implements the AddRef/Release interfaces. When the client creates the callback the client passes a pointer to itself so the callback can call the client.
  3. MessageComObject->Register(callback pointer)

Inside MessageCOMObject:

  1. MessageComObject adds the callback to the GIT and saves the cookie.
  2. MessageComObject starts a thread to 'send' callbacks to the client.

Some point in time later the main application receives a 'message' via the callback pointer. The callback is initiated from within the MessageComObject. Inside the MessageCOMObject:

  1. MessageComObject gets the callback from the GIT using the cookie
  2. MessageComObject calls a function on the callback interface

Inside the callback interface's derived class:

  1. Calls the client callback function

Inside the CDialog derived class' callback handler:

  1. Calls LoadLibrary on a DLL (what gets loaded is data driven)
  2. Calls exported function from DLL.

Inside DLL's exported function:

  1. Create CWnd object
  2. CoCreateInstance (Messaging COM Object)
  3. Allocate callback pointer (same as above)
  4. MessageCOMObject->Register

Inside MessageCOMObject:

  1. MessageComObject adds the callback to the GIT and saves the cookie.
  2. MessageComObject starts a thread to 'send' callbacks to the client.

Some point in time later the DLL receives a message:

  1. MessageComObject gets the callback from the GIT using the cookie, GIT returns 'Catastrophic Failure'

Code

Declaring the callback in the IDL file:

[
    object,
    uuid(...),
    pointer_default(unique)
]
interface IMessageRouterCallback : IUnknown
{
   ...
};

[
    object,
    uuid(...),
    pointer_default(unique)
]
interface IMessageRouter : IUnknown
{
    ...
};

[
    uuid(....),
    version(1.0),
]
library MessageRouterCOMLib
{
    importlib("stdole2.tlb");
    [
        uuid(....)  
    ]
    coclass MessageRouter
    {
            [default] interface IMessageRouter;
    };
};

The in-process COM DLL

class ATL_NO_VTABLE CMessageRouter :
    public CComObjectRootEx<CComMultiThreadModel>,
    public CComCoClass<CMessageRouter, &CLSID_MessageRouter>,
    public IMessageRouter
{
public:

DECLARE_GET_CONTROLLING_UNKNOWN()

BEGIN_COM_MAP(CMessageRouter)
    COM_INTERFACE_ENTRY(IMessageRouter)
    COM_INTERFACE_ENTRY_AGGREGATE(IID_IMarshal, m_pUnkMarshaler.p)
END_COM_MAP()

    CComPtr<IUnknown> m_pUnkMarshaler;

    DECLARE_PROTECT_FINAL_CONSTRUCT()

    DWORD callbackRegistrationId;

    HRESULT FinalConstruct()
    {
        //- TODO: Add error checking
        IUnknown *unknown;
        DllGetClassObject(IID_IMessageRouterCallback, IID_IUnknown, (void**)&unknown);
        CoRegisterClassObject(IID_IMessageRouterCallback, unknown, CLSCTX_INPROC_SERVER, REGCLS_MULTIPLEUSE, &callbackRegistrationId);
        CoRegisterPSClsid(IID_IMessageRouterCallback, IID_IMessageRouterCallback);

        return CoCreateFreeThreadedMarshaler(GetControllingUnknown(), 
    }

    void FinalRelease()
    {
        if (callbackRegistrationId)
            CoRevokeClassObject(callbackRegistrationId);
        callbackRegistrationId = 0;

        if (m_pUnkMarshaler)
            m_pUnkMarshaler.Release();
    }
}

Where callback is registered:

boost::lock_guard<boost::mutex> guard(callbacksMutex);

//- callback is the raw interface pointer from the client
//- The class Callback contains a pointer to the raw client callback
//- and the global process ID.  The raw pointer is AddRef/Release in
//- the Callback class' constructor and destructor.
ptr = Callback::Pointer(new Callback(callback));
DWORD callbackId = 0;

HRESULT result = globalInterfaceTable->RegisterInterfaceInGlobal(callback, IID_IMessageRouterCallback, &callbackId);
if (SUCCEEDED(result))
{
    ptr->globalCallbackId = callbackId;
    callbackMap[callback] = ptr;
    //- NOTE: uses raw pointer as key into map.  This key is only
    //-       ever used during un-register.
    //- callbackMap is a std::map of Callback instances indexed by the raw pointer.
}

Callback thread:

CoInitialize(NULL);

while (processCallbackThreadRunning)
{
    QueueMessage message = messageQueue.Pop();
    if (!processCallbackThreadRunning)
        break;

    //- Make a copy because callbacks may be added/removed during
    //- this call.
    CallbackMap callbacks;
    {
        boost::lock_guard<boost::mutex> guard(callbacksMutex);
        callbacks = callbackMap;
    }

    for (CallbackMap::iterator callback = callbacks.begin(); callback != callbacks.end(); ++callback)
    {
        try
        {
            IMessageRouterCallback *mrCallback = NULL;
            HRESULT result = globalInterfaceTable->GetInterfaceFromGlobal(callback->second->globalCallbackId,IID_IMessageRouterCallback,(void **) &mrCallback);
            if (SUCCEEDED(result))
            {
                result = mrCallback->MessageHandler((unsigned char*)message.messageBuffer->Data(), message.messageBuffer->Length(), message.metaData.id, message.metaData.fromId, CComBSTR(message.metaData.fromAddress.c_str()));
                if (FAILED(result))
                {
                    ... log debug
                }
            }
            else
            {
                ... log debug
            }
        }
        catch (...)
        {
            ... log debug
        }
    }

    MessagePool::Push(message.messageBuffer);
}

CoUninitialize();

Client's implementation of the callback:

template <class CALLBACKCLASS>
class CMessageRouterCallback :
    public CComBase<IMessageRouterCallback>
{
    CMessageRouterCallback( CALLBACKCLASS *pCallbackClass = NULL) :
        m_pCallbackClass(pCallbackClass)
    {
        AddRef();   //- Require by CComBase.  This makes it so that this
                    //- class does not get automatically deleted when
                    //- Message Router is done with the class.
    }

    STDMETHODIMP MessageHandler( UCHAR *szBuffer, int nSize, DWORD dwTransCode, DWORD dwSenderID, BSTR bstrFromIP )
    {
        if ( m_pCallbackClass != NULL )
        {
            m_pCallbackClass->MessageHandler( szBuffer, nSize, dwTransCode, dwSenderID, bstrFromIP );
        }

        return S_OK;
    }
}

CComBase implements the IUnknown interface:

template < class BASE_INTERFACE, const IID* piid = &__uuidof(BASE_INTERFACE) >
class CComBase :
    public BASE_INTERFACE
{
protected:
    ULONG   m_nRefCount;

public:

    CComBase() : m_nRefCount(0) {}

    STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
    {
        if (riid == IID_IUnknown || riid == *piid )
            *ppv=this;
        else
            return *ppv=0,E_NOINTERFACE;
        AddRef();

        return S_OK;
    }

    STDMETHODIMP_(ULONG) AddRef()
    {
        return ++m_nRefCount;
    }

    STDMETHODIMP_(ULONG) Release()
    {
        if (!--m_nRefCount)
        {
            delete this;
            return 0;
        }
        return m_nRefCount;
    }

    virtual ~CComBase() {}

};

The client then uses it:

class Client
{
    CMessageRouterCallback<Client> *callback;

    Client(IMessageRouter *messageRouter)
    {
        callback = new CMessageRouterCallback<this>();

        messageRouter->Register(..., callback);
    }

    void MessageHandler(...) { ... }
}
1
You don't have that fairy around anymore to marshal calls, it is now up to you to make the callbacks in a thread-safe way. Marshaling the interface pointer in your worker thread is a hard requirement to ensure that the callback runs on the UI thread and can safely make user32 calls. CoMarshalInterThreadInterfaceInStream() is the low-level function, IGlobalInterfaceTable makes it easy to use, PostMessage() is the generic way to get code to run on the UI thread.Hans Passant
@Hans Thanks Hans. I'm familiar with IGlobalInterfaceTable and PostMessage. I'll take a look at CoMarshalInterThreadInterfaceInStream. I've tried looking, but again come up empty. Is there an existing Windows message that I can pass to PostMessage to handle part of this, or do I need to roll my own?Aaron Smith
It is entirely unclear to me why you are not using IGlobalInterfaceTable. Having the COM infrastructure taking care of it (it uses PostMessage and CoMarshalInterThreadInterfaceInStream under the hood) is the simplest approach.Hans Passant
Originally, I had been using IGlobalInterfaceTable. But it was returning an HRESULT (I can't remember the exact wording) indicating that it was not required. I can easily add it back in now that we are further along in our integration and see what happens...Aaron Smith
@Hans I added the usage of IGlobalInterfaceTable back into the code and fortunately I am getting the same issue. To make sure that I am using it correctly, a register the interface with the GIT where the client passes the interface into the server. Then on the server, from within the callback thread, I query the GIT for the callback interface using the same cookie obtained during registration.Aaron Smith

1 Answers

2
votes

There's something wrong with how those callbacks are being registered. Possible causes may be:

  • A direct pointer to the callback manager's interface in the GUI thread, so you're providing direct pointers to STA objects to the callback manager too.

    Your Callback instance in the code you added seems to be doing exactly this, it can't blindly call the raw pointer's Release when destroyed.

  • Your server objects are marshaled with CoCreateFreeThreadedMarshaler (not much different).

    With the FTM, you must never use raw pointers, and you must always marshal interface pointers you intend to keep and unmarshal interface pointers you previously kept, preferrably using the GIT. And I mean always, if you intend to keep things safe.

I recommend you keep your server objects in MTA (ThreadingModel="Free") or NA (ThreadingModel="Neutral"), make sure you're accessing them somehow in the GUI thread through CoCreateInstance[Ex] or CoGetClassObject and IClassFactory::CreateInstance (or any other object activation API), and let the "magic" happen. This is as transparent as it can get without using the GIT or manually marshaling things between threads.