We are using the now-deprecated Windows Azure Accelerator to deploy multiple applications to a Windows Azure web role. We have noticed a massive memory leak in the WAIISHost.exe process: it is currently consuming 2.5GB of RAM (on a Large Azure instance). One week ago it was at 1.5GB, so it appears to be leaking roughly 1GB per week.
We've looked at the memory dump and it appears that the leak is unmanaged - using SOS in WinDBG revealed no more than 50MB of managed heap.
We've used the heap_stat.py WinDBG extension, and it revealed that most of the allocated objects come from nativerd.dll (which I believe is an internal infrastructure library). Here is what !py heap_stat.py -stat revealed:
Statistics:
Type name                      Count      Size
nativerd!SCHEMA_ATTRIBUTE      8127384    Unknown
nativerd!ATTRIBUTE_VALUE       8127037    Unknown
nativerd!SCHEMA_ELEMENT        2032263    Unknown
nativerd!CONFIG_ELEMENT        1112616    Unknown
nativerd!NAMED_ENTRY_KEY       99967      Unknown
nativerd!DICTIONARY_LIST       54152      Unknown
nativerd!DUPLICATE_TABLE       11654      Unknown
Running !heap -p -a on any of those objects did not reveal much additional information:
0:000> !heap -p -a 000000002c1591e0
    address 000000002c1591e0 found in
    _HEAP @ 8d0000
      HEAP_ENTRY       Size Prev Flags    UserPtr          UserSize - state
      000000002c1591e0 0014 0000 [00]     000000002c1591f0 00130    - (busy)
        nativerd!SCHEMA_ELEMENT::`vftable'
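The !heap -p -a output does give one usable number: the UserSize of the sampled SCHEMA_ELEMENT. As a rough sketch, the Python below multiplies it by the SCHEMA_ELEMENT count from the statistics. It assumes that every SCHEMA_ELEMENT has the same size as the one sampled, and that the heap granularity is 16 bytes (the usual value for a 64-bit process); neither assumption has been verified against the dump.

```python
# Figures copied from the WinDbg output in this post.
SCHEMA_ELEMENT_COUNT = 2032263   # Count column of heap_stat.py -stat
USER_SIZE_HEX = "00130"          # UserSize column of !heap -p -a
ENTRY_SIZE_HEX = "0014"          # Size column, expressed in heap units

# Assumption: 64-bit process, so one heap unit is 16 bytes.
HEAP_GRANULARITY = 16

user_size = int(USER_SIZE_HEX, 16)                      # 304 bytes
entry_size = int(ENTRY_SIZE_HEX, 16) * HEAP_GRANULARITY  # 320 bytes incl. header

# Assumption: all instances are as large as the sampled one.
user_total = SCHEMA_ELEMENT_COUNT * user_size
entry_total = SCHEMA_ELEMENT_COUNT * entry_size

MIB = 1024 * 1024
print(f"payload only : {user_total / MIB:.0f} MiB")
print(f"with headers : {entry_total / MIB:.0f} MiB")
```

If the assumptions hold, SCHEMA_ELEMENT alone accounts for several hundred megabytes, which would be a meaningful share of the 2.5GB. Sampling one object of each of the other types with !heap -p -a would allow the same estimate for them.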
At this point, we are wondering what the next steps in investigating the memory leak should be. Is there any other useful information that can be extracted from the memory dump, or should we resort to other means, such as inspecting the code and trying to run it locally with a profiler?
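One option that needs only dumps is to take a second dump later and compare the heap_stat.py counts, to see which types are actually growing. The Python sketch below does that comparison. The row format it parses (type name, count, size separated by whitespace) is assumed from the listing above, the function names are made up for illustration, and the "before" numbers are invented placeholders, not real measurements.

```python
import re

# Assumed row format: "module!TYPE  <count>  <size>", as in the listing above.
ROW = re.compile(r"^(\S+!\S+)\s+(\d+)\s+(\S+)$")

def parse_stat(text):
    """Return {type name: count} for every row that matches ROW."""
    counts = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts

def growth(before, after):
    """Return (type name, count delta) pairs, largest growth first."""
    names = set(before) | set(after)
    deltas = {n: after.get(n, 0) - before.get(n, 0) for n in names}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)

# Placeholder snapshot: these counts are invented for the example.
before_text = """
Type name Count Size
nativerd!SCHEMA_ATTRIBUTE 5000000 Unknown
nativerd!SCHEMA_ELEMENT 1250000 Unknown
nativerd!DUPLICATE_TABLE 11654 Unknown
"""

# Counts taken from the statistics in this post.
after_text = """
Type name Count Size
nativerd!SCHEMA_ATTRIBUTE 8127384 Unknown
nativerd!SCHEMA_ELEMENT 2032263 Unknown
nativerd!DUPLICATE_TABLE 11654 Unknown
"""

ranked = growth(parse_stat(before_text), parse_stat(after_text))
for name, delta in ranked:
    print(f"{name:30} {delta:+d}")
```

Types whose count stays flat between dumps are unlikely to be part of the leak, which narrows down where to look next.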
Update: Our VMs are running Windows Server 2008 R2 SP1, and we are using Azure SDK 1.7. The version of nativerd.dll is 7.5.7601.17855.
lm vm nativerd
– Thomas Weller