0
votes

After failing to get DebugDiag to analyse crash-dump files it was suggested that I try using WinDbg instead.

The crash-dump files have been created on a Windows Server 2016 box, running my ASP.Net 4.5.2 web application on IIS-10. My ASP.Net web application contains several 3rd party components, with their individual DLLs.

I have copied the crash-dump files onto my Windows 10 development machine, and am running WinDbg locally instead of on the server.

The problem is... when I run !analyze -v in WinDbg on any of the crash-dump files, it effectively hangs while "Downloading file xxx.DLL" (xxx.DLL being the name of just one of the 3rd party component DLLs), and eventually cancels itself after a period of time.

I'm running WinDbg on the same machine that I built the website on in the first place... so is there a way of telling WinDbg that it can find the DLL in a particular location on the local machine?

I obviously don't have a .pdb file for any of the 3rd party components, and so I'm not bothered about it loading symbols for those DLLs... but either I somehow tell it to ignore those particular DLLs, or I tell it how to find them locally.

Can anybody point me in the right direction?

3
You don't really need !analyze -v for a .NET Framework application, as it's primarily meant for native crashes and similar failures. Walk through the managed threads and check their call stacks, and the culprit should be clear. Hints can be found in posts like dougrathbone.com/blog/2014/03/20/… - Lex Li
Thanks again for your help @Lex (turning into my personal assistant)... and thanks for the blog post. I'll continue investigating - freefaller
@Lex - taken a while, but I've eventually tracked down the issue to a specific 3rd party component. Thanks again for the blog post, it was really helpful - freefaller
You can summarize what you learned as an answer and accept it. .NET dump analysis is actually easy for certain scenarios (like yours), so once you master the basic steps you can conquer bigger ones in the future. - Lex Li
@Lex - finally found some time to write an answer. I'm sure there are plenty of holes in my methods, but it's accurate as to how I managed to do it - freefaller

3 Answers

0
votes

You don't have to analyze the dump file with !analyze -v. If you need to load a DLL, .load D:.... is enough.

To analyze a dump file manually, run .loadby sos clr to load the SOS debugging extension. If the crash server and your machine run different versions of the .NET Framework, you need to load sos.dll manually instead.

When you need to debug a .NET application in IIS, the MEX extension is recommended. https://www.microsoft.com/en-us/download/details.aspx?id=53304

You can load mex.dll via .load c:\.....\mex.dll

!mex.aspxpages shows all requests inside the process and their progress.

!mex.mthreads shows the status of all threads.

!mex.clrstack2 shows all exceptions and the managed call stack for a specific thread.

1. Use ~* k to display the full call stack of every thread and !mex.mthreads to check their status. You may then find something like KERNELBASE!RaiseException in a specific thread.

2. Switch to that thread with ~<thread number>s, for example ~12s.

3. Run !mex.clrstack2 and it will show the exception that caused the crash.

0
votes

Basically, no, you cannot speed up the process of loading symbols for DLLs where you don't have symbols. IMHO, the only way of speeding up symbol loading would be to remove the HTTP symbol server from the symbol path, so that symbols are only searched for on your local disk.
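As a rough illustration of that difference, the sketch below builds a symbol path string with and without the Microsoft symbol server entry. The directory names C:\WinDbg\Symbols and C:\SymCache are only example values, and build_symbol_path is a hypothetical helper, not part of any debugger API. The resulting string can be pasted after .sympath in WinDbg or assigned to the _NT_SYMBOL_PATH environment variable.

```python
from typing import List

# Public Microsoft symbol server URL.
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_symbol_path(local_dirs: List[str], cache_dir: str,
                      use_http_server: bool) -> str:
    """Build a semicolon-separated symbol path (hypothetical helper)."""
    parts = list(local_dirs)
    if use_http_server:
        # srv*<local cache>*<server URL> asks the debugger to download
        # missing symbols from the server into the local cache directory.
        parts.append(f"srv*{cache_dir}*{MS_SYMBOL_SERVER}")
    return ";".join(parts)


# Example directories, assumed for illustration only.
local_only = build_symbol_path([r"C:\WinDbg\Symbols"], r"C:\SymCache", False)
with_server = build_symbol_path([r"C:\WinDbg\Symbols"], r"C:\SymCache", True)

print(local_only)
print(with_server)
```

With the local-only path the debugger has no server to contact, so a DLL without a .pdb fails fast instead of waiting on downloads; the trade-off is that Microsoft's own symbols are only available if they are already on disk.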

See also: How to set up symbols in WinDbg if you have not done this often.

Getting an HTTP 404 for those files should not take very long. However, WinDbg tries various file endings, pointer files and so on, and sometimes the Microsoft servers are slow. With a lot of 3rd party DLLs, those delays add up, which can be pretty annoying.

0
votes

I'll start by saying I don't 100% understand everything I had to do, but here are the steps I took to discover where the stack overflow issue was in my application...

The majority of the information came from this blog.

  • On the server I added the following registry settings to create the crash dump files...

    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps]
    
    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\w3wp.exe]
    "DumpCount"=dword:00000005
    "DumpFolder"=hex(2):43,00,3a,00,5c,00,43,00,72,00,61,00,73,00,68,00,44,00,75,\
    00,6d,00,70,00,73,00,5c,00,00,00
    

(DumpCount is the number of files to store before it starts overwriting old ones. DumpFolder is where the files are saved; it is a REG_EXPAND_SZ value and in my case represents C:\CrashDumps\)
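To check what a hex(2) value in a .reg file actually contains, it can be decoded as UTF-16LE text. The following is a minimal sketch; decode_reg_hex2 and encode_reg_hex2 are hypothetical helper names used only for this illustration.

```python
def decode_reg_hex2(value: str) -> str:
    """Decode a .reg 'hex(2):' payload (REG_EXPAND_SZ) into a string."""
    # Remove line-continuation backslashes and all whitespace.
    cleaned = "".join(value.replace("\\", "").split())
    if cleaned.lower().startswith("hex(2):"):
        cleaned = cleaned[len("hex(2):"):]
    raw = bytes(int(part, 16) for part in cleaned.split(",") if part)
    # .reg files store the string as UTF-16LE with a terminating NUL.
    return raw.decode("utf-16-le").rstrip("\x00")


def encode_reg_hex2(text: str) -> str:
    """Encode a string as a single-line 'hex(2):' payload."""
    raw = (text + "\x00").encode("utf-16-le")
    return "hex(2):" + ",".join(f"{b:02x}" for b in raw)


# The DumpFolder value from the registry export above.
dump_folder = decode_reg_hex2(
    "hex(2):43,00,3a,00,5c,00,43,00,72,00,61,00,73,00,68,00,44,00,75,\\\n"
    "    00,6d,00,70,00,73,00,5c,00,00,00"
)
print(dump_folder)  # prints C:\CrashDumps\
```

The encoder goes the other way, which may help when pointing DumpFolder at a different directory.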

  • Waited for crashes to happen
  • Copied the crash files into a directory on my local machine called C:\WinDbg\CrashDumps\
  • Created another directory called C:\WinDbg\Symbols, into which I placed...
    • clr.dll (from the server, taken from C:\Windows\Microsoft.NET\Framework64\v4.0.30319\)
    • sos.dll (from the server, taken from C:\Windows\Microsoft.NET\Framework64\v4.0.30319\)
    • all .dll and .pdb files from my local development environment, including third party component .dll files
  • Installed WinDbg via Windows Store on my Windows 10 development machine
  • Ran windbgx -y c:\windbg\symbols via Run command (for some reason it's windbgx on my machine but maybe that's because it's via the Store rather than manual download)
  • From the File menu, chose Open dump file and selected one of the dump files in C:\WinDbg\CrashDumps
  • Ran the following commands...
    • .symfix
    • .reload
    • .load c:\windbg\symbols\sos.dll (see note 1 below)
    • !clrstack (see note 2 below)

Although this didn't give me all the information I expected, what it did show was that one of my 3rd party components was 100% to blame for the stack overflow exception.

Note 1 - Lots of places I read said that .loadby sos clr should be used, but that just gave me the error "The call to LoadLibrary(C:\ProgramData\Dbg\sym\clr.dll\5E7D1F3B9eb000\sos.dll) failed" and I couldn't figure out how to fix it... so instead I used .load c:\windbg\symbols\sos.dll.

Note 2 - The !clrstack command worked because WinDbg appeared to pre-select the thread that had the exception. The other option is to use ~*e !clrstack which will show you call stacks for ALL threads.
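With several dump files to go through, the same commands could in principle be run unattended. The sketch below only builds the argument list and assumes the console debugger cdb.exe from the Debugging Tools for Windows is installed and on the PATH, which was not part of the steps above. It relies on cdb's -z (dump file), -y (symbol path) and -c (initial commands) options; build_cdb_args and the dump file name example.dmp are hypothetical.

```python
from typing import List


def build_cdb_args(dump_file: str, symbol_dir: str, sos_path: str) -> List[str]:
    """Build a cdb.exe command line that replays the manual steps."""
    commands = "; ".join([
        ".symfix",            # same commands as typed into WinDbg above
        ".reload",
        f".load {sos_path}",  # load sos.dll by explicit path (see note 1)
        "!clrstack",          # managed stack of the pre-selected thread
        "q",                  # quit so the process ends after printing
    ])
    return ["cdb.exe", "-z", dump_file, "-y", symbol_dir, "-c", commands]


args = build_cdb_args(
    r"C:\WinDbg\CrashDumps\example.dmp",  # hypothetical dump file name
    r"c:\windbg\symbols",
    r"c:\windbg\symbols\sos.dll",
)
print(args)

# On a machine with cdb.exe available, the list could be passed to
# subprocess.run(args, capture_output=True, text=True) to collect the output.
```

Swapping !clrstack for ~*e !clrstack in the command list would dump the managed stacks of all threads, as described in note 2.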