2
votes

I have a Delphi 2007 application that is hitting periodic access violations in the TControl.Perform method from the standard VCL unit Controls.pas. The call stack looks like this:

exception message : Access violation at address 00000000. Read of address 00000000.

Main ($1cac):
00000000 +000 ???
004cd644 +024 mainexe.exe  Controls  5021   +5 TControl.Perform
004ce705 +015 mainexe.exe  Controls  5542   +2 TControl.CMMouseEnter
004cd9b7 +2bb mainexe.exe  Controls  5146  +83 TControl.WndProc
004d19bb +4fb mainexe.exe  Controls  7304 +111 TWinControl.WndProc
004a8ff8 +06c mainexe.exe  StdCtrls  3684  +13 TButtonControl.WndProc
004cd644 +024 mainexe.exe  Controls  5021   +5 TControl.Perform
004d182b +36b mainexe.exe  Controls  7255  +62 TWinControl.WndProc
004a8ff8 +06c mainexe.exe  StdCtrls  3684  +13 TButtonControl.WndProc
004d10e4 +02c mainexe.exe  Controls  7073   +3 TWinControl.MainWndProc
0048af08 +014 mainexe.exe  Classes  11583   +8 StdWndProc
75ce7bc5 +00a USER32.dll                       DispatchMessageA
004ecaf4 +0fc mainexe.exe  Forms     8105  +23 TApplication.ProcessMessage
004ecb2e +00a mainexe.exe  Forms     8124   +1 TApplication.HandleMessage
004ece23 +0b3 mainexe.exe  Forms     8223  +20 TApplication.Run
0136cac7 +383 mainexe.exe  mainexe    326  +45 initialization
75563398 +010 kernel32.dll                     BaseThreadInitThunk

I am unable to reproduce it in the office, so I only get the call stacks from customers directly, via MadExcept.

I am not sure how to diagnose the cause of a fault like this, let alone correct it. I'm hoping someone has seen this "TControl.Perform" style of access violation and has some idea of the root cause.

My #1 suspicion is that a form has been freed by some other area of my code while a window message for it is still being processed, and that TControl (as a base class of some real control on some real form) is failing because Self is nil or because some resource such as the window handle is invalid.

I'm looking for a technique that will help me diagnose this problem and that can be run on a client's computer, without access to the Delphi debugger. Thoughts I've had include adding some logging (but what?) or even running WinDbg (the Windows SDK debugger tool) on the client's machine.

1
I'm a little slow today. The +5 means line 5. The only code in TControl.Perform that jumps anywhere is the call to WindowProc. What this means is that you have a TControl instance for which WindowProc is nil. Something very bad has gone wrong. - David Heffernan
From the call stack, this seems like a TButtonControl descendant handling a WM_MOUSEMOVE message, see TWinControl.WndProc. It looks like the problem is with the parent of the button control (see TControl.CMMouseEnter line 5542): for some reason its WindowProc is nil. Perhaps you're using the WindowProc property for subclassing but are not restoring the original properly? This can also easily happen when subclassing multiple times and restoring the original methods in the wrong order... - Ondrej Kelle
The code I'm debugging was written by someone with no idea how to instantiate objects and manage object lifetimes properly. My guess is a freed TControl, with a form object being zapped prematurely during a show-modal loop. I have narrowed it down to one of 180 modal forms that customers could be bringing up, which are created and then freed just a bit too soon. - Warren P
@WarrenP, I would sweep through the code to look at the way they are destroyed, in particular any Free called where a Release should be used. (I had such a case at a previous company, with some stinking mix of Free-instead-of-Release and ProcessMessages.) - Francesca
I don't see why a freed object would have nil for WindowProc. A stale pointer perhaps, but nil seems unlikely. - David Heffernan

1 Answer

0
votes

The problem above was caused by an outdated version of the TMS StringGrid control. The code in question was executing on a WM_MOUSEMOVE, but the blame rests squarely with ME, not with TMS, because I was the one who didn't do a clean update of my TMS component folder.

The problem went away when I rebuilt with a later version of the code and deleted the old DCU files that were being linked into my application.

In short, your source code is what you THINK is in your app, but if some old DCU folder sits in your library or project search path for your release build, you end up with some mysterious precompiled thing linked in instead. I already knew that you shouldn't structure your project in a way that lets old DCUs linger, and that you should not keep any DCUs in your version control system, but this one slipped past me.
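
For anyone who wants to check for the same trap, here is a rough sketch of one way to spot leftover DCUs: walk a set of folders and report every unit name that has more than one .dcu file. This is not the check I actually ran, just an illustration. The function names are made up for this example, and the script does not read the Delphi library path itself, so you have to pass in the folders from your own library and project search paths.

```python
import sys
from collections import defaultdict
from pathlib import Path


def find_duplicate_dcus(search_dirs):
    """Return {unit name: [paths]} for units with more than one .dcu copy."""
    found = defaultdict(set)
    for directory in search_dirs:
        # rglob also walks subfolders; compare names case-insensitively
        for path in Path(directory).rglob("*"):
            if path.is_file() and path.suffix.lower() == ".dcu":
                # resolve() so overlapping folders do not count a file twice
                found[path.stem.lower()].add(path.resolve())
    return {unit: sorted(paths) for unit, paths in found.items() if len(paths) > 1}


def report(duplicates):
    """Print each duplicated unit with its copies, oldest file first."""
    for unit, paths in sorted(duplicates.items()):
        print(unit)
        for path in sorted(paths, key=lambda p: p.stat().st_mtime):
            print(f"  {path}")


if __name__ == "__main__":
    # Folders to scan are given on the command line (hypothetical usage)
    report(find_duplicate_dcus(sys.argv[1:]))
```

Any unit that shows up with two or more copies is a candidate for the compiler picking the wrong one, so the older copies are the ones to delete before doing a full rebuild.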