2
votes

i have an interesting problem in my delphi 2009 app. when run in the debugger, i get an AV between the subroutine's Begin keyword and the first statement. i believe that's when it's setting up local variables. here's the information shown in the debugger:

uDeviceModule.pas.940: begin  // _GetMeasurementsForChannel
00AF24C8 55               push ebp
00AF24C9 8BEC             mov ebp,esp
00AF24CB 51               push ecx
00AF24CC B9E9A90100       mov ecx,$0001a9e9    // isn't this a lot for the stack?

// error happens in here
00AF24D1 6A00             push $00
00AF24D3 6A00             push $00
00AF24D5 49               dec ecx
00AF24D6 75F9             jnz $00af24d1

00AF24D8 874DFC           xchg [ebp-$04],ecx
00AF24DB 53               push ebx
00AF24DC 894DF4           mov [ebp-$0c],ecx
00AF24DF 8955FC           mov [ebp-$04],edx
00AF24E2 8945F8           mov [ebp-$08],eax
00AF24E5 33C0             xor eax,eax
00AF24E7 55               push ebp
00AF24E8 687D2FAF00       push $00af2f7d
00AF24ED 64FF30           push dword ptr fs:[eax]
00AF24F0 648920           mov fs:[eax],esp
uDeviceModule.pas.941: SelectChannel(eChannelNum);       // first statement

this is a simplified version of this nested subroutine (see below).

procedure TDeviceModule.GetMeasurements(ExpInfo:TExpInfo;
  _DisplayList:TMeasDisplayListAncestor; eExposureStatus:TExposureStatus;
  bActiveErrorEnabled:boolean);

  procedure _GetMeasurementsForChannel(_DisplayList:TObjectList;
    eChannelNum:TDeviceChannelNum; eExposureStatus:TMyEnum;
    bActiveErrorEnabled:boolean);
  var
    // these are all objects (not records)
    selChannel:TDeviceChannel;
    det:TDeviceDetector;
    shoKVMeas:TStoMeasurement;
  begin  // ********************* error happens on this line
    SelectChannel(eChannelNum);

    _GetMeasurement(ExpInfo, _DisplayList, eChannelNum, eExposureStatus, ctdVal1);
    _GetMeasurement(ExpInfo, _DisplayList, eChannelNum, eExposureStatus, ctdVal2);
    _GetMeasurement(ExpInfo, _DisplayList, eChannelNum, eExposureStatus, ctdVal3);
  end;  // _GetMeasurementsForChannel

begin
  // blah blah blah

      _GetMeasurementsForChannel(_DisplayList,
                                 eChannelNum,
                                 eExposureStatus,
                                 bActiveErrorEnabled);

  // blah blah blah
end;

it is a single-threaded app.

how would you suggest i go about finding the cause of this problem? my first thoughts were:

1) increase max stack size--i did but it didn't change anything. now it's $160000 (1441792) but before this i think it was $150000.

2) is this object still valid? seems to be...it responds to the ClassName method correctly & FastMM doesn't warn me about any problems.

interestingly, the stack trace makes no mention of the routine where the problem is caused.

:7e42b35c USER32.MoveWindow + 0xbe
:7e4565b7 USER32.GetRawInputDeviceInfoW + 0x5f
:7e428eec ; C:\WINDOWS\system32\USER32.dll
:7c90e473 ntdll.KiUserCallbackDispatcher + 0x13
ActnMenus.CallWindowHook(???,0,$31104)
:7e42b372 USER32.MoveWindow + 0xd4
:7e4565b7 USER32.GetRawInputDeviceInfoW + 0x5f
:7e428eec ; C:\WINDOWS\system32\USER32.dll
:7c90e473 ntdll.KiUserCallbackDispatcher + 0x13
:007b882d aqDockingWndProcHook + $1D
:7e42b372 USER32.MoveWindow + 0xd4
:7e4565b7 USER32.GetRawInputDeviceInfoW + 0x5f
:7e428eec ; C:\WINDOWS\system32\USER32.dll
:7c90e473 ntdll.KiUserCallbackDispatcher + 0x13
:7e428dd9 USER32.DefWindowProcW + 0xb9
:7e428d77 USER32.DefWindowProcW + 0x57
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e42a013 USER32.IsWindowUnicode + 0xa1
:7e42a039 USER32.CallWindowProcW + 0x1b
Controls.TWinControl.DefaultHandler(???)
:0050fac8 TWinControl.DefaultHandler + $DC
:0050b4b9 TControl.WndProc + $2D5
:0050f9cc TWinControl.WndProc + $518
:0050f0e3 TWinControl.MainWndProc + $2F
:0048874e StdWndProc + $16
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e428ea0 ; C:\WINDOWS\system32\USER32.dll
:7e428eec ; C:\WINDOWS\system32\USER32.dll
:7c90e473 ntdll.KiUserCallbackDispatcher + 0x13
:7e428dd9 USER32.DefWindowProcW + 0xb9
:7e428d77 USER32.DefWindowProcW + 0x57
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e42a013 USER32.IsWindowUnicode + 0xa1
:7e42a039 USER32.CallWindowProcW + 0x1b
:0050fac8 TWinControl.DefaultHandler + $DC
:0050f9cc TWinControl.WndProc + $518
:0050f0e3 TWinControl.MainWndProc + $2F
:0048874e StdWndProc + $16
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e428ea0 ; C:\WINDOWS\system32\USER32.dll
:7e428eec ; C:\WINDOWS\system32\USER32.dll
:7c90e473 ntdll.KiUserCallbackDispatcher + 0x13
:7e428dd9 USER32.DefWindowProcW + 0xb9
:7e428d77 USER32.DefWindowProcW + 0x57
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e42a013 USER32.IsWindowUnicode + 0xa1
:7e42a039 USER32.CallWindowProcW + 0x1b
:0050fac8 TWinControl.DefaultHandler + $DC
:0050f9cc TWinControl.WndProc + $518
:0050f0e3 TWinControl.MainWndProc + $2F
:0048874e StdWndProc + $16
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e428ea0 ; C:\WINDOWS\system32\USER32.dll
:7e428eec ; C:\WINDOWS\system32\USER32.dll
:7c90e473 ntdll.KiUserCallbackDispatcher + 0x13
:7e428dd9 USER32.DefWindowProcW + 0xb9
:7e428d77 USER32.DefWindowProcW + 0x57
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e42a013 USER32.IsWindowUnicode + 0xa1
:7e42a039 USER32.CallWindowProcW + 0x1b
:0050fac8 TWinControl.DefaultHandler + $DC
:0050f9cc TWinControl.WndProc + $518
:0065279d TcxControl.WndProc + $121
:0070b38d TcxCustomGrid.WndProc + $5
:0048874e StdWndProc + $16
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e428ea0 ; C:\WINDOWS\system32\USER32.dll
:7e428eec ; C:\WINDOWS\system32\USER32.dll
:7c90e473 ntdll.KiUserCallbackDispatcher + 0x13
:7e428dd9 USER32.DefWindowProcW + 0xb9
:7e428d77 USER32.DefWindowProcW + 0x57
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e42a013 USER32.IsWindowUnicode + 0xa1
:7e42a039 USER32.CallWindowProcW + 0x1b
:0050fac8 TWinControl.DefaultHandler + $DC
:0050f9cc TWinControl.WndProc + $518
:0065279d TcxControl.WndProc + $121
:0075bbc4 TcxGridSite.WndProc + $20
:0048874e StdWndProc + $16
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e428ea0 ; C:\WINDOWS\system32\USER32.dll
:7e428eec ; C:\WINDOWS\system32\USER32.dll
:7c90e473 ntdll.KiUserCallbackDispatcher + 0x13
:0044c91e HandleException + $22A
:004539af InterceptAHandleExcept + $3F
:0048874e StdWndProc + $16
:7e418734 USER32.GetDC + 0x6d
:7e418816 ; C:\WINDOWS\system32\USER32.dll
:7e4189cd ; C:\WINDOWS\system32\USER32.dll
:7e418a10 USER32.DispatchMessageW + 0xf

this suggests to me that the problem is stack overrun of some kind--bashing things used by message handling.

suggestions??? THANK YOU!

7
Are there any class constructors for any of the objects? – code4life
What are TDeviceChannel, TDeviceDetector, TStoMeasurement, TDeviceChannelNum, TMyEnum? Namely: what are SizeOf for them? – Alex
>Are there any class constructors for any of the objects? the objects were constructed elsewhere before this routine is called and we (not shown) retrieve those objects. @Alexander: the size of all of those is 4 (TDeviceChannel, TDeviceDetector, and TStoMeasurement are all objects so their SizeOf would be the SizeOf for a pointer (4)). – X-Ray
What about arguments of routine? There is no "const" qualifier, so they should be copied to stack too. – Alex
>What about arguments of routine? There is no "const" qualifier, so they should be copied to stack too. never thought of that...thank you! am still working on this. – X-Ray

7 Answers

4
votes

I strongly suspect that the TDeviceModule reference involved is invalid. You won't always see any ill effects of calling a method on a bad object reference until some way into the method body unless the method is virtual in which case the invocation of the method itself will typically (always?) yield an AV.
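
To illustrate the point about non-virtual methods, here is a minimal sketch. TDemo and its methods are hypothetical and not taken from the question. It assumes a Win32 console build where SysUtils turns the hardware fault into EAccessViolation.

```pascal
program NilSelfSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical class, used only for this illustration.
  TDemo = class
  private
    FValue: Integer;
  public
    procedure NoFieldAccess;  // static method that never touches Self
    procedure ReadsField;     // static method that dereferences Self
  end;

procedure TDemo.NoFieldAccess;
begin
  // Self is nil here, but nothing reads through it, so no fault occurs.
  Writeln('NoFieldAccess ran');
end;

procedure TDemo.ReadsField;
begin
  // Reading a field dereferences Self; with Self = nil this faults.
  Writeln('FValue = ', FValue);
end;

var
  Bad: TDemo;
begin
  Bad := nil;
  Bad.NoFieldAccess;          // the call itself succeeds
  try
    Bad.ReadsField;           // the fault shows up inside the method body
  except
    on E: EAccessViolation do
      Writeln('AV inside ReadsField');
  end;
end.
```

If the methods were declared virtual, the call would need the VMT pointer stored in the object, so the fault would be expected at the call site instead.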

3
votes

From your comment ("error happens here"), your error pops up in the loop that sets up stack space, all 852 KB of it! (The loop runs $1a9e9 = 109,033 times and each pass pushes two 4-byte values.) It has absolutely nothing to do with the parameters you're passing to the procedure and nothing to do with the viability of the object you're passing as a parameter (there's no CALL there, just a JNZ that loops back to the PUSH $00 pair until the DEC ECX operation sets the zero flag, that is, $1a9e9 times).

Since you're dealing with a procedure that uses roughly 852 KB of stack space, maybe you should try increasing the stack size by a lot more! Even better, figure out why your procedure is using up that much space and whether other procedures are in the same situation (look out for large records used as local variables).
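
The size of the frame can be worked out from the prologue in the question. This small sketch does the arithmetic, assuming 32-bit code where each PUSH $00 writes 4 bytes and the loop body contains exactly the two pushes shown in the disassembly. The constant names are mine.

```pascal
program StackProbeSize;
{$APPTYPE CONSOLE}

const
  LoopCount          = $0001A9E9; // value loaded into ECX in the prologue
  PushesPerIteration = 2;         // two "push $00" instructions per pass
  BytesPerPush       = 4;         // a 32-bit push writes one dword

begin
  Writeln('iterations: ', LoopCount);
  Writeln('bytes:      ', LoopCount * PushesPerIteration * BytesPerPush);
  Writeln('KB:         ', (LoopCount * PushesPerIteration * BytesPerPush) div 1024);
end.
```

By this count the loop runs 109,033 times and reserves 872,264 bytes, a large share of the $160000 (1,441,792) byte maximum stack size mentioned in the question, before the procedure body has even started.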

3
votes

See this question: Guard page exceptions in Delphi?

Normally, you should get a stack overflow exception when you run out of stack. But if your guard page was touched by someone else and the exception was swallowed silently without expanding the stack, then your code will crash with an AV the next time it tries to expand the stack.

This is exactly what is happening in your code: you expand the stack and you get the AV. This assembler loop is designed to touch the stack so that the guard page triggers stack expansion. Since the guard page is gone but the stack was not expanded, you get a plain AV here.

Note that increasing the stack size will not help, since the stack doesn't grow at all.

You need to find who plays with your stack.
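
One way to narrow that down is to check, before and after suspect code, whether the thread's stack still has a guard page. The following is only a sketch under my own assumptions: StackHasGuardPage is a hypothetical helper, and it relies on the Win32 VirtualQuery call and the PAGE_GUARD flag from the Windows unit. A stack that is already almost fully committed may legitimately have no guard page left, so treat a False result as a hint rather than proof.

```pascal
program StackGuardCheck;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical helper: walks the current thread's stack reservation and
// reports whether any committed region still carries the PAGE_GUARD flag.
function StackHasGuardPage: Boolean;
var
  Marker: Integer;                 // a local, so its address lies in the stack
  Info: TMemoryBasicInformation;
  StackBase, Addr: PAnsiChar;
begin
  Result := False;
  if VirtualQuery(@Marker, Info, SizeOf(Info)) = 0 then
    Exit;
  StackBase := Info.AllocationBase; // lowest address of the stack reservation
  Addr := StackBase;
  while VirtualQuery(Addr, Info, SizeOf(Info)) <> 0 do
  begin
    if Info.AllocationBase <> Pointer(StackBase) then
      Break;                        // walked past the end of the stack
    if (Info.State = MEM_COMMIT) and ((Info.Protect and PAGE_GUARD) <> 0) then
    begin
      Result := True;
      Exit;
    end;
    Inc(Addr, Info.RegionSize);     // move on to the next region
  end;
end;

begin
  if StackHasGuardPage then
    Writeln('guard page present')
  else
    Writeln('no guard page found');
end.
```

Calling the helper at a few points on the way to the failing routine should show roughly where the guard page disappears.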

2
votes

I'd comment out each of the 3 variables, then un-comment one at a time to see if any particular one of them is blowing up. If so, you've just cut your problem by 2/3.

0
votes

One possibility would be that the 3 local variables (stack variables) are growing larger than expected. I suppose this could happen if the objects are declared in a unit that is contained in another BPL and it's not rebuilt correctly (i.e. your program thinks it's smaller than it really is).
Whatever the reason, you can experiment and find out if that's happening. Place "buffer" variables between and after your 3 vars.

ex: 
  var 
    selChannel:TDeviceChannel; 
    Buff1 : array[1..1024] of AnsiChar;
    det:TDeviceDetector; 
    Buff2 : array[1..1024] of AnsiChar;
    shoKVMeas:TStoMeasurement; 
    Buff3 : array[1..1024] of AnsiChar;

This should do two things for you. 1) it should prevent the A/V, assuming that 1024 is enough. 2) by examining the arrays, you should be able to see if garbage appears. That would indicate that they're being overwritten by the declaration directly above.

0
votes

here's what i learned:

by exercising the object, i found it was healthy.

by dumping stuff on the stack i determined it really was running out of stack space.

procedure TDeviceModule.Validate;
const
  icTestSize=400000;
var
  i:integer;
  eChannelNum:TDeviceChannelNum;  // loop variable used below
begin
  // ask the object stuff to try to see if it's healthy

  SelectChannel(dcCh1);

  ClassName;

  for eChannelNum:=low(TDeviceChannelNum) to high(TDeviceChannelNum) do
    if HasChannel(eChannelNum) then
      m_aChannels[eChannelNum].Validate;

  // exercise the stack to see if loading on extra stuff is a problem...it is

  i:=0;
  while i<icTestSize do
    begin
      asm
        push 00
      end;
      inc(i);
    end;

  i:=0;
  while i<icTestSize do
    begin
      asm
        pop ecx
      end;
      inc(i);
    end;
end;

there were a few nested functions (neither their use nor their declaration was part of the question because i didn't realize how much they were part of the problem) that returned a record i'll call TBigRecord...it is 32 KB. not only that, but they were used quite a few times.

procedure TDeviceModule.GetMeasurements(blah blah blah);

  function _DoSomething1(blah blah blah):TBigRecord;
  begin
  end;

  function _DoSomething2(blah blah blah):TBigRecord;
  begin
  end;

  function _DoSomething3(blah blah blah):TBigRecord;
  begin
  end;

begin
  _DoSomething1(blah blah blah);
  _DoSomething2(blah blah blah);
  _DoSomething3(blah blah blah);
end;

each time i use it (and even if i don't use the result), i get stack space allocated for the result value.

the solution i used for now was to change those functions to procedures since i wasn't using the return value anyway.
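
a minimal sketch of that change, with made-up names (MeasureAsFunction, MeasureAsProcedure, FillMeasurement are not the real routines). the 32 KB size is the one mentioned above; the var-parameter variant is just one option for the case where the data is still needed.

```pascal
program BigResultSketch;
{$APPTYPE CONSOLE}

type
  // Stand-in for the 32 KB record described above.
  TBigRecord = record
    Data: array[0..32 * 1024 - 1] of Byte;
  end;

// Before: a function returning the record. As described above, each call
// costs the caller stack space for the result, even when it is discarded.
function MeasureAsFunction: TBigRecord;
begin
  FillChar(Result, SizeOf(Result), 0);
end;

// After (the fix used here): a plain procedure, so no result slot is needed.
procedure MeasureAsProcedure;
begin
  // do the work without returning anything
end;

// Alternative if the data is needed: the caller owns the storage and can
// reuse one record across many calls.
procedure FillMeasurement(var Rec: TBigRecord);
begin
  FillChar(Rec, SizeOf(Rec), 0);
end;

var
  Shared: TBigRecord;  // global, so it does not live on the stack
begin
  MeasureAsProcedure;
  FillMeasurement(Shared);
  Writeln('SizeOf(TBigRecord) = ', SizeOf(TBigRecord));
end.
```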

i had increased the stack space but not enough to prevent this problem.

can i expect stack overflow to be reported in such a case?

thank you all for your valuable assistance! this problem had me worried...

0
votes

Sorry to be simplistic but...

_DisplayList:TMeasDisplayListAncestor AND _DisplayList: TObjectList are both in scope simultaneously.

So are two eExposureStatus of differing types and two bActiveErrorEnabled of Boolean.

When you call _GetMeasurement(ExpInfo, _DisplayList, eChannelNum, eExposureStatus, ctdVal1) in the local procedure, which variable and type is it using? TObjectList or TMeasDisplayListAncestor?

Unless I'm just more drunk than I think... :)