[Deep breath.] We have an application that pops up a window using wxMotif 2.6.3 (the GUI library was not, and is not, my choice). It runs fine on 32-bit ix86 systems. I was given the task of converting it to a 64-bit application, and the 64-bit build always segfaults. I'm on RHEL 6, so I compiled using gcc 4.4.7. After much gnashing of teeth, the problem seems apparent: in wxFrame::DoCreate, m_mainWidget is set (correctly); in wxFrame::GetMainWidget, it is returned as a null pointer. The null pointer results in the crash. According to gdb, the instruction that sets m_mainWidget is

mov    %rax,0x1e0(%rdx) # $rdx = 0x68b2f0

whereas the code that gets m_mainWidget is

mov    0x1f0(%rax),%rax # $rax = 0x68b2f0

In gdb, I can examine the memory and see that the pointer at 0x68b4d0 is correct. Why is the offset incorrect?
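To make the arithmetic explicit, here is a minimal sketch that only restates the values from the gdb session above. The names `field_address` and `verify_gdb_addresses` are hypothetical helpers for illustration, not anything from wxWidgets.

```cpp
#include <cassert>
#include <cstdint>

// Object address from the gdb session ($rdx in the setter, $rax in the getter).
constexpr std::uintptr_t kObject = 0x68b2f0;

// Address touched by a memory access at `displacement` bytes from `object`.
constexpr std::uintptr_t field_address(std::uintptr_t object,
                                       std::uintptr_t displacement) {
    return object + displacement;
}

// Checks the two addresses implied by the disassembly.
void verify_gdb_addresses() {
    // The setter writes at 0x1e0 past the object: the address that holds the valid pointer.
    assert(field_address(kObject, 0x1e0) == 0x68b4d0);
    // The getter reads at 0x1f0 past the object: 16 bytes further on.
    assert(field_address(kObject, 0x1f0) == 0x68b4e0);
}
```

So the getter, as disassembled in gdb, reads 16 bytes (two pointer-sized slots on a 64-bit build) past the location that the setter wrote.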

To confuse things even more, when I use objdump to disassemble libwx_motifd_core-2.6.so.0.3.1, the "get" assembly is

  mov    0x1e0(%rax),%rax

In objdump, both the get and the set use 0x1e0 as the offset. What is going on?

I've uploaded some relevant info here: GitHub

I've included a small program that replicates the problem on my system.

Investigating further, I see in the disassembly of wxFrame::DoCreate that later uses of m_mainWidget retrieve the value using 0x1e0 as the offset. (The disassembly is from a build compiled with -O0, so the code has to go back to memory each time.) "Just for fun," I added a new member variable to wxFrame, m_myMainWidget, and set it right after m_mainWidget was set. I then had wxFrame::GetMainWidget() return the new member (m_myMainWidget). Wouldn't you know it: the crash still occurs, and GetMainWidget shows the same +16 offset when I disassemble from within gdb. (The offset is not there when I use objdump to disassemble.)

Could this be a difference in compiler optimization levels? - BlackVegetable
Same behavior whether I use -O2 or -O0. - John
Could it be a (dynamic) linking problem? - BlackBear
Somewhere, you have two translation units, or two modules, built with different compiler settings or different macro definitions. As a result, these two modules don't agree on the binary layout of the class. E.g. class MyWidget { MyInt a; MyInt b; }; If MyInt is, say, typedef'ed as a 32-bit integer sometimes and a 64-bit integer other times, then the offset to MyWidget::b would be different. - Igor Tandetnik
Looking at github.com/tagged/wx/blob/master/include/wx/frame.h , there are macros that conditionally add or remove class members. That's scary. Be very very careful that you are defining these macros the same way in your application as they were defined when the library was built. - Igor Tandetnik
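To illustrate the kind of layout disagreement described in the comments above, here is a minimal sketch. Everything in it is hypothetical: the two `Frame` structs, the `extra1`/`extra2` members, and the namespace names are stand-ins, not the real wxFrame layout. The two namespaces exist only so that both "views" of the class can sit in one file. In a real program, the two views would come from the same header compiled with different macro definitions.

```cpp
#include <cassert>
#include <cstddef>

// View of the class when the (hypothetical) feature macro is NOT defined.
namespace built_without_macro {
struct Frame {
    void* a;
    void* m_mainWidget;
};
}  // namespace built_without_macro

// View of the same class when the macro IS defined: two extra members
// appear ahead of m_mainWidget, as an #ifdef block in a header might add.
namespace built_with_macro {
struct Frame {
    void* a;
    void* extra1;
    void* extra2;
    void* m_mainWidget;
};
}  // namespace built_with_macro

// Number of bytes by which the two views disagree about m_mainWidget.
std::size_t layout_shift() {
    return offsetof(built_with_macro::Frame, m_mainWidget) -
           offsetof(built_without_macro::Frame, m_mainWidget);
}

// Checks that the shift equals the size of the two extra members.
void check_layout_shift() {
    assert(layout_shift() == 2 * sizeof(void*));
}
```

Code compiled against one view writes the member at one offset, and code compiled against the other view reads it at a different offset. With 8-byte pointers the disagreement in this made-up example is 16 bytes, chosen only to mirror the size of the discrepancy in the question. It does not claim to identify which members are actually involved.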

1 Answer

Based on @Igor's comment, I have looked at the class layouts using the -fdump-class-hierarchy compiler option. It turns out that there is indeed a vtable layout mismatch, due to this conditional block in include/wx/app.h:

#ifdef __WXDEBUG__
    virtual void OnAssert(const wxChar *file,
                          int line,
                          const wxChar *cond,
                          const wxChar *msg);
#endif // __WXDEBUG__

You need to make sure you compile your code with the same __WXDEBUG__ setting that was used when the library was built.