
Consider the following simplified example function:

void foo(void) {
    int t;
    asm("push %0\n\t"
        "push %0\n\t"
        "call bar"
        :
        : "m" (t)
        :
        );
}

If I compile it with Cygwin x86 gcc 4.8.3 without optimization, the push instructions become:

push -4(%ebp)
push -4(%ebp)

This is good. The problem happens with -O:

push 12(%esp)
push 12(%esp)

This is clearly wrong. The first push changes esp, and the second push then accesses the wrong location. I have read that adding "%esp" to the clobber list should fix it, but it does not help. How can I make GCC use a frame pointer or properly consider esp changes?

(Assume that returning from the bar function will set esp to the value it had before the asm statement. I just need to call a thiscall function and am using inline assembly for that.)

How about "r" (&t)? That gives me push %rax\npush %rax. – David Wohlferd
It's 'wrong' because gcc doesn't perform any sort of semantic analysis on the asm body. gcc asm doesn't accept %ebp or %esp as constraints IIRC, nor %ebx on an ELF (PIC) platform. I think @DavidWohlferd should submit an answer - it's a good choice, since %eax is often clobbered as a return value anyway. But "a" (&t) instead. – Brett Hale
Well, if we want to assume %eax is getting returned, then we should probably do something more like "=a" (retval) : "0" (&t). After all, we'd have to let gcc know the contents of eax are changing. I will say that calling back into C++ code is going to be tricky. Have you pushed every register the callee might change before doing the call (and popped them on return)? Are you sure everything bar might need that gcc has stored in registers got flushed back to memory before doing the call? It's possible there are no options here, but using inline asm for this would not be my first choice. – David Wohlferd
Using "r" would work, but it is kind of messy and leads to other problems. This is just a simplified example. The function which I'm actually calling has more parameters, and clobbers some registers. GCC does not allow registers used for input to be in the clobber list. I would have to introduce dummy output variables for some clobbered registers so I don't run out of registers. – dreamlayers
I see nothing inherently "messy" about using "r" here, although it can potentially limit the number of registers available elsewhere (which could result in messiness). And while using dummy outputs is esthetically impure, I'm not aware of any actual performance implications to using them. So while this may indeed be the best solution, it's hard to say without seeing the actual code, knowing the calling conventions, etc. – David Wohlferd
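
A note on the remark above that inline asm would not be the first choice: since the stated goal is only to call a thiscall function, GCC's thiscall function attribute (documented for 32-bit x86 targets) may make the asm unnecessary, because the compiler then emits the pushes itself and keeps track of esp. The following is a minimal sketch of that swapped-in technique, not the asker's code. struct widget, widget_add, call_widget_add and the THISCALL macro are hypothetical names chosen for illustration, and the macro is assumed to expand to nothing on other targets so the sketch still builds there.

```c
/* thiscall is an x86-32 attribute; expand to nothing elsewhere (assumption
   made only so this sketch compiles on other targets). */
#if defined(__GNUC__) && defined(__i386__)
#define THISCALL __attribute__((thiscall))
#else
#define THISCALL
#endif

/* Hypothetical object and "method" standing in for the real callee. */
struct widget { int total; };

/* With thiscall, the first argument travels in ecx, the rest go on the
   stack, and the callee pops them. */
THISCALL int widget_add(struct widget *self, int a, int b)
{
    self->total += a + b;
    return self->total;
}

/* A pointer type that carries the calling convention with it. */
typedef int (THISCALL *widget_add_fn)(struct widget *, int, int);

/* The compiler generates the argument pushes and the call, so no operand
   can end up addressed relative to a stale esp. */
int call_widget_add(widget_add_fn fn, struct widget *w, int t)
{
    return fn(w, t, t);
}
```

Whether this fits depends on how the real function pointer is obtained and on the exact ABI of the callee, which the question does not show.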

1 Answer

I can simply use __attribute__((optimize("-fno-omit-frame-pointer"))) to prevent the problem-causing optimization in this one function. This might be the best solution. – dreamlayers
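
A minimal sketch of how that attribute might be applied, assuming GCC on 32-bit x86. The callee bar here is a hypothetical stand-in declared stdcall (it pops its own arguments, as a thiscall function does), and it is called through a register operand rather than by symbol name. seen_a, seen_b and the CALLEE_POPS macro are likewise illustrative names. On other targets the asm is replaced by a plain call so the sketch still builds. Two caveats: GCC's documentation describes the optimize attribute as intended for debugging rather than production code, and nothing formally guarantees that an "m" operand is addressed through ebp even when the frame pointer is kept, so the generated assembly is worth checking.

```c
/* Hypothetical callee: records its arguments so the call can be observed. */
int seen_a, seen_b;

#if defined(__GNUC__) && defined(__i386__)
#define CALLEE_POPS __attribute__((stdcall)) /* callee pops its arguments */
#else
#define CALLEE_POPS
#endif

CALLEE_POPS void bar(int a, int b)
{
    seen_a = a;
    seen_b = b;
}

/* Ask GCC to keep the frame pointer in this one function only. */
__attribute__((optimize("no-omit-frame-pointer")))
void foo(int value)
{
    int t = value;
#if defined(__GNUC__) && defined(__i386__)
    void (CALLEE_POPS *fn)(int, int) = bar;
    __asm__ volatile(
        "pushl %0\n\t"   /* second argument */
        "pushl %0\n\t"   /* first argument */
        "call *%1"       /* bar pops both arguments before returning */
        :
        : "m" (t), "r" (fn)
        /* caller-saved registers and memory may change across the call */
        : "eax", "ecx", "edx", "cc", "memory");
#else
    bar(t, t);           /* fallback for targets without this ABI */
#endif
}
```

Listing eax, ecx and edx as clobbers keeps the two inputs out of the registers the callee is allowed to overwrite.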