GCC inline assembly - Move float to XMM0 before call

Question

I'm currently trying to call a generic C function from GCC inline assembly (bad idea, I know, but I'm bored today...).

My operating system is Mac OS X, 64-bit, so the calling convention is System V, meaning the first six integer arguments are passed through the rdi, rsi, rdx, rcx, r8 and r9 registers. Other arguments are pushed to the stack.

I know the function signature, so I can guess the return type, and the type of the arguments. With that information, I can place the arguments in the correct registers.

Everything is working great with integer types, but I have a problem with floating point values.

Floating point values need to be passed through the xmm0-xmm7 registers.

So the problem is basically the following. I've got a C variable of type float. I need to move that variable into, let's say, the xmm0 register, using GCC's inline assembly.

Imagine the following code:

#include <stdio.h>

void foo( int x )
{
    printf( "X: %i\n", x );
}

int main( void )
{
    int x = 42;

    __asm__
    (
        "mov %[x], %%rdi;"
        "call _foo;"
        :
        : [ x ] "m" ( x )
    );

    return 0;
}

The function foo is called, with 42 as parameter. It works...

Now I try the same with a float argument. I only have to use movss instead of mov, and it works.

The problem comes when I try to call both functions:

#include <stdio.h>

void foo( int a )
{
    printf( "A: %i\n", a );
}

void bar( float b )
{
    printf( "B: %f\n", b );
}

int main( void )
{
    int   a = 42;
    float b = 42;

    __asm__
    (
        "mov %[a], %%rdi;"
        "call _foo;"
        "movss %[b], %%xmm0;"
        "call _bar;"
        :
        : [ a ] "m" ( a ),
          [ b ] "m" ( b )
    );

    return 0;
}

The function taking the float argument receives 0. I don't understand why. I don't touch the stack, so there's no cleanup to do...

If I call the functions directly from C, GCC produces the following:

movl    $42, -4(%rbp)
movl    $0x42280000, %eax
movl    %eax, -8(%rbp)
movl    -4(%rbp), %edi
call    _foo
movss   -8(%rbp), %xmm0
call    _bar

I don't get the difference... Any help will be greatly appreciated : )

Have a nice day, all

EDIT

As requested, here's the ASM output when using inline assembly:

 movl    $42, -4(%rbp)
 movl    $0x42280000, %eax
 movl    %eax, -8(%rbp)
 mov    -4(%rbp), %rdi;
 call    _foo;
 movl    -8(%rbp), %eax;
 movl    %eax, -4(%rbp);
 movss    -4(%rbp), %xmm0;
 call    _bar;

EDIT2

As requested, here's the GDB output:

0x100000e9e <main+4>:   movl   $0x2a,-0x4(%rbp)
0x100000ea5 <main+11>:  mov    $0x42280000,%eax
0x100000eaa <main+16>:  mov    %eax,-0x8(%rbp)
0x100000ead <main+19>:  mov    -0x4(%rbp),%rdi
0x100000eb1 <main+23>:  callq  0x100000e54 <foo>
0x100000eb6 <main+28>:  movss  -0x8(%rbp),%xmm0
0x100000ebb <main+33>:  callq  0x100000e75 <bar>

The important part is already at the bottom of the post. Tell me if you need some other code... : ) Thanx — Macmade
I thought the assembly at the bottom of the post is what gcc generated when using C to call foo() and bar(). What I'd like to see is what gcc generates when you're using the inline assembly to call those functions? — Michael Burr
It looks like the inline assembly should be loading 0x42280000 into xmm0 (though I don't know why it's taking extra steps to do so through -4(%rbp)) - I'd have to step through it with a debugger to see what I'm missing that's setting xmm0 to zero. Unfortunately, I don't have a gcc that'll target x64 readily available to me at the moment. Have you tried stepping through the assembly in gdb to see where it's going wrong? — Michael Burr

ughoavgfhw · Accepted Answer · 2011-05-02T21:46:33

It took me a while, but I figured this out. In the output using inline assembly, gcc stores the values at negative offsets from rbp. However, since it doesn't know about the function calls inside the inline assembly, it treats main as a leaf function that calls nothing. Therefore, it puts the variables in the red zone (the 128 bytes below rsp that the System V x86-64 ABI lets leaf functions use without adjusting the stack pointer) and doesn't change rsp to make room for them. When you call foo, the return address is pushed onto the stack, overwriting your stored variables and giving you an incorrect value.
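
To make the overwrite concrete, here is a small portable simulation rather than real stack manipulation. It assumes rbp equals rsp at the call (which the listings suggest, since gcc never adjusts rsp), so the 8-byte return address lands on -8(%rbp) through -1(%rbp), exactly where a and b live. The function name b_after_call and the frame array are made up for illustration; the return address 0x100000eb6 is the instruction after the call to foo in the GDB listing.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes a 4-byte IEEE-754 float). */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical model of the 8 bytes at -8(%rbp)..-1(%rbp):
   frame[0..3] holds b (at -8), frame[4..7] holds a (at -4).
   Returns what a later `movss -8(%rbp), %xmm0` would load. */
float b_after_call(float b, int32_t a, uint64_t return_address)
{
    unsigned char frame[8];
    uint32_t b_bits, loaded = 0;
    int i;

    memcpy(&b_bits, &b, sizeof b_bits);
    for (i = 0; i < 4; i++) {
        frame[i]     = (unsigned char)(b_bits >> (8 * i));
        frame[4 + i] = (unsigned char)((uint32_t)a >> (8 * i));
    }

    /* `call` pushes the return address, little-endian, over the same 8 bytes. */
    for (i = 0; i < 8; i++)
        frame[i] = (unsigned char)(return_address >> (8 * i));

    /* Read back the 4 bytes where b used to be. */
    for (i = 0; i < 4; i++)
        loaded |= (uint32_t)frame[i] << (8 * i);

    return float_from_bits(loaded);
}
```

Calling printf( "%f\n", b_after_call( 42.0f, 42, 0x100000eb6ULL ) ) prints 0.000000: the low half of the return address, 0x00000eb6, is a tiny denormal when read as a float, which would explain why bar appears to receive 0.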

If, at any point in the main function outside of the assembly, you called a function, then gcc would adjust rsp to reserve space for the variables instead of leaving them in the red zone. For example, if you add foo(-1); to the top of main, it would work.
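
If you would rather keep the call inside the assembly, one possible approach (a sketch, not the only way) is to step over the red zone yourself and tell gcc which registers the call may clobber. The helper name call_with_float and the last_b variable are hypothetical; the sketch assumes GCC or Clang targeting x86-64 System V, and falls back to a plain C call on other targets so the file still builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static float last_b;   /* records what bar() actually received */

void bar( float b )
{
    last_b = b;
    printf( "B: %f\n", b );
}

/* Hypothetical helper: calls fn( b ) from inline assembly. */
float call_with_float( void ( *fn )( float ), float b )
{
    assert( fn != NULL );
#if defined( __x86_64__ ) && defined( __GNUC__ ) && !defined( _WIN32 )
    /* Pin the argument to xmm0, where the first float argument is passed. */
    register float arg __asm__( "xmm0" ) = b;

    __asm__ volatile
    (
        "mov  %%rsp, %%rbx\n\t"   /* remember the current stack pointer   */
        "sub  $128, %%rsp\n\t"    /* step over the 128-byte red zone      */
        "and  $-16, %%rsp\n\t"    /* keep the stack 16-byte aligned       */
        "call *%[fn]\n\t"
        "mov  %%rbx, %%rsp\n\t"   /* restore the original stack pointer   */
        : "+x" ( arg )
        : [ fn ] "r" ( fn )
        /* registers the callee may overwrite, plus rbx used above */
        : "rax", "rbx", "rcx", "rdx", "rsi", "rdi",
          "r8", "r9", "r10", "r11",
          "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13",
          "xmm14", "xmm15", "memory", "cc"
    );
#else
    fn( b );   /* non-x86-64 fallback: ordinary C call */
#endif
    return last_b;
}
```

With this, call_with_float( bar, 42.0f ) should print B: 42.000000 even when the enclosing function makes no other calls. Compiling with -mno-red-zone is another option, since gcc then stops placing locals below rsp, though you still need the clobber list.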
