3
votes

I'm on a 64-bit AMD CPU (I don't think the exact model matters) running 64-bit Linux, compiling with gcc to elf64.

I've seen from the C ABI that integer arguments are passed to a function via general purpose registers, and I can find the values on the assembly side of my code (the callee). The problem arises when I need to retrieve the results from the callee to the caller.

As far as I can understand, RAX gets the 1st integer returned value, and I can easily find and use that one. The 2nd integer returned value is passed via RDX. And this is the point that baffles me.

I can also see, from the C ABI, that RDX is the register used to pass the third integer function argument from the caller to the callee, but my function doesn't use a third argument.

How do I get the RDX out from my function? Do I have to fake an argument in the function just to be able to refer to it on the caller side?


fixed point multiplication 16.16:

called from C looks like:

typedef long int Fixedpoint;
Fixedpoint _FixedMul(Fixedpoint v1, Fixedpoint v2);

and this is the function itself:

_FixedMul:
   push bp
   mov bp, sp
; entering the function EDI contains v1, ESI contains v2. So:
   mov eax, edi   ; eax = v1
   imul dword esi ; eax = v1 * v2
                  ; at this point EDX contains the higher part of the
                  ; imul multiplication, EAX the lower one.
   add eax, 8000h ; round by adding 2^(-17)
   adc edx, 0     ; whole part of result is in DX
   shr eax, 16    ; put the fractional part in AX
   pop bp
   ret

from System V Application Binary Interface AMD64 Architecture Processor Supplement

Returning of Values algorithm:

The returning of values is done according to the following

  1. Classify the return type with the classification algorithm.
  2. If the type has class MEMORY, then the caller provides space for the return value and passes the address of this storage in %rdi as if it were the first argument to the function. In effect, this address becomes a “hidden” first argument. This storage must not overlap any data visible to the callee through other names than this argument. On return %rax will contain the address that has been passed in by the caller in %rdi.
  3. If the class is INTEGER, the next available register of the sequence %rax, %rdx is used.

I hope it is now clearer what I mean.

PS: sorry for the confusion I caused in the comments. Thanks for the tip.

2
Only 1 result can be returned, and that's in rax or xmm0 or in memory depending on the type. - Jester
Your English is absolutely fine. And how arguments are passed depends on the calling convention that has been adopted. Actual return values are returned in (E/R)AX; however, if the function receives some pointers to "return" more values, those pointers will be passed as normal arguments (usually on the stack, in __stdcall and __cdecl, which are most common). - user4520
I think I need to paste a snippet from x86-64.org/documentation/abi.pdf: - primefo crastimino
Since you're trying to optimize for speed, keep in mind that on "normal" (Intel/AMD) x86 CPUs, fixed-point math takes more instructions, enough to make it slower than just using the powerful hardware floating point. Also, on amd64, you need to push/pop rbp, not just the low 16 bits! If you're not spilling any local variables to memory on the stack, don't even bother creating a stack frame by saving rsp. - Peter Cordes
Also, if you want to do a 64bit multiply, use 64bit registers. Then you won't need add-with-carry. Maybe it works better for the rest of your functions to carry around your fixed-point data in two parts. - Peter Cordes

2 Answers

2
votes

The AMD64 calling conventions (System V ABI) designate a dual-use role for register RDX: it may be used to pass the third argument (if any) and it may be used as second return register. (cf. Figure 3.4, page 21)

Depending on the function signature, it is used for one of these roles, for both, or for neither.

So why are there 2 return registers (i.e. RAX and RDX)? This allows a function to return values of up to 128 bits in general purpose registers - such as __int128 or a struct with two 8 byte fields.
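As a rough illustration of the 128-bit case, here is a minimal sketch that assumes GCC or Clang on x86-64 System V, where the `__int128` extension is available. The names `wide_mul`, `high_half`, `low_half` and `wide_mul_check` are made up for this example and do not come from the question. With that ABI the low eightbyte of the result comes back in RAX and the high eightbyte in RDX.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: full 64x64 -> 128-bit signed product.
   Assumption: GCC/Clang on x86-64 System V, so the low 64 bits are
   returned in RAX and the high 64 bits in RDX. */
__int128 wide_mul(int64_t a, int64_t b)
{
    return (__int128)a * b;
}

/* Split a 128-bit value into its two eightbytes. */
int64_t high_half(__int128 v) { return (int64_t)(v >> 64); }
uint64_t low_half(__int128 v) { return (uint64_t)v; }

/* Small worked example: 2^62 * 4 = 2^64, which needs the second register. */
void wide_mul_check(void)
{
    __int128 p = wide_mul((int64_t)1 << 62, 4);
    assert(high_half(p) == 1);
    assert(low_half(p) == 0);
}
```

Compiling this with `gcc -O2 -S` and reading the generated code for `wide_mul` is an easy way to confirm which registers carry the result on your toolchain.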

Thus, to access the RDX register from your C call site you just have to adjust your function's signature. That means instead of

typedef long int Fixedpoint;
Fixedpoint _FixedMul(Fixedpoint v1, Fixedpoint v2);

declare it as:

typedef long int Fixedpoint;
struct Fixedpoint_Pair { // when used as return value:
    Fixedpoint fst;      //   - returned in RAX
    Fixedpoint snd;      //   - returned in RDX
};
};
typedef struct Fixedpoint_Pair Fixedpoint_Pair;

Fixedpoint_Pair _FixedMul(Fixedpoint v1, Fixedpoint v2);

Then you can access RDX like this in your C code:

Fixedpoint_Pair r = _FixedMul(a, b);
printf("RDX: %ld\n", r.snd);
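To try the struct-return idea without the assembly file, the following sketch uses a plain C stand-in for the routine. `FixedMulParts`, `FixedMulCombined` and `fixed_mul_check` are hypothetical names introduced here. The stand-in mimics what the question's code leaves in EAX (fraction) and EDX (whole part), under the assumption that the inputs are 16.16 values that fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

typedef long int Fixedpoint;

struct Fixedpoint_Pair {
    Fixedpoint fst;   /* first eightbyte: returned in RAX on System V */
    Fixedpoint snd;   /* second eightbyte: returned in RDX on System V */
};
typedef struct Fixedpoint_Pair Fixedpoint_Pair;

/* Hypothetical C stand-in for the assembly routine (assumes 32-bit inputs). */
Fixedpoint_Pair FixedMulParts(Fixedpoint v1, Fixedpoint v2)
{
    /* 16.16 * 16.16 gives a 32.32 product; adding 0x8000 rounds it. */
    int64_t product = (int64_t)(int32_t)v1 * (int32_t)v2 + 0x8000;
    Fixedpoint_Pair r;
    r.fst = (Fixedpoint)((uint32_t)product >> 16); /* 16 fraction bits */
    r.snd = (Fixedpoint)(product >> 32);           /* whole part */
    return r;
}

/* The caller glues both halves back into one 16.16 value. */
Fixedpoint FixedMulCombined(Fixedpoint v1, Fixedpoint v2)
{
    Fixedpoint_Pair r = FixedMulParts(v1, v2);
    uint32_t bits = ((uint32_t)r.snd << 16) | (uint32_t)r.fst;
    return (Fixedpoint)(int32_t)bits;
}

/* Worked example: 1.5 * 2.0 = 3.0, i.e. 0x18000 * 0x20000 -> 0x30000. */
void fixed_mul_check(void)
{
    assert(FixedMulCombined(0x18000, 0x20000) == 0x30000);
}
```

Once the real assembly routine is declared with the same struct return type, the caller side stays the same: `r.fst` reads what the callee left in RAX and `r.snd` what it left in RDX.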

References:

  • page 18,19: especially 'The classification of aggregate (structure and arrays)', point 3:

If the size of the aggregate exceeds a single eightbyte, each is classified separately. Each eightbyte gets initialized to class NO_CLASS.

  • page 22: 'Return of Values', point 3:

If the class is INTEGER, the next available register of the sequence %rax, %rdx is used.

1
votes

Your English is absolutely fine.

How arguments are passed depends on the calling convention that has been adopted - the two most common ones, __stdcall and __cdecl, use the stack to pass all arguments, but for example the __fastcall convention will use registers for the first two args, and in x64, it's still different. See here for a comprehensive list.

Actual return values are returned in (E/R)AX; however, if the function receives some pointers to "return" more values, those pointers will be passed as normal arguments - as previously stated, usually on the stack.
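The pointer approach can be sketched as follows. `FixedMulSplit` and `fixed_mul_split_check` are hypothetical names, not part of the question. Note that on x86-64 System V, unlike the stack-based conventions mentioned above, the pointer would arrive in a register: as the third integer argument it is passed in RDX.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical variant: the whole part is the ordinary return value,
   the 16 fraction bits are written through an out-pointer. */
int32_t FixedMulSplit(int32_t v1, int32_t v2, uint16_t *frac_out)
{
    /* 16.16 * 16.16 gives a 32.32 product; adding 0x8000 rounds it. */
    int64_t product = (int64_t)v1 * v2 + 0x8000;
    *frac_out = (uint16_t)((uint64_t)product >> 16); /* bits 16..31 */
    return (int32_t)(product >> 32);                 /* whole part */
}

/* Worked example: 1.25 * 1.5 = 1.875 -> whole 1, fraction 0xE000. */
void fixed_mul_split_check(void)
{
    uint16_t frac;
    int32_t whole = FixedMulSplit(0x14000, 0x18000, &frac);
    assert(whole == 1);
    assert(frac == 0xE000);
}
```

This works under any calling convention, at the cost of a store to memory that the two-register struct return avoids.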