3
votes

I wrote my own linker script to put different variables in two different data sections (A and B).

A is linked at address zero; B is linked near the code, in high address space (above 4 GiB, which is out of reach of normal 32-bit absolute addressing in x86-64).

A can be accessed through absolute addressing but not RIP-relative; B can be accessed through RIP-relative addressing but not absolute.

My question: Is there any way to choose RIP-relative or absolute addressing for different variables in gcc? Perhaps with some annotation like #pragma?

1
Currently we have some code that uses section A, and that code only allows absolute addressing (it is copied directly from one place to a runtime heap and executed there). - xingchong

1 Answer

1
votes

Without hacking the GCC source code, you're not going to get it to emit 32-bit absolute addressing, but there are cases where gcc will use 64-bit absolute addresses.


-mcmodel=medium puts large objects into a separate section and uses 64-bit absolute addresses for that large-data section. (The size threshold is set by -mlarge-data-threshold=, and all object files have to agree on it.) It still uses RIP-relative addressing for all other variables.

See the x86-64 System V ABI doc for more about the different memory models. And/or GCC docs for -mcmodel= and -mlarge-data-threshold= : https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
The default is -mcmodel=small : everything is within 2GiB of everything else, so RIP-relative works. And for non-PIE executables, that's the low 2GiB of virtual address space so static addresses can be 32-bit absolute sign- or zero-extended immediates or disp32 in addressing modes.

int a[1000000];
int b[1];

int fa() {   return a[0];  }
int fb() {   return b[0];  }

ASM output (Godbolt):

# gcc9.2 -O3 -mcmodel=medium
fa():
        movabs  eax, DWORD PTR [a]     # 64-bit absolute address, special encoding for EAX
        ret
fb():
        mov     eax, DWORD PTR b[rip]
        ret

For loading into a register other than AL/AX/EAX/RAX, GCC would use movabs r64, imm64 with the address and then use mov reg, [reg].

You won't get gcc to use 32-bit absolute addressing for section A. It will always use 64-bit absolute addresses, never [array + rdx*4] or [abs foo] (NASM syntax). And it will never use mov edi, msg (imm32) to put an address in a register, always mov rdi, qword msg (imm64).
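To make the indexing case concrete, here is a minimal sketch. The names big, small_arr, big_at and small_at are mine, not from the question. Built with -O3 -mcmodel=medium, I'd expect big_at to first put the address of big in a register with movabs r64, imm64 and then index off that register, rather than folding the symbol into a disp32 addressing mode. I haven't checked this against a specific GCC version, so treat it as something to verify on Godbolt.

```c
#include <stddef.h>

/* Hypothetical example objects (names are assumptions for illustration). */
int big[1000000];     /* ~4 MB: above the default large-data threshold */
int small_arr[16];    /* 64 bytes: stays in the regular .bss */

/* Indexed load from the large object: the index rules out the
   accumulator-only moffs64 encoding, so the address has to be
   materialized in a register first. */
int big_at(size_t i) { return big[i]; }

/* Indexed load from the small object, for comparison. */
int small_at(size_t i) { return small_arr[i]; }
```

Comparing the asm for the two functions should show which addressing form GCC picks for each object.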

GCC puts a in the .lbss section and b in the regular .bss. Presumably you can use __attribute__((section("name"))) on individual variables to choose their section.

        .globl  a
        .section        .lbss,"aw"           # "aw" = allocatable, writable
        .align 32
        .size   a, 4000000
a:
        .zero   4000000

        .globl  b
        .bss                      # shortcut for .section .bss
        .align 4
b:
        .zero   4
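For reference, a minimal sketch of per-variable section placement with the section attribute. The section names sec_a and sec_b and the variable names are hypothetical, and this assumes an ELF target such as GNU/Linux. As far as I know, a custom section name only controls where the variable lands; it does not by itself change the addressing mode GCC picks, so check the generated asm before relying on it.

```c
/* Hypothetical section names: a linker script would need matching
   output-section rules to place them at the desired addresses. */
__attribute__((section("sec_a"))) int a_var = 1;
__attribute__((section("sec_b"))) int b_var = 2;

/* Ordinary accesses: the compiler chooses the addressing mode. */
int sum_vars(void) { return a_var + b_var; }

void set_a(int v) { a_var = v; }
```

Without a custom linker script, the linker just places these sections alongside the normal data, so the program still links and runs.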

Things that don't work:

  • __attribute__((optimize("mcmodel=large"))) on a per-function basis. Doesn't actually work, and is per-function not per-variable anyway.
  • https://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html doesn't document any x86 or common variable attributes related to memory-model or size. The only x86-specific variable attribute is ms vs gcc struct layout.

    There are x86-specific attributes for functions and types, but those don't help.


Possible hacks:

Put all your section-A variables in a large struct, larger than any section-B global/static objects. Possibly pad it at the end with a dummy array to make it larger: your linker script can probably avoid actually allocating extra space for that dummy array.

Then compile with -mcmodel=medium -mlarge-data-threshold=that size.
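A minimal sketch of that struct hack, under stated assumptions: the threshold value 65536, the struct layout, and all names here are made up for illustration, and the threshold has to match whatever you pass to -mlarge-data-threshold=.

```c
#include <assert.h>   /* static_assert (C11) */

/* Assumed threshold: must match -mlarge-data-threshold= on the command line. */
#define LARGE_DATA_THRESHOLD 65536

/* All "section A" variables live in one aggregate. */
struct section_a_vars {
    int counter;
    int flags;
    /* Dummy padding so the whole object exceeds the threshold. */
    char pad[LARGE_DATA_THRESHOLD];
};

/* Catch at compile time a struct that is too small to count as large data. */
static_assert(sizeof(struct section_a_vars) > LARGE_DATA_THRESHOLD,
              "section A struct must exceed the large-data threshold");

struct section_a_vars section_a;   /* zero-initialized, intended as large data */
int section_b_value;               /* small object, stays in regular .bss */

int bump_counter(void) { return ++section_a.counter; }
int read_b(void) { return section_b_value; }
```

Something like gcc -O2 -mcmodel=medium -mlarge-data-threshold=65536 -c file.c should then treat section_a as large data and section_b_value as normal data. Inspect the asm to confirm before building on it.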