
I am trying to write a program that generates an ELF file (targeting ARM, executed through qemu-arm). Most of the ELF format is well documented on the wiki, but I can't find any spec that describes the format of the special sections (e.g. .text and .data, the latter being what I most want to know about).

I tried to put an initialized global variable in the .data section. What should I write into the .data section of the ELF file for a global definition like: int a = 10;


1 Answer


There is no special format for .text and .data.

When the static linker links several .o files,

  1. it simply concatenates their .text and .data sections (while resolving relocations)
  2. and places them in the final .so or executable file according to the linker script (see gcc -Wl,-verbose /dev/null).

The .data section simply contains the initial values of the initialized global variables.

The .text section simply contains the machine code of the routines/functions.

Let's take this simple C file:

char x[5] = {0xba, 0xbb, 0xbc, 0xbd, 0xbe};

char f(int i) {
    return x[i];
}

Let's compile it:

$ gcc -c -o test.o test.c

Let's dump the .data section, using elfcat:

$ elfcat test.o --section-name .data | xxd
00000000: babb bcbd be                             .....

The content of the .data section is exactly the five initializer bytes of x, nothing more.

Let's dump the .text section:

$ elfcat test.o --section-name .text | xxd
00000000: 5548 89e5 897d fc8b 45fc 4898 488d 1500  UH...}..E.H.H...
00000010: 0000 000f b604 105d c3

Let's disassemble this:

$ elfcat test.o --section-name .text > test.text
$ r2 -a x86 -b 64 -qc pd test.text
            0x00000000      55             push rbp
            0x00000001      4889e5         mov rbp, rsp
            0x00000004      897dfc         mov dword [rbp - 4], edi
            0x00000007      8b45fc         mov eax, dword [rbp - 4]
            0x0000000a      4898           cdqe
            0x0000000c      488d15000000.  lea rdx, qword [0x00000013] ; 19
            0x00000013      0fb60410       movzx eax, byte [rax + rdx]
            0x00000017      5d             pop rbp
            0x00000018      c3             ret

Again, there is nothing special in the .text section: it only contains the machine code of the routines/functions of my program.

Notice, however, the relocation and symbol information in other sections:

$ readelf -a test.o
[ ... ]

Relocation section '.rela.text' at offset 0x1b8 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000000f  000800000002 R_X86_64_PC32     0000000000000000 x - 4

Relocation section '.rela.eh_frame' at offset 0x1d0 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0

[...]

Symbol table '.symtab' contains 10 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS test.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     8: 0000000000000000     5 OBJECT  GLOBAL DEFAULT    3 x
     9: 0000000000000000    25 FUNC    GLOBAL DEFAULT    1 f