I have made this simple ELF for learning purposes:
bits 64
org 0x08048000
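elfHeader:
db 0x7F, "ELF", 2, 1, 1, 0 ; e_ident: magic, ELFCLASS64, little-endian, EV_CURRENT, System V ABI
db 0 ; e_ident: ABI version
times 7 db 0 ; e_ident: unused padding
dw 2 ; e_type = ET_EXEC
dw 62 ; e_machine = EM_X86_64
dd 1 ; e_version = EV_CURRENT
dq _start ; e_entry: virtual address of the first instruction
dq programHeader - $$ ; e_phoff: file offset of the program header table
dq 0 ; e_shoff: no section header table
dd 0 ; e_flags
dw elfHeaderSize ; e_ehsize
dw programHeaderSize ; e_phentsize
dw 1 ; e_phnum: one program header
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx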
elfHeaderSize equ $ - elfHeader
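programHeader:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = PF_R | PF_W | PF_X
dq 0 ; p_offset: segment starts at the beginning of the file
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq fileSize ; p_filesz: the whole file, headers included
dq fileSize ; p_memsz
dq 0x1000 ; p_align
programHeaderSize equ $ - programHeader
_start:
xor rdi, rdi ; exit status 0
xor eax,eax
mov al,60 ; eax = 60 (sys_exit)
syscall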
fileSize equ $ - $$
To assemble that code I use NASM:
nasm -f bin exe.asm -o exe
If you take a look at the programHeader, you will see that p_offset is 0 and p_filesz is fileSize. That means that the segment contains the whole file, headers included. That's something I wasn't expecting (and I'm not the only one), but apparently the Linux operating system needs the headers to be in a segment of type PT_LOAD so that this information gets loaded.
This is the only resource I could find that mentions the fact that the headers are inside a segment: https://www.intezer.com/blog/research/executable-linkable-format-101-part1-sections-segments/
Something important to highlight about segments is that only PT_LOAD segments get loaded into memory. Therefore, every other segment is mapped within the memory range of one of the PT_LOAD segments.
In order to understand the relationship between Sections and Segments, we can picture segments as a tool to make the linux loader’s life easier, as they group sections by attributes into single segments in order to make the loading process of the executable more efficient, instead of loading each individual section into memory. The following diagram attempts to illustrate this concept:
But I don't understand why Linux needs those headers to be loaded at run time. What are they used for? If they are needed for the process to run, couldn't Linux load them by itself?
EDIT:
It has been mentioned in the comments that the headers don't need to be loaded; they are sometimes loaded anyway to avoid having to add padding. I have tried adding padding so that the code starts at a 4 KB boundary and loading only the code, but it didn't work. Here's my attempt:
bits 64
org 0x08048000
elfHeader:
db 0x7F, "ELF", 2, 1, 1, 0 ; e_ident
db 0 ; abi version
times 7 db 0 ; unused padding
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq programHeader - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw elfHeaderSize ; e_ehsize
dw programHeaderSize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx
elfHeaderSize equ $ - elfHeader
programHeader:
dd 1 ; p_type
dd 7 ; p_flags
dq _start - $$ ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq codeSize ; p_filesz
dq codeSize ; p_memsz
dq 0x1000 ; p_align
programHeaderSize equ $ - programHeader
; padding until 4KB
paddingUntil4k equ 4*1024 - ($ - elfHeader)
times paddingUntil4k db 0
_start:
xor rdi, rdi
xor eax,eax
mov al,60
syscall
codeSize equ $ - _start
fileSize equ $ - $$
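A possible explanation for why the padded attempt fails, offered as a sketch rather than a confirmed diagnosis: p_offset now points at the code (file offset 0x1000), but p_vaddr is still $$ (0x08048000). The code would therefore be mapped at 0x08048000, while e_entry still refers to _start, which org places at 0x08049000, outside the mapped page. The variant below assumes the only change needed is to set p_vaddr and p_paddr to _start, so that the mapping address matches the addresses NASM computed. Everything else is copied from the attempt above.

```nasm
bits 64
org 0x08048000
elfHeader:
db 0x7F, "ELF", 2, 1, 1, 0 ; e_ident
db 0 ; abi version
times 7 db 0 ; unused padding
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry (0x08049000 with this org and padding)
dq programHeader - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw elfHeaderSize ; e_ehsize
dw programHeaderSize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx
elfHeaderSize equ $ - elfHeader
programHeader:
dd 1 ; p_type
dd 7 ; p_flags
dq _start - $$ ; p_offset: file offset of the code (0x1000)
dq _start ; p_vaddr: assumed fix, map the code where e_entry points
dq _start ; p_paddr: kept equal to p_vaddr
dq codeSize ; p_filesz
dq codeSize ; p_memsz
dq 0x1000 ; p_align
programHeaderSize equ $ - programHeader
; padding until 4KB
paddingUntil4k equ 4*1024 - ($ - elfHeader)
times paddingUntil4k db 0
_start:
xor rdi, rdi
xor eax,eax
mov al,60
syscall
codeSize equ $ - _start
fileSize equ $ - $$
```

If this is assembled with the same nasm -f bin command, readelf -l exe should report a single LOAD segment at file offset 0x1000 and virtual address 0x08049000, so the headers are no longer part of any loaded segment.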