
This allocates some BSS variables:

.lcomm page_table_l2, 4096
.lcomm page_table_l3, 4096
.lcomm page_table_l4, 4096

The binary I'm building ends up being 4792 bytes, so you can see that the BSS variables are not directly included (or the binary would be >12 KiB).

However, these three need to be 4 KiB aligned, so I change the section to:

.section .bss
.align 4096
.lcomm page_table_l2, 4096
.lcomm page_table_l3, 4096
.lcomm page_table_l4, 4096

…and the binary grows to 8760 bytes! BSS is supposed to be just a note in the ELF binary telling the linker or loader, "hey, allocate n bytes of zeroed-out storage", so why does aligning a BSS variable make the binary grow at all?

You can see this in C, too:

char x[4096] __attribute__ ((aligned (8192))) = {0};

If you vary the alignment, the output object file's size varies with it (though in my original example, I'm looking at the final binary's size).

Note that this output binary is an OS kernel; I'm following the tutorial here. I am using the following linker script:

ENTRY(start)

SECTIONS {
    . = 1M;

    .boot ALIGN(8) : AT(ADDR(.boot))
    {
        /* ensure that the multiboot header is at the beginning */
        KEEP( *(.multiboot_header) )
    }

    .text :
    {
        *(.text)
    }
}

According to objdump, it looks like the entire program gets 4 KiB aligned within the ELF file itself, which is a bit weird.

Without .align:

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .boot         00000018  0000000000100000  0000000000100000  00000078  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA

With .align:

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .boot         00000018  0000000000100000  0000000000100000  00001000  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA

Note File off, or the offset of the section in the file. Why does that change?

What commands are you building with? gcc -nostdlib? Making a plain ELF binary? Also, I didn't know multiboot kernels got text/data/bss segments; that's pretty fancy. The tutorial you're following is using NASM, so it's not hard to imagine that gas works slightly differently wrt. aligning things in the BSS. Anyway, making this a minimal reproducible example would only take a couple more lines of commands and notes about which block of code goes in what filename, right? This is interesting and makes me want to try it out, but not badly enough to guess at what exactly you did :P - Peter Cordes
@MichaelPetch: Yeah, I agree it's clear that this is an ELF binary. I just don't know enough about making multiboot kernel images to know if anything special you'd do for that would be relevant, or if you'd expect the same behaviour when creating a .o or plain Linux executable. I wasn't doubting that GRUB's multiboot loader will zero a BSS for you, I was just saying I didn't know that was supported until I read it just now. :P - Peter Cordes

1 Answer


I tested this a bit, using just this code plus a _start that makes an exit_group syscall, built with gcc -c or gcc -nostdlib.

It seems like the .o size scales with the largest alignment used in the .S. The ELF file contains zeroed padding bytes (look at a hexdump, for example).

However, the linked binary doesn't seem to have any extra padding. So it's not an issue after linking, just for space consumption of temporary files during build.


I got the same result from using .space to assemble zeroed bytes into the .bss as I did from using .lcomm to allocate space in the .bss without switching to it.

.section .bss
    .balign 4096                # .balign or .p2align are preferable to .align, whose argument is a byte count on some targets and a power-of-2 exponent on others

page_table_l2:  .space 4096
.balign 1024
page_table_l3:  .space 4096
page_table_l4:  .space 4096
foo:     .space 17
    .balign 4096              # This doesn't make the .o any bigger if a .align 4096 is already present
bar:    .space 1

#   .lcomm page_table_l2, 4096
#   .lcomm page_table_l3, 4096
#   .lcomm page_table_l4, 4096


    .text
.global _start
_start:
    xor %edi, %edi

    mov $231, %eax  #  exit_group(0)
    syscall

# the .o is big
$ gcc -c align-bss.S  && ll align-bss.o && objdump -haf align-bss.o
-rw-rw-r-- 1 peter peter 4.8K Jul 30 11:05 align-bss.o
...
Idx Name          Size      VMA               LMA               File off  Algn
...
  2 .bss          00004001  0000000000000000  0000000000000000  00001000  2**12

# the binary doesn't care
$ gcc -nostdlib align-bss.S -o align-bss  && ll align-bss && objdump -haf align-bss
-rwxrwxr-x 1 peter peter 1.2K Jul 30 11:08 align-bss
  ...
Idx Name          Size      VMA               LMA               File off  Algn
 ...
  2 .bss          00004008  0000000000601000  0000000000601000  00001000  2**12

With the .balign directives commented out, the .o is 872 bytes, and the linked static binary is still 1.2K (unstripped).