6
votes

AFAIK, there are two types of global variables: initialized and uninitialized. How are they stored? Are they both stored in the executable file? I can see that initialized global variables have their initial values stored in the executable file. But what needs to be stored for the uninitialized ones?

My current understanding is like this:

An executable file is organized into several sections, such as .text, .data, and .bss. Code is stored in the .text section, initialized global or static data in the .data section, and uninitialized global or static data in the .bss section.

Thanks for your time to view my questions.

Update 1 - 9:56 AM 11/3/2010

I found a good reference here:

Segments in Assembly Language Source - Building the text and data segments with .text, .data, and .bss directives

Update 2 - 10:09 AM 11/3/2010

@Michael

  1. I define a 100-byte uninitialized data area in my assembly code. These 100 bytes are not stored in my executable file because they are NOT initialized.

  2. Who will allocate the 100-byte uninitialized memory space in RAM? The program loader?

Suppose I have the following code:

int global[100];

void main(void)
{
   //...
}

global[100] is not initialized. How will it be recorded in my executable file? Who allocates it, and when? What if it is initialized?

4
Usually there is also an .rdata segment for const variables (sorry for the oxymoron). - ruslik
void main is neither legal C nor legal C++. The return type of main must always be int. - Konrad Rudolph

4 Answers

9
votes

Initialized variable values are stored in the .data segment of the executable. Uninitialized ones don't have to be stored. They end up in the .bss segment in RAM, but that segment occupies no space in the executable file; only the required amount of memory is recorded in the segment descriptor. The code in the .text section accesses these variables via offsets into the segment. The runtime linker-loader patches these references to actual virtual addresses. See, for example, the Executable and Linkable Format (ELF), which is used on most Unix-like operating systems.
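To make the .data/.bss split concrete, here is a minimal C sketch. The names initialized_global, uninitialized_global, and count_nonzero are made up for illustration. It assumes a typical toolchain, where the first array lands in .data and the second in .bss; the exact placement is up to the compiler and linker. What the C language does guarantee is that the array without an initializer starts out zeroed.

```c
#include <assert.h>
#include <stddef.h>

/* Explicit non-zero initializer: the values have to be stored in the file
   (typically in .data). The remaining 97 elements are implicitly zero. */
int initialized_global[100] = {1, 2, 3};

/* No initializer: only the size needs to be recorded (typically .bss).
   C guarantees that objects with static storage duration start out zeroed. */
int uninitialized_global[100];

/* Hypothetical helper: counts the elements that are not zero. */
size_t count_nonzero(const int *values, size_t count)
{
    assert(values != NULL);
    size_t nonzero = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] != 0) {
            nonzero++;
        }
    }
    return nonzero;
}
```

On a typical GNU toolchain, a tool such as size should report the second array under bss rather than data, though the exact numbers depend on the compiler and platform.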

3
votes

In PE files there are two sizes specified for each segment: RAWsize (size on disk) and Vsize (size in RAM).

When Vsize is larger than RAWsize, the rest of the segment in RAM is zeroed.

.bss (if present) always has a RAWsize of 0, and the uninitialized global variables are located there.

Another common approach is to make the Vsize of .data larger than its RAWsize, so that the rest of the segment holds the uninitialized variables.
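The two-size rule can be sketched in C as a toy loader step. This is only a model under stated assumptions: struct section_info, its field names, and load_section are hypothetical and are not the real PE structures, and the sketch assumes the on-disk size never exceeds the in-memory size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one section header; the field names are
   illustrative and are not the real PE or ELF structure members. */
struct section_info {
    const unsigned char *raw_data; /* bytes stored in the file (unused if raw_size is 0) */
    size_t raw_size;               /* size on disk */
    size_t virtual_size;           /* size in memory */
};

/* Copies the on-disk bytes into memory and zero-fills the remainder.
   `memory` must have room for at least virtual_size bytes. */
void load_section(unsigned char *memory, const struct section_info *section)
{
    /* Assumption of this sketch: the file never holds more than is mapped. */
    assert(section->raw_size <= section->virtual_size);

    if (section->raw_size > 0) {
        memcpy(memory, section->raw_data, section->raw_size);
    }
    memset(memory + section->raw_size, 0,
           section->virtual_size - section->raw_size);
}
```

With raw_size set to 0 this behaves like a .bss section (everything zeroed, nothing read from the file); with 0 < raw_size < virtual_size it behaves like a .data section whose tail holds the uninitialized variables.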

2
votes

Storage for global variables is allocated in your computer's virtual memory by the OS linker/loader at the time your program is loaded. The actual global variable storage is somewhere in the physical memory hierarchy (cache, RAM memory, SSD/HD backing storage, etc.), as mapped by the cache and VM system. It could all end up quite fragmented.

The values of initialized globals are copied from the .data segment into a portion of the allocated virtual memory. Non-initialized globals might be zeroed, or might have junk left in them, depending on the security of the particular OS under which the program is running.

There are other variations, depending on the language, compiler, language run-time, and OS.

0
votes

Uninitialized variables are simply pointers at the machine level. The space for them is allocated at runtime, and the program will fill it in at some later time.

For instance, if you create a global variable in assembler with global BYTE 100, that reserves global as a pointer to a 100-byte region. The program then has access to that region for whatever it needs.

EDIT: I looked it up in my assembler book, and it looks like uninitialized globals are defined in the .data section just as initialized variables are. From my understanding, the space is allocated in the exe (say, 100 bytes as above) but has undefined contents. On Intel machines running Windows it will be garbage; the program is responsible for initializing it. Hope this helps!