15
votes

I looked at the source code of __libc_init_array at http://newlib.sourcearchive.com/documentation/1.18.0/init_8c-source.html, but I don't quite understand what this function does.

I know that these symbols

/* These magic symbols are provided by the linker.  */
extern void (*__preinit_array_start []) (void) __attribute__((weak));
extern void (*__preinit_array_end []) (void) __attribute__((weak));
extern void (*__init_array_start []) (void) __attribute__((weak));
extern void (*__init_array_end []) (void) __attribute__((weak));
extern void (*__fini_array_start []) (void) __attribute__((weak));
extern void (*__fini_array_end []) (void) __attribute__((weak));

are defined in the linker script.
Part of the linker script may look like:

  .preinit_array     :
  {
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array*))
    PROVIDE_HIDDEN (__preinit_array_end = .);
  } >FLASH
  .init_array :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT(.init_array.*)))
    KEEP (*(.init_array*))
    PROVIDE_HIDDEN (__init_array_end = .);
  } >FLASH
  ...

I then searched for the keyword "init_array" in the docs for ELF v1.1, gcc 4.7.2, ld, and CodeSourcery (I'm using CodeSourcery G++ Lite), but found nothing.

Where can I find the specification of these symbols?

4 Answers

19
votes

These symbols are related to the C/C++ constructor and destructor startup and teardown code that is called before and after main(). Sections named .init, .ctors, .preinit_array, and .init_array deal with initialization of C/C++ objects, and sections .fini, .fini_array, and .dtors deal with teardown. The start and end symbols mark the beginning and end of the sections related to such operations and may be referenced from other parts of the runtime support code.

The .preinit_array and .init_array sections contain arrays of pointers to functions that will be called on initialization. The .fini_array is an array of functions that will be called on destruction. Presumably the start and end labels are used to walk these lists.
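As a rough illustration of how a function pointer ends up in these arrays, here is a minimal sketch that assumes GCC or Clang on an ELF target. The constructor and destructor attributes are compiler extensions, and the names early_init, late_cleanup, and early_init_ran are made up for this example.

```c
#include <stdio.h>

/* Hypothetical flag, used only to show that the constructor ran. */
static int early_init_done = 0;

/* GCC/Clang extension: on ELF targets the compiler places a pointer to
   this function in .init_array, so the startup code calls it before
   main() is entered. */
__attribute__((constructor))
static void early_init(void)
{
    early_init_done = 1;
}

/* The matching pointer goes into .fini_array and is called during
   normal program termination. */
__attribute__((destructor))
static void late_cleanup(void)
{
    puts("late_cleanup: called during program teardown");
}

/* Returns 1 if early_init() has already been called. */
int early_init_ran(void)
{
    return early_init_done;
}
```

Calling early_init_ran() from main() should report that the constructor has already run, without main() ever calling early_init() itself.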

A good example of code that uses these symbols can be found in the libc source for initfini.c. On startup, __libc_init_array() is called. It first calls all the function pointers in section .preinit_array, using the start and end labels to find them. Then it calls the _init() function in the .init section. Lastly, it calls all the function pointers in section .init_array. After main() completes, the teardown call to __libc_fini_array() calls all the functions in .fini_array before finally calling _fini(). Note that there seems to be a cut-and-paste bug in this code where it calculates the count of functions to call at teardown. Presumably they were dealing with a real-time microcontroller OS and never exercised this section.
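The walk over the start and end labels can be sketched without a linker script by using ordinary arrays in place of the linker-provided symbols. Everything here (the tables, the logging helpers, simulate_startup_and_teardown) is hypothetical and only imitates the sequence described above. The reverse walk at teardown is an assumption based on the usual convention that termination functions run in the opposite order from initialization functions, and the _init()/_fini() calls are left out.

```c
#include <stddef.h>

typedef void (*init_fn)(void);

/* Hypothetical log of the order in which the functions were called. */
static char call_log[16];
static size_t call_len = 0;

static void record(char c)
{
    if (call_len < sizeof call_log - 1)
        call_log[call_len++] = c;
}

static void pre_a(void)  { record('p'); }
static void init_a(void) { record('a'); }
static void init_b(void) { record('b'); }
static void fini_a(void) { record('x'); }
static void fini_b(void) { record('y'); }

/* Stand-ins for the regions delimited by the linker symbols. */
static init_fn preinit_table[] = { pre_a };
static init_fn init_table[]    = { init_a, init_b };
static init_fn fini_table[]    = { fini_a, fini_b };

/* Call every entry in [start, end), first to last. */
static void run_forward(init_fn *start, init_fn *end)
{
    size_t count = (size_t)(end - start);
    for (size_t i = 0; i < count; i++)
        start[i]();
}

/* Call every entry in [start, end), last to first. */
static void run_backward(init_fn *start, init_fn *end)
{
    size_t i = (size_t)(end - start);
    while (i > 0)
        start[--i]();
}

/* Imitates startup followed by teardown and returns the call log. */
const char *simulate_startup_and_teardown(void)
{
    call_len = 0;
    run_forward(preinit_table, preinit_table + 1);
    run_forward(init_table, init_table + 2);
    run_backward(fini_table, fini_table + 2);
    call_log[call_len] = '\0';
    return call_log;
}
```

The pointer subtraction end - start is what gives the element count, which is why the real code can declare the symbols as arrays of unknown size.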

8
votes

The answer from @Robotbugs is interesting, but I found some additional information that might satisfy the curiosity of others.

The System V Application Binary Interface seems to apply to the executables produced by gcc (and I guess some other compilers - clang comes to mind).

The special sections chapter states (only relevant parts, reordered by me):

.preinit_array

This section holds an array of function pointers that contributes to a single pre-initialization array for the executable or shared object containing the section.

.init_array

This section holds an array of function pointers that contributes to a single initialization array for the executable or shared object containing the section.

.fini_array

This section holds an array of function pointers that contributes to a single termination array for the executable or shared object containing the section.

The file init.c from newlib includes:

/* Iterate over all the init routines.  */
void
__libc_init_array (void)
{
    size_t count;
    size_t i;

    count = __preinit_array_end - __preinit_array_start;
    for (i = 0; i < count; i++)
        __preinit_array_start[i] ();

#ifdef HAVE_INIT_FINI
    _init ();
#endif

    count = __init_array_end - __init_array_start;
    for (i = 0; i < count; i++)
        __init_array_start[i] ();
}

This corresponds to the canonical linker script solution for STM32 processors (as an example):

.preinit_array     :
{
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array*))
    PROVIDE_HIDDEN (__preinit_array_end = .);
} >FLASH
.init_array :
{
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT(.init_array.*)))
    KEEP (*(.init_array*))
    PROVIDE_HIDDEN (__init_array_end = .);
} >FLASH
.fini_array :
{
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(SORT(.fini_array.*)))
    KEEP (*(.fini_array*))
    PROVIDE_HIDDEN (__fini_array_end = .);
} >FLASH

That linker script portion is quite clear: it defines the symbols Newlib needs in order to execute the preinit and init array functions specified by the System V Application Binary Interface. This seems to be the standard solution for static constructors in C++, and fini would correspond to static destructors.
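The SORT(.init_array.*) line matters once constructors carry priorities. As a sketch, assuming GCC or Clang on an ELF target: a constructor with priority N is placed in a section named like .init_array.00101, so sorting those section names orders the calls, and the plain .init_array entries come afterwards. The function names below are invented for the example. Priorities up to 100 are reserved for the implementation, so the example starts at 101.

```c
#include <stddef.h>

/* Hypothetical record of the order in which the constructors ran. */
static int call_order[3];
static size_t calls = 0;

static void note(int id)
{
    if (calls < 3)
        call_order[calls++] = id;
}

/* Declared out of order on purpose: the section sort decides the
   order, not the position in the source file. */
__attribute__((constructor(102))) static void init_second(void)  { note(2); }
__attribute__((constructor(101))) static void init_first(void)   { note(1); }
__attribute__((constructor))      static void init_default(void) { note(3); }

/* Returns the id of the n-th constructor that ran, or -1 if none. */
int nth_constructor(size_t n)
{
    return n < calls ? call_order[n] : -1;
}
```

If the linker script dropped the SORT line, nothing would guarantee that init_first runs before init_second.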

The most ironic part of this story, of course, is that using static C++ objects without the Construct On First Use Idiom is the best way to get the static initialization order problem! I.e. C++ objects should in practice never be constructed via the preinit/init array above!

4
votes

These special symbols will end up being referenced by the PT_DYNAMIC segment (the .dynamic section) of the generated library. PT_DYNAMIC describes the various resources needed to make dynamic linking succeed (library dependencies, exported symbols, symbol hash table, init/fini arrays, etc.).

Thus, any functions in these lists will end up referenced from the PT_DYNAMIC segment, through entries such as DT_INIT_ARRAY and DT_FINI_ARRAY, and called at the appropriate time during the dynamic linking process. You may want to consult the sources for the dynamic linker (ld.so) for more information.
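To see that link from inside a running program, here is a minimal sketch that assumes Linux with glibc (or another libc that ships <link.h>) and a dynamically linked executable. It scans the _DYNAMIC array, which the linker provides, for the DT_INIT_ARRAYSZ entry. The names sample_ctor and init_array_entry_count are hypothetical.

```c
#include <link.h>    /* ElfW(), ElfW(Dyn), DT_* constants */
#include <stddef.h>

/* Provided by the linker for dynamically linked ELF objects. */
extern ElfW(Dyn) _DYNAMIC[];

/* Guarantees at least one entry in this program's .init_array. */
__attribute__((constructor))
static void sample_ctor(void)
{
}

/* Returns the number of function pointers in the init array, as
   recorded in the dynamic section, or 0 if there is no such entry. */
size_t init_array_entry_count(void)
{
    for (const ElfW(Dyn) *d = _DYNAMIC; d->d_tag != DT_NULL; d++) {
        if (d->d_tag == DT_INIT_ARRAYSZ)
            return d->d_un.d_val / sizeof(void (*)(void));
    }
    return 0;
}
```

DT_INIT_ARRAYSZ holds a size in bytes, hence the division by the size of a function pointer. A statically linked binary may have no _DYNAMIC at all, so this only applies to the dynamic case.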

1
votes

The specification for these objects is the specification of the ELF header file format, at least as far as why they are there.

They are NOT meant to be worked with in any way, shape, or form unless you plan on rewriting glibc and everything it talks to. In short, the ELF header requires a _start function. It won't launch a binary without one.

A large part of the libc library is written in assembly, not C, which does not consider this. The pre array function is a way of adding this header.

Check out the gnu-csu folder in glibc or teeny-efl.git for examples. It also sets the array as a slash-formatted string and sets both elements as static: the array in argv and the init_array. It later checks that they match. Breaking this process, or doing anything other than what it is meant for, which is to be left alone, also takes more code than you should add to this kind of function. Go play with your fridge.