I'm trying to get a small hello-world MIPS program running in the gem5 simulator. The program was compiled with gcc 4.9.2 and glibc 2.19 (built by crosstool-ng) and runs fine in qemu, but in gem5 it crashes with a page fault (an attempt to access address 0).

The code is rather simple:

#include <stdio.h>
int main()
{
    printf("hello, world\n");
    return 0;
}

Output of file ./test:

./test: ELF 32-bit LSB executable, MIPS, MIPS-I version 1, statically linked, for GNU/Linux 3.15.4, not stripped

After some debugging with gdb, I found that the page fault is triggered by the _dl_setup_stack_chk_guard function in glibc. It takes a void pointer, _dl_random, passed by __libc_start_main, and that pointer happens to be NULL. As far as I can tell from the source, the function never dereferences the pointer when it is NULL, yet the generated instructions load from the memory that _dl_random points to. Some code excerpts may help:

In __libc_start_main (the macro THREAD_SET_STACK_GUARD is not defined):

   /* Initialize the thread library at least a bit since the libgcc
   functions are using thread functions if these are available and
   we need to setup errno.  */
  __pthread_initialize_minimal ();

  /* Set up the stack checker's canary.  */
  uintptr_t stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);
# ifdef THREAD_SET_STACK_GUARD
  THREAD_SET_STACK_GUARD (stack_chk_guard);
# else
  __stack_chk_guard = stack_chk_guard;
# endif

In _dl_setup_stack_chk_guard (always inlined):

static inline uintptr_t __attribute__ ((always_inline))
_dl_setup_stack_chk_guard (void *dl_random)
{
  union
  {
    uintptr_t num;
    unsigned char bytes[sizeof (uintptr_t)];
  } ret = { 0 };

  if (dl_random == NULL)
    {
      ret.bytes[sizeof (ret) - 1] = 255;
      ret.bytes[sizeof (ret) - 2] = '\n';
    }
  else
    {
      memcpy (ret.bytes, dl_random, sizeof (ret));
#if BYTE_ORDER == LITTLE_ENDIAN
      ret.num &= ~(uintptr_t) 0xff;
#elif BYTE_ORDER == BIG_ENDIAN
      ret.num &= ~((uintptr_t) 0xff << (8 * (sizeof (ret) - 1)));
#else
# error "BYTE_ORDER unknown"
#endif
    }
  return ret.num;
}

Disassembly:

   0x00400ea4 <+228>:   jal 0x4014b4 <__pthread_initialize_minimal>
   0x00400ea8 <+232>:   nop
   0x00400eac <+236>:   lui v0,0x4a
   0x00400eb0 <+240>:   lw  v0,6232(v0)
   0x00400eb4 <+244>:   li  a0,-256
   0x00400eb8 <+248>:   lwl v1,3(v0)
   0x00400ebc <+252>:   lwr v1,0(v0)
   0x00400ec0 <+256>:   addiu   v0,v0,4
   0x00400ec4 <+260>:   and v1,v1,a0
   0x00400ec8 <+264>:   lui a0,0x4a
   0x00400ecc <+268>:   sw  v1,6228(a0)
  • 0x4a1858 (0x4a0000 + 6232) is the address of _dl_random
  • 0x4a1854 (0x4a0000 + 6228) is the address of __stack_chk_guard

The page fault occurs at 0x00400eb8. I don't understand how the instructions at 0x00400eb8 and 0x00400ebc were generated. Could someone shed some light on this, please? Thanks.

Apparently the compiler thinks dl_random cannot be NULL. Verify the source of _dl_random. - Jester
@Jester Thanks for your reply. Here is the declaration of _dl_random: extern void *_dl_random attribute_hidden attribute_relro. Digging further, I realized that attribute_relro actually expands to __attribute__ ((section (".data.rel.ro"))). Could that mean the compiler is certain that _dl_random points into the data section? - skies457
That just means that _dl_random itself is in the .data.rel.ro section; it says nothing about what it points to. It's not at all clear why the compiler would assume it's not NULL. - Ross Ridge
@Jester @Ross Ridge Sorry, I made a foolish mistake... The copy of _dl_setup_stack_chk_guard I quoted in the question is not the one that gets compiled. There is another version in a different directory that I had missed. The two are mostly identical, except that the if branch is wrapped in an #ifndef on the macro __ASSUME_AT_RANDOM, which is a kernel feature enabled by default. The if branch was therefore compiled out, which also explains the and instruction. Perhaps I should file a bug report that gem5 does not set the value. Thanks again for your kind help! - skies457
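
A minimal sketch of the variant described in the last comment, reconstructed from that description rather than copied from the glibc tree. The names setup_guard_sketch and ASSUME_AT_RANDOM_SKETCH are hypothetical stand-ins, and the byte-order test assumes the GCC/Clang predefined macro __BYTE_ORDER__. With the macro defined, the NULL branch is compiled out, leaving only the unconditional copy (the lwl/lwr pair) and the mask (the and with -256):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __ASSUME_AT_RANDOM: when defined, the
   NULL fallback below is removed by the preprocessor. */
#define ASSUME_AT_RANDOM_SKETCH 1

/* Hypothetical stand-in for _dl_setup_stack_chk_guard. */
static uintptr_t setup_guard_sketch(const void *dl_random)
{
    union {
        uintptr_t num;
        unsigned char bytes[sizeof(uintptr_t)];
    } ret = { 0 };

#ifndef ASSUME_AT_RANDOM_SKETCH
    if (dl_random == NULL) {
        /* Fallback canary used only when AT_RANDOM may be absent. */
        ret.bytes[sizeof(ret) - 1] = 255;
        ret.bytes[sizeof(ret) - 2] = '\n';
        return ret.num;
    }
#endif
    /* Unconditional, possibly unaligned copy: lwl/lwr on MIPS.
       A NULL dl_random faults here. */
    memcpy(ret.bytes, dl_random, sizeof(ret));
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    ret.num &= ~(uintptr_t)0xff;  /* the `and` with -256 */
#else
    ret.num &= ~((uintptr_t)0xff << (8 * (sizeof(ret) - 1)));
#endif
    return ret.num;
}
```

On either byte order the mask clears the byte at the lowest address, so the stored guard begins with a zero byte.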

1 Answer

Here is how I found the root of this problem, along with a suggested fix.

It helps to dive into the glibc source code to see what really happens. Starting from either _dl_random or __libc_start_main works.

Since _dl_random is unexpectedly NULL, we need to find out how this variable is initialized and where it is assigned. With the help of code analysis tools, we find that glibc only assigns _dl_random a meaningful value in the function _dl_aux_init, which is called by __libc_start_main.

_dl_aux_init iterates over its parameter, auxvec, and acts according to auxvec[i].at_type. The AT_RANDOM case is the one that assigns _dl_random. So the problem is that the vector contains no AT_RANDOM entry, and _dl_random is never assigned.
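
The loop described above can be sketched roughly as follows. This is an illustration, not the glibc source: the sk_ names are hypothetical, the struct is a simplified stand-in for the ELF auxv entry, and only two tags are handled. The tag numbers (AT_NULL = 0, AT_PAGESZ = 6, AT_RANDOM = 25) are the Linux values.

```c
#include <stddef.h>
#include <stdint.h>

/* Linux auxiliary vector tag values. */
#define SK_AT_NULL   0
#define SK_AT_PAGESZ 6
#define SK_AT_RANDOM 25

/* Simplified stand-in for an auxiliary vector entry. */
typedef struct {
    uintptr_t a_type;
    uintptr_t a_val;
} sk_auxv_t;

static void *sk_dl_random;     /* stands in for _dl_random */
static uintptr_t sk_pagesize;  /* stands in for the page size variable */

/* Walk the vector until the AT_NULL terminator and record the
   entries we care about. If no AT_RANDOM entry is present,
   sk_dl_random keeps its initial value, NULL. */
static void sk_aux_init(const sk_auxv_t *av)
{
    for (; av->a_type != SK_AT_NULL; ++av) {
        switch (av->a_type) {
        case SK_AT_PAGESZ:
            sk_pagesize = av->a_val;
            break;
        case SK_AT_RANDOM:
            sk_dl_random = (void *)av->a_val;
            break;
        default:
            break;
        }
    }
}
```

Feeding this loop a vector like the one gem5 builds for MIPS (page size, clock tick, program headers, IDs, but no AT_RANDOM) leaves the pointer NULL, which matches the crash.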

Since the program runs fine in user-mode qemu, the root cause lies with the provider of the system environment, here gem5, which is responsible for constructing the auxiliary vector. Searching on that keyword shows that the auxv is constructed in gem5/src/arch/<arch-name>/process.cc.

The current auxv for MIPS is constructed as follows:

    // Set the system page size
    auxv.push_back(auxv_t(M5_AT_PAGESZ, MipsISA::PageBytes));
    // Set the frequency at which time() increments
    auxv.push_back(auxv_t(M5_AT_CLKTCK, 100));
    // For statically linked executables, this is the virtual
    // address of the program header tables if they appear in the
    // executable image.
    auxv.push_back(auxv_t(M5_AT_PHDR, elfObject->programHeaderTable()));
    DPRINTF(Loader, "auxv at PHDR %08p\n", elfObject->programHeaderTable());
    // This is the size of a program header entry from the elf file.
    auxv.push_back(auxv_t(M5_AT_PHENT, elfObject->programHeaderSize()));
    // This is the number of program headers from the original elf file.
    auxv.push_back(auxv_t(M5_AT_PHNUM, elfObject->programHeaderCount()));
    // The entry point to the program
    auxv.push_back(auxv_t(M5_AT_ENTRY, objFile->entryPoint()));
    // Different user and group IDs
    auxv.push_back(auxv_t(M5_AT_UID, uid()));
    auxv.push_back(auxv_t(M5_AT_EUID, euid()));
    auxv.push_back(auxv_t(M5_AT_GID, gid()));
    auxv.push_back(auxv_t(M5_AT_EGID, egid()));

Now we know what to do: add an auxv entry tagged M5_AT_RANDOM whose value is an accessible address, so that _dl_random gets assigned. gem5's ARM arch already implements this (code), so we can take it as an example.
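
To illustrate what the loader has to provide, here is a plain C sketch. It is not gem5 code and does not use gem5's auxv_t or M5_AT_* definitions; every ex_ name is hypothetical. It assumes the Linux convention that AT_RANDOM (tag 25) points at 16 random bytes placed in memory the program can read:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_AT_NULL      0
#define EX_AT_RANDOM    25
#define EX_RANDOM_BYTES 16  /* Linux supplies 16 bytes for AT_RANDOM */

/* Simplified stand-in for an auxiliary vector entry. */
typedef struct {
    uintptr_t a_type;
    uintptr_t a_val;
} ex_auxv_t;

/* Copy the seed into program-visible memory at dst, then append an
   AT_RANDOM entry pointing at it, followed by the AT_NULL terminator.
   Returns the new entry count (terminator not counted), or 0 if the
   vector has no room for both entries. */
static size_t ex_push_at_random(ex_auxv_t *vec, size_t count, size_t cap,
                                unsigned char *dst,
                                const unsigned char *seed)
{
    if (count + 2 > cap) {
        return 0;
    }
    memcpy(dst, seed, EX_RANDOM_BYTES);
    vec[count].a_type = EX_AT_RANDOM;
    vec[count].a_val = (uintptr_t)dst;
    vec[count + 1].a_type = EX_AT_NULL;
    vec[count + 1].a_val = 0;
    return count + 1;
}
```

In a real simulator, dst would be an address in the simulated process's stack region rather than a host pointer, and the bytes would come from whatever random source the simulator prefers.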