
I'm trying to write an ELF executable loader for x86-64 Linux, similar to this, which was implemented on ARM. Chris Rossbach's advanced OS class includes a lab that does basically what I want to do. My goal is to load a simple, statically linked "hello world"-type binary into my process's memory and run it without calling execve. I have successfully mmap'd the ELF file, set up the stack, and jumped to the ELF's entry point (_start).

// put ELF file into memory. This is just one line of a complex
// for() loop that loads the binary from a file.
mmap((void*)program_header.p_vaddr, program_header.p_memsz, map, MAP_PRIVATE|MAP_FIXED, elffd, program_header.p_offset);


newstack = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); // Map a page for the stack

if(newstack == MAP_FAILED) { // mmap reports failure as MAP_FAILED, not as a negative pointer
  fprintf(stderr, "ERROR: mmap returned error when allocating stack, %s\n", strerror(errno));
  exit(1);
}

topstack = (unsigned long*)((unsigned char*)newstack+4096); // Top of new stack

*((unsigned long*)topstack-1) = 0;                       // auxv terminator (AT_NULL)
*((unsigned long*)topstack-2) = 0;                       // envp terminator
*((unsigned long*)topstack-3) = 0;                       // argv terminator
*((unsigned long*)topstack-4) = (unsigned long)argv[1];  // argv[0] of the loaded program
*((unsigned long*)topstack-5) = 1;                       // argc

asm("mov %0,%%rsp\n"     // Install new stack pointer
    "xor %%rax, %%rax\n" // Zero registers
    "xor %%rbx, %%rbx\n"
    "xor %%rcx, %%rcx\n"
    "xor %%rdx, %%rdx\n"
    "xor %%rsi, %%rsi\n"
    "xor %%rdi, %%rdi\n"
    "xor %%r8, %%r8\n"
    "xor %%r9, %%r9\n"
    "xor %%r10, %%r10\n"
    "xor %%r11, %%r11\n"
    "xor %%r12, %%r12\n"
    "xor %%r13, %%r13\n"
    "xor %%r14, %%r14\n"
    :   
    : "r"(topstack-5)
    :"rax", "rbx", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11", "r12", "r13", "r14");
asm("push %%rax\n"
    "pop %%rax\n"
    :   
    :   
    : "rax");

asm("mov %0,%%rax\n" // Jump to the entry point of the loaded ELF file
    "jmp *%%rax\n"
    :   
    : "r"(jump_target)
    : );

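
For reference, the hand-written stack words above can be collected into one small helper. This is only a minimal sketch: `build_initial_stack` is a hypothetical name, it assumes `top` is 16-byte aligned (a page-aligned mmap result plus 4096 is), and it writes only an empty environment and an empty auxiliary vector. A real kernel passes more auxv entries, and a C library's startup code may rely on some of them.

```c
#include <stddef.h>

/* Hypothetical helper: write a minimal initial stack image just below `top`
 * and return the address to load into %rsp before jumping to the entry point.
 * Layout, from low to high addresses:
 *   argc, argv[0], NULL, NULL (envp terminator), AT_NULL type, AT_NULL value */
static unsigned long *build_initial_stack(unsigned long *top, const char *prog_name)
{
    unsigned long *sp = top;

    *--sp = 0;                         /* auxv terminator: a_val           */
    *--sp = 0;                         /* auxv terminator: a_type, AT_NULL */
    *--sp = 0;                         /* envp[0] = NULL (empty environment) */
    *--sp = 0;                         /* argv[1] = NULL (end of argv)     */
    *--sp = (unsigned long)prog_name;  /* argv[0]                          */
    *--sp = 1;                         /* argc                             */

    return sp;                         /* six words below top, so still 16-byte aligned */
}
```

With this helper, the pointer passed to the first asm statement would be the return value of `build_initial_stack(topstack, argv[1])` instead of `topstack-5`.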
I then step through this code in gdb. I've pasted the first few instructions of the startup code below. Everything works great until the first push instruction (starred). The push causes a segfault.

0x60026000      xor    %ebp,%ebp
0x60026002      mov    %rdx,%r9
0x60026005      pop    %rsi
0x60026006      mov    %rsp,%rdx
0x60026009      and    $0xfffffffffffffff0,%rsp
0x6002600d *    push   %rax
0x6002600e      push   %rsp
0x6002600f      mov    $0x605f4990,%r8

I have tried:

  1. Using the stack from the original process.
  2. mmapping a new stack (as in the code above). Both (1) and (2) cause segfaults.
  3. Pushing to and popping from the stack before jumping to the loaded ELF file. This does not cause a segfault.
  4. Changing the protection flags for the stack in the second mmap to PROT_READ | PROT_WRITE | PROT_EXEC. This doesn't make a difference.

I suspect this might have something to do with the segment descriptors. It seems like the code from the ELF file that I'm loading does not have write access to the stack segment, no matter where the stack is located. I have not tried to modify the segment descriptor for the newly loaded binary or to change the architectural segment registers. Is this necessary? Does anybody know how to fix this?

Maybe asm volatile to keep GCC from screwing things up with optimizations. Also, have you verified you are using an LP64 data model? - jww
Thanks for the suggestion. I have verified that my inline assembly is being faithfully converted in the compiled binary by stepping through in gdb assembly mode. I assume that the program is in LP64 mode, but I'm not certain. It uses rax, rbx, rcx, etc. with 64-bit ints in them. Is that enough to be in LP64 mode? - Neil Klingensmith
What is the type of topstack? What is the value of topstack-5, and of $RSP at crash point? - Employed Russian
topstack is an unsigned long *. topstack-5 is 1 and $rsp is 0x7ffff7ff6fe0 just before the crash. - Neil Klingensmith

1 Answer


It turned out that when I was stepping through the loaded code in gdb, the debugger would consistently blow by the first push instruction when I typed nexti and instead continue execution. It was not in fact the push instruction that was causing the segfault but a much later instruction in the C library start code. The problem was caused by a failed call to mmap in the initial binary load that I didn't error check.
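
A minimal sketch of what checking that call could look like follows. The helper names `elf_flags_to_prot` and `map_segment` are hypothetical, not taken from the original loader. The sketch assumes the segment's p_vaddr and p_offset agree modulo the page size, as the ELF format requires for loadable segments, and it maps only the file-backed part of the segment; zero-filling the tail where p_memsz exceeds p_filesz is left out.

```c
#include <elf.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: translate ELF segment flags into mmap protection bits. */
static int elf_flags_to_prot(unsigned int p_flags)
{
    int prot = PROT_NONE;
    if (p_flags & PF_R) prot |= PROT_READ;
    if (p_flags & PF_W) prot |= PROT_WRITE;
    if (p_flags & PF_X) prot |= PROT_EXEC;
    return prot;
}

/* Hypothetical helper: map the file-backed part of one PT_LOAD segment.
 * Returns 0 on success and -1 (after printing the reason) on failure. */
static int map_segment(int elffd, const Elf64_Phdr *ph)
{
    unsigned long page = (unsigned long)sysconf(_SC_PAGESIZE);
    unsigned long skew = ph->p_vaddr & (page - 1); /* offset of p_vaddr within its page */

    /* mmap needs a page-aligned address and file offset, so round both down. */
    void *want = (void *)(ph->p_vaddr - skew);
    void *got = mmap(want, ph->p_filesz + skew, elf_flags_to_prot(ph->p_flags),
                     MAP_PRIVATE | MAP_FIXED, elffd, (off_t)(ph->p_offset - skew));

    if (got == MAP_FAILED) { /* failure is signalled by MAP_FAILED, not NULL */
        fprintf(stderr, "ERROR: mmap of segment at %#lx failed, %s\n",
                (unsigned long)ph->p_vaddr, strerror(errno));
        return -1;
    }
    return 0;
}
```

In the loading loop, a nonzero return from `map_segment` would be the point to stop, rather than jumping into a partially loaded image.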

Regarding gdb randomly deciding to continue execution instead of stepping: this can be fixed by loading the symbols from the target executable (for example with gdb's symbol-file or add-symbol-file command) after jumping to the newly loaded executable.