I'm trying to write an ELF executable loader for x86-64 Linux, similar to this one, which was implemented on ARM. Chris Rossbach's advanced OS class includes a lab that does basically what I want to do. My goal is to load a simple (statically-linked) "hello world" type binary into my process's memory and run it without calling execve. I have successfully mmap'd the ELF file, set up the stack, and jumped to the ELF's entry point (_start).
// put ELF file into memory. This is just one line of a complex
// for() loop that loads the binary from a file.
mmap((void*)program_header.p_vaddr, program_header.p_memsz, map, MAP_PRIVATE|MAP_FIXED, elffd, program_header.p_offset);
newstack = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); // Map a page for the stack
if (newstack == MAP_FAILED) { // mmap reports failure as MAP_FAILED, not NULL
    fprintf(stderr, "ERROR: mmap returned error when allocating stack, %s\n", strerror(errno));
    exit(1);
}
topstack = (unsigned long*)((unsigned char*)newstack+4096); // Top of new stack
*((unsigned long*)topstack-1) = 0; // auxv terminator type (AT_NULL)
*((unsigned long*)topstack-2) = 0; // envp terminator (no environment)
*((unsigned long*)topstack-3) = 0; // argv terminator
*((unsigned long*)topstack-4) = (unsigned long)argv[1]; // argv[0] of the loaded program
*((unsigned long*)topstack-5) = 1; // argc
asm("mov %0,%%rsp\n" // Install new stack pointer
"xor %%rax, %%rax\n" // Zero registers
"xor %%rbx, %%rbx\n"
"xor %%rcx, %%rcx\n"
"xor %%rdx, %%rdx\n"
"xor %%rsi, %%rsi\n"
"xor %%rdi, %%rdi\n"
"xor %%r8, %%r8\n"
"xor %%r9, %%r9\n"
"xor %%r10, %%r10\n"
"xor %%r11, %%r11\n"
"xor %%r12, %%r12\n"
"xor %%r13, %%r13\n"
"xor %%r14, %%r14\n"
:
: "r"(topstack-5)
:"rax", "rbx", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11", "r12", "r13", "r14");
asm("push %%rax\n"
"pop %%rax\n"
:
:
: "rax");
asm("mov %0,%%rax\n" // Jump to the entry point of the loaded ELF file
"jmp *%%rax\n"
:
: "r"(jump_target)
: );
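The mmap call at the top of this code is only one line of the loading loop. With MAP_FIXED, mmap needs a page-aligned address and a page-aligned file offset, so the values from the program header generally have to be rounded down first. Below is a minimal sketch of that arithmetic, not the actual loop from my loader: `seg_map` and `seg_map_for` are hypothetical names, and the sketch assumes the page size is a power of two and that p_vaddr and p_offset are congruent modulo the page size.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>

/* Hypothetical helper type: the page-aligned parameters for one mmap() call. */
struct seg_map {
    uint64_t addr;    /* page-aligned address for MAP_FIXED */
    uint64_t length;  /* number of file-backed bytes to map */
    uint64_t offset;  /* page-aligned offset into the ELF file */
};

/* Round a PT_LOAD segment down to a page boundary.
 * Assumes page_size is a power of two and p_vaddr % page_size ==
 * p_offset % page_size. */
static struct seg_map seg_map_for(const Elf64_Phdr *ph, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* Distance from the start of the containing page to p_vaddr. */
    uint64_t misalign = ph->p_vaddr & (page_size - 1);
    assert(ph->p_offset >= misalign);

    struct seg_map m;
    m.addr   = ph->p_vaddr - misalign;
    m.offset = ph->p_offset - misalign;
    m.length = ph->p_filesz + misalign;  /* file-backed part only */
    return m;
}
```

The sketch covers only the file-backed part of the segment. Any bytes between p_filesz and p_memsz (typically .bss) are expected to read as zero, so they would need to be zero-filled or mapped anonymously rather than taken from the file.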
I then step through the loader code above in gdb. I've pasted the first few instructions of the startup code below. Everything works great until the first push instruction (starred). The push causes a segfault.
0x60026000 xor %ebp,%ebp
0x60026002 mov %rdx,%r9
0x60026005 pop %rsi
0x60026006 mov %rsp,%rdx
0x60026009 and $0xfffffffffffffff0,%rsp
0x6002600d * push %rax
0x6002600e push %rsp
0x6002600f mov $0x605f4990,%r8
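For reference, the startup code above pops argc into %rsi, treats the remaining stack as argv, and then aligns %rsp to 16 bytes. In other words, it expects the initial process stack layout from the System V x86-64 ABI: argc at %rsp, followed by the argv pointers, a NULL, the envp pointers, a NULL, and the auxiliary vector. Here is a minimal sketch of building that layout, assuming an empty environment and an auxiliary vector that holds only the AT_NULL terminator. `build_initial_stack` is a hypothetical helper, not code from my loader.

```c
#include <assert.h>
#include <elf.h>      /* AT_NULL */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes argc, argv[], an empty envp[] and a minimal
 * auxv just below `top`, and returns the value to load into %rsp.
 * The caller must own enough writable memory below `top`. */
static uint64_t *build_initial_stack(uint64_t *top, int argc, char **argv)
{
    assert(top != NULL && argc >= 0);

    /* argc + argv pointers + argv NULL + envp NULL + auxv (type, value) */
    size_t words = 1 + (size_t)argc + 1 + 1 + 2;

    /* Reserve the words, then round down so %rsp is 16-byte aligned. */
    uint64_t *sp = top - words;
    sp = (uint64_t *)((uintptr_t)sp & ~(uintptr_t)15);

    uint64_t *p = sp;
    *p++ = (uint64_t)argc;
    for (int i = 0; i < argc; i++)
        *p++ = (uint64_t)(uintptr_t)argv[i];
    *p++ = 0;        /* argv terminator */
    *p++ = 0;        /* envp terminator (empty environment) */
    *p++ = AT_NULL;  /* auxv terminator: a_type */
    *p++ = 0;        /* auxv terminator: a_val */
    return sp;
}
```

A real libc may expect more auxiliary vector entries than the bare terminator, so this shows the shape of the layout rather than a complete setup.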
I have tried:
- Using the stack from the original process.
- mmap-ing a new stack (as in the above code). Both this and the previous approach cause segfaults.
- pushing and popping to/from the stack before jumping to the loaded ELF file. This does not cause a segfault.
- Changing the protection flags for the stack in the second mmap to PROT_READ | PROT_WRITE | PROT_EXEC. This doesn't make a difference.
I suspect this has something to do with the segment descriptors, but I'm not sure. It seems like the code from the ELF file that I'm loading does not have write access to the stack, no matter where the stack is located. I have not tried to modify the segment descriptor for the newly loaded binary or change the architectural segment registers. Is this necessary? Does anybody know how to fix this?
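One way to test the write-access suspicion directly is to look up which mapping contains the faulting address and what permissions it has. On Linux this information is in /proc/self/maps. The sketch below is a diagnostic aid only, and `mapping_perms` is a hypothetical helper name. It assumes each line of /proc/self/maps starts with `start-end perms`, the usual format.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: finds the mapping that contains `addr` and copies
 * its permission field (e.g. "rw-p") into perms.
 * Returns 0 on success, -1 if no mapping contains addr or the file
 * cannot be read. */
static int mapping_perms(const void *addr, char perms[5])
{
    assert(perms != NULL);

    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    uintptr_t a = (uintptr_t)addr;
    char line[512];
    int found = -1;

    while (fgets(line, sizeof line, f)) {
        uint64_t lo, hi;
        char p[5];
        /* Each line begins with "start-end perms ..." in hex. */
        if (sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s", &lo, &hi, p) != 3)
            continue;
        if (a >= lo && a < hi) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling this with the address the push is about to write to (the value of %rsp minus 8) just before the jump would show whether that page is mapped at all and whether it is writable. In gdb, `info proc mappings` lists the same address ranges.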
asm volatile to keep GCC from screwing things up with optimizations. Also, have you verified you are using an LP64 data model? – jww
topstack? What is the value of topstack-5, and of $RSP at crash point? – Employed Russian
unsigned long *. topstack-5 is 1 and $rsp is 0x7ffff7ff6fe0 just before the crash. – Neil Klingensmith