6
votes

First I am sorry about the length of this post, but I wanted to explain the problem clearly.

I try to write a kind of small self modifying program in C but I have some troubles and I don't know exactly why.

Plateform is : Ubuntu/Linux 2.6.32-40 x86_64, prog is build on x86 arch, gcc (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3, GNU ld (GNU Binutils for Ubuntu) 2.20.1-system.20100303

The purpose of the program is to create a read/write/execute chunk of memory (with memalign(3) and mprotect(2)), copy a small function called p() (defined in the .text segment) in this chunk of memory and then execute the copied function through a pointer. The p() function just displays a message using printf(puts).

In order to get the starting and ending address of the code of p() (to copy it), I use the address of the function itself and the address of a dummy() function create just after p() in .text.

int p() { ... }       <- address where the copy starts
int dummy() { ... }   <- address where the copy stops

Chunk memory creation and copy are successfully done but when the code in the chunk is run a segfault occurs. By using gdb it is clear that we enter in the code of the chunk (the body of the copied function) but the call to printf failed. When disassembling the p() function and the code in the chunk I see that the address use in the 'call' is not the same.

And I don't known why the address is incorrect, when the code is copied it is displayed and it is the same that objdump (or gdb) gave me when I disassemble the p() function.

The binary is create with -static to avoid potential problem with got/plt or with the relocation process of ld.so. It doesn't seem to be a problem to run code on the heap because the beginning of the copied function is executed (check under gdb).

The simplified src of the program :

<... skip include/checks ...>
#define newline()   putchar('\n')

/* - function copied in the chunk */ 
int p()
{
    printf("hello world\n");
    return 0;
}

/* - dummy function to get last address of p() */
int dummy() { return 0; }

int main()
{
    char *s, *code, *chunk;
    unsigned int pagesz, sz;
    int (*ptr)(void);

    pagesz = sysconf(_SC_PAGE_SIZE);
    chunk = (char*)memalign(pagesz, 4 * pagesz);
    mprotect(chunk, 4 * pagesz, PROT_WRITE|PROT_EXEC|PROT_READ);

    /* - get size, display addr */
    sz = (char *)dummy - (char *)p;
    printf("Copy between  : %p - %p\n", (char *)p, (char *)dummy);
    printf("Copy in chunk : %p - %p\n", chunk, chunk + sz, sz);

    /* - copy code (1 byte in supp) */
    printf("Copied code   : ");
    for(s = (char *)p, code = chunk; \
            s <= (char *)dummy; s++, code++) {

       *code = *s;     
       /* - write in console -- to check if code of p() after disas
        * it with objdump(1) is the same, RESULT : ok */
        printf("%02x ", *(unsigned char *)code);
    }
    newline();

    /* - run once orginal function */
    ptr = p;
    ptr();

    /* - run copied function (in chunk) */
    ptr = (int (*)(void))chunk;
    ptr(); 

    newline();
    free(chunk);
    return 0;
}

The p() function disassembled by objdump(1) :

080483c3 <p>:
 80482c0:       55                      push   %ebp
 80482c1:       89 e5                   mov    %esp,%ebp
 80482c3:       83 ec 18                sub    $0x18,%esp
 80482c6:       c7 04 24 a8 d9 0a 08    movl   $0x80ad9a8,(%esp)
 80482cd:       e8 7e 0c 00 00          call   8048f50 <_IO_puts>
 80482d2:       b8 00 00 00 00          mov    $0x0,%eax
 80482d7:       c9                      leave  
 80482d8:       c3                      ret    

080483dc <dummy>:
 ....

When the program is run under gdb(1) the copied code is the same (hex value) than objdump(1) provide above :

# gcc -m32 -o selfmodif_light selfmodif_light.c -static -g -O0

# gdb -q ./selfmodif_light
Reading symbols from /path/.../selfmodif_light...done.

(gdb) list 55
50          /* - run once orginal function */
51          ptr = p;
52          ptr();
53
54          /* - run copied function (in chunk) */
55          ptr = (int (*)(void))chunk;

<<< The problem is here >>>

56          ptr();
57      
58          newline();
59          free(chunk);

(gdb) br 56
Breakpoint 1 at 0x8048413: file tmp.c, line 56.

(gdb) run
Starting program: /path/.../selfmodif_light
Copy between  : 0x80482c0 - 0x80482d9
Copy in chunk : 0x80d2000 - 0x80d2019
Copied code   : 55 89 e5 83 ec 18 c7 04 24 a8 d9 0a 08 e8 7e 0c 00 00 b8 00 00 00 00 c9 c3 55
hello world

Breakpoint 1, main () at tmp.c:56
56          ptr();

If we look in main we next go into the chunk :

(gdb) disas main
Dump of assembler code for function main:
   0x080482e3 <+0>: push   %ebp
    ... <skip> ...

=> 0x08048413 <+304>:   mov    0x18(%esp),%eax
   0x08048417 <+308>:   call   *%eax

    ... <skip> ...
   0x08048437 <+340>:   ret
End of assembler dump.

But when p() and chunk are disassembled, we have a call 0x80d2c90 in the memory chunk instead of a call 0x8048f50 <puts> like in the p() function ? For which reason displayed address is not the same.

(gdb) disas p
Dump of assembler code for function p:
   0x080482c0 <+0>:     push   %ebp
   0x080482c1 <+1>:     mov    %esp,%ebp
   0x080482c3 <+3>:     sub    $0x18,%esp
   0x080482c6 <+6>:     movl   $0x80ad9a8,(%esp)
   0x080482cd <+13>:    call   0x8048f50 <puts> <<= it is not the same address
   0x080482d2 <+18>:    mov    $0x0,%eax
   0x080482d7 <+23>:    leave  
   0x080482d8 <+24>:    ret
End of assembler dump.
(gdb) disas 0x80d2000,0x80d2019
Dump of assembler code from 0x80d2000 to 0x80d2019:
   0x080d2000:  push   %ebp
   0x080d2001:  mov    %esp,%ebp
   0x080d2003:  sub    $0x18,%esp
   0x080d2006:  movl   $0x80ad9a8,(%esp)
   0x080d200d:  call   0x80d2c90             <<= than here (but it should be ??)
   0x080d2012:  mov    $0x0,%eax
   0x080d2017:  leave  
   0x080d2018:  ret
End of assembler dump.

When memory is checked, codes seem to be identical. At this point I don't understand what is happening, what is the problem ? gdb's interpretation failed, copy of code or what ?

(gdb) x/25bx p // code of p in .text
0x80482c0 <p>:  0x55    0x89    0xe5    0x83    0xec    0x18    0xc7    0x04
0x80482c8 <p+8>:    0x24    0xa8    0xd9    0x0a    0x08    0xe8    0x7e    0x0c
0x80482d0 <p+16>:   0x00    0x00    0xb8    0x00    0x00    0x00    0x00    0xc9
0x80482d8 <p+24>:   0xc3

(gdb) x/25bx 0x80d2000 // code of copy in the chunk
0x80d2000:  0x55    0x89    0xe5    0x83    0xec    0x18    0xc7    0x04
0x80d2008:  0x24    0xa8    0xd9    0x0a    0x08    0xe8    0x7e    0x0c
0x80d2010:  0x00    0x00    0xb8    0x00    0x00    0x00    0x00    0xc9
0x80d2018:  0xc3

If a breakpoint is set then execution continue in the memory chunk :

(gdb) br *0x080d200d
Breakpoint 2 at 0x80d200d
(gdb) cont
Continuing.

Breakpoint 2, 0x080d200d in ?? ()
(gdb) disas 0x80d2000,0x80d2019
Dump of assembler code from 0x80d2000 to 0x80d2019:
   0x080d2000:  push   %ebp
   0x080d2001:  mov    %esp,%ebp
   0x080d2003:  sub    $0x18,%esp
   0x080d2006:  movl   $0x80ad9a8,(%esp)
=> 0x080d200d:  call   0x80d2c90
   0x080d2012:  mov    $0x0,%eax
   0x080d2017:  leave
   0x080d2018:  ret
End of assembler dump.
(gdb) info reg eip
eip            0x80d200d    0x80d200d
(gdb) nexti
0x080d2c90 in ?? ()
(gdb) info reg eip
eip            0x80d2c90    0x80d2c90
(gdb) bt
#0  0x080d2c90 in ?? ()
#1  0x08048419 in main () at selfmodif_light.c:56

So at this point either the program runs like that and a segfault occurs or $eip is changed and the program ends without errors.

(gdb) set $eip = 0x8048f50
(gdb) cont
Continuing.
hello world

Program exited normally.
(gdb)

I do not understand what is happening, what failed. The copy of the code seems to be ok, the jump into the memory chunk too, so why the address (of the call) isn't the good ?

Thanks for your answers and your time

1
What kind of a hacker are you anyway??? Please! If you're going to write self-modifying code you've got to do this yourself :-)ControlAltDel

1 Answers

7
votes
80482cd:       e8 7e 0c 00 00          call   8048f50

That's a relative CALL (to +0xC7E). When you move that instruction to a different EIP, you need to modify the offset.