8
votes

I'm learning asm on Linux (noobuntu 10.04). I got the following code from http://asm.sourceforge.net/intro/hello.html:

section .text
global _start ;must be declared for linker (ld)

_start: ;tell linker entry point

mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel

mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel

section .data

msg db 'Hello, world!',0xa ;our dear string
len equ $ - msg ;length of our dear string

It's a simple hello world that runs on Linux and (apparently) calls the kernel directly. Can anyone please explain what is really going on here? I think the kernel reads the integers in the eax and ebx registers, plus the data in ecx and edx, and that defines the system call when the kernel is called. If so, do different combinations of integers define different system calls when int 0x80 is called?

I'm not good with man pages, but I have read every related one I can find. Does any man page tell me which combinations define which syscalls?

ANY help is appreciated. A line-by-line explanation would be amazing. Thanks in advance, Jeremy


3 Answers

8
votes

When you execute int 0x80, the kernel looks at the value of the eax register to determine which function you want to call (this is the "syscall number"). Depending on that number, the remaining registers are interpreted as that call's arguments. The sys_write call expects the registers to be set up as follows:

  • eax contains 4
  • ebx contains the file descriptor
  • ecx contains the address of the data to write
  • edx contains the number of bytes

For more detail, see Linux System Calls.
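
The same register layout can be sketched from C by going through glibc's syscall() wrapper instead of int 0x80. This is only a sketch under a few assumptions: the helper names raw_write and say_hello are made up for illustration, and SYS_write is used by name because the value 4 applies to 32-bit x86 only (other architectures assign other numbers).

```c
#define _GNU_SOURCE        /* exposes syscall() in <unistd.h> on glibc */
#include <assert.h>
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: issue the write system call without the libc write() wrapper.
 * fd, buf and count play the roles of ebx, ecx and edx in the assembly version. */
long raw_write(int fd, const void *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    return syscall(SYS_write, fd, buf, count);
}

/* Hypothetical helper: write the greeting to stdout (fd 1).
 * Returns the number of bytes written, or -1 on error. */
long say_hello(void)
{
    static const char msg[] = "Hello, world!\n";
    return raw_write(1, msg, sizeof msg - 1);   /* drop the trailing NUL */
}
```

Calling say_hello() from main prints the greeting, and a bad file descriptor makes raw_write return -1 with errno set, just as the libc wrapper would.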

2
votes
section .text
global _start ;must be declared for linker (ld)

This is just header material. The "text" section of an assembly program holds the machine instructions (as opposed to the data, read-only data, and BSS sections). The global line is akin to declaring the _start symbol "public", so that the linker can see it.

_start: ;tell linker entry point

mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel

From the comments we know that we are looking at the sys_write call, so we can run man 2 write to get the details. The C prototype takes three parameters: fd, buf, and count. Starting with ebx, the registers match in order (ebx = fd, ecx = address of the string to write, edx = length of the string). Then, since we are a user process, we must ask the kernel to perform the output. This is done through the system call interface, where write is (apparently) given the number 4. int 0x80 is a software interrupt that transfers control to the Linux kernel's system call handler, which reads the number from eax.

You can find the actual numbers of all the syscalls in the Linux header files (assuming you have them installed). On my system, /usr/include/sys/syscall.h led to /usr/include/asm/unistd.h and then to /usr/include/asm-i386/unistd.h, which contains #define __NR_write 4.
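
Rather than chasing the header chain by hand, a small C sketch can ask the headers directly. The helper name format_syscall_numbers is hypothetical; SYS_write and SYS_exit are the names that <sys/syscall.h> derives from __NR_write and __NR_exit. The values depend on the architecture the code is compiled for: 4 and 1 on 32-bit x86, but 1 and 60 on x86-64.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>   /* SYS_write, SYS_exit */

/* Hypothetical helper: format the numbers this machine's headers assign
 * to write and exit into out. Returns what snprintf returns. */
int format_syscall_numbers(char *out, size_t size)
{
    assert(out != NULL && size > 0);
    return snprintf(out, size, "write=%d exit=%d", (int)SYS_write, (int)SYS_exit);
}
```

Passing a 64-byte buffer and printing it with puts is enough to see which numbers apply on a given machine.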

mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel

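As with the last two lines of the previous segment, this loads the syscall number and triggers the software interrupt, this time to exit the program (the kernel removes its memory mappings and cleans up). sys_exit takes the exit status in ebx; the code never reloads ebx, so it still holds 1 from the write call and the process exits with status 1.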
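
To observe the status that a raw exit call hands back to the parent process, here is a rough C sketch. The helper name raw_exit_status is made up for illustration, and it goes through glibc's syscall() wrapper rather than int 0x80, so it shows the same system call but not the same instruction.

```c
#define _GNU_SOURCE        /* exposes syscall() in <unistd.h> on glibc */
#include <assert.h>
#include <sys/syscall.h>   /* SYS_exit */
#include <sys/types.h>     /* pid_t */
#include <sys/wait.h>      /* waitpid(), WIFEXITED, WEXITSTATUS */
#include <unistd.h>        /* fork(), syscall(), _exit() */

/* Hypothetical helper: fork a child that leaves through the raw exit system call
 * with the given status, and return the status the parent sees (-1 on failure). */
int raw_exit_status(int status)
{
    assert(status >= 0 && status <= 255);   /* only the low 8 bits reach the parent */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: same call as "mov ebx,status / mov eax,1 / int 0x80". */
        syscall(SYS_exit, status);
        _exit(127);                          /* not reached */
    }

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) < 0 || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

From a shell, the same value is what echo $? reports after running the assembled program.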

section .data

msg db 'Hello, world!',0xa ;our dear string
len equ $ - msg ;length of our dear string

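This is the data section; it defines the data our program uses. msg is the bytes of the string followed by a newline (0xa). len is an assemble-time constant: $ is the current address, so $ - msg is the number of bytes emitted since msg, which is 14 here.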
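
For comparison, the same "let the toolchain count the bytes" idea can be sketched in C. This is an analogy rather than a translation: the names mirror the assembly, message_length is a hypothetical accessor, and C appends a NUL byte that the assembly version does not have.

```c
#include <assert.h>

/* Same bytes as: msg db 'Hello, world!',0xa (plus the NUL that C appends). */
const char msg[] = "Hello, world!\n";

/* Same idea as: len equ $ - msg. The compiler computes the size, so editing
 * the string never leaves a stale length behind. */
enum { len = sizeof msg - 1 };

/* Checked at compile time: 13 characters plus the newline. */
static_assert(len == 14, "13 characters plus the newline");

/* Hypothetical accessor so other code can read the value. */
int message_length(void)
{
    return len;
}
```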

0
votes

There are too many system calls for there to be a different assembly-language instruction for each one.

Instead, you execute a single trap instruction (int 0x80 in the code above). The value of eax determines which system call will be invoked. The other registers are the arguments to the system call.

The system calls are listed in a table inside the kernel, indexed by that number.
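
The "one entry point, number selects the call" idea can be illustrated with a toy dispatcher in C. This is not kernel code: every name below (toy_write, toy_exit, toy_table, toy_dispatch) is invented for the sketch, the handlers do no real work, and only the 32-bit x86 numbers 1 and 4 from the question are filled in. Returning -ENOSYS for an unknown number does match what Linux reports for a system call it does not implement.

```c
#include <assert.h>
#include <errno.h>    /* ENOSYS */
#include <stddef.h>   /* size_t, NULL */

typedef long (*handler_fn)(long a, long b, long c);

/* Toy handler: pretends every byte was written and returns the count. */
static long toy_write(long fd, long buf, long count)
{
    (void)fd; (void)buf;
    assert(count >= 0);
    return count;
}

/* Toy handler: a real exit never returns; this one just echoes the status. */
static long toy_exit(long status, long unused_b, long unused_c)
{
    (void)unused_b; (void)unused_c;
    return status;
}

/* The table is indexed by the number the caller would place in eax.
 * Slots without an initializer are NULL. */
static const handler_fn toy_table[] = {
    [1] = toy_exit,
    [4] = toy_write,
};

/* One entry point for every call, like the int 0x80 handler:
 * nr plays the role of eax; a, b, c play ebx, ecx, edx. */
long toy_dispatch(long nr, long a, long b, long c)
{
    size_t entries = sizeof toy_table / sizeof toy_table[0];
    if (nr < 0 || (size_t)nr >= entries || toy_table[nr] == NULL)
        return -ENOSYS;
    return toy_table[nr](a, b, c);
}
```

With this layout, adding a system call means adding a table entry, not a new instruction, which is the point the answer is making.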