I wanted to write a program in assembly (Linux) that accepts a filename from the command line. When I built it with the "ld" command, I was able to retrieve the values from the stack using successive pop instructions, but the same approach failed when I built it with the "gcc" command. I need to use gcc because the program will call various C standard library functions.

The file was actually being created, but it always carried an "Invalid encoding" label and appeared as something like <? G ? in the directory. I wanted to know:

  1. Do we follow a different procedure when building with the gcc tools?
  2. What was the likely reason for a file with an invalid-encoding name being created (out of curiosity)?

Here is sample code that works with ld but not with gcc.

section .data
    filename: db 'testing',0
section .text
    ;extern printf    ;to be uncommented when using gcc
    ;extern scanf     ;           -do-
    global _start   ; replace with main when using gcc

_start:     ; replace with main:
    pop ebx     ; argc (argument count)
    pop ebx     ; argv[0] (argument 0, the program name)
    pop ebx     ; The first real arg, a filename

    mov eax,8       
    ; issue: ebx is not holding the filename popped from cli using gcc 
    ;mov     ebx,filename   ; filename as a constant works with gcc but cli?
    mov ecx,00644Q  ; File mode in octal: 0644 = rw-r--r--
    int 80h     ; Call the kernel
                ; Now we have a file descriptor in eax

    test    eax,eax     ; Let's make sure the file descriptor is valid
    js  terminate   ; If the sign flag is set, eax holds a negative error code
    call    fileWrite

terminate:
    mov ebx,eax     ; If there was an error, save the errno in ebx
    mov eax,1       ; Put the exit syscall number in eax
    int 80h     ; control over to kernel

fileWrite:  ; simply closing the file for time being
    mov ebx,eax        ; edited
    mov eax,6       ; sys_close (ebx already contains file descriptor)
    int 80h
    call terminate

Solution and Caveat: There is a difference in the stack layout between using libc and bare-bones assembly.

  1. When using libc, main is entered like an ordinary function, so the first pop returns the return address, followed by argc and then argv (a pointer to the array of argument strings).

  2. In bare-bones assembly, the first pop returns argc, and every pop after that yields the successive argv values directly, unlike the single arguments pointer you get when using libc.

Source: Reading filename from argv via x86 assembly
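
A minimal sketch of the libc case described above, assuming a 32-bit build with main as the entry point. The file name create.asm and the build commands in the comments are assumptions, not part of the original post. Instead of popping, it reads argc and argv relative to esp, so the return address stays in place and main can return normally:

```nasm
; Assumed build commands (32-bit target, hypothetical file name):
;   nasm -f elf32 create.asm -o create.o
;   gcc -m32 create.o -o create
section .text
    global main

main:
    push    ebx                 ; ebx is callee-saved under the C calling convention
    ; Stack now: [esp] saved ebx, [esp+4] return address,
    ;            [esp+8] argc,    [esp+12] argv (pointer to the pointer array)
    cmp     dword [esp+8], 2    ; need at least one real argument
    jl      .fail

    mov     eax, [esp+12]       ; eax = argv
    mov     ebx, [eax+4]        ; ebx = argv[1], the filename
    mov     eax, 8              ; sys_creat
    mov     ecx, 0644o          ; mode rw-r--r--
    int     80h                 ; eax = file descriptor, or a negative error code
    test    eax, eax
    js      .fail

    mov     ebx, eax            ; ebx = file descriptor
    mov     eax, 6              ; sys_close
    int     80h
    xor     eax, eax            ; return 0 from main
    pop     ebx
    ret

.fail:
    mov     eax, 1              ; return 1 from main
    pop     ebx
    ret
```

Popping three times in main, as the original code does, leaves ebx holding the argv array pointer rather than a string. That would explain the garbage file name: the kernel reads the raw pointer bytes as if they were the name.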

The gcc command, when linking, is just a frontend for the ld command: it still calls ld to do the actual linking, passing along the flags needed for the standard library. If you want to use the standard C library, just add the flag to link with it: -lc. - Some programmer dude
Just a point: your last comment says ; sys_close (ebx already contains file descriptor). Does it? - Weather Vane
@WeatherVane You are right, I missed the mov ebx,eax line at the beginning of fileWrite. But ebx holds the filename when I use ld without any stdlib support. - touchStone
@JoachimPileborg Adding the -lc flag worked, but I wonder why indirectly using ld through gcc doesn't work. Any idea? - touchStone
@JoachimPileborg and touchStone: Linking using gcc foo.o actually uses ld crt.o foo.o -lc. You can use gcc -nostartfiles to get libc but not the CRT startup code which defines _start. See stackoverflow.com/questions/36861903/… - Peter Cordes

1 Answer

This blog post explains what the stack looks like when the entry point of a program is called: http://eli.thegreenplace.net/2012/08/13/how-statically-linked-programs-run-on-linux

In a nutshell, you have these elements on the stack:

  1. argc
  2. argv[0] - program/executable name
  3. argv[1] ... argv[argc-1] - Program arguments
  4. argv[argc] - Always NULL
  5. envp[0] ... envp[N] - The current environment
  6. NULL to terminate the envp array

These stack slots are either 32 bits or 64 bits wide, depending on whether the program is built as a 32-bit (x86) or 64-bit (x64) executable. Make sure you fetch the correct sizes from the stack: on x64, argc occupies an 8-byte slot.
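
As a rough illustration of that layout, here is a small sketch of a libc-free 32-bit _start that reads argc and argv[1] by offset from esp and prints argv[1]. The file name args.asm and the build commands are assumptions:

```nasm
; Assumed build commands (32-bit target, hypothetical file name):
;   nasm -f elf32 args.asm -o args.o
;   ld -m elf_i386 args.o -o args
section .text
    global _start

_start:
    ; At process entry (32-bit): [esp] = argc, [esp+4] = argv[0],
    ; [esp+8] = argv[1], ... followed by NULL and then the envp pointers.
    ; In a 64-bit build the slots are 8 bytes: [rsp], [rsp+8], [rsp+16], ...
    mov     eax, [esp]          ; eax = argc
    cmp     eax, 2
    jl      .done               ; no argument given, nothing to print

    mov     ecx, [esp+8]        ; ecx = argv[1] (buffer for sys_write)
    xor     edx, edx            ; edx = string length, counted below
.count:
    cmp     byte [ecx+edx], 0   ; stop at the terminating NUL
    je      .print
    inc     edx
    jmp     .count

.print:
    mov     eax, 4              ; sys_write
    mov     ebx, 1              ; fd 1 = stdout
    int     80h

.done:
    mov     eax, 1              ; sys_exit
    xor     ebx, ebx            ; exit status 0
    int     80h
```

Reading by offset rather than with pop leaves esp untouched, so the same values can be fetched again later.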

If you want to avoid this hassle, link against libc and provide a main entry point instead of _start. The C startup code that gcc links in alongside libc provides _start, which turns the command-line arguments into arrays and then calls main with three arguments (passed on the stack in 32-bit code):

  1. int argc
  2. char** argv
  3. char** envp

The startup code of libc will also initialize the stdio framework; without that, calls to printf() can fail because stdout may not have been set up.
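
A minimal sketch of that approach, assuming a 32-bit build linked through gcc so the startup code runs before main. The file name show.asm, the format string and the build commands are assumptions. Here main picks argv[1] out of its arguments and passes it to printf:

```nasm
; Assumed build commands (32-bit target, hypothetical file name);
; -no-pie is used because the code pushes the absolute address of fmt:
;   nasm -f elf32 show.asm -o show.o
;   gcc -m32 -no-pie show.o -o show
section .data
    fmt: db "first argument: %s", 10, 0

section .text
    extern printf
    global main

main:
    push    ebp
    mov     ebp, esp            ; [ebp+8] = argc, [ebp+12] = argv, [ebp+16] = envp
    cmp     dword [ebp+8], 2    ; need at least one real argument
    jl      .none

    mov     eax, [ebp+12]       ; eax = argv
    push    dword [eax+4]       ; second printf argument: argv[1]
    push    fmt                 ; first printf argument: format string
    call    printf
    add     esp, 8              ; caller removes the two arguments
    xor     eax, eax            ; return 0 from main
    jmp     .out

.none:
    mov     eax, 1              ; return 1 when no argument was given
.out:
    mov     esp, ebp
    pop     ebp
    ret
```

Because main is called like any other C function, it ends with ret instead of issuing the exit syscall itself. Returning lets the startup code run its normal shutdown, which includes flushing stdout.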