
On Ubuntu 16.04:

$ cat hola.asm

    extern puts
    global main

    section .text
main:
    mov rdi,message
    call puts
    ret

message:
    db  "Hola",0
$ nasm -f elf64 hola.asm  
$ gcc hola.o

/usr/bin/ld: hola.o: relocation R_X86_64_PC32 against symbol `puts@@GLIBC_2.2.5' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status

Use:

$gcc -fPIC hola.o -o hola && ./hola
Hola

The docs:

-fPIC If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table. This option makes a difference on AArch64, m68k, PowerPC and SPARC.

Position-independent code requires special support, and therefore works only on certain machines. When this flag is set, the macros "__pic__" and "__PIC__" are defined to 2.

The -static option with gcc also works. Use -static to completely avoid external calls to dynamic libraries:

$nasm -f elf64 -l hola.lst hola.asm && gcc -m64 -static -o hola hola.o && ./hola
Hola

and also:

$ nasm -f elf64 hola.asm && gcc -static -o hola hola.o && ./hola
Hola

Including wrt ..plt also worked

    global main
    extern puts

    section .text
main:
    mov rdi,message
    call puts wrt ..plt
    ret
message:
    db "Hola", 0




$nasm -f elf64 hola.asm
$gcc -m64 -o hola hola.o && ./hola
Hola

From the NASM documentation on ..plt:

..plt Referring to a procedure name using wrt ..plt causes the linker to build a procedure linkage table entry for the symbol, and the reference gives the address of the PLT entry. You can only use this in contexts which would generate a PC-relative relocation normally (i.e. as the destination for CALL or JMP), since ELF contains no relocation type to refer to PLT entries absolutely.

Why are you passing /usr/bin/ld as an argument to GCC? You want to run the linker, not use the linker to link itself. - David Grayson
Compile the final executable using the -static option. - Michael Petch
The alternative is to modify the assembly file and change the C library call by placing wrt ..plt on the end, so it would look like call puts wrt ..plt. I suspect you are on a more recent Ubuntu or Debian based system that defaults to compiling position-independent executables. - Michael Petch
As Grayson points out, you don't put /usr/bin/ld on the GCC command line. /usr/bin/ld is an executable that links code. It should look like nasm -f elf64 -l hola.lst hola.asm && gcc -m64 -o hola hola.o - Michael Petch
You can see the assembly that GCC makes by using the -S option: gcc -S hi.c -o hi.s. When I do this, it's clear that GCC is using call puts. Starting from GCC's assembly output, you could remove stuff you don't understand (or learn why it's there) until you have a small assembly file that does what you want. For linking, if you use gcc hi.s -o hi, then GCC will ensure that the C library is included properly. - Dave M.

1 Answer


I wrote this program to do the same thing as hi.c, without the C library call. Then I followed the suggestion to use the -S gcc option on hi.c and dissect the resulting hi.s.

$ cat hiasm.asm

section .text
    global _start

_start:

    mov     dl, 5
    mov     esi, msg
    xor     di,di
    xor     al,al
    inc     di
    inc     al
    syscall

    xor     rdi,rdi 
    mov al,60
    syscall

msg:    db "Hello"

$ nasm -f elf64 hiasm.asm && ld -m elf_x86_64 hiasm.o -o hiasm && ./hiasm

Hello

$ echo $?

0

So this works fine.

Again, here's the simple hi.c:

$ cat hi.c

#include <stdio.h>

int main(void)
{
    puts("Hello");
    return 0;
}

$ gcc -S hi.c && cat hi.s

    .file   "hi.c"
    .section    .rodata
.LC0:
    .string "Hello"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    leaq    .LC0(%rip), %rdi
    call    puts@PLT
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Debian 6.3.0-18) 6.3.0 20170516"
    .section    .note.GNU-stack,"",@progbits

$ gcc hi.s -o hi && ./hi

Hello

The labels .LFB0 and .LFE0 do not appear to be referenced anywhere in the .s file. After removing both, the file still works as expected. From the 'as' assembler docs:

https://sourceware.org/binutils/docs/as/index.html

Local symbols are defined and used within the assembler, but they are normally not saved in object files. Thus, they are not visible when debugging. You may use the `-L' option (see Include Local Symbols) to retain the local symbols in the object files.

So for a bare executable with no need for bells and whistles, they can be chopped.

So I got rid of the easy ones.

Next, the function is named main. There's not much use for that here, so I'll rename it _start.

For ELF targets, the .size directive is used like this:

 .size name , expression

This directive sets the size associated with a symbol name. The size in bytes is computed from expression which can make use of label arithmetic. This directive is typically used to set the size of function symbols.

I don't need function symbol sizes, so I got rid of the .size at the bottom that references main.

$ cat hi.s

    .file   "hi.c"              ##tells 'as' that we are about to start a new logical file
    .section    .rodata         ##assembles the following code into section '.rodata'
.LC0:                           ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
                                ##  are guaranteed to be unique over the source code
                                ##  that allow the compiler to use names/simple notation
                                ##  to reference sections of code
                                ##But here, only .LC0 is actually referenced in the code

    .string "Hello"
    .text
    .globl  _start
_start:
    .cfi_startproc          ##used at the beginning of each function that should have an
                    ##entry in .eh_frame. It initializes some internal data
                    ##structures. Don't forget to close by .cfi_endproc
    pushq   %rbp            ##push base pointer onto stack

    .cfi_def_cfa_offset 16      ##modifies a rule for computing CFA. Register remains the
                    ##same, but offset is new. Note that it is the absolute
                    ##offset that will be added to a defined register to
                    ##compute CFA address
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    leaq    .LC0(%rip), %rdi
    call    puts@PLT
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc            ##close of .cfi_startproc

    .ident  "GCC: (Debian 6.3.0-18) 6.3.0 20170516"
    .section    .note.GNU-stack,"",@progbits

Trying that:

$ gcc -o hi hi.s

/tmp/ccLxG1jh.o: In function `_start':
hi.c:(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here
/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status

$ ldd hi

linux-vdso.so.1 (0x00007fffb6569000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe7456e7000)
/lib64/ld-linux-x86-64.so.2 (0x000055edc8bc8000)

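It's definitely using libc. gcc also links the C runtime startup file (Scrt1.o in the error above), which already defines _start and expects a main; that explains the multiple definition of _start. So I'll try getting rid of the standard library and startup files with the -nostdlib gcc option: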

$ gcc -nostdlib -o hi hi.s

/tmp/ccV5QYaT.o: In function `_start':
hi.c:(.text+0xc): undefined reference to `puts'
collect2: error: ld returned 1 exit status

Right, puts still needs the C library, so I'll replace the puts call with a write system call:

    .file   "hi.c"              ##tells 'as' that we are about to start a new logical file
    .section    .rodata         ##assembles the following code into section '.rodata'
.LC0:                           ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
                                ##  are guaranteed to be unique over the source code
                                ##  that allow the compiler to use names/simple notation
                                ##  to reference sections of code
                                ##But here, only .LC0 is actually referenced in the code

    .string "Hello"
    .text
    .globl  _start
_start:
    .cfi_startproc              ##used at the beginning of each function that should have an
                                ##entry in .eh_frame. It initializes some internal data
                                ##structures. Don't forget to close by .cfi_endproc
    pushq   %rbp                ##push base pointer onto stack

    .cfi_def_cfa_offset 16      ##modifies a rule for computing CFA. Register remains the
                                ##same, but offset is new. Note that it is the absolute
                                ##offset that will be added to a defined register to
                                ##compute CFA address
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    leaq    .LC0(%rip), %rsi    ##this reg value and others were changed for write call
    movq    $1, %rax
    movq    $1, %rdi
    movq    $5, %rdx
    syscall

    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc                ##close of .cfi_startproc

$ gcc -nostdlib -o hi hi.s && ./hi

HelloSegmentation fault

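Promising. The write works; the crash comes from the ret at the end. Nothing calls _start, so there is no return address on the stack to go back to, and the program has to end with an exit system call instead: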

    .file   "hi.c"              ##tells 'as' that we are about to start a new logical file
    .section    .rodata         ##assembles the following code into section '.rodata'
.LC0:                           ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
                                ##  are guaranteed to be unique over the source code
                                ##  that allow the compiler to use names/simple notation
                                ##  to reference sections of code
                                ##But here, only .LC0 is actually referenced in the code

    .string "Hello"
    .text
    .globl  _start
_start:
    .cfi_startproc              ##used at the beginning of each function that should have an
                                ##entry in .eh_frame. It initializes some internal data
                                ##structures. Don't forget to close by .cfi_endproc

    ##deleted the base pointer push and pops from stack, don't need stack

    .cfi_def_cfa_offset 16      ##modifies a rule for computing CFA. Register remains the
                                ##same, but offset is new. Note that it is the absolute
                                ##offset that will be added to a defined register to
                                ##compute CFA address
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    leaq    .LC0(%rip), %rsi
    movq    $1, %rax
    movq    $1, %rdi
    movq    $5, %rdx
    syscall

    xor     %rdi,%rdi
    mov     $60, %rax
    .cfi_def_cfa 7, 8
    syscall
    .cfi_endproc                ##close of .cfi_startproc

$ gcc -g -nostdlib -o hi hi.s && ./hi
Hello

Got it! Now trying to figure out what a CFA is. From http://dwarfstd.org/doc/DWARF4.pdf, Section 6.4:

An area of memory that is allocated on a stack called a “call frame.” The call frame is identified by an address on the stack. We refer to this address as the Canonical Frame Address or CFA. Typically, the CFA is defined to be the value of the stack pointer at the call site in the previous frame (which may be different from its value on entry to the current frame)

So .cfi_def_cfa_offset, .cfi_offset and .cfi_def_cfa_register emit no instructions. They only record, for a debugger or unwinder, how to compute the CFA and where saved registers live on the stack. This program doesn't need the stack at all, so I might as well delete those too.

$ cat hi.s

    .file   "hi.c"              ##tells 'as' that we are about to start a new logical file
    .section    .rodata         ##assembles the following code into section '.rodata'
.LC0:                           ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
                                ##  are guaranteed to be unique over the source code
                                ##  that allow the compiler to use names/simple notation
                                ##  to reference sections of code
                                ##But here, only .LC0 is actually referenced in the code

    .string "Hello"
    .text

    .globl  _start
_start:
    .cfi_startproc              ##used at the beginning of each function that should have an
                                ##entry in .eh_frame. It initializes some internal data
                                ##structures. Don't forget to close by .cfi_endproc
    leaq    .LC0(%rip), %rsi
    movq    $1, %rax
    movq    $1, %rdi
    movq    $5, %rdx
    syscall

    xor     %rdi,%rdi
    mov     $60, %rax
    syscall
    .cfi_endproc                ##close of .cfi_startproc

.cfi_startproc:

Used at the beginning of each function that should have an entry in .eh_frame.

What is .eh_frame? "When using languages that support exceptions, such as C++, additional information must be provided to the runtime environment that describes the call frames that must be unwound during the processing of an exception. This information is contained in the special sections .eh_frame and .eh_frame_hdr."

I don't need exception handling since I'm not using C++, so .cfi_startproc and .cfi_endproc can go as well.

$ cat hi.s

    .section    .rodata
.LC0:
    .string "Hello"

    .text
    .globl  _start
_start:
    leaq    .LC0(%rip), %rsi
    movq    $1, %rax
    movq    $1, %rdi
    movq    $5, %rdx
    syscall

    xor     %rdi,%rdi
    mov     $60, %rax
    syscall