2
votes

I am just starting with x86 assembly and was trying some basics with the MOV instruction. (code below)

BITS 32

SECTION .data                   
    somedata: db "Hello world",10

SECTION .text                   
    global _start                   
_start: 
    mov eax, somedata
    mov al, [eax]
    mov edx, [somedata]

I don't understand why nasm is using RIP-relative addressing when BITS 32 is specified in the assembly (I thought RIP-relative addressing exists only in 64-bit mode). Moreover, it is using RAX in 32-bit mode. If I do not specify anything, it does not use relative addressing and uses EAX.

Code with BITS 32

Disassembly of section .text:

00000000004000b0 <_start>:
  4000b0:   b8 c0 00 60 00          mov    eax,0x6000c0
  4000b5:   8a 00                   mov    al,BYTE PTR [rax]
  4000b7:   8b 15 c0 00 60 00       mov    edx,DWORD PTR [rip+0x6000c0]        # a0017d <_end+0x4000ad>

Code without BITS 32

Disassembly of section .text:

00000000004000b0 <_start>:
  4000b0:   b8 c0 00 60 00          mov    eax,0x6000c0
  4000b5:   67 8a 00                mov    al,BYTE PTR [eax]
  4000b8:   8b 14 25 c0 00 60 00    mov    edx,DWORD PTR ds:0x6000c0

I know it is not the assembler, it is me. What is it that I am doing wrong?

PS:

  • Using nasm, and 64 bit computer with linux.

  • Assembling using nasm -f elf64 -F stabs -g sandbox.asm -o sandbox.o

  • Disassembling using objdump -M intel -d sandbox

I also tried the following assembler and linker flags:

nasm -f elf32 -F stabs -g sandbox.asm -o sandbox.o
ld -oformat=elf32-i386 -o sandbox sandbox.o

but that does not work either; the linker fails with ld: i386 architecture of input file `sandbox.o' is incompatible with i386:x86-64 output

Post the command lines you are using to assemble and, most importantly, to disassemble. IMO you are disassembling 32-bit code as if it were 64-bit code (as there's no such thing as RIP-relative addressing in 32-bit code). - Matteo Italia
@MatteoItalia I have added them to my question. - Arjob Mukherjee
Using the BITS directive is almost certainly wrong. As a rule of thumb, it's either superfluous or wrong in almost all situations. - fuz

2 Answers

4
votes

TL;DR: Never use the BITS directive unless it's necessary.

The BITS directive does not change the output file type selected by -felf64 or -felf32. (-felf is a synonym for -felf32, in case you ever see that used in examples.)
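One way to see this for yourself is to look at the ELF header of the object file, which records the file class and target machine independently of whatever machine code is inside. The sketch below is only an illustration: the helper names describe_elf_header and describe_elf_file are made up for this example, and it recognises only the i386 and x86-64 machine codes.

```python
import struct

# Values defined by the ELF specification.
ELF_CLASSES = {1: "ELF32", 2: "ELF64"}        # e_ident[EI_CLASS]
ELF_MACHINES = {3: "i386", 62: "x86-64"}      # e_machine: EM_386, EM_X86_64


def describe_elf_header(header: bytes) -> str:
    """Return e.g. 'ELF64 x86-64' from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]   # 1 = 32-bit file class, 2 = 64-bit file class
    ei_data = header[5]    # 1 = little-endian, 2 = big-endian
    byte_order = "<" if ei_data == 1 else ">"
    # e_machine is a 16-bit field at offset 18 in both ELF32 and ELF64.
    (e_machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return "{} {}".format(
        ELF_CLASSES.get(ei_class, "unknown class"),
        ELF_MACHINES.get(e_machine, "machine {}".format(e_machine)),
    )


def describe_elf_file(path: str) -> str:
    """Hypothetical convenience wrapper: read the header from a file on disk."""
    with open(path, "rb") as f:
        return describe_elf_header(f.read(20))
```

Calling describe_elf_file("sandbox.o") on the object built with -f elf64 should report an ELF64 x86-64 file whether or not the source contains BITS 32, which is why objdump decodes it as 64-bit code.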

To make 32-bit static executables, I use an asm-link shell script that ends up doing this:

nasm -felf32 -g -Fdwarf foo.asm   &&
ld -melf_i386 -o foo foo.o

The stabs debug format is obsolete, although as long as your debugger supports it, it's probably fine for mapping asm source lines to asm instructions. Anyway, -Fstabs is the default if you use -g. (I haven't read it all, but https://www.ibm.com/developerworks/library/os-debugging/index.html has some info about STAB vs. DWARF.)


Most of the time, the BITS directive is at best useless and at worst actively harmful. Without it, trying to build 32-bit-only code into a 64-bit object file gives you a helpful error (push ebx, for example, is not encodable in 64-bit mode). With it, NASM quietly emits 32-bit machine code into a 64-bit object file and lets things like this happen. (Although that error wouldn't have saved you here, because all of your code assembles both ways.)

The only time BITS is useful is when you want to use nasm -fbin and make a flat binary you can feed to ndisasm or use as shellcode, or define ELF or whatever other metadata headers yourself with db (A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux). nasm doesn't provide a command-line option to change the BITS 16 default mode for -fbin.

Or if you actually want to mix 16, 32, and 64-bit code in a file that boots in 16-bit and switches to 64-bit mode: that's the primary use-case for BITS. Or to include some BITS 32 or BITS 16 machine code as data in your 64-bit executable.


Don't slap a BITS 32 line at the top of your file as part of the boilerplate; it's neither helpful nor good practice. Use a comment like ;;; 32-bit x86 Linux code, NASM syntax if you want to describe what's in this source file and how it can be built / run.

You can and should use DEFAULT REL though, so if you're building 64-bit code you'll get RIP-relative addressing modes for memory operands like [somedata] (symbol name with no GP registers). That's one byte shorter than 32-bit absolute addressing modes, and will work in a PIE executable.

Fun fact: 32-bit mode has 2 redundant ways to encode [disp32] absolute addressing modes. x86-64 repurposed the shorter one (no SIB byte) as RIP-relative. That's why your 64-bit disassembly of 32-bit machine code has DWORD PTR [rip+0x6000c0] where the rel32 is the absolute address of the symbol.
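To make the two encodings concrete, here is a rough Python sketch that decodes the ModRM (and SIB) bytes of the two mov edx, ... encodings from the question's disassembly. The function decode_mov_load is made up for this illustration and deliberately handles only opcode 8b with the two disp32-only forms; it is not a general disassembler.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_load(code: bytes, mode: int) -> str:
    """Decode `8b /r` (mov r32, r/m32) with a disp32-only memory operand.

    `mode` is 32 or 64: the mode the bytes are interpreted in.
    """
    if code[0] != 0x8B:
        raise ValueError("only opcode 8b (mov r32, r/m32) is handled")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0:
        raise ValueError("only mod=00 forms are handled")

    if rm == 0b101:
        # Short form, no SIB byte: absolute disp32 in 32-bit mode,
        # RIP + disp32 in 64-bit mode.
        disp = int.from_bytes(code[2:6], "little", signed=True)
        if mode == 64:
            mem = "[rip{:+#x}]".format(disp)
        else:
            mem = "[{:#x}]".format(disp & 0xFFFFFFFF)
    elif rm == 0b100:
        # A SIB byte follows; index=100 (none) with base=101 and mod=00
        # means "disp32 only", which is absolute in both modes.
        sib = code[2]
        index, base = (sib >> 3) & 7, sib & 7
        if index != 0b100 or base != 0b101:
            raise ValueError("only the no-base, no-index SIB form is handled")
        disp = int.from_bytes(code[3:7], "little", signed=True)
        mask = 0xFFFFFFFF if mode == 32 else 0xFFFFFFFFFFFFFFFF
        mem = "[{:#x}]".format(disp & mask)
    else:
        raise ValueError("register-based addressing is not handled")
    return "mov {}, {}".format(REG32[reg], mem)


short_form = bytes.fromhex("8b15c0006000")    # from the BITS 32 build
sib_form = bytes.fromhex("8b1425c0006000")    # from the build without BITS 32

print(decode_mov_load(short_form, 32))  # mov edx, [0x6000c0]
print(decode_mov_load(short_form, 64))  # mov edx, [rip+0x6000c0]
print(decode_mov_load(sib_form, 64))    # mov edx, [0x6000c0]
```

The same six bytes mean different things depending on the decoding mode, while the seven-byte SIB form is absolute either way.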

4
votes
nasm -f elf64 -F stabs -g sandbox.asm -o sandbox.o

This will always produce a 64-bit ELF object file, regardless of the fact that you are putting 32-bit code inside it, and linking it gives a 64-bit executable. The disassembler therefore decodes your 32-bit machine code as if it were 64-bit code, hence the bizarre results. The disassembly still resembles the original code because most instructions are encoded the same, or nearly the same, in 64-bit mode as in 32-bit mode; what changes is details such as the default address size (hence [rax] instead of [eax]) and the meaning of the disp32-only addressing form.
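As a quick check of that misdecoding, the address in objdump's trailing comment (# a0017d) can be reproduced by hand: in 64-bit mode a RIP-relative operand is resolved against the address of the next instruction. A small sketch, where rip_relative_target is just an illustrative helper name:

```python
def rip_relative_target(insn_address: int, insn_length: int, disp32: int) -> int:
    """Address referenced by [rip + disp32] for an instruction at insn_address."""
    next_insn = insn_address + insn_length   # RIP already points past the instruction
    return (next_insn + disp32) & 0xFFFFFFFFFFFFFFFF


# Values taken from the question: 8b 15 c0 00 60 00 at 0x4000b7, 6 bytes long.
target = rip_relative_target(0x4000B7, 6, 0x6000C0)
print(hex(target))  # 0xa0017d
```

That address has nothing to do with somedata at 0x6000c0, which is what the 32-bit encoding actually meant.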

Use -f elf32 to get a 32-bit ELF object file, and link it as 32-bit code (for example with ld -m elf_i386) to get a 32-bit executable.