GCC Inline-Assembly Error: "Operand size mismatch for 'int'"

Question

First, if somebody knows a function in the C standard library that prints a string without looking for a terminating zero byte, but instead takes the number of characters to print, please tell me!

Otherwise, I have this problem:

void printStringWithLength(char *str_ptr, int n_chars){

asm("mov 4, %rax");//Function number (write)
asm("mov 1, %rbx");//File descriptor (stdout)
asm("mov $str_ptr, %rcx");
asm("mov $n_chars, %rdx");
asm("int 0x80");
return;

}

GCC reports the following error for the "int" instruction:

"Error: operand size mismatch for 'int'"

Can somebody tell me the issue?

I know, but this here has other reasons I dont want to speak about :D — toskana98
Ooooh, I love a good pointless secret. stackoverflow.com/questions/1817577/… — Retired Ninja
That's not how inline assembly works. Things like variable names are not available in inline assembly, if you want to have these things, you need to use extended inline assembly. Read the article carefully! — fuz

fuz · Accepted Answer · 2017-09-02T17:52:45

There are a number of issues with your code. Let me go over them step by step.

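First of all, the int $0x80 system call interface is for 32-bit code only. You should not use it in 64-bit code, as it only accepts 32-bit arguments, so a 64-bit pointer such as str_ptr gets truncated. In 64-bit code, use the syscall interface. The system calls are similar, but the numbers differ (write is 1 rather than 4) and the arguments go in different registers (rdi, rsi, rdx instead of ebx, ecx, edx).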

Second, in AT&T assembly syntax, immediates must be prefixed with a dollar sign. So it's mov $4, %rax, not mov 4, %rax. The latter would attempt to move the content of address 4 to rax which is clearly not what you want.

Third, you can't just refer to the names of automatic variables in inline assembly. You have to tell the compiler what variables you want to use using extended assembly if you need any. For example, in your code, you could do:

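// write is system call 1 on x86-64; the syscall instruction itself clobbers RCX and R11
asm volatile("mov $1, %%eax; mov $1, %%edi; mov %0, %%rsi; mov %1, %%edx; syscall"
    :: "r"(str_ptr), "r"(n_chars) : "rdi", "rsi", "rdx", "rax", "rcx", "r11", "memory");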

Fourth, gcc is an optimizing compiler. By default it assumes that inline assembly statements are like pure functions, that the outputs are a pure function of the explicit inputs. If the output(s) are unused, the asm statement can be optimized away, or hoisted out of loops if run with the same inputs.

But a system call like write has a side-effect you need the compiler to keep, so it's not pure. You need the asm statement to run the same number of times and in the same order as the C abstract machine would. asm volatile will make this happen. (An asm statement with no outputs is implicitly volatile, but it's good practice to make it explicit when the side effect is the main purpose of the asm statement. Plus, we do want to use an output operand to tell the compiler that RAX is modified, as well as being an input, which we couldn't do with a clobber.)

You do always need to accurately describe your asm's inputs, outputs, and clobbers to the compiler using Extended inline assembly syntax. Otherwise you'll step on the compiler's toes (it assumes registers are unchanged unless they're outputs or clobbers). (Related: How can I indicate that the memory *pointed* to by an inline ASM argument may be used? shows that a pointer input operand alone does not imply that the pointed-to memory is also an input. Use a dummy "m" input or a "memory" clobber to force all reachable memory to be in sync.)

You should simplify your code by not writing your own mov instructions to put data into registers but rather letting the compiler do this. For example, your assembly becomes:

ssize_t retval;
asm volatile ("syscall"            // note only 1 instruction in the template
    : "=a"(retval)                 // RAX gets the return value
    : "a"(SYS_write), "D"(STDOUT_FILENO), "S"(str_ptr), "d"(n_chars)
    : "memory", "rcx", "r11"       // syscall destroys RCX and R11
  );

where SYS_write is defined in <sys/syscall.h> and STDOUT_FILENO in <unistd.h>. I am not going to explain all the details of extended inline assembly to you. Using inline assembly in general is usually a bad idea. Read the documentation if you are interested. (https://stackoverflow.com/tags/inline-assembly/info)

Fifth, you should avoid using inline assembly when you can. For example, to do system calls, use the syscall function from unistd.h:

syscall(SYS_write, STDOUT_FILENO, str_ptr, (size_t)n_chars);

This does the right thing. But it doesn't inline into your code, so use wrapper macros from musl, for example, if you really want to inline a system call instead of calling a libc function.
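As a self-contained sketch of that call (assuming Linux with glibc, which declares syscall() in <unistd.h> in GCC's default GNU mode; the helper name print_with_syscall is hypothetical):

```c
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall(), STDOUT_FILENO */

/* Hypothetical helper: returns the number of bytes written,
   or -1 with errno set (the usual libc convention). */
static long print_with_syscall(const char *str_ptr, size_t n_chars)
{
    return syscall(SYS_write, STDOUT_FILENO, str_ptr, n_chars);
}
```

Unlike the raw syscall instruction, the libc wrapper translates a negative kernel return value into -1 and stores the error code in errno.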

Sixth, always check if the system call you want to call is already available in the C standard library. In this case, it is, so you should just write

write(STDOUT_FILENO, str_ptr, n_chars);

and avoid all of this altogether.
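One caveat worth keeping in mind: write may transfer fewer bytes than requested, for example when writing to a pipe or when interrupted by a signal. A minimal sketch of a retry loop follows; the helper name write_all is an assumption for illustration, not part of any library.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>     /* ssize_t */
#include <unistd.h>        /* write */

/* Hypothetical helper: writes all n bytes of buf to fd.
   Returns 0 on success, -1 on error (errno is set by write). */
static int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t written = write(fd, buf, n);
        if (written < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before writing anything: retry */
            return -1;
        }
        buf += written;            /* skip the part that was already written */
        n -= (size_t)written;
    }
    return 0;
}
```

With this helper the call becomes write_all(STDOUT_FILENO, str_ptr, (size_t)n_chars).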

Seventh, if you prefer to use stdio, use fwrite instead:

fwrite(str_ptr, 1, n_chars, stdout);
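This also answers the first part of the question. The sketch below wraps both stdio options; the helper names print_bytes and print_prefix are hypothetical. fwrite outputs exactly the requested number of bytes, while the "%.*s" conversion prints at most that many and stops early if it meets a zero byte.

```c
#include <stdio.h>

/* Hypothetical helper: writes exactly n_chars bytes, embedded zero bytes
   included. Returns the number of bytes written (n_chars on success). */
static size_t print_bytes(FILE *out, const char *str_ptr, size_t n_chars)
{
    return fwrite(str_ptr, 1, n_chars, out);
}

/* Alternative: "%.*s" prints at most n_chars bytes and stops early at a
   zero byte, so the buffer does not need to be zero-terminated.
   Returns what fprintf returns: bytes printed, or a negative value on error. */
static int print_prefix(FILE *out, const char *str_ptr, int n_chars)
{
    return fprintf(out, "%.*s", n_chars, str_ptr);
}
```

Both go through the stdio buffer, so the output may not appear until the stream is flushed, and it should not be interleaved carelessly with direct write calls on the same file descriptor.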
