I think the question you are trying to ask is: “I know that functions like printf and scanf are implemented by the C runtime library. But I can use them without telling my compiler and/or IDE to link my program with the C runtime library. Why don’t I need to do that?”
The answer to that question is: “Programs that don’t need to be linked with the C runtime library are very, very rare. Even if you don’t explicitly use any library functions, you will still need the startup code, and the compiler might issue calls to memcpy, floating-point emulation functions, and so on ‘under the hood.’ Therefore, as a convenience, the compiler automatically links your program with the C runtime library, unless you tell it not to do that.”
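To make the "under the hood" part concrete, here is a minimal sketch of source code that names no library function but may still end up needing one. The struct big type and the function names are made up for illustration, and whether a call is actually emitted depends on the compiler, the target, and the optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration: large enough that many compilers
   prefer calling memcpy over copying it inline. */
struct big {
    unsigned char bytes[4096];
};

/* No library function is named here, yet a compiler may lower this
   plain assignment to a call to memcpy. */
void copy_big(struct big *dst, const struct big *src)
{
    *dst = *src;
}

/* On targets without a native 64-bit multiply, GCC may lower this
   to a call to __muldi3 in its support library. */
long long mul64(long long a, long long b)
{
    return a * b;
}

/* Returns 1 if copy_big reproduced every byte of the source, else 0. */
int copy_big_roundtrip_ok(void)
{
    static struct big src, dst;
    for (size_t i = 0; i < sizeof src.bytes; i++)
        src.bytes[i] = (unsigned char)(i * 31u);
    copy_big(&dst, &src);
    for (size_t i = 0; i < sizeof src.bytes; i++)
        if (dst.bytes[i] != src.bytes[i])
            return 0;
    return 1;
}

/* Sanity check of the two functions above; call it from your own main. */
void self_check(void)
{
    assert(copy_big_roundtrip_ok());
    assert(mul64(4294967296LL, 3) == 12884901888LL);
}
```

Compiling this with gcc -S and searching the generated assembly for memcpy or __muldi3 should show whether your compiler chose a library call on your target; the answer can change between optimization levels.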
You will have to consult the documentation for your compiler to learn how to tell it not to link in the C runtime library. GCC uses the -nostdlib command-line option. Below, I demonstrate the hoops you have to jump through to make that work...
$ cat > test.c
#include <stdio.h>
int main(void) { puts("hello world"); return 0; }
^D
$ gcc -nostdlib test.c && { ./a.out; echo $?; }
/usr/bin/ld: warning: cannot find entry symbol _start
/tmp/cc8svIx5.o: In function ‘main’:
test.c:(.text+0xa): undefined reference to ‘puts’
collect2: error: ld returned 1 exit status
puts is obviously in the C library, but so is this mysterious "entry symbol _start". Turn off the C library and you have to provide that yourself, too...
$ cat > test.c
int _start(void) { return 0; }
^D
$ gcc -nostdlib test.c && { ./a.out; echo $?; }
Segmentation fault
139
It links now, but we get a segmentation fault, because _start has nowhere to return to! The operating system expects it to call _exit. OK, let's do that...
$ cat > test.c
extern void _exit(int);
void _start(void) { _exit(0); }
^D
$ gcc -nostdlib test.c && { ./a.out; echo $?; }
/tmp/ccuDrMQ9.o: In function `_start':
test.c:(.text+0xa): undefined reference to `_exit'
collect2: error: ld returned 1 exit status
... nuts, _exit is a function in the C runtime library, too! Raw system call time...
$ cat > test.c
#include <unistd.h>
#include <sys/syscall.h>
void _start(void) { syscall(SYS_exit, 0); }
^D
$ gcc -nostdlib test.c && { ./a.out; echo $?; }
/tmp/cchtZnbP.o: In function `_start':
test.c:(.text+0x14): undefined reference to `syscall'
collect2: error: ld returned 1 exit status
... nope, syscall is also a function in the C runtime. I guess we just have to use assembly!
$ cat > test.S
#include <sys/syscall.h>
.text
.globl _start
.type _start, @function
_start:
        movq $SYS_exit, %rax
        movq $0, %rdi
        syscall
^D
$ gcc -nostdlib test.S && { ./a.out; echo $?; }
0
And that, finally, works. On my computer. It wouldn't work on a different operating system, with a different assembly-level convention for system calls.
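If you would rather stay in a .c file, GCC's extended inline assembly can issue the same instruction. The following is only a sketch under the same assumptions as the assembly above (Linux on x86-64); raw_syscall3 and self_check are names I made up, not existing APIs, and the headers are used for their constants only.

```c
#include <assert.h>
#include <errno.h>        /* EBADF: a constant only, no code */
#include <sys/syscall.h>  /* SYS_getpid, SYS_close, SYS_exit: constants only */

#if !defined(__linux__) || !defined(__x86_64__)
#error "This sketch assumes the Linux x86-64 system call convention."
#endif

/* Hypothetical helper: issue a system call with up to three arguments.
   Linux x86-64 takes the call number in rax and the arguments in rdi,
   rsi, rdx; the syscall instruction clobbers rcx and r11. */
long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
                      : "rcx", "r11", "memory");
    return ret;
}

/* Checking aid only: assert comes from the C library, so this function
   would have to go when building with -nostdlib. */
void self_check(void)
{
    /* A process ID is always positive. */
    assert(raw_syscall3(SYS_getpid, 0, 0, 0) > 0);
    /* The raw interface reports failure as a negated error number;
       setting errno is the C library's job. */
    assert(raw_syscall3(SYS_close, -1, 0, 0) == -EBADF);
}
```

In a -nostdlib build, a void _start(void) that calls raw_syscall3(SYS_exit, 0, 0, 0) and never returns should play the same role as test.S. It is exactly as unportable: a different operating system or CPU needs different registers and a different instruction.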
You might now be wondering what the heck -nostdlib is even good for, if you have to drop down to assembly language just to make system calls. It's intended to be used when compiling completely self-contained, low-level system programs like the bootloader, the kernel, and (parts of) the C runtime itself: things that were going to have to implement their own everything anyway.
If we had it to do all over again from scratch, it might well make sense to separate out a low-level language-independent runtime, with just the syscall wrappers, language-independent process startup code, and the functions that any language's compiler might need to call "under the hood" (memcpy, _Unwind_RaiseException, __muldi3, that sort of thing). The problem with that idea is it rapidly suffers mission creep: do you include errno? Generic threading primitives? (Which ones, with which semantics?) The dynamic linker? An implementation of malloc, which several of the above things need? Windows's ntdll.dll began as this concept, and it's 1.8 MB on disk in Windows 10, which is (slightly) bigger than libc.so + ld.so on my Linux partition. And it's rare and difficult to write a program that only uses ntdll.dll, even if you're Microsoft (the only example I'm sure of is csrss.exe, which might as well be a kernel component).