Why is the assembly produced by objdump so huge?

Question

I am trying to view the assembly for my simple C application. I first disassembled the binary with objdump, which produced a file of about 4.3 MB containing 103228 lines of assembly code. I then generated assembly by passing the -S and -save-temps flags to gcc.

I have used the following three commands:

 1. arm-linux-gnueabi-objdump -d hello_simple > hello_simple.dump
 2. arm-linux-gnueabi-gcc -save-temps -static hello_simple.c -o hello_simple -lm
 3. arm-linux-gnueabi-gcc -S -static hello_simple.c -o hello_simple.asm -lm

Commands 2 and 3 produce exactly the same result: 65 lines of assembly code. I understand that objdump adds some extra detail.

But why is the difference so large?

EDIT1: I have used the following command to build that binary:

arm-linux-gnueabi-gcc -static hello_simple.c -o hello_simple -lm

EDIT2: The -static and -lm flags may look unnecessary here, but I have to run this binary on a simulator after adding some assembly components at compile time, which makes them mandatory.

So, which assembly code should I consider as the most relevant during my analysis of execution traces? (I know it's another question but it would be handy to cover it in the same answer.)

If you use -static linkage for your program, the executable will also contain libc and any other libraries linked into it. Those will be dumped by objdump as well. — Alex Skalozub
re: your last edit. -static -lm have no effect on command 3. They obviously do make a difference for the other commands, since those do produce binaries. — Peter Cordes
That's command 2., where they do have an effect, like I said. — Peter Cordes
Why two negatives and two close votes without any comment? I think SO philosophy is to help and guide rather than just condemn. Had there been some comments as well, I would have learnt something at least. — tod

Peter Cordes · Accepted Answer · 2016-02-14T15:05:41

Commands 2 and 3 just save the asm for your own functions.

The objdump output (command 1) also contains the CRT startup code and, because you linked statically, the code of every library function you called.
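To see where the lines in the dump actually come from, one option is to count instructions per function. The following is only a sketch: count_insns_per_symbol is a hypothetical helper name, and it assumes the usual `objdump -d` text layout (a label line such as `00010400 <main>:` followed by instruction lines that start with a hex address and a colon).

```shell
# count_insns_per_symbol: read `objdump -d` text on stdin and print
# "<instruction count> <symbol>" for every function label, largest first.
# Hypothetical helper; assumes the default objdump -d text layout.
count_insns_per_symbol() {
    awk '
        # A label line looks like: 00010400 <main>:
        /^[0-9a-f]+ <.*>:$/ {
            sym = $2
            gsub(/[<>:]/, "", sym)   # strip the surrounding "<", ">" and ":"
            next
        }
        # An instruction line starts with optional spaces, a hex address, a colon.
        sym != "" && /^ *[0-9a-f]+:/ { count[sym]++ }
        END { for (s in count) print count[s], s }
    ' | sort -rn
}
```

Usage might look like `arm-linux-gnueabi-objdump -d hello_simple | count_insns_per_symbol | head`. In a statically linked binary, the functions you wrote yourself should account for only a small share of the total.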

Note that for 3, -static and -lm don't do anything, because you're not linking. gcc foo.c -S -O3 -fverbose-asm -o- | less is often handy.

I notice that none of your command lines included a -O3, or a -march=. You should compile with optimization on, and have gcc optimize your code for the target hardware.

.s is the standard suffix for machine-generated asm. (.S for hand-written asm: gcc foo.S will run it through cpp first). gcc -S produces a .s, the same way -c produces a .o.

For x86, .asm is usually only used for Intel-syntax (NASM/YASM), but IDK what the conventions are for ARM.

So, which assembly code should I consider as the most relevant during my analysis of execution traces?

It depends what you're trying to learn! If you have a good sense of how "expensive" each library function call is (in terms of number of instructions, number of branches polluting the branch-predictors, and data-cache pollution), then you don't need to trace execution through library calls. If you have math library functions that are used from some of your inner loops, then it's worth looking at them if the code is time-critical.

Usually a profiler or single-stepping in a debugger is useful for that, though. Just having disassembly output of a lot of library code is usually just clutter.
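If you do want to read the disassembly of the linked binary, it can help to cut the dump down to the one function you care about. A minimal sketch, again assuming the default `objdump -d` text layout; extract_symbol is a hypothetical helper name, not part of binutils:

```shell
# extract_symbol SYMBOL: read `objdump -d` text on stdin and print only the
# block that belongs to SYMBOL (its label line plus its instructions).
# Hypothetical helper; assumes label lines of the form "00010400 <main>:".
extract_symbol() {
    awk -v want="<$1>:" '
        # Every label line switches printing on or off.
        /^[0-9a-f]+ <.*>:$/ { printing = ($2 == want) }
        # Print non-empty lines while inside the wanted block.
        printing && NF { print }
    '
}
```

For example: `arm-linux-gnueabi-objdump -d hello_simple | extract_symbol main`. Recent binutils releases also document a `--disassemble=symbol` form of the option that does this directly, if your toolchain's objdump supports it.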
