I'm learning how to profile my code with gprof. For one of my applications, I have the following output:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 10.27      1.20     1.20                             Location::get_type() const (location.cpp:20 @ 40a4bd)
Farther down I see this:
  1.20      4.98     0.14 34662692     0.00     0.00  Location::get_type() const (location.cpp:19 @ 40a4ac)
Here is the function
char Location::get_type() const {
    return type;
}
I'm assuming the first gprof line refers to the total time the function needs to execute, while the second line refers to just the time needed by the return statement. I have other getters in the same class that return ints, but for those the difference between the function time and the return-statement time is only about 0.1 seconds, whereas for the times I posted the difference is 1.06 seconds (the other getters are called about 2 million fewer times, which is small compared to the total number of calls). What could explain the higher time for the function call as opposed to the one line of code in it?
It might be worth mentioning that I compiled with -g -pg since I'm using gprof in line-by-line mode.
Edit: One of the answers suggested I look at the assembly output. I can't understand it, so I'll post it here. I've posted the assembly code for two functions. The first one is get_floor(), which is relatively fast (~0.10 seconds). The second one is get_type(), which is slow.
_ZNK8Location9get_floorEv:
.LFB5:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movl 8(%rax), %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE5:
.size _ZNK8Location9get_floorEv, .-_ZNK8Location9get_floorEv
.align 2
.globl _ZNK8Location8get_typeEv
.type _ZNK8Location8get_typeEv, @function
_ZNK8Location8get_typeEv:
.LFB6:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movzbl 12(%rax), %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
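For anyone comparing the two listings: they appear to be identical except for the load instruction. get_floor() uses `movl 8(%rax), %eax` (a 32-bit load from offset 8 of the object), while get_type() uses `movzbl 12(%rax), %eax` (a one-byte load from offset 12, zero-extended to 32 bits). Below is a minimal sketch of a layout that would be consistent with those offsets. The names `LocationSketch`, `unknown_a`, and `unknown_b` are hypothetical; the question does not show the real class definition, so the first 8 bytes are a guess.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Location. Only the offsets 8 and 12 are taken
// from the assembly in the question; everything else is an assumption.
struct LocationSketch {
    int unknown_a;  // assumption: 8 bytes of other members come first
    int unknown_b;
    int floor;      // would be read by `movl 8(%rax), %eax`
    char type;      // would be read by `movzbl 12(%rax), %eax`

    int get_floor() const { return floor; }
    char get_type() const { return type; }
};

// Compile-time check that the sketch matches the offsets seen in the listing
// (holds on targets where int is 4 bytes, as on x86-64).
static_assert(offsetof(LocationSketch, floor) == 8, "floor should sit at offset 8");
static_assert(offsetof(LocationSketch, type) == 12, "type should sit at offset 12");
```

Under this assumption both getters perform a single load of similar cost, so the assembly by itself does not obviously account for the gap between the two reported times.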
gprof. Did a teacher recommend it? Did a blog or a book or some documentation recommend it? It has a number of issues. – Mike Dunlavey