1
votes

As I understand it, an inline function does not use the stack for copying its parameters; instead, the body of the function is substituted wherever it is called.

Consider these two functions:

inline void add(int a) {
   a++; 
} // does nothing, a won't be changed
inline void add(int &a) {
   a++; 
} // changes the value of a

If the stack is not used for passing the parameters, how does the compiler know whether a variable will be modified or not? What does the code look like after the calls to these two functions are replaced?

4
I'd say it should give a warning and optimize it out as long as there aren't any side effects. - πάντα ῥεῖ
the compiler knows whether or not the function modifies memory. - tristan
I'm not sure I understand what you're asking, but there are no parameters if a function gets inlined. There's no function call. If you write int a = 0; add(a); after inlining it'll be just int a = 0; a++; and after further optimization just int a = 1. - jrok
Your a is stored somewhere - either in memory or on stack of the caller of the inlined function. Compiler does what it sees fit with the stack of the caller to accommodate the inlined function's needs. After all, inlined function might use some local variables within itself. Compiler will put them on the stack of the caller. - lapk
I've just checked compiling both functions into assembly using g++ -finline-functions -S q.cpp and neither function gets inlined. - Igor Popov

4 Answers

0
votes

What makes you think there is a stack? And even if there is, what makes you think it would be used for passing parameters?

You have to understand that there are two levels of reasoning:

  • the language level: where the semantics of what should happen are defined
  • the machine level: where said semantics, encoded into CPU instructions, are carried out

At the language level, if you pass a parameter by non-const reference it might be modified by the function. The language level knows not what this mysterious "stack" is. Note: the inline keyword has little to no effect on whether a function call is inlined, it just says that the definition is in-line.
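As a minimal sketch of the language-level rule, the snippet below renames the question's two functions (the names add_by_value, add_by_ref and after_both_calls are made up for illustration). The renaming is needed because two overloads add(int) and add(int&) are ambiguous when called with a variable. The asserts express what the language guarantees, whether or not the compiler inlines either call.

```cpp
#include <cassert>

// Hypothetical names: add(int) and add(int&) as written in the question
// would be ambiguous when called with an lvalue, so they are renamed here.
inline void add_by_value(int a) { a++; }  // increments a private copy
inline void add_by_ref(int &a) { a++; }   // increments the caller's object

// Returns the caller's variable after both calls.
inline int after_both_calls() {
    int x = 0;
    add_by_value(x);  // x is still 0, inlined or not
    assert(x == 0);
    add_by_ref(x);    // x is now 1, inlined or not
    assert(x == 1);
    return x;
}
```

Calling after_both_calls() from main returns 1 at any optimization level; only the generated instructions differ.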

At machine level... there are many ways to achieve this. When making a function call, you have to obey a calling convention. This convention defines how the function parameters (and return types) are exchanged between caller and callee and who among them is responsible for saving/restoring the CPU registers. In general, because it is so low-level, this convention changes on a per CPU family basis.

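For example, on x86-64 the first few integer or pointer parameters are passed directly in CPU registers (six under the System V ABI used on Linux, four under the Microsoft x64 convention), whilst any remaining parameters are passed on the stack. On 32-bit x86 with the default cdecl convention, by contrast, all parameters go on the stack.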
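To see that the calling convention only matters when a real call is emitted, here is a rough sketch assuming GCC or Clang: noinline and always_inline are compiler-specific attributes, and the function names are invented for this example. One function is forced to stay a genuine call, the other is forced inline, and both leave the variable in the same state.

```cpp
#include <cassert>

// Assumption: GCC/Clang attribute syntax; other compilers spell this differently.
// A genuine call is emitted, so the argument travels per the calling convention.
__attribute__((noinline)) void add_called(int &a) { a++; }

// The body is substituted at the call site, so no calling convention is involved.
inline __attribute__((always_inline)) void add_inlined(int &a) { a++; }

// Hypothetical helper: applies both mechanisms to the same variable.
inline int both_mechanisms() {
    int x = 0;
    add_called(x);   // x == 1 after a real call
    add_inlined(x);  // x == 2 after inline substitution
    return x;
}
```

Comparing the output of g++ -S for the two call sites should show a call instruction for add_called and a plain increment for add_inlined.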

0
votes

I have checked what GCC, at least, does with it if you force it to inline the functions:

inline static void add1(int a) __attribute__((always_inline)); 
void add1(int a) {
   a++; 
} // does nothing, a won't be changed

inline static void add2(int &a) __attribute__((always_inline));
void add2(int &a) {
   a++; 
} // changes the value of a

int main() {

label1:
    int b = 0;
    add1(b);

label2:
    int a = 0;
    add2(a);

    return 0;
}

The assembly output for this looks like:

.file   "test.cpp"
.text
.globl  main
.type   main, @function
main:
.LFB2:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    subl    $16, %esp
.L2:
    movl    $0, -4(%ebp)
    movl    -4(%ebp), %eax
    movl    %eax, -8(%ebp)
    addl    $1, -8(%ebp)
.L3:
    movl    $0, -12(%ebp)
    movl    -12(%ebp), %eax
    addl    $1, %eax
    movl    %eax, -12(%ebp)
    movl    $0, %eax
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
.LFE2:

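Interestingly, even the first call, add1(b), which has no effect visible outside the function, isn't optimized out: the copy and the increment are still emitted. That is what one would expect if no optimization level was requested, since always_inline forces the substitution but does not by itself remove dead code.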
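Read back into C++, the assembly above corresponds roughly to the hand-written version below. This is only an illustrative sketch: the names main_after_inlining, a_copy and a_ref are invented, and the compiler does not literally produce source code like this.

```cpp
#include <cassert>

// Hand-written approximation of main() after add1 and add2 are inlined.
inline int main_after_inlining() {
    int b = 0;
    {
        int a_copy = b;  // add1(b): the by-value parameter becomes a separate copy
        a_copy++;        // matches addl $1, -8(%ebp); b itself is untouched
    }
    int a = 0;
    {
        int &a_ref = a;  // add2(a): the reference parameter is just another name for a
        a_ref++;         // the result is stored back into a's own slot, -12(%ebp)
    }
    assert(b == 0);
    assert(a == 1);
    return a - b;        // 1
}
```

The by-value case needs its own stack slot for the copy, while the by-reference case operates on the caller's variable directly, which is how the difference between the two signatures survives inlining.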

0
votes

If the stack is not used for sending the parameters, how does the compiler know if a variable will be modified or not?

As Matthieu M. already pointed out, the language itself knows nothing about a stack. You add the inline keyword to a function just to give the compiler a hint, expressing a wish that this routine be inlined. Whether that actually happens depends entirely on the compiler.

The compiler tries to predict what the benefits of inlining would be in the particular circumstances. If it decides that inlining the function would make the code slower or unacceptably larger, it will not inline it. It also generally cannot inline a call it cannot see through, such as a call made through a function pointer used for callbacks, and a function exported from a dynamic or static library still needs an out-of-line copy.
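The function-pointer case can be sketched as follows. The names apply_n and pointer_demo are hypothetical. Note that an optimizer may still manage to inline through the pointer if it can prove what the pointer refers to, so this only shows where an out-of-line body is generally needed.

```cpp
#include <cassert>

inline void add_by_ref(int &a) { a++; }

// Hypothetical helper: calls whatever function it is handed, n times.
inline void apply_n(void (*fn)(int &), int &value, int n) {
    for (int i = 0; i < n; ++i) {
        fn(value);  // indirect call: the callee is not known from this code alone
    }
}

inline int pointer_demo() {
    int x = 0;
    add_by_ref(x);               // direct call: a candidate for inlining
    apply_n(&add_by_ref, x, 3);  // address taken: an addressable copy must exist
    return x;                    // 1 + 3 increments
}
```

Either way the observable result is the same; pointer_demo() returns 4 whether the calls are inlined or not.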

What does the code look like after replacing the calls of these two functions?

At the moment none of these functions is inlined when compiled with

g++ -finline-functions -S main.cpp

and you can see this because, in the assembly generated for main in this program,

void add1(int a) {
    a++;
}
void add2(int &a) {
   a++; 
}

inline void add3(int a) {
   a++; 
} // does nothing, a won't be changed

inline void add4(int &a) {
   a++; 
} // changes the value of a

inline int f() { return 43; }

int main(int argc, char** argv) {

    int a = 31;
    add1(a);
    add2(a);
    add3(a);
    add4(a);
    return 0;
}

we see a call to each routine being made:

main:
.LFB8:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        pushq   %rbp
        .cfi_def_cfa_offset 16
        movq    %rsp, %rbp
        .cfi_offset 6, -16
        .cfi_def_cfa_register 6
        subq    $32, %rsp
        movl    %edi, -20(%rbp)
        movq    %rsi, -32(%rbp)
        movl    $31, -4(%rbp)
        movl    -4(%rbp), %eax
        movl    %eax, %edi
        call    _Z4add1i        // function call
        leaq    -4(%rbp), %rax
        movq    %rax, %rdi
        call    _Z4add2Ri       // function call
        movl    -4(%rbp), %eax
        movl    %eax, %edi
        call    _Z4add3i        // function call
        leaq    -4(%rbp), %rax
        movq    %rax, %rdi
        call    _Z4add4Ri       // function call
        movl    $0, %eax
        leave
        ret
        .cfi_endproc

Compiling with -O1 removes all of these calls from main entirely, because they have no observable effect. However, adding

__attribute__((always_inline))

allows us to see what happens when code is inlined:

void add1(int a) {
    a++;
}

void add2(int &a) {
   a++; 
}

inline static void add3(int a) __attribute__((always_inline));
inline void add3(int a) {
   a++; 
} // does nothing, a won't be changed

inline static void add4(int& a) __attribute__((always_inline));
inline void add4(int &a) {
   a++; 
} // changes the value of a

int main(int argc, char** argv) {

    int a = 31;
    add1(a);
    add2(a);
    add3(a);
    add4(a);
    return 0;
}

Now g++ -finline-functions -S main.cpp produces:

main:
.LFB9:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        pushq   %rbp
        .cfi_def_cfa_offset 16
        movq    %rsp, %rbp
        .cfi_offset 6, -16
        .cfi_def_cfa_register 6
        subq    $32, %rsp
        movl    %edi, -20(%rbp)
        movq    %rsi, -32(%rbp)
        movl    $31, -4(%rbp)
        movl    -4(%rbp), %eax
        movl    %eax, %edi
        call    _Z4add1i        // function call
        leaq    -4(%rbp), %rax
        movq    %rax, %rdi
        call    _Z4add2Ri       // function call
        movl    -4(%rbp), %eax
        movl    %eax, -8(%rbp)
        addl    $1, -8(%rbp)    // addition is here, there is no call
        movl    -4(%rbp), %eax
        addl    $1, %eax        // addition is here, no call again
        movl    %eax, -4(%rbp)
        movl    $0, %eax
        leave
        ret
        .cfi_endproc

0
votes

The inline keyword has two key effects. One effect is that it is a hint to the implementation that "inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism." This usage is a hint, not a mandate, because "an implementation is not required to perform this inline substitution at the point of call".

The other principal effect is how it modifies the one definition rule. Per the ODR, a program must contain exactly one definition of any given non-inline function that is odr-used in the program. That doesn't quite work with an inline function because "An inline function shall be defined in every translation unit in which it is odr-used ...". Use the same inline function in one hundred different translation units and the linker will be confronted with one hundred definitions of the function. This isn't a problem because those multiple implementations of the same function "... shall have exactly the same definition in every case." One way to look at this: There still is only one definition; it just looks like there are a whole bunch to the linker.
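One visible consequence of "there still is only one definition" is that a static local variable inside an inline function with external linkage is a single object shared by the whole program. The single-file sketch below only imitates that situation: in a real project next_id would sit in a header included by several .cpp files, and use_from_unit_a and use_from_unit_b (hypothetical names) would live in different translation units.

```cpp
#include <cassert>

// Imagine this definition in a header included by many .cpp files.
inline int next_id() {
    static int counter = 0;  // one object program-wide, not one per translation unit
    return ++counter;
}

// Stand-ins for code that would live in two different translation units.
inline int use_from_unit_a() { return next_id(); }
inline int use_from_unit_b() { return next_id(); }
```

Because both callers share the same counter, a call through use_from_unit_b made right after a call through use_from_unit_a returns the next value in the sequence rather than starting again from 1.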

Note: All quoted material is from section 7.1.2 of the C++11 standard.