Can lambdas translate into functions?

Question

Common knowledge dictated that lambda functions are functors under the hood.

In this video (@ about 45:43) Bjarne says:

I mentioned that a lambda translates into a function object, or into a function if that's convenient

I can see how this is a compiler optimization (ie it doesn't change the perception of lambdas as unnamed functors which means that eg lambdas still won't overload) but are there any rules that specify when this is applicable?

Edit

The way I understand the term translate (and that's what I'm asking about) has nothing to do with conversion (I'm not asking whether lambdas are convertible to function ptr etc). By translate I mean "compile lambda expressions into functions instead of function objects".

As mentioned in cppreference : The lambda expression constructs an unnamed prvalue temporary object of unique unnamed non-union non-aggregate type, known as closure type.

The question is : can this object be ommited and have a plain function instead? If yes, then when and how?

^{Note: I imagine one such rule being "don't capture anything" but I can't find any reliable sources to confirm it}

There's no such thing as a function object on the binary level. The lambda's call operator will always compile into a function. If it's a capture-less lambda, the compiler might well omit the this pointer from this function as an optimization. — Sebastian Redl
@SebastianRedl The lambda's call operator is a (member) function (never said otherwise). The lambda expression constructs an unnamed prvalue temporary object of unique unnamed non-union non-aggregate type, known as closure type. (link)The question is : can this object be ommited and have a plain function instead? If yes, then when and how? — Nikos Athanasiou

Mikhail Maltsev Mikhail Maltsev · Accepted Answer · 2015-08-29T15:34:34

TLDR: if you only use lambda to convert it to a function pointer (and only invoke it via that function pointer), it is always profitable to omit the closure object. Optimizations which enable this are inlining and dead code elimination. If you do use the lambda itself, it is still possible to optimize the closure away, but requires somewhat more aggressive interprocedural optimization.

I will now try to show how that works under the hood. I will use GCC in my examples, because I'm more familiar with it. Other compilers should do something similar.

Consider this code:

#include <stdio.h>

typedef int (* fnptr_t)(int);
void use_fnptr(fnptr_t fn)
{
    printf("fn=%p, fn(1)=%d\n", fn, fn(1));
}

int main()
{
    auto lam = [] (int x) { return x + 1; };
    use_fnptr((fnptr_t)lam);
}

Now, I compile it and dump intermediate representation (for versions prior to 6, you should add -std=c++11):

g++ test.cc -fdump-tree-ssa

A little cleaned-up and edited (for brevity) dump looks like this:

// _ZZ4mainENKUliE_clEi
main()::<lambda(int)> (const struct __lambda0 * const __closure, int x)
{
    return x_1(D) + 1;
}

// _ZZ4mainENUliE_4_FUNEi
static int main()::<lambda(int)>::_FUN(int) (int D.2780)
{
    return main()::<lambda(int)>::operator() (0B, _2(D));
}

// _ZZ4mainENKUliE_cvPFiiEEv
main()::<lambda(int)>::operator int (*)(int)() const (const struct __lambda0 * const this)
{
    return _FUN;
}

int main() ()
{
    struct __lambda0 lam;
    int (*<T5c1>) (int) _3;
    _3 = main()::<lambda(int)>::operator int (*)(int) (&lam);
    use_fnptr (_3);
}

That is, lambda has 2 member functions: function call operator and a conversion operator and one static member function _FUN, which simply invokes lambda with this set to zero. main calls the conversion operator and passes the result to use_fnptr - exactly as it is written in the source code.

I can write:

extern "C" int _ZZ4mainENKUliE_clEi(void *, int);

int main()
{
    auto lam = [] (int x) { return x + 1; };
    use_fnptr((fnptr_t)lam);
    printf("%d %d %d\n", lam(10), _ZZ4mainENKUliE_clEi(&lam, 11), __lambda0::_FUN(12));
    printf("%p %p\n", &__lambda0::_FUN, (fnptr_t)lam);
   return 0;
}

This program outputs:

fn=0x4005fc, fn(1)=2
11 12 13
0x4005fc 0x4005fc

Now, I think it's pretty obvious, that the compiler should inline lambda (_ZZ4mainENKUliE_clEi) into _FUN (_ZZ4mainENUliE_4_FUNEi), because _FUN is the only caller. And inline operator int (*)(int) into main (because this operator simply returns a constant). GCC does exactly this, when compiling with optimization (-O). You can check this like:

g++ test.cc -O -fdump-tree-einline

Dump file:

// Considering inline candidate main()::<lambda(int)>.
//   Inlining main()::<lambda(int)> into static int main():<lambda(int)>::_FUN(int).

static int main()::<lambda(int)>::_FUN(int) (int D.2822)
{
    return _2(D) + 1;
}

The closure object is gone. Now, a more complicated case, when lambda itself is used (not a function pointer). Consider:

#include <stdio.h>

#define PRINT(x)    printf("%d", (x))
#define PRINT1(x)   PRINT(x); PRINT(x); PRINT(x); PRINT(x);
#define PRINT2(x)   do { PRINT1(x) PRINT1(x) PRINT1(x) PRINT1(x) } while(0)

__attribute__((noinline)) void use_lambda(auto t)
{
    t(1);
}

int main()
{
    auto lam = [] (int x) { PRINT2(x); };
    use_lambda(lam);
    return 0;
}

GCC will not inline lambda, because it is rather huge (that is what I used printf's for):

g++ test2.cc -O2 -fdump-ipa-inline -fdump-tree-einline -fdump-tree-esra

Early inliner's dump:

Considering inline candidate main()::<lambda(int)>
  will not early inline: void use_lambda(auto:1) [with auto:1 = main()::<lambda(int)>]/16->main()::<lambda(int)>/19, growth 46 exceeds --param early-inlining-insns

But "early interprocedural scalar replacement of aggregates" pass will do what we want:

;; Function main()::<lambda(int)> (_ZZ4mainENKUliE_clEi, funcdef_no=14, decl_uid=2815, cgraph_uid=12, symbol_order=12)
IPA param adjustments: 0. base_index: 0 - __closure, base: __closure, remove_param
  1. base_index: 1 - x, base: x, copy_param

The first parameter (i.e., closure) is not used, and it gets removed. Unfortunately interprocedural SRA is not able to optimize away indirection, which is introduced for captured values (though there are cases when it would be obviously profitable), so there is still some room for enhancements.

Can lambdas translate into functions?

Edit

5 Answers