9
votes

I have a pure abstract base and two derived classes:

#include <iostream>
#include <vector>

struct B { virtual void foo() = 0; };
struct D1 : B { void foo() override { std::cout << "D1::foo()" << std::endl; } };
struct D2 : B { void foo() override { std::cout << "D2::foo()" << std::endl; } };

Does calling foo at Point A cost the same as a call to a non-virtual member function? Or is it more expensive than it would be if D1 and D2 did not derive from B?

int main() {
 D1 d1; D2 d2; 
 std::vector<B*> v = { &d1, &d2 };

 d1.foo(); d2.foo(); // Point A (polymorphism not necessary)
 for(auto&& i : v) i->foo(); // Polymorphism necessary.

 return 0;
}

Answer: Andy Prowl's answer is essentially correct; I just want to add the assembly output of gcc (tested on godbolt: gcc-4.7 -O2 -march=native -std=c++11). The direct function calls compile to:

mov rdi, rsp
call    D1::foo()
mov rdi, rbp
call    D2::foo()

And for the polymorphic calls:

mov rdi, QWORD PTR [rbx]
mov rax, QWORD PTR [rdi]
call    [QWORD PTR [rax]]
mov rdi, QWORD PTR [rbx+8]
mov rax, QWORD PTR [rdi]
call    [QWORD PTR [rax]]

However, if the objects don't derive from B and you just perform the direct call, gcc will inline the function calls:

mov esi, OFFSET FLAT:.LC0
mov edi, OFFSET FLAT:std::cout
call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)

This could enable further optimizations if D1 and D2 don't derive from B, so I guess that no, the two cases are not equivalent, at least for this version of gcc with these flags (-O3 produced similar output, without inlining). Is there something preventing the compiler from inlining when D1 and D2 do derive from B?

"Fix": use delegates (aka reimplement virtual functions yourself):

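struct DG { // Delegate (needs #include <functional>)
 std::function<void()> foo;
 // The lambda captures c by reference, so the object passed in
 // must outlive the delegate (do not pass temporaries).
 template<class C> DG(C&& c) { foo = [&c] { c.foo(); }; }
};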

and then create a vector of delegates:

std::vector<DG> v = { d1, d2 };

This allows inlining when you call the methods non-polymorphically. However, I guess that calls through the vector will be slower than plain virtual calls, or at best equally fast, because std::function uses virtual functions for type erasure (can't test with godbolt yet).
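A minimal sketch of the delegate idea, under a few assumptions that are not in the original: foo returns a std::string instead of printing (so the result can be checked), the delegate is called Delegate rather than DG, its constructor takes an lvalue reference so that temporaries cannot bind, and call_all is a hypothetical helper. D1 and D2 deliberately do not derive from a common base here.

```cpp
#include <functional>
#include <string>
#include <vector>

// Unrelated classes: no common base, no virtual functions.
struct D1 { std::string foo() const { return "D1::foo()"; } };
struct D2 { std::string foo() const { return "D2::foo()"; } };

// Hypothetical delegate: type-erases "something with a foo()".
struct Delegate {
    std::function<std::string()> foo;

    // Captures the object by reference; the object must outlive the delegate.
    template <class C>
    explicit Delegate(C& c) : foo([&c] { return c.foo(); }) {}
};

// Calls every delegate in order and collects the results.
std::vector<std::string> call_all(const std::vector<Delegate>& v) {
    std::vector<std::string> out;
    for (const auto& d : v) out.push_back(d.foo());
    return out;
}
```

Used as `D1 d1; D2 d2; std::vector<Delegate> v{Delegate(d1), Delegate(d2)};`, direct calls such as d1.foo() stay ordinary non-virtual calls, while calls through v pay for the indirection inside std::function.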

2
There is no reason the compiler couldn't inline the direct calls if D1 and D2 are derived from B. – Daniel Frey
You could not time the difference in those instruction sets. – Martin York
Nothing prevents the compiler from inlining D1::foo() and D2::foo(). It's some glitch in GCC 4.7 and above. GCC 4.5 inlined this with no problems. clang 3.4.1 inlined this as well. – doc
It still fails with gcc-4.9 (tip-of-trunk) -O3 -march=native -DNDEBUG (see code and assembly here: goo.gl/NKm3Uz). It should inline these calls since we have a single TU. In a more complex program, and unless you use final, it is very hard to inline these even with LTO, since you can always create a new TU in which you derive from a class (and a dynamic library could do so too). IIRC Herb Sutter described the issue as "with virtual inheritance you pay for infinite extensibility", and that has a cost. – gnzlbg
Furthermore, with virtual inheritance, the interface (or all possible interfaces, unless you use the Adapter pattern) gets put in the vtable with the object, and this vtable can become quite large. The delegate provides a smaller interface (and vtable), and this improves cache usage in loops. – gnzlbg

2 Answers

8
votes

Does calling foo in Point A cost the same as a call to a non-virtual member function?

Yes.

Or is it more expensive than if D1 and D2 wouldn't have derived from B?

No.

The compiler will resolve these function calls statically, because they are not performed through a pointer or through a reference. Since the type of the objects on which the function is called is known at compile-time, the compiler knows which implementation of foo() will have to be invoked.
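As an illustration of the distinction, here is a small sketch. It assumes a variant of the question's classes in which foo returns a std::string instead of printing; call_on_object and call_through_base are hypothetical names introduced only for this example.

```cpp
#include <string>

struct B {
    virtual std::string foo() const = 0;
    virtual ~B() = default;
};
struct D1 : B { std::string foo() const override { return "D1::foo()"; } };
struct D2 : B { std::string foo() const override { return "D2::foo()"; } };

// Call on a named object: the dynamic type is known to be D1 here,
// so the compiler may bind the call to D1::foo statically.
std::string call_on_object() {
    D1 d1;
    return d1.foo();
}

// Call through a reference to the base: the dynamic type is not known
// in this function, so in general the call is dispatched via the vtable.
std::string call_through_base(const B& b) {
    return b.foo();
}
```

Both functions produce the right string for whatever object they operate on; the difference is only in how the call target is found, which is what the assembly in the question shows.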

4
votes

The simplest solution is to look at the compiler's innards. In Clang we find canDevirtualizeMemberFunctionCall in lib/CodeGen/CGClass.cpp:

/// canDevirtualizeMemberFunctionCall - Checks whether the given virtual member
/// function call on the given expr can be devirtualized.
static bool canDevirtualizeMemberFunctionCall(const Expr *Base, 
                                              const CXXMethodDecl *MD) {
  // If the most derived class is marked final, we know that no subclass can
  // override this member function and so we can devirtualize it. For example:
  //
  // struct A { virtual void f(); }
  // struct B final : A { };
  //
  // void f(B *b) {
  //   b->f();
  // }
  //
  const CXXRecordDecl *MostDerivedClassDecl = getMostDerivedClassDecl(Base);
  if (MostDerivedClassDecl->hasAttr<FinalAttr>())
    return true;

  // If the member function is marked 'final', we know that it can't be
  // overridden and can therefore devirtualize it.
  if (MD->hasAttr<FinalAttr>())
    return true;

  // Similarly, if the class itself is marked 'final' it can't be overridden
  // and we can therefore devirtualize the member function call.
  if (MD->getParent()->hasAttr<FinalAttr>())
    return true;

  Base = skipNoOpCastsAndParens(Base);
  if (const DeclRefExpr *DRE = dyn_cast<DeclRefExpr>(Base)) {
    if (const VarDecl *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
      // This is a record decl. We know the type and can devirtualize it.
      return VD->getType()->isRecordType();
    }

    return false;
  }

  // We can always devirtualize calls on temporary object expressions.
  if (isa<CXXConstructExpr>(Base))
    return true;

  // And calls on bound temporaries.
  if (isa<CXXBindTemporaryExpr>(Base))
    return true;

  // Check if this is a call expr that returns a record type.
  if (const CallExpr *CE = dyn_cast<CallExpr>(Base))
    return CE->getCallReturnType()->isRecordType();

  // We can't devirtualize the call.
  return false;
}

I believe the code (and the accompanying comments) is self-explanatory :)
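To make the cases concrete, the sketch below gives one call site per kind of situation the function checks for. All type and function names are hypothetical, and the mapping to the individual checks is only approximate; whether a given compiler version actually devirtualizes each call has to be confirmed from its assembly output.

```cpp
#include <string>

struct A {
    virtual std::string f() const { return "A::f"; }
    virtual ~A() = default;
};

struct FinalClass final : A {
    std::string f() const override { return "FinalClass::f"; }
};
struct FinalMethod : A {
    std::string f() const final { return "FinalMethod::f"; }
};
struct Plain : A {
    std::string f() const override { return "Plain::f"; }
};

Plain make_plain() { return Plain{}; }

// Most derived class is final: nothing can override f below FinalClass.
std::string via_final_class(const FinalClass* p) { return p->f(); }

// Member function is final: no subclass of FinalMethod can override it.
std::string via_final_method(const FinalMethod& r) { return r.f(); }

// Named variable of class type: its dynamic type is exactly Plain.
std::string via_named_object() { Plain p; return p.f(); }

// Temporary object: again the dynamic type is exactly Plain.
std::string via_temporary() { return Plain{}.f(); }

// Call returning a class type by value.
std::string via_call_result() { return make_plain().f(); }

// None of the above: pointer to a non-final base, so a real virtual call.
std::string via_base_pointer(const A* p) { return p->f(); }
```

Only the last function falls through to the final return false; for the others the callee is known from the static type alone.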