
I am seeing a difference in generated code depending on whether I explicitly default the copy constructor, or hand-write the same thing. It's a simple class that only holds an int and defines some arithmetic operators on it.

Both clang and g++ handle this situation in similar ways, so it made me wonder if there is an underlying language requirement for this, and if so, what's it doing? Looking for citations in the standard if possible. :)

To show this in action, I wrote the average() function two ways, operating on raw ints and also on Holders. I expected the two to generate the same code. Here is the output:

Explicitly defaulted copy constructor:

average(Holder, Holder):
  add esi, edi
  mov eax, esi
  shr eax, 31
  add eax, esi
  sar eax
  ret
average(int, int):
  add esi, edi
  mov eax, esi
  shr eax, 31
  add eax, esi
  sar eax
  ret

It is the same! Awesome, right? The question arises when I forget to "default" the implementation and hand-write the equivalent constructor instead. Until now I was under the impression that this should produce the same code as the defaulted version, but it doesn't.

Hand-written copy constructor:

average(Holder, Holder):
  mov edx, DWORD PTR [rdx]
  mov ecx, DWORD PTR [rsi]
  mov rax, rdi
  add ecx, edx
  mov edx, ecx
  shr edx, 31
  add edx, ecx
  sar edx
  mov DWORD PTR [rdi], edx
  ret
average(int, int):
  add esi, edi
  mov eax, esi
  shr eax, 31
  add eax, esi
  sar eax
  ret

I'm trying to understand the reason for this, and relevant citations from the standard are most appreciated.

Here is the code:

#define EXPLICITLY_DEFAULTED_COPY_CTOR true

class Holder {
public:

#if EXPLICITLY_DEFAULTED_COPY_CTOR
    Holder(Holder const & other) = default;
#else
    Holder(Holder const & other) noexcept : value{other.value} { }
#endif 
    constexpr explicit Holder(int value) noexcept : value{value} {}

    Holder& operator+=(Holder rhs) { value += rhs.value; return *this; } 
    Holder& operator/=(Holder rhs) { value /= rhs.value; return *this; } 
    friend Holder operator+(Holder lhs, Holder rhs) { return lhs += rhs; }
    friend Holder operator/(Holder lhs, Holder rhs) { return lhs /= rhs; }    

private:
    int value;
};

Holder average(Holder lhs, Holder rhs) {
    return (lhs + rhs) / Holder{2};
}

int average(int lhs, int rhs) {
    return (lhs + rhs) / int{2};
}

If this is expected, then is there anything I can do to the hand-written implementation that will get it to generate the same code as the defaulted version? I thought noexcept might help, but it doesn't.

Notes: If I add a move constructor, the same issue remains, except that the difference then shows up with the move constructor instead of the copy constructor. It's the underlying reason I'm seeking, not just workarounds. I'm not interested in a code review or comments on style that are not directly relevant to why the code generation is different, because this is heavily minimized to show the issue I'm asking about.

See it live on Godbolt: https://godbolt.org/g/YA5Zsq

I think it's because by defining your own non-default copy constructor, your class is no longer a POD type. Perhaps there are some compiler optimisations that can be performed for PODs that are not done for "regular" classes. - Hitobat
Add static_assert(std::is_trivially_copyable<Holder>::value); to one of the functions and include the header <type_traits>. It's no longer trivially copyable. See "Trivial copy constructor" in: en.cppreference.com/w/cpp/language/copy_constructor Suggest you add the [language-lawyer] tag. - Richard Critten
Thanks, that's a very interesting observation. I'm still unclear whether the subtleties in the standard require the copy constructor to have a different generated code quality depending on its triviality. Is there a functional difference? This could either be an optimizer limitation, or a mandate that precludes the optimization. - Chris Uzdavinis
Any reason why you use {} to initialize the value rather than ()? It might make a difference. - Mark Ransom
@RichardCritten Adding a user-provided assignment operator also renders the class non-trivially-copyable, but in that case the "better" assembly is still generated. - M.M

1 Answer


This seems to be an ABI issue. The Itanium C++ ABI section 3.1.1/1 says:

If the parameter type is non-trivial for the purposes of calls, the caller must allocate space for a temporary and pass that temporary by reference.

and

A type is considered non-trivial for the purposes of calls if:

  • it has a non-trivial copy constructor, move constructor, or destructor, or
  • all of its copy and move constructors are deleted.
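To make the listed criteria concrete, here is a minimal sketch using standard type traits. The three struct names are hypothetical and exist only for illustration. Note that the traits only approximate the ABI wording, and that the list mentions constructors and the destructor but not assignment operators, which would be consistent with the observation in the comments that a user-provided assignment operator does not change the generated code.

```cpp
#include <type_traits>

// Hypothetical type: a user-provided destructor is non-trivial,
// which the ABI text above lists as one trigger.
struct WithDestructor {
    int value;
    ~WithDestructor() {}
};

// Hypothetical type: the copy constructor is deleted and no move
// constructor is declared, so there is no usable copy or move.
struct NoCopyOrMove {
    int value;
    explicit NoCopyOrMove(int v) : value{v} {}
    NoCopyOrMove(NoCopyOrMove const&) = delete;
};

// Hypothetical type: only the copy assignment operator is user-provided.
// The implicitly declared copy constructor and destructor stay trivial.
struct WithAssignment {
    int value;
    WithAssignment& operator=(WithAssignment const& rhs) {
        value = rhs.value;
        return *this;
    }
};

static_assert(!std::is_trivially_destructible<WithDestructor>::value,
              "user-provided destructor is non-trivial");

static_assert(!std::is_copy_constructible<NoCopyOrMove>::value &&
              !std::is_move_constructible<NoCopyOrMove>::value,
              "no usable copy or move constructor");

// Not trivially copyable overall, yet the copy constructor itself is trivial.
static_assert(!std::is_trivially_copyable<WithAssignment>::value,
              "user-provided assignment breaks trivial copyability");
static_assert(std::is_trivially_copy_constructible<WithAssignment>::value,
              "copy constructor is still trivial");
```

The last pair of assertions is the interesting one: std::is_trivially_copyable is stricter than the ABI criterion, so it is not a reliable proxy for how the type will be passed.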

The C++ Standard alludes to this in [class.temporary]/3:

When an object of class type X is passed to or returned from a function, if each copy constructor, move constructor, and destructor of X is either trivial or deleted, and X has at least one non-deleted copy or move constructor, implementations are permitted to create a temporary object to hold the function parameter or result object. The temporary object is constructed from the function argument or return value, respectively, and the function's parameter or return object is initialized as if by using the non-deleted trivial constructor to copy the temporary (even if that constructor is inaccessible or would not be selected by overload resolution to perform a copy or move of the object). [ Note: This latitude is granted to allow objects of class type to be passed to or returned from functions in registers. — end note ]


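So the difference you see in the assembly is that when Holder has a user-provided copy constructor, the ABI requires the caller to pass a pointer to each argument instead of passing the argument in a register. The return value is affected in the same way: in the hand-written version the result is stored through a pointer received in rdi rather than returned in eax.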
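As a rough compile-time check of which side of the ABI rule each variant of the class falls on, the following sketch declares both variants side by side. DefaultedHolder and HandWrittenHolder are hypothetical names for the two branches of the #if in the question, and get() is an accessor added only for this illustration.

```cpp
#include <type_traits>

// Hypothetical name: the variant with the explicitly defaulted copy constructor.
class DefaultedHolder {
public:
    DefaultedHolder(DefaultedHolder const& other) = default;
    constexpr explicit DefaultedHolder(int value) noexcept : value{value} {}
    int get() const noexcept { return value; }  // accessor added for illustration
private:
    int value;
};

// Hypothetical name: the variant with the hand-written copy constructor.
class HandWrittenHolder {
public:
    HandWrittenHolder(HandWrittenHolder const& other) noexcept : value{other.value} {}
    constexpr explicit HandWrittenHolder(int value) noexcept : value{value} {}
    int get() const noexcept { return value; }  // accessor added for illustration
private:
    int value;
};

// Defaulted on its first declaration, so the copy constructor is not
// user-provided and remains trivial.
static_assert(std::is_trivially_copy_constructible<DefaultedHolder>::value,
              "defaulted copy constructor is trivial");

// A user-provided copy constructor is never trivial, even if it does
// exactly what the defaulted one would do; noexcept does not change that.
static_assert(!std::is_trivially_copy_constructible<HandWrittenHolder>::value,
              "hand-written copy constructor is non-trivial");

// Both are still copyable at the source level; only triviality differs.
static_assert(std::is_copy_constructible<HandWrittenHolder>::value,
              "hand-written variant is still copyable");
```

Triviality is decided by how the constructor is declared, not by what its body does, so no change to the body of the hand-written constructor can make it trivial.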

I noticed that 32-bit g++ does the same thing. I didn't check the 32-bit ABI; not sure whether it has a similar requirement, or whether g++ just used the same code in both cases.