115
votes

I've just lost three days of my life tracking down a very strange bug where unordered_map::insert() destroys the variable you insert. This highly non-obvious behaviour occurs in very recent compilers only: I found that clang 3.2-3.4 and GCC 4.8 are the only compilers to demonstrate this "feature".

Here's some reduced code from my main code base which demonstrates the issue:

#include <memory>
#include <unordered_map>
#include <iostream>

int main(void)
{
  std::unordered_map<int, std::shared_ptr<int>> map;
  auto a(std::make_pair(5, std::make_shared<int>(5)));
  std::cout << "a.second is " << a.second.get() << std::endl;
  map.insert(a); // Note we are NOT doing insert(std::move(a))
  std::cout << "a.second is now " << a.second.get() << std::endl;
  return 0;
}

I, like probably most C++ programmers, would expect output to look something like this:

a.second is 0x8c14048
a.second is now 0x8c14048

But with clang 3.2-3.4 and GCC 4.8 I get this instead:

a.second is 0xe03088
a.second is now 0

Which might make no sense, until you examine closely the docs for unordered_map::insert() at http://www.cplusplus.com/reference/unordered_map/unordered_map/insert/ where overload no 2 is:

template <class P> pair<iterator,bool> insert ( P&& val );

Which is a greedy universal reference move overload, consuming anything not matching any of the other overloads, and move constructing it into a value_type. So why did our code above choose this overload, and not the unordered_map::value_type overload as probably most would expect?

The answer stares you in the face: unordered_map::value_type is a pair<const int, std::shared_ptr<int>> and the compiler would correctly think that a pair<int, std::shared_ptr<int>> isn't convertible. Therefore the compiler chooses the move universal reference overload, and that destroys the original, despite the programmer not using std::move() which is the typical convention for indicating you are okay with a variable getting destroyed. Therefore the insert destroying behaviour is in fact correct as per the C++11 standard, and older compilers were incorrect.

You can probably see now why I took three days to diagnose this bug. It was not at all obvious in a large code base where the type being inserted into unordered_map was a typedef defined far away in source code terms, and it never occurred to anyone to check if the typedef was identical to value_type.

So my questions to Stack Overflow:

  1. Why do older compilers not destroy inserted variables the way newer compilers do? I mean, even GCC 4.7 doesn't do this, and it's pretty standards conforming.

  2. Is this problem widely known, because surely upgrading compilers will cause code which used to work to suddenly stop working?

  3. Did the C++ standards committee intend this behaviour?

  4. How would you suggest that unordered_map::insert() be modified to give better behaviour? I ask this because if there is support here, I intend to submit this behaviour as an N note to WG21 and ask them to implement a better behaviour.

Just because it uses a universal ref doesn't mean the inserted value is always moved - it should only ever do so for rvalues, which plain a is not. It should make a copy. Also, this behaviour totally depends on the stdlib, not the compiler. - Xeo
That seems like a bug in the implementation of the library. - David Rodríguez - dribeas
"Therefore the insert destroying behaviour is in fact correct as per the C++11 standard, and older compilers were incorrect." Sorry, but you're wrong. What part of the C++ Standard did you get that idea from? BTW cplusplus.com is not official. - Ben Voigt
I cannot reproduce this on my system, and I'm using gcc 4.8.2 and 4.9.0 20131223 (experimental) respectively. Output is a.second is now 0x2074088 (or similar) for me. - user1508519
This was GCC bug 57619, a regression in the 4.8 series that was fixed for 4.8.2 in 2013-06. - Casey

2 Answers

83
votes

As others have pointed out in the comments, the "universal" insert overload is not, in fact, supposed to always move from its argument. It's supposed to move if the argument is really an rvalue, and copy if it's an lvalue.

The behaviour you observe, which always moves, is a bug in libstdc++, which has since been fixed according to a comment on the question. For those curious, I took a look at the g++-4.8 headers.

bits/stl_map.h, lines 598-603

  template<typename _Pair, typename = typename
           std::enable_if<std::is_constructible<value_type,
                                                _Pair&&>::value>::type>
    std::pair<iterator, bool>
    insert(_Pair&& __x)
    { return _M_t._M_insert_unique(std::forward<_Pair>(__x)); }

bits/unordered_map.h, lines 365-370

  template<typename _Pair, typename = typename
           std::enable_if<std::is_constructible<value_type,
                                                _Pair&&>::value>::type>
    std::pair<iterator, bool>
    insert(_Pair&& __x)
    { return _M_h.insert(std::move(__x)); }

The latter is incorrectly using std::move where it should be using std::forward.

20
votes
template <class P> pair<iterator,bool> insert ( P&& val );

Which is a greedy universal reference move overload, consuming anything not matching any of the other overloads, and move constructing it into a value_type.

That is what some people call a universal reference, but it is really reference collapsing. In your case, where the argument is an lvalue of type pair<int,shared_ptr<int>>, it will not result in the argument being an rvalue reference, and it should not be moved from.

So why did our code above choose this overload, and not the unordered_map::value_type overload as probably most would expect?

Because you, like many others before you, misinterpreted the value_type in the container. The value_type of *map (whether ordered or unordered) is pair<const K, T>, which in your case is pair<const int, shared_ptr<int>>. The type not matching eliminates the overload that you might be expecting:

iterator       insert(const_iterator hint, const value_type& obj);