Keeping a pointer to an element of a vector and dereferencing it after the vector has been resized (and its storage reallocated) is undefined behavior.

When I test this bad practice with the following program on a std::vector<int> (with #if 0), the address sanitizer correctly reports a heap-use-after-free error.

$ ./prog
capa: 8
v[0]: 0x603000000010 <1000>
p: 0x603000000010 <1000>
capa: 16
v[0]: 0x6060000000e0 <1000>
=================================================================
==23068==ERROR: AddressSanitizer: heap-use-after-free on address 0x603000000010

But when I try the same experiment with std::vector<std::string> (with #if 1), the address sanitizer reports nothing, and the program goes on to read a destroyed string (probably moved-from during the resize) through the pointer!

$ ./prog
capa: 8
v[0]: 0x611000000040 <1000>
p: 0x611000000040 <1000>
capa: 16
v[0]: 0x615000000080 <1000>
p: 0x611000000040 <>

My question: why doesn't the address sanitizer report the error in this second case?
Edit: Valgrind does report the error.

I tested the following program on GNU/Linux x86_64 (Arch Linux) with g++ 9.2.0 and clang++ 9.0.0.

/**
  g++ -std=c++17 -o prog prog.cpp \
      -pedantic -Wall -Wextra -Wconversion -Wno-sign-conversion \
      -g -O0 -UNDEBUG -fsanitize=address,undefined
**/

#include <iostream>
#include <vector>

#if 1
# include <string>
  inline auto make_elem(int n) { return std::to_string(n); }
#else
  inline auto make_elem(int n) { return n; }
#endif

using elem_t = decltype(make_elem(0));

inline
void
fill(std::vector<elem_t> &v,
     int sz)
{
  v.resize(std::size_t(sz));
  for(auto i=0; i<sz; ++i)
  {
    v[i]=make_elem(1000+i);
  }
}

inline
void
show(const std::vector<elem_t> &v,
     const elem_t *p)
{
  std::cout << "capa: " << v.capacity() << '\n';
  std::cout << "v[0]: " << &v[0] << " <" << v[0] << ">\n";
  std::cout << "p: " << p << " <" << *p << ">\n"; // <-- possible invalid pointer here
}

int
main()
{
  constexpr auto sz=8;
  auto v=std::vector<elem_t>{};
  fill(v, sz);
  const auto *p=data(v);
  show(v, p);
  fill(v, 2*sz);
  show(v, p);
  return 0;
}

I've also filed an upstream bug about this.

Did you let Valgrind have a go at your code? - tadman
@tadman yes, and it correctly reports the use-after-free bug (it is indeed present in my code). But I don't understand why the address-sanitizer does not (but does with a vector of int). - prog-fh
The compiler builtins are rarely as good as a special-purpose tool like Valgrind. - tadman
You're assuming that std::vector::resize() invalidates all pointers to elements of the vector. However, resize() is not required to reallocate, so there is no guarantee that p becomes invalid after the second call of fill(). Whether reallocation occurs may also depend on the size of the vector element (sizeof(int) versus sizeof(std::string)), compilation settings (optimisation, etc.), and so on. So it is not unreasonable that memory checking tools will not reliably detect every instance where p is potentially invalidated. - Peter
@tadman Actually, I think both ASan and Valgrind need support from the STL to detect container overflows properly (support that is hidden behind a macro and calls ASan's __asan_poison_memory_region or Valgrind's VALGRIND_MAKE_MEM_DEFINED, etc.). So I take my words back: both Valgrind and ASan have equal opportunities (but ASan already has support in libc++ and libstdc++). - yugr
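
To make Peter's point about reallocation concrete, here is a minimal sketch that records the address of the vector's storage before and after resize() and reports whether it moved. The helper name resize_moved_storage is hypothetical (it is not part of the program in the question). The addresses are stored as integers, so no stale pointer is ever dereferenced.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not part of the original program: resize the vector
// and tell the caller whether its storage ended up at a different address.
template <typename T>
bool resize_moved_storage(std::vector<T> &v, std::size_t new_size)
{
  // Keep the old address as an integer; it is only compared, never dereferenced.
  const auto before = reinterpret_cast<std::uintptr_t>(v.data());
  v.resize(new_size);
  const auto after = reinterpret_cast<std::uintptr_t>(v.data());
  return before != after;
}
```

If the new size fits in the current capacity, resize() must not reallocate and the helper returns false; once the new size exceeds the capacity, a new block is allocated while the old one is still alive, so the address changes. In the program from the question the capacity goes from 8 to 16, so p does dangle after the second fill().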

1 Answer

I've commented on the GitHub issue, but the short answer is that libstdc++ splits out certain common template instantiations, such as

basic_ostream<...>::operator<<(basic_ostream<...>&, const std::string &);

and instantiates them only once, inside libstdc++.so.6. Because libstdc++.so.6 itself is not ASan-instrumented, all the instrumented code can see is that you are passing a dangling pointer to an external function. It doesn't know what the external function will do with this pointer, so it can't report the error.

The problem does not reproduce with clang++ ... -stdlib=libc++ (dangling access is properly reported).
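
If you have to stay on libstdc++, one possible workaround (a sketch, not something from the bug report) is to ask the ASan runtime about the address yourself before handing the pointer to uninstrumented library code. __asan_address_is_poisoned() is declared in <sanitizer/asan_interface.h>; the names HAS_ASAN, looks_dangling and show_checked are hypothetical helpers made up for this example. Without ASan the check compiles to a no-op that always answers "not dangling".

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Detect an ASan build: GCC defines __SANITIZE_ADDRESS__, Clang offers __has_feature.
#if defined(__SANITIZE_ADDRESS__)
#  define HAS_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define HAS_ASAN 1
#  endif
#endif

#ifdef HAS_ASAN
#  include <sanitizer/asan_interface.h>
#endif

// Hypothetical helper: true if ASan considers the first byte at p poisoned
// (for example because the block was freed). Always false without ASan.
inline bool looks_dangling(const void *p)
{
  assert(p != nullptr);
#ifdef HAS_ASAN
  return __asan_address_is_poisoned(p) != 0;
#else
  return false;
#endif
}

// Hypothetical helper: print *p only if the check above does not flag it.
// Returns true when the string was actually printed.
inline bool show_checked(const std::string *p)
{
  if (looks_dangling(p))
  {
    std::cerr << "p: " << static_cast<const void *>(p) << " <dangling>\n";
    return false;
  }
  std::cout << "p: " << static_cast<const void *>(p) << " <" << *p << ">\n";
  return true;
}
```

Calling show_checked(p) in place of the last line of show() should, as far as I can tell, flag the second call in an ASan build, because the vector's old block is poisoned once it has been freed. It only inspects one byte and only works while the freed block is still in ASan's quarantine, so treat it as a debugging aid rather than a guarantee.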