0
votes

Given this valid C or C++ code

int x() {
    int numbers[3]; // Let's suppose numbers is filled in with values
    int sum = 0;
    for (const int* p = &numbers[0]; p != &numbers[3]; ++p)
        sum += *p;
    return sum;
}

This code uses pointer arithmetic, and as far as I know it is valid to have a pointer to one past the last element of an array: dereferencing that pointer is undefined, but the pointer itself may point to that position. So &p[0], &p[1], &p[2] and &p[3] are valid pointers, and p[0], p[1] and p[2] are valid values.

If I replace the int array with a std::vector, everything should be fine. We get this code:

#include <vector>
int x() {
    std::vector<int> numbers(3);
    int sum = 0;
    for (const int* p = &numbers[0]; p != &numbers[3]; ++p)
        sum += *p;
    return sum;
}

But running under Visual C++ 2017 in DEBUG mode I get the exception "vector subscript out of range". It is triggered from the MS STL because the implementation assumes that by using operator[] we are automatically referencing the underlying value, which is not the case. This is the MS STL code that does the bounds check:

    _NODISCARD _Ty& operator[](const size_type _Pos)
        {   // subscript mutable sequence
 #if _ITERATOR_DEBUG_LEVEL == 2
        if (size() <= _Pos)
            {   // report error
            _DEBUG_ERROR("vector subscript out of range");
            }
 #elif _ITERATOR_DEBUG_LEVEL == 1
        _SCL_SECURE_VALIDATE_RANGE(_Pos < size());
 #endif /* _ITERATOR_DEBUG_LEVEL */

        return (this->_Myfirst()[_Pos]);
        }

I don't get the error if I replace &numbers[0] and &numbers[3] with numbers.begin() and numbers.end().

I agree this is really ugly code, but I simplified the real code just to expose the bug.

The original code was using &vec[0] on a vector with zero elements.

So my question is:

Is this a bug in the Microsoft Visual C++ STL implementation, or is there some restriction on operator[] for std::vector<>?

I know that replacing [] with at() would be a bug, but I understand that &vec[size] should still be valid for std::vector<>.

4
I don't believe &p[3] is a valid pointer, because p[3] is invalid to start with. p+3 is a valid pointer that points to the intended location, though, as is vec.begin()+3. - Mooing Duck
This is MSVS being "helpful". Compile in release mode if you don't want debugging support. - NathanOliver
Also, FWIW, the standard way to do this is auto sum = std::accumulate(std::begin(container_name), std::end(container_name), container_name::value_type{}); - NathanOliver
@NathanOliver I'd be tempted to use 0 and not container_name::value_type{}. - Caleth
@Caleth Yeah. When I wrote it I was thinking to myself that it would be nice to have a std::value_type type trait that would give us the type for containers or arrays. Generic programming FTW. - NathanOliver
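
For reference, the std::accumulate approach from the comments might look like the sketch below. The helper name sum_all is made up for illustration, and the initial value 0 follows Caleth's suggestion, which fixes the accumulator type to int.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: sums a vector<int> using only begin()/end(),
// so no pointer or reference past the end is ever formed through operator[].
int sum_all(const std::vector<int>& numbers) {
    // The literal 0 makes the accumulator an int.
    return std::accumulate(numbers.begin(), numbers.end(), 0);
}
```

Called on a vector holding 1, 2 and 3 this returns 6, and on an empty vector it returns 0.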

4 Answers

3
votes

It has been a grey area whether obtaining a pointer to one past the last element with &p[n] is well defined.

However, a pointer to one past the last element is itself well defined.

You can avoid such errors by using plain pointer arithmetic:

for (const int* p = numbers.data(); p != numbers.data() + 3; ++p)

Or, more generically, iterators:

using std::begin;
using std::end;
for(auto p = begin(v), q = end(v); p != q; ++p)

Or using a range for loop:

for(auto const& element : v)

There is no good reason to use v[v.size()], really.
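
Putting the first suggestion together, a minimal sketch could look like this. The name sum_with_pointers is hypothetical, and the loop bound uses numbers.size() rather than the hard-coded 3, on the assumption that the whole vector should be summed. Because data() + size() is plain pointer arithmetic, it never goes through operator[], so it also works for an empty vector, which is the case that tripped the debug check in the original code.

```cpp
#include <vector>

// Hypothetical helper: sums a vector<int> through raw pointers.
int sum_with_pointers(const std::vector<int>& numbers) {
    const int* first = numbers.data();
    // One past the last element: formed by pointer arithmetic, never dereferenced.
    const int* last = first + numbers.size();
    int sum = 0;
    for (const int* p = first; p != last; ++p)
        sum += *p;  // p is always strictly before last here
    return sum;
}
```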

3
votes

Dereferencing a one-past-the-end position results in undefined behaviour. If you declare std::vector<int> numbers(3), then the last element you are allowed to access is numbers[2]. The same goes for raw arrays.

Avoid using raw arrays if you can:

int x() {
    std::vector<int> numbers(3); 
    //...
    int sum = 0;

    for (auto value : numbers)
        sum += value;

    return sum;
}
2
votes

The debug iterators library in Visual C++ is correctly reporting the problem, and the problem isn't subtle.

According to the standard, [sequence.reqmts] Table 101, the expression a[n] for a sequence container a of a type providing operator[] (std::vector does, as do basic_string, deque, and array) has the operational semantics of *(a.begin()+n). However, in the case you're running, a.begin()+n is equivalent to a.end(). Therefore, the result is *(a.end()) before the address-of operator is applied.

Dereferencing the end() iterator of a container invokes undefined behavior. Visual C++ is correct in reporting the assertion, and you would do well to change your enumeration strategy.
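
To make the distinction concrete, here is a small sketch (both helper names are invented for illustration). Forming numbers.begin() + n with n equal to size() is fine and compares equal to end(); only the dereference is off limits. The second helper uses at() rather than operator[] because at() is specified to throw std::out_of_range for an index equal to size(), so the rejected access can be observed without relying on undefined behavior.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper: begin() + size() may be formed and compared, and equals end().
bool past_the_end_is_end(const std::vector<int>& numbers) {
    using diff_t = std::vector<int>::difference_type;
    return numbers.begin() + static_cast<diff_t>(numbers.size()) == numbers.end();
}

// Hypothetical helper: at(size()) is a checked access and reports the error by throwing.
bool index_size_is_rejected(const std::vector<int>& numbers) {
    try {
        (void)numbers.at(numbers.size());
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```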

1
votes

So, &p[0], &p[1], &p[2] and &p[3] are valid pointers

No. The subscript operator (p[x]) for arrays is syntactic sugar for *(p+x), so &p[3] is actually doing &(*(p+3)) but *(p+3) is undefined behavior!!

If all you want is the address of one past last, p+3 is perfectly valid. This will use pointer arithmetic and not dereference anything.
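
A short sketch of that for the raw array from the question. The values 1, 2, 3 and the name sum_array are assumptions for illustration; the question leaves the array contents unspecified.

```cpp
// Hypothetical helper mirroring the question's loop, but with p + 3 style arithmetic.
int sum_array() {
    int numbers[3] = {1, 2, 3};    // example values, not from the question
    const int* end = numbers + 3;  // one past the last element: valid to form and compare
    int sum = 0;
    for (const int* p = numbers; p != end; ++p)
        sum += *p;                 // only elements 0..2 are ever dereferenced
    return sum;
}
```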

If I replace the int array with a std::vector everything should be fine

Again, no! If you try to dereference a memory location you haven't allocated, you will get undefined behavior. std::vector says nothing about v[v.size()] being allocated for you, so this is undefined behavior.

the implementation assumes that by using operator[] we are automatically referencing the underlying value, which is not the case

Yes it is!! Read the cppreference page on operator[]: "Returns a reference to the element at specified location pos. No bounds checking is performed." Just like with the raw arrays, the subscript operator here returns a reference to the underlying value, meaning there's a dereference step involved here!