15
votes

Let's say I have a function, called like this:

#include <algorithm>
#include <cstddef>

void mysort(int *arr, std::size_t size)
{
  std::sort(&arr[0], &arr[size]);
}

int main()
{
  int a[] = { 42, 314 };
  mysort(a, 2);
}

My question is: does the code of mysort (more specifically, &arr[size]) have defined behaviour?

I know it would be perfectly valid if replaced by arr + size; pointer arithmetic allows pointing past-the-end normally. However, my question is specifically about the use of & and [].

Per C++11 5.2.1/1, arr[size] is equivalent to *(arr + size).

Quoting 5.3.1/1, the rules for unary *:

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [ Note: a pointer to an incomplete type (other than cv void) can be dereferenced. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to a prvalue, see 4.1. —end note ]

Finally, 5.3.1/3 giving the rules for &:

The result of the unary & operator is a pointer to its operand. The operand shall be an lvalue ... if the type of the expression is T, the result has type “pointer to T” and is a prvalue that is the address of the designated object (1.7) or a pointer to the designated function.

(Emphasis and ellipses mine).

I can't quite make up my mind about this. I know for sure that forcing an lvalue-to-rvalue conversion on arr[size] would be Undefined. But no such conversion happens in the code. arr + size does not point to an object; but while the paragraphs above talk about objects, they never seem to explicitly call out the necessity for an object to actually exist at that location (unlike e.g. the lvalue-to-rvalue conversion in 4.1/1).

So, the question is: is mysort, the way it's called, valid or not?

(Note that I'm quoting C++11 above, but if this is handled more explicitly in a later standard/draft, I would be perfectly happy with that).

5
No, not a duplicate; @Angew knows that: trust me. – Bathsheba
Can anyone tell me what is new in this question compared to the possible duplicate? It seems to me that copy-pasting the answer from there would indeed answer this question too. – Tomáš Zato - Reinstate Monica
§6.5.3.2, paragraph 3: "... Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator. Otherwise, the result is a pointer to the object or function designated by its operand." Appears to answer the question. Direct copy/paste from the proposed dupe. – R_Kapp
My reading is that &arr[size] is UB since it's essentially &(*something) where *something is UB. But I'm waiting for an expert to confirm. But the dupe has little to do with that. – Bathsheba
@Bathsheba How is it not a duplicate? – Barry

5 Answers

9
votes

It's not valid. You bolded "result is an lvalue referring to the object or function to which the expression points" in your question. That's exactly the problem. arr + size is a valid pointer value that does not point to an object. Therefore, your quote about *(arr + size) does not specify what the result refers to, which in turn means there is no requirement for &*(arr + size) to give the same value as arr + size.

In C, this was considered a defect and fixed, so the spec now says that in &*ptr neither & nor * gets evaluated. C++ hasn't yet received fixed wording. It's the subject of a very old, still-active DR: DR #232. The intent is that it is valid, just as it is in C, but the standard doesn't say so.
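Until the wording is settled, the form that avoids the issue is plain pointer arithmetic, which never applies unary * to the past-the-end pointer. A minimal sketch of that rewrite (the name mysort_arith is made up for illustration, and the assert is only an assumed precondition check):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical variant of mysort: the end pointer is formed by pointer
// arithmetic only, so there is no indirection through arr + size.
void mysort_arith(int *arr, std::size_t size)
{
  assert(arr != nullptr || size == 0); // assumed precondition, not from the question
  std::sort(arr, arr + size);
}
```

Called as mysort_arith(a, 2), this sorts the same range the original mysort was meant to sort.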

1
votes

In the context of normal C++ arrays, yes. It is legal to form the address of the one-past-the-end element of the array. It is not legal to read or write to what it is pointing at, however (after all, there is no actual element there). So when you do the &arr[size], the arr[size] forms what you might think of as a reference to the one-past-the-end element, but has not tried to actually access that element yet. Then the & gets you the address of that element. Since nothing has tried to actually follow that pointer, nothing bad has happened.

This isn't by accident: it makes pointers into arrays behave like iterators. Thus &a[0] is essentially .begin() on the array, and &a[size] (where size is the number of elements in the array) is essentially .end(). (See also std::array, where this ends up being more explicit.)
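The iterator analogy can be checked with std::begin and std::end, which for a built-in array return a pointer to the first element and a pointer one past the last. A small sketch, assuming C++11 or later (checked_length is a hypothetical helper, not part of any library):

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical helper: verifies that std::begin/std::end agree with plain
// pointer arithmetic on a built-in array, then returns the element count.
template <std::size_t N>
std::size_t checked_length(int (&a)[N])
{
  assert(std::begin(a) == a);   // same as a + 0
  assert(std::end(a) == a + N); // one past the last element
  return static_cast<std::size_t>(std::end(a) - std::begin(a));
}
```

Note that the end pointer here is obtained without writing &a[N] at all.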

Edit: Erm, I may have to retract this answer. While it probably applies in most cases, if the type stored in the array has an overloaded operator&, then when you do &a[size], the operator& function may attempt to access members of the instance of the type at a[size], where there is no instance.
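The concern about a user-defined operator& can be illustrated with std::addressof, which bypasses such overloads. In the sketch below, Tracked is a made-up type whose operator& counts how often it runs; only an in-bounds element is used, so nothing here depends on the disputed past-the-end case:

```cpp
#include <cassert>
#include <memory>

// Hypothetical type with an overloaded unary operator&.
struct Tracked
{
  static int calls;
  Tracked *operator&() { ++calls; return this; }
};
int Tracked::calls = 0;

// Returns how many times the overloaded operator& was invoked.
int count_overload_calls()
{
  Tracked arr[2];
  Tracked::calls = 0;

  Tracked *p = &arr[1];                // goes through Tracked::operator&
  Tracked *q = std::addressof(arr[1]); // bypasses the overload
  Tracked *r = arr + 1;                // plain pointer arithmetic

  assert(p == q && q == r);
  return Tracked::calls;
}
```

Only the first of the three expressions calls the overload, which is why arr + size stays safe for such types where &arr[size] would not.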

0
votes

Assuming size is the actual array size, you are passing a pointer to the past-the-end element to std::sort().

So, as I understand it, the question boils down to: is this pointer equivalent to arr.end()?

There is little doubt this is true for every existing compiler, since array iterators are indeed plain old pointers, so &arr[size] is the obvious choice for arr.end().

However, I doubt there is a specific requirement about the actual implementation of plain old array iterators.

So, for the sake of argument, you could imagine a compiler using a "past end" bit in addition to the actual address to implement plain old array iterators internally, and perversely painting your mustache pink if it detected any conceivable inconsistency between iterators and addresses obtained through pointer arithmetic. This freakish compiler would cause a lot of existing C++ code to crash without actually violating the spec, which might just be worth the effort of designing it...

0
votes

If we accept that arr[i] is just shorthand for *(arr + i), we can rewrite &arr[size] as &*(arr + size). Hence, we are dereferencing a pointer that points to the past-the-end element, which leads to undefined behavior. As you correctly say, arr + size would instead be legal, because no dereferencing operation takes place.

Coincidentally, this is also presented as a quiz in Stepanov's notes (page 11).

-5
votes

It's perfectly fine and well defined as long as size is not larger than the size of the actual array (in units of the array elements).

So if main() called mysort(a, 100), &arr[size] would already be undefined behaviour (most likely undetected, although std::sort would obviously go badly wrong as well).
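One way to rule out a mismatched size is to let the compiler deduce it from the array type. A sketch, assuming the caller has a real array rather than a bare pointer (mysort_deduced is a made-up name, and the assert is just a sanity check on the result):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical overload: N is deduced from the array type, so it cannot
// exceed the real number of elements.
template <std::size_t N>
void mysort_deduced(int (&arr)[N])
{
  std::sort(arr, arr + N);
  assert(std::is_sorted(arr, arr + N));
}
```

With this signature, a call such as mysort(a, 100) on a two-element array has no equivalent; the size comes from a itself.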