19
votes

TL;DR

Given the following code:

int* ptr;
*ptr = 0;

does *ptr require an lvalue-to-rvalue conversion of ptr before applying indirection?

The standard covers the topic of lvalue-to-rvalue conversion in many places but does not seem to specify enough information to determine whether the * operator requires such a conversion.

Details

The lvalue-to-rvalue conversion is covered in N3485 in section 4.1 Lvalue-to-rvalue conversion paragraph 1 and says (emphasis mine going forward):

A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue.53 If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior.[...]

So does *ptr = 0; necessitate this conversion?

If we go to section 4 paragraph 1 it says:

[...]A standard conversion sequence will be applied to an expression if necessary to convert it to a required destination type.

So when is it necessary? If we look at section 5 Expressions the lvalue-to-rvalue conversion is mentioned in paragraph 9 which says:

Whenever a glvalue expression appears as an operand of an operator that expects a prvalue for that operand, the lvalue-to-rvalue (4.1), array-to-pointer (4.2), or function-to-pointer (4.3) standard conversions are applied to convert the expression to a prvalue. [...]

and paragraph 11 which says:

In some contexts, an expression only appears for its side effects. Such an expression is called a discarded-value expression.[...] The lvalue-to-rvalue conversion (4.1) is applied if and only if the expression is an lvalue of volatile-qualified type and it is one of the following [...]

Neither paragraph seems to apply to this code sample, and section 5.3.1 Unary operators paragraph 1 says:

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [ Note: indirection through a pointer to an incomplete type (other than cv void) is valid. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to a prvalue, see 4.1. —end note ]

It does not seem to require the value of the pointer, and I don't see any requirement for a conversion of the pointer here. Am I missing something?
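One thing that can be checked mechanically is the value category of each expression involved. The sketch below (C++11; names such as g_ptr are chosen purely for illustration) uses decltype((e)), which yields T& for an lvalue and plain T for a prvalue. It only shows that ptr and *ptr are lvalues; it does not settle whether the standard mandates a conversion of the operand.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the pointer in the question. It has static
// storage duration, so it is zero-initialized; it is never dereferenced here.
int* g_ptr;

// decltype's operand is unevaluated: no pointer value is read below.
constexpr bool operand_is_lvalue =
    std::is_same<decltype((g_ptr)), int*&>::value;   // g_ptr: lvalue of type int*
constexpr bool result_is_lvalue =
    std::is_same<decltype((*g_ptr)), int&>::value;   // *g_ptr: lvalue of type int
constexpr bool unary_plus_is_prvalue =
    std::is_same<decltype((+g_ptr)), int*>::value;   // +g_ptr: prvalue of type int*

static_assert(operand_is_lvalue && result_is_lvalue && unary_plus_is_prvalue,
              "value categories as described in the comments above");
```

Unary + is included only as a contrast: it is an operator whose result is a prvalue, whereas the result of unary * is an lvalue.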

Why do we care?

I have seen an answer and comments in other questions that claim the use of an uninitialized pointer is undefined behavior due to the need for an lvalue-to-rvalue conversion of ptr before applying indirection. For example, "Where exactly does C++ standard say dereferencing an uninitialized pointer is undefined behavior?" makes this argument, and I cannot reconcile it with what is laid out in any of the recent draft versions of the standard. Since I have seen this several times, I wanted to get clarification.

The actual proof of undefined behavior is not as important since, as I noted in the linked question above, we have other ways to get to undefined behavior.

2
"it does not seem to" explicitly "require the value of the pointer". One could argue that the value is required implicitly (by "common sense"). – dyp
To be clear, your goal is to figure out exactly what part of the standard makes int *p; *p=0; undefined behavior? And failing that, spot a bug in the standard? – Yakk - Adam Nevraumont
@Yakk let me rephrase: the answer impacts whether you can use this argument to show that using an uninitialized pointer is UB, but it is not the only argument; we can show it is UB b/c we must assume it is singular. I see three outcomes: 1) there is a defect, and the standard should explicitly say there is an l-to-r conversion 2) no conversion is mandated and this is not a proof of UB 3) the l-to-r conversion is implied and it does prove UB. – Shafik Yaghmour
@dyp but if it is implied, does that mean any read of a variable requires an l-to-r conversion? If that is the case, then why all the specific language in section 5? – Shafik Yaghmour
@ShafikYaghmour I'm not sure if every read requires an l-to-r conversion. Jerry Coffin's interpretation gives UB w/o l-to-r, by violating the requirements. Similar questions: stackoverflow.com/q/14991219/420683 stackoverflow.com/q/14935722/420683 – dyp

2 Answers

13
votes

I think you're approaching this from a rather oblique angle, so to speak. According to §5.3.1/1:

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.”

Although this doesn't talk about the lvalue-to-rvalue conversion, it requires that the expression be a pointer to an object or function. An uninitialized pointer won't (except, perhaps by accident) be any such thing so the attempt at dereferencing gives undefined behavior.
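As a minimal sketch of that requirement (the function name is made up for illustration), the well-defined counterpart of the question's snippet makes the pointer actually point to an int object before indirection is applied:

```cpp
#include <cassert>

// Well-defined counterpart of `int* ptr; *ptr = 0;`.
int store_zero_through_pointer() {
    int target = 42;
    int* ptr = &target;  // ptr now points to an object of type int
    *ptr = 0;            // indirection yields an lvalue referring to target
    return target;       // target was overwritten through ptr
}
```

Calling store_zero_through_pointer() returns 0, because the assignment through *ptr writes to target.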

4
votes

I have converted the update section in my question to an answer since, at this point, it seems to be the answer, albeit an unsatisfactory one: my question is unanswerable.

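dyp pointed me to two relevant threads that cover very similar ground (linked in the comments on the question: stackoverflow.com/q/14991219/420683 and stackoverflow.com/q/14935722/420683).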

The consensus seems to be that the standard is ill-specified and therefore cannot provide the answer I am looking for. Joseph Mansfield posted a defect report on this lack of specification; it looks like it is still open, and it is not clear when it may be clarified.

There are a few common-sense arguments to be made as to the intent of the standard. One can argue that, logically, an operand is a prvalue if the operation requires using the value of that operand. Another argument is that the C99 draft standard says an lvalue-to-rvalue conversion is done by default, with the exceptions noted. The relevant section from the draft C99 standard is 6.3.2.1 Lvalues, arrays, and function designators paragraph 2, which says:

Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue). […]

This basically says that, with some exceptions, an operand is converted to the value stored. Since indirection is not one of the exceptions, if this is clarified to be the case in C++ as well, the answer to my question would indeed be yes.
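Several of the exceptions in that list have familiar counterparts in C++. The sketch below is illustrative only (the function names are hypothetical): each marked expression names an object without needing its stored value, so none of them depends on the object having been initialized.

```cpp
#include <cassert>
#include <cstddef>

// sizeof does not evaluate its operand, so the null pointer is never
// dereferenced.
std::size_t pointee_size() {
    int* p = nullptr;
    return sizeof(*p);
}

// Taking the address of x and assigning to it do not read x's
// (initially indeterminate) value.
int write_without_read() {
    int x;        // uninitialized on purpose
    int* p = &x;  // operand of unary &: no value is needed
    x = 5;        // left operand of assignment: no value is needed
    return *p;    // x now holds 5, so this read is fine
}
```

Indirection is not on the C99 list, which is why the argument above concludes that its operand would be converted to a value.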

As I attempted to clarify, the proof of undefined behavior was less important than clarifying whether an lvalue-to-rvalue conversion is mandated. If we want to prove undefined behavior, we have alternate approaches. Jerry's approach is a common-sense one: indirection requires that the expression be a pointer to an object or function, and an indeterminate value will point to a valid object only by accident. In C++11 and earlier, the draft C++ standard, unlike the C99 draft standard, does not give an explicit statement saying that using an indeterminate value is undefined. The exception is iterators and, by extension, pointers: there we do have the concept of a singular value, and we are told in section 24.2.1 that:

[…][ Example: After the declaration of an uninitialized pointer x (as with int* x;), x must always be assumed to have a singular value of a pointer. —end example ] […] Dereferenceable values are always non-singular.

and:

An invalid iterator is an iterator that may be singular.268

and footnote 268 says:

This definition applies to pointers, since pointers are iterators. The effect of dereferencing an iterator that has been invalidated is undefined.
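In practice this means an uninitialized pointer cannot be tested for validity after the fact; a null check only helps if the pointer was given a known value to begin with. A small sketch under that assumption (the helper value_or and the demo functions are hypothetical):

```cpp
#include <cassert>

// Returns *p when p is non-null, otherwise fallback. This only works for
// pointers that were initialized: comparing an uninitialized pointer against
// nullptr already uses its indeterminate value.
int value_or(const int* p, int fallback) {
    return p != nullptr ? *p : fallback;
}

int demo_null() {
    int* p = nullptr;        // known value, but not dereferenceable
    return value_or(p, -1);  // takes the fallback branch
}

int demo_valid() {
    int n = 7;
    int* p = &n;             // points to an object, so dereferenceable
    return value_or(p, -1);  // reads n through p
}
```

Here demo_null() yields the fallback -1 and demo_valid() yields 7; there is no analogous check that could be written for int* x; left uninitialized.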

In C++1y the language changes, and we do have an explicit statement making the use of an indeterminate value undefined, with some narrow exceptions.