7
votes

Is this valid C++?

int main() {
    int *p;
    p = reinterpret_cast<int*>(42);
}

Assuming I never dereference p.

Looking up the C++ standard, we have

C++17 §6.9.2/3 [basic.compound]

3 Every value of pointer type is one of the following:

  • a pointer to an object or function (the pointer is said to point to the object or function), or
  • a pointer past the end of an object ([expr.add]), or
  • the null pointer value ([conv.ptr]) for that type, or
  • an invalid pointer value.

A value of a pointer type that is a pointer to or past the end of an object represents the address of the first byte in memory ([intro.memory]) occupied by the object or the first byte in memory after the end of the storage occupied by the object, respectively. [ Note: A pointer past the end of an object ([expr.add]) is not considered to point to an unrelated object of the object's type that might be located at that address. A pointer value becomes invalid when the storage it denotes reaches the end of its storage duration; see [basic.stc]. — end note ] For purposes of pointer arithmetic ([expr.add]) and comparison ([expr.rel], [expr.eq]), a pointer past the end of the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical array element n of x and an object of type T that is not an array element is considered to belong to an array with one element of type T.

p = reinterpret_cast<int*>(42); does not fit into the list of possible values. And:

C++17 §8.2.10/5 [expr.reinterpret.cast]

A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined. [ Note: Except as described in 6.7.4.3, the result of such a conversion will not be a safely-derived pointer value. — end note ]
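The round-trip guarantee in that paragraph can be sketched as follows. This is only an illustration: the helper name `round_trip` is made up here, and it assumes the implementation provides `std::uintptr_t` (an optional type, though available on mainstream platforms).

```cpp
#include <cstdint>

// Hypothetical helper: converts a pointer to an integer of sufficient size
// and back. Per the quoted paragraph, the result has the original value.
inline int* round_trip(int* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(bits);
}
```

The opposite direction, starting from an arbitrary integer such as 42, is the case the quote leaves implementation-defined.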

C++ standard does not seem to say more about the integer to pointer conversion. Looking up the C17 standard:

C17 §6.3.2.3/5 (emphasis mine)

An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

and

C17 §6.2.6.1/5

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined. Such a representation is called a trap representation.

To me, it seems like any value that does not fit into the list in [basic.compound] is a trap representation, thus p = reinterpret_cast<int*>(42); is UB. Am I correct? Is there something else making p = reinterpret_cast<int*>(42); undefined?

3
The first quote is about the pointer value, not how you obtain it, so I don't think it's relevant here. reinterpret_cast<int*>(42) is (likely) an invalid pointer value, which fits the 4th bullet of your first quote. Also, "A value of integral type or enumeration type can be explicitly converted to a pointer." How does this not answer your question? - Holt
I'd read C++17 §6.9.2/3 .. "invalid pointer value" as one of the four possible and allowed forms a pointer value may take on. So an "invalid pointer value" (e.g. pointing to an invalid object) is still defined behaviour. UB comes from those parts defining the meaning of operations on pointers (e.g. arithmetic, dereferencing). There is still one defined behaviour left on "invalid pointer values", which is "any pointer value may be converted to integral type", regardless of whether the pointer value is an "invalid pointer value" or not. - Stephan Lechner
@Ayxan If pointers could only become invalid when the object they point to reaches the end of its storage duration, int *p would be UB, so I don't think the way you take that quote is right. - Holt
BTW, 42 is probably a misaligned value for int*. - Jarod42
@Holt: it is UB: eel.is/c++draft/basic.indet#2 - geza

3 Answers

5
votes

This is not UB, but implementation-defined, and you already cited why (§8.2.10/5 [expr.reinterpret.cast]). If a pointer has an invalid pointer value, that doesn't necessarily mean it has a trap representation. It can have a trap representation, and the compiler must document this. All you have here is a pointer that is not safely derived.

Note that we generate pointers with invalid pointer values all the time: when an object is freed by delete, every pointer that pointed to it holds an invalid pointer value.

Using the resulting pointer is implementation-defined as well (not UB):

[...] if the object to which the glvalue refers contains an invalid pointer value ([basic.stc.dynamic.deallocation], [basic.stc.dynamic.safety]), the behavior is implementation-defined.
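As a rough sketch of the delete case mentioned above (the function name is invented for illustration), the pointer below ends up with an invalid pointer value and is then overwritten without ever being read:

```cpp
// Hypothetical example: frees an int, then overwrites the dangling pointer
// without reading its old value.
inline bool overwrite_after_delete() {
    int* p = new int(7);
    delete p;       // p now holds an invalid pointer value
    // Reading p here (e.g. `int* q = p;`) would be implementation-defined
    // in C++17; dereferencing it or deleting it again would be undefined.
    p = nullptr;    // assigning a fresh value never reads the old one
    return p == nullptr;
}
```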

2
votes

The example shown is valid C++. On some platforms this is how you access "hardware resources" (and if it's not valid, you have found a bug/mistake in the standard text).
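A minimal sketch of what such hardware access tends to look like, assuming a made-up register address (`kStatusRegAddr` and both function names are hypothetical; real addresses come from the device's documentation, and the integer-to-pointer mapping is implementation-defined):

```cpp
#include <cstdint>

// Hypothetical address of a memory-mapped status register.
constexpr std::uintptr_t kStatusRegAddr = 0x40000000u;

// Only forms the pointer; nothing is dereferenced here.
inline volatile std::uint32_t* status_reg() {
    return reinterpret_cast<volatile std::uint32_t*>(kStatusRegAddr);
}

// Meaningful only on a target where the address really is a register.
// On a desktop system this read would most likely crash, so do not call it there.
inline std::uint32_t read_status() {
    return *status_reg();
}
```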

See also this answer for a better explanation.


Update: The first sentence of reinterpret_cast as you quote yourself:

A value of integral type or enumeration type can be explicitly converted to a pointer.

I recommend you stop reading and rest at this point. The rest is just a lot of details, including possible implementation-specified behavior, etc. That doesn't make it UB/invalid.

0
votes

Trap Representations

What: As covered by [C17 §6.2.6.1/5], a trap representation is a non-value. It is a bit pattern that fills the space allocated for an object of a given type, but this pattern does not correspond to a value of that type. It is a special pattern that can be recognized for the purpose of triggering behavior defined by the implementation. That is, the behavior is not covered by the standard, which means it falls under the banner of "undefined behavior". The standard sets out the possibilities for when a trap could be (not must be) triggered, but it makes no attempt to limit what a trap might do. For more information, see A: trap representation.

The undefined behavior associated with a trap representation is interesting in that an implementation has to check for it. The more common cases of undefined behavior were left undefined so that implementations do not need to check for them. The need to check for trap representations is a good reason to want few trap representations in an efficient implementation.

Who: The decision of which bit patterns (if any) constitute trap representations falls to the implementation. The standards do not force the existence of trap representations; when trap representations are mentioned, the wording is permissive, as in "might be", as opposed to demanding, as in "shall be". Trap representations are allowed, not required. In fact, N2091 came to the conclusion that trap representations are largely unused in practice, leading up to a proposal to remove them from the C standard. (It also proposes a backup plan if removal proves infeasible: explicitly call out that implementations must document which representations are trap representations, as there is no other way to know for sure whether or not a given bit pattern is a trap representation.)

Why: Theoretically, a trap representation could be used as a debugging aid. For example, an implementation could declare that 0xDDDD is a trap representation for pointer types, then choose to initialize all otherwise uninitialized pointers to this bit pattern. Reading this bit pattern could trigger a trap that alerts the programmer to the use of an uninitialized pointer. (Without the trap, a crash might not occur until later, complicating the debugging process. Sometimes early detection is the key.) In any event, a trap representation requires a trap of some sort to serve a purpose. An implementation would not define a trap representation without also defining its trap.
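To make the debugging-aid idea concrete, here is a software simulation of it. This is not how an implementation would realize a trap representation (that would happen below the language level); `TrappingPtr` and `kTrapPattern` are invented names, and throwing an exception stands in for the trap.

```cpp
#include <cstdint>
#include <stdexcept>

// Sentinel bit pattern, borrowed from the 0xDDDD example in the text.
constexpr std::uintptr_t kTrapPattern = 0xDDDD;

// Hypothetical wrapper that simulates a trap on reading an unset pointer.
class TrappingPtr {
public:
    TrappingPtr() : bits_(kTrapPattern) {}  // "uninitialized" state
    explicit TrappingPtr(int* p)
        : bits_(reinterpret_cast<std::uintptr_t>(p)) {}

    // Reading the stored pointer "traps" if it still holds the sentinel.
    int* get() const {
        if (bits_ == kTrapPattern) {
            throw std::logic_error("read of uninitialized pointer");
        }
        return reinterpret_cast<int*>(bits_);
    }

private:
    std::uintptr_t bits_;
};
```

The explicit check inside `get()` is also the cost discussed earlier: something has to test for the pattern on every read.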

My point is that trap representations must be specified. They are deliberately removed from the set of values of a given type. They are not simply "everything else".

Pointer Values

C++17 §6.9.2/3 [basic.compound]

This section defines what an invalid pointer value is. It states "Every value of pointer type is one of the following" before listing four possibilities. That means that if you have a pointer value, then it is one of the four possibilities. The first three are fully specified (pointer to object or function, pointer past the end, and null pointer). The last possibility (invalid pointer value) is not fully specified elsewhere, so it becomes the catch-all "everything else" entry in the list (it is a "wild card", to borrow terminology from the comments). Hence this section defines "invalid pointer value" to mean a pointer value that does not point to something, does not point to the end of something, and is not null. If you have a pointer value that does not fit one of those three categories, then it is invalid.
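The first three categories are easy to produce on purpose. A small sketch, with `PointerKinds` and `make_kinds` as illustrative names only:

```cpp
// Hypothetical aggregate holding one pointer of each fully specified kind.
struct PointerKinds {
    int* to_object;  // points to an object
    int* past_end;   // points past the end of that object
    int* null;       // the null pointer value
};

// A non-array object counts as an array of one element for pointer
// arithmetic, so `&obj + 1` is a valid past-the-end pointer.
inline PointerKinds make_kinds(int& obj) {
    return PointerKinds{&obj, &obj + 1, nullptr};
}
```

Any pointer value that falls into none of these three slots lands in the fourth category.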

In particular, if we agree that reinterpret_cast<int*>(42) does not point to something, does not point to the end of something, and is not null, then we must conclude that it is an invalid pointer value. (Admittedly, one could assume that the result of the cast is a trap representation for pointers in some implementation. In that case, yes, it does not fit into the list of possible pointer values because it would not be a pointer value, hence it's a trap representation. However, that is circular logic. Furthermore, based upon N2091, few implementations define any trap representations for pointers, so the assumption is likely groundless.)

[ Note: [...] A pointer value becomes invalid when the storage it denotes reaches the end of its storage duration; see [basic.stc]. — end note ]

I should first acknowledge that this is a note. It explains and clarifies without adding new substance. One should expect no definitions in a note.

This note gives an example of an invalid pointer value. It clarifies that a pointer can (perhaps surprisingly) change from "points to an object" to "invalid pointer value" without changing its value. Looking at this from a formal logic perspective, this note is an implication: "if [something] then [invalid pointer]". Viewing this as a definition of "invalid pointer" is a fallacy; it is merely an example of one of the ways one can get an invalid pointer.

Casting

C++17 §8.2.10/5 [expr.reinterpret.cast]

A value of integral type or enumeration type can be explicitly converted to a pointer.

This explicitly permits reinterpret_cast<int*>(42). Therefore, the behavior is defined.

To be thorough, one should make sure there is nothing in the standard that makes 42 "erroneous data" to the degree that undefined behavior results from the cast. The rest of [§8.2.10/5] does not do this, and:

C++ standard does not seem to say more about the integer to pointer conversion.

Is this valid C++?

Yes.