28
votes

Recently tried the following program and it compiles, runs fine and produces expected output instead of any runtime error.

#include <iostream>
class demo
{
    public:
        static void fun()
        {
            std::cout<<"fun() is called\n";
        }
        static int a;
};
int demo::a=9;
int main()
{
    demo* d=nullptr;
    d->fun();
    std::cout<<d->a;
    return 0;
}

If an uninitialized pointer is used to access class and/or struct members behaviour is undefined, but why it is allowed to access static members using null pointers also. Is there any harm in my program?

5
Is there any harm in my program? It is still UB.user1810087
Undefined behavior does not mean that the code is required to crash; rather it means that anything at all is allowed to happen, the result is undefined. That is, the code could appear to work fine and as expected, it could crash, it could appear to run fine but give you the wrong result, anything at all.wolfPack88
Voted to reopen; the linked question addresses non-static members, not static ones.T.C.
this is an interesting question - it compiles cleanly with no warnings, it calls the correct function. Is it valid syntax? Forget that d is null, what if d is valid pointer. It is surprising to see d->f() where f is a static functionpm100
The biggest problem is maintainability. It should be demo::f() and demo::a, and if someone later edits the code they might actually try to use that pointer.Kenny Ostrom

5 Answers

33
votes

TL;DR: Your example is well-defined. Merely dereferencing a null pointer is not invoking UB.

There is a lot of debate over this topic, which basically boils down to whether indirection through a null pointer is itself UB.
The only questionable thing that happens in your example is the evaluation of the object expression. In particular, d->a is equivalent to (*d).a according to [expr.ref]/2:

The expression E1->E2 is converted to the equivalent form (*(E1)).E2; the remainder of 5.2.5 will address only the first option (dot).

*d is just evaluated:

The postfix expression before the dot or arrow is evaluated;65 the result of that evaluation, together with the id-expression, determines the result of the entire postfix expression.

65) If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.

Let's extract the critical part of the code. Consider the expression statement

*d;

In this statement, *d is a discarded value expression according to [stmt.expr]. So *d is solely evaluated1, just as in d->a.
Hence if *d; is valid, or in other words the evaluation of the expression *d, so is your example.

Does indirection through null pointers inherently result in undefined behavior?

There is the open CWG issue #232, created over fifteen years ago, which concerns this exact question. A very important argument is raised. The report starts with

At least a couple of places in the IS state that indirection through a null pointer produces undefined behavior: 1.9 [intro.execution] paragraph 4 gives "dereferencing the null pointer" as an example of undefined behavior, and 8.3.2 [dcl.ref] paragraph 4 (in a note) uses this supposedly undefined behavior as justification for the nonexistence of "null references."

Note that the example mentioned was changed to cover modifications of const objects instead, and the note in [dcl.ref] - while still existing - is not normative. The normative passage was removed to avoid commitment.

However, 5.3.1 [expr.unary.op] paragraph 1, which describes the unary "*" operator, does not say that the behavior is undefined if the operand is a null pointer, as one might expect. Furthermore, at least one passage gives dereferencing a null pointer well-defined behavior: 5.2.8 [expr.typeid] paragraph 2 says

If the lvalue expression is obtained by applying the unary * operator to a pointer and the pointer is a null pointer value (4.10 [conv.ptr]), the typeid expression throws the bad_typeid exception (18.7.3 [bad.typeid]).

This is inconsistent and should be cleaned up.

The last point is especially important. The quote in [expr.typeid] still exists and appertains to glvalues of polymorphic class type, which is the case in the following example:

int main() try {

    // Polymorphic type
    class A
    {
        virtual ~A(){}
    };

    typeid( *((A*)0) );

}
catch (std::bad_typeid)
{
    std::cerr << "bad_exception\n";
}

The behavior of this program is well-defined (an exception will be thrown and catched), and the expression *((A*)0) is evaluated as it isn't part of an unevaluated operand. Now if indirection through null pointers induced UB, then the expression written as

*((A*)0);

would be doing just that, inducing UB, which seems nonsensical when compared to the typeid scenario. If the above expression is merely evaluated as every discarded-value expression is1, where is the crucial difference that makes the evaluation in the second snippet UB? There is no existing implementation that analyzes the typeid-operand, finds the innermost, corresponding dereference and surrounds its operand with a check - there would be a performance loss, too.

A note in that issue then ends the short discussion with:

We agreed that the approach in the standard seems okay: p = 0; *p; is not inherently an error. An lvalue-to-rvalue conversion would give it undefined behavior.

I.e. the committee agreed upon this. Although the proposed resolution of this report, which introduced so-called "empty lvalues", was never adopted…

However, “not modifiable” is a compile-time concept, while in fact this deals with runtime values and thus should produce undefined behavior instead. Also, there are other contexts in which lvalues can occur, such as the left operand of . or .*, which should also be restricted. Additional drafting is required.

that does not affect the rationale. Then again, it should be noted that this issue even precedes C++03, which makes it less convincing while we approach C++17.


CWG-issue #315 seems to cover your case as well:

Another instance to consider is that of invoking a member function from a null pointer:

  struct A { void f () { } };
  int main ()
  {
    A* ap = 0;
    ap->f ();
  }

[…]

Rationale (October 2003):

We agreed the example should be allowed. p->f() is rewritten as (*p).f() according to 5.2.5 [expr.ref]. *p is not an error when p is null unless the lvalue is converted to an rvalue (4.1 [conv.lval]), which it isn't here.

According to this rationale, indirection through a null pointer per se does not invoke UB without further lvalue-to-rvalue conversions (=accesses to stored value), reference bindings, value computations or the like. (Nota bene: Calling a non-static member function with a null pointer should invoke UB, albeit merely hazily disallowed by [class.mfct.non-static]/2. The rationale is outdated in this respect.)

I.e. a mere evaluation of *d does not suffice to invoke UB. The identity of the object is not required, and neither is its previously stored value. On the other hand, e.g.

*p = 123;

is undefined since there is a value computation of the left operand, [expr.ass]/1:

In all cases, the assignment is sequenced after the value computation of the right and left operands

Because the left operand is expected to be a glvalue, the identity of the object referred to by that glvalue must be determined as mentioned by the definition of evaluation of an expression in [intro.execution]/12, which is impossible (and thus leads to UB).


1 [expr]/11:

In some contexts, an expression only appears for its side effects. Such an expression is called a discarded-value expression. The expression is evaluated and its value is discarded. […]. The lvalue-to-rvalue conversion (4.1) is applied if and only if the expression is a glvalue of volatile-qualified type and […]

4
votes

From the C++ Draft Standard N3337:

9.4 Static members

2 A static member s of class X may be referred to using the qualified-id expression X::s; it is not necessary to use the class member access syntax (5.2.5) to refer to a static member. A static member may be referred to using the class member access syntax, in which case the object expression is evaluated.

And in the section about object expression...

5.2.5 Class member access

4 If E2 is declared to have type “reference to T,” then E1.E2 is an lvalue; the type of E1.E2 is T. Otherwise, one of the following rules applies.

— If E2 is a static data member and the type of E2 is T, then E1.E2 is an lvalue; the expression designates the named member of the class. The type of E1.E2 is T.

Based on the last paragraph of the standard, the expressions:

  d->fun();
  std::cout << d->a;

work because they both designate the named member of the class regardless of the value of d.

4
votes

runs fine and produces expected output instead of any runtime error.

That's a basic assumption error. What you are doing is undefined behavior, which means that your claim for any kind of "expected output" is faulty.

Addendum: Note that, while there is a CWG defect (#315) report that is closed as "in agreement" of not making the above UB, it relies on the positive closing of another CWG defect (#232) that is still active, and hence none of it is added to the standard.

Let me quote a part of a comment from James McNellis to an answer to a similar Stack Overflow question:

I don't think CWG defect 315 is as "closed" as its presence on the "closed issues" page implies. The rationale says that it should be allowed because "*p is not an error when p is null unless the lvalue is converted to an rvalue." However, that relies on the concept of an "empty lvalue," which is part of the proposed resolution to CWG defect 232, but which has not been adopted.

1
votes

The expressions d->fun and d->a() both cause evaluation of *d ([expr.ref]/2).

The complete definition of the unary * operator from [expr.unary.op]/1 is:

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.

For the expression d there is no "object or function to which the expression points" . Therefore this paragraph does not define the behaviour of *d.

Hence the code is undefined by omission, since the behaviour of evaluating *d is not defined anywhere in the Standard.

0
votes

What you are seeing here is what I would consider an ill-conceived and unfortunate design choice in the specification of the C++ language and many other languages that belong to the same general family of programming languages.

These languages allow you to refer to static members of a class using a reference to an instance of the class. The actual value of the instance reference is of course ignored, since no instance is required to access static members.

So, in d->fun(); the the compiler uses the d pointer only during compilation to figure out that you are referring to a member of the demo class, and then it ignores it. No code is emitted by the compiler to dereference the pointer, so the fact that it is going to be NULL during runtime does not matter.

So, what you see happening is in perfect accordance to the specification of the language, and in my opinion the specification suffers in this respect, because it allows an illogical thing to happen: to use an instance reference to refer to a static member.

P.S. Most compilers in most languages are actually capable of issuing warnings for that kind of stuff. I do not know about your compiler, but you might want to check, because the fact that you received no warning for doing what you did might mean that you do not have enough warnings enabled.