
If you compile the following code with Visual Studio 2010:

    using System;

    public struct A
    {
        public static implicit operator B(A a)
        {
            Console.WriteLine("11111111111");
            return new B();
        }
    }
    public struct B
    { }
    public static class Program
    {
        // The argument is A?, yet the cast to B compiles.
        public static B F(A? a)
        {
            return (B)a;
        }
    }

and inspect the result with ILSpy, return (B)a; turns out to be compiled as return A.op_Implicit(a.Value).

By my reading of section 6.4.5, 'User-defined explicit conversions', of the C# 4.0 specification, this code should produce a compile-time error.

However, ECMA-334 section 13.4.4, 'User-defined explicit conversions', states a different rule, and the compiler's handling of the code above seems to comply with it.

C# 4.0:

Find the set of applicable user-defined and lifted conversion operators, U. This set consists of the user-defined and lifted implicit or explicit conversion operators declared by the classes or structs in D that convert from a type encompassing or encompassed by S to a type encompassing or encompassed by T. If U is empty, the conversion is undefined and a compile-time error occurs.

ECMA 334:

Find the set of applicable conversion operators, U. This set consists of the user-defined and, if S and T are both nullable, lifted implicit or explicit conversion operators (§13.7.3) declared by the classes or structs in D that convert from a type encompassing or encompassed by S to a type encompassing or encompassed by T. If U is empty, there is no conversion, and a compile-time error occurs.

Am I correct that VS2010 does not comply with the "Evaluation of user-defined conversions" section in the C# 4.0 spec, but does comply with the ECMA spec?

There is no sentence, including the title, which has a question mark in it. All you've done is make statements. Are we supposed to guess what your question is? - Eric Lippert
Updated the title. Please review. - Vince

1 Answer


Let's first see what should happen when we follow various rules.

Following the rules in the C# 4.0 spec:

  • The set D of types to search for user-defined conversions consists of A and B.
  • The set U of applicable conversions consists of the user-defined implicit conversion from A to B, and the lifted user-defined implicit conversion from A? to B?.
  • We must now choose the unique best of those two elements of U.
  • The most specific source type is A?.
  • The most specific target type is B.
  • U does not contain any conversion from A? to B, so this is ambiguous.

This should make sense. We do not know here whether we should use the lifted conversion, converting from A? to B? and then from B? to B, or the unlifted conversion, converting from A? to A and then from A to B.


ASIDE:

Upon deeper reflection it is not clear that this is a difference which makes any difference.

Suppose we use the lifted conversion. If the A? operand is non-null then we will convert from A? to A, then A to B, then B to B?, then B? back to B, which will succeed. If the operand is null then we will convert from A? directly to a null B?, and then crash when unwrapping that to B.

Suppose we use the unlifted conversion and the A? operand is non-null. Then we convert from A? to A, A to B, done. If the operand is null then we crash when unwrapping A? to A.

So in this case both versions of the conversion have exactly the same observable behavior, so it doesn't really matter which we choose, and so calling this an ambiguity is unfortunate. However, this does not change the fact that clearly the compiler is not following the letter of the C# 4 specification.
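To make the aside concrete, here is a minimal sketch that writes out both conversion paths by hand. The names ConversionPaths, ViaUnlifted, ViaLifted and Throws are hypothetical helpers introduced only for this illustration; they are not what the compiler emits. The sketch assumes the same A and B structs as in the question, minus the Console.WriteLine call.

```csharp
using System;

public struct A
{
    // User-defined conversion from the question (output call omitted).
    public static implicit operator B(A a) { return new B(); }
}

public struct B { }

// Hypothetical helper class, for illustration only.
public static class ConversionPaths
{
    // Unlifted path: A? -> A (throws if null), then A -> B.
    public static B ViaUnlifted(A? a)
    {
        A unwrapped = a.Value;   // InvalidOperationException when a is null
        return unwrapped;        // user-defined A -> B
    }

    // Lifted path: A? -> B? (null propagates), then B? -> B (throws if null).
    public static B ViaLifted(A? a)
    {
        B? lifted = a.HasValue ? (B?)(B)a.Value : null;
        return lifted.Value;     // InvalidOperationException when lifted is null
    }

    // Reports whether a conversion path throws for the given operand.
    public static bool Throws(Func<A?, B> path, A? operand)
    {
        try { path(operand); return false; }
        catch (InvalidOperationException) { return true; }
    }
}
```

Calling ConversionPaths.Throws with either path and a null operand reports a failure, and with new A() reports success, which is the sense in which the two paths are indistinguishable from the outside. The only difference is where the unwrapping fails: before the operator call on the unlifted path, after the null propagation on the lifted one.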


What about the ECMA spec?

  • The set U consists of the user-defined conversion from A to B, but not the lifted conversion because S (which is A?) and T (which is B) are not both nullable.

And now we have only one to choose from, so overload resolution has an easy job of it.
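The difference between the two rule sets can be sketched with a toy model. This is not how any compiler is implemented, and everything in it (CandidateSetSketch, Op, BuildU, Resolve, the ecmaRule flag) is a hypothetical name made up for this illustration. Types are plain strings, the only declared operator is A -> B, encompassment checks are skipped, and the "most specific type" rules are reduced to the exact-match shortcut that is all this example needs. It only handles S in {A, A?} and T in {B, B?}.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: hypothetical names, strings instead of real types.
public static class CandidateSetSketch
{
    public sealed class Op
    {
        public readonly string From;
        public readonly string To;
        public Op(string from, string to) { From = from; To = to; }
        public override string ToString() { return From + " -> " + To; }
    }

    static bool IsNullable(string type)
    {
        return type.EndsWith("?", StringComparison.Ordinal);
    }

    // Builds U for the single declared operator A -> B.
    // ecmaRule == false: the lifted form A? -> B? is always added (C# 4.0 wording).
    // ecmaRule == true: it is added only if S and T are both nullable (ECMA wording).
    public static List<Op> BuildU(string s, string t, bool ecmaRule)
    {
        var u = new List<Op> { new Op("A", "B") };
        if (!ecmaRule || (IsNullable(s) && IsNullable(t)))
            u.Add(new Op("A?", "B?"));
        return u;
    }

    // Simplified selection: prefer an exact match on S and on T; otherwise fall
    // back to the only candidate's type. Then look for one operator SX -> TX.
    public static string Resolve(string s, string t, bool ecmaRule)
    {
        List<Op> u = BuildU(s, t, ecmaRule);
        string sx = u.Any(o => o.From == s) ? s : u.Select(o => o.From).Single();
        string tx = u.Any(o => o.To == t) ? t : u.Select(o => o.To).Single();
        Op best = u.SingleOrDefault(o => o.From == sx && o.To == tx);
        return best == null ? "ambiguous" : best.ToString();
    }
}
```

With S = "A?" and T = "B", Resolve returns "ambiguous" under the C# 4.0 wording (ecmaRule: false) and "A -> B" under the ECMA wording (ecmaRule: true), matching the two walkthroughs above.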

However, this does not imply that the compiler is following the rules of the ECMA spec. In fact it is following the rules of neither spec. It is closer to the ECMA spec in that it does not add both operators to the candidate set, and therefore, in this simple case, chooses the only member of the candidate set. But in fact it never adds the lifted operator to the candidate set, even when both the source and target are nullable value types. Moreover it violates the ECMA spec in numerous other ways that would be shown up by more complex examples:

  • Lifted conversion semantics (that is, inserting a null check before calling the method and skipping the call if the operand is null) are allowed on user-defined conversions from a non-nullable struct type to a nullable struct type, pointer type or reference type! That is, if you have a conversion from A to string, then you get a lifted conversion from A? to string that produces a null string if the operand is null. This rule is found nowhere in either spec.

  • According to the spec, the types that must encompass or be encompassed by each other are the type of the expression being converted (called S in the specification) and the formal parameter type of the user-defined conversion. If the expression being converted has a nullable value type, the C# compiler instead checks for encompassment of its underlying type (S0 in the spec). This means that certain conversions which ought to be rejected are instead accepted.

  • According to the spec, the best target type should be determined by looking at the set of output types of the various conversions, lifted or unlifted. A user-defined conversion from A to B should be treated as having an output type of B for the purposes of finding the best output type. But if you had a cast from A to B? then the compiler would actually consider B? as the output type of the unlifted conversion for the purposes of determining the most specific output type!
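The first bullet can be observed with a small experiment. This is a sketch under assumptions: the explicit conversion to string, the Calls counter, and the LiftedDemo class are hypothetical additions for illustration, and the behavior described is that of the Microsoft compilers discussed in this answer, not something either spec promises.

```csharp
using System;

public struct A
{
    // Counts operator invocations; illustration only.
    public static int Calls;

    // Hypothetical conversion to a reference type.
    public static explicit operator string(A a) { Calls++; return "A"; }

    // Conversion from the question.
    public static implicit operator B(A a) { Calls++; return new B(); }
}

public struct B { }

// Hypothetical helper class, for illustration only.
public static class LiftedDemo
{
    // A? -> string: the compiler applies lifted semantics, so a null
    // operand yields a null string without calling the operator.
    public static string ToText(A? a)
    {
        return (string)a;
    }

    // A? -> B?: the ordinary lifted conversion; null maps to null.
    public static B? ToNullableB(A? a)
    {
        return (B?)a;
    }
}
```

LiftedDemo.ToText(null) returns null and leaves A.Calls unchanged, whereas a cast from a null A? to the non-nullable B throws, as in the question's F method. The contrast is the point: the target type's ability to hold null decides whether the null check is inserted.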

I could go on (and on and on...) for hours about these and numerous other bugs in user-defined conversion processing. We've barely scratched the surface here; we haven't even gotten into what happens when generics get involved. But I will spare you. The takeaway here is: you cannot narrowly parse any version of the C# specification and from it determine what will happen in a complicated user-defined conversion scenario. The compiler usually does what the user expects, and usually does it for the wrong reasons.

This is both one of the most complicated parts of the specification, and the part of the specification that the compiler complies with the least, which is a bad combination. This is deeply unfortunate.

I made a valiant attempt to bring Roslyn into compliance with the specification but I failed; doing so introduced far, far too many real-world breaking changes. Instead I made Roslyn copy the behavior of the original compiler, just with a much cleaner, easier-to-understand implementation.