Problem
for (int i = 0; i < 1000; i++)
    yield i;
This is indeed not valid without a yield keyword, but what if we add parentheses around the i?
for (int i = 0; i < 1000; i++)
    yield (i);
Now this is a perfectly valid call of a method named yield. So if we interpreted yield (i); as a use of the contextual keyword yield, the meaning of this valid code would change, breaking backwards compatibility.
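To make the compatibility concern concrete, here is a small sketch of a program that defines its own method named yield. The class name YieldAsMethod, the Seen list and the Run method are made up for illustration. It should compile in current C# precisely because yield is only treated specially when it is followed by return or break:

```csharp
using System.Collections.Generic;

public static class YieldAsMethod
{
    // Records the arguments passed to our own "yield" method (illustrative only).
    public static readonly List<int> Seen = new List<int>();

    // A perfectly ordinary method that happens to be named "yield".
    static void yield(int value)
    {
        Seen.Add(value);
    }

    public static List<int> Run()
    {
        Seen.Clear();
        for (int i = 0; i < 3; i++)
            yield (i);   // an ordinary method call, not an iterator statement
        return Seen;
    }
}
```

Calling YieldAsMethod.Run() gives back a list containing 0, 1 and 2. If yield (i); were reinterpreted as a keyword use, this method would silently stop calling yield(int).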
A more formal way to look at this: if we changed the grammar of C# 2 to replace statement: 'yield' 'return' expression ';' with statement: 'yield' expression ';', there would be an ambiguity between that rule and the rule for function calls, because expression can derive '(' expression ')' and 'yield' '(' expression ')' ';' could also be a function call in an expression statement.
Possible Solution 1
You could of course say that only yield i; (or any other expression that does not start with an opening parenthesis) should be interpreted as a use of the contextual keyword, while yield (i); would still be seen as a method call. However, that would be quite inconsistent and surprising behavior: adding parentheses around an expression shouldn't change the semantics like that.
Also, this would mean changing the above grammar rule to something like statement: 'yield' expressionNoStartingParen ';' and then defining expressionNoStartingParen, which would duplicate most of the actual definition of expression. That would make the grammar pretty complicated. You could work around that by describing the no-starting-parenthesis requirement in words instead of in the grammar and then using a flag to track it in actual implementations, though that would probably not be an option with most parser generators.
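As a rough illustration of how odd that rule would feel, here is a toy classifier for a single statement under this hypothetical rule. ParenRule, Kind and Classify are invented names, and the string matching is a stand-in for a real parser (it ignores things like yield = 5; entirely):

```csharp
using System;

public static class ParenRule
{
    public enum Kind { YieldStatement, MethodCall, Other }

    // Hypothetical "Possible Solution 1": yield is a keyword unless the
    // expression after it starts with an opening parenthesis.
    public static Kind Classify(string statement)
    {
        string s = statement.TrimStart();
        if (!s.StartsWith("yield", StringComparison.Ordinal))
            return Kind.Other;

        string rest = s.Substring("yield".Length);

        // Something like "yielded" is just a different identifier.
        if (rest.Length > 0 && (char.IsLetterOrDigit(rest[0]) || rest[0] == '_'))
            return Kind.Other;

        rest = rest.TrimStart();
        return rest.StartsWith("(", StringComparison.Ordinal)
            ? Kind.MethodCall
            : Kind.YieldStatement;
    }
}
```

Under this sketch, Classify("yield i;") is a YieldStatement while Classify("yield (i);") is a MethodCall. Even Classify("yield (a) + b;") ends up as a MethodCall, although a reader would probably expect it to yield the sum.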
Possible Solution 2
Another way to resolve this ambiguity, which you've mentioned in comments, would be to only interpret yield expression; as a yield statement when inside non-void methods that do not have a return statement. This would maintain backwards compatibility because such methods would be invalid in C# 1 anyway. However, this would be somewhat inconsistent, because now you could define a method named yield and call it in methods that don't use yield statements, but not in methods that do.
More importantly, this isn't what contextual keywords are usually like. Normally a contextual keyword acts as an identifier whenever it's used in any place where identifiers are valid, and can only be used as a keyword in places where identifiers could not occur. This would not be the case here. That's not only inconsistent with how contextual keywords usually work, making it harder for readers to distinguish yield-as-a-keyword from yield-as-an-identifier; it would also be much more difficult to implement:
Not only would you be unable to tell whether yield(x); is a yield statement just by looking at that line (you'd need to look at the whole method); the parser couldn't either, since it would have to know whether the method contains a return statement. This would require two distinct definitions in the grammar for bodies with and without return, plus a separate definition of what's allowed as an identifier in each one. That would be a horrible grammar to look at and also to implement.
In practice one would most likely create an ambiguous grammar and then parse yield (x); into a placeholder AST node that holds both possibilities: a yield statement and a function call. Then you'd try to typecheck both and throw away the one that doesn't typecheck. This would work, but it's pretty uncommon and would have required extensive changes to how parsing works in the compiler and how later phases work with the AST. Any other implementations of the language (Mono, Roslyn) would then also have had to deal with this complexity, making it more difficult to create new implementations.
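A minimal sketch of what such a placeholder node might look like, just to show the extra machinery involved. All of the types here (Stmt, YieldStmt, CallStmt, AmbiguousStmt, Disambiguator) are hypothetical, and the typeChecks delegate stands in for a real type checker; this is not how any actual C# compiler is structured:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public abstract class Stmt { }

public sealed class YieldStmt : Stmt
{
    public string Expr { get; }
    public YieldStmt(string expr) { Expr = expr; }
}

public sealed class CallStmt : Stmt
{
    public string Method { get; }
    public string Arg { get; }
    public CallStmt(string method, string arg) { Method = method; Arg = arg; }
}

// Placeholder node: the parser could not decide, so it keeps every reading.
public sealed class AmbiguousStmt : Stmt
{
    public List<Stmt> Candidates { get; } = new List<Stmt>();
}

public static class Disambiguator
{
    // Parser side: "yield (arg);" is recorded as both possible statements.
    public static AmbiguousStmt ParseYieldParen(string arg)
    {
        var node = new AmbiguousStmt();
        node.Candidates.Add(new YieldStmt(arg));
        node.Candidates.Add(new CallStmt("yield", arg));
        return node;
    }

    // Checker side: keep the single candidate that typechecks.
    public static Stmt Resolve(AmbiguousStmt node, Func<Stmt, bool> typeChecks)
    {
        List<Stmt> ok = node.Candidates.Where(typeChecks).ToList();
        if (ok.Count != 1)
            throw new InvalidOperationException(
                "Expected exactly one valid reading, found " + ok.Count);
        return ok[0];
    }
}
```

For a method without a return statement and with no yield method in scope, one would call Resolve(node, s => s is YieldStmt). Every later compiler phase would have to be aware that such AmbiguousStmt nodes can exist until they are resolved.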
Conclusion
So in conclusion, both ways to work around this issue lead to some inconsistencies, and the latter is also significantly more difficult to implement. Only treating yield as special when used together with return avoids the ambiguity without creating any inconsistencies and is easy to implement.
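For comparison, here is a short sketch of how the two uses coexist under the actual rule. YieldReturnDemo, Calls and Numbers are illustrative names. As far as I know, both statements are accepted side by side in the same iterator method, because only yield followed by return (or break) is treated as a keyword:

```csharp
using System.Collections.Generic;

public static class YieldReturnDemo
{
    // Counts how often our own "yield" method was called (illustrative only).
    public static int Calls;

    // An ordinary method named "yield".
    static void yield(int value)
    {
        Calls++;
    }

    public static IEnumerable<int> Numbers()
    {
        for (int i = 0; i < 3; i++)
        {
            yield (i);       // method call: "yield" is a plain identifier here
            yield return i;  // iterator statement: "yield" precedes "return"
        }
    }
}
```

Enumerating Numbers() produces 0, 1 and 2 and increments Calls three times along the way, so neither use interferes with the other.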