Eric Lippert's comments in this question have left me thoroughly confused. What is the difference between casting and conversion in C#?
13 Answers
I believe what Eric is trying to say is:
Casting is a term describing syntax (hence the Syntactic meaning).
Conversion is a term describing what actions are actually taken behind the scenes (and thus the Semantic meaning).
A cast-expression is used to convert explicitly an expression to a given type.
And
A cast-expression of the form (T)E, where T is a type and E is a unary-expression, performs an explicit conversion (§13.2) of the value of E to type T.
Seems to back that up by saying that a cast operator in the syntax performs an explicit conversion.
I am reminded of the anecdote told by Richard Feynman where he is attending a philosophy class and the professor askes him "Feynman, you're a physicist, in your opinion is an electron an 'essential object'?" So Feynman asks the clarifying question "is a brick an essential object?" to the class. Every student has a different answer to that question. They say that the fundamental abstract notion of "brickness" is the essential object. No, one specific, unique brick is the essential object. No, the parts of the brick you can empirically observe is the essential object. And so on.
Which is of course not to answer your question.
I'm not going to go through all these dozen answers and debate with their authors about what I really meant. I'll write a blog article on the subject in a few weeks and we'll see if that throws any light on the matter.
How about an analogy though, a la Feynman. You wish to bake a loaf of banana bread Saturday morning (as I do almost every Saturday morning.) So you consult The Joy of Cooking, and it says "blah blah blah... In another bowl, whisk together the dry ingredients. ..."
Clearly there is a strong relationship between that instruction and your actions tomorrow morning, but equally clearly it would be a mistake to conflate the instruction with the action. The instruction consists of text. It has a location, on a particular page. It has punctuation. Were you to be in the kitchen whisking together flour and baking soda, and someone asked "what's your punctuation right now?", you'd probably think it was an odd question. The action is related to the instruction, but the textual properties of the instruction are not properties of the action.
A cast is not a conversion in the same way that a recipe is not the act of baking a cake. A recipe is text which describes an action, which you can then perform. A cast operator is text which describes an action - a conversion - which the runtime can then perform.
From the C# Spec 14.6.6:
A cast-expression is used to convert explicitly an expression to a given type.
...
A cast-expression of the form (T)E, where T is a type and E is a unary-expression, performs an explicit conversion (§13.2) of the value of E to type T.
So casting is a syntactic construct used to instruct the compiler to invoke explicit conversions.
From the C# Spec §13:
A conversion enables an expression of one type to be treated as another type. Conversions can be implicit or explicit, and this determines whether an explicit cast is required. [Example: For instance, the conversion from type int to type long is implicit, so expressions of type int can implicitly be treated as type long. The opposite conversion, from type long to type int, is explicit, so an explicit cast is required.
So conversions are where the actual work gets done. You'll note that the cast-expression quote says that it performs explicit conversions but explicit conversions are a superset of implicit conversions, so you can also invoke implicit conversions (even if you don't have to) via cast-expressions.
Just my understanding, probably much too simple:
When casting the essential data remains intact (same internal representation) - "I know this is a dictionary, but you can use it as a ICollection".
When converting, you are changing the internal representation to something else - "I want this int to be a string".
After reading Eric's comments, an attempt in plain english:
Casting means that the two types are actually the same at some level. They may implement the same interface or inherit from the same base class or the target can be "same enough" (a superset?) for the cast to work such as casting from Int16 to Int32.
Converting types then means that the two objects may be similar enough to be converted. Take for example a string representation of a number. It is a string, it cannot simply be cast into a number, it needs to be parsed and converted from one to the other, and, the process may fail. It may fail for casting as well but I imagine that's a much less expensive failure.
And that's the key difference between the two concepts I think. Conversion will entail some sort of parsing, or deeper analysis and conversion of the source data. Casting does not parse. It simply attempts a match at some polymorphic level.
Casting is the creation of a value of one type from another value of another type. Conversion is a type of casting in which the internal representation of the value must also be changed (rather than just its interpretation).
In C#, casting and converting are both done with a cast-expression:
( type ) unary-expression
The distinction is important (and the point is made in the comment) because only conversions may be created by a conversion-operator-declarator. Therefore, only (implicit or explicit) conversions may be created in code.
A non-conversion implicit cast is always available for subtype-to-supertype casts, and a non-conversion explicit cast is always available for supertype-to-subtype casts. No other non-conversion casts are allowed.
This page of the MSDN C# documentation suggests that a cast is specific instance of conversion: the "explicit conversion." That is, a conversion of the form x = (int)y
is a cast.
Automatic data type changes (such as myLong = myInt
) are the more generic "conversion."
A cast is an operator on a class/struct. A conversion is a method/process on one or the other of the affected classes/structs, or may be in a complete different class/struct (i.e. Converter.ToInt32()
Cast operators come in two flavors: implicit and explicit
Implicit cast operators indicate that data of one type (say, Int32) can always be represented as another type (decimal) without loss of data/precision.
int i = 25;
decimal d = i;
Explicit cast operators indicate that data of one type (decimal) can always be faithfully represented as another type (int), but there may be loss of data/precision. Therefor the compiler requires you to explicitly state that you are aware of this and want to do it anyway, through use of the explicit cast syntax:
decimal d = 25.0001;
int i = (int)d;
Conversion takes two types that are not necessarily related in any way, and attempts to convert one into the other through some process, such as parsing. If all known conversion algorithms fail, the process may either throw an exception or return a default value:
string s = "200";
int i = Converter.ToInt32(s); // set i to 200 by parsing s
string s = "two hundred";
int i = Converter.ToInt32(s); // sets i to 0 because the parse fails
Eric's references to syntactic conversion vs. symantic conversion are basically an operator vs. methodology distinction.
A cast is syntactical, and may or may not involve a conversion (depending on the type of cast). As you know, C++ allows specifying the type of cast you want to use.
Casting up/down the hierarchy may or may not be considered conversion, depending on who you ask (and what language they're talking about!)
Eric (C#) is saying that casting to a different type always involves a conversion, though that conversion may not even change the internal representation of the instance.
A C++-guy will disagree, since a static_cast
might not result in any extra code (so the "conversion" is not actually real!)
Casting and Conversion are basically the same concept in C#, except that a conversion may be done using any method such as Object.ToString()
. Casting is only done with the casting operator (T) E
, that is described in other posts, and may make use of conversions or boxing.
Which conversion method does it use? The compiler decides based on the classes and libraries provided to the compiler at compile-time. If an implicit conversion exists, you are not required to use the casting operator. Object o = String.Empty
. If only explicit conversions exist, you must use the casting operator. String s = (String) o
.
You can create explicit
and implicit
conversion operators in your own classes. Note: conversions can make the data look very similar or nothing like the original type to you and me, but it's all defined by the conversion methods, and makes it legal to the compiler.
Casting always refers to the use of the casting operator. You can write
Object o = float.NaN;
String s = (String) o;
But if you access s
, in for example a Console.WriteLine
, you will receive a runtime InvalidCastException
. So, the cast operator still attempts to use conversion at access time, but will settle for boxing during assignment.
INSERTED EDIT#2: isn't it hilariously inconsistent myopia that since I provided this answer, the question has been marked as duplicate of a question which asks, "Is casting the same thing as converting?". And the answers of "No" are overwhelmingly upvoted. Yet my answer here which points out the generative essence for why casts are not the same as conversion is overwhelmingly downvoted (yet I have one +1 in the comments). I suppose that readers have a difficult time with comprehending that casts apply at the denotational syntax/semantics layer and conversions apply at the operational semantics layer. For example, a cast of a reference (or pointer in C/C++)—referring to a boxed data type—to another data type, doesn't (in all languages and scenarios) generate a conversion of the boxed data. For example, in C float a[1]; int* p = (int*)&a;
doesn't insure that *p
refers to int
data.
A compiler compiles from denotational semantics to operational semantics. The compilation is not bijective, i.e. it isn't guaranteed to uncompile (e.g. Java, LLVM, asm.js, or C# bytecode) back to any denotational syntax which compiles to that bytecode (e.g. Scala, Python, C#, C via Emscripten, etc). Thus the two layers not the same.
Thus most obviously a 'cast' and a 'conversion' are not the same thing. My answer here is pointing out that the terms apply to two different layers of semantics. Casts apply to the semantics of what the denotational layer (input syntax of the compiler) knows about. Conversions apply to the semantics of what the operational (runtime or intermediate bytecode) layer knows about. I used the standard term of 'erased' to describe what happens to denotational semantics that aren't explicitly recorded in the operational semantics layer.
For example, reified generics are an example of recording denotational semantics in the operational semantics layer, but they have the disadvantage of making the operational semantics layer incompatible with higher-order denotational semantics, e.g. this is why it was painful to consider implementing Scala's higher-kinded generics on C#'s CLR because C#'s denotational semantics for generics was hard-coded at the operational semantics layer.
Come on guys, stop downvoting someone who knows a lot more than you do. Do your homework first before you vote.
INSERTED EDIT: Casting is an operation that happens at the denotational semantics layer (where types are expressed in their full semantics). A cast may (e.g. explicit conversion) or may not (e.g. upcasting) cause a conversion at the runtime semantic layer. The downvotes on my answer (and the upvoting on Marc Gavin's comment) indicates to me that most people don't understand the differences between denotational semantics and operational (execution) semantics. Sigh.
I will state Eric Lippert's answer more simply and more generally for all languages, including C#.
A cast is syntax so (like all syntax) is erased at compile-time; whereas, a conversion causes some action at runtime.
That is a true statement for every computer language that I am aware of in the entire universe. Note that the above statement does not say that casting and conversions are mutually exclusive.
A cast may cause a conversion at runtime, but there are cases where it may not.
The reason we have two distinct words, i.e. cast and conversion, is we need a way to separately describe what is happening in syntax (the cast operator) and at runtime (conversion, or type check and possible conversion).
It is important that we maintain this separation-of-concepts, because in some programming languages the cast never causes a conversion. Also so that we understand implicit casting (e.g. upcasting) is happening only at compile-time. The reason I wrote this answer is because I want to help readers understand in terms of being multilingual with computer languages. And also to see how that general definition correctly applies in the C# case as well.
Also I wanted to help readers see how I generalize concepts in my mind, which helps me as computer language designer. I am trying to pass along the gift of a very reductionist, abstract way of thinking. But I am also trying to explain this in a very practical way. Please feel free to let me know in the comments if I need to improve the elucidation.
Eric Lippert wrote:
A cast is not a conversion in the same way that a recipe is not the act of baking a cake. A recipe is text which describes an action, which you can then perform. A cast operator is text which describes an action - a conversion - which the runtime can then perform.
The recipe is what is happening in syntax. Syntax is always erased, and replaced with either nothing or some runtime code.
For example, I can write a cast in C# that does nothing and is entirely erased at compile-time when it is does not cause a change in the storage requirements or is upcasting. We can clearly see that a cast is just syntax, that makes no change to the runtime code.
int x = 1;
int y = (int)x;
Giraffe g = new Giraffe();
Animal a = (Animal)g;
That can be used for documentation purposes (yet noisy), but it is essential in languages that have type inference, where a cast is sometimes necessary to tell the compiler what type you wish it to infer.
For an example, in Scala a None
has the type Option[Nothing]
where Nothing
is the bottom type that is the sub-type of all possible types (not super-type). So sometimes when using None, the type needs to be casted to a specific type, because Scala only does local type inference, thus can't always infer the type you intended.
// (None : Option[Int]) casts None to Option[Int]
println(Some(7) <*> ((None : Option[Int]) <*> (Some(9) > add)))
A cast could know at compile-time that it requires a type conversion, e.g. int x = (int)1.5
, or could require a type check and possible type conversion at runtime, e.g. downcasting. The cast (i.e. the syntax) is erased and replaced with the runtime action.
Thus we can clearly see that equating all casts with explicit conversion, is an error of implication in the MSDN documentation. That documentation is intending to say that explicit conversion requires a cast operator, but it should not be trying to also imply that all casts are explicit conversions. I am confident that Eric Lippert can clear this up when he writes the blog he promised in his answer.
ADD: From the comments and chat, I can see that there is some confusion about the meaning of the term erased.
The term 'erased' is used to describe information that was known at compile-time, which is not known at runtime. For example, types can be erased in non-reified generics, and it is called type erasure.
Generally speaking all the syntax is erased, because generally CLI is not bijective (invertible, and one-to-one) with C#. You cannot always go backwards from some arbitrary CLI code back to the exact C# source code. This means information has been erased.
Those who say erased is not the right term, are conflating the implementation of a cast with the semantic of the cast. The cast is a higher-level semantic (I think it is actually higher than syntax, it is denotational semantics at least in case of upcasting and downcasting) that says at that level of semantics that we want to cast the type. Now how that gets done at runtime is entirely different level of semantics. In some languages it might always be a NOOP. For example, in Haskell all typing information is erased at compile-time.