253
votes

I read about Java's type erasure on Oracle's website.

When does type erasure occur? At compile time or runtime? When the class is loaded? When the class is instantiated?

A lot of sites (including the official tutorial mentioned above) say type erasure occurs at compile time. If the type information is completely removed at compile time, how does the JDK check type compatibility when a method using generics is invoked with no type information or wrong type information?

Consider the following example: Say class A has a method, empty(Box<? extends Number> b). We compile A.java and get the class file A.class.

public class A {
    public static void empty(Box<? extends Number> b) {}
}
public class Box<T> {}

Now we create another class B which invokes the method empty with a non-parameterized argument (raw type): empty(new Box()). If we compile B.java with A.class in the classpath, javac is smart enough to raise a warning. So A.class has some type information stored in it.

public class B {
    public static void invoke() {
        // java: unchecked method invocation:
        //  method empty in class A is applied to given types
        //  required: Box<? extends java.lang.Number>
        //  found:    Box
        // java: unchecked conversion
        //  required: Box<? extends java.lang.Number>
        //  found:    Box
        A.empty(new Box());
    }
}

My guess would be that type erasure occurs when the class is loaded, but it is just a guess. So when does it happen?

7
@afryingpan: The article mentioned in my answer explains in detail how and when type erasure happens. It also explains when type information is kept. In other words: reified generics are available in Java, contrary to widespread belief. See: rgomes.info/using-typetokens-to-retrieve-generic-parameters (Richard Gomes)

7 Answers

258
votes

Type erasure applies to the use of generics. There's definitely metadata in the class file to say whether or not a method/type is generic, and what the constraints are etc. But when generics are used, they're converted into compile-time checks and execution-time casts. So this code:

List<String> list = new ArrayList<String>();
list.add("Hi");
String x = list.get(0);

is compiled into

List list = new ArrayList();
list.add("Hi");
String x = (String) list.get(0);

At execution time there's no way of finding out that T=String for the list object - that information is gone.

... but the List<T> interface itself still advertises itself as being generic.

EDIT: Just to clarify, the compiler does retain the information about the variable being a List<String> - but you still can't find out that T=String for the list object itself.
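
To make that distinction concrete, here is a minimal sketch. The class name ErasureDemo and the field names are made up for illustration. It shows that two list objects with different type arguments share one runtime class, while the declared type of a field is still readable through reflection.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Hypothetical field, present only so that its declared type can be inspected.
    static List<String> names = new ArrayList<String>();

    // The two list objects have the same runtime class; their type arguments are gone.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    // The declaration of the field keeps its generic signature in the class file.
    static String declaredFieldType() {
        try {
            Field field = ErasureDemo.class.getDeclaredField("names");
            return field.getGenericType().getTypeName();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());   // prints true
        System.out.println(declaredFieldType());  // prints java.util.List<java.lang.String>
    }
}
```

Note that the second result comes from the field declaration, not from the ArrayList object it refers to.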

104
votes

The compiler is responsible for understanding generics at compile time. The compiler is also responsible for throwing away this "understanding" of generic classes, in a process we call type erasure. All of this happens at compile time.

Note: Contrary to the belief of most Java developers, it is possible to keep compile-time type information and retrieve it at runtime, albeit in a very restricted way. In other words: Java does provide reified generics in a very restricted way.

Regarding type erasure

Notice that, at compile time, the compiler has full type information available, but this information is, in general, intentionally dropped when the byte code is generated, in a process known as type erasure. It was done this way for compatibility reasons: the intention of the language designers was to provide full source code compatibility and full byte code compatibility between versions of the platform. If it had been implemented differently, you would have to recompile your legacy applications when migrating to newer versions of the platform. The way it was done, all method signatures are preserved (source code compatibility) and you don't need to recompile anything (binary compatibility).

Regarding reified generics in Java

If you need to keep compile-time type information, you need to employ anonymous classes. The point is: in the very special case of anonymous classes, it is possible to retrieve full compile-time type information at runtime which, in other words means: reified generics. This means that the compiler does not throw away type information when anonymous classes are involved; this information is kept in the generated binary code and the runtime system allows you to retrieve this information.
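
A minimal sketch of the idea follows. It is not the code from the article; the TypeToken helper and the TypeTokenDemo class are hypothetical names. The anonymous subclass records its generic superclass in the class file, and the constructor reads the type argument back through getGenericSuperclass.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

class TypeTokenDemo {
    // Hypothetical helper: each anonymous subclass captures T in its superclass signature.
    abstract static class TypeToken<T> {
        private final Type type;

        protected TypeToken() {
            Type superclass = getClass().getGenericSuperclass();
            if (!(superclass instanceof ParameterizedType)) {
                // Happens when the token is created as a raw type: new TypeToken() {}
                throw new IllegalStateException("TypeToken needs a type argument");
            }
            this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
        }

        Type getType() {
            return type;
        }
    }

    static String captured() {
        // The trailing {} creates the anonymous subclass that carries the type argument.
        TypeToken<List<String>> token = new TypeToken<List<String>>() {};
        return token.getType().getTypeName();
    }

    public static void main(String[] args) {
        System.out.println(captured()); // prints java.util.List<java.lang.String>
    }
}
```

The information is read from the anonymous class, not from any List instance, so this only works where the type argument is written out in source at the point the token is created.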

I've written an article about this subject:

https://rgomes.info/using-typetokens-to-retrieve-generic-parameters/

One caveat about the technique described in the article above is that it is obscure to most developers. Although it works, and works well, most developers feel confused or uncomfortable with it. If you have a shared code base or plan to release your code to the public, I do not recommend it. On the other hand, if you are the sole user of your code, you can take advantage of the power this technique gives you.

Sample code

The article above has links to sample code.

35
votes

If you have a field that is a generic type, its type parameters are compiled into the class.

If you have a method that takes or returns a generic type, those type parameters are compiled into the class.

This information is what the compiler uses to tell you that you can't pass a Box<String> to the empty(Box<? extends Number>) method.

The API is complicated, but you can inspect this type information through the reflection API with methods like getGenericParameterTypes, getGenericReturnType, and, for fields, getGenericType.
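
As a rough sketch of what that looks like, the following mirrors the Box and A classes from the question as nested classes (the SignatureDemo wrapper is a made-up name) and compares the erased parameter type with the generic one recorded in the class file.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Type;

class SignatureDemo {
    // Stand-ins for the Box and A classes from the question.
    static class Box<T> {}

    static class A {
        public static void empty(Box<? extends Number> b) {}
    }

    private static Method emptyMethod() {
        try {
            // The lookup itself uses the erased parameter type, Box.class.
            return A.class.getDeclaredMethod("empty", Box.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // What the JVM uses to link calls: the raw type only.
    static Class<?> erasedParameterType() {
        return emptyMethod().getParameterTypes()[0];
    }

    // What javac reads from the class file when checking callers: the full declaration.
    static Type genericParameterType() {
        return emptyMethod().getGenericParameterTypes()[0];
    }

    public static void main(String[] args) {
        System.out.println(erasedParameterType());
        System.out.println(genericParameterType());
    }
}
```

The second value is a ParameterizedType whose single type argument is a wildcard with the upper bound Number; its exact string form varies between JDK versions.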

If you have code that uses a generic type, the compiler inserts casts as needed (in the caller) to check types. The generic objects themselves are just the raw type; the parameterized type is "erased". So, when you create a new Box<Integer>(), there is no information about the Integer class in the Box object.

Angelika Langer's FAQ is the best reference I've seen for Java Generics.

14
votes

Generics in Java Language is a really good guide on this topic.

Generics are implemented by Java compiler as a front-end conversion called erasure. You can (almost) think of it as a source-to-source translation, whereby the generic version of loophole() is converted to the non-generic version.

So, it's at compile time. The JVM will never know which ArrayList you used.

I'd also recommend Mr. Skeet's answer on What is the concept of erasure in generics in Java?

8
votes

The term "type erasure" is not really the correct description of Java's problem with generics. Type erasure is not per se a bad thing, indeed it is very necessary for performance and is often used in several languages like C++, Haskell, D.

Before you object, please recall the correct definition of type erasure from Wikipedia

What is type erasure?

type erasure refers to the load-time process by which explicit type annotations are removed from a program, before it is executed at run-time

Type erasure means throwing away the type tags written at design time or inferred at compile time, so that the compiled program in binary code does not contain any type tags. This is the case for every programming language that compiles to binary code, except in some cases where you need runtime tags. These exceptions include, for instance, all existential types (Java reference types which are subtypeable, the Any type in many languages, union types). The reason for type erasure is that programs get transformed into a language which is in some sense uni-typed (a binary language allowing only bits), as types are only abstractions that assert a structure for their values and the appropriate semantics to handle them.

So erasure, in turn, is a perfectly normal thing.

Java's problem is different and is caused by how it reifies.

The often-made statement that Java does not have reified generics is also wrong.

Java does reify, but in the wrong way, due to backward compatibility.

What is reification?

From Wikipedia

Reification is the process by which an abstract idea about a computer program is turned into an explicit data model or other object created in a programming language.

Reification means to convert something abstract (Parametric Type) into something concrete (Concrete Type) by specialization.

We illustrate this by a simple example:

An ArrayList with definition:

ArrayList<T>
{
    T[] elems;
    ...//methods
}

is an abstraction, in detail a type constructor, which gets "reified" when specialized with a concrete type, say Integer:

ArrayList<Integer>
{
    Integer[] elems;
}

where ArrayList<Integer> is really a type.

But this is exactly what Java does not do. Instead, it always reifies abstract types with their bounds, i.e. it produces the same concrete type regardless of the parameters passed in for specialization:

ArrayList
{
    Object[] elems;
}

which is here reified with the implicit bound Object (ArrayList<T extends Object> == ArrayList<T>).

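Besides making generic arrays unusable and causing some strange errors with raw types:

List<String> l = new ArrayList<>(List.of("h", "s")); // mutable copy; List.of alone is immutable
List lRaw = l;
lRaw.add(new Object()); // compiles with only an unchecked warning
String s = l.get(2); // ClassCastException at run time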

it causes a lot of ambiguities as

void function(ArrayList<Integer> list){}
void function(ArrayList<Float> list){}
void function(ArrayList<String> list){}

refer to the same function:

void function(ArrayList list)

and therefore methods can't be overloaded on type arguments alone in Java.
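
A small sketch of that collision follows. The names OverloadDemo, takeIntegers and takeStrings are hypothetical. The two methods are named differently on purpose, since giving them the same name would be rejected by javac; reflection then shows that both have the same erased parameter type.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;

class OverloadDemo {
    // Differently named on purpose: with the same name these two would not compile.
    static void takeIntegers(ArrayList<Integer> list) {}

    static void takeStrings(ArrayList<String> list) {}

    // Returns the parameter type as the JVM sees it in the method descriptor.
    static Class<?> erasedParameter(String methodName) {
        try {
            Method method = OverloadDemo.class.getDeclaredMethod(methodName, ArrayList.class);
            return method.getParameterTypes()[0];
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean sameErasedParameter() {
        return erasedParameter("takeIntegers") == erasedParameter("takeStrings");
    }

    public static void main(String[] args) {
        System.out.println(sameErasedParameter()); // prints true
    }
}
```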

7
votes

Type erasure occurs at compile time. What type erasure means is that it will forget about the generic type, not about every type. Besides, there will still be metadata about the types being generic. For example

Box<String> b = new Box<String>();
String x = b.getDefault();

is converted to

Box b = new Box();
String x = (String) b.getDefault();

at compile time. You may get warnings not because the compiler knows which type the generic parameter stands for, but, on the contrary, because it doesn't know enough to guarantee type safety.

Additionally, the compiler does retain the type information about the parameters in a method's declaration, which you can retrieve via reflection.

This guide is the best I've found on the subject.

2
votes

I've run into type erasure on Android. In production we use Gradle with the minify option. After minification I got a fatal exception. I wrote a simple function to show the inheritance chain of my object:

public static void printSuperclasses(Class clazz) {
    Type superClass = clazz.getGenericSuperclass();

    Log.d("Reflection", "this class: " + (clazz == null ? "null" : clazz.getName()));
    Log.d("Reflection", "superClass: " + (superClass == null ? "null" : superClass.toString()));

    while (superClass != null && clazz != null) {
        clazz = clazz.getSuperclass();
        superClass = clazz.getGenericSuperclass();

        Log.d("Reflection", "this class: " + (clazz == null ? "null" : clazz.getName()));
        Log.d("Reflection", "superClass: " + (superClass == null ? "null" : superClass.toString()));
    }
}

And here are the two results of this function:

Not minified code:

D/Reflection: this class: com.example.App.UsersList
D/Reflection: superClass: com.example.App.SortedListWrapper<com.example.App.Models.User>

D/Reflection: this class: com.example.App.SortedListWrapper
D/Reflection: superClass: android.support.v7.util.SortedList$Callback<T>

D/Reflection: this class: android.support.v7.util.SortedList$Callback
D/Reflection: superClass: class java.lang.Object

D/Reflection: this class: java.lang.Object
D/Reflection: superClass: null

Minified code:

D/Reflection: this class: com.example.App.UsersList
D/Reflection: superClass: class com.example.App.SortedListWrapper

D/Reflection: this class: com.example.App.SortedListWrapper
D/Reflection: superClass: class android.support.v7.g.e

D/Reflection: this class: android.support.v7.g.e
D/Reflection: superClass: class java.lang.Object

D/Reflection: this class: java.lang.Object
D/Reflection: superClass: null

So, in minified code the actual parameterized classes are replaced with raw class types without any type information. As a solution for my project, I removed all reflection calls and replaced them with explicit parameter types passed as function arguments.
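
This is not my actual project code, but the general shape of that workaround can be sketched as follows. TypedList and ExplicitTypeDemo are hypothetical names. The element type is handed in as a Class object, so nothing depends on generic signatures surviving minification.

```java
import java.util.ArrayList;
import java.util.List;

class ExplicitTypeDemo {
    // Hypothetical wrapper: the element type is passed in instead of being read via reflection.
    static class TypedList<T> {
        private final Class<T> elementType;
        private final List<T> items = new ArrayList<>();

        TypedList(Class<T> elementType) {
            this.elementType = elementType;
        }

        Class<T> elementType() {
            return elementType;
        }

        // Class.cast performs the run-time check that erasure would otherwise skip.
        void addChecked(Object candidate) {
            items.add(elementType.cast(candidate));
        }

        int size() {
            return items.size();
        }
    }

    public static void main(String[] args) {
        TypedList<String> names = new TypedList<>(String.class);
        names.addChecked("Alice");
        System.out.println(names.elementType().getName()); // prints java.lang.String
        System.out.println(names.size());                  // prints 1
    }
}
```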