
You can't beat type erasure, they say. Information about generic Java data types is lost during compilation, and there's no way to get it back. While that's absolutely true, Mahmoud Ben Hassine and I found a tiny loophole that allowed us to improve Mahmoud's jPopulator framework tremendously. Under certain circumstances, it's possible to find out about the type which is supposed to be erased. So jPopulator is able to populate not only simple beans but also complex generic data types.

Mahmoud agreed to write an article about the ups and downs of our investigation. You'll see it wasn't exactly easy-going. Along the way, you'll learn about the loophole allowing us to beat type erasure.

Introducing Mahmoud

Dear BeyondJava.net readers, my name is Mahmoud. I'm a passionate freelance Java developer living in Brussels. First of all, I would like to thank Stephan for giving me the opportunity to contribute a guest article to his fantastic blog! In this post, we will be revisiting one of the most misunderstood topics among Java developers: type erasure. We will see how it is possible to "beat" type erasure with a real use case encountered when developing jPopulator. Type erasure can be a real PITA, and my journey with Stephan to beat type erasure was a kind of drama. We would like to share it with you today in this post.

Act 1: Presentation. What is jPopulator and why are we trying to "beat" type erasure?

You all know how much work it is to generate test cases. In particular, it's a real chore to populate Java beans with test data. In many cases, we want every attribute to be filled with data, but the data themselves aren't important. It's just important that the bean is not empty.

jPopulator is a Java tool that allows you to populate Java beans with random data. Its main use case is generating random data for testing purposes. Random data, in turn, is useful to avoid side effects caused by large sets of identical beans.

So I wanted to generate random values for every field of a given Java bean. Let's start with a simple example:

    public class Person {
        private String name;
        // constructor, getter and setter omitted for brevity
    }

The goal is to populate an instance of the Person class with random data.

At first glance, you would probably say: "Easy, it's a piece of cake! Thanks to reflection, I can create a new instance of the Person class and call the setter with a random String". In fact, this is how I felt when I first got the idea to develop jPopulator. This approach worked perfectly well with the beans I tested first. But you will see that this is just the beginning of the story.
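The naive approach can be sketched roughly as follows. This is only an illustration, not jPopulator's actual code: the class name SimplePopulator, the populate method and the use of a random UUID as the "random String" are assumptions made for this example, and it writes the fields directly instead of calling setters to keep the sketch short.

```java
import java.lang.reflect.Field;
import java.util.UUID;

class SimplePopulator {

    // The bean from the article (only the getter is shown).
    static class Person {
        private String name;

        public String getName() {
            return name;
        }
    }

    // Creates an instance via the no-arg constructor and fills every
    // String field with a random value. Other field types are ignored.
    static <T> T populate(Class<T> type) throws ReflectiveOperationException {
        T bean = type.getDeclaredConstructor().newInstance();
        for (Field field : type.getDeclaredFields()) {
            if (field.getType() == String.class) {
                field.setAccessible(true);
                field.set(bean, UUID.randomUUID().toString());
            }
        }
        return bean;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Person person = populate(Person.class);
        System.out.println("name = " + person.getName());
    }
}
```

The check on field.getType() is all this sketch needs for a String field. It is exactly this check that stops being sufficient once collections come into play.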

Let's consider more complex data structures. We add a list of nicknames to the Person class:

    public class Person {
        private String name;
        private List<String> nickNames;
        // constructor, getters and setters omitted for brevity
    }

Here, you would say: "No problem man, I'll pick up the type of the nickNames field at runtime and generate a list of random strings". There's a catch: when you use reflection to get the type of the nickNames field, you don't get the type you expect. Instead of getting the generic type List<String>.class, you get the raw type List.class at runtime. So you ask yourself: "What kind of objects should I put into this list?" :cry: Well, this is when you fire up your browser googling for "java generic type at runtime".

Act 2: Depression. Type erasure: what is it, and why was it added to Java?

Type erasure is the process by which the Java compiler "erases" generic type information from a class file, replacing it with type casts in the byte code.

Basically, the following code:

    List<String> nickNames = new ArrayList<String>();
    nickNames.add("Foo");
    String nickName = nickNames.get(0);

compiles to:

    List nickNames = new ArrayList();
    nickNames.add("Foo");
    String nickName = (String) nickNames.get(0);
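A quick way to observe the effect at runtime, as a minimal sketch (the class name ErasureDemo and its method are made up for this example): two lists with different type arguments share one and the same runtime class.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {

    // Returns true because both lists have the same runtime class:
    // the type arguments exist only at compile time.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // true
    }
}
```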

Type erasure was introduced together with generics in Java 5. It was a clever way to ensure backward compatibility with previous versions of the platform. The Java language designers could have implemented generics differently (using reification, for instance, as C# does). But in that case, developers would have had to recompile a legacy application to migrate it to a newer version of the platform. With type erasure, method signatures are preserved, eliminating the need to recompile the code.

As a matter of fact, type erasure is a great success story. Granted, some information is lost during compilation, but most developers won't ever notice. It's only the framework developers like us who struggle with type erasure.

Back to jPopulator. We are stuck since there seems to be no way to get the generic type of the list of nickNames at runtime. Even worse: everybody is telling us "You can't beat type erasure. Don't even try!". jPopulator won't ever be able to populate generic data structures. It will never be able to cover important use cases such as ArrayLists and HashMaps. jPopulator started out a tiger but ended up a bedside rug.

Act 3: Hope. Discovering java.lang.reflect.Field#getGenericType()

Well, giving up is not an option. There should be a way to find the generic type at runtime. After diving deep into the ocean of generics, we finally discovered the method Field#getGenericType(). At first glance, this method is just what we are looking for: it returns the generic type of the field! If that type is a ParameterizedType, we can simply call the method getActualTypeArguments() to get the actual types. After several days of depression, we finally got a little hope. It's definitely possible to get the actual type at runtime. Now, let's see how to achieve that.

We will use the same Person class (omitting everything except the nickNames for the sake of simplicity):

    public class Person {
        private List<String> nickNames;
    }

The goal is to retrieve the type of the field nickNames at runtime. More precisely, we want to find out it is a java.util.List containing String objects:

    // Get declared fields
    Field[] fields = Person.class.getDeclaredFields();

    // Get the "nickNames" field which is of type java.util.List
    Field nickNamesField = fields[0];
    System.out.println("field name = " + nickNamesField.getName()); // nickNames
    System.out.println("field type : " + nickNamesField.getType()); // interface java.util.List

    Type genericType = nickNamesField.getGenericType();
    System.out.println("field genericType = " + genericType); // java.util.List<java.lang.String>

    ParameterizedType parameterizedType = (ParameterizedType) genericType;

    // Get the actual type arguments (this is an array because there could be
    // several of them, think of a type such as Map<K, V> for example)
    Type[] actualTypeArguments = parameterizedType.getActualTypeArguments();

    // Get the actual type
    Type actualTypeArgument = actualTypeArguments[0];
    System.out.println("actualTypeArgument = " + actualTypeArgument); // class java.lang.String

    Assert.assertEquals(String.class, actualTypeArgument);

The listing above shows how to retrieve the actual type parameter of the nickNames field at runtime. The key is the nickNamesField.getGenericType() method. It gives us access to the information we thought was "erased" by the compiler.

So you may ask: has type erasure been applied here? The answer is yes. The actual type of the field is java.util.List (as seen in nickNamesField.getType()). But the compiler also writes the actual type argument into the class file as metadata of the field declaration, so it's possible to get it through the nickNamesField.getGenericType() method. In general, it's true that the type information has been erased. But the entry in the metadata of the class allows us to reconstruct the declared type of the field. However, this works only for declarations whose signature is stored in the class file, such as fields (or method parameters and return types). For instance, there's no way to find out about the generic type of variables defined locally within a method. Luckily, jPopulator is only interested in fields, so we are well off.
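The same mechanism also covers more complex declarations. The following sketch is an illustration only: the AddressBook bean, its phoneNumbers field and the helper typeArgumentsOf are hypothetical names invented for this example. It shows that a field with several type arguments yields one entry per argument, and that an argument can itself be a ParameterizedType rather than a plain class.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;
import java.util.Map;

class GenericFieldDemo {

    // Hypothetical bean with a nested generic field.
    static class AddressBook {
        private Map<String, List<Integer>> phoneNumbers;
    }

    // Returns the type arguments recorded for the given field,
    // or an empty array if the field is not parameterized.
    static Type[] typeArgumentsOf(Class<?> owner, String fieldName) throws NoSuchFieldException {
        Type genericType = owner.getDeclaredField(fieldName).getGenericType();
        if (genericType instanceof ParameterizedType) {
            return ((ParameterizedType) genericType).getActualTypeArguments();
        }
        return new Type[0];
    }

    public static void main(String[] args) throws NoSuchFieldException {
        Type[] arguments = typeArgumentsOf(AddressBook.class, "phoneNumbers");
        System.out.println(arguments[0]); // class java.lang.String
        System.out.println(arguments[1]); // java.util.List<java.lang.Integer>
        // The second argument is itself a ParameterizedType, not a Class.
        System.out.println(arguments[1] instanceof ParameterizedType); // true
    }
}
```

A populator that wants to handle nested collections would have to recurse into such arguments, which is why the instanceof check matters.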

Yeah, it's a bit tricky, but we warned you: generics and type erasure can be a real PITA!

Act 4: Devastation. Type erasure beats me!

So we've managed to get the actual type parameter at runtime. Now, things get a bit confusing. Calling actualTypeArgument.toString() yields "class java.lang.String", which looks promising. But that doesn't mean the type of actualTypeArgument is String.class. Instead, actualTypeArgument is an object of type Class, pointing to the java.lang.String class definition. If you try to instantiate the type of actualTypeArgument, you get:

    java.lang.IllegalAccessException: Can not call newInstance() on the Class for java.lang.Class
        at java.lang.Class.newInstance(Class.java:339)

Granted, there is no way to instantiate a java.lang.Class at runtime. But that isn't our goal, anyway. Remember, our goal is to create a random instance of the type contained in the list, which is in our case java.lang.String. So the idea is to extract the fully qualified class name from the result of actualTypeArgument.toString():

    // "class java.lang.String".substring(6) yields "java.lang.String"
    actualTypeArgument.toString().substring(6);

Finally, it would be possible to instantiate the target type and call setters with random values:

    Class<?> targetClass = Class.forName("java.lang.String");
    populateElement(targetClass);
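Putting the pieces together, populating the nickNames list could look roughly like this. It is a sketch under assumptions, not jPopulator's real implementation: ListPopulator, populateList and randomElement are hypothetical names, randomElement stands in for whatever populateElement does and only knows how to create Strings, and the field is assumed to be a parameterized List.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

class ListPopulator {

    static class Person {
        private List<String> nickNames;

        public List<String> getNickNames() {
            return nickNames;
        }
    }

    // Hypothetical element factory: it only supports String elements.
    static Object randomElement(Class<?> elementClass) {
        if (elementClass == String.class) {
            return UUID.randomUUID().toString();
        }
        throw new IllegalArgumentException("Unsupported element type: " + elementClass);
    }

    // Fills the named List field of the bean with `size` random elements.
    static void populateList(Object bean, String fieldName, int size)
            throws ReflectiveOperationException {
        Field field = bean.getClass().getDeclaredField(fieldName);
        ParameterizedType listType = (ParameterizedType) field.getGenericType();
        Type elementType = listType.getActualTypeArguments()[0];

        // Workaround from the article: strip the leading "class " prefix.
        String className = elementType.toString().substring(6);
        Class<?> elementClass = Class.forName(className);

        List<Object> values = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            values.add(randomElement(elementClass));
        }
        field.setAccessible(true);
        field.set(bean, values);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Person person = new Person();
        populateList(person, "nickNames", 3);
        System.out.println(person.getNickNames());
    }
}
```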

Act 5: Euphoria. Hey, Java 8 introduced "Type#getTypeName()". Life is beautiful! :)

The good news is the approach works. But on the other hand, extracting the class name from the result of the toString method of the Type interface is surely an ugly workaround! Luckily, Java 8 comes with the new method Type#getTypeName() that gives you the fully qualified name of the actual type. No more tricks and workarounds :).
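With Java 8 or later, the lookup can be written without any string surgery. Again, this is just a sketch: the class TypeNameDemo and the helper elementTypeName are names made up for this example, and the field is assumed to be parameterized.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

class TypeNameDemo {

    static class Person {
        private List<String> nickNames;
    }

    // Returns the name of the first type argument of a parameterized field.
    static String elementTypeName(Class<?> owner, String fieldName) throws NoSuchFieldException {
        Field field = owner.getDeclaredField(fieldName);
        ParameterizedType type = (ParameterizedType) field.getGenericType();
        Type argument = type.getActualTypeArguments()[0];
        return argument.getTypeName(); // available since Java 8
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        String typeName = elementTypeName(Person.class, "nickNames");
        System.out.println(typeName); // java.lang.String

        Class<?> targetClass = Class.forName(typeName);
        System.out.println(targetClass == String.class); // true
    }
}
```

When the type argument is a plain class, as it is here, checking it with instanceof Class and casting it yields the same Class object without going through its name at all.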

Conclusion

Type erasure is known to be unbeatable, but we have shown in this post that this is actually not entirely true. There's a tiny loophole, just big enough to allow jPopulator to perform its magic. Now, when someone tells you "You can't beat type erasure. Don't even try!", point them to this post. :wink:


Dig deeper

https://docs.oracle.com/javase/tutorial/java/generics/erasure.html

http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.6

https://github.com/cowtowncoder/java-classmate

http://techblog.bozho.net/on-java-generics-and-erasure/

