Saturday, April 07, 2012

Java Type Erasure not a Total Loss -- use Java Classmate for resolving generic signatures

As I have written before ("Why 'java.lang.reflect.Type' Just Does Not Cut It"), Java's Type Erasure can be a royal PITA.

But things are actually not quite as bleak as one might think. Let's start with a problem that really is (mostly) unsolvable, and then proceed to another important, similar, yet solvable one.

1. Actual Unsolvable problem: java.util Collections

Here is a piece of code that illustrates a problem that most Java developers either understand, or think they understand:

  Map<String,Integer> stringsToInts = new HashMap<String,Integer>();
  Map<byte[],Boolean> bytesToBools = new HashMap<byte[],Boolean>();
  assertSame(stringsToInts.getClass(), bytesToBools.getClass());

The problem is that although conceptually the two collections seem to act differently, at runtime they are instances of the very same class (Java does not generate new classes for genericized types, unlike C++).

So while the compiler helps in keeping typing straight, there is little runtime help to either enforce this or allow other code to deduce the expected type; there just isn't any difference from the type perspective.
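
To make the "little runtime help" part concrete, here is a minimal sketch using only the JDK. The class and method names (ErasureDemo, sameRuntimeClass, sneakInInteger) are made up for illustration. It shows that two differently parameterized lists share one runtime class, and that an Integer can be pushed into a List<String> through a raw reference without anything complaining at runtime.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {

    /** True if both lists are instances of the very same runtime class. */
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    /**
     * Adds an Integer to a List<String> through a raw reference and
     * returns the element that ended up in the list.
     */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Object sneakInInteger(List<String> strings) {
        List raw = strings;            // raw type: generic checks are off
        raw.add(Integer.valueOf(42));  // nothing at runtime rejects this
        return raw.get(raw.size() - 1);
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        System.out.println(sameRuntimeClass(strings, ints));              // true
        System.out.println(sneakInInteger(strings).getClass().getName()); // java.lang.Integer
    }
}
```

The compiler does warn about the raw type (suppressed here on purpose), but that is the only line of defense.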

2. All Lost? Not at all

But let's look at another example. Starting with a simple interface


  public interface Callable<IN, OUT> {
    public OUT call(IN argument);
  }

Do you think the following is true also?


  public void compare(Callable<?,?> callable1, Callable<?,?> callable2) {
    assertSame(callable1.getClass(), callable2.getClass());
  }

Nope. Not necessarily; the classes may well be different. WTH?

The difference here is that since Callable is an interface (and you can not instantiate an interface), instances must be of some other type; and there is a good chance they are different.

But more importantly, if you use the Java ClassMate library (more on this in just a bit), you can even figure out the parameterization (unlike with the earlier example, where all you could see is that parameters are "a subtype of java.lang.Object"). So for example we can do:


  // Assume 'callable1' was of type:
  //   class MyStringToIntList implements Callable<String, List<Integer>> { ... }
  // (TypeResolver and ResolvedType are from package com.fasterxml.classmate)
  TypeResolver resolver = new TypeResolver();
  ResolvedType type = resolver.resolve(callable1.getClass());
  List<ResolvedType> params = type.typeParametersFor(Callable.class);
  // so we know it has 2 parameters; from above, 'String' and 'List<Integer>'
  assertEquals(2, params.size());
  assertSame(String.class, params.get(0).getErasedType());
  // and second type is generic itself; in this case can directly access
  ResolvedType resultType = params.get(1);
  assertSame(List.class, resultType.getErasedType());
  List<ResolvedType> listParams = resultType.getTypeParameters();
  assertSame(Integer.class, listParams.get(0).getErasedType());
  // or, just to see types visually, try:
  String desc = type.getSignature(); // or 'getFullDescription'

How is THIS possible? (fun exercise: pick 5 of your favorite Java experts; ask if above is possible, observe how most of them would have said "nope, not a chance" :-) )

3. Long live generics -- hidden deep, deep within

Basically generic type information is actually stored in class definitions, in 3 places:

  1. When defining parent type information ("super type"); parameterization for base class and base interface(s) if any
  2. For generic field declarations
  3. For generic method declarations (return, parameter and exception types)

It is the first place where ClassMate finds its stuff. When resolving a Class, it will traverse the inheritance hierarchy, recomposing type parameterizations. This is a rather involved process, mostly due to type aliasing, the ability of interfaces to use different signatures, and so on. In fact, doing this manually looks feasible at first, but once you try to handle all the aliasing and wildcarding, you will soon realize why having a library do it for you is a nice thing...
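
The three places listed above can be inspected with plain java.lang.reflect, without any library. Below is a small sketch; StringList and its members are hypothetical, and exist only to give the reflection calls something to look at. Only the raw lookups are shown here; the recomposing across a hierarchy that ClassMate does is not attempted.

```java
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class GenericSignaturesDemo {

    // Hypothetical class, only here to have something to inspect
    @SuppressWarnings("serial")
    static class StringList extends ArrayList<String> {
        Map<String, Integer> counts;
        List<Long> ids(List<Boolean> flags) { return new ArrayList<Long>(); }
    }

    /** 1. Super type parameterization: ArrayList<String>. */
    static Type superType() {
        return StringList.class.getGenericSuperclass();
    }

    /** 2. Generic field declaration: Map<String, Integer>. */
    static Type fieldType() {
        try {
            return StringList.class.getDeclaredField("counts").getGenericType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    /** 3. Generic method declaration (only the return type here): List<Long>. */
    static Type returnType() {
        try {
            return StringList.class.getDeclaredMethod("ids", List.class)
                    .getGenericReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(superType());
        System.out.println(fieldType());
        System.out.println(returnType());
    }
}
```

Each call returns a java.lang.reflect.Type (here a ParameterizedType) rather than a Class, which is exactly the part that gets awkward to work with once type variables enter the picture.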

So the important thing to learn is this: to retain run-time generic type information, you MUST pass concrete sub-types which resolve generic types via inheritance.

And this is where JDK collection types bring in the problem (wrt this particular issue): concrete types like ArrayList still take generic parameters, and this is why runtime instances do not have generic type information available.

Another way to put this is that when using a subtype, say:


  // anonymous subclass of ArrayList<String>; a named subclass works the same way
  List<String> list = new ArrayList<String>() { };
  // can use ClassMate now, a la:
  ResolvedType type = resolver.resolve(list.getClass());
  // type itself has no parameterization (concrete non-generic class); but it does implement List so:
  List<ResolvedType> params = type.typeParametersFor(List.class);
  assertSame(String.class, params.get(0).getErasedType());

which once again would retain usable amount of generic type information.
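
The same contrast can be seen with plain JDK reflection, as a rough sketch (SubtypeDemo and its method names are invented for this example). For a plain ArrayList the super type only mentions a type variable; for a subtype, the actual class shows up.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

public class SubtypeDemo {

    /** First type argument of the direct generic super class, as the JDK reports it. */
    static Type superClassArgument(Object instance) {
        Type superType = instance.getClass().getGenericSuperclass();
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments()[0];
        }
        return null; // super class is not parameterized
    }

    /** Plain ArrayList: super type is AbstractList<E>, so we only get variable 'E'. */
    static Type plainListArgument() {
        List<String> plain = new ArrayList<String>();
        return superClassArgument(plain);
    }

    /** Anonymous subtype: super type is ArrayList<String>, so we get String.class. */
    @SuppressWarnings("serial")
    static Type subtypedListArgument() {
        List<String> subtyped = new ArrayList<String>() { };
        return superClassArgument(subtyped);
    }

    public static void main(String[] args) {
        System.out.println(plainListArgument());     // a TypeVariable, not String
        System.out.println(subtypedListArgument());  // class java.lang.String
    }
}
```

Note that this only looks one level up; anything deeper needs the kind of traversal ClassMate performs.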

4. Real world usage?

The above might seem like an academic exercise, but it is not. When designing typed APIs, many callbacks would actually benefit from proper generic typing. And of special interest are callbacks or handlers that need to do type conversions.

As an example, my favorite Database access library, jDBI, makes use of this functionality (using embedded ClassMate) to figure out data-binding information without requiring an extra Class argument. That is, you could pass something like (not an actual code sample):

  MyPojo value = dbThingamabob.query(queryString, handler);

instead of what would more commonly be requested:

  MyPojo value = dbThingamabob.query(queryString, handler, MyPojo.class);

and the framework could still figure out what kind of thing 'handler' would handle, assuming it was a generic interface the caller has to implement.

The difference may seem minute, but this can actually help a lot by simplifying some aspects of type passing, and by removing one particular mode of error.
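
To give a feel for how a framework might do this, here is a simplified sketch using only JDK reflection. Everything in it (RowHandler, MyPojoHandler, targetTypeOf) is hypothetical and is not jDBI's or ClassMate's actual API. It also only handles the easy case where the handler class implements the interface directly with a concrete class argument; covering deeper hierarchies is where a library like ClassMate earns its keep.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class HandlerTypeDemo {

    // Hypothetical callback interface a framework might ask callers to implement
    interface RowHandler<T> {
        T handle(String rawRow);
    }

    static class MyPojo {
        final String name;
        MyPojo(String name) { this.name = name; }
    }

    // Concrete handler: the 'MyPojo' argument is recorded in the class file
    static class MyPojoHandler implements RowHandler<MyPojo> {
        public MyPojo handle(String rawRow) { return new MyPojo(rawRow); }
    }

    /** Deduces T from a handler that directly implements RowHandler<T>. */
    static Class<?> targetTypeOf(RowHandler<?> handler) {
        for (Type t : handler.getClass().getGenericInterfaces()) {
            if (t instanceof ParameterizedType) {
                ParameterizedType pt = (ParameterizedType) t;
                if (pt.getRawType() == RowHandler.class) {
                    Type arg = pt.getActualTypeArguments()[0];
                    if (arg instanceof Class<?>) {
                        return (Class<?>) arg;
                    }
                }
            }
        }
        throw new IllegalArgumentException(
                "Can not deduce target type of " + handler.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(targetTypeOf(new MyPojoHandler()).getSimpleName()); // MyPojo
    }
}
```

With this in place, a query method would not need the extra MyPojo.class argument; it could call targetTypeOf(handler) itself.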

5. More on ClassMate

The above barely scratches the surface of what ClassMate provides. Although it is already tricky to find "simple" parameterization for main-level classes, there are much trickier things; specifically, resolving types of Fields and Methods (return types, parameters). Given classes like:

  public interface Base<T> {
    public T getStuff();
  }
  public class ListBase<T> implements Base<List<T>> {
    protected List<T> value;
    protected ListBase(List<T> v) { value = v; }
    public List<T> getStuff() { return value; }
  }
  public class Actual extends ListBase<String> {
    public Actual(List<String> value) { super(value); }
  }

you might be interested in figuring out exactly what the type of the return value of "getStuff()" is. By eyeballing, you know it should be "List<String>", but the bytecode does not tell you this -- in fact, the declared signature is only expressed in terms of "T", basically.
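
To see what "expressed in terms of T" means in practice, here is a sketch with plain JDK reflection, using a self-contained variant of the classes above nested inside a hypothetical ManualResolutionDemo class. It first reads the declared return type, then does the single substitution step by hand. The hand-rolled part only covers this one-level hierarchy and is an assumption-laden shortcut, not how ClassMate is implemented.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;
import java.util.List;

public class ManualResolutionDemo {

    interface Base<T> { T getStuff(); }

    static class ListBase<T> implements Base<List<T>> {
        protected List<T> value;
        protected ListBase(List<T> v) { value = v; }
        public List<T> getStuff() { return value; }
    }

    static class Actual extends ListBase<String> {
        Actual(List<String> value) { super(value); }
    }

    /** What reflection reports for getStuff(): List<T>, with T still unresolved. */
    static Type declaredReturnType() {
        for (Method m : ListBase.class.getDeclaredMethods()) {
            // skip the compiler-generated bridge method that returns Object
            if (m.getName().equals("getStuff") && !m.isBridge()) {
                return m.getGenericReturnType();
            }
        }
        throw new IllegalStateException("getStuff() not found");
    }

    /** Substitutes T by looking at how Actual parameterizes ListBase. */
    static Type resolvedElementType() {
        ParameterizedType returnType = (ParameterizedType) declaredReturnType();
        Type element = returnType.getActualTypeArguments()[0];   // variable T
        ParameterizedType superType =
                (ParameterizedType) Actual.class.getGenericSuperclass(); // ListBase<String>
        TypeVariable<?>[] vars = ListBase.class.getTypeParameters();
        for (int i = 0; i < vars.length; i++) {
            if (vars[i].equals(element)) {
                return superType.getActualTypeArguments()[i];
            }
        }
        return element; // nothing to substitute
    }

    public static void main(String[] args) {
        System.out.println(declaredReturnType());
        System.out.println(resolvedElementType()); // class java.lang.String
    }
}
```

Even this toy version needs to know about bridge methods and variable binding; add more levels, interfaces and wildcards and the bookkeeping grows quickly.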

But with ClassMate you can resolve it:

  // start with ResolvedType; need MemberResolver
  ResolvedType classType = resolver.resolve(Actual.class);
  MemberResolver mr = new MemberResolver(resolver);
  // nulls: no annotation configuration, no annotation overrides
  ResolvedTypeWithMembers beanDesc = mr.resolve(classType, null, null);
  ResolvedMethod[] members = beanDesc.getMemberMethods();
  ResolvedType returnType = null;
  for (ResolvedMethod m : members) {
    if ("getStuff".equals(m.getName())) {
      returnType = m.getReturnType();
    }
  }
  // so, we should get
  assertSame(List.class, returnType.getErasedType());
  ResolvedType elemType = returnType.getTypeParameters().get(0);
  assertSame(String.class, elemType.getErasedType());

and get the information you need.

6. Why so complicated for nested types?

One thing that is obvious from the code samples is that code that uses ClassMate is not as simple as one might hope. Handling of nested generic types, specifically, is a bit verbose in some cases (specifically: when the type we are resolving does not directly implement the type we are interested in). Why is that?

The reason is that there is a wide variety of interfaces that any class can (and often does) implement. Further, parameterizations may vary at different levels, due to co-variance (ability to override methods with more refined return types). This means that it is not practical to "just resolve it all" -- and even if this was done, it is not in general obvious what the "main type" would be. For these reasons, you need to manually request parameterization for specific generic classes and interfaces as you traverse type hierarchy: there is no other way to do it.

About me

  • I am known as Cowtowncoder