Annotation processing + instrumentation = language extensions

(Nhat Minh Lê (rz0) @ 2010-11-20 02:33:03)

As promised, this is the first blog post about my work at Google on contract programming for Java.[1] Today, I’m going to discuss the general rationale behind our choice of a combination of annotation processing and instrumentation to bring contracts to Java, the alternatives, and how it might or might not work for you if you’re also into extending the language.

This article is mainly for people who are interested in Java as a language and its compilation-related aspects, but who are not too familiar with the development environment itself. I fall into that category as well, so don’t expect any mind-blowing technical tricks. :) And if you find errors, you’re welcome to report them.

The project I’ve worked on is actually based on Modern Jass. We chose this framework as a basis because we agreed it was the best approach to adding contracts to the Java language; as a result, many of the design decisions discussed below were originally explored in Johannes Rieken’s work on Modern Jass.

The basic idea is to use two complementary interfaces offered by the Java development environment to achieve the goal of specifying and enforcing contracts on Java code:

Annotation processing[2] lets you handle arbitrary compile-time constant information attached to elements of code in Java source files, while bytecode instrumentation empowers you to rewrite classes entirely before they get loaded into the run-time environment. These techniques can be used in many different ways, and were not exactly meant to be mixed together either… So why these two, and why do we need both of them? Let’s take a look at a couple of usage patterns that appear to solve our problem.

[1] You may want to read more about Design by Contract if you don’t know about it, to help understand the motivation behind the various techniques presented here. Wikipedia is a good starting point, and I’ve also personally replied to a comment on Reddit about the uses and purposes of contracts; you may want to have a look.

[2] That’s Java 6 annotation processing. For those of you who have not kept track of this evolution, Java 5 had annotations with basically the same syntax, but a vastly different processing API based on Sun library extensions. Java 6 introduced a standardized annotation processing API under javax.annotation.processing and related namespaces.

What if we only had annotation processing?

Java 6 annotation processing, as specified by JSR 269, is a symbiotic pass that runs alongside the compiler. The Java compiler exposes a partially reified representation of classes to one or more annotation processors. These are free to generate new classes; however, the well-known limitation of this interface is that annotation processors cannot alter the classes being compiled. In fact, the program representation is read-only.

Consider the following example:

@A(x="foo")
class C {
  void f() { ... }
  int g(int x) { ... }
}

The code above will be reified into a TypeElement (under javax.lang.model.element) object representing C, containing instances of ExecutableElement for each of the two methods, and referencing an AnnotationMirror object that, as its name implies, mirrors the annotation A.

The annotation processor would then be perfectly free to generate a class, say C$froboz, and get the Java compiler to process it along the way. But altering C directly, e.g. to add a method or a field, is prohibited under the regular API: the class as reflected through the javax.lang.model.element model is what will be compiled, regardless of the actions of the annotation processor (well, unless it halts the JVM or something).
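To make this concrete, here is a minimal sketch of a JSR 269 processor that reacts to the A annotation above and emits a sibling class through the Filer. The processor name and the generated class body are assumptions for illustration; only the Filer and Messager calls are standard API.

import java.io.IOException;
import java.io.Writer;
import java.util.Set;

import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;
import javax.tools.JavaFileObject;

@SupportedAnnotationTypes("A")
public class AProcessor extends AbstractProcessor {
  @Override
  public boolean process(Set<? extends TypeElement> annotations,
                         RoundEnvironment roundEnv) {
    for (TypeElement a : annotations) {
      for (Element e : roundEnv.getElementsAnnotatedWith(a)) {
        try {
          // Generate a sibling class next to the annotated one; we
          // cannot touch the annotated class itself.
          JavaFileObject file = processingEnv.getFiler()
              .createSourceFile(e.getSimpleName() + "$froboz");
          Writer w = file.openWriter();
          w.write("class " + e.getSimpleName() + "$froboz {}");
          w.close();
        } catch (IOException ex) {
          processingEnv.getMessager()
              .printMessage(Diagnostic.Kind.ERROR, ex.toString());
        }
      }
    }
    return true;
  }
}

The generated source is then picked up by javac in a later processing round, as if it had been part of the original compilation.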

Does that mean that it is impossible to add behavior to objects through annotation processing alone? Well, yes and no. Technically speaking, classes cannot be modified, that’s a fact; but it is also well known that, by cleverly twisting the inheritance hierarchy, it is possible to affect instances of a class. The pattern is officially encouraged: it is described, as a decorator pattern, in the documentation of the Filer class, which manages the creation of source, class, and resource files by annotation processors.

The trick is that, by extending classes generated by the annotation processor, and instantiating through generated subclasses, it is possible to add to both the interface (through a generated superclass) and the implementation (through a generated subclass) of a given class, as long as it has the proper layout. For example:

@D(S.class)
class C extends C$$super {
  ...
}

In the above code, an annotation processor could catch the D annotation, and generate the class C$$super as follows to add a hello method to C:

@Generated
class C$$super extends S {
  void hello() {
    System.out.println("hello, world!");
  }
}

For a working example, you can take a look at this proof of concept of the decorator pattern applied to properties.

The main drawback is that it requires strong cooperation from user code; composability is also pretty bad: what if two frameworks need to generate two different child classes? For contracts, which need to be enabled and disabled transparently, mandating that user code interact with specific subclasses was never an option. The only use case I see is for mocking tools, but then, there are better ways. (But if you have an interesting case, feel free to prove me wrong here. :)

Conclusion: Annotation processors are good at generating new classes, so if your language extension is purely generative by nature (e.g. some kind of templating), then it may be the most suitable option.

Interlude 1: Contract specifications as annotations

Up to now, we have assumed that contract specifications would be written as annotations. But what does that mean, really? First, what is a contract? Well, you could say it’s a kind of predicate that must hold at certain points in time, e.g. before or after the execution of a method, or between public method calls on an object.

So what we want to specify is basically a predicate. There is more than one way to do that: we could write predicates as plain code, or we could reify the concept into some data structure (e.g. com.google.common.base.Predicate).

The main constraint, on the technical side, is that annotations are compile-time constants, and initializations of annotation attributes must have a constant expression on the right-hand side.

@Foo(x + y)           // Looks cool but won't do.
@Foo("x + y")         // Ugly but OK.

This means that if we opt for the first solution, plain arbitrary code, it must be encoded as a string. As a consequence, any syntactic or semantic checks won’t be handled by the compiler and must be done independently.

Yet, the second solution does not solve this problem completely either. Class constants are fine (think .class constructs), but references to members and method parameters, which are needed as arguments to the predicates, are not, and would need to be encoded as strings again.

All things considered, we decided to stick with the first approach, if only because it was the most natural way to write predicates to begin with!
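To give an idea of what that first approach looks like in practice, here is a minimal sketch of string-based contract annotations. The names Requires and Ensures and the retention policy are illustrative assumptions, not necessarily what our framework uses; the point is that javac sees the predicates as opaque string constants, and all checking falls to the annotation processor.

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical contract annotations; predicates are stored as strings.
@Retention(RetentionPolicy.CLASS)
@Target(ElementType.METHOD)
@interface Requires { String[] value(); }

@Retention(RetentionPolicy.CLASS)
@Target(ElementType.METHOD)
@interface Ensures { String[] value(); }

class Account {
  private int balance;

  @Requires("amount > 0")        // precondition
  @Ensures("balance >= amount")  // postcondition
  void deposit(int amount) { balance += amount; }
}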

So what if we don’t use an annotation processor at all?

After all, annotations are fine, but what use do we have for an annotation processor if all it can do is take a peek at the annotations? We could just as well use the classical reflection API (java.lang.reflect) to access classes and their annotations at run time.

Then, using bytecode rewriting or generative techniques, it would be possible to do the same as with an annotation processor, or even more, without having to do anything at compile time. All the magic would be hidden; it’d be just like a library! Great!
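As a minimal sketch of that library-only style, reusing the hypothetical Requires annotation from the previous interlude (redeclared here with run-time retention, which java.lang.reflect requires):

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.Arrays;

@Retention(RetentionPolicy.RUNTIME)  // reflection needs RUNTIME retention
@interface Requires { String[] value(); }

class ContractScanner {
  static void scan(Class<?> clazz) {
    for (Method m : clazz.getDeclaredMethods()) {
      Requires r = m.getAnnotation(Requires.class);
      if (r != null) {
        // A real checker would now have to parse and compile the
        // predicate strings, at run time, long after javac is gone.
        System.out.println(m + " requires " + Arrays.toString(r.value()));
      }
    }
  }
}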

Well, one major drawback of this approach is that any complex semantic checks are left until run time. In the case of contracts, that means any errors in the specifications will not be reported until the code gets executed. This is akin to spotting syntax errors while the code is running: it makes for a bad user experience, especially in a statically typed language like Java, where users expect such things to be reported during compilation.

Conclusion: So, it seems like we’re reaching a first milestone here. We can have an annotation processor take care of all the static sanity checks, while delegating the effective work to a run-time component. This strategy was actually used in the original Modern Jass framework, and to be honest, I like it. I’d say, if your language extension can be implemented this way, go ahead: it’s pretty cool and clean.

Midway through my project, though, I discovered there were some things that couldn’t be handled with this method…

Interlude 2: Why run-time contract compilation was not enough

The technique we’ve just discussed is fine as long as your features don’t depend on any aspect of Java that differs (significantly) between the source and bytecode representations. As discussed in a previous introductory blog post about JVM bytecode (in French), Java bytecode closely mirrors the constructs of the Java language… to a point. It sticks to the language closely enough that implementing anything else on top of the JVM is a pain, but not closely enough that all features of the language are reflected in the bytecode.

Worth mentioning in this category are inner classes, anonymous classes, and generics, as well as various little "details" such as covariant return types.

Hopefully, you won’t need to deal with these; but in my case, it’s these little details that made us switch to a more complex contract compilation model (described below).

More specifically, contracts follow inheritance rules that closely match those of the actual source hierarchy. The problem was that the bytecode did not reflect that hierarchy exactly. The JVM has a simplified concept of inheritance, in which a method overrides another only if the two have the exact same signature. This is of course not true of Java, because of covariant return types and specialized generic overrides.

class A {
  X f() { ... }
}

class B extends A {
  // With Y extends X, this is a covariant return override.
  Y f() { ... }
}

class G<T> {
  void f(T x) { ... }
}

class S extends G<Integer> {
  // Specialized override.
  void f(Integer x) { ... }
}

The Java compiler copes with that by emitting a bunch of bridge methods (see, for example, this Stack Overflow thread for more information on bridge methods). And while it is possible to infer the original override relationship from a bridge, it requires a lot of work, whereas the annotation processing API already does it for you, for free, and with a source-level abstraction, too! That means that if the Java language evolves to include more weirdness in the way it handles inheritance, at least this part may not need to be upgraded. :)
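For comparison, here is a minimal sketch of how the source-level override relation can be queried under JSR 269. The lookups assume the A/B example above lives in the default package and declares f as its only method; everything else is standard javax.lang.model API.

import java.util.Set;

import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.element.ExecutableElement;
import javax.lang.model.element.TypeElement;
import javax.lang.model.util.ElementFilter;
import javax.lang.model.util.Elements;

@SupportedAnnotationTypes("*")
public class OverrideCheck extends AbstractProcessor {
  @Override
  public boolean process(Set<? extends TypeElement> annotations,
                         RoundEnvironment roundEnv) {
    Elements elements = processingEnv.getElementUtils();
    TypeElement a = elements.getTypeElement("A");
    TypeElement b = elements.getTypeElement("B");
    if (a == null || b == null) return false;
    // Assumes f is the only method declared in each class.
    ExecutableElement af = ElementFilter.methodsIn(a.getEnclosedElements()).get(0);
    ExecutableElement bf = ElementFilter.methodsIn(b.getEnclosedElements()).get(0);
    // true, even though the bytecode descriptors of the two methods
    // differ; no bridge methods in sight at this level.
    boolean overrides = elements.overrides(bf, af, b);
    System.out.println("B.f overrides A.f: " + overrides);
    return false;
  }
}

At the bytecode level, answering the same question means spotting synthetic bridges (cf. java.lang.reflect.Method.isBridge()) and chasing signatures by hand.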

And so, we ended up not only checking but also compiling contracts through our annotation processor. The run-time part was reduced to handling only the weaving of the separately compiled bytecode (remember: we can’t alter classes being compiled) into the actual classes. The details of how this is accomplished deserve a blog post of their own, though, so stay around if you’re interested!

The combo: annotation processing and instrumentation

We’ve tried one and the other separately, so the logical next step is to combine both into something greater and better. The basic pattern becomes:

source files --[javac]-> class files  --[java]--> execution
                ||                       ||
annotations  --[apt]---> intermediate --[agent]-> run-time
                         information              behavior

In words: An annotation processor compiles annotations into some intermediate form suitable for exploitation by the run-time agent, reporting any errors to the user; at run time, an instrumentation agent uses the intermediate information to alter the behavior of the program (e.g. through bytecode rewriting).
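The run-time half of this pipeline is a standard java.lang.instrument agent. A minimal skeleton, with the actual weaving left as a placeholder, might look like this (loaded with -javaagent and a Premain-Class entry in the agent jar’s manifest):

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class ContractAgent {
  // Invoked by the JVM before main() when started with
  // java -javaagent:agent.jar (Premain-Class set in the manifest).
  public static void premain(String args, Instrumentation inst) {
    inst.addTransformer(new ClassFileTransformer() {
      public byte[] transform(ClassLoader loader, String className,
                              Class<?> classBeingRedefined,
                              ProtectionDomain protectionDomain,
                              byte[] classfileBuffer) {
        // Here: look up the intermediate contract information for
        // className and weave it into classfileBuffer.
        return null;  // null means "leave this class unchanged"
      }
    });
  }
}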

How much information should be precompiled into the intermediate files, and how much should be retrieved directly from the classes at run time by the agent, depends on the application and is basically a design choice. Rationally, it’d be best to do as much offline as possible, to lower the run-time overhead, but some things may be more practical to do in the agent; so in the end, it’s up to the developer to pick the best compromise.

Conclusion: The most flexible among the three, if you need it. Otherwise, something simpler may prove more manageable.

Alternatives: There are

Annotation processing and bytecode instrumentation is one way (actually, we’ve seen three) to do it, but there are, of course, other approaches.

You could, for example, write a preprocessor that outputs Java from an augmented Java syntax. This is probably a bad idea, though, because Java is even less suited than C to serve as an intermediate language. You won’t find any convenient #line directive, or anything like that. It also means you’ll have to edit the debug records to match your original source information after compiling the resulting Java code. I wouldn’t recommend this on a large scale.

Another possibility would be to roll your own compiler. This is what JML did, but it has many problems, because it basically means you’re forking the language into your own. Upstream changes to the language will need to be integrated back into your customized version, and it won’t play nicely with other extensions. So, unless you’re really, really motivated and have the resources to afford this kind of scenario, I wouldn’t recommend this either.

A more modest alternative would be to replace the run-time instrumentation agent with an offline bytecode rewriter that produces standalone class files with modified bytecode. This was also implemented in our framework, as an optional compilation method.
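Such an offline rewriter boils down to a read-modify-write pass over class files. Here is a minimal sketch using the ASM bytecode library as an identity pass; ASM itself is an assumption here (our framework’s actual tooling may differ), and a real rewriter would plug a ClassVisitor between the reader and the writer.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassWriter;

public class OfflineRewriter {
  public static void main(String[] args) throws IOException {
    // Read the class file named on the command line...
    InputStream in = new FileInputStream(args[0]);
    ClassReader reader = new ClassReader(in);
    in.close();
    // ...pass it through ASM (an identity pass here; a real rewriter
    // would insert a ClassVisitor that weaves in the contract code)...
    ClassWriter writer = new ClassWriter(reader, 0);
    reader.accept(writer, 0);
    // ...and write the modified class back in place.
    OutputStream out = new FileOutputStream(args[0]);
    out.write(writer.toByteArray());
    out.close();
  }
}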

Lastly, you could decide to break the JSR 269 abstraction and dive into the internal classes of the underlying compiler, typically OpenJDK javac, which offer way more power, including full abstract syntax tree access (see annex B below for more details) and the ability to alter classes being compiled.

Limitations: What annotations and instrumentation cannot do

There are valid reasons, though, to pursue the route of having your own parser or compilation suite: if you need to alter the syntax in ways annotations do not permit.

Aside from the fact that writing code in double-quoted strings is a pain, annotation-based language extensions have other limitations. The main issue that you will run into is that annotation processing does not reflect the structure of classes below the level of members: that is, no code. In the same vein, anonymous classes are not represented. If you want to add something to these elements, then you need another way…

Similarly, instrumentation is great for optional functionality, such as contract checking; but for mandatory behavior, that is, the stuff that may cause random crashes if not enabled, you’re probably better off with an offline compilation or bytecode modification strategy.

Conclusion: What’s best for your language extension needs?

Well, that depends on your needs. I don’t know if anyone reading this doesn’t either already know about this whole thing or else didn’t understand any of it. As I said, I kind of wrote it for people like me, who are not really Java experts but already have a good grip on compilation-related concepts. Why would anyone like that be interested in the subject, you ask? Out of curiosity, I guess; to keep in touch with what’s going on in the field; or maybe because, like me, they’ve been hired to do Java-related work even though they’re C programmers to the core. :)

In keeping with that pragmatic approach, I’d say a good rule of thumb is to pick the simplest solution that works for you. If a single annotation processor will do, go for an annotation processor; if you only need a run-time agent, then only write a run-time agent; otherwise, you can always aim for both. And if you have any experiences or questions you’d like to share with me, you’re welcome to comment below!