Anybody ordered a new JVM compiler? Whoever ordered it – now it’s here, and it’s exciting! But wait: is it a compiler? No, it’s not. It’s more. Currently, it’s half a dozen compilers, plus a framework making it easy to write your own compiler.
In a nutshell, this is what the Graal project promises:
- It’s a compiler generating faster Java code than ever. Or it will be, in a few months. The version bundled with Java 10 is marked experimental, probably for a good reason, such as a performance penalty in some applications.
- It’s a compiler everybody can understand and modify. People like you and me can contribute improvements.
- It brings polyglot programming to a whole new level.
- Plus, it’s a great opportunity for minor or new languages. It’s never been so easy to create a new language with a decent performance from day one.
- Finally, it brings AOT compilation to the Java world. That’s a big plus in cloud environments like AWS Lambda, taking the scare out of hibernation.
OK, you get it: I’m excited. Let’s examine Graal piece by piece. And let’s have a look at the current state of the art.
A word of caution
Graal, SubstrateVM, and Truffle are fairly new to me, so most of this article isn’t backed up by hands-on expertise. Instead, I collected information from blogs, presentation talk videos, and slides. If you’re more familiar with Graal than I am, don’t hesitate to leave a comment pointing out whatever I got wrong.
What is Graal?
Graal is a plugin to the JVM, so let’s start with the JVM first. Among other things, Java 9 implements JEP 243, which sort of removes the JIT compiler from the JVM. The JIT compiler we’re used to is still there, but now it’s “only” the default implementation. You can replace it with a different implementation using arcane JVM parameters (-XX:+UnlockExperimentalVMOptions -XX:+UseJVMCICompiler, plus the path to the jar files containing the new compiler). Chris Seaton has written a great blog post containing all the gory details, so suffice it to say it’s possible, and it’s easy, too.
So now the JIT compiler converting the Java bytecode isn’t baked into the JVM anymore. It’s a plugin. You can replace it with another plugin. One of these plugins is the Graal compiler (and most likely, it’s the first and – for a long time – the most important plugin).
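If you want to check whether a JVM was actually started with these parameters, you can ask the JVM for its own arguments. Here’s a minimal sketch using the standard java.lang.management API. The class name JitFlagCheck is made up for this example, and the check only looks at the command line: it can’t tell you whether a particular build enables the flag by default.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class JitFlagCheck {
    /** Returns true if the given JVM arguments switch on the JVMCI compiler. */
    static boolean usesJvmciCompiler(List<String> jvmArgs) {
        boolean enabled = false;
        for (String arg : jvmArgs) {
            // The last occurrence wins, just like on the real command line.
            if (arg.equals("-XX:+UseJVMCICompiler")) {
                enabled = true;
            } else if (arg.equals("-XX:-UseJVMCICompiler")) {
                enabled = false;
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        // The arguments the running JVM was started with (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("JVMCI compiler requested: " + usesJvmciCompiler(jvmArgs));
    }
}
```

Run it once with and once without the two -XX parameters mentioned above to see the difference.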
The JIT compiler has been rewritten from scratch!
Graal, in turn, is exciting because it’s a complete rewrite of the Java compiler. Traditionally, the low-level code of the Java virtual machine has been written in C++. That was a good decision as long as Java was an interpreted language. But it always puzzled me when the JIT compiler was added to Java. One of the best practices of compiler construction is to write the compiler in the language to be compiled.
Sounds weird? It is.
But it makes a lot of sense once you take into account that the first few generations of the compiler are written in an already-established language before switching to the target language. So you write a little bootstrap compiler, implement a couple of features of the new language, and as soon as the language is powerful enough to write a compiler, you replace the original compiler with a compiler written in the new language.
The rationale behind this is that writing the compiler in the new language is a great test case. Plus, it’s a great motivation to optimize the performance of the language.
In the case of Java, it took a tad longer to get there.
It’s Java, dude!
Writing the Java compiler in Java has other benefits, too. Now every Java programmer can experiment with the compiler. C++ is considered a hard language, so few people took the trouble to download the source code to their own computer. I guess implementing the compiler in Java itself is going to result in countless pull requests improving the Java language. In other words, it allows for crowdsourcing. That’s why I believe the performance of the Java programs compiled by Graal is soon going to overtake Java applications compiled by the traditional HotSpot+JRockit compiler.
Of course, there’s also the combined weight of the history of the existing compiler. Like every other software, it became more and more difficult to maintain over the years. In the software industry, it’s often a good idea to start over again. It’s a pity that starting over is almost never an option. No matter how high the cost of maintenance is, the management always compares it to the money they’ve already invested in the product. From this perspective, it’s a lucky circumstance that Oracle decided to invest time and money to write a new compiler.
Graal operates on an AST
There’s a second reason why Graal might perform better than HotSpot. The HotSpot compiler operates on the Java bytecode. Graal takes a slightly different approach. The Java part of Graal also operates on the Java bytecode, but internally, Graal works on an abstract syntax tree (aka AST). That’s a fairly generic representation of the program code. Every modern programming language naturally translates into an AST. Plus, if my sources are right, it’s easier to optimize the AST than the underlying bytecode. (Note that I’m not entirely convinced: I always assumed every optimizer first converts the code to an AST).
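To make the “easier to optimize” claim a bit more tangible, here’s a toy AST with a single optimization, constant folding. This is a sketch in plain Java and says nothing about Graal’s real data structures. All class names (AstFoldingDemo, Const, Var, Add, Mul) are invented for this example.

```java
class AstFoldingDemo {
    /** A node of a tiny expression tree with one variable, x. */
    interface Node {
        int eval(int x);
    }

    static final class Const implements Node {
        final int value;
        Const(int value) { this.value = value; }
        public int eval(int x) { return value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static final class Var implements Node {
        public int eval(int x) { return x; }
        @Override public String toString() { return "x"; }
    }

    static final class Add implements Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        public int eval(int x) { return left.eval(x) + right.eval(x); }
        @Override public String toString() { return "(" + left + " + " + right + ")"; }
    }

    static final class Mul implements Node {
        final Node left, right;
        Mul(Node left, Node right) { this.left = left; this.right = right; }
        public int eval(int x) { return left.eval(x) * right.eval(x); }
        @Override public String toString() { return "(" + left + " * " + right + ")"; }
    }

    /** Constant folding: a subtree consisting only of constants is replaced by its value. */
    static Node fold(Node node) {
        if (node instanceof Add) {
            Node l = fold(((Add) node).left);
            Node r = fold(((Add) node).right);
            if (l instanceof Const && r instanceof Const) {
                return new Const(((Const) l).value + ((Const) r).value);
            }
            return new Add(l, r);
        }
        if (node instanceof Mul) {
            Node l = fold(((Mul) node).left);
            Node r = fold(((Mul) node).right);
            if (l instanceof Const && r instanceof Const) {
                return new Const(((Const) l).value * ((Const) r).value);
            }
            return new Mul(l, r);
        }
        return node; // constants and variables are already as simple as they get
    }

    public static void main(String[] args) {
        Node tree = new Add(new Var(), new Mul(new Const(2), new Const(3)));
        System.out.println(tree + " -> " + fold(tree)); // prints "(x + (2 * 3)) -> (x + 6)"
    }
}
```

On a tree, the optimization is a simple recursive pattern match. Doing the same on a flat stream of stack-machine instructions takes more bookkeeping.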
Graal does not necessarily operate on Java bytecode
Be that as it may, an important message is that the input of Graal is not necessarily the Java bytecode. The Truffle project is an alternative source of input. Truffle bypasses the bytecode, converting human-readable source code directly to the AST representation. Graal is a compiler translating this AST to machine code. In other words, it’s a great engine you can simply use.
That’s great news for everybody planning to create a new language. They don’t have to create the compiler from scratch. All they have to do is to write a parser translating the source code to a Graal AST. After that, they can use the infrastructure provided by Graal to translate the language to machine code. Most likely, they have to add or tweak a few things, but Graal has been designed with being extended in mind.
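To give you an idea of what “a parser translating the source code to an AST” means, here’s a hand-written recursive-descent parser for a toy expression language. It’s a sketch in plain Java, not the Truffle API. The node classes (Num, BinOp) and the grammar are assumptions made up for this example.

```java
class TinyParser {
    /** AST node of the toy language; eval() is only there to check the tree. */
    interface Node {
        int eval();
    }

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public int eval() { return value; }
    }

    static final class BinOp implements Node {
        final char op;
        final Node left, right;
        BinOp(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
        public int eval() { return op == '+' ? left.eval() + right.eval() : left.eval() * right.eval(); }
    }

    private final String src;
    private int pos = 0;

    private TinyParser(String src) { this.src = src.replace(" ", ""); }

    /** Parses source text like "2 + 3 * (4 + 1)" into an AST. */
    static Node parse(String source) {
        TinyParser parser = new TinyParser(source);
        Node root = parser.expr();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return root;
    }

    // expr := term ('+' term)*
    private Node expr() {
        Node left = term();
        while (peek() == '+') { pos++; left = new BinOp('+', left, term()); }
        return left;
    }

    // term := factor ('*' factor)*
    private Node term() {
        Node left = factor();
        while (peek() == '*') { pos++; left = new BinOp('*', left, factor()); }
        return left;
    }

    // factor := number | '(' expr ')'
    private Node factor() {
        if (peek() == '(') {
            pos++;
            Node inner = expr();
            if (peek() != ')') throw new IllegalArgumentException("')' expected at position " + pos);
            pos++;
            return inner;
        }
        int start = pos;
        while (Character.isDigit(peek())) pos++;
        if (start == pos) throw new IllegalArgumentException("Number expected at position " + pos);
        return new Num(Integer.parseInt(src.substring(start, pos)));
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    public static void main(String[] args) {
        System.out.println(parse("2 + 3 * (4 + 1)").eval()); // prints 17
    }
}
```

In a real language implementation, the nodes would be the ones the framework expects. The structure of the parser stays much the same.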
Graal, in contrast, has been written with dynamic languages in mind. It does so by making the type information optional in the AST. If the compiler can infer type information from the context, it can emit optimized code. Otherwise, it emits fairly generic code. See this presentation (slides 62 – 66) for more details.
The Truffle framework solves this problem. It’s a framework for implementing languages as simple interpreters. It allows you to easily convert your language to an AST that can be compiled by Graal.
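The idea of optional type information can be illustrated with a node that bets on a type and gives up the bet when it turns out to be wrong. The following sketch is plain Java and deliberately doesn’t use the Truffle API. The AddNode class and its intOnly flag are hypothetical names chosen for this example.

```java
class SpecializingAddDemo {
    /** An "add" node of a dynamically typed language. */
    static final class AddNode {
        // Optimistic assumption: so far, both operands have always been ints.
        private boolean intOnly = true;

        Object execute(Object a, Object b) {
            if (intOnly) {
                if (a instanceof Integer && b instanceof Integer) {
                    int sum = (Integer) a + (Integer) b; // fast path: plain int arithmetic
                    return sum;
                }
                intOnly = false; // the assumption failed, never try the fast path again
            }
            return generic(a, b);
        }

        /** Slow path covering every combination of operand types. */
        private static Object generic(Object a, Object b) {
            if (a instanceof Number && b instanceof Number) {
                return ((Number) a).doubleValue() + ((Number) b).doubleValue();
            }
            return String.valueOf(a) + b; // anything else is concatenated as text
        }

        boolean isSpecialized() { return intOnly; }
    }

    public static void main(String[] args) {
        AddNode add = new AddNode();
        System.out.println(add.execute(1, 2) + ", specialized: " + add.isSpecialized());
        System.out.println(add.execute(1, 2.5) + ", specialized: " + add.isSpecialized());
    }
}
```

As long as the bet holds, a compiler only has to generate code for the fast path. Once the bet is lost, this toy version falls back to the generic code for good.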
import org.graalvm.polyglot.*;
...
Context context = Context.create();
context.eval("js", "print(4+5);");
}
Update March 27, 2018: Since writing the article, it dawned on me that the error concerning require.js is on my side indeed. Well, partially, at least. Originally, require.js was simply a library you can add to your application to benefit from a module system. This works fine in the browser: simply add another
require() method optional.
Implementing your own language
I guess you’ve already got an idea how to implement your own language. Use Truffle as a framework to translate the source code to an AST, and use Graal to compile the AST to machine code. Chances are you can use AST nodes that already have a compiler. If you really have to define your own AST node type, it’s easy to add it to the Graal compiler and to write a machine code generator for the new node type, as Chris Seaton demonstrates. At least it’s easy compared to the traditional “create everything from scratch” approach.
OK, that sounds a bit abstract, so here’s an example. The GraalVM provides AST nodes emitting machine code. Local variables can be stored in slow main memory or in a fast CPU register. However, there are only a few registers. So it’s a high art to map local variables to CPU registers. Parameters are a similar story. Traditionally, parameters are simply stored on the stack, but that’s a slow memory access. Sometimes it’s possible to replace the stack with CPU registers. Graal brings a fairly sophisticated algorithm implementing the register mapping. You can just use it. Before Graal, you’d have to implement it yourself. That’s why many languages simply use the stack, which saves a lot of development time but costs a lot of performance.
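To show why you’d rather not write this yourself, here’s a heavily simplified version of a textbook algorithm for the problem, linear-scan register allocation. I’m not claiming this is what Graal does internally. The Interval class, the register numbering, and the spill rule are assumptions for this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.TreeSet;

class LinearScanDemo {
    /** Live range of a local variable: first and last instruction using it. */
    static final class Interval {
        final String name;
        final int start, end;
        int register = -1; // -1 means "lives on the stack"
        Interval(String name, int start, int end) { this.name = name; this.start = start; this.end = end; }
    }

    /** Assigns one of registerCount registers to each interval, spilling when none is free. */
    static void allocate(List<Interval> intervals, int registerCount) {
        if (registerCount < 1) throw new IllegalArgumentException("need at least one register");
        List<Interval> byStart = new ArrayList<>(intervals);
        byStart.sort(Comparator.comparingInt((Interval i) -> i.start));

        // Intervals currently holding a register, ordered by the point where they end.
        TreeSet<Interval> active = new TreeSet<>(
                Comparator.comparingInt((Interval i) -> i.end).thenComparing(i -> i.name));
        Deque<Integer> free = new ArrayDeque<>();
        for (int r = 0; r < registerCount; r++) free.add(r);

        for (Interval current : byStart) {
            // Registers of variables that are no longer live become available again.
            while (!active.isEmpty() && active.first().end < current.start) {
                free.add(active.pollFirst().register);
            }
            if (!free.isEmpty()) {
                current.register = free.poll();
                active.add(current);
            } else {
                // No register left: spill whichever variable stays alive longest.
                Interval longest = active.last();
                if (longest.end > current.end) {
                    active.remove(longest);
                    current.register = longest.register;
                    longest.register = -1;
                    active.add(current);
                }
                // Otherwise the current variable itself stays on the stack.
            }
        }
    }

    public static void main(String[] args) {
        List<Interval> vars = List.of(
                new Interval("a", 0, 10), new Interval("b", 1, 3),
                new Interval("c", 2, 8), new Interval("d", 4, 6));
        allocate(vars, 2);
        for (Interval v : vars) {
            System.out.println(v.name + " -> " + (v.register < 0 ? "stack" : "r" + v.register));
        }
    }
}
```

With two registers, the long-lived variable a ends up on the stack, while b, c, and d share the registers. A production-quality allocator also has to deal with loops, calling conventions, and splitting live ranges, which is exactly the work a reusable compiler takes off your hands.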
Ahead-of-time compiler for Java
Another nice feature is the AOT compiler that’s already being shipped with Java 9 and 10. At first glance, the idea of an ahead-of-time compiler for Java sounds a bit odd. We’ve been told year after year that an AOT compiler can never beat the performance of the JIT compiler. The JIT compiler has the opportunity to watch the behavior of the running code. So it can run a lot of optimizations an AOT compiler can never do.
The AOT compiler promises to change that. It allows you to precompile your application (or parts of it), taking the sting out of the cold-start phase.
If you want to experiment with the AOT compiler, have a look at the JEP 295 page, which has a lot of hints on how to use the AOT compiler.
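Here’s a small program you can use for a first experiment. The class ColdStartDemo is made up for this example. The commands in the comment follow the pattern shown on the JEP 295 page (jaotc and -XX:AOTLibrary). I haven’t verified them on every JDK version, and newer JDKs may require additional flags, so treat them as a starting point.

```java
// Commands in the style of the JEP 295 examples (file names are placeholders):
//   javac ColdStartDemo.java
//   jaotc --output libColdStartDemo.so ColdStartDemo.class
//   java -XX:AOTLibrary=./libColdStartDemo.so ColdStartDemo
class ColdStartDemo {
    /** Sums the integers 0 .. n-1; a stand-in for real start-up work. */
    static long checksum(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long begin = System.nanoTime();
        long result = checksum(1_000_000);
        long micros = (System.nanoTime() - begin) / 1_000;
        // The very first call runs before the JIT compiler has warmed up.
        System.out.println("checksum=" + result + ", first call took " + micros + " microseconds");
    }
}
```

Comparing the reported time with and without the AOT library gives a rough impression of the cold-start effect. A single measurement like this is noisy, so don’t read too much into the exact numbers.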
State of the art in Java 10
As far as I can see, the AOT compiler and the new Java compiler are the only parts of Graal that are shipped with the standard OpenJDK 10.
If you want to experiment with polyglot programming, download GraalVM. It’s tested with Java 8, 9, 10, and 11, but the only public distribution is a Java 8 build.
On OSX:
export PATH=<your installation path>/graalvm-0.32/Contents/Home/bin/:$PATH
On Windows:
PATH=<your installation path>/graalvm-0.32/bin/;%PATH%
On both systems:
js demo.js
Wrapping it up
Download page from Oracle labs
Polyglot programming on the JVM with Graal by Akihiro Nishikawa
Learning to use wholly GraalVM with a large hands-on section
Understanding how Graal works – a tour de force through Graal, teaching you no less than how to tweak your compiler or how to write your own language