A Peek Inside the Erlang Compiler
Erlang is a complex system, and I can’t do its inner workings justice
in a short article, but I wanted to give some insight into what goes on
when a module is compiled and loaded. As with most compilers, the first
step is to convert the textual source to an abstract syntax tree, but
that’s unremarkable. What is interesting is that the code goes through
three major representations, and you can look at each of them.
Erlang is unique among functional languages in its casual scope rules. You introduce variables as you go, without fanfare, and there’s no creeping indentation caused by explicit scopes. Behind the scenes that’s too quirky, so the syntax tree is converted into Core Erlang. Core Erlang looks a lot like Haskell or ML with all variables carefully referenced in “let” statements. You can see the Core Erlang representation of a module with this command from the shell:
c(example, to_core).
The human-readable Core Erlang for the example module is written to example.core.
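If you want something concrete to inspect, a tiny module is enough. The one below is a hypothetical example.erl (the function name classify/1 is mine, not from any real project); save it next to where you start the shell, and the to_core compiler option will produce example.core from it.

```erlang
%% example.erl: a hypothetical minimal module for poking at compiler output.
-module(example).
-export([classify/1]).

%% The first clause matches only tuples of exactly three elements;
%% everything else falls through to the second clause.
classify({_, _, _}) -> triple;
classify(_) -> other.
```

Short modules like this keep the generated files small enough to read in one sitting.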
The next big transformation is from Core Erlang to code for the register-based BEAM virtual machine. BEAM is poorly documented, but it’s a lot like the Warren Abstract Machine developed for Prolog (but without the need for backtracking). BEAM isn’t terribly hard to figure out if you write short modules and examine them with:
c(example, 'S').
The BEAM assembly code for the example module is written to example.S.
The key to understanding BEAM is that there are two sets of registers:
one for passing parameters (“x” registers) and one for use as locals
within functions (“y” registers).
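Here is a rough sketch of how the two register sets come into play. The module and function names are made up for illustration, and the comments describe the general pattern rather than the exact listing you'll get, since register allocation varies between compiler versions.

```erlang
%% regs.erl: hypothetical module illustrating x and y register use.
-module(regs).
-export([scale_and_add/2]).

%% On entry, A arrives in x register 0 and B in x register 1.
scale_and_add(A, B) ->
    %% B is needed after the call below, but x registers are not
    %% preserved across calls, so the compiler saves B in a y register
    %% (a slot in the function's stack frame) before calling double/1.
    Scaled = double(A),   % argument goes in x0, result comes back in x0
    Scaled + B.

double(N) -> N * 2.
```

Compiling this to BEAM assembly and looking for the allocate, move, and call instructions is a quick way to check the pattern against what your compiler actually emits.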
Virtual BEAM code is the final output of the compiler, but it’s still not what gets executed by the system. If you look at the source for the Erlang runtime, you’ll see that beam_load.c is over six thousand lines of code. Six thousand lines to load a module? That’s because the BEAM loader is doing more than its name lets on.
There’s an optimization pass on the virtual machine instructions, specializing some for certain situations and combining others into superinstructions. Checking whether a value is a tuple of three elements takes a pair of BEAM operations: is_tuple and test_arity. The BEAM loader turns these into one superinstruction: is_tuple_of_arity. You can see this condensed representation of BEAM code with:
erts_debug:df(example).
The disassembled code is written to example.dis. (Note that the module must be loaded, so compile it before giving the above command.)
The loader also turns the BEAM bytecode into threaded code: a list of addresses that get jumped to in sequence. There’s no “Now what do I do with this opcode?” step, just fetch and jump, fetch and jump. If you want to know more about threaded code, look to the Forth world.
Threaded code takes advantage of the labels-as-values extension of gcc. If you build the BEAM emulator with another compiler like Visual C++, it falls back on using a giant switch statement for instruction dispatch, and there’s a significant performance hit.
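The idea of threaded code can be sketched in Erlang itself, with the caveat that this is only an analogy: the real emulator is written in C and jumps to label addresses, whereas this toy uses funs, and the three-instruction stack machine is invented for the example. What it does show is the division of labor: opcodes are resolved to their implementations once, at load time, so the run loop never decodes anything.

```erlang
%% threaded_sketch.erl: a toy illustration, not how BEAM is implemented.
-module(threaded_sketch).
-export([load/1, run/2, demo/0]).

%% "Loader": resolve each opcode to the fun that implements it, once.
load(Ops) ->
    [resolve(Op) || Op <- Ops].

resolve({push, N}) -> fun(Stack) -> [N | Stack] end;
resolve(add)       -> fun([B, A | Rest]) -> [A + B | Rest] end;
resolve(mul)       -> fun([B, A | Rest]) -> [A * B | Rest] end.

%% "Emulator": no opcode decoding here, just fetch the next fun and call it.
run([], Stack) -> Stack;
run([Instr | Rest], Stack) -> run(Rest, Instr(Stack)).

%% Computes (2 + 3) * 4 on the toy stack machine.
demo() ->
    Code = load([{push, 2}, {push, 3}, add, {push, 4}, mul]),
    run(Code, []).
```

Calling threaded_sketch:demo() returns [20]. The switch-statement fallback corresponds to skipping load/1 and matching on the opcode inside run/2 every time an instruction executes.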