After studying computer science at a university level for more than four years at the time of writing this article, many concepts and tasks have become demystified for me. The idea of building a website or app seems pretty simple to me now. Even embedded development or Linux kernel modules don't seem like that big of a task. However, compilers still remain a topic I have a huge amount of respect for.
To get a better idea about how compilers work, I wanted to create a simple compiler by myself. Since I already work so much with Rust, I decided to use the Cranelift framework to build it. As for the programming language I wanted to compile, I went with Brainfuck.
Why Brainfuck and Cranelift?
I want to start by explaining why I chose to build a Brainfuck compiler. Brainfuck, if you don't know it, is an esoteric programming language. It only has eight total instructions (explained below) that are all represented by a single character. This makes the parsing step extremely easy and allows me to focus on the compilation itself.
How does Brainfuck work?
Brainfuck is conceptually very similar to a Turing machine. All data is represented in a large byte array (often 30000 elements) and accessed one cell at a time. The current position in the array is called the data pointer.
Command Explanation > Move the data pointer right by 1 element < Move the data pointer left by 1 element + Increase the value at the data pointer by 1 - Decrease the value at the data pointer by 1 . Print the value at the data pointer interpreted as ASCII (equivalent to putchar), Read one character and save its ASCII value at the data pointer (equivalent to getchar)[ Starts a loop ] If the value at the data pointer is nonzero, jump back to the matching bracket Should you write any real programs this way? Probably not. But it is fun to see what you can get away with.
Overall, Brainfuck is a "real" enough language that it makes sense to write a compiler for it (e.g. there is a "Hello World" example you can find on Wikipedia and that should work with this compiler). So, why use Cranelift to write it? I initially looked into llvm which is also what Rust, clang and many more compilers use. However, I did not find any of the Rust bindings for llvm particularly intuitive. If you need to use llvm with Rust, inkwell is probably your best bet right now. But there are also alternative compiler frameworks. I decided to go with Cranelift since it is a product of the Rust/WASM space. While it does not support all the targets of llvm yet, it is perfectly sufficient for a toy project like this one.
About the format of this writeup
In this article, I want to explain the parts of writing a Cranelift-based compiler that I find interesting or that I would have found helpful
when I started this project. I will not go into depth on e.g. how I handle compiler arguments using structopt since that would make this
article even longer than it already is and in my opinion result in a pretty bad information-to-noise ratio. If you are interested in these
details, you can read the source code linked at the end of the article.
Step One: Parsing
The first step in any compiler is parsing the source code. Basically, you want to transform the plaintext format into a
structured format that is easier to work with and that doesn't contain unnecessary data such as comments or whitespace.
This is referred to as an Abstract Syntax Tree (AST).
As I already explained, Brainfuck only uses single character instructions. So, we can create a Token enum like this:
#[derive(Debug)]
pub enum Token {
    Right,
    Left,
    Increment,
    Decrement,
    Output,
    Input,
    LoopStart,
    LoopEnd,
}
Then, we can match over all the characters in the input and simply discard any characters that aren't instructions as comments. This is another neat thing about Brainfuck: There can be no syntax errors, so we never need to check.
However, we can do one more trick in parsing. If we get an input like >>>, we shouldn't generate three instructions
to increase the data pointer, but rather increase the data pointer by 3 in one instruction. This may increase the performance
of our compiled Brainfuck code and make the generated assembly easier to read1.
Step Two: General Setup
The way Cranelift is designed, you compile your AST in two steps. First, you transform it into a (mostly) target-independent Intermediate Representation (IR). This IR can then be compiled to the assembly language of your target and put in a container format - an object file in our case.
We start with some boilerplate: While Cranelift IR is mostly independent of the target Instruction Set Archtitecture (ISA), some parts like calling conventions are still target-dependent. To generate the target ISA representation we use the following code:
let mut shared_builder = settings::builder();
shared_builder.enable("is_pic").unwrap();
let shared_flags = settings::Flags::new(shared_builder);
let isa_builder = isa::lookup(target).unwrap();
let isa = isa_builder.finish(shared_flags).unwrap();
let call_conv = isa.default_call_conv();
The shared_flags represent ISA-independent settings we want to customize for our build. For example, we enable Position-independent code
here. We then call the isa::lookup function to create an isa::Builder for our target triple. The target defaults to the host system, but can be changed using a command line parameter.
With the builder, we could also set ISA-specific settings, but this is not necessary here. We then create a Box<dyn TargetIsa>, a trait object that contains
all the settings for our target architecture including those that we have modified. Finally, we save the calling convention for our ISA because we'll need it later.
What is a Calling Convention?
When you call a function, the compiler will usually insert a
callassembly instruction. However, at that point you still need to think about how parameters are passed to the function and how you retrieve the return value(s). This is specified in a calling convention.
We also need to setup the module our compiled code is going to live in. Modules are the
containers that code and data live in when using Cranelift. We use ObjectBuilder and
ObjectModule from the cranelift_object crate, since we want to generate linkable
object files.
let obj_builder =
    ObjectBuilder::new(isa, "main", cranelift_module::default_libcall_names()).unwrap();
let mut obj_module = ObjectModule::new(obj_builder);
You can see that the ObjectBuilder needs to know our ISA, a symbol name (the "main" string) and libcall names.
Some Cranelift IR instructions don't have simple mappings to assembly instructions on certain target architectures.
These need to be replaced with function calls. We simply use the default mapping provided by cranelift_module.
The symbol name is for our case arbitrary.
Next, we want to create a function. Due to the (lack of) structure in Brainfuck, we just use one function that all our code is going to live in.
let mut sig = Signature::new(call_conv);
sig.returns.push(AbiParam::new(types::I32));
let fid = obj_module
    .declare_function("main", Linkage::Export, &sig)
    .unwrap();
let mut func = Function::with_name_signature(UserFuncName::user(0, 0), sig);
let mut func_ctx = FunctionBuilderContext::new();
let mut fn_builder = FunctionBuilder::new(&mut func, &mut func_ctx);
We need to declare the function in our module and then define it in Cranelift IR. Our main function should take no arguments and return a 32-bit
integer (the exit code of our program). Using the name main and making the symbol visible, we declare it in the module. Then,
we create a Cranelift IR function with a name and the already created signature.
For the name, we use a UserFuncName with namespace and index both set to zero. These
numbers are only meant for us and are not used by Cranelift, so we can use arbitrary values here.
The exported name is how other programs will find our main function. Then, we create a FunctionBuilder. This is the object that will
do most of the work of generating Cranelift IR for us. Before we can translate the Brainfuck tokens into instructions in this shiny,
new function, we still need to do some more preparation.
let mem_array = obj_module
    .declare_data("mem", Linkage::Local, true, false)
    .unwrap();
let mut data_ctx = DataContext::new();
data_ctx.define_zeroinit(30000);
obj_module.define_data(mem_array, &data_ctx).unwrap();
let local_data = obj_module.declare_data_in_func(mem_array, fn_builder.func);
let pointer = obj_module.target_config().pointer_type();
As you may remember, Brainfuck uses one large array as its memory. In our module, we declare a data object of 30000 bytes and
define it to be zero-initialized. We then need to make it accessible to our function. We also save the pointer type of our
platform in a variable, since we'll need it to access the memory later. At this point, we're still not done with our setup!
As described in the Brainfuck instruction table, the . and , instructions are equivalent to the putchar and getchar
C functions. We need to tell Cranelift that we want to import them.
let mut putchar_sig = obj_module.make_signature();
putchar_sig.params.push(AbiParam::new(types::I8));
let putchar_fn = obj_module
    .declare_function("putchar", Linkage::Import, &putchar_sig)
    .unwrap();
let local_putchar = obj_module.declare_func_in_func(putchar_fn, fn_builder.func);
We define putchar similary to our own main function. However, since it is only linked in and not built by us, we need to
import it in the object module. We then need to make it accessible to our main function. The code for getchar is very
similar, so I'll omit it here.
Now, we can finally start writing code into our function:
let block = fn_builder.create_block();
fn_builder.switch_to_block(block);
fn_builder.seal_block(block);
let mem = fn_builder.ins().global_value(pointer, local_data);
let zero_ptr = fn_builder.ins().iconst(pointer, 0);
let offset = Variable::new(1);
fn_builder.declare_var(offset, pointer);
fn_builder.def_var(offset, zero_ptr);
let mut jumps = Vec::new();
Code in Cranelift is organized into Blocks . A block is a set of instructions without any branches except at the beginning
and end. By sealing the block, we tell Cranelift that there will be no further branches to this block (which we know here, since
it's the first block). We then create some variables and values we will need later. mem points to the array we created earlier
and makes it accessible via a pointer. offset is the variable we will use as our data pointer and initialized to zero. We also
need a table of jumps to manage loops later.
Step 3: Instruction Translation
With the setup inside the function done, we can now generate Cranelift IR for our Brainfuck instructions. We do this by iterating
over our generated tokens (our pseudo-AST, so to speak) and matching over each token. I want to start by talking about the > and <
instructions.
// The offset is loaded outside the match statement
// because most instructions need to use it
let offset_val = fn_builder.use_var(offset);
match instr {
    Token::Right(count) => {
        // Since `offset` is a pointer, we need to convert it to a Rust i64 here
        let to_add = fn_builder.ins().iconst(pointer, *count as i64); 
        let new_offset = fn_builder.ins().iadd(offset_val, to_add);
        fn_builder.def_var(offset, new_offset);
    }
    // Other match arms omitted
}
As described in the parsing section, we combine repeated instructions. To update the offset variable, we
need three steps:
- Create a Cranelift IR value for the amount of cells to move
- Add the number of repetitions to the existing offset
- Save the result back into the offset
As you can see, this is exactly what the above code does. We load or create both operands, add them together and redefine the offset
variable. Calling fn_builder.ins() gives us an object that implements the InstBuilder trait and can add instructions to the current
block. The Token::Left case looks very similar, but inserts an isub instruction instead2.
Next, let's look at the implementation of Token::Increment.
Token::Increment(count) => {
    let mem_offset = fn_builder.ins().iadd(mem, offset_val);
    let val = fn_builder.ins().load(
        types::I8,
        MemFlags::new(),
        mem_offset,
        Offset32::new(0),
    );
    let to_add = fn_builder.ins().iconst(types::I8, *count as i64);
    let val = fn_builder.ins().iadd(val, to_add);
    fn_builder
        .ins()
        .store(MemFlags::new(), val, mem_offset, Offset32::new(0));
}
This one is a bit longer, but in many ways similar to Token::Left and Token::Right. First, we calculate the location of the
current data cell by adding the offset to the location of our global memory. Then, we load one byte (an I8) from the calculated
location. The addition works the same as before. Finally, we need to store the new value in the same location. Once again,
Token::Decrement looks the same, except it uses an isub instruction.
After these instructions, let's see how input and output work.
Token::Output => {
    let mem_offset = fn_builder.ins().iadd(mem, offset_val);
    let val = fn_builder.ins().load(
        types::I8,
        MemFlags::new(),
        mem_offset,
        Offset32::new(0),
    );
    fn_builder.ins().call(local_putchar, &[val]);
}
Token::Input => {
    let mem_offset = fn_builder.ins().iadd(mem, offset_val);
    let res = fn_builder.ins().call(local_getchar, &[]);
    let val = fn_builder.inst_results(res)[0];
    fn_builder
        .ins()
        .store(MemFlags::new(), val, mem_offset, Offset32::new(0));
}
For the output, we once again need to load the value of the current cell. We then insert a call instruction to call
the putchar function we defined earlier with val as the argument. Note that Cranelift would let us call putchar with
any number of arguments here! We could give it a wrong number of arguments and this would only be caught at the compilation
stage.
For Token::Input the implementation looks sort of like the reverse of Token::Input. We call getchar without any arguments
and load the first (and here, the only) return value. We then store it at the current cell. We also don't do any error handling
yet. This might be a project for the future.
Finally, we have the implementation of loops. These were the most difficult part for me as they require the use of jumps and breaks which in turn need additional blocks.
Token::LoopStart => {
    let next_block = fn_builder.create_block();
    fn_builder.ins().jump(next_block, &[]);
    jumps.push(next_block);
    fn_builder.switch_to_block(next_block);
}
Token::LoopEnd => {
    let mem_offset = fn_builder.ins().iadd(mem, offset_val);
    let val = fn_builder.ins().load(
        types::I8,
        MemFlags::new(),
        mem_offset,
        Offset32::new(0),
    );
    let jump_target = jumps.pop().unwrap();
    fn_builder.ins().brnz(val, jump_target, &[]);
    let next_block = fn_builder.create_block();
    fn_builder.ins().jump(next_block, &[]);
    fn_builder.switch_to_block(next_block);
}
When we enter a loop, we want to create a new block. Here, you can think of a block in a similar way to a label for goto:
We need the block to remember where to jump back to if we don't want to exit the loop. Therefore, we save the new block on
a stack to access it later.
Once we encounter the end of a loop, we need to check the value of the current cell. If it is zero, we move out of the loop, otherwise
we return to the beginning of the loop. So, we need the matching Token::LoopStart, or rather the matching block from our
jumps stack. We want to branch to this block if the current cell is non-zero, so we use a brnz instruction. However, this
forces us to create a new block for the non-branching path as well since branches and jumps are only allowed at the end of a block.
So, the non-branching path will jump to the new block. Finally, our function needs to return.
let val = fn_builder.ins().iconst(types::I32, 0);
fn_builder.ins().return_(&[val]);
fn_builder.seal_all_blocks();
fn_builder.finalize();
At the moment, we will always return with code 0, indicating no error. We can now seal all blocks in our function since we know there
will be no further jumps to them and finalize the FunctionBuilder, finishing the generation of Cranelift IR.
Step Four: Generating an Object File
We are almost finished now: We just need to compile our Cranelift IR and put it into an object file.
let mut context = Context::for_function(func);
obj_module.define_function(fid, &mut context).unwrap();
let res = obj_module.finish();
let mut file = File::create(output).unwrap();
res.object.write_stream(&mut file).unwrap();
We can now define the main function in the object module that we declared earlier. By calling ObjectModule::finish,
we compile our generated Cranelift IR and can write the result into a file.
Step Five: Linking
We now have a binary object file. However, it refers to the external putchar and getchar
functions that we couldn't import yet. In order to use them, we need to link our object file. Using a linker
of our choice (like ld on Linux and link.exe on Windows), we can link a C runtime that provides the necessary startup
code and functions. On Linux, this is usually /lib/crt1.o and on Windows we can use the Universal C Runtime ucrt.lib.
The repository provides scripts to invoke these linkers.
This is where using the proper calling convention matters: Since our program and the precompiled C runtime use the same calling convention, they can understand how to call each other's functions even though they're written in different programming languages.
What I learned
At a total of about 320 lines of code, my Brainfuck compiler is relatively compact. In fact, the compile function where
most of the magic happens is only about 160 lines. The main difficulty in this project was the lack and fragmentation of
information. Cranelift is still a relatively young project, so there are only relatively few guides on how to use it.
The toy JIT compiler was a very helpful resource, along with crate
documentation and source code. However, Cranelift is split across multiple crates which can make finding the information
you're looking for more difficult than it needs to be. Once I had a general idea of how the different crates worked together,
it mainly became a matter of putting the right parts together and I quite enjoyed using Cranelift for this project.
In the future, I also want to look at debugging, since most of my debugging so far consisted of reading Cranelift IR and essentially running the code by hand. This is not a viable technique for more complex compilers, so I'm very interested in seeing how I can integrate something like the DWARF debugging format into my compiler.
For now, you can find the project on my GitHub Page.
- 
I haven't actually profiled this. Maybe a modern CPU can optimize all these identical instructions really well. However, the point about assembly readability is really important to me: A lot of my time debugging this project was spent reading Cranelift IR or x86 assembly. ↩ 
- 
This similarity indicates that it might make sense to use one instruction for both the left and right case. I chose not to do this here, because it would make parsing the input more complicated and I wanted to focus on the compilation. ↩