Lecture 5: Binary Operations and Basic Blocks
Today we will extend the compiler to support binary arithmetic operations. This is a surprisingly significant change as it introduces the ambiguity of evaluation order into the language, and so we will introduce a new pass to the compiler that makes the evaluation order explicit in the structure of the term.
1 Growing the language: adding infix operators
Again, we follow our standard recipe:
Its impact on the concrete syntax of the language
Examples using the new enhancements, so we build intuition of them
Its impact on the abstract syntax and semantics of the language
Any new or changed transformations needed to process the new forms
Executable tests to confirm the enhancement works as intended
1.1 Concrete and Abstract Syntax, Examples
We add new forms to our grammar: three binary arithmetic operations, as well as parentheses so that we can disambiguate arithmetic notation.
‹prog› ::= def main ( IDENTIFIER ) : ‹expr›

‹expr› ::= NUMBER
         | ADD1 ( ‹expr› )
         | SUB1 ( ‹expr› )
         | IDENTIFIER
         | LET IDENTIFIER EQ ‹expr› IN ‹expr›
         | ‹expr› + ‹expr›
         | ‹expr› - ‹expr›
         | ‹expr› * ‹expr›
         | ( ‹expr› )
Here the abstract syntax breaks slightly from the concrete syntax in that we don’t have an abstract syntax form for parentheses, since they only serve a syntactic and not a semantic purpose. We add these new operations as primitives, adjusting the primitive constructor to take in a vector of arguments, so that it encapsulates both unary and binary primitives.
enum Prim {
    Add1,
    Sub1,
    Add,
    Sub,
    Mul,
}

enum Expression {
    ...
    Prim { prim: Prim, args: Vec<Expression> },
}
These new expression forms should be familiar from standard arithmetic notation. The parser will take care of operator precedence. I.e., the expressions

(2 - 3) + 4 * 5
(2 - 3) + (4 * 5)

both parse to the same abstract syntax tree:
Prim(Add,
[Prim(Sub, [Number(2), Number(3)]),
Prim(Mul, [Number(4), Number(5)])])
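Using the adjusted Prim constructor, the same tree can be written out directly in Rust. A sketch (assuming the Number variant from the earlier lectures carries an i64):

fn example_tree() -> ast::Expression {
    // (2 - 3) + (4 * 5): the binary primitives carry their operands in the args vector.
    ast::Expression::Prim {
        prim: ast::Prim::Add,
        args: vec![
            ast::Expression::Prim {
                prim: ast::Prim::Sub,
                args: vec![ast::Expression::Number(2), ast::Expression::Number(3)],
            },
            ast::Expression::Prim {
                prim: ast::Prim::Mul,
                args: vec![ast::Expression::Number(4), ast::Expression::Number(5)],
            },
        ],
    }
}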
1.2 Semantics
At first it seems utterly straightforward to extend our interpreter to account for these new forms:
fn interpret(p: &ast::Program, x: i64) -> i64 {
    fn interp_exp(e: &ast::Expression, mut env: HashMap<String, i64>) -> i64 {
        match e {
            ast::Expression::Prim { prim, args } => match prim {
                ast::Prim::Add => {
                    let res1 = interp_exp(&args[0], env.clone());
                    let res2 = interp_exp(&args[1], env);
                    res1 + res2
                }
                ...
            },
            ...
        }
    }
    let env: HashMap<String, i64> = HashMap::unit(p.parameter.clone(), x);
    interp_exp(&p.body, env)
}
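Following our recipe, we also want an executable test for the new forms. A sketch, where parse stands in for whatever parsing entry point the compiler exposes (the name is illustrative):

#[test]
fn adds_subtracts_and_multiplies() {
    // (2 - 3) + 4 * x with x = 5 should give -1 + 20 = 19.
    let prog = parse("def main(x):\n  (2 - 3) + 4 * x");
    assert_eq!(interpret(&prog, 5), 19);
}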
But notice that there is a somewhat arbitrary choice here. Should the clause for interpreting add be
ast::Prim::Add => {
    let res1 = interp_exp(&args[0], env.clone());
    let res2 = interp_exp(&args[1], env);
    res1 + res2
}

or

ast::Prim::Add => {
    let res2 = interp_exp(&args[1], env.clone());
    let res1 = interp_exp(&args[0], env);
    res1 + res2
}
Do we evaluate the expression from left-to-right or right-to-left? It turns out that this decision doesn’t affect the interpreter for our current language, but it will matter with future extensions.
Do Now!
What programming language feature could you add to our language that would make the difference between left-to-right and right-to-left evaluation matter?
There are many different possible answers:
Mutable variables
Writing to stdout or files
Reading from stdin
For instance, consider if we added a primitive print(e)
that
would print out the value of e
and produce the same value. So
print(5)
would print 5
to stdout.
Then how should the following program evaluate?
print(6) * print(7)
Obviously, the value it should produce is 42
, but what should
it print?
Prints "67", this is left-to-right evaluation order
Prints "76", this is right-to-left evaluation order
Print either "67" or "76", meaning the evaluation order is unspecified, or implementation dependent
Which do you prefer? Either of the first two seems very reasonable, with left-to-right seeming more natural since it matches the way we read English. The third option is something probably only a compiler writer would choose, because it makes the program easier to optimize: you can arbitrarily re-order things!
We’ll go with the first choice: left-to-right evaluation order.
Note that doing things left-to-right like this is not quite the same as the PEMDAS rules. For instance the following arithmetic expression evaluates:
(2 - 3) + 4 * 5
==> -1 + (4 * 5)
==> -1 + 20
==> 19
rather than the possible alternative of doing the multiplication first. Following PEMDAS to determine the evaluation order would be very confusing once effects are involved:

(print(2) - 3) + print(4) * 5

Evaluating the multiplication first would print 4 before 2, even though print(2) appears first in the program text.
1.3 Enhancing the transformations: a new intermediate representation (IR)
Exercise
What goes wrong with our current naive transformations? How can we fix them?
We don’t need much in the way of new x86 features to compile our language. We’re already familiar with add and sub, and so we only need to know that the signed multiplication operation is called imul.
Let’s try manually “compiling” some simple binary-operator expressions to assembly:
Original expression | Compiled assembly
[worked examples omitted: a few operations on constants, ending with (2 - 3) + (4 * 5)]
Do Now!
Convince yourself that using a let-bound variable in place of any of these constants will work just as well.
So far, our compiler has only ever had to deal with a single active expression at a time: it moves the result into rax, increments or decrements it, and then potentially moves it somewhere onto the stack, for retrieval and later use. But with our new compound expression forms, that won’t suffice: the execution of (2 - 3) + (4 * 5) above clearly must stash the result of (2 - 3) somewhere, to make room in rax for the subsequent multiplication. We might try to use another register (rcx, maybe?), but clearly this approach won’t scale up, since there are only a handful of registers available. What to do?
1.3.1 Immediate expressions
Do Now!
Why did the first few expressions compile successfully?
Notice that for the first few expressions, all the arguments to the operators were immediately ready:
They required no further computation to be ready.
They were either constants, or variables that could be read off the stack.
Perhaps we can salvage the final program by transforming it somehow, such that all its operations are on immediate values, too.
Do Now!
Try to do this: Find a program that computes the same answer, in the same order of operations, but where every operator is applied only to immediate values.
Note that conceptually, our last program is equivalent to the following:
let first = 2 - 3 in
let second = 4 * 5 in
first + second
This program has decomposed the compound addition expression into the sum of two let-bound variables, each of which is a single operation on immediate values. We can easily compile each individual operation, and we already know how to save results to the stack and restore them for later use, which means we can compile this transformed program to assembly successfully.
Come to think of it, compiling operations when they are applied to immediate values is so easy, wouldn’t it be nice if we did the same thing for unary primitives and if? This way every intermediate result gets a name, which will then be assigned a place on the stack (or better yet, a register) instead of every intermediate result necessarily going through rax.
2 Basic Blocks
We introduce a new compiler pass: translating the code into an intermediate representation (IR). The intermediate representation we will use is called Static Single Assignment (SSA), and is the industry standard, used for example by the LLVM compiler framework.
We’ll only use a fragment of the full SSA IR for now: we will compile our source programs to a single basic block. The version of basic blocks we use now is a sequence of simple "operations" applied to immediate values, ending in a return statement.[1] This has the benefit of being quite straightforward to compile to assembly code: if we have a mapping from variables to memory locations, then each operation can be directly compiled to a short sequence of instructions. This is one of the benefits of our IR: since the IR is "closer" to assembly, we can more easily understand what code will be generated for it, and in particular how to make that code efficient.
Interestingly, even though SSA IR is used for compilation of imperative code, like our source language, the variable bindings in SSA IR are immutable, meaning that a variable cannot be updated once it is defined. This is the origin of the name "Static Single Assignment": every variable is only assigned to at one static program position.[2]
// Variable names in this IR are plain strings.
pub type VarName = String;

pub struct Program {
    pub param: VarName,
    pub entry: BlockBody,
}

pub enum BlockBody {
    Return(Immediate),
    Operation { dest: VarName, op: Operation, next: Box<BlockBody> },
}

pub enum Operation {
    Immediate(Immediate),
    Prim(Prim, Immediate, Immediate),
}

pub enum Prim {
    Add,
    Sub,
    Mul,
}

pub enum Immediate {
    Const(i64),
    Var(VarName),
}
An SSA program consists of a single basic block body, with a parameter name for the argument to the main function. A BlockBody is a sequence of Operations that assign the output of an Operation to a variable, ending with a Return of a specified immediate value. An operation is either one of the primitive arithmetic operations, or an immediate value.
Our SSA IR is a different kind of "programming language" than our source, in that we don’t really ever use a concrete syntax for it, instead only working with the abstract syntax trees. Programmers don’t write SSA programs themselves; the compiler generates and analyzes them. But for convenience of discussion, we will sometimes use a textual format, rendering a Program with 3 operations ending in a return as:
entry(x):
y = add 2 x
z = sub 18 3
w = mul y z
ret w
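For concreteness, that rendering corresponds to the following value of the IR types above (a sketch, assuming VarName is String as in the definitions):

fn example_block() -> ssa::Program {
    use ssa::{BlockBody, Immediate, Operation, Prim, Program};
    Program {
        param: "x".to_string(),
        entry: BlockBody::Operation {
            dest: "y".to_string(),
            op: Operation::Prim(Prim::Add, Immediate::Const(2), Immediate::Var("x".to_string())),
            next: Box::new(BlockBody::Operation {
                dest: "z".to_string(),
                op: Operation::Prim(Prim::Sub, Immediate::Const(18), Immediate::Const(3)),
                next: Box::new(BlockBody::Operation {
                    dest: "w".to_string(),
                    op: Operation::Prim(Prim::Mul, Immediate::Var("y".to_string()), Immediate::Var("z".to_string())),
                    next: Box::new(BlockBody::Return(Immediate::Var("w".to_string()))),
                }),
            }),
        },
    }
}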
2.1 Translating Basic Blocks to Assembly
We can think of our Basic Blocks as a simplified version of our source
language and so we can adapt our method of compiling let
bindings to compile basic blocks. We map each SSA variable to a memory
offset from rsp
and then we compile each operation to a sequence
of instructions that places the result of the operation in the
location of the given variable. For instance if we store y, z, w
to the offsets [rsp - 16]
, [rsp - 24]
and [rsp - 32]
then we can compile the multiply operation w = mul y z
to
mov rax, [rsp - 16]
mov r10, [rsp - 24]
imul rax, r10
mov [rsp - 32], rax
Here we use rax and r10 as "scratch registers" since x86 cannot operate on multiple addressed memory locations in the same instruction.
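As a sketch of how this might look in the compiler (not necessarily the exact backend we will build), here is one way to emit the instructions for a single dest = op step as strings, given a precomputed map from variable names to their stack offsets:

use std::collections::HashMap;

fn op_to_asm(dest: &str, op: &ssa::Operation, slots: &HashMap<String, i64>) -> Vec<String> {
    // Load an immediate into a scratch register: constants directly, variables from their slot.
    let load = |reg: &str, imm: &ssa::Immediate| match imm {
        ssa::Immediate::Const(n) => format!("mov {}, {}", reg, n),
        ssa::Immediate::Var(x) => format!("mov {}, [rsp - {}]", reg, slots[x]),
    };
    let mut code = match op {
        // A bare immediate: just load it into rax.
        ssa::Operation::Immediate(imm) => vec![load("rax", imm)],
        // A primitive: load both arguments into the scratch registers, then operate.
        ssa::Operation::Prim(prim, imm1, imm2) => {
            let instr = match prim {
                ssa::Prim::Add => "add",
                ssa::Prim::Sub => "sub",
                ssa::Prim::Mul => "imul",
            };
            vec![load("rax", imm1), load("r10", imm2), format!("{} rax, r10", instr)]
        }
    };
    // Finally, store rax into the destination's slot.
    code.push(format!("mov [rsp - {}], rax", slots[dest]));
    code
}

With y, z and w at offsets 16, 24 and 32, calling this for w = mul y z produces exactly the four instructions shown above.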
For uniformity, we can compile the entry point to move the input parameter from rdi into an offset from rsp. And lastly, we compile a ret in SSA to move the value into rax and then execute the ret instruction. This strategy would compile our example SSA program
entry(x):
y = add 2 x
z = sub 18 3
w = mul y z
ret w

to the following assembly:
;; entry(x):
mov [rsp - 8], rdi
;; y = add 2 x
mov rax, 2
mov r10, [rsp - 8]
add rax, r10
mov [rsp - 16], rax
;; z = sub 18 3
mov rax, 18
mov r10, 3
sub rax, r10
mov [rsp - 24], rax
;; w = mul y z
mov rax, [rsp - 16]
mov r10, [rsp - 24]
imul rax, r10
mov [rsp - 32], rax
;; ret w
mov rax, [rsp - 32]
ret
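The variable-to-offset mapping itself can be computed with a single walk over the block. A minimal sketch, giving the parameter [rsp - 8] and each destination variable the next 8-byte slot in order of appearance:

use std::collections::HashMap;

fn assign_slots(p: &ssa::Program) -> HashMap<String, i64> {
    let mut slots = HashMap::new();
    slots.insert(p.param.clone(), 8); // the parameter lives at [rsp - 8]
    let mut offset = 16;
    let mut body = &p.entry;
    while let ssa::BlockBody::Operation { dest, next, .. } = body {
        slots.insert(dest.clone(), offset);
        offset += 8;
        body = &**next; // keep walking down the block
    }
    slots
}

On the example block this assigns x, y, z and w the offsets 8, 16, 24 and 32 used above.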
2.2 Translating to a Basic Block
Now how do we go about compiling our source language to an SSA Basic Block? An SSA Basic Block is very similar to our source programming language except that
We’ve removed the redundant add1 and sub1 primitives, translating them to use add and sub with a constant
Programs end with an explicit ret of an immediate, whereas Adder was an expression language
Primitive operations can only be applied to immediates, whereas in Adder they can be arbitrarily complex sub-expressions
A variable binding stores the result of a simple Operation, whereas in Adder, the let binding stores the result of an arbitrarily complicated sub-expression.
Our goal then is to implement a function

fn lower(prog: ast::Program) -> ssa::Program

As a first attempt, we might try to lower each expression with a helper that produces a block body directly:

fn lower_exp(exp: ast::Expression) -> ssa::BlockBody
fn lower_exp(e: &ast::Expression) -> ssa::BlockBody {
    match e {
        ast::Expression::Variable(x) => ssa::BlockBody::Return(ssa::Immediate::Var(x.clone())),
        ast::Expression::Number(n) => ssa::BlockBody::Return(ssa::Immediate::Const(*n)),
        ast::Expression::Prim { prim, args } => match prim {
            ast::Prim::Add => {
                let arg1 = &args[0];
                let arg2 = &args[1];
                ??
            }
            ...
        },
        ...
    }
}
The Variable and Number cases are easy, but what should go in place of the ?? in the Add case? There is no simple way to combine lower_exp(arg1) and lower_exp(arg2) to get one BlockBody that returns their sum: each of them ends by returning its own result. What do we want to do in this case? We want to perform some sequence of operations and store the output of arg1 in a variable, then do the same for arg2, and then add up the results. We can make this compositional by using a trick called continuation-passing style: instead of producing a BlockBody that returns the value of the input expression directly, we take in as arguments

A "destination" variable where we should store the result of the expression
A "next" BlockBody of code that should be run after we have assigned the result to the destination variable
We call this combination of a destination variable and a next BlockBody a continuation, as it tells us how the program should continue after the expression we are compiling.
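Concretely, since the code below destructures the continuation as let (dest, body) = k, a simple representation is a pair (a sketch):

// A continuation: the destination variable plus the code to run after the assignment.
type Continuation = (ssa::VarName, ssa::BlockBody);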
How does continuation-passing style solve our issue? Well, consider the lower_exp function, but now taking a continuation as an argument:
fn lower_exp(e: &ast::Expression, k: Continuation) -> ssa::BlockBody {
    match e {
        ast::Expression::Variable(x) => {
            let (dest, body) = k;
            ssa::BlockBody::Operation {
                dest: dest,
                op: ssa::Operation::Immediate(ssa::Immediate::Var(x.to_string())),
                next: Box::new(body),
            }
        }
        ast::Expression::Number(n) => {
            let (dest, body) = k;
            ssa::BlockBody::Operation {
                dest: dest,
                op: ssa::Operation::Immediate(ssa::Immediate::Const(*n)),
                next: Box::new(body),
            }
        }
        ast::Expression::Prim { prim, args } => match prim {
            ast::Prim::Add => {
                let arg1 = &args[0];
                let arg2 = &args[1];
                // TODO: generate *unique* variable names for these!
                let tmp1 = format!("addArg1");
                let tmp2 = format!("addArg2");
                let (dest, body) = k;
                lower_exp(
                    arg1,
                    (
                        tmp1.clone(),
                        lower_exp(
                            arg2,
                            (
                                tmp2.clone(),
                                ssa::BlockBody::Operation {
                                    op: ssa::Operation::Prim(
                                        ssa::Prim::Add,
                                        ssa::Immediate::Var(tmp1),
                                        ssa::Immediate::Var(tmp2),
                                    ),
                                    dest,
                                    next: Box::new(body),
                                },
                            ),
                        ),
                    ),
                )
            }
            _ => todo!(),
        },
        _ => todo!(),
    }
}
When we compile a variable or a number, we simply place that immediate in the destination variable and then execute the next code of the continuation.
When compiling a complex expression like arg1 + arg2, we want to do the following sequence of things:

Compile arg1, storing its result in a temporary variable tmp1
Then compile arg2, storing its result in a temporary variable tmp2
Then add up tmp1 and tmp2, storing the result in the dest of the provided continuation
Finally, execute the provided body of the continuation
We see that the code above implements this by building up a large continuation to be passed to the recursive call on arg1. In a sense, this continuation-passing style translation runs "backwards": we first build the code that runs last (the final add followed by the continuation's body), wrap it in a continuation for arg2, and then wrap that in the continuation passed to arg1.

Finally, we return to lower, which should provide a continuation for the entry point expression. In this case we provide a continuation that immediately returns its input:
fn lower(p: &ast::Program) -> ssa::Program {
    // TODO: make sure this variable name is unique!
    let dest = format!("result");
    let body = ssa::BlockBody::Return(ssa::Immediate::Var(dest.clone()));
    ssa::Program {
        param: p.parameter.to_string(),
        entry: lower_exp(&p.body, (dest, body)),
    }
}
As written, there is a flaw in the translation. When we compile an Add expression, we use the same temporary variable names "addArg1" and "addArg2". So the result of this translation on the input program
def main(x):
(x + 3) + (4 + x)
is the following block, in which addArg1 and addArg2 are each assigned more than once. Worse, the block computes the wrong answer: the value of x + 3 stored in addArg1 is overwritten by 4 before the final addition.

entry(x):
addArg1 = x
addArg2 = 3
addArg1 = add addArg1 addArg2
addArg1 = 4
addArg2 = x
addArg2 = add addArg1 addArg2
result = add addArg1 addArg2
ret result
What we want instead is for the translation to generate a fresh name for every temporary (and for the parameter and result), so that each variable is assigned exactly once, for example:

entry(x%0):
addArg1%1 = x%0
addArg2%2 = 3
addArg1%3 = add addArg1%1 addArg2%2
addArg1%4 = 4
addArg2%5 = x%0
addArg2%6 = add addArg1%4 addArg2%5
result%7 = add addArg1%3 addArg2%6
ret result%7
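One way to generate names like these is a small counter-based generator that stamps each base name with a fresh number; threading it through the translation is the first exercise below. A sketch:

// Stamps each requested base name with a globally increasing counter, e.g. "addArg1%3".
struct Gensym {
    counter: u64,
}

impl Gensym {
    fn fresh(&mut self, base: &str) -> String {
        let name = format!("{}%{}", base, self.counter);
        self.counter += 1;
        name
    }
}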
Exercise
Modify the continuation-based translation to generate unique variable names
Exercise
Extend this continuation-based translation to the remaining types of expressions
[1] We will later add other "terminating" statements that can end a basic block, but we only need return for the straight-line code we are producing now.
[2] When we extend to full SSA IR, we will see that a variable can take on multiple values dynamically.