Assignment 7: Hundred-pacer: Register Allocation
Due: Thu 11/18 at 9pm
git clone
The popular name “hundred pacer” refers to a local belief that, after being bitten, the victim will only be able to walk 100 steps before dying. Incorporating register allocation should allow your compiled code to complete its paces faster than before
In this assignment, you’ll be implementing the most important optimization for the compiler: register allocation.
We won’t be adding any new language features this week, so the language we are implementing is still diamondback. However, you will now use registers for local variables wherever possible, rather than our stack-based allocation scheme we’ve been using so far.
We break the assignment down into three parts. First, we need to analyze the code to determine which variables cannot be stored in the same register. Second, we use Chaitin’s graph coloring algorithm to assign registers (or spilled stack slots) to all variables. Finally, we need to change our code generation to account for our register allocation scheme, rather than storing all variables on the stack.
1 Analysis
First need to figure out which variables conflict with each other. We break this down into liveness analysis, which determines when a variable’s value is needed and conflict analysis which determines which variables are live at the same time with possibly different values.
We implement liveness analysis as a compiler pass, annotating all expressions with the free variables live at that point. It has the following type
fn liveness<Ann>(e: &SeqExp<Ann>, params: &HashSet<String>, live_out: HashSet<String>) -> SeqExp<HashSet<String>> {
}
The parameter set params
are the parameters in the current
function body. You need this information because you should not treat
these parameters as part of your liveness analysis since parameters
are passed into fixed slots on the stack. Next, live_out
is the
set of variables whose values are needed after running the expression
e
. The output of this function is an expression with the same
structure as e
but with each sub-expression annotated with the
variables that are live at that point. So for example, if the input expression were
let x = 3 in f(x, y, a)
where a
is a parameter and y
is not, then
your liveness function would return the expression annotated as
follows given an empty live_out
set:
Let { var: "x",
bound_exp: Imm(Num(3), {"y"}),
body: Call("f", [Var("x"), Var("y"), Var("a")], {"x", "y"}),
ann: {"y"},
}
The inner Call
has both x
and y
live because both
values are used as arguments. a
is not live since it is a
parameter. The Let
outer only has y
live in it because
the x
is defined in the letside.
The liveness function returns a SeqExp<HashSet<String>>
, an
expression annotated with its liveness information. To extract the
liveness information for an expression you can use the .ann()
method implemented in syntax.rs
.
We implement this as a compiler pass because we can use this liveness information later to optimize how many registers we save.
Next, given our liveness analysis, we go through and determine what
the conflicts are. This generates a graph that will be the
input to our graph coloring function.
The conflicts
function has the following type.
fn conflicts<Ann>(e: &SeqExp<(HashSet<String>, Ann)>) -> Graph<String>
Ann
that you
will ignore here (it is used to pass unique numbers later).The output of the conflicts function is a Graph<String>
representing the conflicts between variables. The Graph
datatype is provided in the graph.rs
module, and represents an
undirected graph. Each method is documented with a brief
description. Note that other than Graph::new()
, these are all
methods on a graph, so for instance to insert a vertex v
into a
graph g
using the insert_vertex
method, you invoke the
method as g.insert_vertex(v)
.
Make sure you insert all let-bound variables into the graph you produce so that they are all assigned a register, even if they have no conflicts. To ensure that you don’t put any parameters in your output graph, add the variables into your output when you see them bound by a let, rather than when they are used.
By Rice’s theorem, perfect liveness/conflict analysis are impossible. However, you should make an effort to be somewhat precise. In particular, you will not receive full credit by just saying all variables conflict with each other. In particular your conflict analysis handles examples like the following one from lecture:
def f(a, b):
let x = a + b in
let y = x in
h(x, y)
end
In this case, there are two local variables, x
and
y
in the function, and they do not conflict with each
other. Your conflict analysis should be able to tell that there is no
conflict between x
and y
to receive full
credit.
2 Graph Coloring
Now that we have our conflict graph, we attempt to assign each vertex a "color", i.e., register. We implement this with the following signature:
fn allocate_registers(conflicts: Graph<String>, all_registers: &[Reg]) -> HashMap<String, VarLocation>
You are given the register interference graph and a slice of registers
to be used in register allocation and you should return a HashMap
mapping the variables in the input graph to a VarLocation
,
which is either a register VarLocation::Reg(r)
or a stack
offset VarLocation::Spill(i)
.
We parameterize the graph coloring algorithm by which registers we are using so that we can test spilling more easily by passing in fewer registers. You may find this useful for incrementally implementing your use of registers as well1For instance you might first use no registers, then use only caller-save registers and finally both caller-save and callee-save registers.
Ultimately you should use the registers in
GENERAL_PURPOSE_REGISTERS
which include all registers except
for the following which we have reserved for some other use:
rax
: the return registerr13-14
: a scratch registersr15
: the heap pointerrbp
,rsp
: the frame and stack pointers
3 Using our Register Allocator
To show you when to call the conflict analysis and register allocator,
we have provided stub functions compile_fun
and
compile_main
, as well as putting the liveness analysis in the
compiler pipeline. Additionally, we have added a new pass
uniquify_names
before the liveness pass that ensures all
variable names are unique.
Once we have a register/spill assignment for all of the variables in each function body, we need to update our code generation to use the new assignment. This means changing several parts of the compiler.
First, a minor change is that your code generation function takes in an expression annotated by both liveness information and a unique name. You will need to change your code that generates unique names appropriately.
When we generate let bindings and variables, we should use the given registers rather than the old stack-based allocation. Furthermore, depending on how you set up your stack frame, you may need to account for how many callee-save registers you save either when accessing parameters or when accessing spilled variables.
In the preamble of our function bodies we will need to ensure we are saving any callee-save registers that we use, and restoring their values in our function epilogue and in our tail call code. It is always safe to save all of them, but it is less memory-intensive to only save those that we actually use. Also, our code for allocating space for our stack frame should take into account how many spilled variables we have, rather than our old method of counting the depth of variable bindings. As always, be careful to get the alignment right!
For (non-tail) function calls, we will need to ensure we are saving any caller-save registers whose values we need after the call returns. It is always safe to save all of them, but to be more efficient you can use the liveness information annotated on the expressions
4 Recommended TODO List
Here’s an order in which you could consider tackling the implementation:
Fill in liveness and conflicts with stub implementations and implement a register allocator that spills all variables it is given. Then update your code generation to get your old tests passing.
Implement liveness
Implement conflict analysis.
Implement the graph coloring algorithm
Re-run your end-to-end tests but with an empty set of registers.
Implement saving of (used) callee-save registers. Re-run your end-to-end tests but with only callee-save registers
Implement appropriate saving of caller-save registers. Re-run your end-to-end tests now with all (non-reserved) registers
5 List of Deliverables
your
compile.rs
andasm.rs
the other src/*.rs files in the starter code
any additional modules you saw fit to write
your
runtime/stub.rs
the Cargo.toml
integration tests (
tests/examples.rs
)your test input programs (
examples/*
files)
6 Grading Standards
This assignment will be solely autograded. There are three kinds of tests:
Liveness/Conflict analysis tests to test the soundness and precision of your analyses
Graph coloring tests that ensure you always give valid register/spill assignments and don’t spill excessively
End-to-end tests to determine that you’ve incorporated your register allocation successfully with your code.
We will not be grading your test coverage, but you may find it useful, especially using your extensive existing test suite to stress test your implementation.
7 Submission
Wait! Please read the assignment again and verify that you have not forgotten anything!
Please submit your homework to gradescope by the above deadline.1For instance you might first use no registers, then use only caller-save registers and finally both caller-save and callee-save registers