Assignment 7: Hundred-pacer: Register Allocation

8.2

Assignment 7: Hundred-pacer: Register Allocation

Due: Thu 11/18 at 9pm

git clone

The popular name “hundred pacer” refers to a local belief that, after being bitten, the victim will only be able to walk 100 steps before dying. Incorporating register allocation should allow your compiled code to complete its paces faster than before

In this assignment, you’ll be implementing the most important optimization for the compiler: register allocation.

We won’t be adding any new language features this week, so the language we are implementing is still diamondback. However, you will now use registers for local variables wherever possible, rather than our stack-based allocation scheme we’ve been using so far.

We break the assignment down into three parts. First, we need to analyze the code to determine which variables cannot be stored in the same register. Second, we use Chaitin’s graph coloring algorithm to assign registers (or spilled stack slots) to all variables. Finally, we need to change our code generation to account for our register allocation scheme, rather than storing all variables on the stack.

1 Analysis

First need to figure out which variables conflict with each other. We break this down into liveness analysis, which determines when a variable’s value is needed and conflict analysis which determines which variables are live at the same time with possibly different values.

We implement liveness analysis as a compiler pass, annotating all expressions with the free variables live at that point. It has the following type

fn liveness<Ann>(e: &SeqExp<Ann>, params: &HashSet<String>, live_out: HashSet<String>) -> SeqExp<HashSet<String>> {
}

The parameter set params are the parameters in the current function body. You need this information because you should not treat these parameters as part of your liveness analysis since parameters are passed into fixed slots on the stack. Next, live_out is the set of variables whose values are needed after running the expression e. The output of this function is an expression with the same structure as e but with each sub-expression annotated with the variables that are live at that point. So for example, if the input expression were

let x = 3 in f(x, y, a)

where a is a parameter and y is not, then your liveness function would return the expression annotated as follows given an empty live_out set:

Let { var: "x",
      bound_exp: Imm(Num(3), {"y"}),
      body: Call("f", [Var("x"), Var("y"), Var("a")], {"x", "y"}),
      ann: {"y"},
    }

The inner Call has both x and y live because both values are used as arguments. a is not live since it is a parameter. The Let outer only has y live in it because the x is defined in the letside.

The liveness function returns a SeqExp<HashSet<String>>, an expression annotated with its liveness information. To extract the liveness information for an expression you can use the .ann() method implemented in syntax.rs.

We implement this as a compiler pass because we can use this liveness information later to optimize how many registers we save.

Next, given our liveness analysis, we go through and determine what the conflicts are. This generates a graph that will be the input to our graph coloring function. The conflicts function has the following type.

fn conflicts<Ann>(e: &SeqExp<(HashSet<String>, Ann)>) -> Graph<String>

It takes as input an expression that is annotated with two things: the set of live variables and some other annotation Ann that you will ignore here (it is used to pass unique numbers later).

The output of the conflicts function is a Graph<String> representing the conflicts between variables. The Graph datatype is provided in the graph.rs module, and represents an undirected graph. Each method is documented with a brief description. Note that other than Graph::new(), these are all methods on a graph, so for instance to insert a vertex v into a graph g using the insert_vertex method, you invoke the method as g.insert_vertex(v).

Make sure you insert all let-bound variables into the graph you produce so that they are all assigned a register, even if they have no conflicts. To ensure that you don’t put any parameters in your output graph, add the variables into your output when you see them bound by a let, rather than when they are used.

By Rice’s theorem, perfect liveness/conflict analysis are impossible. However, you should make an effort to be somewhat precise. In particular, you will not receive full credit by just saying all variables conflict with each other. In particular your conflict analysis handles examples like the following one from lecture:

def f(a, b):
  let x = a + b in
  let y = x in
  h(x, y)
end

In this case, there are two local variables, x and y in the function, and they do not conflict with each other. Your conflict analysis should be able to tell that there is no conflict between x and y to receive full credit.

2 Graph Coloring

Now that we have our conflict graph, we attempt to assign each vertex a "color", i.e., register. We implement this with the following signature:

fn allocate_registers(conflicts: Graph<String>, all_registers: &[Reg]) -> HashMap<String, VarLocation>

You are given the register interference graph and a slice of registers to be used in register allocation and you should return a HashMap mapping the variables in the input graph to a VarLocation, which is either a register VarLocation::Reg(r) or a stack offset VarLocation::Spill(i).

We parameterize the graph coloring algorithm by which registers we are using so that we can test spilling more easily by passing in fewer registers. You may find this useful for incrementally implementing your use of registers as well1For instance you might first use no registers, then use only caller-save registers and finally both caller-save and callee-save registers.

Ultimately you should use the registers in GENERAL_PURPOSE_REGISTERS which include all registers except for the following which we have reserved for some other use:

rax: the return register
r13-14: a scratch registers
r15: the heap pointer
rbp, rsp: the frame and stack pointers

3 Using our Register Allocator

To show you when to call the conflict analysis and register allocator, we have provided stub functions compile_fun and compile_main, as well as putting the liveness analysis in the compiler pipeline. Additionally, we have added a new pass uniquify_names before the liveness pass that ensures all variable names are unique.

Once we have a register/spill assignment for all of the variables in each function body, we need to update our code generation to use the new assignment. This means changing several parts of the compiler.

First, a minor change is that your code generation function takes in an expression annotated by both liveness information and a unique name. You will need to change your code that generates unique names appropriately.
When we generate let bindings and variables, we should use the given registers rather than the old stack-based allocation. Furthermore, depending on how you set up your stack frame, you may need to account for how many callee-save registers you save either when accessing parameters or when accessing spilled variables.
In the preamble of our function bodies we will need to ensure we are saving any callee-save registers that we use, and restoring their values in our function epilogue and in our tail call code. It is always safe to save all of them, but it is less memory-intensive to only save those that we actually use. Also, our code for allocating space for our stack frame should take into account how many spilled variables we have, rather than our old method of counting the depth of variable bindings. As always, be careful to get the alignment right!
For (non-tail) function calls, we will need to ensure we are saving any caller-save registers whose values we need after the call returns. It is always safe to save all of them, but to be more efficient you can use the liveness information annotated on the expressions

4 Recommended TODO List

Here’s an order in which you could consider tackling the implementation:

Fill in liveness and conflicts with stub implementations and implement a register allocator that spills all variables it is given. Then update your code generation to get your old tests passing.
Implement liveness
Implement conflict analysis.
Implement the graph coloring algorithm
Re-run your end-to-end tests but with an empty set of registers.
Implement saving of (used) callee-save registers. Re-run your end-to-end tests but with only callee-save registers
Implement appropriate saving of caller-save registers. Re-run your end-to-end tests now with all (non-reserved) registers

5 List of Deliverables

your compile.rs and asm.rs
the other src/*.rs files in the starter code
any additional modules you saw fit to write
your runtime/stub.rs
the Cargo.toml
integration tests (tests/examples.rs)
your test input programs (examples/* files)

6 Grading Standards

This assignment will be solely autograded. There are three kinds of tests:

Liveness/Conflict analysis tests to test the soundness and precision of your analyses
Graph coloring tests that ensure you always give valid register/spill assignments and don’t spill excessively
End-to-end tests to determine that you’ve incorporated your register allocation successfully with your code.

We will not be grading your test coverage, but you may find it useful, especially using your extensive existing test suite to stress test your implementation.

7 Submission

Wait! Please read the assignment again and verify that you have not forgotten anything!

Please submit your homework to gradescope by the above deadline.

1For instance you might first use no registers, then use only caller-save registers and finally both caller-save and callee-save registers

1	Analysis
2	Graph Coloring
3	Using our Register Allocator
4	Recommended TODO List
5	List of Deliverables
6	Grading Standards
7	Submission