* Announcements

Next homework will be out this Friday: another 2-week assignment
covering function definitions and calls/tail calls.

Next week we have Fall break on Monday-Tuesday. I will still hold my
Tuesday office hours.

After Fall break: I will be away at a conference for a week. One
lecture will be cancelled and one will be a guest lecture by GSI
Steven. Dates TBA.

Today: Deep dive on lambda lifting.

* Review of Scope

  let x1 = e1,
      x2 = e2,
      x3 = e3,
      ...
  in e

  def f1(x11,...): e1
  and
  def f2(x21,...): e2
  and
  ...
  def fn(xn1,...): en
  in e

let: *nested scope*: each xi is bound in e{i+1}, e{i+2}, ... and in e.
It is not bound in ei.

def: *mutually recursive*: each fi is bound in all of the ei and in e;
the xji are only bound in ej.

In both cases we disallow the x1,... or the f1,... from defining the
same name twice. But when nesting, we allow shadowing:

  def f(x,..): e1 in
  let z = f(...) in
  def f(y,...): e2 in
  f(...)

is allowed.

* Why do we need lambda lifting?

  def mult(x, y):
    def loop(i):
      if i == 0: 0
      else: x + loop(i - 1)
    in
    loop(y)
  in
  mult(5, 3)

When we call loop, it needs to have access to at least x in each stack
frame, so whenever loop is called, we need to push not just i, but
also x.

We can model this by a program transformation:

  def mult(x, y): loop(x, y)
  and
  def loop(x, i):
    if i == 0: 0
    else: x + loop(x, i - 1)
  in
  mult(5, 3)

You might ask why we didn't write the original this way, but note that
the programmer didn't have to worry about the fact that x was getting
captured! This is a nice convenience!

General principle: Turn a program which has various *local* function
definitions, which might *capture* variables in their containing
scope, into a program where all functions are *lifted* to the top
level as a bunch of mutually recursive functions.
I.e., start with

  let x = 7 in
  def f(a,b,c):
    def g(z):
      def h(i,j,k):
        if i == a: g(z + b + q(k)) * 3
        else: 17
      in
      h(z,b,q(3)) + 4
    and
    def q(r): x - r
    in
    g(c * x) - 7
  in
  f(x, 14, 7)

and turn it into something flattened:

  def f(x, a, b, c): g(x, a, b, c * x) - 7
  and
  def g(x, a, b, z): h(x, a, b, z, z, b, q(x, 3)) + 4
  and
  def h(x, a, b, z, i, j, k):
    if i == a: g(x, a, b, z + b + q(x, k)) * 3
    else: 17
  and
  def q(x, r): x - r
  in
  let x = 7 in
  f(x, x, 14, 7)

where we have added extra arguments wherever necessary.

* How to Lambda Lift

We can break the lifting process into two stages:

  1. Extend each function definition so that it takes all of its
     captured variables as arguments
  2. Lift the local function definitions to the top level

I.e., for our example above:

  def mult(x, y):
    def loop(i):
      if i == 0: 0
      else: x + loop(i - 1)
    in
    loop(y)
  in
  mult(5, 3)

becomes first something like

  def mult(x, y):
    def loop(x, i):
      if i == 0: 0
      else: x + loop(x, i - 1)
    in
    loop(x, y)
  in
  mult(5, 3)

and then we lift everything to the top level

  def mult(x, y): loop(x, y)
  and
  def loop(x, i):
    if i == 0: 0
    else: x + loop(x, i - 1)
  in
  mult(5, 3)

(Easy to fuse these if we want to)

** Variable Capture

Take each function, and add as arguments all the variables it will
need to have in its stack frame at runtime. Additionally, add those
arguments to every function *call*.

How do we determine what variables to add to each function?
Consider two things:

  1. Correctness (most important)
  2. Efficiency

-- Class discussion --

  let x = 7 in
  def f(a,b,c):
    def g(z):
      def h(i,j,k):
        if i == a: g(z + b + q(k))
        else: 17
      in
      h(z,b,q(3))
    and
    def q(r): x - r
    in
    g(c * x)
  in
  f(x, 14, 7)

  def mult(x, y):
    def loop(x, y, i):
      if i == 0: 0
      else: x + loop(x, y, i - 1)
    in
    loop(x, y, y)
  in
  mult(5, 3)

  let x1 = ...,
      x2 = ...,
      x3 = ...,
      ...
      x100 = ...
  in
  def mult(x, y):
    def loop(i):
      if i == 0: 0
      else: x + loop(i - 1)
    in
    loop(y)
  in
  mult(5, 3)

Here's what I expect to hear:

  1. All variables that are currently in scope?
  2. All variables that occur syntactically in the body of the function?
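
To make the two candidate answers concrete, here is a minimal Python
sketch that computes both sets for loop in the mult example. It is not
the course compiler: the tuple-based AST and the name free_vars are
assumptions made for illustration, and function names are assumed to
live in a separate namespace from variables, so they are never counted
as captured.

```python
def free_vars(e):
    """Variables that occur in e without being bound inside e."""
    tag = e[0]
    if tag == "num":                      # ("num", n)
        return set()
    if tag == "var":                      # ("var", x)
        return {e[1]}
    if tag == "prim":                     # ("prim", op, left, right)
        return free_vars(e[2]) | free_vars(e[3])
    if tag == "if":                       # ("if", cond, then, else)
        return free_vars(e[1]) | free_vars(e[2]) | free_vars(e[3])
    if tag == "let":                      # ("let", x, rhs, body)
        _, x, rhs, body = e
        # x is bound in body only, not in its own right-hand side
        return free_vars(rhs) | (free_vars(body) - {x})
    if tag == "call":                     # ("call", f, [args])
        return set().union(*(free_vars(a) for a in e[2]))
    if tag == "def":                      # ("def", [(f, [params], body)], body)
        _, defs, body = e
        out = free_vars(body)
        for _name, params, fbody in defs:
            out |= free_vars(fbody) - set(params)
        return out
    raise ValueError(f"unknown expression tag: {tag}")


# def loop(i): if i == 0: 0 else: x + loop(i - 1)
loop_params = ["i"]
loop_body = ("if", ("prim", "==", ("var", "i"), ("num", 0)),
             ("num", 0),
             ("prim", "+", ("var", "x"),
              ("call", "loop", [("prim", "-", ("var", "i"), ("num", 1))])))

# Candidate (1): everything in scope where loop is defined (mult's parameters).
in_scope_at_loop = {"x", "y"}
# Candidate (2): what occurs syntactically in loop's body, minus its own parameters.
occurring = free_vars(loop_body) - set(loop_params)

print(sorted(in_scope_at_loop))
print(sorted(occurring))
```

For this example candidate (1) adds both x and y to loop, while
candidate (2) adds only x.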
Why is (1) correct, but wasteful?
Why is (2) wrong (but when it is correct, more efficient)?

An example of a function where we need to capture a variable that does
*not* syntactically occur in the body:

  def f(a,b):
    def g(): b + h()
    and
    def h(): a
    in
    g()
  in
  f(0,1)

Using (2), g only receives b, but it still has to pass a to h:

  def f(a,b):
    def g(b): b + h(a)
    and
    def h(a): a
    in
    g(b)
  in
  f(0,1)

What we need instead:

  def f(a,b):
    def g(a, b): b + h(a)
    and
    def h(a): a
    in
    g(a, b)
  in
  f(0,1)

-- Example of (1) being *very* wasteful --

  let x1 = ...,
      x2 = ...,
      x3 = ...,
      ...
      x100 = ...
  in
  def loop(i):
    if i == 0: 0
    else: x1 + loop(i - 1)
  in
  loop(x2) + x100

Adds 100 variables to every stack frame!!

-- Example of (2) being wrong --

  def f(x):
    def g(a): a + h(a - 1)
    and
    def h(z):
      if z == 0: x
      else: g(z)
    in
    g(15)
  in
  ...

Here g never mentions x, but it calls h, which does, so g has to
capture x as well.

How to improve on (2)? Classical dichotomy in compilation:

  1. Try to produce the best code at every stage
     (fixed point algorithm for determining what variables are captured)
  2. Produce inefficient code but implement good optimizations *later*
     (implement a general purpose "unused argument removal"
     optimization pass)

So we might have a later pass that optimizes

  def mult(x, y): loop(x, y, y)
  and
  def loop(x, y, i):
    if i == 0: 0
    else: x + loop(x, y, i - 1)
  in
  mult(5, 3)

into what we wrote manually.

The answer (in this class) is almost always option 2 of this
dichotomy: for instance, the programmer might have included unused
arguments in their code that you would want removed anyway. If we are
building an optimizing compiler anyway, let the optimizer do the work.

-- Food for thought: are there settings where option 1 would be more
practical? --

Ok, so that's how we add the extra arguments to the function
definitions, but we also need to add the arguments to the function
*calls*.

-- How would you implement this? --

Regardless of how we propagate that information, there is a subtle
bug(!)

-- Challenge: come up with an example program such that the algorithm
so far produces incorrect code --

Answer: shadowing.
Without shadowing, everything is fine:

  def f(a,b):
    def g():
      let a' = 3 in
      (b * a') + h()
    and
    def h(): a
    in
    g()
  in
  f(0,1)

But if the let shadows a:

  def f(a,b):
    def g():
      let a = 3 in
      (b * a) + h()
    and
    def h(): a
    in
    g()
  in
  f(0,1)

then adding the captured variables gives a program that passes the
wrong a (3 instead of 0) to h:

  def f(a,b):
    def g(a, b):
      let a = 3 in
      (b * a) + h(a)
    and
    def h(a): a
    in
    g(a,b)
  in
  f(0,1)

The same thing happens if we capture everything in scope:

  def f(a,b):
    def g(a,b):
      let a = 42 in
      b + h(a,b)
    and
    def h(a,b): a
    in
    g(a,b)
  in
  f(0,1)

With unique names, the source program

  def f(a#0,b#0):
    def g():
      let a#1 = 42 in
      b#0 + h()
    and
    def h(): a#0
    in
    g()
  in
  f(0,1)

becomes the correct

  def f(a#0,b#0):
    def g(a#2,b#2):
      let a#1 = 42 in
      b#2 + h(a#2)
    and
    def h(a#3): a#3
    in
    g(a#0,b#0)
  in
  f(0,1)

Example:

  def f(x,y,z):
    def g(a): a + x + y
    in
    let x = 12, y = 13 in
    g(z)
  in
  f(1,2,3)

becomes

  def f(x,y,z):
    def g(x, y, a): a + x + y
    in
    let x = 12, y = 13 in  -- oh no!!!
    g(x, y, z)
  in
  f(1,2,3)

** Lifting

The lifting is straightforward to implement once we've done this,
right?

-- Challenge: come up with an example program that would break if we
simply lift the functions to the top level --

  let x = ... def f(a,b,c): e1 in ...,
      y = ... def f(i,j): e2 in ...
  in ...

Shadowing is the culprit again(!)

* Our Savior: Unique Identifiers

Shadowing is completely reasonable for the source language syntax, but
there's 0 benefit to allowing it in the compiler's internal
representation of the program. So why don't we get rid of it?

The name of a variable doesn't really matter in itself; we can always
rewrite the program so that there is no shadowing.

Additionally, for function definitions, we will squash everything into
one group of mutually recursive definitions, so all the function names
have to be different there.

-- Discussion: at what point should we rename variables uniquely? --

How? Of course we can use a tagging pass to give us some unique ids,
then we can just append them to the names.

Algorithm for renaming:

  1. Recursively descend into the term, keeping track of a mapping
     from old variable names to new variable names.
  2. When you reach a variable *declaration* (def/let): generate a new
     name for each of the newly introduced variables, and associate
     the old name with the new name in the environment for the
     sub-expressions where it is bound.
  3. When you reach a variable *use* (var/funcall): look up the old
     name in the current environment and replace it with the newly
     generated one.

Step through this example on the board:

  let x = 5 in
  let x = x + 1 in
  x

* Updated Compiler Pipeline

  Source Program
    ---[ parsing ]--> AST
    ---[ check_prog ]--> well-scoped AST
    ---[ unique names ]--> well-scoped AST with unique names
    ---[ lambda lift ]--> (AST, FunDefs)
    ---[ sequentialize ]--> (SeqAST, SeqFunDefs)
    ---[ codegen ]--> ASM

* Downsides of Lambda Lifting

  def mult(x, y):
    def loop(i, acc):
      if i == 0: acc
      else: loop(i - 1, y + acc)
    in
    loop(x, 0)
  in
  mult(5, 3)

becomes

  def loop(y, i, acc):
    if i == 0: acc
    else: loop(y, i - 1, y + acc)
  and
  def mult(x,y): loop(y, x, 0)
  in
  mult(5, 3)

We have *lost* information here:

  1. In the original program mult and loop share the variable y, so
     the variable will be stored in the same location throughout.
  2. In the lifted program, mult and loop have unrelated argument
     lists. y happens to be the first argument to loop and the second
     argument to mult.
     - In our simple stack-based allocation, this means they will
       *certainly* be assigned different locations, so we will have to
       mov y to a new location on the stack.
     - If we do a more complex register-based allocation scheme, we
       can figure out using a program analysis that the two variables
       *can* be stored in the same place, but it will be hard in
       general to ensure that they will be.

Compromise:

  1. Lambda lift only functions that are called with *true non-tail
     calls*
  2. Tail called functions stay nested
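
Returning to the renaming algorithm from the unique identifiers
section, its three steps can be sketched in Python as follows. This is
a minimal sketch, not the course compiler: the tuple-based AST, the
name uniquify, and the old#n naming scheme are assumptions made for
illustration, and the input is assumed to be well-scoped already (as
after check_prog), so every lookup succeeds.

```python
from itertools import count


def uniquify(expr):
    """Return a copy of expr in which every let/def-bound name is unique."""
    fresh = count()

    def new_name(old):
        return f"{old}#{next(fresh)}"

    # venv maps old variable names to new ones, fenv does the same for functions.
    def go(e, venv, fenv):
        tag = e[0]
        if tag == "num":                  # ("num", n)
            return e
        if tag == "var":                  # use: look up the new name
            return ("var", venv[e[1]])
        if tag == "prim":                 # ("prim", op, left, right)
            _, op, left, right = e
            return ("prim", op, go(left, venv, fenv), go(right, venv, fenv))
        if tag == "if":                   # ("if", cond, then, else)
            _, c, t, f = e
            return ("if", go(c, venv, fenv), go(t, venv, fenv), go(f, venv, fenv))
        if tag == "let":                  # declaration: ("let", x, rhs, body)
            _, x, rhs, body = e
            new_rhs = go(rhs, venv, fenv)     # x is not bound in its own rhs
            x2 = new_name(x)
            return ("let", x2, new_rhs, go(body, {**venv, x: x2}, fenv))
        if tag == "call":                 # use: ("call", f, [args])
            _, f, args = e
            return ("call", fenv[f], [go(a, venv, fenv) for a in args])
        if tag == "def":                  # declaration: ("def", [(f, params, body)], body)
            _, defs, body = e
            # Function names are bound in every definition body and in the body.
            fenv2 = {**fenv, **{name: new_name(name) for name, _, _ in defs}}
            new_defs = []
            for name, params, fbody in defs:
                params2 = [new_name(p) for p in params]
                venv2 = {**venv, **dict(zip(params, params2))}
                new_defs.append((fenv2[name], params2, go(fbody, venv2, fenv2)))
            return ("def", new_defs, go(body, venv, fenv2))
        raise ValueError(f"unknown expression tag: {tag}")

    return go(expr, {}, {})


# let x = 5 in let x = x + 1 in x
prog = ("let", "x", ("num", 5),
        ("let", "x", ("prim", "+", ("var", "x"), ("num", 1)),
         ("var", "x")))
renamed = uniquify(prog)
print(renamed)
```

On the board example this produces the AST for
let x#0 = 5 in let x#1 = x#0 + 1 in x#1: the right-hand side of the
inner let still refers to the outer x, and the final use refers to the
inner one.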