* Announcements

Next homework will be out this Friday, another 2-week assignment
covering function definitions and calls/tail calls.


Next week we have Fall break on Monday-Tuesday. I will still hold my
Tuesday office hours.

After Fall break: I will be away for a conference a week. Cancel one
lecture and one will be guest lecture by GSI Steven. Dates TBA.


Today: Deep dive on lambda lifting.


* Review of Scope

let x1 = e1,
    x2 = e2,
    x3 = e3,
    ...
in e


def f1(x11,...): e1 and
def f2(x21,...): e2 and
...
def fn(xn1,...): en in
e


let: *nested scope*: each xi is bound in e{i+1} and on and in e. Not bound in ei.

def: *mutually recursive*: each fi is bound in all of the ei and e,
the xji are only bound in ej

In both cases we disallow the x1,... or the f1,... from defining the
same name twice.


But when nesting, we allow shadowing

def f(x,..): e1
in

let z = f(...) in

def f(y,...): e2 in

f(...)

is allowed.


* Why do we need lambda lifting?

def mult(x, y):
  def loop(i):
    if i == 0:
      0
    else:
      x + loop(i - 1)
  in
  loop(y)
in

mult(5, 3)


When we call loop, it needs to have access to at least x in each stack
frame, so whenever loop is called, we need to push not just i, but
also x.


We can model this by a program transformation:

def mult(x, y):
  loop(x, y)
and
def loop(x, i):
 if i == 0:
   0
 else:
   x + loop(x, i - 1)
in

mult(5, 3)


You might say, why didn't we write the original this way, but note
that the programmer didn't have to worry about the fact that x was
getting captured! This is a nice convenience!


General principle:

Turn the program which has various *local* function definitions, which
might *capture* variables in their containing scope, into a program
where all functions are *lifted* to the top level as a bunch of
mutually recursive functions.


I.e., start with

let x = 7 in
def f(a,b,c):
  def g(z):
    def h(i,j,k):
      if i == a:
        g(z,b,q(k)) * 3
      else:
        17
    in
    h(i,j,q(3))) + 4
  and
  def q(r):
    x - r
  in
  g(c * x) - 7
in 
f(x, 14, 7)

and turn it into something flattened:


def f(x, a, b, c):
  g(c * x) - 7
and
def g(a,b,z):
  h(a, i, j, q(x, 3)) + 4
and
def h(a, i, j, k):
  if i == a:
    g(a, z, b, q(x, k)) * 3
  else:
    17
and
def q(x, r):
  x - r
in
f(x, x, 14, 7)

where we have added extra arguments wherever necessary.


* How to Lambda Lift

We can break the lifting process into two stages:
1. Extend each function definition so that it takes all of its
   captured variables as arguments
2. Lift the local function definitions to the top level

I.e., for our example above:


def mult(x, y):
  def loop(i):
    if i == 0:
      0
    else:
      x + loop(i - 1)
  in
  loop(y)
in

mult(5, 3)

becomes first something like

def mult(x, y):
  def loop(x, i):
    if i == 0:
      0
    else:
      x + loop(x, i - 1)
  in
  loop(x, y)
in

mult(5, 3)


and then we lift everything to the top level

def mult(x, y):
  loop(x, y)
and
def loop(x, i):
  if i == 0:
    0
  else:
    x + loop(x, i - 1)
in

mult(5, 3)

(Easy to fuse these if we want to)


** Variable Capture

Take each function, and add as arguments, all the variables it will
need to have in its stack frame at runtime.

Additionally, add those arguments to every function *call*.

How do we determine what variables to add
to each function?

Consider two things:
1. Correctness (most important)
2. Efficiency


-- Class discussion ---

let x = 7 in
def f(a,b,c):
  def g(z):
    def h(i,j,k):
      if i == a:
        g(z,b,q(k))
      else:
        17
    in
    h(i,j,q(3)))
  and
  def q(r):
    x - r
  in
  g(c * x)
in 
f(x, 14, 7)


def mult(x, y):
  def loop(x, y, i):
    if i == 0:
      0
    else:
      x + loop(x, y, i - 1)
  in
  loop(x, y, y)
in

mult(5, 3)


let x1 = ..
    x2 = ...
    x3 = ...
    ...
    x100 = ...
in
def mult(x, y):
  def loop(i):
    if i == 0:
      0
    else:
      x + loop(i - 1)
  in
  loop(y)
in

mult(5, 3)


Here's what I expect to hear:
1. All variables that are currently in scope?
2. All variables that occur syntactically in the body of the function?

Why is (1) correct, but wasteful?
Why is (2) wrong (but when it is correct, more efficient)?


An example of a function where we need to capture a variable that does
*not* syntactically occur in the body


def f(a,b):
    def g():
        b + h()
    and
    def h():
        a
    in
    g()
in
f(0,1)


def f(a,b):
    def g(b):
        b + h(a)
    and
    def h(a):
        a
    in
    g(b)
in
f(0,1)


def f(a,b):
    def g(a, b):
        b + h(a)
    and
    def h(a):
        a
    in
    g(a, b)
in
f(0,1)


-- Example of (1) being *very* wasteful --

let x1 = ...,
    x2 = ...
    x3 = ...
    ...
    x100 = ...
in
def loop(i):
  if i == 0:
    0
  else:
    x1 + loop(i - 1)
end
loop(x2) + x100

Adds 100 variables to every stack frame!!


-- Example of (2) being wrong --


def f(x):
  def g(a):
    a + h(a - 1)
  and
  def h(z):
      if z == 0:
        x
      else:
        g(a)
  in
  g(15)
in
...


How to improve on (2):
Classical dichotomy in compilation:
1. Try to produce the best code at every stage (fixed point algorithm
   for determining what variables are captured)
2. Produce inefficient code but implement good optimizations *later*
   (implement a general purpose "unused argument removal" optimization
   pass)


So we might have a later pass that optimizes 

def mult(x, y):
  loop(x, y, y)
and
def loop(x, y, i):
  if i == 0:
    0
  else:
    x + loop(x, y, i - 1)
in

mult(5, 3)

into what we wrote manually.


The answer (in this class) is almost always (2): for instance, the
programmer might have included unused arguments in their code that you
would want to get repaired anyway. If we are building an optimizing
compiler anyway, let the optimizer do the work.


-- Food for thought: are there settings where 1 would be more practical? --


Ok, so that's how we add in the extra arguments to the function
definitions, but we also need to add the arguments to the function
*calls*.

-- How would you implement this? --


Regardless of how we propagate that information, there is a subtle
bug(!)

-- Challenge: come up with an example program such that the algorithm
      so far produces incorrect code --

????????  Shadowing  ???????


def f(a,b):
    def g():
        let a' = 3 in
        (b * a') + h()
    and
    def h():
        a
    in
    g()
in
f(0,1)


def f(a,b):
    def g():
        let a = 3 in
        (b * a) + h()
    and
    def h():
        a
    in
    g()
in
f(0,1)


def f(a,b):
    def g(a, b):
        let a = 3 in
        (b * a) + h(a)
    and
    def h(a):
        a
    in
    g(a,b)
in
f(0,1)


def f(a,b):
    def g(a,b):
      let a = 42 in
      b + h(a,b)
    and
    def h(a,b):
        a
    in
    g(a,b)
in
f(0,1)


def f(a#0,b#0):
    def g():
      let a#1 = 42 in
      b#0 + h()
    and
    def h():
        a#0
    in
    g()
in
f(0,1)


def f(a#0,b#0):
    def g(a#2,b#2):
      let a#1 = 42 in
      b#2 + h(a#2)
    and
    def h(a#3):
        a#3
    in
    g(a#0,b#0)
in
f(0,1)


Example:

def f(x,y,z):
  def g(a):
    a + x + y
  in
  let x = 12, y = 13 in
  g(z)
in
f(1,2,3)

becomes

def f(x,y,z):
  def g(x, y, a):
    a + x + y
  in
  let x = 12, y = 13 in  -- oh no!!!
  g(x, y, z)
in
f(1,2,3)


** Lifting

The lifting is straightforward to implement once we've done this right?

-- Challenge: come up with an example program that would break if we
simply lift the functions to the top level --


let x = ... def f(a,b,c): e1 in ... ,
    y = ... def f(i,j): e2 in ...
in ...


Shadowing is the culprit again(!)


* Our Savior: Unique Identifiers

Shadowing is completely reasonable for the source language syntax, but
there's 0 benefit to allowing it in our source program. So why don't
we get rid of it? The names of the variable itself doesn't really
matter, we can always re-write the program so that there is no
shadowing. Additionally, for function definitions, we will squash
everything into mutual recursion, so we'll have to make everything
different there.


-- Discussion: at what point should we re-name variables uniquely ? --


How? Of course we can use a tagging pass to give us some unique ids,
then we can just append them to the names.


Algorithm for renaming:

1. Recursively descend into the term, keeping track of a mapping from
   Old variable names to new variable names
2. When you reach a variable *declaration* (def/let)
   - generate a new name for each of the newly introduced variables,
     associate the old name to then new name in the environment in
     sub-expressions where it is bound.

3. When you reach a variable *use* (var/funcall)
   - lookup the old name in the current environment and replace it
     with the newly generated one.


Step through this example on the board:
let x = 5 in
let x = x + 1 in
x


* Updated Compiler Pipeline


Source Program

---[ Parsing ] -->

AST

---[ check_prog ]-->

well-scoped AST

--[ unique names ]-->

well-scoped with unique names

--[ lambda lift ]-->

(AST, FunDefs)

--[ sequentialize ]-->

(SeqAST, SeqFunDefs)

--[ codegen ]-->

ASM


* Downsides of Lambda Lifting

def mult(x, y):
  def loop(i, acc):
    if i == 0:
      acc
    else:
      loop(i - 1, y + acc)
  in
  loop(x, 0)
in

mult(5, 3)


becomes


def loop(y, i, acc):
  if i == 0:
    acc
  else:
    loop(y, i - 1, y + acc)
and
def mult(x,y):
  loop(y, x, 0)
in

mult(5, 3)


We have *lost* information here:
1. In the original program mult and loop share the variable y.
   And so the variable will be stored in the same location throughout.

2. In the lifted program, mult and loop have unrelated argument
   lists. y happens to be the first argument to loop and the second
   argument to mult.

   - in our simple stack-based allocation, this means they will
     *certainly* be assigned different locations, so we will mov y on
     the stack

   - if we do a more complex register-based allocation scheme we can
     figure out using a program analysis that the two variables *can*
     be stored in the same place, but it will be hard in general to
     ensure that they will.

Compromise:

1. Lambda lifting only for *true non-tail calls*
2. Tail called functions stay nested