Lecture 6: Conditionals Part 1: Branching and Logical Operations
1 Growing the language: adding conditionals and logical operations
1.1 The new concrete syntax
1.2 Examples and semantics
1.3 The new abstract syntax
2 Conditional Control Flow in Assembly
2.1 Comparisons and jumps
3 Intermediate representation for conditionals
3.1 From (sub-)blocks to x86
3.2 From If to Branching
4 Join Points

Lecture 6: Conditionals Part 1: Branching and Logical Operations

Our compiler so far can handle basic arithmetic operations on numbers as well as handle let-bound identifiers. This is completely straight-line code; there are no decisions to make that would affect code execution. We need to support conditionals to incorporate such choices.

1 Growing the language: adding conditionals and logical operations

Reminder: Every time we enhance our source language, we need to consider several things:

  1. Its impact on the concrete syntax of the language

  2. Examples using the new enhancements, so we build intuition of them

  3. Its impact on the abstract syntax and semantics of the language

  4. Any needed changes to our intermediate representation

  5. Any new or changed transformations needed to process the new forms

  6. Executable tests to confirm the enhancement works as intended

1.1 The new concrete syntax

‹expr›: ... | if ‹expr› : ‹expr› else: ‹expr›

1.2 Examples and semantics

Currently our language includes only integers as its values. We’ll therefore define conditionals to match C’s behavior: if the condition evaluates to a nonzero value, the then-branch will execute, and if the condition evaluates to zero, the else-branch will execute. It is never the case that both branches should execute.

Concrete Syntax            Answer

if 5: 6 else: 7            6
if 0: 6 else: 7            7
if sub1(1): 6 else: 7      7

Unlike C, but as in Rust, if-expressions are indeed expressions: they evaluate to a value, which means they can be composed freely with the other expression forms in our language. For instance we can form complex expressions with an if as a sub-expression such as

(if x: 6 else: 8) + (if y: x else: 3)

This makes our if expression analogous to C’s "ternary conditional operator": in C, the first example would be written as 5 ? 6 : 7. In Rust, it would be written as if 5 != 0 { 6 } else { 7 }. Since our if is an expression, we always include an else branch.

Do Now!

Construct larger examples, combining if-expressions with each other or with let-bindings, and show their evaluation.

1.3 The new abstract syntax

enum Exp {
  ...
  If { cond: Box<Exp>, thn: Box<Exp>, els: Box<Exp> }
}

Do Now!

Extend your interpreter from the prior lecture to include conditionals. As with last lecture, suppose we added a print expression to the language — what care must be taken to get the correct semantics?

We should ensure that our programs only evaluate one side of an if expression. But how would we test this? We need to have a test where we can tell whether some code has executed or not. This would work if we had printing or say infinite loops.

if x:
  print(1)
else:
  print(0)

or

let x = 1 in
if x:
  7
else:
  infinite-loop

There’s something a bit unsatisfying about interpreting if in our language by using if in Rust: it feels like a coincidence that our semantics and Rust’s semantics agree, and it doesn’t convey much understanding of how conditionals like if actually work...
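For reference, here is a minimal sketch of the kind of interpreter this paragraph has in mind. Only the If variant comes from the abstract syntax above; the Num, Var, Sub1 and Let variants, the interp name, and the use of a HashMap environment are assumptions made for illustration, not the course's actual interpreter.

```rust
use std::collections::HashMap;

// A cut-down `Exp`. Only `If` is taken from the lecture; the other
// variants are assumed here so the sketch is self-contained.
#[derive(Debug, Clone)]
pub enum Exp {
    Num(i64),
    Var(String),
    Sub1(Box<Exp>),
    Let { name: String, bound: Box<Exp>, body: Box<Exp> },
    If { cond: Box<Exp>, thn: Box<Exp>, els: Box<Exp> },
}

// Evaluate `e` in environment `env`. An unbound variable is reported as
// an error when (and only when) it is actually evaluated.
pub fn interp(e: &Exp, env: &HashMap<String, i64>) -> Result<i64, String> {
    match e {
        Exp::Num(n) => Ok(*n),
        Exp::Var(x) => env
            .get(x)
            .copied()
            .ok_or_else(|| format!("unbound variable {}", x)),
        Exp::Sub1(e1) => Ok(interp(e1, env)?.wrapping_sub(1)),
        Exp::Let { name, bound, body } => {
            let v = interp(bound, env)?;
            let mut extended = env.clone();
            extended.insert(name.clone(), v);
            interp(body, &extended)
        }
        // The object-language `if` is implemented directly by Rust's `if`:
        // nonzero selects the then-branch, zero selects the else-branch,
        // and the branch that is not selected is never evaluated.
        Exp::If { cond, thn, els } => {
            if interp(cond, env)? != 0 {
                interp(thn, env)
            } else {
                interp(els, env)
            }
        }
    }
}
```

In this sketch, evaluating an unbound variable plays the role of an observable effect: if 5: 6 else: y evaluates to 6 because the else-branch is never run, whereas if 0: 6 else: y reports an error.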

Scope

def main(x):
  if 0:
    y
  else:
    x

2 Conditional Control Flow in Assembly

2.1 Comparisons and jumps

To understand how to compile conditionals, we first need to understand how we can implement conditional execution in x86. So far, x86 execution has always gone sequentially from one instruction to the next. Concretely, in x86 what instruction to run is determined by a special register called the instruction pointer, whose name is, somewhat morbidly, RIP. RIP stores a 64-bit value like the general purpose registers, but that’s where the similarities end. The value of RIP is interpreted as an address, and at each step of execution, the x86 abstract machine interprets the memory at RIP as the binary encoding of the instruction to execute. Then most of the instructions we’ve used so far implicitly increment RIP so that after executing the instruction, RIP then points to the next instruction in memory. This is what produces the sequential behavior we have seen so far.

TODO: show some assembly code in memory

There are several instructions in x86 that manipulate RIP in more interesting ways. The first is the "jump" instruction jmp addr, which simply sets RIP to addr directly. addr here could be a register or a label, an x86 abstraction of an address that will be determined by the assembler. We’ve seen only one label so far, namely our_code_starts_here, but we can freely add more labels to our program to indicate targets of jumps.

To compile conditionals, we want something a little more complicated: we want to be able to choose, based on some dynamically determined information, whether to set RIP to the address of the start of the then-branch or the start of the else-branch. x86 has a large group of instructions that do just that, called conditional jump instructions. They have the form jcc loc, where cc is one of many different condition codes that say what condition to check, and loc is a memory address. If the condition is satisfied, the instruction pointer is set to loc; otherwise, the instruction pointer is incremented and the sequentially next instruction in memory is executed, as with a typical instruction.

The condition codes themselves are interpreted using yet another special register, RFLAGS. This register consists of many single-bit flags such as SF (sign flag), OF (overflow flag), ZF (zero flag), etc. As with RIP, most x86 instructions implicitly manipulate these flags. For instance, the zero flag ZF is set to 1 if the result of an arithmetic operation is 0 and to 0 otherwise, SF is set to 1 if the result of the operation is negative and to 0 otherwise, and OF is set to 1 if the arithmetic operation overflowed and to 0 otherwise. The various condition codes then each check for some specific combination of flag settings. For instance, the condition code z checks whether the zero flag is set.

Other conditions are more complex. For instance, the condition code l for "less than" means that the overflow flag is set xor the sign flag is set. This results in the correct condition for x < y if the flags have been set as in the instruction sub x, y.

Exercise

Why is x < y if and only if x - y is negative xor overflows?
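The claim in this exercise can at least be checked empirically before trying to prove it. The sketch below models only the three flags discussed above for a 64-bit cmp x, y; the Flags struct and the function names are invented for this illustration, and Rust's overflowing_sub stands in for the processor's signed-overflow detection.

```rust
// The three flags discussed above, as they would be set by `cmp x, y`,
// which computes `x - y` and discards the result.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Flags {
    pub zero: bool,     // ZF: the result is 0
    pub sign: bool,     // SF: the (wrapped) result is negative
    pub overflow: bool, // OF: the signed subtraction overflowed
}

pub fn cmp_flags(x: i64, y: i64) -> Flags {
    // `overflowing_sub` returns the wrapped result and whether signed
    // overflow occurred.
    let (result, overflowed) = x.overflowing_sub(y);
    Flags {
        zero: result == 0,
        sign: result < 0,
        overflow: overflowed,
    }
}

// Condition code `l`: the sign flag xor the overflow flag.
pub fn cond_l(f: Flags) -> bool {
    f.sign != f.overflow
}

// Condition code `z` (equivalently `e`): the zero flag is set.
pub fn cond_z(f: Flags) -> bool {
    f.zero
}
```

For example, comparing i64::MIN with 1 overflows and wraps around to a positive result, so SF is 0 and OF is 1; the xor is still true, matching i64::MIN < 1.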

The most common way to set the condition codes is with the cmp arg1, arg2 instruction, which sets the flags in the same way as a sub operation, but without changing the value of arg1. The following conditional jumps make the most sense after executing such a cmp:

Instruction     Jump if ...

je LABEL        ... the two compared values are equal
jne LABEL       ... the two compared values are not equal
jl LABEL        ... the first value is less than the second
jle LABEL       ... the first value is less than or equal to the second
jg LABEL        ... the first value is greater than the second
jge LABEL       ... the first value is greater than or equal to the second
jb LABEL        ... the first value is less than the second, when treated as unsigned
jbe LABEL       ... the first value is less than or equal to the second, when treated as unsigned

Some conditional jumps instead make more sense directly after an arithmetic operation:

Instruction     Jump if ...

jz LABEL        ... the last arithmetic result is zero
jnz LABEL       ... the last arithmetic result is non-zero
jo LABEL        ... the last arithmetic result overflowed
jno LABEL       ... the last arithmetic result did not overflow

Do Now!

Consider the examples of if-expressions above. Translate them manually to assembly.

def main(x):
  if sub1(x):
    6
  else:
    7

entry:
  mov rax, rdi
  sub rax, 1
  cmp rax, 0
  jne thn
els:
  mov rax, 7
  ret
thn:
  mov rax, 6
  ret

Let’s examine the last example above: if sub1(1): 6 else: 7. Which of the following could be valid translations of this expression?

  ;; version 1
  mov RAX, 1
  sub RAX, 1
  cmp RAX, 0
  je if_false
if_true:
  mov RAX, 6
  jmp done
if_false:
  mov RAX, 7
done:

  ;; version 2
  mov RAX, 1
  sub RAX, 1
  cmp RAX, 0
  je if_false
if_true:
  mov RAX, 6

if_false:
  mov RAX, 7
done:

  ;; version 3
  mov RAX, 1
  sub RAX, 1
  cmp RAX, 0
  jne if_true
if_true:
  mov RAX, 6
  jmp done
if_false:
  mov RAX, 7
done:

  ;; version 4
  mov RAX, 1
  sub RAX, 1
  cmp RAX, 0
  jne if_true
if_false:
  mov RAX, 7
  jmp done
if_true:
  mov RAX, 6
done:

The first two follow the structure of the original expression most closely, but the second has a fatal flaw: once the then-branch finishes executing, control falls through into the else-branch when it shouldn’t. The third version flips the condition and the target of the jump, but tracing carefully through it reveals there is no way for control to reach the else-branch. Likewise, tracing carefully through the first and last versions reveals that they could both be valid translations of the original expression.

Working through these examples should give a reasonable intuition for how to compile if-expressions more generally: we compile the condition and check whether it is zero; if so, we jump to the else-branch, and otherwise we fall through to the then-branch. Both branches are then compiled as normal. The then-branch, however, needs an unconditional jump to the instruction just after the end of the else-branch, so that execution dodges the unwanted branch.

Do Now!

Work through the initial examples, and the examples you created earlier. Does this strategy work for all of them?

Let’s try this strategy on a few examples. For clarity, we repeat the previous example below, so that the formatting is more apparent.

Original expression:

if sub1(1):
  6
else:
  7

Compiled assembly:

  mov RAX, 1
  sub RAX, 1
  cmp RAX, 0
  je if_false
if_true:
  mov RAX, 6
  jmp done
if_false:
  mov RAX, 7
done:

Original expression:

if 10:
  2
else:
  sub1(0)

Compiled assembly:

  mov RAX, 10
  cmp RAX, 0
  je if_false
if_true:
  mov RAX, 2
  jmp done
if_false:
  mov RAX, 0
  sub RAX, 1
done:

Original expression:

let x = if 10:
          2
        else:
          0
in
if x:
  55
else:
  999

Compiled assembly:

  mov RAX, 10
  cmp RAX, 0
  je if_false
if_true:
  mov RAX, 2
  jmp done
if_false:
  mov RAX, 0
done:
  mov [RSP-8], RAX
  mov RAX, [RSP-8]
  cmp RAX, 0
  je if_false
if_true:
  mov RAX, 55
  jmp done
if_false:
  mov RAX, 999
done:

The last example is broken: the various labels used in the two if-expressions are duplicated, which leads to illegal assembly:

$ nasm -f elf64 -o output/test1.o output/test1.s
output/test1.s:20: error: symbol `if_true' redefined
output/test1.s:23: error: symbol `if_false' redefined
output/test1.s:25: error: symbol `done' redefined

We need to generate unique labels for each expression.
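One possible way to get unique labels is to thread a counter through code generation and append its value to every label name. The sketch below is an illustration under that assumption rather than the approach the course ends up using: LabelGen, fresh, compile, and the cut-down Exp (with only Num, Sub1 and If) are all hypothetical names, and let-bindings are left out to keep it short.

```rust
// A cut-down expression type; only `If` matches the lecture's AST exactly.
pub enum Exp {
    Num(i64),
    Sub1(Box<Exp>),
    If { cond: Box<Exp>, thn: Box<Exp>, els: Box<Exp> },
}

// Hands out label names that never repeat by appending a running counter.
pub struct LabelGen {
    next: u32,
}

impl LabelGen {
    pub fn new() -> Self {
        LabelGen { next: 0 }
    }

    pub fn fresh(&mut self, base: &str) -> String {
        let label = format!("{}_{}", base, self.next);
        self.next += 1;
        label
    }
}

// Append the assembly for `e` to `out`, leaving the result in RAX.
pub fn compile(e: &Exp, labels: &mut LabelGen, out: &mut Vec<String>) {
    match e {
        Exp::Num(n) => out.push(format!("  mov RAX, {}", n)),
        Exp::Sub1(e1) => {
            compile(e1, labels, out);
            out.push("  sub RAX, 1".to_string());
        }
        Exp::If { cond, thn, els } => {
            // Fresh labels for this particular if-expression.
            let if_false = labels.fresh("if_false");
            let done = labels.fresh("done");
            compile(cond, labels, out);
            out.push("  cmp RAX, 0".to_string());
            out.push(format!("  je {}", if_false));
            // Fall through into the then-branch, then skip the else-branch.
            compile(thn, labels, out);
            out.push(format!("  jmp {}", done));
            out.push(format!("{}:", if_false));
            compile(els, labels, out);
            out.push(format!("{}:", done));
        }
    }
}
```

Because every if-expression asks the generator for its own labels, nesting one if inside the condition of another no longer produces redefined symbols. The if_true label is dropped here since nothing jumps to it.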

TODO: rewrite this based on the approach we end up using.

3 Intermediate representation for conditionals

To compile our conditionals to x86 conditional jumps and blocks, we enrich our intermediate representation with blocks and conditional branching, but without all the complexity of the RIP and RFLAGS registers. Recall that our current IR consists of a single block: a sequence of operations that each assign to a variable, ending in a return of one of those variables. We extend this with two new constructors for blocks:

  • We add a form for defining a new named block

  • We add a form for conditional branching where the targets of the branch are previously declared blocks

We change the abstract syntax as follows:

pub struct Program {
    pub param: VarName,
    pub entry: BlockBody,
}
pub enum BlockBody {
    Terminator(Terminator),
    Operation {
        dest: VarName,
        op: Operation,
        next: Box<BlockBody>,
    },
    SubBlock {
        block: BasicBlock,
        next: Box<BlockBody>,
    },
}
pub enum Terminator {
    Return(Immediate),
    ConditionalBranch {
        cond: Immediate,
        thn: Label,
        els: Label,
    },
}
pub enum Operation {
    Immediate(Immediate),
    Prim(Prim, Immediate, Immediate),
}
pub struct BasicBlock {
    pub label: Label,
    pub body: BlockBody,
}

TODO: Rust

Whereas before every basic block ended in a return, now a block may also end in a conditional branch (cbr). We group these forms together into what we call a terminator. So now every block consists of a sequence of declarations ending in a terminator.

We will use the following textual format for these:

TODO: example of sub-blocks.

entry(x):
  thn:
    ret 6
  els:
    ret 7
  sub1_arg = x
  cond = sub sub1_arg 1
  cbr cond thn els

The semantics of this form is a simplified version of x86 control flow. The declaration of a sub-block doesn’t have any observable side effect; it’s simply there as a declaration providing a name for the block so that we have the ability to branch to it later. As with our variable names, we should ensure that the names we use for blocks are unique so there is no confusion during code generation. The semantics of cbr x l1 l2 is analogous to our if expression: if x is non-zero, we start executing the l1 block, and otherwise we start executing the l2 block.

3.1 From (sub-)blocks to x86

We can translate our new IR forms to assembly by turning each named SSA block into a corresponding region of x86 code with a label corresponding to the declared block name.

The main difference between the IR blocks and labeled assembly code blocks is that our IR blocks are nested within each other:

TODO: example nested block
And we need to ensure when we generate the assembly instructions not to naively put them in the same order:
TODO
But instead to produce the code for the sub-blocks either before the label of the current block or after the compiled code for the terminator of the block:
TODO

Then to compile a conditional branch cbr x l1 l2, we need to check whether x is non-zero and branch accordingly. For this we can use the cmp instruction on x and 0 to set the RFLAGS register, and then check the e condition code:

cbr x l1 l2

mov rax, 0
cmp rax, [rsp - offset(x)] ;; compare 0 to the stored value of x
je l2
jmp l1

Of course, we also need to incorporate our new SSA forms into the existing analyses in our translation. Specifically, we need to extend our assignment of variables to work with sub-blocks.

Do Now!

Where should we store variables that are declared in a sub-block?

entry:
        mov [rsp + -8], rdi
        mov rax, [rsp + -8]
        mov [rsp + -16], rax
        mov rax, [rsp + -16]
        mov r10, 1
        sub rax, r10
        mov [rsp + -24], rax
        mov rax, [rsp + -24]
        cmp rax, 0
        jne thn#0
        jmp els#1
thn#0:
        mov rax, 6
        ret
els#1:
        mov rax, 7
        ret
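A listing similar to the one above can be produced by a small emitter that defers the code for sub-blocks until after the terminator of the enclosing block. The sketch below makes several simplifying assumptions that are not from the lecture: the IR is cut down to a hypothetical Body type with subtraction as the only operation, every variable (including those declared in sub-blocks) simply gets its own 8-byte stack slot, and the names Emitter, compile_main and example are invented.

```rust
use std::collections::HashMap;

// A cut-down IR, assumed for this sketch.
pub enum Imm { Const(i64), Var(String) }
pub enum Body {
    Ret(Imm),
    Cbr { cond: Imm, thn: String, els: String },
    // dest = lhs - rhs
    Sub { dest: String, lhs: Imm, rhs: Imm, next: Box<Body> },
    SubBlock { label: String, body: Box<Body>, next: Box<Body> },
}

#[derive(Default)]
pub struct Emitter {
    offsets: HashMap<String, usize>,
    pub out: Vec<String>,
}

impl Emitter {
    // Stack slot of `x`, allocating the next free 8-byte slot on first use.
    fn slot(&mut self, x: &str) -> String {
        let fresh = 8 * (self.offsets.len() + 1);
        let off = *self.offsets.entry(x.to_string()).or_insert(fresh);
        format!("[rsp - {}]", off)
    }

    fn operand(&mut self, i: &Imm) -> String {
        match i {
            Imm::Const(n) => n.to_string(),
            Imm::Var(x) => self.slot(x),
        }
    }

    fn ins(&mut self, s: String) {
        self.out.push(format!("  {}", s));
    }

    fn body(&mut self, mut body: &Body) {
        // Sub-blocks are only remembered here; their code is emitted
        // after the terminator so it does not interrupt this block.
        let mut deferred = Vec::new();
        loop {
            match body {
                Body::SubBlock { label, body: sub, next } => {
                    deferred.push((label, sub));
                    body = &**next;
                }
                Body::Sub { dest, lhs, rhs, next } => {
                    let (l, r, d) = (self.operand(lhs), self.operand(rhs), self.slot(dest));
                    self.ins(format!("mov rax, {}", l));
                    self.ins(format!("mov r10, {}", r));
                    self.ins("sub rax, r10".to_string());
                    self.ins(format!("mov {}, rax", d));
                    body = &**next;
                }
                Body::Ret(i) => {
                    let v = self.operand(i);
                    self.ins(format!("mov rax, {}", v));
                    self.ins("ret".to_string());
                    break;
                }
                Body::Cbr { cond, thn, els } => {
                    let c = self.operand(cond);
                    self.ins(format!("mov rax, {}", c));
                    self.ins("cmp rax, 0".to_string());
                    self.ins(format!("je {}", els));
                    self.ins(format!("jmp {}", thn));
                    break;
                }
            }
        }
        for (label, sub) in deferred {
            self.out.push(format!("{}:", label));
            self.body(sub);
        }
    }
}

// Compile a one-parameter program whose argument arrives in rdi.
pub fn compile_main(param: &str, entry: &Body) -> Vec<String> {
    let mut e = Emitter::default();
    e.out.push("entry:".to_string());
    let p = e.slot(param);
    e.ins(format!("mov {}, rdi", p));
    e.body(entry);
    e.out
}

// entry(x): thn: ret 6 / els: ret 7 / cond = sub x 1 / cbr cond thn els
pub fn example() -> Body {
    let ret = |n| Box::new(Body::Ret(Imm::Const(n)));
    Body::SubBlock {
        label: "thn".to_string(),
        body: ret(6),
        next: Box::new(Body::SubBlock {
            label: "els".to_string(),
            body: ret(7),
            next: Box::new(Body::Sub {
                dest: "cond".to_string(),
                lhs: Imm::Var("x".to_string()),
                rhs: Imm::Const(1),
                next: Box::new(Body::Cbr {
                    cond: Imm::Var("cond".to_string()),
                    thn: "thn".to_string(),
                    els: "els".to_string(),
                }),
            }),
        }),
    }
}
```

Giving every variable its own slot is the simplest policy that is safe when sub-blocks refer to variables of their enclosing block; it is one possible answer to the question above, not necessarily the one used in the course.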

if cond:
  thn
else:
  els

thn%uid:
  ... thn code
els%uid':
  ... els code
... cond code
cond_result%uid'' = ...
cbr cond_result%uid'' thn%uid els%uid'

We also need to account for the continuation for the current result.

thn%uid:
  ... thn code
  ... continuation code
els%uid':
  ... els code
  ... continuation code
... cond code
cond_result%uid'' = ...
cbr cond_result%uid'' thn%uid els%uid'

def main(y):
  let x = (if y: 5 else: 6) in
  x * x

entry(y%5):
  thn%0:
    x%2 = 5
    res%3 = x%2 * x%2
    ret res%3
  els%1:
    x%4 = 6
    res%3 = x%4 * x%4
    ret res%3
  cbr y%5 thn%0 els%1

3.2 From If to Branching

How do we compile our if expressions to branches?

Schematically, we want



Looking at a basic example like ... TODO: example where the if is in tail position ... we simply push the return into the blocks of the if.

But recall that in producing the intermediate code, we also need to flatten the code, and for this we are given a continuation as an extra argument.

Exercise

In what cases does this compilation strategy go horribly wrong?

4 Join Points

If we copy the code for the continuation, we have a problem: each time a continuation for a conditional is used, its code is produced twice in the output. But then that code may itself be used in a continuation. For example:

def main(y):
  let x = if y: 5 else: 6 in
  let x = if y: x else: add1(x) in
  let x = if y: x else: add1(x) in
  x * x

In our continuation-copying scheme, the continuation containing the x * x computation is copied in the third if, meaning it is used twice. Then that code constitutes the continuation for the second if, so it is copied again, meaning the x * x is now included 4 times. This process repeats and now the code for x * x is included 8 times. This produces some very large SSA even for this simple program:

entry(y%0):
  thn#4():
    x%1 = 5
    thn#2():
      x%2 = x%1
      thn#0():
        x%3 = x%2
        *_0%4 = x%3
        *_1%5 = x%3
        result%6 = *_0%4 * *_1%5
        ret result%6
      els#1():
        add1_0%8 = x%2
        x%3 = add1_0%8 + 1
        *_0%4 = x%3
        *_1%5 = x%3
        result%6 = *_0%4 * *_1%5
        ret result%6
      cond%7 = y%0
      cbr cond%7 thn#0 els#1
    els#3():
      add1_0%10 = x%1
      x%2 = add1_0%10 + 1
      thn#0():
        x%3 = x%2
        *_0%4 = x%3
        *_1%5 = x%3
        result%6 = *_0%4 * *_1%5
        ret result%6
      els#1():
        add1_0%8 = x%2
        x%3 = add1_0%8 + 1
        *_0%4 = x%3
        *_1%5 = x%3
        result%6 = *_0%4 * *_1%5
        ret result%6
      cond%7 = y%0
      cbr cond%7 thn#0 els#1
    cond%9 = y%0
    cbr cond%9 thn#2 els#3
  els#5():
    x%1 = 6
    thn#2():
      x%2 = x%1
      thn#0():
        x%3 = x%2
        *_0%4 = x%3
        *_1%5 = x%3
        result%6 = *_0%4 * *_1%5
        ret result%6
      els#1():
        add1_0%8 = x%2
        x%3 = add1_0%8 + 1
        *_0%4 = x%3
        *_1%5 = x%3
        result%6 = *_0%4 * *_1%5
        ret result%6
      cond%7 = y%0
      cbr cond%7 thn#0 els#1
    els#3():
      add1_0%10 = x%1
      x%2 = add1_0%10 + 1
      thn#0():
        x%3 = x%2
        *_0%4 = x%3
        *_1%5 = x%3
        result%6 = *_0%4 * *_1%5
        ret result%6
      els#1():
        add1_0%8 = x%2
        x%3 = add1_0%8 + 1
        *_0%4 = x%3
        *_1%5 = x%3
        result%6 = *_0%4 * *_1%5
        ret result%6
      cond%7 = y%0
      cbr cond%7 thn#0 els#1
    cond%9 = y%0
    cbr cond%9 thn#2 els#3
  cond%11 = y%0
  cbr cond%11 thn#4 els#5

And all of this for a program with just 3 sequential conditionals!
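The doubling can be written as a one-line recurrence: with no conditionals in front of it the final continuation appears once, and each additional conditional in front copies everything after it into both branches. The function name below is made up for this illustration.

```rust
// Number of copies of the final continuation (here, the `x * x` code)
// produced by the continuation-copying scheme when it is preceded by
// `n` sequential conditionals.
pub fn continuation_copies(n: u32) -> u64 {
    if n == 0 {
        1
    } else {
        // One copy of everything that follows for each of the two branches.
        2 * continuation_copies(n - 1)
    }
}
```

For the program above, continuation_copies(3) is 8, which matches the eight ret result%6 lines in the listing. Ten sequential conditionals would already give 1024 copies.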

How can we address this? The compilation strategy itself is correct, because the same continuation needs to be used by the two branches; the problem is only that it copies the code. So what we need to do is to share the continuation without copying the instructions themselves. We can do this by making a new block for the continuation and then having each branch of the conditional branch to this new block. In assembly code we might implement this as

entry:
        cmp rdi, 0
        jne thn#0
        jmp els#1
thn#0:
        mov rax, 5
        jmp jn#2
els#1:
        mov rax, 6
        jmp jn#2
jn#2:
        imul rax, rax
        ret

How could we implement this in our SSA intermediate representation? We should clearly make this join point a block, and now we want to do an unconditional branch to the join point, which is easy enough to add to our IR:

entry(y%0):
  jn#2:
    ...?
    result%4 = x%1 * x%1
    ret result%4
  thn#0:
    thn_res%6 = 5
    ...?
    br jn#2
  els#1:
    els_res%7 = 6
    ...?
    br jn#2
  cond%5 = y%0
  cbr cond%5 thn#0 els#1

entry(y%0):
  jn#2:
    result%4 = x%1 * x%1
    ret result%4
  thn#0:
    x%1 = 5
    br jn#2
  els#1:
    x%1 = 6
    br jn#2
  cond%5 = y%0
  cbr cond%5 thn#0 els#1

entry(y%0):
  jn#2:
    x%1 = ϕ(thn_res%6, els_res%7)
    result%4 = x%1 * x%1
    ret result%4
  thn#0:
    thn_res%6 = 5
    br jn#2
  els#1:
    els_res%7 = 6
    br jn#2
  cond%5 = y%0
  cbr cond%5 thn#0 els#1

entry(y%0):
  jn#2(x%1):
    result%4 = x%1 * x%1
    ret result%4
  thn#0():
    br jn#2(5)
  els#1():
    br jn#2(6)
  cond%5 = y%0
  cbr cond%5 thn#0() els#1()

ϕony ϕunctions: ϕ nodes

l(x1,x2,x3):
  ...
br l(imm1,imm2,imm3)

mov rax, imm1
mov [rsp - offset(x1)], rax
mov rax, imm2
mov [rsp - offset(x2)], rax
mov rax, imm3
mov [rsp - offset(x3)], rax
jmp l
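The translation of a branch with arguments shown above can be sketched as a small emitter. The Arg type, the compile_br name, and the representation of block parameters by their stack offsets are assumptions made for this illustration.

```rust
// An argument to a branch: a constant, or a variable identified by its
// stack offset (an assumed representation).
pub enum Arg {
    Const(i64),
    Slot(usize),
}

// Emit code for `br target(args...)`, where `param_offsets` are the stack
// offsets assigned to the parameters of the target block.
pub fn compile_br(target: &str, param_offsets: &[usize], args: &[Arg]) -> Vec<String> {
    assert_eq!(param_offsets.len(), args.len(), "arity mismatch in branch");
    let mut out = Vec::new();
    for (off, arg) in param_offsets.iter().zip(args) {
        let src = match arg {
            Arg::Const(n) => n.to_string(),
            Arg::Slot(o) => format!("[rsp - {}]", o),
        };
        // Go through rax, since x86 cannot move memory to memory directly.
        out.push(format!("mov rax, {}", src));
        out.push(format!("mov [rsp - {}], rax", off));
    }
    out.push(format!("jmp {}", target));
    out
}
```

For instance, compile_br("l", &[8, 16], &[Arg::Const(5), Arg::Slot(24)]) stores 5 into the first parameter's slot, copies the variable at offset 24 into the second, and then jumps to l.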

l(x1,x2,x3):
  ...
x1 = imm1
x2 = imm2
x3 = imm3
br l

cbr x l1 l2

mov rax, [rsp - offset(x)]
cmp rax, 0
jne l1
jmp l2

l1(v1,v2):
...
l2(w):
...
cbr x l1(y1,y2) l2(z)

mov rax, [rsp - offset(x)]
cmp rax, 0
mov rax, [rsp - offset(y1)]
mov [rsp - offset(v1)], rax
mov rax, [rsp - offset(y2)]
mov [rsp - offset(v2)], rax
jne l1
mov rax, [rsp - offset(z)]
mov [rsp - offset(w)], rax
jmp l2

Instead implement as a source-to-source transformation

l1(v1,v2):
...
l2(w):
...
l1b():
  br l1(y1,y2)
l2b():
  br l2(z)
cbr x l1b l2b

jn%uid''(x): ; continuation parameter
  ... continuation code
thn%uid:
  ... thn code
  br jn%uid''(thn_res)
els%uid':
  ... els code
  br jn%uid''(els_res)
... cond code
cond_result%uid'' = ...
cbr cond_result%uid'' thn%uid els%uid'
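The schema above can be turned into a small lowering function. The sketch below is one way to do it under simplifying assumptions, not the course's implementation: the continuation is represented as already-lowered lines of textual IR that use a destination variable, source variable names are assumed to be unique already, multiplication takes two variable names, and Lowerer, lower and lower_program are invented names.

```rust
// A small source language, assumed for this sketch.
pub enum Exp {
    Num(i64),
    Var(String),
    Mul(String, String),
    Let { name: String, bound: Box<Exp>, body: Box<Exp> },
    If { cond: Box<Exp>, thn: Box<Exp>, els: Box<Exp> },
}

pub struct Lowerer {
    next: u32,
}

fn prepend(line: String, mut rest: Vec<String>) -> Vec<String> {
    rest.insert(0, line);
    rest
}

fn indent(lines: Vec<String>) -> Vec<String> {
    lines.into_iter().map(|l| format!("  {}", l)).collect()
}

impl Lowerer {
    fn fresh_label(&mut self, base: &str) -> String {
        let s = format!("{}#{}", base, self.next);
        self.next += 1;
        s
    }

    fn fresh_var(&mut self, base: &str) -> String {
        let s = format!("{}%{}", base, self.next);
        self.next += 1;
        s
    }

    // Lower `e` so that its value ends up in `dest`, followed by the
    // continuation `k`, which is already-lowered code that may use `dest`.
    pub fn lower(&mut self, e: &Exp, dest: &str, k: Vec<String>) -> Vec<String> {
        match e {
            Exp::Num(n) => prepend(format!("{} = {}", dest, n), k),
            Exp::Var(x) => prepend(format!("{} = {}", dest, x), k),
            Exp::Mul(a, b) => prepend(format!("{} = {} * {}", dest, a, b), k),
            Exp::Let { name, bound, body } => {
                let body_code = self.lower(body, dest, k);
                self.lower(bound, name, body_code)
            }
            Exp::If { cond, thn, els } => {
                let (jn, t, f) = (
                    self.fresh_label("jn"),
                    self.fresh_label("thn"),
                    self.fresh_label("els"),
                );
                let (tr, fr, c) = (
                    self.fresh_var("thn_res"),
                    self.fresh_var("els_res"),
                    self.fresh_var("cond"),
                );
                // The continuation is emitted exactly once, as the body of
                // the join point, with `dest` as the block parameter.
                let mut out = vec![format!("{}({}):", jn, dest)];
                out.extend(indent(k));
                out.push(format!("{}():", t));
                out.extend(indent(self.lower(thn, &tr, vec![format!("br {}({})", jn, tr)])));
                out.push(format!("{}():", f));
                out.extend(indent(self.lower(els, &fr, vec![format!("br {}({})", jn, fr)])));
                out.extend(self.lower(cond, &c, vec![format!("cbr {} {}() {}()", c, t, f)]));
                out
            }
        }
    }
}

pub fn lower_program(e: &Exp) -> Vec<String> {
    let mut l = Lowerer { next: 0 };
    l.lower(e, "result", vec!["ret result".to_string()])
}
```

On let x = (if y: 5 else: 6) in x * x this produces a jn block taking x as its parameter and containing the single multiplication, with both branches ending in a br to it. Chaining three such conditionals still yields only one multiplication, rather than eight.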