Assignment 5: Diamondback:   Defining functions
1 The Diamondback Language
1.1 Concrete Syntax
1.2 Abstract Syntax
1.3 Semantics
1.4 Implementation
1.5 Tail calls
1.6 Extra credit:   Printing stack traces
1.7 Testing
2 Running main
3 List of Deliverables
4 Grading Standards
5 Submission

Assignment 5: Diamondback: Defining functions

Due: Tue 10/26 at 8:59pm

git clone

In this assignment, you’ll implement a compiler for a small language with functions declarations and function calls, that can communicate with functions written in Rust. You’ll also add some static well-formedness checks to the compiler.

As for the project acronym, so far our greatest minds have come up with

Click here for scary snake picture!

1 The Diamondback Language

As usual, we have concrete and abstract syntaxes, along with a specification of semantics.

1.1 Concrete Syntax

The major addition to Diamondback are function declarations. Our programs are now a sequence of zero or more function declarations, followed by a single main expression.

‹program›: | ‹decls› ‹expr› | ‹expr› ‹binop-expr›: | IDENTIFIER | NUMBER | true | false | ! ‹binop-expr› | ‹prim1› ( ‹expr› ) | ‹expr› ‹prim2› ‹expr› | IDENTIFIER ( ‹exprs› ) | IDENTIFIER ( ) | ( ‹expr› ) ‹prim1›: | add1 | sub1 | print | printStack | isbool | isnum ‹prim2›: | + | - | * | < | > | <= | >= | == | && | || ‹decls›: | ‹decl› | ‹decl› ‹decls› ‹decl›: | def IDENTIFIER ( ‹ids› ) : ‹expr› end | def IDENTIFIER ( ) : ‹expr› end ‹ids›: | IDENTIFIER | IDENTIFIER , ‹ids› ‹expr›: | let ‹bindings› in ‹expr› | if ‹expr› : ‹expr› else: ‹expr› | ‹binop-expr› ‹exprs›: | ‹expr› | ‹expr› , ‹exprs› ‹bindings›: | IDENTIFIER = ‹expr› | IDENTIFIER = ‹expr› , ‹bindings›

The other addition is function applications, which are written IDENTIFIER(‹exprs›), for example f(1, 2, 3). This is the syntax for a call to a function.

1.2 Abstract Syntax

In this assignment, we add two new AST types: Prog for representing programs and FunDecl for representing function declarations:

struct Prog<E, Ann> {
    funs: Vec<FunDecl<E, Ann>>,
    main: E,
    ann: Ann,

struct FunDecl<E, Ann> {
    name: String,
    parameters: Vec<String>,
    body: E,
    ann: Ann,

These types are parameterized by a type of annotations Ann as well as a type of expressions E. This way we can abstract over programs containing arbitrary expressions versus and programs containing only sequential expressions. We define "Surface language" programs and function declarations versus "sequential" programs and declarations as type aliases:

type SurfProg<Ann> = Prog<Exp<Ann>, Ann>;
type SurfFunDecl<Ann> = FunDecl<Exp<Ann>, Ann>;

type SeqProg<Ann> = Prog<SeqExp<Ann>, Ann>;
type SeqFunDecl<Ann> = FunDecl<SeqExp<Ann>, Ann>;

Additionally we add forms for function calls to expressions and sequential expressions.

1.3 Semantics

There are several distinguishing features of Diamondback. The first is function applications. A function application should give the answer we’d get if we followed the rules for substituting argument values for parameter names. So, for example:

def f(x, y):
  x + y

f(1, 2)

should produce 3.

Logical operators && and || do not need to short-circuit; they should have the same behavior as you gave them in Assignment 4.

Since our Diamondback functions only call other Diamondback functions, rather than arbitrary functions in C, you can use a simpler calling convention than the full x64 calling convention: you can simply push all the arguments onto the stack and pop them all back off again. For primitives like print, obviously, you’ll need to follow the proper x64 stack layout. (If you’d like to use the x64 calling convention everywhere, you may do so, but it is more complicated for functions of more than 6 arguments. Test thoroughly.)

There are a number of new errors that can occur now that we have function declarations and calls. Your implementation should catch all of these cases statically; that is, at compile time before the program runs:

Again, these errors should stop the program from compiling, in your check_prog function, not happen at runtime.

1.4 Implementation

The file gives you all the assembly instructions you should need, though you’re free to add your own new instructions if you want.

There are a few new pieces to the implementation:

1.5 Tail calls

Refine your compilation of function applications to support tail-calls. There are three levels of complexity here:

To support the last one, you’ll need to be creative with the calling convention. The only truly essential feature of a stack frame is knowing where the saved RSP, RBP and return addresses must be, and all other variables’ locations can be calculated from knowing those. In particular, you’ll need to change the ordering of values in your stack frame, where RBP points between parameters and locals. Instead, both parameters and locals will need to be above RBP: change your compilation of function calls — and anything else that is affected! — to support this new convention. This change will prevent you from using the call instruction (why?), so you’ll need to come up with something clever to save the desired return address. Come see me if you try this, for a hint.

To detect that you’ve implemented tail calls correctly, you may want to attempt the following extra credit.

1.6 Extra credit: Printing stack traces

Implement a new prim1 operator, printStack, that takes a single argument and returns it, just like print does. What makes printStack special is its output: it will print out a stack trace of all the call frames from wherever it’s called, down to wherever our_code_starts_here began. For example, given the following program:

def fact(acc, n):
  if n < 1: printStack(acc)
  else: fact(acc * n, n - 1) + 0
  # NOTE: This + 0 is to prevent tail-recursion in this example

fact(1, 3)
Here is a potential stack trace:

RSP: 0x00007ffc352b1bb0	==>  0
RBP: 0x00007ffc352b1bd0	==>  70360600251912
(difference: -4)
Requested return val: 0x000000000000000c	==> 6
Num args: 2
    0x00007ffc352b1bb0: 000000000000000000	==>  old rbp
    0x00007ffc352b1bb8: 000000000000000000	==>  0
    0x00007ffc352b1bc0: 000000000000000000	==>  0
    0x00007ffc352b1bc8: 0xffffffffffffffff	==>  true
RBP 0x00007ffc352b1bd0: 0x00007ffc352b1c10	==>  old rbp
    0x00007ffc352b1bd8: 0x0000000000401077	==>  saved ret
    0x00007ffc352b1be0: 0x000000000000000c	==>  6
    0x00007ffc352b1be8: 000000000000000000	==>  0
    0x00007ffc352b1bf0: 000000000000000000	==>  0
    0x00007ffc352b1bf8: 000000000000000000	==>  0
    0x00007ffc352b1c00: 0x000000000000000c	==>  6
    0x00007ffc352b1c08: 0x7fffffffffffffff	==>  false
RBP 0x00007ffc352b1c10: 0x00007ffc352b1c50	==>  old rbp
    0x00007ffc352b1c18: 0x0000000000401077	==>  saved ret
    0x00007ffc352b1c20: 0x000000000000000c	==>  6
    0x00007ffc352b1c28: 0x0000000000000002	==>  1
    0x00007ffc352b1c30: 000000000000000000	==>  0
    0x00007ffc352b1c38: 0x0000000000000002	==>  1
    0x00007ffc352b1c40: 0x000000000000000c	==>  6
    0x00007ffc352b1c48: 0x7fffffffffffffff	==>  false
RBP 0x00007ffc352b1c50: 0x00007ffc352b1c90	==>  old rbp
    0x00007ffc352b1c58: 0x0000000000401077	==>  saved ret
    0x00007ffc352b1c60: 0x0000000000000006	==>  3
    0x00007ffc352b1c68: 0x0000000000000004	==>  2
    0x00007ffc352b1c70: 000000000000000000	==>  0
    0x00007ffc352b1c78: 0x0000000000000004	==>  2
    0x00007ffc352b1c80: 0x0000000000000006	==>  3
    0x00007ffc352b1c88: 0x7fffffffffffffff	==>  false
RBP 0x00007ffc352b1c90: 0x00007ffc352b1cb0	==>  old rbp
    0x00007ffc352b1c98: 0x00000000004010c1	==>  saved ret
    0x00007ffc352b1ca0: 0x0000000000000002	==>  1
    0x00007ffc352b1ca8: 0x0000000000000006	==>  3
BOT 0x00007ffc352b1cb0: 0x00007ffc352b1cf0	==>  old rbp
    0x00007ffc352b1cb8: 0x0000000000400f37	==>  saved ret

The output shows:

Hint: To identify where to stop printing the stack, I used a trick: declare a global mutable variable in Rust, static mut BOTTOM: u64 = 0; and extend start_here to take an argument fn start_here(bottom: *mut 64) -> SnakeVal and pass a pointer to BOTTOM: let output = unsafe { start_here(&mut BOTTOM) }; .

Then at the beginning of start_here, initialize that value with the current value of RSP, and then you’re guaranteed to know where your stack ends, dynamically.

Note: if you’ve properly implemented tail calls, then eliminating the + 0 from the program above, and rerunning it, should produce a much shorter stack trace...

Note: If you implement this extra credit, provide a brief, written explanation (in a comment in, near whatever new functions you define) of why this must be implemented as a Prim1 and not as a Call expression.

1.7 Testing

Include integration test examples in examples/ These include a new testing macro mk_any_test! that takes 2 arguments: the test name and the input file name and tests that compiling and running the example works without error, but doesn’t check for a particular output. This should be useful in testing your printStack function.

2 Running main

Running your own programs is the same as with Cobra, except you’ll give them the .diamond file extension. Examples from previous assignments will still work since they will simply be interpreted as programs with 0 function declarations.

3 List of Deliverables

Again, please ensure the makefile builds your code properly. The black-box tests will give you an automatic 0 if they cannot compile your code!

Please ensure the your code builds properly. The autograder will give you a 0 on that portion of the graade if it cannot compile your code.

4 Grading Standards

For this assignment, you will be graded on

5 Submission

Wait! Please read the assignment again and verify that you have not forgotten anything!

Please submit your homework to gradescope by the above deadline.