Assignment 4: Datatypes
Due: Fri 04/04 at 11:59pm
git clone
In this assignment, we extend our Snake language with support for dynamic typing and heap-allocated arrays.
1 The Diamondback Language
1.1 Concrete Syntax
‹expr› ... ‹array› ‹expr› [ ‹expr› ] ‹expr› [ ‹expr› ] := ‹expr› newArray ( ‹expr› ) length ( ‹expr› ) isBool ( ‹expr› ) isInt ( ‹expr› ) isArray ( ‹expr› ) ‹exprs› ‹expr› ‹expr› , ‹exprs› ‹array› [ ] [ ‹exprs› ]
1.2 Semantics
Runtime values in Diamondback are either 63-bit signed integers, booleans true
or false
or a mutable array of Diamondback values.
Most operations in Diamondback have strict dynamic type checking, and raise runtime errors if they are applied to incorrect values or overflow:
If an arithmetic operation
+,-,*
receives any input that is not an integer, it raises the error message "expected a number, got v". If the arithmetic operation overflows, it should raise the error message "arithmetic operation overflowed"If a comparison operation
<,<=,>,>=
receives an inputv
that is not an integer, it raises the error message "expected a number, got v".If a boolean operation
&&,||, !
receives an inputv
that is not a boolean, it raises an error "expected a boolean, got v".If an if statement is run on a value
v
that is not a boolean, it raises an error "expected a boolean, got v".If
newArray
is called on a non-integerv
it raises an "expected a number got v" error. If the input is a negative integer it raises an error "length v is negative".In evaluating
length(e1)
,e1[e2]
ore1[e2] := e3
if after evaluating all sub-expressions,e1
is not an array, raise the error "expected an array got v". After checking this, check thate2
is an integer and a valid index. If it is not an integer, raise the error "expected a number got v" and if it is an integer but not in the range of the array, raise the error "index n out of bounds"
The equality and inequality operations implement physical equality of values, so two array values are only equal if they are the same object in memory, regardless of if they contain the same values.
So for instance, x == x
always returns true, but [0] == [0]
returns false because each of the two arrays is newly allocated. Values of different type are always unequal, e.g., 0 != false
.
The operations isInt,isBool,isArray
return true when the value is of that type and false otherwise.
When displaying the final result of a Diamondback program, or executing the print
function, the value should be printed to stdout as follows:
Integers should be displayed as signed integers
true
andfalse
should be displayed as suchAn array
[v1, ...]
should be printed as a square bracket, followed by the values printed out, with a ", " separating the values, and ending in a close bracket. The only exception to this is for circular array values such as an array whose only element is itself. This can be constructed aslet a = [0] in let _ = a[0] := a in a
. Whenever a circularity is reached, rather than printing infinitely, the value should be printed as "<loop>". So the circular array should be printed as "[<loop>]".
There are two primitives for creating new array values. The operation newArray(e)
takes a non-negative integer e
as input and creates an array of that given length, with all the elements initialized to 0
. Array literals [e1,e2,...]
are parsed as Prim(MakeArray,[e1,e2,...])
which creates an array with the given arguments as initial values.
Use "ArrayGet" and "ArraySet" to read and modify the elements in an array. a[i]
takes an array a
and a zero-indexed integer i
and outputs the element at index i
in the array. The expression a[i] := b
takes three arguments: an array a
, a zero-indexed integer i
, and an arbitrary expression b
after modifying the element at index i
in the array to be b
, the whole expression outputs b
.
Finally, the length
operation takes an array and outputs its length.
1.3 Runtime Representation
The runtime representation for values is as follows:
A 63-bit integer
n
is represented as the 64-bit integer2 * n
true
has the runtime representation0b101
andfalse
is represented as0b001
An array
a
is stored as a pointer with the bottom two bits overwritten to0b11
. The pointer points to an object on the heap consisting of a length and then the values in the array sequentially after it (i.e., at higher addresses).
2 SSA Extensions
Unlike Diamondback, SSA values are 64-bit signed integers. In the translation from Diamondback to SSA, diamondback values must be explicitly tagged and encoded to their correct runtime representation. All primitive operators in SSA expect inputs and outputs to be "untagged" as before, and manual tagging and untagging should be performed when lowering AST operators to SSA operators. For example:
Our SSA IR is extended with assertions for error checking as well as array primitives:
"AssertType" takes a desired type and a tagged argument
"AssertLength" takes an untagged (but signed) integer argument
"AssertInBounds" takes two untagged integer arguments, where the first one is the bound
"Store" takes an untagged address, an untagged integer offset, and a value that is either tagged or untagged
Operation "Load" takes an untagged address and an untagged offset
Operation "AllocateArray" takes an untagged integer length.
The other extensions to SSA are 4 shift operations, arithmetic/logical left/right shift. These operations are introduced to closely ressemble their corresponding assembly instructions. The shift offset must be an unsigned 8-bit integer constant, which is the reason why these shift operations are introduced as "ssa::Prim1", because they only require one input argument. The "IntToBool" operation is removed in favor of dynamic type checking and explicit tagging.
2.1 Semantics
The primitive operations in SSA should have the same semantics as their assembly counterpart. The arithmetic operations +
,-
and *
raise an error message "arithmetic operation overflowed" if the operation overflows the 64-bit representation.
The "AssertType" instruction checks if the tagged argument has the same tag as the desired type. When the runtime argument doesn’t keep up to the assertion, the instruction should produce the error message "expected a <type>, got <value>", where "<type>" is either "number", "boolean", or "array", and "<value>" is the Diamondback representation of the argument (e.g. true
if the input is 5
).
The "AssertLength" instruction checks if the untagged signed integer is non-negative at runtime. If not, it should produce the error message "length <n> is negative", where "<n>" is the length.
The "AssertInBounds" instruction checks if the untagged signed index is in the range [0, bound). If not, it should produce the error message "index <n> out of bounds", where "<n>" is the index.
The "Store" and "Load" should store or load element located at (address + offset * 8).
Finally, "AllocateArray" calls "snake_new_array" in stub.rs file. The input is an untagged integer length, and the output is an untagged address pointing to the allocated array.
2.2 Code generation
Code generation for most existing operators is unchanged from previous assignments. An exception is that arithmetic operations should check for overflow, which can be detected by the overflow condition code o
.
Assertions are compiled to comparison and a conditional jump to the error labels as demonstrated in the lectures. At the beginning of your code generation, you should emit labeled blocks that call call "snake_error" with the error code set as the first argument. For those errors that require a second argument, you should correctly set it up as well before jumping to the error label.
3 Runtime System
The runtime system has been split into three files: runtime/common.rs
includes some basic definitions and utilities. runtime/extensions.rs
is a file containing the error handling as well as printing for snake values, which you must implement. runtime/stub.rs
has been extended to initialize the Snake heap, and provides the function snake_new_array
to allocate new array values. Additionally, it modifies the call into entry
to consist of an array value of the command line arguments.
You need to complete the sprint_snake_value
function in runtime/extensions.rs
to support printing Diamondback values correctly.
4 Starter Code and Deliverables
This assignment’s starter code includes a full implemtation of the frontend, and much of the scaffolding of the middle end.
For this assignment you will submit your updated versions of three files:
src/frontend.rs
src/middle_end.rs
src/backend.rs
runtime/extensions.rs
You don’t have to update src/frontend.rs
but if you prefer to you are allowed to do so.
These are the only files that you can change in your submission, your code will be linked with reference versions of the other files.
We’ve included a script mk_submission.sh
in the starter code
repo that should make zip file with just those three files and your
testing files that you should upload to gradescope. The testing files
are included for us to get some idea of how students are testing their
code, they are not graded.