Assignment 3: Cobra: Multiple types of values
Due: Fri 10/13 at 5:00pm
git clone
In this compiler, you’ll deal with COded Binary RepresntAtions of values. Click here for scary snake picture!
1 The Cobra Language
1.1 Concrete Syntax
The concrete syntax of Cobra is very similar to Boa, with a few new additions.
‹expr› ... true false ‹expr› < ‹expr› ‹expr› > ‹expr› ‹expr› <= ‹expr› ‹expr› >= ‹expr› ‹expr› == ‹expr› ‹expr› && ‹expr› ‹expr› || ‹expr› ! ‹expr› isbool ( ‹expr› ) isnum ( ‹expr› ) print ( ‹expr› )
1.2 Abstract Syntax
The abstract syntax is very similar to Boa, also:
pub enum Prim1 {
...
Not,
Print,
IsBool,
IsNum,
}
pub enum Prim2 {
...
And,
Or,
Lt,
Gt,
Le,
Ge,
Eq,
}
pub enum Exp<Ann> {
...
Bool(bool, Ann),
}1.3 Semantics
The semantics of booleans are straightforward. The semantics of
Exp::If changes slightly: its condition must evaluate to a
boolean, and it branches on the truth or falsehood of that value,
rather than whether it’s nonzero.
With the addition of two types to the language, there are two main changes that ripple through the implementation:
The representation of values
The possibility of errors
There is one other major addition, which is the print
primitive, discussed more below.
The representation of values requires a definition. We’ll use the following representations for the Cobra runtime:
truewill be represented as the constant0xFFFFFFFFFFFFFFFF.falsewill be represented as the constant0x7FFFFFFFFFFFFFFF.numbers will be represented with a zero in the rightmost bit, by shifting them left 1 bit. So, for example, 2 is represented as
0x0000000000000004.
You should raise errors at runtime in the following cases:
-,+,*,add1andsub1should raise an error (by printing it out) with the substring"arithmetic expected a number"if the operation’s argument(s) are not numbers. You can print more information than this if you like, but the message must have at least this substring.<,<=,>and>=should raise an error with the substring"comparison expected a number"if the arguments are not both numbers.+,-,*,add1andsub1should raise an error with the substring"overflow"if the result overflows, and falls outside the range representable in 63 bits. Thejoinstruction will be useful: it jumps if the last instruction overflowed.ifshould raise an error with the substring"if expected a boolean"if the conditional value is not a boolean.andandorshould raise an error with the substring"logic expected a boolean"if the arguments are not both booleans. Implementation of short-circuiting behavior is optional.notshould raise an error with the substring"logic expected a boolean"if its argument is not a boolean.
These error messages should be printed on standard error. The other operators never fail, and so never produce errors.
The == should work polymorphically, regardless of the types of
its inputs. If we ask if a boolean is equal to a number, the result
should be false, not a runtime error. A value is only
== to itself.
We add two new primitive operations, isbool and isnum. These two
operations have an effective type of Any -> Bool, and will return
true if the argument they receive is indeed a boolean or number,
respectively.
The last required primitive operation is print, which prints
its single argument to the command line (with a newline), and then
returns it. The print_snake_val function in stub.rs
explicitly returns its argument; you will need to retrieve and use
that value in your compiled output.
2 Examples
The expression
let x = 1 in let y = print(x + 1) in print(y + 2)will output
2 4 4The first 2 comes from the first
printexpression. The first 4 comes from the secondprintexpression. The final line prints the answer of the program as usual, so there’s an “extra” 4.The expression
if 54: true else: falseprints (on standard error) something like:
Error: if expected a boolean, got 54
3 Implementation strategies
3.1 Rendering errors
To display error messages on standard error, you’ll need to use a call something like:
eprintln!("Error: if expected a boolean");We have included unimplemented functions snake_error() in stub.rs. You
will need to add inputs of your design to this function to get your
desired error messages. You can follow what we did in class or design
it however you like.
3.2 Memory Layout and Calling Rust Functions
In order to set up the stack properly to call Rust functions, like
print_snake_val and your error functions, it’s necessary to
make a few changes to what we had in Boa.
When calling a Rust function such as
print_snake_val, you need to ensure that you decrementRSPenough to move it past all of your local variables before the call and then reverse this by incrementing it after the call returns.To figure out how much space, you need to implement the
space_neededfunction. There are many ways to implement this, but you need to make sure that you have enough space to fit all of your local variables and that you keep the stack aligned.When you make a function call, you must ensure that when the called function starts executing, the value of the stack pointer
rspis 8 modulo 16. If you use thecallinstruction to make the call, then because the call instruction decrements the value ofrspby 8, you want the stack pointer to be 0 modulo 16 immediately before executing thecallinstruction. As discussed in Lecture 7, you can achieve this by allocating one extra slot of "junk" space on the stack if you would otherwise be misaligned.Participating in the stack: As a callee using the System V AMD64 calling convention, our code has some responsibilities. First, we need to store any callee-save registers that we use in the code upon entry. So if you are using any of the callee-save registers as scratch registers, you should push them upon entry into
start_hereand pop them back off before the final return.
If you write any functions in stub.rs that you need to be available in
your assembly, you need to declare them in the assembly via:
;; In your generated assembly
extern "sysv64" <exported name of function>3.3 New Assembly Constructs
Sized: You may run into errors that report that the size of an operation is ambiguous. This could happen if you write, for example:cmp [RSP - 8], 0because the assembler doesn’t know if the program should compare a four-byte zero, a one-byte zero, or something in between into memory starting at
[RSP - 8]. To solve this, you can supply a size:cmp QWORD [RSP - 8], 0The
QWORDkeyword tells the assembler to use the "quad word" size for the memory reference. In our generated assembly code we have always worked at the full 64-bit word level so it is fine to just change all memory references to use theQWORDkeyword. If you do not want every memory reference to use the keyword feel free to change types inasm.rsto support more selective use.Unsigned vs Signed: Assembly code supports 64-bit unsigned and signed constants and certain values like the representation oftrue,falseand bit patterns cannot be represented asi64values. So theArg64/Arg32/Reg32types have replaced theImmwith two cases forSignedandUnsignedconstants. We recommend displayingUnsignedconstants in hexadecimal since you will probably only be using them for bit-patterns. To format a numbernin hexadecimal, you can useformat!("0x{:016x}", n)for 64-bit numbers andformat!("0x{:08x}", n)for 32-bit.push,pop: These two instructions manage values on the stack.pushadds a value at the current location ofrsp, and decrementsrspto point past the added value (remember, the stack grows toward numerically-smaller addresses, even though we draw it as growing upward in diagrams).popincrementsrspand moves the value at the locationrspwas pointing to into the provided arg.A
calldoes two things:Pushes the next code location onto the stack (just like a
push), which becomes the return pointerPerforms an unconditional jump to the provided label
shr,shl,sar: Bit shifting operations.sar(shift-arithmetic-right) preserves the sign bit of the value being shifted, whileshr(shift-right) simply fills in a zero bit.and,or,xor: Bitwise logical operations.test: Performs a bitwise-and of the two arguments, and records whether the result was zero, signed, or overflowed. This can be used in tandem with jumps likejoorjz.jo,jno: Jump to the provided label if the last arithmetic operation did/did not overflow.Comment: these allow you to add a comment on a line by itself, or in-line after an instruction. These might be helpful for readability of your generated output, while you’re debugging.
3.4 Debugging
If you are having trouble producing correct code, we highly recommend
Inspecting your generated assembly code
Print out intermediate expressions using
println!("{:?}", e)Finding minimal examples that produce incorrect behavior
4 Recommended TODO List
Here’s an order in which you could consider tackling the implementation:
Extend any tagging functions and sequentialization
Fix the Num case and implement the Bool case of the compiler to correctly encode the values.
Extend
check_progto use theOverflowerror to indicate integer constants that are too big to fit into 63-bit signed integers.Implement arithmetic and logical
Primoperations without checking runtime type tags. Test programs where the tag checks would succeed.Figure out how to test for errors and call the
snake_errorfunction when an error is detected. Test as you go. Be aware that if the function call segfaults, it may be because you need to refine yourspace_neededfunction.Complete the if case and test as you go.
Implement
printand in the processspace_needed. Test as you go; be aware that you should test interesting sequences of print expressions and let-bindings to make sure your stack integrity is good before and after calls.
Be sure to get your alignment correct! Stack misalignment does not
typically cause errors on Mac, but does on Linux and in particular on
the autograder servers so if you are getting errors you can’t
replicate on Mac, make sure to think about alignment. One way to
harden your code to force stack misalignment to error is at any
point before you use the call instruction for a call into Rust to
insert the following code:
test RSP, 0xF
jnz mis_alignedWhere mis_aligned is a label to some code that calls your
snake_error function (fixing the stack alignment first!) to
report a stack alignment error.
5 Running main
Running your own programs is the same as with Boa, except you’ll give them
the .cobra file extension.
6 List of Deliverables
your
compile.rsandasm.rsthe other src/*.rs files in the starter code
any additional modules you saw fit to write
your
runtime/stub.rsthe Cargo.toml
integration tests (
tests/examples.rs)your test input programs (
examples/*.cobrafiles)
Please ensure the your code builds properly. The autograder will give you a 0 on that portion of the graade if it cannot compile your code.
7 Grading Standards
For this assignment, you will be graded on
Whether your code implements the specification (functional correctness),
The autograder will again include tests that are hidden until after the deadline that are worth 10% of your score.
8 Submission
Wait! Please read the assignment again and verify that you have not forgotten anything!
Please submit your homework to gradescope by the above deadline.