Reed CSCI 384 Fall 2021
Homework 05: a MiniML interpreter
The deadlines are::
• Code for *
, -
, div
, mod
, orelse
, <=
, =
due Friday, 10/15, 7pm.
• Written semantics of pairs, output/sequencing due Friday, 10/15, 7pm.
• Pairs, print
, and ;
code due Wednesday, 10/27, 3pm.
• Function application, fn
, let fun
code due Wednesday, 10/27, 3pm.
Download the interpreter code
In an earlier homework you worked out a parser for a (very partial) grammar for the MiniML language. In class the last weeks we worked out the semantics for that language. We chose to develop a “big step” semantics, defining the meaning of a three-place relation:
E |- e V v
(here, we are using capital V in place of a “double” downward arrow) which asserts that the abstract syntax term e has the value v when evaluated in the variable-value binding context E.
We added functions to the language, along with function application, with the grammar productions
<exp> ::= fn <name> => <expn>
| <expn> <expn>
These correspond to a lambda expression with a formal parameter on the left side of the arrow, and its rule for evaluation on the right. For example
fn x => x + 1
would be the integer successor function. The second is the “silent juxtaposition” binary operation for function application. For example
(fn x => fn y => x + y) 3 4
is a fancy way of writing 7.
We then added semantic rules for evaluating these two constructs, having added their AST constructors ‘fn’ and ‘ap’. To do so we needed to invent new kind of value, a “closure”. Closures are essentially the same as their lambda expression, namely, they consist of their function’s formal parameter, along with the AST of their rule for evaluation, but also the variable binding environment that was the context within which that rule was described. For example, the expression
let val y = 6 in (fn x => y*10 + x) end
evaluates to a closure triple
("x", r, C)
where r is the AST
plus(times(var("y"),num(10)),var("x"))
and the C is the environment
[("y",6)]
That way, should that function get applied to some value, we can evaluate the rule knowing the actual parameter value to be substituted for the formal parameter “x”, along with the bound value 6 to be substituted for “y”.
ASSIGNMENT
The code in the file miniml.py
contains a parser for MiniML, including parsing of the function syntax, producing ASTs that contain the constructors fn
/app
with the grammar rules:
<exp> ::= fn <name> => <expn>
| <expn> . <expn>
NOTE THE SLIGHT VARIATION: The parser expects an explicit .
when code performs a function application. To the left of the period is a function expression. To the right of the period is the argument being fed. I may have time to fix the parser so that the .
isn’t needed— it’s not hard to— but that is its current state. (And I didn’t want to introduce any last-minute bugs.)
The code, in addition, is built as a simple interpreter for the MiniML language, rather than just a parser. Most of that code sits in the definition of the function def eval(env,ast)
described more below, but there are also a few supporting functions I’ve written.
The code is incomplete. It handles let
, var
, if
, num
, plus
, and
, less
, true
, and false
. Your job is to finish the code. In the end you’ll hand in a working MiniML interpreter, along with a significant series of test programs that you used to convince yourself that the code works.
This means that you’ll need to write the eval
cases for times
, div
, mod
, minus
, or
, fn
, app
, leq
, equals
, and not
. The exercises also ask you to handle fst
, snd
, pair
, seq
and print
, too. You’ll want to consider adding a unit
literal, as well, as this is the return value of print
.
Below I make a few notes about the implementation, offer some bonus exercises, and also request that you complete some written exercises.
Note 1: AST nodes with code locations
My parser now returns richer AST nodes. Like before, each node is a Python list ast
where ast[0]
is a string that indicates the constructor for that node, and ast[1]
, ast[2]
, etc. are the needed information components attached to that node, including the ASTs of subexpressions. But I’ve also added a last item, ast[-1]
which is a string indicating the line # and column # where that node’s expression was parsed within the source.
Throughout the code, you’ll see me report an error by grabbing that location information, usually as a variable named where
. For example, evaluation of a plus
makes sure that its two sub-expressions each evaluate to an integer value. It raises an error indicating where the +
occurred in the source code, should either of those sub-expressions result in a non-integer.
Your eval
code should also raise “run-time” errors when they should arise (e.g. type errors, divisions by 0, failure to define a name in the MiniML program, failure to apply a function in ap
, etc.)
Note 2: tagged values
I made a design decision within my interpreter to represent values returned by eval
as more than just a Python value. Instead, I return a list that describes the type of the value, along with the value itself. For example, the MiniML expression 35
gets returned as the Python list
["Int",35]
and the MiniML expression true
gets returned as the Python list
["Bool",True]
I use these “tagged values” for type checking, as mentioned in Note 1, and also for reporting the result of an evaluated expression to the MiniML user/programmer.
The starter code expects function values to have the tag "Clos"
. You’ll want to return a list with that tag when you evaluate an fn
. It should result in a closure value, and in this form:
["Clos", ...the information needed for a closure... ]
Note 3: provided code
Here is a summary of the functions I’ve added for the interpreter:
def interpreter(tokens)
: takes a token stream, parses it, evaluates the AST from the parse, then reports the resulting (tagged) value.
def eval(env,ast)
: incomplete code that evaluates a MiniML term ast
within a variable-value binding environment env
.
def lookUpVar(x,env,err)
: my implementation of variable look-up (our notation E(x)) that seeks a binding of x
within the environment env
. The extra parameter err
is expected to be a string, an error message raised when the variable isn’t found in the given environment.
getIntValue(tval,err)
: this is a helper function that’s used to pick apart the tagged value tval
, expecting it to be an integer, and returning that integer value. It raises an error with message err
if tval
is something else.
getBoolValue(tval,err)
: similar, for boolean values.
Exercise: shortcircuited orelse
Write the case for evaluating the orelse
logical operator. The AST will have the tag "Or"
. Like the way we evaluate "And"
for andalso
, this logical construct should be “short-circuited” by the interpreter. This means that there should be cases where only the left expression gets evaluated by the interpreter. For orelse
, if the left expression evaluates to true
, then the right expression should not be evaluated because, at that point in the evaluation, the full expression will be already known to be true
.
Exercise: add pairs to MiniML
Consider adding the additional grammar:
<expn> ::= fst <atom> | snd <atom> | <atom>
<atom> ::= ( <expn> , <expn> )
The bottom of the two is the syntax for constructing a pair of values in MiniML. The top two are used to extract either the first component (with fst
) or the second component (with snd
) of a pair value.
Add these to the parser, producing their AST nodes. Add cases to eval
that handle programs that use pairs. You’ll need to add a new tagged value type, one that represents pairs, to make these constructs work.
Written exercise: semantics of pairs
Give the written rules for evaluating these three pair constructs.
Exercise: adding printing; sequencing
Add a sequencing operation to the MiniML language. This has the syntax
<atom> ::= ( <expn> ; <expn> ; ... ; <expn> )
Also add a construct
<expn> ::= print <atom>
The second, when evaluated, should evaluate the expression parsed as , and then print that value as a single line on the console. The first when evaluated, should evaluate each of the expressions in the sequence, from left to right, and then ultimately return the value of the last expression.
Thus, for example,
(print 5; print 7; print 2; 8)
would print three lines with 5
, 7
, and 2
, in that order, and then should evaluate to 8.
The value returned by print
should be a new value ()
, one of type unit
. (You might consider adding a parse of the expression “()” to the grammar/parser, just to be complete.)
Parse the semicolon for sequencing as a left-heavy parse tree of seq
AST nodes. So the expression above parses as
seq(seq(seq(print(num(5)),print(num(7))),print(num(2))),num(8))
Add print
and seq
execution to the interpreter.
Written exercise: semantics of output
With print
and seq
, the semantics of MiniML changes fundamentally. To model output, we instead want a 4-place relation like
E |- e V v => <v1,v2,...,vk>
This says that “evaluating e within the context E yields the value v along with the output value sequence <v1,v2,…,vk>.” If the expression has no output, then k=0, and we’d say that => <>
instead.
So our new evaluation relation is
E |- e V v => s
where E is a sequence of (Id x Val), e is a Trm, v is a Val, and s is a sequence of Val. An example s could be <5,7,2> or it could be <>.
Consider a subset of the MiniML language that includes only num
, var
, let
, if
, less
, plus
, seq
, and print
. Give the full semantics for this language by defining the inference rules for the “eval with output” relation.
For example if we evaluate seq(e1,e2) and evaluating e1 outputs s1 = <v1,v2,…,vk> and then evaluating e2 outputs s2 = <u1,u2,…,un> then our rules should say that seq(e1,e2) outputs their concatenation s1 • s2, that is, the sequence <v1,v2,…,vk,u1,u2,…,un>.
Exercise: adding functions and function application
Okay you are now ready to handle closures. You need to extend eval
to handle the fn
construct, which gets parsed as an "Fn"
-tagged AST. Recall the rule for evaluation
----------------------
E |- fn(x,r) V (x,r,E)
This is just an axiom in our rules, and the value listed as a triple is a closure. Your eval
code, when it sees an fn
, should give back a "Clos"
-tagged value that holds the formal parameter name of the function, the AST rule of the function, and the environment env
.
The key test of this work is with the "App"
-tagged AST node, which corresponds to function application. You’ll need to handle a case for eval
that mimics this rule from class:
E |- e1 V (x,r,C) E |- e2 V a [x -> a]C |- r V v
----------------------------------------------------
E |- app(e1,e2) V v
We evaluate the function, and make sure it is a closure value. We then evaluate the argument. And then we make a recursive call to eval
to evaluate the function body (r
in the above) with that argument value.
Suddenly, you have a functional interpreter for a functional language!
Exercise: adding let fun
In Standard ML, there is a construct let fun
that allows its programmers define functions without using the fn
notation. For example, here is the code for describing and using the integer squaring function
let fun sqr x = x * x
in
sqr 9
end
This expression evaluates to 81. It binds the name sqr
to the function value equivalent to fn x => x*x
.
Modify the parsing code and interpreter code to handle function definitions. Have it work for multi-argument functions, for example
let fun twoDigit tens ones = 10*tens + ones
in
twoDigit 3 7
end
Note that Standard ML allows complicated patterns in place of the formal parameter variables. It also allows several cases to be defined using the |
syntax. Your code definitely doesn’t need to deal with that. Your code should just handle a single case where the patterns are only variable names.
The fun
construct is used by Standard ML programmers to define recursive functions, so it is not mere “syntactic sugar” for let val f = fn ...
. Here, for example is code for computing the 10th Fibonacci number:
let fun fib n =
if n = 0 then 0
else if n = 1 then 1
else fib (n-1) + fib (n-2)
in
fib 10
end
For this to work, the name fib
needs to be bound to a function value, but one that somehow knows the definition of itself when its body gets evaluated. So somehow its value should be a closure, say, with a binding for the name fib
included within the context carried around by that closure.
Parse this kind of let fun
construct and then figure out a way to get recursion to work. You don’t have to handle mutually recursive functions defined with a series of the and
connector. It is not too hard to get mutual recursive functions to work—to have a collection of functions each know about all the other companion functions in their mutual definition— but it is a BONUS if you get these kinds of definitions working, too.