CSCI 384 Fall 2021

Homework 07: a type system for MiniML+

Due: November 29th, 3pm


starter code for this homework


This code in this repository is a nearly-working, nearly-complete implementation of a MiniML interpreter. It runs, but is missing some functionality, with the command

python3 miniml.py

bringing up the interactive interpreter, or with the command

python3 miniml.py f1.mml f2.mml ... fk.mml

which loads several MiniML source files first before bringing up the interpreter.

It relies on several Python source files:

prse.py: the parser for MiniML programs and expressions

eval.py: the evaluator of MiniML expression and definition terms

chck.py: the type checker and inference engine for MiniML

table.py: an implementation of MiniML+ “tables”

All of the work required for this assigment will involve editing the type system code chck.py. That type-checker is missing a few key things:

  1. the ability to check the types of some list and pair constructs.

  2. the ability to check the types of the new table type’s constructs.

In the description below, I talk about the type system code, briefly, mostly hinting at the work we’ve done in class and referring you to the miniml_types.pdf document included in this repository. I then give you your assignment.


The Type Checker and Inference Engine

The code in chck.py implements the components of the type system of the MiniML interpreter. The key functions it defines are infer and check and these rely on several components, including the function unify. Let’s talk about each of these briefly.

tau = infer(tyc,expn)

This function takes a name-type context tyc and a MiniML term expn and works to deduce a type tau for that expression. It does so by checking the subterms of the expression, making sure the uses of names and language constructs within that expression are consistent with the typing rules of the language. In doing so, the algorithm relies on assumptions about the types of free names within the expression. These assumptions are stored within the type context tyc. This algorithm is what is given in the PDF document included in this repository.

tyc' = check(tyc,defn)

This function is not much different than the above (I just needed another name for a function). Rather than checking an expression, it checks the val and fun constructs’ definitions of names. Rather than giving back the type of that name, or name(s), we instead give those back as a modified context tyc’. Furthermore the definitions are attached to a let, and it is in check that the type system intoduces a “forall type.” This too is discussed in the PDF document included in this repository.

It should be noted that both functions wrestle with type variables in a number of ways. Oftentimes, during checking of an expression, the code has to invent a type of a name or a subexpression, but without yet knowing enough information about how that name is going to be used or not fully knowing what other expressions that subexpression might serve. Further checking the use of a name or the use of the result of a subexpression leads to constraints on their type. To do this, type variables are used, and an algorithm unify is used to resolve these contstraints. The code thus also defines

unify(tau1,tau2)

This unifies two type terms, ones that include type variables, by seeking a set of substitutions of those variables with type terms, so that tau1 and tau2 can be made equal. This algorithm is laid out in the PDF document included in this repository. The difference here from the pseudocode is that, rather than returning a substitution, a “binding” is made to the type variable object. In essence, the variable is no longer free, but is instead replaced by that term everywhere it gets used. The code raises a “mismatch error” should unification not be possible.


MiniML+ Tables

I’ve extended the parsed and interpreted language with a new table type, one that looks superficially similar to the “record” type that is built into Standard ML—a kind of primitive object type—but instead is more like an “associative list” or dictionary (or map, etc) found in many languages. MiniML+ tables hold a collection of key-value pairs. The keys are strings and the associated values can be any MiniML value. The only restriction is that the values in a table are required to be of the same type. For example, the type int table would be a dictionary whose entries associate a string with an integer value.

Here are some examples of the use of an int table:

- val T = {hello=>4,bob=>17};
val T = {hello=>4,bob=>17} : int table
- T +: ("a",2);
val it = {hello=>4,bob=>17,a=>2} : int table
- empty +: ("goodbye",0);
val it = {goodbye=>0} : int table
- T -: "hello";
val it = {bob=>17} : int table
- T . "hello"
val it = 4 : int
- T ? "hello"
val it = true : bool
- size T;
val it = 2 : int
- keys T;
val it = ["hello","bob"] : string list

In the above, we’re seeing several forms of syntax for building tables. Here in fact is the full suite of operations:

empty - an empty table, analogous to nil
+: - binary operation for extending a table with an entry
-: - binary operation for removing an entry from a table
? - binary operation for key containment
. - binary operation for getting a value associated with a key
size - primitive unary operation that gives the number of entries
keys - primitive unary operation that gives the list of keys

One can, in principle, write functions that act on tables that are polymorphic. For example, the function zipt below takes a list of strings ks and a list of values vs and builds a table of them as entries:

fun zipt ks vs = 
  if null ks orelse null vs 
  then empty 
  else (zipt (tl ks) (tl vs)) +: (hd ks,hd vs)

The type of that should be

zipt : string list -> 'a list -> 'a table

because it would work just fine with vs being a list of bool, or a list of int, etc.

The code in prse.py produces Python AST list terms for each of the table constructs, like so

["Empty", p]

["Binop", op, e1, e2, p] for op of "+:", "-:", "?", "."

["Primitive", op] for op of "size", "keys" 

And the type checker works with new abstract type terms of the form

["Table", tau]

This is similar to ["List", tau] for the list type.


chck.py status

The type system code is nearly complete. Here is what is missing in the function infer because it is not yet implemented:

  • checking the list prepend operation ::

  • checking the type of the pair extraction primitives fst and snd.

  • checking the type of the list inspection primitives null, hd, and tl.

  • checking the empty table’s type (similar to nil)

  • checking the type of the table operations +:, -:, ?, and .

  • checking the table inspection primitives size and keys

The first three of these are laid out in pseudocode in the PDF document. The others you need to devise the rules for yourself.


ASSIGNMENT

Part 1 code
Make type checking work for each of the list and pair constructs listed above. Write .mml test files to demonstrate that your code works.

Part 2 written
Write the rules for checking and inferring the type of the table constructs. These should be inference rules like we devised for all the other types and their constructs, together, in class. Your inference rules shouldn’t be concerned about polymorphism, just as we didn’t when we wrote the typing rules for the other constructs.

Part 3a written
Write the pseudocode of INFER for each of the table constructs, similar to what I provided in class in that PDF.

Part 3b code
Modify the function infer so that it handles these constructs. Write .mml test files to demonstrate that your code works.

Throughout the code of chck.py you will find Python comment lines like

#
# IMPLEMENT THIS
#
return ["Bogus"]

You’ll want to replace each of those with your code, and maybe also some of your own comments.