Homework 06: a lambda calculus reduction engine

Reed CSCI 384 Fall 2019

Due: Wednesday, 11/10 at 3pm.


In class we studied the lambda calculus which, in essence, is MiniML reduced considerably to include only these constructs:

<term> ::= fn <name> => <term>
<term> ::= <term> <term>
<term> ::= <name>

The full syntax is suggested by the grammar above, though expressions can also include parenthesized subterms. Terms are parsed with rules of associativity and precedence (application associates to the left, and the body of a fn extends as far to the right as possible) to resolve the ambiguities introduced by the grammar productions listed above.

This concrete syntax corresponds to an abstract syntax of terms with three constructors illustrated below:

VA(x) for any variable name x
LM(x,t) for any variable name x and any term t
AP(t1,t2) for any two terms t1 and t2

These constructors can be used to build a family of terms. For example, we could have the term

AP(VA("succ"),LM("f",LM("x",AP(VA("f"),AP(VA("f"),VA("x"))))))

which is the AST that would result from parsing the MiniML expression

succ (fn f => fn x => f (f x))
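If you write your engine in Python, you will need some concrete representation for these ASTs. Here is a minimal sketch of one possibility (an assumption, not a requirement): tagged tuples whose first element names the constructor, with small helper functions so terms can be written just like the notation above.

# A sketch of one possible Python representation of terms: tagged tuples.
def VA(x):      return ("VA", x)         # variable occurrence
def LM(x, t):   return ("LM", x, t)      # fn x => t
def AP(t1, t2): return ("AP", t1, t2)    # application  t1 t2

# The example term above can then be written exactly as shown:
example = AP(VA("succ"), LM("f", LM("x", AP(VA("f"), AP(VA("f"), VA("x"))))))

The later sketches in this handout assume this tagged-tuple representation.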

We then developed the definition of substitution, where variables could be replaced by terms within another term, and this led us to pay careful attention to “free” versus “bound” occurrences of a variable within a subterm.

We defined substitution succinctly, with the rules

[s/x] x           := s
[s/x] y           := y                              (when y is not x)
[s/x] (fn x => t) := (fn x => t)
[s/x] (fn y => t) := (fn z => [s/x] ([z/y] t))      (when y is not x, for fresh z)
[s/x] (t1 t2)     := ([s/x] t1) ([s/x] t2)
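In Python, these five rules translate almost directly into a recursive function. Here is a minimal sketch, assuming the tagged-tuple representation from earlier and a getFreshVariable helper like the one described in the Appendix:

def subst(s, x, t):
    # Compute [s/x] t, following the five rules above.
    if t[0] == "VA":
        return s if t[1] == x else t                   # [s/x] x := s   and   [s/x] y := y
    elif t[0] == "LM":
        y, body = t[1], t[2]
        if y == x:
            return t                                    # [s/x] (fn x => t) := (fn x => t)
        z = getFreshVariable(y)                         # rename the bound y to a fresh z ...
        return ("LM", z, subst(s, x, subst(("VA", z), y, body)))   # ... then substitute in the body
    else:
        return ("AP", subst(s, x, t[1]), subst(s, x, t[2]))        # push into both parts of an application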

We then described the action of a reduction step, built from a “beta reduction” step. We’ll use --> to describe this step, and that step looks as follows:

(fn x => t) s --> [s/x] t

This is just describing application of a function to its argument. We replace each free occurrence of the function parameter x in the function’s body t, plugging in the argument term s in each occurrence’s place.

From all this we were able to develop lambda calculus terms for natural numbers and booleans, including defining arithmetic and a zero-checking function. We defined booleans along with if. We figured out how to express let val and let fun, with a way of handling recursive functions in the latter. We showed how basic data structures could be introduced with a primitive pairing term, along with the extractors fst and snd.
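For reference, here is one standard version of those encodings written in the definition syntax used later in this assignment. These are the usual Church encodings; the terms we wrote in class may differ slightly in their details.

true  := fn a => fn b => a;
false := fn a => fn b => b;
if    := fn c => fn t => fn e => c t e;
pair  := fn a => fn b => fn s => s a b;
fst   := fn p => p (fn a => fn b => a);
snd   := fn p => p (fn a => fn b => b);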

Finally, we showed that evaluation could be developed formally, so that lambda calculus “programs” could be “run” by using beta reduction steps. We chose to do this in a way that is sometimes described as the foundations of “lazy evaluation,” by defining normal order reduction. To do this, we defined terms that were irreducible, and so were in normal form. This can be defined as the judgement t irr as follows:

-----
x irr

 (t1 t2) irr    t3 irr
----------------------
   ((t1 t2) t3) irr

  t irr
---------
(x t) irr

     t irr
---------------
(fn x => t) irr
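In Python, this judgement can be checked with a direct recursion over the term. Here is a sketch, assuming the tagged-tuple representation from earlier:

def is_irr(t):
    # A term is irreducible exactly when the rules above can derive "t irr".
    if t[0] == "VA":
        return True                       # x irr
    if t[0] == "LM":
        return is_irr(t[2])               # (fn x => t) irr when t irr
    # Application: irreducible only when the function part is not a lambda
    # (otherwise the term is a beta redex) and both parts are irreducible.
    return t[1][0] != "LM" and is_irr(t[1]) and is_irr(t[2])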

This subset of terms (the irreducible ones) allowed us to define normal order reduction steps. We’ll use ~~> to denote that stepping relation and define it as follows:

-------------------------
(fn x => t) s ~~> [s/x] t

          t ~~> t'
----------------------------
(fn x => t) ~~> (fn x => t')

        (t1 t2) ~~> t'
---------------------------
 ((t1 t2) t3) ~~> (t' t3)

  (t1 t2) irr    t3 ~~> t3'
------------------------------
((t1 t2) t3) ~~> ((t1 t2) t3')

      t ~~> t'
------------------
 (x t) ~~> (x t')

This defines a rewrite activity of the form

t1 ~~> t2 ~~> t3 ... ~~> tn

where, if tn is irreducible, we get something that looks like a value!
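In Python, a function that performs exactly one of these ~~> steps might look like the following sketch, again assuming the tagged-tuple representation and the subst and is_irr sketches above:

def step(t):
    # Perform one normal-order reduction step on a term that is not irreducible.
    if t[0] == "AP":
        f, a = t[1], t[2]
        if f[0] == "LM":
            return subst(a, f[1], f[2])    # (fn x => t) s ~~> [s/x] t
        if not is_irr(f):
            return ("AP", step(f), a)      # reduce within the function part first
        return ("AP", f, step(a))          # otherwise reduce within the argument
    if t[0] == "LM":
        return ("LM", t[1], step(t[2]))    # reduce under the lambda
    raise ValueError("a variable is already irreducible")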

Whew.


ASSIGNMENT

Part 1. The parser.
Build a parser for source files that look like lambda calculus “programs.” The file format should allow its “programmer” to define a “main” term, using syntax similar to the grammar I wrote at the top of this page, and also to give a series of definitions of named terms that can be used in that main term.

For example, you could imagine parsing files that look like this:

zero := (fn f => fn x => x);
succ := fn n => (fn f => fn x => f (n f x));
plus := fn n => fn m => (n succ m);
times := fn n => fn m => (fn f => fn x => n (m f) x);
two := succ (succ zero);
main := times (succ two) two; 

The last definition gives the main term to be reduced, and that term relies on the series of terms defined just above it. Your program should be able to read a file like the one above, saved as, say, sample1.lc.
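In the grammar notation from the top of this page, the shape of these files might be described like this (the productions here are a sketch of the format suggested by the sample, not a prescribed grammar):

<program> ::= <defn> <program>
<program> ::= <defn>
<defn>    ::= <name> := <term> ;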

Part 2. The term.
Having parsed these files, you should produce a single lambda calculus term that is to be reduced. This will, of course, require you to carefully substitute the definitions of named terms into the terms that follow them. This can be done with the trick we used for expressing let in the lambda calculus. That trick rewrote the MiniML expression:

let val x = d in b end

as the expression

(fn x => b) d

That means that, with a .lc program of the form

x1 := t1;
...
x99 := t99;
x100 := t100;
main := t

you could construct a term equivalent to

(fn x1 => ... ((fn x99 => ((fn x100 => t) t100)) t99) ...) t1

For example, were the “program” something like this sample2.lc:

zz := fn f => fn x => x; 
zf := fn y => zz;
main := zf zz;

I would expect your code to want to reduce a term like this:

AP(LM("zz",AP(LM("zf",AP(VA "zf",VA "zz")),LM("y",VA "zz"))),LM("f",LM("x",VA "x")))

This term uses AP to replace zf with its definition within the main term zf zz, and then to replace zz with its definition in that resulting term.
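This construction is easy to mechanize. Here is a sketch in Python, assuming the tagged-tuple representation from earlier and that your parser yields the definitions as a list of (name, term) pairs in the order they appear, along with the parsed main term:

def build_program_term(defns, main):
    # Wrap main with the definitions, working backwards through the list,
    # producing (fn x1 => ... ((fn xN => main) tN) ...) t1.
    term = main
    for (name, defn) in reversed(defns):
        term = ("AP", ("LM", name, term), defn)    # (fn name => term) defn
    return term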

Part 3. Reduction.
Your program should then start applying reduction steps, one beta reduction step at a time, working to reduce the term. It should use normal order reduction, defined above, which simply beta reduces the leftmost, outermost beta-reducible term.

It should be able to do this in a quiet mode, where only the final, fully reduced term is reported. You should also, for debugging purposes, have it run in a verbose mode where each intermediate reduction step is shown. In that verbose mode, only one beta reduction should be performed with each step.
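The driver for this could be as simple as the following sketch, assuming the step and is_irr sketches from earlier and some pretty-printing function pretty of your own:

def reduce_term(t, verbose=False):
    # Apply single normal-order steps until the term is in normal form.
    while not is_irr(t):
        t = step(t)
        if verbose:
            print(pretty(t))    # verbose mode: show every intermediate term
    return t                    # quiet mode: only the final term is reported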

You can write all this code in either Python or Standard ML, or even a combination of both. You can also work with a partner on the assignment. If you do, I insist that both of you know the details of each other’s work. In other words, I’d prefer that you pair program, and that you work to switch “leads” as you do that development. Both of you will submit a repository, but presumably that repo will be the same for both of you.

Part 4. Testing.
I’d like you to convince me that your code works. To do this, I’d like you to write several test .lc programs. You’d want to include several small tests, simple ones that just showcase that the reduction engine is working. But I’d also like to see some significant tests. Here, I’d want to see evidence that you know how to “program” in the lambda calculus. Showcase the Church numerals; booleans using if, etc.; pairs; let; bounded iteration; and recursive simple functions.

In addition, I’m requiring you to write each of the terms prescribed in the next part.

Part 5. Write some terms.
Write each of the following as lambda calculus terms:

power n m: this should compute the m-th power of n, where each is a Church numeral.

pred n: this should compute the Church numeral that precedes n. This can just be the term we wrote in class, which relies on the pairs, booleans, succ, and zero we also developed there. Note that the predecessor of zero is just zero.

minus n m: this should compute the difference between n and m. Since we are working with Church numerals, and they represent natural numbers, then the result should be zero if m is larger than n.

less n m: this should compute whether n is strictly smaller than m. It relies on our encoding of a term that computes whether or not a Church numeral is zero.

equal n m: this should compute whether n is equal to m. Hint: it might rely on an encoding of and.

fibbit n: this should compute the n-th Fibonacci number. It should be written using “bounded iteration.” That is, it should have this form

fibbit := fn n => (fibExtract (n fibIterator fibStart))

div2 n: this should compute the quotient of n divided by two. It should also use bounded iteration in its design.

fibrec n: this should also compute the n-th Fibonacci number. It should be written using recursion via the Y combinator. That is, it should have this form

fibrec := (Y fibTemplate)

gcd n m: this should be a recursive function (again, using Y) that computes the greatest common divisor of n and m. It should use Euclid’s algorithm, which finds the GCD by repeated subtraction. (That is, you don’t need to define mod to do this.)

You will get a lot of credit for writing these. Include them in your submission using the .lc file format syntax. Even if your parser and/or reducer isn’t fully working, I will still look at these and give you credit for them.


Appendix

Below are some notes and suggestions about the work for this assignment.

FRESH VARIABLES

There is an important alpha-renaming part of a beta reduction step that needs to happen, and that is snuck into the substitution rule:

[s/x] (fn y => t) := (fn z => [s/x] ([z/y] t)) for fresh z

Recall why we needed to do this: it’s possible that, during reduction, one lambda term sits inside another, and each of those lambdas binds the same variable name. If the inner lambda term is the fn y => t in the above rule, then the term s might contain occurrences of y that refer to the outer lambda. So when s is plugged in for x inside of t, any y in s would get captured by the inner y, when it should instead refer to the outer y.

To fix this, we invent a “fresh” variable z. This is a variable that has never been used anywhere before during our reduction. We then take that z and we perform

[z/y] t

That is, we change all the inner y variables to z. So, for example, a term like

[z/y] (y ((b c) (fn w => y w)))

becomes the term

z ((b c) (fn w => z w))

To mimic this mathematics within SML or Python, we need a way of generating fresh variables. This is not too hard. In Python you can keep a global counter

freshVariableIndex = 0

and then we can write a function

def getFreshVariable():
    global freshVariableIndex                 # update the module-level counter
    freshVariableIndex = freshVariableIndex + 1
    return "x_" + str(freshVariableIndex)

Each time you call this function, you’ll get a newly named variable like

x_3451

and that global index will increment, so the next one will be

x_3452

etc. This ensures that each variable name has never been used before (assuming none of the original lambda terms use a name like that, of course).

Now, I personally have found that these fresh variables make debugging difficult, since at some point you’ll have intermediate terms riddled with x_WXYZ everywhere. So here’s an alternate strategy: have getFreshVariable(v) take a variable name as its parameter v, say, a string like "fact" or "bob_102" or "x_87686" or "y". Then getFreshVariable can work out the base name of that variable. That would be "fact", "bob", "x", and "y" for my examples. It should then append the next index to that base string. Were that index 100135, then you’d get back

fact_100136
bob_100137
x_100138
y_100139

as a result of four successive calls.

In this way, you can track the origins of the renamed variables a little better, as they’ll be named like they were named in the term where they first appeared.
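Here is a sketch of that variant in Python. The particular way of finding the base name, splitting off a trailing numeric suffix, is just one reasonable choice, not a requirement:

def getFreshVariable(v):
    global freshVariableIndex                  # the same global counter as before
    freshVariableIndex = freshVariableIndex + 1
    # Strip a trailing _NNN suffix, if there is one, to recover the base name.
    parts = v.rsplit("_", 1)
    base = parts[0] if len(parts) == 2 and parts[1].isdigit() else v
    return base + "_" + str(freshVariableIndex)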

FRESH VARIABLES IN SML

This is somewhat trickier in SML, since SML is a functional language. You’ll need to use the reference type that’s built into SML. You’d write something like:

val freshVariableIndex = ref 0

and then write the function

fun getFreshVariable v =
    let val _ = (freshVariableIndex := (!freshVariableIndex) + 1)
        val i = (!freshVariableIndex)
        val base = ...  (* strip off the trailing _ suffix using explode and implode *)
    in  base ^ "_" ^ (Int.toString i)
    end

This gives you the function getFreshVariable : string -> string. Running it has the side effect of incrementing the fresh variable index.

A PYTHON/SML HYBRID

I think the easiest way to get this program working is to write the reducer code in Standard ML and the parser in Python. For example, suppose you parsed this program in Python:

two := fn f => fn x => f (f x);
succ := fn n => (fn f => fn x => f (n f x));
plus := fn n => (n succ);
main := plus two two;

You could pretty print the AST that was created into an SML file as something like:

let
val x1 = "two"
val t1 = LM("f",LM("x",AP(VA"f",AP(VA"f",VA "x"))))
val x2 = "succ"
val t2 = LM("n",LM("f",LM("x",AP(VA"f",AP(AP(VA"n",VA"f"),VA"x")))))
val x3 = "plus"
val t3 = LM("n",AP(VA "n",VA "succ"))
val t = AP(AP(VA "plus",VA "two"),VA "two")
val main = AP(LM(x1,AP(LM(x2,AP(LM(x3,t),t3)),t2)),t1)
val value = norReduce main
in
   print (pretty value)
end

The term main is thus the Standard ML representation of the .lc program parsed by your Python. The function norReduce would perform the normal order reduction of main. And so then value would be the reduced term of that main term. The last lines output that reduced term using a function pretty that converts it to a string.

So my hybrid proposal is as follows:

  1. parse .lc files with a Python parser
  2. pretty print the AST in Python in a form that looks like SML code, outputting it into a text file
  3. append that text file onto the end of your SML reducer’s source code (using Python scripting), putting it into a .sml file
  4. execute that .sml source by running SML/NJ
  5. pretty print the SML term by building a string, then using SML’s print function to output that string to the console.
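Here is a sketch of what steps 3 and 4 might look like in Python. The file names reducer.sml and program.sml are placeholders for your own, and sml is the usual SML/NJ launcher:

import subprocess

# Step 3: glue the pretty-printed term onto the end of the reducer's source.
with open("run.sml", "w") as out:
    for name in ("reducer.sml", "program.sml"):    # hypothetical file names
        with open(name) as src:
            out.write(src.read() + "\n")

# Step 4: feed the combined source to SML/NJ on standard input.
with open("run.sml") as combined:
    subprocess.run(["sml"], stdin=combined)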