On this page:
3.19.1 Template
3.19.2 Parsing
parse
3.19.3 Alpha Renaming
alpha-vary
3.19.4 Constraint Generation
generate-constraints
3.19.5 Unification
unify
3.19.6 Inferring Types
infer-type
3.19.7 Testing Type Equality
3.19.8 Testing Hints
3.19.9 Extra Credit
3.19.10 What Not To Do

3.19 Type Inferrer

Complete this assignment with Team Three.

You must submit this in an archive named "typei.zip". This archive must contain a folder named "typei". This folder must contain all the files specified below. You must attach "typei.zip" to an email whose subject is "BYU - Fall 2010 - CS 330 - typei" and whose message body contains the name of everyone on your team (each on a separate line.) You must send this email to jay@cs.byu.edu before 5pm (Provo time) on 12/2. Ensure that what you are satisfied with what you submit, because only your chronologically first submission will be graded. Ensure that you follow these instructions exactly, since submissions that do not meet these requirements (i.e. do not have the correct format) will receive no credit, despite the time and energy you put into the assignment. Please see Turn In Policy for more information.

You must submit this in a file named "typei.rkt".

You must complete this assignment using a specific language. Choose Determine language from source in the DrRacket menu, then write
  #lang plai
as the first line of your program, overwritting the default #lang at the top of the file.

The grammar for the language:
  expr = num
  | true
  | false
  | (+ expr expr)
  | (- expr expr)
  | (* expr expr)
  | (iszero expr)
  | (bif expr expr expr)
  | id
  | (with (id expr) expr)
  | (rec (id expr) expr)
  | (fun (id) expr)
  | (expr expr)
  | tempty
  | (tcons expr expr)
  | (tempty? expr)
  | (tfirst expr)
  | (trest expr)

There are no type expressions this time. Your completed program will infer types for all expressions, including functions.

3.19.1 Template

You must use the following template:

  (define-type Expr
    [num (n number?)]
    [id (v symbol?)]
    [bool (b boolean?)]
    [bin-num-op (op procedure?) (lhs Expr?) (rhs Expr?)]
    [iszero (e Expr?)]
    [bif (test Expr?) (then Expr?) (else Expr?)]
    [with (bound-id symbol?) (bound-body Expr?) (body Expr?)]
    [rec-with (bound-id symbol?) (bound-body Expr?) (body Expr?)]
    [fun (arg-id symbol?) (body Expr?)]
    [app (fun-expr Expr?) (arg-expr Expr?)]
    [tempty]
    [tcons (first Expr?) (rest Expr?)]
    [tfirst (e Expr?)]
    [trest (e Expr?)]
    [istempty (e Expr?)])
  
  (define-type Type
    [t-num]
    [t-bool]
    [t-list (elem Type?)]
    [t-fun (arg Type?) (result Type?)]
    [t-var (v symbol?)])
  
  (define-type Constraint
    [eqc (lhs Type?) (rhs Type?)])

3.19.2 Parsing

Write the following function:

(parse se)  Expr?
  se : s-expression?
Parses expressions from the grammar.

You may assume that the s-expression supplied to parse are valid.

3.19.3 Alpha Renaming

Write the following function:

(alpha-vary e)  Expr?
  e : Expr?
Renames all the identifiers in e to new unique identifier, such that the same identifier is never used twice.

You will find make-immutable-hasheq, hash-ref, hash-set, and gensym useful for this part.

The function gensym returns a unique symbol each time it is applied. (gensym accepts an optional symbol as an argument; the result of (gensym 'x) is then a unique symbol that “looks like” 'x.)

Examples:

  > (gensym)

  'g1077

  > (gensym 'x)

  'x1078

If we were to call alpha-vary on (parse '(+ (with (x 4) x) (with (x 5) x))), it should return something like (parse '(+ (with (x1 4) x1) (with (x2 5) x2))).

If alpha-vary is given something that refers to unbound identifiers, it should error. For example: (alpha-vary (parse '(with (x 5) y))).

3.19.4 Constraint Generation

Write the following function:

(generate-constraints e-id e)  (listof Constraint?)
  e-id : symbol?
  e : Expr?
Returns the constraints generated by e. e-id serves as e’s label in this list.

A few things to note:
  • List operations are polymorphic; that is, you can create lists of numbers or booleans.

  • The bound body of a rec binding does not have to be a syntactic function. However, you may assume that the rec-bound identifier only appears under a fun in the bound body. In other words, the following expressions are legal:
      (rec (f (fun (x) (f x)))
        ....)
      (rec (f (with (y 4)
                    (fun (x) (f y))))
        ....)
    while the following are not legal:
      (rec (f f)
        ....)
      (rec (f (+ 1 f))
        ....)

  • During constraint generation, you will need to generate fresh identifiers for labels. Use gensym.

3.19.5 Unification

Write the following function:

(unify loc)  (listof Constraint?)
  loc : (listof Constraint?)
Implements the unification algorithm from class. The list of constraints that is returned should only have t-vars on the left.

unify should signal an error if two types cannot be unified or if the occurs check fails.

3.19.6 Inferring Types

Write the following function:

(infer-type e)  Type?
  e : Expr?
This should alpha-vary e, generate-constraints on it, run the unify algorithm, and finally extract the type from the substitution returned by unify.

3.19.7 Testing Type Equality

infer-type returns a Type, so you will need to test types for equality. This can be tricky due to type variables, particularly because generate-constraints calls gensym to generate unique type variables.

Use the following support code to test if two types are equal modulo identifier renaming:
  ; type=?/mapping : hash hash Type Type -> Bool
  ; determines if types are equal modulo renaming
  (define (type=?/mapping ht1 ht2 t1 t2)
    (define (teq? t1 t2)
      (type=?/mapping ht1 ht2 t1 t2))
    (cond
      [(and (t-num? t1) (t-num? t2)) true]
      [(and (t-bool? t1) (t-bool? t2)) true]
      [(and (t-list? t1) (t-list? t2))
       (teq? (t-list-elem t1) (t-list-elem t2))]
      [(and (t-fun? t1) (t-fun? t2))
       (and (teq? (t-fun-arg t1) (t-fun-arg t2))
            (teq? (t-fun-result t1) (t-fun-result t2)))]
      [(and (t-var? t1) (t-var? t2))
       (local ([define v1 ; the symbol that ht1 says that t1 maps to
                 (hash-ref
                  ht1 (t-var-v t1)
                  (lambda ()
                    ; if t1 doesn't map to anything, it's the first
                    ; time we're seeing it, so map it to t2
                    (hash-set! ht1 (t-var-v t1) (t-var-v t2))
                    (t-var-v t2)))]
               [define v2
                 (hash-ref
                  ht2 (t-var-v t2)
                  (lambda ()
                    (hash-set! ht2 (t-var-v t2) (t-var-v t1))
                    (t-var-v t1)))])
         ; we have to check both mappings, so that distinct variables
         ; are kept distinct. i.e. a -> b should not be isomorphic to
         ; c -> c under the one-way mapping a => c, b => c.
         (and (symbol=? (t-var-v t2) v1)
              (symbol=? (t-var-v t1) v2)))]
      [(and (Type? t1) (Type? t2)) false]
      [else (error 'type=? "either ~a or ~a is not a Type" t1 t2)]))
  
  ; type=? Type -> Type -> Bool
  ; signals an error if arguments are not variants of Type
  (define ((type=? t1) t2)
    (or (type=?/mapping (make-hash) (make-hash) t1 t2)
        ; Unfortunately, test/pred simply prints false;
        ; this helps us see what t2 was.
        (error 'type=?
               "~s and ~a are not equal (modulo renaming)"
               t1 t2)))
  
  (test/pred (t-var 'a)
             (type=? (t-var 'b)))
  (test/pred (t-fun (t-var 'a) (t-var 'b))
             (type=? (t-fun (t-var (gensym)) (t-var (gensym)))))
  (test/pred (t-fun (t-var 'a) (t-var 'b))
             (type=? (t-fun (t-var (gensym)) (t-var (gensym)))))
  (test/pred (t-fun (t-var 'a) (t-var 'a)) ; fails
             (type=? (t-fun (t-var (gensym)) (t-var (gensym)))))
  (test/pred (t-fun (t-var 'a) (t-var 'b)) ; fails
             (type=? (t-fun (t-var 'c) (t-var 'c))))
  (test/exn ((type=? 34) 34) "not a Type")
  
  ; constraint-list=? : Constraint list -> Constraint list -> Bool
  ; signals an error if arguments are not variants of Constraint
  (define ((constraint-list=? lc1) lc2)
    (define htlc1 (make-hash))
    (define htlc2 (make-hash))
    (or (andmap (lambda (c1 c2)
                  (and
                   (type=?/mapping
                    htlc1 htlc2
                    (eqc-lhs c1) (eqc-lhs c2))
                   (type=?/mapping
                    htlc1 htlc2
                    (eqc-rhs c1) (eqc-rhs c2))))
                lc1 lc2)
        (error 'constraint-list=?
               "~s and ~a are not equal (modulo renaming)"
               lc1 lc2)))

3.19.8 Testing Hints

You may find Racket’s quasiquoting useful. For example, you could define
  (define add1-fun '(fun (x) (+ x 1)))
to use in multiple test cases, such as
  (test/pred (infer-type (parse add1-fun)) (type=? (t-fun (t-num) (t-num))))
  (test/pred (infer-type (parse `(,add1-fun 1))) (type=? (t-num)))
  (test/exn (infer-type (parse `(,add1-fun false))) "unify")
This becomes more useful as functions get more complex, like curried map. Yes, your test cases should get this elaborate, as well as covering the simple cases.

Make sure you have test cases for the occurs check [PLAI 282].

3.19.9 Extra Credit

For an amount of extra credit equal to a “special” test case, write a program in this language for which your algorithm infers the type
  (t-fun (t-var 'a) (t-var 'b))

You shouldn’t attempt this problem until you’ve fully completed the assignment. Put "extra credit" in a comment above the test case to make it easy to spot.

3.19.10 What Not To Do

You do not need to implement an interpreter for this language.

You do not need to implement let-based polymorphism.