Tuesday, January 26, 2010

Twelf, my favorite general-purpose functional programming language

(Disclaimer: the title of this post is tongue-in-cheek—I do not really like to play favorites. But hopefully you will understand why I chose it: to mess with your head.)

Last week I was at POPL 2010 in Madrid. Most fun of course was catching up with friends and meeting new people, but there were also some excellent talks. A few highlights:

In this post I want to talk a bit about my favorite general-purpose functional programming language, Twelf. If that sentence gives you cognitive dissonance, please read on!

Really, what I am going to be describing is a very shallow embedding of a pure functional language within Twelf. For familiarity (though at a cost of expressiveness), I will limit our language to an ML-like, call-by-value fragment (with arbitrary recursive datatypes and recursive functions), rather than a fully polarized type theory. For fun, our language will also include Haskell-style typeclasses. Our two weapons will be traditional LF-style (dependently-typed) higher-order abstract syntax, and the defunctionalization trick I described a year ago. Don't worry, we won't be proving any theorems, only writing programs.

We begin by declaring the (meta-level) type of (object-level) types:

tp : type.                      % types
As we did informally in a previous post, we will distinguish three syntactic categories:
val : tp -> type.               % values
kon : tp -> tp -> type.         % "delimited" continuations
exp : tp -> type.               % expressions
(Note: there is nothing scary about delimited continuations—think of them as pure ML functions, or entailments, or morphisms in a category.) Expressions in our language are constructed out of values and continuations, as follows:
! : val A -> exp A.             %prefix 10 !.
$ : kon A B -> exp A -> exp B.  %infix right 11 $.
Because we are really restricting to types of positive polarity, values are built by applying constructors, while continuations are defined by pattern-matching on values. The latter principle is expressed (in defunctionalized form) by the apply function, a Twelf logic program:
apply : kon A B -> val A -> exp B -> type.
%mode apply +K +V -E.
Now, without knowing anything else about values and continuations, we can already give our language a generic (big-step) operational semantics:
eval : exp A -> val A -> type.
%mode eval +E -V.
- : eval (! V) V.
- : eval (F $ E) VR
        <- eval E V
        <- apply F V E'
        <- eval E' VR.
And that's the core language! Next we can build up a nice library of datatypes and library routines. We define some simple datatype constructors, by their value constructors:
1 : tp.
<> : val 1.

0 : tp.

* : tp -> tp -> tp.             %infix right 12 *.
pair : val A -> val B -> val (A * B).

+ : tp -> tp -> tp.             %infix right 11 +.
inl : val A -> val (A + B).
inr : val B -> val (A + B).

=> : tp -> tp -> tp.            %infix right 10 =>.
fn : kon A B -> val (A => B).
We also introduce polymorphic and recursive types (note our reliance on HOAS and substitution):
∀ : (tp -> tp) -> tp.
Λ : ({x}val (A x)) -> val (∀ A).

rec : (tp -> tp) -> tp.
fold : val (A* (rec A*)) -> val (rec A*).
Finally we derive some familiar datatypes:
bool = 1 + 1.
true = inl <>.
false = inr <>.

maybe = [A] A + 1.
nothing : val (maybe A) = inr <>.
just : val A -> val (maybe A) = [V] inl V.

nat = rec [X] 1 + X.
z : val nat = fold (inl <>).
s : val nat -> val nat = [V] fold (inr V).

list = [A] rec [X] 1 + A * X.
nil : val (list A) = fold (inl <>).
cons : val A -> val (list A) -> val (list A) =
        [V] [Vs] fold (inr (pair V Vs)).
Just as we do in Haskell, we can define continuations (functions) by giving them a type declaration, and then a list of pattern-matching clauses. For example, here we define the boolean operations "and" and "or":
and : kon (bool * bool) bool.
- : apply and (pair true true) (! true).
- : apply and (pair true false) (! false).
- : apply and (pair false _) (! false).

or : kon (bool * bool) bool.
- : apply or (pair false false) (! false).
- : apply or (pair false true) (! true).
- : apply or (pair true _) (! true).
We can also define some continuation constructors in the same way. For example, we can define the beloved lambda:
λ : (val A -> exp B) -> kon A B.
- : apply (λ F) V (F V).
λ constructs a continuation from an expression with a free value variable, which is useful when we don't need to pattern-match on the argument. We also define some useful syntactic sugar:
let : exp A -> (val A -> exp B) -> exp B = [E] [F] λ F $ E.
In effect here we are using Twelf as a macro system for our embedded language. The following continuation constructors correspond to (sequent calculus) "left rules" for the various type constructors:
split : (val A -> val B -> exp C) -> kon (A * B) C.
- : apply (split E*) (pair V1 V2) (E* V1 V2).

case : kon A C -> kon B C -> kon (A + B) C.
- : apply (case K1 K2) (inl V) E
        <- apply K1 V E.
- : apply (case K1 K2) (inr V) E
        <- apply K2 V E.

funsplit : (kon A B -> exp C) -> kon (A => B) C.
- : apply (funsplit E*) (fn F) (E* F).

gensplit : (({x}val (A x)) -> exp C) -> kon (∀ A) C.
- : apply (gensplit E*) (Λ V) (E* V).

unfold : kon (A* (rec A*)) C -> kon (rec A*) C.
- : apply (unfold K) (fold V) E
        <- apply K V E.
Note we are not obligated to use these constructors to define continuations, i.e., they are not part of the language specification: they are just library routines. Here's an example of a recursive definition:
append : kon (list A * list A) (list A).
- : apply append (pair nil L2) (! L2).
- : apply append (pair (cons V L1) L2)
     (let (append $ ! pair L1 L2) [L'] ! cons V L').
Okay, hopefully by now you are convinced that in this style we we can write more or less ordinary (effect-free) ML programs. Next we turn to typeclasses. Typeclasses are basically type-indexed (or type constructor-indexed) records, and so their encoding in Twelf follows the same pattern of defunctionalization (i.e., for the polarity-sensitive, because they are negative values). Here we introduce a few typeclasses familiar from Haskell:
EQ : tp -> type.
eq : EQ A -> kon (A * A) bool.

FUNCTOR : (tp -> tp) -> type.
fmap : FUNCTOR F -> kon A B -> kon (F A) (F B).

MONAD : (tp -> tp) -> type.
eta : MONAD T -> kon A (T A).
ext : MONAD T -> kon A (T B) -> kon (T A) (T B).

% Wadler's syntactic sugar...
return : MONAD T -> exp A -> exp (T A) = [M] [E] eta M $ E.
bind : MONAD T -> exp (T A) -> kon A (T B) -> exp (T B) = [M] [E] [K] ext M K $ E.
We can instantiate EQ in various ways:
1eq : EQ 1.
- : apply (eq 1eq) (pair <> <>) (! true).

sumeq : EQ A -> EQ B -> EQ (A + B).
- : apply (eq (sumeq EqA EqB)) (pair (inl Va1) (inl Va2))
        (eq EqA $ ! pair Va1 Va2).
- : apply (eq (sumeq EqA EqB)) (pair (inl Va1) (inr Vb2))
        (! false).
- : apply (eq (sumeq EqA EqB)) (pair (inr Vb1) (inl Va2))
        (! false).
- : apply (eq (sumeq EqA EqB)) (pair (inr Vb1) (inr Vb2))
        (eq EqB $ ! pair Vb1 Vb2).

paireq : EQ A -> EQ B -> EQ (A * B).
- : apply (eq (paireq EqA EqB)) (pair (pair Va1 Vb1) (pair Va2 Vb2))
       (let (eq EqA $ ! pair Va1 Va2) [x]
          let (eq EqB $ ! pair Vb1 Vb2) [y]
          and $ ! pair x y).

nateq : EQ nat.
- : apply (eq nateq) (pair z z) (! true).
- : apply (eq nateq) (pair z (s _)) (! false).
- : apply (eq nateq) (pair (s _) z) (! false).
- : apply (eq nateq) (pair (s N1) (s N2)) (eq nateq $ ! pair N1 N2).
Likewise FUNCTOR:
idf : FUNCTOR ([A]A).
- : apply (fmap idf K) V E
     <- apply K V E.

prodf : FUNCTOR F -> FUNCTOR G -> FUNCTOR ([A] F A * G A).
- : apply (fmap (prodf F G) K) (pair V1 V2)
     (let (fmap F K $ ! V1) [x] let (fmap G K $ ! V2) [y] ! pair x y).

expf : FUNCTOR ([A]B => A).
- : apply (fmap expf K) (fn K') (! fn (λ [x] K $ K' $ ! x)).

listf : FUNCTOR ([A]list A).
- : apply (fmap listf K) nil (! nil).
- : apply (fmap listf K) (cons V VS) (let (K $ ! V) [x]
                                  let (fmap listf K $ ! VS) [xs]
                                  ! cons x xs).
And MONAD:
idm : MONAD ([a]a).
- : apply (eta idm) V (! V).
- : apply (ext idm K) V E
     <- apply K V E.

maybem : MONAD ([a]maybe a).
- : apply (eta maybem) V (! just V).
- : apply (ext maybem K) V (case K (λ [_] ! nothing) $ ! V).
And now an example where we run fmap on a pair of lists, using Twelf's logic engine to automatically infer the appropriate instance of FUNCTOR:
%query 1 *
  eval (fmap F (λ [x] ! s x) $
            ! pair (cons z (cons z nil)) (cons (s z) (cons (s z) nil))) R.
---------- Solution 1 ----------
R = pair (cons (s z) (cons (s z) nil)) (cons (s (s z)) (cons (s (s z)) nil));
F = prodf listf listf.
____________________________________________
And finally, a slightly more involved example:
cond : exp C -> exp C -> kon bool C =
    [E1] [E2] case (λ [_] E1) (λ [_] E2).

lookup : EQ A -> val A -> kon (list (A * B)) (maybe B).
- : apply (lookup Eq V) nil (! nothing).
- : apply (lookup Eq V) (cons (pair V' VE) Vs)
     (cond (lookup Eq V $ ! Vs) (! just VE) $ 
        (eq Eq $ ! pair V V')).

one = s z. two = s one. three = s two. four = s three.
%query 1 *
  eval (bind _ (lookup _ one $ ! cons (pair z three)
                  (cons (pair one four) (cons (pair two z) nil)))
          (λ [x] return _ (! s x))) R.
---------- Solution 1 ----------
R = just (s four).
____________________________________________
And that's the gist of it.

If you are still reading this, perhaps you will be inspired and use our favorite general-purpose functional programming language to write, say, a webserver?