Monday, January 26, 2009

Defunctionalizing Proofs (and how to define a pattern-matching, CPS language in Twelf)

I drove back from Savannah and POPL/PLPV/TLDI yesterday. It was a long and exciting drive (in part because of the snow and frozen windshield washer fluid driving through West Virginia, god I hate that stretch of highway), and a long and exciting week at the conference. My favorite talk was an invited one by Chris Barker on Wild Control Operators, in the sense of "continuations in the wild". Chris is a linguist who for several years (I think since around 2000) has been advocating the use of continuations to analyze apparently non-compositional phenomena in natural language. He collaborates with brilliant computer scientist Ken Shan, who wrote his PhD thesis on "Linguistic Side Effects".

In this post I want to introduce some Twelf code I wrote after getting back to Pittsburgh. At the PLPV workshop I presented my paper on Refinement Types and Computational Duality. I didn't exactly time the talk well, and ended up going through about half my slides in ten minutes past the allotted time. (Luckily, I didn't run into Obama's talk, which was better prepared.) The paper itself is similar in that respect, trying to rush through too much material in too short a space. One of the ideas it describes probably too briefly is that of defunctionalizing proofs. Defunctionalization is a trick due to John Reynolds, from his paper on Definitional Interpreters for Higher-Order Programming Languages. It allows you to take a program with higher-order functions and convert it into one with only first-order functions. Essentially, it boils down to giving a unique tag to every function definition in your program, and then defining a separate "apply" function that tells you how to apply the function denoted by a tag to an argument. (A complication is that function bodies may reference escaping variables, so defunctionalization has to be preceded by closure conversion.)

Why would proofs contain functions in the first place? For the same reason programs do: sometimes they are a useful abstraction. An old example from proof theory comes in the work of Paul Lorenzen and Kurt Schütte, who in the 1950s began revisiting Gerhard Gentzen's work via infinitary methods (see "Lieber Herr Bernays!, Lieber Herr Gödel!" by Feferman for an interesting history). Schütte simplified Gentzen's proof of cut-elimination for Peano arithmetic by replacing the usual schema of induction by the so-called ω-rule, which says, "To prove ∀x:ℕ.A(x), prove A(0), A(1), A(2), ...". At first glance, the ω-rule seems silly: how can we ever apply a rule with infinitely many premises? But a different way of looking at the ω-rule is that it demands a single function, mapping natural numbers n to proofs of A(n). And this function can certainly have a finite representation, for example as a C program.

In some other papers, I described how to view Andreoli's "focusing" proofs as higher-order in this sense, containing functions that map proof patterns (or refutation patterns) to other sorts of proof objects. For example, to refute a proposition of positive polarity, we build a function from proof patterns to proofs of contradiction. This has a simple Curry-Howard interpretation: we can define call-by-value continuations by functions from value patterns to expressions. In other words, we actually represent the syntax of a programming language using computation.

Dependently-typed functional languages such as Agda and Coq turn out to be very convenient platforms for embedding such higher-order language definitions. In the dependently-typed setting of Twelf, frustratingly, it seems impossible! This is surprising, at least in a superficial way, because Twelf is well-known for allowing language definition via "higher-order abstract syntax" (HOAS). But HOAS is a different idea from the above technique (which I've taken to calling AHOS). Essentially, the problem is that the function space of LF (the type-theoretic core of Twelf) is very weak: abstractly, a function A→B is just a B with a hole for an A, or in other words, A→B represents a substitution function, something into which you can plug in an A to obtain a B. This weakness is in fact crucial for doing HOAS. But it means we absolutely cannot use the LF function space to do AHOS. (Think, for example, what that would mean for the ω-rule: to prove ∀x:ℕ.A(x), we would have to give a proof of A(n) that works the same way for every n. But the predicate A might hold for different n for different reasons, so such a rule would be woefully incomplete.) On the other hand, that we can't define languages using AHOS in Twelf is frustrating, because HOAS is rather nice for other parts of a language definition.

Fortunately, it turns out we can have our AHOS and eat HOAS too! And it all boils down to Reynolds' trick. You can check out the code here.