Wednesday, October 28, 2009

Big-Step Normalisation

I realized that the idea of the previous post of factoring normalization as a translation to and from a syntactic model of normal forms is explored very nicely in a JFP article by Altenkirch and Chapman, indeed with a concrete environment semantics, albeit without a connection to focusing:

"Big-Step Normalisation"
Thorsten Altenkirch and James Chapman
Traditionally, decidability of conversion for typed λ-calculi is established by showing that small-step reduction is confluent and strongly normalising. Here we investigate an alternative approach employing a recursively defined normalisation function which we show to be terminating and which reflects and preserves conversion. We apply our approach to the simply-typed λ-calculus with explicit substitutions and βη-equality, a system which is not strongly normalising. We also show how the construction can be extended to System T with the usual β-rules for the recursion combinator. Our approach is practical, since it does verify an actual implementation of normalisation which, unlike normalisation by evaluation, is first order. An important feature of our approach is that we are using logical relations to establish equational soundness (identity of normal forms reflects the equational theory), instead of the usual syntactic reasoning using the Church-Rosser property of a term rewriting system.

Sunday, October 11, 2009

Normalization-By-Evaluation, Normalization-By-Focusing

So I've moved to Paris. I finally arrived here last week (after a bit of a summer vacation), to begin a post-doc at Université Paris 7. In addition to my host Paul-André Melliès, there is a ridiculous number of people based here doing exciting work in programming and proof theory, and also quite a few visitors. So I'm excited.

One of my goals with this post-doc is to better connect the general approach to logic and programming that I was trained in at CMU, which could be called a "syntactic" approach, to the more semantic approach prevalent in Europe. "Syntactic" is in scare quotes because it is sometimes used as a derogatory term, slightly gentler than the epithet "symbol-pushing", and I don't think it is really appropriate. Of course I wouldn't think so! But there's a burden of proof in explaining how this thing that looks like symbol-pushing to the outside world is actually just another way (and a useful way) of manipulating semantic content.

So this post is in that vein, and was prompted by hearing Danko Ilik (one of Hugo Herbelin's students) give a talk two days ago about Normalization-By-Evaluation (NBE), and its apparently well-known (but I guess not well-known enough, since I hadn't seen it before) connection to Kripke semantics for intuitionistic logic. The connection is extremely easy to state:

NBE = completeness ○ soundness

That is, constructively speaking, NBE for the lambda calculus (a.k.a. a proof system for intuitionistic logic) corresponds to applying the soundness theorem (if something is provable it is true in every Kripke model) followed by the completeness theorem (if something is true in every Kripke model then it is provable). More precisely, the soundness theorem is applied to bring a lambda term into the universal model of contexts ordered by inclusion, and then the completeness theorem takes it back into the lambda calculus, now in normal form. Darin Morrison recently posted some elegant code to the Agda mailing list, demonstrating this explicitly.

Okay, that's neat, now let's talk about something else: focusing. If you are reading this blog you have probably heard something about it. (If not, you can learn about it here or here or here or here or here or here.) Originally, focusing was conceived as a search strategy for linear logic sequent calculus, achieving efficiency by severely restricting when you could apply the different rules, based on the polarity of connectives (positive ⊗, ⊕, !, ∃, versus negative ⅋, &, ?, ∀). The surprising and important result was that this strategy is complete, i.e.:

THEOREM (Completeness of focusing) [Andreoli]: If the sequent Γ ⊢ Δ is provable in linear logic, then it has a focusing proof.

The converse, soundness, is of course also important, but was obvious in Andreoli's original formulation, where focusing was seen as a search strategy, and a focusing proof was just a special sort of sequent proof.

Nonetheless, we have soundness and completeness theorems. Great! Can we compose them?

Well, let's not go there quite yet...

First, let's move away from linear logic, which may seem exotic, to something more mundane: classical propositional logic. Is there a notion of focusing proofs in classical logic? In fact, there are too many. Whereas the polarities of the connectives of linear logic are all somehow fixed by their inference rules, in classical logic there are many different, equally valid possibilities for inference rules, and as a result the propositional connectives are fundamentally bipolar. The flip side is that the classical connectives can be explicitly polarized, e.g.., we can choose whether to treat conjunction ∧ positively as ⊗, or negatively as &, and similarly with disjunction, negation, and implication. And this act of disambiguation endows the classical connectives with constructive content.

How so?

Well, let's take another look at the soundness and completeness theorems for focusing. Since focusing proofs deal with explicitly polarized propositions, and ordinary (classical) proofs don't, we will relate the two notions via an operation |-| that "forgets" polarity, i.e., collapses ⊗ and & both to plain conjunction, ⅋ and & both to plain disjunction, etc. Then we can state the following:

THEOREM (Classical soundness of focusing): If the sequent Γ ⊢ Δ has a focusing proof, then |Γ| ⊢ |Δ| has an ordinary classical proof.

THEOREM (Classical completeness of focusing): If the sequent |Γ| ⊢ |Δ| has an ordinary classical proof, then Γ ⊢ Δ has focusing proof.

So now can we compose?

Again speaking more constructively, suppose we take as our starting point some version of classical natural deduction, e.g., natural deduction adjoined with reductio ad absurdum, or with Peirce's law, a.k.a. lambda calculus with control operators. And suppose we have a proof (program) establishing (inhabiting) some proposition (type). Now, to apply the completeness theorem to obtain a focusing proof/program, crucially, we must choose some polarization of the connectives. What we obtain by applying the completeness theorem can and will depend on the particular polarization. But then, once we have a focusing derivation, we can again obtain a classical derivation in a deterministic way, by applying the soundness theorem.

And what do we get as a result? Well, we can make two observations:

1. The result is in normal form. I.e., it corresponds to a β-reduced, η-expanded lambda term (or a sequent calculus proof without cuts and with only atomic initial axioms).
2. The result is in continuation-passing style (CPS).

And in fact, the choice we made when applying the completeness theorem—the way we polarized the connectives—is precisely the way we choose among the many different CPS translations, call-by-value vs call-by-name, etc.

[Aside: it is well-known that different CPS transformations correspond to different double-negation translations, proof-theoretically. This may seem at odds with the above statement, since the result of applying completeness followed by soundness is still a proof of the original sequent, not a double-negated sequent. But we can make the connection to double-negation translation more transparent by factoring the soundness theorem by way of minimal logic, i.e, the fragment of intuitionistic logic where negation is defined by ¬A = A ⊃ #, for some distinguished atom #. The soundness theorem is then stated as follows:

THEOREM (Minimal soundness of focusing): If the sequent Γ ⊢ Δ has a focusing proof, then Γ^t, Δ^f ⊢ # has an ordinary minimal proof.

Here, -^t is the operation that interprets positive connectives "directly" (i.e., conjunction as conjunction, disjunction as disjunction, etc.) and interprets negative connectives as the negation of their De Morgan duals, while -^f interprets negative connectives as their duals, and positive connectives as their negations. Now, the result of applying the classical completeness theorem followed by minimal soundness, followed by the inclusion of minimal logic into classical logic, is not necessarily the original sequent, but a (classically) equivalent one. For more details, see Chapters 3 and 4 of my thesis.]

In short, by composing the soundness and completeness theorems for focusing, we obtain a normalization procedure for a proof system/programming language, that moreover fixes an evaluation order based on the polarization. We could call this "Normalization-By-Focusing":

NBF = soundness ○ completeness

So how does this compare to the original equation, NBE = completeness ○ soundness? First, observe that the fact that the positions of soundness and completeness are swapped is purely a historical convention! Since focusing was originally conceived as a search strategy for sequent calculus, it made sense to ask that it be complete. But in any case, it is clear that the focusing completeness theorem is playing a role analogous to the Kripke soundness theorem, and the focusing soundness theorem a role analogous to the Kripke completeness theorem. In other words, in both cases we are going out of a formal system with some arbitrary collection of rules, into a very well-behaved system, and then back into a canonical fragment of the original system.

In other words, I think it is entirely accurate to say that focusing provides a semantics of proofs. And a proof-theoretic semantics at that.

But how does it really relate to other semantic approaches? Again, that's what I'm here to find out. I suspect, though, that there's more than a superficial resemblance between the NBE and NBF equations, and there might be a deeper connection between focusing and the technique of forcing in proof theory.