Saturday, May 22, 2010

Five kinds of double-negation

A recent question on the TYPES mailing list reminded me that one reason negation is confusing is that it can mean so many different things. I and others originally thought the answer the person was looking for (the type-theoretic reading of negation) was "continuations", but it turned out his intuition was about a different form of negation not usually considered in constructive logic (but available in "refinement type systems", etc.), since it's tied to the "extrinsic" interpretation of types-as-properties. (For more on this deficit of constructive logic, see the introduction to Paul Taylor's Subspaces in Abstract Stone Duality.)

But even restricted to plain old type theory, there are many different interpretations of negation! Under the principle of "Why talk about something confusing when you can talk about something doubly confusing?", today I want to talk about different computational interpretations of double negations in logic.

  1. (A -> 0) -> 0. Here 0 is falsehood, and this is the standard interpretation of double-negation in intuitionistic logic. What does this mean computationally? Well, one way to think about this is to interpret types as sets. How many different functions f : X -> 0 are there from any set X into the empty set? None, unless X is itself empty, in which case there is exactly one. Applying this reasoning to our double-negation, if A is non-empty, then A -> 0 is empty, which in turn means that (A -> 0) -> 0 contains exactly one element; conversely, if A is empty, then A -> 0 is non-empty, which in turn means that (A -> 0) -> 0 is empty. What we see is that intuitionistic double-negation encodes the principle of proof-irrelevance, i.e., it forgets all the computational content of a type. (The catch here is that set-theoretic reasoning doesn't always prove type isomorphisms—this argument really only works for extensional type theory.)
  2. (A -> ⊥) -> ⊥. Here ⊥ represents "minimal" falsehood, i.e., simply a logical atom with no introduction or elimination rules, and in particular without the principle of explosion. Because we have replaced "empty" with "unknown", minimal double-negation does not have the same computational degeneracy as intuitionistic double-negation: there are at least as many different constructive proofs of (A -> ⊥) -> ⊥ as there are of A, and often more. For example, in general there is no proof in minimal logic of excluded middle A ∨ (A -> ⊥), but there is a proof of ((A ∨ (A -> ⊥)) -> ⊥) -> ⊥ (rendered as a Haskell term after this list). Indeed, minimal logic rather than intuitionistic logic really is the proper place to understand the computational content of the classical double-negation translations (something which I didn't fully appreciate until reading Jeremy Avigad's The computational content of classical arithmetic). As such, this form of double-negation is the first step towards understanding continuation-passing style.
  3. (A -> R) -> R, where R is any type of answers. We said that ⊥ above is a logical atom. In type-theoretic terms, it is a type variable. So to move to this form of double-negation, we simply instantiate the variable with a concrete type R. Now, logically we are beginning to stretch the meaning of "negation". In particular, nothing says that R is uninhabited—indeed if it is provably uninhabited, we are back to the computationally degenerate situation (1). But whether or not we accept the terminology, this form of double-negation is extremely important computationally, tied to the representation of control flow. A function of type A -> R is a continuation (transforming any value of type A into a result of type R), and so a term of type (A -> R) -> R is something which takes a continuation to a result—a computation with "control effects" (a minimal continuation monad is sketched after the list). The type (A -> R) -> R is so powerful that it comes with the following warning in the documentation of Haskell's Continuation monad:
    Abuse of the Continuation monad can produce code that is impossible to understand and maintain.
    The "principle of double-confusion" I mentioned at the beginning then motivates the following pair of generalizations...
  4. (A -> R1) -> R2, where R1 and R2 are two (possibly distinct) types of answers. This form of double-negation comes up in the study of delimited control operators, which were originally motivated by the fact that the type (A -> R) -> R is not only too powerful but also not powerful enough. The type (A -> R1) -> R2 can be seen as a sort of Hoare triple {R1}A{R2}, which gives an intimation of its power (a shift/reset-style sketch also appears after the list).
  5. ∀α.(A -> α) -> α, a polymorphic type (and more generally, ∀α.(A -> Tα) -> Tα, where T is an arbitrary monad). It's easy to see that A ≡ ∀α.(A -> α) -> α is provable in second-order intuitionistic logic, but in fact this can also be interpreted as the Yoneda isomorphism in category theory, as sigfpe explained a few years ago. More generally, there is a Yoneda isomorphism between TA and ∀α.(A -> Tα) -> Tα for any monad T (and formally, a type isomorphism in System F + parametricity axioms). This isomorphism lies at the heart of Filinski's representation theorem that delimited control operators can be used to perform "monadic reflection" (both directions of the isomorphism are written out below).
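To make (2) concrete, here is that proof of double-negated excluded middle written as a Haskell term (a sketch: the atom ⊥ becomes an abstract answer type r, disjunction becomes Either, and the name dnLEM is mine):

    -- ((A ∨ (A → ⊥)) → ⊥) → ⊥, with ⊥ rendered as an abstract type r
    dnLEM :: (Either a (a -> r) -> r) -> r
    dnLEM k = k (Right (\a -> k (Left a)))

The term first offers the refutation (Right ...); if that refutation is ever applied to an actual value of type a, it backtracks and offers (Left a) to the very same k. This backtracking is exactly the control behaviour that the next interpretations make explicit.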
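For (3), the continuation monad itself fits in a few lines; the following minimal version is essentially what Haskell's Control.Monad.Cont provides:

    newtype Cont r a = Cont { runCont :: (a -> r) -> r }

    instance Functor (Cont r) where
      fmap f (Cont m) = Cont (\k -> m (k . f))

    instance Applicative (Cont r) where
      pure a = Cont (\k -> k a)
      Cont mf <*> Cont ma = Cont (\k -> mf (\f -> ma (k . f)))

    instance Monad (Cont r) where
      Cont m >>= f = Cont (\k -> m (\a -> runCont (f a) k))

    -- callCC exposes the current continuation; invoking the captured
    -- continuation discards the local context, which is the kind of
    -- "control effect" that prompted the warning above.
    callCC :: ((a -> Cont r b) -> Cont r a) -> Cont r a
    callCC f = Cont (\k -> runCont (f (\a -> Cont (\_ -> k a))) k)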
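For (4), answer-type modification can be sketched with a parameterized continuation type, in the style of Danvy and Filinski's typing of shift and reset (the names IxCont, reset and shift below are my own rendering, not a standard library):

    -- A computation that uses its context as an (a -> r1) continuation
    -- but delivers an r2 overall: the "Hoare triple" {R1}A{R2}.
    newtype IxCont r1 r2 a = IxCont { runIxCont :: (a -> r1) -> r2 }

    -- reset delimits a computation, sealing off its answer types.
    reset :: IxCont a r a -> IxCont r' r' r
    reset (IxCont m) = IxCont (\k -> k (m id))

    -- shift captures the continuation up to the nearest enclosing reset.
    shift :: ((a -> r1) -> IxCont r2 r2 r2) -> IxCont r1 r2 a
    shift f = IxCont (\k -> runIxCont (f k) id)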
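Finally, for (5), both directions of the isomorphism can be written down directly given higher-rank polymorphism (a sketch assuming GHC's RankNTypes; it is parametricity, not the code itself, that guarantees the two maps are mutually inverse):

    {-# LANGUAGE RankNTypes #-}

    -- A ≅ ∀α.(A -> α) -> α: the Yoneda isomorphism
    toYoneda :: a -> (forall r. (a -> r) -> r)
    toYoneda a = \k -> k a

    fromYoneda :: (forall r. (a -> r) -> r) -> a
    fromYoneda f = f id

    -- TA ≅ ∀α.(A -> Tα) -> Tα: the monadic generalization behind
    -- "monadic reflection" (these names are only suggestive)
    reflect :: Monad t => t a -> (forall r. (a -> t r) -> t r)
    reflect m = (m >>=)

    reify :: Monad t => (forall r. (a -> t r) -> t r) -> t a
    reify f = f return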
So that is a small cast of characters, biased towards the view of "negation as continuation" (which as I said, is not the only view). For more plots involving these different forms of double-negation (particularly (4) and (5)), you can read the paper I posted two weeks ago.

11 comments:

gelisam said...

Thanks! Interpretation 1 is precisely the kind of proof irrelevance I currently need. I had no clue it was possible to get proof irrelevance without specific compiler support!

In case you are curious, my current need is to represent unordered pairs, i.e., sets of size two. The type (ℕ × ℕ) double-counts the set {1,2} since the pairs (1,2) and (2,1) represent the same set. My solution so far is to use the type (Σx:ℕ. Σy:ℕ. x ≤ y), using the pair (1,2,prf) to represent {1,2}. (2,1,prf') is not a valid representation because there exists no such value prf' of type (2 ≤ 1).

I was concerned that with element types more complicated than ℕ, there might be more than one valid proof of (x ≤ y). By using the type (Σx:ℕ. Σy:ℕ. (x ≤ y → ⊥) → ⊥), this problem (real or imagined) is no longer an issue.

Now, this will probably end up in a blog post rather than a paper, but just in case: which paper should I cite if I need to use double-negation for proof-irrelevance?

Noam said...

Wait! As I said, this trick only works out-of-the-box in extensional type theory (where two functions are definitionally equal iff they are equal on all values...which is in general undecidable), so for example in Agda it won't give you what you want. I think I remember Neel K telling me they used to use this trick in NuPRL... There's one reference here ("Markov's Principle for Type Theory"), but maybe this was already folklore.

I guess it's a good question, though, how this relates to the approach to proof irrelevance in Coq/Epigram. Does anyone know the answer?

Neel Krishnaswami said...

Bob Harper told me they used to use double-negation in NuPRL, but it was sufficiently contorted that it motivated them to introduce squash types into it.

Coq and (my understanding of) Epigram actually have very different approaches to proof irrelevance!

In out-of-the-box Coq, the type theory has a sort Prop, which is not internally proof irrelevant, but is externally proof irrelevant. That is, you can't always prove (internally to Coq) that two inhabitants of the same type in Prop are equal, without adding an axiom. However, Coq does not permit computations in Set to vary based upon terms in Prop. This lets them use a realizability-style interpretation as part of program extraction, which turns all terms in Prop into units.

Epigram is very different. (What follows is all based on me trying to puzzle through Conor and co's papers. They may have a different idea of what they're doing!)

To get Epigram, you start with intensional Martin-Löf type theory, and then you *remove* the identity type. Then, you systematically reinterpret each of the type constructors as setoids (i.e., a type constructor + the obvious equivalence relation for that type), following Martin Hofmann's idea of interpreting extensional type theory in intensional type theory.

The idea is then that the conversion relation can be beefed up to systematically apply the functorial actions for each setoid type constructor, as needed. Since all of your type formers are functorial on setoids, every type you can form is a setoid, with the desired extensionality properties.

The reason you have to remove the identity type is that it internalizes conversion, which would make intensional properties observable. But you replace it with a whole zoo of equality types, which give you the stronger propositional equalities you want, unless you're a homotopy theorist.

The way Epigram supports proof-irrelevance for the equality types is by carefully engineering all the setoid equality proofs to be (extensionally equivalent to) identity functions, which lets them drop the coercions whenever they need to. They do this by ensuring their setoid equality propositions lie in a subuniverse of *intensionally* proof-irrelevant types (basically, unit, void, and products of and exponentials into same).

This setup supports quotient types by allowing you to quotient an arbitrary type X by a relation X -> X -> Prop, where Prop is one of those intensionally proof-irrelevant types, using Martin Hofmann's elim rule for quotients.

Then, you can just *define* the proof irrelevance type constructor PI(A), by taking PI(A) to be A quotiented by the full relation f : A -> A -> type = \a a'. unit. (Actually, I think Epigram supports proof-irrelevance directly, to get a more useful definitional equality for it, but conceptually it could work this way.)

Dan Doel said...

It seems to me that linear logic (which has been on my mind a bit lately) has a handle on both 1 and 2. Of course, it's pretty well known that you can embed intuitionistic logic using !A ⊸ B for implication. Additive false explodes like normal, so it can presumably be seen as an empty type.

However, there's also multiplicative false, and it doesn't explode. So presumably, you get minimal negation by using that false.

What's more, if + is additive conjunction, then A + (A ⊸ ⊥) is not a theorem. But if we have multiplicative negation (which is the one related to ⊸), say ⊕, then A ⊕ (A ⊸ ⊥) is a theorem. And from what I gather from talking with someone who's thought more about a (functional) linear logic language than I have, ⊕ and ⊥ can be thought about as having to do with continuations or concurrency. So, negation-as-continuations and LEM has been added without conflating them with their usual type theoretic interpretation, which is nice.

gelisam said...

I am indeed using Agda, but I don't understand what the problem is. Could it be that even though two values p1 and p2 of type ((x ≤ y → ⊥) → ⊥) cannot be distinguished by any Agda expression, it is not possible to construct a value of type (p1 ≡ p2)?

I don't think I need values of this type, but maybe there are other problems I am not thinking of? Even if I do need proofs of this kind, I guess I could just postulate them.

Noam said...

@Neel: thanks for the explanations! Getting up to speed on Epigram is something I want to do real soon... (Haha on "which give you the stronger propositional equalities you want, unless you're a homotopy theorist"! I just heard Steve Awodey give an exciting talk today on homotopical semantics of type theory...)

@Dan: Yes, linear logic is a very interesting subject, but in my humble opinion it's a dangerous place for thinking about continuations, because of the big distraction (in Girard's papers) of "involutive negation". First, note that intuitionistic logic already has some handle on all five forms of double-negation (or at least on the first four, since the fifth is expressed with second-order quantifiers): they're just not always called double-"negation". The important conceptual insight from linear logic which is not as explicit in intuitionistic logic is this duality between proofs and refutations, i.e., the duality between values and continuations—but it's not an idea that depends on linearity, rather on the notion of polarity. There's a lot to say about polarity, but it's a pretty simple idea: a type is either defined in terms of its values or its continuations. This gives a way of decomposing the different double-negation/CPS translations. For example, the LEM theorem A⅋(A ⊸ ⊥) (I think you wrote ⊕ where you meant ⅋) holds because ⅋ is a negative connective (defined in terms of its continuations instead of its values), encoding disjunction rather like the Church-encoding. The proof term for A⅋(A ⊸ ⊥) is isomorphic to the intuitionistic proof of ((A→⊥)∧A)→ ⊥ (i.e., λ(k,x).k(x)). So intuitionistic logic in some sense (in pretty much the same sense as linear logic) also supports "LEM" modulo syntactic sugar.

@Samuel: yes, the problem is that you cannot prove p1 ≡ p2. If you do not need that, well, then I'm not sure what you are hoping to get? Just that the two values are in some sense observationally equivalent?

gelisam said...

My ultimate goal is to prevent the programmer from writing non-commutative functions, and to do this I first combine the two inputs x and y into an unordered pair xy. Since the combinations (combine x y) and (combine y x) both yield the same unordered pair xy, the composition (λu v. f (combine u v)) will be commutative regardless of which f the programmer writes.

For the composition to be commutative, I only need (combine x y) and (combine y x) to be indistinguishable, or "observationally equivalent", as you say. My main concern is that the composition must be commutative — whether it is provably commutative or not is much less important, especially if I can fake it with a postulate.
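For concreteness, here is the technique as a minimal Haskell sketch (my real development is in Agda and uses the Σ-type above; here the ordering proof is replaced by simply storing the sorted representative):

    -- An unordered pair, stored canonically with the smaller component
    -- first. Keeping the constructor abstract makes combine the only
    -- way to build one, so combine x y and combine y x coincide.
    newtype UPair a = UPair (a, a) deriving (Eq, Show)

    combine :: Ord a => a -> a -> UPair a
    combine x y = UPair (min x y, max x y)

    -- Any f over unordered pairs yields a commutative binary function.
    commutatively :: Ord a => (UPair a -> b) -> a -> a -> b
    commutatively f u v = f (combine u v)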

I'm pretty much done with this commutativity project; I'm trying to do the same with associativity now, but I'm stumped. Any ideas?

Noam said...

If I understand correctly, what you want is to define functions over quotient types A/R (values of type A modulo the equivalence relation R). There are two ways to do this: either 1. take functions on A together with a proof that they respect the equivalence R (the brute force approach), or 2. find canonical representatives for equivalence classes in A/R, and take functions over that. And you are trying to use approach (2), right?

I think the way to do this for associativity is to generalize from binary functions to functions over cons-lists of natural numbers: such functions are by definition associative, since they are defined on a unique representative of each equivalence class.

gelisam said...

Indeed, the whole point is to avoid writing independent proofs. Instead of writing f followed by a proof that f is commutative, it is indeed possible to define a quotient type or an equivalent representation, making sure that any function taking this quotient as input is commutative. An important part of this technique is the combine function, which makes it possible to transform the function taking a quotient as input into a bona fide binary function.

The reason associativity is stumping me, however, is that the technique is not applicable in this new setting. The problem is that commutativity (f (x, y) ≡ f (y, x)) is mainly concerned with f's input, while associativity (f (x, f (y, z)) ≡ f (f (x, y), z)) is a complex relation involving both f's input and output.

If I define f over lists instead of pairs, then I agree that I won't need to worry about ((x + y) + z) being different from (x + (y + z)) because I won't use either of those expressions, I will use (f [x, y, z]). This, however, is not a binary function, so the solution is incomplete.

If I define
_+_ x y = f [x, y]
then I do get a binary function, but this function is not necessarily associative. Thanks for trying, though!

Dan Doel said...

Yes, sorry about that. I screwed up some terminology in my post, but you read it right. By + I mean additive disjunction, and by ⊕ I meant multiplicative disjunction. (I realize the latter symbol is usually used for the additive, but I didn't even know ⅋ was a unicode symbol, and I know someone who uses the alternate naming, because the additive connectives are the categorical sum, product, initial and terminal objects, I believe, while the multiplicative connectives are other 'tensorial' structure.)

And yes, after banging out my error-filled post, I grabbed your paper from the last entry, on polarity in classical logic, and realized I wasn't saying anything surprising to you. :) So I've got some reading to do.

Saturated said...

While reading over the article and the comments, I couldn't help wondering whether you'd come across the so-called "Peirce monad", J X = (X->R)->X, for which there is a monad morphism into the continuation monad. Martin Escardo and Paulo Oliva have recently been making much hay with it: see (for one example) http://www.cs.bham.ac.uk/~mhe/papers/jtrans.pdf.
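In Haskell the type and the morphism look something like this (a sketch from memory of Escardo and Oliva's definitions, so treat the details with caution):

    newtype Sel r x = Sel { runSel :: (x -> r) -> x }    -- J X = (X -> R) -> X
    newtype Cont r x = Cont { runCont :: (x -> r) -> r } -- K X = (X -> R) -> R

    -- Monadic bind for J, the "selection monad"
    bindSel :: Sel r a -> (a -> Sel r b) -> Sel r b
    bindSel (Sel e) f =
      Sel (\k -> let g a = runSel (f a) k     -- the b selected from a given a
                 in g (e (\a -> k (g a))))    -- choose a by judging its b

    -- The monad morphism J -> K
    toCont :: Sel r x -> Cont r x
    toCont (Sel e) = Cont (\k -> k (e k))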

If that's irrelevant, my apologies.