## Polarity in Type Theory

There has recently arisen some misguided claims about a supposed opposition between functional and object-oriented programming.  The claims amount to a belated recognition of a fundamental structure in type theory first elucidated by Jean-Marc Andreoli, and developed in depth by Jean-Yves Girard in the context of logic, and by Paul Blain-Levy and Noam Zeilberger in the context of programming languages.  In keeping with the general principle of computational trinitarianism, the concept of polarization has meaning in proof theory, category theory, and type theory, a sure sign of its fundamental importance.

Polarization is not an issue of language design, it is an issue of type structure.  The main idea is that types may be classified as being positive or negative, with the positive being characterized by their structure and the negative being characterized by their behavior.  In a sufficiently rich type system one may consider, and make effective use of, both positive and negative types.  There is nothing remarkable or revolutionary about this, and, truly, there is nothing really new about it, other than the terminology.  But through the efforts of the above-mentioned researchers, and others, we have learned quite a lot about the importance of polarization in logic, languages, and semantics.  I find it particularly remarkable that Andreoli’s work on proof search turned out to also be of deep significance for programming languages.  This connection was developed and extended by Zeilberger, on whose dissertation I am basing this post.

The simplest and most direct way to illustrate the ideas is to consider the product type, which corresponds to conjunction in logic.  There are two possible ways that one can formulate the rules for the product type that from the point of view of inhabitation are completely equivalent, but from the point of view of computation are quite distinct.  Let us first state them as rules of logic, then equip these rules with proof terms so that we may study their operational behavior.  For the time being I will refer to these as Method 1 and Method 2, but after we examine them more carefully, we will find more descriptive names for them.

Method 1 of defining conjunction is perhaps the most familiar.  It consists of this introduction rule

$\displaystyle\frac{\Gamma\vdash A\;\textsf{true}\quad\Gamma\vdash B\;\textsf{true}}{\Gamma\vdash A\wedge B\;\textsf{true}}$

and the following two elimination rules

$\displaystyle\frac{\Gamma\vdash A\wedge B\;\textsf{true}}{\Gamma\vdash A\;\textsf{true}}\qquad\frac{\Gamma\vdash A\wedge B\;\textsf{true}}{\Gamma\vdash B\;\textsf{true}}$.

Method 2 of defining conjunction is only slightly different.  It consists of the same introduction

$\displaystyle \frac{\Gamma\vdash A\;\textsf{true}\quad\Gamma\vdash B\;\textsf{true}}{\Gamma\vdash A\wedge B\;\textsf{true}}$

and one elimination rule

$\displaystyle\frac{\Gamma\vdash A\wedge B\;\textsf{true} \quad \Gamma,A\;\textsf{true},B\;\textsf{true}\vdash C\;\textsf{true}}{\Gamma\vdash C\;\textsf{true}}$.

From a logical point of view the two formulations are interchangeable in that the rules of the one are admissible with respect to the rules of the other, given the usual structural properties of entailment, specifically reflexivity and transitivity.  However, one can discern a difference in “attitude” in the two formulations that will turn out to be a manifestation of the concept of polarity.

Method 1 is a formulation of the idea that a proof of a conjunction is anything that behaves conjunctively, which means that it supports the two elimination rules given in the definition.  There is no commitment to the internal structure of a proof, nor to the details of how projection operates; as long as there are projections, then we are satisfied that the connective is indeed conjunction.  We may consider that the elimination rules define the connective, and that the introduction rule is derived from that requirement.  Equivalently we may think of the proofs of conjunction as being coinductively defined to be as large as possible, as long as the projections are available.  Zeilberger calls this the pragmatist interpretation, following Count Basie’s principle, “if it sounds good, it is good.”

Method 2 is a direct formulation of the idea that the proofs of a conjunction are inductively defined to be as small as possible, as long as the introduction rule is valid.  Specifically, the single introduction rule may be understood as defining the structure of the sole form of proof of a conjunction, and the single elimination rule expresses the induction, or recursion, principle associated with that viewpoint.  Specifically, to reason from the fact that $A\wedge B\;\textsf{true}$ to derive $C\;\textsf{true}$, it is enough to reason from the data that went into the proof of the conjunction to derive $C\;\textsf{true}$.  We may consider that the introduction rule defines the connective, and that the elimination rule is derived from that definition.  Zeilberger calls this the verificationist interpretation.

These two perspectives may be clarified by introducing proof terms, and the associated notions of reduction that give rise to a dynamics of proofs.

When reformulated with explicit proofs, the rules of Method 1 are the familiar rules for ordered pairs:

$\displaystyle\frac{\Gamma\vdash M:A\quad\Gamma\vdash N:B}{\Gamma\vdash \langle M, N\rangle:A\wedge B}$

$\displaystyle\frac{\Gamma\vdash M:A\wedge B}{\Gamma\vdash \textsf{fst}(M):A}\qquad\frac{\Gamma\vdash M:A\wedge B}{\Gamma\vdash \textsf{snd}(M):B}$.

The associated reduction rules specify that the elimination rules are post-inverse to the introduction rules:

$\displaystyle\textsf{fst}(\langle M,N\rangle)\mapsto M \qquad \textsf{snd}(\langle M,N\rangle)\mapsto N$.

In this formulation the proposition $A\wedge B$ is often written $A\times B$, since it behaves like a Cartesian product of proofs.

When formulated with explicit proofs, Method 2 looks like this:

$\displaystyle \frac{\Gamma\vdash M:A\quad\Gamma\vdash M:B}{\Gamma\vdash M\otimes N:A\wedge B}$

$\displaystyle\frac{\Gamma\vdash M:A\wedge B \quad \Gamma,x:A,y:B\vdash N:C}{\Gamma\vdash \textsf{split}(M;x,y.N):C}$

with the reduction rule

$\displaystyle\textsf{split}(M\otimes N;x,y.P)\mapsto [M,N/x,y]P$.

With this formulation it is natural to write $A\wedge B$ as $A\otimes B$, since it behaves like a tensor product of proofs.

Since the two formulations of “conjunction” have different internal structure, we may consider them as two different connectives.  This may, at first, seem pointless, because it is easily seen that $x:A\times B\vdash M:A\otimes B$ for some $M$ and that $x:A\otimes B\vdash N:A\times B$ for some $N$, so that the two connectives are logically equivalent, and hence interchangeable in any proof.  But there is nevertheless a reason to draw the distinction, namely that they have different dynamics.

It is easy to see why.  From the pragmatic perspective, since the projections act independently of one another, there is no reason to insist that the components of a pair be evaluated before they are used.  Quite possibly we may only ever project the first component, so why bother with the second?  From the verificationist perspective, however, we are pattern matching against the proof of the conjunction, and are demanding both components at once, so it makes sense to evaluate both components of a pair in anticipation of future pattern matching.  (Admittedly, in a structural type theory one may immediately drop one of the variables on the floor and never use it, but then why give it a name at all?  In a substructural type theory such as linear type theory, this is not a possibility, and the interpretation is forced.)  Thus, the verficationist formulation corresponds to eager evaluation of pairing, and the pragmatist formulation to lazy evaluation of pairing.

Having distinguished the two forms of conjunction by their operational behavior, it is immediately clear that both forms are useful, and are by no means opposed to one another.  This is why, for example, the concept of a lazy language makes no sense, rather one should instead speak of lazy types, which are perfectly useful, but by no means the only types one should ever consider.  Similarly, the concept of an object-oriented language seems misguided, because it emphasizes the pragmatist conception, to the exclusion of the verificationist, by insisting that everything be an object characterized by its methods.

More broadly, it is useful to classify types into two polarities, the positive and the negative, corresponding to the verificationist and pragmatist perspectives.  Positive types are inductively defined by their introduction forms; they correspond to colimits, or direct limits, in category theory.  Negative types are coinductively defined by their elimination forms; they correspond to limits, or inverse limits, in category theory.  The concept of polarity is intimately related to the concept of focusing, which in logic sharpens the concept of a cut-free proof and elucidates the distinction between synchronous and asynchronous connectives, and which in programming languages provides an elegant account of pattern matching, continuations, and effects.

As ever, enduring principles emerge from the interplay between proof theory, category theory, and type theory.  Such concepts are found in nature, and do not depend on cults of personality or the fads of the computer industry for their existence or importance.

Update: word-smithing.

### 26 Responses to Polarity in Type Theory

1. Mike Shulman says:

If you’ll forgive my bringing in a particular fad of the computer industry, I’m curious what you think of the programming language Scala. It seems to have at least a basic recognition of both negative aspects of types (since it builds on the “object-oriented” style coming out of C++ and Java) and positive ones (there called, I believe, “case classes”).

• i haven’t written any significant amount of code in scala, so it’s hard for me to say. we evaluated it as a possible choice for the introductory curriculum at carnegie mellon (along with haskell, ocaml, standard ml, and f#), and decided against it. my impression at that time was that it is a rather baroque combination of object nonsense with functional programming dictated by the requirement to compile to the jvm and to accommodate conventional wisdom. for teaching purposes something more idealistic seemed more appropriate. f# is a non-starter because they eliminated the very thing we need most from ml, the concept of modules! it is not possible in f# to write down a signature of, say, a queue and consider several different implementations of that signature simultaneously, nor is it possible to consider that a given module satisfies many different descriptions (views). whatever f# may be good for, i’m not sure, but it’s definitely not good for teaching programming. we tried hard to like haskell, but it, too, is impossible to use for teaching, because the time and especially space complexity is impossible to assess analytically, because laziness is absolutely incompatible with parallelism, and because even the most elementary program verification technique, proof by mathematical induction, is inapplicable to any haskell program! three strikes and you’re out. this left us with a choice of either ocaml or standard ml, which could really have gone either way, but we have a prior commitment to standard ml, and ocaml’s modules language has some shortcomings that we’d prefer to avoid, so the result was that we decided on standard ml. contrary to what one might suspect, we were forced into this choice (not unwillingly), rather than imposed it a priori.

2. Neel Krishnaswami says:

Polarity doesn’t give a complete account of how data types work, because it doesn’t “know about” the effects of data abstraction. The simplest example I know is the following isomorphism in System F:

P ≃ ∀α. (P → α) → α

If P is positive, then this asserts the equivalence of a positive and a negative type! This is quite a remarkable fact, the more so since it’s true. :)

From the categorical point of view, this reflects the insufficiently-appreciated fact that abstract types are neither algebras nor coalgebras. The categorical approach to (co)algebra is inherently functorial, and since APIs frequently and naturally are mixed-variance, you can’t really define functorial actions for them. You really need parametricity to explain this, and its categorical formulation is nontrivial, to say the least.

None of this is a knock on polarity, of course — it’s really amazing just how much mileage you can get out of it, and I don’t know anything in PL theory with a better power-to-weight ratio.

So you always assume the category for positive values is Cartesian closed, to enable self-enrichment, hence tensors (I.e. colimit form of products) ?

What properties of positive types stem from formulating them as colimits?

• essentially yes. for my purposes the point is the pattern-matching elimination arises naturally from considering positives as colimits (well, coproducts, there being no equational restrictions in simple types)

• Mike Shulman says:

One way to see cartesian products as a positive type categorically, which avoids the unsatisfying feeling of having to “put them in” in advance before you can get them out, is to start with a cartesian multicategory instead of an ordinary category. In some sense this is a more accurate representation of type-theoretic semantics, because in practice (in most programming languages) we define functions which take “multiple arguments” and regard them as different both from functions which take a single tupled argument, and from “curried” functions which take one argument and return another function.

In a cartesian multicategory, the cartesian product has a “mapping out” universal property like you would expect for a positive type: Hom(A x B ; C) = Hom(A,B ; C). The fact that such an object is equivalent to a categorical cartesian product (with usual its negative “mapping in” universal property) depends on cartesianness of the multicategory, which in turn encodes the structural rules of weakening and contraction in type theory. Thus the category theory exactly mirrors the type-theoretic fact that in linear type theory, the positive and negative products are different.

• thanks, i ought to have said that.

You wrote that positive types are colimits. Can you please elaborate on how to view products as colimits?

• Degenerately, as a unary sum.

Interesting. So you’re breaking the symmetry by indexing over one type over the other.

I can see this working over sets, how do you generalise to other settings? Tensors?

Oh, I realise I may have misinterpreted your comment.

Did you mean the product itself is indexing the sum of the unit type?

If so, do you also use this indexing for the implicit product within contexts, or only for product types? (And my second question on generalisation remains.)

• yes, you could see it that way. in pl terms it’s a datatype with one constructor that takes two arguments, so you need some prior way to handle multiple arguments, by tensoring.

5. jsalvati says:

In the explicit proofs formulation the bottom of the second elimination rule reads Gamma |- snd(B) instead of Gamma |- snd(M) : B

6. jsalvati says:

How does Call by Push-Value fit into this? This work (http://lambda-the-ultimate.org/node/1975) makes strong claims about it exposes the ‘fine structure’ of strict vs lazy evaluation.

• After extensive discussions with Paul, we came to the conclusion that focusing and cbpv are essentially the same idea. This is most clear when one examines the focused type theory from the point of view of a categorial semantics.

7. Dan Doel says:

“This is why, for example, the concept of a lazy language makes no sense, rather one should instead speak of lazy types, which are perfectly useful, but by no means the only types one should ever consider.”

Except that there is no necessary connection between a type being inductively defined and eager evaluation being suitable for its use (there may be examples the other way, as well, but in general it is impossible to be completely eager with coinductive types of course).

For instance, just because one wants a type of finite lists does not mean that he always want the lists to be fully evaluated into memory up front. There are situations in which doing so can use unreasonable amounts of memory, while lazy evaluation of even the inductive type will require significantly less overhead. A powerlist can be produced lazily in a way that does not (I believe) require O(2^n) memory (which full evaluation will), while still being finite.

• Rob Simmons says:

A type theory with an awareness of polarity can absolutely represent lazy lists, but the type must explicitly represent that we need perform a computation to determine whether the tail of a list is empty or is a cons. In the system I sketched out over here, a lazy list of integers has type μx.[Nil, Cons: int*UFx], where UFx represents the necessity of performing some computation to force the tail of the list.

As a positive recursive type, the inhabitants of that recursive type do include infinite lists, so it’s not a type of lazy finite lists. I think, however, that the connection of necessary-finiteness that we associate with inductively defined data structures with eagerness *really is* suggested by the structure of Noam’s system. There doesn’t seem to be a strong argument against following the direction the types are pointing us in.

• Rob Simmons says:

Oof, “need perform” should have been “need to perform.” I can’t even justify that with Pittsburghese.

• dbaelde says:

I agree with most of what Bob wrote, but I’d also say that the association of inductive with positive and coinductive with negative is abusive. Indeed, my work on extending focusing to logics with least and greatest fixed points has revealed that polarities are not fixed for those connectives: one can define complete focused proof systems with either choice of polarities. And Dan’s comment gives a pretty good intuition of why this is the case.

• As soon as you admit partiality, the situation changes.

• dbaelde says:

Really, least fixed points can be made asynchronous (providing in particular finite lazy lists) and greatest fixed points synchronous, without loosing completeness of focused proofs, and without having partial proofs: see my ACM ToCL paper. Now, there are lots of subtleties in the details here: for example, I avoid saying “negative” but stick to “asynchronous” for a reason. But I don’t think that blog comments are the right place to bring the discussion to the necessary level of technical detail.

• Ok, I’ll have a look. Thanks for the reference.

8. I use the concept of polarity extensively when I’m building software. Sometimes I think that deciding which types should be “positive” and which should be “negative” is the hardest part of API design.

• ezyang says:

You should give an example!

• ok, what do you have in mind?