Old neglected theorems are still theorems

March 20, 2014

I have very recently been thinking about the question of partiality vs totality in programming languages, a perennial topic in PL’s that every generation thinks it discovers for itself.  And this got me to remembering an old theorem that, it seems, hardly anyone knows ever existed in the first place.  What I like about the theorem is that it says something specific and technically accurate about the sizes of programs in total languages compared to those in partial languages.  The theorem provides some context for discussion that does not just amount to opinion or attitude (and attitude always seems to abound when this topic arises).

The advantage of a total programming language such as Goedel’s T is that it ensures, by type checking, that every program terminates, and that every function is total. There is simply no way to have a well-typed program that goes into an infinite loop. This may seem appealing, until one considers that the upper bound on the time to termination can be quite large, so large that some terminating programs might just as well diverge as far as we humans are concerned. But never mind that, let us grant that it is a virtue of  T that it precludes divergence.

Why, then, bother with a language such as PCF that does not rule out divergence? After all, infinite loops are invariably bugs, so why not rule them out by type checking? (Don’t be fooled by glib arguments about useful programs, such as operating systems, that “run forever”. After all, infinite streams are programmable in the language M of inductive and coinductive types in which all functions terminate. Computing infinitely does not mean running forever, it just means “for as long as one wishes, without bound.”)  The notion does seem appealing until one actually tries to write a program in a language such as T.

Consider computing the greatest common divisor (GCD) of two natural numbers. This can be easily programmed in PCF by solving the following equations using general recursion:

\begin{array}{rcl}    \textit{gcd}(m,0) & = & m \\    \textit{gcd}(0,m) & = & m \\    \textit{gcd}(m,n) & = & \textit{gcd}(m-n,n) \quad \text{if}\ m>n \\    \textit{gcd}(m,n) & = & \textit{gcd}(m,n-m) \quad \text{if}\ m<n    \end{array}

The type of \textit{gcd} defined in this manner has partial function type (\mathbb{N}\times \mathbb{N})\rightharpoonup \mathbb{N}, which suggests that it may not terminate for some inputs. But we may prove by induction on the sum of the pair of arguments that it is, in fact, a total function.
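As a rough sketch (in OCaml rather than PCF, with machine integers standing in for the naturals, and names of my own choosing), the general-recursive definition is a direct transcription of the equations:

    (* GCD by general recursion, transcribing the equations above.  The
       type system does not guarantee termination; totality is justified
       externally, by induction on m + n. *)
    let rec gcd (m : int) (n : int) : int =
      if n = 0 then m
      else if m = 0 then n
      else if m > n then gcd (m - n) n
      else gcd m (n - m)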

Now consider programming this function in T. It is, in fact, programmable using only primitive recursion, but the code to do it is rather painful (try it!). One way to see the problem is that in T the only form of looping is one that reduces a natural number by one on each recursive call; it is not (directly) possible to make a recursive call on a smaller number other than the immediate predecessor. In fact one may code up more general patterns of terminating recursion using only primitive recursion as a primitive, but if you examine the details, you will see that doing so comes at a significant price in performance and program complexity. Program complexity can be mitigated by building libraries that codify standard patterns of reasoning whose cost of development should be amortized over all programs, not just one in particular. But there is still the problem of performance. Indeed, the encoding of more general forms of recursion into primitive recursion means that, deep within the encoding, there must be a “timer” that “goes down by ones” to ensure that the program terminates. The result will be that programs written with such libraries will not be nearly as fast as they ought to be.  (It is actually quite fun to derive “course of values” recursion from primitive recursion, and then to observe with horror what is actually going on, computationally, when using this derived notion.)
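OCaml cannot enforce the primitive-recursive discipline, but the shape the encoding takes can be suggested with an explicit fuel parameter that “goes down by ones” (a sketch of my own, not the actual derivation in T):

    (* The same GCD, but driven by a timer computed up front.  Every
       recursive call pays for decrementing the timer, which is what the
       encoding into primitive recursion amounts to computationally. *)
    let gcd_fueled (m : int) (n : int) : int =
      let rec go fuel m n =
        if fuel = 0 then m             (* unreachable when fuel >= m + n *)
        else if n = 0 then m
        else if m = 0 then n
        else if m > n then go (fuel - 1) (m - n) n
        else go (fuel - 1) m (n - m)
      in
      go (m + n) m n

The point is not this particular code, but that the bound m + n must be computed before the loop even starts, and must be threaded through every call.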

But, one may argue, T is simply not a serious language. A more serious total programming language would admit sophisticated patterns of control without performance penalty. Indeed, one could easily envision representing the natural numbers in binary, rather than unary, and allowing recursive calls to be made by halving to achieve logarithmic complexity. This is surely possible, as are numerous other such techniques. Could we not then have a practical language that rules out divergence?

We can, but at a cost.  One limitation of total programming languages is that they are not universal: you cannot write an interpreter for T within T (see Chapter 9 of PFPL for a proof).  More importantly, this limitation extends to any total language whatever.  If this limitation does not seem important, then consider the Blum Size Theorem (BST) (from 1967), which places a very different limitation on total languages.  Fix any total language, L, that permits writing functions on the natural numbers. Pick any blowup factor, say 2^{2^n}, or however expansive you wish to be.  The BST states that there is a total function on the natural numbers that is programmable in L, but whose shortest program in L is larger by the given blowup factor than its shortest program in PCF!

The underlying idea of the proof is that in a total language the proof of termination of a program must be baked into the code itself, whereas in a partial language the termination proof is an external verification condition left to the programmer. Roughly speaking, there are, and always will be, programs whose termination proofs are rather complicated to express, if you fix in advance the means by which they may be proved total. (In T it was primitive recursion, but one can be more ambitious, yet still get caught by the BST.)  But if you leave room for ingenuity, then programs can be short, precisely because they do not have to embed the proof of their termination in their own running code.

There are ways around the BST, of course, and I am not saying otherwise.  For example, the BST merely guarantees the existence of a bad case, so one can always argue that such a case will never arise in practice.  Could be, but I did mention the problem of GCD in T for a reason: there are natural problems that are difficult to express in a language such as T.  By fixing the possible termination arguments in advance, one is tempting fate, for there are many problems, such as the Collatz Conjecture, for which the termination proof of a very simple piece of code has been an open problem for decades, and has resisted at least some serious attempts on it.  One could argue that such a function is of no practical use.  I agree, but I point out the example not to say that it is useful, but to say that it is likely that its eventual termination proof will be quite nasty, and that this will have to be reflected in the program itself if you are limited to a T-like language (rendering it, once again, useless).  For another example, there is no inherent reason why termination need be assured by means similar to that used in T.  We got around this issue in NuPRL by separating the code from the proof, using a type theory based on a partial programming language, not a total one.  The proof of termination is still required for typing in the core theory (but not in the theory with “bar types” for embracing partiality).  But it’s not baked into the code itself, affecting its run-time; it is “off to the side”, large though it may be.
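For what it is worth, here is the Collatz function in the same OCaml notation (again a sketch of my own): trivial to write with general recursion, and nobody knows how to prove it total.

    (* Number of Collatz (3n+1) steps needed to reach 1.  Termination on
       every positive input is precisely the Collatz Conjecture. *)
    let rec collatz_steps (n : int) : int =
      if n <= 1 then 0
      else if n mod 2 = 0 then 1 + collatz_steps (n / 2)
      else 1 + collatz_steps (3 * n + 1)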

Updates: word smithing, fixed bad link, corrected gcd, removed erroneous parenthetical reference to Coq, fixed LaTeX problems.


What’s the big deal with HoTT?

June 22, 2013

Now that the Homotopy Type Theory book is out, a lot of people are asking “What’s the big deal?”.  The full answer lies within the book itself (or, at any rate, the fullest answer to date), but I am sure that many of us who were involved in its creation will be fielding this question in our own ways to help explain why we are so excited by it.  In fact what I think is really fascinating about HoTT is precisely that there are so many different ways to think about it, according to one’s interests and background.  For example, one might say it’s a nice way to phrase arguments in homotopy theory that avoids some of the technicalities in the classical proofs by treating spaces and paths synthetically, rather than analytically.  Or one might say that it’s a good language for mechanization of mathematics that provides for the concise formulation of proofs in a form that can be verified by a computer.  Or one might say that it points the way towards a vast extension of the concept of computation that enables us to compute with abstract geometric objects such as spheres or tori.  Or one might say that it’s a new foundation for mathematics that subsumes set theory by generalizing types from mere sets to arbitrary \infty-groupoids, sets being but particularly simple types (those with no non-trivial higher-dimensional structure).

But what is it about HoTT that makes all these interpretations and applications possible?  What is the key idea that separates HoTT from other approaches that seek to achieve similar ends?  What makes HoTT so special?

In a word the answer is constructivity.  The distinctive feature of HoTT is that it is based on Per Martin-Löf’s Intuitionistic Theory of Types, which was formulated as a foundation for intuitionistic mathematics as originally put forth by Brouwer in the early twentieth century, and further developed by Bishop, Gentzen, Heyting, Kolmogorov, Kleene, Lawvere, and Scott, among many others.  Briefly put, the idea of type theory is to codify and systematize the concept of a mathematical construction by characterizing the abstract properties, rather than the concrete realizations, of the objects used in everyday mathematics.  Brouwer’s key insight, which lies at the heart of HoTT, is that proofs are a form of construction no different in kind or character from numbers, geometric figures, spaces, mappings, groups, algebras, or any other mathematical structure.  Brouwer’s dictum, which distinguished his approach from competing alternatives, is that logic is a part of mathematics, rather than mathematics being an application of logic, because for him the concept of a construction, including the concept of a proof, is prior to any other form of mathematical activity, including the study of proofs themselves (i.e., logic).

So under Martin-Löf’s influence HoTT starts with the notion of type as a classification of the notion of construction, and builds upwards from that foundation.  Unlike competing approaches to foundations, proofs are mathematical objects that play a central role in the theory.  This conception is central to the homotopy-theoretic interpretation of type theory, which enriches types to encompass spaces with higher-dimensional structure.  Specifically, the type \textsf{Id}_A(M,N) is the type of identifications of M and N within the space A.  Identifications may be thought of as proofs that M and N are equal as elements of A, or, equivalently, as paths in the space A between points M and N.  The fundamental principles of abstraction at the heart of type theory ensure that all constructs of the theory respect these identifications, so that we may treat them as proofs of equality of two elements.  There are three main sources of identifications in HoTT:

  1. Reflexivity, stating that everything is equal to itself.
  2. Higher inductive types, defining a type by giving its points, paths, paths between paths, and so on to any dimension.
  3. Univalence, which states that an equivalence between types determines a path between them.

I will not attempt here to explain each of these in any detail; everything you need to know is in the HoTT book.  But I will say a few things about their consequences, just to give a flavor of what these new principles give us.

Perhaps the most important conceptual point is that mathematics in HoTT emphasizes the structure of proofs rather than their mere existence.  Rather than settle for a mere logical equivalence between two types (mappings back and forth stating that each implies the other), one instead tends to examine the entire space of proofs of a proposition and how it relates to others.  For example, the univalence axiom itself does not merely state that every equivalence between types gives rise to a path between them, but rather that there is an equivalence between the type of equivalences between two types and the type of paths between them.  Familiar patterns such as “A iff B” tend to become “A\simeq B“, stating that the proofs of A and the proofs of B are equivalent.  Of course one may choose to neglect this additional information, stating only weaker forms of it using, say, truncation to suppress higher-dimensional information in a type, but the tendency is to embrace the structure and characterize the space of proofs as fully as possible.

A close second in importance is the axiomatic freedom afforded by constructive foundations.  This point has been made many times by many authors in many different settings, but has particular bite in HoTT.   The theory does not commit to (nor does it refute) the infamous Law of the Excluded Middle for arbitrary types: the type A+(A\to \textbf{0}) need not always be inhabited.  This property of HoTT is absolutely essential to its expressive power.  Not only does it admit a wider range of interpretations than are possible with the Law included, but it also allows for the selective imposition of the Law where it is needed to recover a classical argument, or where it is important to distinguish the implications of decidability in a given situation.  (Here again I defer to the book itself for full details.)  Similar considerations arise in connection with the many forms of Choice that can be expressed in HoTT, some of which are outright provable, others of which are independent as they are in axiomatic set theory.

Thus, what makes HoTT so special is that it is a constructive theory of mathematics.  Historically, this has meant that it has a computational interpretation, expressed most vividly by the propositions as types principle.  And yet, for all of its promise, what HoTT currently lacks is a computational interpretation!  What, exactly, does it mean to compute with higher-dimensional objects?  At the moment it is difficult to say for sure, though there seem to be clear intuitions in at least some cases of how to “implement” such a rich type theory.  Alternatively, one may ask whether the term “constructive”, when construed in such a general setting, must inevitably involve a notion of computation.  While it seems obvious on computational grounds that the Law of the Excluded Middle should not be considered universally valid, it becomes less clear why it is so important to omit this Law (and, essentially, no other) in order to obtain the richness of HoTT when no computational interpretation is extant.  From my point of view understanding the computational meaning of higher-dimensional type theory is of paramount importance, because, for me, type theory is and always has been a theory of computation on which the entire edifice of mathematics ought to be built.


The Homotopy Type Theory Book is out!

June 20, 2013

By now many of you have heard of the development of Homotopy Type Theory (HoTT), an extension of intuitionistic type theory that provides a natural foundation for doing synthetic homotopy theory.  Last year the Institute for Advanced Study at Princeton sponsored a program on the Univalent Foundations of Mathematics, which was concerned with developing these ideas.  One important outcome of the year-long program is a full-scale book presenting the main ideas of Homotopy Type Theory itself and showing how to apply them to various branches of mathematics, including homotopy theory, category theory, set theory, and constructive analysis.  The book is the product of a joint effort by dozens of participants in the program, and is intended to document the state of the art as it is known today, and to encourage its further development by the participation of others interested in the topic (i.e., you!).  Among the many directions in which one may take these ideas, the most important (to me) is to develop a constructive (computational) interpretation of HoTT.  Some partial results in this direction have already been obtained, including fascinating work by Thierry Coquand on developing a constructive version of Kan complexes in ITT, by Mike Shulman on proving homotopy canonicity for the natural numbers in a two-dimensional version of HoTT, and by Dan Licata and me on a weak definitional canonicity theorem for a similar two-dimensional theory.  Much work remains to be done to arrive at a fully satisfactory constructive interpretation, which is essential for application of these ideas to computer science.  Meanwhile, though, great progress has been made on using HoTT to formulate and formalize significant pieces of mathematics in a new and strikingly beautiful style, all of which is well documented in the book.

The book is freely available on the web in various formats, including a PDF version with active references and an ebook version suitable for your reading device, and it may also be purchased in hard- or soft-cover from Lulu.  The book itself is open source, and is available at the HoTT Book GitHub repository.  The book is under the Creative Commons CC BY-SA license, and will be freely available in perpetuity.

Readers may also be interested in the posts on Homotopy Type Theory, the n-Category Cafe, and Mathematics and Computation, which describe more about the book and the process of its creation.


More Is Not Always Better

January 28, 2013

In a previous post I discussed the status of Church’s Law in type theory, showing that it fails to hold internally to extensional type theory, even though one may see externally that the definable numeric functions in ETT are λ-definable, and hence Turing computable.  The distinction between internal and external is quite important in logic, mainly because a logical formalism may be unable to express precisely an externally meaningful concept.  The classical example is the Löwenheim-Skolem Theorem of first-order logic, which says that any first-order theory (in a countable language) with an infinite model has a countable model.  In particular the theory of sets has a countable model, which would seem to imply that the set of real numbers, for example, is countable.  But internally one can prove that the reals are uncountable (Cantor’s proof is readily expressed in the theory), which seems to be a paradox of some kind.  But no, all it says is that the function witnessing the countability of the term model cannot be expressed internally, and hence there is no contradiction at all.

A similar situation obtains with Church’s Law.  One may observe empirically, so to say, that Church’s Law holds externally of ETT, but this fact cannot be internalized.  There is a function given by Church’s Law that “decompiles” any (extensional) function of type N→N by providing the index for a Turing machine that computes it.  But this function cannot be defined internally to extensional type theory, because it may be used to obtain a decision procedure for halting of Turing machines, which is internally refutable by formalizing the standard undecidability proof.  In both of these examples it is the undefinability of a function that is important to the expressive power of a formalism, contrary to naïve analyses that would suggest that, when it comes to definability of functions, the more the merrier.  This is a general phenomenon in type theory.  The power of type theory arises from its strictures, not its affordances, in direct opposition to the ever-popular language design principle “first-class x” for all imaginable values of x.

Another perspective on the same issue is provided by Martin-Löf’s meaning explanation of type theory, which is closely related to the theory of realizability for constructive logic.  The high-level idea is that a justification for type theory may be obtained by starting with an untyped concept of computability (i.e., a programming language given by an operational semantics for closed terms), and then giving the meaning of the judgments of type theory in terms of such computations.  So, for example, the judgment A type, where A is a closed expression, means that A evaluates to a canonical type, where the canonical types include, say, Nat, and all terms of the form A'→A'', where A' and A'' are types.  Similarly, if A is a type, the judgment a:A means that A evaluates to a canonical type A' and that a evaluates to a canonical term a' such that a' is a canonical element of A', where, say, any numeral for a natural number is a canonical member of Nat.  To give the canonical members of the function type A'→A'' requires the further notion of equality of elements of a type, a=b:A, which all functions are required to respect.  A meaning explanation of this sort was suggested by Martin-Löf in his landmark paper Constructive Mathematics and Computer Programming, and is used as the basis for the NuPRL type theory, which extends that account in a number of interesting directions, including inductive and coinductive types, subset and quotient types, and partial types.
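To make the shape of such a meaning explanation concrete, here is a small OCaml sketch of my own (an illustration, not NuPRL’s actual definition): closed expressions are evaluated to canonical form, and the judgments are read off from the canonical forms, with membership in a function type remaining a behavioral condition that no such code can decide.

    (* Closed expressions of a toy language: numerals, functions, and the
       type expressions Nat and Arrow. *)
    type exp =
      | Zero
      | Succ of exp
      | Lam of (exp -> exp)          (* higher-order syntax, for brevity *)
      | App of exp * exp
      | Nat
      | Arrow of exp * exp

    (* Evaluate a closed expression to canonical (weak head normal) form. *)
    let rec eval (e : exp) : exp =
      match e with
      | App (f, a) ->
          (match eval f with
           | Lam body -> eval (body a)
           | _ -> failwith "stuck")
      | _ -> e

    (* "A type" means: A evaluates to a canonical type. *)
    let rec is_type (a : exp) : bool =
      match eval a with
      | Nat -> true
      | Arrow (a1, a2) -> is_type a1 && is_type a2
      | _ -> false

    (* "a : Nat" means: a evaluates to a canonical numeral. *)
    let rec is_canonical_nat (e : exp) : bool =
      match eval e with
      | Zero -> true
      | Succ e' -> is_canonical_nat e'
      | _ -> false

    (* "f : Arrow (A, B)" cannot be captured this way: it means that f
       evaluates to some Lam g such that g sends (equal) members of A to
       (equal) members of B -- a behavioral, not a syntactic, condition. *)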

The relation to realizability emerges from applying the meaning explanation of types to the semantics of propositions given by the propositions-as-types principle (which, as I’ve previously argued, should not be called “the Curry-Howard isomorphism”).  According to this view a proposition P is identified with a type, the type of its proofs, and we say that P true iff P evaluates to a canonical proposition that has a canonical member.  In particular, for implication we say that P→Q true if and only if P true implies Q true (and, in addition, the proof respects equality, a condition that I will suppress here for the sake of simplicity).  More explicitly, the implication is true exactly when the truth of the antecedent implies the truth of the consequent, which is to say that there is a constructive transformation of proofs of P into proofs of Q.

In recursive realizability one accepts Church’s Law and demands that the constructive transformation be given by the index of a Turing machine (i.e., by a program written in a fixed programming language).  This means, in particular, that if P expresses, say, the decidability of the halting problem, for which there is no recursive realizer, then the implication P→Q is vacuously true!  By taking Q to be falsehood, we obtain a realizer for the statement that the halting problem is undecidable.  More generally, any statement that is not realized is automatically false  in the recursive realizability interpretation, precisely because the realizers are identified with Turing machine indices.  Pressing a bit further, there are statements, such as the statement that every Turing machine either halts or diverges on its own input, that are true in classical logic, yet have no recursive realizer, and hence are false in the realizability interpretation.

In contrast, in the meaning explanation for NuPRL, Church’s Law is not assumed.  Although one may show that there is no Turing machine to decide halting for Turing machines, it is impossible to show that there is no constructive transformation that may do so.  For example, an oracle machine would be able to make the required decision.  This is entirely compatible with intuitionistic principles, because although intuitionism does not affirm LEM, neither does it deny it.  This point is often missed in some accounts, leading to endless confusion.  Intuitionistic logic, properly conceived, is compatible with classical logic in that classical logic may be seen as an idealization of intuitionistic logic in which we heuristically postulate that all propositions are decidable (all instances of LEM hold).

The crucial point distinguishing the meaning explanation from recursive realizability is precisely the refusal to accept Church’s Law, a kind of comprehension principle for functions as discussed earlier.  This refusal is often called computational open-endedness because it amounts to avoiding a commitment to the blasphemy of limiting God’s programming language to Turing machines (using an apt metaphor of Andrej Bauer’s).  Rather, we piously accept that richer notions of computation are possible, and avoid commitment to a “final theory” of computation in which Church’s Law is postulated outright.  By avoiding the witnessing function provided by Church’s Law we gain expressive power, rather than losing it, resulting in an elegant theory of constructive mathematics that enriches, rather than diminishes, classical mathematics.    In short, contrary to “common sense” (i.e., uninformed supposition), more is not always better.

Update: corrected minor technical error and some typographical errors.

Update: clarified point about incompatibility of recursive realizability with classical logic.


Univalent Foundations at IAS

December 3, 2012

As many of you may know, the Institute for Advanced Study is sponsoring a year-long program, called “Univalent Foundations for Mathematics” (UF), which is developing the theory and applications of Homotopy Type Theory (HTT).  The UF program is organized by Steve Awodey (CMU), Thierry Coquand (Chalmers), and Vladimir Voevodsky (IAS).  About two dozen people are in residence at the Institute to participate in the program, including Peter Aczel, Andrej Bauer, Peter Dybjer, Dan Licata, Per Martin-Löf, Peter Lumsdaine, Mike Shulman, and many others.  I have been shuttling back and forth between the Institute and Carnegie Mellon, and will continue to do so next semester.

The excitement surrounding the program is palpable.  We all have the sense that we are doing something important that will change the world.  A typical day consists of one or two lectures of one or two hours, with the rest of the day typically spent in smaller groups or individually, working at the blackboard.  There are many strands of work going on simultaneously, including fundamental type theory, developing proof assistants, and formulating a body of informal type theory.  As visitors come and go we have lectures on many topics related to HTT and UF, and there is constant discussion going on over lunch, tea, and dinner each day.  While there I work each day to the point of exhaustion, eager to pursue the many ideas that are floating around.

So, why is homotopy type theory so exciting?  For me, and I think for many of us, it is the most exciting development in type theory since its inception.  It brings together two seemingly disparate topics, algebraic topology and type theory, and provides a gorgeous framework in which to develop both mathematics and computer science.  Many people have asked me why it’s so important.  My best answer is that it’s too beautiful to be ignored, and such a beautiful concept must be good for something!  We’ll be at this for years, but it’s too soon to say yet where the best applications of HTT will arise.  But I am sure in my bones that it’s as important as type theory itself.

Homotopy type theory is based on two closely related concepts:

  1. Constructivity.  Proofs of propositions are mathematical objects classified by their types.
  2. Homotopy.  Paths between objects of a type are proofs of their interchangeability in all contexts.  Paths in a type form a type whose paths are homotopies (deformations of paths).

Homotopy type theory is organized so that maps and families respect homotopy, which, under the identification of paths with equality proofs, means that they respect equality.  The force of this organization arises from axioms that specify what the paths within a type are.  There are two major sources of non-trivial paths within a type: the univalence axiom and higher inductive types.

The univalence axiom specifies that there is an equivalence between equivalences and equalities of the objects of a universe.  Unravelling a bit, this means that for any two types inhabiting a universe, evidence for their equivalence (a pair of maps that are inverse up to higher homotopy, called weak equivalence) is evidence for their equality.  Put another way, weak equivalences are paths in the universe.  So, for example, a bijection between two elements of the universe \textsf{Set} of sets constitutes a proof of the equality (universal interchangeability) of the two sets.
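For reference, here is the axiom in roughly the notation of the HoTT book (my transcription, not part of the original post): the canonical map sending a path in the universe to an equivalence is itself an equivalence, and in particular it has an inverse \textsf{ua}.

\displaystyle \textsf{idtoeqv} : (A =_{\mathcal{U}} B) \to (A \simeq B)\ \text{is an equivalence, with inverse}\ \textsf{ua} : (A \simeq B) \to (A =_{\mathcal{U}} B).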

Higher inductive types allow one to define types by specifying their elements, any paths between their elements, any paths between those paths, and so on to any level, or dimension.  For example, the interval, I, has as elements the endpoints 0, 1 : I, and a path \textsf{seg} between 0 and 1 within I.  The circle, S^1, has an element \textsf{base} and a path \textsf{loop} from \textsf{base} to itself within S^1.

Respect for homotopy means that, for example, a family F of types indexed by the type \textsf{Set} must be such that if A and B are isomorphic sets, then there must be an equivalence between the types F(A) and F(B) allowing us to transport objects from one “fiber” to the other.  And any function with domain \textsf{Set} must respect bijection—it could be the cardinality function, for example, but it cannot be a function that would distinguish \{\,0,1\,\} from \{\,\textsf{true},\textsf{false}\,\}.

Univalence allows us to formalize the informal convention of identifying things “up to isomorphism”.  In the presence of univalence, equivalent types (spaces) are, in fact, equal.  So rather than rely on convention, we have a formal account of such identifications.

Higher inductives generalize ordinary inductive definitions to higher dimensions.  This means that we can now define maps (computable functions!) between, say, the 4-dimensional sphere and the 3-dimensional sphere, or between the interval and the torus.  HTT makes absolutely clear what this even means, thanks to higher inductive types.  For example, a map out of S^1 is given by two pieces of data:

  1. What to do with the base point.  It must be mapped to a point in the target space.
  2. What to do with the loop.  It must be mapped to a loop in the target space based at the target point.

A map out of I is given similarly by specifying

  1. What to do with the endpoints.  These must be specified points in the target space.
  2. What to do with the segment.  It must be a path between the specified points in the target space.
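Stated as recursion principles (my transcription of the standard rules, not part of the original post), this data is exactly what the eliminators for S^1 and I demand:

\displaystyle \frac{\Gamma\vdash b : B\quad \Gamma\vdash \ell : b =_B b}{\Gamma\vdash \textsf{rec}_{S^1}(b;\ell) : S^1\to B} \qquad \frac{\Gamma\vdash b_0 : B\quad \Gamma\vdash b_1 : B\quad \Gamma\vdash s : b_0 =_B b_1}{\Gamma\vdash \textsf{rec}_{I}(b_0;b_1;s) : I\to B}

The resulting maps send \textsf{base} to b and \textsf{loop} to \ell, and, respectively, 0, 1, and \textsf{seg} to b_0, b_1, and s.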

It’s all just good old functional programming!  Or, rather, it would be, if we were to have a good computational semantics for HTT, a topic of intense interest at the IAS this year.  It’s all sort-of-obvious, yet also sort-of-non-obvious, for reasons that are difficult to explain briefly.  (If I could, they would probably be considered obvious, and not in need of much explanation!)

A game-changing aspect of all of this is that HTT provides a very nice foundation for mathematics in which types (\infty-groupoids) play a fundamental role as classifying all mathematical objects, including proofs of propositions, which are just types.  Types may be classified according to the structure of their paths—and hence propositions may be classified according to the structure of their proofs.  For example, any two proofs of equivalence of two natural numbers are themselves equivalent; there’s only one way to say that 2+2=4, for example.  Of course there is no path between 2+2 and 5.  And these two situations exhaust the possibilities: any two paths between natural numbers are equal (but there may not even be one).  This unicity of paths property lifts to function spaces by extensionality, paths between functions being just paths between the range elements for each choice of domain element.  But the universe of Sets is not like this: there are many paths between sets (one for each bijection), and these are by no means equivalent.  However, there is at most one way to show that two bijections between sets are equivalent, so the structure “peters out” after dimension 2.
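The unicity-of-paths property described here is precisely the condition of being a homotopy set; in the standard notation (my transcription) it reads

\displaystyle \textsf{isSet}(A) :\equiv \Pi x,y{:}A.\,\Pi p,q{:}\textsf{Id}_A(x,y).\,\textsf{Id}(p,q)

and it holds of the natural numbers because their equality is decidable (Hedberg’s theorem), whereas it fails for the universe \textsf{Set}, exactly as described above.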

The idea to apply this kind of analysis to proofs of propositions is a distinctive feature of HTT, arising from the combination of constructivity, which gives proofs status as mathematical objects, and homotopy, which provides a powerful theory of the equivalence of proofs.  Conventional mathematics ignores proofs as objects of study, and is thus able to express certain ideas only indirectly.  HTT brings out the latent structure of proofs, and provides an elegant framework for expressing these concepts directly.

Update: edited clumsy prose and added concluding paragraph.


Polarity in Type Theory

August 25, 2012

There have recently arisen some misguided claims about a supposed opposition between functional and object-oriented programming.  The claims amount to a belated recognition of a fundamental structure in type theory first elucidated by Jean-Marc Andreoli, and developed in depth by Jean-Yves Girard in the context of logic, and by Paul Blain Levy and Noam Zeilberger in the context of programming languages.  In keeping with the general principle of computational trinitarianism, the concept of polarization has meaning in proof theory, category theory, and type theory, a sure sign of its fundamental importance.

Polarization is not an issue of language design, it is an issue of type structure.  The main idea is that types may be classified as being positive or negative, with the positive being characterized by their structure and the negative being characterized by their behavior.  In a sufficiently rich type system one may consider, and make effective use of, both positive and negative types.  There is nothing remarkable or revolutionary about this, and, truly, there is nothing really new about it, other than the terminology.  But through the efforts of the above-mentioned researchers, and others, we have learned quite a lot about the importance of polarization in logic, languages, and semantics.  I find it particularly remarkable that Andreoli’s work on proof search turned out to also be of deep significance for programming languages.  This connection was developed and extended by Zeilberger, on whose dissertation I am basing this post.

The simplest and most direct way to illustrate the ideas is to consider the product type, which corresponds to conjunction in logic.  There are two ways to formulate the rules for the product type that are completely equivalent from the point of view of inhabitation, but quite distinct from the point of view of computation.  Let us first state them as rules of logic, then equip these rules with proof terms so that we may study their operational behavior.  For the time being I will refer to these as Method 1 and Method 2, but after we examine them more carefully, we will find more descriptive names for them.

Method 1 of defining conjunction is perhaps the most familiar.  It consists of this introduction rule

\displaystyle\frac{\Gamma\vdash A\;\textsf{true}\quad\Gamma\vdash B\;\textsf{true}}{\Gamma\vdash A\wedge B\;\textsf{true}}

and the following two elimination rules

\displaystyle\frac{\Gamma\vdash A\wedge B\;\textsf{true}}{\Gamma\vdash A\;\textsf{true}}\qquad\frac{\Gamma\vdash A\wedge B\;\textsf{true}}{\Gamma\vdash B\;\textsf{true}}.

Method 2 of defining conjunction is only slightly different.  It consists of the same introduction

\displaystyle \frac{\Gamma\vdash A\;\textsf{true}\quad\Gamma\vdash B\;\textsf{true}}{\Gamma\vdash A\wedge B\;\textsf{true}}

and one elimination rule

\displaystyle\frac{\Gamma\vdash A\wedge B\;\textsf{true} \quad \Gamma,A\;\textsf{true},B\;\textsf{true}\vdash C\;\textsf{true}}{\Gamma\vdash C\;\textsf{true}}.

From a logical point of view the two formulations are interchangeable in that the rules of the one are admissible with respect to the rules of the other, given the usual structural properties of entailment, specifically reflexivity and transitivity.  However, one can discern a difference in “attitude” in the two formulations that will turn out to be a manifestation of the concept of polarity.

Method 1 is a formulation of the idea that a proof of a conjunction is anything that behaves conjunctively, which means that it supports the two elimination rules given in the definition.  There is no commitment to the internal structure of a proof, nor to the details of how projection operates; as long as there are projections, then we are satisfied that the connective is indeed conjunction.  We may consider that the elimination rules define the connective, and that the introduction rule is derived from that requirement.  Equivalently we may think of the proofs of conjunction as being coinductively defined to be as large as possible, as long as the projections are available.  Zeilberger calls this the pragmatist interpretation, following Count Basie’s principle, “if it sounds good, it is good.”

Method 2 is a direct formulation of the idea that the proofs of a conjunction are inductively defined to be as small as possible, as long as the introduction rule is valid.  Specifically, the single introduction rule may be understood as defining the structure of the sole form of proof of a conjunction, and the single elimination rule expresses the induction, or recursion, principle associated with that viewpoint.  Specifically, to reason from the fact that A\wedge B\;\textsf{true} to derive C\;\textsf{true}, it is enough to reason from the data that went into the proof of the conjunction to derive C\;\textsf{true}.  We may consider that the introduction rule defines the connective, and that the elimination rule is derived from that definition.  Zeilberger calls this the verificationist interpretation.

These two perspectives may be clarified by introducing proof terms, and the associated notions of reduction that give rise to a dynamics of proofs.

When reformulated with explicit proofs, the rules of Method 1 are the familiar rules for ordered pairs:

\displaystyle\frac{\Gamma\vdash M:A\quad\Gamma\vdash N:B}{\Gamma\vdash \langle M, N\rangle:A\wedge B}

\displaystyle\frac{\Gamma\vdash M:A\wedge B}{\Gamma\vdash \textsf{fst}(M):A}\qquad\frac{\Gamma\vdash M:A\wedge B}{\Gamma\vdash \textsf{snd}(M):B}.

The associated reduction rules specify that the elimination rules are post-inverse to the introduction rules:

\displaystyle\textsf{fst}(\langle M,N\rangle)\mapsto M \qquad \textsf{snd}(\langle M,N\rangle)\mapsto N.

In this formulation the proposition A\wedge B is often written A\times B, since it behaves like a Cartesian product of proofs.

When formulated with explicit proofs, Method 2 looks like this:

\displaystyle \frac{\Gamma\vdash M:A\quad\Gamma\vdash N:B}{\Gamma\vdash M\otimes N:A\wedge B}

\displaystyle\frac{\Gamma\vdash M:A\wedge B \quad \Gamma,x:A,y:B\vdash N:C}{\Gamma\vdash \textsf{split}(M;x,y.N):C}

with the reduction rule

\displaystyle\textsf{split}(M\otimes N;x,y.P)\mapsto [M,N/x,y]P.

With this formulation it is natural to write A\wedge B as A\otimes B, since it behaves like a tensor product of proofs.

Since the two formulations of “conjunction” have different internal structure, we may consider them as two different connectives.  This may, at first, seem pointless, because it is easily seen that x:A\times B\vdash M:A\otimes B for some M and that x:A\otimes B\vdash N:A\times B for some N, so that the two connectives are logically equivalent, and hence interchangeable in any proof.  But there is nevertheless a reason to draw the distinction, namely that they have different dynamics.
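Concretely, the coercions witnessing the logical equivalence are the evident ones (my transcription):

\displaystyle x:A\times B\vdash \textsf{fst}(x)\otimes\textsf{snd}(x) : A\otimes B \qquad x:A\otimes B\vdash \textsf{split}(x;y,z.\langle y,z\rangle) : A\times B.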

It is easy to see why.  From the pragmatic perspective, since the projections act independently of one another, there is no reason to insist that the components of a pair be evaluated before they are used.  Quite possibly we may only ever project the first component, so why bother with the second?  From the verificationist perspective, however, we are pattern matching against the proof of the conjunction, and are demanding both components at once, so it makes sense to evaluate both components of a pair in anticipation of future pattern matching.  (Admittedly, in a structural type theory one may immediately drop one of the variables on the floor and never use it, but then why give it a name at all?  In a substructural type theory such as linear type theory, this is not a possibility, and the interpretation is forced.)  Thus, the verificationist formulation corresponds to eager evaluation of pairing, and the pragmatist formulation to lazy evaluation of pairing.
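The operational distinction can be seen even in garden-variety OCaml (an illustration of my own, not part of the original discussion): a positive product is an eager constructor consumed by pattern matching, while a negative product is a bundle of projections whose components need not be evaluated until they are projected.

    (* Positive ("tensor") product: components are evaluated when the pair
       is built, and the pair is consumed by pattern matching (split). *)
    type ('a, 'b) tensor = Pair of 'a * 'b

    let split (Pair (x, y)) k = k x y

    (* Negative ("lazy") product: a record of projections.  With thunked
       components, nothing is evaluated until it is actually projected. *)
    type ('a, 'b) wth = { fst : unit -> 'a; snd : unit -> 'b }

    let pair_lazy (x : unit -> 'a) (y : unit -> 'b) : ('a, 'b) wth =
      { fst = x; snd = y }

    (* Projecting the first component never touches the second, even if
       the second would diverge or raise. *)
    let one = (pair_lazy (fun () -> 1) (fun () -> failwith "never")).fst ()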

Having distinguished the two forms of conjunction by their operational behavior, it is immediately clear that both forms are useful, and are by no means opposed to one another.  This is why, for example, the concept of a lazy language makes no sense; rather, one should speak of lazy types, which are perfectly useful, but by no means the only types one should ever consider.  Similarly, the concept of an object-oriented language seems misguided, because it emphasizes the pragmatist conception, to the exclusion of the verificationist, by insisting that everything be an object characterized by its methods.

More broadly, it is useful to classify types into two polarities, the positive and the negative, corresponding to the verificationist and pragmatist perspectives.  Positive types are inductively defined by their introduction forms; they correspond to colimits, or direct limits, in category theory.  Negative types are coinductively defined by their elimination forms; they correspond to limits, or inverse limits, in category theory.  The concept of polarity is intimately related to the concept of focusing, which in logic sharpens the concept of a cut-free proof and elucidates the distinction between synchronous and asynchronous connectives, and which in programming languages provides an elegant account of pattern matching, continuations, and effects.

As ever, enduring principles emerge from the interplay between proof theory, category theory, and type theory.  Such concepts are found in nature, and do not depend on cults of personality or the fads of the computer industry for their existence or importance.

Update: word-smithing.


Extensionality, Intensionality, and Brouwer’s Dictum

August 11, 2012

There seems to be a popular misunderstanding about the propositions-as-types principle that has led some to believe that intensional type theory (ITT) is somehow preferable to or more sensible than extensional type theory (ETT).  Now, as a practical matter, few would dispute that ETT is much easier to use than ITT for mechanizing everyday mathematics.  Some justification for this will be given below, but I am mainly concerned with matters of principle.  Specifically, I wish to dispute the claim that ETT is somehow “wrong” compared to ITT.  The root of the problem appears to be a misunderstanding of the fundamental ideas of intuitionism, which are expressed by the propositions-as-types principle.

The most popular conception appears to be the trivial one, namely that certain inductively defined formal systems of logic correspond syntactically to certain inductively defined formal systems of typing.  Such correspondences are not terribly interesting, because they can easily be made to hold by construction: all you need to do is to introduce proof terms that summarize a derivation, and then note that the proofs of a proposition correspond to the terms of the associated type.  In this form the propositions-as-types principle is often dubbed, rather grandly, the Curry-Howard Isomorphism.  It’s a truism that most things in mathematics are named after anyone but their discoverers, and that goes double in this case.  Neither Curry nor Howard discovered the principle (Howard himself disclaims credit for it), though they both did make contributions to it.  Moreover, this unfortunate name denies credit to those who did the real work in inventing the concept, including Brouwer, Heyting, Kolmogorov, de Bruijn, and Martin-Löf.  (Indeed, it is sometimes called the BHK Correspondence, which is both more accurate and less grandiose.)  Worse, there is an “isomorphism” only in the most trivial sense of an identity by definition, hardly worth emphasizing.

The interesting conception of the propositions-as-types principle is what I call Brouwer’s Dictum, which states that all of mathematics, including the concept of a proof, is to be derived from the concept of a construction, a computation classified by a type.  In intuitionistic mathematics proofs are themselves “first-class” mathematical objects that inhabit types that may as well be identified with the proposition that they prove.  Proving a proposition is no different than constructing a program of a type.  In this sense logic is a branch of mathematics, the branch concerned with those constructions that are proofs.  And mathematics is itself a branch of computer science, since according to Brouwer’s Dictum all of mathematics is to be based on the concept of computation.  But notice as well that there are many more constructions than those that correspond to proofs.  Numbers, for example, are perhaps the most basic ones, as would be any inductive or coinductive types, or even more exotic objects such as Brouwer’s own choice sequences.  From this point of view the judgement M\in A stating that M is a construction of type A is of fundamental importance, since it encompasses not only the formation of “ordinary” mathematical constructions, but also those that are distinctively intuitionistic, namely mathematical proofs.

An often misunderstood point that must be clarified before we continue is that the concept of proof in intuitionism is not to be identified with the concept of a formal proof in a fixed formal system.  What constitutes a proof of a proposition is a judgement, and there is no reason to suppose a priori that this judgement ought to be decidable.  It should be possible to recognize a proof when we see one, but it is not required that we be able to decide, in every case, whether a given construction is or is not a proof.  In contrast formal proofs are inductively defined and hence fully circumscribed, and we expect it to be decidable whether or not a purported formal proof is in fact a formal proof, that is whether it is well-formed according to the given inductively defined rules.  But the upshot of Gödel’s Theorem is that as soon as we fix the concept of formal proof, it is immediate that it is not an adequate conception of proof simpliciter, because there are propositions that are true, which is to say have a proof, but have no formal proof according to the given rules.  The concept of truth, even in the intuitionistic setting, eludes formalization, and it will ever be thus.  Putting all this another way, according to the intuitionistic viewpoint (and the mathematical practices that it codifies), there is no truth other than that given by proof.  Yet the rules of proof cannot be given in decidable form without missing the point.

It is for this reason that the first sense of the propositions-as-types principle discussed above is uninteresting, for it only ever codifies a decidable, and hence incomplete, conception of proof.  Moreover, the emphasis on an isomorphism between propositions and types also misses the point, because it fails to account for the many forms of type that do not correspond to propositions.  The formal correspondence is useful in some circumstances, namely those in which the object of study is a formal system.  So, for example, in LF the goal is to encode formal systems, and hence it is essential in the LF methodology that type checking be decidable.  But when one is talking about a general theory of computation, which is to say a general theory of mathematical constructions, there is no reason to expect either an isomorphism or decidability.  (So please stop referring to propositions-as-types as “the Curry-Howard Isomorphism”!)

We are now in a position to discuss the relationship between ITT and ETT, and to correct the misconception that ETT is somehow “wrong” because the typing judgement is not decidable.  The best way to understand the proper relationship between the two is to place them into the broader context of homotopy type theory, or HTT.  From the point of view of homotopy type theory ITT and ETT represent extremal points along a spectrum of type theories, which is to say a spectrum of conceptions of mathematical construction in Brouwer’s sense.  Extensional type theory is the theory of homotopy sets, or hSets for short, which are spaces that are homotopically discrete, meaning that the only path (evidence for equivalence) of two elements is in fact the trivial self-loop between an element and itself.  Therefore if we have a path between x and y in A, which is to say a proof that they are equivalent, then x and y are equal, and hence interchangeable in all contexts.  The bulk of everyday mathematics takes place within the universe of hSets, and hence is most appropriately expressed within ETT, and experience has borne this out.  But it is also interesting to step outside of this framework and consider richer conceptions of type.

For example, as soon as we introduce universes, one is immediately confronted with the need to admit types that are not hSets.  A universe of hSets naturally includes non-trivial paths between elements witnessing their isomorphism as hSets, and hence their interchangeability in all contexts.  Taking a single universe of hSets as the sole source of such additional structure leads to (univalent) two-dimensional type theory.  In this terminology ETT is then to be considered as one-dimensional type theory.  Universes are not the only source of higher dimensionality.  For example, the interval has two elements, 0 and 1 connected by a path, the segment between them, which may be seen as evidence for their interchangeability (we can slide them along the segment one to the other).  Similarly, the circle S^1 is a two-dimensional inductively defined type with one element, a base point, and one path, a non-reflexive self-loop from the base point to itself.  It is now obvious that one may consider three-dimensional type theory, featuring types such as S^2, the sphere, and so forth.  Continuing this through all finite dimensions, we obtain finite-dimensional type theory, which is just ITT (type theory with no discreteness at any dimension).

From this perspective one can see more clearly why it has proved so awkward to formalize everyday mathematics in ITT.  Most such work takes place in the universe of hSets, and makes no use of higher-dimensional structure.  The natural setting for such things is therefore ETT, the theory of types as homotopically discrete sets.  By formalizing such mathematics within ITT one is paying the full cost of higher-dimensionality without enjoying any of its benefits.  This neatly confirms experience with using NuPRL as compared to using Coq for formulating the mathematics of homotopy sets, and why even die-hard ITT partisans find themselves wanting to switch to ETT for doing real work (certain ideological commitments notwithstanding).  On the other hand, as higher-dimensional structure becomes more important to the work we are doing, something other than ETT is required.  One candidate is a formulation of type theory with explicit levels, representing the dimensionality restriction appropriate to the problem domain.  So work with discrete sets would take place within level 1, which is just extensional type theory.  Level 2 is two-dimensional type theory, and so forth, and the union of all finite levels is something like ITT.  To make this work requires that there be a theory of cumulativity of levels, a theory of resizing that allows us to move work at a higher level to a lower level at which it still makes sense, and a theory of truncation that allows suppression of higher-dimensional structure (generalizing proof irrelevance and “squash” types).

However this may turn out, it is clear that the resulting type theory will be far richer than merely the codification of the formal proofs of some logical system.  Types such as the geometric spaces mentioned above do not arise as the types of proofs of propositions, but rather are among the most basic of mathematical constructions, in complete accordance with Brouwer’s dictum.


Church’s Law

August 9, 2012

A new feature of this year’s summer school was a reduction in the number of lectures, and an addition of daily open problem sessions for reviewing the day’s material. This turned out to be a great idea for everyone, because it gave us more informal time together, and gave the students a better chance at digesting a mountain of material. It also turned out to be a bit of an embarrassment for me, because I posed a question off the top of my head for which I thought I had two proofs, neither of which turned out to be valid. The claimed theorem is, in fact, true, and one of my proofs is easily corrected to resolve the matter (the other, curiously, remains irredeemable for reasons I’ll explain shortly). The whole episode is rather interesting, so let me recount a version of it here for your enjoyment.

The context of the discussion was extensional type theory, or ETT, which is characterized by an identification of judgemental with propositional equality: if you can prove that two objects are equal, then they are interchangeable for all purposes without specific arrangement. The alternative, intensional type theory, or ITT, regards judgemental equality as definitional equality (symbolic evaluation), and gives computational meaning to proofs of equality of objects of a type, allowing in particular transport across two instances of a family whose indices are equal. NuPRL is an example of an ETT; CiC is an example of an ITT.

Within the framework of ETT, the principle of function extensionality comes “for free”, because you can prove it to hold within the theory. Function extensionality states that f=g:A\to B whenever x:A\vdash f(x)=g(x):B. That is, two functions are equal if they are equal on all arguments (and, implicitly, respect equality of arguments). Function extensionality holds definitionally if your definitional equivalence includes the \eta and \xi rules, but in any case does not have the same force as extensional equality. Function extensionality as a principle of equality cannot be derived in ITT, but must be added as an additional postulate (or derived from a stronger postulate, such as univalence or the existence of a one-dimensional interval type).
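Written out as a type (my transcription), the principle in question, which ITT cannot derive, is

\displaystyle \textsf{funext} : \Pi f,g{:}A\to B.\,\big(\Pi x{:}A.\,\textsf{Id}_B(f(x),g(x))\big)\to \textsf{Id}_{A\to B}(f,g).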

Regardless of whether we are working in an extensional or an intensional theory, it is easy to see that all functions of type N\to N definable in type theory are computable. For example, we may show that all such functions may be encoded as recursive functions in the sense of Kleene, or in a more modern formulation we may give a structural operational semantics that provides a deterministic execution model for such functions (given n:N, run f:N\to N on n until it stops, and yield that as result). Of course the proof relies on some fairly involved meta-theory, but it is all constructively valid (in an informal sense) and hence provides a legitimate computational interpretation of the theory. Another way to say the same thing is to say that the comprehension principles of type theory are such that every object deemed to exist has a well-defined computational meaning, so it follows that all functions defined within it are going to be computable.

This is all just another instance of Church’s Law, the scientific law stating that any formalism for defining computable functions will turn out to be equivalent to, say, the λ-calculus when it comes to definability of number-theoretic functions. (Ordinarily Church’s Law is called Church’s Thesis, but for reasons given in my Practical Foundations book, I prefer to give it the full status of a scientific law.) Type theory is, in this respect, no better than any other formalism for defining computable functions. By now we have such faith in Church’s Law that this remark is completely unsurprising, even boring to state explicitly.

So it may come as a surprise to learn that Church’s Law is false. I’m being provocative here, so let me explain what I mean before I get flamed to death on the internet.   The point I wish to make is that there is an important distinction between the external and the internal properties of a theory. For example, in first-order logic the Löwenheim-Skolem Theorem tells us that if a first-order theory has an infinite model, then it has a countable model. This implies that, externally to ZF set theory, there are only countably many sets, even though internally to ZF set theory we can carry out Cantor’s argument to show that the powerset operation takes us to exponentially higher cardinalities far beyond the countable. One may say that the “reason” is that the evidence for the countability of sets is a bijection that is not definable within the theory, so that it cannot “understand” its own limitations. This is a good thing.

The situation with Church’s Law in type theory is similar. Externally we know that every function on the natural numbers is computable. But what about internally? The internal statement of Church’s Law is this: \Pi f:N\to N.\Sigma n:N. n\Vdash f, where the notation n\Vdash f means, informally, that n is the code of a program that, when executed on input m:N, evaluates to f(m). In Kleene’s original notation the relation n\Vdash f would be rendered as \Pi m:N.\Sigma p:N.T(n,m,p)\wedge Id(U(p),f(m)), where the T predicate encodes the operational semantics, and the U function extracts the answer from a successful computation. Note that the expansion makes use of the identity type at the type N. The claim is that Church’s Law, stated as a type (proposition) within ETT, is false, which is to say that it entails a contradiction.
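For concreteness, here is one way the internal statement might be transcribed in Lean 4 notation, with Kleene’s T predicate and the extraction function U left as assumed parameters rather than actual definitions (the names Tracks and ChurchsLaw are ad hoc, and Lean’s intensional theory is of course not ETT; the sketch is only meant to fix the shape of the type):

section
variable (T : Nat → Nat → Nat → Prop)  -- assumed: "p codes a halting run of program e on input m"
variable (U : Nat → Nat)               -- assumed: extracts the output from such a run

-- e ⊩ f : e codes a program that computes f (note the unsquashed Σ)
def Tracks (e : Nat) (f : Nat → Nat) : Type :=
  (m : Nat) → Σ' p : Nat, T e m p ∧ U p = f m

-- the internal statement of Church's Law, as a type
def ChurchsLaw : Type :=
  (f : Nat → Nat) → Σ' e : Nat, Tracks T U e f

end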

When I posed this as an exercise at the summer school, I had in mind two different proofs, which I will now sketch. Neither is valid, but there is a valid proof that I’ll come to afterwards.

Both proofs begin by applying the so-called Axiom of Choice. For those not familiar with type theory, the “axiom” of choice is in fact a theorem, stating that every total binary relation contains a function. Explicitly,

(\Pi x:A.\Sigma y:B.R(x,y)) \to \Sigma f:A\to B.\Pi x:A.R(x,f(x)).
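A minimal transcription in Lean 4 notation, just to emphasize how little is going on (the proof merely re-pairs the two projections; R is taken to be Prop-valued here, which suffices for the present argument):

def choice {A B : Type} {R : A → B → Prop}
    (h : (x : A) → Σ' y : B, R x y) :
    Σ' f : A → B, (x : A) → R x (f x) :=
  ⟨fun x => (h x).1, fun x => (h x).2⟩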

The function f is the “choice function” that associates to each argument x a witness to the totality of R at x. In the present case if we postulate Church’s Law, then by the axiom of choice we have

\Sigma F:(N\to N)\to N.\Pi f:N\to N. F(f)\Vdash f.

That is, the functional F picks out, for each function f in N\to N, a (code for a) program that witnesses the computability of f. This should already seem suspicious, because by function extensionality the functional F must assign the same program to any two extensionally equal functions.

We may easily see that F is injective, for if F(f) is F(g), then that common code tracks both f and g, and hence f and g are (extensionally) equal. Thus we have an injection from N\to N into N, which seems “impossible” … except that it is not! Let’s try the proof that this is impossible, and see where it breaks down. Suppose that i:(N\to N)\to N is injective. Define d(x)=i^{-1}(x)(x)+1, and consider d(i(d))=i^{-1}(i(d))(i(d))+1=d(i(d))+1, so 0=1 and we are done. Not so fast! Since i is only injective, and not necessarily surjective, it is not clear how to define i^{-1}. The obvious idea is to send x=i(f) to f, and any x outside the image of i to, say, the identity. But there is no reason to suppose that the image of i is decidable, so the attempted definition breaks down. I hacked around with this for a while, trying to exploit properties of F to repair the proof (rather than work with a general injection, focus on the specific functional F), but failed. Andrej Bauer pointed out to me, to my surprise, that there is a model of ETT (which he constructed) that contains an injection of N\to N into N! So there is no possibility of rescuing this line of argument.

(Incidentally, we can show within ETT that there is no bijection between N and N\to N, using surjectivity to rescue the proof attempt above. Curiously, Lawvere has shown that there can be no surjection from N onto N\to N, but this does not seem to help in the present situation. This shows that the concept of countability is more subtle in the constructive setting than in the classical setting.)
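For comparison, the surjective form of the diagonal argument does go through constructively, and is short enough to write out in full. Here is a sketch in Lean 4 notation (an intensional setting, so this illustrates the combinatorics rather than formalizing the ETT statement):

theorem no_surjection (F : Nat → (Nat → Nat)) :
    ¬ (∀ g : Nat → Nat, ∃ n : Nat, F n = g) := by
  intro hsurj
  -- the diagonal function disagrees with F n at argument n, for every n
  cases hsurj (fun k => F k k + 1) with
  | intro n hn =>
    have h : F n n = F n n + 1 := congrFun hn n
    omega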

But I had another argument in mind, so I was not worried. The functional F provides a decision procedure for equality for the type N\to N: given f,g:N\to N, compare F(f) with F(g). Surely this is impossible! But one cannot prove within type theory that \textrm{Id}_{N\to N}(-,-) is undecidable, because type theory is consistent with the law of the excluded middle, which states that every proposition is decidable. (Indeed, type theory proves that excluded middle is irrefutable for any particular proposition P: \neg\neg(P\vee\neg P).) So this proof also fails!
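The decision procedure alluded to here is completely generic: an injection of any type into N yields an equality test for that type, because equality of natural numbers is decidable. A sketch in Lean 4 notation (the name decEqOfInjection is my own):

def decEqOfInjection {X : Type} (F : X → Nat)
    (inj : ∀ {f g : X}, F f = F g → f = g) (f g : X) : Decidable (f = g) :=
  if h : F f = F g then
    .isTrue (inj h)                       -- equal codes: conclude f = g by injectivity
  else
    .isFalse (fun e => h (congrArg F e))  -- unequal codes: f = g would force equal codes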

At this point it started to seem as though Church’s Law could be independent of ETT, as startling as that sounds. For ITT it is more plausible: equality of functions is definitional, so one could imagine associating an index with each function without disrupting anything. But for ETT this seemed implausible to me. Andrej pointed me to a paper by Maietti and Sambin that states that Church’s Law is incompatible with function extensionality and choice. So there must be another proof that refutes Church’s Law, and indeed there is one based on the aforementioned decidability of function equivalence (but with a slightly different line of reasoning than the one I suggested).

First, note that the equality test for functions in N\to N can be used to check for halting. Using the T predicate described above, we can define a function that is constantly 0 iff a given (code of a) program never halts on a given input; comparing this function with the constant-0 function using the equality test therefore decides halting. So to complete the refutation of the internal form of Church’s Law it suffices to show that the halting problem for (codes of) functions and inputs is not computable.

Specifically, assume given h:N\times N\to N that, given a code for a function and an input, yields 0 or 1 according to whether or not that function halts when applied to that input. Define d:N\to N by \lambda x:N.\neg h(x,x), the usual diagonal function. Now apply the functional F obtained from Church’s Law using the Axiom of Choice to obtain n=F(d), the code for the function d, and consider h(n,n) to derive the needed contradiction. Notice that we have used Church’s Law here to obtain a code for the type-theoretic diagonal function, which is then passed to the halting tester in the usual way.

As you can see, the revised argument follows along lines similar to what I had originally envisioned (in the second version), but requires a bit more effort to push through the proof properly. (Incidentally, I don’t think the argument can be made to work in pure ITT, but perhaps it would go through for ITT enriched with function extensionality.)

Thus, Church’s Law is false internally to extensional type theory, even though it is evidently true externally for that theory. You can see the similarity to the situation in first-order logic described earlier. Even though all functions of type N\to N are computable, type theory itself is not capable of recognizing this fact (at least, not in the extensional case). And this is a good thing, not a bad thing! The whole beauty of constructive mathematics lies in the fact that it is just mathematics, free of any self-conscious recognition that we are writing programs when proving theorems constructively. We never have to reason about machine indices or any such nonsense, we just do mathematics under the discipline of not assuming that every proposition is decidable. One benefit is that the same mathematics admits interpretation not only in terms of computability, but also in terms of continuity in topological spaces, establishing a deep connection between two seemingly disparate topics.

(Hat tip to Andrej Bauer for help in sorting all this out. Here’s a link to a talk and a paper about the construction of a model of ETT in which there is an injection from N\to N to N.)

Update: word-smithing.


Types and Cells

June 7, 2011

The doctrine of computational trinitarianism implies that there should be a categorial analogue of two-dimensional type theory … and indeed there is.  It is called, oddly enough, two-dimensional category theory, which arose first, for numerous reasons.  What is two-dimensional category theory?  And how does it relate to two-dimensional type theory?

Two-dimensional category theory may be seen as an instance of the general concept of enriched categories in which the collection of morphisms between any two objects is endowed with some additional structure. A locally small category is one for which the maps between any two objects form a set, called the hom set.  An order-enriched category is one in which the morphisms form an ordered set, which is a set equipped with a pre-order (reflexive and transitive relation) on maps such that composition is monotone.  Order-enriched categories were used by Mitchell Wand, Gordon Plotkin, and Mike Smyth in their development of the category-theoretic interpretation of recursive types.  Specifically, an O-category is an order-enriched category whose objects are cpo’s (pre-orders closed under sup’s of chains, and hence containing a least element) and whose hom sets are themselves cpo’s, and for which composition is continuous (monotone and preserves suprema of chains).
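To fix ideas, here is a bare-bones rendering of an order-enriched category in Lean 4 notation; the field names are my own, and I have omitted the ordinary category laws as well as the cpo and continuity refinements that an O-category additionally demands:

structure OrderEnrichedCat where
  Obj       : Type
  Hom       : Obj → Obj → Type
  le        : {a b : Obj} → Hom a b → Hom a b → Prop        -- the pre-order on each hom set
  le_refl   : ∀ {a b : Obj} (f : Hom a b), le f f
  le_trans  : ∀ {a b : Obj} {f g h : Hom a b}, le f g → le g h → le f h
  ident     : (a : Obj) → Hom a a
  comp      : {a b c : Obj} → Hom b c → Hom a b → Hom a c
  comp_mono : ∀ {a b c : Obj} {g g' : Hom b c} {f f' : Hom a b},
                le g g' → le f f' → le (comp g f) (comp g' f')   -- composition is monotone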

More generally still, one may consider (different forms of) category-enriched categories, in which the hom sets are structured as categories with their own vertical notion of identity and composition that meshes well with the horizontal notion of identity and composition of the category itself.  (I will not give here the precise meaning of “meshes well with”, since there are several possible notions and I will be discussing this structure in the type-theoretic setting below.)  The vertical maps are maps between maps that can be interpreted in several ways.  One way is as “witnesses” to an ordering relation among maps, viewing a category as a “pre-order with witnesses”, perhaps an approximation relation among maps such as arises in domain theory.  Another is to view these maps as homotopies between continuous functions on a space; these are the continuous deformations of one map into another.  (This is usually expressed by introducing a “time” coordinate ranging over the unit interval, and stating that a homotopy h between maps f and g is a continuous mapping such that at time zero it agrees with f, at time one it agrees with g, and it varies continuously at times in between.)  An important special case of category-enriched categories is given by those whose morphisms form (not just a category but also) a groupoid, a category in which each map is invertible.  These can be thought of as “equivalence relations with evidence”; the vertical maps are “proofs” that two horizontal maps are equivalent (for example, homotopic).

The sort of category-enriched categories that I have in mind are called (strict) 2-categories.  And as the terminology suggests, there is nothing special about the number 2 here; one can consider n-categories for any number n, and even \infty-categories that incorporate all of these into one grand structure (strictly or weakly).  To help manage this structure, the categorists have a uniform terminology that I wish to introduce here, called cells.  The objects in a category are the 0-cells, the maps are the 1-cells, the maps between maps (transformations, homotopies, …) are the 2-cells, and so forth.  A 2-category is one for which we stop at 2-cells, all higher cells being degenerate and hence ignored.  Often the 2-cells (and higher) form groupoids (in one of two senses that I will come back to later); a (2,1)-category is a 2-category in which the 2-cells form a groupoid (but the 1-cells need only form a category).  The class of (\infty,1)-categories is the subject of intense interest these days, in part because of their importance in homotopy theory.  They are, however, notoriously hard to manage, and, from what I understand, it is not yet clear what the correct definition even is.

Returning to two-dimensional type theory, and applying the types-as-categories correspondence, we get that types are 0-cells, terms (with free variables) are 1-cells, and transformations are 2-cells.  General terms are not invertible, but transformations are, giving rise to a (strict) groupoid structure expressed by the judgement \alpha::M\simeq N:A.  The rules for composition of transformations express the vertical structure as forming a strict groupoid.  Ordinary substitution expresses the horizontal structure of terms as maps (plugging in terms for free variables).  But how do these interact?  In strict two-dimensional type theory this interaction is governed by definitional equalities expressing the interchange law between horizontal and vertical composition.  Specifically, we have the following rule:

\displaystyle{  {x:B\vdash \alpha::M\simeq N:A \qquad \vdash \beta::P\simeq Q:B}\over{\alpha[\beta]::\textit{map}\{x:B.A\}[\beta](M[P/x])\simeq N[Q/x]:A[Q/x]}  }

(More generally one must consider arbitrary contexts, and transformations between substitutions, but this instance of the general case expresses the essential idea.)

This “substitution” principle must satisfy some equational laws that amount to the following conditions:

  1. Identity: \alpha[\textit{id}]\equiv\alpha.
  2. Composition: \alpha[\beta[\gamma]]\equiv \alpha[\beta][\gamma].
  3. Interchange: (\alpha\circ\alpha')[\beta\circ\beta'] = \alpha[\beta] \circ \textit{resp}\{map[\beta]\} (\alpha'[\beta']).
  4. Delegation: (\textit{id}_M)[\delta]\equiv M[\delta].

(Here \textit{resp} is the application of a transformation to a term; it is there to “get the types right”.)  The interchange law is the type-theoretic analogue of the required interaction between 1-composition and 2-composition in a (strict) 2-category.

This is the judgmental structure of strict two-dimensional type theory.  There are several directions we can go from here.  One is to clarify the strict/weak distinction that has been underlying our discussions so far.  Another is to consider how to extend this to three- and higher-dimensional (or, dare I say it, infinite-dimensional) type theory.  The crucial ingredient for both of these discussions is one particular type constructor (more precisely, family of type constructors) that internalizes the hom structure as a type.  In the groupoidal case, where the higher structure is symmetric (invertible), this corresponds to (a version of) the Martin-Löf identity type.  The introduction rule is reflexivity, and the elimination rule is, in essence, the Yoneda Lemma from category theory.  When interpreted in terms of homotopy theory, this is the type of paths in a space, which is critical for defining the fundamental group(oid) of a space, and to clarifying the hinted-at distinction between weak and strict structure in higher dimensions.  More on that later!
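To give a concrete feel for that internalization, here is the eliminator for the Martin-Löf identity type (path induction) written out in Lean 4 notation, which will have to stand in here for the two-dimensional calculus sketched above:

-- the introduction rule is rfl; this is the corresponding elimination rule
def pathInd {A : Type} {M : A} (C : (N : A) → M = N → Type)
    (base : C M rfl) : (N : A) → (p : M = N) → C N p
  | _, rfl => base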


Transformations as strict groupoids

May 30, 2011

The distinguishing feature of higher-dimensional type theory is the concept of equivalence of the members of a type that must be respected by all families of types.  To be sufficiently general it is essential to regard equivalence as a structure, rather than a property.  This is expressed by the judgement

\displaystyle \Gamma\vdash \alpha::M\simeq N:A

which states that M and N are equivalent members of type A, as evidenced by the transformation \alpha.  Respect for equivalence is ensured by the rule

\displaystyle{{\Gamma,x:A\vdash B\,\textsf{type}\quad  \Gamma\vdash \alpha :: M\simeq N:A \quad  \Gamma\vdash P:B[M/x]}\over  {\Gamma\vdash \textit{map}\{x:A.B\}[\alpha](P):B[N/x]}},

which states that equivalent members determine equivalent instances of a family of types.  The equivalence between instances is mediated by the operation \textit{map}\{x:A.B\}[\alpha](-), which sends members of B[M/x] to members of B[N/x].  We call this mapping the action of the family x:A.B on the transformation \alpha.
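In the special case where the transformations are taken to be identifications (paths), the action of a family is just transport; a one-line sketch in Lean 4 notation, with propositional equality standing in for the judgement \alpha::M\simeq N:A:

-- transport of p : B M along α : M = N, standing in for map{x:A.B}[α](p)
def mapAction {A : Type} (B : A → Type) {M N : A} (α : M = N) : B M → B N :=
  fun p => α ▸ p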

For reasons that will only become apparent as we go along, it is important that “equivalence” really be an equivalence: it must be, in an appropriate sense, reflexive, symmetric, and transitive.  The “appropriate sense” is precisely that we require the existence of transformations

\displaystyle{\Gamma\vdash \textit{id}::M\simeq M:A}

\displaystyle{{\Gamma\vdash\alpha::M\simeq N:A}\over{\Gamma\vdash\alpha^{-1}::N\simeq M:A}}

\displaystyle{{\Gamma\vdash \beta::N\simeq P:A\quad  \Gamma\vdash \alpha::M\simeq N:A}\over{\Gamma\vdash\beta\circ\alpha::M\simeq P:A}}

Moreover, these transformations must be respected by the action of any family, in a sense that we shall make clear momentarily.  Before doing so, let us observe that these transformations constitute the operations of a groupoid, which we may think of either as an equivalence relation equipped with evidence or a category in which every map is invertible (a generalized group).  While the former interpretation may not suggest it, the latter formulation implies that we should impose some requirements on how these transformations interact, namely the axioms of a groupoid:

  1. Composition (multiplication) is associative: \gamma\circ(\beta\circ\alpha)\equiv (\gamma\circ\beta)\circ\alpha::M\simeq N:A.
  2. Identity is the unit of composition: \textit{id}\circ\alpha\equiv\alpha::M\simeq N:A and \alpha\circ\textit{id}\equiv\alpha::M\simeq N:A.
  3. Inverses cancel: \alpha^{-1}\circ\alpha\equiv\textit{id}::M\simeq M:A and \alpha\circ\alpha^{-1}\equiv\textit{id}::N\simeq N:A.

These conditions, which impose equalities on transformations, demand that the second-dimensional structure of a type form a strict groupoid.  I will come back to an important weakening of these requirements later.
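For readers who find it helpful to see the data laid out, one way to package these requirements is as a structure in Lean 4 notation (the field names are mine; Hom M N plays the role of the transformations \alpha::M\simeq N:A):

structure StrictGroupoid (A : Type) where
  Hom       : A → A → Type
  ident     : (M : A) → Hom M M                          -- the identity transformation
  comp      : {M N P : A} → Hom N P → Hom M N → Hom M P  -- β ∘ α
  inv       : {M N : A} → Hom M N → Hom N M              -- α⁻¹
  assoc     : ∀ {M N P Q : A} (γ : Hom P Q) (β : Hom N P) (α : Hom M N),
                comp γ (comp β α) = comp (comp γ β) α
  id_left   : ∀ {M N : A} (α : Hom M N), comp (ident N) α = α
  id_right  : ∀ {M N : A} (α : Hom M N), comp α (ident M) = α
  inv_left  : ∀ {M N : A} (α : Hom M N), comp (inv α) α = ident M
  inv_right : ∀ {M N : A} (α : Hom M N), comp α (inv α) = ident N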

We further require that the action of a type family preserve the groupoid structure.  For this it is enough to require that it preserve identities and composition:

\displaystyle{\textit{map}\{x:A.B\}[\textit{id}](-) \equiv \textit{id}(-):B[M/x]}

and

\displaystyle{\begin{array}{c}\textit{map}\{x:A.B\}[\beta\circ\alpha](-)\\\equiv\\\textit{map}\{x:A.B\}[\beta](\textit{map}\{x:A.B\}[\alpha](-))\end{array}}.

Thinking of a groupoid as a category, these conditions require that the action of a type family be (strictly) functorial.  (Here again we are imposing strong requirements in order to facilitate the exposition; eventually we will consider a relaxation of these conditions that will admit a richer range of applications.)
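Under the transport reading sketched earlier, both preservation laws are in fact provable by path induction; a small sketch in Lean 4 notation (restating the transport operation locally so the fragment stands on its own):

-- the action of a family, read as transport along an identification
def transportB {A : Type} (B : A → Type) {M N : A} (α : M = N) : B M → B N :=
  fun p => α ▸ p

-- preservation of identities holds by computation
theorem transportB_id {A : Type} (B : A → Type) {M : A} (p : B M) :
    transportB B (rfl : M = M) p = p := rfl

-- preservation of composition follows by path induction on both identifications
theorem transportB_comp {A : Type} (B : A → Type) {M N P : A}
    (α : M = N) (β : N = P) (p : B M) :
    transportB B (α.trans β) p = transportB B β (transportB B α p) := by
  cases β; cases α; rfl

In the strict theory, by contrast, these laws are imposed as definitional equivalences rather than proved, which is precisely the simplification at stake here.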

(The alert reader will note that I have not formally introduced the concept of a transformation between types, nor the equality of these, into the theory.  There are different ways to skin this cat; for now, I will be a bit loose about the axiomatics in order to focus attention on the main ideas.  Rest assured that everything can be made precise!)

By demanding that the groupoid axioms hold strictly (as equalities) and that the action of families be strictly functorial, we have simplified the theory considerably by restricting it to dimension 2.  To relax these restrictions requires higher dimensions.  For example, we may demand only that the groupoid conditions hold up to a transformation of transformations, but hold strictly from then on; this is the 3-dimensional case.  Or we can relax all such conditions to hold only up to a higher transformation, resulting in finite dimensional type theory.  Similar considerations will apply to other conditions that we shall impose on the action of families, in particular to specify the action of type constructors on transformations, which I will discuss next time.  The presentation of finite-dimensional type theory will be aided by the introduction of identity types (also called path types).  Identity types avoid the need for an ever-expanding nesting of transformations between transformations between ….  More on that later!

Update (August 2012): Egbert Rijke has written lucidly on the topic of Yoneda’s Lemma and its relation to homotopy type theory in his Master’s Thesis, which I encourage readers to consult for a nice summary of higher-dimensional type theory.

