Extensionality, Intensionality, and Brouwer’s Dictum

August 11, 2012

There seems to be a popular misunderstanding about the propositions-as-types principle that has led some to believe that intensional type theory (ITT) is somehow preferable to or more sensible than extensional type theory (ETT).  Now, as a practical matter, few would dispute that ETT is much easier to use than ITT for mechanizing everyday mathematics.  Some justification for this will be given below, but I am mainly concerned with matters of principle.  Specifically, I wish to dispute the claim that ETT is somehow “wrong” compared to ITT.  The root of the problem appears to be a misunderstanding of the fundamental ideas of intuitionism, which are expressed by the propositions-as-types principle.

The most popular conception appears to be the trivial one, namely that certain inductively defined formal systems of logic correspond syntactically to certain inductively defined formal systems of typing.  Such correspondences are not terribly interesting, because they can easily be made to hold by construction: all you need to do is to introduce proof terms that summarize a derivation, and then note that the proofs of a proposition correspond to the terms of the associated type.  In this form the propositions-as-types principle is often dubbed, rather grandly, the Curry-Howard Isomorphism.  It’s a truism that most things in mathematics are named after anyone but their discoverers, and that goes double in this case.  Neither Curry nor Howard discovered the principle (Howard himself disclaims credit for it), though they both did make contributions to it.  Moreover, this unfortunate name denies credit to those who did the real work in inventing the concept, including Brouwer, Heyting, Kolmogorov, de Bruijn, and Martin-Löf.  (Indeed, it is sometimes called the BHK Correspondence, which is both more accurate and less grandiose.)  Worse, there is an “isomorphism” only in the most trivial sense of an identity by definition, hardly worth emphasizing.

The interesting conception of the propositions-as-types principle is what I call Brouwer’s Dictum, which states that all of mathematics, including the concept of a proof, is to be derived from the concept of a construction, a computation classified by a type.  In intuitionistic mathematics proofs are themselves “first-class” mathematical objects that inhabit types that may as well be identified with the proposition that they prove.  Proving a proposition is no different than constructing a program of a type.  In this sense logic is a branch of mathematics, the branch concerned with those constructions that are proofs.  And mathematics is itself a branch of computer science, since according to Brouwer’s Dictum all of mathematics is to be based on the concept of computation.  But notice as well that there are many more constructions than those that correspond to proofs.  Numbers, for example, are perhaps the most basic ones, as would be any inductive or coinductive types, or even more exotic objects such as Brouwer’s own choice sequences.  From this point of view the judgement M\in A stating that M is a construction of type A is of fundamental importance, since it encompasses not only the formation of “ordinary” mathematical constructions, but also those that are distinctively intuitionistic, namely mathematical proofs.
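
To make the identification concrete, here is a minimal sketch in Lean notation (my illustration, not part of the original discussion): a proof of a proposition is literally a program inhabiting the corresponding type, and “ordinary” constructions inhabit types in exactly the same sense.

-- A proof is a construction: the proof of P ∧ Q → Q ∧ P is literally
-- the program that swaps the components of a pair.
example {P Q : Prop} : P ∧ Q → Q ∧ P :=
  fun h => ⟨h.right, h.left⟩

-- And non-propositional constructions inhabit types in the same sense:
example : Nat := 42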

An often misunderstood point that must be clarified before we continue is that the concept of proof in intuitionism is not to be identified with the concept of a formal proof in a fixed formal system.  What constitutes a proof of a proposition is a judgement, and there is no reason to suppose a priori that this judgement ought to be decidable.  It should be possible to recognize a proof when we see one, but it is not required that we be able to decide, in all cases, whether a purported proof is in fact a proof.  In contrast, formal proofs are inductively defined and hence fully circumscribed, and we expect it to be decidable whether or not a purported formal proof is in fact a formal proof, that is, whether it is well-formed according to the given inductively defined rules.  But the upshot of Gödel’s Theorem is that as soon as we fix the concept of formal proof, it is immediate that it is not an adequate conception of proof simpliciter, because there are propositions that are true, which is to say have a proof, but have no formal proof according to the given rules.  The concept of truth, even in the intuitionistic setting, eludes formalization, and it will ever be thus.  Putting all this another way, according to the intuitionistic viewpoint (and the mathematical practices that it codifies), there is no truth other than that given by proof.  Yet the rules of proof cannot be given in decidable form without missing the point.

It is for this reason that the first sense of the propositions-as-types principle discussed above is uninteresting, for it only ever codifies a decidable, and hence incomplete, conception of proof.  Moreover, the emphasis on an isomorphism between propositions and types also misses the point, because it fails to account for the many forms of type that do not correspond to propositions.  The formal correspondence is useful in some circumstances, namely those in which the object of study is a formal system.  So, for example, in LF the goal is to encode formal systems, and hence it is essential in the LF methodology that type checking be decidable.  But when one is talking about a general theory of computation, which is to say a general theory of mathematical constructions, there is no reason to expect either an isomorphism or decidability.  (So please stop referring to propositions-as-types as “the Curry-Howard Isomorphism”!)

We are now in a position to discuss the relationship between ITT and ETT, and to correct the misconception that ETT is somehow “wrong” because the typing judgement is not decidable.  The best way to understand the proper relationship between the two is to place them into the broader context of homotopy type theory, or HTT.  From the point of view of homotopy type theory ITT and ETT represent extremal points along a spectrum of type theories, which is to say a spectrum of conceptions of mathematical construction in Brouwer’s sense.  Extensional type theory is the theory of homotopy sets, or hSets for short, which are spaces that are homotopically discrete, meaning that the only path (evidence for equivalence) between two elements is the trivial self-loop of an element with itself.  Therefore if we have a path between x and y in A, which is to say a proof that they are equivalent, then x and y are equal, and hence interchangeable in all contexts.  The bulk of everyday mathematics takes place within the universe of hSets, and hence is most appropriately expressed within ETT, and experience has borne this out.  But it is also interesting to step outside of this framework and consider richer conceptions of type.
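
In Lean-style notation, the discreteness condition defining an hSet is the familiar uniqueness-of-identity-proofs property (a sketch, reading identity proofs as paths):

-- A type is an hSet when any two paths (identity proofs) between the
-- same endpoints are themselves equal: the space is homotopically discrete.
def isHSet (A : Type) : Prop :=
  ∀ (x y : A) (p q : x = y), p = q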

For example, as soon as we introduce universes, one is immediately confronted with the need to admit types that are not hSets.  A universe of hSets naturally includes non-trivial paths between its elements witnessing their isomorphism as hSets, and hence their interchangeability in all contexts.  Taking a single universe of hSets as the sole source of such additional structure leads to (univalent) two-dimensional type theory.  In this terminology ETT is then to be considered as one-dimensional type theory.  Universes are not the only source of higher dimensionality.  For example, the interval has two elements, 0 and 1, connected by a path, the segment between them, which may be seen as evidence for their interchangeability (we can slide them along the segment one to the other).  Similarly, the circle S^1 is a two-dimensional inductively defined type with one element, a base point, and one path, a non-reflexive self-loop from the base point to itself.  It is now obvious that one may consider three-dimensional type theory, featuring types such as S^2, the sphere, and so forth.  Continuing this through all finite dimensions, and imposing discreteness at no dimension, we obtain type theory of unbounded dimension, which is just ITT (type theory with no discreteness at any dimension).
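
For reference, here is a sketch of the generating data of these two types, in the Id-type notation used below (the induction principles, which are the real content, are omitted):

I : Type, with 0 : I, 1 : I, and seg : Id_I(0,1)

S^1 : Type, with base : S^1 and loop : Id_{S^1}(base,base)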

From this perspective one can see more clearly why it has proved so awkward to formalize everyday mathematics in ITT.  Most such work takes place in the universe of hSets, and makes no use of higher-dimensional structure.  The natural setting for such things is therefore ETT, the theory of types as homotopically discrete sets.  By formalizing such mathematics within ITT one is paying the full cost of higher-dimensionality without enjoying any of its benefits.  This neatly confirms experience with using NuPRL as compared to using Coq for formulating the mathematics of homotopy sets, and why even die-hard ITT partisans find themselves wanting to switch to ETT for doing real work (certain ideological commitments notwithstanding).  On the other hand, as higher-dimensional structure becomes more important to the work we are doing, something other than ETT is required.  One candidate is a formulation of type theory with explicit levels, representing the dimensionality restriction appropriate to the problem domain.  So work with discrete sets would take place within level 1, which is just extensional type theory.  Level 2 is two-dimensional type theory, and so forth, and the union of all finite levels is something like ITT.  To make this work requires that there be a theory of cumulativity of levels, a theory of resizing that allows us to move work at a higher level to a lower level at which it still makes sense, and a theory of truncation that allows suppression of higher-dimensional structure (generalizing proof irrelevance and “squash” types).

However this may turn out, it is clear that the resulting type theory will be far richer than merely the codification of the formal proofs of some logical system.  Types such as the geometric spaces mentioned above do not arise as the types of proofs of propositions, but rather are among the most basic of mathematical constructions, in complete accordance with Brouwer’s dictum.


Church’s Law

August 9, 2012

A new feature of this year’s summer school was a reduction in the number of lectures, and an addition of daily open problem sessions for reviewing the day’s material. This turned out to be a great idea for everyone, because it gave us more informal time together, and gave the students a better chance at digesting a mountain of material. It also turned out to be a bit of an embarrassment for me, because I posed a question off the top of my head for which I thought I had two proofs, neither of which turned out to be valid. The claimed theorem is, in fact, true, and one of my proofs is easily corrected to resolve the matter (the other, curiously, remains irredeemable for reasons I’ll explain shortly). The whole episode is rather interesting, so let me recount a version of it here for your enjoyment.

The context of the discussion was extensional type theory, or ETT, which is characterized by an identification of judgemental with propositional equality: if you can prove that two objects are equal, then they are interchangeable for all purposes without specific arrangement. The alternative, intensional type theory, or ITT, regards judgemental equality as definitional equality (symbolic evaluation), and gives computational meaning to proofs of equality of objects of a type, allowing in particular transport across two instances of a family whose indices are equal. NuPRL is an example of an ETT; CiC is an example of an ITT.

Within the framework of ETT, the principle of function extensionality comes “for free”, because you can prove it to hold within the theory. Function extensionality states that f=g:A\to B whenever x:A\vdash f(x)=g(x):B. That is, two functions are equal if they agree on all arguments (and, implicitly, respect equality of arguments). Function extensionality holds definitionally if your definitional equivalence includes the \eta and \xi rules, but in any case it does not have the same force as extensional equality. Function extensionality as a principle of equality cannot be derived in ITT, but must be added as an additional postulate (or derived from a stronger postulate, such as univalence or the existence of a one-dimensional interval type).
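
For concreteness, here is the statement in Lean notation. (Lean happens to derive funext from its quotient types; in pure ITT it would have to be postulated, as just said.)

-- Function extensionality as a type: pointwise equal functions are equal.
example {A B : Type} (f g : A → B)
    (h : ∀ x : A, f x = g x) : f = g :=
  funext h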

Regardless of whether we are working in an extensional or an intensional theory, it is easy to see that all functions of type N\to N definable in type theory are computable. For example, we may show that all such functions may be encoded as recursive functions in the sense of Kleene, or in a more modern formulation we may give a structural operational semantics that provides a deterministic execution model for such functions (given n:N, run f:N\to N on n until it stops, and yield that as result). Of course the proof relies on some fairly involved meta-theory, but it is all constructively valid (in an informal sense) and hence provides a legitimate computational interpretation of the theory. Another way to say the same thing is to say that the comprehension principles of type theory are such that every object deemed to exist has a well-defined computational meaning, so it follows that all functions defined within it are going to be computable.
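
As a toy illustration of the kind of deterministic execution model meant here (a hypothetical fragment of my own, not the actual meta-theory), every closed term of a small numeric language evaluates to a numeral:

-- A toy term language and its evaluator: a sketch of the claim that
-- every object deemed to exist has a well-defined computational meaning.
inductive Tm where
  | zero
  | succ (t : Tm)
  | plus (s t : Tm)

def eval : Tm → Nat
  | .zero     => 0
  | .succ t   => eval t + 1
  | .plus s t => eval s + eval t

-- Example: eval (.plus (.succ .zero) (.succ .zero)) = 2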

This is all just another instance of Church’s Law, the scientific law stating that any formalism for defining computable functions will turn out to be equivalent to, say, the λ-calculus when it comes to definability of number-theoretic functions. (Ordinarily Church’s Law is called Church’s Thesis, but for reasons given in my Practical Foundations book, I prefer to give it the full status of a scientific law.) Type theory is, in this respect, no better than any other formalism for defining computable functions. By now we have such faith in Church’s Law that this remark is completely unsurprising, even boring to state explicitly.

So it may come as a surprise to learn that Church’s Law is false. I’m being provocative here, so let me explain what I mean before I get flamed to death on the internet. The point I wish to make is that there is an important distinction between the external and the internal properties of a theory. For example, in first-order logic the Löwenheim-Skolem Theorem tells us that if a first-order theory has an infinite model, then it has a countable model. This implies that there is a model of ZF set theory with only countably many sets, even though internally to ZF set theory we can carry out Cantor’s argument to show that the powerset operation takes us to exponentially higher cardinalities far beyond the countable. One may say that the “reason” is that the evidence for the countability of sets is a bijection that is not definable within the theory, so that the theory cannot “understand” its own limitations. This is a good thing.

The situation with Church’s Law in type theory is similar. Externally we know that every function on the natural numbers is computable. But what about internally? The internal statement of Church’s Law is this: \Pi f:N\to N.\Sigma n:N. n\Vdash f, where the notation n\Vdash f means, informally, that n is the code of a program that, when executed on input m:N, evaluates to f(m). In Kleene’s original notation n\Vdash f would be rendered as \Pi m:N.\Sigma p:N.T(n,m,p)\wedge Id(U(p),f(m)), where the T predicate encodes the operational semantics, and the U function extracts the answer from a successful computation. Note that the expansion makes use of the identity type at the type N. The claim is that Church’s Law, stated as a type (proposition) within ETT, is false, which is to say that it entails a contradiction.
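
A Lean-flavoured rendering of the internal statement may help fix the notation. Here T and U are assumed as axioms standing for Kleene’s primitive recursive predicate and result-extraction function, and Σ' is the strong (type-level) sum, so that the axiom of choice discussed next will apply:

-- Assumed: Kleene's T predicate and result-extraction function U.
axiom T : Nat → Nat → Nat → Prop  -- T n m p: p codes a halting run of program n on input m
axiom U : Nat → Nat               -- U p: the output extracted from run p

-- n ⊩ f: program n tracks the function f.
def Tracks (n : Nat) (f : Nat → Nat) : Type :=
  ∀ m : Nat, Σ' p : Nat, T n m p ∧ U p = f m

-- The internal statement of Church's Law (the type claimed to be empty):
def ChurchsLaw : Type :=
  ∀ f : Nat → Nat, Σ' n : Nat, Tracks n f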

When I posed this as an exercise at the summer school, I had in mind two different proofs, which I will now sketch. Neither is valid, but there is a valid proof that I’ll come to afterwards.

Both proofs begin by applying the so-called Axiom of Choice. For those not familiar with type theory, the “axiom” of choice is in fact a theorem, stating that every total binary relation contains a function. Explicitly,

(\Pi x:A.\Sigma y:B.R(x,y)) \to \Sigma f:A\to B.\Pi x:A.R(x,f(x)).
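
A minimal Lean sketch (using the strong sum Σ', as above) shows why this is a theorem rather than an axiom: the hypothesis already carries its witnesses, and the choice function simply projects them out.

-- The "axiom" of choice as a two-line program.
def choice {A B : Type} {R : A → B → Prop}
    (h : ∀ x : A, Σ' y : B, R x y) :
    Σ' f : A → B, ∀ x : A, R x (f x) :=
  ⟨fun x => (h x).1, fun x => (h x).2⟩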

The function f is the “choice function” that associates a witness to the totality of R to each argument x. In the present case if we postulate Church’s Law, then by the axiom of choice we have

\Sigma F:(N\to N)\to N.\Pi f:N\to N. F(f)\Vdash f.

That is, the functional F picks out, for each function f in N\to N, a (code for a) program that witnesses the computability of f. This should already seem suspicious, because by function extensionality the functional F must assign the same program to any two extensionally equal functions.

We may easily see that F is injective, for if F(f) is F(g), then the same code tracks both f and g, and hence f and g are (extensionally) equal. Thus we have an injection from N\to N into N, which seems “impossible” … except that it is not! Let’s try the proof that this is impossible, and see where it breaks down. Suppose that i:(N\to N)\to N is injective. Define d(x)=i^{-1}(x)(x)+1, and consider d(i(d))=i^{-1}(i(d))(i(d))+1=d(i(d))+1, so 0=1 and we are done. Not so fast! Since i is only injective, and not necessarily surjective, it is not clear how to define i^{-1}. The obvious idea is to send x=i(f) to f, and any x outside the image of i to, say, the identity. But there is no reason to suppose that the image of i is decidable, so the attempted definition breaks down. I hacked around with this for a while, trying to exploit properties of F to repair the proof (rather than work with a general injection, focus on the specific functional F), but failed. Andrej Bauer pointed out to me, to my surprise, that there is a model of ETT (which he constructed) that contains an injection of N\to N into N! So there is no possibility of rescuing this line of argument.

(Incidentally, we can show within ETT that there is no bijection between N and N\to N, using surjectivity to rescue the proof attempt above. Curiously, Lawvere has shown that there can be no surjection from N onto N\to N, but this does not seem to help in the present situation. This shows that the concept of countability is more subtle in the constructive setting than in the classical setting.)

But I had another argument in mind, so I was not worried. The functional F provides a decision procedure for equality for the type N\to N: given f,g:N\to N, compare F(f) with F(g). Surely this is impossible! But one cannot prove within type theory that \textrm{Id}_{N\to N}(-,-) is undecidable, because type theory is consistent with the law of the excluded middle, which states that every proposition is decidable. (Indeed, type theory proves that excluded middle is irrefutable for any particular proposition P: \neg\neg(P\vee\neg P).) So this proof also fails!
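
For the record, the step from F to the decision procedure is unproblematic: any congruence-respecting injection into N yields decidable equality, because equality of natural numbers is decidable. A sketch in Lean, with the injectivity of F taken as an assumption:

-- From an injection into Nat to a decision procedure for equality.
def decEqOfInj {A : Type} (F : A → Nat)
    (inj : ∀ {x y : A}, F x = F y → x = y) :
    DecidableEq A :=
  fun x y =>
    match Nat.decEq (F x) (F y) with
    | .isTrue h  => .isTrue (inj h)                         -- equal codes: equal functions
    | .isFalse h => .isFalse (fun e => h (congrArg F e))    -- unequal codes: F respects equality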

At this point it started to seem as though Church’s Law could be independent of ETT, as startling as that sounds. For ITT it is more plausible: equality of functions is definitional, so one could imagine associating an index with each function without disrupting anything. But for ETT this seemed implausible to me. Andrej pointed me to a paper by Maietti and Sambin that states that Church’s Law is incompatible with function extensionality and choice. So there must be another proof that refutes Church’s Law, and indeed there is one based on the aforementioned decidability of function equivalence (but with a slightly different line of reasoning than the one I suggested).

First, note that we can use the equality test for functions in N\to N to check for halting. Using the T predicate described above, we can define a function that is constantly 0 iff a given (code of a) program never halts on a given input. We may then use the above-mentioned equality test, comparing that function against the constantly-0 function, to check for halting. So, to complete the refutation of the internal form of Church’s Law, it suffices to show that the halting problem for (codes of) programs and inputs is not computable.
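
Continuing the Lean sketch above, here is the reduction, with decidability of T assumed (it is, being primitive recursive):

-- Assumed: T is decidable.
axiom decT : (e m p : Nat) → Decidable (T e m p)

-- haltWitness e m is the constant-zero function exactly when program e
-- never halts on input m, so an equality test against (fun _ => 0)
-- decides halting.
def haltWitness (e m : Nat) : Nat → Nat :=
  fun p => @ite Nat (T e m p) (decT e m p) 1 0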

Specifically, assume given h:N\times N\to N that, given a code for a function and an input, yields 0 or 1 according to whether or not that function halts when applied to that input. Define d:N\to N by \lambda x:N.\neg h(x,x), the usual diagonal function, where \neg is the Boolean complement on the values 0 and 1. Now apply the functional F obtained from Church’s Law using the Axiom of Choice to obtain n=F(d), the code for the function d, and consider h(n,n) to derive the needed contradiction. Notice that we have used Church’s Law here to obtain a code for the type-theoretic diagonal function, which is then passed to the halting tester in the usual way.

As you can see, the revised argument follows along lines similar to what I had originally envisioned (in the second version), but requires a bit more effort to push through the proof properly. (Incidentally, I don’t think the argument can be made to work in pure ITT, but perhaps it would go through for ITT enriched with function extensionality.)

Thus, Church’s Law is false internally to extensional type theory, even though it is evidently true externally for that theory. You can see the similarity to the situation in first-order logic described earlier. Even though all functions of type N\to N are computable, type theory itself is not capable of recognizing this fact (at least, not in the extensional case). And this is a good thing, not a bad thing! The whole beauty of constructive mathematics lies in the fact that it is just mathematics, free of any self-conscious recognition that we are writing programs when proving theorems constructively. We never have to reason about machine indices or any such nonsense, we just do mathematics under the discipline of not assuming that every proposition is decidable. One benefit is that the same mathematics admits interpretation not only in terms of computability, but also in terms of continuity in topological spaces, establishing a deep connection between two seemingly disparate topics.

(Hat tip to Andrej Bauer for help in sorting all this out. Here’s a link to a talk and a paper about the construction of a model of ETT in which there is an injection from N\to N to N.)

Update: word-smithing.

