Well, nothing, really, except that I recently learned (from Guy Blelloch) the origin of the notably inapt term *dynamic programming *for a highly useful method of memoization invented by Richard Bellman that is consonant with my suspicion. Bellman, it turns out, had much the same thought as mine about the appeal of the word “dynamic”, and used it consciously to further his own ends:

“I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes.

“An interesting question is, ‘Where did the name, dynamic programming, come from?’ The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word, research. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term, research, in his presence. You can imagine how he felt, then, about the term, mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose? In the first place I was interested in planning, in decision making, in thinking. But planning, is not a good word for various rea- sons. I decided therefore to use the word, ‘programming.’ I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying—I thought, let’s kill two birds with one stone. Let’s take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that is it’s impossible to use the word, dynamic, in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities” (p. 159).

Hilarious, or what? It explains a lot, I must say, and confirms a long-standing suspicion of mine about the persistent belief in a non-existent opposition.

Filed under: Research, Teaching Tagged: dynamic and static typing ]]>

I’ve recently gotten an inkling of why it might be that many people equate the two concepts (or see no point in distinguishing them). This post is an attempt to clear up what I perceive to be a common misunderstanding that seems to explain it. It’s hard for me to say whether it really is all that common of a misunderstanding, but it’s the impression I’ve gotten, so forgive me if I’m over-stressing an obvious point. In any case I’m going to try for a “bottom up” explanation that might make more sense to some people.

The issue is scheduling.

The naive view of parallelism is that it’s just talk for concurrency, because all you do when you’re programming in parallel is fork off some threads, and then do something with their results when they’re done. I’ve previously argued that this is the wrong way to think about parallelism (it’s really about cost), but let’s just let that pass. It’s unarguably true that a parallel computation does consist of a bunch of, well, parallel computations. So, the argument goes, it’s nothing but concurrency, because concurrency is, supposedly, all about forking off some threads and waiting to see what they do, and then doing something with it. I’ve argued that that’s not a good way to think about concurrency either, but we’ll let that pass too. So, the story goes, concurrency and parallelism are synonymous, and bullshitters like me are just trying to confuse people and make trouble.

Being the troublemaker that I am, my response is, predictably, **no , just no**. Sure, it’s kinda sorta right, as I’ve already acknowledged, but not really, and here’s why: scheduling as you learned about it in OS class (for example) is an altogether different thing than scheduling for parallelism. And this is the heart of the matter, from a “bottom-up” perspective.

There are two aspects of OS-like scheduling that I think are relevant here. First, it is **non-deterministic**, and second, it is **competitive**. Non-deterministic, because you have little or no control over what runs when or for how long. A beast like the Linux scheduler is controlled by a zillion “voodoo parameters” (a turn of phrase borrowed from my queueing theory colleague, Mor Harchol-Balter), and who the hell knows what is going to happen to your poor threads once they’re in its clutches. Second, and more importantly, an OS-like scheduler is allocating resources **competitively**. You’ve got your threads, I’ve got my threads, and we both want ours to get run as soon as possible. We’ll even pay for the privilege (priorities) if necessary. The scheduler, and the queueing theory behind it (he says optimistically) is designed to optimize resource usage on a competitive basis, taking account of quality of service guarantees purchased by the participants. It does not matter whether there is one processor or one thousand processors, the schedule is unpredictable. That’s what makes concurrent programming hard: you have to program against all possible schedules. And that’s why you can’t prove much about the time or space complexity of your program when it’s implemented concurrently.

Parallel scheduling is a whole ‘nother ball of wax. It is (usually, but not necessarily) **deterministic**, so that you can prove bounds on its efficiency (Brent-type theorems, as I discussed in my previous post and in PFPL). And, more importantly, it is **cooperative** in the sense that all threads are working together for the same computation towards the same ends. The threads are scheduled so as to get the job (there’s only one) done as quickly and as efficiently as possible. Deterministic schedulers for parallelism are the most common, because they are the easiest to analyze with respect to their time and space bounds. **Greedy** schedulers, which guarantee to maximize use of available processors, never leaving any idle when there is work to be done, form an important class for which the simple form of Brent’s Theorem is obvious.

Many deterministic greedy scheduling algorithms are known, of which I will mention *p*-DFS and *p*-BFS, which do *p*-at-a-time depth- and breadth-first search of the dependency graph, and various forms of work-stealing schedulers, pioneered by Charles Leiserson at MIT. (Incidentally, if you don’t already know what *p*-DFS or *p*-BFS are, I’ll warn you that they are a little trickier than they sound. In particular *p*-DFS uses a data structure that is sort of like a stack but *is not a stack*.) These differ significantly in their time bounds (for example, work stealing usually involves expectation over a random variable, whereas the depth- and breadth traversals do not), and differ *dramatically* in their space complexity. For example, *p*-BFS is absolutely dreadful in its space complexity. For a full discussion of these issues in parallel scheduling, I recommend Dan Spoonhower’s PhD Dissertation. (His semantic profiling diagrams are amazingly beautiful and informative!)

So here’s the thing: when you’re programming in parallel, you don’t just throw some threads at some non-deterministic competitive scheduler. Rather, you generate an implicit dependency graph that a cooperative scheduler uses to maximize efficiency, end-to-end. At the high level you do an asymptotic cost analysis without considering platform parameters such as the number of processors or the nature of the interconnect. At the low level the implementation has to validate that cost analysis by using clever techniques to ensure that, once the platform parameters are known, maximum use is made of the computational resources to get your job done for you as fast as possible. Not only are there no bugs introduced by the mere fact of being scheduled in parallel, but even better, you can prove a theorem that tells you how fast your program is going to run on a real platform. Now how cool is that?

Filed under: Programming, Research Tagged: concurrency, parallelism ]]>

The advantage of a total programming language such as Goedel’s **T** is that it ensures, by type checking, that every program terminates, and that every function is total. There is simply no way to have a well-typed program that goes into an infinite loop. This may seem appealing, until one considers that the upper bound on the time to termination can be quite large, so large that some terminating programs might just as well diverge as far as we humans are concerned. But never mind that, let us grant that it is a virtue of **T** that it precludes divergence.

Why, then, bother with a language such as **PCF** that does not rule out divergence? After all, infinite loops are invariably bugs, so why not rule them out by type checking? (Don’t be fooled by glib arguments about useful programs, such as operating systems, that “run forever”. After all, infinite streams are programmable in the language **M** of inductive and coinductive types in which all functions terminate. Computing infinitely does not mean running forever, it just means “for as long as one wishes, without bound.”) The notion does seem appealing until one actually tries to write a program in a language such as **T**.

Consider computing the greatest common divisor (GCD) of two natural numbers. This can be easily programmed in **PCF** by solving the following equations using general recursion:

The type of defined in this manner has partial function type , which suggests that it may not terminate for some inputs. But we may prove by induction on the sum of the pair of arguments that it is, in fact, a total function.

Now consider programming this function in **T**. It is, in fact, programmable using only primitive recursion, but the code to do it is rather painful (try it!). One way to see the problem is that in **T** the only form of looping is one that reduces a natural number by one on each recursive call; it is not (directly) possible to make a recursive call on a smaller number other than the immediate predecessor. In fact one may code up more general patterns of terminating recursion using only primitive recursion as a primitive, but if you examine the details, you will see that doing so comes at a significant price in performance and program complexity. Program complexity can be mitigated by building libraries that codify standard patterns of reasoning whose cost of development should be amortized over all programs, not just one in particular. But there is still the problem of performance. Indeed, the encoding of more general forms of recursion into primitive recursion means that, deep within the encoding, there must be “timer” that “goes down by ones” to ensure that the program terminates. The result will be that programs written with such libraries will not be nearly as fast as they ought to be. (It is actually quite fun to derive “course of values” recursion from primitive recursion, and then to observe with horror what is actually going on, computationally, when using this derived notion.)

But, one may argue, **T** is simply not a serious language. A more serious total programming language would admit sophisticated patterns of control without performance penalty. Indeed, one could easily envision representing the natural numbers in binary, rather than unary, and allowing recursive calls to be made by halving to achieve logarithmic complexity. This is surely possible, as are numerous other such techniques. Could we not then have a practical language that rules out divergence?

We can, but at a cost. One limitation of total programming languages is that they are not universal: you cannot write an interpreter for **T** within **T **(see Chapter 9 of PFPL for a proof). More importantly, this limitation extends to any total language whatever. If this limitation does not seem important, then consider the *Blum Size Theorem (BST) *(from 1967), which places a very different limitation on total languages. Fix *any* total language, **L**, that permits writing functions on the natural numbers. Pick any blowup factor, say , or however expansive you wish to be. The BST states that there is a total function on the natural numbers that is programmable in **L**, but whose shortest program in **L** is larger by the given blowup factor than its shortest program in **PCF**!

The underlying idea of the proof is that *in a total language the proof of* *termination of a program must be baked into the code itself*, whereas *in* *a partial language the termination proof is an external verification condition* *left to the programmer*. Roughly speaking, there are, and always will be, programs whose termination proof is rather complicated to express, if you fix in advance the means by which it may be proved total. (In **T** it was primitive recursion, but one can be more ambitious, yet still get caught by the BST.) But if you leave room for ingenuity, then programs can be short, precisely because they do not have to embed the proof of their termination in their own running code.

There are ways around the BST, of course, and I am not saying otherwise. For example, the BST merely guarantees the existence of a bad case, so one can always argue that such a case will never arise in practice. Could be, but I did mention the GCD in **T** problem for a reason: there are natural problems that are difficult to express in a language such as** T**. By fixing the possible termination arguments in advance, one is tempting fate, for there are many problems, such as the Collatz Conjecture, for which the termination proof of a very simple piece of code has been an open problem for decades, and has resisted at least some serious attempts on it. One could argue that such a function is of no practical use. I agree, but I point out the example not to say that it is useful, but to say that it is likely that its eventual termination proof will be quite nasty, and that this will have to be reflected in the program itself if you are limited to a **T**-like language (rendering it, once again, useless). For another example, there is no inherent reason why termination need be assured by means similar to that used in **T**. We got around this issue in NuPRL by separating the code from the proof, using a type theory based on a partial programming language, not a total one. The proof of termination is still required for typing in the core theory (but not in the theory with “bar types” for embracing partiality). But it’s not baked into the code itself, affecting its run-time; it is “off to the side”, large though it may be).

*Updates:* word smithing,* *fixed bad link, corrected gcd, removed erroneous parenthetical reference to Coq, fixed LaTeX problems.

Filed under: Programming, Research, Teaching Tagged: functional programming, primitive recursion, programming languages, type theory, verification ]]>

I have spent the last couple of months recovering from the surgery itself, which involved a substantial incision, monitoring my progress and the health of the organ, and adjusting the immune suppressants to achieve a balance between avoiding rejection and inviting infection. As a pay-it-forward gesture I volunteered to be a subject in a study of a new immune suppressant, ASKP1240, that is being developed specifically for kidney transplant patients. All this means frequent visits to the transplant center and frequent blood draws to measure my kidney function and medication levels. The recovery and monitoring has kept me operating at a reduced level, including neglecting my blog.

The good news is that Tony and I have fully recovered from surgery, and I am happy to say that I am enjoying an optimal outcome so far. The transplanted organ began working immediately (this is not always the case), and within two days I had gotten back ten years of kidney function. By now I am at completely normal blood levels, with no signs of kidney disease, and no signs of rejection or other complications. Tony gave me a really good organ, and my body seems to be accepting it so far. I’m told that the first six months are determinative, so I expect to have a pretty solid sense of things by the summer.

I would like to say that among the many things I’ve learned and experienced these last few months, one is an appreciation for the importance of organ donation. Every year hundreds of people die from kidney disease for lack of a suitable organ. If someone is limited to the national cadaverous donors list, it can take more than two years to find an acceptable organ, during which time people often die waiting. Another complication is that someone may have a willing donor, perhaps a family member, with whom they are incompatible. There are now several living donor exchange networks that arrange chains of organ swaps (as many as 55 simultaneously, I’m told!) so that everyone gets a compatible organ. But to be part of such an exchange, you must have a living donor.

Living donation is a daunting prospect for many. It does, after all, involve major surgery, and therefore presents a health risk to the donor. On the other hand nature has provided that we can survive perfectly well on one kidney, and a donation is literally the difference between life and death for the recipient. The trade-off is, in objective terms, clearly in favor of donation, but the scarcity of organs makes clear that not everyone, subjectively, reaches the same conclusion. As an organ recipient, allow me to plead with you to consider becoming an organ donor, at least upon your death, and perhaps even as a living donor for those cases, such as kidney donation, where it is feasible. Should you donate, and find yourself in need of an organ later in life, you will receive top priority among all recipients for a donated organ. The donation process is entirely cost-free to the donor in monetary terms, but pays off big-time in terms of one’s personal satisfaction at having saved someone’s life.

Filed under: Uncategorized Tagged: kidney transplant, organ donation ]]>

Filed under: Research, Teaching Tagged: homotopy type theory, hott, hott book ]]>

As I mentioned, perhaps the best “definition” that is usually offered is to say that “declarative” is synonymous with “functional-and-logic-programming”. This is pretty unsatisfactory, since it is not so easy to define these terms either, and because, contrary to conventional classifications, the two concepts have pretty much nothing in common with each other (but for one thing to be mentioned shortly). The propositions-as-types principle helps set them clearly apart: whereas functional programming is about executing proofs, logic programming is about the search for proofs. Functional programming is based on the dynamics of proof given by Gentzen’s inversion principle. Logic programming is based on the dynamics of provability given by cut elimination and focusing. The two concepts of computation could not be further apart.

Yet they *do* have one thing in common that is usefully isolated as fundamental to what we mean by “declarative”, namely *the concept of a variable*. Introduced by the ancient Hindu and Muslim mathematicians, Brahmagupta and al Kwharizmi, the variable is one of the most remarkable achievements of the human intellect. In my previous post I had secretly hoped that someone would propose variables as being central to what we mean by “declarative”, but no one did, at least not in the comments section. My unstated motive for writing that post was not so much to argue that the term “declarative” is *empty*, but to test the hypothesis that few seem to have grasp the importance of this concept for designing a civilized, and broadly applicable, programming language.

My contention is that *variables*, properly so-called, are what distinguish “declarative” languages from “imperative” languages. Although the imperative languages, including all popular object-oriented languages, are based on a concept that is *called* a variable, they lack anything that actually *is* a variable. And this is where the trouble begins, and the need for the problematic distinction arises. The declarative concept of a variable is the *mathematical* concept of an *unknown* that is given meaning by substitution. The imperative concept of a variable, arising from low-level machine models, is instead given meaning by assignment (mutation), and, by a kind of a notational pun, allowed to appear in expressions in a way that resembles that of a proper variable. But the concepts are so fundamentally different, that I argue in *PFPL* that the imperative concept be called an “assignable”, which is more descriptive, rather than “variable”, whose privileged status should be emphasized, not obscured.

The problem with purely imperative programming languages is that they have *only* the concept of an assignable, and attempt to make it serve also as a concept of variable. The results are a miserable mess of semantic and practical complications. Decades of work has gone into rescuing us from the colossal mistake of identifying variables with assignables. And what is the outcome? If you want to reason about assignables, what you do is (a) write a mathematical formulation of your algorithm (using variables, of course) and (b) show that the imperative code simulates the functional behavior so specified. Under this methodology the mathematical formulation is taken as *self-evidently* correct, the standard against which the imperative program is judged, and is not itself in need of further verification, whereas the imperative formulation is, invariably, in need of *verification*.

What an odd state of affairs! The functional “specification” is itself a perfectly good, and apparently self-evidently correct, program. So why not just write the functional (*i.e.*, mathematical) formulation, and call it a day? Why indeed! Declarative languages, being grounded in the language of mathematics, allow for the *identification* of the “desired behavior” with the “executable code”. Indeed, the propositions-as-types principle elevates this identification to a fundamental organizing principle: propositions are types, and proofs are programs. Who needs verification? Once you have a mathematical specification of the behavior of a queue, say, you already have a running program; there is no need to relegate it to a stepping stone towards writing an awkward, and invariably intricate, imperative formulation that then requires verification to ensure that it works properly.

Functional programming languages are written in the universally applicable language of mathematics as expressed by the theory of types. Such languages are therefore an integral part of science itself, inseparable from our efforts to understand and master the workings of the world. Imperative programming has no role to play in this effort, and is, in my view, doomed in the long run to obsolescence, an artifact of engineering, rather than a fundamental discovery on a par with those of mathematics and science.

This brings me to my main point, the popular concept of a *domain-specific language*. Very much in vogue, DSL’s are offered as the solution to many of our programming woes. And yet, to borrow a phrase from my colleague Guy Blelloch, the elephant in the room is the question “what is a domain?”. I’ve yet to hear anyone explain how you decide what are the boundaries of a “domain-specific” language. Isn’t the “domain” mathematics and science itself? And does it not follow that the right language must be the language of mathematics and science? How can one rule out *anything* as being irrelevant to a “domain”? I think it is impossible, or at any rate inadvisable, to make such restrictions *a priori*. Indeed, full-spectrum functional languages are already the world’s best DSL’s, precisely because they are (or come closest to being) the very language of science, the ultimate “domain”.

Filed under: Research Tagged: assignables, declarative language, domain-specific language, variables ]]>

Or so I had thought. Earlier this week I attended a thriller of an NSF-sponsored workshop on high-level programming models for parallelism, where I was surprised by the declarative zombie once again coming to eat our brains. This got me to thinking, again, about whether the term has any useful meaning. For what it’s worth, and perhaps to generate useful debate, here’re some things that I think people mean, and why I don’t think they mean very much.

- “Declarative” means “high-level”. This just seems to replace one vague term by another.
- “Declarative” means “not imperative”. But this flies in the face of reality. Functional languages embrace and encompass imperative programming as a special case, and even Prolog has imperative features, such as assert and retract, that have imperative meaning.
- “Declarative” means “functional”. OK, but then we don’t really need another word.
- “Declarative” means “what, not how”. But every language has an operational semantics that defines how to execute its programs, and you must be aware of that semantics to understand programs written in it. Haskell has a definite evaluation order, just as much as ML has a different one, and even Prolog execution is defined by a clear operational semantics that determines the clause order and that can be influenced by “cut”.
- “Declarative” means “equational”. This does not distinguish anything, because there is a well-defined notion of equivalence for
*any*programming language, namely observational equivalence. Different languages induce different equivalences, of course, but how does one say that one equivalence is “better” than another? At any rate, I know of no stress on equational properties of logic programs, so either logic programs are not “declarative” or “equational reasoning” is not their defining characteristic. - “Declarative” means “referentially transparent”. The misappropriation of Quine’s terminology only confuses matters. All I’ve been able to make of this is that “referentially transparent” means that beta-equivalence is valid. But beta equivalence is not a property of an arbitrary programming language, nor in any case is it clear why this equivalence is first among equals. In any case why you would decide
*a priori*on what equivalences you want before you even know what it means to run a program? - “Declarative” means “has a denotation”. This gets closer to the point, I think, because we might well say that a declarative
*semantics*is one that gives meaning to programs as some kind of mapping between some sort of spaces. In other words, it would be a synonym for “denotational semantics”. But every language has a denotational semantics (perhaps by interpretation into a Kripke structure to impose sequencing), so having one does not seem to distinguish a useful class of languages. Moreover, even in the case of purely functional programs, the usual denotational semantics (as continuous maps) is not fully abstract, and the fully abstract semantics (as games) is highly operational. Perhaps a language is declarative in proportion to being able to give it semantics in some “familiar” mathematical setting? - “Declarative” means “implicitly parallelizable“. This was certainly the sense intended at the NSF meeting, but no two “declarative” languages seemed to have much in common. Charlie Garrod proposes just “implicit”, which is pretty much synonymous with “high level”, and may be the most precise sense there is to the term.

No doubt this list is not exhaustive, but I think it covers many of the most common interpretations. It seems to me that none of them have a clear meaning or distinguish a well-defined class of languages. Which leads me to ask, is there any such thing as a declarative programming language?

[Thanks to the late Steve Gould for inspiring the title of this post.]

[*Update*: wordsmithing.]

[*Update*: add a remark about semantics, add another candidate meaning.]

[*Update*: added link to previous post.]

Filed under: Research, Teaching Tagged: declarative language, functional programming, imperative programming ]]>

To my way of thinking, the denial of the universal validity of the excluded middle is not the *defining* feature of constructivity, but rather a *characteristic* feature of constructivity—it is the smoke, not the locomotive. But what, then, is the deeper meaning of constructivity that gives rise to the denial of many classical patterns of reasoning, such as proof by contradiction or reasoning by cases on whether a proposition is true or not? Although the full story is not yet completely clear, a necessary condition is that a constructive theory be *proof relevant*, meaning that proofs are mathematical objects like any other, and that they play a role in the development of constructive mathematics unlike their role in classical mathematics.

The most obvious manifestation of proof relevance is the defining characteristic of HoTT, that *proofs of equality* correspond to *paths in a space*. Paths may be thought of as evidence for the equality of their endpoints. That this is a good notion of equality follows from the homotopy invariance of the constructs of type theory: everything in sight respects paths (that is, respect the groupoid structure of types). More generally, theorems in HoTT tend to characterize the space of proofs of a proposition, rather than simply state that the corresponding type is inhabited. For example, the univalence axiom itself states an equivalence between proofs of equivalence of types in a universe and equivalences between these types. This sort of reasoning may take some getting used to, but its beauty is, to my way of thinking, undeniable. Classical modes of thought may be recovered by explicitly obliterating the structure of proofs using truncation. Sometimes this is the best or only available way to state a theorem, but usually one tends to say more than just that a type is inhabited, or that two types are mutually inhabited. In this respect the constructive viewpoint *enriches*, rather than *diminishes*, classical mathematics, a point that even the greatest mathematician of the 20th century, David Hilbert, seems to have missed.

The concept of proof relevance in HoTT seems to have revived a very common misunderstanding about the nature of proofs. Many people have been trained to think that a proof is a derivation in an axiomatic theory, such as set theory, a viewpoint often promoted in textbooks and bolstered by the argument that an informal proof can always be written out in full in this form, even if we don’t do that as a matter of course. It is a short step from there to the conclusion that proofs are therefore mathematical objects, even in classical set theory, because we can treat the derivations as elements of an inductively defined set (famously, the set of natural numbers, but more realistically using more natural representations of abstract syntax such as the s-expression formalism introduced by McCarthy in 1960 for exactly this purpose). From this point of view many people are left confused about the stress on “proofs as mathematical objects” as a defining characteristic of HoTT, and wonder what could be original about that.

The key to recognize that *a proof is not a formal proof*. To avoid further confusion, I hasten to add that by “formal” I do not mean “rigorous”, but rather “represented in a formal system” such as the axiomatic theory of sets. A *formal proof* is an element of a computably enumerable set generated by the axioms and rules of the formal theory. A *proof* is an argument that demonstrates the truth of a proposition. While formal proofs are always proofs (at least, under the assumption of consistency of the underlying formal theory), a proof need not be, or even have a representation as, a formal proof. The principal example of this distinction is Goedel’s Theorem, which proves that the computably enumerable set of formal provable propositions in axiomatic arithmetic is not decidable. The key step is to devise a self-referential proposition that (a) is not formally provable, but (b) has a proof that shows that it is true. The crux of the argument is that *once you fix the rules of proof, you automatically miss out true things that are not provable in that fixed system*.

Now comes the confusing part. HoTT is defined as a formal system, so why doesn’t the same argument apply? It does, pretty much verbatim! But this has no bearing on “proof relevance” in HoTT, because the proofs that are relevant are not the formal proofs (derivations) defining HoTT as a formal system. Rather proofs are formulated internally as objects of the type theory, and there is no commitment *a priori* to being the only forms of proof there are. Thus, for example, we may easily see that there are only countably many functions definable in HoTT from the outside (because it is defined by a formal system), but within the theory any function space on an infinite type has uncountably many elements. There is no contradiction, because the proofs of implications, being internal functions, are not identified with codes of formal derivations, and hence are not denumerable.

There is a close analogy, previously noted in this blog, with Church’s Law. Accepting Church’s Law internally amounts to fixing the programming language used to define functions in advance, permitting us to show, for example, that certain programs are not expressible in that language. But HoTT does not commit to Church’s Law, so such arguments amount to showing that, for example, there is no Turing machine to decide halting for Turing machines, but allowing that there could be constructive functions (say, equipped with oracles) that make such decisions.

The theory of formal proofs, often called proof theory, was dubbed *metamathematics* by Kleene. Until the development of type theory the study of proofs was confined to metamathematics. But now in the brave new world of *constructive mathematics* as embodied in HoTT, proofs (not just formal proofs) have pride of place in mathematics, and provide opportunities for expressing concepts clearly and cleanly that were hitherto obscured or even hidden from our view.

*Update*: Corrected silly wording mistake.

Filed under: Research Tagged: constructive mathematics, homotopy type theory ]]>

But what is it about HoTT that makes all these interpretations and applications possible? What is the key idea that separates HoTT from other approaches that seek to achieve similar ends? What makes HoTT so special?

In a word the answer is *constructivity.* The distinctive feature of HoTT is that it is based on Per Martin-Löf’s Intuitionistic Theory of Types, which was formulated as a foundation for *intuitionistic mathematics* as originally put forth by Brouwer in the 1930′s, and further developed by Bishop, Gentzen, Heyting, Kolmogorov, Kleene, Lawvere, and Scott, among many others. Briefly put, the idea of type theory is to codify and systematize the concept of a *mathematical construction* by characterizing the abstract properties, rather than the concrete realizations, of the objects used in everyday mathematics. Brouwer’s key insight, which lies at the heart of HoTT, is that *proofs are a form of construction* no different in kind or character from numbers, geometric figures, spaces, mappings, groups, algebras, or any other mathematical structure. Brouwer’s dictum, which distinguished his approach from competing alternatives, is that *logic is a part of mathematics*, rather than *mathematics is an application of logic*. Because for him the concept of a construction, including the concept of a proof, is prior to any other form of mathematical activity, including the study of proofs themselves (*i.e.*, logic).

So under Martin-Löf’s influence HoTT starts with the notion of *type* as a classification of the notion of *construction*, and builds upwards from that foundation. Unlike competing approaches to foundations, *proofs are mathematical objects* that play a central role in the theory. This conception is *central* to the homotopy-theoretic interpretation of type theory, which enriches types to encompass spaces with higher-dimensional structure. Specifically, the type is the type of *identifications* of and within the space . Identifications may be thought of as *proofs* that and are *equal* as elements of $A$, or, equivalently, as *paths* in the space between points and . The fundamental principles of abstraction at the heart of type theory ensure that *all constructs of the theory respect these identifications*, so that we may treat them as proofs of equality of two elements. There are three main sources of identifications in HoTT:

- Reflexivity, stating that everything is equal to itself.
- Higher inductive types, defining a type by giving its points, paths, paths between paths, and so on to any dimension.
- Univalence, which states that an equivalence between types determines a path between them.

I will not attempt here to explain each of these in any detail; everything you need to know is in the HoTT book. But I will say a few things about their consequences, just to give a flavor of what these new principles give us.

Perhaps the most important conceptual point is that mathematics in HoTT emphasizes the *structure of proofs* rather than their mere existence. Rather than settle for a mere logical equivalence between two types (mappings back and forth stating that each implies the other), one instead tends to examine the *entire space* of proofs of a proposition and how it relates to others. For example, the univalence axiom itself does not merely state that every equivalence between types gives rise to a path between them, but rather that there is an *equivalence* between the type of equivalences between two types and the type of paths between them. Familiar patterns such as “ iff ” tend to become ““, stating that the proofs of and the proofs of are equivalent. Of course one may *choose *neglect this additional information, stating only weaker forms of it using, say, truncation to suppress higher-dimensional information in a type, but the tendency is to *embrace* the structure and characterize the space of proofs as fully as possible.

A close second in importance is the *axiomatic freedom* afforded by constructive foundations. This point has been made many times by many authors in many different settings, but has particular bite in HoTT. The theory does not commit to (nor does it refute) the infamous *Law of the Excluded Middle* for arbitrary types: the type need not always be inhabited. This property of HoTT is absolutely essential to its expressive power. Not only does it admit a wider range of interpretations than are possible with the Law included, but it also allows for the *selective imposition* of the Law where it is needed to recover a classical argument, or where it is important to distinguish the implications of decidability in a given situation. (Here again I defer to the book itself for full details.) Similar considerations arise in connection with the many forms of Choice that can be expressed in HoTT, some of which are outright provable, others of which are independent as they are in axiomatic set theory.

Thus, what makes HoTT so special is that it is a* constructive* theory of mathematics. Historically, this has meant that it has a *computational* interpretation, expressed most vividly by the *propositions as types* principle. And yet, for all of its promise, what HoTT currently lacks is a computational interpretation! What, exactly, does it mean to *compute* with higher-dimensional objects? At the moment it is difficult to say for sure, though there seem to be clear intuitions in at least some cases of how to “implement” such a rich type theory. Alternatively, one may ask whether the term “constructive”, when construed in such a general setting, must inevitably involve a notion of computation. While it seems obvious on computational grounds that the Law of the Excluded Middle should not be considered universally valid, it becomes less clear why it is so important to omit this Law (and, essentially, no other) in order to obtain the richness of HoTT when no computational interpretation is extant. From my point of view understanding the *computational *meaning of higher-dimensional type theory is of paramount importance, because, for me, type theory is and always has been a *theory of computation* on which the entire edifice of mathematics ought to be built.

Filed under: Research Tagged: homotopy theory, type theory ]]>

The book is freely available on the web in various formats, including a PDF version with active references, an ebook version suitable for your reading device, and may be purchased in hard- or soft-cover from Lulu. The book itself is open source, and is available at the Hott Book Git Hub. The book is under the Creative Commons CC BY-SA license, and will be freely available in perpetuity.

Readers may also be interested in the posts on Homotopy Type Theory, the n-Category Cafe, and Mathematics and Computation which describe more about the book and the process of its creation.

Filed under: Research Tagged: category theory, homotopy theory, type theory ]]>