The crucial clue that allowed Holmes to solve the Silver Blaze mystery was the dog that didn’t bark: the apparent murderer and horse thief was not a stranger, but the trainer himself, the only one who could have gotten into the stables unnoticed. As I’ve mentioned, this semester Dan Licata and I are co-teaching a new course in functional programming for first-year students at Carnegie Mellon. Apart from parallelism, wbich I’ve already discussed, we are placing a strong emphasis on verification and proof as tools for the practicing programmer, chief among these being the infamous hellhounds of Computer Science, induction and recursion. And I am happy to say that, as in the Holmes story, these are the dogs that didn’t bark. The results from homework and an in-class exam make clear that the vast majority of the class of about 80 students have mastered these crucial techniques with a minimum of fuss and bother. In this post I’d like to explain why I think we’ve succeeded where, apparently, so many others have failed.
One of the pleasures of teaching at Carnegie Mellon is that we the students are bright (among the brightest in the country according to the usual metrics), open-minded (they’re game for whatever I wish to throw at them), and hard-working (Carnegie’s motto is “My heart is in the work.”). I am sure that this is a big part of the story. Most probably our students could master almost anything, no matter how poorly we explain it! But it can’t be just that either. My colleagues at other top schools, both research universities and teaching colleges, public and private, American and international, have often spoken of their frustration and despair at teaching programming using recursion, let alone proof by induction, to beginning (or even advanced) students. Some even go so far as to say that it can’t be done, that while the brightest students “get it”, the less bright students never will, so we should just give up even trying. I’m not convinced. I can’t prove that I’m right, but I do have quite a few years experience at pulling this off, so it is at least conceivable that the problem lies not with the students, but with the teaching methods. So let me offer mine as an example, and perhaps others will be able to use them as well.
Let’s start with the punch line, just to pique your interest (or ire): the answer is type theory. The great idea of constructive type theory is that there is no distinction between programs and proofs: the code embodies the reasoning the justifies it, and the reasoning is just a form of code. This is the propositions as types principle (sometimes grandly called the “Curry-Howard isomorphism”, the small problem being that neither Curry nor Howard invented it, nor is it an isomorphism). Theorems are types (specifications), and proofs are programs. Programs/proofs are executed by evaluating them by applying simplification rules to reduce them to canonical form. It’s just generalized high school algebra, in which you plug in values for variables and simplify using equational reasoning, but scaled up to a rich collection of types and programs acting on them. In other words type theory is a language-based model of computation, which supports reasoning directly about the code you write, not its translation or interpretation in some other terms.
The significance of a language-based model for teaching cannot be overestimated. One of the classic pitfalls of teaching recursion is to try to explain it by referring to some mysterious entity called a “stack” that is absolutely invisible in the program you write. There’s a long song and dance about “frames” and “procedure calls” and “pushing” and “popping” things on this “stack” of which the professor speaks, and no one understands a damn thing. It all sounds very complicated, and impressive in a way, but if there’s one thing people learn, it’s to stay the hell away from it! And in doing so, students are robbed of one of the most powerful and useful programming tools available to them, the key to writing clean, reliable code (and, as I’ve previously argued, for parallelism as well). The alternative is to use Plotkin’s structural operational semantics to define the behavior of all language features, including recursion. To pull it off, you have to have a semantics for the language you’re using (something that’s in short supply for commercial languages), and you have to be facile in using it yourself.
What makes structural operational semantics so useful is that the execution model is directly defined by an inductive definition of a transition relation,
, on programs. (I don’t have the space to explain it fully here, but you may consult Practical Foundations for Programming Languages for a full account.) The crucial idea is to use inference rules to give a structural definition of evaluation for programs themselves, not a translation into some underlying abstract machine code. For example, one rule for evaluation of addition expressions is
,
which states that one step of evaluation of the sum
consists of a step of evaluation of
to obtain
, with the result being
. Another rule states that
,
where $m$ is a number, and a third states that
, where $p$ is the sum of $m$ and $n$. With these (and similar) rules in place, one may give an execution trace in the form
,
which makes explicit the steps of evaluation. Each step is justified by a “proof tree” consisting of a composition of rules. Importantly, the size of the expressions
correlates directly with the space required by an implementation, allowing for a clean and clear explanation of space issues such as tail recursion.
This is a good start for explaining the semantics of a recursive function, but the real beauty of type theory emerges when teaching how to reason about a recursively defined function. The lesson of type theory is that induction and recursion are the same thing: an inductive proof is a recursive function, and a recursive function is an inductive proof. This means that there is really only one concept here, and not two, and that the two go hand-in-hand. It’s not a matter of “eat your spinach, it’s good for you”, but rather of a conceptual unification of proving and programming. For our students thinking inductively and programming recursively go together naturally, the dynamic duo of functional programming.
The starting point is, once again, an inductive definition. For specificity and ease of exposition, let me give an inductive definition of the natural numbers:


The definition uses the exact same rule format that we used to define the semantics. Inductive definitions given by a set of rules become second nature. Moreover, they transcribe almost literally into ML:
datatype nat = Zero | Succ of nat
So here we have the primordial example of an inductively defined type. Many data structures, such as various forms of trees, fit naturally into the same format, a fact that we use repeatedly throughout the semester.
Defining a function on an inductive type consists of giving a series of clauses, one for each rule defining the elements of that type:
fun plus x y = case x of Zero => y | Succ x’ => Succ (plus x’ y)
Whenever we have an inductive type, we use the same format each time for functions defined over it. Crucially, we allow ourselves “recursive calls” corresponding exactly to the premises of the rule governing a particular clause. There is no need for any fuss about “pushing” things, nor any discussion of “sizes” of things, just a pattern of programming that is guided directly by the definition of the type on which we are working.
And now comes the best part. If we want to prove something about a recursively defined function, we write a proof that has exactly the same form, one case for each rule in the inductive definition. And, just as we get to make a “recursive call” for each premise of each rule, so we also get to make a “recursive call” to the proof for each premise, that is, an appeal to the inductive hypothesis. The proof is, of course, a program that constructs a proof for every element of an inductive type by composing the inductive steps around a base case.
Importantly, the students “get it”. On the midterm we ask them to prove (in an in-class setting) a small theorem about a piece of code, or to write a piece of code and prove a small theorem about it, and they are able to do this quite reliably. In other words, it works!
Perhaps the most important thing, though, is to convey the sense of inevitability and naturality about induction and recursion. Don’t talk about proof as a separate topic! Don’t operationalize recursion beyond the bounds of the language model. Simply do the natural thing, and it will all seem natural and obvious … and fun!
Moreover, it scales. Our high-water mark prior to the midterm is to develop and debug a continuation-based regular expression matcher that has a subtle bug in it. We state carefully the specification of the matcher, and use an inductive proof attempt to reveal the error. The proof does not go through! And the failure makes clear that there is a bug that can be fixed in a number of ways, repairing the proof along with the code. I will detail this another time, but the impatient reader may wish to consult my JFP paper entitled “Proof-directed debugging” for an earlier version of what we taught this semester.
Let me close with two remarks.
Matthias Felleisen at Northeastern has been advocating essentially this methodology for decades; his work has strongly influenced mine. His results are nothing short of spectacular, and with a much broader range of students than I ordinarily face here at Carnegie Mellon. So, no excuses: read his How to Design Programs, and follow it!
Finally, this article explains why Haskell is not suitable for my purposes: the principle of induction is never valid for a Haskell program! The problem is that Haskell is a lazy language, rather than a language with laziness. It forces on you the existence of “undefined” values of every type, invalidating the bedrock principle of proof by induction. Put in other terms suggested to me by John Launchbury, Haskell has a paucity of types, lacking even the bog standard type of natural numbers. Imagine!
(And please note the analogy with dynamic languages, as opposed to languages with dynamic types. It’s the same point, albeit in another guise, namely the paucity of types.)