Models of computation may be divided into two categories, the λ-calculus, and all the others. Well, maybe Post’s production systems and Herbrand-Goedel equations belong in the former category, but certainly all the well-known, and widely accepted, models such as Turing machines or RAM’s, sit quite apart from the λ-calculus. Guy Blelloch has characterized the division as between languages and machines. The machine-based models are all based on the idea of a finite control augmented by unbounded memory. It’s up to the programmer to manage the memory to implement anything more structured than a bit, and there is no notion, other than via an interpreter, for changing the program during execution; the machine models are all decidedly non-von Neumann because they lack the idea of a stored program. The sole language-based model, by contrast, does not separate program from data, and offers a foundation on which realistic languages may be built, directly and without encoding. This would seem like a decisive advantage for the λ-calculus: whereas the machine models live in the early chapters of our textbooks, they are of essentially no practical importance, the λ-calculus is directly useful for programming and for mechanization of logical systems. And yet it is the machine models that dominate, especially in the world of algorithms and complexity, and the λ-calculus remains an esoteric oddity studied only by a relative minority of Computer Scientists. How can that be?
When confronted with this question, my typical colleague (at least on the theory side) dismisses the question as totally uninteresting because “all models of computation are polynomial-time equivalent”, so who cares? Well, if your work is all done modulo a polynomial factor in complexity, and you only care about computations over natural numbers (or other finitary objects), then I guess it doesn’t matter. You tell some story once and for all of how the objects of interest are represented on some model, and leave it at that; the details just don’t matter.
But obviously the models do matter, as evidenced by the inescapable fact that the very same colleague would never under any circumstance consider anything other than a classical machine model as the underpinning for his work! Surely, if they are all equivalent, then it can’t matter, so we can all switch to the λ-calculus, right? Well, true though that may be, I can assure you that your argument is going nowhere. Apparently some models are more equivalent than others after all! Why would that be?
One reason is that a lot of work is not up to a polynomial factor (one might even suspect, most of it isn’t.) So for those purposes the model may matter. On the other hand, no one actually uses the “official” models in their work. No one writes Turing machine code to describe an algorithm, nor even RAM code (Knuth notwithstanding). Rather, they write what is charmingly called pidgin Algol, or pidgin C, or similar imperative programming notation. That is, they use a programming language, not a machine! As well they should. But then why the emphasis on machine models? And what does that pidgin they’re writing mean?
The answer is that the language is defined by a translation (that is, a compiler) that specifies that such-and-such pidgin Algol stands for such-and-such RAM code or Turing machine code. The description is informal, to be sure, but precise enough that you get the idea. Crucially, you reason not about the code you wrote, but the code the compiler writes, because officially the target code is the “real” code, the pidgin being a convenience. For this to work, you have to hold the compiler in your head: it must be clear what is meant by everything you actually write by imagining its translation onto, say, a RAM. This is not too bad, provided that the pidgin is simple, so simple that you can compile it in your head. This works ok, but is increasingly problematic, because it limits the range of algorithms we can conveniently express and analyze by restricting the language in which we express them.
So why the insistence on such an awkward story? One reason is, evidently, tradition. If it was good enough for my advisor, it’s good enough for me. Or, if you want to get published, you’d better write in a style that is familiar to the people making the decisions.
A more substantial reason, though, is that machine models are the basis for establishing the cost of a computation. A “step” of computation is defined to be a machine step, and we compare algorithms by estimating the number of steps that they take to solve the same problem (up to a multiplicative factor, in most cases). As long as we can “hand compile” from pidgin Algol into machine code, we can assign a cost to programs written in a higher-than-assembly level language, and we can make rational comparisons between algorithms. This methodology has served us well, but it is beginning to break down (parallelism is one reason, but there are many others; I’ll leave discussion of this for another occasion).
There is an alternative, which is to provide a cost semantics for the language we write, and conduct our analyses on this basis directly, without appeal to a compiler or reference to an underlying machine. In short we adopt a linguistic model of computation, rather than a machine model, and life gets better! There is a wider range of options for expressing algorithms, and we simplify the story of how algorithms are to be analyzed.
To do this requires that we give a cost semantics for the language we’re actually using. And this is where the λ-calculus comes into play, for it is the paradigmatic example of how to specify a language-based model of computation. Briefly, the crucial notion in the λ-calculus is the concept of a reduction, written M→M’, expressing one step of computation of program M resulting in program M’. Execution is defined directly on the programs we write, and the model provides a well-defined notion of computation step that we can count to obtain the time complexity of the program. Decades of experience shows that this approach scales to realistic languages.
What’s not to like? Personally, I have no idea. I have been mystified by the suspicion with which the λ-calculus seems to be regarded in algorithmic circles. Just about any machine model is regarded as a perfectly good foundation, yet the behavior of the field itself shows that machine models are of limited utility for the work that is actually done!
The only reason I can think of, apart from powerful social effects, is that there is lingering, yet totally unfounded, doubt about whether the concept of cost that arises from the λ-calculus is suitable for algorithm analysis. (One researcher recently told me “you can hide an elephant inside a β-reduction!”, yet he could not explain to me why I’m wrong.) One way to justify this claim is to define, once and for all, a compiler from λ-calculus to RAM code that makes clear that the abstract steps of the λ-calculus can be implemented in constant time on a RAM (in the sense of not varying with the size of the input, only in the size of the static program). Blelloch and Greiner have carried out exactly this analysis in their seminal work in the early 90′s; see Guy’s web page for details.
In a later post I will take these ideas a bit further, and show (by explaining Blelloch’s work) that the λ-calculus serves not only as a good model for sequential algorithms, but also for parallel algorithms! And that we can readily extend the model to languages with important abstractions such as higher-order functions or exceptions without compromising their utility for expressing and analyzing algorithms.