Languages and Machines


Models of computation may be divided into two categories: the λ-calculus, and all the others.  Well, maybe Post’s production systems and the Herbrand-Gödel equations belong in the former category, but certainly all the well-known and widely accepted models, such as Turing machines or RAMs, sit quite apart from the λ-calculus.  Guy Blelloch has characterized the division as one between languages and machines.  The machine-based models are all built on the idea of a finite control augmented by an unbounded memory.  It’s up to the programmer to manage the memory to implement anything more structured than a bit, and there is no notion, other than via an interpreter, of changing the program during execution; the machine models are all decidedly non-von Neumann, because they lack the idea of a stored program.  The sole language-based model, by contrast, does not separate program from data, and offers a foundation on which realistic languages may be built directly and without encoding.  This would seem like a decisive advantage for the λ-calculus: whereas the machine models, which live in the early chapters of our textbooks, are of essentially no practical importance, the λ-calculus is directly useful for programming and for the mechanization of logical systems.  And yet it is the machine models that dominate, especially in the world of algorithms and complexity, and the λ-calculus remains an esoteric oddity studied by only a relative minority of computer scientists.  How can that be?

When confronted with this question, my typical colleague (at least on the theory side) dismisses the question as totally uninteresting because “all models of computation are polynomial-time equivalent”, so who cares?  Well, if your work is all done modulo a polynomial factor in complexity, and you only care about computations over natural numbers (or other finitary objects), then I guess it doesn’t matter.  You tell some story once and for all of how the objects of interest are represented on some model, and leave it at that; the details just don’t matter.

But obviously the models do matter, as evidenced by the inescapable fact that the very same colleague would never under any circumstance consider anything other than a classical machine model as the underpinning for his work!  Surely, if they are all equivalent, then it can’t matter, so we can all switch to the λ-calculus, right?  Well, true though that may be, I can assure you that your argument is going nowhere.  Apparently some models are more equivalent than others after all!  Why would that be?

One reason is that a lot of work is not done up to a polynomial factor (one might even suspect that most of it isn’t).  So for those purposes the model may matter.  On the other hand, no one actually uses the “official” models in their work.  No one writes Turing machine code to describe an algorithm, nor even RAM code (Knuth notwithstanding).  Rather, they write what is charmingly called pidgin Algol, or pidgin C, or some similar imperative programming notation.  That is, they use a programming language, not a machine!  As well they should.  But then why the emphasis on machine models?  And what does that pidgin they’re writing mean?

The answer is that the language is defined by a translation (that is, a compiler) that specifies that such-and-such pidgin Algol stands for such-and-such RAM code or Turing machine code.  The description is informal, to be sure, but precise enough that you get the idea.  Crucially, you reason not about the code you wrote, but about the code the compiler writes, because officially the target code is the “real” code, the pidgin being a convenience.  For this to work, you have to hold the compiler in your head: the meaning of everything you actually write must be clear from imagining its translation to, say, a RAM.  This is manageable provided that the pidgin is simple, so simple that you can compile it in your head.  It works well enough, but it is increasingly problematic, because it limits the range of algorithms we can conveniently express and analyze by restricting the language in which we express them.

So why the insistence on such an awkward story?  One reason is, evidently, tradition.  If it was good enough for my advisor, it’s good enough for me.  Or, if you want to get published, you’d better write in a style that is familiar to the people making the decisions.

A more substantial reason, though, is that machine models are the basis for establishing the cost of a computation.  A “step” of computation is defined to be a machine step, and we compare algorithms by estimating the number of steps that they take to solve the same problem (up to a multiplicative factor, in most cases).  As long as we can “hand compile” from pidgin Algol into machine code, we can assign a cost to programs written in a higher-than-assembly level language, and we can make rational comparisons between algorithms.  This methodology has served us well, but it is beginning to break down (parallelism is one reason, but there are many others; I’ll leave discussion of this for another occasion).

There is an alternative, which is to provide a cost semantics for the language we write, and conduct our analyses on this basis directly, without appeal to a compiler or reference to an underlying machine.  In short we adopt a linguistic model of computation, rather than a machine model, and life gets better!  There is a wider range of options for expressing algorithms, and we simplify the story of how algorithms are to be analyzed.

To do this requires that we give a cost semantics for the language we’re actually using.  And this is where the λ-calculus comes into play, for it is the paradigmatic example of how to specify a language-based model of computation.  Briefly, the crucial notion in the λ-calculus is that of a reduction, written M → M′, expressing one step of computation of the program M resulting in the program M′.  Execution is defined directly on the programs we write, and the model provides a well-defined notion of computation step that we can count to obtain the time complexity of the program.  Decades of experience show that this approach scales to realistic languages.
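
To make the idea concrete, here is a minimal sketch of such a cost semantics, written in OCaml (my own illustration, not a transcription of any published development): a call-by-value, small-step evaluator for the pure λ-calculus that counts β-reduction steps.  The cost of a closed program is simply the number of steps taken, attributed directly to the source term, with no compiler anywhere in the picture.

(* Sketch of a cost semantics: call-by-value, small-step reduction for
   the pure λ-calculus, counting β-steps.  Naive substitution is safe
   here because closed programs under call-by-value only ever
   substitute closed values, so no variable capture can occur. *)

type term =
  | Var of string
  | Lam of string * term
  | App of term * term

let is_value = function Lam _ -> true | _ -> false

(* [subst x v t] replaces free occurrences of x in t by the closed value v. *)
let rec subst x v t =
  match t with
  | Var y -> if y = x then v else t
  | Lam (y, b) -> if y = x then t else Lam (y, subst x v b)
  | App (t1, t2) -> App (subst x v t1, subst x v t2)

(* One reduction step t -> t', or None if t is a value (or stuck). *)
let rec step t =
  match t with
  | App (Lam (x, b), v) when is_value v -> Some (subst x v b)
  | App (t1, t2) when not (is_value t1) ->
      (match step t1 with Some t1' -> Some (App (t1', t2)) | None -> None)
  | App (v1, t2) ->
      (match step t2 with Some t2' -> Some (App (v1, t2')) | None -> None)
  | _ -> None

(* Iterate the step relation; the second component is the cost. *)
let eval t =
  let rec go t n = match step t with Some t' -> go t' (n + 1) | None -> (t, n) in
  go t 0

(* Example: Church numeral 2 applied to the identity, twice. *)
let id = Lam ("x", Var "x")
let two = Lam ("s", Lam ("z", App (Var "s", App (Var "s", Var "z"))))
let () =
  let _, cost = eval (App (App (two, id), id)) in
  Printf.printf "evaluated in %d beta-steps\n" cost

The particular evaluation strategy is not the point; the point is that the notion of “step” is defined on the very programs we write, so the step count is itself a legitimate object of analysis.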

What’s not to like?  Personally, I have no idea.  I have been mystified by the suspicion with which the λ-calculus seems to be regarded in algorithmic circles.  Just about any machine model is regarded as a perfectly good foundation, yet the behavior of the field itself shows that machine models are of limited utility for the work that is actually done!

The only reason I can think of, apart from powerful social effects, is that there is a lingering, yet totally unfounded, doubt about whether the concept of cost that arises from the λ-calculus is suitable for algorithm analysis.  (One researcher recently told me “you can hide an elephant inside a β-reduction!”, yet he could not explain to me why I’m wrong.)  One way to dispel this doubt is to define, once and for all, a compiler from the λ-calculus to RAM code that makes clear that the abstract steps of the λ-calculus can be implemented in constant time on a RAM (in the sense of not varying with the size of the input, but only with the size of the static program).  Blelloch and Greiner carried out exactly this analysis in their seminal work in the early 1990s; see Guy’s web page for details.
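
To see, at least in outline, why a β-step cannot hide an elephant, consider the following sketch (an illustration of the idea only, not Blelloch and Greiner’s actual construction): an environment-based evaluator in which substitution is never performed.  Values are closures, each application does a constant amount of bookkeeping, and a variable lookup traverses at most as many bindings as there are λs enclosing that variable in the static program, so the work per step is bounded by the size of the program, not by the input.

(* Sketch: environment-based evaluation of the pure λ-calculus.
   No substitution; values are closures over their defining
   environment.  For a fixed source program, environments never grow
   longer than the program’s λ-nesting depth, so each β-step costs
   an amount bounded by the static program size, independent of the input. *)

type term =
  | Var of string
  | Lam of string * term
  | App of term * term

type value = Closure of string * term * env
and env = (string * value) list

let steps = ref 0

let rec eval (env : env) (t : term) : value =
  match t with
  | Var x -> List.assoc x env              (* lookup: bounded by static λ-depth *)
  | Lam (x, b) -> Closure (x, b, env)      (* constant work: capture the environment *)
  | App (t1, t2) ->
      let Closure (x, b, env') = eval env t1 in
      let v = eval env t2 in
      incr steps;                          (* one β-step: constant bookkeeping *)
      eval ((x, v) :: env') b

let id = Lam ("x", Var "x")
let two = Lam ("s", Lam ("z", App (Var "s", App (Var "s", Var "z"))))
let () =
  ignore (eval [] (App (App (two, id), id)));
  Printf.printf "%d beta-steps\n" !steps

With de Bruijn indices, or with environments represented as arrays, the bookkeeping tightens further, but even this naive version makes the essential point: nothing about the size of the running data enters into the cost of an individual step.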

In a later post I will take these ideas a bit further, and show (by explaining Blelloch’s work) that the λ-calculus serves not only as a good model for sequential algorithms, but also for parallel algorithms!  And that we can readily extend the model to languages with important abstractions such as higher-order functions or exceptions without compromising their utility for expressing and analyzing algorithms.

8 Responses to Languages and Machines

  1. johneganz says:

    Models of computation may be divided into two categories, the λ-calculus, and all the others.

    Actually, if we’re talking about “computers” that have been physically realized, there are two models of computation: NAND logic, and nothing else.

    but certainly all the well-known, and widely accepted, models such as Turing machines or RAM’s, sit quite apart from the λ-calculus.

    … what? There is this well-known concept known as Turing completeness, along with the Church–Turing thesis, which states:

    Everything computable is computable by a Turing machine.

    In particular (from the Wikipedia article on the Church–Turing thesis):

    All three computational processes (recursion, the λ-calculus, and the Turing machine) were shown to be equivalent—all three approaches define the same class of functions.

    This sort of glosses over the fact that out of all the computation models proposed, only one of them, the Turing machine, could be made physically manifest. Members of the λ-calculus cult have a tendency to dismiss this detail as a “nitpicking point that is completely inconsequential”.

    Well, there is that one detail about Turing machines that doesn’t translate very well into physical reality: the original model used a paper tape that was “infinitely long”. So physically real Turing machines have a tape of finite length, and all physically real Turing machines can be expressed using nothing more than a 2 input NAND gate (and yes, that computer that you are using right now can be reduced to nothing but billions of 2 input NAND gates).

    This would seem like a decisive advantage for the λ-calculus: whereas the machine models live in the early chapters of our textbooks, they are of essentially no practical importance, the λ-calculus is directly useful for programming and for mechanization of logical systems.

    Hmmm. Except for the fact that physically real computers are only of the “machine model” variety, I’m with you on the whole “no practical importance” point. I mean, if we ever wanted to do anything more than mentally masturbate at the church of λ-calculus and actually do something with a computer, we might have a problem, but I say we try to ride that pony!

    And yet it is the machine models that dominate, especially in the world of algorithms and complexity, and the λ-calculus remains an esoteric oddity studied only by a relative minority of Computer Scientists. How can that be?

    Yea! And I hate how when you try to share the greatness of the one true λ-calculus, they look at you like you’re a member of a cult, foaming at the mouth, babbling on incoherently like you have no idea what you’re talking about… or something… I’m fairly sure these blasphemous words by Dick Lipton, allegedly a Professor of Computer Science at Georgia Tech, have nothing to do with it:

    While working for Church, Stephen was trying to show that the lambda calculus could compute anything “computable”. Unfortunately, Church and Kleene could not see how to compute even the following function:

    x → x + 1.

    Yes, even the lowly successor function was not obviously computable in their calculus. The story goes that one day Stephen was getting his hair cut, when he realized how to encode the successor function into the lambda calculus. It is claimed, he jumped out of the barber chair, with half a haircut, and ran to tell Church. Not quite Archimedes running naked through the streets of Syracuse, shouting “Eureka,” but close. I think the story, true or not, says more about the nature of the lambda calculus as a basis of computing than anything else, but that is my take.

    There is an alternative, which is to provide a cost semantics for the language we write, and conduct our analyses on this basis directly, without appeal to a compiler or reference to an underlying machine. In short we adopt a linguistic model of computation, rather than a machine model, and life gets better! There is a wider range of options for expressing algorithms, and we simplify the story of how algorithms are to be analyzed.

    There might be an alternative, but I can guarantee you that ain’t it.

    To do this requires that we give a cost semantics for the language we’re actually using. And this is where the λ-calculus comes into play, for it is the paradigmatic example of how to specify a language-based model of computation. Briefly, the crucial notion in the λ-calculus is the concept of a reduction, written M→M’, expressing one step of computation of program M resulting in program M’. Execution is defined directly on the programs we write, and the model provides a well-defined notion of computation step that we can count to obtain the time complexity of the program. Decades of experience shows that this approach scales to realistic languages.

    … and “expressing one step of computation” is why. The model you are suggesting is literally identical to a Turing machine: a list of step-by-step instructions, where each step happens one after the other. In other words, your typical “left to right, top to bottom” logical ordering of the execution of the steps needed to carry out a computation.

    I am aware of only one family of languages that is genuinely different from a Turing machine: hardware description languages such as Verilog. In Verilog, there is literally no concept of “this is the line that my program is on at this point in time.” Every single statement on every single line in every source file executes at the exact same time. As proof of this fact, think of a VLSI circuit, a 2D grid of Boolean gates. Every single gate is executing in parallel and concurrently relative to all the other gates. Things do not happen infinitely fast, and (while not pedantically true) the only way to control and order when things happen is with the statement always @ (posedge clock). Timing closure tools are used to analyze the design to find the circuit path that takes the longest amount of time to propagate from its inputs to its outputs, which effectively defines how long the clock period must last, and therefore how fast the design is.

    • abstract type says:

      The statement you so proudly quote about the successor function is false.

    • blaisorblade says:

      The blog post is talking about models of computation, not physically realized computers. Even if you want to predict performance of physical computers, you still build a model to use in your “math”, which just needs to be convenient for the “math” itself (for instance, writing an interesting program and computing its complexity).

      Successor is easy, it’s predecessor which is complicated in *pure* lambda calculus — Lipton himself admits he “may have gotten that wrong” (http://rjlipton.wordpress.com/2009/05/27/arithmetic-hierarchy-and-pnp/#comment-599).

      Programming languages are usually extended lambda calculi, where no such complication arises.
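
      For instance, successor on Church numerals is a one-liner; transcribed into OCaml (purely as a runnable illustration) it reads:

      (* Church numerals as ordinary functions; succ applies s one more time. *)
      let zero = fun _s z -> z
      let succ n = fun s z -> s (n s z)
      let to_int n = n (fun k -> k + 1) 0          (* decode to a native int *)
      let () = Printf.printf "%d\n" (to_int (succ (succ zero)))   (* prints 2 *)

      Predecessor, as noted above, is the genuinely delicate one in the pure calculus.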

  2. [...] up to a rich collection of types and programs acting on them.  In other words type theory is a language-based model of computation, which supports reasoning directly about the code you write, not its translation or interpretation [...]

  3. adrieng says:

    Well, wasn’t the supposedly too “coarse-grained” nature of β-reduction one of the driving forces behind the development of linear logic?

    See, for example, slide 17 from this course (in French, sorry…) given by Paul-André Melliès: http://www.pps.jussieu.fr/~mellies/mpri/mpri-m2/mpri-mellies-cours-intro.pdf

    For what it’s worth, Girard is still trying to express deep results from computational complexity theory in “linguistic” terms by using complex models based upon operator algebras.

    • Marc Hamann says:

      @adrieng:

      Linear logic does address a resource cost issue, but I don’t think that this issue is avoided by TMs either.
      TMs only seem like a better model of resources since we rely on our intuitions about the “physics” of the TM to substitute for explicit logical reasoning about resource usage.
      This “intuitive physics” likely hides more sins than any problems that arise from the relatively simple and explicit nature of syntactic reductions.

  4. [...] is not concurrency, but what is it?  What is an appropriate model for parallel computation?  As in the sequential case, there are two answers, a machine-based approach and a language-based approach.  I am going to [...]
