After the midterm the course staff for the Functional Programming class had a T-shirt made up saying “Functions Are Values”. What prompted this is a curious blind spot shared by nearly every student in the class about the nature of functions in ML (or, for that matter, in math). The students performed admirably on the midterm examination, capably writing higher-order functions, proving correctness of programs by structural induction, and even using higher-order functions to express staged computation. And yet nearly every student got the following simple “gimme” question wrong:
What is the type and value of the expression “fn x => raise Fail”?
I sensed trouble when, during the examination, a student asked me to clarify the question for him. Needless to say, I was at a loss for words! Sure enough, all but one student got this simple question wrong, sometimes spectacularly.
Many said “it has no value” and offered a guess as to its type. Others said “it raises an exception” and therefore has type “exn”. (For those who don’t know ML, the type exn is the type of values associated with an exception.) Still others said “the compiler will notice that it always raises an exception, and will therefore reject it”, or words to that effect. Where the hell did they get that bizarre notion? Whatever the deficiencies of our teaching may be, we certainly never said anything close to that! Others got the type right, but still could not explain what its value is.
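For the record, here is a sketch of the intended answer, assuming a nullary exception Fail as in the exam question (the Basis Library’s Fail carries a string, so we declare our own, shadowing it):

```sml
(* A nullary exception, shadowing the Basis Library's "Fail of string",
   so that "raise Fail" is well-typed exactly as written in the question. *)
exception Fail

(* The expression is a lambda, and a lambda is already a value: no
   evaluation occurs under the binder, so nothing is raised here.
   Its type is 'a -> 'b: the argument is unused, and "raise Fail"
   may be given any type at all, so both positions are polymorphic. *)
val f : 'a -> 'b = fn x => raise Fail

(* The exception is raised only when the function is applied: *)
val caught = (f 0; "no") handle Fail => "raised only upon application"
```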
Given that they are clearly capable of writing higher-order functional programs, how could this happen? Obviously the fault is not with our students, but with ourselves. But what did we do wrong? How did we manage to stump so many students with what we thought was a free three-pointer? To be honest, I don’t know. But I have a few theories that I thought I would air here in hopes of helping others avoid my mistakes, or nip misunderstandings in the bud.
Throughout the course we have been very careful to develop a rigorous language-based model of computation using structural operational semantics. We assign meaning to the program you write, not to its translation into some mysterious underlying machine that doesn’t really exist anyway (in other words, we avoided telling the conventional lies). Nowhere did we ever explain anything by reference to “the compiler”, except at one point, which I now realize was a cardinal mistake. Don’t make it yourself. Here’s how it happened. We meant well, but the flesh is weak, and sometimes we stray. Forgive me, Father, for I have sinned…
Here’s where we went wrong, and I think invited wild speculation that led to the mistakes on the examination. One cool use of higher-order functions is to stage computation. A good example is provided by a continuation-based regular expression matcher, which I will not explain in any detail here. The crucial point is that it has the type
regexp -> string -> (string -> bool) -> bool
which says that it accepts a regular expression, a string, and a continuation; it attempts to match an initial segment of the string, passing the remainder to the continuation if it succeeds, and returns false if no match is possible. As this description suggests, we can think of the matcher as a curried form of a three-argument function, and that’s that. But a more interesting formulation stages the computation so that, given the regexp, it computes a matcher for that regular expression that may be applied repeatedly to many different candidate strings without reprocessing the regular expression. (The code for this is quite beautiful; it’s available on our course web site if you’re interested.)
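To make the discussion concrete, here is a minimal sketch of the staged formulation — not the course’s code (that is on the course web site). For simplicity it works over char list rather than string, the regexp datatype is my own minimal one, and the length test in the Star case, which rules out vacuous iterations, is my addition:

```sml
datatype regexp = Zero | One | Char of char
                | Plus of regexp * regexp
                | Times of regexp * regexp
                | Star of regexp

(* staged : regexp -> char list -> (char list -> bool) -> bool
   The recursion over the regexp completes BEFORE any string is seen:
   each clause builds a matching function once, from the matching
   functions for the subexpressions. *)
fun staged Zero = (fn _ => fn _ => false)
  | staged One = (fn cs => fn k => k cs)
  | staged (Char c) =
      (fn cs => fn k =>
         case cs of c' :: cs' => c = c' andalso k cs' | [] => false)
  | staged (Plus (r1, r2)) =
      let val m1 = staged r1 and m2 = staged r2
      in fn cs => fn k => m1 cs k orelse m2 cs k end
  | staged (Times (r1, r2)) =
      let val m1 = staged r1 and m2 = staged r2
      in fn cs => fn k => m1 cs (fn cs' => m2 cs' k) end
  | staged (Star r) =
      let val m = staged r
          (* require progress on each iteration, to ensure termination *)
          fun mstar cs k =
            k cs orelse
            m cs (fn cs' => length cs' < length cs andalso mstar cs' k)
      in mstar end

(* accepts builds the matcher once; the result may then be applied
   to many candidate strings without reprocessing the regexp. *)
fun accepts r =
  let val m = staged r
  in fn s => m (String.explode s) List.null end
```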
All this can be explained quite neatly using operational semantics, showing how the recursion over the regexp is completed, yielding a matching function as result. It’s all very cool, except for one little thing: the result contains numerous beta redices that can be simplified to obtain a clearer and simpler representation of the matching code. Since the equations involved are all so self-evident, we stated (and here’s our mistake) that “the compiler can simplify these away to obtain the following code”, which was, of course, much clearer. What we said was perfectly true, but unfortunately it opened the floodgates of incomprehension. The trouble is, the students have no idea what a compiler is or how it works, so for them it’s a “magic wand” that somehow manages to get their ML code executed on an x86. And we invited them to speculate on what this mystical compiler does, and thereby (I conjecture) invited them to think that (a) what the compiler does is paramount for understanding anything, and (b) the compiler does whatever they wish to imagine it does. Somehow the students read our simple three-pointer as a “trick question” that was supposed to involve some mumbo-jumbo about the compiler, and that’s what we got, alas.
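For the curious, here is a generic illustration of the kind of redex involved — not the matcher’s actual residual code: staging tends to leave behind applications of statically known lambdas, which plain equational reasoning eliminates.

```sml
(* A residual beta redex: a known lambda applied to a known argument. *)
val unsimplified = fn cs => (fn k => k cs) (fn cs' => List.null cs')

(* Two beta steps yield an equivalent but plainly clearer function.
   This is equational reasoning on the program itself; no appeal to
   a compiler is needed to justify it. *)
val simplified = fn cs => List.null cs
```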
So, my first bit of advice is, don’t mention the compiler as an explanation for anything! They’ll have plenty of chances later in their careers to learn about compilers, and to understand how semantics informs compilation, what is valid equational reasoning, and so forth. But for freshmen, stick with the model. Otherwise you’re courting disaster. (I’ve already described other situations where invoking “the compiler” only muddies the waters and confuses students, the teaching of recursion being a prime example. I intend to discuss others in future posts.)
Now that I’ve taken the blame for my mistakes, I feel less guilty about shifting some of it elsewhere. I have two thoughts about why students resist the idea that functions are values, no different in fundamental character from any others. One source, no doubt, is that many of them have “learned” programming on the street, and have developed all sorts of misconceptions that they bring to bear here. Most popular languages make it hard to handle functions as values: you have to name them, you have to go to special trouble to pass them as arguments or return them as results, they are called “methods” and have a strange semantics involving state, and on and on. From this, students acquire the sense that functions are “special”, and cannot be thought of like other things.
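In ML, by contrast, no special trouble is required; a function is bound, passed, returned, and stored exactly like any other value:

```sml
val double = fn x => 2 * x              (* bound to a variable *)
fun add x = fn y => x + y               (* returned as a result *)
fun twice f x = f (f x)                 (* accepted as an argument *)
val fs = [double, add 3, twice double]  (* stored in a list *)
val ys = List.map (fn f => f 10) fs     (* yields [20, 13, 40] *)
```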
Another source of misunderstanding is that elementary mathematics (up through freshman year, at least) stresses the pointful style, always speaking of f(x) rather than f itself, and carries this forward through calculus, writing d f(x) / dx for the derivative, confusing the function with its values, and so forth. Here again the message is clear: functions are “special” and are not to be treated on the same footing as numbers. It’s a pity, because I think the point-free style is clearer and more natural, if non-standard. The differential is an operator acting on functions that produces a linear approximation to the function at each point in its domain (if the function in fact has one). Be that as it may, the consequence of the pointful style is that students develop, early on, a “fear of functions” that is hard to break.
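The point-free view is easy to express in ML itself: a differential operator consumes a function and yields a function, treating both as ordinary values. What follows is a numerical sketch only — a central-difference approximation with a hypothetical step size h, not the true differential:

```sml
(* deriv : (real -> real) -> (real -> real)
   An operator on functions: given f, it returns an approximation to
   f's derivative, which is itself a function. *)
fun deriv (f : real -> real) : real -> real =
  let val h = 1.0e~6  (* hypothetical step size *)
  in fn x => (f (x + h) - f (x - h)) / (2.0 * h) end

(* An approximation to cos, obtained without ever mentioning a point: *)
val cos_approx = deriv Math.sin
```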