Functions Are Values

After the midterm the course staff for the Functional Programming class had a T-shirt made up saying “Functions Are Values”.  What prompted this is a curious blind spot shared by nearly every student in the class about the nature of functions in ML (or, for that matter, in math).  The students performed admirably on the midterm examination, capably writing higher-order functions, proving correctness of programs by structural induction, and even using higher-order functions to express staged computation.  And yet nearly every student got the following simple “gimme” question wrong:

What is the type and value of the expression “fn x => raise Fail”?

I sensed trouble when, during the examination, a student asked me to clarify the question for him.  Needless to say, I was at a loss for words!  Sure enough, all but one student got this simple question wrong, sometimes spectacularly.

Many said “it has no value” and offered a guess as to its type.  Others said “it raises an exception” and therefore has type “exn”.  (For those who don’t know ML, the type exn is the type of values associated with an exception.)  Still others said “the compiler will notice that it always raises an exception, and will therefore reject it”, or words to that effect.  Where the hell did they get that bizarre notion?  Whatever the deficiencies of our teaching may be, we certainly never said anything close to that!  Others got the type right, but still could not explain what is its value.

Given that they are clearly capable of writing higher-order functional programs, how could this happen?  Obviously the fault is not with our students, but with ourselves.  But what did we do wrong?  How did we manage to stump so many students with what we thought was a free 3-pointer?  To be honest, I don’t know.  But I have a few theories that I thought I would air here in hopes of helping others avoid my mistakes, or nip misunderstandings in the bud.

Throughout the course we have been very careful to develop a rigorous language-based model of computation using structural operational semantics.  We assign meaning to the program you write, not to its translation into some mysterious underlying machine that doesn’t really exist anyway (in other words, we avoided telling the conventional lies).  Nowhere did we ever explain anything by reference to “the compiler”, except at one point, which I now realize was a cardinal mistake.  Don’t make it yourself.  Here’s how it happened.  We meant well, but the spirit is weak, and sometimes we stray.  Forgive me Father, for I have sinned…..

Here’s where we went wrong, and I think invited wild speculation that led to the mistakes on the examination.  One cool use of higher-order functions is to stage computation.  A good example is provided by a continuation-based regular expression matcher, which I will not explain in any detail here.  The crucial point is that it has the type

regexp -> string -> (string -> bool) -> bool

which says that it accepts a regular expression, a string, and a continuation, matching an initial segment of the string, if possible, and passing the rest to the continuation, or returning false if not.  As this description suggests we can think of the matcher as a curried form of a three-argument function, and that’s that.  But a more interesting formulation stages the computation so that, given the regexp, it computes a matcher for that regular expression that may be applied repeatedly to many different candidate strings without reprocessing the regular expression.  (The code for this is quite beautiful; it’s available on our course web site if you’re interested.)

All this can be explained quite neatly using operational semantics, showing how the recursion over the regexp is completed, yielding a matching function as result.  It’s all very cool, except for one little thing: the result contains numerous beta redices that can be simplified to obtain a clearer and simpler representation of the matching code.  Since the equations involved are all so self-evident, we stated (and here’s our mistake) that “the compiler can simplify these away to obtain the following code”, which was, of course, much clearer.  What we said was perfectly true, but unfortunately it opened the floodgates of incomprehension.  The trouble is, the students have no idea what a compiler is or how it works, so for them it’s a “magic wand” that somehow manages to get their ML code executed on an x86.  And we invited them to speculate on what this mystical compiler does, and thereby (I conjecture) invited them to think that (a) what the compiler does is paramount for understanding anything, and (b) the compiler does whatever they wish to imagine it does.  Somehow the students read our simple three-pointer as a “trick question” that was supposed to involve some mumbo-jumbo about the compiler, and that’s what we got, alas.

So, my first bit of advice is, don’t mention the compiler as an explanation for anything!  They’ll have plenty of chances later in their careers to learn about compilers, and to understand how semantics informs compilation, what is valid equational reasoning, and so forth.  But for freshmen, stick with the model.  Otherwise you’re courting disaster.  (I’ve already mentioned other situations where mentioning “the compiler” only muddies the waters and confuses students, the teaching of recursion being a prime example.  I intend to discuss others in future posts.)

Now that I’ve taken the blame for my mistakes, I feel less guilty about shifting some of it elsewhere.  I have two thoughts about why students resist the idea of functions being values of no fundamentally different character than any others.  One source, no doubt, is that many of them have “learned” programming on the street, and have developed all sorts of misconceptions that are being applied here.  Most popular languages make it hard to handle functions as values: you have to name them, you have to go to special trouble to pass them as arguments or return them as results, they are called “methods” and have a strange semantics involving state, and on and on.  From this students acquire the sense that functions are “special”, and cannot be thought of like other things.

Another source of misunderstanding is that elementary mathematics (up through freshman year, at least) stresses the pointful style, always speaking of f(x), rather than f itself, and carrying this forward through calculus, writing d f(x) / dx for the derivative, confusing the function with its values, and so forth.  Here again the message is clear: functions are “special” and are not to be treated on the same footing as numbers.  It’s a pity, because I think that the point-free style is clearer and more natural, if non-standard.  The differential is an operator acting on functions that produces a linear approximation to the function at each point in its domain (if the function in fact has such).  Be that as it may, the consequence of the pointful style is that students develop early on a “fear of functions” that is hard to break.


24 Responses to Functions Are Values

  1. andrewstellman says:

    Where the hell did they get that bizarre notion? Whatever the deficiencies of our teaching may be, we certainly never said anything close to that! Others got the type right, but still could not explain what is its value

    It’s extremely cool to see you tackling the issue of how to teach a specific, potentially tricky programming concept—and, specifically, how students learning that concept could go wrong. I’ve spent the last few years on similar problems, working on our two editions of Head First C# (O’Reilly 2010).

    It seems like you stumbled on a particularly interesting teaching challenge. I gave it some thought, because I think some of the specific teaching I’ve been relying on may be helpful. Admittedly, my audience is a bit different than yours. However, having been through your class myself (admittedly, two decades ago), I think I can still imagine myself in the mindset of someone learning these concepts for the first time.

    Anyway, I posted my thoughts on the O’Reilly blog: Functions are values: explore C# lambda types in Visual Studio. If you get a chance to look at it, skip down to the last paragraph—one of the more memorable exercises you gave us became the inspiration for an exercise in Head First C#.

  2. rgrig says:

    “What is the value of 1?”
    “That must be a trick question. Let me see if I can guess what he meant.”

    At least that’s what happens when I ask easy questions. People invariably think I have something complicated in mind and they try to guess what it is.

    Don’t blame the compiler. (Although, it certainly makes sense to avoid concepts that aren’t really needed.)

  3. You may be underestimating the mental leap it takes for most people to go from first order to higher order, regardless of any prior exposure or potentially misguided explanation. It really _is_ a whole new concept at first, and students have to wrap their head around it. That said, I agree that common notation, terminology, and semantics in mainstream languages does not help to make it any easier.

  4. marklillibridge says:

    “you never need mention closures, ever. see my book.”

    yeah, I realized that about 10 min. later… :-)

  5. marklillibridge says:

    The result of evaluating a function depends on the semantics in use. If you’re using rewriting and no store, then yes the function can evaluate to itself. If you are using a more usual semantics (like in ML), then the function evaluates to a closure — a pair containing an environment and the function itself.

    Consider what the follow evaluates to (forgive the made-up syntax):

    let x = new ref 0 in fn y => x! + x!

    That said, I assume you made clear what semantics you wanted them to consider, so this probably wasn’t the problem.

  6. diyria says:

    Just a note saying thanks for the thought on derivatives… i haven’t taken a calc course nearly two decades. I’ve never thought of the differential as an operator. A derivative, written as d f(a) / db, has always been more of a mystic set characters that show “rate of change”. In my current job I do not do much with math, but a light bulb just turned on thanks to your simple description. Thank you very much.

  7. bowaggoner says:

    Functions as values are simply not that intuitive. Especially for students having seen them previously in math class and Java.

    It’s natural to think of functions in terms of types. This function is a machine that takes in two inputs and produces one output. This intuition can serve a student well for higher-order functions, currying, all the things you mention that students are doing fine with. First-class: give me a machine with these inputs and these outputs; I’ll use it later. Currying: Start with a machine taking these inputs, and turn it into a machine that takes these inputs. Etc.

    But functions as values? That’s hard, because we tend to think of expressions as “evaluating” to a value.

    x = 10;  -- x "evaluates" to 10
    y = function(a,b) return a+b end;  -- y "evaluates" to ... what?

    The student thinks, well, we don’t know what y “evaluates” to until we “evaluate” it — till we plug in ‘a’ and ‘b’ and get the result. That’s what the “value” of y is. I guess it goes back to “pointful” style. This is where I think the confusion lies. We think of functions as types, or machines, whose “value” depends on the input and its “evaluation”.

  8. rafael111 says:

    When I was 9 years old I took a BASIC course because I wanted to build konami like
    games on the MSX. When I asked my teacher how to do it, she said “oh, for that you would
    need a compiler.” I still don´t know if she misunderstood my question, or deviated from
    it because it was impossible to build those games using basic, and the company didn´t teach
    Z80 assembly. The idea that stayed in my mind was exactly this: “Compilers are mystical entities
    that do whatever you want them to do. So I better learn more about them”.

  9. Rick Russell says:

    There is an interesting discussion on your post going on here, if you are interested:

    • andrejbauer says:

      It is not an interesting discussion, unless you count “mostly uninformed” as interesting.

    • Rick Russell says:

      I’m sure it is mostly uninformed; many of the students posting to the reddit discussion (including myself) have not had the thorough introduction to functional programming that Dr. Harper provides. That is why we seek to go beyond what our current curricula are teaching and learn more.

  10. andrejbauer says:

    I have had similar experiences with a “Theory of programming languages” course (think of it as a couple of chapters from your book, with heresy thrown in here and there). A standard question on our exams was “for each of the following five expressions, determine whether it has a type and state what it evaluates to (value, gets stuck, diverges)”. We learnt that: (1) students thought it was forbidden to evaluate expressions that did not have a type, (2) students thought the body of a function evaluates even if the function is not applied to anything, (3) students were eager to evaluate eagerly, even when the language was lazy. So in subsequent reincarnations of the course I very much emphasized that we can evaluate whatever we wish (and that the whole point of Type Safety was to give conditions under which it is safe to evaluate). It was harder to drive home the point about functions. I am still not sure why, but I think a basic misunderstanding of functions is getting propagated through math courses and badly designed programming languages. It is essentially a 19th century way of thinking about functions that prohibits one from writing just f. It must always be f(x) because f by itself is a non-entity.

    One thing I am sure of though. You give yourself too much credit for causing the problem by mentioning the compiler. They already knew there was a compiler, and really, do you think they listen and remember every word you say?

    By the way, you can order ahead some T-shirts saying “Booleans are values, too!” Students simply refuse to write things like let b = (x < y) in ...".

    • Rick Russell says:


      > Students simply refuse to write things like…

      That’s funny, I’ve always written subroutines (procedures, functions methods, whatever) to return the outcome of boolean expressions.

    • mitchwand says:

      Amen on the booleans. Will Clinger always used to inspect students’ spacing as a clue. If they wrote

      (if(< x 0)  ...)

      that was a signal that they didn’t understand that
      (< x 0) was an expression returning a value just like any other.

      I always used to tell students “lambda means later”, so
      (lambda (y) (+ x y)) meant: later on, when you get a value for y, add it to x.

    • neelk says:

      @mitchwand: I have to confess I have written both (lambda (x) (e x)) and (if e true false) from time to time!

      In both cases, the rationale was the same — eta-expansions makes the type of e more evident in the source.

  11. Franklin says:

    I believe that it does not help that most programming languages have special syntax for functions, both for defining and for calling. When I was a TA for you in 15-212-ML, I noticed that students had vastly different notions of what was going on depending on whether I wrote

    fun f x = expr


    val f = fn x => expr

    I believe this (and many other examples of confusion I saw) reflects a human cognitive bias to assume that what looks different must actually be totally different. This is why almost all discussions of programming languages among most programmers revolve entirely around matters of syntax. I cannot tell you how many I have met who claim to me that JavaScript, Java, and C are pretty much similar languages.

    P.S. I agree about the misleading nature of the notation still used in mathematics, which has undoubtedly confused generations of introductory calculus students. The well-established notation for functions, derivatives, and integrals is rather confusing.

  12. Ashley Yakeley says:

    I think there’s a temptation not to think of something as a value unless it can be sensibly put in (Haskell) type-class Eq.

    • Ashley Yakeley says:

      Although, of course, functions of suitably restricted types can sensibly be put in Eq, e.g. Bool -> Integer.

  13. comex says:

    I don’t know ML, but if I’m correct in translating the question to Haskell by asking about the expression (\x -> error “Fail”):

    I know the type, (a -> b), but it is rather bizarre at first glance, because it’s falsely promising to give you an instance of any type you want.

    I don’t know what the value is. There is no way to simplify the expression (except by using a library function), so how do you answer except with the expression itself?

    I might be misunderstanding the question, but in my case, at least, the problem does not stem from not understanding that functions are values.

    • pelotom says:

      I too am curious what answer you expected for the “value” part of the question… are you looking for “value, gets stuck, diverges” as Andrej said, or something else?

    • omerzach says:

      The question was actually:
      What are the type and value of the expression (fn (x : int) => x div 0)?

      The solution was (quoting from the sample solution):
      “This expression has type int -> int. Functions are values, so this expression is its own value.”

    • abstract type says:

      Thanks for the correction! (I just wrote down from memory the gist of the question.)

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: