Haskell Is Exceptionally Unsafe

August 14, 2012

It is well known that Haskell is not type safe.  The most blatant violation is the sometimes-necessary, and aptly named, unsafePerformIO operation.  You are enjoined not to use this in an unsafe manner, and must be careful to ensure that the encapsulated computation may be executed at any time because of the inherent unpredictability of lazy evaluation.  (The analogous operation in monadic ML, safePerformIO, is safe, because of the value restriction on polymorphism.)  A less blatant violation is that the equational theory of the language with seq is different from the equational theory without it, and the last I knew the GHC compiler was willing to make transformations that are valid only in the absence of this construct.  This too is well known.  A proper reformulation of the equational theory was given by Patricia Johann a few years ago as a step towards solving it.  (If the GHC compiler is no longer unsafe in this respect, it would be good to know.)

I’ve asked around a little bit, including some of the Haskell insiders, whether it is equally well known that the typed exception mechanism in Haskell is unsound.  From what I can tell this seems not to be as well-understood, so let me explain the situation as I see it, and I am sure I will be quickly corrected if I am wrong.  The starting point for me was the realization that in Haskell pure code (outside of the IO monad) may raise an exception, and, importantly for my point, the exceptions are user-defined.  Based on general semantic considerations, it seemed to me that this cannot possibly be sound, and Karl Crary helped me to isolate the source of the problem.

The difficulty is really nothing to do with exceptions per se, but rather with exception values.  (So my point has nothing whatsoever to do with imprecise exceptions.)  It seems to me that the root cause is an all-too-common misunderstanding of the concept of typed exceptions as they occur, for example, in Standard ML.  The mistake is a familiar one, the confusion of types with classes; it arises often in discussions related to oop.  To clarify the situation let me begin with a few remarks exception values in ML, and then move on to the issue at hand.

The first point, which is not particularly relevant to the present discussion, is that exceptions have nothing to do with dynamic binding.  The full details are in my book, so I will only summarize the situation here.  Many seem to have the idea that an exception handler, which typically looks like a sequence of clauses consisting of an exception and an action, amounts to a dynamic binding of each exception to its associated handler.  This is not so.  In fact the left-hand side of an exception clause is a pattern, including a variable (of which a wild card is one example), or nested patterns of exceptions and values.  I realize that one may implement the exception mechanism using a single dynamically bound symbol holding the current exception handler, but this implementation is but one of many possible ones, and does not in any case define the abstraction of exceptions.  Exceptions have no more to do with dynamic binding than does the familiar if-then-else available in any language.

The second point, which is pertinent to this discussion, is that exceptions have only one type.  You read that right: the typed exception mechanism in Standard ML is, in fact, uni-typed (where I have I heard that before?).  It is perfectly true that in Standard ML one may associate a value of any type you like with an exception, and this value may be communicated to the handler of that exception.  But it is perfectly false to say that there are multiple types of exceptions; there is only one, but it has many (in fact, “dynamically many”) classes.  When you declare, say, exception Error of string in Standard ML you are introducing a new class of type string->exn, so that Error s is an exception value of type exn carrying a string.  An exception handler associates to a computation of type α a function of type exn->α that handles any exceptions thrown by that computation.  To propagate uncaught exceptions, the handler is  implicitly post-composed with the handler fn x => raise x.  Pattern matching recovers the type of the associated value in the branch that handles a particular exception.  So, for example, a handler of the form Error x => exp propagates the fact that x has type string into the expression exp, and similarly for any other exception carrying any other type.  (Incidentally, another perfectly good handler is Error “abcdef” => exp, which handles only an Error exception with associated value “abcdef”, and no other.  This debunks the dynamic binding interpretation mentioned earlier.)

The third point is that exception classes are dynamically generated in the sense that each evaluation of an exception declaration generates a fresh exception.  This is absolutely essential for modularity, and is extremely useful for a wide variety of other purposes, including managing information flow security within a program. (See Chapter 34 of my book for a fuller discussion.)  O’Caml attempted to impose a static form of exceptions, but it is now recognized that this does not work properly in the presence of functors or separate compilation.  This means that exception declarations are inherently stateful and cannot be regarded as pure.

This got me to wondering how Haskell could get away with user-defined typed exceptions in “pure” code.  The answer seems to be “it can’t”, as the following example illustrates:

import Control.Exception

import Data.Typeable

newtype Foo = Foo (() -> IO ())

{- set Foo’s TypeRep to be the same as ErrorCall’s -}

instance Typeable Foo where

  typeOf _ = typeOf (undefined :: ErrorCall)

instance Show Foo where  show _ = “”

instance Exception Foo

main = Control.Exception.catch (error “kaboom”) (\ (Foo f) -> f ())

If you run this code in GHC, you get “internal error: stg_ap_v_ret”, which amounts to “going wrong.”  The use of exceptions is really incidental, except insofar as it forces the use of the class Typeable, which is exploited here.

How are we to understand this unsoundness?  My own diagnosis is that typed exceptions are mis-implemented in Haskell using types, rather than classes.  There is no need in ML for any form of type casting to implement exceptions, because there is exactly one type of exception values, albeit one with many classes.  Haskell, on the other hand, tries to have exceptions with different types.  This immediately involves us in type casting, and hilarity ensues.

The problem appears to be difficult to fix, because to do so would require that exception values be declared within the IO monad, which runs against the design principle (unavoidable, in my opinion) that exceptions are permissible in pure code.  (Exceptions are, according to Aleks Nanevski, co-monadic, not monadic.)  Alternatively, since Haskell lacks modularity, a possible fix is to permit only static exceptions in essentially the style proposed in O’Caml.  This amounts to collecting up the exception declarations on a whole-program basis, and implicitly making a global data declaration that declares all the exception constructors “up front”.  Exception handlers would then work by pattern matching, and all would be well.  Why this is not done already is a mystery to me; the current strategy looks a lot like a kludge once you see the problem with safety.

Insofar as my interpretation of what is going on here is correct, the example once again calls into question the dogma of the separation of Church from state as it is realized in Haskell.  As beguiling as it is, the idea, in its current form, simply does not work.  Dave MacQueen recently described it to me as a “tar baby”, from the Brer Rabbit story: the more you get involved with it, the more entangled you become.

Update: minor re-wording.


What is a Functional Language?

March 16, 2011

I mentioned in a previous post that I am developing a new course on functional programming for first-year students.  The immediate question, which none of you asked, is “what do you mean by functional programming?”.  Good question!  Now, if only I had a crisp answer, I’d be in good shape.

Socially speaking, the first problem I run into is a fundamental misunderstanding about the term “functional programming”.  Most people think that it stands for an ideological position that is defined in opposition to what everyone else thinks and does.  So in the mind of a typical colleague, when I speak of functional programming, she hears “deviant religious doctrine” and “knows” instinctively to disregard it.  Yet what I mean by functional programming subsumes what everyone else is doing as a narrow special case.  For as readers of this blog surely know, there is no better imperative language than a functional language!  In fact I often say that Haskell is the world’s best imperative programming language; I’m only half (if that much) joking.

So if functional programming subsumes imperative programming, then what the hell am I talking about when I advocate for functional programming at the introductory level?  And why, then, do we also have an imperative programming course?  Very good questions.  I’m not sure I have a definitive answer, but this being a blog, I can air my opinions.

Were it up to me, we would have one introductory course, probably two semesters long, on programming, which would, of course, emphasize functional as well as imperative techniques, all in one language.  Personally, I think this would be the right way to go, and would prepare the students for all future courses in our curriculum, and for work out there in the “real world.”  Not everyone believes me (and sometimes I am not sure I believe myself), so we must compromise.  While it surely sounds all nice and ecumenical to teach both imperative and functional programming, the reality is that we’re teaching old and new programming methods.  It’s not so much the imperative part that counts, but their obsolescence.  It is deemed important (and, to be honest, I can see the merit in the argument) that students learn the “old ways” because that’s how most of the world works.  So we have to use a language that is ill-defined (admits programs that cannot be given any meaning in terms of the language itself), that forces sequential, rather than parallel thinking, that demands manual allocation of memory, and that introduces all sorts of complications, such as null pointers, that need not exist at all.  And then we have to teach about things called “tools” that help manage the mess we’ve created—tools that do things like recover from Boolean blindness or check for null pointer errors.  To be sure, these tools are totally amazing, but then so are hot metal typesetting machines and steam locomotives.

So what, then, is a functional language?  I can do no better than list some criteria that are important for what I do; the languages that satisfy these criteria are commonly called functional languages, so that’s what I’ll call them too.

  1. It should have a well-defined semantics expressed in terms of the programs you write, not in terms of how it is “compiled” onto some hypothetical hardware and software platform.  I want to be able to reason about the code I’m writing, not about the code the compiler writes.  And I want to define the meaning of my program in terms of the constructs I’m using, not in terms of some supposed underlying mechanism that implements it.  This is the essence of abstraction.
  2. It should support both computation by evaluation and computation by execution.  Evaluation is a smooth generalization of high school- and university-level mathematics, and is defined in terms of standard mathematical concepts such as tuples, functions, numbers, and so forth.  Variables are given meaning by substitution, and evaluation consists of simplifying expressions.  Execution is the action of a program on another agent or data structure conceived of as existing independently of the program itself.  The data lives “over there” and the program “over here” diddles that data.  Assignables (my word for what in imperative languges are wrongly called variables) are given meaning by get and put operations (fetch and store), not by substitution.  Execution consists of performing these operations.
  3. It should be natural to consider both persistent and ephemeral data structures.  This is essentially a restatement of (2).  Persistent data structures are values that can be manipulated like any other value, regardless of their apparent “size” or “complexity”.  We can as easily pass functions as arguments and return them as results as we can do the same with integers.  We can think of trees or streams as values in the same manner.  Ephemeral data structures are what you learn about in textbooks.  They are external to the program, and manipulated destructively by execution.  These are useful, but not nearly as central as the current texts would have you believe (for lack of consideration of any other kind).
  4. It should have a natural parallel cost model based on the  dependencies among computations so that one may talk about parallel algorithms and their complexity as naturally as one currently discusses sequential algorithms.  Imperative-only programming languages fail this criterion miserably, because there are too many implicit dependencies among the components of a computation, with the result that sequentiality is forced on you by the artificial constraints of imperative thinking.
  5. It should have a rich type structure that permits introduction of new abstract types and which supports modular program development.  By giving short shrift to expressions and evaluation, imperative language have an impoverished notion of type—and not support for modularity.  Worse, object-oriented programming, a species of imperative programming, is fundamentally antimodular because of the absurd emphasis on inheritance and the reliance on class-based organizations in which functionality metastasizes throughout a program.

What languages satisfy these criteria best?  Principally ML (both Standard ML and Objective Caml, the “objective” part notwithstanding), and Haskell. (Unfortunately, Haskell loses on the cost model, about which more in another post.)

This is why we teach ML.