## What, If Anything, Is A Declarative Language?

July 18, 2013

Back in the 1980s it was very fashionable to talk about “declarative” programming languages.  But to my mind there was never a clear definition of a “declarative language”, and hence no way to tell what is declarative and what is not.  Lacking any clear meaning, the term came to refer to the arbitrary conflation of functional with logic programming to such an extent that “functional-and-logic-programming” almost became a Germanic thing-in-itself (Ding an sich).  Later, as the logic programming wave subsided, the term “declarative”, like “object-oriented”, came to be an expression of approval, and then, mercifully, died out.

Or so I had thought.  Earlier this week I attended a thriller of an NSF-sponsored workshop on high-level programming models for parallelism, where I was surprised by the declarative zombie once again coming to eat our brains.  This got me to thinking, again, about whether the term has any useful meaning.  For what it’s worth, and perhaps to generate useful debate, here are some things that I think people mean, and why I don’t think they mean very much.

1. “Declarative” means “high-level”.  This just seems to replace one vague term by another.
2. “Declarative” means “not imperative”.  But this flies in the face of reality.  Functional languages embrace and encompass imperative programming as a special case, and even Prolog has features, such as assert and retract, that have imperative meaning.
3. “Declarative” means “functional”.  OK, but then we don’t really need another word.
4. “Declarative” means “what, not how”.  But every language has an operational semantics that defines how to execute its programs, and you must be aware of that semantics to understand programs written in it.  Haskell has a definite evaluation order, just as much as ML has a different one, and even Prolog execution is defined by a clear operational semantics that determines the clause order and that can be influenced by “cut”.
5. “Declarative” means “equational”.  This does not distinguish anything, because there is a well-defined notion of equivalence for any programming language, namely observational equivalence.  Different languages induce different equivalences, of course, but how does one say that one equivalence is “better” than another?  At any rate, I know of no stress on equational properties of logic programs, so either logic programs are not “declarative” or “equational reasoning” is not their defining characteristic.
6. “Declarative” means “referentially transparent”.  The misappropriation of Quine’s terminology only confuses matters.  All I’ve been able to make of this is that “referentially transparent” means that beta-equivalence is valid.  But beta-equivalence is not a property of an arbitrary programming language, nor in any case is it clear why this equivalence is first among equals.  In any case why would you decide a priori on what equivalences you want before you even know what it means to run a program?
7. “Declarative” means “has a denotation”.  This gets closer to the point, I think, because we might well say that a declarative semantics is one that gives meaning to programs as some kind of mapping between some sort of spaces.  In other words, it would be a synonym for “denotational semantics”.  But every language has a denotational semantics (perhaps by interpretation into a Kripke structure to impose sequencing), so having one does not seem to distinguish a useful class of languages.  Moreover, even in the case of purely functional programs, the usual denotational semantics (as continuous maps) is not fully abstract, and the fully abstract semantics (as games) is highly operational.  Perhaps a language is declarative in proportion to being able to give it semantics in some “familiar” mathematical setting?
8. “Declarative” means “implicitly parallelizable”.  This was certainly the sense intended at the NSF meeting, but no two “declarative” languages seemed to have much in common.  Charlie Garrod proposes just “implicit”, which is pretty much synonymous with “high level”, and may be the most precise sense there is to the term.
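To make the point about operational semantics in item 4 concrete, here is a minimal sketch in Standard ML. The names `counter`, `first`, `result`, and `ticks` are mine, chosen only for illustration. The function ignores its second argument, yet the effect in that argument still happens, because ML evaluates arguments before the call; one cannot predict the final state without knowing the evaluation order.

```sml
(* Illustrative names only; none of them come from the post. *)
val counter = ref 0

(* Returns its first argument and ignores the second. *)
fun first (x : int, _ : int) : int = x

(* SML evaluates the argument tuple before calling first, so the
   increment runs even though its result is thrown away. *)
val result = first (1, (counter := !counter + 1; 2))

(* Read back the state after the call. *)
val ticks = !counter
```

Under a call-by-need discipline the discarded argument would never be evaluated, which is exactly why “what, not how” cannot be the whole story.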

No doubt this list is not exhaustive, but I think it covers many of the most common interpretations.  It seems to me that none of them have a clear meaning or distinguish a well-defined class of languages.  Which leads me to ask, is there any such thing as a declarative programming language?

[Thanks to the late Steve Gould for inspiring the title of this post.]

[Update: wordsmithing.]

## Intro Curriculum Update

August 17, 2012

In previous posts I have talked about the new introductory CS curriculum under development at Carnegie Mellon. After a year or so of planning, we began to roll out the new curriculum in the Spring of 2011, and have by now completed the transition. As mentioned previously, the main purpose was to bring the introductory sequence up to date, with particular emphasis on introducing parallelism and verification. A secondary purpose was to restore the focus on computing fundamentals, and correct the drift towards complex application frameworks that offer the students little sense of what is really going on. (The poster child was a star student who admitted that, although she had built a web crawler the previous semester, she in fact had no idea how to build a web crawler.) A particular problem is that what should have been a grounding in the fundamentals of algorithms and data structures turned into an exercise in object-oriented programming, swamping the core content with piles of methodology of dubious value to beginning students. (There is a new, separate, upper-division course on OO methodology for students interested in this topic.) A third purpose was to level the playing field, so that students who had learned about programming on the street were challenged as much as, if not more than, students without much or any such experience. One consequence would be to reduce the concomitant bias against women entering CS, far fewer of whom have prior computing experience than the men.

The solution was a complete do-over, jettisoning the traditional course completely, and starting from scratch. The most important decision was to emphasize functional programming right from the start, and to build on this foundation for teaching data structures and algorithms. Not only does FP provide a much more natural starting point for teaching programming, it is infinitely more amenable to rigorous verification, and provides a natural model for parallel computation. Every student comes to university knowing some algebra, and they are therefore familiar with the idea of computing by calculation (after all, the word algebra derives from the Arabic al-jabr, the “restoration” of terms when solving an equation). Functional programming is a generalization of algebra, with a richer variety of data structures and a richer set of primitives, so we can build on that foundation. It is critically important that variables in FP are, in fact, mathematical variables, and not some distorted facsimile thereof, so all of the students’ mathematical intuitions are directly applicable. So we can immediately begin discussing verification as a natural part of programming, using principles such as mathematical induction and equational reasoning to guide their thinking. Moreover, there are natural concepts of sequential time complexity, given by the number of steps required to calculate an answer, and parallel time complexity, given by the data dependencies in a computation (often made manifest by the occurrences of variables). These central concepts are introduced in the first week, and amplified throughout the semester.
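The two cost measures can be illustrated with a small sketch. This is not taken from the course materials; the datatype `tree` and the functions `sum`, `work`, and `span` are names I have made up for the purpose, and the cost model (one step per constructor visited) is a simplifying assumption. The sequential cost adds the costs of the two subtrees, whereas the parallel cost takes their maximum, because the two recursive calls share no data.

```sml
(* Illustrative sketch; names and cost model are assumptions. *)
datatype tree = Leaf of int | Node of tree * tree

(* Sum the leaves.  The two recursive calls are independent, so
   they could be evaluated in parallel. *)
fun sum (Leaf n) = n
  | sum (Node (l, r)) = sum l + sum r

(* Sequential cost: one step per constructor visited. *)
fun work (Leaf _) = 1
  | work (Node (l, r)) = 1 + work l + work r

(* Parallel cost: the longest chain of dependent steps. *)
fun span (Leaf _) = 1
  | span (Node (l, r)) = 1 + Int.max (span l, span r)

val t = Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 4))
val total = sum t      (* 10 *)
val seqCost = work t   (* 7: three nodes and four leaves *)
val parCost = span t   (* 3: the depth of the tree *)
```

Each function is defined by structural recursion, so claims about it (for instance, that `span t` never exceeds `work t`) follow by structural induction on the tree.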

Two major homework exercises embody the high points of the first-semester course, one to do with co-development of code with proof, the other to do with parallelism. Co-development of program and proof is illustrated by an online regular expression matcher. The problem is a gem for several reasons. One is that it is essentially impossible for anyone to solve by debugging a blank screen. This sets us up nicely for explaining the importance of specification and proof as part of the development process. Another is that it is very easy, almost inevitable, for students to make mistakes that are not easily caught or diagnosed by testing. We require the students to carry out a proof of the correctness of the matcher, and in the process discover a point where the proof breaks down, which then leads to a proper solution. (See my JFP paper “Proof-Directed Debugging” for a detailed development of the main ideas.) The matcher also provides a very nice lesson in the use of higher-order functions to capture patterns of control, resulting in an extremely clean and simple matcher whose correctness proof is highly non-trivial.
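For readers who have not seen a matcher of this kind, here is a minimal sketch in the continuation-passing style, written from memory of the general technique rather than copied from the assignment. The constructor names, `match`, and `accepts` are my own choices. The continuation `k` says what to do with the remaining input. The comparison `cs' <> cs` in the `Star` case is one way to ensure progress; without some such check, a starred expression whose body accepts the empty string can loop forever, which is the sort of mistake that testing rarely reveals but a termination proof does.

```sml
(* Illustrative sketch; constructor and function names are assumptions. *)
datatype regexp =
    Zero                       (* matches nothing *)
  | One                        (* matches the empty string *)
  | Char of char
  | Plus of regexp * regexp    (* alternation *)
  | Times of regexp * regexp   (* concatenation *)
  | Star of regexp             (* iteration *)

(* match r cs k: does some prefix of cs match r, with the
   continuation k accepting the remaining suffix? *)
fun match Zero _ _ = false
  | match One cs k = k cs
  | match (Char c) cs k =
      (case cs of
         nil => false
       | c' :: cs' => c = c' andalso k cs')
  | match (Plus (r1, r2)) cs k =
      match r1 cs k orelse match r2 cs k
  | match (Times (r1, r2)) cs k =
      match r1 cs (fn cs' => match r2 cs' k)
  | match (Star r) cs k =
      k cs orelse
      (* Iterate only if r consumed some input, so the recursion
         is always on a strictly shorter suffix. *)
      match r cs (fn cs' => cs' <> cs andalso match (Star r) cs' k)

(* The whole string matches when the final continuation sees
   no leftover input. *)
fun accepts r s = match r (String.explode s) List.null

val ex1 = accepts (Star (Char #"a")) "aaa"
val ex2 = accepts (Star (Star (Char #"a"))) "aab"
val ex3 = accepts (Times (Char #"a", Star (Plus (Char #"b", One)))) "abb"
```

Here `ex1` and `ex3` are `true` and `ex2` is `false`; the second and third examples would not terminate if the progress check were dropped.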

The main parallelism example is the Barnes-Hut algorithm for solving the n-body problem in physics. Barnes-Hut is an example of a “tree-based” method, invented by Andrew Appel, for solving this well-known problem. At a high level the main idea is to decompose space into octants (or quadrants if you’re working in the plane), recursively solving the problem for each octant and then combining the solutions to make an overall solution. The key idea is to use an approximation for bodies that are far enough away—a distant constellation can be regarded as an atomic body for the purposes of calculating the effects of its stars on the sun, say. The problem is naturally parallelizable, because of the recursive decomposition. Moreover, it provides a very nice link to their high school physics. Since FP is just an extension of mathematics, the equations specifying the force law and Newton’s Law carry over directly to the code. This is an important sense in which FP builds on and amplifies their prior mathematical experience, and shows how one may connect computer science with other sciences in a rather direct manner.

The introductory FP course establishes the basis for the new data structures and algorithms course that most students take in either the third or fourth semester. This course represents a radical departure from tradition in several ways. First, it is a highly rigorous course in algorithms that rivals the upper-division course taught at most universities (including our own) for the depth and breadth of ideas it develops. Second, it takes the stance that all algorithms are parallel algorithms, with sequential being but a special case of parallel. Of course some algorithms have a better parallelizability ratio (a precise technical characterization of the ability to make use of parallelism), and this can be greatly affected by data structure selection, among other considerations. Third, the emphasis is on persistent abstract types, which are indispensable for parallel computing. No more linked lists, no more null pointer nonsense, no more mutation. Instead we consider abstract types of graphs, trees, heaps, and so forth, all with a persistent semantics (operations create “new” ones, rather than modify “old” ones). Fourth, we build on the reasoning methods introduced in the first semester course to prove the correctness and the efficiency of algorithms. Functional programming makes all of this possible. Programs are concise and amenable to proof, they are naturally parallel, and they enjoy powerful typing properties that enforce abstraction in a mathematically rigorous manner. Fifth, there is a strong emphasis on problems of practical interest. For example, homework 1 is the shotgun method for genome sequencing, a parallel algorithm of considerable practical importance and renown.
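As a small illustration of what a persistent abstract type looks like, consider the following sketch. It is not from the course; the signature `INT_SET` and structure `TreeSet` are hypothetical names, and the unbalanced search tree is chosen for brevity rather than efficiency. Inserting into a set yields a new set and leaves the old one intact, so both may be used afterwards, possibly by computations running in parallel.

```sml
(* Illustrative sketch; INT_SET and TreeSet are assumed names. *)
signature INT_SET = sig
  type set
  val empty  : set
  val insert : int * set -> set
  val member : int * set -> bool
end

(* Opaque ascription hides the representation from clients. *)
structure TreeSet :> INT_SET = struct
  datatype set = Empty | Node of set * int * set

  val empty = Empty

  (* Builds a new path to the inserted element and shares the
     untouched subtrees with the original set. *)
  fun insert (x, Empty) = Node (Empty, x, Empty)
    | insert (x, s as Node (l, y, r)) =
        if x < y then Node (insert (x, l), y, r)
        else if x > y then Node (l, y, insert (x, r))
        else s

  fun member (_, Empty) = false
    | member (x, Node (l, y, r)) =
        if x < y then member (x, l)
        else if x > y then member (x, r)
        else true
end

val s1 = TreeSet.insert (3, TreeSet.empty)
val s2 = TreeSet.insert (5, s1)   (* s1 itself is unchanged *)
```

After these declarations `s2` contains both 3 and 5, while `s1` still contains only 3.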

There is a third introductory course in imperative programming, taken in either the first or second semester (alternating with the functional programming course at the student’s discretion), that teaches the “old ways” of doing things using a “safe” subset of C. Personally, I think this style of programming is obsolete, but there are many good reasons to continue to teach it, the biggest probably being the legacy problem. The emphasis is on verification, using simple assertions that are checked at run-time and which may be verified statically in some situations. It is here that students learn how to do things “the hard way” using low-level representations. This course is very much in the style of the 1970s-era data structures course, the main difference being that the current incarnation of Pascal has curly braces, rather than begin-end.

For the sake of those readers who may not be up to date on such topics, it seems important to emphasize that functional programming subsumes imperative programming. Every functional language is capable of expressing the old-fashioned sequential pointer-hacking implementation of data structures. You can even reproduce Tony Hoare’s self-described “billion dollar mistake” of the cursed “null pointer” if you’d like! But the whole point is that it is rarely useful, and almost never elegant, to work that way. (Curiously, the “monad mania” in the Haskell community stresses an imperative, sequential style of programming that is incompatible with parallelism, but this is not forced on you as it is in the imperative world.) From this point of view there is no loss, and considerable gain, in teaching functional programming from the start. It puts a proper emphasis on mathematically sane programming methods, but allows for mutation-based programming where appropriate (for example, in engendering “side effects” on users).
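To see the subsumption concretely, here is a sketch of the same function written both ways in Standard ML. The names `factImperative` and `factFunctional` are mine. The first version uses reference cells and a `while` loop, exactly as one would in C; the second is the direct recursive definition.

```sml
(* Illustrative sketch; function names are assumptions. *)

(* Imperative style: two mutable cells and a loop. *)
fun factImperative (n : int) : int =
  let
    val acc = ref 1
    val i = ref n
  in
    while !i > 0 do (acc := !acc * !i; i := !i - 1);
    !acc
  end

(* Functional style: the defining equations of factorial. *)
fun factFunctional 0 = 1
  | factFunctional n = n * factFunctional (n - 1)

val viaLoop = factImperative 5
val viaRecursion = factFunctional 5
```

Both compute the same result; the difference is that the second can be checked against its specification by a one-line induction, whereas the first needs a loop invariant.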

I encourage readers to review the syllabi and course materials. There is quite a large body of material in place that we plan to expand into textbook-level treatments in the near future. We also plan to write a journal article documenting our experiences with these courses.

I am very grateful to my colleagues Guy Blelloch, Dan Licata, and Frank Pfenning for their efforts in helping to develop the new introductory curriculum at Carnegie Mellon, and to the teaching assistants whose hard work and dedication put the ideas into practice.

Update: Unfortunately, the homework assignments for these courses are not publicly available, because we want to minimize the temptation for students to make use of old assignments and solutions in preparing their own work.  I am working with my colleagues to find some way in which we can promote the ideas without sacrificing too much of the integrity of the course materials.  I apologize for the inconvenience.

## Words Matter

February 1, 2012

Yesterday, during a very nice presentation by Ohad Kammar at Carnegie Mellon, the discussion got derailed, in part, because of a standard, and completely needless, terminological confusion involving the word “variable”.  I’m foolish enough to try to correct it.

The problem is that we’ve all been taught to confuse variables with variables—that is, program variables with mathematical variables.  The distinction is basic.  Since time immemorial (well, at least since al Khwarizmi) we have had the notion of a variable, properly so-called, which is given meaning by substitution.  A variable is an unknown, or indeterminate, quantity that can be replaced by any value of its type (a type being, at least since Russell, the range of significance of a variable).  Frege gave the first systematic study of the quantifiers, and Church exploited the crucial concept of a variable to give the most sharply original and broadly applicable model of computation, the $\lambda$-calculus.

Since the dawn of Fortran something that is not a variable has come to be called a variable.  A program variable, in the sense of Fortran and every imperative language since, is not given meaning by substitution.  Rather, it is given meaning by (at least) two operations associated with it, one to get its contents and one to put new contents into it.  (And, maybe, an operation to form a reference to it, as in C or even Algol.)  Now as many of you know, I think that the concept of a program variable in this sense is by and large a bad idea, or at any rate not nearly as important as it has been made out to be in conventional (including object-oriented) languages, but that’s an argument for another occasion.

Instead, I’m making a plea.  Let’s continue to call variables variables.  It’s a perfectly good name, and refers to what is perhaps one of the greatest achievements of the human mind, the fundamental concept of algebra, the variable.  But let’s stop calling those other things variables!  In my Practical Foundations for Programming Languages I coined (as far as I know) a word that seems perfectly serviceable, namely an assignable.  The things called variables in imperative languages should, rather, be called assignables.  The word is only a tad longer than variable, and rolls off the tongue just as easily, and has the advantage of being an accurate description of what it really is.  What’s not to like?
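Standard ML happens to have both concepts, so the distinction can be shown in a few lines. This is only a sketch, and the names `addX`, `addA`, `seven`, and `eight` are mine. A variable stands for its value once and for all, so rebinding the name `x` introduces a new variable and leaves earlier uses untouched. An assignable is a cell with get (`!`) and put (`:=`) operations, so a later put changes what earlier code observes.

```sml
(* A variable: given meaning by substitution. *)
val x = 3
fun addX n = n + x     (* same as: fun addX n = n + 3 *)
val x = 4              (* a new variable that shadows the old one *)
val seven = addX 4     (* still 4 + 3 *)

(* An assignable: given meaning by its get and put operations. *)
val a = ref 3
fun addA n = n + !a    (* reads the current contents on every call *)
val () = a := 4        (* changes the contents of the same cell *)
val eight = addA 4     (* now 4 + 4 *)
```

Note that `a` itself is a variable in the proper sense; it stands for a fixed cell, and it is the contents of that cell that change.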

Why bother?  For one thing, some languages have both concepts, a necessity if you want your language to be mathematically civilized (and you do).  For another, in the increasingly important world of program verification, the specification formalisms, being mathematical in nature, make use of variables, which most definitely are not assignables!  But the real reason to make the distinction is, after all, because words matter.  Two different things deserve to have two different names, and it only confuses matters to use the same word for both.  This week’s confusion was only one example of many that I have seen over the years.

So, my suggestion: let’s call variables variables, and let’s call those other things assignables.  In the fullness of time (i.e., once the scourge of imperative programming has been lifted) we may not need the distinction any longer.  But until then, why not draw the distinction properly?

Update: remove forward reference.

## Of Course ML Has Monads!

May 1, 2011

A popular meme in the world of PL’s is that “Haskell has monads”, with the implication that this is a distinctive feature of the language, separate from all others.  While it is true that Haskell has popularized the use of monads as a program structuring device, the idea of a monad is not so much an issue of language design (apart from the ad hoc syntactic support provided by Haskell), but rather one of library design.  After all, a monad is just one of a zillion signatures (type classes) with which to structure programs, and there is no particular reason why this one cannot be used in any language that supports even a modicum of modularity.

Examined from the point of view of ML, monads are but a particular use of modules.  The signature of monads is given by the definition

    signature MONAD = sig
      type 'a monad
      val ret : 'a -> 'a monad
      val bnd : 'a monad -> ('a -> 'b monad) -> 'b monad
    end

There are many, many, many structures that satisfy this signature; I needn’t (and, in any case, can’t) rehearse them all here.  One particularly simple example should suffice to give the general idea:

    structure Option : MONAD = struct
      type 'a monad = 'a option
      fun ret x = SOME x
      fun bnd (SOME x) k = k x
        | bnd NONE k = NONE
    end

This is of course the option monad, which is sometimes used to model the data flow aspects of exceptions, perhaps with some elaboration of the NONE case to associate an exceptional value with a non-result.  (The control flow aspects are not properly modeled this way, however.  For that one needs, in addition, access to some sort of jump mechanism.)

Examples like this one proliferate.  A monad is represented by a structure.  Any structure that provides the facilities specified by the MONAD signature gives rise to the characteristic sequentialization mechanisms codified by it.  Monad transformers are functors that transform one monad into another, with no fuss or bother, and no ad hoc mechanisms required.  Standard modular programming techniques suffice to represent monads; moreover, the techniques involved are fully general, and are equally applicable to other signatures of interest (arrows, or quivers, or bows, or what have you).  Moreover, it is shown in my paper with Chakravarty, Dreyer, and Keller how to integrate modules into the type inference mechanism of ML so that one can get automatic functor instantiation in those limited cases where it is self-evident what is intended.  This has been implemented by Karl Crary in a prototype compiler for an extension of Standard ML, and it would be good to see this supported in more broadly available compilers for the language.
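To illustrate the remark about monad transformers, here is a minimal sketch of one written as a functor. The `MONAD` signature is repeated so that the example stands alone; `OptionT`, `Identity`, and `OptId` are names I have chosen for illustration, not part of any standard library. The functor takes an arbitrary monad `M` and produces a monad whose computations may additionally fail to return a result.

```sml
signature MONAD = sig
  type 'a monad
  val ret : 'a -> 'a monad
  val bnd : 'a monad -> ('a -> 'b monad) -> 'b monad
end

(* Hypothetical transformer: layers optional results over M. *)
functor OptionT (M : MONAD) : MONAD = struct
  type 'a monad = 'a option M.monad
  fun ret x = M.ret (SOME x)
  (* Run m in the underlying monad, then continue only if it
     produced a value. *)
  fun bnd m k =
    M.bnd m (fn SOME x => k x
              | NONE   => M.ret NONE)
end

(* The trivial monad, used here only to try the transformer out. *)
structure Identity : MONAD = struct
  type 'a monad = 'a
  fun ret x = x
  fun bnd m k = k m
end

(* Transforming the identity monad recovers the option monad. *)
structure OptId = OptionT (Identity)

val three = OptId.bnd (OptId.ret 2) (fn x => OptId.ret (x + 1))
val none  = OptId.bnd NONE (fn x => OptId.ret (x + 1))
```

Because the ascriptions are transparent, `three` is visibly `SOME 3` and `none` is `NONE`. Nothing beyond ordinary functor application was needed.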

The bulk of the mania about monads is therefore accounted for by modules.  I have no doubt, however, that you are wondering about the IO monad in Haskell.  Isn’t that a fundamental feature of the language that cannot be replicated in ML?  Hardly!  It’s entirely a matter of designing the signatures of the standard basis library modules, and nothing more.  The default basis library does not attempt to segregate effects into a monad, but it is perfectly straightforward to do this yourself, by providing your own layer over the standard basis, or to reorganize the standard basis to enforce the separation.  For example, the signature of reference cells might look like this:

    signature REF = sig
      type 'a ref
      val ref : 'a -> 'a ref IO.monad
      val ! : 'a ref -> 'a IO.monad
      val := : 'a ref -> 'a -> unit IO.monad
    end

Here we are presuming that we have a fixed declaration

    structure IO : MONAD = ...

that packages up the basic IO primitives that are already implemented in the run-time system of ML, more or less like in Haskell.  The other signatures, such as those for mutable arrays or for performing input and output, would be modified in a similar manner to push effects into the IO monad.  Et voila, you have monadic effects.
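Here is a minimal sketch of how such a layer might look when written in user code, under several assumptions that I should state plainly. The monad is represented as a suspended computation (`unit -> 'a`); `delay` and `run` are hypothetical operations, the first for library implementors to wrap a primitive effect and the second to execute a program at the top level. I have used the alphabetic names `cell`, `new`, `get`, and `set` in place of `ref`, `!`, and `:=`, since rebinding `ref` is not something a compiler is obliged to accept. None of this is the design referred to in the post; it only shows that the idea fits in a page of ordinary ML.

```sml
signature MONAD = sig
  type 'a monad
  val ret : 'a -> 'a monad
  val bnd : 'a monad -> ('a -> 'b monad) -> 'b monad
end

(* Hypothetical extension of MONAD with two extra operations. *)
signature IO_MONAD = sig
  include MONAD
  val delay : (unit -> 'a) -> 'a monad   (* wrap a primitive effect *)
  val run   : 'a monad -> 'a             (* execute at the top level *)
end

(* Opaque ascription: clients cannot see that a computation is
   just a suspended function. *)
structure IO :> IO_MONAD = struct
  type 'a monad = unit -> 'a
  fun ret x = fn () => x
  fun bnd m k = fn () => k (m ()) ()
  fun delay f = f
  fun run m = m ()
end

(* Reference cells whose operations live in the IO monad. *)
signature MONADIC_REF = sig
  type 'a cell
  val new : 'a -> 'a cell IO.monad
  val get : 'a cell -> 'a IO.monad
  val set : 'a cell -> 'a -> unit IO.monad
end

structure MRef :> MONADIC_REF = struct
  type 'a cell = 'a ref
  fun new x = IO.delay (fn () => ref x)
  fun get c = IO.delay (fn () => !c)
  fun set c x = IO.delay (fn () => c := x)
end

(* A small monadic program: allocate, update, read back. *)
val program : int IO.monad =
  IO.bnd (MRef.new 1) (fn c =>
  IO.bnd (MRef.set c 41) (fn () =>
  IO.bnd (MRef.get c) (fn n =>
  IO.ret (n + 1))))

val answer = IO.run program
```

Building `program` performs no effects; they occur only when `IO.run` is applied, and clients of `MRef` have no way to touch a cell except through the monad.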

There’s really nothing to it.  In fact, the whole exercise was carried out by a Carnegie Mellon student, Phillippe Ajoux, a couple of years ago.  He also wrote a number of programs in this style just to see how it all goes: swimmingly.  He also devised syntactic extensions to the Moscow ML compiler that provide a nicer notation for programming with monads, much as in Haskell, but better aligned with ML’s conventions.  (Ideally it should be possible to provide syntactic support for any signature, not just monads, but I’m not aware of a worked-out design for the general case, involving as it would an intermixing of parsing and elaboration.)

My point is that the ML module system can be deployed by you to impose the sorts of effect segregation imposed on you by default in Haskell.  There is nothing special about Haskell that makes this possible, and nothing special about ML that inhibits it.  It’s all a mode of use of modules.

So why don’t we do this by default?  Because it’s not such a great idea.  Yes, I know it sounds wonderful at first, but then you realize that it’s pretty horrible.  Once you’re in the IO monad, you’re stuck there forever, and are reduced to Algol-style imperative programming.  You cannot easily convert between functional and monadic style without a radical restructuring of code.  And you are deprived of the useful concept of a benign effect.
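By a benign effect I mean, roughly, one that is used internally but cannot be observed by clients. A standard illustration is memoization, sketched below; the names `cache`, `lookup`, and `fib` are mine, and the association list is chosen for brevity, not speed. The function mutates a hidden table, yet it has the ordinary type `int -> int` and always returns the same answer for the same argument.

```sml
local
  (* Hidden table of (argument, result) pairs. *)
  val cache : (int * int) list ref = ref []
  fun lookup n = List.find (fn (m, _) => m = n) (!cache)
in
  (* Behaves like the pure Fibonacci function, but remembers
     results it has already computed. *)
  fun fib n =
    if n < 2 then n
    else
      case lookup n of
        SOME (_, v) => v
      | NONE =>
          let
            val v = fib (n - 1) + fib (n - 2)
          in
            cache := (n, v) :: !cache;
            v
          end
end

val f30 = fib 30
```

If every use of a reference cell forced the result into the IO monad, `fib` could not be given this type, and every caller would have to be restructured to match.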

The moral of the story is that of course ML “has monads”, just like Haskell.  Whether you want to use them is up to you; they are just as useful, and just as annoying, in ML as they are in Haskell.  But they are not forced on you by the language designers!

Update: This post should’ve been called “ML Has Monads, Why Not?”, or “Of Course ML Has Comonads!”, but then no one was wondering about that.

Update: I now think that the last sentence is excessive.  My main point is simply that it’s very simple to go one way or the other with effects, if you have modules to structure things; it’s all a matter of library design.  A variant of ML that enforced the separation of effects is very easily constructed; the question is whether it is useful or not.  I’ve suggested that the monadic separation is beguiling, but not clearly a great idea.  Alternatively, one can say that we’re not that far away from eliminating laziness from Haskell, at least in this respect: just re-do the standard basis library in ML, and you’re a good ways there.  Plus you have modules, and we understand how to integrate type classes with modules, so the gap is rather small.

Update: Removed inaccurate remarks about unsafePerformIO.
