A few new papers

July 21, 2014

I’ve just updated my web page with links to some new papers that are now available:

  1. “Homotopical Patch Theory” by Carlo Angiuli, Ed Morehouse, Dan Licata, and Robert Harper. To appear, ICFP, Gothenburg, October 2014. We’re also preparing an expanded version with a new appendix containing material that didn’t make the cut for ICFP. (Why do we still have such rigid space limitations?  And why do we have such restricted pre-publication deadlines as we go through the charade of there being a “printing” of the proceedings? One day soon CS will step into its own bright new future.) The point of the paper is to show how to apply basic methods of homotopy theory to various equational theories of patches for various sorts of data. One may see it as an application of functorial semantics in HoTT, in which theories are “implemented” by interpretation into a universe of sets. The patch laws are necessarily respected by any such interpretation, since they are just cells of higher dimension and functors must behave functorially at all dimensions.
  2. “Cache Efficient Functional Algorithms” by Guy E. Blelloch and Robert Harper. To appear, Comm. ACM Research Highlight this fall.  Rewritten version of POPL 2013 paper meant for a broad CS audience.  Part of a larger effort to promote integration of combinatorial theory with logical and semantic theory, two theory communities that, in the U.S. at least, ignore each other completely.  (Well, to be plain about it, it seems to me that the ignoring goes more in one direction than the other.)  Cost semantics is one bridge between the two schools of thought, abandoning the age-old “reason about the compiled code” model used in algorithm analysis.  Here we show that one can reason about spatial locality at the abstract level, without having to drop down to the low level of how data structures are represented and allocated in memory.
  3. “Refining Objects (Preliminary Summary)” by Robert Harper and Rowan Davies. To appear, Luca Cardelli 60th Birthday Celebration, Cambridge, October, 2014.  A paper I’ve been meaning to write sometime over the last 15 years, and finally saw the right opportunity, with Luca’s symposium coming up and Rowan Davies visiting Carnegie Mellon this past spring.  Plus it was a nice project to get me started working again after I was so rudely interrupted this past fall and winter.  Provides a different take on typing for dynamic dispatch that avoids the ad hoc methods introduced for OOP, instead deploying standard structural and behavioral typing techniques to do more with less.  This paper is a first cut as proof of concept, but it is clear that much more can be said here, all within the framework of standard proof-theoretic and realizability-theoretic interpretations of types.  It would help to have read the relevant parts of PFPL, particularly the under-development second edition, which provides the background elided in the paper.
  4. “Correctness of Compiling Polymorphism to Dynamic Typing” by Kuen-Bang Hou (Favonia), Nick Benton, and Robert Harper, draft (summer 2014).  Classically, polymorphic type assignment starts with untyped \lambda-terms and assigns types to them as descriptions of their behavior.  Viewed as a compilation strategy for a polymorphic language, type assignment is rather crude in that every expression is compiled in uni-typed form, complete with the overhead of run-time classification and class checking.  A more subtle strategy is to maintain as much structural typing as possible, resorting to the use of dynamic typing (recursive types, naturally) only for variable types.  The catch is that polymorphic instantiation requires computation to resolve the incompatibility between, say, a bare natural number, which you want to compute with, and its encoding as a value of the one true dynamic type, which you never want but are stuck with in dynamic languages.  In this paper we work out an efficient compilation scheme that maximizes statically available information, and makes use of dynamic typing only insofar as the program demands we do so.  Of course there are better ways to compile polymorphism, but this style is essentially forced on you by virtual machines such as the JVM, so it is worth studying the correctness properties of the translation, which we do here making use of a combination of structural and behavioral typing.

I hope to comment here more fully on these papers in the near future, but I also have a number of other essays queued up to go out as soon as I can find the time to write them.  Meanwhile, other deadlines loom large.

[Update: added fourth item neglected in first draft.  Revise formatting.  Add links to people. Brief summary of patch theory paper.  Minor typographical corrections.]

[Update: the promised expanded version of the forthcoming ICFP paper is now available.]


Summer of Programming Languages

July 6, 2014

Having just returned from the annual Oregon Programming Languages Summer School, at which I teach every year, I am once again very impressed with the growth in the technical sophistication of the field and with its ability to attract brilliant young students whose enthusiasm and idealism are inspiring.  Eugene was, as ever, an ideal location for the summer school, providing a gorgeous setting for work and relaxation.  I was particularly glad for the numerous chances to talk with students outside of the classroom, usually over beer, and I enjoyed, as usual, the superb cycling conditions in Eugene and the surrounding countryside.  Many students commented to me that the atmosphere at the summer school is wonderful, filled with people who are passionate about programming languages research, and suffused with a spirit of cooperation and sharing of ideas.

The school was started by Zena Ariola a dozen years ago; this year’s instance was organized by Greg Morrisett and Amal Ahmed in consultation with Zena.  As usual, the success of the school depended critically on the dedication of Jim Allen, who has been the de facto chief operating officer since its inception.  Without Jim, OPLSS could not exist.  His attention to detail and his engagement with the students are legendary.  Support from the National Science Foundation CISE Division, ACM SIGPLAN, Microsoft Research, Jane Street Capital, and BAE Systems was essential for providing an excellent venue, for supporting a roster of first-rate lecturers, and for supporting the participation of students who might otherwise not have been able to attend.  And, of course, an outstanding roster of lecturers donated their time to come to Eugene for a week to share their ideas with the students and their fellow lecturers.

The schedule of lectures is posted on the web site; all of the lectures were taped and are available on the web.  In addition many speakers provided course notes, software, and other backing materials that are also available online.  So even if you were not able to attend, you can still benefit from the summer school, and perhaps feel more motivated to come next summer.  Greg and I will be organizing, in consultation with Zena.  Applying the principle “don’t fix what isn’t broken”, we do not anticipate major changes, but there is always room for improvement and the need to freshen up the content every year.  For me the central idea of the summer school is the applicability of deep theory to everyday practice.  Long a dream held by researchers such as me, these connections become more “real” every year as the theoretical abstractions of yesterday become the concrete practices of today.  It’s breathtaking to see how far we’ve come from the days when I was a student just beginning to grasp the opportunities afforded by ideas from proof theory, type theory, and category theory (the Holy Trinity) for building beautiful software systems.  No longer the abstruse fantasies of mad (computer) scientists, these ideas are the very air we breathe in PL research.  Gone are the days of ad hoc language designs done in innocence of the foundations on which they rest.  Nowadays serious industrial-strength languages are emerging that are grounded in theory and informed by practice.

Two examples have arisen just this summer, Rust (from Mozilla) and Swift (from Apple), that exemplify the trend.  Although I have not had time to study them carefully, much less write serious code using them, it is evident from even a brief review of their web sites that these are serious languages that take account of the academic developments of the last couple of decades in formulating new language designs to address new classes of problems that have arisen in programming practice.  These languages are type safe, a basic criterion of sensibility, and feature sophisticated type systems that include ideas such as sum types, which have long been missing from commercial languages, or provided only in comically obtuse ways (such as objects).  The infamous null pointer mistakes have been eradicated, and the importance of pattern matching (in the sense of the ML family of languages) is finally being appreciated as the cure for Boolean blindness.  For once I can look at new industrial languages without an overwhelming sense of disappointment, but instead with optimism and enthusiasm that important ideas are finally, at long last, being recognized and adopted.  As has often been observed, it takes 25 years for an academic language idea to make it into industrial practice.  With Java it was simply the 1970’s idea of automatic storage management; with languages such as Rust and Swift we are seeing ideas from the 80’s and 90’s make their way into industrial practice.  It’s cause for celebration, and encouragement for those entering the field: the right ideas do win out in the end, one just has to have the courage to be irrelevant.

I hope to find the time to comment more meaningfully on the recent developments in practical programming languages, including Rust and Swift, but also languages such as Go and OCaml that are making inroads into programming practice.  (I’ve had quite enough to say about Haskell for the time being, so I’ll give that one a rest, but with a tip of the hat to its enormous popularity and influence, despite my criticisms.)  But for now, let me say that the golden age of programming language research is here and now, and promises to continue indefinitely as we develop a grand unified theory of programming and mathematics.


Bellman Confirms A Suspicion

April 21, 2014

As is by now well-known, I regard the supposed opposition between static and dynamic languages as a fallacy: the latter, being a special case of the former, can scarcely be an alternative to it.  I cannot tell you how many times I’ve been handed arguments along the lines of “oh, static languages are just fine, but I want something more dynamic,” the speaker not quite realizing the absurdity of what they are saying.  Yet somehow this sort of argument has an appeal, and I’ve often wondered why.  I think it’s mostly just semantic innocence, but I’ve long had the suspicion that part of it is that it sounds good to be dynamic (active, outgoing, nimble) rather than static (passive, boring, staid).  As we all know, much of the popularity of programming languages comes down to such superficialities and misunderstandings, so what else is new?

Well, nothing, really, except that I recently learned (from Guy Blelloch) the origin of the notably inapt term dynamic programming for a highly useful method of memoization invented by Richard Bellman that is consonant with my suspicion.  Bellman, it turns out, had much the same thought as mine about the appeal of the word “dynamic”, and used it consciously to further his own ends:

“I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes.

“An interesting question is, ‘Where did the name, dynamic programming, come from?’ The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word, research. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term, research, in his presence. You can imagine how he felt, then, about the term, mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose? In the first place I was interested in planning, in decision making, in thinking. But planning, is not a good word for various reasons. I decided therefore to use the word, ‘programming.’ I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying—I thought, let’s kill two birds with one stone. Let’s take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that is it’s impossible to use the word, dynamic, in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities” (p. 159).

Hilarious, or what?  It explains a lot, I must say, and confirms a long-standing suspicion of mine about the persistent belief in a non-existent opposition.

Update: does anyone know why we say “memoization” rather than “memorization”?


Parallelism and Concurrency, Revisited

April 9, 2014

To my delight, I still get compliments on and criticisms of my post from three years ago (can it possibly be that long?) on parallelism and concurrency.  In that post I offered a “top down” argument to the effect that these are different abstractions with different goals: parallelism is about exploiting computational resources to maximize efficiency, concurrency is about non-deterministic composition of components in a system.  Parallelism never introduces bugs (the semantics is identical to the sequential execution), but concurrency could be said to be the mother lode of all bugs (the semantics of a component changes drastically, without careful provision, when composed concurrently with other components).  The two concepts just aren’t comparable, yet somehow the confusion between them persists.  (Not everyone agrees with me on this distinction, but neither have I seen a rigorous analysis that shows them to be the same concept.  Most complaints seem to be about my use of the words “parallelism” and “concurrency”, which is an unavoidable problem, or about my temerity in trying to define two somewhat ill-defined concepts, a criticism that I’ll just have to accept.)

I’ve recently gotten an inkling of why it might be that many people equate the two concepts (or see no point in distinguishing them).  This post is an attempt to clear up what I perceive to be a common misunderstanding that seems to explain it.  It’s hard for me to say whether it really is all that common of a misunderstanding, but it’s the impression I’ve gotten, so forgive me if I’m over-stressing an obvious point.  In any case I’m going to try for a “bottom up” explanation that might make more sense to some people.

The issue is scheduling.

The naive view of parallelism is that it’s just talk for concurrency, because all you do when you’re programming in parallel is fork off some threads, and then do something with their results when they’re done.  I’ve previously argued that this is the wrong way to think about parallelism (it’s really about cost), but let’s just let that pass.  It’s unarguably true that a parallel computation does consist of a bunch of, well, parallel computations.  So, the argument goes, it’s nothing but concurrency.  I’ve previously argued that that’s not a good way to think about concurrency either, but we’ll let that pass too.  So, the story goes, concurrency and parallelism are synonymous, and bullshitters like me are just trying to confuse people and make trouble.

Being the troublemaker that I am, my response is, predictably, no, just no.  Sure, it’s kinda sorta right, as I’ve already acknowledged, but not really, and here’s why: scheduling as you learned about it in OS class (for example) is an altogether different thing than scheduling for parallelism.  And this is the heart of the matter, from a “bottom-up” perspective.

There are two aspects of OS-like scheduling that I think are relevant here.  First, it is non-deterministic, and second, it is competitive.  Non-deterministic, because you have little or no control over what runs when or for how long.  A beast like the Linux scheduler is controlled by a zillion “voodoo parameters” (a turn of phrase borrowed from my queueing theory colleague, Mor Harchol-Balter), and who the hell knows what is going to happen to your poor threads once they’re in its clutches.  Second, and more importantly, an OS-like scheduler is allocating resources competitively.  You’ve got your threads, I’ve got my threads, and we both want ours to get run as soon as possible.  We’ll even pay for the privilege (priorities) if necessary.  The scheduler, and the queueing theory behind it (he says optimistically), is designed to optimize resource usage on a competitive basis, taking account of quality of service guarantees purchased by the participants.  It does not matter whether there is one processor or one thousand processors, the schedule is unpredictable.  That’s what makes concurrent programming hard: you have to program against all possible schedules.  And that’s why you can’t prove much about the time or space complexity of your program when it’s implemented concurrently.

Parallel scheduling is a whole ‘nother ball of wax.  It is (usually, but not necessarily) deterministic, so that you can prove bounds on its efficiency (Brent-type theorems, as I discussed in my previous post and in PFPL).  And, more importantly, it is cooperative in the sense that all threads are working together for the same computation towards the same ends.  The threads are scheduled so as to get the job (there’s only one) done as quickly and as efficiently as possible.  Deterministic schedulers for parallelism are the most common, because they are the easiest to analyze with respect to their time and space bounds.  Greedy schedulers, which guarantee to maximize use of available processors, never leaving any idle when there is work to be done, form an important class for which the simple form of Brent’s Theorem is obvious.
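To make the Brent-type bound concrete, here is a minimal Python sketch (mine, not taken from the earlier post or from PFPL) that simulates a greedy scheduler on a dependency graph of unit-cost tasks.  The function names, the graph encoding, and the example graph are all assumptions made for illustration.  The bound it checks is a common simple form of Brent’s Theorem: with work W (total tasks) and depth D (longest dependency chain), a greedy schedule on p processors takes at most W/p + D steps.

```python
from collections import defaultdict


def greedy_schedule(deps, p):
    """Simulate a greedy p-processor schedule of unit-cost tasks.

    deps maps each task to the list of tasks it depends on.
    Returns the number of time steps needed to run every task.
    """
    remaining = {t: set(ds) for t, ds in deps.items()}
    dependents = defaultdict(list)
    for t, ds in remaining.items():
        for d in ds:
            dependents[d].append(t)

    ready = sorted(t for t, ds in remaining.items() if not ds)
    steps = 0
    done = 0
    while ready:
        # Greedy: never leave a processor idle while a task is ready.
        batch, ready = ready[:p], ready[p:]
        steps += 1
        for t in batch:
            done += 1
            for u in dependents[t]:
                remaining[u].discard(t)
                if not remaining[u]:
                    ready.append(u)  # runnable from the next step on
    assert done == len(deps), "dependency graph has a cycle"
    return steps


def work_and_depth(deps):
    """Return (W, D): the number of tasks and the longest dependency chain."""
    depth = {}

    def d(t):
        if t not in depth:
            depth[t] = 1 + max((d(u) for u in deps[t]), default=0)
        return depth[t]

    return len(deps), max(d(t) for t in deps)


# A balanced reduction tree over 8 leaves, stored heap-style:
# node i depends on nodes 2i+1 and 2i+2; nodes 7..14 are the leaves.
tree = {i: ([2 * i + 1, 2 * i + 2] if i < 7 else []) for i in range(15)}
work, depth = work_and_depth(tree)
for p in (1, 2, 4, 8):
    print(p, greedy_schedule(tree, p), work / p + depth)
```

The simulation is deterministic: the same graph and the same p always give the same schedule, which is what makes the step count something one can reason about in advance.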

Many deterministic greedy scheduling algorithms are known, of which I will mention p-DFS and p-BFS, which do p-at-a-time depth- and breadth-first search of the dependency graph, and various forms of work-stealing schedulers, pioneered by Charles Leiserson at MIT.  (Incidentally, if you don’t already know what p-DFS or p-BFS are, I’ll warn you that they are a little trickier than they sound.  In particular p-DFS uses a data structure that is sort of like a stack but is not a stack.)  These differ significantly in their time bounds (for example, work stealing usually involves expectation over a random variable, whereas the depth- and breadth-first traversals do not), and differ dramatically in their space complexity.  For example, p-BFS is absolutely dreadful in its space complexity.  For a full discussion of these issues in parallel scheduling, I recommend Dan Spoonhower’s PhD Dissertation.  (His semantic profiling diagrams are amazingly beautiful and informative!)

So here’s the thing: when you’re programming in parallel, you don’t just throw some threads at some non-deterministic competitive scheduler.  Rather, you generate an implicit dependency graph that a cooperative scheduler uses to maximize efficiency, end-to-end.  At the high level you do an asymptotic cost analysis without considering platform parameters such as the number of processors or the nature of the interconnect.  At the low level the implementation has to validate that cost analysis by using clever techniques to ensure that, once the platform parameters are known, maximum use is made of the computational resources to get your job done for you as fast as possible.  Not only are there no bugs introduced by the mere fact of being scheduled in parallel, but even better, you can prove a theorem that tells you how fast your program is going to run on a real platform.  Now how cool is that?

[Update: word-smithing.]


Old neglected theorems are still theorems

March 20, 2014

I have very recently been thinking about the question of partiality vs totality in programming languages, a perennial topic in PL’s that every generation thinks it discovers for itself.  And this got me to remembering an old theorem that, it seems, hardly anyone knows ever existed in the first place.  What I like about the theorem is that it says something specific and technically accurate about the sizes of programs in total languages compared to those in partial languages.  The theorem provides some context for discussion that does not just amount to opinion or attitude (and attitude always seems to abound when this topic arises).

The advantage of a total programming language such as Goedel’s T is that it ensures, by type checking, that every program terminates, and that every function is total. There is simply no way to have a well-typed program that goes into an infinite loop. This may seem appealing, until one considers that the upper bound on the time to termination can be quite large, so large that some terminating programs might just as well diverge as far as we humans are concerned. But never mind that, let us grant that it is a virtue of  T that it precludes divergence.

Why, then, bother with a language such as PCF that does not rule out divergence? After all, infinite loops are invariably bugs, so why not rule them out by type checking? (Don’t be fooled by glib arguments about useful programs, such as operating systems, that “run forever”. After all, infinite streams are programmable in the language M of inductive and coinductive types in which all functions terminate. Computing infinitely does not mean running forever, it just means “for as long as one wishes, without bound.”)  The notion does seem appealing until one actually tries to write a program in a language such as T.

Consider computing the greatest common divisor (GCD) of two natural numbers. This can be easily programmed in PCF by solving the following equations using general recursion:

\begin{array}{rcl}    \textit{gcd}(m,0) & = & m \\    \textit{gcd}(0,m) & = & m \\    \textit{gcd}(m,n) & = & \textit{gcd}(m-n,n) \quad \text{if}\ m\geq n \\    \textit{gcd}(m,n) & = & \textit{gcd}(m,n-m) \quad \text{if}\ m<n    \end{array}

The type of \textit{gcd} defined in this manner has partial function type (\mathbb{N}\times \mathbb{N})\rightharpoonup \mathbb{N}, which suggests that it may not terminate for some inputs. But we may prove by induction on the sum of the pair of arguments that it is, in fact, a total function.
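For readers who prefer running code, here is a minimal sketch of the same definition by general recursion, written in Python rather than PCF purely for convenience.  The sketch takes the subtraction branch when m = n, so that every pair of inputs is covered; nothing in the language checks that the recursion terminates, which is exactly the situation of a partial function type.

```python
def gcd(m: int, n: int) -> int:
    """Subtractive GCD on natural numbers, by general recursion."""
    if n == 0:
        return m
    if m == 0:
        return n
    if m >= n:
        return gcd(m - n, n)
    return gcd(m, n - m)


print(gcd(12, 18))
```

The termination argument lives outside the code: each recursive call is on a pair with a strictly smaller sum, which is the induction mentioned above.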

Now consider programming this function in T. It is, in fact, programmable using only primitive recursion, but the code to do it is rather painful (try it!). One way to see the problem is that in T the only form of looping is one that reduces a natural number by one on each recursive call; it is not (directly) possible to make a recursive call on a smaller number other than the immediate predecessor. In fact one may code up more general patterns of terminating recursion using only primitive recursion as a primitive, but if you examine the details, you will see that doing so comes at a significant price in performance and program complexity. Program complexity can be mitigated by building libraries that codify standard patterns of reasoning whose cost of development should be amortized over all programs, not just one in particular. But there is still the problem of performance. Indeed, the encoding of more general forms of recursion into primitive recursion means that, deep within the encoding, there must be a “timer” that “goes down by ones” to ensure that the program terminates. The result will be that programs written with such libraries will not be nearly as fast as they ought to be.  (It is actually quite fun to derive “course of values” recursion from primitive recursion, and then to observe with horror what is actually going on, computationally, when using this derived notion.)
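The “timer” can be seen in a small Python sketch.  It is only an imitation of T: the names `primrec` and `gcd_t` are mine, a Python loop stands in for T’s recursor, and Python itself of course enforces none of T’s restrictions.  The idea is that the only recursion is on a fuel counter that goes down by ones, and the counter is chosen large enough (here m + n + 1, an assumption justified by the fact that each subtraction step lowers the sum of the arguments) that it never runs out.

```python
def primrec(n, base, step):
    """Primitive recursion on a natural number, in the style of T's recursor.

    primrec(0, base, step)     = base
    primrec(k + 1, base, step) = step(k, primrec(k, base, step))
    """
    acc = base
    for k in range(n):
        acc = step(k, acc)
    return acc


def gcd_t(m, n):
    """GCD using only primitive recursion on a fuel counter."""

    def out_of_fuel(pair):
        # Dummy answer; never reached when the fuel is m + n + 1.
        return 0

    def one_more(_k, rec):
        # With one more unit of fuel, take one step of the subtractive
        # algorithm and hand the rest to the function built with less fuel.
        def go(pair):
            a, b = pair
            if b == 0:
                return a
            if a == 0:
                return b
            return rec((a - b, b)) if a >= b else rec((a, b - a))

        return go

    fuel = m + n + 1
    return primrec(fuel, out_of_fuel, one_more)((m, n))


print(gcd_t(12, 18))
```

Notice that `primrec` builds the whole tower of m + n + 1 functions before any arithmetic is done, even when the answer needs only a step or two (take m = n, for instance).  That is the performance price in miniature.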

But, one may argue, T is simply not a serious language. A more serious total programming language would admit sophisticated patterns of control without performance penalty. Indeed, one could easily envision representing the natural numbers in binary, rather than unary, and allowing recursive calls to be made by halving to achieve logarithmic complexity. This is surely possible, as are numerous other such techniques. Could we not then have a practical language that rules out divergence?

We can, but at a cost.  One limitation of total programming languages is that they are not universal: you cannot write an interpreter for T within T (see Chapter 9 of PFPL for a proof).  More importantly, this limitation extends to any total language whatever.  If this limitation does not seem important, then consider the Blum Size Theorem (BST) (from 1967), which places a very different limitation on total languages.  Fix any total language, L, that permits writing functions on the natural numbers. Pick any blowup factor, say 2^{2^n}, or however expansive you wish to be.  The BST states that there is a total function on the natural numbers that is programmable in L, but whose shortest program in L is larger by the given blowup factor than its shortest program in PCF!

The underlying idea of the proof is that in a total language the proof of termination of a program must be baked into the code itself, whereas in a partial language the termination proof is an external verification condition left to the programmer. Roughly speaking, there are, and always will be, programs whose termination proof is rather complicated to express, if you fix in advance the means by which it may be proved total. (In T it was primitive recursion, but one can be more ambitious, yet still get caught by the BST.)  But if you leave room for ingenuity, then programs can be short, precisely because they do not have to embed the proof of their termination in their own running code.

There are ways around the BST, of course, and I am not saying otherwise.  For example, the BST merely guarantees the existence of a bad case, so one can always argue that such a case will never arise in practice.  Could be, but I did mention the GCD in T problem for a reason: there are natural problems that are difficult to express in a language such as T.  By fixing the possible termination arguments in advance, one is tempting fate, for there are many problems, such as the Collatz Conjecture, for which the termination proof of a very simple piece of code has been an open problem for decades, and has resisted at least some serious attempts on it.  One could argue that such a function is of no practical use.  I agree, but I point out the example not to say that it is useful, but to say that it is likely that its eventual termination proof will be quite nasty, and that this will have to be reflected in the program itself if you are limited to a T-like language (rendering it, once again, useless).  For another example, there is no inherent reason why termination need be assured by means similar to those used in T.  We got around this issue in NuPRL by separating the code from the proof, using a type theory based on a partial programming language, not a total one.  The proof of termination is still required for typing in the core theory (but not in the theory with “bar types” for embracing partiality).  But it’s not baked into the code itself, affecting its run-time; it is “off to the side”, large though it may be.
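The Collatz example mentioned above really is a very simple piece of code.  Here is a sketch in Python (the function name is mine); whether the loop terminates for every n ≥ 1 is precisely the open conjecture, so no termination argument is baked in, or could be at present.

```python
def collatz_steps(n: int) -> int:
    """Count the Collatz steps needed to reach 1 from n (requires n >= 1)."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(6))
```

In a partial language this is the whole program.  In a T-like language one would first have to find a terminating measure, and nobody knows one.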

Updates: word smithing, fixed bad link, corrected gcd, removed erroneous parenthetical reference to Coq, fixed LaTeX problems.


Homotopy type theory lectures and notes on-line

December 4, 2013

My activity on this blog has been reduced to nil recently because I have been spending my time preparing a series of lectures on homotopy type theory, starting from basic principles and ending with the application of univalence and higher inductive types to algebraic topology.  The course web page contains links to the video-taped lectures and to the course notes prepared by the students this semester.  These will be available indefinitely and are accessible to anyone interested in the course.  My hope is that these will provide a useful introduction to the HoTT Book, which is available for free on the web and may be printed on demand.  My intention is to write an introduction to dependent type theory for a CS audience, which will serve as a reference for much ongoing work in the area and as a prolegomenon to the HoTT Book itself.


There Is Such A Thing As A Declarative Language, and It’s The World’s Best DSL

July 22, 2013

In a recent post I asked whether there is any such thing as a declarative language. The main point was to argue that the standard “definitions” are, at best, not very precise, and to see whether anyone might offer a better definition. What I’m after is an explanation of why people seem to think that the phrase has meaning, even though they can’t say very clearly what they mean by it.  (One commenter analogized with “love” and “happiness”, but I would counter by saying that we’re trying to do science here, and we ought to be able to define our terms with some precision.)

As I mentioned, perhaps the best “definition” that is usually offered is to say that “declarative” is synonymous with “functional-and-logic-programming”.  This is pretty unsatisfactory, since it is not so easy to define these terms either, and because, contrary to conventional classifications, the two concepts have pretty much nothing in common with each other (but for one thing to be mentioned shortly). The propositions-as-types principle helps set them clearly apart: whereas functional programming is about executing proofs, logic programming is about the search for proofs. Functional programming is based on the dynamics of proof given by Gentzen’s inversion principle. Logic programming is based on the dynamics of provability given by cut elimination and focusing.  The two concepts of computation could not be further apart.

Yet they do have one thing in common that is usefully isolated as fundamental to what we mean by “declarative”, namely the concept of a variable.  Introduced by the ancient Hindu and Muslim mathematicians, Brahmagupta and al-Khwarizmi, the variable is one of the most remarkable achievements of the human intellect.  In my previous post I had secretly hoped that someone would propose variables as being central to what we mean by “declarative”, but no one did, at least not in the comments section.  My unstated motive for writing that post was not so much to argue that the term “declarative” is empty, but to test the hypothesis that few seem to have grasped the importance of this concept for designing a civilized, and broadly applicable, programming language.

My contention is that variables, properly so-called, are what distinguish “declarative” languages from “imperative” languages. Although the imperative languages, including all popular object-oriented languages, are based on a concept that is called a variable, they lack anything that actually is a variable.   And this is where the trouble begins, and the need for the problematic distinction arises.  The declarative concept of a variable is the mathematical concept of an unknown that is given meaning by substitution. The imperative concept of a variable, arising from low-level machine models, is instead given meaning by assignment (mutation), and, by a kind of a notational pun, allowed to appear in expressions in a way that resembles that of a proper variable.  But the concepts are so fundamentally different, that I argue in PFPL that the imperative concept be called an “assignable”, which is more descriptive, rather than “variable”, whose privileged status should be emphasized, not obscured.
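The difference can be sketched in a few lines of Python, chosen only for brevity.  The class and function names below are illustrative assumptions of mine, not the formulation in PFPL: a variable is given meaning by substituting an expression for it, whereas an assignable is a cell whose contents depend on what was last stored there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def subst(expr, name, replacement):
    """Replace every occurrence of the variable `name` in expr."""
    if isinstance(expr, Var):
        return replacement if expr.name == name else expr
    if isinstance(expr, Add):
        return Add(subst(expr.left, name, replacement),
                   subst(expr.right, name, replacement))
    return expr  # a Num contains no variables


def evaluate(expr):
    """Evaluate a closed expression (one with no variables left)."""
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left) + evaluate(expr.right)
    raise ValueError(f"unbound variable {expr.name}")


class Assignable:
    """A mutable cell: its contents are whatever was last assigned."""

    def __init__(self, contents):
        self.contents = contents

    def get(self):
        return self.contents

    def set(self, contents):
        self.contents = contents


# A variable: x + x means the same thing wherever it appears,
# and acquires a value only by substitution.
print(evaluate(subst(Add(Var("x"), Var("x")), "x", Num(3))))

# An assignable: two reads of the "same" cell need not agree.
a = Assignable(3)
first = a.get()
a.set(4)
print(first + a.get())
```

The expression language here has no binding constructs, so substitution does not need to worry about capture; a fuller treatment would.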

The problem with purely imperative programming languages is that they have only the concept of an assignable, and attempt to make it serve also as a concept of variable. The results are a miserable mess of semantic and practical complications. Decades of work have gone into rescuing us from the colossal mistake of identifying variables with assignables. And what is the outcome? If you want to reason about assignables, what you do is (a) write a mathematical formulation of your algorithm (using variables, of course) and (b) show that the imperative code simulates the functional behavior so specified.   Under this methodology the mathematical formulation is taken as self-evidently correct, the standard against which the imperative program is judged, and is not itself in need of further verification, whereas the imperative formulation is, invariably, in need of verification.

What an odd state of affairs!  The functional “specification” is itself a perfectly good, and apparently self-evidently correct, program.  So why not just write the functional (i.e., mathematical) formulation, and call it a day?  Why indeed!  Declarative languages, being grounded in the language of mathematics, allow for the identification of the “desired behavior” with the “executable code”.  Indeed, the propositions-as-types principle elevates this identification to a fundamental organizing principle: propositions are types, and proofs are programs.  Who needs verification?  Once you have a mathematical specification of the behavior of a queue, say, you already have a running program; there is no need to relegate it to a stepping stone towards writing an awkward, and invariably intricate, imperative formulation that then requires verification to ensure that it works properly.
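To illustrate the queue remark, here is a minimal sketch in Python (the names `EMPTY`, `enqueue`, and `dequeue` are mine).  It is simply the textbook specification of a queue as a finite sequence, written down directly; it makes no claim to efficiency, only to being a specification that already runs.

```python
EMPTY = ()


def enqueue(queue, x):
    """The queue with x added at the back; the original is unchanged."""
    return queue + (x,)


def dequeue(queue):
    """Return (oldest element, remaining queue); the queue must be non-empty."""
    if not queue:
        raise IndexError("dequeue from an empty queue")
    return queue[0], queue[1:]


q = enqueue(enqueue(enqueue(EMPTY, 1), 2), 3)
oldest, rest = dequeue(q)
print(oldest, rest, q)
```

Because nothing is ever mutated, the law “dequeue after enqueue on the empty queue returns the element just added” can be read straight off the two definitions.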

Functional programming languages are written in the universally applicable language of mathematics as expressed by the theory of types.  Such languages are therefore an integral part of science itself, inseparable from our efforts to understand and master the workings of the world.  Imperative programming has no role to play in this effort, and is, in my view, doomed in the long run to obsolescence, an artifact of engineering, rather than a fundamental discovery on a par with those of mathematics and science.

This brings me to my main point, the popular concept of a domain-specific language. Very much in vogue, DSL’s are offered as the solution to many of our programming woes. And yet, to borrow a phrase from my colleague Guy Blelloch, the elephant in the room is the question “what is a domain?”. I’ve yet to hear anyone explain how you decide what are the boundaries of a “domain-specific” language. Isn’t the “domain” mathematics and science itself? And does it not follow that the right language must be the language of mathematics and science? How can one rule out anything as being irrelevant to a “domain”?  I think it is impossible, or at any rate inadvisable, to make such restrictions a priori.  Indeed, full-spectrum functional languages are already the world’s best DSL’s, precisely because they are (or come closest to being) the very language of science, the ultimate “domain”.

