Salvador Dali’s masterpiece, The Persistence of Memory, is one of the most recognizable works of art in the world. The critic Dawn Ades described it as “a Surrealist meditation on the collapse of our notions of a fixed cosmic order” induced by Einstein’s penetrating analysis of the concepts of time and space in the physical world. Just as Dali’s Persistence of Memory demarcated the transition from age-old conceptions of time and space in physics, so does the computational concept of persistence of memory mark the transition from sequential time and mutable space to parallel time and immutable space.
A short while ago I described the distinction between parallelism and concurrency in terms of a cost model that assigns a parallel, as well as sequential, time complexity to a program. Parallelism is all about efficiency, not semantics; the meaning of a program is not affected by whether it is executed on one processor or many. Functional languages expose parallelism by limiting sequential dependencies among the parts of a computation; imperative languages introduce inessential dependencies that impede parallelism.
Another critical ingredient for parallelism is the concept of a persistent data structure, one for which its associated operations transform, rather than alter, it. A persistent dictionary, for example, has the characteristic that inserting an element results in a new dictionary with an expanded domain; the old dictionary persists, and is still available for further transformation. When calculating the sum of 2 and 2, resulting in 4, no one imagines that the 2′s are “used up” in the process! Nor does one worry whether the sum of 1 and 3 is the “same” 4 or a “different” 4! The very question is absurd (or, more precisely, trivial). So why do we worry about whether the result of inserting 2 into dict “uses up” the old dict? And why do we worry about whether inserting 2 into the empty dictionary twice results in the “same” dictionary or a “different” one?
Yet both academic and practical computing has all but confined itself to ephemeral data structures, which exhibit precisely such behavior. Operations on an ephemeral data structure “use up” the data structure, making it unavailable for further use without going to some trouble to get it back. The pathologies resulting from this abound. Standard compiler texts, for example, devote a chapter to concepts like “block structured symbol tables”, which are, in fact, nothing more than persistent dictionaries done the hard way. More generally, whenever a data structure has multiple futures, such as when backtracking or exploiting parallelism, ephemeral data structures get in the way. Indeed, the bulk of object-oriented programming, with its absurd over-emphasis on the “message passing” metaphor, stress the alteration of objects as the central organizing principle, confounding parallelism and complicating simple algorithms.
A prime virtue of functional languages is that persistence is the default case, but they can as readily support ephemeral data structures as any imperative (including object-oriented) language. All functional languages include types of mutable cells and mutable arrays, and provide support for conventional, sequential, imperative programming with semicolons and even curly braces! (Some do this better than others; Haskell is, in my view, the world’s best imperative programming language, and second-best functional language, but that’s a subject for another post.) But why would you want to? Why deprive yourself of the benefits of persistence, and insist instead on an ephemeral data structure?
This question came up recently in our planning for the Functional Programming class that we are teaching this semester for freshmen at Carnegie Mellon. All semester we have been using functional programming techniques to build clean, verifiable, modular, parallel programs. The students routinely prove theorems about their code, structure programs using abstract types, and exploit parallelism to improve the asymptotic performance of their programs. Recent homework assignments include the implementation of the parallel Barnes-Hut algorithm for the n-body problem in physics, and the parallel Jamboree algorithm for game-tree search in perfect information games. Persistent data structures are the key to making this possible; just try to code Barnes-Hut in an imperative language, and you will find yourself in a morass worrying about concurrency when you should instead be envisioning a recursive tree decomposition of space, and the computation of forces using formulas from high school physics.
We tried hard to find good motivations for using an ephemeral data structure when you can just as easily (actually, much more easily) use a persistent one. As we went through them, we realized that all of the standard arguments are questionable or false. The usual one is some vague notion of “efficiency” in either time or space. While I concede that one can, in principle, solve a particular, confined problem more efficiently by doing absolutely everything by hand (memory management, scheduling, arithmetic), in the overwhelming majority of cases the demands of evolution of code far outweigh the potential advantages of doing everything by hand. Modularity matters most when it comes to building and evolving large systems; functional languages, with persistent data structures, support modularity the best. (I’ll have more to say about this in a future post.)
Most of the arguments about efficiency, though, ignore questions of functionality. It is senseless to compare the “efficiency” of one data structure that provides different functionality than another. A persistent data structure does more for you than does an ephemeral one. It allows you to have multiple futures, including those that evolve in parallel with one another. It makes no sense to insist that some ephemeral approximation of such a data structure is “more efficient” if it does not provide those capabilities! And making it do so is, invariably, a bitch! Conventional ephemeral data structures are not readily parallelizable; it’s often a publishable result to get a decent degree of parallelism using imperative methods. By contrast, even freshmen (admittedly, Carnegie Mellon freshmen) can implement a parallel game tree search or a tree-based solution to the n-body problem in a one-week homework assignment.
So what’s the deal? Why should we care about ephemeral data structures at all? I have no idea. Mostly, it’s a legacy cost imposed on us by the overzealous emphasis on imperative methods and machine-based, rather than language-based, models of computation. But this will change. Starting this fall, introductory data structures and algorithms will be liberated from the limitations of imperative, object-oriented programming, and will instead stress persistent (as well as ephemeral) data structures, and parallel (including as a special case sequential) algorithms. The future of computing depends on parallelism (for efficiency), distribution (for scale), and verification (for quality). Only functional languages support all three naturally and conveniently; the old ones just don’t cut it.
Update: Here’s a chart summarizing the situation as I see it:
Conventional imperative programming works well for the ephemeral, sequential case; it is notoriously hard to use imperative methods for parallelism. Benign effects, as exemplified by self-adjusting data structures, can be used to give rise to persistent behavior in the sequential setting, but the use of effects impedes parallelism.