## Sequentiality as the Essence of Parallelism

I recently thought of a nice way to structure a language for parallel programming around the concept of sequential composition.  Think of parallelism as the default—evaluate everything in parallel unless the semantics of the situation precludes it.  Sums are posterior to summands, but the summands can be evaluated simultaneously.  You need a way to express the necessary dependencies without introducing any spurious ones.

There’s a tool for that, called lax logic, introduced by Fairtlough and Mendler and elaborated by Davies and Pfenning, which I use extensively in PFPL.  The imperative language Modernized Algol is formulated in the lax style, distinguishing two modes, or levels, of syntax, the (pure) expressions and the (impure) commands.  The lax modality, which links the two layers, behaves roughly like a monad, but, all the hype notwithstanding, it is not the central player.  It’s the modes, not the modality, that matter.  (See the Commentary on PFPL for more.)

The lax modality is just the ticket for expressing parallelism.  Rather than separate expressions from commands, here we distinguish between values and computations.  The names are important, to avoid possible confusion.  Values are fully evaluated; they are not a source of parallelism.  (If they were called “pure”, it would be irresistible to think otherwise.)  Computations have yet to be evaluated; they engender parallelism by sequential composition.  What?  No, you didn’t nod off! Let me explain.

Parallelism is all about the join points.  If parallel execution is the default, then the job of the programmer is not to induce parallelism, but to harness it.  And you do that by saying, “this computation depends on these others.”  Absent that, there is nothing else to say, just go for it.  No sub-languages.  No program analysis.  No escaping the monad.  Just express the necessary dependencies, and you’re good to go.

So, what are the join points?  They are the elimination forms for two parallel modalities.  They generalize the sequential case to allow for statically and dynamically determined parallelism.   A value of parallel product type is a tuple of unevaluated computations, a kind of “lazy” tuple (but not that kind of laziness!).  The elimination form evaluates all of the component computations in parallel, creates a value tuple from their values, and passes it to the body of the form.  A value of parallel sequence type is a generator consisting of two values, a natural number n indicating its size, and a function determining the ith component computation for each 1≤i<n.  The elimination form activates all n component computations, binds their values to a value sequence, and passes it to the body of the form.

The join point effects a change of type, from encapsulated computations to evaluated values, neatly generalizing sequential composition from a unary to a binary join.  If you’d like, the parallel products and parallel sequences are “generalized monads” that encapsulate not just one, but many, unevaluated computations.  But they are no more monads than they are in any other functional language: the equational laws need not hold in the presence of, say, divergence, or exceptions.

The dynamics assigns costs to computations, not to values, whose cost of creation has already been paid.  The computation that just returns a value has unit work and span.  Primitive operations take unit work and span.  The sequential composition of a parallel product with n components induces span one more than the maximum span of the constituents, and induces work one more than the sum of their work.  The dynamics of sequential composition for parallel sequences is similar, with the “arity” being determined dynamically rather than statically.

Programming in this style means making the join points explicit.  If you don’t like that, you can easily define derived forms—and derived costs—for constructs that do it for you.    For example, a pair of computations might be rendered as activating a parallel pair of its components, then returning the resulting value pair.  And so on and so forth.  It’s no big deal.

En passant this solves a nasty technical problem in a substitution-based cost semantics that does not make the modal distinction.  The issue is, how to distinguish between the creation of a value, and the many re-uses of it arising from substitution?  It’s not correct to charge again and again for the value each time you see it (this cost can be asymptotically significant), but you have to charge for creating it (it’s not free, and it can matter).  And, anyway, how is one to account for the cost of assessing whether an expression is, in fact, a value?  The usual move is to use an environment semantics to manage sharing.  But you don’t have to, the modal framework solves the problem, by distinguishing between a value per se; the computation that returns it fully created; and the computation that incrementally construcs it from its constituent parts.  It’s the old cons-vs-dotted pair issue, neatly resolved.

Please see Section 10 of the Commentary on PFPL for a fuller account.  The main idea is to generalize a type of single unevaluated computations, which arises in lax logic, to types of statically- and dynamically many unevaluated computations.  The bind operation becomes a join operation for these computations, turning a “lazy” tuple or sequence into eager tuples or sequences.  (This is not laziness in the sense of lazy evaluation though!)

Updates: word-smithing, added cite to Davies-Pfenning, replaced cite of course notes with reference to commentary.