The Semantics of Evaluation & Continuations

Posted on July 12, 2020

Continuations are a criminally underappreciated language feature. Very few languages (off the top of my head: most Scheme dialects, some Standard ML implementations, and Ruby) support even the undelimited continuations introduced by call/cc, which are already very expressive, and fewer still implement the more expressive delimited continuations. Together with tail recursion, continuations can express all local and non-local control features in a first-class, functional way, and they are an integral part of efficient implementations of algebraic effects systems, both as language features and (most importantly) as libraries.

Continuations, however, are notoriously hard to understand. In an informal way, we can say that a continuation is a first-class representation of the “future” of a computation. By this I do not mean a future in the sense of an asynchronous computation, but in a temporal sense. While descriptions like this are “correct” in some sense, they’re also not useful. What does it mean to store the “future” of a computation in a value?

Operationally, we can model continuations as just a segment of control stack. This model is perhaps better suited for comprehension by programmers familiar with the implementation details of imperative programming languages, which is decidedly not my target audience. Regardless, this is how continuations (especially the delimited kind) are generally presented.

With no offense to the authors of these articles, some of whom I greatly respect as programming language designers and implementers, I do not think that this approach is very productive in a functional programming context. This reductionist, imperative view would almost be like explaining the concept of a proper tail call by using a trampoline, rather than presenting trampolines as one particular implementation strategy for the tail-call-ly challenged.

Fig 1. A facsimile of a diagram you’d find attached to an explanation of delimited continuations. Newer frames on top.

In a more functional context, then, I’d like to present the concept of a continuation as the reification of an evaluation context. I stress that this presentation is not novel, though it is, perhaps, uncommon outside of the academic literature on continuations. Reification, here, is the normal English word: a posh way of saying “to make into a thing”. For example, an (eval) procedure is a reification of the language implementation—it’s an interpreter made into a thing, a thing that looks, walks and quacks like a procedure.

The idea of evaluation contexts, however, seems to have remained stuck in the ivory towers that the mainstream so often accuse us of inhabiting.

Evaluation Contexts

What is an evaluation context? Unhelpfully, the answer depends on the language we’re talking about. Since the language family you’re most likely to encounter continuations in is Scheme, we’ll use a Scheme-like language (not corresponding to any of the RnRS standards). Our language has the standard set of things you’d find in a minimalist functional language used for an academic text: lambda expressions, written (lambda arg exp), that close over variables in lexical scope; applications, written (func arg); if-expressions, written (if condition then else), with the else expression being optional and defaulting to the false value #f; integers; integer operations; and, of course, variables.

As an abbreviation, one can write (lambda (a . args) body) to mean (lambda a (lambda args body)) (recursively), and similarly for applications (associating to the left), but the language itself only has unary application and currying. This is in deviation from actual Scheme implementations, which have complex parameter passing schemes, including variadic arguments (collecting any overflow in a list), keyword and optional arguments, etc. While all of these features are important for day-to-day programming, they would do nothing but cloud the presentation here with needless verbosity.
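To make the abbreviation concrete, here is (my own example, not part of the grammar below) how a two-argument function and its application desugar into the unary core:

```scheme
; sugar                            ; core language
(lambda (x y) y)              ; => (lambda x (lambda y y))
((f 1) 2)                     ;    only unary application exists,
(f 1 2)                       ; => ((f 1) 2), by left association
```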

e ::= (lambda arg expr)   ; function definitions
    | (if expr expr expr) ; if expression
    | (expr expr)         ; function applications
    | var                 ; variables
    | #t | #f             ; scheme programmers spell booleans funny
    | 1 | 2 | 3 | 4 ...   ; integers

The set of values in this miniScheme language includes lambda expressions, the booleans #t and #f, and the integers; every other expression can potentially take a step. Here, taking a step means applying a reduction rule in the language’s semantics, or finding a congruence rule that allows some sub-expression to undergo a reduction.

An example of a reduction rule is β-reduction, which happens when the function being applied is a λ-abstraction and its argument has been reduced to a value. The rule, which says that an application of a lambda to a value can be reduced to its body in one step, is generally written in the notation of sequent calculus as below.

Fancy reduction rule, typeset in TeX:

$$\frac{}{(\lambda x.e)\ v \longrightarrow e\{v/x\}}$$

However, let’s use the Lisp notation above and write reduction rules as if they were code. The notation I’m using here is meant to evoke Redex, a tool for defining programming language semantics, implemented as a Racket language. Redex is really neat and I highly recommend it to anyone interested in studying programming language semantics formally.

Reduction rules

(--> ((lambda x e) v)
     (subst x v e))
(--> (if #t e_1 e_2)
     e_1)
(--> (if #f e_1 e_2)
     e_2)

These rules, standard though they may be, have a serious problem: they only apply when the sub-expressions of interest are already fully evaluated. No rule matches when the condition of an if expression is a function application, or another conditional; the β-reduction rule, likewise, only applies when the argument has already been evaluated (the language is call-by-value, or strict). What can we do? Well, a simple and obvious solution is to specify congruence rules that let us reduce in places of interest.
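To see the problem concretely, consider this term (my own example): with only the three reduction rules above, nothing matches the outer application, because its argument is not yet a value.

```scheme
((lambda x x) ((lambda y y) 1))
; β-reduction does not apply to the whole term: the argument
; ((lambda y y) 1) is an application, not a value. We need a
; rule that licenses the step
;   ((lambda y y) 1) --> 1
; *inside* the larger term -- that is, a congruence rule.
```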

Congruence rules

[ (--> e_1 v)
--------------------------
  (--> (e_1 e_2) (v e_2))]

Evaluating in function position

[ (--> e_2 v)
--------------------------
  (--> (e_1 e_2) (e_1 v))]

Evaluating in argument position

[ (--> e_1 v)
-----------------------
  (--> (if e_1 e_2 e_3)
       (if v e_2 e_3))]

Evaluating in scrutinee position

Hopefully the problem is clear by now. If it isn’t, consider adding binary operators for the field operations: each of the four (addition, subtraction, multiplication, division) needs two congruence rules, one for reducing each argument, even though each has a single reduction rule. In general, an N-ary operator needs N congruence rules, one for each of its N operands, but only one reduction rule!

The solution to this problem comes in the form of evaluation contexts. We can define a grammar of “expressions with holes”, generally written as $\operatorname{E}[\cdot]$, where the $\cdot$ stands for an arbitrary expression. In code, we’ll denote the hole with <>, perhaps in evocation of the macros cut and cute from SRFI 26.¹

Our grammar of evaluation contexts, which we’ll call E in accordance with tradition, looks like this:

E ::= <>                  ; any expression can be evaluated
    | (E e)               ; evaluate the function
    | (v E)               ; evaluate the argument
    | (if E e e)          ; evaluate the condition

Now we can write all our congruence rules by appealing to a much simpler, and most importantly, singular, context rule, that says reduction is legal anywhere in an evaluation context.

[(--> e v)
----------------------------------
 (--> (in-hole E e) (in-hole E v))]

Redex uses the notation (in-hole E e) to mean an evaluation context E with e “plugging” the hole <>.
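As a small worked example (mine, simply applying the rules above), here is how the context rule drives evaluation of a nested term, one decomposition at a time:

```scheme
((lambda x x) (if #t 1 2))
; decompose: E = ((lambda x x) <>), redex = (if #t 1 2)
; the context is legal: (lambda x x) is a value, so this is (v E)
; --> ((lambda x x) 1)   ; if-#t rule, applied inside E
; decompose: E = <>, redex = ((lambda x x) 1)
; --> 1                  ; β-reduction
```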

What do evaluation contexts actually have to do with programming language implementation, however? Well, if you squint a bit, and maybe introduce some parameters, evaluation contexts look a lot like what you’d find attached to an operation in…

Continuation-passing Style

I’m not good at structuring blog posts, please bear with me.

Continuation-passing style, also known as CPS, is a popular intermediate representation for functional language compilers. The goal of CPS is to make evaluation order explicit, and to implement complex control operations like loops, early returns, exceptions, coroutines and more in terms of only lambda abstraction. To achieve this, the language is stratified into two kinds of expressions, “complex” and “atomic”.

Atomic expressions, or atoms, are not radioactive or explosive: in fact, they’re quite the opposite! The atoms of CPS are the values of our direct-style language. Forms like #t, #f, numbers and lambda expressions will not undergo any more evaluation, and thus may appear anywhere. Complex expressions are those that do cause evaluation, such as conditionals and procedure application.

To make sure that evaluation order is explicit, every complex expression has a continuation attached, which here boils down to a function which receives the return value of the expression. Procedures, instead of returning to a caller, tail-call their continuation.
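As a sketch in full Scheme (fact-k is a name I’m inventing for illustration), compare a direct-style factorial with a CPS counterpart, where every intermediate result is handed to a continuation instead of being returned:

```scheme
; direct style: results flow back up the call stack
(define (fact n)
  (if (= n 0) 1 (* n (fact (- n 1)))))

; CPS: the extra argument k receives the result; every call is a tail call
(define (fact-k n k)
  (if (= n 0)
      (k 1)                                ; "return" 1 to the continuation
      (fact-k (- n 1)
              (lambda (r) (k (* n r))))))  ; then multiply the result by n

(fact-k 5 (lambda (x) x)) ; evaluates to 120
```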

The grammar of our mini Scheme after it has gone through CPS transformation is as follows:

atom ::= (lambda (arg kont) expr) ; continuation argument
       | var                      ; these remain as they were.
       | #t | #f
       | 1 | 2 | 3 | 4 ...
expr ::= (atom atom atom)    ; function, argument, continuation
       | (if atom expr expr) ; conditional, then_c, else_c
       | atom                ; atoms are also valid expressions

Note that function application now has three components, but all of them are atoms. Valid expressions include things like (f x halt), which means “apply f to x such that it returns to halt”, but not (f (g x y) (h y z)), which has an ambiguous reduction order. Instead, we must use a λ-abstraction to give a name to the result of each intermediate computation.

For example, the (surface) language application (e_1 e_2), where both are complex expressions, has to be rewritten as one of the following expressions, which correspond respectively to evaluating the function first or the argument first.

(e_1 (lambda r_1
       (e_2 (lambda r_2
              (r_1 r_2 k)))))

Evaluating the function first

(e_2 (lambda r_2
       (e_1 (lambda r_1
              (r_1 r_2 k)))))

Evaluating the argument first

If you have, at any point while reading the previous 2 or so paragraphs, squinted, then you already know where I’m going with this. If not, do it now.

The continuation of an expression corresponds to its evaluation context. The vs in our discussion of semantics are the atoms of CPS, and, most importantly, contexts E get closed over with a lambda expression (lambda x E[x]), replacing the hole <> with a bound variable x. Evaluating an expression (f x) in a context E, say (<> v), corresponds to (f x (lambda r (r v))).

If a language implementation uses the same representation for both user procedures and continuations—which is very inefficient, let me stress—then we get first-class control for “free”. First-class in the sense that control operations, like return, can be stored in variables, or lists, passed as arguments to procedures, etc. The fundamental first-class control operator for undelimited continuations is called call-with-current-continuation, generally abbreviated to call/cc.

(define call/cc (lambda (f cc) (f cc cc)))
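As a quick illustration of that first-class power (product is a name I’m making up here, and this is surface Scheme rather than our mini language), call/cc gives us early exit without any dedicated return statement:

```scheme
; Multiply a list of numbers, bailing out immediately on a zero:
; invoking `return` discards all the pending multiplications.
(define (product xs)
  (call/cc
    (lambda (return)
      (let loop ((xs xs))
        (cond ((null? xs) 1)
              ((zero? (car xs)) (return 0)) ; non-local exit
              (else (* (car xs) (loop (cdr xs)))))))))

(product '(1 2 3)) ; 6
(product '(1 0 3)) ; 0, without performing any multiplication
```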

Using call/cc and a mutable cell holding a list, we can implement cooperative threading. The (yield) operator has the effect of capturing the current continuation and adding it to the end of the list, then dequeueing a potentially different saved continuation from the list and jumping there instead.

(define threads '())
; Jump to the next thread (read: continuation) or exit the program
; if there are no more threads to schedule
(define exit
  (let ((exit exit))
    (lambda ()
      (if (null? threads)              ; are we out of threads to switch to?
        (exit)                         ; if so,  exit the program
        (let ((thr (car threads)))     ; select the first thread
          (set! threads (cdr threads)) ; dequeue it
          (thr))))))                 ; jump there
; Add a function to the list of threads. After finishing its work,
; the function needs to (exit) so another thread can take over.
(define (fork f)
  (set! threads (append threads
                        (list (lambda () (f) (exit))))))
; Capture the current continuation, enqueue it, and switch to
; a different thread of execution.
(define (yield)
  (call/cc (lambda (cc)
              (set! threads (append threads (list cc)))
              (exit))))

That’s a cooperative threading implementation in 25 lines of Scheme! If whichever implementation you are using has a performant call/cc, this corresponds roughly to the normal stack switching that a cooperative threading implementation has to do.
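A quick (hypothetical) driver for the scheduler above: each thread prints, yields to the other, then prints again, giving a round-robin interleaving.

```scheme
(fork (lambda () (display "a1 ") (yield) (display "a2 ")))
(fork (lambda () (display "b1 ") (yield) (display "b2 ")))
(exit) ; jump into the first thread; should print: a1 b1 a2 b2
```

The final (exit) is what hands control from the main program to the thread queue; when the queue empties, the saved original exit terminates the program.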

That last paragraph is a bit of a weasel, though. What does a “performant call/cc” look like? Well, call/cc continuations have abortive behaviour², which means that, when invoked, they replace the current thread of control, instead of prepending a segment of stack—the latter behaviour is known as “functional continuations”. That is, call/cc is basically a spiffed-up longjmp which, in addition to saving the state of the registers, copies the call stack along with it.

However, call/cc is a bit overkill for applications such as threads, and even then, it’s not the most powerful control abstraction. For one, call/cc always copies the entire continuation, with no way to, ahem, delimit it. Because of this, abstractions built on call-with-current-continuation do not compose.

We can fix all of these problems, ironically enough, by adding more power. We introduce a pair of operators, prompt and control, which…

Delimit your Continuations

This shtick again?

Delimited continuations are one of those rare ideas that happen once in a lifetime and revolutionise a field—maybe I’m exaggerating a bit. They, unfortunately, have not seen very widespread adoption. But they do have all the characteristics of one of those revolutionary ideas: they’re simple to explain, simple to implement, and very powerful.

The idea is obvious from the name, so much so that it feels insulting to repeat it: instead of capturing the continuation, have a marker that delimits what’s going to be captured. What does this look like in our reduction semantics?

The syntax of evaluation contexts does not change, only the operations. We gain operations (prompt e) and (control k e), with the idea being that when (control k e) is invoked inside of a (prompt) form, the evaluation context from the control to the nearest outermost prompt is reified as a function and bound to k.

[-------------------
 (--> (prompt v) v)]

Delimiting a value does nothing

[----------------------------------------------------------
 (--> (prompt (in-hole E (control k body)))
      (prompt ((lambda k body) (lambda x (in-hole E x)))))]

Capture the continuation, bind it to the variable k, and keep going.
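A worked example (mine, applying the rule above, and assuming we extend the mini language with addition): the captured continuation k stands for the context (+ 1 <>).

```scheme
(prompt (+ 1 (control k (k (k 2)))))
; E = (+ 1 <>), so the reified context is (lambda x (+ 1 x))
; --> (prompt ((lambda k (k (k 2))) (lambda x (+ 1 x))))
; --> (prompt ((lambda x (+ 1 x)) ((lambda x (+ 1 x)) 2)))
; --> (prompt ((lambda x (+ 1 x)) (+ 1 2)))
; --> (prompt (+ 1 3))
; --> (prompt 4)
; --> 4
```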

By not adding (prompt E) to the grammar of evaluation contexts, we ensure that E is devoid of any prompts by construction. This captures the intended semantics of “innermost enclosing prompt”—if E were modified to include prompts, the second rule would instead capture to the outermost enclosing prompt, and we’re back to undelimited call/cc.

Note that prompt and control are not the only pair of delimited control operators! There’s also shift and reset (and prompt0/control0). reset is basically the same thing as prompt, but shift is different from control in that the captured continuation has the prompt—uh, the delimiter—reinstated, so that it cannot “escape”.

[-----------------------------------------------------------------
 (--> (reset (in-hole E (shift k body)))
      (reset ((lambda k body) (lambda x (reset (in-hole E x))))))]

Reinstate the prompt when the captured continuation is applied.
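Another trace of my own, this time showing the delimiting in action: the shift below discards its continuation, erasing the context up to the reset but leaving everything outside it untouched (again assuming integer addition).

```scheme
(+ 1 (reset (+ 2 (shift k 3))))
; inside the reset, E = (+ 2 <>); the body of the shift
; never invokes k, so that context is simply thrown away:
; --> (+ 1 (reset ((lambda k 3) (lambda x (reset (+ 2 x))))))
; --> (+ 1 (reset 3))
; --> (+ 1 3)
; --> 4
; the outer (+ 1 <>) survives: the capture stopped at the reset.
```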

Yet another pair, which I personally prefer, is what you’d find in Guile’s (ice-9 control), namely (call-with-prompt tag thunk handler) and (abort-to-prompt tag value). These are significantly more complex than bare shift and reset since they implement multi-prompt delimited continuations. They’re more like exception handlers than anything, with the added power that your “exception handler” could restart the code after you throw.

(define-syntax reset
  (syntax-rules ()
    ((reset . body)
     (call-with-prompt (default-prompt-tag)
                       (lambda () . body)
                       (lambda (cont f) (f cont))))))
(define-syntax shift
  (syntax-rules ()
    ((shift k . body)
     (abort-to-prompt (default-prompt-tag)
                      (lambda (cont)
                        ((lambda (k) (reset . body))
                         (lambda vals (reset (apply cont vals)))))))))

call-with-prompt and abort-to-prompt subsume shift and reset.
Taken from the Guile Scheme implementation.

The operators call-with-prompt and abort-to-prompt are very convenient for the implementation of many control structures, like generators:

(define-syntax for/generator
  (syntax-rules ()
    ((_ name gen . body)
     (begin
       (define (work cont)
         (call-with-prompt
           'generator-tag
           cont
           (lambda (cont name) (begin . body)
                               (work cont))))
       (work gen)))))
(define (yield x) (abort-to-prompt 'generator-tag x))
(for/generator x (lambda ()
                   (yield 1)
                   (yield 2)
                   (yield 3))
    (display x) (display #\newline))

Exception handlers and threading in terms of shift and reset are left as an exercise to the reader.

But… why?

Control abstractions appeal to our—or at least my—sense of beauty. Being able to implement control flow operations as part of the language is often touted as one of the superpowers that Haskell gets from its laziness and purity, and while that certainly is true, control operators let us model many, many more control flow abstractions, namely those involving non-local exits and entries.

Of course, all of these can be implemented in the language directly—JavaScript, for example, has async/await, stackless generators, and exceptions. However, this is not an advantage. These significantly complicate the implementation of the language (as opposed to having a single pair of operators that’s not much more complicated to implement than regular exception handlers) while also significantly diminishing its expressive power! For example, using our definition of generators above, the Scheme snippet below does what you expect, but its JavaScript counterpart only yields 20.

((lambda ()
    ((lambda ()
        (yield 10)))
    (yield 20)))
(function*() {
  (function*() {
    yield 10;
  })
  yield 20;
})();

Delimited continuations can also be used to power the implementation of algebraic effects systems, such as those present in the language Koka, with much lower overhead (both in terms of code size and speed) than the type-driven local CPS transformation that Koka presently uses.

Language implementations which provide delimited control operators can also be extended with effect system support post-hoc, in a library, with an example being the Eff library for Haskell and the associated GHC proposal to add delimited control operators (prompt and control).


  1. In reality it’s because I’m lazy, and typesetting so that it works properly both with KaTeX and with scripting disabled takes far too many keystrokes. Typing some code is easier.↩︎

  2. People have pointed out to me that “abortive continuation” is a bit oxymoronic, but I guess it’s been brought to you by the same folks who brought you “clopen set”.↩︎