The reasonable effectiveness of the continuation monad
There are common monads associated with common effects: Maybe
for failure,
[]
(list) for nondeterminism, State
for state…
What about the continuation
monad?
We shall see why the answer is all of the above, but better.
Indeed, many effects can be understood and implemented in a simple and uniform
fashion in terms of first-class continuations.
Extensions and imports for this Literate Haskell file
{-# LANGUAGE
InstanceSigs,
RankNTypes #-}
module Continuations where
import Control.Applicative ((<|>))
import Control.Monad (replicateM, when)
import Data.Foldable (for_)
A way too short introduction to continuation-passing style
The key insight behind continuations is that producing a result in a function is equivalent to calling another function which does the rest of the computation with that result.
In this small starting example, we apply some function timesThree
, and
compare the result to 10.
We will transform this code in continuation-passing style.
example1 :: Int -> Bool
example1 x = 10 < timesThree x where
timesThree :: Int -> Int
timesThree x = 3 * x
As our first step, following the train of thought above, instead of taking the
result of timesThree
and doing something (10 < _
) with it, let timesThree
do that operation directly.
example2 :: Int -> Bool
example2 x = timesThree x where
timesThree :: Int -> Bool
timesThree x = 10 < 3 * x
Of course, that’s not much of a timesThree
function anymore.
Moreover, we know how 3 * x
is going to be used in this case,
but that’s quite counter to modularity.
Let us generalize timesThree
: instead of hard-coding 10 < _
, we
parameterize timesThree
by the context in which the result 3 * x
will be
used. That context is called the continuation k
.
example3 :: Int -> Bool
example3 x = timesThree x (\ y -> 10 < y) where
timesThree :: Int -> (Int -> Bool) -> Bool
timesThree x k = k (3 * x)
Furthermore, the result of the continuation doesn’t have to be of type Bool
;
we can generalize the type of timesThree
further to also be parameterized by
the result type r
of the continuation.
In the main body where we apply timesThree
, r
is specialized to the type of
the final result, which is Bool
.
example4 :: Int -> Bool
example4 x = timesThree x (\ y -> 10 < y) where
timesThree :: Int -> (Int -> r) -> r
timesThree x k = k (3 * x)
That was continuation-passing style (CPS) in a nutshell.
Functions written in CPS can be composed as follows.
Let us refactor the comparison 10 < _
into another CPS function
greaterThanTen
.
Once the program is entirely written in CPS, the identity function
(here \ z -> z
) is commonly used as the last continuation,
which receives the final result.
example5 :: Int -> Bool
example5 x =
timesThree x (\ y ->
greaterThanTen y (\ z ->
z))
where
timesThree :: Int -> (Int -> r) -> r
timesThree x k = k (3 * x)
greaterThanTen :: Int -> (Bool -> r) -> r
greaterThanTen y k = k (10 < y)
Hey, this example looks a lot like do
notation…
Indeed, note how we changed the result type of timesThree
from Int
to
(Int -> r) -> r
; that mapping between types (_ -> r) -> r
defines a monad.
The Cont
monad
(The descriptions in this section are principally meant to provide context if
you’ve never seen the implementation of Cont
before, but they may be quite
dense. It’s not necessary to follow every single detail to catch the rest, so
skipping forward is an option.)
A function of type ((a -> r) -> r)
takes a continuation (a -> r)
and is expected to produce a result r
. The obvious way to do that is to apply
the continuation to a value a
, which is exactly the idea behind continuations
given at the beginning. In fact that is also what it means to “return” a
value in this monad (pureCont
below; the instances are collapsed at the end
of this section). As we will soon see, the power of the continuation monad
hides in the myriad other ways of using that continuation.
newtype Cont r a = Cont ((a -> r) -> r)
-- Eliminate Cont
runCont :: Cont r a -> (a -> r) -> r
runCont (Cont m) = m
-- Use the identity continuation to extract the final result.
evalCont :: Cont a a -> a
evalCont (Cont m) = m id
pureCont :: a -> Cont r a
pureCont a = Cont (\ k -> k a)
The bind (>>=)
of the monad captures the pattern in example5
above
to compose two CPS functions. We start with a continuation (k :: b -> r)
for the whole computation (Cont r b
). We first apply ma
,
with a continuation which takes the result a
of ma
, and passes it to mc
,
which in turn produces a b
that is just what k
wants.
bindCont :: Cont r a -> (a -> Cont r b) -> Cont r b
bindCont (Cont ma) mc_ =
Cont (\ k ->
ma (\ a ->
mc a (\ b ->
k b)))
where
mc = runCont . mc_
The instances of Functor
, Applicative
, Monad
for Cont
.
instance Functor (Cont r) where
fmap :: (a -> b) -> Cont r a -> Cont r b
fmap f (Cont m) = Cont (\ k -> m (k . f))
instance Applicative (Cont r) where
pure :: a -> Cont r a
pure = pureCont
<*>) :: Cont r (a -> b) -> Cont r a -> Cont r b
(Cont mf <*> Cont ma = Cont (\ k ->
mf (\ f ->
ma (\ a ->
k (f a))))
instance Monad (Cont r) where
>>=) :: Cont r a -> (a -> Cont r b) -> Cont r b
(>>=) = bindCont (
We can thus rewrite the example using do
-notation for Cont
.
example6 :: Int -> Bool
example6 x = evalCont $ do
y <- timesThree x
z <- greaterThanTen y
pure z
where
timesThree :: Int -> Cont r Int
timesThree x = Cont (\ k -> k (3 * x))
greaterThanTen :: Int -> Cont r Bool
greaterThanTen y = Cont (\ k -> k (10 < y))
Continuation transformers
Here is another way to look at monadic composition of Cont
.
If we unfold the definition of Cont
, a continuation in the continuation monad,
a -> Cont r b
, is really a function mapping continuations to
continuations, we shall call that a continuation transformer:
(b -> r) -> (a -> r)
.
They map “future” continuations to “present” continuations.1
This suggests to take a look at the fish operator, which composes monadic continuations.
>=>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
(>=>) :: (a -> Cont r b) -> (b -> Cont r c) -> (a -> Cont r c) (
Looking at the type of (>=>)
:
- unfold the definition of
Cont r b
to(b -> r) -> r
, - swap the arguments of each function
a -> (b -> r) -> r
to(b -> r) -> (a -> r)
.
The result shows that sequencing in the Cont
monad (with (>=>)
) is
basically function composition. The function f >=> g :: a -> Cont r c
takes a continuation c -> r
, passes it to the function g
to produce
a continuation b -> r
, which goes into f
to produce a continuation a -> r
(note that continuations do flow from right to left in f >=> g
).
>=>) ::
(a -> Cont r b) -> (b -> Cont r c) -> (a -> Cont r c)
({- 1 -} (a -> (b -> r) -> r) -> (b -> (c -> r) -> r) -> (a -> (c -> r) -> r)
{- 2 -} ((b -> r) -> a -> r) -> ((c -> r) -> b -> r) -> ((c -> r) -> a -> r)
y -> x ) -> (z -> y ) -> (z -> x )
(
.) ::
(y -> x) -> (z -> y) -> (z -> x) (
Many monads in one
In spite of (or thanks to) its simplicity, the Cont
monad is quite versatile.
Many kinds of effects can be represented in Cont
, all with only the one
Monad
instance given above, that knows nothing about effects.
In contrast with free monads, which are just waiting to be interpreted,
we can define an effect directly by its operations in Cont
.
The main idea is to consider what operations we allow on continuations.
Here we describe various restrictions through the result type r
in
(a -> r) -> r
, but there may be other ways.
Identity
Our starting point was that producing a result is equivalent to calling the
continuation.
If we add the constraint that the result type r
is abstract, so that there
are no operations possible on it, then calling the continuation with some
argument a
is the only option, i.e., we must produce a result a
, nothing
else. In that case, the continuation monad is isomorphic to the identity monad.
To express the restriction that r
is abstract, we can use the forall
quantifier. If no operations are possible on r
, then r
could actually be
any type. So Cont
computations defined under that restriction are
polymorphic in r
. We name Done
the resulting “specialization” of Cont
(if
we may call it one), with an isomorphism given by pure :: a -> Done a
from the Applicative
instance above, and runDone :: Done a -> a
below.
type Done a = forall r. Cont r a
-- forall r. (a -> r) -> r
runDone :: Done a -> a
runDone (Cont m) = m id
Maybe
The next interesting case to consider is that the continuation may be dropped.
But, (a -> r) -> r
must still somehow produce a result of type r
.
We thus replace r
with Maybe r
, so that a computation can
produce Nothing
instead of calling the continuation (abort
below).
As you might expect, the result is a monad which models computations that can
exit early with no output, i.e., a variant of the Maybe
monad,
Although the type Maybe
appears in this definition, the fact that
it is a monad is not used anywhere. In fact, whereas monadic composition
(>>=)
for Maybe
is defined with pattern-matching, there is not a single
case
in the operations for Abortable
defined here. Each constructor
is only used once, Nothing
when aborting the computation, Just
as the final continuation to indicate success. So Abortable
does not
just imitate Maybe
, it is even more efficient!
type Abortable a = forall r. Cont (Maybe r) a
-- forall r. (a -> Maybe r) -> Maybe r
abort :: Abortable x
abort = Cont (\ _k -> Nothing)
runAbortable :: Abortable a -> Maybe a
runAbortable (Cont m) = m Just
Contrary to Done
and Identity
,
Abortable
is not isomorphic to Maybe
.
Whereas a Maybe
computation must decide to be Just
or Nothing
on the spot, an Abortable
is a function (a -> Maybe r) -> Maybe r
,
which may inspect the continuation before making a decision,
even though “intuitively” it’s not supposed to.
Thus we can construct a computation secondGuess
which is expected to return a
Bool
, calls the continuation with True
(like pure True
would) but
backtracks to False
if that fails.
secondGuess :: Abortable Bool
secondGuess = Cont (\k -> k True <|> k False)
pureTrue :: Abortable Bool
pureTrue = pure True
runAbortable
maps both secondGuess
and pureTrue
to Just True
,
but they behave differently with a continuation which fails on True
and
succeeds on False
.
Nevertheless, it is not possible to construct examples such as secondGuess
with only the monad operations and abort
;
you have to break the Abortable
abstraction.
In that sense, Abortable
is still a practical alternative to the Maybe
monad.
Either
Naturally, a slight variant of “exit early” is to “exit early with an explicit
error”, obtained by replacing Maybe
with Either e
.
type Except e a = forall r. Cont (Either e r) a
-- forall r. (a -> Either e r) -> Either e r
throw :: e -> Except e a
throw e = Cont (\ _k -> Left e)
runExcept :: Except e a -> Either e a
runExcept (Cont m) = m Right
State
What if the continuation takes an extra parameter, with result type s -> r
?
Then we may want to call it with different parameters, resulting in a notion of
stateful computation.
Remember that the result type s -> r
is both the result type of the
continuation, and of the whole computation ((a -> s -> r) -> s -> r
).
The whole computation can just call the continuation (with some value a
) to
produce a result s -> r
, or it can first take the parameter s
, and
obtain an r
by calling the continuation with a different state.
Thus get
takes that parameter s
, and feeds it twice to the continuation
k :: s -> s -> r
, keeping the state (second argument) unchanged, but also
giving it as the main (first) argument of the subsequent computation.
The other function, put
ignores that parameter, and calls the continuation
k :: () -> s -> r
with another state given externally.
type State s a = forall r. Cont (s -> r) a
-- forall r. (a -> s -> r) -> s -> r
get :: State s s
get = Cont (\ k s -> k s s)
put :: s -> State s ()
put s = Cont (\ k _s -> k () s)
runState :: State s a -> s -> (a, s)
runState (Cont m) = m (,)
That State
is isomorphic to the standard definition, s -> (s, a)
.
Indeed, contrary to Abortable
, there is no observation to be made about the
continuation a -> s -> r
when r
is abstract.
As with Maybe
/Either
, there is no pattern-matching on pairs going on.
The s
and the a
are always just two arguments to the continuation, and a
pair gets built up only in the final continuation in runState
.
Writer
Are we running out of ideas for what to put in Cont _ a
?
Above we tried r
, Either e r
(sums!), s -> r
(exponentials!). Surely we
should also try products.
The result is not quite nice, because to do anything with the pair
we have to break the property that was maintained until now: that the
continuation k
is the last thing we call.
We can tell
an element of a monoid, by appending it in front of whatever the
rest of the computation outputs.
type Writer w a = forall r. Cont (w, r) a
-- forall r. (a -> (w, r)) -> (w, r)
tell :: Monoid w => w -> Writer w ()
tell w = Cont (\ k ->
let (w0, r) = k () in (w <> w0, r))
runWriter :: Monoid w => Writer w a -> (w, a)
runWriter (Cont m) = m (\ a -> (mempty, a))
State
, reversed
Cont (w, r)
can also be viewed as a variant of State
. Instead of treating
w
as a monoid, we can let the user update it however they want.
However, that update happens after the rest of the computation is done,
so the last update (in the order they would appear in a do
block for example)
is applied first to the initial state. This is the
reverse state monad, where modifications map the future state to the
past state.
Getting the current state in the RState
monad requires recursion:
the current state comes from the future (the continuation), which is
asking for the current state itself. With this rget
operation,
you have to be careful not to introduce any causality loop and
accidentally tear down the fabric of reality.
Compare our RState
with a more conventional definition of
it
as s -> (a, s)
. There, recursion is used in the definition of (>>=)
,
while get
is trivial, which is a situation opposite to RState
.
type RState s a = forall r. Cont (s, r) a
-- forall r. (a -> (s, r)) -> (s, r)
rmodify :: (s -> s) -> RState s ()
rmodify f = Cont (\ k ->
let (s, r) = k () in (f s, r))
rget :: RState s s
rget = Cont (\ k ->
let (s, r) = k s in (s, r))
runRState :: RState s a -> s -> (s, a)
runRState (Cont m) s = m (\a -> (s, a))
Tardis
We can combine State
, given by s -> r
, and RState
, given by (s, r)
:
instead, if we make the continuation result type s -> (s, r)
, we obtain a
Tardis
monad,
with one state going forward in time, and one going backwards.
The forward and backward states don’t actually have to be the same,
so we can also generalize (s -> (s, r))
into (fw -> (bw, r))
.
type Tardis bw fw a = forall r. Cont (fw -> (bw, r)) a
-- forall r. (a -> fw -> (bw, r)) -> fw -> (bw, r)
List
One last standard type we haven’t tried for r
is the type of lists.
In our previous examples, computations called the continuation only once
(or at least they should, we can exclude secondGuess
as a degenerate
example).
Equipping the result type with the structure of lists, we can call a
continuation multiple times, and return a combination of all the results.
This provides a model of nondeterministic computations, keeping track of all
possible executions, which is the same interpretation as the standard list []
monad.
decide
chooses both True
and False
, i.e., it calls the continuation
on both booleans, and concatenates the results together.
vanish
chooses nothing, it drops the continuation like abort
.
type List a = forall r. Cont [r] a
-- forall r. (a -> [r]) -> [r]
decide :: List Bool
decide = Cont (\ k -> k True ++ k False)
vanish :: forall a. List a
vanish = Cont (\ _k -> [])
runList :: List a -> [a]
runList (Cont m) = m (\ a -> [a])
There’s a handful of variations for that one. Use NonEmpty r
to rule out
vanish
; generalize over an abstract monoid or semigroup r
to prevent
inspection of the continuation; or use a Tree r
to keep track of the order of
choices.
type List1 a = forall r. Cont (NonEmpty r) a
type List' a = forall r. Monoid r => Cont r a
type List1' a = forall r. Semigroup r => Cont r a
type Tree0 a = forall r. Cont (Tree r) a
ContT
There is also a continuation monad transformer, which is simply the
continuation monad with a monadic result type m r
.
The transformers library defines ContT
as a newtype mostly so that it has
the right kind to be an instance of MonadTrans
.
All instances stay the same, so here we will prefer a type synonym to keep our
Monad
instance count at 1.
We will refer to ContT
and Cont
interchangeably, as we’re not too concerned
about kinds in this post, whichever looks better in context.
type ContT r m a = Cont (m r) a
-- (a -> m r) -> m r
What does it mean that ContT
is a monad transformer? There is a lift
function, which commutes with monadic operations (that’s called a
monad morphism). For ContT
, lift
is simply (>>=)
,
lift :: Monad m => m a -> ContT r m a
-- Monad m => m a -> (a -> m r) -> m r
lift u = Cont (\ k -> u >>= k)
-- Monad morphism laws:
-- lift (pure a) = pure a
-- lift (u >>= \ a -> k a) = lift u >>= \ a -> lift (k a)
CodensityT
A closely related sibling is the “codensity” monad
transformer,
where r
is universally quantified, like it is in previous examples.
Both ContT
and CodensityT
can be used to optimize monads2
that have expensive bind (>>=)
operations.
We won’t say anything here about the actual differences between ContT
and
CodensityT
.
type CodensityT m a = forall r. Cont (m r) a
-- forall r. (a -> m r) -> m r
In the examples above, the types we used instead of r
happen to be monads,
even if we did not rely on that fact. Here’s a quick summary, with the names
of the resulting variant of Cont
on the left, an equivalent definition in terms
of CodensityT
in the middle, and their more-or-less standard counterparts on
the right as they can be found on Hackage (base, transformers, rev-state
and tardis).
The words “retracts to” mean that there is a surjective but not injective
mapping from the left to the right.
Done = CodensityT Identity isomorphic to Identity
Abortable = CodensityT Maybe retracts to Maybe
Except e = CodensityT (Either e) retracts to Either e
State s = CodensityT (Reader s) isomorphic to State s
Writer w = CodensityT (Writer w) retracts to Writer w, or (reverse) State w
Tardis s = CodensityT (State s) retracts to Tardis s
List = CodensityT [] retracts to []
Many monad transformers in one
The monad transformers corresponding to the above monads also find
their equivalent in terms of Cont
. They are not exactly isomorphic, but
a noteworthy feature, as before, is that they still use the same old Monad
instance for Cont
. Operations do rely on a Monad
constraint for the
transformed monad m
.
ListT
Turning the previous examples into monad transformers is left as an exercise for the reader.
Here we will focus on List
; it is an interesting case because
a monad transformer corresponding to lists is notoriously non-obvious.
The obvious candidate m [a]
is not a monad (unless m
is commutative).
Curiously, we have the “monad” part down for free, and we only need to solve “list” and “transformer”.
We briefly saw earlier that we can get a “list” monad by using any monoid
instead of [r]
as the result type. We also saw that a monadic result type
m r
makes a monad transformer. In addition, any monad defines a monoid m ()
if we ignore the result (we can also use a different monoid instead of ()
but that doesn’t seem as interesting), with pure ()
as the unit
and (*>)
(or (>>)
) for composition. In fact, we only need an Applicative
constraint for the “list” operations, but lift
still requires Monad
.
We already had all the ingredients to make a list monad transformer!
Reading the definition of ListT
slowly, it takes a continuation
(a -> m ())
, and produces a computation m ()
. What can it actually do?
Mostly, call the continuation with various values of a
in some order.
type ListT m a = Cont (m ()) a
-- (a -> m ()) -> m ()
decideM :: Applicative m => ListT m Bool
decideM = Cont (\ k -> k True *> k False)
vanishM :: Applicative m => ListT m a
vanishM = Cont (\ _k -> pure ())
runListT :: Applicative m => (a -> m ()) -> ListT m a -> m ()
runListT k (Cont m) = m k
The list transformer is a nice pattern for deeply nested loops common in enumeration/search algorithms.
Here are three nested for_
loops:
-- All 3 bit patterns
threebit :: IO ()
threebit =
for_ [0, 1] $ \ i ->
for_ [0, 1] $ \ j ->
for_ [0, 1] $ \ k ->
printDigits [i, j, k]
printDigits :: [Int] -> IO ()
printDigits ds = do
for_ ds (\i -> putStr (show i))
putStrLn ""
Here they are again, where each value is bound using do
notation thanks to
the list transformer (this combination is really neat: Cont $ for_ [ ... ]
).
-- All 3 bit patterns
threebit' :: IO ()
threebit' = runListT printDigits $ do
i <- Cont $ for_ [0, 1]
j <- Cont $ for_ [0, 1]
k <- Cont $ for_ [0, 1]
pure [i, j, k]
Once iteration is captured in a monad, we can iterate across dimensions:
-- All 8 bit patterns
eightbit :: IO ()
eightbit = runListT printDigits $
replicateM 8 (Cont (for_ [0, 1]))
-- 00000000
-- 00000001
-- 00000010
-- 00000011
-- 00000100
-- 00000101
-- 00000110
-- ...
All of that is technically possible with just the list monad. The transformer really adds the ability to interleave enumeration and computation.
-- All 8 bit patterns, but show only the suffix that changed at every step.
eightbit' :: IO ()
eightbit' = runListT pure $ do
for_ [0 .. 7] $ \ n -> do
i <- Cont $ for_ [0, 1]
lift $ when (i == 1) $ putStr (replicate n ' ')
lift $ putStr (show (i :: Int))
lift $ putStrLn ""
-- 00000000
-- 1
-- 10
-- 1
-- 100
-- 1
-- 10
-- ...
Other list monad transformers
This “list monad transformer” is actually different from another incarnation which may be found on Hackage. The more common version of a “list monad transformer” is an “effectful list”, where the list constructors are interleaved with computations.3
newtype ListT m a = ListT (m (Maybe (a, ListT m a)))
The biggest difference is that the “effectful list” transformer naturally supports an uncons operation, which evaluates the effectful list and pauses after producing the first element (or the empty list).
The trade-off is that uncons has a cost in usability. The paused computation must be resumed explicitly: it may be dropped, or resumed more than once. The continuation transformer, by not allowing such “interruptions”, may offer stronger guarantees for resource management.
Conclusion
The continuation monad can thus serve as a uniform foundation for many kinds of monadic effects, and is even often a more efficient replacement of “standard” monads.
“Control operations” might cause some difficulties; those are operations
parameterized by computations, such as catch
and bracket
;
they weren’t discussed here, but I think the problems can be overcome.4
See also
- Oleg Kiselyov’s page on Continuations.
- The Mother of all Monads, by Dan Piponi.
- The best refactoring you’ve never heard of (aka. Defunctionalize the continuation), by James Koppel.
This is intimately related to predicate transformer semantics. There were two relevant papers at ICFP this year where continuations play a great role:
- A predicate transformer semantics for effects, by Wouter Swierstra and Tim Baanen, ICFP 2019. (PDF)
- Dijkstra monads for all, by Kenji Maillard et al., ICFP 2019. (arxiv)
Asymptotic improvement of computations over free monads, by Janis Voigtländer, MPC 2008. (PDF)↩︎
In more than one place: logict, pipes, list-t, list-transformer. In particular, logict provides a Church-encoded version of the “effectful list”, which brings it close to the continuation transformer, but there’s still a gap.↩︎
For example, there is an instance of the class
MonadReader
forContT
, with a problematic operationlocal :: MonadReader r m => (r -> r) -> m a -> m a
. It can be implemented by explicitly restoring the environment of the continuation followinglocal
. We also have to restrictContT
values under consideration to a subset of “well-behaved” ones. That would most likely forbid use ofcallCC
orshift
/reset
, but as we’ve seen throughout this post, there is a lot we can do without those: avoiding insane control operations keeps theCont
monad reasonable (cf. the title of this post).↩︎