Alpha-beta pruning is just minimax in a lattice of clamping functions
A lazy take on a classic game theory algorithm.

Table of contents
Haskell extensions and imports used in this post
{-# LANGUAGE
DataKinds,
DeriveGeneric,
DeriveTraversable,
DerivingStrategies,
GeneralizedNewtypeDeriving,
RankNTypes,
ScopedTypeVariables,
StandaloneDeriving,
TypeFamilies #-}
import Data.Ord (Down(Down, getDown))
import Data.List.NonEmpty (NonEmpty(..))
import qualified GHC.Generics as GHC
import Generics.SOP (Generic, HasDatatypeInfo, NP(..), K(..))
import Test.QuickCheck
import Test.StrictCheck
Minimax
Minimax is a general algorithm for finding optimal strategies. It’s not meant to be efficient or practical. It is more of a basic concept of game theory, and a reference against which to compare other game-solving algorithms.
We consider a simple model of two-player games. They take turns playing moves until reaching an end state with a final score. One player’s goal is to maximize the score, whereas the other player’s goal is to minimize it. Let us call these players Max and Min respectively, short for Maximizer and Minimizer.
We represent such a game by its game tree, which is made up of
three constructors:
a Max
(resp. Min
) node represents a game state where Max (resp. Min)
chooses the next move, each move resulting in a new game state,
and an End
leaf represents an end state as its score.
data Game score
= Max (NonEmpty (Game score))
| Min (NonEmpty (Game score))
| End score
deriving stock (Show, Functor, Foldable)
Note that Max
and Min
nodes must have at least one possible move.
You may be wondering about games that end when one player can no longer play:
instead of an empty Min
or Max
node, such game states simply correspond
to an End
leaf, making the final score
explicit.
Most real games just have a win/tie/lose end condition.
They naturally correspond to applying Game
to a type with three possible scores:
data WinLose = MinWins | Tie | MaxWins
deriving (Eq, Ord, Show)
In practice, chess engines don’t work with the whole game tree
since it is too massive. Instead, they build approximations by
pruning certain branches of the tree and replacing them with leaves.
The score on each leaf is a number which estimates how favorable
the game state is to either player. So we end up with
Game ℝ
, or Game Double
.
In general, the type Game
represents two-player games
with complete information and zero-sum objectives.
We shall assume that score
is a totally ordered set. This requirement
corresponds to a constraint Ord score
in Haskell. In that case,
there exists an “optimal strategy” for each player which guarantees
them an “optimal score” m
in the sense that as long as one player
sticks to their “optimal strategy”, the other player cannot
score better than m
.
This situation is what we call a Nash equilibrium in game theory.
For win/tie/lose games, the existence of a Nash equilibrium
means that either there is a winning strategy for one of the players,
or they must tie by playing optimally.
The “optimal score” m
is unique, and can be simply by a fold of the game tree,
replacing Max
and Min
constructors with the functions maximum
and minimum
.
This is the minimax algorithm:
minimax :: Ord score => Game score -> score
minimax (Max gs) = maximum (minimax <$> gs)
minimax (Min gs) = minimum (minimax <$> gs)
minimax (End s) = s
minimax
is quite an inefficient algorithm:
it must traverse the whole game tree. Indeed, maximum
and minimum
must traverse the whole list to find
the maximum or minimum element.
Often, we can do much better. For instance, consider the following tree:
Max [ End 0,
Min [ End (-1),
t ] ]
The minimax of that tree does not depend on the subtree t
.
Indeed, minimum [-1, minimax t]
is guaranteed to be at most -1
,
so the maximum between that value and 0
is guaranteed to be 0
.
Thus we can compute the minimax without inspecting the subtree t
,
which may be arbitrarily large.
That idea leads to a more efficient algorithm to compute the minimax.
Alpha-beta
The alpha-beta pruning algorithm1 is a modification of minimax with an extra pair of arguments:
alphabeta :: Ord score => Game score -> (score, score) -> score
The pair (alpha, beta)
represents a “relevance interval” which
relaxes the possible outputs of alphabeta
.
Either alphabeta t (alpha, beta)
produces a score within that interval,
in which case it is guaranteed to be equal to minimax
. Otherwise,
alphabeta t (alpha, beta)
produces a value outside of the interval,
in which case its exact value does not matter; it only has to be on
the same side of the interval as minimax t
. More rigorously:
- if
alpha < minimax t < beta
, thenalphabeta t (alpha, beta) = minimax t
; - if
minimax t <= alpha
, thenalphabeta t (alpha, beta) <= alpha
; - if
beta <= minimax t
, thenbeta <= alphabeta t (alpha, beta)
.
Leaving the value of alphabeta
underspecified when outside of the
interval allows the implementation to short-circuit:
we can stop searching through Max
nodes as soon as we can guarantee a score greater than beta
,
and we can stop searching through Min
nodes as soon as we can guarantee a score smaller than alpha
.
We can then use alphabeta
to redefine minimax
:
-- Minimax using alpha-beta pruning
minimaxAB :: (Ord score, Bounded score) => Game score -> score
minimaxAB t = alphabeta t (minBound, maxBound)
assuming that score
is Bounded
with extreme values
minBound :: score
and maxBound :: score
.
It’s possible to avoid the Bounded
constraint by changing
the interval type (score, score)
to (Maybe score, Maybe score)
,
which amounts to adding distinguished top and bottom elements.
We’ll stick with Bounded
to keep things a bit simpler.
Implementing alphabeta
is a standard exercise.
It is even easier when you have a formal specification like the above
to guide the implementation.
alphabeta :: Ord score => Game score -> (score, score) -> score
alphabeta (Max (g0 :| [])) i = alphabeta g0 i
alphabeta (Max (g0 :| g1 : gs)) (alpha, beta) =
let m0 = alphabeta g0 (alpha, beta) in
if beta <= m0 then m0
else m0 `max` alphabeta (Max (g1 :| gs)) (max alpha m0, beta)
alphabeta (Min (g0 :| [])) i = alphabeta g0 i
alphabeta (Min (g0 :| g1 : gs)) (alpha, beta) =
let m0 = alphabeta g0 (alpha, beta) in
if m0 <= alpha then m0
else m0 `min` alphabeta (Min (g1 :| gs)) (alpha, min beta m0)
alphabeta (End s) _ = s
But still, it is at least a little finicky and tedious to make sure that you haven’t mixed your alphas and betas.
As we will see in this post, we can streamline the implementation of alpha-beta pruning by factoring the short-circuiting logic out of the “minimax” logic.
Generalized minimax
Remark that minimax
only uses min
and max
(via minimum
and maximum
), rather than the comparison
functions of Ord
(compare
, (<=)
, etc.).
We can reduce the dependency footprint of minimax
by
defining a new class with only the necessary operations,
the class of lattices:
class Lattice a where
-- Join, least upper bound, max
\/) :: a -> a -> a
(-- Meet, greatest lower bound, min
/\) :: a -> a -> a (
In mathematics, lattices are algebraic structures with two operations
(\/)
(“join”) and (/\)
(“meet”)
satisfying commutativity, associativity, as well as the absorption laws:
x \/ (x /\ y) = x
x /\ (x \/ y) = x
In this post, we will only be looking at lattices that arise
out of total orders,
so this class is rather just a way of saying that we only
depend on min
and max
.
Binary operations can be iterated to combine lists of arguments,
similarly to the maximum
and minimum
functions:
-- maximum
joins :: Lattice a => NonEmpty a -> a
joins = foldr1 (\/)
-- minimum
meets :: Lattice a => NonEmpty a -> a
meets = foldr1 (/\)
Minimax in lattices is defined by replacing Max
and Min
nodes with
the joins
and meets
operations.
-- Minimax in lattices
minimaxL :: Lattice score => Game score -> score
minimaxL (Max gs) = joins (minimaxL <$> gs)
minimaxL (Min gs) = meets (minimaxL <$> gs)
minimaxL (End x) = x
minimaxL
generalizes minimax
since every decidable total order is a lattice
(because you can use (<=)
to define min
/max
).
Ideally this fact would be made explicit by making Lattice
into
a superclass of Ord
. Unfortunately in Haskell this would require us
to modify Ord
or redefine it.
Another way to express the relation between Lattice
and Ord
is through a newtype
.
newtype OrdLattice a = OrdLattice a
deriving newtype (Eq, Ord, Bounded)
unOrdLattice :: OrdLattice a -> a
unOrdLattice (OrdLattice x) = x
instance Ord a => Lattice (OrdLattice a) where
OrdLattice x \/ OrdLattice y = OrdLattice (max x y)
OrdLattice x /\ OrdLattice y = OrdLattice (min x y)
With that, we recover the starting minimax
by specializing
minimaxL
to OrdLattice s
, and then unwrapping OrdLattice
:
minimaxO :: Ord score => Game score -> score
minimaxO = unOrdLattice . minimaxL . fmap OrdLattice
Clamping functions
Focus on the type (score, score) -> score
which appears in the signature of alphabeta
.
More specifically, we are interested in a subset of those functions that
we shall call clamping functions.
Intuitively, a clamping function f
is a delayed representation of a constant s
:
the goal of f
is to compute s
, but it may also stop early with an approximation
if it’s not necessary to know the exact value of s
.
The name “clamping function” is a reference to the clamp
function:
clamp :: Ord score => score -> (score, score) -> score
clamp s (alpha, beta) = max alpha (min s beta)
We can think of the partially applied function clamp s
as an encoding of the constant s
,
which may or may not be output depending on the interval (alpha, beta)
.
More formally, a clamping function with value s
is a function f :: (score, score) -> score
that satisfies the following, for all (alpha, beta)
such that alpha < beta
:
- if
alpha < s < beta
, thenf (alpha, beta) = s
; - if
s <= alpha
, thenf (alpha, beta) <= alpha
; - if
beta <= s
, thenbeta <= f (alpha, beta)
.
Two clamping functions with the same value s
are considered equal.
In particular, as clamping functions, const s
is equal to clamp s
.
Making the notion of equality explicit is necessary to make sense of equations
(laws for lattices, homomorphisms, and isomorphisms).
We enshrine the definition of clamping functions in a newtype:
-- Type of clamping functions, satisfying the properties above.
newtype Clamping score = Clamping ((score, score) -> score)
unClamping :: Clamping score -> (score, score) -> score
unClamping (Clamping f) = f
For any value s
, we can construct the constant clamping function:
clamping :: score -> Clamping score
clamping s = Clamping (\_ -> s)
Note that \_ -> s
and clamp s
are both clamping functions with value s
,
so both are valid definitions of clamping s
.
We prefer the constant function \_ -> s
because it does less work.
Conversely, we can project clamping functions back into their values
by passing the whole interval (minBound, maxBound)
:
declamp :: Bounded score => Clamping score -> score
declamp (Clamping f) = f (minBound, maxBound)
Those two functions form an isomorphism between score
and Clamping score
,
meaning that they satisfy the following equations:
declamp . clamping = id
clamping . declamp = id
We now get to the secret sauce of this post: the maximum of two clamping functions (as well as the minimum). This operation can be defined in two ways. First is the naive definition, for reference:
-- "max" for clamping functions, naive variant
maxC :: Ord s => Clamping s -> Clamping s -> Clamping s
maxC (Clamping f) (Clamping g) = Clamping (\i -> max (f i) (g i))
Second is the lazy definition: if f (alpha, beta)
is greater
than the given upper bound beta
, then the max of f
and g
will
be even greater:
beta <= f (alpha, beta) <= max (f (alpha, beta)) (g (alpha, beta))
In that case, the maximum of f
and g
is allowed to output
f (alpha, beta)
without looking at g
.
Otherwise we must evaluate g
, but we can tighten the interval by
updating the lower bound to max alpha (f (alpha, beta))
.
-- "max" for clamping functions, lazy variant
lazyMaxC :: Ord s => Clamping s -> Clamping s -> Clamping s
lazyMaxC (Clamping f) (Clamping g) = Clamping (\(alpha, beta) ->
let fi = f (alpha, beta) in
if beta <= fi then fi else fi `max` g (max alpha fi, beta))
Dually, we also have a lazyMinC
.
lazyMinC :: Ord s => Clamping s -> Clamping s -> Clamping s
lazyMinC (Clamping f) (Clamping g) = Clamping (\(alpha, beta) ->
let fi = f (alpha, beta) in
if fi <= alpha then fi else fi `min` g (alpha, min beta fi))
To avoid repeating ourselves,
we can also reuse lazyMaxC
to implement lazyMinC
.
Use Down
to invert the ordering of an Ord
:
lazyMinC :: Ord s => Clamping s -> Clamping s -> Clamping s
lazyMinC f g = undualize (lazyMaxC (dualize f) (dualize g))
where
dualizeWith from to (Clamping h) =
Clamping (\(beta, alpha) -> from (h (to alpha, to beta)))
dualize = dualizeWith Down getDown -- Clamping s -> Clamping (Down s)
undualize = dualizeWith getDown Down -- Clamping (Down s) -> Clamping s
These “naive” and “lazy” functions denote the same value
(maxC = lazyMaxC
and minC = lazyMinC
),
but lazyMaxC
and lazyMinC
may do less work,
either by ignoring their second argument or by applying it to a smaller interval than expected.
The point is that these “lazy” functions embody the short-circuiting logic of alpha-beta pruning exactly. All that’s left to do is to plug them into minimax.
The lattice of clamping functions
With the lazy min and max that we just defined, we get a lattice:
instance Ord score => Lattice (Clamping score) where
\/) = lazyMaxC
(/\) = lazyMinC (
Specialize minimax in the lattice of clamping functions:
minimaxC :: Ord score => Game (Clamping score) -> Clamping score
minimaxC = minimaxL
This doesn’t look like much, but we have actually implemented the
alpha-beta pruning algorithm.
With a tiny bit of plumbing, we can redefine the function alphabeta
from earlier:
alphabeta' :: Ord score => Game score -> (score, score) -> score
alphabeta' = unClamping . minimaxC . fmap clamping
Then we want to partially apply alphabeta'
to the interval (minBound, maxBound)
.
This amounts to replacing unClamping
with declamp
in the body of alphabeta'
.
Behold our final implementation of minimax by alpha-beta pruning:
minimaxAB' :: (Ord score, Bounded score) => Game score -> score
minimaxAB' = declamp . minimaxC . fmap clamping
To sum up, we implemented alpha-beta pruning as a simple combination of:
- minimax, generalized from orders to lattices (
minimaxL
); - the lattice of clamping functions (
Lattice (Clamping score)
).
This alternative approach does not completely absolve you from effort:
you still have to juggle alphas and betas correctly to implement the lattice
(lazyMinC
and lazyMaxC
).
But unlike in the original alphabeta
,
you don’t have to do all that juggling in the middle of a recursive function.
The logic of alpha-beta pruning is neatly decomposed into bite-sized pieces.
Correctness for free
Since we just reused the code of minimax, it’s also easier to prove that that alpha-beta pruning yields the same result:
minimax = minimaxAB'
As we are about to see, this is a direct consequence of
the free theorem2 for minimaxL
:
any function of type forall s. Lattice s => Game s -> s
,
such as minimaxL
, commutes with any lattice homomorphism3 f
,
in the following sense:
f . minimaxL = minimaxL . fmap f
We can picture that equation as a commutative diagram:
\[\require{AMScd} \begin{CD} \small\texttt{Game s} @>{\texttt{minimaxL}}>> \small\texttt{s} \\ @V{\texttt{fmap f}}VV @VV{\texttt{f}}V \\ \small\texttt{Game t} @>{\texttt{minimaxL}}>> \small\texttt{t} \end{CD}\]
If f
has an inverse f⁻¹
, we can rewrite that to
minimaxL = f⁻¹ . minimaxL . fmap f
By replacing (f, f⁻¹)
with the isomorphism (clamping, declamp)
defined earlier,
we obtain exactly the equality between minimax and alpha-beta pruning:
minimaxL = declamp . minimaxL . fmap clamping
= minimaxAB'
As a commutative diagram:
\[\require{AMScd} \begin{CD} \small\texttt{Game s} @>{\texttt{minimaxAB’}\text{ (alpha-beta)}}>> \small\texttt{s} \\ @V{\texttt{fmap clamping}}VV @AA{\texttt{declamp}}A \\ \scriptsize\texttt{Game (Clamping s)} @>{\texttt{minimaxL}}>> \scriptsize\texttt{Clamping s} \end{CD}\]
QED.
(To be pedantic, the above proof
conflates minimaxL
with minimax
/minimaxO
,
which relies on pretending that Lattice
is a superclass of Ord
.
Below is another proof that doesn’t take that shortcut,
by going through the OrdLattice
newtype explicitly,
so this proof applies more directly to the Haskell definitions as written here.)
A somewhat more rigorous proof
We want to prove that the alpha-beta-pruning minimaxAB'
is equivalent to the naive minimax
:
minimax = minimaxAB'
Recall the free theorem of minimaxL
. For any lattice isomorphism (f, f⁻¹)
:
minimaxL = f⁻¹ . minimaxL . fmap f
Replace (f, f⁻¹)
with the lattice isomorphism (clamping . unOrdLattice, OrdLattice . declamp)
between the lattices OrdLattice score
and Clamping score
.
minimaxL = OrdLattice . declamp . minimaxL . fmap (clamping . unOrdLattice)
Now we can prove the equality between minimax
and minimaxAB'
,
using the above equation as the middle step, followed by
canceling inverses:
minimax
= minimaxO
= unOrdLattice . minimaxL . fmap OrdLattice
= unOrdLattice . OrdLattice . declamp . minimaxL . fmap (clamping . unOrdLattice) . fmap OrdLattice
= declamp . minimaxL . fmap clamping
= minimaxAB'
The above is only a proof of functional correctness:
minimax
and minimaxAB'
compute the same result.
To verify that minimaxAB'
does so more efficiently
is another problem for another day. For now, we can test it.
Strictness check
We test that our “fancy” implementation of alpha-beta (minimaxAB'
) has the same
strictness as the “classical” implementation (minimaxAB
),
which we presume to be much lazier than minimax.
We use StrictCheck for property-testing of strictness behaviors in Haskell.
The following test checks that minimaxAB
and minimaxAB'
have the
same demand on random inputs.
We use the function observe1
from StrictCheck to observe the demand
of a function f
: observe1
applies f
it to an instrumented copy
of the provided input g
, it forces the output (f g
of type Int
)
using the provided forcing function (`seq` ())
,
and finally returns the demand on the input tree g
that was observed
by forcing the instrumented copy of g
.
main :: IO ()
main = do
quickCheck $ \(g :: Game Int) ->
label (bucket (length g)) $
let demand f = snd (observe1 (`seq` ()) f g) in
demand minimaxAB === demand minimaxAB'
From the source repository of this blog, the following command compiles and runs this blog post:
cabal run alpha-beta
Instances and auxiliary definitions
-- Histogram of generated value sizes
bucket :: Int -> String
bucket n | n == 1 = "= 1"
| n < 10 = "< 10"
| n < 100 = "< 100"
| n < 1000 = "< 1000"
| otherwise = ">= 1000"
-- Instances
deriving stock instance GHC.Generic (Game a)
instance Generic (Game a)
instance HasDatatypeInfo (Game a)
instance Shaped a => Shaped (Game a)
instance Arbitrary a => Arbitrary (Game a) where
arbitrary = sized $ \n -> if n == 0 then End <$> arbitrary else
resize (n `div` 2) $ frequency
1, End <$> arbitrary), (2, Max <$> arbitrary), (2, Min <$> arbitrary)]
[(shrink (Max (g :| gs)) = g : gs ++ (Max <$> shrink (g :| gs))
shrink (Min (g :| gs)) = g : gs ++ (Min <$> shrink (g :| gs))
shrink (End s) = End <$> shrink s
instance Arbitrary a => Arbitrary (NonEmpty a) where
arbitrary = liftA2 (:|) arbitrary arbitrary
shrink (x :| xs) = [y :| ys | y : ys <- shrink (x : xs)]
Conclusion
I came up with this idea a while back on Stack Overflow, as an answer to Alpha-beta pruning with recursion schemes. My understanding of alpha-beta pruning changed overnight from a somewhat tricky algorithm to a completely trivial solution. Getting to reuse minimax is not only a satisfying achievement in refactoring, it enables a neat proof of correctness by parametricity (via free theorems).
The role of laziness should also be underscored. If you try to do the same thing in a call-by-value language, the implementation of “generalized minimax” must explicitly delay computations, obscuring the point:
Alpha-beta pruning is just minimax in a lattice of clamping functions.
For a clearer presentation, see the talk Alpha-Beta Pruning Explored, Extended and Verified (2024) by Tobias Nipkow.↩︎
Theorems for free! by Philip Wadler. Free theorems involving type constructor classes by Janis Voigtländer.↩︎
A lattice homomorphism
f
is a function that commutes with the lattice operations:
↩︎f (x /\ y) = f x /\ f y f (x \/ y) = f x \/ f y