Where does the name "algebraic data type" come from?

Posted on July 26, 2024

“Algebraic data types” is a beloved feature of programming languages with such a mysterious name. Where does this name come from?

There are two main episodes in this saga: Hope and Miranda. The primary conclusion is that the name comes from universal algebra, whereas another common interpretation of “algebraic” as a reference to “sums of products” is not historically accurate. We drive the point home with Clear. CLU is extra.

Disclaimer: I’m no historian and I’m nowhere as old as these languages to have any first-hand perspective. Corrections and suggestions for additional information are welcome.

Hope (1980)

Algebraic data types were at first simply called “data types”. This programming language feature is commonly attributed to Hope, an experimental applicative language by Rod Burstall et al.. Here is the relevant excerpt from the paper, illustrating its concrete syntax:

A data declaration is used to introduce a new data type along with the data constructors which create elements of that type. For example, the data declaration for natural numbers would be:

``data num == 0 ++ succ(num)``

(…) To define a type ‘tree of numbers’, we could say

``````data numtree == empty ++ tip(num)
++ node(numtree#numtree)``````

(The sign `#` gives the cartesian product of types). One of the elements of `numtree` is:

``````node(tip(succ(0)),
node(tip(succ(succ(0))), tip(0)))``````

But we would like to have trees of lists and trees of trees as well, without having to redefine them all separately. So we declare a type variable

``typevar alpha``

which when used in a type expression denotes any type (including second- and higher-order types). A general definition of `tree` as a parametric type is now possible:

``````data tree(alpha) == empty ++ tip(alpha)
++ node(tree(alpha)#tree(alpha))``````
Now `tree` is not a type but a unary type constructor – the type `numtree` can be dispensed with in favour of `tree(num)`.

Pattern matching in Hope is done in multi-clause function declarations or multi-clause lambdas. There was no `case` expression.

``````reverse(nil) <= nil
reverse(a::l) <= reverse(l) <> [a]``````
``````lambda true, p => p
| false, p => false``````

As far as I can tell, other early programming languages cite Hope or one of its descendants as their inspiration for data types. There is a slightly earlier appearance in NPL by Darlington and the same Burstall, but I couldn’t find a source describing the language or any samples of data type declarations. Given the proximity, it seems reasonable to consider them the same language to a large extent. This paper by Burstall and Darlington (1977) seems to be using NPL in its examples, but data types are only introduced informally; see on page 62 (page 19 of the PDF):

We need a data type `atom`, from which we derive a data type `tree`, using constructor functions `tip` to indicate a tip and `tree` to combine two subtrees

``````tip : atoms → trees
tree : trees x trees → trees``````

We also need lists of atoms and of trees, so for any type `alpha` let

``````nil : alpha-lists
cons : alphas x alpha-lists → alpha-lists``````

Hope inspired ML (OCaml’s grandpa) to adopt data types. In Standard ML:

``datatype 'a option = Nothing | Some of 'a``

Before it became Standard, ML started out as the “tactic language” of the LCF proof assistant by Robin Milner, and early versions did not feature data types (see the first version of Edinburgh LCF). it’s unclear when data types were added exactly, but The Definition of Standard ML by Milner et al. credits Hope for it (in Appendix F: The Development of ML):

Two movements led to the re-design of ML. One was the work of Rod Burstall and his group on specifications, crystallised in the specification language Clear and in the functional programming language Hope; the latter was for expressing executable specifications. The outcome of this work which is relevant here was twofold. First, there were elegant programming features in Hope, particularly pattern matching and clausal function definitions; second, there were ideas on modular construction of specifications, using signatures in the interfaces. A smaller but significant movement was by Luca Cardelli, who extended the data-type repertoire in ML by adding named records and variant types.

Miranda (1985)

“Data types” as a programming language feature appeared in Hope, but its first mention under the name “algebraic data types” that I could find is in Miranda: a non-strict functional language with polymorphic types by David Turner in 1985:

Algebraic data types

The basic method of introducing a new concrete data type, as in a number of other languages, is to declare a free algebra. In Miranda this is done by an equation using the symbol `::=`,

``tree ::= Niltree | Node num tree tree``
being a typical example. (…) The idea of using free algebras to define data types has a long and respectable history [Landin 64], [Burstall 69], [Hoare 75]. We call it a free algebra, because there are no associated laws, such as a law equating a tree with its mirror image. Two trees are equal only if they are constructed in exactly the same way.

In case you aren’t aware, Miranda is a direct precursor of Haskell. A minor similarity with Haskell that we can see here is that data constructors are curried in Miranda, unlike in Hope and ML. Another distinguishing feature of Miranda is laziness. See also A History of Haskell: being lazy with class.

Below are links to the articles cited in the quote above. The first [Landin 64] doesn’t explicitly talk about algebra in this sense, while [Burstall 69] and [Hoare 75] refer to “word algebra” rather than “free algebra” to describe the same structure, without putting “algebra” in the same phrase as “type” yet.

Hoare’s paper contains some futuristic pseudocode in particular:

A possible notation for such a type definition was suggested by Knuth; it is a mixture of BNF (the `|` symbol) and the PASCAL definition of a type by enumeration:

``````type proposition = (prop (letter) | neg (proposition) |
conj, disj (proposition, proposition));``````

(…) In defining operations on a data structure, it is usually necessary to enquire which of the various forms the structure takes, and what are its components. For this, I suggest an elegant notation which has been implemented by Fred McBride in his pattern-matching LISP. Consider for example a function intended to count the number of `&`s contained in a proposition. (…)

``````function andcount (p: proposition): integer;
andcount := cases p of
(prop(c) → 0|
neg(q) → andcount(q)|
conj(q,r) → andcount(q) + andcount(r)+1|
disj(q,r) → andcount(q) + andcount(r));``````

Fred McBride’s pattern-matching LISP is the topic of his PhD dissertation. There is not enough room on this page to write about the groundbreaking history of LISP.

Unfree algebras in Miranda

If algebraic data types are “free algebras”, one may naturally wonder whether “unfree algebras” have a role to play. Miranda allows quotienting data type definitions by equations (“laws” or “rewrite rules”). You could then define the integers like this, with a constructor to decrement numbers, and equations to reduce integers to a canonical representation:

``````int ::= Zero | Suc int | Pred int
Suc (Pred n) => n
Pred (Suc n) => n``````

In hindsight this is superfluous, but it’s fun to see this kind of old experiments in programming languages. The modern equivalent in Haskell would be to hide the data constructors and expose smart constructors instead. There are uses for quotient types in proof assistants and dependently typed languages, but they work quite differently.

Sums of products?

There is another folklore interpretation of “algebraic” in “algebraic data types” as referring to “sums of products”.

It’s not an uncommon interpretation. In fact, trying to find a source for this folklore is what got me going on this whole adventure. The Wikipedia article on algebraic data types at the time of writing doesn’t outright say it, but it does refer to sums and products several times while making no mention of free algebras. Some [citation needed] tags should be sprinkled around. The Talk page of that article contains an unresolved discussion of this issue, with links to a highly upvoted SO answer and another one whose references don’t provide first-hand account of the origins of the term. For sure, following that idea leads to some fun combinatorics, like differentiation on data types, but that doesn’t seem to have been the original meaning of “algebraic data types”.

That interpretation might have been in some people’s mind in the 70s and 80s, even if only as a funny coincidence, but I haven’t found any written evidence of it except maybe this one sentence in a later paper, Some history of programming languages by David Turner (2012):

The ISWIM paper also has the first appearance of algebraic type definitions used to define structures. This is done in words, but the sum-of-products idea is clearly there.

It’s only a “maybe” because while the phrase “algebraic type” undeniably refers to sums of products, it’s not clear that the adjective “algebraic” specifically is meant to be associated with “sum-of-products” in that sentence. We could replace “algebraic type” with “data type” without changing the meaning of the sentence.

Clear (1979)

In contrast, free algebras—or initial algebras as one might prefer to call them—are a concept from the areas of universal algebra and category theory with a well-established history in programming language theory by the time algebraic data types came around, with influential contributions by a certain ADJ group; see for example Initial algebra semantics and continuous algebras.

Ironically, much related work focused on the other ADT, “abstract data types”. Using universal algebra as a foundation, a variety of “specification languages” have been designed for defining algebraic structures, notably the OBJ family of languages created by Joseph Goguen (a member of the aforementioned ADJ group) and others, and the Clear language by Rod Burstall (of Hope fame) and Joseph Goguen. Details of the latter can be found in The Semantics of Clear, a specification language. (You may remember seeing a mention of Clear earlier in the quote from The Definition of Standard ML.)

Example theories in Clear

Here is the theory of monoids in Clear. It consists of one sort named `carrier`, an element (a nullary operation) named `empty` and a binary operation `append`.

``````constant Monoid = theory
sorts carrier
opns empty : carrier
append : carrier,carrier -> carrier
eqns all x: carrier . append(x,empty) = x
all x: carrier . append(empty,x) = x
all x,y,z: carrier . append(append(x,y),z) = append(x,append(y,z))
endth``````

A theory is an interface. Its implementations are called algebras. In that example, the algebras of “the theory of monoids” are exactly monoids.

In every theory, there is an initial algebra obtained by turning the operations into constructors (or “uninterpreted operations”), equating elements (which are trees of constructors) modulo the equations of the theory. For the example above, the initial monoid is a singleton monoid, with only an empty element (all occurrences of `append` are simplified away by the two equations for `empty`), which is not very interesting. Better examples are those corresponding to the usual data types.

The booleans can be defined as the initial algebra of the theory with one sort (`truthvalue`) and two values of that sort, `true` and `false`.

``````constant Bool = theory data
sorts truthvalue
opns true,false: truthvalue
endth``````

In Clear, the initial algebra is specified by adding the `data` keyword to a `theory`. In the semantics of Clear, rather than thinking in terms of a specific algebra, a “data theory” is still a theory (an interface), with additional constraints that encode “initiality”, so the only possible algebra (implementation) is the initial one. My guess as to why the concept of data theory is set up that way is that it allows plain theories and data theories to be combined seamlessly.

The natural numbers are the initial algebra of `zero` and `succ`:

``````constant Nat = theory data
sorts nat
opns zero: nat
succ: nat -> nat
endth``````

At this point, the connection between “data theories” in Clear and data types in Hope and subsequent languages is hopefully clear.

More substantial examples in Clear

Theories can be extended into bigger theories with new sorts, operations, and equations. Here is an extended theory of booleans with two additional operations `not`, `and`, and their equations. This should demonstrate that, beyond the usual mathematical structures, we can define non-trivial operations in this language:

``````constant Bool1 = enrich Bool by
opns not: truthvalue -> truthvalue
and: truthvalue,truthvalue -> truthvalue
eqns all . not true = false
all . not false = true
all p: truthvalue . and(false, p) = false
all p: truthvalue . and(true, p) = p
enden``````

Initial algebras are also called free algebras, but that gets confusing because “free” is an overloaded word. Earlier for instance, you might have expected the initial monoid, or “free monoid”, to be the monoid of lists. The monoid of lists is the initial algebra in a slightly different theory: the theory of monoids with an embedding from a fixed set of elements `A`.

We might formalize it as follows in Clear. The theory `List` is parameterized by an algebra `A` of the theory `Set`, and its body is the same as `Monoid`, except that we renamed `carrier` to `list`, we added an `embed` operation, and we added the `data` keyword to restrict that theory to its initial algebra.

``````constant Set = theory sorts element endth
procedure List(A : Set) = theory data
sorts list
opns empty : list
append : list,list -> list
embed : element of A -> list
eqns all x: list . append(x,empty) = x
all x: list . append(empty,x) = x
all x,y,z: list . append(append(x,y),z) = append(x,append(y,z))
endth``````

One may certainly see a resemblance between theories in Clear, modules in ML, and object-oriented classes. It’s always funny to find overlaps between the worlds of functional and object-oriented programming.

CLU (1977)

CLU is a programming language created at MIT by Barbara Liskov and her students in the course of their work on data abstraction.

It features tagged union types, which are called “oneof types”. (Source: CLU Reference Manual by Barbara Liskov et al. (1979).)

``````T = oneof[empty:       null,
integer:     int,
real_num:    real,
complex_num: complex]``````

Values are constructed by naming the oneof type (either as an identifier bound to it, or by spelling out the `oneof` construct) then the tag prefixed by `make_`:

``T\$make_integer(42)``

The `tagcase` destructs “oneof” values.

``````x: oneof[pair: pair, empty: null]
...
tagcase x
tag empty: return(false)
tag pair(p: pair): if (p.car = i)
then return(true)
else x := down(p.cdr)
end
end``````

The main missing feature for parity with algebraic data types is recursive type definitions, which are not allowed directly. They can be achieved indirectly though inconveniently through multiple clusters (classes in modern terminology). (Source: A History of CLU by Barbara Liskov (1992).)

Burstall’s papers on Hope and Clear cite CLU, but beyond that it doesn’t seem easy to make precise claims about the influence of CLU, which is an object-oriented language, on the evolution of those other declarative languages developed across the pond.