# Heterogeneous lists with dependent types in Haskell

With the help of GHC’s many extensions, we can encode simple forms of dependent types, allowing us to enforce more expressive invariants in our programs at compile time.

In this post, I will walk you through an implementation of heterogeneous lists to demonstrate the features and techniques commonly associated with the current flavor of “dependent Haskell”.

My Literate Haskell blog posts render the whole source, including the extensions. In this post, I will mention most of them when they are used, and explain them a bit.

## Extensions and imports for this Literate Haskell file

```
{-# LANGUAGE
AllowAmbiguousTypes,
ConstraintKinds,
DataKinds,
FlexibleContexts,
FlexibleInstances,
GADTs,
InstanceSigs,
MultiParamTypeClasses,
PolyKinds,
RankNTypes,
ScopedTypeVariables,
TypeApplications,
TypeFamilies,
TypeOperators,
UndecidableInstances #-}
module HLists where
```

As for those that won’t be mentioned later, let’s take care of them now briefly.

`InstanceSigs`

allows type signatures to appear in instance declarations, so that we don’t need to go back to the type class to remember what is being defined. This is also consistent with the common practice of annotating all toplevel bindings.`ConstraintKinds`

,`FlexibleContexts`

,`FlexibleInstances`

,`ScopedTypeVariables`

and`UndecidableInstances`

lift restrictions that are mostly historical. They are quite benign extensions that might be unnoticeable when enabled if you don’t already know about them.

The GHC user guide also documents all the available extensions.

## A type for heterogeneous lists

We will construct lists using the infix pair type below as a cons, and the unit
type `()`

as a nil.
The infix type constructor `(:*)`

is allowed by the `TypeOperators`

extension
(not to be confused with the identically-named data constructor `(:*)`

,
which is allowed by the standard).

```
data a :* b = a :* b deriving Show
infixr 1 :*
```

Another common candidate for heterogeneous lists is a GADT indexed by a list of
types, but I prefer `(:*)`

for its simplicity.

Here is an example of a list of three elements:

```
hlistExample1 :: Int :* String :* String :* ()
hlistExample1 = 1 :* "One" :* "1" :* ()
```

You might wonder, why even have a nil, since there can be any type to the right
of `(:*)`

? This is clearly shorter:

```
hlistExample2 :: Int :* String :* String
hlistExample2 = 1 :* "One" :* "1"
```

The problem is that it introduces an odd corner case, where the last element of
a list can never be a heterogeneous list, since `(:*)`

will be interpreted as a
tail of the list.
Is the above list a list of three elements, or is it a list of two (one `Int`

and one `String :* String`

)?

Such a choice would be reflected in the rest of the implementation as an
unnecessary complication compared to having lists be clearly terminated by
`()`

.

Hence, it is possible to accidentally construct these ill-formed lists, but something will break at compile time anyway, so that’s okay.

## Task: get an element by type

We will implement a function to extract an element of a given type in the list.
It is declared as part of the following two-parameter type class.
This requires the `MultiParamTypeClasses`

extension.

```
class Get a bs where
get :: bs -> a
```

For example, we can get the string out of this triple as follows:

```
getExample1 :: String
getExample1 =
get @String ((1 :: Int) :* "One" :* "1" :* ())
-- "One"
```

In the simple examples of this post, we can always infer the output type of
`get`

from the context where it is applied.
But in general, an explicit annotation will often be necessary, and that is
nicely done with the `TypeApplications`

extension as shown in the above example.
`get`

has two type arguments, in the order of the type class arguments.^{1}
Here we explicitly apply it to its first type argument, which is the type
of the element we want to extract.

We’ll go through three progressively more elaborate implementations for `Get`

.
To avoid conflicts between these attempts, we will declare a new copy of that
class at the beginning of each section.

### Conventional approach

```
class Get0 a bs where
get0 :: bs -> a
```

The simplest solution is to write two overlapping instances.
First, when the type of the head of the list doesn’t match, we keep looking in
the tail, calling `get0`

recursively.

```
instance Get0 a bs => Get0 a (b :* bs) where
get0 :: (b :* bs) -> a
get0 (_ :* ys) = (get0 :: bs -> a) ys
```

Second, when the types match, we return the head of the list.

```
instance {-# OVERLAPPING #-} Get0 a (a :* bs) where
get0 :: (a :* bs) -> a
get0 (x :* _) = x
```

By default, overlapping instances are forbidden: often enough, they are caused
by a mistake.
We indicate that the overlap is intentional using the `OVERLAPPING`

pragma on
the *more specific* instance.^{2}
The instance `(Get0 a (a :* bs))`

is more specific than `(Get0 a (b :* bs))`

because we can substitute variables in the latter to obtain the former: replace
`b`

with `a`

.

Before going in more details, let’s give this implementation a try. We can reproduce the above example successfully:

```
get0Example1 :: String
get0Example1 =
get0 @String ((1 :: Int) :* "One" :* "1" :* ())
-- "One"
```

What about this more abstract operation: get the third element of any sufficiently long list.

```
getThird0 :: (a :* b :* c :* ds) -> c
getThird0 = get0
-- Type error!
```

The constraint solver gets stuck on solving `Get0 c (a :* b :* c :* ds)`

because it cannot determine whether `a`

is equal to `c`

,
so it can’t choose one of the above instances.

If instances are allowed to overlap, which instance should it select?
I won’t go into the rationale here, but the rule is to always prefer the most
specific instance, and only pick a less specific one when there is no way to
instantiate type variables in the environment to make the other more specific
instance(s) match.
In this case, since `a`

can be made equal to `c`

, the `OVERLAPPING`

instance
can still potentially match, hence the error.

The GHC user guide actually has a section on overlapping instances which is a pretty good reference about this algorithm if you can get past the jargon.

One way out is to add the unresolved constraint to the signature of
`getThird0`

, and let it be resolved at use sites with concrete types.

```
getThird0
:: Get0 c (a :* b :* c :* ds)
=> (a :* b :* c :* ds)
-> c
getThird0 = get0
```

However, if `c`

is equal to `a`

or `b`

, this function won’t be doing what its
name and type advertise!

```
-- a, b, c all equal to String, ds ~ (String :* ())
get0Example2 :: String
get0Example2 =
getThird0 ("One" :* "Two" :* "Three" :* "Four" :* ())
-- "One"
-- We would like "Three"
```

This problem is of the kind that has many solutions, and the best one would really
depend on the purpose of this `get`

function.
But the one I want to share here is to write the constraint that `c`

is not
equal to `a`

or `b`

. This makes the type of `getThird`

less general to allow
the `Get`

constraint above to be solved at the definition of `getThird`

.

### Type disequality constraints

```
class Get1 a bs where
get1 :: bs -> a
```

Immediately, we face an obstacle: there is an equality constraint `(~)`

, but no
primitive disequality constraint “`(/~)`

”, or any general way to negate a
constraint (such a thing would be incompatible with the open-world principle
of type classes).

Fortunately, we can define equality differently: as a boolean-valued
closed `TypeFamily`

.
(Are we playing language extensions bingo?)

The type-level function `(==)`

has two clauses, it is equal to `True`

if the
operands are equal, `False`

otherwise.

```
type family a == b :: Bool where
a == a = 'True
a == b = 'False
```

Note that type families allow *nonlinear pattern-matching*: the first clause
matches twice on the same variable. This is not allowed at the term level,
because there is no universal way of comparing values, especially
infinite values, functions, and values of abstract types such as `IO`

.^{3}
But at the type level, it is possible.

Data constructors such as `True`

and `False`

are allowed at (or *promoted to*)
the type level by the `DataKinds`

extension.
They are disambiguated from type constructors using the single-quote prefix,
since they can use the same names even in a single module.
`DataKinds`

is the kind^{4} of extension that is so intuitive one doesn’t
even notice when they’re using it.

Now we can match on the outcome of that comparison operation.
The constraint `'False ~ (a == b)`

holds if and only if `a`

and `b`

are not
equal. Conversely, one could also write `'True ~ (a == b)`

in the case of an
equality, but the more common `a ~ b`

works as well, if not better.^{5}
Note that `a ~ b`

implies `'True ~ (a == b)`

, but the type checker is
not aware of the converse! From the type checker’s point of view, `(==)`

is a
type-level function like any other, and it doesn’t make any particular effort
to reason about its output.

We introduce a new type class `GetIf`

, with one more parameter to carry the
comparison result.
The superclass constraint is a minor safety net, ensuring there’s only one way
to use that type class: we must set the boolean `aeqb`

to `a == b`

,
and instances must also satisfy that constraint.

```
class (aeqb ~ (a == b)) => GetIf aeqb a b bs where
getIf :: (b :* bs) -> a
```

We have two cases, depending on the value of the boolean `aeqb`

.
Either `a`

is equal to `b`

, then we get the head of the list.

```
instance (a ~ b) => GetIf 'True a b bs where
getIf :: (a :* bs) -> a
getIf (x :* _) = x
```

Or `a`

is not equal to `b`

, then we keep looking in the tail. Since instances
must satisfy their superclass constraints, we must add `'False ~ (a == b)`

.

```
instance ('False ~ (a == b), Get1 a bs)
=> GetIf 'False a b bs where
getIf :: (b :* bs) -> a
getIf (_ :* ys) = get1 ys
```

Finally, `Get1`

and `GetIf`

are connected by this instance of `Get1`

which immediately defers to `GetIf`

.

```
instance GetIf (a == b) a b bs => Get1 a (b :* bs) where
get1 = getIf
```

The main benefit of this technique over the previous one is that it avoids
overlapping instances, which can be difficult to understand.
The overlap still exists, as part of the definition of `(==)`

, but that is
contained in a general and reusable construct, and the rule of matching clauses
sequentially for closed type families may be more intuitive.^{6}

For clarity, we can define a “class synonym” for type disequality. For constraints, the advantage of this method over type synonyms is that type synonyms can’t be partially applied, while “class synonyms” can.

```
class ('False ~ (a == b)) => a /~ b
instance ('False ~ (a == b)) => a /~ b
```

We can thus give `getThird`

the following type, ensuring that the `c`

we
get is indeed the third element in the list.

```
getThird1 :: (c /~ a, c /~ b) => (a :* b :* c :* ds) -> c
getThird1 = get1
```

Now using `getThird1`

with types for the first or second element equal to the third
results in a type error.
Whether that is good or bad depends on what you actually want to do with those
heterogeneous lists.

```
getThird1Example :: String
getThird1Example =
getThird1 @String ("One" :* "Two" :* "Three" :* "Four" :* ())
-- Type error!
```

### Type-level conditionals

One last issue to address about that solution is that it is quite cumbersome
to implement.
We have to define a new type class, and the original class does nothing but
call the new one immediately.
Conceptually, all that work is only to match on a boolean, one of the simplest
data types (or, data kinds?).
We’d like a similarly simple abstraction to do that.
How about `if`

? We can certainly make it a type family^{7}:

```
type family If (b :: Bool) (c :: k) (d :: k) :: k where
If 'True c _ = c
If 'False _ d = d
```

This family is polykinded, which is just another name for polymorphism
of type-level constructs: `If :: forall k. Bool -> k -> k -> k`

.
(`PolyKinds`

extension; kinds are to types as types are to terms.)

It will take a bit of setup to be able to use `If`

, so let’s put it aside for
the moment.

We also need a type class to pattern match on the type-level boolean and decide
what to do *at run time*, as opposed to type families, which only live
at compile time.

Here’s a first attempt.

```
class IsBool0 b where
_If0 :: forall r. r -> r -> r
instance IsBool0 'True where
_If0 x _ = x
instance IsBool0 'False where
_If0 _ y = y
```

Note that `b`

does not appear in the type of `_If0`

, meaning that `b`

cannot be
inferred from the context where `_If0`

is applied.
Historically, this would always result in ambiguity errors, thus, such a type
was forbidden, forcing us to resolve the ambiguity by adding a trivial argument
or tagging the function somehow.

The `AllowAmbiguousTypes`

extension tells the compiler to accept definitions
like `_If0`

above, keeping signatures tidy;
the ambiguity can be resolved at use sites with the `TypeApplications`

extension, as it makes type arguments explicit (corresponding to those
variables under `forall`

).

Some examples of `_If0`

in action:

```
_If0Examples :: [String]
_If0Examples =
_If0 @'True "This" "That"
[ -- "This"
_If0 @'False "This" "That"
, -- "That"
_If0 @(Int == Bool) "Int is equal to Bool" "Int is not equal to Bool"
, -- "Int is not equal to Bool"
]
```

However we will have trouble implementing `Get`

.
To see why, below is what the final instance will roughly look like.
There are two cases as always.
If the type `a`

we are looking for is equal to `b`

, then we return the head `y`

of the list, otherwise we keep looking in the tail `ys`

.

```
instance (???) => Get a (b :* bs) where
get :: b :* bs -> a
get (y :* ys) =
_If @(a == b)
y -- When (a == b) ~ 'True, i.e., a ~ b
get ys) -- When (a == b) ~ 'False, i.e., a /~ b (
```

The two branches of `_If`

make assumptions that the type checker (currently)
rejects.
In the `True`

branch, we return `y :: b`

when the expected type is `a`

,
this assumes that `a`

is equal to `b`

.
In the `False`

branch, we call `get`

on the tail, which requires a
`Get a bs`

constraint.
We could add it to the context (`(???)`

above), but that would imply a
traversal of the whole type list, whereas the behavior of the previous
implementations was to stop as soon as we found the element we are looking for.

One way to view this problem is that the type of `_If`

doesn’t encode the
condition that the boolean is `True`

in one branch, and `False`

in the other.
It turns out we can apply that intuition quite literally in order to fix `_If`

:

```
class IsBool b where
_If
:: forall r
. (('True ~ b) => r)
-> (('False ~ b) => r)
-> r
```

The `RankNTypes`

extension allows quantifiers and constraints to occur
in a type signature in more locations than only the beginning.^{8}
Thus, the two arguments of `_If`

are now parameterized by constraints.
The function `_If`

can only use its first argument if `b`

is `True`

, and its
second argument if `b`

is `False`

, whereas those arguments can use the
constraints to deduce other useful facts locally.
Interestingly, there is thus exactly one way to write each of the two `IsBool`

instances:

```
instance IsBool 'True where
_If :: forall r. r -> (('False ~ 'True) => r) -> r
_If x _ = x
instance IsBool 'False where
_If :: forall r. (('True ~ 'False) => r) -> r -> r
_If _ y = y
```

The last piece of the puzzle is to complete the `(???)`

context in the `Get`

instance sketched above.

`_If`

requires an`IsBool`

instance in any case.We shall use the

`If`

type family we defined previously to construct a context which depends on the value of`(a == b)`

.

In each branch of `_If`

, `(a == b)`

is assumed to be either `True`

or `False`

,
which allows the compiler to evaluate `If (a == b) _ _`

, uncovering the
constraint that each case exactly needs, `(a ~ b)`

in one, `(Get a bs)`

in the
other.

We define a type synonym `If'`

to collapse the `IsBool`

and `If`

constraints
together.

```
type If' b c d = (IsBool b, If b c d)
instance If' (a == b) (a ~ b) (Get a bs)
=> Get a (b :* bs) where
get :: b :* bs -> a
get (y :* ys) =
_If @(a == b)
y -- (If 'True (a ~ b) _) becomes (a ~ b)
get ys) -- (If 'False _ (Get a bs)) becomes (Get a bs) (
```

We’re now reaching the end of this heterogeneous list example.
From the outside, `get`

should behave identically to `get1`

,
but the definition of `get`

was refactored to a single instance.
Do you find that easier to read than the previous versions?

## Appendix: Dependently typed programming with singletons

A central reference in Haskell’s dependently-typed landscape is
the `singletons`

library.
One common point between this blog post and singletons is that, in fact,
the class `IsBool`

is the Church encoding of the `Bool`

singleton.

A singleton is a type with a single inhabitant. A useful class of singletons
can be defined as Generalized Algebraic Data Types (`GADTs`

extension).
GADTs are types whose parameters depend on the constructors, and vice versa.

The parameter `b`

of `SBool`

below is tied to the constructor:
`STrue`

is the unique inhabitant of `SBool 'True`

and `SFalse`

is the unique
inhabitant of `SBool 'False`

. The point of such a type is that
by pattern matching on an `SBool b`

, we can refine the value of `b`

in each branch, similarly to the `_If`

combinator.

```
data SBool (b :: Bool) where
STrue :: SBool 'True
SFalse :: SBool 'False
```

There is an isomorphism between `SBool`

and `IsBool`

instances, that we can
witness by desugaring `IsBool`

instances to the polymorphic functions
that they contain.

```
-- Desugar IsBool by unwrapping _If
type IsBool_ b
= forall r
. (('True ~ b) => r)
-> (('False ~ b) => r)
-> r
_SBool_to_IsBool :: forall b. SBool b -> IsBool_ b
_SBool_to_IsBool sb = case sb of
STrue -> (\x _ -> x)
SFalse -> (\_ y -> y)
_IsBool_to_SBool :: forall b. IsBool_ b -> SBool b
_IsBool_to_SBool _If = _If STrue SFalse
```

It is easy to check that `_IsBool_to_SBool . _SBool_to_IsBool = id`

.
The converse `_SBool_to_IsBool . _IsBool_to_SBool = id`

is also true, but
the proof relies on a more involved argument (parametricity, free theorems).

The “Church encoding” claim above can be made more explicit with the
following equivalent^{9} reformulation of `SBool`

. This also shows another
characteristic of GADTs: constructors can be given arbitrary function types,
especially polymorphic functions with constraints, as long as the result
matches the type being defined. The function constraints and arguments
become constructor fields.

```
data SBool' (b :: Bool) where
STrue' :: ('True ~ b) => SBool' b
SFalse' :: ('False ~ b) => SBool' b
```

It should be compared to the `Either`

data type (which can be defined using
GADT notation to make explicit the types of its constructors)
and its Church encoding.

```
data Either a b where
Left :: a -> Either a b
Right :: b -> Either a b
type Either_ a b = forall r. (a -> r) -> (b -> r) -> r
```

## On the same topic

- Dependent types in Haskell, talk by Stephanie Weirich.
- Practical dependent types in Haskell, blog post by Justin Le.
- Decidable propositional equality in Haskell,
blog post by Richard Eisenberg, about unequality constraints (
`(a == b) ~ 'False`

). - Dependently typed programming with singletons, by Richard Eisenberg and Stephanie Weirich (Haskell Symposium 2012).
- Strongly typed heterogeneous collections (PDF), by Oleg Kiselyov, Ralf Lämmel, and Keean Schupke (Haskell Workshop 2004, extended version).