# A quick tour of generic-random

Metaprogramming with Generics in Haskell allows us to derive many functions and
types directly from newly declared types. Here is a quick toy demonstration of
using generic-random to
derive `arbitrary`

from the
QuickCheck library.
I won’t go into any implementation details; to learn about generics in general,
check out this tutorial!

### Starters

Below is a type `MyType`

with a simple, handwritten
`Arbitrary`

instance.

```
{-# LANGUAGE InstanceSigs, TypeApplications #-}
import Test.QuickCheck
data MyType
= OneThing Int
| TwoThings Double String
instance Arbitrary MyType where
arbitrary :: Gen MyType
arbitrary = oneof [
OneThing <$> arbitrary @Int,
TwoThings <$> arbitrary @Double <*> arbitrary @String]
```

(Also showing off the
`InstanceSigs`

and
`TypeApplications`

extensions. These annotations are inferable here, but helpful!
Especially the former.)

We generate either `OneThing`

or `TwoThings`

with probability 1/2 each,
and use other existing `Arbitrary`

instances to fill their respective fields.

Now, let us add a constructor to `MyType`

:

```
data MyType
= OneThing Int
| TwoThings Double String
| ThreeThings (Maybe Integer) [()] (Bool -> Word)
instance Arbitrary MyType where
arbitrary :: Gen MyType
arbitrary = oneof [
OneThing <$> arbitrary @Int,
TwoThings <$> arbitrary @Double <*> arbitrary @String]
```

That compiles ~~therefore it’s correct~~ but the new constructor is not
generated by `arbitrary`

yet! Of course, we must also remember to update any
code involving the modified `MyType`

.

```
data MyType
= OneThing Int
| TwoThings Double String
| ThreeThings (Maybe Integer) [()] (Bool -> Word)
instance Arbitrary MyType where
arbitrary :: Gen MyType
arbitrary = oneof [
OneThing <$> arbitrary @Int,
TwoThings <$> arbitrary @Double <*> arbitrary @String,
ThreeThings <$> arbitrary <*> arbitrary <*> arbitrary]
-- N.B.: QuickCheck can generate functions
```

(The lazy programmer gives up spelling out all the field types of `ThreeThings`

.)

### Main course

Typing `arbitrary`

so often gets repetitive;
here enters
generic-random.

```
-- In addition to the first LANGUAGE/import header
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics
import Generic.Random
data MyType
= OneThing Int
| TwoThings Double String
| ThreeThings (Maybe Integer) [()] (Bool -> Word)
deriving Generic
instance Arbitrary MyType where
arbitrary :: Gen MyType
arbitrary = genericArbitraryU
-- Uniform distribution of MyType constructors
```

In contrast to the previous snippets, `genericArbitraryU`

automatically
adapts to changes in the numbers of constructors and fields of `MyType`

.

We may find `OneThing`

a boring enough test case that we should generate it
less often, here with probability 1/9.

```
instance Arbitrary MyType where
arbitrary :: Gen MyType
arbitrary = genericArbitrary (1 % 4 % 4 % ())
-- 1/(1+4+4): OneThing
-- 4/(1+4+4): TwoThings
-- 4/(1+4+4): ThreeThings
```

Now, forgetting to update the distribution when the number of constructor changes would result in a compile-time error. It’s also possible to statically enforce the correspondence between weights and constructor names (the declaration order must match too).

```
instance Arbitrary MyType where
arbitrary :: Gen MyType
arbitrary = genericArbitrary
1 :: W "OneThing") %
((4 :: W "TwoThings") %
(4 :: W "ThreeThings") %
( ())
```

Suddenly, we realize `Nothing`

is not a thing, so
`ThreeThings Nothing [()] fromInteger`

is not really “three things”.

To implement the requirement that no `Nothing`

is generated, last year we
would have had to go back to the fully handwritten generator (with
`frequency`

instead of
`oneof`

to preserve the distribution).

```
instance Arbitrary MyType where
arbitrary :: Gen MyType
arbitrary = frequency [
1, OneThing <$> arbitrary @Int),
(4, TwoThings <$> arbitrary @Double <*> arbitrary @String),
(4, ThreeThings <$> (Just <$> arbitrary) <*> arbitrary <*> arbitrary)] (
```

But now, since generic-random-1.1, we can say: “for any field of type
`Maybe Integer`

, use this generator; otherwise use `arbitrary`

, as before”.

```
-- Heterogeneous list of generators, of length 1, with cons (:@).
custom :: GenList '[Maybe Integer]
custom = (Just <$> arbitrary) :@ Nil
instance Arbitrary MyType where
arbitrary :: Gen MyType
arbitrary = genericArbitraryG custom (1 % 4 % 4 % ())
```

If that is too heavy handed, we can also mention specific fields by name, when they have one (there is an example at the end of this “tutorial module”).

We are reaching the end of this tour. A compilable version of that last snippet.

### N.B.

Random generation for testing is a largely open topic. generic-random implements a very simple and specific kind of random generators, and it is not always applicable: depending on the type and distribution of constructors, it may not terminate within a reasonable time, and many applications need much more structured generators to achieve the best coverage.

### Dessert (Conclusion)

Other than just indulging in our laziness when writing code, automating boilerplate-writing has benefits that may lighten the burden of maintenance:

we can’t get the boilerplate wrong if we don’t write it, and the boilerplate may rewrite itself when types changes (e.g., we can’t forget to generate a constructor; that is admittedly hyperbolic, only certain kinds of mistakes are actually prevented);

not only that, it might not even be necessary to

*know*how to write the boilerplate to get something working (here, a newcomer could get generators and play with the rest of QuickCheck without having to do any monadic programming with`Gen`

, although more documentation seems necessary to put that into practice);we can optimize the boilerplate by changing the one piece of code that generates it, instead of the many places where it would be duplicated (e.g.,

`frequency`

and`oneof`

are the easiest things to use but call*recursive*functions on mostly static lists, which are thus not optimized away by GHC; a generic library can transparently use a more efficient implementation for all users to benefit).

Feel free to make a pull request or open an issue if you’d like to see some new option in generic-random or any other improvement!

### P.S.

generic-random changed a lot since its creation. The initial
implementation derived Boltzmann samplers, which are heavier in complexity
and dependencies; that can now be found in the
boltzmann-samplers
library (I’m slowly working on a `GHC.Generics`

version instead of SYB).
The now simpler generic-random doesn’t have as nice probabilistic guarantees
as for Boltzmann samplers, but it is actually not clear how a globally
uniform-ish distribution improves random testing and whether that is worth the
extra complexity. Even with a naive distribution of constructors:

small types (i.e., with few inhabitants) are quickly covered;

for large types, we still generate a good variety of test cases quickly;

anyway, what is the uniform (or actually, “sizewise uniform” for Boltzmann samplers) distribution for

`Double`

? For functions with infinite domain?

Moreover, if you really need a uniform distribution, take a look at testing-feat! (So far I found it’s much more efficient than Boltzmann samplers.)