Monday, May 22, 2017

More type classes in OCaml

More type classes in OCaml

More type classes

Author: Joel Björnson

About the author: Joel has been enjoying functional programming ever since being introduced to Haskell at Chalmers University (Sweden). Since then he's been dabbling in F# and more recently OCaml. He's currently based in London, working at the intersection of functional programming and finance.

As demonstrated in previous articles on this blog, OCaml comes with a rich module system. Among other things it enables developers to write code that is polymorphic over module signatures. As such, parameterized modules (aka functors) play a similar role to what type classes do in Haskell, and as explained here, it is straightforward to map simple type classes to modules in OCaml. In Haskell, type classes are often used as design patterns for structuring programs and over the years a taxonomy of type classes inspired by Category Theory has evolved. Standard patterns such as functor, monad and monoid have had a strong influence on the design of common APIs. In OCaml there is less focus on such idioms. In this post we explore how some of the Haskell patterns implemented in terms of type classes can be ported to OCaml and how that impacts program design. In particular we cover four commonly used type classes that ships with standard Haskell distributions:

  • Functor
  • Monoid
  • Applicative Functor
  • Traversable
For a more comprehensive guide to these patterns and others Typclassopedia serves as a good resource. Before tackling the technical aspects it may be worth elaborating a bit on the motivation behind introducing these types of abstractions in the first place. Justifications fall under different categories:
  1. API design
  2. Code reusability
  3. Testability
The first one is about program design - by mapping a data type to a pattern such as applicative functor, we obtain a set of combinators for operating on values of that type. Ideally that means less time spent on inventing custom operators and figuring out their semantics. When multiple libraries share patterns it also increases the likelihood that consumers of anyone of those libraries already are familiar with the corresponding combinators. For example looking at the map function over some custom data type, one should expect it to have similar properties to the function List.map operating on lists. The second point is about code reuse. By writing functions that are expressed solely in terms of other module signatures they become reusable in different contexts; For instance by implementing the primitive operators for a monoid we get additional combinators (such as concat) defined generically for any monoid for free! Thirdly, these patterns all come with a set of theoretically well founded properties. As demonstrated by some of the examples below, it is also possible to write generic tests that can be used to validate concrete implementations of the patterns for different underlying data types.

Prerequisites and conventions

From now on we'll be using lowercase names such as applicative functor to describe the patterns themselves. Names starting with uppercase, for instance Applicative refers to the Haskell name of the type class, while signature names in OCaml are all uppercase (e.g APPLICATIVE ). To avoid confusion between the terms functor as in the functor pattern and OCaml functors referring to parameterized modules, we use the name ocaml-functor to mean the latter. Basic familiarity with Haskell syntax is also be assumed. Importantly, note that Haskell uses the infix operator (.) for function composition:

f . g = \x -> f (g x)

In the OCaml code below we instead use ( << ) to be defined similarly along with (the more idiomatic) pattern of forward composition ( >> ) , an identity function id and a function const:
let ( << ) f g x = f (g x);;
let ( >> ) f g = g << f;;
let id x       = x;;
let const x _  = x;;

We also deviate slightly from the naming conventions in Haskell for operations, for instance fmap becomes map and mconcat is named concat.

Representing the patterns

We use a standard approach for mapping type classes in Haskell to module signatures in OCaml.

Functors

The Functor type class captures the pattern of mapping over the value(s) of some parameterized data type. In Haskell it can be defined as:

class  Functor f  where
  fmap :: (a -> b) -> f a -> f b

In OCaml we may instead construct a corresponding module signature:
module type FUNCTOR = sig
  type 'a t
  val map : ('a -> 'b) -> 'a t -> 'b t
end;;

In order for a type to qualify as a functor, one need to provide an implementation for map (fmap in Haskell) that satisfies the signature. For instance, the Functor instance for the list type in Haskell is given by:

instance Functor [] where
  fmap = map

Here, map is the standard map function over lists. In OCaml we create a module implementing the FUNCTOR signature, which for lists may look like:
module ListFunctor : FUNCTOR with type 'a t = 'a list = struct
  type 'a t = 'a list
  let map f = List.map f
end;;

One difference is that the module is named which allows for multiple instances of the same signature for the same type to coexist. The with type construct is required in order to be able to export the type 'a t specified by the signature. It makes the fact that ListFunctor.t is indeed the type list transparent, allowing us to apply ListFunctor.map to ordinary lists.

Type classes in Haskell often come with a set of laws. These are specifications that any instance of the type class must obey. However they are not enforced by the type system and thus need to be considered when writing the implementation. For Functors, any instances should satisfy the following constraints:

fmap id = id fmap (f . g)  = fmap f . fmap g

These invariants state that the map function must be structure preserving, i.e. is not allowed to change the shape of the given value mapped over. They also have a deeper theoretical justification when described in terms of Functors in Category Theory. From a practical point of view it is sufficient to note that violating this constraint leads to code that is difficult to reason about and refactor. Consider for instance a function:
let increment_twice xs =
  xs
  |> List.map (fun x -> x + 1)
  |> List.map (fun x -> x + 1)
;;

One should expect that applying the following optimization:
let increment_twice xs = List.map (fun x -> x + 2) xs;;

does not impact its semantics.

An immediate advantage of capturing the functor pattern explicitly via a signature (FUNCTOR) is that it enables us to to define an additional parameterized module with tests for validating any concrete implementation of the signature:

module TestFunctor (F : FUNCTOR) = struct

  let test_id x = F.map id x = x

  let test_compose xs =
    let f x = x mod 2 in
    let g x = x - 1 in
    F.map (g >> f) xs = F.map f (F.map g xs)

end;;

The tests here correspond to the two functor laws stated above.

For instance to test ListFunctor we first apply TestFunctor to this module in order to retrieve a specialized version:

module TFL = TestFunctor (ListFunctor);;

Here are a few examples of using the module:

TFL.test_id [];;
TFL.test_id [1;2];;
TFL.test_compose [];;
TFL.test_compose [1;2;3];;

The option type in OCaml also forms a functor:

module OptionFunctor : FUNCTOR with type 'a t = 'a option = struct
  type 'a t = 'a option
  let map f = function
    | Some x  -> Some (f x)
    | None    -> None
end;;

And similar to the list example, we get a test module for free:
module TOF = TestFunctor (OptionFunctor);;

TOF.test_id (Some 42);;
TOF.test_id None;;
TOF.test_compose (Some 42);;
TOF.test_compose None;;

As will be illustrated by some of the examples below, functors are not only applicable to container like types and also not all containers form functors.

Monoids

Monoid is another example of a common pattern where instances can be found for a variety of types. In Haskell it's defined as:

class Monoid m where
  mempty :: m
  mappend :: m -> m -> m

Any type qualifying as a monoid must have identity value (mempty) and a binary operator (mappend) for composing any two elements.

The OCaml version can be specified by the following module type:

module type MONOID = sig
  type t
  val empty : t
  val append : t -> t -> t
end;;

There are also a few laws that instances should obey:

mappend mempty x        = x
mappend x mempty        = x
mappend x (mappend y z) = mappend (mappend x y) z

The first two state that mempty is an identity element with respect to mappend and the second one that mappend must be associative. Again, this can be captured by a test module:

module TestMonoid (M : MONOID) = struct

  let test_left_id x = M.append M.empty x = x

  let test_right_id x = M.append x M.empty = x

  let test_assoc x y z =
    M.append x (M.append y z) = M.append (M.append x y) z

end;;

One of the more famous monoids is given by the natural numbers with addition and identity element 0:

module IntAddMonoid : MONOID with type t = int = struct
  type t = int
  let empty = 0
  let append = ( + )
end;;

Another advantage of formalizing patterns by explicit signatures is that it enables us to define derived combinators generically. For example, the append operation from IntAddMonoid can be lifted to a sum function accepting a list of integers, adding them together or defaulting to 0 if the empty list is given:

open IntAddMonoid;;
let sum xs = List.fold_left append empty xs;;

The scheme can be generalized to operate on any list of monoids. To avoid having to specify the implementation manually for each monoid instance, one may construct a module-functor for generating extension functions:

module MonoidUtils (M : MONOID) = struct
  include M
  let ( <+> ) x y = append x y
  let concat xs = List.fold_left ( <+> ) empty xs
end;;

Here MonoidUtils takes a MONOID module and re-exports its definition along with two additional utility functions, an infix version of append ( <+> ) and concat.

Another example of a monoid is a list, parameterized over any type. In Haskell the instance is given by:

instance Monoid [a] where
  mempty = []
  mappend x y = x ++ y

Where (++) is the concatenation operator for lists. In OCaml one could imagine attempting something like:
(* Pseudo-code - not valid OCaml! *)
module ListMonoid : MONOID with type t = 'a list = struct
  type t = 'a list
  let empty = []
  let append xs ys = xs @ ys
end;;

However it is not possible to directly parameterize modules by types. A work around can be achieved by first introducing a dummy module for wrapping the type and passing it along as a module parameter:
module type TYPE = sig type t end;;

module ListMonoid (T : TYPE) : MONOID with type t = T.t list = struct
  type t = T.t list
  let empty = []
  let append xs ys = xs @ ys
end;;

This comes with an obvious disadvantage of having to create specialized versions for each concrete list type. Some of the inconvenience is compensated for by explicit type parameters and support for local modules, created at run-time. Here's an example implementing concat for lists in terms of the generic list monoid:

let concat (type a) xs =
  let module MU = MonoidUtils (ListMonoid(struct type t = a end)) in
  MU.concat xs;;

Its signature is inferred as:

val concat : 'a list list -> 'a list

Applicative Functors

An applicative functor has more structure than a regular functor. In Haskell it can be defined as:

class (Functor f) => Applicative f where
    pure  :: a -> f a
    (<*>) :: f (a -> b) -> f a -> f b

The function pure turns a (pure) value into an applicative value and ( <*> ) takes a function wrapped inside an applicative along with an applicative value and returns an applicative result corresponding to applying the value to the function. The additional constraint ((Functor f) => Applicative f enforces that any type that instantiates the Applicative type class must also be an instance of Functor.

In OCaml we can achieve something similar by including the FUNCTOR signature within a new signature APPLICATIVE as in:

module type APPLICATIVE = sig
  include FUNCTOR
  val pure : 'a -> 'a t
  val apply : ('a -> 'b) t -> 'a t -> 'b t
end;;

Here the infix operator ( <*> ) is named apply.

For a concrete example consider the applicative instance for the list type. Using the ListFunctor module from above:

module ListApplicative : APPLICATIVE with type 'a t = 'a list = struct
  include ListFunctor

  let pure x = [x]

  let apply fs xs =
    concat @@ map (fun f -> map (fun x -> f x) xs) fs

end;;

ListApplicative simply re-exports the implementation of ListFunctor to satisfy the functor part of the signature, also mirroring the constraint from the Haskell version.

pure wraps a value in a list. apply takes a list of functions, a list of values and applies each function to all elements of the list. Once again we may construct a utility module with some extra operators implemented using the primitive functions:

module ApplicativeUtils (A : APPLICATIVE) = struct
  include A
  let ( <$> ) f     = map f
  let ( <*> ) f     = apply f
  let ( <* ) x y    = const <$> x <*> y
  let ( *> ) x y    = (fun _ y -> y) <$> x <*> y
  let liftA2 f x y  = f <$> x <*> y
end;;

The infix operators are variations of apply and map, liftA2 is for conveniently lifting a regular function of two arguments into a function operating on two applicative values.

By applying ListApplicative to the ApplicativeUtils functor we obtain a concrete module for operating on lists:

module LAU = ApplicativeUtils (ListApplicative);;

Its full signature can be listed by:
#show_module LAU;;

Producing the following output:
module LAU :
    sig
      type 'a t = 'a ListApplicative.t
      val map : ('a -> 'b) -> 'a t -> 'b t
      val pure : 'a -> 'a t
      val apply : ('a -> 'b) t -> 'a t -> 'b t
      val ( <$> ) : ('a -> 'b) -> 'a t -> 'b t
      val ( <* ) : 'a t -> 'b t -> 'a t
      val ( *> ) : 'a t -> 'b t -> 'b t
      val liftA2 : ('a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t
    end

Finally let's a take a look at a concrete example to see what the applicative interface actually brings in terms of functionality. Say we want to generate some mock data to be used for testing. Given the following types:

type offer = Ask | Bid;;

type quote =
  {
    time : int;
    offer : offer;
    ticker : string;
    value : float;
  };;

The snippet below produces a list of all possible combinations of some example data by combining a set of properties:

let quotes =
  let open LAU in
  (fun time offer ticker value -> { time; offer; ticker; value })
  <$> [1;2;3;4;5]
  <*> [Ask; Bid]
  <*> ["XYZ"; "ZYK"; "ABC";"CDE"; "QRZ"]
  <*> [100.; 90.; 80.; 70.];;

By composing applications of pure and ( <*> ) we lift functions of arbitrary arity into applicative versions. For the list applicative, that means a generalized version of Cartesian products.

Another useful instance of an applicative functor is the option type:

module OptionApplicative : APPLICATIVE with type 'a t = 'a option =
struct
  include OptionFunctor

  let pure x = Some x

  let apply fo xo =
    match fo, xo with
    | Some f, Some x  -> Some (f x)
    | _               -> None

end;;

Here we rely on the OptionFunctor module to manage the functor part. pure returns a value wrapped by the Some constructor and apply only produces a value if neither of its arguments are None values. As with many other examples of instances, there is basically only one feasible implementation to choose from given the type constraints of the function signature.

With the implementation of the core interface, utilities come for free:

module OAU = ApplicativeUtils (OptionApplicative);;

We can now use it to conveniently lift operations into versions accepting optional arguments. Consider the following (safe) versions of division and square-root functions:

let ( //. ) n d = if d = 0. then None else Some (n /. d);;
let ssqrt x = if x < 0. then None else Some (sqrt x);;

Say we want to implement the formula f(x,y,z) = (x / y) + sqrt(x) - sqrt(y). The obvious approach is to use pattern matching as in:

let f x y z =
  match x //. y, ssqrt x, ssqrt y with
  | Some z, Some r1, Some r2  -> Some (z +. r1 -. r2)
  | _                         -> None
;;

Using the applicative operators from the OAU module enables an alternative (more succinct) definition:

open OAU;;
let f x y z =
 (fun z r1 r2 -> z +. r1 -. r2) <$> (x //. y) <*> ssqrt x <*> ssqrt y
;;

Applicative functors also come with a set of laws. In Haskell expressed as:

-- Identity
pure id <*> v                 = v

-- Homomorphism
pure f <*> pure x             = pure (f x)

-- Interchange
u <*> pure y                  = pure ($ y) <*> u

--- Composition
pure (.) <*> u <*> v <*> w   = u <*> (v <*> w)

These may again be turned into a generic testing module:

module TestApplicative (A : APPLICATIVE) = struct

  module AU = ApplicativeUtils(A)

  open AU

  let test_id x = (pure id <*> x) = x

  let test_hom f x = pure f <*> pure x = pure (f x)

  let test_interchange u y =
    (u <*> pure y) = (pure (fun f -> f y) <*> u)

  let test_composition u v w =
    (pure ( << ) <*> u <*> v <*> w) = (u <*> (v <*> w))

end;;

and be used to validate arbitrary instances of this pattern.

For example to test the list instance, we first construct a concrete module using the TestApplicative functor:

module TAL = TestApplicative (ListApplicative);;

This may be used as in:
TAL.test_hom String.length "Homomorphism";;

Traversables

Traversable is an interesting type class which also brings a couple of additional challenges to our effort of mapping Haskell patterns to OCaml. It may be described as a generalization of the iterator pattern and is defined in Haskell as:

class (Functor t, Foldable t) => Traversable t where
  traverse  :: Applicative f => (a -> f b) -> t a -> f (t b)
  sequenceA :: Applicative f => t (f a) -> f (t a)
  mapM      ::       Monad m => (a -> m b) -> t a -> m (t b)
  sequence  ::       Monad m => t (m a) -> m (t a)

Concrete instances can be written by implementing any one of the above functions as they can all be expressed in terms of each other. We could potentially replicate this flexibility in OCaml by a set of different module-functors with signatures wrapping each function. However, for the purpose of this exercise we settle on traverse as the defining implementation. Traverse is also parameterized over an Applicative functor. A first attempt in OCaml might be something along the lines of:

(*Psuedo-code - not valid OCaml!*)
module type TRAVERSABLE = sig
  type 'a t
  val traverse :  (type a)
                  (module A : APPLICATIVE with type 'a t = 'a a)
                  ('a -> 'b a) ->
                  'a A.t ->
                  ('b t) a
end;;

However here the type a would itself require a type parameter. In Haskell lingo it is said to have kind (* -> *). Unfortunately OCaml does not support higher-kinded polymorphism.

Instead of passing APPLICATIVE as an argument to each invocation of traverse we can embed it in the module signature:

module type TRAVERSABLE = sig
  type 'a t
  module Applicative : APPLICATIVE
  val traverse : ('a -> 'b Applicative.t) -> 'a t -> ('b t) Applicative.t
end;;

To mimic the Haskell constraints it is tempting to also require the FUNCTOR interface by throwing in a extra include FUNCTOR. However, there's a technical reason for why this may not be a good idea which we'll return to in a bit.

Even though the signature references a specific implementation of an APPLICATIVE we can recover genericity by relying on module-functors to specify the implementation of a traversable for any applicative argument. Let's consider the functor for list traversables:

module ListTraversable  (A : APPLICATIVE) :
                        TRAVERSABLE with type 'a t = 'a list
                            and type 'a Applicative.t = 'a A.t =
struct

  type 'a t = 'a list

  module Applicative = A

  let rec traverse f xs =
    let module AU = ApplicativeUtils(A) in
    let open AU in
    match xs with
    | []      -> A.pure []
    | x :: xs -> (fun y ys -> y :: ys) <$> f x <*> traverse f xs

end;;

Here one has to accept some extra verbosity compared to the Haskell version, although the code itself is fairly straightforward. The functor argument A of type APPLICATIVE serves to fulfil the requirement of having to export the APPLICATIVE module. The implementation of traverse is the interesting bit. Note that it is indeed defined generically for any applicative functor. The ApplicativeUtils module constructor comes in handy for accessing the infix versions of the operators.

To give ListTraversable a try, consider how it can be used for option effects:

module LTO = ListTraversable (OptionApplicative);;

This results in a module with the following signature:

module LTO :
  sig
    type 'a t = 'a list
    val map : ('a -> 'b) -> 'a t -> 'b t
    module Applicative : sig  end
    val traverse : ('a -> 'b Applicative.t) -> 'a t -> 'b t Applicative.t
  end;;

where we also know that the Applicative sub-module is in fact our OptionApplicative.

traverse in this context is a function that allows us to map each element of a list to an optional value where the computation produces a list with all values collected, only in case every element was successfully mapped to a Some value.

For example using the safe square root function from above we can transform it into a version operating on lists:

let all_roots = LTO.traverse ssqrt;;

It only returns a list of values with the results in case each element produced a valid square root. A few examples:

# all_roots [4.;9.;16.];;
- : float LTO.t LTO.Applicative.t = Some [2.; 3.; 4.]

# all_roots [4.;-9.; 16.];;
- : float LTO.t LTO.Applicative.t = None

Next, let's consider a custom type ('a tree) for which we are also able to implement the traversable interface:

type 'a tree =
  | Leaf
  | Node of 'a tree * 'a * 'a tree
;;

let node l x r = Node (l,x,r);;

Following is an instance of traversable for trees:

module TreeTraversable (A : APPLICATIVE) :
                        TRAVERSABLE with type 'a t = 'a tree
                                  and type 'a Applicative.t = 'a A.t =
struct
  module Applicative = A

  type 'a t = 'a tree
  type 'a a = 'a A.t

  let rec traverse f t =
    let module AU = ApplicativeUtils(A) in let open AU in
    match t with
    | Leaf          -> pure Leaf
    | Node (l,x,r)  -> node <$> traverse f l <*> f x <*> traverse f r

end;;

From the Haskell specification we know that any traversable must be a functor. Comparing the signatures for map and traverse also reveals their similarities:

val map       : ('a -> b)       -> 'a t -> 'b t
val traverse  : ('a -> 't A.t)  -> 'a t -> ('b t) A.t

However, embedding map in the module signature for TRAVERSABLE forces the user to define it manually. Would it be possible to achieve a generic implementation expressed in terms of the traverse function?

It can be done by choosing a suitable Applicative where the effect does not impact the result. The simplest possible type forming an applicative functor is the identity type:

type 'a id = 'a;;

for which a trivial APPLICATIVE instance exist:
module IdApplicative : APPLICATIVE with type 'a t = 'a id = struct
  type 'a t     = 'a id
  let pure x    = x
  let map f     = f
  let apply f   = map f
end;;

Using IdApplicative for the effect, traverse collapses into map:

module TreeTraversableId = TreeTraversable (IdApplicative);;
let map f = TreeTraversableId.traverse f;;

Similar to the pattern of utility modules for extending the interface with additional functions we may implement another module-functor TraversableFunctor that produces a functor instance given a module-functor for building traversables:

module TraversableFunctor (MT : functor (A : APPLICATIVE) ->
             TRAVERSABLE with type 'a Applicative.t = 'a A.t) =
struct
  module TI = MT(IdApplicative)
  let map f = TI.traverse f
end;;

Following is an example creating a functor for trees derived from its traversable implementation:

module TTU = TraversableFunctor (TreeTraversable);;

Its map function can be used as in:
TTU.map (fun x -> x * x) (node Leaf 3 (node Leaf 5 Leaf));;

We could also define another utility module for deriving the sequence operator, in order to recover some of the functionality from Haskell, where sequence can be defined by instantiating Traversable and only implementing traverse:

module TraversableSequence  (T : TRAVERSABLE ) = struct
  let sequence xs = T.traverse id xs
end;;

The Haskell documentation for Applicative also dictates a set of laws:

-- Naturality
t . traverse f = traverse (t . f)

-- Identity
traverse Identity = Identity

-- Composition
traverse (Compose . fmap g . f) = Compose . fmap (traverse g) . traverse f

The naturality law assumes that t is a natural transformation from one applicative functor to another. Porting it to OCaml requires a couple of further tricks. First we dedicate a specific module functor for the task which takes two arguments for the applicatives mapped between, along with a traversable constructor. In order to also connect the types, an additional module TYPE2 representing types of kind (* -> *) is introduced:

module type TYPE2 = sig type 'a t end;;

module TestTraversableNat (T2 : TYPE2)
                          (A1 : APPLICATIVE)
                          (A2 : APPLICATIVE)
                          (MT : functor (A : APPLICATIVE) ->
                            TRAVERSABLE with
                                  type 'a Applicative.t = 'a A.t
                                            and type 'a t = 'a T2.t ) =
struct
  module T1 : TRAVERSABLE with
     type 'a Applicative.t = 'a A1.t and type 'a t = 'a T2.t = MT (A1)

  module T2 : TRAVERSABLE with
     type 'a Applicative.t = 'a A2.t and type 'a t = 'a T2.t = MT (A2)

  type nat = { t : 'a. 'a A1.t -> 'a A2.t }

  let test f {t} x = t (T1.traverse f x) = T2.traverse ( f >> t) x
end;;

Here, nat represents the mapping from A1 to A2 and the type is introduced in order to be able to express that the transformation is existentially quantified over all type parameters to A1.t.

Here's an example of a concrete realization of a test module:

module TTN =
    TestTraversableNat (struct type 'a t = 'a list end)
                       (IdApplicative)
                       (OptionApplicative)
                       (ListTraversable)
end;;

The second law, identity, is expressed in terms of the type Identity and its functor and applicative instances in Haskell. Identity in haskell is defined as:

newtype Identity a = Identity a

instance Functor Identity where
  fmap f (Identity x)  = Identity (f x)

instance Applicative Identity where
  pure = Identity
  (Identity f) <*> (Identity x) = Identity (f x)

We've already seen its corresponding OCaml type 'a id and the applicative instance, IdApplicative. Using that we may create another test module for the identity law:

module TestTraversableId ( MT : functor (A : APPLICATIVE) ->
                      TRAVERSABLE with type 'a Applicative.t = 'a A.t) =
struct
  module TI = MT (IdApplicative)
  let test x = TI.traverse id x = x
end;;

The following example shows how it can be used to test the ListTraversable module-functor:

module TTIL = TestTraversableId (ListTraversable);;
TTIL.test [1;2;3];;

The final law, composibility, relies on the type Compose which takes two higher-kinded type arguments and composes them:

newtype Compose f g a = Compose (f (g a))

Its functor and applicative functor instances are achieved by:

instance (Functor f, Functor g) => Functor (Compose f g) where
  fmap f (Compose x) = Compose (fmap (fmap f) x)

instance (Applicative f, Applicative g) => Applicative (Compose f g) where
  pure x = Compose (pure (pure x))
  Compose f <*> Compose x = Compose ((<*>) <$> f <*> x)

Once again to circumvent the higher-kinded type restriction we need to resort to modules in OCaml. The following module-functor takes two applicatives as arguments and produces an APPLICATIVE module for the composed type:

module ComposeApplicative (F : APPLICATIVE)
                          (G : APPLICATIVE)
                          : APPLICATIVE with type 'a t = ('a G.t) F.t =
struct

  type 'a t = ('a G.t) F.t

  let pure x = F.pure (G.pure x)

  let map f = F.map (G.map f)

  let apply f x =
    let module FU = ApplicativeUtils(F) in let open FU in
    (G.apply) <$> f <*> x

end;;

Finally tackling the law expressed using the Compose type:

traverse (Compose . fmap g . f) = Compose . fmap (traverse g) . traverse f

requires some even heavier plumbing. To demonstrate that it's possible, here's an implementation:
module TestTraversableCompose (T2 : TYPE2)
                              (F  : APPLICATIVE)
                              (G  : APPLICATIVE)
                              (MT : functor (A : APPLICATIVE) ->
                              TRAVERSABLE with type 'a Applicative.t = 'a A.t
                                                    and type 'a t = 'a T2.t) =
struct

  module AC : APPLICATIVE with
        type 'a t = 'a G.t F.t = ComposeApplicative(F) (G)

  module TF : TRAVERSABLE with
        type 'a Applicative.t = 'a F.t
          and type 'a t = 'a T2.t = MT (F)

  module TG : TRAVERSABLE with
        type 'a Applicative.t = 'a G.t
          and type 'a t = 'a T2.t = MT (G)

  module TC : TRAVERSABLE with
        type 'a Applicative.t = 'a G.t F.t
          and type 'a t = 'a T2.t = MT (AC)

  let test f g x =
    F.map (TG.traverse g) (TF.traverse f x) = TC.traverse (f >> F.map g) x

end;;

It can be used for testing various combinations of traversables and applicatives. For example:

module TTCL = TestTraversableCompose  (struct type 'a t = 'a list end)
                                      (ListApplicative)
                                      (OptionApplicative)
                                      (ListTraversable);;
TTCL.test (fun x -> [x; x + 1])
          (fun x -> if x > 10 then Some (-x) else None)
          [1;2;3;5];;

Using the patterns

Now that we've laid the ground and introduced formalized interfaces for some common patterns, the next sections provide a couple of more examples of how these idioms can be used in practice when designing libraries and programs.

A minimal parsing library

In the following example we implement a simple parsing library and see how monoids and applicative functors guide the design of the API.

First consider a suitable definition of a parser type:

type 'a p = char list -> ('a * char list) option;;

A parser 'a p is a function from a list of characters (input) to an option of a tuple of a value of type 'a, produced by the parser along with the remaining tokens of the input. The type is similar in spirit to parsing combinator libraries such as Haskell's Parsec.

To be able to define parsers that parses a single character, here's a function that takes a mapping function from a character to an optional value and returns a parser that, when successful, produces a value and consumes one element of the input:

let token f = function
  | []      -> None
  | x :: xs -> OptionFunctor.map (fun y -> (y, xs)) @@ f x
;;

It can be used to implement a specialized version char for matching characters:

let char c = token (fun c' -> if c = c' then Some c else None);;

Another useful parser is the one matching an empty list of input:

let empty = function
  | []  -> Some ((), [])
  | _   -> None
;;

In order to compose parsers, either by sequencing them - one parser followed by another, or choosing between multiple parsers, we need to come up with a set of suitable combinators.

Rather than trying to derive such functions directly one can start by looking at existing patterns and identify the ones applicable to parsers.

Doing that requires little thinking besides coming up with feasible implementations and ensuring that the implementation is compliant with the corresponding set of constraints (laws).

For instance, with the parser definition above, we are able to to define an applicative functor interface:

module ParserApplicative : APPLICATIVE with type 'a t = 'a p = struct

  type 'a t = 'a p

  let map f p = p >> OptionFunctor.map (fun (x, cs) -> (f x, cs))

  let pure x cs = Some (x, cs)

  let apply f x cs =
    match f cs with
    | Some (f, cs)  -> map f x cs
    | None          -> None

end;;

To convince ourselves that the implementation is sound we can use equational reasoning to prove the laws explicitly. Relying on the TestApplicative module in this case is problematic since it requires comparing for equality and our parser type is a function. A better implementation of the test modules would also allow parameterization of a comparator module.

The ParserApplicative module grants us access to the functions ( <*> ) and ( <$> ) for composing parsers of different types:

module APU = ApplicativeUtils (ParserApplicative);;

To give an example, here is a parser that parses the input ['a';'b';'c'] and produces a unit result:

open APU;;
let abc = (fun _ _ _ -> ()) <$> char 'a' <*> char 'b' <*> char 'c';;

In order to represent grammars that allow alternative parsing constructs, we need a way to choose between a set of potential parsers. That is, collapsing a set of parsers into a single parser. Phrased differently, we are looking for a monoid:

module ParserMonoid (T : TYPE) : MONOID with type t = T.t p = struct

  type t = T.t p

  let empty _ = None

  let append p q  cs =
    match p cs with
    | Some x  -> Some x
    | None    -> q cs

end;;

Here, empty is the parser that always fails and append takes two parsers and returns a parser that for any input first attempts to run the first parser and in case it fails resorts to the second one.

We can now use the ParserMonoid to define a few utility functions:

let fail (type a) =
  let module M = ParserMonoid(struct type t = a end) in
  M.empty
;;

let choose (type a) ps =
  let module MU = MonoidUtils (ParserMonoid(struct type t = a end)) in
  MU.concat ps
;;

let ( <|> ) p q = choose [p; q];;

The functor, applicative functor and a monoid combinators for the parser type, form the baseline of the API. They are also sufficient for implementing a function for turning any parser into one that applies the parser recursively and collects the results in a list:

open APU;;

let delay f cs = f () cs;;

let rec many p =
  List.cons <$> p <*> (delay @@ fun _ -> many p)
  <|>
  pure []
;;

The purpose of the function delay is to avoid infinite recursion by allowing to construct parsers lazily (ones that are only realized on demand).

The definition of many states that it is a parser that either parses one result of the given parser p followed by many results; Or in case it fails, consumes no input and returns an empty list (pure []).

Another handy combinator is filter that takes a predicate function for filtering a parser by only allowing it to succeed when its result satisfies the predicate:

let filter f p cs =
  match p cs with
  | Some (x, cs) when f x -> Some (x,cs)
  | _                     -> None
;;

We can use it to define a variation of many for parsing one or more elements:

let many_one p = filter ((<>) []) @@ many p;;

When it comes to actually exposing the API for a parser combinator library we may still choose to shield users from any references to modules such as ParserApplicative or ParserMonoid and also include a sub-set of the derived utility functions.

Here is an example of such a module signature:

module type PARSER = sig
  type 'a t

  val empty : unit t
  val run : 'a t -> string -> 'a option
  val map : ('a -> 'b) -> 'a t -> 'b t
  val pure : 'a -> 'a t
  val apply : ('a -> 'b) t -> 'a t -> 'b t
  val ( <$> ) : ('a -> 'b) -> 'a t -> 'b t
  val ( <*> ) : ('a -> 'b) t -> 'a t -> 'b t
  val ( <*  ) : 'a t -> 'b t -> 'a t
  val ( *>  ) : 'a t -> 'b t -> 'b t
  val token : (char -> 'a option) -> 'a t
  val char : char -> char t
  val fail : 'a t
  val choose : 'a t list -> 'a t
  val ( <|> ) : 'a t -> 'a t -> 'a t
  val many : 'a t -> 'a list t
  val many_one : 'a t -> 'a list t
  val filter : ('a -> bool) -> 'a t -> 'a t

end;;

Note that it also makes the parser type itself abstract and instead exposes a run function that takes a string as input rather than a list of characters. To turn a string into a list of characters, access to a function such as:

let list_of_string s =
  let rec aux i l = if i < 0 then l else aux (i - 1) (s.[i] :: l) in
  aux (String.length s - 1) []
;;

is assumed.

The following implementation realizes the signature using the ParserMonoid and applicative utils (APU) as defined above:

module Parser : PARSER = struct
  include APU

  let run p s = OptionFunctor.map fst @@ p @@ list_of_string s

  let empty = function
    | []  -> Some ((), [])
    | _   -> None

  let token f = function
    | []      -> None
    | x :: xs -> OptionFunctor.map (fun y -> (y, xs)) @@ f x

  let char c = token (fun c' -> if c = c' then Some c else None)

  let fail (type a) =
    let module M = ParserMonoid(struct type t = a end) in
    M.empty

  let choose (type a) ps =
    let module MU = MonoidUtils(ParserMonoid(struct type t = a end)) in
    MU.concat ps

  let ( <|> ) p q = choose [p; q]

  let delay f cs = f () cs

  let rec many p =
    List.cons <$> p <*> (delay @@ fun _ -> many p)
    <|>
    pure []

  let filter f p cs =
    match p cs with
    | Some (x, cs) when f x -> Some (x,cs)
    | _                     -> None

  let many_one p = filter ((<>) []) @@ many p

end;;

Finally for an example of how to use the library, consider a parser for parsing dates of the format YYYY-MM-DD:

open Parser;;

(* Parser for a single digit *)
let digit = "0123456789" |> list_of_string |> List.map char |> choose;;

(* Integer parser *)
let int =
  let string_of_list = List.map (String.make 1) >> String.concat "" in
  (string_of_list >> int_of_string) <$> many_one digit;;

(* Integers in a given range *)
let int_range mn mx = filter (fun n -> mn <= n && n <= mx) int;;

(* Parser for digit prefixed by '0'. Ex "07" *)
let zero_digit = char '0' *> int_range 1 9;;

(* Years between 1700 and 2400 *)
let year  = int_range 1700 2400;;

(* Month as in '01, 02, .. , 11' *)
let month = zero_digit <|> int_range 11 12;;

(* Day as in '01, 02, .. 31 *)
let day = zero_digit <|> int_range 11 31;;

(* Parser for date of format "YYYY-MM-DD" *)
let date =
  (fun y m d -> (y,m,d))
  <$> (year <* char '-')
  <*> (month <* char '-')
  <*> day
;;

Here are a few examples of running the date parser with different string inputs:

# run date "2019-01-23";;
- : (int * int * int) option = Some (2019, 1, 23)

# run date "2019-1-23";;
- : (int * int * int) option = None

# run date "999-1-23";;
- : (int * int * int) option = None

Analyzing boolean expressions

In the next example we consider designing a library for representing and operating on boolean expressions. It naturally generalizes to other forms of deeply embedded domain specific languages (EDSLs).

Consider the following data type for representing a boolean expression with variables, parameterized over the variable type.

type 'a exp =
  | True
  | False
  | And of 'a exp * 'a exp
  | Or of 'a exp * 'a exp
  | Not of 'a exp
  | Var of 'a
;;

For convenience we define some helper functions corresponding to the expression constructors:

let etrue           = True;;
let efalse          = False;;
let ( <&> ) e1 e2   = And (e1, e2);;
let ( <|> ) e1 e2   = Or (e1, e2);;
let enot e          = Not e;;
let var x           = Var x;;

What patterns are applicable to the 'a exp type? There are two monoid instances corresponding to the boolean operators and and or. Expressions form a monoid with identity false and the append function or:

module MonoidOrFalse (T : TYPE) : MONOID with type t = T.t exp = struct
  type t = T.t exp
  let empty = efalse and append = ( <|> )
end;;

The other monoid is for true and and:

module MonoidAndTrue (T : TYPE) : MONOID with type t = T.t exp = struct
  type t = T.t exp
  let empty = etrue and append = ( <&> )
end;;

As demonstrated previously, monoids can be promoted to operate on lists. In this case for composing a list of expression values:

let any (type a) es =
  let module M = MonoidUtils (MonoidOrFalse (struct type t = a end)) in
  M.concat es;;

let all (type a) es =
  let module M = MonoidUtils (MonoidAndTrue (struct type t = a end)) in
  M.concat es;;

Continuing through the list of patterns - the expression type naturally forms a traversable:

module ExpTraversable (A : APPLICATIVE) :
             TRAVERSABLE with type 'a t = 'a exp
                 and type 'a Applicative.t = 'a A.t =
struct
  module Applicative = A

  type 'a a = 'a A.t
  type 'a t = 'a exp

  let rec traverse f exp =
    let module AU = ApplicativeUtils(A) in
    let open AU in
    match exp with
    | True          -> A.pure etrue
    | False         -> A.pure efalse
    | And (e1, e2)  -> (<&>) <$> traverse f e1 <*> traverse f e2
    | Or (e1, e2)   -> (<|>) <$> traverse f e1 <*> traverse f e2
    | Not e         -> enot  <$> traverse f e
    | Var v         -> var   <$> f v

end;;

Using this module we also obtain the functor instance for free and are able to implement a map function for expressions via:

module EF = TraversableFunctor (ExpTraversable);;
let map f = EF.map f;;

For example we may use map to create a function that adds a prefix to each variable for values of type string exp:

let add_var_prefix p = map (Printf.sprintf "%s%s" p);;
The traversable instance may also be utilized when it comes to evaluating expressions. First consider the following function for evaluating expressions parameterized by a boolean value, that is expressions where each variable is realized as concrete boolean value:
let rec eval_bool_exp = function
  | True          -> true
  | False         -> false
  | And (e1, e2)  -> eval_bool_exp e1 && eval_bool_exp e2
  | Or (e1, e2)   -> eval_bool_exp e1 || eval_bool_exp e2
  | Not e         -> not (eval_bool_exp e)
  | Var x         -> x
;;

In order to write a more generic version that evaluates expressions parameterized by an arbitrary type we need to pass an environment for mapping variables to boolean values. The task can be solved by considering the traversable instance for expressions where the effect is given by the option applicative, and then map over the result and evaluate with eval_bool_exp:

let eval env =
  let module T = ExpTraversable (OptionApplicative) in
  T.traverse env >> OptionFunctor.map eval_bool_exp
;;

To test, assume an environment with two variables (x and y):

let env = function
  | "x"   -> Some true
  | "y"   -> Some false
  | _     -> None
;;

Here are a couple of examples of evaluation expressions using the environment:

# eval env (var "x" <|> enot (var "y" <&> var "y"));;
- : bool OptionFunctor.t = Some true

# eval env (var "z" <|> enot (var "y" <&> var "y"));;
- : bool OptionFunctor.t = None

Next, say we're asked to write a function that extracts all variables from an expression. Could we leverage the traversable instance for this task as well?

At a first glance this may seem like a stretch as traverse maps over an expression and rebuilds it. This time we're only interested in collecting the variables traversed over. The trick is to pick an appropriate applicative instance; In this case where the effect is accumulating the results. Following is a module-functor, for creating an applicative functor that accumulates results of some type in a list:

module BagApplicative (T : TYPE) : APPLICATIVE with type 'a t = T.t list =
struct
  type 'a t = T.t list
  let pure _ = []
  let map _ x = x
  let apply f = (@) f
end;;

Note that the type parameter 'a in 'a t is not actually used and its only purpose is to satisfy the signature of APPLICATIVE. The function pure creates an empty list ignoring its argument. map is effectively a no-op and apply simply concatenates the results of both arguments (which are both lists of accumulated items).

With BagApplicative at our disposal, extracting the variables is a matter of traversing an expression and for each variable encountered putting it in the bag:

let variables (type a) exp =
  let module T = ExpTraversable (BagApplicative (struct type t = a end)) in
  T.traverse ListApplicative.pure exp
;;

This works for any expression type as the explicit type parameter a is used to instantiate the BagApplicative module-functor. The purpose of ListApplicative.pure is to wrap a variable in a singleton list, synonymous with fun x -> [x].

Here's an example of using it to collect variables of a string exp:

# variables (var "x"  <|> (var "y" <&> (enot (var "z"))));;
- : string list = ["x"; "y"; "z"]

Let's take a look at yet another exercise for where traversables prove to be useful. Say we wish to expand an expression with variables into all possible realizations. For example, the expression var "x" <|> (var "y" <&> (enot (var "z"))) contains three variables so there are 23 different realizations corresponding to each permutation of true/false assignments to x, y and z.

Every one of them gives rise to an expression where the variable is replaced with a boolean value. The variables function above already allows us to extract the list but how can we generate all permutations?

For traversing a list (in this case of variables, we may use ListTraversable. Since each variable yields two possible values (variable and bool pair) we can collect them using the ListApplicative:

module LTLA = ListTraversable (ListApplicative);;

The module LTLA now contains a traverse function:

val traverse : ('a -> 'b list) -> 'a list -> 'a list list

Given that ListApplicative corresponds to the Cartesian product, the effect of traverse is to generate all permutations.

For instance:

# LTLA.traverse (fun x -> ['a']) [1;2;3];;
-: char LTLA.t LTLA.Applicative.t = [['a'; 'a'; 'a']]

Here, each value 1, 2 and 3, can replaced with exactly one value a producing a list with a single permutation.

Allowing each number to be mapped to two possible values yields 8 results:

# LTLA.traverse (fun _ -> ['a'; 'b']) [1;2;3];;
  - : char LTLA.t LTLA.Applicative.t =
  [['a'; 'a'; 'a']; ['a'; 'a'; 'b']; ['a'; 'b'; 'a']; ['a'; 'b'; 'b'];
   ['b'; 'a'; 'a']; ['b'; 'a'; 'b']; ['b'; 'b'; 'a']; ['b'; 'b'; 'b']]

In our case we're interested in mapping the unique set of collected variables to the values true or false for which each resulting permutation yields a lookup function. The function realizations below also maps over the lookup function to replace the variables with their corresponding boolean values for the given expression:

let realizations exp =
  variables exp
  |> LTLA.traverse (fun x -> [(x,true); (x,false)])
  |> ListFunctor.map (fun xs -> map (fun x -> List.assoc x xs) exp)
;;

Here's an example of using it:
# realizations (var "x"  <|> (var "y" <&> (enot (var "z"))));;

returning the following result:
[
  Or (Var true, And (Var true, Not (Var true)));
  Or (Var true, And (Var true, Not (Var false)));
  Or (Var true, And (Var false, Not (Var true)));
  Or (Var true, And (Var false, Not (Var false)));
  Or (Var false, And (Var true, Not (Var true)));
  Or (Var false, And (Var true, Not (Var false)));
  Or (Var false, And (Var false, Not (Var true)));
  Or (Var false, And (Var false, Not (Var false)))
]

To see why such a function may be useful, consider how it can be used for evaluating all expanded versions of an expression in order to deduce whether or not an expression always evaluates to true or false, irrespective of the choice of variables:

let always_true exp =
  realizations exp
  |> all
  |> eval_bool_exp
;;

let always_false exp =
  realizations exp
  |> any
  |> eval_bool_exp
  |> not
;;

At last, a few examples to demonstrate their behavior:

# always_true (var "x");;
- : bool = false

# always_true (var "x" <|> etrue);;
- : bool = true

# always_true (var "x" <|> (enot (var "x")));;
- : bool = true

# always_false (var "x");;
- : bool = false

# always_false (var "x" <&> (enot (var "x")));;
- : bool = true

Summary

Many Haskell patterns implemented in terms of type classes may indeed be ported to OCaml. As demonstrated by the introduction of functor, monoid, applicative functor and traversable module signatures, and the examples, there is a case to be made for why leveraging patterns can help with guiding the specification of APIs and enable code reuse.

On a more philosophical level - thinking in terms of patterns changes the methodology of writing programs. Under this regime, rather than solving isolated problems one starts by implementing generic functionality and later focus on how to make use of it for addressing more specific problems.

In the parser example we saw how two patterns, monoid and applicative functor, were sufficient for describing the primitive set of base combinators ((<|>, <*> etc), out of which others could be inferred (e.g. many and many_one).

In the example for representing and operating on boolean expressions, defining a traversable instance formed the cornerstone from which a variety of functionality was derived, including:

  • Mapping over expression
  • Evaluating expressions
  • Collecting variables from expressions
This was all accomplished by customizing the effect of traverse by varying the applicative functor argument.