Friday, March 10, 2017

Universal type

Universal type

A universal type is a type into which all other types can be embedded. A module implementing such a type here will satisfy the following signature.

module type UNIVERSAL = sig
  type t
  val embed : unit → (α → t) * (t → α option)
end;;
      
The type t is the universal type and each call to embed returns a pair of functions : an injection function for embedding a value into the universal type and, a projection function for extracting the value from its embedding in the universal type. The following code demonstrates intended usage.
module type TEST = sig
  val run : unit → unit
end;;

module type UNIVERSAL_TEST = functor (U : UNIVERSAL) → TEST;;

module Basic_usage : UNIVERSAL_TEST = 
  functor (U : UNIVERSAL) → struct
    let run () =
      let ((of_int : int → U.t)
         , (to_int : U.t → int option)) = U.embed () in
      let ((of_string : string → U.t)
         , (to_string : U.t → string option)) = U.embed () in

      let r : U.t ref = ref (of_int 13) in

      begin
        assert (to_int !r = Some 13);
        assert (to_string !r = None);

        r := of_string "foo";

        assert (to_string !r = Some "foo");
        assert (to_int !r = None);
      end
  end;;
      

One possible implementation is via the use of exceptions together with local modules. The core idea exploits the fact that the primitive type exn is an open extensible sum. Here's the complete implementation. We'll break it down later.

module Universal_exn : UNIVERSAL = struct

  type t = exn

  module type ANY = sig
    type c
    exception E of c
  end

  type α any = (module ANY with type c = α)

  let mk : unit → α any =
    fun (type s) () →
      (module struct
        type c = s
        exception E of c
      end : ANY with type c = s)

  let inj (type s) (p : s any) (x : s) : t =
    let module Any = (val p : ANY with type c = s) in
    Any.E x

  let proj (type s) (p : s any) (y : t) : s option =
    let module Any = (val p : ANY with type c = s) in
    match y with
    | Any.E x → Some x
    | _ as e →  None

  let embed () = let p = mk () in inj p, proj p

end;;
    
Before delving into an explanation of the program, one can verify it satisfies the basic test.
# module Test_basic = Mk_universal_test(Universal_exn);;
# Test_basic.run ();;
- : unit = ()
   

The definition of the universal type t is an alias to the predefined type exn.

type t = exn
   
A module type ANY is introduced. Modules that implement this signature define an abstract type c and introduce an exn constructor E of c.
module type ANY = sig
  type c
  exception E of c
end
   
An alias for the type of a module value satisfying this signature comes next. Using aliases of this kind are helpful in reducing "syntactic verbosity" in code accepting and returning module values.
type α any = (module ANY with type c = α)
   
Next follow a set of functions that are private to the implementation of the module. The first of these is mk.
let mk : unit → α any =
  fun (type s) () →
    (module struct
      type c = s
      exception E of c
    end : ANY with type c = s)
   
This function mk takes the unit argument and each invocation computes a new module instance which is packed as a first class value and returned. The locally abstract type s connects to the α in the return type.

The next function to be defined is inj which computes a universal type value from its argument.

let inj (type s) (p : s any) (x : s) : t =
  let module Any = (val p : ANY with type c = s) in
  Any.E x
   
As was the case for mk, a locally abstract type is used in the definition of inj and observe how that type ensures a coherence between the module parameter p and the type of the parameter to be embedded x.

The projection function proj comes next and also uses a locally abstract type ensuring coherence among its parameters.

let proj (type s) (p : s any) (y : t) : s option =
  let module Any = (val p : ANY with type c = s) in
  match y with
  | Any.E x → Some x
  | _ as e →  None
   
The body of proj unpacks the module value parameter into a module named Any and then attempts to match y against the constructor defined by Any. Recall, at the end of the day, y is of type exn. The match contains two cases : the first matching the constructor Any.E x with x having type s, the second anything else (that is, proj is total).

Finally we come to the public function embed.

let embed () = let p = mk () in inj p, proj p
   
embed calls mk to produce a new embedding p and returns the pair of partial applications (inj p, proj p).

Our examination of the implementation is concluded. There are however various implications of the implementation we are now in a position to understand that are not immediately obvious from the specification. As a first example, consider the inferred type of a call to embed.

# module U = Universal_exn;;
# let inj, proj = U.embed ();;
val inj : '_a → U.t = 
val proj : U.t → '_a option = 
   
Property 1 : embed produces weakly polymorphic functions.

Consider the following scenario:
# let of_int, to_int = U.embed ();;
# let of_string, to_string = U.embed ();;
   
At this point all of of_int, to_int, of_string and to_string are weakly polymorphic.
# let r : U.t ref = ref (of_int 13);;
   
The call to of_int fixes it's type to int → U.t and that of to_int to U.t → int option. The types of of_string and to_string remain unfixed.
# to_string !r = Some 13;;
- : bool = false
   
The programmer has mistakenly used a projection function from a different embedding.
# r := of_string "foo";;
Error: This expression has type string but an expression was expected of
type int
   
The erroneous usage of to_string incidentally fixed the type of of_string to int → U.t! One can imagine errors of this kind being difficult to diagnose.

Property 2 : Injection/projection functions of different embeddings may not be mixed.

# let ((of_int : int → U.t), (to_int : U.t → int option)) = U.embed ();;
# let ((of_int' : int → U.t), (to_int' : U.t → int option)) = U.embed ();;
   
Two embeddings have been created, the programmer has fixed the types of the injection/projection functions a priori (presumably as a defense against the problems of the preceding example).
# let r : U.t ref = ref (of_int 13);;
# to_int !r = Some 13;;
- : bool = true
   
The first embedding is used to inject an integer. That integer can be extracted using the projection function of that embedding.
#  to_int' !r = Some 13;;
- : bool = false
   
We cannot extract the integer from the universal type value using the projection function from the other embedding despite all of the types involved being the same. One can imagine that also as being quite the source of bugs and confusion to the unknowing programmer.

The constraint of property 2 is beyond the ability of the type system to enforce. About the best one can do is to enhance the specification of the type with documentation.

module type UNIV = sig
  type t

  val embed : unit → (α → t) * (t → α option)
 
  (* A pair [(inj, proj)] returned by [embed] works together in that
     [proj u] will return [Some v] if and only if [u] was created by
     [inj v]. If [u] was created by a different function then [proj u] 
     will return [None]. *)

end;;