Shayne Fletcher: March 2017

Thursday, March 30, 2017

Dealing with source code locations (in lexical and syntax analysis)

Locations

Writing compilers and interpreters requires rigorous management of source code locations harvested during syntax analysis and associated error handling mechanisms that involve reporting those locations along with details of errors they associate to.

This article does a "deep dive" into the the Location module of the OCaml compiler. The original source can be found in the ocaml/parsing directory of an OCaml distribution (copyright Xavier Leroy).

Location is a masterclass in using the standard library Format module. If you have had difficulties understanding Format and what it provides the OCaml programmer, then this is for you. Furthermore, Location contains invaluable idioms for error reporting & exception handling. Learn them here to be able to apply them in your own programs.

Describing locations

A location corresponds to a range of characters in a source file. Location defines this type and a suite of functions for the production of location values.

type t = { 
  loc_start : Lexing.position;
  loc_end : Lexing.position;
  loc_ghost : bool
}

val none : t
val in_file : string → t
val init : Lexing.lexbuf → string → unit
val curr : Lexing.lexbuf → t

val symbol_rloc : unit → t
val symbol_gloc : unit → t
val rhs_loc : int → t

type 'a loc = { 
  txt : 'a;
  loc : t; 
}

val mkloc : 'a → t → 'a loc
val mknoloc : 'a → 'a loc

A value of the (standard library) type Lexing.position describes a point in a source file.

    type position = {
      pos_fname : string;
      pos_lnum : int;
      pos_bol : int;
      pos_cnum : int
    }

The fields of this record have the following meanings:

pos_fname is the file name
pos_lnum is the line number
pos_bol is the offset of the beginning of the line (the number of characters between the beginning of the lexbuf and the beginning of the line)
pos_cnum is the offset of the position (number of characters between the beginning of the lexbuf (details below) and the position)

The difference between pos_cnum and pos_bol is the character offset within the line (i.e. the column number, assuming each character is one column wide).

A location in a source file is defined by two positions : where the location starts and where the location ends.

type t = {
  loc_start : position;
  loc_end : position;
  loc_ghost : bool
}

The third field loc_ghost is used to disambiguate locations that do not appear explicitly in the source file. A location will not be marked as ghost if it contains a piece of code that is syntactically valid and corresponds to an AST node and will be marked as a ghost location otherwise.

There is a specific value denoting a null position. It is called none and it is defined by the function in_file.

let in_file (name : string) : t =
  let loc : position = {
    pos_fname = name; (*The name of the file*)
    pos_lnum = 1; (*The line number of the position*)
    pos_bol = 0; (*Offset from the beginning of the lexbuf of the line*)
    pos_cnum = -1; (*Offset of the position from the beginning of the lexbuf*)
  } in
  { loc_start = loc; loc_end = loc; loc_ghost = true }

let none : t = in_file "_none_"

Lexing.lexbuf is the (standard library) type of lexer buffers. A lexer buffer is the argument passed to the scanning functions defined by generated scanners (lexical analyzers). The lexer buffer holds the current state of the scanner plus a function to refill the buffer from the input.

type lexbuf = {
  refill_buff : lexbuf → unit;
  mutable lex_buffer : bytes;
  mutable lex_buffer_len : int;
  mutable lex_abs_pos : int;
  mutable lex_start_pos : int;
  mutable lex_curr_pos : int;
  mutable lex_last_pos : int;
  mutable lex_last_action : int;
  mutable lex_eof_reached : bool;
  mutable lex_mem : int array;
  mutable lex_start_p : position;
  mutable lex_curr_p : position;
}

At each token, the lexing engine will copy lex_curr_p to lex_start_p then change the pos_cnum field of lex_curr_p by updating it with the number of characters read since the start of the lexbuf. The other fields are left unchanged by the the lexing engine. In order to keep them accurate, they must be initialized before the first use of the lexbuf and updated by the relevant lexer actions.

(*Set the file name and line number of the [lexbuf] to be the start
   of the named file*)
let init (lexbuf : Lexing.lexbuf) (fname : string) : unit =
  let open Lexing in
  lexbuf.lex_curr_p <- {
    pos_fname = fname;
    pos_lnum = 1;
    pos_bol = 0;
    pos_cnum = 0;
  }

The location of the current token in a lexbuf is computed by the function curr.

(*Get the location of the current token from the [lexbuf]*)
let curr (lexbuf : Lexing.lexbuf) : t = 
  let open Lexing in {
    loc_start = lexbuf.lex_start_p;
    loc_end = lexbuf.lex_curr_p;
    loc_ghost = false;
  }

Parsing is the run-time library for parsers generated by ocamlyacc. The functions symbol_start and symbol_end are to be called in the action part of a grammar rule (only). They return the offset of the string that matches the left-hand-side of the rule : symbol_start returns the offset of the first character; symbol_end returns the offset after the last character. The first character in a file is at offset 0.

symbol_start_pos and symbol_end_pos are like symbol_start and symbol_end but return Lexing.position values instead of offsets.

(*Compute the span of the left-hand-side of the matched rule in the
   program source*)
let symbol_rloc () : t = {
  loc_start = Parsing.symbol_start_pos ();
  loc_end = Parsing.symbol_end_pos ();
  loc_ghost = false
}

(*Same as [symbol_rloc] but designates the span as a ghost range*)
let symbol_gloc () : t =  { 
  loc_start = Parsing.symbol_start_pos ();
  loc_end = Parsing.symbol_end_pos ();
  loc_ghost = true
}

The Parsing functions rhs_start and rhs_end are the same as symbol_start and symbol_end but return the offset of the string matching the n-th item on the right-hand-side of the rule where n is the integer parameter to rhs_start and rhs_end. n is 1 for the leftmost item. rhs_start_pos and rhs_end_pos return a position instead of an offset.

(*Compute the span of the [n]th item of the right-hand-side of the
   matched rule*)
let rhs_loc n = {
  loc_start = Parsing.rhs_start_pos n;
  loc_end = Parsing.rhs_end_pos n;
  loc_ghost = false;
}

The type 'a loc associates a value with a location.

(*A type for the association of a value with a location*)
type 'a loc = { 
  txt : 'a;
  loc : t; 
}

(*Create an ['a loc] value from an ['a] value and location*)
let mkloc (txt : 'a) (loc : t) : 'a loc = { txt ; loc }

(*Create an ['a loc] value bound to the distinguished location called
   [none]*)
let mknoloc (txt : 'a) : 'a loc = mkloc txt none

Error reporting with locations

Location has a framework for error reporting across modules concerned with locations (think lexer, parser, type-checker, etc).

open Format

type error =
{
  loc : t;
  msg : string;
  sub : error list;
}

val error_of_printer : t →  (formatter → 'a → unit) → 'a → error
val errorf_prefixed : ?loc : t → ?sub : error list → ('a, Format.formatter, unit, error) format4 → 'a

So, in the definition of the error record, loc is a location in the source code, msg an explanation of the error and sub a list of related errors. We deal here with the error formatting functions. The utility function print_error_prefix simply writes an error prefix to a formatter.

let error_prefix = "Error"

let warning_prefix = "Warning"

let print_error_prefix (ppf : formatter) () : unit =
  fprintf ppf "@{<error>%s@}:" error_prefix;
  ()

The syntax, "@{<error>%s}@" associates the embedded text with the named tag "error".

Next another utility, pp_ksprintf.

let pp_ksprintf 
    ?(before : (formatter → unit) option) 
    (k : string → 'd)
    (fmt : ('a, formatter, unit, 'd) format4) : 'a =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  begin match before with
    | None → ()
    | Some f → f ppf
  end;
  kfprintf
    (fun (_ : formatter) : 'd →
      pp_print_flush ppf ();
      let msg = Buffer.contents buf in
      k msg)
    ppf fmt

It proceeds as follows. A buffer and a formatter over that buffer is created. When presented with all of the arguments of the format operations implied by the fmt argument, if the before argument is non-empty, call it on the formatter. Finally, call kfprintf (from the standard library Format module) which performs the format operations on the buffer before handing control to a function that retrieves the contents of the now formatted buffer and passes them to the user provided continuation k.

With pp_ksprintf at our disposal, one can write the function errorf_prefixed.

let errorf_prefixed 
    ?(loc:t = none) 
    ?(sub : error list = []) 
    (fmt : ('a, Format.formatter, unit, error) format4) : 'a =
  let e : 'a =
    pp_ksprintf
      ~before:(fun ppf → fprintf ppf "%a " print_error_prefix ())
      (fun (msg : string) : error → {loc; msg; sub})
    fmt
  in e

errorf_prefixed computes a function. The function it computes provides the means to produce error values by way of formatting operations to produce the msg field of the error result value. The formatting operations include prefixing the msg field with the error_prefix string. The type of the arguments of the computed function unifies with the type variable 'a. In other words, the type of the computed function is 'a → error. For example, the type of errorf_prefixed "%d %s" is

int → string →
    error

The groundwork laid with errorf_prefixed above means a polymorphic function error_of_printer can now be produced.

let error_of_printer 
    (loc : t) 
    (printer : formatter → 'error_t → unit) 
    (x : 'error_t) : error =
  let mk_err : 'error_t → error = 
    errorf_prefixed ~loc "%a@?" printer in
  mk_err x

The idea is that error_of_printer is provided a function that can format a value of type 'error. This function is composed with errorf_prefixed thereby producing a function of type 'error → error. For example, we can illustrate how this works by making an error of a simple integer with code like the following:

# error_of_printer none (fun ppf x → Format.fprintf ppf "Code %d" x) 3;;
- : error =
{loc =
  {loc_start =
    {pos_fname = "_none_"; pos_lnum = 1; pos_bol = 0; pos_cnum = -1};
   loc_end = {pos_fname = "_none_"; pos_lnum = 1; pos_bol = 0; pos_cnum = -1};
   loc_ghost = true};
 msg = "Error: Code 3"; sub = []}

So, that's error_of_printer. The following utility function is much simpler - it simply writes a given filename to a formatter.

let print_filename (ppf : formatter) (file : string) : unit =
  fprintf ppf "%s" file

Next, a set of constants for consistent messages that involve locations and a function to get the file, line and column of a position.

let (msg_file, msg_line, msg_chars, msg_to, msg_colon) =
    ("File \"",        (*'msg file'*)
     "\", line ",      (*'msg line'*)
     ", characters ",  (*'msg chars'*)
     "-",              (*'msg to'*)
     ":")              (*'msg colon'*)

let get_pos_info pos = (pos.pos_fname, pos.pos_lnum, pos.pos_cnum - pos.pos_bol)

Making use of the above we have now print_loc : a function to print a location on a formatter in terms of file, line and character numbers.

let print_loc (ppf : formatter) (loc : t) : unit  =
  let (file, line, startchar) = get_pos_info loc.loc_start in
  let endchar = loc.loc_end.pos_cnum - loc.loc_start.pos_cnum + startchar in
  if file = "//toplevel//" then
    fprintf ppf "Characters %i-%i"
      loc.loc_start.pos_cnum loc.loc_end.pos_cnum
  else begin
    fprintf ppf "%s@{<loc>%a%s%i" msg_file print_filename file msg_line line;
    if startchar >= 0 then
      fprintf ppf "%s%i%s%i" msg_chars startchar msg_to endchar;
    fprintf ppf "@}"
  end

Locations generally speaking come out in a format along the lines of:

File "<string>, line 1,
    characters 0-10:"

let print (ppf : formatter) (loc : t) : unit =
  (* The syntax, [@{<loc>%a@}] associates the embedded text with the
     named tag 'loc'*)
  fprintf ppf "@{<loc>%a@}%s@." print_loc loc msg_colon

That last function, print is just a small wrapper over print_loc that appends a colon to the location.

Exception handling involving errors with locations

This section is concerned with the following section of the Location's signature.

val register_error_of_exn : (exn → error option) → unit
val error_of_exn : exn → error option
val error_reporter : (formatter → error → unit) ref
val report_error : formatter → error → unit
val report_exception : formatter → exn → unit

Location contains a mutable list of exception handlers where an exception handler is a function of type exn → error option.

let error_of_exn : (exn → error option) list ref = ref []

A function is provided that adds an exception handler to the above list.

let register_error_of_exn f = error_of_exn := f :: !error_of_exn

The next function error_of_exn (yes, it is the only remaining function that manipulates the list error_exn previously defined directly) walks the list looking for a handler returning the contents of the result of the first such function that doesn't return a None value.

let error_of_exn exn =
  let rec loop = function
    | [] → None
    | f :: rest →
      match f exn with
      | Some _ as r → r
      | None → loop rest
  in
  loop !error_of_exn

We define now a "default" error reporting function. Given a formatter and an error, write the error location, an explanation of the error to the formatter and the same for any associated "sub" errors.

let rec default_error_reporter 
    (ppf : formatter) ({loc; msg; sub} : error) : unit =
  print ppf loc;
  Format.pp_print_string ppf msg;
  List.iter (Format.fprintf ppf "@\n@[<2>%a@]" default_error_reporter) sub

Now, error_reporter itself is a reference cell with default value default_error_reporter.

let error_reporter = ref default_error_reporter

This next function, print_updating_num_loc_lines looks more complicated than it is but does demonstrate a rather advanced usage of Format by containing calls to the functions pp_get_formatter_out_functions, pp_set_formatter_out_functions to tempoarily replace the default function for writing strings. The semantic of the function is to print an error on a formatter incidentally recording the number of lines required to do so.

(* A mutable line count*)
let num_loc_lines : int ref = ref 0

(*Prints an error on a formatter incidentally recording the number of
  lines required to do so*)
let print_updating_num_loc_lines 
    (ppf : formatter) 
    (f : formatter → error → unit) 
    (arg : error) : unit =
  (*A record of functions of output primitives*)
  let out_functions : formatter_out_functions
      = pp_get_formatter_out_functions ppf () in
  (*The strategoy is to temporarily replace the basic function for
    writing a string with this one*)
  let out_string (str : string) (start : int) (len : int) : unit =
    (*A function for counting line breaks in [str]. [c] is the current
      count, [i] is the current char under consideration*)
    let rec count (i : int) (c : int) : int=
      if i = start + len then c
      else if String.get str i = '\n' then count (succ i) (succ c)
      else count (succ i) c 
    in
    (*Update the count*)
    num_loc_lines := !num_loc_lines + count start 0;
    (*Write the string to the formatter*)
    out_functions.out_string str start len 
  in
  (*Replace the standard string output primitive with the one just
    defined *)
  pp_set_formatter_out_functions ppf 
    {out_functions with out_string} ;
  (*Write the error to the formatter*)
  f ppf arg ;
  pp_print_flush ppf ();
  (*Restore the standard primitive*)
  pp_set_formatter_out_functions ppf out_functions

The function report_error uses the currently installed error reporter to write an error report for a given error and formatter incidentally updating a count indicating the number of lines written.

let report_error (ppf : formatter) (err : error) : unit=
  print_updating_num_loc_lines ppf !error_reporter err

This next function, report_exception_rec tries a number of times to find a handler for a given error and if successful formats it. In the worst case a handler is never found and the exception propogates.

let rec report_exception_rec (n : int) (ppf : formatter) (exn : exn) : unit =
  (*Try to find a handler for the exception*)
  try match error_of_exn exn with
  | Some err → 
    (*Format the resulting error using the current error reporter*)
    fprintf ppf "@[%a@]@." report_error err 
  (*The syntax @[%a@]@ writes function output in a box followed by a
    'cut' break hint*)
  | None → raise exn (*A handler could not be found*)
  with exn when n > 0 →
    (*A handler wasn't found. Try again*)
    report_exception_rec (n - 1) ppf exn

The last piece is report_exception. It attempts to write an error report for the given exception on the provided formatter. The exception can be re-raised if no handler is found.

let report_exception (ppf : formatter) (exn : exn) : unit = 
  report_exception_rec 5 ppf exn

Usage

In this section we see how an example of how the above machinery is used. Consider defining a lexical analyzer as an example. Suppose the scanner is defined by the file lexer.mll (the input file to ocamllex). We can imagine its header containing code like the following.

     {
        (*The cases of lexer errors*)
        type error =
          | Illegal_character of char
          | Unterminated_comment of Location.t

        (*The lexer exception type*)
        exception Error of error * Location.t

        (*This function takes a formatter and an instance of type
          [error] and writes a message to the formatter explaining the
          meaning. This is a "printer"*)
        let report_error (ppf : Format.formatter) : error → unit = function
         | Illegal_character c → 
            Format.fprintf ppf "Illegal character (%s)" (Char.escaped c)
         | Unterminated_comment _ → 
            Format.fprintf ppf "Comment not terminated"

        (*Note that [report_error] is a function that unifies with
          the [formatter → 'a → unit] parameter of
          [error_of_printer]*)

        (*Register an exception handler for the lexer exception type*)
        let () =
         Location.register_error_of_exn
          (function
           | Error (err, loc) →
              Some (Location.error_of_printer loc report_error err)
           | _ →  None
          )
     }

     /*...*/
     rule token = ...

A function to handle errors with attached locations (in a REPL for example) is expressible as an idiom as simple as something like this.

let handle_interpreter_error ?(finally=(fun () → ())) ex =
  finally ();
  Location.report_exception (Format.std_formatter) ex

let safe_proc ?finally f =
  try f ()
  with exn → handle_interpreter_error ?finally exn

Tuesday, March 21, 2017

Polynomials over rings

This post provides a workout in generic programming using modules & functors.

The program presented here models univariate polynomials over rings based on an exercise in "The Module Language" chapter, of Didier Rémy's book, Using, Understanding and Unraveling the OCaml Lanaguage.

Arithmetics and rings

We begin with a type for modules implementing arithmetic.

    module type ARITH = sig
      type t
      val of_int : int -> t            val to_int : t -> int
      val of_string : string -> t      val to_string : t -> string
      val zero : t                     val one : t
      val add : t -> t -> t            val sub : t -> t -> t
      val mul : t -> t -> t            val div : t -> t -> t
      val compare : t -> t -> int      val equal : t -> t -> bool
    end;;

A ring is a set equipped with two binary operations that generalize the arithmetic operations of addition and multiplication.

    module type RING = sig
      type t
      type extern_t
      val print : t -> unit
      val make : extern_t -> t         val show : t -> extern_t
      val zero : t                     val one : t
      val add : t -> t -> t            val mul : t -> t -> t
      val equal : t -> t -> bool
    end;;

We can build rings over arithmetics with functors. This particular one fixes the external representation of the elements of the ring to int.

    module Ring (A : ARITH) :
      RING  with type t = A.t and type extern_t = int =
    struct
      include A
      type extern_t = int
      let make = of_int                let show = to_int
      let print x = print_int (show x)
    end;;

Thus, here for example are rings over various specific arithmetic modules.

    module Ring_int32 = Ring (Int32);;
    module Ring_int64 = Ring (Int64);;
    module Ring_nativeint = Ring (Nativeint);;
    module Ring_int = Ring (
      struct
        type t = int
        let of_int x = x                   let to_int x = x
        let of_string = int_of_string      let to_string = string_of_int
        let zero = 0                       let one = 1
        let add = ( + )                    let sub = ( - )
        let mul = ( * )                    let div = ( / )
        let compare = Pervasives.compare   let equal = ( = )
      end
    );;

Polynomials

We define now the type of polynomials.

    module type POLYNOMIAL = sig
      type coeff (*Type of coefficients*)
      type coeff_extern_t (*Type of coeff. external rep*)

      (*Polynomials satisfy the ring interface*)
      include RING (*Declares a type [t] and [extern_t]*)

      (*Function to evaluate a polynomial at a point*)
      val eval : t -> coeff -> coeff
    end;;

Given a module implementing a ring, we can generate a module implementing polynomials with coefficients drawn from the ring.

    module Polynomial (R : RING) :
      POLYNOMIAL with type coeff = R.t
      and type coeff_extern_t = R.extern_t
      and type extern_t = (R.extern_t * int) list =
    struct
    
      type coeff = R.t (*Coefficient type*)
      type coeff_extern_t = R.extern_t (*External coeff. rep*)
      type extern_t = (coeff_extern_t * int) list (*External polynomial rep*)
    
      (*List of coefficients and their powers*)
      type t = (coeff * int) list (*Invariant : Ordered by powers,
                                    lower order terms at the front*)

      (* ... *)

    end;;

As the comments indicate, the polynomial data structure is a list of pairs of coefficients and powers, ordered so that lower powers come before higher ones. Here's a simple printing utility to aid visualization.

    let print p =
      List.iter
        (fun (c, k) -> Printf.printf "+ (";
          R.print c;
          Printf.printf ")X^%d " k)
        p

In order that we get a canonical representation, null coefficients are eliminated. In particular, the null monomial is simply the empty list.

    let zero = []

The multiplicative identity one is not really necessary as it is just a particular monomial however, its presence makes polynomials themselves satisfy the interface of rings.

    let one = [R.one, 0]

This helper function constructs monomials.

    let monomial (a : coeff) (k : int) =
      if k < 0 then
        failwith "monomial : negative powers not supported"
      else if R.equal a R.zero then [] else [a, k]

Next up, we define addition of polynomials by the following function. Care is taken to ensure the representation invariant is respected.

    let rec add u v =
      match u, v with
      | [], _ -> v
      | _, [] -> u
      | ((c1, k1) :: r1 as p1), ((c2, k2) :: r2 as p2) ->
        if k1 < k2 then
          (c1, k1) :: (add r1 p2)
        else if k1 = k2 then
          let c = R.add c1 c2 in
          if R.equal c R.zero then add r1 r2
          else (c, k1) :: (add r1 r2)
        else (c2, k2) :: (add p1 r2)

With monomial and add avaialable, we can now write make that computes a polynomial from an external representation. We also give the inverse function show here too.

    let make l =
      List.fold_left (fun acc (c, k) ->
        add (monomial (R.make c) k) acc) zero l

    let show p =
      List.fold_right (fun (c, k) acc -> (R.show c, k) :: acc) p []

The module private function times left-multiplies a polynomial by a monomial.

    let rec times (c, k) = function
      | [] -> []
      | (c1, k1) :: q ->
        let c2 = R.mul c c1 in
        if R.equal c2 R.zero then times (c, k) q
        else (c2, k + k1) :: times (c, k) q

Given the existence of times, polynomial multiplication can be expressed in a "one-liner".

    let mul p = List.fold_left (fun r m -> add r (times m p)) zero

Comparing two polynomials for equality is achieved with the following predicate.

    let rec equal p1 p2 =
      match p1, p2 with
      | [], [] -> true
      | (c1, k1) :: q1, (c2, k2) :: q2 ->
        k1 = k2 && R.equal c1 c2 && equal q1 q2
      | _ -> false

In the course of evaluating polynomials for a specific value of their indeterminate, we'll require a function for computing powers. The following routine uses the exponentiation by squaring technique.

    let rec pow c = function
      | 0 -> R.one
      | 1 -> c
      | k ->
        let l = pow c (k lsr 1) in
        let l2 = R.mul l l in
        if k land 1 = 0 then l2 else R.mul c l2

Finally, the function eval for evaluation of a polynomial at a specific point. The implementation uses Horner's rule for computationally efficient evaluation.

    let eval p c = match List.rev p with
      | [] -> R.zero
      | (h :: t) ->
        let reduce (a, k) (b, l) =
          let n = pow c (k - l) in
          let t = R.add (R.mul a n) b in
          (t, l)  in
        let a, k = List.fold_left reduce h t in
        R.mul (pow c k) a

Testing and example usage

The following interactive session creates a polynomial with integer coefficients module and uses it to confirm the equivalence of $(1 + x)(1 - x)$ with $(1 - x^{2})$.

    # #use "polynomial.ml";;
    # module R = Ring_int;;
    # module P = Polynomial (R);;
    # let p = P.mul (P.make [(1, 0); (1, 1)]) (P.make [(1, 0); (-1, 1)]);;
    # let q = P.make [(1, 0); (-1, 2)];;
    # P.equal p q;;
    - : bool = true

Polynomials in two variables, can be treated as univariate polynomials with polynomial coefficients. For example, the polynomial $(X + Y)$ can be regarded as $(1 \times X^{1})Y^{0} + (1 \times X^{0})Y^{1}$. Similarly we can write $X - Y$ as $(1 \times X^{1})Y^{0} + (-1 \times X^{0}) Y^1$ and now check the equivalence $(X + Y)(X - Y) = (X^{2} - Y^{2})$.

    #module Y = Polynomial (P);;
    
    #(* (X + Y) *)
    #let r = Y.make [
    #  ([1, 1], 0); (*(1 X^1) Y^0*)
    #  ([1, 0], 1)  (*(1 X^0) Y^1*)
    #];;

    #(* (X - Y) *)
    #let s = Y.make [
    #  ([1, 1], 0); (*(1 X^1) Y^0*)
    #  ([-1, 0], 1) (*((-1) X^0) Y^1*)
    #];;

    #(* (X^2 - Y^2) *)
    #let t = Y.make [
    #  ([1, 2], 0);   (*(1 X^2) Y^0*)
    #  ([-1, 0], 2)  (* (-1 X^0) Y^2*)
    #];;

    #Y.equal (Y.mul r s) t;;
    - : bool = true

Friday, March 10, 2017

Universal type

A universal type is a type into which all other types can be embedded. A module implementing such a type here will satisfy the following signature.

module type UNIVERSAL = sig
  type t
  val embed : unit → (α → t) * (t → α option)
end;;

The type t is the universal type and each call to embed returns a pair of functions : an injection function for embedding a value into the universal type and, a projection function for extracting the value from its embedding in the universal type. The following code demonstrates intended usage.

module type TEST = sig
  val run : unit → unit
end;;

module type UNIVERSAL_TEST = functor (U : UNIVERSAL) → TEST;;

module Basic_usage : UNIVERSAL_TEST = 
  functor (U : UNIVERSAL) → struct
    let run () =
      let ((of_int : int → U.t)
         , (to_int : U.t → int option)) = U.embed () in
      let ((of_string : string → U.t)
         , (to_string : U.t → string option)) = U.embed () in

      let r : U.t ref = ref (of_int 13) in

      begin
        assert (to_int !r = Some 13);
        assert (to_string !r = None);

        r := of_string "foo";

        assert (to_string !r = Some "foo");
        assert (to_int !r = None);
      end
  end;;

One possible implementation is via the use of exceptions together with local modules. The core idea exploits the fact that the primitive type exn is an open extensible sum. Here's the complete implementation. We'll break it down later.

module Universal_exn : UNIVERSAL = struct

  type t = exn

  module type ANY = sig
    type c
    exception E of c
  end

  type α any = (module ANY with type c = α)

  let mk : unit → α any =
    fun (type s) () →
      (module struct
        type c = s
        exception E of c
      end : ANY with type c = s)

  let inj (type s) (p : s any) (x : s) : t =
    let module Any = (val p : ANY with type c = s) in
    Any.E x

  let proj (type s) (p : s any) (y : t) : s option =
    let module Any = (val p : ANY with type c = s) in
    match y with
    | Any.E x → Some x
    | _ as e →  None

  let embed () = let p = mk () in inj p, proj p

end;;

Before delving into an explanation of the program, one can verify it satisfies the basic test.

# module Test_basic = Mk_universal_test(Universal_exn);;
# Test_basic.run ();;
- : unit = ()

The definition of the universal type t is an alias to the predefined type exn.

type t = exn

A module type ANY is introduced. Modules that implement this signature define an abstract type c and introduce an exn constructor E of c.

module type ANY = sig
  type c
  exception E of c
end

An alias for the type of a module value satisfying this signature comes next. Using aliases of this kind are helpful in reducing "syntactic verbosity" in code accepting and returning module values.

type α any = (module ANY with type c = α)

Next follow a set of functions that are private to the implementation of the module. The first of these is mk.

let mk : unit → α any =
  fun (type s) () →
    (module struct
      type c = s
      exception E of c
    end : ANY with type c = s)

This function mk takes the unit argument and each invocation computes a new module instance which is packed as a first class value and returned. The locally abstract type s connects to the α in the return type.

The next function to be defined is inj which computes a universal type value from its argument.

let inj (type s) (p : s any) (x : s) : t =
  let module Any = (val p : ANY with type c = s) in
  Any.E x

As was the case for mk, a locally abstract type is used in the definition of inj and observe how that type ensures a coherence between the module parameter p and the type of the parameter to be embedded x.

The projection function proj comes next and also uses a locally abstract type ensuring coherence among its parameters.

let proj (type s) (p : s any) (y : t) : s option =
  let module Any = (val p : ANY with type c = s) in
  match y with
  | Any.E x → Some x
  | _ as e →  None

The body of proj unpacks the module value parameter into a module named Any and then attempts to match y against the constructor defined by Any. Recall, at the end of the day, y is of type exn. The match contains two cases : the first matching the constructor Any.E x with x having type s, the second anything else (that is, proj is total).

Finally we come to the public function embed.

let embed () = let p = mk () in inj p, proj p

embed calls mk to produce a new embedding p and returns the pair of partial applications (inj p, proj p).

Our examination of the implementation is concluded. There are however various implications of the implementation we are now in a position to understand that are not immediately obvious from the specification. As a first example, consider the inferred type of a call to embed.

# module U = Universal_exn;;
# let inj, proj = U.embed ();;
val inj : '_a → U.t = 
val proj : U.t → '_a option =

Property 1 : embed produces weakly polymorphic functions.

Consider the following scenario:

# let of_int, to_int = U.embed ();;
# let of_string, to_string = U.embed ();;

At this point all of of_int, to_int, of_string and to_string are weakly polymorphic.

# let r : U.t ref = ref (of_int 13);;

The call to of_int fixes it's type to

int →
   U.t

and that of to_int to

U.t → int
   option

. The types of of_string and to_string remain unfixed.

# to_string !r = Some 13;;
- : bool = false

The programmer has mistakenly used a projection function from a different embedding.

# r := of_string "foo";;
Error: This expression has type string but an expression was expected of
type int

The erroneous usage of to_string incidentally fixed the type of of_string to int → U.t! One can imagine errors of this kind being difficult to diagnose.

Property 2 : Injection/projection functions of different embeddings may not be mixed.

# let ((of_int : int → U.t), (to_int : U.t → int option)) = U.embed ();;
# let ((of_int' : int → U.t), (to_int' : U.t → int option)) = U.embed ();;

Two embeddings have been created, the programmer has fixed the types of the injection/projection functions a priori (presumably as a defense against the problems of the preceding example).

# let r : U.t ref = ref (of_int 13);;
# to_int !r = Some 13;;
- : bool = true

The first embedding is used to inject an integer. That integer can be extracted using the projection function of that embedding.

#  to_int' !r = Some 13;;
- : bool = false

We cannot extract the integer from the universal type value using the projection function from the other embedding despite all of the types involved being the same. One can imagine that also as being quite the source of bugs and confusion to the unknowing programmer.

The constraint of property 2 is beyond the ability of the type system to enforce. About the best one can do is to enhance the specification of the type with documentation.

module type UNIV = sig
  type t

  val embed : unit → (α → t) * (t → α option)
 
  (* A pair [(inj, proj)] returned by [embed] works together in that
     [proj u] will return [Some v] if and only if [u] was created by
     [inj v]. If [u] was created by a different function then [proj u] 
     will return [None]. *)

end;;

Tuesday, March 7, 2017

Polymorphic variants : Subtyping and variance

Polymorphic variants : subtyping and variance

Polymorphic variants : Subtyping and variance

Here are some expressions in the top-level involving polymorphic variant types.

    # let x = [ `On; `Off ];;
    val x : [> `Off | `On ] list = [ `On; `Off ]

The notation [> `Off | `On ] represents a type that at least contains the constructors `Off and `On. Of course, there are an unlimited number of such types so [> `Off | `On ] is a set in fact.

    # let n = `Number 1;;
    val n : [> `Number of int ] = `Number 1

The value n is of a type that at least contains the constructor `Number of int.

    # let f = function | `On → 1 | `Off → 0 | `Number n → n;;
    val f : [< `Number of int | `Off | `On ] → int = <fun>

The function f accomodates exactly three cases corresponding to the constructors `Off, `On and

`Number
    of int

. This informs us that

[< `Number of int |
    `Off | `On ]

is of a type that at most has the constructors `Off, `On and

`Number
    of int

The expression ($expr$:>$typexpr$) coerces the expression $expr$ to type $typexpr$. The expression ($expr$ : $typ_{1}$ :> $typ_{2}$ ) coerces the exprssion $expr$ from $typ_{1}$ to $typ_{2}$. It is only possible to coerce an expression $expr$ from type $typ_{1}$ to type $typ_{2}$, if the type of $expr$ is an instance of $typ_{1}$ and $typ_{1}$ is a subtype of $typ_{2}$.

    # let f x = (x :> [ `A | `B ]);;
    val f : [< `A | `B ] → [ `A | `B ] = <fun>

We see x needs to be coercible to type

[ `A |
     `B ]

. So, we read [< `A | `B ] as a type that at most contains the constructors `A and `B. [ `A ], [ `B ] and [ `A | `B ] are the subtypes of

[ `A | `B
     ]

. It follows then that [< `A | `B ] is the set of subtypes of [ `A | `B ].

    # let f x = (x :> [ `A | `B ] * [ `C | `D ]);;
    val f : [< `A | `B ] * [< `C | `D ] → [ `A | `B ] * [ `C | `D ] = <fun>

We see x needs to be coercible to

[ `A | `B ] *
    [ `C | `D ]

. This coercion can only proceed if x designates a pair with first component a subtype of [ `A | `B ] and second component a subtype of [ `C | `D ]. So we see,

[< `A | `B ] *
    [< `C | `D ]

is the set of subtypes of

[ `A | `B ]
    * [ `C | `D ]

    # let f x = (x  :> [ `A ] → [ `C | `D ]);;
    val f : ([> `A ] → [< `C | `D ]) → [ `A ] → [ `C | `D ] = <fun>

We see x needs to be coercible to

[ `A ] → [
    `C | `D ]

. This coercion can only proceed if x designates an arrow where the argument is of a type that at least contains the constructor `A. That is

[ `A
    ]

, [ `A | ... ] where the "..." represent more constructors. From this we see that

[>
    `A]

is the set of supertypes of [ `A ]. The return value of x (by logic we've already established) is a subtype of

[ `C | `D
    ]

. So, [> `A ] → [< `C | `D ] is the set of subtypes of [ `A ] → [ `C | `D ].

The transformation represented by f above, has coerced an arrow x to a new arrow. The type of the argument of x is [> `A ] and the type of the argument of the resulting arrow is [ `A ]. That is, for the argument, the transformation between types has taken a supertype to a subtype. The argument type is said to be in a "contravariant" position. On the other hand, the result type of x is [< `C | `D ] and the arrow that is produced from it, f x, has result type [ `C | `D ]. That is, the coercion for the result type has taken a subtype to a supertype : the result type in x is said to be in "covariant" position.

In the following, the type α t is abstract.

    # type α t;;
    # let f (x : [ `A ] t) = (x :> [ `A | `B ] t);;
    Error: Type [ `A ] t is not a subtype of [ `A | `B ] t

Indeed, [ `A ] is a subtype of

[ `A | `B
    ]

but that, a priori, does not say anything about the subtyping relationship between [ `A ] t and [ `A | `B ] t. For this coercion to be legal, the parameter α must be given a covariant annotation:

    # type +α t;;
    # let f (x : [ `A ] t) = (x :> [ `A | `B ] t);;
    val f : [ `A ] t → [ `A | `B ] t = <fun>

The declaration type +α t declares that the abstract type is covariant in its parameter : if the type $\tau$ is a subtype of $\sigma$, then $\tau\;t$ is a subtype of $\sigma\;t$. Here, $\tau = $[ `A ], $\sigma = $[ `A | `B ], $\tau$ is a subtype of $\sigma$ and [ `A ] t is a subtype of

[ `A | `B]
    t

Here is a similar example, but this time, in the other direction.

    # type α t;;
    # let f (x : [ `A | `B ] t) = (x :> [ `A ] t);;
    Error: This expression cannot be coerced to type [ `A ] t

The type variable can be annotated as contravariant however, and the coercion function typechecks.

    # type -α t;;
    # let f (x : [`A | `B] t) = (x :> [ `A ] t);;
    val f : [ `A | `B ] t → [ `A ] t = <fun>

The declaration type -α t declares that the abstract type t is contravariant in its parameter : if $\tau$ is a subtype of $\sigma$ then $\sigma\;t$ is a subtype of $\tau\;t$. In this example, $\tau = $[ `A ] and $\sigma = $[`A | `B], $\tau$ is a subtype of $\sigma$ and [ `A | `B ] t is a subtype of [ `A ] t.

In the following, type α t is not abstract and variance can be inferred.

    # type α t = {x : α} ;;
    # let f x = (x : [`A] t :> [`A | `B] t);;
    val f : [ `A ] t → [ `A | `B ] t = <fun>

Introducing a constraint however inhibits variance inference.

    # type α t = {x : α} constraint α = [< `A | `B ];;
    # let f x = (x : [ `A ] t :> [ `A | `B ] t);;
    Error: Type [ `A ] t is not a subtype of [ `A | `B ] t

This situation can be overcome by introducing a covariance annotation.

    # type +α t = {x : α} constraint α = [< `A | `B ];;
    # let f x = (x : [ `A ] t :> [ `A | `B ] t);;
    val f : [ `A ] t → [ `A | `B ] t = <fun>

In the following example, α does not participate in the definition of type α t.

    # type α t = {x : int};;
    # let f x = (x : [ `A ] t :> [ `B ] t);;
    val f : [ `A ] t → [ `B ] t = <fun>

In this case, any conversion between δ t and ε t is legal : the types δ and ε are not required to have a subtyping relation between them.