Saturday, October 29, 2016

Implementing type-classes as OCaml modules

Implementing type-classes as OCaml modules

Modular type classes

We revisit the idea of type-classes first explored in this post. This time though, the implementation technique will be by via OCaml modules inspired by the paper "Modular Type Classes" [2] by Harper et. al.

Starting with the basics, consider the class of types whose values can be compared for equality. Call this type-class Eq. We represent the class as a module signature.

    module type EQ = sig
      type t

      val eq : t * t → bool
    end
    
Specific instances of Eq are modules that implement this signature. Here are two examples.
    module Eq_bool : EQ with type t = bool = struct
      type t = bool
    
      let eq (a, b) = a = b
    end
    
    module Eq_int : EQ with type t = int = struct
      type t = int
    
      let eq (a, b) = a = b
    end
    

Given instances of class Eq (X and Y say,) we realize that products of those instances are also in Eq. This idea can be expressed as a functor with the following type.

    module type EQ_PROD =
      functor (X : EQ) (Y : EQ) → EQ with type t = X.t * Y.t
    
The implementation of this functor is simply stated as the following.
    module Eq_prod : EQ_PROD =
      functor (X : EQ) (Y : EQ) → struct
        type t = X.t * Y.t

        let eq ((x1, y1), (x2, y2)) =  X.eq (x1, x2) && Y.eq(y1, y2)
    end
    
With this functor we can build concrete instances for products. Here's one example.
    module Eq_bool_int : 
       EQ with type t = (bool * int) = Eq_prod (Eq_bool) (Eq_int)
    

The class Eq can be used as a building block for the construction of new type classes. For example, we might define a new type-class Ord that admits types that are equality comparable and whose values can be ordered with a "less-than" relation. We introduce a new module type to describe this class.

    module type ORD = sig
      include EQ

      val lt : t * t → bool
    end
    
Here's an example instance of this class.
    module Ord_int : ORD with type t = int = struct
      include Eq_int
    
      let lt (x, y) = Pervasives.( < ) x y
    end
    
As before, given two instances of this class, we observe that products of these instances also reside in the class. Accordingly, we have this functor type
    module type ORD_PROD =
      functor (X : ORD) (Y : ORD) → ORD with type t = X.t * Y.t
    
with the following implementation.
    module Ord_prod : ORD_PROD =
      functor (X : ORD) (Y : ORD) → struct
        include Eq_prod (X) (Y)
    
        let lt ((x1, y1), (x2, y2)) =
          X.lt (x1, x2) || X.eq (x1, x2) && Y.lt (y1, y2)
      end
    
This is the corresponding instance for pairs of intgers.
    module Ord_int_int = Ord_prod (Ord_int) (Ord_int)
    
Here's a simple usage example.
    let test_ord_int_int = 
      let x = (1, 2) and y = (1, 4) in
      assert ( not (Ord_int_int.eq (x, y)) && Ord_int_int.lt (x, y))
    

Using type-classes to implement parameteric polymorphism

This section begins with the Show type-class.

     module type SHOW = sig
      type t

      val show : t → string
    end
   
In what follows, it is convenient to make an alias for module values of this type.
   type α show_impl = (module SHOW with type t = α)
   
Here are two instances of this class...
    module Show_int : SHOW with type t = int = struct
      type t = int
    
      let show = Pervasives.string_of_int
    end

    module Show_bool : SHOW with type t = bool = struct
      type t = bool

      let show = function | true → "True" | false → "False"
    end
   
... and here these instances are "packed" as values.
    let show_int : int show_impl = 
      (module Show_int : SHOW with type t = int)

    let show_bool : bool show_impl = 
      (module Show_bool : SHOW with type t = bool)
   
The existence of the Show class is all that is required to enable the writing of our first parametrically polymorphic function.
    let print : α show_impl → α → unit =
      fun (type a) (show : a show_impl) (x : a) →
      let module Show = (val show : SHOW with type t = a) in
      print_endline@@ Show.show x
    
    let test_print_1 : unit = print show_bool true
    let test_print_2 : unit = print show_int 3
   
The function print can be used with values of any type α as long as the caller can produce evidence of α's membership in Show (in the form of a compatible instance).

The next example begins with the definition of a type-class Num (the class of additive numbers) together with some example instances.

    module type NUM = sig
      type t
    
      val from_int : int → t
      val ( + ) : t → t → t
    end
    
    module Num_int : NUM with type t = int = struct
      type t = int
    
      let from_int x = x
      let ( + ) = Pervasives.( + )
    end
    
    let num_int = (module Num_int : NUM with type t = int)
    
    module Num_bool : NUM with type t = bool = struct
      type t = bool
    
      let from_int = function | 0 → false | _ → true
      let ( + ) = function | true → fun _ → true | false → fun x → x
    end
    
    let num_bool = (module Num_bool : NUM with type t = bool)
   
The existence of Num admits writing a polymorphic function sum that will work for any α list of values if only α can be shown to be in Num.
    let sum : α num_impl → α list → α =
      fun (type a) (num : a num_impl) (ls : a list) →
        let module Num = (val num : NUM with type t = a) in
        List.fold_right Num.( + ) ls (Num.from_int 0)
    
    let test_sum = sum num_int [1; 2; 3; 4]
   
This next function requires evidence of membership in two classes.
    let print_incr : (α show_impl * α num_impl) → α → unit =
      fun (type a) ((show : a show_impl), (num : a num_impl)) (x : a) →
        let module Num = (val num : NUM with type t = a) in
        let open Num
        in print show (x + from_int 1)
    
    (*An instantiation*)
    let print_incr_int (x : int) : unit = print_incr (show_int, num_int) x
   

If α is in Show then we can easily extend Show to include the type α list. As we saw earlier, this kind of thing can be done with an approriate functor.

    module type LIST_SHOW =
      functor (X : SHOW) → SHOW with type t = X.t list
    
    module List_show : LIST_SHOW =
      functor (X : SHOW) → struct
        type t = X.t list
    
        let show =
            fun xs →
              let rec go first = function
                | [] → "]"
                | h :: t →
                  (if (first) then "" else ", ") ^ X.show h ^ go false t 
              in "[" ^ go true xs
      end
   
There is also another way : one can write a function to dynamically compute an α list show_impl from an α show_impl.
  let show_list : α show_impl → α list show_impl =
    fun (type a) (show : a show_impl) →
      let module Show = (val show : SHOW with type t = a) in
      (module struct
        type t = a list
  
        let show : t → string =
          fun xs →
            let rec go first = function
              | [] → "]"
              | h :: t →
                (if (first) then "" else ", ") ^ Show.show h ^ go false t
            in "[" ^ go true xs
      end : SHOW with type t = a list)
    
   let testls : string = let module Show =
       (val (show_list show_int) : SHOW with type t = int list) in
      Show.show (1 :: 2 :: 3 :: [])
  

The type-class Mul is an aggregation of the type-classes Eq and Num together with a function to perform multiplication.

   module type MUL = sig
      include EQ
      include NUM with type t := t
    
      val mul : t → t → t
   end
    
   type α mul_impl = (module MUL with type t = α)
    
   module type MUL_F =
     functor (E : EQ) (N : NUM with type t = E.t) → MUL with type t = E.t
   
A default instance of Mul can be provided given compatible instances of Eq and Num.
    module Mul_default : MUL_F =
      functor (E : EQ) (N : NUM with type t = E.t)  → struct
        include (E : EQ with type t = E.t)
        include (N : NUM with type t := E.t)
    
        let mul : t → t → t =
          let rec loop x y = begin match () with
            | () when eq (x, (from_int 0)) → from_int 0
            | () when eq (x, (from_int 1)) → y
            | () → y + loop (x + (from_int (-1))) y
          end in loop
    
    end
    
    module Mul_bool : MUL with type t = bool = 
      Mul_default (Eq_bool) (Num_bool)
   
Specific instances can be constructed as needs demand.
   module Mul_int : MUL with type t = int = struct
     include (Eq_int : EQ with type t = int)
     include (Num_int : NUM with type t := int)
    
     let mul = Pervasives.( * )
   end
    
   let dot : α mul_impl → α list → α list → α =
     fun (type a) (mul : a mul_impl) →
       fun xs ys →
         let module M = (val mul : MUL with type t = a) in
         sum (module M : NUM with type t = a)@@ List.map2 M.mul xs ys
    
   let test_dot =
     dot (module Mul_int : MUL with type t = int) [1; 2; 3] [4; 5; 6]
   
Note that in this definition of dot, coercision of the provided Mul instance to its base Num instance is performed.

This last section provides an example of polymorphic recursion utilizing the dynamic production of evidence by way of the show_list function presented earlier.

   let rec replicate : int → α → α list =
     fun n x → if n <= 0 then [] else x :: replicate (n - 1) x
   
   let rec print_nested : α. α show_impl → int → α → unit =
     fun show_mod → function
     | 0 → fun x → print show_mod x
     | n → fun x → print_nested (show_list show_mod) (n - 1) (replicate n x)
   
   let test_nested =
     let n = read_int () in
     print_nested (module Show_int : SHOW with type t = int) n 5
   


References:
[1] Implementing, and Understanding Type Classes -- Oleg Kiselyov
[2] Modular Type Classes -- Harper et. al.

Wednesday, October 26, 2016

Haskell type-classes in OCaml and C++

Haskell type-classes in OCaml and C++

This article examines the emulation of Haskell like type-classes in OCaml and C++. It follows [1] closely (recommended for further reading), extending on some of the example code given there to include C++.

First stop, a simplified version of the Show type-class with a couple of simple instances.

    class Show a where
      show :: a -> string

    instance Show Int where
      show x = Prelude.show x -- internal

    instance Show Bool where
      str True = "True"
      str False = "False"
    
The OCaml equivalent shown here uses the "dictionary passing" technique for implementation. The type-class declaration Show in Haskell translates to a data-type declaration for a polymorphic record α show in OCaml.
    type α show = {
      show : α → string
    }

    let show_bool : bool show = {
      show = function | true → "True" | false → "False"
    }

    let show_int : int show = {
      show = string_of_int
    }
    
In C++ we can use a template class to represent the type-class and specializations to represent the instances.
      template <class A> struct Show {};

      template <>
      struct Show<int> {
        static std::string (*show)(int);
      };
      std::string(*Show<int>::show)(int) = &std::to_string;

      template <>
      struct Show<bool> {
        static std::string show (bool);
      };
      std::string Show<bool>::show (bool b) { return b ? "true" : "false"; }
    

Next up print, a parametrically polymorphic function.

      print :: Show a => a -> IO ()
      print x = putStrLn$ show x
    
According to our dictionary passing scheme in OCaml, this renders as the following.
      let print : α show → α → unit = 
        fun {show} → fun x → print_endline@@ show x
    
The key point to note here is that in OCaml, evidence of the α value's membership in the Show class must be produced explicitly by the programmer. In C++, like Haskell, no evidence of the argument's membership is required, the compiler keeps track of that implicitly.
    template <class A>
    void print (A const& a) {
      std::cout << Show<A>::show (a) << std::endl;
    }
    

This next simplified type-class shows a different pattern of overloading : the function fromInt is overloaded on the result type and the (+) function is binary.

    class Num a where
      fromInt :: Int -> a
      (+)     :: a -> a -> a

    sum :: Num a => [a] -> a
    sum ls = foldr (+) (fromInt 0) ls
    
Translation into OCaml is as in the following.
    type α num = {
      from_int : int → α;
      add      : α → α → α;
    }

    let sum : α num → α list → α = 
      fun {from_int; add= ( + )} → 
        fun ls →
          List.fold_right ( + ) ls (from_int 0)
    
Translation into C++, reasonably mechanical. One slight disappointment is that it doesn't seem possible to get the operator '+' syntax as observed in both the Haskell and OCaml versions.
    template <class A>
    struct Num {};
    
    namespace detail {
      template <class F, class A, class ItT>
      A fold_right (F f, A z, ItT begin, ItT end) {
        if (begin == end) return z;
        return f (fold_right (f, z, std::next (begin), end), *begin);
      }
    }//namespace<detail> 
    
    template <class ItT>
    typename std::iterator_traits<ItT>::value_type 
    sum (ItT begin, ItT end) {
      using A = typename std::iterator_traits<ItT>::value_type;
      auto add = Num<A>::add;
      auto from_int = Num<A>::from_int;
      return detail::fold_right (add, from_int (0), begin, end);
    }
    
In Haskell, Int is made a member of Num with this declaration.
     instance Num Int where
       fromInt x = x
       (+)       = (Prelude.+)
   
Returning to OCaml, we can define a couple of instances including the one above like this.
    let int_num  : int num  = { 
      from_int = (fun x → x); 
      add      = Pervasives.( + ); 
    }
    
    let bool_num : bool num = {
      from_int = (function | 0 → false | _ → true);
      add = function | true → fun _ → true | false → fun x → x
    }
   
The code defining those above instances in C++ follows.
    template <>
    struct Num<int> {
      static int from_int (int);
      static int add (int, int);
    };
    
    int Num<int>::from_int (int i) { return i; }
    int Num<int>::add (int x, int y) { return x + y; }
    
    template <>
    struct Num<bool> {
      static bool from_int (int);
      static bool add (bool, bool);
    };
    
    bool Num<bool>::from_int (int i) { return i != 0; }
    bool Num<bool>::add (bool x, bool y) { if (x) return true; return y; }
   
Here now is a function with two type-class constraints.
    print_incr :: (Show a, Num a) => a -> IO ()
    print_incr x = print$ x + fromInt 1
   
In OCaml this can be written like so.
    let print_incr : (α show * α num) → α → unit = 
      fun (show, {from_int; add= ( + )}) → 
        fun x → print show (x + (from_int 1))
   
In C++, this is said as you see below.
    template <class A>
    void print_incr (A x) {
      print (Num<A>::add (x, Num<A>::from_int (1)));
    }
   
Naturally, the above function will only be defined for types A that are members of both the Show and Num classes and will yield compile errors for types that are not.

Moving on, we now look at another common pattern, an instance with a constraint : a Show instance for all list types [a] when the element instance is a member of Show.

    instance Show a => Show [a] where
      show xs = "[" ++ go True xs
        where
          go _ [] = "]"
          go first (h:t) =
            (if first then "" else ", ") ++ show h ++ go False t
   

In OCaml, this takes the form of a function. The idea is, given evidence of a type α's membership in Show the function produces evidence that the type α list is also in Show.
    let show_list : α show → α list show =
      fun {show} →
        {show = fun xs →
          let rec go first = function
            | [] → "]"
            | h :: t →
              (if (first) then "" else ", ") ^ show h ^ go false t in
          "[" ^ go true xs
        }
   
It might be possible to do better than the following partial specialization over vector<> in C++ (that is, to write something generic, just once, that works for a wider set ofsequence types) using some advanced meta-programming "hackery", I don't really know. I suspect finding out might be a bit of a rabbit hole best avoided for now.
    template <class A>
    struct Show<std::vector<A>> {
      static std::string show (std::vector<A> const& ls);
    };
    
    template <class A>
    std::string Show<std::vector<A>>::show (std::vector<A> const& ls) {
      bool first=true;
      typename std::vector<A>::const_iterator begin=ls.begin (), end=ls.end ();
      std::string s="[";
      while (begin != end) {
        if (first) first = false;
        else s += ", ";
        //A compile time error will result here if if there is no
        //evidence that `A` is in `Show`
        s += Show<A>::show (*begin++);
      }
      s += "]";
    
      return s;
    }
   

In this next example, we need a type-class describing types that can be compared for equality, Eq. That property and the Num class can be combined to produce a type-class with a super-class and a default.

    class Eq where
      (==) :: a -> a -> bool
      (/=) :: a -> a -> bool
  
    deriving instance Eq Bool
    deriving instance Eq Int

    class (Eq a, Num a) => Mul a where
      (*) :: a -> a -> a
      x * _ | x == fromInt 0 = fromInt 0
      x * y | x == fromInt 1 = y
      x * y | y + (x + (fromInt (-1))) * y

    dot :: Mul a => [a] -> [a] -> a
    dot xs ys = sum$ zipWith ( * ) xs ys
   
Modeling the above in OCaml is done with a dictionary for the Mul type-class and a function to generate instances from super-class instances.
    type α mul = {
      mul_super : α eq * α num;
      mul : α → α → α
    }
    
    let mul_default : α eq * α num → α mul = 
      fun (({eq}, {from_int; add = ( + )}) as super) → 
        {
          mul_super = super;
          mul = let rec loop x y = begin match () with
          | () when eq x (from_int 0) → from_int 0
          | () when eq x (from_int 1) → y
          | () → y + loop (x + (from_int (-1))) y 
          end in loop
        }
    
    let bool_mul : bool mul = 
      mul_default (bool_eq, bool_num)
    
    let int_mul : int mul = {
      mul_super = (int_eq, int_num);
      mul = Pervasives.( * )
    }

    let dot : α mul → α list → α list → α = 
      fun {mul_super = (eq, num); mul} →
        fun xs ys → sum num@@ List.map2 mul xs ys
   
As one would expect, expressing the base class/derived class relationships in C++ is playing to its strengths.
    template <class A> struct Eq {};
    
    template <>
    struct Eq<bool> {
      static bool eq (bool, bool);
      static bool neq (bool, bool);
    };
    
    bool Eq<bool>::eq (bool s, bool t) { return s == t; }
    bool Eq<bool>::neq (bool s, bool t) { return s != t; }
    
    template <>
    struct Eq<int> {
      static int eq (int, int);
      static int neq (int, int);
    };
    
    int Eq<int>::eq (int s, int t) { return s == t; }
    int Eq<int>::neq (int s, int t) { return s != t; }

    template <class A>
    struct Mul : Eq<A>, Num <A> {
      using Eq<A>::eq;
      using Num<A>::add;
      using Num<A>::from_int;
    
      static A mul (A x, A y);
    };
    
    template <class A>
    A Mul<A>::mul (A x, A y) {
      if (eq (x, from_int (0))) return from_int (0);
      if (eq (x, from_int (1))) return y;
      return add (y, mul ((add (x, from_int (-1))), y));
    }
    
    template struct Mul<bool>;
    template <> int Mul<int>::mul (int x, int y) { return x * y; }
    
    namespace detail{
    
      template <class F, class It, class Acc>
      Acc map2 (F f
       , It xs_begin, It xs_end, It ys_begin, It ys_end, Acc acc) {
        if ((xs_begin == xs_end) || (ys_begin == ys_end)) return acc;
        return map2 (f
              , std::next (xs_begin)
              , xs_end
              , std::next (ys_begin)
              , ys_end
              , *acc++ = f (*xs_begin, *ys_begin));
      }
    
    }//namespace detail
    
    template <class A>
    A dot (std::vector<A> const& xs, std::vector<A> const& ys) {
      std::vector<A> buf;
      detail::map2 (
         Mul<A>::mul
       , xs.begin (), xs.end()
       , ys.begin (), ys.end ()
       , std::back_inserter(buf));
      return sum (buf.begin (), buf.end ());
    }
   

This very last example is in polymorphic recursion. The Haskell reads as follows.

   print_nested :: Show a => Int -> a -> IO ()
   print_nested 0 x = print x
   print_nested n x = print_nested (n - 1) (replicate n x)

   test_nested = do
     n <- getLine
     print_nested (read n) (5::Int)
   

Those two functions are very interesting! Translating it to OCaml yields the following.
    let rec replicate : int → α → α list = 
      fun n x → if n >= 0 then [] else x :: replicate (n - 1) x
    
    let rec print_nested : α. α show → int → α → unit =
      fun show_dict → function
      | 0 →
          fun x →
            print show_dict x
      | n → 
          fun x →
            print_nested (show_list show_dict) (n - 1) (replicate n x)

    let test_nested =
      let n = read_int () in
      print_nested show_int n 5
   
Now if you examine the output of the above if '4' (say) was entered, you'll see something like this:
   [[[[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]], [[5, 5, 5, 5], [5, 5,
   5, 5], [5, 5, 5, 5]]]]
   
You can see, looking at this, that the type of the printed list is not determinable at compile-time. It is dependent on a runtime parameter! It follows that the evidence that the type is in the Show class can not be produced statically. It has to be computed dynamically which is what you see there in the application of show_list to the current show_dict in the n <> 0 branch of the print_nested function. Note also the requirement for the universal quantifier in the function signature. It's mandatory.

OK, so how about the above code in C++? Well a naive transliteration gives the following.

    namespace detail {
      template<class A, class ItT>
      ItT replicate (int n, A x, ItT dst) {
        if (n <= 0) return dst;
        return replicate ((n - 1), x, *dst++ = x);
      }
    
    }//namespace detail
    
    template <class A>
    void print_nested (int n, A const& x) {
      if (n == 0)
        print (x);
      else {
        std::vector<A> buf;
        detail::replicate(n, x, std::back_inserter(buf));
        print_nested (n - 1, buf);
      }
    }
    
    void test_nested () {
      int n;
      std::cin >> n;
      print_nested (n, 5);
    }
   
Unfortunately though, this program though exhibits unbounded compile time recursion (compilation doesn't terminate).


References:
[1] Implementing, and Understanding Type Classes -- Oleg Kiselyov

Wednesday, October 12, 2016

Monty Hall

Suppose you're on a game show, and you're given the choice of three doors : Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No.3 which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

What do you think?

This problem is known as the "Monty Hall" problem. It's named after the host of the American television game show "Let's make a deal".

Paul Erdős, one of the most prolific mathematicians in history remained unconvinced (of the correct answer to the above problem) until he was shown a computer simulation confirming the predicted result.

Here's a simulation in OCaml one hopes, may have convinced Paul!

module Monty = struct

  (*[dtr w p n] where [n] is the number of doors, selects which door
    to remain (closed) given the winning door [w] and the player
    chosen door [p]*)
  let rec dtr w p n =
    if p <> w then  w 
    else if p = 0 then n - 1 else 0

  (*[gen_game d] generates a game with [d] doors and returns a game
    with a winning door, a player selected door and a door to keep
    closed before if the player wants to switch*)
  let gen_game (d : int) : (int * int * int) =
    let w = Random.int d and p = Random.int d in 
    (w, p, dtr w p d)

  let num_wins = ref 0 (*To keep track of scores*)
  type strategy = Hold | Switch (*The type of strategies*)

  (*Play a single game*)
  let play_game (d : int) (s : strategy) : unit =
    let w, p, r = gen_game d in
    match s with
    | Hold → num_wins := !num_wins + if p = w then 1 else 0
    | Switch → num_wins := !num_wins + if r = w then 1 else 0

  (*Play a set of [n] games*)
  let play_games (d : int) (n : int) (s : strategy ) : unit = 
    let rec loop (i : int) : unit = 
      if i = n then ()
      else  begin
        play_game d s;
        loop (i + 1)
      end 
    in loop 0

end

open Monty

(*Initialized from the command line*)
let version       = ref false
let num_doors     = ref 0
let num_sims      = ref 0
let read_args () =
  let specification =
    [("-v", Arg.Set version, "Print the version number");
     ("-d", Arg.Set_int num_doors, "Number of doors (>= 3)" );
     ("-n", Arg.Set_int num_sims, "Number of simulations (>= 1)");
    ]
  in Arg.parse specification
  (fun s →
    Printf.printf "Warning : Ignoring unrecognized argument \"%s\"\n" s)
  "Usage : monty -d <number of doors> -n <number of simulations>"

(*[fabs e] computes the absolute value of [e]*)
let fabs (e : float) : float = if e < 0. then ~-.e else e

(*Driver*)
let () = 
  try
    read_args ();
    if !version then Printf.printf "1.0.0\n"
    else
      let n = !num_sims and d = !num_doors in
      if d < 3 then
        raise (Invalid_argument "Number of doors must be >= than 3");
      if n < 1 then
        raise (Invalid_argument "Number of simulations must be >= 1");
      begin
        (*Hold*)
        num_wins := 0;
        play_games d n Hold;
        Printf.printf "Num wins (hold): %d\n" !num_wins;
        let err=fabs (float_of_int (!num_wins) /. 
                    (float_of_int n) -. 1.0 /. (float_of_int d)) in
        Printf.printf "Error %f\n" err;
        (*Switch*)
        num_wins := 0;
        play_games d n Switch;
        Printf.printf "Num wins (switch): %d\n" !num_wins;
        let err=fabs (float_of_int (!num_wins) /. 
                   (float_of_int n) -. (float_of_int (d - 1) /. 
                                                (float_of_int d))) in
        Printf.printf "Error %f\n" err ;
      end

  with 
  | Invalid_argument s → Printf.printf "%s\n" s

Wednesday, October 5, 2016

Conversion operations of the lambda-calculus

Conversion

Abstract

This note provides a super lightweight explanation of the three conversion operations of the $\lambda$-calculus known (cryptically) as $\alpha$, $\beta$ and $\eta$ conversions respectively (borrowed fairly freely from the delightful reference given at the bottom.)

Syntax vs. semantics

The $\textbf{syntax}$ of the language of $\textit{$\lambda$- expressions}$ is \[ \begin{eqnarray} <exp> & ::= & <constant>\;\; & \text{Constants} \nonumber \\ & \mid & <variable>\;\; & \text{Variables} \nonumber \\ & \mid & <exp> <exp>\;\; & \text{Applications} \nonumber \\ & \mid & \lambda<variable>.<exp>\;\; & \text{Abstractions} \nonumber \end{eqnarray} \] The $\textbf{semantics}$ of the the $\lambda$-calculus is defined by three $\textit{conversion rules}$. To understand them requires the terminology of $\textit{free}$ and $\textit{bound}$ variables. An occurence of a variable in a $\lambda$-expression is bound if there is an enclosing abstraction that binds it, and is free otherwise. For example, in $\lambda x.+\; ((\lambda y. +\;y\; z)\;7)\;x$, $x$ and $y$ appear bound whereas $z$ appears free.

$\beta$-conversion

$\beta$-reduction describes how to apply a function to an argument. For example, $\left(\lambda x.+\;x\;1\right)\; 4$ denotes the application of a particular $\lambda$-abstraction to the argument $4$. The result of applying a $\lambda$-abstraction to an argument is an instance of the body of the $\lambda$-abstraction in which (free) occurences of the formal parameter in the body are replaced with (copies of) the argument. Thus, $\left(\lambda x.+\;x\;1\right)\; 4 \rightarrow +\;4\;1 \rightarrow 5$. In the event there are no occurences of the formal parameter in the abstraction body, the argument is discarded unused so, $(\lambda x.\;3)\;4 \rightarrow 3$. Care is needed when formal parameter names are not unique. For example, \[ \begin{eqnarray} & & \left(\lambda x.\;\left(\lambda x.+ \left(-\;x\;1\right)\right)\;x\;3\right)\; 9 \nonumber \\ & \rightarrow & \left(\lambda x.+ \left(-\;x\;1\right)\right)\;9\;3 \nonumber \\ & \rightarrow & +\;\left(-\;9\;1\right)\;3 \nonumber \\ & \rightarrow & +\;8\;3 \nonumber \\ & \rightarrow & 11 \nonumber \end{eqnarray} \] The key point of that example is that we did not substitue for the inner $x$ in the first reduction because it was not free in the body of the outer $\lambda x$ abstraction. Indeed, in the OCaml top-level we observe

       # (fun x -> (fun x -> ( + ) (( - ) x 1)) x 3) 9 ;;
       - : int = 11
      
or equivalently, in C++,
      auto add = [](int x) { return [=](int y) { return x + y; }; };
      auto sub = [](int x) { return [=](int y) { return x - y; }; };
      [=](int x) { 
        return [=](int x) {
          return add (sub (x) (1)); 
          } (x) (3); 
      } (9) ; //is the value '11'
      
The $\beta$-rule applied backwards is called $\beta$-abstraction and written with a backwards reduction arrow '$\leftarrow$'. Thus, $+\;4\;1 \leftarrow (\lambda x.\;+\;1\;x)\;4$. $\beta$-conversion means reduction or abstraction and is written with a double-ended arrow augmented with a $\beta$ (in order to distinguish it from other forms of conversion). So, $+\;4\;1 \underset{\beta}{\leftrightarrow} (\lambda x.\;+\;1\;x)\;4$. One way to look at $\beta$ conversion is that it is saying something about $\lambda$-expressions that look different but mean the same thing.

$\alpha$-conversion

It seems obvious that the two abstractions $\lambda x.+\;x\;1$ and $\lambda y.+\;y\;1$ "ought" to be equivalent. $\alpha$-conversion allows us to change the name of a formal parameter as long as it is done consistently. So we write $\lambda x.+\;x\;1 \underset{\alpha}{\leftrightarrow} \lambda y.+\;y\;1$. Of course, the newly introduced name must not occur free in the body of the original $\lambda$-abstraction.

$\eta$-conversion

This last conversion rule exists to to complete our intuition about what $\lambda$-abstractions "ought" to be equivalent. The rule is this : If $f$ denotes a function, $x$ a variable that does not occur free in $f$, then $\lambda x.f\;x \underset{\eta}{\leftrightarrow} f$. For example, in OCaml if we define f by let f x = x + 1 then clearly fun x -> f x produces the same results for all values x in the domain of f.

Summary

The first section provides a set of formal rules for constructing expressions (the BNF grammar). Using the notation $E\;\left[M/x\right]$ to mean the expression $E$ with $M$ substituted for free occurrences of $x$ we can succintly state the the rules for converting one expression into an equivalent one as \[ \begin{eqnarray} x\;\left[M/x\right] & = & M \nonumber \\ c\;\left[M/x\right] & = & c\;\;\text{where $c$ is any variable or constant other than $x$} \nonumber \\ \left(E\;F\right)\;\left[M/x\right] & = & E\left[M/x\right]\; F\left[M/x\right]\; \nonumber \\ \left(\lambda x.E\right)\;\left[M/x\right] & = & \lambda x.E \nonumber \\ \left(\lambda y.E\right)\;\left[M/x\right] & & \text{where $y$ is any variable other than $x$} \nonumber \\ & = & \lambda y.E\left[M/x\right]\;\text{if $x$ does not occur free in E or $y$ does not occur free in $M$} \nonumber \\ & = & \lambda z.\left(E\left[z/y\right]\right)\left[M/x\right]\;\text{otherwise}\nonumber \\ \end{eqnarray} \]


References:
[1] The Implementation of Functional Programming Languages by Simon L. Peyton Jones. 1987.

Tuesday, October 4, 2016

Eliminating left-recursion

Eliminating left recursion

Consider a non-terminal $A$ with two productions $A \rightarrow A \alpha \mid \beta$ where $\alpha$, $\beta$ are sequences of terminals and non-terminals that do not start with $A$. It produces strings of the form $\beta\alpha^{*}$. This rule for $A$ exhibits direct left recursion. The left recursion can be turned into right recursion by rewriting in terms of a new non-terminal $R$ as $A \rightarrow \beta R \nonumber$ and $R \rightarrow \alpha R \mid \epsilon$.

Generally, immediate left recursion can be eliminated by the following technique which works for any number of $A$ productions. First group as \[ \begin{equation} A \rightarrow A\alpha_{1} \mid A\alpha_{2} \mid \cdots \mid A\alpha_{m} \mid \beta_{1} \mid \cdots \mid \beta_{n} \end{equation} \] where no $\beta_{i}$ begins with an $A$. Then replace the $A$ productions by \[ \begin{eqnarray} A &\rightarrow& \beta_{1}A^{\prime} \mid \beta_{2}A^{\prime} \mid \cdots \mid \beta_{n}A^{\prime} \nonumber \\ A^{\prime} &\rightarrow& \alpha_{1}A^{\prime} \mid \alpha_{2}A^{\prime} \mid \cdots \mid \alpha_{m}A^{\prime} \mid \epsilon \nonumber \end{eqnarray} \] This procedure eliminates all direct left recursion from the $A$ and $A^{\prime}$ rules (provided no $\alpha_{i}$ is $\epsilon$). For example, the language of arithmetic expressions might be written \[ \begin{eqnarray} & E &\rightarrow E + T \mid E - T \mid T \nonumber \\ & T &\rightarrow T * F \mid T \mathbin{/} F \mid F \nonumber \\ & F &\rightarrow \left( E \right) \mid \mathbf{id} \nonumber \end{eqnarray} \] which, on applying the above procedure yields \[ \begin{eqnarray} & E &\rightarrow T\;E^{\prime} \nonumber \\ & E^{\prime} &\rightarrow +\;T\;E^{\prime} \mid -\;T\;E^{\prime}\nonumber \mid \epsilon \nonumber \\ & T &\rightarrow F\;T^{\prime} \nonumber \\ & T^{\prime} &\rightarrow *\;F\;T^{\prime} \mid \mathbin{/}\;F\;T^{\prime} \mid \epsilon \nonumber \\ & F &\rightarrow \left( E \right) \mid \mathbf{id} \nonumber \end{eqnarray} \] Consider the grammar \[ \begin{eqnarray} S &\rightarrow& A\;a \mid b \nonumber \\ A &\rightarrow& A\;c \mid S\;d \mid \epsilon \nonumber \end{eqnarray} \] The non-terminal $S$ is left recursive because $S \Rightarrow A\;a \Rightarrow S\;d\;a$ but it is not immediately left recursive. The procedure given above does not eliminate left recursion of this kind. It is however amenable to the following approach. First order the non-terminals $S$ then $A$. We'd start by eliminating direct left recursion from the $S$-productions but there is none among the $S$-productions so we move onto the $A$-productions. First we eliminate $S$ from the $A$-productions by substitution to obtain the following $A$-productions \[ \begin{equation} A \rightarrow A\;c \mid A\;a\;d \mid b\;d \mid \epsilon \nonumber \end{equation} \] and now eliminate the direct left recursion in the $A$ to get \[ \begin{eqnarray} & S & \rightarrow A\;a \mid b \nonumber \\ & A & \rightarrow b\;d\;A^{\prime} \mid A^{\prime} \nonumber \\ & A^{\prime} & \rightarrow c\;A^{\prime} \mid a\;d\;A^{\prime} \mid \epsilon \nonumber \end{eqnarray} \] Technically, the above approach is only guaranteed to work when the grammar to which it is applied has no cycles or $\epsilon$-productions. The above example violates this in that the rule for $A$ contained an $\epsilon$-production but it turns out in this case to be harmless. Generalizing, assuming an input grammar $G$ with no cycles or $\epsilon$-productions, an equivalent grammar with no left recursion can be found by, arranging the nonterminals of $G$, $A_{1}, A_{2}, \dots, A_{n}$ say, then visiting each in order, for each $A_{i}$, replace each production of the form $A_{i} \rightarrow A_{j}\gamma$ by the productions $A_{i} \rightarrow \delta_{1}\gamma \mid \delta_{2}\gamma \mid \cdots \mid \delta_{k}\gamma$ where $A_{j} \rightarrow \delta_{1} \mid \delta_{2} \mid \cdots \mid \delta_{k}$ are all the current $A_{j}$ productions, $j < i$ and then elminiate the immediate left recursion among the $A_{i}$ productions.

One of the pre-conditions of the algorithm of the previous section is that the input grammar $G$ contain no $\epsilon$-productions. So, we seek a method for eliminating $\epsilon$-productions where we can. To begin, we define a non-terminal $A$ of a grammar $G$ $\textit{nullable}$ if, $A \overset{*}{\Rightarrow} \epsilon$. A non-terminal is nullable if, in $G$, $A \rightarrow \epsilon$ or if $A \rightarrow A_{1}A_{2} \cdots A_{k}$ and each $A_{i}$ is nullable. To illustrate the procedure, let $G$ be given as: \[ \begin{eqnarray} S &\rightarrow& A\;B \nonumber \\ A &\rightarrow& A\;a\;A \mid \epsilon \nonumber \\ B &\rightarrow& B\;b\;B \mid \epsilon \nonumber \end{eqnarray} \] In this grammar all of $S$, $A$ and $B$ are nullable. The new grammar introduces a new start rule $S^{\prime} \rightarrow S$ and since $S$ is nullable we also add an $\epsilon$ alternative to conclude $S^{\prime} \rightarrow S \mid \epsilon$. Now, for each rule $A \rightarrow X_{1} X_{2} \dots X_{k}$ create rules, $A \rightarrow \alpha_{1}\alpha_{2}\cdots\alpha_{k}$ where \[ \begin{equation} \alpha_{i} = \begin{cases} X_{i} & \text{if $X_{i}$ is a terminal/non-nullable non-terminal} \\ X_{i}\; \text{or}\;\epsilon & \text{if $X_{i}$ is nullable} \end{cases} \end{equation} \] and not all $\alpha_{i}$ are nullable. Applying this procedure then, we get \[ \begin{eqnarray} & S^{\prime} &\rightarrow S \mid \epsilon \nonumber \\ & S &\rightarrow A\;B \mid A \mid B \nonumber \\ & A &\rightarrow A\;a\;A \mid a\;A \mid A\; a \mid a \nonumber \\ & B &\rightarrow B\;b\;B \mid b\;B \mid B\; b \mid b \nonumber \end{eqnarray} \] The net effect is that $\epsilon$-productions have been eliminated but for the $S^{\prime}$ production which does not appear on the right-hand-side of any other rule.


References:
[1] Compilers Principles, Techniques, & Tools by Aho et. al. 2nd Ed. 2007.