This article examines the emulation of Haskell like type-classes in OCaml and C++. It follows [1] closely (recommended for further reading), extending on some of the example code given there to include C++.
First stop, a simplified version of the Show
type-class with a couple of simple instances.
class Show a where show :: a -> string instance Show Int where show x = Prelude.show x -- internal instance Show Bool where str True = "True" str False = "False"The OCaml equivalent shown here uses the "dictionary passing" technique for implementation. The type-class declaration
Show
in Haskell translates to a data-type
declaration for a polymorphic record α show
in
OCaml.
type α show = { show : α → string } let show_bool : bool show = { show = function | true → "True" | false → "False" } let show_int : int show = { show = string_of_int }In C++ we can use a template class to represent the type-class and specializations to represent the instances.
template <class A> struct Show {}; template <> struct Show<int> { static std::string (*show)(int); }; std::string(*Show<int>::show)(int) = &std::to_string; template <> struct Show<bool> { static std::string show (bool); }; std::string Show<bool>::show (bool b) { return b ? "true" : "false"; }
Next up print
, a parametrically polymorphic function.
print :: Show a => a -> IO () print x = putStrLn$ show xAccording to our dictionary passing scheme in OCaml, this renders as the following.
let print : α show → α → unit = fun {show} → fun x → print_endline@@ show xThe key point to note here is that in OCaml, evidence of the
α
value's membership in
the Show
class must be produced explicitly by the
programmer. In C++, like Haskell, no evidence of the argument's
membership is required, the compiler keeps track of that
implicitly.
template <class A> void print (A const& a) { std::cout << Show<A>::show (a) << std::endl; }
This next simplified type-class shows a different pattern of
overloading : the function fromInt
is overloaded on
the result type and the (+)
function is binary.
class Num a where fromInt :: Int -> a (+) :: a -> a -> a sum :: Num a => [a] -> a sum ls = foldr (+) (fromInt 0) lsTranslation into OCaml is as in the following.
type α num = { from_int : int → α; add : α → α → α; } let sum : α num → α list → α = fun {from_int; add= ( + )} → fun ls → List.fold_right ( + ) ls (from_int 0)Translation into C++, reasonably mechanical. One slight disappointment is that it doesn't seem possible to get the operator '
+
' syntax as observed in both the Haskell
and OCaml versions.
template <class A> struct Num {}; namespace detail { template <class F, class A, class ItT> A fold_right (F f, A z, ItT begin, ItT end) { if (begin == end) return z; return f (fold_right (f, z, std::next (begin), end), *begin); } }//namespace<detail> template <class ItT> typename std::iterator_traits<ItT>::value_type sum (ItT begin, ItT end) { using A = typename std::iterator_traits<ItT>::value_type; auto add = Num<A>::add; auto from_int = Num<A>::from_int; return detail::fold_right (add, from_int (0), begin, end); }In Haskell,
Int
is made a member of Num
with this declaration.
instance Num Int where fromInt x = x (+) = (Prelude.+)Returning to OCaml, we can define a couple of instances including the one above like this.
let int_num : int num = { from_int = (fun x → x); add = Pervasives.( + ); } let bool_num : bool num = { from_int = (function | 0 → false | _ → true); add = function | true → fun _ → true | false → fun x → x }The code defining those above instances in C++ follows.
template <> struct Num<int> { static int from_int (int); static int add (int, int); }; int Num<int>::from_int (int i) { return i; } int Num<int>::add (int x, int y) { return x + y; } template <> struct Num<bool> { static bool from_int (int); static bool add (bool, bool); }; bool Num<bool>::from_int (int i) { return i != 0; } bool Num<bool>::add (bool x, bool y) { if (x) return true; return y; }Here now is a function with two type-class constraints.
print_incr :: (Show a, Num a) => a -> IO () print_incr x = print$ x + fromInt 1In OCaml this can be written like so.
let print_incr : (α show * α num) → α → unit = fun (show, {from_int; add= ( + )}) → fun x → print show (x + (from_int 1))In C++, this is said as you see below.
template <class A> void print_incr (A x) { print (Num<A>::add (x, Num<A>::from_int (1))); }Naturally, the above function will only be defined for types
A
that are members of both the Show
and Num
classes and will yield compile errors for
types that are not.
Moving on, we now look at another common pattern, an instance
with a constraint : a Show
instance for all list
types [a]
when the element instance is a
member of
Show
.
instance Show a => Show [a] where show xs = "[" ++ go True xs where go _ [] = "]" go first (h:t) = (if first then "" else ", ") ++ show h ++ go False tIn OCaml, this takes the form of a function. The idea is, given evidence of a type
α
's membership
in Show
the function produces evidence that the
type α list
is also in Show
.
let show_list : α show → α list show = fun {show} → {show = fun xs → let rec go first = function | [] → "]" | h :: t → (if (first) then "" else ", ") ^ show h ^ go false t in "[" ^ go true xs }It might be possible to do better than the following partial specialization over
vector<>
in C++ (that is, to
write something generic, just once, that works for a wider set
ofsequence types) using some advanced meta-programming "hackery", I
don't really know. I suspect finding out might be a bit of a rabbit
hole best avoided for now.
template <class A> struct Show<std::vector<A>> { static std::string show (std::vector<A> const& ls); }; template <class A> std::string Show<std::vector<A>>::show (std::vector<A> const& ls) { bool first=true; typename std::vector<A>::const_iterator begin=ls.begin (), end=ls.end (); std::string s="["; while (begin != end) { if (first) first = false; else s += ", "; //A compile time error will result here if if there is no //evidence that `A` is in `Show` s += Show<A>::show (*begin++); } s += "]"; return s; }
In this next example, we need a type-class describing types that
can be compared for equality, Eq
. That property and
the Num
class can be combined to produce a
type-class with a super-class and a default.
class Eq where (==) :: a -> a -> bool (/=) :: a -> a -> bool deriving instance Eq Bool deriving instance Eq Int class (Eq a, Num a) => Mul a where (*) :: a -> a -> a x * _ | x == fromInt 0 = fromInt 0 x * y | x == fromInt 1 = y x * y | y + (x + (fromInt (-1))) * y dot :: Mul a => [a] -> [a] -> a dot xs ys = sum$ zipWith ( * ) xs ysModeling the above in OCaml is done with a dictionary for the
Mul
type-class and a function to generate
instances from super-class instances.
type α mul = { mul_super : α eq * α num; mul : α → α → α } let mul_default : α eq * α num → α mul = fun (({eq}, {from_int; add = ( + )}) as super) → { mul_super = super; mul = let rec loop x y = begin match () with | () when eq x (from_int 0) → from_int 0 | () when eq x (from_int 1) → y | () → y + loop (x + (from_int (-1))) y end in loop } let bool_mul : bool mul = mul_default (bool_eq, bool_num) let int_mul : int mul = { mul_super = (int_eq, int_num); mul = Pervasives.( * ) } let dot : α mul → α list → α list → α = fun {mul_super = (eq, num); mul} → fun xs ys → sum num@@ List.map2 mul xs ysAs one would expect, expressing the base class/derived class relationships in C++ is playing to its strengths.
template <class A> struct Eq {}; template <> struct Eq<bool> { static bool eq (bool, bool); static bool neq (bool, bool); }; bool Eq<bool>::eq (bool s, bool t) { return s == t; } bool Eq<bool>::neq (bool s, bool t) { return s != t; } template <> struct Eq<int> { static int eq (int, int); static int neq (int, int); }; int Eq<int>::eq (int s, int t) { return s == t; } int Eq<int>::neq (int s, int t) { return s != t; } template <class A> struct Mul : Eq<A>, Num <A> { using Eq<A>::eq; using Num<A>::add; using Num<A>::from_int; static A mul (A x, A y); }; template <class A> A Mul<A>::mul (A x, A y) { if (eq (x, from_int (0))) return from_int (0); if (eq (x, from_int (1))) return y; return add (y, mul ((add (x, from_int (-1))), y)); } template struct Mul<bool>; template <> int Mul<int>::mul (int x, int y) { return x * y; } namespace detail{ template <class F, class It, class Acc> Acc map2 (F f , It xs_begin, It xs_end, It ys_begin, It ys_end, Acc acc) { if ((xs_begin == xs_end) || (ys_begin == ys_end)) return acc; return map2 (f , std::next (xs_begin) , xs_end , std::next (ys_begin) , ys_end , *acc++ = f (*xs_begin, *ys_begin)); } }//namespace detail template <class A> A dot (std::vector<A> const& xs, std::vector<A> const& ys) { std::vector<A> buf; detail::map2 ( Mul<A>::mul , xs.begin (), xs.end() , ys.begin (), ys.end () , std::back_inserter(buf)); return sum (buf.begin (), buf.end ()); }
This very last example is in polymorphic recursion. The Haskell reads as follows.
print_nested :: Show a => Int -> a -> IO () print_nested 0 x = print x print_nested n x = print_nested (n - 1) (replicate n x) test_nested = do n <- getLine print_nested (read n) (5::Int)Those two functions are very interesting! Translating it to OCaml yields the following.
let rec replicate : int → α → α list = fun n x → if n >= 0 then [] else x :: replicate (n - 1) x let rec print_nested : α. α show → int → α → unit = fun show_dict → function | 0 → fun x → print show_dict x | n → fun x → print_nested (show_list show_dict) (n - 1) (replicate n x) let test_nested = let n = read_int () in print_nested show_int n 5Now if you examine the output of the above if '4' (say) was entered, you'll see something like this:
[[[[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]], [[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]]]]You can see, looking at this, that the type of the printed list is not determinable at compile-time. It is dependent on a runtime parameter! It follows that the evidence that the type is in the
Show
class can not be produced statically. It has
to be computed dynamically which is what you see there in the
application of show_list
to the
current show_dict
in the n <> 0
branch of the print_nested
function. Note also the
requirement for the universal quantifier in the function
signature. It's mandatory.
OK, so how about the above code in C++? Well a naive transliteration gives the following.
namespace detail { template<class A, class ItT> ItT replicate (int n, A x, ItT dst) { if (n <= 0) return dst; return replicate ((n - 1), x, *dst++ = x); } }//namespace detail template <class A> void print_nested (int n, A const& x) { if (n == 0) print (x); else { std::vector<A> buf; detail::replicate(n, x, std::back_inserter(buf)); print_nested (n - 1, buf); } } void test_nested () { int n; std::cin >> n; print_nested (n, 5); }Unfortunately though, this program though exhibits unbounded compile time recursion (compilation doesn't terminate).
References:
[1] Implementing, and Understanding Type Classes -- Oleg Kiselyov