This article examines the emulation of Haskell like type-classes in OCaml and C++. It follows [1] closely (recommended for further reading), extending on some of the example code given there to include C++.
First stop, a simplified version of the Show
type-class with a couple of simple instances.
class Show a where
show :: a -> string
instance Show Int where
show x = Prelude.show x -- internal
instance Show Bool where
str True = "True"
str False = "False"
The OCaml equivalent shown here uses the "dictionary passing"
technique for implementation. The type-class
declaration Show in Haskell translates to a data-type
declaration for a polymorphic record α show in
OCaml.
type α show = {
show : α → string
}
let show_bool : bool show = {
show = function | true → "True" | false → "False"
}
let show_int : int show = {
show = string_of_int
}
In C++ we can use a template class to represent the type-class and
specializations to represent the instances.
template <class A> struct Show {};
template <>
struct Show<int> {
static std::string (*show)(int);
};
std::string(*Show<int>::show)(int) = &std::to_string;
template <>
struct Show<bool> {
static std::string show (bool);
};
std::string Show<bool>::show (bool b) { return b ? "true" : "false"; }
Next up print, a parametrically polymorphic function.
print :: Show a => a -> IO ()
print x = putStrLn$ show x
According to our dictionary passing scheme in OCaml, this renders as the following.
let print : α show → α → unit =
fun {show} → fun x → print_endline@@ show x
The key point to note here is that in OCaml, evidence of
the α value's membership in
the Show class must be produced explicitly by the
programmer. In C++, like Haskell, no evidence of the argument's
membership is required, the compiler keeps track of that
implicitly.
template <class A>
void print (A const& a) {
std::cout << Show<A>::show (a) << std::endl;
}
This next simplified type-class shows a different pattern of
overloading : the function fromInt is overloaded on
the result type and the (+) function is binary.
class Num a where
fromInt :: Int -> a
(+) :: a -> a -> a
sum :: Num a => [a] -> a
sum ls = foldr (+) (fromInt 0) ls
Translation into OCaml is as in the following.
type α num = {
from_int : int → α;
add : α → α → α;
}
let sum : α num → α list → α =
fun {from_int; add= ( + )} →
fun ls →
List.fold_right ( + ) ls (from_int 0)
Translation into C++, reasonably mechanical. One slight
disappointment is that it doesn't seem possible to get the
operator '+' syntax as observed in both the Haskell
and OCaml versions.
template <class A>
struct Num {};
namespace detail {
template <class F, class A, class ItT>
A fold_right (F f, A z, ItT begin, ItT end) {
if (begin == end) return z;
return f (fold_right (f, z, std::next (begin), end), *begin);
}
}//namespace<detail>
template <class ItT>
typename std::iterator_traits<ItT>::value_type
sum (ItT begin, ItT end) {
using A = typename std::iterator_traits<ItT>::value_type;
auto add = Num<A>::add;
auto from_int = Num<A>::from_int;
return detail::fold_right (add, from_int (0), begin, end);
}
In Haskell, Int is made a member of Num
with this declaration.
instance Num Int where
fromInt x = x
(+) = (Prelude.+)
Returning to OCaml, we can define a couple of instances including the one above like this.
let int_num : int num = {
from_int = (fun x → x);
add = Pervasives.( + );
}
let bool_num : bool num = {
from_int = (function | 0 → false | _ → true);
add = function | true → fun _ → true | false → fun x → x
}
The code defining those above instances in C++ follows.
template <>
struct Num<int> {
static int from_int (int);
static int add (int, int);
};
int Num<int>::from_int (int i) { return i; }
int Num<int>::add (int x, int y) { return x + y; }
template <>
struct Num<bool> {
static bool from_int (int);
static bool add (bool, bool);
};
bool Num<bool>::from_int (int i) { return i != 0; }
bool Num<bool>::add (bool x, bool y) { if (x) return true; return y; }
Here now is a function with two type-class constraints.
print_incr :: (Show a, Num a) => a -> IO ()
print_incr x = print$ x + fromInt 1
In OCaml this can be written like so.
let print_incr : (α show * α num) → α → unit =
fun (show, {from_int; add= ( + )}) →
fun x → print show (x + (from_int 1))
In C++, this is said as you see below.
template <class A>
void print_incr (A x) {
print (Num<A>::add (x, Num<A>::from_int (1)));
}
Naturally, the above function will only be defined for
types A that are members of both the Show
and Num classes and will yield compile errors for
types that are not.
Moving on, we now look at another common pattern, an instance
with a constraint : a Show instance for all list
types [a] when the element instance is a
member of Show.
instance Show a => Show [a] where
show xs = "[" ++ go True xs
where
go _ [] = "]"
go first (h:t) =
(if first then "" else ", ") ++ show h ++ go False t
In OCaml, this takes the form of a function. The idea is, given
evidence of a type α's membership
in Show the function produces evidence that the
type α list is also in Show.
let show_list : α show → α list show =
fun {show} →
{show = fun xs →
let rec go first = function
| [] → "]"
| h :: t →
(if (first) then "" else ", ") ^ show h ^ go false t in
"[" ^ go true xs
}
It might be possible to do better than the following partial
specialization over vector<> in C++ (that is, to
write something generic, just once, that works for a wider set
ofsequence types) using some advanced meta-programming "hackery", I
don't really know. I suspect finding out might be a bit of a rabbit
hole best avoided for now.
template <class A>
struct Show<std::vector<A>> {
static std::string show (std::vector<A> const& ls);
};
template <class A>
std::string Show<std::vector<A>>::show (std::vector<A> const& ls) {
bool first=true;
typename std::vector<A>::const_iterator begin=ls.begin (), end=ls.end ();
std::string s="[";
while (begin != end) {
if (first) first = false;
else s += ", ";
//A compile time error will result here if if there is no
//evidence that `A` is in `Show`
s += Show<A>::show (*begin++);
}
s += "]";
return s;
}
In this next example, we need a type-class describing types that
can be compared for equality, Eq. That property and
the Num class can be combined to produce a
type-class with a super-class and a default.
class Eq where
(==) :: a -> a -> bool
(/=) :: a -> a -> bool
deriving instance Eq Bool
deriving instance Eq Int
class (Eq a, Num a) => Mul a where
(*) :: a -> a -> a
x * _ | x == fromInt 0 = fromInt 0
x * y | x == fromInt 1 = y
x * y | y + (x + (fromInt (-1))) * y
dot :: Mul a => [a] -> [a] -> a
dot xs ys = sum$ zipWith ( * ) xs ys
Modeling the above in OCaml is done with a dictionary for
the Mul type-class and a function to generate
instances from super-class instances.
type α mul = {
mul_super : α eq * α num;
mul : α → α → α
}
let mul_default : α eq * α num → α mul =
fun (({eq}, {from_int; add = ( + )}) as super) →
{
mul_super = super;
mul = let rec loop x y = begin match () with
| () when eq x (from_int 0) → from_int 0
| () when eq x (from_int 1) → y
| () → y + loop (x + (from_int (-1))) y
end in loop
}
let bool_mul : bool mul =
mul_default (bool_eq, bool_num)
let int_mul : int mul = {
mul_super = (int_eq, int_num);
mul = Pervasives.( * )
}
let dot : α mul → α list → α list → α =
fun {mul_super = (eq, num); mul} →
fun xs ys → sum num@@ List.map2 mul xs ys
As one would expect, expressing the base class/derived class
relationships in C++ is playing to its strengths.
template <class A> struct Eq {};
template <>
struct Eq<bool> {
static bool eq (bool, bool);
static bool neq (bool, bool);
};
bool Eq<bool>::eq (bool s, bool t) { return s == t; }
bool Eq<bool>::neq (bool s, bool t) { return s != t; }
template <>
struct Eq<int> {
static int eq (int, int);
static int neq (int, int);
};
int Eq<int>::eq (int s, int t) { return s == t; }
int Eq<int>::neq (int s, int t) { return s != t; }
template <class A>
struct Mul : Eq<A>, Num <A> {
using Eq<A>::eq;
using Num<A>::add;
using Num<A>::from_int;
static A mul (A x, A y);
};
template <class A>
A Mul<A>::mul (A x, A y) {
if (eq (x, from_int (0))) return from_int (0);
if (eq (x, from_int (1))) return y;
return add (y, mul ((add (x, from_int (-1))), y));
}
template struct Mul<bool>;
template <> int Mul<int>::mul (int x, int y) { return x * y; }
namespace detail{
template <class F, class It, class Acc>
Acc map2 (F f
, It xs_begin, It xs_end, It ys_begin, It ys_end, Acc acc) {
if ((xs_begin == xs_end) || (ys_begin == ys_end)) return acc;
return map2 (f
, std::next (xs_begin)
, xs_end
, std::next (ys_begin)
, ys_end
, *acc++ = f (*xs_begin, *ys_begin));
}
}//namespace detail
template <class A>
A dot (std::vector<A> const& xs, std::vector<A> const& ys) {
std::vector<A> buf;
detail::map2 (
Mul<A>::mul
, xs.begin (), xs.end()
, ys.begin (), ys.end ()
, std::back_inserter(buf));
return sum (buf.begin (), buf.end ());
}
This very last example is in polymorphic recursion. The Haskell reads as follows.
print_nested :: Show a => Int -> a -> IO ()
print_nested 0 x = print x
print_nested n x = print_nested (n - 1) (replicate n x)
test_nested = do
n <- getLine
print_nested (read n) (5::Int)
Those two functions are very interesting! Translating it to OCaml
yields the following.
let rec replicate : int → α → α list =
fun n x → if n >= 0 then [] else x :: replicate (n - 1) x
let rec print_nested : α. α show → int → α → unit =
fun show_dict → function
| 0 →
fun x →
print show_dict x
| n →
fun x →
print_nested (show_list show_dict) (n - 1) (replicate n x)
let test_nested =
let n = read_int () in
print_nested show_int n 5
Now if you examine the output of the above if '4' (say) was
entered, you'll see something like this:
[[[[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]], [[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]]]]You can see, looking at this, that the type of the printed list is not determinable at compile-time. It is dependent on a runtime parameter! It follows that the evidence that the type is in the
Show class can not be produced statically. It has
to be computed dynamically which is what you see there in the
application of show_list to the
current show_dict in the n <> 0
branch of the print_nested function. Note also the
requirement for the universal quantifier in the function
signature. It's mandatory.
OK, so how about the above code in C++? Well a naive transliteration gives the following.
namespace detail {
template<class A, class ItT>
ItT replicate (int n, A x, ItT dst) {
if (n <= 0) return dst;
return replicate ((n - 1), x, *dst++ = x);
}
}//namespace detail
template <class A>
void print_nested (int n, A const& x) {
if (n == 0)
print (x);
else {
std::vector<A> buf;
detail::replicate(n, x, std::back_inserter(buf));
print_nested (n - 1, buf);
}
}
void test_nested () {
int n;
std::cin >> n;
print_nested (n, 5);
}
Unfortunately though, this program though exhibits unbounded
compile time recursion (compilation doesn't terminate).
References:
[1] Implementing, and Understanding Type Classes -- Oleg Kiselyov