Saturday, February 1, 2014

Extending OCaml in C++ - Boost.Date Time example

Extending OCaml in C++ - Boost.Date Time example

I aim to show just how easy it is to build on existing C++ libraries to extend OCaml. This should serve as a reasonable tutorial to get you going. What follows here is a toy example exporting functionality of the Boost.Datetime library.

Specifically, we're going to look at the minimal amount of machinery that will enable us to call the Boost.Date_time Gregorian function day_clock::local_day () from OCaml.

The first order of business is to agree on a common representation of a Gregorian date when exchanging data between OCaml and C++. Boost.Date_time has functions for shifting internal representations of Gregorian dates to/from struct tm so we can opt for taking advantage of the existing OCaml Unix module type tm to represent OCaml dates.

We start with the C wrapper for day_clock::local_day () suitable for use as an external in an OCaml module. This is file bdate_c.cpp.

#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/alloc.h>
Reflect Boost.Date_time boost::gregorian::date.
#include <boost/date_time/gregorian/gregorian.hpp>
Use for intermediate representation

type tm = {
   tm_sec : int; (* Seconds 0..60 *)
   tm_min : int; (* Minutes 0..59 *)
   tm_hour : int; (* Hours 0..23 *)
   tm_mday : int; (* Day of month 1..31 *)
   tm_mon : int; (* Month of year 0..11 *)
   tm_year : int; (* Year - 1900 *)
   tm_wday : int; (* Day of week (Sunday is 0) *)
   tm_yday : int; (* Day of year 0..365 *)
   tm_isdst : bool; (* Daylight time savings in effect *)
extern "C" value caml_boost_gregorian_day_clock_local_day ()
  value res=0;
  struct caml__roots_block blk, *caml__frame=caml_local_roots; = caml_local_roots;
  blk.nitems = 1;
  blk.ntables = 1;
  blk.tables [0] = &res;
  caml_local_roots = &blk;

  struct tm t = 
    boost::gregorian::to_tm (
      boost::gregorian::day_clock::local_day ());

  res = caml_alloc (9, 0);
  Store_field (res, 0, Val_int (t.tm_sec));
  Store_field (res, 1, Val_int (t.tm_min));
  Store_field (res, 2, Val_int (t.tm_hour));
  Store_field (res, 3, Val_int (t.tm_mday));
  Store_field (res, 4, Val_int (t.tm_mon));
  Store_field (res, 5, Val_int (t.tm_year));
  Store_field (res, 6, Val_int (t.tm_wday));
  Store_field (res, 7, Val_int (t.tm_yday));
  Store_field (res, 8, Val_bool (false)); //t.tm_isddst is always -1

  value tmp = res;
  caml_local_roots = caml__frame;

  return tmp;
We can turn that source file into a regular static archive libbdate_c.a with the commands shown in this bash fragment.
echo Building libbdate.a...
g++ -c -I/home/fletch/project/boost_1_55_0   \
       -I/home/fletch/.opam/4.00.1/lib/ocaml \
ar rvs libbdate_c.a bdate_c.o
(of course, if following along you'll need to adjust the paths in the above according to your needs). The equivalent Windows commands are like this.
echo Building libbdate.a...
cl /nologo /EHsc /c /Fo /MD /Ic:/project/boost_1_55_0/   \
       /IC:/ocamlms64/lib \
       /DBOOST_ALL_NO_LIB=1 \
lib /NOLOGO /OUT:libbdate_c.lib bdate_c.obj

Now we can turn attention to the Bdate OCaml module. First a module type. This file is bdate_sig.mli.

(**[Bdate] module interface*)

(** A set of date-time libraries based on generic programming
    concepts. See
    Boost.Date_time} *)
module type S = sig

  (** {2 Types}*)

  type t
  (** The type of a date*)

  val string_of_date : t -> string
  (** @return a string representation of a date*)

  (**{2 Functions}*)

  val local_day : unit -> t
  (** Get the local day based on the time zone settings of the

The next file bdate.mli just "passes through" the module type S defined above.
include Bdate_sig.S

And now we implement the module in file


external boost_gregorian_day_clock_local_day : unit -> t = "caml_boost_gregorian_day_clock_local_day"

let local_day () = boost_gregorian_day_clock_local_day ()
let string_of_date tm = 
  Printf.sprintf "%04d-%02d-%02d" 
    (tm.Unix.tm_year+1900) (tm.Unix.tm_mon+1) (tm.Unix.tm_mday)

The following bit of bash will create a compiled archive bdate.cmxa and compiled module interface bdate.cmi of the above OCaml files.

echo Compiling bdate.cmxa...
ocamlopt.opt -c bdate_sig.mli bdate.mli
ocamlopt.opt -a -o bdate.cmxa bdate.cmx

OK, we've got all we need to write a test OCaml program. This is

let _ = 
  let t = Bdate.local_day () in
  Printf.printf "The current date is %s\n" (Bdate.string_of_date t)
This program when run, will print the current date. Here's the bash to build it.
echo Compiling bdate_test.opt...
ocamlopt.opt -c -I .
#Take care to get the ordering right here
ocamlopt.opt -verbose -cclib -lstdc++ \
  -o bdate_test \
  unix.cmxa libbdate_c.a bdate.cmxa bdate_test.cmx 
Building this on Windows is no problem either. In this case the commands look like the following.
echo Compiling bdate_test.opt...
ocamlopt.opt -c -I .
ocamlopt.opt -verbose \
  -o bdate_test.exe \
  unix.cmxa libbdate_c.lib bdate.cmxa bdate_test.cmx 
(That is, we don't need to provide any additional link libraries.)

We expect to see output like

The current date is 2014-02-01
when we run this program (of course, the date printed will substituted with the current date).

Literate programming is the business. Here's a final set of bash commands for generating the Bdate module documentation.

echo Generating documentation...
mkdir -p doc
ocamldoc -intro intro -d doc -html -colorize-code -stars -sort \
  bdate_sig.mli bdate.mli
That assumes existence of a file called intro containing markup for the module documentation "header". Here's one that will do.
{1 Bdate} 
The [Bdate] package. A set of date-time libraries based on generic
programming concepts. See
{2 Index}
{2 Modules}


On Windows, ocamlopt calls out to flexlink to link an executable. We can take flexlink out of the build-chain in this instance. One example bash procedure to do that is

echo Cleaning up intermediate files...
rm *.obj *.lib *.opt *.cmx *.cmxa *.cmi

echo Compiling target libbdate.a...
cl /nologo /EHsc /c /Fo /MD /Ic:/project/boost_1_55_0/   \
       /IC:/ocamlms64/lib \
       /DBOOST_ALL_NO_LIB=1 \
lib /NOLOGO /OUT:libbdate_c.lib bdate_c.obj

echo Compiling target bdate_ocaml.obj...
ocamlopt.opt -c bdate_sig.mli bdate.mli
ocamlopt.opt -output-obj -o bdate_ocaml.obj unix.cmxa bdate.cmx bdate_test.cmx std_exit.cmx

#  bdet_ocaml.obj:
#      ocamlopt.opt -output-obj -o bdate_ocaml.obj \
# unix.cmxa std_exit.cmx
#  .DEFAULT: bdet_ocaml.obj

echo Compiling target bdate_test.exe...
cl /Febdate_test.exe \
  /EHsc /MD /nologo driver.c bdate_ocaml.obj \
  libbdate_c.lib \
  c:/ocamlms64/lib/libunix.lib \
  c:/ocamlms64/lib/libasmrun.lib \

# bdate_test.exe : libbdate_c.lib bdate_ocaml.obj driver.c
#   cl /Febdate_test.exe \
#     /EHsc /MD /nologo driver.c bdate_ocaml.obj \
#     libbdate_c.lib \
#     c:/ocamlms64/lib/libasmrun.lib \
#     c:/ocamlms64/lib/libunix.lib \
#     ws2_32.lib
# .DEFAULT: bdate_test.exe
# .DEFAULT: $(CProgramCopy _, $(BIN_DIR), bdate_test)

echo Generating documentation...
mkdir -p doc
ocamldoc -intro intro -d doc -html -colorize-code -stars -sort \
  bdate_sig.mli bdate.mli
(the commands contained in the comments are equivalent omake diretives).

The above approach requires supplying a C driver that gives control to OCaml. Here's a suitable definition for driver.c.
void caml_main (char**);

int main (int argc, char** argv) 
  caml_main (argv); 

  return 0; 

void flexdll_dump_exports(void* u){(void)u;}
void *flexdll_dlopen(char const* file, int mode){(void)file; (void)mode; return (void*)0;}
void flexdll_dlclose(void* u){(void)u;}
void* flexdll_dlsym(void* u, const char * n) {(void)u; (void)n; return (void*)0;}
char* flexdll_dlerror(){static char* flexdll_error_buffer = "flexdll is not availble"; return flexdll_error_buffer;}