Thursday, November 13, 2014

Dimensional Analysis in OCaml

Dimensional analysis in OCaml

In 1994, Barton and Nackman in their book 'Scientific Engineering in C++' [1] demonstrated how one could encode the rules of dimensional analysis into the C++ type system enabling compile-time checking (no run-time cost) of the plausibility (at least up to the dimensional correctness) of computations.

In 2004, Abrahams & Gurtovy in 'C++ Template Metaprogramming' [2] showed the Barton Nackman technique to be elegantly implementable using compile time type sequences encoding integer constants. The key properties of the technique are:

  • Encoding of integers as types;
  • Compile time manipulation of sequences of these integer encodings to deduce/produce new derived types.

For a good while there it escaped me how to approach this problem in OCaml and it bothered me somewhat. I turned to the caml-list for guidance and I'm happy to say some notable 'Camlers' there helped me out (thank-you Octachron, Mario Alvarez Picallo, Thomas Gazagnaire, Roberto Di Cosmo and David Mentre)!

The key idea in the solution to follow is the encoding of integers into the type-system as differences between two Peano numbers. The details of the approach are presented in the excellent paper "Many Holes in Hindley-Milner" by Sam Lindley of the University of Edinburgh.

Credit for the code that follows goes to Mario Alvarez Picallo. I generalized Mario's program to the extent that I could do the "force on a laptop" exercise (as presented in the online Boost.MPL tutorial).

The module interface is where all the work is - getting the "type-math" correct.

module type S = sig

  type +'a s = 'a * 'a
  type (+'a,+'b,+'c,+'d,+'e,+'f,+'g,+'h,+'i,+'j,+'k,+'l,+'m,+'n) t

  (*Base dimensions*)

  val mass : 
    float -> ('a,'a s,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
  val length : 
    float -> ('a,'a,'b,'b s,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
  val time : 
    float -> ('a,'a,'b,'b,'c,'c s,'d,'d,'e,'e,'f,'f,'g,'g) t
  val charge : 
    float -> ('a,'a,'b,'b,'c,'c,'d,'d s,'e,'e,'f,'f,'g,'g) t
  val temperature : 
    float -> ('a,'a,'b,'b,'c,'c,'d,'d,'e,'e s,'f,'f,'g,'g) t
  val intensity : 
    float -> ('a,'a,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f s,'g,'g) t
  val angle :
     float -> ('a,'a,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g s) t

  (*Composite dimensions*)

  val velocity :
    float -> ('a,'a,'b,'b s,'c s,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
  val acceleration :
    float -> ('a,'a,'b,'b s,'c s s,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
  val momentum :
    float -> ('a,'a s,'b,'b s,'c s,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
  val force :
    float -> ('a,'a s,'b,'b s,'c s s,'c,'d,'d,'e,'e,'f,'f,'g,'g) t


  val ( + ) : 
    ('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t -> 
      ('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t -> 
        ('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t
  val ( - ) :
    ('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t -> 
      ('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t -> 
        ('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t
  val ( * ) :
    ('a0,'b0,'c0,'d0,'e0,'f0,'g0,'h0,'i0,'j0,'k0,'l0,'m0,'n0) t -> 
      ('b0,'b1,'d0,'d1,'f0,'f1,'g0,'h1,'i0,'j1,'k0,'l1,'m0,'n1) t -> 
        ('a0,'b1,'c0,'d1,'e0,'f1,'g0,'h1,'i0,'j1,'k0,'l1,'m0,'n1) t
  val ( / ) :
    ('a0,'b0,'c0,'d0,'e0,'f0,'g0,'h0,'i0,'j0,'k0,'l0,'m0,'n0) t -> 
      ('a1,'b0,'c1,'d0,'e1,'f0,'g1,'h0,'i1,'j0,'k1,'l0,'m1,'n0) t -> 
        ('a0,'a1,'c0,'c1,'e0,'e1,'g0,'g1,'i0,'i1,'k0,'k1,'m0,'m1) t

  (*Conversion to float*)

  val value : ('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t -> float

That's the hard part, the module implementation itself is trivial.

module Dim : S = struct

  type +'a s = 'a * 'a
  type (+'a,+'b,+'c,+'d,+'e,+'f,+'g,+'h,+'i,+'j,+'k,+'l,+'m,+'n) t = float

  let mass x = x
  let length x = x
  let time x = x
  let charge x = x
  let temperature x = x
  let intensity x = x
  let angle x = x

  let velocity x = x
  let acceleration x = x
  let momentum x = x
  let force x = x

  let ( + ) = ( +. )
  let ( - ) = ( -. )
  let ( * ) = ( *. )
  let ( / ) = ( /. )

  let value x = x


And the motivating "force on a laptop" calculation? Well in the top-level it proceeds like this.

# open Dim ;;
# let m = mass 5.0 ;;
val m : ('a,'a Dim.s,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g) Dim.t =
# let a = acceleration 9.8 ;;
val a :
  ('a,'a,'b,'b Dim.s,'c Dim.s Dim.s,'c,'d,'d,'e,'e,'f,'f,'g,'g)
  Dim.t = <abstr>
# let f = m * a ;;
val f :
  ('a,'a Dim.s,'b,'b Dim.s,'c Dim.s Dim.s,'c,'d,'d,'e,'e,'f,'f,'g,'g)
  Dim.t = <abstr>
Now to verify the result.
# let m2 = f / a ;;
val m2 :
  ('a,'a Dim.s,'b,'b,'c Dim.s Dim.s,'c Dim.s Dim.s,'d,'d,'e,'e,'f,'f,'g,'g)
  Dim.t = <abstr>
If we got things right, then we'd expect that the difference m2 - m be close to zero (within rounding error).
# value (m2 - m) ;;
- : float = 0.
Indeed it is as we hoped.

The key test though is this, if we had written a/f instead of f/a we want that there be type-check failure preventing the mistake from propagating through the program.

# let m2 = a / f (*oops*) ;;
val m2 :
  ('a Dim.s,'a,'b,'b,'c Dim.s Dim.s,'c Dim.s Dim.s,'d,'d,'e,'e,'f,'f,'g,'g)
  Dim.t = 
# m2 - m ;;
Characters 5-6:
  m2 - m ;;
  This expression has type
   ('a Dim.s,'a Dim.s Dim.s,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g) Dim.t
  but an expression was expected of type
   ('a Dim.s,'a,'h,'h,'i Dim.s Dim.s,'i Dim.s Dim.s,'j,'j,'k,'k,'l,'l,'m,'m) Dim.t
  The type variable 'a occurs inside 'a Dim.s * 'a Dim.s
And there it is. Happy days!

[1] John J. Barton and Lee R. Nackman. Scientific and Engineering C++: an Introduction with Advanced Techniques and Examples. Reading, MA: Addison Wesley. ISBN 0-201-53393-6. 1994.

[2] David Abrahams and Aleksey Gurtovy C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond (C++ in Depth Series), Addison-Wesley Professional. ISBN:0321227255. 2004.

Sunday, November 9, 2014



When you begin to program in OCaml, one of the first pieces of advice you encounter is to prefer the '::' operator over the '@' operator for list construction. The question is, does it matter? Even knowing that '@' exhibits complexity $O(N)$ as opposed to $O(1)$ for '::' should you care? I mean, how much of a difference can it make really?

The answer of course is yes. In practical terms, it matters. No, it really, really matters. Let's quantify it. Here's a version of the range function that uses '@'.

let range s e =
  let rec loop acc s e =
    if s >= e then acc
    else loop (acc @ [s]) (s + 1) e 
  in loop [] s e
As an aside, you'll note that it's been written to be tail-recursive (so as to avoid inducing stack overflow).

Timing this function for building lists of $2^{10} = 1,024$ elements to $2^{16} = 65,536$ elements produces the following table:

ntime (s)

Ouch! To build a list of just $65,536$ elements, the program takes over 2 minutes!?!

Ok, here's the version of range that uses '::' to build the list (and necessarily does a List.rev on the result in order that the elements don't come out "back to front"):

let range s e =
  let rec loop acc s e =
    if s >= e then acc
    else loop (s :: acc) (s + 1) e 
  in List.rev (loop [] s e)

And the timings for this one?

ntime (s)

That is, this one builds the list of $65,536$ elements in just $5$ milliseconds!