## Thursday, November 13, 2014

### Dimensional analysis in OCaml

In 1994, Barton and Nackman in their book 'Scientific Engineering in C++' [1] demonstrated how one could encode the rules of dimensional analysis into the C++ type system enabling compile-time checking (no run-time cost) of the plausibility (at least up to the dimensional correctness) of computations.

In 2004, Abrahams & Gurtovy in 'C++ Template Metaprogramming' [2] showed the Barton Nackman technique to be elegantly implementable using compile time type sequences encoding integer constants. The key properties of the technique are:

• Encoding of integers as types;
• Compile time manipulation of sequences of these integer encodings to deduce/produce new derived types.

For a good while there it escaped me how to approach this problem in OCaml and it bothered me somewhat. I turned to the caml-list for guidance and I'm happy to say some notable 'Camlers' there helped me out (thank-you Octachron, Mario Alvarez Picallo, Thomas Gazagnaire, Roberto Di Cosmo and David Mentre)!

The key idea in the solution to follow is the encoding of integers into the type-system as differences between two Peano numbers. The details of the approach are presented in the excellent paper "Many Holes in Hindley-Milner" by Sam Lindley of the University of Edinburgh.

Credit for the code that follows goes to Mario Alvarez Picallo. I generalized Mario's program to the extent that I could do the "force on a laptop" exercise (as presented in the online Boost.MPL tutorial).

The module interface is where all the work is - getting the "type-math" correct.

module type S = sig

type +'a s = 'a * 'a
type (+'a,+'b,+'c,+'d,+'e,+'f,+'g,+'h,+'i,+'j,+'k,+'l,+'m,+'n) t

(*Base dimensions*)

val mass :
float -> ('a,'a s,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
val length :
float -> ('a,'a,'b,'b s,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
val time :
float -> ('a,'a,'b,'b,'c,'c s,'d,'d,'e,'e,'f,'f,'g,'g) t
val charge :
float -> ('a,'a,'b,'b,'c,'c,'d,'d s,'e,'e,'f,'f,'g,'g) t
val temperature :
float -> ('a,'a,'b,'b,'c,'c,'d,'d,'e,'e s,'f,'f,'g,'g) t
val intensity :
float -> ('a,'a,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f s,'g,'g) t
val angle :
float -> ('a,'a,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g s) t

(*Composite dimensions*)

val velocity :
float -> ('a,'a,'b,'b s,'c s,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
val acceleration :
float -> ('a,'a,'b,'b s,'c s s,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
val momentum :
float -> ('a,'a s,'b,'b s,'c s,'c,'d,'d,'e,'e,'f,'f,'g,'g) t
val force :
float -> ('a,'a s,'b,'b s,'c s s,'c,'d,'d,'e,'e,'f,'f,'g,'g) t

(*Arithmetic*)

val ( + ) :
('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t ->
('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t ->
('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t
val ( - ) :
('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t ->
('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t ->
('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t
val ( * ) :
('a0,'b0,'c0,'d0,'e0,'f0,'g0,'h0,'i0,'j0,'k0,'l0,'m0,'n0) t ->
('b0,'b1,'d0,'d1,'f0,'f1,'g0,'h1,'i0,'j1,'k0,'l1,'m0,'n1) t ->
('a0,'b1,'c0,'d1,'e0,'f1,'g0,'h1,'i0,'j1,'k0,'l1,'m0,'n1) t
val ( / ) :
('a0,'b0,'c0,'d0,'e0,'f0,'g0,'h0,'i0,'j0,'k0,'l0,'m0,'n0) t ->
('a1,'b0,'c1,'d0,'e1,'f0,'g1,'h0,'i1,'j0,'k1,'l0,'m1,'n0) t ->
('a0,'a1,'c0,'c1,'e0,'e1,'g0,'g1,'i0,'i1,'k0,'k1,'m0,'m1) t

(*Conversion to float*)

val value : ('a,'b,'c,'d,'e,'f,'g,'h,'i,'j,'k,'l,'m,'n) t -> float
end

That's the hard part, the module implementation itself is trivial.

module Dim : S = struct

type +'a s = 'a * 'a
type (+'a,+'b,+'c,+'d,+'e,+'f,+'g,+'h,+'i,+'j,+'k,+'l,+'m,+'n) t = float

let mass x = x
let length x = x
let time x = x
let charge x = x
let temperature x = x
let intensity x = x
let angle x = x

let velocity x = x
let acceleration x = x
let momentum x = x
let force x = x

let ( + ) = ( +. )
let ( - ) = ( -. )
let ( * ) = ( *. )
let ( / ) = ( /. )

let value x = x

end

And the motivating "force on a laptop" calculation? Well in the top-level it proceeds like this.

# open Dim ;;
# let m = mass 5.0 ;;
val m : ('a,'a Dim.s,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g) Dim.t =
<abstr>
# let a = acceleration 9.8 ;;
val a :
('a,'a,'b,'b Dim.s,'c Dim.s Dim.s,'c,'d,'d,'e,'e,'f,'f,'g,'g)
Dim.t = <abstr>
# let f = m * a ;;
val f :
('a,'a Dim.s,'b,'b Dim.s,'c Dim.s Dim.s,'c,'d,'d,'e,'e,'f,'f,'g,'g)
Dim.t = <abstr>
Now to verify the result.
# let m2 = f / a ;;
val m2 :
('a,'a Dim.s,'b,'b,'c Dim.s Dim.s,'c Dim.s Dim.s,'d,'d,'e,'e,'f,'f,'g,'g)
Dim.t = <abstr>
If we got things right, then we'd expect that the difference m2 - m be close to zero (within rounding error).
# value (m2 - m) ;;
- : float = 0.
Indeed it is as we hoped.

The key test though is this, if we had written a/f instead of f/a we want that there be type-check failure preventing the mistake from propagating through the program.

# let m2 = a / f (*oops*) ;;
val m2 :
('a Dim.s,'a,'b,'b,'c Dim.s Dim.s,'c Dim.s Dim.s,'d,'d,'e,'e,'f,'f,'g,'g)
Dim.t =
# m2 - m ;;
Characters 5-6:
m2 - m ;;
^
Error:
This expression has type
('a Dim.s,'a Dim.s Dim.s,'b,'b,'c,'c,'d,'d,'e,'e,'f,'f,'g,'g) Dim.t
but an expression was expected of type
('a Dim.s,'a,'h,'h,'i Dim.s Dim.s,'i Dim.s Dim.s,'j,'j,'k,'k,'l,'l,'m,'m) Dim.t
The type variable 'a occurs inside 'a Dim.s * 'a Dim.s
And there it is. Happy days!

[1] John J. Barton and Lee R. Nackman. Scientific and Engineering C++: an Introduction with Advanced Techniques and Examples. Reading, MA: Addison Wesley. ISBN 0-201-53393-6. 1994.

[2] David Abrahams and Aleksey Gurtovy C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond (C++ in Depth Series), Addison-Wesley Professional. ISBN:0321227255. 2004.

## Sunday, November 9, 2014

### Complexity

When you begin to program in OCaml, one of the first pieces of advice you encounter is to prefer the '::' operator over the '@' operator for list construction. The question is, does it matter? Even knowing that '@' exhibits complexity $O(N)$ as opposed to $O(1)$ for '::' should you care? I mean, how much of a difference can it make really?

The answer of course is yes. In practical terms, it matters. No, it really, really matters. Let's quantify it. Here's a version of the range function that uses '@'.

let range s e =
let rec loop acc s e =
if s >= e then acc
else loop (acc @ [s]) (s + 1) e
in loop [] s e
As an aside, you'll note that it's been written to be tail-recursive (so as to avoid inducing stack overflow).

Timing this function for building lists of $2^{10} = 1,024$ elements to $2^{16} = 65,536$ elements produces the following table:

ntime (s)
100.011243
110.047308
120.197137
130.866350
144.130998
1522.769691
16142.506546

Ouch! To build a list of just $65,536$ elements, the program takes over 2 minutes!?!

Ok, here's the version of range that uses '::' to build the list (and necessarily does a List.rev on the result in order that the elements don't come out "back to front"):

let range s e =
let rec loop acc s e =
if s >= e then acc
else loop (s :: acc) (s + 1) e
in List.rev (loop [] s e)

And the timings for this one?

ntime (s)
100.000037
110.000097
120.000143
130.000324
140.002065
150.001195
160.005341

That is, this one builds the list of $65,536$ elements in just $5$ milliseconds!

Convinced?