Tuesday, September 20, 2016

Custom operators in OCaml

If, like me, you've always been a little hazy on the rules for defining OCaml operators, then this little post might help!

It is possible to "inject" user-defined operator syntax into OCaml programs. Here's how it works. First we define a set of characters called "symbol characters".

Symbol character (definition)

A character that is one of

! $ % & * + - . / : < = > ? @ ^ | ~

Prefix operators

The ! ("bang") prefix operator has a predefined meaning: it dereferences a reference cell. A custom prefix operator can be made from a ! followed by one or more symbol characters (the one exception is !=, which is the predefined infix operator for physical inequality).

So, to give some examples, one can define prefix operators like !!, !~ or even something as exotic as !::>. For example, one might write something like

let ( !+ ) : int ref -> unit = fun x -> incr x
as syntactic sugar, so that !+x is equivalent to incr x.

Additionally, prefix operators can begin with one of ~ and ? and, as in the case of !, must be followed by one or more symbol characters. So, in summary, a prefix operator begins with one of

! ~ ?
and is followed by one or more symbol characters.

For example, let ( ~! ) x = incr x defines an alternative spelling of the !+ operator presented earlier.

Prefix operators have the highest possible precedence: they bind even more tightly than function application, so f !x parses as f (!x).
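
To make the prefix rules concrete, here is a minimal sketch. The operator !! and the names counter and seven are invented for this illustration; !+ and ~! follow the definitions discussed above.

```ocaml
(* Hypothetical operators, chosen only for this illustration. *)

(* !+ increments a reference cell in place. *)
let ( !+ ) (r : int ref) : unit = incr r

(* ~! is a second spelling of the same operation. *)
let ( ~! ) (r : int ref) : unit = incr r

(* !! doubles an integer. *)
let ( !! ) (n : int) : int = 2 * n

let counter = ref 0
let () = !+counter   (* counter now holds 1 *)
let () = ~!counter   (* counter now holds 2 *)

(* Prefix operators bind tighter than function application,
   so [succ !!3] parses as [succ (!!3)], that is, [succ 6]. *)
let seven = succ !!3
```

Note that no parentheses are needed around !!3 when it is passed as an argument, which is exactly what the high precedence buys you.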

Infix operators

It is in fact possible to define infix operators in 5 different categories. What distinguishes these categories from each other is their associativity and precedence.

Level 0

Level 0 operators are left associative with the same precedence as =. A level 0 operator starts with one of

= < > | & $
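and is followed by zero or more symbol characters. For example, >>= is an operator much beloved by monadic programmers and |> (the pipe operator) is a built-in equivalent to let ( |> ) x f = f x. The predefined &, && and || are exceptions: they are right associative and sit at lower precedence levels of their own.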
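
As a rough sketch of level 0 in action, the following defines a >>= for the option type. The helper safe_div and the bound names are hypothetical, made up for this example.

```ocaml
(* A hypothetical bind operator for the option type. *)
let ( >>= ) (m : 'a option) (f : 'a -> 'b option) : 'b option =
  match m with
  | None -> None
  | Some x -> f x

(* Hypothetical helper: integer division that fails on a zero divisor. *)
let safe_div (x : int) (y : int) : int option =
  if y = 0 then None else Some (x / y)

(* Left associative: parsed as (Some 100 >>= f) >>= g. *)
let ok = Some 100 >>= (fun x -> safe_div x 5) >>= (fun y -> safe_div y 2)

(* A failure in the first step short-circuits the rest of the chain. *)
let failed = Some 100 >>= (fun x -> safe_div x 0) >>= (fun y -> safe_div y 2)

(* Level 0 sits below +, so this parses as ((1 + 2) |> succ) |> string_of_int. *)
let piped = 1 + 2 |> succ |> string_of_int
```

Here ok evaluates to Some 10, failed to None and piped to the string 4.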

Level 1

Level 1 operators are right associative, have a precedence just above = and start with one of

@ ^
and are followed by zero or more symbol characters. These operators suit operations that join things together, like the predefined @ (list append) and ^ (string concatenation). @@ (the application operator) has a predefined meaning as function application, that is, it is equivalent to the definition let ( @@ ) f x = f x.
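
Right associativity is easiest to see with an operation where the grouping changes the answer. In this sketch, ^- is a hypothetical operator (subtraction is not a sensible use of a level 1 symbol; it is picked only to make the parse visible).

```ocaml
(* Hypothetical operator: subtraction spelled with a leading caret,
   used only to make the grouping visible. *)
let ( ^- ) (a : int) (b : int) : int = a - b

(* Right associative: parsed as 10 ^- (3 ^- 2), that is, 10 - 1. *)
let nine = 10 ^- 3 ^- 2

(* The built-in @@ groups the same way: string_of_int @@ (succ @@ 41). *)
let s = string_of_int @@ succ @@ 41

(* Level 1 binds tighter than =, so the concatenation happens first
   and the result is then compared with the right-hand string. *)
let same = "ab" ^ "cd" = "abcd"
```

Had ^- been left associative, nine would have come out as 5 instead of 9.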

Level 2

Level 2 operators are left associative, have a precedence level shared with + and - and, indeed, start with one of

+ -
and, as usual, are followed by zero or more symbol characters. These operators suit operations that generalize addition or subtraction. Some potential operators of this kind are +~, ++ and so on.
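
For instance, +~ could serve as addition on pairs of integers. This is only a sketch; the pair representation and the companion operator -~ are assumptions made for the example.

```ocaml
(* Hypothetical operators on 2D integer vectors represented as pairs. *)
let ( +~ ) (a, b) (c, d) = (a + c, b + d)   (* component-wise sum *)
let ( -~ ) (a, b) (c, d) = (a - c, b - d)   (* component-wise difference *)

(* Left associative: parsed as ((10, 10) -~ (3, 3)) -~ (2, 2). *)
let v = (10, 10) -~ (3, 3) -~ (2, 2)

(* +~ and -~ share one level, so mixed chains also group from the left:
   ((1, 2) +~ (3, 4)) -~ (1, 1). *)
let w = (1, 2) +~ (3, 4) -~ (1, 1)
```

With these definitions v is (5, 5) and w is (3, 5).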

Level 3

Level 3 operators are also left associative and have a precedence level shared with * and /. Operators of this kind start with one of

* / %
followed by zero or more symbol characters (except those beginning with **, which belong to level 4) and are evocative of operations akin to multiplication and division. For example, *~ might make a good companion for +~ of the previous section.
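
A small sketch of how a level 3 operator interacts with a level 2 one. The meanings given to +~ and *~ here (vector sum and scalar multiple on integer pairs) and the operator /~ are assumptions for illustration only.

```ocaml
(* Hypothetical operators on 2D integer vectors represented as pairs. *)
let ( +~ ) (a, b) (c, d) = (a + c, b + d)      (* level 2: vector sum *)
let ( *~ ) (k : int) (a, b) = (k * a, k * b)   (* level 3: scalar multiple *)

(* Level 3 binds tighter than level 2:
   parsed as (1, 1) +~ (2 *~ (3, 4)). *)
let v = (1, 1) +~ 2 *~ (3, 4)

(* Hypothetical integer division operator, to show left associativity:
   parsed as (100 /~ 10) /~ 5. *)
let ( /~ ) (a : int) (b : int) : int = a / b
let q = 100 /~ 10 /~ 5
```

Here v is (7, 9) and q is 2; grouping the divisions from the right would have given 50.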

Level 4

Level 4 operators are right associative and have a precedence above *. The level 4 operators begin with

**
and are followed by zero or more symbol characters. The predefined ** is floating-point exponentiation, which binds tightly and associates to the right. The syntax **~ would fit nicely into the +~, *~ set of the earlier sections.
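
To round things off, here is a sketch of **~ as integer exponentiation. The definition is a naive one written for this example and assumes a non-negative exponent.

```ocaml
(* Hypothetical integer power operator; assumes a non-negative exponent. *)
let rec ( **~ ) (base : int) (n : int) : int =
  if n <= 0 then 1 else base * (base **~ (n - 1))

(* Right associative: parsed as 2 **~ (3 **~ 2), that is, 2 **~ 9. *)
let p = 2 **~ 3 **~ 2

(* Level 4 binds tighter than multiplication: parsed as 3 * (2 **~ 4). *)
let q = 3 * 2 **~ 4
```

So p is 512 (a left grouping would have produced 64) and q is 48.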