Sunday, February 15, 2015

Fold left via fold right

The puzzle is to express fold_left entirely in terms of fold_right. For example, an attempted solution like

let fold_left f e s =
  List.rev (fold_right (fun a acc -> f acc a) e (List.rev s))
is inadmissible because it relies on List.rev and thus is not entirely in terms of fold_right.

Recall that given a function $f$, a seed $e$ and a list $[a_{1}; a_{2}; \cdots; a_{N}]$, fold_left computes $f\;(\cdots f\;(f\;e\;a_{1})\;a_{2}\;\cdots)\;a_{N}$ whereas fold_right computes $f\;a_{1}\;(f\;a_{2}\;(\cdots\;(f\;a_{N}\;e)\cdots))$. There's really no choice but to delay computation and the expression that solves this problem is this.

let fold_left f e s =
  List.fold_right (fun a acc -> fun x -> acc (f x a)) s (fun x -> x) e

For example, in the top-level

 # fold_left (fun acc x -> x * x :: acc) [] [1; 2; 3;] ;;
   - : int list = [9; 4; 1]

To see how this works, consider the right fold over the list $[a_{1}; a_{2}; a_{3}]$ (say)

  • On encountering $a_{3}$ we compute $f_{3} = \lambda x_{3} . i\; (f\;x_{3}\;a_{3})$;
  • On encountering $a_{2}$ we compute $f_{2} = \lambda x_{2} . f_{3}\;(f\;x_{2}\;a_{2})$;
  • On encountering $a_{1}$ we compute $f_{1} = \lambda x_{1} . f_{2}\;(f\;x_{1}\;a_{1})$;
but then, $f_{1}\;e = f_{2}\;(f\;e\;a_{1}) = f_{3}\;(f\;(f\;e\;a_{1})\;a_{2}) = f\;(f\;(f\;e\;a_{1})\;a_{2})\;a_{3}$ as desired.

Saturday, February 7, 2015

Recursive lists in C++

Earlier this week, I had a need for a recursive list, that is, a list defined in terms of itself. I think, "back in the day" implementing a data structure of that sort would have been a snap for the everyday C programmer. Today, in this modern C++ world I found myself struggling a little and came to think that maybe the old ways are fading :)

For motivation, here's a couple of examples of the sort of thing I'm talking about.

(1) The list [0; 1; 0; 1; 0; 1; ...] is a list with a cycle in it. In OCaml you'd write that as let rec l = 0 :: 1 :: l.

(2) An interpreter using the technique of environments and closures can require an environment ((string * value) list) to contain a closure where the closure contains the environment. In OCaml you'd write let rec vars = (tag, V_closure (vars, xpr)) :: !env; env := vars.

Of course with pointers, it's not hard to implement recursive structures in C++. The trouble is having to concern yourself with their memory management due to the absence of garbage collection.

Alright, here is what I came up with. The code is pretty short.

#include <boost/variant.hpp>

#include <memory>
#include <stdexcept>

template <class T> struct node;
template <class T> using node_ptr=typename node<T>::node_ptr;
template <class T> using node_weak_ptr=typename node<T>::weak_ptr;
template <class T> using node_shared_ptr=typename node<T>::shared_ptr;
template <class T> struct ptr_t;
template <class T> using list=ptr_t<node<T>>;
template <class T> using list_ref=node_weak_ptr<T>;

template <class T> list<T> nil (); 
template <class T> bool empty (list<T> l);
template <class T> list<T> cons (T val, list<T> l);
template <class T> T& hd (list<T> l);
template <class T> list<T>& tl (list<T> l);
template <class T> list_ref<T> ref (list<T> src);
template <class T> bool is_ref (list<T> src);

The idea behind the implementation is generalize a pointer to node as a union with two variants, a shared pointer or a weak pointer.

template <class T> struct ptr_t :
  boost::variant <std::shared_ptr<T>, std::weak_ptr<T>> {
  typedef boost::variant <std::shared_ptr<T>, std::weak_ptr<T>> base;
  ptr_t () {}
  ptr_t (std::weak_ptr<T> p) : base (p) {}
  ptr_t (std::shared_ptr<T> p) : base (p) {}
};

template <class T>
struct node {
  typedef ptr_t<node> node_ptr;
  typedef std::weak_ptr<node> weak_ptr;
  typedef std::shared_ptr<node> shared_ptr;

  T data;
  node_ptr next;
};

This little bit of implementation detail comes up a couple of times so it's handy to factor it out.

namespace {
  //'get' at the raw pointer in the union of a smart/weak pointer
  template <class T>
  T* get (ptr_t<T> l) {
    if (std::shared_ptr<T>* p=
        boost::get<std::shared_ptr<T>>(&l)) {
      return p->get ();
    }
    return boost::get<std::weak_ptr<T>>(l).lock ().get ();
  }
}//namespace<anonymous>

The rest of the implementation is basically a set of "one-liners".

template <class T> list<T> nil (){ 
  return node_shared_ptr<T> (); 
}

template <class T> bool empty (list<T> l) { 
  return (get (l)) == nullptr; 
}

template <class T> list<T> cons (T val, list<T> l) {
  return node_shared_ptr<T> (new node<T>{val, l});
}

template <class T> T& hd (list<T> l) {   
  if (empty (l))
    throw std::runtime_error ("hd");
  return get (l) -> data;
}

template <class T> list<T>& tl (list<T> l) { 
  if (empty (l))
    throw std::runtime_error ("tl");
  return get (l) -> next; 
}

template <class T> bool is_ref (list<T> src) {
  return boost::get<list_ref<T>>(&src)!=nullptr;
}

template <class T> node_weak_ptr<T> ref (list<T> src)  {
  return node_weak_ptr<T>(boost::get<node_shared_ptr<T>>(src));
}

OK, well, that's about it. Let's see, regarding usage, (1) could be expressed like this

list<int> l = cons (0, cons (1, nil<int> ())); tl (tl (l)) = ref (l);
or, if we assume the existence of a 'last' function with an obvious definition, could be tidied up to read
list<int> l = cons (0, cons (1, nil<int> ())); tl (last (l)) = ref (l);
and (2) can be stated like this
typedef std::pair<std::string, value_t> p_t;
list<p_t> vars = node_shared_ptr<p_t>(new node<p_t>);
hd (vars) = std::make_pair (tag, V_closure {ref (vars), xpr});
tl (vars) = *env; 
*env = vars;