117 Advanced

Recursive Types

Functional Programming

Tutorial

The Problem

A recursive type is one whose definition refers to itself — a tree node contains child nodes, a linked list node points to the next node. This is fundamental to representing hierarchical data: file systems, abstract syntax trees, JSON documents, and parse trees. Rust requires recursive types to have a known size at compile time, which forces explicit indirection via Box<T> at recursive positions.

🎯 Learning Outcomes

• Understand why Rust requires Box<T> (or other heap-indirected pointers) for recursive type definitions

• Learn to implement a binary search tree using recursive enums

• See how insert, search, and traversal operations work on recursive structures

• Compare Rust's explicit Box wrapping to OCaml's implicit heap allocation for algebraic types

Code Example

#[derive(Debug)]
pub enum Tree<T> {
    Leaf,
    Node(Box<Tree<T>>, T, Box<Tree<T>>),
}

impl<T: Ord> Tree<T> {
    pub fn insert(self, x: T) -> Self {
        match self {
            Tree::Leaf => Tree::Node(Box::new(Tree::Leaf), x, Box::new(Tree::Leaf)),
            Tree::Node(l, v, r) => {
                if x < v      { Tree::Node(Box::new(l.insert(x)), v, r) }
                else if x > v { Tree::Node(l, v, Box::new(r.insert(x))) }
                else          { Tree::Node(l, v, r) }
            }
        }
    }

    pub fn to_sorted_vec(&self) -> Vec<&T> {
        match self {
            Tree::Leaf => vec![],
            Tree::Node(l, v, r) => {
                let mut result = l.to_sorted_vec();
                result.push(v);
                result.extend(r.to_sorted_vec());
                result
            }
        }
    }
}

(* OCaml heap-allocates everything; recursive types just work *)
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (l, v, r) ->
    if x < v then Node (insert x l, v, r)
    else if x > v then Node (l, v, insert x r)
    else Node (l, v, r)

let rec to_sorted_list = function
  | Leaf -> []
  | Node (l, v, r) -> to_sorted_list l @ [v] @ to_sorted_list r

(* Linked list *)
type 'a mylist = Nil | Cons of 'a * 'a mylist

let rec length = function Nil -> 0 | Cons (_, t) -> 1 + length t

(* Expression AST *)
type expr =
  | Num of float
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a +. eval b
  | Mul (a, b) -> eval a *. eval b
  | Neg e -> -. eval e

Key Differences

Indirection: OCaml allocates all variants on the heap automatically; Rust requires explicit Box<T> at recursive positions to break size cycles.

Ownership on insert: Rust's insert(self, x) consumes the old tree (move semantics); OCaml's insert returns a new tree while the old one stays alive via GC.

Pattern matching: Both use structural pattern matching, but Rust must dereference Box in patterns (handled transparently by the compiler in modern Rust).

Leaf representation: OCaml's Leaf is a zero-size value; Rust's Tree::Leaf is similarly zero-size, but Node pays for two Box pointers.

OCaml Approach

OCaml's type system handles recursive algebraic types without any annotation:

type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

Every OCaml value is heap-allocated and pointer-accessed through the GC, so there is no size-at-compile-time problem. Pattern matching on trees is identical in structure to Rust's match, but without Box::new or dereferencing.

Full Source

#![allow(clippy::all)]
// Example 117: Recursive Types with Box
//
// Recursive types need Box in Rust because the compiler must know
// the size of each type at compile time. Box<T> is pointer-sized,
// breaking the infinite-size recursion.

// ── Approach 1: Binary search tree ──────────────────────────────────────────
//
// OCaml: type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
// Rust:  the Node children must be boxed so the compiler sees a fixed size.

#[derive(Debug)]
pub enum Tree<T> {
    Leaf,
    Node(Box<Tree<T>>, T, Box<Tree<T>>),
}

impl<T: Ord> Tree<T> {
    pub fn new() -> Self {
        Tree::Leaf
    }

    /// Insert a value, returning the updated tree (consumes self).
    pub fn insert(self, x: T) -> Self {
        match self {
            Tree::Leaf => Tree::Node(Box::new(Tree::Leaf), x, Box::new(Tree::Leaf)),
            Tree::Node(l, v, r) => {
                if x < v {
                    Tree::Node(Box::new(l.insert(x)), v, r)
                } else if x > v {
                    Tree::Node(l, v, Box::new(r.insert(x)))
                } else {
                    Tree::Node(l, v, r)
                }
            }
        }
    }

    /// In-order traversal yields elements in sorted order.
    pub fn to_sorted_vec(&self) -> Vec<&T> {
        match self {
            Tree::Leaf => vec![],
            Tree::Node(l, v, r) => {
                let mut result = l.to_sorted_vec();
                result.push(v);
                result.extend(r.to_sorted_vec());
                result
            }
        }
    }

    pub fn contains(&self, x: &T) -> bool {
        match self {
            Tree::Leaf => false,
            Tree::Node(l, v, r) => {
                if x < v {
                    l.contains(x)
                } else if x > v {
                    r.contains(x)
                } else {
                    true
                }
            }
        }
    }
}

impl<T: Ord> Default for Tree<T> {
    fn default() -> Self {
        Tree::new()
    }
}

// ── Approach 2: Singly-linked list ──────────────────────────────────────────
//
// OCaml: type 'a mylist = Nil | Cons of 'a * 'a mylist
// Rust:  tail must be Box'd — same reason as the tree above.

#[derive(Debug)]
pub enum List<T> {
    Nil,
    Cons(T, Box<List<T>>),
}

impl<T> List<T> {
    pub fn nil() -> Self {
        List::Nil
    }

    /// Prepend an element (O(1)).
    pub fn cons(head: T, tail: Self) -> Self {
        List::Cons(head, Box::new(tail))
    }

    pub fn is_empty(&self) -> bool {
        matches!(self, List::Nil)
    }

    pub fn len(&self) -> usize {
        match self {
            List::Nil => 0,
            List::Cons(_, tail) => 1 + tail.len(),
        }
    }

    pub fn to_vec(&self) -> Vec<&T> {
        let mut result = vec![];
        let mut current = self;
        loop {
            match current {
                List::Nil => break,
                List::Cons(h, tail) => {
                    result.push(h);
                    current = tail;
                }
            }
        }
        result
    }
}

// ── Approach 3: Expression AST ───────────────────────────────────────────────
//
// A classic use-case: recursive expression trees for interpreters / compilers.

#[derive(Debug)]
pub enum Expr {
    Num(f64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

pub fn eval(expr: &Expr) -> f64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
        Expr::Neg(e) => -eval(e),
    }
}

// ── Tests ────────────────────────────────────────────────────────────────────

#[cfg(test)]
mod tests {
    use super::*;

    // Tree tests

    #[test]
    fn tree_insert_and_sorted_order() {
        let tree = [5, 3, 7, 1, 4, 6, 8]
            .into_iter()
            .fold(Tree::new(), Tree::insert);
        let sorted: Vec<i32> = tree.to_sorted_vec().into_iter().copied().collect();
        assert_eq!(sorted, [1, 3, 4, 5, 6, 7, 8]);
    }

    #[test]
    fn tree_contains() {
        let tree = [10, 5, 15].into_iter().fold(Tree::new(), Tree::insert);
        assert!(tree.contains(&5));
        assert!(tree.contains(&15));
        assert!(!tree.contains(&99));
    }

    #[test]
    fn tree_empty_is_leaf() {
        let tree: Tree<i32> = Tree::new();
        assert!(tree.to_sorted_vec().is_empty());
        assert!(!tree.contains(&0));
    }

    #[test]
    fn tree_no_duplicates() {
        let tree = [3, 3, 3].into_iter().fold(Tree::new(), Tree::insert);
        assert_eq!(tree.to_sorted_vec(), [&3]);
    }

    // List tests

    #[test]
    fn list_len_and_to_vec() {
        let list = List::cons(1, List::cons(2, List::cons(3, List::nil())));
        assert_eq!(list.len(), 3);
        assert_eq!(list.to_vec(), [&1, &2, &3]);
    }

    #[test]
    fn list_nil_is_empty() {
        let list: List<i32> = List::nil();
        assert_eq!(list.len(), 0);
        assert!(list.to_vec().is_empty());
    }

    // Expr / AST tests

    #[test]
    fn expr_eval_arithmetic() {
        // (2 + 3) * -(4)  == -20
        let expr = Expr::Mul(
            Box::new(Expr::Add(
                Box::new(Expr::Num(2.0)),
                Box::new(Expr::Num(3.0)),
            )),
            Box::new(Expr::Neg(Box::new(Expr::Num(4.0)))),
        );
        assert!((eval(&expr) - (-20.0)).abs() < f64::EPSILON);
    }

    #[test]
    fn expr_eval_leaf() {
        assert!((eval(&Expr::Num(42.0)) - 42.0).abs() < f64::EPSILON);
    }
}

(* Example 117: Recursive Types with Box *)

(* Approach 1: Binary tree *)
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (l, v, r) ->
    if x < v then Node (insert x l, v, r)
    else if x > v then Node (l, v, insert x r)
    else Node (l, v, r)

let rec to_sorted_list = function
  | Leaf -> []
  | Node (l, v, r) -> to_sorted_list l @ [v] @ to_sorted_list r

let approach1 () =
  let t = List.fold_left (fun acc x -> insert x acc) Leaf [5;3;7;1;4;6;8] in
  let sorted = to_sorted_list t in
  assert (sorted = [1;3;4;5;6;7;8]);
  Printf.printf "Sorted: %s\n" (String.concat ", " (List.map string_of_int sorted))

(* Approach 2: Linked list *)
type 'a mylist = Nil | Cons of 'a * 'a mylist

let rec length = function Nil -> 0 | Cons (_, rest) -> 1 + length rest
let rec map f = function Nil -> Nil | Cons (x, rest) -> Cons (f x, map f rest)

let approach2 () =
  let lst = Cons (1, Cons (2, Cons (3, Nil))) in
  assert (length lst = 3);
  let doubled = map (fun x -> x * 2) lst in
  assert (length doubled = 3);
  Printf.printf "Length: %d\n" (length lst)

(* Approach 3: Expression tree *)
type expr =
  | Lit of float
  | Var of string
  | BinOp of string * expr * expr

let rec show = function
  | Lit f -> string_of_float f
  | Var s -> s
  | BinOp (op, a, b) -> Printf.sprintf "(%s %s %s)" (show a) op (show b)

let approach3 () =
  let e = BinOp ("+", Lit 1.0, BinOp ("*", Var "x", Lit 3.0)) in
  let s = show e in
  Printf.printf "Expr: %s\n" s;
  assert (String.length s > 0)

let () =
  approach1 ();
  approach2 ();
  approach3 ();
  Printf.printf "✓ All tests passed\n"

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;

    // Tree tests

    #[test]
    fn tree_insert_and_sorted_order() {
        let tree = [5, 3, 7, 1, 4, 6, 8]
            .into_iter()
            .fold(Tree::new(), Tree::insert);
        let sorted: Vec<i32> = tree.to_sorted_vec().into_iter().copied().collect();
        assert_eq!(sorted, [1, 3, 4, 5, 6, 7, 8]);
    }

    #[test]
    fn tree_contains() {
        let tree = [10, 5, 15].into_iter().fold(Tree::new(), Tree::insert);
        assert!(tree.contains(&5));
        assert!(tree.contains(&15));
        assert!(!tree.contains(&99));
    }

    #[test]
    fn tree_empty_is_leaf() {
        let tree: Tree<i32> = Tree::new();
        assert!(tree.to_sorted_vec().is_empty());
        assert!(!tree.contains(&0));
    }

    #[test]
    fn tree_no_duplicates() {
        let tree = [3, 3, 3].into_iter().fold(Tree::new(), Tree::insert);
        assert_eq!(tree.to_sorted_vec(), [&3]);
    }

    // List tests

    #[test]
    fn list_len_and_to_vec() {
        let list = List::cons(1, List::cons(2, List::cons(3, List::nil())));
        assert_eq!(list.len(), 3);
        assert_eq!(list.to_vec(), [&1, &2, &3]);
    }

    #[test]
    fn list_nil_is_empty() {
        let list: List<i32> = List::nil();
        assert_eq!(list.len(), 0);
        assert!(list.to_vec().is_empty());
    }

    // Expr / AST tests

    #[test]
    fn expr_eval_arithmetic() {
        // (2 + 3) * -(4)  == -20
        let expr = Expr::Mul(
            Box::new(Expr::Add(
                Box::new(Expr::Num(2.0)),
                Box::new(Expr::Num(3.0)),
            )),
            Box::new(Expr::Neg(Box::new(Expr::Num(4.0)))),
        );
        assert!((eval(&expr) - (-20.0)).abs() < f64::EPSILON);
    }

    #[test]
    fn expr_eval_leaf() {
        assert!((eval(&Expr::Num(42.0)) - 42.0).abs() < f64::EPSILON);
    }
}

Deep Comparison

OCaml vs Rust: Recursive Types with Box

Side-by-Side Code

OCaml

(* OCaml heap-allocates everything; recursive types just work *)
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (l, v, r) ->
    if x < v then Node (insert x l, v, r)
    else if x > v then Node (l, v, insert x r)
    else Node (l, v, r)

let rec to_sorted_list = function
  | Leaf -> []
  | Node (l, v, r) -> to_sorted_list l @ [v] @ to_sorted_list r

(* Linked list *)
type 'a mylist = Nil | Cons of 'a * 'a mylist

let rec length = function Nil -> 0 | Cons (_, t) -> 1 + length t

(* Expression AST *)
type expr =
  | Num of float
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a +. eval b
  | Mul (a, b) -> eval a *. eval b
  | Neg e -> -. eval e

Rust (idiomatic — Box for heap indirection)

#[derive(Debug)]
pub enum Tree<T> {
    Leaf,
    Node(Box<Tree<T>>, T, Box<Tree<T>>),
}

impl<T: Ord> Tree<T> {
    pub fn insert(self, x: T) -> Self {
        match self {
            Tree::Leaf => Tree::Node(Box::new(Tree::Leaf), x, Box::new(Tree::Leaf)),
            Tree::Node(l, v, r) => {
                if x < v      { Tree::Node(Box::new(l.insert(x)), v, r) }
                else if x > v { Tree::Node(l, v, Box::new(r.insert(x))) }
                else          { Tree::Node(l, v, r) }
            }
        }
    }

    pub fn to_sorted_vec(&self) -> Vec<&T> {
        match self {
            Tree::Leaf => vec![],
            Tree::Node(l, v, r) => {
                let mut result = l.to_sorted_vec();
                result.push(v);
                result.extend(r.to_sorted_vec());
                result
            }
        }
    }
}

Rust (linked list with Box)

#[derive(Debug)]
pub enum List<T> {
    Nil,
    Cons(T, Box<List<T>>),
}

Rust (expression AST with Box)

#[derive(Debug)]
pub enum Expr {
    Num(f64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

pub fn eval(expr: &Expr) -> f64 {
    match expr {
        Expr::Num(n)    => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
        Expr::Neg(e)    => -eval(e),
    }
}

Type Signatures

Concept	OCaml	Rust
Recursive tree	`type 'a tree = Leaf \\| Node of 'a tree * 'a * 'a tree`	`enum Tree<T> { Leaf, Node(Box<Tree<T>>, T, Box<Tree<T>>) }`
Linked list	`type 'a mylist = Nil \\| Cons of 'a * 'a mylist`	`enum List<T> { Nil, Cons(T, Box<List<T>>) }`
Optional value	`'a option`	`Option<T>`
Heap pointer	implicit (GC)	`Box<T>` (explicit)
Insert signature	`val insert : 'a -> 'a tree -> 'a tree`	`fn insert(self, x: T) -> Self`
Eval signature	`val eval : expr -> float`	`fn eval(expr: &Expr) -> f64`

Key Insights

Heap allocation is implicit in OCaml, explicit in Rust. OCaml's GC stores every value behind a pointer; the programmer never thinks about it. Rust forces you to say Box::new(...) exactly where heap indirection occurs, making allocation costs visible and auditable.

**Box<T> is pointer-sized.** The compiler knows sizeof(Box<T>) == sizeof(usize), so sizeof(Node) = sizeof(Box<Tree<T>>) + sizeof(T) + sizeof(Box<Tree<T>>) is computable. Without Box, the size formula would be self-referential and the compiler rejects it with E0072: recursive type has infinite size.

**insert consumes self instead of returning a new allocation.** OCaml's insert creates new Node values sharing subtrees via the GC. Rust's version does the same by consuming the old tree and constructing a new one — no Clone required because ownership transfers.

Pattern matching looks identical. Both languages destructure recursive types with match/function. The only textual difference is Box::new(...) at construction sites and the absence of dereferencing when matching (Rust auto-derefs through Box in patterns).

The same pattern scales to any recursive data structure. Trees, lists, expression ASTs, s-expressions, trie nodes — any recursive enum uses Box on the recursive fields. The shape of the code is always the same; only the payload type and match arms change.

When to Use Each Style

Use idiomatic Rust (iterator/method style) when: building production data structures where you want to leverage std traits (Iterator, Default, From).

Use recursive Rust (close to OCaml) when: translating algorithms directly from functional pseudocode or a textbook, where preserving the structural correspondence helps understanding and correctness review.

Exercises

Add a contains method to Tree<T> that searches for a value using binary search.

Implement to_sorted_vec using in-order traversal and verify it returns elements in ascending order.

Add a height method that computes the maximum depth of the tree from root to any leaf.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust