ExamplesBy LevelBy TopicLearning Paths
117 Advanced

Recursive Types

Functional Programming

Tutorial

The Problem

A recursive type is one whose definition refers to itself — a tree node contains child nodes, a linked list node points to the next node. This is fundamental to representing hierarchical data: file systems, abstract syntax trees, JSON documents, and parse trees. Rust requires recursive types to have a known size at compile time, which forces explicit indirection via Box<T> at recursive positions.

🎯 Learning Outcomes

  • • Understand why Rust requires Box<T> (or other heap-indirected pointers) for recursive type definitions
  • • Learn to implement a binary search tree using recursive enums
  • • See how insert, search, and traversal operations work on recursive structures
  • • Compare Rust's explicit Box wrapping to OCaml's implicit heap allocation for algebraic types
  • Code Example

    #[derive(Debug)]
    pub enum Tree<T> {
        Leaf,
        Node(Box<Tree<T>>, T, Box<Tree<T>>),
    }
    
    impl<T: Ord> Tree<T> {
        pub fn insert(self, x: T) -> Self {
            match self {
                Tree::Leaf => Tree::Node(Box::new(Tree::Leaf), x, Box::new(Tree::Leaf)),
                Tree::Node(l, v, r) => {
                    if x < v      { Tree::Node(Box::new(l.insert(x)), v, r) }
                    else if x > v { Tree::Node(l, v, Box::new(r.insert(x))) }
                    else          { Tree::Node(l, v, r) }
                }
            }
        }
    
        pub fn to_sorted_vec(&self) -> Vec<&T> {
            match self {
                Tree::Leaf => vec![],
                Tree::Node(l, v, r) => {
                    let mut result = l.to_sorted_vec();
                    result.push(v);
                    result.extend(r.to_sorted_vec());
                    result
                }
            }
        }
    }

    Key Differences

  • Indirection: OCaml allocates all variants on the heap automatically; Rust requires explicit Box<T> at recursive positions to break size cycles.
  • Ownership on insert: Rust's insert(self, x) consumes the old tree (move semantics); OCaml's insert returns a new tree while the old one stays alive via GC.
  • Pattern matching: Both use structural pattern matching, but Rust must dereference Box in patterns (handled transparently by the compiler in modern Rust).
  • Leaf representation: OCaml's Leaf is a zero-size value; Rust's Tree::Leaf is similarly zero-size, but Node pays for two Box pointers.
  • OCaml Approach

    OCaml's type system handles recursive algebraic types without any annotation:

    type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
    

    Every OCaml value is heap-allocated and pointer-accessed through the GC, so there is no size-at-compile-time problem. Pattern matching on trees is identical in structure to Rust's match, but without Box::new or dereferencing.

    Full Source

    #![allow(clippy::all)]
    // Example 117: Recursive Types with Box
    //
    // Recursive types need Box in Rust because the compiler must know
    // the size of each type at compile time. Box<T> is pointer-sized,
    // breaking the infinite-size recursion.
    
    // ── Approach 1: Binary search tree ──────────────────────────────────────────
    //
    // OCaml: type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
    // Rust:  the Node children must be boxed so the compiler sees a fixed size.
    
    #[derive(Debug)]
    pub enum Tree<T> {
        Leaf,
        Node(Box<Tree<T>>, T, Box<Tree<T>>),
    }
    
    impl<T: Ord> Tree<T> {
        pub fn new() -> Self {
            Tree::Leaf
        }
    
        /// Insert a value, returning the updated tree (consumes self).
        pub fn insert(self, x: T) -> Self {
            match self {
                Tree::Leaf => Tree::Node(Box::new(Tree::Leaf), x, Box::new(Tree::Leaf)),
                Tree::Node(l, v, r) => {
                    if x < v {
                        Tree::Node(Box::new(l.insert(x)), v, r)
                    } else if x > v {
                        Tree::Node(l, v, Box::new(r.insert(x)))
                    } else {
                        Tree::Node(l, v, r)
                    }
                }
            }
        }
    
        /// In-order traversal yields elements in sorted order.
        pub fn to_sorted_vec(&self) -> Vec<&T> {
            match self {
                Tree::Leaf => vec![],
                Tree::Node(l, v, r) => {
                    let mut result = l.to_sorted_vec();
                    result.push(v);
                    result.extend(r.to_sorted_vec());
                    result
                }
            }
        }
    
        pub fn contains(&self, x: &T) -> bool {
            match self {
                Tree::Leaf => false,
                Tree::Node(l, v, r) => {
                    if x < v {
                        l.contains(x)
                    } else if x > v {
                        r.contains(x)
                    } else {
                        true
                    }
                }
            }
        }
    }
    
    impl<T: Ord> Default for Tree<T> {
        fn default() -> Self {
            Tree::new()
        }
    }
    
    // ── Approach 2: Singly-linked list ──────────────────────────────────────────
    //
    // OCaml: type 'a mylist = Nil | Cons of 'a * 'a mylist
    // Rust:  tail must be Box'd — same reason as the tree above.
    
    #[derive(Debug)]
    pub enum List<T> {
        Nil,
        Cons(T, Box<List<T>>),
    }
    
    impl<T> List<T> {
        pub fn nil() -> Self {
            List::Nil
        }
    
        /// Prepend an element (O(1)).
        pub fn cons(head: T, tail: Self) -> Self {
            List::Cons(head, Box::new(tail))
        }
    
        pub fn is_empty(&self) -> bool {
            matches!(self, List::Nil)
        }
    
        pub fn len(&self) -> usize {
            match self {
                List::Nil => 0,
                List::Cons(_, tail) => 1 + tail.len(),
            }
        }
    
        pub fn to_vec(&self) -> Vec<&T> {
            let mut result = vec![];
            let mut current = self;
            loop {
                match current {
                    List::Nil => break,
                    List::Cons(h, tail) => {
                        result.push(h);
                        current = tail;
                    }
                }
            }
            result
        }
    }
    
    // ── Approach 3: Expression AST ───────────────────────────────────────────────
    //
    // A classic use-case: recursive expression trees for interpreters / compilers.
    
    #[derive(Debug)]
    pub enum Expr {
        Num(f64),
        Add(Box<Expr>, Box<Expr>),
        Mul(Box<Expr>, Box<Expr>),
        Neg(Box<Expr>),
    }
    
    pub fn eval(expr: &Expr) -> f64 {
        match expr {
            Expr::Num(n) => *n,
            Expr::Add(a, b) => eval(a) + eval(b),
            Expr::Mul(a, b) => eval(a) * eval(b),
            Expr::Neg(e) => -eval(e),
        }
    }
    
    // ── Tests ────────────────────────────────────────────────────────────────────
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        // Tree tests
    
        #[test]
        fn tree_insert_and_sorted_order() {
            let tree = [5, 3, 7, 1, 4, 6, 8]
                .into_iter()
                .fold(Tree::new(), Tree::insert);
            let sorted: Vec<i32> = tree.to_sorted_vec().into_iter().copied().collect();
            assert_eq!(sorted, [1, 3, 4, 5, 6, 7, 8]);
        }
    
        #[test]
        fn tree_contains() {
            let tree = [10, 5, 15].into_iter().fold(Tree::new(), Tree::insert);
            assert!(tree.contains(&5));
            assert!(tree.contains(&15));
            assert!(!tree.contains(&99));
        }
    
        #[test]
        fn tree_empty_is_leaf() {
            let tree: Tree<i32> = Tree::new();
            assert!(tree.to_sorted_vec().is_empty());
            assert!(!tree.contains(&0));
        }
    
        #[test]
        fn tree_no_duplicates() {
            let tree = [3, 3, 3].into_iter().fold(Tree::new(), Tree::insert);
            assert_eq!(tree.to_sorted_vec(), [&3]);
        }
    
        // List tests
    
        #[test]
        fn list_len_and_to_vec() {
            let list = List::cons(1, List::cons(2, List::cons(3, List::nil())));
            assert_eq!(list.len(), 3);
            assert_eq!(list.to_vec(), [&1, &2, &3]);
        }
    
        #[test]
        fn list_nil_is_empty() {
            let list: List<i32> = List::nil();
            assert_eq!(list.len(), 0);
            assert!(list.to_vec().is_empty());
        }
    
        // Expr / AST tests
    
        #[test]
        fn expr_eval_arithmetic() {
            // (2 + 3) * -(4)  == -20
            let expr = Expr::Mul(
                Box::new(Expr::Add(
                    Box::new(Expr::Num(2.0)),
                    Box::new(Expr::Num(3.0)),
                )),
                Box::new(Expr::Neg(Box::new(Expr::Num(4.0)))),
            );
            assert!((eval(&expr) - (-20.0)).abs() < f64::EPSILON);
        }
    
        #[test]
        fn expr_eval_leaf() {
            assert!((eval(&Expr::Num(42.0)) - 42.0).abs() < f64::EPSILON);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        // Tree tests
    
        #[test]
        fn tree_insert_and_sorted_order() {
            let tree = [5, 3, 7, 1, 4, 6, 8]
                .into_iter()
                .fold(Tree::new(), Tree::insert);
            let sorted: Vec<i32> = tree.to_sorted_vec().into_iter().copied().collect();
            assert_eq!(sorted, [1, 3, 4, 5, 6, 7, 8]);
        }
    
        #[test]
        fn tree_contains() {
            let tree = [10, 5, 15].into_iter().fold(Tree::new(), Tree::insert);
            assert!(tree.contains(&5));
            assert!(tree.contains(&15));
            assert!(!tree.contains(&99));
        }
    
        #[test]
        fn tree_empty_is_leaf() {
            let tree: Tree<i32> = Tree::new();
            assert!(tree.to_sorted_vec().is_empty());
            assert!(!tree.contains(&0));
        }
    
        #[test]
        fn tree_no_duplicates() {
            let tree = [3, 3, 3].into_iter().fold(Tree::new(), Tree::insert);
            assert_eq!(tree.to_sorted_vec(), [&3]);
        }
    
        // List tests
    
        #[test]
        fn list_len_and_to_vec() {
            let list = List::cons(1, List::cons(2, List::cons(3, List::nil())));
            assert_eq!(list.len(), 3);
            assert_eq!(list.to_vec(), [&1, &2, &3]);
        }
    
        #[test]
        fn list_nil_is_empty() {
            let list: List<i32> = List::nil();
            assert_eq!(list.len(), 0);
            assert!(list.to_vec().is_empty());
        }
    
        // Expr / AST tests
    
        #[test]
        fn expr_eval_arithmetic() {
            // (2 + 3) * -(4)  == -20
            let expr = Expr::Mul(
                Box::new(Expr::Add(
                    Box::new(Expr::Num(2.0)),
                    Box::new(Expr::Num(3.0)),
                )),
                Box::new(Expr::Neg(Box::new(Expr::Num(4.0)))),
            );
            assert!((eval(&expr) - (-20.0)).abs() < f64::EPSILON);
        }
    
        #[test]
        fn expr_eval_leaf() {
            assert!((eval(&Expr::Num(42.0)) - 42.0).abs() < f64::EPSILON);
        }
    }

    Deep Comparison

    OCaml vs Rust: Recursive Types with Box

    Side-by-Side Code

    OCaml

    (* OCaml heap-allocates everything; recursive types just work *)
    type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
    
    let rec insert x = function
      | Leaf -> Node (Leaf, x, Leaf)
      | Node (l, v, r) ->
        if x < v then Node (insert x l, v, r)
        else if x > v then Node (l, v, insert x r)
        else Node (l, v, r)
    
    let rec to_sorted_list = function
      | Leaf -> []
      | Node (l, v, r) -> to_sorted_list l @ [v] @ to_sorted_list r
    
    (* Linked list *)
    type 'a mylist = Nil | Cons of 'a * 'a mylist
    
    let rec length = function Nil -> 0 | Cons (_, t) -> 1 + length t
    
    (* Expression AST *)
    type expr =
      | Num of float
      | Add of expr * expr
      | Mul of expr * expr
      | Neg of expr
    
    let rec eval = function
      | Num n -> n
      | Add (a, b) -> eval a +. eval b
      | Mul (a, b) -> eval a *. eval b
      | Neg e -> -. eval e
    

    Rust (idiomatic — Box for heap indirection)

    #[derive(Debug)]
    pub enum Tree<T> {
        Leaf,
        Node(Box<Tree<T>>, T, Box<Tree<T>>),
    }
    
    impl<T: Ord> Tree<T> {
        pub fn insert(self, x: T) -> Self {
            match self {
                Tree::Leaf => Tree::Node(Box::new(Tree::Leaf), x, Box::new(Tree::Leaf)),
                Tree::Node(l, v, r) => {
                    if x < v      { Tree::Node(Box::new(l.insert(x)), v, r) }
                    else if x > v { Tree::Node(l, v, Box::new(r.insert(x))) }
                    else          { Tree::Node(l, v, r) }
                }
            }
        }
    
        pub fn to_sorted_vec(&self) -> Vec<&T> {
            match self {
                Tree::Leaf => vec![],
                Tree::Node(l, v, r) => {
                    let mut result = l.to_sorted_vec();
                    result.push(v);
                    result.extend(r.to_sorted_vec());
                    result
                }
            }
        }
    }
    

    Rust (linked list with Box)

    #[derive(Debug)]
    pub enum List<T> {
        Nil,
        Cons(T, Box<List<T>>),
    }
    

    Rust (expression AST with Box)

    #[derive(Debug)]
    pub enum Expr {
        Num(f64),
        Add(Box<Expr>, Box<Expr>),
        Mul(Box<Expr>, Box<Expr>),
        Neg(Box<Expr>),
    }
    
    pub fn eval(expr: &Expr) -> f64 {
        match expr {
            Expr::Num(n)    => *n,
            Expr::Add(a, b) => eval(a) + eval(b),
            Expr::Mul(a, b) => eval(a) * eval(b),
            Expr::Neg(e)    => -eval(e),
        }
    }
    

    Type Signatures

    ConceptOCamlRust
    Recursive treetype 'a tree = Leaf \| Node of 'a tree * 'a * 'a treeenum Tree<T> { Leaf, Node(Box<Tree<T>>, T, Box<Tree<T>>) }
    Linked listtype 'a mylist = Nil \| Cons of 'a * 'a mylistenum List<T> { Nil, Cons(T, Box<List<T>>) }
    Optional value'a optionOption<T>
    Heap pointerimplicit (GC)Box<T> (explicit)
    Insert signatureval insert : 'a -> 'a tree -> 'a treefn insert(self, x: T) -> Self
    Eval signatureval eval : expr -> floatfn eval(expr: &Expr) -> f64

    Key Insights

  • Heap allocation is implicit in OCaml, explicit in Rust. OCaml's GC stores every value behind a pointer; the programmer never thinks about it. Rust forces you to say Box::new(...) exactly where heap indirection occurs, making allocation costs visible and auditable.
  • **Box<T> is pointer-sized.** The compiler knows sizeof(Box<T>) == sizeof(usize), so sizeof(Node) = sizeof(Box<Tree<T>>) + sizeof(T) + sizeof(Box<Tree<T>>) is computable. Without Box, the size formula would be self-referential and the compiler rejects it with E0072: recursive type has infinite size.
  • **insert consumes self instead of returning a new allocation.** OCaml's insert creates new Node values sharing subtrees via the GC. Rust's version does the same by consuming the old tree and constructing a new one — no Clone required because ownership transfers.
  • Pattern matching looks identical. Both languages destructure recursive types with match/function. The only textual difference is Box::new(...) at construction sites and the absence of dereferencing when matching (Rust auto-derefs through Box in patterns).
  • The same pattern scales to any recursive data structure. Trees, lists, expression ASTs, s-expressions, trie nodes — any recursive enum uses Box on the recursive fields. The shape of the code is always the same; only the payload type and match arms change.
  • When to Use Each Style

    Use idiomatic Rust (iterator/method style) when: building production data structures where you want to leverage std traits (Iterator, Default, From).

    Use recursive Rust (close to OCaml) when: translating algorithms directly from functional pseudocode or a textbook, where preserving the structural correspondence helps understanding and correctness review.

    Exercises

  • Add a contains method to Tree<T> that searches for a value using binary search.
  • Implement to_sorted_vec using in-order traversal and verify it returns elements in ascending order.
  • Add a height method that computes the maximum depth of the tree from root to any leaf.
  • Open Source Repos