Recursive Types
Tutorial
The Problem
A recursive type is one whose definition refers to itself — a tree node contains child nodes, a linked list node points to the next node. This is fundamental to representing hierarchical data: file systems, abstract syntax trees, JSON documents, and parse trees. Rust requires recursive types to have a known size at compile time, which forces explicit indirection via Box<T> at recursive positions.
🎯 Learning Outcomes
Box<T> (or other heap-indirected pointers) for recursive type definitionsBox wrapping to OCaml's implicit heap allocation for algebraic typesCode Example
#[derive(Debug)]
pub enum Tree<T> {
Leaf,
Node(Box<Tree<T>>, T, Box<Tree<T>>),
}
impl<T: Ord> Tree<T> {
pub fn insert(self, x: T) -> Self {
match self {
Tree::Leaf => Tree::Node(Box::new(Tree::Leaf), x, Box::new(Tree::Leaf)),
Tree::Node(l, v, r) => {
if x < v { Tree::Node(Box::new(l.insert(x)), v, r) }
else if x > v { Tree::Node(l, v, Box::new(r.insert(x))) }
else { Tree::Node(l, v, r) }
}
}
}
pub fn to_sorted_vec(&self) -> Vec<&T> {
match self {
Tree::Leaf => vec![],
Tree::Node(l, v, r) => {
let mut result = l.to_sorted_vec();
result.push(v);
result.extend(r.to_sorted_vec());
result
}
}
}
}Key Differences
Box<T> at recursive positions to break size cycles.insert(self, x) consumes the old tree (move semantics); OCaml's insert returns a new tree while the old one stays alive via GC.Box in patterns (handled transparently by the compiler in modern Rust).Leaf is a zero-size value; Rust's Tree::Leaf is similarly zero-size, but Node pays for two Box pointers.OCaml Approach
OCaml's type system handles recursive algebraic types without any annotation:
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
Every OCaml value is heap-allocated and pointer-accessed through the GC, so there is no size-at-compile-time problem. Pattern matching on trees is identical in structure to Rust's match, but without Box::new or dereferencing.
Full Source
#![allow(clippy::all)]
// Example 117: Recursive Types with Box
//
// Recursive types need Box in Rust because the compiler must know
// the size of each type at compile time. Box<T> is pointer-sized,
// breaking the infinite-size recursion.
// ── Approach 1: Binary search tree ──────────────────────────────────────────
//
// OCaml: type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
// Rust: the Node children must be boxed so the compiler sees a fixed size.
#[derive(Debug)]
pub enum Tree<T> {
Leaf,
Node(Box<Tree<T>>, T, Box<Tree<T>>),
}
impl<T: Ord> Tree<T> {
pub fn new() -> Self {
Tree::Leaf
}
/// Insert a value, returning the updated tree (consumes self).
pub fn insert(self, x: T) -> Self {
match self {
Tree::Leaf => Tree::Node(Box::new(Tree::Leaf), x, Box::new(Tree::Leaf)),
Tree::Node(l, v, r) => {
if x < v {
Tree::Node(Box::new(l.insert(x)), v, r)
} else if x > v {
Tree::Node(l, v, Box::new(r.insert(x)))
} else {
Tree::Node(l, v, r)
}
}
}
}
/// In-order traversal yields elements in sorted order.
pub fn to_sorted_vec(&self) -> Vec<&T> {
match self {
Tree::Leaf => vec![],
Tree::Node(l, v, r) => {
let mut result = l.to_sorted_vec();
result.push(v);
result.extend(r.to_sorted_vec());
result
}
}
}
pub fn contains(&self, x: &T) -> bool {
match self {
Tree::Leaf => false,
Tree::Node(l, v, r) => {
if x < v {
l.contains(x)
} else if x > v {
r.contains(x)
} else {
true
}
}
}
}
}
impl<T: Ord> Default for Tree<T> {
fn default() -> Self {
Tree::new()
}
}
// ── Approach 2: Singly-linked list ──────────────────────────────────────────
//
// OCaml: type 'a mylist = Nil | Cons of 'a * 'a mylist
// Rust: tail must be Box'd — same reason as the tree above.
#[derive(Debug)]
pub enum List<T> {
Nil,
Cons(T, Box<List<T>>),
}
impl<T> List<T> {
pub fn nil() -> Self {
List::Nil
}
/// Prepend an element (O(1)).
pub fn cons(head: T, tail: Self) -> Self {
List::Cons(head, Box::new(tail))
}
pub fn is_empty(&self) -> bool {
matches!(self, List::Nil)
}
pub fn len(&self) -> usize {
match self {
List::Nil => 0,
List::Cons(_, tail) => 1 + tail.len(),
}
}
pub fn to_vec(&self) -> Vec<&T> {
let mut result = vec![];
let mut current = self;
loop {
match current {
List::Nil => break,
List::Cons(h, tail) => {
result.push(h);
current = tail;
}
}
}
result
}
}
// ── Approach 3: Expression AST ───────────────────────────────────────────────
//
// A classic use-case: recursive expression trees for interpreters / compilers.
#[derive(Debug)]
pub enum Expr {
Num(f64),
Add(Box<Expr>, Box<Expr>),
Mul(Box<Expr>, Box<Expr>),
Neg(Box<Expr>),
}
pub fn eval(expr: &Expr) -> f64 {
match expr {
Expr::Num(n) => *n,
Expr::Add(a, b) => eval(a) + eval(b),
Expr::Mul(a, b) => eval(a) * eval(b),
Expr::Neg(e) => -eval(e),
}
}
// ── Tests ────────────────────────────────────────────────────────────────────
#[cfg(test)]
mod tests {
use super::*;
// Tree tests
#[test]
fn tree_insert_and_sorted_order() {
let tree = [5, 3, 7, 1, 4, 6, 8]
.into_iter()
.fold(Tree::new(), Tree::insert);
let sorted: Vec<i32> = tree.to_sorted_vec().into_iter().copied().collect();
assert_eq!(sorted, [1, 3, 4, 5, 6, 7, 8]);
}
#[test]
fn tree_contains() {
let tree = [10, 5, 15].into_iter().fold(Tree::new(), Tree::insert);
assert!(tree.contains(&5));
assert!(tree.contains(&15));
assert!(!tree.contains(&99));
}
#[test]
fn tree_empty_is_leaf() {
let tree: Tree<i32> = Tree::new();
assert!(tree.to_sorted_vec().is_empty());
assert!(!tree.contains(&0));
}
#[test]
fn tree_no_duplicates() {
let tree = [3, 3, 3].into_iter().fold(Tree::new(), Tree::insert);
assert_eq!(tree.to_sorted_vec(), [&3]);
}
// List tests
#[test]
fn list_len_and_to_vec() {
let list = List::cons(1, List::cons(2, List::cons(3, List::nil())));
assert_eq!(list.len(), 3);
assert_eq!(list.to_vec(), [&1, &2, &3]);
}
#[test]
fn list_nil_is_empty() {
let list: List<i32> = List::nil();
assert_eq!(list.len(), 0);
assert!(list.to_vec().is_empty());
}
// Expr / AST tests
#[test]
fn expr_eval_arithmetic() {
// (2 + 3) * -(4) == -20
let expr = Expr::Mul(
Box::new(Expr::Add(
Box::new(Expr::Num(2.0)),
Box::new(Expr::Num(3.0)),
)),
Box::new(Expr::Neg(Box::new(Expr::Num(4.0)))),
);
assert!((eval(&expr) - (-20.0)).abs() < f64::EPSILON);
}
#[test]
fn expr_eval_leaf() {
assert!((eval(&Expr::Num(42.0)) - 42.0).abs() < f64::EPSILON);
}
}#[cfg(test)]
mod tests {
use super::*;
// Tree tests
#[test]
fn tree_insert_and_sorted_order() {
let tree = [5, 3, 7, 1, 4, 6, 8]
.into_iter()
.fold(Tree::new(), Tree::insert);
let sorted: Vec<i32> = tree.to_sorted_vec().into_iter().copied().collect();
assert_eq!(sorted, [1, 3, 4, 5, 6, 7, 8]);
}
#[test]
fn tree_contains() {
let tree = [10, 5, 15].into_iter().fold(Tree::new(), Tree::insert);
assert!(tree.contains(&5));
assert!(tree.contains(&15));
assert!(!tree.contains(&99));
}
#[test]
fn tree_empty_is_leaf() {
let tree: Tree<i32> = Tree::new();
assert!(tree.to_sorted_vec().is_empty());
assert!(!tree.contains(&0));
}
#[test]
fn tree_no_duplicates() {
let tree = [3, 3, 3].into_iter().fold(Tree::new(), Tree::insert);
assert_eq!(tree.to_sorted_vec(), [&3]);
}
// List tests
#[test]
fn list_len_and_to_vec() {
let list = List::cons(1, List::cons(2, List::cons(3, List::nil())));
assert_eq!(list.len(), 3);
assert_eq!(list.to_vec(), [&1, &2, &3]);
}
#[test]
fn list_nil_is_empty() {
let list: List<i32> = List::nil();
assert_eq!(list.len(), 0);
assert!(list.to_vec().is_empty());
}
// Expr / AST tests
#[test]
fn expr_eval_arithmetic() {
// (2 + 3) * -(4) == -20
let expr = Expr::Mul(
Box::new(Expr::Add(
Box::new(Expr::Num(2.0)),
Box::new(Expr::Num(3.0)),
)),
Box::new(Expr::Neg(Box::new(Expr::Num(4.0)))),
);
assert!((eval(&expr) - (-20.0)).abs() < f64::EPSILON);
}
#[test]
fn expr_eval_leaf() {
assert!((eval(&Expr::Num(42.0)) - 42.0).abs() < f64::EPSILON);
}
}
Deep Comparison
OCaml vs Rust: Recursive Types with Box
Side-by-Side Code
OCaml
(* OCaml heap-allocates everything; recursive types just work *)
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
let rec insert x = function
| Leaf -> Node (Leaf, x, Leaf)
| Node (l, v, r) ->
if x < v then Node (insert x l, v, r)
else if x > v then Node (l, v, insert x r)
else Node (l, v, r)
let rec to_sorted_list = function
| Leaf -> []
| Node (l, v, r) -> to_sorted_list l @ [v] @ to_sorted_list r
(* Linked list *)
type 'a mylist = Nil | Cons of 'a * 'a mylist
let rec length = function Nil -> 0 | Cons (_, t) -> 1 + length t
(* Expression AST *)
type expr =
| Num of float
| Add of expr * expr
| Mul of expr * expr
| Neg of expr
let rec eval = function
| Num n -> n
| Add (a, b) -> eval a +. eval b
| Mul (a, b) -> eval a *. eval b
| Neg e -> -. eval e
Rust (idiomatic — Box for heap indirection)
#[derive(Debug)]
pub enum Tree<T> {
Leaf,
Node(Box<Tree<T>>, T, Box<Tree<T>>),
}
impl<T: Ord> Tree<T> {
pub fn insert(self, x: T) -> Self {
match self {
Tree::Leaf => Tree::Node(Box::new(Tree::Leaf), x, Box::new(Tree::Leaf)),
Tree::Node(l, v, r) => {
if x < v { Tree::Node(Box::new(l.insert(x)), v, r) }
else if x > v { Tree::Node(l, v, Box::new(r.insert(x))) }
else { Tree::Node(l, v, r) }
}
}
}
pub fn to_sorted_vec(&self) -> Vec<&T> {
match self {
Tree::Leaf => vec![],
Tree::Node(l, v, r) => {
let mut result = l.to_sorted_vec();
result.push(v);
result.extend(r.to_sorted_vec());
result
}
}
}
}
Rust (linked list with Box)
#[derive(Debug)]
pub enum List<T> {
Nil,
Cons(T, Box<List<T>>),
}
Rust (expression AST with Box)
#[derive(Debug)]
pub enum Expr {
Num(f64),
Add(Box<Expr>, Box<Expr>),
Mul(Box<Expr>, Box<Expr>),
Neg(Box<Expr>),
}
pub fn eval(expr: &Expr) -> f64 {
match expr {
Expr::Num(n) => *n,
Expr::Add(a, b) => eval(a) + eval(b),
Expr::Mul(a, b) => eval(a) * eval(b),
Expr::Neg(e) => -eval(e),
}
}
Type Signatures
| Concept | OCaml | Rust |
|---|---|---|
| Recursive tree | type 'a tree = Leaf \| Node of 'a tree * 'a * 'a tree | enum Tree<T> { Leaf, Node(Box<Tree<T>>, T, Box<Tree<T>>) } |
| Linked list | type 'a mylist = Nil \| Cons of 'a * 'a mylist | enum List<T> { Nil, Cons(T, Box<List<T>>) } |
| Optional value | 'a option | Option<T> |
| Heap pointer | implicit (GC) | Box<T> (explicit) |
| Insert signature | val insert : 'a -> 'a tree -> 'a tree | fn insert(self, x: T) -> Self |
| Eval signature | val eval : expr -> float | fn eval(expr: &Expr) -> f64 |
Key Insights
Box::new(...) exactly where heap indirection occurs, making allocation costs visible and auditable.Box<T> is pointer-sized.** The compiler knows sizeof(Box<T>) == sizeof(usize), so sizeof(Node) = sizeof(Box<Tree<T>>) + sizeof(T) + sizeof(Box<Tree<T>>) is computable. Without Box, the size formula would be self-referential and the compiler rejects it with E0072: recursive type has infinite size.insert consumes self instead of returning a new allocation.** OCaml's insert creates new Node values sharing subtrees via the GC. Rust's version does the same by consuming the old tree and constructing a new one — no Clone required because ownership transfers.match/function. The only textual difference is Box::new(...) at construction sites and the absence of dereferencing when matching (Rust auto-derefs through Box in patterns).Box on the recursive fields. The shape of the code is always the same; only the payload type and match arms change.When to Use Each Style
Use idiomatic Rust (iterator/method style) when: building production data structures where you want to leverage std traits (Iterator, Default, From).
Use recursive Rust (close to OCaml) when: translating algorithms directly from functional pseudocode or a textbook, where preserving the structural correspondence helps understanding and correctness review.
Exercises
contains method to Tree<T> that searches for a value using binary search.to_sorted_vec using in-order traversal and verify it returns elements in ascending order.height method that computes the maximum depth of the tree from root to any leaf.