116 Intermediate

116-box-heap — Box<T>: Heap Allocation

Functional Programming

Tutorial

The Problem

Rust values live on the stack by default. Two situations require heap allocation: when a value is too large for the stack, and when a type is recursive (its size would be infinite). Box<T> is Rust's simplest heap allocation — a single owned pointer to a heap-allocated T with no reference counting overhead.

Box<T> is also the mechanism for trait objects (Box<dyn Trait>) enabling runtime polymorphism — the Rust equivalent of OOP's virtual dispatch.

🎯 Learning Outcomes

• Use Box<T> to heap-allocate large data with single ownership

• Understand why recursive types require Box (breaks the infinite size recursion)

• Use Box<dyn Trait> for dynamic dispatch and runtime polymorphism

• Know the deref coercion: Box<T> automatically derefs to &T

• Understand that Box<T> drops the inner value when it goes out of scope

Code Example

pub trait Shape { fn area(&self) -> f64; }

pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

(* OCaml allocates everything on the GC heap automatically. *)
(* Recursive types work without indirection syntax: *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr

let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

let () =
  let e = Add (Num 1, Mul (Num 2, Num 3)) in
  assert (eval e = 7);
  print_endline "ok"

Key Differences

Automatic vs explicit: OCaml heap-allocates all non-trivial values automatically; Rust requires Box<T> to explicitly move to the heap.

Recursive types: OCaml recursive ADTs work without any annotation; Rust requires Box<T> to break the size cycle.

**dyn Trait**: Rust's Box<dyn Trait> enables dynamic dispatch; OCaml uses polymorphic variants or first-class modules for equivalent runtime polymorphism.

Drop semantics: Rust's Box<T> drops the inner value when the Box goes out of scope; OCaml's GC collects values when reference count drops to zero.

OCaml Approach

OCaml heap-allocates everything except integers and booleans — all ADT values, structs, and closures live on the heap implicitly:

type expr =
  | Num of int
  | Add of expr * expr  (* no Box needed — GC handles recursion *)
  | Mul of expr * expr

let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

OCaml's GC manages the recursive heap allocations automatically. There is no equivalent to Box — all variant values are heap-allocated by the runtime.

Full Source

#![allow(clippy::all)]
// Example 116: Box<T> — Heap Allocation
//
// Box<T> puts data on the heap with single ownership.
// Use for: large data, recursive types, and trait objects (dyn Trait).

// --- Approach 1: Heap-allocating large data ---
//
// The heap holds the array; the stack holds only an 8-byte pointer.
pub fn sum_boxed_squares(n: usize) -> i64 {
    let squares: Box<Vec<i64>> = Box::new((0..n as i64).map(|i| i * i).collect());
    squares.iter().sum()
}

// --- Approach 2: Recursive types require Box for known size ---
//
// Without Box the compiler cannot compute the size of Expr
// (it would be infinitely recursive).  Box breaks the cycle:
// each variant stores a pointer (8 bytes), not the full sub-tree.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

pub fn eval(expr: &Expr) -> i32 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
    }
}

// Convenience constructors so call-sites stay readable.
pub fn num(n: i32) -> Expr {
    Expr::Num(n)
}
pub fn add(a: Expr, b: Expr) -> Expr {
    Expr::Add(Box::new(a), Box::new(b))
}
pub fn mul(a: Expr, b: Expr) -> Expr {
    Expr::Mul(Box::new(a), Box::new(b))
}

// --- Approach 3: Trait objects — heterogeneous collections ---
//
// Box<dyn Shape> is always pointer-sized regardless of the concrete type,
// letting us store mixed shapes in a single Vec.
pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Circle {
    pub radius: f64,
}
pub struct Rectangle {
    pub width: f64,
    pub height: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_boxed_value_dereferences_transparently() {
        let b: Box<i32> = Box::new(42);
        // Deref coercion: *b behaves like i32
        assert_eq!(*b, 42);
        assert_eq!(*b + 1, 43);
    }

    #[test]
    fn test_sum_boxed_squares() {
        // 0² + 1² + 2² + 3² + 4² = 0+1+4+9+16 = 30
        assert_eq!(sum_boxed_squares(5), 30);
        assert_eq!(sum_boxed_squares(0), 0);
        assert_eq!(sum_boxed_squares(1), 0); // only 0²
    }

    #[test]
    fn test_recursive_expr_eval() {
        // 1 + 2 * 3  (multiplication binds tighter in OCaml source)
        let expr = add(num(1), mul(num(2), num(3)));
        assert_eq!(eval(&expr), 7);

        // (1 + 2) * (3 + 4) = 3 * 7 = 21
        let expr2 = mul(add(num(1), num(2)), add(num(3), num(4)));
        assert_eq!(eval(&expr2), 21);
    }

    #[test]
    fn test_trait_objects_in_vec() {
        let shapes: Vec<Box<dyn Shape>> = vec![
            Box::new(Rectangle {
                width: 4.0,
                height: 5.0,
            }),
            Box::new(Circle { radius: 0.0 }),
        ];
        let total = total_area(&shapes);
        assert!((total - 20.0).abs() < 1e-10);
    }

    #[test]
    fn test_box_size_is_pointer_sized() {
        // Box<T> is always one pointer wide, regardless of T's size.
        assert_eq!(
            std::mem::size_of::<Box<[i32; 1000]>>(),
            std::mem::size_of::<*const u8>()
        );
    }

    #[test]
    fn test_nested_expr_num_only() {
        assert_eq!(eval(&num(0)), 0);
        assert_eq!(eval(&num(-5)), -5);
    }
}

(* Example 116: Box<T> for Heap Allocation *)

(* OCaml allocates everything on the GC heap automatically.
   Rust's Box<T> is the explicit way to heap-allocate. *)

(* Approach 1: Large data structures *)
let approach1 () =
  let big_array = Array.init 1000 (fun i -> i * i) in
  assert (big_array.(999) = 999 * 999);
  Printf.printf "big_array[999] = %d\n" big_array.(999)

(* Approach 2: Indirection for recursive types (see example 117) *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr

let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

let approach2 () =
  let e = Add (Num 1, Mul (Num 2, Num 3)) in
  let result = eval e in
  assert (result = 7);
  Printf.printf "1 + 2*3 = %d\n" result

(* Approach 3: Trait-like polymorphism with first-class modules *)
module type Shape = sig
  val area : unit -> float
  val name : unit -> string
end

let circle r : (module Shape) = (module struct
  let area () = Float.pi *. r *. r
  let name () = "circle"
end)

let square s : (module Shape) = (module struct
  let area () = s *. s
  let name () = "square"
end)

let approach3 () =
  let shapes = [circle 5.0; square 4.0] in
  List.iter (fun (module S : Shape) ->
    Printf.printf "%s: area=%.2f\n" (S.name ()) (S.area ())
  ) shapes

let () =
  approach1 ();
  approach2 ();
  approach3 ();
  Printf.printf "✓ All tests passed\n"

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_boxed_value_dereferences_transparently() {
        let b: Box<i32> = Box::new(42);
        // Deref coercion: *b behaves like i32
        assert_eq!(*b, 42);
        assert_eq!(*b + 1, 43);
    }

    #[test]
    fn test_sum_boxed_squares() {
        // 0² + 1² + 2² + 3² + 4² = 0+1+4+9+16 = 30
        assert_eq!(sum_boxed_squares(5), 30);
        assert_eq!(sum_boxed_squares(0), 0);
        assert_eq!(sum_boxed_squares(1), 0); // only 0²
    }

    #[test]
    fn test_recursive_expr_eval() {
        // 1 + 2 * 3  (multiplication binds tighter in OCaml source)
        let expr = add(num(1), mul(num(2), num(3)));
        assert_eq!(eval(&expr), 7);

        // (1 + 2) * (3 + 4) = 3 * 7 = 21
        let expr2 = mul(add(num(1), num(2)), add(num(3), num(4)));
        assert_eq!(eval(&expr2), 21);
    }

    #[test]
    fn test_trait_objects_in_vec() {
        let shapes: Vec<Box<dyn Shape>> = vec![
            Box::new(Rectangle {
                width: 4.0,
                height: 5.0,
            }),
            Box::new(Circle { radius: 0.0 }),
        ];
        let total = total_area(&shapes);
        assert!((total - 20.0).abs() < 1e-10);
    }

    #[test]
    fn test_box_size_is_pointer_sized() {
        // Box<T> is always one pointer wide, regardless of T's size.
        assert_eq!(
            std::mem::size_of::<Box<[i32; 1000]>>(),
            std::mem::size_of::<*const u8>()
        );
    }

    #[test]
    fn test_nested_expr_num_only() {
        assert_eq!(eval(&num(0)), 0);
        assert_eq!(eval(&num(-5)), -5);
    }
}

Deep Comparison

OCaml vs Rust: Box\<T\> — Heap Allocation

Side-by-Side Code

OCaml

(* OCaml allocates everything on the GC heap automatically. *)
(* Recursive types work without indirection syntax: *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr

let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

let () =
  let e = Add (Num 1, Mul (Num 2, Num 3)) in
  assert (eval e = 7);
  print_endline "ok"

Rust (idiomatic — trait objects)

pub trait Shape { fn area(&self) -> f64; }

pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

Rust (functional/recursive — expression evaluator)

#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),  // Box breaks the infinite-size cycle
    Mul(Box<Expr>, Box<Expr>),
}

pub fn eval(expr: &Expr) -> i32 {
    match expr {
        Expr::Num(n)    => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
    }
}

Type Signatures

Concept	OCaml	Rust
Heap allocation	automatic (GC)	`Box::new(value)`
Recursive type	`type expr = Add of expr * expr`	`Add(Box<Expr>, Box<Expr>)`
Trait object	first-class module / object	`Box<dyn Trait>`
Pointer size	hidden (GC word)	`std::mem::size_of::<Box<T>>() == 8`
Deallocation	GC	automatic on `Drop`

Key Insights

Implicit vs explicit heap allocation: OCaml's GC heap-allocates nearly everything (tuples, variants, closures) with no syntax at all. Rust defaults to the stack and requires Box::new(...) to opt into heap allocation, making the memory location explicit and auditable.

Recursive types: OCaml's algebraic types are represented as GC-managed pointers internally, so Add of expr * expr just works. Rust needs the Box wrapper to make the recursive variant's size finite; without it the compiler rejects the type as having infinite size.

No garbage collector: Box<T> follows Rust's ownership rules — when the Box goes out of scope, drop is called and the heap memory is freed immediately. There is no pause, no collector thread, and no runtime overhead beyond the allocation itself.

Trait objects and heterogeneous collections: OCaml achieves polymorphism via parametric types or first-class modules. Rust uses Box<dyn Trait> to store different concrete types behind a uniform pointer, enabling Vec<Box<dyn Shape>> where each element may be a different struct.

**Zero overhead for Box<T>:** At runtime, a Box<T> compiles down to a raw pointer plus a free on drop — exactly what a C programmer would write by hand. There is no reference counting, no fat pointer (unlike Rc or Arc), and no metadata unless the pointee is a dyn Trait (which adds a vtable pointer).

When to Use Each Style

**Use Box<T> (idiomatic Rust) when:** you need a single-owner heap allocation — a recursive enum, a large array you don't want on the stack, or a Box<dyn Trait> for runtime polymorphism.

Use recursive Rust when: modelling tree-structured data (ASTs, parse trees, linked lists) and you want the OCaml-like pattern-match style, just with explicit Box indirection at each recursive position.

Exercises

Add Neg(Box<Expr>) and Div(Box<Expr>, Box<Expr>) variants to Expr and extend eval to handle them.

Implement a Display for Expr that prints expressions with correct parenthesization.

Write a list of heterogeneous closures Vec<Box<dyn Fn(i32) -> i32>> and apply them all in sequence to an initial value.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust