116-box-heap — Box<T>: Heap Allocation
Tutorial
The Problem
Rust values live on the stack by default. Two situations require heap allocation: when a value is too large for the stack, and when a type is recursive (its size would be infinite). Box<T> is Rust's simplest heap allocation — a single owned pointer to a heap-allocated T with no reference counting overhead.
Box<T> is also the mechanism for trait objects (Box<dyn Trait>) enabling runtime polymorphism — the Rust equivalent of OOP's virtual dispatch.
🎯 Learning Outcomes
Box<T> to heap-allocate large data with single ownershipBox (breaks the infinite size recursion)Box<dyn Trait> for dynamic dispatch and runtime polymorphismBox<T> automatically derefs to &TBox<T> drops the inner value when it goes out of scopeCode Example
pub trait Shape { fn area(&self) -> f64; }
pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
shapes.iter().map(|s| s.area()).sum()
}Key Differences
Box<T> to explicitly move to the heap.Box<T> to break the size cycle.dyn Trait**: Rust's Box<dyn Trait> enables dynamic dispatch; OCaml uses polymorphic variants or first-class modules for equivalent runtime polymorphism.Box<T> drops the inner value when the Box goes out of scope; OCaml's GC collects values when reference count drops to zero.OCaml Approach
OCaml heap-allocates everything except integers and booleans — all ADT values, structs, and closures live on the heap implicitly:
type expr =
| Num of int
| Add of expr * expr (* no Box needed — GC handles recursion *)
| Mul of expr * expr
let rec eval = function
| Num n -> n
| Add (a, b) -> eval a + eval b
| Mul (a, b) -> eval a * eval b
OCaml's GC manages the recursive heap allocations automatically. There is no equivalent to Box — all variant values are heap-allocated by the runtime.
Full Source
#![allow(clippy::all)]
// Example 116: Box<T> — Heap Allocation
//
// Box<T> puts data on the heap with single ownership.
// Use for: large data, recursive types, and trait objects (dyn Trait).
// --- Approach 1: Heap-allocating large data ---
//
// The heap holds the array; the stack holds only an 8-byte pointer.
pub fn sum_boxed_squares(n: usize) -> i64 {
let squares: Box<Vec<i64>> = Box::new((0..n as i64).map(|i| i * i).collect());
squares.iter().sum()
}
// --- Approach 2: Recursive types require Box for known size ---
//
// Without Box the compiler cannot compute the size of Expr
// (it would be infinitely recursive). Box breaks the cycle:
// each variant stores a pointer (8 bytes), not the full sub-tree.
#[derive(Debug, PartialEq)]
pub enum Expr {
Num(i32),
Add(Box<Expr>, Box<Expr>),
Mul(Box<Expr>, Box<Expr>),
}
pub fn eval(expr: &Expr) -> i32 {
match expr {
Expr::Num(n) => *n,
Expr::Add(a, b) => eval(a) + eval(b),
Expr::Mul(a, b) => eval(a) * eval(b),
}
}
// Convenience constructors so call-sites stay readable.
pub fn num(n: i32) -> Expr {
Expr::Num(n)
}
pub fn add(a: Expr, b: Expr) -> Expr {
Expr::Add(Box::new(a), Box::new(b))
}
pub fn mul(a: Expr, b: Expr) -> Expr {
Expr::Mul(Box::new(a), Box::new(b))
}
// --- Approach 3: Trait objects — heterogeneous collections ---
//
// Box<dyn Shape> is always pointer-sized regardless of the concrete type,
// letting us store mixed shapes in a single Vec.
pub trait Shape {
fn area(&self) -> f64;
}
pub struct Circle {
pub radius: f64,
}
pub struct Rectangle {
pub width: f64,
pub height: f64,
}
impl Shape for Circle {
fn area(&self) -> f64 {
std::f64::consts::PI * self.radius * self.radius
}
}
impl Shape for Rectangle {
fn area(&self) -> f64 {
self.width * self.height
}
}
pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
shapes.iter().map(|s| s.area()).sum()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_boxed_value_dereferences_transparently() {
let b: Box<i32> = Box::new(42);
// Deref coercion: *b behaves like i32
assert_eq!(*b, 42);
assert_eq!(*b + 1, 43);
}
#[test]
fn test_sum_boxed_squares() {
// 0² + 1² + 2² + 3² + 4² = 0+1+4+9+16 = 30
assert_eq!(sum_boxed_squares(5), 30);
assert_eq!(sum_boxed_squares(0), 0);
assert_eq!(sum_boxed_squares(1), 0); // only 0²
}
#[test]
fn test_recursive_expr_eval() {
// 1 + 2 * 3 (multiplication binds tighter in OCaml source)
let expr = add(num(1), mul(num(2), num(3)));
assert_eq!(eval(&expr), 7);
// (1 + 2) * (3 + 4) = 3 * 7 = 21
let expr2 = mul(add(num(1), num(2)), add(num(3), num(4)));
assert_eq!(eval(&expr2), 21);
}
#[test]
fn test_trait_objects_in_vec() {
let shapes: Vec<Box<dyn Shape>> = vec![
Box::new(Rectangle {
width: 4.0,
height: 5.0,
}),
Box::new(Circle { radius: 0.0 }),
];
let total = total_area(&shapes);
assert!((total - 20.0).abs() < 1e-10);
}
#[test]
fn test_box_size_is_pointer_sized() {
// Box<T> is always one pointer wide, regardless of T's size.
assert_eq!(
std::mem::size_of::<Box<[i32; 1000]>>(),
std::mem::size_of::<*const u8>()
);
}
#[test]
fn test_nested_expr_num_only() {
assert_eq!(eval(&num(0)), 0);
assert_eq!(eval(&num(-5)), -5);
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_boxed_value_dereferences_transparently() {
let b: Box<i32> = Box::new(42);
// Deref coercion: *b behaves like i32
assert_eq!(*b, 42);
assert_eq!(*b + 1, 43);
}
#[test]
fn test_sum_boxed_squares() {
// 0² + 1² + 2² + 3² + 4² = 0+1+4+9+16 = 30
assert_eq!(sum_boxed_squares(5), 30);
assert_eq!(sum_boxed_squares(0), 0);
assert_eq!(sum_boxed_squares(1), 0); // only 0²
}
#[test]
fn test_recursive_expr_eval() {
// 1 + 2 * 3 (multiplication binds tighter in OCaml source)
let expr = add(num(1), mul(num(2), num(3)));
assert_eq!(eval(&expr), 7);
// (1 + 2) * (3 + 4) = 3 * 7 = 21
let expr2 = mul(add(num(1), num(2)), add(num(3), num(4)));
assert_eq!(eval(&expr2), 21);
}
#[test]
fn test_trait_objects_in_vec() {
let shapes: Vec<Box<dyn Shape>> = vec![
Box::new(Rectangle {
width: 4.0,
height: 5.0,
}),
Box::new(Circle { radius: 0.0 }),
];
let total = total_area(&shapes);
assert!((total - 20.0).abs() < 1e-10);
}
#[test]
fn test_box_size_is_pointer_sized() {
// Box<T> is always one pointer wide, regardless of T's size.
assert_eq!(
std::mem::size_of::<Box<[i32; 1000]>>(),
std::mem::size_of::<*const u8>()
);
}
#[test]
fn test_nested_expr_num_only() {
assert_eq!(eval(&num(0)), 0);
assert_eq!(eval(&num(-5)), -5);
}
}
Deep Comparison
OCaml vs Rust: Box\<T\> — Heap Allocation
Side-by-Side Code
OCaml
(* OCaml allocates everything on the GC heap automatically. *)
(* Recursive types work without indirection syntax: *)
type expr =
| Num of int
| Add of expr * expr
| Mul of expr * expr
let rec eval = function
| Num n -> n
| Add (a, b) -> eval a + eval b
| Mul (a, b) -> eval a * eval b
let () =
let e = Add (Num 1, Mul (Num 2, Num 3)) in
assert (eval e = 7);
print_endline "ok"
Rust (idiomatic — trait objects)
pub trait Shape { fn area(&self) -> f64; }
pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
shapes.iter().map(|s| s.area()).sum()
}
Rust (functional/recursive — expression evaluator)
#[derive(Debug, PartialEq)]
pub enum Expr {
Num(i32),
Add(Box<Expr>, Box<Expr>), // Box breaks the infinite-size cycle
Mul(Box<Expr>, Box<Expr>),
}
pub fn eval(expr: &Expr) -> i32 {
match expr {
Expr::Num(n) => *n,
Expr::Add(a, b) => eval(a) + eval(b),
Expr::Mul(a, b) => eval(a) * eval(b),
}
}
Type Signatures
| Concept | OCaml | Rust |
|---|---|---|
| Heap allocation | automatic (GC) | Box::new(value) |
| Recursive type | type expr = Add of expr * expr | Add(Box<Expr>, Box<Expr>) |
| Trait object | first-class module / object | Box<dyn Trait> |
| Pointer size | hidden (GC word) | std::mem::size_of::<Box<T>>() == 8 |
| Deallocation | GC | automatic on Drop |
Key Insights
Box::new(...) to opt into heap allocation, making the memory location explicit and auditable.Add of expr * expr just works. Rust needs the Box wrapper to make the recursive variant's size finite; without it the compiler rejects the type as having infinite size.Box<T> follows Rust's ownership rules — when the Box goes out of scope, drop is called and the heap memory is freed immediately. There is no pause, no collector thread, and no runtime overhead beyond the allocation itself.Box<dyn Trait> to store different concrete types behind a uniform pointer, enabling Vec<Box<dyn Shape>> where each element may be a different struct.Box<T>:** At runtime, a Box<T> compiles down to a raw pointer plus a free on drop — exactly what a C programmer would write by hand. There is no reference counting, no fat pointer (unlike Rc or Arc), and no metadata unless the pointee is a dyn Trait (which adds a vtable pointer).When to Use Each Style
**Use Box<T> (idiomatic Rust) when:** you need a single-owner heap allocation — a recursive enum, a large array you don't want on the stack, or a Box<dyn Trait> for runtime polymorphism.
Use recursive Rust when: modelling tree-structured data (ASTs, parse trees, linked lists) and you want the OCaml-like pattern-match style, just with explicit Box indirection at each recursive position.
Exercises
Neg(Box<Expr>) and Div(Box<Expr>, Box<Expr>) variants to Expr and extend eval to handle them.Display for Expr that prints expressions with correct parenthesization.Vec<Box<dyn Fn(i32) -> i32>> and apply them all in sequence to an initial value.