ExamplesBy LevelBy TopicLearning Paths
552 Intermediate

Arena Allocation Pattern

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "Arena Allocation Pattern" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Arena allocation (also called region-based memory management) is a technique where all allocations live in a single region and are freed all at once when the region is dropped. Key difference from OCaml: 1. **Lifetime enforcement**: Rust arena references are tied to the arena's lifetime at compile time — use

Tutorial

The Problem

Arena allocation (also called region-based memory management) is a technique where all allocations live in a single region and are freed all at once when the region is dropped. This is dramatically faster than individual heap allocations: no per-object malloc overhead, excellent cache locality, and zero fragmentation. Compilers (LLVM, GCC), game engines, and web browsers use arenas for AST nodes, parse results, and per-frame allocations. In Rust, arenas elegantly tie allocated objects to the arena's lifetime, preventing use-after-arena-drop at compile time.

🎯 Learning Outcomes

  • • How StringArena stores all strings in a Vec<String> and returns &str references tied to the arena's lifetime
  • • Why arena-allocated references cannot outlive the arena (lifetime enforcement)
  • • How arenas eliminate per-object allocation overhead for batch workloads
  • • How the bumpalo crate implements a production arena allocator
  • • Where arenas are used: compilers (LLVM), parsers (nom ASTs), game engines, HTTP request processing
  • Code Example

    #![allow(clippy::all)]
    //! Arena Allocation Pattern
    //!
    //! All allocations tied to arena lifetime.
    
    /// Simple arena for strings.
    pub struct StringArena {
        storage: Vec<String>,
    }
    
    impl StringArena {
        pub fn new() -> Self {
            StringArena {
                storage: Vec::new(),
            }
        }
    
        pub fn alloc(&mut self, s: &str) -> &str {
            self.storage.push(s.to_string());
            self.storage.last().unwrap()
        }
    
        pub fn len(&self) -> usize {
            self.storage.len()
        }
    
        pub fn is_empty(&self) -> bool {
            self.storage.is_empty()
        }
    }
    
    impl Default for StringArena {
        fn default() -> Self {
            Self::new()
        }
    }
    
    /// Typed arena.
    pub struct Arena<T> {
        items: Vec<T>,
    }
    
    impl<T> Arena<T> {
        pub fn new() -> Self {
            Arena { items: Vec::new() }
        }
    
        pub fn alloc(&mut self, item: T) -> &T {
            self.items.push(item);
            self.items.last().unwrap()
        }
    }
    
    impl<T> Default for Arena<T> {
        fn default() -> Self {
            Self::new()
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_string_arena() {
            let mut arena = StringArena::new();
            arena.alloc("hello");
            arena.alloc("world");
            assert_eq!(arena.len(), 2);
            // Note: can't keep refs across alloc calls with &mut self
        }
    
        #[test]
        fn test_typed_arena() {
            let mut arena: Arena<i32> = Arena::new();
            arena.alloc(1);
            arena.alloc(2);
            // Refs would require interior mutability for real arena
        }
    }

    Key Differences

  • Lifetime enforcement: Rust arena references are tied to the arena's lifetime at compile time — use-after-free is impossible; OCaml relies on the GC to prevent this at runtime.
  • Batch free: Rust arenas (like bumpalo) support O(1) batch deallocation by dropping the arena; OCaml's GC defers collection until pressure warrants it.
  • Cache performance: Bumpalo's bump allocation keeps objects contiguous in memory; OCaml's copying GC (in its minor heap) achieves similar cache locality for young objects.
  • Vec reallocation hazard: The simple StringArena implementation can invalidate references if the Vec reallocates — production Rust arenas use slab allocation to prevent this.
  • OCaml Approach

    OCaml's GC naturally acts as an arena — objects allocated during processing are collected together when they become unreachable. For performance-sensitive arenas, OCaml uses Bigarray or manual memory management via Bigarray.Array1:

    (* OCaml GC is effectively an automatic arena *)
    let parse_and_process input =
      let ast = parse input in  (* AST nodes live until parse_and_process returns *)
      analyze ast
      (* ast and all its nodes are GC-collected after this *)
    

    Explicit arenas in OCaml typically use Buffer.t for strings or custom allocators for performance.

    Full Source

    #![allow(clippy::all)]
    //! Arena Allocation Pattern
    //!
    //! All allocations tied to arena lifetime.
    
    /// Simple arena for strings.
    pub struct StringArena {
        storage: Vec<String>,
    }
    
    impl StringArena {
        pub fn new() -> Self {
            StringArena {
                storage: Vec::new(),
            }
        }
    
        pub fn alloc(&mut self, s: &str) -> &str {
            self.storage.push(s.to_string());
            self.storage.last().unwrap()
        }
    
        pub fn len(&self) -> usize {
            self.storage.len()
        }
    
        pub fn is_empty(&self) -> bool {
            self.storage.is_empty()
        }
    }
    
    impl Default for StringArena {
        fn default() -> Self {
            Self::new()
        }
    }
    
    /// Typed arena.
    pub struct Arena<T> {
        items: Vec<T>,
    }
    
    impl<T> Arena<T> {
        pub fn new() -> Self {
            Arena { items: Vec::new() }
        }
    
        pub fn alloc(&mut self, item: T) -> &T {
            self.items.push(item);
            self.items.last().unwrap()
        }
    }
    
    impl<T> Default for Arena<T> {
        fn default() -> Self {
            Self::new()
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_string_arena() {
            let mut arena = StringArena::new();
            arena.alloc("hello");
            arena.alloc("world");
            assert_eq!(arena.len(), 2);
            // Note: can't keep refs across alloc calls with &mut self
        }
    
        #[test]
        fn test_typed_arena() {
            let mut arena: Arena<i32> = Arena::new();
            arena.alloc(1);
            arena.alloc(2);
            // Refs would require interior mutability for real arena
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_string_arena() {
            let mut arena = StringArena::new();
            arena.alloc("hello");
            arena.alloc("world");
            assert_eq!(arena.len(), 2);
            // Note: can't keep refs across alloc calls with &mut self
        }
    
        #[test]
        fn test_typed_arena() {
            let mut arena: Arena<i32> = Arena::new();
            arena.alloc(1);
            arena.alloc(2);
            // Refs would require interior mutability for real arena
        }
    }

    Deep Comparison

    OCaml vs Rust: lifetime arena pattern

    See example.rs and example.ml for implementations.

    Key Differences

  • OCaml uses garbage collection
  • Rust uses ownership and borrowing
  • Both support the core concept
  • Exercises

  • Fixed-capacity arena: Implement an arena that pre-allocates a Vec<String> with a fixed capacity and returns Err when full, ensuring no reallocation can invalidate references.
  • AST arena: Build a simple arena for AST nodes using Vec<Box<AstNode>> where each node is a heap allocation whose reference is valid for the arena's lifetime.
  • Benchmark comparison: Compare allocating 10,000 strings with (a) individual String::from, (b) a simple arena, and (c) direct Vec<String> — measure and explain the performance differences.
  • Open Source Repos