ExamplesBy LevelBy TopicLearning Paths
363 Advanced

363: Arena Allocation

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "363: Arena Allocation" functional Rust example. Difficulty level: Advanced. Key concepts covered: Functional Programming. General-purpose allocators (malloc/jemalloc) have overhead: each allocation needs bookkeeping metadata, thread-local allocation caches, and potentially lock contention. Key difference from OCaml: | Aspect | Rust arena | OCaml GC / manual arena |

Tutorial

The Problem

General-purpose allocators (malloc/jemalloc) have overhead: each allocation needs bookkeeping metadata, thread-local allocation caches, and potentially lock contention. For workloads that allocate thousands of small objects and free them all at once — AST nodes during parsing, temporary nodes in a graph algorithm, frame data in a game loop — arena allocation (bump allocation) is dramatically faster. The arena pre-allocates one large block and serves allocations by simply advancing a pointer. "Freeing" individual objects is a no-op; the entire arena resets in O(1). This pattern powers programming language parsers, game engines, and database query planners.

🎯 Learning Outcomes

  • • Implement a bump allocator arena using a pre-allocated Vec<u8> and Cell<usize> offset
  • • Handle alignment by rounding the offset up to the required alignment boundary
  • • Use Cell<usize> for interior mutability so &self methods can mutate the offset
  • • Count allocations with a second Cell<usize> for diagnostics
  • • Reset the arena in O(1) by setting offset back to zero
  • • Understand that arena allocation trades allocation/free speed for batch-free semantics
  • Code Example

    struct Arena {
        data: Vec<u8>,
        offset: Cell<usize>,
        allocations: Cell<usize>,
    }
    
    impl Arena {
        fn new(capacity: usize) -> Self {
            Self {
                data: vec![0u8; capacity],
                offset: Cell::new(0),
                allocations: Cell::new(0),
            }
        }
    }

    Key Differences

    AspectRust arenaOCaml GC / manual arena
    Allocation costO(1) pointer bumpO(1) GC minor heap (usually)
    DeallocationO(1) arena reset (all at once)GC-managed individually
    Safetyunsafe needed for typed accessSafe (GC handles lifetime)
    AlignmentManual calculation requiredGC handles alignment
    Use caseParsers, compilers, gamesRarely needed; GC does it

    OCaml Approach

    OCaml's GC serves as a kind of arena — you can allocate objects freely and the GC handles collection. For explicit arena semantics, Bigarray or Bytes with a ref-based offset:

    type arena = {
      data: bytes;
      mutable offset: int;
    }
    
    let create capacity = { data = Bytes.create capacity; offset = 0 }
    
    let alloc a size align =
      let aligned = (a.offset + align - 1) land (lnot (align - 1)) in
      if aligned + size > Bytes.length a.data then None
      else begin
        a.offset <- aligned + size;
        Some aligned
      end
    
    let reset a = a.offset <- 0
    

    In practice, OCaml's generational GC already provides fast allocation for short-lived objects (minor heap bump allocation). Explicit arenas are less common in OCaml than in Rust, where every allocation is explicit and arena-vs-per-object is a meaningful choice.

    Full Source

    #![allow(clippy::all)]
    //! Arena / Bump Allocation Pattern
    //!
    //! Allocate many objects into a single memory region and free them all at once.
    
    use std::cell::Cell;
    
    // === Approach 1: Simple Bump Allocator ===
    
    /// A simple bump allocator arena
    pub struct Arena {
        data: Vec<u8>,
        offset: Cell<usize>,
        allocations: Cell<usize>,
    }
    
    impl Arena {
        /// Create a new arena with the given capacity in bytes
        pub fn new(capacity: usize) -> Self {
            Self {
                data: vec![0u8; capacity],
                offset: Cell::new(0),
                allocations: Cell::new(0),
            }
        }
    
        /// Allocate space for bytes with given size and alignment
        /// Returns the offset into the arena, or None if out of space
        pub fn alloc_bytes(&self, size: usize, align: usize) -> Option<usize> {
            let offset = self.offset.get();
            let aligned = (offset + align - 1) & !(align - 1);
            let new_offset = aligned + size;
            if new_offset > self.data.len() {
                return None;
            }
            self.offset.set(new_offset);
            self.allocations.set(self.allocations.get() + 1);
            Some(aligned)
        }
    
        /// Get the number of bytes currently allocated
        pub fn allocated(&self) -> usize {
            self.offset.get()
        }
    
        /// Get the number of allocations made
        pub fn allocation_count(&self) -> usize {
            self.allocations.get()
        }
    
        /// Reset the arena, freeing all allocations at once
        pub fn reset(&self) {
            self.offset.set(0);
            self.allocations.set(0);
        }
    
        /// Get the total capacity of the arena
        pub fn capacity(&self) -> usize {
            self.data.len()
        }
    
        /// Get the remaining space in the arena
        pub fn remaining(&self) -> usize {
            self.capacity() - self.allocated()
        }
    
        /// Get the utilization ratio (0.0 to 1.0)
        pub fn utilization(&self) -> f64 {
            self.allocated() as f64 / self.capacity() as f64
        }
    }
    
    // === Approach 2: Typed Arena ===
    
    /// A typed arena that stores values of a single type
    pub struct TypedArena<T> {
        items: Vec<Box<T>>,
    }
    
    impl<T> TypedArena<T> {
        /// Create a new empty typed arena
        pub fn new() -> Self {
            Self { items: Vec::new() }
        }
    
        /// Create a typed arena with pre-allocated capacity
        pub fn with_capacity(cap: usize) -> Self {
            Self {
                items: Vec::with_capacity(cap),
            }
        }
    
        /// Allocate a value in the arena, returning a reference
        pub fn alloc(&mut self, val: T) -> &T {
            self.items.push(Box::new(val));
            self.items.last().unwrap()
        }
    
        /// Get the count of allocated items
        pub fn count(&self) -> usize {
            self.items.len()
        }
    
        /// Check if the arena is empty
        pub fn is_empty(&self) -> bool {
            self.items.is_empty()
        }
    
        /// Clear all allocations
        pub fn clear(&mut self) {
            self.items.clear();
        }
    }
    
    impl<T> Default for TypedArena<T> {
        fn default() -> Self {
            Self::new()
        }
    }
    
    // === Approach 3: Scoped Arena Pattern ===
    
    /// Execute a function with a fresh arena, automatically freed afterwards
    pub fn with_arena<T, F>(capacity: usize, f: F) -> T
    where
        F: FnOnce(&Arena) -> T,
    {
        let arena = Arena::new(capacity);
        let result = f(&arena);
        // arena is automatically dropped here, freeing all memory
        result
    }
    
    /// Execute a function with a typed arena
    pub fn with_typed_arena<T, R, F>(f: F) -> R
    where
        F: FnOnce(&mut TypedArena<T>) -> R,
    {
        let mut arena = TypedArena::new();
        let result = f(&mut arena);
        // arena is automatically dropped here
        result
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_bump_alloc_basic() {
            let arena = Arena::new(256);
            let o1 = arena.alloc_bytes(8, 8).unwrap();
            let o2 = arena.alloc_bytes(8, 8).unwrap();
            assert_eq!(o1, 0);
            assert_eq!(o2, 8);
            assert_eq!(arena.allocation_count(), 2);
        }
    
        #[test]
        fn test_alignment() {
            let arena = Arena::new(256);
            arena.alloc_bytes(1, 1).unwrap(); // offset now 1
            let aligned = arena.alloc_bytes(8, 8).unwrap();
            assert_eq!(aligned, 8); // aligned to 8-byte boundary
        }
    
        #[test]
        fn test_reset_clears() {
            let arena = Arena::new(64);
            arena.alloc_bytes(32, 1).unwrap();
            assert_eq!(arena.allocated(), 32);
            arena.reset();
            assert_eq!(arena.allocated(), 0);
            assert_eq!(arena.allocation_count(), 0);
        }
    
        #[test]
        fn test_out_of_space() {
            let arena = Arena::new(16);
            assert!(arena.alloc_bytes(8, 1).is_some());
            assert!(arena.alloc_bytes(8, 1).is_some());
            assert!(arena.alloc_bytes(1, 1).is_none()); // out of space
        }
    
        #[test]
        fn test_utilization() {
            let arena = Arena::new(100);
            arena.alloc_bytes(50, 1).unwrap();
            assert!((arena.utilization() - 0.5).abs() < 0.001);
        }
    
        #[test]
        fn test_typed_arena_basic() {
            let mut arena: TypedArena<i32> = TypedArena::new();
            let v = arena.alloc(42);
            assert_eq!(*v, 42);
            assert_eq!(arena.count(), 1);
        }
    
        #[test]
        fn test_typed_arena_strings() {
            let mut arena: TypedArena<String> = TypedArena::new();
            let s1 = arena.alloc("hello".to_string());
            assert_eq!(s1, "hello");
            let s2 = arena.alloc("world".to_string());
            assert_eq!(s2, "world");
            assert_eq!(arena.count(), 2);
        }
    
        #[test]
        fn test_typed_arena_clear() {
            let mut arena: TypedArena<i32> = TypedArena::new();
            arena.alloc(1);
            arena.alloc(2);
            arena.alloc(3);
            assert_eq!(arena.count(), 3);
            arena.clear();
            assert!(arena.is_empty());
        }
    
        #[test]
        fn test_with_arena_scoped() {
            let result = with_arena(1024, |arena| {
                let _ = arena.alloc_bytes(100, 1);
                let _ = arena.alloc_bytes(200, 1);
                arena.allocation_count()
            });
            assert_eq!(result, 2);
            // arena is freed after this point
        }
    
        #[test]
        fn test_remaining_space() {
            let arena = Arena::new(100);
            assert_eq!(arena.remaining(), 100);
            arena.alloc_bytes(40, 1).unwrap();
            assert_eq!(arena.remaining(), 60);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_bump_alloc_basic() {
            let arena = Arena::new(256);
            let o1 = arena.alloc_bytes(8, 8).unwrap();
            let o2 = arena.alloc_bytes(8, 8).unwrap();
            assert_eq!(o1, 0);
            assert_eq!(o2, 8);
            assert_eq!(arena.allocation_count(), 2);
        }
    
        #[test]
        fn test_alignment() {
            let arena = Arena::new(256);
            arena.alloc_bytes(1, 1).unwrap(); // offset now 1
            let aligned = arena.alloc_bytes(8, 8).unwrap();
            assert_eq!(aligned, 8); // aligned to 8-byte boundary
        }
    
        #[test]
        fn test_reset_clears() {
            let arena = Arena::new(64);
            arena.alloc_bytes(32, 1).unwrap();
            assert_eq!(arena.allocated(), 32);
            arena.reset();
            assert_eq!(arena.allocated(), 0);
            assert_eq!(arena.allocation_count(), 0);
        }
    
        #[test]
        fn test_out_of_space() {
            let arena = Arena::new(16);
            assert!(arena.alloc_bytes(8, 1).is_some());
            assert!(arena.alloc_bytes(8, 1).is_some());
            assert!(arena.alloc_bytes(1, 1).is_none()); // out of space
        }
    
        #[test]
        fn test_utilization() {
            let arena = Arena::new(100);
            arena.alloc_bytes(50, 1).unwrap();
            assert!((arena.utilization() - 0.5).abs() < 0.001);
        }
    
        #[test]
        fn test_typed_arena_basic() {
            let mut arena: TypedArena<i32> = TypedArena::new();
            let v = arena.alloc(42);
            assert_eq!(*v, 42);
            assert_eq!(arena.count(), 1);
        }
    
        #[test]
        fn test_typed_arena_strings() {
            let mut arena: TypedArena<String> = TypedArena::new();
            let s1 = arena.alloc("hello".to_string());
            assert_eq!(s1, "hello");
            let s2 = arena.alloc("world".to_string());
            assert_eq!(s2, "world");
            assert_eq!(arena.count(), 2);
        }
    
        #[test]
        fn test_typed_arena_clear() {
            let mut arena: TypedArena<i32> = TypedArena::new();
            arena.alloc(1);
            arena.alloc(2);
            arena.alloc(3);
            assert_eq!(arena.count(), 3);
            arena.clear();
            assert!(arena.is_empty());
        }
    
        #[test]
        fn test_with_arena_scoped() {
            let result = with_arena(1024, |arena| {
                let _ = arena.alloc_bytes(100, 1);
                let _ = arena.alloc_bytes(200, 1);
                arena.allocation_count()
            });
            assert_eq!(result, 2);
            // arena is freed after this point
        }
    
        #[test]
        fn test_remaining_space() {
            let arena = Arena::new(100);
            assert_eq!(arena.remaining(), 100);
            arena.alloc_bytes(40, 1).unwrap();
            assert_eq!(arena.remaining(), 60);
        }
    }

    Deep Comparison

    OCaml vs Rust: Arena Allocation

    Side-by-Side Comparison

    Arena Type Definition

    OCaml:

    type 'a arena = {
      mutable items: 'a list;
      mutable count: int;
    }
    
    let make_arena () = { items=[]; count=0 }
    

    Rust:

    struct Arena {
        data: Vec<u8>,
        offset: Cell<usize>,
        allocations: Cell<usize>,
    }
    
    impl Arena {
        fn new(capacity: usize) -> Self {
            Self {
                data: vec![0u8; capacity],
                offset: Cell::new(0),
                allocations: Cell::new(0),
            }
        }
    }
    

    Allocation

    OCaml:

    let arena_alloc a v =
      a.items <- v :: a.items;
      a.count <- a.count + 1;
      v
    

    Rust:

    fn alloc_bytes(&self, size: usize, align: usize) -> Option<usize> {
        let offset = self.offset.get();
        let aligned = (offset + align - 1) & !(align - 1);
        let new_offset = aligned + size;
        if new_offset > self.data.len() { return None; }
        self.offset.set(new_offset);
        self.allocations.set(self.allocations.get() + 1);
        Some(aligned)
    }
    

    Reset / Free All

    OCaml:

    let arena_reset a =
      Printf.printf "Freeing %d items\n" a.count;
      a.items <- [];
      a.count <- 0
    

    Rust:

    fn reset(&self) {
        self.offset.set(0);
        self.allocations.set(0);
    }
    

    Key Differences

    AspectOCamlRust
    Memory modelGC-managed listRaw byte buffer
    AllocationPrepend to listBump pointer
    Free allClear list, GC collectsReset offset (O(1))
    AlignmentN/A (GC handles)Manual alignment
    Typed safetyType parameter 'aSeparate TypedArena<T>
    Interior mutabilitymutable fieldsCell<usize>

    Memory Layout

    OCaml: Items stored as linked list. Each cons cell has GC overhead. The arena doesn't actually control memory layout - GC does.

    Rust: Contiguous byte buffer. Bump pointer advances linearly. True arena semantics with O(1) reset.

    Use Cases

    Use CaseOCaml ApproachRust Approach
    Compiler ASTMinor heap (GC optimizes)typed_arena::Arena
    Game frame dataN/Abumpalo::Bump
    Per-request allocationMinor heapArena with reset
    Persistent structuresNatural (immutable)Requires Rc/Arc

    Performance

  • OCaml: GC pause latency, but excellent for short-lived objects
  • Rust: Deterministic O(1) allocation and O(1) reset, no GC pauses
  • Exercises

  • Typed allocation: Using unsafe, implement alloc<T>(&self) -> Option<*mut T> that returns a properly aligned pointer into the arena's buffer for type T using std::mem::size_of::<T>() and std::mem::align_of::<T>().
  • AST arena: Build a simple expression parser where all AST nodes are allocated into a single arena; reset the arena after each parse to reuse memory for the next input.
  • Fragmentation analysis: Allocate a mix of 1-byte, 4-byte, and 8-byte values and measure internal fragmentation (wasted alignment padding) as a percentage of total allocated bytes.
  • Open Source Repos