370: SmallVec Pattern
Tutorial Video
Text description (accessibility)
This video demonstrates the "370: SmallVec Pattern" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Most collections in practice are small â a function rarely returns more than a few results, most AST nodes have 2-3 children, most tokenizers produce short token sequences. Key difference from OCaml: | Aspect | Rust `SmallVec<T, N>` | OCaml variant |
Tutorial
The Problem
Most collections in practice are small â a function rarely returns more than a few results, most AST nodes have 2-3 children, most tokenizers produce short token sequences. Standard Vec<T> always heap-allocates, even for zero or one element. The SmallVec optimization stores up to N items inline (on the stack or in the struct), spilling to heap only when needed. The smallvec crate implements this; this example shows the pattern from scratch using a const-generic N. SmallVec is used in LLVM (for instruction operands), Rust's own compiler, browser engines, and game ECS implementations to eliminate heap pressure for small collections.
🎯 Learning Outcomes
SmallVec<T, const N: usize> using an Inline/Heap enum[Option<T>; N] inline to avoid heap allocation for small countsVec<T> when the inline capacity is exceededpush, get, len, and iter abstracting over both variantsconst N: usize) for compile-time array sizesVec (by the inline array size)Code Example
enum SmallVec<T, const N: usize> {
Inline { data: [Option<T>; N], len: usize },
Heap(Vec<T>),
}
impl<T, const N: usize> SmallVec<T, N> {
fn push(&mut self, val: T) {
match self {
Inline { data, len } if *len < N => {
data[*len] = Some(val);
*len += 1;
}
Inline { data, len } => {
// Spill to heap
let mut v = /* collect inline items */;
v.push(val);
*self = Heap(v);
}
Heap(v) => v.push(val),
}
}
}Key Differences
| Aspect | Rust SmallVec<T, N> | OCaml variant |
|---|---|---|
| Inline storage | [Option<T>; N] on stack/struct | 'a array in variant |
| Heap spill | One-time move to Vec<T> | Switch to list |
| Const generics | const N: usize â compile-time | Runtime constant |
| Production crate | smallvec (extensively used in Rust ecosystem) | No standard equivalent |
| Size tradeoff | sizeof(SmallVec) = max(sizeof(Inline), sizeof(Heap)) | Similar |
OCaml Approach
OCaml's minor heap already provides fast allocation for small values, making SmallVec less critical. A functional approximation uses a variant type:
type 'a small_vec =
| Small of 'a array * int (* inline data + length *)
| Large of 'a list (* spilled to list *)
let max_inline = 4
let push sv x = match sv with
| Small (a, n) when n < max_inline ->
a.(n) <- x; Small (a, n + 1)
| Small (a, n) ->
let lst = Array.to_list a |> List.filteri (fun i _ -> i < n) in
Large (x :: lst)
| Large lst -> Large (x :: lst)
In practice, OCaml's allocator is fast enough that this optimization is rarely needed â the GC minor heap bump-allocates small objects at CPU-cache speed.
Full Source
#![allow(clippy::all)]
//! SmallVec Pattern
//!
//! Store small collections inline, spill to heap when needed.
/// SmallVec: inline for âĪN items, heap for more
#[derive(Debug, Clone)]
pub enum SmallVec<T, const N: usize> {
Inline { data: [Option<T>; N], len: usize },
Heap(Vec<T>),
}
impl<T: Clone + Default, const N: usize> SmallVec<T, N> {
/// Create a new empty SmallVec
pub fn new() -> Self {
Self::Inline {
data: std::array::from_fn(|_| None),
len: 0,
}
}
/// Push an element
pub fn push(&mut self, val: T) {
match self {
Self::Inline { data, len } if *len < N => {
data[*len] = Some(val);
*len += 1;
}
Self::Inline { data, len } => {
// Overflow: move to heap
let mut v: Vec<T> = data[..*len].iter_mut().filter_map(|x| x.take()).collect();
v.push(val);
*self = Self::Heap(v);
}
Self::Heap(v) => v.push(val),
}
}
/// Get element by index
pub fn get(&self, i: usize) -> Option<&T> {
match self {
Self::Inline { data, len } => {
if i < *len {
data[i].as_ref()
} else {
None
}
}
Self::Heap(v) => v.get(i),
}
}
/// Get mutable element by index
pub fn get_mut(&mut self, i: usize) -> Option<&mut T> {
match self {
Self::Inline { data, len } => {
if i < *len {
data[i].as_mut()
} else {
None
}
}
Self::Heap(v) => v.get_mut(i),
}
}
/// Get length
pub fn len(&self) -> usize {
match self {
Self::Inline { len, .. } => *len,
Self::Heap(v) => v.len(),
}
}
/// Check if empty
pub fn is_empty(&self) -> bool {
self.len() == 0
}
/// Check if stored inline (no heap allocation)
pub fn is_inline(&self) -> bool {
matches!(self, Self::Inline { .. })
}
/// Pop last element
pub fn pop(&mut self) -> Option<T> {
match self {
Self::Inline { data, len } if *len > 0 => {
*len -= 1;
data[*len].take()
}
Self::Inline { .. } => None,
Self::Heap(v) => v.pop(),
}
}
/// Clear all elements
pub fn clear(&mut self) {
match self {
Self::Inline { data, len } => {
for item in data.iter_mut().take(*len) {
*item = None;
}
*len = 0;
}
Self::Heap(v) => v.clear(),
}
}
/// Convert to Vec
pub fn to_vec(&self) -> Vec<T> {
match self {
Self::Inline { data, len } => data[..*len].iter().filter_map(|x| x.clone()).collect(),
Self::Heap(v) => v.clone(),
}
}
/// Iterate over elements
pub fn iter(&self) -> impl Iterator<Item = &T> {
match self {
Self::Inline { data, len } => data[..*len]
.iter()
.filter_map(|x| x.as_ref())
.collect::<Vec<_>>()
.into_iter(),
Self::Heap(v) => v.iter().collect::<Vec<_>>().into_iter(),
}
}
}
impl<T: Clone + Default, const N: usize> Default for SmallVec<T, N> {
fn default() -> Self {
Self::new()
}
}
impl<T: Clone + Default, const N: usize> FromIterator<T> for SmallVec<T, N> {
fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
let mut sv = Self::new();
for item in iter {
sv.push(item);
}
sv
}
}
/// Specialized SmallVec for common sizes
pub type SmallVec4<T> = SmallVec<T, 4>;
pub type SmallVec8<T> = SmallVec<T, 8>;
pub type SmallVec16<T> = SmallVec<T, 16>;
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_inline_small() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
for i in 0..4 {
sv.push(i);
}
assert!(sv.is_inline());
assert_eq!(sv.len(), 4);
}
#[test]
fn test_heap_on_overflow() {
let mut sv: SmallVec<i32, 2> = SmallVec::new();
sv.push(1);
sv.push(2);
assert!(sv.is_inline());
sv.push(3);
assert!(!sv.is_inline());
assert_eq!(sv.len(), 3);
}
#[test]
fn test_get_items() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
for i in 0..6 {
sv.push(i);
}
assert_eq!(sv.get(0), Some(&0));
assert_eq!(sv.get(5), Some(&5));
assert_eq!(sv.get(6), None);
}
#[test]
fn test_pop() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
sv.push(1);
sv.push(2);
assert_eq!(sv.pop(), Some(2));
assert_eq!(sv.pop(), Some(1));
assert_eq!(sv.pop(), None);
}
#[test]
fn test_clear() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
sv.push(1);
sv.push(2);
sv.clear();
assert!(sv.is_empty());
}
#[test]
fn test_to_vec() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
sv.push(1);
sv.push(2);
sv.push(3);
assert_eq!(sv.to_vec(), vec![1, 2, 3]);
}
#[test]
fn test_from_iterator() {
let sv: SmallVec<i32, 4> = vec![1, 2, 3].into_iter().collect();
assert!(sv.is_inline());
assert_eq!(sv.len(), 3);
let sv2: SmallVec<i32, 2> = vec![1, 2, 3, 4, 5].into_iter().collect();
assert!(!sv2.is_inline());
assert_eq!(sv2.len(), 5);
}
#[test]
fn test_get_mut() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
sv.push(1);
*sv.get_mut(0).unwrap() = 10;
assert_eq!(sv.get(0), Some(&10));
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_inline_small() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
for i in 0..4 {
sv.push(i);
}
assert!(sv.is_inline());
assert_eq!(sv.len(), 4);
}
#[test]
fn test_heap_on_overflow() {
let mut sv: SmallVec<i32, 2> = SmallVec::new();
sv.push(1);
sv.push(2);
assert!(sv.is_inline());
sv.push(3);
assert!(!sv.is_inline());
assert_eq!(sv.len(), 3);
}
#[test]
fn test_get_items() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
for i in 0..6 {
sv.push(i);
}
assert_eq!(sv.get(0), Some(&0));
assert_eq!(sv.get(5), Some(&5));
assert_eq!(sv.get(6), None);
}
#[test]
fn test_pop() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
sv.push(1);
sv.push(2);
assert_eq!(sv.pop(), Some(2));
assert_eq!(sv.pop(), Some(1));
assert_eq!(sv.pop(), None);
}
#[test]
fn test_clear() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
sv.push(1);
sv.push(2);
sv.clear();
assert!(sv.is_empty());
}
#[test]
fn test_to_vec() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
sv.push(1);
sv.push(2);
sv.push(3);
assert_eq!(sv.to_vec(), vec![1, 2, 3]);
}
#[test]
fn test_from_iterator() {
let sv: SmallVec<i32, 4> = vec![1, 2, 3].into_iter().collect();
assert!(sv.is_inline());
assert_eq!(sv.len(), 3);
let sv2: SmallVec<i32, 2> = vec![1, 2, 3, 4, 5].into_iter().collect();
assert!(!sv2.is_inline());
assert_eq!(sv2.len(), 5);
}
#[test]
fn test_get_mut() {
let mut sv: SmallVec<i32, 4> = SmallVec::new();
sv.push(1);
*sv.get_mut(0).unwrap() = 10;
assert_eq!(sv.get(0), Some(&10));
}
}
Deep Comparison
OCaml vs Rust: SmallVec Pattern
Side-by-Side Comparison
OCaml Approach
OCaml:
(* OCaml arrays of floats/ints are already unboxed *)
let small_array = [|1;2;3;4|] (* stack-like, unboxed *)
(* For dynamic small collections, use list or array *)
let push_small arr x = Array.append arr [|x|]
Rust Approach
Rust:
enum SmallVec<T, const N: usize> {
Inline { data: [Option<T>; N], len: usize },
Heap(Vec<T>),
}
impl<T, const N: usize> SmallVec<T, N> {
fn push(&mut self, val: T) {
match self {
Inline { data, len } if *len < N => {
data[*len] = Some(val);
*len += 1;
}
Inline { data, len } => {
// Spill to heap
let mut v = /* collect inline items */;
v.push(val);
*self = Heap(v);
}
Heap(v) => v.push(val),
}
}
}
Key Differences
| Aspect | OCaml | Rust |
|---|---|---|
| Small optimization | Lists are cons cells | Explicit SmallVec |
| Array allocation | Always heap | Inline for N items |
| Const generics | N/A | const N: usize |
| Memory layout | GC-managed | Explicit enum |
Memory Layout
OCaml: Arrays are heap-allocated but contain unboxed primitives. Lists are cons cells (always heap).
Rust SmallVec:
[Option<T>; N] on stack + lenPerformance Characteristics
| Operation | SmallVec (inline) | SmallVec (heap) | Vec |
|---|---|---|---|
| Push | O(1) | O(1) amortized | O(1) amortized |
| Access | O(1) | O(1) | O(1) |
| Memory | Stack | Heap | Heap |
| Allocation | None | Once on spill | On first push |
When to Use
Exercises
fn iter(&self) -> impl Iterator<Item = &T> that works for both Inline and Heap variants without allocating an intermediate Vec.SmallVec<i32, 4> and Vec<i32>, push 1, 2, 4, and 8 elements respectively, and measure heap allocations using a custom allocator or std::alloc::System with tracking.smallvec crate**: Replace the manual implementation with smallvec::SmallVec<[i32; 4]> from the smallvec crate; verify that slicing, sorting, and extend all work correctly.