Zero-Cost Abstractions
Tutorial
The Problem
High-level code often runs slower than hand-written low-level code in languages without zero-cost guarantees: intermediate allocations, dynamic dispatch, and interpreter overhead add up. Rust's zero-cost abstraction principle means that iterator chains, closures, and newtype wrappers compile down to the same machine code as their hand-written equivalents. This enables expressive, composable code without sacrificing performance, a core design goal of the language.
Code Example
pub fn sum_even_squares(n: i64) -> i64 {
(0..n).filter(|x| x % 2 == 0).map(|x| x * x).sum()
}
Key Differences
- Rust iterator chains are lazy and fuse into a single loop; OCaml list pipelines allocate intermediate lists, and lazy `Seq` iteration is opt-in.
- A returned `impl Fn` closure means zero heap allocation and an inlined call; OCaml closures are heap-allocated function records.
- Rust newtypes can be guaranteed layout-identical to the wrapped type (`#[repr(transparent)]`); OCaml has no equivalent: modules wrapping types may add indirection.
OCaml Approach
OCaml does not guarantee zero-cost abstractions by default. List.filter |> List.map allocates two intermediate lists. OCaml closures carry a heap-allocated environment record. The flambda optimizer can eliminate some overhead in batch compilation, and Seq provides lazy iteration, but these are opt-in rather than guaranteed by the language design.
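By contrast, a Rust closure is an anonymous struct holding exactly its captures, living on the stack: a non-capturing closure is zero-sized, and a capturing one is the size of what it captures. A minimal sketch (the exact sizes are a layout detail of current rustc, not a formal language guarantee):

```rust
use std::mem::{size_of, size_of_val};

fn main() {
    // Captures nothing: the closure type is zero-sized.
    let square = |x: i64| x * x;
    assert_eq!(size_of_val(&square), 0);

    // Captures one f64 by value: the closure is the size of that capture.
    let offset = 2.5_f64;
    let shift = move |x: f64| x + offset;
    assert_eq!(size_of_val(&shift), size_of::<f64>());

    println!("square(3) = {}, shift(1.0) = {}", square(3), shift(1.0));
}
```

No heap allocation happens in either case; a `Box<dyn Fn>` would only be needed to erase the concrete closure type.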
Full Source
#![allow(clippy::all)]
//! Example 119: Zero-Cost Abstractions
//!
//! Iterator chains, closures, and newtypes compile to the same machine code
//! as hand-written equivalents: abstraction with no runtime penalty.
// ── Approach 1: Iterator chains ──────────────────────────────────────────────
//
// `.filter().map().sum()` fuses into a single loop at compile time.
// No intermediate `Vec` is ever allocated.
/// Sum of squares of all even numbers in `0..n`.
pub fn sum_even_squares(n: i64) -> i64 {
(0..n).filter(|x| x % 2 == 0).map(|x| x * x).sum()
}
/// Same computation written as an explicit loop; it produces identical assembly.
pub fn sum_even_squares_manual(n: i64) -> i64 {
let mut acc = 0i64;
for x in 0..n {
if x % 2 == 0 {
acc += x * x;
}
}
acc
}
// ── Approach 2: Closures monomorphised at the call site ──────────────────────
//
// The returned closure is a plain struct holding its captures (here the
// `coeffs` Vec); calls through `impl Fn` are statically dispatched and
// inlined by LLVM. No `Box<dyn Fn>`, no indirect call.
/// Returns a closure that evaluates the polynomial `c₀ + c₁x + c₂x² + …`.
pub fn make_polynomial(coeffs: Vec<f64>) -> impl Fn(f64) -> f64 {
move |x| {
coeffs
.iter()
.enumerate()
.map(|(i, &c)| c * x.powi(i as i32))
.sum()
}
}
// ── Approach 3: Newtypes - compile-time safety, zero runtime cost ────────────
//
// `Meters` and `Seconds` are distinct types that the compiler enforces,
// yet at runtime each is just a bare `f64`.
/// Distance measured in metres.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Meters(pub f64);
/// Time measured in seconds.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Seconds(pub f64);
/// Speed measured in metres per second (derived from newtypes).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MetersPerSecond(pub f64);
impl Meters {
pub fn value(self) -> f64 {
self.0
}
}
impl Seconds {
pub fn value(self) -> f64 {
self.0
}
}
/// Compute speed; the type system prevents accidentally passing `Seconds`
/// where `Meters` is expected, at zero runtime cost.
pub fn speed(distance: Meters, time: Seconds) -> MetersPerSecond {
MetersPerSecond(distance.0 / time.0)
}
// ── Approach 4: Higher-order pipeline - idiomatic functional style ───────────
/// Apply a pipeline of transformations to a slice, returning a collected `Vec`.
/// The closure type `F` is monomorphised per call site; the chain is inlined by the optimiser.
pub fn pipeline<T, U, F>(data: &[T], transform: F) -> Vec<U>
where
T: Copy,
F: Fn(T) -> U,
{
data.iter().copied().map(transform).collect()
}
// ─────────────────────────────────────────────────────────────────────────────
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_iterator_chain_equals_manual_loop() {
// The two implementations must produce identical results for all n.
for n in [0, 1, 10, 100, 1000] {
assert_eq!(
sum_even_squares(n),
sum_even_squares_manual(n),
"mismatch at n={n}"
);
}
}
#[test]
fn test_sum_even_squares_known_value() {
        // 0..10 even numbers: 0,2,4,6,8 → squares: 0,4,16,36,64 → sum = 120
assert_eq!(sum_even_squares(10), 120);
}
#[test]
fn test_polynomial_closure() {
// p(x) = 1 + 2x + 3x²
// p(0) = 1, p(1) = 6, p(2) = 17
let poly = make_polynomial(vec![1.0, 2.0, 3.0]);
assert!((poly(0.0) - 1.0).abs() < 1e-10);
assert!((poly(1.0) - 6.0).abs() < 1e-10);
assert!((poly(2.0) - 17.0).abs() < 1e-10);
}
#[test]
fn test_polynomial_constant() {
// p(x) = 5 (single coefficient)
let poly = make_polynomial(vec![5.0]);
assert!((poly(0.0) - 5.0).abs() < 1e-10);
assert!((poly(99.0) - 5.0).abs() < 1e-10);
}
#[test]
fn test_newtype_speed() {
let d = Meters(100.0);
let t = Seconds(9.58); // Usain Bolt world record
let s = speed(d, t);
assert!((s.0 - 100.0 / 9.58).abs() < 1e-10);
}
#[test]
fn test_newtype_zero_cost_value_access() {
        // Newtype wrapper adds no overhead; .value() is an identity function.
let m = Meters(42.0);
let s = Seconds(7.0);
assert_eq!(m.value(), 42.0);
assert_eq!(s.value(), 7.0);
}
#[test]
fn test_pipeline_transform() {
let data = [1i32, 2, 3, 4, 5];
let doubled = pipeline(&data, |x| x * 2);
assert_eq!(doubled, vec![2, 4, 6, 8, 10]);
}
#[test]
fn test_pipeline_empty() {
let data: [i32; 0] = [];
let result = pipeline(&data, |x| x + 1);
assert_eq!(result, Vec::<i32>::new());
}
}
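To make the zero-cost claims above measurable, here is a small standalone check. It restates `make_polynomial` and `Meters` from the listing; the size assertions reflect current rustc layout rather than a formal guarantee:

```rust
use std::mem::{size_of, size_of_val};

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Meters(pub f64);

pub fn make_polynomial(coeffs: Vec<f64>) -> impl Fn(f64) -> f64 {
    move |x: f64| {
        coeffs
            .iter()
            .enumerate()
            .map(|(i, &c)| c * x.powi(i as i32))
            .sum()
    }
}

fn main() {
    // The opaque `impl Fn` value is just its captured state (one Vec<f64>),
    // not a heap-allocated `Box<dyn Fn>`.
    let p = make_polynomial(vec![1.0, 2.0, 3.0]);
    assert_eq!(size_of_val(&p), size_of::<Vec<f64>>());
    assert!((p(2.0) - 17.0).abs() < 1e-10); // 1 + 2*2 + 3*4 = 17

    // The newtype wrapper occupies exactly the space of a bare f64.
    assert_eq!(size_of::<Meters>(), size_of::<f64>());
}
```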
Deep Comparison
OCaml vs Rust: Zero-Cost Abstractions
Side-by-Side Code
OCaml
(* OCaml: filter + map + fold - creates intermediate lists *)
let sum_even_squares n =
List.init n Fun.id
|> List.filter (fun x -> x mod 2 = 0)
|> List.map (fun x -> x * x)
|> List.fold_left ( + ) 0
(* Closure-returning function - may allocate a closure record on the heap *)
let make_polynomial coeffs =
fun x ->
List.mapi (fun i c -> c *. (x ** float_of_int i)) coeffs
|> List.fold_left ( +. ) 0.0
Rust (idiomatic: iterator chain)
pub fn sum_even_squares(n: i64) -> i64 {
(0..n).filter(|x| x % 2 == 0).map(|x| x * x).sum()
}
Rust (functional/recursive: explicit fold)
// Note: Rust does not guarantee tail-call elimination, so deep recursion here
// can overflow the stack; the iterator and loop forms are preferred.
pub fn sum_even_squares_recursive(n: i64) -> i64 {
fn go(x: i64, limit: i64, acc: i64) -> i64 {
if x >= limit { return acc; }
let next = if x % 2 == 0 { acc + x * x } else { acc };
go(x + 1, limit, next)
}
go(0, n, 0)
}
Rust (newtype: zero-cost wrapper)
pub struct Meters(pub f64);
pub struct Seconds(pub f64);
pub fn speed(d: Meters, t: Seconds) -> f64 {
d.0 / t.0
}
Type Signatures
| Concept | OCaml | Rust |
|---|---|---|
| Filter+map+sum | 'a list -> int (intermediate lists) | impl Iterator (lazy, fused) |
| Closure-returning fn | float list -> float -> float (may box closure) | fn(...) -> impl Fn(f64) -> f64 (monomorphised) |
| Newtype | type meters = Meters of float (boxed unless [@@unboxed]) | struct Meters(f64) (transparent layout) |
| Polymorphic pipeline | ('a -> 'b) -> 'a list -> 'b list | fn<T,U,F: Fn(T)->U>(&[T], F) -> Vec<U> |
Key Insights
- OCaml's List.filter and List.map each traverse the list and allocate a new one. Rust's Iterator adapters are lazy; the compiler fuses the entire chain into a single loop with no heap traffic.
- A single-field newtype occupies exactly the space of its inner type, and #[repr(transparent)] (opt-in, not the default) additionally guarantees an identical ABI; the type distinction disappears entirely after type-checking.
- The Iterator protocol, impl Trait return types, and newtype wrappers give the optimiser the information it needs to produce the same (or better) assembly as a hand-written C loop, without asking the programmer to give up expressiveness.
- Inspect the output with cargo asm or Godbolt: the iterator and manual versions of sum_even_squares produce identical machine code.
When to Use Each Style
Use the iterator chain when: you want readable, composable data transformation (filter, map, flat_map, take_while, sum, fold). The compiler handles fusion; you get clarity for free.
Use explicit loops when: the transformation requires mutable state that doesn't fit cleanly into a fold (e.g., sliding-window state machines), or when readability genuinely benefits from imperative style. The performance will be equivalent.
Use newtypes when: you want the compiler to enforce unit correctness (metres vs seconds, user IDs vs product IDs) at zero runtime cost. Prefer them over type aliases (type Meters = f64) which are transparent to the type checker.
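As a sketch of the explicit-loop case above, here is mutable sliding-run state that doesn't collapse naturally into a single adapter chain (the function and data are illustrative, not from the original):

```rust
/// Length of the longest strictly increasing run in `data`.
fn longest_increasing_run(data: &[i32]) -> usize {
    let mut best = 0usize;
    let mut cur = 0usize;
    let mut prev: Option<i32> = None;
    for &x in data {
        // Extend the current run, or start a new one of length 1.
        cur = match prev {
            Some(p) if x > p => cur + 1,
            _ => 1,
        };
        best = best.max(cur);
        prev = Some(x);
    }
    best
}

fn main() {
    assert_eq!(longest_increasing_run(&[1, 2, 2, 3, 4, 5, 1]), 4); // run 2,3,4,5
    assert_eq!(longest_increasing_run(&[]), 0);
}
```

The same logic can be forced into a `fold` over a tuple of state, but the imperative version states the invariant more plainly, and the compiled code is equivalent.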
Exercises
1. Compare the assembly (via cargo asm or Godbolt) of sum_even_squares and sum_even_squares_manual; verify they produce identical loops.
2. Add a Celsius newtype and a to_fahrenheit conversion; confirm the newtype has size_of::<Celsius>() == size_of::<f64>().