Closures in Benchmarking
This video demonstrates the "Closures in Benchmarking" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Micro-benchmarking is notoriously tricky: compilers optimize aggressively, CPUs speculate and prefetch, and without careful technique, what you measure may not reflect what actually runs in production. Key difference from OCaml: dead code prevention. Rust has `std::hint::black_box` as a stable stdlib function; OCaml's counterpart is `Sys.opaque_identity`, whereas `ignore` alone may let the optimizer eliminate the computation.
Tutorial
The Problem
Micro-benchmarking is notoriously tricky: compilers optimize aggressively, CPUs speculate and prefetch, and without careful technique, what you measure may not reflect what actually runs in production. Closures are central to benchmarking APIs because they encapsulate the code under test while giving the benchmark harness control over setup, teardown, and iteration. std::hint::black_box keeps the optimizer from eliminating computations whose results are discarded (the standard library documents it as a best-effort hint). This example shows how to build closure-based benchmarking utilities similar in spirit to those in criterion.
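To make the role of black_box concrete, here is a minimal sketch. The names sum_to, time_once, and demo are hypothetical and exist only for illustration. It wraps both the input and the result of the code under test in black_box, so the optimizer can neither fold the computation into a constant nor drop it as dead code.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical workload for illustration: sum of 0..n.
pub fn sum_to(n: u64) -> u64 {
    (0..n).sum()
}

/// Time a single call of `f`, returning its result and the elapsed time.
pub fn time_once<T, F: FnOnce() -> T>(f: F) -> (T, Duration) {
    let start = Instant::now();
    // black_box on the result keeps the call from being discarded as dead code.
    let out = black_box(f());
    (out, start.elapsed())
}

/// Hypothetical demo: the closure carries the code under test into the harness.
pub fn demo() -> u64 {
    // black_box on the input hides the constant from the optimizer,
    // so the sum is less likely to be folded at compile time.
    let (total, elapsed) = time_once(|| sum_to(black_box(1_000)));
    println!("sum_to(1000) = {total} in {elapsed:?}");
    total
}
```

A single call is far too short to time reliably, which is why the runners in this example loop over many iterations.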
🎯 Learning Outcomes
FnMut() -> T encapsulates the code under test in a reusable way
std::hint::black_box is essential for preventing dead code elimination
bench_compare takes two FnMut closures and reports their relative performance
Code Example
use std::hint::black_box;
use std::time::Instant;
pub fn bench<T, F: FnMut() -> T>(name: &str, iters: usize, mut f: F) {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f()); // keep the result observable so the call is not optimized away
    }
    println!("{}: {:?}", name, start.elapsed());
}
Key Differences
1. Rust has std::hint::black_box as a stable stdlib function; OCaml's counterpart is Sys.opaque_identity (in the stdlib since OCaml 4.03), whereas ignore alone may let the optimizer eliminate the computation.
2. Rust's std::time::Instant is monotonic and reports durations with nanosecond granularity (the actual resolution depends on the platform clock); OCaml's Unix.gettimeofday is a wall-clock timer with microsecond resolution on most platforms.
3. Rust's FnMut() -> T allows the benchmark to observe the return value via black_box; an OCaml benchmark typed unit -> unit discards the result, potentially allowing the computation to be optimized away.
4. Rust has criterion (statistics-based) and divan as mature benchmark frameworks; OCaml has Core_bench from Jane Street and Bechamel.
OCaml Approach
OCaml benchmarking typically uses the Core_bench or Bechamel library, where benchmarks are registered as closures. A hand-rolled runner equivalent to the Rust one looks like this:
let bench name f =
  let t0 = Unix.gettimeofday () in
  for _ = 1 to 1000 do ignore (f ()) done;
  let t1 = Unix.gettimeofday () in
  (* mean time per iteration, in microseconds *)
  Printf.printf "%s: %.2fus\n" name ((t1 -. t0) /. 1000.0 *. 1e6)
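OCaml's closest stdlib equivalent to black_box is Sys.opaque_identity (available since OCaml 4.03). ignore only discards the value and does not stop the optimizer from removing the computation.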
Full Source
#![allow(clippy::all)]
//! Closures in Benchmarking
//!
//! Patterns for measuring performance with closures and black_box.
use std::hint::black_box;
use std::time::{Duration, Instant};
/// Simple benchmark runner: time a closure over N iterations.
pub fn bench<T, F: FnMut() -> T>(name: &str, iterations: usize, mut f: F) -> Duration {
    // Warmup: run 10% of the iterations untimed
    for _ in 0..iterations / 10 {
        black_box(f());
    }
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(f());
    }
    let elapsed = start.elapsed();
    // checked_div avoids a panic when `iterations` is 0
    let per_iter = elapsed.checked_div(iterations as u32).unwrap_or_default();
    println!(
        "{}: {:?} per iteration ({} iters)",
        name, per_iter, iterations
    );
    elapsed
}
/// Compare two implementations.
pub fn bench_compare<T, F1, F2>(name1: &str, f1: F1, name2: &str, f2: F2, iterations: usize)
where
    F1: FnMut() -> T,
    F2: FnMut() -> T,
{
    let t1 = bench(name1, iterations, f1);
    let t2 = bench(name2, iterations, f2);
    let ratio = t1.as_nanos() as f64 / t2.as_nanos() as f64;
    println!("Ratio {}/{}: {:.2}x", name1, name2, ratio);
}
/// Prevent value from being optimized away.
pub fn consume<T>(value: T) -> T {
    black_box(value)
}
/// Benchmark with setup and teardown closures.
pub fn bench_with_setup<S, T, Setup, Test, Teardown>(
    name: &str,
    iterations: usize,
    mut setup: Setup,
    mut test: Test,
    mut teardown: Teardown,
) -> Duration
where
    Setup: FnMut() -> S,
    Test: FnMut(S) -> T,
    Teardown: FnMut(T),
{
    // Warmup
    for _ in 0..iterations / 10 {
        let s = setup();
        let t = test(s);
        teardown(t);
    }
    let start = Instant::now();
    for _ in 0..iterations {
        let s = black_box(setup());
        let t = black_box(test(s));
        teardown(t);
    }
    let elapsed = start.elapsed();
    println!("{}: {:?} total ({} iters)", name, elapsed, iterations);
    elapsed
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn test_bench_basic() {
        let duration = bench("simple_add", 1000, || 1 + 1);
        assert!(duration.as_nanos() > 0);
    }
    #[test]
    fn test_consume() {
        let value = consume(42);
        assert_eq!(value, 42);
    }
    #[test]
    fn test_bench_with_closure_state() {
        let mut counter = 0;
        let _ = bench("counter", 100, || {
            counter += 1;
            counter
        });
        assert_eq!(counter, 110); // 10 warmup + 100 iterations
    }
    #[test]
    fn test_bench_with_setup() {
        let duration = bench_with_setup(
            "vec_sum",
            100,
            || vec![1, 2, 3, 4, 5],
            |v| v.iter().sum::<i32>(),
            |_| {},
        );
        assert!(duration.as_nanos() > 0);
    }
    #[test]
    fn test_black_box_prevents_optimization() {
        // Without black_box, compiler might optimize this away
        let result = black_box(vec![1, 2, 3].iter().sum::<i32>());
        assert_eq!(result, 6);
    }
}
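bench_with_setup in the source above starts the clock before the loop, so setup and teardown are included in the measured total. The following sketch is a hypothetical variant (bench_test_only is not part of the original source) that accumulates only the time spent in the test closure. It assumes the per-iteration timer overhead is small relative to the code under test.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical variant of `bench_with_setup`: only the `test` closure is timed.
pub fn bench_test_only<S, T, Setup, Test, Teardown>(
    iterations: usize,
    mut setup: Setup,
    mut test: Test,
    mut teardown: Teardown,
) -> Duration
where
    Setup: FnMut() -> S,
    Test: FnMut(S) -> T,
    Teardown: FnMut(T),
{
    let mut measured = Duration::ZERO;
    for _ in 0..iterations {
        let input = black_box(setup()); // not timed
        let start = Instant::now();
        let output = black_box(test(input));
        measured += start.elapsed(); // only the test phase is accumulated
        teardown(output); // not timed
    }
    measured
}
```

Reading the clock on every iteration adds its own overhead and noise, so this shape suits tests that run for at least a few microseconds. For very short tests, batching several inputs per timed section is a common alternative.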
Deep Comparison
OCaml vs Rust: Benchmark Closures
OCaml
let bench name iterations f =
  let start = Unix.gettimeofday () in
  for _ = 1 to iterations do
    ignore (f ())
  done;
  let elapsed = Unix.gettimeofday () -. start in
  Printf.printf "%s: %f sec\n" name elapsed
let () = bench "sum" 1000 (fun () -> List.fold_left (+) 0 [1;2;3])
Rust
use std::hint::black_box;
use std::time::Instant;
pub fn bench<T, F: FnMut() -> T>(name: &str, iters: usize, mut f: F) {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f()); // keep the result observable so the call is not optimized away
    }
    println!("{}: {:?}", name, start.elapsed());
}
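The FnMut bound matters because a benchmarked closure may mutate state it captures. The sketch below redefines bench so the block is self-contained, with one assumed change: it returns the elapsed Duration. The helper count_calls is hypothetical and only demonstrates that captured state is updated once per iteration.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Same shape as the runner above, but returning the elapsed time (an assumed variation).
pub fn bench<T, F: FnMut() -> T>(name: &str, iters: usize, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    let elapsed = start.elapsed();
    println!("{name}: {elapsed:?}");
    elapsed
}

/// Hypothetical demo: the closure mutates a captured counter on every call.
pub fn count_calls(iters: usize) -> usize {
    let mut calls = 0usize;
    let _elapsed = bench("count_calls", iters, || {
        calls += 1; // FnMut: captured state changes between calls
        calls
    });
    // The closure has been dropped, so the counter is readable again.
    calls
}
```

With an Fn bound instead, the closure passed in count_calls would not compile, because incrementing the counter requires mutable access to the captured variable.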
Key Differences
Rust: black_box prevents the compiler from optimizing away results
OCaml: ignore discards the result but doesn't prevent the optimizer from removing the computation
Exercises
1. Extend bench to return not just total duration but also mean, standard deviation, and min/max per iteration over multiple runs.
2. Write a macro bench_vs!(name1 => expr1, name2 => expr2, iterations: N) that benchmarks two expressions and asserts the ratio is within a given tolerance.
3. Use bench_with_setup to benchmark Vec::sort vs Vec::sort_unstable on random data. Ensure the setup closure generates a new random vector each iteration so the input is never already sorted.
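As a possible starting point for the first exercise, the sketch below collects one sample per run and summarizes the samples. Stats, summarize, and bench_stats are hypothetical names, and the choice of the population standard deviation (dividing by n) is an assumption; a sample standard deviation would divide by n - 1.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Summary statistics over per-run timings, in nanoseconds per iteration.
#[derive(Debug, Clone, PartialEq)]
pub struct Stats {
    pub mean: f64,
    pub std_dev: f64,
    pub min: f64,
    pub max: f64,
}

/// Mean, population standard deviation, min, and max of `samples`.
/// Returns None for an empty slice.
pub fn summarize(samples: &[f64]) -> Option<Stats> {
    if samples.is_empty() {
        return None;
    }
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    let min = samples.iter().copied().fold(f64::INFINITY, f64::min);
    let max = samples.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    Some(Stats { mean, std_dev: variance.sqrt(), min, max })
}

/// Run `f` for `runs` runs of `iters` iterations each; every sample is the
/// mean time per iteration of one run, in nanoseconds.
pub fn bench_stats<T, F: FnMut() -> T>(runs: usize, iters: usize, mut f: F) -> Option<Stats> {
    if iters == 0 {
        return None; // nothing to divide by
    }
    let mut samples = Vec::with_capacity(runs);
    for _ in 0..runs {
        let start = Instant::now();
        for _ in 0..iters {
            black_box(f());
        }
        samples.push(start.elapsed().as_nanos() as f64 / iters as f64);
    }
    summarize(&samples)
}
```

Separating summarize from the timing loop keeps the arithmetic testable with fixed inputs, independent of how noisy the machine is.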