ExamplesBy LevelBy TopicLearning Paths
530 Intermediate

Closures in Benchmarking

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "Closures in Benchmarking" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Micro-benchmarking is notoriously tricky: compilers optimize aggressively, CPUs speculate and prefetch, and without careful technique, what you measure may not reflect what actually runs in production. Key difference from OCaml: 1. **Dead code prevention**: Rust has `std::hint::black_box` as a stable stdlib function; OCaml uses `Sys.opaque_identity` (internal, not guaranteed stable) or `ignore` which the optimizer may eliminate.

Tutorial

The Problem

Micro-benchmarking is notoriously tricky: compilers optimize aggressively, CPUs speculate and prefetch, and without careful technique, what you measure may not reflect what actually runs in production. Closures are central to benchmarking APIs because they encapsulate the code under test while providing the benchmark harness control over setup, teardown, and iteration. std::hint::black_box prevents the optimizer from eliminating computations whose results are discarded. This example shows how to build closure-based benchmarking utilities similar to criterion.

🎯 Learning Outcomes

  • • How FnMut() -> T encapsulates the code under test in a reusable way
  • • Why std::hint::black_box is essential for preventing dead code elimination
  • • How warmup iterations reduce JIT and cache effects in micro-benchmarks
  • • How bench_compare takes two FnMut closures and reports their relative performance
  • • How setup and teardown closures enable fair measurement of just the code under test
  • Code Example

    use std::hint::black_box;
    
    pub fn bench<T, F: FnMut() -> T>(name: &str, iters: usize, mut f: F) {
        let start = Instant::now();
        for _ in 0..iters {
            black_box(f());  // prevent optimization
        }
        println!("{}: {:?}", name, start.elapsed());
    }

    Key Differences

  • Dead code prevention: Rust has std::hint::black_box as a stable stdlib function; OCaml uses Sys.opaque_identity (internal, not guaranteed stable) or ignore which the optimizer may eliminate.
  • Timing granularity: Rust's std::time::Instant has nanosecond resolution; OCaml's Unix.gettimeofday has microsecond resolution on most platforms.
  • Generic test body: Rust's FnMut() -> T allows the benchmark to observe the return value via black_box; OCaml's unit -> unit discards the result, potentially allowing optimization.
  • Library ecosystem: Rust has criterion (statistics-based) and divan as mature benchmark frameworks; OCaml has Core_bench from Jane Street and Bechamel.
  • OCaml Approach

    OCaml benchmarking uses the Core_bench or Bechamel library. Benchmarks are registered as closures:

    let bench name f =
      let t0 = Unix.gettimeofday () in
      for _ = 1 to 1000 do ignore (f ()) done;
      let t1 = Unix.gettimeofday () in
      Printf.printf "%s: %.2fus\n" name ((t1 -. t0) /. 1000.0 *. 1e6)
    

    OCaml lacks a black_box equivalent in stdlib — ignore or Sys.opaque_identity is used instead.

    Full Source

    #![allow(clippy::all)]
    //! Closures in Benchmarking
    //!
    //! Patterns for measuring performance with closures and black_box.
    
    use std::hint::black_box;
    use std::time::{Duration, Instant};
    
    /// Simple benchmark runner: time a closure over N iterations.
    pub fn bench<T, F: FnMut() -> T>(name: &str, iterations: usize, mut f: F) -> Duration {
        // Warmup
        for _ in 0..iterations / 10 {
            black_box(f());
        }
    
        let start = Instant::now();
        for _ in 0..iterations {
            black_box(f());
        }
        let elapsed = start.elapsed();
        let per_iter = elapsed / iterations as u32;
        println!(
            "{}: {:?} per iteration ({} iters)",
            name, per_iter, iterations
        );
        elapsed
    }
    
    /// Compare two implementations.
    pub fn bench_compare<T, F1, F2>(name1: &str, f1: F1, name2: &str, f2: F2, iterations: usize)
    where
        F1: FnMut() -> T,
        F2: FnMut() -> T,
    {
        let t1 = bench(name1, iterations, f1);
        let t2 = bench(name2, iterations, f2);
    
        let ratio = t1.as_nanos() as f64 / t2.as_nanos() as f64;
        println!("Ratio {}/{}: {:.2}x", name1, name2, ratio);
    }
    
    /// Prevent value from being optimized away.
    pub fn consume<T>(value: T) -> T {
        black_box(value)
    }
    
    /// Benchmark with setup and teardown closures.
    pub fn bench_with_setup<S, T, Setup, Test, Teardown>(
        name: &str,
        iterations: usize,
        mut setup: Setup,
        mut test: Test,
        mut teardown: Teardown,
    ) -> Duration
    where
        Setup: FnMut() -> S,
        Test: FnMut(S) -> T,
        Teardown: FnMut(T),
    {
        // Warmup
        for _ in 0..iterations / 10 {
            let s = setup();
            let t = test(s);
            teardown(t);
        }
    
        let start = Instant::now();
        for _ in 0..iterations {
            let s = black_box(setup());
            let t = black_box(test(s));
            teardown(t);
        }
        let elapsed = start.elapsed();
        println!("{}: {:?} total ({} iters)", name, elapsed, iterations);
        elapsed
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_bench_basic() {
            let duration = bench("simple_add", 1000, || 1 + 1);
            assert!(duration.as_nanos() > 0);
        }
    
        #[test]
        fn test_consume() {
            let value = consume(42);
            assert_eq!(value, 42);
        }
    
        #[test]
        fn test_bench_with_closure_state() {
            let mut counter = 0;
            let _ = bench("counter", 100, || {
                counter += 1;
                counter
            });
            assert_eq!(counter, 110); // 10 warmup + 100 iterations
        }
    
        #[test]
        fn test_bench_with_setup() {
            let duration = bench_with_setup(
                "vec_sum",
                100,
                || vec![1, 2, 3, 4, 5],
                |v| v.iter().sum::<i32>(),
                |_| {},
            );
            assert!(duration.as_nanos() > 0);
        }
    
        #[test]
        fn test_black_box_prevents_optimization() {
            // Without black_box, compiler might optimize this away
            let result = black_box(vec![1, 2, 3].iter().sum::<i32>());
            assert_eq!(result, 6);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_bench_basic() {
            let duration = bench("simple_add", 1000, || 1 + 1);
            assert!(duration.as_nanos() > 0);
        }
    
        #[test]
        fn test_consume() {
            let value = consume(42);
            assert_eq!(value, 42);
        }
    
        #[test]
        fn test_bench_with_closure_state() {
            let mut counter = 0;
            let _ = bench("counter", 100, || {
                counter += 1;
                counter
            });
            assert_eq!(counter, 110); // 10 warmup + 100 iterations
        }
    
        #[test]
        fn test_bench_with_setup() {
            let duration = bench_with_setup(
                "vec_sum",
                100,
                || vec![1, 2, 3, 4, 5],
                |v| v.iter().sum::<i32>(),
                |_| {},
            );
            assert!(duration.as_nanos() > 0);
        }
    
        #[test]
        fn test_black_box_prevents_optimization() {
            // Without black_box, compiler might optimize this away
            let result = black_box(vec![1, 2, 3].iter().sum::<i32>());
            assert_eq!(result, 6);
        }
    }

    Deep Comparison

    OCaml vs Rust: Benchmark Closures

    OCaml

    let bench name iterations f =
      let start = Unix.gettimeofday () in
      for _ = 1 to iterations do
        ignore (f ())
      done;
      let elapsed = Unix.gettimeofday () -. start in
      Printf.printf "%s: %f sec\n" name elapsed
    
    let () = bench "sum" 1000 (fun () -> List.fold_left (+) 0 [1;2;3])
    

    Rust

    use std::hint::black_box;
    
    pub fn bench<T, F: FnMut() -> T>(name: &str, iters: usize, mut f: F) {
        let start = Instant::now();
        for _ in 0..iters {
            black_box(f());  // prevent optimization
        }
        println!("{}: {:?}", name, start.elapsed());
    }
    

    Key Differences

  • Rust: black_box prevents compiler from optimizing away results
  • OCaml: ignore discards result but doesn't prevent optimization
  • Both: Closures enable flexible benchmark setup
  • Rust: Setup/teardown closures for stateful benchmarks
  • Rust: Criterion crate for production benchmarking
  • Exercises

  • Statistics bench: Extend bench to return not just total duration but also mean, standard deviation, and min/max per iteration over multiple runs.
  • Comparison macro: Write a macro bench_vs!(name1 => expr1, name2 => expr2, iterations: N) that benchmarks two expressions and asserts the ratio is within a given tolerance.
  • Setup benchmark: Use bench_with_setup to benchmark Vec::sort vs Vec::sort_unstable on random data — ensure the setup closure generates a new random vector each iteration so sorting is never pre-sorted.
  • Open Source Repos