Intermediate

753-bench-harness-pattern — Benchmark Harness Pattern

Functional Programming


Tutorial

The Problem

Percentile latency matters more than mean latency for user-facing systems. A p99 of 50ms means 1% of requests are slow — unacceptable for interactive applications. Production benchmark frameworks like Criterion report p50, p90, p95, and p99. This example builds a stdlib-only harness that computes full percentile statistics from per-iteration samples, demonstrating the statistical foundation beneath tools like Criterion, Divan, and hyperfine.

🎯 Learning Outcomes

  • Collect per-iteration Duration samples and sort them for percentile computation
  • Compute p50, p90, and p99 as array index lookups after sorting
  • Use std::hint::black_box to prevent dead-code elimination of benchmarked functions
  • Understand why p99 matters more than mean for user-facing latency
  • Structure a reusable bench(name, iterations, warmup, f) function

Code Example

    pub fn bench<F, R>(name: &str, iterations: usize, mut f: F) -> Stats
    where
        F: FnMut() -> R,
    {
        let mut samples = Vec::new();
        for _ in 0..iterations {
            let start = Instant::now();
            black_box(f());
            samples.push(start.elapsed());
        }
        compute_stats(samples)
    }

    Key Differences

  • GC noise: Rust benchmarks are not affected by garbage collection; OCaml benchmarks must account for minor and major GC pauses.
  • Percentile support: This example implements percentiles from scratch; Rust's criterion provides them via Criterion::bench_function. OCaml's core_bench provides similar output.
  • Warmup: Both harnesses implement warmup phases; Criterion's warmup duration is configurable, while this example uses a fixed iteration count.
  • Statistical analysis: Criterion performs regression analysis and outlier detection; this example provides raw percentiles without statistical significance testing.

OCaml Approach

    OCaml's core_bench library provides Bench.Test.create with built-in percentile reporting. Jane Street uses it extensively in their trading systems. bechamel is an alternative with more statistical sophistication (R² goodness of fit). OCaml's GC adds noise to benchmarks that Rust avoids; core_bench accounts for this by measuring GC pressure separately.

    Full Source

    #![allow(clippy::all)]
    //! # Benchmark Harness Pattern
    //!
    //! Measuring hot functions with percentiles (std-only).
    
    use std::hint::black_box;
    use std::time::{Duration, Instant};
    
    /// Statistics from a benchmark run
    #[derive(Debug)]
    pub struct Stats {
        pub mean: Duration,
        pub min: Duration,
        pub p50: Duration,
        pub p90: Duration,
        pub p99: Duration,
        pub max: Duration,
    }
    
    /// Compute statistics from a vector of samples
    pub fn compute_stats(mut samples: Vec<Duration>) -> Stats {
        assert!(!samples.is_empty());
        samples.sort_unstable();
        let n = samples.len();
    
        let sum: Duration = samples.iter().sum();
        let mean = sum / n as u32;
    
        Stats {
            mean,
            min: samples[0],
            p50: samples[n / 2],
            p90: samples[n * 90 / 100],
            p99: samples[n * 99 / 100],
            max: samples[n - 1],
        }
    }
    
    /// Run a benchmark
    pub fn bench<F, R>(name: &str, iterations: usize, warmup: usize, mut f: F) -> Stats
    where
        F: FnMut() -> R,
    {
        // Warmup
        for _ in 0..warmup {
            black_box(f());
        }
    
        // Measure
        let mut samples = Vec::with_capacity(iterations);
        for _ in 0..iterations {
            let start = Instant::now();
            black_box(f());
            samples.push(start.elapsed());
        }
    
        let stats = compute_stats(samples);
        println!(
            "[{}] mean={:?} min={:?} p50={:?} p90={:?} p99={:?} max={:?}",
            name, stats.mean, stats.min, stats.p50, stats.p90, stats.p99, stats.max
        );
        stats
    }
    
    /// Format duration in human-readable form
    pub fn format_duration(d: Duration) -> String {
        let nanos = d.as_nanos();
        if nanos < 1_000 {
            format!("{}ns", nanos)
        } else if nanos < 1_000_000 {
            format!("{:.2}µs", nanos as f64 / 1_000.0)
        } else if nanos < 1_000_000_000 {
            format!("{:.2}ms", nanos as f64 / 1_000_000.0)
        } else {
            format!("{:.2}s", d.as_secs_f64())
        }
    }
    
    // Functions to benchmark
    
    /// Fibonacci recursive (slow)
    pub fn fib_recursive(n: u64) -> u64 {
        if n <= 1 {
            n
        } else {
            fib_recursive(n - 1) + fib_recursive(n - 2)
        }
    }
    
    /// Fibonacci iterative (fast)
    pub fn fib_iterative(n: u64) -> u64 {
        if n <= 1 {
            return n;
        }
        let (mut a, mut b) = (0u64, 1u64);
        for _ in 2..=n {
            let c = a + b;
            a = b;
            b = c;
        }
        b
    }
    
    /// Sum a slice
    pub fn sum_slice(data: &[i64]) -> i64 {
        data.iter().sum()
    }
    
    /// Sum using fold
    pub fn sum_fold(data: &[i64]) -> i64 {
        data.iter().fold(0, |acc, &x| acc + x)
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_fib_recursive() {
            assert_eq!(fib_recursive(0), 0);
            assert_eq!(fib_recursive(1), 1);
            assert_eq!(fib_recursive(10), 55);
        }
    
        #[test]
        fn test_fib_iterative() {
            assert_eq!(fib_iterative(0), 0);
            assert_eq!(fib_iterative(1), 1);
            assert_eq!(fib_iterative(10), 55);
            assert_eq!(fib_iterative(20), 6765);
        }
    
        #[test]
        fn test_fib_equivalence() {
            for n in 0..20 {
                assert_eq!(fib_recursive(n), fib_iterative(n));
            }
        }
    
        #[test]
        fn test_compute_stats() {
            let samples: Vec<Duration> = (1..=100).map(|i| Duration::from_nanos(i * 100)).collect();
            let stats = compute_stats(samples);
            assert_eq!(stats.min, Duration::from_nanos(100));
            assert_eq!(stats.max, Duration::from_nanos(10000));
        }
    
        #[test]
        fn test_format_duration() {
            assert_eq!(format_duration(Duration::from_nanos(500)), "500ns");
            assert_eq!(format_duration(Duration::from_micros(100)), "100.00µs");
            assert_eq!(format_duration(Duration::from_millis(50)), "50.00ms");
            assert_eq!(format_duration(Duration::from_secs(2)), "2.00s");
        }
    
        #[test]
        fn test_sum_functions() {
            let data: Vec<i64> = (1..=100).collect();
            assert_eq!(sum_slice(&data), 5050);
            assert_eq!(sum_fold(&data), 5050);
        }
    
        #[test]
        fn test_bench_runs() {
            let stats = bench("test", 10, 2, || fib_iterative(10));
            assert!(stats.mean > Duration::ZERO);
        }
    }

    Deep Comparison

    OCaml vs Rust: Benchmark Harness Pattern

    Basic Benchmark Structure

    Rust

    pub fn bench<F, R>(name: &str, iterations: usize, mut f: F) -> Stats
    where
        F: FnMut() -> R,
    {
        let mut samples = Vec::new();
        for _ in 0..iterations {
            let start = Instant::now();
            black_box(f());
            samples.push(start.elapsed());
        }
        compute_stats(samples)
    }
    

    OCaml (Core_bench)

    let () =
      Command.run (Bench.make_command [
        Bench.Test.create ~name:"fib_recursive" (fun () ->
          ignore (fib_recursive 20));
        Bench.Test.create ~name:"fib_iterative" (fun () ->
          ignore (fib_iterative 20));
      ])
    

    Preventing Optimization

    Rust

    use std::hint::black_box;
    
    // Prevent compiler from optimizing away the result
    black_box(f());
    

    OCaml

    (* Use Sys.opaque_identity *)
    ignore (Sys.opaque_identity (f ()))
    

    Computing Statistics

    Rust

    pub struct Stats {
        pub mean: Duration,
        pub min: Duration,
        pub p50: Duration,
        pub p90: Duration,
        pub p99: Duration,
        pub max: Duration,
    }
    

    Key Differences

    Aspect               OCaml                    Rust
    Library              Core_bench               criterion, or std-only
    Anti-optimization    Sys.opaque_identity      std::hint::black_box
    Timer                Unix.gettimeofday        Instant::now()
    Statistics           Built into Core_bench    Manual or criterion
    Warmup               Automatic                Manual

    Exercises

  • Add an outlier detection step: remove samples more than 3 standard deviations from the mean and recompute statistics on the cleaned dataset.
  • Implement a compare_stats function that prints a speedup table: for each percentile, shows the ratio between two Stats values and whether the difference is significant.
  • Add throughput reporting: given a bytes_processed count per iteration, compute and display MB/s or million-ops/s alongside latency percentiles.

Open Source Repos