442 Fundamental

442: Scoped Threads — Borrowing Across Threads

Functional Programming

Tutorial

The Problem

thread::spawn requires 'static data — you can't borrow a local variable across threads because the spawned thread might outlive the caller's stack frame. This forces Arc<T> and cloning even when you just want to process slices of a local array in parallel. thread::scope (stabilized in Rust 1.63) solves this: scoped threads are guaranteed to complete before the scope exits, so they can safely borrow any data from the enclosing scope — including stack-allocated slices, without Arc or clone.
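
To make the contrast concrete, here is a minimal sketch of the same two-way sum written both ways. The function names `sum_with_arc` and `sum_with_scope` are made up for this illustration and are not part of the example's source; the `Arc` version copies the input because a `'static` closure cannot hold a borrowed slice.

```rust
use std::sync::Arc;
use std::thread;

/// With `thread::spawn` the closure must be `'static`, so the input is
/// copied into an `Arc` and each thread receives its own handle.
/// (Hypothetical helper, for illustration only.)
fn sum_with_arc(data: &[i64]) -> i64 {
    let shared: Arc<Vec<i64>> = Arc::new(data.to_vec()); // heap copy of the input
    let mid = shared.len() / 2;

    let left = Arc::clone(&shared);
    let t1 = thread::spawn(move || left[..mid].iter().sum::<i64>());
    let right = Arc::clone(&shared);
    let t2 = thread::spawn(move || right[mid..].iter().sum::<i64>());

    t1.join().unwrap() + t2.join().unwrap()
}

/// With `thread::scope` the threads borrow the two halves directly.
/// (Hypothetical helper, for illustration only.)
fn sum_with_scope(data: &[i64]) -> i64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let t1 = s.spawn(move || left.iter().sum::<i64>());
        let t2 = s.spawn(move || right.iter().sum::<i64>());
        t1.join().unwrap() + t2.join().unwrap()
    })
}
```

Both functions return the same total; the scoped version simply skips the `to_vec` copy and the reference counting.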

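Scoped threads enable efficient parallel processing of local data: parallel prefix sums, parallel sorting passes, parallel data transformation, all without wrapping the data in Arc or copying it. Spawning a thread still has its own cost, so the approach pays off when each thread has enough work to do.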

🎯 Learning Outcomes

• Understand why thread::spawn requires 'static but thread::scope does not

• Learn how scope.spawn(|| borrowed_data) borrows data safely within the scope lifetime

• See how parallel_sum splits a slice and processes halves concurrently

• Understand the scope guarantee: all threads are joined before thread::scope returns

• Learn when to prefer scoped threads over Arc<T> combined with thread::spawn
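
The scope guarantee from the list above can be observed directly. The following is a small sketch, not code from the example's source: the helper name `count_finished` is hypothetical, and it deliberately never calls `join` so that only the implicit join at the end of `thread::scope` is at work.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Spawns one scoped thread per word and never calls `join` explicitly.
/// Returns how many threads had finished once `thread::scope` returned.
/// (Hypothetical helper, for illustration only.)
fn count_finished(words: &[&str]) -> usize {
    let finished = AtomicUsize::new(0); // lives on this stack frame, no Arc

    thread::scope(|s| {
        for word in words {
            let finished = &finished;
            s.spawn(move || {
                let _len = word.len(); // borrows the caller's data
                finished.fetch_add(1, Ordering::SeqCst);
            });
        }
        // The handles are dropped without joining; the scope joins for us.
    });

    finished.load(Ordering::SeqCst)
}
```

Because `thread::scope` does not return until every spawned thread has completed, the counter always equals the number of words by the time it is read.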

Code Example

Rust

use std::thread;

fn parallel_sum(data: &[i64]) -> i64 {
    let (left, right) = data.split_at(data.len() / 2);
    let mut ls = 0i64;
    let mut rs = 0i64;

    thread::scope(|s| {
        let t1 = s.spawn(|| left.iter().sum::<i64>());
        let t2 = s.spawn(|| right.iter().sum::<i64>());
        ls = t1.join().unwrap();
        rs = t2.join().unwrap();
    }); // any thread not joined explicitly is joined here

    ls + rs
}

OCaml

let parallel_sum arr =
  let n = Array.length arr in
  let mid = n / 2 in
  let left  = ref 0 in
  let right = ref 0 in
  let t1 = Thread.create (fun () ->
    left := Array.fold_left (+) 0 (Array.sub arr 0 mid)) () in
  let t2 = Thread.create (fun () ->
    right := Array.fold_left (+) 0 (Array.sub arr mid (n-mid))) () in
  Thread.join t1; Thread.join t2;
  !left + !right

Key Differences

Lifetime restriction: Rust's spawn requires 'static; scoped threads lift this. OCaml has no lifetime restriction since GC manages all values.

Allocation overhead: Rust's scoped threads borrow data in place, so no Arc allocation or clone is needed; in OCaml, arrays and closures live on the GC-managed heap in any case.

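Guarantee mechanism: Rust's thread::scope takes a closure and joins every spawned thread before it returns; the lifetimes in its signature let the borrow checker verify that borrowed data outlives those threads. OCaml has no equivalent guarantee.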

Rayon comparison: rayon::scope extends this pattern with work stealing for better load balancing; std::thread::scope is the simpler no-dependency version.
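
The join guarantee also shapes how panics surface. As a rough sketch (the function name `try_parse_all` is invented for this illustration), a panic inside a scoped thread comes back as an `Err` from `join`, so the caller can handle it instead of unwrapping:

```rust
use std::thread;

/// Parses each input in its own scoped thread. A thread that panics
/// yields `None`, because `join` returns `Err` for a panicked thread.
/// (Hypothetical helper, for illustration only.)
fn try_parse_all(inputs: &[&str]) -> Vec<Option<i32>> {
    thread::scope(|s| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|input| s.spawn(move || input.parse::<i32>().expect("not a number")))
            .collect();

        // Every handle is joined by hand, in input order.
        handles
            .into_iter()
            .map(|h| h.join().ok())
            .collect::<Vec<_>>()
    })
}
```

According to the standard library documentation, `thread::scope` itself panics only when a thread that it joined automatically has panicked; here every handle is joined explicitly, so the failures stay local.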

OCaml Approach

OCaml's Thread.create takes a closure over heap-allocated values, and the GC keeps those values alive for as long as any thread can reach them, so there is no stack-lifetime restriction. Any OCaml value can be shared across threads without an equivalent of the 'static requirement. However, mutable state still requires synchronization (for example with Mutex.t). OCaml 5.x's Domain.spawn has similar freedom: domains share the heap and can access any allocated value.
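
For comparison, the Rust side has the same need for synchronization when threads update one shared value, but with scoped threads the lock itself can be a plain local. The sketch below is an assumption-laden illustration rather than part of the example's source; `sum_with_shared_total` is a hypothetical name.

```rust
use std::sync::Mutex;
use std::thread;

/// Sums `data` by letting each scoped thread add its chunk's subtotal
/// to one shared accumulator. (Hypothetical helper, for illustration only.)
fn sum_with_shared_total(data: &[i64], chunk_size: usize) -> i64 {
    // `chunks` panics on a chunk size of 0, so clamp it to at least 1.
    let chunk_size = chunk_size.max(1);
    let total = Mutex::new(0i64); // plain local, borrowed by every thread

    thread::scope(|s| {
        for chunk in data.chunks(chunk_size) {
            let total = &total;
            s.spawn(move || {
                let subtotal: i64 = chunk.iter().sum();
                *total.lock().unwrap() += subtotal; // lock held only for the update
            });
        }
    });

    // All threads are done, so the mutex can be consumed.
    total.into_inner().unwrap()
}
```

Returning values through `join`, as the main example does, avoids the lock entirely; the mutex is only worth it when threads genuinely need to update shared state while running.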

Full Source

#![allow(clippy::all)]
//! # Scoped Threads — Borrow Stack Data Across Threads
//!
//! Use `thread::scope` to spawn threads that borrow local data directly
//! — no `Arc`, no cloning, no heap allocation.

use std::thread;

/// Approach 1: Parallel sum using scoped threads
///
/// Splits data and processes halves in parallel, borrowing directly.
pub fn parallel_sum(data: &[i64]) -> i64 {
    if data.len() < 2 {
        return data.iter().sum();
    }

    let (left, right) = data.split_at(data.len() / 2);
    let mut ls = 0i64;
    let mut rs = 0i64;

    thread::scope(|s| {
        let t1 = s.spawn(|| left.iter().sum::<i64>());
        let t2 = s.spawn(|| right.iter().sum::<i64>());
        ls = t1.join().unwrap();
        rs = t2.join().unwrap();
    });

    ls + rs
}

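/// Approach 2: Parallel map over chunks
///
/// Process data in parallel chunks, collecting results in input order.
///
/// # Panics
///
/// Panics if `chunk_size` is 0 (a requirement of `slice::chunks`).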
pub fn parallel_map<T, U, F>(data: &[T], chunk_size: usize, f: F) -> Vec<U>
where
    T: Sync,
    U: Send,
    F: Fn(&T) -> U + Sync,
{
    let mut results = Vec::with_capacity(data.len());

    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_size)
            .map(|chunk| s.spawn(|| chunk.iter().map(&f).collect::<Vec<_>>()))
            .collect();

        for handle in handles {
            results.extend(handle.join().unwrap());
        }
    });

    results
}

/// Approach 3: Multiple readers of borrowed data
///
/// Multiple threads can borrow shared references simultaneously.
pub fn parallel_count_matches(data: &[i32], predicate: impl Fn(&i32) -> bool + Sync) -> usize {
    let num_threads = 4.min(data.len());
    if num_threads == 0 {
        return 0;
    }

    let chunk_size = (data.len() + num_threads - 1) / num_threads;
    let mut counts = vec![0usize; num_threads];

    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_size)
            .map(|chunk| {
                let pred = &predicate;
                s.spawn(move || chunk.iter().filter(|x| pred(x)).count())
            })
            .collect();

        for (i, h) in handles.into_iter().enumerate() {
            if i < counts.len() {
                counts[i] = h.join().unwrap();
            }
        }
    });

    counts.iter().sum()
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::thread;

    #[test]
    fn test_parallel_sum_basic() {
        let data: Vec<i64> = (1..=100).collect();
        assert_eq!(parallel_sum(&data), 5050);
    }

    #[test]
    fn test_parallel_sum_empty() {
        let data: Vec<i64> = vec![];
        assert_eq!(parallel_sum(&data), 0);
    }

    #[test]
    fn test_parallel_sum_single() {
        let data: Vec<i64> = vec![42];
        assert_eq!(parallel_sum(&data), 42);
    }

    #[test]
    fn test_borrow_string_in_scope() {
        let s = String::from("hello");
        thread::scope(|sc| {
            sc.spawn(|| assert_eq!(s.len(), 5));
        });
    }

    #[test]
    fn test_multiple_readers() {
        let message = String::from("shared");
        let mut results = Vec::new();

        thread::scope(|s| {
            let h1 = s.spawn(|| message.len());
            let h2 = s.spawn(|| message.chars().count());
            results.push(h1.join().unwrap());
            results.push(h2.join().unwrap());
        });

        assert_eq!(results, vec![6, 6]);
    }

    #[test]
    fn test_parallel_map() {
        let data = vec![1, 2, 3, 4, 5, 6, 7, 8];
        let results = parallel_map(&data, 2, |x| x * x);
        assert_eq!(results, vec![1, 4, 9, 16, 25, 36, 49, 64]);
    }

    #[test]
    fn test_parallel_count_matches() {
        let data: Vec<i32> = (1..=100).collect();
        let count = parallel_count_matches(&data, |&x| x % 2 == 0);
        assert_eq!(count, 50);
    }

    #[test]
    fn test_mutable_split() {
        let mut data = vec![1, 2, 3, 4, 5, 6];
        let (left, right) = data.split_at_mut(3);

        thread::scope(|s| {
            s.spawn(|| {
                for x in left.iter_mut() {
                    *x *= 2;
                }
            });
            s.spawn(|| {
                for x in right.iter_mut() {
                    *x *= 3;
                }
            });
        });

        assert_eq!(data, vec![2, 4, 6, 12, 15, 18]);
    }
}
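
The fixed thread count of 4 in `parallel_count_matches` above is a simplification. One possible variant, sketched here under the assumption that one thread per available core is a reasonable default, asks the standard library how much parallelism is available. The name `count_matches_auto` is hypothetical and does not appear in the example's source.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Counts elements matching `predicate`, using one scoped thread per chunk.
/// The thread count comes from `available_parallelism`, falling back to 1.
/// (Hypothetical variant of `parallel_count_matches`.)
fn count_matches_auto(data: &[i32], predicate: impl Fn(&i32) -> bool + Sync) -> usize {
    if data.is_empty() {
        return 0;
    }
    let workers = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
        .min(data.len());
    // Round up so every element lands in some chunk.
    let chunk_size = (data.len() + workers - 1) / workers;

    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_size)
            .map(|chunk| {
                let pred = &predicate;
                s.spawn(move || chunk.iter().filter(|&x| pred(x)).count())
            })
            .collect();

        // Add up the per-chunk counts and return them from the scope.
        handles.into_iter().map(|h| h.join().unwrap()).sum::<usize>()
    })
}
```

Returning the sum from the scope closure also removes the need for a separate `counts` vector.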

(* 442. Scoped threads – OCaml *)
(* Classic OCaml: threads must be joined manually before the results are read *)
let parallel_sum arr =
  let n = Array.length arr in
  let mid = n / 2 in
  let left  = ref 0 in
  let right = ref 0 in
  let t1 = Thread.create (fun () ->
    left := Array.fold_left (+) 0 (Array.sub arr 0 mid)) () in
  let t2 = Thread.create (fun () ->
    right := Array.fold_left (+) 0 (Array.sub arr mid (n-mid))) () in
  Thread.join t1; Thread.join t2;
  !left + !right

let () =
  let data = Array.init 100 (fun i -> i+1) in
  Printf.printf "Sum = %d (expected 5050)\n" (parallel_sum data)

Deep Comparison

OCaml vs Rust: Scoped Threads

Parallel Sum Pattern

OCaml

let parallel_sum arr =
  let n = Array.length arr in
  let mid = n / 2 in
  let left  = ref 0 in
  let right = ref 0 in
  let t1 = Thread.create (fun () ->
    left := Array.fold_left (+) 0 (Array.sub arr 0 mid)) () in
  let t2 = Thread.create (fun () ->
    right := Array.fold_left (+) 0 (Array.sub arr mid (n-mid))) () in
  Thread.join t1; Thread.join t2;
  !left + !right

Rust

fn parallel_sum(data: &[i64]) -> i64 {
    let (left, right) = data.split_at(data.len() / 2);
    let mut ls = 0i64;
    let mut rs = 0i64;
    
    thread::scope(|s| {
        let t1 = s.spawn(|| left.iter().sum::<i64>());
        let t2 = s.spawn(|| right.iter().sum::<i64>());
        ls = t1.join().unwrap();
        rs = t2.join().unwrap();
    }); // auto-join here
    
    ls + rs
}

Key Differences

| Feature | OCaml | Rust |
|---|---|---|
| Data passing | Copy or ref (GC managed) | Direct borrow (`&[T]`) |
| Join guarantee | Manual (programmer must remember) | Automatic at scope exit |
| Return values | Via `ref` cells | Direct from `join()` |
| Memory safety | GC prevents dangling | Scope lifetime proves safety |
| Zero-copy | Requires sub-array copy | `split_at` is zero-copy |

Borrowing Local Variables

OCaml

let message = "hello" in
let t = Thread.create (fun () ->
  Printf.printf "%s\n" message
) () in
Thread.join t
(* Works because GC tracks the string *)

Rust

let message = String::from("hello");
thread::scope(|s| {
    s.spawn(|| println!("{}", message));  // borrows &message
    s.spawn(|| println!("len={}", message.len()));
});
// message still owned here — no move needed

Mutable Access in Parallel

OCaml

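(* No compile-time disjointness proof; guard shared mutation with a mutex *)
let arr = [|1;2;3;4;5;6|] and mutex = Mutex.create () in
let scale lo hi k () =
  Mutex.lock mutex;
  for i = lo to hi do arr.(i) <- arr.(i) * k done;
  Mutex.unlock mutex in
let ts = [Thread.create (scale 0 2 2) (); Thread.create (scale 3 5 3) ()] in
List.iter Thread.join ts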

Rust

let mut data = vec![1, 2, 3, 4, 5, 6];
let (left, right) = data.split_at_mut(3);

thread::scope(|s| {
    s.spawn(|| left.iter_mut().for_each(|x| *x *= 2));
    s.spawn(|| right.iter_mut().for_each(|x| *x *= 3));
});
// Compiler proves left and right don't overlap
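
The two-way `split_at_mut` above generalizes to any number of threads with `chunks_mut`, which hands out non-overlapping mutable slices. This is a minimal sketch under that assumption; `scale_in_place` is a hypothetical helper, not a function from the example's source.

```rust
use std::thread;

/// Multiplies every element in place, one scoped thread per chunk.
/// `chunks_mut` yields non-overlapping `&mut` slices, so no lock is needed.
/// (Hypothetical helper, for illustration only.)
fn scale_in_place(data: &mut [i32], factor: i32, chunk_size: usize) {
    let chunk_size = chunk_size.max(1); // `chunks_mut` panics on 0
    thread::scope(|s| {
        for chunk in data.chunks_mut(chunk_size) {
            s.spawn(move || {
                for x in chunk.iter_mut() {
                    *x *= factor;
                }
            });
        }
    });
    // All threads have finished here, so `data` is fully updated.
}
```

Each thread owns exclusive access to its own chunk, which is why the borrow checker accepts the code without a `Mutex`.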

Exercises

Parallel prefix sum: Use thread::scope to compute prefix sums in parallel: split the array into N chunks and compute each chunk's local prefix sums in parallel, then do a sequential pass that adds the running total of all preceding chunks to every element of each later chunk.

Parallel quicksort: Implement in-place parallel quicksort using scoped threads: partition the array, then sort both partitions in separate threads using thread::scope. Stop spawning threads when partitions are smaller than a threshold.

Parallel matrix multiply: Use thread::scope to multiply two matrices by assigning each output row to a separate thread. Verify results match sequential multiplication.
