344: Structured Concurrency
Tutorial
The Problem
Spawning threads that outlive their creator produces "orphan threads": tasks that may read freed memory, hold resources after the owning scope exits, or silently swallow panics. Structured concurrency (Nathaniel J. Smith, 2018; popularized by Kotlin coroutines and Python's anyio) ensures that spawned tasks are strictly scoped: they cannot outlive the block that created them. Rust's thread::scope implements this in the standard library: borrowed references remain valid for the entire scope, and the scope blocks until every spawned thread completes. No Arc is needed for data shared with child threads; plain borrows work directly.
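As a minimal sketch of that guarantee: two scoped threads can mutably borrow disjoint halves of a stack-allocated array, with no Arc and no Mutex, because the scope proves both threads finish before the array is touched again.

```rust
use std::thread;

// Two scoped threads mutably borrow disjoint halves of a stack array.
// The borrows are legal because `thread::scope` joins both threads
// before `parts` is read again.
fn sum_halves() -> (u64, u64) {
    let mut parts = [0u64; 2];
    let (a, b) = parts.split_at_mut(1);
    thread::scope(|s| {
        s.spawn(|| a[0] = (1u64..=100).sum());
        s.spawn(|| b[0] = (101u64..=200).sum());
    });
    (parts[0], parts[1])
}

fn main() {
    let (low, high) = sum_halves();
    println!("{low} + {high} = {}", low + high);
    assert_eq!(low + high, (1u64..=200).sum());
}
```

With thread::spawn the same code would be rejected: the closures borrow from the enclosing frame, which an unscoped thread could outlive.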
🎯 Learning Outcomes
- Use thread::scope to spawn threads that borrow data from the enclosing scope
- Share &[T] across threads without cloning into an Arc
- Distinguish structured (thread::scope) from unstructured (thread::spawn) concurrency

Code Example
#![allow(clippy::all)]
//! # Structured Concurrency
//! Spawn tasks that are guaranteed to complete before the scope exits.

use std::thread;

pub fn scoped_work<F, R>(work: F) -> R
where
    F: FnOnce() -> R + Send,
    R: Send,
{
    thread::scope(|s| {
        let handle = s.spawn(work);
        handle.join().unwrap()
    })
}

pub fn parallel_sum(nums: &[i32]) -> i32 {
    // Below this size, spawn overhead outweighs the parallel win.
    if nums.len() < 100 {
        return nums.iter().sum();
    }
    let mid = nums.len() / 2;
    let (left, right) = nums.split_at(mid);
    thread::scope(|s| {
        // Both closures borrow slices from the caller; this is safe
        // because the scope joins both threads before returning.
        let l = s.spawn(|| parallel_sum(left));
        let r = s.spawn(|| parallel_sum(right));
        l.join().unwrap() + r.join().unwrap()
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn scoped_returns_result() {
        assert_eq!(scoped_work(|| 42), 42);
    }

    #[test]
    fn parallel_sum_correct() {
        let nums: Vec<i32> = (1..=1000).collect();
        assert_eq!(parallel_sum(&nums), 500500);
    }
}

Key Differences
| Aspect | Rust thread::scope | OCaml Domain.spawn + join |
|---|---|---|
| Borrow across threads | Yes — compiler-verified | No — must copy or use Arc-equivalent |
| Auto-join on exit | Yes — scope guarantees it | Manual Domain.join required |
| Panic propagation | Propagated on join().unwrap() | Domain.join re-raises |
| Nesting | Works recursively | Works recursively |
| Overhead | OS threads | Domains (each backed by an OS thread; keep ≈ one per core) |
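The borrow-across-threads row can be seen directly in code. A minimal sketch: thread::spawn demands a 'static closure, so borrowing a local is rejected at compile time, while thread::scope accepts the same borrow.

```rust
use std::thread;

fn main() {
    let data = vec![1, 2, 3];

    // Unstructured: `thread::spawn` requires a 'static closure, because the
    // new thread may outlive this stack frame. Borrowing `data` won't compile:
    // let h = thread::spawn(|| data.len()); // error[E0373]: closure may outlive `main`

    // Structured: the scope joins the thread before `data` can be dropped,
    // so a plain borrow type-checks.
    let len = thread::scope(|s| s.spawn(|| data.len()).join().unwrap());
    assert_eq!(len, 3);
    println!("len = {len}");
}
```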
OCaml Approach
OCaml 5 domains provide similar structured parallelism:
let parallel_sum nums =
  let n = Array.length nums in
  let mid = n / 2 in
  let left = Array.sub nums 0 mid in
  let right = Array.sub nums mid (n - mid) in
  let d = Domain.spawn (fun () -> Array.fold_left (+) 0 left) in
  let r_sum = Array.fold_left (+) 0 right in
  r_sum + Domain.join d
OCaml copies subarrays (GC-safe) rather than borrowing slices. Domain.join plays the role of scope exit — it blocks until the spawned domain finishes. Unlike Rust's scope, OCaml doesn't statically prevent domains from escaping their creation context, but Domain.join achieves the same runtime guarantee.
Full Source
#![allow(clippy::all)]
//! # Structured Concurrency
//! Spawn tasks that are guaranteed to complete before the scope exits.

use std::thread;

pub fn scoped_work<F, R>(work: F) -> R
where
    F: FnOnce() -> R + Send,
    R: Send,
{
    thread::scope(|s| {
        let handle = s.spawn(work);
        handle.join().unwrap()
    })
}

pub fn parallel_sum(nums: &[i32]) -> i32 {
    // Below this size, spawn overhead outweighs the parallel win.
    if nums.len() < 100 {
        return nums.iter().sum();
    }
    let mid = nums.len() / 2;
    let (left, right) = nums.split_at(mid);
    thread::scope(|s| {
        // Both closures borrow slices from the caller; this is safe
        // because the scope joins both threads before returning.
        let l = s.spawn(|| parallel_sum(left));
        let r = s.spawn(|| parallel_sum(right));
        l.join().unwrap() + r.join().unwrap()
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn scoped_returns_result() {
        assert_eq!(scoped_work(|| 42), 42);
    }

    #[test]
    fn parallel_sum_correct() {
        let nums: Vec<i32> = (1..=1000).collect();
        assert_eq!(parallel_sum(&nums), 500500);
    }
}
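Panic propagation can also be observed directly. A small sketch: joining a scoped thread that panicked yields an Err carrying the panic payload rather than crashing the scope.

```rust
use std::thread;

fn main() {
    let result: thread::Result<i32> = thread::scope(|s| {
        let h = s.spawn(|| -> i32 { panic!("worker failed") });
        // `join` returns Err(payload) instead of aborting the scope.
        // An *unjoined* panicking thread would instead make the scope
        // itself re-panic when it exits.
        h.join()
    });
    assert!(result.is_err());
    println!("worker panicked: {}", result.is_err());
}
```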
Deep Comparison
OCaml vs Rust: Structured Concurrency
Overview
See the example.rs and example.ml files for detailed implementations.
Key Differences
| Aspect | OCaml | Rust |
|---|---|---|
| Type system | Hindley-Milner | Ownership + traits |
| Memory | Garbage collected | Ownership/RAII, no GC |
| Mutability | Explicit ref | mut keyword |
| Error handling | Option/Result | Result<T, E> |
See README.md for detailed comparison.
Exercises
1. Implement parallel_map<T, R>(items: &[T], f: impl Fn(&T) -> R + Sync) -> Vec<R> using thread::scope, splitting the slice into as many chunks as CPU cores.
2. Modify parallel_sum to abort early if either half panics; propagate the panic correctly instead of panicking twice.
3. Add a depth parameter to the recursive parallel_sum that falls back to sequential when depth reaches 0; find experimentally at what depth the overhead of spawning exceeds the parallel benefit.
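One possible shape for the first exercise, as a hedged sketch: the chunking strategy, bounds, and the `f = &f` trick are illustrative choices, not the only correct answer.

```rust
use std::thread;

// Illustrative solution sketch for exercise 1: split the slice into
// roughly one chunk per core and map each chunk on its own scoped thread.
pub fn parallel_map<T, R, F>(items: &[T], f: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let chunk_size = ((items.len() + workers - 1) / workers).max(1); // `chunks` panics on 0
    let f = &f; // a shared reference is Copy, so each `move` closure gets its own
    thread::scope(|s| {
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || chunk.iter().map(f).collect::<Vec<R>>()))
            .collect();
        // Join in spawn order so results stay in input order.
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let doubled = parallel_map(&[1, 2, 3, 4, 5], |x| x * 2);
    assert_eq!(doubled, vec![2, 4, 6, 8, 10]);
    println!("{doubled:?}");
}
```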