559 Intermediate

Region Inference

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "Region Inference" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Rust's borrow checker models each reference as having a "region" — a set of program points at which the reference is valid. Key difference from OCaml: 1. **Tracking granularity**: Rust's region inference is per

Tutorial

The Problem

Rust's borrow checker models each reference as having a "region" — a set of program points at which the reference is valid. The compiler infers the minimal region sufficient to cover all uses of the reference, rather than using a static scope. Region inference is the theoretical foundation that makes both lexical lifetimes and NLL work. Understanding how the compiler infers regions helps predict what the borrow checker will accept, explains "does not live long enough" errors, and motivates the Polonius project which extends region inference to handle more cases soundly.

🎯 Learning Outcomes

• What a region is: a set of program points at which a reference is considered live

• How the compiler infers the minimal region covering all uses of a reference

• How nested regions work: inner borrows end before outer ones

• How region inference interacts with control flow (if/else, loops)

• Why Polonius extends region inference to track where borrows flow through control paths

Code Example

#![allow(clippy::all)]
//! Region Inference
//!
//! How the compiler infers lifetime regions.

/// Compiler infers minimal region for borrow.
pub fn inferred_region() {
    let mut x = 5;
    let r = &x; // region starts
    let _ = *r; // region ends (last use)
    x = 10; // OK: region ended
    assert_eq!(x, 10);
}

/// Region spans usage.
pub fn region_span(data: &[i32]) -> i32 {
    // 'a region covers entire function
    data.iter().sum()
}

/// Nested regions.
pub fn nested_regions(v: &mut Vec<i32>) {
    {
        let r = &v[0..2]; // inner region
        let _ = r.len();
    }
    v.push(10); // outer region continues
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_inferred() {
        inferred_region();
    }

    #[test]
    fn test_span() {
        let data = [1, 2, 3];
        assert_eq!(region_span(&data), 6);
    }

    #[test]
    fn test_nested() {
        let mut v = vec![1, 2, 3];
        nested_regions(&mut v);
        assert_eq!(v.len(), 4);
    }
}

(* Region inference in OCaml -- handled by GC, no inference needed *)
let borrow_and_use data transform =
  let result = transform data in
  result

let () =
  let data = "hello world" in
  let result = borrow_and_use data (fun s -> String.length s) in
  Printf.printf "length: %d\n" result;

  (* Inferring which "region" a value belongs to *)
  let get_first_word s =
    match String.split_on_char ' ' s with
    | [] -> ""
    | w :: _ -> w
  in
  Printf.printf "first: %s\n" (get_first_word "hello world rust")

Key Differences

Tracking granularity: Rust's region inference is per-program-point; OCaml's GC is reachability-based — fundamentally different models with different tradeoffs.

Minimal regions: Rust's NLL infers the minimal region, avoiding false positives; earlier Rust used lexical scopes (maximal regions), rejecting more correct programs.

Control flow: Region inference handles if/else and loops by computing the union of regions across control paths; OCaml has no equivalent analysis.

Polonius improvement: Polonius computes per-path (not just per-point) regions, accepting programs that NLL rejects due to conservative union of paths; OCaml never needs this refinement.

OCaml Approach

OCaml has no region inference. The GC uses a reachability-based model: a value is alive as long as any path from a root (stack variable, global) can reach it. This is simpler than region inference — no program-point-level tracking is needed.

let region_demo () =
  let v = [| 1; 2; 3; 4; 5 |] in
  let r = v.(0) in  (* no "region" — just a copy *)
  (* can modify v freely — r is just an int *)
  v.(0) <- 10;
  r  (* returns original value *)

Full Source

#![allow(clippy::all)]
//! Region Inference
//!
//! How the compiler infers lifetime regions.

/// Compiler infers minimal region for borrow.
pub fn inferred_region() {
    let mut x = 5;
    let r = &x; // region starts
    let _ = *r; // region ends (last use)
    x = 10; // OK: region ended
    assert_eq!(x, 10);
}

/// Region spans usage.
pub fn region_span(data: &[i32]) -> i32 {
    // 'a region covers entire function
    data.iter().sum()
}

/// Nested regions.
pub fn nested_regions(v: &mut Vec<i32>) {
    {
        let r = &v[0..2]; // inner region
        let _ = r.len();
    }
    v.push(10); // outer region continues
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_inferred() {
        inferred_region();
    }

    #[test]
    fn test_span() {
        let data = [1, 2, 3];
        assert_eq!(region_span(&data), 6);
    }

    #[test]
    fn test_nested() {
        let mut v = vec![1, 2, 3];
        nested_regions(&mut v);
        assert_eq!(v.len(), 4);
    }
}

(* Region inference in OCaml -- handled by GC, no inference needed *)
let borrow_and_use data transform =
  let result = transform data in
  result

let () =
  let data = "hello world" in
  let result = borrow_and_use data (fun s -> String.length s) in
  Printf.printf "length: %d\n" result;

  (* Inferring which "region" a value belongs to *)
  let get_first_word s =
    match String.split_on_char ' ' s with
    | [] -> ""
    | w :: _ -> w
  in
  Printf.printf "first: %s\n" (get_first_word "hello world rust")

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_inferred() {
        inferred_region();
    }

    #[test]
    fn test_span() {
        let data = [1, 2, 3];
        assert_eq!(region_span(&data), 6);
    }

    #[test]
    fn test_nested() {
        let mut v = vec![1, 2, 3];
        nested_regions(&mut v);
        assert_eq!(v.len(), 4);
    }
}

Deep Comparison

OCaml vs Rust: lifetime region inference

See example.rs and example.ml for implementations.

Exercises

Manual region tracing: Take inferred_region and add comments to every line marking which references are in scope at that point — identify where each region starts and ends.

Loop region: Write a loop that borrows v[i], uses it, then modifies v[i+1] — identify whether the borrow region covers the whole loop iteration or ends at the borrow's last use.

Region extension: Try extending a borrow's region by using it later in the function — observe how the compiler's error message identifies the conflicting use point.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust