ExamplesBy LevelBy TopicLearning Paths
215 Advanced

Recursion Schemes — Separating What From How

Recursion Schemes

Tutorial Video

Text description (accessibility)

This video demonstrates the "Recursion Schemes — Separating What From How" functional Rust example. Difficulty level: Advanced. Key concepts covered: Recursion Schemes. When processing a recursive data structure (AST, tree, config), every function (`eval`, `show`, `depth`, `count_nodes`) rewrites the same recursive traversal with only the leaf/combine logic differing. Key difference from OCaml: 1. **Ownership model:** OCaml passes by GC reference freely; Rust `cata` consumes `Expr` (owned), while direct helpers borrow `&Expr` — the type system enforces the distinction.

Tutorial

The Problem

When processing a recursive data structure (AST, tree, config), every function (eval, show, depth, count_nodes) rewrites the same recursive traversal with only the leaf/combine logic differing. Recursion schemes factor the traversal out into one place — cata — so each new function becomes a plain, non-recursive algebra.

🎯 Learning Outcomes

  • • How to construct a base functor (ExprF<A>) by replacing recursive positions with a type variable
  • • How map on the functor enables the catamorphism to recurse automatically
  • • Why Rust uses owned values (Expr) in cata while direct recursion prefers borrows (&Expr)
  • • How closures as algebras compile to zero-cost abstractions via monomorphisation
  • 🦀 The Rust Way

    Rust defines enum ExprF<A> as the base functor with a map method. The catamorphism cata<A, F>(e: Expr, alg: &F) -> A consumes the tree (ownership, no GC) and passes alg by reference so it can be called repeatedly. Closures serve as algebras; monomorphisation ensures no runtime overhead.

    Code Example

    pub fn eval(e: &Expr) -> i64 {
        match e {
            Expr::Lit(n)    => *n,
            Expr::Add(a, b) => eval(a) + eval(b),
            Expr::Mul(a, b) => eval(a) * eval(b),
        }
    }
    
    pub fn show(e: &Expr) -> String {
        match e {
            Expr::Lit(n)    => n.to_string(),
            Expr::Add(a, b) => format!("({} + {})", show(a), show(b)),
            Expr::Mul(a, b) => format!("({} * {})", show(a), show(b)),
        }
    }

    Key Differences

  • Ownership model: OCaml passes by GC reference freely; Rust cata consumes Expr (owned), while direct helpers borrow &Expr — the type system enforces the distinction.
  • Functor map: OCaml fmap is a standalone function; Rust ExprF::map is a method, matching idiomatic Rust style.
  • Algebra representation: OCaml uses named functions; Rust uses Fn(ExprF<A>) -> A closures or function pointers — same flexibility, different syntax.
  • Monomorphisation: Rust generates a specialised cata for each algebra at compile time, so there is no dynamic dispatch or boxing penalty.
  • OCaml Approach

    OCaml defines type 'a expr_f as the base functor, implements fmap to transform recursive positions, and writes cata alg e = alg (fmap (cata alg) (project e)). Algebras are ordinary functions like eval_alg : int expr_f -> int. OCaml's GC makes value passing frictionless.

    Full Source

    #![allow(clippy::all)]
    //! # Example 215: Recursion Schemes — Separating What From How
    //!
    //! Demonstrates how to factor recursion *out* of business logic using
    //! catamorphisms. Instead of copying the same traversal into every function,
    //! you write one `cata` that handles the recursion, and plain functions
    //! (algebras) that handle only the local computation step.
    
    // ============================================================
    // Approach 1: Direct recursion — recursion is mixed with logic
    // ============================================================
    
    /// A simple arithmetic expression tree.
    #[derive(Debug, Clone)]
    pub enum Expr {
        Lit(i64),
        Add(Box<Expr>, Box<Expr>),
        Mul(Box<Expr>, Box<Expr>),
    }
    
    impl Expr {
        pub fn lit(n: i64) -> Self {
            Expr::Lit(n)
        }
        pub fn make_add(a: Expr, b: Expr) -> Self {
            Expr::Add(Box::new(a), Box::new(b))
        }
        pub fn make_mul(a: Expr, b: Expr) -> Self {
            Expr::Mul(Box::new(a), Box::new(b))
        }
    }
    
    /// Evaluate — recursion entangled with arithmetic.
    pub fn eval(e: &Expr) -> i64 {
        match e {
            Expr::Lit(n) => *n,
            Expr::Add(a, b) => eval(a) + eval(b),
            Expr::Mul(a, b) => eval(a) * eval(b),
        }
    }
    
    /// Pretty-print — same traversal structure, different payload.
    pub fn show(e: &Expr) -> String {
        match e {
            Expr::Lit(n) => n.to_string(),
            Expr::Add(a, b) => format!("({} + {})", show(a), show(b)),
            Expr::Mul(a, b) => format!("({} * {})", show(a), show(b)),
        }
    }
    
    /// Tree depth — yet again the identical recursive shape.
    pub fn depth(e: &Expr) -> usize {
        match e {
            Expr::Lit(_) => 0,
            Expr::Add(a, b) | Expr::Mul(a, b) => 1 + depth(a).max(depth(b)),
        }
    }
    
    // ============================================================
    // Approach 2: Recursion Schemes — factor the recursion out
    // ============================================================
    //
    // Key idea: make a "base functor" ExprF<A> where every recursive
    // occurrence of Expr is replaced by a type variable A.
    // Then:
    //   - `ExprF<Box<Expr>>` ≅ one layer of Expr (= `project`)
    //   - `map` applies a function to every recursive position
    //   - `cata` wires map + recursion together, accepting an "algebra"
    //     `ExprF<B> -> B` that only handles the *local* step.
    //
    // Result: eval, show, depth become three-line functions with zero
    // recursion. Adding a new traversal costs only the algebra.
    
    /// Base functor: `Expr` with recursive positions replaced by `A`.
    pub enum ExprF<A> {
        Lit(i64),
        Add(A, A),
        Mul(A, A),
    }
    
    impl<A> ExprF<A> {
        /// Functorial map — apply `f` to every recursive position.
        pub fn map<B, F: Fn(A) -> B>(self, f: F) -> ExprF<B> {
            match self {
                ExprF::Lit(n) => ExprF::Lit(n),
                ExprF::Add(a, b) => ExprF::Add(f(a), f(b)),
                ExprF::Mul(a, b) => ExprF::Mul(f(a), f(b)),
            }
        }
    }
    
    /// Project: peel off one layer of `Expr`, exposing `ExprF<Box<Expr>>`.
    fn project(e: Expr) -> ExprF<Box<Expr>> {
        match e {
            Expr::Lit(n) => ExprF::Lit(n),
            Expr::Add(a, b) => ExprF::Add(a, b),
            Expr::Mul(a, b) => ExprF::Mul(a, b),
        }
    }
    
    /// Catamorphism: the *only* place recursion lives.
    ///
    /// Given an algebra `f: ExprF<A> -> A`, folds the whole tree bottom-up:
    /// 1. Peel one layer with `project`.
    /// 2. Recurse into every child with `cata` itself (via `map`).
    /// 3. Hand the fully-reduced layer to the algebra `f`.
    pub fn cata<A, F>(e: Expr, alg: &F) -> A
    where
        F: Fn(ExprF<A>) -> A,
    {
        alg(project(e).map(|child| cata(*child, alg)))
    }
    
    // ============================================================
    // Algebras — pure logic, zero recursion
    // ============================================================
    
    /// Evaluate using cata: just describe the *local* computation.
    pub fn eval_cata(e: Expr) -> i64 {
        cata(e, &|node| match node {
            ExprF::Lit(n) => n,
            ExprF::Add(a, b) => a + b,
            ExprF::Mul(a, b) => a * b,
        })
    }
    
    /// Pretty-print using cata.
    pub fn show_cata(e: Expr) -> String {
        cata(e, &|node| match node {
            ExprF::Lit(n) => n.to_string(),
            ExprF::Add(a, b) => format!("({a} + {b})"),
            ExprF::Mul(a, b) => format!("({a} * {b})"),
        })
    }
    
    /// Depth using cata.
    pub fn depth_cata(e: Expr) -> usize {
        cata(e, &|node: ExprF<usize>| match node {
            ExprF::Lit(_) => 0,
            ExprF::Add(a, b) | ExprF::Mul(a, b) => 1 + a.max(b),
        })
    }
    
    /// Count nodes using cata — free, no new traversal code needed.
    pub fn count_nodes(e: Expr) -> usize {
        cata(e, &|node| match node {
            ExprF::Lit(_) => 1,
            ExprF::Add(a, b) | ExprF::Mul(a, b) => 1 + a + b,
        })
    }
    
    // ============================================================
    // Tests
    // ============================================================
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        /// (2 + 3) * 4  =  20,  depth 2,  5 nodes
        fn sample() -> Expr {
            Expr::make_mul(Expr::make_add(Expr::lit(2), Expr::lit(3)), Expr::lit(4))
        }
    
        // --- Approach 1: direct ---
    
        #[test]
        fn test_eval_direct_literal() {
            assert_eq!(eval(&Expr::lit(7)), 7);
        }
    
        #[test]
        fn test_eval_direct_compound() {
            assert_eq!(eval(&sample()), 20);
        }
    
        #[test]
        fn test_show_direct() {
            assert_eq!(show(&sample()), "((2 + 3) * 4)");
        }
    
        #[test]
        fn test_depth_direct() {
            assert_eq!(depth(&Expr::lit(0)), 0);
            assert_eq!(depth(&sample()), 2);
        }
    
        // --- Approach 2: cata ---
    
        #[test]
        fn test_eval_cata_literal() {
            assert_eq!(eval_cata(Expr::lit(42)), 42);
        }
    
        #[test]
        fn test_eval_cata_compound() {
            assert_eq!(eval_cata(sample()), 20);
        }
    
        #[test]
        fn test_show_cata() {
            assert_eq!(show_cata(sample()), "((2 + 3) * 4)");
            assert_eq!(show_cata(Expr::lit(99)), "99");
        }
    
        #[test]
        fn test_depth_cata() {
            assert_eq!(depth_cata(Expr::lit(0)), 0);
            assert_eq!(depth_cata(sample()), 2);
            // Deeper tree: ((2+3)*4) + 1  →  depth 3
            let deeper = Expr::make_add(sample(), Expr::lit(1));
            assert_eq!(depth_cata(deeper), 3);
        }
    
        #[test]
        fn test_count_nodes() {
            // (2 + 3) * 4  →  mul, add, lit(2), lit(3), lit(4)  =  5
            assert_eq!(count_nodes(sample()), 5);
            assert_eq!(count_nodes(Expr::lit(0)), 1);
        }
    
        #[test]
        fn test_direct_and_cata_agree() {
            // Both approaches must produce identical results.
            let e_ref = sample();
            let e_owned = sample();
            assert_eq!(eval(&e_ref), eval_cata(e_owned));
        }
    
        #[test]
        fn test_nested_mul() {
            // 2 * (3 * 4)  =  24
            let e = Expr::make_mul(Expr::lit(2), Expr::make_mul(Expr::lit(3), Expr::lit(4)));
            assert_eq!(eval(&e), 24);
            assert_eq!(eval_cata(e.clone()), 24);
            assert_eq!(show(&e), "(2 * (3 * 4))");
            assert_eq!(show_cata(e), "(2 * (3 * 4))");
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        /// (2 + 3) * 4  =  20,  depth 2,  5 nodes
        fn sample() -> Expr {
            Expr::make_mul(Expr::make_add(Expr::lit(2), Expr::lit(3)), Expr::lit(4))
        }
    
        // --- Approach 1: direct ---
    
        #[test]
        fn test_eval_direct_literal() {
            assert_eq!(eval(&Expr::lit(7)), 7);
        }
    
        #[test]
        fn test_eval_direct_compound() {
            assert_eq!(eval(&sample()), 20);
        }
    
        #[test]
        fn test_show_direct() {
            assert_eq!(show(&sample()), "((2 + 3) * 4)");
        }
    
        #[test]
        fn test_depth_direct() {
            assert_eq!(depth(&Expr::lit(0)), 0);
            assert_eq!(depth(&sample()), 2);
        }
    
        // --- Approach 2: cata ---
    
        #[test]
        fn test_eval_cata_literal() {
            assert_eq!(eval_cata(Expr::lit(42)), 42);
        }
    
        #[test]
        fn test_eval_cata_compound() {
            assert_eq!(eval_cata(sample()), 20);
        }
    
        #[test]
        fn test_show_cata() {
            assert_eq!(show_cata(sample()), "((2 + 3) * 4)");
            assert_eq!(show_cata(Expr::lit(99)), "99");
        }
    
        #[test]
        fn test_depth_cata() {
            assert_eq!(depth_cata(Expr::lit(0)), 0);
            assert_eq!(depth_cata(sample()), 2);
            // Deeper tree: ((2+3)*4) + 1  →  depth 3
            let deeper = Expr::make_add(sample(), Expr::lit(1));
            assert_eq!(depth_cata(deeper), 3);
        }
    
        #[test]
        fn test_count_nodes() {
            // (2 + 3) * 4  →  mul, add, lit(2), lit(3), lit(4)  =  5
            assert_eq!(count_nodes(sample()), 5);
            assert_eq!(count_nodes(Expr::lit(0)), 1);
        }
    
        #[test]
        fn test_direct_and_cata_agree() {
            // Both approaches must produce identical results.
            let e_ref = sample();
            let e_owned = sample();
            assert_eq!(eval(&e_ref), eval_cata(e_owned));
        }
    
        #[test]
        fn test_nested_mul() {
            // 2 * (3 * 4)  =  24
            let e = Expr::make_mul(Expr::lit(2), Expr::make_mul(Expr::lit(3), Expr::lit(4)));
            assert_eq!(eval(&e), 24);
            assert_eq!(eval_cata(e.clone()), 24);
            assert_eq!(show(&e), "(2 * (3 * 4))");
            assert_eq!(show_cata(e), "(2 * (3 * 4))");
        }
    }

    Deep Comparison

    OCaml vs Rust: Recursion Schemes — Separating What From How

    Side-by-Side Code

    OCaml (direct recursion — the problem)

    let rec eval = function
      | Lit n       -> n
      | Add (a, b)  -> eval a + eval b
      | Mul (a, b)  -> eval a * eval b
    
    let rec show = function
      | Lit n       -> string_of_int n
      | Add (a, b)  -> "(" ^ show a ^ " + " ^ show b ^ ")"
      | Mul (a, b)  -> "(" ^ show a ^ " * " ^ show b ^ ")"
    

    OCaml (recursion scheme — the solution)

    (* Base functor: recursive positions become type variable 'a *)
    type 'a expr_f =
      | LitF of int
      | AddF of 'a * 'a
      | MulF of 'a * 'a
    
    let fmap f = function
      | LitF n      -> LitF n
      | AddF (a, b) -> AddF (f a, f b)
      | MulF (a, b) -> MulF (f a, f b)
    
    (* project: peel one layer *)
    let project = function
      | Lit n      -> LitF n
      | Add (a, b) -> AddF (a, b)
      | Mul (a, b) -> MulF (a, b)
    
    (* cata: the ONE place recursion lives *)
    let rec cata alg e = alg (fmap (cata alg) (project e))
    
    (* algebras: zero recursion *)
    let eval_alg = function LitF n -> n | AddF (a,b) -> a+b | MulF (a,b) -> a*b
    let eval e = cata eval_alg e
    
    let show_alg = function
      | LitF n      -> string_of_int n
      | AddF (a, b) -> "(" ^ a ^ " + " ^ b ^ ")"
      | MulF (a, b) -> "(" ^ a ^ " * " ^ b ^ ")"
    let show e = cata show_alg e
    

    Rust (idiomatic — direct recursion)

    pub fn eval(e: &Expr) -> i64 {
        match e {
            Expr::Lit(n)    => *n,
            Expr::Add(a, b) => eval(a) + eval(b),
            Expr::Mul(a, b) => eval(a) * eval(b),
        }
    }
    
    pub fn show(e: &Expr) -> String {
        match e {
            Expr::Lit(n)    => n.to_string(),
            Expr::Add(a, b) => format!("({} + {})", show(a), show(b)),
            Expr::Mul(a, b) => format!("({} * {})", show(a), show(b)),
        }
    }
    

    Rust (recursion scheme — catamorphism)

    // Base functor: A replaces every recursive Expr position
    pub enum ExprF<A> { Lit(i64), Add(A, A), Mul(A, A) }
    
    impl<A> ExprF<A> {
        pub fn map<B, F: Fn(A) -> B>(self, f: F) -> ExprF<B> {
            match self {
                ExprF::Lit(n)    => ExprF::Lit(n),
                ExprF::Add(a, b) => ExprF::Add(f(a), f(b)),
                ExprF::Mul(a, b) => ExprF::Mul(f(a), f(b)),
            }
        }
    }
    
    fn project(e: Expr) -> ExprF<Box<Expr>> { /* strip one layer */ }
    
    // The ONE place recursion lives
    pub fn cata<A, F: Fn(ExprF<A>) -> A>(e: Expr, alg: &F) -> A {
        alg(project(e).map(|child| cata(*child, alg)))
    }
    
    // Algebras: pure logic, no recursion
    pub fn eval_cata(e: Expr) -> i64 {
        cata(e, &|node| match node {
            ExprF::Lit(n)    => n,
            ExprF::Add(a, b) => a + b,
            ExprF::Mul(a, b) => a * b,
        })
    }
    
    pub fn show_cata(e: Expr) -> String {
        cata(e, &|node| match node {
            ExprF::Lit(n)    => n.to_string(),
            ExprF::Add(a, b) => format!("({a} + {b})"),
            ExprF::Mul(a, b) => format!("({a} * {b})"),
        })
    }
    

    Type Signatures

    ConceptOCamlRust
    Base functortype 'a expr_f = LitF of int \| AddF of 'a * 'a \| MulF of 'a * 'aenum ExprF<A> { Lit(i64), Add(A, A), Mul(A, A) }
    Functor mapval fmap : ('a -> 'b) -> 'a expr_f -> 'b expr_ffn map<B, F: Fn(A) -> B>(self, f: F) -> ExprF<B>
    Catamorphismval cata : ('a expr_f -> 'a) -> expr -> 'afn cata<A, F: Fn(ExprF<A>) -> A>(e: Expr, alg: &F) -> A
    Algebra'a expr_f -> 'aFn(ExprF<A>) -> A
    Direct evalexpr -> intfn eval(e: &Expr) -> i64
    Cata evalexpr -> intfn eval_cata(e: Expr) -> i64

    Key Insights

  • The problem is duplicated recursion. Every function that processes a tree (eval, show, depth, count_nodes) rewrites the same match arms with recursive calls. The recursion pattern never changes — only the leaf/combine logic does. This is the boilerplate recursion schemes eliminate.
  • The base functor is the key abstraction. ExprF<A> is Expr with every recursive Expr replaced by a type variable A. In OCaml this is type 'a expr_f; in Rust it's enum ExprF<A>. The type parameter is the "hole" where the result of recursive calls slots in.
  • **map makes it a functor.** Implementing fmap/map for ExprF lets the catamorphism apply any function to the recursive positions without caring about tree structure. In OCaml, fmap is a standalone function; in Rust, it's a method on ExprF<A>.
  • Ownership shapes the API. OCaml's GC means cata can pass values freely. In Rust, cata consumes the Expr (owned value) to avoid reference-counting overhead. The direct approach uses &Expr (borrowed) since it doesn't need to restructure the tree — a natural split the type system enforces.
  • Closures as algebras. In OCaml, algebras are ordinary functions passed to cata. In Rust, they're closures |node: ExprF<A>| -> A, passed as &F where F: Fn(ExprF<A>) -> A. Rust's monomorphisation means this compiles to the same machine code as writing the recursion by hand — zero abstraction cost.
  • Extensibility gain. Once cata exists, every new traversal costs exactly one algebra function — no match arms, no Box, no recursion. Adding count_nodes took three lines. Adding a new Expr variant (e.g., Neg) requires updating ExprF, project, and the map — but then every existing algebra fails to compile until updated, giving compile-time exhaustiveness checking across all consumers at once.
  • When to Use Each Style

    Use direct recursion when: the data structure is small and stable (2–3 variants, unlikely to grow), or you are writing a one-off transformation and the indirection of a functor layer adds more complexity than it removes.

    Use catamorphism / recursion schemes when: you have multiple traversals over the same type, the type is growing (new variants expected), or you want to guarantee that the recursion logic is correct once and reuse it everywhere without risk of per-function bugs.

    Exercises

  • Implement anamorphism (unfold) — the dual of catamorphism — that generates a recursive structure from a seed value, and use it to build a list of Fibonacci numbers up to a limit.
  • Implement hylomorphism (a compose of ana and cata) and use it to implement Mergesort: the unfold splits the list, the fold merges sorted sublists.
  • Implement paramorphism — a fold that also provides the original sub-structure at each step — and use it to implement a tails function that returns all suffixes of a list.
  • Open Source Repos