ExamplesBy LevelBy TopicLearning Paths
512 Intermediate

Closure State Machine

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "Closure State Machine" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. State machines appear in: lexers (tokenising source code), protocol parsers (HTTP state machine), UI event systems (button: idle → pressed → released), and regex engines. Key difference from OCaml: 1. **`Box` overhead**: Each state transition in Rust allocates a `Box`; OCaml's closures are GC

Tutorial

The Problem

State machines appear in: lexers (tokenising source code), protocol parsers (HTTP state machine), UI event systems (button: idle → pressed → released), and regex engines. The traditional implementation uses an enum of states with a match on transitions. A functional alternative represents each state as a closure that takes input and returns the next state — the continuation-passing style. This approach is compositional: states are values that can be passed, stored, and combined.

🎯 Learning Outcomes

  • • Represent states as Box<dyn Fn(char) -> StateResult> — closures as transitions
  • • Define StateResult as an enum: Accept | Reject | Continue(Box<dyn Fn(char) -> StateResult>)
  • • Drive the machine with a run_machine(input: &str) -> bool loop
  • • Understand how Continue(next_state) is the continuation-passing equivalent of goto state
  • • Recognise that free functions state_start, state_after_a, state_after_b serve as state closures
  • Code Example

    pub enum StateResult {
        Accept,
        Reject,
        Continue(Box<dyn Fn(char) -> StateResult>),
    }
    
    pub fn state_start(c: char) -> StateResult {
        match c {
            'a' => StateResult::Continue(Box::new(state_after_a)),
            'b' => StateResult::Continue(Box::new(state_after_b)),
            _ => StateResult::Reject,
        }
    }

    Key Differences

  • **Box overhead**: Each state transition in Rust allocates a Box; OCaml's closures are GC-managed without explicit boxing.
  • Mutual recursion: OCaml uses let rec ... and ... for mutually recursive state functions; Rust uses separately defined free functions (no mutual fn recursion syntax needed).
  • Type-state alternative: For compile-time state machine verification, Rust's type-state pattern (using distinct types for each state) is preferred over runtime closures.
  • **Debug for StateResult**: Rust must manually implement Debug for StateResult because Box<dyn Fn> is not Debug; OCaml's polymorphic printing handles this automatically.
  • OCaml Approach

    OCaml's algebraic types and first-class functions make this pattern natural:

    type result = Accept | Reject | Continue of (char -> result)
    
    let rec state_start c = match c with
      | 'a' -> Continue state_after_a
      | 'b' -> Continue state_after_b
      | _ -> Reject
    
    and state_after_b c = match c with
      | 'b' -> Continue state_after_b
      | _ -> Reject
    
    let run input =
      String.fold_left (fun state c -> match state c with
        | Continue next -> next
        | other -> Fun.const other) state_start input = Accept
    

    OCaml's and (mutually recursive definitions) is cleaner than Rust's free functions here.

    Full Source

    #![allow(clippy::all)]
    //! Closures as State Machine Transitions
    //!
    //! Pattern: a*b+ — zero-or-more 'a' followed by one-or-more 'b'
    
    /// State machine result.
    pub enum StateResult {
        Accept,
        Reject,
        Continue(Box<dyn Fn(char) -> StateResult>),
    }
    
    impl std::fmt::Debug for StateResult {
        fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
            match self {
                StateResult::Accept => write!(f, "Accept"),
                StateResult::Reject => write!(f, "Reject"),
                StateResult::Continue(_) => write!(f, "Continue(<fn>)"),
            }
        }
    }
    
    /// State: start — expects 'a' or 'b'
    pub fn state_start(c: char) -> StateResult {
        match c {
            'a' => StateResult::Continue(Box::new(state_after_a)),
            'b' => StateResult::Continue(Box::new(state_after_b)),
            _ => StateResult::Reject,
        }
    }
    
    /// State: seen at least one 'a', expecting more 'a' or 'b'
    pub fn state_after_a(c: char) -> StateResult {
        match c {
            'a' => StateResult::Continue(Box::new(state_after_a)),
            'b' => StateResult::Continue(Box::new(state_after_b)),
            _ => StateResult::Reject,
        }
    }
    
    /// State: seen at least one 'b' after a's — expecting more 'b' or end
    pub fn state_after_b(c: char) -> StateResult {
        match c {
            'b' => StateResult::Continue(Box::new(state_after_b)),
            _ => StateResult::Reject,
        }
    }
    
    /// Run the state machine on input.
    pub fn run_machine(input: &str) -> bool {
        let mut state: Box<dyn Fn(char) -> StateResult> = Box::new(state_start);
        let mut saw_b = false;
    
        for c in input.chars() {
            match state(c) {
                StateResult::Accept => return true,
                StateResult::Reject => return false,
                StateResult::Continue(next) => {
                    if c == 'b' {
                        saw_b = true;
                    }
                    state = next;
                }
            }
        }
    
        // Must have seen at least one 'b' to accept
        saw_b
    }
    
    /// Alternative: enum-based state machine (more idiomatic Rust)
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum State {
        Start,
        AfterA,
        AfterB,
        Reject,
    }
    
    pub fn transition(state: State, c: char) -> State {
        match (state, c) {
            (State::Start, 'a') => State::AfterA,
            (State::Start, 'b') => State::AfterB,
            (State::AfterA, 'a') => State::AfterA,
            (State::AfterA, 'b') => State::AfterB,
            (State::AfterB, 'b') => State::AfterB,
            _ => State::Reject,
        }
    }
    
    pub fn run_enum_machine(input: &str) -> bool {
        let final_state = input.chars().fold(State::Start, transition);
        final_state == State::AfterB
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_accepts_b() {
            assert!(run_machine("b"));
        }
    
        #[test]
        fn test_accepts_ab() {
            assert!(run_machine("ab"));
        }
    
        #[test]
        fn test_accepts_aab() {
            assert!(run_machine("aab"));
        }
    
        #[test]
        fn test_accepts_aaabbb() {
            assert!(run_machine("aaabbb"));
        }
    
        #[test]
        fn test_rejects_a_only() {
            assert!(!run_machine("a"));
            assert!(!run_machine("aaa"));
        }
    
        #[test]
        fn test_rejects_empty() {
            assert!(!run_machine(""));
        }
    
        #[test]
        fn test_enum_machine_matches() {
            for input in &["b", "ab", "aab", "aaabbb", "bbb"] {
                assert_eq!(run_machine(input), run_enum_machine(input));
            }
            for input in &["a", "aaa", "", "ba", "aba"] {
                assert_eq!(run_machine(input), run_enum_machine(input));
            }
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_accepts_b() {
            assert!(run_machine("b"));
        }
    
        #[test]
        fn test_accepts_ab() {
            assert!(run_machine("ab"));
        }
    
        #[test]
        fn test_accepts_aab() {
            assert!(run_machine("aab"));
        }
    
        #[test]
        fn test_accepts_aaabbb() {
            assert!(run_machine("aaabbb"));
        }
    
        #[test]
        fn test_rejects_a_only() {
            assert!(!run_machine("a"));
            assert!(!run_machine("aaa"));
        }
    
        #[test]
        fn test_rejects_empty() {
            assert!(!run_machine(""));
        }
    
        #[test]
        fn test_enum_machine_matches() {
            for input in &["b", "ab", "aab", "aaabbb", "bbb"] {
                assert_eq!(run_machine(input), run_enum_machine(input));
            }
            for input in &["a", "aaa", "", "ba", "aba"] {
                assert_eq!(run_machine(input), run_enum_machine(input));
            }
        }
    }

    Deep Comparison

    OCaml vs Rust: Closure State Machines

    OCaml

    type state_result = Accept | Reject | Continue of (char -> state_result)
    
    let rec state_start c = match c with
      | 'a' -> Continue state_after_a
      | 'b' -> Continue state_after_b
      | _ -> Reject
    

    Rust

    pub enum StateResult {
        Accept,
        Reject,
        Continue(Box<dyn Fn(char) -> StateResult>),
    }
    
    pub fn state_start(c: char) -> StateResult {
        match c {
            'a' => StateResult::Continue(Box::new(state_after_a)),
            'b' => StateResult::Continue(Box::new(state_after_b)),
            _ => StateResult::Reject,
        }
    }
    

    Key Differences

  • OCaml: Functions stored directly in variants
  • Rust: Need Box<dyn Fn> for dynamic dispatch
  • Both: State transitions as function returns
  • Rust: Enum-based approach often more idiomatic
  • Both support modeling complex state machines functionally
  • Exercises

  • Extend the DFA: Add StateResult::Error(String) for states that can provide error messages, and implement a new state that recognises (a|b)*c (any sequence ending in c).
  • State machine builder: Design a DfaBuilder that accepts a HashMap<(StateId, char), StateId> transition table and generates the corresponding closure-based state machine.
  • Type-state comparison: Rewrite the a*b+ recogniser using the type-state pattern (each state is a distinct struct type, transitions are methods) and compare the compile-time guarantees vs. the closure approach.
  • Open Source Repos