Closure State Machine
Tutorial Video
Text description (accessibility)
This video demonstrates the "Closure State Machine" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. State machines appear in: lexers (tokenising source code), protocol parsers (HTTP state machine), UI event systems (button: idle → pressed → released), and regex engines. Key difference from OCaml: 1. **`Box` overhead**: Each state transition in Rust allocates a `Box`; OCaml's closures are GC
Tutorial
The Problem
State machines appear in: lexers (tokenising source code), protocol parsers (HTTP state machine), UI event systems (button: idle → pressed → released), and regex engines. The traditional implementation uses an enum of states with a match on transitions. A functional alternative represents each state as a closure that takes input and returns the next state — the continuation-passing style. This approach is compositional: states are values that can be passed, stored, and combined.
🎯 Learning Outcomes
Box<dyn Fn(char) -> StateResult> — closures as transitionsStateResult as an enum: Accept | Reject | Continue(Box<dyn Fn(char) -> StateResult>)run_machine(input: &str) -> bool loopContinue(next_state) is the continuation-passing equivalent of goto statestate_start, state_after_a, state_after_b serve as state closuresCode Example
pub enum StateResult {
Accept,
Reject,
Continue(Box<dyn Fn(char) -> StateResult>),
}
pub fn state_start(c: char) -> StateResult {
match c {
'a' => StateResult::Continue(Box::new(state_after_a)),
'b' => StateResult::Continue(Box::new(state_after_b)),
_ => StateResult::Reject,
}
}Key Differences
Box overhead**: Each state transition in Rust allocates a Box; OCaml's closures are GC-managed without explicit boxing.let rec ... and ... for mutually recursive state functions; Rust uses separately defined free functions (no mutual fn recursion syntax needed).Debug for StateResult**: Rust must manually implement Debug for StateResult because Box<dyn Fn> is not Debug; OCaml's polymorphic printing handles this automatically.OCaml Approach
OCaml's algebraic types and first-class functions make this pattern natural:
type result = Accept | Reject | Continue of (char -> result)
let rec state_start c = match c with
| 'a' -> Continue state_after_a
| 'b' -> Continue state_after_b
| _ -> Reject
and state_after_b c = match c with
| 'b' -> Continue state_after_b
| _ -> Reject
let run input =
String.fold_left (fun state c -> match state c with
| Continue next -> next
| other -> Fun.const other) state_start input = Accept
OCaml's and (mutually recursive definitions) is cleaner than Rust's free functions here.
Full Source
#![allow(clippy::all)]
//! Closures as State Machine Transitions
//!
//! Pattern: a*b+ — zero-or-more 'a' followed by one-or-more 'b'
/// State machine result.
pub enum StateResult {
Accept,
Reject,
Continue(Box<dyn Fn(char) -> StateResult>),
}
impl std::fmt::Debug for StateResult {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
StateResult::Accept => write!(f, "Accept"),
StateResult::Reject => write!(f, "Reject"),
StateResult::Continue(_) => write!(f, "Continue(<fn>)"),
}
}
}
/// State: start — expects 'a' or 'b'
pub fn state_start(c: char) -> StateResult {
match c {
'a' => StateResult::Continue(Box::new(state_after_a)),
'b' => StateResult::Continue(Box::new(state_after_b)),
_ => StateResult::Reject,
}
}
/// State: seen at least one 'a', expecting more 'a' or 'b'
pub fn state_after_a(c: char) -> StateResult {
match c {
'a' => StateResult::Continue(Box::new(state_after_a)),
'b' => StateResult::Continue(Box::new(state_after_b)),
_ => StateResult::Reject,
}
}
/// State: seen at least one 'b' after a's — expecting more 'b' or end
pub fn state_after_b(c: char) -> StateResult {
match c {
'b' => StateResult::Continue(Box::new(state_after_b)),
_ => StateResult::Reject,
}
}
/// Run the state machine on input.
pub fn run_machine(input: &str) -> bool {
let mut state: Box<dyn Fn(char) -> StateResult> = Box::new(state_start);
let mut saw_b = false;
for c in input.chars() {
match state(c) {
StateResult::Accept => return true,
StateResult::Reject => return false,
StateResult::Continue(next) => {
if c == 'b' {
saw_b = true;
}
state = next;
}
}
}
// Must have seen at least one 'b' to accept
saw_b
}
/// Alternative: enum-based state machine (more idiomatic Rust)
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum State {
Start,
AfterA,
AfterB,
Reject,
}
pub fn transition(state: State, c: char) -> State {
match (state, c) {
(State::Start, 'a') => State::AfterA,
(State::Start, 'b') => State::AfterB,
(State::AfterA, 'a') => State::AfterA,
(State::AfterA, 'b') => State::AfterB,
(State::AfterB, 'b') => State::AfterB,
_ => State::Reject,
}
}
pub fn run_enum_machine(input: &str) -> bool {
let final_state = input.chars().fold(State::Start, transition);
final_state == State::AfterB
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_accepts_b() {
assert!(run_machine("b"));
}
#[test]
fn test_accepts_ab() {
assert!(run_machine("ab"));
}
#[test]
fn test_accepts_aab() {
assert!(run_machine("aab"));
}
#[test]
fn test_accepts_aaabbb() {
assert!(run_machine("aaabbb"));
}
#[test]
fn test_rejects_a_only() {
assert!(!run_machine("a"));
assert!(!run_machine("aaa"));
}
#[test]
fn test_rejects_empty() {
assert!(!run_machine(""));
}
#[test]
fn test_enum_machine_matches() {
for input in &["b", "ab", "aab", "aaabbb", "bbb"] {
assert_eq!(run_machine(input), run_enum_machine(input));
}
for input in &["a", "aaa", "", "ba", "aba"] {
assert_eq!(run_machine(input), run_enum_machine(input));
}
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_accepts_b() {
assert!(run_machine("b"));
}
#[test]
fn test_accepts_ab() {
assert!(run_machine("ab"));
}
#[test]
fn test_accepts_aab() {
assert!(run_machine("aab"));
}
#[test]
fn test_accepts_aaabbb() {
assert!(run_machine("aaabbb"));
}
#[test]
fn test_rejects_a_only() {
assert!(!run_machine("a"));
assert!(!run_machine("aaa"));
}
#[test]
fn test_rejects_empty() {
assert!(!run_machine(""));
}
#[test]
fn test_enum_machine_matches() {
for input in &["b", "ab", "aab", "aaabbb", "bbb"] {
assert_eq!(run_machine(input), run_enum_machine(input));
}
for input in &["a", "aaa", "", "ba", "aba"] {
assert_eq!(run_machine(input), run_enum_machine(input));
}
}
}
Deep Comparison
OCaml vs Rust: Closure State Machines
OCaml
type state_result = Accept | Reject | Continue of (char -> state_result)
let rec state_start c = match c with
| 'a' -> Continue state_after_a
| 'b' -> Continue state_after_b
| _ -> Reject
Rust
pub enum StateResult {
Accept,
Reject,
Continue(Box<dyn Fn(char) -> StateResult>),
}
pub fn state_start(c: char) -> StateResult {
match c {
'a' => StateResult::Continue(Box::new(state_after_a)),
'b' => StateResult::Continue(Box::new(state_after_b)),
_ => StateResult::Reject,
}
}
Key Differences
Exercises
StateResult::Error(String) for states that can provide error messages, and implement a new state that recognises (a|b)*c (any sequence ending in c).DfaBuilder that accepts a HashMap<(StateId, char), StateId> transition table and generates the corresponding closure-based state machine.a*b+ recogniser using the type-state pattern (each state is a distinct struct type, transitions are methods) and compare the compile-time guarantees vs. the closure approach.