153 Advanced

Satisfy Parser

Functional Programming

Tutorial

The Problem

Rather than enumerating every specific character a parser might accept, satisfy generalizes character matching to any predicate Fn(char) -> bool. This single combinator replaces a whole family of specific parsers: is_digit, is_letter, is_whitespace, and is_alphanumeric all become one-liners built from satisfy. The predicate-based approach is easier to extend and compose, and it mirrors how formal grammars define a character class: as the set of characters for which some condition holds.

🎯 Learning Outcomes

• Understand satisfy as the universal character matching primitive

• Learn how to build specific parsers (digit, letter, alphanumeric) from satisfy

• See how description strings in satisfy produce readable error messages

• Practice building a small parser vocabulary entirely from one primitive

Code Example

Rust:

fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
{
    let desc = desc.to_string();
    Box::new(move |input: &'a str| {
        match input.chars().next() {
            Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
            Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
            None => Err(format!("Expected {}, got EOF", desc)),
        }
    })
}

OCaml:

let satisfy (pred : char -> bool) (desc : string) : char parser = fun input ->
  match advance input with
  | Some (ch, rest) when pred ch -> Ok (ch, rest)
  | Some (ch, _) -> Error (Printf.sprintf "Character '%c' does not satisfy %s" ch desc)
  | None -> Error (Printf.sprintf "Expected %s, got EOF" desc)

Key Differences

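Description in signature: Both implementations in this example take desc as a parameter of satisfy; OCaml's angstrom library instead attaches descriptions separately with its <?> operator.

Predicate type: Rust's bound F: Fn(char) -> bool + 'a requires the predicate to live at least as long as 'a, so it can be moved into the boxed closure; OCaml's predicates are plain function values managed by the GC.

Unicode awareness: Rust's char is always a Unicode scalar value, which is why the remaining input is sliced at c.len_utf8() rather than at byte 1; OCaml's char is a single byte (0..255), so full Unicode support requires Uchar. The is_ascii_* predicates used in this example still accept only ASCII characters.

Composability: Both satisfy variants produce ordinary parser values, so they plug into higher-level combinators such as many0, many1, map, and choice in the same way.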
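
The Unicode point can be illustrated with a short sketch. It assumes the same type aliases and satisfy as the example (repeated here so the block stands alone) and introduces a hypothetical unicode_letter parser that uses char::is_alphabetic instead of the ASCII-only predicate.

```rust
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

// Same shape as the satisfy from the example, repeated for self-containment.
fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
{
    let desc = desc.to_string();
    Box::new(move |input: &'a str| match input.chars().next() {
        // Slicing at c.len_utf8() skips the whole scalar value,
        // which may occupy one to four bytes in the UTF-8 input.
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
        None => Err(format!("Expected {}, got EOF", desc)),
    })
}

// Hypothetical parser: accepts any alphabetic scalar value, not only ASCII.
fn unicode_letter<'a>() -> Parser<'a, char> {
    satisfy(|c| c.is_alphabetic(), "Unicode letter")
}
```

Given the input "\u{e9}a" (an accented e followed by a), unicode_letter consumes the two-byte character in one step and leaves "a", whereas a parser built from is_ascii_alphabetic rejects the same input.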

OCaml Approach

OCaml's angstrom provides satisfy : (char -> bool) -> char t directly. The idiomatic pattern is shown below; note that Char.is_digit and Char.is_alpha come from Jane Street's Base/Core, not from the standard library's Char module:

let digit = satisfy Char.is_digit
let letter = satisfy Char.is_alpha
let alphanumeric = satisfy (fun c -> Char.is_alpha c || Char.is_digit c)

Passing a named function such as Char.is_digit directly, where the Rust code in this example writes a closure (|c| c.is_ascii_digit()), makes these definitions more compact. Error messages in angstrom are attached separately by writing p <?> "description".
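
For comparison, the same separation of matching and description can be sketched in Rust. This is only an illustration loosely modeled on the idea behind <?>, not angstrom's actual behavior: satisfy_plain and label are hypothetical names, and the error format is an assumption made for this sketch.

```rust
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

// Hypothetical primitive without a description parameter.
fn satisfy_plain<'a, F>(pred: F) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
{
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        Some(c) => Err(format!("unexpected '{}'", c)),
        None => Err("unexpected EOF".to_string()),
    })
}

// Hypothetical combinator: prefix any failure of `p` with a description.
// Successful results pass through unchanged.
fn label<'a, T: 'a>(p: Parser<'a, T>, desc: &str) -> Parser<'a, T> {
    let desc = desc.to_string();
    Box::new(move |input: &'a str| p(input).map_err(|e| format!("{}: {}", desc, e)))
}

// The description is attached after the fact rather than passed to the primitive.
fn digit<'a>() -> Parser<'a, char> {
    label(satisfy_plain(|c| c.is_ascii_digit()), "digit")
}
```

The trade-off in this sketch is that the primitive stays minimal while any parser, not only a character parser, can be given a description.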

Full Source

#![allow(clippy::all)]
// Example 153: Satisfy Parser
// Parse a character matching a predicate

type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

// ============================================================
// Approach 1: satisfy with predicate and description
// ============================================================

fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
{
    let desc = desc.to_string();
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
        None => Err(format!("Expected {}, got EOF", desc)),
    })
}

// ============================================================
// Approach 2: Build specific parsers from satisfy
// ============================================================

fn is_digit<'a>() -> Parser<'a, char> {
    satisfy(|c| c.is_ascii_digit(), "digit")
}

fn is_letter<'a>() -> Parser<'a, char> {
    satisfy(|c| c.is_ascii_alphabetic(), "letter")
}

fn is_alphanumeric<'a>() -> Parser<'a, char> {
    satisfy(|c| c.is_ascii_alphanumeric(), "alphanumeric")
}

fn is_whitespace_char<'a>() -> Parser<'a, char> {
    satisfy(|c| c.is_ascii_whitespace(), "whitespace")
}

fn is_uppercase<'a>() -> Parser<'a, char> {
    satisfy(|c| c.is_ascii_uppercase(), "uppercase letter")
}

fn is_lowercase<'a>() -> Parser<'a, char> {
    satisfy(|c| c.is_ascii_lowercase(), "lowercase letter")
}

// ============================================================
// Approach 3: satisfy_or with custom error function
// ============================================================

fn satisfy_or<'a, F, E>(pred: F, on_fail: E) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
    E: Fn(char) -> String + 'a,
{
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        Some(c) => Err(on_fail(c)),
        None => Err("Unexpected EOF".to_string()),
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_digit_success() {
        let p = is_digit();
        assert_eq!(p("42"), Ok(('4', "2")));
    }

    #[test]
    fn test_digit_failure() {
        let p = is_digit();
        assert!(p("abc").is_err());
    }

    #[test]
    fn test_letter_success() {
        let p = is_letter();
        assert_eq!(p("hello"), Ok(('h', "ello")));
    }

    #[test]
    fn test_letter_failure() {
        let p = is_letter();
        assert!(p("123").is_err());
    }

    #[test]
    fn test_alphanumeric() {
        let p = is_alphanumeric();
        assert_eq!(p("a1"), Ok(('a', "1")));
        assert_eq!(p("1a"), Ok(('1', "a")));
        assert!(p("!x").is_err());
    }

    #[test]
    fn test_whitespace() {
        let p = is_whitespace_char();
        assert_eq!(p(" x"), Ok((' ', "x")));
        assert_eq!(p("\tx"), Ok(('\t', "x")));
        assert!(p("x").is_err());
    }

    #[test]
    fn test_uppercase() {
        let p = is_uppercase();
        assert_eq!(p("Hello"), Ok(('H', "ello")));
        assert!(p("hello").is_err());
    }

    #[test]
    fn test_lowercase() {
        let p = is_lowercase();
        assert_eq!(p("hello"), Ok(('h', "ello")));
        assert!(p("Hello").is_err());
    }

    #[test]
    fn test_custom_predicate() {
        let hex = satisfy(|c| c.is_ascii_hexdigit(), "hex digit");
        assert_eq!(hex("ff"), Ok(('f', "f")));
        assert!(hex("zz").is_err());
    }

    #[test]
    fn test_satisfy_or_custom_error() {
        let p = satisfy_or(|c| c == '@', |c| format!("Expected '@', found '{}'", c));
        assert_eq!(p("@hello"), Ok(('@', "hello")));
        assert_eq!(p("hello"), Err("Expected '@', found 'h'".to_string()));
    }

    #[test]
    fn test_empty_input() {
        let p = is_digit();
        assert!(p("").is_err());
    }
}

(* Example 153: Satisfy Parser *)
(* Parse a character matching a predicate *)

type 'a parse_result = ('a * string, string) result
type 'a parser = string -> 'a parse_result

let advance input =
  if String.length input > 0 then
    Some (input.[0], String.sub input 1 (String.length input - 1))
  else None

(* Approach 1: satisfy with a predicate *)
let satisfy (pred : char -> bool) (desc : string) : char parser = fun input ->
  match advance input with
  | Some (ch, rest) when pred ch -> Ok (ch, rest)
  | Some (ch, _) -> Error (Printf.sprintf "Character '%c' does not satisfy %s" ch desc)
  | None -> Error (Printf.sprintf "Expected %s, got EOF" desc)

(* Approach 2: Build specific parsers from satisfy *)
let is_digit = satisfy (fun c -> c >= '0' && c <= '9') "digit"
let is_letter = satisfy (fun c ->
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) "letter"
let is_alphanumeric = satisfy (fun c ->
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
  (c >= '0' && c <= '9')) "alphanumeric"
let is_whitespace = satisfy (fun c -> c = ' ' || c = '\t' || c = '\n' || c = '\r') "whitespace"

(* Approach 3: Satisfy with custom error message *)
let satisfy_or (pred : char -> bool) (on_fail : char -> string) : char parser = fun input ->
  match advance input with
  | Some (ch, rest) when pred ch -> Ok (ch, rest)
  | Some (ch, _) -> Error (on_fail ch)
  | None -> Error "Unexpected EOF"

let is_uppercase = satisfy_or
  (fun c -> c >= 'A' && c <= 'Z')
  (fun c -> Printf.sprintf "'%c' is not uppercase" c)

(* Tests *)
let () =
  assert (is_digit "42" = Ok ('4', "2"));
  assert (Result.is_error (is_digit "abc"));
  assert (is_letter "hello" = Ok ('h', "ello"));
  assert (Result.is_error (is_letter "123"));
  assert (is_alphanumeric "a1" = Ok ('a', "1"));
  assert (is_alphanumeric "1a" = Ok ('1', "a"));
  assert (is_whitespace " x" = Ok (' ', "x"));
  assert (is_uppercase "Hello" = Ok ('H', "ello"));
  assert (Result.is_error (is_uppercase "hello"));
  assert (Result.is_error (is_digit ""));
  print_endline "✓ All tests passed"

Deep Comparison

Comparison: Example 153 — Satisfy Parser

Core satisfy

OCaml:

let satisfy (pred : char -> bool) (desc : string) : char parser = fun input ->
  match advance input with
  | Some (ch, rest) when pred ch -> Ok (ch, rest)
  | Some (ch, _) -> Error (Printf.sprintf "Character '%c' does not satisfy %s" ch desc)
  | None -> Error (Printf.sprintf "Expected %s, got EOF" desc)

Rust:

fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
{
    let desc = desc.to_string();
    Box::new(move |input: &'a str| {
        match input.chars().next() {
            Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
            Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
            None => Err(format!("Expected {}, got EOF", desc)),
        }
    })
}

Building specific parsers

OCaml:

let is_digit = satisfy (fun c -> c >= '0' && c <= '9') "digit"
let is_letter = satisfy (fun c ->
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) "letter"

Rust:

fn is_digit<'a>() -> Parser<'a, char> {
    satisfy(|c| c.is_ascii_digit(), "digit")
}

fn is_letter<'a>() -> Parser<'a, char> {
    satisfy(|c| c.is_ascii_alphabetic(), "letter")
}
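
Because the specific parsers are ordinary Parser values, they can be handed to repetition combinators. The sketch below assumes a many1 combinator, which is mentioned under Key Differences but not defined in this example; the version here is a hypothetical minimal implementation, and digits is likewise an illustrative name.

```rust
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

// Same shape as the satisfy from the example, repeated for self-containment.
fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
{
    let desc = desc.to_string();
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
        None => Err(format!("Expected {}, got EOF", desc)),
    })
}

// Hypothetical combinator: apply `p` one or more times, collecting the results.
// Assumes `p` consumes input on success, as every satisfy-based parser does.
fn many1<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        // The first match is mandatory; its error is returned unchanged.
        let (first, mut rest) = match p(input) {
            Ok(pair) => pair,
            Err(e) => return Err(e),
        };
        let mut items = vec![first];
        // Further matches are optional; stop at the first failure.
        while let Ok((item, next)) = p(rest) {
            items.push(item);
            rest = next;
        }
        Ok((items, rest))
    })
}

// A run of one or more ASCII digits, built from the two pieces above.
fn digits<'a>() -> Parser<'a, Vec<char>> {
    many1(satisfy(|c| c.is_ascii_digit(), "digit"))
}
```

On "123abc" this sketch collects the three digits and leaves "abc"; on "abc" it fails with the message produced by the underlying satisfy parser.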

Exercises

Build hex_digit<'a>() -> Parser<'a, char> using satisfy and the predicate |c| c.is_ascii_hexdigit().

Write printable_char<'a>() -> Parser<'a, char> that accepts any character that is neither a control character nor whitespace.

Implement not_char<'a>(excluded: char) -> Parser<'a, char> that accepts any character except excluded, built from satisfy.
