ExamplesBy LevelBy TopicLearning Paths
153 Advanced

Satisfy Parser

Functional Programming

Tutorial

The Problem

Rather than enumerate every specific character a parser might accept, satisfy generalizes character matching to any predicate Fn(char) -> bool. This single combinator replaces dozens of specific parsers: is_digit, is_letter, is_whitespace, is_alphanumeric all become one-liners built from satisfy. The predicate-based approach is more extensible, composable, and mirrors the mathematical notation for character classes used in formal grammar theory.

🎯 Learning Outcomes

  • • Understand satisfy as the universal character matching primitive
  • • Learn how to build specific parsers (digit, letter, alphanumeric) from satisfy
  • • See how description strings in satisfy produce readable error messages
  • • Practice building a small parser vocabulary entirely from one primitive
  • Code Example

    fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
    where
        F: Fn(char) -> bool + 'a,
    {
        let desc = desc.to_string();
        Box::new(move |input: &'a str| {
            match input.chars().next() {
                Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
                Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
                None => Err(format!("Expected {}, got EOF", desc)),
            }
        })
    }

    Key Differences

  • Description in signature: Rust's satisfy takes desc: &str inline; OCaml's angstrom uses a <?> operator to attach descriptions separately.
  • Predicate type: Rust's F: Fn(char) -> bool + 'a captures the lifetime of the predicate; OCaml's predicates are plain function values managed by the GC.
  • Unicode awareness: Rust's char is always a Unicode scalar value; OCaml's char is a byte (0..255), requiring Uchar for full Unicode.
  • Composability: Both satisfy variants compose identically with many0, many1, map, and choice; the higher-level combinators are the same.
  • OCaml Approach

    OCaml's angstrom provides satisfy : (char -> bool) -> char t directly. The idiomatic pattern:

    let digit = satisfy Char.is_digit
    let letter = satisfy Char.is_alpha
    let alphanumeric = satisfy (fun c -> Char.is_alpha c || Char.is_digit c)
    

    OCaml's lighter closure syntax (Char.is_digit) compared to Rust's (|c| c.is_ascii_digit()) makes these definitions more compact. Error messages in angstrom are produced separately via <?> "description".

    Full Source

    #![allow(clippy::all)]
    // Example 153: Satisfy Parser
    // Parse a character matching a predicate
    
    type ParseResult<'a, T> = Result<(T, &'a str), String>;
    type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;
    
    // ============================================================
    // Approach 1: satisfy with predicate and description
    // ============================================================
    
    fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
    where
        F: Fn(char) -> bool + 'a,
    {
        let desc = desc.to_string();
        Box::new(move |input: &'a str| match input.chars().next() {
            Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
            Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
            None => Err(format!("Expected {}, got EOF", desc)),
        })
    }
    
    // ============================================================
    // Approach 2: Build specific parsers from satisfy
    // ============================================================
    
    fn is_digit<'a>() -> Parser<'a, char> {
        satisfy(|c| c.is_ascii_digit(), "digit")
    }
    
    fn is_letter<'a>() -> Parser<'a, char> {
        satisfy(|c| c.is_ascii_alphabetic(), "letter")
    }
    
    fn is_alphanumeric<'a>() -> Parser<'a, char> {
        satisfy(|c| c.is_ascii_alphanumeric(), "alphanumeric")
    }
    
    fn is_whitespace_char<'a>() -> Parser<'a, char> {
        satisfy(|c| c.is_ascii_whitespace(), "whitespace")
    }
    
    fn is_uppercase<'a>() -> Parser<'a, char> {
        satisfy(|c| c.is_ascii_uppercase(), "uppercase letter")
    }
    
    fn is_lowercase<'a>() -> Parser<'a, char> {
        satisfy(|c| c.is_ascii_lowercase(), "lowercase letter")
    }
    
    // ============================================================
    // Approach 3: satisfy_or with custom error function
    // ============================================================
    
    fn satisfy_or<'a, F, E>(pred: F, on_fail: E) -> Parser<'a, char>
    where
        F: Fn(char) -> bool + 'a,
        E: Fn(char) -> String + 'a,
    {
        Box::new(move |input: &'a str| match input.chars().next() {
            Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
            Some(c) => Err(on_fail(c)),
            None => Err("Unexpected EOF".to_string()),
        })
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_digit_success() {
            let p = is_digit();
            assert_eq!(p("42"), Ok(('4', "2")));
        }
    
        #[test]
        fn test_digit_failure() {
            let p = is_digit();
            assert!(p("abc").is_err());
        }
    
        #[test]
        fn test_letter_success() {
            let p = is_letter();
            assert_eq!(p("hello"), Ok(('h', "ello")));
        }
    
        #[test]
        fn test_letter_failure() {
            let p = is_letter();
            assert!(p("123").is_err());
        }
    
        #[test]
        fn test_alphanumeric() {
            let p = is_alphanumeric();
            assert_eq!(p("a1"), Ok(('a', "1")));
            assert_eq!(p("1a"), Ok(('1', "a")));
            assert!(p("!x").is_err());
        }
    
        #[test]
        fn test_whitespace() {
            let p = is_whitespace_char();
            assert_eq!(p(" x"), Ok((' ', "x")));
            assert_eq!(p("\tx"), Ok(('\t', "x")));
            assert!(p("x").is_err());
        }
    
        #[test]
        fn test_uppercase() {
            let p = is_uppercase();
            assert_eq!(p("Hello"), Ok(('H', "ello")));
            assert!(p("hello").is_err());
        }
    
        #[test]
        fn test_lowercase() {
            let p = is_lowercase();
            assert_eq!(p("hello"), Ok(('h', "ello")));
            assert!(p("Hello").is_err());
        }
    
        #[test]
        fn test_custom_predicate() {
            let hex = satisfy(|c| c.is_ascii_hexdigit(), "hex digit");
            assert_eq!(hex("ff"), Ok(('f', "f")));
            assert!(hex("zz").is_err());
        }
    
        #[test]
        fn test_satisfy_or_custom_error() {
            let p = satisfy_or(|c| c == '@', |c| format!("Expected '@', found '{}'", c));
            assert_eq!(p("@hello"), Ok(('@', "hello")));
            assert_eq!(p("hello"), Err("Expected '@', found 'h'".to_string()));
        }
    
        #[test]
        fn test_empty_input() {
            let p = is_digit();
            assert!(p("").is_err());
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_digit_success() {
            let p = is_digit();
            assert_eq!(p("42"), Ok(('4', "2")));
        }
    
        #[test]
        fn test_digit_failure() {
            let p = is_digit();
            assert!(p("abc").is_err());
        }
    
        #[test]
        fn test_letter_success() {
            let p = is_letter();
            assert_eq!(p("hello"), Ok(('h', "ello")));
        }
    
        #[test]
        fn test_letter_failure() {
            let p = is_letter();
            assert!(p("123").is_err());
        }
    
        #[test]
        fn test_alphanumeric() {
            let p = is_alphanumeric();
            assert_eq!(p("a1"), Ok(('a', "1")));
            assert_eq!(p("1a"), Ok(('1', "a")));
            assert!(p("!x").is_err());
        }
    
        #[test]
        fn test_whitespace() {
            let p = is_whitespace_char();
            assert_eq!(p(" x"), Ok((' ', "x")));
            assert_eq!(p("\tx"), Ok(('\t', "x")));
            assert!(p("x").is_err());
        }
    
        #[test]
        fn test_uppercase() {
            let p = is_uppercase();
            assert_eq!(p("Hello"), Ok(('H', "ello")));
            assert!(p("hello").is_err());
        }
    
        #[test]
        fn test_lowercase() {
            let p = is_lowercase();
            assert_eq!(p("hello"), Ok(('h', "ello")));
            assert!(p("Hello").is_err());
        }
    
        #[test]
        fn test_custom_predicate() {
            let hex = satisfy(|c| c.is_ascii_hexdigit(), "hex digit");
            assert_eq!(hex("ff"), Ok(('f', "f")));
            assert!(hex("zz").is_err());
        }
    
        #[test]
        fn test_satisfy_or_custom_error() {
            let p = satisfy_or(|c| c == '@', |c| format!("Expected '@', found '{}'", c));
            assert_eq!(p("@hello"), Ok(('@', "hello")));
            assert_eq!(p("hello"), Err("Expected '@', found 'h'".to_string()));
        }
    
        #[test]
        fn test_empty_input() {
            let p = is_digit();
            assert!(p("").is_err());
        }
    }

    Deep Comparison

    Comparison: Example 153 — Satisfy Parser

    Core satisfy

    OCaml:

    let satisfy (pred : char -> bool) (desc : string) : char parser = fun input ->
      match advance input with
      | Some (ch, rest) when pred ch -> Ok (ch, rest)
      | Some (ch, _) -> Error (Printf.sprintf "Character '%c' does not satisfy %s" ch desc)
      | None -> Error (Printf.sprintf "Expected %s, got EOF" desc)
    

    Rust:

    fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
    where
        F: Fn(char) -> bool + 'a,
    {
        let desc = desc.to_string();
        Box::new(move |input: &'a str| {
            match input.chars().next() {
                Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
                Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
                None => Err(format!("Expected {}, got EOF", desc)),
            }
        })
    }
    

    Building specific parsers

    OCaml:

    let is_digit = satisfy (fun c -> c >= '0' && c <= '9') "digit"
    let is_letter = satisfy (fun c ->
      (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) "letter"
    

    Rust:

    fn is_digit<'a>() -> Parser<'a, char> {
        satisfy(|c| c.is_ascii_digit(), "digit")
    }
    
    fn is_letter<'a>() -> Parser<'a, char> {
        satisfy(|c| c.is_ascii_alphabetic(), "letter")
    }
    

    Exercises

  • Build hex_digit() -> Parser<char> using satisfy and the predicate |c| c.is_ascii_hexdigit().
  • Write printable_char() -> Parser<char> that accepts any non-control, non-whitespace character.
  • Implement not_char(c: char) -> Parser<char> that accepts any character except c, built from satisfy.
  • Open Source Repos