Digit Parser
Functional Programming
Tutorial
The Problem
Numbers are ubiquitous in data formats: configuration files, JSON, CSV, protocol messages. Parsing integers and floats correctly requires handling signs, leading zeros (JSON rejects them, so 007 is invalid, although a lone 0 may start a fraction such as 0.5), and overflow. Building a number parser from primitives demonstrates the full combinator pipeline: match sign, match digits, collect and convert, handle errors. Number parsers are among the most widely used parsers in real-world applications.
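As a rough illustration of the four pipeline steps, the sketch below parses a JSON-style integer by hand, without combinators. The name parse_json_int and its error messages are assumptions made for this example and are not part of the tutorial's source. It rejects leading zeros and uses checked arithmetic so that overflow surfaces as an Err.

```rust
/// Parse a JSON-style integer prefix: optional '-', then "0" or a nonzero
/// digit followed by more digits. Returns the value and the unconsumed rest.
/// `parse_json_int` is a hypothetical helper, not part of the tutorial's code.
fn parse_json_int(input: &str) -> Result<(i64, &str), String> {
    // Step 1: match an optional sign.
    let (negative, body) = match input.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, input),
    };

    // Step 2: match digits. ASCII digits are one byte each, so the count
    // doubles as a byte offset for split_at.
    let len = body.bytes().take_while(|b| b.is_ascii_digit()).count();
    if len == 0 {
        return Err("Expected digit".to_string());
    }
    let (digits, rest) = body.split_at(len);

    // JSON rule: a leading zero may not be followed by more digits.
    if len > 1 && digits.starts_with('0') {
        return Err(format!("Leading zero in {:?}", digits));
    }

    // Step 3: collect and convert. Accumulating as a negative number keeps
    // i64::MIN representable; checked ops report overflow instead of wrapping.
    let mut acc: i64 = 0;
    for b in digits.bytes() {
        let d = (b - b'0') as i64;
        acc = acc
            .checked_mul(10)
            .and_then(|a| a.checked_sub(d))
            .ok_or_else(|| format!("Integer overflow in {:?}", digits))?;
    }

    // Step 4: apply the sign. Negating i64::MIN would overflow, so check it.
    let value = if negative {
        acc
    } else {
        acc.checked_neg()
            .ok_or_else(|| format!("Integer overflow in {:?}", digits))?
    };
    Ok((value, rest))
}
```

Accumulating in the negative range is one possible design choice; it lets "-9223372036854775808" parse without a wider intermediate type.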
🎯 Learning Outcomes
many1 + map + flat_map combine for number parsing
Code Example
fn digit<'a>() -> Parser<'a, u32> {
    map(satisfy(|c| c.is_ascii_digit(), "digit"), |c| c as u32 - '0' as u32)
}
Key Differences
i64::from_str returns Err on overflow; OCaml's int_of_string raises Failure (an exception); Zarith in OCaml never overflows.
is_ascii_digit() handles ASCII 0-9; OCaml's c >= '0' && c <= '9' is equivalent; Unicode digit handling requires additional work in both.
Rust collects a Vec<char>, joins it into a String, then parses (three steps); OCaml similarly needs String.init or Buffer.t for the intermediate.
OCaml Approach
OCaml's standard library provides int_of_string and float_of_string. In angstrom:
let digit = satisfy (fun c -> c >= '0' && c <= '9')
let uint = many1 digit >>| (fun cs -> int_of_string (String.init (List.length cs) (List.nth cs)))
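With the Zarith library, OCaml gets arbitrary-precision integers that cannot overflow, whereas Rust's fixed-width integer types must check bounds explicitly. OCaml's native int is also fixed-width: its arithmetic wraps silently on overflow.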
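On the Rust side, explicit bounds checking could take either of two shapes, sketched below under the assumption that the digit characters have already been matched (for instance by many1). The helper names digits_via_parse and digits_via_checked_fold are hypothetical and do not appear in the tutorial's source.

```rust
// Hypothetical helpers for illustration; neither is part of the tutorial's code.

/// Strategy 1: join the matched digit characters into a String and let the
/// standard library convert it. `str::parse::<u64>` returns Err on overflow.
fn digits_via_parse(digits: &[char]) -> Result<u64, String> {
    let text: String = digits.iter().collect();
    text.parse::<u64>()
        .map_err(|e| format!("Invalid number {:?}: {}", text, e))
}

/// Strategy 2: fold over the digits with checked arithmetic, so overflow
/// becomes an Err rather than a panic (debug) or wraparound (release).
fn digits_via_checked_fold(digits: &[char]) -> Result<u64, String> {
    digits
        .iter()
        .try_fold(0u64, |acc, &c| -> Result<u64, String> {
            // to_digit(10) accepts only '0'..='9'.
            let d = c
                .to_digit(10)
                .ok_or_else(|| format!("Not a digit: {:?}", c))? as u64;
            acc.checked_mul(10)
                .and_then(|a| a.checked_add(d))
                .ok_or_else(|| "Integer overflow".to_string())
        })
}
```

The first strategy allocates an intermediate String; the second avoids the allocation at the cost of spelling out the arithmetic. Both return Err for "18446744073709551616", one past u64::MAX.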
Full Source
#![allow(clippy::all)]
// Example 161: Digit Parser
// Parse digits: single digit, multi-digit integer, positive/negative

type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
{
    let desc = desc.to_string();
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        _ => Err(format!("Expected {}", desc)),
    })
}

fn many1<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, mut rem) = p(input)?;
        let mut v = vec![first];
        while let Ok((val, r)) = p(rem) {
            v.push(val);
            rem = r;
        }
        Ok((v, rem))
    })
}

fn map<'a, A: 'a, B: 'a, F>(p: Parser<'a, A>, f: F) -> Parser<'a, B>
where
    F: Fn(A) -> B + 'a,
{
    Box::new(move |input: &'a str| {
        let (v, r) = p(input)?;
        Ok((f(v), r))
    })
}

fn opt<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Option<T>> {
    Box::new(move |input: &'a str| match p(input) {
        Ok((v, r)) => Ok((Some(v), r)),
        Err(_) => Ok((None, input)),
    })
}

// ============================================================
// Approach 1: Single digit → u32
// ============================================================
fn digit<'a>() -> Parser<'a, u32> {
    map(satisfy(|c| c.is_ascii_digit(), "digit"), |c| {
        c as u32 - '0' as u32
    })
}

// ============================================================
// Approach 2: Natural number (unsigned) → u64
// ============================================================
// Note: the fold uses unchecked arithmetic, so inputs with more digits than
// u64 can hold overflow (a panic in debug builds, wraparound in release).
fn natural<'a>() -> Parser<'a, u64> {
    map(many1(satisfy(|c| c.is_ascii_digit(), "digit")), |digits| {
        digits
            .iter()
            .fold(0u64, |acc, &d| acc * 10 + (d as u64 - '0' as u64))
    })
}

// ============================================================
// Approach 3: Signed integer → i64
// ============================================================
// Note: `n as i64` wraps for values above i64::MAX; no range check is done.
fn integer<'a>() -> Parser<'a, i64> {
    Box::new(|input: &'a str| {
        // An absent sign is not an error: opt yields None and consumes nothing.
        let (sign, rest) = opt(satisfy(|c| c == '+' || c == '-', "sign"))(input)?;
        let (n, rem) = natural()(rest)?;
        let value = match sign {
            Some('-') => -(n as i64),
            _ => n as i64,
        };
        Ok((value, rem))
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_digit() {
        assert_eq!(digit()("5rest"), Ok((5, "rest")));
    }

    #[test]
    fn test_digit_zero() {
        assert_eq!(digit()("0x"), Ok((0, "x")));
    }

    #[test]
    fn test_digit_fail() {
        assert!(digit()("abc").is_err());
    }

    #[test]
    fn test_natural() {
        assert_eq!(natural()("42rest"), Ok((42, "rest")));
    }

    #[test]
    fn test_natural_zero() {
        assert_eq!(natural()("0"), Ok((0, "")));
    }

    #[test]
    fn test_natural_large() {
        assert_eq!(natural()("123456"), Ok((123456, "")));
    }

    #[test]
    fn test_integer_positive() {
        assert_eq!(integer()("42"), Ok((42, "")));
    }

    #[test]
    fn test_integer_negative() {
        assert_eq!(integer()("-42"), Ok((-42, "")));
    }

    #[test]
    fn test_integer_plus() {
        assert_eq!(integer()("+42"), Ok((42, "")));
    }

    #[test]
    fn test_integer_zero() {
        assert_eq!(integer()("0"), Ok((0, "")));
    }

    #[test]
    fn test_integer_fail() {
        assert!(integer()("abc").is_err());
    }
}
Deep Comparison
Comparison: Example 161 — Digit Parser
Single digit
OCaml:
let digit : int parser =
  map (fun c -> Char.code c - Char.code '0')
    (satisfy (fun c -> c >= '0' && c <= '9') "digit")
Rust:
fn digit<'a>() -> Parser<'a, u32> {
    map(satisfy(|c| c.is_ascii_digit(), "digit"), |c| c as u32 - '0' as u32)
}
Natural number
OCaml:
let natural : int parser =
  map (fun digits -> List.fold_left (fun acc d -> acc * 10 + d) 0 digits)
    (many1 digit)
Rust:
fn natural<'a>() -> Parser<'a, u64> {
    map(
        many1(satisfy(|c| c.is_ascii_digit(), "digit")),
        |digits| digits.iter().fold(0u64, |acc, &d| acc * 10 + (d as u64 - '0' as u64)),
    )
}
Signed integer
OCaml:
let integer : int parser = fun input ->
  match opt (satisfy (fun c -> c = '+' || c = '-') "sign") input with
  | Ok (sign, rest) ->
    (match natural rest with
     | Ok (n, rem) ->
       let value = match sign with Some '-' -> -n | _ -> n in
       Ok (value, rem)
     | Error e -> Error e)
  | Error e -> Error e
Rust:
fn integer<'a>() -> Parser<'a, i64> {
    Box::new(|input: &'a str| {
        let (sign, rest) = opt(satisfy(|c| c == '+' || c == '-', "sign"))(input)?;
        let (n, rem) = natural()(rest)?;
        let value = match sign {
            Some('-') => -(n as i64),
            _ => n as i64,
        };
        Ok((value, rem))
    })
}
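The Rust integer parser above sequences the sign and the digits by hand with the ? operator. The learning outcomes mention flat_map, which the full source does not define, so the following is only a sketch of how such a combinator might look and how integer could be rewritten with it. The flat_map signature is an assumption; the other helpers are copied from the tutorial, including the unchecked arithmetic.

```rust
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
{
    let desc = desc.to_string();
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        _ => Err(format!("Expected {}", desc)),
    })
}

fn many1<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, mut rem) = p(input)?;
        let mut v = vec![first];
        while let Ok((val, r)) = p(rem) {
            v.push(val);
            rem = r;
        }
        Ok((v, rem))
    })
}

fn map<'a, A: 'a, B: 'a, F>(p: Parser<'a, A>, f: F) -> Parser<'a, B>
where
    F: Fn(A) -> B + 'a,
{
    Box::new(move |input: &'a str| {
        let (v, r) = p(input)?;
        Ok((f(v), r))
    })
}

fn opt<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Option<T>> {
    Box::new(move |input: &'a str| match p(input) {
        Ok((v, r)) => Ok((Some(v), r)),
        Err(_) => Ok((None, input)),
    })
}

/// Hypothetical combinator: run `p`, then use its result to build the parser
/// that consumes the remaining input.
fn flat_map<'a, A: 'a, B: 'a, F>(p: Parser<'a, A>, f: F) -> Parser<'a, B>
where
    F: Fn(A) -> Parser<'a, B> + 'a,
{
    Box::new(move |input: &'a str| {
        let (a, rest) = p(input)?;
        f(a)(rest)
    })
}

// Same unchecked fold as the tutorial's `natural`.
fn natural<'a>() -> Parser<'a, u64> {
    map(many1(satisfy(|c| c.is_ascii_digit(), "digit")), |digits| {
        digits
            .iter()
            .fold(0u64, |acc, &d| acc * 10 + (d as u64 - '0' as u64))
    })
}

/// `integer` expressed with flat_map: the optional sign decides how the
/// natural number that follows is mapped to an i64.
fn integer<'a>() -> Parser<'a, i64> {
    flat_map(opt(satisfy(|c| c == '+' || c == '-', "sign")), |sign| {
        map(natural(), move |n| match sign {
            Some('-') => -(n as i64),
            _ => n as i64,
        })
    })
}
```

In this form the second parser is rebuilt on every call, a small cost that a version tuned for speed might avoid.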
Exercises
Parse hexadecimal literals: "0x1F" → 31.
Implement bounded_int<const MIN: i64, const MAX: i64>() -> Parser<i64> that fails if the parsed value is out of range.
Parse binary literals: "0b1010" → 10.