Separated List
Tutorial
The Problem
Comma-separated values, semicolon-separated statements, pipe-separated fields: separated lists appear everywhere in text formats. separated_list0 and separated_list1 parse a sequence of items with a separator between each adjacent pair. Both stop cleanly after the last item, so strict formats get no trailing separator. separated_list0 also accepts an empty list, while separated_list1 requires at least one item. Building this from primitives requires care with the separator-before-next-item pattern: if a separator is consumed and no item follows, the parser must backtrack so that the separator stays in the remaining input.
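The pitfall described above can be made concrete with a small comparison. This is a minimal sketch, assuming plain functions instead of the boxed Parser type used later in this tutorial; digits, comma, naive_list, and careful_list are hypothetical names introduced only for illustration.

```rust
// Hypothetical helpers for illustration; not part of the tutorial's source.
type PResult<'a, T> = Result<(T, &'a str), String>;

/// Parse one or more ASCII digits.
fn digits(input: &str) -> PResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        Err("expected digit".to_string())
    } else {
        Ok((&input[..end], &input[end..]))
    }
}

/// Parse a single comma.
fn comma(input: &str) -> PResult<'_, char> {
    match input.strip_prefix(',') {
        Some(rest) => Ok((',', rest)),
        None => Err("expected ','".to_string()),
    }
}

/// Naive loop: commits to the separator before knowing an item follows.
fn naive_list(input: &str) -> PResult<'_, Vec<&str>> {
    let mut items = Vec::new();
    let mut rest = input;
    while let Ok((item, after_item)) = digits(rest) {
        items.push(item);
        rest = after_item;
        match comma(rest) {
            Ok((_, after_sep)) => rest = after_sep, // separator is gone for good
            Err(_) => break,
        }
    }
    Ok((items, rest))
}

/// Careful loop: only advances past the separator once the next item parsed.
fn careful_list(input: &str) -> PResult<'_, Vec<&str>> {
    let mut items = Vec::new();
    let (first, mut rest) = match digits(input) {
        Ok(r) => r,
        Err(_) => return Ok((items, input)),
    };
    items.push(first);
    while let Ok((_, after_sep)) = comma(rest) {
        match digits(after_sep) {
            Ok((item, after_item)) => {
                items.push(item);
                rest = after_item;
            }
            Err(_) => break, // leave `rest` pointing at the separator
        }
    }
    Ok((items, rest))
}
```

On the input "1,2,x" both functions collect "1" and "2", but naive_list leaves "x" as the remaining input while careful_list leaves ",x". The second result is what a strict format needs, because whatever parser runs next can still see the stray separator and report it.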
🎯 Learning Outcomes
- Implementing the separated_list0 and separated_list1 combinators
- The list pattern: one item first, then (sep item)* is the tail
- How the pattern relates to many0 and pair
Code Example
fn separated_list0<'a, T: 'a, S: 'a>(
    sep: Parser<'a, S>, item: Parser<'a, T>,
) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, mut remaining) = match item(input) {
            Err(_) => return Ok((vec![], input)),
            Ok(r) => r,
        };
        let mut results = vec![first];
        loop {
            let after_sep = match sep(remaining) {
                Err(_) => break,
                Ok((_, r)) => r,
            };
            match item(after_sep) {
                Ok((val, rest)) => { results.push(val); remaining = rest; }
                Err(_) => break,
            }
        }
        Ok((results, remaining))
    })
}
Key Differences
- Trailing separators: neither separated_list0 nor OCaml's sep_by variant allows trailing separators; a separate opt(sep) must be added for formats like Rust's trailing commas in arrays.
- Return type: Rust returns Vec<T>; OCaml returns 'a list. Both are ordered sequences.
- Whitespace: wrap parsers in ws_wrap or use lexeme to skip whitespace; these examples show the pure combinator without whitespace.
OCaml Approach
Angstrom provides sep_by : 'a t -> 'b t -> 'b list t and sep_by1:
let csv_row = sep_by (char ',') field_parser
OCaml's functional style makes the separator combinator more concise. Conceptually, sep_by parses a first item and then many (sep *> item), which is structurally the same as the Rust loop but expressed with >>= and infix operators.
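The same "first item, then many (sep *> item)" shape can also be written compositionally in Rust. The following is a minimal sketch, assuming closure-based parsers rather than the boxed Parser type used in this tutorial; many0, preceded, sep_list0, digits, and comma are hypothetical helpers written for illustration, not an existing library API.

```rust
// Hypothetical closure-based combinators; names are illustrative only.
type PResult<'a, T> = Result<(T, &'a str), String>;

/// Zero or more repetitions of `p`; stops at the first failure.
fn many0<'a, T>(
    p: impl Fn(&'a str) -> PResult<'a, T>,
) -> impl Fn(&'a str) -> PResult<'a, Vec<T>> {
    move |input: &'a str| {
        let mut out = Vec::new();
        let mut remaining = input;
        while let Ok((value, rest)) = p(remaining) {
            if rest.len() == remaining.len() {
                break; // no progress: stop rather than loop forever
            }
            out.push(value);
            remaining = rest;
        }
        Ok((out, remaining))
    }
}

/// Run `first`, then `second`, keeping only the second result (like `sep *> item`).
fn preceded<'a, A, B>(
    first: impl Fn(&'a str) -> PResult<'a, A>,
    second: impl Fn(&'a str) -> PResult<'a, B>,
) -> impl Fn(&'a str) -> PResult<'a, B> {
    move |input: &'a str| {
        let (_, rest) = first(input)?;
        second(rest)
    }
}

/// First item, then many0(sep *> item) for the tail.
fn sep_list0<'a, T, S>(
    sep: impl Fn(&'a str) -> PResult<'a, S>,
    item: impl Fn(&'a str) -> PResult<'a, T>,
) -> impl Fn(&'a str) -> PResult<'a, Vec<T>> {
    move |input: &'a str| {
        let (first, rest) = match item(input) {
            Ok(r) => r,
            Err(_) => return Ok((Vec::new(), input)),
        };
        // If `item` fails after `sep`, the whole pair fails and many0
        // keeps the position from before the separator.
        let tail = many0(preceded(&sep, &item));
        let (mut more, rest) = tail(rest)?;
        let mut items = vec![first];
        items.append(&mut more);
        Ok((items, rest))
    }
}

/// Parse one or more ASCII digits.
fn digits(input: &str) -> PResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        Err("expected digit".to_string())
    } else {
        Ok((&input[..end], &input[end..]))
    }
}

/// Parse a single comma.
fn comma(input: &str) -> PResult<'_, char> {
    match input.strip_prefix(',') {
        Some(rest) => Ok((',', rest)),
        None => Err("expected ','".to_string()),
    }
}
```

In this formulation the backtracking is not written out explicitly. Because preceded fails as a unit when the item is missing, many0 simply stops at the position before the dangling separator, which is the same behavior the hand-written loop achieves with its break.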
Full Source
#![allow(clippy::all)]
// Example 166: Separated List
// separated_list0, separated_list1: comma-separated values

type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
    F: Fn(char) -> bool + 'a,
{
    let desc = desc.to_string();
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        _ => Err(format!("Expected {}", desc)),
    })
}

fn many1<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, mut rem) = p(input)?;
        let mut v = vec![first];
        while let Ok((val, r)) = p(rem) {
            v.push(val);
            rem = r;
        }
        Ok((v, rem))
    })
}

// ============================================================
// Approach 1: separated_list0 (zero or more items)
// ============================================================
fn separated_list0<'a, T: 'a, S: 'a>(
    sep: Parser<'a, S>,
    item: Parser<'a, T>,
) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, mut remaining) = match item(input) {
            Err(_) => return Ok((vec![], input)),
            Ok(r) => r,
        };
        let mut results = vec![first];
        loop {
            let after_sep = match sep(remaining) {
                Err(_) => break,
                Ok((_, r)) => r,
            };
            match item(after_sep) {
                Ok((val, rest)) => {
                    results.push(val);
                    remaining = rest;
                }
                Err(_) => break, // backtrack: don't consume trailing sep
            }
        }
        Ok((results, remaining))
    })
}

// ============================================================
// Approach 2: separated_list1 (one or more items)
// ============================================================
fn separated_list1<'a, T: 'a, S: 'a>(
    sep: Parser<'a, S>,
    item: Parser<'a, T>,
) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (results, rest) = separated_list0_inner(&sep, &item, input)?;
        if results.is_empty() {
            Err("Expected at least one item".to_string())
        } else {
            Ok((results, rest))
        }
    })
}

fn separated_list0_inner<'a, T, S>(
    sep: &(dyn Fn(&'a str) -> ParseResult<'a, S>),
    item: &(dyn Fn(&'a str) -> ParseResult<'a, T>),
    input: &'a str,
) -> ParseResult<'a, Vec<T>> {
    let (first, mut remaining) = match item(input) {
        Err(_) => return Ok((vec![], input)),
        Ok(r) => r,
    };
    let mut results = vec![first];
    loop {
        let after_sep = match sep(remaining) {
            Err(_) => break,
            Ok((_, r)) => r,
        };
        match item(after_sep) {
            Ok((val, rest)) => {
                results.push(val);
                remaining = rest;
            }
            Err(_) => break,
        }
    }
    Ok((results, remaining))
}

// ============================================================
// Approach 3: With trailing separator allowed
// ============================================================
fn separated_list_trailing<'a, T: 'a, S: 'a>(
    sep: Parser<'a, S>,
    item: Parser<'a, T>,
) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, mut remaining) = match item(input) {
            Err(_) => return Ok((vec![], input)),
            Ok(r) => r,
        };
        let mut results = vec![first];
        loop {
            let after_sep = match sep(remaining) {
                Err(_) => break,
                Ok((_, r)) => r,
            };
            match item(after_sep) {
                Ok((val, rest)) => {
                    results.push(val);
                    remaining = rest;
                }
                Err(_) => {
                    // consume trailing sep
                    remaining = after_sep;
                    break;
                }
            }
        }
        Ok((results, remaining))
    })
}

/// Comma with optional whitespace
fn comma<'a>() -> Parser<'a, char> {
    Box::new(|input: &'a str| {
        let trimmed = input.trim_start();
        match trimmed.chars().next() {
            Some(',') => Ok((',', trimmed[1..].trim_start())),
            _ => Err("Expected ','".to_string()),
        }
    })
}

/// Digit string
fn digit_str<'a>() -> Parser<'a, String> {
    Box::new(|input: &'a str| {
        let p = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
        let (chars, rest) = p(input)?;
        Ok((chars.into_iter().collect(), rest))
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_sep_list0_multiple() {
        let p = separated_list0(comma(), digit_str());
        let (v, r) = p("1, 2, 3").unwrap();
        assert_eq!(v, vec!["1", "2", "3"]);
        assert_eq!(r, "");
    }

    #[test]
    fn test_sep_list0_empty() {
        let p = separated_list0(comma(), digit_str());
        let (v, _) = p("").unwrap();
        assert!(v.is_empty());
    }

    #[test]
    fn test_sep_list0_single() {
        let p = separated_list0(comma(), digit_str());
        let (v, _) = p("42").unwrap();
        assert_eq!(v, vec!["42"]);
    }

    #[test]
    fn test_sep_list1_success() {
        let p = separated_list1(comma(), digit_str());
        let (v, _) = p("1, 2").unwrap();
        assert_eq!(v, vec!["1", "2"]);
    }

    #[test]
    fn test_sep_list1_empty_fails() {
        let p = separated_list1(comma(), digit_str());
        assert!(p("").is_err());
    }

    #[test]
    fn test_trailing_sep() {
        let p = separated_list_trailing(comma(), digit_str());
        let (v, rest) = p("1, 2, ").unwrap();
        assert_eq!(v, vec!["1", "2"]);
        assert_eq!(rest, "");
    }

    #[test]
    fn test_no_trailing() {
        let p = separated_list0(comma(), digit_str());
        // Should not consume trailing comma
        let (v, rest) = p("1, 2, abc").unwrap();
        assert_eq!(v, vec!["1", "2"]);
        assert_eq!(rest, ", abc"); // comma before abc not consumed (backtrack)
    }
}
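Approach 3 above hand-rolls the trailing-separator case inside the loop. Key Differences mentions the alternative of following a strict list with opt(sep). A minimal sketch of that composition is given here, assuming plain functions instead of the boxed Parser type; opt, strict_list, list_with_trailing, digits, and comma are hypothetical names used only for illustration.

```rust
// Hypothetical helpers for illustration; not part of the tutorial's source.
type PResult<'a, T> = Result<(T, &'a str), String>;

/// Parse one or more ASCII digits.
fn digits(input: &str) -> PResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        Err("expected digit".to_string())
    } else {
        Ok((&input[..end], &input[end..]))
    }
}

/// Parse a single comma.
fn comma(input: &str) -> PResult<'_, char> {
    match input.strip_prefix(',') {
        Some(rest) => Ok((',', rest)),
        None => Err("expected ','".to_string()),
    }
}

/// Strict list: never consumes a separator that is not followed by an item.
fn strict_list(input: &str) -> PResult<'_, Vec<&str>> {
    let mut items = Vec::new();
    let (first, mut rest) = match digits(input) {
        Ok(r) => r,
        Err(_) => return Ok((items, input)),
    };
    items.push(first);
    while let Ok((_, after_sep)) = comma(rest) {
        match digits(after_sep) {
            Ok((item, after_item)) => {
                items.push(item);
                rest = after_item;
            }
            Err(_) => break, // separator stays in `rest`
        }
    }
    Ok((items, rest))
}

/// Make a parser optional: failure becomes `None` and consumes nothing.
fn opt<'a, T>(
    p: impl Fn(&'a str) -> PResult<'a, T>,
) -> impl Fn(&'a str) -> PResult<'a, Option<T>> {
    move |input: &'a str| match p(input) {
        Ok((value, rest)) => Ok((Some(value), rest)),
        Err(_) => Ok((None, input)),
    }
}

/// Strict list followed by an optional trailing separator.
fn list_with_trailing(input: &str) -> PResult<'_, Vec<&str>> {
    let (items, rest) = strict_list(input)?;
    let (_, rest) = opt(comma)(rest)?;
    Ok((items, rest))
}
```

One difference from separated_list_trailing is worth noting: this sketch also consumes a lone "," when the list is empty, whereas Approach 3 returns early and leaves it in place. If that is unwanted, the opt(comma) step can be skipped when items is empty.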
Deep Comparison
Comparison: Example 166 — Separated List
separated_list0
OCaml:
let separated_list0 (sep : 'b parser) (item : 'a parser) : 'a list parser =
  fun input ->
    match item input with
    | Error _ -> Ok ([], input)
    | Ok (first, rest) ->
      let rec go acc remaining =
        match sep remaining with
        | Error _ -> Ok (List.rev acc, remaining)
        | Ok (_, after_sep) ->
          match item after_sep with
          | Error _ -> Ok (List.rev acc, remaining)
          | Ok (v, rest') -> go (v :: acc) rest'
      in
      go [first] rest
Rust:
fn separated_list0<'a, T: 'a, S: 'a>(
    sep: Parser<'a, S>, item: Parser<'a, T>,
) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, mut remaining) = match item(input) {
            Err(_) => return Ok((vec![], input)),
            Ok(r) => r,
        };
        let mut results = vec![first];
        loop {
            let after_sep = match sep(remaining) {
                Err(_) => break,
                Ok((_, r)) => r,
            };
            match item(after_sep) {
                Ok((val, rest)) => { results.push(val); remaining = rest; }
                Err(_) => break,
            }
        }
        Ok((results, remaining))
    })
}
Exercises
1. Parse "(a, b, c)" → vec!["a", "b", "c"] using delimited + separated_list0.
2. Parse a list with a trailing separator: "1, 2, 3," → vec![1, 2, 3].
3. Parse nested lists: "[[1,2],[3,4,5]]" → vec![vec![1,2], vec![3,4,5]] using recursive separated list parsers.