166 Advanced

Separated List

Functional Programming

Tutorial

The Problem

Comma-separated values, semicolon-separated statements, pipe-separated fields: separated lists appear everywhere in text formats. separated_list0 and separated_list1 parse a sequence of items with a separator between each adjacent pair. separated_list0 accepts an empty list; separated_list1 requires at least one item. In strict formats the last item is not followed by a separator, so a combinator built from primitives has to try the separator and the next item together: if the item fails, the list has ended and the separator must be left unconsumed for whatever follows.

🎯 Learning Outcomes

  • Implement separated_list0 and separated_list1 combinators
  • Understand the interleave pattern: item, (sep item)*, where (sep item)* is the tail
  • Learn how separated list parsing relates to many0 and pair
  • See how separated lists form the basis for CSV, function argument lists, and array literals
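The interleave pattern in the outcomes above can be written almost literally as "one item, then many0 of (sep item)". The following is a minimal sketch under assumptions, not the tutorial's implementation: many0_with, ch, digit, and separated_list1_via_many0 are hypothetical names introduced here, and the Parser alias mirrors the one in the full source.

```rust
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

/// Hypothetical helper: match exactly one given character.
fn ch<'a>(expected: char) -> Parser<'a, char> {
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if c == expected => Ok((c, &input[c.len_utf8()..])),
        _ => Err(format!("Expected '{}'", expected)),
    })
}

/// Hypothetical helper: match one ASCII digit (always one byte long).
fn digit<'a>() -> Parser<'a, char> {
    Box::new(|input: &'a str| match input.chars().next() {
        Some(c) if c.is_ascii_digit() => Ok((c, &input[1..])),
        _ => Err("Expected digit".to_string()),
    })
}

/// Hypothetical many0: apply `p` until it fails, keeping the input
/// as it was before the failed attempt. Never returns Err.
fn many0_with<'a, T>(
    p: impl Fn(&'a str) -> ParseResult<'a, T>,
    mut input: &'a str,
) -> ParseResult<'a, Vec<T>> {
    let mut out = Vec::new();
    while let Ok((val, rest)) = p(input) {
        out.push(val);
        input = rest;
    }
    Ok((out, input))
}

/// item (sep item)*: the tail is many0 over "separator, then item".
fn separated_list1_via_many0<'a, T: 'a, S: 'a>(
    sep: Parser<'a, S>,
    item: Parser<'a, T>,
) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, rest) = item(input)?;
        // One tail step: the separator only counts if an item follows it.
        let sep_then_item = |i: &'a str| -> ParseResult<'a, T> {
            let (_, after_sep) = sep(i)?;
            item(after_sep)
        };
        let (tail, rest) = many0_with(sep_then_item, rest)?;
        let mut results = vec![first];
        results.extend(tail);
        Ok((results, rest))
    })
}
```

Because many0_with discards a failed attempt as a whole, a separator that is not followed by an item stays in the remaining input, which is the same backtracking behaviour the hand-written loop provides.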
Code Example

    fn separated_list0<'a, T: 'a, S: 'a>(
        sep: Parser<'a, S>, item: Parser<'a, T>,
    ) -> Parser<'a, Vec<T>> {
        Box::new(move |input: &'a str| {
            let (first, mut remaining) = match item(input) {
                Err(_) => return Ok((vec![], input)),
                Ok(r) => r,
            };
            let mut results = vec![first];
            loop {
                let after_sep = match sep(remaining) {
                    Err(_) => break,
                    Ok((_, r)) => r,
                };
                match item(after_sep) {
                    Ok((val, rest)) => { results.push(val); remaining = rest; }
                    Err(_) => break,
                }
            }
            Ok((results, remaining))
        })
    }

    Key Differences

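  • Separator consumption: Both implementations treat a separator as the prefix of the item that follows it. The separator is committed only when that item parses; otherwise the parser stops with the separator still in the remaining input, so a later error points at the separator rather than past it.
  • Trailing separator: Neither separated_list0/separated_list1 nor Angstrom's sep_by/sep_by1 accepts a trailing separator; formats such as Rust's trailing commas in arrays need an extra opt(sep) after the list, or a variant like separated_list_trailing in the full source.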
  • Return type: Rust returns Vec<T>; OCaml returns 'a list — both are ordered sequences.
  • Whitespace around separators: Production parsers wrap separators with ws_wrap or use lexeme; these examples show the pure combinator without whitespace.
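The whitespace point above can be illustrated with a small wrapper. This is a sketch under assumptions: the name ws_wrap comes from the text, but its signature and body here are guesses at one reasonable shape, and ch is a hypothetical helper.

```rust
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

/// Hypothetical helper: match exactly one given character,
/// with no whitespace handling of its own.
fn ch<'a>(expected: char) -> Parser<'a, char> {
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if c == expected => Ok((c, &input[c.len_utf8()..])),
        _ => Err(format!("Expected '{}'", expected)),
    })
}

/// Assumed shape of ws_wrap: skip whitespace before and after `p`.
/// On failure the caller still holds its original input, so nothing
/// is consumed.
fn ws_wrap<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, T> {
    Box::new(move |input: &'a str| {
        let (val, rest) = p(input.trim_start())?;
        Ok((val, rest.trim_start()))
    })
}
```

With this, ws_wrap(ch(',')) behaves like the comma() helper in the full source while keeping the character parser and the whitespace policy separate.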
OCaml Approach

    Angstrom provides sep_by : 'a t -> 'b t -> 'b list t and sep_by1:

    let csv_row = sep_by (char ',') field_parser
    

    OCaml's functional style makes the separator combinator more concise. The implementation uses many (sep *> item) after the first item — structurally identical to Rust's approach but expressed with >>= and infix operators.

    Full Source

    #![allow(clippy::all)]
    // Example 166: Separated List
    // separated_list0, separated_list1: comma-separated values
    
    type ParseResult<'a, T> = Result<(T, &'a str), String>;
    type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;
    
    fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
    where
        F: Fn(char) -> bool + 'a,
    {
        let desc = desc.to_string();
        Box::new(move |input: &'a str| match input.chars().next() {
            Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
            _ => Err(format!("Expected {}", desc)),
        })
    }
    
    fn many1<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Vec<T>> {
        Box::new(move |input: &'a str| {
            let (first, mut rem) = p(input)?;
            let mut v = vec![first];
            while let Ok((val, r)) = p(rem) {
                v.push(val);
                rem = r;
            }
            Ok((v, rem))
        })
    }
    
    // ============================================================
    // Approach 1: separated_list0 — zero or more items
    // ============================================================
    
    fn separated_list0<'a, T: 'a, S: 'a>(
        sep: Parser<'a, S>,
        item: Parser<'a, T>,
    ) -> Parser<'a, Vec<T>> {
        Box::new(move |input: &'a str| {
            let (first, mut remaining) = match item(input) {
                Err(_) => return Ok((vec![], input)),
                Ok(r) => r,
            };
            let mut results = vec![first];
            loop {
                let after_sep = match sep(remaining) {
                    Err(_) => break,
                    Ok((_, r)) => r,
                };
                match item(after_sep) {
                    Ok((val, rest)) => {
                        results.push(val);
                        remaining = rest;
                    }
                    Err(_) => break, // backtrack: don't consume trailing sep
                }
            }
            Ok((results, remaining))
        })
    }
    
    // ============================================================
    // Approach 2: separated_list1 — one or more
    // ============================================================
    
    fn separated_list1<'a, T: 'a, S: 'a>(
        sep: Parser<'a, S>,
        item: Parser<'a, T>,
    ) -> Parser<'a, Vec<T>> {
        Box::new(move |input: &'a str| {
            let (results, rest) = separated_list0_inner(&sep, &item, input)?;
            if results.is_empty() {
                Err("Expected at least one item".to_string())
            } else {
                Ok((results, rest))
            }
        })
    }
    
    fn separated_list0_inner<'a, T, S>(
        sep: &(dyn Fn(&'a str) -> ParseResult<'a, S>),
        item: &(dyn Fn(&'a str) -> ParseResult<'a, T>),
        input: &'a str,
    ) -> ParseResult<'a, Vec<T>> {
        let (first, mut remaining) = match item(input) {
            Err(_) => return Ok((vec![], input)),
            Ok(r) => r,
        };
        let mut results = vec![first];
        loop {
            let after_sep = match sep(remaining) {
                Err(_) => break,
                Ok((_, r)) => r,
            };
            match item(after_sep) {
                Ok((val, rest)) => {
                    results.push(val);
                    remaining = rest;
                }
                Err(_) => break,
            }
        }
        Ok((results, remaining))
    }
    
    // ============================================================
    // Approach 3: With trailing separator allowed
    // ============================================================
    
    fn separated_list_trailing<'a, T: 'a, S: 'a>(
        sep: Parser<'a, S>,
        item: Parser<'a, T>,
    ) -> Parser<'a, Vec<T>> {
        Box::new(move |input: &'a str| {
            let (first, mut remaining) = match item(input) {
                Err(_) => return Ok((vec![], input)),
                Ok(r) => r,
            };
            let mut results = vec![first];
            loop {
                let after_sep = match sep(remaining) {
                    Err(_) => break,
                    Ok((_, r)) => r,
                };
                match item(after_sep) {
                    Ok((val, rest)) => {
                        results.push(val);
                        remaining = rest;
                    }
                    Err(_) => {
                        remaining = after_sep;
                        break;
                    } // consume trailing sep
                }
            }
            Ok((results, remaining))
        })
    }
    
    /// Comma with optional whitespace
    fn comma<'a>() -> Parser<'a, char> {
        Box::new(|input: &'a str| {
            let trimmed = input.trim_start();
            match trimmed.chars().next() {
                Some(',') => Ok((',', trimmed[1..].trim_start())),
                _ => Err("Expected ','".to_string()),
            }
        })
    }
    
    /// Digit string
    fn digit_str<'a>() -> Parser<'a, String> {
        Box::new(|input: &'a str| {
            let p = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
            let (chars, rest) = p(input)?;
            Ok((chars.into_iter().collect(), rest))
        })
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_sep_list0_multiple() {
            let p = separated_list0(comma(), digit_str());
            let (v, r) = p("1, 2, 3").unwrap();
            assert_eq!(v, vec!["1", "2", "3"]);
            assert_eq!(r, "");
        }
    
        #[test]
        fn test_sep_list0_empty() {
            let p = separated_list0(comma(), digit_str());
            let (v, _) = p("").unwrap();
            assert!(v.is_empty());
        }
    
        #[test]
        fn test_sep_list0_single() {
            let p = separated_list0(comma(), digit_str());
            let (v, _) = p("42").unwrap();
            assert_eq!(v, vec!["42"]);
        }
    
        #[test]
        fn test_sep_list1_success() {
            let p = separated_list1(comma(), digit_str());
            let (v, _) = p("1, 2").unwrap();
            assert_eq!(v, vec!["1", "2"]);
        }
    
        #[test]
        fn test_sep_list1_empty_fails() {
            let p = separated_list1(comma(), digit_str());
            assert!(p("").is_err());
        }
    
        #[test]
        fn test_trailing_sep() {
            let p = separated_list_trailing(comma(), digit_str());
            let (v, rest) = p("1, 2, ").unwrap();
            assert_eq!(v, vec!["1", "2"]);
            assert_eq!(rest, "");
        }
    
        #[test]
        fn test_no_trailing() {
            let p = separated_list0(comma(), digit_str());
            // Should not consume trailing comma
            let (v, rest) = p("1, 2, abc").unwrap();
            assert_eq!(v, vec!["1", "2"]);
            assert_eq!(rest, ", abc"); // comma before abc not consumed (backtrack)
        }
    }

    Deep Comparison

    Comparison: Example 166 — Separated List

    separated_list0

    OCaml:

    let separated_list0 (sep : 'b parser) (item : 'a parser) : 'a list parser = fun input ->
      match item input with
      | Error _ -> Ok ([], input)
      | Ok (first, rest) ->
        let rec go acc remaining =
          match sep remaining with
          | Error _ -> Ok (List.rev acc, remaining)
          | Ok (_, after_sep) ->
            match item after_sep with
            | Error _ -> Ok (List.rev acc, remaining)
            | Ok (v, rest') -> go (v :: acc) rest'
        in go [first] rest
    

    Rust:

    fn separated_list0<'a, T: 'a, S: 'a>(
        sep: Parser<'a, S>, item: Parser<'a, T>,
    ) -> Parser<'a, Vec<T>> {
        Box::new(move |input: &'a str| {
            let (first, mut remaining) = match item(input) {
                Err(_) => return Ok((vec![], input)),
                Ok(r) => r,
            };
            let mut results = vec![first];
            loop {
                let after_sep = match sep(remaining) {
                    Err(_) => break,
                    Ok((_, r)) => r,
                };
                match item(after_sep) {
                    Ok((val, rest)) => { results.push(val); remaining = rest; }
                    Err(_) => break,
                }
            }
            Ok((results, remaining))
        })
    }
    

    Exercises

  • Parse a function argument list: "(a, b, c)" → vec!["a", "b", "c"] using delimited + separated_list0.
  • Add support for trailing separators: "1, 2, 3," → vec![1, 2, 3].
  • Parse nested lists: "[[1,2],[3,4,5]]" → vec![vec![1,2], vec![3,4,5]] using recursive separated list parsers.
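As a possible starting point for the first exercise, here is one way a delimited combinator might look in the same Parser style. It is a sketch only: delimited is not defined in the full source, so its signature here is an assumption, and ch is a hypothetical helper. Combining it with separated_list0 is left to the exercise.

```rust
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

/// Hypothetical helper: match exactly one given character.
fn ch<'a>(expected: char) -> Parser<'a, char> {
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if c == expected => Ok((c, &input[c.len_utf8()..])),
        _ => Err(format!("Expected '{}'", expected)),
    })
}

/// Assumed shape of delimited: run `open`, `inner`, `close` in order
/// and keep only the result of `inner`. Any failure fails the whole parser.
fn delimited<'a, O: 'a, T: 'a, C: 'a>(
    open: Parser<'a, O>,
    inner: Parser<'a, T>,
    close: Parser<'a, C>,
) -> Parser<'a, T> {
    Box::new(move |input: &'a str| {
        let (_, rest) = open(input)?;
        let (val, rest) = inner(rest)?;
        let (_, rest) = close(rest)?;
        Ok((val, rest))
    })
}
```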