170 Advanced

Error Recovery

Functional Programming

Tutorial

The Problem

Production parsers must provide actionable error messages, not "parse error at offset 423." Users need to know what went wrong, where (line and column), and what was expected. Error recovery goes further: instead of stopping at the first error, the parser continues, collecting multiple errors. This requires tracking position (line/column from offset), enriching errors with expected-token information, and synchronizing on error tokens to resume parsing.

🎯 Learning Outcomes

• Understand position tracking (offset, line, column) in a parser

• Learn how to enrich error messages with "expected X, got Y" information

• See the panic-mode error recovery strategy: skip to a synchronization token

• Appreciate why error quality is a significant differentiator in production parsers

Code Example

#[derive(Debug, Clone)]
struct Position { offset: usize, line: usize, col: usize }

fn advance_pos(pos: &Position, c: char) -> Position {
    if c == '\n' {
        Position { offset: pos.offset + 1, line: pos.line + 1, col: 1 }
    } else {
        Position { offset: pos.offset + 1, line: pos.line, col: pos.col + 1 }
    }
}

type position = { offset: int; line: int; col: int }

let advance_pos pos c =
  if c = '\n' then { offset = pos.offset + 1; line = pos.line + 1; col = 1 }
  else { offset = pos.offset + 1; line = pos.line; col = pos.col + 1 }

Key Differences

Position threading: Rust passes Position as an explicit value; OCaml often uses a state monad or implicit position tracking in the input type.

Error continuation: Both languages' simple parsers stop at the first error; production parsers in both use panic-mode recovery or grammar-level error tokens.

Error type: Rust's ParseError is a structured value; OCaml's errors are typically strings or custom error types in the result type.

Generator support: Menhir provides automatic error recovery and localization; hand-written parsers in both languages must implement this manually.

OCaml Approach

OCaml's Menhir provides built-in error recovery via error tokens and recovery rules in the grammar. Hand-written OCaml parsers use angstrom's Consume.No and position tracking via pos combinators. angstrom reports byte offsets; converting to line/column requires scanning the input up to the offset. Error reporting in production OCaml parsers (like merlin) tracks position via a state monad.

Full Source

#![allow(clippy::all)]
// Example 170: Error Recovery
// Parser error messages with position info and expected tokens

type ParseResult<'a, T> = Result<(T, &'a str, Position), ParseError>;

#[derive(Debug, Clone, PartialEq)]
struct Position {
    offset: usize,
    line: usize,
    col: usize,
}

#[derive(Debug, Clone)]
struct ParseError {
    pos: Position,
    expected: Vec<String>,
    got: String,
}

impl std::fmt::Display for ParseError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(
            f,
            "Error at line {}, col {}: expected {}, got {}",
            self.pos.line,
            self.pos.col,
            self.expected.join(" or "),
            self.got
        )
    }
}

fn init_pos() -> Position {
    Position {
        offset: 0,
        line: 1,
        col: 1,
    }
}

fn advance_pos(pos: &Position, c: char) -> Position {
    if c == '\n' {
        Position {
            offset: pos.offset + 1,
            line: pos.line + 1,
            col: 1,
        }
    } else {
        Position {
            offset: pos.offset + 1,
            line: pos.line,
            col: pos.col + 1,
        }
    }
}

// ============================================================
// Approach 1: Satisfy with position tracking
// ============================================================

fn satisfy_pos<'a>(
    pred: impl Fn(char) -> bool,
    desc: &str,
    input: &'a str,
    pos: &Position,
) -> ParseResult<'a, char> {
    match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..], advance_pos(pos, c))),
        Some(c) => Err(ParseError {
            pos: pos.clone(),
            expected: vec![desc.to_string()],
            got: format!("'{}'", c),
        }),
        None => Err(ParseError {
            pos: pos.clone(),
            expected: vec![desc.to_string()],
            got: "EOF".to_string(),
        }),
    }
}

fn tag_pos<'a>(expected: &str, input: &'a str, pos: &Position) -> ParseResult<'a, &'a str> {
    if input.starts_with(expected) {
        let mut new_pos = pos.clone();
        for c in expected.chars() {
            new_pos = advance_pos(&new_pos, c);
        }
        Ok((&input[..expected.len()], &input[expected.len()..], new_pos))
    } else {
        let got_len = expected.len().min(input.len());
        Err(ParseError {
            pos: pos.clone(),
            expected: vec![format!("\"{}\"", expected)],
            got: format!("\"{}\"", &input[..got_len]),
        })
    }
}

// ============================================================
// Approach 2: Error merging for alternatives
// ============================================================

fn alt_pos<'a, T>(
    r1: ParseResult<'a, T>,
    r2: impl FnOnce() -> ParseResult<'a, T>,
) -> ParseResult<'a, T> {
    match r1 {
        Ok(r) => Ok(r),
        Err(e1) => match r2() {
            Ok(r) => Ok(r),
            Err(e2) => {
                if e1.pos.offset == e2.pos.offset {
                    let mut expected = e1.expected;
                    expected.extend(e2.expected);
                    Err(ParseError {
                        pos: e1.pos,
                        expected,
                        got: e1.got,
                    })
                } else if e1.pos.offset > e2.pos.offset {
                    Err(e1)
                } else {
                    Err(e2)
                }
            }
        },
    }
}

// ============================================================
// Approach 3: Error recovery — skip to sync point
// ============================================================

fn recover_until<'a, T>(
    sync: impl Fn(char) -> bool,
    result: ParseResult<'a, T>,
    input: &'a str,
    pos: &Position,
) -> Result<(Option<T>, &'a str, Position), ParseError> {
    match result {
        Ok((v, rest, new_pos)) => Ok((Some(v), rest, new_pos)),
        Err(_) => {
            let mut remaining = input;
            let mut current_pos = pos.clone();
            while let Some(c) = remaining.chars().next() {
                if sync(c) {
                    return Ok((None, remaining, current_pos));
                }
                remaining = &remaining[c.len_utf8()..];
                current_pos = advance_pos(&current_pos, c);
            }
            Ok((None, "", current_pos))
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_error_position() {
        let pos = init_pos();
        let err = satisfy_pos(|c| c.is_ascii_digit(), "digit", "abc", &pos).unwrap_err();
        assert_eq!(err.pos.line, 1);
        assert_eq!(err.pos.col, 1);
        assert_eq!(err.expected, vec!["digit"]);
        assert_eq!(err.got, "'a'");
    }

    #[test]
    fn test_error_merge() {
        let pos = init_pos();
        let r1 = satisfy_pos(|c| c.is_ascii_digit(), "digit", "!", &pos);
        let err = alt_pos(r1, || {
            satisfy_pos(|c| c.is_ascii_alphabetic(), "letter", "!", &pos)
        })
        .unwrap_err();
        assert_eq!(err.expected.len(), 2);
        assert!(err.expected.contains(&"digit".to_string()));
        assert!(err.expected.contains(&"letter".to_string()));
    }

    #[test]
    fn test_position_tracking() {
        let pos = init_pos();
        let (_, _, new_pos) = tag_pos("ab", "ab\ncd", &pos).unwrap();
        assert_eq!(new_pos.line, 1);
        assert_eq!(new_pos.col, 3);
    }

    #[test]
    fn test_newline_tracking() {
        let pos = init_pos();
        let (_, rest, new_pos) = tag_pos("ab\n", "ab\ncd", &pos).unwrap();
        assert_eq!(new_pos.line, 2);
        assert_eq!(new_pos.col, 1);
        assert_eq!(rest, "cd");
    }

    #[test]
    fn test_recovery() {
        let pos = init_pos();
        let bad = satisfy_pos(|c| c.is_ascii_digit(), "digit", "abc;rest", &pos);
        let (val, rest, _) = recover_until(|c| c == ';', bad, "abc;rest", &pos).unwrap();
        assert!(val.is_none());
        assert_eq!(rest, ";rest");
    }

    #[test]
    fn test_display_error() {
        let err = ParseError {
            pos: Position {
                offset: 5,
                line: 2,
                col: 3,
            },
            expected: vec!["digit".into(), "letter".into()],
            got: "'!'".into(),
        };
        assert_eq!(
            format!("{}", err),
            "Error at line 2, col 3: expected digit or letter, got '!'"
        );
    }
}

(* Example 170: Error Recovery *)
(* Parser error messages with position info and expected tokens *)

type position = { offset: int; line: int; col: int }

type parse_error = {
  pos: position;
  expected: string list;
  got: string;
}

type 'a parse_result = ('a * string * position, parse_error) result
type 'a parser = string -> position -> 'a parse_result

let advance_pos pos c =
  if c = '\n' then { offset = pos.offset + 1; line = pos.line + 1; col = 1 }
  else { offset = pos.offset + 1; line = pos.line; col = pos.col + 1 }

let init_pos = { offset = 0; line = 1; col = 1 }

let format_error err =
  Printf.sprintf "Error at line %d, col %d: expected %s, got %s"
    err.pos.line err.pos.col
    (String.concat " or " err.expected)
    err.got

(* Approach 1: Satisfy with position tracking *)
let satisfy pred desc : char parser = fun input pos ->
  if String.length input > 0 && pred input.[0] then
    let c = input.[0] in
    Ok (c, String.sub input 1 (String.length input - 1), advance_pos pos c)
  else
    let got = if String.length input > 0 then Printf.sprintf "'%c'" input.[0] else "EOF" in
    Error { pos; expected = [desc]; got }

let tag expected : string parser = fun input pos ->
  let len = String.length expected in
  if String.length input >= len && String.sub input 0 len = expected then
    let new_pos = ref pos in
    String.iter (fun c -> new_pos := advance_pos !new_pos c) expected;
    Ok (expected, String.sub input len (String.length input - len), !new_pos)
  else
    let got = if String.length input >= len then Printf.sprintf "\"%s\"" (String.sub input 0 len)
              else Printf.sprintf "\"%s\"" input in
    Error { pos; expected = [Printf.sprintf "\"%s\"" expected]; got }

(* Approach 2: Error merging for alternatives *)
let alt (p1 : 'a parser) (p2 : 'a parser) : 'a parser = fun input pos ->
  match p1 input pos with
  | Ok _ as r -> r
  | Error e1 ->
    match p2 input pos with
    | Ok _ as r -> r
    | Error e2 ->
      (* Merge errors: if at same position, combine expected lists *)
      if e1.pos.offset = e2.pos.offset then
        Error { pos = e1.pos; expected = e1.expected @ e2.expected; got = e1.got }
      else if e1.pos.offset > e2.pos.offset then Error e1
      else Error e2

(* Approach 3: Error recovery — skip to sync point *)
let recover_until (sync : char -> bool) (p : 'a parser) : 'a option parser = fun input pos ->
  match p input pos with
  | Ok (v, rest, new_pos) -> Ok (Some v, rest, new_pos)
  | Error _err ->
    (* Skip characters until we find a sync point *)
    let rec skip inp p =
      if String.length inp = 0 then Ok (None, "", p)
      else if sync inp.[0] then Ok (None, inp, p)
      else skip (String.sub inp 1 (String.length inp - 1)) (advance_pos p inp.[0])
    in
    skip input pos

(* Tests *)
let () =
  let digit = satisfy (fun c -> c >= '0' && c <= '9') "digit" in
  (match digit "abc" init_pos with
   | Error err ->
     assert (err.pos.line = 1 && err.pos.col = 1);
     assert (err.expected = ["digit"]);
     assert (err.got = "'a'")
   | _ -> failwith "Expected error");

  let digit_or_letter = alt digit
    (satisfy (fun c -> (c >= 'a' && c <= 'z')) "letter") in
  (match digit_or_letter "!" init_pos with
   | Error err ->
     assert (List.length err.expected = 2);
     assert (List.mem "digit" err.expected);
     assert (List.mem "letter" err.expected)
   | _ -> failwith "Expected merged error");

  (* Position tracking *)
  let pos = { offset = 0; line = 1; col = 1 } in
  (match tag "ab" "ab\ncd" pos with
   | Ok (_, _, new_pos) ->
     assert (new_pos.line = 1 && new_pos.col = 3)
   | _ -> failwith "Tag failed");

  print_endline "✓ All tests passed"

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_error_position() {
        let pos = init_pos();
        let err = satisfy_pos(|c| c.is_ascii_digit(), "digit", "abc", &pos).unwrap_err();
        assert_eq!(err.pos.line, 1);
        assert_eq!(err.pos.col, 1);
        assert_eq!(err.expected, vec!["digit"]);
        assert_eq!(err.got, "'a'");
    }

    #[test]
    fn test_error_merge() {
        let pos = init_pos();
        let r1 = satisfy_pos(|c| c.is_ascii_digit(), "digit", "!", &pos);
        let err = alt_pos(r1, || {
            satisfy_pos(|c| c.is_ascii_alphabetic(), "letter", "!", &pos)
        })
        .unwrap_err();
        assert_eq!(err.expected.len(), 2);
        assert!(err.expected.contains(&"digit".to_string()));
        assert!(err.expected.contains(&"letter".to_string()));
    }

    #[test]
    fn test_position_tracking() {
        let pos = init_pos();
        let (_, _, new_pos) = tag_pos("ab", "ab\ncd", &pos).unwrap();
        assert_eq!(new_pos.line, 1);
        assert_eq!(new_pos.col, 3);
    }

    #[test]
    fn test_newline_tracking() {
        let pos = init_pos();
        let (_, rest, new_pos) = tag_pos("ab\n", "ab\ncd", &pos).unwrap();
        assert_eq!(new_pos.line, 2);
        assert_eq!(new_pos.col, 1);
        assert_eq!(rest, "cd");
    }

    #[test]
    fn test_recovery() {
        let pos = init_pos();
        let bad = satisfy_pos(|c| c.is_ascii_digit(), "digit", "abc;rest", &pos);
        let (val, rest, _) = recover_until(|c| c == ';', bad, "abc;rest", &pos).unwrap();
        assert!(val.is_none());
        assert_eq!(rest, ";rest");
    }

    #[test]
    fn test_display_error() {
        let err = ParseError {
            pos: Position {
                offset: 5,
                line: 2,
                col: 3,
            },
            expected: vec!["digit".into(), "letter".into()],
            got: "'!'".into(),
        };
        assert_eq!(
            format!("{}", err),
            "Error at line 2, col 3: expected digit or letter, got '!'"
        );
    }
}

Deep Comparison

Comparison: Example 170 — Error Recovery

Position tracking

OCaml:

type position = { offset: int; line: int; col: int }

let advance_pos pos c =
  if c = '\n' then { offset = pos.offset + 1; line = pos.line + 1; col = 1 }
  else { offset = pos.offset + 1; line = pos.line; col = pos.col + 1 }

Rust:

#[derive(Debug, Clone)]
struct Position { offset: usize, line: usize, col: usize }

fn advance_pos(pos: &Position, c: char) -> Position {
    if c == '\n' {
        Position { offset: pos.offset + 1, line: pos.line + 1, col: 1 }
    } else {
        Position { offset: pos.offset + 1, line: pos.line, col: pos.col + 1 }
    }
}

Error merging

OCaml:

if e1.pos.offset = e2.pos.offset then
  Error { pos = e1.pos; expected = e1.expected @ e2.expected; got = e1.got }

Rust:

if e1.pos.offset == e2.pos.offset {
    let mut expected = e1.expected;
    expected.extend(e2.expected);
    Err(ParseError { pos: e1.pos, expected, got: e1.got })
}

Exercises

Add a collect_errors wrapper that runs a parser and returns all errors found, continuing past each one using skip-to-semicolon recovery.

Implement line/column tracking by scanning the input up to the error offset and counting newlines.

Write an expected combinator that wraps any parser and adds a human-readable "expected X" to its error message.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust