ExamplesBy LevelBy TopicLearning Paths
040 Intermediate

040 — Parse a Dotstring Back to a Binary Tree

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "040 — Parse a Dotstring Back to a Binary Tree" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Parsing a dotstring back to a binary tree (OCaml 99 Problems, complement to #39) is the deserialization half of the round-trip. Key difference from OCaml: 1. **Error handling**: Rust's `Result<Tree<char>, String>` makes parse errors explicit in the type system. OCaml's `failwith` raises an exception — callers must use `try...with` to catch it. A functional OCaml style would return `option` or `result`.

Tutorial

The Problem

Parsing a dotstring back to a binary tree (OCaml 99 Problems, complement to #39) is the deserialization half of the round-trip. It is also a textbook example of recursive descent parsing: each call to the parse function consumes exactly one self-delimiting unit (either a . for leaf or a character followed by two recursive calls for a node).

Recursive descent parsers are the basis for most hand-written language parsers (JSON parsers, expression evaluators, configuration file readers). The self-delimiting property of dotstrings makes this one of the simplest examples of recursive descent — no lookahead, no backtracking, no ambiguity.

🎯 Learning Outcomes

  • • Write a recursive descent parser that consumes characters left to right
  • • Use a &mut usize position cursor to track how much input has been consumed
  • • Handle the single-character lookahead needed to distinguish . from node values
  • • Implement error handling for malformed dotstrings
  • • Connect this parser structure to JSON/XML parsers and formal language grammars
  • • Write a recursive descent parser that threads &mut usize position cursor through recursive calls
  • • Handle parse errors with Result<Tree<char>, String> — return descriptive error messages for malformed input
  • Code Example

    #![allow(clippy::all)]
    // Placeholder — pending conversion

    Key Differences

  • Error handling: Rust's Result<Tree<char>, String> makes parse errors explicit in the type system. OCaml's failwith raises an exception — callers must use try...with to catch it. A functional OCaml style would return option or result.
  • Position threading: Rust: &mut usize (single shared mutable reference). OCaml: return (result, new_pos) pairs (no mutation). Both are equivalent; OCaml's style is more testable.
  • Composability: The OCaml functional style composes naturally — you can build more complex parsers by sequencing parse calls. Rust with &mut usize also composes but requires managing the shared mutable state.
  • Parser combinators: Both approaches generalize to parser combinator libraries. Rust has nom, pest, winnow. OCaml has angstrom, mparser. The dotstring parser is the simplest case of what these libraries handle.
  • Stateful parsing: The parser consumes characters one by one, threading a Chars iterator through recursive calls. This is a simple recursive descent parser — the same pattern used in compiler front-ends.
  • **&mut Chars:** Passing chars: &mut Chars allows each recursive call to advance the iterator. The next() call returns Some(char) or None when the input is exhausted.
  • Round-trip invariant: parse(serialize(tree)) == tree for any tree. This is a strong property to verify with property-based testing.
  • Functional vs imperative threading: OCaml returns (tree, remaining) tuples for functional threading of parser state. Rust uses &mut usize position. Both thread state through recursion; only the syntax differs.
  • OCaml Approach

    OCaml's functional version returns (Tree<char>, int) pairs: let rec parse s pos = if pos >= String.length s then failwith "unexpected end" else let c = s.[pos] in if c = '.' then (Leaf, pos + 1) else let (l, p1) = parse s (pos + 1) in let (r, p2) = parse s p1 in (Node (c, l, r), p2). The position threads through all recursive calls as an explicit argument and return value.

    Full Source

    #![allow(clippy::all)]
    // Placeholder — pending conversion

    Deep Comparison

    OCaml vs Rust: Dotstring Parse

    Overview

    See the example.rs and example.ml files for detailed implementations.

    Key Differences

    AspectOCamlRust
    Type systemHindley-MilnerOwnership + traits
    MemoryGCZero-cost abstractions
    MutabilityExplicit refmut keyword
    Error handlingOption/ResultResult<T, E>

    See README.md for detailed comparison.

    Exercises

  • Robust parser: Add full error recovery: report the position of the error and the character that caused it. Return Result<Tree<char>, ParseError> where ParseError includes position and message.
  • Parser combinator: Rewrite the parser using a type Parser<T> = impl Fn(&[char], usize) -> Result<(T, usize), String>. Define map, and_then, and or combinators, then compose them to build the dotstring parser.
  • Extended format: Extend the parser to handle multi-character node values enclosed in brackets: [abc] for a node with value "abc". Modify the grammar and the parser together.
  • Error recovery: Modify the parser to collect all errors encountered and report all of them, rather than stopping at the first error.
  • Parser generalization: Implement a generic Parser<T> type as a newtype over impl Fn(&str) -> Result<(T, &str), String> and rewrite the tree parser using combinators like map, and_then, and or.
  • Open Source Repos