040 — Parse a Dotstring Back to a Binary Tree
Tutorial Video
Text description (accessibility)
This video demonstrates the "040 — Parse a Dotstring Back to a Binary Tree" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Parsing a dotstring back to a binary tree (OCaml 99 Problems, complement to #39) is the deserialization half of the round-trip. Key difference from OCaml: 1. **Error handling**: Rust's `Result<Tree<char>, String>` makes parse errors explicit in the type system. OCaml's `failwith` raises an exception — callers must use `try...with` to catch it. A functional OCaml style would return `option` or `result`.
Tutorial
The Problem
Parsing a dotstring back to a binary tree (OCaml 99 Problems, complement to #39) is the deserialization half of the round-trip. It is also a textbook example of recursive descent parsing: each call to the parse function consumes exactly one self-delimiting unit (either a . for leaf or a character followed by two recursive calls for a node).
Recursive descent parsers are the basis for most hand-written language parsers (JSON parsers, expression evaluators, configuration file readers). The self-delimiting property of dotstrings makes this one of the simplest examples of recursive descent — no lookahead, no backtracking, no ambiguity.
🎯 Learning Outcomes
&mut usize position cursor to track how much input has been consumed. from node values&mut usize position cursor through recursive callsResult<Tree<char>, String> — return descriptive error messages for malformed inputCode Example
#![allow(clippy::all)]
// Placeholder — pending conversionKey Differences
Result<Tree<char>, String> makes parse errors explicit in the type system. OCaml's failwith raises an exception — callers must use try...with to catch it. A functional OCaml style would return option or result.&mut usize (single shared mutable reference). OCaml: return (result, new_pos) pairs (no mutation). Both are equivalent; OCaml's style is more testable.parse calls. Rust with &mut usize also composes but requires managing the shared mutable state.nom, pest, winnow. OCaml has angstrom, mparser. The dotstring parser is the simplest case of what these libraries handle.Chars iterator through recursive calls. This is a simple recursive descent parser — the same pattern used in compiler front-ends.&mut Chars:** Passing chars: &mut Chars allows each recursive call to advance the iterator. The next() call returns Some(char) or None when the input is exhausted.parse(serialize(tree)) == tree for any tree. This is a strong property to verify with property-based testing.(tree, remaining) tuples for functional threading of parser state. Rust uses &mut usize position. Both thread state through recursion; only the syntax differs.OCaml Approach
OCaml's functional version returns (Tree<char>, int) pairs: let rec parse s pos = if pos >= String.length s then failwith "unexpected end" else let c = s.[pos] in if c = '.' then (Leaf, pos + 1) else let (l, p1) = parse s (pos + 1) in let (r, p2) = parse s p1 in (Node (c, l, r), p2). The position threads through all recursive calls as an explicit argument and return value.
Full Source
#![allow(clippy::all)]
// Placeholder — pending conversionDeep Comparison
OCaml vs Rust: Dotstring Parse
Overview
See the example.rs and example.ml files for detailed implementations.
Key Differences
| Aspect | OCaml | Rust |
|---|---|---|
| Type system | Hindley-Milner | Ownership + traits |
| Memory | GC | Zero-cost abstractions |
| Mutability | Explicit ref | mut keyword |
| Error handling | Option/Result | Result<T, E> |
See README.md for detailed comparison.
Exercises
Result<Tree<char>, ParseError> where ParseError includes position and message.Parser<T> = impl Fn(&[char], usize) -> Result<(T, usize), String>. Define map, and_then, and or combinators, then compose them to build the dotstring parser.[abc] for a node with value "abc". Modify the grammar and the parser together.Parser<T> type as a newtype over impl Fn(&str) -> Result<(T, &str), String> and rewrite the tree parser using combinators like map, and_then, and or.