Lifetimes in Enums
Difficulty level: Intermediate. Key concepts covered: Functional Programming.
Tutorial
The Problem
Enums can hold references just like structs, and the same lifetime annotation rules apply. Token types in parsers, parse result types, and zero-copy JSON/YAML values are classic examples: they borrow slices of the input string rather than copying them, which avoids an allocation per token. The Token<'a> pattern is a natural fit for Rust parsing libraries such as nom, winnow, and pest: a lexer walks a string slice and yields tokens that are lightweight views into the original input.
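To make the zero-copy claim concrete, here is a minimal sketch. The `words` and `is_view_into` helpers are hypothetical names introduced for illustration and are not part of the tutorial's source. Each token is a slice of the input, so its pointer falls inside the input's buffer.

```rust
/// A token that borrows its text from the input string.
#[derive(Debug, PartialEq)]
pub enum Token<'a> {
    Word(&'a str),
    End,
}

/// Hypothetical helper: split on whitespace and wrap each piece in a token.
/// `split_whitespace` yields subslices of `input`, so no text is copied.
pub fn words(input: &str) -> Vec<Token<'_>> {
    let mut tokens: Vec<Token<'_>> = input.split_whitespace().map(Token::Word).collect();
    tokens.push(Token::End);
    tokens
}

/// Hypothetical check: does `piece` point inside the buffer of `source`?
pub fn is_view_into(source: &str, piece: &str) -> bool {
    let start = source.as_ptr() as usize;
    let end = start + source.len();
    let p = piece.as_ptr() as usize;
    p >= start && p + piece.len() <= end
}
```

Calling `words` on a `String` and passing any resulting `Word` slice to `is_view_into` should return true, since the only allocation is the `Vec` that holds the tokens.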
🎯 Learning Outcomes
- Token<'a> with Word(&'a str) variants enables zero-copy tokenization
- ParseResult<'a, T> models the remaining input alongside a parsed value
- JsonValue<'a> builds a zero-copy JSON tree borrowing strings from the source
Code Example
// Enum variant with reference needs lifetime
#[derive(Debug)]
pub enum Token<'a> {
    Word(&'a str), // borrows from input
    Number(i64),
    Punctuation(char),
    End,
}

pub enum ParseResult<'a, T> {
    Ok(T, &'a str),       // remaining input
    Err(&'a str, String), // failing position + message
}
Key Differences
1. Rust's Word(&'a str) is a zero-copy view into the input; OCaml's Word of string copies the substring unless explicit slice types are used.
2. Rust carries 'a explicitly through recursive enum arms like Array(Vec<JsonValue<'a>>); OCaml records and variants need no lifetime propagation.
3. Rust enums can mix owned variants (Number(i64)) and borrowed variants (Word(&'a str)) in the same type; OCaml has no such distinction, since all values are uniformly GC-managed.
4. ParseResult<'a, T> threads the remaining &'a str through the type system; OCaml parser combinators return (value, rest: string) tuples with no lifetime tracking.
OCaml Approach
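OCaml variant types for tokens typically use string, since strings are immutable and GC-managed. Zero-copy parsing needs an explicit representation instead, such as Bigarray slices or offset/length views into the source. The usual owned-string version looks like this: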
type token = Word of string | Number of int | Punct of char | End
type 'a parse_result = Ok of 'a * string | Err of string * string
All string values in OCaml are GC-managed, so there is no dangling reference concern, but extracting a substring (for example with String.sub) allocates a copy.
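For comparison, the copying behaviour of the OCaml version can be mirrored in Rust with an owned enum. The sketch below is illustrative only: `OwnedToken`, `to_owned_token`, and `first_word_owned` are hypothetical names, not part of the tutorial's source.

```rust
/// Borrowed token: `Word` is a view into the input.
#[derive(Debug, PartialEq, Clone)]
pub enum Token<'a> {
    Word(&'a str),
    Number(i64),
}

/// Hypothetical owned counterpart: `Word` holds its own copy of the text,
/// much like OCaml's `Word of string`.
#[derive(Debug, PartialEq, Clone)]
pub enum OwnedToken {
    Word(String),
    Number(i64),
}

impl Token<'_> {
    /// Copy any borrowed text so the result no longer depends on the input.
    pub fn to_owned_token(&self) -> OwnedToken {
        match self {
            Token::Word(w) => OwnedToken::Word((*w).to_string()),
            Token::Number(n) => OwnedToken::Number(*n),
        }
    }
}

/// Takes the input by value, so a borrowed `Token<'_>` could not be returned
/// from here. The owned copy can.
pub fn first_word_owned(input: String) -> Option<OwnedToken> {
    let token = input.split_whitespace().next().map(Token::Word)?;
    Some(token.to_owned_token())
    // `input` is dropped here, after the copy has been made.
}
```

The trade-off is the usual one: the borrowed form avoids allocation but ties every token to the input's lifetime, while the owned form pays for a copy and can be stored or returned freely.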
Full Source
#![allow(clippy::all)]
//! Lifetimes in Enums
//!
//! Enum variants containing references require lifetime parameters.

/// Token borrowing from the input string.
#[derive(Debug, PartialEq, Clone)]
pub enum Token<'a> {
    Word(&'a str),
    Number(i64),
    Punctuation(char),
    End,
}

/// Parse result that borrows from input.
#[derive(Debug)]
pub enum ParseResult<'a, T> {
    Ok(T, &'a str),       // value + remaining input
    Err(&'a str, String), // failing input + error message
}

/// JSON-like value that may borrow from source.
#[derive(Debug, Clone, PartialEq)]
pub enum JsonValue<'a> {
    Null,
    Bool(bool),
    Number(f64),
    String(&'a str), // borrows from source
    Array(Vec<JsonValue<'a>>),
    Object(Vec<(&'a str, JsonValue<'a>)>),
}

/// Simple tokenizer.
pub fn tokenize(input: &str) -> Vec<Token<'_>> {
    let mut tokens = Vec::new();
    let mut chars = input.char_indices().peekable();
    while let Some((i, c)) = chars.next() {
        match c {
            ' ' | '\t' | '\n' => continue,
            '.' | ',' | '!' | '?' => tokens.push(Token::Punctuation(c)),
            '0'..='9' => {
                let start = i;
                while chars
                    .peek()
                    .map(|(_, c)| c.is_ascii_digit())
                    .unwrap_or(false)
                {
                    chars.next();
                }
                let end = chars.peek().map(|(i, _)| *i).unwrap_or(input.len());
                let num: i64 = input[start..end].parse().unwrap_or(0);
                tokens.push(Token::Number(num));
            }
            'a'..='z' | 'A'..='Z' => {
                let start = i;
                while chars
                    .peek()
                    .map(|(_, c)| c.is_alphanumeric())
                    .unwrap_or(false)
                {
                    chars.next();
                }
                let end = chars.peek().map(|(i, _)| *i).unwrap_or(input.len());
                tokens.push(Token::Word(&input[start..end]));
            }
            _ => {}
        }
    }
    tokens.push(Token::End);
    tokens
}

impl<'a> JsonValue<'a> {
    pub fn is_null(&self) -> bool {
        matches!(self, JsonValue::Null)
    }

    pub fn as_str(&self) -> Option<&'a str> {
        match self {
            JsonValue::String(s) => Some(s),
            _ => None,
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_tokenize() {
        let tokens = tokenize("hello world 123");
        assert_eq!(
            tokens,
            vec![
                Token::Word("hello"),
                Token::Word("world"),
                Token::Number(123),
                Token::End
            ]
        );
    }

    #[test]
    fn test_tokenize_punctuation() {
        let tokens = tokenize("hello, world!");
        assert!(tokens.contains(&Token::Punctuation(',')));
        assert!(tokens.contains(&Token::Punctuation('!')));
    }

    #[test]
    fn test_json_value() {
        let source = "hello";
        let value = JsonValue::String(source);
        assert_eq!(value.as_str(), Some("hello"));
    }

    #[test]
    fn test_json_nested() {
        let key = "name";
        let val = "John";
        let obj = JsonValue::Object(vec![(key, JsonValue::String(val))]);
        if let JsonValue::Object(pairs) = obj {
            assert_eq!(pairs[0].0, "name");
        }
    }

    #[test]
    fn test_parse_result() {
        let input = "remaining";
        let result: ParseResult<i32> = ParseResult::Ok(42, input);
        if let ParseResult::Ok(v, rest) = result {
            assert_eq!(v, 42);
            assert_eq!(rest, "remaining");
        }
    }
}
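The listing defines JsonValue<'a> but only reads it at the top level. As a rough sketch of how 'a flows through the recursive arms, the hypothetical `collect_strings` helper below (not part of the original source) walks a tree and returns slices that are tied to the source text rather than to the tree.

```rust
/// Same shape as the `JsonValue<'a>` in the listing above.
#[derive(Debug, Clone, PartialEq)]
pub enum JsonValue<'a> {
    Null,
    Bool(bool),
    Number(f64),
    String(&'a str),
    Array(Vec<JsonValue<'a>>),
    Object(Vec<(&'a str, JsonValue<'a>)>),
}

/// Hypothetical helper: gather every borrowed string value in the tree.
/// Object keys are skipped. The collected slices carry `'a`, so they stay
/// valid as long as the source text does, even after the tree is dropped.
pub fn collect_strings<'a>(value: &JsonValue<'a>, out: &mut Vec<&'a str>) {
    match value {
        JsonValue::String(s) => out.push(*s),
        JsonValue::Array(items) => {
            for item in items {
                collect_strings(item, out);
            }
        }
        JsonValue::Object(pairs) => {
            for (_key, item) in pairs {
                collect_strings(item, out);
            }
        }
        JsonValue::Null | JsonValue::Bool(_) | JsonValue::Number(_) => {}
    }
}
```

Because the output vector is typed `Vec<&'a str>` and not tied to the borrow of `value`, the results can outlive the tree, which is the practical payoff of naming the lifetime on the enum.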
Deep Comparison
OCaml vs Rust: Enum Lifetimes
OCaml
(* Variant can hold string without lifetime annotation *)
type token =
| Word of string
| Number of int
| Punctuation of char
| End
type 'a parse_result =
| Ok of 'a * string
| Err of string * string
Rust
// Enum variant with reference needs lifetime
#[derive(Debug)]
pub enum Token<'a> {
    Word(&'a str), // borrows from input
    Number(i64),
    Punctuation(char),
    End,
}

pub enum ParseResult<'a, T> {
    Ok(T, &'a str),       // remaining input
    Err(&'a str, String), // failing position + message
}
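One way to see why ParseResult carries the remaining input is to chain two parsers. The following is a sketch under assumptions: `parse_number` and `parse_sum` are hypothetical functions invented for this illustration, and the `<number>+<number>` grammar is made up.

```rust
/// Same shape as the `ParseResult<'a, T>` shown above.
#[derive(Debug, PartialEq)]
pub enum ParseResult<'a, T> {
    Ok(T, &'a str),       // value + remaining input
    Err(&'a str, String), // failing input + message
}

/// Hypothetical parser: read a run of leading ASCII digits as an `i64`.
pub fn parse_number(input: &str) -> ParseResult<'_, i64> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    // `Ok`/`Err` in this match are the standard `Result` variants.
    match input[..end].parse::<i64>() {
        Ok(n) => ParseResult::Ok(n, &input[end..]),
        Err(e) => ParseResult::Err(input, e.to_string()),
    }
}

/// Hypothetical combinator: parse `<number>+<number>` and add the two.
/// The remaining input of each step is fed into the next one.
pub fn parse_sum(input: &str) -> ParseResult<'_, i64> {
    let (left, rest) = match parse_number(input) {
        ParseResult::Ok(n, rest) => (n, rest),
        ParseResult::Err(at, msg) => return ParseResult::Err(at, msg),
    };
    let rest = match rest.strip_prefix('+') {
        Some(rest) => rest,
        None => return ParseResult::Err(rest, String::from("expected '+'")),
    };
    match parse_number(rest) {
        ParseResult::Ok(right, rest) => ParseResult::Ok(left + right, rest),
        ParseResult::Err(at, msg) => ParseResult::Err(at, msg),
    }
}
```

Every `rest` slice and every error position is a view into the same original string, so the whole chain runs without copying input, and the error variant can point at exactly where parsing stopped.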
Exercises
1. Extend tokenize to handle multi-digit numbers and punctuation sequences, returning a Vec<Token<'_>> that borrows entirely from the input string.
2. Implement fn parse_word<'a>(input: &'a str) -> ParseResult<'a, &'a str> that consumes and returns the first whitespace-delimited word.
3. Write a parser that returns a Vec<(&str, &str)> of key-value pairs where both key and value are slices of the original input.
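As a possible starting point for the key-value exercise, here is a small sketch. The `key=value;key=value` input format and the `parse_pairs` name are assumptions chosen for illustration; the exercise itself does not specify a format.

```rust
/// Hypothetical parser for input shaped like `key=value;key=value`.
/// Entries without an `=` are skipped. Keys and values are trimmed slices
/// of `input`, so nothing is copied.
pub fn parse_pairs(input: &str) -> Vec<(&str, &str)> {
    input
        .split(';')
        .filter_map(|entry| entry.split_once('='))
        .map(|(key, value)| (key.trim(), value.trim()))
        .collect()
}
```

Lifetime elision ties both elided lifetimes in the return type to `input`, which is what the exercise asks for. A fuller solution would likely report malformed entries instead of silently dropping them.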