
Lifetimes in Enums

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "Lifetimes in Enums" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Enums can hold references just like structs, and the same lifetime annotation rules apply.

Tutorial

The Problem

Enums can hold references just like structs, and the same lifetime annotation rules apply. Token types in parsers, parse result types, and zero-copy JSON/YAML values are all classic examples: they borrow slices of the input string rather than copying, making parsing dramatically faster. The Token<'a> pattern is foundational in parsing libraries (nom, winnow, pest) — a lexer tokenizes a string slice and yields tokens that are lightweight views into the original input, requiring no allocation.

🎯 Learning Outcomes

  • How enum variants with reference fields require a lifetime parameter on the enum
  • How Token<'a> with Word(&'a str) variants enables zero-copy tokenization
  • How ParseResult<'a, T> pairs the remaining input with a parsed value
  • How JsonValue<'a> builds a zero-copy JSON tree borrowing strings from the source
  • Where lifetime-annotated enums appear: nom tokens, serde zero-copy deserialization, ASTs

    Code Example

    // Enum variant with reference needs lifetime
    #[derive(Debug)]
    pub enum Token<'a> {
        Word(&'a str),    // borrows from input
        Number(i64),
        Punctuation(char),
        End,
    }
    
    pub enum ParseResult<'a, T> {
        Ok(T, &'a str),           // remaining input
        Err(&'a str, String),     // failing position + message
    }

    Key Differences

  • Zero-copy tokens: Rust Word(&'a str) is a zero-copy view into the input; OCaml Word of string copies the substring unless explicit slice types are used.
  • Enum lifetime propagation: Rust propagates 'a automatically through recursive enum arms like Array(Vec<JsonValue<'a>>); OCaml records and variants need no lifetime propagation.
  • Mixed owned/borrowed variants: Rust enums can mix owned variants (Number(i64)) and borrowed variants (Word(&'a str)) in the same type; OCaml has no such distinction — all values are uniformly GC-managed.
  • Parser result threading: Rust ParseResult<'a, T> threads the remaining &'a str through the type system; OCaml parser combinators return (value, rest: string) tuples with no lifetime tracking.
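The zero-copy claim above can be checked directly: a borrowed Word token is a view into the input buffer, so its slice shares an address with the source string. The sketch below (first_token is a hypothetical helper, not part of this example's source) verifies that by pointer identity:

```rust
// Sketch (hypothetical helper): verify zero-copy by pointer identity.
// A borrowed Word token points into the same buffer as the input.
#[derive(Debug, PartialEq)]
enum Token<'a> {
    Word(&'a str),
    Number(i64),
}

fn first_token(input: &str) -> Token<'_> {
    match input.split_whitespace().next() {
        Some(w) => w
            .parse::<i64>()
            .map(Token::Number)
            .unwrap_or(Token::Word(w)), // no allocation: `w` borrows from `input`
        None => Token::Word(""),
    }
}

fn main() {
    let input = String::from("hello 42");
    if let Token::Word(w) = first_token(&input) {
        // Zero-copy: the token's slice starts at the same address as `input`.
        assert_eq!(w.as_ptr(), input.as_ptr());
        println!("{w}"); // prints "hello"
    }
}
```

An OCaml `Word of string` has no equivalent check: the constructor argument is a fresh heap copy unless an explicit slice type is introduced.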
    OCaml Approach

    OCaml variant types for tokens use string (owned) since strings are immutable and GC-managed. Zero-copy parsing requires explicit Bigarray or Bytes slices:

    type token = Word of string | Number of int | Punct of char | End
    type 'a parse_result = Ok of 'a * string | Err of string * string
    

    All string values in OCaml are GC-managed, so there is no dangling reference concern — but they are copied by default.
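    The OCaml owned-string design can be mirrored in Rust to make the copying explicit. This sketch (type and function names are hypothetical, not from this example's source) is the Rust analogue of `Word of string` — no lifetime parameter is needed because every variant owns its data:

```rust
// Sketch: Rust analogue of OCaml's owned-string variant. Each Word
// allocates its own String, so the enum needs no lifetime parameter.
#[derive(Debug, PartialEq)]
enum OwnedToken {
    Word(String), // owns its data, like OCaml's `Word of string`
    Number(i64),
}

fn tokenize_owned(input: &str) -> Vec<OwnedToken> {
    input
        .split_whitespace()
        .map(|w| {
            w.parse::<i64>()
                .map(OwnedToken::Number)
                // copies the substring, as OCaml does by default
                .unwrap_or_else(|_| OwnedToken::Word(w.to_string()))
        })
        .collect()
}

fn main() {
    let tokens = tokenize_owned("hello 42");
    assert_eq!(
        tokens,
        vec![OwnedToken::Word("hello".to_string()), OwnedToken::Number(42)]
    );
    println!("{tokens:?}");
}
```

    The trade-off is the same in both languages: owned tokens are simpler (no lifetime to thread, tokens can outlive the input) at the cost of one allocation per word.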

    Full Source

    #![allow(clippy::all)]
    //! Lifetimes in Enums
    //!
    //! Enum variants containing references require lifetime parameters.
    
    /// Token borrowing from the input string.
    #[derive(Debug, PartialEq, Clone)]
    pub enum Token<'a> {
        Word(&'a str),
        Number(i64),
        Punctuation(char),
        End,
    }
    
    /// Parse result that borrows from input.
    #[derive(Debug)]
    pub enum ParseResult<'a, T> {
        Ok(T, &'a str),       // value + remaining input
        Err(&'a str, String), // failing input + error message
    }
    
    /// JSON-like value that may borrow from source.
    #[derive(Debug, Clone, PartialEq)]
    pub enum JsonValue<'a> {
        Null,
        Bool(bool),
        Number(f64),
        String(&'a str), // borrows from source
        Array(Vec<JsonValue<'a>>),
        Object(Vec<(&'a str, JsonValue<'a>)>),
    }
    
    /// Simple tokenizer.
    pub fn tokenize(input: &str) -> Vec<Token<'_>> {
        let mut tokens = Vec::new();
        let mut chars = input.char_indices().peekable();
    
        while let Some((i, c)) = chars.next() {
            match c {
                ' ' | '\t' | '\n' => continue,
                '.' | ',' | '!' | '?' => tokens.push(Token::Punctuation(c)),
                '0'..='9' => {
                    let start = i;
                    while chars
                        .peek()
                        .map(|(_, c)| c.is_ascii_digit())
                        .unwrap_or(false)
                    {
                        chars.next();
                    }
                    let end = chars.peek().map(|(i, _)| *i).unwrap_or(input.len());
                    let num: i64 = input[start..end].parse().unwrap_or(0);
                    tokens.push(Token::Number(num));
                }
                'a'..='z' | 'A'..='Z' => {
                    let start = i;
                    while chars
                        .peek()
                        .map(|(_, c)| c.is_alphanumeric())
                        .unwrap_or(false)
                    {
                        chars.next();
                    }
                    let end = chars.peek().map(|(i, _)| *i).unwrap_or(input.len());
                    tokens.push(Token::Word(&input[start..end]));
                }
                _ => {}
            }
        }
        tokens.push(Token::End);
        tokens
    }
    
    impl<'a> JsonValue<'a> {
        pub fn is_null(&self) -> bool {
            matches!(self, JsonValue::Null)
        }
    
        pub fn as_str(&self) -> Option<&'a str> {
            match self {
                JsonValue::String(s) => Some(*s), // copy the &'a str out so it outlives &self
                _ => None,
            }
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_tokenize() {
            let tokens = tokenize("hello world 123");
            assert_eq!(
                tokens,
                vec![
                    Token::Word("hello"),
                    Token::Word("world"),
                    Token::Number(123),
                    Token::End
                ]
            );
        }
    
        #[test]
        fn test_tokenize_punctuation() {
            let tokens = tokenize("hello, world!");
            assert!(tokens.contains(&Token::Punctuation(',')));
            assert!(tokens.contains(&Token::Punctuation('!')));
        }
    
        #[test]
        fn test_json_value() {
            let source = "hello";
            let value = JsonValue::String(source);
            assert_eq!(value.as_str(), Some("hello"));
        }
    
        #[test]
        fn test_json_nested() {
            let key = "name";
            let val = "John";
            let obj = JsonValue::Object(vec![(key, JsonValue::String(val))]);
            if let JsonValue::Object(pairs) = obj {
                assert_eq!(pairs[0].0, "name");
            }
        }
    
        #[test]
        fn test_parse_result() {
            let input = "remaining";
        let result: ParseResult<'_, i32> = ParseResult::Ok(42, input);
            if let ParseResult::Ok(v, rest) = result {
                assert_eq!(v, 42);
                assert_eq!(rest, "remaining");
            }
        }
    }

    Deep Comparison

    OCaml vs Rust: Enum Lifetimes

    OCaml

    (* Variant can hold string without lifetime annotation *)
    type token =
      | Word of string
      | Number of int
      | Punctuation of char
      | End
    
    type 'a parse_result =
      | Ok of 'a * string
      | Err of string * string
    

    Rust

    // Enum variant with reference needs lifetime
    #[derive(Debug)]
    pub enum Token<'a> {
        Word(&'a str),    // borrows from input
        Number(i64),
        Punctuation(char),
        End,
    }
    
    pub enum ParseResult<'a, T> {
        Ok(T, &'a str),           // remaining input
        Err(&'a str, String),     // failing position + message
    }
    

    Key Differences

  • OCaml: string in variant is owned/GC-managed
  • Rust: &'a str in variant borrows from external source
  • Rust: Enum lifetime means "valid while source is valid"
  • Both: Enums can contain references or values
  • Rust: Zero-copy parsing possible with borrowed tokens
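"Valid while source is valid" also has a standard escape hatch when a token must outlive its input: Cow lets one variant hold either a borrowed slice or an owned String. The sketch below (the into_owned helper is hypothetical, not part of this example's source) detaches a token from its source by cloning only the borrowed data:

```rust
use std::borrow::Cow;

// Sketch (hypothetical helper): a Cow-based token can be either borrowed
// or owned; `into_owned` clones borrowed data to drop the lifetime bound.
#[derive(Debug, PartialEq)]
enum Token<'a> {
    Word(Cow<'a, str>),
    Number(i64),
}

impl<'a> Token<'a> {
    // Clone borrowed data so the result no longer borrows from the input.
    fn into_owned(self) -> Token<'static> {
        match self {
            Token::Word(w) => Token::Word(Cow::Owned(w.into_owned())),
            Token::Number(n) => Token::Number(n),
        }
    }
}

fn main() {
    let detached;
    {
        let input = String::from("hello");
        let tok = Token::Word(Cow::Borrowed(input.as_str()));
        // `tok` may not escape this scope, but its owned copy can:
        detached = tok.into_owned();
    } // `input` is dropped here
    assert_eq!(detached, Token::Word(Cow::Owned("hello".to_string())));
    println!("{detached:?}");
}
```

This mixed borrowed/owned representation is roughly what OCaml gets for free from the GC, paid for in Rust only at the explicit into_owned boundary.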
    Exercises

  • Full tokenizer: Extend tokenize to handle multi-digit numbers and punctuation sequences, returning a Vec<Token<'_>> that borrows entirely from the input string.
  • Parser combinator: Implement fn parse_word<'a>(input: &'a str) -> ParseResult<'a, &'a str> that consumes and returns the first whitespace-delimited word.
  • Zero-copy JSON strings: Write a function that takes a JSON-like string and returns a Vec<(&str, &str)> of key-value pairs where both key and value are slices of the original input.
    Open Source Repos