ExamplesBy LevelBy TopicLearning Paths
888 Intermediate

888-peekable-iterator — Peekable Iterator

Functional Programming

Tutorial

The Problem

Many parsing and streaming algorithms need to look ahead at the next element before deciding whether to consume it. A lexer scanning multi-character numbers must peek at the next character without consuming it. A run-length encoder must check if the next element matches the current group. Implementing lookahead without peekable iterators requires awkward "push-back" buffers or read-one-ahead state variables. Rust's .peekable() adapter adds a peek() method that returns a reference to the next element without advancing the iterator. OCaml handles this with explicit option state or stream parsers from the Stream module.

🎯 Learning Outcomes

  • • Use .peekable() to add lookahead to any Rust iterator
  • • Use peek() inside a while loop to implement conditional consumption
  • • Build a simple tokenizer for multi-character numbers using peekable char iterators
  • • Implement run-length grouping using peek to detect group boundaries
  • • Compare with OCaml's stream-based parsing approach
  • Code Example

    pub fn sum_while_positive(data: &[i32]) -> i32 {
        let mut iter = data.iter().peekable();
        let mut sum = 0;
        while iter.peek().map_or(false, |&&v| v > 0) {
            sum += iter.next().unwrap();
        }
        sum
    }

    Key Differences

  • Borrow semantics: Rust peek() returns Option<&Self::Item> — a reference into the iterator; consuming requires a separate next() call. OCaml Stream.peek returns a copy.
  • Mutable iterator state: Rust's peekable stores one buffered element; OCaml's Stream maintains its own internal buffer.
  • Combinator style: OCaml parsers often use applicative/monadic combinators (let*, >>=); Rust peekable enables hand-written recursive descent.
  • Integration: Rust peekable works with any iterator (char iterators, token iterators, etc.); OCaml Stream requires wrapping the source.
  • OCaml Approach

    OCaml's Stream module provides Stream.peek, Stream.next, and Stream.junk for similar lookahead. More commonly, OCaml parsers use the Scanf module or write recursive descent parsers with explicit state. The Angstrom library provides monadic combinators. For ad-hoc tokenization, OCaml often uses a Buffer.t for accumulation and a ref for the current character position — more explicit state management than Rust's peekable iterator.

    Full Source

    #![allow(clippy::all)]
    // Example 094: Peekable Iterator
    // Lookahead parsing with .peekable() — inspect the next element without consuming it.
    
    // === Approach 1: Idiomatic Rust — consume while condition holds ===
    // Uses peek() to decide whether to advance; no "push-back" needed.
    pub fn sum_while_positive(data: &[i32]) -> i32 {
        let mut iter = data.iter().peekable();
        let mut sum = 0;
        while iter.peek().is_some_and(|&&v| v > 0) {
            sum += iter.next().unwrap();
        }
        sum
    }
    
    // Group consecutive equal elements using peek() to detect group boundaries.
    pub fn group_consecutive<T: PartialEq + Clone>(data: &[T]) -> Vec<Vec<T>> {
        let mut iter = data.iter().peekable();
        let mut groups: Vec<Vec<T>> = Vec::new();
    
        while let Some(item) = iter.next() {
            let mut group = vec![item.clone()];
            // Peek ahead: keep consuming while next equals current
            while iter.peek().is_some_and(|next| *next == item) {
                group.push(iter.next().unwrap().clone());
            }
            groups.push(group);
        }
        groups
    }
    
    // === Approach 2: Tokenizer using peekable — multi-char number scanning ===
    #[derive(Debug, PartialEq, Clone)]
    pub enum Token {
        Num(i64),
        Plus,
        Minus,
        Star,
        Slash,
        LParen,
        RParen,
    }
    
    pub fn tokenize(input: &str) -> Vec<Token> {
        let mut chars = input.chars().peekable();
        let mut tokens = Vec::new();
    
        while let Some(&ch) = chars.peek() {
            match ch {
                ' ' | '\t' | '\n' => {
                    chars.next();
                }
                '0'..='9' => {
                    // Consume digit run with next_if — the power move
                    let mut num_str = String::new();
                    while let Some(d) = chars.next_if(|c| c.is_ascii_digit()) {
                        num_str.push(d);
                    }
                    tokens.push(Token::Num(num_str.parse().unwrap()));
                }
                '+' => {
                    chars.next();
                    tokens.push(Token::Plus);
                }
                '-' => {
                    chars.next();
                    tokens.push(Token::Minus);
                }
                '*' => {
                    chars.next();
                    tokens.push(Token::Star);
                }
                '/' => {
                    chars.next();
                    tokens.push(Token::Slash);
                }
                '(' => {
                    chars.next();
                    tokens.push(Token::LParen);
                }
                ')' => {
                    chars.next();
                    tokens.push(Token::RParen);
                }
                other => panic!("unexpected character: {other:?}"),
            }
        }
        tokens
    }
    
    // === Approach 3: next_if combinator — functional lookahead ===
    // Advance only if predicate holds; return None if it doesn't.
    pub fn take_while_lt<'a>(
        iter: &mut std::iter::Peekable<impl Iterator<Item = &'a i32>>,
        limit: i32,
    ) -> Vec<i32> {
        let mut out = Vec::new();
        while let Some(&&v) = iter.peek() {
            if v >= limit {
                break;
            }
            out.push(v);
            iter.next();
        }
        out
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        // --- sum_while_positive ---
    
        #[test]
        fn test_sum_while_positive_all_positive() {
            assert_eq!(sum_while_positive(&[1, 2, 3, 4]), 10);
        }
    
        #[test]
        fn test_sum_while_positive_stops_at_nonpositive() {
            // stops at 0, ignores trailing values
            assert_eq!(sum_while_positive(&[3, 2, 0, 5]), 5);
        }
    
        #[test]
        fn test_sum_while_positive_negative_first() {
            assert_eq!(sum_while_positive(&[-1, 2, 3]), 0);
        }
    
        #[test]
        fn test_sum_while_positive_empty() {
            assert_eq!(sum_while_positive(&[]), 0);
        }
    
        // --- group_consecutive ---
    
        #[test]
        fn test_group_consecutive_mixed() {
            let result = group_consecutive(&[1, 1, 2, 3, 3, 3]);
            assert_eq!(result, vec![vec![1, 1], vec![2], vec![3, 3, 3]]);
        }
    
        #[test]
        fn test_group_consecutive_all_same() {
            let result = group_consecutive(&[7, 7, 7]);
            assert_eq!(result, vec![vec![7, 7, 7]]);
        }
    
        #[test]
        fn test_group_consecutive_all_distinct() {
            let result = group_consecutive(&[1, 2, 3]);
            assert_eq!(result, vec![vec![1], vec![2], vec![3]]);
        }
    
        #[test]
        fn test_group_consecutive_empty() {
            let result: Vec<Vec<i32>> = group_consecutive(&[]);
            assert_eq!(result, Vec::<Vec<i32>>::new());
        }
    
        // --- tokenize ---
    
        #[test]
        fn test_tokenize_number_and_ops() {
            use Token::*;
            let toks = tokenize("12 + 3 * (4 - 1)");
            assert_eq!(
                toks,
                vec![
                    Num(12),
                    Plus,
                    Num(3),
                    Star,
                    LParen,
                    Num(4),
                    Minus,
                    Num(1),
                    RParen
                ]
            );
        }
    
        #[test]
        fn test_tokenize_multi_digit_numbers() {
            use Token::*;
            let toks = tokenize("100 / 25");
            assert_eq!(toks, vec![Num(100), Slash, Num(25)]);
        }
    
        #[test]
        fn test_tokenize_single_number() {
            assert_eq!(tokenize("42"), vec![Token::Num(42)]);
        }
    
        #[test]
        fn test_tokenize_empty() {
            assert_eq!(tokenize(""), vec![]);
        }
    
        // --- take_while_lt ---
    
        #[test]
        fn test_take_while_lt_basic() {
            let data = [1, 2, 3, 10, 11];
            let mut iter = data.iter().peekable();
            let taken = take_while_lt(&mut iter, 5);
            assert_eq!(taken, vec![1, 2, 3]);
            // iterator still has 10, 11
            assert_eq!(iter.next(), Some(&10));
        }
    
        #[test]
        fn test_take_while_lt_none_qualify() {
            let data = [5, 6, 7];
            let mut iter = data.iter().peekable();
            let taken = take_while_lt(&mut iter, 5);
            assert_eq!(taken, vec![]);
            assert_eq!(iter.next(), Some(&5));
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        // --- sum_while_positive ---
    
        #[test]
        fn test_sum_while_positive_all_positive() {
            assert_eq!(sum_while_positive(&[1, 2, 3, 4]), 10);
        }
    
        #[test]
        fn test_sum_while_positive_stops_at_nonpositive() {
            // stops at 0, ignores trailing values
            assert_eq!(sum_while_positive(&[3, 2, 0, 5]), 5);
        }
    
        #[test]
        fn test_sum_while_positive_negative_first() {
            assert_eq!(sum_while_positive(&[-1, 2, 3]), 0);
        }
    
        #[test]
        fn test_sum_while_positive_empty() {
            assert_eq!(sum_while_positive(&[]), 0);
        }
    
        // --- group_consecutive ---
    
        #[test]
        fn test_group_consecutive_mixed() {
            let result = group_consecutive(&[1, 1, 2, 3, 3, 3]);
            assert_eq!(result, vec![vec![1, 1], vec![2], vec![3, 3, 3]]);
        }
    
        #[test]
        fn test_group_consecutive_all_same() {
            let result = group_consecutive(&[7, 7, 7]);
            assert_eq!(result, vec![vec![7, 7, 7]]);
        }
    
        #[test]
        fn test_group_consecutive_all_distinct() {
            let result = group_consecutive(&[1, 2, 3]);
            assert_eq!(result, vec![vec![1], vec![2], vec![3]]);
        }
    
        #[test]
        fn test_group_consecutive_empty() {
            let result: Vec<Vec<i32>> = group_consecutive(&[]);
            assert_eq!(result, Vec::<Vec<i32>>::new());
        }
    
        // --- tokenize ---
    
        #[test]
        fn test_tokenize_number_and_ops() {
            use Token::*;
            let toks = tokenize("12 + 3 * (4 - 1)");
            assert_eq!(
                toks,
                vec![
                    Num(12),
                    Plus,
                    Num(3),
                    Star,
                    LParen,
                    Num(4),
                    Minus,
                    Num(1),
                    RParen
                ]
            );
        }
    
        #[test]
        fn test_tokenize_multi_digit_numbers() {
            use Token::*;
            let toks = tokenize("100 / 25");
            assert_eq!(toks, vec![Num(100), Slash, Num(25)]);
        }
    
        #[test]
        fn test_tokenize_single_number() {
            assert_eq!(tokenize("42"), vec![Token::Num(42)]);
        }
    
        #[test]
        fn test_tokenize_empty() {
            assert_eq!(tokenize(""), vec![]);
        }
    
        // --- take_while_lt ---
    
        #[test]
        fn test_take_while_lt_basic() {
            let data = [1, 2, 3, 10, 11];
            let mut iter = data.iter().peekable();
            let taken = take_while_lt(&mut iter, 5);
            assert_eq!(taken, vec![1, 2, 3]);
            // iterator still has 10, 11
            assert_eq!(iter.next(), Some(&10));
        }
    
        #[test]
        fn test_take_while_lt_none_qualify() {
            let data = [5, 6, 7];
            let mut iter = data.iter().peekable();
            let taken = take_while_lt(&mut iter, 5);
            assert_eq!(taken, vec![]);
            assert_eq!(iter.next(), Some(&5));
        }
    }

    Deep Comparison

    OCaml vs Rust: Peekable Iterator

    Side-by-Side Code

    OCaml — manual peekable buffer

    type 'a peekable = {
      mutable peeked : 'a option;
      next_fn : unit -> 'a option;
    }
    
    let peek p =
      match p.peeked with
      | Some _ as v -> v
      | None ->
        let v = p.next_fn () in
        p.peeked <- v; v
    
    let next p =
      match p.peeked with
      | Some _ as v -> p.peeked <- None; v
      | None -> p.next_fn ()
    

    Rust (idiomatic) — .peekable() from std

    pub fn sum_while_positive(data: &[i32]) -> i32 {
        let mut iter = data.iter().peekable();
        let mut sum = 0;
        while iter.peek().map_or(false, |&&v| v > 0) {
            sum += iter.next().unwrap();
        }
        sum
    }
    

    Rust (functional/recursive) — next_if combinator

    // next_if: consume and return next element only if predicate holds.
    // Returns None and leaves iterator untouched if predicate fails.
    while let Some(d) = chars.next_if(|c| c.is_ascii_digit()) {
        num_str.push(d);
    }
    

    Type Signatures

    ConceptOCamlRust
    Peekable type'a peekable (custom record)Peekable<I> (std wrapper)
    Peek result'a optionOption<&I::Item> (reference)
    Conditional advancemanual: check peeked, call nextiter.next_if(pred)
    Iterator item'aI::Item

    Key Insights

  • Built-in vs manual: OCaml has no standard peekable — you must build a mutable record with a peeked ref field yourself. Rust ships Peekable<I> in std::iter, wrapping any iterator with a one-slot buffer at zero cost.
  • Reference semantics on peek: Rust's .peek() returns Option<&Item> — a reference to the buffered item, not ownership. This prevents double-consume bugs at the type level; you can only move the value by calling .next().
  • **next_if is the power move:** iter.next_if(|v| pred(v)) atomically peeks and conditionally advances — perfect for tokenizer digit runs. There is no equivalent in OCaml's standard library; you'd call peek then next manually, risking logic errors.
  • No mutation of outer state: OCaml's manual peekable requires a mutable peeked field and a captured ref to the list tail. Rust's Peekable<I> owns its internal buffer with no external ref cells — mutation is localized inside the iterator wrapper.
  • Composability: Because Peekable<I> implements Iterator, it chains with .map(), .filter(), .take_while(), etc. without losing lookahead capability. OCaml's custom record does not compose with List higher-order functions directly.
  • When to Use Each Style

    **Use .peek() when:** you need to inspect the next value and make a branching decision — parsers, tokenizers, run-length grouping — before committing to consumption.

    **Use .next_if(pred) when:** the advance decision is a pure predicate; this eliminates the peek-then-next two-step and makes intent explicit in one combinator call.

    Exercises

  • Implement a simple arithmetic expression tokenizer for +, -, *, /, integers, and parentheses using .peekable().
  • Write merge_sorted_peekable that merges two sorted peekable iterators into a single sorted sequence without collecting.
  • Implement parse_csv_field that uses a peekable char iterator to correctly handle quoted fields containing commas.
  • Open Source Repos