ExamplesBy LevelBy TopicLearning Paths
276 Intermediate

Parallel Letter Frequency

Higher-Order Functions

Tutorial Video

Text description (accessibility)

This video demonstrates the "Parallel Letter Frequency" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Higher-Order Functions. Count the frequency of each letter across multiple texts using a map-reduce pattern: map each text to a frequency table, then reduce (merge) all tables into one combined result. Key difference from OCaml: 1. **Map type:** OCaml uses `Map.Make(Char)` (ordered, functor

Tutorial

The Problem

Count the frequency of each letter across multiple texts using a map-reduce pattern: map each text to a frequency table, then reduce (merge) all tables into one combined result.

🎯 Learning Outcomes

  • • How to implement map-reduce with iterators and fold
  • • Using HashMap::entry API for efficient in-place updates
  • • Translating OCaml's Map.Make functor to Rust's HashMap
  • • Pattern matching on slices for recursive decomposition
  • 🦀 The Rust Way

    Rust uses HashMap<char, usize> with the entry API for ergonomic insert-or-update. The iterator chain .iter().map().fold() mirrors OCaml's pipeline. A recursive variant uses slice pattern matching [head, rest @ ..].

    Code Example

    fn letter_freq(s: &str) -> HashMap<char, usize> {
        s.chars().fold(HashMap::new(), |mut map, c| {
            let c = c.to_ascii_lowercase();
            if c.is_ascii_lowercase() {
                *map.entry(c).or_insert(0) += 1;
            }
            map
        })
    }
    
    fn merge_maps(mut a: HashMap<char, usize>, b: &HashMap<char, usize>) -> HashMap<char, usize> {
        for (&ch, &count) in b {
            *a.entry(ch).or_insert(0) += count;
        }
        a
    }
    
    fn parallel_frequency(texts: &[&str]) -> HashMap<char, usize> {
        texts.iter().map(|t| letter_freq(t)).fold(HashMap::new(), |acc, f| merge_maps(acc, &f))
    }

    Key Differences

  • Map type: OCaml uses Map.Make(Char) (ordered, functor-generated); Rust uses HashMap (unordered, generic)
  • Entry API: OCaml's CMap.update takes an option -> option function; Rust's entry().or_insert() is more ergonomic for counters
  • Merge strategy: OCaml's CMap.union takes a 3-argument merge function; Rust requires manual iteration over entries
  • Mutability: OCaml maps are immutable (each operation returns new map); Rust's HashMap is mutated in place for efficiency
  • OCaml Approach

    OCaml uses a functor-generated Map.Make(Char) for an ordered char map. String.fold_left builds per-text frequency maps, and CMap.union with a merge function combines them. The pipeline |> List.map |> List.fold_left is classic map-reduce.

    Full Source

    #![allow(clippy::all)]
    use std::collections::HashMap;
    
    /// Count letter frequencies in a single string, ignoring non-alphabetic characters.
    /// All letters are lowercased before counting.
    ///
    /// # Idiomatic Rust — fold over chars into a HashMap
    pub fn letter_freq(s: &str) -> HashMap<char, usize> {
        s.chars().fold(HashMap::new(), |mut map, c| {
            let c = c.to_ascii_lowercase();
            if c.is_ascii_lowercase() {
                *map.entry(c).or_insert(0) += 1;
            }
            map
        })
    }
    
    /// Merge two frequency maps by summing counts.
    pub fn merge_maps(mut a: HashMap<char, usize>, b: &HashMap<char, usize>) -> HashMap<char, usize> {
        for (&ch, &count) in b {
            *a.entry(ch).or_insert(0) += count;
        }
        a
    }
    
    /// Map-reduce: compute letter frequencies across multiple texts.
    ///
    /// Maps each text to its frequency map, then folds (reduces) all maps
    /// into one by merging counts — the classic map-reduce pattern.
    pub fn parallel_frequency(texts: &[&str]) -> HashMap<char, usize> {
        texts
            .iter()
            .map(|text| letter_freq(text))
            .fold(HashMap::new(), |acc, freq| merge_maps(acc, &freq))
    }
    
    /// Functional/recursive version — processes texts recursively instead of with fold.
    pub fn parallel_frequency_recursive(texts: &[&str]) -> HashMap<char, usize> {
        match texts {
            [] => HashMap::new(),
            [single] => letter_freq(single),
            [head, rest @ ..] => {
                let head_freq = letter_freq(head);
                let rest_freq = parallel_frequency_recursive(rest);
                merge_maps(head_freq, &rest_freq)
            }
        }
    }
    
    /// Iterator-chain version using `for_each` for accumulation.
    pub fn parallel_frequency_iter(texts: &[&str]) -> HashMap<char, usize> {
        let mut combined = HashMap::new();
        texts.iter().for_each(|text| {
            for c in text.chars() {
                let c = c.to_ascii_lowercase();
                if c.is_ascii_lowercase() {
                    *combined.entry(c).or_insert(0) += 1;
                }
            }
        });
        combined
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_empty_input() {
            assert!(parallel_frequency(&[]).is_empty());
            assert!(parallel_frequency(&[""]).is_empty());
        }
    
        #[test]
        fn test_single_text() {
            let freq = parallel_frequency(&["aab"]);
            assert_eq!(freq[&'a'], 2);
            assert_eq!(freq[&'b'], 1);
        }
    
        #[test]
        fn test_multiple_texts() {
            let freq = parallel_frequency(&["Hello World", "Functional Programming", "OCaml is Great"]);
            // 'l' appears in "Hello World" (3) + "Functional" (2) + "OCaml" (1) + "Great" (0) = 6
            // Let's just check some known chars
            assert_eq!(freq[&'o'], 5); // HellO WOrld, FunctiOnal PrOgramming, OCaml is Great
            assert!(freq[&'h'] >= 1);
        }
    
        #[test]
        fn test_ignores_non_alpha() {
            let freq = parallel_frequency(&["123!!!", "a1b2c3"]);
            assert_eq!(freq.len(), 3); // only a, b, c
            assert_eq!(freq[&'a'], 1);
        }
    
        #[test]
        fn test_case_insensitive() {
            let freq = parallel_frequency(&["AaBb"]);
            assert_eq!(freq[&'a'], 2);
            assert_eq!(freq[&'b'], 2);
        }
    
        #[test]
        fn test_recursive_matches_iterative() {
            let texts = &["Hello", "World", "Rust"];
            let iter_result = parallel_frequency(texts);
            let rec_result = parallel_frequency_recursive(texts);
            assert_eq!(iter_result, rec_result);
        }
    
        #[test]
        fn test_iter_version_matches() {
            let texts = &["Hello", "World"];
            let fold_result = parallel_frequency(texts);
            let iter_result = parallel_frequency_iter(texts);
            assert_eq!(fold_result, iter_result);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_empty_input() {
            assert!(parallel_frequency(&[]).is_empty());
            assert!(parallel_frequency(&[""]).is_empty());
        }
    
        #[test]
        fn test_single_text() {
            let freq = parallel_frequency(&["aab"]);
            assert_eq!(freq[&'a'], 2);
            assert_eq!(freq[&'b'], 1);
        }
    
        #[test]
        fn test_multiple_texts() {
            let freq = parallel_frequency(&["Hello World", "Functional Programming", "OCaml is Great"]);
            // 'l' appears in "Hello World" (3) + "Functional" (2) + "OCaml" (1) + "Great" (0) = 6
            // Let's just check some known chars
            assert_eq!(freq[&'o'], 5); // HellO WOrld, FunctiOnal PrOgramming, OCaml is Great
            assert!(freq[&'h'] >= 1);
        }
    
        #[test]
        fn test_ignores_non_alpha() {
            let freq = parallel_frequency(&["123!!!", "a1b2c3"]);
            assert_eq!(freq.len(), 3); // only a, b, c
            assert_eq!(freq[&'a'], 1);
        }
    
        #[test]
        fn test_case_insensitive() {
            let freq = parallel_frequency(&["AaBb"]);
            assert_eq!(freq[&'a'], 2);
            assert_eq!(freq[&'b'], 2);
        }
    
        #[test]
        fn test_recursive_matches_iterative() {
            let texts = &["Hello", "World", "Rust"];
            let iter_result = parallel_frequency(texts);
            let rec_result = parallel_frequency_recursive(texts);
            assert_eq!(iter_result, rec_result);
        }
    
        #[test]
        fn test_iter_version_matches() {
            let texts = &["Hello", "World"];
            let fold_result = parallel_frequency(texts);
            let iter_result = parallel_frequency_iter(texts);
            assert_eq!(fold_result, iter_result);
        }
    }

    Deep Comparison

    OCaml vs Rust: Parallel Letter Frequency

    Side-by-Side Code

    OCaml

    module CMap = Map.Make(Char)
    
    let letter_freq s =
      String.fold_left (fun m c ->
        let c = Char.lowercase_ascii c in
        if c >= 'a' && c <= 'z' then
          CMap.update c (function None -> Some 1 | Some n -> Some (n+1)) m
        else m
      ) CMap.empty s
    
    let merge_maps =
      CMap.union (fun _ a b -> Some (a + b))
    
    let parallel_frequency texts =
      texts |> List.map letter_freq |> List.fold_left merge_maps CMap.empty
    

    Rust (idiomatic)

    fn letter_freq(s: &str) -> HashMap<char, usize> {
        s.chars().fold(HashMap::new(), |mut map, c| {
            let c = c.to_ascii_lowercase();
            if c.is_ascii_lowercase() {
                *map.entry(c).or_insert(0) += 1;
            }
            map
        })
    }
    
    fn merge_maps(mut a: HashMap<char, usize>, b: &HashMap<char, usize>) -> HashMap<char, usize> {
        for (&ch, &count) in b {
            *a.entry(ch).or_insert(0) += count;
        }
        a
    }
    
    fn parallel_frequency(texts: &[&str]) -> HashMap<char, usize> {
        texts.iter().map(|t| letter_freq(t)).fold(HashMap::new(), |acc, f| merge_maps(acc, &f))
    }
    

    Rust (functional/recursive)

    fn parallel_frequency_recursive(texts: &[&str]) -> HashMap<char, usize> {
        match texts {
            [] => HashMap::new(),
            [single] => letter_freq(single),
            [head, rest @ ..] => {
                let head_freq = letter_freq(head);
                let rest_freq = parallel_frequency_recursive(rest);
                merge_maps(head_freq, &rest_freq)
            }
        }
    }
    

    Type Signatures

    ConceptOCamlRust
    Frequency mapchar CMap.t (via Map.Make(Char))HashMap<char, usize>
    Count typeintusize
    Text liststring list&[&str] (slice of string slices)
    Update functionCMap.update : key -> (val option -> val option) -> map -> mapmap.entry(key).or_insert(default) returns &mut V
    Merge functionCMap.union : (key -> val -> val -> val option) -> map -> map -> mapManual iteration + entry API

    Key Insights

  • Functor vs generic: OCaml needs Map.Make(Char) to create a char-keyed map module; Rust's HashMap<K, V> is generic out of the box — no functor ceremony needed
  • Immutable vs mutable maps: OCaml's CMap.update returns a new map each time (persistent data structure); Rust mutates the HashMap in place, which is more cache-friendly
  • Entry API elegance: Rust's entry().or_insert() pattern replaces OCaml's function None -> Some 1 | Some n -> Some (n+1) — less boilerplate for the common "upsert" pattern
  • Union vs manual merge: OCaml's CMap.union is a single elegant call with a merge function; Rust has no built-in HashMap merge, requiring explicit iteration
  • Pipeline preservation: Both languages express map-reduce cleanly — OCaml with |> pipeline, Rust with .iter().map().fold() method chain
  • When to Use Each Style

    Use idiomatic Rust when: You want maximum performance — in-place mutation avoids allocation overhead of creating new maps on every insert. This is the natural Rust approach for frequency counting.

    Use recursive Rust when: Teaching the map-reduce concept explicitly — the recursive decomposition [head, rest @ ..] makes the "divide and conquer" structure visible, matching OCaml's mental model of list processing.

    Exercises

  • Extend the parallel frequency counter to handle Unicode code points rather than ASCII bytes, using char as the key type.
  • Benchmark the parallel version against the sequential version for texts of 1 KB, 100 KB, and 10 MB, and explain at what size parallelism starts to pay off.
  • Generalize the parallel aggregation to a generic parallel_fold function that splits any Vec into chunks, processes them in parallel, and combines results using a monoid — then use it to compute both letter frequency and total word count in one pass.
  • Open Source Repos