Parallel Letter Frequency
Tutorial Video
Text description (accessibility)
This video demonstrates the "Parallel Letter Frequency" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Higher-Order Functions. Count the frequency of each letter across multiple texts using a map-reduce pattern: map each text to a frequency table, then reduce (merge) all tables into one combined result. Key difference from OCaml: OCaml uses `Map.Make(Char)`, an ordered, functor-generated map, while Rust uses the generic, unordered `HashMap`.
Tutorial
The Problem
Count the frequency of each letter across multiple texts using a map-reduce pattern: map each text to a frequency table, then reduce (merge) all tables into one combined result.
🎯 Learning Outcomes
- Build frequency tables with `fold`
- Use the `HashMap::entry` API for efficient in-place updates
- Translate OCaml's `Map.Make` functor to Rust's `HashMap`

🦀 The Rust Way
Rust uses HashMap<char, usize> with the entry API for ergonomic insert-or-update. The iterator chain .iter().map().fold() mirrors OCaml's pipeline. A recursive variant uses slice pattern matching [head, rest @ ..].
Code Example
fn letter_freq(s: &str) -> HashMap<char, usize> {
    s.chars().fold(HashMap::new(), |mut map, c| {
        let c = c.to_ascii_lowercase();
        if c.is_ascii_lowercase() {
            *map.entry(c).or_insert(0) += 1;
        }
        map
    })
}

fn merge_maps(mut a: HashMap<char, usize>, b: &HashMap<char, usize>) -> HashMap<char, usize> {
    for (&ch, &count) in b {
        *a.entry(ch).or_insert(0) += count;
    }
    a
}

fn parallel_frequency(texts: &[&str]) -> HashMap<char, usize> {
    texts
        .iter()
        .map(|t| letter_freq(t))
        .fold(HashMap::new(), |acc, f| merge_maps(acc, &f))
}

Key Differences
- OCaml uses `Map.Make(Char)` (ordered, functor-generated); Rust uses `HashMap` (unordered, generic)
- `CMap.update` takes an `option -> option` function; Rust's `entry().or_insert()` is more ergonomic for counters
- `CMap.union` takes a 3-argument merge function; Rust requires manual iteration over entries
- Rust's `HashMap` is mutated in place for efficiency

OCaml Approach
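The ordered-vs-unordered difference is easy to bridge: if iteration in key order (as with OCaml's `Map.Make(Char)`) matters, `std::collections::BTreeMap` is a drop-in replacement for `HashMap` here, since it exposes the same `entry` API. A minimal sketch (the function name `letter_freq_ordered` is ours, not from the original code):

```rust
use std::collections::BTreeMap;

// Ordered variant: BTreeMap keeps keys sorted, matching OCaml's
// Map.Make(Char) iteration order, at the cost of O(log n) operations.
fn letter_freq_ordered(s: &str) -> BTreeMap<char, usize> {
    s.chars().fold(BTreeMap::new(), |mut map, c| {
        let c = c.to_ascii_lowercase();
        if c.is_ascii_lowercase() {
            *map.entry(c).or_insert(0) += 1;
        }
        map
    })
}

fn main() {
    let freq = letter_freq_ordered("banana");
    // BTreeMap iterates in key order: a, b, n
    let keys: Vec<char> = freq.keys().copied().collect();
    assert_eq!(keys, vec!['a', 'b', 'n']);
    assert_eq!(freq[&'a'], 3);
}
```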
OCaml uses a functor-generated Map.Make(Char) for an ordered char map. String.fold_left builds per-text frequency maps, and CMap.union with a merge function combines them. The pipeline |> List.map |> List.fold_left is classic map-reduce.
Full Source
#![allow(clippy::all)]
use std::collections::HashMap;

/// Count letter frequencies in a single string, ignoring non-alphabetic characters.
/// All letters are lowercased before counting.
///
/// # Idiomatic Rust — fold over chars into a HashMap
pub fn letter_freq(s: &str) -> HashMap<char, usize> {
    s.chars().fold(HashMap::new(), |mut map, c| {
        let c = c.to_ascii_lowercase();
        if c.is_ascii_lowercase() {
            *map.entry(c).or_insert(0) += 1;
        }
        map
    })
}

/// Merge two frequency maps by summing counts.
pub fn merge_maps(mut a: HashMap<char, usize>, b: &HashMap<char, usize>) -> HashMap<char, usize> {
    for (&ch, &count) in b {
        *a.entry(ch).or_insert(0) += count;
    }
    a
}

/// Map-reduce: compute letter frequencies across multiple texts.
///
/// Maps each text to its frequency map, then folds (reduces) all maps
/// into one by merging counts — the classic map-reduce pattern.
pub fn parallel_frequency(texts: &[&str]) -> HashMap<char, usize> {
    texts
        .iter()
        .map(|text| letter_freq(text))
        .fold(HashMap::new(), |acc, freq| merge_maps(acc, &freq))
}

/// Functional/recursive version — processes texts recursively instead of with fold.
pub fn parallel_frequency_recursive(texts: &[&str]) -> HashMap<char, usize> {
    match texts {
        [] => HashMap::new(),
        [single] => letter_freq(single),
        [head, rest @ ..] => {
            let head_freq = letter_freq(head);
            let rest_freq = parallel_frequency_recursive(rest);
            merge_maps(head_freq, &rest_freq)
        }
    }
}

/// Iterator-chain version using `for_each` for accumulation.
pub fn parallel_frequency_iter(texts: &[&str]) -> HashMap<char, usize> {
    let mut combined = HashMap::new();
    texts.iter().for_each(|text| {
        for c in text.chars() {
            let c = c.to_ascii_lowercase();
            if c.is_ascii_lowercase() {
                *combined.entry(c).or_insert(0) += 1;
            }
        }
    });
    combined
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_empty_input() {
        assert!(parallel_frequency(&[]).is_empty());
        assert!(parallel_frequency(&[""]).is_empty());
    }

    #[test]
    fn test_single_text() {
        let freq = parallel_frequency(&["aab"]);
        assert_eq!(freq[&'a'], 2);
        assert_eq!(freq[&'b'], 1);
    }

    #[test]
    fn test_multiple_texts() {
        let freq = parallel_frequency(&["Hello World", "Functional Programming", "OCaml is Great"]);
        // 'o' appears in "Hello World" (2) + "Functional Programming" (2) + "OCaml is Great" (1) = 5
        assert_eq!(freq[&'o'], 5);
        assert!(freq[&'h'] >= 1);
    }

    #[test]
    fn test_ignores_non_alpha() {
        let freq = parallel_frequency(&["123!!!", "a1b2c3"]);
        assert_eq!(freq.len(), 3); // only a, b, c
        assert_eq!(freq[&'a'], 1);
    }

    #[test]
    fn test_case_insensitive() {
        let freq = parallel_frequency(&["AaBb"]);
        assert_eq!(freq[&'a'], 2);
        assert_eq!(freq[&'b'], 2);
    }

    #[test]
    fn test_recursive_matches_iterative() {
        let texts = &["Hello", "World", "Rust"];
        let iter_result = parallel_frequency(texts);
        let rec_result = parallel_frequency_recursive(texts);
        assert_eq!(iter_result, rec_result);
    }

    #[test]
    fn test_iter_version_matches() {
        let texts = &["Hello", "World"];
        let fold_result = parallel_frequency(texts);
        let iter_result = parallel_frequency_iter(texts);
        assert_eq!(fold_result, iter_result);
    }
}
Deep Comparison
OCaml vs Rust: Parallel Letter Frequency
Side-by-Side Code
OCaml
module CMap = Map.Make(Char)

let letter_freq s =
  String.fold_left (fun m c ->
    let c = Char.lowercase_ascii c in
    if c >= 'a' && c <= 'z' then
      CMap.update c (function None -> Some 1 | Some n -> Some (n + 1)) m
    else m
  ) CMap.empty s

let merge_maps =
  CMap.union (fun _ a b -> Some (a + b))

let parallel_frequency texts =
  texts |> List.map letter_freq |> List.fold_left merge_maps CMap.empty
Rust (idiomatic)
fn letter_freq(s: &str) -> HashMap<char, usize> {
    s.chars().fold(HashMap::new(), |mut map, c| {
        let c = c.to_ascii_lowercase();
        if c.is_ascii_lowercase() {
            *map.entry(c).or_insert(0) += 1;
        }
        map
    })
}

fn merge_maps(mut a: HashMap<char, usize>, b: &HashMap<char, usize>) -> HashMap<char, usize> {
    for (&ch, &count) in b {
        *a.entry(ch).or_insert(0) += count;
    }
    a
}

fn parallel_frequency(texts: &[&str]) -> HashMap<char, usize> {
    texts
        .iter()
        .map(|t| letter_freq(t))
        .fold(HashMap::new(), |acc, f| merge_maps(acc, &f))
}
Rust (functional/recursive)
fn parallel_frequency_recursive(texts: &[&str]) -> HashMap<char, usize> {
    match texts {
        [] => HashMap::new(),
        [single] => letter_freq(single),
        [head, rest @ ..] => {
            let head_freq = letter_freq(head);
            let rest_freq = parallel_frequency_recursive(rest);
            merge_maps(head_freq, &rest_freq)
        }
    }
}
Type Signatures
| Concept | OCaml | Rust |
|---|---|---|
| Frequency map | `int CMap.t` (via `Map.Make(Char)`) | `HashMap<char, usize>` |
| Count type | `int` | `usize` |
| Text list | `string list` | `&[&str]` (slice of string slices) |
| Update function | `CMap.update : key -> ('a option -> 'a option) -> 'a t -> 'a t` | `map.entry(key).or_insert(default)` returns `&mut V` |
| Merge function | `CMap.union : (key -> 'a -> 'a -> 'a option) -> 'a t -> 'a t -> 'a t` | manual iteration + `entry` API |
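The `&mut V` in the update row is the key ergonomic point: `entry().or_insert()` hands back a mutable reference into the map, so the counter can be incremented in place. A tiny standalone demonstration:

```rust
use std::collections::HashMap;

fn main() {
    let mut map: HashMap<char, usize> = HashMap::new();

    // entry().or_insert() returns &mut usize, which can be incremented directly:
    let slot: &mut usize = map.entry('a').or_insert(0);
    *slot += 1;
    assert_eq!(map[&'a'], 1);

    // or_insert only uses its default when the key is absent:
    *map.entry('a').or_insert(99) += 1;
    assert_eq!(map[&'a'], 2); // existing value incremented, 99 ignored
}
```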
Key Insights
- OCaml needs `Map.Make(Char)` to create a char-keyed map module; Rust's `HashMap<K, V>` is generic out of the box, with no functor ceremony needed
- `CMap.update` returns a new map each time (persistent data structure); Rust mutates the `HashMap` in place, which is more cache-friendly
- Rust's `entry().or_insert()` pattern replaces OCaml's `function None -> Some 1 | Some n -> Some (n+1)`, with less boilerplate for the common "upsert" pattern
- `CMap.union` is a single elegant call with a merge function; Rust has no built-in `HashMap` merge, requiring explicit iteration
- OCaml expresses the map-reduce with a `|>` pipeline, Rust with the `.iter().map().fold()` method chain

When to Use Each Style
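One caveat behind the "no built-in merge" point: `HashMap` does have `extend`, but on key collision it overwrites rather than combines, so it is the wrong tool for summing counts. A sketch contrasting the two (the helper `merge_sum` is our name, equivalent to the tutorial's `merge_maps`):

```rust
use std::collections::HashMap;

// Sums counts from `b` into `a` via the entry API.
fn merge_sum(mut a: HashMap<char, usize>, b: &HashMap<char, usize>) -> HashMap<char, usize> {
    for (&k, &v) in b {
        *a.entry(k).or_insert(0) += v;
    }
    a
}

fn main() {
    let a: HashMap<char, usize> = [('a', 2)].into_iter().collect();
    let b: HashMap<char, usize> = [('a', 3)].into_iter().collect();

    // extend() overwrites on key collision; it does NOT sum counts:
    let mut overwritten = a.clone();
    overwritten.extend(b.iter().map(|(&k, &v)| (k, v)));
    assert_eq!(overwritten[&'a'], 3); // b's value replaced a's

    // the entry-based merge sums as intended:
    let merged = merge_sum(a, &b);
    assert_eq!(merged[&'a'], 5);
}
```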
Use idiomatic Rust when: You want maximum performance — in-place mutation avoids allocation overhead of creating new maps on every insert. This is the natural Rust approach for frequency counting.
Use recursive Rust when: Teaching the map-reduce concept explicitly — the recursive decomposition [head, rest @ ..] makes the "divide and conquer" structure visible, matching OCaml's mental model of list processing.
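All the variants above run sequentially despite the exercise's name. Actual parallelism fits the same map-reduce shape: map each text on its own thread, then reduce the per-text maps on the main thread. A minimal sketch using scoped threads (requires Rust 1.63+; the function name `parallel_frequency_threads` is ours):

```rust
use std::collections::HashMap;
use std::thread;

fn letter_freq(s: &str) -> HashMap<char, usize> {
    s.chars().fold(HashMap::new(), |mut map, c| {
        let c = c.to_ascii_lowercase();
        if c.is_ascii_lowercase() {
            *map.entry(c).or_insert(0) += 1;
        }
        map
    })
}

// One thread per text via scoped threads; join() returns each per-text
// map, which is then reduced into one combined map on the main thread.
fn parallel_frequency_threads(texts: &[&str]) -> HashMap<char, usize> {
    thread::scope(|s| {
        // collect() forces all threads to spawn before any join.
        let handles: Vec<_> = texts
            .iter()
            .map(|&t| s.spawn(move || letter_freq(t)))
            .collect();
        handles.into_iter().fold(HashMap::new(), |mut acc, h| {
            for (ch, n) in h.join().unwrap() {
                *acc.entry(ch).or_insert(0) += n;
            }
            acc
        })
    })
}

fn main() {
    let freq = parallel_frequency_threads(&["Hello", "World"]);
    assert_eq!(freq[&'l'], 3);
    assert_eq!(freq[&'o'], 2);
}
```

One thread per text is fine for a demo; a production version would chunk the work across a bounded number of threads (or reach for a data-parallelism crate).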
Exercises
- … `char` as the key type.
- Write a `parallel_fold` function that splits any `Vec` into chunks, processes them in parallel, and combines results using a monoid; then use it to compute both letter frequency and total word count in one pass.