ExamplesBy LevelBy TopicLearning Paths
102 Fundamental

102-frequency-counter — Frequency Counter

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "102-frequency-counter — Frequency Counter" functional Rust example. Difficulty level: Fundamental. Key concepts covered: Functional Programming. Counting the frequency of items in a collection is one of the most common data processing operations: word frequency in text analysis, character histograms in compression, event counts in telemetry. Key difference from OCaml: 1. **Mutability model**: Rust's `HashMap` is imperatively mutated via the entry API; OCaml's `Map` produces new persistent versions on each update.

Tutorial

The Problem

Counting the frequency of items in a collection is one of the most common data processing operations: word frequency in text analysis, character histograms in compression, event counts in telemetry. The pattern requires a map from item to count that inserts a zero for new keys and increments existing ones atomically.

Rust's Entry API makes this pattern concise and allocation-efficient. OCaml's Map.Make module provides an immutable tree-based equivalent with different trade-offs.

🎯 Learning Outcomes

  • • Use HashMap::entry().or_insert(0) to count occurrences idiomatically
  • • Use BTreeMap when sorted iteration order is required
  • • Compare the mutable HashMap approach to a functional fold approach
  • • Understand the performance implications of hash maps versus tree maps
  • • Build reusable frequency-counting utilities
  • Code Example

    pub fn word_freq(text: &str) -> HashMap<String, usize> {
        let mut freq = HashMap::new();
        for word in text.split_whitespace() {
            *freq.entry(word.to_lowercase()).or_insert(0) += 1;
        }
        freq
    }

    Key Differences

  • Mutability model: Rust's HashMap is imperatively mutated via the entry API; OCaml's Map produces new persistent versions on each update.
  • Sorted order: OCaml's Map.Make always iterates in key order; Rust's HashMap has random order and requires BTreeMap for sorted iteration.
  • Entry API: Rust's entry().or_insert(0) is a single lookup; OCaml's find + add is conceptually two operations but O(log n) per structural sharing update.
  • Memory: Rust's HashMap is compact and cache-friendly; OCaml's tree map uses more pointer-chasing but enables sharing between versions.
  • OCaml Approach

    OCaml's standard Map.Make(String) creates an immutable, sorted map:

    module StringMap = Map.Make(String)
    
    let word_freq text =
      String.split_on_char ' ' text
      |> List.fold_left (fun acc w ->
        let count = try StringMap.find w acc with Not_found -> 0 in
        StringMap.add w (count + 1) acc
      ) StringMap.empty
    

    Every update produces a new map via structural sharing, so intermediate maps are not copied entirely. The Hashtbl module provides mutable hash tables analogous to Rust's HashMap.

    Full Source

    #![allow(clippy::all)]
    //! # Map Module — Frequency Counter
    //!
    //! Count word frequencies using a map. OCaml's `Map.Make(String)` creates
    //! an immutable tree map; Rust's `HashMap` is the standard mutable map.
    
    use std::collections::BTreeMap;
    use std::collections::HashMap;
    
    // ---------------------------------------------------------------------------
    // Approach A: HashMap (idiomatic Rust)
    // ---------------------------------------------------------------------------
    
    pub fn word_freq_hashmap(text: &str) -> HashMap<String, usize> {
        let mut freq = HashMap::new();
        for word in text.split_whitespace() {
            let w = word.to_lowercase();
            *freq.entry(w).or_insert(0) += 1;
        }
        freq
    }
    
    // ---------------------------------------------------------------------------
    // Approach B: BTreeMap (sorted, like OCaml's Map)
    // ---------------------------------------------------------------------------
    
    pub fn word_freq_btree(text: &str) -> BTreeMap<String, usize> {
        text.split_whitespace()
            .map(|w| w.to_lowercase())
            .fold(BTreeMap::new(), |mut acc, w| {
                *acc.entry(w).or_insert(0) += 1;
                acc
            })
    }
    
    // ---------------------------------------------------------------------------
    // Approach C: Functional — collect into counts
    // ---------------------------------------------------------------------------
    
    pub fn word_freq_functional(text: &str) -> BTreeMap<String, usize> {
        let words: Vec<String> = text.split_whitespace().map(|w| w.to_lowercase()).collect();
    
        let mut sorted = words.clone();
        sorted.sort();
    
        let mut result = BTreeMap::new();
        let mut i = 0;
        while i < sorted.len() {
            let word = &sorted[i];
            let count = sorted[i..].iter().take_while(|w| *w == word).count();
            result.insert(word.clone(), count);
            i += count;
        }
        result
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_basic_freq() {
            let freq = word_freq_hashmap("the cat sat on the mat the cat");
            assert_eq!(freq["the"], 3);
            assert_eq!(freq["cat"], 2);
            assert_eq!(freq["sat"], 1);
        }
    
        #[test]
        fn test_btree_sorted() {
            let freq = word_freq_btree("the cat sat on the mat the cat");
            let keys: Vec<&String> = freq.keys().collect();
            assert_eq!(keys, vec!["cat", "mat", "on", "sat", "the"]);
        }
    
        #[test]
        fn test_empty() {
            let freq = word_freq_hashmap("");
            assert!(freq.is_empty());
        }
    
        #[test]
        fn test_case_insensitive() {
            let freq = word_freq_hashmap("The THE the");
            assert_eq!(freq["the"], 3);
        }
    
        #[test]
        fn test_functional_matches() {
            let a = word_freq_btree("hello world hello");
            let b = word_freq_functional("hello world hello");
            assert_eq!(a, b);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_basic_freq() {
            let freq = word_freq_hashmap("the cat sat on the mat the cat");
            assert_eq!(freq["the"], 3);
            assert_eq!(freq["cat"], 2);
            assert_eq!(freq["sat"], 1);
        }
    
        #[test]
        fn test_btree_sorted() {
            let freq = word_freq_btree("the cat sat on the mat the cat");
            let keys: Vec<&String> = freq.keys().collect();
            assert_eq!(keys, vec!["cat", "mat", "on", "sat", "the"]);
        }
    
        #[test]
        fn test_empty() {
            let freq = word_freq_hashmap("");
            assert!(freq.is_empty());
        }
    
        #[test]
        fn test_case_insensitive() {
            let freq = word_freq_hashmap("The THE the");
            assert_eq!(freq["the"], 3);
        }
    
        #[test]
        fn test_functional_matches() {
            let a = word_freq_btree("hello world hello");
            let b = word_freq_functional("hello world hello");
            assert_eq!(a, b);
        }
    }

    Deep Comparison

    Comparison: Frequency Counter — OCaml vs Rust

    Core Insight

    OCaml's Map is immutable — SMap.add returns a new map, leaving the original unchanged. Rust's HashMap is mutable, and the entry API is its killer feature: freq.entry(w).or_insert(0) += 1 does a single lookup for both read and write, whereas OCaml's find + add does two tree traversals.

    OCaml

    module SMap = Map.Make(String)
    let word_freq text =
      text |> String.split_on_char ' '
      |> List.fold_left (fun acc w ->
        let count = try SMap.find w acc with Not_found -> 0 in
        SMap.add w (count + 1) acc
      ) SMap.empty
    

    Rust

    pub fn word_freq(text: &str) -> HashMap<String, usize> {
        let mut freq = HashMap::new();
        for word in text.split_whitespace() {
            *freq.entry(word.to_lowercase()).or_insert(0) += 1;
        }
        freq
    }
    

    Comparison Table

    AspectOCamlRust
    Map typeMap.Make(String) (tree)HashMap (hash) / BTreeMap (tree)
    MutabilityImmutable (new map each add)Mutable in-place
    Lookup+updatefind + add (2 traversals)entry().or_insert() (1 lookup)
    Missing keyNot_found exceptionor_insert(default)
    OrderingSorted (tree-based)Unordered (HashMap) / Sorted (BTreeMap)
    Functor neededYes: Map.Make(String)No: generics handle it

    Learner Notes

  • Entry API: Rust's most powerful map feature — avoids the check-then-insert anti-pattern
  • • **BTreeMap** ≈ OCaml's Map: both are balanced trees with O(log n) ops and sorted iteration
  • • **HashMap** is O(1) average but unordered — use BTreeMap when you need sorted keys
  • Exception vs default: OCaml's Not_found requires try/with; Rust's or_insert is cleaner
  • Ownership: entry() takes ownership of the key — use String, not &str
  • Exercises

  • Extend word_freq_hashmap to normalize words by stripping punctuation before counting.
  • Write a top_n(freq: &HashMap<String, usize>, n: usize) -> Vec<(String, usize)> function that returns the n most frequent words in descending order.
  • Implement a merge_frequencies function that combines two frequency maps, summing counts for keys that appear in both.
  • Open Source Repos