ExamplesBy LevelBy TopicLearning Paths
1031 Intermediate

1031-hashmap-counting — Frequency Counting with HashMap

Functional Programming

Tutorial

The Problem

Frequency counting is the foundation of histogram generation, text analysis, log aggregation, and A/B test result collection. The operation reads each item from a stream and increments a counter in a map, requiring an efficient insert-or-increment primitive.

Rust's Entry API provides and_modify(|c| *c += 1).or_insert(1) as the canonical single-lookup pattern. This example explores all the Entry API variants for counting, showing how each is optimized for different use cases.

🎯 Learning Outcomes

  • • Count character frequencies with entry(ch).or_insert(0) and *count += 1
  • • Use and_modify(|c| *c += 1).or_insert(1) for combined increment-or-insert
  • • Find the most frequent element after counting
  • • Use HashMap::iter() to iterate and sort counts
  • • Understand entry vs re-lookup performance characteristics
  • Code Example

    #![allow(clippy::all)]
    // 1031: Count Frequencies
    // The classic frequency counting pattern with Entry API
    
    use std::collections::HashMap;
    
    /// Count character frequencies
    fn char_frequency() {
        let text = "abracadabra";
    
        let mut counts: HashMap<char, usize> = HashMap::new();
        for ch in text.chars() {
            *counts.entry(ch).or_insert(0) += 1;
        }
    
        assert_eq!(counts[&'a'], 5);
        assert_eq!(counts[&'b'], 2);
        assert_eq!(counts[&'r'], 2);
        assert_eq!(counts[&'c'], 1);
        assert_eq!(counts[&'d'], 1);
    }
    
    /// Count word frequencies using and_modify
    fn word_frequency() {
        let words = vec!["the", "cat", "sat", "on", "the", "mat", "the", "cat"];
    
        let mut counts: HashMap<&str, usize> = HashMap::new();
        for word in &words {
            counts.entry(word).and_modify(|c| *c += 1).or_insert(1);
        }
    
        assert_eq!(counts["the"], 3);
        assert_eq!(counts["cat"], 2);
        assert_eq!(counts["sat"], 1);
    }
    
    /// Find the most frequent element
    fn most_frequent() {
        let items = vec![1, 2, 3, 2, 1, 2, 3, 2, 2];
    
        let mut counts: HashMap<i32, usize> = HashMap::new();
        for &x in &items {
            *counts.entry(x).or_insert(0) += 1;
        }
    
        let (most, count) = counts
            .iter()
            .max_by_key(|&(_, &v)| v)
            .map(|(&k, &v)| (k, v))
            .unwrap();
    
        assert_eq!(most, 2);
        assert_eq!(count, 5);
    }
    
    /// Frequency counting with iterators (functional style)
    fn functional_counting() {
        let data = vec![1, 1, 2, 3, 3, 3];
    
        let counts: HashMap<i32, usize> = data.iter().fold(HashMap::new(), |mut acc, &x| {
            *acc.entry(x).or_insert(0) += 1;
            acc
        });
    
        assert_eq!(counts[&1], 2);
        assert_eq!(counts[&3], 3);
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_char_frequency() {
            char_frequency();
        }
    
        #[test]
        fn test_word_frequency() {
            word_frequency();
        }
    
        #[test]
        fn test_most_frequent() {
            most_frequent();
        }
    
        #[test]
        fn test_functional_counting() {
            functional_counting();
        }
    
        #[test]
        fn test_empty_input() {
            let empty: Vec<i32> = vec![];
            let counts: HashMap<i32, usize> = empty.iter().fold(HashMap::new(), |mut acc, &x| {
                *acc.entry(x).or_insert(0) += 1;
                acc
            });
            assert!(counts.is_empty());
        }
    }

    Key Differences

  • Single lookup: Rust's entry().and_modify().or_insert() is one hash computation; OCaml's find + replace is two.
  • Chained API: Rust's and_modify(...).or_insert(1) is a fluent chain; OCaml's stdlib requires two statements.
  • **Base.Hashtbl.incr**: OCaml's Base library provides a dedicated increment function; Rust's stdlib does not but the Entry API is equivalent.
  • Return value: Rust's or_insert returns &mut V enabling direct mutation; OCaml's Hashtbl.find returns the value by copy.
  • OCaml Approach

    OCaml uses Hashtbl for mutable counting or Map for immutable:

    let char_frequency s =
      let tbl = Hashtbl.create 16 in
      String.iter (fun c ->
        let count = try Hashtbl.find tbl c with Not_found -> 0 in
        Hashtbl.replace tbl c (count + 1)
      ) s;
      tbl
    

    Base.Hashtbl.incr provides a one-liner: Hashtbl.incr tbl ~key:c ~by:1 ~default:0.

    Full Source

    #![allow(clippy::all)]
    // 1031: Count Frequencies
    // The classic frequency counting pattern with Entry API
    
    use std::collections::HashMap;
    
    /// Count character frequencies
    fn char_frequency() {
        let text = "abracadabra";
    
        let mut counts: HashMap<char, usize> = HashMap::new();
        for ch in text.chars() {
            *counts.entry(ch).or_insert(0) += 1;
        }
    
        assert_eq!(counts[&'a'], 5);
        assert_eq!(counts[&'b'], 2);
        assert_eq!(counts[&'r'], 2);
        assert_eq!(counts[&'c'], 1);
        assert_eq!(counts[&'d'], 1);
    }
    
    /// Count word frequencies using and_modify
    fn word_frequency() {
        let words = vec!["the", "cat", "sat", "on", "the", "mat", "the", "cat"];
    
        let mut counts: HashMap<&str, usize> = HashMap::new();
        for word in &words {
            counts.entry(word).and_modify(|c| *c += 1).or_insert(1);
        }
    
        assert_eq!(counts["the"], 3);
        assert_eq!(counts["cat"], 2);
        assert_eq!(counts["sat"], 1);
    }
    
    /// Find the most frequent element
    fn most_frequent() {
        let items = vec![1, 2, 3, 2, 1, 2, 3, 2, 2];
    
        let mut counts: HashMap<i32, usize> = HashMap::new();
        for &x in &items {
            *counts.entry(x).or_insert(0) += 1;
        }
    
        let (most, count) = counts
            .iter()
            .max_by_key(|&(_, &v)| v)
            .map(|(&k, &v)| (k, v))
            .unwrap();
    
        assert_eq!(most, 2);
        assert_eq!(count, 5);
    }
    
    /// Frequency counting with iterators (functional style)
    fn functional_counting() {
        let data = vec![1, 1, 2, 3, 3, 3];
    
        let counts: HashMap<i32, usize> = data.iter().fold(HashMap::new(), |mut acc, &x| {
            *acc.entry(x).or_insert(0) += 1;
            acc
        });
    
        assert_eq!(counts[&1], 2);
        assert_eq!(counts[&3], 3);
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_char_frequency() {
            char_frequency();
        }
    
        #[test]
        fn test_word_frequency() {
            word_frequency();
        }
    
        #[test]
        fn test_most_frequent() {
            most_frequent();
        }
    
        #[test]
        fn test_functional_counting() {
            functional_counting();
        }
    
        #[test]
        fn test_empty_input() {
            let empty: Vec<i32> = vec![];
            let counts: HashMap<i32, usize> = empty.iter().fold(HashMap::new(), |mut acc, &x| {
                *acc.entry(x).or_insert(0) += 1;
                acc
            });
            assert!(counts.is_empty());
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_char_frequency() {
            char_frequency();
        }
    
        #[test]
        fn test_word_frequency() {
            word_frequency();
        }
    
        #[test]
        fn test_most_frequent() {
            most_frequent();
        }
    
        #[test]
        fn test_functional_counting() {
            functional_counting();
        }
    
        #[test]
        fn test_empty_input() {
            let empty: Vec<i32> = vec![];
            let counts: HashMap<i32, usize> = empty.iter().fold(HashMap::new(), |mut acc, &x| {
                *acc.entry(x).or_insert(0) += 1;
                acc
            });
            assert!(counts.is_empty());
        }
    }

    Deep Comparison

    Count Frequencies — Comparison

    Core Insight

    Frequency counting is the "hello world" of the Entry API. Both languages use a fold/loop to accumulate counts, but Rust's entry pattern compresses the check-then-update into a single dereference-and-increment.

    OCaml Approach

  • find_opt + default 0 + add with incremented value
  • • Pattern: let n = match find_opt k m with Some n -> n | None -> 0 in add k (n+1) m
  • fold to find max element
  • • Each update creates a new map node (immutable)
  • Rust Approach

  • *entry(k).or_insert(0) += 1 — one line
  • • Alternative: entry(k).and_modify(|c| *c += 1).or_insert(1) — more explicit
  • max_by_key on iterator to find most frequent
  • • Functional style via iter().fold() also works
  • Comparison Table

    AspectOCamlRust
    Count idiomfind_opt + match + add*entry(k).or_insert(0) += 1
    Lines per count41
    Max elementfold with comparisoniter().max_by_key()
    AllocationNew map node per updateIn-place mutation
    StyleAlways functionalImperative or functional

    Exercises

  • Write top_k_chars(text: &str, k: usize) -> Vec<(char, usize)> that returns the k most frequent characters sorted by descending count.
  • Implement a streaming counter FrequencyCounter<K> struct with add(&mut self, key: K) and top_n(n: usize) -> Vec<(K, usize)> methods.
  • Write a function that detects the first character that appears exactly once in a string using the frequency map.
  • Open Source Repos