ExamplesBy LevelBy TopicLearning Paths
062 Fundamental

062 — Isogram Check

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "062 — Isogram Check" functional Rust example. Difficulty level: Fundamental. Key concepts covered: Functional Programming. An isogram is a word where no letter appears more than once (ignoring case, hyphens, and spaces). Key difference from OCaml: 1. **`HashSet` vs `Set.Make`**: Rust's `HashSet` has O(1) average insert/lookup. OCaml's `Set.Make` is a balanced BST with O(log n) operations. For 26 possible keys, the difference is negligible but structural.

Tutorial

The Problem

An isogram is a word where no letter appears more than once (ignoring case, hyphens, and spaces). Examples: "lumberjacks", "background", "downstream". Non-examples: "hello" (two l's), "Alabama" (three a's). This problem from Exercism exercises set-based duplicate detection.

Isogram detection appears in spell-checking (detecting repeated characters in passwords), password strength metrics (no repeated chars = higher entropy), and linguistics analysis. The problem has three O(n) solutions with different constant factors: sort-and-check (O(n log n)), HashSet, and bitset (26 bits for 26 letters).

🎯 Learning Outcomes

  • • Use HashSet for O(1) membership testing and duplicate detection
  • • Filter non-alphabetic characters with is_ascii_alphabetic()
  • • Normalize case with to_ascii_lowercase()
  • • Implement the early-exit version for better performance on failing inputs
  • • Use a 32-bit integer as a bitset for 26 letters (most cache-friendly approach)
  • • Use HashSet<char> for O(1) character lookup — the isogram check runs in O(n) total time
  • • Normalize input to lowercase and filter non-alphabetic characters for case-insensitive real-world isogram checks
  • Code Example

    #![allow(clippy::all)]
    /// # Isogram Check
    ///
    /// An isogram is a word with no repeating letters (ignoring case, hyphens, spaces).
    /// Demonstrates set-based duplicate detection.
    use std::collections::HashSet;
    
    /// Idiomatic Rust: filter to alphabetic, lowercase, check uniqueness via set size.
    pub fn is_isogram(word: &str) -> bool {
        let letters: Vec<char> = word
            .chars()
            .filter(|c| c.is_ascii_alphabetic())
            .map(|c| c.to_ascii_lowercase())
            .collect();
        let unique: HashSet<char> = letters.iter().copied().collect();
        letters.len() == unique.len()
    }
    
    /// Early-exit approach: insert into set, return false on first duplicate.
    /// More efficient for long strings with early duplicates.
    pub fn is_isogram_early_exit(word: &str) -> bool {
        let mut seen = HashSet::new();
        for c in word.chars() {
            if c.is_ascii_alphabetic() {
                if !seen.insert(c.to_ascii_lowercase()) {
                    return false; // insert returns false if already present
                }
            }
        }
        true
    }
    
    /// Bitflag approach — same idea as pangram but checking for collisions.
    pub fn is_isogram_bitflag(word: &str) -> bool {
        let mut seen: u32 = 0;
        for c in word.chars() {
            if c.is_ascii_alphabetic() {
                let bit = 1 << (c.to_ascii_lowercase() as u32 - 'a' as u32);
                if seen & bit != 0 {
                    return false;
                }
                seen |= bit;
            }
        }
        true
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_isogram() {
            assert!(is_isogram("lumberjacks"));
            assert!(is_isogram("subdermatoglyphic"));
        }
    
        #[test]
        fn test_not_isogram() {
            assert!(!is_isogram("eleven"));
            assert!(!is_isogram("balloon")); // 'l' repeats
        }
    
        #[test]
        fn test_empty() {
            assert!(is_isogram(""));
        }
    
        #[test]
        fn test_hyphenated() {
            assert!(is_isogram("thumbscrew-japing"));
        }
    
        #[test]
        fn test_case_insensitive() {
            assert!(!is_isogram("Alphabet")); // 'a' appears twice
        }
    
        #[test]
        fn test_early_exit() {
            assert!(is_isogram_early_exit("lumberjacks"));
            assert!(!is_isogram_early_exit("eleven"));
        }
    
        #[test]
        fn test_bitflag() {
            assert!(is_isogram_bitflag("lumberjacks"));
            assert!(!is_isogram_bitflag("eleven"));
        }
    }

    Key Differences

  • **HashSet vs Set.Make**: Rust's HashSet has O(1) average insert/lookup. OCaml's Set.Make is a balanced BST with O(log n) operations. For 26 possible keys, the difference is negligible but structural.
  • **insert return value**: Rust's HashSet::insert returns boolfalse if already present. This enables the early-exit pattern. OCaml's Set.add does not return whether the element was new.
  • Bitset: The u32 bitflag approach is not idiomatically OCaml (it's C-style). OCaml functional code prefers Set. Rust's systems heritage makes bitsets natural and common.
  • Unicode: is_ascii_alphabetic() handles only ASCII letters. For Unicode isogram checking, use char::is_alphabetic() and to_lowercase() (which returns an iterator for multi-char lowercasing).
  • **HashSet for O(1) lookup:** Checking if a character has been seen before uses a HashSet<char>. OCaml's equivalent uses Hashtbl or a sorted list with binary search. HashSet gives O(n) total for the whole check.
  • Unicode handling: s.chars() iterates Unicode scalar values. Byte-level iteration (s.bytes()) would mishandle multi-byte characters. OCaml's String.iter also iterates bytes — proper Unicode requires Uutf or similar.
  • Fold-based check: fold can detect the first duplicate: fold_while (from itertools) or try_fold exits early. Standard fold processes all elements — use all(predicate) with HashSet mutation for the short-circuiting version.
  • Case sensitivity: Normalizing to lowercase before insertion handles case-insensitive isogram checks. c.to_ascii_lowercase() is O(1) per character.
  • OCaml Approach

    OCaml's version using a module set: module CharSet = Set.Make(Char). let is_isogram s = let chars = String.to_seq s |> Seq.filter Char.is_letter |> Seq.map Char.lowercase_ascii |> List.of_seq in List.length chars = CharSet.cardinal (CharSet.of_list chars). The Set approach mirrors the HashSet approach but with a balanced BST (O(log n) operations, O(n log n) total).

    Full Source

    #![allow(clippy::all)]
    /// # Isogram Check
    ///
    /// An isogram is a word with no repeating letters (ignoring case, hyphens, spaces).
    /// Demonstrates set-based duplicate detection.
    use std::collections::HashSet;
    
    /// Idiomatic Rust: filter to alphabetic, lowercase, check uniqueness via set size.
    pub fn is_isogram(word: &str) -> bool {
        let letters: Vec<char> = word
            .chars()
            .filter(|c| c.is_ascii_alphabetic())
            .map(|c| c.to_ascii_lowercase())
            .collect();
        let unique: HashSet<char> = letters.iter().copied().collect();
        letters.len() == unique.len()
    }
    
    /// Early-exit approach: insert into set, return false on first duplicate.
    /// More efficient for long strings with early duplicates.
    pub fn is_isogram_early_exit(word: &str) -> bool {
        let mut seen = HashSet::new();
        for c in word.chars() {
            if c.is_ascii_alphabetic() {
                if !seen.insert(c.to_ascii_lowercase()) {
                    return false; // insert returns false if already present
                }
            }
        }
        true
    }
    
    /// Bitflag approach — same idea as pangram but checking for collisions.
    pub fn is_isogram_bitflag(word: &str) -> bool {
        let mut seen: u32 = 0;
        for c in word.chars() {
            if c.is_ascii_alphabetic() {
                let bit = 1 << (c.to_ascii_lowercase() as u32 - 'a' as u32);
                if seen & bit != 0 {
                    return false;
                }
                seen |= bit;
            }
        }
        true
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_isogram() {
            assert!(is_isogram("lumberjacks"));
            assert!(is_isogram("subdermatoglyphic"));
        }
    
        #[test]
        fn test_not_isogram() {
            assert!(!is_isogram("eleven"));
            assert!(!is_isogram("balloon")); // 'l' repeats
        }
    
        #[test]
        fn test_empty() {
            assert!(is_isogram(""));
        }
    
        #[test]
        fn test_hyphenated() {
            assert!(is_isogram("thumbscrew-japing"));
        }
    
        #[test]
        fn test_case_insensitive() {
            assert!(!is_isogram("Alphabet")); // 'a' appears twice
        }
    
        #[test]
        fn test_early_exit() {
            assert!(is_isogram_early_exit("lumberjacks"));
            assert!(!is_isogram_early_exit("eleven"));
        }
    
        #[test]
        fn test_bitflag() {
            assert!(is_isogram_bitflag("lumberjacks"));
            assert!(!is_isogram_bitflag("eleven"));
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_isogram() {
            assert!(is_isogram("lumberjacks"));
            assert!(is_isogram("subdermatoglyphic"));
        }
    
        #[test]
        fn test_not_isogram() {
            assert!(!is_isogram("eleven"));
            assert!(!is_isogram("balloon")); // 'l' repeats
        }
    
        #[test]
        fn test_empty() {
            assert!(is_isogram(""));
        }
    
        #[test]
        fn test_hyphenated() {
            assert!(is_isogram("thumbscrew-japing"));
        }
    
        #[test]
        fn test_case_insensitive() {
            assert!(!is_isogram("Alphabet")); // 'a' appears twice
        }
    
        #[test]
        fn test_early_exit() {
            assert!(is_isogram_early_exit("lumberjacks"));
            assert!(!is_isogram_early_exit("eleven"));
        }
    
        #[test]
        fn test_bitflag() {
            assert!(is_isogram_bitflag("lumberjacks"));
            assert!(!is_isogram_bitflag("eleven"));
        }
    }

    Deep Comparison

    Isogram Check — OCaml vs Rust Comparison

    Core Insight

    Duplicate detection is fundamentally a set membership problem. OCaml uses List.sort_uniq to compare lengths, while Rust's HashSet::insert() returns a boolean indicating whether the element was new — enabling early termination.

    OCaml Approach

    Filters characters to lowercase alpha, converts to list, applies List.sort_uniq, and compares lengths. The sort-based dedup is O(n log n). Alternatively, a recursive approach with Set.Make(Char) gives O(n log n) with early exit.

    Rust Approach

    Three approaches: (1) collect to Vec and HashSet, compare lengths — O(n) average; (2) early-exit using HashSet::insert() return value; (3) bitflag for zero-allocation O(n). The early-exit pattern is idiomatic Rust and has no direct OCaml equivalent without mutable state.

    Comparison Table

    AspectOCamlRust
    MemoryList + sorted copyHashSet or u32 bitflag
    Null safetyN/AN/A
    ErrorsNot applicableNot applicable
    IterationSeq.filter + List.of_seq.chars().filter().map()
    DedupList.sort_uniq (O(n log n))HashSet (O(n) average)

    Things Rust Learners Should Notice

  • **HashSet::insert() returns bool** — this is a key API design that enables early exit patterns
  • Three allocation tiers: HashSet (heap), Vec+HashSet (heap), bitflag u32 (stack-only)
  • **copied() iterator adapter** — converts &char to char for collecting into HashSet
  • Filtering and mapping are lazy — the iterator chain doesn't allocate until collect()
  • **Rust's return in closures** — early return only exits the closure, use a for loop for early function exit
  • Further Reading

  • • [HashSet::insert](https://doc.rust-lang.org/std/collections/struct.HashSet.html#method.insert)
  • • [Exercism: Isogram](https://exercism.org/tracks/rust/exercises/isogram)
  • Exercises

  • Anagram check: Two words are anagrams if they use the same letters (with repetition). Write is_anagram(a: &str, b: &str) -> bool using sorted character vectors.
  • Unique characters count: Write unique_letter_count(s: &str) -> usize that counts distinct alphabetic characters. Use HashSet or bitset.
  • Pangram from isogram: What is the shortest English sentence that is both a pangram (uses all 26 letters) and an isogram (no repeated letters)? Research "the quick brown fox" and verify it is not an isogram.
  • k-isogram: Implement is_k_isogram(s: &str, k: usize) -> bool where a k-isogram allows each letter to appear at most k times (1-isogram = standard isogram, 2-isogram = allows pairs).
  • Isogram score: Implement isogram_score(s: &str) -> usize that returns the number of unique characters in the string — 26 means all letters appear exactly once (a perfect isogram for the alphabet).
  • Open Source Repos