062 — Isogram Check
Tutorial Video
Text description (accessibility)
This video demonstrates the "062 — Isogram Check" functional Rust example. Difficulty level: Fundamental. Key concepts covered: Functional Programming. An isogram is a word where no letter appears more than once (ignoring case, hyphens, and spaces). Key difference from OCaml: 1. **`HashSet` vs `Set.Make`**: Rust's `HashSet` has O(1) average insert/lookup. OCaml's `Set.Make` is a balanced BST with O(log n) operations. For 26 possible keys, the difference is negligible but structural.
Tutorial
The Problem
An isogram is a word where no letter appears more than once (ignoring case, hyphens, and spaces). Examples: "lumberjacks", "background", "downstream". Non-examples: "hello" (two l's), "Alabama" (three a's). This problem from Exercism exercises set-based duplicate detection.
Isogram detection appears in spell-checking (detecting repeated characters in passwords), password strength metrics (no repeated chars = higher entropy), and linguistics analysis. The problem has three O(n) solutions with different constant factors: sort-and-check (O(n log n)), HashSet, and bitset (26 bits for 26 letters).
🎯 Learning Outcomes
HashSet for O(1) membership testing and duplicate detectionis_ascii_alphabetic()to_ascii_lowercase()HashSet<char> for O(1) character lookup — the isogram check runs in O(n) total timeCode Example
#![allow(clippy::all)]
/// # Isogram Check
///
/// An isogram is a word with no repeating letters (ignoring case, hyphens, spaces).
/// Demonstrates set-based duplicate detection.
use std::collections::HashSet;
/// Idiomatic Rust: filter to alphabetic, lowercase, check uniqueness via set size.
pub fn is_isogram(word: &str) -> bool {
let letters: Vec<char> = word
.chars()
.filter(|c| c.is_ascii_alphabetic())
.map(|c| c.to_ascii_lowercase())
.collect();
let unique: HashSet<char> = letters.iter().copied().collect();
letters.len() == unique.len()
}
/// Early-exit approach: insert into set, return false on first duplicate.
/// More efficient for long strings with early duplicates.
pub fn is_isogram_early_exit(word: &str) -> bool {
let mut seen = HashSet::new();
for c in word.chars() {
if c.is_ascii_alphabetic() {
if !seen.insert(c.to_ascii_lowercase()) {
return false; // insert returns false if already present
}
}
}
true
}
/// Bitflag approach — same idea as pangram but checking for collisions.
pub fn is_isogram_bitflag(word: &str) -> bool {
let mut seen: u32 = 0;
for c in word.chars() {
if c.is_ascii_alphabetic() {
let bit = 1 << (c.to_ascii_lowercase() as u32 - 'a' as u32);
if seen & bit != 0 {
return false;
}
seen |= bit;
}
}
true
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_isogram() {
assert!(is_isogram("lumberjacks"));
assert!(is_isogram("subdermatoglyphic"));
}
#[test]
fn test_not_isogram() {
assert!(!is_isogram("eleven"));
assert!(!is_isogram("balloon")); // 'l' repeats
}
#[test]
fn test_empty() {
assert!(is_isogram(""));
}
#[test]
fn test_hyphenated() {
assert!(is_isogram("thumbscrew-japing"));
}
#[test]
fn test_case_insensitive() {
assert!(!is_isogram("Alphabet")); // 'a' appears twice
}
#[test]
fn test_early_exit() {
assert!(is_isogram_early_exit("lumberjacks"));
assert!(!is_isogram_early_exit("eleven"));
}
#[test]
fn test_bitflag() {
assert!(is_isogram_bitflag("lumberjacks"));
assert!(!is_isogram_bitflag("eleven"));
}
}Key Differences
HashSet vs Set.Make**: Rust's HashSet has O(1) average insert/lookup. OCaml's Set.Make is a balanced BST with O(log n) operations. For 26 possible keys, the difference is negligible but structural.insert return value**: Rust's HashSet::insert returns bool — false if already present. This enables the early-exit pattern. OCaml's Set.add does not return whether the element was new.Set. Rust's systems heritage makes bitsets natural and common.is_ascii_alphabetic() handles only ASCII letters. For Unicode isogram checking, use char::is_alphabetic() and to_lowercase() (which returns an iterator for multi-char lowercasing).HashSet for O(1) lookup:** Checking if a character has been seen before uses a HashSet<char>. OCaml's equivalent uses Hashtbl or a sorted list with binary search. HashSet gives O(n) total for the whole check.s.chars() iterates Unicode scalar values. Byte-level iteration (s.bytes()) would mishandle multi-byte characters. OCaml's String.iter also iterates bytes — proper Unicode requires Uutf or similar.fold can detect the first duplicate: fold_while (from itertools) or try_fold exits early. Standard fold processes all elements — use all(predicate) with HashSet mutation for the short-circuiting version.c.to_ascii_lowercase() is O(1) per character.OCaml Approach
OCaml's version using a module set: module CharSet = Set.Make(Char). let is_isogram s = let chars = String.to_seq s |> Seq.filter Char.is_letter |> Seq.map Char.lowercase_ascii |> List.of_seq in List.length chars = CharSet.cardinal (CharSet.of_list chars). The Set approach mirrors the HashSet approach but with a balanced BST (O(log n) operations, O(n log n) total).
Full Source
#![allow(clippy::all)]
/// # Isogram Check
///
/// An isogram is a word with no repeating letters (ignoring case, hyphens, spaces).
/// Demonstrates set-based duplicate detection.
use std::collections::HashSet;
/// Idiomatic Rust: filter to alphabetic, lowercase, check uniqueness via set size.
pub fn is_isogram(word: &str) -> bool {
let letters: Vec<char> = word
.chars()
.filter(|c| c.is_ascii_alphabetic())
.map(|c| c.to_ascii_lowercase())
.collect();
let unique: HashSet<char> = letters.iter().copied().collect();
letters.len() == unique.len()
}
/// Early-exit approach: insert into set, return false on first duplicate.
/// More efficient for long strings with early duplicates.
pub fn is_isogram_early_exit(word: &str) -> bool {
let mut seen = HashSet::new();
for c in word.chars() {
if c.is_ascii_alphabetic() {
if !seen.insert(c.to_ascii_lowercase()) {
return false; // insert returns false if already present
}
}
}
true
}
/// Bitflag approach — same idea as pangram but checking for collisions.
pub fn is_isogram_bitflag(word: &str) -> bool {
let mut seen: u32 = 0;
for c in word.chars() {
if c.is_ascii_alphabetic() {
let bit = 1 << (c.to_ascii_lowercase() as u32 - 'a' as u32);
if seen & bit != 0 {
return false;
}
seen |= bit;
}
}
true
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_isogram() {
assert!(is_isogram("lumberjacks"));
assert!(is_isogram("subdermatoglyphic"));
}
#[test]
fn test_not_isogram() {
assert!(!is_isogram("eleven"));
assert!(!is_isogram("balloon")); // 'l' repeats
}
#[test]
fn test_empty() {
assert!(is_isogram(""));
}
#[test]
fn test_hyphenated() {
assert!(is_isogram("thumbscrew-japing"));
}
#[test]
fn test_case_insensitive() {
assert!(!is_isogram("Alphabet")); // 'a' appears twice
}
#[test]
fn test_early_exit() {
assert!(is_isogram_early_exit("lumberjacks"));
assert!(!is_isogram_early_exit("eleven"));
}
#[test]
fn test_bitflag() {
assert!(is_isogram_bitflag("lumberjacks"));
assert!(!is_isogram_bitflag("eleven"));
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_isogram() {
assert!(is_isogram("lumberjacks"));
assert!(is_isogram("subdermatoglyphic"));
}
#[test]
fn test_not_isogram() {
assert!(!is_isogram("eleven"));
assert!(!is_isogram("balloon")); // 'l' repeats
}
#[test]
fn test_empty() {
assert!(is_isogram(""));
}
#[test]
fn test_hyphenated() {
assert!(is_isogram("thumbscrew-japing"));
}
#[test]
fn test_case_insensitive() {
assert!(!is_isogram("Alphabet")); // 'a' appears twice
}
#[test]
fn test_early_exit() {
assert!(is_isogram_early_exit("lumberjacks"));
assert!(!is_isogram_early_exit("eleven"));
}
#[test]
fn test_bitflag() {
assert!(is_isogram_bitflag("lumberjacks"));
assert!(!is_isogram_bitflag("eleven"));
}
}
Deep Comparison
Isogram Check — OCaml vs Rust Comparison
Core Insight
Duplicate detection is fundamentally a set membership problem. OCaml uses List.sort_uniq to compare lengths, while Rust's HashSet::insert() returns a boolean indicating whether the element was new — enabling early termination.
OCaml Approach
Filters characters to lowercase alpha, converts to list, applies List.sort_uniq, and compares lengths. The sort-based dedup is O(n log n). Alternatively, a recursive approach with Set.Make(Char) gives O(n log n) with early exit.
Rust Approach
Three approaches: (1) collect to Vec and HashSet, compare lengths — O(n) average; (2) early-exit using HashSet::insert() return value; (3) bitflag for zero-allocation O(n). The early-exit pattern is idiomatic Rust and has no direct OCaml equivalent without mutable state.
Comparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| Memory | List + sorted copy | HashSet or u32 bitflag |
| Null safety | N/A | N/A |
| Errors | Not applicable | Not applicable |
| Iteration | Seq.filter + List.of_seq | .chars().filter().map() |
| Dedup | List.sort_uniq (O(n log n)) | HashSet (O(n) average) |
Things Rust Learners Should Notice
HashSet::insert() returns bool** — this is a key API design that enables early exit patternscopied() iterator adapter** — converts &char to char for collecting into HashSetcollect()return in closures** — early return only exits the closure, use a for loop for early function exitFurther Reading
Exercises
is_anagram(a: &str, b: &str) -> bool using sorted character vectors.unique_letter_count(s: &str) -> usize that counts distinct alphabetic characters. Use HashSet or bitset.is_k_isogram(s: &str, k: usize) -> bool where a k-isogram allows each letter to appear at most k times (1-isogram = standard isogram, 2-isogram = allows pairs).isogram_score(s: &str) -> usize that returns the number of unique characters in the string — 26 means all letters appear exactly once (a perfect isogram for the alphabet).