093 — Isogram Check
Tutorial Video
Text description (accessibility)
This video demonstrates the "093 — Isogram Check" functional Rust example. Difficulty level: Fundamental. Key concepts covered: Functional Programming. Determine whether a word is an isogram — no letter appears more than once (ignoring case and non-alphabetic characters). Key difference from OCaml: | Aspect | Rust Sort | Rust HashSet | Rust Bitset | OCaml |
Tutorial
The Problem
Determine whether a word is an isogram — no letter appears more than once (ignoring case and non-alphabetic characters). Implement three approaches: sort + dedup, HashSet with early exit via all, and bitset with early exit on duplicate bit. Compare with OCaml's List.sort_uniq approach.
🎯 Learning Outcomes
sort_unstable + dedup and compare lengths to detect duplicatesHashSet::insert returning false on duplicates with .all(|c| seen.insert(…))bits & mask != 0 for O(1) duplicate detection with early exitis_ascii_alphabetic() + to_ascii_lowercase() for letter normalisationList.sort_uniq Char.compareCode Example
pub fn is_isogram_hashset(s: &str) -> bool {
let mut seen = HashSet::new();
s.chars()
.filter(|c| c.is_ascii_alphabetic())
.all(|c| seen.insert(c.to_ascii_lowercase()))
}Key Differences
| Aspect | Rust Sort | Rust HashSet | Rust Bitset | OCaml |
|---|---|---|---|---|
| Time | O(n log n) | O(n) avg | O(n) | O(n log n) |
| Early exit | No | Yes | Yes | No |
| Space | O(n) | O(n) | O(1) | O(n) |
| Code length | Short | Short | Short | Short |
| Readability | High | High | Medium | High |
For correctness and simplicity, the HashSet version is preferred. For performance-critical code with short strings, the bitset version wins. The sort version matches OCaml's natural idiom.
OCaml Approach
OCaml's List.sort_uniq Char.compare sorts and removes duplicates in one pass (O(n log n)). Comparing List.length chars = List.length unique is the same length-based test. OCaml lacks a direct HashSet::insert-returns-bool idiom; the functional approach is more natural with the sort-based method.
Full Source
#![allow(clippy::all)]
//! # Isogram Check
//!
//! A word is an isogram if no letter repeats. Compares sort-based,
//! HashSet-based, and bitset approaches.
use std::collections::HashSet;
// ---------------------------------------------------------------------------
// Approach A: Sort + dedup (mirrors OCaml's List.sort_uniq)
// ---------------------------------------------------------------------------
pub fn is_isogram_sort(s: &str) -> bool {
let mut chars: Vec<char> = s
.chars()
.filter_map(|c| {
let lc = c.to_ascii_lowercase();
lc.is_ascii_lowercase().then_some(lc)
})
.collect();
let total = chars.len();
chars.sort_unstable();
chars.dedup();
chars.len() == total
}
// ---------------------------------------------------------------------------
// Approach B: HashSet — insert returns false on duplicate
// ---------------------------------------------------------------------------
pub fn is_isogram_hashset(s: &str) -> bool {
let mut seen = HashSet::new();
s.chars()
.filter(|c| c.is_ascii_alphabetic())
.all(|c| seen.insert(c.to_ascii_lowercase()))
}
// ---------------------------------------------------------------------------
// Approach C: Bitset
// ---------------------------------------------------------------------------
pub fn is_isogram_bitset(s: &str) -> bool {
let mut bits: u32 = 0;
for c in s.chars() {
let lc = c.to_ascii_lowercase();
if lc.is_ascii_lowercase() {
let mask = 1 << (lc as u32 - 'a' as u32);
if bits & mask != 0 {
return false;
}
bits |= mask;
}
}
true
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_isogram() {
assert!(is_isogram_sort("lumberjacks"));
assert!(is_isogram_hashset("lumberjacks"));
assert!(is_isogram_bitset("lumberjacks"));
}
#[test]
fn test_not_isogram() {
assert!(!is_isogram_sort("eleven"));
assert!(!is_isogram_hashset("eleven"));
assert!(!is_isogram_bitset("eleven"));
}
#[test]
fn test_long_isogram() {
assert!(is_isogram_bitset("subdermatoglyphic"));
}
#[test]
fn test_empty() {
assert!(is_isogram_bitset(""));
}
#[test]
fn test_with_spaces() {
assert!(is_isogram_hashset("big dwarf"));
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_isogram() {
assert!(is_isogram_sort("lumberjacks"));
assert!(is_isogram_hashset("lumberjacks"));
assert!(is_isogram_bitset("lumberjacks"));
}
#[test]
fn test_not_isogram() {
assert!(!is_isogram_sort("eleven"));
assert!(!is_isogram_hashset("eleven"));
assert!(!is_isogram_bitset("eleven"));
}
#[test]
fn test_long_isogram() {
assert!(is_isogram_bitset("subdermatoglyphic"));
}
#[test]
fn test_empty() {
assert!(is_isogram_bitset(""));
}
#[test]
fn test_with_spaces() {
assert!(is_isogram_hashset("big dwarf"));
}
}
Deep Comparison
Comparison: Isogram Check — OCaml vs Rust
Core Insight
OCaml's List.sort_uniq elegantly combines sorting and deduplication. Rust separates these operations but offers a more powerful alternative: HashSet::insert returns a boolean indicating whether the element was new, allowing early termination on the first duplicate — something OCaml's approach cannot do.
OCaml
let is_isogram s =
let chars = s |> String.lowercase_ascii |> String.to_seq
|> Seq.filter (fun c -> c >= 'a' && c <= 'z') |> List.of_seq in
let unique = List.sort_uniq Char.compare chars in
List.length chars = List.length unique
Rust — HashSet with early exit
pub fn is_isogram_hashset(s: &str) -> bool {
let mut seen = HashSet::new();
s.chars()
.filter(|c| c.is_ascii_alphabetic())
.all(|c| seen.insert(c.to_ascii_lowercase()))
}
Comparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| Dedup | List.sort_uniq | sort_unstable() + dedup() |
| Early exit | No (processes all) | HashSet::insert + all() |
| Set approach | Would need Set.Make(Char) | HashSet<char> built-in |
| Bitset | lor/lsl | \|=/<< |
| Complexity | O(n log n) | O(n) with HashSet/bitset |
Learner Notes
HashSet::insert idiom**: Returning bool on insert is uniquely useful — OCaml sets don't offer thisall() short-circuits**: Stops at first false, making this O(k) where k is first duplicate positionu32 beats any collectionsort_unstable**: Rust's unstable sort doesn't preserve order of equal elements but is faster — fine here since we just dedupExercises
to_lowercase().next() for multi-codepoint character normalisation.isogram_score(s: &str) -> usize that returns the number of unique letters.Option<char>.longest_isogram(words: &[&str]) -> Option<&str> that finds the longest isogram in a word list.HashSet-equivalent using Hashtbl and early exit via an exception.