String Searching
Functional Programming
Tutorial
The Problem
Searching strings is fundamental: checking whether a URL starts with "https://", finding the position of a delimiter for manual parsing, counting occurrences of a substring, validating file extensions. A well-designed API avoids returning raw byte indices without validation, distinguishes first-occurrence (find) from last-occurrence (rfind), and supports both exact substring and character-predicate matching through a single interface.
🎯 Learning Outcomes
.contains(pat).find(pat) returning Option<usize>.rfind(pat).starts_with(pat) / .ends_with(pat).matches(pat).count()Code Example
#![allow(clippy::all)]
// 478. contains(), find(), starts_with()
#[cfg(test)]
mod tests {
#[test]
fn test_contains() {
assert!("hello world".contains("world"));
assert!(!"hello".contains("xyz"));
}
#[test]
fn test_starts() {
assert!("hello".starts_with("hel"));
assert!(!"hello".ends_with("hel"));
}
#[test]
fn test_find() {
assert_eq!("hello".find('l'), Some(2));
assert_eq!("hello".find('z'), None);
}
#[test]
fn test_rfind() {
assert_eq!("hello".rfind('l'), Some(3));
}
#[test]
fn test_matches() {
assert_eq!("aaabaa".matches('a').count(), 5);
}
}Key Differences
Pattern trait unifies char, &str, &[char], and closures under one API; OCaml has separate functions for each (index_opt, contains, Str.search_forward).find returns a byte offset (always a valid char boundary for the match start); OCaml's String.index_opt returns a byte offset (same caveat for non-ASCII).matches iterator**: Rust's .matches(pat) returns a lazy iterator; OCaml has no standard equivalent — count requires a loop.starts_with/ends_with**: Available in OCaml only since 4.13; Rust has had them since 1.0.OCaml Approach
(* contains *)
let contains_sub s sub =
let n = String.length sub in
let found = ref false in
for i = 0 to String.length s - n do
if String.sub s i n = sub then found := true
done; !found
(* find — String.index_opt for single char *)
String.index_opt "hello" 'l' (* Some 2 *)
(* starts_with — OCaml 4.13+ *)
String.starts_with ~prefix:"hel" "hello" (* true *)
OCaml 4.13 added String.starts_with and String.ends_with. For older versions and substring search, Str.search_forward or astring were the common choices.
Full Source
#![allow(clippy::all)]
// 478. contains(), find(), starts_with()
#[cfg(test)]
mod tests {
#[test]
fn test_contains() {
assert!("hello world".contains("world"));
assert!(!"hello".contains("xyz"));
}
#[test]
fn test_starts() {
assert!("hello".starts_with("hel"));
assert!(!"hello".ends_with("hel"));
}
#[test]
fn test_find() {
assert_eq!("hello".find('l'), Some(2));
assert_eq!("hello".find('z'), None);
}
#[test]
fn test_rfind() {
assert_eq!("hello".rfind('l'), Some(3));
}
#[test]
fn test_matches() {
assert_eq!("aaabaa".matches('a').count(), 5);
}
}
✓ Tests
Rust test suite
#[cfg(test)]
mod tests {
#[test]
fn test_contains() {
assert!("hello world".contains("world"));
assert!(!"hello".contains("xyz"));
}
#[test]
fn test_starts() {
assert!("hello".starts_with("hel"));
assert!(!"hello".ends_with("hel"));
}
#[test]
fn test_find() {
assert_eq!("hello".find('l'), Some(2));
assert_eq!("hello".find('z'), None);
}
#[test]
fn test_rfind() {
assert_eq!("hello".rfind('l'), Some(3));
}
#[test]
fn test_matches() {
assert_eq!("aaabaa".matches('a').count(), 5);
}
}
Exercises
find_all(haystack: &str, needle: &str) -> Vec<usize> that returns the byte offset of every non-overlapping occurrence using find in a loop.matches_prefix_ci(s: &str, prefix: &str) -> bool that does case-insensitive prefix checking without allocating a lowercase copy.find**: Benchmark .contains("error") vs. a compiled Regex pattern for scanning 10,000 log lines; measure when the regex overhead becomes worthwhile.