ExamplesBy LevelBy TopicLearning Paths
486 Fundamental

String Regex Pattern

Functional Programming

Tutorial

The Problem

Regex engines are powerful but heavyweight: they compile patterns, maintain state machines, and have non-trivial binary size impact. Many real-world matching tasks need only simple patterns: file globs (*.txt), SQL LIKE queries (user%), or path prefix matching. These can be implemented in a few lines with starts_with, ends_with, and recursive char-slice matching — zero dependencies, no compilation step, and predictable performance.

🎯 Learning Outcomes

  • • Implement a single-wildcard glob matcher using starts_with and ends_with
  • • Implement SQL LIKE pattern matching with recursive descent on &[char] slices
  • • Understand why recursion on slice patterns ([h, ..rest]) is cleaner than index loops
  • • Recognise when to use the regex crate vs. handwritten matchers
  • • Test edge cases: empty string, pure wildcard, no match, anchored match
  • Code Example

    #![allow(clippy::all)]
    // 486. Regex-like matching without crates
    fn glob_match(pattern: &str, s: &str) -> bool {
        if let Some((pre, suf)) = pattern.split_once('*') {
            s.starts_with(pre) && s.ends_with(suf) && s.len() >= pre.len() + suf.len()
        } else {
            s == pattern
        }
    }
    
    fn like_match(s: &str, pattern: &str) -> bool {
        // Simple SQL LIKE: % = any chars, _ = one char
        fn rec(s: &[char], p: &[char]) -> bool {
            match (s, p) {
                (_, []) => s.is_empty(),
                ([], [h, ..]) => *h == '%' && rec(s, &p[1..]),
                ([_, sr @ ..], ['%', pr @ ..]) => rec(s, pr) || rec(sr, p),
                ([sc, sr @ ..], ['_', pr @ ..]) if *sc != '' => rec(sr, pr),
                ([sc, sr @ ..], [pc, pr @ ..]) if sc == pc => rec(sr, pr),
                _ => false,
            }
        }
        let sv: Vec<char> = s.chars().collect();
        let pv: Vec<char> = pattern.chars().collect();
        rec(&sv, &pv)
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
        #[test]
        fn test_glob() {
            assert!(glob_match("*.txt", "hello.txt"));
            assert!(!glob_match("*.txt", "hello.rs"));
            assert!(glob_match("exact", "exact"));
        }
        #[test]
        fn test_like() {
            assert!(like_match("hello", "h%o"));
            assert!(like_match("hello", "he_lo"));
            assert!(!like_match("hello", "world"));
        }
    }

    Key Differences

  • Slice patterns: Rust's [h, rest @ ..] destructuring on &[char] slices enables clean recursive descent; OCaml's list patterns h :: t are analogous but require List.of_seq (String.to_seq s) to convert.
  • Allocation for pattern matching: Both implementations convert the string to a Vec<char> / char list for recursive matching — this is a design trade-off for clarity over zero-copy.
  • Standard regex: OCaml's Str is in the standard library; Rust's regex crate is a separate dependency.
  • Memoisation: The LIKE recursion has exponential worst-case (multiple %). OCaml and Rust both benefit from memoisation via a HashMap on (s_idx, p_idx) pairs.
  • OCaml Approach

    OCaml pattern matching on lists mirrors the Rust slice pattern approach:

    let rec like_match s p = match s, p with
      | _, [] -> s = []
      | [], '%' :: pr -> like_match [] pr
      | _ :: sr, '%' :: pr -> like_match s pr || like_match sr p
      | _ :: sr, '_' :: pr -> like_match sr pr
      | sc :: sr, pc :: pr when sc = pc -> like_match sr pr
      | _ -> false
    

    OCaml's Str module provides Str.string_match and Str.regexp for full regex; Re (third-party) provides a safer, more composable API.

    Full Source

    #![allow(clippy::all)]
    // 486. Regex-like matching without crates
    fn glob_match(pattern: &str, s: &str) -> bool {
        if let Some((pre, suf)) = pattern.split_once('*') {
            s.starts_with(pre) && s.ends_with(suf) && s.len() >= pre.len() + suf.len()
        } else {
            s == pattern
        }
    }
    
    fn like_match(s: &str, pattern: &str) -> bool {
        // Simple SQL LIKE: % = any chars, _ = one char
        fn rec(s: &[char], p: &[char]) -> bool {
            match (s, p) {
                (_, []) => s.is_empty(),
                ([], [h, ..]) => *h == '%' && rec(s, &p[1..]),
                ([_, sr @ ..], ['%', pr @ ..]) => rec(s, pr) || rec(sr, p),
                ([sc, sr @ ..], ['_', pr @ ..]) if *sc != '' => rec(sr, pr),
                ([sc, sr @ ..], [pc, pr @ ..]) if sc == pc => rec(sr, pr),
                _ => false,
            }
        }
        let sv: Vec<char> = s.chars().collect();
        let pv: Vec<char> = pattern.chars().collect();
        rec(&sv, &pv)
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
        #[test]
        fn test_glob() {
            assert!(glob_match("*.txt", "hello.txt"));
            assert!(!glob_match("*.txt", "hello.rs"));
            assert!(glob_match("exact", "exact"));
        }
        #[test]
        fn test_like() {
            assert!(like_match("hello", "h%o"));
            assert!(like_match("hello", "he_lo"));
            assert!(!like_match("hello", "world"));
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
        #[test]
        fn test_glob() {
            assert!(glob_match("*.txt", "hello.txt"));
            assert!(!glob_match("*.txt", "hello.rs"));
            assert!(glob_match("exact", "exact"));
        }
        #[test]
        fn test_like() {
            assert!(like_match("hello", "h%o"));
            assert!(like_match("hello", "he_lo"));
            assert!(!like_match("hello", "world"));
        }
    }

    Exercises

  • Multi-wildcard glob: Extend glob_match to handle multiple * wildcards by recursing on segments between * characters.
  • Memoised LIKE: Rewrite like_match with a HashMap<(usize, usize), bool> memo table to achieve O(N×M) time complexity.
  • Benchmark vs. regex: Use the regex crate to compile ^h.*o$ and ^he.lo$, then benchmark against glob_match and like_match for 100,000 matches on 20-char strings.
  • Open Source Repos