ExamplesBy LevelBy TopicLearning Paths
794 Intermediate

794-word-break-dp — Word Break

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "794-word-break-dp — Word Break" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Given a string and a dictionary of words, determine if the string can be segmented into a sequence of dictionary words. Key difference from OCaml: 1. **HashSet vs Hashtbl**: Rust uses `HashSet<&str>` for O(1) lookup; OCaml's `Hashtbl.mem` provides the same.

Tutorial

The Problem

Given a string and a dictionary of words, determine if the string can be segmented into a sequence of dictionary words. This is used in natural language processing (Chinese word segmentation — Chinese text has no spaces), search engine query parsing, and code tokenization. The DP approach avoids the exponential backtracking of a naive recursive search by caching whether each prefix is breakable.

🎯 Learning Outcomes

  • • Model word break as dp[i] = whether s[:i] can be formed from dictionary words
  • • Apply the recurrence: dp[i] = any j: dp[j] && s[j..i] in dict
  • • Use a HashSet for O(1) word lookups vs. O(k) linear scan
  • • Understand the O(n²) time and O(n) space complexity
  • • Extend to count all possible segmentations (exponential output)
  • Code Example

    #![allow(clippy::all)]
    //! # Word Break
    
    use std::collections::HashSet;
    
    pub fn word_break(s: &str, dict: &[&str]) -> bool {
        let words: HashSet<&str> = dict.iter().copied().collect();
        let n = s.len();
        let mut dp = vec![false; n + 1];
        dp[0] = true;
        for i in 1..=n {
            for j in 0..i {
                if dp[j] && words.contains(&s[j..i]) {
                    dp[i] = true;
                    break;
                }
            }
        }
        dp[n]
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
        #[test]
        fn test_word_break() {
            assert!(word_break("leetcode", &["leet", "code"]));
        }
        #[test]
        fn test_no_break() {
            assert!(!word_break(
                "catsandog",
                &["cats", "dog", "sand", "and", "cat"]
            ));
        }
    }

    Key Differences

  • HashSet vs Hashtbl: Rust uses HashSet<&str> for O(1) lookup; OCaml's Hashtbl.mem provides the same.
  • Slice comparison: Rust's s[j..i] string slice is a &str; OCaml's String.sub creates a new string allocation on each check — a minor performance difference.
  • Early termination: Rust's break exits the inner loop once dp[i] is true; OCaml uses a boolean flag or exception-based early exit.
  • Segmentation: The all-segmentations variant uses DFS with memoization; both languages implement it similarly.
  • OCaml Approach

    OCaml implements with Array.make (n+1) false and uses Hashtbl for the dictionary. The inner loop uses try with Hashtbl.find or Hashtbl.mem. Functional style uses Array.init + List.exists over the dictionary. The words list can be sorted by length to prune impossible substrings early. OCaml's Re library provides regex-based word segmentation as an alternative approach.

    Full Source

    #![allow(clippy::all)]
    //! # Word Break
    
    use std::collections::HashSet;
    
    pub fn word_break(s: &str, dict: &[&str]) -> bool {
        let words: HashSet<&str> = dict.iter().copied().collect();
        let n = s.len();
        let mut dp = vec![false; n + 1];
        dp[0] = true;
        for i in 1..=n {
            for j in 0..i {
                if dp[j] && words.contains(&s[j..i]) {
                    dp[i] = true;
                    break;
                }
            }
        }
        dp[n]
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
        #[test]
        fn test_word_break() {
            assert!(word_break("leetcode", &["leet", "code"]));
        }
        #[test]
        fn test_no_break() {
            assert!(!word_break(
                "catsandog",
                &["cats", "dog", "sand", "and", "cat"]
            ));
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
        #[test]
        fn test_word_break() {
            assert!(word_break("leetcode", &["leet", "code"]));
        }
        #[test]
        fn test_no_break() {
            assert!(!word_break(
                "catsandog",
                &["cats", "dog", "sand", "and", "cat"]
            ));
        }
    }

    Deep Comparison

    OCaml vs Rust: Word Break Dp

    Overview

    See the example.rs and example.ml files for detailed implementations.

    Key Differences

    AspectOCamlRust
    Type systemHindley-MilnerOwnership + traits
    MemoryGCZero-cost abstractions
    MutabilityExplicit refmut keyword
    Error handlingOption/ResultResult<T, E>

    See README.md for detailed comparison.

    Exercises

  • Implement word_break_all(s, dict) -> Vec<String> that returns all valid space-separated segmentations (e.g., ["cat sand dog", "cats and dog"]).
  • Add word length bounds to the inner loop: only check substrings of length between min_word_len and max_word_len in the dictionary, reducing the inner loop iterations.
  • Implement a streaming version that breaks a text input into words on the fly, maintaining the DP state as new characters arrive.
  • Open Source Repos