ExamplesBy LevelBy TopicLearning Paths
094 Intermediate

094 — Run-Length Encoding

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "094 — Run-Length Encoding" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Compress a string using run-length encoding: replace consecutive repeated characters with a count followed by the character (e.g. Key difference from OCaml: | Aspect | Rust | OCaml |

Tutorial

The Problem

Compress a string using run-length encoding: replace consecutive repeated characters with a count followed by the character (e.g. "AAABCC""3AB2C"). Implement both an imperative index-based version and a functional grouping version. Compare with OCaml's Buffer-based recursive implementation.

🎯 Learning Outcomes

  • • Traverse a character vector with indices to detect character boundaries
  • • Build a Vec<(char, usize)> of groups using last_mut and pattern matching
  • • Use format! conditionally to omit the count 1 from single-character runs
  • • Understand String::push and push_str for incremental string building
  • • Map Rust's String building to OCaml's Buffer.add_char/Buffer.add_string
  • • Recognise decode as the inverse: parse optional count, then character
  • Code Example

    pub fn encode_fold(s: &str) -> String {
        s.chars()
            .fold(Vec::<(char, usize)>::new(), |mut acc, c| {
                match acc.last_mut() {
                    Some((last, count)) if *last == c => *count += 1,
                    _ => acc.push((c, 1)),
                }; acc
            })
            .iter()
            .map(|&(c, n)| if n > 1 { format!("{n}{c}") } else { c.to_string() })
            .collect()
    }

    Key Differences

    AspectRustOCaml
    String builderString with push/push_strBuffer with add_char/add_string
    Group detectionlast_mut() + matchRecursive go i c count
    Format numbercount.to_string()string_of_int count
    Collect groupsVec<(char, usize)>Implicit in recursion
    Single charsc.to_string()Buffer.add_char buf c
    Pre-allocationString::with_capacityBuffer.create n

    Run-length encoding is a classic interview problem that tests string building, grouping, and edge-case handling (empty string, single characters, long runs). The decode inverse is equally instructive: parse optional digits, then a mandatory character.

    OCaml Approach

    OCaml uses Buffer.create and Buffer.add_char/Buffer.add_string for efficient incremental building. The recursive go i c count function scans the string, writing to the buffer at boundaries. Buffer.contents materialises the final string. OCaml's Buffer is semantically equivalent to Rust's String::push/push_str pattern.

    Full Source

    #![allow(clippy::all)]
    //! # Run-Length Encoding — String Compression
    //!
    //! Encode consecutive repeated characters as count+char.
    //! OCaml uses `Buffer` for incremental string building; Rust uses `String` directly.
    
    // ---------------------------------------------------------------------------
    // Approach A: Imperative — iterate with tracking
    // ---------------------------------------------------------------------------
    
    pub fn encode(s: &str) -> String {
        if s.is_empty() {
            return String::new();
        }
        let mut result = String::new();
        let chars: Vec<char> = s.chars().collect();
        let mut count = 1;
    
        for i in 1..chars.len() {
            if chars[i] == chars[i - 1] {
                count += 1;
            } else {
                if count > 1 {
                    result.push_str(&count.to_string());
                }
                result.push(chars[i - 1]);
                count = 1;
            }
        }
        if count > 1 {
            result.push_str(&count.to_string());
        }
        result.push(*chars.last().unwrap());
        result
    }
    
    // ---------------------------------------------------------------------------
    // Approach B: Iterator — chunk_by (nightly) or manual grouping
    // ---------------------------------------------------------------------------
    
    pub fn encode_functional(s: &str) -> String {
        if s.is_empty() {
            return String::new();
        }
        let chars: Vec<char> = s.chars().collect();
        let mut groups: Vec<(char, usize)> = vec![];
        for &c in &chars {
            match groups.last_mut() {
                Some((last, count)) if *last == c => *count += 1,
                _ => groups.push((c, 1)),
            }
        }
        groups
            .iter()
            .map(|&(c, n)| {
                if n > 1 {
                    format!("{}{}", n, c)
                } else {
                    c.to_string()
                }
            })
            .collect()
    }
    
    // ---------------------------------------------------------------------------
    // Approach C: Fold-based grouping
    // ---------------------------------------------------------------------------
    
    pub fn encode_fold(s: &str) -> String {
        s.chars()
            .fold(Vec::<(char, usize)>::new(), |mut acc, c| {
                match acc.last_mut() {
                    Some((last, count)) if *last == c => *count += 1,
                    _ => acc.push((c, 1)),
                }
                acc
            })
            .iter()
            .map(|&(c, n)| {
                if n > 1 {
                    format!("{}{}", n, c)
                } else {
                    c.to_string()
                }
            })
            .collect()
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_basic() {
            assert_eq!(encode("AABCCCDEEEE"), "2AB3CD4E");
        }
    
        #[test]
        fn test_no_repeats() {
            assert_eq!(encode("ABCDE"), "ABCDE");
        }
    
        #[test]
        fn test_all_same() {
            assert_eq!(encode("AAAA"), "4A");
        }
    
        #[test]
        fn test_empty() {
            assert_eq!(encode(""), "");
        }
    
        #[test]
        fn test_single() {
            assert_eq!(encode("A"), "A");
        }
    
        #[test]
        fn test_functional() {
            assert_eq!(encode_functional("AABCCCDEEEE"), "2AB3CD4E");
        }
    
        #[test]
        fn test_fold() {
            assert_eq!(encode_fold("AABCCCDEEEE"), "2AB3CD4E");
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_basic() {
            assert_eq!(encode("AABCCCDEEEE"), "2AB3CD4E");
        }
    
        #[test]
        fn test_no_repeats() {
            assert_eq!(encode("ABCDE"), "ABCDE");
        }
    
        #[test]
        fn test_all_same() {
            assert_eq!(encode("AAAA"), "4A");
        }
    
        #[test]
        fn test_empty() {
            assert_eq!(encode(""), "");
        }
    
        #[test]
        fn test_single() {
            assert_eq!(encode("A"), "A");
        }
    
        #[test]
        fn test_functional() {
            assert_eq!(encode_functional("AABCCCDEEEE"), "2AB3CD4E");
        }
    
        #[test]
        fn test_fold() {
            assert_eq!(encode_fold("AABCCCDEEEE"), "2AB3CD4E");
        }
    }

    Deep Comparison

    Comparison: Run-Length Encoding — OCaml vs Rust

    Core Insight

    Both languages build the result string incrementally. OCaml's Buffer module is the explicit choice for this; Rust's String is inherently a growable UTF-8 buffer. The recursive vs iterative style differs, but the core algorithm — tracking the current character and its count — is identical.

    OCaml

    let encode s =
      let buf = Buffer.create (String.length s) in
      let rec go i c count =
        if i = String.length s then begin
          if count > 1 then Buffer.add_string buf (string_of_int count);
          Buffer.add_char buf c end
        else if s.[i] = c then go (i+1) c (count+1)
        else begin
          if count > 1 then Buffer.add_string buf (string_of_int count);
          Buffer.add_char buf c; go (i+1) s.[i] 1 end
      in go 1 s.[0] 1; Buffer.contents buf
    

    Rust — Fold-based

    pub fn encode_fold(s: &str) -> String {
        s.chars()
            .fold(Vec::<(char, usize)>::new(), |mut acc, c| {
                match acc.last_mut() {
                    Some((last, count)) if *last == c => *count += 1,
                    _ => acc.push((c, 1)),
                }; acc
            })
            .iter()
            .map(|&(c, n)| if n > 1 { format!("{n}{c}") } else { c.to_string() })
            .collect()
    }
    

    Comparison Table

    AspectOCamlRust
    String builderBuffer.create nString::new() / String::with_capacity
    Char accesss.[i]chars[i] after collect
    Int to stringstring_of_int countcount.to_string() or format!
    GroupingRecursive with accfold with last_mut()
    String concatBuffer.add_stringpush_str / push

    Learner Notes

  • • **last_mut()**: Rust lets you mutably borrow the last vector element — perfect for accumulating groups
  • • **No chunk_by in stable**: Rust nightly has slice::chunk_by, but stable requires manual grouping
  • • **format! cost**: Each format!("{n}{c}") allocates; for hot paths, use write! into a single String
  • OCaml's Buffer: Similar to Java's StringBuilder — explicit mutable string building in an otherwise immutable world
  • Exercises

  • Implement decode(s: &str) -> String that reverses the encoding: parse the optional number prefix and repeat the following character.
  • Handle multi-digit run counts (e.g. "100A" → 100 As) in decode.
  • Write a property test: decode(encode(s)) == s for any string of alphabetic characters.
  • Implement encode for Vec<T: PartialEq> (not just strings) returning Vec<(T, usize)>.
  • In OCaml, implement decode using Scanf to parse the optional integer and mandatory character.
  • Open Source Repos