094 — Run-Length Encoding
Tutorial Video
Text description (accessibility)
This video demonstrates the "094 — Run-Length Encoding" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Compress a string using run-length encoding: replace consecutive repeated characters with a count followed by the character (e.g. Key difference from OCaml: | Aspect | Rust | OCaml |
Tutorial
The Problem
Compress a string using run-length encoding: replace consecutive repeated characters with a count followed by the character (e.g. "AAABCC" → "3AB2C"). Implement both an imperative index-based version and a functional grouping version. Compare with OCaml's Buffer-based recursive implementation.
🎯 Learning Outcomes
Vec<(char, usize)> of groups using last_mut and pattern matchingformat! conditionally to omit the count 1 from single-character runsString::push and push_str for incremental string buildingString building to OCaml's Buffer.add_char/Buffer.add_stringdecode as the inverse: parse optional count, then characterCode Example
pub fn encode_fold(s: &str) -> String {
s.chars()
.fold(Vec::<(char, usize)>::new(), |mut acc, c| {
match acc.last_mut() {
Some((last, count)) if *last == c => *count += 1,
_ => acc.push((c, 1)),
}; acc
})
.iter()
.map(|&(c, n)| if n > 1 { format!("{n}{c}") } else { c.to_string() })
.collect()
}Key Differences
| Aspect | Rust | OCaml |
|---|---|---|
| String builder | String with push/push_str | Buffer with add_char/add_string |
| Group detection | last_mut() + match | Recursive go i c count |
| Format number | count.to_string() | string_of_int count |
| Collect groups | Vec<(char, usize)> | Implicit in recursion |
| Single chars | c.to_string() | Buffer.add_char buf c |
| Pre-allocation | String::with_capacity | Buffer.create n |
Run-length encoding is a classic interview problem that tests string building, grouping, and edge-case handling (empty string, single characters, long runs). The decode inverse is equally instructive: parse optional digits, then a mandatory character.
OCaml Approach
OCaml uses Buffer.create and Buffer.add_char/Buffer.add_string for efficient incremental building. The recursive go i c count function scans the string, writing to the buffer at boundaries. Buffer.contents materialises the final string. OCaml's Buffer is semantically equivalent to Rust's String::push/push_str pattern.
Full Source
#![allow(clippy::all)]
//! # Run-Length Encoding — String Compression
//!
//! Encode consecutive repeated characters as count+char.
//! OCaml uses `Buffer` for incremental string building; Rust uses `String` directly.
// ---------------------------------------------------------------------------
// Approach A: Imperative — iterate with tracking
// ---------------------------------------------------------------------------
pub fn encode(s: &str) -> String {
if s.is_empty() {
return String::new();
}
let mut result = String::new();
let chars: Vec<char> = s.chars().collect();
let mut count = 1;
for i in 1..chars.len() {
if chars[i] == chars[i - 1] {
count += 1;
} else {
if count > 1 {
result.push_str(&count.to_string());
}
result.push(chars[i - 1]);
count = 1;
}
}
if count > 1 {
result.push_str(&count.to_string());
}
result.push(*chars.last().unwrap());
result
}
// ---------------------------------------------------------------------------
// Approach B: Iterator — chunk_by (nightly) or manual grouping
// ---------------------------------------------------------------------------
pub fn encode_functional(s: &str) -> String {
if s.is_empty() {
return String::new();
}
let chars: Vec<char> = s.chars().collect();
let mut groups: Vec<(char, usize)> = vec![];
for &c in &chars {
match groups.last_mut() {
Some((last, count)) if *last == c => *count += 1,
_ => groups.push((c, 1)),
}
}
groups
.iter()
.map(|&(c, n)| {
if n > 1 {
format!("{}{}", n, c)
} else {
c.to_string()
}
})
.collect()
}
// ---------------------------------------------------------------------------
// Approach C: Fold-based grouping
// ---------------------------------------------------------------------------
pub fn encode_fold(s: &str) -> String {
s.chars()
.fold(Vec::<(char, usize)>::new(), |mut acc, c| {
match acc.last_mut() {
Some((last, count)) if *last == c => *count += 1,
_ => acc.push((c, 1)),
}
acc
})
.iter()
.map(|&(c, n)| {
if n > 1 {
format!("{}{}", n, c)
} else {
c.to_string()
}
})
.collect()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_basic() {
assert_eq!(encode("AABCCCDEEEE"), "2AB3CD4E");
}
#[test]
fn test_no_repeats() {
assert_eq!(encode("ABCDE"), "ABCDE");
}
#[test]
fn test_all_same() {
assert_eq!(encode("AAAA"), "4A");
}
#[test]
fn test_empty() {
assert_eq!(encode(""), "");
}
#[test]
fn test_single() {
assert_eq!(encode("A"), "A");
}
#[test]
fn test_functional() {
assert_eq!(encode_functional("AABCCCDEEEE"), "2AB3CD4E");
}
#[test]
fn test_fold() {
assert_eq!(encode_fold("AABCCCDEEEE"), "2AB3CD4E");
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_basic() {
assert_eq!(encode("AABCCCDEEEE"), "2AB3CD4E");
}
#[test]
fn test_no_repeats() {
assert_eq!(encode("ABCDE"), "ABCDE");
}
#[test]
fn test_all_same() {
assert_eq!(encode("AAAA"), "4A");
}
#[test]
fn test_empty() {
assert_eq!(encode(""), "");
}
#[test]
fn test_single() {
assert_eq!(encode("A"), "A");
}
#[test]
fn test_functional() {
assert_eq!(encode_functional("AABCCCDEEEE"), "2AB3CD4E");
}
#[test]
fn test_fold() {
assert_eq!(encode_fold("AABCCCDEEEE"), "2AB3CD4E");
}
}
Deep Comparison
Comparison: Run-Length Encoding — OCaml vs Rust
Core Insight
Both languages build the result string incrementally. OCaml's Buffer module is the explicit choice for this; Rust's String is inherently a growable UTF-8 buffer. The recursive vs iterative style differs, but the core algorithm — tracking the current character and its count — is identical.
OCaml
let encode s =
let buf = Buffer.create (String.length s) in
let rec go i c count =
if i = String.length s then begin
if count > 1 then Buffer.add_string buf (string_of_int count);
Buffer.add_char buf c end
else if s.[i] = c then go (i+1) c (count+1)
else begin
if count > 1 then Buffer.add_string buf (string_of_int count);
Buffer.add_char buf c; go (i+1) s.[i] 1 end
in go 1 s.[0] 1; Buffer.contents buf
Rust — Fold-based
pub fn encode_fold(s: &str) -> String {
s.chars()
.fold(Vec::<(char, usize)>::new(), |mut acc, c| {
match acc.last_mut() {
Some((last, count)) if *last == c => *count += 1,
_ => acc.push((c, 1)),
}; acc
})
.iter()
.map(|&(c, n)| if n > 1 { format!("{n}{c}") } else { c.to_string() })
.collect()
}
Comparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| String builder | Buffer.create n | String::new() / String::with_capacity |
| Char access | s.[i] | chars[i] after collect |
| Int to string | string_of_int count | count.to_string() or format! |
| Grouping | Recursive with acc | fold with last_mut() |
| String concat | Buffer.add_string | push_str / push |
Learner Notes
last_mut()**: Rust lets you mutably borrow the last vector element — perfect for accumulating groupschunk_by in stable**: Rust nightly has slice::chunk_by, but stable requires manual groupingformat! cost**: Each format!("{n}{c}") allocates; for hot paths, use write! into a single StringStringBuilder — explicit mutable string building in an otherwise immutable worldExercises
decode(s: &str) -> String that reverses the encoding: parse the optional number prefix and repeat the following character."100A" → 100 As) in decode.decode(encode(s)) == s for any string of alphabetic characters.Vec<T: PartialEq> (not just strings) returning Vec<(T, usize)>.decode using Scanf to parse the optional integer and mandatory character.