Run-Length Encoding
Tutorial Video
Text description (accessibility)
This video demonstrates the "Run-Length Encoding" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Lists, Data Compression. Use the result of Problem 9 (pack consecutive duplicates) to implement run-length encoding: replace consecutive duplicate elements with `(count, element)` pairs. Key difference from OCaml: 1. **Tuple types**: Rust's `(usize, T)` is structurally typed like OCaml's `int * 'a`, but Rust tuples own their data
Tutorial
The Problem
Use the result of Problem 9 (pack consecutive duplicates) to implement run-length encoding: replace consecutive duplicate elements with (count, element) pairs.
Run-length encoding (RLE) is one of the oldest compression algorithms, dating to the 1960s. It is used in fax transmission (CCITT T.4 standard), BMP and PCX image formats, TIFF compression, and the PackBits algorithm in TIFF. The core idea is to replace runs of repeated data with a single (count, value) pair. For data with long runs (solid-color images, simple fax documents), RLE achieves significant compression. For random data with no runs, it can expand the data.
🎯 Learning Outcomes
(usize, T) as lightweight data containerspack().map() vs direct iteration🦀 The Rust Way
map groups to (len, first) — mirrors OCamllast_mut() to increment count or start new runCode Example
pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
pack(list)
.into_iter()
.map(|group| (group.len(), group[0].clone()))
.collect()
}Key Differences
(usize, T) is structurally typed like OCaml's int * 'a, but Rust tuples own their datapack().into_iter().map() chains naturally — Rust's iterator adaptors mirror OCaml's List.mapusize vs int**: Rust uses usize for counts (unsigned, pointer-sized); OCaml uses int (signed, word-sized)encode (pack then map) is cleaner but allocates twice. The single-pass encode_fold is more efficient. OCaml's version typically uses the same two-pass structure as it composes pack from problem 9.(usize, T) in Rust is a product type. OCaml's (int * 'a) is the same concept with different syntax. Both are structural — no named fields.fold for single pass:** Using fold with mutable access to the last element (acc.last_mut()) avoids re-scanning the accumulator. This is the key idiom for single-pass accumulation.OCaml Approach
First packs consecutive elements (reusing Problem 9's pack), then maps each group to a (count, element) tuple. The direct version counts in a single pass with a recursive helper.
Full Source
#![allow(clippy::all)]
//! # Run-Length Encoding
//! OCaml 99 Problems #10 — Encode consecutive duplicates as (count, element) pairs.
/// Idiomatic Rust: pack then map to (count, element).
pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
pack(list)
.into_iter()
.map(|group| (group.len(), group[0].clone()))
.collect()
}
/// Functional style: single-pass fold, no intermediate packing.
pub fn encode_fold<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
list.iter().fold(Vec::new(), |mut acc, item| {
match acc.last_mut() {
Some((count, ref val)) if val == item => {
*count += 1;
}
_ => {
acc.push((1, item.clone()));
}
}
acc
})
}
/// Direct approach: iterate with counting, no helper function.
pub fn encode_direct<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
if list.is_empty() {
return vec![];
}
let mut result = Vec::new();
let mut count = 1;
for i in 1..list.len() {
if list[i] == list[i - 1] {
count += 1;
} else {
result.push((count, list[i - 1].clone()));
count = 1;
}
}
result.push((count, list[list.len() - 1].clone()));
result
}
/// Helper: pack consecutive duplicates (from example 009).
fn pack<T: PartialEq + Clone>(list: &[T]) -> Vec<Vec<T>> {
if list.is_empty() {
return vec![];
}
let mut result = Vec::new();
let mut current = vec![list[0].clone()];
for item in &list[1..] {
if *item == current[0] {
current.push(item.clone());
} else {
result.push(current);
current = vec![item.clone()];
}
}
result.push(current);
result
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_encode() {
let input = vec!["a", "a", "b", "c", "c", "c"];
let expected = vec![(2, "a"), (1, "b"), (3, "c")];
assert_eq!(encode(&input), expected);
assert_eq!(encode_fold(&input), expected);
assert_eq!(encode_direct(&input), expected);
}
#[test]
fn test_empty() {
assert_eq!(encode::<i32>(&[]), Vec::<(usize, i32)>::new());
assert_eq!(encode_fold::<i32>(&[]), Vec::<(usize, i32)>::new());
assert_eq!(encode_direct::<i32>(&[]), Vec::<(usize, i32)>::new());
}
#[test]
fn test_single() {
assert_eq!(encode(&[1]), vec![(1, 1)]);
assert_eq!(encode_fold(&[1]), vec![(1, 1)]);
assert_eq!(encode_direct(&[1]), vec![(1, 1)]);
}
#[test]
fn test_no_duplicates() {
assert_eq!(encode(&[1, 2, 3]), vec![(1, 1), (1, 2), (1, 3)]);
}
#[test]
fn test_all_same() {
assert_eq!(encode(&[5, 5, 5, 5]), vec![(4, 5)]);
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_encode() {
let input = vec!["a", "a", "b", "c", "c", "c"];
let expected = vec![(2, "a"), (1, "b"), (3, "c")];
assert_eq!(encode(&input), expected);
assert_eq!(encode_fold(&input), expected);
assert_eq!(encode_direct(&input), expected);
}
#[test]
fn test_empty() {
assert_eq!(encode::<i32>(&[]), Vec::<(usize, i32)>::new());
assert_eq!(encode_fold::<i32>(&[]), Vec::<(usize, i32)>::new());
assert_eq!(encode_direct::<i32>(&[]), Vec::<(usize, i32)>::new());
}
#[test]
fn test_single() {
assert_eq!(encode(&[1]), vec![(1, 1)]);
assert_eq!(encode_fold(&[1]), vec![(1, 1)]);
assert_eq!(encode_direct(&[1]), vec![(1, 1)]);
}
#[test]
fn test_no_duplicates() {
assert_eq!(encode(&[1, 2, 3]), vec![(1, 1), (1, 2), (1, 3)]);
}
#[test]
fn test_all_same() {
assert_eq!(encode(&[5, 5, 5, 5]), vec![(4, 5)]);
}
}
Deep Comparison
Comparison: Run-Length Encoding
OCaml — Pack then encode
let encode lst =
List.map (fun group -> (List.length group, List.hd group)) (pack lst)
Rust — Idiomatic (compose with pack)
pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
pack(list)
.into_iter()
.map(|group| (group.len(), group[0].clone()))
.collect()
}
Rust — Functional (single-pass fold)
pub fn encode_fold<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
list.iter().fold(Vec::new(), |mut acc, item| {
match acc.last_mut() {
Some((count, ref val)) if val == item => *count += 1,
_ => acc.push((1, item.clone())),
}
acc
})
}
Comparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| Tuple type | int * 'a | (usize, T) |
| Count type | int (signed) | usize (unsigned) |
| Composition | List.map f (pack lst) | pack().into_iter().map().collect() |
| Single-pass | Separate recursive function | fold with last_mut() |
| Ownership | GC handles pack result | into_iter() consumes packed groups |
Type Signatures
val encode : 'a list -> (int * 'a) listfn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)>Takeaways
into_iter() consumes the intermediate packed groups — no wastelast_mut()usize for counts is more semantically correct (counts can't be negative) than OCaml's intmatch acc.last_mut() is a Rust idiom for stateful folds — learn it wellExercises
Vec<(usize, T)>, reconstruct the original Vec<T>.Rle<T> { Single(T), Run(usize, T) }.