013 — Direct Run-Length Encoding
Tutorial Video
Text description (accessibility)
This video demonstrates the "013 — Direct Run-Length Encoding" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. OCaml 99 Problems #13 challenges you to implement run-length encoding directly — without first packing runs into sublists (as in example 010) and then counting them. Key difference from OCaml: 1. **Two
Tutorial
The Problem
OCaml 99 Problems #13 challenges you to implement run-length encoding directly — without first packing runs into sublists (as in example 010) and then counting them. This single-pass approach is more efficient because it avoids intermediate allocations and processes each element exactly once.
The direct approach teaches an important programming pattern: maintaining a small amount of state (current element, current count) as you scan through a sequence, emitting output at each run boundary. This is the basis for streaming compression algorithms, tokenizers, lexers, and parser front-ends. Anything that groups consecutive similar items uses this pattern.
🎯 Learning Outcomes
(current_element, count) and emit at boundariesRleItem enum from example 011 as a shared output typestart marks the run beginning, i advances until a different element is foundCode Example
list.iter().fold(Vec::new(), |mut acc, x| {
match acc.last_mut() {
Some(RleItem::Many(n, ref y)) if y == x => { *n += 1; }
Some(RleItem::One(ref y)) if y == x => {
*acc.last_mut().unwrap() = RleItem::Many(2, y.clone());
}
_ => acc.push(RleItem::One(x.clone())),
}
acc
})Key Differences
Vec/slice). OCaml uses a nested recursive count_run function (idiomatic for linked lists).list[i] and list[start] in O(1). OCaml's list traversal is sequential — random access is O(n), so index-based approaches are avoided.list[i] != list[start] to detect run end. OCaml's when y = x guard in the match arm advances while equal.Vec (amortized O(1)). OCaml builds a cons list in reverse and reverses at the end, or builds forward using a tail-recursive accumulator.(start, i) or (current, count) explicitly. OCaml's recursive version passes state as function arguments — equivalent logic, different syntactic form.while loop vs recursion:** The nested while loops in Rust's direct version are idiomatic for scanning with two pointers. OCaml prefers recursive definitions; Rust uses loops for efficiency.RleItem type:** Both this example and example 011 use the same RleItem<T> enum — showing how shared types enable composition between modules.OCaml Approach
The direct OCaml version is: let rec encode lst = match lst with | [] -> [] | x :: rest -> let rec count_run acc = function | y :: t when y = x -> count_run (acc + 1) t | rest -> (acc, rest) in let (n, remaining) = count_run 1 rest in let item = if n = 1 then One x else Many (n, x) in item :: encode remaining. This uses a nested helper count_run that advances through matching elements, making the structure explicitly recursive.
Full Source
#![allow(clippy::all)]
/// Direct Run-Length Encoding (99 Problems #13)
///
/// Implement RLE directly — don't create sublists first, then count.
/// Instead, count consecutive duplicates in a single pass.
#[derive(Debug, PartialEq, Clone)]
pub enum RleItem<T> {
One(T),
Many(usize, T),
}
// ── Idiomatic Rust: single-pass with state ──────────────────────────────────
pub fn encode_direct<T: PartialEq + Clone>(list: &[T]) -> Vec<RleItem<T>> {
let mut result = Vec::new();
let mut i = 0;
while i < list.len() {
let start = i;
while i < list.len() && list[i] == list[start] {
i += 1;
}
let count = i - start;
result.push(if count == 1 {
RleItem::One(list[start].clone())
} else {
RleItem::Many(count, list[start].clone())
});
}
result
}
// ── Recursive style with accumulator ────────────────────────────────────────
pub fn encode_direct_recursive<T: PartialEq + Clone>(list: &[T]) -> Vec<RleItem<T>> {
fn aux<T: PartialEq + Clone>(list: &[T], count: usize, acc: &mut Vec<RleItem<T>>) {
match list.split_first() {
None => {
// Flush any remaining count — handled by previous step
}
Some((head, tail)) => {
match tail.first() {
Some(next) if next == head => {
aux(tail, count + 1, acc);
}
_ => {
// End of run (or end of list)
acc.push(if count + 1 == 1 {
RleItem::One(head.clone())
} else {
RleItem::Many(count + 1, head.clone())
});
aux(tail, 0, acc);
}
}
}
}
}
let mut result = Vec::new();
aux(list, 0, &mut result);
result
}
// ── Fold-based approach ─────────────────────────────────────────────────────
pub fn encode_direct_fold<T: PartialEq + Clone>(list: &[T]) -> Vec<RleItem<T>> {
list.iter().fold(Vec::new(), |mut acc, x| {
match acc.last_mut() {
Some(RleItem::One(ref y)) if y == x => {
let y_clone = y.clone();
*acc.last_mut().unwrap() = RleItem::Many(2, y_clone);
}
Some(RleItem::Many(n, ref y)) if y == x => {
*n += 1;
}
_ => {
acc.push(RleItem::One(x.clone()));
}
}
acc
})
}
#[cfg(test)]
mod tests {
use super::*;
use RleItem::*;
#[test]
fn test_empty() {
assert_eq!(encode_direct::<i32>(&[]), vec![]);
assert_eq!(encode_direct_recursive::<i32>(&[]), vec![]);
assert_eq!(encode_direct_fold::<i32>(&[]), vec![]);
}
#[test]
fn test_no_repeats() {
let input = vec!['a', 'b', 'c'];
let expected = vec![One('a'), One('b'), One('c')];
assert_eq!(encode_direct(&input), expected);
assert_eq!(encode_direct_fold(&input), expected);
}
#[test]
fn test_all_same() {
assert_eq!(encode_direct(&[1, 1, 1, 1]), vec![Many(4, 1)]);
}
#[test]
fn test_classic_example() {
let input = vec![
'a', 'a', 'a', 'a', 'b', 'c', 'c', 'a', 'a', 'd', 'e', 'e', 'e', 'e',
];
let expected = vec![
Many(4, 'a'),
One('b'),
Many(2, 'c'),
Many(2, 'a'),
One('d'),
Many(4, 'e'),
];
assert_eq!(encode_direct(&input), expected);
assert_eq!(encode_direct_recursive(&input), expected);
assert_eq!(encode_direct_fold(&input), expected);
}
#[test]
fn test_single() {
assert_eq!(encode_direct(&['z']), vec![One('z')]);
}
#[test]
fn test_two_same() {
assert_eq!(encode_direct(&[5, 5]), vec![Many(2, 5)]);
}
}#[cfg(test)]
mod tests {
use super::*;
use RleItem::*;
#[test]
fn test_empty() {
assert_eq!(encode_direct::<i32>(&[]), vec![]);
assert_eq!(encode_direct_recursive::<i32>(&[]), vec![]);
assert_eq!(encode_direct_fold::<i32>(&[]), vec![]);
}
#[test]
fn test_no_repeats() {
let input = vec!['a', 'b', 'c'];
let expected = vec![One('a'), One('b'), One('c')];
assert_eq!(encode_direct(&input), expected);
assert_eq!(encode_direct_fold(&input), expected);
}
#[test]
fn test_all_same() {
assert_eq!(encode_direct(&[1, 1, 1, 1]), vec![Many(4, 1)]);
}
#[test]
fn test_classic_example() {
let input = vec![
'a', 'a', 'a', 'a', 'b', 'c', 'c', 'a', 'a', 'd', 'e', 'e', 'e', 'e',
];
let expected = vec![
Many(4, 'a'),
One('b'),
Many(2, 'c'),
Many(2, 'a'),
One('d'),
Many(4, 'e'),
];
assert_eq!(encode_direct(&input), expected);
assert_eq!(encode_direct_recursive(&input), expected);
assert_eq!(encode_direct_fold(&input), expected);
}
#[test]
fn test_single() {
assert_eq!(encode_direct(&['z']), vec![One('z')]);
}
#[test]
fn test_two_same() {
assert_eq!(encode_direct(&[5, 5]), vec![Many(2, 5)]);
}
}
Deep Comparison
Direct Run-Length Encoding: OCaml vs Rust
The Core Insight
Direct RLE eliminates the intermediate step of grouping — it counts runs on the fly. This requires carrying mutable state (the current count) through traversal. OCaml does this with recursive accumulators; Rust offers both fold-based and imperative approaches, with the fold version being particularly interesting because it mutates the accumulator in place.
OCaml Approach
The recursive version carries a count alongside the accumulator:
let rec aux count acc = function
| [] -> List.rev acc
| [x] -> List.rev ((if count = 0 then One x else Many (count+1, x)) :: acc)
| x :: (y :: _ as rest) ->
if x = y then aux (count+1) acc rest
else let item = if count = 0 then One x else Many (count+1, x) in
aux 0 (item :: acc) rest
The fold version builds the result in reverse, checking the head of the accumulator to extend the current run.
Rust Approach
Rust's fold approach modifies the last element of the accumulator in place:
list.iter().fold(Vec::new(), |mut acc, x| {
match acc.last_mut() {
Some(RleItem::Many(n, ref y)) if y == x => { *n += 1; }
Some(RleItem::One(ref y)) if y == x => {
*acc.last_mut().unwrap() = RleItem::Many(2, y.clone());
}
_ => acc.push(RleItem::One(x.clone())),
}
acc
})
The last_mut() method gives a mutable reference to the last element — a pattern impossible in pure functional OCaml but natural in Rust.
Key Differences
| Aspect | OCaml | Rust |
|---|---|---|
| State threading | Explicit parameters (count, acc) | fold with mutable accumulator |
| Mutation | Never (new list nodes) | In-place via last_mut() |
| Peeking ahead | x :: (y :: _ as rest) pattern | tail.first() or index comparison |
| Reverse at end | List.rev acc (common pattern) | Not needed (push appends) |
| Two-element look-ahead | Native with pattern matching | Requires explicit index logic |
What Rust Learners Should Notice
last_mut() enables accumulator mutation**: Rust's ownership system lets you mutate the last element of a Vec through a mutable reference — something OCaml can't do with immutable lists.x :: (y :: _ as rest) naturally looks at two elements. Rust requires comparing list[i] with list[i-1] or using windows(2).Further Reading
Exercises
encode_direct to work on &[u8] for binary data encoding. Benchmark it against the version from example 011 on a large repeated-byte input.longest_run<T: PartialEq>(list: &[T]) -> Option<(usize, &T)> that returns the count and value of the longest consecutive run in a single pass.RleEncoder<T> as a struct that accepts elements one at a time via push(&mut self, x: T) and emits complete RleItem<T> values when a run ends (like a streaming codec).encode_direct as an Iterator<Item = RleItem<T>> — returning a lazy encoder that produces encoded items on demand without building the full output vector.