010 Intermediate

Run-Length Encoding

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "Run-Length Encoding" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Use the result of Problem 9 (pack consecutive duplicates) to implement run-length encoding: replace consecutive duplicate elements with `(count, element)` pairs. Key difference from OCaml: 1. **Tuple types**: Rust's `(usize, T)` is structurally typed like OCaml's `int * 'a`, but Rust tuples own their data

Tutorial

The Problem

Use the result of Problem 9 (pack consecutive duplicates) to implement run-length encoding: replace consecutive duplicate elements with (count, element) pairs.

Run-length encoding (RLE) is one of the oldest compression algorithms, dating to the 1960s. It is used in fax transmission (CCITT T.4 standard), BMP and PCX image formats, TIFF compression, and the PackBits algorithm in TIFF. The core idea is to replace runs of repeated data with a single (count, value) pair. For data with long runs (solid-color images, simple fax documents), RLE achieves significant compression. For random data with no runs, it can expand the data.

🎯 Learning Outcomes

• Compose solutions by building on previous problems (pack → encode)

• Use tuple types (usize, T) as lightweight data containers

• Practice the fold pattern for single-pass encoding

• Compare OCaml's tuple-based encoding with Rust's strongly-typed tuples

• Understand function composition: pack().map() vs direct iteration

🦀 The Rust Way

Compose with pack: Pack first, then map groups to (len, first) — mirrors OCaml

Fold: Single-pass fold, matching on last_mut() to increment count or start new run

Direct: Imperative single-pass with counter — most efficient, no intermediate structures

Code Example

pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
    pack(list)
        .into_iter()
        .map(|group| (group.len(), group[0].clone()))
        .collect()
}

let encode lst =
  List.map (fun group -> (List.length group, List.hd group)) (pack lst)

Key Differences

Tuple types: Rust's (usize, T) is structurally typed like OCaml's int * 'a, but Rust tuples own their data

Composition: pack().into_iter().map() chains naturally — Rust's iterator adaptors mirror OCaml's List.map

Single-pass efficiency: The fold and direct versions avoid creating intermediate packed groups

**usize vs int**: Rust uses usize for counts (unsigned, pointer-sized); OCaml uses int (signed, word-sized)

No intermediate allocation: The direct version builds the result in one pass — important for large inputs

Composition: The two-pass encode (pack then map) is cleaner but allocates twice. The single-pass encode_fold is more efficient. OCaml's version typically uses the same two-pass structure as it composes pack from problem 9.

Tuple types: (usize, T) in Rust is a product type. OCaml's (int * 'a) is the same concept with different syntax. Both are structural — no named fields.

**fold for single pass:** Using fold with mutable access to the last element (acc.last_mut()) avoids re-scanning the accumulator. This is the key idiom for single-pass accumulation.

OCaml Approach

First packs consecutive elements (reusing Problem 9's pack), then maps each group to a (count, element) tuple. The direct version counts in a single pass with a recursive helper.

Full Source

#![allow(clippy::all)]
//! # Run-Length Encoding
//! OCaml 99 Problems #10 — Encode consecutive duplicates as (count, element) pairs.

/// Idiomatic Rust: pack then map to (count, element).
pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
    pack(list)
        .into_iter()
        .map(|group| (group.len(), group[0].clone()))
        .collect()
}

/// Functional style: single-pass fold, no intermediate packing.
pub fn encode_fold<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
    list.iter().fold(Vec::new(), |mut acc, item| {
        match acc.last_mut() {
            Some((count, ref val)) if val == item => {
                *count += 1;
            }
            _ => {
                acc.push((1, item.clone()));
            }
        }
        acc
    })
}

/// Direct approach: iterate with counting, no helper function.
pub fn encode_direct<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
    if list.is_empty() {
        return vec![];
    }
    let mut result = Vec::new();
    let mut count = 1;
    for i in 1..list.len() {
        if list[i] == list[i - 1] {
            count += 1;
        } else {
            result.push((count, list[i - 1].clone()));
            count = 1;
        }
    }
    result.push((count, list[list.len() - 1].clone()));
    result
}

/// Helper: pack consecutive duplicates (from example 009).
fn pack<T: PartialEq + Clone>(list: &[T]) -> Vec<Vec<T>> {
    if list.is_empty() {
        return vec![];
    }
    let mut result = Vec::new();
    let mut current = vec![list[0].clone()];
    for item in &list[1..] {
        if *item == current[0] {
            current.push(item.clone());
        } else {
            result.push(current);
            current = vec![item.clone()];
        }
    }
    result.push(current);
    result
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_encode() {
        let input = vec!["a", "a", "b", "c", "c", "c"];
        let expected = vec![(2, "a"), (1, "b"), (3, "c")];
        assert_eq!(encode(&input), expected);
        assert_eq!(encode_fold(&input), expected);
        assert_eq!(encode_direct(&input), expected);
    }

    #[test]
    fn test_empty() {
        assert_eq!(encode::<i32>(&[]), Vec::<(usize, i32)>::new());
        assert_eq!(encode_fold::<i32>(&[]), Vec::<(usize, i32)>::new());
        assert_eq!(encode_direct::<i32>(&[]), Vec::<(usize, i32)>::new());
    }

    #[test]
    fn test_single() {
        assert_eq!(encode(&[1]), vec![(1, 1)]);
        assert_eq!(encode_fold(&[1]), vec![(1, 1)]);
        assert_eq!(encode_direct(&[1]), vec![(1, 1)]);
    }

    #[test]
    fn test_no_duplicates() {
        assert_eq!(encode(&[1, 2, 3]), vec![(1, 1), (1, 2), (1, 3)]);
    }

    #[test]
    fn test_all_same() {
        assert_eq!(encode(&[5, 5, 5, 5]), vec![(4, 5)]);
    }
}

(* Run-length Encoding *)
(* OCaml 99 Problems #10 *)

(* First, pack consecutive elements *)
let pack lst =
  let rec aux current acc = function
    | [] -> []
    | [x] -> (x :: current) :: acc
    | h1 :: (h2 :: _ as t) ->
        if h1 = h2 then aux (h1 :: current) acc t
        else aux [] ((h1 :: current) :: acc) t
  in
  List.rev (aux [] [] lst)

(* Then encode by counting *)
let encode lst =
  let rec count n = function
    | [] -> []
    | [x] -> [(n + 1, x)]
    | h1 :: (h2 :: _ as t) -> (n + 1, h1) :: count 0 t
  in
  count 0 (pack lst)

(* Tests *)
let () =
  assert (encode ["a";"a";"b";"c";"c";"c"] = [(2,"a");(1,"b");(3,"c")]);
  assert (encode [] = []);
  assert (encode [1] = [(1,1)]);
  print_endline "✓ OCaml tests passed"

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_encode() {
        let input = vec!["a", "a", "b", "c", "c", "c"];
        let expected = vec![(2, "a"), (1, "b"), (3, "c")];
        assert_eq!(encode(&input), expected);
        assert_eq!(encode_fold(&input), expected);
        assert_eq!(encode_direct(&input), expected);
    }

    #[test]
    fn test_empty() {
        assert_eq!(encode::<i32>(&[]), Vec::<(usize, i32)>::new());
        assert_eq!(encode_fold::<i32>(&[]), Vec::<(usize, i32)>::new());
        assert_eq!(encode_direct::<i32>(&[]), Vec::<(usize, i32)>::new());
    }

    #[test]
    fn test_single() {
        assert_eq!(encode(&[1]), vec![(1, 1)]);
        assert_eq!(encode_fold(&[1]), vec![(1, 1)]);
        assert_eq!(encode_direct(&[1]), vec![(1, 1)]);
    }

    #[test]
    fn test_no_duplicates() {
        assert_eq!(encode(&[1, 2, 3]), vec![(1, 1), (1, 2), (1, 3)]);
    }

    #[test]
    fn test_all_same() {
        assert_eq!(encode(&[5, 5, 5, 5]), vec![(4, 5)]);
    }
}

Deep Comparison

Comparison: Run-Length Encoding

OCaml — Pack then encode

let encode lst =
  List.map (fun group -> (List.length group, List.hd group)) (pack lst)

Rust — Idiomatic (compose with pack)

pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
    pack(list)
        .into_iter()
        .map(|group| (group.len(), group[0].clone()))
        .collect()
}

Rust — Functional (single-pass fold)

pub fn encode_fold<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
    list.iter().fold(Vec::new(), |mut acc, item| {
        match acc.last_mut() {
            Some((count, ref val)) if val == item => *count += 1,
            _ => acc.push((1, item.clone())),
        }
        acc
    })
}

Comparison Table

Aspect	OCaml	Rust
Tuple type	`int * 'a`	`(usize, T)`
Count type	`int` (signed)	`usize` (unsigned)
Composition	`List.map f (pack lst)`	`pack().into_iter().map().collect()`
Single-pass	Separate recursive function	`fold` with `last_mut()`
Ownership	GC handles pack result	`into_iter()` consumes packed groups

Type Signatures

• OCaml: val encode : 'a list -> (int * 'a) list

• Rust: fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)>

Takeaways

Both languages naturally compose: pack first, then transform — functional composition at work

Rust's into_iter() consumes the intermediate packed groups — no waste

The fold version shows Rust can match OCaml's single-pass elegance with last_mut()

usize for counts is more semantically correct (counts can't be negative) than OCaml's int

The pattern of match acc.last_mut() is a Rust idiom for stateful folds — learn it well

Exercises

Implement the decoder: given a run-length encoded Vec<(usize, T)>, reconstruct the original Vec<T>.

Write a modified RLE that only encodes runs of length greater than 1 — single elements are stored as-is using an enum Rle<T> { Single(T), Run(usize, T) }.

Implement RLE for a stream of bytes and measure its compression ratio on a sample ASCII text; then try to compress already-compressed data and observe the result.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust