011 Intermediate

011 — Modified Run-Length Encoding

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "011 — Modified Run-Length Encoding" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Functional Programming. Standard run-length encoding always emits `(count, element)` pairs, but this is wasteful for singleton runs: `(1, 'a')` is less clear than just `'a'`. Key difference from OCaml: 1. **Generic enum**: Rust's `RleItem<T>` is a generic enum with a type parameter. OCaml's `'a rle` uses a type variable directly in the variant definition — the same concept, different syntax.

Tutorial

The Problem

Standard run-length encoding always emits (count, element) pairs, but this is wasteful for singleton runs: (1, 'a') is less clear than just 'a'. The modified encoding (OCaml 99 Problems #11) uses a sum type: elements with count > 1 are represented as Many(count, element), while singletons are represented as One(element). This avoids redundancy and makes the encoding self-describing.

This problem demonstrates how algebraic data types (enums with data) replace verbose class hierarchies. The RleItem<T> enum is the Rust/OCaml equivalent of a sealed interface with two implementations in Java. Modified RLE is used in fax transmission (CCITT T.4), BMP file compression, and PCX image format.

🎯 Learning Outcomes

• Use a generic enum to represent a sum type (One vs Many)

• Implement encoding in both iterative (stateful loop) and recursive styles

• Apply the run-detection pattern: scan for runs, then emit the appropriate variant

• Understand how enums with data replace class hierarchies from OOP languages

• Use PartialEq + Clone trait bounds for generic types

• Implement both a stateful loop and a recursive helper that uses position() to find run boundaries

• Derive Debug and PartialEq on RleItem<T> to enable assertions in tests

Code Example

#[derive(Debug, PartialEq, Clone)]
pub enum RleItem<T> { One(T), Many(usize, T) }

type 'a rle = One of 'a | Many of int * 'a

let encode lst =
  pack lst |> List.map (fun group ->
    match group with
    | [x] -> One x
    | x :: _ -> Many (List.length group, x)
    | [] -> failwith "impossible")

Key Differences

Generic enum: Rust's RleItem<T> is a generic enum with a type parameter. OCaml's 'a rle uses a type variable directly in the variant definition — the same concept, different syntax.

Clone bound: Rust requires T: Clone when the encode function clones elements into results. OCaml's GC shares structure automatically; no explicit cloning.

Run detection: Rust uses indexed access (list[i] == list[i-1]) in the loop. OCaml's span (not in stdlib but easily defined) splits a list by predicate — a higher-level abstraction.

Memory layout: Rust's Vec<RleItem<T>> stores enum variants inline with a discriminant tag. OCaml boxes variant values on the heap.

Sum types: RleItem<T> with One(T) and Many(usize, T) is a sum type — a value is exactly one of two shapes. OCaml's type 'a rle = One of 'a | Many of int * 'a is identical in concept and nearly identical in syntax.

Generic enum: Rust's RleItem<T> is generic over T. The enum itself is not tied to any specific element type. OCaml's 'a rle is the same.

Pattern matching in loops: The stateful loop uses an index i from 1..=list.len() and accesses list[i-1] at boundaries. This is a common pattern when you need to emit the last group after the loop ends.

Ownership: RleItem::One(T) owns the value. Constructing RleItem::Many(count, list[i-1].clone()) requires cloning from the slice — explicit in Rust, transparent in OCaml.

OCaml Approach

OCaml defines type 'a rle = One of 'a | Many of int * 'a. The encoding function matches on lists: let rec encode = function | [] -> [] | x :: xs -> let (run, rest) = span (fun y -> y = x) xs in let item = if run = [] then One x else Many (1 + List.length run, x) in item :: encode rest. The span function splits the list at the first element that does not match — a common functional idiom.

Full Source

#![allow(clippy::all)]
/// Modified Run-Length Encoding (99 Problems #11)
///
/// Modify the RLE result so that elements without duplicates are simply
/// copied into the result list, while duplicates are encoded as (count, elem).

#[derive(Debug, PartialEq, Clone)]
pub enum RleItem<T> {
    One(T),
    Many(usize, T),
}

// ── Idiomatic Rust: iterator-based ──────────────────────────────────────────

pub fn encode_modified<T: PartialEq + Clone>(list: &[T]) -> Vec<RleItem<T>> {
    if list.is_empty() {
        return vec![];
    }
    let mut result = Vec::new();
    let mut count = 1;
    for i in 1..=list.len() {
        if i < list.len() && list[i] == list[i - 1] {
            count += 1;
        } else {
            result.push(if count == 1 {
                RleItem::One(list[i - 1].clone())
            } else {
                RleItem::Many(count, list[i - 1].clone())
            });
            count = 1;
        }
    }
    result
}

// ── Functional/recursive style ──────────────────────────────────────────────

pub fn encode_modified_recursive<T: PartialEq + Clone>(list: &[T]) -> Vec<RleItem<T>> {
    fn pack_run<T: PartialEq + Clone>(list: &[T]) -> (&[T], &[T]) {
        if list.len() <= 1 {
            return (list, &[]);
        }
        let first = &list[0];
        let end = list.iter().position(|x| x != first).unwrap_or(list.len());
        (&list[..end], &list[end..])
    }

    fn aux<T: PartialEq + Clone>(list: &[T], acc: &mut Vec<RleItem<T>>) {
        if list.is_empty() {
            return;
        }
        let (run, rest) = pack_run(list);
        acc.push(if run.len() == 1 {
            RleItem::One(run[0].clone())
        } else {
            RleItem::Many(run.len(), run[0].clone())
        });
        aux(rest, acc);
    }

    let mut result = Vec::new();
    aux(list, &mut result);
    result
}

// ── Using chunk_by (functional, slice-based) ────────────────────────────────

pub fn encode_modified_chunks<T: PartialEq + Clone>(list: &[T]) -> Vec<RleItem<T>> {
    // Manual chunk-by since slice::chunk_by is nightly
    let mut result = Vec::new();
    let mut i = 0;
    while i < list.len() {
        let start = i;
        while i < list.len() && list[i] == list[start] {
            i += 1;
        }
        let len = i - start;
        result.push(if len == 1 {
            RleItem::One(list[start].clone())
        } else {
            RleItem::Many(len, list[start].clone())
        });
    }
    result
}

#[cfg(test)]
mod tests {
    use super::*;
    use RleItem::*;

    #[test]
    fn test_empty() {
        assert_eq!(encode_modified::<char>(&[]), vec![]);
        assert_eq!(encode_modified_recursive::<char>(&[]), vec![]);
    }

    #[test]
    fn test_no_duplicates() {
        assert_eq!(
            encode_modified(&['a', 'b', 'c']),
            vec![One('a'), One('b'), One('c')]
        );
    }

    #[test]
    fn test_all_same() {
        assert_eq!(encode_modified(&['x', 'x', 'x']), vec![Many(3, 'x')]);
    }

    #[test]
    fn test_mixed() {
        let input = vec!['a', 'a', 'a', 'b', 'c', 'c', 'd', 'd', 'd', 'd'];
        let expected = vec![Many(3, 'a'), One('b'), Many(2, 'c'), Many(4, 'd')];
        assert_eq!(encode_modified(&input), expected);
        assert_eq!(encode_modified_recursive(&input), expected);
        assert_eq!(encode_modified_chunks(&input), expected);
    }

    #[test]
    fn test_single_element() {
        assert_eq!(encode_modified(&[42]), vec![One(42)]);
    }

    #[test]
    fn test_strings() {
        let input = vec!["hi", "hi", "bye"];
        assert_eq!(encode_modified(&input), vec![Many(2, "hi"), One("bye")]);
    }
}

(* Modified Run-Length Encoding — 99 Problems #11 *)
(* Single elements stay unwrapped, runs get (count, elem) *)

type 'a rle =
  | One of 'a
  | Many of int * 'a

(* ── Pack consecutive duplicates into sublists ───────────── *)

let pack lst =
  let rec aux current acc = function
    | [] -> List.rev (current :: acc)
    | x :: rest ->
      match current with
      | [] -> aux [x] acc rest
      | y :: _ when x = y -> aux (x :: current) acc rest
      | _ -> aux [x] (current :: acc) rest
  in match lst with
  | [] -> []
  | h :: t -> aux [h] [] t

(* ── Modified RLE using pack ─────────────────────────────── *)

let encode lst =
  pack lst
  |> List.map (fun group ->
       match group with
       | [x] -> One x
       | x :: _ -> Many (List.length group, x)
       | [] -> failwith "impossible")

(* ── Direct recursive version ────────────────────────────── *)

let encode_direct lst =
  let rec aux count acc = function
    | [] -> List.rev acc
    | [x] -> List.rev ((if count = 0 then One x else Many (count + 1, x)) :: acc)
    | x :: (y :: _ as rest) ->
      if x = y then aux (count + 1) acc rest
      else
        let item = if count = 0 then One x else Many (count + 1, x) in
        aux 0 (item :: acc) rest
  in aux 0 [] lst

(* ── Tests ────────────────────────────────────────────────── *)
let () =
  assert (encode [] = []);
  assert (encode ['a';'b';'c'] = [One 'a'; One 'b'; One 'c']);
  assert (encode ['x';'x';'x'] = [Many (3, 'x')]);
  assert (encode ['a';'a';'a';'b';'c';'c'] = [Many (3,'a'); One 'b'; Many (2,'c')]);
  assert (encode_direct ['a';'a';'b';'c';'c';'c'] = [Many (2,'a'); One 'b'; Many (3,'c')]);
  print_endline "✓ Modified RLE tests passed"

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;
    use RleItem::*;

    #[test]
    fn test_empty() {
        assert_eq!(encode_modified::<char>(&[]), vec![]);
        assert_eq!(encode_modified_recursive::<char>(&[]), vec![]);
    }

    #[test]
    fn test_no_duplicates() {
        assert_eq!(
            encode_modified(&['a', 'b', 'c']),
            vec![One('a'), One('b'), One('c')]
        );
    }

    #[test]
    fn test_all_same() {
        assert_eq!(encode_modified(&['x', 'x', 'x']), vec![Many(3, 'x')]);
    }

    #[test]
    fn test_mixed() {
        let input = vec!['a', 'a', 'a', 'b', 'c', 'c', 'd', 'd', 'd', 'd'];
        let expected = vec![Many(3, 'a'), One('b'), Many(2, 'c'), Many(4, 'd')];
        assert_eq!(encode_modified(&input), expected);
        assert_eq!(encode_modified_recursive(&input), expected);
        assert_eq!(encode_modified_chunks(&input), expected);
    }

    #[test]
    fn test_single_element() {
        assert_eq!(encode_modified(&[42]), vec![One(42)]);
    }

    #[test]
    fn test_strings() {
        let input = vec!["hi", "hi", "bye"];
        assert_eq!(encode_modified(&input), vec![Many(2, "hi"), One("bye")]);
    }
}

Deep Comparison

Modified Run-Length Encoding: OCaml vs Rust

The Core Insight

This problem naturally requires a sum type to represent two kinds of elements: singles and runs. Both OCaml and Rust make this elegant with algebraic data types. The interesting part is how each language handles the stateful traversal — accumulating runs of equal elements while distinguishing singletons.

OCaml Approach

OCaml defines a parameterized variant type and uses pattern matching with accumulators:

type 'a rle = One of 'a | Many of int * 'a

let encode lst =
  pack lst |> List.map (fun group ->
    match group with
    | [x] -> One x
    | x :: _ -> Many (List.length group, x)
    | [] -> failwith "impossible")

The pack helper groups consecutive duplicates into sublists first, then encode classifies each group. The direct version tracks count in a recursive accumulator.

Rust Approach

Rust uses a generic enum with the same structure:

#[derive(Debug, PartialEq, Clone)]
pub enum RleItem<T> { One(T), Many(usize, T) }

The idiomatic approach uses index-based iteration with equality checks. The recursive version uses split_first() or position-finding to identify runs. Both require Clone because Rust can't share data implicitly.

Key Differences

Aspect	OCaml	Rust
Sum type	`type 'a rle = One of 'a \\| Many of int * 'a`	`enum RleItem<T> { One(T), Many(usize, T) }`
Type parameter	`'a` (implicit)	`<T: PartialEq + Clone>` (explicit bounds)
Equality	Structural (built-in)	Requires `PartialEq` trait
Copying	Automatic (GC)	Requires `Clone`
Grouping	`pack` into sublists, then map	Direct counting (no intermediate lists)
Pattern matching	`[x]` matches singleton list	No slice literal patterns in stable
Performance	O(n) with intermediate allocations	O(n) single pass possible

What Rust Learners Should Notice

• Trait bounds are explicit: Where OCaml uses structural equality by default, Rust requires PartialEq to compare elements. This makes the contract visible.

• Clone is the cost of ownership: OCaml freely copies values (GC handles it). Rust requires Clone when you want to put a value into an enum variant while still traversing the original slice.

• No list pattern matching on slices: OCaml's [x] and x :: rest patterns don't exist for Rust slices (stable). You use split_first(), .len(), or index-based patterns instead.

• Enums are the same concept: The translation from OCaml variant to Rust enum is nearly mechanical — add #[derive] for common traits, add trait bounds for generics.

Exercises

Round-trip: Write a decode_modified(encoded: &[RleItem<T>]) -> Vec<T> function and verify that decode_modified(encode_modified(v)) == v for any input.

Compression ratio: Write compression_ratio(original: &[i32]) -> f64 that returns original.len() as f64 / encode_modified(original).len() as f64. What input maximizes/minimizes compression?

Stream encode: Rewrite encode_modified to accept impl Iterator<Item=T> and return impl Iterator<Item=RleItem<T>> without collecting to a Vec first.

Round-trip test: Write a property test (using proptest or manual generation) that verifies decode(encode_modified(list)) == list for any list of characters.

Modify to run-length: Convert the RleItem<T> encoding to a simple (count, value) encoding by implementing to_plain_rle(items: &[RleItem<T>]) -> Vec<(usize, T)>.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust

Tutorial Video

Tutorial

The Problem

🎯 Learning Outcomes

Code Example

Key Differences

OCaml Approach

Full Source

Deep Comparison

Modified Run-Length Encoding: OCaml vs Rust

The Core Insight

OCaml Approach

Rust Approach

Key Differences

What Rust Learners Should Notice

Further Reading

Exercises

Open Source Repos