ExamplesBy LevelBy TopicLearning Paths
010 Intermediate

Run-Length Encoding

ListsData Compression

Tutorial Video

Text description (accessibility)

This video demonstrates the "Run-Length Encoding" functional Rust example. Difficulty level: Intermediate. Key concepts covered: Lists, Data Compression. Use the result of Problem 9 (pack consecutive duplicates) to implement run-length encoding: replace consecutive duplicate elements with `(count, element)` pairs. Key difference from OCaml: 1. **Tuple types**: Rust's `(usize, T)` is structurally typed like OCaml's `int * 'a`, but Rust tuples own their data

Tutorial

The Problem

Use the result of Problem 9 (pack consecutive duplicates) to implement run-length encoding: replace consecutive duplicate elements with (count, element) pairs.

Run-length encoding (RLE) is one of the oldest compression algorithms, dating to the 1960s. It is used in fax transmission (CCITT T.4 standard), BMP and PCX image formats, TIFF compression, and the PackBits algorithm in TIFF. The core idea is to replace runs of repeated data with a single (count, value) pair. For data with long runs (solid-color images, simple fax documents), RLE achieves significant compression. For random data with no runs, it can expand the data.

🎯 Learning Outcomes

  • • Compose solutions by building on previous problems (pack → encode)
  • • Use tuple types (usize, T) as lightweight data containers
  • • Practice the fold pattern for single-pass encoding
  • • Compare OCaml's tuple-based encoding with Rust's strongly-typed tuples
  • • Understand function composition: pack().map() vs direct iteration
  • 🦀 The Rust Way

  • Compose with pack: Pack first, then map groups to (len, first) — mirrors OCaml
  • Fold: Single-pass fold, matching on last_mut() to increment count or start new run
  • Direct: Imperative single-pass with counter — most efficient, no intermediate structures
  • Code Example

    pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
        pack(list)
            .into_iter()
            .map(|group| (group.len(), group[0].clone()))
            .collect()
    }

    Key Differences

  • Tuple types: Rust's (usize, T) is structurally typed like OCaml's int * 'a, but Rust tuples own their data
  • Composition: pack().into_iter().map() chains naturally — Rust's iterator adaptors mirror OCaml's List.map
  • Single-pass efficiency: The fold and direct versions avoid creating intermediate packed groups
  • **usize vs int**: Rust uses usize for counts (unsigned, pointer-sized); OCaml uses int (signed, word-sized)
  • No intermediate allocation: The direct version builds the result in one pass — important for large inputs
  • Composition: The two-pass encode (pack then map) is cleaner but allocates twice. The single-pass encode_fold is more efficient. OCaml's version typically uses the same two-pass structure as it composes pack from problem 9.
  • Tuple types: (usize, T) in Rust is a product type. OCaml's (int * 'a) is the same concept with different syntax. Both are structural — no named fields.
  • **fold for single pass:** Using fold with mutable access to the last element (acc.last_mut()) avoids re-scanning the accumulator. This is the key idiom for single-pass accumulation.
  • OCaml Approach

    First packs consecutive elements (reusing Problem 9's pack), then maps each group to a (count, element) tuple. The direct version counts in a single pass with a recursive helper.

    Full Source

    #![allow(clippy::all)]
    //! # Run-Length Encoding
    //! OCaml 99 Problems #10 — Encode consecutive duplicates as (count, element) pairs.
    
    /// Idiomatic Rust: pack then map to (count, element).
    pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
        pack(list)
            .into_iter()
            .map(|group| (group.len(), group[0].clone()))
            .collect()
    }
    
    /// Functional style: single-pass fold, no intermediate packing.
    pub fn encode_fold<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
        list.iter().fold(Vec::new(), |mut acc, item| {
            match acc.last_mut() {
                Some((count, ref val)) if val == item => {
                    *count += 1;
                }
                _ => {
                    acc.push((1, item.clone()));
                }
            }
            acc
        })
    }
    
    /// Direct approach: iterate with counting, no helper function.
    pub fn encode_direct<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
        if list.is_empty() {
            return vec![];
        }
        let mut result = Vec::new();
        let mut count = 1;
        for i in 1..list.len() {
            if list[i] == list[i - 1] {
                count += 1;
            } else {
                result.push((count, list[i - 1].clone()));
                count = 1;
            }
        }
        result.push((count, list[list.len() - 1].clone()));
        result
    }
    
    /// Helper: pack consecutive duplicates (from example 009).
    fn pack<T: PartialEq + Clone>(list: &[T]) -> Vec<Vec<T>> {
        if list.is_empty() {
            return vec![];
        }
        let mut result = Vec::new();
        let mut current = vec![list[0].clone()];
        for item in &list[1..] {
            if *item == current[0] {
                current.push(item.clone());
            } else {
                result.push(current);
                current = vec![item.clone()];
            }
        }
        result.push(current);
        result
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_encode() {
            let input = vec!["a", "a", "b", "c", "c", "c"];
            let expected = vec![(2, "a"), (1, "b"), (3, "c")];
            assert_eq!(encode(&input), expected);
            assert_eq!(encode_fold(&input), expected);
            assert_eq!(encode_direct(&input), expected);
        }
    
        #[test]
        fn test_empty() {
            assert_eq!(encode::<i32>(&[]), Vec::<(usize, i32)>::new());
            assert_eq!(encode_fold::<i32>(&[]), Vec::<(usize, i32)>::new());
            assert_eq!(encode_direct::<i32>(&[]), Vec::<(usize, i32)>::new());
        }
    
        #[test]
        fn test_single() {
            assert_eq!(encode(&[1]), vec![(1, 1)]);
            assert_eq!(encode_fold(&[1]), vec![(1, 1)]);
            assert_eq!(encode_direct(&[1]), vec![(1, 1)]);
        }
    
        #[test]
        fn test_no_duplicates() {
            assert_eq!(encode(&[1, 2, 3]), vec![(1, 1), (1, 2), (1, 3)]);
        }
    
        #[test]
        fn test_all_same() {
            assert_eq!(encode(&[5, 5, 5, 5]), vec![(4, 5)]);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_encode() {
            let input = vec!["a", "a", "b", "c", "c", "c"];
            let expected = vec![(2, "a"), (1, "b"), (3, "c")];
            assert_eq!(encode(&input), expected);
            assert_eq!(encode_fold(&input), expected);
            assert_eq!(encode_direct(&input), expected);
        }
    
        #[test]
        fn test_empty() {
            assert_eq!(encode::<i32>(&[]), Vec::<(usize, i32)>::new());
            assert_eq!(encode_fold::<i32>(&[]), Vec::<(usize, i32)>::new());
            assert_eq!(encode_direct::<i32>(&[]), Vec::<(usize, i32)>::new());
        }
    
        #[test]
        fn test_single() {
            assert_eq!(encode(&[1]), vec![(1, 1)]);
            assert_eq!(encode_fold(&[1]), vec![(1, 1)]);
            assert_eq!(encode_direct(&[1]), vec![(1, 1)]);
        }
    
        #[test]
        fn test_no_duplicates() {
            assert_eq!(encode(&[1, 2, 3]), vec![(1, 1), (1, 2), (1, 3)]);
        }
    
        #[test]
        fn test_all_same() {
            assert_eq!(encode(&[5, 5, 5, 5]), vec![(4, 5)]);
        }
    }

    Deep Comparison

    Comparison: Run-Length Encoding

    OCaml — Pack then encode

    let encode lst =
      List.map (fun group -> (List.length group, List.hd group)) (pack lst)
    

    Rust — Idiomatic (compose with pack)

    pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
        pack(list)
            .into_iter()
            .map(|group| (group.len(), group[0].clone()))
            .collect()
    }
    

    Rust — Functional (single-pass fold)

    pub fn encode_fold<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
        list.iter().fold(Vec::new(), |mut acc, item| {
            match acc.last_mut() {
                Some((count, ref val)) if val == item => *count += 1,
                _ => acc.push((1, item.clone())),
            }
            acc
        })
    }
    

    Comparison Table

    AspectOCamlRust
    Tuple typeint * 'a(usize, T)
    Count typeint (signed)usize (unsigned)
    CompositionList.map f (pack lst)pack().into_iter().map().collect()
    Single-passSeparate recursive functionfold with last_mut()
    OwnershipGC handles pack resultinto_iter() consumes packed groups

    Type Signatures

  • • OCaml: val encode : 'a list -> (int * 'a) list
  • • Rust: fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)>
  • Takeaways

  • Both languages naturally compose: pack first, then transform — functional composition at work
  • Rust's into_iter() consumes the intermediate packed groups — no waste
  • The fold version shows Rust can match OCaml's single-pass elegance with last_mut()
  • usize for counts is more semantically correct (counts can't be negative) than OCaml's int
  • The pattern of match acc.last_mut() is a Rust idiom for stateful folds — learn it well
  • Exercises

  • Implement the decoder: given a run-length encoded Vec<(usize, T)>, reconstruct the original Vec<T>.
  • Write a modified RLE that only encodes runs of length greater than 1 — single elements are stored as-is using an enum Rle<T> { Single(T), Run(usize, T) }.
  • Implement RLE for a stream of bytes and measure its compression ratio on a sample ASCII text; then try to compress already-compressed data and observe the result.
  • Open Source Repos