ExamplesBy LevelBy TopicLearning Paths
948 Intermediate

948 Merge Sort

Functional Programming

Tutorial

The Problem

Implement functional merge sort in Rust: recursively split a slice in half, sort each half, then merge two sorted slices. Implement two merge variants — one with explicit index cursors and one using peekable iterators — to demonstrate the functional style. Compare with OCaml's list-based merge sort that uses pattern matching.

🎯 Learning Outcomes

  • • Implement recursive merge_sort<T: Ord + Clone>(list: &[T]) -> Vec<T> using slice splitting
  • • Implement merge<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> with two-pointer merge
  • • Implement merge_iter using Peekable iterators for a more functional-style merge
  • • Understand that Clone is required because merge sort produces new Vec allocations at each level
  • • Connect to OCaml's pattern-matched list merge sort: [] | [_] base case, List.init split
  • Code Example

    #![allow(clippy::all)]
    /// Merge Sort — Functional Divide and Conquer
    ///
    /// Pure functional merge sort: split the list, sort each half, merge.
    /// OCaml uses pattern matching on lists; Rust uses slices and Vec.
    
    /// Merge two sorted slices into a new sorted Vec.
    pub fn merge<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> {
        let mut result = Vec::with_capacity(left.len() + right.len());
        let (mut i, mut j) = (0, 0);
        while i < left.len() && j < right.len() {
            if left[i] <= right[j] {
                result.push(left[i].clone());
                i += 1;
            } else {
                result.push(right[j].clone());
                j += 1;
            }
        }
        result.extend_from_slice(&left[i..]);
        result.extend_from_slice(&right[j..]);
        result
    }
    
    /// Functional merge sort — recursive, creates new Vecs at each level.
    pub fn merge_sort<T: Ord + Clone>(list: &[T]) -> Vec<T> {
        if list.len() <= 1 {
            return list.to_vec();
        }
        let mid = list.len() / 2;
        let left = merge_sort(&list[..mid]);
        let right = merge_sort(&list[mid..]);
        merge(&left, &right)
    }
    
    /// Merge using iterators — more functional style.
    pub fn merge_iter<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> {
        let mut result = Vec::with_capacity(left.len() + right.len());
        let mut li = left.iter().peekable();
        let mut ri = right.iter().peekable();
        loop {
            match (li.peek(), ri.peek()) {
                (Some(l), Some(r)) => {
                    if l <= r {
                        result.push((*li.next().unwrap()).clone());
                    } else {
                        result.push((*ri.next().unwrap()).clone());
                    }
                }
                (Some(_), None) => {
                    result.extend(li.cloned());
                    break;
                }
                (None, Some(_)) => {
                    result.extend(ri.cloned());
                    break;
                }
                (None, None) => break,
            }
        }
        result
    }
    
    /// Custom comparator version (like OCaml's `cmp` parameter).
    pub fn merge_sort_by<T: Clone>(list: &[T], cmp: &impl Fn(&T, &T) -> std::cmp::Ordering) -> Vec<T> {
        if list.len() <= 1 {
            return list.to_vec();
        }
        let mid = list.len() / 2;
        let left = merge_sort_by(&list[..mid], cmp);
        let right = merge_sort_by(&list[mid..], cmp);
        let mut result = Vec::with_capacity(left.len() + right.len());
        let (mut i, mut j) = (0, 0);
        while i < left.len() && j < right.len() {
            if cmp(&left[i], &right[j]) != std::cmp::Ordering::Greater {
                result.push(left[i].clone());
                i += 1;
            } else {
                result.push(right[j].clone());
                j += 1;
            }
        }
        result.extend_from_slice(&left[i..]);
        result.extend_from_slice(&right[j..]);
        result
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_basic() {
            assert_eq!(merge_sort(&[5, 2, 8, 1, 9, 3]), vec![1, 2, 3, 5, 8, 9]);
        }
    
        #[test]
        fn test_empty() {
            assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
        }
    
        #[test]
        fn test_single() {
            assert_eq!(merge_sort(&[42]), vec![42]);
        }
    
        #[test]
        fn test_already_sorted() {
            assert_eq!(merge_sort(&[1, 2, 3, 4, 5]), vec![1, 2, 3, 4, 5]);
        }
    
        #[test]
        fn test_reverse_sorted() {
            assert_eq!(merge_sort(&[5, 4, 3, 2, 1]), vec![1, 2, 3, 4, 5]);
        }
    
        #[test]
        fn test_duplicates() {
            assert_eq!(merge_sort(&[3, 1, 2, 1, 3]), vec![1, 1, 2, 3, 3]);
        }
    
        #[test]
        fn test_strings() {
            assert_eq!(
                merge_sort(&["banana", "apple", "cherry"]),
                vec!["apple", "banana", "cherry"]
            );
        }
    
        #[test]
        fn test_custom_cmp() {
            // Sort descending
            let result = merge_sort_by(&[1, 5, 3, 2, 4], &|a: &i32, b: &i32| b.cmp(a));
            assert_eq!(result, vec![5, 4, 3, 2, 1]);
        }
    }

    Key Differences

    AspectRustOCaml
    Input type&[T] slice — zero-copy split at mid'a list — must copy for split
    Clone requirementT: Clone — explicitGC handles sharing transparently
    Base caselist.len() <= 1Pattern match [] \| [_]
    Merge styleIndex cursor or PeekablePattern match on pairs
    MemoryOne Vec per merge stepOne list cons per element

    Merge sort is naturally recursive and allocation-heavy. For production use, Rust's slice::sort (pdqsort) is significantly faster. Functional merge sort here is for algorithmic clarity.

    OCaml Approach

    let rec merge xs ys = match xs, ys with
      | [], ys -> ys
      | xs, [] -> xs
      | x :: xs', y :: ys' ->
        if x <= y then x :: merge xs' ys
        else y :: merge xs ys'
    
    let split xs =
      let n = List.length xs in
      let k = n / 2 in
      (List.filteri (fun i _ -> i < k) xs,
       List.filteri (fun i _ -> i >= k) xs)
    
    let rec merge_sort = function
      | [] | [_] as xs -> xs
      | xs ->
        let (left, right) = split xs in
        merge (merge_sort left) (merge_sort right)
    

    OCaml's pattern match on (xs, ys) pairs makes the merge function read as three cases: one side empty, other side empty, or both non-empty. The recursive merge in OCaml consumes the lists directly — no index tracking needed.

    Full Source

    #![allow(clippy::all)]
    /// Merge Sort — Functional Divide and Conquer
    ///
    /// Pure functional merge sort: split the list, sort each half, merge.
    /// OCaml uses pattern matching on lists; Rust uses slices and Vec.
    
    /// Merge two sorted slices into a new sorted Vec.
    pub fn merge<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> {
        let mut result = Vec::with_capacity(left.len() + right.len());
        let (mut i, mut j) = (0, 0);
        while i < left.len() && j < right.len() {
            if left[i] <= right[j] {
                result.push(left[i].clone());
                i += 1;
            } else {
                result.push(right[j].clone());
                j += 1;
            }
        }
        result.extend_from_slice(&left[i..]);
        result.extend_from_slice(&right[j..]);
        result
    }
    
    /// Functional merge sort — recursive, creates new Vecs at each level.
    pub fn merge_sort<T: Ord + Clone>(list: &[T]) -> Vec<T> {
        if list.len() <= 1 {
            return list.to_vec();
        }
        let mid = list.len() / 2;
        let left = merge_sort(&list[..mid]);
        let right = merge_sort(&list[mid..]);
        merge(&left, &right)
    }
    
    /// Merge using iterators — more functional style.
    pub fn merge_iter<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> {
        let mut result = Vec::with_capacity(left.len() + right.len());
        let mut li = left.iter().peekable();
        let mut ri = right.iter().peekable();
        loop {
            match (li.peek(), ri.peek()) {
                (Some(l), Some(r)) => {
                    if l <= r {
                        result.push((*li.next().unwrap()).clone());
                    } else {
                        result.push((*ri.next().unwrap()).clone());
                    }
                }
                (Some(_), None) => {
                    result.extend(li.cloned());
                    break;
                }
                (None, Some(_)) => {
                    result.extend(ri.cloned());
                    break;
                }
                (None, None) => break,
            }
        }
        result
    }
    
    /// Custom comparator version (like OCaml's `cmp` parameter).
    pub fn merge_sort_by<T: Clone>(list: &[T], cmp: &impl Fn(&T, &T) -> std::cmp::Ordering) -> Vec<T> {
        if list.len() <= 1 {
            return list.to_vec();
        }
        let mid = list.len() / 2;
        let left = merge_sort_by(&list[..mid], cmp);
        let right = merge_sort_by(&list[mid..], cmp);
        let mut result = Vec::with_capacity(left.len() + right.len());
        let (mut i, mut j) = (0, 0);
        while i < left.len() && j < right.len() {
            if cmp(&left[i], &right[j]) != std::cmp::Ordering::Greater {
                result.push(left[i].clone());
                i += 1;
            } else {
                result.push(right[j].clone());
                j += 1;
            }
        }
        result.extend_from_slice(&left[i..]);
        result.extend_from_slice(&right[j..]);
        result
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_basic() {
            assert_eq!(merge_sort(&[5, 2, 8, 1, 9, 3]), vec![1, 2, 3, 5, 8, 9]);
        }
    
        #[test]
        fn test_empty() {
            assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
        }
    
        #[test]
        fn test_single() {
            assert_eq!(merge_sort(&[42]), vec![42]);
        }
    
        #[test]
        fn test_already_sorted() {
            assert_eq!(merge_sort(&[1, 2, 3, 4, 5]), vec![1, 2, 3, 4, 5]);
        }
    
        #[test]
        fn test_reverse_sorted() {
            assert_eq!(merge_sort(&[5, 4, 3, 2, 1]), vec![1, 2, 3, 4, 5]);
        }
    
        #[test]
        fn test_duplicates() {
            assert_eq!(merge_sort(&[3, 1, 2, 1, 3]), vec![1, 1, 2, 3, 3]);
        }
    
        #[test]
        fn test_strings() {
            assert_eq!(
                merge_sort(&["banana", "apple", "cherry"]),
                vec!["apple", "banana", "cherry"]
            );
        }
    
        #[test]
        fn test_custom_cmp() {
            // Sort descending
            let result = merge_sort_by(&[1, 5, 3, 2, 4], &|a: &i32, b: &i32| b.cmp(a));
            assert_eq!(result, vec![5, 4, 3, 2, 1]);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_basic() {
            assert_eq!(merge_sort(&[5, 2, 8, 1, 9, 3]), vec![1, 2, 3, 5, 8, 9]);
        }
    
        #[test]
        fn test_empty() {
            assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
        }
    
        #[test]
        fn test_single() {
            assert_eq!(merge_sort(&[42]), vec![42]);
        }
    
        #[test]
        fn test_already_sorted() {
            assert_eq!(merge_sort(&[1, 2, 3, 4, 5]), vec![1, 2, 3, 4, 5]);
        }
    
        #[test]
        fn test_reverse_sorted() {
            assert_eq!(merge_sort(&[5, 4, 3, 2, 1]), vec![1, 2, 3, 4, 5]);
        }
    
        #[test]
        fn test_duplicates() {
            assert_eq!(merge_sort(&[3, 1, 2, 1, 3]), vec![1, 1, 2, 3, 3]);
        }
    
        #[test]
        fn test_strings() {
            assert_eq!(
                merge_sort(&["banana", "apple", "cherry"]),
                vec!["apple", "banana", "cherry"]
            );
        }
    
        #[test]
        fn test_custom_cmp() {
            // Sort descending
            let result = merge_sort_by(&[1, 5, 3, 2, 4], &|a: &i32, b: &i32| b.cmp(a));
            assert_eq!(result, vec![5, 4, 3, 2, 1]);
        }
    }

    Deep Comparison

    Merge Sort — Functional Divide and Conquer: OCaml vs Rust

    The Core Insight

    Merge sort is the quintessential functional sorting algorithm because it's naturally recursive and doesn't require mutation. The comparison reveals how OCaml's linked lists and Rust's contiguous Vecs lead to different implementation strategies with the same O(n log n) complexity but different constant factors.

    OCaml Approach

    OCaml's merge sort is textbook elegant. split alternates elements into two lists using pattern matching on pairs (a :: b :: rest). merge recursively cons-es the smaller head onto the merged tail. The compare function is passed as a first-class value for custom ordering. All intermediate lists share structure through the GC — no copying needed. The ([] | [_]) as l -> l or-pattern cleanly handles base cases.

    Rust Approach

    Rust uses slices (&[T]) for input and Vec<T> for output. Splitting is trivial with slice indexing: &list[..mid] and &list[mid..]. Merging uses index-based iteration with push and extend_from_slice for efficient bulk copying. Vec::with_capacity pre-allocates the exact needed size, avoiding reallocation. The Ord trait bound replaces OCaml's compare function parameter. A peekable iterator variant shows the more functional style.

    Side-by-Side

    ConceptOCamlRust
    SplitPattern match a :: b :: restSlice indexing &list[..mid]
    MergeRecursive cons h1 :: merge ...Vec::push + extend_from_slice
    Base case([] \| [_]) as l -> lif list.len() <= 1
    Comparatorcmp function parameterOrd trait bound or closure
    MemoryGC, structural sharingVec allocation, contiguous
    StabilityStable (preserves order)Stable (preserves order)

    What Rust Learners Should Notice

  • Vec::with_capacity(n) is a crucial optimization — it avoids O(log n) reallocations during the merge phase
  • • Slice references (&[T]) give you zero-cost views into data without copying — similar to how OCaml lists share tails
  • extend_from_slice copies a contiguous block efficiently, much faster than pushing elements one by one
  • • The Ord trait bound means any type that implements comparison can be sorted — no need to pass a comparator function (though merge_sort_by with a closure also works)
  • • In production Rust, use slice::sort() or slice::sort_unstable() — they use Timsort / pdqsort respectively, both highly optimized
  • Further Reading

  • • [The Rust Book — Generics and Traits](https://doc.rust-lang.org/book/ch10-02-traits.html)
  • • [OCaml List Sorting](https://cs3110.github.io/textbook/chapters/ds/bst.html)
  • Exercises

  • Implement merge_sort_by<T, F: Fn(&T, &T) -> Ordering>(list, cmp) with a custom comparator.
  • Make merge_sort work on Vec<T> in-place with O(n log n) time and O(n) auxiliary space.
  • Implement a natural_merge_sort that detects existing sorted runs and merges them (Timsort inspiration).
  • Benchmark merge_sort vs slice::sort on a 10,000-element Vec<i32>.
  • Implement merge_k_sorted(lists: Vec<Vec<T>>) that merges k sorted lists in O(n log k) using a BinaryHeap.
  • Open Source Repos