ExamplesBy LevelBy TopicLearning Paths
964 Fundamental

964 Union Find

Functional Programming

Tutorial

The Problem

Implement Union-Find (Disjoint Set Union) with path compression and union by rank. Path compression flattens the parent chain so that future find operations are O(1) amortized. Union by rank ensures the shorter tree is attached under the taller one, bounding tree height to O(log n). Track the number of connected components.

🎯 Learning Outcomes

  • • Implement find with two-pass path compression: first locate root, then point all nodes directly to root
  • • Implement union by rank: attach lower-rank root under higher-rank root; increment rank only on ties
  • • Track components counter, decrementing on successful union (different components merged)
  • • Understand why the inverse-Ackermann function α(n) bounds the amortized per-operation cost
  • • Implement iterative find (not recursive) to avoid stack overflow on degenerate inputs
  • Code Example

    #![allow(clippy::all)]
    // 964: Union-Find / Disjoint Set
    // Path compression + union by rank
    // OCaml: arrays + mutable fields; Rust: Vec + struct methods
    
    pub struct UnionFind {
        parent: Vec<usize>,
        rank: Vec<usize>,
        components: usize,
    }
    
    impl UnionFind {
        pub fn new(n: usize) -> Self {
            UnionFind {
                parent: (0..n).collect(),
                rank: vec![0; n],
                components: n,
            }
        }
    
        // Find with path compression (iterative to avoid stack overflow)
        pub fn find(&mut self, mut i: usize) -> usize {
            // Two-pass path compression
            let mut root = i;
            while self.parent[root] != root {
                root = self.parent[root];
            }
            // Path compression: point all nodes directly to root
            while self.parent[i] != root {
                let next = self.parent[i];
                self.parent[i] = root;
                i = next;
            }
            root
        }
    
        // Union by rank; returns true if they were in different sets
        pub fn union(&mut self, a: usize, b: usize) -> bool {
            let ra = self.find(a);
            let rb = self.find(b);
            if ra == rb {
                return false;
            }
            self.components -= 1;
            match self.rank[ra].cmp(&self.rank[rb]) {
                std::cmp::Ordering::Less => self.parent[ra] = rb,
                std::cmp::Ordering::Greater => self.parent[rb] = ra,
                std::cmp::Ordering::Equal => {
                    self.parent[rb] = ra;
                    self.rank[ra] += 1;
                }
            }
            true
        }
    
        pub fn connected(&mut self, a: usize, b: usize) -> bool {
            self.find(a) == self.find(b)
        }
    
        pub fn num_components(&self) -> usize {
            self.components
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_initial_state() {
            let mut uf = UnionFind::new(5);
            assert_eq!(uf.num_components(), 5);
            assert!(!uf.connected(0, 1));
            assert!(!uf.connected(2, 4));
        }
    
        #[test]
        fn test_union_and_connected() {
            let mut uf = UnionFind::new(6);
            assert!(uf.union(0, 1));
            assert!(uf.union(2, 3));
            assert!(uf.union(4, 5));
            assert_eq!(uf.num_components(), 3);
    
            assert!(uf.connected(0, 1));
            assert!(uf.connected(2, 3));
            assert!(!uf.connected(0, 2));
        }
    
        #[test]
        fn test_union_already_connected() {
            let mut uf = UnionFind::new(4);
            assert!(uf.union(0, 1));
            assert!(!uf.union(0, 1)); // already same set
            assert_eq!(uf.num_components(), 3);
        }
    
        #[test]
        fn test_transitivity() {
            let mut uf = UnionFind::new(6);
            uf.union(0, 1);
            uf.union(2, 3);
            uf.union(1, 2);
            assert!(uf.connected(0, 3)); // transitive
            assert_eq!(uf.num_components(), 3);
        }
    
        #[test]
        fn test_full_merge() {
            let mut uf = UnionFind::new(6);
            uf.union(0, 1);
            uf.union(2, 3);
            uf.union(4, 5);
            uf.union(1, 2);
            uf.union(3, 4);
            assert_eq!(uf.num_components(), 1);
            assert!(uf.connected(0, 5));
        }
    
        #[test]
        fn test_path_compression() {
            let mut uf = UnionFind::new(10);
            // Create a chain: 0-1-2-3-4-5-6-7-8-9
            for i in 0..9 {
                uf.union(i, i + 1);
            }
            assert_eq!(uf.num_components(), 1);
            // After find, path should be compressed
            let root = uf.find(9);
            assert_eq!(root, uf.find(0));
        }
    }

    Key Differences

    AspectRustOCaml
    Path compressionIterative two-passRecursive single-pass
    Array accessself.parent[i]uf.parent.(i)
    Stack safetyIterative — no overflow riskRecursive — risk before first compression
    Mutable fieldcomponents in structmutable components in record

    Union-Find is the foundation of Kruskal's minimum spanning tree algorithm, network connectivity, and cycle detection. The nearly-O(1) amortized cost makes it practical for millions of operations.

    OCaml Approach

    type t = {
      parent: int array;
      rank: int array;
      mutable components: int;
    }
    
    let create n = {
      parent = Array.init n (fun i -> i);
      rank = Array.make n 0;
      components = n;
    }
    
    let rec find uf i =
      if uf.parent.(i) = i then i
      else begin
        let root = find uf uf.parent.(i) in
        uf.parent.(i) <- root;  (* path compression *)
        root
      end
    
    let union uf a b =
      let ra = find uf a and rb = find uf b in
      if ra = rb then false
      else begin
        (if uf.rank.(ra) < uf.rank.(rb)
         then uf.parent.(ra) <- rb
         else if uf.rank.(ra) > uf.rank.(rb)
         then uf.parent.(rb) <- ra
         else begin uf.parent.(rb) <- ra; uf.rank.(ra) <- uf.rank.(ra) + 1 end);
        uf.components <- uf.components - 1;
        true
      end
    

    OCaml's recursive find with in-place path compression is natural but risks stack overflow on very deep chains before the first compression. Rust's iterative two-pass version avoids this.

    Full Source

    #![allow(clippy::all)]
    // 964: Union-Find / Disjoint Set
    // Path compression + union by rank
    // OCaml: arrays + mutable fields; Rust: Vec + struct methods
    
    pub struct UnionFind {
        parent: Vec<usize>,
        rank: Vec<usize>,
        components: usize,
    }
    
    impl UnionFind {
        pub fn new(n: usize) -> Self {
            UnionFind {
                parent: (0..n).collect(),
                rank: vec![0; n],
                components: n,
            }
        }
    
        // Find with path compression (iterative to avoid stack overflow)
        pub fn find(&mut self, mut i: usize) -> usize {
            // Two-pass path compression
            let mut root = i;
            while self.parent[root] != root {
                root = self.parent[root];
            }
            // Path compression: point all nodes directly to root
            while self.parent[i] != root {
                let next = self.parent[i];
                self.parent[i] = root;
                i = next;
            }
            root
        }
    
        // Union by rank; returns true if they were in different sets
        pub fn union(&mut self, a: usize, b: usize) -> bool {
            let ra = self.find(a);
            let rb = self.find(b);
            if ra == rb {
                return false;
            }
            self.components -= 1;
            match self.rank[ra].cmp(&self.rank[rb]) {
                std::cmp::Ordering::Less => self.parent[ra] = rb,
                std::cmp::Ordering::Greater => self.parent[rb] = ra,
                std::cmp::Ordering::Equal => {
                    self.parent[rb] = ra;
                    self.rank[ra] += 1;
                }
            }
            true
        }
    
        pub fn connected(&mut self, a: usize, b: usize) -> bool {
            self.find(a) == self.find(b)
        }
    
        pub fn num_components(&self) -> usize {
            self.components
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_initial_state() {
            let mut uf = UnionFind::new(5);
            assert_eq!(uf.num_components(), 5);
            assert!(!uf.connected(0, 1));
            assert!(!uf.connected(2, 4));
        }
    
        #[test]
        fn test_union_and_connected() {
            let mut uf = UnionFind::new(6);
            assert!(uf.union(0, 1));
            assert!(uf.union(2, 3));
            assert!(uf.union(4, 5));
            assert_eq!(uf.num_components(), 3);
    
            assert!(uf.connected(0, 1));
            assert!(uf.connected(2, 3));
            assert!(!uf.connected(0, 2));
        }
    
        #[test]
        fn test_union_already_connected() {
            let mut uf = UnionFind::new(4);
            assert!(uf.union(0, 1));
            assert!(!uf.union(0, 1)); // already same set
            assert_eq!(uf.num_components(), 3);
        }
    
        #[test]
        fn test_transitivity() {
            let mut uf = UnionFind::new(6);
            uf.union(0, 1);
            uf.union(2, 3);
            uf.union(1, 2);
            assert!(uf.connected(0, 3)); // transitive
            assert_eq!(uf.num_components(), 3);
        }
    
        #[test]
        fn test_full_merge() {
            let mut uf = UnionFind::new(6);
            uf.union(0, 1);
            uf.union(2, 3);
            uf.union(4, 5);
            uf.union(1, 2);
            uf.union(3, 4);
            assert_eq!(uf.num_components(), 1);
            assert!(uf.connected(0, 5));
        }
    
        #[test]
        fn test_path_compression() {
            let mut uf = UnionFind::new(10);
            // Create a chain: 0-1-2-3-4-5-6-7-8-9
            for i in 0..9 {
                uf.union(i, i + 1);
            }
            assert_eq!(uf.num_components(), 1);
            // After find, path should be compressed
            let root = uf.find(9);
            assert_eq!(root, uf.find(0));
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_initial_state() {
            let mut uf = UnionFind::new(5);
            assert_eq!(uf.num_components(), 5);
            assert!(!uf.connected(0, 1));
            assert!(!uf.connected(2, 4));
        }
    
        #[test]
        fn test_union_and_connected() {
            let mut uf = UnionFind::new(6);
            assert!(uf.union(0, 1));
            assert!(uf.union(2, 3));
            assert!(uf.union(4, 5));
            assert_eq!(uf.num_components(), 3);
    
            assert!(uf.connected(0, 1));
            assert!(uf.connected(2, 3));
            assert!(!uf.connected(0, 2));
        }
    
        #[test]
        fn test_union_already_connected() {
            let mut uf = UnionFind::new(4);
            assert!(uf.union(0, 1));
            assert!(!uf.union(0, 1)); // already same set
            assert_eq!(uf.num_components(), 3);
        }
    
        #[test]
        fn test_transitivity() {
            let mut uf = UnionFind::new(6);
            uf.union(0, 1);
            uf.union(2, 3);
            uf.union(1, 2);
            assert!(uf.connected(0, 3)); // transitive
            assert_eq!(uf.num_components(), 3);
        }
    
        #[test]
        fn test_full_merge() {
            let mut uf = UnionFind::new(6);
            uf.union(0, 1);
            uf.union(2, 3);
            uf.union(4, 5);
            uf.union(1, 2);
            uf.union(3, 4);
            assert_eq!(uf.num_components(), 1);
            assert!(uf.connected(0, 5));
        }
    
        #[test]
        fn test_path_compression() {
            let mut uf = UnionFind::new(10);
            // Create a chain: 0-1-2-3-4-5-6-7-8-9
            for i in 0..9 {
                uf.union(i, i + 1);
            }
            assert_eq!(uf.num_components(), 1);
            // After find, path should be compressed
            let root = uf.find(9);
            assert_eq!(root, uf.find(0));
        }
    }

    Deep Comparison

    Union-Find / Disjoint Set — Comparison

    Core Insight

    Union-Find stores a forest where each tree represents a connected component. find(x) walks up to the root, compressing the path. union(a,b) merges two trees by rank. The algorithm is identical; OCaml's recursive find is elegant, while Rust prefers iterative path compression to avoid stack overflow on degenerate inputs.

    OCaml Approach

  • Array.init n (fun i -> i) — each node is its own parent
  • • Recursive find with in-place path compression: parent.(i) <- find uf parent.(i)
  • mutable components: int in record for component count
  • • Pattern matching on rank comparison: if rank.(ra) < rank.(rb)
  • • Returns bool from union — true if sets were merged
  • Rust Approach

  • (0..n).collect() — each node is its own parent
  • • Iterative two-pass path compression (find root, then compress)
  • self.components field decremented on successful union
  • std::cmp::Ordering match for rank comparison
  • &mut self required on find (modifies parent array)
  • Comparison Table

    AspectOCamlRust
    Storageint arrayVec<usize>
    InitArray.init n (fun i -> i)(0..n).collect()
    FindRecursive with path compressionIterative two-pass
    Path compressparent.(i) <- find uf parent.(i)Loop: parent[i] = root
    Rank compareif rank < rankmatch rank.cmp(&rank)
    MutationMutable record fields&mut self
    Component countmutable components: intself.components: usize
    Amortized costO(α(n)) inverse AckermannO(α(n)) same

    Exercises

  • Implement connected(a, b) -> bool as a convenience wrapper over find(a) == find(b).
  • Use Union-Find to implement Kruskal's MST algorithm: sort edges by weight, union each if not already connected.
  • Implement component_sizes() -> HashMap<usize, usize> mapping root → component size.
  • Extend to support an arbitrary label per component: set_label(root, label), get_label(node) -> label.
  • Implement a version where find is &self (not &mut self) using UnsafeCell for interior mutability — analyze the tradeoffs.
  • Open Source Repos