964 Union Find
Tutorial
The Problem
Implement Union-Find (Disjoint Set Union) with path compression and union by rank. Path compression flattens the parent chain so that future find operations are O(1) amortized. Union by rank ensures the shorter tree is attached under the taller one, bounding tree height to O(log n). Track the number of connected components.
🎯 Learning Outcomes
find with two-pass path compression: first locate root, then point all nodes directly to rootunion by rank: attach lower-rank root under higher-rank root; increment rank only on tiescomponents counter, decrementing on successful union (different components merged)find (not recursive) to avoid stack overflow on degenerate inputsCode Example
#![allow(clippy::all)]
// 964: Union-Find / Disjoint Set
// Path compression + union by rank
// OCaml: arrays + mutable fields; Rust: Vec + struct methods
pub struct UnionFind {
parent: Vec<usize>,
rank: Vec<usize>,
components: usize,
}
impl UnionFind {
pub fn new(n: usize) -> Self {
UnionFind {
parent: (0..n).collect(),
rank: vec![0; n],
components: n,
}
}
// Find with path compression (iterative to avoid stack overflow)
pub fn find(&mut self, mut i: usize) -> usize {
// Two-pass path compression
let mut root = i;
while self.parent[root] != root {
root = self.parent[root];
}
// Path compression: point all nodes directly to root
while self.parent[i] != root {
let next = self.parent[i];
self.parent[i] = root;
i = next;
}
root
}
// Union by rank; returns true if they were in different sets
pub fn union(&mut self, a: usize, b: usize) -> bool {
let ra = self.find(a);
let rb = self.find(b);
if ra == rb {
return false;
}
self.components -= 1;
match self.rank[ra].cmp(&self.rank[rb]) {
std::cmp::Ordering::Less => self.parent[ra] = rb,
std::cmp::Ordering::Greater => self.parent[rb] = ra,
std::cmp::Ordering::Equal => {
self.parent[rb] = ra;
self.rank[ra] += 1;
}
}
true
}
pub fn connected(&mut self, a: usize, b: usize) -> bool {
self.find(a) == self.find(b)
}
pub fn num_components(&self) -> usize {
self.components
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_initial_state() {
let mut uf = UnionFind::new(5);
assert_eq!(uf.num_components(), 5);
assert!(!uf.connected(0, 1));
assert!(!uf.connected(2, 4));
}
#[test]
fn test_union_and_connected() {
let mut uf = UnionFind::new(6);
assert!(uf.union(0, 1));
assert!(uf.union(2, 3));
assert!(uf.union(4, 5));
assert_eq!(uf.num_components(), 3);
assert!(uf.connected(0, 1));
assert!(uf.connected(2, 3));
assert!(!uf.connected(0, 2));
}
#[test]
fn test_union_already_connected() {
let mut uf = UnionFind::new(4);
assert!(uf.union(0, 1));
assert!(!uf.union(0, 1)); // already same set
assert_eq!(uf.num_components(), 3);
}
#[test]
fn test_transitivity() {
let mut uf = UnionFind::new(6);
uf.union(0, 1);
uf.union(2, 3);
uf.union(1, 2);
assert!(uf.connected(0, 3)); // transitive
assert_eq!(uf.num_components(), 3);
}
#[test]
fn test_full_merge() {
let mut uf = UnionFind::new(6);
uf.union(0, 1);
uf.union(2, 3);
uf.union(4, 5);
uf.union(1, 2);
uf.union(3, 4);
assert_eq!(uf.num_components(), 1);
assert!(uf.connected(0, 5));
}
#[test]
fn test_path_compression() {
let mut uf = UnionFind::new(10);
// Create a chain: 0-1-2-3-4-5-6-7-8-9
for i in 0..9 {
uf.union(i, i + 1);
}
assert_eq!(uf.num_components(), 1);
// After find, path should be compressed
let root = uf.find(9);
assert_eq!(root, uf.find(0));
}
}Key Differences
| Aspect | Rust | OCaml |
|---|---|---|
| Path compression | Iterative two-pass | Recursive single-pass |
| Array access | self.parent[i] | uf.parent.(i) |
| Stack safety | Iterative — no overflow risk | Recursive — risk before first compression |
| Mutable field | components in struct | mutable components in record |
Union-Find is the foundation of Kruskal's minimum spanning tree algorithm, network connectivity, and cycle detection. The nearly-O(1) amortized cost makes it practical for millions of operations.
OCaml Approach
type t = {
parent: int array;
rank: int array;
mutable components: int;
}
let create n = {
parent = Array.init n (fun i -> i);
rank = Array.make n 0;
components = n;
}
let rec find uf i =
if uf.parent.(i) = i then i
else begin
let root = find uf uf.parent.(i) in
uf.parent.(i) <- root; (* path compression *)
root
end
let union uf a b =
let ra = find uf a and rb = find uf b in
if ra = rb then false
else begin
(if uf.rank.(ra) < uf.rank.(rb)
then uf.parent.(ra) <- rb
else if uf.rank.(ra) > uf.rank.(rb)
then uf.parent.(rb) <- ra
else begin uf.parent.(rb) <- ra; uf.rank.(ra) <- uf.rank.(ra) + 1 end);
uf.components <- uf.components - 1;
true
end
OCaml's recursive find with in-place path compression is natural but risks stack overflow on very deep chains before the first compression. Rust's iterative two-pass version avoids this.
Full Source
#![allow(clippy::all)]
// 964: Union-Find / Disjoint Set
// Path compression + union by rank
// OCaml: arrays + mutable fields; Rust: Vec + struct methods
pub struct UnionFind {
parent: Vec<usize>,
rank: Vec<usize>,
components: usize,
}
impl UnionFind {
pub fn new(n: usize) -> Self {
UnionFind {
parent: (0..n).collect(),
rank: vec![0; n],
components: n,
}
}
// Find with path compression (iterative to avoid stack overflow)
pub fn find(&mut self, mut i: usize) -> usize {
// Two-pass path compression
let mut root = i;
while self.parent[root] != root {
root = self.parent[root];
}
// Path compression: point all nodes directly to root
while self.parent[i] != root {
let next = self.parent[i];
self.parent[i] = root;
i = next;
}
root
}
// Union by rank; returns true if they were in different sets
pub fn union(&mut self, a: usize, b: usize) -> bool {
let ra = self.find(a);
let rb = self.find(b);
if ra == rb {
return false;
}
self.components -= 1;
match self.rank[ra].cmp(&self.rank[rb]) {
std::cmp::Ordering::Less => self.parent[ra] = rb,
std::cmp::Ordering::Greater => self.parent[rb] = ra,
std::cmp::Ordering::Equal => {
self.parent[rb] = ra;
self.rank[ra] += 1;
}
}
true
}
pub fn connected(&mut self, a: usize, b: usize) -> bool {
self.find(a) == self.find(b)
}
pub fn num_components(&self) -> usize {
self.components
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_initial_state() {
let mut uf = UnionFind::new(5);
assert_eq!(uf.num_components(), 5);
assert!(!uf.connected(0, 1));
assert!(!uf.connected(2, 4));
}
#[test]
fn test_union_and_connected() {
let mut uf = UnionFind::new(6);
assert!(uf.union(0, 1));
assert!(uf.union(2, 3));
assert!(uf.union(4, 5));
assert_eq!(uf.num_components(), 3);
assert!(uf.connected(0, 1));
assert!(uf.connected(2, 3));
assert!(!uf.connected(0, 2));
}
#[test]
fn test_union_already_connected() {
let mut uf = UnionFind::new(4);
assert!(uf.union(0, 1));
assert!(!uf.union(0, 1)); // already same set
assert_eq!(uf.num_components(), 3);
}
#[test]
fn test_transitivity() {
let mut uf = UnionFind::new(6);
uf.union(0, 1);
uf.union(2, 3);
uf.union(1, 2);
assert!(uf.connected(0, 3)); // transitive
assert_eq!(uf.num_components(), 3);
}
#[test]
fn test_full_merge() {
let mut uf = UnionFind::new(6);
uf.union(0, 1);
uf.union(2, 3);
uf.union(4, 5);
uf.union(1, 2);
uf.union(3, 4);
assert_eq!(uf.num_components(), 1);
assert!(uf.connected(0, 5));
}
#[test]
fn test_path_compression() {
let mut uf = UnionFind::new(10);
// Create a chain: 0-1-2-3-4-5-6-7-8-9
for i in 0..9 {
uf.union(i, i + 1);
}
assert_eq!(uf.num_components(), 1);
// After find, path should be compressed
let root = uf.find(9);
assert_eq!(root, uf.find(0));
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_initial_state() {
let mut uf = UnionFind::new(5);
assert_eq!(uf.num_components(), 5);
assert!(!uf.connected(0, 1));
assert!(!uf.connected(2, 4));
}
#[test]
fn test_union_and_connected() {
let mut uf = UnionFind::new(6);
assert!(uf.union(0, 1));
assert!(uf.union(2, 3));
assert!(uf.union(4, 5));
assert_eq!(uf.num_components(), 3);
assert!(uf.connected(0, 1));
assert!(uf.connected(2, 3));
assert!(!uf.connected(0, 2));
}
#[test]
fn test_union_already_connected() {
let mut uf = UnionFind::new(4);
assert!(uf.union(0, 1));
assert!(!uf.union(0, 1)); // already same set
assert_eq!(uf.num_components(), 3);
}
#[test]
fn test_transitivity() {
let mut uf = UnionFind::new(6);
uf.union(0, 1);
uf.union(2, 3);
uf.union(1, 2);
assert!(uf.connected(0, 3)); // transitive
assert_eq!(uf.num_components(), 3);
}
#[test]
fn test_full_merge() {
let mut uf = UnionFind::new(6);
uf.union(0, 1);
uf.union(2, 3);
uf.union(4, 5);
uf.union(1, 2);
uf.union(3, 4);
assert_eq!(uf.num_components(), 1);
assert!(uf.connected(0, 5));
}
#[test]
fn test_path_compression() {
let mut uf = UnionFind::new(10);
// Create a chain: 0-1-2-3-4-5-6-7-8-9
for i in 0..9 {
uf.union(i, i + 1);
}
assert_eq!(uf.num_components(), 1);
// After find, path should be compressed
let root = uf.find(9);
assert_eq!(root, uf.find(0));
}
}
Deep Comparison
Union-Find / Disjoint Set — Comparison
Core Insight
Union-Find stores a forest where each tree represents a connected component. find(x) walks up to the root, compressing the path. union(a,b) merges two trees by rank. The algorithm is identical; OCaml's recursive find is elegant, while Rust prefers iterative path compression to avoid stack overflow on degenerate inputs.
OCaml Approach
Array.init n (fun i -> i) — each node is its own parentfind with in-place path compression: parent.(i) <- find uf parent.(i)mutable components: int in record for component countif rank.(ra) < rank.(rb)bool from union — true if sets were mergedRust Approach
(0..n).collect() — each node is its own parentself.components field decremented on successful unionstd::cmp::Ordering match for rank comparison&mut self required on find (modifies parent array)Comparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| Storage | int array | Vec<usize> |
| Init | Array.init n (fun i -> i) | (0..n).collect() |
| Find | Recursive with path compression | Iterative two-pass |
| Path compress | parent.(i) <- find uf parent.(i) | Loop: parent[i] = root |
| Rank compare | if rank < rank | match rank.cmp(&rank) |
| Mutation | Mutable record fields | &mut self |
| Component count | mutable components: int | self.components: usize |
| Amortized cost | O(α(n)) inverse Ackermann | O(α(n)) same |
Exercises
connected(a, b) -> bool as a convenience wrapper over find(a) == find(b).component_sizes() -> HashMap<usize, usize> mapping root → component size.set_label(root, label), get_label(node) -> label.find is &self (not &mut self) using UnsafeCell for interior mutability — analyze the tradeoffs.