Topological Sort via Kahn's Algorithm
Tutorial Video
Text description (accessibility)
This video demonstrates the "Topological Sort via Kahn's Algorithm" functional Rust example. Difficulty level: Advanced. Key concepts covered: Graphs. Implement topological sorting of a directed acyclic graph using Kahn's algorithm (in-degree counting). Key difference from OCaml: 1. **Map types:** OCaml uses `Map.Make(String)` (balanced tree, O(log n)); Rust uses `HashMap` (hash table, O(1) amortized)
Tutorial
The Problem
Implement topological sorting of a directed acyclic graph using Kahn's algorithm (in-degree counting). Detect cycles and return None if the graph is not a DAG.
🎯 Learning Outcomes
HashMap and VecDeque as Rust equivalents of OCaml's Map module and lists🦀 The Rust Way
Rust uses HashMap for O(1) lookups and VecDeque for efficient queue operations. The iterative while let loop replaces OCaml's recursive processing. A DFS-based variant with coloring is also provided, showing how both algorithms detect cycles.
Code Example
pub fn kahn_sort(nodes: &[&str], edges: &[(&str, &str)]) -> Option<Vec<String>> {
let mut in_degree: HashMap<&str, usize> = nodes.iter().map(|&n| (n, 0)).collect();
let mut adj: HashMap<&str, Vec<&str>> = nodes.iter().map(|&n| (n, Vec::new())).collect();
for &(from, to) in edges {
*in_degree.entry(to).or_insert(0) += 1;
adj.entry(from).or_default().push(to);
}
let mut queue: VecDeque<&str> = in_degree.iter()
.filter(|(_, &d)| d == 0).map(|(&n, _)| n).collect();
let mut result = Vec::new();
while let Some(node) = queue.pop_front() {
result.push(node.to_string());
// ... decrement neighbors, enqueue zeros
}
if result.len() == nodes.len() { Some(result) } else { None }
}Key Differences
Map.Make(String) (balanced tree, O(log n)); Rust uses HashMap (hash table, O(1) amortized)VecDeque (O(1) amortized)while let loops with mutable stateOption<Vec> for explicit error handlingOCaml Approach
OCaml uses functorized Map.Make(String) for the in-degree map and adjacency structure. The algorithm is expressed recursively: process nodes with zero in-degree, update neighbors, recurse. Lists are used as queues (not ideal for performance but idiomatic OCaml).
Full Source
#![allow(clippy::all)]
//! Topological Sort via Kahn's Algorithm
//!
//! Iterative topological sort using in-degree counting.
//! OCaml uses `Map` and `Set` from the standard library.
//! Rust uses `HashMap`, `HashSet`, and `VecDeque` for the queue.
use std::collections::{HashMap, HashSet, VecDeque};
// ── Solution 1: Idiomatic Rust — HashMap + VecDeque ──
/// Perform topological sort on a directed acyclic graph using Kahn's algorithm.
///
/// Takes a list of nodes and directed edges (from, to).
/// Returns nodes in topological order, or `None` if a cycle is detected.
///
/// OCaml: `val kahn_sort : string list -> (string * string) list -> string list`
pub fn kahn_sort(nodes: &[&str], edges: &[(&str, &str)]) -> Option<Vec<String>> {
// Build in-degree map: count incoming edges for each node
let mut in_degree: HashMap<&str, usize> = nodes.iter().map(|&n| (n, 0)).collect();
// Build adjacency list for outgoing edges
let mut adj: HashMap<&str, Vec<&str>> = nodes.iter().map(|&n| (n, Vec::new())).collect();
for &(from, to) in edges {
*in_degree.entry(to).or_insert(0) += 1;
adj.entry(from).or_default().push(to);
}
// Start with all nodes that have zero in-degree
let mut queue: VecDeque<&str> = in_degree
.iter()
.filter(|(_, °)| deg == 0)
.map(|(&node, _)| node)
.collect();
// Sort the initial queue for deterministic output
let mut sorted_queue: Vec<&str> = queue.drain(..).collect();
sorted_queue.sort();
queue.extend(sorted_queue);
let mut result = Vec::new();
while let Some(node) = queue.pop_front() {
result.push(node.to_string());
// Collect and sort neighbors for deterministic ordering
if let Some(neighbors) = adj.get(node) {
let mut next_nodes = Vec::new();
for &neighbor in neighbors {
if let Some(deg) = in_degree.get_mut(neighbor) {
*deg -= 1;
if *deg == 0 {
next_nodes.push(neighbor);
}
}
}
next_nodes.sort();
queue.extend(next_nodes);
}
}
// If we processed all nodes, the sort succeeded
if result.len() == nodes.len() {
Some(result)
} else {
None // Cycle detected
}
}
// ── Solution 2: Recursive (DFS-based) topological sort ──
//
// Uses depth-first search with coloring to detect cycles.
#[derive(Clone, Copy, PartialEq)]
enum Color {
White, // Unvisited
Gray, // In progress (on current path)
Black, // Finished
}
/// DFS-based topological sort. Returns `None` if a cycle is detected.
pub fn topo_sort_dfs(nodes: &[&str], edges: &[(&str, &str)]) -> Option<Vec<String>> {
let adj: HashMap<&str, Vec<&str>> = {
let mut map: HashMap<&str, Vec<&str>> = nodes.iter().map(|&n| (n, Vec::new())).collect();
for &(from, to) in edges {
map.entry(from).or_default().push(to);
}
// Sort adjacency lists for deterministic output
for list in map.values_mut() {
list.sort();
}
map
};
let mut color: HashMap<&str, Color> = nodes.iter().map(|&n| (n, Color::White)).collect();
let mut result = Vec::new();
// Sort nodes for deterministic starting order
let mut sorted_nodes: Vec<&str> = nodes.to_vec();
sorted_nodes.sort();
for &node in &sorted_nodes {
if color[node] == Color::White && !dfs_visit(node, &adj, &mut color, &mut result) {
return None; // Cycle
}
}
result.reverse();
Some(result)
}
fn dfs_visit<'a>(
node: &'a str,
adj: &HashMap<&'a str, Vec<&'a str>>,
color: &mut HashMap<&'a str, Color>,
result: &mut Vec<String>,
) -> bool {
color.insert(node, Color::Gray);
if let Some(neighbors) = adj.get(node) {
for &neighbor in neighbors {
match color.get(neighbor) {
Some(Color::Gray) => return false, // Back edge → cycle
Some(Color::White) => {
if !dfs_visit(neighbor, adj, color, result) {
return false;
}
}
_ => {} // Black: already finished
}
}
}
color.insert(node, Color::Black);
result.push(node.to_string());
true
}
// ── Solution 3: Functional-style with iterators ──
/// Kahn's algorithm expressed more functionally using iterators and fold.
pub fn kahn_functional(nodes: &[&str], edges: &[(&str, &str)]) -> Option<Vec<String>> {
let node_set: HashSet<&str> = nodes.iter().copied().collect();
// Build in-degree via fold
let in_degree: HashMap<&str, usize> = edges.iter().fold(
nodes
.iter()
.map(|&n| (n, 0_usize))
.collect::<HashMap<_, _>>(),
|mut acc, &(_, to)| {
*acc.entry(to).or_insert(0) += 1;
acc
},
);
// Build adjacency via fold
let adj: HashMap<&str, Vec<&str>> = edges.iter().fold(
nodes
.iter()
.map(|&n| (n, Vec::new()))
.collect::<HashMap<_, _>>(),
|mut acc, &(from, to)| {
acc.entry(from).or_default().push(to);
acc
},
);
// Iterative process using a mutable state tuple
let mut deg = in_degree;
let mut queue: Vec<&str> = deg
.iter()
.filter(|(_, &d)| d == 0)
.map(|(&n, _)| n)
.collect();
queue.sort();
let mut result = Vec::new();
while let Some(node) = queue.first().copied() {
queue.remove(0);
if !node_set.contains(node) {
continue;
}
result.push(node.to_string());
if let Some(neighbors) = adj.get(node) {
let mut new_zeros = Vec::new();
for &nb in neighbors {
if let Some(d) = deg.get_mut(nb) {
*d -= 1;
if *d == 0 {
new_zeros.push(nb);
}
}
}
new_zeros.sort();
queue.extend(new_zeros);
}
}
if result.len() == nodes.len() {
Some(result)
} else {
None
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_simple_linear_chain() {
let nodes = vec!["a", "b", "c"];
let edges = vec![("a", "b"), ("b", "c")];
assert_eq!(
kahn_sort(&nodes, &edges),
Some(vec!["a".to_string(), "b".to_string(), "c".to_string()])
);
}
#[test]
fn test_diamond_graph() {
let nodes = vec!["a", "b", "c", "d", "e"];
let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
let result = kahn_sort(&nodes, &edges).unwrap();
// a must come before b, c; b and c before d; d before e
assert_eq!(result[0], "a");
assert_eq!(result[3], "d");
assert_eq!(result[4], "e");
}
#[test]
fn test_cycle_detection() {
let nodes = vec!["a", "b", "c"];
let edges = vec![("a", "b"), ("b", "c"), ("c", "a")];
assert_eq!(kahn_sort(&nodes, &edges), None);
}
#[test]
fn test_single_node() {
let nodes = vec!["x"];
let edges: Vec<(&str, &str)> = vec![];
assert_eq!(kahn_sort(&nodes, &edges), Some(vec!["x".to_string()]));
}
#[test]
fn test_dfs_matches_kahn() {
let nodes = vec!["a", "b", "c", "d", "e"];
let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
let kahn = kahn_sort(&nodes, &edges).unwrap();
let dfs = topo_sort_dfs(&nodes, &edges).unwrap();
// Both should produce valid topological orderings
assert_eq!(kahn[0], "a");
assert_eq!(dfs[0], "a");
assert_eq!(*kahn.last().unwrap(), "e");
assert_eq!(*dfs.last().unwrap(), "e");
}
#[test]
fn test_dfs_cycle_detection() {
let nodes = vec!["a", "b", "c"];
let edges = vec![("a", "b"), ("b", "c"), ("c", "a")];
assert_eq!(topo_sort_dfs(&nodes, &edges), None);
}
#[test]
fn test_functional_kahn() {
let nodes = vec!["a", "b", "c", "d", "e"];
let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
let result = kahn_functional(&nodes, &edges).unwrap();
assert_eq!(result[0], "a");
assert_eq!(*result.last().unwrap(), "e");
}
#[test]
fn test_disconnected_nodes() {
let nodes = vec!["a", "b", "c"];
let edges: Vec<(&str, &str)> = vec![];
let result = kahn_sort(&nodes, &edges).unwrap();
assert_eq!(result.len(), 3);
// All nodes present, sorted alphabetically (deterministic)
assert_eq!(result, vec!["a", "b", "c"]);
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_simple_linear_chain() {
let nodes = vec!["a", "b", "c"];
let edges = vec![("a", "b"), ("b", "c")];
assert_eq!(
kahn_sort(&nodes, &edges),
Some(vec!["a".to_string(), "b".to_string(), "c".to_string()])
);
}
#[test]
fn test_diamond_graph() {
let nodes = vec!["a", "b", "c", "d", "e"];
let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
let result = kahn_sort(&nodes, &edges).unwrap();
// a must come before b, c; b and c before d; d before e
assert_eq!(result[0], "a");
assert_eq!(result[3], "d");
assert_eq!(result[4], "e");
}
#[test]
fn test_cycle_detection() {
let nodes = vec!["a", "b", "c"];
let edges = vec![("a", "b"), ("b", "c"), ("c", "a")];
assert_eq!(kahn_sort(&nodes, &edges), None);
}
#[test]
fn test_single_node() {
let nodes = vec!["x"];
let edges: Vec<(&str, &str)> = vec![];
assert_eq!(kahn_sort(&nodes, &edges), Some(vec!["x".to_string()]));
}
#[test]
fn test_dfs_matches_kahn() {
let nodes = vec!["a", "b", "c", "d", "e"];
let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
let kahn = kahn_sort(&nodes, &edges).unwrap();
let dfs = topo_sort_dfs(&nodes, &edges).unwrap();
// Both should produce valid topological orderings
assert_eq!(kahn[0], "a");
assert_eq!(dfs[0], "a");
assert_eq!(*kahn.last().unwrap(), "e");
assert_eq!(*dfs.last().unwrap(), "e");
}
#[test]
fn test_dfs_cycle_detection() {
let nodes = vec!["a", "b", "c"];
let edges = vec![("a", "b"), ("b", "c"), ("c", "a")];
assert_eq!(topo_sort_dfs(&nodes, &edges), None);
}
#[test]
fn test_functional_kahn() {
let nodes = vec!["a", "b", "c", "d", "e"];
let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
let result = kahn_functional(&nodes, &edges).unwrap();
assert_eq!(result[0], "a");
assert_eq!(*result.last().unwrap(), "e");
}
#[test]
fn test_disconnected_nodes() {
let nodes = vec!["a", "b", "c"];
let edges: Vec<(&str, &str)> = vec![];
let result = kahn_sort(&nodes, &edges).unwrap();
assert_eq!(result.len(), 3);
// All nodes present, sorted alphabetically (deterministic)
assert_eq!(result, vec!["a", "b", "c"]);
}
}
Deep Comparison
OCaml vs Rust: Topological Sort via Kahn's Algorithm
Side-by-Side Code
OCaml
module SMap = Map.Make(String)
let kahn_sort nodes edges =
let in_deg = List.fold_left (fun m (_, b) ->
SMap.update b (function None -> Some 1 | Some n -> Some (n+1)) m
) (List.fold_left (fun m n -> SMap.add n 0 m) SMap.empty nodes) edges in
let queue = SMap.fold (fun k v acc -> if v = 0 then k :: acc else acc) in_deg [] in
let rec go queue in_deg result =
match queue with
| [] -> List.rev result
| node :: rest ->
let out_edges = List.filter (fun (a, _) -> a = node) edges in
let in_deg, new_queue = List.fold_left (fun (deg, q) (_, b) ->
let d = SMap.find b deg - 1 in
let deg = SMap.add b d deg in
if d = 0 then (deg, b :: q) else (deg, q)
) (in_deg, rest) out_edges in
go new_queue in_deg (node :: result)
in go queue in_deg []
Rust (idiomatic)
pub fn kahn_sort(nodes: &[&str], edges: &[(&str, &str)]) -> Option<Vec<String>> {
let mut in_degree: HashMap<&str, usize> = nodes.iter().map(|&n| (n, 0)).collect();
let mut adj: HashMap<&str, Vec<&str>> = nodes.iter().map(|&n| (n, Vec::new())).collect();
for &(from, to) in edges {
*in_degree.entry(to).or_insert(0) += 1;
adj.entry(from).or_default().push(to);
}
let mut queue: VecDeque<&str> = in_degree.iter()
.filter(|(_, &d)| d == 0).map(|(&n, _)| n).collect();
let mut result = Vec::new();
while let Some(node) = queue.pop_front() {
result.push(node.to_string());
// ... decrement neighbors, enqueue zeros
}
if result.len() == nodes.len() { Some(result) } else { None }
}
Rust (DFS-based alternative)
fn dfs_visit(node: &str, adj: &HashMap<&str, Vec<&str>>,
color: &mut HashMap<&str, Color>, result: &mut Vec<String>) -> bool {
color.insert(node, Color::Gray);
for &neighbor in adj.get(node).unwrap_or(&vec![]) {
match color.get(neighbor) {
Some(Color::Gray) => return false, // cycle!
Some(Color::White) => if !dfs_visit(neighbor, adj, color, result) { return false; }
_ => {}
}
}
color.insert(node, Color::Black);
result.push(node.to_string());
true
}
Type Signatures
| Concept | OCaml | Rust |
|---|---|---|
| Map type | SMap.t (balanced BST) | HashMap<&str, usize> (hash table) |
| Return type | string list (always returns) | Option<Vec<String>> (None = cycle) |
| Edge type | (string * string) list | &[(&str, &str)] |
| Queue | string list | VecDeque<&str> |
Key Insights
Map.Make functor vs Rust's HashMap** — OCaml uses a balanced tree (ordered, O(log n)), Rust uses a hash table (unordered, O(1)). For topological sort, order doesn't matter, so HashMap is faster.go function threads in_deg immutably through recursion. Rust mutates in_degree in place. Both are correct; OCaml's is more functional, Rust's is more efficient.result.len() < nodes.len(). DFS detects via "gray" (in-progress) nodes. Both O(V+E).List.fold_left vs Rust's for loops** — OCaml builds the in-degree map with a fold. Rust uses an imperative loop with entry() API. The Rust functional variant shows both styles work.Option<Vec<String>> is more honest than string list** — OCaml's version silently returns a partial list on cycles. Rust's Option forces the caller to handle the error case.When to Use Each Style
Use Kahn's algorithm when: You need a BFS-based ordering, want to detect cycles naturally, or need to process nodes level by level (e.g., build systems, task scheduling). Use DFS-based sort when: You're already doing graph traversal, want to detect back edges explicitly, or need post-order for other algorithms (e.g., SCC detection).
Exercises
Result<Vec<NodeId>, Vec<NodeId>> where the error contains nodes involved in the cycle."target: dep1 dep2" rules, topologically sort them, and print the build order.