073 Advanced

073 — Topological Sort

Functional Programming

Tutorial Video

Text description (accessibility)

This video demonstrates the "073 — Topological Sort" functional Rust example. Difficulty level: Advanced. Key concepts covered: Functional Programming. Topological sort orders the nodes of a directed acyclic graph (DAG) such that for every directed edge (u → v), node u appears before v in the ordering. Key difference from OCaml: 1. **Nested function**: Rust's nested `fn visit<'a>(...)` inside `topo_sort` requires explicit lifetime annotations for the borrowed adjacency map. OCaml closures capture the environment implicitly.

Tutorial

The Problem

Topological sort orders the nodes of a directed acyclic graph (DAG) such that for every directed edge (u → v), node u appears before v in the ordering. It answers the question: "in what order should I complete these tasks, given their dependencies?" This is one of the most widely used graph algorithms in practice.

Build systems (Bazel, Make, Cargo) topologically sort compilation units so dependencies compile before dependents. Package managers (npm, pip, apt) sort package installations so transitive dependencies install first. Spreadsheet engines recalculate cells in topological order. Database engines plan query execution and schema migrations using topological ordering.

The DFS-based algorithm (Tarjan, 1976) marks each node visited, recurses on all neighbors first, then adds the current node — this is post-order DFS. Reversing the post-order gives a valid topological ordering. Kahn's algorithm (BFS-based) is an alternative that processes nodes with in-degree 0 iteratively and naturally detects cycles.

🎯 Learning Outcomes

• Implement DFS-based topological sort with a visited set and post-order accumulation

• Build an adjacency list from edge pairs using HashMap<&str, Vec<&str>>

• Use a HashSet for O(1) visited checking

• Understand the post-order property: a node is added AFTER all its descendants

• Recognize that the reversed post-order DFS is a valid topological ordering

Code Example

#![allow(clippy::all)]
/// # Topological Sort — DAG Ordering
///
/// Order nodes in a directed acyclic graph so every edge goes from earlier to later.
/// Uses DFS-based algorithm.
use std::collections::{HashMap, HashSet};

/// DFS-based topological sort.
/// Returns nodes in topological order (dependencies first).
pub fn topo_sort(edges: &[(&str, &str)]) -> Vec<String> {
    // Collect all nodes
    let mut all_nodes = HashSet::new();
    let mut adj: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(from, to) in edges {
        all_nodes.insert(from);
        all_nodes.insert(to);
        adj.entry(from).or_default().push(to);
    }

    let mut visited = HashSet::new();
    let mut order = Vec::new();

    fn visit<'a>(
        node: &'a str,
        adj: &HashMap<&'a str, Vec<&'a str>>,
        visited: &mut HashSet<&'a str>,
        order: &mut Vec<String>,
    ) {
        if visited.contains(node) {
            return;
        }
        visited.insert(node);
        if let Some(neighbors) = adj.get(node) {
            for &neighbor in neighbors {
                visit(neighbor, adj, visited, order);
            }
        }
        // Post-order: add after all descendants visited
        order.push(node.to_string());
    }

    // Visit all nodes (handles disconnected components)
    let mut sorted_nodes: Vec<&str> = all_nodes.into_iter().collect();
    sorted_nodes.sort(); // deterministic ordering
    for node in sorted_nodes {
        visit(node, &adj, &mut visited, &mut order);
    }

    order.reverse(); // reverse post-order = topological order
    order
}

/// Kahn's algorithm (BFS-based) — alternative approach using in-degree counting.
pub fn topo_sort_kahn(edges: &[(&str, &str)]) -> Vec<String> {
    let mut all_nodes = HashSet::new();
    let mut adj: HashMap<&str, Vec<&str>> = HashMap::new();
    let mut in_degree: HashMap<&str, usize> = HashMap::new();

    for &(from, to) in edges {
        all_nodes.insert(from);
        all_nodes.insert(to);
        adj.entry(from).or_default().push(to);
        *in_degree.entry(to).or_insert(0) += 1;
        in_degree.entry(from).or_insert(0);
    }

    // Start with nodes that have no incoming edges
    let mut queue: Vec<&str> = in_degree
        .iter()
        .filter(|(_, &deg)| deg == 0)
        .map(|(&node, _)| node)
        .collect();
    queue.sort(); // deterministic
    let mut result = Vec::new();

    while let Some(node) = queue.first().copied() {
        queue.remove(0);
        result.push(node.to_string());
        if let Some(neighbors) = adj.get(node) {
            for &neighbor in neighbors {
                let deg = in_degree.get_mut(neighbor).unwrap();
                *deg -= 1;
                if *deg == 0 {
                    queue.push(neighbor);
                    queue.sort();
                }
            }
        }
    }

    result
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_basic() {
        let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
        let order = topo_sort(&edges);
        // a must come before b and c; b and c before d; d before e
        let pos = |s: &str| order.iter().position(|x| x == s).unwrap();
        assert!(pos("a") < pos("b"));
        assert!(pos("a") < pos("c"));
        assert!(pos("b") < pos("d"));
        assert!(pos("d") < pos("e"));
    }

    #[test]
    fn test_kahn() {
        let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
        let order = topo_sort_kahn(&edges);
        let pos = |s: &str| order.iter().position(|x| x == s).unwrap();
        assert!(pos("a") < pos("b"));
        assert!(pos("d") < pos("e"));
    }

    #[test]
    fn test_empty() {
        let order = topo_sort(&[]);
        assert!(order.is_empty());
    }

    #[test]
    fn test_single_edge() {
        let order = topo_sort(&[("a", "b")]);
        assert_eq!(order, vec!["a", "b"]);
    }

    #[test]
    fn test_linear_chain() {
        let edges = vec![("a", "b"), ("b", "c"), ("c", "d")];
        let order = topo_sort(&edges);
        assert_eq!(order, vec!["a", "b", "c", "d"]);
    }
}

(* Topological Sort — DAG Ordering *)

module SS = Set.Make(String)

let topo_sort edges =
  let neighbors node =
    List.filter_map (fun (a, b) -> if a = node then Some b else None) edges in
  let all_nodes = List.fold_left (fun s (a, b) -> SS.add a (SS.add b s)) SS.empty edges in
  let rec visit node (visited, order) =
    if SS.mem node visited then (visited, order)
    else
      let visited = SS.add node visited in
      let visited, order = List.fold_left (fun acc n ->
        visit n acc) (visited, order) (neighbors node) in
      (visited, node :: order)
  in
  let _, order = SS.fold (fun node acc -> visit node acc) all_nodes (SS.empty, []) in
  order

let () =
  let edges = [("a","b");("a","c");("b","d");("c","d");("d","e")] in
  List.iter (Printf.printf "%s ") (topo_sort edges);
  print_newline ()

Key Differences

Nested function: Rust's nested fn visit<'a>(...) inside topo_sort requires explicit lifetime annotations for the borrowed adjacency map. OCaml closures capture the environment implicitly.

**HashSet vs Hashtbl**: Rust's HashSet for visited nodes. OCaml's Hashtbl serves both set and map purposes. Both provide O(1) lookup.

Cycle detection: This implementation does not detect cycles (silently handles them by skipping visited nodes, which may produce incorrect output). A full implementation needs a "in progress" visited state (three-color DFS).

Ordering: The post-order accumulation produces dependencies-last. Reversing gives dependencies-first. Kahn's algorithm naturally produces dependencies-first without reversal.

OCaml Approach

OCaml uses a hash table for the visited set and a reference list for accumulation:

let topo_sort edges =
  let visited = Hashtbl.create 16 in
  let order = ref [] in
  let adj = build_adj edges in
  let rec visit node =
    if not (Hashtbl.mem visited node) then begin
      Hashtbl.add visited node ();
      List.iter visit (List.assoc_opt node adj |> Option.value ~default:[]);
      order := node :: !order
    end
  in
  List.iter (fun (a, b) -> visit a; visit b) edges;
  !order

The order := node :: !order prepends in post-order, giving the correct topological sequence without an explicit reverse step.

Full Source

#![allow(clippy::all)]
/// # Topological Sort — DAG Ordering
///
/// Order nodes in a directed acyclic graph so every edge goes from earlier to later.
/// Uses DFS-based algorithm.
use std::collections::{HashMap, HashSet};

/// DFS-based topological sort.
/// Returns nodes in topological order (dependencies first).
pub fn topo_sort(edges: &[(&str, &str)]) -> Vec<String> {
    // Collect all nodes
    let mut all_nodes = HashSet::new();
    let mut adj: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(from, to) in edges {
        all_nodes.insert(from);
        all_nodes.insert(to);
        adj.entry(from).or_default().push(to);
    }

    let mut visited = HashSet::new();
    let mut order = Vec::new();

    fn visit<'a>(
        node: &'a str,
        adj: &HashMap<&'a str, Vec<&'a str>>,
        visited: &mut HashSet<&'a str>,
        order: &mut Vec<String>,
    ) {
        if visited.contains(node) {
            return;
        }
        visited.insert(node);
        if let Some(neighbors) = adj.get(node) {
            for &neighbor in neighbors {
                visit(neighbor, adj, visited, order);
            }
        }
        // Post-order: add after all descendants visited
        order.push(node.to_string());
    }

    // Visit all nodes (handles disconnected components)
    let mut sorted_nodes: Vec<&str> = all_nodes.into_iter().collect();
    sorted_nodes.sort(); // deterministic ordering
    for node in sorted_nodes {
        visit(node, &adj, &mut visited, &mut order);
    }

    order.reverse(); // reverse post-order = topological order
    order
}

/// Kahn's algorithm (BFS-based) — alternative approach using in-degree counting.
pub fn topo_sort_kahn(edges: &[(&str, &str)]) -> Vec<String> {
    let mut all_nodes = HashSet::new();
    let mut adj: HashMap<&str, Vec<&str>> = HashMap::new();
    let mut in_degree: HashMap<&str, usize> = HashMap::new();

    for &(from, to) in edges {
        all_nodes.insert(from);
        all_nodes.insert(to);
        adj.entry(from).or_default().push(to);
        *in_degree.entry(to).or_insert(0) += 1;
        in_degree.entry(from).or_insert(0);
    }

    // Start with nodes that have no incoming edges
    let mut queue: Vec<&str> = in_degree
        .iter()
        .filter(|(_, &deg)| deg == 0)
        .map(|(&node, _)| node)
        .collect();
    queue.sort(); // deterministic
    let mut result = Vec::new();

    while let Some(node) = queue.first().copied() {
        queue.remove(0);
        result.push(node.to_string());
        if let Some(neighbors) = adj.get(node) {
            for &neighbor in neighbors {
                let deg = in_degree.get_mut(neighbor).unwrap();
                *deg -= 1;
                if *deg == 0 {
                    queue.push(neighbor);
                    queue.sort();
                }
            }
        }
    }

    result
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_basic() {
        let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
        let order = topo_sort(&edges);
        // a must come before b and c; b and c before d; d before e
        let pos = |s: &str| order.iter().position(|x| x == s).unwrap();
        assert!(pos("a") < pos("b"));
        assert!(pos("a") < pos("c"));
        assert!(pos("b") < pos("d"));
        assert!(pos("d") < pos("e"));
    }

    #[test]
    fn test_kahn() {
        let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
        let order = topo_sort_kahn(&edges);
        let pos = |s: &str| order.iter().position(|x| x == s).unwrap();
        assert!(pos("a") < pos("b"));
        assert!(pos("d") < pos("e"));
    }

    #[test]
    fn test_empty() {
        let order = topo_sort(&[]);
        assert!(order.is_empty());
    }

    #[test]
    fn test_single_edge() {
        let order = topo_sort(&[("a", "b")]);
        assert_eq!(order, vec!["a", "b"]);
    }

    #[test]
    fn test_linear_chain() {
        let edges = vec![("a", "b"), ("b", "c"), ("c", "d")];
        let order = topo_sort(&edges);
        assert_eq!(order, vec!["a", "b", "c", "d"]);
    }
}

(* Topological Sort — DAG Ordering *)

module SS = Set.Make(String)

let topo_sort edges =
  let neighbors node =
    List.filter_map (fun (a, b) -> if a = node then Some b else None) edges in
  let all_nodes = List.fold_left (fun s (a, b) -> SS.add a (SS.add b s)) SS.empty edges in
  let rec visit node (visited, order) =
    if SS.mem node visited then (visited, order)
    else
      let visited = SS.add node visited in
      let visited, order = List.fold_left (fun acc n ->
        visit n acc) (visited, order) (neighbors node) in
      (visited, node :: order)
  in
  let _, order = SS.fold (fun node acc -> visit node acc) all_nodes (SS.empty, []) in
  order

let () =
  let edges = [("a","b");("a","c");("b","d");("c","d");("d","e")] in
  List.iter (Printf.printf "%s ") (topo_sort edges);
  print_newline ()

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_basic() {
        let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
        let order = topo_sort(&edges);
        // a must come before b and c; b and c before d; d before e
        let pos = |s: &str| order.iter().position(|x| x == s).unwrap();
        assert!(pos("a") < pos("b"));
        assert!(pos("a") < pos("c"));
        assert!(pos("b") < pos("d"));
        assert!(pos("d") < pos("e"));
    }

    #[test]
    fn test_kahn() {
        let edges = vec![("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")];
        let order = topo_sort_kahn(&edges);
        let pos = |s: &str| order.iter().position(|x| x == s).unwrap();
        assert!(pos("a") < pos("b"));
        assert!(pos("d") < pos("e"));
    }

    #[test]
    fn test_empty() {
        let order = topo_sort(&[]);
        assert!(order.is_empty());
    }

    #[test]
    fn test_single_edge() {
        let order = topo_sort(&[("a", "b")]);
        assert_eq!(order, vec!["a", "b"]);
    }

    #[test]
    fn test_linear_chain() {
        let edges = vec![("a", "b"), ("b", "c"), ("c", "d")];
        let order = topo_sort(&edges);
        assert_eq!(order, vec!["a", "b", "c", "d"]);
    }
}

Deep Comparison

Topological Sort — OCaml vs Rust Comparison

Core Insight

Topological sort reveals how each language handles graph traversal state. OCaml threads immutable (visited_set, result_list) pairs through recursive calls — purely functional but verbose. Rust passes mutable references to HashSet and Vec — more explicit about mutation but closer to the imperative algorithm.

OCaml Approach

Uses Set.Make(String) for the visited set. The visit function takes and returns (visited, order) tuples — no mutation, but every recursive call creates new tuples. SS.fold iterates over all nodes to handle disconnected components. The functional style ensures referential transparency.

Rust Approach

Two algorithms: (1) DFS with &mut HashSet and &mut Vec passed to recursive visit — clear ownership of mutable state. (2) Kahn's BFS algorithm using in-degree counting. The Rust version explicitly allocates HashMap for adjacency lists upfront, making the graph structure clear.

Comparison Table

Aspect	OCaml	Rust
Memory	Immutable sets (tree nodes)	HashSet + Vec (hash table + array)
Null safety	N/A	N/A
Errors	No cycle detection	No cycle detection (add if needed)
Iteration	Recursive with tuple threading	Recursive with `&mut` references
State	Functional (pass-and-return)	Mutable references (borrow checker)

Things Rust Learners Should Notice

**&mut in recursive functions** — passing mutable refs down the call stack is idiomatic for graph algorithms

**HashMap::entry().or_default()** — builds adjacency lists cleanly

**Lifetime annotations 'a** — needed when storing references to input data in the visited set

Two algorithms — DFS (post-order reverse) vs Kahn's (BFS with in-degree) — same result, different trade-offs

Deterministic output — sorting nodes before iteration ensures reproducible results in tests

Exercises

Cycle detection: Extend topo_sort to return Result<Vec<String>, Vec<String>> where the error contains the nodes in the detected cycle. Use three-color DFS: white (unvisited), gray (in progress), black (done).

Kahn's algorithm: Implement topological sort using Kahn's algorithm: repeatedly remove nodes with in-degree 0. Return None if a cycle prevents completion.

Build order: Given Cargo-style [[dependencies]] as a list of (package, depends_on) pairs, produce a valid build order using topological sort.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust

Tutorial Video

Tutorial

The Problem

🎯 Learning Outcomes

Code Example

Key Differences

OCaml Approach

Full Source

Deep Comparison

Topological Sort — OCaml vs Rust Comparison

Core Insight

OCaml Approach

Rust Approach

Comparison Table

Things Rust Learners Should Notice

Further Reading

Exercises

Open Source Repos