946 Catamorphism
Tutorial
The Problem
Implement a catamorphism for a binary tree — the generalized fold that replaces every constructor with a function. The catamorphism cata(tree, leaf, node_fn) collapses the entire tree into a single value by recursively substituting Leaf with leaf and Node(l, v, r) with node_fn(cata(l), v, cata(r)). Derive tree size, sum, height, and in-order list from this single combinator.
🎯 Learning Outcomes
cata<T, R>(tree, leaf: R, node: &dyn Fn(R, &T, R) -> R) -> R for a binary treesize, sum, height, and to_list as specializations of cataCode Example
#![allow(clippy::all)]
/// # Catamorphism — Generalized Fold on ADTs
///
/// A catamorphism replaces each constructor of an algebraic data type
/// with a function. It's the universal way to consume a recursive structure.
/// Binary tree
#[derive(Debug, Clone, PartialEq)]
pub enum Tree<T> {
Leaf,
Node(Box<Tree<T>>, T, Box<Tree<T>>),
}
impl<T> Tree<T> {
pub fn node(left: Tree<T>, value: T, right: Tree<T>) -> Self {
Tree::Node(Box::new(left), value, Box::new(right))
}
}
/// The catamorphism: replace Leaf with `leaf` value, Node with `node` function.
/// This is the most general way to fold over a tree.
pub fn cata<T, R>(tree: &Tree<T>, leaf: R, node: &dyn Fn(R, &T, R) -> R) -> R
where
R: Clone,
{
match tree {
Tree::Leaf => leaf,
Tree::Node(l, v, r) => {
let left_result = cata(l, leaf.clone(), node);
let right_result = cata(r, leaf, node);
node(left_result, v, right_result)
}
}
}
/// Size: count all nodes
pub fn size<T>(tree: &Tree<T>) -> usize {
cata(tree, 0, &|l, _, r| 1 + l + r)
}
/// Sum: add all values
pub fn sum(tree: &Tree<i64>) -> i64 {
cata(tree, 0, &|l, v, r| l + v + r)
}
/// Height: longest path from root to leaf
pub fn height<T>(tree: &Tree<T>) -> usize {
cata(tree, 0, &|l, _, r| 1 + l.max(r))
}
/// Mirror: swap left and right subtrees
pub fn mirror<T: Clone>(tree: &Tree<T>) -> Tree<T> {
match tree {
Tree::Leaf => Tree::Leaf,
Tree::Node(l, v, r) => Tree::node(mirror(r), v.clone(), mirror(l)),
}
}
/// In-order traversal to list
pub fn to_list<T: Clone>(tree: &Tree<T>) -> Vec<T> {
cata(tree, vec![], &|mut l, v, r| {
l.push(v.clone());
l.extend(r);
l
})
}
#[cfg(test)]
mod tests {
use super::*;
fn sample_tree() -> Tree<i64> {
Tree::node(
Tree::node(Tree::Leaf, 1, Tree::Leaf),
2,
Tree::node(Tree::Leaf, 3, Tree::Leaf),
)
}
#[test]
fn test_size() {
assert_eq!(size(&sample_tree()), 3);
assert_eq!(size::<i64>(&Tree::Leaf), 0);
}
#[test]
fn test_sum() {
assert_eq!(sum(&sample_tree()), 6);
}
#[test]
fn test_height() {
assert_eq!(height(&sample_tree()), 2);
assert_eq!(height::<i64>(&Tree::Leaf), 0);
}
#[test]
fn test_mirror() {
let t = sample_tree();
let m = mirror(&t);
assert_eq!(to_list(&m), vec![3, 2, 1]);
}
#[test]
fn test_to_list() {
assert_eq!(to_list(&sample_tree()), vec![1, 2, 3]);
}
#[test]
fn test_catamorphism_is_general() {
// Any tree computation can be expressed as a cata
let product = cata(&sample_tree(), 1i64, &|l, v, r| l * v * r);
assert_eq!(product, 6); // 1 * 1 * 2 * 1 * 3 * 1 = 6... wait
// Actually: Node(Node(Leaf,1,Leaf), 2, Node(Leaf,3,Leaf))
// = node(node(1, 1, 1), 2, node(1, 3, 1)) = node(1, 2, 3) = 1*2*3 = 6
assert_eq!(product, 6);
}
}Key Differences
| Aspect | Rust | OCaml |
|---|---|---|
| Constructor replacement | leaf: R value + node: &dyn Fn(R, &T, R) -> R | leaf_val + curried node_fn |
| Clone requirement | R: Clone for sharing leaf value | Immutable GC values — no cloning needed |
| Point-free style | Verbose (cata(tree, ...)) | Natural (cata 0 (fun ...) tree) |
| List append in algebra | extend — O(n) like OCaml @ | @ — also O(n); use accumulator for O(n log n) overall |
Catamorphisms embody the principle that all observation of a recursive type flows through its fold. Any function on a Tree<T> that does not require sharing/cycles can be expressed as a cata.
OCaml Approach
type 'a tree =
| Leaf
| Node of 'a tree * 'a * 'a tree
let rec cata leaf_val node_fn = function
| Leaf -> leaf_val
| Node (l, v, r) ->
node_fn
(cata leaf_val node_fn l)
v
(cata leaf_val node_fn r)
let size t = cata 0 (fun l _ r -> 1 + l + r) t
let sum t = cata 0 (fun l v r -> l + v + r) t
let height t = cata 0 (fun l _ r -> 1 + max l r) t
let to_list t =
cata [] (fun l v r -> l @ [v] @ r) t
OCaml's curried cata leaf_val node_fn produces a function tree -> result — natural point-free style. to_list uses @ (list append) which is O(n) per step; production code would use an accumulator.
Full Source
#![allow(clippy::all)]
/// # Catamorphism — Generalized Fold on ADTs
///
/// A catamorphism replaces each constructor of an algebraic data type
/// with a function. It's the universal way to consume a recursive structure.
/// Binary tree
#[derive(Debug, Clone, PartialEq)]
pub enum Tree<T> {
Leaf,
Node(Box<Tree<T>>, T, Box<Tree<T>>),
}
impl<T> Tree<T> {
pub fn node(left: Tree<T>, value: T, right: Tree<T>) -> Self {
Tree::Node(Box::new(left), value, Box::new(right))
}
}
/// The catamorphism: replace Leaf with `leaf` value, Node with `node` function.
/// This is the most general way to fold over a tree.
pub fn cata<T, R>(tree: &Tree<T>, leaf: R, node: &dyn Fn(R, &T, R) -> R) -> R
where
R: Clone,
{
match tree {
Tree::Leaf => leaf,
Tree::Node(l, v, r) => {
let left_result = cata(l, leaf.clone(), node);
let right_result = cata(r, leaf, node);
node(left_result, v, right_result)
}
}
}
/// Size: count all nodes
pub fn size<T>(tree: &Tree<T>) -> usize {
cata(tree, 0, &|l, _, r| 1 + l + r)
}
/// Sum: add all values
pub fn sum(tree: &Tree<i64>) -> i64 {
cata(tree, 0, &|l, v, r| l + v + r)
}
/// Height: longest path from root to leaf
pub fn height<T>(tree: &Tree<T>) -> usize {
cata(tree, 0, &|l, _, r| 1 + l.max(r))
}
/// Mirror: swap left and right subtrees
pub fn mirror<T: Clone>(tree: &Tree<T>) -> Tree<T> {
match tree {
Tree::Leaf => Tree::Leaf,
Tree::Node(l, v, r) => Tree::node(mirror(r), v.clone(), mirror(l)),
}
}
/// In-order traversal to list
pub fn to_list<T: Clone>(tree: &Tree<T>) -> Vec<T> {
cata(tree, vec![], &|mut l, v, r| {
l.push(v.clone());
l.extend(r);
l
})
}
#[cfg(test)]
mod tests {
use super::*;
fn sample_tree() -> Tree<i64> {
Tree::node(
Tree::node(Tree::Leaf, 1, Tree::Leaf),
2,
Tree::node(Tree::Leaf, 3, Tree::Leaf),
)
}
#[test]
fn test_size() {
assert_eq!(size(&sample_tree()), 3);
assert_eq!(size::<i64>(&Tree::Leaf), 0);
}
#[test]
fn test_sum() {
assert_eq!(sum(&sample_tree()), 6);
}
#[test]
fn test_height() {
assert_eq!(height(&sample_tree()), 2);
assert_eq!(height::<i64>(&Tree::Leaf), 0);
}
#[test]
fn test_mirror() {
let t = sample_tree();
let m = mirror(&t);
assert_eq!(to_list(&m), vec![3, 2, 1]);
}
#[test]
fn test_to_list() {
assert_eq!(to_list(&sample_tree()), vec![1, 2, 3]);
}
#[test]
fn test_catamorphism_is_general() {
// Any tree computation can be expressed as a cata
let product = cata(&sample_tree(), 1i64, &|l, v, r| l * v * r);
assert_eq!(product, 6); // 1 * 1 * 2 * 1 * 3 * 1 = 6... wait
// Actually: Node(Node(Leaf,1,Leaf), 2, Node(Leaf,3,Leaf))
// = node(node(1, 1, 1), 2, node(1, 3, 1)) = node(1, 2, 3) = 1*2*3 = 6
assert_eq!(product, 6);
}
}#[cfg(test)]
mod tests {
use super::*;
fn sample_tree() -> Tree<i64> {
Tree::node(
Tree::node(Tree::Leaf, 1, Tree::Leaf),
2,
Tree::node(Tree::Leaf, 3, Tree::Leaf),
)
}
#[test]
fn test_size() {
assert_eq!(size(&sample_tree()), 3);
assert_eq!(size::<i64>(&Tree::Leaf), 0);
}
#[test]
fn test_sum() {
assert_eq!(sum(&sample_tree()), 6);
}
#[test]
fn test_height() {
assert_eq!(height(&sample_tree()), 2);
assert_eq!(height::<i64>(&Tree::Leaf), 0);
}
#[test]
fn test_mirror() {
let t = sample_tree();
let m = mirror(&t);
assert_eq!(to_list(&m), vec![3, 2, 1]);
}
#[test]
fn test_to_list() {
assert_eq!(to_list(&sample_tree()), vec![1, 2, 3]);
}
#[test]
fn test_catamorphism_is_general() {
// Any tree computation can be expressed as a cata
let product = cata(&sample_tree(), 1i64, &|l, v, r| l * v * r);
assert_eq!(product, 6); // 1 * 1 * 2 * 1 * 3 * 1 = 6... wait
// Actually: Node(Node(Leaf,1,Leaf), 2, Node(Leaf,3,Leaf))
// = node(node(1, 1, 1), 2, node(1, 3, 1)) = node(1, 2, 3) = 1*2*3 = 6
assert_eq!(product, 6);
}
}
Deep Comparison
Catamorphism — OCaml vs Rust Comparison
Core Insight
A catamorphism is the "universal fold" over any algebraic data type. You replace each constructor with a function: Leaf → leaf_value, Node(l,v,r) → node_fn(l_result, v, r_result). Once you have cata, any tree computation (size, sum, height, mirror) is just choosing the right leaf and node functions.
OCaml Approach
Labeled arguments make the pattern crystal clear: cata ~leaf:0 ~node:(fun l _ r -> 1 + l + r) reads almost like a specification. The parametric polymorphism of the return type means cata can produce any type — integers, trees, lists. OCaml's GC handles all the intermediate allocations.
Rust Approach
Uses &dyn Fn(R, &T, R) -> R for the node function — dynamic dispatch via trait objects. The R: Clone bound is needed because the leaf value must be cloned for each leaf in the tree. mirror can't easily use cata because it needs to produce Tree<T> which involves ownership — showing where Rust's ownership model creates friction with generic recursive patterns.
Comparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| Memory | GC handles all allocations | Clone bound for leaf, Box for nodes |
| Null safety | N/A | N/A |
| Errors | N/A | Ownership issues with tree-producing cata |
| Iteration | Recursive pattern match | Recursive pattern match |
| Ergonomics | Labeled args (~leaf ~node) | Trait objects (&dyn Fn) |
Things Rust Learners Should Notice
&dyn Fn vs generic F: Fn** — dyn dispatch avoids monomorphization bloat for recursive callsR: Clone** — the leaf value must be cloneable since it's used at every leaf positionBox<Tree<T>>** — recursive types need indirection in Rust (Box) but not in OCamlTree from cata requires Clone on T and fighting ownershipfold / reduceFurther Reading
Exercises
contains<T: PartialEq>(tree, target) using cata.flatten_preorder (root before children) using cata — note that pre-order requires different placement of v.mirror(tree) as a cata — the node algebra returns Node(r_result, v, l_result).map_tree<T, U>(tree, f) using cata where the output is a Tree<U>.depth_first_values that collects node values in pre-order without using cata, then compare it to the cata-based version for clarity.