1055 Intermediate

1055-longest-common-subseq — Longest Common Subsequence

Functional Programming

Tutorial

The Problem

The longest common subsequence (LCS) finds the longest sequence of characters (not necessarily contiguous) that appears in the same order in two strings. It is the foundation of diff tools (git diff, Unix diff), DNA sequence alignment in bioinformatics, and file comparison in version control systems.

The DP solution uses a 2D table where dp[i][j] is the LCS length of the first i characters of s1 and the first j characters of s2, giving O(m×n) time and space.

🎯 Learning Outcomes

• Implement LCS length using a 2D DP table

• Backtrack the DP table to reconstruct the actual subsequence

• Understand the recurrence: equal characters extend, unequal take the max

• Optimize space using a rolling two-row approach

• Connect LCS to the diff algorithm and DNA sequence alignment

Code Example

#![allow(dead_code)]
#![allow(clippy::all)]
// 1055: Longest Common Subsequence — 2D DP + Backtrack

use std::collections::HashMap;

// Approach 1: 2D DP table for length
fn lcs_length(s1: &str, s2: &str) -> usize {
    let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
    let (m, n) = (a.len(), b.len());
    let mut dp = vec![vec![0usize; n + 1]; m + 1];
    for i in 1..=m {
        for j in 1..=n {
            dp[i][j] = if a[i - 1] == b[j - 1] {
                dp[i - 1][j - 1] + 1
            } else {
                dp[i - 1][j].max(dp[i][j - 1])
            };
        }
    }
    dp[m][n]
}

// Approach 2: DP + backtrack to reconstruct
fn lcs_string(s1: &str, s2: &str) -> String {
    let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
    let (m, n) = (a.len(), b.len());
    let mut dp = vec![vec![0usize; n + 1]; m + 1];
    for i in 1..=m {
        for j in 1..=n {
            dp[i][j] = if a[i - 1] == b[j - 1] {
                dp[i - 1][j - 1] + 1
            } else {
                dp[i - 1][j].max(dp[i][j - 1])
            };
        }
    }
    // Backtrack
    let mut result = Vec::new();
    let (mut i, mut j) = (m, n);
    while i > 0 && j > 0 {
        if a[i - 1] == b[j - 1] {
            result.push(a[i - 1]);
            i -= 1;
            j -= 1;
        } else if dp[i - 1][j] > dp[i][j - 1] {
            i -= 1;
        } else {
            j -= 1;
        }
    }
    result.reverse();
    result.into_iter().collect()
}

// Approach 3: Recursive with memoization
fn lcs_memo(s1: &str, s2: &str) -> usize {
    let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
    fn solve(
        i: usize,
        j: usize,
        a: &[char],
        b: &[char],
        cache: &mut HashMap<(usize, usize), usize>,
    ) -> usize {
        if i == 0 || j == 0 {
            return 0;
        }
        if let Some(&v) = cache.get(&(i, j)) {
            return v;
        }
        let v = if a[i - 1] == b[j - 1] {
            solve(i - 1, j - 1, a, b, cache) + 1
        } else {
            solve(i - 1, j, a, b, cache).max(solve(i, j - 1, a, b, cache))
        };
        cache.insert((i, j), v);
        v
    }
    let mut cache = HashMap::new();
    solve(a.len(), b.len(), &a, &b, &mut cache)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_lcs_length() {
        assert_eq!(lcs_length("abcde", "ace"), 3);
        assert_eq!(lcs_length("abc", "abc"), 3);
        assert_eq!(lcs_length("abc", "def"), 0);
    }

    #[test]
    fn test_lcs_string() {
        assert_eq!(lcs_string("abcde", "ace"), "ace");
        assert_eq!(lcs_string("AGGTAB", "GXTXAYB"), "GTAB");
    }

    #[test]
    fn test_lcs_memo() {
        assert_eq!(lcs_memo("abcde", "ace"), 3);
        assert_eq!(lcs_memo("AGGTAB", "GXTXAYB"), 4);
    }
}

(* 1055: Longest Common Subsequence — 2D DP + Backtrack *)

(* Approach 1: 2D DP table for length *)
let lcs_length s1 s2 =
  let m = String.length s1 and n = String.length s2 in
  let dp = Array.init (m + 1) (fun _ -> Array.make (n + 1) 0) in
  for i = 1 to m do
    for j = 1 to n do
      if s1.[i - 1] = s2.[j - 1] then
        dp.(i).(j) <- dp.(i - 1).(j - 1) + 1
      else
        dp.(i).(j) <- max dp.(i - 1).(j) dp.(i).(j - 1)
    done
  done;
  dp.(m).(n)

(* Approach 2: DP + backtrack to reconstruct the subsequence *)
let lcs_string s1 s2 =
  let m = String.length s1 and n = String.length s2 in
  let dp = Array.init (m + 1) (fun _ -> Array.make (n + 1) 0) in
  for i = 1 to m do
    for j = 1 to n do
      if s1.[i - 1] = s2.[j - 1] then
        dp.(i).(j) <- dp.(i - 1).(j - 1) + 1
      else
        dp.(i).(j) <- max dp.(i - 1).(j) dp.(i).(j - 1)
    done
  done;
  (* Backtrack to find the actual subsequence *)
  let buf = Buffer.create 16 in
  let i = ref m and j = ref n in
  while !i > 0 && !j > 0 do
    if s1.[!i - 1] = s2.[!j - 1] then begin
      Buffer.add_char buf s1.[!i - 1];
      decr i; decr j
    end else if dp.(!i - 1).(!j) > dp.(!i).(!j - 1) then
      decr i
    else
      decr j
  done;
  let s = Buffer.contents buf in
  String.init (String.length s) (fun i -> s.[String.length s - 1 - i])

(* Approach 3: Recursive with memoization *)
let lcs_memo s1 s2 =
  let cache = Hashtbl.create 128 in
  let rec solve i j =
    if i = 0 || j = 0 then 0
    else
      match Hashtbl.find_opt cache (i, j) with
      | Some v -> v
      | None ->
        let v =
          if s1.[i - 1] = s2.[j - 1] then solve (i - 1) (j - 1) + 1
          else max (solve (i - 1) j) (solve i (j - 1))
        in
        Hashtbl.add cache (i, j) v;
        v
  in
  solve (String.length s1) (String.length s2)

let () =
  assert (lcs_length "abcde" "ace" = 3);
  assert (lcs_length "abc" "abc" = 3);
  assert (lcs_length "abc" "def" = 0);
  assert (lcs_string "abcde" "ace" = "ace");
  assert (lcs_string "AGGTAB" "GXTXAYB" = "GTAB");
  assert (lcs_memo "abcde" "ace" = 3);
  assert (lcs_memo "AGGTAB" "GXTXAYB" = 4);
  Printf.printf "✓ All tests passed\n"

Key Differences

Character iteration: Rust collects chars() into Vec<char> for indexed access; OCaml converts to Array.of_seq.

Matrix initialization: OCaml's Array.make_matrix creates a 2D array in one call; Rust's vec![vec![0; n+1]; m+1] uses nested Vecs.

Backtracking: Both reconstruct the LCS by following dp cell comparisons from (m, n) to (0, 0) — the algorithm is identical.

Space optimization: The rolling two-row optimization is straightforward in both; Rust's std::mem::swap makes the row rotation explicit.

OCaml Approach

let lcs_length s1 s2 =
  let a = Array.of_seq (String.to_seq s1) in
  let b = Array.of_seq (String.to_seq s2) in
  let m, n = Array.length a, Array.length b in
  let dp = Array.make_matrix (m+1) (n+1) 0 in
  for i = 1 to m do
    for j = 1 to n do
      dp.(i).(j) <- if a.(i-1) = b.(j-1)
        then dp.(i-1).(j-1) + 1
        else max dp.(i-1).(j) dp.(i).(j-1)
    done
  done;
  dp.(m).(n)

The structure is identical; the syntax difference is OCaml's .(i) vs Rust's [i].

Full Source

#![allow(dead_code)]
#![allow(clippy::all)]
// 1055: Longest Common Subsequence — 2D DP + Backtrack

use std::collections::HashMap;

// Approach 1: 2D DP table for length
fn lcs_length(s1: &str, s2: &str) -> usize {
    let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
    let (m, n) = (a.len(), b.len());
    let mut dp = vec![vec![0usize; n + 1]; m + 1];
    for i in 1..=m {
        for j in 1..=n {
            dp[i][j] = if a[i - 1] == b[j - 1] {
                dp[i - 1][j - 1] + 1
            } else {
                dp[i - 1][j].max(dp[i][j - 1])
            };
        }
    }
    dp[m][n]
}

// Approach 2: DP + backtrack to reconstruct
fn lcs_string(s1: &str, s2: &str) -> String {
    let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
    let (m, n) = (a.len(), b.len());
    let mut dp = vec![vec![0usize; n + 1]; m + 1];
    for i in 1..=m {
        for j in 1..=n {
            dp[i][j] = if a[i - 1] == b[j - 1] {
                dp[i - 1][j - 1] + 1
            } else {
                dp[i - 1][j].max(dp[i][j - 1])
            };
        }
    }
    // Backtrack
    let mut result = Vec::new();
    let (mut i, mut j) = (m, n);
    while i > 0 && j > 0 {
        if a[i - 1] == b[j - 1] {
            result.push(a[i - 1]);
            i -= 1;
            j -= 1;
        } else if dp[i - 1][j] > dp[i][j - 1] {
            i -= 1;
        } else {
            j -= 1;
        }
    }
    result.reverse();
    result.into_iter().collect()
}

// Approach 3: Recursive with memoization
fn lcs_memo(s1: &str, s2: &str) -> usize {
    let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
    fn solve(
        i: usize,
        j: usize,
        a: &[char],
        b: &[char],
        cache: &mut HashMap<(usize, usize), usize>,
    ) -> usize {
        if i == 0 || j == 0 {
            return 0;
        }
        if let Some(&v) = cache.get(&(i, j)) {
            return v;
        }
        let v = if a[i - 1] == b[j - 1] {
            solve(i - 1, j - 1, a, b, cache) + 1
        } else {
            solve(i - 1, j, a, b, cache).max(solve(i, j - 1, a, b, cache))
        };
        cache.insert((i, j), v);
        v
    }
    let mut cache = HashMap::new();
    solve(a.len(), b.len(), &a, &b, &mut cache)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_lcs_length() {
        assert_eq!(lcs_length("abcde", "ace"), 3);
        assert_eq!(lcs_length("abc", "abc"), 3);
        assert_eq!(lcs_length("abc", "def"), 0);
    }

    #[test]
    fn test_lcs_string() {
        assert_eq!(lcs_string("abcde", "ace"), "ace");
        assert_eq!(lcs_string("AGGTAB", "GXTXAYB"), "GTAB");
    }

    #[test]
    fn test_lcs_memo() {
        assert_eq!(lcs_memo("abcde", "ace"), 3);
        assert_eq!(lcs_memo("AGGTAB", "GXTXAYB"), 4);
    }
}

(* 1055: Longest Common Subsequence — 2D DP + Backtrack *)

(* Approach 1: 2D DP table for length *)
let lcs_length s1 s2 =
  let m = String.length s1 and n = String.length s2 in
  let dp = Array.init (m + 1) (fun _ -> Array.make (n + 1) 0) in
  for i = 1 to m do
    for j = 1 to n do
      if s1.[i - 1] = s2.[j - 1] then
        dp.(i).(j) <- dp.(i - 1).(j - 1) + 1
      else
        dp.(i).(j) <- max dp.(i - 1).(j) dp.(i).(j - 1)
    done
  done;
  dp.(m).(n)

(* Approach 2: DP + backtrack to reconstruct the subsequence *)
let lcs_string s1 s2 =
  let m = String.length s1 and n = String.length s2 in
  let dp = Array.init (m + 1) (fun _ -> Array.make (n + 1) 0) in
  for i = 1 to m do
    for j = 1 to n do
      if s1.[i - 1] = s2.[j - 1] then
        dp.(i).(j) <- dp.(i - 1).(j - 1) + 1
      else
        dp.(i).(j) <- max dp.(i - 1).(j) dp.(i).(j - 1)
    done
  done;
  (* Backtrack to find the actual subsequence *)
  let buf = Buffer.create 16 in
  let i = ref m and j = ref n in
  while !i > 0 && !j > 0 do
    if s1.[!i - 1] = s2.[!j - 1] then begin
      Buffer.add_char buf s1.[!i - 1];
      decr i; decr j
    end else if dp.(!i - 1).(!j) > dp.(!i).(!j - 1) then
      decr i
    else
      decr j
  done;
  let s = Buffer.contents buf in
  String.init (String.length s) (fun i -> s.[String.length s - 1 - i])

(* Approach 3: Recursive with memoization *)
let lcs_memo s1 s2 =
  let cache = Hashtbl.create 128 in
  let rec solve i j =
    if i = 0 || j = 0 then 0
    else
      match Hashtbl.find_opt cache (i, j) with
      | Some v -> v
      | None ->
        let v =
          if s1.[i - 1] = s2.[j - 1] then solve (i - 1) (j - 1) + 1
          else max (solve (i - 1) j) (solve i (j - 1))
        in
        Hashtbl.add cache (i, j) v;
        v
  in
  solve (String.length s1) (String.length s2)

let () =
  assert (lcs_length "abcde" "ace" = 3);
  assert (lcs_length "abc" "abc" = 3);
  assert (lcs_length "abc" "def" = 0);
  assert (lcs_string "abcde" "ace" = "ace");
  assert (lcs_string "AGGTAB" "GXTXAYB" = "GTAB");
  assert (lcs_memo "abcde" "ace" = 3);
  assert (lcs_memo "AGGTAB" "GXTXAYB" = 4);
  Printf.printf "✓ All tests passed\n"

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_lcs_length() {
        assert_eq!(lcs_length("abcde", "ace"), 3);
        assert_eq!(lcs_length("abc", "abc"), 3);
        assert_eq!(lcs_length("abc", "def"), 0);
    }

    #[test]
    fn test_lcs_string() {
        assert_eq!(lcs_string("abcde", "ace"), "ace");
        assert_eq!(lcs_string("AGGTAB", "GXTXAYB"), "GTAB");
    }

    #[test]
    fn test_lcs_memo() {
        assert_eq!(lcs_memo("abcde", "ace"), 3);
        assert_eq!(lcs_memo("AGGTAB", "GXTXAYB"), 4);
    }
}

Deep Comparison

Longest Common Subsequence — Comparison

Core Insight

LCS uses a 2D DP table where dp[i][j] = length of LCS of first i chars of s1 and first j chars of s2. Backtracking from the bottom-right corner reconstructs the actual subsequence.

OCaml Approach

• Array.init for 2D table, Buffer for string reconstruction

• Character access via s.[i] (O(1) for ASCII strings)

• Backtracking uses ref cells for mutable indices

• String.init with index reversal to flip the backtracked result

Rust Approach

• Strings converted to Vec<char> first (UTF-8 aware)

• vec![vec![0; n+1]; m+1] for 2D table

• Backtracking pushes to Vec<char> then reverse()

• into_iter().collect() to convert back to String

Comparison Table

Aspect	OCaml	Rust
String indexing	`s.[i]` — O(1) byte access	Must convert to `Vec<char>` first
2D table	`Array.init` + `Array.make`	`vec![vec![]; ...]`
String building	`Buffer.add_char` + `Buffer.contents`	`Vec<char>` + `.collect()`
Reversal	`String.init` with reversed index	`.reverse()` in-place
UTF-8 handling	Byte-level (ASCII only)	Full Unicode via `chars()`

Exercises

Implement edit_distance_from_lcs(s1: &str, s2: &str) -> usize using the formula |s1| + |s2| - 2 * lcs_length.

Write lcs_all(s1: &str, s2: &str) -> Vec<String> that returns all LCS strings (there may be multiple).

Implement the Myers diff algorithm that produces +/- lines like git diff using LCS as the core.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust