1055-longest-common-subseq — Longest Common Subsequence
Functional Programming
Tutorial
The Problem
The longest common subsequence (LCS) finds the longest sequence of characters (not necessarily contiguous) that appears in the same order in two strings. It is the foundation of diff tools (git diff, Unix diff), DNA sequence alignment in bioinformatics, and file comparison in version control systems.
The DP solution uses a 2D table where dp[i][j] is the LCS length of the first i characters of s1 and the first j characters of s2, giving O(m×n) time and space.
🎯 Learning Outcomes
diff algorithm and DNA sequence alignmentCode Example
#![allow(clippy::all)]
// 1055: Longest Common Subsequence — 2D DP + Backtrack
use std::collections::HashMap;
// Approach 1: 2D DP table for length
fn lcs_length(s1: &str, s2: &str) -> usize {
let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
let (m, n) = (a.len(), b.len());
let mut dp = vec![vec![0usize; n + 1]; m + 1];
for i in 1..=m {
for j in 1..=n {
dp[i][j] = if a[i - 1] == b[j - 1] {
dp[i - 1][j - 1] + 1
} else {
dp[i - 1][j].max(dp[i][j - 1])
};
}
}
dp[m][n]
}
// Approach 2: DP + backtrack to reconstruct
fn lcs_string(s1: &str, s2: &str) -> String {
let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
let (m, n) = (a.len(), b.len());
let mut dp = vec![vec![0usize; n + 1]; m + 1];
for i in 1..=m {
for j in 1..=n {
dp[i][j] = if a[i - 1] == b[j - 1] {
dp[i - 1][j - 1] + 1
} else {
dp[i - 1][j].max(dp[i][j - 1])
};
}
}
// Backtrack
let mut result = Vec::new();
let (mut i, mut j) = (m, n);
while i > 0 && j > 0 {
if a[i - 1] == b[j - 1] {
result.push(a[i - 1]);
i -= 1;
j -= 1;
} else if dp[i - 1][j] > dp[i][j - 1] {
i -= 1;
} else {
j -= 1;
}
}
result.reverse();
result.into_iter().collect()
}
// Approach 3: Recursive with memoization
fn lcs_memo(s1: &str, s2: &str) -> usize {
let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
fn solve(
i: usize,
j: usize,
a: &[char],
b: &[char],
cache: &mut HashMap<(usize, usize), usize>,
) -> usize {
if i == 0 || j == 0 {
return 0;
}
if let Some(&v) = cache.get(&(i, j)) {
return v;
}
let v = if a[i - 1] == b[j - 1] {
solve(i - 1, j - 1, a, b, cache) + 1
} else {
solve(i - 1, j, a, b, cache).max(solve(i, j - 1, a, b, cache))
};
cache.insert((i, j), v);
v
}
let mut cache = HashMap::new();
solve(a.len(), b.len(), &a, &b, &mut cache)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_lcs_length() {
assert_eq!(lcs_length("abcde", "ace"), 3);
assert_eq!(lcs_length("abc", "abc"), 3);
assert_eq!(lcs_length("abc", "def"), 0);
}
#[test]
fn test_lcs_string() {
assert_eq!(lcs_string("abcde", "ace"), "ace");
assert_eq!(lcs_string("AGGTAB", "GXTXAYB"), "GTAB");
}
#[test]
fn test_lcs_memo() {
assert_eq!(lcs_memo("abcde", "ace"), 3);
assert_eq!(lcs_memo("AGGTAB", "GXTXAYB"), 4);
}
}Key Differences
chars() into Vec<char> for indexed access; OCaml converts to Array.of_seq.Array.make_matrix creates a 2D array in one call; Rust's vec![vec![0; n+1]; m+1] uses nested Vecs.dp cell comparisons from (m, n) to (0, 0) — the algorithm is identical.std::mem::swap makes the row rotation explicit.OCaml Approach
let lcs_length s1 s2 =
let a = Array.of_seq (String.to_seq s1) in
let b = Array.of_seq (String.to_seq s2) in
let m, n = Array.length a, Array.length b in
let dp = Array.make_matrix (m+1) (n+1) 0 in
for i = 1 to m do
for j = 1 to n do
dp.(i).(j) <- if a.(i-1) = b.(j-1)
then dp.(i-1).(j-1) + 1
else max dp.(i-1).(j) dp.(i).(j-1)
done
done;
dp.(m).(n)
The structure is identical; the syntax difference is OCaml's .(i) vs Rust's [i].
Full Source
#![allow(clippy::all)]
// 1055: Longest Common Subsequence — 2D DP + Backtrack
use std::collections::HashMap;
// Approach 1: 2D DP table for length
fn lcs_length(s1: &str, s2: &str) -> usize {
let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
let (m, n) = (a.len(), b.len());
let mut dp = vec![vec![0usize; n + 1]; m + 1];
for i in 1..=m {
for j in 1..=n {
dp[i][j] = if a[i - 1] == b[j - 1] {
dp[i - 1][j - 1] + 1
} else {
dp[i - 1][j].max(dp[i][j - 1])
};
}
}
dp[m][n]
}
// Approach 2: DP + backtrack to reconstruct
fn lcs_string(s1: &str, s2: &str) -> String {
let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
let (m, n) = (a.len(), b.len());
let mut dp = vec![vec![0usize; n + 1]; m + 1];
for i in 1..=m {
for j in 1..=n {
dp[i][j] = if a[i - 1] == b[j - 1] {
dp[i - 1][j - 1] + 1
} else {
dp[i - 1][j].max(dp[i][j - 1])
};
}
}
// Backtrack
let mut result = Vec::new();
let (mut i, mut j) = (m, n);
while i > 0 && j > 0 {
if a[i - 1] == b[j - 1] {
result.push(a[i - 1]);
i -= 1;
j -= 1;
} else if dp[i - 1][j] > dp[i][j - 1] {
i -= 1;
} else {
j -= 1;
}
}
result.reverse();
result.into_iter().collect()
}
// Approach 3: Recursive with memoization
fn lcs_memo(s1: &str, s2: &str) -> usize {
let (a, b): (Vec<char>, Vec<char>) = (s1.chars().collect(), s2.chars().collect());
fn solve(
i: usize,
j: usize,
a: &[char],
b: &[char],
cache: &mut HashMap<(usize, usize), usize>,
) -> usize {
if i == 0 || j == 0 {
return 0;
}
if let Some(&v) = cache.get(&(i, j)) {
return v;
}
let v = if a[i - 1] == b[j - 1] {
solve(i - 1, j - 1, a, b, cache) + 1
} else {
solve(i - 1, j, a, b, cache).max(solve(i, j - 1, a, b, cache))
};
cache.insert((i, j), v);
v
}
let mut cache = HashMap::new();
solve(a.len(), b.len(), &a, &b, &mut cache)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_lcs_length() {
assert_eq!(lcs_length("abcde", "ace"), 3);
assert_eq!(lcs_length("abc", "abc"), 3);
assert_eq!(lcs_length("abc", "def"), 0);
}
#[test]
fn test_lcs_string() {
assert_eq!(lcs_string("abcde", "ace"), "ace");
assert_eq!(lcs_string("AGGTAB", "GXTXAYB"), "GTAB");
}
#[test]
fn test_lcs_memo() {
assert_eq!(lcs_memo("abcde", "ace"), 3);
assert_eq!(lcs_memo("AGGTAB", "GXTXAYB"), 4);
}
}
✓ Tests
Rust test suite
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_lcs_length() {
assert_eq!(lcs_length("abcde", "ace"), 3);
assert_eq!(lcs_length("abc", "abc"), 3);
assert_eq!(lcs_length("abc", "def"), 0);
}
#[test]
fn test_lcs_string() {
assert_eq!(lcs_string("abcde", "ace"), "ace");
assert_eq!(lcs_string("AGGTAB", "GXTXAYB"), "GTAB");
}
#[test]
fn test_lcs_memo() {
assert_eq!(lcs_memo("abcde", "ace"), 3);
assert_eq!(lcs_memo("AGGTAB", "GXTXAYB"), 4);
}
}
Deep Comparison
Longest Common Subsequence — Comparison
Core Insight
LCS uses a 2D DP table where dp[i][j] = length of LCS of first i chars of s1 and first j chars of s2. Backtracking from the bottom-right corner reconstructs the actual subsequence.
OCaml Approach
Array.init for 2D table, Buffer for string reconstructions.[i] (O(1) for ASCII strings)ref cells for mutable indicesString.init with index reversal to flip the backtracked resultRust Approach
Vec<char> first (UTF-8 aware)vec![vec![0; n+1]; m+1] for 2D tableVec<char> then reverse()into_iter().collect() to convert back to StringComparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| String indexing | s.[i] — O(1) byte access | Must convert to Vec<char> first |
| 2D table | Array.init + Array.make | vec![vec![]; ...] |
| String building | Buffer.add_char + Buffer.contents | Vec<char> + .collect() |
| Reversal | String.init with reversed index | .reverse() in-place |
| UTF-8 handling | Byte-level (ASCII only) | Full Unicode via chars() |
Exercises
edit_distance_from_lcs(s1: &str, s2: &str) -> usize using the formula |s1| + |s2| - 2 * lcs_length.lcs_all(s1: &str, s2: &str) -> Vec<String> that returns all LCS strings (there may be multiple).+/- lines like git diff using LCS as the core.