970 Rope String
Tutorial
The Problem
Implement a rope — a tree-based string representation that enables O(log n) concatenation, splitting, and indexing for large strings. A Rope is either a Leaf(String) or a Node(left, right, total_len). Concatenation creates a new Node without copying content. Collecting to a String is O(n) but infrequent.
🎯 Learning Outcomes
enum Rope { Leaf(String), Node(Box<Rope>, Box<Rope>, usize) } with length cached in Nodeconcat(left, right) -> Rope that creates a Node in O(1) — no string copyinglen(&self) by reading the cached length from Node or Leaf::len()char_at(&self, idx: usize) -> Option<char> by navigating left/right subtrees using cached lengthsto_string_val(&self) -> String as a recursive O(n) collectionCode Example
#![allow(clippy::all)]
// 970: Rope String
// Tree-based string for O(log n) concat/split, O(log n) indexing
// OCaml: recursive type; Rust: enum with Box for recursion
#[derive(Debug, Clone)]
pub enum Rope {
Leaf(String),
Node(Box<Rope>, Box<Rope>, usize), // left, right, total_len
}
impl Rope {
pub fn from_str(s: &str) -> Self {
Rope::Leaf(s.to_string())
}
pub fn len(&self) -> usize {
match self {
Rope::Leaf(s) => s.len(),
Rope::Node(_, _, n) => *n,
}
}
pub fn is_empty(&self) -> bool {
self.len() == 0
}
pub fn concat(left: Rope, right: Rope) -> Rope {
let total = left.len() + right.len();
Rope::Node(Box::new(left), Box::new(right), total)
}
/// Collect rope into a String (O(n))
pub fn to_string_val(&self) -> String {
match self {
Rope::Leaf(s) => s.clone(),
Rope::Node(l, r, _) => {
let mut out = l.to_string_val();
out.push_str(&r.to_string_val());
out
}
}
}
/// Get character at index i (O(log n))
pub fn index(&self, i: usize) -> Option<char> {
match self {
Rope::Leaf(s) => s.chars().nth(i),
Rope::Node(l, r, _) => {
let ln = l.len();
if i < ln {
l.index(i)
} else {
r.index(i - ln)
}
}
}
}
/// Split at position i: returns (left[0..i], right[i..])
pub fn split(self, i: usize) -> (Rope, Rope) {
match self {
Rope::Leaf(s) => {
let i = i.min(s.len());
let left = Rope::Leaf(s[..i].to_string());
let right = Rope::Leaf(s[i..].to_string());
(left, right)
}
Rope::Node(l, r, _) => {
let ln = l.len();
if i <= ln {
let (ll, lr) = l.split(i);
(ll, Rope::concat(lr, *r))
} else {
let (rl, rr) = r.split(i - ln);
(Rope::concat(*l, rl), rr)
}
}
}
}
/// Extract substring [start, start+len)
pub fn sub(&self, start: usize, len: usize) -> String {
let (_, right) = self.clone().split(start);
let (mid, _) = right.split(len);
mid.to_string_val()
}
}
#[cfg(test)]
mod tests {
use super::*;
fn make_rope() -> Rope {
let r1 = Rope::from_str("Hello");
let r2 = Rope::from_str(", ");
let r3 = Rope::from_str("World");
let r4 = Rope::from_str("!");
Rope::concat(Rope::concat(r1, r2), Rope::concat(r3, r4))
}
#[test]
fn test_length_and_to_string() {
let rope = make_rope();
assert_eq!(rope.len(), 13);
assert_eq!(rope.to_string_val(), "Hello, World!");
}
#[test]
fn test_index() {
let rope = make_rope();
assert_eq!(rope.index(0), Some('H'));
assert_eq!(rope.index(7), Some('W'));
assert_eq!(rope.index(12), Some('!'));
assert_eq!(rope.index(13), None);
}
#[test]
fn test_sub() {
let rope = make_rope();
assert_eq!(rope.sub(7, 5), "World");
assert_eq!(rope.sub(0, 5), "Hello");
assert_eq!(rope.sub(5, 2), ", ");
}
#[test]
fn test_split() {
let rope = make_rope();
let (left, right) = rope.split(7);
assert_eq!(left.to_string_val(), "Hello, ");
assert_eq!(right.to_string_val(), "World!");
}
#[test]
fn test_empty_rope() {
let r = Rope::from_str("");
assert!(r.is_empty());
assert_eq!(r.to_string_val(), "");
}
}Key Differences
| Aspect | Rust | OCaml |
|---|---|---|
| Recursive enum | Box<Rope> in variants (required) | Direct recursion (GC handles indirection) |
| Concat | O(1) — allocates one Node | O(1) — allocates one Node record |
to_string | O(n) — push_str into buffer | O(n) — ^ for each concat |
char_at | O(log n) for balanced tree | O(log n) same |
Ropes are valuable for large text editors where users insert and delete characters frequently. For strings under ~1KB, plain String is faster due to CPU cache effects.
OCaml Approach
type rope =
| Leaf of string
| Node of rope * rope * int (* left, right, len *)
let from_str s = Leaf s
let len = function
| Leaf s -> String.length s
| Node (_, _, n) -> n
let concat l r = Node (l, r, len l + len r)
let rec char_at rope idx =
match rope with
| Leaf s -> if idx < String.length s then Some s.[idx] else None
| Node (left, right, _) ->
let ll = len left in
if idx < ll then char_at left idx
else char_at right (idx - ll)
let rec to_string = function
| Leaf s -> s
| Node (l, r, _) -> to_string l ^ to_string r (* O(n) due to ^ *)
OCaml's ^ is string concatenation — O(n) for the concat step. The rope type delays this cost: concat in OCaml is O(1) just like in Rust.
Full Source
#![allow(clippy::all)]
// 970: Rope String
// Tree-based string for O(log n) concat/split, O(log n) indexing
// OCaml: recursive type; Rust: enum with Box for recursion
#[derive(Debug, Clone)]
pub enum Rope {
Leaf(String),
Node(Box<Rope>, Box<Rope>, usize), // left, right, total_len
}
impl Rope {
pub fn from_str(s: &str) -> Self {
Rope::Leaf(s.to_string())
}
pub fn len(&self) -> usize {
match self {
Rope::Leaf(s) => s.len(),
Rope::Node(_, _, n) => *n,
}
}
pub fn is_empty(&self) -> bool {
self.len() == 0
}
pub fn concat(left: Rope, right: Rope) -> Rope {
let total = left.len() + right.len();
Rope::Node(Box::new(left), Box::new(right), total)
}
/// Collect rope into a String (O(n))
pub fn to_string_val(&self) -> String {
match self {
Rope::Leaf(s) => s.clone(),
Rope::Node(l, r, _) => {
let mut out = l.to_string_val();
out.push_str(&r.to_string_val());
out
}
}
}
/// Get character at index i (O(log n))
pub fn index(&self, i: usize) -> Option<char> {
match self {
Rope::Leaf(s) => s.chars().nth(i),
Rope::Node(l, r, _) => {
let ln = l.len();
if i < ln {
l.index(i)
} else {
r.index(i - ln)
}
}
}
}
/// Split at position i: returns (left[0..i], right[i..])
pub fn split(self, i: usize) -> (Rope, Rope) {
match self {
Rope::Leaf(s) => {
let i = i.min(s.len());
let left = Rope::Leaf(s[..i].to_string());
let right = Rope::Leaf(s[i..].to_string());
(left, right)
}
Rope::Node(l, r, _) => {
let ln = l.len();
if i <= ln {
let (ll, lr) = l.split(i);
(ll, Rope::concat(lr, *r))
} else {
let (rl, rr) = r.split(i - ln);
(Rope::concat(*l, rl), rr)
}
}
}
}
/// Extract substring [start, start+len)
pub fn sub(&self, start: usize, len: usize) -> String {
let (_, right) = self.clone().split(start);
let (mid, _) = right.split(len);
mid.to_string_val()
}
}
#[cfg(test)]
mod tests {
use super::*;
fn make_rope() -> Rope {
let r1 = Rope::from_str("Hello");
let r2 = Rope::from_str(", ");
let r3 = Rope::from_str("World");
let r4 = Rope::from_str("!");
Rope::concat(Rope::concat(r1, r2), Rope::concat(r3, r4))
}
#[test]
fn test_length_and_to_string() {
let rope = make_rope();
assert_eq!(rope.len(), 13);
assert_eq!(rope.to_string_val(), "Hello, World!");
}
#[test]
fn test_index() {
let rope = make_rope();
assert_eq!(rope.index(0), Some('H'));
assert_eq!(rope.index(7), Some('W'));
assert_eq!(rope.index(12), Some('!'));
assert_eq!(rope.index(13), None);
}
#[test]
fn test_sub() {
let rope = make_rope();
assert_eq!(rope.sub(7, 5), "World");
assert_eq!(rope.sub(0, 5), "Hello");
assert_eq!(rope.sub(5, 2), ", ");
}
#[test]
fn test_split() {
let rope = make_rope();
let (left, right) = rope.split(7);
assert_eq!(left.to_string_val(), "Hello, ");
assert_eq!(right.to_string_val(), "World!");
}
#[test]
fn test_empty_rope() {
let r = Rope::from_str("");
assert!(r.is_empty());
assert_eq!(r.to_string_val(), "");
}
}#[cfg(test)]
mod tests {
use super::*;
fn make_rope() -> Rope {
let r1 = Rope::from_str("Hello");
let r2 = Rope::from_str(", ");
let r3 = Rope::from_str("World");
let r4 = Rope::from_str("!");
Rope::concat(Rope::concat(r1, r2), Rope::concat(r3, r4))
}
#[test]
fn test_length_and_to_string() {
let rope = make_rope();
assert_eq!(rope.len(), 13);
assert_eq!(rope.to_string_val(), "Hello, World!");
}
#[test]
fn test_index() {
let rope = make_rope();
assert_eq!(rope.index(0), Some('H'));
assert_eq!(rope.index(7), Some('W'));
assert_eq!(rope.index(12), Some('!'));
assert_eq!(rope.index(13), None);
}
#[test]
fn test_sub() {
let rope = make_rope();
assert_eq!(rope.sub(7, 5), "World");
assert_eq!(rope.sub(0, 5), "Hello");
assert_eq!(rope.sub(5, 2), ", ");
}
#[test]
fn test_split() {
let rope = make_rope();
let (left, right) = rope.split(7);
assert_eq!(left.to_string_val(), "Hello, ");
assert_eq!(right.to_string_val(), "World!");
}
#[test]
fn test_empty_rope() {
let r = Rope::from_str("");
assert!(r.is_empty());
assert_eq!(r.to_string_val(), "");
}
}
Deep Comparison
Rope String — Comparison
Core Insight
A rope represents a string as a binary tree of smaller strings (leaves). Concatenation creates a new Node in O(1). Indexing and splitting traverse the tree in O(log n). Both languages express this as a recursive algebraic type. The critical Rust difference: Box<Rope> is required because recursive enum variants must have statically known size, and a direct self-reference would be infinite.
OCaml Approach
type rope = Leaf of string | Node of rope * rope * int — recursive type, implicitString.sub s 0 i for splitting leaf stringss.[i] for character access (byte index)to_string l ^ to_string r — string concatenation (O(n) but readable)Rust Approach
enum Rope { Leaf(String), Node(Box<Rope>, Box<Rope>, usize) } — explicit BoxBox::new(...) allocates on the heap, providing the indirection needed for recursions[..i].to_string() for splitting (byte-indexed slice)s.chars().nth(i) for Unicode-safe character access#[derive(Clone)] needed because to_string_val requires cloning for subl.to_string_val() + push_str instead of ^ operatorComparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| Recursive type | type rope = ... Node of rope * rope * int | enum Rope { Node(Box<Rope>, Box<Rope>, usize) } |
| Heap boxing | Implicit (GC) | Explicit Box<T> |
| String split | String.sub s 0 i | s[..i].to_string() |
| Char access | s.[i] (byte) | s.chars().nth(i) (Unicode) |
| Concat strings | ^ operator | String::push_str |
| Clone | Not needed (GC shares) | #[derive(Clone)] explicit |
| Length field | int | usize |
Exercises
split(rope, pos) -> (Rope, Rope) that splits the rope at character index pos in O(log n).insert(rope, pos, s) that inserts string s at position pos using split + concat.delete(rope, l, r) that removes characters in range [l, r) using split + concat.2 * log2(len), flatten to a Leaf and rebuild.iter_chars(&self) -> impl Iterator<Item=char> that lazily yields characters without materializing the full string.