1046-cow-collections — Clone-on-Write Collections
Tutorial
The Problem
Many operations on collections are read-only most of the time: a configuration lookup, a data validation pass, or a rendering pipeline. Cloning the entire collection for a rare mutation is wasteful. Clone-on-write (Cow) defers allocation until mutation is actually needed, returning a borrowed reference for the common read-only path.
Rust's Cow<'_, [T]> and Cow<'_, str> implement this: Borrowed holds a reference with zero allocation, Owned holds a fully-owned copy. The transition from borrowed to owned happens lazily via .to_mut().
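The same borrowed-or-owned split appears in the standard library. The following is a minimal sketch of `String::from_utf8_lossy`, which returns `Cow<'_, str>`: it borrows when the bytes are already valid UTF-8 and allocates only when invalid bytes must be replaced with U+FFFD. The wrapper name `decode_label` is a hypothetical helper introduced for illustration.

```rust
use std::borrow::Cow;

/// Hypothetical helper: decode bytes as text, tolerating invalid UTF-8.
fn decode_label(bytes: &[u8]) -> Cow<'_, str> {
    // Borrowed for valid UTF-8, Owned when replacement characters are inserted.
    String::from_utf8_lossy(bytes)
}

fn main() {
    // Valid input: no allocation, the Cow borrows the original bytes.
    let valid = decode_label(b"hello");
    assert!(matches!(valid, Cow::Borrowed("hello")));

    // Invalid input: the byte 0xFF is replaced, so a new String is allocated.
    let invalid = decode_label(b"hel\xFFlo");
    assert!(matches!(invalid, Cow::Owned(_)));
    assert_eq!(&*invalid, "hel\u{FFFD}lo");
}
```

Callers that only read the result can treat both cases as `&str` through `Deref`, so the allocation stays an implementation detail of the rare path.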
🎯 Learning Outcomes
- Use Cow::Borrowed for zero-allocation read-only access to slices and strings
- Defer Cow::Owned allocation until mutation is needed
- Use to_mut() for lazy clone-on-first-write
- Recognize Cow in standard library APIs (String::from_utf8_lossy)
- Decide when Cow is appropriate vs always cloning
Code Example
#![allow(clippy::all)]
// 1046: Clone-on-Write: Cow<'_, [T]> for Read-Mostly Data
// Avoid cloning until mutation is actually needed

use std::borrow::Cow;

/// Process data without cloning if no modification needed
fn process_data(data: &[i32], threshold: i32) -> Cow<'_, [i32]> {
    if data.iter().all(|&x| x <= threshold) {
        // No change needed: borrow original data
        Cow::Borrowed(data)
    } else {
        // Need to modify: build an owned copy with values capped at the threshold
        Cow::Owned(data.iter().map(|&x| x.min(threshold)).collect())
    }
}

fn cow_borrow_vs_owned() {
    let data = vec![1, 2, 3, 4, 5];

    // Case 1: All values within threshold, no clone
    let result = process_data(&data, 10);
    assert!(matches!(result, Cow::Borrowed(_)));
    assert_eq!(&*result, &[1, 2, 3, 4, 5]);

    // Case 2: Some values exceed threshold, clones and caps
    let result = process_data(&data, 3);
    assert!(matches!(result, Cow::Owned(_)));
    assert_eq!(&*result, &[1, 2, 3, 3, 3]);
}

/// Cow<str> for string processing
fn normalize_name(name: &str) -> Cow<'_, str> {
    if name.contains(char::is_uppercase) {
        // Need to modify: allocate new string
        Cow::Owned(name.to_lowercase())
    } else {
        // Already lowercase: just borrow
        Cow::Borrowed(name)
    }
}

fn cow_str_demo() {
    let name1 = "alice";
    let result1 = normalize_name(name1);
    assert!(matches!(result1, Cow::Borrowed(_)));
    assert_eq!(&*result1, "alice");

    let name2 = "Alice";
    let result2 = normalize_name(name2);
    assert!(matches!(result2, Cow::Owned(_)));
    assert_eq!(&*result2, "alice");
}

/// to_mut() triggers clone only on first mutation
fn to_mut_demo() {
    let data = vec![1, 2, 3, 4, 5];
    let mut cow: Cow<'_, [i32]> = Cow::Borrowed(&data);

    // Reading doesn't clone
    assert_eq!(cow[0], 1);
    assert!(matches!(cow, Cow::Borrowed(_)));

    // Mutation triggers clone
    cow.to_mut()[0] = 99;
    assert!(matches!(cow, Cow::Owned(_)));
    assert_eq!(cow[0], 99);

    // Second mutation doesn't clone again
    cow.to_mut()[1] = 88;
    assert_eq!(&*cow, &[99, 88, 3, 4, 5]);

    // Original unchanged
    assert_eq!(data, vec![1, 2, 3, 4, 5]);
}

/// Cow in function signatures: accept both owned and borrowed.
/// Despite its name, this returns the item count rather than printing.
fn print_items(items: Cow<'_, [i32]>) -> usize {
    items.len()
}

fn flexible_api() {
    let owned = vec![1, 2, 3];
    let borrowed = &[4, 5, 6][..];
    assert_eq!(print_items(Cow::Owned(owned)), 3);
    assert_eq!(print_items(Cow::Borrowed(borrowed)), 3);
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_cow_borrow() {
        cow_borrow_vs_owned();
    }

    #[test]
    fn test_cow_str() {
        cow_str_demo();
    }

    #[test]
    fn test_to_mut() {
        to_mut_demo();
    }

    #[test]
    fn test_flexible() {
        flexible_api();
    }

    #[test]
    fn test_into_owned() {
        let data = vec![1, 2, 3];
        let cow: Cow<'_, [i32]> = Cow::Borrowed(&data);
        let owned: Vec<i32> = cow.into_owned();
        assert_eq!(owned, vec![1, 2, 3]);
    }
}
Key Differences
- Rust's Cow makes the borrow-or-own decision explicit in the type; OCaml's GC sharing is implicit and applies only to immutable values.
- Cow<'_, [T]> carries a lifetime annotation; OCaml has no equivalent.
- Cow<'_, str> is common for string normalization; OCaml's strings are byte sequences and lack a direct Cow equivalent.
- to_mut(): Rust's Cow::to_mut() clones the data on the first call if the value is still borrowed, and every call returns a mutable reference to the owned data; OCaml has no equivalent lazy promotion.
OCaml Approach
OCaml's GC + structural sharing achieves copy-on-write semantics naturally for immutable structures:
(* No mutation needed — structural sharing is automatic *)
let filter_evens lst = List.filter (fun x -> x mod 2 = 0) lst
(* If input is already filtered, no benefit — always rebuilds *)
For mutable structures, OCaml's Bytes provides Bytes.copy for explicit copying. The Cow pattern is less necessary in OCaml because immutable structures share memory automatically via the GC.
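For contrast, here is a rough Rust counterpart to the OCaml `filter_evens` above. It is a sketch rather than a canonical translation: it assumes a slice of `i32` as input and returns `Cow<'_, [i32]>`, so the already-filtered case that "always rebuilds" in the OCaml version can instead borrow the input untouched.

```rust
use std::borrow::Cow;

/// Keeps only even numbers, borrowing the input when nothing has to be removed.
fn filter_evens(data: &[i32]) -> Cow<'_, [i32]> {
    if data.iter().all(|x| x % 2 == 0) {
        // Already filtered: hand back the caller's slice, no allocation.
        Cow::Borrowed(data)
    } else {
        // At least one odd value: build a new Vec containing only the evens.
        Cow::Owned(data.iter().copied().filter(|x| x % 2 == 0).collect())
    }
}

fn main() {
    let evens = [2, 4, 6];
    assert!(matches!(filter_evens(&evens), Cow::Borrowed(_)));

    let mixed = [1, 2, 3, 4];
    let filtered = filter_evens(&mixed);
    assert!(matches!(filtered, Cow::Owned(_)));
    assert_eq!(&*filtered, &[2, 4]);
}
```

The trade-off is an extra pass over the data to decide which variant to return, which pays off only when the unmodified case is common.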
Deep Comparison
Clone-on-Write Collections — Comparison
Core Insight
Clone-on-write defers allocation until mutation occurs. In Rust, this is explicit via Cow<'a, T>. In OCaml, immutable data structures with structural sharing provide this behavior implicitly — you never clone because you never mutate.
OCaml Approach
- 0 :: list shares the tail: zero-copy extension
- List.map builds a new list and leaves the original untouched
- Mutable data can be copied explicitly with ref + Array.copy, but this is rarely needed
Rust Approach
- Cow<'a, [T]>: enum of Borrowed(&'a [T]) and Owned(Vec<T>)
- Cow<'a, str>: Borrowed(&'a str) or Owned(String)
- to_mut(): clones on first mutation, returns a mutable reference to the owned data
- into_owned(): consumes the Cow, returns the owned value (cloning if it was still borrowed)
- Return Cow from functions to avoid cloning on the happy path
Comparison Table
| Feature | OCaml | Rust (Cow) |
|---|---|---|
| CoW mechanism | Implicit (immutability) | Explicit (Cow<'a, T>) |
| Read cost | Zero (shared) | Zero (Deref) |
| Write cost | New allocation (structural sharing) | Clone on first write |
| API complexity | None (just use values) | Cow::Borrowed/Owned, to_mut() |
| Return from function | Value (shared by GC) | Cow (avoids clone if unmodified) |
| Typical use | Default behavior | Optimization for read-heavy paths |
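The read-cost and write-cost rows for Rust can be checked directly by comparing buffer addresses. The sketch below assumes a hypothetical helper, `clamp_negatives`, that starts from `Cow::Borrowed` and calls `to_mut()` only when it meets a value it has to change; it is one possible single-pass style, not the only way to structure such a function.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces negative values with zero, cloning only if needed.
fn clamp_negatives(values: &[i32]) -> Cow<'_, [i32]> {
    let mut out: Cow<'_, [i32]> = Cow::Borrowed(values);
    for (i, &v) in values.iter().enumerate() {
        if v < 0 {
            // The first call clones the slice into a Vec; later calls reuse that Vec.
            out.to_mut()[i] = 0;
        }
    }
    out
}

fn main() {
    // Read path: the result still points at the caller's buffer.
    let clean = vec![1, 2, 3];
    let result = clamp_negatives(&clean);
    assert!(matches!(result, Cow::Borrowed(_)));
    assert!(std::ptr::eq(result.as_ptr(), clean.as_ptr()));

    // Write path: one clone, so the result lives in a separate buffer.
    let dirty = vec![1, -2, -3];
    let result = clamp_negatives(&dirty);
    assert!(matches!(result, Cow::Owned(_)));
    assert!(!std::ptr::eq(result.as_ptr(), dirty.as_ptr()));

    // into_owned() hands over the Vec without a further copy when already Owned.
    let owned: Vec<i32> = result.into_owned();
    assert_eq!(owned, vec![1, 0, 0]);
    assert_eq!(dirty, vec![1, -2, -3]);
}
```

Compared with scanning the input first and then choosing a variant, this style walks the data once, at the cost of a `to_mut()` call per modified element.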
Exercises
1. Write a sanitize_html(s: &str) -> Cow<'_, str> function that returns Borrowed if no HTML characters are present and Owned only when replacing <, >, &.
2. Implement deduplicate(data: &[i32]) -> Cow<'_, [i32]> that returns Borrowed if the slice has no duplicates and Owned with duplicates removed otherwise.
3. Write a function that takes a Cow<'_, str> as input and avoids cloning when the input is already normalized.