369: Clone-on-Write (Cow)
Difficulty level: Advanced. Key concepts covered: Functional Programming.
Tutorial
The Problem
String processing functions often receive data that usually needs no modification: normalizing whitespace, sanitizing identifiers, trimming. Returning String always allocates, even when the input is already valid. Returning &str avoids the allocation but cannot express the case where the function has to build a new string, because the reference needs a buffer that outlives the call. Rust's Cow<'a, str> (Clone-on-Write) resolves the dilemma: it holds either a borrowed reference (Cow::Borrowed(&str)) or an owned string (Cow::Owned(String)) and, through Deref, exposes &str in both cases. Allocation happens only when the data actually needs modification. This pattern appears in serde deserialization, HTTP header parsing, and any API that wants to avoid unnecessary copying.
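A minimal sketch of the "exposes &str in both cases" point. The helpers `tabs_to_spaces`, `word_count`, and `count_words_normalized` are hypothetical names made up for this illustration and are not part of the tutorial's source.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces tabs with single spaces,
/// allocating only when the input actually contains a tab.
fn tabs_to_spaces(s: &str) -> Cow<'_, str> {
    if s.contains('\t') {
        Cow::Owned(s.replace('\t', " "))
    } else {
        Cow::Borrowed(s)
    }
}

/// Takes a plain &str and knows nothing about Cow.
fn word_count(s: &str) -> usize {
    s.split_whitespace().count()
}

/// Hypothetical caller: the same code path handles both variants.
fn count_words_normalized(s: &str) -> usize {
    let cleaned = tabs_to_spaces(s);
    // &Cow<str> coerces to &str whether it is Borrowed or Owned.
    word_count(&cleaned)
}
```

The caller never matches on the variant; it only pays for an allocation on inputs that contain a tab.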
🎯 Learning Outcomes
- Use Cow<'a, str> to return borrowed data when no modification is needed
- Return Cow::Borrowed(s) when the input is already valid (zero allocation)
- Return Cow::Owned(s.to_string()) or Cow::Owned(s.replace(...)) when modification is needed
- Accept Cow as a function parameter to take both &str and String ergonomically
- Understand that Cow<'a, B> works for any B: ToOwned (slices, paths, etc.)
Code Example
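use std::borrow::Cow;

fn ensure_no_spaces(s: &str) -> Cow<str> {
    if s.contains(' ') {
        // Modification needed: build a new String.
        Cow::Owned(s.replace(' ', "_"))
    } else {
        // Already valid: hand back the caller's slice, no allocation.
        Cow::Borrowed(s)
    }
}
Key Differences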
| Aspect | Rust Cow<'a, str> | OCaml string |
|---|---|---|
| Zero-copy path | Cow::Borrowed(s) | Direct return (GC-tracked) |
| Allocation path | Cow::Owned(...) | New string allocation |
| Lifetime tracking | 'a lifetime parameter | GC |
| Mutation | cow.to_mut() clones only if the data is currently borrowed | N/A (strings immutable) |
| Transparency | Deref<Target = str> | Direct string value |
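The Mutation row can be sketched as follows. `append_suffix` is a hypothetical helper invented for this illustration; `to_mut()` and `into_owned()` are the standard `Cow` methods.

```rust
use std::borrow::Cow;

/// Hypothetical helper: appends a suffix in place via `to_mut()`.
fn append_suffix<'a>(mut name: Cow<'a, str>, suffix: &str) -> Cow<'a, str> {
    if !suffix.is_empty() {
        // On a Borrowed value, to_mut() first clones the data into a String.
        // On an Owned value, it simply returns &mut String.
        name.to_mut().push_str(suffix);
    }
    name
}
```

Passing an empty suffix leaves a `Cow::Borrowed` input borrowed; any real append turns it into `Cow::Owned`, and the original slice is never touched. Calling `into_owned()` on the result yields a plain `String` when the caller needs one.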
OCaml Approach
OCaml's immutable strings sidestep this problem: you can't mutate a string, so borrowed vs owned is irrelevant:
let ensure_no_spaces s =
  if String.contains s ' '
  then String.map (fun c -> if c = ' ' then '_' else c) s
  else s (* return original; GC manages the reference *)

let truncate_to_limit s limit =
  if String.length s <= limit then s
  else String.sub s 0 limit
In OCaml, s is returned directly without a wrapper type — the GC knows both the caller and callee share the same string object. There's no distinction between "borrowed" and "owned" at the type level; the GC handles lifetime tracking automatically.
Full Source
#![allow(clippy::all)]
//! Clone-on-Write (Cow) Pattern
//!
//! Avoid allocation when data doesn't need to be modified.
use std::borrow::Cow;
// === Approach 1: String processing ===

/// Replace spaces with underscores, only allocating if needed
pub fn ensure_no_spaces(s: &str) -> Cow<str> {
    if s.contains(' ') {
        Cow::Owned(s.replace(' ', "_"))
    } else {
        Cow::Borrowed(s)
    }
}
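/// Truncate string to at most `limit` bytes, only allocating if needed
pub fn truncate_to_limit(s: &str, limit: usize) -> Cow<str> {
    if s.len() <= limit {
        Cow::Borrowed(s)
    } else {
        // Back up to the nearest char boundary so that slicing never
        // panics on multi-byte UTF-8 input.
        let mut end = limit;
        while !s.is_char_boundary(end) {
            end -= 1;
        }
        // Note: a prefix could also be returned as Cow::Borrowed(&s[..end]).
        Cow::Owned(s[..end].to_string())
    }
}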
/// Normalize whitespace (collapse multiple spaces, trim)
pub fn normalize_whitespace(input: &str) -> Cow<str> {
    // Normalization is needed only for runs of two or more spaces,
    // or for leading/trailing spaces.
    let needs_normalization =
        input.contains("  ") || input.starts_with(' ') || input.ends_with(' ');
    if !needs_normalization {
        Cow::Borrowed(input)
    } else {
        let mut result = String::with_capacity(input.len());
        let mut prev_space = true; // start true to trim leading
        for c in input.chars() {
            if c == ' ' {
                if !prev_space {
                    result.push(c);
                }
                prev_space = true;
            } else {
                result.push(c);
                prev_space = false;
            }
        }
        // Trim trailing
        while result.ends_with(' ') {
            result.pop();
        }
        Cow::Owned(result)
    }
}
// === Approach 2: Converting to uppercase conditionally ===

/// Convert to uppercase only if needed
pub fn to_uppercase_if_needed(s: &str) -> Cow<str> {
    if s.chars().all(|c| !c.is_lowercase()) {
        Cow::Borrowed(s)
    } else {
        Cow::Owned(s.to_uppercase())
    }
}

/// Convert to lowercase only if needed
pub fn to_lowercase_if_needed(s: &str) -> Cow<str> {
    if s.chars().all(|c| !c.is_uppercase()) {
        Cow::Borrowed(s)
    } else {
        Cow::Owned(s.to_lowercase())
    }
}
// === Approach 3: Escape special characters ===

/// Escape HTML special characters, only allocating if needed
pub fn escape_html(s: &str) -> Cow<str> {
    if !s.contains(['&', '<', '>', '"', '\'']) {
        Cow::Borrowed(s)
    } else {
        let mut result = String::with_capacity(s.len() + 10);
        for c in s.chars() {
            match c {
                '&' => result.push_str("&amp;"),
                '<' => result.push_str("&lt;"),
                '>' => result.push_str("&gt;"),
                '"' => result.push_str("&quot;"),
                '\'' => result.push_str("&#39;"),
                _ => result.push(c),
            }
        }
        Cow::Owned(result)
    }
}
/// URL-encode a string, only allocating if needed
pub fn url_encode(s: &str) -> Cow<str> {
    let needs_encoding = s
        .chars()
        .any(|c| !matches!(c, 'a'..='z' | 'A'..='Z' | '0'..='9' | '-' | '_' | '.' | '~'));
    if !needs_encoding {
        Cow::Borrowed(s)
    } else {
        let mut result = String::with_capacity(s.len() * 3);
        for c in s.chars() {
            if matches!(c, 'a'..='z' | 'A'..='Z' | '0'..='9' | '-' | '_' | '.' | '~') {
                result.push(c);
            } else {
                // Percent-encode every UTF-8 byte of the character.
                for byte in c.to_string().bytes() {
                    result.push_str(&format!("%{:02X}", byte));
                }
            }
        }
        Cow::Owned(result)
    }
}
/// Check if Cow is borrowed (no allocation occurred)
pub fn is_borrowed<T: ?Sized + ToOwned>(cow: &Cow<T>) -> bool {
    matches!(cow, Cow::Borrowed(_))
}

/// Check if Cow is owned (allocation occurred)
pub fn is_owned<T: ?Sized + ToOwned>(cow: &Cow<T>) -> bool {
    matches!(cow, Cow::Owned(_))
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn test_no_spaces_borrowed() {
        let result = ensure_no_spaces("hello");
        assert!(matches!(result, Cow::Borrowed(_)));
        assert_eq!(result, "hello");
    }
    #[test]
    fn test_has_spaces_owned() {
        let result = ensure_no_spaces("hello world");
        assert!(matches!(result, Cow::Owned(_)));
        assert_eq!(result, "hello_world");
    }
    #[test]
    fn test_truncate_no_change() {
        let result = truncate_to_limit("hello", 10);
        assert!(is_borrowed(&result));
        assert_eq!(result, "hello");
    }
    #[test]
    fn test_truncate_needed() {
        let result = truncate_to_limit("hello world", 5);
        assert!(is_owned(&result));
        assert_eq!(result, "hello");
    }
    #[test]
    fn test_normalize_whitespace_no_change() {
        let result = normalize_whitespace("hello world");
        assert!(is_borrowed(&result));
    }
    #[test]
    fn test_normalize_whitespace_needed() {
        let result = normalize_whitespace(" hello world ");
        assert!(is_owned(&result));
        assert_eq!(result, "hello world");
    }
    #[test]
    fn test_uppercase_no_change() {
        let result = to_uppercase_if_needed("HELLO");
        assert!(is_borrowed(&result));
    }
    #[test]
    fn test_uppercase_needed() {
        let result = to_uppercase_if_needed("Hello");
        assert!(is_owned(&result));
        assert_eq!(result, "HELLO");
    }
    #[test]
    fn test_escape_html_no_change() {
        let result = escape_html("hello world");
        assert!(is_borrowed(&result));
    }
    #[test]
    fn test_escape_html_needed() {
        let result = escape_html("<script>");
        assert!(is_owned(&result));
        assert_eq!(result, "&lt;script&gt;");
    }
    #[test]
    fn test_url_encode_no_change() {
        let result = url_encode("hello-world_123");
        assert!(is_borrowed(&result));
    }
    #[test]
    fn test_url_encode_needed() {
        let result = url_encode("hello world");
        assert!(is_owned(&result));
        assert_eq!(result, "hello%20world");
    }
}
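The same pattern is not limited to strings: Cow<'a, B> works for any B: ToOwned. A minimal sketch with a slice follows; `clamp_negatives` is a hypothetical helper made up for this illustration, not part of the source above.

```rust
use std::borrow::Cow;

/// Hypothetical helper: clamps negative values to zero, copying the
/// slice into a Vec only if at least one element has to change.
fn clamp_negatives(values: &[i32]) -> Cow<'_, [i32]> {
    if values.iter().any(|&v| v < 0) {
        // The owned form of [i32] is Vec<i32>.
        Cow::Owned(values.iter().map(|&v| v.max(0)).collect())
    } else {
        Cow::Borrowed(values)
    }
}
```

For a slice the owned counterpart is a Vec; for Path it would be PathBuf, which is the shape the path exercise relies on.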
Deep Comparison
OCaml vs Rust: Clone-on-Write
Side-by-Side Comparison
Conditional Processing
OCaml:
let maybe_uppercase s threshold =
  if String.length s > threshold then String.uppercase_ascii s
  else s (* no copy needed - string is immutable *)
Rust:
fn ensure_no_spaces(s: &str) -> Cow<str> {
    if s.contains(' ') {
        Cow::Owned(s.replace(' ', "_"))
    } else {
        Cow::Borrowed(s)
    }
}
Key Differences
| Aspect | OCaml | Rust |
|---|---|---|
| Strings | Immutable by default | Owned String or borrowed &str |
| Copy-on-write | Implicit (GC handles) | Explicit Cow<T> |
| Return type | Same type | Cow enum |
| Allocation | Hidden by GC | Explicit Borrowed/Owned |
OCaml's Advantage
In OCaml, strings are immutable, so returning the same string or a new one has the same type. The GC handles memory - no explicit Cow needed.
Rust's Cow
Rust's ownership model requires distinguishing between:
- &str: borrowed string slice
- String: owned string
Cow<str> bridges these, allowing a function to return either depending on whether modification occurred.
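The bridging also works on the input side. The sketch below assumes a hypothetical `Label` type, invented for this illustration, whose constructor takes `impl Into<Cow<'a, str>>`; the standard library provides the `From<&str>` and `From<String>` conversions it relies on.

```rust
use std::borrow::Cow;

/// Hypothetical type: stores a label that may be borrowed or owned.
struct Label<'a> {
    text: Cow<'a, str>,
}

impl<'a> Label<'a> {
    /// Accepts &str, String, or Cow<str> through Into<Cow<'a, str>>.
    fn new(text: impl Into<Cow<'a, str>>) -> Self {
        // &str becomes Cow::Borrowed, String becomes Cow::Owned;
        // neither conversion copies the string data.
        Label { text: text.into() }
    }
}
```

A caller holding a string literal pays nothing, and a caller holding a String hands over its buffer instead of cloning it.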
Performance
| Scenario | OCaml | Rust |
|---|---|---|
| No change needed | Return same ref | Cow::Borrowed - zero alloc |
| Change needed | Allocate new string | Cow::Owned - allocate |
| Memory overhead | GC tracking | Enum tag (exact size is layout-dependent; the compiler may fold it into a niche) |
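One way to observe the "zero alloc" row is to compare data pointers. The sketch below repeats ensure_no_spaces so that it is self-contained; `shares_buffer` is a hypothetical helper made up for this check.

```rust
use std::borrow::Cow;

/// Same shape as the tutorial's ensure_no_spaces.
fn ensure_no_spaces(s: &str) -> Cow<'_, str> {
    if s.contains(' ') {
        Cow::Owned(s.replace(' ', "_"))
    } else {
        Cow::Borrowed(s)
    }
}

/// Hypothetical helper: true when `out` points at the very same bytes
/// as `input`, meaning no new buffer was created for the result.
fn shares_buffer(input: &str, out: &Cow<'_, str>) -> bool {
    std::ptr::eq(input.as_ptr(), out.as_ptr()) && input.len() == out.len()
}
```

With an input that has no spaces, the result shares the caller's buffer; with an input that contains a space, the result lives in a freshly allocated String at a different address.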
Use Cases
Exercises
1. Write html_escape<'a>(s: &'a str) -> Cow<'a, str> that replaces <, >, & with their HTML entities: borrow the original if none are present, allocate a new string only when needed.
2. Use Cow<'_, Path> to normalize a file path: if it's already absolute and clean, borrow it; if it needs canonicalize(), return Cow::Owned.
3. Create a Cow::Borrowed from a string, then call .to_mut() to get a mutable reference; verify that the original slice is unchanged and the Cow now holds an owned copy.