369 Advanced

369: Clone-on-Write (Cow)

Functional Programming


Tutorial

The Problem

String-processing functions such as whitespace normalization, identifier sanitization, and trimming often receive input that is already valid and needs no modification. Returning String always allocates, even when the input comes back unchanged. Returning &str avoids the allocation but cannot express the modified case, because a function cannot return a reference to a String it created locally. Rust's Cow<'a, str> (clone-on-write) resolves the dilemma: it holds either a borrowed reference (Cow::Borrowed(&str)) or an owned string (Cow::Owned(String)), and it dereferences to &str in both cases. Allocation happens only when the data actually needs modification. This pattern appears in serde deserialization, HTTP header parsing, and any API that wants to avoid unnecessary copying.
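To make the "either borrowed or owned" idea concrete, here is a minimal sketch. The helper names `char_count`, `both_variants`, and `total_chars` are hypothetical and exist only for this illustration; the point is that code written against &str works with either variant.

```rust
use std::borrow::Cow;

/// Hypothetical helper that only knows about `&str`.
fn char_count(s: &str) -> usize {
    s.chars().count()
}

/// Hypothetical helper: builds one value of each variant from the same input.
fn both_variants(input: &str) -> (Cow<'_, str>, Cow<'_, str>) {
    let borrowed = Cow::Borrowed(input); // points at the caller's buffer
    let owned = Cow::Owned(input.to_uppercase()); // holds its own String
    (borrowed, owned)
}

/// Callers treat both variants the same way.
fn total_chars(input: &str) -> usize {
    let (b, o) = both_variants(input);
    // `&Cow<str>` coerces to `&str`, and `str` methods are available directly.
    char_count(&b) + o.chars().count()
}
```

Neither `char_count` nor the call to `chars()` needs to know which variant it received; that is what dereferencing to &str buys the caller.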

🎯 Learning Outcomes

  • Use Cow<'a, str> to return borrowed data when no modification is needed
  • Return Cow::Borrowed(s) when the input is already valid (zero allocation)
  • Return Cow::Owned(s.to_string()) or Cow::Owned(s.replace(...)) when modification is needed
  • Use Cow as a function parameter to accept both &str and String ergonomically
  • Understand that Cow<'a, B> works for any B: ToOwned (slices, paths, etc.)
  • Recognize the performance benefit: no allocation or copy in the common no-modification case (the validity check itself still scans the input)
    Code Example

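    use std::borrow::Cow;

    fn ensure_no_spaces(s: &str) -> Cow<'_, str> {
        if s.contains(' ') {
            // Replacement changes the text, so a new String is required.
            Cow::Owned(s.replace(' ', "_"))
        } else {
            // Already valid: hand back the caller's slice without copying.
            Cow::Borrowed(s)
        }
    }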
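Two of the learning outcomes, Cow as a parameter and Cow over non-string types, can be sketched as follows. The `Label` type and the `without_zeros` function are hypothetical names invented for this illustration; only `Cow`, `Into`, and the standard `From` conversions come from the standard library.

```rust
use std::borrow::Cow;

/// Hypothetical type used only for illustration.
struct Label<'a> {
    text: Cow<'a, str>,
}

impl<'a> Label<'a> {
    /// Accepts `&'a str`, `String`, or an existing `Cow<'a, str>`.
    fn new(text: impl Into<Cow<'a, str>>) -> Self {
        Label { text: text.into() }
    }

    fn is_borrowed(&self) -> bool {
        matches!(self.text, Cow::Borrowed(_))
    }
}

/// `Cow` is not string-specific: here `B` is `[i32]` and the owned form is `Vec<i32>`.
fn without_zeros(values: &[i32]) -> Cow<'_, [i32]> {
    if values.contains(&0) {
        // At least one element must go, so build a new Vec.
        Cow::Owned(values.iter().copied().filter(|&v| v != 0).collect::<Vec<i32>>())
    } else {
        // Nothing to remove: borrow the caller's slice.
        Cow::Borrowed(values)
    }
}
```

With this shape, `Label::new("literal")` stores a borrow while `Label::new(some_string)` takes ownership of the existing allocation; neither call copies the text.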

    Key Differences

    | Aspect | Rust `Cow<'a, str>` | OCaml `string` |
    |---|---|---|
    | Zero-copy path | `Cow::Borrowed(s)` | Direct return (GC-tracked) |
    | Allocation path | `Cow::Owned(...)` | New string allocation |
    | Lifetime tracking | `'a` lifetime parameter | GC |
    | Mutation | `cow.to_mut()` clones only if currently borrowed | N/A (strings immutable) |
    | Transparency | `Deref<Target = str>` | Direct string value |
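The "Mutation" row can be illustrated on a slice rather than a string. This sketch closely follows the `abs_all` example from the standard library documentation for `Cow`; treat it as an illustration of when the clone happens, not as part of this example's source.

```rust
use std::borrow::Cow;

/// Makes every element non-negative, cloning the slice only on the first write.
fn abs_all(input: &mut Cow<'_, [i32]>) {
    for i in 0..input.len() {
        let v = input[i];
        if v < 0 {
            // `to_mut` clones into a Vec the first time it is called on a
            // Borrowed value; later calls reuse that Vec.
            input.to_mut()[i] = -v;
        }
    }
}
```

If no element is negative, `to_mut` is never called and the Cow stays `Borrowed`; the caller's original slice is never modified in either case.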

    OCaml Approach

    OCaml sidesteps this problem: strings are immutable and garbage-collected, so the borrowed-versus-owned distinction never arises.

    let ensure_no_spaces s =
      if String.contains s ' '
      then String.map (fun c -> if c = ' ' then '_' else c) s
      else s  (* return original — GC manages the reference *)
    
    let truncate_to_limit s limit =
      if String.length s <= limit then s
      else String.sub s 0 limit
    

    In OCaml, s is returned directly without a wrapper type — the GC knows both the caller and callee share the same string object. There's no distinction between "borrowed" and "owned" at the type level; the GC handles lifetime tracking automatically.

    Full Source

    #![allow(clippy::all)]
    //! Clone-on-Write (Cow) Pattern
    //!
    //! Avoid allocation when data doesn't need to be modified.
    
    use std::borrow::Cow;
    
    // === Approach 1: String processing ===
    
    /// Replace spaces with underscores, only allocating if needed
    pub fn ensure_no_spaces(s: &str) -> Cow<str> {
        if s.contains(' ') {
            Cow::Owned(s.replace(' ', "_"))
        } else {
            Cow::Borrowed(s)
        }
    }
    
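    /// Truncate string to at most `limit` bytes, only allocating if needed
    pub fn truncate_to_limit(s: &str, limit: usize) -> Cow<str> {
        if s.len() <= limit {
            Cow::Borrowed(s)
        } else {
            // Back up to a char boundary so slicing cannot panic on multi-byte UTF-8.
            let mut end = limit;
            while !s.is_char_boundary(end) {
                end -= 1;
            }
            Cow::Owned(s[..end].to_string())
        }
    }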
    
    /// Normalize whitespace (collapse multiple spaces, trim)
    pub fn normalize_whitespace(input: &str) -> Cow<str> {
        // Check if normalization is needed
        let needs_normalization =
            input.contains("  ") || input.starts_with(' ') || input.ends_with(' ');
    
        if !needs_normalization {
            Cow::Borrowed(input)
        } else {
            let mut result = String::with_capacity(input.len());
            let mut prev_space = true; // start true to trim leading
            for c in input.chars() {
                if c == ' ' {
                    if !prev_space {
                        result.push(c);
                    }
                    prev_space = true;
                } else {
                    result.push(c);
                    prev_space = false;
                }
            }
            // Trim trailing
            while result.ends_with(' ') {
                result.pop();
            }
            Cow::Owned(result)
        }
    }
    
    // === Approach 2: Converting to uppercase conditionally ===
    
    /// Convert to uppercase only if needed
    pub fn to_uppercase_if_needed(s: &str) -> Cow<str> {
        if s.chars().all(|c| !c.is_lowercase()) {
            Cow::Borrowed(s)
        } else {
            Cow::Owned(s.to_uppercase())
        }
    }
    
    /// Convert to lowercase only if needed
    pub fn to_lowercase_if_needed(s: &str) -> Cow<str> {
        if s.chars().all(|c| !c.is_uppercase()) {
            Cow::Borrowed(s)
        } else {
            Cow::Owned(s.to_lowercase())
        }
    }
    
    // === Approach 3: Escape special characters ===
    
    /// Escape HTML special characters, only allocating if needed
    pub fn escape_html(s: &str) -> Cow<str> {
        if !s.contains(['&', '<', '>', '"', '\'']) {
            Cow::Borrowed(s)
        } else {
            let mut result = String::with_capacity(s.len() + 10);
            for c in s.chars() {
                match c {
                    '&' => result.push_str("&amp;"),
                    '<' => result.push_str("&lt;"),
                    '>' => result.push_str("&gt;"),
                    '"' => result.push_str("&quot;"),
                    '\'' => result.push_str("&#39;"),
                    _ => result.push(c),
                }
            }
            Cow::Owned(result)
        }
    }
    
    /// URL-encode a string, only allocating if needed
    pub fn url_encode(s: &str) -> Cow<str> {
        let needs_encoding = s
            .chars()
            .any(|c| !matches!(c, 'a'..='z' | 'A'..='Z' | '0'..='9' | '-' | '_' | '.' | '~'));
    
        if !needs_encoding {
            Cow::Borrowed(s)
        } else {
            let mut result = String::with_capacity(s.len() * 3);
            for c in s.chars() {
                if matches!(c, 'a'..='z' | 'A'..='Z' | '0'..='9' | '-' | '_' | '.' | '~') {
                    result.push(c);
                } else {
                    for byte in c.to_string().bytes() {
                        result.push_str(&format!("%{:02X}", byte));
                    }
                }
            }
            Cow::Owned(result)
        }
    }
    
    /// Check if Cow is borrowed (no allocation occurred)
    pub fn is_borrowed<T: ?Sized + ToOwned>(cow: &Cow<T>) -> bool {
        matches!(cow, Cow::Borrowed(_))
    }
    
    /// Check if Cow is owned (allocation occurred)
    pub fn is_owned<T: ?Sized + ToOwned>(cow: &Cow<T>) -> bool {
        matches!(cow, Cow::Owned(_))
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_no_spaces_borrowed() {
            let result = ensure_no_spaces("hello");
            assert!(matches!(result, Cow::Borrowed(_)));
            assert_eq!(result, "hello");
        }
    
        #[test]
        fn test_has_spaces_owned() {
            let result = ensure_no_spaces("hello world");
            assert!(matches!(result, Cow::Owned(_)));
            assert_eq!(result, "hello_world");
        }
    
        #[test]
        fn test_truncate_no_change() {
            let result = truncate_to_limit("hello", 10);
            assert!(is_borrowed(&result));
            assert_eq!(result, "hello");
        }
    
        #[test]
        fn test_truncate_needed() {
            let result = truncate_to_limit("hello world", 5);
            assert!(is_owned(&result));
            assert_eq!(result, "hello");
        }
    
        #[test]
        fn test_normalize_whitespace_no_change() {
            let result = normalize_whitespace("hello world");
            assert!(is_borrowed(&result));
        }
    
        #[test]
        fn test_normalize_whitespace_needed() {
            let result = normalize_whitespace("  hello   world  ");
            assert!(is_owned(&result));
            assert_eq!(result, "hello world");
        }
    
        #[test]
        fn test_uppercase_no_change() {
            let result = to_uppercase_if_needed("HELLO");
            assert!(is_borrowed(&result));
        }
    
        #[test]
        fn test_uppercase_needed() {
            let result = to_uppercase_if_needed("Hello");
            assert!(is_owned(&result));
            assert_eq!(result, "HELLO");
        }
    
        #[test]
        fn test_escape_html_no_change() {
            let result = escape_html("hello world");
            assert!(is_borrowed(&result));
        }
    
        #[test]
        fn test_escape_html_needed() {
            let result = escape_html("<script>");
            assert!(is_owned(&result));
            assert_eq!(result, "&lt;script&gt;");
        }
    
        #[test]
        fn test_url_encode_no_change() {
            let result = url_encode("hello-world_123");
            assert!(is_borrowed(&result));
        }
    
        #[test]
        fn test_url_encode_needed() {
            let result = url_encode("hello world");
            assert!(is_owned(&result));
            assert_eq!(result, "hello%20world");
        }
    }
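The functions in the full source each take a &str and return a Cow, so chaining two of them naively (calling the second on the first result and then `into_owned`) can lose the borrow. One possible way to compose them is sketched below. The combinator `chain_step` and the pipeline `sanitize` are hypothetical names, not part of the original source, and the two step functions are repeated only so the sketch compiles on its own.

```rust
use std::borrow::Cow;

// Copies of two functions from the full source, repeated for self-containment.
fn ensure_no_spaces(s: &str) -> Cow<'_, str> {
    if s.contains(' ') {
        Cow::Owned(s.replace(' ', "_"))
    } else {
        Cow::Borrowed(s)
    }
}

fn to_lowercase_if_needed(s: &str) -> Cow<'_, str> {
    if s.chars().all(|c| !c.is_uppercase()) {
        Cow::Borrowed(s)
    } else {
        Cow::Owned(s.to_lowercase())
    }
}

/// Hypothetical combinator: runs `step` on the current text and keeps the
/// existing borrow or allocation when `step` reports no change.
fn chain_step<'a>(
    current: Cow<'a, str>,
    step: impl Fn(&str) -> Cow<'_, str>,
) -> Cow<'a, str> {
    match current {
        // Still borrowing the original input: the step may borrow it too.
        Cow::Borrowed(s) => step(s),
        Cow::Owned(s) => {
            let replacement = match step(s.as_str()) {
                Cow::Borrowed(b) if b == s => None, // unchanged: reuse our String
                Cow::Borrowed(b) => Some(b.to_owned()), // step borrowed a different slice
                Cow::Owned(new) => Some(new),
            };
            Cow::Owned(replacement.unwrap_or(s))
        }
    }
}

/// Hypothetical pipeline built from the two steps above.
fn sanitize(input: &str) -> Cow<'_, str> {
    chain_step(ensure_no_spaces(input), to_lowercase_if_needed)
}
```

Under these assumptions, an input that passes both checks is returned as `Cow::Borrowed` all the way through, and an input changed by only one step allocates once.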

    Deep Comparison

    OCaml vs Rust: Clone-on-Write

    Side-by-Side Comparison

    Conditional Processing

    OCaml:

    let maybe_uppercase s threshold =
      if String.length s > threshold then String.uppercase_ascii s
      else s  (* no copy needed - string is immutable *)
    

    Rust:

    fn ensure_no_spaces(s: &str) -> Cow<str> {
        if s.contains(' ') {
            Cow::Owned(s.replace(' ', "_"))
        } else {
            Cow::Borrowed(s)
        }
    }
    

    Key Differences

    | Aspect | OCaml | Rust |
    |---|---|---|
    | Strings | Immutable by default | Owned `String` or borrowed `&str` |
    | Copy-on-write | Implicit (GC handles) | Explicit `Cow<T>` |
    | Return type | Same type | `Cow` enum |
    | Allocation | Hidden by GC | Explicit `Borrowed`/`Owned` |

    OCaml's Advantage

    In OCaml, strings are immutable, so returning the same string or a new one has the same type. The GC handles memory - no explicit Cow needed.

    Rust's Cow

    Rust's ownership model requires distinguishing between:

  • &str - borrowed string slice
  • String - owned string

    Cow<str> bridges these, allowing a function to return either depending on whether modification occurred.
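From the caller's side, the bridge works in both directions: read the result as a &str, or convert it to a String when ownership is required. A minimal sketch, where `store_identifier` and `identifier_len` are hypothetical callers and `ensure_no_spaces` is repeated from the example so the block compiles alone:

```rust
use std::borrow::Cow;

fn ensure_no_spaces(s: &str) -> Cow<'_, str> {
    if s.contains(' ') {
        Cow::Owned(s.replace(' ', "_"))
    } else {
        Cow::Borrowed(s)
    }
}

/// Hypothetical caller that must store an owned String.
fn store_identifier(raw: &str) -> String {
    // `into_owned` moves the String out of `Cow::Owned`, or copies a
    // `Cow::Borrowed` exactly once.
    ensure_no_spaces(raw).into_owned()
}

/// Hypothetical caller that only reads the result, so it never needs a String.
fn identifier_len(raw: &str) -> usize {
    ensure_no_spaces(raw).len()
}
```

The function returning the Cow does not have to guess which kind of caller it has; each caller pays only for what it actually needs.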

    Performance

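    | Scenario | OCaml | Rust |
    |---|---|---|
    | No change needed | Return same ref | `Cow::Borrowed` - zero alloc |
    | Change needed | Allocate new string | `Cow::Owned` - allocate |
    | Memory overhead | GC tracking | Enum wrapper about the size of a `String`; exact layout is compiler-dependent |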
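The "zero alloc" row can be checked directly by comparing data pointers. This is a rough sketch: `shares_buffer` is a hypothetical helper written for this illustration, and `ensure_no_spaces` is repeated from the example so the block is self-contained.

```rust
use std::borrow::Cow;

fn ensure_no_spaces(s: &str) -> Cow<'_, str> {
    if s.contains(' ') {
        Cow::Owned(s.replace(' ', "_"))
    } else {
        Cow::Borrowed(s)
    }
}

/// Hypothetical check: true when `result` starts at the same address as
/// `input`, i.e. no new buffer was created for it.
fn shares_buffer(input: &str, result: &Cow<'_, str>) -> bool {
    // `as_ptr` comes from `str`; the Cow reaches it through `Deref`.
    std::ptr::eq(input.as_ptr(), result.as_ptr())
}
```

For an input without spaces the result shares the caller's buffer; for an input with spaces the result lives in a freshly allocated String, so the pointers differ.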

    Use Cases

  • String sanitization
  • Path normalization
  • Config value processing
  • Any "transform if needed" pattern

    Exercises

  • HTML escape: Implement html_escape<'a>(s: &'a str) -> Cow<'a, str> that replaces <, >, & with their HTML entities — borrow the original if none are present, allocate a new string only when needed.
  • Path normalization: Use Cow<'_, Path> to normalize a file path: if it's already absolute and clean, borrow it; if it needs canonicalize(), return Cow::Owned.
  • to_mut demo: Create a Cow::Borrowed from a string, then call .to_mut() to get a mutable reference; verify that the original slice is unchanged and the Cow now holds an owned copy.