113 Intermediate

113-string-str — String vs &str

Functional Programming

Tutorial

The Problem

Rust has two primary string types: String (owned, heap-allocated, growable) and &str (a borrowed string slice: a pointer plus a length into any UTF-8 data). The distinction resembles std::string vs std::string_view in C++ (or, more loosely, const char*), but with guaranteed UTF-8 validity and compiler-checked lifetimes. Choosing the right type for function parameters and return types affects performance, API ergonomics, and ownership semantics.
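
To make the "pointer + length" description concrete, the sketch below measures both handles in machine words. The helper names are hypothetical, and the three-word size of String reflects current standard library implementations rather than a documented layout guarantee.

```rust
/// Size of a `&str` in machine words: pointer + length.
pub fn str_ref_words() -> usize {
    std::mem::size_of::<&str>() / std::mem::size_of::<usize>()
}

/// Size of a `String` handle in machine words: pointer + capacity + length.
/// (Assumption: matches today's std, where String wraps a Vec<u8>.)
pub fn string_words() -> usize {
    std::mem::size_of::<String>() / std::mem::size_of::<usize>()
}
```

The extra word in String is the capacity, which is what lets it grow in place; a &str has no capacity because it never owns or resizes the bytes it points at.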

The rule of thumb: accept &str in function parameters (works with both String and &str), return String when ownership is needed, and return &str only when borrowing from an input.
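
Seen from the caller's side, the rule of thumb might look like the following sketch. The functions shout, head, and demo are hypothetical names introduced only for illustration.

```rust
/// Reads its input only, so it borrows; the result is new data, so it is owned.
pub fn shout(s: &str) -> String {
    let mut out = s.to_uppercase();
    out.push('!');
    out
}

/// Returns a view into the input: the part before the first space.
pub fn head(s: &str) -> &str {
    s.split(' ').next().unwrap_or(s)
}

/// Calls `shout` with three different kinds of argument.
pub fn demo() -> (String, String, String) {
    let owned = String::from("hello world");
    let from_literal = shout("hi");       // &'static str literal
    let from_string = shout(&owned);      // &String coerces to &str
    let from_slice = shout(head(&owned)); // subslice borrowed from `owned`
    (from_literal, from_string, from_slice)
}
```

Had shout taken a String instead, the second and third calls would each need an extra allocation (for example via to_string) just to satisfy the signature.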

🎯 Learning Outcomes

  • Understand String as owned, heap-allocated, growable UTF-8
  • Understand &str as a borrowed slice with a length (no ownership)
  • Write functions that accept &str and work with both String and string literals
  • Build Strings by pushing, appending, and formatting
  • Understand Deref coercion: &String automatically coerces to &str

    Code Example

    // &str in function parameters: callers can pass literals or &String
    pub fn first_word(s: &str) -> &str {
        s.split(',').next().unwrap_or(s).trim()
    }
    
    pub fn greet(name: &str) -> String {
        let mut g = String::from("Hello, ");
        g.push_str(name);
        g.push('!');
        g
    }
    
    pub fn char_count(s: &str) -> usize {
        s.chars().count()   // Unicode scalar values, not bytes
    }

    Key Differences

  • Ownership: Rust's String is owned (dropped when out of scope); OCaml strings are GC-managed — no explicit ownership.
  • Zero-copy return: Rust can return &str pointing into the original data (zero allocation); OCaml's String.sub allocates.
  • Growability: Rust's String can grow with push_str; OCaml strings are immutable (use Buffer for mutable building).
  • Deref coercion: &String coerces to &str automatically; OCaml has no equivalent because there is only one type.
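
The zero-copy claim can be checked by comparing addresses. The sketch below repeats the tutorial's first_word so it is self-contained; is_view_into is a hypothetical helper that only inspects pointer ranges.

```rust
/// Same shape as the tutorial's `first_word`: returns a view into `s`.
pub fn first_word(s: &str) -> &str {
    s.split(',').next().unwrap_or(s).trim()
}

/// True if the bytes of `part` lie entirely inside the buffer of `whole`.
pub fn is_view_into(whole: &str, part: &str) -> bool {
    let start = whole.as_ptr() as usize;
    let end = start + whole.len();
    let p = part.as_ptr() as usize;
    p >= start && p + part.len() <= end
}
```

For an input such as "  hello, world", the slice returned by first_word passes this check, while a copy made with to_string does not, because the copy lives in its own heap allocation.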

    OCaml Approach

    OCaml has one string type: string (immutable byte sequence since OCaml 4.06). There is no distinction between owned and borrowed strings — the GC handles all lifetime management:

    let first_word s = String.split_on_char ',' s |> List.hd |> String.trim
    let char_count s = String.length s  (* byte count, not Unicode *)
    let append base suffix = base ^ suffix  (* allocates new string *)
    

    OCaml's ^ operator always allocates a new string. Rust's push_str avoids allocation when the String has sufficient capacity.
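
A minimal sketch of the capacity point, assuming a hypothetical helper build_reserved: it reserves the total length up front, appends each part, and reports whether the buffer's capacity or address changed along the way.

```rust
/// Appends `parts` to a buffer reserved up front and reports whether the
/// buffer was reallocated while appending.
pub fn build_reserved(parts: &[&str]) -> (String, bool) {
    let total: usize = parts.iter().map(|p| p.len()).sum();
    let mut buf = String::with_capacity(total);
    let cap_before = buf.capacity();
    let ptr_before = buf.as_ptr();
    for p in parts {
        buf.push_str(p); // fits in the reserved space, so no reallocation
    }
    let reallocated = buf.capacity() != cap_before || buf.as_ptr() != ptr_before;
    (buf, reallocated)
}
```

With parts such as ["Hello, ", "Alice", "!"], the only allocation is the one made by with_capacity; the pushes reuse it.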

    Full Source

    #![allow(clippy::all)]
    // Example 113: String vs &str
    //
    // String: owned, heap-allocated, growable, mutable
    // &str: borrowed slice — a pointer + length into any existing string data
    
    // ---------------------------------------------------------------------------
    // Approach 1: Idiomatic Rust — use &str in parameters, String for ownership
    // ---------------------------------------------------------------------------
    
    /// Returns the first word (before any comma) from a string slice.
    /// Accepts `&str` so callers can pass a `String` reference or a literal —
    /// no allocation forced on the caller.
    pub fn first_word(s: &str) -> &str {
        s.split(',').next().unwrap_or(s).trim()
    }
    
    /// Counts Unicode scalar values (chars) in any string slice.
    pub fn char_count(s: &str) -> usize {
        s.chars().count()
    }
    
    /// Appends a suffix and returns a new owned `String`.
    /// Takes `&str` for both — works with literals or `String` borrows.
    pub fn append(base: &str, suffix: &str) -> String {
        let mut result = String::with_capacity(base.len() + suffix.len());
        result.push_str(base);
        result.push_str(suffix);
        result
    }
    
    // ---------------------------------------------------------------------------
    // Approach 2: Functional / builder style — manipulate owned Strings
    // ---------------------------------------------------------------------------
    
    /// Builds a greeting in a new owned `String`; `name` is only borrowed.
    pub fn greet(name: &str) -> String {
        // String::from converts &str → String (heap allocation)
        let mut greeting = String::from("Hello, ");
        greeting.push_str(name);
        greeting.push('!');
        greeting
    }
    
    /// Splits a sentence into words, returning borrowed `&str` views into the input.
    /// No text is copied; owned Strings would need an explicit `.map(String::from)`.
    pub fn words(s: &str) -> Vec<&str> {
        s.split_whitespace().collect()
    }
    
    /// Uppercase: &str in, new String out — no in-place mutation.
    pub fn to_upper(s: &str) -> String {
        s.to_uppercase()
    }
    
    // ---------------------------------------------------------------------------
    // Approach 3: Subslicing — zero-copy views into string data
    // ---------------------------------------------------------------------------
    
    /// Returns the substring at byte range `start..start + len` as a `&str` slice.
    /// Panics if the range is out of bounds or an index is not on a char boundary.
    pub fn substring(s: &str, start: usize, len: usize) -> &str {
        &s[start..start + len]
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_first_word_with_comma() {
            assert_eq!(first_word("hello, world!"), "hello");
        }
    
        #[test]
        fn test_first_word_no_comma() {
            assert_eq!(first_word("hello"), "hello");
        }
    
        #[test]
        fn test_first_word_from_owned_string() {
            // Demonstrates that first_word accepts &String via deref coercion
            let owned = String::from("foo, bar");
            assert_eq!(first_word(&owned), "foo");
        }
    
        #[test]
        fn test_char_count_ascii() {
            assert_eq!(char_count("hello"), 5);
        }
    
        #[test]
        fn test_char_count_unicode() {
            // "café" is 4 chars but 5 bytes — char_count returns chars
            assert_eq!(char_count("café"), 4);
        }
    
        #[test]
        fn test_append() {
            assert_eq!(append("hello", " world"), "hello world");
            // Works with &String too
            let s = String::from("foo");
            assert_eq!(append(&s, "bar"), "foobar");
        }
    
        #[test]
        fn test_greet() {
            assert_eq!(greet("Alice"), "Hello, Alice!");
            assert_eq!(greet("World"), "Hello, World!");
        }
    
        #[test]
        fn test_words() {
            assert_eq!(words("one two three"), vec!["one", "two", "three"]);
            assert_eq!(words(""), Vec::<&str>::new());
        }
    
        #[test]
        fn test_to_upper() {
            assert_eq!(to_upper("hello world"), "HELLO WORLD");
        }
    
        #[test]
        fn test_substring() {
            let s = "hello, world!";
            assert_eq!(substring(s, 7, 5), "world");
            assert_eq!(substring(s, 0, 5), "hello");
        }
    
        #[test]
        fn test_string_literal_is_str() {
            // &'static str: baked into the binary, no heap allocation
            let literal: &str = "static text";
            let owned: String = literal.to_owned();
            // &String coerces to &str via Deref
            let back: &str = &owned;
            assert_eq!(literal, back);
        }
    }

    Deep Comparison

    OCaml vs Rust: String vs &str

    Side-by-Side Code

    OCaml

    (* OCaml has one string type — immutable, GC-managed *)
    let first_word s =
      match String.index_opt s ',' with
      | Some i -> String.sub s 0 i |> String.trim
      | None   -> String.trim s
    
    let greet name = "Hello, " ^ name ^ "!"
    
    let char_count s = String.length s   (* byte count in OCaml *)
    
    let () =
      assert (first_word "hello, world!" = "hello");
      assert (greet "Alice" = "Hello, Alice!");
      print_endline "ok"
    

    Rust (idiomatic — &str parameters)

    // &str in function parameters: callers can pass literals or &String
    pub fn first_word(s: &str) -> &str {
        s.split(',').next().unwrap_or(s).trim()
    }
    
    pub fn greet(name: &str) -> String {
        let mut g = String::from("Hello, ");
        g.push_str(name);
        g.push('!');
        g
    }
    
    pub fn char_count(s: &str) -> usize {
        s.chars().count()   // Unicode scalar values, not bytes
    }
    

    Rust (functional / builder style)

    // Build strings with iterators — no mutation
    pub fn words(s: &str) -> Vec<&str> {
        s.split_whitespace().collect()
    }
    
    pub fn to_upper(s: &str) -> String {
        s.to_uppercase()
    }
    
    pub fn append(base: &str, suffix: &str) -> String {
        [base, suffix].concat()
    }
    

    Type Signatures

    Concept            | OCaml                                   | Rust
    String type        | string (single type)                    | String (owned) / &str (borrowed)
    Function parameter | val f : string -> string                | fn f(s: &str) -> String
    Literal type       | string                                  | &'static str
    Substring          | String.sub s start len → string (copy)  | &s[start..end] → &str (zero-copy)
    Mutation           | Not allowed (immutable)                 | String is mutable; &str is not
    Char count         | String.length (bytes)                   | s.chars().count() (Unicode scalars)

    Key Insights

  • Two types for two purposes. Rust's String owns heap-allocated text you can mutate and grow. &str is a borrowed view — a fat pointer (pointer + length) into any existing string data, requiring no allocation.
  • Use &str in function signatures. Writing fn f(s: &str) lets callers pass a string literal ("hello"), a &String (via deref coercion through Deref<Target = str>), or a slice of a larger string, all without forcing an allocation.
  • OCaml's single string is closest to Rust's String. Both are heap-allocated and managed (GC in OCaml, ownership in Rust). Rust adds &str as a zero-cost borrowed view that OCaml doesn't have: every OCaml String.sub copies; Rust &s[a..b] does not.
  • String.length in OCaml counts bytes; s.chars().count() in Rust counts Unicode scalar values. The distinction matters for multibyte characters: "café" has 4 chars but 5 UTF-8 bytes.
  • Ownership is visible in the type. String in a return type tells the caller they own new heap memory. &str in a return type (borrowing from input) is zero-copy. This contract is enforced by the borrow checker — no runtime surprises.
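
The byte-versus-char distinction also affects slicing, since &s[a..b] indexes by byte and panics off a char boundary. The sketch below uses the standard str::get method for a non-panicking variant; byte_and_char_len and try_substring are hypothetical names.

```rust
/// Byte length and Unicode scalar count of a string slice.
pub fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Non-panicking byte-range slicing: returns `None` when the range is out of
/// bounds or does not start and end on char boundaries.
pub fn try_substring(s: &str, start: usize, len: usize) -> Option<&str> {
    let end = start.checked_add(len)?; // guard against overflow
    s.get(start..end)
}
```

For "caf\u{e9}" (café with a precomposed é), the lengths are 5 bytes and 4 chars. The range 0..3 yields "caf", while 0..4 yields None because byte 4 falls in the middle of the two-byte é.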

    When to Use Each Style

    **Use &str in function parameters when:** you only need to read the string. This is the idiomatic Rust default — it accepts literals, String borrows, and subslices without allocation.

    **Use String (owned) when:** the function needs to build, grow, or return new string data, or when you need the string to outlive the input (e.g., storing in a struct field).

    **Use subslice &str returns when:** you can return a view into the input string (e.g., first_word) — zero allocation, maximum efficiency. The lifetime ties the returned slice to the input.
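
The struct-field case might be sketched as follows, assuming a hypothetical Tag type: the constructor borrows its input and copies it into an owned field, and the accessor hands back a borrowed view.

```rust
/// Hypothetical record that must outlive the text it was built from.
pub struct Tag {
    name: String,
}

impl Tag {
    /// Borrows the input, then copies the trimmed text into an owned field.
    pub fn new(name: &str) -> Self {
        Tag { name: name.trim().to_owned() }
    }

    /// Hands out a borrowed view of the stored name; no allocation.
    pub fn name(&self) -> &str {
        &self.name
    }
}

/// Builds a Tag from a temporary String and returns it after the source is gone.
pub fn make_tag() -> Tag {
    let input = String::from("  rust  ");
    let tag = Tag::new(&input);
    drop(input); // the source buffer is freed; the Tag still owns its copy
    tag
}
```

Storing a &str in the field instead would be possible, but it would add a lifetime parameter to Tag and tie every Tag to the buffer it was created from.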

    Exercises

  • Write a function count_words(s: &str) -> usize that counts space-separated words without allocating.
  • Implement title_case(s: &str) -> String that capitalizes the first letter of each word.
  • Write split_once_custom<'a>(s: &'a str, delim: char) -> Option<(&'a str, &'a str)> that returns borrowed slices of the input.