553 Advanced

Self-Referential Structs

Functional Programming

Tutorial

The Problem

A self-referential struct stores a reference to its own data, for example a struct that owns a String and also holds a &str pointing into that same String. Safe Rust cannot express this directly: a field's lifetime cannot name "the struct that contains me", and when the referenced data lives inline in the struct, moving the struct leaves the internal reference dangling. The problem arises in async futures (which hold references to their own state), parsers (holding a pointer into a buffer they own), and event loops. The standard solutions are to store indices instead of references, to use Pin<Box<T>> so the data cannot move, or to use an external crate such as ouroboros.
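
To make the move hazard concrete, here is a minimal sketch. The `Inline` type and both helper functions are hypothetical names, not part of the example source. It records the address of a field stored inline in a struct, moves the struct into a `Box`, and compares the two addresses. A `String`'s heap buffer does not itself relocate when the `String` is moved, so an inline array is used here to make the effect observable.

```rust
/// Hypothetical type whose bytes live inline in the struct (not behind a heap pointer).
struct Inline {
    bytes: [u8; 8],
}

/// Address of the inline field, as a plain integer so it can outlive the borrow.
fn field_addr(s: &Inline) -> usize {
    s.bytes.as_ptr() as usize
}

/// Returns true if moving the struct from the stack to the heap changed the field's address.
fn address_changes_on_move() -> bool {
    let on_stack = Inline { bytes: *b"selfref!" };
    let before = field_addr(&on_stack);
    let on_heap = Box::new(on_stack); // move: the bytes are copied to a new location
    let after = field_addr(&on_heap);
    before != after
}

fn main() {
    // An internal pointer captured at `before` would dangle after the move.
    assert!(address_changes_on_move());
}
```

A reference taken at the first address would be stale after the move, which is the situation the borrow checker refuses to allow.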

🎯 Learning Outcomes

  • Why self-referential structs are problematic in Rust's ownership model
  • How storing indices instead of references sidesteps the problem safely
  • How Pin<Box<T>> prevents a struct from moving, enabling self-references
  • How the Owner/View two-struct pattern separates owned data from borrowed views
  • Where self-referential structs arise: async futures, parsers, event loop state machines

Code Example

    #![allow(clippy::all)]
    //! Self-Referential Structs
    //!
    //! Patterns for structs that reference their own data.
    
    use std::pin::Pin;
    
    /// Safe approach: store index instead of reference.
    pub struct Buffer {
        data: String,
        start: usize,
        end: usize,
    }
    
    impl Buffer {
        pub fn new(data: &str, start: usize, end: usize) -> Self {
            Buffer {
                data: data.to_string(),
                start,
                end,
            }
        }
    
        pub fn view(&self) -> &str {
            &self.data[self.start..self.end]
        }
    }
    
    /// Using separate owner and view.
    pub struct Owner {
        data: String,
    }
    
    impl Owner {
        pub fn new(data: &str) -> Self {
            Owner {
                data: data.to_string(),
            }
        }
    
        pub fn get(&self) -> &str {
            &self.data
        }
    }
    
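    /// Pinned self-referential (advanced).
    pub struct Pinned {
        data: String,
        // In real code, a raw pointer into `data` would be stored here and set after pinning.
        // `PhantomPinned` opts out of `Unpin`; without it, `Pin` would not stop safe code
        // from moving the value back out.
        _pin: std::marker::PhantomPinned,
    }
    
    impl Pinned {
        pub fn new(data: String) -> Pin<Box<Self>> {
            Box::pin(Pinned {
                data,
                _pin: std::marker::PhantomPinned,
            })
        }
    }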
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_buffer_view() {
            let buf = Buffer::new("hello world", 0, 5);
            assert_eq!(buf.view(), "hello");
        }
    
        #[test]
        fn test_owner() {
            let owner = Owner::new("test");
            assert_eq!(owner.get(), "test");
        }
    
        #[test]
        fn test_pinned() {
            let pinned = Pinned::new("data".into());
            // fields of the boxed value are reachable through the pin via Deref
            assert!(!pinned.data.is_empty());
        }
    }
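
The listing above defines `Owner` but no matching view type. A minimal sketch of the two-struct pattern follows, assuming a hypothetical `View<'a>` struct and a hypothetical `view` method that are not in the original source. The lifetime `'a` ties each view to the `Owner` it borrows from, so no struct ever refers to itself.

```rust
/// Owns the text.
pub struct Owner {
    data: String,
}

/// Hypothetical borrowed view; the lifetime `'a` ties it to the `Owner` it came from.
pub struct View<'a> {
    text: &'a str,
}

impl Owner {
    pub fn new(data: &str) -> Self {
        Owner {
            data: data.to_string(),
        }
    }

    /// Borrow a byte range of the owned text as a `View`.
    /// Returns `None` if the range is out of bounds or splits a UTF-8 character.
    pub fn view(&self, start: usize, end: usize) -> Option<View<'_>> {
        self.data.get(start..end).map(|text| View { text })
    }
}

impl<'a> View<'a> {
    pub fn as_str(&self) -> &'a str {
        self.text
    }
}

fn main() {
    let owner = Owner::new("hello world");
    let view = owner.view(6, 11).expect("range is valid");
    assert_eq!(view.as_str(), "world");
    // `owner` cannot be moved or dropped while `view` is alive; the borrow checker enforces it.
}
```

The cost of this design is that the caller must keep the `Owner` alive and in place for as long as any `View` exists.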

    Key Differences

  • Move safety: Rust's ownership model makes self-references dangerous when structs are moved; OCaml values are GC-managed and never "moved" in the Rust sense.
  • Async futures: Rust async futures are self-referential state machines requiring Pin; OCaml's effect-based async (OCaml 5.x) does not require pinning.
  • Index pattern: Storing indices instead of references is idiomatic Rust for self-referential data; OCaml can store direct references without concern.
  • Crate solutions: ouroboros, self_cell, and rental (deprecated) solve self-referential structs with macros; OCaml needs no such workarounds.
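
For the pinning approach, the sketch below shows what a struct holding a pointer to its own field could look like. It follows the general shape of the pattern in the `std::pin` documentation. The `SelfRef` type, its `ptr` field, and the `via_pointer` method are hypothetical names, and the `unsafe` blocks rely on the stated assumption that the value is only ever created through `new`.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr::NonNull;

/// Hypothetical self-referential type: `ptr` points at the `data` field of the same value.
struct SelfRef {
    data: String,
    ptr: NonNull<String>,
    _pin: PhantomPinned, // opts out of `Unpin`
}

impl SelfRef {
    fn new(data: String) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(SelfRef {
            data,
            ptr: NonNull::dangling(), // placeholder until the value has its final address
            _pin: PhantomPinned,
        });
        let ptr = NonNull::from(&boxed.data);
        // SAFETY: writing a field in place does not move the pinned value.
        unsafe {
            boxed.as_mut().get_unchecked_mut().ptr = ptr;
        }
        boxed
    }

    fn via_pointer(&self) -> &str {
        // SAFETY: assumes the value was built by `new`, so `ptr` targets this value's
        // own `data` field, and pinning guarantees the value has not moved since.
        unsafe { self.ptr.as_ref().as_str() }
    }
}

fn main() {
    let pinned = SelfRef::new(String::from("data"));
    assert_eq!(pinned.via_pointer(), "data");

    // Moving the `Pin<Box<_>>` handle moves only the pointer; the heap value stays put.
    let moved = pinned;
    assert_eq!(moved.via_pointer(), "data");
}
```

Crates such as ouroboros and self_cell exist largely so that this kind of `unsafe` code does not have to be written by hand.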
OCaml Approach

    OCaml's GC-managed heap allows self-referential structures trivially — values can reference themselves without any pinning or special types:

    type node = { mutable next: node option; value: int }
    let rec n = { next = Some n; value = 42 }  (* self-referential — fine in OCaml *)
    

    The GC traces the cycle and keeps its nodes alive for as long as any of them is reachable from the program.

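    Deep Comparison

    OCaml vs Rust: self-referential lifetimes

    See example.rs and example.ml for implementations.

    Key Differences

  • OCaml uses garbage collection, so a value can hold a direct reference to itself.
  • Rust uses ownership and borrowing, so a self-reference needs indices, pinning, or a helper crate.
  • Both languages can express the core concept; only the mechanism differs.

Exercises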

  • Index-based parser: Implement struct Parser { source: String, pos: usize } where all parsing methods return &str slices computed from pos into source — no self-reference needed.
  • Pinned future: Write a simple state machine struct implementing Future that stores a reference to a field within itself using Pin<&mut Self> in the poll method.
  • Owner-View pair: Implement a TextDocument owning a String and a Selection { start: usize, end: usize } that returns &str slices via document.selected_text(&selection).
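
As one possible starting point for the index-based parser exercise, the sketch below keeps only a byte offset into the owned source. The `next_word` method and its whitespace-delimited tokenization are assumptions chosen for illustration; the exercise does not prescribe them.

```rust
pub struct Parser {
    source: String,
    pos: usize,
}

impl Parser {
    pub fn new(source: &str) -> Self {
        Parser {
            source: source.to_string(),
            pos: 0,
        }
    }

    /// Return the next whitespace-delimited word and advance `pos` past it.
    /// Returns `None` once only whitespace (or nothing) remains.
    pub fn next_word(&mut self) -> Option<&str> {
        let trimmed = self.source[self.pos..].trim_start();
        if trimmed.is_empty() {
            self.pos = self.source.len();
            return None;
        }
        // Byte offset where the word starts, recovered from the remaining length.
        let start = self.source.len() - trimmed.len();
        // The word ends at the next whitespace character or at the end of input.
        let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        self.pos = start + len;
        Some(&self.source[start..start + len])
    }
}

fn main() {
    let mut parser = Parser::new("  let x");
    assert_eq!(parser.next_word(), Some("let"));
    assert_eq!(parser.next_word(), Some("x"));
    assert_eq!(parser.next_word(), None);
}
```

Because the struct stores only `pos`, it can be moved freely; every returned slice is computed on demand and borrows from `self` for just as long as the caller holds it.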
Open Source Repos