1077 Expert

Phantom Type State Machine — File Handle

Functional Programming

Tutorial

The Problem

Use phantom types to enforce that a file handle can only be read when open, and that closing it prevents further reads — all checked at compile time, not runtime.

🎯 Learning Outcomes

• Phantom types in Rust via PhantomData<T> vs OCaml's type parameter trick

• Zero-cost type-level state machines (no runtime overhead)

• How move semantics enforce state transitions (consuming the old handle)

• Comparison with runtime state checks via enums

🦀 The Rust Way

Rust uses zero-sized marker types (struct Opened; and struct Closed;) and a PhantomData<State> field to carry the type parameter without runtime cost. read_line and close are implemented only on FileHandle<Opened>, so calling read_line on a closed handle is a compile error. close takes self by value, so the open handle is moved into the call and any later use of it is also rejected at compile time.

Code Example

Rust

use std::marker::PhantomData;

struct Opened;
struct Closed;

struct FileHandle<State> {
    name: String,
    content: Vec<String>,
    _state: PhantomData<State>,
}

impl FileHandle<Opened> {
    fn read_line(&self, n: usize) -> Option<&str> {
        self.content.get(n).map(|s| s.as_str())
    }

    fn close(self) -> FileHandle<Closed> {
        FileHandle { name: self.name, content: vec![], _state: PhantomData }
    }
}

OCaml

type opened
type closed
type 'state handle = { name: string; content: string list }

let open_file name : opened handle =
  { name; content = ["line1"; "line2"; "line3"] }

let read_line (h : opened handle) n : string =
  List.nth h.content n

let close_file (h : opened handle) : closed handle =
  (* Keep the file name, as the Rust version does; only the content is dropped. *)
  { name = h.name; content = [] }

Key Differences

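Phantom types: OCaml leaves a type parameter unused on the record and instantiates it with abstract marker types; Rust needs an explicit PhantomData<State> field plus zero-sized marker structs

State transition: OCaml returns a new value but the old handle stays usable; Rust moves the old handle into close, so reusing it is a compile error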

Method dispatch: OCaml uses standalone functions with type constraints; Rust uses impl blocks on specific type parameters

Runtime comparison: Both languages can also do runtime checks (enum/variant), but phantom types are zero-cost
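
To make the state-transition difference concrete, here is a minimal sketch built on the FileHandle definitions from the Code Example above. The demo function, the shortened open_file constructor, and the file name are illustrative assumptions rather than part of the original source. The two commented-out lines are the mistakes the compiler is expected to reject.

```rust
use std::marker::PhantomData;

struct Opened;
struct Closed;

struct FileHandle<State> {
    name: String,
    content: Vec<String>,
    _state: PhantomData<State>,
}

// Shortened constructor for this sketch; mirrors `open_file` in the full source.
fn open_file(name: &str) -> FileHandle<Opened> {
    FileHandle {
        name: name.to_string(),
        content: vec!["line1".to_string(), "line2".to_string()],
        _state: PhantomData,
    }
}

impl FileHandle<Opened> {
    fn read_line(&self, n: usize) -> Option<&str> {
        self.content.get(n).map(|s| s.as_str())
    }

    // Takes `self` by value: the open handle is moved in and cannot be used again.
    fn close(self) -> FileHandle<Closed> {
        FileHandle { name: self.name, content: vec![], _state: PhantomData }
    }
}

impl FileHandle<Closed> {
    fn name(&self) -> &str {
        &self.name
    }
}

// Hypothetical helper (an assumption of this sketch): read one line, then close.
fn demo() -> (Option<String>, String) {
    let f = open_file("data.txt");
    let first = f.read_line(0).map(str::to_string);
    let closed = f.close(); // `f` is moved here

    // f.read_line(0);      // rejected: use of moved value `f`
    // closed.read_line(0); // rejected: no method `read_line` on `FileHandle<Closed>`

    (first, closed.name().to_string())
}
```

Uncommenting either line should turn the mistake into a build failure, which is the guarantee the OCaml version only gives for the second case.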

OCaml Approach

OCaml uses phantom type parameters on a record type. The opened and closed types are abstract — they have no values. Functions constrain which phantom type is accepted, so read_line only works on opened handle values. The type checker enforces this statically.

Full Source

#![allow(clippy::all)]
//! Phantom Type State Machine — File Handle
//!
//! Uses phantom types to enforce state transitions at compile time.
//! In OCaml, phantom type parameters constrain which operations are valid.
//! In Rust, we use the same pattern with zero-sized type markers.

use std::marker::PhantomData;

// ── Solution 1: Idiomatic Rust — phantom type markers ──

/// State marker: file is open (zero-sized, exists only at type level)
pub struct Opened;
/// State marker: file is closed
pub struct Closed;

/// A file handle parameterized by its state.
/// The `PhantomData<State>` makes the compiler track the state
/// without any runtime cost.
///
/// OCaml equivalent: `type 'state handle = { name: string; content: string list }`
pub struct FileHandle<State> {
    name: String,
    content: Vec<String>,
    _state: PhantomData<State>,
}

/// Open a file — returns a handle in the `Opened` state.
/// OCaml: `val open_file : string -> opened handle`
pub fn open_file(name: &str) -> FileHandle<Opened> {
    FileHandle {
        name: name.to_string(),
        content: vec![
            "line1".to_string(),
            "line2".to_string(),
            "line3".to_string(),
        ],
        _state: PhantomData,
    }
}

impl FileHandle<Opened> {
    /// Read a line — only available when the file is open.
    /// OCaml: `val read_line : opened handle -> int -> string`
    pub fn read_line(&self, n: usize) -> Option<&str> {
        self.content.get(n).map(|s| s.as_str())
    }

    /// Close the file — consumes the open handle, returns a closed one.
    /// This is the key insight: after closing, the old handle is gone.
    /// OCaml: `val close_file : opened handle -> closed handle`
    pub fn close(self) -> FileHandle<Closed> {
        FileHandle {
            name: self.name,
            content: vec![],
            _state: PhantomData,
        }
    }

    /// Get the file name
    pub fn name(&self) -> &str {
        &self.name
    }
}

impl FileHandle<Closed> {
    /// Get the file name even after closing
    pub fn name(&self) -> &str {
        &self.name
    }
}

// ── Solution 2: Trait-based approach ──
//
// Uses traits to gate operations instead of inherent impls.

/// Marker trait for states that allow reading
pub trait Readable {}
impl Readable for Opened {}
// Closed does NOT implement Readable

/// Generic read function — only compiles for Readable states
pub fn read_generic<S: Readable>(handle: &FileHandle<S>, n: usize) -> Option<&str> {
    handle.content.get(n).map(|s| s.as_str())
}

// ── Solution 3: Enum-based (runtime check, for comparison) ──
//
// Shows why phantom types are superior — enum checks happen at runtime.

#[derive(Debug, PartialEq)]
pub enum FileState {
    Open,
    Closed,
}

pub struct RuntimeFileHandle {
    pub name: String,
    pub content: Vec<String>,
    pub state: FileState,
}

impl RuntimeFileHandle {
    pub fn open(name: &str) -> Self {
        Self {
            name: name.to_string(),
            content: vec![
                "line1".to_string(),
                "line2".to_string(),
                "line3".to_string(),
            ],
            state: FileState::Open,
        }
    }

    /// Returns Err if file is closed — runtime check instead of compile-time
    pub fn read_line(&self, n: usize) -> Result<&str, &'static str> {
        if self.state == FileState::Closed {
            return Err("cannot read from closed file");
        }
        self.content
            .get(n)
            .map(|s| s.as_str())
            .ok_or("line index out of range")
    }

    pub fn close(&mut self) {
        self.state = FileState::Closed;
        self.content.clear();
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_open_and_read() {
        let f = open_file("data.txt");
        assert_eq!(f.read_line(0), Some("line1"));
        assert_eq!(f.read_line(1), Some("line2"));
        assert_eq!(f.read_line(2), Some("line3"));
    }

    #[test]
    fn test_read_out_of_bounds() {
        let f = open_file("data.txt");
        assert_eq!(f.read_line(99), None);
    }

    #[test]
    fn test_close_returns_closed_handle() {
        let f = open_file("data.txt");
        let closed = f.close();
        // After closing, we can still get the name
        assert_eq!(closed.name(), "data.txt");
        // But we CANNOT call read_line — it won't compile:
        // closed.read_line(0);  // ERROR: no method `read_line` on FileHandle<Closed>
    }

    #[test]
    fn test_generic_read_on_opened() {
        let f = open_file("test.txt");
        assert_eq!(read_generic(&f, 0), Some("line1"));
    }

    #[test]
    fn test_runtime_handle_read_after_close() {
        let mut f = RuntimeFileHandle::open("data.txt");
        assert_eq!(f.read_line(0), Ok("line1"));
        f.close();
        assert_eq!(f.read_line(0), Err("cannot read from closed file"));
    }

    #[test]
    fn test_file_name_persists_after_close() {
        let f = open_file("important.txt");
        assert_eq!(f.name(), "important.txt");
        let closed = f.close();
        assert_eq!(closed.name(), "important.txt");
    }
}

(* Phantom Type State Machine — File Handle *)

type opened
type closed

type 'state handle = { name: string; content: string list }

let open_file name : opened handle =
  { name; content = ["line1"; "line2"; "line3"] }

let read_line (h : opened handle) n : string =
  List.nth h.content n

let close_file (h : opened handle) : closed handle =
  (* Keep the file name, as the Rust version does; only the content is dropped. *)
  { name = h.name; content = [] }

(* read_line on a closed handle would be a type error! *)
(* let _ = read_line (close_file (open_file "test")) 0 *)

let () =
  let f = open_file "data.txt" in
  assert (read_line f 0 = "line1");
  assert (read_line f 1 = "line2");
  Printf.printf "%s\n" (read_line f 0);
  Printf.printf "%s\n" (read_line f 1);
  let _closed = close_file f in
  (* f still has type opened handle here: OCaml would accept another
     read_line f 0, which is the gap Rust's move semantics close. *)
  Printf.printf "File safely closed\n";
  print_endline "ok"

Deep Comparison

OCaml vs Rust: Phantom Type State Machine

Side-by-Side Code

OCaml

type opened
type closed
type 'state handle = { name: string; content: string list }

let open_file name : opened handle =
  { name; content = ["line1"; "line2"; "line3"] }

let read_line (h : opened handle) n : string =
  List.nth h.content n

let close_file (h : opened handle) : closed handle =
  (* Keep the file name, as the Rust version does; only the content is dropped. *)
  { name = h.name; content = [] }

Rust (idiomatic)

use std::marker::PhantomData;

struct Opened;
struct Closed;

struct FileHandle<State> {
    name: String,
    content: Vec<String>,
    _state: PhantomData<State>,
}

impl FileHandle<Opened> {
    fn read_line(&self, n: usize) -> Option<&str> {
        self.content.get(n).map(|s| s.as_str())
    }

    fn close(self) -> FileHandle<Closed> {
        FileHandle { name: self.name, content: vec![], _state: PhantomData }
    }
}

Rust (runtime comparison — enum-based)

// PartialEq is required for the `==` comparison in read_line.
#[derive(PartialEq)]
enum FileState { Open, Closed }

struct RuntimeFileHandle {
    name: String,
    content: Vec<String>,
    state: FileState,
}

impl RuntimeFileHandle {
    fn read_line(&self, n: usize) -> Result<&str, &'static str> {
        if self.state == FileState::Closed {
            return Err("cannot read from closed file");
        }
        self.content.get(n).map(|s| s.as_str()).ok_or("out of range")
    }
}

Type Signatures

| Concept | OCaml | Rust |
|---|---|---|
| Phantom parameter | `type 'state handle` | `struct FileHandle<State>` |
| State markers | `type opened` (abstract) | `struct Opened;` (zero-sized) |
| Phantom carrier | Built into type parameter | `PhantomData<State>` |
| State transition | Returns new phantom type | `self` consumed, new type returned |

Key Insights

Both languages achieve zero-cost type-level state machines — the phantom parameter exists only for the type checker, never at runtime.

Rust's move semantics add an extra guarantee — close(self) consumes the handle, so you can't accidentally keep using it. OCaml's close_file returns a new value but doesn't prevent keeping the old one.

OCaml's abstract types vs Rust's zero-sized types — OCaml's type opened has no constructors; Rust's struct Opened; is a unit struct. Both serve as compile-time-only markers.

PhantomData is Rust's explicit marker: OCaml accepts a type parameter that no field mentions, so it needs no equivalent. Rust rejects an unused type parameter (error E0392), so the struct carries a zero-sized PhantomData<State> field that makes the parameter count as used.

Runtime alternatives exist in both — OCaml can use variants, Rust can use enums. But phantom types catch errors at compile time with zero runtime cost.
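
The zero-cost claim can be checked with std::mem::size_of. The sketch below is illustrative: PlainHandle and layout_report are hypothetical names introduced here, not part of the original source. The marker and PhantomData sizes are guaranteed to be zero. The equality between the typed and plain handle sizes is what current compilers produce in practice, not a formal layout guarantee for default-representation structs.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Opened;
struct Closed;

// Removing the `_state` field makes this definition fail with
// error E0392 (type parameter `State` is never used).
struct FileHandle<State> {
    name: String,
    content: Vec<String>,
    _state: PhantomData<State>,
}

// Hypothetical comparison type: the same data with no state parameter at all.
struct PlainHandle {
    name: String,
    content: Vec<String>,
}

// Hypothetical helper: sizes in bytes of
// (marker type, phantom field, typed handle, plain handle).
fn layout_report() -> (usize, usize, usize, usize) {
    (
        size_of::<Opened>(),
        size_of::<PhantomData<Closed>>(),
        size_of::<FileHandle<Opened>>(),
        size_of::<PlainHandle>(),
    )
}
```

If the last two numbers match, the state parameter adds no bytes to the handle, so the state machine lives only in the type checker.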

When to Use Each Style

Use phantom types when: State transitions must be enforced at compile time, as with connection states, protocol phases, and resource lifecycles. Invalid transitions then fail to compile instead of being caught at runtime.

Use runtime enums when: States are dynamic or determined by external input, and you can't know the state at compile time.
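
The two styles can also be combined. One possible arrangement, sketched below under stated assumptions, is to perform the runtime check once at the boundary and hand out typed handles afterwards. AnyHandle, from_external, and first_line are hypothetical names for this sketch, and the is_open flag stands in for whatever external input decides the state.

```rust
use std::marker::PhantomData;

struct Opened;
struct Closed;

struct FileHandle<State> {
    content: Vec<String>,
    _state: PhantomData<State>,
}

impl FileHandle<Opened> {
    fn read_line(&self, n: usize) -> Option<&str> {
        self.content.get(n).map(|s| s.as_str())
    }
}

// Hypothetical wrapper: the state is only known at runtime,
// so it is recorded in the enum variant.
enum AnyHandle {
    Open(FileHandle<Opened>),
    Closed(FileHandle<Closed>),
}

// Hypothetical boundary function: `is_open` stands in for external input.
fn from_external(is_open: bool) -> AnyHandle {
    if is_open {
        AnyHandle::Open(FileHandle {
            content: vec!["line1".to_string()],
            _state: PhantomData,
        })
    } else {
        AnyHandle::Closed(FileHandle { content: vec![], _state: PhantomData })
    }
}

// One runtime check; inside the `Open` arm the handle is statically open.
fn first_line(handle: &AnyHandle) -> Option<&str> {
    match handle {
        AnyHandle::Open(h) => h.read_line(0),
        AnyHandle::Closed(_) => None,
    }
}
```

The match is the only place where the state is inspected at runtime. Code that receives a FileHandle<Opened> from the Open arm needs no further checks.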

Exercises

Add a third state (e.g., Locked) to the phantom-type state machine and implement transitions that enforce valid state sequences at compile time.

Apply the phantom-type state machine pattern to model a network connection lifecycle: Disconnected → Connecting → Connected → Disconnecting → Disconnected, preventing methods like send from being called in wrong states.

Implement a builder for a configuration struct using phantom types to enforce that required fields (host, port) must be set before build() can be called.
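
As a possible starting point for the builder exercise, the sketch below tracks each required field with its own phantom parameter. All names here (Missing, Set, Config, ConfigBuilder) are assumptions made for illustration, and this is only one of several workable designs.

```rust
use std::marker::PhantomData;

// Marker types recording whether a required field has been provided.
struct Missing;
struct Set;

struct Config {
    host: String,
    port: u16,
}

// One phantom parameter per required field.
struct ConfigBuilder<Host, Port> {
    host: Option<String>,
    port: Option<u16>,
    _state: PhantomData<(Host, Port)>,
}

impl ConfigBuilder<Missing, Missing> {
    fn new() -> Self {
        ConfigBuilder { host: None, port: None, _state: PhantomData }
    }
}

// `host` is available while the host is missing, whatever the port state is.
impl<P> ConfigBuilder<Missing, P> {
    fn host(self, host: &str) -> ConfigBuilder<Set, P> {
        ConfigBuilder { host: Some(host.to_string()), port: self.port, _state: PhantomData }
    }
}

// `port` is available while the port is missing, whatever the host state is.
impl<H> ConfigBuilder<H, Missing> {
    fn port(self, port: u16) -> ConfigBuilder<H, Set> {
        ConfigBuilder { host: self.host, port: Some(port), _state: PhantomData }
    }
}

// `build` exists only once both markers are `Set`.
impl ConfigBuilder<Set, Set> {
    fn build(self) -> Config {
        Config {
            // These cannot fail: a `Set` marker is only produced by the setter above.
            host: self.host.expect("host marked Set"),
            port: self.port.expect("port marked Set"),
        }
    }
}
```

With these definitions, ConfigBuilder::new().host("localhost").port(8080).build() type-checks in either setter order, while calling build() before both setters has no matching method.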

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.
