
455: Lock-Free Stack

Tutorial

The Problem

A mutex-based stack serializes all push/pop operations, a contention bottleneck under high concurrency. A lock-free stack instead uses compare-and-swap (CAS) on the head pointer: push allocates a new node, points its next at the current head, then CASes the head to the new node; pop reads the head, reads its next pointer, then CASes the head to next. Retrying whenever a CAS fails keeps the structure correct without locks. Treiber's lock-free stack (1986) is the canonical algorithm and remains relevant in high-frequency trading, lock-free allocators, and concurrent data structures research.

Lock-free stacks appear in memory allocators (free list), lock-free work queues, concurrent GC systems, and any high-throughput concurrent LIFO structure.

🎯 Learning Outcomes

  • Understand Treiber's lock-free stack algorithm using CAS on the head pointer
  • Learn how AtomicPtr<Node<T>> stores the stack head with atomic pointer swaps
  • See how the push CAS loop handles concurrent modifications correctly
  • Understand the ABA problem: node reuse can fool CAS even with correct values
  • Learn why unsafe Rust is needed for raw pointer manipulation in lock-free structures

Code Example

    #![allow(clippy::all)]
    // 455. Lock-free stack with atomics
    use std::ptr;
    use std::sync::atomic::{AtomicPtr, Ordering};
    
    struct Node<T> {
        value: T,
        next: *mut Node<T>,
    }
    pub struct Stack<T> {
        head: AtomicPtr<Node<T>>,
    }
    unsafe impl<T: Send> Send for Stack<T> {}
    unsafe impl<T: Send> Sync for Stack<T> {}
    
    impl<T> Stack<T> {
        pub fn new() -> Self {
            Stack {
                head: AtomicPtr::new(ptr::null_mut()),
            }
        }
        pub fn push(&self, v: T) {
            // Heap-allocate the node up front; on CAS failure we only re-link it.
            let n = Box::into_raw(Box::new(Node {
                value: v,
                next: ptr::null_mut(),
            }));
            loop {
                let h = self.head.load(Ordering::Relaxed);
                // Link the new node in front of the current head.
                unsafe {
                    (*n).next = h;
                }
                // Publish only if the head is still `h`; Release makes the
                // node's fields visible to whichever thread later pops it.
                if self
                    .head
                    .compare_exchange_weak(h, n, Ordering::Release, Ordering::Relaxed)
                    .is_ok()
                {
                    break;
                }
            }
        }
        pub fn pop(&self) -> Option<T> {
            loop {
                let h = self.head.load(Ordering::Acquire);
                if h.is_null() {
                    return None;
                }
                // NOTE: in a fully general setting another thread could pop and
                // free `h` between the load above and this dereference; production
                // code needs safe reclamation (hazard pointers or epoch-based GC).
                let next = unsafe { (*h).next };
                if self
                    .head
                    .compare_exchange_weak(h, next, Ordering::AcqRel, Ordering::Relaxed)
                    .is_ok()
                {
                    // The CAS succeeded, so this thread owns `h` exclusively:
                    // move the value out, then free the node.
                    let v = unsafe { ptr::read(&(*h).value) };
                    unsafe {
                        drop(Box::from_raw(h));
                    }
                    return Some(v);
                }
            }
        }
    }
    impl<T> Drop for Stack<T> {
        fn drop(&mut self) {
            while self.pop().is_some() {}
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
        use std::sync::Arc;
        use std::thread;
        #[test]
        fn test_lifo() {
            let s = Stack::new();
            s.push(1);
            s.push(2);
            s.push(3);
            assert_eq!(s.pop(), Some(3));
            assert_eq!(s.pop(), Some(2));
            assert_eq!(s.pop(), Some(1));
            assert_eq!(s.pop(), None);
        }
        #[test]
        fn test_concurrent() {
            let s = Arc::new(Stack::<u32>::new());
            let hs: Vec<std::thread::JoinHandle<()>> = (0..4)
                .map(|_| {
                    let s = Arc::clone(&s);
                    thread::spawn(move || {
                        for i in 0..100 {
                            s.push(i);
                        }
                    })
                })
                .collect();
            for h in hs {
                h.join().unwrap();
            }
            let mut c = 0;
            while s.pop().is_some() {
                c += 1;
            }
            assert_eq!(c, 400);
        }
    }
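The orderings above follow the standard release/acquire publication pattern: push's successful CAS uses Release so the node's value and next fields become visible to the thread that pops it, and pop loads the head with Acquire to observe them. A minimal sketch of that pairing, separate from the stack itself (the names `slot` and `ready` are stand-ins, not part of the stack code):

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Release/acquire message passing: the pattern behind push/pop orderings.
fn acquire_release_demo() -> usize {
    // `slot` stands in for a node's fields; `ready` for the stack head.
    let slot = Arc::new(AtomicUsize::new(0));
    let ready = Arc::new(AtomicBool::new(false));
    let (s, r) = (Arc::clone(&slot), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        s.store(42, Ordering::Relaxed); // write the "node" contents
        r.store(true, Ordering::Release); // publish, like push's successful CAS
    });
    // Acquire pairs with the Release store: once `ready` reads true, the
    // write of 42 is guaranteed visible, like pop's Acquire load of the head.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    producer.join().unwrap();
    slot.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(acquire_release_demo(), 42);
}
```

If the Release store were Relaxed, the consumer could observe `ready == true` while still seeing a stale `slot`; the same reasoning applies to a popped node's fields.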

    Key Differences

  • Memory safety: Rust requires unsafe for pointer manipulation; OCaml's GC provides automatic memory management for lock-free structures.
  • ABA problem: Rust's lock-free stack is susceptible to ABA (need hazard pointers or epoch GC); OCaml's GC prevents ABA for pointer-based structures.
  • Type safety: Rust's AtomicPtr<T> is typed; OCaml's Atomic.t holds any value.
  • Unsafe boundary: Rust isolates unsafe code; OCaml's lock-free code is safe but depends on GC memory model guarantees.
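The ABA hazard in the second bullet can be shown with a value-level sketch (plain integers stand in for recycled node addresses; this is not the stack itself): a CAS compares only the current bits, so it cannot tell that the head went A → B → A in between.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Returns true if the CAS succeeds even though the value went A -> B -> A.
fn aba_demo() -> bool {
    let head = AtomicUsize::new(0xA);
    // "Thread 1" reads the head and plans a CAS based on it.
    let observed = head.load(Ordering::Acquire);
    // Meanwhile another thread pops A, pushes B, then pushes a *recycled* A.
    head.store(0xB, Ordering::Release);
    head.store(0xA, Ordering::Release);
    // The CAS compares only the bits, so it cannot detect the recycling.
    head.compare_exchange(observed, 0xB, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}

fn main() {
    // The CAS succeeds -- exactly the hazard that hazard pointers or
    // epoch-based reclamation are designed to prevent.
    assert!(aba_demo());
}
```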
OCaml Approach

    In OCaml 5.x, a lock-free stack stores the head in an Atomic.t and lets the GC handle memory reclamation. The GC eliminates the ABA problem for pointer-based structures, since a node's address is never reused while any thread still holds a reference to it. With a node type whose next field is mutable, push looks like: let push s v = let n = { v; next = Atomic.get s } in while not (Atomic.compare_and_set s n.next n) do n.next <- Atomic.get s done.


    Exercises

  • Correct stack test: Write tests for the lock-free stack with 8 concurrent producer threads pushing 1000 items each, and 8 consumer threads popping until empty. Verify no items are lost or duplicated.
  • ABA demonstration: Devise a scenario where the ABA problem could manifest in the current implementation. Propose a fix using an epoch counter packed with the pointer (if on a 64-bit platform with spare pointer bits).
  • Lock-free queue: Extend the lock-free stack concept to implement a lock-free FIFO queue (Michael-Scott queue). The key difference: maintain both head and tail pointers, each protected by separate CAS operations.
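As a starting point for the ABA exercise, one common mitigation is to pack a version counter next to the pointer (or an arena index) in a single CAS-able word: the counter is bumped on every successful CAS, so a stale retry fails even when the pointer bits repeat. A minimal sketch with a hypothetical 32-bit index / 32-bit version split (the arena itself is omitted):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical layout: low 32 bits = arena index, high 32 bits = version.
fn pack(index: u32, version: u32) -> u64 {
    ((version as u64) << 32) | index as u64
}
fn unpack(word: u64) -> (u32, u32) {
    (word as u32, (word >> 32) as u32)
}

/// Returns (first_cas_ok, stale_cas_ok): bumping the version on every
/// successful CAS makes a stale retry fail even if the index repeats.
fn tagged_cas_demo() -> (bool, bool) {
    let head = AtomicU64::new(pack(7, 0));
    let old = head.load(Ordering::Acquire);
    let (index, version) = unpack(old);
    let new = pack(index, version.wrapping_add(1)); // same index, new version
    let first = head
        .compare_exchange(old, new, Ordering::AcqRel, Ordering::Acquire)
        .is_ok();
    // A thread still holding `old` now fails its CAS: the ABA is detected.
    let stale = head
        .compare_exchange(old, new, Ordering::AcqRel, Ordering::Acquire)
        .is_ok();
    (first, stale)
}

fn main() {
    assert_eq!(tagged_cas_demo(), (true, false));
}
```

On platforms with a double-width CAS (e.g. cmpxchg16b on x86-64), the same idea works with a full pointer plus a full-width counter instead of a packed index.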