ExamplesBy LevelBy TopicLearning Paths
454 Fundamental

454: Compare-and-Exchange (CAS)

Functional Programming

Tutorial

The Problem

Atomic increment (fetch_add) handles the simple case, but complex atomic updates — "set to max", "set if value is X", "conditional swap" — require compare_and_exchange (CAS). CAS is the universal primitive for lock-free algorithms: read the current value, compute the new value, atomically swap old→new only if the value is still what you read. If another thread changed it, retry. This retry loop is the foundation of all lock-free data structures: lock-free stacks, queues, linked lists, and hash maps.

CAS appears in Arc's reference count decrement (is-zero check), lock-free queue implementations, optimistic concurrency control, and AtomicPtr pointer swaps.

🎯 Learning Outcomes

  • β€’ Understand the CAS loop pattern: load β†’ compute β†’ compare_exchange β†’ retry on failure
  • β€’ Learn the difference between compare_exchange (strong, no spurious failure) and compare_exchange_weak (weak, may spuriously fail β€” preferred in loops)
  • β€’ See how atomic_max uses CAS to atomically update a maximum value
  • β€’ Understand the (success_ordering, failure_ordering) parameters
  • β€’ Learn the ABA problem and why some lock-free algorithms need versioned pointers
  • Code Example

    #![allow(clippy::all)]
    // 454. CAS: compare_exchange and loops
    use std::sync::atomic::{AtomicI64, AtomicUsize, Ordering};
    use std::sync::Arc;
    use std::thread;
    
    fn cas_increment(a: &AtomicUsize) {
        let mut cur = a.load(Ordering::Relaxed);
        loop {
            match a.compare_exchange_weak(cur, cur + 1, Ordering::AcqRel, Ordering::Relaxed) {
                Ok(_) => break,
                Err(actual) => cur = actual,
            }
        }
    }
    
    fn atomic_max(a: &AtomicI64, v: i64) {
        let mut cur = a.load(Ordering::Relaxed);
        loop {
            if v <= cur {
                break;
            }
            match a.compare_exchange_weak(cur, v, Ordering::AcqRel, Ordering::Relaxed) {
                Ok(_) => break,
                Err(actual) => cur = actual,
            }
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
        #[test]
        fn test_cas_inc() {
            let a = AtomicUsize::new(0);
            for _ in 0..100 {
                cas_increment(&a);
            }
            assert_eq!(a.load(Ordering::SeqCst), 100);
        }
        #[test]
        fn test_cas_fail() {
            let a = AtomicUsize::new(5);
            assert_eq!(
                a.compare_exchange(99, 0, Ordering::SeqCst, Ordering::SeqCst),
                Err(5)
            );
        }
        #[test]
        fn test_max() {
            let m = AtomicI64::new(i64::MIN);
            for v in [5, 3, 8, 1, 9, 2] {
                atomic_max(&m, v);
            }
            assert_eq!(m.load(Ordering::SeqCst), 9);
        }
    }

    Key Differences

  • Result vs. bool: Rust's compare_exchange returns Result<old_val, actual_val>; OCaml returns bool, requiring a separate Atomic.get to get the actual value on failure.
  • Weak variant: Rust has compare_exchange_weak for loop use; OCaml's single compare_and_set corresponds roughly to compare_exchange (strong).
  • ABA problem: Both languages' CAS operations are susceptible to the ABA problem; solutions require versioned pointers (u128 packing tag + pointer) or hazard pointers.
  • Ordering: Rust separates success and failure orderings; OCaml uses sequential consistency for all atomic operations.
  • OCaml Approach

    OCaml 5.x's Atomic.compare_and_set old_val new_val at is the CAS primitive. It returns a bool rather than Result<_, actual>, so you reload the value after failure. OCaml's CAS uses sequential consistency. A CAS-based counter: let cas_inc a = while not (let cur = Atomic.get a in Atomic.compare_and_set a cur (cur+1)) do () done.

    Full Source

    #![allow(clippy::all)]
    // 454. CAS: compare_exchange and loops
    use std::sync::atomic::{AtomicI64, AtomicUsize, Ordering};
    use std::sync::Arc;
    use std::thread;
    
    fn cas_increment(a: &AtomicUsize) {
        let mut cur = a.load(Ordering::Relaxed);
        loop {
            match a.compare_exchange_weak(cur, cur + 1, Ordering::AcqRel, Ordering::Relaxed) {
                Ok(_) => break,
                Err(actual) => cur = actual,
            }
        }
    }
    
    fn atomic_max(a: &AtomicI64, v: i64) {
        let mut cur = a.load(Ordering::Relaxed);
        loop {
            if v <= cur {
                break;
            }
            match a.compare_exchange_weak(cur, v, Ordering::AcqRel, Ordering::Relaxed) {
                Ok(_) => break,
                Err(actual) => cur = actual,
            }
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
        #[test]
        fn test_cas_inc() {
            let a = AtomicUsize::new(0);
            for _ in 0..100 {
                cas_increment(&a);
            }
            assert_eq!(a.load(Ordering::SeqCst), 100);
        }
        #[test]
        fn test_cas_fail() {
            let a = AtomicUsize::new(5);
            assert_eq!(
                a.compare_exchange(99, 0, Ordering::SeqCst, Ordering::SeqCst),
                Err(5)
            );
        }
        #[test]
        fn test_max() {
            let m = AtomicI64::new(i64::MIN);
            for v in [5, 3, 8, 1, 9, 2] {
                atomic_max(&m, v);
            }
            assert_eq!(m.load(Ordering::SeqCst), 9);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
        #[test]
        fn test_cas_inc() {
            let a = AtomicUsize::new(0);
            for _ in 0..100 {
                cas_increment(&a);
            }
            assert_eq!(a.load(Ordering::SeqCst), 100);
        }
        #[test]
        fn test_cas_fail() {
            let a = AtomicUsize::new(5);
            assert_eq!(
                a.compare_exchange(99, 0, Ordering::SeqCst, Ordering::SeqCst),
                Err(5)
            );
        }
        #[test]
        fn test_max() {
            let m = AtomicI64::new(i64::MIN);
            for v in [5, 3, 8, 1, 9, 2] {
                atomic_max(&m, v);
            }
            assert_eq!(m.load(Ordering::SeqCst), 9);
        }
    }

    Exercises

  • Atomic stack push: Implement lock-free stack push using CAS on AtomicPtr<Node<T>>. push allocates a node, sets its next to the current head, then CAS-swaps the head from old_head to new_node. Repeat on failure.
  • Versioned CAS (DCAS): On platforms with 128-bit atomics (using AtomicU128 or a crate), implement an (version, value) CAS that increments the version on every successful update, preventing the ABA problem.
  • CAS vs. mutex benchmark: Implement a shared counter using CAS loop and one using Mutex<u64>. Benchmark both with 8 threads each doing 1M increments. Plot throughput vs. number of threads from 1 to 16.
  • Open Source Repos