ExamplesBy LevelBy TopicLearning Paths
719 Fundamental

Struct of Arrays (SoA) Pattern

Functional Programming

Tutorial

The Problem

Modern CPUs execute instructions faster than memory can supply data. The bottleneck in data-intensive loops is not computation but cache misses: loading a 64-byte cache line that contains only 4 bytes of useful data wastes 94% of bandwidth. The Array of Structs (AoS) layoutβ€”the default in most languagesβ€”interleaves every field of every record, forcing the CPU to load fields it does not need. Struct of Arrays (SoA) separates each field into its own contiguous array, enabling full cache-line utilization when iterating over a single field.

The pattern matters whenever a hot loop reads or writes only a subset of fields: physics simulations reading positions to compute forces, renderers reading vertex positions to cull geometry, databases scanning a single column, or game engines querying entity health bars. In all these cases AoS pays a memory bandwidth tax proportional to the number of unused fields.

🎯 Learning Outcomes

  • β€’ Understand how CPU cache lines and spatial locality affect measured throughput
  • β€’ Implement both AoS and SoA layouts and benchmark the difference
  • β€’ Use #[repr(C)] and field alignment to reason about memory layouts
  • β€’ Apply the SoA pattern in iterators that remain safe and ergonomic
  • β€’ Recognize when SoA hurts (random access, frequent full-struct reads)
  • Code Example

    #[derive(Clone, Debug, Default, PartialEq)]
    pub struct ParticleAoS { pub x: f32, pub y: f32, pub z: f32, pub mass: f32 }
    
    pub fn sum_x_aos(particles: &[ParticleAoS]) -> f32 {
        particles.iter().map(|p| p.x).sum()
    }

    Key Differences

    AspectRustOCaml
    Default float layoutUnboxed, inlineBoxed (GC header per value)
    Float-only structContiguous, SIMD-readyFlat array optimization applies
    Mixed structInterleaved, inline fieldsEach non-float field boxed
    SoA ergonomicsVec<f32> per field, safeBigarray or manual arrays
    Auto-vectorizationCommon with contiguous slicesRare without Bigarray

    Rust's value semantics mean ParticleSoA fields are guaranteed contiguous without annotation. OCaml needs explicit Bigarray.Array1 for numeric-intensive SoA work.

    OCaml Approach

    OCaml's GC uses a uniform boxed representation: every float inside a record is heap-allocated with a header word unless the record contains only floats, in which case OCaml applies a float-array optimization. For mixed-field structs, AoS is unavoidable without manual Bigarray usage:

    (* Unboxed float arrays β€” cache-friendly for float-only data *)
    type particle_soa = {
      x    : float array;
      y    : float array;
      mass : float array;
    }
    
    let update_positions soa dt =
      Array.iteri (fun i _ ->
        soa.x.(i) <- soa.x.(i) +. soa.vx.(i) *. dt
      ) soa.x
    

    OCaml 5 introduces unboxed types (#float, #int) that reduce boxing overhead but full SoA still requires manual field separation.

    Full Source

    #![allow(clippy::all)]
    //! # 719: Struct of Arrays (SoA) vs Array of Structs (AoS)
    //!
    //! Demonstrates how data layout affects cache efficiency.
    //! AoS: `Vec<Particle>` β€” fields interleaved, bad for single-field iteration.
    //! SoA: `{xs, ys, zs, masses}` β€” each field is contiguous, cache-friendly.
    
    // ── Array of Structures (AoS) ─────────────────────────────────────────────────
    
    /// Classic OOP layout: one struct per element.
    /// Memory layout: `[x0,y0,z0,mass0, x1,y1,z1,mass1, ...]`.
    /// When iterating only `x`, 75 % of each cache line is wasted.
    #[derive(Clone, Debug, Default, PartialEq)]
    pub struct ParticleAoS {
        pub x: f32,
        pub y: f32,
        pub z: f32,
        pub mass: f32,
    }
    
    /// Sum all x-coordinates using AoS.
    /// Each element is 16 bytes; only 4 bytes (`x`) are used per iteration.
    pub fn sum_x_aos(particles: &[ParticleAoS]) -> f32 {
        particles.iter().map(|p| p.x).sum()
    }
    
    /// Apply gravity to y-coordinates using AoS.
    /// Even though only `y` changes, the entire struct is loaded per element.
    pub fn apply_gravity_aos(particles: &mut [ParticleAoS], dt: f32) {
        particles.iter_mut().for_each(|p| p.y -= 9.81 * p.mass * dt);
    }
    
    // ── Structure of Arrays (SoA) ─────────────────────────────────────────────────
    
    /// Cache-friendly layout: each field is a separate contiguous `Vec`.
    /// Memory layout: `[x0,x1,...,xN]` then `[y0,y1,...,yN]` etc.
    /// Iterating over `xs` accesses only the data you need.
    #[derive(Debug, Default)]
    pub struct ParticlesSoA {
        pub xs: Vec<f32>,
        pub ys: Vec<f32>,
        pub zs: Vec<f32>,
        pub masses: Vec<f32>,
    }
    
    impl ParticlesSoA {
        /// Build a SoA collection from an iterator of `(x, y, z, mass)` tuples.
        pub fn from_tuples(iter: impl IntoIterator<Item = (f32, f32, f32, f32)>) -> Self {
            let (xs, ys, zs, masses) = iter.into_iter().fold(
                (vec![], vec![], vec![], vec![]),
                |(mut xs, mut ys, mut zs, mut ms), (x, y, z, m)| {
                    xs.push(x);
                    ys.push(y);
                    zs.push(z);
                    ms.push(m);
                    (xs, ys, zs, ms)
                },
            );
            Self { xs, ys, zs, masses }
        }
    
        /// Number of particles.
        pub fn len(&self) -> usize {
            self.xs.len()
        }
    
        /// Returns true when there are no particles.
        pub fn is_empty(&self) -> bool {
            self.xs.is_empty()
        }
    }
    
    /// Sum all x-coordinates using SoA.
    /// Touches **only** `xs` β€” every byte in every cache line is useful.
    pub fn sum_x_soa(soa: &ParticlesSoA) -> f32 {
        soa.xs.iter().sum()
    }
    
    /// Apply gravity to y-coordinates using SoA.
    /// Only `ys` and `masses` are accessed β€” two contiguous passes, zero waste.
    pub fn apply_gravity_soa(soa: &mut ParticlesSoA, dt: f32) {
        soa.ys
            .iter_mut()
            .zip(soa.masses.iter())
            .for_each(|(y, &m)| *y -= 9.81 * m * dt);
    }
    
    // ── Conversion helpers ────────────────────────────────────────────────────────
    
    /// Convert AoS to SoA (useful when you collect from an API that returns AoS).
    pub fn aos_to_soa(particles: &[ParticleAoS]) -> ParticlesSoA {
        ParticlesSoA {
            xs: particles.iter().map(|p| p.x).collect(),
            ys: particles.iter().map(|p| p.y).collect(),
            zs: particles.iter().map(|p| p.z).collect(),
            masses: particles.iter().map(|p| p.mass).collect(),
        }
    }
    
    /// Convert SoA back to AoS (e.g. to pass one particle to an external API).
    pub fn soa_to_aos(soa: &ParticlesSoA) -> Vec<ParticleAoS> {
        soa.xs
            .iter()
            .zip(soa.ys.iter())
            .zip(soa.zs.iter())
            .zip(soa.masses.iter())
            .map(|(((x, y), z), m)| ParticleAoS {
                x: *x,
                y: *y,
                z: *z,
                mass: *m,
            })
            .collect()
    }
    
    // ── Tests ─────────────────────────────────────────────────────────────────────
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        fn sample_aos() -> Vec<ParticleAoS> {
            vec![
                ParticleAoS {
                    x: 1.0,
                    y: 2.0,
                    z: 3.0,
                    mass: 0.5,
                },
                ParticleAoS {
                    x: 4.0,
                    y: 5.0,
                    z: 6.0,
                    mass: 1.0,
                },
                ParticleAoS {
                    x: 7.0,
                    y: 8.0,
                    z: 9.0,
                    mass: 2.0,
                },
            ]
        }
    
        fn sample_soa() -> ParticlesSoA {
            ParticlesSoA::from_tuples([
                (1.0, 2.0, 3.0, 0.5),
                (4.0, 5.0, 6.0, 1.0),
                (7.0, 8.0, 9.0, 2.0),
            ])
        }
    
        #[test]
        fn test_sum_x_aos_and_soa_agree() {
            let aos = sample_aos();
            let soa = sample_soa();
            let expected = 1.0_f32 + 4.0 + 7.0;
            assert!((sum_x_aos(&aos) - expected).abs() < f32::EPSILON);
            assert!((sum_x_soa(&soa) - expected).abs() < f32::EPSILON);
        }
    
        #[test]
        fn test_sum_x_empty() {
            assert_eq!(sum_x_aos(&[]), 0.0_f32);
            assert_eq!(sum_x_soa(&ParticlesSoA::default()), 0.0_f32);
        }
    
        #[test]
        fn test_apply_gravity_aos() {
            let mut particles = sample_aos();
            // dt = 0 β†’ y unchanged
            apply_gravity_aos(&mut particles, 0.0);
            assert!((particles[0].y - 2.0).abs() < f32::EPSILON);
            // dt = 1, mass = 0.5 β†’ y -= 9.81 * 0.5 * 1 = 4.905
            apply_gravity_aos(&mut particles, 1.0);
            let expected_y0 = 2.0 - 9.81 * 0.5 * 1.0_f32;
            assert!((particles[0].y - expected_y0).abs() < 1e-5);
        }
    
        #[test]
        fn test_apply_gravity_soa() {
            let mut soa = sample_soa();
            apply_gravity_soa(&mut soa, 0.0);
            assert!((soa.ys[0] - 2.0).abs() < f32::EPSILON);
            apply_gravity_soa(&mut soa, 1.0);
            let expected_y0 = 2.0 - 9.81 * 0.5 * 1.0_f32;
            assert!((soa.ys[0] - expected_y0).abs() < 1e-5);
        }
    
        #[test]
        fn test_aos_to_soa_roundtrip() {
            let aos = sample_aos();
            let soa = aos_to_soa(&aos);
            let back = soa_to_aos(&soa);
            assert_eq!(aos, back);
        }
    
        #[test]
        fn test_soa_len_and_is_empty() {
            let soa = sample_soa();
            assert_eq!(soa.len(), 3);
            assert!(!soa.is_empty());
            assert!(ParticlesSoA::default().is_empty());
        }
    
        #[test]
        fn test_gravity_soa_and_aos_agree() {
            let mut aos = sample_aos();
            let mut soa = sample_soa();
            let dt = 0.016_f32;
            apply_gravity_aos(&mut aos, dt);
            apply_gravity_soa(&mut soa, dt);
            for (i, p) in aos.iter().enumerate() {
                assert!((p.y - soa.ys[i]).abs() < 1e-5, "y[{i}] mismatch");
            }
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        fn sample_aos() -> Vec<ParticleAoS> {
            vec![
                ParticleAoS {
                    x: 1.0,
                    y: 2.0,
                    z: 3.0,
                    mass: 0.5,
                },
                ParticleAoS {
                    x: 4.0,
                    y: 5.0,
                    z: 6.0,
                    mass: 1.0,
                },
                ParticleAoS {
                    x: 7.0,
                    y: 8.0,
                    z: 9.0,
                    mass: 2.0,
                },
            ]
        }
    
        fn sample_soa() -> ParticlesSoA {
            ParticlesSoA::from_tuples([
                (1.0, 2.0, 3.0, 0.5),
                (4.0, 5.0, 6.0, 1.0),
                (7.0, 8.0, 9.0, 2.0),
            ])
        }
    
        #[test]
        fn test_sum_x_aos_and_soa_agree() {
            let aos = sample_aos();
            let soa = sample_soa();
            let expected = 1.0_f32 + 4.0 + 7.0;
            assert!((sum_x_aos(&aos) - expected).abs() < f32::EPSILON);
            assert!((sum_x_soa(&soa) - expected).abs() < f32::EPSILON);
        }
    
        #[test]
        fn test_sum_x_empty() {
            assert_eq!(sum_x_aos(&[]), 0.0_f32);
            assert_eq!(sum_x_soa(&ParticlesSoA::default()), 0.0_f32);
        }
    
        #[test]
        fn test_apply_gravity_aos() {
            let mut particles = sample_aos();
            // dt = 0 β†’ y unchanged
            apply_gravity_aos(&mut particles, 0.0);
            assert!((particles[0].y - 2.0).abs() < f32::EPSILON);
            // dt = 1, mass = 0.5 β†’ y -= 9.81 * 0.5 * 1 = 4.905
            apply_gravity_aos(&mut particles, 1.0);
            let expected_y0 = 2.0 - 9.81 * 0.5 * 1.0_f32;
            assert!((particles[0].y - expected_y0).abs() < 1e-5);
        }
    
        #[test]
        fn test_apply_gravity_soa() {
            let mut soa = sample_soa();
            apply_gravity_soa(&mut soa, 0.0);
            assert!((soa.ys[0] - 2.0).abs() < f32::EPSILON);
            apply_gravity_soa(&mut soa, 1.0);
            let expected_y0 = 2.0 - 9.81 * 0.5 * 1.0_f32;
            assert!((soa.ys[0] - expected_y0).abs() < 1e-5);
        }
    
        #[test]
        fn test_aos_to_soa_roundtrip() {
            let aos = sample_aos();
            let soa = aos_to_soa(&aos);
            let back = soa_to_aos(&soa);
            assert_eq!(aos, back);
        }
    
        #[test]
        fn test_soa_len_and_is_empty() {
            let soa = sample_soa();
            assert_eq!(soa.len(), 3);
            assert!(!soa.is_empty());
            assert!(ParticlesSoA::default().is_empty());
        }
    
        #[test]
        fn test_gravity_soa_and_aos_agree() {
            let mut aos = sample_aos();
            let mut soa = sample_soa();
            let dt = 0.016_f32;
            apply_gravity_aos(&mut aos, dt);
            apply_gravity_soa(&mut soa, dt);
            for (i, p) in aos.iter().enumerate() {
                assert!((p.y - soa.ys[i]).abs() < 1e-5, "y[{i}] mismatch");
            }
        }
    }

    Deep Comparison

    OCaml vs Rust: Struct of Arrays vs Array of Structs

    Side-by-Side Code

    OCaml (AoS β€” sum x-coordinates)

    type particle = { x: float; y: float; z: float; mass: float }
    
    let sum_x_aos particles =
      Array.fold_left (fun acc p -> acc +. p.x) 0.0 particles
    

    OCaml (SoA β€” sum x-coordinates)

    type particles_soa = { xs: float array; ys: float array; zs: float array; masses: float array }
    
    let sum_x_soa soa =
      Array.fold_left (+.) 0.0 soa.xs
    

    Rust (AoS β€” idiomatic)

    #[derive(Clone, Debug, Default, PartialEq)]
    pub struct ParticleAoS { pub x: f32, pub y: f32, pub z: f32, pub mass: f32 }
    
    pub fn sum_x_aos(particles: &[ParticleAoS]) -> f32 {
        particles.iter().map(|p| p.x).sum()
    }
    

    Rust (SoA β€” idiomatic)

    #[derive(Debug, Default)]
    pub struct ParticlesSoA {
        pub xs: Vec<f32>, pub ys: Vec<f32>, pub zs: Vec<f32>, pub masses: Vec<f32>,
    }
    
    pub fn sum_x_soa(soa: &ParticlesSoA) -> f32 {
        soa.xs.iter().sum()
    }
    

    Rust (gravity update β€” SoA with zip)

    pub fn apply_gravity_soa(soa: &mut ParticlesSoA, dt: f32) {
        soa.ys
            .iter_mut()
            .zip(soa.masses.iter())
            .for_each(|(y, &m)| *y -= 9.81 * m * dt);
    }
    

    Type Signatures

    ConceptOCamlRust
    AoS elementtype particle = { x: float; ... }struct ParticleAoS { x: f32, ... }
    SoA containertype particles_soa = { xs: float array; ... }struct ParticlesSoA { xs: Vec<f32>, ... }
    Sum functionval sum_x_soa : particles_soa -> floatfn sum_x_soa(soa: &ParticlesSoA) -> f32
    MutationArray.iteri + index.iter_mut().zip(...)
    BorrowingN/A (GC)&[T] for read, &mut [T] for write

    Memory Layout

    AoS: [x0|y0|z0|m0|x1|y1|z1|m1|x2|y2|z2|m2|...]
          ^^^^                                       ← only this used when summing x
          cache line brings y0,z0,m0 for free β€” wasted
    
    SoA: [x0|x1|x2|...xN] [y0|y1|...yN] [z0|...] [m0|...]
         ^^^^^^^^^^^^^^^^^                          ← entire array used when summing x
          cache line is 100% useful
    

    Key Insights

  • Cache line utilisation: AoS loads an entire struct (16 bytes here) per iteration but uses only 4 bytes (x); SoA accesses a tightly packed f32 array β€” 16 values per 64-byte cache line vs. 4 in AoS. This is a 4Γ— difference in effective cache bandwidth for single-field workloads.
  • Iterator composition: Rust's .iter_mut().zip(other.iter()) lets the gravity update operate on two SoA columns simultaneously without allocating β€” the same pattern OCaml expresses with Array.iteri. Both compile to a simple sequential loop over contiguous memory.
  • Ownership encoding: Rust encodes the AoSβ†’SoA boundary at the type level. A function taking &[ParticleAoS] signals AoS semantics; one taking &ParticlesSoA signals SoA semantics. OCaml uses structural types but relies on programmer discipline for the same distinction.
  • Mutable aliasing safety: apply_gravity_soa takes &mut ParticlesSoA and simultaneously borrows ys mutably and masses immutably via zip. Rust enforces at compile time that no aliasing occurs β€” OCaml's GC permits this but provides no static guarantee.
  • Layout control: Rust gives precise control over struct layout via #[repr(C)], padding, and alignment. OCaml floats are boxed by default (except in float-only records), so AoS in OCaml already suffers an additional indirection overhead that Rust avoids entirely.
  • When to Use Each Style

    Use AoS when: entities are usually processed whole (e.g. serialising a single particle to JSON, passing one particle to a physics callback), or the collection is small enough to fit in L1 cache regardless of layout.

    Use SoA when: hot loops process one or two fields across thousands of entities (physics simulations, particle systems, SIMD vectorisation). The layout change alone routinely yields 2–10Γ— throughput improvements without algorithmic changes. SIMD intrinsics require SoA because they load a vector register from contiguous memory of a single type.

    Exercises

  • Benchmark ParticleAoS vs ParticleSoA position update with 1 million particles
  • using criterion. Measure with and without compiler auto-vectorization (-C target-cpu=native).

  • Implement a ParticleSoAIter that yields (x, y, z, vx, vy, vz, mass) tuples by
  • zipping the seven slices, maintaining borrowing safety.

  • Extend ParticleSoA to support sorting particles by mass while keeping all arrays
  • synchronized (use an index array or a sort-and-permute approach).

  • Implement an "AoSoA" (Array of Struct of Arrays) layout using [f32; 8] lanes per
  • field group to combine SIMD width with cache locality.

  • Profile a real workload (ray-sphere intersection, N-body gravity) to identify which
  • fields are hot and restructure accordingly.

    Open Source Repos