ExamplesBy LevelBy TopicLearning Paths
977 Advanced

977 Bitset

Functional Programming

Tutorial

The Problem

Implement a bitset — a fixed-size set of bits stored compactly in Vec<u64> words. Support set, clear, toggle, test, union (OR), intersection (AND), difference (AND NOT), and population count (popcount). Compare with OCaml's 63-bit integer bitset (one GC tag bit is reserved) and with Rust's full 64-bit words.

🎯 Learning Outcomes

  • • Map bit index i to word index i/64 and bit offset i%64
  • • Implement set, clear, toggle, test using |=, &= !, ^=, & != 0
  • • Implement set operations: union (|), intersection (&), difference (& !)
  • • Implement count_ones (population count) using u64::count_ones() summed over words
  • • Understand why OCaml uses 63-bit integers (one bit reserved for the GC tag) while Rust uses full 64-bit words
  • Code Example

    #![allow(clippy::all)]
    // 977: Bitset
    // Set, clear, toggle, union, intersection via u64 word arrays
    // OCaml uses 63-bit int (1 bit used for GC tag); Rust uses full u64
    
    pub struct Bitset {
        bits: Vec<u64>,
        size: usize,
    }
    
    impl Bitset {
        pub fn new(size: usize) -> Self {
            let words = (size + 63) / 64;
            Bitset {
                bits: vec![0u64; words],
                size,
            }
        }
    
        fn word(&self, i: usize) -> usize {
            i / 64
        }
        fn bit(&self, i: usize) -> u64 {
            1u64 << (i % 64)
        }
    
        pub fn set(&mut self, i: usize) {
            assert!(
                i < self.size,
                "index {} out of range (size={})",
                i,
                self.size
            );
            let (w, b) = (self.word(i), self.bit(i));
            self.bits[w] |= b;
        }
    
        pub fn clear(&mut self, i: usize) {
            assert!(i < self.size, "index out of range");
            let (w, b) = (self.word(i), self.bit(i));
            self.bits[w] &= !b;
        }
    
        pub fn toggle(&mut self, i: usize) {
            assert!(i < self.size, "index out of range");
            let (w, b) = (self.word(i), self.bit(i));
            self.bits[w] ^= b;
        }
    
        pub fn test(&self, i: usize) -> bool {
            if i >= self.size {
                return false;
            }
            (self.bits[self.word(i)] >> (i % 64)) & 1 == 1
        }
    
        /// Count of set bits (popcount)
        pub fn count(&self) -> u32 {
            self.bits.iter().map(|w| w.count_ones()).sum()
        }
    
        /// Union: returns new bitset with bits set in either
        pub fn union(&self, other: &Bitset) -> Bitset {
            assert_eq!(self.size, other.size);
            let bits = self
                .bits
                .iter()
                .zip(&other.bits)
                .map(|(a, b)| a | b)
                .collect();
            Bitset {
                bits,
                size: self.size,
            }
        }
    
        /// Intersection: bits set in both
        pub fn intersect(&self, other: &Bitset) -> Bitset {
            assert_eq!(self.size, other.size);
            let bits = self
                .bits
                .iter()
                .zip(&other.bits)
                .map(|(a, b)| a & b)
                .collect();
            Bitset {
                bits,
                size: self.size,
            }
        }
    
        /// Difference: bits in self but not other
        pub fn difference(&self, other: &Bitset) -> Bitset {
            assert_eq!(self.size, other.size);
            let bits = self
                .bits
                .iter()
                .zip(&other.bits)
                .map(|(a, b)| a & !b)
                .collect();
            Bitset {
                bits,
                size: self.size,
            }
        }
    
        /// Symmetric difference: bits in one but not both
        pub fn symmetric_difference(&self, other: &Bitset) -> Bitset {
            assert_eq!(self.size, other.size);
            let bits = self
                .bits
                .iter()
                .zip(&other.bits)
                .map(|(a, b)| a ^ b)
                .collect();
            Bitset {
                bits,
                size: self.size,
            }
        }
    
        /// Iterate set bit indices
        pub fn iter_ones(&self) -> Vec<usize> {
            let mut result = vec![];
            for (w_idx, &word) in self.bits.iter().enumerate() {
                let mut word = word;
                while word != 0 {
                    let bit = word.trailing_zeros() as usize;
                    result.push(w_idx * 64 + bit);
                    word &= word - 1; // clear lowest set bit
                }
            }
            result
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_set_test_clear() {
            let mut bs = Bitset::new(128);
            bs.set(0);
            bs.set(5);
            bs.set(63);
            bs.set(64);
            bs.set(127);
    
            assert!(bs.test(0));
            assert!(bs.test(5));
            assert!(bs.test(63));
            assert!(bs.test(64));
            assert!(bs.test(127));
            assert!(!bs.test(1));
            assert_eq!(bs.count(), 5);
    
            bs.clear(5);
            assert!(!bs.test(5));
            assert_eq!(bs.count(), 4);
        }
    
        #[test]
        fn test_toggle() {
            let mut bs = Bitset::new(64);
            bs.toggle(10);
            assert!(bs.test(10));
            bs.toggle(10);
            assert!(!bs.test(10));
        }
    
        #[test]
        fn test_union() {
            let mut a = Bitset::new(64);
            let mut b = Bitset::new(64);
            for i in 0..4 {
                a.set(i);
            }
            for i in 2..6 {
                b.set(i);
            }
            let u = a.union(&b);
            assert_eq!(u.count(), 6);
            assert_eq!(u.iter_ones(), vec![0, 1, 2, 3, 4, 5]);
        }
    
        #[test]
        fn test_intersect() {
            let mut a = Bitset::new(64);
            let mut b = Bitset::new(64);
            for i in 0..4 {
                a.set(i);
            }
            for i in 2..6 {
                b.set(i);
            }
            let i = a.intersect(&b);
            assert_eq!(i.count(), 2);
            assert!(i.test(2));
            assert!(i.test(3));
        }
    
        #[test]
        fn test_difference() {
            let mut a = Bitset::new(64);
            let mut b = Bitset::new(64);
            for i in 0..4 {
                a.set(i);
            }
            for i in 2..6 {
                b.set(i);
            }
            let d = a.difference(&b);
            assert_eq!(d.count(), 2);
            assert_eq!(d.iter_ones(), vec![0, 1]);
        }
    
        #[test]
        fn test_iter_ones() {
            let mut bs = Bitset::new(200);
            bs.set(0);
            bs.set(63);
            bs.set(64);
            bs.set(127);
            bs.set(128);
            bs.set(199);
            assert_eq!(bs.iter_ones(), vec![0, 63, 64, 127, 128, 199]);
        }
    }

    Key Differences

    AspectRustOCaml
    Word size64 bits (u64) — full hardware word63 bits (int) — one GC tag bit
    count_onesu64::count_ones() — hardware POPCNTNo stdlib popcount; loop manually
    NOT operator!lnot
    Bitwise AND&land
    PerformanceDirect 64-bit ops63-bit bounds, slightly lower density

    Bitsets are useful for dense boolean arrays, Sieve of Eratosthenes, graph adjacency matrices (for small graphs), and flag sets. The bit-vec and fixedbitset crates provide production-grade implementations.

    OCaml Approach

    (* OCaml int is 63 bits on 64-bit systems (1 bit for GC tag) *)
    type bitset = {
      words: int array;
      size: int;
    }
    
    let create size =
      { words = Array.make ((size + 62) / 63) 0; size }
    
    let set bs i =
      let w = i / 63 and b = i mod 63 in
      bs.words.(w) <- bs.words.(w) lor (1 lsl b)
    
    let test bs i =
      let w = i / 63 and b = i mod 63 in
      bs.words.(w) land (1 lsl b) <> 0
    
    (* Using Bytes for 8-bit words — avoids GC tag issue *)
    let create_bytes size = Bytes.make ((size + 7) / 8) '\000'
    

    OCaml's int is 63 bits on 64-bit platforms (one bit is the GC's tag bit). Using int for bitsets means 63 bits per word, not 64. Bytes (byte arrays) avoids this issue at the cost of 8 bits per word instead of 63.

    Full Source

    #![allow(clippy::all)]
    // 977: Bitset
    // Set, clear, toggle, union, intersection via u64 word arrays
    // OCaml uses 63-bit int (1 bit used for GC tag); Rust uses full u64
    
    pub struct Bitset {
        bits: Vec<u64>,
        size: usize,
    }
    
    impl Bitset {
        pub fn new(size: usize) -> Self {
            let words = (size + 63) / 64;
            Bitset {
                bits: vec![0u64; words],
                size,
            }
        }
    
        fn word(&self, i: usize) -> usize {
            i / 64
        }
        fn bit(&self, i: usize) -> u64 {
            1u64 << (i % 64)
        }
    
        pub fn set(&mut self, i: usize) {
            assert!(
                i < self.size,
                "index {} out of range (size={})",
                i,
                self.size
            );
            let (w, b) = (self.word(i), self.bit(i));
            self.bits[w] |= b;
        }
    
        pub fn clear(&mut self, i: usize) {
            assert!(i < self.size, "index out of range");
            let (w, b) = (self.word(i), self.bit(i));
            self.bits[w] &= !b;
        }
    
        pub fn toggle(&mut self, i: usize) {
            assert!(i < self.size, "index out of range");
            let (w, b) = (self.word(i), self.bit(i));
            self.bits[w] ^= b;
        }
    
        pub fn test(&self, i: usize) -> bool {
            if i >= self.size {
                return false;
            }
            (self.bits[self.word(i)] >> (i % 64)) & 1 == 1
        }
    
        /// Count of set bits (popcount)
        pub fn count(&self) -> u32 {
            self.bits.iter().map(|w| w.count_ones()).sum()
        }
    
        /// Union: returns new bitset with bits set in either
        pub fn union(&self, other: &Bitset) -> Bitset {
            assert_eq!(self.size, other.size);
            let bits = self
                .bits
                .iter()
                .zip(&other.bits)
                .map(|(a, b)| a | b)
                .collect();
            Bitset {
                bits,
                size: self.size,
            }
        }
    
        /// Intersection: bits set in both
        pub fn intersect(&self, other: &Bitset) -> Bitset {
            assert_eq!(self.size, other.size);
            let bits = self
                .bits
                .iter()
                .zip(&other.bits)
                .map(|(a, b)| a & b)
                .collect();
            Bitset {
                bits,
                size: self.size,
            }
        }
    
        /// Difference: bits in self but not other
        pub fn difference(&self, other: &Bitset) -> Bitset {
            assert_eq!(self.size, other.size);
            let bits = self
                .bits
                .iter()
                .zip(&other.bits)
                .map(|(a, b)| a & !b)
                .collect();
            Bitset {
                bits,
                size: self.size,
            }
        }
    
        /// Symmetric difference: bits in one but not both
        pub fn symmetric_difference(&self, other: &Bitset) -> Bitset {
            assert_eq!(self.size, other.size);
            let bits = self
                .bits
                .iter()
                .zip(&other.bits)
                .map(|(a, b)| a ^ b)
                .collect();
            Bitset {
                bits,
                size: self.size,
            }
        }
    
        /// Iterate set bit indices
        pub fn iter_ones(&self) -> Vec<usize> {
            let mut result = vec![];
            for (w_idx, &word) in self.bits.iter().enumerate() {
                let mut word = word;
                while word != 0 {
                    let bit = word.trailing_zeros() as usize;
                    result.push(w_idx * 64 + bit);
                    word &= word - 1; // clear lowest set bit
                }
            }
            result
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_set_test_clear() {
            let mut bs = Bitset::new(128);
            bs.set(0);
            bs.set(5);
            bs.set(63);
            bs.set(64);
            bs.set(127);
    
            assert!(bs.test(0));
            assert!(bs.test(5));
            assert!(bs.test(63));
            assert!(bs.test(64));
            assert!(bs.test(127));
            assert!(!bs.test(1));
            assert_eq!(bs.count(), 5);
    
            bs.clear(5);
            assert!(!bs.test(5));
            assert_eq!(bs.count(), 4);
        }
    
        #[test]
        fn test_toggle() {
            let mut bs = Bitset::new(64);
            bs.toggle(10);
            assert!(bs.test(10));
            bs.toggle(10);
            assert!(!bs.test(10));
        }
    
        #[test]
        fn test_union() {
            let mut a = Bitset::new(64);
            let mut b = Bitset::new(64);
            for i in 0..4 {
                a.set(i);
            }
            for i in 2..6 {
                b.set(i);
            }
            let u = a.union(&b);
            assert_eq!(u.count(), 6);
            assert_eq!(u.iter_ones(), vec![0, 1, 2, 3, 4, 5]);
        }
    
        #[test]
        fn test_intersect() {
            let mut a = Bitset::new(64);
            let mut b = Bitset::new(64);
            for i in 0..4 {
                a.set(i);
            }
            for i in 2..6 {
                b.set(i);
            }
            let i = a.intersect(&b);
            assert_eq!(i.count(), 2);
            assert!(i.test(2));
            assert!(i.test(3));
        }
    
        #[test]
        fn test_difference() {
            let mut a = Bitset::new(64);
            let mut b = Bitset::new(64);
            for i in 0..4 {
                a.set(i);
            }
            for i in 2..6 {
                b.set(i);
            }
            let d = a.difference(&b);
            assert_eq!(d.count(), 2);
            assert_eq!(d.iter_ones(), vec![0, 1]);
        }
    
        #[test]
        fn test_iter_ones() {
            let mut bs = Bitset::new(200);
            bs.set(0);
            bs.set(63);
            bs.set(64);
            bs.set(127);
            bs.set(128);
            bs.set(199);
            assert_eq!(bs.iter_ones(), vec![0, 63, 64, 127, 128, 199]);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_set_test_clear() {
            let mut bs = Bitset::new(128);
            bs.set(0);
            bs.set(5);
            bs.set(63);
            bs.set(64);
            bs.set(127);
    
            assert!(bs.test(0));
            assert!(bs.test(5));
            assert!(bs.test(63));
            assert!(bs.test(64));
            assert!(bs.test(127));
            assert!(!bs.test(1));
            assert_eq!(bs.count(), 5);
    
            bs.clear(5);
            assert!(!bs.test(5));
            assert_eq!(bs.count(), 4);
        }
    
        #[test]
        fn test_toggle() {
            let mut bs = Bitset::new(64);
            bs.toggle(10);
            assert!(bs.test(10));
            bs.toggle(10);
            assert!(!bs.test(10));
        }
    
        #[test]
        fn test_union() {
            let mut a = Bitset::new(64);
            let mut b = Bitset::new(64);
            for i in 0..4 {
                a.set(i);
            }
            for i in 2..6 {
                b.set(i);
            }
            let u = a.union(&b);
            assert_eq!(u.count(), 6);
            assert_eq!(u.iter_ones(), vec![0, 1, 2, 3, 4, 5]);
        }
    
        #[test]
        fn test_intersect() {
            let mut a = Bitset::new(64);
            let mut b = Bitset::new(64);
            for i in 0..4 {
                a.set(i);
            }
            for i in 2..6 {
                b.set(i);
            }
            let i = a.intersect(&b);
            assert_eq!(i.count(), 2);
            assert!(i.test(2));
            assert!(i.test(3));
        }
    
        #[test]
        fn test_difference() {
            let mut a = Bitset::new(64);
            let mut b = Bitset::new(64);
            for i in 0..4 {
                a.set(i);
            }
            for i in 2..6 {
                b.set(i);
            }
            let d = a.difference(&b);
            assert_eq!(d.count(), 2);
            assert_eq!(d.iter_ones(), vec![0, 1]);
        }
    
        #[test]
        fn test_iter_ones() {
            let mut bs = Bitset::new(200);
            bs.set(0);
            bs.set(63);
            bs.set(64);
            bs.set(127);
            bs.set(128);
            bs.set(199);
            assert_eq!(bs.iter_ones(), vec![0, 63, 64, 127, 128, 199]);
        }
    }

    Deep Comparison

    Bitset — Comparison

    Core Insight

    A bitset stores a set of integers compactly: 1 bit per element. Word-level operations (|, &, ^, ~) apply to 64 bits simultaneously. OCaml's int is 63-bit (the runtime uses bit 0 as a GC tag for boxing), so each word holds 63 bits. Rust uses u64 (all 64 bits). Rust also has hardware-accelerated count_ones() and trailing_zeros() as methods.

    OCaml Approach

  • int array — 63 bits per word (OCaml int is 63-bit on 64-bit systems)
  • lsl, lor, land, lxor, lnot — OCaml bitwise operators
  • words_for_bits n = (n + 62) / 63 — accounts for 63-bit words
  • • Manual popcount via while w <> 0 do w := w land (w-1); incr count done
  • Array.fold_left for counting across all words
  • • Set/clear/toggle return unit (mutate in place)
  • Rust Approach

  • Vec<u64> — 64 bits per word (full u64, no GC tag bit)
  • |=, &=, ^=, ! — Rust bitwise operators (same semantics)
  • (size + 63) / 64 words needed
  • word.count_ones() — hardware POPCNT instruction
  • word.trailing_zeros() — hardware BSF/CTZ for finding lowest set bit
  • word &= word - 1 — clear lowest set bit trick (same as OCaml)
  • .iter().zip(&other.bits).map(|(a,b)| a | b).collect() for set operations
  • Comparison Table

    AspectOCamlRust
    Word typeint (63-bit)u64 (64-bit)
    Words per N bits(N+62)/63(N+63)/64
    Bitwise ANDland&
    Bitwise ORlor\|
    Bitwise XORlxor^
    Bitwise NOTlnot!
    Left shiftlsl<<
    Right shiftlsr>>
    PopcountManual Brian Kernighan loopu64::count_ones() (hardware)
    Lowest set bitw land (w - 1) clearw.trailing_zeros() + w &= w-1
    Set operationsfor i = 0 to n-1 do result.(i) <- a.(i) lor b.(i) donea.iter().zip(b.iter()).map(\|(a,b)\| a\|b).collect()

    Exercises

  • Implement Bitset::from_indices(size, indices: &[usize]) that sets all specified bits.
  • Implement symmetric_difference(a, b) using XOR (^).
  • Implement is_subset(a, b) -> bool — returns true if every set bit in a is also set in b.
  • Implement first_set() -> Option<usize> — find the lowest set bit using trailing_zeros().
  • Implement the Sieve of Eratosthenes using a Bitset for an upper bound of 1,000,000.
  • Open Source Repos