977 Advanced

977 Bitset

Functional Programming

Tutorial

The Problem

Implement a bitset — a fixed-size set of bits stored compactly in Vec<u64> words. Support set, clear, toggle, test, union (OR), intersection (AND), difference (AND NOT), and population count (popcount). Compare with OCaml's 63-bit integer bitset (one GC tag bit is reserved) and with Rust's full 64-bit words.

🎯 Learning Outcomes

• Map bit index i to word index i/64 and bit offset i%64

• Implement set, clear, toggle, test using |=, &= !, ^=, & != 0

• Implement set operations: union (|), intersection (&), difference (& !)

• Implement count_ones (population count) using u64::count_ones() summed over words

• Understand why OCaml uses 63-bit integers (one bit reserved for the GC tag) while Rust uses full 64-bit words

Code Example

#![allow(clippy::all)]
// 977: Bitset
// Set, clear, toggle, union, intersection via u64 word arrays
// OCaml uses 63-bit int (1 bit used for GC tag); Rust uses full u64

pub struct Bitset {
    bits: Vec<u64>,
    size: usize,
}

impl Bitset {
    pub fn new(size: usize) -> Self {
        let words = (size + 63) / 64;
        Bitset {
            bits: vec![0u64; words],
            size,
        }
    }

    fn word(&self, i: usize) -> usize {
        i / 64
    }
    fn bit(&self, i: usize) -> u64 {
        1u64 << (i % 64)
    }

    pub fn set(&mut self, i: usize) {
        assert!(
            i < self.size,
            "index {} out of range (size={})",
            i,
            self.size
        );
        let (w, b) = (self.word(i), self.bit(i));
        self.bits[w] |= b;
    }

    pub fn clear(&mut self, i: usize) {
        assert!(i < self.size, "index out of range");
        let (w, b) = (self.word(i), self.bit(i));
        self.bits[w] &= !b;
    }

    pub fn toggle(&mut self, i: usize) {
        assert!(i < self.size, "index out of range");
        let (w, b) = (self.word(i), self.bit(i));
        self.bits[w] ^= b;
    }

    pub fn test(&self, i: usize) -> bool {
        if i >= self.size {
            return false;
        }
        (self.bits[self.word(i)] >> (i % 64)) & 1 == 1
    }

    /// Count of set bits (popcount)
    pub fn count(&self) -> u32 {
        self.bits.iter().map(|w| w.count_ones()).sum()
    }

    /// Union: returns new bitset with bits set in either
    pub fn union(&self, other: &Bitset) -> Bitset {
        assert_eq!(self.size, other.size);
        let bits = self
            .bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| a | b)
            .collect();
        Bitset {
            bits,
            size: self.size,
        }
    }

    /// Intersection: bits set in both
    pub fn intersect(&self, other: &Bitset) -> Bitset {
        assert_eq!(self.size, other.size);
        let bits = self
            .bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| a & b)
            .collect();
        Bitset {
            bits,
            size: self.size,
        }
    }

    /// Difference: bits in self but not other
    pub fn difference(&self, other: &Bitset) -> Bitset {
        assert_eq!(self.size, other.size);
        let bits = self
            .bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| a & !b)
            .collect();
        Bitset {
            bits,
            size: self.size,
        }
    }

    /// Symmetric difference: bits in one but not both
    pub fn symmetric_difference(&self, other: &Bitset) -> Bitset {
        assert_eq!(self.size, other.size);
        let bits = self
            .bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| a ^ b)
            .collect();
        Bitset {
            bits,
            size: self.size,
        }
    }

    /// Iterate set bit indices
    pub fn iter_ones(&self) -> Vec<usize> {
        let mut result = vec![];
        for (w_idx, &word) in self.bits.iter().enumerate() {
            let mut word = word;
            while word != 0 {
                let bit = word.trailing_zeros() as usize;
                result.push(w_idx * 64 + bit);
                word &= word - 1; // clear lowest set bit
            }
        }
        result
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_set_test_clear() {
        let mut bs = Bitset::new(128);
        bs.set(0);
        bs.set(5);
        bs.set(63);
        bs.set(64);
        bs.set(127);

        assert!(bs.test(0));
        assert!(bs.test(5));
        assert!(bs.test(63));
        assert!(bs.test(64));
        assert!(bs.test(127));
        assert!(!bs.test(1));
        assert_eq!(bs.count(), 5);

        bs.clear(5);
        assert!(!bs.test(5));
        assert_eq!(bs.count(), 4);
    }

    #[test]
    fn test_toggle() {
        let mut bs = Bitset::new(64);
        bs.toggle(10);
        assert!(bs.test(10));
        bs.toggle(10);
        assert!(!bs.test(10));
    }

    #[test]
    fn test_union() {
        let mut a = Bitset::new(64);
        let mut b = Bitset::new(64);
        for i in 0..4 {
            a.set(i);
        }
        for i in 2..6 {
            b.set(i);
        }
        let u = a.union(&b);
        assert_eq!(u.count(), 6);
        assert_eq!(u.iter_ones(), vec![0, 1, 2, 3, 4, 5]);
    }

    #[test]
    fn test_intersect() {
        let mut a = Bitset::new(64);
        let mut b = Bitset::new(64);
        for i in 0..4 {
            a.set(i);
        }
        for i in 2..6 {
            b.set(i);
        }
        let i = a.intersect(&b);
        assert_eq!(i.count(), 2);
        assert!(i.test(2));
        assert!(i.test(3));
    }

    #[test]
    fn test_difference() {
        let mut a = Bitset::new(64);
        let mut b = Bitset::new(64);
        for i in 0..4 {
            a.set(i);
        }
        for i in 2..6 {
            b.set(i);
        }
        let d = a.difference(&b);
        assert_eq!(d.count(), 2);
        assert_eq!(d.iter_ones(), vec![0, 1]);
    }

    #[test]
    fn test_iter_ones() {
        let mut bs = Bitset::new(200);
        bs.set(0);
        bs.set(63);
        bs.set(64);
        bs.set(127);
        bs.set(128);
        bs.set(199);
        assert_eq!(bs.iter_ones(), vec![0, 63, 64, 127, 128, 199]);
    }
}

(* 977: Bitset Operations *)
(* Set, clear, toggle, union, intersection using u64 word arrays *)

let words_for_bits n = (n + 62) / 63  (* OCaml int is 63-bit on 64-bit systems *)

type bitset = {
  bits: int array;
  size: int;
}

let create size = { bits = Array.make (words_for_bits size) 0; size }

let word_index i = i / 63
let bit_index i = i mod 63

let set bs i =
  if i >= bs.size then failwith "out of range";
  bs.bits.(word_index i) <- bs.bits.(word_index i) lor (1 lsl bit_index i)

let clear bs i =
  if i >= bs.size then failwith "out of range";
  bs.bits.(word_index i) <- bs.bits.(word_index i) land (lnot (1 lsl bit_index i))

let toggle bs i =
  if i >= bs.size then failwith "out of range";
  bs.bits.(word_index i) <- bs.bits.(word_index i) lxor (1 lsl bit_index i)

let test bs i =
  if i >= bs.size then false
  else (bs.bits.(word_index i) lsr bit_index i) land 1 = 1

let popcount w =
  let w = ref w in
  let count = ref 0 in
  while !w <> 0 do
    w := !w land (!w - 1);  (* clear lowest set bit *)
    incr count
  done;
  !count

let count bs =
  Array.fold_left (fun acc w -> acc + popcount w) 0 bs.bits

let union a b =
  assert (a.size = b.size);
  let result = create a.size in
  for i = 0 to Array.length a.bits - 1 do
    result.bits.(i) <- a.bits.(i) lor b.bits.(i)
  done;
  result

let intersect a b =
  assert (a.size = b.size);
  let result = create a.size in
  for i = 0 to Array.length a.bits - 1 do
    result.bits.(i) <- a.bits.(i) land b.bits.(i)
  done;
  result

let difference a b =
  assert (a.size = b.size);
  let result = create a.size in
  for i = 0 to Array.length a.bits - 1 do
    result.bits.(i) <- a.bits.(i) land (lnot b.bits.(i))
  done;
  result

let () =
  let bs = create 64 in
  set bs 0;
  set bs 5;
  set bs 10;
  set bs 63;

  assert (test bs 0);
  assert (test bs 5);
  assert (test bs 10);
  assert (test bs 63);
  assert (not (test bs 1));
  assert (count bs = 4);

  clear bs 5;
  assert (not (test bs 5));
  assert (count bs = 3);

  toggle bs 5;
  assert (test bs 5);
  toggle bs 5;
  assert (not (test bs 5));

  let a = create 64 in
  let b = create 64 in
  List.iter (set a) [0; 1; 2; 3];
  List.iter (set b) [2; 3; 4; 5];

  let u = union a b in
  assert (count u = 6);

  let i = intersect a b in
  assert (count i = 2);
  assert (test i 2);
  assert (test i 3);

  let d = difference a b in
  assert (count d = 2);
  assert (test d 0);
  assert (test d 1);

  Printf.printf "✓ All tests passed\n"

Key Differences

Aspect	Rust	OCaml
Word size	64 bits (`u64`) — full hardware word	63 bits (`int`) — one GC tag bit
`count_ones`	`u64::count_ones()` — hardware POPCNT	No stdlib popcount; loop manually
NOT operator	`!`	`lnot`
Bitwise AND	`&`	`land`
Performance	Direct 64-bit ops	63-bit bounds, slightly lower density

Bitsets are useful for dense boolean arrays, Sieve of Eratosthenes, graph adjacency matrices (for small graphs), and flag sets. The bit-vec and fixedbitset crates provide production-grade implementations.

OCaml Approach

(* OCaml int is 63 bits on 64-bit systems (1 bit for GC tag) *)
type bitset = {
  words: int array;
  size: int;
}

let create size =
  { words = Array.make ((size + 62) / 63) 0; size }

let set bs i =
  let w = i / 63 and b = i mod 63 in
  bs.words.(w) <- bs.words.(w) lor (1 lsl b)

let test bs i =
  let w = i / 63 and b = i mod 63 in
  bs.words.(w) land (1 lsl b) <> 0

(* Using Bytes for 8-bit words — avoids GC tag issue *)
let create_bytes size = Bytes.make ((size + 7) / 8) '\000'

OCaml's int is 63 bits on 64-bit platforms (one bit is the GC's tag bit). Using int for bitsets means 63 bits per word, not 64. Bytes (byte arrays) avoids this issue at the cost of 8 bits per word instead of 63.

Full Source

#![allow(clippy::all)]
// 977: Bitset
// Set, clear, toggle, union, intersection via u64 word arrays
// OCaml uses 63-bit int (1 bit used for GC tag); Rust uses full u64

pub struct Bitset {
    bits: Vec<u64>,
    size: usize,
}

impl Bitset {
    pub fn new(size: usize) -> Self {
        let words = (size + 63) / 64;
        Bitset {
            bits: vec![0u64; words],
            size,
        }
    }

    fn word(&self, i: usize) -> usize {
        i / 64
    }
    fn bit(&self, i: usize) -> u64 {
        1u64 << (i % 64)
    }

    pub fn set(&mut self, i: usize) {
        assert!(
            i < self.size,
            "index {} out of range (size={})",
            i,
            self.size
        );
        let (w, b) = (self.word(i), self.bit(i));
        self.bits[w] |= b;
    }

    pub fn clear(&mut self, i: usize) {
        assert!(i < self.size, "index out of range");
        let (w, b) = (self.word(i), self.bit(i));
        self.bits[w] &= !b;
    }

    pub fn toggle(&mut self, i: usize) {
        assert!(i < self.size, "index out of range");
        let (w, b) = (self.word(i), self.bit(i));
        self.bits[w] ^= b;
    }

    pub fn test(&self, i: usize) -> bool {
        if i >= self.size {
            return false;
        }
        (self.bits[self.word(i)] >> (i % 64)) & 1 == 1
    }

    /// Count of set bits (popcount)
    pub fn count(&self) -> u32 {
        self.bits.iter().map(|w| w.count_ones()).sum()
    }

    /// Union: returns new bitset with bits set in either
    pub fn union(&self, other: &Bitset) -> Bitset {
        assert_eq!(self.size, other.size);
        let bits = self
            .bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| a | b)
            .collect();
        Bitset {
            bits,
            size: self.size,
        }
    }

    /// Intersection: bits set in both
    pub fn intersect(&self, other: &Bitset) -> Bitset {
        assert_eq!(self.size, other.size);
        let bits = self
            .bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| a & b)
            .collect();
        Bitset {
            bits,
            size: self.size,
        }
    }

    /// Difference: bits in self but not other
    pub fn difference(&self, other: &Bitset) -> Bitset {
        assert_eq!(self.size, other.size);
        let bits = self
            .bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| a & !b)
            .collect();
        Bitset {
            bits,
            size: self.size,
        }
    }

    /// Symmetric difference: bits in one but not both
    pub fn symmetric_difference(&self, other: &Bitset) -> Bitset {
        assert_eq!(self.size, other.size);
        let bits = self
            .bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| a ^ b)
            .collect();
        Bitset {
            bits,
            size: self.size,
        }
    }

    /// Iterate set bit indices
    pub fn iter_ones(&self) -> Vec<usize> {
        let mut result = vec![];
        for (w_idx, &word) in self.bits.iter().enumerate() {
            let mut word = word;
            while word != 0 {
                let bit = word.trailing_zeros() as usize;
                result.push(w_idx * 64 + bit);
                word &= word - 1; // clear lowest set bit
            }
        }
        result
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_set_test_clear() {
        let mut bs = Bitset::new(128);
        bs.set(0);
        bs.set(5);
        bs.set(63);
        bs.set(64);
        bs.set(127);

        assert!(bs.test(0));
        assert!(bs.test(5));
        assert!(bs.test(63));
        assert!(bs.test(64));
        assert!(bs.test(127));
        assert!(!bs.test(1));
        assert_eq!(bs.count(), 5);

        bs.clear(5);
        assert!(!bs.test(5));
        assert_eq!(bs.count(), 4);
    }

    #[test]
    fn test_toggle() {
        let mut bs = Bitset::new(64);
        bs.toggle(10);
        assert!(bs.test(10));
        bs.toggle(10);
        assert!(!bs.test(10));
    }

    #[test]
    fn test_union() {
        let mut a = Bitset::new(64);
        let mut b = Bitset::new(64);
        for i in 0..4 {
            a.set(i);
        }
        for i in 2..6 {
            b.set(i);
        }
        let u = a.union(&b);
        assert_eq!(u.count(), 6);
        assert_eq!(u.iter_ones(), vec![0, 1, 2, 3, 4, 5]);
    }

    #[test]
    fn test_intersect() {
        let mut a = Bitset::new(64);
        let mut b = Bitset::new(64);
        for i in 0..4 {
            a.set(i);
        }
        for i in 2..6 {
            b.set(i);
        }
        let i = a.intersect(&b);
        assert_eq!(i.count(), 2);
        assert!(i.test(2));
        assert!(i.test(3));
    }

    #[test]
    fn test_difference() {
        let mut a = Bitset::new(64);
        let mut b = Bitset::new(64);
        for i in 0..4 {
            a.set(i);
        }
        for i in 2..6 {
            b.set(i);
        }
        let d = a.difference(&b);
        assert_eq!(d.count(), 2);
        assert_eq!(d.iter_ones(), vec![0, 1]);
    }

    #[test]
    fn test_iter_ones() {
        let mut bs = Bitset::new(200);
        bs.set(0);
        bs.set(63);
        bs.set(64);
        bs.set(127);
        bs.set(128);
        bs.set(199);
        assert_eq!(bs.iter_ones(), vec![0, 63, 64, 127, 128, 199]);
    }
}

(* 977: Bitset Operations *)
(* Set, clear, toggle, union, intersection using u64 word arrays *)

let words_for_bits n = (n + 62) / 63  (* OCaml int is 63-bit on 64-bit systems *)

type bitset = {
  bits: int array;
  size: int;
}

let create size = { bits = Array.make (words_for_bits size) 0; size }

let word_index i = i / 63
let bit_index i = i mod 63

let set bs i =
  if i >= bs.size then failwith "out of range";
  bs.bits.(word_index i) <- bs.bits.(word_index i) lor (1 lsl bit_index i)

let clear bs i =
  if i >= bs.size then failwith "out of range";
  bs.bits.(word_index i) <- bs.bits.(word_index i) land (lnot (1 lsl bit_index i))

let toggle bs i =
  if i >= bs.size then failwith "out of range";
  bs.bits.(word_index i) <- bs.bits.(word_index i) lxor (1 lsl bit_index i)

let test bs i =
  if i >= bs.size then false
  else (bs.bits.(word_index i) lsr bit_index i) land 1 = 1

let popcount w =
  let w = ref w in
  let count = ref 0 in
  while !w <> 0 do
    w := !w land (!w - 1);  (* clear lowest set bit *)
    incr count
  done;
  !count

let count bs =
  Array.fold_left (fun acc w -> acc + popcount w) 0 bs.bits

let union a b =
  assert (a.size = b.size);
  let result = create a.size in
  for i = 0 to Array.length a.bits - 1 do
    result.bits.(i) <- a.bits.(i) lor b.bits.(i)
  done;
  result

let intersect a b =
  assert (a.size = b.size);
  let result = create a.size in
  for i = 0 to Array.length a.bits - 1 do
    result.bits.(i) <- a.bits.(i) land b.bits.(i)
  done;
  result

let difference a b =
  assert (a.size = b.size);
  let result = create a.size in
  for i = 0 to Array.length a.bits - 1 do
    result.bits.(i) <- a.bits.(i) land (lnot b.bits.(i))
  done;
  result

let () =
  let bs = create 64 in
  set bs 0;
  set bs 5;
  set bs 10;
  set bs 63;

  assert (test bs 0);
  assert (test bs 5);
  assert (test bs 10);
  assert (test bs 63);
  assert (not (test bs 1));
  assert (count bs = 4);

  clear bs 5;
  assert (not (test bs 5));
  assert (count bs = 3);

  toggle bs 5;
  assert (test bs 5);
  toggle bs 5;
  assert (not (test bs 5));

  let a = create 64 in
  let b = create 64 in
  List.iter (set a) [0; 1; 2; 3];
  List.iter (set b) [2; 3; 4; 5];

  let u = union a b in
  assert (count u = 6);

  let i = intersect a b in
  assert (count i = 2);
  assert (test i 2);
  assert (test i 3);

  let d = difference a b in
  assert (count d = 2);
  assert (test d 0);
  assert (test d 1);

  Printf.printf "✓ All tests passed\n"

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_set_test_clear() {
        let mut bs = Bitset::new(128);
        bs.set(0);
        bs.set(5);
        bs.set(63);
        bs.set(64);
        bs.set(127);

        assert!(bs.test(0));
        assert!(bs.test(5));
        assert!(bs.test(63));
        assert!(bs.test(64));
        assert!(bs.test(127));
        assert!(!bs.test(1));
        assert_eq!(bs.count(), 5);

        bs.clear(5);
        assert!(!bs.test(5));
        assert_eq!(bs.count(), 4);
    }

    #[test]
    fn test_toggle() {
        let mut bs = Bitset::new(64);
        bs.toggle(10);
        assert!(bs.test(10));
        bs.toggle(10);
        assert!(!bs.test(10));
    }

    #[test]
    fn test_union() {
        let mut a = Bitset::new(64);
        let mut b = Bitset::new(64);
        for i in 0..4 {
            a.set(i);
        }
        for i in 2..6 {
            b.set(i);
        }
        let u = a.union(&b);
        assert_eq!(u.count(), 6);
        assert_eq!(u.iter_ones(), vec![0, 1, 2, 3, 4, 5]);
    }

    #[test]
    fn test_intersect() {
        let mut a = Bitset::new(64);
        let mut b = Bitset::new(64);
        for i in 0..4 {
            a.set(i);
        }
        for i in 2..6 {
            b.set(i);
        }
        let i = a.intersect(&b);
        assert_eq!(i.count(), 2);
        assert!(i.test(2));
        assert!(i.test(3));
    }

    #[test]
    fn test_difference() {
        let mut a = Bitset::new(64);
        let mut b = Bitset::new(64);
        for i in 0..4 {
            a.set(i);
        }
        for i in 2..6 {
            b.set(i);
        }
        let d = a.difference(&b);
        assert_eq!(d.count(), 2);
        assert_eq!(d.iter_ones(), vec![0, 1]);
    }

    #[test]
    fn test_iter_ones() {
        let mut bs = Bitset::new(200);
        bs.set(0);
        bs.set(63);
        bs.set(64);
        bs.set(127);
        bs.set(128);
        bs.set(199);
        assert_eq!(bs.iter_ones(), vec![0, 63, 64, 127, 128, 199]);
    }
}

Deep Comparison

Bitset — Comparison

Core Insight

A bitset stores a set of integers compactly: 1 bit per element. Word-level operations (|, &, ^, ~) apply to 64 bits simultaneously. OCaml's int is 63-bit (the runtime uses bit 0 as a GC tag for boxing), so each word holds 63 bits. Rust uses u64 (all 64 bits). Rust also has hardware-accelerated count_ones() and trailing_zeros() as methods.

OCaml Approach

• int array — 63 bits per word (OCaml int is 63-bit on 64-bit systems)

• lsl, lor, land, lxor, lnot — OCaml bitwise operators

• words_for_bits n = (n + 62) / 63 — accounts for 63-bit words

• Manual popcount via while w <> 0 do w := w land (w-1); incr count done

• Array.fold_left for counting across all words

• Set/clear/toggle return unit (mutate in place)

Rust Approach

• Vec<u64> — 64 bits per word (full u64, no GC tag bit)

• |=, &=, ^=, ! — Rust bitwise operators (same semantics)

• (size + 63) / 64 words needed

• word.count_ones() — hardware POPCNT instruction

• word.trailing_zeros() — hardware BSF/CTZ for finding lowest set bit

• word &= word - 1 — clear lowest set bit trick (same as OCaml)

• .iter().zip(&other.bits).map(|(a,b)| a | b).collect() for set operations

Comparison Table

Aspect	OCaml	Rust
Word type	`int` (63-bit)	`u64` (64-bit)
Words per N bits	`(N+62)/63`	`(N+63)/64`
Bitwise AND	`land`	`&`
Bitwise OR	`lor`	`\\|`
Bitwise XOR	`lxor`	`^`
Bitwise NOT	`lnot`	`!`
Left shift	`lsl`	`<<`
Right shift	`lsr`	`>>`
Popcount	Manual Brian Kernighan loop	`u64::count_ones()` (hardware)
Lowest set bit	`w land (w - 1)` clear	`w.trailing_zeros()` + `w &= w-1`
Set operations	`for i = 0 to n-1 do result.(i) <- a.(i) lor b.(i) done`	`a.iter().zip(b.iter()).map(\\|(a,b)\\| a\\|b).collect()`

Exercises

Implement Bitset::from_indices(size, indices: &[usize]) that sets all specified bits.

Implement symmetric_difference(a, b) using XOR (^).

Implement is_subset(a, b) -> bool — returns true if every set bit in a is also set in b.

Implement first_set() -> Option<usize> — find the lowest set bit using trailing_zeros().

Implement the Sieve of Eratosthenes using a Bitset for an upper bound of 1,000,000.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust