977 Bitset
Tutorial
The Problem
Implement a bitset — a fixed-size set of bits stored compactly in Vec<u64> words. Support set, clear, toggle, test, union (OR), intersection (AND), difference (AND NOT), and population count (popcount). Compare with OCaml's 63-bit integer bitset (one GC tag bit is reserved) and with Rust's full 64-bit words.
🎯 Learning Outcomes
i to word index i/64 and bit offset i%64set, clear, toggle, test using |=, &= !, ^=, & != 0union (|), intersection (&), difference (& !)count_ones (population count) using u64::count_ones() summed over wordsCode Example
#![allow(clippy::all)]
// 977: Bitset
// Set, clear, toggle, union, intersection via u64 word arrays
// OCaml uses 63-bit int (1 bit used for GC tag); Rust uses full u64
pub struct Bitset {
bits: Vec<u64>,
size: usize,
}
impl Bitset {
pub fn new(size: usize) -> Self {
let words = (size + 63) / 64;
Bitset {
bits: vec![0u64; words],
size,
}
}
fn word(&self, i: usize) -> usize {
i / 64
}
fn bit(&self, i: usize) -> u64 {
1u64 << (i % 64)
}
pub fn set(&mut self, i: usize) {
assert!(
i < self.size,
"index {} out of range (size={})",
i,
self.size
);
let (w, b) = (self.word(i), self.bit(i));
self.bits[w] |= b;
}
pub fn clear(&mut self, i: usize) {
assert!(i < self.size, "index out of range");
let (w, b) = (self.word(i), self.bit(i));
self.bits[w] &= !b;
}
pub fn toggle(&mut self, i: usize) {
assert!(i < self.size, "index out of range");
let (w, b) = (self.word(i), self.bit(i));
self.bits[w] ^= b;
}
pub fn test(&self, i: usize) -> bool {
if i >= self.size {
return false;
}
(self.bits[self.word(i)] >> (i % 64)) & 1 == 1
}
/// Count of set bits (popcount)
pub fn count(&self) -> u32 {
self.bits.iter().map(|w| w.count_ones()).sum()
}
/// Union: returns new bitset with bits set in either
pub fn union(&self, other: &Bitset) -> Bitset {
assert_eq!(self.size, other.size);
let bits = self
.bits
.iter()
.zip(&other.bits)
.map(|(a, b)| a | b)
.collect();
Bitset {
bits,
size: self.size,
}
}
/// Intersection: bits set in both
pub fn intersect(&self, other: &Bitset) -> Bitset {
assert_eq!(self.size, other.size);
let bits = self
.bits
.iter()
.zip(&other.bits)
.map(|(a, b)| a & b)
.collect();
Bitset {
bits,
size: self.size,
}
}
/// Difference: bits in self but not other
pub fn difference(&self, other: &Bitset) -> Bitset {
assert_eq!(self.size, other.size);
let bits = self
.bits
.iter()
.zip(&other.bits)
.map(|(a, b)| a & !b)
.collect();
Bitset {
bits,
size: self.size,
}
}
/// Symmetric difference: bits in one but not both
pub fn symmetric_difference(&self, other: &Bitset) -> Bitset {
assert_eq!(self.size, other.size);
let bits = self
.bits
.iter()
.zip(&other.bits)
.map(|(a, b)| a ^ b)
.collect();
Bitset {
bits,
size: self.size,
}
}
/// Iterate set bit indices
pub fn iter_ones(&self) -> Vec<usize> {
let mut result = vec![];
for (w_idx, &word) in self.bits.iter().enumerate() {
let mut word = word;
while word != 0 {
let bit = word.trailing_zeros() as usize;
result.push(w_idx * 64 + bit);
word &= word - 1; // clear lowest set bit
}
}
result
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_set_test_clear() {
let mut bs = Bitset::new(128);
bs.set(0);
bs.set(5);
bs.set(63);
bs.set(64);
bs.set(127);
assert!(bs.test(0));
assert!(bs.test(5));
assert!(bs.test(63));
assert!(bs.test(64));
assert!(bs.test(127));
assert!(!bs.test(1));
assert_eq!(bs.count(), 5);
bs.clear(5);
assert!(!bs.test(5));
assert_eq!(bs.count(), 4);
}
#[test]
fn test_toggle() {
let mut bs = Bitset::new(64);
bs.toggle(10);
assert!(bs.test(10));
bs.toggle(10);
assert!(!bs.test(10));
}
#[test]
fn test_union() {
let mut a = Bitset::new(64);
let mut b = Bitset::new(64);
for i in 0..4 {
a.set(i);
}
for i in 2..6 {
b.set(i);
}
let u = a.union(&b);
assert_eq!(u.count(), 6);
assert_eq!(u.iter_ones(), vec![0, 1, 2, 3, 4, 5]);
}
#[test]
fn test_intersect() {
let mut a = Bitset::new(64);
let mut b = Bitset::new(64);
for i in 0..4 {
a.set(i);
}
for i in 2..6 {
b.set(i);
}
let i = a.intersect(&b);
assert_eq!(i.count(), 2);
assert!(i.test(2));
assert!(i.test(3));
}
#[test]
fn test_difference() {
let mut a = Bitset::new(64);
let mut b = Bitset::new(64);
for i in 0..4 {
a.set(i);
}
for i in 2..6 {
b.set(i);
}
let d = a.difference(&b);
assert_eq!(d.count(), 2);
assert_eq!(d.iter_ones(), vec![0, 1]);
}
#[test]
fn test_iter_ones() {
let mut bs = Bitset::new(200);
bs.set(0);
bs.set(63);
bs.set(64);
bs.set(127);
bs.set(128);
bs.set(199);
assert_eq!(bs.iter_ones(), vec![0, 63, 64, 127, 128, 199]);
}
}Key Differences
| Aspect | Rust | OCaml | |
|---|---|---|---|
| Word size | 64 bits (u64) — full hardware word | 63 bits (int) — one GC tag bit | |
count_ones | u64::count_ones() — hardware POPCNT | No stdlib popcount; loop manually | |
| NOT operator | ! | lnot | |
| Bitwise AND | & | land | |
| Performance | Direct 64-bit ops | 63-bit bounds, slightly lower density |
Bitsets are useful for dense boolean arrays, Sieve of Eratosthenes, graph adjacency matrices (for small graphs), and flag sets. The bit-vec and fixedbitset crates provide production-grade implementations.
OCaml Approach
(* OCaml int is 63 bits on 64-bit systems (1 bit for GC tag) *)
type bitset = {
words: int array;
size: int;
}
let create size =
{ words = Array.make ((size + 62) / 63) 0; size }
let set bs i =
let w = i / 63 and b = i mod 63 in
bs.words.(w) <- bs.words.(w) lor (1 lsl b)
let test bs i =
let w = i / 63 and b = i mod 63 in
bs.words.(w) land (1 lsl b) <> 0
(* Using Bytes for 8-bit words — avoids GC tag issue *)
let create_bytes size = Bytes.make ((size + 7) / 8) '\000'
OCaml's int is 63 bits on 64-bit platforms (one bit is the GC's tag bit). Using int for bitsets means 63 bits per word, not 64. Bytes (byte arrays) avoids this issue at the cost of 8 bits per word instead of 63.
Full Source
#![allow(clippy::all)]
// 977: Bitset
// Set, clear, toggle, union, intersection via u64 word arrays
// OCaml uses 63-bit int (1 bit used for GC tag); Rust uses full u64
pub struct Bitset {
bits: Vec<u64>,
size: usize,
}
impl Bitset {
pub fn new(size: usize) -> Self {
let words = (size + 63) / 64;
Bitset {
bits: vec![0u64; words],
size,
}
}
fn word(&self, i: usize) -> usize {
i / 64
}
fn bit(&self, i: usize) -> u64 {
1u64 << (i % 64)
}
pub fn set(&mut self, i: usize) {
assert!(
i < self.size,
"index {} out of range (size={})",
i,
self.size
);
let (w, b) = (self.word(i), self.bit(i));
self.bits[w] |= b;
}
pub fn clear(&mut self, i: usize) {
assert!(i < self.size, "index out of range");
let (w, b) = (self.word(i), self.bit(i));
self.bits[w] &= !b;
}
pub fn toggle(&mut self, i: usize) {
assert!(i < self.size, "index out of range");
let (w, b) = (self.word(i), self.bit(i));
self.bits[w] ^= b;
}
pub fn test(&self, i: usize) -> bool {
if i >= self.size {
return false;
}
(self.bits[self.word(i)] >> (i % 64)) & 1 == 1
}
/// Count of set bits (popcount)
pub fn count(&self) -> u32 {
self.bits.iter().map(|w| w.count_ones()).sum()
}
/// Union: returns new bitset with bits set in either
pub fn union(&self, other: &Bitset) -> Bitset {
assert_eq!(self.size, other.size);
let bits = self
.bits
.iter()
.zip(&other.bits)
.map(|(a, b)| a | b)
.collect();
Bitset {
bits,
size: self.size,
}
}
/// Intersection: bits set in both
pub fn intersect(&self, other: &Bitset) -> Bitset {
assert_eq!(self.size, other.size);
let bits = self
.bits
.iter()
.zip(&other.bits)
.map(|(a, b)| a & b)
.collect();
Bitset {
bits,
size: self.size,
}
}
/// Difference: bits in self but not other
pub fn difference(&self, other: &Bitset) -> Bitset {
assert_eq!(self.size, other.size);
let bits = self
.bits
.iter()
.zip(&other.bits)
.map(|(a, b)| a & !b)
.collect();
Bitset {
bits,
size: self.size,
}
}
/// Symmetric difference: bits in one but not both
pub fn symmetric_difference(&self, other: &Bitset) -> Bitset {
assert_eq!(self.size, other.size);
let bits = self
.bits
.iter()
.zip(&other.bits)
.map(|(a, b)| a ^ b)
.collect();
Bitset {
bits,
size: self.size,
}
}
/// Iterate set bit indices
pub fn iter_ones(&self) -> Vec<usize> {
let mut result = vec![];
for (w_idx, &word) in self.bits.iter().enumerate() {
let mut word = word;
while word != 0 {
let bit = word.trailing_zeros() as usize;
result.push(w_idx * 64 + bit);
word &= word - 1; // clear lowest set bit
}
}
result
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_set_test_clear() {
let mut bs = Bitset::new(128);
bs.set(0);
bs.set(5);
bs.set(63);
bs.set(64);
bs.set(127);
assert!(bs.test(0));
assert!(bs.test(5));
assert!(bs.test(63));
assert!(bs.test(64));
assert!(bs.test(127));
assert!(!bs.test(1));
assert_eq!(bs.count(), 5);
bs.clear(5);
assert!(!bs.test(5));
assert_eq!(bs.count(), 4);
}
#[test]
fn test_toggle() {
let mut bs = Bitset::new(64);
bs.toggle(10);
assert!(bs.test(10));
bs.toggle(10);
assert!(!bs.test(10));
}
#[test]
fn test_union() {
let mut a = Bitset::new(64);
let mut b = Bitset::new(64);
for i in 0..4 {
a.set(i);
}
for i in 2..6 {
b.set(i);
}
let u = a.union(&b);
assert_eq!(u.count(), 6);
assert_eq!(u.iter_ones(), vec![0, 1, 2, 3, 4, 5]);
}
#[test]
fn test_intersect() {
let mut a = Bitset::new(64);
let mut b = Bitset::new(64);
for i in 0..4 {
a.set(i);
}
for i in 2..6 {
b.set(i);
}
let i = a.intersect(&b);
assert_eq!(i.count(), 2);
assert!(i.test(2));
assert!(i.test(3));
}
#[test]
fn test_difference() {
let mut a = Bitset::new(64);
let mut b = Bitset::new(64);
for i in 0..4 {
a.set(i);
}
for i in 2..6 {
b.set(i);
}
let d = a.difference(&b);
assert_eq!(d.count(), 2);
assert_eq!(d.iter_ones(), vec![0, 1]);
}
#[test]
fn test_iter_ones() {
let mut bs = Bitset::new(200);
bs.set(0);
bs.set(63);
bs.set(64);
bs.set(127);
bs.set(128);
bs.set(199);
assert_eq!(bs.iter_ones(), vec![0, 63, 64, 127, 128, 199]);
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_set_test_clear() {
let mut bs = Bitset::new(128);
bs.set(0);
bs.set(5);
bs.set(63);
bs.set(64);
bs.set(127);
assert!(bs.test(0));
assert!(bs.test(5));
assert!(bs.test(63));
assert!(bs.test(64));
assert!(bs.test(127));
assert!(!bs.test(1));
assert_eq!(bs.count(), 5);
bs.clear(5);
assert!(!bs.test(5));
assert_eq!(bs.count(), 4);
}
#[test]
fn test_toggle() {
let mut bs = Bitset::new(64);
bs.toggle(10);
assert!(bs.test(10));
bs.toggle(10);
assert!(!bs.test(10));
}
#[test]
fn test_union() {
let mut a = Bitset::new(64);
let mut b = Bitset::new(64);
for i in 0..4 {
a.set(i);
}
for i in 2..6 {
b.set(i);
}
let u = a.union(&b);
assert_eq!(u.count(), 6);
assert_eq!(u.iter_ones(), vec![0, 1, 2, 3, 4, 5]);
}
#[test]
fn test_intersect() {
let mut a = Bitset::new(64);
let mut b = Bitset::new(64);
for i in 0..4 {
a.set(i);
}
for i in 2..6 {
b.set(i);
}
let i = a.intersect(&b);
assert_eq!(i.count(), 2);
assert!(i.test(2));
assert!(i.test(3));
}
#[test]
fn test_difference() {
let mut a = Bitset::new(64);
let mut b = Bitset::new(64);
for i in 0..4 {
a.set(i);
}
for i in 2..6 {
b.set(i);
}
let d = a.difference(&b);
assert_eq!(d.count(), 2);
assert_eq!(d.iter_ones(), vec![0, 1]);
}
#[test]
fn test_iter_ones() {
let mut bs = Bitset::new(200);
bs.set(0);
bs.set(63);
bs.set(64);
bs.set(127);
bs.set(128);
bs.set(199);
assert_eq!(bs.iter_ones(), vec![0, 63, 64, 127, 128, 199]);
}
}
Deep Comparison
Bitset — Comparison
Core Insight
A bitset stores a set of integers compactly: 1 bit per element. Word-level operations (|, &, ^, ~) apply to 64 bits simultaneously. OCaml's int is 63-bit (the runtime uses bit 0 as a GC tag for boxing), so each word holds 63 bits. Rust uses u64 (all 64 bits). Rust also has hardware-accelerated count_ones() and trailing_zeros() as methods.
OCaml Approach
int array — 63 bits per word (OCaml int is 63-bit on 64-bit systems)lsl, lor, land, lxor, lnot — OCaml bitwise operatorswords_for_bits n = (n + 62) / 63 — accounts for 63-bit wordswhile w <> 0 do w := w land (w-1); incr count doneArray.fold_left for counting across all wordsRust Approach
Vec<u64> — 64 bits per word (full u64, no GC tag bit)|=, &=, ^=, ! — Rust bitwise operators (same semantics)(size + 63) / 64 words neededword.count_ones() — hardware POPCNT instructionword.trailing_zeros() — hardware BSF/CTZ for finding lowest set bitword &= word - 1 — clear lowest set bit trick (same as OCaml).iter().zip(&other.bits).map(|(a,b)| a | b).collect() for set operationsComparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| Word type | int (63-bit) | u64 (64-bit) |
| Words per N bits | (N+62)/63 | (N+63)/64 |
| Bitwise AND | land | & |
| Bitwise OR | lor | \| |
| Bitwise XOR | lxor | ^ |
| Bitwise NOT | lnot | ! |
| Left shift | lsl | << |
| Right shift | lsr | >> |
| Popcount | Manual Brian Kernighan loop | u64::count_ones() (hardware) |
| Lowest set bit | w land (w - 1) clear | w.trailing_zeros() + w &= w-1 |
| Set operations | for i = 0 to n-1 do result.(i) <- a.(i) lor b.(i) done | a.iter().zip(b.iter()).map(\|(a,b)\| a\|b).collect() |
Exercises
Bitset::from_indices(size, indices: &[usize]) that sets all specified bits.symmetric_difference(a, b) using XOR (^).is_subset(a, b) -> bool — returns true if every set bit in a is also set in b.first_set() -> Option<usize> — find the lowest set bit using trailing_zeros().Bitset for an upper bound of 1,000,000.