341 Intermediate

341: Buffered Stream — BufReader and BufWriter

Functional Programming

Tutorial

The Problem

Reading or writing one byte at a time with unbuffered I/O issues a system call for every operation, which is extremely slow for large files. BufReader and BufWriter add an in-memory buffer: a read fills the buffer in bulk (8 KB by default), and subsequent reads are served from the buffer without further syscalls. A writer accumulates data in its buffer and writes it out in bulk when the buffer fills or when flush() is called. For workloads made of many small reads or writes, such as text file processing and log writing, this can reduce the number of system calls by orders of magnitude.
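The effect can be observed without touching a real file. The sketch below is illustrative only: CountingReader and read_bytewise are hypothetical names, not part of the original example, and each call that reaches the inner reader stands in for one system call on a real file.

```rust
use std::io::{self, BufReader, Read};

/// Hypothetical wrapper that counts how often `read` reaches the inner reader.
/// On a real file, each of these calls would be a system call.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Reads `data` one byte at a time.
/// Returns (bytes read, number of calls that reached the inner reader).
fn read_bytewise(data: &[u8], buffered: bool) -> io::Result<(usize, usize)> {
    let counter = CountingReader { inner: data, calls: 0 };
    let mut byte = [0u8; 1];
    let mut total = 0;
    if buffered {
        // BufReader pulls large chunks from `counter` and serves single bytes from memory.
        let mut reader = BufReader::new(counter);
        while reader.read(&mut byte)? == 1 {
            total += 1;
        }
        Ok((total, reader.get_ref().calls))
    } else {
        // Every one-byte read goes straight to the inner reader.
        let mut reader = counter;
        while reader.read(&mut byte)? == 1 {
            total += 1;
        }
        Ok((total, reader.calls))
    }
}

fn main() -> io::Result<()> {
    let data = vec![b'x'; 20_000];
    let (n_raw, raw_calls) = read_bytewise(&data, false)?;
    let (n_buf, buf_calls) = read_bytewise(&data, true)?;
    println!("unbuffered: {n_raw} bytes in {raw_calls} inner reads");
    println!("buffered:   {n_buf} bytes in {buf_calls} inner reads");
    Ok(())
}
```

Without buffering, the inner reader is hit once per byte plus once more to detect end of input. With BufReader, it is hit only a handful of times, because each call refills the whole buffer.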

🎯 Learning Outcomes

  • Use BufReader::new(reader) to wrap any reader with an 8 KB internal buffer
  • Use BufWriter::new(writer) to buffer writes and flush in bulk
  • Process files line-by-line using BufRead::lines(), which is lazy and buffered
  • Understand that the flush performed when a BufWriter is dropped silently discards errors, so call flush() explicitly

    Code Example

    #![allow(clippy::all)]
    // 341: Buffered Stream
    // BufReader/BufWriter wrapping for efficient I/O
    
    // Only the items actually used below are imported.
    use std::io::{BufRead, BufReader, BufWriter, Write};
    
    // Approach 1: BufReader for efficient reading
    fn count_lines(input: &[u8]) -> usize {
        let reader = BufReader::new(input);
        reader.lines().count()
    }
    
    fn read_lines(input: &[u8]) -> Vec<String> {
        let reader = BufReader::new(input);
        reader.lines().filter_map(|l| l.ok()).collect()
    }
    
    // Approach 2: BufWriter for efficient writing
    fn write_lines(lines: &[&str]) -> Vec<u8> {
        let mut output = Vec::new();
        {
            let mut writer = BufWriter::new(&mut output);
            for line in lines {
                writeln!(writer, "{}", line).unwrap();
            }
            writer.flush().unwrap();
        }
        output
    }
    
    // Approach 3: String building with buffered writes
    fn build_csv(headers: &[&str], rows: &[Vec<String>]) -> String {
        let mut buf = Vec::new();
        {
            let mut writer = BufWriter::new(&mut buf);
            writeln!(writer, "{}", headers.join(",")).unwrap();
            for row in rows {
                writeln!(writer, "{}", row.join(",")).unwrap();
            }
            writer.flush().unwrap();
        }
        String::from_utf8(buf).unwrap()
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_count_lines() {
            assert_eq!(count_lines(b"a\nb\nc\n"), 3);
            assert_eq!(count_lines(b""), 0);
        }
    
        #[test]
        fn test_read_lines() {
            let lines = read_lines(b"hello\nworld\n");
            assert_eq!(lines, vec!["hello", "world"]);
        }
    
        #[test]
        fn test_write_lines() {
            let output = write_lines(&["hello", "world"]);
            let s = String::from_utf8(output).unwrap();
            assert!(s.contains("hello"));
            assert!(s.contains("world"));
        }
    
        #[test]
        fn test_csv() {
            let csv = build_csv(&["a", "b"], &[vec!["1".into(), "2".into()]]);
            assert!(csv.starts_with("a,b\n"));
            assert!(csv.contains("1,2"));
        }
    }

    Key Differences

  • Default buffering: OCaml's In_channel / Out_channel are buffered by default; Rust's File is unbuffered — BufReader/BufWriter must be added explicitly.
  • Drop flush: BufWriter's Drop implementation calls flush(), but ignores errors — always call flush() explicitly when error handling matters.
  • Buffer size: The default buffer is 8 KB; use BufReader::with_capacity(capacity, reader) or BufWriter::with_capacity(capacity, writer) to tune it.
  • Async buffered I/O: tokio::io::BufReader / BufWriter are the async-aware equivalents for use with Tokio async I/O traits.
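The drop-versus-flush difference is easiest to see with a sink that always fails. This is a minimal sketch under stated assumptions: FailingWriter, write_and_drop, and write_and_flush are hypothetical names introduced for illustration, and the "disk full" error is simulated. The second function also shows BufWriter::with_capacity with an arbitrary 64 KB buffer.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that rejects every write, standing in for a full disk or a closed pipe.
struct FailingWriter;

impl Write for FailingWriter {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "disk full (simulated)"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes one short line and relies on Drop to flush it.
/// The write only goes into the buffer, so it succeeds; the failed flush in Drop is ignored.
fn write_and_drop() -> io::Result<()> {
    let mut writer = BufWriter::new(FailingWriter);
    writeln!(writer, "important record")?;
    Ok(()) // `writer` is dropped here and the error is lost
}

/// Same write with a tuned buffer size, but flushed explicitly so the error reaches the caller.
fn write_and_flush() -> io::Result<()> {
    let mut writer = BufWriter::with_capacity(64 * 1024, FailingWriter);
    writeln!(writer, "important record")?;
    writer.flush()
}

fn main() {
    println!("relying on drop: {:?}", write_and_drop());
    println!("explicit flush:  {:?}", write_and_flush());
}
```

The first call reports success even though nothing was written; the second returns the simulated error. In real code, returning the result of flush() (or propagating it with ?) is what makes data loss visible.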
    OCaml Approach

    OCaml's In_channel is buffered by default. In_channel.input_line (standard library, OCaml 4.14 and later) reads one line at a time and returns None at end of input:

    let count_lines path =
      (* with_open_text closes the channel even if the callback raises *)
      In_channel.with_open_text path (fun ic ->
        let rec loop count =
          match In_channel.input_line ic with
          | Some _ -> loop (count + 1)
          | None -> count
        in
        loop 0)
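For a side-by-side view, a path-based Rust counterpart might look like the sketch below. count_lines_in_file is a hypothetical helper that is not part of the original example (which works on in-memory byte slices), and the temporary file name in main is likewise made up for the demonstration.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Counts the lines in the file at `path`.
/// File is unbuffered, so it is wrapped in a BufReader explicitly.
fn count_lines_in_file(path: &Path) -> io::Result<usize> {
    let file = File::open(path)?;
    let reader = BufReader::new(file);
    let mut count = 0;
    for line in reader.lines() {
        line?; // propagate I/O or invalid-UTF-8 errors instead of skipping them
        count += 1;
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch file used only for this demonstration.
    let path = std::env::temp_dir().join("bufreader_count_lines_demo.txt");
    std::fs::write(&path, "a\nb\nc\n")?;
    println!("{} lines", count_lines_in_file(&path)?);
    std::fs::remove_file(&path)?;
    Ok(())
}
```

Unlike the OCaml version, where the channel is buffered from the start, the buffering here is an explicit wrapper, and the file is closed automatically when the reader goes out of scope.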
    


    Deep Comparison

    Core Insight

    Buffering reduces system calls — wrapping raw I/O in buffers is essential for performance

    OCaml Approach

  • See example.ml for the implementation

    Rust Approach

  • See example.rs for the implementation

    Comparison Table

    Feature | OCaml      | Rust
    See     | example.ml | example.rs

    Exercises

  • Benchmark reading a large file byte-by-byte vs line-by-line with BufReader — measure time and system call count.
  • Implement a log file writer that uses BufWriter with periodic explicit flushes every 1000 lines.
  • Use BufRead::lines() on a BufReader to process a CSV file lazily, parsing each line into a Vec<String> of fields.