749 Fundamental

749-fuzzing-concepts — Fuzzing Concepts

Functional Programming

Tutorial

The Problem

Fuzzing feeds random or mutated inputs to a program to find panics, crashes, and assertion failures. It has uncovered thousands of security vulnerabilities in parsers, decoders, and protocol implementations. AFL++ and libFuzzer are two of the most widely used fuzzers; cargo-fuzz wraps libFuzzer for Rust code. The key design principle for fuzz-safe code is to never panic on any input: return Err instead of unwrapping. This example demonstrates how to write fuzz-safe parsers.

🎯 Learning Outcomes

• Write parsers that return Result/Option on all invalid inputs instead of panicking

• Understand the Rust fuzzing workflow: cargo fuzz init, cargo fuzz add <target>, cargo fuzz run <target>

• See why unwrap() on untrusted input is a security risk (an attacker-triggered panic is a denial of service)

• Implement boundary checks before every slice index operation

• Write basic fuzz harness structure for a binary packet parser
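The boundary-check outcome above can be sketched with the standard slice methods split_first and get, which return None where an index expression would panic. The function name read_prefixed and its one-byte length prefix are assumptions made for this illustration; they are not part of the example's source.

```rust
/// Read a one-byte length prefix followed by that many bytes.
/// Hypothetical helper: no index expression is used, so nothing here can panic.
fn read_prefixed(data: &[u8]) -> Option<&[u8]> {
    // split_first returns None on empty input instead of panicking.
    let (&len, rest) = data.split_first()?;
    // get with a range returns None when the range runs past the end.
    rest.get(..len as usize)
}

fn main() {
    println!("{:?}", read_prefixed(&[2, b'h', b'i', b'!']));
    println!("{:?}", read_prefixed(&[5, b'h', b'i']));
    println!("{:?}", read_prefixed(&[]));
}
```

The same pattern applies to any field read: ask the slice for the range and propagate None (or an Err) when it is not there.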

Code Example

// fuzz/fuzz_targets/parse_packet.rs
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    let _ = my_crate::parse_packet(data);  // must NEVER panic
});

(* afl_input.ml *)
let () =
  let input = In_channel.input_all In_channel.stdin in
  let _ = My_lib.parse_packet input in
  ()

Key Differences

Panic vs exception: Rust's unwrap() panics (caught by libFuzzer as a crash); OCaml's failwith raises an exception (also caught as a crash).

Memory safety: safe Rust rules out entire classes of fuzzer-discovered bugs (buffer overflows, use-after-free), so fuzzing mostly surfaces panics and logic errors; OCaml's GC and bounds-checked accesses provide similar protection.

Fuzzer integration: Rust has cargo-fuzz with first-class libFuzzer support; OCaml relies on afl-fuzz with the compiler's AFL instrumentation, or on crowbar, with more manual setup.

Speed: Rust fuzz targets tend to run faster than equivalent OCaml targets (2–5x is a rough estimate; no benchmark is given here), in part because there is no GC overhead, so the fuzzer explores more inputs per second.
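To make the panic-versus-Result difference concrete, here is a minimal sketch that uses std::panic::catch_unwind to observe a panic without ending the program. The functions naive_version and safe_version are hypothetical names for this illustration, and catch_unwind only stands in for the fuzzer here: it works when panics unwind (the default), whereas a fuzzer records the input as a crash.

```rust
use std::panic;

/// Deliberately fragile: indexes without checking the length first.
fn naive_version(data: &[u8]) -> u8 {
    data[0] // panics on empty input
}

/// Fuzz-safe counterpart: reports the problem as a value.
fn safe_version(data: &[u8]) -> Result<u8, &'static str> {
    data.first().copied().ok_or("input too short")
}

fn main() {
    // catch_unwind turns the panic into an Err so this demo can continue.
    // The default panic hook still prints the panic message to stderr.
    let outcome = panic::catch_unwind(|| naive_version(&[]));
    println!("naive panicked: {}", outcome.is_err());
    println!("safe result: {:?}", safe_version(&[]));
}
```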

OCaml Approach

The OCaml compiler can instrument binaries for American Fuzzy Lop (AFL) through ocamlopt's -afl-instrument flag. The crowbar library builds property-based fuzzing on top of afl-fuzz. OCaml's exception-based error handling means uncaught exceptions crash the fuzz target similarly to Rust's unwrap. The key discipline is the same: use result-based parsing and avoid failwith/assert false in parser code paths.

Full Source

#![allow(clippy::all)]
//! # Fuzzing Concepts
//!
//! Demonstrates code that is fuzz-safe (never panics on any input).

/// A simple binary packet structure
#[derive(Debug, PartialEq, Clone)]
pub struct Packet {
    pub version: u8,
    pub payload_len: u8,
    pub payload: Vec<u8>,
}

/// Errors that can occur when parsing a packet
#[derive(Debug, PartialEq)]
pub enum ParseError {
    TooShort,
    InvalidVersion(u8),
    TruncatedPayload { expected: usize, got: usize },
}

impl std::fmt::Display for ParseError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            ParseError::TooShort => write!(f, "input too short (need ≥2 bytes)"),
            ParseError::InvalidVersion(v) => write!(f, "invalid version {}", v),
            ParseError::TruncatedPayload { expected, got } => {
                write!(f, "payload truncated: expected {} got {}", expected, got)
            }
        }
    }
}

/// Parse a simple binary packet format.
///
/// Format:
/// - Byte 0: version (must be 1-5)
/// - Byte 1: payload length
/// - Bytes 2..(2+payload_len): payload
///
/// **NEVER panics on any input** — returns Err for invalid data.
pub fn parse_packet(data: &[u8]) -> Result<Packet, ParseError> {
    if data.len() < 2 {
        return Err(ParseError::TooShort);
    }
    let version = data[0];
    if version == 0 || version > 5 {
        return Err(ParseError::InvalidVersion(version));
    }
    let payload_len = data[1] as usize;
    let available = data.len().saturating_sub(2);
    if available < payload_len {
        return Err(ParseError::TruncatedPayload {
            expected: payload_len,
            got: available,
        });
    }
    Ok(Packet {
        version,
        payload_len: payload_len as u8,
        payload: data[2..2 + payload_len].to_vec(),
    })
}

/// Encode a packet back to bytes
pub fn encode_packet(p: &Packet) -> Vec<u8> {
    let mut out = vec![p.version, p.payload_len];
    out.extend_from_slice(&p.payload);
    out
}

/// Parse a key=value string. Must not panic on any &str.
pub fn parse_kv(s: &str) -> Option<(&str, &str)> {
    s.split_once('=').and_then(|(k, v)| {
        if k.is_empty() || v.is_empty() {
            None
        } else {
            Some((k, v))
        }
    })
}

/// Validate that input is non-empty and contains only ASCII alphanumerics or underscores. Never panics.
pub fn is_valid_identifier(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_valid_packet() {
        let data = &[1, 3, b'a', b'b', b'c'];
        let packet = parse_packet(data).unwrap();
        assert_eq!(packet.version, 1);
        assert_eq!(packet.payload, vec![b'a', b'b', b'c']);
    }

    #[test]
    fn test_parse_empty_input() {
        assert_eq!(parse_packet(&[]), Err(ParseError::TooShort));
    }

    #[test]
    fn test_parse_too_short() {
        assert_eq!(parse_packet(&[1]), Err(ParseError::TooShort));
    }

    #[test]
    fn test_parse_invalid_version() {
        assert_eq!(parse_packet(&[0, 0]), Err(ParseError::InvalidVersion(0)));
        assert_eq!(parse_packet(&[6, 0]), Err(ParseError::InvalidVersion(6)));
    }

    #[test]
    fn test_parse_truncated_payload() {
        let result = parse_packet(&[1, 10, b'x', b'y']);
        assert_eq!(
            result,
            Err(ParseError::TruncatedPayload {
                expected: 10,
                got: 2
            })
        );
    }

    #[test]
    fn test_roundtrip() {
        let original = Packet {
            version: 3,
            payload_len: 2,
            payload: vec![0xAB, 0xCD],
        };
        let encoded = encode_packet(&original);
        let decoded = parse_packet(&encoded).unwrap();
        assert_eq!(decoded, original);
    }

    #[test]
    fn test_parse_kv_valid() {
        assert_eq!(parse_kv("key=value"), Some(("key", "value")));
        assert_eq!(parse_kv("a=b"), Some(("a", "b")));
    }

    #[test]
    fn test_parse_kv_invalid() {
        assert_eq!(parse_kv("noequals"), None);
        assert_eq!(parse_kv("=value"), None);
        assert_eq!(parse_kv("key="), None);
        assert_eq!(parse_kv(""), None);
    }

    #[test]
    fn test_valid_identifier() {
        assert!(is_valid_identifier("foo"));
        assert!(is_valid_identifier("foo_bar"));
        assert!(is_valid_identifier("Foo123"));
        assert!(!is_valid_identifier(""));
        assert!(!is_valid_identifier("foo-bar"));
        assert!(!is_valid_identifier("foo bar"));
    }

    // Fuzz-like sweep: every version byte combined with payload lengths 0..=10
    #[test]
    fn test_parse_never_panics() {
        for v in 0..=255u8 {
            for len in 0..=10u8 {
                let data: Vec<u8> = std::iter::once(v)
                    .chain(std::iter::once(len))
                    .chain(0..len)
                    .collect();
                let _ = parse_packet(&data); // Must not panic
            }
        }
    }
}
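The sweep in test_parse_never_panics only covers well-formed lengths up to 10. A minimal sketch of a pseudo-random smoke test that needs no extra crates is shown below. Everything in it is an assumption made for illustration: parse is a condensed stand-in for parse_packet (same wire format, errors collapsed to ()), XorShift is a small hand-written generator, and smoke_test is a hypothetical driver. It does not replace coverage-guided fuzzing.

```rust
/// Condensed stand-in for parse_packet: same wire format, errors collapsed to ().
fn parse(data: &[u8]) -> Result<(u8, &[u8]), ()> {
    let (&version, rest) = data.split_first().ok_or(())?;
    let (&len, body) = rest.split_first().ok_or(())?;
    if version == 0 || version > 5 {
        return Err(());
    }
    let payload = body.get(..len as usize).ok_or(())?;
    Ok((version, payload))
}

/// Tiny xorshift64 generator so runs are reproducible. The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Feed pseudo-random inputs to the parser and check invariants on every
/// accepted packet. Returns how many inputs were accepted.
fn smoke_test(seed: u64, iterations: usize) -> usize {
    let mut rng = XorShift(seed);
    let mut accepted = 0;
    for _ in 0..iterations {
        let len = (rng.next_u64() % 16) as usize;
        // Small byte values make valid headers reasonably likely.
        let input: Vec<u8> = (0..len).map(|_| (rng.next_u64() % 8) as u8).collect();
        if let Ok((version, payload)) = parse(&input) {
            assert!((1..=5).contains(&version));
            // An accepted input has at least two bytes, so input[1] exists.
            assert_eq!(payload.len(), input[1] as usize);
            accepted += 1;
        }
    }
    accepted
}

fn main() {
    let accepted = smoke_test(0x9E37_79B9_7F4A_7C15, 10_000);
    println!("accepted {} of 10000 inputs without panicking", accepted);
}
```

Because the generator is seeded, a failing run can be reproduced by reusing the same seed and iteration count.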

(* 749: Fuzzing Concepts — OCaml
   OCaml can be fuzzed with AFL (American Fuzzy Lop).
   Key principle: parsers should never raise exceptions on bad input. *)

(* A safe parser that returns Result instead of raising exceptions *)
type parse_error =
  | UnexpectedEnd
  | InvalidByte of char
  | TooLong of int

type packet = {
  version: int;
  payload_len: int;
  payload: bytes;
}

let parse_packet (data: bytes) : (packet, parse_error) result =
  let n = Bytes.length data in
  (* The header is two bytes; a zero-length payload is valid, as in the Rust parser *)
  if n < 2 then Error UnexpectedEnd
  else
    let version = Char.code (Bytes.get data 0) in
    (* Valid versions are 1-5, as in the Rust parser *)
    if version = 0 || version > 5 then Error (InvalidByte (Bytes.get data 0))
    else
      let payload_len = Char.code (Bytes.get data 1) in
      (* Char.code is always in 0..255, so this guard never fires; kept as a defensive check *)
      if payload_len > 255 then Error (TooLong payload_len)
      else if n < 2 + payload_len then Error UnexpectedEnd
      else
        let payload = Bytes.sub data 2 payload_len in
        Ok { version; payload_len; payload }

(* Fuzz-target-style function: accepts any bytes; raises only if the length invariant is violated *)
let fuzz_target data =
  (match parse_packet data with
  | Ok p ->
    (* Invariant: payload length matches header *)
    assert (Bytes.length p.payload = p.payload_len)
  | Error _ ->
    (* Reject is fine — we just must not crash *)
    ())

(* Simulate fuzzing with some inputs *)
let () =
  let inputs = [
    Bytes.of_string "\x01\x05hello";  (* valid *)
    Bytes.of_string "";                (* too short *)
    Bytes.of_string "\x09\x01x";     (* invalid version *)
    Bytes.of_string "\x01\xFF";       (* truncated payload *)
  ] in
  List.iter fuzz_target inputs;
  Printf.printf "Fuzz targets: no panics!\n"

Deep Comparison

OCaml vs Rust: Fuzzing Concepts

Fuzz Target Setup

Rust (cargo-fuzz)

// fuzz/fuzz_targets/parse_packet.rs
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    let _ = my_crate::parse_packet(data);  // must NEVER panic
});

Run with: cargo fuzz run parse_packet

OCaml (afl-fuzz)

(* afl_input.ml *)
let () =
  let input = In_channel.input_all In_channel.stdin in
  let _ = My_lib.parse_packet input in
  ()

Key Principle: Never Panic

Rust

pub fn parse_packet(data: &[u8]) -> Result<Packet, ParseError> {
    if data.len() < 2 {
        return Err(ParseError::TooShort);
    }
    let version = data[0];
    if version == 0 || version > 5 {
        return Err(ParseError::InvalidVersion(version));
    }
    // ... safe parsing with bounds checks
    Ok(packet)
}
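Indexing is not the only panic source that bounds checks have to cover. In parse_packet the sum 2 + payload_len cannot overflow because payload_len is at most 255, but with wider length fields an offset calculation can overflow, which panics in builds with overflow checks enabled. A minimal sketch, assuming a hypothetical helper named slice_at that is not part of the example's source:

```rust
/// Return `len` bytes starting at `offset`, or None if the range is invalid.
/// Hypothetical helper for formats with wide offset or length fields.
fn slice_at(data: &[u8], offset: usize, len: usize) -> Option<&[u8]> {
    // offset + len could overflow; checked_add reports that as None.
    let end = offset.checked_add(len)?;
    // get rejects ranges that run past the end of data.
    data.get(offset..end)
}

fn main() {
    let data = [1u8, 2, 3];
    println!("{:?}", slice_at(&data, 1, 2));
    println!("{:?}", slice_at(&data, 1, usize::MAX));
}
```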

OCaml

let parse_packet data =
  if String.length data < 2 then
    Error TooShort
  else
    let version = Char.code data.[0] in
    if version = 0 || version > 5 then
      Error (InvalidVersion version)
    else
      (* ... safe parsing *)
      Ok packet

Key Differences

Aspect	OCaml	Rust
Fuzzing tool	afl-fuzz, crowbar	cargo-fuzz (libFuzzer)
Input format	stdin/file	&[u8] parameter
Coverage	AFL instrumentation (ocamlopt -afl-instrument)	LLVM SanitizerCoverage instrumentation
Setup complexity	Manual	cargo fuzz init
Structured fuzzing	crowbar	arbitrary crate
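The structured-fuzzing row can be illustrated without extra dependencies. The sketch below hand-rolls what the arbitrary crate would derive automatically: it maps raw fuzzer bytes to a valid packet and then checks a roundtrip invariant. All names (packet_from_bytes, encode, parse, check_roundtrip) and the simplified Packet without a payload_len field are assumptions for this illustration, not the example's actual API.

```rust
#[derive(Debug, PartialEq, Clone)]
struct Packet {
    version: u8,
    payload: Vec<u8>,
}

/// Build a valid packet from arbitrary bytes (hand-rolled stand-in for
/// what the arbitrary crate would derive).
fn packet_from_bytes(data: &[u8]) -> Option<Packet> {
    let (&seed, rest) = data.split_first()?;
    // Map any byte into the valid version range 1..=5.
    let version = seed % 5 + 1;
    // The payload length must fit in the one-byte length field.
    let payload = rest.iter().copied().take(u8::MAX as usize).collect();
    Some(Packet { version, payload })
}

fn encode(p: &Packet) -> Vec<u8> {
    let mut out = vec![p.version, p.payload.len() as u8];
    out.extend_from_slice(&p.payload);
    out
}

fn parse(data: &[u8]) -> Option<Packet> {
    let (&version, rest) = data.split_first()?;
    let (&len, body) = rest.split_first()?;
    if !(1..=5).contains(&version) {
        return None;
    }
    let payload = body.get(..len as usize)?.to_vec();
    Some(Packet { version, payload })
}

/// Body of a would-be fuzz target: every generated packet must survive
/// an encode/parse roundtrip unchanged.
fn check_roundtrip(data: &[u8]) {
    if let Some(p) = packet_from_bytes(data) {
        let decoded = parse(&encode(&p));
        assert_eq!(decoded, Some(p));
    }
}

fn main() {
    check_roundtrip(&[]);
    check_roundtrip(&[255, 1, 2, 3]);
    check_roundtrip(&vec![7u8; 400]);
    println!("roundtrip invariant held");
}
```

In a cargo-fuzz target, check_roundtrip(data) would be the body of the fuzz_target! closure, so the fuzzer spends its time on valid packets instead of inputs the parser rejects immediately.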

Exercises

Add a parse_json_number function that handles integers, floats, and scientific notation — ensure it never panics on any byte sequence and write a roundtrip property test.

Write a fuzz harness that tests the invariant: parse_packet(encode_packet(p)) == Ok(p) for all valid packets generated by the fuzzer.

Implement parse_packet_v2 with a checksum field at the end, verifying the checksum before returning Ok — make it fuzz-safe with proper error handling.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.
