766-config-file-parsing — Config File Parsing
Difficulty level: Fundamental. Key concepts covered: Functional Programming.
Tutorial
The Problem
Configuration files — INI, TOML, YAML — are the primary mechanism for parametrizing deployed software without recompilation. INI format predates them all and remains ubiquitous in tools like git config, Windows registry exports, and openssl.cnf. Parsing INI manually teaches section-based hierarchical configuration, comment handling, and graceful error reporting for malformed config files.
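To make the three concerns named above concrete (sections, comments, malformed lines), here is a minimal sketch that classifies a single line of INI text. The `Line` enum and the `classify` function are hypothetical names introduced for illustration; they are not part of the example's source.

```rust
/// One classified line of an INI file (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub enum Line<'a> {
    Blank,
    Comment,
    Section(&'a str),
    KeyValue(&'a str, &'a str),
    Invalid,
}

/// Classify a single raw line, ignoring surrounding whitespace.
pub fn classify(raw: &str) -> Line<'_> {
    let line = raw.trim();
    if line.is_empty() {
        Line::Blank
    } else if line.starts_with('#') || line.starts_with(';') {
        // Both `#` and `;` introduce a comment.
        Line::Comment
    } else if line.starts_with('[') && line.ends_with(']') {
        // Strip the brackets and any padding inside them.
        Line::Section(line[1..line.len() - 1].trim())
    } else if let Some((key, value)) = line.split_once('=') {
        Line::KeyValue(key.trim(), value.trim())
    } else {
        // Anything else is a malformed line the parser should report.
        Line::Invalid
    }
}
```

A full parser is then mostly a loop over `classify` results that tracks the current section.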
🎯 Learning Outcomes
- Parse INI files with [section] headers and key = value assignments
- Store the result as HashMap<String, HashMap<String, String>>
- Handle comments (# and ;), blank lines, and trailing whitespace
- Report ParseError variants for invalid lines and duplicate sections
- Look up values with get_section(section, key) -> Option<&str>
Code Example
use std::collections::HashMap;

pub struct Config {
    pub global: HashMap<String, String>,
    pub sections: HashMap<String, HashMap<String, String>>,
}
Key Differences
1. HashMap access: Rust's HashMap::get returns Option<&V>; OCaml's Hashtbl.find_opt returns 'v option. The semantics are equivalent.
2. Line splitting: Rust's str::lines() is lazy and handles both \n and \r\n; OCaml's String.split_on_char '\n' requires manual \r stripping.
3. Duplicate detection: Rust checks HashMap::contains_key before insert; OCaml uses Hashtbl.mem. The pattern is the same.
4. Typed configs: production Rust code typically uses serde with a TOML/INI deserializer to map config directly to a typed struct; OCaml's ppx_sexp_conv does the same for S-expression configs.
OCaml Approach
OCaml's Inifiles library parses INI files with similar section/key structure. The Config module from caml-gettext handles POSIX-style configuration. For TOML, toml-ocaml and otoml are available. The parsing pattern uses String.split_on_char for line splitting and String.trim for whitespace normalization, which correspond to Rust's lines() (or split('\n')) and trim(). String.trim also removes the trailing \r that split_on_char '\n' leaves behind on CRLF input.
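The line-splitting difference is easy to check directly. The sketch below compares splitting on '\n' alone with `str::lines()` on CRLF input; `naive_lines` and `std_lines` are hypothetical helper names used only for this comparison.

```rust
/// Split on '\n' only, the way a plain `split_on_char '\n'` would.
/// On CRLF input every piece keeps its trailing '\r'.
pub fn naive_lines(input: &str) -> Vec<&str> {
    input.split('\n').collect()
}

/// Use `str::lines`, which strips a trailing "\n" or "\r\n" from each line
/// and does not yield an empty piece after the final line ending.
pub fn std_lines(input: &str) -> Vec<&str> {
    input.lines().collect()
}

/// Trimming each naive piece removes the stray '\r', which is why the
/// split-then-trim pattern still works on Windows-style files.
pub fn naive_trimmed(input: &str) -> Vec<&str> {
    input.split('\n').map(str::trim).collect()
}
```

With the input `"a = 1\r\nb = 2\r\n"`, the naive split keeps the carriage returns and produces a trailing empty piece, while `lines()` yields the two clean lines.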
Full Source
#![allow(clippy::all)]
//! # Config File Parsing
//!
//! INI-style configuration file parser.

use std::collections::HashMap;

/// A configuration with sections
#[derive(Debug, Default)]
pub struct Config {
    pub global: HashMap<String, String>,
    pub sections: HashMap<String, HashMap<String, String>>,
}

impl Config {
    pub fn new() -> Self {
        Config::default()
    }

    /// Get a global value
    pub fn get(&self, key: &str) -> Option<&str> {
        self.global.get(key).map(String::as_str)
    }

    /// Get a value from a section
    pub fn get_section(&self, section: &str, key: &str) -> Option<&str> {
        self.sections
            .get(section)
            .and_then(|s| s.get(key))
            .map(String::as_str)
    }

    /// Get all keys in a section
    pub fn section_keys(&self, section: &str) -> Vec<&str> {
        self.sections
            .get(section)
            .map(|s| s.keys().map(String::as_str).collect())
            .unwrap_or_default()
    }
}

/// Parse error
#[derive(Debug, PartialEq)]
pub enum ParseError {
    InvalidLine { line: usize, content: String },
    DuplicateSection { name: String },
}

/// Parse INI-style config
pub fn parse_config(input: &str) -> Result<Config, ParseError> {
    let mut config = Config::new();
    let mut current_section: Option<String> = None;

    for (line_num, line) in input.lines().enumerate() {
        let line = line.trim();

        // Skip empty lines and comments
        if line.is_empty() || line.starts_with('#') || line.starts_with(';') {
            continue;
        }

        // Section header
        if line.starts_with('[') && line.ends_with(']') {
            let name = line[1..line.len() - 1].trim().to_string();
            if config.sections.contains_key(&name) {
                return Err(ParseError::DuplicateSection { name });
            }
            config.sections.insert(name.clone(), HashMap::new());
            current_section = Some(name);
            continue;
        }

        // Key-value pair
        if let Some((key, value)) = line.split_once('=') {
            let key = key.trim().to_string();
            let value = value.trim().to_string();
            match &current_section {
                Some(section) => {
                    // The section was inserted when its header was parsed.
                    config.sections.get_mut(section).unwrap().insert(key, value);
                }
                None => {
                    config.global.insert(key, value);
                }
            }
        } else {
            return Err(ParseError::InvalidLine {
                line: line_num, // 0-based index from enumerate()
                content: line.to_string(),
            });
        }
    }

    Ok(config)
}

/// Format config back to string
pub fn format_config(config: &Config) -> String {
    let mut output = String::new();

    // Global section
    for (key, value) in &config.global {
        output.push_str(&format!("{} = {}\n", key, value));
    }
    if !config.global.is_empty() && !config.sections.is_empty() {
        output.push('\n');
    }

    // Named sections
    for (section_name, section) in &config.sections {
        output.push_str(&format!("[{}]\n", section_name));
        for (key, value) in section {
            output.push_str(&format!("{} = {}\n", key, value));
        }
        output.push('\n');
    }

    output.trim_end().to_string()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_global() {
        let input = "key = value";
        let config = parse_config(input).unwrap();
        assert_eq!(config.get("key"), Some("value"));
    }

    #[test]
    fn test_parse_section() {
        let input = "[database]\nhost = localhost\nport = 5432";
        let config = parse_config(input).unwrap();
        assert_eq!(config.get_section("database", "host"), Some("localhost"));
        assert_eq!(config.get_section("database", "port"), Some("5432"));
    }

    #[test]
    fn test_parse_comments() {
        let input = "# comment\nkey = value\n; another comment";
        let config = parse_config(input).unwrap();
        assert_eq!(config.get("key"), Some("value"));
        assert_eq!(config.global.len(), 1);
    }

    #[test]
    fn test_duplicate_section() {
        let input = "[section]\na = 1\n[section]\nb = 2";
        let result = parse_config(input);
        assert!(matches!(result, Err(ParseError::DuplicateSection { .. })));
    }

    #[test]
    fn test_invalid_line() {
        let input = "not a valid line";
        let result = parse_config(input);
        assert!(matches!(result, Err(ParseError::InvalidLine { .. })));
    }

    #[test]
    fn test_section_keys() {
        let input = "[server]\nhost = localhost\nport = 8080";
        let config = parse_config(input).unwrap();
        let keys = config.section_keys("server");
        assert!(keys.contains(&"host"));
        assert!(keys.contains(&"port"));
    }
}
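The source derives only Debug for ParseError, so callers see the raw enum when a config file is malformed. For the graceful error reporting mentioned in the problem statement, one possible extension is a Display implementation. The sketch below repeats the enum so that it compiles on its own; the message wording is an assumption, not part of the original source.

```rust
use std::fmt;

/// Same shape as the `ParseError` in the full source, repeated here
/// so the sketch is self-contained.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    InvalidLine { line: usize, content: String },
    DuplicateSection { name: String },
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::InvalidLine { line, content } => {
                // `line` is the 0-based index from `enumerate()`,
                // so add 1 to show the number an editor would display.
                write!(
                    f,
                    "line {}: expected `key = value`, found {:?}",
                    line + 1,
                    content
                )
            }
            ParseError::DuplicateSection { name } => {
                write!(f, "section [{}] is defined more than once", name)
            }
        }
    }
}

// Implementing `Error` lets the type work with `?` and `Box<dyn Error>`.
impl std::error::Error for ParseError {}
```

With this in place, a caller can print the error with `{}` or return it through `Box<dyn std::error::Error>` without writing a match at every call site.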
Deep Comparison
OCaml vs Rust: Config File Parsing
Config Structure
Rust
pub struct Config {
    pub global: HashMap<String, String>,
    pub sections: HashMap<String, HashMap<String, String>>,
}
OCaml
type config = {
  global: (string * string) list;
  sections: (string * (string * string) list) list;
}
Parsing
Rust
if line.starts_with('[') && line.ends_with(']') {
    let name = line[1..line.len() - 1].trim();
    current_section = Some(name.to_string());
} else if let Some((key, value)) = line.split_once('=') {
    // Insert key-value
}
OCaml
match String.get line 0, String.get line (String.length line - 1) with
| '[', ']' ->
    let name = String.sub line 1 (String.length line - 2) in
    parse_lines rest ~current:(Some name)
| _ ->
    match String.split_on_char '=' line with
    | [key; value] -> (* insert *) ...
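One behavioral difference between the two snippets is worth spelling out. Splitting on every '=' produces three or more pieces when the value itself contains '=', so a two-element pattern such as [key; value] would not match that line. Rust's split_once cuts at the first '=' only. A minimal sketch, with `key_value` and `pieces` as hypothetical helper names:

```rust
/// Split a line at the first '=' only, so the value may itself contain '='.
pub fn key_value(line: &str) -> Option<(&str, &str)> {
    line.split_once('=').map(|(k, v)| (k.trim(), v.trim()))
}

/// Count the pieces produced by splitting on every '=',
/// which is what a split-on-character approach would see.
pub fn pieces(line: &str) -> usize {
    line.split('=').count()
}
```

For a line such as `url = postgres://h/db?sslmode=require`, `key_value` keeps the whole URL as the value, while `pieces` reports three fragments.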
Key Differences
| Aspect | OCaml | Rust |
|---|---|---|
| Map type | Association list | HashMap |
| String split | split_on_char | .split_once() |
| Section start | Pattern match | starts_with |
| Mutable state | Recursive accumulator | Mutable vars |
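The "Mutable state" row does not force a choice: Rust can also thread an accumulator through the lines, much like the recursive OCaml version. The sketch below uses Iterator::try_fold for that. The names `State` and `parse_fold` are hypothetical, errors are reduced to the 0-based index of the offending line, and repeated sections are merged rather than rejected, so this simplifies the full parser and is not a drop-in replacement.

```rust
use std::collections::HashMap;

type Section = HashMap<String, String>;

/// Accumulator threaded through the fold (hypothetical name).
#[derive(Debug, Default)]
pub struct State {
    pub global: Section,
    pub sections: HashMap<String, Section>,
    pub current: Option<String>,
}

/// Parse with `try_fold` instead of a `for` loop and mutable locals.
/// On failure, returns the 0-based index of the first malformed line.
pub fn parse_fold(input: &str) -> Result<State, usize> {
    input
        .lines()
        .map(str::trim)
        .enumerate()
        // Drop blank lines and comments before folding.
        .filter(|(_, l)| !(l.is_empty() || l.starts_with('#') || l.starts_with(';')))
        .try_fold(State::default(), |mut st, (i, line)| {
            if line.starts_with('[') && line.ends_with(']') {
                let name = line[1..line.len() - 1].trim().to_string();
                // Simplification: a repeated section is reused, not an error.
                st.sections.entry(name.clone()).or_default();
                st.current = Some(name);
                Ok(st)
            } else if let Some((k, v)) = line.split_once('=') {
                // Pick the map the pair belongs to: current section or global.
                let target = match &st.current {
                    Some(name) => st.sections.entry(name.clone()).or_default(),
                    None => &mut st.global,
                };
                target.insert(k.trim().to_string(), v.trim().to_string());
                Ok(st)
            } else {
                Err(i)
            }
        })
}
```

The accumulator is moved into and out of the closure on each step, so no state lives outside the fold. Stopping at the first Err is what try_fold adds over a plain fold.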
Exercises
1. Add support for include = /path/to/other.ini directives that merge another config file's sections and keys into the current config.
2. Implement Config::to_string() that serializes the config back to INI format, preserving sections and key-value pairs.
3. Add typed getters: get_bool(section, key) -> Option<bool>, get_u64(section, key) -> Option<u64>, and get_list(section, key) -> Vec<String> (comma-separated values).
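As a possible starting point for the typed getters, the conversion step can be written as free functions over the raw string value. The function names and the set of accepted boolean spellings are assumptions made for this sketch; wiring them into Config is left as the exercise.

```rust
/// Interpret a raw config value as a boolean.
/// The accepted spellings are an assumption; adjust them to taste.
pub fn parse_bool(raw: &str) -> Option<bool> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "true" | "yes" | "on" | "1" => Some(true),
        "false" | "no" | "off" | "0" => Some(false),
        _ => None,
    }
}

/// Interpret a raw config value as an unsigned 64-bit integer.
pub fn parse_u64(raw: &str) -> Option<u64> {
    raw.trim().parse().ok()
}

/// Split a comma-separated value into trimmed, non-empty items.
pub fn parse_list(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|item| !item.is_empty())
        .map(str::to_string)
        .collect()
}
```

Because get_section already returns Option<&str>, a getter can chain these helpers, for example config.get_section("database", "port").and_then(parse_u64).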