766 Fundamental

766-config-file-parsing — Config File Parsing

Functional Programming


Tutorial

The Problem

Configuration files — INI, TOML, YAML — are the primary mechanism for parametrizing deployed software without recompilation. INI format predates them all and remains ubiquitous in tools like git config, Windows registry exports, and openssl.cnf. Parsing INI manually teaches section-based hierarchical configuration, comment handling, and graceful error reporting for malformed config files.

🎯 Learning Outcomes

  • Parse INI-style [section] headers and key = value assignments
  • Store configuration in a two-level HashMap<String, HashMap<String, String>>
  • Handle full-line comments (starting with # or ;), blank lines, and surrounding whitespace
  • Return typed ParseError variants for invalid lines and duplicate sections
  • Implement typed accessors: get_section(section, key) -> Option<&str>

    Code Example

    pub struct Config {
        pub global: HashMap<String, String>,
        pub sections: HashMap<String, HashMap<String, String>>,
    }
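
A minimal sketch of how the two-level map is queried, assuming the Config struct shown above. The helper names lookup and sample_config and the sample values are hypothetical, chosen only for illustration.

```rust
use std::collections::HashMap;

pub struct Config {
    pub global: HashMap<String, String>,
    pub sections: HashMap<String, HashMap<String, String>>,
}

/// Hypothetical helper: look up `key` inside `section`, returning `None`
/// when either the section or the key is missing.
pub fn lookup<'a>(config: &'a Config, section: &str, key: &str) -> Option<&'a str> {
    config
        .sections
        .get(section)             // Option<&HashMap<String, String>>
        .and_then(|s| s.get(key)) // Option<&String>
        .map(String::as_str)      // Option<&str>
}

/// Build a small sample config by hand (values are illustrative only).
pub fn sample_config() -> Config {
    let mut database = HashMap::new();
    database.insert("host".to_string(), "localhost".to_string());

    let mut sections = HashMap::new();
    sections.insert("database".to_string(), database);

    Config { global: HashMap::new(), sections }
}
```

Calling lookup(&sample_config(), "database", "host") yields Some("localhost"), while an unknown section or key yields None without panicking.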

    Key Differences

  • HashMap access: Rust's HashMap::get returns Option<&V>; OCaml's Hashtbl.find_opt returns 'v option. The semantics are equivalent.
  • Line iteration: Rust's str::lines() is lazy and handles both \n and \r\n; OCaml's String.split_on_char '\n' requires manual \r stripping.
  • Duplicate detection: Rust checks HashMap::contains_key before insert; OCaml uses Hashtbl.mem — same pattern.
  • Typed config: Production code uses serde with a TOML/INI deserializer to map config directly to a typed struct; OCaml's ppx_sexp_conv does the same for S-expression configs.
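
The line-iteration difference in the list above can be checked directly. This is a small sketch with hypothetical function names; it only contrasts str::lines with split('\n') on Windows-style input.

```rust
/// Collect lines using `str::lines`, which strips both "\n" and "\r\n".
pub fn lines_via_lines(input: &str) -> Vec<&str> {
    input.lines().collect()
}

/// Collect lines using `split('\n')`, which leaves a trailing '\r' on
/// CRLF input (the same situation as OCaml's `String.split_on_char '\n'`)
/// and also yields a final empty piece when the input ends with '\n'.
pub fn lines_via_split(input: &str) -> Vec<&str> {
    input.split('\n').collect()
}
```

For the input "a = 1\r\nb = 2\r\n", the first function returns two clean lines, while the second returns "a = 1\r", "b = 2\r", and an empty string. Because the parser trims each line anyway, the stray \r would be harmless there, but the extra empty piece is one more reason lines() is the more convenient choice.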

    OCaml Approach

    OCaml's Inifiles library parses INI files with a similar section/key structure. The Config module from caml-gettext handles POSIX-style configuration. For TOML, toml-ocaml and otoml are available. A hand-written parser uses String.split_on_char for line splitting and String.trim for whitespace normalization, which correspond to Rust's lines() and trim(), except that split_on_char '\n' leaves a trailing \r on CRLF input.

    Full Source

    #![allow(clippy::all)]
    //! # Config File Parsing
    //!
    //! INI-style configuration file parser.
    
    use std::collections::HashMap;
    
    /// A configuration with sections
    #[derive(Debug, Default)]
    pub struct Config {
        pub global: HashMap<String, String>,
        pub sections: HashMap<String, HashMap<String, String>>,
    }
    
    impl Config {
        pub fn new() -> Self {
            Config::default()
        }
    
        /// Get a global value
        pub fn get(&self, key: &str) -> Option<&str> {
            self.global.get(key).map(String::as_str)
        }
    
        /// Get a value from a section
        pub fn get_section(&self, section: &str, key: &str) -> Option<&str> {
            self.sections
                .get(section)
                .and_then(|s| s.get(key))
                .map(String::as_str)
        }
    
        /// Get all keys in a section
        pub fn section_keys(&self, section: &str) -> Vec<&str> {
            self.sections
                .get(section)
                .map(|s| s.keys().map(String::as_str).collect())
                .unwrap_or_default()
        }
    }
    
    /// Parse error
    #[derive(Debug, PartialEq)]
    pub enum ParseError {
        InvalidLine { line: usize, content: String },
        DuplicateSection { name: String },
    }
    
    /// Parse INI-style config
    pub fn parse_config(input: &str) -> Result<Config, ParseError> {
        let mut config = Config::new();
        let mut current_section: Option<String> = None;
    
        for (line_num, line) in input.lines().enumerate() {
            let line = line.trim();
    
            // Skip empty lines and comments
            if line.is_empty() || line.starts_with('#') || line.starts_with(';') {
                continue;
            }
    
            // Section header
            if line.starts_with('[') && line.ends_with(']') {
                let name = line[1..line.len() - 1].trim().to_string();
                if config.sections.contains_key(&name) {
                    return Err(ParseError::DuplicateSection { name });
                }
                config.sections.insert(name.clone(), HashMap::new());
                current_section = Some(name);
                continue;
            }
    
            // Key-value pair
            if let Some((key, value)) = line.split_once('=') {
                let key = key.trim().to_string();
                let value = value.trim().to_string();
    
                match &current_section {
                    Some(section) => {
                        config.sections.get_mut(section).unwrap().insert(key, value);
                    }
                    None => {
                        config.global.insert(key, value);
                    }
                }
            } else {
                return Err(ParseError::InvalidLine {
                    line: line_num + 1, // enumerate() is 0-based; report 1-based line numbers
                    content: line.to_string(),
                });
            }
        }
    
        Ok(config)
    }
    
    /// Format config back to string
    pub fn format_config(config: &Config) -> String {
        let mut output = String::new();
    
        // Global section
        for (key, value) in &config.global {
            output.push_str(&format!("{} = {}\n", key, value));
        }
    
        if !config.global.is_empty() && !config.sections.is_empty() {
            output.push('\n');
        }
    
        // Named sections
        for (section_name, section) in &config.sections {
            output.push_str(&format!("[{}]\n", section_name));
            for (key, value) in section {
                output.push_str(&format!("{} = {}\n", key, value));
            }
            output.push('\n');
        }
    
        output.trim_end().to_string()
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_parse_global() {
            let input = "key = value";
            let config = parse_config(input).unwrap();
            assert_eq!(config.get("key"), Some("value"));
        }
    
        #[test]
        fn test_parse_section() {
            let input = "[database]\nhost = localhost\nport = 5432";
            let config = parse_config(input).unwrap();
            assert_eq!(config.get_section("database", "host"), Some("localhost"));
            assert_eq!(config.get_section("database", "port"), Some("5432"));
        }
    
        #[test]
        fn test_parse_comments() {
            let input = "# comment\nkey = value\n; another comment";
            let config = parse_config(input).unwrap();
            assert_eq!(config.get("key"), Some("value"));
            assert_eq!(config.global.len(), 1);
        }
    
        #[test]
        fn test_duplicate_section() {
            let input = "[section]\na = 1\n[section]\nb = 2";
            let result = parse_config(input);
            assert!(matches!(result, Err(ParseError::DuplicateSection { .. })));
        }
    
        #[test]
        fn test_invalid_line() {
            let input = "not a valid line";
            let result = parse_config(input);
            assert!(matches!(result, Err(ParseError::InvalidLine { .. })));
        }
    
        #[test]
        fn test_section_keys() {
            let input = "[server]\nhost = localhost\nport = 8080";
            let config = parse_config(input).unwrap();
            let keys = config.section_keys("server");
            assert!(keys.contains(&"host"));
            assert!(keys.contains(&"port"));
        }
    }

    Deep Comparison

    OCaml vs Rust: Config File Parsing

    Config Structure

    Rust

    pub struct Config {
        pub global: HashMap<String, String>,
        pub sections: HashMap<String, HashMap<String, String>>,
    }
    

    OCaml

    type config = {
      global: (string * string) list;
      sections: (string * (string * string) list) list;
    }
    

    Parsing

    Rust

    if line.starts_with('[') && line.ends_with(']') {
        let name = line[1..line.len()-1].trim();
        current_section = Some(name.to_string());
    } else if let Some((key, value)) = line.split_once('=') {
        // Insert key-value
    }
    

    OCaml

    match String.get line 0, String.get line (String.length line - 1) with
    | '[', ']' ->
        let name = String.sub line 1 (String.length line - 2) in
        parse_lines rest ~current:(Some name)
    | _ ->
        match String.split_on_char '=' line with
        | [key; value] -> (* insert *) ...
    

    Key Differences

    Aspect         OCaml                   Rust
    Map type       Association list        HashMap
    String split   split_on_char           .split_once()
    Section start  Pattern match           starts_with
    Mutable state  Recursive accumulator   Mutable vars
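
The "Mutable state" row does not force Rust into loop-level mutable variables. The OCaml-style accumulator can be approximated by threading an owned state value through Iterator::try_fold. The sketch below is an illustration under stated assumptions, not the example's actual parser: the State type and the parse_functional name are hypothetical, global keys are stored under the empty section name, and duplicate-section detection is omitted for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical accumulator: the parsed sections plus the name of the
/// section currently being filled (the empty string stands for "global").
#[derive(Debug, Default)]
pub struct State {
    pub sections: HashMap<String, HashMap<String, String>>,
    pub current: String,
}

/// Fold each line into the accumulator. On failure, returns the 1-based
/// number of the first line that is neither a comment, a section header,
/// nor a `key = value` pair.
pub fn parse_functional(input: &str) -> Result<State, usize> {
    input
        .lines()
        .map(str::trim)
        .enumerate()
        // Drop blank lines and full-line comments before folding.
        .filter(|(_, l)| !l.is_empty() && !l.starts_with('#') && !l.starts_with(';'))
        .try_fold(State::default(), |mut st, (i, line)| {
            if line.starts_with('[') && line.ends_with(']') {
                // Section header: switch the current section.
                st.current = line[1..line.len() - 1].trim().to_string();
                st.sections.entry(st.current.clone()).or_default();
                Ok(st)
            } else if let Some((k, v)) = line.split_once('=') {
                // Key-value pair: insert into the current section.
                st.sections
                    .entry(st.current.clone())
                    .or_default()
                    .insert(k.trim().to_string(), v.trim().to_string());
                Ok(st)
            } else {
                // Stop folding at the first malformed line.
                Err(i + 1)
            }
        })
}
```

try_fold short-circuits on the first Err, which plays the role of the early return in the imperative version.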

    Exercises

  • Add support for include = /path/to/other.ini directives that merge another config file's sections and keys into the current config.
  • Implement Config::to_string() that serializes the config back to INI format, preserving sections and key-value pairs.
  • Write typed getters: get_bool(section, key) -> Option<bool>, get_u64(section, key) -> Option<u64>, and get_list(section, key) -> Vec<String> (comma-separated values).
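
As a starting point for the typed-getter exercise, here is one possible sketch. It works on a bare sections map rather than on Config itself so that it stays self-contained; the Sections alias, the raw helper, and the set of accepted boolean spellings ("yes", "on", "1", and so on) are assumptions, not part of the original example.

```rust
use std::collections::HashMap;

/// Same shape as the `sections` field of `Config`.
pub type Sections = HashMap<String, HashMap<String, String>>;

/// Hypothetical helper: fetch the raw string value, if present.
fn raw<'a>(sections: &'a Sections, section: &str, key: &str) -> Option<&'a str> {
    sections.get(section)?.get(key).map(String::as_str)
}

/// Parse a boolean, case-insensitively. The accepted spellings are an
/// assumption; unrecognized values yield `None`.
pub fn get_bool(sections: &Sections, section: &str, key: &str) -> Option<bool> {
    match raw(sections, section, key)?.to_ascii_lowercase().as_str() {
        "true" | "yes" | "on" | "1" => Some(true),
        "false" | "no" | "off" | "0" => Some(false),
        _ => None,
    }
}

/// Parse an unsigned integer; missing or non-numeric values yield `None`.
pub fn get_u64(sections: &Sections, section: &str, key: &str) -> Option<u64> {
    raw(sections, section, key)?.parse().ok()
}

/// Split a comma-separated value into trimmed, non-empty items.
/// A missing key yields an empty list.
pub fn get_list(sections: &Sections, section: &str, key: &str) -> Vec<String> {
    raw(sections, section, key)
        .map(|v| {
            v.split(',')
                .map(str::trim)
                .filter(|item| !item.is_empty())
                .map(String::from)
                .collect()
        })
        .unwrap_or_default()
}
```

Returning Option rather than Result keeps the getters simple but hides why a value failed to parse; a fuller solution might add a dedicated error variant instead.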