492 Fundamental

OsStr Handling

Functional Programming

Tutorial

The Problem

On Linux, a filename can be any byte sequence that contains neither NUL nor the / separator; Windows stores filenames as 16-bit code units that are usually, but not necessarily, valid UTF-16. A Rust String is always valid UTF-8, so it cannot represent every filename on either platform. The OsStr/OsString types bridge this gap: they hold any platform string losslessly and provide a .to_str() method that returns None when the contents are not valid UTF-8, rather than silently corrupting data. Path is built on OsStr internally, making these types essential for any code that interacts with the filesystem or environment.
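
A minimal sketch of this behaviour, assuming a Unix target (the OsStrExt extension trait used to build an OsStr from raw bytes is Unix-only). The helper name describe and the filenames are hypothetical, chosen only for illustration:

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt; // Unix-only extension trait (assumption: Unix target)

/// Hypothetical helper: report whether an OS string is valid UTF-8.
fn describe(name: &OsStr) -> String {
    match name.to_str() {
        Some(s) => format!("valid UTF-8: {s}"),
        None => format!("not UTF-8, lossy view: {}", name.to_string_lossy()),
    }
}

fn main() {
    // The byte 0xFF can never appear in well-formed UTF-8.
    let raw = OsStr::from_bytes(b"report\xFF.txt");
    assert_eq!(raw.to_str(), None);
    // The original bytes are still intact inside the OsStr.
    assert_eq!(raw.as_bytes(), b"report\xFF.txt");

    println!("{}", describe(raw));
    println!("{}", describe(OsStr::new("notes.txt")));
}
```

The lossy view replaces the invalid byte with U+FFFD, so it is suitable for display but not for reopening the file.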

🎯 Learning Outcomes

  • Create OsStr from a &str with OsStr::new(s)
  • Convert back to &str with .to_str() returning Option<&str>
  • Use .to_string_lossy() to get a Cow<str> with replacement characters for non-UTF-8 bytes
  • Understand that Path::extension() returns Option<&OsStr>, not Option<&str>
  • Distinguish OsStr (borrowed) from OsString (owned) as the str/String analogy
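
The last two outcomes can be sketched as follows. The helper names utf8_extension and with_suffix are hypothetical; the sketch assumes only the standard library:

```rust
use std::ffi::{OsStr, OsString};
use std::path::Path;

/// Hypothetical helper: the extension as &str, or None if it is
/// absent or not valid UTF-8.
fn utf8_extension(path: &Path) -> Option<&str> {
    path.extension().and_then(OsStr::to_str)
}

/// Hypothetical helper: build an owned OsString from a borrowed OsStr.
fn with_suffix(base: &OsStr, suffix: &str) -> OsString {
    let mut owned = base.to_os_string(); // borrowed -> owned, like str::to_owned
    owned.push(suffix); // OsString grows in place, like String
    owned
}

fn main() {
    assert_eq!(utf8_extension(Path::new("archive.tar.gz")), Some("gz"));
    assert_eq!(utf8_extension(Path::new("Makefile")), None);

    let backup = with_suffix(OsStr::new("notes"), ".bak");
    assert_eq!(backup, OsString::from("notes.bak"));
    // as_os_str borrows the owned value again, like String::as_str.
    assert_eq!(backup.as_os_str(), OsStr::new("notes.bak"));
}
```

Chaining and_then(OsStr::to_str) collapses the two failure cases (no extension, non-UTF-8 extension) into a single None, which is often enough when the caller only matches on known extensions.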

    Code Example

    #![allow(clippy::all)]
    // 492. OsStr and OsString
    
    use std::ffi::{OsStr, OsString};
    use std::path::Path;
    
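    #[cfg(test)]
    mod tests {
        use super::*;
        // A valid UTF-8 &str survives the round trip through OsStr unchanged.
        #[test]
        fn test_osstr_roundtrip() {
            let s = "hello";
            let os = OsStr::new(s);
            assert_eq!(os.to_str(), Some(s));
        }
        // Path::extension() yields Option<&OsStr>, so compare against an OsStr.
        #[test]
        fn test_path_ext() {
            let p = Path::new("f.rs");
            assert_eq!(p.extension(), Some(OsStr::new("rs")));
        }
        // For valid UTF-8 input, to_string_lossy() returns the text unchanged.
        #[test]
        fn test_os_string() {
            let s = OsString::from("hi");
            assert_eq!(s.to_string_lossy(), "hi");
        }
    }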

    Key Differences

  • Type-level encoding guarantee: Rust has three string types for three encoding domains (str = UTF-8, OsStr = OS encoding, CStr = NUL-terminated C); OCaml uses a single string type for all.
  • to_str() returns Option: Rust forces callers to handle non-UTF-8 explicitly; OCaml silently passes bytes through.
  • Ecosystem integration: Rust's standard library exposes OsStr/OsString for OS-provided strings (for example std::env::var_os, std::env::args_os, and DirEntry::file_name), alongside String-returning conveniences such as std::env::var that reject non-UTF-8 data; OCaml's Sys and Unix modules return plain string.
  • to_string_lossy: Rust provides a built-in lossy converter returning Cow<str>; OCaml needs manual UTF-8 validation and replacement.
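
On the Rust side, the first two points can be sketched like this. The helper c_bytes_to_str is hypothetical; it only illustrates that each string type makes its own conversion to &str explicitly fallible:

```rust
use std::ffi::{CStr, OsStr};

/// Hypothetical helper: view a NUL-terminated byte buffer as &str if possible.
fn c_bytes_to_str(bytes: &[u8]) -> Option<&str> {
    // Fails unless the buffer contains exactly one NUL, as its last byte.
    let c = CStr::from_bytes_with_nul(bytes).ok()?;
    // CStr::to_str reports invalid UTF-8 through a Result.
    c.to_str().ok()
}

fn main() {
    // str: always valid UTF-8, so there is nothing to check.
    let plain: &str = "config";

    // OsStr: conversion to &str is fallible and returns Option.
    let os: &OsStr = OsStr::new(plain);
    assert_eq!(os.to_str(), Some("config"));

    // CStr: both construction and UTF-8 conversion can fail.
    assert_eq!(c_bytes_to_str(b"config\0"), Some("config"));
    assert_eq!(c_bytes_to_str(b"config"), None); // missing terminator
    assert_eq!(c_bytes_to_str(b"con\0fig\0"), None); // interior NUL
    assert_eq!(c_bytes_to_str(b"\xFF\0"), None); // not UTF-8
}
```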

    OCaml Approach

    OCaml's string is a byte sequence — it naturally handles non-UTF-8 filenames without a separate type:

    (* Filename from environment — could be non-UTF-8 on Unix *)
    let fname = Sys.getenv "HOME" ^ "/file.txt"
    (* No type distinction — both UTF-8 and non-UTF-8 use string *)
    

    On Windows, the OCaml runtime (4.06 and later) treats filename strings as UTF-8 and converts them to UTF-16 when calling the Windows API, so the same string type is used there as well. The Fpath library provides a typed path abstraction.


    Exercises

  • Environment variable lister: Write a function using std::env::vars_os() that collects all env vars as (OsString, OsString) pairs and converts each to String via .to_string_lossy().into_owned().
  • Non-UTF-8 filename test: On Linux, bring std::os::unix::ffi::OsStrExt into scope, create a file with a non-UTF-8 name using std::fs::File::create(Path::new(OsStr::from_bytes(&[0xfe, 0xff]))), and verify that calling to_str() on that name returns None.
  • Cross-platform path builder: Write fn config_path() -> PathBuf that uses std::env::var_os("HOME") (returning Option<OsString>) and joins it with a relative path, handling None with a fallback.

    Open Source Repos