Ignore Crate Overview: Code Companion

Reference code for the Ignore Crate Overview lecture. Sections correspond to the lecture document.


Section 1: The Crate's Public Interface

/*!
The ignore crate provides a fast recursive directory iterator that respects
various filters such as globs, file types and `.gitignore` files. The precise
matching rules and precedence are explained in the documentation for
`WalkBuilder`.

Secondarily, this crate exposes gitignore and file type matchers for use cases
that demand more fine-grained control.
*/

// Usage examples adapted from the documentation
use ignore::{Walk, WalkBuilder};

fn main() {
    // Simple usage: walk the current directory with the default filters
    for result in Walk::new("./") {
        match result {
            Ok(entry) => println!("{}", entry.path().display()),
            Err(err) => println!("ERROR: {}", err),
        }
    }

    // Builder pattern for customization: hidden(false) turns off the
    // hidden-file filter, so dotfiles are visited too
    for result in WalkBuilder::new("./").hidden(false).build() {
        println!("{:?}", result);
    }
}

// Re-exporting key types at the crate root for a clean public API
pub use crate::walk::{
    DirEntry, ParallelVisitor, ParallelVisitorBuilder, Walk, WalkBuilder,
    WalkParallel, WalkState,
};

The pub use pattern allows users to write ignore::Walk instead of ignore::walk::Walk, hiding internal module organization from the public API.
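
The re-export pattern can be tried in isolation. The following is a minimal sketch, assuming a made-up module named mini_ignore with a toy WalkBuilder; none of these definitions come from the real crate, and the toy builder takes self by value purely to keep the example short.

```rust
// Hypothetical miniature layout; `mini_ignore` and its contents are
// illustrative names, not the real ignore crate.
mod mini_ignore {
    // Private module: code outside `mini_ignore` cannot name it.
    mod walk {
        #[derive(Debug)]
        pub struct WalkBuilder {
            hidden: bool,
        }

        impl WalkBuilder {
            pub fn new() -> WalkBuilder {
                // In this toy, hidden files are skipped by default.
                WalkBuilder { hidden: true }
            }

            pub fn hidden(mut self, yes: bool) -> WalkBuilder {
                self.hidden = yes;
                self
            }

            pub fn skips_hidden(&self) -> bool {
                self.hidden
            }
        }
    }

    // Re-export: the type is reachable as `mini_ignore::WalkBuilder`.
    pub use self::walk::WalkBuilder;
}

fn main() {
    // Works: the re-exported path.
    let builder = mini_ignore::WalkBuilder::new().hidden(false);
    println!("skips hidden files: {}", builder.skips_hidden());

    // Would not compile, because module `walk` is private:
    // let _ = mini_ignore::walk::WalkBuilder::new();
}
```

Moving WalkBuilder to a different private module later would only require changing the pub use line; callers would be unaffected.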


Section 2: Module Organization

// Private modules - implementation details
mod default_types;  // Built-in file type definitions
mod dir;            // Directory entry handling
mod pathutil;       // Path manipulation utilities
mod walk;           // Core walking implementation (types re-exported above)

// Public modules - stable API for fine-grained control
pub mod gitignore;  // Gitignore file parsing and matching
pub mod overrides;  // User-specified override patterns
pub mod types;      // File type matching (e.g., "rust", "python")

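The pub mod vs mod distinction determines which module paths users can import directly. Private modules can be reorganized freely, although any item re-exported from them (such as the walk types) remains part of the public API; public modules are part of the semver contract in full.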


Section 3: The Error Enum Architecture

use std::path::{Path, PathBuf};  // Path is used by the builders in Section 6

/// Represents an error that can occur when parsing a gitignore file.
#[derive(Debug)]
pub enum Error {
    /// A collection of "soft" errors. These occur when adding an ignore
    /// file partially succeeded.
    Partial(Vec<Error>),

    /// An error associated with a specific line number.
    WithLineNumber {
        line: u64,
        err: Box<Error>,  // Boxed to avoid infinite size
    },

    /// An error associated with a particular file path.
    WithPath {
        path: PathBuf,
        err: Box<Error>,
    },

    /// An error associated with a particular directory depth when recursively
    /// walking a directory.
    WithDepth {
        depth: usize,
        err: Box<Error>,
    },

    /// An error that occurs when a file loop is detected when traversing
    /// symbolic links.
    Loop {
        ancestor: PathBuf,
        child: PathBuf,
    },

    /// An error that occurs when doing I/O, such as reading an ignore file.
    Io(std::io::Error),

    /// An error that occurs when trying to parse a glob.
    Glob {
        glob: Option<String>,  // Original user-provided pattern
        err: String,
    },

    /// A type selection for a file type that is not defined.
    UnrecognizedFileType(String),

    /// A user specified file type definition could not be parsed.
    InvalidDefinition,
}

The wrapper variants (WithLineNumber, WithPath, WithDepth) create a nested structure that accumulates context as errors propagate upward.
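
To see the nesting concretely, here is a minimal sketch that walks the wrappers from the outside in. It uses a reduced stand-in for the enum (two wrappers and one leaf), and the peel function is a hypothetical helper written for this illustration, not something the crate provides.

```rust
use std::path::PathBuf;

// Reduced stand-in for the crate's enum: two wrappers and one leaf variant.
#[derive(Debug)]
enum Error {
    WithLineNumber { line: u64, err: Box<Error> },
    WithPath { path: PathBuf, err: Box<Error> },
    Glob { err: String },
}

// Hypothetical helper: peel wrappers from the outside in, recording the
// context each layer carries, and return the innermost error.
fn peel(mut err: &Error) -> (Option<&PathBuf>, Option<u64>, &Error) {
    let (mut path, mut line) = (None, None);
    loop {
        match err {
            Error::WithPath { path: p, err: inner } => {
                path = Some(p);
                err = &**inner;
            }
            Error::WithLineNumber { line: l, err: inner } => {
                line = Some(*l);
                err = &**inner;
            }
            leaf => return (path, line, leaf),
        }
    }
}

fn main() {
    // Built inside-out, the way context is added as an error propagates.
    let err = Error::WithPath {
        path: PathBuf::from("src/.gitignore"),
        err: Box::new(Error::WithLineNumber {
            line: 7,
            err: Box::new(Error::Glob {
                err: "unclosed character class".to_string(),
            }),
        }),
    };

    let (path, line, leaf) = peel(&err);
    println!("path = {:?}, line = {:?}", path, line);
    if let Error::Glob { err } = leaf {
        println!("root cause: {}", err);
    }
}
```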


Section 4: Implementing Clone for Error

impl Clone for Error {
    fn clone(&self) -> Error {
        match *self {
            // Recursive variants clone their inner errors
            Error::Partial(ref errs) => Error::Partial(errs.clone()),
            Error::WithLineNumber { line, ref err } => {
                Error::WithLineNumber { line, err: err.clone() }
            }
            // ... other wrapper variants follow the same pattern ...

            // std::io::Error doesn't implement Clone, so we must rebuild it
            Error::Io(ref err) => match err.raw_os_error() {
                // If there's a raw OS error code, recreate from that
                Some(e) => Error::Io(std::io::Error::from_raw_os_error(e)),
                // Otherwise, create new error with same kind and string message
                None => {
                    Error::Io(std::io::Error::new(err.kind(), err.to_string()))
                }
            },

            // Simple variants clone directly
            Error::Glob { ref glob, ref err } => {
                Error::Glob { glob: glob.clone(), err: err.clone() }
            }
            Error::InvalidDefinition => Error::InvalidDefinition,
            // ...
        }
    }
}

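The I/O error cloning is a pragmatic workaround for std::io::Error not implementing Clone. Rebuilding from raw_os_error() reproduces OS-level errors from their error code; the fallback keeps kind() and the to_string() message, but flattens any custom inner error into a plain string.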
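
The same strategy can be exercised on its own with only the standard library. In this sketch the free function clone_io_error is a hypothetical name; its body mirrors the Error::Io arm above.

```rust
use std::io;

// Same strategy as the `Error::Io` arm, extracted into a free function.
// The function name is hypothetical.
fn clone_io_error(err: &io::Error) -> io::Error {
    match err.raw_os_error() {
        // OS-backed error: rebuild it from the raw code.
        Some(code) => io::Error::from_raw_os_error(code),
        // Anything else: keep the kind and the rendered message.
        None => io::Error::new(err.kind(), err.to_string()),
    }
}

fn main() {
    // OS-backed error: the raw code survives the copy.
    let os_err = io::Error::from_raw_os_error(2);
    let os_copy = clone_io_error(&os_err);
    println!("{:?} -> {:?}", os_err.raw_os_error(), os_copy.raw_os_error());

    // Custom error: kind and message survive; the copy's payload is a string.
    let custom = io::Error::new(io::ErrorKind::InvalidData, "bad header");
    let custom_copy = clone_io_error(&custom);
    println!("{:?}: {}", custom_copy.kind(), custom_copy);
}
```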


Section 5: Error Inspection Methods

impl Error {
    /// Returns true if this is a partial error.
    pub fn is_partial(&self) -> bool {
        match *self {
            Error::Partial(_) => true,
            // Recursively check through context wrappers
            Error::WithLineNumber { ref err, .. } => err.is_partial(),
            Error::WithPath { ref err, .. } => err.is_partial(),
            Error::WithDepth { ref err, .. } => err.is_partial(),
            _ => false,
        }
    }

    /// Returns true if this error is exclusively an I/O error.
    pub fn is_io(&self) -> bool {
        match *self {
            // Special case: single-element Partial containing an IO error
            Error::Partial(ref errs) => errs.len() == 1 && errs[0].is_io(),
            // Unwrap context layers
            Error::WithLineNumber { ref err, .. } => err.is_io(),
            Error::WithPath { ref err, .. } => err.is_io(),
            Error::WithDepth { ref err, .. } => err.is_io(),
            Error::Loop { .. } => false,  // Loops are not IO errors
            Error::Io(_) => true,
            Error::Glob { .. } => false,
            Error::UnrecognizedFileType(_) => false,
            Error::InvalidDefinition => false,
        }
    }

    /// Inspect the original std::io::Error if there is one.
    /// Returns borrowed reference - use into_io_error() for ownership.
    pub fn io_error(&self) -> Option<&std::io::Error> {
        match *self {
            Error::Partial(ref errs) => {
                if errs.len() == 1 {
                    errs[0].io_error()  // Delegate to inner error
                } else {
                    None
                }
            }
            Error::WithLineNumber { ref err, .. } => err.io_error(),
            Error::WithPath { ref err, .. } => err.io_error(),
            Error::WithDepth { ref err, .. } => err.io_error(),
            Error::Io(ref err) => Some(err),  // Found it!
            _ => None,
        }
    }

    /// Consumes self to return owned std::io::Error.
    pub fn into_io_error(self) -> Option<std::io::Error> {
        match self {
            Error::Io(err) => Some(err),
            Error::WithPath { err, .. } => err.into_io_error(),
            // ... similar pattern for other wrappers
            _ => None,
        }
    }
}

The borrowing (io_error) vs consuming (into_io_error) pattern lets callers choose based on whether they need ownership of the underlying error.
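
A caller-side view of that choice, as a minimal sketch: the enum below is a reduced stand-in with only three variants, and the two methods are trimmed to match it, so this is not the crate's full implementation.

```rust
use std::io;
use std::path::PathBuf;

// Reduced stand-in; `path` is only read through Debug in this sketch.
#[allow(dead_code)]
#[derive(Debug)]
enum Error {
    WithPath { path: PathBuf, err: Box<Error> },
    Io(io::Error),
    InvalidDefinition,
}

impl Error {
    // Borrowing accessor: the caller keeps the whole error.
    fn io_error(&self) -> Option<&io::Error> {
        match self {
            Error::WithPath { err, .. } => err.io_error(),
            Error::Io(err) => Some(err),
            _ => None,
        }
    }

    // Consuming accessor: the wrappers are dropped, the io::Error is moved out.
    fn into_io_error(self) -> Option<io::Error> {
        match self {
            Error::WithPath { err, .. } => err.into_io_error(),
            Error::Io(err) => Some(err),
            _ => None,
        }
    }
}

fn main() {
    let err = Error::WithPath {
        path: PathBuf::from("missing/.gitignore"),
        err: Box::new(Error::Io(io::Error::new(
            io::ErrorKind::NotFound,
            "no such file",
        ))),
    };

    // Borrow: inspect the kind, then keep using `err`.
    let not_found = err
        .io_error()
        .map_or(false, |e| e.kind() == io::ErrorKind::NotFound);
    println!("not found: {}, still have: {:?}", not_found, err);

    // Consume: take the io::Error itself; the path wrapper is dropped.
    if let Some(io_err) = err.into_io_error() {
        println!("owned io error: {}", io_err);
    }

    // A non-I/O error yields None either way.
    println!("{}", Error::InvalidDefinition.io_error().is_none());
}
```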


Section 6: Error Context Builders

impl Error {
    /// Turn an error into a tagged error with the given file path.
    fn with_path<P: AsRef<Path>>(self, path: P) -> Error {
        Error::WithPath {
            path: path.as_ref().to_path_buf(),
            err: Box::new(self),  // Wrap self in Box
        }
    }

    /// Turn an error into a tagged error with the given depth.
    fn with_depth(self, depth: usize) -> Error {
        Error::WithDepth { depth, err: Box::new(self) }
    }

    /// Turn an error into a tagged error with path and line number.
    /// If path is empty, omit the path wrapper.
    fn tagged<P: AsRef<Path>>(self, path: P, lineno: u64) -> Error {
        let errline =
            Error::WithLineNumber { line: lineno, err: Box::new(self) };
        // Handle edge case: empty path shouldn't appear in error message
        if path.as_ref().as_os_str().is_empty() {
            return errline;
        }
        errline.with_path(path)  // Fluent chaining
    }

    /// Build an error from a walkdir error.
    fn from_walkdir(err: walkdir::Error) -> Error {
        let depth = err.depth();
        // Check for filesystem loop (special case)
        if let (Some(anc), Some(child)) = (err.loop_ancestor(), err.path()) {
            return Error::WithDepth {
                depth,
                err: Box::new(Error::Loop {
                    ancestor: anc.to_path_buf(),
                    child: child.to_path_buf(),
                }),
            };
        }
        // Convert to IO error with path context
        let path = err.path().map(|p| p.to_path_buf());
        let mut ig_err = Error::Io(std::io::Error::from(err));
        if let Some(path) = path {
            ig_err = Error::WithPath { path, err: Box::new(ig_err) };
        }
        ig_err
    }
}

These methods enable fluent error construction: Error::Glob{..}.tagged(path, line) wraps the glob error with location context in a single expression.
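
The empty-path edge case in tagged is easy to miss, so here is a minimal sketch of both outcomes. The enum is a reduced stand-in, with_path and tagged are copied from above, and check_line is a hypothetical validation step invented for the example (the crate's real glob parsing is far more thorough).

```rust
use std::path::{Path, PathBuf};

// Reduced stand-in; fields are only read through Debug and patterns here.
#[allow(dead_code)]
#[derive(Debug)]
enum Error {
    WithLineNumber { line: u64, err: Box<Error> },
    WithPath { path: PathBuf, err: Box<Error> },
    Glob { glob: Option<String>, err: String },
}

impl Error {
    fn with_path<P: AsRef<Path>>(self, path: P) -> Error {
        Error::WithPath { path: path.as_ref().to_path_buf(), err: Box::new(self) }
    }

    fn tagged<P: AsRef<Path>>(self, path: P, lineno: u64) -> Error {
        let errline = Error::WithLineNumber { line: lineno, err: Box::new(self) };
        if path.as_ref().as_os_str().is_empty() {
            return errline;
        }
        errline.with_path(path)
    }
}

// Hypothetical validation step: flags a `[` that has no closing `]`.
fn check_line(path: &Path, lineno: u64, pattern: &str) -> Result<(), Error> {
    if pattern.contains('[') && !pattern.contains(']') {
        let err = Error::Glob {
            glob: Some(pattern.to_string()),
            err: "unclosed character class".to_string(),
        };
        return Err(err.tagged(path, lineno));
    }
    Ok(())
}

fn main() {
    // Pattern read from a file: path wrapper outside, line wrapper inside.
    println!("{:?}", check_line(Path::new("src/.gitignore"), 3, "[abc"));
    // Pattern with no backing file: the path wrapper is omitted.
    println!("{:?}", check_line(Path::new(""), 3, "[abc"));
}
```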


Section 7: Standard Trait Implementations

impl std::error::Error for Error {
    #[allow(deprecated)]  // description() is deprecated and optional; the allow covers the nested calls below
    fn description(&self) -> &str {
        match *self {
            Error::Partial(_) => "partial error",
            // Delegate through wrappers to get the real description
            Error::WithLineNumber { ref err, .. } => err.description(),
            Error::WithPath { ref err, .. } => err.description(),
            Error::WithDepth { ref err, .. } => err.description(),
            Error::Loop { .. } => "file system loop found",
            Error::Io(ref err) => err.description(),
            Error::Glob { ref err, .. } => err,
            Error::UnrecognizedFileType(_) => "unrecognized file type",
            Error::InvalidDefinition => "invalid definition",
        }
    }
}

impl std::fmt::Display for Error {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match *self {
            // Join partial errors with newlines
            Error::Partial(ref errs) => {
                let msgs: Vec<String> =
                    errs.iter().map(|err| err.to_string()).collect();
                write!(f, "{}", msgs.join("\n"))
            }
            // Prepend line number
            Error::WithLineNumber { line, ref err } => {
                write!(f, "line {}: {}", line, err)
            }
            // Prepend path
            Error::WithPath { ref path, ref err } => {
                write!(f, "{}: {}", path.display(), err)
            }
            // Depth is metadata, not displayed
            Error::WithDepth { ref err, .. } => err.fmt(f),
            Error::Loop { ref ancestor, ref child } => write!(
                f,
                "File system loop found: {} points to an ancestor {}",
                child.display(),
                ancestor.display()
            ),
            Error::Io(ref err) => err.fmt(f),
            Error::Glob { glob: None, ref err } => write!(f, "{}", err),
            Error::Glob { glob: Some(ref glob), ref err } => {
                write!(f, "error parsing glob '{}': {}", glob, err)
            }
            // ...
        }
    }
}

// Lets `?` and `.into()` convert a std::io::Error into Error::Io
impl From<std::io::Error> for Error {
    fn from(err: std::io::Error) -> Error {
        Error::Io(err)
    }
}

The Display implementation recursively formats wrapped errors, producing messages like "path/.gitignore: line 42: invalid glob".
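
The message format can be checked with a minimal sketch. It assumes a reduced stand-in enum with four variants; the Display arms follow the ones above, except that delegation is spelled out as fmt::Display::fmt for clarity.

```rust
use std::fmt;
use std::path::PathBuf;

// Reduced stand-in; `depth` is carried as metadata and never displayed.
#[allow(dead_code)]
#[derive(Debug)]
enum Error {
    WithLineNumber { line: u64, err: Box<Error> },
    WithPath { path: PathBuf, err: Box<Error> },
    WithDepth { depth: usize, err: Box<Error> },
    Glob { glob: Option<String>, err: String },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Each wrapper prints its own context, then the inner error.
            Error::WithLineNumber { line, err } => write!(f, "line {}: {}", line, err),
            Error::WithPath { path, err } => write!(f, "{}: {}", path.display(), err),
            // Depth adds nothing to the message; delegate unchanged.
            Error::WithDepth { err, .. } => fmt::Display::fmt(err, f),
            Error::Glob { glob: None, err } => write!(f, "{}", err),
            Error::Glob { glob: Some(glob), err } => {
                write!(f, "error parsing glob '{}': {}", glob, err)
            }
        }
    }
}

// Hypothetical helper that builds depth -> path -> line -> glob nesting.
fn sample(glob: Option<&str>) -> Error {
    let leaf = Error::Glob {
        glob: glob.map(|g| g.to_string()),
        err: "invalid glob".to_string(),
    };
    let with_line = Error::WithLineNumber { line: 42, err: Box::new(leaf) };
    let with_path = Error::WithPath {
        path: PathBuf::from("path/.gitignore"),
        err: Box::new(with_line),
    };
    Error::WithDepth { depth: 2, err: Box::new(with_path) }
}

fn main() {
    println!("{}", sample(None));
    println!("{}", sample(Some("[abc")));
}
```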


Section 8: The Match Enum

/// The result of a glob match.
///
/// The type parameter `T` typically refers to a type that provides more
/// information about a particular match.
#[derive(Clone, Debug)]
pub enum Match<T> {
    /// The path didn't match any glob.
    None,
    /// The highest-precedence glob that matched indicates the path should be ignored.
    Ignore(T),
    /// The highest-precedence glob that matched indicates the path should be whitelisted.
    Whitelist(T),
}

impl<T> Match<T> {
    /// Returns true if the match result didn't match any globs.
    pub fn is_none(&self) -> bool {
        matches!(*self, Match::None)
    }

    /// Returns true if the match result implies the path should be ignored.
    pub fn is_ignore(&self) -> bool {
        matches!(*self, Match::Ignore(_))
    }

    /// Inverts the match so that Ignore becomes Whitelist and vice versa.
    pub fn invert(self) -> Match<T> {
        match self {
            Match::None => Match::None,
            Match::Ignore(t) => Match::Whitelist(t),
            Match::Whitelist(t) => Match::Ignore(t),
        }
    }

    /// Apply a function to the value inside this match.
    pub fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Match<U> {
        match self {
            Match::None => Match::None,
            Match::Ignore(t) => Match::Ignore(f(t)),
            Match::Whitelist(t) => Match::Whitelist(f(t)),
        }
    }

    /// Return self if not none, otherwise return other.
    pub fn or(self, other: Self) -> Self {
        if self.is_none() { other } else { self }
    }
}

The Match<T> enum is generic over the metadata type, allowing different matchers to attach different information (glob pattern, file source, etc.) to their match results.
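
The combinators compose naturally when several rule sources are consulted in order of precedence. Below is a minimal sketch with a local copy of the enum; the PartialEq derive is added here only so results can be compared, and the string payloads stand in for whatever metadata a real matcher would attach.

```rust
// Local copy of the enum; PartialEq is added for this sketch only.
#[derive(Clone, Debug, PartialEq)]
enum Match<T> {
    None,
    Ignore(T),
    Whitelist(T),
}

impl<T> Match<T> {
    fn is_none(&self) -> bool {
        matches!(*self, Match::None)
    }

    fn invert(self) -> Match<T> {
        match self {
            Match::None => Match::None,
            Match::Ignore(t) => Match::Whitelist(t),
            Match::Whitelist(t) => Match::Ignore(t),
        }
    }

    fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Match<U> {
        match self {
            Match::None => Match::None,
            Match::Ignore(t) => Match::Ignore(f(t)),
            Match::Whitelist(t) => Match::Whitelist(f(t)),
        }
    }

    fn or(self, other: Self) -> Self {
        if self.is_none() { other } else { self }
    }
}

fn main() {
    // A higher-precedence source matched, so the fallback is never used.
    let decided = Match::Whitelist("!keep.log").or(Match::Ignore("*.log"));
    println!("{:?}", decided);

    // The higher-precedence source had no opinion: fall through.
    let no_opinion: Match<&str> = Match::None;
    let fallback = no_opinion.or(Match::Ignore("*.log"));
    println!("{:?}", fallback);

    // map transforms the payload and keeps the verdict.
    println!("{:?}", fallback.clone().map(|glob| glob.len()));

    // invert flips the verdict and keeps the payload.
    println!("{:?}", fallback.invert());
}
```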


Quick Reference

Error Variant Summary

Variant                         Purpose
Partial(Vec<Error>)             Multiple errors; some operations may have succeeded
WithLineNumber { line, err }    Adds line number context
WithPath { path, err }          Adds file path context
WithDepth { depth, err }        Adds directory depth context
Loop { ancestor, child }        Filesystem loop via symlinks
Io(std::io::Error)              Underlying I/O error
Glob { glob, err }              Invalid glob pattern
UnrecognizedFileType(String)    Unknown file type requested
InvalidDefinition               Malformed type definition

Match API

Method            Returns       Purpose
is_none()         bool          Check if no match occurred
is_ignore()       bool          Check if path should be ignored
is_whitelist()    bool          Check if path should be included
invert()          Match<T>      Swap ignore/whitelist
map(f)            Match<U>      Transform contained value
or(other)         Match<T>      Use other if self is None
inner()           Option<&T>    Access contained value

Public Exports

// From crate root
pub use crate::walk::{
    DirEntry, ParallelVisitor, ParallelVisitorBuilder, Walk, WalkBuilder,
    WalkParallel, WalkState,
};

// Public modules for fine-grained control
pub mod gitignore;  // Gitignore parsing
pub mod overrides;  // Override patterns
pub mod types;      // File type matching