Ripgrep lowargs.rs: Code Companion

Reference code for the lowargs.rs lecture. Sections correspond to the lecture document.


Section 1: The Design Philosophy

/*!
Provides the definition of low level arguments from CLI flags.
*/

/// A collection of "low level" arguments.
///
/// The "low level" here is meant to constrain this type to be as close to the
/// actual CLI flags and arguments as possible. Namely, other than some
/// convenience types to help validate flag values and deal with overrides
/// between flags, these low level arguments do not contain any higher level
/// abstractions.
///
/// Another self-imposed constraint is that populating low level arguments
/// should not require anything other than validating what the user has
/// provided. For example, low level arguments should not contain a
/// `HyperlinkConfig`, since in order to get a full configuration, one needs to
/// discover the hostname of the current system (which might require running a
/// binary or a syscall).

LowArgs vs HiArgs:

Aspect            LowArgs              HiArgs
Contains          Raw flag values      Computed configuration
Validation        Syntax only          Semantic + environmental
Can fail due to   Invalid flag values  Missing files, platform issues
--help works      Always               Always (uses LowArgs only)
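
The split can be sketched in miniature. The `Low` and `Hi` types below are hypothetical stand-ins, not ripgrep's definitions, and the thread-count rule is an assumption chosen for illustration: filling in `Low` only validates what the user typed, while building `Hi` has to ask the operating system a question.

```rust
use std::thread;

/// Hypothetical stand-in for LowArgs: mirrors what the user typed.
#[derive(Debug, Default)]
struct Low {
    threads: Option<usize>, // None means "-j was not given"
}

/// Hypothetical stand-in for HiArgs: a fully resolved configuration.
#[derive(Debug)]
struct Hi {
    threads: usize,
}

impl Low {
    /// Syntax-only validation: no syscalls, no filesystem access.
    fn parse_threads(&mut self, value: &str) -> Result<(), String> {
        let n: usize = value
            .parse()
            .map_err(|e| format!("invalid value for -j: {e}"))?;
        self.threads = Some(n);
        Ok(())
    }
}

impl Hi {
    /// Environmental step: asks the OS how many cores are available
    /// when the user did not choose a positive thread count.
    fn from_low(low: &Low) -> Hi {
        let threads = match low.threads {
            Some(n) if n > 0 => n,
            _ => thread::available_parallelism().map_or(1, |n| n.get()),
        };
        Hi { threads }
    }
}
```

A bad `-j` value is rejected while filling in `Low`; anything that depends on the machine is deferred to `Hi::from_low`.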

Section 2: The LowArgs Struct

#[derive(Debug, Default)]
pub(crate) struct LowArgs {
    // Essential arguments.
    pub(crate) special: Option<SpecialMode>,
    pub(crate) mode: Mode,
    pub(crate) positional: Vec<OsString>,
    pub(crate) patterns: Vec<PatternSource>,

    // Everything else, sorted lexicographically.
    pub(crate) binary: BinaryMode,
    pub(crate) boundary: Option<BoundaryMode>,
    pub(crate) buffer: BufferMode,
    pub(crate) byte_offset: bool,
    pub(crate) case: CaseMode,
    pub(crate) color: ColorChoice,
    pub(crate) colors: Vec<UserColorSpec>,
    pub(crate) column: Option<bool>,
    pub(crate) context: ContextMode,
    // ... ~60 more fields
    pub(crate) threads: Option<usize>,
    pub(crate) trim: bool,
    pub(crate) type_changes: Vec<TypeChange>,
    pub(crate) unrestricted: usize,
    pub(crate) vimgrep: bool,
    pub(crate) with_filename: Option<bool>,
}

Field categories:

Category      Examples                        Count
Essential     mode, patterns, positional      4
Display       color, heading, line_number     ~15
Search        case, boundary, multiline       ~10
Filter        globs, type_changes, hidden     ~15
Performance   threads, mmap, max_filesize     ~10
Ignore rules  no_ignore_*, follow             ~10
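
Fields such as `column` and `with_filename` are `Option<bool>` rather than `bool`, so that "not given" stays distinguishable from an explicit on or off. The sketch below uses a hypothetical `Flags` struct, and the rule that an unset `column` defers to `vimgrep` is an assumption made for illustration, not a statement about ripgrep's actual resolution logic.

```rust
/// Hypothetical subset of LowArgs with one tri-state flag.
#[derive(Debug, Default)]
struct Flags {
    /// None = flag absent, Some(true) = --column, Some(false) = --no-column.
    column: Option<bool>,
    vimgrep: bool,
}

impl Flags {
    /// Assumed resolution rule: an explicit choice wins; otherwise
    /// fall back to a default implied by another flag.
    fn show_column(&self) -> bool {
        self.column.unwrap_or(self.vimgrep)
    }
}
```

With a plain `bool`, `--no-column` and "flag absent" would both be `false`, and the fallback could not tell them apart.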

Section 3: SpecialMode — The Short-Circuit Cases

/// A "special" mode that supersedes everything else.
///
/// When one of these modes is present, it overrides everything else and causes
/// ripgrep to short-circuit.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub(crate) enum SpecialMode {
    /// Show a condensed version of "help" output.
    /// Corresponds to the `-h` flag.
    HelpShort,
    /// Shows a very verbose version of the "help" output.
    /// Corresponds to the `--help` flag.
    HelpLong,
    /// Show condensed version information. e.g., `ripgrep x.y.z`.
    VersionShort,
    /// Show verbose version information including build features.
    VersionLong,
    /// Show PCRE2's version information.
    VersionPCRE2,
}

Usage in main.rs:

fn run(result: ParseResult<HiArgs>) -> anyhow::Result<ExitCode> {
    let args = match result {
        ParseResult::Err(err) => return Err(err),
        ParseResult::Special(mode) => return special(mode),  // Short-circuit!
        ParseResult::Ok(args) => args,
    };
    // ... normal processing
}
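
The three-way result used by `run` can be sketched on its own. This is a simplified model: `String` stands in for the error type, `parse_low` is a hypothetical parser, and the `and_then` combinator is an assumed shape for chaining the low-level and high-level stages.

```rust
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum SpecialMode {
    HelpShort,
    VersionShort,
}

/// Three-way parse result; `String` stands in for a real error type.
#[derive(Debug, PartialEq)]
enum ParseResult<T> {
    Special(SpecialMode),
    Ok(T),
    Err(String),
}

impl<T> ParseResult<T> {
    /// Run the next stage only on `Ok`; `Special` and `Err` pass through.
    fn and_then<U>(
        self,
        next: impl FnOnce(T) -> ParseResult<U>,
    ) -> ParseResult<U> {
        match self {
            ParseResult::Special(mode) => ParseResult::Special(mode),
            ParseResult::Ok(value) => next(value),
            ParseResult::Err(err) => ParseResult::Err(err),
        }
    }
}

/// Hypothetical low-level parse: special flags win before anything else.
fn parse_low(args: &[&str]) -> ParseResult<Vec<String>> {
    if args.contains(&"-h") {
        return ParseResult::Special(SpecialMode::HelpShort);
    }
    if args.contains(&"-V") {
        return ParseResult::Special(SpecialMode::VersionShort);
    }
    if args.is_empty() {
        return ParseResult::Err("no pattern given".to_string());
    }
    ParseResult::Ok(args.iter().map(|s| s.to_string()).collect())
}
```

Because `Special` skips the `next` closure entirely, the expensive second stage never runs for `-h` or `-V`.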

Section 4: Mode — What Ripgrep Should Do

/// The overall mode that ripgrep should operate in.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub(crate) enum Mode {
    /// ripgrep will execute a search of some kind.
    Search(SearchMode),
    /// Show the files that *would* be searched, but don't search them.
    Files,
    /// List all file type definitions configured.
    Types,
    /// Generate various things like the man page and completion files.
    Generate(GenerateMode),
}

impl Default for Mode {
    fn default() -> Mode {
        Mode::Search(SearchMode::Standard)
    }
}

impl Mode {
    /// Update this mode to the new mode while implementing override semantics.
    pub(crate) fn update(&mut self, new: Mode) {
        match *self {
            // If we're in a search mode, then anything can override it.
            Mode::Search(_) => *self = new,
            _ => {
                // Once in a non-search mode, only other non-search modes
                // can override. So `--files -l` stays Mode::Files.
                if !matches!(new, Mode::Search(_)) {
                    *self = new;
                }
            }
        }
    }
}

Override examples:

rg -l pattern          # Mode::Search(FilesWithMatches)
rg --files -l pattern  # Mode::Files (search mode can't override)
rg --files --type-list # Mode::Types (non-search can override)
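
The override rule can be checked by folding a sequence of mode-setting flags through `update`. The enums below are trimmed copies of the ones above, and `resolve` is a hypothetical helper added only for this demonstration.

```rust
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum SearchMode {
    Standard,
    FilesWithMatches,
}

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum Mode {
    Search(SearchMode),
    Files,
    Types,
}

impl Mode {
    /// Same override semantics as the excerpt above.
    fn update(&mut self, new: Mode) {
        match *self {
            Mode::Search(_) => *self = new,
            _ => {
                if !matches!(new, Mode::Search(_)) {
                    *self = new;
                }
            }
        }
    }
}

/// Hypothetical helper: apply flags left to right, starting from
/// the default search mode.
fn resolve(flags: &[Mode]) -> Mode {
    let mut mode = Mode::Search(SearchMode::Standard);
    for &flag in flags {
        mode.update(flag);
    }
    mode
}
```

Note that `-l --files` and `--files -l` both resolve to `Mode::Files`: a non-search mode wins regardless of where it appears.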

Section 5: SearchMode — Search Output Variations

/// The kind of search that ripgrep is going to perform.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub(crate) enum SearchMode {
    /// The default standard mode. Print matches when found.
    Standard,
    /// Show files containing at least one match. (-l)
    FilesWithMatches,
    /// Show files that don't contain any matches. (--files-without-match)
    FilesWithoutMatch,
    /// Show the number of matching lines per file. (-c)
    Count,
    /// Show the total number of individual matches per file. (--count-matches)
    CountMatches,
    /// Print matches in JSON lines format. (--json)
    JSON,
}

Count vs CountMatches:

File content: "foo foo bar foo"
Pattern: "foo"

-c (Count):        1  (one matching line)
--count-matches:   3  (three matches)
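
The difference is easy to reproduce with plain substring search. This sketch treats the pattern as a literal string rather than a regex, and both function names are hypothetical.

```rust
/// Lines containing at least one occurrence of `needle` (like -c).
fn count_lines(haystack: &str, needle: &str) -> usize {
    haystack.lines().filter(|line| line.contains(needle)).count()
}

/// Total non-overlapping occurrences across all lines
/// (like --count-matches).
fn count_matches(haystack: &str, needle: &str) -> usize {
    haystack
        .lines()
        .map(|line| line.matches(needle).count())
        .sum()
}
```

For the single line `foo foo bar foo`, `count_lines` gives 1 and `count_matches` gives 3, matching the table above.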

Section 6: BinaryMode — Handling Non-Text Files

/// Indicates how ripgrep should treat binary data.
#[derive(Debug, Default, Eq, PartialEq)]
pub(crate) enum BinaryMode {
    /// Automatically determine based on how file was specified.
    /// Explicit files: SearchAndSuppress
    /// Implicit files: skip entirely
    #[default]
    Auto,
    /// Search but suppress matches, showing only a warning.
    /// NUL bytes replaced with line terminators.
    SearchAndSuppress,
    /// Treat all files as plain text. No skipping, no NUL replacement.
    AsText,
}

Detection flow:

File specified explicitly (rg pattern file.bin):
  → SearchAndSuppress: search, but warn about binary

File discovered during traversal:
  → Quit on first NUL byte: skip silently

-a/--text flag:
  → AsText: search everything as-is
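
The detection flow amounts to a small decision table keyed on the mode and on whether the file was named explicitly. In the sketch below, `Detection`, `detection_for`, and `looks_binary` are hypothetical names invented for illustration; only `BinaryMode` comes from the excerpt above.

```rust
#[derive(Clone, Copy, Debug, Default, Eq, PartialEq)]
enum BinaryMode {
    #[default]
    Auto,
    SearchAndSuppress,
    AsText,
}

/// Hypothetical per-file strategy derived from the mode.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum Detection {
    /// Stop at the first NUL byte and report nothing.
    Quit,
    /// Keep searching, but print a warning instead of the matches.
    Suppress,
    /// No detection: treat every byte as text.
    Off,
}

/// `explicit` is true when the file was named on the command line.
fn detection_for(mode: BinaryMode, explicit: bool) -> Detection {
    match (mode, explicit) {
        (BinaryMode::AsText, _) => Detection::Off,
        (BinaryMode::SearchAndSuppress, _) => Detection::Suppress,
        (BinaryMode::Auto, true) => Detection::Suppress,
        (BinaryMode::Auto, false) => Detection::Quit,
    }
}

/// The signal described above: a NUL byte marks the data as binary.
fn looks_binary(bytes: &[u8]) -> bool {
    bytes.contains(&0)
}
```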

Section 7: CaseMode — Pattern Matching Sensitivity

/// Indicates the case mode for pattern interpretation.
#[derive(Debug, Default, Eq, PartialEq)]
pub(crate) enum CaseMode {
    /// 'a' matches only 'a'.
    #[default]
    Sensitive,
    /// 'a' matches both 'a' and 'A'. (-i)
    Insensitive,
    /// Case-insensitive only when pattern is all lowercase. (-S)
    Smart,
}

Smart case examples:

rg -S foo   # Matches: foo, Foo, FOO (pattern is all lowercase)
rg -S Foo   # Matches: Foo only (pattern has uppercase)
rg -S FOO   # Matches: FOO only
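
A minimal sketch of the smart-case decision, assuming the pattern is a plain literal. A real implementation would need to inspect the parsed regex so that escapes such as `\W` do not count as uppercase; this simplified version does not attempt that, and both function names are hypothetical.

```rust
/// Simplified rule: search case-insensitively unless the pattern
/// contains an uppercase character.
fn smart_case_insensitive(pattern: &str) -> bool {
    !pattern.chars().any(char::is_uppercase)
}

/// Literal substring match using the smart-case rule.
fn matches_smart(pattern: &str, haystack: &str) -> bool {
    if smart_case_insensitive(pattern) {
        haystack.to_lowercase().contains(&pattern.to_lowercase())
    } else {
        haystack.contains(pattern)
    }
}
```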

Section 8: ColorChoice — Output Coloring

/// Indicates whether ripgrep should include color in output.
#[derive(Debug, Default, Eq, PartialEq)]
pub(crate) enum ColorChoice {
    /// Color will never be used.
    Never,
    /// Color only when stdout is a tty.
    #[default]
    Auto,
    /// Color will always be used.
    Always,
    /// Always use ANSI escapes (Windows legacy console workaround).
    Ansi,
}

impl ColorChoice {
    /// Convert to the termcolor crate's equivalent type.
    pub(crate) fn to_termcolor(&self) -> termcolor::ColorChoice {
        match *self {
            ColorChoice::Never => termcolor::ColorChoice::Never,
            ColorChoice::Auto => termcolor::ColorChoice::Auto,
            ColorChoice::Always => termcolor::ColorChoice::Always,
            ColorChoice::Ansi => termcolor::ColorChoice::AlwaysAnsi,
        }
    }
}

Section 9: ContextMode — Lines Around Matches

/// Indicates the line context options ripgrep should use.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum ContextMode {
    /// All lines will be printed (--passthru).
    Passthru,
    /// Show specific number of lines before/after matches.
    Limited(ContextModeLimited),
}

/// Tracks before/after/both context separately for precedence.
#[derive(Debug, Default, Eq, PartialEq)]
pub(crate) struct ContextModeLimited {
    before: Option<usize>,  // -B
    after: Option<usize>,   // -A
    both: Option<usize>,    // -C
}

impl ContextModeLimited {
    /// Returns (before, after) with proper precedence.
    /// -B and -A always override -C regardless of order.
    pub(crate) fn get(&self) -> (usize, usize) {
        let (mut before, mut after) =
            self.both.map(|lines| (lines, lines)).unwrap_or((0, 0));
        if let Some(lines) = self.before {
            before = lines;
        }
        if let Some(lines) = self.after {
            after = lines;
        }
        (before, after)
    }
}

Precedence examples:

rg -C5 pattern           # (5, 5)
rg -C5 -B2 pattern       # (2, 5) — -B overrides -C's before
rg -B2 -C5 pattern       # (2, 5) — same! -B always wins
rg -C5 -A0 pattern       # (5, 0) — -A overrides -C's after
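
The precedence table can be reproduced by replaying flags in command-line order. The `get` method below is copied from the excerpt above; the `Flag` enum and `resolve` helper are hypothetical additions that stand in for the parser calling the setters.

```rust
#[derive(Debug, Default, Eq, PartialEq)]
struct ContextModeLimited {
    before: Option<usize>,
    after: Option<usize>,
    both: Option<usize>,
}

impl ContextModeLimited {
    /// -B and -A override -C regardless of order.
    fn get(&self) -> (usize, usize) {
        let (mut before, mut after) =
            self.both.map(|lines| (lines, lines)).unwrap_or((0, 0));
        if let Some(lines) = self.before {
            before = lines;
        }
        if let Some(lines) = self.after {
            after = lines;
        }
        (before, after)
    }
}

/// Hypothetical representation of one context flag on the command line.
#[derive(Clone, Copy, Debug)]
enum Flag {
    Before(usize), // -B
    After(usize),  // -A
    Both(usize),   // -C
}

/// Record each flag in its own slot, then resolve once at the end.
fn resolve(flags: &[Flag]) -> (usize, usize) {
    let mut ctx = ContextModeLimited::default();
    for &flag in flags {
        match flag {
            Flag::Before(n) => ctx.before = Some(n),
            Flag::After(n) => ctx.after = Some(n),
            Flag::Both(n) => ctx.both = Some(n),
        }
    }
    ctx.get()
}
```

Storing the three values separately is what makes the result independent of flag order; a single `(before, after)` pair overwritten in place would make `-B2 -C5` yield `(5, 5)`.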

Section 10: EngineChoice — Regex Implementation

/// The regex engine to use.
#[derive(Debug, Default, Eq, PartialEq)]
pub(crate) enum EngineChoice {
    /// Uses Rust's `regex` crate (default).
    #[default]
    Default,
    /// Try default, fall back to PCRE2 if pattern fails.
    Auto,
    /// Uses PCRE2 if available.
    PCRE2,
}

When to use each:

rg 'simple.*pattern'        # Default: fast, good errors
rg -P '(?<=foo)bar'         # PCRE2: lookbehind required
rg --auto-hybrid-regex '(?<=x)y'  # Auto: try default, fall back

Section 11: MmapMode — Memory Mapping Strategy

/// Indicates when to use memory maps.
#[derive(Debug, Default, Eq, PartialEq)]
pub(crate) enum MmapMode {
    /// Use heuristics to decide.
    #[default]
    Auto,
    /// Always try memory maps when possible.
    AlwaysTryMmap,
    /// Never use memory maps.
    Never,
}

Heuristic factors (Auto mode):

- File count: mmap overhead hurts with many files
- Input type: stdin/FIFOs can't be mmapped
- Platform: mmap performance varies


Section 12: PatternSource — Where Patterns Come From

/// Represents a source of patterns that ripgrep should search for.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum PatternSource {
    /// Comes from the `-e/--regexp` flag.
    Regexp(String),
    /// Comes from the `-f/--file` flag.
    File(PathBuf),
}

Usage examples:

rg foo                     # Positional: lands in `positional`, not `patterns`
rg -e foo -e bar           # Two Regexp sources
rg -f patterns.txt         # One File source
rg -e foo -f more.txt -e x # Mixed: [Regexp, File, Regexp]
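
Keeping the sources in one ordered `Vec` means a later stage can expand them without losing command-line order. The sketch below assumes a pattern file holds one pattern per line; `load_patterns` is a hypothetical helper, not ripgrep's function.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

#[derive(Debug, Eq, PartialEq)]
enum PatternSource {
    Regexp(String),
    File(PathBuf),
}

/// Expand sources into a flat list of patterns, preserving order.
/// Assumption: each line of a pattern file is one pattern.
fn load_patterns(sources: &[PatternSource]) -> io::Result<Vec<String>> {
    let mut patterns = Vec::new();
    for source in sources {
        match source {
            PatternSource::Regexp(pat) => patterns.push(pat.clone()),
            PatternSource::File(path) => {
                // File access happens here, not while parsing flags.
                let contents = fs::read_to_string(path)?;
                patterns.extend(contents.lines().map(str::to_owned));
            }
        }
    }
    Ok(patterns)
}
```

Reading the file is deferred to this step, which is consistent with the low-level type storing only a `PathBuf`.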

Section 13: SortMode — Result Ordering

/// The sort criteria, if present.
#[derive(Debug, Eq, PartialEq)]
pub(crate) struct SortMode {
    /// Whether to reverse (descending order).
    pub(crate) reverse: bool,
    /// The actual sorting criteria.
    pub(crate) kind: SortModeKind,
}

#[derive(Debug, Eq, PartialEq)]
pub(crate) enum SortModeKind {
    Path,
    LastModified,
    LastAccessed,
    Created,
}

impl SortMode {
    /// Check if sorting mode is supported on this platform.
    pub(crate) fn supported(&self) -> anyhow::Result<()> {
        match self.kind {
            SortModeKind::Path => Ok(()),
            SortModeKind::LastModified => {
                // Probe by checking current exe's metadata
                let md = std::env::current_exe()
                    .and_then(|p| p.metadata())
                    .and_then(|md| md.modified());
                let Err(err) = md else { return Ok(()) };
                anyhow::bail!("sorting by last modified isn't supported: {err}");
            }
            // Similar for LastAccessed, Created...
        }
    }
}
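
Before `supported` can probe the platform, the flag value has to be turned into a `SortMode`. A minimal sketch of that syntax-only step follows, assuming the accepted values are `none`, `path`, `modified`, `accessed`, and `created`; `parse_sort` and its error text are hypothetical.

```rust
#[derive(Debug, Eq, PartialEq)]
enum SortModeKind {
    Path,
    LastModified,
    LastAccessed,
    Created,
}

#[derive(Debug, Eq, PartialEq)]
struct SortMode {
    reverse: bool,
    kind: SortModeKind,
}

/// Parse the value of a sort flag. `reverse` is true for the
/// descending variant. "none" disables sorting.
fn parse_sort(value: &str, reverse: bool) -> Result<Option<SortMode>, String> {
    let kind = match value {
        "none" => return Ok(None),
        "path" => SortModeKind::Path,
        "modified" => SortModeKind::LastModified,
        "accessed" => SortModeKind::LastAccessed,
        "created" => SortModeKind::Created,
        other => return Err(format!("unrecognized sort criteria: {other}")),
    };
    Ok(Some(SortMode { reverse, kind }))
}
```

This step touches nothing outside the argument string; the platform probe remains a separate call that can be made later.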

Section 14: TypeChange — File Type Modifications

/// A single instance of a type change or selection.
#[derive(Debug, Eq, PartialEq)]
pub(crate) enum TypeChange {
    /// Clear the given type from ripgrep.
    Clear { name: String },
    /// Add a new type definition (name and glob).
    Add { def: String },
    /// Select the type for filtering (include).
    Select { name: String },
    /// Select the type for filtering but negate it (exclude).
    Negate { name: String },
}

Command line → TypeChange:

--type-clear=txt        # Clear { name: "txt" }
--type-add='foo:*.foo'  # Add { def: "foo:*.foo" }
-t rust                 # Select { name: "rust" }
-T python               # Negate { name: "python" }

Order matters:

# Clear a type's existing globs, redefine it, then select it
rg --type-clear=txt --type-add='txt:*.text' -t txt pattern
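
Why order matters becomes visible when the changes are replayed against a table of definitions. The sketch below is a simplified model: `apply` is a hypothetical helper, the table is a plain map from type name to globs, and only `Clear` and `Add` modify it.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Eq, PartialEq)]
enum TypeChange {
    Clear { name: String },
    Add { def: String },
    Select { name: String },
    Negate { name: String },
}

/// Replay changes in order against a table of type name -> globs.
/// Select and Negate filter files and leave definitions untouched.
fn apply(
    defs: &mut BTreeMap<String, Vec<String>>,
    changes: &[TypeChange],
) -> Result<(), String> {
    for change in changes {
        match change {
            TypeChange::Clear { name } => {
                defs.remove(name);
            }
            TypeChange::Add { def } => {
                let (name, glob) = def
                    .split_once(':')
                    .ok_or_else(|| format!("invalid type definition: {def}"))?;
                defs.entry(name.to_string())
                    .or_default()
                    .push(glob.to_string());
            }
            TypeChange::Select { .. } | TypeChange::Negate { .. } => {}
        }
    }
    Ok(())
}
```

Clearing and then adding leaves only the new glob; adding and then clearing leaves nothing, which is why the changes are kept in a single ordered `Vec`.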

Section 15: Separator Types

/// Context separator between non-contiguous blocks (default: "--").
#[derive(Clone, Debug, Eq, PartialEq)]
pub(crate) struct ContextSeparator(Option<BString>);

impl ContextSeparator {
    /// Create from user input with escape handling.
    pub(crate) fn new(os: &OsStr) -> anyhow::Result<ContextSeparator> {
        let Some(string) = os.to_str() else {
            anyhow::bail!("separator must be valid UTF-8 (use escape sequences)");
        };
        Ok(ContextSeparator(Some(Vec::unescape_bytes(string).into())))
    }

    /// Disable separators entirely.
    pub(crate) fn disabled() -> ContextSeparator {
        ContextSeparator(None)
    }
}

/// Field separator for context lines (default: "-").
pub(crate) struct FieldContextSeparator(BString);

/// Field separator for match lines (default: ":").
pub(crate) struct FieldMatchSeparator(BString);

Escape sequence examples:

--context-separator=$'\t'     # Tab character
--context-separator='\x00'    # NUL byte
--context-separator=''        # Empty string (blank line between blocks)
--no-context-separator        # Disabled entirely
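
The escape handling is what lets a UTF-8 command-line string describe arbitrary bytes. The function below is a simplified, hand-written stand-in for the `Vec::unescape_bytes` call in the excerpt above, not that library's implementation; it covers only `\n`, `\r`, `\t`, `\0`, `\\`, and `\xNN`.

```rust
/// Decode a small set of escapes into raw bytes.
/// Unrecognized or incomplete escapes are kept literally.
fn unescape(input: &str) -> Vec<u8> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // Ordinary bytes and a trailing lone backslash are copied as-is.
        if bytes[i] != b'\\' || i + 1 == bytes.len() {
            out.push(bytes[i]);
            i += 1;
            continue;
        }
        let simple = match bytes[i + 1] {
            b'n' => Some(b'\n'),
            b'r' => Some(b'\r'),
            b't' => Some(b'\t'),
            b'0' => Some(0),
            b'\\' => Some(b'\\'),
            _ => None,
        };
        if let Some(byte) = simple {
            out.push(byte);
            i += 2;
            continue;
        }
        // \xNN: exactly two hex digits must follow the 'x'.
        if bytes[i + 1] == b'x' && i + 3 < bytes.len() {
            let hi = (bytes[i + 2] as char).to_digit(16);
            let lo = (bytes[i + 3] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 4;
                continue;
            }
        }
        // Not a recognized escape: keep the backslash literally.
        out.push(b'\\');
        i += 1;
    }
    out
}
```

Because the result is a byte vector rather than a `String`, a separator such as `\xFF` can be expressed even though it is not valid UTF-8.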

Quick Reference: Flag → Field Mapping

// Selected examples showing flag → LowArgs field

-i, --ignore-case         case: CaseMode::Insensitive
-S, --smart-case          case: CaseMode::Smart
-l, --files-with-matches  mode: Mode::Search(SearchMode::FilesWithMatches)
-c, --count               mode: Mode::Search(SearchMode::Count)
--files                   mode: Mode::Files
-t, --type                type_changes: push TypeChange::Select
-T, --type-not            type_changes: push TypeChange::Negate
-g, --glob                globs: Vec<String>
--iglob                   iglobs: Vec<String>
-j, --threads             threads: Option<usize>
-A, --after-context       context: ContextMode (set_after)
-B, --before-context      context: ContextMode (set_before)
-C, --context             context: ContextMode (set_both)
-e, --regexp              patterns: push PatternSource::Regexp
-f, --file                patterns: push PatternSource::File
-h                        special: Some(SpecialMode::HelpShort)
--help                    special: Some(SpecialMode::HelpLong)

Data Flow: CLI to Execution

Command Line Arguments
         │
         ▼
┌─────────────────┐
│  flags/parse.rs │  Tokenize and validate
└─────────────────┘
         │
         ▼
┌─────────────────┐
│    LowArgs      │  Direct flag mirror (this file)
└─────────────────┘
         │
         ▼
┌─────────────────┐
│ flags/hiargs.rs │  Transform, compute, build objects
└─────────────────┘
         │
         ▼
┌─────────────────┐
│     HiArgs      │  Ready for execution
└─────────────────┘
         │
         ▼
┌─────────────────┐
│    main.rs      │  Dispatch to search/files/types
└─────────────────┘