ripgrep crates/core/haystack.rs: Code Companion
Reference code for the Haystack Discovery lecture. Sections correspond to the lecture document.
Section 1: The Builder Pattern at Its Simplest
/// A builder for constructing things to search over.
#[derive(Clone, Debug)]
pub(crate) struct HaystackBuilder {
    strip_dot_prefix: bool, // The single configuration option
}

impl HaystackBuilder {
    /// Return a new haystack builder with a default configuration.
    pub(crate) fn new() -> HaystackBuilder {
        HaystackBuilder { strip_dot_prefix: false }
    }

    /// When enabled, if the haystack's file path starts with `./` then it is
    /// stripped.
    ///
    /// This is useful when implicitly searching the current working directory.
    pub(crate) fn strip_dot_prefix(
        &mut self,
        yes: bool,
    ) -> &mut HaystackBuilder { // Returns &mut Self for method chaining
        self.strip_dot_prefix = yes;
        self
    }
}
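The pub(crate) visibility limits these items to the crate that defines them, here ripgrep's core crate. The strip_dot_prefix setter returns &mut HaystackBuilder (that is, &mut Self), so calls can be chained in the fluent builder style: builder.strip_dot_prefix(true).build(entry). Note that build itself is private, so that exact chain only compiles inside haystack.rs; other modules finish with build_from_result.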
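To see the &mut self builder style in isolation, here is a minimal, self-contained sketch. LabelBuilder, its uppercase option, and the string-based build are hypothetical stand-ins invented for illustration, not part of ripgrep; only the shape of the setters mirrors HaystackBuilder.

```rust
/// Hypothetical builder; only the setter shape mirrors HaystackBuilder.
#[derive(Clone, Debug)]
struct LabelBuilder {
    strip_dot_prefix: bool,
    uppercase: bool, // invented second option, to make chaining visible
}

impl LabelBuilder {
    fn new() -> LabelBuilder {
        LabelBuilder { strip_dot_prefix: false, uppercase: false }
    }

    /// Setters take and return `&mut` so that calls can be chained.
    fn strip_dot_prefix(&mut self, yes: bool) -> &mut LabelBuilder {
        self.strip_dot_prefix = yes;
        self
    }

    fn uppercase(&mut self, yes: bool) -> &mut LabelBuilder {
        self.uppercase = yes;
        self
    }

    /// `build` only borrows the builder, so one builder can be reused.
    fn build(&self, raw: &str) -> String {
        let label = if self.strip_dot_prefix {
            raw.strip_prefix("./").unwrap_or(raw)
        } else {
            raw
        };
        if self.uppercase { label.to_uppercase() } else { label.to_string() }
    }
}

/// Configure and build in a single chained expression.
fn one_shot() -> String {
    LabelBuilder::new().strip_dot_prefix(true).uppercase(true).build("./src/main.rs")
}

/// Configure once, then reuse the builder for several inputs.
fn reused() -> (String, String) {
    let mut builder = LabelBuilder::new();
    builder.strip_dot_prefix(true);
    (builder.build("./a.rs"), builder.build("b.rs"))
}
```

Here one_shot() yields SRC/MAIN.RS and reused() yields a.rs and b.rs. One trade-off of this style: binding a half-configured chain, as in let b = LabelBuilder::new().strip_dot_prefix(true);, leaves b borrowing a temporary that is dropped at the end of the statement, so the compiler rejects any later use of b. Binding the builder first, as reused does, avoids that.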
Section 2: Constructing from Fallible Results
/// Create a new haystack from a possibly missing directory entry.
///
/// If the directory entry isn't present, then the corresponding error is
/// logged if messages have been configured. Otherwise, if the directory
/// entry is deemed searchable, then it is returned as a haystack.
pub(crate) fn build_from_result(
    &self,
    result: Result<ignore::DirEntry, ignore::Error>, // From directory traversal
) -> Option<Haystack> {
    match result {
        Ok(dent) => self.build(dent), // Delegate to main build logic
        Err(err) => {
            err_message!("{err}"); // Log error, don't propagate
            None // Signal "nothing to search"
        }
    }
}
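The err_message! macro is defined elsewhere in ripgrep's core crate; per the doc comment above, the error is logged only if messages have been configured. Because the error is consumed here, the method returns Option<Haystack> rather than a Result, which lets callers skip failed entries with filter_map while iterating over the walker's output.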
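The filter_map idiom can be sketched without the ignore crate. Everything below is a hypothetical stand-in: Item replaces Haystack, plain strings replace DirEntry and ignore::Error, and errors are pushed into a Vec instead of being logged, purely so the behavior is observable.

```rust
/// Hypothetical stand-in for Haystack.
#[derive(Debug, PartialEq)]
struct Item(String);

/// Same shape as build_from_result: consume the error, return an Option.
/// Assumption: errors are recorded in `errors` rather than logged.
fn build_from_result(
    result: Result<String, String>,
    errors: &mut Vec<String>,
) -> Option<Item> {
    match result {
        Ok(name) => Some(Item(name)),
        Err(err) => {
            errors.push(err); // ripgrep would log here instead
            None // nothing to search for this entry
        }
    }
}

/// Turn a stream of fallible entries into the items worth searching.
fn collect_items(
    results: Vec<Result<String, String>>,
) -> (Vec<Item>, Vec<String>) {
    let mut errors = Vec::new();
    let items: Vec<Item> = results
        .into_iter()
        // filter_map drops every None, so failed entries vanish from the stream
        .filter_map(|result| build_from_result(result, &mut errors))
        .collect();
    (items, errors)
}
```

Feeding collect_items two Ok entries with one Err between them returns the two items in their original order, plus the single recorded error; the caller's loop never has to branch on failure.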
Section 3: The Filtering Decision Tree
/// Create a new haystack using this builder's configuration.
///
/// If a directory entry could not be created or should otherwise not be
/// searched, then this returns `None` after emitting any relevant log
/// messages.
fn build(&self, dent: ignore::DirEntry) -> Option<Haystack> {
    // Wrap immediately to use Haystack's helper methods
    let hay = Haystack { dent, strip_dot_prefix: self.strip_dot_prefix };
    // Log partial errors but continue processing
    if let Some(err) = hay.dent.error() {
        ignore_message!("{err}");
    }
    // Priority 1: Explicit entries always pass through
    if hay.is_explicit() {
        return Some(hay);
    }
    // Priority 2: Regular files pass through
    if hay.is_file() {
        return Some(hay);
    }
    // Priority 3: Everything else gets rejected. Directories are dropped
    // silently; anything else is debug-logged first.
    if !hay.is_dir() {
        log::debug!(
            "ignoring {}: failed to pass haystack filter: \
             file type: {:?}, metadata: {:?}",
            hay.dent.path().display(),
            hay.dent.file_type(),
            hay.dent.metadata()
        );
    }
    None
}
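The ignore_message! macro is distinct from err_message!: here it reports an error attached to an entry that is otherwise usable, so build logs it and keeps going instead of returning None. The order of the checks matters: is_explicit runs before is_file, so a path the user named directly is accepted even when its file type is not a regular file. Directories fall through both checks and are dropped without a debug message, since only non-directories reach the log::debug! call.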
Section 4: Explicit vs Discovered Paths
/// A haystack is a thing we want to search.
///
/// Generally, a haystack is either a file or stdin.
#[derive(Clone, Debug)]
pub(crate) struct Haystack {
    dent: ignore::DirEntry, // The underlying directory entry
    strip_dot_prefix: bool, // Configuration baked in at construction
}

impl Haystack {
    /// Returns true if and only if this entry corresponds to stdin.
    pub(crate) fn is_stdin(&self) -> bool {
        self.dent.is_stdin()
    }

    /// Returns true if and only if this entry corresponds to a haystack to
    /// search that was explicitly supplied by an end user.
    ///
    /// Generally, this corresponds to either stdin or an explicit file path
    /// argument. e.g., in `rg foo some-file ./some-dir/`, `some-file` is
    /// an explicit haystack, but, e.g., `./some-dir/some-other-file` is not.
    ///
    /// However, note that ripgrep does not see through shell globbing. e.g.,
    /// in `rg foo ./some-dir/*`, `./some-dir/some-other-file` will be treated
    /// as an explicit haystack.
    pub(crate) fn is_explicit(&self) -> bool {
        // stdin is obvious. When an entry has a depth of 0, that means it
        // was explicitly provided to our directory iterator, which means it
        // was in turn explicitly provided by the end user. The !is_dir check
        // means that we want to search files even if they're symlinks, again,
        // because they were explicitly provided. (And we never want to try
        // to search a directory.)
        self.is_stdin() || (self.dent.depth() == 0 && !self.is_dir())
    }
}
The depth() == 0 check is the key heuristic: entries at depth zero were passed directly to the walker, not discovered during traversal. The !is_dir() check excludes directories since we search their contents, not the directories themselves.
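The interaction between the depth-zero heuristic and the check order from Section 3 can be modeled with plain data. This is a sketch under stated assumptions: Kind, Entry, and should_search are hypothetical names, and the symlink target is encoded directly in the enum instead of being read from the filesystem.

```rust
/// Hypothetical model of what a directory entry can be.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Stdin,
    File,
    Dir,
    SymlinkToFile,
    SymlinkToDir,
}

/// Hypothetical stand-in for ignore::DirEntry: just a depth and a kind.
#[derive(Debug)]
struct Entry {
    depth: usize,
    kind: Kind,
}

impl Entry {
    /// Directory check that "follows" symlinks, as in Section 5.
    fn is_dir(&self) -> bool {
        matches!(self.kind, Kind::Dir | Kind::SymlinkToDir)
    }

    /// Regular files only; a symlink is not a file here.
    fn is_file(&self) -> bool {
        self.kind == Kind::File
    }

    /// stdin, or a non-directory handed directly to the walker (depth 0).
    fn is_explicit(&self) -> bool {
        self.kind == Kind::Stdin || (self.depth == 0 && !self.is_dir())
    }
}

/// Same order as build: explicit first, then regular file, else reject.
fn should_search(entry: &Entry) -> bool {
    if entry.is_explicit() {
        return true;
    }
    if entry.is_file() {
        return true;
    }
    false
}
```

In this model a SymlinkToFile at depth 0 is searched while the same kind at depth 2 is not, which is exactly the special treatment explicit paths receive. A Dir or SymlinkToDir is rejected at any depth, and a regular File passes regardless of depth.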
Section 5: File Type Detection and Symlink Handling
/// Returns true if and only if this haystack points to a directory after
/// following symbolic links.
fn is_dir(&self) -> bool {
    let ft = match self.dent.file_type() {
        None => return false, // No file type means can't be a directory
        Some(ft) => ft,
    };
    if ft.is_dir() {
        return true; // Direct directory
    }
    // If this is a symlink, then we want to follow it to determine
    // whether it's a directory or not.
    self.dent.path_is_symlink() && self.dent.path().is_dir()
}

/// Returns true if and only if this haystack points to a file.
fn is_file(&self) -> bool {
    self.dent.file_type().map_or(false, |ft| ft.is_file())
}
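Note the asymmetry: is_file relies only on file_type() and does not resolve symlinks itself, while is_dir falls back to path().is_dir(), which does follow them. This is intentional. A symlink discovered during traversal is not a regular file, so the main filter skips it (unless the walker has been configured to follow links, in which case file_type() already describes the target). An explicitly supplied symlink, by contrast, passes through is_explicit, which needs the symlink-aware is_dir to rule out links that point at directories.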
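The same asymmetry can be reproduced with the standard library alone. The sketch below is an approximation under stated assumptions: is_file_no_follow and is_dir_follow are hypothetical helper names, and std::fs::symlink_metadata stands in for the file type that ignore::DirEntry reports when links are not being followed.

```rust
use std::fs;
use std::path::Path;

/// File check that inspects the path itself. symlink_metadata does not
/// traverse a final symlink, so a link to a file is not reported as a file.
fn is_file_no_follow(path: &Path) -> bool {
    fs::symlink_metadata(path).map_or(false, |md| md.file_type().is_file())
}

/// Directory check that follows symlinks, mirroring the structure of is_dir.
fn is_dir_follow(path: &Path) -> bool {
    let ft = match fs::symlink_metadata(path) {
        Err(_) => return false, // no metadata means it can't be a directory
        Ok(md) => md.file_type(),
    };
    if ft.is_dir() {
        return true; // a real directory
    }
    // Path::is_dir traverses symlinks, so this asks about the link's target.
    ft.is_symlink() && path.is_dir()
}
```

For a symlink that points at a directory, is_dir_follow returns true even though the link's own file type is "symlink"; for a symlink that points at a file, is_file_no_follow returns false. Those are the two cases the explicit-path check and the main filter rely on.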
Section 6: Path Presentation and User Experience
use std::path::Path; // needed for the &Path return type below

impl Haystack {
    /// Return the file path corresponding to this haystack.
    ///
    /// If this haystack corresponds to stdin, then a special `<stdin>` path
    /// is returned instead.
    pub(crate) fn path(&self) -> &Path {
        if self.strip_dot_prefix && self.dent.path().starts_with("./") {
            // strip_prefix returns Result, but we know "./" is present
            self.dent.path().strip_prefix("./").unwrap()
        } else {
            self.dent.path()
        }
    }
}
The unwrap() here is safe because we've already verified the prefix exists with starts_with("./"). This transforms ./src/main.rs into src/main.rs for cleaner output when searching the current directory.
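One detail worth checking is that Path::starts_with and Path::strip_prefix compare whole path components, not raw bytes, so a name such as .hidden is never mistaken for a ./ prefix. The sketch below uses only std::path; display_path is a hypothetical free function that takes the flag as a parameter instead of reading it from a Haystack, and it uses unwrap_or in place of the starts_with guard.

```rust
use std::path::Path;

/// Hypothetical free-function version of Haystack::path.
/// strip_prefix fails when "./" is not a leading component, in which
/// case the original path is returned unchanged.
fn display_path(path: &Path, strip_dot_prefix: bool) -> &Path {
    if strip_dot_prefix {
        path.strip_prefix("./").unwrap_or(path)
    } else {
        path
    }
}

/// Component-wise check, same call as in Haystack::path.
fn has_dot_prefix(path: &Path) -> bool {
    path.starts_with("./")
}
```

With stripping enabled, ./src/main.rs becomes src/main.rs, while .hidden/file and src/main.rs come back unchanged; with stripping disabled, every path is returned as given. Folding the guard into unwrap_or is a stylistic alternative to the starts_with plus unwrap pair; both rely on the same component-wise matching.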
Quick Reference
Type Summary
| Type | Visibility | Purpose |
|---|---|---|
| HaystackBuilder | pub(crate) | Configures and creates haystacks |
| Haystack | pub(crate) | Wraps searchable directory entries |
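Haystack Decision Flow
Result<DirEntry, Error>
│
├─ Err? → Log, return None
│
├─ Ok, but the entry carries an error? → Log, continue below
│
├─ Explicit? (stdin, or depth = 0 and not a directory) → Return Some(haystack)
│
├─ Regular file? → Return Some(haystack)
│
└─ Otherwise → Return None (debug-logged unless it is a directory)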
Key Method Signatures
// Builder
fn new() -> HaystackBuilder
fn build_from_result(&self, Result<DirEntry, Error>) -> Option<Haystack>
fn build(&self, DirEntry) -> Option<Haystack>
fn strip_dot_prefix(&mut self, bool) -> &mut HaystackBuilder
// Haystack
fn path(&self) -> &Path
fn is_stdin(&self) -> bool
fn is_explicit(&self) -> bool
fn is_dir(&self) -> bool // follows symlinks
fn is_file(&self) -> bool // does NOT follow symlinks