Fast, gitignore-aware directory concatenation with Markdown output.
dircat-rust recursively walks through a directory, concatenates the content of discovered files (respecting .gitignore rules and offering extensive filtering), and outputs everything as a single, well-formatted Markdown file.
It's designed for speed, developer convenience, and seamless integration with tools that consume Markdown (like LLMs or documentation systems).
Table of Contents
- dircat-rust ⚡
  - Why Use `dircat-rust`?
  - Key Features
  - Installation
  - Quick Start
  - Usage
  - Examples / Use Cases
    - Goal: Concatenate all Rust files in `src` and `tests`
    - Goal: Create context for an LLM, excluding tests, logs, and comments
    - Goal: Concatenate all Rust code, excluding the `tests` directory
    - Goal: See which files would be included if max size is 50kB
    - Goal: Concatenate only files in the top-level directory
    - Goal: Concatenate all test files using a filename pattern
    - Goal: Create a complete snapshot, including ignored files
    - Goal: Create a cleaner output with filename-only headers
    - Goal: Add line numbers to the output for easy reference
    - Goal: Wrap filenames in backticks for better rendering
    - Goal: Get a detailed summary with file counts
    - Goal: Process `README.md` and `LICENSE` last
    - Goal: Concatenate specific config files only
    - Goal: Concatenate only all Rust files from `src` and all TOML files from `config`
    - Goal: Copy Python code (no comments/empty lines) to clipboard
    - Goal: Pipe output to `glow` for terminal rendering
    - Goal: Include binary files (e.g., images) in the output
    - Goal: Exclude lockfiles from the output
    - Goal: Concatenate a remote git repository
    - Goal: Concatenate only the `src/config` directory from a remote repository
  - Tips & Considerations
  - Comparison with Alternatives
  - Development Status & Standards
  - Contributing
  - License
Are you tired of:
- Manually `cat`-ing multiple files to create context for LLMs or documentation?
- Wrestling with complex `find ... -exec` commands just to view relevant code?
- Sharing code snippets that lack structure or ignore your project's `.gitignore` rules?
- Needing a quick, readable snapshot of a directory's textual content?
dircat-rust solves these problems by providing a fast, configurable, and developer-friendly way to concatenate directory contents into a clean Markdown format.
- Markdown First: Outputting Markdown provides a universally readable, portable, and easily parsable format suitable for humans, documentation systems, and AI tools.
- Developer Focus: Deep integration with `.gitignore` rules (via the excellent `ignore` crate) ensures the output accurately reflects the relevant parts of a typical software project. Sensible defaults like skipping binary files and an option to skip lockfiles enhance usability.
- Performance: Built in Rust with parallel processing (via `rayon`) to handle large directories efficiently without unnecessary overhead.
- Readability: Standardized, human-readable format with clear file separation and code block syntax highlighting (in compatible viewers).
- Portability: Easily shared and renders consistently across platforms and tools (GitHub, VS Code preview, Obsidian, etc.).
- LLM/AI Friendly: An excellent format for providing structured code context to Large Language Models.
- Integration: Can be easily included in other Markdown documents or processed by Markdown-aware tools (like static site generators or documentation tools).
- GitHub Folder/File API Integration (Fastest): For any `github.com` URL pointing to a specific folder or file (e.g., `.../tree/main/src` or `.../blob/main/src/main.rs`), `dircat` uses the GitHub API to download only that content on-the-fly. This is extremely fast and avoids cloning large repositories.
  - To access private repositories or avoid API rate limits, set a `GITHUB_TOKEN` environment variable with a Personal Access Token that has `repo` scope.
  - Automatic Fallback: If the GitHub API returns a rate limit error (HTTP 403), `dircat` will automatically fall back to performing a full `git clone` of the repository and then process the specified folder. A warning will be displayed when this occurs.
- Git Clone for Repositories: For any git URL pointing to a repository root (e.g., `github.com/user/repo.git`, `gitlab.com/...`), `dircat` clones the repository.
  - Persistent Caching: Cloned repositories are stored in a local cache (e.g., `~/.cache/dircat/repos` on Linux). Subsequent runs for the same URL are significantly faster as they only fetch the latest updates.
- Branch & Tag Selection: Process a specific branch, tag, or commit with `--git-branch` (or the alias `--git-ref`). This works for both GitHub API and git clone methods.
- Shallow Clone: For git clone operations, perform a shallow clone with `--git-depth` to save time and data.
- Private Repos: Automatically uses your SSH agent or default SSH keys for authentication when cloning non-GitHub URLs.
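The split between the GitHub API path and the clone path can be pictured with a small Python sketch. It is illustrative only and is not `dircat`'s implementation: `classify_github_url` is a hypothetical helper, and it assumes the branch or tag is a single path segment (a ref containing `/` would need extra handling).

```python
from urllib.parse import urlparse


def classify_github_url(url: str) -> dict:
    """Split a github.com URL into owner, repo, ref, and sub-path.

    Hypothetical helper: assumes the ref has no slashes in it.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {url}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    if len(parts) >= 4 and parts[2] in ("tree", "blob"):
        # .../tree/<ref>/<path> is a folder, .../blob/<ref>/<path> is a file.
        kind = "folder" if parts[2] == "tree" else "file"
        return {"kind": kind, "owner": owner, "repo": repo,
                "ref": parts[3], "path": "/".join(parts[4:])}
    # Anything else is treated as a repository root (the clone path).
    return {"kind": "repo", "owner": owner, "repo": repo, "ref": None, "path": ""}


print(classify_github_url("https://github.com/romelium/dircat-rust/tree/main/src"))
print(classify_github_url("https://github.com/romelium/dircat-rust.git"))
```

The first call is classified as a folder on ref `main` with path `src`; the second is classified as a repository root.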
- Recursive Traversal: Walks through local or remote directories recursively by default (`-n` to disable).
- Comprehensive `.gitignore` Support: Natively respects rules from `.gitignore`, `.ignore`, global git config files, and parent directories using the `ignore` crate (`-t` to disable).
- Custom Ignore Patterns: Specify additional glob patterns to ignore files or directories (`-i`).
- Binary File Skipping: Skips files detected as binary/non-text by default (`--include-binary` to override).
- Lockfile Skipping: Option to easily skip common lockfiles (`--no-lockfiles`).
- By Size: Limit processing to files below a maximum size (`-m`, e.g., `1M`, `512k`).
- By Extension: Include (`-e`) or exclude (`-x`) files based on their extensions (case-insensitive).
- By Path Regex: Include only files whose full path matches a regular expression (`-r`).
- By Filename Regex: Include only files whose filename (basename) matches a regular expression (`-d`).
- Comment Removal: Option to strip C/C++ style comments (`//`, `/* ... */`) while respecting strings (`-c`).
- Empty Line Removal: Option to remove lines containing only whitespace (`-l`).
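"Respecting strings" means a `//` or `/*` inside a string literal must not be treated as a comment. The Python sketch below illustrates the idea with a simple scanner. It is an assumption-laden illustration, not the tool's code: `strip_c_comments` is a hypothetical name, and only double-quoted strings with backslash escapes are recognized.

```python
def strip_c_comments(src: str) -> str:
    """Remove // and /* */ comments, leaving double-quoted strings untouched."""
    out = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        if ch == '"':
            # Copy the string literal verbatim, honoring backslash escapes.
            j = i + 1
            while j < n and src[j] != '"':
                j += 2 if src[j] == "\\" else 1
            out.append(src[i:j + 1])
            i = j + 1
        elif src.startswith("//", i):
            # Line comment: skip up to (not including) the newline.
            j = src.find("\n", i)
            i = n if j == -1 else j
        elif src.startswith("/*", i):
            # Block comment: skip past the closing */, or to the end if unterminated.
            j = src.find("*/", i + 2)
            i = n if j == -1 else j + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_c_comments('let url = "http://example.com"; // homepage\n'))
```

The `//` inside the URL string survives, while the trailing comment is dropped.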
- Markdown Format: Outputs content wrapped in Markdown code fences with language hints based on file extensions.
- File Headers: Clear `## File:` headers separate content from different files.
- Filename Only Header: Option to show only the filename in headers instead of the relative path (`-f`).
- Line Numbers: Prepend line numbers to each line of file content (`-L`).
- Backticks: Wrap filenames in headers and summaries with backticks (`-b`).
- Summary: Append a list of processed files, optionally with line, character, and word counts (`-s`, `-C`).
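As a rough picture of the output format, the following Python sketch renders one file as a `## File:` header followed by a fenced block whose language hint is the file extension. `render_file` and its parameters are hypothetical names chosen to mirror the `-L`, `-b`, and `-T` options. The `N | ` line-number layout follows the examples in this README, but the padding width is an assumption.

```python
from pathlib import PurePosixPath


def render_file(rel_path: str, content: str, line_numbers: bool = False,
                backticks: bool = False, ticks: int = 3) -> str:
    """Render one file as a '## File:' header plus a fenced code block."""
    lang = PurePosixPath(rel_path).suffix.lstrip(".")  # "src/main.rs" -> "rs"
    name = f"`{rel_path}`" if backticks else rel_path
    lines = content.splitlines()
    if line_numbers:
        # Right-align numbers so the separators line up (assumed layout).
        width = len(str(len(lines)))
        lines = [f"{i:>{width}} | {line}" for i, line in enumerate(lines, 1)]
    fence = "`" * max(ticks, 3)  # never fewer than three backticks
    return "\n".join([f"## File: {name}", f"{fence}{lang}", *lines, fence]) + "\n"


print(render_file("src/main.rs", "fn main() {\n    println!(\"hi\");\n}\n", line_numbers=True))
```

A file without an extension, such as `LICENSE`, gets a fence with no language hint in this sketch.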
- Rust Speed: Built in Rust for high performance and memory safety.
- Parallel Processing: Leverages Rayon for parallel file discovery and processing, speeding up operations on multi-core systems.
- Efficient Libraries: Uses optimized libraries like `walkdir` and `ignore` for file system operations.
- Cross-Platform: Provides pre-compiled binaries for Linux, macOS, and Windows.
- Clone Progress: Displays a progress bar when cloning git repositories.
- Multiple Output Options: Write to stdout (default), a file (`-o`), or the system clipboard (`-p`).
- Dry Run: Preview which files would be processed without reading or concatenating content (`-D`).
- User-Friendly Errors: Clear error messages for issues like invalid paths, incorrect arguments, or file access problems.
For users of the Rust toolchain, the fastest way to install is with cargo-binstall, which installs pre-compiled binaries directly.
cargo binstall dircat
If you don't have cargo-binstall, you can install it by following the quick installation instructions on its page.
You can download a pre-compiled binary for your system directly from the Latest Release page. This method does not require the Rust toolchain.
(Note: Binaries are self-contained and do not require installing Rust or any other language runtime.)
Linux (x86_64 / aarch64) / macOS (Intel x86_64 / Apple Silicon arm64):
# 1. Get the latest version tag from the releases page:
# https://github.com/romelium/dircat-rust/releases/latest
#
# 2. Set the VERSION and TARGET variables.
# - VERSION should be the tag name.
# - TARGET options: x86_64-unknown-linux-gnu, aarch64-unknown-linux-gnu, x86_64-apple-darwin, aarch64-apple-darwin
VERSION="<latest_version>"
TARGET="x86_64-unknown-linux-gnu"
# 3. Download the binary.
curl -L "https://github.com/romelium/dircat-rust/releases/download/${VERSION}/dircat-${VERSION}-${TARGET}.tar.gz" -o dircat.tar.gz
# 4. Extract the archive.
tar xzf dircat.tar.gz
# 5. Make the binary executable.
chmod +x dircat
# 6. Optional: Move the binary to a directory in your system's PATH.
# sudo mv dircat /usr/local/bin/
# OR (if you have a ~/bin directory in your PATH)
# mkdir -p ~/bin && mv dircat ~/bin/
Windows (x86_64):
# 1. Get the latest version tag from the releases page:
# https://github.com/romelium/dircat-rust/releases/latest
#
# 2. Set the VERSION variable.
# - VERSION should be the tag name.
$VERSION = "<latest_version>"
$TARGET = "x86_64-pc-windows-msvc"
# 3. Download the binary.
$URL = "https://github.com/romelium/dircat-rust/releases/download/${VERSION}/dircat-${VERSION}-${TARGET}.zip"
$OUTPUT = "dircat.zip"
Invoke-WebRequest -Uri $URL -OutFile $OUTPUT
# 4. Extract the archive.
Expand-Archive -Path $OUTPUT -DestinationPath .
# 5. Optional: Add the directory containing dircat.exe to your system's PATH environment variable
# or move dircat.exe to a directory already in your PATH.
If you have the Rust toolchain installed (rustup), you can install dircat-rust using cargo:
cargo install dircat
(Requires Rust 1.70 or later - check project's `Cargo.toml` for exact MSRV if specified).
# Clone the repository
git clone https://github.com/romelium/dircat-rust.git
cd dircat-rust
# Build the release binary
cargo build --release
# The executable will be in ./target/release/dircat
./target/release/dircat --version
- Install `dircat` using one of the methods above (pre-compiled binary recommended).
- Run it in your project directory:
# Concatenate all relevant text files in the current directory into output.md
# (skips binaries, respects .gitignore by default)
dircat . > output.md
- Check `output.md`! You should see something like:
## File: src/main.rs
```rs
fn main() { /* ... */ }
```
## File: README.md
```md
# My Project
...
```
🚀 Start using `dircat` now! Try `dircat .` in your project.
dircat [OPTIONS] [INPUT]
INPUT: The directory, specific file, or git repository URL to process. Defaults to the current directory (.).
Basic Examples:
# Process the current directory (text files, respecting .gitignore), print to stdout
dircat
# Process the 'src' subdirectory
dircat src
# Process only a single file (binary check still applies unless --include-binary)
dircat src/main.rs
# Process a remote git repository (clones to a persistent cache for speed on subsequent runs)
dircat https://github.com/romelium/dircat-rust.git
# Process only the 'src' directory from a remote git repository
dircat https://github.com/romelium/dircat-rust/tree/main/src
# Process only a single file from a remote git repository
dircat https://github.com/romelium/dircat-rust/blob/main/src/main.rs
# Process a specific branch of a remote repository
dircat https://github.com/some/repo.git --git-branch develop
# Process the current directory and save to a file
dircat . -o project_snapshot.md
# Process the current directory, including binary files
dircat . --include-binary > output_with_binaries.md
# Process the current directory, excluding common lockfiles
dircat . --no-lockfiles > output_without_locks.md
Below are the most common options. For a full, definitive list, run `dircat --help`.
| Option | Description |
|---|---|
| `[INPUT]` | Path to a directory/file, or a git URL. Defaults to `.`. |
| `--git-branch BRANCH`, `--git-ref REF` | For git URL inputs, check out a specific branch or tag instead of the default. |
| `--git-depth DEPTH` | For git URL inputs, perform a shallow clone with a limited history depth. |
| `--git-cache-path PATH` | Path to the directory for caching cloned git repositories. |
| Option | Alias | Description | Example |
|---|---|---|---|
| `--max-size BYTES` | `-m` | Skip files larger than this size (e.g., "1M", "512k", "1024"). | `-m 1M` |
| `--no-recursive` | `-n` | Process only the top-level directory or specified file (disable recursion). | `-n` |
| `--ext EXT` | `-e` | Include only files with these extensions (case-insensitive, repeatable). | `-e rs toml` |
| `--exclude-ext EXT` | `-x` | Exclude files with these extensions (case-insensitive, repeatable, overrides `-e`). | `-x log tmp` |
| `--ignore GLOB` | `-i` | Ignore files/directories matching these custom glob patterns (relative to input path, repeatable). | `-i target/* *.lock` |
| `--exclude-regex REGEX` | `-X` | Exclude files whose full path matches any of these regexes (case-insensitive, repeatable). | `-X "tests/.*`… |
| `--regex REGEX` | `-r` | Include only files whose full path matches any of these regexes (case-insensitive, repeatable). | `-r "src/.*\.rs$"` |
| `--filename-regex REGEX` | `-d` | Include only files whose filename matches any of these regexes (case-insensitive, repeatable). | `-d "^test_.*"` |
| `--no-gitignore` | `-t` | Process all files, ignoring `.gitignore`, `.ignore`, hidden files, etc. | `-t` |
| `--include-binary` | `-B` | Include files detected as binary/non-text (default is to skip them). | `--include-binary` |
| `--no-lockfiles` | `-K` | Skip common lockfiles (e.g., `Cargo.lock`, `package-lock.json`). | `--no-lockfiles` |
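The interaction between `-e` and `-x` (case-insensitive matching, with exclusion taking precedence) can be sketched in a few lines of Python. `passes_extension_filter` is a hypothetical helper written for illustration and only models the behavior described in the table above.

```python
from pathlib import PurePosixPath


def passes_extension_filter(path, include=None, exclude=None):
    """Return True if the file's extension survives the -e / -x style filters."""
    ext = PurePosixPath(path).suffix.lstrip(".").lower()
    if exclude and ext in {e.lower() for e in exclude}:
        return False  # exclusion wins even if the extension is also included
    if include:
        return ext in {e.lower() for e in include}
    return True  # no include list means every extension is accepted


print(passes_extension_filter("src/Main.RS", include=["rs"]))                     # True
print(passes_extension_filter("debug.log", include=["rs", "log"], exclude=["log"]))  # False
```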
| Option | Alias | Description |
|---|---|---|
| `--remove-comments` | `-c` | Remove C/C++ style comments (`//`, `/* ... */`) from content. |
| `--remove-empty-lines` | `-l` | Remove lines containing only whitespace from content. |
| Option | Alias | Description |
|---|---|---|
| `--ticks COUNT` | `-T` | Set the number of backticks for code fences (default: 3, min: 3). |
| `--filename-only` | `-f` | Show only the filename (basename) in `## File:` headers, not the relative path. |
| `--line-numbers` | `-L` | Add line numbers to the beginning of each content line. |
| `--backticks` | `-b` | Wrap filenames in headers and summary list with backticks (`` ` ``). |
| Option | Alias | Description |
|---|---|---|
| `--output FILE` | `-o` | Write output to the specified file instead of stdout. |
| `--paste` | `-p` | Copy output to the system clipboard. |
| `--summary` | `-s` | Print a summary list of processed files at the end. |
| `--counts` | `-C` | Include line, character (byte), and word counts in the summary (implies `-s`). |
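To make the `-C` counts concrete, here is a minimal Python sketch that builds one summary entry in the `(L:… C:… W:…)` shape used in this README's examples. `summary_entry` is a hypothetical helper. How `dircat` itself counts lines and words is not specified here, so the rules used (newline-separated lines, whitespace-separated words, byte length for characters) are assumptions.

```python
def summary_entry(rel_path: str, content: bytes) -> str:
    """Format one summary line with line, byte, and word counts."""
    text = content.decode("utf-8", errors="replace")
    lines = len(text.splitlines())   # assumed: lines split on newlines
    words = len(text.split())        # assumed: words split on whitespace
    return f"- {rel_path} (L:{lines} C:{len(content)} W:{words})"


print(summary_entry("src/main.rs", b"fn main() {}\n"))
# prints "- src/main.rs (L:1 C:13 W:3)"
```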
| Option | Alias | Description | Example |
|---|---|---|---|
| `--last GLOB` | `-z` | Process files matching these glob patterns last, in the order specified. This can override `.gitignore` rules for the matched files. Repeatable. | `-z README.md` |
| `--only GLOB` | `-O` | A shorthand for `--last <GLOB>... --only-last`. Process only files matching these glob patterns. Conflicts with `-z` and `-Z`. | `-O *.rs` |
| `--only-last` | `-Z` | Only process files specified with `-z`/`--last`. Skip all others (requires `-z`). | `-Z` |
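The ordering rule for `-z` (matched files go last, in the order the patterns were given) and the effect of `-Z` can be illustrated with the Python sketch below. It is not the tool's implementation: `order_with_last` is a hypothetical helper, and it uses `fnmatch`, whose `*` also matches `/`, so it only approximates real glob semantics.

```python
from fnmatch import fnmatch


def order_with_last(paths, last_globs, only_last=False):
    """Move files matching last_globs to the end, grouped in pattern order."""
    def rank(path):
        # Index of the first pattern that matches, or None if none match.
        for index, pattern in enumerate(last_globs):
            if fnmatch(path, pattern):
                return index
        return None

    normal = [p for p in paths if rank(p) is None]
    last = sorted((p for p in paths if rank(p) is not None), key=rank)
    return last if only_last else normal + last


files = ["LICENSE", "README.md", "src/main.rs"]
print(order_with_last(files, ["README.md", "LICENSE"]))
# prints ['src/main.rs', 'README.md', 'LICENSE']
```

Passing `only_last=True` keeps only the matched files, which mirrors what `-O` is described as doing.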
| Option | Alias | Description |
|---|---|---|
| `--dry-run` | `-D` | Print files that would be processed (respecting filters/order), but not content. |
💡 Explore further! Experiment with different filters or check dircat --help for all options.
Goal: Concatenate all Rust files in `src` and `tests`
dircat . -e rs -r "^(src|tests)/" > rust_code.md
Output Snippet:
## File: src/lib.rs
```rs
// Library code...
```
## File: tests/integration.rs
```rs
// Test code...
```
Goal: Create context for an LLM, excluding tests, logs, and comments
dircat . -e rs py toml -x log -i tests/* -c --no-lockfiles -o llm_context.md
Output Snippet:
## File: src/config.py
```py
# Config loading logic (comments removed)
```
## File: Cargo.toml
```toml
# Dependencies (comments removed)
```
Goal: Concatenate all Rust code, excluding the `tests` directory
This is useful for creating a context of only the application source code, ignoring test files. The `--exclude-regex` (`-X`) option is perfect for this, as it filters by path.
dircat . -e rs -X ^tests/ > app_code_only.md
Output Snippet: (Files from `src/` are included, but `tests/` are skipped)
## File: src/main.rs
```rs
// main function...
```
## File: src/lib.rs
```rs
// library code...
```
Goal: See which files would be included if max size is 50kB
dircat . -m 50k -D
Output Snippet:
--- Dry Run: Files that would be processed ---
- src/small_module.rs
- config/settings.toml
--- End Dry Run ---
Goal: Concatenate only files in the top-level directory
dircat . -n > root_files.md
Output Snippet: (Includes `Cargo.toml` but skips `src/main.rs`)
## File: Cargo.toml
```toml
[package]
name = "dircat"
...
```
## File: README.md
```md
# dircat-rust ⚡
...
```
Goal: Concatenate all test files using a filename pattern
dircat . -d "^test_.*" ".*_test\.rs$" > all_tests.md
Output Snippet:
## File: tests/test_api.rs
```rs
// API tests...
```
## File: src/auth_test.rs
```rs
// Auth unit tests...
```
Goal: Create a complete snapshot, including ignored files
dircat . -t > full_project_snapshot.md
Output Snippet: (Includes files typically ignored by `.gitignore`)
## File: .gitignore
```
target/
*.log
```
## File: target/debug/dircat
```
...binary content...
```
Goal: Create a cleaner output with filename-only headers
dircat src -f > clean_headers.md
Output Snippet: (Note `main.rs` instead of `src/main.rs` in the header)
## File: main.rs
```rs
fn main() { /* ... */ }
```
## File: lib.rs
```rs
// ...
```
Goal: Add line numbers to the output for easy reference
dircat src/main.rs -L > main_with_lines.md
Output Snippet:
## File: src/main.rs
```rs
1 | fn main() {
2 | println!("Hello, world!");
3 | }
```
Goal: Wrap filenames in backticks for better rendering
dircat . -b -s > pretty_output.md
Output Snippet:
## File: `src/main.rs`
```rs
fn main() {}
```
---
Processed Files: (15)
- `Cargo.toml`
- `README.md`
- `src/main.rs`
...
Goal: Get a detailed summary with file counts
dircat . -C > project_with_counts.md
Output Snippet: (The `-C` flag implies `-s` and adds counts to the summary)
...
## File: src/main.rs
```rs
fn main() {}
```
...
---
Processed Files: (15)
- Cargo.toml (L:50 C:1500 W:120)
- README.md (L:300 C:18000 W:2500)
- src/main.rs (L:5 C:80 W:10)
...
Goal: Process `README.md` and `LICENSE` last
dircat . -z README.md LICENSE > project_with_readme_last.md
Output Snippet: (Other files appear first, then README, then LICENSE)
...
## File: src/main.rs
```rs
...
```
...
## File: README.md
```md
...
```
## File: LICENSE
```
...
```
Goal: Concatenate specific config files only
dircat . -O config/*.toml .env.example > config_files.md
Output Snippet: (Only files matching the `-O` patterns are included)
## File: config/database.toml
```toml
...
```
## File: .env.example
```
VAR=value
```
Goal: Concatenate only all Rust files from `src` and all TOML files from `config`
By giving the `--only` (`-O`) flag several glob patterns, you can create a precise selection of files from different parts of your project.
dircat . -O src/**/*.rs config/*.toml > custom_selection.md
Output Snippet: (Only files matching either glob pattern are included)
## File: src/main.rs
```rs
// main function...
```
## File: config/settings.toml
```toml
# settings...
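```
Goal: Copy Python code (no comments/empty lines) to clipboard
dircat src -e py -c -l -p
Goal: Pipe output to `glow` for terminal rendering
dircat src -e rs | glow -
Goal: Include binary files (e.g., images) in the output
dircat assets --include-binary > assets_output.md
Goal: Exclude lockfiles from the output
dircat . --no-lockfiles > project_without_locks.md
Goal: Concatenate a remote git repository
# Clones the repo to a persistent cache directory and processes it.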
# Subsequent runs for the same URL will be much faster.
# Automatically uses SSH keys for private repos.
# Displays a progress bar for long clones.
dircat git@github.com:romelium/dircat-rust.git > repo_content.md
# Clone a specific branch
dircat https://github.com/some/repo.git --git-branch develop
# Perform a shallow clone of depth 1
dircat https://github.com/some/repo.git --git-depth 1
Goal: Concatenate only the `src/config` directory from a remote repository
You can provide a URL that points directly to a folder on GitHub. `dircat` will parse the URL and use the GitHub API to download only the contents of that specific folder, which is much faster than cloning the entire repository.
# This will process only the contents of the 'src/config' directory
# from the 'main' branch of the dircat-rust repository.
dircat https://github.com/romelium/dircat-rust/tree/main/src/config
# This will download and process only the 'src/main.rs' file.
dircat https://github.com/romelium/dircat-rust/blob/main/src/main.rs
- Large Output: Running `dircat` on large directories can produce significant output. Use filters (`-m`, `-e`, `-r`, etc.) or the dry-run (`-D`) option first. Redirect large outputs to a file (`-o FILE`) instead of overwhelming your terminal.
- Binary Files: By default, `dircat` skips binary files. Use `-B` to include them. The detection is heuristic and might not be perfect.
- Lockfiles: Use `-K` to easily exclude common dependency lockfiles, which is useful when generating context for LLMs.
- Git Cache: When cloning repositories (from any host, including GitHub root URLs), `dircat` stores them in a cache directory (e.g., `~/.cache/dircat/repos` on Linux, platform-specific otherwise) to speed up future runs. You can specify a custom cache location with `--git-cache-path`. To force a fresh clone, you can manually delete the corresponding hashed directory from this cache. GitHub folder URLs (e.g., `.../tree/main/src`) are not cached this way; they are downloaded fresh via the API on each run.
- Path Handling:
  - Display: File paths shown in `## File:` headers and the summary (`-s`) are relative to the input path you provided (or the current directory if none was given).
  - Filtering:
    - Path Regex (`-r`): Matches against the relative path (from the input directory), normalized to use `/` separators.
    - Filename Regex (`-d`): Matches against the filename (basename) only.
    - Ignore/Last Globs (`-i`, `-z`): Match against the path relative to the input path.
- Performance: While `dircat-rust` is fast, processing extremely large files or a vast number of files will still take time. Use filters to narrow down the scope when possible.
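The difference between path-regex and filename-regex matching described under Path Handling can be sketched in Python. `matches_filters` is a hypothetical helper. It assumes that the two filter kinds must both pass when both are given, which this README does not state explicitly.

```python
import re


def matches_filters(rel_path, path_regexes=(), filename_regexes=()):
    """Apply -r style regexes to the whole path and -d style ones to the basename."""
    normalized = rel_path.replace("\\", "/")   # always match against / separators
    basename = normalized.rsplit("/", 1)[-1]
    path_ok = not path_regexes or any(
        re.search(rx, normalized, re.IGNORECASE) for rx in path_regexes)
    name_ok = not filename_regexes or any(
        re.search(rx, basename, re.IGNORECASE) for rx in filename_regexes)
    # Assumption: when both kinds of filter are present, both must pass.
    return path_ok and name_ok


print(matches_filters("src\\main.rs", path_regexes=[r"src/.*\.rs$"]))              # True
print(matches_filters("src/test_dir/lib.rs", filename_regexes=[r"^test_.*"]))      # False
```

The second call is rejected because only the basename `lib.rs` is tested, not the `test_dir` directory component.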
| Feature | `dircat-rust` | `cat` | `find ... -exec cat {} +` | `tree` |
|---|---|---|---|---|
| Directory Input | ✅ Yes | ❌ No | ✅ Yes (via `find`) | ✅ Yes |
| Concatenate Content | ✅ Yes | ✅ Yes (files) | ✅ Yes | ❌ No |
| Gitignore Aware | ✅ Yes (Built-in) | ❌ No | Manual (complex) | Manual |
| Markdown Output | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Skip Binaries | ✅ Yes (Default) | Reads all | Manual (e.g., `file`) | N/A |
| Skip Lockfiles | ✅ Yes (`-K`) | ❌ No | Manual (`-name`) | Manual |
| Filtering (Size) | ✅ Yes (`-m`) | ❌ No | Manual (`-size`) | Manual |
| Filtering (Ext/Regex) | ✅ Yes (`-e`/`-x`/`-r`/`-d`) | ❌ No | Manual (`-name`/`-regex`) | Manual |
| Content Processing | ✅ Yes (`-c`/`-l`) | ❌ No | Manual (e.g., `sed`) | ❌ No |
| Speed Focus | ✅ Yes (Rust/Parallel) | Fast (single) | Variable | Fast (metadata) |
| Cross-Platform Binaries | ✅ Yes | OS specific | OS specific | OS specific |
dircat-rust is under active development. It utilizes modern Rust practices, including:
- Continuous Integration (CI) via GitHub Actions.
- Code formatting (`cargo fmt`) and linting (`cargo clippy`).
- Conventional Commits for clear commit history.
- Pre-commit hooks to enforce standards before committing.
Contributions are welcome! Whether it's bug reports, feature suggestions, or code improvements, please feel free to:
- Check the Issue Tracker for existing bugs or ideas.
- Open a new issue to discuss your suggestion or report a bug.
- Review the Commit Message Guidelines before submitting pull requests.
- Set up pre-commit hooks locally (`pre-commit install`) to ensure your contributions meet project standards.
🤝 We welcome contributions! Please see our COMMIT.md guidelines and check the issue tracker for ways to help.
This project is licensed under the MIT License. See the LICENSE file for details.