
42 Paris project - group project with @d2codex - implementing a bash-like shell while discovering professional Git workflows, PR reviews, and test-driven development.


kidp8479/42_minishell

 
 


This project has been created as part of the 42 curriculum by pafroidu and diade-so.

Minishell Cover

Minishell Home

Description

Minishell is one of the most challenging and rewarding projects in the 42 curriculum. The goal is to create a functional shell, a command-line interpreter similar to bash. This project provides deep insight into process management, file descriptors, parsing, and how operating systems execute programs.

This was our first group project at 42, and it profoundly shaped how we approach software development. It taught us as much about teamwork, communication, and collaborative development as it did about system programming. Building a shell requires coordinating multiple complex subsystems (lexical analysis, parsing, execution, redirection, pipes, and built-in commands), making it an ideal project for discovering professional development workflows.

This project marked our introduction to tools and practices that would become essential: GitHub Flow, pull request reviews, conventional commits, atomic commits, unit and integration testing, and effective team communication. We discovered that working cleanly as a group isn't just about dividing tasks; it's about establishing a solid work ethic, using the right tools, and maintaining clear communication.

Through Minishell, we came to understand:

  • Process management: fork(), execve(), wait(), and process lifecycle
  • File descriptors: stdin, stdout, stderr, pipes, and redirections
  • Parsing: Tokenization, syntax analysis, and command-line interpretation
  • Signals: Handling keyboard interrupts (Ctrl-C, Ctrl-D, Ctrl-\)
  • Environment variables: Managing and expanding shell variables
  • Built-in commands: Implementing commands that must run in the shell process
  • Team collaboration: Working with Git, code review, and project management tools
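
The first three bullets come together in the core of any shell: fork a child, execve() the program in it, and reap it with waitpid(). A minimal sketch of that lifecycle (a hypothetical helper for illustration, not our actual executor):

```c
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, execve() the given program, and wait for it.
 * Returns the child's exit status, or -1 on fork/wait failure.
 * Illustrative only; the real executor handles far more cases. */
int	run_command(char *path, char *const argv[], char *const envp[])
{
	pid_t	pid;
	int		status;

	pid = fork();
	if (pid < 0)
		return (-1);
	if (pid == 0)
	{
		execve(path, argv, envp);
		_exit(127); /* execve only returns on error */
	}
	if (waitpid(pid, &status, 0) < 0)
		return (-1);
	if (WIFEXITED(status))
		return (WEXITSTATUS(status));
	return (128 + WTERMSIG(status)); /* bash-style status for signals */
}
```

Note the `_exit(127)` after execve: 127 is the conventional "command not found" status, and `_exit` avoids flushing the parent's duplicated stdio buffers.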

Team Collaboration

Contributors:

Our Workflow

This project marked our introduction to professional software development practices. We established a rigorous workflow that would serve as a foundation for all future collaborative projects:

Project Management:

  • Notion Workspace: Used as a Trello-like kanban board
    • Backlog of features and tasks
    • Sprint planning with task assignments
    • Progress tracking with status columns (To Do, In Progress, Review, Done)
    • Bug tracking and technical debt management
    • Meeting notes and decision logs

Version Control:

  • GitHub Flow: Feature branch workflow
    • main branch always stable and deployable
    • Feature branches for each new feature or bug fix
    • Branch naming convention: feature/parser, fix/pipe-leaks, refactor/executor
  • Atomic Commits: Each commit represents one logical change
    • Small, focused commits that do one thing
    • Easy to review, revert, and understand
    • Clean Git history tells the project's story
  • Conventional Commits: Standardized commit messages
    feat: add pipe execution with multiple commands
    fix: resolve memory leak in tokenizer
    refactor: split parser into smaller functions
    docs: update README with built-in commands
    test: add test cases for redirections
    

Code Review Process:

  • Pull Requests: All code merged via PR
    • Descriptive PR titles and descriptions
    • Link to related Notion tickets
    • Screenshots or test outputs when applicable
  • Peer Review: Mandatory code review before merge
    • At least one approval required
    • Review for correctness, style, edge cases, memory leaks
    • Discussion and iteration on complex changes
  • No direct pushes to main: Enforced branch protection

Benefits of Our Workflow:

  • Quality: Code review caught bugs before they reached main
  • Knowledge sharing: Both team members understood all parts of the codebase
  • Documentation: Commit history and PR discussions serve as project documentation
  • Conflict resolution: Feature branches minimized merge conflicts
  • Professional skills: Learned industry-standard development practices

This structured approach transformed a complex project into a manageable, collaborative effort. The use of 40 pull requests with mandatory peer review ensured that every piece of code was validated by both team members, catching bugs early and maintaining code quality. Our 224 commits following atomic and conventional practices created a clean, readable history that served as documentation, making it easy to understand why changes were made and to revert problematic code when needed.

The skills we developed (clear communication, systematic problem-solving, and rigorous testing) proved invaluable for this and all subsequent projects.

Instructions

Compilation

The project includes a Makefile with standard rules:

make        # Compiles minishell
make clean  # Removes object files
make fclean # Removes object files and executable
make re     # Recompiles everything

Compilation Flags

All source files must compile with:

cc -Wall -Wextra -Werror

You'll need to link with the readline library (the provided Makefile already handles this for you):

cc -Wall -Wextra -Werror *.c -lreadline

External Functions Allowed

Readline library:

  • readline, rl_clear_history, rl_on_new_line, rl_replace_line, rl_redisplay, add_history

Standard I/O:

  • printf, malloc, free, write, read, open, close

Process management:

  • fork, wait, waitpid, wait3, wait4, execve, exit

Signal handling:

  • signal, sigaction, sigemptyset, sigaddset, kill

File system:

  • access, stat, lstat, fstat, unlink, getcwd, chdir
  • opendir, readdir, closedir

Pipes and redirection:

  • dup, dup2, pipe

Error handling:

  • strerror, perror

Terminal:

  • isatty, ttyname, ttyslot, ioctl
  • tcsetattr, tcgetattr, tgetent, tgetflag, tgetnum, tgetstr, tgoto, tputs

Environment:

  • getenv

Project Structure

The codebase is organized into a clean, modular architecture with clear separation of concerns:

Source Code

src/
├── builtins/         # Built-in commands
│   ├── cd.c, cd_update.c
│   ├── echo.c, pwd.c, env.c, exit.c, unset.c
│   └── export.c + 5 export utilities (array, sort, update, utils, validate)
│
├── parser/           # Tokenization and AST
│   ├── tokenizer_smart_split.c      # Quote-aware tokenization
│   ├── tokenizer_count_tokens.c, tokenizer_utils.c
│   ├── categorize_tokens.c          # Token type identification
│   ├── ast_build.c, ast_build_utils.c, ast_create_nodes.c
│   ├── ast_free.c, validate_syntax.c
│   └── quote_trimming.c, execute_tokenizer.c
│
├── execution/        # Command execution
│   ├── execute_ast_tree.c           # AST traversal and execution
│   ├── execute_pipeline.c, pipeline_wait.c
│   ├── execute_builtins.c, execute_external_cmd.c
│   ├── redirections.c, heredoc.c
│   ├── find_executable.c            # PATH resolution
│   └── fd_utils.c, ast_utils.c, build_env_array.c
│
├── expansion/        # Variable expansion
│   ├── expansion.c                  # Main expansion logic
│   ├── expansion_extract.c          # Variable name extraction
│   ├── expansion_replace.c          # Value replacement ($VAR, $?)
│   └── expansion_utils.c
│
├── core/             # Shell initialization
│   ├── minishell_loop.c             # Main REPL loop
│   ├── init_shell.c                 # Environment setup
│   └── ascii_art_themes.c, print_ascii_art.c
│
├── signals/          # Signal handling
│   ├── signal_setup.c               # Signal configuration
│   └── signal_handlers.c            # SIGINT, SIGQUIT handlers
│
├── env/              # Environment management
│   └── env_import.c                 # envp import and conversion
│
├── utils/            # Utilities
│   ├── is_whitespace.c, memory_cleanup.c
│   └── print_error.c
│
└── main.c            # Entry point

Dependencies:

  • libft/ - Custom C standard library
    • String manipulation, memory management, linked lists, I/O functions
    • Reusable across multiple 42 projects

This modular design allowed us to work on different features in parallel without conflicts. Each module has a clear responsibility, making code review easier and reducing coupling between components.

Testing Strategy

One of the key learnings from this project was the importance of systematic testing. We built a custom test framework from scratch to validate our implementation.

Test Infrastructure

tests/
├── unit/                      # Unit tests
│   ├── test_builtin_echo.c            # Echo with -n flag
│   ├── test_expansion_*.c             # 5 expansion tests
│   ├── test_categorize_tokens.c       # Token categorization
│   ├── test_categorize_ast.c          # AST node types
│   ├── test_quote_trim_ast.c          # Quote removal
│   └── test_easy_quote_trim.c
│
├── integration/               # Integration tests (deprecated, moved to manual)
│
├── scripts/                   # Test automation scripts
│   └── test_bash_exit_code.sh
│
├── Makefile                   # Test build system
└── ast_print.c                # AST debugging utility

Test Framework Features:

  • TDD approach: Tests written before/during implementation (for some of the builtins tests)
  • Colorized output: Green/red for pass/fail with detailed error messages
  • Manual assertions: printf-based validation (no external test library)
  • Valgrind integration: Memory leak detection with readline suppression
  • Modular test builds: Each test compiles to separate binary
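
The "manual assertions" style above can be sketched as a tiny colorized macro. The names here (CHECK, check_failed) are hypothetical; the real framework's macros differ:

```c
#include <stdio.h>

/* Tracks whether any assertion has failed so far. */
static int	g_failed = 0;

/* Print a green [PASS] or red [FAIL] line for the given expression,
 * using ANSI escape codes, and record failures. */
#define CHECK(expr) \
	do { \
		if (expr) \
			printf("\033[32m[PASS]\033[0m %s\n", #expr); \
		else { \
			printf("\033[31m[FAIL]\033[0m %s (%s:%d)\n", \
				#expr, __FILE__, __LINE__); \
			g_failed = 1; \
		} \
	} while (0)

/* Non-zero once any CHECK has failed; used as the test's exit code. */
int	check_failed(void)
{
	return (g_failed);
}
```

Building on printf instead of an external framework kept the tests Norm-friendly and dependency-free.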

Running Tests:

# Build all tests
make -C tests/

# Run specific test
make -C tests/ TEST=unit/test_builtin_echo.c run

# Run with valgrind
make -C tests/ TEST=unit/test_expansion_get_var_value.c valgrind

Why we built our own framework:

  • Learning experience: we learned how testing frameworks work internally
  • Full control over test behavior and output formatting
  • Tailored to our specific testing needs (AST validation, shell state management)

The testing process taught us that good tests are as important as good code. Writing tests forced us to think about edge cases, validated our design decisions, and gave us confidence when refactoring.

Program Usage

./minishell

The program displays a prompt and waits for commands:

minishell$ ls -la
minishell$ echo "Hello, World!"
minishell$ cat file.txt | grep pattern | wc -l
minishell$ export VAR=value
minishell$ echo $VAR
minishell$ exit

Core Features

Interactive shell:

  • Display a prompt when waiting for a new command
  • Readline integration with working history (up/down arrows)
  • Search and launch executables (based on PATH or absolute/relative path)

Quote handling:

  • ' (single quotes): Prevents interpretation of all metacharacters
  • " (double quotes): Prevents interpretation except for $

Redirections:

  • < : Redirect input from file
  • > : Redirect output to file (truncate)
  • << : Here-document (read until delimiter)
  • >> : Redirect output to file (append)
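
Under the hood, > and >> come down to open() plus dup2(). A simplified sketch (hypothetical helper; our redirections.c handles more cases):

```c
#include <fcntl.h>
#include <unistd.h>

/* Apply a '>' (truncate) or '>>' (append) redirection to stdout.
 * Returns 0 on success, -1 on error. */
int	redirect_stdout(const char *file, int append)
{
	int	flags;
	int	fd;

	flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
	fd = open(file, flags, 0644);
	if (fd < 0)
		return (-1);
	if (dup2(fd, STDOUT_FILENO) < 0)
	{
		close(fd);
		return (-1);
	}
	close(fd); /* stdout now refers to the file; original fd is redundant */
	return (0);
}
```

The same pattern with O_RDONLY and STDIN_FILENO implements `<`.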

Pipes:

  • | : Pipe output of one command to input of next
  • Support for multiple pipes: cmd1 | cmd2 | cmd3 | ...
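
A single pipe between two commands illustrates the mechanism the pipeline generalizes: the left child's stdout and the right child's stdin are both rewired to the pipe, and the parent closes both ends before waiting. A two-command sketch (not our actual execute_pipeline.c, which handles N commands):

```c
#include <sys/wait.h>
#include <unistd.h>

/* Run cmd1 | cmd2; returns the exit status of the last command,
 * or -1 on error. argv[0] of each command must be a full path. */
int	run_pipe(char *const cmd1[], char *const cmd2[], char *const envp[])
{
	int		fd[2];
	pid_t	left;
	pid_t	right;
	int		status;

	if (pipe(fd) < 0)
		return (-1);
	left = fork();
	if (left == 0)
	{
		dup2(fd[1], STDOUT_FILENO); /* write end becomes stdout */
		close(fd[0]);
		close(fd[1]);
		execve(cmd1[0], cmd1, envp);
		_exit(127);
	}
	right = fork();
	if (right == 0)
	{
		dup2(fd[0], STDIN_FILENO); /* read end becomes stdin */
		close(fd[0]);
		close(fd[1]);
		execve(cmd2[0], cmd2, envp);
		_exit(127);
	}
	close(fd[0]); /* parent must close both ends, or the reader */
	close(fd[1]); /* never sees EOF and the pipeline hangs       */
	waitpid(left, &status, 0);
	waitpid(right, &status, 0);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

Forgetting the parent's close() calls is the classic cause of a pipeline that never terminates.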

Environment variables:

  • $VARIABLE: Expands to its value
  • $?: Expands to the exit status of the last executed command

Signal handling:

In interactive mode:

  • ctrl-C : Display a new prompt on a new line
  • ctrl-D : Exit the shell
  • ctrl-\ : Do nothing

Built-in Commands

These commands must be implemented as built-ins (run in the shell process, not as external programs):

  • echo (option: -n) : Print arguments; -n omits the trailing newline
  • cd (relative or absolute path) : Change directory
  • pwd (no options) : Print working directory
  • export (no options) : Set environment variables
  • unset (no options) : Unset environment variables
  • env (no options) : Print environment variables
  • exit (no options) : Exit the shell

Example usage:

minishell$ cd /tmp
minishell$ pwd
/tmp
minishell$ export MY_VAR=hello
minishell$ echo $MY_VAR
hello
minishell$ env | grep MY_VAR
MY_VAR=hello
minishell$ unset MY_VAR
minishell$ echo $MY_VAR

minishell$ exit

Not Required

The shell is not required to interpret:

  • Unclosed quotes or special characters
  • \ (backslash)
  • ; (semicolon)

Mandatory Requirements

  • One global variable maximum: Only for signal number
  • No memory leaks: Except those from readline() (a known issue; we used a valgrind suppression (.sup) file to ignore leaks from the readline library)
  • Bash reference: Use bash to resolve doubts about requirements
  • Error handling: Handle errors gracefully without crashing
  • Norm compliance: Code must follow 42 Norm

Bonus Features

Not implemented (lack of time), but since we used an AST during the parsing phase, the groundwork is in place for the bonus if we ever return to the project (which I hope we can, one day).

Bonus requirements:

Logical operators:

  • &&: Execute next command only if previous succeeded
  • ||: Execute next command only if previous failed
  • Parentheses for priority: (cmd1 && cmd2) || cmd3

Wildcards:

  • *: Wildcard expansion for current working directory

Resources

Shell Implementation

Readline Library

Process Management

File Descriptors and Pipes

Signal Handling

Parsing

Testing

# Compare with bash behavior
bash$ echo "test" | cat -e
test$
minishell$ echo "test" | cat -e
test$

# Test redirections
minishell$ echo "hello" > file.txt
minishell$ cat < file.txt
hello
minishell$ echo "world" >> file.txt
minishell$ cat file.txt
hello
world

# Test pipes
minishell$ ls -la | grep minishell | wc -l

# Test environment variables
minishell$ export TEST=hello
minishell$ echo $TEST
hello
minishell$ env | grep TEST
TEST=hello

# Test exit status
minishell$ ls nonexistent
ls: cannot access 'nonexistent': No such file or directory
minishell$ echo $?
2

# Test signals
minishell$ sleep 10
^C
minishell$

# Test here-doc
minishell$ cat << EOF
> line 1
> line 2
> EOF
line 1
line 2

Notes

Implementation Strategy:

  1. Lexer (Tokenization): Split input into tokens (words, operators, quotes)
  2. Parser: Build command structure from tokens (commands, arguments, redirections, pipes)
  3. Expander: Expand environment variables (and wildcards, if the bonus is ever implemented)
  4. Executor: Execute commands with proper redirections and pipes
  5. Built-ins: Handle built-in commands separately
  6. Cleanup: Free all allocated memory
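
Step 1 is where quote awareness first matters: the lexer must treat a quoted span as part of one token rather than splitting on the spaces inside it. A small sketch of that idea, here just counting tokens (hypothetical helper; our tokenizer_count_tokens.c differs):

```c
/* Count whitespace-separated tokens, treating single- and
 * double-quoted spans as part of the enclosing token. */
int	count_tokens(const char *s)
{
	int		count = 0;
	char	quote;

	while (*s)
	{
		while (*s == ' ' || *s == '\t')
			s++;
		if (!*s)
			break ;
		count++;
		while (*s && *s != ' ' && *s != '\t')
		{
			if (*s == '\'' || *s == '"')
			{
				quote = *s++;
				while (*s && *s != quote) /* skip over quoted span */
					s++;
				if (*s)
					s++; /* skip the closing quote */
			}
			else
				s++;
		}
	}
	return (count);
}
```

With this rule, `echo "Hello, World" | wc` yields four tokens, not five.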

Architecture Overview:

┌─────────────┐
│   Readline  │  Get user input with history
└──────┬──────┘
       │
┌──────▼──────┐
│    Lexer    │  Tokenize input (spaces, quotes, operators)
└──────┬──────┘
       │
┌──────▼──────┐
│   Parser    │  Build command structure (AST)
└──────┬──────┘
       │
┌──────▼──────┐
│  Expander   │  Expand $VAR, $?, remove quotes
└──────┬──────┘
       │
┌──────▼──────┐
│  Executor   │  Execute commands with pipes/redirections
└──────┬──────┘
       │
┌──────▼──────┐
│   Cleanup   │  Free memory, close file descriptors
└─────────────┘

Important Considerations:

  • Global variable: Only one allowed (for signal number)
  • Readline memory: readline() has known memory leaks - acceptable per the subject
  • Bash reference: When in doubt, test with bash
  • Error messages: Should match bash as closely as possible (full POSIX compliance is not required)
  • File descriptor management: Always close unused FDs
  • Process cleanup: Always wait() for child processes
  • Quote handling: Remove quotes during expansion
  • Empty commands: Handle empty input gracefully

Common Pitfalls:

  • Not handling signals properly: Signals must behave differently in interactive vs execution mode
  • Built-ins in pipes: Built-ins in pipelines must run in child processes
  • File descriptor leaks: Always close all file descriptors
  • Quote removal: Quotes must be removed during expansion
  • Prompt in non-interactive mode: Only show prompt when stdin is a terminal (isatty())
  • Zombie processes: Always wait for child processes
  • Parsing order: Parse before expansion (don't expand inside quotes during lexing)
  • Exit status: Must track and update $? correctly
  • Memory leaks: Free all allocated memory (except readline)
  • Path resolution: Search PATH directories in order for command execution

Testing Checklist:

  • Prompt displays correctly
  • History works (up/down arrows)
  • Simple commands execute (ls, cat, echo)
  • Commands with arguments work
  • Absolute and relative paths work
  • Input redirection < works
  • Output redirection > works
  • Append redirection >> works
  • Here-doc << works
  • Single pipes work
  • Multiple pipes work
  • Environment variables expand correctly
  • $? shows correct exit status
  • Single quotes prevent expansion
  • Double quotes allow $ expansion
  • All built-ins work correctly
  • Signals (ctrl-C, ctrl-D, ctrl-\) work correctly
  • No memory leaks (except readline)
  • No file descriptor leaks
  • Edge cases handled gracefully
  • Matches bash behavior

This is a school project. The code was validated and pushed to the school's Git repository after evaluation. The Git history in this public repository may not reflect the complete development process, as it represents the final version after successful validation.

Minishell Badge

Minishell Home 2
