
42 Paris project - group project with @d2codex - implementing a bash-like shell while discovering professional Git workflows, PR reviews, and test-driven development.


kidp8479/42_minishell

 
 


This project has been created as part of the 42 curriculum by pafroidu and diade-so.

Minishell Cover

Minishell Home

Description

Minishell is one of the most challenging and rewarding projects in the 42 curriculum. The goal is to create a functional shell, a command-line interpreter similar to bash. This project provides deep insight into process management, file descriptors, parsing, and how operating systems execute programs.

This was our first group project at 42, and it profoundly shaped how we approach software development. It taught us as much about teamwork, communication, and collaborative development as it did about system programming. Building a shell requires coordinating multiple complex subsystems (lexical analysis, parsing, execution, redirection, pipes, and built-in commands), making it an ideal project for discovering professional development workflows.

This project marked our introduction to tools and practices that would become essential: GitHub Flow, pull request reviews, conventional commits, atomic commits, unit and integration testing, and effective team communication. We discovered that working cleanly as a group isn't just about dividing tasks; it's about establishing a solid work ethic, using the right tools, and maintaining clear communication.

Through Minishell, we came to understand:

  • Process management: fork(), execve(), wait(), and process lifecycle
  • File descriptors: stdin, stdout, stderr, pipes, and redirections
  • Parsing: Tokenization, syntax analysis, and command-line interpretation
  • Signals: Handling keyboard interrupts (Ctrl-C, Ctrl-D, Ctrl-\)
  • Environment variables: Managing and expanding shell variables
  • Built-in commands: Implementing commands that must run in the shell process
  • Team collaboration: Working with Git, code review, and project management tools
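
The first three bullets come together in the core of any shell: fork a child, execve() the program in it, and reap it with waitpid(). A minimal sketch of that lifecycle (a hypothetical helper for illustration, not our actual executor):

```c
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, execve() the given program, and wait for it.
 * Returns the child's exit status, or -1 on fork/wait failure.
 * Illustrative only; the real executor handles far more cases. */
int	run_command(char *path, char *const argv[], char *const envp[])
{
	pid_t	pid;
	int		status;

	pid = fork();
	if (pid < 0)
		return (-1);
	if (pid == 0)
	{
		execve(path, argv, envp);
		_exit(127); /* execve only returns on error */
	}
	if (waitpid(pid, &status, 0) < 0)
		return (-1);
	if (WIFEXITED(status))
		return (WEXITSTATUS(status));
	return (128 + WTERMSIG(status)); /* bash-style status for signals */
}
```

Note the `_exit(127)` after execve: 127 is the conventional "command not found" status, and `_exit` avoids flushing the parent's duplicated stdio buffers.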

Team Collaboration

Contributors:

Our Workflow

This project marked our introduction to professional software development practices. We established a rigorous workflow that would serve as a foundation for all future collaborative projects:

Project Management:

  • Notion Workspace: Used as a Trello-like kanban board
    • Backlog of features and tasks
    • Sprint planning with task assignments
    • Progress tracking with status columns (To Do, In Progress, Review, Done)
    • Bug tracking and technical debt management
    • Meeting notes and decision logs

Version Control:

  • GitHub Flow: Feature branch workflow
    • main branch always stable and deployable
    • Feature branches for each new feature or bug fix
    • Branch naming convention: feature/parser, fix/pipe-leaks, refactor/executor
  • Atomic Commits: Each commit represents one logical change
    • Small, focused commits that do one thing
    • Easy to review, revert, and understand
    • Clean Git history tells the project's story
  • Conventional Commits: Standardized commit messages
    feat: add pipe execution with multiple commands
    fix: resolve memory leak in tokenizer
    refactor: split parser into smaller functions
    docs: update README with built-in commands
    test: add test cases for redirections
    

Code Review Process:

  • Pull Requests: All code merged via PR
    • Descriptive PR titles and descriptions
    • Link to related Notion tickets
    • Screenshots or test outputs when applicable
  • Peer Review: Mandatory code review before merge
    • At least one approval required
    • Review for correctness, style, edge cases, memory leaks
    • Discussion and iteration on complex changes
  • No direct pushes to main: Enforced branch protection

Benefits of Our Workflow:

  • Quality: Code review caught bugs before they reached main
  • Knowledge sharing: Both team members understood all parts of the codebase
  • Documentation: Commit history and PR discussions serve as project documentation
  • Conflict resolution: Feature branches minimized merge conflicts
  • Professional skills: Learned industry-standard development practices

This structured approach transformed a complex project into a manageable, collaborative effort. The use of 40 pull requests with mandatory peer review ensured that every piece of code was validated by both team members, catching bugs early and maintaining code quality. Our 224 commits following atomic and conventional practices created a clean, readable history that served as documentation, making it easy to understand why changes were made and to revert problematic code when needed.

The skills we developed (clear communication, systematic problem-solving, and rigorous testing) proved invaluable for this and all subsequent projects.

Instructions

Compilation

The project includes a Makefile with standard rules:

make        # Compiles minishell
make clean  # Removes object files
make fclean # Removes object files and executable
make re     # Recompiles everything

Compilation Flags

All source files must compile with:

cc -Wall -Wextra -Werror

You'll need to link with the readline library (the provided Makefile already handles this for you):

cc -Wall -Wextra -Werror *.c -lreadline

External Functions Allowed

Readline library:

  • readline, rl_clear_history, rl_on_new_line, rl_replace_line, rl_redisplay, add_history

Standard I/O:

  • printf, malloc, free, write, read, open, close

Process management:

  • fork, wait, waitpid, wait3, wait4, execve, exit

Signal handling:

  • signal, sigaction, sigemptyset, sigaddset, kill

File system:

  • access, stat, lstat, fstat, unlink, getcwd, chdir
  • opendir, readdir, closedir

Pipes and redirection:

  • dup, dup2, pipe

Error handling:

  • strerror, perror

Terminal:

  • isatty, ttyname, ttyslot, ioctl
  • tcsetattr, tcgetattr, tgetent, tgetflag, tgetnum, tgetstr, tgoto, tputs

Environment:

  • getenv

Project Structure

The codebase is organized into a clean, modular architecture with clear separation of concerns:

Source Code

src/
├── builtins/         # Built-in commands
│   ├── cd.c, cd_update.c
│   ├── echo.c, pwd.c, env.c, exit.c, unset.c
│   └── export.c + 5 export utilities (array, sort, update, utils, validate)
│
├── parser/           # Tokenization and AST
│   ├── tokenizer_smart_split.c      # Quote-aware tokenization
│   ├── tokenizer_count_tokens.c, tokenizer_utils.c
│   ├── categorize_tokens.c          # Token type identification
│   ├── ast_build.c, ast_build_utils.c, ast_create_nodes.c
│   ├── ast_free.c, validate_syntax.c
│   └── quote_trimming.c, execute_tokenizer.c
│
├── execution/        # Command execution
│   ├── execute_ast_tree.c           # AST traversal and execution
│   ├── execute_pipeline.c, pipeline_wait.c
│   ├── execute_builtins.c, execute_external_cmd.c
│   ├── redirections.c, heredoc.c
│   ├── find_executable.c            # PATH resolution
│   └── fd_utils.c, ast_utils.c, build_env_array.c
│
├── expansion/        # Variable expansion
│   ├── expansion.c                  # Main expansion logic
│   ├── expansion_extract.c          # Variable name extraction
│   ├── expansion_replace.c          # Value replacement ($VAR, $?)
│   └── expansion_utils.c
│
├── core/             # Shell initialization
│   ├── minishell_loop.c             # Main REPL loop
│   ├── init_shell.c                 # Environment setup
│   └── ascii_art_themes.c, print_ascii_art.c
│
├── signals/          # Signal handling
│   ├── signal_setup.c               # Signal configuration
│   └── signal_handlers.c            # SIGINT, SIGQUIT handlers
│
├── env/              # Environment management
│   └── env_import.c                 # envp import and conversion
│
├── utils/            # Utilities
│   ├── is_whitespace.c, memory_cleanup.c
│   └── print_error.c
│
└── main.c            # Entry point

Dependencies:

  • libft/ - Custom C standard library
    • String manipulation, memory management, linked lists, I/O functions
    • Reusable across multiple 42 projects

This modular design allowed us to work on different features in parallel without conflicts. Each module has a clear responsibility, making code review easier and reducing coupling between components.

Testing Strategy

One of the key learnings from this project was the importance of systematic testing. We built a custom test framework from scratch to validate our implementation.

Test Infrastructure

tests/
├── unit/                      # Unit tests
│   ├── test_builtin_echo.c            # Echo with -n flag
│   ├── test_expansion_*.c             # 5 expansion tests
│   ├── test_categorize_tokens.c       # Token categorization
│   ├── test_categorize_ast.c          # AST node types
│   ├── test_quote_trim_ast.c          # Quote removal
│   └── test_easy_quote_trim.c
│
├── integration/               # Integration tests (deprecated, moved to manual)
│
├── scripts/                   # Test automation scripts
│   └── test_bash_exit_code.sh
│
├── Makefile                   # Test build system
└── ast_print.c                # AST debugging utility

Test Framework Features:

  • TDD approach: Tests written before/during implementation (for some of the builtins tests)
  • Colorized output: Green/red for pass/fail with detailed error messages
  • Manual assertions: printf-based validation (no external test library)
  • Valgrind integration: Memory leak detection with readline suppression
  • Modular test builds: Each test compiles to separate binary
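
The "manual assertions" style above can be sketched as a tiny colorized macro. The names here (CHECK, check_failed) are hypothetical; the real framework's macros differ:

```c
#include <stdio.h>

/* Tracks whether any assertion has failed so far. */
static int	g_failed = 0;

/* Print a green [PASS] or red [FAIL] line for the given expression,
 * using ANSI escape codes, and record failures. */
#define CHECK(expr) \
	do { \
		if (expr) \
			printf("\033[32m[PASS]\033[0m %s\n", #expr); \
		else { \
			printf("\033[31m[FAIL]\033[0m %s (%s:%d)\n", \
				#expr, __FILE__, __LINE__); \
			g_failed = 1; \
		} \
	} while (0)

/* Non-zero once any CHECK has failed; used as the test's exit code. */
int	check_failed(void)
{
	return (g_failed);
}
```

Building on printf instead of an external framework kept the tests Norm-friendly and dependency-free.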

Running Tests:

# Build all tests
make -C tests/

# Run specific test
make -C tests/ TEST=unit/test_builtin_echo.c run

# Run with valgrind
make -C tests/ TEST=unit/test_expansion_get_var_value.c valgrind

Why we built our own framework:

  • Learning experience: we learned how testing frameworks work internally
  • Full control over test behavior and output formatting
  • Tailored to our specific testing needs (AST validation, shell state management)

The testing process taught us that good tests are as important as good code. Writing tests forced us to think about edge cases, validated our design decisions, and gave us confidence when refactoring.

Program Usage

./minishell

The program displays a prompt and waits for commands:

minishell$ ls -la
minishell$ echo "Hello, World!"
minishell$ cat file.txt | grep pattern | wc -l
minishell$ export VAR=value
minishell$ echo $VAR
minishell$ exit

Core Features

Interactive shell:

  • Display a prompt when waiting for a new command
  • Readline integration with working history (up/down arrows)
  • Search and launch executables (based on PATH or absolute/relative path)

Quote handling:

  • ' (single quotes): Prevents interpretation of all metacharacters
  • " (double quotes): Prevents interpretation except for $

Redirections:

  • < : Redirect input from file
  • > : Redirect output to file (truncate)
  • << : Here-document (read until delimiter)
  • >> : Redirect output to file (append)
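
Under the hood, > and >> come down to open() plus dup2(). A simplified sketch (hypothetical helper; our redirections.c handles more cases):

```c
#include <fcntl.h>
#include <unistd.h>

/* Apply a '>' (truncate) or '>>' (append) redirection to stdout.
 * Returns 0 on success, -1 on error. */
int	redirect_stdout(const char *file, int append)
{
	int	flags;
	int	fd;

	flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
	fd = open(file, flags, 0644);
	if (fd < 0)
		return (-1);
	if (dup2(fd, STDOUT_FILENO) < 0)
	{
		close(fd);
		return (-1);
	}
	close(fd); /* stdout now refers to the file; original fd is redundant */
	return (0);
}
```

The same pattern with O_RDONLY and STDIN_FILENO implements `<`.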

Pipes:

  • | : Pipe output of one command to input of next
  • Support for multiple pipes: cmd1 | cmd2 | cmd3 | ...
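
A single pipe between two commands illustrates the mechanism the pipeline generalizes: the left child's stdout and the right child's stdin are both rewired to the pipe, and the parent closes both ends before waiting. A two-command sketch (not our actual execute_pipeline.c, which handles N commands):

```c
#include <sys/wait.h>
#include <unistd.h>

/* Run cmd1 | cmd2; returns the exit status of the last command,
 * or -1 on error. argv[0] of each command must be a full path. */
int	run_pipe(char *const cmd1[], char *const cmd2[], char *const envp[])
{
	int		fd[2];
	pid_t	left;
	pid_t	right;
	int		status;

	if (pipe(fd) < 0)
		return (-1);
	left = fork();
	if (left == 0)
	{
		dup2(fd[1], STDOUT_FILENO); /* write end becomes stdout */
		close(fd[0]);
		close(fd[1]);
		execve(cmd1[0], cmd1, envp);
		_exit(127);
	}
	right = fork();
	if (right == 0)
	{
		dup2(fd[0], STDIN_FILENO); /* read end becomes stdin */
		close(fd[0]);
		close(fd[1]);
		execve(cmd2[0], cmd2, envp);
		_exit(127);
	}
	close(fd[0]); /* parent must close both ends, or the reader */
	close(fd[1]); /* never sees EOF and the pipeline hangs       */
	waitpid(left, &status, 0);
	waitpid(right, &status, 0);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

Forgetting the parent's close() calls is the classic cause of a pipeline that never terminates.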

Environment variables:

  • $VARIABLE: Expands to its value
  • $?: Expands to the exit status of the last executed command

Signal handling:

In interactive mode:

  • ctrl-C : Display a new prompt on a new line
  • ctrl-D : Exit the shell
  • ctrl-\ : Do nothing

Built-in Commands

These commands must be implemented as built-ins (run in the shell process, not as external programs):

  • echo (option: -n) : Print arguments; -n omits the trailing newline
  • cd (relative or absolute path) : Change directory
  • pwd (no options) : Print working directory
  • export (no options) : Set environment variables
  • unset (no options) : Unset environment variables
  • env (no options) : Print environment variables
  • exit (no options) : Exit the shell

Example usage:

minishell$ cd /tmp
minishell$ pwd
/tmp
minishell$ export MY_VAR=hello
minishell$ echo $MY_VAR
hello
minishell$ env | grep MY_VAR
MY_VAR=hello
minishell$ unset MY_VAR
minishell$ echo $MY_VAR

minishell$ exit

Not Required

The shell is not required to interpret:

  • Unclosed quotes or special characters
  • \ (backslash)
  • ; (semicolon)

Mandatory Requirements

  • One global variable maximum: Only for signal number
  • No memory leaks: Except those from readline() (a known issue; we used a valgrind suppression (.sup) file to ignore leaks from the readline library)
  • Bash reference: Use bash to resolve doubts about requirements
  • Error handling: Handle errors gracefully without crashing
  • Norm compliance: Code must follow 42 Norm

Bonus Features

Not implemented (lack of time), but since we used an AST during the parsing phase, the groundwork is in place for the bonus if we ever return to the project (which I hope we can, one day).

Bonus requirements:

Logical operators:

  • &&: Execute next command only if previous succeeded
  • ||: Execute next command only if previous failed
  • Parentheses for priority: (cmd1 && cmd2) || cmd3

Wildcards:

  • *: Wildcard expansion for current working directory

Resources

Shell Implementation

Readline Library

Process Management

File Descriptors and Pipes

Signal Handling

Parsing

Testing

# Compare with bash behavior
bash$ echo "test" | cat -e
test$
minishell$ echo "test" | cat -e
test$

# Test redirections
minishell$ echo "hello" > file.txt
minishell$ cat < file.txt
hello
minishell$ echo "world" >> file.txt
minishell$ cat file.txt
hello
world

# Test pipes
minishell$ ls -la | grep minishell | wc -l

# Test environment variables
minishell$ export TEST=hello
minishell$ echo $TEST
hello
minishell$ env | grep TEST
TEST=hello

# Test exit status
minishell$ ls nonexistent
ls: cannot access 'nonexistent': No such file or directory
minishell$ echo $?
2

# Test signals
minishell$ sleep 10
^C
minishell$

# Test here-doc
minishell$ cat << EOF
> line 1
> line 2
> EOF
line 1
line 2

Notes

Implementation Strategy:

  1. Lexer (Tokenization): Split input into tokens (words, operators, quotes)
  2. Parser: Build command structure from tokens (commands, arguments, redirections, pipes)
  3. Expander: Expand environment variables (and wildcards, if the bonus is ever implemented)
  4. Executor: Execute commands with proper redirections and pipes
  5. Built-ins: Handle built-in commands separately
  6. Cleanup: Free all allocated memory
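
Step 1 is where quote awareness first matters: the lexer must treat a quoted span as part of one token rather than splitting on the spaces inside it. A small sketch of that idea, here just counting tokens (hypothetical helper; our tokenizer_count_tokens.c differs):

```c
/* Count whitespace-separated tokens, treating single- and
 * double-quoted spans as part of the enclosing token. */
int	count_tokens(const char *s)
{
	int		count = 0;
	char	quote;

	while (*s)
	{
		while (*s == ' ' || *s == '\t')
			s++;
		if (!*s)
			break ;
		count++;
		while (*s && *s != ' ' && *s != '\t')
		{
			if (*s == '\'' || *s == '"')
			{
				quote = *s++;
				while (*s && *s != quote) /* skip over quoted span */
					s++;
				if (*s)
					s++; /* skip the closing quote */
			}
			else
				s++;
		}
	}
	return (count);
}
```

With this rule, `echo "Hello, World" | wc` yields four tokens, not five.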

Architecture Overview:

┌─────────────┐
│   Readline  │  Get user input with history
└──────┬──────┘
       │
┌──────▼──────┐
│    Lexer    │  Tokenize input (spaces, quotes, operators)
└──────┬──────┘
       │
┌──────▼──────┐
│   Parser    │  Build command structure (AST)
└──────┬──────┘
       │
┌──────▼──────┐
│  Expander   │  Expand $VAR, $?, remove quotes
└──────┬──────┘
       │
┌──────▼──────┐
│  Executor   │  Execute commands with pipes/redirections
└──────┬──────┘
       │
┌──────▼──────┐
│   Cleanup   │  Free memory, close file descriptors
└─────────────┘

Important Considerations:

  • Global variable: Only one allowed (for signal number)
  • Readline memory: readline() has known memory leaks - acceptable per the subject
  • Bash reference: When in doubt, test with bash
  • Error messages: Should match bash as closely as possible (full POSIX compliance is not required)
  • File descriptor management: Always close unused FDs
  • Process cleanup: Always wait() for child processes
  • Quote handling: Remove quotes during expansion
  • Empty commands: Handle empty input gracefully

Common Pitfalls:

  • Not handling signals properly: Signals must behave differently in interactive vs execution mode
  • Built-ins in pipes: Built-ins in pipelines must run in child processes
  • File descriptor leaks: Always close all file descriptors
  • Quote removal: Quotes must be removed during expansion
  • Prompt in non-interactive mode: Only show prompt when stdin is a terminal (isatty())
  • Zombie processes: Always wait for child processes
  • Parsing order: Parse before expansion (don't expand inside quotes during lexing)
  • Exit status: Must track and update $? correctly
  • Memory leaks: Free all allocated memory (except readline)
  • Path resolution: Search PATH directories in order for command execution

Testing Checklist:

  • Prompt displays correctly
  • History works (up/down arrows)
  • Simple commands execute (ls, cat, echo)
  • Commands with arguments work
  • Absolute and relative paths work
  • Input redirection < works
  • Output redirection > works
  • Append redirection >> works
  • Here-doc << works
  • Single pipes work
  • Multiple pipes work
  • Environment variables expand correctly
  • $? shows correct exit status
  • Single quotes prevent expansion
  • Double quotes allow $ expansion
  • All built-ins work correctly
  • Signals (ctrl-C, ctrl-D, ctrl-\) work correctly
  • No memory leaks (except readline)
  • No file descriptor leaks
  • Edge cases handled gracefully
  • Matches bash behavior

This is a school project. The code was validated and pushed to the school's Git repository after evaluation. The Git history in this public repository may not reflect the complete development process, as it represents the final version after successful validation.

Minishell Badge

Minishell Home 2
