The goal of this project is to recode GNU nm: no options for the mandatory part, and a few for the bonus part (see below).
As usual at 42, no external library is allowed. The only permitted functions are: open, close, mmap, munmap, write, fstat, malloc, free, exit, perror, strerror, getpagesize.
This project relies heavily on data structures and parsing of a binary file.
While it may seem simple at first, it requires a solid understanding of the ELF file format as well as a lot of error handling.
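As a starting point, here is a minimal sketch of the first checks, using only functions from the allowed list (open, fstat, mmap, close). The names `map_file` and `elf_class` are hypothetical helpers invented for this example, not part of the subject.

```c
#include <elf.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole regular file read-only.
 * Returns NULL on any failure (missing file, directory, empty file...). */
void *map_file(const char *path, size_t *size)
{
    struct stat st;
    void *map;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode) || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return NULL;
    *size = (size_t)st.st_size;
    return map;
}

/* Hypothetical helper: returns ELFCLASS32 or ELFCLASS64 when the buffer
 * starts with a plausible ELF header, -1 otherwise. */
int elf_class(const unsigned char *map, size_t size)
{
    if (size < EI_NIDENT)
        return -1;
    if (map[EI_MAG0] != ELFMAG0 || map[EI_MAG1] != ELFMAG1
        || map[EI_MAG2] != ELFMAG2 || map[EI_MAG3] != ELFMAG3)
        return -1;
    /* The file must be large enough to hold the full header of its class. */
    if (map[EI_CLASS] == ELFCLASS32 && size >= sizeof(Elf32_Ehdr))
        return ELFCLASS32;
    if (map[EI_CLASS] == ELFCLASS64 && size >= sizeof(Elf64_Ehdr))
        return ELFCLASS64;
    return -1;
}
```

Checking the size before reading any field is the pattern that matters here: every offset found in the file later on deserves the same treatment.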
To any student who would like to go down this path:
While you have already met ELF files all along your cursus, perhaps you have never really looked at how they are built. A good first step is therefore man elf.
You will learn about:
typedef struct {
unsigned char e_ident[EI_NIDENT];
uint16_t e_type;
uint16_t e_machine;
uint32_t e_version;
ElfN_Addr e_entry;
ElfN_Off e_phoff;
ElfN_Off e_shoff;
uint32_t e_flags;
uint16_t e_ehsize;
uint16_t e_phentsize;
uint16_t e_phnum;
uint16_t e_shentsize;
uint16_t e_shnum;
uint16_t e_shstrndx;
} ElfN_Ehdr;
typedef struct {
uint32_t p_type;
Elf32_Off p_offset;
Elf32_Addr p_vaddr;
Elf32_Addr p_paddr;
uint32_t p_filesz;
uint32_t p_memsz;
uint32_t p_flags;
uint32_t p_align;
} Elf32_Phdr;
typedef struct {
uint32_t p_type;
uint32_t p_flags;
Elf64_Off p_offset;
Elf64_Addr p_vaddr;
Elf64_Addr p_paddr;
uint64_t p_filesz;
uint64_t p_memsz;
uint64_t p_align;
} Elf64_Phdr;
typedef struct {
uint32_t sh_name;
uint32_t sh_type;
uint32_t sh_flags;
Elf32_Addr sh_addr;
Elf32_Off sh_offset;
uint32_t sh_size;
uint32_t sh_link;
uint32_t sh_info;
uint32_t sh_addralign;
uint32_t sh_entsize;
} Elf32_Shdr;
typedef struct {
uint32_t sh_name;
uint32_t sh_type;
uint64_t sh_flags;
Elf64_Addr sh_addr;
Elf64_Off sh_offset;
uint64_t sh_size;
uint32_t sh_link;
uint32_t sh_info;
uint64_t sh_addralign;
uint64_t sh_entsize;
} Elf64_Shdr;
typedef struct {
uint32_t st_name;
Elf32_Addr st_value;
uint32_t st_size;
unsigned char st_info;
unsigned char st_other;
uint16_t st_shndx;
} Elf32_Sym;
typedef struct {
uint32_t st_name;
unsigned char st_info;
unsigned char st_other;
uint16_t st_shndx;
Elf64_Addr st_value;
uint64_t st_size;
} Elf64_Sym;
And other types like relocation entries, dynamic tags and notes.
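To make these structures a bit more concrete, here is a rough sketch of how the ELF header leads to the section headers, and from there to the symbol table. It only covers 64-bit files in the host byte order, and it ignores the extended numbering used when a file has a very large number of sections. `in_bounds` and `find_symtab64` are hypothetical names chosen for this example.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* True when [off, off + len) lies inside a file of `size` bytes.
 * Written so that the check itself cannot overflow. */
int in_bounds(uint64_t off, uint64_t len, uint64_t size)
{
    return off <= size && len <= size - off;
}

/* Returns the first SHT_SYMTAB section header of a mapped 64-bit ELF file,
 * or NULL if there is none or if any offset points outside the file.
 * Assumes `map` is suitably aligned (an mmap'd file is page-aligned) and
 * that e_shoff keeps that alignment, as well-formed files do. */
const Elf64_Shdr *find_symtab64(const unsigned char *map, size_t size)
{
    const Elf64_Ehdr *eh;
    const Elf64_Shdr *sh;
    uint16_t i;

    if (size < sizeof(*eh))
        return NULL;
    eh = (const Elf64_Ehdr *)map;
    if (eh->e_shentsize != sizeof(*sh))
        return NULL;
    if (!in_bounds(eh->e_shoff, (uint64_t)eh->e_shnum * sizeof(*sh), size))
        return NULL;
    sh = (const Elf64_Shdr *)(map + eh->e_shoff);
    for (i = 0; i < eh->e_shnum; i++) {
        if (sh[i].sh_type != SHT_SYMTAB)
            continue;
        /* The table itself must also fit in the file. */
        if (sh[i].sh_entsize != sizeof(Elf64_Sym)
            || !in_bounds(sh[i].sh_offset, sh[i].sh_size, size))
            return NULL;
        return &sh[i];
    }
    return NULL;
}
```

From the returned header, the number of symbols is sh_size / sh_entsize, and sh_link is the index of the string table section holding the names that st_name points into. Those two also need bounds checks before use.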
This is a daunting task: no error or crash is tolerated, yet any file can be passed as input.
This project is a good entry point before writing your own packer and, later on, viruses.
To score above the normal maximum grade, the following bonus options had to be implemented:
-a : display all symbols even debugger-only symbols
-g : display only external (global) symbols
-u : display only undefined symbols (external to each object file)
-r : reverse sort
-p : no-sort
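The filtering options above mostly come down to decoding st_info and st_shndx. Here is a minimal sketch for 64-bit symbols; `struct nm_opts` and `keep_symbol` are hypothetical names, and the rules are my reading of the option descriptions rather than a copy of what GNU nm does internally.

```c
#include <elf.h>

/* Hypothetical option flags, set while parsing argv. */
struct nm_opts {
    int all;         /* -a */
    int global_only; /* -g */
    int undef_only;  /* -u */
};

/* Returns 1 if the symbol should be displayed, 0 otherwise.
 * Entry 0 of the symbol table is the reserved null symbol: the caller
 * is assumed to skip it before calling this function. */
int keep_symbol(const Elf64_Sym *sym, const struct nm_opts *o)
{
    unsigned char bind = ELF64_ST_BIND(sym->st_info);
    unsigned char type = ELF64_ST_TYPE(sym->st_info);

    /* File and section symbols are treated here as "debugger-only". */
    if (!o->all && (type == STT_FILE || type == STT_SECTION))
        return 0;
    /* -u: keep only symbols that are not defined in any section. */
    if (o->undef_only && sym->st_shndx != SHN_UNDEF)
        return 0;
    /* -g: external means anything that is not local (global or weak). */
    if (o->global_only && bind == STB_LOCAL)
        return 0;
    return 1;
}
```

The -r and -p options do not filter anything; they only change the order in which the kept symbols are printed.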
Before losing your mind over sorting algorithms, I highly suggest reading a bit about locales, in particular LC_COLLATE and/or LC_ALL.
Since no library is allowed in this exercise, we cannot rely on libbfd, the Binary File Descriptor library that handles this part for GNU nm.
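The reason locales matter: as far as I can tell, the order nm prints depends on the collation rules of the current locale, so the reference output can change between environments. In the C locale (for instance when running with LC_ALL=C) collation is plain byte order, which is easy to reproduce without libc. The sketch below assumes that setting; `name_cmp` and `sort_names` are hypothetical names, and insertion sort is used only because it is short and stable.

```c
#include <stddef.h>

/* Byte-wise comparison, equivalent to strcmp (which is not allowed here). */
int name_cmp(const char *a, const char *b)
{
    const unsigned char *x = (const unsigned char *)a;
    const unsigned char *y = (const unsigned char *)b;

    while (*x && *x == *y) {
        x++;
        y++;
    }
    return (int)*x - (int)*y;
}

/* Stable insertion sort on an array of names.
 * reverse != 0 flips the order (-r). For -p, simply do not call it. */
void sort_names(const char **names, size_t n, int reverse)
{
    const char *key;
    size_t i;
    size_t j;
    int c;

    for (i = 1; i < n; i++) {
        key = names[i];
        j = i;
        while (j > 0) {
            c = name_cmp(names[j - 1], key);
            if (reverse)
                c = -c;
            if (c <= 0)
                break;
            names[j] = names[j - 1]; /* shift larger entries right */
            j--;
        }
        names[j] = key;
    }
}
```

With this comparison, uppercase letters sort before the underscore, which sorts before lowercase letters: "Zed", "_start", "abc", "main". Insertion sort is quadratic, so a merge sort is probably a better fit for binaries with very large symbol tables.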
Fully understanding the (I have to say, arbitrary) symbol categories used by nm was a real challenge and led to a ton of trial and error with custom-built binaries. Good luck to you if you go down this path.
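To give an idea of what these categories look like in code, here is a deliberately incomplete first approximation for 64-bit symbols. It is not nm's actual logic, only a heuristic based on the symbol binding and on the type and flags of the section the symbol lives in; many cases are left as '?'. `sym_letter` is a hypothetical name, and `sec` is assumed to be the already validated section header for st_shndx, or NULL when st_shndx is a reserved index.

```c
#include <elf.h>
#include <stddef.h>

/* Heuristic type letter for one symbol. Incomplete on purpose. */
char sym_letter(const Elf64_Sym *sym, const Elf64_Shdr *sec)
{
    unsigned char bind = ELF64_ST_BIND(sym->st_info);
    unsigned char type = ELF64_ST_TYPE(sym->st_info);
    char c = '?';

    /* Weak symbols: V/v for objects, W/w otherwise; lowercase if undefined. */
    if (bind == STB_WEAK) {
        if (type == STT_OBJECT)
            return sym->st_shndx == SHN_UNDEF ? 'v' : 'V';
        return sym->st_shndx == SHN_UNDEF ? 'w' : 'W';
    }
    if (sym->st_shndx == SHN_UNDEF)
        return 'U';
    if (sym->st_shndx == SHN_ABS)
        c = 'A';
    else if (sym->st_shndx == SHN_COMMON)
        c = 'C';
    else if (sec == NULL)
        c = '?';
    else if (sec->sh_type == SHT_NOBITS
        && (sec->sh_flags & SHF_ALLOC) && (sec->sh_flags & SHF_WRITE))
        c = 'B'; /* uninitialized data (bss-like) */
    else if (sec->sh_flags & SHF_EXECINSTR)
        c = 'T'; /* code */
    else if ((sec->sh_flags & SHF_ALLOC) && (sec->sh_flags & SHF_WRITE))
        c = 'D'; /* initialized, writable data */
    else if (sec->sh_flags & SHF_ALLOC)
        c = 'R'; /* read-only data */
    /* Local symbols are shown in lowercase. */
    if (bind == STB_LOCAL && c != '?')
        c += 'a' - 'A';
    return c;
}
```

Comparing the output of such a function against the real nm on custom-built binaries is exactly the trial and error mentioned above: each mismatch points to a case this sketch does not handle.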