Do Microfinance Institutions Prioritise Need? Evidence from Loan Allocation and Repeat Borrowing Patterns
Replication files & code for an ECO225 course project.
This repository contains the paper PDF, conference presentation PDF, code, and supporting files for a regression and machine learning analysis of microloans from Kiva.org. The project classifies loan descriptions as low-need or high-need and uses those labels to investigate whether microfinance institutions allocate loan capital in proportion to borrower financial need, and whether repeat borrowing grows or shrinks.
Main File:
ECO225_Code.ipynb: A Jupyter Notebook containing:
- Data cleaning and preprocessing
- Visualisations of key trends
- Feature engineering
- Machine learning pipeline for loan classification
- Regression analysis of loans and disbursement amounts
To Replicate:
- Download the required datasets (see below)
- Place them in the appropriate folder
- Open the notebook and run all cells sequentially
External (must be downloaded):
Kiva Kaggle Dataset
Required files:
- loans.csv: Primary loan-level data
- loan_coords.csv: Geographical coordinates for each loan
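Before running the notebook, a quick check along the following lines can confirm that both Kaggle files are in place. This is only a minimal sketch: the `data` folder name and the helper `missing_files` are assumptions made for illustration and are not part of this repository, so point the path at whichever folder the notebook actually reads from.

```python
from pathlib import Path

# File names taken from the "Required files" list above.
REQUIRED_FILES = ["loans.csv", "loan_coords.csv"]


def missing_files(data_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in data_dir.

    Hypothetical helper for illustration; it is not part of the notebook.
    """
    folder = Path(data_dir)
    return [name for name in required if not (folder / name).is_file()]


if __name__ == "__main__":
    # "data" is a placeholder folder name (assumption); change it as needed.
    missing = missing_files("data")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required Kaggle files found.")
```

An empty list from `missing_files` means both downloads were found in the given folder.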
Included in this repository:
- gdp_data.csv: Country-level GDP
- mpi_data.csv: Multidimensional Poverty Index
- llm_subsample.csv: Subsample of 2,000 LLM-labeled loans used in classification tasks
- ECO225_Paper.pdf: Paper PDF for the project
- Tech-Econference_Presentation.pdf: Presentation PDF for the project at the 5th Annual Econ-Tech Conference (2025), University of Toronto
Requirements:
- Python 3.8+
- Jupyter Notebook
- Core packages used: pandas, numpy, sklearn, matplotlib, seaborn
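To confirm the environment before opening the notebook, a small check like the one below may help. It is a sketch using only the standard library; the helper names `missing_packages` and `python_ok` are hypothetical and do not come from the project code.

```python
import sys
from importlib.util import find_spec

# Import names taken from the core package list above.
CORE_PACKAGES = ["pandas", "numpy", "sklearn", "matplotlib", "seaborn"]


def missing_packages(names=CORE_PACKAGES):
    """Return the import names that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


def python_ok(minimum=(3, 8)):
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.8 or newer is required.")
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All core packages are importable.")
```

Note that `sklearn` is the import name; the package is installed from PyPI as `scikit-learn`.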
Due to GitHub's file size limits, only essential supporting files are included. For full replication, refer to the Kaggle dataset linked above.