Data science & econometrics project exploring whether micro-finance institutions prioritise borrower need, and whether repeated borrowing promotes financial inclusion. We use LLM-labelled loan narratives and ML to prepare over 1 million Kiva loans for a fixed effects regression analysis.

Jakub-Riha/ECO225

Do Microfinance Institutions Prioritise Need? Evidence from Loan Allocation and Repeat Borrowing Patterns

Replication files & code for an ECO225 course project.


📄 Project Description

This repository contains the paper PDF, the conference presentation PDF, code, and supporting files for a regression and machine learning analysis of microloans from Kiva.org. Loan descriptions are classified as low-need or high-need, and the project uses these labels to investigate whether microfinance institutions allocate loan capital in proportion to borrower financial need, and whether repeated borrowing promotes financial inclusion.
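The repository description mentions a fixed effects regression. As a rough illustration of that idea only, the sketch below estimates a slope with group fixed effects by demeaning within groups (the "within" estimator). The data are synthetic, and the names `within_estimator`, `country`, `high_need`, and `loan_amount` are assumptions made for this example; the notebook's actual specification may differ.

```python
import numpy as np
import pandas as pd


def within_estimator(df, y, x, group):
    """Slope of y on x with group fixed effects, via within-group demeaning."""
    y_dm = df[y] - df.groupby(group)[y].transform("mean")
    x_dm = df[x] - df.groupby(group)[x].transform("mean")
    # After demeaning, the single-regressor OLS slope needs no intercept.
    return float((x_dm * y_dm).sum() / (x_dm ** 2).sum())


# Synthetic data, NOT Kiva data: six hypothetical countries with very
# different baseline loan sizes, plus a true high-need premium of 50.
rng = np.random.default_rng(0)
n = 600
country = rng.integers(0, 6, size=n)
high_need = rng.integers(0, 2, size=n)
baseline = np.array([100.0, 300.0, 500.0, 700.0, 900.0, 1100.0])[country]
loan_amount = baseline + 50.0 * high_need + rng.normal(0, 5, size=n)

toy = pd.DataFrame(
    {"country": country, "high_need": high_need, "loan_amount": loan_amount}
)
beta = within_estimator(toy, "loan_amount", "high_need", "country")
print(f"Estimated high-need premium: {beta:.2f}")
```

Demeaning removes anything constant within a group (here, each country's baseline), so the estimate reflects only within-country differences between high-need and low-need loans.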


💻 Code & Analysis

Main File:

  • ECO225_Code.ipynb – A Jupyter Notebook containing:
    • Data cleaning and preprocessing
    • Visualisations of key trends
    • Feature engineering
    • Machine learning pipeline for loan classification
    • Regression analysis of loans and disbursement amounts
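To give a feel for the loan classification step, here is a minimal sketch of a text classifier that learns from a small labelled sample and then labels new descriptions. This README does not state which model the notebook uses, so the TF-IDF plus logistic regression pipeline, the label names, and the toy descriptions are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled descriptions standing in for an LLM-labelled subsample.
# These texts and labels are invented for the example.
train_texts = [
    "needs money to buy food and medicine for her children",
    "loan to pay for emergency medical treatment",
    "to repair home damaged by flood and buy food",
    "buy medicine and pay school fees for children",
    "to expand her retail shop with new inventory",
    "purchase additional stock for his clothing store",
    "to buy a new sewing machine to grow business",
    "expand grocery store and add new inventory",
]
train_labels = ["high_need"] * 4 + ["low_need"] * 4

# Assumed model choice: bag-of-words TF-IDF features fed to a linear classifier.
classifier = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(max_iter=1000),
)
classifier.fit(train_texts, train_labels)

# Label descriptions that were not part of the labelled sample.
new_texts = ["pay for medicine and food", "new inventory for the store"]
preds = classifier.predict(new_texts)
print(list(zip(new_texts, preds)))
```

In a setup like this, the labelled subsample plays the role of training data, and the fitted pipeline is then applied to the remaining loan descriptions.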

To Replicate:

  1. Download the required datasets (see below)
  2. Place them in the appropriate folder
  3. Open the notebook and run all cells sequentially
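Before running the notebook, it can help to confirm that the downloaded files are where you expect them. The sketch below is one way to do that. The `data` folder name is hypothetical, since this README does not specify the folder, and `missing_files` is a helper defined only for this example.

```python
from pathlib import Path

# External files this README lists as required downloads.
REQUIRED_FILES = ["loans.csv", "loan_coords.csv"]


def missing_files(data_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in required if not (data_dir / name).is_file()]


# "data" is a placeholder; point this at whichever folder the notebook reads from.
missing = missing_files("data")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All required files found.")
```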

๐Ÿ—‚๏ธ Datasets

External (must be downloaded):
๐Ÿ“ฅ Kiva Kaggle Dataset

Required files:

  • loans.csv – Primary loan-level data
  • loan_coords.csv – Geographical coordinates for each loan
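The two external files are presumably combined on a loan identifier. The sketch below shows one way such a join could look in pandas, using small in-memory frames instead of the real files. The column names `loan_id`, `loan_amount`, `latitude`, and `longitude` are assumptions, so check them against the downloaded CSVs.

```python
import pandas as pd

# Stand-ins for pd.read_csv("loans.csv") and pd.read_csv("loan_coords.csv");
# the values and column names are invented for the example.
loans = pd.DataFrame(
    {"loan_id": [1, 2, 3], "loan_amount": [250.0, 500.0, 125.0]}
)
coords = pd.DataFrame(
    {"loan_id": [1, 3], "latitude": [-1.29, 14.60], "longitude": [36.82, 121.00]}
)

# A left join keeps every loan; validate guards against duplicated loan IDs,
# and indicator adds a "_merge" column showing which rows found coordinates.
merged = loans.merge(
    coords, on="loan_id", how="left", validate="one_to_one", indicator=True
)
n_unmatched = int((merged["_merge"] == "left_only").sum())
print(f"{n_unmatched} of {len(merged)} loans have no coordinates")
```

A left join is used here so that loans without coordinates are kept with missing values, not silently dropped.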

Included in this repository:

  • gdp_data.csv – Country-level GDP
  • mpi_data.csv – Multidimensional Poverty Index
  • llm_subsample.csv – Subsample of 2,000 LLM-labelled loans used in classification tasks
  • ECO225_Paper.pdf – Paper PDF for the project
  • Tech-Econference_Presentation.pdf – Presentation PDF for the project at the 5th Annual Econ-Tech Conference (2025), University of Toronto

โš™๏ธ Dependencies

  • Python 3.8+
  • Jupyter Notebook
  • Core packages used: pandas, numpy, scikit-learn (imported as sklearn), matplotlib, seaborn

📌 Notes

Due to GitHub’s file size limits, only essential supporting files are included. For full replication, refer to the Kaggle dataset linked above.
