The Netflix Data Analysis project aims to explore, clean, and visualize Netflix's dataset to uncover key insights about its movies and TV shows. The analysis helps in understanding content trends, popular genres, distribution across countries, and the evolution of Netflix's library over time.
Through this project, we perform data cleaning, exploratory data analysis (EDA), and visualization using Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn.
Analyze Netflix's catalog of movies and TV shows.
Identify trends in genres, ratings, and release years.
Compare the distribution of movies vs. TV shows.
Visualize country-wise and time-based content growth.
Gain hands-on experience in data preprocessing and visualization.
Source: Netflix Dataset on Kaggle
File: netflix_titles.csv
Key Columns:
show_id
type (Movie/TV Show)
title
director
cast
country
date_added
release_year
rating
duration
listed_in (Genre)
description
Programming Language: Python
Libraries:
Pandas - Data manipulation and cleaning
NumPy - Numerical computation
Matplotlib & Seaborn - Data visualization
Jupyter Notebook - Interactive analysis environment
Importing Libraries & Dataset
Load the Netflix dataset into a pandas DataFrame.
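As a minimal sketch of this step, the snippet below wraps pandas.read_csv in a small helper. The helper name load_titles and the two in-memory rows are illustrative assumptions (the rows are made-up placeholders, not records from the dataset); they only let the example run without the CSV present.

```python
import io

import pandas as pd


def load_titles(source):
    """Read a Netflix titles CSV from a file path or file-like object."""
    return pd.read_csv(source)


# Made-up placeholder rows so the sketch runs without netflix_titles.csv.
sample_csv = io.StringIO(
    "show_id,type,title,release_year\n"
    "s1,Movie,Example Film,2019\n"
    "s2,TV Show,Example Series,2021\n"
)

df = load_titles(sample_csv)
print(df.shape)   # (rows, columns)
print(df.dtypes)  # column types as inferred by pandas

# With the real file in the working directory:
# df = load_titles("netflix_titles.csv")
```

Checking the shape and dtypes right after loading is a quick way to confirm the file was parsed as expected before any cleaning starts.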
Data Cleaning
Handle missing values, remove duplicates, and standardize formats.
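One possible shape for the cleaning step is sketched below. The function name clean_titles, the "Unknown" placeholder for missing text fields, and the sample rows are assumptions made for illustration; the notebook may handle missing values differently.

```python
import pandas as pd


def clean_titles(df):
    """Return a cleaned copy: duplicates dropped, gaps filled, dates parsed."""
    out = df.drop_duplicates().copy()

    # Fill missing text fields with a placeholder label (an assumed choice).
    for col in ("director", "cast", "country"):
        if col in out.columns:
            out[col] = out[col].fillna("Unknown")

    # Turn strings like "September 25, 2021" into timestamps.
    # Stray whitespace is stripped first; unparseable values become NaT.
    if "date_added" in out.columns:
        out["date_added"] = pd.to_datetime(
            out["date_added"].str.strip(), errors="coerce"
        )

    return out.reset_index(drop=True)


# Made-up rows: one exact duplicate, one missing director, one padded date.
raw = pd.DataFrame(
    {
        "title": ["A", "A", "B"],
        "director": ["Jane Doe", "Jane Doe", None],
        "date_added": [
            "September 25, 2021",
            "September 25, 2021",
            " January 15, 2020",
        ],
    }
)

clean = clean_titles(raw)
print(clean)
```

Working on a copy keeps the raw DataFrame intact, which makes it easy to compare row counts before and after cleaning.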
Exploratory Data Analysis (EDA)
Summary statistics, unique values, and type distributions.
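The checks named above could be gathered in one helper, roughly as follows. The function summarize and the sample rows are hypothetical; only standard pandas calls (isna, nunique, value_counts) are used.

```python
import pandas as pd


def summarize(df):
    """Collect basic EDA figures for a titles DataFrame."""
    return {
        "rows": len(df),
        "missing_per_column": df.isna().sum().to_dict(),
        "unique_per_column": df.nunique().to_dict(),  # NaN is not counted
        "type_counts": df["type"].value_counts().to_dict(),
    }


# Made-up rows for illustration only.
sample = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "TV Show"],
        "rating": ["TV-MA", None, "PG"],
        "release_year": [2019, 2020, 2021],
    }
)

summary = summarize(sample)
for key, value in summary.items():
    print(key, value)
```

For numeric columns such as release_year, df.describe() gives the usual summary statistics alongside these counts.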
Visualization
Graphical insights using bar plots, pie charts, heatmaps, and count plots.
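A count plot of movies versus TV shows might look like the sketch below. It uses plain Matplotlib to keep dependencies small; Seaborn's countplot would be an alternative. The function name plot_type_counts and the sample rows are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_type_counts(df, ax=None):
    """Draw a bar chart of how many titles fall under each type."""
    if ax is None:
        _, ax = plt.subplots(figsize=(5, 4))

    counts = df["type"].value_counts()  # sorted from most to least common
    ax.bar(list(counts.index), counts.values)
    ax.set_xlabel("Type")
    ax.set_ylabel("Number of titles")
    ax.set_title("Movies vs. TV Shows")
    return ax


# Made-up rows for illustration only.
sample = pd.DataFrame({"type": ["Movie", "Movie", "TV Show"]})

ax = plot_type_counts(sample)
# In a notebook the figure renders inline; in a script, call plt.show().
```

Returning the Axes object lets the caller adjust labels or save the figure with ax.figure.savefig(...) afterwards.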
Insights & Conclusions
Summarize findings and interpret key trends.
Movies make up the majority of Netflix's content.
The USA and India contribute the most titles to Netflix's catalog.
Drama and Comedy are the most common genres.
A significant increase in Netflix content can be seen after 2015.
The most common content rating is TV-MA (for mature audiences).
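Findings about genres and countries depend on how multi-valued cells are counted, since a single title can list several genres or countries separated by commas. The sketch below shows one way to count each value separately; the helper count_values and the sample rows are assumptions, and the notebook may count these columns differently.

```python
import pandas as pd


def count_values(df, column, sep=","):
    """Count individual entries in a column holding separator-joined lists."""
    values = (
        df[column]
        .dropna()
        .str.split(sep)
        .explode()      # one row per individual entry
        .str.strip()    # remove the space left after each separator
    )
    return values[values != ""].value_counts()


# Made-up rows for illustration only.
sample = pd.DataFrame(
    {
        "listed_in": [
            "Dramas, International Movies",
            "Comedies, Dramas",
            None,
        ]
    }
)

genre_counts = count_values(sample, "listed_in")
print(genre_counts)
```

The same helper applies to the country column, for example count_values(df, "country").head(10) for the ten most frequent countries.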
The analysis provides a clear picture of Netflix's content strategy and growth. It highlights how Netflix has expanded globally, diversified its genres, and increased its focus on original programming in recent years.
Clone this repository:
git clone https://github.com/your-username/netflix-data-analysis.git
Navigate to the project folder:
cd netflix-data-analysis
Install required libraries:
pip install pandas numpy matplotlib seaborn
Open the Jupyter Notebook:
jupyter notebook netflix_analysis.ipynb
Anand Singh, B.Tech - Cloud Computing & Machine Learning. Email: anandsi1726j@gmail.com