Los Angeles County Crime Rate Prediction

Context

Los Angeles has a crime rate of 2,759 per 100,000 people, which is higher than the national average of 2,580 per 100,000 people. I developed a machine learning model that integrates demographics and historical crime reports to predict the likelihood of specific crimes occurring in specific areas of Los Angeles. A data-driven crime prediction model can help policy makers and communities implement targeted security measures and make informed decisions.

Project Approach

Data Acquisition

The data was acquired from two sources: https://data.lacity.org/Public-Safety/Crime-Data-from-2010-to-2019/63jg-8b9z/explore and https://data.lacity.org/Public-Safety/Crime-Data-from-2020-to-Present/2nrs-mtv8
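As an illustration of how the two exports might be stacked into one dataset once they are downloaded, here is a minimal sketch using pandas. The small in-memory frames stand in for the real exports. The column names (DR_NO, AREA NAME, Vict Age), the de-duplication key, and the median fill are assumptions made for this example and are not necessarily what Step1_DataWrangling.ipynb does.

```python
import pandas as pd

# Tiny stand-ins for the two downloaded exports.
# Column names and values are assumptions for illustration only.
crimes_2010_2019 = pd.DataFrame({
    "DR_NO": [1, 2, 3],
    "AREA NAME": ["Central", "Hollywood", None],
    "Vict Age": [34, None, 27],
})
crimes_2020_present = pd.DataFrame({
    "DR_NO": [3, 4],
    "AREA NAME": [None, "Pacific"],
    "Vict Age": [27, 51],
})


def combine_and_clean(older: pd.DataFrame, newer: pd.DataFrame) -> pd.DataFrame:
    """Stack two exports, drop duplicate reports, and handle missing values."""
    # Stack the rows of both exports into a single frame.
    combined = pd.concat([older, newer], ignore_index=True)
    # Assumption: DR_NO identifies a report, so repeated values are duplicates.
    combined = combined.drop_duplicates(subset="DR_NO", keep="first")
    # Assumption: a row without an area cannot be used for area-level analysis.
    combined = combined.dropna(subset=["AREA NAME"]).copy()
    # Assumption: missing victim ages are filled with the median age.
    combined["Vict Age"] = combined["Vict Age"].fillna(combined["Vict Age"].median())
    return combined.reset_index(drop=True)


crimes = combine_and_clean(crimes_2010_2019, crimes_2020_present)
print(crimes)
```

With the real data, the two frames would come from pd.read_csv on the downloaded files rather than from hand-written dictionaries.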

Solution Overview

Step1_DataWrangling.ipynb: Combined the two data sources above, cleaned the data, and handled missing values.
Step2_Exploratory_data_analysis.ipynb: Statistically explored the dataset to understand its characteristics, patterns, and potential issues, and created features that capture the characteristics of crime areas.
Step3_Preprocessing_and_training.ipynb: Prepared the data for model training. Engineered both categorical and numerical features using StandardScaler and one-hot encoding, and generated dummy-variable datasets.
Step4_Modeling.ipynb: Trained several regression models to predict crime rate trends from influencing factors such as crime type, area, and victim characteristics. Linear Regression, Random Forest Regressor, Gradient Boosting Regressor, and Decision Tree models were explored, and each model's performance was assessed using MAE, MSE, and RMSE. The Decision Tree was chosen as the best model. The model file is saved in the models directory.
A project report was prepared.
A slide presentation was prepared.
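The preprocessing and modeling steps above can be sketched with scikit-learn as follows. This is a minimal example on synthetic data, not the notebooks' actual code. The column names (area, crime_type, victim_age, crime_count), the synthetic target, the train/test split, and the default hyperparameters are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data; every column and value here is hypothetical.
rng = np.random.default_rng(0)
n = 300
data = pd.DataFrame({
    "area": rng.choice(["Central", "Hollywood", "Pacific"], size=n),
    "crime_type": rng.choice(["burglary", "theft", "assault"], size=n),
    "victim_age": rng.integers(18, 80, size=n),
})
area_effect = data["area"].map({"Central": 30.0, "Hollywood": 20.0, "Pacific": 10.0})
data["crime_count"] = area_effect + 0.1 * data["victim_age"] + rng.normal(0, 2, size=n)

X = data[["area", "crime_type", "victim_age"]]
y = data["crime_count"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)


def make_preprocessor() -> ColumnTransformer:
    """Scale numeric columns and one-hot encode categorical columns."""
    return ColumnTransformer([
        ("num", StandardScaler(), ["victim_age"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["area", "crime_type"]),
    ])


# The four model families named in Step 4, with default settings (an assumption).
candidates = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}

results = {}
for name, model in candidates.items():
    # Fit preprocessing on the training split only, then score on the test split.
    pipeline = Pipeline([("preprocess", make_preprocessor()), ("model", model)])
    pipeline.fit(X_train, y_train)
    predictions = pipeline.predict(X_test)
    mse = mean_squared_error(y_test, predictions)
    results[name] = {
        "MAE": mean_absolute_error(y_test, predictions),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),  # RMSE is the square root of MSE
    }

# Pick the candidate with the lowest test RMSE.
best_name = min(results, key=lambda model_name: results[model_name]["RMSE"])
print(pd.DataFrame(results).T)
print("Lowest RMSE:", best_name)
```

Which model scores best depends entirely on the data, so the ranking on this synthetic set says nothing about the ranking reported for the project. Wrapping the preprocessing and the model in one Pipeline keeps the scaler and encoder fitted on the training split only, which avoids leaking test data into the transformation.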

Footer Note

Not all files could be uploaded to GitHub because of size limitations. The full data files processed throughout the project can be found at https://drive.google.com/drive/folders/1Gyf_0yEHZs2v8h3dXyWcxj6AWsMRsXUs?usp=sharing

thikyi/DataScienceCapstoneTwo
