https://mediaspace.illinois.edu/media/t/1_hrejpyg0
The project is a movie recommendation system built under the free-topic option. It recommends movies to each user in the dataset we chose, which consists of movie ratings from 610 users. On the website we built, each user is recommended at most 20 different movies, ranked in descending order of predicted favorability. The website takes a user number, a movie genre, and the number of recommendations wanted. Favorability values are predicted through collaborative filtering, with the model parameters trained to minimize the Root Mean Square Error (RMSE), a measure of the distance between the true and the predicted values.
For this project, we built a PHP website to present our recommendation system. We chose to use a MySQL database server instead of a local database, so the connection to the MySQL server is made in the dbh.php file.
The page served by index.php takes three inputs: Username (a user number in [1, 610]), Movie Genre (capitalized, e.g., Comedy, Drama, Romance, Horror, Sci-Fi), and Number of Recommended Movies (in [1, 20]).
After entering a set of inputs (e.g., 10, Comedy, 5), the user gets the matching recommendations, in this case five comedy movies for user #10.
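To illustrate how the three inputs narrow down a result, here is a minimal Python sketch of the lookup. It is not the site's PHP/MySQL code: the function name `recommend`, the two dictionaries, and the movie IDs are hypothetical toy data, and it assumes each user's recommendations are already stored in ranked order.

```python
def recommend(user_id, genre, count, ranked_by_user, genres_by_movie):
    """Return up to `count` movie IDs of `genre` from the user's ranked list."""
    if user_id not in ranked_by_user:
        raise ValueError(f"unknown user: {user_id}")
    if not 1 <= count <= 20:
        raise ValueError("count must be between 1 and 20")
    # Keep the stored ranking order and drop movies outside the requested genre.
    matches = [
        movie_id
        for movie_id in ranked_by_user[user_id]
        if genre in genres_by_movie.get(movie_id, ())
    ]
    return matches[:count]


# Hypothetical toy data: user 10's ranked movie IDs and each movie's genres.
ranked_by_user = {10: [301, 302, 303, 304, 305]}
genres_by_movie = {
    301: {"Comedy", "Romance"},
    302: {"Drama"},
    303: {"Comedy"},
    304: {"Horror"},
    305: {"Comedy", "Sci-Fi"},
}

print(recommend(10, "Comedy", 2, ranked_by_user, genres_by_movie))  # [301, 303]
```

Because the genre filter is applied to a fixed ranked list, a request can return fewer movies than asked for when not enough of the stored recommendations match the genre.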
The project implements collaborative filtering in Python, using Jupyter Notebook, to produce the recommendations. The implementation of the recommender can be divided into the following steps:
I. Import the CSV dataset of movie ratings from 610 users and transform it into a matrix with 9724 rows, each representing a movie, and 610 columns, each representing a user. Since no user rates every movie, most values in this matrix are N/A. We calculated the sparsity of the matrix (the percentage of N/A values), which is 98.32%, and replaced the N/A values with zero.
II. Assuming 10 features, we initialize the movie parameters and user parameters using the numpy.random.randint function and get the indices of the non-zero values. We also define a function named rmse that calculates the Root Mean Square Error (RMSE) between the true and the predicted ratings.
III. Update the parameters using the Gradient Descent Algorithm to minimize the RMSE. This part of the code updates the parameters for 100 epochs while recording each RMSE value in the list named rmse_ls. Plotting the RMSE against the number of epochs confirms that training decreases the RMSE.
IV. Make predictions using the trained parameters and reshape the result so that each user corresponds to a list of movieIds of recommended movies (the top 20 with the best predicted ratings). Then reshape the dataset into a matrix of 610 rows (users) and 21 columns (20 movieIds of the recommended movies for each user) as the final output.
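Steps I through IV can be sketched on a toy matrix as follows. This is a minimal illustration under stated assumptions, not the notebook's code: only the names `rmse` and `rmse_ls` come from the description above, while `sparsity`, `train`, `top_n_table`, the learning rate, the float initialization (the notebook uses numpy.random.randint), and the layout of the output table (user number in the first column) are assumptions made for the example.

```python
import numpy as np


def sparsity(ratings):
    """Fraction of entries with no rating (stored as 0 after filling N/A)."""
    return float(np.mean(ratings == 0))


def rmse(ratings, movie_params, user_params, rows, cols):
    """RMSE between true and predicted ratings, over rated entries only."""
    predicted = movie_params @ user_params.T
    errors = ratings[rows, cols] - predicted[rows, cols]
    return float(np.sqrt(np.mean(errors ** 2)))


def train(ratings, n_features=10, epochs=100, lr=0.01, seed=0):
    """Factorize a movies-by-users matrix with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n_movies, n_users = ratings.shape
    # Assumption: uniform floats in [0, 1) instead of numpy.random.randint.
    movie_params = rng.random((n_movies, n_features))
    user_params = rng.random((n_users, n_features))
    rows, cols = np.nonzero(ratings)  # indices of the rated entries
    rated = ratings != 0
    rmse_ls = []
    for _ in range(epochs):
        # Prediction error on rated entries; unrated entries contribute zero.
        error = np.where(rated, movie_params @ user_params.T - ratings, 0.0)
        # Compute both gradients before updating either parameter matrix.
        movie_grad = error @ user_params
        user_grad = error.T @ movie_params
        movie_params -= lr * movie_grad
        user_params -= lr * user_grad
        rmse_ls.append(rmse(ratings, movie_params, user_params, rows, cols))
    return movie_params, user_params, rmse_ls


def top_n_table(movie_params, user_params, movie_ids, top_n=20):
    """One row per user: user number, then movieIds by descending prediction."""
    predicted = movie_params @ user_params.T  # movies x users
    n_users = predicted.shape[1]
    top_n = min(top_n, predicted.shape[0])
    order = np.argsort(-predicted, axis=0)[:top_n, :]  # best rows per column
    table = np.empty((n_users, top_n + 1), dtype=int)
    table[:, 0] = np.arange(1, n_users + 1)  # assumed user-number column
    table[:, 1:] = movie_ids[order].T
    return table


# Toy data: 6 movies (rows) x 4 users (columns), 0 means "not rated".
ratings = np.array([
    [5.0, 0.0, 4.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 5.0, 0.0],
    [0.0, 2.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 4.0],
    [0.0, 5.0, 3.0, 0.0],
])
movie_ids = np.array([11, 12, 13, 14, 15, 16])  # hypothetical movieIds

movie_params, user_params, rmse_ls = train(ratings, n_features=3)
table = top_n_table(movie_params, user_params, movie_ids, top_n=3)
print(f"sparsity: {sparsity(ratings):.2%}")
print(f"RMSE after first and last epoch: {rmse_ls[0]:.3f}, {rmse_ls[-1]:.3f}")
print(table)
```

The sketch ranks every movie for every user, including movies the user has already rated; whether the notebook filters those out is not stated above, so no filtering is applied here.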
Clone project from GitHub
git clone https://github.com/jialen2/CourseProject.git
cd CourseProject
Run php file in terminal
php -S 127.0.0.1:8000
Open browser and enter
localhost:8000

- Initial Idea: Li Ju
- Project Proposal & Progress Report: Jiale Ning, Li Ju, Jiawei Pei
- Frontend Interface & Server Connection: Jiale Ning
- Researching on Datasets & Recommender System: Li Ju, Jiawei Pei
- Documentation: Jiawei Pei
- Presentation: Li Ju, Jiawei Pei, Jiale Ning
Dataset used: https://files.grouplens.org/datasets/movielens/ml-latest-small.zip









