This project applies a machine learning approach to early detection of failing projects, using 289 course projects, each based on an Open Source System project, done by graduate students in a 500-level computer science course over the past five years.

RickKwok/Working-Pattern-Machine-Learning

Early Detection of Students' Failing Projects

### Overview

This project applies a machine learning approach to early detection of failing projects, using 289 course projects, each based on an Open Source System project, done by graduate students in a 500-level computer science course over the past five years.

### Files and Data Source

The script plot_commits_each_pr.py contains helper methods. Some interact with the GitHub API to fetch the working patterns and convert them into arrays (here I took the first 20 days of the whole 35-day period). Others use Dynamic Time Warping to calculate similarities between those arrays, which represent time series data.
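The two steps described above can be sketched roughly as follows. This is a minimal illustration and not the code in plot_commits_each_pr.py: the function names `daily_commit_counts` and `dtw_distance` are hypothetical, the GitHub API call is omitted (commit dates are assumed to be already fetched), and the absolute difference is assumed as the local cost for Dynamic Time Warping.

```python
from datetime import date


def daily_commit_counts(commit_dates, start, n_days=20):
    """Count commits per day for the first n_days starting at `start`.

    Commits outside the window are ignored. (Hypothetical helper.)
    """
    counts = [0] * n_days
    for d in commit_dates:
        offset = (d - start).days
        if 0 <= offset < n_days:
            counts[offset] += 1
    return counts


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two sequences.

    Uses |x - y| as the local cost (an assumption) and runs in
    O(len(a) * len(b)) time. A smaller value means more similar series.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    start = date(2015, 1, 1)  # made-up dates, for illustration only
    early = daily_commit_counts([date(2015, 1, 1), date(2015, 1, 2)], start)
    late = daily_commit_counts([date(2015, 1, 19), date(2015, 1, 20)], start)
    print(dtw_distance(early, late))
```

Because the warping path may stretch one series against the other, two teams with the same working pattern shifted by a day still come out as close, which a plain element-wise difference would not capture.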

The file source_data.py contains the preprocessing of test_data; the big vector I read in comes from an Excel sheet. It omits some irrelevant attributes and merges the 3 decisions on projects (fully merged, partially merged, rejected) into 2 (success, failure). It also converts each GitHub PR link into a time series cluster label (here I predefined 3 types of time series).
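A minimal sketch of the attribute filtering and label merging described above, operating on plain dictionaries rather than the Excel sheet. The column names, the helper `preprocess_rows`, and above all the choice of which side "partially merged" falls on are assumptions made for illustration; the text does not state how the three decisions map onto the two outcomes.

```python
# Assumption: "partially merged" is counted as a success here purely for
# illustration. The README does not say which outcome it maps to.
DECISION_TO_OUTCOME = {
    "fully merged": "success",
    "partially merged": "success",
    "rejected": "failure",
}


def preprocess_rows(rows, drop_columns, decision_column="decision"):
    """Drop irrelevant attributes and collapse 3 decisions into 2 outcomes.

    `rows` is a list of dicts, one per project. `drop_columns` must not
    contain `decision_column`. All names are hypothetical.
    """
    cleaned = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in drop_columns}
        decision = kept.pop(decision_column).strip().lower()
        kept["outcome"] = DECISION_TO_OUTCOME[decision]
        cleaned.append(kept)
    return cleaned


if __name__ == "__main__":
    sample = [
        {"team": "A", "decision": "Fully merged", "pr_cluster": 0},
        {"team": "B", "decision": "Rejected", "pr_cluster": 2},
    ]
    for row in preprocess_rows(sample, drop_columns={"team"}):
        print(row)
```

Keeping the mapping in one dictionary makes it easy to switch back to 3 labels later by changing only the values.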

The file multinominol_regression.py contains interfaces to extract the feature parts and label parts from both the training set (PR_vectors.csv) and the testing set (test_data.csv). It trains a multinomial logistic regression model and then tests it on last semester's projects.
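For readers unfamiliar with the model, here is a from-scratch sketch of multinomial (softmax) logistic regression trained with batch gradient descent. It is not the code in multinominol_regression.py, and the README does not say which library that script uses. The function names, the learning rate, the epoch count, and the toy one-hot cluster features are all assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, x):
    """Linear score per class; the last weight of each row is the bias."""
    xb = list(x) + [1.0]
    return [sum(w * v for w, v in zip(Wk, xb)) for Wk in W]


def train_softmax_regression(X, y, n_classes, lr=0.1, epochs=500):
    """Fit one weight vector per class by minimising cross-entropy."""
    n_features = len(X[0])
    W = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            xb = list(xi) + [1.0]
            probs = softmax(class_scores(W, xi))
            for k in range(n_classes):
                # Gradient of cross-entropy w.r.t. the class-k score.
                err = probs[k] - (1.0 if k == yi else 0.0)
                for j, v in enumerate(xb):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_features + 1):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W


def predict(W, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(W, x)
    return max(range(len(W)), key=lambda k: scores[k])


if __name__ == "__main__":
    # Toy data: one-hot time series cluster label -> outcome
    # (1 = success, 0 = failure). Values are invented for illustration.
    X = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
    y = [1, 1, 0, 0, 1, 1]
    W = train_softmax_regression(X, y, n_classes=2)
    print([predict(W, x) for x in X])
```

Passing `n_classes=3` would keep the three original decisions as separate labels instead of the merged success and failure pair.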

Include projects from Fall 2014 onward, or from Spring 2015. Shuffle the training set. Use 3 labels. Raise a discussion about running the regression on single features and on several features combined.
