Instructions on How to Run Locally:
-
Install all needed packages:
re (regular expressions; part of the Python standard library, so no separate installation is needed)
NLTK (for example, via pip install nltk)
-
Download the code from the GitHub repository: https://github.com/dani-richmond/CourseProject
-
Extract the Transcripts.zip file (it contains all of the transcripts)
-
In ‘main_program.py’ (in the ‘Final Scripts’ folder), set the ‘transcript_dir’ and ‘textbook_dir’ variables to the paths of these two directories on your local machine (both should be inside the ‘CourseProject’ folder)
-
Edit the ‘unigram_weight’, ‘bigram_weight’, and ‘threshold’ values in ‘main_program.py’ as desired
-
Run the main program (‘main_program.py’)
-
Review the potential typos in the results.txt file that the program outputs
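
Before running the main program, it can help to confirm that the setup steps above are complete. The following is a minimal sketch and not part of the repository: the helper names (missing_packages, missing_dirs) and the two example paths are hypothetical placeholders, so replace the paths with the values you set for ‘transcript_dir’ and ‘textbook_dir’.

```python
import importlib.util
from pathlib import Path


def missing_packages(names=("re", "nltk")):
    """Return the names of packages that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_dirs(paths):
    """Return the given paths that are not existing directories."""
    return [str(p) for p in paths if not Path(p).is_dir()]


if __name__ == "__main__":
    # Hypothetical locations; use the same values you put in main_program.py.
    transcript_dir = "CourseProject/Transcripts"
    textbook_dir = "CourseProject/Textbook"

    packages = missing_packages()
    directories = missing_dirs([transcript_dir, textbook_dir])

    if packages:
        print("Missing packages:", ", ".join(packages))
    if directories:
        print("Missing directories:", ", ".join(directories))
    if not packages and not directories:
        print("Setup looks complete; main_program.py should be able to run.")
```

If the script reports a missing package or directory, revisit the corresponding step above before running the main program.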
Link to video documentation: https://drive.google.com/file/d/1-0E1blOFC5lPSQQZpVIlBQVXxFOCMdmk/view?usp=sharing