Ranked Search for Coursera Video Subtitles

Currently, Coursera's video subtitle search supports only exact-match queries. Because subtitles contain transcription inaccuracies and queries often contain typos, approximate string matching (fuzzy search) offers a clear benefit: it helps ensure students do not miss important information.

We originally intended to implement fuzzy search; however, early in the project we shifted to ranked search.

Project Demo

https://uofi.box.com/s/3fhxuug2sym52l4h9h3nsr9gjx9b1y2u

Browser Extension

Requirements

Firefox (Tested on v94.0.1)

Installation Instructions

First, import the add-on:

In Firefox, go to about:debugging
Click "This Firefox"
Click "Load Temporary Add-On"
Navigate to your saved manifest.json file and click on it

Using the Extension

Right-click on any page in Firefox
Click "Fuzzy Search"
On the Fuzzy Search page, enter the query you want to search for

Required JS files

background.js: Adds the "Fuzzy Search" option to the context menu
tabs.js: Runs on the Fuzzy Search page; sends queries to the application logic and receives its responses

Ranked Matching Algorithm

Requirements

pip install nltk

Inputs & Outputs

input: corpus of subtitles, search query
output: matches (video name, timestamp, subtitle snippet)

Implementation

stemming of query and documents (subtitles)
removal of stop words from query
bag of words representation of query
results returned in ranked order based on closeness of match (count of query term matches in document)
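
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function names (naive_stem, words, rank), the tiny stop word set, and the suffix-stripping stemmer are all assumptions, chosen so the example runs with the standard library alone. Since the project depends on nltk, a real stemmer such as the stem method of nltk's PorterStemmer could be passed in through the stem parameter instead.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "or", "for"}


def naive_stem(word):
    """Crude suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def words(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, subtitles, stem=naive_stem):
    """Return (score, index) pairs for matching subtitles, best match first."""
    # Drop stop words from the query, then stem what remains into a bag of words.
    query_bag = Counter(stem(w) for w in words(query) if w not in STOP_WORDS)
    scored = []
    for index, subtitle in enumerate(subtitles):
        # Score = number of stemmed subtitle words that also occur in the query.
        score = sum(1 for w in words(subtitle) if stem(w) in query_bag)
        if score > 0:
            scored.append((score, index))
    # Highest score first; ties keep their original subtitle order.
    return sorted(scored, key=lambda pair: -pair[0])


subtitles = [
    "text categorization methods",
    "searching the text and ranking texts",
    "an unrelated line",
]
print(rank("the text search", subtitles))
```

Subtitles with no matching terms are left out of the result, so a query made up only of stop words returns an empty list.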

Server

Installation

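The server requires the following Python packages:

flask
flask-cors
nltk
numpy

glob is also used, but it is part of the Python standard library and needs no installation.

From the server directory, run: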

python3 server.py

Implementation & Endpoints

/ - GET, check if server is running
/test - GET, takes no params, returns dummy result JSON
/search - POST, takes JSON search query, returns JSON results

query JSON:

{
  "query": [
    "query",
    "terms"
  ]
}
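
As an illustration of the query format, a client could build the /search request along these lines. The helper name build_search_request and the base URL are assumptions: http://localhost:5000 is Flask's default development address, but the address the project's server actually listens on is not stated here.

```python
import json
import urllib.request


def build_search_request(terms, base_url="http://localhost:5000"):
    """Build a POST request for /search carrying {"query": [...]} as JSON."""
    body = json.dumps({"query": list(terms)}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(["query", "terms"])
# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)["results"]
```

Building the request separately from sending it keeps the example runnable without a live server.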

results JSON:

{  
  "results": [  
    {  
      "doc": "even parts of ancients tags or even syntax to the structures",  
      "id": "66",  
      "score": 38,  
      "timestamp": "00:05:08,860 --> 00:05:13,020",  
      "videoname": "10-8-text-categorization-methods"  
    }  
  ]  
}
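
A consumer of these results may want the timestamp as numbers, for example to seek the video to the matching subtitle. The following is a small sketch under the assumption that timestamps always follow the "HH:MM:SS,mmm --> HH:MM:SS,mmm" shape shown above; the helper name timestamp_to_seconds is hypothetical.

```python
import re

# Matches one "HH:MM:SS,mmm" time, as used in the sample result above.
TIME_PATTERN = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


def timestamp_to_seconds(timestamp):
    """Convert a 'start --> end' subtitle timestamp to (start, end) in seconds."""
    seconds = []
    for hours, minutes, secs, millis in TIME_PATTERN.findall(timestamp):
        total = int(hours) * 3600 + int(minutes) * 60 + int(secs) + int(millis) / 1000
        seconds.append(total)
    start, end = seconds  # Raises ValueError unless exactly two times are found.
    return start, end


result = {
    "timestamp": "00:05:08,860 --> 00:05:13,020",
    "videoname": "10-8-text-categorization-methods",
}
start, end = timestamp_to_seconds(result["timestamp"])
print(result["videoname"], start, end)
```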
