Visual Practice Rank

Goal

The goal of this project is to create a visual representation of a supplied ranking function in a form that is comparable to modern search engines with the display of additional data that would be only in the background

Project Presentation

https://mediaspace.illinois.edu/media/t/1_3s2zdgys

Getting Started

Installation

Create a virtual environment using conda:

conda create -n [your-env-name] python=3.6

e.g. conda create -n vpr python=3.6 anaconda
Activate the virtual environment

conda activate [your-env-name]

e.g. conda activate vpr
Download the repository

git clone https://github.com/VisualPracticeRank/CourseProject.git
Install the packages in requirements.txt

cd CourseProject

pip install -r requirements.txt

Running the Django server

While in the CourseProject folder, cd into VPR, list of the files and folders and you will see a file called manage.py. Run the following command:

python3 manage.py makemigrations

python3 manage.py migrate --run-snycdb

python3 manage.py runserver

A browser with the application will pop-up, or you can head to 127.0.0.1:8000

If you are running this on a VM, instead of python3 manage.py runserver, you can run the following command: python3 manage.py runserver 0.0.0.0:8000

Then, head over to [your ip]:8000, e.g. 18.219.133.210:8000

Shutting down the Django server

If you want to shutdown the Django server, you can do ctrl+c in the terminal to shut down the server. You can also deactivate the virtual environment with this command:

conda deactivate

Features

Dataset

You can upload your dataset by selecting Dataset on the homepage and fill in and upload the appropriate files needed:

Name
Description
Data
Qrels
Queries

and select Upload.

Model

In addition to the default models available ('OkapiBM25', 'PivotedLength', 'AbsoluteDiscount', 'JelinekMercer', 'DirichletPrior'), you can upload your own custom models.

You can upload your own model (ranker) by selecting Model on the homepage and fill in the textboxes:

Name
Description
Model

and select Add.

Query

After selecting the dataset and model that you want to use, you can specify a query in the textbox and select Search. You will get the top 10 documents with the highest score in your model, displayed in descending order.

Step Through

This functionality allows you to step through the queries.txt file you specified when uploading the dataset and observe the changes in the ndcg score and other various stats.

After selecting the dataset and model that you want to use, you can select Step Through. You can use the '>>' and '<<' buttons to step through and step out.

How It Works

Overview

This program is implemented using Django (Python, HTML, SQLite3) as the frontend and metapy (and Python) as the backend.

Frontend

The dataset details are stored in the webui_dataset. The dataset, qrels, and queries files are stored in a folder, with a unique name generated by the system, under the dataset folder. The webui_dataset has the following fields: id (primary key), name, description, data, qrels, queries.

When a dataset is being uploaded, the documents in the dataset is loaded into webui_document with the following fields: id (primary key), document_id, body, dataset_id (foreign key to webui_dataset.id).

When a model is being uploaded, the model is loaded into webui_model with the following fields: id (primary key), description, model, name. Before storing the actual model, it will be encoded to base64.

After you can select a dataset, model, and query, you click on Search. Then frontend (in view.py) will call the backend (search_eval.py) and pass the following variables: folder (folder where the dataset is stored), model, query.

For the iteration feature, the frontend will send these information to the backend: folder (path to dataset), model. After receiving the response from the backend, the frontend will display the following information: query (as specified in the queries.txt), NDCG, a table of the following information: the score of the document, size of the document, unique terms in the document, and snippit of the body.

Backend

The backend (search_eval.py) will these variables from the frontend: folder, model, query. It will first change its directory to [folder_to_dataset]/datasets and build an inverted index.

Then, it will determine if the model is one of the defaults ('OkapiBM25', 'PivotedLength', 'AbsoluteDiscount', 'JelinekMercer', 'DirichletPrior'). If it's not, then it will decode the string (using base64) and build the ranker.

Finally, it will run the ranker with the inverted index, query for the top 10 documents and return the top 10 documents, and their score.

For the iteration feature, the backend will utilize the qrels.txt and queries.txt that were uploaded when uploading the dataset. It will return a list of list:

results[0] = list of top k articles
results[1] = list of ndcg
results[2] = list of running avg ndcg
results[3] = list of queries

Reference

Chase Geiglem, 2017. [2-search-and-ir-eval.ipynb] (https://github.com/meta-toolkit/metapy/blob/master/tutorials/2-search-and-ir-eval.ipynb).

Name		Name	Last commit message	Last commit date
Latest commit History 107 Commits
Documentation		Documentation
VPR		VPR
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Visual Practice Rank

Goal

Project Presentation

Getting Started

Installation

Running the Django server

Shutting down the Django server

Features

Dataset

Model

Query

Step Through

How It Works

Overview

Frontend

Backend

Reference

About

Uh oh!

Releases

Packages

Languages

VisualPracticeRank/CourseProject

Folders and files

Latest commit

History

Repository files navigation

Visual Practice Rank

Goal

Project Presentation

Getting Started

Installation

Running the Django server

Shutting down the Django server

Features

Dataset

Model

Query

Step Through

How It Works

Overview

Frontend

Backend

Reference

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages