lecheng/Tag


Tag

This repository contains several tag recommendation projects written in Python 2.

Flipboard

The Flipboard project uses the Flipboard API to fetch the words related to a given word and builds the co-tag file used by the T2Ttable project.

The "hashtag" folder contains the hashtags of our app, split into six files by category.

The "tags" folder contains the co-tags picked by our staff. They are added to the co-tag file as an extension.

The "data" folder contains the layer1 and layer2 tags generated through the Flipboard API, along with the co-tag file.

crawler.py crawls Flipboard data through the Flipboard API.

data.py generates the co-tags from the Flipboard data (using run()) and from the staff-picked data (using get_staff_tags()).
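
As a rough illustration of how staff-picked co-tags could extend the Flipboard co-tags, here is a minimal sketch. It assumes both sources are mappings from a tag to a list of related tags; the real file format and the internals of run() and get_staff_tags() are not described here, so merge_cotags and the sample data are hypothetical.

```python
from collections import defaultdict


def merge_cotags(flipboard_cotags, staff_cotags):
    """Union the related tags of two {tag: [related tags]} mappings.

    Hypothetical helper: the actual co-tag format used by data.py is assumed.
    """
    merged = defaultdict(set)
    for source in (flipboard_cotags, staff_cotags):
        for tag, related in source.items():
            merged[tag].update(related)
    # Sort each list so the result is stable and easy to dump as JSON.
    return {tag: sorted(related) for tag, related in merged.items()}


# Made-up sample data, for illustration only.
flipboard = {"travel": ["beach", "hotel"], "food": ["recipe"]}
staff = {"travel": ["hotel", "flight"], "music": ["concert"]}
cotags = merge_cotags(flipboard, staff)
```

Using sets while merging removes duplicates when a staff-picked co-tag is already present in the Flipboard data.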

Glove

The Glove project uses GloVe to find the words related to a given word. It is not finished yet because it takes too long to run, and it still needs improvement.

test.py is test code for reading files in Python. It tries four different methods of reading a file, which helps in choosing the best one for saving time or space.
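
Which four methods test.py compares is not stated here. The sketch below shows four common ways to read a text file in Python, as an assumption about the kind of comparison involved; the function names are hypothetical. The first two hold the whole file in memory, while the last two are generators that keep only one line or one chunk at a time.

```python
import os
import tempfile


def read_all(path):
    # Method 1: read the whole file into one string, then split it.
    with open(path) as f:
        return f.read().splitlines()


def read_readlines(path):
    # Method 2: readlines() builds a list of all lines at once.
    with open(path) as f:
        return [line.rstrip("\n") for line in f.readlines()]


def read_iter(path):
    # Method 3: iterate over the file object, one line at a time.
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")


def read_chunks(path, chunk_size=4096):
    # Method 4: read fixed-size chunks and reassemble the lines.
    with open(path) as f:
        tail = ""
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            lines = (tail + chunk).split("\n")
            tail = lines.pop()  # possibly incomplete last line
            for line in lines:
                yield line
        if tail:
            yield tail


# Small demo file so that the four methods can be compared.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

results = [
    read_all(demo_path),
    read_readlines(demo_path),
    list(read_iter(demo_path)),
    list(read_chunks(demo_path, chunk_size=4)),
]
os.remove(demo_path)
```

For a large corpus, the generator-based variants are usually the safer choice for memory, while reading everything at once tends to be simplest when the file is small.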

sparksplit.py selects the words tagged NN, NNP, or VB from the Glove corpus and writes them to a file named newvectors.txt, in which every row is a JSON string.

spark.py produces the final result, which is also a file in which every row is a JSON string. The "key" represents the original word. The "value" is a list of key-value pairs, in which the key is a related word and the value is its frequency. The list is sorted by frequency.
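
The row format described above could look roughly like the sketch below. Two details are assumptions: that each key-value pair is stored as a single-entry JSON object, and that the list is sorted in descending order of frequency. make_row and the sample counts are hypothetical.

```python
import json


def make_row(word, related_counts):
    """Build one JSON row: {"key": word, "value": [{related: freq}, ...]}."""
    # Assumed order: highest frequency first, ties broken alphabetically.
    ranked = sorted(related_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return json.dumps({"key": word, "value": [{w: c} for w, c in ranked]})


# Made-up counts, for illustration only.
row = make_row("travel", {"hotel": 3, "beach": 7, "flight": 3})
parsed = json.loads(row)
```

Because every row is an independent JSON string, the file can be processed line by line without loading it as a whole.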

Run sparksplit.py to generate "newvectors.txt" before running spark.py.

Log

The Log project analyzes log data with Spark, for example by counting how often each tag appears in the log and sorting the tags by that count.
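
The counting and sorting step can be sketched in plain Python, used here as a stand-in for Spark so that the example runs without a cluster. The log format is an assumption (a user id, a tab, then comma-separated tags), and count_tags is a hypothetical name.

```python
from collections import Counter


def count_tags(log_lines):
    """Count tag occurrences and return (tag, count) pairs, most frequent first."""
    counts = Counter()
    for line in log_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip lines that do not match the assumed "<user>\t<tag,tag,...>" layout.
        if len(fields) < 2 or not fields[1]:
            continue
        counts.update(fields[1].split(","))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up log lines, for illustration only.
log = [
    "u1\ttravel,food",
    "u2\ttravel",
    "malformed line",
    "u3\tfood,music,travel",
]
tag_counts = count_tags(log)
```

In Spark, the same idea would typically be expressed as a flatMap over the lines, followed by reduceByKey and sortBy on the resulting (tag, 1) pairs.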

Select-tag

The Select-tag project implements an algorithm that selects the smallest number of hashtags from existing videos such that their related hashtags together cover as much of the whole hashtag set as possible.
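
This is an instance of the set cover problem. The algorithm the project actually uses is not described here; the sketch below uses the standard greedy heuristic, which picks at each step the hashtag that covers the most still-uncovered hashtags. It gives a small selection but not necessarily the minimum one. select_tags and the sample data are hypothetical.

```python
def select_tags(related):
    """Greedy set cover over {hashtag: iterable of related hashtags}.

    Returns the chosen hashtags and the set of hashtags they cover.
    """
    related = {tag: set(tags) for tag, tags in related.items()}
    universe = set().union(*related.values())
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the hashtag with the largest gain; ties go to the
        # alphabetically first candidate because of sorted().
        best = max(sorted(related), key=lambda t: len(related[t] & uncovered))
        gain = related[best] & uncovered
        if not gain:
            break  # defensive guard: nothing left can be covered
        chosen.append(best)
        uncovered -= gain
    return chosen, universe - uncovered


# Made-up data, for illustration only.
sample = {
    "a": {"x", "y", "z"},
    "b": {"x", "y"},
    "c": {"z", "w"},
    "d": {"w"},
}
chosen, covered = select_tags(sample)
```

Here "a" is picked first because it covers three hashtags, and "c" then covers the remaining one, so two hashtags are enough to cover everything.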

T2Ttable

t2ttable.py generates the tag-tag table from the co-tag file. The "isHash" parameter determines whether the final tag-tag table is filtered by the hashtags.
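
The exact table layout is not specified here, so the following is only a sketch under stated assumptions: the co-tag data is a list of groups of tags that belong together, the tag-tag table maps each tag to the number of groups it shares with every other tag, and filtering by hashtag keeps only the entries in which both tags are hashtags. build_t2t_table, is_hash, and the sample data are hypothetical names and values.

```python
from collections import defaultdict
from itertools import permutations


def build_t2t_table(cotag_groups, hashtags=None, is_hash=False):
    """Build {tag: {other_tag: shared_group_count}} from co-tag groups."""
    table = defaultdict(lambda: defaultdict(int))
    for group in cotag_groups:
        # Every ordered pair of distinct tags in a group is linked once.
        for a, b in permutations(set(group), 2):
            table[a][b] += 1
    result = {a: dict(row) for a, row in table.items()}
    if is_hash:
        # Assumed filter: keep a pair only if both tags are hashtags.
        allowed = set(hashtags or ())
        result = {
            a: {b: n for b, n in row.items() if b in allowed}
            for a, row in result.items()
            if a in allowed
        }
    return result


# Made-up co-tag groups, for illustration only.
groups = [["travel", "beach", "hotel"], ["travel", "beach"]]
full_table = build_t2t_table(groups)
filtered_table = build_t2t_table(groups, hashtags={"travel", "beach"}, is_hash=True)
```

With the filter enabled, "hotel" disappears both as a row and as a related tag, because it is not in the hashtag set.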

sample.json is a sample input file.

data.json is the core data used to generate our final tag-tag table. It can be extended in the same format.

Usage

python t2ttable.py -f co-tag-data.json <-i False>
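
How t2ttable.py parses these options is not shown here. The sketch below is one way to handle such a command line with argparse, assuming that -f is required and that -i is optional and defaults to True; both the default and the names parse_args and str_to_bool are assumptions. A custom converter is used because passing type=bool to argparse would turn the string "False" into True.

```python
import argparse


def str_to_bool(text):
    """Convert a command-line string such as "False" into a real boolean."""
    lowered = text.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % text)


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Generate a tag-tag table.")
    parser.add_argument("-f", dest="cotag_file", required=True,
                        help="path to the co-tag JSON file")
    # Assumed default: filter by hashtag unless "-i False" is given.
    parser.add_argument("-i", dest="is_hash", type=str_to_bool, default=True,
                        help="whether to filter the table by hashtag")
    return parser.parse_args(argv)


args = parse_args(["-f", "co-tag-data.json", "-i", "False"])
```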
