
# Hidden Markov Model Part of Speech Tagger

This is a Hidden Markov Model part-of-speech tagger for Catalan. The provided training data is tokenized and tagged. The test data is tokenized only; the program adds the tags.

python hmmlearn.py /path/to/input
The argument is a single file containing the training data.
The program learns a hidden Markov model and writes the model parameters to a file called hmmmodel.txt.
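
As a rough illustration of what the learning step involves, the sketch below estimates transition and emission probabilities from tagged sentences. It is not the code in hmmlearn.py. It assumes each line is one sentence of space-separated `word/TAG` tokens (the data format is not described here), and the `START` state, the `train_hmm` name, and the add-one smoothing on transitions are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-sentence state (an assumption)


def train_hmm(lines):
    """Estimate transition and emission probabilities from tagged lines."""
    transition_counts = defaultdict(Counter)
    emission_counts = defaultdict(Counter)
    for line in lines:
        prev = START
        for token in line.split():
            # Assumed format: word/TAG. Split on the last slash so that
            # words containing "/" are still handled.
            word, tag = token.rsplit("/", 1)
            transition_counts[prev][tag] += 1
            emission_counts[tag][word] += 1
            prev = tag

    tags = sorted(emission_counts)
    transitions = {}
    for prev in [START] + tags:
        total = sum(transition_counts[prev].values()) + len(tags)
        # Add-one smoothing keeps unseen tag pairs at a non-zero probability.
        transitions[prev] = {
            t: (transition_counts[prev][t] + 1) / total for t in tags
        }
    # Emissions are plain relative frequencies (no smoothing in this sketch).
    emissions = {
        tag: {w: c / sum(counts.values()) for w, c in counts.items()}
        for tag, counts in emission_counts.items()
    }
    return transitions, emissions
```

The two returned dictionaries could then be serialized to hmmmodel.txt in whatever layout the decoder expects; that file layout is not specified here.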

python hmmdecode.py /path/to/input
The argument is a single file containing the test data.
The program reads the parameters of a hidden Markov model from the file hmmmodel.txt, tags each word in the test data, and writes the results to a text file called hmmoutput.txt in the same format as the training data.
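
HMM taggers commonly decode with the Viterbi algorithm, and the sketch below shows one way that step might look. It is not taken from hmmdecode.py. The function name, the nested-dictionary model layout, the `"<s>"` start state, and the constant `unseen` probability for unknown words are assumptions made for the example, and all transition probabilities are assumed to be non-zero (for instance, smoothed).

```python
import math


def viterbi(words, tags, transitions, emissions, start="<s>", unseen=1e-6):
    """Return the most likely tag sequence for words under the given HMM."""
    if not words:
        return []

    def emit(tag, word):
        # Hypothetical fallback: a small constant probability for unseen words.
        return math.log(emissions.get(tag, {}).get(word, unseen))

    # best[i][tag] = (log-probability of the best path ending in tag at
    # position i, previous tag on that path)
    best = [{
        t: (math.log(transitions[start][t]) + emit(t, words[0]), None)
        for t in tags
    }]
    for word in words[1:]:
        column = {}
        for t in tags:
            score, prev = max(
                (best[-1][p][0] + math.log(transitions[p][t]), p) for p in tags
            )
            column[t] = (score + emit(t, word), prev)
        best.append(column)

    # Trace back from the best final state.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]
```

Working in log space avoids numerical underflow on long sentences. If the data uses `word/TAG` tokens (an assumption), each output line for hmmoutput.txt could be built by joining every word with its predicted tag.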


NamithaGS/HMM_POSTagger

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages