A scalable system (using multiprocessing in Python) for measuring similarity between thousands of documents using difflib's SequenceMatcher, Levenshtein distance, cosine similarity, and word embeddings generated by word2vec.
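
As a rough illustration of the idea described above, the following minimal sketch scores every pair of documents with `difflib.SequenceMatcher` and optionally spreads the work over a `multiprocessing.Pool`. It is not taken from this repository: the function names `pair_similarity` and `all_pair_similarities`, the `processes` parameter, and the sample documents are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations
from multiprocessing import Pool


def pair_similarity(pair):
    """Return (i, j, ratio) for one pair of indexed documents (hypothetical helper)."""
    (i, text_a), (j, text_b) = pair
    # SequenceMatcher.ratio() lies in [0, 1]; 1.0 means the sequences are identical.
    return i, j, SequenceMatcher(None, text_a, text_b).ratio()


def all_pair_similarities(docs, processes=1):
    """Score every unordered pair of documents, optionally in parallel."""
    pairs = list(combinations(enumerate(docs), 2))
    if processes <= 1:
        # Serial fallback, handy for small inputs and for debugging.
        return [pair_similarity(p) for p in pairs]
    with Pool(processes) as pool:
        # Pool.map preserves input order; chunksize reduces inter-process overhead.
        return pool.map(pair_similarity, pairs, chunksize=64)


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    docs = [
        "the quick brown fox",
        "the quick brown dog",
        "an entirely different sentence",
    ]
    for i, j, score in all_pair_similarities(docs, processes=2):
        print(i, j, round(score, 3))
```

The number of pairs grows quadratically with the number of documents, so for thousands of documents the pairwise scoring is the natural unit of work to distribute across processes. The same structure would apply if `pair_similarity` computed a Levenshtein distance or a cosine similarity over vectors instead.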

analyticsbot/document-similarity
