fuzzy_search

Python script for fast fuzzy text search (based on Levenshtein distance)

Usage

This script searches large plain-text (.txt) files for phrases, tolerating small differences between the phrase and the text.
It is optimised for natural, human-readable text (articles, books, etc.) because it relies heavily on words being separated by whitespace.
Note: it does not match substrings (e.g. it will not find 'brown' in 'quickbrownfox').
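
The substring limitation follows from splitting the text on whitespace. Here is a minimal sketch of that idea, not the code in search.py: the helper name word_windows and the sample strings are hypothetical, and only exact comparison is shown.

```python
def word_windows(text, size):
    """Yield every run of `size` consecutive whitespace-separated words, lowercased."""
    words = text.lower().split()
    for i in range(len(words) - size + 1):
        yield words[i:i + size]


# Assumption: the phrase is split into words the same way as the text.
phrase = "brown fox".lower().split()

windows = list(word_windows("The quick brown fox jumps", len(phrase)))
print(phrase in windows)  # True: 'brown' and 'fox' are separate words

# 'quickbrownfox' contains no whitespace, so it is one single word
# and 'brown' is never compared against a part of it.
print(["brown"] in list(word_windows("quickbrownfox", 1)))  # False
```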

Parameters

Run with -h for details

-src source file path (required)
-f string to find (required)
-msl max word length difference, WLD (default: 1)
-mld max Levenshtein distance, LD (default: 2)
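
For illustration, flags like these could be declared with argparse roughly as follows. This is a sketch and may differ from what search.py actually does: the function name build_parser, the help strings and the file name book.txt are assumptions, and the defaults are the ones quoted in the Examples section.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the four documented flags."""
    parser = argparse.ArgumentParser(description="Fuzzy phrase search in a text file")
    parser.add_argument("-src", required=True, help="source file path")
    parser.add_argument("-f", required=True, help="string to find")
    parser.add_argument("-msl", type=int, default=1, help="max word length difference")
    parser.add_argument("-mld", type=int, default=2, help="max Levenshtein distance")
    return parser


# Parse a sample argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-src", "book.txt", "-f", "brown fox"])
print(args.src, args.f, args.msl, args.mld)  # book.txt brown fox 1 2
```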

Examples

Default max Levenshtein distance (LD) is 2 and max word length difference (WLD) is 1:

text part	to find	match	reason
QUICK BROWN FOX JUMPS	brown fox	yes	case insensitive
quick bronw fox jumps	brown fox	yes	LD=2, WLD=0
quick brownn fox jumps	brown fox	yes	LD=1, WLD=1
quick green fox jumps	brown fox	no	LD=3 (exceeds max LD), WLD=0
quick brownnn fox jumps	brown fox	no	LD=2, WLD=2 (exceeds max WLD)
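
The rows above can be reproduced with a short sketch. It is not the implementation in search.py: the function names are hypothetical, and it assumes that both limits are checked word by word against the word in the same position of the phrase.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def words_match(found, wanted, max_ld=2, max_wld=1):
    """Case-insensitive check of one word against both limits (assumed per word)."""
    found, wanted = found.lower(), wanted.lower()
    if abs(len(found) - len(wanted)) > max_wld:
        return False  # word length difference too large, skip the distance
    return levenshtein(found, wanted) <= max_ld


def find_phrase(text, phrase, max_ld=2, max_wld=1):
    """Return every run of words in text that matches phrase within the limits."""
    words, wanted = text.split(), phrase.split()
    hits = []
    for i in range(len(words) - len(wanted) + 1):
        window = words[i:i + len(wanted)]
        if all(words_match(w, t, max_ld, max_wld) for w, t in zip(window, wanted)):
            hits.append(" ".join(window))
    return hits


print(find_phrase("QUICK BROWN FOX JUMPS", "brown fox"))    # ['BROWN FOX']
print(find_phrase("quick bronw fox jumps", "brown fox"))    # ['bronw fox']
print(find_phrase("quick brownn fox jumps", "brown fox"))   # ['brownn fox']
print(find_phrase("quick green fox jumps", "brown fox"))    # []
print(find_phrase("quick brownnn fox jumps", "brown fox"))  # []
```

Checking the length difference first is a cheap filter: it needs no distance computation and rejects words such as 'brownnn' before the more expensive comparison runs.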
