junior-2016/lsh-cpp

Requirements

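  • Python 2.7 development headers with matplotlib, NumPy and Tk
    On Ubuntu, install them using the following command.
$ sudo apt-get install python-matplotlib python-numpy python-tk python2.7-dev
  • GSL (GNU Scientific Library)
    On Ubuntu, install GSL using the following command.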
$ sudo apt-get install libgsl-dev

Build

$ git clone 
$ cd lsh-cpp
$ git submodule update --init --recursive
$ mkdir build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Release ..
(use cmake -DCMAKE_BUILD_TYPE=Debug .. to build in debug mode)
$ make

Run

$ cd build
$ ./lsh_cpp_test # run lsh test case
$ ./lsh_benchmark # run lsh benchmark

TODO

  • LSH_CPP implementation

    • LSH benchmark test
    • LSH Forest impl
    • LSH ensemble impl
    • Weighted MinHash impl
    • Lean MinHash impl (reduce memory usage / compress data and parameters / memory buffer pool)
    • HyperLogLog / HyperLogLog++ impl
  • LSH_CPP Large-scale data support

    • MapReduce computation
    • Data persistence (Redis storage)
    • Data save/read/remove/split/merge/... features
  • Other

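    • Use the C++17 parallel algorithms (Parallelism TS) to speed up some for-loop/sort/reduce/find algorithms.
    • Make xxhash and parallel_hash_map/set optional: when the user builds without these third-party libraries, fall back to std::hash and std::unordered_map.
    • Python interface export using pybind11
    • Refactor the parts of the WeightMinHash implementation that need concepts (for example, the update() parameter of WeightMinHash requires SetElementType to provide weight() and value() interfaces).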
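
The last item asks for a compile-time check that an element type provides weight() and value(). The following is only a minimal sketch of one way to express that in C++17 with the detection idiom (std::void_t), not the project's actual code. The names has_weight_and_value, WeightedElement and total_weight_after_update are hypothetical and exist only for illustration; the real WeightMinHash::update() signature may differ.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Primary template: T does not provide both weight() and value().
template <typename T, typename = void>
struct has_weight_and_value : std::false_type {};

// Specialization selected only when both member calls are well-formed.
template <typename T>
struct has_weight_and_value<
    T, std::void_t<decltype(std::declval<const T&>().weight()),
                   decltype(std::declval<const T&>().value())>>
    : std::true_type {};

// Hypothetical element type, for illustration only.
struct WeightedElement {
    std::string v;
    double w;
    const std::string& value() const { return v; }
    double weight() const { return w; }
};

// Hypothetical stand-in for an update() that needs weighted elements.
// It only accumulates weights; it is not a MinHash update.
template <typename SetElementType>
double total_weight_after_update(double total, const SetElementType& e) {
    static_assert(has_weight_and_value<SetElementType>::value,
                  "SetElementType must provide weight() and value()");
    assert(e.weight() >= 0.0);  // assumption: weights are non-negative
    return total + e.weight();
}
```

Passing a type without these members (for example int) fails at compile time with the static_assert message instead of a long template error. With a C++20 compiler the same requirement could be written as a concept with a requires-expression.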
