cdyangbo/DeepVideoAnalytics

Deep Video Analytics

Deep Video Analytics provides a platform for indexing and extracting information from videos and images. Deep learning detection and recognition algorithms are used to index individual frames/images along with the objects detected in them. The goal of Deep Video Analytics is to become a quickly customizable platform for developing visual & video analytics applications, while benefiting from seamless integration with state-of-the-art models & datasets released by the vision research community.

Features

  • Visual search using a nearest neighbors algorithm as the primary interface
  • Upload videos or multiple images (zip file with folder names as labels)
  • Provide a YouTube URL to be automatically downloaded and processed via youtube-dl
  • Leverage pre-trained object recognition/detection and face recognition models for analysis and visual search
  • Query against pre-indexed external datasets containing millions of images
  • Metadata is stored in Postgres; operations are performed asynchronously using Celery tasks
  • Separate queues and workers allow selection of machines with different specifications (GPU vs. RAM)
  • Videos, frames, indexes, and numpy vectors are stored in the media directory and served through nginx
  • Explore data and manually run code & tasks without the UI via a Jupyter notebook (explore.ipynb)
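
To illustrate the visual search feature listed above, here is a minimal sketch of nearest-neighbor lookup over feature vectors. It is not the project's actual indexer: the function name `nearest_neighbors`, the frame names, and the 3-dimensional vectors are all hypothetical, and a brute-force Euclidean scan stands in for whatever index the platform really uses.

```python
import math
from typing import Dict, List, Sequence, Tuple


def nearest_neighbors(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k index entries closest to `query` by Euclidean distance."""
    # Brute-force scan: compute the distance from the query to every vector.
    scored = [(name, math.dist(query, vector)) for name, vector in index.items()]
    # Smallest distance first, then keep the top k.
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical feature vectors keyed by frame name (assumption for illustration;
# real features would come from a pre-trained model and be much larger).
frame_index = {
    "video1/frame_0001.jpg": [0.9, 0.1, 0.0],
    "video1/frame_0002.jpg": [0.8, 0.2, 0.1],
    "video2/frame_0001.jpg": [0.0, 0.1, 0.9],
}

results = nearest_neighbors([1.0, 0.0, 0.0], frame_index, k=2)
for name, distance in results:
    print(f"{name}\t{distance:.3f}")
```

A linear scan like this is only practical for small collections; querying millions of images would need an approximate or GPU-backed index instead.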

Models included out of the box

We take significant effort to ensure that the following models (code and weights included) work without requiring you to write any code.

Self-promotion: if you are interested in healthcare & machine learning, please take a look at my other open source project, Computational Healthcare.

Libraries & Code used

References

  1. Schroff, Florian, Dmitry Kalenichenko, and James Philbin. "Facenet: A unified embedding for face recognition and clustering." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
  2. Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
  3. Zhang, Kaipeng, et al. "Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks." IEEE Signal Processing Letters 23.10 (2016): 1499-1503.
  4. Liu, Wei, et al. "SSD: Single shot multibox detector." European Conference on Computer Vision. Springer International Publishing, 2016.
  5. Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
  6. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
  7. Johnson, Jeff, Matthijs Douze, and Hervé Jégou. "Billion-scale similarity search with GPUs." arXiv preprint arXiv:1702.08734 (2017).

Citation

Citation for Deep Video Analytics coming soon.

Copyright

Copyright 2016-2017, Akshay Bhat, Cornell University, All rights reserved.

Please contact me for more information. I plan on relaxing the license soon, once a beta version is reached, to the extent allowed by the included code and models (e.g. FAISS disallows commercial use).
