valenti1234/visual-inspector

Factory Visual Inspector

This application is a component of a larger backend system for industrial parts evaluation. It uses the LLaVA model to perform visual inspections of factory equipment, identify objects, materials, and potential issues, and generate a detailed inspection report in PDF format.

Project Structure

The application is organized into the following modules:

  • app.py: The main entry point for the application.
  • model_loader.py: Handles loading the LLaVA model and processor.
  • helper_functions.py: Contains utility functions for image processing, QR code generation, and interacting with the LLaVA model.
  • report_generator.py: Generates the inspection report in Markdown and PDF format.
  • gradio_ui.py: Defines the Gradio user interface.
  • requirements.txt: Lists the Python dependencies for the project.
  • style.css: Contains the CSS for the PDF report.
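As a rough illustration of the Markdown stage handled by report_generator.py, the sketch below assembles a report from a summary and a list of identified objects. The `Finding` dataclass, the `build_markdown_report` function, and the section titles are hypothetical names chosen for this example, not the repository's actual code. Conversion to PDF with style.css is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    """One identified object and its assessed condition (hypothetical schema)."""
    name: str
    condition: str


def build_markdown_report(title: str, summary: str, findings: List[Finding]) -> str:
    """Assemble a Markdown report; converting it to PDF is a separate step."""
    lines = [f"# {title}", "", "## Summary", "", summary, "", "## Identified Objects", ""]
    for finding in findings:
        # One bullet per object, with the name in bold.
        lines.append(f"- **{finding.name}**: {finding.condition}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    report = build_markdown_report(
        "Inspection Report",
        "One issue found.",
        [Finding("conveyor belt", "frayed edge")],
    )
    print(report)
```

Keeping the Markdown step a pure function of its inputs makes it testable without loading the model or a PDF renderer.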

LLaVA Model

The application uses llava-hf/llava-1.5-7b-hf, a 7-billion-parameter Large Language and Vision Assistant (LLaVA) model. The model accepts both text and images as input and produces text, which makes it suitable for visual inspection tasks.
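A minimal sketch of asking the model a question about one image, assuming the Hugging Face transformers interface for this checkpoint and a model and processor that are already loaded. The function names `build_prompt` and `describe_image` and the default `max_new_tokens` value are assumptions for illustration; the repository's helper_functions.py may be organized differently.

```python
def build_prompt(question: str) -> str:
    """Wrap a question in the LLaVA 1.5 chat format with one image placeholder."""
    return f"USER: <image>\n{question} ASSISTANT:"


def describe_image(model, processor, image, question: str, max_new_tokens: int = 200) -> str:
    """Run one image and one question through an already loaded model and processor."""
    inputs = processor(images=image, text=build_prompt(question), return_tensors="pt")
    # Move tensors to the model's device and cast floating-point inputs to its dtype.
    inputs = inputs.to(model.device, model.dtype)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    text = processor.decode(output_ids[0], skip_special_tokens=True)
    # The decoded text echoes the prompt; keep only the part after "ASSISTANT:".
    return text.split("ASSISTANT:")[-1].strip()
```

For example, `describe_image(model, processor, image, "List any visible damage on this equipment.")` would return the model's answer as a plain string, where `image` is a PIL image.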

Key Features

  • Multi-modal Understanding: The LLaVA model can process and reason about both visual and textual information, enabling it to analyze images and generate descriptive text.
  • 4-bit Quantization: To reduce the memory footprint, the model is loaded with 4-bit quantization using the bitsandbytes library. This lets the model run on hardware with less GPU memory than full-precision weights would require.
  • Object and Anomaly Detection: The model is used to identify objects, assess their condition, and detect anomalies or potential issues in the factory equipment.
  • Report Generation: The insights generated by the LLaVA model are used to create a comprehensive inspection report, which includes a summary of the findings, a list of identified objects, and a comparison with a reference image.
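The 4-bit loading step described above can be sketched as follows, assuming the transformers and bitsandbytes packages and a GPU that bitsandbytes supports. The function names `load_model` and `estimate_weight_gb` and the float16 compute dtype are assumptions for this example, not necessarily what model_loader.py does. The estimate covers weight storage only and ignores activations, the vision encoder, and quantization metadata.

```python
MODEL_ID = "llava-hf/llava-1.5-7b-hf"


def estimate_weight_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (10^9 bytes); ignores activations and overhead."""
    return num_params * bits_per_param / 8 / 1e9


def load_model(model_id: str = MODEL_ID):
    """Load the model with 4-bit weights; needs torch, transformers, and bitsandbytes."""
    # Imports are local so the estimate above works without these packages installed.
    import torch
    from transformers import AutoProcessor, BitsAndBytesConfig, LlavaForConditionalGeneration

    quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id, quantization_config=quant_config, device_map="auto"
    )
    processor = AutoProcessor.from_pretrained(model_id)
    return model, processor


if __name__ == "__main__":
    print(estimate_weight_gb(7e9, 16))  # 16-bit weights
    print(estimate_weight_gb(7e9, 4))   # 4-bit weights
```

For 7 billion parameters the arithmetic gives about 14 GB at 16 bits per weight and about 3.5 GB at 4 bits, which is why quantization widens the range of usable hardware.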

Setup and Installation

  1. Create a virtual environment:

    python3 -m venv vinsp
    source vinsp/bin/activate
  2. Install the required dependencies:

    pip install -r requirements.txt

How to Run

To start the application, run the following command:

python3 app.py

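This starts a Gradio server that listens on http://0.0.0.0:7860. Because 0.0.0.0 is a bind address (all network interfaces) rather than a destination, open http://localhost:7860 in a browser on the same machine, or use the host's IP address with port 7860 from another machine on the network.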
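A minimal sketch of how such a launch might look with Gradio, assuming the server binds to 0.0.0.0 on port 7860 as the address above suggests. The names `browser_url`, `launch`, and `inspect_fn` are hypothetical, and the actual gradio_ui.py likely defines a richer interface.

```python
def browser_url(server_name: str, port: int) -> str:
    """Return a URL to open locally; 0.0.0.0 means 'all interfaces', not a destination."""
    host = "localhost" if server_name in ("0.0.0.0", "") else server_name
    return f"http://{host}:{port}"


def launch(inspect_fn, server_name: str = "0.0.0.0", port: int = 7860) -> None:
    """Serve inspect_fn (PIL image in, text out) through a simple Gradio interface."""
    # Local import so browser_url can be used without Gradio installed.
    import gradio as gr

    demo = gr.Interface(
        fn=inspect_fn,
        inputs=gr.Image(type="pil"),
        outputs=gr.Textbox(label="Findings"),
    )
    print("Open", browser_url(server_name, port))
    demo.launch(server_name=server_name, server_port=port)
```

Binding to 0.0.0.0 exposes the interface to other machines on the network; passing `server_name="127.0.0.1"` restricts it to the local machine.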
