Factory Visual Inspector

This application is a component of a larger backend system for industrial parts evaluation. It uses the LLaVA model to perform visual inspections of factory equipment, identify objects, materials, and potential issues, and generate a detailed inspection report in PDF format.

Project Structure

The application is organized into the following modules:

app.py: The main entry point for the application.
model_loader.py: Handles loading the LLaVA model and processor.
helper_functions.py: Contains utility functions for image processing, QR code generation, and interacting with the LLaVA model.
report_generator.py: Generates the inspection report in Markdown and PDF formats.
gradio_ui.py: Defines the Gradio user interface.
requirements.txt: Lists the Python dependencies for the project.
style.css: Contains the CSS for the PDF report.
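
As an illustration of the Markdown half of report_generator.py, the following is a minimal sketch, not the repository's actual code. The function name `build_markdown_report`, its parameters, and the section titles are all hypothetical, and the sample findings in the usage note are invented. The PDF step is left out because this README does not name the library used to render the Markdown with style.css.

```python
from datetime import date


def build_markdown_report(summary, objects, reference_comparison, inspected_on=None):
    """Assemble a Markdown inspection report (hypothetical layout).

    summary: free-text summary of the findings
    objects: list of (name, condition) tuples for identified objects
    reference_comparison: free-text comparison with the reference image
    inspected_on: a datetime.date; defaults to today
    """
    inspected_on = inspected_on or date.today()
    lines = [
        "# Inspection Report",
        "",
        f"Date: {inspected_on.isoformat()}",
        "",
        "## Summary",
        "",
        summary.strip(),
        "",
        "## Identified Objects",
        "",
    ]
    if objects:
        # One bullet per identified object and its assessed condition.
        lines.extend(f"- {name}: {condition}" for name, condition in objects)
    else:
        lines.append("No objects identified.")
    lines += [
        "",
        "## Comparison with Reference Image",
        "",
        reference_comparison.strip(),
        "",
    ]
    return "\n".join(lines)
```

Calling `build_markdown_report("Pump housing shows corrosion.", [("pump", "corroded")], "More rust than in the reference image.")` returns a single Markdown string that a separate step could convert to PDF.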

LLaVA Model

The application uses the llava-hf/llava-1.5-7b-hf model, a 7-billion-parameter Large Language and Vision Assistant (LLaVA) model. It takes an image together with a text prompt and generates text in response, which suits it to visual inspection tasks.
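
The llava-1.5 checkpoints published under llava-hf expect a conversation-style prompt in which an `<image>` placeholder marks where the picture goes. The sketch below shows that format with two small helpers. The helper names and the example instruction are hypothetical, and the prompts this application actually sends are not shown in this README.

```python
# Hypothetical instruction text; the application's real prompts may differ.
INSPECTION_INSTRUCTION = (
    "Describe the equipment in this image and list any visible damage."
)


def build_llava_prompt(instruction: str) -> str:
    """Wrap an instruction in the LLaVA 1.5 chat format.

    The <image> token is a placeholder that the processor replaces
    with the image features.
    """
    return f"USER: <image>\n{instruction.strip()} ASSISTANT:"


def extract_answer(decoded: str) -> str:
    """Return the text after the last 'ASSISTANT:' marker.

    Decoded generations usually echo the prompt, so the model's reply
    is whatever follows the marker. If the marker is absent, the whole
    string is returned.
    """
    _, _, answer = decoded.rpartition("ASSISTANT:")
    return answer.strip()
```

The prompt string is passed to the processor along with the image, and `extract_answer` is applied to the decoded output.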

Key Features

Multi-modal Understanding: The LLaVA model can process and reason about both visual and textual information, enabling it to analyze images and generate descriptive text.
4-bit Quantization: To reduce the memory footprint, the model is loaded with 4-bit quantization using the bitsandbytes library. This lets it run on hardware with less memory than the full-precision weights would need.
Object and Anomaly Detection: The model is used to identify objects, assess their condition, and detect anomalies or potential issues in the factory equipment.
Report Generation: The insights generated by the LLaVA model are used to create a comprehensive inspection report, which includes a summary of the findings, a list of identified objects, and a comparison with a reference image.
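
To make the 4-bit quantization point concrete, here is a hedged sketch of how such a model can be loaded with Hugging Face Transformers and bitsandbytes, plus a back-of-envelope estimate of weight memory. The function names are hypothetical and the settings (float16 compute type, automatic device placement) are assumptions. The actual contents of model_loader.py may differ. The estimate covers the weights only and treats the parameter count as a nominal 7 billion.

```python
MODEL_ID = "llava-hf/llava-1.5-7b-hf"  # model named in this README


def approx_weight_gb(n_params: float, bits_per_param: int) -> float:
    """Rough size of the weights alone, in gigabytes (10^9 bytes).

    Ignores activations, the KV cache, and quantization metadata.
    """
    return n_params * bits_per_param / 8 / 1e9


def load_llava_4bit(model_id: str = MODEL_ID):
    """Load the model with 4-bit weights and return (model, processor)."""
    # Heavy imports are deferred so this module can be imported
    # without torch or transformers installed.
    import torch
    from transformers import (
        AutoProcessor,
        BitsAndBytesConfig,
        LlavaForConditionalGeneration,
    )

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # store weights in 4 bits
        bnb_4bit_compute_dtype=torch.float16,  # assumed compute precision
    )
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # requires the accelerate package
    )
    processor = AutoProcessor.from_pretrained(model_id)
    return model, processor
```

Under these assumptions, `approx_weight_gb(7e9, 16)` gives 14.0 and `approx_weight_gb(7e9, 4)` gives 3.5, which is the kind of reduction that motivates quantized loading.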

Setup and Installation

Create a virtual environment:

```
python3 -m venv vinsp
source vinsp/bin/activate
```

Install the required dependencies:

```
pip install -r requirements.txt
```

How to Run

To start the application, run the following command:

python3 app.py

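This starts a Gradio server at http://0.0.0.0:7860. Because 0.0.0.0 is a bind address rather than a destination, open http://localhost:7860 in a browser on the same machine, or use the host's IP address on port 7860 from another machine.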
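
For reference, a Gradio app that listens on all interfaces at port 7860 can be launched roughly as follows. This is a sketch under assumptions: `describe_image`, `build_demo`, and `main` are hypothetical names, the placeholder function stands in for the real inspection call, and the actual layout in gradio_ui.py is not shown in this README.

```python
SERVER_NAME = "0.0.0.0"  # listen on all network interfaces
SERVER_PORT = 7860       # Gradio's default port


def describe_image(image):
    """Placeholder for the real inspection call; returns a fixed string."""
    if image is None:
        return "No image provided."
    return "Inspection result would appear here."


def build_demo():
    """Build a one-input, one-output Gradio interface."""
    import gradio as gr  # deferred so the module imports without gradio

    return gr.Interface(
        fn=describe_image,
        inputs=gr.Image(type="pil"),
        outputs=gr.Textbox(label="Findings"),
        title="Factory Visual Inspector",
    )


def main():
    """Start the server; blocks until it is stopped."""
    build_demo().launch(server_name=SERVER_NAME, server_port=SERVER_PORT)
```

Calling `main()` starts the server. Setting `server_name` to `"127.0.0.1"` instead would restrict access to the local machine.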

valenti1234/visual-inspector
