
An advanced Retrieval-Augmented Generation (RAG) system designed for personalized recommendations. This project features a scalable microservice architecture (FastAPI, Celery) and an agentic workflow (ARAG) to provide nuanced, user-centric results. A practical demonstration of modern AI and MLOps principles.


damrongsak/agentic-retrieval-core


RAG Embedding Worker

This project is a complete, production-ready backend system for Retrieval-Augmented Generation (RAG). Its primary purpose is to process uploaded PDF documents, convert them into a searchable format, and store them in a vector database. This enables a larger application to perform semantic searches and retrieve relevant document chunks to answer user questions.

The system is designed as a set of decoupled microservices that can be scaled and maintained independently. It is fully containerized with Docker for easy and consistent deployment.

Features

  • PDF Document Upload: Users can upload PDF files through a simple /upload REST endpoint.
  • Asynchronous Processing: Document processing is handled in the background using a task queue (Celery + Redis), so the API can respond quickly without making the user wait.
  • Text Extraction: The system automatically extracts text from PDF files using PyMuPDF.
  • Text Chunking & Embedding: It segments the extracted text into smaller chunks (sentences) and generates semantic vector embeddings for each chunk using a sentence-transformers model.
  • Vector Storage: Embeddings and associated metadata (like the original filename) are stored in the Weaviate vector database.
  • Task Status Tracking: A /status/{task_id} endpoint allows clients to poll for the status of their upload and processing job.
  • Containerized & Ready for Deployment: The entire application stack is defined in a docker-compose.yml file, allowing you to build and run all services with a single command.
  • Secure by Design: Includes measures for secure file handling, secrets management, and container security.
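
The chunking step described above can be illustrated with a minimal sketch. It assumes a simple punctuation-based splitter; the helper name `split_into_sentences` is hypothetical, and the project's actual chunking logic may differ.

```python
import re

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
# This is a simplification; the repository's own splitter is not shown here.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_into_sentences(text: str) -> list[str]:
    """Return non-empty sentences from raw extracted text."""
    # Collapse newlines and repeated spaces left over from PDF extraction.
    normalized = " ".join(text.split())
    return [s for s in _SENTENCE_END.split(normalized) if s]


chunks = split_into_sentences(
    "RAG retrieves context.\nIt then generates an answer!  Does it scale?"
)
```

Each resulting chunk would then be passed to the embedding model and stored alongside its source filename.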

Architecture

The system is composed of the following microservices:

flowchart TD
  A[User uploads PDF] --> B[API Gateway - FastAPI Web Server]
  B -->|enqueue task| C[Redis + Celery Queue]
  C --> D[Embedding Worker Service]
  D -->|OCR/Text Extract| E[Text Extractor]
  D -->|Embedding| F[Sentence Transformer]
  D -->|Store| G[Weaviate Vector DB]
  • API Gateway (FastAPI): The user-facing service that exposes a REST API for uploading documents and checking their processing status.
  • Embedding Worker (Celery): A background worker that handles the heavy lifting of document processing.
  • Weaviate Vector DB: A specialized database that stores the document chunks and their corresponding vector embeddings.
  • Redis: Acts as a message broker, managing the queue of documents to be processed.
  • Flower: A web-based monitoring tool for Celery.
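
To make the role of the vector database more concrete, here is a toy in-memory sketch of similarity search over stored chunks. It is not Weaviate's API; the functions `cosine_similarity` and `top_k`, and the two-dimensional example vectors, are illustrative assumptions only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k stored chunks most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical store of (chunk text, embedding) pairs.
store = [
    ("chunk about cats", [1.0, 0.0]),
    ("chunk about dogs", [0.9, 0.1]),
    ("chunk about taxes", [0.0, 1.0]),
]
nearest = top_k([1.0, 0.0], store, k=2)
```

A real vector database replaces the linear scan with an approximate nearest-neighbor index, but the ranking idea is the same.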

Technology Stack

| Component         | Tech                    |
|-------------------|-------------------------|
| API Gateway       | FastAPI, Python 3.11    |
| Background Worker | Celery                  |
| Message Broker    | Redis                   |
| PDF Parsing       | PyMuPDF                 |
| Embedding         | sentence-transformers   |
| Vector DB         | Weaviate                |
| Monitoring        | Flower Dashboard        |
| Environment       | Docker + docker-compose |

Getting Started

Prerequisites

  • Docker and docker-compose, which are used to build and run all services.

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd agentic-retrieval-core
  2. Create the environment file: Copy the example environment file to create your own configuration.

    cp .env.example .env

    You can modify the .env file to change the embedding model or other settings if needed.

  3. Build and run the services:

    docker-compose up --build

    This command will build the Docker images for the API gateway and the worker, and then start all the services.

How to Use

Every protected endpoint requires an API key in the X-API-Key header. Valid keys are configured in your PostgreSQL database.

1. Upload a PDF File

Send a POST request to the /upload endpoint with a PDF file.

Example using curl:

curl -X POST \
  -H "X-API-Key: your-secret-api-key" \
  -F "file=@/path/to/your/document.pdf" \
  http://localhost:8000/upload

The API will respond with a task_id:

{
  "message": "File uploaded successfully",
  "task_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef"
}
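
The same upload can be sketched from Python using only the standard library. The helper names `build_multipart` and `upload_pdf` are hypothetical, and the code assumes the endpoint accepts a multipart form field named `file`, as in the curl example.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_pdf(path: str, api_key: str, base_url: str = "http://localhost:8000") -> str:
    """POST the PDF to /upload and return the task_id from the JSON response."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart("file", os.path.basename(path), fh.read())
    request = urllib.request.Request(
        f"{base_url}/upload",
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["task_id"]
```

Calling `upload_pdf("/path/to/your/document.pdf", "your-secret-api-key")` against a running stack should return the task identifier shown in the response above.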

2. Check Processing Status

Use the task_id from the upload response to check the status of the document processing.

Example using curl:

curl -H "X-API-Key: your-secret-api-key" \
  http://localhost:8000/status/a1b2c3d4-e5f6-7890-1234-567890abcdef

The response will show the current status (PENDING, SUCCESS, FAILURE, etc.) and the result if the task is complete.

{
  "task_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
  "status": "SUCCESS",
  "result": {
    "status": "success",
    "chunks_processed": 150
  }
}
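
Polling can be wrapped in a small helper. The sketch below is an assumption-laden illustration: `wait_for_task` and `fetch_status_http` are hypothetical names, and only SUCCESS and FAILURE are treated as terminal states, since those are the ones named in this README.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumption: these two states mean the task will not change any further.
TERMINAL_STATES = {"SUCCESS", "FAILURE"}


def fetch_status_http(task_id: str, api_key: str = "your-secret-api-key",
                      base_url: str = "http://localhost:8000") -> dict:
    """GET /status/{task_id} and return the decoded JSON payload."""
    request = urllib.request.Request(
        f"{base_url}/status/{task_id}", headers={"X-API-Key": api_key}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_task(fetch_status: Callable[[str], dict], task_id: str,
                  interval: float = 1.0, timeout: float = 60.0,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the task reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status(task_id)
        if payload["status"] in TERMINAL_STATES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        sleep(interval)
```

Passing the fetch function as a parameter keeps the loop easy to test; in practice you would call `wait_for_task(fetch_status_http, task_id)`.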

3. Query Your Documents

Once a document has been processed successfully, you can search for information using the /query endpoint.

Example using curl:

curl -X POST \
  -H "X-API-Key: your-secret-api-key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the main topic of the document?"}' \
  http://localhost:8000/query

This will return a task_id. Poll /status/{task_id} with it to retrieve the search results.

Example Result from /status endpoint:

{
    "task_id": "...",
    "status": "SUCCESS",
    "result": {
        "status": "success",
        "results": [
            {
                "text": "This is a relevant chunk of text from your document.",
                "file": "document.pdf"
            },
            {
                "text": "This is another relevant chunk of text.",
                "file": "document.pdf"
            }
        ]
    }
}
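
On the client side, a payload of this shape can be post-processed with a few lines of Python. The sketch below groups the returned chunks by source file; the helper name `group_chunks_by_file` is hypothetical, and the field names are taken from the example response above.

```python
from collections import defaultdict


def group_chunks_by_file(payload: dict) -> dict[str, list[str]]:
    """Map each source filename to the list of chunk texts returned for it."""
    if payload.get("status") != "SUCCESS":
        return {}  # task still running or failed: nothing to group yet
    grouped: dict[str, list[str]] = defaultdict(list)
    for hit in (payload.get("result") or {}).get("results", []):
        grouped[hit["file"]].append(hit["text"])
    return dict(grouped)


example = {
    "task_id": "...",
    "status": "SUCCESS",
    "result": {
        "status": "success",
        "results": [
            {"text": "This is a relevant chunk of text from your document.", "file": "document.pdf"},
            {"text": "This is another relevant chunk of text.", "file": "document.pdf"},
        ],
    },
}
by_file = group_chunks_by_file(example)
```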

Monitoring

You can monitor the status of the background processing tasks using the Flower dashboard.

This dashboard provides a real-time overview of the Celery workers and the tasks they are processing.
