
Intelligent RAG Chatbot: Upload PDFs and chat instantly using Google Gemini. Built with FastAPI, LangChain, and a premium React/Tailwind interface.

yugam23/RAG-Chatbot


📚 RAG Chatbot

Your Documents. Your Questions. Instant AI-Powered Answers.

License Python React FastAPI Tailwind CSS LangChain Google Gemini

Quick Start • Features • Architecture • API Docs


🌟 What is RAG Chatbot?

RAG Chatbot is a cutting-edge Retrieval-Augmented Generation (RAG) application that transforms how you interact with your documents. Upload any PDF, and engage in intelligent, context-aware conversations powered by Google's Gemini AI.

Unlike a general-purpose chatbot, RAG Chatbot is built to answer strictly from the content of your uploaded documents. It pairs semantic search with a large language model, which reduces (though cannot fully eliminate) hallucinated answers.

🎯 Why RAG Chatbot?

  • ✅ Context-Grounded: Answers are drawn from your documents, not from the model's general knowledge
  • ⚡ Lightning Fast: Optimized retrieval with a FAISS vector index
  • 🎨 Premium UI/UX: Glassmorphism design with smooth animations
  • 🔒 Privacy-First: Documents are parsed and indexed locally, and nothing persists across server restarts

🎥 Demo

RAG Chatbot Demo

🚀 Quick Start

Get up and running in 3 minutes!

Prerequisites

Installation

1️⃣ Clone the Repository

git clone https://github.com/yugam23/RAG-Chatbot.git
cd RAG-Chatbot

2️⃣ Backend Setup

cd backend
python -m venv venv

# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate

pip install -r requirements.txt

Configure Environment: Create a .env file in backend/:

GOOGLE_API_KEY=your_actual_api_key_here
ALLOWED_ORIGINS=http://localhost:5173

Run the Backend:

python main.py
# 🚀 Server running at http://localhost:8000

3️⃣ Frontend Setup

Open a new terminal:

cd frontend
npm install
npm run dev
# ✨ App running at http://localhost:5173

✨ Key Features

🧠 Intelligent Backend

| Feature | Description |
| --- | --- |
| 🤖 Google Gemini Integration | Powered by gemini-flash-latest for fast, accurate responses |
| 🔍 Advanced RAG Pipeline | Recursive character chunking (800 chars, 400 overlap) + Gecko embeddings |
| 💾 Vector Search | FAISS CPU index with top-k retrieval (k=7) |
| 📑 Robust PDF Processing | Magic byte validation, 50MB limit, secure temp storage |
| 💬 Session Management | SQLite-based chat history, persisted for the lifetime of the session |
| 🔄 Auto-Reset | Session and index are cleared automatically on server restart |
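The PDF checks listed in the table (magic byte validation and a 50MB limit) can be sketched as follows. This is an illustrative stand-alone version, not the code in the repository: the function name `validate_pdf` and the reading of 50MB as 50 × 1024 × 1024 bytes are assumptions.

```python
PDF_MAGIC = b"%PDF-"                 # leading bytes of a PDF file
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumed interpretation of the 50MB limit


def validate_pdf(data: bytes, max_bytes: int = MAX_UPLOAD_BYTES) -> None:
    """Raise ValueError if the payload is too large or does not look like a PDF."""
    if len(data) > max_bytes:
        raise ValueError(f"file exceeds {max_bytes} bytes")
    # Check the file signature instead of trusting the filename extension.
    if not data.startswith(PDF_MAGIC):
        raise ValueError("invalid file type: missing %PDF- header")
```

Checking the leading bytes catches files that were merely renamed to `.pdf`, which an extension check would let through.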

🎨 Premium Frontend

| Feature | Description |
| --- | --- |
| ✨ Glassmorphism Design | Modern blur effects, gradients, and depth |
| 🎬 Startup Animation | Smooth logo intro with motion transitions |
| 💬 Real-Time Streaming | Server-Sent Events (SSE) for live response rendering |
| 📝 Markdown Support | Full syntax highlighting with react-markdown + remark-gfm |
| 📱 Fully Responsive | Optimized for desktop, tablet, and mobile |
| ⚡ Optimized Caching | TanStack Query for efficient data fetching |

🏗️ Architecture

RAG Chatbot follows a modern client-server architecture:

┌─────────────────────────────────────────────────────────────────┐
│                         USER INTERFACE                          │
│  React + Vite + Tailwind CSS + Framer Motion + TanStack Query   │
└────────────────────────────┬────────────────────────────────────┘
                             │ HTTP/SSE
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                        FASTAPI BACKEND                          │
│  ┌──────────────┐  ┌──────────────┐  ┌─────────────────────┐    │
│  │   Routers    │  │  Middleware  │  │   State Manager     │    │
│  │ /upload      │  │ - Rate Limit │  │ - Session State     │    │
│  │ /chat        │  │ - Request ID │  │ - Vector Store Ref  │    │
│  └──────────────┘  └──────────────┘  └─────────────────────┘    │
│                             │                                   │
│                             ▼                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                  RAG PIPELINE (LangChain)                │   │
│  │  ┌────────┐   ┌────────┐   ┌──────────┐   ┌──────────┐   │   │
│  │  │  PDF   │──▶│ Chunk  │──▶│  Embed   │──▶│  FAISS   │   │   │
│  │  │ Loader │   │ (800)  │   │ (Gecko)  │   │  Index   │   │   │
│  │  └────────┘   └────────┘   └──────────┘   └──────────┘   │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
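The Middleware box in the diagram lists rate limiting. One common way to implement it is a per-client sliding window, sketched below. The class name, the limits, and the injectable clock are hypothetical; the README does not say which algorithm `middleware.py` actually uses.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock               # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A middleware would call `allow()` with a client identifier (for example the remote address) and respond with HTTP 429 when it returns `False`.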

🔄 RAG Pipeline Workflow

  1. 📤 Document Ingestion: PDF Upload → Magic Byte Validation → Text Extraction → Recursive Chunking (800/400).
  2. 🧮 Embedding & Indexing: Google Gecko Embeddings → FAISS Vector Store.
  3. 💬 Query Processing: User Query → Embedding → Similarity Search (Top-k) → Prompt Construction.
  4. 🤖 Response Generation: Context + Query → Gemini Flash → Streaming Response.
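Steps 3 and 4 can be illustrated without any external service. The sketch below ranks chunks by cosine similarity and builds a grounded prompt. It is a simplified stand-in: the real pipeline delegates search to FAISS through LangChain, the similarity metric there may differ, and the prompt wording and function names here are assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, chunks, k=7):
    """Return the k chunks whose vectors are most similar to the query vector."""
    ranked = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

The default `k=7` mirrors the `RETRIEVER_K` value in the configuration. Raising it gives the model more context at the cost of a longer prompt.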

📂 Project Structure

RAG-Chatbot/
├── 📁 backend/                     # FastAPI Python Backend
│   ├── 📁 routers/                 # API Route Handlers
│   │   ├── upload.py               # PDF upload & indexing endpoint
│   │   └── chat.py                 # Chat streaming & history endpoints
│   ├── 📁 tests/                   # Pytest Test Suite
│   ├── 📁 temp/                    # Temporary PDF storage
│   ├── 📁 faiss_index/             # Vector database (generated)
│   ├── config.py                   # Centralized configuration
│   ├── database.py                 # SQLite async operations
│   ├── ingestion.py                # Document processing pipeline
│   ├── rag.py                      # RAG chain implementation
│   ├── state.py                    # Application state management
│   ├── vector_store.py             # Vector store logic
│   ├── middleware.py               # Rate limiting & request tracking
│   ├── logging_config.py           # Structured logging setup
│   ├── models.py                   # Pydantic models
│   ├── main.py                     # FastAPI app entry point
│   └── requirements.txt            # Python dependencies
│
├── 📁 frontend/                    # React + Vite Frontend
│   ├── 📁 src/
│   │   ├── 📁 components/          # React Components (Header, ChatArea, etc.)
│   │   ├── 📁 hooks/               # Custom React Hooks (useChat, useApiQueries)
│   │   ├── 📁 services/            # API Communication
│   │   ├── 📁 context/             # React Context Providers
│   │   ├── App.jsx                 # Main App component
│   │   ├── main.jsx                # React entry point
│   │   └── index.css               # Global styles & theme
│   ├── 📁 public/                  # Static Assets
│   └── vite.config.js              # Vite configuration
│
└── 📄 README.md                    # This file!

⚙️ Configuration

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| GOOGLE_API_KEY | ✅ Yes | (none) | Google AI API key for Gemini & Gecko |
| ALLOWED_ORIGINS | ❌ No | http://localhost:5173 | Comma-separated CORS origins |
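To make the semantics of the two variables concrete, here is a minimal standard-library sketch of reading them. The `ValidationError` mentioned under Common Issues suggests the project itself validates settings with Pydantic, so treat `load_settings` and its return shape as hypothetical.

```python
import os


def load_settings(env=os.environ) -> dict:
    """Read both variables, applying the documented default for ALLOWED_ORIGINS."""
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        raise ValueError("GOOGLE_API_KEY is required")
    raw_origins = env.get("ALLOWED_ORIGINS", "http://localhost:5173")
    # Split the comma-separated list, dropping empty entries and stray spaces.
    origins = [o.strip() for o in raw_origins.split(",") if o.strip()]
    return {"google_api_key": api_key, "allowed_origins": origins}
```

Passing a plain dict instead of `os.environ` makes the parsing easy to try out, for example `load_settings({"GOOGLE_API_KEY": "k", "ALLOWED_ORIGINS": "http://a.test, http://b.test"})`.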

Advanced Tuning (backend/config.py)

# Document Processing
CHUNK_SIZE = 800              # Characters per chunk
CHUNK_OVERLAP = 400           # Overlap between chunks

# Retrieval
RETRIEVER_K = 7               # Number of chunks to retrieve

# Models
EMBEDDING_MODEL = "models/text-embedding-004"
LLM_MODEL = "gemini-flash-latest"
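To see how `CHUNK_SIZE` and `CHUNK_OVERLAP` interact, consider a plain sliding-window splitter. This is a deliberate simplification: the pipeline describes its chunking as recursive, which also tries to break on natural separators, whereas this hypothetical `chunk_text` cuts at fixed character offsets.

```python
def chunk_text(text: str, size: int = 800, overlap: int = 400) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

With the defaults, each window advances by 400 characters, so most of the text is embedded twice. That roughly doubles embedding calls and index size compared with no overlap, in exchange for fewer answers being split across a chunk boundary.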

📡 API Reference

Full interactive documentation available at http://localhost:8000/docs.

Core Endpoints

  • POST /upload: Upload and index a PDF. Validates magic bytes and size.
  • POST /chat: Stream chat response via SSE. Requires active session.
  • GET /history: Retrieve stored chat history.
  • POST /clear-chat: Clear history but keep document index.
  • POST /reset: Full session reset (wipes history + index).
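Since `/chat` streams over SSE, it may help to see what the wire format looks like. The sketch below encodes and decodes events following the standard `data:` line framing. The JSON payload with a `token` key is an assumption, as the README does not document the event schema.

```python
import json
from typing import Optional


def format_sse(payload: dict, event: Optional[str] = None) -> str:
    """Encode one event: optional `event:` line, a `data:` line, then a blank line."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(payload)}")
    return "\n".join(lines) + "\n\n"


def parse_sse(stream: str) -> list[dict]:
    """Decode the JSON payload of each event in a fully buffered SSE stream."""
    events = []
    for block in stream.split("\n\n"):
        data = [ln[5:].lstrip() for ln in block.split("\n") if ln.startswith("data:")]
        if data:
            events.append(json.loads("\n".join(data)))
    return events
```

A real client would parse incrementally as bytes arrive rather than buffering the whole stream. Check the interactive docs at `/docs` for the actual request and response shapes.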

🧪 Testing & Troubleshooting

Running Tests

cd backend
pytest -v          # Run all tests
pytest tests/test_ingestion.py  # Test specific module

Common Issues

  • ValidationError: GOOGLE_API_KEY field required: Add your API key to backend/.env.
  • Failed to fetch: Ensure backend is running on port 8000.
  • Invalid file type: Ensure the file is a valid PDF.

Made with ❤️ by Yugam
