Conqlab/Building_Multimodal_Search_and_RAG

 
 

Building Multimodal Search and RAG

About

This repository contains materials for the DeepLearning.AI course Building Multimodal Search and RAG.

Course Information

  • Instructor: Sebastian Witalec (Head of Developer Relations at Weaviate)
  • Course Website

Course Contents

# Lesson Description
0 Introduction
1 Overview of Multimodality
  • Explains how multimodal embedding models are unified using contrastive representation learning.
2 Multimodal Search
  • Learn how a concept is understood across multiple modalities.
  • Build text-to-any and any-to-any search using Weaviate.
3 Large Multimodal Models (LMMs)
  • Understand how LLMs work.
  • Understand how to combine LLMs and multimodal models into vision-language models.
4 Multimodal RAG (MM-RAG)
  • Search with a large multimodal model using the Weaviate database.
5 Industry Applications
  • Industry applications demonstrated by extracting structured data from images: invoices, tables, and flow charts.
  • An LLM answers queries about the extracted data using reasoning.
6 Multimodal Recommender System
  • Multimodal (text and image) recommendation demonstrated on a movies dataset.
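
The contrastive representation idea from Lesson 1 can be illustrated with a small, self-contained sketch. It is not taken from the course notebooks: it assumes an InfoNCE-style loss over cosine similarities, and the function names, the temperature value, and the toy 2-D vectors are all hypothetical. The point it shows is that the loss is low when each text vector sits closest to its own paired image vector and high when the pairs are mismatched.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss (assumed formulation, not the course's code).

    anchors[i] is expected to match positives[i]; every other positive
    in the batch acts as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Toy embeddings (hypothetical): two "text" vectors and two "image" vectors.
text_vecs = [[1.0, 0.0], [0.0, 1.0]]
image_vecs_aligned = [[0.9, 0.1], [0.1, 0.9]]   # pairs point the same way
image_vecs_swapped = [[0.1, 0.9], [0.9, 0.1]]   # pairs are mismatched

aligned_loss = info_nce(text_vecs, image_vecs_aligned)
swapped_loss = info_nce(text_vecs, image_vecs_swapped)
print(f"aligned: {aligned_loss:.4f}  swapped: {swapped_loss:.4f}")
```

Training a multimodal model amounts to adjusting the encoders so that this kind of loss goes down, which pulls matching pairs together in the shared embedding space.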

Assignments

Lesson Assignment Description
#1 Overview of Multimodality
  • Build a multimodal (text, image) model on the MNIST dataset using contrastive learning.
  • Apply dimensionality reduction to visualize the embeddings.
#2 Multimodal Search
  • Build multimodal search (images, videos) using the Weaviate client.
  • Model: multimodalembedding from Google Cloud Vertex AI
#3 Large Multimodal Models (LMMs)
  • Query images using the Gemini Vision model.
#5 Industry Applications
  • Extract structured data from retrieved images: invoices, tables, and flow charts.
  • An LLM answers queries about the extracted data using reasoning.
#6 Multimodal Recommender System
  • Text- and image-based semantic search over movie titles, overviews, and posters
  • Text-based semantic search using OpenAI embeddings
  • Image-based semantic search using Google Vertex multimodalembedding
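
The search and recommender assignments above rest on one operation: ranking stored items by how close their embeddings are to a query embedding. The sketch below illustrates that operation in plain Python. It does not use the Weaviate client or any embedding API; the item ids, modalities, and 3-D vectors are hypothetical stand-ins for what a vector database and an embedding model would provide.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, index, k=2):
    """Return the k items most similar to query_vec as (score, id, modality)."""
    scored = [
        (cosine(query_vec, item["vector"]), item["id"], item["modality"])
        for item in index
    ]
    scored.sort(key=lambda entry: entry[0], reverse=True)
    return scored[:k]


# Hypothetical index: items of different modalities in one embedding space.
index = [
    {"id": "poster_001", "modality": "image", "vector": [0.9, 0.1, 0.0]},
    {"id": "overview_001", "modality": "text", "vector": [0.8, 0.2, 0.1]},
    {"id": "clip_007", "modality": "video", "vector": [0.0, 0.1, 0.9]},
]

# Hypothetical query embedding, e.g. produced from a text prompt.
query = [1.0, 0.0, 0.0]

results = search(query, index, k=2)
for score, item_id, modality in results:
    print(f"{item_id} ({modality}): {score:.3f}")
```

Because every modality shares one embedding space, the same ranking step serves text-to-image, image-to-text, or any-to-any retrieval; only the encoder that produces the query vector changes.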

Additional Resources

Certificate

Related Courses

Please visit my GitHub page for other Generative AI/LLM courses.

About

DeepLearning.ai course: Building Multimodal Search and RAG
