YouTubeGPT is a web app that can be run fully locally and lets you summarize and chat (Q&A) with YouTube videos. You can either use OpenAI's API or a (local) Ollama instance.
YouTubeGPT's features include:
✍️ Provide a custom prompt for summaries
- you can tailor the summary to your needs by providing a custom prompt or just use the default summarization
❓ Get answers to questions about the video content
- part of the application is designed and optimized specifically for question answering tasks (Q&A)
- the summaries and answers can be saved to a library accessible on a separate page!
- additionally, summaries and answers can be exported/downloaded as Markdown files!
- choose between OpenAI's API or a (local) Ollama instance
- currently available: GPT-4 to GPT-5 models (incl. nano & mini), with new models added continuously
- by choosing a different model, you can summarize even longer videos and get better responses
- adjust the temperature and top P of the model
- go to the three dots in the upper right corner, select Settings, and choose light, dark, or my custom aesthetic theme
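The Markdown export mentioned above can be sketched with the standard library alone. The helper name and file-name scheme below are illustrative, not the app's actual code:

```python
from pathlib import Path

def export_summary_as_markdown(title: str, summary: str, out_dir: str = ".") -> Path:
    """Save a video summary as a Markdown file (illustrative helper)."""
    # Build a file-system-safe name from the video title
    safe_name = "".join(c if c.isalnum() or c in " -_" else "_" for c in title).strip()
    path = Path(out_dir) / f"{safe_name}.md"
    path.write_text(f"# {title}\n\n{summary}\n", encoding="utf-8")
    return path

# Usage
p = export_summary_as_markdown("Demo Video", "Key points: ...", out_dir="/tmp")
print(p.name)  # Demo Video.md
```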
If you want to use OpenAI's API, you will first need an OpenAI API key. Getting one is straightforward and free; have a look at OpenAI's instructions to get started.
If you want to use Ollama, you need to have an Ollama server running locally or remotely. You can download Ollama for macOS, Linux, or Windows on their website. Make sure the server is reachable either on the default port 11434 or set the OLLAMA_HOST environment variable to point to your Ollama server. Also, you need to pull the models you want to use.
Note: Ollama limits the context window to 4k tokens by default. I strongly recommend adjusting it to at least 16k tokens. This can be done in the Ollama app settings.
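For example, pointing the app at a non-default server and pulling models might look like this (the host address and model names are illustrative, not recommendations):

```shell
# Tell the app where to reach Ollama (hypothetical address; defaults to port 11434)
export OLLAMA_HOST=http://192.168.1.50:11434

# Pull the models you plan to use (run these on the Ollama host)
# ollama pull llama3.1
# ollama pull nomic-embed-text
```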
- set the `OPENAI_API_KEY` environment variable, either in docker-compose.yml or by running `export OPENAI_API_KEY=<your-actual-key>` in your terminal
- execute the following command:

```shell
# pull from Docker Hub
docker-compose up -d

# or build locally
docker-compose up --build -d
```

The app will be accessible in the browser under http://localhost:8501.
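Setting the key in docker-compose.yml might look like this — a minimal sketch, assuming a service named `app` (the service and image names are assumptions, not taken from the actual compose file):

```yaml
services:
  app:
    image: youtubegpt  # hypothetical image name
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}  # forwarded from your shell
    ports:
      - "8501:8501"
```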
ℹ️ For the best user experience, you need to be in the Tier 1 usage tier, which requires a one-time payment of $5. It's worth it: you'll then have access to all models and higher rate limits.
I’m working on adding more features and am open to feedback and contributions. Don't hesitate to create an issue or a pull request. Also, if you are enjoying the app or find it useful, please consider giving the repository a star ⭐
This is a small side-project and it's easy to get started! If you want to contribute, here’s the gist to get your changes rolling:
- Fork & clone: Fork the repo and clone your fork to start.
- Pick an issue or suggest one: Choose an open issue to work on, or suggest a new feature or bug fix by creating an issue for discussion.
- Develop: Make your changes.
- Ensure your code is clean and documented. Test your changes at least exploratively, and make sure to cover edge cases.
- Commit your changes with clear, descriptive messages, using conventional commits.
- Stay updated: Keep your branch in sync with the main branch to avoid merge conflicts.
- Pull Request: Push your changes to your fork and submit a pull request (PR) to the main repository. Describe your changes and any relevant details.
- Engage: Respond to feedback on your PR to finalize your contribution.
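A conventional-commit message pairs a type prefix (`feat`, `fix`, `docs`, …) with a short description. A quick illustration in a throwaway repo (the change itself is hypothetical):

```shell
# Set up a disposable repo so the example is self-contained
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
git config user.email "dev@example.com"
git config user.name "Dev"

# A change with a conventional-commit message: type prefix + short description
echo "demo" > demo.txt && git add demo.txt
git commit -q -m "feat: add demo file"
git log -1 --pretty=%s
```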
```shell
# create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate

# install requirements
pip install -r requirements.txt

# you'll need an API key
export OPENAI_API_KEY=<your-openai-api-key>

# run chromadb (necessary for chat)
docker-compose up -d chromadb

# run app
streamlit run main.py
```

The app will be accessible in the browser under http://localhost:8501 and the ChromaDB API under http://localhost:8000/docs.
The project is built using some amazing libraries:
- YouTube Transcript API is used for fetching transcripts.
- LangChain is used to create prompts, submit them to an LLM, and process its responses.
- The UI is built using Streamlit.
- ChromaDB is used as a vector store for embeddings.
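Before transcripts land in a vector store like ChromaDB, they are typically split into overlapping chunks so each embedding covers a bounded span of text. A stdlib-only sketch of that idea (the chunk size and overlap are illustrative, not the app's actual parameters):

```python
def chunk_transcript(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split a transcript into overlapping character chunks (sizes are illustrative)."""
    chunks = []
    step = chunk_size - overlap  # how far each chunk's start advances
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last chunk reached the end of the transcript
    return chunks

# Usage: a 500-character transcript with these settings yields 3 chunks
print(len(chunk_transcript("a" * 500)))  # 3
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side.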
This project is licensed under the MIT License - see the LICENSE file for details.
