How to launch your own private ChatGPT with API access using Ollama, Llama 3.2, and FastAPI (Python)
Here's the YouTube video.
Follow the steps below to install Ollama with llama3.2 on an Ubuntu server.
```bash
sudo apt update
sudo apt upgrade
sudo apt install python3.12-venv
```

Install Ollama:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Pull the llama3.2 LLM. You can check out the full list of LLMs here.

```bash
ollama run llama3.2
```

Check that Ollama is working:
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Who are you?"
}'
```
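If you'd rather run the same check from Python, here's a minimal sketch using the requests package (assumed to be available; it's installed in a later step anyway). Setting `"stream": false` asks Ollama for a single JSON object instead of its default newline-delimited stream:

```python
# Minimal Python equivalent of the curl check above (assumes requests is installed).
# With "stream": False, Ollama returns one JSON object whose "response" field
# holds the full completion, instead of its default line-by-line stream.
import requests

r = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2", "prompt": "Who are you?", "stream": False},
    timeout=120,
)
r.raise_for_status()
print(r.json()["response"])
```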
Download the Ollama API files:

```bash
git clone https://github.com/saasscaleup/ollama-api.git
cd ollama-api
```

Install the required packages:

```bash
python3 -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
pip install fastapi uvicorn requests httpx
```
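Before starting the app, it may help to see the general shape of such a wrapper. The sketch below is not the repository's actual main.py; it's a minimal, illustrative FastAPI proxy that forwards a request body to Ollama and streams the reply back, which is the pattern the endpoints listed below follow:

```python
# Illustrative sketch only -- NOT the actual main.py from the ollama-api repo.
# The general pattern: accept a JSON body, forward it to Ollama's local API,
# and stream Ollama's newline-delimited JSON reply back to the caller.
import httpx
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

app = FastAPI()
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default address


@app.post("/api/generate")
async def generate(request: Request):
    payload = await request.json()

    async def relay():
        # Open a streaming request to Ollama and relay chunks as they arrive.
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream("POST", OLLAMA_URL, json=payload) as upstream:
                async for chunk in upstream.aiter_bytes():
                    yield chunk

    return StreamingResponse(relay(), media_type="application/x-ndjson")
```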
Run the ollama-api app:

```bash
uvicorn main:app --host 0.0.0.0 --port 3000
```

| Endpoint | curl Command | Description |
|---|---|---|
| /generate (streaming) | `curl -N -X POST http://localhost:3000/api/generate -H "Content-Type: application/json" -d '{ "model": "llama3.2", "prompt": "What is your name?" }'` | Request streamed generation |
| /generate (non-streaming) | `curl -X POST http://localhost:3000/api/generate -H "Content-Type: application/json" -d '{ "model": "llama3.2", "prompt": "What is your name?", "stream": false }'` | Request non-streamed generation |
| /models/download | `curl -X POST http://localhost:3000/api/models/download -H "Content-Type: application/json" -d '{ "llm_name": "llama3.2" }'` | Download specified model |
| /models | `curl -X GET http://localhost:3000/api/models` | List available models |
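From application code, the streaming endpoint can be consumed with the httpx package installed earlier. Here's a small sketch (endpoint and port as configured above; it assumes the wrapper relays Ollama's newline-delimited JSON chunks):

```python
# Stream a completion from the FastAPI wrapper on port 3000 (httpx installed above).
# Assumes the wrapper relays Ollama's newline-delimited JSON, where each line
# is an object carrying the next text fragment in its "response" field.
import json

import httpx

with httpx.stream(
    "POST",
    "http://localhost:3000/api/generate",
    json={"model": "llama3.2", "prompt": "What is your name?"},
    timeout=60.0,
) as response:
    for line in response.iter_lines():
        if line:
            chunk = json.loads(line)
            print(chunk.get("response", ""), end="", flush=True)
print()
```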
If you liked the tutorial and want to support my channel so I can keep releasing content that will turn you into a desirable developer with amazing cloud skills, I would really appreciate it if you:
- Subscribe to my YouTube channel and leave a comment:
- Buy me a coffee ❤️:
Thanks for your support 🙏