Ollama makes it easy to run open-source LLMs like Llama 3, Mistral, Gemma, and DeepSeek on your own server. Full privacy, no API costs, no rate limits.
Includes a REST API compatible with the OpenAI format. Pull and run models with a single command. Pairs perfectly with Open WebUI for a ChatGPT-like interface.