How to Install Ollama on a VPS
Ollama makes it easy to run large language models on your own VPS. It handles model downloading, quantization, and serving behind a simple CLI and REST API.
Requirements
- A VPS with at least 8 GB RAM (16 GB+ recommended for larger models)
- Ubuntu 22.04 or newer
- Root or sudo access
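Before installing, it is worth confirming the machine actually meets these minimums. A small pre-flight sketch, assuming a Linux VPS that exposes /proc/meminfo and /etc/os-release (both present on Ubuntu):

```shell
# Report total RAM in GB and warn if it is below the 8 GB minimum.
total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
total_gb=$((total_kb / 1024 / 1024))
echo "RAM: ${total_gb} GB"
if [ "$total_gb" -lt 8 ]; then
  echo "Warning: under 8 GB RAM; stick to small quantized models." >&2
fi

# Confirm the distribution and version:
. /etc/os-release && echo "OS: $PRETTY_NAME"
```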
Installation
Run the official install script:
curl -fsSL https://ollama.com/install.sh | sh
Verify the installation:
ollama --version
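The install script also starts the server, so a quick sanity check is to hit the local port; the root endpoint responds with a short status string when the server is up:

```shell
# Confirm the Ollama server is listening on its default port (11434):
curl -s http://localhost:11434/
```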
Pull and Run a Model
Download and run a model such as Llama 3:
ollama pull llama3
ollama run llama3
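Besides the interactive session that ollama run opens, the same command accepts a prompt argument for one-off generation, and ollama list shows what has been pulled to disk. The prompt text below is purely illustrative:

```shell
# One-off, non-interactive generation:
ollama run llama3 "Explain quantization in one sentence."

# List models already downloaded to local storage:
ollama list
```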
Run as a Service
Ollama installs a systemd service automatically. Manage it with:
sudo systemctl status ollama
sudo systemctl restart ollama
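If the service misbehaves, its output goes to the systemd journal, and you can make sure it comes back after a reboot (the install script normally enables the unit already, so the enable call is a safety net):

```shell
# Show the last 50 log lines from the Ollama unit:
sudo journalctl -u ollama -n 50 --no-pager

# Ensure the service starts at boot:
sudo systemctl enable ollama
```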
Expose the API
By default Ollama listens on localhost:11434. To bind to all interfaces, set the environment variable:
sudo systemctl edit ollama
In the override file that opens, add Environment="OLLAMA_HOST=0.0.0.0" under a [Service] section, then restart the service. Binding to all interfaces exposes the API to the whole network, so use a firewall to restrict access to trusted IPs only.
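To illustrate the result, the sketch below shows the drop-in content, a smoke test against Ollama's /api/generate endpoint, and one way to scope the port with ufw. The prompt text and the 203.0.113.0/24 address range are placeholders, not values from this guide:

```shell
# Drop-in created by `sudo systemctl edit ollama`, typically written to
# /etc/systemd/system/ollama.service.d/override.conf:
#
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
#
# After restarting, exercise the REST API; with "stream": false the
# response arrives as a single JSON object:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'

# Example ufw rule limiting port 11434 to a trusted range
# (203.0.113.0/24 is a placeholder — substitute your own network):
sudo ufw allow from 203.0.113.0/24 to any port 11434 proto tcp
```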