Why a Unified API Gateway?
Different LLM providers have different APIs, rate limits, and pricing. A unified gateway lets your applications use one consistent API format while the gateway handles routing, fallbacks, load balancing, and cost tracking across providers.
Quick Setup
```bash
pip install 'litellm[proxy]'

# Start the proxy against a single model
litellm --model ollama/llama3 --port 4000

# Or with a config file
litellm --config config.yaml --port 4000
```
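With the proxy running, a quick sanity check is to list the models it exposes through its OpenAI-compatible endpoint. A minimal sketch, assuming the default address above; if no master_key is configured yet, the api_key value is not checked and any placeholder works:

```python
# Sanity check: list the models the gateway currently serves.
from openai import OpenAI

# Placeholder key; only enforced once a master_key / virtual keys are configured.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-placeholder")

for model in client.models.list():
    print(model.id)
```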
Configuration
```yaml
# config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: local
    litellm_params:
      model: ollama/llama3
      api_base: http://localhost:11434

general_settings:
  master_key: sk-your-master-key

router_settings:
  routing_strategy: simple-shuffle
  num_retries: 2
  fallbacks:
    - gpt-4: [claude, local]
```
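The same model_list shape also works in-process: if you would rather not run a separate gateway, litellm's Router class takes the list directly and applies the same retries and fallbacks. A minimal sketch, assuming OPENAI_API_KEY is set in the environment and Ollama is running locally:

```python
# In-process routing with litellm's Router, mirroring the proxy config above.
import os
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4",
            "litellm_params": {"model": "openai/gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]},
        },
        {
            "model_name": "local",
            "litellm_params": {"model": "ollama/llama3", "api_base": "http://localhost:11434"},
        },
    ],
    fallbacks=[{"gpt-4": ["local"]}],  # if gpt-4 fails, retry on the local model
    num_retries=2,
)

response = router.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```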
Usage
```python
# Use the proxy exactly like the OpenAI API
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-your-master-key",
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# If OpenAI fails, the proxy automatically falls back to Claude, then local Ollama
```
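Streaming passes through the gateway unchanged, since the proxy speaks the same wire format. A short sketch reusing the client above:

```python
# Stream tokens through the gateway exactly as with the OpenAI API directly.
stream = client.chat.completions.create(
    model="claude",
    messages=[{"role": "user", "content": "Write a haiku about gateways."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```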
Virtual Keys and Budgets
```bash
# Create team-specific API keys with budgets (budget resets every 30 days)
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-your-master-key" \
  -H "Content-Type: application/json" \
  -d '{
    "models": ["gpt-4", "claude"],
    "max_budget": 100.00,
    "budget_duration": "30d",
    "metadata": {"team": "engineering"}
  }'
```
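The JSON response includes the generated key in a `key` field; hand that to the team, and they use it in place of the master key. Requests are then restricted to the key's allowed models and counted against its budget. A sketch, with the actual key value elided:

```python
# Call the gateway with a team's virtual key instead of the master key.
from openai import OpenAI

team_client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-...",  # the "key" value returned by /key/generate
)

response = team_client.chat.completions.create(
    model="gpt-4",  # must be in the key's allowed "models" list
    messages=[{"role": "user", "content": "Hello from engineering!"}],
)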
Features
- 100+ LLM provider support
- Automatic fallbacks and retries
- Spend tracking and budget limits (see the spend-check sketch below)
- Virtual API keys per team/project
- Request logging and analytics
- Rate limiting per key
- Streaming support
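To check how a key is tracking against its budget, the proxy exposes key-lookup endpoints. A sketch assuming LiteLLM's documented /key/info route and the master key from the config above; exact response field names may vary by version:

```python
# Inspect a virtual key's spend vs. budget via the proxy's management API.
import requests

resp = requests.get(
    "http://localhost:4000/key/info",
    headers={"Authorization": "Bearer sk-your-master-key"},
    params={"key": "sk-..."},  # the virtual key to inspect
)
info = resp.json().get("info", {})
print(f"spent {info.get('spend')} of {info.get('max_budget')}")
```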