Getting Started
Quick Start
Get started with llms.py in minutes
1. Set API Keys
Set the API keys for the providers you want to use:
```sh
export OPENROUTER_API_KEY="..."
```

API keys used for the providers enabled in llms.json:
| Provider | Environment Variable | Description |
|---|---|---|
| openrouter | OPENROUTER_API_KEY | OpenRouter |
| google | GEMINI_API_KEY | Gemini (Google) |
| anthropic | ANTHROPIC_API_KEY | Claude (Anthropic) |
| openai | OPENAI_API_KEY | OpenAI API key |
| groq | GROQ_API_KEY | Groq API key |
| z.ai | ZAI_API_KEY | Z.ai API key |
| qwen | DASHSCOPE_API_KEY | Qwen (Alibaba) key |
| xai | GROK_API_KEY | Grok (X.AI) API key |
| nvidia | NVIDIA_API_KEY | NVidia NIM API key |
| github | GITHUB_TOKEN | GitHub Copilot Models token |
| mistral | MISTRAL_API_KEY | Mistral API key |
| deepseek | DEEPSEEK_API_KEY | DeepSeek API key |
| chutes | CHUTES_API_KEY | chutes.ai OSS LLM and Image Models |
| huggingface | HF_TOKEN | Hugging Face API token |
| fireworks | FIREWORKS_API_KEY | fireworks.ai OSS Models |
| codestral | CODESTRAL_API_KEY | Codestral API key |
| lmstudio | LMSTUDIO_API_KEY | Any API Key enables LM Studio |
| ollama | N/A | No API key required |
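Once keys are exported, a quick way to see which providers your shell has credentials for is to check the environment variables from the table above. A minimal Python sketch (`ENV_KEYS` and `configured_providers` are illustrative helpers, not part of llms.py; only a subset of providers is shown):

```python
import os

# Provider -> environment variable, mirroring a subset of the table above
ENV_KEYS = {
    "openrouter": "OPENROUTER_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
}

def configured_providers(env=os.environ):
    """Return the providers whose API key variable is set and non-empty."""
    return [p for p, var in ENV_KEYS.items() if env.get(var)]

print(configured_providers())
```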
2. Start the Server
Launch the web UI and API server:
```sh
llms --serve 8000
```

Start with verbose logging:

```sh
# Verbose logging
llms --serve 8000 --verbose

# Debug and verbose logging
DEBUG=1 llms --serve 8000 --verbose
```

This starts:
- Web UI at http://localhost:8000
- OpenAI-compatible API at http://localhost:8000/v1/chat/completions
Enable Providers
You can enable/disable providers at runtime in the UI or CLI:
```sh
llms --enable openrouter_free google_free groq
llms --disable openai anthropic grok
```

Using the Web UI
Once the server is running, open http://localhost:8000 in your browser.
Features
- Chat Interface: ChatGPT-like interface for conversations
- Model Selection: Choose from all enabled providers and models
- System Prompts: Access 200+ professional system prompts
- File Attachments: Upload images, audio, and documents
- Dark Mode: Toggle between light and dark themes
- Analytics: Track costs, tokens, and usage
- Search: Find past conversations easily
Keyboard Shortcuts
- `Enter` - Send message (or `Ctrl/Cmd + Enter` for a new line)
- `/` - Focus search
- `Esc` - Close dialogs
OpenAI-Compatible API
Use the server as an OpenAI-compatible endpoint:
```sh
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4-fast",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

This works with any OpenAI-compatible client library.
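Because the endpoint speaks the OpenAI wire format, the same request can be made from Python with nothing but the standard library. A sketch, assuming the server is running on the port and model from the example above (`chat_request` is an illustrative helper):

```python
import json
import urllib.request

def chat_request(model, prompt, base_url="http://localhost:8000/v1"):
    """Build an OpenAI-style chat completion request for the local server."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# With the server running:
# with urllib.request.urlopen(chat_request("grok-4-fast", "Hello!")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```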
Provider Management
List Providers
```sh
# List all providers
llms --list
llms ls

# List specific providers
llms ls groq anthropic
```

Enable/Disable Providers
```sh
# Disable providers
llms --disable openrouter_free codestral

# Enable providers
llms --enable openai grok
```

Providers are invoked in the order they're defined in llms.json. If one fails, the request automatically falls through to the next available provider.
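The failover behaviour described above can be sketched as a simple loop over providers in llms.json order. This is an illustrative model of the idea, not the actual llms.py implementation:

```python
def complete_with_failover(providers, request):
    """Try each (name, call) provider in order; return the first success.

    `providers` is a list of (name, callable) pairs in llms.json order.
    """
    errors = {}
    for name, call in providers:
        try:
            return call(request)
        except Exception as exc:  # a failed provider falls through to the next
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {errors}")
```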
Check Provider Status
Test provider connectivity and response times:
```sh
# Check all models for a provider
llms --check groq

# Check specific models
llms --check groq kimi-k2 llama4:400b
```

This helps verify providers are configured correctly and shows their response times.
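Conceptually, a check like this times one test request per model and records whether it succeeded. A rough sketch of that idea (`check_model` is illustrative, not the llms.py internals):

```python
import time

def check_model(call, model):
    """Time one test call to a model; return (ok, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        call(model)
        ok = True
    except Exception:
        ok = False
    return ok, time.perf_counter() - start
```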
See CLI Docs for more details.