llms.py
Getting Started

Quick Start

Get started with llms.py in minutes

1. Set API Keys

Set the API keys for the providers you want to use:

export OPENROUTER_API_KEY="..."

API keys are read from environment variables for the providers enabled in llms.json:

Provider    | Environment Variable | Description
------------|----------------------|-----------------------------------
openrouter  | OPENROUTER_API_KEY   | OpenRouter
google      | GEMINI_API_KEY       | Gemini (Google)
anthropic   | ANTHROPIC_API_KEY    | Claude (Anthropic)
openai      | OPENAI_API_KEY       | OpenAI API key
groq        | GROQ_API_KEY         | Groq API key
z.ai        | ZAI_API_KEY          | Z.ai API key
qwen        | DASHSCOPE_API_KEY    | Qwen (Alibaba) key
xai         | GROK_API_KEY         | Grok (X.AI) API key
nvidia      | NVIDIA_API_KEY       | NVIDIA NIM API key
github      | GITHUB_TOKEN         | GitHub Copilot Models token
mistral     | MISTRAL_API_KEY      | Mistral API key
deepseek    | DEEPSEEK_API_KEY     | DeepSeek API key
chutes      | CHUTES_API_KEY       | chutes.ai OSS LLM and Image Models
huggingface | HF_TOKEN             | Hugging Face API token
fireworks   | FIREWORKS_API_KEY    | fireworks.ai OSS Models
codestral   | CODESTRAL_API_KEY    | Codestral API key
lmstudio    | LMSTUDIO_API_KEY     | Any API key enables LM Studio
ollama      | N/A                  | No API key required
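As a quick sanity check before starting the server, you can verify which keys are present in your environment. This is a minimal sketch using the provider-to-variable mapping from the table above (trimmed to a few providers for brevity); `configured_providers` is a hypothetical helper, not part of llms.py:

```python
import os

# Provider -> environment variable, taken from the table above
# (abbreviated; ollama needs no key).
PROVIDER_ENV_VARS = {
    "openrouter": "OPENROUTER_API_KEY",
    "google": "GEMINI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
    "ollama": None,  # no API key required
}

def configured_providers(env=os.environ):
    """Return the providers whose key is set (or that need no key)."""
    return [name for name, var in PROVIDER_ENV_VARS.items()
            if var is None or env.get(var)]
```

Running this after exporting your keys shows at a glance which providers will be usable.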

2. Start the Server

Launch the web UI and API server:

llms --serve 8000

Start with verbose logging:

# Verbose logging
llms --serve 8000 --verbose

# Debug and Verbose logging
DEBUG=1 llms --serve 8000 --verbose

This starts:

  • Web UI at http://localhost:8000
  • OpenAI-compatible API at http://localhost:8000/v1/chat/completions

Enable Providers

You can enable or disable providers at runtime from the Web UI or the CLI:

llms --enable openrouter_free google_free groq
llms --disable openai anthropic grok

Using the Web UI

Once the server is running, open http://localhost:8000 in your browser.

Features

  • Chat Interface: ChatGPT-like interface for conversations
  • Model Selection: Choose from all enabled providers and models
  • System Prompts: Access 200+ professional system prompts
  • File Attachments: Upload images, audio, and documents
  • Dark Mode: Toggle between light and dark themes
  • Analytics: Track costs, tokens, and usage
  • Search: Find past conversations easily

Keyboard Shortcuts

  • Enter - Send message (or Ctrl/Cmd + Enter for new line)
  • / - Focus search
  • Esc - Close dialogs

OpenAI-Compatible API

Use the server as an OpenAI-compatible endpoint:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4-fast",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

This works with any OpenAI-compatible client library.
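For example, the same request can be made from Python with only the standard library. The `build_request` and `chat` helpers below are illustrative, and assume the server is running locally on port 8000:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumes `llms --serve 8000`

def build_request(model, messages, base_url=BASE_URL):
    """Build the POST request for the /chat/completions endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(model, messages):
    """Send a chat completion request and return the reply text."""
    with urllib.request.urlopen(build_request(model, messages)) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

A dedicated OpenAI client library works the same way: point its base URL at http://localhost:8000/v1 and use any enabled model name.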

Provider Management

List Providers

# List all providers
llms --list
llms ls

# List specific providers
llms ls groq anthropic

Enable/Disable Providers

# Disable providers
llms --disable openrouter_free codestral

# Enable providers
llms --enable openai grok

Providers are invoked in the order they're defined in llms.json. If a provider fails, the request automatically falls back to the next available one.
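That failover behavior can be sketched as follows; `complete_with_failover` and the provider `call` functions are hypothetical placeholders illustrating the pattern, not the actual implementation:

```python
def complete_with_failover(providers, request):
    """Try each (name, call) pair in order; return the first success."""
    errors = {}
    for name, call in providers:
        try:
            return call(request)
        except Exception as exc:  # a real client would narrow this
            errors[name] = exc  # record the failure and try the next provider
    raise RuntimeError(f"all providers failed: {errors}")
```

Because the loop follows the order in llms.json, putting free-tier providers first means paid providers are only hit when the free ones are down or rate-limited.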

Check Provider Status

Test provider connectivity and response times:

# Check all models for a provider
llms --check groq

# Check specific models
llms --check groq kimi-k2 llama4:400b

This helps verify providers are configured correctly and shows their response times.
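Conceptually, a check like this amounts to timing a round trip to each model. A minimal sketch (the `timed` helper is illustrative; the callable stands in for a model request):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```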

See CLI Docs for more details.

Next Steps