llms.py
Getting Started

Quick Start

Get started with llms.py in minutes

1. Set API Keys

Set the API keys for the providers you want to use:

export OPENROUTER_API_KEY="..."

API keys are read from environment variables for the providers enabled in llms.json:

Provider    | Environment Variable | Description
------------|----------------------|-----------------------------------
openrouter  | OPENROUTER_API_KEY   | OpenRouter
google      | GEMINI_API_KEY       | Gemini (Google)
anthropic   | ANTHROPIC_API_KEY    | Claude (Anthropic)
openai      | OPENAI_API_KEY       | OpenAI API key
groq        | GROQ_API_KEY         | Groq API key
z.ai        | ZAI_API_KEY          | Z.ai API key
qwen        | DASHSCOPE_API_KEY    | Qwen (Alibaba) key
xai         | GROK_API_KEY         | Grok (X.AI) API key
nvidia      | NVIDIA_API_KEY       | NVIDIA NIM API key
github      | GITHUB_TOKEN         | GitHub Copilot Models token
mistral     | MISTRAL_API_KEY      | Mistral API key
deepseek    | DEEPSEEK_API_KEY     | DeepSeek API key
chutes      | CHUTES_API_KEY       | chutes.ai OSS LLM and Image Models
huggingface | HF_TOKEN             | Hugging Face API token
fireworks   | FIREWORKS_API_KEY    | fireworks.ai OSS Models
codestral   | CODESTRAL_API_KEY    | Codestral API key
lmstudio    | LMSTUDIO_API_KEY     | Any API key enables LM Studio
ollama      | N/A                  | No API key required
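As a quick sanity check before starting the server, you can verify which keys are present in your environment. This is a minimal sketch using the provider-to-variable mapping from the table above (trimmed to a few providers for brevity); `configured_providers` is a hypothetical helper, not part of llms.py:

```python
import os

# Provider -> environment variable, taken from the table above
# (abbreviated; ollama needs no key).
PROVIDER_ENV_VARS = {
    "openrouter": "OPENROUTER_API_KEY",
    "google": "GEMINI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
    "ollama": None,  # no API key required
}

def configured_providers(env=os.environ):
    """Return the providers whose key is set (or that need no key)."""
    return [name for name, var in PROVIDER_ENV_VARS.items()
            if var is None or env.get(var)]
```

Running this after exporting your keys shows at a glance which providers will be usable.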

2. Start the Server

Launch the web UI and API server:

llms --serve 8000

Start with verbose logging:

# Verbose logging
llms --serve 8000 --verbose

# Debug and Verbose logging
DEBUG=1 llms --serve 8000 --verbose

This starts:

  • Web UI at http://localhost:8000
  • OpenAI-compatible API at http://localhost:8000/v1/chat/completions

Enable Providers

You can enable or disable providers at runtime from the Web UI or the CLI:

llms --enable openrouter_free google_free groq
llms --disable openai anthropic grok

Using the Web UI

Once the server is running, open http://localhost:8000 in your browser.

Features

  • Chat Interface: ChatGPT-like interface for conversations
  • Model Selection: Choose from all enabled providers and models
  • System Prompts: Access 200+ professional system prompts
  • File Attachments: Upload images, audio, and documents
  • Dark Mode: Toggle between light and dark themes
  • Analytics: Track costs, tokens, and usage
  • Search: Find past conversations easily

Keyboard Shortcuts

  • Enter - Send message (or Ctrl/Cmd + Enter for new line)
  • / - Focus search
  • Esc - Close dialogs

OpenAI-Compatible API

Use the server as an OpenAI-compatible endpoint:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4-fast",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

This works with any OpenAI-compatible client library.
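For example, the same request can be made from Python with only the standard library. The `build_request` and `chat` helpers below are illustrative, and assume the server is running locally on port 8000:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumes `llms --serve 8000`

def build_request(model, messages, base_url=BASE_URL):
    """Build the POST request for the /chat/completions endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(model, messages):
    """Send a chat completion request and return the reply text."""
    with urllib.request.urlopen(build_request(model, messages)) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

A dedicated OpenAI client library works the same way: point its base URL at http://localhost:8000/v1 and use any enabled model name.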

Provider Management

List Providers

# List all providers
llms --list
llms ls

# List specific providers
llms ls groq anthropic

Enable/Disable Providers

# Disable providers
llms --disable openrouter_free codestral

# Enable providers
llms --enable openai grok

Providers are invoked in the order they're defined in llms.json. If a provider fails, the request automatically falls back to the next available one.
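That failover behavior can be sketched as follows; `complete_with_failover` and the provider `call` functions are hypothetical placeholders illustrating the pattern, not the actual implementation:

```python
def complete_with_failover(providers, request):
    """Try each (name, call) pair in order; return the first success."""
    errors = {}
    for name, call in providers:
        try:
            return call(request)
        except Exception as exc:  # a real client would narrow this
            errors[name] = exc  # record the failure and try the next provider
    raise RuntimeError(f"all providers failed: {errors}")
```

Because the loop follows the order in llms.json, putting free-tier providers first means paid providers are only hit when the free ones are down or rate-limited.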

Check Provider Status

Test provider connectivity and response times:

# Check all models for a provider
llms --check groq

# Check specific models
llms --check groq kimi-k2 llama4:400b

This helps verify providers are configured correctly and shows their response times.
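Conceptually, a check like this amounts to timing a round trip to each model. A minimal sketch (the `timed` helper is illustrative; the callable stands in for a model request):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```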

See CLI Docs for more details.

Next Steps