llms.py

Lightweight OpenAI-compatible CLI and server gateway for multiple LLMs

llms.py is a super lightweight CLI tool and OpenAI-compatible server that acts as a configurable gateway across multiple Large Language Model (LLM) providers.

Quick Start

Install

pip install llms-py

Set API Keys

export OPENROUTER_API_KEY="sk-or-..."
export GROQ_API_KEY="gsk_..."

Start Server

llms --serve 8000

Access the UI at http://localhost:8000
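
Because the server speaks the OpenAI API, any OpenAI-compatible client can point at it. Below is a minimal sketch using the official openai Python package; the /v1 base path follows the OpenAI convention, and the model id and dummy API key are illustrative assumptions, so substitute whatever your configured providers actually serve.

# Minimal sketch: any OpenAI client can target the gateway.
# The base path, api_key value, and model id are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
resp = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # hypothetical model id, e.g. served via Groq
    messages=[{"role": "user", "content": "Hello from llms.py!"}],
)
print(resp.choices[0].message.content)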

Key Features

  • 🪶 Ultra-Lightweight: Single file with just one aiohttp dependency
  • 🌐 Multi-Provider Support: OpenRouter, Ollama, OpenAI, Anthropic, Google, Grok, Groq, Qwen, and more
  • 🎯 Intelligent Routing: Automatic failover between providers (see the sketch after this list)
  • 💻 Web UI: ChatGPT-like interface with dark mode
  • 📊 Built-in Analytics: Track costs, tokens, and usage
  • 🔒 Privacy First: All data stored locally in browser
  • 🐳 Docker Ready: Pre-built images available

Use Cases

For Developers

  • API Gateway: Centralize all LLM provider access through one endpoint
  • Cost Management: Automatically route to the cheapest available provider (a conceptual sketch follows this list)
  • Reliability: Built-in failover ensures high availability
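
Cost-aware routing reduces to ranking the providers that can serve a request by price and trying them in that order. The sketch below only illustrates the idea; the per-token prices are made-up placeholders, not real rates.

# Conceptual cost-based ranking, with hypothetical prices.
PRICE_PER_MTOK = {"groq": 0.59, "openrouter": 0.79}  # placeholder $/M output tokens

def rank_by_cost(providers):
    # Cheapest first; unknown providers sort last.
    return sorted(providers, key=lambda p: PRICE_PER_MTOK.get(p, float("inf")))

print(rank_by_cost(["openrouter", "groq"]))  # -> ['groq', 'openrouter']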

For ComfyUI Users

  • Hybrid Workflows: Combine local Ollama models with cloud APIs (see the sketch after this list)
  • Zero Setup: No dependency management headaches
  • Provider Flexibility: Switch providers without changing your workflow
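
A hybrid workflow is just two model ids sent to the same endpoint, with the gateway choosing the backend for each. A minimal sketch, assuming one locally served Ollama model and one cloud model are enabled; both model ids are illustrative.

# One endpoint, two backends; model ids below are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

for model in ("llama3.2:3b",    # e.g. a local Ollama model
              "gpt-4o-mini"):   # e.g. a cloud model via OpenAI
    r = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "One-line summary of hybrid routing?"}],
    )
    print(model, "->", r.choices[0].message.content)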

For Enterprises

  • Vendor Independence: Avoid lock-in to any single LLM provider
  • Scalability: Distribute load across multiple providers
  • Budget Control: Intelligent routing to optimize costs

Next Steps