Analytics & Monitoring
Track costs, tokens, and usage across all providers
Overview
The analytics system provides:
- Real-time Metrics: See costs and tokens for every request
- Historical Data: Track usage over time
- Provider Breakdown: Compare costs and performance per provider
- Activity Logs: Detailed request history
Token Metrics
Per-Message Metrics
Every message shows its token count:

Displayed for each message:
- Input tokens (user message)
- Output tokens (AI response)
- Total tokens
Thread-Level Metrics
At the bottom of each conversation:
- Total cost
- Total tokens (input + output)
- Number of requests
- Total response time
- Average tokens per request
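The thread-level totals above can be derived by folding over per-request records. A minimal sketch, assuming records with the field names used in the analytics export format (`inputTokens`, `outputTokens`, `cost`, `responseTime`); this is illustrative, not the tool's internal API:

```python
# Aggregate per-request records into thread-level metrics.
# Field names mirror the analytics export format.
def thread_metrics(requests):
    total_tokens = sum(r["inputTokens"] + r["outputTokens"] for r in requests)
    n = len(requests)
    return {
        "total_cost": sum(r["cost"] for r in requests),
        "total_tokens": total_tokens,
        "requests": n,
        "total_response_time": sum(r["responseTime"] for r in requests),
        "avg_tokens_per_request": total_tokens / n if n else 0,
    }

requests = [
    {"inputTokens": 10, "outputTokens": 150, "cost": 0.0024, "responseTime": 1.2},
    {"inputTokens": 20, "outputTokens": 300, "cost": 0.0048, "responseTime": 2.0},
]
print(thread_metrics(requests))
```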
Model Selector Metrics
The model selector displays pricing for each model:
- Input cost per 1M tokens
- Output cost per 1M tokens
- Quick comparison between models
Cost Analytics
Monthly Cost Overview
Track your spending day by day:

Features:
- Daily cost breakdown
- Total monthly spend
- Expandable details per day
- Provider and model breakdown
Cost Breakdown
Click any day to see:
- Cost per provider
- Cost per model
- Number of requests
- Token usage
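The per-day breakdown amounts to grouping that day's log entries by provider and by model. A hypothetical sketch using the export format's field names:

```python
from collections import defaultdict

# Group one day's log entries into per-provider and per-model cost totals.
def cost_breakdown(day_logs):
    by_provider = defaultdict(float)
    by_model = defaultdict(float)
    for entry in day_logs:
        by_provider[entry["provider"]] += entry["cost"]
        by_model[entry["model"]] += entry["cost"]
    return dict(by_provider), dict(by_model)

day_logs = [
    {"provider": "openai", "model": "gpt-4o", "cost": 0.01},
    {"provider": "groq", "model": "kimi-k2", "cost": 0.0},
    {"provider": "openai", "model": "gpt-4o", "cost": 0.02},
]
providers, models = cost_breakdown(day_logs)
```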
Token Analytics
Monthly Token Usage
Visualize token consumption over time:

Shows:
- Daily token usage
- Input vs output tokens
- Total monthly tokens
- Trends over time
Token Breakdown
Expandable details show:
- Tokens per provider
- Tokens per model
- Average tokens per request
- Input/output ratio
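The same grouping idea yields the token breakdown, including the input/output ratio. A sketch under the same assumed record shape:

```python
from collections import defaultdict

# Per-provider token stats: totals, average per request, input/output ratio.
def token_breakdown(logs):
    stats = defaultdict(lambda: {"input": 0, "output": 0, "requests": 0})
    for e in logs:
        s = stats[e["provider"]]
        s["input"] += e["inputTokens"]
        s["output"] += e["outputTokens"]
        s["requests"] += 1
    for s in stats.values():
        s["avg_per_request"] = (s["input"] + s["output"]) / s["requests"]
        s["io_ratio"] = s["input"] / s["output"] if s["output"] else None
    return dict(stats)

logs = [
    {"provider": "groq", "inputTokens": 10, "outputTokens": 150},
    {"provider": "groq", "inputTokens": 30, "outputTokens": 50},
]
stats = token_breakdown(logs)
```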
Activity Logs
Request History
Detailed log of all AI requests:

Each entry includes:
- Model: Which model was used
- Provider: Which provider served the request
- Prompt: Partial preview of the prompt
- Input Tokens: Tokens in the request
- Output Tokens: Tokens in the response
- Cost: Calculated cost for the request
- Response Time: How long the request took
- Speed: Tokens per second
- Timestamp: When the request was made
Filtering & Search
- Search by prompt content
- Filter by date range
- Filter by provider
- Filter by model
- Sort by any column
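The filters above compose naturally over exported log entries. A minimal sketch (the `filter_logs` helper and its parameters are hypothetical, not part of the tool):

```python
from datetime import datetime

# Filter exported log entries by provider, model, date range, and prompt text.
def filter_logs(logs, provider=None, model=None, since=None, until=None, search=None):
    out = []
    for e in logs:
        ts = datetime.fromisoformat(e["timestamp"].replace("Z", "+00:00"))
        if provider and e["provider"] != provider:
            continue
        if model and e["model"] != model:
            continue
        if since and ts < since:
            continue
        if until and ts > until:
            continue
        if search and search.lower() not in e["prompt"].lower():
            continue
        out.append(e)
    return out

logs = [
    {"timestamp": "2025-11-15T10:30:00Z", "provider": "groq",
     "model": "kimi-k2", "prompt": "What is Rust?"},
    {"timestamp": "2025-11-16T09:00:00Z", "provider": "openai",
     "model": "gpt-4o", "prompt": "Summarize this"},
]
```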
Data Storage
Separate from Chat History
Analytics data is stored separately from chat conversations:
- Clearing chat history preserves analytics
- Deleting analytics preserves chat history
- Independent export/import
Export Analytics
Hold ALT while clicking the Export button to export analytics data:
```json
{
  "logs": [
    {
      "timestamp": "2025-11-15T10:30:00Z",
      "model": "grok-4-fast",
      "provider": "grok",
      "prompt": "What is...",
      "inputTokens": 10,
      "outputTokens": 150,
      "cost": 0.0024,
      "responseTime": 1.2,
      "speed": 125
    }
  ]
}
```
Pricing Configuration
Pricing is configured in llms.json per provider:
```json
{
  "providers": {
    "openai": {
      "pricing": {
        "gpt-5": {
          "input": 2.50,
          "output": 10.00
        },
        "gpt-4o": {
          "input": 2.50,
          "output": 10.00
        }
      },
      "default_pricing": {
        "input": 5.00,
        "output": 15.00
      }
    }
  }
}
```
Pricing is in dollars per 1M tokens.
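Since prices are per 1M tokens, a request's cost is each token count scaled by its rate. A sketch using the gpt-4o rates from the example config above:

```python
# Cost of one request given per-1M-token pricing (in dollars).
def request_cost(input_tokens, output_tokens, pricing):
    return (input_tokens / 1_000_000 * pricing["input"]
            + output_tokens / 1_000_000 * pricing["output"])

# gpt-4o from the example config: $2.50 in / $10.00 out per 1M tokens
cost = request_cost(10_000, 2_000, {"input": 2.50, "output": 10.00})
```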
Free Models
Models from free providers show $0.00 cost:
- OpenRouter free models
- Groq free models
- Google free tier models
- Local Ollama models
Performance Metrics
Response Time
Track how fast providers respond:
- Per-request response time
- Average response time per provider
- Identify slow providers or models
Speed (Tokens/Second)
Measure generation speed:
- Output tokens per second
- Compare model speeds
- Optimize for faster models when needed
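Speed here is simply output tokens divided by response time; the sample export entry (150 output tokens in 1.2 s) works out to the 125 tok/s it reports:

```python
# Generation speed in output tokens per second.
def tokens_per_second(output_tokens, response_time_s):
    return output_tokens / response_time_s if response_time_s else 0.0

speed = tokens_per_second(150, 1.2)  # sample log entry values
```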
Provider Checking
Test provider connectivity and performance:
```sh
# Check all models for a provider
llms --check groq

# Check specific models
llms --check groq kimi-k2 llama4:400b
```
Shows:
- ✅ Working models
- ❌ Failed models
- Response times
- Provider availability
Automated Checks
GitHub Actions runs automated provider checks:
- Tests all configured providers
- Tests all models
- Publishes results to /checks/latest.txt
- Runs on schedule
Use Cases
Cost Optimization
- Identify expensive models
- Compare provider costs
- Route to cheaper alternatives
- Set budget limits
Performance Monitoring
- Find fastest providers
- Identify slow models
- Optimize for speed vs cost
- Detect provider issues
Usage Analysis
- Track which models you use most
- See token consumption patterns
- Identify heavy usage periods
- Plan capacity needs
Debugging
- Review failed requests
- Check response times
- Verify token counts
- Audit provider usage
Best Practices
Cost Management
- Use Free Tiers First: Enable free providers first in llms.json
- Monitor Daily Spend: Check analytics regularly
- Set Budget Alerts: Keep track of monthly costs
- Choose Appropriate Models: Use cheaper models for simple tasks
Performance Optimization
- Check Provider Status: Run --check periodically
- Monitor Response Times: Identify slow providers
- Balance Speed vs Cost: Choose based on needs
- Use Local Models: For privacy-sensitive or high-volume tasks
Data Hygiene
- Export Regularly: Backup analytics data
- Clean Old Logs: Remove outdated entries
- Separate Environments: Use different ports for different use cases
Privacy
Analytics data is stored locally:
- ✅ No external tracking
- ✅ No data sent to third parties
- ✅ Full control over data
- ✅ Easy to delete or export