For the complete documentation index, see llms.txt. Markdown versions of all docs pages are available by appending .md to any docs URL.
LLM consumption
Consume services from LLM providers.
About
Providers
Model aliasing
Manage API keys
Manage API keys for LLM provider authentication.
Virtual key management
Issue API keys, also known as virtual keys, with per-key token budgets and cost tracking.
Load balancing
Distribute requests across multiple LLM providers automatically using the Power of Two Choices (P2C) algorithm.
Budget and spend limits
Control LLM spending by enforcing token budget limits per API key or user.
Model failover
Priority-based failover across LLM providers (automatic fallback when models fail or are …
Content-based routing
Route requests to different LLM backends based on request body content, such as the requested model …
Set up prompt guards
Rate limiting for LLMs
Control LLM costs with token-based and request-based rate limits.
Enrich prompts
Prompt templates
Use static and dynamic prompt templates to customize LLM requests.
CEL-based RBAC
Call functions
Guardrail webhook API
View metrics and logs
Content safety and PII protection
Protect LLM requests and responses from sensitive data exposure and harmful content using layered …
Track LLM costs
Track and monitor LLM costs per request using token usage metrics.