LLM consumption

Consume services from LLM providers.

About

Providers

Model aliasing

Manage API keys

Manage API keys for LLM provider authentication.

Virtual key management

Issue API keys, also known as virtual keys, with per-key token budgets and cost tracking.

Load balancing

Automatically distribute requests across multiple LLM providers using the Power of Two Choices (P2C) algorithm.
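
As a rough illustration of the Power of Two Choices idea, the sketch below samples two backends at random and sends the request to the one with fewer in-flight requests. The `Backend` class, its `in_flight` counter, and the `pick_p2c` helper are hypothetical names used only for this example. They are not part of any product API, and the real load signal may differ.

```python
import random
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical stand-in for one LLM provider backend."""
    name: str
    in_flight: int = 0  # assumed load signal: requests currently outstanding


def pick_p2c(backends, rng=random):
    """Sample two distinct backends at random and return the less loaded one."""
    if not backends:
        raise ValueError("no backends configured")
    if len(backends) == 1:
        return backends[0]
    first, second = rng.sample(backends, 2)
    return first if first.in_flight <= second.in_flight else second
```

Comparing only two random candidates keeps each selection cheap while still steering traffic away from the busiest backends.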

Budget and spend limits

Control LLM spending by enforcing token budget limits per API key or user.
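
A minimal sketch of what per-key budget enforcement could look like, assuming a simple in-memory counter. `KeyBudget` and `try_spend` are hypothetical names for illustration and do not reflect how any particular gateway stores or resets budgets.

```python
from dataclasses import dataclass


@dataclass
class KeyBudget:
    """Hypothetical token budget tracked for one API key or user."""
    limit_tokens: int
    used_tokens: int = 0

    def try_spend(self, tokens: int) -> bool:
        """Record usage if it fits within the limit; otherwise reject it."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used_tokens + tokens > self.limit_tokens:
            return False  # over budget: the request would be denied
        self.used_tokens += tokens
        return True
```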

Model failover

Priority-based failover across LLM providers (automatic fallback when models fail or are …
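
Priority-based failover can be pictured as trying providers in priority order and falling back to the next one when a call raises an error. The sketch below assumes each provider is a plain Python callable paired with a numeric priority, where a lower number is tried first. `call_with_failover` is a hypothetical helper, not an actual gateway function.

```python
def call_with_failover(providers, request):
    """Try providers in ascending priority order and return the first success.

    providers: list of (priority, callable) pairs (hypothetical shape).
    """
    errors = []
    for _priority, call in sorted(providers, key=lambda pair: pair[0]):
        try:
            return call(request)
        except Exception as exc:  # treat any error as a reason to fall back
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} providers failed")
```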

Content-based routing

Route requests to different LLM backends based on request body content, such as the requested model …

Set up prompt guards

Rate limiting for LLMs

Control LLM costs with token-based and request-based rate limits.

Enrich prompts

Prompt templates

Use static and dynamic prompt templates to customize LLM requests.

CEL-based RBAC

Call functions

Guardrail webhook API

View metrics and logs

Content safety and PII protection

Protect LLM requests and responses from sensitive data exposure and harmful content using layered …

Track LLM costs

Track and monitor LLM costs per request using token usage metrics.
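
To show how per-request cost follows from token usage, the sketch below multiplies input and output token counts by per-million-token prices. The function name and the prices in the usage note are made up for illustration. Actual prices depend on the provider and model.

```python
from decimal import Decimal


def request_cost(input_tokens, output_tokens,
                 input_price_per_million, output_price_per_million):
    """Return the cost of one request given token counts and per-million prices."""
    million = Decimal(1_000_000)
    total = (Decimal(input_tokens) * input_price_per_million
             + Decimal(output_tokens) * output_price_per_million)
    return total / million
```

For example, with assumed prices of 3 and 15 currency units per million input and output tokens, a request with 1,000 input tokens and 500 output tokens costs 0.0105 units.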
