Observability

Every interaction with a Large Language Model (LLM) generates crucial data that can be monitored, analyzed, and optimized.

Our LLM Gateway provides real-time tracking, security, and performance insights, acting as a centralized observability layer to streamline LLM interactions.

By routing all requests through the LLM Gateway, you gain:

  • Full-stack observability by capturing every LLM request before it reaches the provider
  • Granular control over who can access what, when, and how
  • Cost efficiency through detailed usage tracking and budgeting
  • Real-time insights into token consumption, latency, and error rates

Key metrics tracked

For every LLM request, the gateway logs telemetry data, including:

  • LLM Provider - Identify the AI service in use (e.g., OpenAI, Anthropic, Mistral)
  • Model Version - Track which model is processing requests
  • Prompt & Response Data - Log inputs and outputs for debugging & quality control
  • Token Usage Metrics - Measure input/output/reasoning token consumption
  • User Identity - Associate API usage with specific users for accountability
  • API Key Tracking - Monitor & secure API access to prevent unauthorized use
  • Cost Tracking - Compute costs per request, per model, per user
  • Ecological Impact - Estimate energy consumption and CO2 emissions
  • Budget Consumption - Track spending against defined budget limits
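To illustrate how these metrics can be combined once they have been collected, here is a minimal Python sketch that aggregates request counts, tokens, and cost per user and model. The field names (`user`, `model`, `input_tokens`, `output_tokens`, `cost`) and the sample values are hypothetical, chosen only for illustration; the actual event structure is described in the Reporting documentation.

```python
from collections import defaultdict


def summarize_usage(events):
    """Aggregate request count, tokens, and cost per (user, model) pair.

    NOTE: the keys read from each event are hypothetical placeholders,
    not the real LLMUsageAudit field names.
    """
    totals = defaultdict(lambda: {"requests": 0, "tokens": 0, "cost": 0.0})
    for event in events:
        entry = totals[(event["user"], event["model"])]
        entry["requests"] += 1
        entry["tokens"] += event["input_tokens"] + event["output_tokens"]
        entry["cost"] += event["cost"]
    return dict(totals)


# Made-up sample records, for illustration only.
sample_events = [
    {"user": "alice", "model": "model-a", "input_tokens": 120, "output_tokens": 80, "cost": 0.25},
    {"user": "alice", "model": "model-a", "input_tokens": 200, "output_tokens": 100, "cost": 0.5},
    {"user": "bob", "model": "model-b", "input_tokens": 50, "output_tokens": 25, "cost": 0.125},
]

summary = summarize_usage(sample_events)
for (user, model), stats in sorted(summary.items()):
    print(user, model, stats)
```

The same grouping could be done per API key or per provider by changing the tuple used as the dictionary key.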

Using data exporters

You can use Otoroshi data exporters to extract LLM usage information and send it to any external system (Elasticsearch, Kafka, webhooks, etc.).

Configure the exporter's event filter so that only LLMUsageAudit events are sent:

{
  "include": [{
    "audit": "LLMUsageAudit"
  }],
  "exclude": []
}
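The effect of this filter can be sketched in Python. This is a simplified model and not Otoroshi's actual matching code: it assumes an event is exported when it matches at least one `include` pattern (or when `include` is empty) and matches no `exclude` pattern, and it only compares top-level fields for exact equality. The function names and the sample events are hypothetical.

```python
def matches(pattern, event):
    """True if every key/value in the pattern equals the event's top-level field."""
    return all(event.get(key) == value for key, value in pattern.items())


def passes_filter(filtering, event):
    """Assumed semantics: any include pattern matches (or include is empty),
    and no exclude pattern matches."""
    include = filtering.get("include", [])
    exclude = filtering.get("exclude", [])
    included = not include or any(matches(p, event) for p in include)
    excluded = any(matches(p, event) for p in exclude)
    return included and not excluded


# Same filter as the configuration shown above.
filtering = {"include": [{"audit": "LLMUsageAudit"}], "exclude": []}

# Made-up events: only the "audit" field matters for this filter.
llm_event = {"audit": "LLMUsageAudit", "payload": "..."}
other_event = {"audit": "SomeOtherAudit", "payload": "..."}

print(passes_filter(filtering, llm_event))    # LLM usage event is kept
print(passes_filter(filtering, other_event))  # unrelated audit event is dropped
```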

For detailed information about LLMUsageAudit events, event fields, operation types, and dashboard examples, see the Reporting documentation.

Dashboarding

Once LLMUsageAudit events are exported, you can use them to build dashboards in Kibana, Grafana, or any other visualization tool.

See the Reporting section for Elasticsearch data exporter configuration examples and event structure details.