Everything You Need to Manage LLMs in Production
From routing to guardrails, from caching to cost control — a complete toolkit for production AI infrastructure.
One OpenAI-compatible API to rule them all. Connect any LLM provider without changing a single line of client code.
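Because the gateway speaks the OpenAI chat-completions wire format, existing clients only need a new base URL. A minimal sketch, assuming a hypothetical gateway at `http://localhost:8080` and a provider route exposed as model `openai/gpt-4o`:

```python
import json

# Hypothetical gateway endpoint; the path mirrors OpenAI's /v1/chat/completions.
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build a standard OpenAI-style chat-completion payload.

    The gateway maps `model` to whichever provider backs it, so this
    client-side shape never changes when you swap providers.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

payload = build_chat_request("openai/gpt-4o", "Summarize our Q3 numbers.")
body = json.dumps(payload)  # ready to POST to GATEWAY_URL

# With the official OpenAI Python client, only the base URL changes (assumed setup):
# from openai import OpenAI
# client = OpenAI(base_url="http://localhost:8080/v1", api_key="<gateway key>")
```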
Distribute workloads across providers and automatically switch during outages. Zero downtime, always.
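The routing idea can be sketched in a few lines: rotate across providers and skip any currently marked unhealthy. This is a toy illustration, not Otoroshi's actual load-balancing algorithm.

```python
from itertools import cycle

class FailoverRouter:
    """Toy round-robin router with failover (illustrative only)."""

    def __init__(self, providers):
        self.providers = list(providers)
        self._ring = cycle(self.providers)
        self.unhealthy = set()  # providers currently failing health checks

    def pick(self) -> str:
        # Try each provider at most once per call, skipping unhealthy ones.
        for _ in range(len(self.providers)):
            p = next(self._ring)
            if p not in self.unhealthy:
                return p
        raise RuntimeError("no healthy provider available")

router = FailoverRouter(["openai", "anthropic", "mistral"])
router.unhealthy.add("openai")  # simulate an outage: traffic flows around it
```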
Block prompt injections, PII leakage, toxic content, and more. Validate every request and response with built-in rules.
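The flavor of such a rule can be shown with a deliberately simple sketch: flag text that looks like it contains an email address or a US SSN. Real guardrails are far more sophisticated; the patterns below are hypothetical examples, not the gateway's built-in rules.

```python
import re

# Toy PII patterns for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def violates_pii_guardrail(text: str) -> bool:
    """Return True if the text contains email- or SSN-shaped strings."""
    return bool(EMAIL_RE.search(text) or SSN_RE.search(text))
```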
Real-time cost tracking, budget limits per provider, token quotas per consumer. Never get an unexpected bill again.
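A per-consumer token quota boils down to a counter checked before each request. A minimal sketch of that semantics (assumed here: a request is rejected once it would push usage past the limit):

```python
class TokenQuota:
    """Sketch of a per-consumer token quota check."""

    def __init__(self, limit_tokens: int):
        self.limit = limit_tokens
        self.used = 0

    def consume(self, tokens: int) -> bool:
        """Accept the request and count its tokens, or reject it if it
        would exceed the remaining quota."""
        if self.used + tokens > self.limit:
            return False
        self.used += tokens
        return True

quota = TokenQuota(limit_tokens=1000)
assert quota.consume(600)       # fits within the budget
assert not quota.consume(500)   # would exceed it, so rejected
```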
Embedding-based similarity matching reduces costs and latency. Exact-match and semantic caches working together.
Granular API keys, role-based access control, secret vault integration. LLM access as secure as your APIs.
Build agentic workflows with tool calling, agent handoffs, persistent memory, and Model Context Protocol support.
Audit every request, track environmental impact, export metrics to your favorite dashboards and SIEM tools.
Text, images, audio, video — generate and process any modality through the same gateway with dedicated APIs.
Why Otoroshi LLM Extension?
Not just another proxy. A full-featured AI gateway built for teams that take production seriously.
Leverage a battle-tested, cloud-native API gateway. Get mTLS, service mesh, plugins, and admin UI out of the box. Your LLM gateway inherits enterprise-grade infrastructure.
Change providers, update guardrails, adjust budgets — all without restart or downtime. Configuration changes apply instantly across your entire LLM fleet.
Custom guardrails via WASM, webhooks, or LLM-based validation. Function calling through QuickJS, HTTP, or Otoroshi workflows. Extend everything without forking.
Run on your infrastructure, keep your data where it belongs. Support for EU/French providers like OVH, Scaleway, Cloud Temple. Fully open source under Apache 2.0.
50+ LLM Providers, One API
Connect to any major LLM provider through a single, consistent OpenAI-compatible interface. Switch providers in seconds.
Built for Real-World Scenarios
From startups to enterprises, deploy AI with confidence.
Centralize all LLM access for your organization with security, quotas, and compliance built in.
Avoid vendor lock-in by routing requests to the best provider for each use case with automatic failover.
Enforce content policies, audit every interaction, track costs, and meet regulatory requirements.
Cache responses, balance load, rate-limit consumers, and keep latency low across millions of requests.
Ready to Take Control of Your AI Infrastructure?
Get started in minutes. Open source, free forever.
