Philosophy
Today, generative AI is simply unavoidable. It's transforming industries, unlocking new possibilities, and accelerating innovation.
Supported LLM Providers
Here is a list of all LLM chat providers available in the Otoroshi LLM Extension:
Setup a new LLM Provider
Quick start
Expose your LLM Provider
Now that your provider is fully set up, you can expose it to your organization. The idea is to do so through an Otoroshi route with a backend plugin that forwards incoming requests to the actual LLM provider.
Secure your LLM provider endpoint
Video Tutorial
Anthropic Messages API Proxy
The LLM Anthropic Messages Proxy plugin exposes any LLM provider managed by Otoroshi through an Anthropic Messages API-compatible endpoint. This means any client that speaks the Anthropic API format (including Claude Code) can seamlessly use any LLM provider proxied by Otoroshi, whether it's OpenAI, Mistral, Ollama, Azure OpenAI, Cohere, or any other supported provider.
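As a minimal sketch of what a client sends to such an endpoint, the snippet below builds a request body in the Anthropic Messages API format. The endpoint URL and model name are hypothetical placeholders; the actual values depend on your Otoroshi route and configured provider.

```python
import json

# Hypothetical URL of the Otoroshi route exposing the Anthropic Messages proxy.
OTOROSHI_MESSAGES_URL = "https://llm.example.com/v1/messages"

def build_anthropic_request(model: str, prompt: str, max_tokens: int = 1024) -> dict:
    """Build a request body in the Anthropic Messages API format.

    Otoroshi translates this shape to whatever the underlying provider
    (OpenAI, Mistral, Ollama, ...) actually expects.
    """
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [
            {"role": "user", "content": prompt},
        ],
    }

body = build_anthropic_request("mistral-large-latest", "Say hello in one word")
print(json.dumps(body, indent=2))
```

A client like Claude Code would POST this body to the route, unaware that a different provider sits behind it.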
Open Responses API Proxy
The LLM OpenResponse Proxy plugin exposes any LLM provider managed by Otoroshi through an Open Responses-compatible API endpoint. Open Responses is an open-source specification for multi-provider, interoperable LLM interfaces, backed by NVIDIA, Vercel, OpenRouter, Hugging Face, Databricks, Red Hat, Ollama, OpenAI, and others.
Managing secrets
You can use a secret vault to store your LLM provider tokens. The main advantage is that you avoid spreading those tokens around, and can give your users Otoroshi API keys instead.
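As an illustrative sketch, a provider configuration might reference a vaulted token instead of embedding it inline. The vault name (`local`) and secret path below are hypothetical, and assume Otoroshi's `${vault://...}` secret interpolation syntax:

```json
{
  "connection": {
    "base_url": "https://api.openai.com/v1",
    "token": "${vault://local/llm-providers/openai-token}"
  }
}
```

At runtime, Otoroshi resolves the reference against the configured vault, so the raw token never appears in the provider's stored configuration.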
Load Balancing
Load balancing distributes LLM requests across multiple providers, optimizing performance, ensuring availability, and preventing any single provider from being overloaded.
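The distribution idea can be sketched with a simple round-robin strategy. The provider names below are hypothetical placeholders, not actual Otoroshi identifiers:

```python
from itertools import cycle

# Hypothetical provider identifiers; in Otoroshi these would be the
# LLM providers configured behind a single route.
providers = ["openai-primary", "mistral-secondary", "ollama-local"]

round_robin = cycle(providers)

def pick_provider() -> str:
    """Return the next provider in rotation, so consecutive
    requests spread evenly across the pool."""
    return next(round_robin)

picks = [pick_provider() for _ in range(6)]
print(picks)  # each provider is selected twice, in order
```

Round-robin is only one possible strategy; weighted or latency-aware selection follows the same pattern with a different `pick_provider`.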
Resilience
Resilience ensures that LLM interactions remain highly available and fault-tolerant, even when providers experience outages or failures.
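One common resilience building block is retrying a failed call with exponential backoff. The sketch below simulates a provider that recovers after two failures; it illustrates the technique only, not Otoroshi's internal implementation:

```python
import time

def call_with_retries(call, attempts=3, base_delay=0.1):
    """Retry a flaky call with exponential backoff.
    Returns the first successful result, or re-raises the last error."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Simulated provider that fails twice before answering.
state = {"calls": 0}
def flaky_provider():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("provider unavailable")
    return "completion text"

print(call_with_retries(flaky_provider))  # "completion text" on the third try
```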
Fallback
The fallback mechanism ensures continuity of service by automatically switching to an alternative LLM provider when the primary provider fails.
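The switching logic can be sketched as trying each provider in order until one answers. The provider functions below are simulated stand-ins, not real provider calls:

```python
def ask_with_fallback(prompt, providers):
    """Try each (name, call) pair in order; return the first answer,
    or raise if every provider fails."""
    last_error = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as err:
            last_error = err
    raise RuntimeError("all providers failed") from last_error

# Simulated providers: the primary is down, the fallback answers.
def primary(prompt):
    raise TimeoutError("primary provider timed out")

def fallback(prompt):
    return f"echo: {prompt}"

used, answer = ask_with_fallback("hello", [("primary", primary), ("fallback", fallback)])
print(used, answer)
```

Unlike retrying (which hits the same provider again), fallback switches to a different provider entirely, so the two mechanisms complement each other.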
Observability
Every interaction with a Large Language Model (LLM) generates crucial data that can be monitored, analyzed, and optimized.