Overview

Connect, set up, secure, and seamlessly manage LLM models using a universal, OpenAI-compatible API

  • Unified interface: Simplify interactions and minimize integration hassles
  • Use multiple providers: 10+ LLM providers supported right now, a lot more coming
  • Load balancing: Ensure optimal performance by distributing workloads across multiple providers
  • Fallbacks: Automatically switch to another LLM on failure to keep responses flowing uninterrupted
  • Automatic retries: LLM APIs often fail unpredictably. The built-in automatic retries feature rescues a substantial number of those failed requests.
  • Semantic cache: Speed up repeated queries, enhance response times, and reduce costs
  • Custom quotas: Manage LLM token quotas per consumer and optimise costs
  • Key vault: Securely store your LLM API keys in the Otoroshi vault or any other secret vault supported by Otoroshi
  • Observability and reporting: Every LLM request is audited with details about the consumer, the LLM provider, and usage. All those audit events are exportable through multiple methods for further reporting
  • Fine-grained authorizations: Use Otoroshi's advanced fine-grained authorization capabilities to constrain model usage based on whatever you want: user identity, apikey, consumer metadata, request details, etc.
  • Prompt fences: Validate your prompts and their responses to avoid leakage of sensitive or personal information, irrelevant or unhelpful responses, gibberish content, etc.
  • Prompt engineering: Enhance your experience by providing contextual information in your prompts, storing them in a library for reusability, and using prompt templates for increased efficiency

The Otoroshi LLM Extension is a set of Otoroshi plugins and resources for interacting with LLMs. Let's discover it.
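Because the extension exposes an OpenAI-compatible API, any client that speaks the standard chat-completions format can point at an Otoroshi route instead of a specific provider. A minimal sketch, assuming a hypothetical gateway URL and apikey (neither comes from the Otoroshi documentation):

```python
# Sketch: building a standard OpenAI-style request aimed at an Otoroshi route.
# OTOROSHI_BASE and the Authorization value are placeholder assumptions.
import json
import urllib.request

OTOROSHI_BASE = "https://llm-gateway.example.com"  # hypothetical Otoroshi route


def build_chat_request(prompt: str, model: str = "gpt-4o-mini") -> urllib.request.Request:
    """Build an OpenAI-compatible /v1/chat/completions request.

    Swapping providers, load balancing, caching, etc. happen behind this
    single endpoint, so the client code never changes.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{OTOROSHI_BASE}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer <your-otoroshi-apikey>",  # placeholder
        },
        method="POST",
    )


req = build_chat_request("Summarize this incident report in one sentence.")
body = json.loads(req.data.decode("utf-8"))
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) is omitted here since it requires a live gateway; the point is that the payload and path are the same ones an OpenAI client already produces.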

Supported LLM providers

  • OpenAI
  • Azure OpenAI
  • Ollama
  • Mistral
  • Anthropic
  • Cohere
  • Gemini
  • Groq
  • Huggingface
  • OVH AI Endpoints

Introduction video