Expose your LLM Provider

Now that your provider is fully set up, you can expose it to your organization. This is done through an Otoroshi route carrying a backend-type plugin that forwards incoming requests to the actual LLM provider.

OpenAI compatible plugins

We provide a set of plugins capable of exposing any supported LLM provider through an OpenAI-compatible API. This is the standard way of exposing LLM providers.

Chat completions proxy

Plugin: Cloud APIM - LLM OpenAI chat/completions Proxy

cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAiCompatProxy

This plugin is compatible with the OpenAI chat completions API, including streaming.

Configuration

{
  "refs": ["provider_entity_id_1", "provider_entity_id_2"]
}

Parameter | Type  | Description
--------- | ----- | -----------
refs      | array | List of LLM Provider entity IDs

Usage

curl https://my-own-llm-endpoint.on.otoroshi/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OTOROSHI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Hello how are you!"
      }
    ]
  }'

Response:

{
  "id": "chatcmpl-B9MBs8CjcvOU2jLn4n570S5qMJKcT",
  "object": "chat.completion",
  "created": 1741569952,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 19,
    "completion_tokens": 10,
    "total_tokens": 29
  }
}

Model routing

When multiple providers are configured in refs, you can target a specific provider using the model field:

  • Slash syntax: providerName/modelName (e.g., my_openai_provider/gpt-4o)
  • Hash syntax: providerId###modelName (e.g., provider_xxx###gpt-4o)

If no provider prefix is specified, the first configured ref is used.
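The routing rules above can be sketched as follows. This is a simplified illustration of the resolution logic, not the plugin's actual code; the function name and arguments are hypothetical:

```python
def resolve_provider(model: str, refs: list[str]) -> tuple[str, str]:
    """Return (provider, model_name) for an incoming `model` field.

    Sketch of the routing rules: hash syntax, slash syntax,
    otherwise fall back to the first configured ref.
    """
    if "###" in model:
        # Hash syntax: providerId###modelName
        provider, model_name = model.split("###", 1)
        return provider, model_name
    if "/" in model:
        # Slash syntax: providerName/modelName
        provider, model_name = model.split("/", 1)
        return provider, model_name
    # No provider prefix: use the first configured ref
    return refs[0], model

print(resolve_provider("my_openai_provider/gpt-4o", ["ref_1"]))
# ('my_openai_provider', 'gpt-4o')
```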

Text completions proxy

Plugin: Cloud APIM - LLM OpenAI completions Proxy

cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAiCompletionProxy

This plugin is compatible with the OpenAI completions API, including streaming. It converts the legacy completions format (prompt field) to the internal chat format.

Configuration

{
  "refs": ["provider_entity_id"]
}

Parameter | Type  | Description
--------- | ----- | -----------
refs      | array | List of LLM Provider entity IDs

Usage

curl https://my-own-llm-endpoint.on.otoroshi/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OTOROSHI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "prompt": "Once upon a time",
    "max_tokens": 100
  }'

The plugin also supports the echo parameter (which includes the prompt in the response) and the suffix parameter.
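The legacy-to-chat conversion described above can be pictured with a minimal sketch. This is not the plugin's source; the suffix and echo handling shown here are assumptions based on the OpenAI completions semantics:

```python
def completions_to_chat(payload: dict) -> dict:
    """Map a legacy /v1/completions body to a chat/completions body.

    Illustrative sketch: the prompt becomes a single user message.
    """
    prompt = payload["prompt"]
    if payload.get("suffix"):
        # `suffix` asks the model to generate text that fits
        # between the prompt and the suffix (assumed mapping)
        prompt = f"{prompt}\n[insert here]\n{payload['suffix']}"
    chat = {
        "model": payload["model"],
        "messages": [{"role": "user", "content": prompt}],
    }
    if "max_tokens" in payload:
        chat["max_tokens"] = payload["max_tokens"]
    return chat

def completion_text(prompt: str, generated: str, echo: bool) -> str:
    """`echo=true` prepends the original prompt to the returned text."""
    return prompt + generated if echo else generated
```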

Models list

Plugin: Cloud APIM - LLM OpenAI Compat. Models list

cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAiCompatModels

Exposes the provider's models list, compatible with the OpenAI models API.

Configuration

{
  "refs": ["provider_entity_id"]
}

Parameter | Type  | Description
--------- | ----- | -----------
refs      | array | List of LLM Provider entity IDs

Usage

curl https://my-own-llm-endpoint.on.otoroshi/v1/models \
  -H "Authorization: Bearer $OTOROSHI_API_KEY"

Response:

{
  "object": "list",
  "data": [
    {
      "id": "o1-mini",
      "object": "model",
      "created": 1686935002,
      "owned_by": "openai"
    },
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1686935002,
      "owned_by": "openai"
    }
  ]
}

When multiple providers are configured, model IDs are prefixed with the provider slug name (e.g., my_openai/gpt-4o). Use ?raw=true to get the raw model identifiers without prefixes.
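The prefixing behavior can be sketched like this. This is a simplified stand-in for the plugin's logic, not its actual code; the single-provider case skipping the prefix is an assumption based on the wording above:

```python
def list_model_ids(providers: dict[str, list[str]], raw: bool = False) -> list[str]:
    """Sketch of model-id listing.

    `providers` maps a provider slug to its model ids. With several
    providers configured, ids are prefixed with the slug; `raw=True`
    (the ?raw=true query parameter) returns unprefixed identifiers.
    """
    if raw or len(providers) == 1:
        return [m for models in providers.values() for m in models]
    return [f"{slug}/{m}" for slug, models in providers.items() for m in models]

print(list_model_ids({"my_openai": ["gpt-4o"], "my_mistral": ["mistral-small"]}))
# ['my_openai/gpt-4o', 'my_mistral/mistral-small']
```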

Models list with provider info

Plugin: Cloud APIM - LLM OpenAI Compat. Provider with Models list

cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAiCompatProvidersWithModels

Similar to the models list, but enriches each model entry with provider metadata:

{
  "object": "list",
  "data": [
    {
      "id": "my_openai/gpt-4o",
      "combined_id": "my_openai/gpt-4o",
      "simple_id": "gpt-4o",
      "provider_id": "my_openai",
      "owned_by": "OpenAI",
      "owned_by_with_model": "OpenAI / gpt-4o",
      "object": "model",
      "created": 1686935002
    }
  ]
}

LLM response endpoint

Plugin: Cloud APIM - LLM Response endpoint

cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.LlmResponseEndpoint

Provides an endpoint that returns raw LLM responses from a predefined prompt. The prompt supports Expression Language for dynamic values based on the request context.

Configuration

{
  "ref": "provider_entity_id",
  "prompt": "Summarize the following text: ${req.body.text}",
  "prompt_ref": null,
  "context_ref": null
}

Parameter   | Type   | Description
----------- | ------ | -----------
ref         | string | LLM Provider entity ID
prompt      | string | Prompt text with optional Expression Language variables
prompt_ref  | string | Optional reference to a stored prompt configuration
context_ref | string | Optional reference to a stored context configuration
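The Expression Language substitution in the prompt can be pictured with a minimal sketch. This is a simplified stand-in covering only the ${req.body.*} form shown above; Otoroshi's real Expression Language supports many more expressions:

```python
import re

def render_prompt(template: str, req_body: dict) -> str:
    """Replace ${req.body.<field>} placeholders with request body values.

    Simplified illustration of the substitution, not Otoroshi's EL engine.
    """
    def substitute(match: re.Match) -> str:
        return str(req_body.get(match.group(1), ""))
    return re.sub(r"\$\{req\.body\.([A-Za-z0-9_]+)\}", substitute, template)

print(render_prompt("Summarize the following text: ${req.body.text}",
                    {"text": "LLM gateways route model traffic."}))
# Summarize the following text: LLM gateways route model traffic.
```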

Response

{
  "generations": [
    {
      "message": {
        "role": "assistant",
        "content": "..."
      }
    }
  ]
}

Anthropic compatible plugin

We also provide an Anthropic Messages API compatible proxy that allows any Anthropic API client (including Claude Code) to use any LLM provider managed by Otoroshi. See the dedicated documentation page for details.

Open Responses compatible plugin

We provide an Open Responses API compatible proxy that allows any client speaking the Open Responses specification to use any LLM provider managed by Otoroshi. See the dedicated documentation page for details.

Route example

A complete route configuration exposing a chat completions endpoint with a models list:

{
  "frontend": {
    "domains": ["llm.my-domain.com"]
  },
  "backend": {
    "targets": [
      {
        "hostname": "request.otoroshi.io",
        "port": 443,
        "tls": true
      }
    ]
  },
  "plugins": [
    {
      "enabled": true,
      "plugin": "cp:otoroshi.next.plugins.OverrideHost",
      "config": {}
    },
    {
      "enabled": true,
      "plugin": "cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAiCompatProxy",
      "includes": [
        "/v1/chat/completions"
      ],
      "config": {
        "refs": ["provider_openai_1"]
      }
    },
    {
      "enabled": true,
      "plugin": "cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAiCompatModels",
      "includes": [
        "/v1/models"
      ],
      "config": {
        "refs": ["provider_openai_1"]
      }
    }
  ]
}