
Embedding plugin

The Otoroshi LLM extension provides the Cloud APIM - LLM OpenAI Compat. Embeddings plugin to expose embedding models on an Otoroshi route. The API is compatible with the OpenAI embeddings API.

Plugin configuration

Add the plugin to your route:

{
  "enabled": true,
  "plugin": "cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAICompatEmbedding",
  "config": {
    "refs": ["embedding-model-entity-id"]
  }
}
| Parameter | Type  | Default | Description                        |
| --------- | ----- | ------- | ---------------------------------- |
| `refs`    | array | `[]`    | List of Embedding Model entity IDs |

Usage

curl --request POST \
--url http://myroute.oto.tools:8080/v1/embeddings \
--header 'content-type: application/json' \
--data '{
"input": "Hello world",
"model": "text-embedding-3-small"
}'

Batch embedding

You can embed multiple texts in a single request by passing an array:

curl --request POST \
--url http://myroute.oto.tools:8080/v1/embeddings \
--header 'content-type: application/json' \
--data '{
"input": ["Hello world", "How are you?", "Goodbye"],
"model": "text-embedding-3-small"
}'
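Each input in the batch gets its own entry in the response's data array, in the same order as the inputs. One common use of batch embeddings is comparing texts by cosine similarity; the following Python sketch uses short toy vectors as stand-ins for real model output:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors: dot product
    # divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" standing in for real model output.
hello = [0.1, 0.3, 0.5]
greeting = [0.1, 0.29, 0.52]
goodbye = [-0.4, 0.1, -0.2]

# Semantically close texts should score higher than unrelated ones.
print(cosine_similarity(hello, greeting) > cosine_similarity(hello, goodbye))
```

With real embeddings the vectors are much longer (e.g. 1536 dimensions for text-embedding-3-small), but the comparison logic is identical.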

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023064255, -0.009327292, ...]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 2,
    "total_tokens": 2
  }
}
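Each vector lives under data[i].embedding, and the index field ties it back to the corresponding input. A minimal Python sketch of extracting vectors from a parsed response (the payload below is an abbreviated stand-in for a real one):

```python
import json

# Abbreviated response body, as returned by the /v1/embeddings endpoint.
raw = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.0023064255, -0.009327292]}
  ],
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 2, "total_tokens": 2}
}
"""

response = json.loads(raw)
# Sort by index so vectors line up with the original input order.
vectors = [item["embedding"]
           for item in sorted(response["data"], key=lambda d: d["index"])]
print(len(vectors))  # one vector, matching the single input
```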

Base64 encoding

For more compact responses, use "encoding_format": "base64":

curl --request POST \
--url http://myroute.oto.tools:8080/v1/embeddings \
--header 'content-type: application/json' \
--data '{
"input": "Hello world",
"model": "text-embedding-3-small",
"encoding_format": "base64"
}'

The embedding vectors are returned as base64-encoded strings of little-endian float bytes instead of JSON arrays.
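Decoding such a response is a matter of base64-decoding the string and unpacking little-endian 32-bit floats. A Python sketch of the round trip (the encoded payload here is built locally for illustration, not taken from a real API response):

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    # Base64 string -> raw bytes -> little-endian float32 values.
    raw = base64.b64decode(b64)
    count = len(raw) // 4  # four bytes per float32
    return list(struct.unpack(f"<{count}f", raw))

# Build a payload locally to demonstrate the round trip. These values
# are exactly representable in float32, so decoding recovers them exactly.
original = [0.25, -0.5, 1.0]
encoded = base64.b64encode(struct.pack(f"<{len(original)}f", *original)).decode("ascii")

print(decode_embedding(encoded))  # [0.25, -0.5, 1.0]
```

Note that arbitrary float values will not round-trip exactly through float32, since the JSON arrays carry higher precision than four-byte floats.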

Model routing

When multiple embedding model providers are configured in refs, you can target a specific provider using the model field:

{
  "input": "Hello world",
  "model": "providerName/modelName"
}

The provider can be referenced by:

  • Entity name (slug): my-openai-embeddings/text-embedding-3-small
  • Entity ID: embedding-model-id###text-embedding-3-small

If no provider prefix is specified, the first configured ref is used.

Route example

A complete route configuration exposing an embedding model:

{
  "frontend": {
    "domains": ["embeddings.my-domain.com"]
  },
  "backend": {
    "targets": [
      {
        "hostname": "request.otoroshi.io",
        "port": 443,
        "tls": true
      }
    ]
  },
  "plugins": [
    {
      "enabled": true,
      "plugin": "cp:otoroshi.next.plugins.OverrideHost",
      "config": {}
    },
    {
      "enabled": true,
      "plugin": "cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAICompatEmbedding",
      "config": {
        "refs": ["embedding-model-entity-id"]
      }
    }
  ]
}