
Image generation

The Otoroshi LLM extension provides a unified, OpenAI-compatible API for generating and editing images across multiple providers. You can configure image generation providers, expose them through Otoroshi routes, and let consumers generate images using a standard API.

Supported providers

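| Provider | Generation | Editing | Default model |
|---|---|---|---|
| OpenAI | Yes | Yes | gpt-image-1 |
| Azure OpenAI | Yes | No | gpt-image-1 |
| Cloud Temple | Yes | Yes | |
| Gemini | Yes | Yes | imagen-3.0-generate-002 |
| Grok (X-AI) | Yes | No | grok-2-image |
| Luma | Yes | No | photon-1 |
| Hive | Yes | No | black-forest-labs/flux-schnell |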

Features

  • Image generation from text prompts (all providers)
  • Image editing from uploaded images + text prompt (OpenAI, Gemini, Cloud Temple)
  • Model routing: route to different providers using provider/model syntax in the model field
  • Decode mode: return raw image bytes instead of JSON when generating a single image
  • Model constraints: restrict which models consumers can use via include/exclude regex patterns, enforceable per API key or per user
  • Budget enforcement: image generation costs are tracked and budgets are enforced
  • Cost tracking: the cost of each request is recorded by the extension's cost tracking feature
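The include/exclude model constraints can be pictured with the small sketch below. It is only an illustration under stated assumptions: this page does not specify the matching rules, so the sketch assumes that an exclude match always rejects, that an empty include list allows everything, and that patterns must match the whole model name. The function name `is_model_allowed` is hypothetical and is not part of the extension.

```python
import re
from typing import List


def is_model_allowed(model: str, include: List[str], exclude: List[str]) -> bool:
    """Hypothetical check: exclude wins, empty include means 'allow all'."""
    # Assumption: a pattern has to match the full model name.
    if any(re.fullmatch(pattern, model) for pattern in exclude):
        return False
    if not include:
        return True
    return any(re.fullmatch(pattern, model) for pattern in include)
```

With these assumptions, `is_model_allowed("grok-2-image", ["gpt-.*"], [])` is rejected because the include list is not empty and no pattern matches.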

API endpoints

The extension exposes two OpenAI-compatible endpoints through Otoroshi route plugins:

| Endpoint | Method | Description |
|---|---|---|
| /v1/images/generations | POST | Generate images from a text prompt |
| /v1/images/edits | POST | Edit an existing image (multipart upload) |

Plugin configuration

Image generation plugin

Add the Cloud APIM - Image generation backend plugin to your route:

{
  "enabled": true,
  "plugin": "cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAICompatImagesGen",
  "config": {
    "refs": ["image-model-entity-id"],
    "decode": false
  }
}

| Parameter | Type | Default | Description |
|---|---|---|---|
| refs | array | [] | List of Image Model entity IDs |
| decode | boolean | false | When true, returns raw image bytes instead of JSON for single-image responses |
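Because decode changes the shape of the response, a client that talks to routes with either setting may want to branch on what it receives. The sketch below is a minimal illustration: it assumes that a decoded response carries an `image/*` Content-Type header, which this page does not state, and the helper name `split_response` is hypothetical.

```python
import json
from typing import Optional, Tuple


def split_response(body: bytes, content_type: str) -> Tuple[Optional[bytes], Optional[dict]]:
    """Return (image_bytes, None) for a raw image, or (None, parsed_json) otherwise."""
    # Assumption: decode mode answers with an image/* content type.
    if content_type.lower().startswith("image/"):
        return body, None
    return None, json.loads(body)
```

A caller can then write the bytes straight to a file in the first case and read the `data` array in the second.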

Image editing plugin

Add the Cloud APIM - Image edition backend plugin to your route:

{
  "enabled": true,
  "plugin": "cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.OpenAICompatImagesEdit",
  "config": {
    "refs": ["image-model-entity-id"],
    "max_size_upload": 104857600
  }
}

| Parameter | Type | Default | Description |
|---|---|---|---|
| refs | array | [] | List of Image Model entity IDs |
| max_size_upload | number | 104857600 (100 MB) | Maximum upload file size in bytes |

Model routing

When multiple image model providers are configured in refs, you can target a specific provider using the model field in your request:

{
  "prompt": "a cat in space",
  "model": "providerName/modelName"
}

The provider can be referenced by:

  • Entity name (slug): openai-provider/gpt-image-1
  • Entity ID: image-model-id###gpt-image-1

If no provider prefix is specified, the first configured ref is used.
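The routing rules above can be sketched as a small resolver. This is an illustration, not the extension's implementation: the `ImageProvider` type and the `resolve` function are hypothetical, and the sketch assumes that a prefix only counts as a provider when it matches a configured slug or ID. That assumption matters for model names that themselves contain a slash, such as black-forest-labs/flux-schnell.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ImageProvider:
    id: str             # entity ID, e.g. "image-model-entity-id"
    slug: str           # entity name slug, e.g. "openai-provider"
    default_model: str  # model used when the request names none


def resolve(model: Optional[str], providers: List[ImageProvider]) -> Tuple[ImageProvider, str]:
    """Pick a provider and model name from the request's model field."""
    first = providers[0]  # fallback: the first configured ref
    if not model:
        return first, first.default_model
    if "###" in model:
        ref, name = model.split("###", 1)   # entity ID form
        for provider in providers:
            if provider.id == ref:
                return provider, name
    elif "/" in model:
        ref, name = model.split("/", 1)     # entity slug form
        for provider in providers:
            if provider.slug == ref:
                return provider, name
    # No recognised prefix: keep the model name untouched.
    return first, model
```

Under these assumptions, `hive-provider/black-forest-labs/flux-schnell` routes to the provider whose slug is `hive-provider`, while a bare `black-forest-labs/flux-schnell` goes to the first ref with the model name left intact.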

Entity configuration

An image model entity is configured with a provider type, connection settings, and generation/edition options:

{
  "id": "image-model-entity-id",
  "name": "My Image Provider",
  "description": "An image generation provider",
  "provider": "openai",
  "config": {
    "connection": {
      "base_url": "https://api.openai.com/v1",
      "token": "sk-xxx",
      "timeout": 180000
    },
    "options": {
      "generation": {
        "enabled": true,
        "model": "gpt-image-1",
        "size": "auto",
        "quality": "auto",
        "n": 1
      },
      "edition": {
        "enabled": true,
        "model": "gpt-image-1",
        "n": 1
      }
    }
  },
  "models": {
    "include": [],
    "exclude": []
  }
}

The token field accepts several comma-separated tokens. When the value contains commas, the system rotates through the tokens in round-robin order.
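Round-robin rotation over a comma-separated token value can be sketched as follows. This is a minimal illustration of the idea, not the extension's code: the function name `token_rotation` is hypothetical, and trimming whitespace around each token is an assumption.

```python
import itertools
from typing import Iterator


def token_rotation(token_field: str) -> Iterator[str]:
    """Yield the configured tokens one after another, forever."""
    # Assumption: surrounding whitespace and empty entries are ignored.
    tokens = [t.strip() for t in token_field.split(",") if t.strip()]
    if not tokens:
        raise ValueError("no token configured")
    return itertools.cycle(tokens)


rotation = token_rotation("sk-aaa,sk-bbb,sk-ccc")
first_four = [next(rotation) for _ in range(4)]
# first_four == ["sk-aaa", "sk-bbb", "sk-ccc", "sk-aaa"]
```

A value without commas yields the same token on every call, so the single-token case needs no special handling.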

Generation request

curl --request POST \
  --url http://myroute.oto.tools:8080/v1/images/generations \
  --header 'content-type: application/json' \
  --data '{
    "prompt": "a white siamese cat",
    "model": "gpt-image-1",
    "n": 1,
    "size": "1024x1024"
  }'
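The same request can be issued from Python with the standard library alone. The sketch below mirrors the curl call above. The function names `build_generation_request` and `generate` are hypothetical, the 180-second timeout is an arbitrary choice, and no authentication header is set because the curl example has none. A route protected by an API key would need one.

```python
import json
import urllib.request


def build_generation_request(base_url: str, prompt: str, model: str = "gpt-image-1",
                             n: int = 1, size: str = "1024x1024") -> urllib.request.Request:
    """Build the POST request for /v1/images/generations without sending it."""
    payload = {"prompt": prompt, "model": model, "n": n, "size": size}
    return urllib.request.Request(
        url=f"{base_url}/v1/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def generate(request: urllib.request.Request, timeout: float = 180.0) -> dict:
    """Send the request and return the parsed JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `generate(build_generation_request("http://myroute.oto.tools:8080", "a white siamese cat"))` would return the parsed JSON body when the route is reachable and decode is false.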

Generation parameters

| Parameter | Type | Description |
|---|---|---|
| prompt | string | The text prompt describing the image to generate |
| model | string | Model name (can include provider prefix) |
| n | integer | Number of images to generate |
| size | string | Image dimensions (e.g., 1024x1024, 1792x1024) |
| quality | string | Image quality (standard, hd, auto) |
| style | string | Image style (natural, vivid) |
| response_format | string | Response format (url or b64_json) |
| background | string | Background setting |
| moderation | string | Moderation level |
| output_compression | integer | Output compression level |
| output_format | string | Output format (e.g., png, jpeg) |

Not all parameters are supported by all providers. See the models page for per-provider details.

Generation response

{
  "created": 1762441013,
  "data": [
    {
      "url": "https://...",
      "b64_json": null,
      "revised_prompt": "a white siamese cat sitting gracefully..."
    }
  ]
}
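A client usually needs to handle both the url and the b64_json variants of each entry in data. The sketch below shows one way to do that. The `GeneratedImage` type and the `parse_generation_response` function are hypothetical names, and the sketch assumes that any of the three fields may be null or absent, depending on the provider and on response_format.

```python
import base64
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GeneratedImage:
    url: Optional[str]             # set when the provider returns a link
    data: Optional[bytes]          # set when the provider returns b64_json
    revised_prompt: Optional[str]  # prompt as rewritten by the provider, if any


def parse_generation_response(raw: str) -> List[GeneratedImage]:
    """Turn the JSON body of a generation response into typed records."""
    payload = json.loads(raw)
    images = []
    for item in payload.get("data", []):
        b64 = item.get("b64_json")
        images.append(GeneratedImage(
            url=item.get("url"),
            data=base64.b64decode(b64) if b64 else None,
            revised_prompt=item.get("revised_prompt"),
        ))
    return images
```

Callers can then test `image.data` first and fall back to downloading `image.url` when no inline data is present.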

Image editing

See the image editing page for details on editing images with supported providers.